Multi-Armed Bandits in Brain-Computer Interfaces

Front Hum Neurosci. 2022 Jul 5:16:931085. doi: 10.3389/fnhum.2022.931085. eCollection 2022.

Abstract

The multi-armed bandit (MAB) problem models a decision-maker that chooses its actions based on both current and newly acquired knowledge in order to maximize its reward. This type of online decision-making is prominent in many Brain-Computer Interface (BCI) procedures, and MABs have previously been used to investigate, e.g., which mental commands to use to optimize BCI performance. However, MAB optimization in the context of BCIs is still relatively unexplored, even though it has the potential to improve BCI performance during both calibration and real-time implementation. Therefore, this review aims to introduce the fruitful area of MABs to the BCI community in more detail. The review includes a background on MAB problems and standard solution methods, together with interpretations related to BCI systems. Moreover, it covers state-of-the-art concepts of MABs in BCIs and suggestions for future research.
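
As a rough illustration of the decision problem described above, the following minimal sketch simulates a Bernoulli bandit solved with an epsilon-greedy strategy. Epsilon-greedy is used here only as one common textbook method and is not necessarily the approach the review emphasizes. The function name `epsilon_greedy`, the parameters `reward_probs`, `n_rounds`, `epsilon` and `seed`, and the reward probabilities are all hypothetical. In a BCI reading, each arm could stand for a candidate mental command and a reward of 1 for a correctly classified trial, but that mapping is an assumption made for the example.

```python
import random


def epsilon_greedy(reward_probs, n_rounds=5000, epsilon=0.1, seed=0):
    """Simulate an epsilon-greedy agent on a Bernoulli bandit.

    reward_probs: hypothetical success probability of each arm
                  (e.g. one arm per candidate mental command).
    Returns (pull counts per arm, estimated mean reward per arm, total reward).
    """
    rng = random.Random(seed)
    n_arms = len(reward_probs)
    counts = [0] * n_arms        # how often each arm has been pulled
    estimates = [0.0] * n_arms   # running mean reward of each arm
    total_reward = 0

    for _ in range(n_rounds):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Simulated binary reward (assumption: rewards are Bernoulli).
        reward = 1 if rng.random() < reward_probs[arm] else 0

        # Incremental update of the running mean for the chosen arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return counts, estimates, total_reward


if __name__ == "__main__":
    counts, estimates, total = epsilon_greedy([0.2, 0.5, 0.8])
    print("pulls per arm:", counts)
    print("estimated means:", [round(e, 2) for e in estimates])
    print("total reward:", total)
```

The `epsilon` parameter sets the trade-off between using current knowledge (exploitation) and acquiring new knowledge (exploration); with a small value the agent should spend most rounds on the arm it currently believes is best while still occasionally sampling the others.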

Keywords: Brain-Computer Interface (BCI); calibration; multi-armed bandit (MAB); real-time optimization; reinforcement learning.

Publication types

  • Review