We study the online restless bandit problem, where the state of each arm evolves according to a Markov chain, and the reward of pulling an arm ...