Batch Learning from Logged Bandit Feedback through Counterfactual Risk Minimization. Adith Swaminathan, Thorsten Joachims; 16(52):1731−1755, 2015.