From what I understand from a quick browse of the paper, the innovative part compared to an AlphaZero-type approach is that MuZero doesn't "know" the rules ...