I ran `rllib train --run=PPO --env=CartPole-v0` on my computer. ... a stable PPO on CartPole-v1 (harder than v0) on my 5-year-old MacBook Pro.