...
if np.random.random() > self.epsilon:
    return np.argmax(self.Q(discretized_obs).data.to(torch.device('cpu')).numpy())
else:
    # Choose a random action
    ...
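The branch above follows the usual epsilon-greedy pattern: exploit the highest-valued action most of the time, otherwise explore at random. As a minimal, self-contained sketch of that pattern (not the original author's code), the example below assumes the Q-values are already available as a plain NumPy array; the function name `epsilon_greedy` and the `rng` parameter are hypothetical and do not appear in the excerpt.

```python
import numpy as np


def epsilon_greedy(q_values, epsilon, rng):
    """Pick the greedy action with probability 1 - epsilon, else a random one.

    q_values: 1-D array-like of estimated action values (assumed input).
    epsilon:  exploration probability in [0, 1].
    rng:      a numpy.random.Generator, passed in for reproducibility.
    """
    q_values = np.asarray(q_values)
    if rng.random() > epsilon:
        # Exploit: index of the largest estimated action value
        return int(np.argmax(q_values))
    # Explore: choose uniformly among all actions
    return int(rng.integers(len(q_values)))


rng = np.random.default_rng(0)
action = epsilon_greedy([0.1, 0.7, 0.2], epsilon=0.05, rng=rng)
```

Passing the generator in explicitly, rather than calling the global `np.random.random()`, is a design choice that makes runs repeatable; the excerpt itself uses the global generator.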