... nn_random_policy.py
import tensorflow as tf
import numpy as np
import gym
env ...
tf.multinomial(tf.log(tf.concat([outputs, 1-outputs], 1)), 1)
with tf.
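
The `tf.multinomial(tf.log(tf.concat([outputs, 1-outputs], 1)), 1)` call in the excerpt appears to draw one action per row, choosing index 0 with probability `outputs` and index 1 with probability `1-outputs`. The sketch below reproduces that sampling step in plain NumPy so it can be run without TensorFlow. It is an illustration only, not the script's own code: the function name `sample_actions`, the clipping constant, and the use of the Gumbel-max trick are assumptions made for this example. (In later TensorFlow releases, `tf.random.categorical` takes the place of `tf.multinomial`.)

```python
import numpy as np


def sample_actions(p_first, rng):
    """Sample one action per row: 0 with probability p_first, else 1.

    Hypothetical helper mirroring the excerpt's
    multinomial(log(concat([outputs, 1 - outputs], 1)), 1) step.
    """
    # Column vector of probabilities for action 0, shape (batch, 1).
    p_first = np.asarray(p_first, dtype=float).reshape(-1, 1)
    # Two-column probability table: [P(action 0), P(action 1)].
    probs = np.concatenate([p_first, 1.0 - p_first], axis=1)
    # Log-probabilities act as unnormalized logits; clip to avoid log(0).
    logits = np.log(np.clip(probs, 1e-12, 1.0))
    # Gumbel-max trick: argmax(logits + Gumbel noise) is a categorical sample.
    noise = rng.gumbel(size=logits.shape)
    return np.argmax(logits + noise, axis=1).reshape(-1, 1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    actions = sample_actions([0.9, 0.5, 0.1], rng)
    print(actions.ravel())  # one sampled action (0 or 1) per input row
```

The result has shape `(batch, 1)`, matching the `num_samples=1` argument in the excerpt.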