We aim to make stochastic gradient descent (SGD) adaptive to (i) the noise \sigma^2 in the stochastic gradients and (ii) problem-dependent ...