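11.2.4 Stochastic Gradient Descent
· The number of training examples $N$ can be huge, so computing the loss over all of them at every update is expensive.
· Randomly draw a mini-batch $B_i$ from the training data, and consider the loss computed on that mini-batch alone instead of on the full training set.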
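The mini-batch idea above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the formulation used in these notes: it assumes a one-dimensional linear model $y \approx wx + b$ and a mean squared error taken over the mini-batch, i.e. $\frac{1}{|B_i|}\sum_{j \in B_i}(w x_j + b - y_j)^2$. The function name `minibatch_sgd`, its parameters, and the toy data are all hypothetical.

```python
import random


def minibatch_sgd(xs, ys, batch_size=8, lr=0.1, steps=2000, seed=0):
    """Fit y ~ w*x + b by mini-batch SGD (assumed loss: mean squared error)."""
    rng = random.Random(seed)
    n = len(xs)
    w, b = 0.0, 0.0
    for _ in range(steps):
        # Randomly draw a mini-batch B_i of indices from the training data.
        batch = rng.sample(range(n), batch_size)
        # Gradient of the mini-batch loss only; the other n - batch_size
        # examples are not touched in this step.
        grad_w = 0.0
        grad_b = 0.0
        for j in batch:
            err = w * xs[j] + b - ys[j]
            grad_w += 2.0 * err * xs[j] / batch_size
            grad_b += 2.0 * err / batch_size
        # Plain gradient step with a constant learning rate.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical noise-free toy data generated from y = 3x - 1 on [0, 1].
xs = [i / 99 for i in range(100)]
ys = [3.0 * x - 1.0 for x in xs]
w, b = minibatch_sgd(xs, ys)
print(w, b)
```

Each update costs work proportional to the batch size rather than to $N$, which is the point of the method when $N$ is large. The learning rate, batch size, and step count here are arbitrary choices for the toy problem and would need tuning on real data.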