There, we store each BatchNorm layer's running mean and variance during training, each maintained as a decaying running average. Then, at test time, ...
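To make the "decaying running average" concrete, here is a minimal sketch of the training-time update for a single feature. The function name `update_running_stats`, the `momentum` value of 0.9, and the initial values are assumptions for illustration, not taken from the original; libraries differ in how they define and name the decay factor.

```python
def update_running_stats(running_mean, running_var, batch, momentum=0.9):
    """Apply one decaying-average update using a batch of scalar activations.

    `momentum` (assumed here to be 0.9) is the weight kept on the old
    running value; the remaining (1 - momentum) goes to the batch statistic.
    """
    n = len(batch)
    batch_mean = sum(batch) / n
    # Biased (divide-by-n) variance of the current batch.
    batch_var = sum((x - batch_mean) ** 2 for x in batch) / n

    new_mean = momentum * running_mean + (1 - momentum) * batch_mean
    new_var = momentum * running_var + (1 - momentum) * batch_var
    return new_mean, new_var


# Hypothetical training loop over three small batches.
running_mean, running_var = 0.0, 1.0
for batch in ([1.0, 3.0], [2.0, 4.0], [0.0, 2.0]):
    running_mean, running_var = update_running_stats(
        running_mean, running_var, batch
    )
```

Because older batches are multiplied by `momentum` again at every step, their influence shrinks geometrically, which is why the average is called decaying.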