That is why the code reshapes x to (1, batch_size * num_channels, H, W) and then uses F.batch_norm to apply the modulation to each sample and each channel instead ...
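The code being discussed is not shown here, so the following is only a minimal sketch of the reshape trick, not the original implementation. It assumes the "modulation" is a per-sample, per-channel scale and shift (as in adaptive instance normalization); the function name `per_sample_modulation` and the `weight`/`bias` tensors of shape (batch_size, num_channels) are hypothetical names chosen for illustration.

```python
import torch
import torch.nn.functional as F


def per_sample_modulation(x, weight, bias, eps=1e-5):
    """Hypothetical helper: normalize and modulate each (sample, channel) map.

    x:      (B, C, H, W) feature maps
    weight: (B, C) per-sample, per-channel scale (assumed shape)
    bias:   (B, C) per-sample, per-channel shift (assumed shape)
    """
    b, c, h, w = x.shape
    # Fold the batch axis into the channel axis: every (sample, channel)
    # pair becomes its own channel of a single pseudo-sample.
    x_folded = x.reshape(1, b * c, h, w)
    out = F.batch_norm(
        x_folded,
        running_mean=None,
        running_var=None,
        weight=weight.reshape(b * c),
        bias=bias.reshape(b * c),
        training=True,  # use the statistics of the current input
        eps=eps,
    )
    # Restore the original (B, C, H, W) layout.
    return out.reshape(b, c, h, w)
```

Because the folded tensor has a batch size of 1, F.batch_norm computes its statistics over H and W only, separately for each of the batch_size * num_channels folded channels. Each sample therefore gets its own mean, variance, scale, and shift, rather than sharing them across the batch.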