nn.LayerNorm produces the same result without the grad attribute. A similar question and answer with a layer norm implementation can be found here, layer ...
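
For reference, here is a minimal sketch of the layer norm computation in plain Python. It assumes the formula documented for nn.LayerNorm: normalization over the last dimension, biased variance, and a default eps of 1e-5. The function name `layer_norm` and its parameters `gamma` and `beta` are hypothetical, chosen for illustration. This is not the implementation from the linked answer.

```python
import math


def layer_norm(x, gamma=None, beta=None, eps=1e-5):
    """Normalize one feature vector (a list of floats) over its elements.

    gamma and beta are optional per-element scale and shift values
    (hypothetical names standing in for a learnable weight and bias).
    """
    n = len(x)
    mean = sum(x) / n
    # Biased variance: divide by n, not n - 1.
    var = sum((v - mean) ** 2 for v in x) / n
    inv_std = 1.0 / math.sqrt(var + eps)
    if gamma is None:
        gamma = [1.0] * n
    if beta is None:
        beta = [0.0] * n
    return [(v - mean) * inv_std * g + b for v, g, b in zip(x, gamma, beta)]


y = layer_norm([1.0, 2.0, 3.0, 4.0])
print(y)
```

With the default scale and shift, the output should have a mean of roughly zero and a variance just under one, since eps is added inside the square root. Comparing against nn.LayerNorm itself would additionally require PyTorch, and this sketch does not do that.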