As it turns out, for Linear() layers, PyTorch uses fairly complicated default weight and bias initialization. I went to the initialization source code at ...
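To make the default scheme concrete, here is a minimal sketch of the arithmetic in plain Python. It assumes the behavior of recent PyTorch releases as I understand it: weights are drawn with `kaiming_uniform_` using `a=sqrt(5)`, and biases are drawn uniformly within `1/sqrt(fan_in)`. The helper names `linear_default_bounds` and `init_linear` are hypothetical and not part of PyTorch; check the source for the version you actually run.

```python
import math
import random


def linear_default_bounds(in_features, a=math.sqrt(5)):
    """Return (weight_bound, bias_bound) for a Linear layer.

    Assumption: mirrors kaiming_uniform_(weight, a=sqrt(5)) for the weights
    and uniform(-1/sqrt(fan_in), 1/sqrt(fan_in)) for the bias.
    """
    fan_in = in_features
    # Gain for leaky ReLU with negative slope `a`.
    gain = math.sqrt(2.0 / (1.0 + a ** 2))
    std = gain / math.sqrt(fan_in)
    # A uniform distribution on [-b, b] has std b / sqrt(3).
    weight_bound = math.sqrt(3.0) * std
    bias_bound = 1.0 / math.sqrt(fan_in)
    return weight_bound, bias_bound


def init_linear(in_features, out_features, seed=0):
    """Sample a weight matrix (out x in) and a bias vector as nested lists."""
    rng = random.Random(seed)
    w_bound, b_bound = linear_default_bounds(in_features)
    weight = [
        [rng.uniform(-w_bound, w_bound) for _ in range(in_features)]
        for _ in range(out_features)
    ]
    bias = [rng.uniform(-b_bound, b_bound) for _ in range(out_features)]
    return weight, bias
```

With `a=sqrt(5)` the gain is `sqrt(1/3)`, so the weight bound simplifies to `1/sqrt(fan_in)`, the same bound used for the bias. Under these assumptions the scheme looks more complicated in the source than it is in effect.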