PyTorch handles this by scaling the output of the dropout layer at training time by a factor of 1/(1 - p), where p is the dropout probability.
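To make the scaling concrete, here is a minimal pure-Python sketch of this "inverted dropout" behaviour. It is not PyTorch's actual implementation; the function name `inverted_dropout` and its parameters are made up for illustration, and it only assumes the rule described above: each element is zeroed with probability p and the survivors are multiplied by 1/(1 - p).

```python
import random


def inverted_dropout(values, p, training=True, rng=random):
    """Hypothetical helper: zero each element with probability p and
    scale the surviving elements by 1 / (1 - p) while training."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if not training or p == 0.0:
        # At evaluation time the input passes through unchanged.
        return list(values)
    scale = 1.0 / (1.0 - p)
    return [0.0 if rng.random() < p else v * scale for v in values]


rng = random.Random(0)          # seeded so the run is repeatable
x = [1.0] * 100_000
y = inverted_dropout(x, p=0.5, rng=rng)

print(sorted(set(y)))           # only two distinct values: dropped and scaled
print(sum(y) / len(y))          # mean stays close to the input mean of 1.0
```

With p = 0.5 every surviving element is doubled, so the expected value of the output matches the input and no rescaling should be needed at evaluation time.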