For training, we used the AdamW optimizer [36] with first- and second-moment decay rates of 0.9 and 0.999, respectively, and a weight decay of 0.0001. The initial learning rate was ...
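To make the role of these three hyperparameters concrete, the following is a minimal sketch of a single AdamW update in plain Python, using the common decoupled-weight-decay formulation. It is an illustration under stated assumptions, not the training code used here: the function name `adamw_step`, the list-of-floats parameter layout, and the `eps` value of 1e-8 are hypothetical choices. The learning rate is deliberately left as a required argument because its value is not given in the text above.

```python
import math


def adamw_step(params, grads, m, v, step, lr,
               beta1=0.9, beta2=0.999, weight_decay=1e-4, eps=1e-8):
    """Apply one AdamW update to lists of floats.

    beta1, beta2 and weight_decay default to the values stated in the text.
    lr has no default: the text does not give it, so the caller must supply it.
    eps is an assumed numerical-stability constant, not taken from the text.
    step is the 1-based update count used for bias correction.
    Returns the updated (params, m, v) as new lists.
    """
    bias1 = 1.0 - beta1 ** step  # bias correction for the first moment
    bias2 = 1.0 - beta2 ** step  # bias correction for the second moment
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        # Exponential moving averages of the gradient and its square.
        m_i = beta1 * m_i + (1.0 - beta1) * g
        v_i = beta2 * v_i + (1.0 - beta2) * g * g
        m_hat = m_i / bias1
        v_hat = v_i / bias2
        # Decoupled weight decay: shrink the weight directly,
        # instead of adding an L2 term to the gradient.
        p = p * (1.0 - lr * weight_decay)
        # Adaptive gradient step.
        p = p - lr * m_hat / (math.sqrt(v_hat) + eps)
        new_params.append(p)
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v
```

Because the decay multiplies the weight directly rather than passing through the moment estimates, a parameter with zero gradient still shrinks by a factor of `1 - lr * weight_decay` at every step. This is the behaviour that distinguishes AdamW from Adam with an L2 penalty.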