Bad performance using Adam and L2 regularization? Check out @ml4aad's AdamW (Adam with decoupled weight decay), now part of @tensorflow.
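
For readers wondering what "decoupled" means here, a minimal scalar sketch of one update step, not the TensorFlow implementation. The function name `adam_step` and its parameters are hypothetical, and scaling the decay term by the learning rate is one common convention assumed for illustration.

```python
import math

def adam_step(w, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999,
              eps=1e-8, wd=0.0, decoupled=False):
    """One Adam update on a scalar weight (illustrative sketch only)."""
    # L2 regularization: the penalty gradient wd * w is folded into the
    # gradient, so it is rescaled by the adaptive denominator below.
    g = grad if decoupled else grad + wd * w
    m = b1 * m + (1 - b1) * g          # first-moment estimate
    v = b2 * v + (1 - b2) * g * g      # second-moment estimate
    m_hat = m / (1 - b1 ** t)          # bias correction, t starts at 1
    v_hat = v / (1 - b2 ** t)
    step = m_hat / (math.sqrt(v_hat) + eps)
    if decoupled:
        # Decoupled weight decay: shrink the weight directly, outside
        # the adaptive rescaling (assumed convention: scaled by lr).
        step += wd * w
    return w - lr * step, m, v

# Zero loss gradient, so only the regularizer acts on the weight.
w_l2, _, _ = adam_step(1.0, 0.0, 0.0, 0.0, t=1, lr=0.1, wd=0.01)
w_dec, _, _ = adam_step(1.0, 0.0, 0.0, 0.0, t=1, lr=0.1, wd=0.01, decoupled=True)
```

With a zero loss gradient, the L2 variant moves the weight by roughly `lr` whatever the value of `wd`, because the adaptive normalization cancels the penalty's magnitude. The decoupled variant shrinks it by `lr * wd * w`.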