MATE-KD first trains a masked-language-model-based generator to perturb text by maximizing the divergence between teacher and student logits.
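
The generator's objective can be illustrated with a minimal sketch. It assumes the divergence is a KL divergence between the softmax distributions of the teacher and student logits; the text above says only "divergence", so this choice is an assumption. The function names (`generator_loss`, `pick_perturbation`) are hypothetical, and selecting among pre-scored candidate perturbations stands in for actual gradient-based training of the generator.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q) for two discrete distributions of equal length.
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def generator_loss(teacher_logits, student_logits):
    # Assumption: the divergence is KL(teacher || student).
    # The generator maximizes the divergence, so the loss it
    # minimizes is the negated divergence.
    p_teacher = softmax(teacher_logits)
    p_student = softmax(student_logits)
    return -kl_divergence(p_teacher, p_student)

def pick_perturbation(candidates):
    # candidates: list of (text, teacher_logits, student_logits) tuples.
    # Hypothetical stand-in for generator training: return the text of
    # the candidate on which teacher and student disagree the most.
    best = min(candidates, key=lambda c: generator_loss(c[1], c[2]))
    return best[0]
```

Under these assumptions, a perturbed text on which teacher and student agree yields a loss near zero, while one on which they disagree yields a more negative loss and is therefore preferred by the generator.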