Hence, when they trained XLNet-Large, they excluded the next-sentence prediction objective. RoBERTa authors also found that removing the NSP ...
確定! 回上一頁