Instead, substantial amounts of knowledge are encoded in their parameters by pre-training on large corpora with a language modeling (LM) objective [6, 7, 8, 9].
確定! 回上一頁