Dataset · This original corpus consists of OCRed documents in 10 European languages, totaling about 20M characters (3.5M tokens), aligned with their corresponding ...