Size: WIT is the largest multimodal dataset of image-text examples that is publicly available.
Multilingual: With 108 languages, WIT has 10x or ...