... 91.0% top-1 accuracy on ImageNet with a finetuned encoder. Implementation of CoCa, Contrastive Captioners are Image-Text Foundation Models, in PyTorch.