objects than CBS. Our T5 + MF model outperforms the existing state-of-the-art end-to-end single-stage image captioning systems (Agrawal et ...