We achieve state-of-the-art controllability on two challenging benchmarks, and generate diverse captions by using different verbs, semantic roles, or ...