Global sequence matching under temporal order consistency matters in contrastive-based video-paragraph/text learning.
確定! 回上一頁