The 3D visual grounding task aims to ground a natural language description to the targeted object in a 3D scene, which is usually represented in ...