The discriminator is based on the ViT architecture [8], which was ... VQGAN [11] is a similar architecture but uses images instead of text as input for ...