Note that when the f-divergence is discrete, as with JS and KL, we might face problems when learning models with gradients, as the divergence loss is not ...