Our targets have shape bs x sl (batch size by sequence length), so we need to flatten them before passing them to F.cross_entropy:
def loss_func(inp, targ): return ...
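
As a minimal sketch of the flattening step, and not necessarily the body elided above, one way to write such a loss is shown here. It assumes the model's predictions have shape bs x sl x vocab_sz (one score per vocabulary item at every position); that shape, the name `flattened_cross_entropy`, and the sizes used are illustrative assumptions, not taken from the original.

```python
import torch
import torch.nn.functional as F

def flattened_cross_entropy(inp: torch.Tensor, targ: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: cross-entropy over every position of every sequence.

    Assumes inp holds raw scores of shape (bs, sl, vocab_sz) and
    targ holds integer class indices of shape (bs, sl).
    """
    vocab_sz = inp.shape[-1]
    # Collapse the batch and sequence axes so each token is one sample:
    # inp becomes (bs*sl, vocab_sz), targ becomes (bs*sl,).
    return F.cross_entropy(inp.reshape(-1, vocab_sz), targ.reshape(-1))

# Illustrative sizes only
bs, sl, vocab_sz = 4, 3, 10
torch.manual_seed(0)
inp = torch.randn(bs, sl, vocab_sz)
targ = torch.randint(0, vocab_sz, (bs, sl))

loss = flattened_cross_entropy(inp, targ)
print(loss.shape)  # a scalar tensor: the loss averaged over all bs*sl tokens
```

With the default `reduction='mean'`, the result is the average loss over all bs*sl tokens. `reshape` is used here instead of `view` so the sketch also works when the tensors are not contiguous in memory.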