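Shouldn't the hidden_states get multiplied with output weights to get the output? Yes and no. It depends on your problem formulation: if the task needs outputs of a different size than the hidden state (class scores, for example), you apply an output layer to the hidden states; otherwise the hidden states themselves can serve as the output.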
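To make the two cases concrete, here is a minimal sketch in plain Python. The sizes, values, and the names `output_weights`, `output_bias`, and `project` are assumptions made up for illustration; they do not come from any particular library.

```python
# Hypothetical values, chosen only for illustration.
hidden_states = [
    [0.5, -1.0, 2.0],   # hidden state at time step 0
    [1.0, 0.0, -0.5],   # hidden state at time step 1
]

# Assumed output layer: maps a 3-dim hidden state to 2 output values.
output_weights = [
    [1.0, 0.0],
    [0.0, 1.0],
    [0.5, 0.5],
]
output_bias = [0.1, -0.1]


def project(h, weights, bias):
    """Return h @ weights + bias for a single hidden-state vector."""
    return [
        sum(h_i * w_row[j] for h_i, w_row in zip(h, weights)) + bias[j]
        for j in range(len(bias))
    ]


# Formulation A: the task needs a different output size,
# so each hidden state is multiplied with the output weights.
projected_outputs = [
    project(h, output_weights, output_bias) for h in hidden_states
]

# Formulation B: the hidden states are used directly as the outputs,
# with no extra output layer.
direct_outputs = hidden_states
```

In formulation A each output has 2 values per time step, while in formulation B it keeps the hidden size of 3. Which one is appropriate depends on what the downstream loss or consumer expects.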