num_heads – Number of parallel attention heads. Note that embed_dim ...
dropout – Dropout probability on attn_output_weights.
bias – If specified, adds bias to ...
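
The parameter names above match those of PyTorch's `torch.nn.MultiheadAttention`. Assuming that is the class being described, a minimal sketch of how the three arguments are passed might look like the following. The sizes (`embed_dim = 16`, `num_heads = 4`, batch of 2, sequence length 5) and the use of `batch_first=True` are illustrative choices, not values taken from the excerpt.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative sizes (assumptions, not from the excerpt).
# PyTorch requires embed_dim to be divisible by num_heads.
embed_dim = 16
num_heads = 4

mha = nn.MultiheadAttention(
    embed_dim,
    num_heads,
    dropout=0.1,       # applied to the attention weights during training
    bias=True,         # include bias terms in the projection layers
    batch_first=True,  # assumption: inputs shaped (batch, seq, embed)
)
mha.eval()  # evaluation mode disables dropout, so results are deterministic

# Self-attention: query, key and value are the same tensor.
x = torch.randn(2, 5, embed_dim)
attn_output, attn_output_weights = mha(x, x, x)

output_shape = tuple(attn_output.shape)
weights_shape = tuple(attn_output_weights.shape)
print(output_shape, weights_shape)
```

With the default settings the returned weights are averaged over the heads, so they carry one row of attention probabilities per query position. Calling `mha.train()` instead of `mha.eval()` would make the `dropout` argument take effect.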