5 GB VRAM, using the 8bit adam optimizer from bitsandbytes along with xformers ... The balance between the AdamW weight decay and learning rate can have a ...
確定! 回上一頁