Hi, thank you so much for releasing this wonderful code!
I notice that in your `examples/pretrain_llama_7b.sh`, `dtype` is set to `fp32`, which seems to make the activations `fp32`. However, I believe it is more common to keep activations in `bf16`. I also notice that `param_dtype` seems to always be set to `fp32`.
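To make my reading of the two settings concrete, here is a minimal JAX sketch. It assumes Flax-style semantics, where `param_dtype` controls how parameters are stored and `dtype` controls the precision of the computation. The `dense` helper and the `params` dict are hypothetical names for illustration, not code from this repository, so please correct me if the actual behavior differs.

```python
import jax
import jax.numpy as jnp


def dense(params, x, dtype):
    # Hypothetical layer: parameters stay stored in their own dtype
    # (param_dtype); inputs and weights are cast to `dtype` only for the
    # computation, so the activations come out in `dtype`.
    w = params["w"].astype(dtype)
    b = params["b"].astype(dtype)
    return jnp.dot(x.astype(dtype), w) + b


key = jax.random.PRNGKey(0)
param_dtype = jnp.float32  # assumed to mirror param_dtype=fp32

params = {
    "w": jax.random.normal(key, (8, 4), dtype=param_dtype),
    "b": jnp.zeros((4,), dtype=param_dtype),
}
x = jnp.ones((2, 8), dtype=jnp.float32)

y_fp32 = dense(params, x, jnp.float32)   # dtype=fp32: fp32 activations
y_bf16 = dense(params, x, jnp.bfloat16)  # dtype=bf16: bf16 activations

print(y_fp32.dtype, y_bf16.dtype, params["w"].dtype)
```

Under this reading, the stored parameters remain `fp32` in both cases and only the activation precision changes with `dtype`.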
Could you please elaborate a bit on this choice? Thank you very much!