Also, I saw in one of the Python scripts that you rename some of the weights so that the names match between HuggingFace and llm-sharp. There is a useful attribute that you can add to any field to specify the name you want torch to store it under:
I saw Flash Attention on the TODO list, so I wanted to bring the announcement here to your attention.
Two packages were announced there:
1] Loading model weights saved using the PyTorch format / safetensors format (including handling for HuggingFace's sharding)
2] Flash Attention - self explanatory :)
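
For context on the sharding part of 1], here is a minimal Python sketch of what that handling involves. It assumes the usual HuggingFace layout, where a sharded checkpoint ships an index file (for example `model.safetensors.index.json`) whose `weight_map` maps each tensor name to the shard file that stores it. The function name `group_tensors_by_shard` and the example index contents are hypothetical, and this is not the API of the announced package; it only resolves which file holds which tensor and does not load any tensor data.

```python
import json
from collections import defaultdict


def group_tensors_by_shard(index: dict) -> dict:
    """Invert the index's weight_map: shard file name -> sorted tensor names."""
    shards = defaultdict(list)
    for tensor_name, shard_file in index["weight_map"].items():
        shards[shard_file].append(tensor_name)
    return {shard: sorted(names) for shard, names in shards.items()}


# Hypothetical index contents, shaped like model.safetensors.index.json.
example_index = json.loads("""
{
  "metadata": {"total_size": 123},
  "weight_map": {
    "model.embed_tokens.weight": "model-00001-of-00002.safetensors",
    "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
    "lm_head.weight": "model-00002-of-00002.safetensors"
  }
}
""")

shards = group_tensors_by_shard(example_index)
for shard_file, tensor_names in shards.items():
    print(shard_file, tensor_names)
```

Grouping by shard file first means each shard only needs to be opened once, no matter how many tensors it contains.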