Why does Torchchat use MATH as the SDPA backend? #1452
Comments
This seems like a good idea! cc @Jack-Khuu, is there any history behind using MATH as the default backend?
Hmm, actually, it seems there's an issue with exporting when the default backend is not MATH; see pytorch/pytorch#129418. There appears to be a requirement that decompositions during export must not introduce any mutation ops.
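For context, here is a minimal sketch of what that constraint looks like in practice, assuming PyTorch 2.3+ (`torch.nn.attention.sdpa_kernel` and `torch.export`); the module and tensor shapes are illustrative, not torchchat code:

```python
# Illustrative only: exporting an SDPA call while the backend is pinned
# to MATH, the unfused reference implementation whose decomposition
# avoids the fused-kernel mutation ops referenced in pytorch/pytorch#129418.
import torch
from torch.nn.attention import SDPBackend, sdpa_kernel


class Attn(torch.nn.Module):
    def forward(self, q, k, v):
        return torch.nn.functional.scaled_dot_product_attention(q, k, v)


q = torch.randn(1, 8, 128, 64)
k = torch.randn(1, 8, 128, 64)
v = torch.randn(1, 8, 128, 64)

# Pinning MATH keeps the traced graph free of backend-specific fused ops.
with sdpa_kernel(SDPBackend.MATH):
    exported = torch.export.export(Attn(), (q, k, v))
print(exported)
```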
cc @mingfeima
I think we could add an argument to let the user decide which backend to use, say "--attention-backend"; users who don't depend on export could then pick a faster backend such as flash attention.
I drafted PR #1456 to add such an argument.
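A rough sketch of what such a flag could look like; the flag name comes from the comment above, but the value names, mapping, and default are assumptions rather than what PR #1456 actually implements:

```python
# Hypothetical wiring for an "--attention-backend" CLI flag; names and
# defaults are illustrative, not taken from PR #1456.
import argparse

from torch.nn.attention import SDPBackend

BACKEND_MAP = {
    "math": SDPBackend.MATH,
    "flash_attention": SDPBackend.FLASH_ATTENTION,
    "efficient_attention": SDPBackend.EFFICIENT_ATTENTION,
}

parser = argparse.ArgumentParser()
parser.add_argument(
    "--attention-backend",
    choices=sorted(BACKEND_MAP),
    default="math",  # keep MATH the default so export paths still work
    help="Which SDPA backend to run generation under.",
)
args = parser.parse_args()
backend = BACKEND_MAP[args.attention_backend]
```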
🐛 Describe the bug
Hi maintainers,
I noticed that Torchchat uses MATH as the SDPA backend in https://github.com/pytorch/torchchat/blob/main/torchchat/generate.py#L542, whereas other libraries such as vLLM use flash attention as the default backend.
So why does Torchchat use MATH as the default? Is it required for accuracy? If not, I can help add an argument that lets users set the backend. Thanks!
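For reference, switching between backends at runtime is just a matter of which `SDPBackend` the generation code runs under; a minimal sketch of the pattern (this paraphrases what the linked line does, not the verbatim torchchat code):

```python
# A paraphrase of the backend-pinning pattern at the linked generate.py
# line (not the exact torchchat code): generation runs inside an
# sdpa_kernel context that forces a particular SDPA implementation.
import torch
from torch.nn.attention import SDPBackend, sdpa_kernel

q = torch.randn(1, 8, 128, 64)
k = torch.randn(1, 8, 128, 64)
v = torch.randn(1, 8, 128, 64)

# What torchchat effectively does today: force the unfused MATH path.
with sdpa_kernel(SDPBackend.MATH):
    out_math = torch.nn.functional.scaled_dot_product_attention(q, k, v)

# What vLLM-style stacks default to; availability depends on hardware,
# dtype, and shapes, so restricting to this backend may error if it
# cannot dispatch.
with sdpa_kernel(SDPBackend.FLASH_ATTENTION):
    out_flash = torch.nn.functional.scaled_dot_product_attention(q, k, v)
```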