
Why Torchchat uses MATH as SDPA backend? #1452

Open
yanbing-j opened this issue Jan 8, 2025 · 5 comments
Labels
enhancement New feature or request triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Comments

@yanbing-j
Contributor

🐛 Describe the bug

Hi maintainers,

I noticed that Torchchat uses MATH as the SDPA backend in https://github.com/pytorch/torchchat/blob/main/torchchat/generate.py#L542. However, other libraries such as vLLM accept flash attention as the default backend.

So why does Torchchat use MATH as the default backend? Is it required for accuracy? If not, I can help add an argument that lets users set the backend. Thanks!


@lucylq
Contributor

lucylq commented Jan 10, 2025

I can help to add an argument to let user set the backend.

This seems like a good idea! cc @Jack-Khuu in case there's any history behind using MATH as the default backend?

@lucylq
Contributor

lucylq commented Jan 11, 2025

Hmn, actually, it seems like there's an issue exporting when the default backend is not MATH.

See issue: pytorch/pytorch#129418

It seems like there's a requirement that decompositions during export must not introduce any mutation ops. SDPBackend.MATH is known to work well with export; with other backends, we may run into issues. cc @angelayi

@lucylq lucylq added triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module enhancement New feature or request labels Jan 11, 2025
@yanbing-j
Contributor Author

cc @mingfeima

@mingfeima

Hmn, actually, it seems like there's an issue exporting when the default backend is not MATH.

See issue: pytorch/pytorch#129418

It seems like there's a requirement that decompositions during export must not introduce any mutation ops. SDPBackend.MATH is known to work well with export; with other backends, we may run into issues. cc @angelayi

I think we could add an argument to let users decide which backend to use, say "--attention-backend", so that users who don't depend on export can get better performance with flash attention.
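A hypothetical sketch of such a flag (the option name, choices, and defaults below are assumptions for illustration, not torchchat's actual CLI):

```python
import argparse

# Assumed backend names, loosely mirroring torch.nn.attention.SDPBackend members.
BACKEND_CHOICES = ["math", "flash_attention", "efficient_attention", "cudnn_attention"]

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="generation options (sketch)")
    parser.add_argument(
        "--attention-backend",
        choices=BACKEND_CHOICES,
        default="math",  # keep MATH as default so export keeps working
        help="SDPA backend to use during generation",
    )
    return parser

# Example: a user opting into flash attention for eager-mode generation.
args = build_parser().parse_args(["--attention-backend", "flash_attention"])
print(args.attention_backend)
```

The chosen string would then be mapped to the corresponding `SDPBackend` member before entering the `sdpa_kernel` context.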

@yanbing-j
Contributor Author

I drafted PR #1456 to add an attention_backend argument. Its default value is math. Please take a look, thanks!

3 participants