Flash attention unavailable after 0.0.21 on Windows system #863
Comments
Commit af6b866 has the same problem.
Hi,
It seems like flash-attention 2.3.2 supports Windows now: Dao-AILab/flash-attention#595 (comment)
I will try to build flash-attn with torch 2.1.0 and CUDA 12.1 to see if it works.
Does xformers automatically use FA2 if it is installed in the venv, or do you have to build xformers with FA2 installed instead?
@danthe3rd Flash attention can be compiled/installed on Windows as of 2.3.2.
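For anyone wanting to check this in their own venv, here is a minimal sketch (operator module paths as they appear in recent xformers versions; not an official check): it requests the flash forward/backward operators explicitly, so the call raises with the unsupported-reasons instead of silently falling back to another backend.

```python
import torch
import xformers.ops as xops
from xformers.ops import fmha

# Shapes are [batch, seq_len, num_heads, head_dim]; fp16 on CUDA is required
# by the flash kernels.
q = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)
k, v = torch.randn_like(q), torch.randn_like(q)

# Request the flash operators explicitly; this errors if they are unavailable
# in the installed build.
out = xops.memory_efficient_attention(q, k, v, op=(fmha.flash.FwOp, fmha.flash.BwOp))
print("flash attention path works, output shape:", tuple(out.shape))
```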
🐛 Bug
Command
python -m xformers.info
To Reproduce
Steps to reproduce the behavior:
Install xformers 0.0.21 or build from source at the latest commit on Windows; memory_efficient_attention.flshattF/B are both reported as unavailable.
(Also, build.env.TORCH_CUDA_ARCH_LIST in the pre-built wheel doesn't include 8.6 and 8.9.)
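For reference, a quick way to see the local GPU's compute capability, to compare against the arch list reported under build.env.TORCH_CUDA_ARCH_LIST by `python -m xformers.info` (e.g. 8.6 for RTX 30xx, 8.9 for RTX 40xx):

```python
import torch

# Returns (major, minor) for the current CUDA device.
major, minor = torch.cuda.get_device_capability()
print(f"Local GPU compute capability: {major}.{minor}")
```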
Expected behavior
Both the pre-built wheel and a build from source should give us flash attention support.
(If this is because Windows doesn't support some feature required by FlashAttention 2, please at least give us FlashAttention 1 support on Windows.)
I also wondered whether this is just a reporting bug in xformers.info, but since xformers 0.0.21 actually gives me slower results than 0.0.20, I think flash attention is really gone.
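A rough timing sketch I used to compare backends (shapes and iteration count are arbitrary, not a rigorous benchmark): it times the default dispatch against the explicitly requested cutlass operators, to see whether the slowdown matches falling back off the flash kernel.

```python
import torch
import xformers.ops as xops
from xformers.ops import fmha

def time_op(op, iters=50):
    q = torch.randn(8, 1024, 16, 64, device="cuda", dtype=torch.float16)
    k, v = torch.randn_like(q), torch.randn_like(q)
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    # Warm-up call, then a timed loop.
    xops.memory_efficient_attention(q, k, v, op=op)
    torch.cuda.synchronize()
    start.record()
    for _ in range(iters):
        xops.memory_efficient_attention(q, k, v, op=op)
    end.record()
    torch.cuda.synchronize()
    return start.elapsed_time(end) / iters  # average ms per call

print("default dispatch:", time_op(None), "ms")
print("cutlass op:      ", time_op((fmha.cutlass.FwOp, fmha.cutlass.BwOp)), "ms")
```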
Environment
Additional context
Here is the output of xformers.info on 0.0.21:
Here is the output of 0.0.20: