williamberman/memory_efficient_attn

A collection of toy implementations of memory-efficient attention.

All NumPy implementations are in attn.py, and there is a CUDA implementation in attn_chunk_q_chunk_kv_cuda/attn_chunk_q_chunk_kv_kernel.cu.
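For readers new to the idea, here is a minimal NumPy sketch of attention chunked over both queries and keys/values, which is the technique the directory name attn_chunk_q_chunk_kv_cuda suggests. It is not the code from attn.py: the function names, the chunk-size parameters, and the single-head 2-D shapes are all assumptions made for illustration. The point is that only one small tile of the score matrix exists at a time, while a running maximum and running sum keep the softmax exact.

```python
import numpy as np


def reference_attention(q, k, v):
    """Standard attention: materializes the full (n_q, n_kv) score matrix."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v


def chunked_attention(q, k, v, q_chunk=4, kv_chunk=4):
    """Attention computed one (q_chunk, kv_chunk) tile at a time.

    Hypothetical helper for illustration; not the repository's implementation.
    """
    n_q, d = q.shape
    d_v = v.shape[-1]
    out = np.empty((n_q, d_v), dtype=q.dtype)
    scale = 1.0 / np.sqrt(d)

    for qs in range(0, n_q, q_chunk):
        q_blk = q[qs:qs + q_chunk]
        rows = q_blk.shape[0]
        running_max = np.full((rows, 1), -np.inf)  # per-row max seen so far
        running_sum = np.zeros((rows, 1))          # per-row softmax denominator
        acc = np.zeros((rows, d_v))                # unnormalized output

        for ks in range(0, k.shape[0], kv_chunk):
            k_blk = k[ks:ks + kv_chunk]
            v_blk = v[ks:ks + kv_chunk]
            scores = (q_blk @ k_blk.T) * scale
            new_max = np.maximum(running_max, scores.max(axis=-1, keepdims=True))
            # Rescale earlier partial results to the new max.
            # On the first tile this is exp(-inf) = 0, which is harmless.
            correction = np.exp(running_max - new_max)
            p = np.exp(scores - new_max)
            running_sum = running_sum * correction + p.sum(axis=-1, keepdims=True)
            acc = acc * correction + p @ v_blk
            running_max = new_max

        out[qs:qs + q_chunk] = acc / running_sum
    return out
```

The peak extra memory in this sketch is one q_chunk by kv_chunk tile of scores plus the per-row accumulators, instead of the full n_q by n_kv matrix.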

Building the CUDA implementation

cd attn_chunk_q_chunk_kv_cuda
python setup.py install

Running the tests

There is no formal testing framework; just run:

python test.py

If no assertion fails, the tests pass :)
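As a rough idea of what an assertion-only test script can look like, the sketch below compares a query-chunked attention against a plain reference and raises if they disagree. It does not reproduce the contents of test.py; the function names, shapes, and tolerances are hypothetical.

```python
import numpy as np


def softmax_attention(q, k, v):
    """Plain attention used as the reference result."""
    s = q @ k.T / np.sqrt(q.shape[-1])
    w = np.exp(s - s.max(axis=-1, keepdims=True))
    return (w / w.sum(axis=-1, keepdims=True)) @ v


def attention_chunk_q(q, k, v, chunk=4):
    """Hypothetical query-chunked variant.

    Output rows are independent of each other, so queries can be
    processed slice by slice and the results concatenated.
    """
    return np.concatenate(
        [softmax_attention(q[i:i + chunk], k, v) for i in range(0, q.shape[0], chunk)]
    )


def check_chunk_q_matches_reference():
    rng = np.random.default_rng(0)
    q = rng.standard_normal((10, 8))
    k = rng.standard_normal((12, 8))
    v = rng.standard_normal((12, 8))
    expected = softmax_attention(q, k, v)
    # Cover chunk sizes that divide, do not divide, and exceed the query count.
    for chunk in (1, 3, 10, 16):
        np.testing.assert_allclose(
            attention_chunk_q(q, k, v, chunk=chunk), expected, rtol=1e-6, atol=1e-8
        )


if __name__ == "__main__":
    check_chunk_q_matches_reference()
    print("all checks passed")
```

With this pattern, a failing comparison raises an AssertionError and stops the script, so a clean exit means every check passed.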
