weishengying
Pinned

  1. cutlass_flash_atten_fp8 (Public)

     FP8 flash attention implemented on the Ada architecture using the CUTLASS library

     CUDA · 46 stars · 3 forks

  2. tiny-flash-attention (Public)

     A stripped-down flash-attention implemented with CUTLASS, intended as a teaching resource

     CUDA · 29 stars · 1 fork

  3. cute_gemm (Public)

     CUDA · 4 stars · 1 fork

  4. flash-attention (Public)

     Forked from vllm-project/flash-attention

     Fast and memory-efficient exact attention

     Python