Any plans of supporting INT8 and INT4 precision with tensor core support? #220
Replies: 3 comments
-
INT8 is planned but currently delayed because the developer working on it has gone missing. I will give him at least one more month before we explore other options to get it supported. The hard part of getting support is the very deep integration on our side, plus the fact that not every developer has a GPU that supports it. In my own case, for example, I cannot finish his INT8 work because my card cannot do this at all. INT4 is currently not planned since huggingface has no support for it. Once they add support, we can explore that too.
-
We are waiting for your decision and implementation; hakurei has already released the lit-6B-8bit model.
-
Since the last update we have been focused on overhauling our backend to make implementations like this easier. So currently the holdup is getting that finished. There are already unofficial versions from the community that you can find in our discord.
-
This could significantly reduce VRAM requirements and speed up the AI models.
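To illustrate where the VRAM savings come from, here is a minimal sketch of symmetric per-tensor INT8 quantization in NumPy. This is purely an illustration of the general idea, not the project's actual implementation (real INT8 inference also needs integer tensor-core kernels and per-channel or block-wise scales to preserve accuracy); all function names here are hypothetical.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Map float32 weights to int8 values plus one float scale.
    Hypothetical helper for illustration only."""
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float32 tensor from int8 + scale."""
    return q.astype(np.float32) * scale

w = np.random.randn(1024, 1024).astype(np.float32)
q, scale = quantize_int8(w)

# int8 storage is 4x smaller than float32 for the same tensor shape.
print(w.nbytes // q.nbytes)  # 4
```

The rounding error per element is bounded by the scale, which is why 8-bit weights usually stay close to the full-precision model; INT4 halves the storage again but needs more careful calibration.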