Nintorac/simple_rwkv

simple-rwkv

RWKV LLM servicer for SimpleAI

Description

This project wraps the RWKV-LM model in a gRPC service that can be used through SimpleAI.

RWKV is an RNN with Transformer-level language model performance that can be trained like a GPT transformer and is 100% attention-free. It combines the best of RNNs and transformers: strong performance, fast inference, low VRAM use, fast training, "infinite" ctx_len, and free sentence embeddings.

Usage

Edit the MODEL variable in get_models.py to choose the model size and context.

Edit the STRATEGY variable in lib_raven.py to decide how the weights are loaded. Experiment with it to optimise throughput for your system. See below for a graphical explanation, or check out ChatRWKV for more information.

[Figure: strategies as of 20 Apr 2023]
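To get a feel for what a STRATEGY value expresses, here is a minimal sketch that splits a ChatRWKV-style strategy string into its stages. It assumes the format described in the ChatRWKV documentation: stages separated by `->`, each written as `<device> <dtype>` with an optional `*N` layer count (and an optional trailing `+` for streamed layers). The `Stage` class and `parse_strategy` function are hypothetical helpers for illustration only; they are not part of this repository or of the rwkv package.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Stage:
    device: str             # e.g. "cuda", "cuda:0", "cpu"
    dtype: str              # e.g. "fp16", "fp32", "fp16i8"
    layers: Optional[int]   # layer count for this stage; None means "the remaining layers"
    streamed: bool = False  # True when the count has a trailing "+" (assumed streaming marker)


def parse_strategy(strategy: str) -> List[Stage]:
    """Split a strategy string such as 'cuda fp16i8 *10 -> cpu fp32' into stages."""
    stages = []
    for part in strategy.split("->"):
        tokens = part.split()
        if len(tokens) < 2:
            raise ValueError(f"expected '<device> <dtype>' in {part!r}")
        device, dtype = tokens[0], tokens[1]
        layers, streamed = None, False
        if len(tokens) > 2:
            count = tokens[2]
            if not count.startswith("*"):
                raise ValueError(f"expected a '*N' layer count, got {count!r}")
            streamed = count.endswith("+")
            layers = int(count.strip("*+"))
        stages.append(Stage(device, dtype, layers, streamed))
    return stages


if __name__ == "__main__":
    # Example: first 10 layers on the GPU in int8, the rest on the CPU in fp32.
    for stage in parse_strategy("cuda fp16i8 *10 -> cpu fp32"):
        print(stage)
```

Reading a candidate string this way makes it easier to reason about how many layers land on each device before committing the value to lib_raven.py.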

Build

docker build . -t raven-rwkv-service:latest

Start service

docker run -it --rm -p 50051:50051 --gpus all raven-rwkv-service:latest
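Once the container is running, a quick way to confirm that the published port is accepting connections is a plain TCP check. This is a minimal sketch using only the Python standard library; `port_is_open` is a hypothetical helper, and the host and port defaults simply mirror the `-p 50051:50051` mapping above. It only verifies that something is listening, not that the gRPC service itself is healthy.

```python
import socket


def port_is_open(host: str = "localhost", port: int = 50051, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    print("reachable" if port_is_open() else "not reachable")
```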

Add to model.toml

```toml
[raven]
    [raven.metadata]
        owned_by    = 'BlinkDL'
        permission  = []
        description = 'RWKV fine tuned for instruction answering'
    [raven.network]
        url = 'localhost:50051'
```

Credits

Borrows heavily from the work of lhenault and BlinkDL.

https://huggingface.co/spaces/BlinkDL/Raven-RWKV-7B
