LLM Server

Description

LLM Server is a large language model service written in Rust and built on silent and candle. It exposes an OpenAI-like API and is easy to deploy and use.
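To give a sense of what an "OpenAI-like" interface usually means for a client, here is a minimal sketch that builds a chat request body in the OpenAI chat-completions shape (a `model` field plus a `messages` array of `role`/`content` objects). Whether this server accepts exactly these fields, and which endpoint path it serves, is not stated in this README; using a configured `alias` as the `model` value is an assumption, and `json_escape` and `chat_request_body` are hypothetical helper names.

```rust
/// Escape a string so it can be embedded inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build an OpenAI-style chat request body with a single user message.
/// Assumption: the server selects the model by the `alias` from its config.
fn chat_request_body(model: &str, user_message: &str) -> String {
    format!(
        "{{\"model\":\"{}\",\"messages\":[{{\"role\":\"user\",\"content\":\"{}\"}}]}}",
        json_escape(model),
        json_escape(user_message)
    )
}

fn main() {
    let body = chat_request_body("yi-chat-6b.Q5_K_M.gguf", "Hello!");
    println!("{body}");
}
```

The resulting string can be sent as the body of a POST request with any HTTP client; check the server source for the actual routes before relying on it.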

Supported models

whisper
GGUF-quantized versions of llama and its derivative models

Installation

These instructions assume Rust is already installed.

# Get the code
git clone https://github.com/silent-rs/llm_server.git
cd llm_server
# Build
cargo build --release
# Alternatively, build with Metal support on macOS
cargo build --release --features metal
# Alternatively, build with CUDA support (requires the CUDA driver to be installed)
cargo build --release --features cuda
# Run the server
./target/release/llm_server --configs ./configs.toml

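Configuration file reference

# Listen address. Resolved in this order: the host command-line argument, then the HOST environment variable, then this config file, then the default localhost
host = "0.0.0.0"
# Listen port. Resolved in this order: the port command-line argument, then the PORT environment variable, then this config file, then the default 8000
port = 8000
# List of chat model configurations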
[[chat_configs]]
model_id = "model_path/yi-chat-6b.Q5_K_M.gguf"
alias = "yi-chat-6b.Q5_K_M.gguf"
cpu = false
gqa = 1
tokenizer = "model_path/tokenizer.json"
[[chat_configs]]
model_id = "model_path/yi-chat-6b.Q5.gguf"
alias = "yi-chat-6b.Q5.gguf"
cpu = false
gqa = 1
tokenizer = "model_path/tokenizer.json"

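# List of speech-to-text model configurations
[[whisper_configs]]
model_id = "model_path/whisper-large-v3"
# alias is the model's alias, used to tell models apart; currently, aliases must be unique across models and are fixed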
alias = "large-v3"
cpu = false
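The lookup order described for host and port (command-line argument, then environment variable, then config file, then default) can be sketched as below. This is an illustration of the precedence rule only, not the server's actual code; `resolve` is a hypothetical helper, and the CLI and config-file values are hard-coded stand-ins for real argument and TOML parsing.

```rust
use std::env;

/// Return the first value that is present, in priority order:
/// CLI argument, environment variable, config file, built-in default.
fn resolve<T>(cli: Option<T>, env_var: Option<T>, file: Option<T>, default: T) -> T {
    cli.or(env_var).or(file).unwrap_or(default)
}

fn main() {
    // Stand-ins: no CLI arguments given, config file sets host and port.
    let cli_host: Option<String> = None;
    let file_host = Some("0.0.0.0".to_string());
    let host = resolve(
        cli_host,
        env::var("HOST").ok(),
        file_host,
        "localhost".to_string(),
    );

    let cli_port: Option<u16> = None;
    // An unset or unparsable PORT is treated as absent.
    let env_port: Option<u16> = env::var("PORT").ok().and_then(|p| p.parse().ok());
    let file_port = Some(8000u16);
    let port = resolve(cli_port, env_port, file_port, 8000);

    println!("listening on {host}:{port}");
}
```

Chaining `Option::or` keeps the priority order visible in one expression, so adding another source later means adding one more link in the chain.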
