
NeonAI LLM Llama

Proxies API calls to Llama.

Request Format

API requests should include history, a list of (speaker, message) string tuples representing the conversation so far, and query, the current user input.

Example Request:

{
  "history": [["user", "hello"], ["llm", "hi"]],
  "query": "how are you?"
}
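
For illustration, a payload in this format could be assembled in Python as below. This is a minimal sketch: build_request is a hypothetical helper, and only the payload shape is shown, not the MQ transport.

# Minimal sketch of building a request payload in the format above.
# build_request is a hypothetical helper; the MQ transport is omitted.
def build_request(history, query):
    """history: list of (speaker, message) string tuples; query: current input."""
    return {"history": [list(turn) for turn in history], "query": query}

request = build_request([("user", "hello"), ("llm", "hi")], "how are you?")
# -> {"history": [["user", "hello"], ["llm", "hi"]], "query": "how are you?"}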

Response Format

Responses are returned as dictionaries and should contain the following:

  • response - String LLM response to the query
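
For illustration, a response to the example request above might look like:

{
  "response": "I am doing well, thank you for asking!"
}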

Docker Configuration

When running this as a Docker container, the XDG_CONFIG_HOME environment variable is set to /config. A configuration file is required at /config/neon/diana.yaml and should look like:

MQ:
  port: <MQ Port>
  server: <MQ Hostname or IP>
  users:
    neon_llm_llama:
      password: <neon_llama user's password>
      user: neon_llama
LLM_LLAMA:
  context_depth: 3
  max_tokens: 256
  num_parallel_processes: 2
  num_threads_per_process: 4
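
To sanity-check the file before mounting it, you can confirm it parses and contains the expected sections. This is a minimal sketch assuming PyYAML is installed; the key names follow the template above.

# Minimal sketch: verify diana.yaml parses and has the sections shown above.
# Assumes PyYAML (pip install pyyaml); adjust the path to where your file lives.
import yaml

with open("/config/neon/diana.yaml") as f:
    config = yaml.safe_load(f)

assert "MQ" in config, "missing MQ section"
assert "LLM_LLAMA" in config, "missing LLM_LLAMA section"
print("MQ server:", config["MQ"]["server"], "port:", config["MQ"]["port"])
print("Llama settings:", config["LLM_LLAMA"])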

For example, if your configuration resides in ~/.config:

export CONFIG_PATH="/home/${USER}/.config"
docker run -v ${CONFIG_PATH}:/config neon_llm_llama

Note: If connecting to a local MQ server, you may need to specify --network host.
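
For example, extending the run command above (flag order is illustrative):

docker run --network host -v ${CONFIG_PATH}:/config neon_llm_llama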

GPU

System setup

# Nvidia Docker
sudo apt install curl
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker
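
To verify the toolkit is working before starting the service, you can run nvidia-smi inside a CUDA base image. The image tag here is only an example; any CUDA base image should work.

docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi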

Run Docker

export CONFIG_PATH="/home/${USER}/.config"
docker run --gpus all -v ${CONFIG_PATH}:/config neon_llm_llama
