BentoML

The easiest way to run AI Inference in the cloud

Welcome to BentoML 👋

What is BentoML? 👩‍🍳

BentoML is an open-source model serving library for building model inference APIs and multi-model serving systems with any open-source or custom AI model. It comes with everything you need for serving optimization and model packaging, and it simplifies production deployment via ☁️ BentoCloud.

Get in touch 💬

👉 Join our Slack community!

👀 Follow us on X @bentomlai and LinkedIn

📖 Read our blog

Pinned

  1. BentoML Public

    The easiest way to serve AI/ML models in production: build model inference services, LLM APIs, multi-model inference graphs and pipelines, LLM/RAG apps, and more!

    Python · 6.8k stars · 766 forks

  2. OpenLLM Public

    Run any open-source LLM, such as Llama 2 or Mistral, as an OpenAI-compatible API endpoint in the cloud.

    Python · 9.2k stars · 587 forks

Repositories

Showing 10 of 80 repositories
  • BentoML Public

    The easiest way to serve AI/ML models in production: build model inference services, LLM APIs, multi-model inference graphs and pipelines, LLM/RAG apps, and more!

    Python · 6,766 stars · Apache-2.0 · 766 forks · 215 issues · 13 pull requests · Updated Jun 28, 2024
  • llm-bench

    Python · 10 stars · 1 fork · 2 issues · 1 pull request · Updated Jun 27, 2024
  • OpenLLM Public

    Run any open-source LLM, such as Llama 2 or Mistral, as an OpenAI-compatible API endpoint in the cloud.

    Python · 9,210 stars · Apache-2.0 · 587 forks · 60 issues · 0 pull requests · Updated Jun 27, 2024
  • BentoVLLM Public

    Self-host LLMs with vLLM and BentoML

    Python · 36 stars · 9 forks · 3 issues · 0 pull requests · Updated Jun 25, 2024
  • BentoLMDeploy Public

    Self-host LLMs with LMDeploy and BentoML

    Python · 7 stars · 1 fork · 0 issues · 0 pull requests · Updated Jun 25, 2024
  • bentocloud-homepage-news

    1 star · 1 fork · 0 issues · 0 pull requests · Updated Jun 21, 2024
  • openllm-repo

    HTML · 0 stars · 0 forks · 0 issues · 0 pull requests · Updated Jun 21, 2024
  • chatgpt-lite Public Forked from blrchen/chatgpt-lite

    Fast ChatGPT UI with support for both OpenAI and Azure OpenAI.

    TypeScript · 0 stars · MIT · 79 forks · 0 issues · 1 pull request · Updated Jun 20, 2024
  • asynq Public Forked from hibiken/asynq

    Simple, reliable, and efficient distributed task queue in Go

    Go · 0 stars · MIT · 695 forks · 0 issues · 2 pull requests · Updated Jun 20, 2024
  • BentoWhisperX

    Python · 6 stars · 1 fork · 0 issues · 8 pull requests · Updated Jun 20, 2024