Python: #6761 Onnx Connector #8106

Open · wants to merge 29 commits into base: main

Conversation

@nmoeller (Contributor) commented Aug 14, 2024

Motivation and Context

  1. Why is this change required?
    To enable ONNX models with Semantic Kernel; the backlog issue Python: Add support for local models via ONNX #6761 asked for an ONNX connector.
  2. What problem does it solve?
    It solves the problem that Semantic Kernel is not yet integrated with ONNX Runtime GenAI.
  3. What scenario does it contribute to?
    Using a connector other than Hugging Face, OpenAI, or Azure OpenAI: users who want to run ONNX models can now integrate them easily.
  4. If it fixes an open issue, please link to the issue here.
    Python: Add support for local models via ONNX #6761

Description

The changes are designed on my own, based on the other connectors; I tried to stay as close as possible to their structure.
For the integration I installed the onnxruntime-genai Python package in the repository.

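For reference, the runtime dependency can be installed with a one-liner like the following (a sketch; the exact version pin and any semantic-kernel extra are assumptions, not taken from this PR):

    # Assumed install command for the ONNX Runtime GenAI Python package.
    pip install onnxruntime-genai
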
I added the following classes:

  • OnnxCompletionBase --> Responsible for controlling the inference
  • OnnxTextCompletion --> Inherits from OnnxCompletionBase
    • Support for Text Completion with and without Images
    • Ready for Multimodal Inference
    • Ready for Text Only Inference
    • Supports all Models on onnxruntime-genai
  • OnnxChatCompletion --> Inherits from OnnxCompletionBase (a usage sketch follows this list)
    • Support for Chat Completion with and without Images
    • The user needs to provide the corresponding chat template to use this class
    • Ready for Multimodal Inference
    • Ready for Text Only Inference
    • Supports all Models on onnxruntime-genai
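
A minimal usage sketch of the chat connector described above (hedged: the import path, the constructor parameters ai_model_path and template, and the way execution settings are obtained are assumptions based on this description, not the final connector API):

    import asyncio

    # Assumed import location for the class added in this PR.
    from semantic_kernel.connectors.ai.onnx import OnnxChatCompletion
    from semantic_kernel.contents import ChatHistory

    async def main() -> None:
        # ai_model_path and template are assumed parameter names: a local folder
        # containing an onnxruntime-genai model and the chat template it expects.
        chat_service = OnnxChatCompletion(
            ai_model_path="./models/phi3-mini-onnx",
            template="phi3",
        )

        history = ChatHistory()
        history.add_user_message("Summarize what this ONNX connector does.")

        # get_chat_message_contents is the standard Semantic Kernel chat entry point;
        # the concrete settings class for this connector may differ.
        settings = chat_service.get_prompt_execution_settings_class()()
        results = await chat_service.get_chat_message_contents(history, settings)
        print(results[0].content)

    asyncio.run(main())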

What is integrated so far:

  • OnnxCompletionBase Class
  • OnnxChatCompletionBase Class with Dynamic Template Support
  • OnnxTextCompletionBase Class
  • Sample Multimodal Inference with Phi3-Vision
  • Sample of OnnxChatCompletions with Phi3
  • Sample of OnnxTextCompletions with Phi3
  • Integration Tests
  • Unit Tests

Some Notes

Contribution Checklist

@markwallace-microsoft markwallace-microsoft added the python Pull requests for the Python Semantic Kernel label Aug 14, 2024
@nmoeller nmoeller changed the title Python : Issue-6761-Onnx-Connector Python: Issue-6761-Onnx-Connector Aug 14, 2024
@nmoeller nmoeller changed the title Python: Issue-6761-Onnx-Connector Python: #6761 Onnx Connector Aug 14, 2024
@nmoeller nmoeller marked this pull request as ready for review September 17, 2024 14:32
@nmoeller nmoeller requested a review from a team as a code owner September 17, 2024 14:32
Merge commit …i-Connector

# Conflicts:
#	python/tests/integration/completions/chat_completion_test_base.py
#	python/uv.lock
python/tests/conftest.py (outdated review thread, resolved)
try:
    if os.environ["OLLAMA_MODEL"]:
        ollama_setup = True
except KeyError:
    OllamaTextCompletion()
@TaoChenOSU (Contributor) commented Sep 18, 2024:

Hmm, if we do this, it's very likely that we will never know if our test pipeline is set up correctly, since all of the Ollama tests will be skipped instead of failing.
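
One way to keep that visible (a sketch, not code from this PR; the test name is hypothetical) is to resolve the flag once and let pytest report an explicit skip reason instead of swallowing the KeyError:

    import os

    import pytest

    # Assumed pattern: a missing OLLAMA_MODEL shows up as a skipped test with a
    # reason, rather than silently turning the whole setup into a no-op.
    ollama_setup: bool = bool(os.environ.get("OLLAMA_MODEL"))

    @pytest.mark.skipif(not ollama_setup, reason="OLLAMA_MODEL is not configured")
    def test_ollama_text_completion() -> None:
        ...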

@TaoChenOSU (Contributor) commented Sep 18, 2024:

Same for the others in chat & text completion

python/tests/integration/completions/test_utils.py (outdated review thread, resolved)
@TaoChenOSU (Contributor) commented:

Regarding our offline conversation on the prompt template: is using a prompt template to parse the chat history into some format overkill? A prompt template can do much more than substitute arguments. Is it possible to override the _prepare_chat_history_for_request method to get what Onnx wants?

nmoeller and others added 6 commits September 19, 2024 08:44
…_completion_base.py

Co-authored-by: Tao Chen <taochen@microsoft.com>
…_completion_base.py

Co-authored-by: Tao Chen <taochen@microsoft.com>
…_completion_base.py

Co-authored-by: Tao Chen <taochen@microsoft.com>
…_chat_completion.py

Co-authored-by: Tao Chen <taochen@microsoft.com>
@eavanvalkenburg (Member) left a comment:

We need to rethink the prompt-based setup, but the rest looks good!

@nmoeller (Contributor, Author) commented Sep 20, 2024:

As discussed with @eavanvalkenburg, I introduced hardcoded templates in the onnx_utils.py file and integrated applying the template into _prepare_chat_history_for_request.

If people want to introduce custom templates, they can override _prepare_chat_history_for_request.
The conversion of the output string into the SK objects can also be overridden now, since it is separated into _create_chat_message_content and _create_streaming_chat_message_content.
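
A sketch of that customization point (hedged: the exact method signature, the template tokens, and the override shown here are assumptions based on this comment, not the merged code):

    from semantic_kernel.connectors.ai.onnx import OnnxChatCompletion
    from semantic_kernel.contents import ChatHistory

    class CustomTemplateOnnxChatCompletion(OnnxChatCompletion):
        """Applies a custom chat template instead of the built-in ones."""

        def _prepare_chat_history_for_request(self, chat_history: ChatHistory, **kwargs) -> str:
            # Assumed behaviour: render the history into the prompt string the local
            # ONNX model expects; the real signature and tokens may differ.
            rendered = ""
            for message in chat_history.messages:
                rendered += f"<|{message.role.value}|>\n{message.content}\n<|end|>\n"
            return rendered + "<|assistant|>\n"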

@eavanvalkenburg (Member) left a comment:

Couple of small improvements, but looking good!

python/semantic_kernel/connectors/ai/onnx/onnx_utils.py (3 outdated review threads, resolved)
nmoeller and others added 2 commits September 21, 2024 11:45
Co-authored-by: Eduard van Valkenburg <eavanvalkenburg@users.noreply.github.com>
Labels: python (Pull requests for the Python Semantic Kernel)
Projects: none yet
Development: merging this pull request may close Python: Add support for local models via ONNX #6761
4 participants