diff --git a/docs/ai/llm-spans.md b/docs/ai/llm-spans.md index 19c4162321..9ed8134795 100644 --- a/docs/ai/llm-spans.md +++ b/docs/ai/llm-spans.md @@ -2,7 +2,7 @@ linkTitle: LLM Calls ---> -# Semantic Conventions for LLM requests +# Semantic Conventions for LLM Spans **Status**: [Experimental][DocumentStatus] @@ -10,9 +10,10 @@ linkTitle: LLM Calls -- [LLM Request attributes](#llm-request-attributes) - [Configuration](#configuration) -- [Semantic Conventions for specific LLM technologies](#semantic-conventions-for-specific-llm-technologies) +- [LLM Request attributes](#llm-request-attributes) +- [LLM Response attributes](#llm-response-attributes) +- [LLM Span Events](#llm-span-events) @@ -35,65 +36,52 @@ By default, these configurations SHOULD NOT capture prompts and completions. These attributes track input data and metadata for a request to an LLM. Each attribute represents a concept that is common to most LLMs. - + | Attribute | Type | Description | Examples | Requirement Level | |---|---|---|---|---| -| `llm.vendor` | string | The name of the LLM foundation model vendor, if applicable. If not using a vendor-supplied model, this field is left blank. | `openai` | Recommended | -| `llm.request.model` | string | The name of the LLM a request is being made to. If the LLM is supplied by a vendor, then the value must be the exact name of the model requested. If the LLM is a fine-tuned custom model, the value SHOULD have a more specific name than the base model that's been fine-tuned. | `gpt-4` | Required | -| `llm.request.max_tokens` | int | The maximum number of tokens the LLM generates for a request. | `100` | Recommended | -| `llm.temperature` | float | The temperature setting for the LLM request. | `0.0` | Recommended | -| `llm.top_p` | float | The top_p sampling setting for the LLM request. | `1.0` | Recommended | -| `llm.stream` | bool | Whether the LLM responds with a stream. | `false` | Recommended | -| `llm.stop_sequences` | array | Array of strings the LLM uses as a stop sequence. | `["stop1"]` | Recommended | - -`llm.model` has the following list of well-known values. If one of them applies, then the respective value MUST be used, otherwise a custom value MAY be used. - -| Value | Description | -|---|---| -| `gpt-4` | GPT-4 | -| `gpt-4-32k` | GPT-4 with 32k context window | -| `gpt-3.5-turbo` | GPT-3.5-turbo | -| `gpt-3.5-turbo-16k` | GPT-3.5-turbo with 16k context window| -| `claude-instant-1` | Claude Instant (latest version) | -| `claude-2` | Claude 2 (latest version) | -| `other-llm` | Any LLM not listed in this table. Use for any fine-tuned version of a model. | +| [`llm.request.max_tokens`](../attributes-registry/llm.md) | int | The maximum number of tokens the LLM generates for a request. | `100` | Recommended | +| [`llm.request.model`](../attributes-registry/llm.md) | string | The name of the LLM a request is being made to. [1] | `gpt-4` | Required | +| [`llm.stop_sequences`](../attributes-registry/llm.md) | string | Array of strings the LLM uses as a stop sequence. | `stop1` | Recommended | +| [`llm.stream`](../attributes-registry/llm.md) | boolean | Whether the LLM responds with a stream. | `False` | Recommended | +| [`llm.temperature`](../attributes-registry/llm.md) | double | The temperature setting for the LLM request. | `0.0` | Recommended | +| [`llm.top_p`](../attributes-registry/llm.md) | double | The top_p sampling setting for the LLM request. 
| `1.0` | Recommended | +| [`llm.vendor`](../attributes-registry/llm.md) | string | The name of the LLM foundation model vendor, if applicable. [2] | `openai` | Recommended | + +**[1]:** The name of the LLM a request is being made to. If the LLM is supplied by a vendor, then the value must be the exact name of the model requested. If the LLM is a fine-tuned custom model, the value should have a more specific name than the base model that's been fine-tuned. + +**[2]:** The name of the LLM foundation model vendor, if applicable. If not using a vendor-supplied model, this field is left blank. ## LLM Response attributes These attributes track output data and metadata for a response from an LLM. Each attribute represents a concept that is common to most LLMs. - + | Attribute | Type | Description | Examples | Requirement Level | |---|---|---|---|---| -| `llm.response.id` | string | The unique identifier for the completion. | `chatcmpl-123` | Recommended | -| `llm.response.model` | string | The name of the LLM a response is being made to. If the LLM is supplied by a vendor, then the value must be the exact name of the model actually used. If the LLM is a fine-tuned custom model, the value SHOULD have a more specific name than the base model that's been fine-tuned. | `gpt-4-0613` | Required | -| `llm.response.finish_reason` | string | The reason the model stopped generating tokens | `stop` | Recommended | -| `llm.usage.prompt_tokens` | int | The number of tokens used in the LLM prompt. | `100` | Recommended | -| `llm.usage.completion_tokens` | int | The number of tokens used in the LLM response (completion). | `180` | Recommended | -| `llm.usage.total_tokens` | int | The total number of tokens used in the LLM prompt and response. | `280` | Recommended | - -`llm.response.finish_reason` MUST be one of the following: - -| Value | Description | -|---|---| -| `stop` | If the model hit a natural stop point or a provided stop sequence. | -| `max_tokens` | If the maximum number of tokens specified in the request was reached. | -| `tool_call` | If a function / tool call was made by the model (for models that support such functionality). | +| [`llm.response.finish_reason`](../attributes-registry/llm.md) | string | The reason the model stopped generating tokens. | `stop` | Recommended | +| [`llm.response.id`](../attributes-registry/llm.md) | string | The unique identifier for the completion. | `chatcmpl-123` | Recommended | +| [`llm.response.model`](../attributes-registry/llm.md) | string | The name of the LLM a response is being made to. [1] | `gpt-4-0613` | Required | +| [`llm.usage.completion_tokens`](../attributes-registry/llm.md) | int | The number of tokens used in the LLM response (completion). | `180` | Recommended | +| [`llm.usage.prompt_tokens`](../attributes-registry/llm.md) | int | The number of tokens used in the LLM prompt. | `100` | Recommended | +| [`llm.usage.total_tokens`](../attributes-registry/llm.md) | int | The total number of tokens used in the LLM prompt and response. | `280` | Recommended | + +**[1]:** The name of the LLM a response is being made to. If the LLM is supplied by a vendor, then the value must be the exact name of the model actually used. If the LLM is a fine-tuned custom model, the value should have a more specific name than the base model that's been fine-tuned. -## Events +## LLM Span Events In the lifetime of an LLM span, an event for prompts sent and completions received MAY be created, depending on the configuration of the instrumentation. 
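For illustration only, below is a minimal sketch of how an instrumentation might record these attributes and events with the OpenTelemetry Python API. The span and event names, the `client` object, and the `capture_content` flag are placeholders for this example and are not defined by these conventions; event naming is left to the technology-specific conventions.

```python
from opentelemetry import trace

tracer = trace.get_tracer(__name__)

def traced_completion(client, prompt: str, capture_content: bool = False) -> str:
    # Span name: a low-cardinality value describing the request, e.g. the endpoint called.
    with tracer.start_as_current_span("llm.completion") as span:
        # Request attributes.
        span.set_attribute("llm.vendor", "openai")
        span.set_attribute("llm.request.model", "gpt-4")
        span.set_attribute("llm.request.max_tokens", 100)
        span.set_attribute("llm.temperature", 0.0)
        span.set_attribute("llm.stream", False)

        result = client.complete(prompt)  # placeholder for the actual LLM call

        # Response attributes.
        span.set_attribute("llm.response.model", result.model)
        span.set_attribute("llm.response.finish_reason", result.finish_reason)
        span.set_attribute("llm.usage.prompt_tokens", result.prompt_tokens)
        span.set_attribute("llm.usage.completion_tokens", result.completion_tokens)
        span.set_attribute("llm.usage.total_tokens", result.total_tokens)

        # Prompt/completion content is only recorded as events when enabled.
        if capture_content:
            span.add_event("llm.content.prompt", attributes={"llm.prompt": prompt})
            span.add_event("llm.content.completion", attributes={"llm.completion": result.text})
        return result.text
```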
- + | Attribute | Type | Description | Examples | Requirement Level | -| `llm.prompt` | string | The full prompt string sent to an LLM in a request. If the LLM accepts a more complex input like a JSON object made up of several pieces (such as OpenAI's different message types), this field is blank, and the response is instead captured in an event determined by the specific LLM technology semantic convention. | `\n\nHuman:You are an AI assistant that tells jokes. Can you tell me a joke about OpenTelemetry?\n\nAssistant:` | Recommended | - +|---|---|---|---|---| +| [`llm.completion`](../attributes-registry/llm.md) | string | The full response string from an LLM in a response. [1] | `Why did the developer stop using OpenTelemetry? Because they couldnt trace their steps!` | Recommended | +| [`llm.prompt`](../attributes-registry/llm.md) | string | The full prompt string sent to an LLM in a request. [2] | `\\n\\nHuman:You are an AI assistant that tells jokes. Can you tell me a joke about OpenTelemetry?\\n\\nAssistant:` | Recommended | - -| Attribute | Type | Description | Examples | Requirement Level | -| `llm.completion` | string | The full response string from an LLM. If the LLM responds with a more complex output like a JSON object made up of several pieces (such as OpenAI's message choices), this field is the content of the response. If the LLM produces multiple responses, then this field is left blank, and each response is instead captured in an event determined by the specific LLM technology semantic convention.| `Why did the developer stop using OpenTelemetry? Because they couldn't trace their steps!` | Recommended | +**[1]:** The full response string from an LLM. If the LLM responds with a more complex output like a JSON object made up of several pieces (such as OpenAI's message choices), this field is the content of the response. If the LLM produces multiple responses, then this field is left blank, and each response is instead captured in an event determined by the specific LLM technology semantic convention. + +**[2]:** The full prompt string sent to an LLM in a request. If the LLM accepts a more complex input like a JSON object, this field is blank, and the response is instead captured in an event determined by the specific LLM technology semantic convention. [DocumentStatus]: https://github.com/open-telemetry/opentelemetry-specification/tree/v1.22.0/specification/document-status.md \ No newline at end of file diff --git a/docs/ai/openai.md b/docs/ai/openai.md index 4c7acf404a..7dcbee5a6d 100644 --- a/docs/ai/openai.md +++ b/docs/ai/openai.md @@ -14,53 +14,63 @@ focusing on the `/chat/completions` endpoint. By following to these guidelines, developers can ensure consistent, meaningful, and easily interpretable telemetry data across different applications and platforms. + + +- [Chat Completions](#chat-completions) + * [Request Attributes](#request-attributes) + * [Response attributes](#response-attributes) +- [OpenAI Span Events](#openai-span-events) + * [Prompt Events](#prompt-events) + * [Tools Events](#tools-events) + * [Choice Events](#choice-events) + + + ## Chat Completions The span name for OpenAI chat completions SHOULD be `openai.chat` to maintain consistency and clarity in telemetry data. -## Request Attributes +### Request Attributes These are the attributes when instrumenting OpenAI LLM requests with the `/chat/completions` endpoint. 
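Before the attribute table, a non-normative sketch of starting an `openai.chat` span and populating these request attributes with the OpenTelemetry Python API. The `create_chat_completion` helper and the shape of `request` (a dict mirroring the `/chat/completions` request body) are assumptions for the example, and the exact client method depends on the OpenAI SDK version in use.

```python
from opentelemetry import trace

tracer = trace.get_tracer(__name__)

def create_chat_completion(client, request: dict):
    # Span name per this convention.
    with tracer.start_as_current_span("openai.chat") as span:
        span.set_attribute("llm.vendor", "openai")
        span.set_attribute("llm.request.model", request["model"])
        # Optional request parameters are only recorded when present.
        for body_field, attribute in [
            ("max_tokens", "llm.request.max_tokens"),
            ("temperature", "llm.temperature"),
            ("top_p", "llm.top_p"),
            ("stream", "llm.stream"),
            ("user", "llm.openai.user"),
        ]:
            if body_field in request:
                span.set_attribute(attribute, request[body_field])

        # The call itself; method name assumed from the current OpenAI Python SDK.
        return client.chat.completions.create(**request)
```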
- + | Attribute | Type | Description | Examples | Requirement Level | |---|---|---|---|---| -| `llm.vendor` | string | The name of the LLM foundation model vendor, if applicable. If not using a vendor-supplied model, this field is left blank. | `openai` | Recommended | -| `llm.request.model` | string | The name of the LLM a request is being made to. If the LLM is supplied by a vendor, then the value must be the exact name of the model requested. If the LLM is a fine-tuned custom model, the value SHOULD have a more specific name than the base model that's been fine-tuned. | `gpt-4` | Required | -| `llm.request.max_tokens` | int | The maximum number of tokens the LLM generates for a request. | `100` | Recommended | -| `llm.temperature` | float | The temperature setting for the LLM request. | `0.0` | Recommended | -| `llm.top_p` | float | The top_p sampling setting for the LLM request. | `1.0` | Recommended | -| `llm.stream` | bool | Whether the LLM responds with a stream. | `false` | Recommended | -| `llm.stop_sequences` | array | Array of strings the LLM uses as a stop sequence. | `["stop1"]` | Recommended | -| `llm.openai.n` | integer | The number of completions to generate. | `1` | Recommended | -| `llm.openai.presence_penalty` | float | If present, the `presence_penalty` used in an OpenAI request. Value is between -2.0 and 2.0. | `-0.5` | Recommended | -| `llm.openai.frequency_penalty` | float | If present, the `frequency_penalty` used in an OpenAI request. Value is between -2.0 and 2.0. | `-0.5` | Recommended | -| `llm.openai.logit_bias` | string | If present, the JSON-encoded string of a `logit_bias` used in an OpenAI request. | `{2435:-100, 640:-100}` | Recommended | -| `llm.openai.user` | string | If present, the `user` used in an OpenAI request. | `bob` | Opt-in | -| `llm.openai.response_format` | string | An object specifying the format that the model must output. Either `text` or `json_object` | `text` | Recommended | -| `llm.openai.seed` | integer | Seed used in request to improve determinism. | `1234` | Recommended | +| [`llm.openai.logit_bias`](../attributes-registry/llm.md) | string | If present, the JSON-encoded string of a `logit_bias` used in an OpenAI request | `{2435:-100, 640:-100}` | Recommended | +| [`llm.openai.presence_penalty`](../attributes-registry/llm.md) | double | If present, the `presence_penalty` used in an OpenAI request. Value is between -2.0 and 2.0. | `-0.5` | Recommended | +| [`llm.openai.response_format`](../attributes-registry/llm.md) | string | An object specifying the format that the model must output. Either `text` or `json_object` | `text` | Recommended | +| [`llm.openai.user`](../attributes-registry/llm.md) | string | If present, the `user` used in an OpenAI request. | `bob` | Recommended | +| [`llm.request.max_tokens`](../attributes-registry/llm.md) | int | The maximum number of tokens the LLM generates for a request. | `100` | Recommended | +| [`llm.request.model`](../attributes-registry/llm.md) | string | The name of the LLM a request is being made to. [1] | `gpt-4` | Required | +| [`llm.stop_sequences`](../attributes-registry/llm.md) | string | Array of strings the LLM uses as a stop sequence. | `stop1` | Recommended | +| [`llm.stream`](../attributes-registry/llm.md) | boolean | Whether the LLM responds with a stream. | `False` | Recommended | +| [`llm.temperature`](../attributes-registry/llm.md) | double | The temperature setting for the LLM request. 
| `0.0` | Recommended | +| [`llm.top_p`](../attributes-registry/llm.md) | double | The top_p sampling setting for the LLM request. | `1.0` | Recommended | +| [`llm.vendor`](../attributes-registry/llm.md) | string | The name of the LLM foundation model vendor, if applicable. | `openai`; `microsoft` | Recommended | + +**[1]:** The name of the LLM a request is being made to. If the LLM is supplied by a vendor, then the value must be the exact name of the model requested. If the LLM is a fine-tuned custom model, the value should have a more specific name than the base model that's been fine-tuned. -## Response attributes +### Response attributes Attributes for chat completion responses SHOULD follow these conventions: - + | Attribute | Type | Description | Examples | Requirement Level | |---|---|---|---|---| -| `llm.response.id` | string | The unique identifier for the completion. | `chatcmpl-123` | Recommended | -| `llm.response.model` | string | The name of the LLM a response is being made to. If the LLM is supplied by a vendor, then the value must be the exact name of the model actually used. If the LLM is a fine-tuned custom model, the value SHOULD have a more specific name than the base model that's been fine-tuned. | `gpt-4-0613` | Required | -| `llm.response.finish_reason` | string | The reason the model stopped generating tokens | `stop` | Recommended | -| `llm.usage.prompt_tokens` | int | The number of tokens used in the LLM prompt. | `100` | Recommended | -| `llm.usage.completion_tokens` | int | The number of tokens used in the LLM response (completion). | `180` | Recommended | -| `llm.usage.total_tokens` | int | The total number of tokens used in the LLM prompt and response. | `280` | Recommended | -| `llm.openai.created` | int | The UNIX timestamp (in seconds) if when the completion was created. | `1677652288` | Recommended | -| `llm.openai.system_fingerprint` | string | This fingerprint represents the backend configuration that the model runs with. | asdf987123 | Recommended | +| [`llm.openai.created`](../attributes-registry/llm.md) | int | The UNIX timestamp (in seconds) of when the completion was created. | `1677652288` | Recommended | +| [`llm.openai.seed`](../attributes-registry/llm.md) | int | Seed used in request to improve determinism. | `1234` | Recommended | +| [`llm.response.finish_reason`](../attributes-registry/llm.md) | string | The reason the model stopped generating tokens. | `stop` | Recommended | +| [`llm.response.id`](../attributes-registry/llm.md) | string | The unique identifier for the completion. | `chatcmpl-123` | Recommended | +| [`llm.usage.completion_tokens`](../attributes-registry/llm.md) | int | The number of tokens used in the LLM response (completion). | `180` | Recommended | +| [`llm.usage.prompt_tokens`](../attributes-registry/llm.md) | int | The number of tokens used in the LLM prompt. | `100` | Recommended | +| [`llm.usage.total_tokens`](../attributes-registry/llm.md) | int | The total number of tokens used in the LLM prompt and response. | `280` | Recommended | -## Request Events +## OpenAI Span Events In the lifetime of an LLM span, an event for prompts sent and completions received MAY be created, depending on the configuration of the instrumentation. Because OpenAI uses a more complex prompt structure, these events will be used instead of the generic ones detailed in the [LLM Semantic Conventions](llm-spans.md).
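A short, non-normative sketch of copying response fields onto the span once the completion returns. The `response` value is assumed to be the parsed `/chat/completions` response body (the SDK object exposes the same field names).

```python
def record_chat_response(span, response: dict) -> None:
    # Identifiers and stop reason.
    span.set_attribute("llm.response.id", response["id"])
    span.set_attribute("llm.openai.created", response["created"])
    span.set_attribute("llm.response.finish_reason", response["choices"][0]["finish_reason"])

    # Token usage, when the server reports it.
    usage = response.get("usage")
    if usage:
        span.set_attribute("llm.usage.prompt_tokens", usage["prompt_tokens"])
        span.set_attribute("llm.usage.completion_tokens", usage["completion_tokens"])
        span.set_attribute("llm.usage.total_tokens", usage["total_tokens"])
```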
@@ -69,25 +79,25 @@ Because OpenAI uses a more complex prompt structure, these events will be used i ### Prompt Events Prompt event name SHOULD be `llm.openai.prompt`. - + | Attribute | Type | Description | Examples | Requirement Level | |---|---|---|---|---| -| `role` | string | The role of the prompt author, can be one of `system`, `user`, `assistant`, or `tool` | `system` | Required | -| `content` | string | The content for a given OpenAI response, denoted by ``. The value for `` starts with 0, where 0 is the first message. | `Why did the developer stop using OpenTelemetry? Because they couldn't trace their steps!` | Required | -| `tool_call_id` | string | If role is `tool` or `function`, then this tool call that this message is responding to. | `get_current_weather` | Conditionally Required: If `role` is `tool`. | +| [`llm.openai.content`](../attributes-registry/llm.md) | string | The content for a given OpenAI response. | `Why did the developer stop using OpenTelemetry? Because they couldn't trace their steps!` | Required | +| [`llm.openai.role`](../attributes-registry/llm.md) | string | The role of the prompt author, can be one of `system`, `user`, `assistant`, or `tool` | `user` | Required | +| [`llm.openai.tool_call.id`](../attributes-registry/llm.md) | string | If role is `tool` or `function`, then the ID of the tool call that this message is responding to. | `get_current_weather` | Conditionally Required: Required if the prompt role is `tool`. | ### Tools Events Tools event name SHOULD be `llm.openai.tool`, specifying potential tools or functions the LLM can use. - + | Attribute | Type | Description | Examples | Requirement Level | |---|---|---|---|---| -| `type` | string | They type of the tool. Currently, only `function` is supported. | `function` | Required | -| `function.name` | string | The name of the function to be called. | `get_weather` | Required ! -| `function.description` | string | A description of what the function does, used by the model to choose when and how to call the function. | `` | Required | -| `function.parameters` | string | JSON-encoded string of the parameter object for the function. | `{"type": "object", "properties": {}}` | Required | +| [`llm.openai.function.description`](../attributes-registry/llm.md) | string | A description of what the function does, used by the model to choose when and how to call the function. | `Gets the current weather for a location` | Required | +| [`llm.openai.function.name`](../attributes-registry/llm.md) | string | The name of the function to be called. | `get_weather` | Required | +| [`llm.openai.function.parameters`](../attributes-registry/llm.md) | string | JSON-encoded string of the parameter object for the function. | `{"type": "object", "properties": {}}` | Required | +| [`llm.openai.tool_call.type`](../attributes-registry/llm.md) | string | The type of the tool. Currently, only `function` is supported. | `function` | Required | ### Choice Events @@ -97,18 +107,27 @@ Span Events. Choice event name SHOULD be `llm.openai.choice`. -If there is more than one `tool_call`, separate events SHOULD be used. +If there is more than one `choice`, separate events SHOULD be used. - -| `type` | string | Either `delta` or `message`. | `message` | Required | + +| Attribute | Type | Description | Examples | Requirement Level | |---|---|---|---|---| -| `finish_reason` | string | The reason the OpenAI model stopped generating tokens for this chunk. | `stop` | Recommended | -| `role` | string | The assigned role for a given OpenAI response, denoted by ``.
The value for `` starts with 0, where 0 is the first message. | `system` | Required | -| `content` | string | The content for a given OpenAI response, denoted by ``. The value for `` starts with 0, where 0 is the first message. | `Why did the developer stop using OpenTelemetry? Because they couldn't trace their steps!` | Required | -| `tool_call.id` | string | If exists, the ID of the tool call. | `call_BP08xxEhU60txNjnz3z9R4h9` | Required | -| `tool_call.type` | string | Currently only `function` is supported. | `function` | Required | -| `tool_call.function.name` | string | If exists, the name of a function call for a given OpenAI response, denoted by ``. The value for `` starts with 0, where 0 is the first message. | `get_weather_report` | Required | -| `tool_call.function.arguments` | string | If exists, the arguments to call a function call with for a given OpenAI response, denoted by ``. The value for `` starts with 0, where 0 is the first message. | `{"type": "object", "properties": {"some":"data"}}` | Required | +| [`llm.openai.choice.type`](../attributes-registry/llm.md) | string | The type of the choice, either `delta` or `message`. | `message` | Required | +| [`llm.openai.content`](../attributes-registry/llm.md) | string | The content for a given OpenAI response. | `Why did the developer stop using OpenTelemetry? Because they couldn't trace their steps!` | Required | +| [`llm.openai.function.arguments`](../attributes-registry/llm.md) | string | If present, the arguments to call a function with for a given OpenAI response. | `{"type": "object", "properties": {"some":"data"}}` | Conditionally Required: [1] | +| [`llm.openai.function.name`](../attributes-registry/llm.md) | string | The name of the function to be called. | `get_weather` | Conditionally Required: [2] | +| [`llm.openai.role`](../attributes-registry/llm.md) | string | The role of the prompt author, can be one of `system`, `user`, `assistant`, or `tool` | `user` | Required | +| [`llm.openai.tool_call.id`](../attributes-registry/llm.md) | string | If role is `tool` or `function`, then the ID of the tool call that this message is responding to. | `get_current_weather` | Conditionally Required: [3] | +| [`llm.openai.tool_call.type`](../attributes-registry/llm.md) | string | The type of the tool. Currently, only `function` is supported. | `function` | Conditionally Required: [4] | +| [`llm.response.finish_reason`](../attributes-registry/llm.md) | string | The reason the model stopped generating tokens. | `stop` | Recommended | + +**[1]:** Required if the choice is the result of a tool call of type `function`. + +**[2]:** Required if the choice is the result of a tool call of type `function`. + +**[3]:** Required if the choice is the result of a tool call. + +**[4]:** Required if the choice is the result of a tool call.
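To tie the event tables together, a non-normative sketch of emitting one `llm.openai.prompt` event per request message and one `llm.openai.choice` event per returned choice. The `request` and `response` dicts are assumed to mirror the `/chat/completions` request and response bodies, and only a single tool call per choice is handled for brevity.

```python
def record_chat_events(span, request: dict, response: dict) -> None:
    # One prompt event per message sent in the request.
    for message in request.get("messages", []):
        attributes = {
            "llm.openai.role": message["role"],
            "llm.openai.content": message.get("content") or "",
        }
        if message["role"] == "tool":
            attributes["llm.openai.tool_call.id"] = message.get("tool_call_id", "")
        span.add_event("llm.openai.prompt", attributes=attributes)

    # One choice event per choice in the response.
    for choice in response.get("choices", []):
        message = choice["message"]
        attributes = {
            "llm.openai.choice.type": "message",  # would be "delta" for streamed chunks
            "llm.openai.role": message["role"],
            "llm.openai.content": message.get("content") or "",
            "llm.response.finish_reason": choice["finish_reason"],
        }
        for tool_call in message.get("tool_calls") or []:
            # Tool-call attributes apply only when the choice is the result of a tool call.
            attributes["llm.openai.tool_call.id"] = tool_call["id"]
            attributes["llm.openai.tool_call.type"] = tool_call["type"]
            attributes["llm.openai.function.name"] = tool_call["function"]["name"]
            attributes["llm.openai.function.arguments"] = tool_call["function"]["arguments"]
        span.add_event("llm.openai.choice", attributes=attributes)
```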
[DocumentStatus]: https://github.com/open-telemetry/opentelemetry-specification/tree/v1.22.0/specification/document-status.md \ No newline at end of file diff --git a/docs/attributes-registry/llm.md b/docs/attributes-registry/llm.md new file mode 100644 index 0000000000..8c203ba211 --- /dev/null +++ b/docs/attributes-registry/llm.md @@ -0,0 +1,125 @@ + + +# Large Language Model (LLM) + + + +- [Generic LLM Attributes](#generic-llm-attributes) + * [Request Attributes](#request-attributes) + * [Response Attributes](#response-attributes) + * [Event Attributes](#event-attributes) +- [OpenAI Attributes](#openai-attributes) + * [Request Attributes](#request-attributes-1) + * [Response Attributes](#response-attributes-1) + * [Event Attributes](#event-attributes-1) + + + +## Generic LLM Attributes + +### Request Attributes + + +| Attribute | Type | Description | Examples | +|---|---|---|---| +| `llm.request.max_tokens` | int | The maximum number of tokens the LLM generates for a request. | `100` | +| `llm.request.model` | string | The name of the LLM a request is being made to. | `gpt-4` | +| `llm.stop_sequences` | string | Array of strings the LLM uses as a stop sequence. | `stop1` | +| `llm.stream` | boolean | Whether the LLM responds with a stream. | `False` | +| `llm.temperature` | double | The temperature setting for the LLM request. | `0.0` | +| `llm.top_p` | double | The top_p sampling setting for the LLM request. | `1.0` | +| `llm.vendor` | string | The name of the LLM foundation model vendor, if applicable. | `openai` | + + +### Response Attributes + + +| Attribute | Type | Description | Examples | +|---|---|---|---| +| `llm.response.finish_reason` | string | The reason the model stopped generating tokens. | `stop` | +| `llm.response.id` | string | The unique identifier for the completion. | `chatcmpl-123` | +| `llm.response.model` | string | The name of the LLM a response is being made to. | `gpt-4-0613` | +| `llm.usage.completion_tokens` | int | The number of tokens used in the LLM response (completion). | `180` | +| `llm.usage.prompt_tokens` | int | The number of tokens used in the LLM prompt. | `100` | +| `llm.usage.total_tokens` | int | The total number of tokens used in the LLM prompt and response. | `280` | + + +### Event Attributes + + +| Attribute | Type | Description | Examples | +|---|---|---|---| +| `llm.completion` | string | The full response string from an LLM in a response. | `Why did the developer stop using OpenTelemetry? Because they couldnt trace their steps!` | +| `llm.prompt` | string | The full prompt string sent to an LLM in a request. | `\\n\\nHuman:You are an AI assistant that tells jokes. Can you tell me a joke about OpenTelemetry?\\n\\nAssistant:` | + + +## OpenAI Attributes + +### Request Attributes + + +| Attribute | Type | Description | Examples | +|---|---|---|---| +| `llm.openai.frequency_penalty` | double | If present, the `frequency_penalty` used in an OpenAI request. Value is between -2.0 and 2.0. | `-0.5` | +| `llm.openai.logit_bias` | string | If present, the JSON-encoded string of a `logit_bias` used in an OpenAI request | `{2435:-100, 640:-100}` | +| `llm.openai.presence_penalty` | double | If present, the `presence_penalty` used in an OpenAI request. Value is between -2.0 and 2.0. | `-0.5` | +| `llm.openai.response_format` | string | An object specifying the format that the model must output. Either `text` or `json_object` | `text` | +| `llm.openai.seed` | int | Seed used in request to improve determinism. 
| `1234` | +| `llm.openai.user` | string | If present, the `user` used in an OpenAI request. | `bob` | + +`llm.openai.response_format` MUST be one of the following: + +| Value | Description | +|---|---| +| `text` | text | +| `json_object` | json_object | + + +### Response Attributes + + +| Attribute | Type | Description | Examples | +|---|---|---|---| +| `llm.openai.created` | int | The UNIX timestamp (in seconds) of when the completion was created. | `1677652288` | +| `llm.openai.system_fingerprint` | string | This fingerprint represents the backend configuration that the model runs with. | `asdf987123` | + + +### Event Attributes + + +| Attribute | Type | Description | Examples | +|---|---|---|---| +| `llm.openai.choice.type` | string | The type of the choice, either `delta` or `message`. | `message` | +| `llm.openai.content` | string | The content for a given OpenAI response. | `Why did the developer stop using OpenTelemetry? Because they couldn't trace their steps!` | +| `llm.openai.finish_reason` | string | The reason the OpenAI model stopped generating tokens for this chunk. | `stop` | +| `llm.openai.function.arguments` | string | If present, the arguments to call a function with for a given OpenAI response. | `{"type": "object", "properties": {"some":"data"}}` | +| `llm.openai.function.description` | string | A description of what the function does, used by the model to choose when and how to call the function. | `Gets the current weather for a location` | +| `llm.openai.function.name` | string | The name of the function to be called. | `get_weather` | +| `llm.openai.function.parameters` | string | JSON-encoded string of the parameter object for the function. | `{"type": "object", "properties": {}}` | +| `llm.openai.role` | string | The role of the prompt author, can be one of `system`, `user`, `assistant`, or `tool` | `user` | +| `llm.openai.tool_call.id` | string | If role is `tool` or `function`, then the ID of the tool call that this message is responding to. | `get_current_weather` | +| `llm.openai.tool_call.type` | string | The type of the tool. Currently, only `function` is supported. | `function` | + +`llm.openai.choice.type` MUST be one of the following: + +| Value | Description | +|---|---| +| `delta` | delta | +| `message` | message | + +`llm.openai.role` MUST be one of the following: + +| Value | Description | +|---|---| +| `system` | system | +| `user` | user | +| `assistant` | assistant | +| `tool` | tool | + +`llm.openai.tool_call.type` MUST be one of the following: + +| Value | Description | +|---|---| +| `function` | function | + \ No newline at end of file diff --git a/model/registry/llm.yaml b/model/registry/llm.yaml new file mode 100644 index 0000000000..60912165db --- /dev/null +++ b/model/registry/llm.yaml @@ -0,0 +1,194 @@ +groups: + - id: registry.llm + prefix: llm + type: attribute_group + brief: > + This document defines the attributes used to describe telemetry in the context of LLM (Large Language Models) requests and responses. + attributes: + - id: vendor + type: string + brief: The name of the LLM foundation model vendor, if applicable. + examples: 'openai' + tag: llm-generic-request + - id: request.model + type: string + brief: The name of the LLM a request is being made to. + examples: 'gpt-4' + tag: llm-generic-request + - id: request.max_tokens + type: int + brief: The maximum number of tokens the LLM generates for a request.
+ examples: [100] + tag: llm-generic-request + - id: temperature + type: double + brief: The temperature setting for the LLM request. + examples: [0.0] + tag: llm-generic-request + - id: top_p + type: double + brief: The top_p sampling setting for the LLM request. + examples: [1.0] + tag: llm-generic-request + - id: stream + type: boolean + brief: Whether the LLM responds with a stream. + examples: [false] + tag: llm-generic-request + - id: stop_sequences + type: string + brief: Array of strings the LLM uses as a stop sequence. + examples: ["stop1"] + tag: llm-generic-request + - id: response.id + type: string + brief: The unique identifier for the completion. + examples: ['chatcmpl-123'] + tag: llm-generic-response + - id: response.model + type: string + brief: The name of the LLM a response is being made to. + examples: ['gpt-4-0613'] + tag: llm-generic-response + - id: response.finish_reason + type: string + brief: The reason the model stopped generating tokens. + examples: ['stop'] + tag: llm-generic-response + - id: usage.prompt_tokens + type: int + brief: The number of tokens used in the LLM prompt. + examples: [100] + tag: llm-generic-response + - id: usage.completion_tokens + type: int + brief: The number of tokens used in the LLM response (completion). + examples: [180] + tag: llm-generic-response + - id: usage.total_tokens + type: int + brief: The total number of tokens used in the LLM prompt and response. + examples: [280] + tag: llm-generic-response + - id: prompt + type: string + brief: The full prompt string sent to an LLM in a request. + examples: ['\\n\\nHuman:You are an AI assistant that tells jokes. Can you tell me a joke about OpenTelemetry?\\n\\nAssistant:'] + tag: llm-generic-events + - id: completion + type: string + brief: The full response string from an LLM in a response. + examples: ['Why did the developer stop using OpenTelemetry? Because they couldnt trace their steps!'] + tag: llm-generic-events + - id: openai.presence_penalty + type: double + brief: If present, the `presence_penalty` used in an OpenAI request. Value is between -2.0 and 2.0. + examples: -0.5 + tag: tech-specific-openai-request + - id: openai.frequency_penalty + type: double + brief: If present, the `frequency_penalty` used in an OpenAI request. Value is between -2.0 and 2.0. + examples: -0.5 + tag: tech-specific-openai-request + - id: openai.logit_bias + type: string + brief: If present, the JSON-encoded string of a `logit_bias` used in an OpenAI request + examples: ['{2435:-100, 640:-100}'] + tag: tech-specific-openai-request + - id: openai.user + type: string + brief: If present, the `user` used in an OpenAI request. + examples: ['bob'] + tag: tech-specific-openai-request + - id: openai.response_format + type: + members: + - id: text + value: 'text' + - id: json_object + value: 'json_object' + brief: An object specifying the format that the model must output. Either `text` or `json_object` + examples: 'text' + tag: tech-specific-openai-request + - id: openai.seed + type: int + brief: Seed used in request to improve determinism. + examples: 1234 + tag: tech-specific-openai-request + - id: openai.created + type: int + brief: The UNIX timestamp (in seconds) of when the completion was created. + examples: 1677652288 + tag: tech-specific-openai-response + - id: openai.system_fingerprint + type: string + brief: This fingerprint represents the backend configuration that the model runs with.
+ examples: 'asdf987123' + tag: tech-specific-openai-response + - id: openai.role + type: + members: + - id: system + value: 'system' + - id: user + value: 'user' + - id: assistant + value: 'assistant' + - id: tool + value: 'tool' + brief: The role of the prompt author, can be one of `system`, `user`, `assistant`, or `tool` + examples: 'user' + tag: tech-specific-openai-events + - id: openai.content + type: string + brief: The content for a given OpenAI response. + examples: 'Why did the developer stop using OpenTelemetry? Because they couldn''t trace their steps!' + tag: tech-specific-openai-events + - id: openai.function.name + type: string + brief: The name of the function to be called. + examples: 'get_weather' + tag: tech-specific-openai-events + - id: openai.function.description + type: string + brief: A description of what the function does, used by the model to choose when and how to call the function. + examples: 'Gets the current weather for a location' + tag: tech-specific-openai-events + - id: openai.function.parameters + type: string + brief: JSON-encoded string of the parameter object for the function. + examples: '{"type": "object", "properties": {}}' + tag: tech-specific-openai-events + - id: openai.function.arguments + type: string + brief: If present, the arguments to call a function with for a given OpenAI response. + examples: '{"type": "object", "properties": {"some":"data"}}' + tag: tech-specific-openai-events + - id: openai.finish_reason + type: string + brief: The reason the OpenAI model stopped generating tokens for this chunk. + examples: 'stop' + tag: tech-specific-openai-events + - id: openai.tool_call.id + type: string + brief: If role is `tool` or `function`, then the ID of the tool call that this message is responding to. + examples: 'get_current_weather' + tag: tech-specific-openai-events + - id: openai.tool_call.type + type: + members: + - id: function + value: 'function' + brief: The type of the tool. Currently, only `function` is supported. + examples: 'function' + tag: tech-specific-openai-events + - id: openai.choice.type + type: + members: + - id: delta + value: 'delta' + - id: message + value: 'message' + brief: The type of the choice, either `delta` or `message`. + examples: 'message' + tag: tech-specific-openai-events \ No newline at end of file diff --git a/model/trace/llm.yaml b/model/trace/llm.yaml new file mode 100644 index 0000000000..a4ee102374 --- /dev/null +++ b/model/trace/llm.yaml @@ -0,0 +1,164 @@ +groups: + - id: llm.request + type: span + brief: > + A request to an LLM is modeled as a span in a trace. The span name should be a low cardinality value representing the request made to an LLM, like the name of the API endpoint being called. + attributes: + - ref: llm.vendor + requirement_level: recommended + note: > + The name of the LLM foundation model vendor, if applicable. If not using a vendor-supplied model, this field is left blank. + - ref: llm.request.model + requirement_level: required + note: > + The name of the LLM a request is being made to. If the LLM is supplied by a vendor, then the value must be the exact name of the model requested. If the LLM is a fine-tuned custom model, the value should have a more specific name than the base model that's been fine-tuned.
+ - ref: llm.request.max_tokens + requirement_level: recommended + - ref: llm.temperature + requirement_level: recommended + - ref: llm.top_p + requirement_level: recommended + - ref: llm.stream + requirement_level: recommended + - ref: llm.stop_sequences + requirement_level: recommended + + - id: llm.response + type: span + brief: > + These attributes track output data and metadata for a response from an LLM. Each attribute represents a concept that is common to most LLMs. + attributes: + - ref: llm.response.id + requirement_level: recommended + - ref: llm.response.model + requirement_level: required + note: > + The name of the LLM a response is being made to. If the LLM is supplied by a vendor, then the value must be the exact name of the model actually used. If the LLM is a fine-tuned custom model, the value should have a more specific name than the base model that's been fine-tuned. + - ref: llm.response.finish_reason + requirement_level: recommended + - ref: llm.usage.prompt_tokens + requirement_level: recommended + - ref: llm.usage.completion_tokens + requirement_level: recommended + - ref: llm.usage.total_tokens + requirement_level: recommended + + - id: llm.events + type: span + brief: > + In the lifetime of an LLM span, events for prompts sent and completions received may be created, depending on the configuration of the instrumentation. + attributes: + - ref: llm.prompt + requirement_level: recommended + note: > + The full prompt string sent to an LLM in a request. If the LLM accepts a more complex input like a JSON object, this field is blank, and the response is instead captured in an event determined by the specific LLM technology semantic convention. + - ref: llm.completion + requirement_level: recommended + note: > + The full response string from an LLM. If the LLM responds with a more complex output like a JSON object made up of several pieces (such as OpenAI's message choices), this field is the content of the response. If the LLM produces multiple responses, then this field is left blank, and each response is instead captured in an event determined by the specific LLM technology semantic convention. + + - id: llm.openai + type: span + brief: > + These are the attributes when instrumenting OpenAI LLM requests with the `/chat/completions` endpoint. + attributes: + - ref: llm.vendor + requirement_level: recommended + examples: ['openai', 'microsoft'] + tag: tech-specific-openai-request + - ref: llm.request.model + requirement_level: required + note: > + The name of the LLM a request is being made to. If the LLM is supplied by a vendor, then the value must be the exact name of the model requested. If the LLM is a fine-tuned custom model, the value should have a more specific name than the base model that's been fine-tuned. 
tag: tech-specific-openai-request + - ref: llm.request.max_tokens + tag: tech-specific-openai-request + - ref: llm.temperature + tag: tech-specific-openai-request + - ref: llm.top_p + tag: tech-specific-openai-request + - ref: llm.stream + tag: tech-specific-openai-request + - ref: llm.stop_sequences + tag: tech-specific-openai-request + - ref: llm.openai.presence_penalty + tag: tech-specific-openai-request + - ref: llm.openai.logit_bias + tag: tech-specific-openai-request + - ref: llm.openai.user + tag: tech-specific-openai-request + - ref: llm.openai.response_format + tag: tech-specific-openai-request + - ref: llm.openai.seed + tag: tech-specific-openai-response + - ref: llm.response.id + tag: tech-specific-openai-response + - ref: llm.response.finish_reason + tag: tech-specific-openai-response + - ref: llm.usage.prompt_tokens + tag: tech-specific-openai-response + - ref: llm.usage.completion_tokens + tag: tech-specific-openai-response + - ref: llm.usage.total_tokens + tag: tech-specific-openai-response + - ref: llm.openai.created + tag: tech-specific-openai-response + - ref: llm.openai.system_fingerprint + tag: tech-specific-openai-response + + - id: llm.openai.prompt + type: span + brief: > + These are the attributes when instrumenting OpenAI LLM requests and recording prompts in the request. + attributes: + - ref: llm.openai.role + requirement_level: required + - ref: llm.openai.content + requirement_level: required + - ref: llm.openai.tool_call.id + requirement_level: + conditionally_required: > + Required if the prompt role is `tool`. + + - id: llm.openai.tool + type: span + brief: > + These are the attributes when instrumenting OpenAI LLM requests that specify tools (or functions) the LLM can use. + attributes: + - ref: llm.openai.tool_call.type + requirement_level: required + - ref: llm.openai.function.name + requirement_level: required + - ref: llm.openai.function.description + requirement_level: required + - ref: llm.openai.function.parameters + requirement_level: required + + - id: llm.openai.choice + type: span + brief: > + These are the attributes when instrumenting OpenAI LLM requests and recording choices in the result. + attributes: + - ref: llm.openai.choice.type + requirement_level: required + - ref: llm.response.finish_reason + - ref: llm.openai.role + requirement_level: required + - ref: llm.openai.content + requirement_level: required + - ref: llm.openai.tool_call.id + requirement_level: + conditionally_required: > + Required if the choice is the result of a tool call. + - ref: llm.openai.tool_call.type + requirement_level: + conditionally_required: > + Required if the choice is the result of a tool call. + - ref: llm.openai.function.name + requirement_level: + conditionally_required: > + Required if the choice is the result of a tool call of type `function`. + - ref: llm.openai.function.arguments + requirement_level: + conditionally_required: > + Required if the choice is the result of a tool call of type `function`.