
New component: Blob Attribute Uploader Connector #33737

Open
3 tasks
michaelsafyan opened this issue Jun 24, 2024 · 16 comments
Labels
needs triage New item requiring triage Sponsor Needed New component seeking sponsor

Comments

@michaelsafyan
Contributor

michaelsafyan commented Jun 24, 2024

The purpose and use-cases of the new component

The Blob Attribute Uploader Connector takes selected attributes (from spans, span events, logs, etc.) and:

  • Writes them to a large blob storage system
  • Replaces them in the original signal with a "Foreign Attribute Reference" referencing the URI of where it was written
  • Forwards the signal to a pipeline for the same signal type for further processing, export

This component is intended to address a number of concerns:

  • Sensitivity of data: certain data may be necessary to retain for debugging but may not be suitable for access by all oncallers or others with access to general operational data; writing certain attributes to a separate blob storage system may allow for finer-grained, alternative access restrictions to be applied compared with the general ops backend.
  • Size of the data: some operational backends may have limitations around the size of the data they can receive; sending large attributes to a separate blob storage backend may avoid these limitations.
  • Costs of storage: while most operational data may need to be available quickly to address incidents, certain attributes may be needed to be accessed less frequently and may be suitable for lower cost, long-term storage options.

Motivating Examples:

  • HTTP request/response pairs stored in span attributes (http.request.body.content and http.response.body.content)
  • LLM prompt/response pairs stored in span event attributes ( gen_ai.prompt and gen_ai.completion)

Use Cases Related to the Examples:

  • Additional access restrictions are needed beyond those of the general operations solution; writing to a separate blob storage system allows additional access controls to be applied. Links to the destination enable the results to be located in a separate backend storage system that provides the necessary access checks.

  • Full requests/responses are used rarely by oncallers, typically only when an end user opens a ticket through the support mechanism; writing this data to a separate, low-cost storage system allows the user to save on ops storage costs.

Example configuration for the component (subject to change)

The following is intended to illustrate the general idea, but is subject to change:

The configuration consists of a list of ConfigStanzas:

config := LIST[ConfigStanza]

Each config stanza defines how it will handle exactly one type of attribute. The properties of the stanza are:

  • match_attribute_key: (REQUIRED) The exact attribute key to match (e.g. http.request.body.content)
  • match_attribute_only_in: (OPTIONAL) Allows the key to be matched in only a specific part of the signal.
    • Supported values include:
      • SPAN: only look at span-level attributes (not resource, scope, or event attributes)
      • RESOURCE: only look at resource-level attributes (not span, scope, or event attributes)
      • SCOPE: only look at scope-level attributes (not span, resource, or event attributes)
      • EVENT: only look at event-level attributes (not span, resource, or scope attributes)
  • destination_uri: (REQUIRED) The pattern to which to write the data.
    • Ex: gs://example-bucket/full-http/request/payloads/${trace_id}/${span_id}.txt
    • Patterns may reference other parts of the signal, including:
      • trace_id
      • span_id
      • resource.attributes
      • span.attributes
      • scope.attributes
    • Keys can be referenced with dot or bracket notation (e.g. span.attributes.foo or span.attributes[foo]).
  • content_type: (OPTIONAL) Indicates the content type of the attribute (default: AUTO)
    • Options include:
      • AUTO: attempt to infer the content type automatically
      • extract_from: expr: derive it from other information in the signal
        - Ex: extract_from: span.attributes["http.request.header.content-type"]
      • any literal string (e.g. "application/json"): to use a static value
  • fraction_to_write: (OPTIONAL) Allows downsampling of the payloads. Defaults to 1.0 (i.e., 100%)
  • fraction_written_behavior: (OPTIONAL) Defaults to REPLACE_WITH_REFERENCE.
    • Options include:
      • REPLACE_WITH_REFERENCE: replace the value with a reference to the destination location.
      • KEEP: the write is a copy, but the original data is not altered.
      • DROP: the fact that a write happened will not be recorded in the attribute
  • fraction_not_written_behavior: (OPTIONAL) Defaults to DROP.
    • Options include:
      • DROP: remove the attribute in its entirety
      • KEEP: don't modify the original data if this fraction wasn't matched

Here is a full example with the above in mind:

 - match_attribute_key: http.request.body.content
   match_attribute_only_in: SPAN
   destination_uri: "gs://${env.GCS_BUCKET}/${trace_id}/${span_id}/request.json"
   content_type: "application/json"

 - match_attribute_key: http.response.body.content
   match_attribute_only_in: SPAN
   destination_uri: "gs://${env.GCS_BUCKET}/${trace_id}/${span_id}/response.json"
   content_type: "application/json"
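The ${...} pattern interpolation in destination_uri could be implemented along these lines (a Python sketch of the substitution rules described above, illustrative only; the helper name is an assumption, and the actual component may lean on OTTL instead):

```python
# Illustrative sketch of ${...} placeholder interpolation for destination_uri.
import re

def interpolate(pattern: str, context: dict) -> str:
    """Replace ${key} placeholders with values looked up in `context`.

    Keys may use dot notation (span.attributes.foo) or bracket notation
    (span.attributes[foo]); both normalize to the same lookup path.
    """
    def lookup(match):
        key = match.group(1)
        # Normalize bracket notation to dot notation.
        key = re.sub(r"\[([^\]]+)\]", r".\1", key)
        value = context
        for part in key.split("."):
            value = value[part]
        return str(value)
    return re.sub(r"\$\{([^}]+)\}", lookup, pattern)

ctx = {
    "trace_id": "0123456789abcdef",
    "span_id": "89abcdef",
    "span": {"attributes": {"foo": "bar"}},
}
assert interpolate("gs://b/${trace_id}/${span_id}.txt", ctx) == \
    "gs://b/0123456789abcdef/89abcdef.txt"
assert interpolate("${span.attributes[foo]}", ctx) == "bar"
assert interpolate("${span.attributes.foo}", ctx) == "bar"
```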

Telemetry data types supported

Traces

Is this a vendor-specific component?

  • This is a vendor-specific component
  • If this is a vendor-specific component, I am a member of the OpenTelemetry organization.
  • If this is a vendor-specific component, I am proposing to contribute and support it as a representative of the vendor.

Code Owner(s)

braydonk, michaelsafyan, dashpole

Sponsor (optional)

dashpole

Additional context

No response

@michaelsafyan michaelsafyan added needs triage New item requiring triage Sponsor Needed New component seeking sponsor labels Jun 24, 2024
@dashpole
Contributor

I am willing to potentially sponsor this, but I would love to see if any others have needed to store very large or sensitive attributes separately. I plan to raise this tomorrow at the SIG meeting.

@dashpole
Contributor

I raised this at the SIG meeting today, but this wasn't an issue people on the call had run into before.

@dashpole
Contributor

There is some consideration of moving the "larger" genai attributes. open-telemetry/semantic-conventions#483 (comment)

@karthikscale3

We at Langtrace are also interested in testing out this span processor, as we are also thinking about this problem. We currently have 2 GenAI OTEL instrumentation libraries: Python and TypeScript.

@lmolkova

The LLM Semconv WG is considering reporting prompts and completions in event payloads (and breaking them down into individual structured pieces) - open-telemetry/semantic-conventions#980

Still, there is a possibility that prompts/completion messages could be big. There is interest in the community to record generated images, audio, etc for debugging/evaluation purposes.

From a general semconv perspective, we don't usually define span attributes that may contain unbounded data (gen_ai.prompt and completion are temporary exceptions), and are likely to recommend events/logs payloads for this.

In this context, it could make sense to also support blob uploads with a LogProcessor. See also open-telemetry/semantic-conventions#1217, where similar concerns have been raised for logs.

@michaelsafyan
Contributor Author

In the interests of transparency, I have started related work on this here:

https://github.com/michaelsafyan/open-telemetry.opentelemetry-collector-contrib/tree/blob_writer_span_processor

I originally started with a "processor", but I'm having doubts about whether this functionality is possible with a processor, and am now looking into representing it as an "exporter" that wraps another exporter (but perhaps this is incorrect?). In any event, the (very early, not yet complete) code is in development here:

https://github.com/michaelsafyan/open-telemetry.opentelemetry-collector-contrib/tree/blob_writer_span_processor/exporter/blobattributeexporter

I appreciate the insight that this may shift to a different representation... with that in mind, I am going to try to make this more general. While I will start with span attributes to handle current representations, I will keep the naming general and allow this to grow to address write-aside to blob storage from other signal types and other parts of the signal.

@michaelsafyan
Contributor Author

Quick Status update:

  • Still working on this
  • Current ETA expectation is ~2 weeks to get a working demo

Will give another update in 2 weeks time or when this is working, whichever is sooner.

@michaelsafyan
Contributor Author

Apologies that this is taking longer than expected. I am, however, still working on this.

@michaelsafyan
Contributor Author

The general shape of this is now present and can be found in:

https://github.com/michaelsafyan/open-telemetry.opentelemetry-collector-contrib/tree/blob_writer_span_processor/connector/blobattributeuploadconnector

I still need to polish this and create end-to-end testing, but there is probably enough here to get early feedback.

Note that while the original scope was intended to focus on spans, the above covers BOTH spans AND span events, given the pivot of the GenAI semantic conventions towards span event attributes.

I also pivoted from hand-rolling the string interpolation, to trying to leverage OTTL to do it:

... this required some hackery in OTTL, though, and I am wondering if there is an even cleaner approach than this.

@codefromthecrypt
Contributor

@michaelsafyan thanks! To catch you up to date, the current semconv 1.27.0 already uses span events, so this is relevant.

What remains an open question for many is the change to log events. For example, not all backends know what to do with them, and there is some implied indexing. So, I would expect that once this is in, folks will want to transform log events (with span context) back to span events.

Do you feel up to adding a function like interpolateSpanEvent to do that? Something like logEventWithSpanContextToSpanEvent?

@michaelsafyan
Contributor Author

@codefromthecrypt can you elaborate on what you mean by "folks will want to transform log events (with span context) back to span events"? Is that so that separate logs can get processed by this connector?

The way that I'm thinking about this is that blobattributeuploadconnector will be a generic component that enables:

  1. Uploading attribute content to a blob storage destination.
  2. Replacing the original attribute value with a "Foreign Attribute Reference" (see foreignattr.go)

What I have there now targets:

  • span attributes
  • span event attributes

A logical expansion of this logic would be to also handle:

  • log attributes
  • (maybe?) log body

Other types of conversions (such as span events to logs, or logs back into span events) make sense and would be useful, but should probably be considered out of scope for this particular component and tracked in a separate issue. That said, I agree it is important for different users to be able to decide whether their events data is recorded as events attached to a span or as separate logs (and a connector is likely a good way to implement that).
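As a rough illustration of step 2 above (a Python sketch; the field and function names here are assumptions, not the actual contents of foreignattr.go, and the real component is Go):

```python
# Hypothetical sketch of replacing an attribute value with a
# "Foreign Attribute Reference"; field names are assumptions.
def replace_with_foreign_ref(attributes: dict, key: str,
                             uri: str, content_type: str) -> dict:
    """Swap a large attribute value for a small reference to the uploaded blob."""
    if key in attributes:
        attributes[key] = {
            "ref.uri": uri,                    # where the blob was written
            "ref.content_type": content_type,  # what the blob contains
        }
    return attributes

attrs = {"http.request.body.content": "...large payload..."}
out = replace_with_foreign_ref(
    attrs, "http.request.body.content",
    "gs://example-bucket/abc123/def456/request.json", "application/json")
assert out["http.request.body.content"]["ref.uri"].startswith("gs://")
assert out["http.request.body.content"]["ref.content_type"] == "application/json"
```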

@codefromthecrypt
Contributor

@michaelsafyan so the main question about log events was in relation to the genai spec, which is about to switch to them. Since this spec is noted in the description, that's why I thought it might be in scope for this change/PR.

What do you think is a better place to move the topic of transform "span events to log events" to? If you don't have a specific idea, I'll open a new issue, just didn't want to duplicate this, if it was in scope.

@michaelsafyan
Contributor Author

I think new, separate issues for "Log Events -> Span Event Connector" and "Span Events -> Logs Connector" would make sense.

@michaelsafyan michaelsafyan changed the title New component: blob writer span processor New component: Blob Attribute Uploader Connector Aug 14, 2024
@codefromthecrypt
Contributor

cool. I opened #34695 first, and if I made any mistakes in the description please correct them if you have the karma to do so, or ask me to if you don't.

@michaelsafyan
Contributor Author

Just providing another update, since it has been a while.

I was out on vacation last week and had other work to catch up on this past week.

I am hoping to resume this work this coming week.

This is still on my plate.

@michaelsafyan
Contributor Author

Quick status update:

  • I believe the code (for spans and span events) is largely complete, but bugs may turn up as tests are written
  • Iterating on unit tests (traces_test.go).

I am, however, encountering merge conflicts when attempting to sync from upstream ... so this may require some additional work to resolve.
