Inconsistent behaviour on filter processor #18756

Closed
kedare opened this issue Feb 17, 2023 · 10 comments
Labels
bug (Something isn't working) · closed as inactive · priority:p2 (Medium) · processor/filter (Filter processor) · Stale

Comments

@kedare

kedare commented Feb 17, 2023

Component(s)

processor/filter

What happened?

Description

I have been trying to set up a filter to drop a few spans from my traces, matching all of the following conditions (AND):

  • attribute http.host matches the regexp .*\.blob\.core\.windows\.net
  • service.name == cardiolib
  • otel.library.name == opentelemetry.instrumentation.requests

I have been following the documentation and tried two ways to achieve this:

  processors:
    filter/cardiolib-skip-blob-requests:
      traces:
        span:
          - 'IsMatch(attributes["http.host"], ".*\\.blob\\.core\\.windows\\.net") == true and attributes["otel.library.name"] == "opentelemetry.instrumentation.requests" and resource.attributes["service.name"] == "cardiolib"'

With this filter, somehow, everything is dropped (from all services), and I can't explain why.

If I change the span filter to this one instead

'attributes["otel.library.name"] == "opentelemetry.instrumentation.requests" and resource.attributes["service.name"] == "cardiolib"'

Then nothing gets dropped. I don't get how this IsMatch can affect the whole pipeline; it looks like it's not even evaluating the rest of the conditions?

The documentation says:

If any condition is met, the telemetry is dropped (each condition is ORed together)

But from what I understand, this refers to the list of conditions under the span key, and not to the sub-conditions inside a single list element? (If so, is the and statement useless?)
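For example, as I read it, these two variants should behave differently (just a sketch of my understanding, with made-up processor names, not something I have verified):

  processors:
    # One list item: every sub-condition joined with `and` must match for a span to be dropped
    filter/drop-if-all-match:
      traces:
        span:
          - 'attributes["http.method"] == "PUT" and resource.attributes["service.name"] == "cardiolib"'

    # Two list items: a span is dropped if either condition matches (the list items are ORed)
    filter/drop-if-any-match:
      traces:
        span:
          - 'attributes["http.method"] == "PUT"'
          - 'resource.attributes["service.name"] == "cardiolib"'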

I also tried to use the attribute filters this way, but it doesn't have any effect either (with the full FQDN instead of a regexp, to rule out a regexp issue):

config:
  processors:
    filter:
      spans:
        exclude:
          match_type: regexp
          services:
            - cardiolib
          attributes:
            - Key: http.host
              Value: "xxx.blob.core.windows.net"
          libraries:
            - Name: opentelemetry.instrumentation.requests

Steps to Reproduce

Expected Result

Actual Result

Collector version

otel/opentelemetry-collector-contrib:0.67.0

Environment information

Environment

Running on Kubernetes, Datadog upstream

OpenTelemetry Collector configuration

config:
  receivers:
    prometheus:
      config:
        scrape_configs:
        - job_name: 'otelcol'
          scrape_interval: 10s
          static_configs:
          - targets: ['0.0.0.0:8888']

  processors:
    filter/test1:
      spans:
        exclude:
          match_type: regexp
          services:
            - cardiolib
          attributes:
            - key: http.host
              value: xxx.blob.core.windows.net
            - key: http.method
              value: PUT
    filter/test2:
      traces:
        span:
          - 'attributes["otel.library.name"] == "opentelemetry.instrumentation.requests" and resource.attributes["service.name"] == "cardiolib"'

  exporters:
    otlp:
      endpoint: "${K8S_HOST_IP}:14317"
      tls:
        insecure: true

  service:
    pipelines:
      traces:
        receivers: [otlp]
        processors:
          - filter/test2
          - batch
        exporters: [otlp]

Log output

No response

Additional context

No response

@kedare added the bug (Something isn't working) and needs triage (New item requiring triage) labels on Feb 17, 2023
The github-actions bot added the processor/filter (Filter processor) label on Feb 17, 2023
@github-actions
Contributor

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@TylerHelmuth added the priority:p2 (Medium) label and removed the needs triage (New item requiring triage) label on Feb 17, 2023
@TylerHelmuth
Member

@kedare can you provide a sample of what your data looks like? It would also be good to verify that the data in the pipeline is exactly what you are expecting by using the logging exporter with loglevel: debug to print out the data.
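Something like this would do it (a minimal sketch; adjust the receivers and processors to match your existing pipeline):

  exporters:
    # Prints the full OTLP representation of each span to the collector logs
    logging:
      loglevel: debug

  service:
    pipelines:
      traces:
        receivers: [otlp]
        processors: [batch]
        exporters: [otlp, logging]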

Is otel.library.name definitely an attribute on the span or is it on the instrumentation scope?

It is also possible that something is erroring and data is being dropped as a result of the error. Do your collector logs include any errors?

@kedare
Author

kedare commented Feb 17, 2023

I am not setting otel.library.name anywhere myself; I see it appearing in Datadog and in Tempo on the local environment, so I assumed it was set automatically by the OpenTelemetry SDK.

I will try to get you the proper data on Monday, as we only make this kind of request from the k8s cluster and not from the local environment.

But I dug into it a bit on a local environment with another type of request, and I can't see any http.host on the spans from the requests integration, so I can't explain why I do see it in Datadog or Tempo/Grafana.

This is what I get from a request

back-otlp-1  | ScopeSpans #1
back-otlp-1  | ScopeSpans SchemaURL:
back-otlp-1  | InstrumentationScope opentelemetry.instrumentation.requests 0.36b0
back-otlp-1  | Span #0
back-otlp-1  |     Trace ID       : f5e09207831d0f272dd962de64c9ac9e
back-otlp-1  |     Parent ID      : 1452cce409efcc8b
back-otlp-1  |     ID             : 849f2c68897bfea3
back-otlp-1  |     Name           : HTTP POST
back-otlp-1  |     Kind           : Client
back-otlp-1  |     Start time     : 2023-02-17 17:22:30.293110429 +0000 UTC
back-otlp-1  |     End time       : 2023-02-17 17:22:30.322640281 +0000 UTC
back-otlp-1  |     Status code    : Unset
back-otlp-1  |     Status message :
back-otlp-1  | Attributes:
back-otlp-1  |      -> http.method: Str(POST)
back-otlp-1  |      -> http.url: Str(http://back:9530/cardiolib/start/12)
back-otlp-1  |      -> http.status_code: Int(200)

Is it normal that there are missing attributes?

Also, I did not see any errors in the logs.

@TylerHelmuth
Member

I suspect "opentelemetry.instrumentation.requests" is actually the instrumentation scope name and I think your example confirms that. Assuming that is correct, you can check it via instrumentation_scope.name == "opentelemetry.instrumentation.requests".

I can't explain anything about IsMatch yet, but can you try filtering with:

  processors:
    filter/cardiolib-skip-blob-requests:
      traces:
        span:
          - 'instrumentation_scope.name == "opentelemetry.instrumentation.requests" and resource.attributes["service.name"] == "cardiolib"'

Your example didn't include anything about the Resource, so I can't make any guarantees that service.name really exists as an attribute on the Resource.
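For reference, the logging exporter's debug output prints the Resource above the ScopeSpans, roughly like this (illustrative values only):

ResourceSpans #0
Resource SchemaURL:
Resource attributes:
     -> service.name: Str(cardiolib)
ScopeSpans #0
...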

Is that normal that there are missing attributes ?

I don't actually see http.host as part of the HTTP semantic conventions anymore. It was removed in this PR: open-telemetry/opentelemetry-specification#2469. I don't see the replacement name in your example either, though. You might have to ask Datadog where they are getting that field from.

@kedare
Author

kedare commented Feb 20, 2023

Thank you, indeed that was the issue :)
I ended up doing it like this:

  processors:
    filter/cardiolib-skip-blob-requests:
      traces:
        span:
          - >
            IsMatch(attributes["http.url"], "^https:\/\/.*\.blob\.core\.windows\.net\/.*") == true
            and instrumentation_scope.name == "opentelemetry.instrumentation.requests"
            and resource.attributes["service.name"] == "cardiolib"

As for http.host, I can't explain where it's coming from. What is weird is that I also see it in Tempo, which would not be the case if it were only added by Datadog.

@TylerHelmuth
Member

@kedare I suspect it is coming from something on the resource. All OTLP data is associated with a resource; can you see yours?

@kedare
Author

kedare commented Feb 22, 2023

But I think the resource is static and global to the application?
When I checked in Tempo, http.host was set as a span attribute and not a resource attribute.

@TylerHelmuth
Member

I can't make any claims about how Tempo works, but within the context of the collector the data is in OTLP format within the processors, so the spans are definitely associated with a Resource. The logging exporter, when set to loglevel: debug, will show you the OTLP representation of the data.

@github-actions
Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

The github-actions bot added the Stale label on Apr 24, 2023
@github-actions
Contributor

This issue has been closed as inactive because it has been stale for 120 days with no activity.

The github-actions bot closed this as not planned (won't fix, can't repro, duplicate, stale) on Jun 23, 2023