Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

receiver/prometheus: glue and complete staleness marking for disappearing metrics #3423

Conversation

odeke-em
Copy link
Member

Adds the staleness store to the metricsBuilder and transaction to track when metrics
disappear between scrapes.

Depends on #3414
Fixes #3413
Fixes #2961

@alolita alolita requested a review from dashpole June 16, 2021 15:12
@odeke-em odeke-em force-pushed the receiver-prometheus-add-staleness-tracking+emit-staleness-markers branch 2 times, most recently from 1a1339f to 6ab3223 Compare June 17, 2021 09:56
@odeke-em
Copy link
Member Author

The test failures were from an exported struct whose name was changed in the past 21 hours from "service.AppSettings" to "service.CollectorSettings" per #3268

@odeke-em odeke-em force-pushed the receiver-prometheus-add-staleness-tracking+emit-staleness-markers branch from 6ab3223 to 4e062a8 Compare June 17, 2021 10:06
@odeke-em
Copy link
Member Author

Kindly /cc-ing @tigrannajaryan @bogdandrutu

@odeke-em
Copy link
Member Author

The receiver/prometheusreceiver TestEndToEnd test is quite rigid from 2 years ago, so that'll require me digging as I figure out what it does then retrofit with NaNs for values.

@tigrannajaryan
Copy link
Member

The build is failing and there are merge conflicts. Please fix.

@odeke-em odeke-em force-pushed the receiver-prometheus-add-staleness-tracking+emit-staleness-markers branch 2 times, most recently from bdf50eb to f5b3e98 Compare June 22, 2021 14:27
@github-actions
Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Jun 30, 2021
@alolita alolita removed the Stale label Jun 30, 2021
@alolita
Copy link
Member

alolita commented Jun 30, 2021

@odeke-em the tests for Windows, Unit tests and build are still failing. Can you please fix?

@odeke-em odeke-em force-pushed the receiver-prometheus-add-staleness-tracking+emit-staleness-markers branch 3 times, most recently from e29db94 to f0edaab Compare July 7, 2021 00:15
@tigrannajaryan
Copy link
Member

Status checks don't finish, can't merge. Will close/open to see if it helps.

@tigrannajaryan tigrannajaryan reopened this Jul 8, 2021
@odeke-em odeke-em force-pushed the receiver-prometheus-add-staleness-tracking+emit-staleness-markers branch from fef92c0 to 044efa0 Compare July 8, 2021 17:17
@odeke-em odeke-em force-pushed the receiver-prometheus-add-staleness-tracking+emit-staleness-markers branch from 044efa0 to 0f47d55 Compare July 8, 2021 17:18
@odeke-em
Copy link
Member Author

odeke-em commented Jul 8, 2021

Thanks @tigrannajaryan and @rakyll! The Windows test is spurious but the other failure let me examine what's tripping it. Should shortly pass.

This is to fix data race that existed a long time before this code
but manifested when Prometheus concurrently modified pReceiver.
@odeke-em
Copy link
Member Author

odeke-em commented Jul 8, 2021

Thanks @tigrannajaryan and @rakyll! The Windows test is spurious but the other failure let me examine what's tripping it. Should shortly pass.

Yup it was a data race from code unrelated to this PR, that existed because Prometheus's code would modify pReceiver concurrently with pReceiver.Start being invoked.

@odeke-em
Copy link
Member Author

odeke-em commented Jul 8, 2021

Alright @tigrannajaryan @rakyll @alolita now all the related code and tests pass 🎉! The contrib failure is unrelated to this change

@tigrannajaryan tigrannajaryan merged commit 8b79380 into open-telemetry:main Jul 8, 2021
@odeke-em odeke-em deleted the receiver-prometheus-add-staleness-tracking+emit-staleness-markers branch July 8, 2021 23:42
@odeke-em
Copy link
Member Author

odeke-em commented Jul 8, 2021

Thank you all for the patient reviews and for your time!

dashpole added a commit to dashpole/opentelemetry-collector that referenced this pull request Sep 8, 2021
dashpole added a commit to dashpole/opentelemetry-collector that referenced this pull request Sep 8, 2021
dashpole added a commit to dashpole/opentelemetry-collector that referenced this pull request Sep 8, 2021
dashpole added a commit to dashpole/opentelemetry-collector that referenced this pull request Sep 8, 2021
dashpole added a commit to dashpole/opentelemetry-collector that referenced this pull request Sep 8, 2021
dashpole added a commit to dashpole/opentelemetry-collector that referenced this pull request Sep 8, 2021
bogdandrutu pushed a commit that referenced this pull request Sep 9, 2021
…heus, rather than making our own (#3989)

* Revert "receiver/prometheus: glue and complete staleness marking for disappearing metrics (#3423)"

This reverts commit 8b79380.

* Revert "receiver/prometheus: add store to track stale metrics (#3414)"

This reverts commit cdc1634.

* stop dropping staleness markers from prometheus, and fix tests

* add staleness end to end test from #3423

* fix import grouping
bogdandrutu pushed a commit that referenced this pull request Sep 9, 2021
…heus, rather than making our own (#3991)

* Revert "receiver/prometheus: glue and complete staleness marking for disappearing metrics (#3423)"

This reverts commit 8b79380.

* Revert "receiver/prometheus: add store to track stale metrics (#3414)"

This reverts commit cdc1634.

* stop dropping staleness markers from prometheus, and fix tests

* add staleness end to end test from #3423

* fix import grouping
bogdandrutu pushed a commit that referenced this pull request Sep 9, 2021
…heus, rather than making our own (#3990)

* Revert "receiver/prometheus: glue and complete staleness marking for disappearing metrics (#3423)"

This reverts commit 8b79380.

* Revert "receiver/prometheus: add store to track stale metrics (#3414)"

This reverts commit cdc1634.

* stop dropping staleness markers from prometheus, and fix tests

* add staleness end to end test from #3423

* fix import grouping
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants