Proposal: selective and fast logs deduplicator #1900
In the past we failed by not excluding the

Would a de-duplicated log line have a field which indicates how many lines have been de-duplicated into one? I think that can still be important to know in some cases.

Definitely yes!
Problem
A high-traffic Mimir cluster can log a lot. For example, I've analysed the log rate of a medium-size cluster (with a good % of requests returning 4xx because of some limits being hit or out-of-order/out-of-bounds samples written) running with -log.level=info, and the vast majority of logs come from 2 sources:
- grpc_logging.go:38 (48k logs / sec)
- push.go:89 (13k logs / sec)

All other logging callers are orders of magnitude less noisy.
Data has been queried from Loki:
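The issue does not include the actual query; a Loki query along these lines (label names are hypothetical) can break down the log volume per caller:

```logql
# Top 10 logging callers by per-second log rate over the last 5 minutes.
# {namespace="mimir"} is an assumed label selector; "caller" is parsed
# from the logfmt-formatted log lines.
topk(10, sum by (caller) (rate({namespace="mimir"} | logfmt [5m])))
```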
Logs are very important and useful when debugging, but repeating the same log line hundreds or thousands of times per second is not very useful, apart from adding pressure to the system.
Proposal
I propose to build a logs deduplicator in Mimir, following these design principles. Initially it should be plugged into grpc_logging.go and push.go (in the future it can be plugged into other places, if required).

An example log:

The deduplication key for the example log above should be composed only of:
Discussion of the actual proposed implementation will follow.