[processor/interval]: time-based batching #34906
Comments
Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself.
I wonder if we could discard old samples during ingestion. Like, do not store an array of datapoints, just replace the old one if we receive another with a more recent timestamp 🤔
oh that's absolutely the case here, sorry if my pseudocode wasn't clear enough. we are only storing the last datapoint per stream, but sharding our stored streams into 60 maps, so that we can flush one map every second instead of all of them every minute, distributing load more evenly over the course of a minute.
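The shard assignment described above can be sketched roughly as follows. This is a minimal illustration, not the processor's actual code: the shard count of 60 (one per second of a 60s interval), the `shardFor` helper, and the FNV hash choice are all assumptions for the example.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// numShards is one shard per second of the flush interval
// (hypothetical constant; 60 assumes a 60s interval).
const numShards = 60

// shardFor deterministically maps a stream identity to one of
// numShards buckets, so each bucket can be flushed on a different
// second of the interval.
func shardFor(streamID string) int {
	h := fnv.New32a()
	h.Write([]byte(streamID))
	return int(h.Sum32() % numShards)
}

func main() {
	// The same stream always lands in the same shard, so the
	// "keep only the latest datapoint per stream" semantics
	// are unaffected by the sharding.
	fmt.Println(shardFor("svc=a,metric=http.requests"))
	fmt.Println(shardFor("svc=a,metric=http.requests"))
}
```

Because the mapping is a pure function of the stream identity, every datapoint for a given stream overwrites its predecessor in the same map.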
Interesting... I like the idea. For any given data stream, we're still aggregating at the given interval. But overall, we're doing flushes at an interval / 60 rate (which could be configured), to reduce the spikiness.
Issue filed by code owner, and another has voiced support. Removing
Component(s)
processor/interval
Is your feature request related to a problem? Please describe.
The intervalprocessor exports all metrics strictly on the interval. At sufficient scale this poses challenges: metrics are collected over e.g. 60 seconds and then flushed all at once, leading to spikes and silence instead of a constant load on the network and the receiving side.
Describe the solution you'd like
Distribute metrics export over the entire interval.
I suggest this "sharding" is done at the stream level, grouping the streams as such (pseudocode):
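The original pseudocode did not survive extraction, but a hedged sketch of the proposed mechanism, based on the discussion above, might look like this. The type names (`DataPoint`, `shardedStore`), the FNV hash, and the fixed shard count of 60 are all assumptions for illustration, not the actual processor API.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

// numShards: one shard per second of a 60s interval (assumption).
const numShards = 60

// DataPoint is a stand-in for a metric datapoint (hypothetical type).
type DataPoint struct {
	Timestamp int64
	Value     float64
}

// shardedStore holds the latest datapoint per stream, split across
// numShards maps so that one map can be flushed each second.
type shardedStore struct {
	mu     sync.Mutex
	shards [numShards]map[string]DataPoint
}

func newShardedStore() *shardedStore {
	s := &shardedStore{}
	for i := range s.shards {
		s.shards[i] = make(map[string]DataPoint)
	}
	return s
}

func shardFor(streamID string) int {
	h := fnv.New32a()
	h.Write([]byte(streamID))
	return int(h.Sum32() % numShards)
}

// Store keeps only the most recent datapoint per stream, as the
// discussion suggests, instead of accumulating an array.
func (s *shardedStore) Store(streamID string, dp DataPoint) {
	s.mu.Lock()
	defer s.mu.Unlock()
	m := s.shards[shardFor(streamID)]
	if old, ok := m[streamID]; !ok || dp.Timestamp > old.Timestamp {
		m[streamID] = dp
	}
}

// FlushShard drains and clears one shard; called once per second,
// cycling i from 0..numShards-1, so the whole store is exported
// once per interval but spread evenly across it.
func (s *shardedStore) FlushShard(i int) []DataPoint {
	s.mu.Lock()
	defer s.mu.Unlock()
	out := make([]DataPoint, 0, len(s.shards[i]))
	for _, dp := range s.shards[i] {
		out = append(out, dp)
	}
	s.shards[i] = make(map[string]DataPoint)
	return out
}

func main() {
	st := newShardedStore()
	st.Store("stream-a", DataPoint{Timestamp: 1, Value: 10})
	st.Store("stream-a", DataPoint{Timestamp: 2, Value: 20}) // replaces the older point
	flushed := st.FlushShard(shardFor("stream-a"))
	fmt.Println(len(flushed), flushed[0].Value) // 1 20
}
```

A ticker driving `FlushShard((tick % numShards))` every second would give the "flush one map per second" behavior described in the thread, while each stream is still exported exactly once per interval.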