[Telemetry] Usage collectors not using isReady correctly #81944

Closed · 4 tasks done · Fixed by #89109
chrisronline opened this issue on Oct 28, 2020 · 23 comments
Labels: bug (Fixes for quality problems that affect the customer experience), Feature:Telemetry, Team:Core (Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc), v7.11.0

Comments

@chrisronline (Contributor) commented Oct 28, 2020

The following collectors need to change their isReady to properly wait for the task manager state to exist:

They are intentionally catching errors for the case where the task manager hasn't run their task yet, but that condition should instead be used to indicate whether the usage collector is ready or not.
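For illustration, a minimal sketch of what that could look like (the task id, the task manager accessor, and the `runs` counter below are assumptions for the example, not the actual plugin code):

```ts
import type { UsageCollectionSetup } from 'src/plugins/usage_collection/server';
import type { TaskManagerStartContract } from '../../task_manager/server';

const TASK_ID = 'my_usage_task'; // hypothetical task id

export function registerMyUsageCollector(
  usageCollection: UsageCollectionSetup,
  getTaskManager: () => Promise<TaskManagerStartContract>
) {
  usageCollection.registerCollector(
    usageCollection.makeUsageCollector<{ count: number }>({
      type: 'my_usage',
      // Report "not ready" until the task manager has run the task at least once,
      // instead of swallowing the error inside fetch.
      isReady: async () => {
        try {
          const taskManager = await getTaskManager();
          const task = await taskManager.get(TASK_ID);
          return (task.state?.runs ?? 0) > 0; // assumes the task keeps a `runs` counter in its state
        } catch (err) {
          return false; // task not scheduled or not run yet -> collector not ready
        }
      },
      // By the time fetch runs, the task state exists.
      fetch: async () => {
        const taskManager = await getTaskManager();
        const task = await taskManager.get(TASK_ID);
        return { count: task.state.count ?? 0 };
      },
      schema: { count: { type: 'long' } },
    })
  );
}
```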

cc @TinaHeiligers

@elasticmachine (Contributor):

Pinging @elastic/kibana-telemetry (Team:KibanaTelemetry)

@TinaHeiligers (Contributor) commented Oct 28, 2020

Thanks @chrisronline!
Related to #52446

@TinaHeiligers self-assigned this on Oct 28, 2020
@TinaHeiligers added the bug (Fixes for quality problems that affect the customer experience) label on Oct 28, 2020
@timroes added the v7.11.0 label and removed the v7.11 label on Oct 29, 2020
@TinaHeiligers (Contributor):

@chrisronline the actions, alerts and lens collectors explicitly comment that they're fine with not returning any usage data the first time that their collectors' fetch methods are called. I could still follow up with them with draft "fixes" to double check that they're ok with skipping data.

@chrisronline (Contributor, author):

> @chrisronline the actions, alerts and lens collectors explicitly comment that they're fine with not returning any usage data the first time that their collectors' fetch methods are called. I could still follow up with them with draft "fixes" to double check that they're ok with skipping data.

Sure, but we don't have to settle for that and I don't know why we would. We have a way to ensure the telemetry data can always be consistent. My guess is that the authors of those collectors aren't aware that a "miss" might mean there is no telemetry data for that user the entire day.

@TinaHeiligers (Contributor) commented Nov 3, 2020

> My guess is that the authors of those collectors aren't aware that a "miss" might mean there is no telemetry data for that user the entire day.

@chrisronline Gotcha! I'll update the README and change the example code to make it clearer.

@chrisronline (Contributor, author):

FWIW, we've had this ability for some time now and there is some history in the original PR: #36153

@TinaHeiligers (Contributor):

@chrisronline The updated README gives more info on using isReady. Is that sufficient to close this issue or would you prefer to wait on the teams involved?

@chrisronline (Contributor, author):

@TinaHeiligers Yea that's great, thanks for adding that!

cc @mikecote for any thoughts on if the alerting team should update the alerts/actions collectors

@TinaHeiligers (Contributor):

cc @wylieconlon Lens
cc @dgieselaar APM

You might want to reconsider the implementation of isReady in these collectors.

@dgieselaar (Member):

@TinaHeiligers this is intentional - our telemetry collection can take a long time (minutes). The advice from the telemetry team (IIRC) was to not block telemetry collection on startup. Is the recommendation now different?

@chrisronline (Contributor, author):

The telemetry service will wait up to 1 minute for a collector to become ready before attempting to send usage data without it; however, AFAIK, that time period is somewhat arbitrary and could be changed.

@dgieselaar Just curious, can you provide more information on why it takes so long?

@mikecote (Contributor) commented Nov 9, 2020

I created this issue #82947 for the alerting team to explore / fix this.

@dgieselaar (Member):

@chrisronline Heavy queries (aggregations on billions of documents), and we run them sequentially to be on the safe side, so as not to overload the cluster.

@chrisronline (Contributor, author):

Thanks @dgieselaar.

Once this data comes back, is it persisted in memory so that the next telemetry collection cycle picks it up? If so, does it persist in memory until Kibana is restarted? Is it ever re-fetched?

@afharo (Member) commented Nov 18, 2020

@chrisronline AFAIK, they use taskManager to periodically run those queries and store the results in a saved object. Then they retrieve the results from the SOs in the fetch method. Maybe @dgieselaar can confirm. If that's the behaviour, I think it's safe across restarts 🙂
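For reference, a rough sketch of that pattern (the saved object type, id, and attribute names below are made up for the example, not the APM plugin's actual code):

```ts
import type { SavedObjectsClientContract } from 'src/core/server';

const SO_TYPE = 'my-telemetry'; // hypothetical saved object type
const SO_ID = 'my-telemetry';

// The task manager task runs the heavy queries on its own schedule (e.g. every 12h)
// and persists the aggregated results in a saved object:
async function storeResults(
  soClient: SavedObjectsClientContract,
  results: Record<string, number>
) {
  await soClient.create(SO_TYPE, results, { id: SO_ID, overwrite: true });
}

// The usage collector's fetch method only reads the last persisted results,
// so it stays cheap and the data survives Kibana restarts:
async function fetchUsage(soClient: SavedObjectsClientContract) {
  const { attributes } = await soClient.get<Record<string, number>>(SO_TYPE, SO_ID);
  return attributes;
}
```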

@dgieselaar (Member):

That's correct @afharo, thank you - and apologies to @chrisronline for missing this earlier :). We run our queries every 12hrs.

@chrisronline (Contributor, author):

I see, okay. It seems we need to make an exception for APM. I'll update the issue.

@TinaHeiligers (Contributor):

@wylieconlon Have you had a chance to review the Lens usage collector's implementation of isReady yet?
cc @flash1293

@afharo added the Team:Core label and removed the Team:KibanaTelemetry label on Dec 10, 2020
@elasticmachine (Contributor):

Pinging @elastic/kibana-core (Team:Core)

@flash1293 (Contributor):

@TinaHeiligers I think this fell through the cracks, is there a deadline for this?

@TinaHeiligers (Contributor):

@flash1293 There isn't a strict deadline from the Core team as such, since it's up to plugin owners to decide how important their data is for them. As with most things telemetry-related, the sooner we ensure we're capturing consistent data, the better 😄

@flash1293 (Contributor):

I checked the code and it seems like we have the same structure as the other consumers listed (although implemented a little differently).

@gmmorris It seems like you implemented this; could you confirm it doesn't matter whether isReady is async itself (like here https://github.com/elastic/kibana/blob/master/x-pack/plugins/actions/server/usage/actions_usage_collector.ts#L33-L36) or returns false synchronously until it's ready (like here https://github.com/elastic/kibana/blob/master/x-pack/plugins/lens/server/usage/collectors.ts#L19-L23)?

@YulNaumenko (Contributor):

> I checked the code and it seems like we have the same structure as the other consumers listed (although implemented a little differently).
>
> @gmmorris It seems like you implemented this; could you confirm it doesn't matter whether isReady is async itself (like here https://github.com/elastic/kibana/blob/master/x-pack/plugins/actions/server/usage/actions_usage_collector.ts#L33-L36) or returns false synchronously until it's ready (like here https://github.com/elastic/kibana/blob/master/x-pack/plugins/lens/server/usage/collectors.ts#L19-L23)?

@flash1293 You're right, the approach doesn't matter, because both do the same check of whether the Task Manager is ready, and isReady supports both boolean and Promise<boolean>. In your case, though, the cleaner and more straightforward usage of isReady would be to make it async, because your check is itself async.
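In other words, both of these shapes satisfy the collector contract (a trimmed illustration, with `taskManagerStarted` and `taskStateExists` standing in for the plugins' actual checks):

```ts
// Signature accepted for isReady: () => boolean | Promise<boolean>
type IsReady = () => boolean | Promise<boolean>;

// Lens-style: return a plain boolean, flipped to true once task manager has started.
let taskManagerStarted = false;
const syncIsReady: IsReady = () => taskManagerStarted;

// Actions/alerts-style: await the check each time and return a Promise<boolean>.
const taskStateExists = async (): Promise<boolean> => {
  // ...would call taskManager.get(TASK_ID) in a real collector
  return true;
};
const asyncIsReady: IsReady = () => taskStateExists();
```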
