Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

collectorSets in the New Platform #46924

Closed
rudolf opened this issue Sep 30, 2019 · 8 comments · Fixed by #51618
Closed

collectorSets in the New Platform #46924

rudolf opened this issue Sep 30, 2019 · 8 comments · Fixed by #51618
Assignees
Labels
discuss Feature:Telemetry Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc Team:Monitoring Stack Monitoring team

Comments

@rudolf
Copy link
Contributor

rudolf commented Sep 30, 2019

Legacy

In order to collect and report on metrics, plugins create and register "collectors" with:

const myCollector = server.usage.collectorSet.makeUsageCollector({
  type: 'dashboard',
  fetch: async (callCluster: CallAPI): Record<string, any>
})
server.usage.collectorSet.register(myCollector);

Each collector defines a fetch() function which returns the collected metrics. Registering a collector simply adds it to the server.usage.collectorSet, collector's fetch() methods don't automatically get called.

Registered collectors are consumed from the following locations, each of which independently calls the various collector's fetch() functions:

  • api/stats for stats collectors and api/stats?extended=true for usage collectors
  • monitoring plugin
  • telemetry plugin

New Platform: metrics plugin (TBD)

We create a metrics OSS plugin which exposes an API to allow other plugins to register which metrics they want captured similar to collectorSet.register. This plugin will also expose the api/stats endpoint. As a first step metric collectors can simply be registered. In the future, this plugin will also be responsible for fetching the metrics and exposing the data, not the collectors, through an API. When plugins register a metric they can define the collection interval or use the default. Monitoring, Telemetry and api/stats can then pull data from this plugin.

metrics and telemetry are quite closely related and could be implemented in the same OSS plugin. However, having these separate allows paranoid administrators to disable the telemetry plugin entirely, which means the code that uploads telemetry data won't even be loaded, without affecting the functionality of the monitoring plugin or the api/stats endpoint.

Although I don't believe there is a strong case for it right now, having a small dedicated metrics plugin would also make it much easier to move this API into Core in the future.

Implementation

  1. Implement NP metrics plugin
  2. Change Legacy usageMixin to proxy all calls to server.usage.collectorSet to the NP metrics plugin with server.newPlatform.setup.plugins.metrics. This way legacy plugins can continue to use the legacy collectorSet API but all the logic is located in the NP metrics plugin.

Open questions:

  1. Is there any existing functionality which wouldn't be supported by the suggested design?
  2. We have two constructors for creating collectors makeUsageCollector vs makeStatsCollector. But when we want stats collectors we seem to always filter by TYPE getCollectorByType() and when we want usage collectors we filter by constructor isUsageCollector(). There should be a consistent way to create/distinguish between these two.
@elasticmachine
Copy link
Contributor

Pinging @elastic/stack-monitoring

@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-stack-services

@rudolf rudolf added the Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc label Sep 30, 2019
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-platform

@Bamieh
Copy link
Member

Bamieh commented Sep 30, 2019

Thanks Rudolf for all the effort you're putting into this.

I like the general approach of having a separate plugin for this, which can be utilized by telemetry and other plugins using these collectors right now and in the future. I want to note that we do have a ui_metrics plugin in OSS, I feel that it makes sense to rename it into metrics and have the current ui_metrics apis as a sub to that in addition to api/stats and exposing collectorsSet.

This is a lot more plausible than baking collectors into telemetry which introduces various limitations.

Here are my thoughts on the open questions raised in the topic:

  1. I believe this approach will cover all our current cases, and any future case I can think of (mainly a cloud case we've been discussing recently).

  2. makeStatsCollector is meant to be cover more general usecase than makeUsageCollector and it is currently being used in a few places. When I was refactoring code to move telemtery to OSS I thought of possible ways to make this more intuitive; I believe this is something I can focus on in the future.

@chrisronline
Copy link
Contributor

Each collector defines a fetch() function which returns the collected metrics. Registering a collector simply adds it to the server.usage.collectorSet, collector's fetch() methods don't automatically get called.

Just to be clear, is this the extent of the API? We currently also has isReady which we'll need to preserve as well.

@rudolf
Copy link
Contributor Author

rudolf commented Oct 1, 2019

@Bamieh

I want to note that we do have a ui_metrics plugin in OSS, I feel that it makes sense to rename it into metrics and have the current ui_metrics apis as a sub to that in addition to api/stats and exposing collectorsSet.

That makes a lot of sense.

@chrisronline

Just to be clear, is this the extent of the API?

No, I think it makes sense to keep the current API including the isReady check. The NP might remove the need for many of the isReady checks which seem to guard against the TaskManager not being ready. This is because plugins are guaranteed to only load after the plugins they depend on are loaded. But I think we should prioritize moving the code into a NP plugin before we look at making API changes if any.

@chrisronline
Copy link
Contributor

No, I think it makes sense to keep the current API including the isReady check. The NP might remove the need for many of the isReady checks which seem to guard against the TaskManager not being ready. This is because plugins are guaranteed to only load after the plugins they depend on are loaded. But I think we should prioritize moving the code into a NP plugin before we look at making API changes if any.

👍Like you said, isReady is not always depended on other plugins so it's important we preserve this functionality.

@tsullivan
Copy link
Member

I'm really looking forward to a new metrics plugin! I already have a lot of use cases in mind for Reporting 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discuss Feature:Telemetry Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc Team:Monitoring Stack Monitoring team
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants