[Alerting] Add a tie breaker field to alerts #62002

FrankHassanabad · 2020-03-31T15:29:54Z

Describe the feature:

Provide a tie_breaker_id by copying the _id field so we can have stable sort/export in batches.

Describe a specific use case for the feature:

For Saved objects and for SIEM rules we ran into issues with sorting and paging where we ended up with duplicates and in some cases missing data. We solved this by loading all records into memory at once and then exporting. For our UI tables we allow the user to view 300 records at a time to minimize the chances of duplicates or misses when doing paging but it is possible when they have above 300.

We have replicated this issue by paging through rules 10 at a time and looking at each rule and seeing we have found duplicates. We have replicated this issue also using CURL and eliminating any chances it was a UI issue.

There are docs such as this one which highlight the problem and a solution (In the important section)
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-body.html#request-body-search-search-after

If the alerting framework provide a copy of _id as a tiebreaker field called

tie_breaker_id

as suggested there for us to sort on, export on, etc... that would be a small mapping change utilizing copy_to:
https://www.elastic.co/guide/en/elasticsearch/reference/current/copy-to.html

The text was updated successfully, but these errors were encountered:

elasticmachine · 2020-03-31T15:30:13Z

Pinging @elastic/kibana-alerting-services (Team:Alerting Services)

mikecote · 2020-04-14T18:57:12Z

@elastic/kibana-platform Is there any plans to have an alternative to sorting by _id now that Elasticsearch doesn't support this? some sort of tie breaker field?

FrankHassanabad · 2020-04-15T13:48:15Z

FWIW, I created a tie_breaker_id keyword field for large list support API endpoints I am working on @mikecote, and then during document inserts I use a uuid for it. It's not going to be the fastest and I have an extra field consuming disk space compared to just being able to copy over the _id but that's what I'm doing now in my code.

pgayvallet · 2020-04-17T06:15:14Z

Is there any plans to have an alternative to sorting by _id now that Elasticsearch doesn't support this? some sort of tie breaker field?

Not that I'm aware of. @rudolf?

rudolf · 2020-04-17T08:27:39Z

I wasn't aware of this duplicating behaviour, but I think we should treat it as a bug in Core. @FrankHassanabad suggestion for using copy_to sounds like a good suggestion.

@FrankHassanabad What is the impact on users, could showing a duplicate mislead a user to take the wrong actions or is this mostly an annoyance? What's the priority for getting this fixed? It doesn't sound like a lot of work, but our roadmap is quite full.

Theoretically, generating a saved object ndjson export with duplicates could lead to failure on import if overwrite=false which then leads to an id collision. I will have to go through the source code, but I think we use a Set which would remove duplicates and mitigate this problem. We don't use search_after directly for saved object queries, so this should only impact custom ES queries against the saved object index that uses search_after.

rudolf · 2020-04-17T08:30:44Z

I read to quickly and didn't see that data could be either missing or duplicated, we can probably sometimes live with duplicates but the missing data wouldn't be acceptable.

rudolf · 2020-04-17T10:04:35Z

copy_to isn't possible on the _id field, so we will have to either use an ingest pipeline or do it inside the saved objects repository. I'm not sure if we can rely on ingest processing always being enabled on ES so doing it client-side might be safer, especially for a minor release.

There's no immediate requirement to implement this in Core/Saved Objects, we could only implement this in task manager for now since Saved Objects only uses from + to for paging. I'm not aware of any saved object types that currently expect to be able to page over more than 10k results, but there are plans to move larger indexes to saved objects in which case we might have to switch to search_after for paging find results.

FrankHassanabad · 2020-04-18T01:31:10Z

@rudolf, I generate a UUID client side at the moment for the a tie breaker field before pushing the record into the index within Kibana/NodeJS

. I'm not aware of any saved object types that currently expect to be able to page over more than 10k results

For large lists since they can be above 10k we had to switch from SO to another data index so I could use search after for our SIEM large list support. Also for a few other Elastic Search abilities such as delete by query, etc...

If SO is adding support for above 10k and a few other features and everyone feels comfortable we can add large volumes of data into it I would be 👍 for moving back to it. I started with it, but quickly ran into issues with the 10k support and missing things such as delete by query where I had to move to a data index.

FrankHassanabad · 2020-07-29T16:23:05Z

"Reviewed by Frank Hassanabad on 7/29/2020, still valid as of this date"

yctercero · 2020-11-06T15:33:36Z

Just wanted to double back on this ticket. We're adding auto refresh to our all rules table and the inconsistencies in sorting without a tiebreaker_id are now very evident. Would be much appreciated if we could bump this up :)

pmuellr · 2020-11-06T16:15:40Z

Is there any chance you could use an existing alerting SO field like createdAt as the tie breaker? Of course, that's not guaranteed to be unique, but is maybe unique enough? Or maybe that's already your primary search term. Is there another unique-ish field that is stable for your uses, like name, that you could use as an additional sort to get the uniqueness (ie, is the name something a user can change ad-hoc, or are you creating these programmatically and they don't change over time)? How stable do these sorts need to be over time? Just during a user session, paging through UI's? Seems like something like sorting by [createdAt, name] would probably be good enough in that case.

It sounds like the only other option is for us to add a new field to the alert SO, which would presumably be a UUID, as the tie breaker.

I'm worried about the cardinality explosion here, as the internal index on that new field will literally be as large as the # of alerts, so reusing an existing field(s) would help out there (I think!). We could potentially make this field optional, and probably not available for editing in the UI, so that alertsClient users could specify a value if they want, but the default would be that we don't use the field.

Also wondering if this will bite us in the end, for things like the alerts list UI. No matter how we end up sorting the alerts in that list (eg, status, name, etc), seems like we may also be in the same situation - with an unstable sort, so missing/dup entries while paging - but perhaps that's a bit more survivable for customers than Frank's use case.

pmuellr · 2020-11-17T17:27:40Z

@FrankHassanabad , any thoughts on my comment ^^^

FrankHassanabad · 2020-11-19T22:37:08Z

Thanks, I did a sampling of data from the prepackaged rules install and the createdAt field could help keep us unique or mostly unique to lessen the problems and impacts:

2020-11-05T19:46:41.552Z
2020-11-05T19:46:41.406Z
2020-11-05T19:46:41.753Z
2020-11-05T19:46:41.716Z
2020-11-05T19:46:42.533Z

It's not guaranteed as you can see some of those are very close to being non-unique and we might end up with some that are maybe not unique but I think at the very least for a workaround we can use them to help keep things more stable or as close to stable as we can.

yctercero · 2020-11-19T22:55:15Z

@FrankHassanabad thanks for checking that!

I think we do still need a way to pass multiple fields to sort by though right? So for our rules example, when the user selects to sort by enabled, we pass that through as the sort_field but can't specify any secondary field.

FrankHassanabad · 2020-11-20T00:52:43Z

Yeah, we need to allow comma separated values to sort by within the find_rules_route and then we can send multiples down and sort in the order given to us.

rudolf · 2020-11-20T15:53:07Z

It sounds like the only other option is for us to add a new field to the alert SO, which would presumably be a UUID, as the tie breaker.

I'm worried about the cardinality explosion here, as the internal index on that new field will literally be as large as the # of alerts, so reusing an existing field(s) would help out there (I think!).

Adding another field is the only way to guarantee no duplicates (ES team might add a virtual tie breaker field at some point in the future elastic/elasticsearch#56828).

Although every field adds overhead, I don't think the "cost" of this is prohibitively high, instead of N it will now be 2N.

We might have to add a tie breaker field to all saved objects for #77961

pmuellr · 2020-11-23T19:48:21Z

Ah, adding a tie breaker to SO's themselves seems like the way to do this, as this is a generic SO issue IIRC, not specific to alerts. I'd certainly hate to add one to alerts in vX only to have it added to SO's in v>X, since then we'd have two tie breakers (and more wasted index space).

lukeelmers · 2021-02-22T22:32:10Z

With the changes in elastic/elasticsearch#68833, a tiebreaker no longer needs to be provided to Elasticsearch (as of 7.12) as long as you are searching using a point-in-time... ES now automatically applies an internally-created tiebreaker for you.

In #89915, we introduced the ability to use SavedObjects.find with the point-in-time API and search_after, so you can now query over large numbers of SOs with consistent results. When #91175 lands, we'll have a helper exposed on the SO client to make this process easier.

I think this means this issue can be closed, but I'll let @FrankHassanabad confirm.

pmuellr · 2021-02-23T14:02:48Z

Not entirely clear to me if the tiebreaker needs to be consistent over time, or just over the "initial load" of the list being generated. If it needs to be consistent over time, then the per-PIT tiebreaker won't help.

We also recently made the alert params a flattened mapping type, instead of object enabled:false. Which looks like can be used for sorting. So it seems like alert types could use a param as a tie-breaker; perhaps it would be some "hidden" param, not shown in the UI.

For a case like this, you'd like the process of alert creation to call into the alert type to explicitly set such a field to some random value. But we don't really have a way of doing that now. We've talked about "hooks", where an alert type could be involved in processes like this, which seems like would be a good way to handle it. Eg, there would be a "pre-save" hook that an alert type could implement which would be passed the alert data just before it's created/updated, with a chance to modify it.

For now though, setting such a field in the alert type UI or via the alerts client is probably good enough.

ymao1 · 2021-07-29T14:56:31Z

@FrankHassanabad @yctercero Is a tie breaker field still needed given the changes to the core's SavedObjects.find api detailed in #62002 (comment)?

yctercero · 2021-08-03T22:52:23Z

@FrankHassanabad @yctercero Is a tie breaker field still needed given the changes to the core's SavedObjects.find api detailed in #62002 (comment)?

I don't believe so. We're working on moving over to PIT, though don't have details on timeline.

ymao1 · 2021-08-03T23:39:59Z

Closing as don't need.

FrankHassanabad added Feature:Alerting Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) labels Mar 31, 2020

FrankHassanabad changed the title ~~[Alerting] Add a tie breaker to alerts~~ [Alerting] Add a tie breaker field to alerts Mar 31, 2020

rudolf added the Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc label Apr 17, 2020

rudolf removed the Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc label May 6, 2020

yctercero mentioned this issue Nov 6, 2020

[Security Solution][Detections] - Auto refresh all rules/monitoring tables #82062

Merged

6 tasks

gmmorris mentioned this issue Nov 6, 2020

Dependencies on Kibana Alerting #67992

Open

59 tasks

mikecote added the discuss label Nov 17, 2020

rudolf mentioned this issue Dec 17, 2020

Add a tie_breaker field to all saved objects #86300

Closed

gmmorris added NeededFor:Detections and Resp Project:ImproveAlertingManagementUX Alerting team project for improving the management experience of alerting. Feature:Alerting/RulesFramework Issues related to the Alerting Rules Framework labels Jun 30, 2021

ymao1 closed this as completed Aug 3, 2021

kobelb added the needs-team Issues missing a team label label Jan 31, 2022

botelastic bot removed the needs-team Issues missing a team label label Jan 31, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Alerting] Add a tie breaker field to alerts #62002

[Alerting] Add a tie breaker field to alerts #62002

FrankHassanabad commented Mar 31, 2020

elasticmachine commented Mar 31, 2020

mikecote commented Apr 14, 2020

FrankHassanabad commented Apr 15, 2020 •

edited

Loading

pgayvallet commented Apr 17, 2020

rudolf commented Apr 17, 2020 •

edited

Loading

rudolf commented Apr 17, 2020

rudolf commented Apr 17, 2020

FrankHassanabad commented Apr 18, 2020

FrankHassanabad commented Jul 29, 2020

yctercero commented Nov 6, 2020

pmuellr commented Nov 6, 2020

pmuellr commented Nov 17, 2020 •

edited

Loading

FrankHassanabad commented Nov 19, 2020

yctercero commented Nov 19, 2020

FrankHassanabad commented Nov 20, 2020

rudolf commented Nov 20, 2020

pmuellr commented Nov 23, 2020

lukeelmers commented Feb 22, 2021

pmuellr commented Feb 23, 2021

ymao1 commented Jul 29, 2021

yctercero commented Aug 3, 2021

ymao1 commented Aug 3, 2021

[Alerting] Add a tie breaker field to alerts #62002

[Alerting] Add a tie breaker field to alerts #62002

Comments

FrankHassanabad commented Mar 31, 2020

elasticmachine commented Mar 31, 2020

mikecote commented Apr 14, 2020

FrankHassanabad commented Apr 15, 2020 • edited Loading

pgayvallet commented Apr 17, 2020

rudolf commented Apr 17, 2020 • edited Loading

rudolf commented Apr 17, 2020

rudolf commented Apr 17, 2020

FrankHassanabad commented Apr 18, 2020

FrankHassanabad commented Jul 29, 2020

yctercero commented Nov 6, 2020

pmuellr commented Nov 6, 2020

pmuellr commented Nov 17, 2020 • edited Loading

FrankHassanabad commented Nov 19, 2020

yctercero commented Nov 19, 2020

FrankHassanabad commented Nov 20, 2020

rudolf commented Nov 20, 2020

pmuellr commented Nov 23, 2020

lukeelmers commented Feb 22, 2021

pmuellr commented Feb 23, 2021

ymao1 commented Jul 29, 2021

yctercero commented Aug 3, 2021

ymao1 commented Aug 3, 2021

FrankHassanabad commented Apr 15, 2020 •

edited

Loading

rudolf commented Apr 17, 2020 •

edited

Loading

pmuellr commented Nov 17, 2020 •

edited

Loading