[Search Sessions] Monitoring hardening part 1 #96196

Merged
merged 11 commits into from
Apr 7, 2021

Conversation

lizozom
Contributor

@lizozom lizozom commented Apr 4, 2021

Summary

Implements some of the more urgent items from #96131

  • Decrease default page size to 100
  • Set default strategy
  • Don't create session objects when feature is disabled
  • Clear monitoring task if it was disabled (after kibana restart)
  • Use concatMap to serialize session checkup

Liza K added 2 commits April 4, 2021 19:12
Set default strategy
Don't create sessions when disabled
Clear monitoring task when disabled
Use concatMap to serialize session checkup
@lizozom lizozom self-assigned this Apr 4, 2021
@lizozom lizozom requested a review from a team as a code owner April 4, 2021 16:14
@lizozom lizozom added the auto-backport, bug, v7.12.1, v7.13.0, v8.0.0, and Team:AppServices labels Apr 4, 2021
@elasticmachine
Contributor

Pinging @elastic/kibana-app-services (Team:AppServices)

@lizozom lizozom added the release_note:skip label Apr 4, 2021
@lizozom lizozom changed the title from "[Sessions] Monitoring hardening part 1" to "[Search Sessions] Monitoring hardening part 1" Apr 4, 2021
Member

@lukasolson lukasolson left a comment

Overall, changes LGTM! A couple of minor things below. Also, just to understand, were we seeing the entries without "strategy" when the feature was disabled? Or how can I verify the before/after behavior to make sure this fixes it?

@@ -154,7 +154,7 @@ export async function checkRunningSessions(
   try {
     await getAllSavedSearchSessions$(deps, config)
       .pipe(
-        mergeMap(async (runningSearchSessionsResponse) => {
+        concatMap(async (runningSearchSessionsResponse) => {
Member

I mentioned this in the other issue as well, but what are we hoping to accomplish by processing these serially rather than in parallel? The main concern seemed to be that the update request is too large, so the only benefit I can see from processing serially is a decrease in CPU/memory usage, and I'm not sure that will outweigh the fact that this change may also make the process take much longer.

Contributor Author

The whole idea of paging the sessions was to update the items in reasonable amounts and in an orderly fashion.
With mergeMap we weren't doing that.

I think that serializing batches of 100 search sessions makes sense, but it's going to be hard to quantify the impact.
It's not 100% the same issue (it's more related to reducing the page size), but the article @Dosant linked convinced me: https://eng.lifion.com/promise-allpocalypse-cfb6741298a7?gi=2c3ce0dba6ea

Contributor

@lukasolson, I think there is also a difference between grouping the requests in batches of 100 and sending all of them to ES at once. With concatMap we spread out the load.
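
For readers following this thread, the difference under discussion can be illustrated without RxJS. The sketch below is only a plain-promise analogy, not the code in this PR: `processConcurrently` stands in for the `mergeMap` behavior (every page is started as soon as it is available) and `processSerially` for the `concatMap` behavior (the next page waits for the previous one). All names, the 5 ms delay, and the 450-session example are assumptions made for illustration.

```typescript
// Plain-promise analogy for mergeMap vs. concatMap. Hypothetical names throughout;
// this is not the RxJS pipeline from the PR.

type Page = number[];

// Tracks how many page updates are running at the same moment.
class InFlightTracker {
  current = 0;
  max = 0;

  async run(work: () => Promise<void>): Promise<void> {
    this.current += 1;
    this.max = Math.max(this.max, this.current);
    try {
      await work();
    } finally {
      this.current -= 1;
    }
  }
}

// Stand-in for a bulk update of one page of sessions (assumed to take ~5 ms).
const updatePage = (_page: Page): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, 5));

// mergeMap-like: all pages are started without waiting for earlier ones.
// Returns the peak number of updates that were in flight together.
async function processConcurrently(pages: Page[]): Promise<number> {
  const tracker = new InFlightTracker();
  await Promise.all(pages.map((p) => tracker.run(() => updatePage(p))));
  return tracker.max;
}

// concatMap-like: the next page starts only after the previous one settles.
async function processSerially(pages: Page[]): Promise<number> {
  const tracker = new InFlightTracker();
  for (const p of pages) {
    await tracker.run(() => updatePage(p));
  }
  return tracker.max;
}

// Split `total` session ids into pages of at most `pageSize` entries.
function paginate(total: number, pageSize: number): Page[] {
  const pages: Page[] = [];
  for (let start = 0; start < total; start += pageSize) {
    const end = Math.min(start + pageSize, total);
    pages.push(Array.from({ length: end - start }, (_, i) => start + i));
  }
  return pages;
}
```

With 450 sessions and a page size of 100, the concurrent variant has all five page updates in flight at once, while the serial variant never has more than one. That is the trade-off raised above: lower peak load on ES in exchange for a longer total run.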

@lizozom
Contributor Author

lizozom commented Apr 6, 2021

@lukasolson IMO searches without a strategy were initiated server side by TSVB

@lizozom lizozom requested a review from lukasolson April 6, 2021 09:28
const [coreStart, pluginsStart] = await core.getStartServices();
if (!sessionConfig.enabled) {
  logger.info('Search sessions are disabled. Clearing task.');
  await pluginsStart.taskManager.removeIfExists(SEARCH_SESSIONS_TASK_ID);
Contributor

@Dosant Dosant Apr 6, 2021

@gmmorris (or someone else who is a task manager expert :D), could you please review? (If this hasn't already been discussed somewhere) 🙏

Contributor

oooh I do not know about this 😬
What removeIfExists will do here is delete the SO that backs the task that is currently running.
I'd expect that to cause this task execution to fail....

I suspect you'll see that if you add an end to end test that verifies this behaviour.

What are we trying to achieve here?
Are you looking to respond reactively to a config change?
A better approach would be to do this from outside the task execution - subscribe to the config change and delete the task. And, in the task executor, skip the code path you don't want to execute if config is disabled but the task was rerun before you got a chance to remove it.

Contributor

@Dosant Dosant Apr 7, 2021

This is what we are trying to achieve:

  1. Assume Kibana is running with this feature ON
  2. Restart Kibana with the feature OFF
  3. We want to be sure that we don't run the task when the feature is OFF

Currently, this is not happening in master.

Contributor Author

@gmmorris Updated.
Since changing this setting restarts the server, I'm now unscheduling the task in the setupMonitoring function.
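
The flow agreed on in this thread (unschedule at setup time when the feature is off, and have the executor skip its work if it still runs) might look roughly like the sketch below. It is a sketch under stated assumptions, not the code that was merged: `TaskScheduler` is a simplified stand-in for the task manager, `InMemoryScheduler`, `MONITORING_TASK_ID`, and `runMonitoringTask` are hypothetical, and only the names `removeIfExists` and `setupMonitoring` come from the discussion.

```typescript
// Simplified stand-in for a task manager. The real Kibana Task Manager API
// has different signatures; this interface is an assumption for illustration.
interface TaskScheduler {
  ensureScheduled(taskId: string): Promise<void>;
  removeIfExists(taskId: string): Promise<void>;
}

// In-memory fake so the sketch runs on its own.
class InMemoryScheduler implements TaskScheduler {
  readonly tasks = new Set<string>();

  async ensureScheduled(taskId: string): Promise<void> {
    this.tasks.add(taskId);
  }

  async removeIfExists(taskId: string): Promise<void> {
    this.tasks.delete(taskId);
  }
}

interface SessionConfig {
  enabled: boolean;
}

// Hypothetical task id, not the constant used in the PR.
const MONITORING_TASK_ID = "search_sessions_monitor";

// Runs once at server start. Because a config change restarts the server,
// this is enough to clear a task left over from a previous, enabled run.
async function setupMonitoring(
  scheduler: TaskScheduler,
  config: SessionConfig
): Promise<void> {
  if (!config.enabled) {
    await scheduler.removeIfExists(MONITORING_TASK_ID);
    return;
  }
  await scheduler.ensureScheduled(MONITORING_TASK_ID);
}

// Executor-side guard: if the task fires before it was removed, skip the work
// instead of deleting the task from inside its own execution.
async function runMonitoringTask(
  config: SessionConfig,
  checkSessions: () => Promise<void>
): Promise<"skipped" | "ran"> {
  if (!config.enabled) {
    return "skipped";
  }
  await checkSessions();
  return "ran";
}
```

Keeping the removal outside the task executor avoids the failure mode described above, where a running task deletes the saved object that backs it.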

Contributor

👍

Contributor

@Dosant Dosant left a comment

Code LGTM, will test

@Dosant
Contributor

Dosant commented Apr 6, 2021

@lizozom, FYI, I have a PR that implements cancel on a task definition here
I started it from your PR. We can wait for yours to go in first or merge them together. I am fine with either.

Contributor

@Dosant Dosant left a comment

LGTM. Tested the pageSize and concatMap changes.
I am not sure we've finished this discussion: https://github.com/elastic/kibana/pull/96196/files#r607388697, but I think concatMap is the right thing to do here to spread out the load on ES. In cases where it gets too slow, pageSize can be increased.

I'd wait for someone from alerting to code review the remove-task change.

@lizozom lizozom requested a review from gmmorris April 7, 2021 10:36
Contributor

@gmmorris gmmorris left a comment

The task lifecycle LGTM

I haven't tested it locally, as I'm not quite familiar with search sessions, but the flow in the task scheduling looks good 👍.

@kibanamachine
Contributor

💚 Build Succeeded

Metrics [docs]

✅ unchanged

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @lizozom

@lizozom lizozom merged commit 7584b72 into elastic:master Apr 7, 2021
kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Apr 7, 2021
* Decrease default pageSize to 100
Set default strategy
Don't create sessions when disabled
Clear monitoring task when disabled
Use concatMap to serialize session checkup

* ts

* ts

* ts

* Update x-pack/plugins/data_enhanced/server/search/session/session_service.ts

Co-authored-by: Lukas Olson <olson.lukas@gmail.com>

* Search sessions are disabled

* Clear task on server start

Co-authored-by: Lukas Olson <olson.lukas@gmail.com>
@kibanamachine
Contributor

💚 Backport successful

7.12 / #96399
7.x / #96400

The backport PRs will be merged automatically after passing CI.

kibanamachine added a commit that referenced this pull request Apr 7, 2021
* Decrease default pageSize to 100
Set default strategy
Don't create sessions when disabled
Clear monitoring task when disabled
Use concatMap to serialize session checkup

* ts

* ts

* ts

* Update x-pack/plugins/data_enhanced/server/search/session/session_service.ts

Co-authored-by: Lukas Olson <olson.lukas@gmail.com>

* Search sessions are disabled

* Clear task on server start

Co-authored-by: Lukas Olson <olson.lukas@gmail.com>

Co-authored-by: Liza Katz <lizka.k@gmail.com>
Co-authored-by: Lukas Olson <olson.lukas@gmail.com>
6 participants