-
Notifications
You must be signed in to change notification settings - Fork 8.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Search Sessions] Monitoring hardening part 1 #96196
[Search Sessions] Monitoring hardening part 1 #96196
Conversation
Set default strategy Don't create sessions when disabled Clear monitoring task when disabled Use concatMap to serialize session checkup
Pinging @elastic/kibana-app-services (Team:AppServices) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, changes LGTM! A couple of minor things below. Also, just to understand, were we seeing the entries without "strategy" when the feature was disabled? Or how can I verify the before/after behavior to make sure this fixes it?
@@ -154,7 +154,7 @@ export async function checkRunningSessions( | |||
try { | |||
await getAllSavedSearchSessions$(deps, config) | |||
.pipe( | |||
mergeMap(async (runningSearchSessionsResponse) => { | |||
concatMap(async (runningSearchSessionsResponse) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mentioned this in the other issue as well, but what are we hoping to accomplish by processing these serially rather than in parallel? The main issue of concern seemed to be that the update request is too large, so the only benefits I can see with serially are a decrease in CPU/memory, but I'm not sure if that will outweigh the fact that this change may also make the process take much longer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The whole idea of paging the sessions was to update the items in reasonable amounts and in orderly fashion.
With mergeMap
we weren't doing that.
I think that serializing bulks of 100 search sessions makes sense, but it's going to be hard to quantify the impact.
Not 100% the same (more related to reducing the page size), but the article @Dosant linked convinced me https://eng.lifion.com/promise-allpocalypse-cfb6741298a7?gi=2c3ce0dba6ea
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lukasolson, I think there is also a difference between grouping per 100 requests or sending all of them to ES? with concatMap
we spread out the load.
x-pack/plugins/data_enhanced/server/search/session/monitoring_task.ts
Outdated
Show resolved
Hide resolved
x-pack/plugins/data_enhanced/server/search/session/session_service.ts
Outdated
Show resolved
Hide resolved
…vice.ts Co-authored-by: Lukas Olson <olson.lukas@gmail.com>
@lukasolson IMO searches without a strategy were initiated server side by TSVB |
const [coreStart, pluginsStart] = await core.getStartServices(); | ||
if (!sessionConfig.enabled) { | ||
logger.info('Search sessions are disabled. Clearing task.'); | ||
await pluginsStart.taskManager.removeIfExists(SEARCH_SESSIONS_TASK_ID); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gmmorris (or someone else who another task manager expert :D ), could you please review? (If this already hasn't been discussed somewhere) 🙏
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oooh I do not know about this 😬
What removeIfExists
will do here is delete the SO that backs the task that is currently running.
I'd expect that to cause this task execution to fail....
I suspect you'll see that if you add an end to end test that verifies this behaviour.
What are we trying to achieve here?
Are you looking to respond reactively to a config change?
A better approach would be to do this from outside the task execution - subscribe to the config change and delete the task. And, in the task executor, skip the code path you don't want to execute if config is disabled but the task was rerun before you got a chance to remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is what we try to achieve:
- Assume Kibana is running with this feature ON
- Restart Kibana with the feature OFF
- We want to be sure that we don't run the task when the feature is OFF
Currently, this is not happening in master.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gmmorris Updated.
Since changing this setting restarts the server, I'm now unschedulng the task on the setupMonitoring
function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code LGTM, will test
x-pack/plugins/data_enhanced/server/search/session/check_running_sessions.ts
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Tested pageSize
and concatMap
change.
I am not sure we've finished this discussion: https://github.com/elastic/kibana/pull/96196/files#r607388697, but I think concatMap
is the right thing here to do to spread out the load on es. In the cases where it gets too slow - pageSize
can be increased.
I'd wait for someone from alerting to code review remove task change
…bana into sessions/monitoring-hardening
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The task lifecycle LGTM
I haven't tested it locally, as I'm not quite familiar with search sessions, but the flow in the task scheduling looks good 👍.
💚 Build SucceededMetrics [docs]
History
To update your PR or re-run it, just comment with: cc @lizozom |
* Decrease default pageSize to 100 Set default strategy Don't create sessions when disabled Clear monitoring task when disabled Use concatMap to serialize session checkup * ts * ts * ts * Update x-pack/plugins/data_enhanced/server/search/session/session_service.ts Co-authored-by: Lukas Olson <olson.lukas@gmail.com> * Search sessions are disabled * Clear task on server start Co-authored-by: Lukas Olson <olson.lukas@gmail.com>
* Decrease default pageSize to 100 Set default strategy Don't create sessions when disabled Clear monitoring task when disabled Use concatMap to serialize session checkup * ts * ts * ts * Update x-pack/plugins/data_enhanced/server/search/session/session_service.ts Co-authored-by: Lukas Olson <olson.lukas@gmail.com> * Search sessions are disabled * Clear task on server start Co-authored-by: Lukas Olson <olson.lukas@gmail.com>
* Decrease default pageSize to 100 Set default strategy Don't create sessions when disabled Clear monitoring task when disabled Use concatMap to serialize session checkup * ts * ts * ts * Update x-pack/plugins/data_enhanced/server/search/session/session_service.ts Co-authored-by: Lukas Olson <olson.lukas@gmail.com> * Search sessions are disabled * Clear task on server start Co-authored-by: Lukas Olson <olson.lukas@gmail.com> Co-authored-by: Liza Katz <lizka.k@gmail.com> Co-authored-by: Lukas Olson <olson.lukas@gmail.com>
* Decrease default pageSize to 100 Set default strategy Don't create sessions when disabled Clear monitoring task when disabled Use concatMap to serialize session checkup * ts * ts * ts * Update x-pack/plugins/data_enhanced/server/search/session/session_service.ts Co-authored-by: Lukas Olson <olson.lukas@gmail.com> * Search sessions are disabled * Clear task on server start Co-authored-by: Lukas Olson <olson.lukas@gmail.com> Co-authored-by: Liza Katz <lizka.k@gmail.com> Co-authored-by: Lukas Olson <olson.lukas@gmail.com>
Summary
Implements some of the more urgent items from #96131
concatMap
to serialize session checkupChecklist
Delete any items that are not applicable to this PR.
For maintainers