-
Notifications
You must be signed in to change notification settings - Fork 8.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Search Sessions] Implement cancel on search session monitoring task, fetch and process sessions page by page #96321
Conversation
b6f8d1b
to
73ad72f
Compare
* trackingTimeout controls for how long task manager waits for search session monitoring task to complete before considering it timed out, | ||
* If tasks timeouts it receives cancel signal and next task starts in "trackingInterval" time | ||
*/ | ||
trackingTimeout: schema.duration({ defaultValue: '5m' }), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'5m' is the current default in task manager. I don't change the default and expose a way to override it.
deps: CheckRunningSessionsDeps, | ||
config: SearchSessionsConfig | ||
): Promise<void> { | ||
): Observable<void> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just changing from Promise to Observable to be able to abort
@@ -48,12 +52,17 @@ function searchSessionRunner( | |||
logger, | |||
}, | |||
sessionConfig | |||
); | |||
) | |||
.pipe(takeUntil(aborted$)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cancelation idea is the following:
We can use pageConfig
to process sessions page by page. When abort is triggered we will process the last received page and won't fetch for the one that are left
Pinging @elastic/kibana-app-services (Team:AppServices) |
@elasticmachine merge upstream |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple of minor nits below, but overall LGTM
@@ -29,7 +32,8 @@ interface SearchSessionTaskDeps { | |||
function searchSessionRunner( | |||
core: CoreSetup<DataEnhancedStartDependencies>, | |||
{ logger, config }: SearchSessionTaskDeps | |||
) { | |||
): TaskRunCreatorFunction { | |||
const aborted$ = new Subject<void>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we have one global aborted$
for all instances of this task, doesn't this mean that if one instance hits the timeout, any others in-progress will also be aborted? What if we moved the initialization of aborted$
between lines 37/38?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved it inside the task factory. Now we for sure have a single aborted$
instance per task
I still would like @elastic/kibana-alerting-services to review how I use their API 🙏
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't look global to me, actually.
A TaskRunner is creates whenever a task is picked up by Kibana, so each task will have its own abort observable, which will be GCed when that task completes.
Looking at this usage, this PR LGTM 👍
x-pack/plugins/data_enhanced/server/search/session/check_running_sessions.test.ts
Outdated
Show resolved
Hide resolved
@elasticmachine merge upstream |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested and I see the behavior is correct, mostly that pages are fetched and updated correctly, but I didn't test any negative scenarios. I don't have time to dive deeper today. If this is urgent, could you ask @lukasolson to take another look?
x-pack/plugins/data_enhanced/server/search/session/monitoring_task.ts
Outdated
Show resolved
Hide resolved
💚 Build Succeeded
Metrics [docs]
History
To update your PR or re-run it, just comment with: |
… fetch and process sessions page by page (elastic#96321)
… fetch and process sessions page by page (elastic#96321)
@@ -39,6 +43,8 @@ function searchSessionRunner( | |||
logger.debug('Search sessions are disabled. Skipping task.'); | |||
return; | |||
} | |||
if (aborted$.getValue()) return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks OK, but it's worth clarifying that if (for some reason) you return here before Task Manager actually calls cancel, then you'll be wiping out your State as TM will think you've just completed the task without returning any state.
It doesn't look like you're using task state, so I think this is fine, but I felt it was worth flagging
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the flag!
Summary
Partially addressed #96131
trackingTimeout
configChecklist
For maintainers