[Auditbeat] Cherry-pick #9693 to 6.6: Report process errors #9845

Merged on Jan 4, 2019 (1 commit)


@cwurm cwurm commented Jan 2, 2019

Cherry-pick of PR #9693 to 6.6 branch. Original message:

So far, the process metricset has been rather strict. If an unexpected error occurred while collecting process information, the whole collection would stop and return an error.

This change makes the metricset keep iterating through processes even when that happens. The unexpected error is stored in the Process object, sent to Elasticsearch, and logged as a warning. This happens only the first time the error is encountered for a process, not on subsequent collection cycles; with a typical collection frequency of 1s, reporting it on every cycle would flood the log and ES.
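To illustrate the behavior described above, here is a minimal, self-contained sketch. It is not the code from this PR: the `collector` type, the `reportedErrors` map, and the `enrich` helper are hypothetical names invented for the example, and the real metricset may track already-reported errors differently.

```go
package main

import (
	"errors"
	"fmt"
)

// Process is a simplified, hypothetical stand-in for the metricset's Process type.
type Process struct {
	PID   int
	Name  string
	Error error // set when collecting details for this process failed
}

// collector remembers which PIDs have already had an error reported
// (assumption: the real code may key this differently).
type collector struct {
	reportedErrors map[int]bool
}

// enrich is a hypothetical helper that simulates a failure for even PIDs.
func enrich(pid int) (string, error) {
	if pid%2 == 0 {
		return "", errors.New("permission denied")
	}
	return fmt.Sprintf("proc-%d", pid), nil
}

// collect iterates over all PIDs and never aborts on a per-process error.
// It returns the successfully collected processes and the errored processes
// that have not been reported in an earlier cycle.
func (c *collector) collect(pids []int) (procs []Process, newErrors []Process) {
	for _, pid := range pids {
		p := Process{PID: pid}
		name, err := enrich(pid)
		if err != nil {
			p.Error = err
			if !c.reportedErrors[pid] {
				c.reportedErrors[pid] = true
				newErrors = append(newErrors, p)
			}
			continue // keep going with the next process
		}
		p.Name = name
		procs = append(procs, p)
	}
	return procs, newErrors
}

func main() {
	c := &collector{reportedErrors: map[int]bool{}}
	for cycle := 1; cycle <= 2; cycle++ {
		procs, errs := c.collect([]int{1, 2, 3, 4})
		fmt.Printf("cycle %d: %d ok, %d new errors\n", cycle, len(procs), len(errs))
	}
}
```

In this sketch the second cycle hits the same failures but reports no new errors, which is the flood-avoidance property the description refers to.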

For error documents, it sets event.kind: error and event.action: process_error.

FYI, I have renamed ProcessInfo to Process, not only because it now contains more than just types.ProcessInfo, but also to bring it in line with Socket in socket.go. Socket already contains an Error field (and that was the inspiration for this change).

Beware: The diff GitHub shows is misleading in places. It shows replacements/deletions where a few lines have just moved down a bit.

Some additional background on why this change was made can be found in this comment thread on a PR that introduced some error catching during process collection.

If anybody wants to test what happens with errors, run it as non-root and comment out the continue statement on line 375 - it will then report errors for processes of other users. At some point, we might want to have a test that simulates an error.

@cwurm changed the title from "Cherry-pick #9693 to 6.6: [Auditbeat] Report process errors" to "[Auditbeat] Cherry-pick #9693 to 6.6: Report process errors" on Jan 2, 2019
@elasticmachine

Pinging @elastic/secops

@cwurm cwurm requested review from webmat and andrewkroh and removed request for webmat January 2, 2019 16:14
Changes the process metricset to keep iterating through processes even when an unexpected error occurs. The error will be stored in the Process object and sent to Elasticsearch as well as logged as a warning. This only happens the first time the error is encountered for a process, not on subsequent collection cycles.

(cherry picked from commit 2cd7c42)

@webmat webmat left a comment

No diff vs #9693. LGTM

@cwurm cwurm merged commit 33e0227 into elastic:6.6 Jan 4, 2019
@cwurm cwurm deleted the backport_9693_6.6 branch January 4, 2019 11:06
leweafan pushed a commit to leweafan/beats that referenced this pull request Apr 28, 2023
Changes the process metricset to keep iterating through processes even when an unexpected error occurs. The error will be stored in the Process object and sent to Elasticsearch as well as logged as a warning. This only happens the first time the error is encountered for a process, not on subsequent collection cycles.

(cherry picked from commit f7ce3b1)