Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: crashpad scope flushing synchronization #1019

Merged
merged 9 commits into from
Jul 23, 2024

Conversation

supervacuus
Copy link
Collaborator

This change cleans up a couple of issues in the current crashpad backend:

  • Via scope-flushing, we use mpack in the signal handler, which uses system malloc. This change configures mpack to use sentry_malloc, which switches to the page-allocator inside the signal handler. Fixes Inconsistent allocator usage causing crashes #687.
  • We flush the scope outside the crash handler and then again inside. This is only necessary on macOS and on Windows when a crash gets handled vie the WER module. There is no need to flush the scope into __sentry_event outside the handler on Linux.
  • The scope-flushing unnecessarily introduced a shared mutable variable between the two paths (via crashpad_backend_flush_scope and the signal handler), which could be called from two different threads. We can ensure that these two paths handle separate local events.
  • Further, the two paths could also lead to overwritten __sentry_event files if the scope flushing is triggered from a thread other than the signal handler thread, silently invalidating any changes to the event coming from the before_send and on_crash hooks. This is now fully synchronized in the following fashion:
    • crashpad_backend_flush_scope signals via std::atomic<bool> that a flush is in progress
    • if the application crashes and our signal handler wants to flush the scope into the crash event, it must wait until any flush-in-progress is finished
    • the signal-handler also signals that the application has crashed and prevents further flushes from outside the signal handler
  • these changes fix Crashpad crash collection is not thread-safe #931
  • last but not least: crashpad invokes the FirstChanceHandler on any incoming signal (Linux) or unhandled exception (Windows). While we locked the signal handler on Linux, the Windows handler would be called twice if another thread also surfaces an exception. This is bad because the signal handler doesn't act like it will only be called once before it terminates. This adapts the code in the crashpad client to prevent second invocations to the FirstChanceHandler. This adapts the proposed changes by: FirstChanceHandler execute once crashpad#95

@supervacuus supervacuus merged commit e58b655 into master Jul 23, 2024
20 checks passed
@supervacuus supervacuus deleted the fix/crashpad_scope_handling branch July 23, 2024 16:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Crashpad crash collection is not thread-safe Inconsistent allocator usage causing crashes
2 participants