Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memory termination #476

Merged
merged 40 commits into from
May 24, 2024
Merged

Memory termination #476

merged 40 commits into from
May 24, 2024

Conversation

naftaly
Copy link
Contributor

@naftaly naftaly commented May 12, 2024

Adding OOM support to KSCrash.

In a gist, this works by writing an OOM report to disk outside of the report area for later modification. We also keep track of memory information through a memory mapped file for crash resistance, we write to it anytime memory limit or pressure changes. On termination, that data is basically a breadcrumb we can use on cold start.

On cold start, we look for that mapped file, if memory pressure is >= critical or memory level is >= critical it means we were terminated due to memory. At this point we'll read in the OOM report we write to disk on the previous app launch, modify it to fit our needs with latest information and move it to the report folder where the rest of the system will pick it up and send it in.

One thing that would make everything better is to have new repositories for each cold start. it can make handling files a lot more robust.

@GLinnik21 GLinnik21 linked an issue May 12, 2024 that may be closed by this pull request
Sources/KSCrashRecording/KSCrashAppMemory.mm Outdated Show resolved Hide resolved
Sources/KSCrashRecording/KSCrashAppMemory.mm Outdated Show resolved Hide resolved
Sources/KSCrashRecording/KSCrashAppMemory.mm Outdated Show resolved Hide resolved
Sources/KSCrashRecording/KSCrashAppMemory.mm Outdated Show resolved Hide resolved
Sources/KSCrashRecording/Monitors/KSCrashMonitor_Memory.mm Outdated Show resolved Hide resolved
Sources/KSCrashRecording/Monitors/KSCrashMonitor_Memory.mm Outdated Show resolved Hide resolved
Sources/KSCrashRecording/Monitors/KSCrashMonitor_Memory.mm Outdated Show resolved Hide resolved
@GLinnik21
Copy link
Collaborator

I'm really impressed by the clever way you've used memory-mapped files! It's a great solution. I just wanted to check in about something. How can we be sure that data cached in memory is going to be written to disc by the kernel in case of a crash?

@naftaly
Copy link
Contributor Author

naftaly commented May 15, 2024

I'm really impressed by the clever way you've used memory-mapped files! It's a great solution. I just wanted to check in about something. How can we be sure that data cached in memory is going to be written to disc by the kernel in case of a crash?

The way I understand it is that mmap is handled in kernel space and not user space, which leads to file based mmap to basically be a “kernel cache” that is always dumped to disk based on the mapping sync functions, and a process being terminated is one of those instances where the kernel will explicitly call those sync functions, which leads to crash resilience.

@GLinnik21 GLinnik21 deleted the branch kstenerud:master May 17, 2024 14:34
@GLinnik21 GLinnik21 closed this May 17, 2024
@naftaly
Copy link
Contributor Author

naftaly commented May 17, 2024

@GLinnik21 Can you reopen the PR so I can change the base to main and we can continue work on it?

@GLinnik21
Copy link
Collaborator

Oh sorry. It got automatically closed because I merged and deleted release-2.0 to master. I can try to restore release-2.0 or you can try changing the target branch to kstenerud:master. What works better for you?
image

@GLinnik21 GLinnik21 reopened this May 17, 2024
@naftaly
Copy link
Contributor Author

naftaly commented May 17, 2024

I wasn't able to change the base for some reason, likely because the PR is closed.
Screenshot 2024-05-17 at 7 40 22 PM

@naftaly
Copy link
Contributor Author

naftaly commented May 17, 2024

Oh, there we go :) thank you.

@naftaly naftaly changed the base branch from release-2.0 to master May 17, 2024 23:42
@naftaly naftaly marked this pull request as ready for review May 18, 2024 23:44
@naftaly naftaly requested a review from GLinnik21 May 18, 2024 23:44
@GLinnik21 GLinnik21 requested a review from bamx23 May 19, 2024 13:00
@GLinnik21
Copy link
Collaborator

Could you add some tests to cover the new functionality where possible? It would help maintain the quality and stability of the project. Thanks!

@naftaly
Copy link
Contributor Author

naftaly commented May 19, 2024

Could you add some tests to cover the new functionality where possible? It would help maintain the quality and stability of the project. Thanks!

I've added a few tests. If there's anything in particular you'd like to be tested feel free to point it out, I'm happy to add more.

@naftaly naftaly requested a review from GLinnik21 May 20, 2024 20:02
@naftaly
Copy link
Contributor Author

naftaly commented May 21, 2024

Any clues what’s going on with this check, I don’t see what’s wrong.

@GLinnik21
Copy link
Collaborator

Any clues what’s going on with this check, I don’t see what’s wrong.

Let's try to rerun for now.

@GLinnik21
Copy link
Collaborator

I wanted to share a thought on organizing our code. When we put multiple class implementations in a single file, especially in both implementation and header files, it can make the code harder to read and maintain. Keeping everything in separate files can really simplify code reading and understanding. Some of our files with multiple implementations have grown to several hundred lines of code, which can be overwhelming. What do you think about splitting them up to make things more manageable?

@naftaly
Copy link
Contributor Author

naftaly commented May 24, 2024

@bamx23 @GLinnik21 I addressed the last few pieces of feedback. I found a simple way to enable/disable reporting OOMs but keep the data flowing and added to other types of reports.

I also wanted to let both of you know I wrote a small piece about what we did here. https://medium.com/@alexandercohen/reducing-memory-terminations-in-ios-apps-3e76797ca5bd

@naftaly
Copy link
Contributor Author

naftaly commented May 24, 2024

Could you folks rerun the actions, they seem to have failed due to a timeout.

@GLinnik21
Copy link
Collaborator

@GLinnik21 do you have anything left to discuss here or are you good to merge?

As always, the more tests, the better. If other code is too low-level to test, I'm fine with that.

@GLinnik21
Copy link
Collaborator

I also wanted to let both of you know I wrote a small piece about what we did here. https://medium.com/@alexandercohen/reducing-memory-terminations-in-ios-apps-3e76797ca5bd

Wow! I read the whole article and found it very insightful. Thank you for the shoutout!

Copy link
Collaborator

@bamx23 bamx23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! Let's go!

funny approve gif

@bamx23 bamx23 merged commit 992a1c6 into kstenerud:master May 24, 2024
19 checks passed
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Memory Termination Support
3 participants