eks-log-collector.sh: add reboot history #1920

Merged 1 commit on Aug 20, 2024

gomesdigital
Contributor

Issue #, if available:
#1919

Description of changes:
Adds a reboot history log to the system directory of the collected logs. The log lists each reboot event with its timestamp and status.
e.g.

reboot   system boot  6.1.102-108.177. Sun Aug 11 01:16   still running
reboot   system boot  6.1.102-108.177. Sun Aug 11 01:15   still running

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Testing Done
Tested by executing the script on a c6g.2xlarge running AL23. The script completed successfully and printed the following feedback:

Trying to collect Docker daemon logs... 
Trying to collect sandbox-image daemon information... 
Trying to collect CPU Throttled Process Information... 
Trying to collect IO Throttled Process Information... 
Trying to collect reboot history... 
Trying to collect Nvidia Bug report... No Nvidia drivers found, nothing to do.

Trying to archive gathered information... 

	Done... your bundled logs are located in /var/log/eks_<instance-id>_2024-08-12_1313-UTC_0.7.8.tar.gz

See this guide for recommended testing for PRs. Some tests may not apply. Completing tests and providing additional validation steps are not required, but it is recommended and may reduce review time and time to merge.

@@ -800,8 +801,14 @@ get_io_throttled_processes() {
ok
}

get_reboot_history() {
try "collect reboot history"
timeout 75 last reboot > "${COLLECT_DIR}"/system/last_reboot.txt 2>&1 || echo -e "\tTimed out, ignoring \"reboot history output \" "
Member

I usually use this:

journalctl --list-boots

any advantage to last reboot?

Contributor Author

@gomesdigital gomesdigital Aug 12, 2024

Yeah, last reboot includes a status for each boot, whereas journalctl --list-boots doesn't. From that status we can deduce whether a shutdown was graceful or not. In our case we see two "still running" statuses, indicating that the previous reboot was unexpected.
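
The reasoning in this comment can be sketched as a small check over the collected file. This is only an illustration, not part of eks-log-collector.sh: the function names count_still_running and check_reboot_history are hypothetical, and the rule "more than one still running record suggests an unclean shutdown" is taken from the comment above rather than from any documented guarantee of last.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of eks-log-collector.sh): count the boot
# records in `last reboot` style text on stdin that are marked "still running".
count_still_running() {
  # grep -c prints 0 and exits non-zero when nothing matches, so mask the
  # exit status to keep callers running under `set -e` happy.
  grep -c 'still running' || true
}

# Hypothetical helper: apply the heuristic from the comment above. Only the
# current boot is expected to be "still running"; any additional record
# suggests an earlier boot was never closed out by a shutdown entry.
check_reboot_history() {
  local count
  count="$(count_still_running)"
  if [ "${count}" -gt 1 ]; then
    echo "possible unclean shutdown: ${count} boots marked still running"
  else
    echo "no unclean shutdown detected"
  fi
}
```

Assuming the log bundle has been unpacked, it could be run as check_reboot_history < "${COLLECT_DIR}"/system/last_reboot.txt. For the two-line sample in the description it reports a possible unclean shutdown.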

Member

Makes sense 👍

@cartermckinnon cartermckinnon merged commit 56e4d46 into awslabs:main Aug 20, 2024
10 checks passed