Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage: assert out on EIO in read #7633

Merged
merged 1 commit into from
Dec 12, 2022
Merged

Conversation

jcsp
Copy link
Contributor

@jcsp jcsp commented Dec 5, 2022

We already assert out on EIO in writes. However on reads we only surfaced the exceptional future, and some code paths (like state_machine.cc) have general exception handlers that retry forever, resulting in nodes with bad disks staying up but not progressing.

Fixes #7637

Backports Required

  • none - not a bug fix
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v22.3.x
  • v22.2.x
  • v22.1.x

UX Changes

See release notes.

Release Notes

Bug Fixes

  • In some cases, Redpanda could hang on EIOs from the underlying storage device. This behavior has changed to terminate redpanda with an assertion on EIO, in anticipation of the node/drive requiring replacement.

@jcsp jcsp marked this pull request as ready for review December 6, 2022 14:03
Comment on lines +215 to +216
if (ec.code().value() == EIO) {
vassert(false, "I/O error during read! Disk failure?");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

I'm curious if we'll hit a crash loop in automated deployments, but this still seems much better than the alternative. I wonder if longer term we'd want to peel back this limitation, e.g. assuming the entire disk isn't fully borked, stopping/moving just the replicas that hit a bad reads, while gracefully decommissioning the node.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Longer term, I suspect that for things like decode errors we'll want to treat them as non-crashing errors and report a "damaged" partition, but for EIOs we might always keep the termination behavior, as it's such a critical failure indicator.

@jcsp jcsp merged commit 8694d84 into redpanda-data:dev Dec 12, 2022
@jcsp jcsp deleted the storage-assert-eio branch December 12, 2022 09:37
@jcsp
Copy link
Contributor Author

jcsp commented Dec 12, 2022

/backport v22.3.x

@jcsp
Copy link
Contributor Author

jcsp commented Dec 12, 2022

/backport v22.2.x

@jcsp
Copy link
Contributor Author

jcsp commented Dec 12, 2022

/backport v22.1.x

@vbotbuildovich
Copy link
Collaborator

Failed to run cherry-pick command. I executed the below command:

git cherry-pick -x 875faa41a0019ce2bf8cec82bc96d656a007494b

Workflow run logs.

@jcsp
Copy link
Contributor Author

jcsp commented Dec 12, 2022

Manual backport to v22.1.x at #7703

Comment on lines +218 to +219
return ss::make_exception_future<result<records_t>>(
std::current_exception());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we should pass ec here instead of std::current_exception. While future::handle_exception_type does invoke its continuation inside a catch block, that doesn't seem to be a guarantee that the API needs to maintain.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

storage: state machine readers spin on EIO
5 participants