Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rpc/simple_protocol: disambiguate condition variable timeout exception #6224

Merged
merged 2 commits into from
Aug 26, 2022

Conversation

mmaslankaprv
Copy link
Member

Cover letter

When ss::condition_variable_timeout_exception is thrown from service
handler it should not be treated as generic error but result in timeout
being returned to the client.

Fixes: #6201

Fixes #ISSUE-NUMBER, Fixes #ISSUE-NUMBER, ...

Backport Required

  • not a bug fix
  • issue does not exist in previous branches
  • papercut/not impactful enough to backport
  • v22.2.x
  • v22.1.x
  • v21.11.x

UX changes

Describe in plain language how this PR affects an end-user. What topic flags, configuration flags, command line flags, deprecation policies etc are added/changed.

Release notes

  • none

When `ss::condition_variable_timeout_exception` is thrown from service
handler it should not be treated as generic error but result in timeout
being returned to the client.

Fixes: redpanda-data#6201

Signed-off-by: Michal Maslanka <michal@redpanda.com>
@andrwng
Copy link
Contributor

andrwng commented Aug 25, 2022

This is a good back stop to avoid seeing these errors, but I wonder if it hides underlying bugs. #5461 was actually an improperly handled exception. I think the end behavior with this patch vs with proper handling would be the same (a timeout error), so maybe it's okay.

Regardless, from the logs in the linked issues, the error was logged before upgrading, so I think we will have to add the message to ALLOWED_LOGS

@mmaslankaprv
Copy link
Member Author

This is a good back stop to avoid seeing these errors, but I wonder if it hides underlying bugs. #5461 was actually an improperly handled exception. I think the end behavior with this patch vs with proper handling would be the same (a timeout error), so maybe it's okay.

Regardless, from the logs in the linked issues, the error was logged before upgrading, so I think we will have to add the message to ALLOWED_LOGS

Do you mean the previous version allowed log ?

@andrwng
Copy link
Contributor

andrwng commented Aug 26, 2022

This is a good back stop to avoid seeing these errors, but I wonder if it hides underlying bugs. #5461 was actually an improperly handled exception. I think the end behavior with this patch vs with proper handling would be the same (a timeout error), so maybe it's okay.
Regardless, from the logs in the linked issues, the error was logged before upgrading, so I think we will have to add the message to ALLOWED_LOGS

Do you mean the previous version allowed log ?

Yeah, I think this error is coming from an older version, so dev patch won't help much.

This seems to be a creeping pattern, that log error messages that have been fixed will continue to show up in upgrade tests. I'm not sure what a good policy is to address this long term. So far we've been patching by including to the allowed logs list. Maybe we should just collect bad log messages in some global V22_1_ALLOWED_LOG_LIST or somesuch, and maybe we can improve the log parsing to target lines on specific versions (between the initial banner messages).

Signed-off-by: Michal Maslanka <michal@redpanda.com>
@rystsov rystsov merged commit 417f150 into redpanda-data:dev Aug 26, 2022
Copy link
Member

@dotnwat dotnwat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CI Failure (BadLog) in ControllerUpgradeTest.test_updating_cluster_when_executing_operations
4 participants