heartbeat_manager: reduce log severity when missing partition
The heartbeat manager logs an error message when it attempts to parse a
bad heartbeat response that includes a partition it doesn't know about.
A similar message is logged at the debug level when handling a good
response that includes a partition it doesn't know about, since the
following is a valid series of events:

1. replica R becomes leader of topic partition P
2. R sends out heartbeat requests to P's followers
3. an admin deletes P
4. the controller leader sends the requests to delete P
5. R shuts down its consensus
6. the reply to P's heartbeat is received by R, but the partition no
   longer exists

This sequence is possible regardless of whether the request was
successfully sent out. With the current logging, we log an error if,
for example, the request timed out before step 6 and we then process
the response.

This commit makes this log line less severe, instead logging at the warn
level but with some context explaining when we might expect the log.
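
The sequence above can be reproduced in a small standalone sketch. This is
not the Redpanda implementation: `toy_heartbeat_manager`, `log_entry`,
`race_result` and `simulate_delete_race` are hypothetical names invented
for illustration, and the in-memory log stands in for the real logger. It
only shows that a reply for a deregistered group is skipped and recorded
at the warn level, while replies for known groups are still applied.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-ins for illustration; none of these are Redpanda types.
enum class log_level { debug, warn, error };

struct log_entry {
    log_level level;
    std::string message;
};

using group_id = std::int64_t;

struct toy_heartbeat_manager {
    std::unordered_set<group_id> consensus_groups; // groups hosted locally
    std::vector<log_entry> log;                    // captured log lines

    void register_group(group_id g) { consensus_groups.insert(g); }
    void deregister_group(group_id g) { consensus_groups.erase(g); }

    // Walks the groups covered by one heartbeat reply and returns how many
    // of them were still known locally.
    std::size_t process_reply(const std::vector<group_id>& groups) {
        std::size_t applied = 0;
        for (group_id g : groups) {
            if (consensus_groups.find(g) == consensus_groups.end()) {
                // The group may have been moved or deleted after the
                // request was built, so this is a warning, not an error.
                log.push_back(
                  {log_level::warn,
                   "cannot find consensus group:" + std::to_string(g)
                     + ", may have been moved or deleted"});
                continue;
            }
            ++applied;
        }
        return applied;
    }
};

struct race_result {
    std::size_t applied;
    std::vector<log_entry> log;
};

// Replays steps 1-6 from the commit message with two groups, one of which
// is deleted while its heartbeat is in flight.
race_result simulate_delete_race() {
    toy_heartbeat_manager m;
    m.register_group(1);                   // step 1: R leads P (group 1)
    m.register_group(2);                   // an unrelated, surviving group
    std::vector<group_id> in_flight{1, 2}; // step 2: request covers both
    m.deregister_group(1);                 // steps 3-5: P deleted, consensus stopped
    std::size_t applied = m.process_reply(in_flight); // step 6
    return {applied, m.log};
}
```

Calling `simulate_delete_race()` yields one applied group and a single
warn-level entry for group 1, which mirrors the behaviour this commit
aims for.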
andrwng committed Aug 5, 2022
1 parent 88bf12b commit e9903d5
Showing 1 changed file with 5 additions and 1 deletion.
6 changes: 5 additions & 1 deletion src/v/raft/heartbeat_manager.cc
@@ -281,7 +281,11 @@ void heartbeat_manager::process_reply(
     for (auto& [g, req_meta] : groups) {
         auto it = _consensus_groups.find(g);
         if (it == _consensus_groups.end()) {
-            vlog(hbeatlog.error, "cannot find consensus group:{}", g);
+            vlog(
+              hbeatlog.warn,
+              "cannot find consensus group:{}, may have been moved or "
+              "deleted",
+              g);
             continue;
         }
