
[v22.1.x] Fix duplicates consistency error by caching already translated offsets #5453

Merged 12 commits into redpanda-data:v22.1.x on Jul 14, 2022

Conversation

@rystsov (Contributor) commented Jul 13, 2022

Backport of pull requests #5356 and #5220

fixes #5452

@rystsov (Contributor, Author) commented Jul 13, 2022

There were multiple conflicts during cherry-picking.

@rystsov (Contributor, Author) commented Jul 13, 2022

Had to omit cherry-picking the upgrade tests because the upgrade-testing dependencies are not backported.

@mmedenjak added the kind/bug (Something isn't working), area/kafka, consistency, and kind/backport (PRs targeting a stable branch) labels on Jul 13, 2022
VadimPlh and others added 4 commits July 13, 2022 11:54
(cherry picked from commit 7862d4b)

Conflicts:
	src/v/cluster/controller.cc
	src/v/cluster/controller.h
	src/v/redpanda/application.cc
The Kafka client doesn't process unknown_server_error correctly, and this may lead to duplicates violating idempotency. See the following issue for more info: https://issues.apache.org/jira/browse/KAFKA-14034

request_timed_out, just like unknown_server_error, means that the true outcome of the operation is unknown, but unlike unknown_server_error it doesn't trigger the client problem, so we switch to it to avoid duplicates.

(cherry picked from commit 1a72446)
(cherry picked from commit 1dfd8d9)
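The intent of the error-code substitution can be sketched as follows. The error names mirror the Kafka protocol's error codes, but the mapping function and its signature are illustrative, not Redpanda's actual code:

```cpp
#include <cassert>

// Error codes as defined by the Kafka protocol.
enum class kafka_error {
    unknown_server_error = -1,
    none = 0,
    not_leader_for_partition = 6,
    request_timed_out = 7,
};

// Hypothetical mapping of an internal replication failure to the
// client-visible error. When the true outcome of the write is unknown,
// report request_timed_out rather than unknown_server_error: some
// clients mishandle the latter in a way that can produce duplicate
// records (see KAFKA-14034).
kafka_error map_replication_error(bool outcome_known) {
    return outcome_known ? kafka_error::not_leader_for_partition
                         : kafka_error::request_timed_out;
}
```

Both codes are retriable and signal an indeterminate outcome, so the substitution changes which error the client sees without changing its semantics.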
Update all partition::replicate dependents that don't perform offset translation to bypass it via a direct raft reference

(cherry picked from commit 67a3112)

Conflicts:
	src/v/cluster/partition.h
	src/v/cluster/partition_probe.cc
	src/v/kafka/server/group.cc
@rystsov (Contributor, Author) commented Jul 13, 2022

The sketchiest change is in feature_manager: it uses logical versions, which should be consistent across releases, so if a feature (in this case rm_stm_kafka_cache) has one version in 22.1.5 and a different version in 22.2.1, it will cause a problem. But rm_stm_kafka_cache was added after serde_raft_0, which isn't part of this backport, so it can't have the same version. After this PR is merged I'll update dev to have consistent logical versions.

Updating consumer groups to use conditional replication to prevent a situation where, after a check, leadership jumps away, invalidates the check, and jumps back just in time for the post-check replication:

check condition
  leadership goes to a new node
  the node replicates something which invalidates the condition
  the leadership jumps back
the node successfully replicates, assuming the condition still holds

Switched to a conditional replicate to fix the problem. When a group manager detects a leadership change, it replays the group's records to reconstruct the group's state. We cache the current term in the state and use it as a condition on replicate. This way we know that if the leadership bounces, the replication won't pass.

(cherry picked from commit e693bea)

Conflicts:
	src/v/kafka/server/group.cc
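A minimal sketch of the term-conditioned replicate described above. All names here are illustrative, not the actual Redpanda API:

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the raft layer. The real interface lives in Redpanda's
// raft subsystem; this only models the term check.
struct raft_stub {
    int64_t current_term = 1;

    // Conditional replicate: refuse the write if leadership (and thus
    // the term) changed since the caller captured expected_term.
    bool replicate_if_term(int64_t expected_term) {
        return expected_term == current_term;
    }
};

// The group caches the term observed while replaying its records; a
// later write passes that cached term as the condition. If leadership
// bounced away and back in between, the term advanced and the write
// is rejected instead of silently landing on stale state.
bool conditional_write(raft_stub& r, int64_t cached_term) {
    return r.replicate_if_term(cached_term);
}
```

The key property: a leadership round-trip always increments the term, so the check-then-replicate race in the scenario above is detected even when the same node ends up leader again.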
We're going to mix raft and kafka offsets in the same class. Since both offsets use the same underlying type, it's easy to make an error and treat one as if it were the other. Introducing a kafka offset type to rely on the type system to prevent such errors.

(cherry picked from commit 9335075)

Conflicts:
	src/v/cluster/types.h
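The type-safety idea can be sketched like this. The wrapper types and the translation helper are assumptions for illustration; the real definitions live in src/v/cluster/types.h:

```cpp
#include <cassert>
#include <cstdint>

// Distinct named types for the two offset spaces. Passing a
// kafka_offset where a raft_offset is expected (or vice versa) now
// fails to compile instead of silently misbehaving.
struct raft_offset  { int64_t value; };
struct kafka_offset { int64_t value; };

// Hypothetical translation: kafka offsets exclude raft-internal
// (non-data) batches, so translating subtracts the count of such
// batches below the given raft offset.
kafka_offset to_kafka(raft_offset o, int64_t non_data_batches) {
    return kafka_offset{o.value - non_data_batches};
}
```

With raw int64_t offsets, a call site could mix the two spaces unnoticed; with the wrappers, `to_kafka(kafka_offset{10}, 3)` is a compile error.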
Shifting offset translation down the abstraction stack to eventually reach rm_stm

(cherry picked from commit e3d24d9)
Preparing rm_stm to use a kafka::offset based seq-offset cache. Right now it uses raft offsets, but there is a problem: once cache items become older than the head of the log (eviction), Redpanda becomes unable to use offset translation, so we need to store already translated offsets.

Since the cache is persisted as part of the snapshot, we need to change the disk format and provide backward compatibility. The change is split into two commits. The current commit introduces types to represent the old format (seq_cache_entry_v1 and tx_snapshot_v1) and adds compatibility machinery to convert an old snapshot (tx_snapshot_v1) to a new snapshot (tx_snapshot).

The follow-up commit updates the default types to use the new format and updates the mapping between the old and default types.

(cherry picked from commit 63c5883)

Conflicts:
	src/v/cluster/rm_stm.cc
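A rough sketch of the snapshot upgrade path, under assumed field names and layouts (the real code is in src/v/cluster/rm_stm.cc):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Old on-disk format: entries store raft offsets.
struct seq_cache_entry_v1 { int64_t seq; int64_t raft_offset; };
struct tx_snapshot_v1 { std::vector<seq_cache_entry_v1> seqs; };

// New format: entries store already translated kafka offsets, so they
// stay usable even after the log prefix is evicted.
struct seq_cache_entry { int64_t seq; int64_t kafka_offset; };
struct tx_snapshot { std::vector<seq_cache_entry> seqs; };

// Hypothetical converter: translates each raft offset to a kafka
// offset (here modeled as subtracting the non-data batch count, which
// is still available while the log head exists at upgrade time).
tx_snapshot upgrade(const tx_snapshot_v1& old, int64_t non_data_batches) {
    tx_snapshot fresh;
    for (const auto& e : old.seqs) {
        fresh.seqs.push_back({e.seq, e.raft_offset - non_data_batches});
    }
    return fresh;
}
```

The conversion runs once when reading a v1 snapshot; new snapshots are written only in the new format.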
switching to caching seq-to-kafka offsets to avoid out-of-range errors when translating offsets beyond the eviction point

(cherry picked from commit 4b42c7e)
(cherry picked from commit 065fb54)
(cherry picked from commit 8e7346d)

Conflicts:
	src/v/cluster/feature_table.cc
	src/v/cluster/feature_table.h
	tests/rptest/tests/cluster_features_test.py
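The core idea of the fix, caching already translated kafka offsets keyed by producer sequence number, might look roughly like this (a sketch with invented names, not the actual rm_stm code):

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>

// Cache mapping producer sequence numbers to already translated kafka
// offsets. Because the translated offset is stored directly, a
// duplicate produce request can be answered even after the raft log
// prefix (and the offset-translation state with it) has been evicted.
struct seq_offset_cache {
    std::map<int64_t, int64_t> seq_to_kafka;

    void remember(int64_t seq, int64_t kafka_offset) {
        seq_to_kafka[seq] = kafka_offset;
    }

    // On a duplicate, return the cached kafka offset instead of
    // re-translating a raft offset that may no longer be translatable.
    std::optional<int64_t> lookup(int64_t seq) const {
        auto it = seq_to_kafka.find(seq);
        if (it == seq_to_kafka.end()) {
            return std::nullopt;
        }
        return it->second;
    }
};
```

Before this change, the equivalent cache held raft offsets and had to translate on every duplicate hit, which failed with an out-of-range error once eviction moved the log head past the cached offset.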
@bharathv (Contributor) commented Jul 13, 2022

> The sketchiest change is in feature_manager: it uses logical versions, which should be consistent across releases, so if a feature (in this case rm_stm_kafka_cache) has one version in 22.1.5 and a different version in 22.2.1, it will cause a problem. But rm_stm_kafka_cache was added after serde_raft_0, which isn't part of this backport, so it can't have the same version. After this PR is merged I'll update dev to have consistent logical versions.

Looks risky but I guess we don't have an option here. What you are suggesting makes sense to me. I skimmed through the feature tables on the branches (21.1.x, 22.1.x, dev) and the only conflict seems to be on dev where we need to bump serde_raft_0 and replace the existing serde_raft_0 entry with rm_stm_kafka_cache. Would be amazing to verify the upgrade combinations with an upgrade test though (I hope it is backported).

@bharathv (Contributor) left a comment

lgtm.

I'm not sure what the policy is around backporting new features, but in this case it is a bugfix that needs a serialization change, and it is blocking the 22.1.5 release.

@jcsp you might be interested in this given its implications for the feature table in the dev branch. Just FYI.

@rystsov rystsov merged commit 042089c into redpanda-data:v22.1.x Jul 14, 2022
Labels
area/kafka, area/redpanda, kind/backport (PRs targeting a stable branch), kind/bug (Something isn't working)

4 participants