More detailed partition reconfiguration tracking #10201

Merged

Member

@mmaslankaprv commented Apr 19, 2023

Enriched the /reconfigurations API with more information, allowing users to track the progress of partition reconciliation. The API now returns a complete set of information about each partition reconfiguration that is in progress.

The API will now return the following JSON:

{
    "ns": "kafka",
    "topic": "topic-khbikkrzeo",
    "partition": 9,
    "previous_replicas": [
        {
            "node_id": 2,
            "core": 0
        },
        {
            "node_id": 3,
            "core": 0
        },
        {
            "node_id": 1,
            "core": 0
        }
    ],
    "current_replicas": [
        {
            "node_id": 4,
            "core": 0
        },
        {
            "node_id": 3,
            "core": 0
        },
        {
            "node_id": 1,
            "core": 0
        }
    ],
    "bytes_left_to_move": 190,
    "bytes_moved": 0,
    "partition_size": 190,
    "reconciliation_statuses": [
        {
            "node_id": 2,
            "operations": [
                {
                    "type": "update",
                    "core": 0,
                    "retry_number": 7,
                    "revision": 89,
                    "status": "Generic failure occurred during partition operation execution (cluster::errc:52)"
                }
            ]
        },
        {
            "node_id": 1,
            "operations": [
                {
                    "type": "update",
                    "core": 0,
                    "retry_number": 3,
                    "revision": 89,
                    "status": "Current node is not a leader for partition (cluster::errc:17)"
                }
            ]
        },
        {
            "node_id": 4,
            "operations": [
                {
                    "type": "update",
                    "core": 0,
                    "retry_number": 5,
                    "revision": 89,
                    "status": "Current node is not a leader for partition (cluster::errc:17)"
                }
            ]
        },
        {
            "node_id": 3,
            "operations": [
                {
                    "type": "update",
                    "core": 0,
                    "retry_number": 5,
                    "revision": 89,
                    "status": "Current node is not a leader for partition (cluster::errc:17)"
                }
            ]
        }
    ]
}
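
As a rough illustration of how a client might turn the byte counters above into a progress figure, here is a minimal sketch. It assumes that the total amount of data to transfer equals `bytes_moved + bytes_left_to_move`; that interpretation, the `reconfiguration_progress` struct, and the `percent_done` helper are hypothetical and not part of this PR.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical client-side view of the byte counters in the response.
struct reconfiguration_progress {
    std::uint64_t bytes_moved;
    std::uint64_t bytes_left_to_move;
};

// Returns progress in percent (0..100).
// Assumption: total bytes to transfer == bytes_moved + bytes_left_to_move.
inline double percent_done(const reconfiguration_progress& p) {
    const std::uint64_t total = p.bytes_moved + p.bytes_left_to_move;
    if (total == 0) {
        return 100.0; // nothing to move, treat as complete
    }
    return 100.0 * static_cast<double>(p.bytes_moved)
           / static_cast<double>(total);
}
```

For the sample response above (`bytes_moved` = 0, `bytes_left_to_move` = 190), this helper would report 0% done.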

Fixes: https://github.com/redpanda-data/core-internal/issues/444

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.1.x
  • v22.3.x
  • v22.2.x

Release Notes

Improvements

  • more observability in partition reconfigurations

@mmaslankaprv changed the title from "Partition moving progress api" to "More detailed partition reconfiguration tracking" on Apr 19, 2023
@emaxerrno
Contributor

@mmaslankaprv - what's the difference between "shard" and "core"? Can you use consistent naming?

@emaxerrno
Contributor

@mmaslankaprv - this needs some rpk progress bar of sorts, so you can watch it with something like

watch rpk repartition-progress

or some similar command line for it.

@mmaslankaprv
Member Author

@mmaslankaprv - what's the difference between "shard" and "core"? Can you use consistent naming?

You are right, I changed it to be consistent.

@dotnwat
Member

dotnwat commented Apr 21, 2023

/ci-repeat 1

Member

@dotnwat left a comment

does this need a dedicated ducktape test?

@@ -2605,6 +2605,32 @@ admin_server::mark_transaction_expired_handler(
});
}

ss::future<ss::json::json_return_type>
Member

doesn't look like this needs to be a coroutine

Member Author

I use future<> here because in the next commit I actually leverage this function being a coroutine.

Comment on lines 498 to 483
co_await ss::maybe_yield();
}

using ret_t = result<std::vector<ntp_reconciliation_state>>;

auto node_results = co_await ssx::parallel_transform(
Member

Was the maybe_yield used because the set could be large? If that's true, then should concurrency be limited for the parallel transform?

Member Author

The set of ntps can indeed be large, but here we are limited to the number of nodes because the ntps are grouped by node, so I think there is no need to limit concurrency here.
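
To illustrate the grouping argument, here is a minimal sketch of why the fan-out stays bounded by the node count. The `replica_ntp` type and the `group_by_node` helper are hypothetical simplifications, not the actual Redpanda types, and the ntp strings are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified types; not the actual Redpanda definitions.
using node_id = int;

struct replica_ntp {
    std::string ntp; // e.g. "kafka/topic/9"
    node_id node;    // node hosting a replica of this ntp
};

// Group ntps by the node that hosts them. Issuing one request per map
// entry bounds the number of parallel requests by the number of distinct
// nodes, regardless of how many ntps there are.
inline std::map<node_id, std::vector<std::string>>
group_by_node(const std::vector<replica_ntp>& replicas) {
    std::map<node_id, std::vector<std::string>> grouped;
    for (const auto& r : replicas) {
        grouped[r.node].push_back(r.ntp);
    }
    return grouped;
}
```

With this shape, thousands of ntps spread over three nodes would still produce only three entries, and therefore three concurrent requests.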

Signed-off-by: Michal Maslanka <michal@redpanda.com>
In order to provide a generic error code to express errors originating
from outside of the cluster module (errors with a different category) or
exceptions that occurred in `controller_backend`, we introduce a separate
error code.

Signed-off-by: Michal Maslanka <michal@redpanda.com>
src/v/cluster/controller_backend.cc Outdated
if (ec.category() == error_category()) {
it->last_error = static_cast<errc>(ec.value());
} else {
it->last_error = errc::partition_operation_failed;
Contributor

why can't we just save the error code as is?

Member Author

We need to send it over RPC, and we do not have a way to serialize the error category.
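
The idea discussed in this thread can be sketched in isolation as follows. This is a self-contained illustration, not the code from the PR: the `sketch` namespace, the `to_wire_error` helper, and the reduced `errc` enum are hypothetical stand-ins, and the numeric values are only illustrative. A `std::error_code` carries a reference to its category object, which has no portable wire representation, so any code from a foreign category is collapsed into one generic enum value before serialization.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <system_error>

namespace sketch {

// Hypothetical stand-in for cluster::errc; values are illustrative only.
enum class errc : std::int16_t {
    success = 0,
    not_leader = 17,
    partition_operation_failed = 52,
};

class errc_category final : public std::error_category {
public:
    const char* name() const noexcept override { return "sketch::errc"; }
    std::string message(int v) const override {
        switch (static_cast<errc>(v)) {
        case errc::success: return "success";
        case errc::not_leader: return "not leader";
        case errc::partition_operation_failed:
            return "partition operation failed";
        }
        return "unknown";
    }
};

inline const std::error_category& error_category() {
    static const errc_category instance;
    return instance;
}

inline std::error_code make_error_code(errc e) {
    return {static_cast<int>(e), error_category()};
}

// Collapse any std::error_code into a plain enum that can be serialized.
// Codes from our own category keep their value; everything else maps to
// the generic partition_operation_failed value.
inline errc to_wire_error(const std::error_code& ec) {
    if (ec.category() == error_category()) {
        return static_cast<errc>(ec.value());
    }
    return errc::partition_operation_failed;
}

} // namespace sketch
```

The trade-off is that the receiver loses the original foreign error, which is why the status string for such failures can only say that a generic failure occurred.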

src/v/cluster/controller_api.cc Outdated
src/v/redpanda/admin/api-doc/partition.json Outdated
src/v/redpanda/admin_server.cc Outdated
Comment on lines 2666 to 2667
size_t left_to_move = 0;
size_t already_moved = 0;
Contributor

Not blocking this PR, but it would be really great to have these available as metrics, drilled down per node, to be able to see the dynamics.

Member Author

good idea, we will do it as a follow up

Contributor

@bharathv left a comment

this is pretty cool.. I only have a minor comment.

src/v/cluster/controller_api.cc Outdated
Signed-off-by: Michal Maslanka <michal@redpanda.com>
Added revision, last error and retry count to backend operation. The
information will be used to track partition reconfiguration progress.

Signed-off-by: Michal Maslanka <michal@redpanda.com>
Added `controller_api` that allows caller to request partition
reconciliation state from all the replicas where partition is currently
hosted. The API returns a data structure containing operations that are
executed by `controller_backend` on all of the replicas.

Signed-off-by: Michal Maslanka <michal@redpanda.com>
@mmaslankaprv
Member Author

ci failure: #10163

ztlpn previously approved these changes Apr 25, 2023
bharathv previously approved these changes Apr 26, 2023
The `/reconfigurations` endpoint didn't provide insight into the
progress of partition reconfigurations.

Added information that allows the user to check the operation progress
and additionally check the status of reconciliation on all replicas.

Signed-off-by: Michal Maslanka <michal@redpanda.com>
Signed-off-by: Michal Maslanka <michal@redpanda.com>
@mmaslankaprv dismissed stale reviews from bharathv and ztlpn via d83a975 April 26, 2023 07:03
@mmaslankaprv merged commit 60079f3 into redpanda-data:dev Apr 27, 2023
@mmaslankaprv deleted the partition-moving-progress-api branch April 27, 2023 05:35
@vshtokman
Contributor

/backport v23.1.x

@vbotbuildovich
Collaborator

Failed to run cherry-pick command. I executed the below command:

git cherry-pick -x cd455277bf16c14d2f56cd1efec5b68d37b0f90a 711949d999520d41f2e45f5d1912c45b255e7f1b 8f83991236f209513a8e1e292d7bc0bc9037d9ca 8e01dd02678263b6fc4e46b4619af88292fe477d 179969311f7f249e90722f0c26c9651bc3a556d1 b2467b331364f5f6da21b049cf1155b246c49f8d d83a975e5ec44294c78451e4a3c88ef43a45c59c

Workflow run logs.

@vshtokman
Contributor

/backport v23.1.x

@vbotbuildovich
Collaborator

Failed to run cherry-pick command. I executed the below command:

git cherry-pick -x cd455277bf16c14d2f56cd1efec5b68d37b0f90a 711949d999520d41f2e45f5d1912c45b255e7f1b 8f83991236f209513a8e1e292d7bc0bc9037d9ca 8e01dd02678263b6fc4e46b4619af88292fe477d 179969311f7f249e90722f0c26c9651bc3a556d1 b2467b331364f5f6da21b049cf1155b246c49f8d d83a975e5ec44294c78451e4a3c88ef43a45c59c

Workflow run logs.

@vshtokman
Contributor

/backport v23.1.x

@vbotbuildovich
Collaborator

Failed to run cherry-pick command. I executed the below command:

git cherry-pick -x cd455277bf16c14d2f56cd1efec5b68d37b0f90a 711949d999520d41f2e45f5d1912c45b255e7f1b 8f83991236f209513a8e1e292d7bc0bc9037d9ca 8e01dd02678263b6fc4e46b4619af88292fe477d 179969311f7f249e90722f0c26c9651bc3a556d1 b2467b331364f5f6da21b049cf1155b246c49f8d d83a975e5ec44294c78451e4a3c88ef43a45c59c

Workflow run logs.

@vshtokman
Contributor

/backport v22.2.x

@vbotbuildovich
Collaborator

Failed to run cherry-pick command. I executed the below command:

git cherry-pick -x cd455277bf16c14d2f56cd1efec5b68d37b0f90a 711949d999520d41f2e45f5d1912c45b255e7f1b 8f83991236f209513a8e1e292d7bc0bc9037d9ca 8e01dd02678263b6fc4e46b4619af88292fe477d 179969311f7f249e90722f0c26c9651bc3a556d1 b2467b331364f5f6da21b049cf1155b246c49f8d d83a975e5ec44294c78451e4a3c88ef43a45c59c

Workflow run logs.

@vshtokman
Contributor

Please ignore the /backport v22.2.x command. I was testing out the backport bot.

bharathv added a commit that referenced this pull request May 10, 2023
[backport] [v23.1.x] More detailed partition reconfiguration tracking #10201
7 participants