
prospective-parachains: respond with multiple backable candidates #3160

Merged: 6 commits into master, Feb 6, 2024

Conversation

@alindima (Contributor) commented Jan 31, 2024:

Fixes #3129

@alindima marked this pull request as draft January 31, 2024 16:06
@alindima added the T0-node (related to the topic “node”), I5-enhancement (an additional feature request) and T8-polkadot (related to/affects the Polkadot network) labels Jan 31, 2024
// Try finding a candidate chain starting from `base_node` of length `expected_count`.
// If not possible, return the longest we could find.
// Does a depth-first search, since we're optimistic that there won't be more than one such
// chain. Assumes that the chain must be acyclic.

Member:

What happens if they are cyclic? What is the worst-case complexity here, assuming parachains misbehave as much as they possibly can?

Contributor Author (@alindima):

That's a good question. We stop the search on the respective branch when we see that node A has a child that we've already visited. Maybe we should just return an error; I don't think it's a valid state transition to accept. Is it?

Even if we were to accept cycles, we'd still be bounded by a function of the requested number of candidates.

Contributor Author (@alindima):

I modified the code in the latest revision to accept cycles in the fragment trees. This is actually documented behaviour of prospective-parachains, with some tests for it. See:

I also documented the performance considerations. Cycles are not an issue, because we are limited by the core count anyway, so we won't loop indefinitely.
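
For illustration, a minimal sketch of the kind of bounded depth-first search being discussed, using hypothetical NodeId/Tree types rather than the actual prospective-parachains data structures. A branch is abandoned when it would revisit a node already on the current path, and the whole search stops once a chain of the requested length is found, so cycles cannot make it loop indefinitely:

use std::collections::HashSet;

type NodeId = usize;

struct Tree {
    // children[node] lists the child node ids of `node`.
    children: Vec<Vec<NodeId>>,
}

// Try to find a chain of `expected_count` nodes starting from `base`;
// if that's not possible, return the longest chain found.
fn find_backable_chain(tree: &Tree, base: NodeId, expected_count: usize) -> Vec<NodeId> {
    let mut best = Vec::new();
    let mut path = Vec::new();
    let mut on_path = HashSet::new();
    dfs(tree, base, expected_count, &mut on_path, &mut path, &mut best);
    best
}

fn dfs(
    tree: &Tree,
    node: NodeId,
    expected_count: usize,
    on_path: &mut HashSet<NodeId>,
    path: &mut Vec<NodeId>,
    best: &mut Vec<NodeId>,
) {
    if !on_path.insert(node) {
        // Cycle: `node` is already on the current path, so stop this branch.
        return;
    }
    path.push(node);
    if path.len() > best.len() {
        *best = path.clone();
    }
    if path.len() < expected_count {
        for &child in &tree.children[node] {
            dfs(tree, child, expected_count, on_path, path, best);
            if best.len() >= expected_count {
                // A chain of the requested length exists; stop searching.
                break;
            }
        }
    }
    path.pop();
    on_path.remove(&node);
}

fn main() {
    // Nodes 0 -> 1 -> 2 with a back-edge 2 -> 0; request a chain of 3.
    let tree = Tree { children: vec![vec![1], vec![2], vec![0]] };
    assert_eq!(find_backable_chain(&tree, 0, 3), vec![0, 1, 2]);
}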

polkadot/node/core/prospective-parachains/src/lib.rs (review thread marked outdated/resolved)
@alindima marked this pull request as ready for review February 2, 2024 13:39

@eskimor (Member) left a comment:

Looks sensible! Algorithmic complexity still makes me uneasy; I pinged Rob on what his stance on forks is by now.

// 2. `expected_count` is equal to the number of cores a para is scheduled on (in an elastic
// scaling scenario). For non-elastic-scaling, this is just 1. In practice, this should be a
// small number (1-3), capped by the total number of available cores (a constant alterable
// only via governance/sudo).

Member:

So with realistic numbers (current, and expected or exaggerated future ones): how bad does it get? What would the limits on those parameters be?

Contributor:

Elastic scaling will eventually enable chains to be bounded by the rate of a single node's block production.

In Cumulus currently, block authorship is typically slower than block verification (in validators) due to the need to read from disk. Most of the time is spent in accessing the trie for storage, not in code execution. Access patterns for the trie are also very inefficient. So it's not feasible to use more than 3 cores until that bottleneck is cleared.

If this bottleneck is alleviated, then I could see parachains using many more cores.

Contributor Author (@alindima) commented Feb 5, 2024:

Thinking about this a bit more, I don't think this traversal is any different in terms of worst-case complexity from the routine used for populating the fragment tree (which populates the tree breadth-first). Is it?
In an acyclic tree, it'll visit each node at most once, stopping early once it reaches the requested count. In a cyclic one, it'd be bounded by the requested count, so it's the same complexity.

The complexity is linear in the number of nodes, but the number of nodes is an exponential function (because it's an n-ary tree, at least in theory, with arity num_forks and height max_candidate_depth).

If we do hit a bottleneck, this wouldn't be the first place where it'd surface.

I think we could limit the total number of forks allowed for a parachain (regardless of their level in the tree); capping the total number of branches in the tree would bring the worst-case complexity down massively. WDYT?
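
To put rough numbers on that bound (an illustration, with f standing for num_forks and h for max_candidate_depth): such a tree has at most

  1 + f + f^2 + ... + f^h = (f^(h+1) - 1) / (f - 1)

nodes. With f = 3 and h = 4 that is 121 nodes; with f = 5 and h = 10 it is already about 12.2 million, which is why capping the total number of forks per parachain would cut the worst case down so sharply.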

Contributor Author (@alindima):

Anyway, I think this is somewhat orthogonal to this PR.
To quote the docs from the fragment_tree module:

//! The code in this module is not designed for speed or efficiency, but conceptual simplicity.
//! Our assumption is that the amount of candidates and parachains we consider will be reasonably
//! bounded and in practice will not exceed a few thousand at any time. This naive implementation
//! will still perform fairly well under these conditions, despite being somewhat wasteful of
//! memory.

We should maybe revisit this assumption, but it doesn't seem like it'll be a problem for the foreseeable future.

Contributor:

Prospective-parachains seems to be a good subsystem benchmark target. That should give some clear numbers on CPU usage in worst-case scenarios.

Contributor Author (@alindima):

Opened a separate issue to track this security item and added it to the elastic scaling task: #3219

@sandreim (Contributor) left a comment:

🚀

// TODO [now]: taking the first selection might introduce bias
// TODO: taking the first best selection might introduce bias

Contributor:

Is this to be solved in this PR?

Contributor Author (@alindima):

No, this was here beforehand.
It could be solved by randomising the order in which we visit a node's children, but that would make tests harder to write.
Since validators don't favour particular collators when requesting collations, the order of potential forks in the fragment tree should already be somewhat "random", based on network latency (unless collators find a way to censor/DoS other collators).
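
For reference, a minimal sketch of the randomisation idea mentioned here (assuming the rand crate and a hypothetical list of child indices; not the actual subsystem code):

use rand::seq::SliceRandom;

fn visit_children(mut children: Vec<usize>) {
    // Shuffle so that no particular collator's fork is systematically preferred.
    children.shuffle(&mut rand::thread_rng());
    for child in children {
        // ... descend into `child` as before ...
        let _ = child;
    }
}

Tests could stay deterministic by seeding the RNG (e.g. through rand's SeedableRng::seed_from_u64) instead of using thread_rng, which would address the concern about making tests harder to write.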


// Parachain 1 looks like this:

Contributor:

🥇

@alindima added this pull request to the merge queue Feb 6, 2024
Merged via the queue into master with commit 7df1ae3, Feb 6, 2024
130 checks passed
@alindima deleted the alindima/elastic-scaling-dev branch February 6, 2024 09:02
github-merge-queue bot pushed a commit that referenced this pull request Mar 1, 2024
#3130

builds on top of #3160

Processes the availability cores and builds a record of how many
candidates it should request from prospective-parachains and their
predecessors.
Tries to supply as many candidates as the runtime can back. Note that
the runtime changes to back multiple candidates per para are not yet
done, but this paves the way for it.

The following backing/inclusion policy is assumed:
1. The runtime will never back candidates of the same para which don't form a chain with the already backed candidates, even if the others are still pending availability. We're optimistic that they won't time out, and we don't want to back parachain forks (as the complexity would be huge).
2. If a candidate is timed out of the core before being included, all of its successors occupying a core will be evicted.
3. Only the candidates which are made available and form a chain starting from the on-chain para head may be included/enacted and cleared from the cores. In other words, if the para head is at A and the cores are occupied by B->C->D, and B and D are made available, only B will be included and its core cleared. C and D will remain on their cores, waiting for C to be made available or timed out. As point (2) already says, if C is timed out, D will also be dropped (a sketch of this rule follows the commit message).
4. The runtime will deduplicate candidates which form a cycle. For example, if the provisioner supplies candidates A->B->A, the runtime will only back A (as the state output will be the same).

Note that if a candidate is timed out, we don't guarantee that in the next relay chain block the block author will be able to fill all of the timed-out cores of the para; that would increase complexity by a lot. Instead, the provisioner will supply N candidates, where N is the number of candidates timed out, but won't include their successors, which will also be deleted by the runtime. These will be backfilled in the next relay chain block.

Adjacent changes:
- Also fixes: #3141
- For non-prospective-parachains, don't supply multiple candidates per para (we can't have elastic scaling without prospective parachains enabled). paras_inherent should already sanitise this input, but it's more efficient this way.

Note: all of these changes are backwards-compatible with the
non-elastic-scaling scenario (one core per para).
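
A minimal sketch of inclusion rule (3) from the message above, with hypothetical types; the real runtime logic is more involved, but the prefix rule reads like this:

// The para's pending candidates, in chain order starting right after the
// on-chain para head, each flagged with whether it has been made available.
struct PendingCandidate {
    id: char,
    available: bool,
}

// Only the longest fully-available prefix of the chain is included;
// everything after the first unavailable candidate stays on its core.
fn included_prefix(chain: &[PendingCandidate]) -> Vec<char> {
    chain.iter().take_while(|c| c.available).map(|c| c.id).collect()
}

fn main() {
    // Para head at A; cores occupied by B -> C -> D, with B and D available.
    let chain = [
        PendingCandidate { id: 'B', available: true },
        PendingCandidate { id: 'C', available: false },
        PendingCandidate { id: 'D', available: true },
    ];
    // Only B is included; C and D keep waiting on their cores.
    assert_eq!(included_prefix(&chain), vec!['B']);
}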
skunert pushed a commit to skunert/polkadot-sdk that referenced this pull request Mar 4, 2024 (paritytech#3233; same commit message as above).
Labels
I5-enhancement (an additional feature request), T0-node (related to the topic “node”), T8-polkadot (related to/affects the Polkadot network)
Development

Successfully merging this pull request may close these issues:

prospective parachains API chain: Provide N candidates
4 participants