prospective-parachains: respond with multiple backable candidates #3160

Merged · 6 commits · Feb 6, 2024
163 changes: 141 additions & 22 deletions polkadot/node/core/prospective-parachains/src/fragment_tree.rs
@@ -756,53 +756,127 @@ impl FragmentTree {
depths.iter_ones().collect()
}

/// Select `count` candidates after the given `required_path` which pass
/// the predicate and have not already been backed on chain.
///
/// Does an exhaustive search into the tree starting after `required_path`.
/// If there are multiple possibilities of size `count`, this will select the first one.
/// If there is no chain of size `count` that matches the criteria, this will return the largest
/// chain it could find with the criteria.
/// If there are no candidates meeting those criteria, returns an empty `Vec`.
/// Cycles are accepted; see the `Cycles` section of the module docs.
///
/// The intention of the `required_path` is to allow queries on the basis of
/// one or more candidates which were previously pending availability becoming
/// available and opening up more room on the core.
pub(crate) fn select_children(
&self,
required_path: &[CandidateHash],
count: u32,
pred: impl Fn(&CandidateHash) -> bool,
) -> Vec<CandidateHash> {
let base_node = {
// traverse the required path.
let mut node = NodePointer::Root;
for required_step in required_path {
if let Some(next_node) = self.node_candidate_child(node, &required_step) {
node = next_node;
} else {
return vec![]
};
}

node
};

// TODO: taking the first best selection might introduce bias
[Review comment — Contributor]
Is this to be solved in this PR?

[Reply — Contributor Author]
No, this was here beforehand.
It could be solved by randomising the order in which we visit a node's children, but that would make tests harder to write.
Since validators don't favour particular collators when requesting collations, the order of potential forks in the fragment tree should already be somewhat "random" based on network latency (unless collators find a way to censor/DoS other collators).
// or become gameable.
//
// For plausibly unique parachains, this shouldn't matter much.
// figure out alternative selection criteria?
self.select_children_inner(base_node, count, count, &pred, &mut vec![])
}

// Try finding a candidate chain starting from `base_node` of length `expected_count`.
// If not possible, return the longest one we could find.
// Does a depth-first search, since we're optimistic that there won't be more than one such
// chain (parachains shouldn't usually have forks). So in the usual case, this will conclude
// in `O(expected_count)`.
// Cycles are accepted, but this doesn't allow for infinite execution time, because the maximum
// depth we'll reach is `expected_count`.
//
// Worst case performance is `O(num_forks ^ expected_count)`.
// Although an exponential function, this is actually a constant that can only be altered via
// sudo/governance, because:
// 1. `num_forks` at a given level is at most `max_candidate_depth * max_validators_per_core`
// (because each validator in the assigned group can second `max_candidate_depth`
// candidates). The prospective-parachains subsystem assumes that the number of para forks is
// limited by collator-protocol and backing subsystems. In practice, this is a constant which
// can only be altered by sudo or governance.
// 2. `expected_count` is equal to the number of cores a para is scheduled on (in an elastic
// scaling scenario). For non-elastic-scaling, this is just 1. In practice, this should be a
// small number (1-3), capped by the total number of available cores (a constant alterable
// only via governance/sudo).
[Review comment — Member]
So with realistic numbers (current and expected/exaggerated future numbers): how bad does it get? What would the limits on those parameters be?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Elastic scaling will eventually enable chains to be bounded by the rate of a single node's block production.

In Cumulus currently, block authorship is typically slower than block verification (in validators) due to the need to read from disk. Most of the time is spent in accessing the trie for storage, not in code execution. Access patterns for the trie are also very inefficient. So it's not feasible to use more than 3 cores until that bottleneck is cleared.

If this bottleneck is alleviated, then I could see parachains using many more cores.

Copy link
Contributor Author

@alindima alindima Feb 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thinking about this a bit more, I don't think this traversal is any different in terms of worst-case complexity from the routine used for populating the fragment tree (which populates the tree breadth-first). Is it?
In an acyclic tree, it'll visit each node once, or until it reaches the requested count (in a worst case scenario). In a cyclic one, it'd be bounded by the requested count so it's the same complexity.

The complexity is linear in the number of nodes, but the number of nodes is an exponential function (because it's an n-ary tree, at least in theory, of arity num_forks and height of max_candidate_depth).

If we'll hit a bottleneck, this wouldn't be the first place where that'd surface.

I think we could limit the number of total forks allowed for a parachain (regardless of their level in the tree). So that the total number of branches in the tree would bring down the worst-case complexity massively. WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

anyway, I think this is somewhat orthogonal to this PR.
to quote the docs from fragment_tree module:

//! The code in this module is not designed for speed or efficiency, but conceptual simplicity.
//! Our assumption is that the amount of candidates and parachains we consider will be reasonably
//! bounded and in practice will not exceed a few thousand at any time. This naive implementation
//! will still perform fairly well under these conditions, despite being somewhat wasteful of
//! memory.

we should revisit this assumption maybe, but it doesn't seem like it'll be a problem for the forseeable future

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prospective-parachains seems to be a good subsystem benchmark target. That should give some clear numbers on CPU usage in worst case scenarios.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

opened a separate issue to track this security item and added it to the elastic scaling task: #3219

fn select_children_inner(
&self,
base_node: NodePointer,
expected_count: u32,
remaining_count: u32,
pred: &dyn Fn(&CandidateHash) -> bool,
accumulator: &mut Vec<CandidateHash>,
) -> Vec<CandidateHash> {
if remaining_count == 0 {
// The best option is the chain we've accumulated so far.
return accumulator.to_vec();
}

let children: Vec<_> = match base_node {
NodePointer::Root => self
.nodes
.iter()
.enumerate()
.take_while(|(_, n)| n.parent == NodePointer::Root)
.filter(|(_, n)| self.scope.get_pending_availability(&n.candidate_hash).is_none())
.filter(|(_, n)| pred(&n.candidate_hash))
.map(|(ptr, n)| (NodePointer::Storage(ptr), n.candidate_hash))
.collect(),
NodePointer::Storage(base_node_ptr) => {
let base_node = &self.nodes[base_node_ptr];

base_node
.children
.iter()
.filter(|(_, hash)| self.scope.get_pending_availability(&hash).is_none())
.filter(|(_, hash)| pred(&hash))
.map(|(ptr, hash)| (*ptr, *hash))
.collect()
},
};

let mut best_result = accumulator.clone();
for (child_ptr, child_hash) in children {
accumulator.push(child_hash);

let result = self.select_children_inner(
child_ptr,
expected_count,
remaining_count - 1,
&pred,
accumulator,
);

accumulator.pop();

// Short-circuit the search if we've found the right length. Otherwise, we'll
// search for a max.
if result.len() == expected_count as usize {
return result
} else if best_result.len() < result.len() {
best_result = result;
}
}

best_result
}
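The recursion above can be modeled in isolation. Below is a simplified, self-contained sketch of the same strategy — depth-first search that short-circuits on the first chain of the requested length and otherwise returns the longest qualifying chain found. The `Tree`/`select_chain` names and the adjacency-list layout are illustrative stand-ins, not the actual `FragmentTree` API:

```rust
// Simplified model of `select_children_inner`: DFS over an n-ary tree,
// returning the first chain of length `count` whose nodes all pass `pred`,
// falling back to the longest qualifying chain found.
struct Tree {
    // Children of the virtual root.
    roots: Vec<usize>,
    // children[i] lists the node indices reachable from node i.
    children: Vec<Vec<usize>>,
}

impl Tree {
    fn select_chain(&self, count: usize, pred: &dyn Fn(usize) -> bool) -> Vec<usize> {
        let mut acc = Vec::new();
        self.select_inner(&self.roots, count, count, pred, &mut acc)
    }

    fn select_inner(
        &self,
        candidates: &[usize],
        expected: usize,
        remaining: usize,
        pred: &dyn Fn(usize) -> bool,
        acc: &mut Vec<usize>,
    ) -> Vec<usize> {
        if remaining == 0 {
            // Found a chain of the requested length.
            return acc.clone();
        }
        let mut best = acc.clone();
        for &child in candidates.iter().filter(|&&c| pred(c)) {
            acc.push(child);
            let result =
                self.select_inner(&self.children[child], expected, remaining - 1, pred, acc);
            acc.pop();
            // Short-circuit on a full-length chain; otherwise keep the longest.
            if result.len() == expected {
                return result;
            } else if result.len() > best.len() {
                best = result;
            }
        }
        best
    }
}

fn main() {
    // Root -> 0 -> 1 -> 2, with a fork 0 -> 3.
    let tree = Tree { roots: vec![0], children: vec![vec![1, 3], vec![2], vec![], vec![]] };
    // A full chain of 3 exists; asking for 5 yields the longest chain found.
    println!("{:?} {:?}", tree.select_chain(3, &|_| true), tree.select_chain(5, &|_| true));
}
```

As in the real code, the `acc.push`/`acc.pop` pair keeps a single mutable path buffer instead of cloning on every descent, and cycles cannot cause unbounded work because depth is capped by `remaining`.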

fn populate_from_bases(&mut self, storage: &CandidateStorage, initial_bases: Vec<NodePointer>) {
@@ -987,6 +1061,7 @@ mod tests {
use polkadot_node_subsystem_util::inclusion_emulator::InboundHrmpLimitations;
use polkadot_primitives::{BlockNumber, CandidateCommitments, CandidateDescriptor, HeadData};
use polkadot_primitives_test_helpers as test_helpers;
use std::iter;

fn make_constraints(
min_relay_parent_number: BlockNumber,
@@ -1524,6 +1599,21 @@
assert_eq!(tree.nodes[2].candidate_hash, candidate_a_hash);
assert_eq!(tree.nodes[3].candidate_hash, candidate_a_hash);
assert_eq!(tree.nodes[4].candidate_hash, candidate_a_hash);

for count in 1..10 {
assert_eq!(
tree.select_children(&[], count, |_| true),
iter::repeat(candidate_a_hash)
.take(std::cmp::min(count as usize, max_depth + 1))
.collect::<Vec<_>>()
);
assert_eq!(
tree.select_children(&[candidate_a_hash], count - 1, |_| true),
iter::repeat(candidate_a_hash)
.take(std::cmp::min(count as usize - 1, max_depth))
.collect::<Vec<_>>()
);
}
}

#[test]
@@ -1591,6 +1681,35 @@
assert_eq!(tree.nodes[2].candidate_hash, candidate_a_hash);
assert_eq!(tree.nodes[3].candidate_hash, candidate_b_hash);
assert_eq!(tree.nodes[4].candidate_hash, candidate_a_hash);

assert_eq!(tree.select_children(&[], 1, |_| true), vec![candidate_a_hash],);
assert_eq!(
tree.select_children(&[], 2, |_| true),
vec![candidate_a_hash, candidate_b_hash],
);
assert_eq!(
tree.select_children(&[], 3, |_| true),
vec![candidate_a_hash, candidate_b_hash, candidate_a_hash],
);
assert_eq!(
tree.select_children(&[candidate_a_hash], 2, |_| true),
vec![candidate_b_hash, candidate_a_hash],
);

assert_eq!(
tree.select_children(&[], 6, |_| true),
vec![
candidate_a_hash,
candidate_b_hash,
candidate_a_hash,
candidate_b_hash,
candidate_a_hash
],
);
assert_eq!(
tree.select_children(&[candidate_a_hash, candidate_b_hash], 6, |_| true),
vec![candidate_a_hash, candidate_b_hash, candidate_a_hash,],
);
}

#[test]
74 changes: 43 additions & 31 deletions polkadot/node/core/prospective-parachains/src/lib.rs
@@ -146,12 +146,20 @@ async fn run_iteration<Context>(
handle_candidate_seconded(view, para, candidate_hash),
ProspectiveParachainsMessage::CandidateBacked(para, candidate_hash) =>
handle_candidate_backed(&mut *ctx, view, para, candidate_hash).await?,
ProspectiveParachainsMessage::GetBackableCandidates(
relay_parent,
para,
count,
required_path,
tx,
) => answer_get_backable_candidates(
&view,
relay_parent,
para,
count,
required_path,
tx,
),
ProspectiveParachainsMessage::GetHypotheticalFrontier(request, tx) =>
answer_hypothetical_frontier_request(&view, request, tx),
ProspectiveParachainsMessage::GetTreeMembership(para, candidate, tx) =>
@@ -552,12 +560,13 @@ async fn handle_candidate_backed<Context>(
Ok(())
}

fn answer_get_backable_candidates(
view: &View,
relay_parent: Hash,
para: ParaId,
count: u32,
required_path: Vec<CandidateHash>,
tx: oneshot::Sender<Vec<(CandidateHash, Hash)>>,
) {
let data = match view.active_leaves.get(&relay_parent) {
None => {
@@ -568,7 +577,7 @@ fn answer_get_backable_candidate(
"Requested backable candidate for inactive relay-parent."
);

let _ = tx.send(vec![]);
return
},
Some(d) => d,
@@ -583,7 +592,7 @@ fn answer_get_backable_candidate(
"Requested backable candidate for inactive para."
);

let _ = tx.send(vec![]);
return
},
Some(tree) => tree,
@@ -598,46 +607,49 @@ fn answer_get_backable_candidate(
"No candidate storage for active para",
);

let _ = tx.send(vec![]);
return
},
Some(s) => s,
};

let backable_candidates: Vec<_> = tree
.select_children(&required_path, count, |candidate| storage.is_backed(candidate))
.into_iter()
.filter_map(|child_hash| {
storage.relay_parent_by_candidate_hash(&child_hash).map_or_else(
|| {
gum::error!(
target: LOG_TARGET,
?child_hash,
para_id = ?para,
"Candidate is present in fragment tree but not in candidate's storage!",
);
None
},
|parent_hash| Some((child_hash, parent_hash)),
)
})
.collect();

if backable_candidates.is_empty() {
gum::trace!(
target: LOG_TARGET,
?required_path,
para_id = ?para,
%relay_parent,
"Could not find any backable candidate",
);
} else {
gum::trace!(
target: LOG_TARGET,
?relay_parent,
?backable_candidates,
"Found backable candidates",
);
}

let _ = tx.send(backable_candidates);
}
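The `filter_map` in the body above resolves each selected hash to its relay-parent and drops (while logging) any candidate that the fragment tree knows but storage does not. A toy sketch of that consistency check, with `u32` standing in for the real `CandidateHash`/`Hash` types and a `HashMap` standing in for `CandidateStorage`:

```rust
use std::collections::HashMap;

// Illustrative stand-ins for the real hash types.
type CandidateHash = u32;
type Hash = u32;

// Resolve selected candidates against storage; entries missing from storage
// are dropped (the real code emits a gum::error! for them, since the fragment
// tree and candidate storage should agree).
fn resolve(
    selected: Vec<CandidateHash>,
    storage: &HashMap<CandidateHash, Hash>,
) -> Vec<(CandidateHash, Hash)> {
    selected
        .into_iter()
        .filter_map(|candidate| match storage.get(&candidate) {
            Some(&relay_parent) => Some((candidate, relay_parent)),
            None => {
                // Inconsistency between tree and storage: drop and log.
                eprintln!("candidate {} missing from storage", candidate);
                None
            },
        })
        .collect()
}

fn main() {
    let storage = HashMap::from([(1, 100), (2, 100)]);
    // Candidate 3 is unknown to storage and gets filtered out.
    println!("{:?}", resolve(vec![1, 3, 2], &storage));
}
```

Note that the response channel now always receives a single `Vec`, with emptiness replacing the old `None`, so callers branch on `is_empty()` rather than matching an `Option`.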

fn answer_hypothetical_frontier_request(