Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add BinaryHeap::contains and BinaryHeap::remove #82002

Closed
wants to merge 5 commits into from
Closed

Add BinaryHeap::contains and BinaryHeap::remove #82002

wants to merge 5 commits into from

Conversation

billyrieger
Copy link
Contributor

Tracking issue: #82001.

@rust-highfive
Copy link
Collaborator

r? @m-ou-se

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Feb 11, 2021
I think incorrect (or malicious) `PartialEq` and `PartialOrd` implementations for `T` could define `x == y` and `x < y` to both be conditionally true with interior mutability in such a way that `pos == len - 1` (implying `data[pos] == data[len - 1]`) but  we end up in the `data[pos] < data[len - 1]` branch.  `sift_up()` and `sift_down()` assume that `pos` is in-bounds which would be violated in this case.
@m-ou-se
Copy link
Member

m-ou-se commented Mar 3, 2021

Thanks for your PR.

I'm afraid that adding these functions will be misleading, as they do nothing special with the structure of a binary heap, making them O(n) just like on a Vec.

For contains, using .iter() to look for an item seems fine to me, and makes the time complexity a lot clearer. Maybe #82331 will help here by allowing .as_slice().contains(x). (Feel free to add a comment there about your use case.)

For remove, there's already .retain(|e| e != x), although that one doesn't stop the search after finding the element. Maybe a function to remove an item at a specific index would make sense in addition to #82331.

What do you think?

@Thomasdezeeuw
Copy link
Contributor

@m-ou-se I'm mostly interesting interest in the remove function from this pr. If @billyrieger could change it to take advantage of the internal layout, i.e that fact it is a max-heap, would such a change be acceptable?

@scottmcm
Copy link
Member

scottmcm commented Mar 4, 2021

Maybe remove-by-index could make sense? The heap can implement that efficiently, and the index could be found by .iter().position(p) since that's going to be linear complexity anyway...

@billyrieger
Copy link
Contributor Author

@m-ou-se I agree that contains is redundant/unhelpful but like you said there's currently no good analogue for remove. Adding a remove_at method or something similar along with as_slice would be a solid alternative though.

Initially I had an implementation that used the max-heap property to search for elements as @Thomasdezeeuw suggested but I'm not sure that it's any better than just iterating over the underlying data. The average- and worst-case time complexity is still the same, at least.

impl<T: Ord> BinaryHeap<T> {
    fn find(&self, item: &T) -> Option<usize> {
        self.find_recursive(item, 0)
    }

    fn find_recursive(&self, item: &T, pos: usize) -> Option<usize> {
        if pos >= self.data.len() {
            None
        } else {
            match self.data[pos].cmp(item) {
                // If `self.data[pos] == item`, return the item's position.
                Ordering::Equal => Some(pos),
                // If `self.data[pos] < item`, the item cannot be in either of
                // the child branches (because this is a max heap).
                Ordering::Less => None,
                // If `self.data[pos] > item`, we need to search both child branches.
                Ordering::Greater => {
                    let left_child = 2 * pos + 1;
                    let right_child = 2 * pos + 2;
                    self.find_recursive(item, left_child)
                        .or_else(|| self.find_recursive(item, right_child))
                }
            }
        }
    }
}

@billyrieger
Copy link
Contributor Author

the index could be found by .iter().position(p)

This relies on the fact that heap.iter() has the same ordering as the underlying Vec<T> which isn't necessarily the case (I believe the docs say the iteration order is unspecified). Adding as_slice would make this work (#82331).

@bors
Copy link
Contributor

bors commented Mar 4, 2021

☔ The latest upstream changes (presumably #82756) made this pull request unmergeable. Please resolve the merge conflicts.

@crlf0710 crlf0710 added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 20, 2021
@crlf0710
Copy link
Member

crlf0710 commented Mar 20, 2021

Triage: there's merge conflicts now, and it would be best if you could start a discussion on Zulip to make progress on this.

@JohnCSimon
Copy link
Member

@billyrieger ping from triage: can you please fix the merge conflicts? Thank you.

@billyrieger
Copy link
Contributor Author

I'm going to close this - #82331 covers some of the same functionality and at this point it's unclear what the implementation details for BinaryHeap::remove should be.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants