Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement PartialEq, Eq and Hash #364

Closed
ashfordneil opened this issue May 12, 2017 · 3 comments
Closed

implement PartialEq, Eq and Hash #364

ashfordneil opened this issue May 12, 2017 · 3 comments

Comments

@ashfordneil
Copy link

Without these traits being implemented, it is difficult to store Regex structs, or types containing them, in collections such as HashSet and HashMap. It also prevents any type containing a Regex from automatically deriving these traits. Would it be okay to implement these traits for the Regex struct, perhaps just using the as_str() function and performing comparisons / hashes on the output of that?

@ashfordneil ashfordneil changed the title support PartialEq, Eq and Hash implement PartialEq, Eq and Hash May 12, 2017
@BurntSushi
Copy link
Member

regex 0.1 had this impl, but I removed it. It's way too misleading because finite state machines actually have a concept of equality that is very different from "do these regex strings match."

You should solve this by putting your Regex in a newtype and deriving the traits you want.

@BurntSushi
Copy link
Member

See #178 and #313

not-my-profile added a commit to not-my-profile/ruff that referenced this issue Jan 15, 2023
The caching mechanism of the CLI (ruff_cli::cache) relies on
ruff::settings::Settings implementing the Hash trait.

The ruff::settings::Settings struct previously couldn't automatically
derive the Hash implementation via the #[derive(Hash)] macro attribute
since some of its field types intentionally[1] don't implement Hash
(namely regex::Regex, globset::GlobMatcher and globset::GlobSet).

The code therefore previously implemented the Hash trait by hand for the
whole struct. Implementing Hash by hand for structs that are subject to
change is a bad idea since it's very easy to forget to update the Hash
implementation when adding a new field to the struct. And the Hash
implementation indeed was already incorrect by omitting several fields
from the hash.

This commit introduces wrapper types for Regex, GlobMatcher & GlobSet
that implement Hash so that we can still add #[derive(Hash)] to the
Settings struct, guaranteeing a correct hash implementation.

This commit also changes all occurrences of FxHashSet and FxHashMap
within Settings to BTreeSet and BTreeMap since the B-Tree-based
collections implement Hash. FxHashSet and FxHashMap are type aliases
from the rustc_hash crate for HashSet and HashMap from the standard
library.[2] While BTreeMap::get is more costly than HashMap::get
(O(log(n)) vs O(1))[3] none of these hash maps/sets are expected to
contain so many entries that this would actually matter and directly
using BTree(Set|Map) instead of yet another wrapper type spares us a
bunch of .into() calls around our codebase.

[1]: rust-lang/regex#364 (comment)
[2]: https://docs.rs/rustc-hash/
[3]: https://doc.rust-lang.org/std/collections/#performance
not-my-profile added a commit to not-my-profile/ruff that referenced this issue Jan 16, 2023
The caching mechanism of the CLI (ruff_cli::cache) relies on
ruff::settings::Settings implementing the Hash trait.

The ruff::settings::Settings struct previously couldn't automatically
derive the Hash implementation via the #[derive(Hash)] macro attribute
since some of its field types intentionally[1][2] don't implement Hash
(namely regex::Regex, globset::GlobMatcher and globset::GlobSet and
HashMap and HashSet from the standard library).

The code therefore previously implemented the Hash trait by hand for the
whole struct. Implementing Hash by hand for structs that are subject to
change is a bad idea since it's very easy to forget to update the Hash
implementation when adding a new field to the struct. And the Hash
implementation indeed was already incorrect by omitting several fields
from the hash.

This commit introduces wrapper types for Regex, GlobMatcher, GlobSet,
HashSet & HashMap that implement Hash so that we can still add
#[derive(Hash)] to the Settings struct, guaranteeing a correct hash
implementation.

[1]: rust-lang/regex#364 (comment)
[2]: The standard library doesn't impl<T: Hash + Ord> Hash for HashSet<T>
     presumably since sorted() requires an allocation and Hash
     implementations are generally expected to work without allocations.
not-my-profile added a commit to not-my-profile/ruff that referenced this issue Jan 16, 2023
The caching mechanism of the CLI (ruff_cli::cache) relies on
ruff::settings::Settings implementing the Hash trait.

The ruff::settings::Settings struct previously couldn't automatically
derive the Hash implementation via the #[derive(Hash)] macro attribute
since some of its field types intentionally[1][2] don't implement Hash
(namely regex::Regex, globset::GlobMatcher and globset::GlobSet and
HashMap and HashSet from the standard library).

The code therefore previously implemented the Hash trait by hand for the
whole struct. Implementing Hash by hand for structs that are subject to
change is a bad idea since it's very easy to forget to update the Hash
implementation when adding a new field to the struct. And the Hash
implementation indeed was already incorrect by omitting several fields
from the hash.

This commit introduces wrapper types for Regex, GlobMatcher, GlobSet,
HashSet & HashMap that implement Hash so that we can still add
#[derive(Hash)] to the Settings struct, guaranteeing a correct hash
implementation.

[1]: rust-lang/regex#364 (comment)
[2]: The standard library doesn't impl<T: Hash + Ord> Hash for HashSet<T>
     presumably since sorted() requires an allocation and Hash
     implementations are generally expected to work without allocations.
charliermarsh pushed a commit to astral-sh/ruff that referenced this issue Jan 16, 2023
The caching mechanism of the CLI (ruff_cli::cache) relies on
ruff::settings::Settings implementing the Hash trait.

The ruff::settings::Settings struct previously couldn't automatically
derive the Hash implementation via the #[derive(Hash)] macro attribute
since some of its field types intentionally[1][2] don't implement Hash
(namely regex::Regex, globset::GlobMatcher and globset::GlobSet and
HashMap and HashSet from the standard library).

The code therefore previously implemented the Hash trait by hand for the
whole struct. Implementing Hash by hand for structs that are subject to
change is a bad idea since it's very easy to forget to update the Hash
implementation when adding a new field to the struct. And the Hash
implementation indeed was already incorrect by omitting several fields
from the hash.

This commit introduces wrapper types for Regex, GlobMatcher, GlobSet,
HashSet & HashMap that implement Hash so that we can still add
#[derive(Hash)] to the Settings struct, guaranteeing a correct hash
implementation.

[1]: rust-lang/regex#364 (comment)
[2]: The standard library doesn't impl<T: Hash + Ord> Hash for HashSet<T>
     presumably since sorted() requires an allocation and Hash
     implementations are generally expected to work without allocations.
@therustmonk
Copy link

Information for those who got here. A naive comparison can be implemented this way:

struct Wrapper {
    regex: Regex,
}

impl PartialEq for Wrapper {
    fn eq(&self, other: &Self) -> bool {
        self.regex.as_str() == other.regex.as_str()
    }
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants