Use marker type to enforce validation of Address's network #1489

Merged
merged 1 commit into from
Jan 11, 2023

jirijakes
Contributor

Adds marker type NetworkValidation to Address to help compiler enforce network validation. Inspired by Martin's suggestion. Closes #1460.

Open questions:

  1. Compilation fails with serde, which uses Address::from_str via the macro serde_string_impl!(Address, "a Bitcoin address");. I don't think there is much we can do, so unless somebody has a better idea of how to combine serde and network validation, I would just expand the macro by hand for Address and add unsafe_mark_network_valid to it.
  2. Would someone prefer wrapping the validation types by mod validation? As they are now, they live in address namespace so I don't think mod is necessary.
  3. Almost all methods that used to be on Address are now on Address<NetworkValid>, except one (address_type) that needs to be callable on both and a few that are only on Address<NetworkUnchecked> (mainly is_valid_for_network). Some methods (e.g. to_qr_uri, is_standard and perhaps others) could, theoretically, be called on both valid and unchecked addresses. I think we should encourage validating the network ASAP, so I would leave them on NetworkValid only, but I can move them if others have a different opinion.
  4. Should NetworkValid and NetworkUnchecked enums have some trait impls derived? The PartialEq was necessary for tests (I think assert_eq required it) but I am not sure whether some other would be good to have. The enums are only used as types so I guess it's not necessary, but also I do not fully understand why the PartialEq was needed.
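
For readers unfamiliar with the pattern, the marker-type idea described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this PR: the `Network` enum, the prefix-based `parse` function and the `encoded` accessor are simplified stand-ins invented for the sketch, and only the names NetworkValid, NetworkUnchecked, NetworkValidation, is_valid_for_network and expect_network are taken from the discussion.

```rust
use std::marker::PhantomData;

/// Simplified stand-in for the library's network enum.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Network {
    Bitcoin,
    Testnet,
}

/// Marker: the network has been checked and matched.
pub enum NetworkValid {}
/// Marker: the network has not been checked yet.
pub enum NetworkUnchecked {}

/// Ties the two markers together so `Address<V>` accepts nothing else.
pub trait NetworkValidation {}
impl NetworkValidation for NetworkValid {}
impl NetworkValidation for NetworkUnchecked {}

/// Toy address: the encoded string plus the network it belongs to.
pub struct Address<V: NetworkValidation> {
    encoded: String,
    network: Network,
    _validation: PhantomData<V>,
}

impl Address<NetworkUnchecked> {
    /// Toy parser (assumption): the prefix alone decides the network.
    pub fn parse(s: &str) -> Option<Self> {
        let network = if s.starts_with("bc1") {
            Network::Bitcoin
        } else if s.starts_with("tb1") {
            Network::Testnet
        } else {
            return None;
        };
        Some(Address { encoded: s.to_owned(), network, _validation: PhantomData })
    }

    pub fn is_valid_for_network(&self, network: Network) -> bool {
        self.network == network
    }

    /// Consumes the unchecked address and returns a validated one,
    /// or the network that was actually found.
    pub fn expect_network(self, required: Network) -> Result<Address<NetworkValid>, Network> {
        if self.is_valid_for_network(required) {
            Ok(Address { encoded: self.encoded, network: self.network, _validation: PhantomData })
        } else {
            Err(self.network)
        }
    }
}

impl Address<NetworkValid> {
    /// Only reachable after validation.
    pub fn encoded(&self) -> &str {
        &self.encoded
    }
}
```

With this shape, calling `encoded()` on a freshly parsed address is a compile error; the caller has to go through `expect_network` first.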

Collaborator

@Kixunil Kixunil left a comment

Thanks! I think the answer to your serde question is simply don't impl Deserialize for UncheckedNetwork. See my other comments as well.

@@ -565,22 +621,55 @@ impl<'a> fmt::Display for AddressEncoding<'a> {
/// * [BIP341 - Taproot: SegWit version 1 spending rules](https://github.com/bitcoin/bips/blob/master/bip-0341.mediawiki)
/// * [BIP350 - Bech32m format for v1+ witness addresses](https://github.com/bitcoin/bips/blob/master/bip-0350.mediawiki)
#[derive(Clone, PartialEq, Eq, PartialOrd, Ord, Hash)]
Collaborator

One annoying property of these derives is they require V to implement the traits. We should just implement them manually.

Unrelated but I think deriving PartialOrd was wrong. I would expect the addresses to be lexicographically sorted but this sorts according to the internal representation.
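
A rough sketch of what "implement them manually" could look like is given below. It is an illustration under simplifying assumptions, not the implementation that was eventually written: the `payload` field and the `sample` helper are hypothetical, and only Clone, PartialEq and Eq are shown. The point is that the impls place no bounds on `V`, so the marker type needs no derives at all.

```rust
use std::marker::PhantomData;

/// Marker type without any derives.
pub enum NetworkUnchecked {}

/// Toy address; `V` is never stored, only tracked via `PhantomData`.
pub struct Address<V> {
    payload: Vec<u8>,
    _validation: PhantomData<V>,
}

// Manual impls: no `V: Clone` or `V: PartialEq` bound is required.
impl<V> Clone for Address<V> {
    fn clone(&self) -> Self {
        Address { payload: self.payload.clone(), _validation: PhantomData }
    }
}

impl<V> PartialEq for Address<V> {
    fn eq(&self, other: &Self) -> bool {
        self.payload == other.payload
    }
}

impl<V> Eq for Address<V> {}

/// Hypothetical helper that builds a sample value for the sketch.
pub fn sample() -> Address<NetworkUnchecked> {
    Address { payload: vec![0x16, 0x2c], _validation: PhantomData }
}
```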

Collaborator

Why was this marked as resolved if the derives are still here?

Contributor Author

Sorry, I did not understand that you would like them implemented manually as part of this PR. Would you be OK if it's done as a follow-up? It is not completely related to this change and I can imagine it might spark a new set of discussions. I could look at it right after this PR.

Collaborator

I believe it's strongly related to this PR because that's literally doing this PR right. But since the downsides are mostly about documentation being bloated/less readable and a bunch of more code the compiler must process I think it's not too bad to put it in a separate PR if it helps you.

FTR AFAIK nobody is working in this file so you shouldn't get git conflicts anytime soon if you decide to do it in this one.

Contributor Author

Got it. I prefer opening a new PR for it. It might take me some time to figure out some of the implementations.

Collaborator

The derivative crate can actually help but we have a strict policy on dependencies here. @apoelstra what do you think? It has MSRV 1.36, doesn't look maintained but presumably does what's needed so maybe that's not an issue (esp. for a proc macro). I don't know the author but I plan to review the crate anyway for my project and I would publish a crev proof.

Contributor Author

One thing I realized. If we remove the derives from the marker types (NetworkChecked, NetworkUnchecked), I think the following would not be possible:

#[derive(Clone)]
pub struct MyStruct<V: NetworkValidation> {
    pub address: Address<V>
}

#[test]
fn test_my_struct() {
    let s: MyStruct<NetworkChecked> = todo!();
    let s2 = s.clone();
}

This compiles when NetworkChecked has Clone, otherwise it does not.

So although rust-bitcoin library may not need the derives (because we could manually implement all the traits directly on Address), people may want to use the marker types in their own types that use derivation.


At the moment, a similar problem exists with Debug because Address<V> does not implement Debug and I guess the compiler does not know that all NetworkValidation implementations have it (they don't have it in this PR, but I added it in my experiments).

In this situation, the compiler suggests to add where Address<V>: Debug to MyStruct or #[derive(Debug)] to Address<V>. I haven't figured out yet how to deal with this situation.

Collaborator

Hmm, yes, I was thinking that we would deal with the annoyance so downstream users don't have to but maybe it's not that simple. But making the struct generic like you showed already has other problems, so I'm not sure if we should be very worried about it. Most people probably should have the field fixed.

@Kixunil
Collaborator

Kixunil commented Dec 19, 2022

Would someone prefer wrapping the validation types by mod validation?

I kinda like that such types in a separate module look like enums: address::validation::NetworkValid. But I don't mind not having that if others think it's too annoying.

to_qr_uri

And also Display. Good question, if someone only wants to validate the address and display it somewhere else without creating Script out of it should they be forced to validate? I guess yes, because if they didn't want to validate they would just use String already?

Should NetworkValid and NetworkUnchecked enums have some trait impls derived?

No, they are not stored, just faked via PhantomData, so they don't need any traits.

I do not fully understand why the PartialEq was needed.

Because, frankly, the derive macros are badly designed and put the wrong bounds on generic arguments. I remember complaining about this at serde and when they tried to change it, it broke something else. Also there's an argument to be made that auto-adding the bounds leaks internal representation. I agree with this but the solution shouldn't be "don't let users specify the bounds".

@apoelstra
Member

apoelstra commented Dec 19, 2022

Concept ACK. This expect_network thing is actually pretty nice.

To answer your questions

  1. I agree that addresses coming from serde should always be unchecked. It would be a cool idea if we could provide a serde(with) adaptor that would let the user enforce a specific network, but we can do that in a followup PR. But I definitely don't think we should put unchecked assumptions here, I think that's a footgun.
  2. I don't care about modules. I think what you've done here is fine.
  3. We'll debate checked vs unchecked methods case-by-case.
  4. The reason the derive on NetworkUnchecked is necessary is (IMO) a design bug in Rust's trait derivation logic. Specifically, when you do #[derive(Trait)] on Address<U>, it will derive Trait for Address<U> only if Trait is implemented for U, even if U is only used in a phantom. IOW if you don't derive all the traits you want for Address on the marker types, then deriving them on Address will have no effect.

This is triply stupid because (a) every trait ought to be implemented on uninhabited types, and (b) even if it weren't uninhabited, it shouldn't matter what traits are impl'd for phantom data, and (c) even if it did matter, phantoms ought to implement every trait. But Rust doesn't work like that, so here we are. You have to add a bunch of extra derived traits on your empty enums.

@apoelstra
Member

I guess yes, because if they didn't want to validate they would just use String already?

I like this argument, but I disagree -- I think we should allow displaying etc on unchecked addresses. It's reasonable for some applications that simply pass addresses around to want to parse them/check for validity while still being agnostic about the specific network.

@Kixunil
Collaborator

Kixunil commented Dec 19, 2022

It would be a cool idea if we could provide a serde(with) adaptor that would let the user enforce a specific network

Same argument as before: tempts people to hard-code networks. Maybe if the adaptor only provided to check that the network is not mainnet (for quick prototyping) I'd be open to it.

But I definitely don't think we should put unchecked assumptions here

deserialize_address_unchecked with adaptor could make sense if you have a reason to believe the data comes from some valid source.

every trait ought to be implemented on uninhabited types

impl Iterator for Uninhabited {
    type Item = ??? what goes here ???;

   // ...
}

shouldn't matter what traits are impl'd for phantom data

It seems to me that the whole PhantomData is just a bad way of expressing variance. There should've been a special syntax for it that doesn't require actually having it in the struct.

even if it did matter, phantoms ought to implement every trait.

Same issue as above

There is a fourth one though. If you have this:

#[derive(Clone)]
struct Foo<I: Iterator> {
    item: I::Item,
}

Then the trait bound created by the derive is I: Iterator which is wrong in two ways - it breaks because I::Item: Clone is missing and it's also over-constraining I. (Iterator may be a silly example but I actually have some real-life code that uses associated types.)
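
To make the associated-type case concrete, here is a small sketch of a hand-written Clone with the narrower bound. The `Countdown` iterator is a hypothetical helper invented for the example; it deliberately does not implement Clone, to show that `Foo<Countdown>` can still be cloned when only `I::Item: Clone` is required.

```rust
/// Same shape as the `Foo` above, but with a hand-written `Clone`.
pub struct Foo<I: Iterator> {
    pub item: I::Item,
}

// Only the stored associated type needs `Clone`; `I` itself is left unconstrained.
impl<I: Iterator> Clone for Foo<I>
where
    I::Item: Clone,
{
    fn clone(&self) -> Self {
        Foo { item: self.item.clone() }
    }
}

/// Hypothetical iterator that deliberately does not implement `Clone`.
pub struct Countdown(pub u32);

impl Iterator for Countdown {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.0 == 0 {
            None
        } else {
            self.0 -= 1;
            Some(self.0)
        }
    }
}
```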

It's reasonable for some applications that simply pass addresses around to want to parse them/check for validity while still being agnostic about the specific network.

They could just call assume_valid/whatever it'll be called. We could even provide this as an example of the use case so that people don't feel bad about calling "dangerous" functions.

@Kixunil
Collaborator

Kixunil commented Dec 19, 2022

Oh, crap, now I realized we can't implement assume_valid_ref(&self) -> Address<NetworkValid> but it'd be really useful for the display use case. I will think about redesigning this. Sorry for not realizing sooner.

@apoelstra
Member

I think assume_valid(self) -> Address is probably sufficient for the common case of "parse/construct/whatever then immediately check it".

Same argument as before [about having a network-specific serde adaptor]: tempts people to hard-code networks.

Hmm, maybe we want to use DeserializeSeed or something like this? In general serde support seems like a tricky ergonomic question, since we ideally want people to be able to store an Address<NetworkValid> and round-trip them while preserving the validity invariant. Maybe there is just no way in Rust to make this work naturally.

@Kixunil
Collaborator

Kixunil commented Dec 19, 2022

I think assume_valid(self) -> Address is probably sufficient for the common case of "parse/construct/whatever then immediately check it".

There's a good chance it is. But I also don't like the idea of closing the door for now. I think it would be better to make the representation of Address private for now and provide access methods. If we do so we can change the representation to this at any time in the future:

/// ...
///
/// `repr(transparent)` is for internal purposes only, do not rely on it!
#[repr(transparent)] // so that we can use `unsafe` to cast the types
struct Address<V: Validation> {
    raw: RawAddress,
    _phantom: PhantomData<V>,
}

struct RawAddress {
    payload: Payload,
    network: Network,
}
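
As a hedged illustration of why the transparent representation matters, the sketch below shows how a by-reference conversion such as the assume_valid_ref mentioned earlier might be written once both marker variants share one layout. The field types are simplified, the `new` and `payload` helpers are hypothetical, and the trait bound on `V` is omitted for brevity; this is not code from the PR.

```rust
use std::marker::PhantomData;

pub enum NetworkValid {}
pub enum NetworkUnchecked {}

/// Simplified stand-in for the proposed inner representation.
pub struct RawAddress {
    payload: Vec<u8>,
}

/// `repr(transparent)` guarantees the same layout as `RawAddress`,
/// whatever the zero-sized marker `V` is.
#[repr(transparent)]
pub struct Address<V> {
    raw: RawAddress,
    _phantom: PhantomData<V>,
}

impl Address<NetworkUnchecked> {
    /// Hypothetical constructor for the sketch.
    pub fn new(payload: Vec<u8>) -> Self {
        Address { raw: RawAddress { payload }, _phantom: PhantomData }
    }

    /// Reinterprets a reference without validating the network.
    pub fn assume_valid_ref(&self) -> &Address<NetworkValid> {
        // SAFETY: both types are `repr(transparent)` wrappers around `RawAddress`
        // and differ only in a zero-sized marker, so their layouts are identical.
        unsafe { &*(self as *const Self as *const Address<NetworkValid>) }
    }
}

impl Address<NetworkValid> {
    pub fn payload(&self) -> &[u8] {
        &self.raw.payload
    }
}
```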

DeserializeSeed

Oh, yeah, forgot about that one. It seems to make sense to have AddressValidator(pub Network) which impls DeserializeSeed. I would ACK that.

@apoelstra
Member

I agree, though I'd be okay accepting this PR as-is and then doing a followup to hide the internals.

@jirijakes
Contributor Author

Thanks a lot to everybody.

I reflected most of the comments in these three commits:

Also, part of the last one is polishing of documentation (there were invalid links etc.).

What remains is the naming.

The reasoning behind my choice of NetworkUnchecked and NetworkValid is a more precise reflection of the validation status. Either “I have not checked it yet, so I don't know whether it's valid or not” or “I checked it and found it valid”. NetworkChecked would leave me with “OK, you checked it, so is it valid or not?” Yes, the 'checked' implies 'valid', or else I would not have the value at all, but it adds one indirection in my thinking.

However, I understand why the naming may bother people and I don't insist on it. Also, since you are much more familiar with the project, you all will have better feeling for the names.

The alternative I would prefer next is NetworkChecked and NetworkUnchecked. I would still leave the Network in the name because Address<Checked> and Address<NetworkChecked> have quite different meanings.

With that, we could have Address<NetworkUnchecked>::assume_checked(), which I quite like.

I cannot think of good names with 'valid' instead of 'check' that would make sense. I would quite like assume_valid() but it does not fit with the marker types (it's not NetworkValid vs. NetworkInvalid).

@apoelstra
Member

0e448ae looks good to me, except for the naming.

OK, you checked it, so is it valid or not?

There is no way for this library to know what "valid" means or whether it is true about a given address. The only notion of "validity" we have is whether or not it parses correctly, and therefore any instance of Address is valid regardless of whether or not the network has been checked.

I agree with NetworkChecked and NetworkUnchecked, for the reasons that you give.

@jirijakes
Contributor Author

Renamed to NetworkChecked and NetworkUnchecked and assume_checked() in https://github.com/rust-bitcoin/rust-bitcoin/compare/0e448ae3398cd4486855693f524c7e0e40f7f902..b9311a4cabb3d51a8c09563731dc71e5b3e2c2f3 .

And squashed.

@jirijakes jirijakes marked this pull request as ready for review December 21, 2022 09:41
@jirijakes jirijakes changed the title WIP: Use marker type to enforce validation of Address's network Use marker type to enforce validation of Address's network Dec 21, 2022
apoelstra
apoelstra previously approved these changes Dec 21, 2022
Member

@apoelstra apoelstra left a comment

ACK b9311a4

Member

@tcharding tcharding left a comment

Man, I really like this PR.

Comment on lines +688 to +767
/// Create new address from given components, inferring the network validation
/// marker type of the address.
Member

This inference is cool. I had to play with it to check that the doc statement was correct and that it was what we wanted: yes on both counts!

If you want it, I added

    #[test]
    fn address_constructor_infers_network_unchecked() {
        let addr = Address::new(Bitcoin, Payload::PubkeyHash(hex_into!("162c5ea71c0b23f5b9022ef047c4a86470a5b070")));
        // This method is only valid for `Address<NetworkUnchecked>`.
        addr.is_valid_for_network(Network::Bitcoin);
    }

    #[test]
    fn address_constructor_infers_network_checked() {
        let addr = Address::new(Bitcoin, Payload::PubkeyHash(hex_into!("162c5ea71c0b23f5b9022ef047c4a86470a5b070")));
        // This method is only valid for `Address<NetworkChecked>`.
        addr.is_standard();
    }
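
The inference these tests rely on can be reproduced without the bitcoin crate. The following is a toy sketch, assuming a constructor that is generic over the marker and methods that exist for only one marker each; the `mainnet` field and the body of `is_standard` are placeholders invented for the example.

```rust
use std::marker::PhantomData;

pub enum NetworkChecked {}
pub enum NetworkUnchecked {}

pub struct Address<V> {
    mainnet: bool,
    _validation: PhantomData<V>,
}

// The constructor is generic over the marker, so `V` is left to inference.
impl<V> Address<V> {
    pub fn new(mainnet: bool) -> Self {
        Address { mainnet, _validation: PhantomData }
    }
}

impl Address<NetworkUnchecked> {
    pub fn is_valid_for_network(&self, mainnet: bool) -> bool {
        self.mainnet == mainnet
    }
}

impl Address<NetworkChecked> {
    pub fn is_standard(&self) -> bool {
        true // placeholder; a real check would inspect the payload
    }
}

pub fn infers_unchecked() -> bool {
    let addr = Address::new(true);
    // Only `Address<NetworkUnchecked>` has this method, so `V` is inferred from the call.
    addr.is_valid_for_network(true)
}

pub fn infers_checked() -> bool {
    let addr = Address::new(true);
    // Only `Address<NetworkChecked>` has this method.
    addr.is_standard()
}
```

Note that a public constructor of this kind lets callers obtain a checked address without any check, which is why the choice of which methods live on which impl matters.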

Collaborator

@Kixunil Kixunil left a comment

Can't focus on a proper review rn, sorry (I'm sick).


/// Methods on [`Address`] that can be called on both `Address<NetworkChecked>` and
/// `Address<NetworkUnchecked>`.
Collaborator

Does this doc comment have any effect?

Contributor Author

This, and the other two on the other impl Address<…> blocks, were originally plain comments, but I turned them into doc comments as a hint to readers of the documentation that the implementations of methods and functions are spread across multiple impls.

Would you prefer another way? Not having them as doc comments?

Collaborator

I was just curious, if it has no effect it doesn't matter which kind of comment it is.

Contributor Author

It gets rendered in the documentation at the beginning of the impl.

@jirijakes
Contributor Author

Thanks for the additional comments. I fixed the wording of the doc comments on impl Address<…> and made Address::new() public.

Can't focus on a proper review rn, sorry (I'm sick).

No worries. Get well, Martin.

Collaborator

@Kixunil Kixunil left a comment

Thankfully my health significantly improved to the point of being able to properly review this. 🎉

This is almost ACK, I'm just unsure about address_type.

@@ -177,7 +177,8 @@ impl WatchOnly {

 /// Creates the PSBT, in BIP174 parlance this is the 'Creator'.
 fn create_psbt<C: Verification>(&self, secp: &Secp256k1<C>) -> Result<Psbt> {
-    let to_address = Address::from_str(RECEIVE_ADDRESS)?;
+    let to_address = Address::from_str(RECEIVE_ADDRESS)?
+        .expect_network(Network::Regtest)?;
Collaborator

Unrelated but we should have a const constructor for Address if possible.

expected: Network,
/// Network on which the address was found to be valid.
found: Network
}
Collaborator

FYI I would normally complain about this not being a separate type but I guess people will usually parse and validate the address at roughly the same place so merging them seems OK.

Contributor Author

Originally I started to write it as a separate type, but when I saw that the other parsing-related errors are in the big enum, I added it there; otherwise Address::from_str("…")?.require_network(…)? would not be possible.

Collaborator

Yeah, we would need a third type which seems too much.
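
The trade-off in this thread can be shown with a toy version: when parsing and network validation return the same error enum, the two `?` operators chain without any conversion. Everything below is a simplified sketch; the `UnknownPrefix` variant, the prefix-based `parse` and the `parse_for` helper are assumptions made for the example, while the `expected`/`found` fields and the require_network name follow the snippets quoted in this review.

```rust
use std::marker::PhantomData;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Network {
    Bitcoin,
    Regtest,
}

pub enum NetworkChecked {}
pub enum NetworkUnchecked {}

/// One error enum for both steps, so a single `?` chain can return it.
#[derive(Debug, PartialEq, Eq)]
pub enum Error {
    /// Toy parsing failure.
    UnknownPrefix,
    /// The address parsed fine but belongs to another network.
    NetworkValidation { expected: Network, found: Network },
}

pub struct Address<V> {
    network: Network,
    _validation: PhantomData<V>,
}

impl Address<NetworkUnchecked> {
    /// Toy parser (assumption): the prefix alone decides the network.
    pub fn parse(s: &str) -> Result<Self, Error> {
        let network = if s.starts_with("bcrt1") {
            Network::Regtest
        } else if s.starts_with("bc1") {
            Network::Bitcoin
        } else {
            return Err(Error::UnknownPrefix);
        };
        Ok(Address { network, _validation: PhantomData })
    }

    pub fn require_network(self, expected: Network) -> Result<Address<NetworkChecked>, Error> {
        if self.network == expected {
            Ok(Address { network: self.network, _validation: PhantomData })
        } else {
            Err(Error::NetworkValidation { expected, found: self.network })
        }
    }
}

impl Address<NetworkChecked> {
    pub fn network(&self) -> Network {
        self.network
    }
}

/// Both steps share `Error`, so no `From` conversion is needed between them.
pub fn parse_for(s: &str, network: Network) -> Result<Address<NetworkChecked>, Error> {
    let address = Address::parse(s)?.require_network(network)?;
    Ok(address)
}
```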



sanket1729
sanket1729 previously approved these changes Dec 23, 2022
Member

@sanket1729 sanket1729 left a comment

ACK d344f56. Though I also prefer the name require_network in place of expect_network

@jirijakes
Contributor Author

One more change: address_type was previously publicly available on both NetworkChecked and NetworkUnchecked. Now it is publicly available only on NetworkChecked; internally it can be called as address_type_unchecked on both.

Collaborator

@Kixunil Kixunil left a comment

Unfortunately I just noticed stale derives and have some additional questions.

/// .require_network(Network::Bitcoin).unwrap();
///
/// // variant 3
/// let address: Address<NetworkChecked> = "32iVBEu4dxkUQk9dJbZUiBiQdmypcEyJRf".parse::<Address<_>>()
Collaborator

Is : Address<NetworkChecked> even needed here? I think inference should work fine because require_network is only implemented on unchecked address. It'd be nice to show people the simplest version of the code.

Contributor Author

It's not needed for type inference. I added it as an indication for readers to see what's happening with checked/unchecked and to see that Address (variant 2) and Address<NetworkChecked> (variant 3) are equivalent.

Variant 1 could also be simplified all the way to:

let address: Address<_> = "32iVBEu4dxkUQk9dJbZUiBiQdmypcEyJRf".parse().unwrap();
let address = address.require_network(Network::Bitcoin).unwrap();

but I think it adds a bit less value to the readers.


/// Marker that address's network has not yet been validated. See section [*Parsing addresses*](Address#parsing-addresses)
/// on [`Address`] for details.
#[derive(Clone, PartialEq, Eq, PartialOrd, Ord, Hash)]
Collaborator

These enums shouldn't have any derives since they don't do anything but bloat compile times.

Contributor Author

AFAIK they are needed as long as they are on Address struct. If you are OK with implementing the traits on Address manually (cf. #1489 (comment) ) as follow-up to this PR, I would remove them as part of the follow-up, too.

Otherwise let me know and I will do it as part of this one.

Collaborator

Yeah, I meant changing them to manual. I know it's annoying but at least you only have a 3-field struct which isn't much. I have a project with several structs each having ~6 fields and I think at least one impl is for a stupid reason... And I can't hack it with PhantomData because I have T::Assoc. 🤷‍♂️

}
}

// Alternate formatting `{:#}` is used to return uppercase version of bech32 addresses which should
// be used in QR codes, see [`Address::to_qr_uri`].
-impl fmt::Display for Address {
+impl<V: NetworkValidation> fmt::Display for Address<V> {
Collaborator

Is this here only because calling Debug is wrong and we want to avoid duplicating the code since debug is (and should be) implemented for both? Note that @apoelstra didn't address my argument that people who want to just validate can call assume_valid. Or display the original string which will be faster anyway.

I suggest to have fmt_unchecked private method and call it from both impls, Display being only available for checked.

Oh and maybe we should mess up debug formatting a bit to discourage invalid usage? (put Address() around it or something).
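
A minimal sketch of this suggestion, under simplifying assumptions, might look like the following: a private fmt_unchecked helper shared by both impls, Display only for the checked marker, and Debug for every marker with an `Address(...)` wrapper. The `encoded` field and `from_encoded` constructor are hypothetical and the trait bound on `V` is omitted; the real formatting logic is of course more involved.

```rust
use std::fmt;
use std::marker::PhantomData;

pub enum NetworkChecked {}
pub enum NetworkUnchecked {}

pub struct Address<V> {
    encoded: String,
    _validation: PhantomData<V>,
}

impl<V> Address<V> {
    /// Hypothetical constructor for the sketch.
    pub fn from_encoded(encoded: &str) -> Self {
        Address { encoded: encoded.to_owned(), _validation: PhantomData }
    }

    /// Private helper shared by the `Display` and `Debug` impls.
    fn fmt_unchecked(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(&self.encoded)
    }
}

// `Display` exists only for checked addresses.
impl fmt::Display for Address<NetworkChecked> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        self.fmt_unchecked(f)
    }
}

// `Debug` exists for both markers, wrapped so the output is not mistaken for a plain address.
impl<V> fmt::Debug for Address<V> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("Address(")?;
        self.fmt_unchecked(f)?;
        f.write_str(")")
    }
}
```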

Contributor Author

Reasonable. I will address this in my next commit.

Contributor Author

This brings an issue with serde and I am not sure what would be the best way around it.

Currently, we have Serialize and Deserialize for Address<NetworkUnchecked> (via macro) and Serialize for Address<NetworkChecked> (manual impl).

Serialize for Address<NetworkUnchecked> uses Display (in the macro). After removing Display from Address<NetworkUnchecked>, it can't be used anymore.

I guess most correct way would be to have only Deserialize for Address<NetworkUnchecked> and Serialize for Address<NetworkChecked> but it seems to me it would hurt ergonomics of the usage.

Alternative is to keep Serialize for Address<NetworkUnchecked> and make it work (via private fmt).

What do people think would be better approach?

Member

I think the best approach is to use the private fmt method.

Collaborator

but it seems to me it would hurt ergonomics of the usage.

This might be true. If people have it inside a larger struct then it can get annoying since they have to move it out and in. Ideally they would have the struct tagged as well but there are no Rust-native tools to compose this. I will give it more thought.

@tcharding
Member

tcharding commented Dec 26, 2022

Perhaps we should only implement Serialize for NetworkChecked? Since we explicitly do not implement Display, folks using serde to get a string to then display it may forget to check the network. I.e., I reckon variant 2 is better.

@tcharding
Member

From the issue:

That would likely incentivize people to hard-code the network which is very bad.

While playing around with this I realized that the current API has not helped to discourage users from hard coding the network e.g.,

    let address = Address::from_str("32iVBEu4dxkUQk9dJbZUiBiQdmypcEyJRf")
        .expect("failed to parse address")
        .require_network(Network::Bitcoin)
        .expect("wrong network");

Where would one get the network from? Everyone either has to feature gate test code or pass the network around?

/// A Bitcoin address.
///
/// ### Parsing addresses
Member

While reviewing I wrote some code, you can use any of this in the examples if you like it

    // Use `FromStr` to get an `Address` from a string.
    let address = Address::from_str("32iVBEu4dxkUQk9dJbZUiBiQdmypcEyJRf")
        .expect("failed to parse address")
        .require_network(Network::Bitcoin)
        .expect("wrong network");

    // Display or serialize an `Address`.
    println!("address: {}", address);
    let ser = serde_json::to_string(&address).expect("failed to serialize address");

    // Use `parse` to get an unchecked address from a string.
    let address: Address<NetworkUnchecked> = "32iVBEu4dxkUQk9dJbZUiBiQdmypcEyJRf"
        .parse()
        .expect("failed to parse address");

    // `Display` is explicitly *not* implemented for `Address<NetworkUnchecked>`.
    println!("address: {:?}", address);
    
    // `Serialize` is also explicitly *not* implemented for `Address<NetworkUnchecked>`.
    // let _ = serde_json::to_string(&address).expect("failed to serialize address");

    // The generic on `Address<T>` is there to catch incorrect addresses and to prevent hard coding of the network in bitcoin codebases - ouch current API does not encourage _not_ hard coding the network.
    let address = address
        .require_network(Network::Bitcoin)
        .expect("wrong network");

    // Equivalent to
    let address: Address<NetworkChecked> = "32iVBEu4dxkUQk9dJbZUiBiQdmypcEyJRf"
        .parse::<Address<NetworkUnchecked>>()
        .expect("failed to parse address")
        .require_network(Network::Bitcoin)
        .expect("wrong network");

@jirijakes
Contributor Author

Perhaps we should only implement Serialize for NetworkChecked? Since we explicitly do not implement Display, folks using serde to get a string to then display it may forget to check the network. I.e., I reckon variant 2 is better.

I feel this variant would be more correct. But it would disallow just writing #[derive(Serialize, Deserialize)] on structs containing addresses, albeit unchecked (they could be checked upon usage). One would have to have a struct with unchecked for deserializing and another with checked for serializing.

One more option may be something in between: Deserialize for unchecked, Serialize for checked, and additionally we could provide a function for serializing unchecked. Users would then be forced to add #[serde(serialize_with = "bitcoin::address::serialize_address_network_unchecked")], which could give them a hint that this approach is not optimal. After all, if they really want to serialize unchecked, they can write this serializing function themselves. It might be better if it's done in the library and properly reviewed and tested.

@Kixunil
Collaborator

Kixunil commented Dec 27, 2022

the current API has not helped to discourage users from hard coding the network

I don't think it's even possible. I'm not too concerned about that, I'm only concerned about the API encouraging hardcoding.

Everyone either has to feature gate test code or pass the network around?

Passing around the network from some configuration is the right way to do that. An exception is mobile apps, which cannot take arguments and so need different builds for different networks. But even then the network needs to be hardcoded in only a single place of the application, so that a build for a different network is easy.
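As a rough illustration of "decide the network once and pass it around", here is a minimal, std-only sketch. The `Network` enum, the `Config` struct, the `--network` flag and the `accepts` helper are all hypothetical stand-ins invented for this example; they are not the `bitcoin` crate's API.

```rust
use std::str::FromStr;

/// Toy stand-in for a network type (assumption: not the real `bitcoin::Network`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Network {
    Bitcoin,
    Testnet,
    Regtest,
}

impl FromStr for Network {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "bitcoin" => Ok(Network::Bitcoin),
            "testnet" => Ok(Network::Testnet),
            "regtest" => Ok(Network::Regtest),
            other => Err(format!("unknown network: {}", other)),
        }
    }
}

/// Hypothetical application configuration; the network is decided once, here.
#[derive(Debug, Clone, Copy)]
pub struct Config {
    pub network: Network,
}

impl Config {
    /// Looks for `--network <name>` in the given arguments.
    pub fn from_args<I>(args: I) -> Result<Config, String>
    where
        I: IntoIterator<Item = String>,
    {
        let mut args = args.into_iter();
        while let Some(arg) = args.next() {
            if arg == "--network" {
                let name = args.next().ok_or("missing value for --network")?;
                return Ok(Config { network: name.parse::<Network>()? });
            }
        }
        Err("--network is required".to_owned())
    }
}

/// Downstream code receives the network through `Config` instead of naming a constant.
pub fn accepts(config: &Config, address_network: Network) -> bool {
    config.network == address_network
}
```

A mobile build that cannot take arguments could replace `Config::from_args` with a single `const`, keeping every other call site unchanged.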

But it would disallow just writing #[derive(Serialize, Deserialize)]

Exactly, this would be very annoying.

#[serde(serialize_with = "bitcoin::address::serialize_address_network_unchecked")]

This sounds interesting, will think about it.

@jirijakes
Contributor Author

#[serde(serialize_with = "bitcoin::address::serialize_address_network_unchecked")]

I played with this idea for a while. While it may look good at first, the issue is that it will not work when the address is in some container (Vec, Option etc.). There would have to be a special serializing function for each of them.

So given that, I believe providing a serializer for Address<NetworkUnchecked> looks like the most practical option, although not the most correct one.
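The container problem can be sketched without serde. In the toy model below, `ToWire` is a made-up trait standing in for `serde::Serialize`, and `UncheckedAddress` stands in for `Address<NetworkUnchecked>`; both are assumptions for illustration only. A free function (the `serialize_with` style) has to be written once per field type, while a trait impl composes through containers.

```rust
/// Toy stand-in for `serde::Serialize` (assumption): turns a value into a string.
pub trait ToWire {
    fn to_wire(&self) -> String;
}

/// Toy stand-in for `Address<NetworkUnchecked>`.
pub struct UncheckedAddress(pub String);

// Approach 1: free functions, one per concrete field type.
pub fn address_to_wire(addr: &UncheckedAddress) -> String {
    addr.0.clone()
}

// A field of type `Option<UncheckedAddress>` needs its own function...
pub fn option_address_to_wire(addr: &Option<UncheckedAddress>) -> String {
    match addr {
        Some(a) => address_to_wire(a),
        None => "null".to_owned(),
    }
}
// ...and so would `Vec<UncheckedAddress>`, `Vec<Option<UncheckedAddress>>`, etc.

// Approach 2: implement the trait on the address itself; containers compose.
impl ToWire for UncheckedAddress {
    fn to_wire(&self) -> String {
        self.0.clone()
    }
}

impl<T: ToWire> ToWire for Option<T> {
    fn to_wire(&self) -> String {
        match self {
            Some(inner) => inner.to_wire(),
            None => "null".to_owned(),
        }
    }
}

impl<T: ToWire> ToWire for Vec<T> {
    fn to_wire(&self) -> String {
        let items: Vec<String> = self.iter().map(|item| item.to_wire()).collect();
        format!("[{}]", items.join(","))
    }
}
```

With the trait impl in place, a nested type such as `Vec<Option<UncheckedAddress>>` works without any extra code, which is the practical argument made above.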

@Kixunil
Collaborator

Kixunil commented Jan 3, 2023

OK, that seems to make sense. What do others think?

@apoelstra
Member

+1 to having a serializer for Address<NetworkUnchecked>.

Parsing an address from a string required a subsequent validation of
the parsed address's network. However, this validation was not
enforced by the compiler; one had to remember to perform it.

This change adds a marker type to `Address` that will assist the
compiler in enforcing this validation.
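The marker-type idea described in this commit message can be sketched with std only. Everything below is a simplified toy, not the crate's implementation: the field layout, the prefix-based "parsing" rule and the `payload` accessor are assumptions made so the example is self-contained.

```rust
use std::marker::PhantomData;
use std::str::FromStr;

/// Toy network type (assumption: the real crate has more variants).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Network {
    Bitcoin,
    Testnet,
}

/// Marker types: never instantiated, used only as type parameters.
pub enum NetworkChecked {}
pub enum NetworkUnchecked {}

/// Toy address carrying its validation state in the type.
pub struct Address<V> {
    network: Network,
    payload: String,
    _validation: PhantomData<V>,
}

/// Parsing always yields an *unchecked* address.
impl FromStr for Address<NetworkUnchecked> {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Toy rule: infer the network from a prefix.
        let network = if s.starts_with("bc1") {
            Network::Bitcoin
        } else if s.starts_with("tb1") {
            Network::Testnet
        } else {
            return Err(format!("unrecognized address: {}", s));
        };
        Ok(Address { network, payload: s.to_owned(), _validation: PhantomData })
    }
}

impl Address<NetworkUnchecked> {
    /// The only way to obtain an `Address<NetworkChecked>` from a parsed address.
    pub fn require_network(self, required: Network) -> Result<Address<NetworkChecked>, String> {
        if self.network == required {
            Ok(Address { network: self.network, payload: self.payload, _validation: PhantomData })
        } else {
            Err(format!("address belongs to {:?}, not {:?}", self.network, required))
        }
    }
}

impl Address<NetworkChecked> {
    /// Methods that use the address live only on the checked variant.
    pub fn payload(&self) -> &str {
        &self.payload
    }
}
```

Calling `payload()` on a freshly parsed value does not compile, because the method exists only on `Address<NetworkChecked>`; that compile error is the enforcement the commit message refers to.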
@jirijakes
Contributor Author

I believe this is ready for final review.

The current version contains:

  • serde Serialize for both NetworkChecked and NetworkUnchecked
  • serde Deserialize only for NetworkUnchecked
  • Debug for NetworkUnchecked producing Address<NetworkUnchecked>(…)
  • Display only for NetworkChecked
  • same derives on marker types as on Address. I will attempt to make the impls manual in a follow-up PR

@Kixunil
Collaborator

Kixunil commented Jan 11, 2023

OK, 1 week should be enough, I'll review this.

FTR, I recently realized we'll have a similar problem with Hash<KnownPreimage> but that one is simpler because we can use #[serde(with = "...")]

Collaborator

@Kixunil Kixunil left a comment

ACK bef7c6e

I recommend that my comments are addressed in a separate PR. This was sitting here for a while and is overall good.

fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
    write!(f, "Address<NetworkUnchecked>(")?;
    self.fmt_internal(f)?;
    write!(f, ")")
Collaborator

This could be annoying in generic contexts since Address<T: NetworkValidation> doesn't imply Address<T>: Debug. I suggest adding const IS_CHECKED: bool; to the NetworkValidation trait and using it here to determine if the address should be wrapped.

This can go into a separate PR.
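A minimal sketch of the `IS_CHECKED` suggestion, assuming a simplified `Address` with a single string payload (the struct layout, `new_for_demo` and `describe` are hypothetical names for this example). With one generic `Debug` impl that branches on the associated constant, `Address<V>: Debug` holds for every `V: NetworkValidation`.

```rust
use std::fmt;
use std::marker::PhantomData;

mod sealed {
    pub trait Sealed {}
}

/// Marker trait with the suggested associated constant.
pub trait NetworkValidation: sealed::Sealed {
    const IS_CHECKED: bool;
}

pub enum NetworkChecked {}
pub enum NetworkUnchecked {}

impl sealed::Sealed for NetworkChecked {}
impl sealed::Sealed for NetworkUnchecked {}

impl NetworkValidation for NetworkChecked {
    const IS_CHECKED: bool = true;
}
impl NetworkValidation for NetworkUnchecked {
    const IS_CHECKED: bool = false;
}

/// Simplified address (assumption: the real type holds more than a string).
pub struct Address<V: NetworkValidation> {
    pub payload: String,
    _validation: PhantomData<V>,
}

impl<V: NetworkValidation> Address<V> {
    /// Demo-only constructor; a real API would not allow building a checked address like this.
    pub fn new_for_demo(payload: &str) -> Self {
        Address { payload: payload.to_owned(), _validation: PhantomData }
    }
}

/// One impl covers both marker types, so generic code can rely on `Debug`.
impl<V: NetworkValidation> fmt::Debug for Address<V> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        if V::IS_CHECKED {
            write!(f, "{}", self.payload)
        } else {
            write!(f, "Address<NetworkUnchecked>({})", self.payload)
        }
    }
}

/// Generic over the marker: compiles without an extra `Address<V>: Debug` bound.
pub fn describe<V: NetworkValidation>(address: &Address<V>) -> String {
    format!("{:?}", address)
}
```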

Contributor Author

Understood. Will have a look at it.

@@ -797,10 +1005,11 @@ fn find_bech32_prefix(bech32: &str) -> &str {
}
}

impl FromStr for Address {
// Address can be parsed only with NetworkUnchecked.
Collaborator

Nit: could've been a doc comment.

@apoelstra
Member

Where is the serde::Serialize impl for Address<Unchecked>?

Other than this nit, bef7c6e LGTM.

@jirijakes
Contributor Author

Where is the serde::Serialize impl for Address<Unchecked>?

It's right after the other impls.

#[cfg(feature = "serde")]
crate::serde_utils::serde_string_serialize_impl!(Address, "a Bitcoin address");
#[cfg(feature = "serde")]
crate::serde_utils::serde_string_deserialize_impl!(Address<NetworkUnchecked>, "a Bitcoin address");
#[cfg(feature = "serde")]
impl serde::Serialize for Address<NetworkUnchecked> {
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
    where
        S: serde::Serializer,
    {
        serializer.collect_str(&DisplayUnchecked(self))
    }
}
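The `DisplayUnchecked` wrapper used above is not shown in this thread. A plausible shape, sketched here under assumptions, is a private newtype that implements `Display` so the public unchecked address type itself never has to. The `UncheckedAddress` struct, the free `collect_str` function (a stand-in for `serde::Serializer::collect_str`) and `serialize_to_string` are all invented for this std-only example.

```rust
use std::fmt;

/// Toy stand-in for `Address<NetworkUnchecked>`; deliberately has no `Display` impl.
pub struct UncheckedAddress {
    payload: String,
}

/// Private wrapper: gives serialization code a `Display` view
/// without exposing `Display` on the public type.
struct DisplayUnchecked<'a>(&'a UncheckedAddress);

impl fmt::Display for DisplayUnchecked<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(&self.0.payload)
    }
}

/// Stand-in for `serde::Serializer::collect_str` (assumption: here it just builds a `String`).
fn collect_str<T: fmt::Display + ?Sized>(value: &T) -> String {
    value.to_string()
}

impl UncheckedAddress {
    pub fn new(payload: &str) -> Self {
        UncheckedAddress { payload: payload.to_owned() }
    }

    /// Plays the role of the `Serialize` impl in the snippet above.
    pub fn serialize_to_string(&self) -> String {
        collect_str(&DisplayUnchecked(self))
    }
}
```

Because the wrapper is private, downstream code still cannot write `println!("{}", unchecked_address)`, which keeps the "no `Display` before checking" rule intact.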

Member

@apoelstra apoelstra left a comment

ACK bef7c6e

@apoelstra
Member

Ah! It just doesn't use the macro. Ok, I'm happy.

@apoelstra apoelstra merged commit e36ae3a into rust-bitcoin:master Jan 11, 2023
@jirijakes jirijakes deleted the address_validation branch January 12, 2023 01:26
apoelstra added a commit that referenced this pull request Jan 23, 2023
…ion>`

ebfbe74 Implement `Debug` for generic `Address<V: NetworkValidation>` (Jiri Jakes)

Pull request description:

  Previously `Debug` was implemented for both `Address<NetworkChecked>` and `Address<NetworkUnchecked>`, but not for cases when the `NetworkValidation` parameter was generic. This change adds this ability. Based on Kixunil's tip.

  With the previous implementation, `test_address_debug()` resulted in an error:

  ![image](https://user-images.githubusercontent.com/1381856/213907042-f1b27f41-fa46-4fa0-b816-cc4df53f5d29.png)

  The added `Debug` impls on `NetworkChecked` and `NetworkUnchecked` are required by the compiler.

  ---

  While dealing with derives and impls, I also attempted to turn all the derives on `Address` into manual impls (see Kixunil's suggestion in #1489 (comment)). The motivation was to make it possible to remove the derives on `NetworkChecked` and `NetworkUnchecked`, too. However, even with manual impls, all the traits on `NetworkChecked` and `NetworkUnchecked` were still required by the compiler in this sort of situation (see also the rest of the discussion linked above). I do not fully understand why; perhaps it is a limitation of this way of sealing traits?

  It can be demonstrated by removing `Debug` derivation on `NetworkUnchecked` and `NetworkChecked` in this PR and running `test_address_debug()`.

  Therefore, if we want to allow users of the library to define types generic in `NetworkValidation` and at the same time derive impls, it seems to me that `NetworkChecked` and `NetworkUnchecked` will have to have the same set of impls as `Address` itself.
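One likely explanation for the behaviour described above is how `#[derive]` works in general: a derived impl on a generic type adds a bound on every type parameter, so a downstream `#[derive(Debug)] struct Wallet<V>` requires `V: Debug` even when `Address<V>` has a manual, unbounded `Debug` impl. The sketch below demonstrates this with toy types; `Wallet`, `new_for_demo` and the simplified `Address` are assumptions for the example, and whether this matches the exact failure in `test_address_debug()` is not confirmed by the thread.

```rust
use std::fmt;
use std::marker::PhantomData;

pub trait NetworkValidation {}

#[derive(Debug)]
pub enum NetworkChecked {}

/// Deliberately *without* `Debug`, to show where the bound comes from.
pub enum NetworkUnchecked {}

impl NetworkValidation for NetworkChecked {}
impl NetworkValidation for NetworkUnchecked {}

/// Simplified address (assumption).
pub struct Address<V: NetworkValidation> {
    pub payload: String,
    _validation: PhantomData<V>,
}

impl<V: NetworkValidation> Address<V> {
    /// Demo-only constructor.
    pub fn new_for_demo(payload: &str) -> Self {
        Address { payload: payload.to_owned(), _validation: PhantomData }
    }
}

/// Manual impl: no `V: Debug` bound, so it works for both markers.
impl<V: NetworkValidation> fmt::Debug for Address<V> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Address({})", self.payload)
    }
}

/// A downstream type that derives `Debug`. The derive expands to roughly
/// `impl<V: NetworkValidation + fmt::Debug> fmt::Debug for Wallet<V>`,
/// so `Wallet<NetworkUnchecked>` is not `Debug` here, although
/// `Address<NetworkUnchecked>` is.
#[derive(Debug)]
pub struct Wallet<V: NetworkValidation> {
    pub address: Address<V>,
}
```

In this sketch, formatting a `Wallet<NetworkUnchecked>` with `{:?}` fails to compile until `NetworkUnchecked` also implements `Debug`, which is consistent with the conclusion that the marker types need the same set of impls as `Address`.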

ACKs for top commit:
  Kixunil:
    ACK ebfbe74
  tcharding:
    ACK ebfbe74
  apoelstra:
    ACK ebfbe74

Tree-SHA512: 87f3fa4539602f31bf4513a29543b04e943c3899d8ece36d0d905c3b5a2d76e29eb86242694b5c494faa5e54bb8f69f5048849916c6438ddd35030368f710353
@stevenroose
Collaborator

stevenroose commented Apr 9, 2023

Bff, I haven't been active for a while, so I can only complain after the fact. But this change seems totally unnecessary to me and it means using Address<impl bitcoin::address::NetworkValidation + Sync> all over the place in so many library APIs. Many of the recent changes seem to be focussed more on internal use than external. Most users will import top-level types, so the recent rename of BlockHeader to Header (supposing anyone external will ever use block::Header..) means a lot of use bitcoin::block::Header as Blockheader;. There could have been a pub use crate::block::Header as BlockHeader; in the lib.rs at least, like we used to do for various other types inside submodules.

@Kixunil
Collaborator

Kixunil commented Apr 9, 2023

This change uncovered a bunch of bugs in soon-to-be-production code I maintain, so this is a huge win. I find the annoyance worth it. + Sync shouldn't be needed since all Address types should be Sync already (and if not, we can make them).

BlockHeader was indeed questionable.

In case you're trying to upgrade bitcoincore-rpc I already did most of the work here: romanz/rust-bitcoincore-rpc#1

@stevenroose
Collaborator

Yeah I am continually rebasing my sync+async refactor of the lib: rust-bitcoin/rust-bitcoincore-rpc#212

Not all addresses are Sync because Address has a generic parameter now and the generic type isn't guaranteed to be Sync (the sealing is a hack, so the compiler doesn't know that all possible types for the parameter live in this library and are hence Sync). The fix I made should enforce the Sync-ness.
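One way the library itself could make this provable, sketched here as an assumption rather than as the fix referenced above, is to add `Send + Sync` as supertraits of the marker trait. Generic code bounded only by `V: NetworkValidation` can then conclude that `Address<V>` is `Sync`. The simplified `Address`, `assert_sync` and `generic_address_is_sync` are hypothetical names for this example.

```rust
use std::marker::PhantomData;

mod sealed {
    pub trait Sealed {}
}

/// Supertraits make `V: NetworkValidation` imply `V: Send + Sync`.
pub trait NetworkValidation: sealed::Sealed + Send + Sync {}

pub enum NetworkChecked {}
pub enum NetworkUnchecked {}

impl sealed::Sealed for NetworkChecked {}
impl sealed::Sealed for NetworkUnchecked {}
impl NetworkValidation for NetworkChecked {}
impl NetworkValidation for NetworkUnchecked {}

/// Simplified address (assumption). `PhantomData<V>` is `Sync` only if `V` is.
pub struct Address<V: NetworkValidation> {
    pub payload: String,
    _validation: PhantomData<V>,
}

/// Compile-time check helper: only instantiable for `Sync` types.
fn assert_sync<T: Sync>() {}

/// Generic over the marker. This compiles only because of the `Sync`
/// supertrait; without it the compiler cannot prove `Address<V>: Sync`.
pub fn generic_address_is_sync<V: NetworkValidation>() -> bool {
    assert_sync::<Address<V>>();
    true
}
```

With such a bound in the library, callers would not need to spell out `+ Sync` in their own signatures.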

Also, in your bump to 0.30 MR you're not allowing users to enter network-unchecked arguments which is annoying because the return type of the API itself is network-unchecked.

@Kixunil
Collaborator

Kixunil commented Apr 9, 2023

Just saw your PR, makes sense.

I did intentionally only accept checked to enforce checking but it's true it's kinda questionable since Core checks it anyway.

@tcharding
Member

I accept full responsibility for the BlockHeader -> block::Header change.

Development

Successfully merging this pull request may close these issues.

Address handling is bit of a footgun
6 participants