blockstore: add an option to skip duplicate Puts by CID, mimicking carv1's Selective Writer API #123
Labels
P1
High: Likely tackled by core team if no one steps up
Filecoin writes proofs into CAR files which are hashed, so we need their contents to be deterministic.
The way Filecoin currently generates those CARv1 files is via v1's selective writer API, which ensures canonical ordering via traversals, and also deduplicates by CID:
go-car/selectivecar.go, lines 229 to 230 in 71cfa2f
For Ignite's current project, they receive blocks via graphsync, which ensures the order of blocks as per the IPLD selector, just like v1's selective writer. However, we might receive duplicate blocks from a client. When graphsync receives blocks they end up getting "Put" to our carv2 read-write blockstore.
If we want to be compatible, we should support deduplicating by CID. I propose a ReadWrite blockstore option for it, like `DeduplicateByCID`; if one calls `Put` on the same CID twice, the second call will simply do nothing and return a nil error.
In the future we could satisfy this need by porting Selective Writers to carv2 (#104), but that can't happen for another month or two.
I could also ask Ignite to implement a Blockstore wrapper that does this deduplication on Put calls, but deduplicating by CID also seems like a reasonable opt-in feature that others might want in the future. It wouldn't make the API significantly more complex or the read-write blockstore significantly slower, either.
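For illustration, here is a minimal sketch of the wrapper approach, showing the semantics I have in mind: the first Put for a CID is written, later Puts for the same CID are no-ops that return nil, and the order of first occurrences is preserved. Everything in it is hypothetical and not go-car API: `Block`, `Putter`, `DedupStore`, and `LogStore` are stand-ins, and CIDs are plain strings so that the example needs only the standard library.

```go
package main

import (
	"fmt"
	"sync"
)

// Block is a hypothetical stand-in for a real block; the CID is a plain
// string here purely to keep the sketch self-contained.
type Block struct {
	CID  string
	Data []byte
}

// Putter is a hypothetical, minimal view of the write side of a blockstore.
type Putter interface {
	Put(b Block) error
}

// DedupStore wraps a Putter and skips Puts for CIDs it has already written.
type DedupStore struct {
	mu    sync.Mutex
	seen  map[string]struct{}
	inner Putter
}

// NewDedupStore returns a deduplicating wrapper around inner.
func NewDedupStore(inner Putter) *DedupStore {
	return &DedupStore{seen: make(map[string]struct{}), inner: inner}
}

// Put writes the block on first sight of its CID. A repeated CID does
// nothing and returns a nil error.
func (s *DedupStore) Put(b Block) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.seen[b.CID]; ok {
		return nil
	}
	if err := s.inner.Put(b); err != nil {
		return err
	}
	// Only mark the CID as seen once the underlying write succeeded,
	// so a failed Put can be retried.
	s.seen[b.CID] = struct{}{}
	return nil
}

// LogStore records the order of writes; it stands in for a CAR file that
// is appended to sequentially.
type LogStore struct {
	Written []string
}

// Put appends the block's CID to the write log.
func (l *LogStore) Put(b Block) error {
	l.Written = append(l.Written, b.CID)
	return nil
}

func main() {
	rec := &LogStore{}
	store := NewDedupStore(rec)
	// Simulate a client sending duplicate blocks in selector order.
	for _, c := range []string{"cidA", "cidB", "cidA", "cidC", "cidB"} {
		if err := store.Put(Block{CID: c}); err != nil {
			panic(err)
		}
	}
	fmt.Println(rec.Written) // prints [cidA cidB cidC]
}
```

A built-in option would behave the same way, but it could presumably reuse whatever index the read-write blockstore already keeps rather than maintaining a separate set of seen CIDs.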