Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add possibility to traverse inner nodes #16

Merged
merged 1 commit into from
Jan 18, 2021
Merged

Conversation

liamsi
Copy link
Member

@liamsi liamsi commented Jan 13, 2021

Adds an optional "NodeVisitor" function to the tree that will be called on each node of the tree.

This can be used to tackle #9, and can also be used in the IPLD-plugin. It might also be useful to inspect/debug the tree, or, to print-out the tree in different ways.

 - to print the tree nodes
 - collect nodes (incl. inner) to write storage
@codecov
Copy link

codecov bot commented Jan 13, 2021

Codecov Report

Merging #16 (8340e76) into master (4b4c58a) will increase coverage by 0.16%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #16      +/-   ##
==========================================
+ Coverage   95.13%   95.30%   +0.16%     
==========================================
  Files           7        7              
  Lines         288      298      +10     
==========================================
+ Hits          274      284      +10     
  Misses         10       10              
  Partials        4        4              
Impacted Files Coverage Δ
nmt.go 92.95% <100.00%> (+0.53%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4b4c58a...8340e76. Read the comment docs.

@evan-forbes
Copy link
Member

evan-forbes commented Jan 14, 2021

This is super neat! The implementation as an Option is perfect.

@liamsi
Copy link
Member Author

liamsi commented Jan 16, 2021

Looks like this is all we need for the IPLD plugin.

@liamsi liamsi marked this pull request as ready for review January 16, 2021 09:50
@musalbas
Copy link
Member

As NodeVisitorFn is stateless (it just takes in the hash of a node and its children), how would the function determine which nodes are part of which Merkle proof or is it not relevant?

@liamsi
Copy link
Member Author

liamsi commented Jan 18, 2021

Good point. Currently, this is not relevant - at least for the first LL ipld plugin it isn't and I tried to keep the diff as small as possible. That said, there probably are use-cases where this will become relevant (e.g. when doing #15 this might need to change) 🤔 Did you have something specific in mind when you wondered about this? Maybe I'm missing something?

For now, I don't see any urgent need: nodes who generate proofs are expected to have all the data to generate these, and, they can always recompute the (inner) proof-nodes from scratch. The NodeVisitorFn is only called when computing the root (and simply visits all nodes).

@musalbas
Copy link
Member

I assume this is used in the IPLD plugin in order to publish the intermediate nodes? If so this seems like a non-generalisable feature but I guess it's OK if it's implemented as an option rather than a mandatory constructor argument.

@liamsi
Copy link
Member Author

liamsi commented Jan 18, 2021

Yes, that is exactly what it is used for. What kind of generalization do you have in mind though? For reading the nodes from storage?

If there is sth. obvious which do need soon, we can add this here too. Otherwise, let's track this in a separate issue.

@liamsi liamsi merged commit 6e8a6a5 into master Jan 18, 2021
@liamsi liamsi deleted the ismail/add_node_visitor branch January 18, 2021 16:05
@musalbas
Copy link
Member

Yes, that is exactly what it is used for. What kind of generalization do you have in mind though? For reading the nodes from storage?

If there is sth. obvious which do need soon, we can add this here too. Otherwise, let's track this in a separate issue.

I suppose actually storing the tree itself (in memory) within NMT itself and letting people read from it.

@liamsi
Copy link
Member Author

liamsi commented Jan 18, 2021

The tree exists in memory but proofs are currently recomputed on demand. That's isn't "reading from it" but should be quite efficient already, especially for a first version / mvp (the leaf-data and the leaf-hashes are also stored in memory and the former can be retrieved via the API).

If we want to avoid recomputing the inner proof nodes every time they are needed, we can actually store the inner nodes too. Not sure this would need to be exposed as an exported API or sth like this. Of course, if we want to store and then read the inner nodes from somewhere else, we'd need to change the API. We could also pass in an optional NodeGetterFn or SubTreeGetter maybe? For now I think recomputing the nodes is fine and will be fast enough, especially way faster than retrieving them from any kind of persistent storage (especially much faster thanreading them from sth like IPFS). Does that make sense?

@adlerjohn
Copy link
Member

Hashing is pretty fast and involves no disk I/O. Storing inner nodes would mean multiple on-disk accesses along with more storage overhead. So it might not actually be so bad to just fetch the leaves all at once then recompute proofs as needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants