
p2p/discover: persistent node database #793

Merged
merged 14 commits into ethereum:develop
Apr 28, 2015

Conversation

karalabe
Member

This PR introduces a seed cache database containing all the nodes that passed the discovery ping-pong procedure. Whenever ethereum starts up and there are no known nodes, the first 10 seeds are retrieved (and deleted) from the cache; and are used beside the bootstrap servers for connecting to the network.

The reason for the immediate deletion of the seed nodes is self cleanup: all seeds are evacuated when probing, but live ones get added back after the ping-pong, resulting in stale data gradually disappearing.

It might make sense to put an additional upper bound on the total number of peers we'd like to cache and always drop the oldest ones, but I'd vote to see how this mechanism behaves and polish it afterwards.

@fjl Please check if this is what you had in mind :)

@fjl
Contributor

fjl commented Apr 24, 2015

This is almost what I had in mind, but not quite. I didn't explain it well enough before you started.

The purpose of discover.nodeDB is tracking nodes which have been bonded with (we should put this
into the type's comment). findnode requests from unknown nodes are discarded.

Calling it Cache or SeedXXX would suggest that its sole purpose is seeding the table. Those names would be a good choice if that were so, but it isn't: using it for seeding is only a nice side effect.

My idea going forward is as follows:

  • The node DB needs to have an efficient way of updating a timestamp of the last ping received and ping sent for a node. The mechanism should be geared towards allowing other metadata to be stored later.
  • Inactive nodes should expire after a reasonable amount of time. I think 24h would be reasonable.
    This should happen in some kind of background loop.
    Note that there is no need to be very exact about when a node expires. It does not matter
    if nodes are still in the database even 30 minutes after they should've been expired.
  • When bootstrapping from a non-empty DB, we should attempt to insert the most recent nodes into the table before expiring items. The scenario to keep in mind is that a client might not be started for more than 24h. In this case, we should still try to insert some of the nodes because they might be up.
  • The node DB is not supposed to be part of the public API of p2p/discover. The long term plan is to also store reputation values and RLPx session resumption tokens in this database. When we start doing that, the DB will move to its own package. I'd like to keep it unexported for now.

I suggest that the implementation of the DB should store metadata items as separate keys prefixed with the node ID. Example: for a single node with ID AAAA, LevelDB would contain these items:

key value
version database version
n:AAAA:discover RLP encoded discover.Node
n:AAAA:discover:lastping UNIX timestamp of last ping sent
n:AAAA:discover:lastpong UNIX timestamp of last pong received
n:AAAA:reputation ...

With this scheme, deleting a node simply means deleting everything with the ID as prefix.
Updating is fast because only the key that is updated needs to change. Scanning for expired nodes
is slow but we don't care about that.
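The prefixed-key scheme above might look roughly like this in code. A sketch only: `makeKey` and `splitKey` are illustrative names, and it assumes node IDs render as hex without colons.

```go
package main

import (
	"fmt"
	"strings"
)

// makeKey builds a namespaced database key, e.g.
// makeKey("AAAA", "discover:lastping") -> "n:AAAA:discover:lastping".
// An empty field yields the node's root entry.
func makeKey(id, field string) string {
	if field == "" {
		return "n:" + id
	}
	return "n:" + id + ":" + field
}

// splitKey reverses makeKey, recovering the node ID and the field name.
func splitKey(key string) (id, field string, ok bool) {
	if !strings.HasPrefix(key, "n:") {
		return "", "", false // not a node entry (e.g. the version key)
	}
	rest := strings.TrimPrefix(key, "n:")
	parts := strings.SplitN(rest, ":", 2)
	id = parts[0]
	if len(parts) == 2 {
		field = parts[1]
	}
	return id, field, true
}

func main() {
	key := makeKey("AAAA", "discover:lastping")
	fmt.Println(key) // n:AAAA:discover:lastping
	id, field, _ := splitKey(key)
	fmt.Println(id, field) // AAAA discover:lastping
}
```

Deleting a node then amounts to deleting every key that `splitKey` resolves to its ID.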

API-wise, it could look like this:

func (*nodeDB) node(NodeID) *Node
func (*nodeDB) updateNode(*Node)
func (*nodeDB) lastPing(NodeID) time.Time
func (*nodeDB) updateLastPing(NodeID, time.Time) 
...
func (*nodeDB) expire(from time.Time)

@fjl fjl mentioned this pull request Apr 24, 2015
@fjl fjl changed the title Discovery node cache p2p/discover: persistent node database Apr 24, 2015
@fjl
Contributor

fjl commented Apr 24, 2015

I updated the comment above with stricter name-spacing of keys.

@karalabe
Member Author

The database structure and details are imho perfectly reasonable and fine.

One issue I'm seeing with the API proposal, however, is circular dependencies. The moment you move this nodedb out into its own package, it will depend on a lot of stuff from discover, but discover itself will depend on nodedb for the seeding.

Edit: One potential solution I can imagine is to have a base nodedb database for storing, querying and expiring items according to some schema, and then have client packages (e.g. discover) provide public adapters to it for fetching the data they themselves generate. However, I don't yet fully see the dependency implications of such a solution.

@fjl
Contributor

fjl commented Apr 24, 2015

We will address that when we get there. My guess is that Node and NodeID would move into the new package. I am not really planning to build a generic database for everyone's node metadata. Maybe the new package will be p2p/internal/nodedb.

@karalabe
Member Author

I've updated the design to use the fancier db layout/schema. However, entry expiration is not yet done, nor have I spent time to test it even marginally beyond passing the system tests. If you have time @fjl, take a glance to make sure it's going in the right direction.

PS: Since leveldb doesn't have any querying mechanism other than iterating over the entire database, the current seed query is very sub-optimal. Ideas?

Edit: I have to run, so I won't have time until Monday to finish up this new version.

@obscuren obscuren modified the milestone: Frontier Apr 24, 2015
return time.Time{}
}
var unix int64
if err := rlp.DecodeBytes(blob, &unix); err != nil {
Contributor

Integer values don't need to use RLP. Note also that package rlp will refuse to encode or decode int64. Let's use binary.BigEndian.Uint64 or go fancy and use binary.Varint (as for the version number).

Contributor

Maybe we should forgo having {fetch,store}Time and only have {fetch,store}Int64 instead. It's
easy to do time.Unix(db.fetchInt64(key(...)), 0) and db.storeInt64(key(...), t.Unix())
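The `{fetch,store}Int64` suggestion could look roughly like this. A sketch under stated assumptions: a plain map stands in for the LevelDB handle, and `db`, `storeInt64`, `fetchInt64` are illustrative names, not the merged code.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"time"
)

// db is a hypothetical stand-in for the LevelDB handle.
type db struct {
	store map[string][]byte
}

// storeInt64 encodes n with binary.Varint, as suggested for the version number.
func (d *db) storeInt64(key string, n int64) {
	blob := make([]byte, binary.MaxVarintLen64)
	blob = blob[:binary.PutVarint(blob, n)]
	d.store[key] = blob
}

// fetchInt64 decodes a varint value; missing or corrupt entries yield 0.
func (d *db) fetchInt64(key string) int64 {
	blob, ok := d.store[key]
	if !ok {
		return 0
	}
	n, read := binary.Varint(blob)
	if read <= 0 {
		return 0
	}
	return n
}

func main() {
	d := &db{store: make(map[string][]byte)}
	// A timestamp round-trips through Unix seconds, as in the comment above.
	t := time.Unix(1398336000, 0)
	d.storeInt64("n:AAAA:discover:lastping", t.Unix())
	got := time.Unix(d.fetchInt64("n:AAAA:discover:lastping"), 0)
	fmt.Println(got.Equal(t)) // true
}
```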

@fjl
Contributor

fjl commented Apr 24, 2015

This looks good. 🎉

Since leveldb doesn't have any querying mechanism other than iterating over the entire database, the current seed query is very sub-optimal. Ideas?

It doesn't matter how efficient the query is. If it becomes a problem, we can track the most recent nodes by maintaining an index (with a different key prefix). We could also roll the seed query into the initial expiration because it needs to scan the database anyway.

Discovery startup can take up to 2 seconds because it waits for package nat to figure out the external IP address. We can run the query concurrently at that time. I doubt it'll take more than a second to scan all nodes.

@obscuren
Contributor

leveldb supports prefixes and range sets if you need them

field = string(item[len(id):])

return id, field
}
Contributor

These two don't need to be methods. They can be plain functions.

Member Author

Dunno why GitHub doesn't close this diff, it's been updated.

@karalabe
Member Author

@obscuren @fjl Hey all, I think this PR's mostly done, so we could do another round of reviews on it.

@obscuren
Contributor

👍

@karalabe
Member Author

Ah, good catch with the lockup. Didn't know about the blocking behavior. Will update in 3 mins.

@karalabe
Member Author

PTAL

obscuren added a commit that referenced this pull request Apr 28, 2015
p2p/discover: persistent node database
@obscuren obscuren merged commit 91cb8cd into ethereum:develop Apr 28, 2015
maoueh pushed a commit to streamingfast/go-ethereum that referenced this pull request Sep 13, 2023