[Question] Snap/Fast sync vs. Full sync for Archive nodes? #24413

dzou · 2022-02-16T22:24:58Z

The documentation describes three sync modes {fast, snap, full} and different node types, including archive and full.

What is the difference between using snap/fast sync for an archive node and using full sync for an archive node? Are archive nodes that used snap sync unable to serve the same requests that archive nodes that used full sync can serve?

Running:

geth --syncmode snap --gcmode archive

I tried to find some documentation, but it is not clear what the verdict is:

- eth/63 fast synchronization algorithm #1889 -- Introducing fast sync, author notes: "This allows a fast synced node to still retain its status an an archive node containing all historical data for user queries (and thus not influence the network's health in general), but at the same time to reassemble a recent network state at a fraction of the time it would take full block processing."
- [Ethereum Stackexchange] Do we have a full archive node after a geth fast sync? -- Author speculates that the node is not able to serve requests prior to a "pivot block".
- [Ethereum Stackexchange] does --gcmode=archive require --syncmode=full? -- Answer is not known.

Would love to receive some clarification on this! Many thanks.

karalabe · 2022-02-17T09:44:13Z

There are two different notions: gcmode and syncmode. The latter refers to how the initial sync is done, and the former to how the node handles garbage collection of state.

In fast/snap sync, the "current" state of the network is downloaded directly. As such, any state before it (not block, just the state) will not be available for serving. If you are running with gcmode=archive after snap sync, then your node will hold onto all generated state after initial sync, but anything before it will still be missing.

In full sync, blocks are processed one by one from genesis. By default geth will garbage collect the generated state, but if gcmode=archive is specified it will hold onto it. Thus you will have all the state available from genesis.

> This allows a fast synced node to still retain its status an an archive node

That statement is wrong (I guess I messed it up at the time). I meant to write "retains its status as a full node".

> Author speculates that the node is not able to serve requests prior to a "pivot block".

Archive mode retains all the state that gets generated during block processing. If you reprocess all the blocks from genesis - with archive flag set - you will have all that state. If you do the initial sync without block processing, the state for that segment will not be available.

> Does --gcmode=archive require --syncmode=full? -- Answer is not known.

Depends on the behavior you want. If you want everything since genesis, then full is required. There are also use cases where you don't want everything, only from "today onward". In that case you can fast/snap sync and have archive keeping only the states after.
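
The rules described above can be summarized in a small model. This is only an illustrative sketch, not geth code: the function name and its parameters are made up for this example, and the default of 128 recent states for a non-archive node is an assumption taken from the behaviour reported later in this thread.

```python
from typing import Tuple


def historical_state_range(syncmode: str, gcmode: str, pivot: int, head: int,
                           recent: int = 128) -> Tuple[int, int]:
    """Return (first, last) block whose state the node is expected to serve.

    pivot is the block at which snap/fast sync downloaded the state;
    it is ignored for full sync. All names here are hypothetical.
    """
    if syncmode not in ("full", "snap", "fast"):
        raise ValueError(f"unknown syncmode: {syncmode!r}")
    if gcmode not in ("full", "archive"):
        raise ValueError(f"unknown gcmode: {gcmode!r}")

    if gcmode == "archive":
        # Archive keeps every state generated by block processing:
        # from genesis after a full sync, from the pivot after snap/fast sync.
        first = 0 if syncmode == "full" else pivot
    else:
        # Without archive, only a window of recent states survives
        # garbage collection (assumed to be about `recent` blocks).
        first = max(0, head - recent + 1)
        if syncmode != "full":
            first = max(first, pivot)
    return first, head
```

For example, `historical_state_range("snap", "archive", 14_000_000, 15_000_000)` gives `(14_000_000, 15_000_000)`: state before the pivot is missing even though the node runs with gcmode=archive.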

dzou · 2022-02-22T18:58:21Z

@karalabe -- Thank you so much for the response. I just have one more followup to help me understand --

What would happen if we ran an archive node with --syncmode full --gcmode archive and then shut it down for a day then switched to --syncmode snap --gcmode archive? Would it be able to sync faster to current time and still retain all information in history to serve?

We manage some archive nodes with syncmode=full and are wondering if there is some way to speed things up.

holiman · 2022-02-23T08:16:34Z

> What would happen if we ran an archive node with --syncmode full --gcmode archive

Then it would store every state from genesis until you shut it off.

> day then switched to --syncmode snap --gcmode archive?

It would ignore the new syncmode and continue.

I guess what would be desirable for you would be this:

1. Node A has (all) state for blocks 0-2M,
2. Node B has all state for blocks 2M-3M,
3. etc..
N. Node N has state from 13M to head

This is possible, but would require a bit of coding, and some special setup. For example, you would run nodes 1 to N-1 with --nodiscover --maxpeers=0 to prevent them from importing more data.
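
With such a node set, a client has to pick the node that holds state for the block it is querying. The following is a minimal sketch of that routing, assuming a hand-maintained table; the endpoint URLs, the three-node layout and the function name are hypothetical and only loosely follow the ranges listed above.

```python
from typing import List, Optional, Tuple

# (first block, last block or None for "up to head", RPC endpoint).
# Hypothetical layout and URLs, for illustration only.
NODE_SET: List[Tuple[int, Optional[int], str]] = [
    (0, 2_000_000, "http://node-a.example:8545"),
    (2_000_000, 3_000_000, "http://node-b.example:8545"),
    (3_000_000, None, "http://node-n.example:8545"),
]


def node_for_block(block: int, node_set=NODE_SET) -> str:
    """Return the endpoint of the first node whose range contains `block`."""
    for first, last, endpoint in node_set:
        if block >= first and (last is None or block <= last):
            return endpoint
    raise LookupError(f"no node in the set holds state for block {block}")
```

Ranges are treated as inclusive, so a boundary block such as 2M is served by the first node that lists it.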

One way to create an "archive node from 1M to 2M" could be to:

1. Use syncmode=full until 1M,
2. Do a state-pruning
   - After pruning, you can also copy the datadir for use with the 2M-3M node, which needs to continue without gcmode=archive
3. Use syncmode=full gcmode=archive between 1M and 2M
4. Stop the node
5. Run the node with --nodiscover --maxpeers=0.
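
The steps above can be written down as a list of commands to run in order. This is a rough sketch, not a tested recipe: the function name and datadir paths are hypothetical, stopping the node at a given block height is left as a manual action, and keeping --syncmode full --gcmode archive on the final, peerless run is an assumption of this example.

```python
from typing import List, Tuple

Step = Tuple[str, List[str]]  # (what to do, command line)


def archive_segment_plan(datadir: str, next_datadir: str,
                         start: int, end: int) -> List[Step]:
    """Build the command sequence for an archive node covering start..end."""
    base = ["geth", "--datadir", datadir]
    return [
        (f"full sync with default gc; stop the node manually at block {start}",
         base + ["--syncmode", "full"]),
        ("prune the state while the node is stopped",
         ["geth", "snapshot", "prune-state", "--datadir", datadir]),
        ("copy the pruned datadir for the node covering the next segment",
         ["cp", "-a", datadir, next_datadir]),
        (f"archive every state from {start}; stop the node manually at block {end}",
         base + ["--syncmode", "full", "--gcmode", "archive"]),
        ("serve the frozen segment without importing new blocks",
         base + ["--syncmode", "full", "--gcmode", "archive",
                 "--nodiscover", "--maxpeers=0"]),
    ]
```

Calling `archive_segment_plan("/data/node-1m", "/data/node-2m", 1_000_000, 2_000_000)` returns five steps that can be printed with `shlex.join` and reviewed before anything is executed.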

I guess the one thing lacking to script up such a scenario right now is that we don't have a way to stop at a certain block, e.g. geth ...args.. --exit-at=2000000.

Another useful option would be to extend gcmode, so that one could say e.g. gcmode=0:full,1000000:archive,2000000:full, meaning it would be given a set of N:<mode>, in increasing order, and automatically switch at the given numbers.
I'll file this up as a potential feature.
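
To make the proposed N:<mode> syntax concrete, here is a minimal sketch of how such a schedule could be parsed and queried. It is purely hypothetical: geth has no such option in this thread, the function names are invented, and requiring the schedule to start at block 0 is an assumption of this example.

```python
from bisect import bisect_right
from typing import List, Tuple


def parse_gc_schedule(spec: str) -> List[Tuple[int, str]]:
    """Parse e.g. "0:full,1000000:archive,2000000:full" into (block, mode) pairs."""
    schedule: List[Tuple[int, str]] = []
    for item in spec.split(","):
        number, _, mode = item.strip().partition(":")
        if mode not in ("full", "archive"):
            raise ValueError(f"unknown gc mode in {item!r}")
        block = int(number)
        if schedule and block <= schedule[-1][0]:
            raise ValueError("block numbers must be in increasing order")
        schedule.append((block, mode))
    if schedule[0][0] != 0:
        raise ValueError("schedule must start at block 0")
    return schedule


def gc_mode_at(schedule: List[Tuple[int, str]], block: int) -> str:
    """Return the gc mode in effect at `block`."""
    if block < 0:
        raise ValueError("block number must be non-negative")
    starts = [start for start, _ in schedule]
    # The active entry is the last one whose start is <= block.
    return schedule[bisect_right(starts, block) - 1][1]
```

With the example string from above, the mode switches to archive exactly at block 1000000 and back to full at block 2000000.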

dzou · 2022-02-23T16:19:46Z

Thank you! 🙏

shiziwen · 2022-05-31T08:24:27Z

> 2. Do a state-pruning

@holiman Thanks for your reply.
But what do you mean by "Do a state-pruning"? What should I do, or which command should I use, to perform the state pruning?

holiman · 2022-05-31T11:47:54Z

I mean geth snapshot prune-state.

shiziwen · 2022-06-01T02:49:32Z

> I mean geth snapshot prune-state.

Thank you very much, I will figure out what this command does.

shiziwen · 2022-06-01T04:02:04Z

@holiman Hi, I have another question about the state and the snapshot.

As far as I know, every block has its own state (more precisely, the world state MPT), which contains the account (and contract) info. With full sync on a full node, the state is generated by executing the downloaded transactions. With fast (now snap) sync on a full node, it will not download any state until the pivot block, and after that it works like full sync, right?

So the full node actually saves the state for every block after the full sync, right?

But according to some documents and my own test, a full node (whether snap synced or full synced) only preserves state for the latest 128 blocks. For older blocks, it returns the error "missing trie node XXX (path )" when eth_getBalance is used to get an account's balance at a specific block number.
So I don't understand why. As far as state is concerned, what is the difference between a full node and an archive node?
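
For reference, the kind of historical query described above can be sketched as follows. This only builds the JSON-RPC request and inspects a response; sending it to a node is left out. The helper names are hypothetical, and the error check assumes the "missing trie node" text quoted above appears in the error message.

```python
import json


def get_balance_request(address: str, block_number: int, request_id: int = 1) -> str:
    """Build an eth_getBalance request for `address` at a specific block."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "eth_getBalance",
        # The block parameter is passed as a hex-encoded quantity.
        "params": [address, hex(block_number)],
    }
    return json.dumps(payload)


def is_missing_state(response: dict) -> bool:
    """True if the response looks like the pruned-state error from this thread."""
    error = response.get("error")
    return bool(error) and "missing trie node" in error.get("message", "")
```

A non-archive node would be expected to answer such a request only for recent blocks, while an archive node answers it for every block whose state it generated itself.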

And what is the snapshot? What is the difference between the state and the snapshot?

Thank you very much.

MikeC-BC · 2022-07-27T20:09:15Z

A follow-up question: Is it possible to reprocess blocks from a certain block height to retain that state?

E.g. Say I ran my node with --syncmode fast and --gcmode archive until block 15000000 when I switched to --syncmode full. I now want state starting from block 14000000.

Can this be done with some reprocessing of blocks without having to do a full sync from scratch?

karalabe · 2022-07-28T03:50:54Z

Only if you reprocess everything from genesis (i.e. a full sync).

ubuntutest · 2023-01-25T13:33:36Z

Some old documentation told me to use Geth with pruning to get a copy of only the newest blocks.

To do this I was supposed to use the "--fast" flag, which I understand is now deprecated.

Currently there seem to be "full", "snap" and "light" modes.

Can you tell me the difference between "snap" and "light"? Full, I think, is obvious.

MariusVanDerWijden · 2023-01-25T20:50:28Z

@ubuntutest https://geth.ethereum.org/docs/fundamentals/sync-modes

dzou added the type:docs label Feb 16, 2022

karalabe closed this as completed Feb 17, 2022

holiman mentioned this issue Feb 23, 2022

Setting up archive node sets #24461

Open

aathan mentioned this issue Sep 12, 2022

Geth refuses to sync after being a few months offline #25730

Closed
