This repository has been archived by the owner on Nov 6, 2020. It is now read-only.

Introduce a pruning mode which keeps all logs #9541

Closed
udoprog opened this issue Sep 12, 2018 · 4 comments
Labels
F8-enhancement 🎊 An additional feature request. M4-core ⛓ Core client code / Rust.

@udoprog
Contributor

udoprog commented Sep 12, 2018

  • Parity Ethereum version: 2.0.0
  • Operating system: Linux
  • Installation: snap
  • Fully synchronized: yes and no
  • Network: ethereum
  • Restarted: yes

Hey,

This follows a rather lengthy set of discussions with @joshua-mir on Gitter.

I have a number of contracts for which I need to access all historical logs. I'd like to be able to set up an archive-like parity node which stores all logs but omits tracing and block state, to reduce the on-disk size as much as possible.

I believe this is how go-ethereum nodes work by default with --syncmode=fast. I am currently operating geth nodes with that option; they take up ~120GB of disk space and I'm able to query for all historical logs. For parity to be a workable replacement I need a similar mode of operation.

@jam10o-new jam10o-new added Z1-question 🙋‍♀️ Issue is a question. Closer should answer. M4-core ⛓ Core client code / Rust. F7-optimisation 💊 An enhancement to provide better overall performance in terms of time-to-completion for a task. and removed Z1-question 🙋‍♀️ Issue is a question. Closer should answer. labels Sep 12, 2018
@jam10o-new jam10o-new added this to the 2.1 milestone Sep 12, 2018
@jam10o-new jam10o-new added F8-enhancement 🎊 An additional feature request. and removed F7-optimisation 💊 An enhancement to provide better overall performance in terms of time-to-completion for a task. labels Sep 12, 2018
@epheph

epheph commented Sep 12, 2018

Parity warp mode will do what you need, but you need to give it enough time (days) to catch up. My understanding is that warp sync works in two phases:
1.) Sync from warp-barrier block (a very recent block, such as 6.26M right now) to latest
2.) Sync from block 0 to warp-barrier block

Step #1 can complete before step #2. Even though your node SEEMS sync'd, you might have a huge gap between 0 and your warp-barrier block where fetches for blocks and logs return null or []

To see what is going on, enable the parity jsonrpc module and issue a

curl --data '{"method":"parity_chainStatus","params":[],"id":1,"jsonrpc":"2.0"}' -H "Content-Type: application/json" -X POST localhost:8545
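The same check can be scripted. The following is a minimal Python sketch, not something posted in the thread: `rpc_call` and `describe_block_gap` are hypothetical helper names, and it assumes the node answers on localhost:8545 and that `parity_chainStatus` reports an optional `blockGap` field as a pair of hex-encoded block numbers (first, last).

```python
import json
import urllib.request


def rpc_call(method, params=None, url="http://localhost:8545"):
    """Send one JSON-RPC request and return its "result" field.

    The URL is an assumption; adjust it to wherever the node listens.
    """
    payload = json.dumps(
        {"jsonrpc": "2.0", "id": 1, "method": method, "params": params or []}
    ).encode()
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["result"]


def describe_block_gap(chain_status):
    """Return (first, last) missing block numbers, or None if no gap is reported.

    Assumes blockGap, when present, is a pair of hex strings.
    """
    gap = chain_status.get("blockGap")
    if not gap:
        return None
    first, last = (int(value, 16) for value in gap)
    return first, last
```

Calling `describe_block_gap(rpc_call("parity_chainStatus"))` against a running node would then give either `None` or the range of blocks that has not been imported yet.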

@udoprog
Contributor Author

udoprog commented Sep 13, 2018

@epheph Hey!

--pruning removing logs was the conclusion that the Gitter conversation led me to. Can you verify for me that it's not supposed to?

Is there a way to see which logs are being downloaded right now? I've been running a node for days, and I have yet to see any logs from before the point where the node finished warping. I don't mean the warp barrier (args: --warp-barrier 4410000), but the block at which the node started syncing after it finished warping. There are no logs apart from those for the latest blocks being synced, and the data directory does not seem to be growing very quickly.

@epheph

epheph commented Sep 13, 2018

I am not a parity core developer, but here is my understanding/what I have witnessed in my usage of parity.

The thing that --pruning actually prunes is state, which is required for performing an off-chain call at a specific block, for instance to see what an ERC20 balance was. If you had:

--pruning fast --pruning-history 1500

You would be able to query, in the last 1500 blocks, what a specific ERC20 balance was AT that specific block. The other thing that pruning influences is your ability to trace_replayTransaction.
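To make "querying a balance AT a specific block" concrete, here is a hedged Python sketch that only builds the JSON-RPC payload for such a call. `balance_of_request` is a hypothetical helper, the addresses are placeholders, and whether the node can answer depends on the requested block still being inside the retained state history.

```python
import json

# First four bytes of keccak256("balanceOf(address)"), the standard ERC20 selector.
BALANCE_OF_SELECTOR = "70a08231"


def balance_of_request(token, holder, block_number, request_id=1):
    """Build an eth_call payload reading an ERC20 balance at a given block."""
    holder_hex = holder[2:] if holder.lower().startswith("0x") else holder
    # ABI encoding: selector followed by the address left-padded to 32 bytes.
    data = "0x" + BALANCE_OF_SELECTOR + holder_hex.lower().rjust(64, "0")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "eth_call",
        "params": [{"to": token, "data": data}, hex(block_number)],
    }


if __name__ == "__main__":
    # Placeholder addresses, for illustration only.
    request = balance_of_request("0x" + "11" * 20, "0x" + "ab" * 20, 6250000)
    print(json.dumps(request, indent=2))
```

The second parameter is the block the call is evaluated at; a pruned node would be expected to reject or fail calls for blocks whose state it no longer holds.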

Event logs, however, are not a part of that state that is being pruned. If you are running a [non-light] node, and it is FULLY sync'd, your node will be able to respond to all getLogs requests. A warp node can become a fully sync'd node, it just happens much later, well after it can respond successfully to requests for recent blocks.
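One way to see where a partially synced node stops returning logs is to scan the history in fixed-size block ranges. The sketch below only generates the `eth_getLogs` payloads; `log_requests` and the default chunk size are assumptions for illustration, not anything parity prescribes.

```python
def log_requests(address, from_block, to_block, step=100_000):
    """Yield eth_getLogs payloads covering [from_block, to_block] in chunks.

    Each payload covers at most `step` blocks, inclusive on both ends.
    """
    request_id = 1
    start = from_block
    while start <= to_block:
        end = min(start + step - 1, to_block)
        yield {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "eth_getLogs",
            "params": [
                {"address": address, "fromBlock": hex(start), "toBlock": hex(end)}
            ],
        }
        request_id += 1
        start = end + 1
```

Ranges that come back empty for a contract known to have emitted events there would suggest that those blocks have not been imported yet.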

You can check the status of your warp sync by including parity in your --jsonrpc-apis and issuing the parity_chainStatus curl I included above.

@andresilva
Contributor

Echoing what @epheph said, the pruning algorithm only affects state. After your client is fully synced, i.e. all of the blocks from genesis to the current tip of the chain have been imported, all of the logs should be available. Whenever a node is warp-synced, it fetches a snapshot of the state at a given block and starts syncing from there on. In the background, it also starts syncing all the blocks before the snapshot; we call this process the ancient block sync. Currently there are some issues with this that we're addressing (#9531). We're also updating the RPC APIs to provide useful errors in case the chain isn't fully synced (#9475). In the meantime, the eth_syncing API should return information about whether the ancient block sync process has finished or not (https://wiki.parity.io/JSONRPC-eth-module#eth_syncing).
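For reference, a small Python sketch of interpreting an `eth_syncing` result. It reads only the standard fields (`currentBlock`, `highestBlock`); the comment above does not say which fields expose ancient block sync progress, so that part is deliberately left out. `sync_progress` is a hypothetical helper name.

```python
def sync_progress(result):
    """Interpret the "result" of an eth_syncing call.

    Returns None when the node reports false (not syncing), otherwise a tuple
    (current_block, highest_block, blocks_remaining).
    """
    if result is False:
        return None
    current = int(result["currentBlock"], 16)
    highest = int(result["highestBlock"], 16)
    return current, highest, highest - current
```

A `None` here only reflects what the node itself reports; checking the linked wiki page for the exact fields returned by a given parity version is advisable.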

I'm closing this issue since the original request is invalid, but wanted to let you know that we're aware of the current usability pain points and we're working on improving it. 👍
