This repository has been archived by the owner on Nov 6, 2020. It is now read-only.

Potential Database Corruption during sync #2603

Closed
tomusdrw opened this issue Oct 13, 2016 · 11 comments
Labels
F2-bug 🐞 The client fails to follow expected behavior. M4-core ⛓ Core client code / Rust.
Comments

@tomusdrw
Collaborator

2016-10-12 23:29:58  Syncing #2422970 b8be…d6b2      1 blk/s    6 tx/s   0 Mgas/s       0+ 7245 Qed   #2430219    1/46/50 peers      2 GiB db    7 MiB chain   40 MiB queue   11 MiB sync
2016-10-12 23:30:02  Block import failed for #2422985 (843d…5b07)
Error: Trie(IncompleteDatabase(11b9caba988cd1aeefcc20ca0595f051064c70e7149a5a0670366c322268c310))
2016-10-12 23:30:04  Bad header 2423110 (b8e4…a224) from 27: 27, state = ChainHead
2016-10-12 23:30:04  Bad header 2423110 (b8e4…a224) from 83: 83, state = ChainHead
2016-10-12 23:30:04  Bad header 2423110 (b8e4…a224) from 41: 41, state = ChainHead
2016-10-12 23:30:04  Bad header 2423110 (b8e4…a224) from 47: 47, state = ChainHead
2016-10-12 23:30:04  Bad header 2423110 (b8e4…a224) from 69: 69, state = ChainHead
2016-10-12 23:30:04  Bad header 2423110 (b8e4…a224) from 61: 61, state = ChainHead
2016-10-12 23:30:04  Bad header 2423110 (b8e4…a224) from 72: 72, state = ChainHead
2016-10-12 23:30:04  Bad header 2423110 (b8e4…a224) from 5: 5, state = ChainHead
2016-10-12 23:30:04  Bad header 2423110 (b8e4…a224) from 48: 48, state = ChainHead
2016-10-12 23:30:06  Bad header 2423110 (b8e4…a224) from 37: 37, state = ChainHead
2016-10-12 23:30:07  Bad header 2423110 (b8e4…a224) from 2: 2, state = ChainHead
2016-10-12 23:30:08  Bad header 2423110 (b8e4…a224) from 57: 57, state = ChainHead
2016-10-12 23:30:08  Syncing #2422984 969c…b7f0      1 blk/s   15 tx/s   0 Mgas/s       0+    0 Qed   #2422983    3/39/50 peers      2 GiB db    8 MiB chain    2 KiB queue   11 MiB sync
2016-10-12 23:30:11  Bad header 2423110 (b8e4…a224) from 23: 23, state = ChainHead
thread 'IO Worker #1' panicked at 'Potential DB corruption encountered: Database missing expected key: 1e34…d51d', ethcore/src/state/mod.rs:645
...
error: Process didn't exit successfully: `target/release/parity` (signal: 11, SIGSEGV: invalid memory reference)

Enough disk space (20GB)
4GB RAM node

Running latest master via:
$ cargo run --release --no-default-features --bin parity -- --relay-set strict --force-sealing

@tomusdrw tomusdrw added Z0-unconfirmed 🤔 Issue might be valid, but it’s not yet known. M4-core ⛓ Core client code / Rust. labels Oct 13, 2016
@rphmeier
Contributor

That's probably a RocksDB OOM issue, judging by the SIGSEGV.

@arkpar arkpar added F2-bug 🐞 The client fails to follow expected behavior. and removed Z0-unconfirmed 🤔 Issue might be valid, but it’s not yet known. labels Oct 14, 2016
@arkpar
Copy link
Collaborator

arkpar commented Oct 14, 2016

Could not reproduce on my local VM (Ubuntu 14.04).
Reproduced on the DigitalOcean 4 GB machine (Ubuntu 15), though.

@pyskell
Contributor

pyskell commented Oct 18, 2016

Adding some more info to this based on the suggestion from @keorn

This doesn't seem to be tied specifically to many days of runtime: even after restarting Parity, or attempting to sync a fresh copy of the chain from the network, the same issue is encountered. So even a brand new machine running the latest version of Parity is unable to sync on either network. Even the newer parity restore <snapshot> does not work (my earlier comment was in error). The only thing that has worked is fully downloading another user's blockchain.
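(For context, parity restore is just a subcommand that imports a previously taken state snapshot; a minimal invocation, with a purely hypothetical filename for a snapshot produced beforehand on a healthy node, looks like this:)

$ parity restore /path/to/state.snapshot    # hypothetical path; imports the snapshot into the local DB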

While this seems to be triggered by a heavy stretch of blocks (around 2,420,000), possibly related to the recent exploit, it's important to note that it even failed to sync freshly from the network on a VPS with 16 GB of RAM and 8 CPUs (the Digital Ocean $160 droplet option). As such, this is a DoS for new nodes attempting to enter the network, even on more than capable machines, and it hints that the issue may not be tied solely to the intense computation required for the exploit blocks.

Also worth noting is that the panic/crash is immediate. So if I start parity to sync a fresh chain, let it crash at the problem block hours later, and then start it again, it will crash within about a second.

My output in particular differs a bit from the original commenter's so I've included it below:

thread 'IO Worker #2' panicked at 'Potential DB corruption encountered: Database missing expected key: 1348…1230', ethcore/src/state.rs:629
stack backtrace:
   1:     0x7f3f8de417b9 - <unknown>
   2:     0x7f3f8de4948c - <unknown>
   3:     0x7f3f8de48359 - <unknown>
   4:     0x7f3f8de48a48 - <unknown>
   5:     0x7f3f8de488a2 - <unknown>
   6:     0x7f3f8de48810 - <unknown>
   7:     0x7f3f8da7f5da - <unknown>
   8:     0x7f3f8da01a4f - <unknown>
   9:     0x7f3f8d9c3e50 - <unknown>
  10:     0x7f3f8da37461 - <unknown>
  11:     0x7f3f8da39837 - <unknown>
  12:     0x7f3f8d9ef69a - <unknown>
  13:     0x7f3f8d8ecab5 - <unknown>
  14:     0x7f3f8de50f76 - <unknown>
  15:     0x7f3f8d94da3e - <unknown>
  16:     0x7f3f8de46ff2 - <unknown>
  17:     0x7f3f8c5830a3 - start_thread
  18:     0x7f3f8cf9387c - clone
  19:                0x0 - <unknown>
2016-10-14 13:07:16  Finishing work, please wait...

I have a working copy of the blockchain here (courtesy of another user) if it can be of any use for debugging: full parity copy

This copy includes the problem blocks, but Parity doesn't need to process them, so the remaining blocks sync as normal.

@inmathwetrust

I am also affected by this issue as soon as I run the executable.
Running the client on Ubuntu 16.04.1 LTS.

Stage 3 block verification failed for #2422712 (a1b3…1ce4)
Error: Block(UnknownParent(1ec2be8ab88022c770b1e76ba0147c6e16e28d88e274947f038fdc1b54552f81))

Is there a workaround for this issue, or an ETA for a fix? Thanks.

@pyskell
Contributor

pyskell commented Oct 19, 2016

@inmathwetrust

You can download my copy at the "full parity node" link and copy the DB to your .parity folder.

Just two things to keep in mind:

  • This is for the ETC network, not Ethereum
  • Make sure you don't overwrite any keys you might have stored in your .parity folder
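A rough sketch of that copy, keeping both caveats in mind (the directory names below are placeholders/assumptions rather than the exact layout on every platform; stop Parity first and leave your keys directory untouched):

$ mv ~/.parity/<chain-db-dir> ~/.parity/<chain-db-dir>.broken    # set the corrupted chain DB aside
$ cp -r /path/to/downloaded/<chain-db-dir> ~/.parity/            # drop in the downloaded copy
$ ls ~/.parity/keys                                              # sanity check: your keys should still be there, untouched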

@arkpar
Collaborator

arkpar commented Oct 21, 2016

This should be fixed in 1.3.9. Please let us know if you see it again.

@kenzaka07

kenzaka07 commented Oct 23, 2016

Hi.

I was using Parity 1.3.9. Everything was going well, though syncing was slow, until it hit this issue and would not let me sync past block #2451318. Every time I restart Parity, it crashes again. This is the first time I have encountered such an issue since I started using 1.3.0, all the way through 1.3.9.

Please let me know what I should do. I am now behind the latest block because syncing has been slow recently.

2016-10-23 19:14:29  Starting Parity/v1.3.9-beta-e9987c4-20161021/x86_64-windows-msvc/rustc1.12.0
2016-10-23 19:14:29  Using state DB journalling strategy fast
2016-10-23 19:14:29  Configured for Frontier/Homestead using Ethash engine
2016-10-23 19:14:42  NAT mapped to external address 112.201.176.90:58848
2016-10-23 19:14:42  Public node URL: enode://fd8891a24d019c70283d26f53ada8ae04309f42c1478777a733d5061428216f788ed2783297da0328127445f2dd308c1122e307fae67e1613241c707eff8e172@112.201.176.90:58848+60778
2016-10-23 19:14:50  Syncing #2451318 dd33…ffe9      0 blk/s    0 tx/s   0 Mgas/s       0+    0 Qed   #2451318    5/ 5/25 peers     18 MiB db    8 KiB chain  0 bytes queue   11 KiB sync
2016-10-23 19:15:04  Syncing #2451318 dd33…ffe9      0 blk/s    0 tx/s   0 Mgas/s       0+    0 Qed   #2451318    1/ 3/25 peers     18 MiB db    8 KiB chain  0 bytes queue   19 KiB sync
2016-10-23 19:15:04  Syncing #2451318 dd33…ffe9      0 blk/s    0 tx/s   0 Mgas/s       0+    0 Qed   #2451318    1/ 3/25 peers     18 MiB db    8 KiB chain  0 bytes queue   19 KiB sync
2016-10-23 19:15:04  Syncing #2451318 dd33…ffe9      0 blk/s    0 tx/s   0 Mgas/s       0+    0 Qed   #2451318    1/ 3/25 peers     18 MiB db    8 KiB chain  0 bytes queue   19 KiB sync
2016-10-23 19:15:12  Syncing #2451318 dd33…ffe9      0 blk/s    0 tx/s   0 Mgas/s       0+    0 Qed   #2451318    4/ 5/25 peers     18 MiB db    8 KiB chain  0 bytes queue  130 KiB sync
thread 'Verifier #0' panicked at 'DB flush failed.: "Corruption: block checksum mismatch"', ../src/libcore\result.rs:788
stack backtrace:
   0:     0x7ff67bb1346e - <unknown>
   1:     0x7ff67bb11363 - <unknown>
   2:     0x7ff67bb11e2d - <unknown>
   3:     0x7ff67bb11c76 - <unknown>
   4:     0x7ff67bb11bd4 - <unknown>
   5:     0x7ff67bb11b6b - <unknown>
   6:     0x7ff67bb1edb5 - <unknown>
   7:     0x7ff67ba2419a - <unknown>
   8:     0x7ff67b768069 - <unknown>
   9:     0x7ff67b5c037f - <unknown>
  10:     0x7ff67b62039a - <unknown>
  11:     0x7ff67bb15631 - <unknown>
  12:     0x7ff67b6818cb - <unknown>
  13:     0x7ff67bb0f15e - <unknown>
  14:     0x7ffd1dc48363 - BaseThreadInitThunk
2016-10-23 19:15:19  Finishing work, please wait...
thread 'Verifier #1' panicked at 'DB flush failed.: "Corruption: block checksum mismatch"', ../src/libcore\result.rs:788
stack backtrace:
   0:     0x7ff67bb1346e - <unknown>
   1:     0x7ff67bb11363 - <unknown>
   2:     0x7ff67bb11e2d - <unknown>
   3:     0x7ff67bb11c76 - <unknown>
   4:     0x7ff67bb11bd4 - <unknown>
   5:     0x7ff67bb11b6b - <unknown>
   6:     0x7ff67bb1edb5 - <unknown>
   7:     0x7ff67ba2419a - <unknown>
   8:     0x7ff67b768069 - <unknown>
   9:     0x7ff67b5c037f - <unknown>
  10:     0x7ff67b62039a - <unknown>
  11:     0x7ff67bb15631 - <unknown>
  12:     0x7ff67b6818cb - <unknown>
  13:     0x7ff67bb0f15e - <unknown>
  14:     0x7ffd1dc48363 - BaseThreadInitThunk

@gavofyork
Contributor

gavofyork commented Oct 27, 2016

This is fixed in master (#2832) and will be fixed in the 1.3.10 stable release. Please test when those are released and reopen if the issue reappears.

@5chdn
Contributor

5chdn commented Jul 10, 2017

A user reported this issue with the latest beta, 1.6.8. Is this the very same issue?

[screenshot attachment: image001]

@arkpar
Collaborator

arkpar commented Jul 10, 2017

@5chdn probably not. Was there an out-of-memory or out-of-disk error on a prior run?

@5chdn
Contributor

5chdn commented Jul 11, 2017

@arkpar I can't tell; I was guiding him through accessing the node logs and this is the first time he has looked at them. We have now reset the db and it works.
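(For anyone else landing here: "reset the db" means wiping the chain database and re-syncing while keeping the keys. A minimal sketch, assuming the db kill subcommand is available in your Parity version; otherwise remove the chain directory under the Parity data dir by hand:)

$ parity db kill    # removes the blockchain database for the configured chain and forces a full re-sync; keys are kept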
