Import failed: Slot number must increase #2636

Closed
EclesioMeloJunior opened this issue Jul 1, 2022 · 9 comments · Fixed by #2726

@EclesioMeloJunior
Member

EclesioMeloJunior commented Jul 1, 2022

Describe the bug

  • While running a gossamer node alongside two substrate nodes, I noticed that a substrate node throws the following error:
2022-06-30 14:32:51 💔 Error importing block 0xf919e19068653baf95680af89d1751ce43cf4cc9422c0fa8e98300f470a430fc: consensus error: Import failed: Slot number must increase: parent slot: 414153490, this slot: 414153490

This block was created by the gossamer node:

2022-06-30T14:32:42-04:00 INFO built block 155 with hash 0xf919e19068653baf95680af89d1751ce43cf4cc9422c0fa8e98300f470a430fc, state root 0x8305859db24d16d1a34cee766dbf7d5b3132f3d4d0c57d31f21ec9e765752d68, epoch 6 and slot 414153490	babe.go:L541	pkg=babe

The slot number in the error message was claimed by the gossamer node as a secondary VRF slot in epoch 6:

2022-06-30T14:32:40-04:00 DBUG claimed secondary slot, for slot number: 414153490	crypto.go:L110	pkg=babe
2022-06-30T14:32:40-04:00 DBUG epoch 6: claimed secondary vrf slot 414153490	epoch.go:L239	pkg=babe

logs at https://gist.github.com/kishansagathiya/1bfbc988015a37c884beae7630a8169f
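
For readers unfamiliar with the check, here is a minimal Go sketch of the rule the error message implies: a block's slot must be strictly greater than its parent's slot. The function name `checkSlotIncreases` is hypothetical, and this is not Substrate's actual code, only an illustration of why equal slots are rejected.

```go
package main

import "fmt"

// checkSlotIncreases is a hypothetical helper illustrating the rule implied
// by the error message: a block's slot must be strictly greater than the
// slot of its parent. Equal slots are rejected.
func checkSlotIncreases(parentSlot, thisSlot uint64) error {
	if thisSlot <= parentSlot {
		return fmt.Errorf("slot number must increase: parent slot: %d, this slot: %d",
			parentSlot, thisSlot)
	}
	return nil
}

func main() {
	// Same values as in the error above: parent and child share a slot.
	fmt.Println(checkSlotIncreases(414153490, 414153490))
	// A child in the following slot passes the check.
	fmt.Println(checkSlotIncreases(414153490, 414153491))
}
```

The first call returns an error because both slots are 414153490; the second returns nil.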

@kishansagathiya
Contributor

I don't think this is related to secondary slot claiming. I have seen this happen for primary blocks as well. It has more to do with timing inaccuracies. The error occurs only for the first slot in the epoch.
If I just change

if authoringSlot < currSlot {
to

if authoringSlot <= currSlot {

we would not see the problem anymore. But that would mean that the first slot would never get used, so that's bad.
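
To make the trade-off concrete, here is a small Go sketch of the two comparisons. Only `authoringSlot` and `currSlot` come from the snippet above; the function `authoredSlots`, the loop around the check, and the slot numbers are assumptions made for illustration and do not reproduce the real gossamer code.

```go
package main

import "fmt"

// authoredSlots returns the slots in [epochStart, epochStart+epochLength)
// that survive the skip check. With inclusive=false the check is
// `authoringSlot < currSlot`; with inclusive=true it is
// `authoringSlot <= currSlot`. The surrounding loop is an assumption.
func authoredSlots(epochStart, epochLength, currSlot uint64, inclusive bool) []uint64 {
	var kept []uint64
	for authoringSlot := epochStart; authoringSlot < epochStart+epochLength; authoringSlot++ {
		if authoringSlot < currSlot || (inclusive && authoringSlot == currSlot) {
			continue // slot is treated as already past
		}
		kept = append(kept, authoringSlot)
	}
	return kept
}

func main() {
	// The node enters the epoch while its first slot (100) is still ongoing.
	fmt.Println(authoredSlots(100, 4, 100, false)) // [100 101 102 103]
	fmt.Println(authoredSlots(100, 4, 100, true))  // [101 102 103]
}
```

With `<=`, the ongoing first slot is skipped along with the past ones, which is why the error disappears and also why that slot is never used.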

@kishansagathiya
Contributor

A different error this time: block has an unknown parent, fun!!

2022-07-18 17:19:07 💔 Error importing block 0xaf9a9621dd46c50b668967ccc78f983aad789127658143b780296121e34c0f97: block has an unknown parent    

@kishansagathiya
Contributor

kishansagathiya commented Jul 18, 2022

2022-07-18T17:14:48+05:30 DBUG reporting reputation change of -2147483648 to peer 12D3KooWNQGD8BoRkEmV4EXf8FuL4qZXmX1Ch4MvuH1zo4CEqouQ, reason: Genesis mismatch	handler.go:L81	pkg=peerset
2022-07-18T17:14:48+05:30 TRCE failed to validate handshake from peer 12D3KooWNQGD8BoRkEmV4EXf8FuL4qZXmX1Ch4MvuH1zo4CEqouQ using protocol /gssmr_test/block-announces/1: genesis hash mismatch	notifications.go:L177	pkg=network
2022-07-18T17:14:48+05:30 TRCE failed to handle message BlockAnnounceHandshake Roles=124 BestBlockNumber=2727946398 BestBlockHash=0x203d1723c21ab01fd0c7d98a859d5bf623e7be7b1a8ccdce3330fc80cdeb6791 GenesisHash=0x988d2e4f8c4f992b98370ccc21b36dc1af6409dd67381243743e2b8aae028021 from stream id 12D3KooWNQ-1-7: failed to validate handshake	inbound.go:L48	pkg=network

This might be the reason! In any case, I see many of these errors, with or without Import failed: Slot number must increase, every time I try to replicate this bug.

@EclesioMeloJunior
Member Author

EclesioMeloJunior commented Jul 18, 2022

@kishansagathiya I don't think this is the reason. The genesis hash mismatch error prevents us from syncing with those peers at all. Could you compare the block 0 hash in gossamer and in the peer? They should be the same, otherwise blocks will not be imported. Maybe the genesis spec you're using is different from the one used by the peer.

@kishansagathiya
Contributor

kishansagathiya commented Jul 18, 2022

> @kishansagathiya I don't think this is the reason. The genesis hash mismatch error prevents us from syncing with those peers at all. Could you compare the block 0 hash in gossamer and in the peer? They should be the same, otherwise blocks will not be imported. Maybe the genesis spec you're using is different from the one used by the peer.

Seeing genesis hash mismatch doesn't necessarily mean that the genesis hashes actually differ. The problem is with what we are decoding. An example of that is described in #2435 (comment).

@EclesioMeloJunior
Member Author

🤔 Hm, did the error you reported here -> #2636 (comment) happen at the beginning of the sync, when both peers open a connection? Or did it happen after the peers had been connected for some time, exchanging blocks?

@EclesioMeloJunior
Member Author

EclesioMeloJunior commented Jul 18, 2022

> Seeing genesis hash mismatch doesn't necessarily mean that the genesis hashes actually differ. The problem is with what we are decoding. An example of that is described in #2435 (comment).

Could you post some logs showing what is being decoded? I don't believe the genesis hash mismatch error comes from a decoding issue; I think it comes from a misconfiguration between the peers. It is common to see genesis hash mismatch when the two peers have different genesis hashes.

@kishansagathiya
Contributor

So, what I am noticing here is that substrate starts a new epoch by proposing a block at the epoch's start slot. Let's say this is the xth block. Gossamer is able to import this block. But then gossamer also builds a block at the same slot, and it builds it as the (x+1)th block, on top of the block it just imported.

I believe gossamer should check the slot in the blocks that it imports. If gossamer believes that it is its time to author a block, it should ignore the received block and, instead of authoring the (x+1)th block, author the xth block.
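
A rough Go sketch of that proposal follows. The `header` type, the `parentForSlot` function, and the block and slot numbers are all hypothetical; this only illustrates the idea of picking a parent whose slot is strictly lower than the slot being authored, and is not necessarily what the eventual fix does.

```go
package main

import "fmt"

// header is a hypothetical, stripped-down block header.
type header struct {
	number uint64
	slot   uint64
	parent *header
}

// parentForSlot walks back from the best known block until it finds a block
// whose slot is strictly lower than the slot being authored. Building on that
// block keeps the "slot number must increase" rule satisfied.
func parentForSlot(best *header, authoringSlot uint64) *header {
	for h := best; h != nil; h = h.parent {
		if h.slot < authoringSlot {
			return h
		}
	}
	return nil
}

func main() {
	prev := &header{number: 9, slot: 41}
	// Block 10 was received from a peer and already occupies slot 42.
	imported := &header{number: 10, slot: 42, parent: prev}

	p := parentForSlot(imported, 42)
	fmt.Printf("author block %d on top of block %d\n", p.number+1, p.number)
}
```

Here the node authors its own block 10 on top of block 9 instead of stacking block 11 on the imported block 10. That produces two competing blocks for the same slot, so fork choice would still have to pick one of them.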

@kishansagathiya
Contributor

I was thinking about what could cause the above scenario.

  • An inconsistency in slot calculation: maybe the Rust code and the Go code, running at the same time, do not produce the same slot number. Otherwise, why would I have a block for a slot that is still ongoing?
  • Also, should I even import a block for the ongoing slot? I feel like the answer is no.
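
As a sanity check on the slot arithmetic, here is a minimal Go sketch that derives a slot number as wall-clock time divided by slot duration. The function name `slotFromUnixMillis` is hypothetical, and the 4-second slot duration is an assumption: it is not stated in this thread, but it is the value that reproduces the slot number in the logs from the first comment.

```go
package main

import (
	"fmt"
	"time"
)

// slotFromUnixMillis computes a slot number as the Unix timestamp in
// milliseconds divided by the slot duration in milliseconds.
func slotFromUnixMillis(unixMillis, slotDurationMillis uint64) uint64 {
	return unixMillis / slotDurationMillis
}

func main() {
	// Assumption: 4-second slots (not stated in the thread).
	const slotDurationMillis = 4000

	// Timestamp of the "built block 155" log line from the first comment.
	built, err := time.Parse(time.RFC3339, "2022-06-30T14:32:42-04:00")
	if err != nil {
		panic(err)
	}
	fmt.Println(slotFromUnixMillis(uint64(built.UnixMilli()), slotDurationMillis)) // 414153490
}
```

Under that assumption, slot 414153490 covers 14:32:40 to 14:32:44 local time, so the block built at 14:32:42 was authored while that slot was still ongoing. Two nodes whose clocks or rounding differ slightly could disagree about the current slot near a boundary.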
