parity just hangs in the background after running for some time #2523
Please try upgrading to the latest version (1.3.4) and see if it still hangs.
Will do this in a moment and report back.
No, v1.3.4 didn't fix the issue; it's still happening. I'll try v1.3.5.
@gituser please run with
@arkpar, I did run with that flag when it crashed, and stderr was redirected to the same file. Do you really need the whole big logfile? I've pasted the last 100 lines of it; there are no errors.

However, I have an idea where the bug might reside: I've been running parity v1.3.3 in exactly the same VM for 2 days already without issues, in a screen session (without daemon mode). So the bug might live there; it's worth checking the daemon library you're using (maybe update to the latest?). NOTE: v1.3.1 runs perfectly fine in daemon mode.

As for the current situation: I've compiled the latest v1.3.5 and am now running it through start-stop-daemon, but not using daemon mode and without --log-file; both stdout and stderr are redirected to the logfile. Will see how it goes and report back.
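The foreground-plus-redirect setup described above can be sketched as follows. The parity install path is an assumption, and a stand-in command is used so the redirection idiom itself runs without a parity binary:

```shell
# Run the process in the foreground (no --daemon, no --log-file) and send
# both stdout and stderr to one logfile. With parity this would be e.g.:
#   /usr/local/bin/parity >> "$LOGFILE" 2>&1     # path is hypothetical
LOGFILE=/tmp/parity.log
: > "$LOGFILE"   # truncate for a clean demonstration

# Stand-in command that emits one line on each stream:
sh -c 'echo "out line"; echo "err line" >&2' >> "$LOGFILE" 2>&1

grep -c line "$LOGFILE"   # prints 2: both streams landed in the same file
```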
@gituser if that fails, try the
@arkpar, thank you, I'll try that if everything else fails. So far v1.3.5 has been running OK for ~7 hours without daemon mode.
So it's something to do with daemon mode: exactly the same v1.3.5 hung in daemon mode on one VM, while on another VM parity runs just fine without daemon mode. Here is the log:
The daemon library is at the latest version; it was updated between 1.3.1 and 1.3.4, but the change is minor and does not affect our use case. So I think it's not related to the daemon library but rather to the attacks that were ongoing at the time. @gituser could you test the behaviour with 1.3.9 to see if it's still reproducible?
@tomusdrw sorry for the long silence; somehow I didn't notice there was a reply to this issue. I'm certain it's related to daemon mode. I'll test the latest 1.3.10 and report back in a few days.
With 1.3.10 parity was stuck at block #2415298 forever. I'm not sure if it's related to the new HF; I'm building 1.3.11 now and re-syncing from scratch.
I'm still trying to get synced fully. Here are some more observations:
I also had 3 OOMs on 1.3.11 during the initial sync from scratch on a 5GB + 1GB swap VM.

UPDATE: there is indeed an issue with daemon mode, verified on 1.3.11. This time parity is not stalling; it keeps logging messages about syncing blocks, but in fact it's always behind and never fully synced. If I run without daemon mode, it works just fine.
What is the status of this issue? We currently (parity 1.4.10) face a very similar problem. After some time (this time approx. 2 days), parity stops responding properly to some RPC calls (e.g. getWork and block information, though peerCount works). Logs show continuous synchronisation, but our mining software keeps getting the same work; it looks like parity is not fully synced. After a restart there is no catch-up syncing: parity continues to work as if it had been fully synced all along, and everything is back to normal. This has happened twice (previously on parity 1.4.9) during the last month. I attach two log files: the first from the moment parity started returning the same work from getWork (block at height #3205515), the second from the moment of the restart. The issue is really hard to reproduce (2 times/month), so I cannot attach more detailed logs. We run parity as a systemd service with the following config file:
@zet-tech So the RPC server is responding to those requests, but the responses are incorrect? It seems like a different issue then, most probably related to sync stalling for some reason. If that's the case it would be very helpful if you could provide

As a workaround you may also check
@tomusdrw Yes, the server responded to getWork, but always with the same value for about 8h. After giving it some thought, I agree that it must be some sync-related problem. But if sync were stalled, then after a parity restart there should be some syncing to catch up with the blockchain. There was none (see logs in restart.txt), and after the restart everything worked fine. It looks like parity continued to sync properly, but RPC was pinned at a given block and unaware of any further synced blocks. Running beta in our environment is currently not possible due to some RPC changes. I've configured a new node with sync=trace, but as I mentioned we probably need to wait some time (2-3 weeks) for this bug to reproduce.
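A minimal sketch of detecting the "RPC pinned at one block" condition described above, assuming the standard JSON-RPC endpoint on 127.0.0.1:8545 (the poster's actual config isn't shown): poll eth_blockNumber and treat an unchanged head across samples as stuck. The helper names are hypothetical.

```shell
# Query the node's reported head block (hex string) over JSON-RPC.
rpc_block() {
  curl -s -X POST -H 'Content-Type: application/json' \
    --data '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' \
    http://127.0.0.1:8545 | sed 's/.*"result":"\(0x[0-9a-f]*\)".*/\1/'
}

# Two samples with the same head suggest the stall described above.
is_stuck() { [ "$1" = "$2" ]; }

# The comparison logic, shown with fixed sample values:
is_stuck 0x30e88b 0x30e88b && echo "stuck"       # prints stuck
is_stuck 0x30e88b 0x30e88c || echo "advancing"   # prints advancing
```

In a real monitor you would call `rpc_block` a few minutes apart and restart the service when `is_stuck` holds for several samples in a row.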
@zet-tech were you able to reproduce this? Please note that 1.5.x will be the stable branch soon, so you might have to update your RPC calls anyway.
(Un)Fortunately we are still waiting for this bug to reproduce with sync=trace enabled. But maybe it won't, because we started restarting parity every few days. If we manage to reproduce this error and gather some logs, we will file a new issue.
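The periodic-restart workaround mentioned above could be expressed as a cron entry; the schedule, file path, and unit name below are assumptions, since the thread doesn't say how the restarts were actually done:

```shell
# /etc/cron.d/parity-restart (hypothetical): restart the parity systemd
# service every third day at 04:00 as a stopgap for the sync stall.
0 4 */3 * *  root  systemctl restart parity.service
```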
Ok. |
Hi.
Been investigating this issue with @tomusdrw but without any luck; maybe someone else has encountered this as well.
Parity hangs after some time. Sometimes it runs for 5 hours without issues, sometimes 10 hours, sometimes 24 hours; the running time is different every time. On restart it starts syncing again without problems.
I run parity version:
version Parity/v1.3.4-beta-a8b2cf9-20161006/x86_64-linux-gnu/rustc1.12.0
(beta + a freshly ported rocksDB commit, ethcore/parity@e380955). Before that I tried the plain beta version
Parity/v1.3.4-beta-50021c7-20161005/x86_64-linux-gnu/rustc1.12.0
Same situation. I also ran parity without any load in a fresh VM, and it hung there as well with the same symptoms.
By contrast, v1.3.1 is running just fine in another VM.
All VMs are identical and running Debian Jessie 8.0 x64.
The worst part of the situation: parity stops responding to IPC/RPC but still hangs around in the background. There is no crash or anything; it just stops doing anything altogether, but keeps running, and you can only kill it with kill -9.
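The diagnosis steps above (confirm the process is alive but unresponsive, grab thread backtraces, then SIGKILL) can be sketched like this. A sleep process stands in for the hung parity so the snippet runs anywhere; the gdb invocation is left as a comment since it needs gdb installed plus ptrace permission:

```shell
# Stand-in for the hung daemon; with the real thing you would take the
# PID from the pidfile or from: pgrep -f parity
sleep 60 &
PID=$!

# Is the process still scheduled? S = normal sleep; D = uninterruptible
# sleep, which usually points at stuck disk I/O.
awk '/^State:/ {print $2}' /proc/"$PID"/status

# Thread backtraces for a bug report (how a "gdb threads" dump is taken):
#   gdb -p "$PID" -batch -ex 'thread apply all bt' > parity-threads.txt

# SIGTERM is ignored by the hung process, so SIGKILL is the only way out:
kill -9 "$PID"
```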
ps output:
strace output: https://paste.sh/0XCbZnHK#8WLrI9BqLQvFvQi9UudoYalu
Could it be something time-related? But I have ntpd set up, and why doesn't it reproduce on v1.3.1?
On all my servers I'm running ntpd to adjust the time from my ISP's time servers.
gdb threads:
Last 100 lines of log: