This repository has been archived by the owner on Nov 6, 2020. It is now read-only.

parity crashes with too many open files #8813

Closed
XertroV opened this issue Jun 5, 2018 · 16 comments
Labels
F1-panic 🔨 The client panics and exits without proper error handling. M4-core ⛓ Core client code / Rust. P2-asap 🌊 No need to stop dead in your tracks, however issue should be addressed as soon as possible.

Comments

@XertroV
Contributor

XertroV commented Jun 5, 2018

I'm running:

  • Which Parity version?: 1.9.5
  • Which operating system?: Linux
  • How installed?: via installer binaries (github releases)
  • Are you fully synchronized?: yes
  • Which network are you connected to?: ethereum
  • Did you try to restart the node?: yes

Node regularly crashes due to "too many open files"

Going to bump up ulimit and upgrade nodes - will let you know if this issue persists.

(Reporting because the logs tell me to)

Note: although we're having issues with the 1.10.4 node still, I can't find the same crash occurring.

One thing you'll note (here's a sample of two lines from the logs):

Jun 05 **17:59:37** eth-eu-node-01 parity[16742]: 2018-06-05 17:59:37 UTC Updated conversion rate to Ξ1 = US$608.48 (195647540 wei/gas)
Jun 05 **19:16:33** eth-eu-node-01 parity[16742]: 2018-06-05 19:16:33 UTC Removed existing file '/home/ubuntu/.local/share/io.parity.ethereum/jsonrpc.ip

After the crash and restart, the node appears to do nothing for about 75 minutes. During this time it runs at 100% CPU on one core and uses around 1-2 GB of RAM.
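For anyone who wants to find stalls like this in a longer journal, here is a minimal Python sketch that scans journal lines for silent gaps. It assumes every line starts with a `Mon DD HH:MM:SS` stamp as in the samples above and that all lines fall in one year. The function names and the 10-minute threshold are illustrative choices, and the sample lines are shortened copies of the two lines quoted above.

```python
from datetime import datetime

def parse_journal_time(line: str, year: int = 2018) -> datetime:
    """Parse the leading 'Mon DD HH:MM:SS' stamp of a journal line."""
    # Drop any '*' emphasis markers that were pasted around the timestamp.
    stamp = " ".join(line.split()[:3]).replace("*", "")
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")

def find_gaps(lines, threshold_seconds=600):
    """Yield (seconds, previous_line, next_line) for each gap above the threshold."""
    previous = None
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        current = parse_journal_time(line)
        if previous is not None:
            gap = (current - previous[0]).total_seconds()
            if gap > threshold_seconds:
                yield gap, previous[1], line
        previous = (current, line)

# Shortened copies of the two log lines quoted above.
sample = [
    "Jun 05 17:59:37 eth-eu-node-01 parity[16742]: 2018-06-05 17:59:37 UTC Updated conversion rate",
    "Jun 05 19:16:33 eth-eu-node-01 parity[16742]: 2018-06-05 19:16:33 UTC Removed existing file",
]

for gap, before, after in find_gaps(sample):
    print(f"{gap / 60:.1f} min of silence after: {before[:15]}")
```

On the two sample lines this reports a single gap of 4616 seconds, i.e. just under 77 minutes between 17:59:37 and 19:16:33.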

This is meant to be a production node! (Like we're using it - as a business - in production) (Side note: I've had a worse time with geth, so don't feel too bad)

It's also a full archive node.

Some machine stats:

(this is from a machine running 1.10.4, but the machines are the same - it's also sync'd)

ubuntu@eth-aws-syd-node-02:~$ free -m
              total        used        free      shared  buff/cache   available
Mem:          61406        8318         415          61       52672       52458
Swap:          2047         506        1541

(from the node that was crashing) - basically identical to other prod nodes

ubuntu@eth-eu-node-01:~$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 245562
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 245562
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
ubuntu@eth-eu-node-01:~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
udev             30G     0   30G   0% /dev
tmpfs           6.0G  578M  5.5G  10% /run
/dev/xvda1      7.7G  4.8G  3.0G  62% /
tmpfs            30G     0   30G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs            30G     0   30G   0% /sys/fs/cgroup
/dev/nvme0n1    1.8T  1.2T  552G  68% /mnt/eth
tmpfs           6.0G     0  6.0G   0% /run/user/1000

config:

[websockets]
disable = true

[footprint]
cache_size_queue = 4000
cache_size_state = 4000
cache_size_blocks = 4000
cache_size_db = 14000
pruning = "archive"

[secretstore]
disable = true

[parity]
auto_update = "all"
release_track = "stable"
identity = "securevote-eth-syd-node-02"

[rpc]
apis = ["web3", "eth", "net", "rpc"]
port = 38545
hosts = ["*"]
cors = ["*"]
interface = "all"

[ipfs]
interface = "0.0.0.0"

[network]
warp = false
min_peers = 50
max_peers = 500

(also note auto update doesn't seem to be working.. though a few months ago I deleted the download cache folder b/c we were having issues on the beta track)

log samples of crash

Jun 05 17:59:27 eth-eu-node-01 parity[31580]: 2018-06-05 17:59:27 UTC Reorg to #5737725 ad0f…0a66 (3f21…53d7 #5737724 583f…ec53 )
Jun 05 17:59:31 eth-eu-node-01 parity[31580]: 2018-06-05 17:59:31 UTC Imported #5737725 3f21…53d7 (176 txs, 7.99 Mgas, 350.03 ms, 30.63 KiB)
Jun 05 17:59:35 eth-eu-node-01 parity[31580]: ====================
Jun 05 17:59:35 eth-eu-node-01 parity[31580]: stack backtrace:
Jun 05 17:59:35 eth-eu-node-01 parity[31580]:    0:     0x560a3558e91c - <no info>
Jun 05 17:59:35 eth-eu-node-01 parity[31580]: Thread 'Verifier #2' panicked at 'Low-level database error. Some issue with your hard disk?: Error(Msg
Jun 05 17:59:35 eth-eu-node-01 parity[31580]: This is a bug. Please report it at:
Jun 05 17:59:35 eth-eu-node-01 parity[31580]:     https://github.com/paritytech/parity/issues/new
Jun 05 17:59:36 eth-eu-node-01 systemd[1]: parity.service: Main process exited, code=exited, status=128/n/a
Jun 05 17:59:36 eth-eu-node-01 systemd[1]: parity.service: Unit entered failed state.
Jun 05 17:59:36 eth-eu-node-01 systemd[1]: parity.service: Failed with result 'exit-code'.
Jun 05 17:59:36 eth-eu-node-01 systemd[1]: parity.service: Service hold-off time over, scheduling restart.
Jun 05 17:59:36 eth-eu-node-01 systemd[1]: Stopped Parity Daemon.
Jun 05 17:59:36 eth-eu-node-01 systemd[1]: Started Parity Daemon.
Jun 05 17:59:36 eth-eu-node-01 parity[16742]: Loading config file from /home/ubuntu/.local/share/io.parity.ethereum/config.toml
Jun 05 17:59:36 eth-eu-node-01 parity[16742]: 2018-06-05 17:59:36 UTC Starting Parity/v1.9.5-stable-ff821da-20180321/x86_64-linux-gnu/rustc1.24.1
Jun 05 17:59:36 eth-eu-node-01 parity[16742]: 2018-06-05 17:59:36 UTC Keys path /home/ubuntu/.local/share/io.parity.ethereum/keys/Foundation
Jun 05 17:59:36 eth-eu-node-01 parity[16742]: 2018-06-05 17:59:36 UTC DB path /home/ubuntu/.local/share/io.parity.ethereum/chains/ethereum/db/906a34
Jun 05 17:59:36 eth-eu-node-01 parity[16742]: 2018-06-05 17:59:36 UTC Path to dapps /home/ubuntu/.local/share/io.parity.ethereum/dapps
Jun 05 17:59:36 eth-eu-node-01 parity[16742]: 2018-06-05 17:59:36 UTC State DB configuration: archive
Jun 05 17:59:36 eth-eu-node-01 parity[16742]: 2018-06-05 17:59:36 UTC Operating mode: active
Jun 05 17:59:36 eth-eu-node-01 parity[16742]: 2018-06-05 17:59:36 UTC Configured for Foundation using Ethash engine
Jun 05 17:59:37 eth-eu-node-01 parity[16742]: 2018-06-05 17:59:37 UTC Updated conversion rate to Ξ1 = US$608.48 (195647540 wei/gas)
Jun 05 19:16:33 eth-eu-node-01 parity[16742]: 2018-06-05 19:16:33 UTC Removed existing file '/home/ubuntu/.local/share/io.parity.ethereum/jsonrpc.ip
Jun 05 19:16:38 eth-eu-node-01 parity[16742]: 2018-06-05 19:16:38 UTC Public node URL: enode://622a5b7b6db8bcc10b4abf3abd1dc64d02e975e129518c029292a
Jun 05 19:16:43 eth-eu-node-01 parity[16742]: 2018-06-05 19:16:43 UTC Syncing #5737726 1764…1f4a     0 blk/s    0 tx/s   0 Mgas/s      0+   28 Qed
Jun 05 19:16:53 eth-eu-node-01 parity[16742]: 2018-06-05 19:16:53 UTC Syncing #5737734 5e23…6c00     0 blk/s  117 tx/s   5 Mgas/s      0+  148 Qed
Jun 05 20:21:46 eth-eu-node-01 parity[16742]: 2018-06-05 20:21:46 UTC   26/50 peers     56 MiB chain   23 MiB db  0 bytes queue  148 KiB sync  RPC:
Jun 05 20:22:21 eth-eu-node-01 parity[16742]: 2018-06-05 20:22:21 UTC Imported #5738239 fece…2814 (40 txs, 7.99 Mgas, 94.09 ms, 12.96 KiB)
Jun 05 20:22:21 eth-eu-node-01 parity[16742]: 2018-06-05 20:22:21 UTC   26/50 peers     56 MiB chain   23 MiB db  0 bytes queue  148 KiB sync  RPC:
Jun 05 20:22:24 eth-eu-node-01 parity[16742]: 2018-06-05 20:22:24 UTC Imported #5738240 b5ab…cb33 (60 txs, 1.77 Mgas, 92.87 ms, 7.89 KiB)
Jun 05 20:22:44 eth-eu-node-01 parity[16742]: 2018-06-05 20:22:44 UTC Incoming streams error, closing sever: Error { repr: Os { code: 24, message: "
Jun 05 20:22:45 eth-eu-node-01 parity[16742]: 2018-06-05 20:22:45 UTC Couldn't open disk map for writing: Too many open files (os error 24)
Jun 05 20:22:45 eth-eu-node-01 parity[16742]: 2018-06-05 20:22:45 UTC Couldn't open disk map for writing: Too many open files (os error 24)
Jun 05 20:22:45 eth-eu-node-01 parity[16742]: 2018-06-05 20:22:45 UTC Couldn't open disk map for writing: Too many open files (os error 24)
Jun 05 20:22:45 eth-eu-node-01 parity[16742]: 2018-06-05 20:22:45 UTC Couldn't open disk map for writing: Too many open files (os error 24)
Jun 05 20:22:45 eth-eu-node-01 parity[16742]: 2018-06-05 20:22:45 UTC Couldn't open disk map for writing: Too many open files (os error 24)
Jun 05 20:22:46 eth-eu-node-01 parity[16742]: ====================
Jun 05 20:22:46 eth-eu-node-01 parity[16742]: stack backtrace:
Jun 05 20:22:46 eth-eu-node-01 parity[16742]:    0:     0x559242d9191c - <no info>
Jun 05 20:22:46 eth-eu-node-01 parity[16742]: Thread 'IO Worker #2' panicked at 'Low-level database error. Some issue with your hard disk?: Error(Ms
Jun 05 20:22:46 eth-eu-node-01 parity[16742]: This is a bug. Please report it at:
Jun 05 20:22:46 eth-eu-node-01 parity[16742]:     https://github.com/paritytech/parity/issues/new
Jun 05 20:22:47 eth-eu-node-01 systemd[1]: parity.service: Main process exited, code=exited, status=128/n/a
Jun 05 20:22:47 eth-eu-node-01 systemd[1]: parity.service: Unit entered failed state.
Jun 05 20:22:47 eth-eu-node-01 systemd[1]: parity.service: Failed with result 'exit-code'.
Jun 05 20:22:47 eth-eu-node-01 systemd[1]: parity.service: Service hold-off time over, scheduling restart.
Jun 05 20:22:47 eth-eu-node-01 systemd[1]: Stopped Parity Daemon.
Jun 05 20:22:47 eth-eu-node-01 systemd[1]: Started Parity Daemon.
Jun 05 20:22:47 eth-eu-node-01 parity[29384]: Loading config file from /home/ubuntu/.local/share/io.parity.ethereum/config.toml
Jun 05 20:22:47 eth-eu-node-01 parity[29384]: 2018-06-05 20:22:47 UTC Starting Parity/v1.9.5-stable-ff821da-20180321/x86_64-linux-gnu/rustc1.24.1
Jun 05 20:22:47 eth-eu-node-01 parity[29384]: 2018-06-05 20:22:47 UTC Keys path /home/ubuntu/.local/share/io.parity.ethereum/keys/Foundation
Jun 05 20:22:47 eth-eu-node-01 parity[29384]: 2018-06-05 20:22:47 UTC DB path /home/ubuntu/.local/share/io.parity.ethereum/chains/ethereum/db/906a34
Jun 05 20:22:47 eth-eu-node-01 parity[29384]: 2018-06-05 20:22:47 UTC Path to dapps /home/ubuntu/.local/share/io.parity.ethereum/dapps
Jun 05 20:22:47 eth-eu-node-01 parity[29384]: 2018-06-05 20:22:47 UTC State DB configuration: archive
Jun 05 20:22:47 eth-eu-node-01 parity[29384]: 2018-06-05 20:22:47 UTC Operating mode: active
Jun 05 20:22:47 eth-eu-node-01 parity[29384]: 2018-06-05 20:22:47 UTC Configured for Foundation using Ethash engine
Jun 05 20:22:48 eth-eu-node-01 parity[29384]: 2018-06-05 20:22:48 UTC Updated conversion rate to Ξ1 = US$608.77 (195554340 wei/gas)
Jun 05 21:36:38 eth-eu-node-01 parity[29384]: 2018-06-05 21:36:38 UTC Removed existing file '/home/ubuntu/.local/share/io.parity.ethereum/jsonrpc.ip
Jun 05 21:36:43 eth-eu-node-01 parity[29384]: 2018-06-05 21:36:43 UTC Public node URL: enode://622a5b7b6db8bcc10b4abf3abd1dc64d02e975e129518c029292a
Jun 05 21:36:43 eth-eu-node-01 parity[29384]: 2018-06-05 21:36:43 UTC Syncing #5738246 68d6…75fa     1 blk/s  197 tx/s   8 Mgas/s      0+  260 Qed
Jun 05 21:36:48 eth-eu-node-01 parity[29384]: 2018-06-05 21:36:48 UTC Syncing #5738255 5778…3cc2     1 blk/s  253 tx/s  14 Mgas/s      0+  248 Qed
@Tbaut
Contributor

Tbaut commented Jun 6, 2018

Please upgrade as this has been solved. Duplicates #8123

@Tbaut Tbaut closed this as completed Jun 6, 2018
@Tbaut Tbaut added Z7-duplicate 🖨 Issue is a duplicate. Closer should comment with a link to the duplicate. F1-panic 🔨 The client panics and exits without proper error handling. labels Jun 6, 2018
@5chdn 5chdn added this to the 1.12 milestone Jun 6, 2018
@XertroV
Contributor Author

XertroV commented Jun 7, 2018

@Tbaut (cc @5chdn) - regarding this being solved: apparently not. Just noticed this (from about 20 min ago, coincidentally only 10 min before downgrading to 1.10.6 as per #8818):

(this occurred in v1.11.3)

(note: using default of 1024 (via ulimit -n))

Jun 07 01:27:11 eth-aws-syd-node-02 parity[7854]: 2018-06-07 01:27:11 UTC    1/50 peers    614 MiB chain   69 MiB db  0 bytes queue   63 KiB sync  RPC:  0 conn,  4 req/s, 524008703 µs
Jun 07 01:31:57 eth-aws-syd-node-02 parity[7854]: 2018-06-07 01:31:57 UTC Incoming streams error, closing sever: Os { code: 24, kind: Other, message: "Too many open files" }
Jun 07 01:31:57 eth-aws-syd-node-02 parity[7854]: 2018-06-07 01:31:57 UTC Incoming streams error, closing sever: Os { code: 24, kind: Other, message: "Too many open files" }
Jun 07 01:31:58 eth-aws-syd-node-02 parity[7854]: 2018-06-07 01:31:58 UTC Incoming streams error, closing sever: Os { code: 24, kind: Other, message: "Too many open files" }
Jun 07 01:31:58 eth-aws-syd-node-02 parity[7854]: 2018-06-07 01:31:58 UTC Incoming streams error, closing sever: Os { code: 24, kind: Other, message: "Too many open files" }
Jun 07 01:31:59 eth-aws-syd-node-02 parity[7854]: 2018-06-07 01:31:59 UTC Incoming streams error, closing sever: Os { code: 24, kind: Other, message: "Too many open files" }
Jun 07 01:31:59 eth-aws-syd-node-02 parity[7854]: ====================
Jun 07 01:31:59 eth-aws-syd-node-02 parity[7854]: stack backtrace:
Jun 07 01:31:59 eth-aws-syd-node-02 parity[7854]:    0:     0x557ea9ada1fc - <no info>
Jun 07 01:31:59 eth-aws-syd-node-02 parity[7854]: Thread 'IO Worker #2' panicked at 'db get failed, key: [2, 77, 88, 252, 73, 60, 64, 76, 43, 220, 134, 96, 67, 213, 206, 137, 92, 186, 158, 47, 13, 101, 212, 65, 175, 20, 236, 84, 178, 207, 180, 198, 191]: Error(Msg("IO error: While open a file for random read: /home/ubuntu/.local/share/io.parity.ethereum/chains/ethereum/db/906a34e69aec8c0d/archive/db/528057.sst: Too many open files"), State { next_error: None, backtrace: None })', libcore/result.rs:945
Jun 07 01:31:59 eth-aws-syd-node-02 parity[7854]: This is a bug. Please report it at:
Jun 07 01:31:59 eth-aws-syd-node-02 parity[7854]:     https://github.com/paritytech/parity/issues/new
Jun 07 01:31:59 eth-aws-syd-node-02 parity[7854]: ====================
Jun 07 01:31:59 eth-aws-syd-node-02 parity[7854]: stack backtrace:
Jun 07 01:31:59 eth-aws-syd-node-02 parity[7854]:    0:     0x557ea9ada1fc - <no info>
Jun 07 01:31:59 eth-aws-syd-node-02 parity[7854]: Thread 'IO Worker #1' panicked at 'db get failed, key: [2, 119, 134, 138, 231, 153, 14, 9, 25, 91, 83, 58, 52, 4, 17, 84, 103, 215, 110, 10, 223, 32, 93, 159, 77, 75, 36, 210, 101, 201, 16, 10, 50]: Error(Msg("IO error: While open a file for random read: /home/ubuntu/.local/share/io.parity.ethereum/chains/ethereum/db/906a34e69aec8c0d/archive/db/528063.sst: Too many open files"), State { next_error: None, backtrace: None })', libcore/result.rs:945
Jun 07 01:31:59 eth-aws-syd-node-02 parity[7854]: This is a bug. Please report it at:
Jun 07 01:31:59 eth-aws-syd-node-02 parity[7854]:     https://github.com/paritytech/parity/issues/new
Jun 07 01:32:00 eth-aws-syd-node-02 parity[30981]: Loading config file from /home/ubuntu/.local/share/io.parity.ethereum/config.toml
Jun 07 01:32:00 eth-aws-syd-node-02 parity[30981]: 2018-06-07 01:32:00 UTC Starting Parity/v1.11.3-beta-a66e36b-20180605/x86_64-linux-gnu/rustc1.26.1
Jun 07 01:32:00 eth-aws-syd-node-02 parity[30981]: 2018-06-07 01:32:00 UTC Keys path /home/ubuntu/.local/share/io.parity.ethereum/keys/Foundation
Jun 07 01:32:00 eth-aws-syd-node-02 parity[30981]: 2018-06-07 01:32:00 UTC DB path /home/ubuntu/.local/share/io.parity.ethereum/chains/ethereum/db/906a34e69aec8c0d
Jun 07 01:32:00 eth-aws-syd-node-02 parity[30981]: 2018-06-07 01:32:00 UTC Path to dapps /home/ubuntu/.local/share/io.parity.ethereum/dapps
Jun 07 01:32:00 eth-aws-syd-node-02 parity[30981]: 2018-06-07 01:32:00 UTC State DB configuration: archive
Jun 07 01:32:00 eth-aws-syd-node-02 parity[30981]: 2018-06-07 01:32:00 UTC Operating mode: active
Jun 07 01:32:05 eth-aws-syd-node-02 parity[30981]: 2018-06-07 01:32:05 UTC Configured for Foundation using Ethash engine
Jun 07 01:32:06 eth-aws-syd-node-02 parity[30981]: 2018-06-07 01:32:06 UTC Running without a persistent transaction queue.
Jun 07 01:32:06 eth-aws-syd-node-02 parity[30981]: 2018-06-07 01:32:06 UTC Removed existing file '/home/ubuntu/.local/share/io.parity.ethereum/jsonrpc.ipc'.
Jun 07 01:32:07 eth-aws-syd-node-02 parity[30981]: 2018-06-07 01:32:07 UTC Sending warning alert CloseNotify
Jun 07 01:32:07 eth-aws-syd-node-02 parity[30981]: 2018-06-07 01:32:07 UTC Updated conversion rate to Ξ1 = US$615.75 (7733503 wei/gas)
Jun 07 01:32:11 eth-aws-syd-node-02 parity[30981]: 2018-06-07 01:32:11 UTC Public node URL: enode://4417851ea3d23f37456a7133feac14b3620e8c9ed6a7c2a09d38968ccd76e02ad3cd2c607c90559d3373eab358fe0328b2241178cb320c9b33cf447252c6af8c@172.31.2.39:30303

@Tbaut Tbaut added M4-core ⛓ Core client code / Rust. and removed Z7-duplicate 🖨 Issue is a duplicate. Closer should comment with a link to the duplicate. labels Jun 7, 2018
@Tbaut
Contributor

Tbaut commented Jun 7, 2018

Can you tell us more regarding what you are doing with these nodes, querying the blockchain over WS according to your config? How heavy is the load?

@Tbaut Tbaut reopened this Jun 7, 2018
@XertroV
Contributor Author

XertroV commented Jun 8, 2018

@Tbaut sure

The purpose of the nodes is arbitrary public historical access. Technically we only need this from certain dates and on certain (somewhat unpredictable) contracts, but I don't think there's any option besides running a full archive node atm. This is to support our voting platform (most code is public atm at @secure-vote). We'd use infura or something like that except they don't provide historical access. We run 3 of these nodes (one in US, one in EU, and one in AUS). The machines are all i3.2xlarge (I think; they have 60 GB of RAM and 1.8 TB SSDs).

These nodes don't need to deal with any local transactions, though the ability to broadcast transactions is useful (just that we don't want them being local)

Some of the issues we were having recently:

  • Some bots (presumably) were abusing the interface via eth_sendRawTransaction, which meant we were getting around 6000 JSON-RPC requests per minute (100-150/sec). Parity was treating them as local transactions and caching them through the persistent txqueue (now disabled with no_persistent_txqueue = true)
  • We were having issues with stability and responsiveness. I think in large part this was due to the above abuse (and maybe some jsonrpc instability too). I've set server_threads and processing_threads both to 15 currently, which hopefully will make things smoother (not sure what defaults were)

We'll be using websockets soon - I just need to configure everything in AWS with load balancers.

I've also taken some steps on the AWS side to prevent any direct connections to the nodes' RPC port (which is, locally, 38545) - only the load balancer and localhost can connect directly now.

The load is not heavy from our side yet, but was very heavy when the nodes were being abused.

Let me know if there's any more specific info that would help you. Very much appreciate the Parity client - it's been much better for us than Geth (which, even on 1.8 or whatever the latest version is, is much slower in archive mode).

Also, as part of the above mitigation the nodes are now 1.10.6 (mentioned in #8818)

@atlanticcrypto

atlanticcrypto commented Jun 8, 2018

Since upgrading to 1.11.3 I have had similar error-out problems. This is happening on two 1.11.3 production nodes, both serving mining functions. Peers are set to 300 / 500. The peer boundary was lowered from 1500 to 500, as with anything over 500 peers this problem happened much faster.

2018-06-08 09:22:22 WARN jsonrpc_http_server Incoming streams error, closing sever: Os { code: 24, kind: Other, message: "Too many open files" }

This was never a problem prior to the 1.11 build. I suspect this has to do with the parallel transaction processing?

Rolling back solves this problem for me.


I have temporarily increased the open file limit on this systemd process to see if that helps - that seems more of a band-aid though, as there was never an open file boundary issue before.

@atlanticcrypto

Increasing open file limit from 4,092 to 500,000 has solved this for me for the moment. Overkill adjustment? Maybe.

I set the limit directly as a parameter in the systemd service file. Raising it with ulimit in the console did not take effect, since a systemd service does not inherit limits set in a login shell.
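To confirm which limit a running service actually got, one option on Linux is to read `/proc/<PID>/limits` for the parity process rather than trusting `ulimit -n` in a shell. Here is a minimal Python sketch of that check. The function names are illustrative, the sample text uses made-up values, and the systemd setting involved is presumably `LimitNOFILE=` in the `[Service]` section, though the comment above does not name it.

```python
from pathlib import Path

def parse_open_files_limit(limits_text: str):
    """Return (soft, hard) from the 'Max open files' row of /proc/<pid>/limits."""
    for row in limits_text.splitlines():
        if row.startswith("Max open files"):
            fields = row.split()
            # Row layout: Max open files <soft> <hard> files
            # Values are returned as strings because they can be 'unlimited'.
            return fields[3], fields[4]
    raise ValueError("no 'Max open files' row found")

def open_files_limit(pid: int):
    """Read the limit that applies to a running process (Linux only)."""
    return parse_open_files_limit(Path(f"/proc/{pid}/limits").read_text())

# Hypothetical excerpt in the /proc/<pid>/limits layout, for illustration only.
sample = (
    "Limit                     Soft Limit           Hard Limit           Units     \n"
    "Max cpu time              unlimited            unlimited            seconds   \n"
    "Max open files            1024                 4096                 files     \n"
)

print(parse_open_files_limit(sample))
```

Calling `open_files_limit(pid)` with the PID of the running service shows whether the value from the service file was applied after a restart.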

@5chdn
Contributor

5chdn commented Jun 12, 2018

Having the same issue with nightly across different machines.

@dvdplm
Collaborator

dvdplm commented Jun 13, 2018

@5chdn @Njcrypto is #8876 relevant for this?

@5chdn 5chdn added the P2-asap 🌊 No need to stop dead in your tracks, however issue should be addressed as soon as possible. label Jun 23, 2018
@5chdn
Contributor

5chdn commented Jun 23, 2018

@XertroV @Njcrypto can you confirm this is fixed (or not) with 1.11.4?

@XertroV
Contributor Author

XertroV commented Jun 24, 2018

@5chdn it's not going to be that easy for me to check now.

I think the original reason we were getting this error was the "abuse" we were having via publicly accessible eth_sendRawTransaction (relating to #8820). I've made two (workaround) mitigations since then: upping the ulimit and cutting off an "easy way" they were sending RPC messages. I'd have to undo those and try to find some spammers :P.

If anyone writes a script (or knows a way) that can spam eth_sendRawTransaction then I can run that against 1.11.4

Even if spam can crash the node, it should be able to be worked around with the new --tx-queue-no-unfamiliar-locals option.

@atlanticcrypto

It has solved the open file issue for me, but #8974 keeps it from being a production client. More on that in that thread.

@5chdn
Contributor

5chdn commented Jun 26, 2018

Thanks

@5chdn 5chdn closed this as completed Jun 26, 2018
@peterbitfly

Ever since upgrading to 1.11.5 parity seems to utilize a large number of open files and I have run into this issue even with an open file limit of 64000.

An old v1.9 node that has been running for more than 2 weeks only has 635 open files (output of ls -l /proc/<PID>/fd | wc -l).

A new 1.11.5 node that has been running for only a day already has over 30000 open files. 30292 of them are of the type lrwx------ 1 <user> <group> 64 Jul 7 12:55 30549 -> socket:[3202699920].
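A breakdown like the one above (sockets versus regular files) can also be produced programmatically. Here is a minimal Python sketch that groups the symlink targets under `/proc/<PID>/fd` by type. It assumes Linux and permission to read that directory. The function names and the coarse categories are illustrative, and the file path in the sample list is hypothetical.

```python
import os
from collections import Counter

def classify_fd_target(target: str) -> str:
    """Map a /proc/<pid>/fd symlink target to a coarse descriptor type."""
    if target.startswith("socket:"):
        return "socket"
    if target.startswith("pipe:"):
        return "pipe"
    if target.startswith("anon_inode:"):
        return "anon_inode"
    return "file"

def count_open_fds(pid: int) -> Counter:
    """Count a process's open descriptors by type (Linux only)."""
    fd_dir = f"/proc/{pid}/fd"
    counts = Counter()
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # descriptor was closed while we were iterating
        counts[classify_fd_target(target)] += 1
    return counts

# Classify a few example targets without touching /proc.
# The .sst path is a hypothetical placeholder.
targets = ["socket:[3202699920]", "/path/to/db/000001.sst", "pipe:[12345]"]
print(Counter(classify_fd_target(t) for t in targets))
```

Note that `ls -l` prints a leading `total` line, so `ls -l /proc/<PID>/fd | wc -l` reports one more than the number of descriptors; counting directory entries as above avoids that off-by-one.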

The node has medium rpc load (a few requests per second) and is used for mining.

@5chdn 5chdn reopened this Jul 7, 2018
@atlanticcrypto

My four 1.11.5 nodes have 875, 872, 748, 621 open files. All nodes have been running 1.11.5 with > 5 days uptime. All nodes are mining.

I was a victim of the 1.11.4 open file issue, and it has been solved for me under my specific configuration.

@5chdn
Contributor

5chdn commented Jul 9, 2018

Can you triple-check your version string? @ppratscher

@5chdn 5chdn closed this as completed Jul 9, 2018
@peterbitfly

Yeah, it could be the case that the binary on the affected node was updated but has not been restarted. We will continue to monitor the situation and reply in case it happens again.
