
Queued buffers not cleaned up in closed sockets #4779

Closed
zz85 opened this issue Jan 20, 2016 · 13 comments
Labels
memory: Issues and PRs related to memory management or memory footprint. net: Issues and PRs related to the net subsystem.

Comments

@zz85

zz85 commented Jan 20, 2016

I initially reported this issue on the ws repo, but after testing on both the official Windows Node.js build and the Chakra build, it looks like it may be a node issue. The issue also seems to reproduce more quickly on Linux/macOS than on Windows.

Observations

  • Slow clients (clients receiving data more slowly than the server sends it) cause sockets to grow their bufferSize
  • This typically increases the RSS of the node process
  • When clients disconnect, we expect RSS to fall
  • This is true for the TCP test case but not for WS (WebSockets)
  • Slower RSS recovery in the TCP test can be triggered by disconnecting most, but not all, of the connected clients
  • RSS recovery can sometimes be triggered by connecting new clients and disconnecting them quickly

Reproduction Steps

  • Test cases are in the https://github.com/zz85/node-ws-slow-client-buffer-leak repository (a minimal sketch of the server side is shown after this list)
  • Run the server on a separate machine over the network
  • Run 200 clients for 30 seconds
  • RSS builds up to approximately 700MB
  • Disconnect the clients
  • RSS falls to about 300MB and stays there (RSS falls to 60MB for Windows Node.js)
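For readers who don't want to clone the repo, here is a minimal sketch of the server side of the idea (not the repo's actual test case; the port, chunk size, and intervals are made up), which keeps writing to slow clients and periodically logs the queued socket bytes against RSS:

    // slow-server-sketch.js -- illustrative only, not the repo's test case.
    // Start with `node --expose-gc slow-server-sketch.js` so global.gc exists,
    // then connect clients that read slowly (or pause their sockets).
    'use strict';
    const net = require('net');

    const chunk = new Buffer(64 * 1024); // Buffer.alloc(64 * 1024) on newer node
    chunk.fill(0);
    const sockets = new Set();

    const server = net.createServer(function (socket) {
      sockets.add(socket);
      // Keep writing regardless of backpressure so data queues up on the socket.
      const timer = setInterval(function () { socket.write(chunk); }, 10);
      socket.on('close', function () {
        clearInterval(timer);
        sockets.delete(socket);
      });
    });
    server.listen(8000);

    setInterval(function () {
      let queued = 0;
      for (const s of sockets) queued += s.bufferSize;
      if (global.gc) global.gc(); // periodic manual gc, as in the tests
      const mem = process.memoryUsage();
      console.log('clients=%d queued=%dMB rss=%dMB heapUsed=%dMB',
                  sockets.size, (queued / 1048576) | 0,
                  (mem.rss / 1048576) | 0, (mem.heapUsed / 1048576) | 0);
    }, 5000);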
@Fishrock123 added the net and memory labels on Jan 20, 2016
@Fishrock123
Contributor

When clients disconnect, we expect RSS to fall

Do we? V8 is pretty aggressive about memory and will hold onto it for future use so that it doesn't need to reallocate. The numbers you are stating here seem normal. I'm not sure about Chakra, but we don't support that at the current time anyway.

@zz85
Author

zz85 commented Jan 20, 2016

I'm not sure this behaviour is caused by V8. If so, why is there a difference between V8 on Windows (I'm referring to the official build, which recovers most of the RSS within 1 minute) and node on *nix, which seems to hold on to the memory indefinitely? (The tests run a manual gc periodically, which doesn't seem to pressure node into reclaiming RSS.)

(Another interesting observation with the Chakra build is that the unsent socket buffers seem to use JS heap memory rather than memory outside the JS engine.)

I also hadn't suspected this to be V8 behaviour, because the V8 heap usage in the tests always falls when the clients disconnect. The unreclaimed memory seems to be buffers, since they are not accounted for in the V8 heap.

If V8 is indeed being aggressive about memory, could that cause buffers outside its heap not to be released? Could this be libuv behaviour instead? Or could the OS be reporting a higher RSS than node actually uses?

@mscdex
Contributor

mscdex commented Jan 20, 2016

What node versions did you test with?

Is it possible this may be related to #4773 since WebSockets (initially) uses http?

@Fishrock123
Contributor

Oh right, I think our Buffers do allocate outside the heap, yeah.

(Another interesting observation with the Chakra build is that the unsent socket buffers seem to use JS heap memory rather than memory outside the JS engine.)

Probably because they haven't implemented the same Buffer API as V8 has. That would mean it still uses TypedArray memory management.

@zz85
Author

zz85 commented Jan 20, 2016

Probably because they haven't implemented the same Buffer API as V8 has. That would mean it still uses TypedArray memory management.

Hmm, interesting.

What node versions did you test with?

I tested with Node 4.2.1 on both Linux and Mac, and with Node 4.2.4 on Windows, since I downloaded that one today.

Is it possible this may be related to #4773 since WebSockets (initially) uses http?

I can try compiling at work tomorrow to see if that helps.

@Fishrock123
Contributor

Hmm, interesting.

That is, V8 has a special C++ API for us that allows us to allocate Buffer memory outside the V8 heap.
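A quick way to see this from the JS side (a sketch; only process.memoryUsage() is used, and the 200MB size is arbitrary): allocating a large Buffer barely moves heapUsed but shows up in RSS.

    // buffer-outside-heap.js -- sketch: Buffer data is not counted in the V8 heap
    'use strict';
    function mb(n) { return (n / 1048576).toFixed(1) + 'MB'; }

    const before = process.memoryUsage();
    const big = new Buffer(200 * 1024 * 1024); // Buffer.alloc() on newer node
    big.fill(1); // touch the pages so they actually show up in RSS
    const after = process.memoryUsage();

    console.log('heapUsed: %s -> %s', mb(before.heapUsed), mb(after.heapUsed)); // roughly unchanged
    console.log('rss:      %s -> %s', mb(before.rss), mb(after.rss));           // grows by ~200MB
    console.log('buffer length:', big.length); // keep a reference so it is not collected early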

@zz85
Author

zz85 commented Jan 20, 2016

Out of curiosity, does "system / JSArrayBufferData" in the heap dump (taken after the clients disconnected) mean that the reference is held by the OS or by Node.js?

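(For reference, a sketch of how such a snapshot can be captured, assuming the third-party heapdump npm module; the filename and the 60-second delay are only illustrative:)

    // capture-snapshot.js -- sketch using the third-party `heapdump` module
    // (`npm install heapdump`); the snapshot can then be loaded in Chrome DevTools.
    'use strict';
    const heapdump = require('heapdump');

    // Write a snapshot some time after clients have disconnected.
    setTimeout(function () {
      heapdump.writeSnapshot('./after-disconnect.heapsnapshot', function (err, filename) {
        if (err) throw err;
        console.log('snapshot written to', filename);
      });
    }, 60 * 1000);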

@bnoordhuis
Member

does "system / JSArrayBufferData" in the heapdump (after clients disconnected) mean that reference is held on by the OS or nodejs?

Node.js - although node isn't necessarily the retainer, that could be something in your app.

Aside, the heap snapshot sure looks like it's #4773.

@zz85
Author

zz85 commented Jan 20, 2016

Node.js - although node isn't necessarily the retainer, that could be something in your app.

Thanks, I'm learning things here :)

@zz85
Author

zz85 commented Jan 21, 2016

I can try compiling at work tomorrow to see if that helps.

So here are some observations and updates from today:

  • Updated https://github.com/zz85/node-ws-slow-client-buffer-leak to include an HTTP server and client to confirm the finding of http: remove reference to onParserExecute #4773. That finding seems accurate.
  • Tests were run on macOS with Node 4.2.5
  • Tests were run again with Node 4.2.5 plus the patch from http: remove reference to onParserExecute #4773
  • Patched HTTP tests: RSS usage ~60MB instead of ~500MB after garbage collection
  • Patched WebSockets tests: RSS usage ~100MB instead of ~200MB after garbage collection (there seems to be some other overhead in WS)
  • If the HTTP server sends only strings (no buffers), garbage collection seems to work well and there is no "buffer leak" even on unpatched Node 4.2.5
  • Another observation: sometimes the queued socket bufferSize can be bigger than the reported RSS. I'm guessing RSS can be under-reported on macOS, since some of that memory might be counted as "Compressed Memory".
  • Still puzzling: I ran the tests with Node 4.2.1 and 4.2.5 on Windows 7 (both downloaded binaries); Node 4.2.1 seems to exhibit the leak, while 4.2.5 keeps RSS low (20MB for the HTTP server, 80MB for the WS server), even lower than the RSS reported on macOS after patching.
  • Still, overall http: remove reference to onParserExecute #4773 does seem to help a lot with node RSS on macOS :)

(Some screenshots below.)

  • WebSockets (stock Node 4.2.5) before client disconnection: [screenshot "ws original - 1"]
  • WebSockets (stock Node 4.2.5) after client disconnection: [screenshot "ws original - 3"]
  • WebSockets (patched Node 4.2.5) before client disconnection: [screenshot "ws patch - 1"]
  • WebSockets (patched Node 4.2.5) after client disconnection: [screenshot "ws patch - 2"]
  • HTTP server (stock Node 4.2.5) before client disconnection: [screenshot "httpserver original - 1"]
  • HTTP server (stock Node 4.2.5) after client disconnection: [screenshot "httpserver original - 2"]
  • HTTP server (patched Node 4.2.5) before client disconnection: [screenshot "httpserver patch - 1"]
  • HTTP server (patched Node 4.2.5) after client disconnection: [screenshot "httpserver patch - 2"]

@ChALkeR
Member

ChALkeR commented Jan 21, 2016

When clients disconnect, we expect RSS to fall

This expectation is incorrect; memory allocation doesn't work like that in general, and the behaviour depends on the underlying memory allocator (e.g. glibc).

Even in simple C/C++ programs, when you create a lot of strings and delete most of them, the memory is not returned to the system but remains reserved for further allocations by the same process (i.e. RSS does not go down), because small chunks of memory are allocated with sbrk by extending the data segment.

If you think this is a bug on the Node.js side, please at least reproduce it with a pedantic memory allocator. For example, check what happens under valgrind --tool=massif. Note that V8 optimizations could also have some impact. Also, small Buffer objects are pooled, which could have an impact, etc.
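To make that concrete in JS terms, here is a minimal sketch (illustrative only, not the original test) showing that dropping references and forcing a gc does not necessarily shrink RSS:

    // rss-retention-sketch.js -- illustrative: freed JS objects do not necessarily
    // shrink RSS, because the allocator keeps the pages for future allocations.
    // Run with `node --expose-gc rss-retention-sketch.js`
    // (optionally under `valgrind --tool=massif` for an allocator-level view).
    'use strict';
    function rss() { return ((process.memoryUsage().rss / 1048576) | 0) + 'MB'; }

    console.log('baseline rss:', rss());

    let strings = [];
    for (let i = 0; i < 1e6; i++) strings.push('string number ' + i); // many small allocations
    console.log('after allocating:', rss());

    strings = null;             // drop all references
    if (global.gc) global.gc(); // collect the now-unreachable strings
    console.log('after gc:', rss()); // typically still well above the baseline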


Update: looking at the comments, it seems that there is an actual issue (unneeded objects being kept referenced). Does #4773 solve the issue completely, or is there still something left?

@zz85
Author

zz85 commented Jan 22, 2016

Does #4773 solve the issue completely, or is there still something left?

I haven't been able to reproduce any significant leak after testing #4773, so I would say that addresses most of my concerns, and it should be fine to close this issue 😄

I guess what still puzzles me is why an unpatched Node 4.2.5 doesn't seem to exhibit this buffer leak. However, that could be the memory allocation behaviour you've mentioned.

Thanks for mentioning valgrind too; I should learn to use it when I face memory issues again (although it's probably one of the last things a JS developer thinks of using :)

@jasnell
Member

jasnell commented Apr 13, 2016

Closing given the last comment.
