Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pipe: enable writev for pipe handles on Unix #10677

Closed
wants to merge 1 commit into from

Conversation

aqrln
Copy link
Contributor

@aqrln aqrln commented Jan 7, 2017

This PR enables writev for Unix Domain Sockets on supported platforms thus enabling cork/uncork functionality for them and increasing IPC performance.

Only those Unix-like systems that are confirmed to support the feature (Linux, macOS and FreeBSD) are explicitly listed in the define guard. If anyone has an opportunity to test it under other platforms supported by Node, the list should be expanded, inverted into list of platforms that do not support the feature or even changed to a simple Windows/Unix check.

It should fix #5095 and similar issues, but I haven't benchmarked this specific case and haven't prepared any public benchmarks yet. In my usecase it gives about 1.3–1.4x performance increase on Linux (but strangely the same performance on macOS).

UPD: after discussion on IRC and experiment on CI the patch is enabled on all platforms except Windows.

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • commit message follows commit guidelines
Affected core subsystem(s)

pipe, stream_wrap

@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. net Issues and PRs related to the net subsystem. lts-watch-v6.x labels Jan 7, 2017
@jasnell
Copy link
Member

jasnell commented Jan 7, 2017

/cc @indutny @bnoordhuis

@aqrln
Copy link
Contributor Author

aqrln commented Jan 8, 2017

I ran node_redis benchmark and this patch indeed seems to fix #5095.

Full results: https://gist.github.com/aqrln/6fb1af97459e207ec654e75a6fc4cd00

Comparison of benchmarks before and after the patch on a Linux VM:

Comparing before, linux-before.txt ( 37 lines) to after, linux-after.txt ( 37 lines)
clients: 1:        0 ->      0 ops/sec (∆ -      0 -    NaN%)
PING:          27089 ->  30315 ops/sec (∆ +   3226 +  11.91%)
PING:         618513 -> 668113 ops/sec (∆ +  49600 +   8.02%)
SET 4B str:    25407 ->  26432 ops/sec (∆ +   1025 +   4.03%)
SET 4B str:   432767 -> 446921 ops/sec (∆ +  14154 +   3.27%)
SET 4B buf:    12026 ->  12261 ops/sec (∆ +    235 +   1.95%)
SET 4B buf:    17086 -> 171152 ops/sec (∆ + 154066 + 901.71%)
GET 4B str:    28040 ->  28977 ops/sec (∆ +    937 +   3.34%)
GET 4B str:   507737 -> 525030 ops/sec (∆ +  17293 +   3.41%)
GET 4B buf:    23984 ->  27532 ops/sec (∆ +   3548 +  14.79%)
GET 4B buf:   454798 -> 523850 ops/sec (∆ +  69052 +  15.18%)
SET 4KiB str:  20527 ->  23892 ops/sec (∆ +   3365 +  16.39%)
SET 4KiB str: 140304 -> 154618 ops/sec (∆ +  14314 +  10.20%)
SET 4KiB buf:  10802 ->  11967 ops/sec (∆ +   1165 +  10.79%)
SET 4KiB buf:  15961 -> 129128 ops/sec (∆ + 113167 + 709.02%)
GET 4KiB str:  23923 ->  25663 ops/sec (∆ +   1740 +   7.27%)
GET 4KiB str: 171351 -> 187105 ops/sec (∆ +  15754 +   9.19%)
GET 4KiB buf:  23174 ->  26569 ops/sec (∆ +   3395 +  14.65%)
GET 4KiB buf: 144582 -> 163794 ops/sec (∆ +  19212 +  13.29%)
INCR:          25148 ->  27120 ops/sec (∆ +   1972 +   7.84%)
INCR:         471591 -> 534386 ops/sec (∆ +  62795 +  13.32%)
LPUSH:         24301 ->  26746 ops/sec (∆ +   2445 +  10.06%)
LPUSH:        383287 -> 441743 ops/sec (∆ +  58456 +  15.25%)
LRANGE 10:     23138 ->  25018 ops/sec (∆ +   1880 +   8.13%)
LRANGE 10:    204398 -> 236106 ops/sec (∆ +  31708 +  15.51%)
LRANGE 100:    15573 ->  17351 ops/sec (∆ +   1778 +  11.42%)
LRANGE 100:    33627 ->  37870 ops/sec (∆ +   4243 +  12.62%)
SET 4MiB str:    196 ->    215 ops/sec (∆ +     19 +   9.69%)
SET 4MiB str:    193 ->    221 ops/sec (∆ +     28 +  14.51%)
SET 4MiB buf:    398 ->    461 ops/sec (∆ +     63 +  15.83%)
SET 4MiB buf:    415 ->    468 ops/sec (∆ +     53 +  12.77%)
GET 4MiB str:    415 ->    466 ops/sec (∆ +     51 +  12.29%)
GET 4MiB str:    211 ->    227 ops/sec (∆ +     16 +   7.58%)
GET 4MiB buf:    295 ->    321 ops/sec (∆ +     26 +   8.81%)
GET 4MiB buf:    296 ->    318 ops/sec (∆ +     22 +   7.43%)
:              85250 ->  85161 ops/sec (∆ -     89 -   0.10%)
Mean difference in ops/sec:  +18075.4

@addaleax
Copy link
Member

addaleax commented Jan 8, 2017

});

const writev = connection._writev.bind(connection);
connection._writev = common.mustCall(writev);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clever trick!

throw err;
});

server.listen(common.PIPE, startClient.bind(null, server));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’d find () => startClient(server) a bit easier to read than .bind()

});

client.on('data', common.mustCall((data) => {
assert.equal(data.toString(), 'ping');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use assert.strictEqual here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, it's better to use strictEqual.


common.refreshTmpDir();

function startServer() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn’t really need to be wrapped in a function, does it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't. It is only for readability and consistency with startClient.

Copy link
Contributor Author

@aqrln aqrln Jan 8, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...well, at least if this startClient function exists. I'll simplify the code so that it resolves this remark and the one related to bind at once.

Copy link
Member

@addaleax addaleax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but I’d prefer it if @bnoordhuis or @indutny had a look at it too

setTimeout(() => {
connection.write('ng');
connection.end();
}, 200);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this timeout value relevant? Could it be smaller, or maybe even a setImmediate?

Copy link
Contributor Author

@aqrln aqrln Jan 8, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... maybe it is even better to remove the setTimeout completely since there's a common.mustCall on connection._writev and even 200 ms timeout doesn't guarantee that a client connection would emit two data events if server used write instead of writev.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I tested it and yeah, even setTimeout is useless here, not to mention process.nextTick. I'll remove it completely. Though let it be two writes, just in case.

Copy link
Member

@indutny indutny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@trevnorris
Copy link
Contributor

@aqrln you mentioned in IRC that this fixes an issue I self assigned. If that's the case, mind adding the Fixes: field in the commit message?

@aqrln
Copy link
Contributor Author

aqrln commented Jan 9, 2017

@trevnorris sure, my bad. I forgot to add it.

@jasnell
Copy link
Member

jasnell commented Jan 10, 2017

@jasnell
Copy link
Member

jasnell commented Jan 10, 2017

@trevnorris ... does this LGTY?

@trevnorris
Copy link
Contributor

@aqrln Sorry for being pedantic but the full URL should be used. not just the issue number. other than that, LGTM

This commit enables writev for Unix Domain Sockets on supported
platforms thus enabling cork/uncork functionality for them and
improving IPC performance.

Fixes: nodejs#5095
@aqrln
Copy link
Contributor Author

aqrln commented Jan 11, 2017

@trevnorris thanks for pointing that out, I've updated the commit message.

@aqrln
Copy link
Contributor Author

aqrln commented Jan 12, 2017

@jasnell I hope this should be ready to land :)

@MylesBorins MylesBorins mentioned this pull request Mar 9, 2017
@aqrln aqrln deleted the pipe-cork branch March 10, 2017 04:20
MylesBorins added a commit that referenced this pull request Mar 21, 2017
Notable changes

* performance: The performance of several APIs has been improved.
  - `Buffer.compare()` is up to 35% faster on average. (Brian White)
    #10927
  - `buffer.toJSON()` is up to 2859% faster on average. (Brian White)
    #10895
  - `fs.*statSync()` functions are now up to 9.3% faster on average.
    (Brian White) #11522
  - `os.loadavg` is up to 151% faster. (Brian White)
    #11516
  - `process.memoryUsage()` is up to 34% faster. (Brian White)
    #11497
  - `querystring.unescape()` for `Buffer`s is 15% faster on average.
    (Brian White) #10837
  - `querystring.stringify()` is up to 7.8% faster on average.
    (Brian White) #10852
  - `querystring.parse()` is up to 21% faster on average. (Brian White)
    #10874

* IPC:
  - Batched writes have been enabled for process IPC on platforms that
    support Unix Domain Sockets. (Alexey Orlenko)
    #10677
  - Performance gains may be up to 40% for some workloads.

* child_process:
  - `spawnSync` now returns a null `status` when child is terminated by
    a signal. (cjihrig) #11288
  - This fixes the behavior to act like `spawn()` does.

* http:
  - Control characters are now always rejected when using
    `http.request()`. (Ben Noordhuis)
    #8923
  - Debug messages have been added for cases when headers contain
    invalid values. (Evan Lucas)
    #9195

* node:
  - Heap statistics now support values larger than 4GB. (Ben Noordhuis)
    #10186

* timers:
  - Timer callbacks now always maintain order when interacting with
    domain error handling. (John Barboza)
    #10522

PR-URL: #11759
MylesBorins added a commit that referenced this pull request Mar 21, 2017
Notable Changes:

* buffer:
  - The performance of `.toJSON()` is now up to 2859% faster on average
    (Brian White) #10895

* IPC:
  - Batched writes have been enabled for process IPC on platforms that
    support Unix Domain Sockets. (Alexey Orlenko)
    #10677
  - Performance gains may be up to 40% for some workloads.

* http:
  - Control characters are now always rejected when using
    `http.request()`. (Ben Noordhuis)
    #8923

* node:
  - Heap statistics now support values larger than 4GB. (Ben Noordhuis)
    #10186
MylesBorins added a commit that referenced this pull request Mar 21, 2017
Notable Changes:

* buffer:
  - The performance of `.toJSON()` is now up to 2859% faster on average
    (Brian White) #10895

* IPC:
  - Batched writes have been enabled for process IPC on platforms that
    support Unix Domain Sockets. (Alexey Orlenko)
    #10677
  - Performance gains may be up to 40% for some workloads.

* http:
  - Control characters are now always rejected when using
    `http.request()`. (Ben Noordhuis)
    #8923

* node:
  - Heap statistics now support values larger than 4GB. (Ben Noordhuis)
    #10186

PR-URL: #11760
MylesBorins added a commit that referenced this pull request Mar 21, 2017
Notable changes

* performance: The performance of several APIs has been improved.
  - `Buffer.compare()` is up to 35% faster on average. (Brian White)
    #10927
  - `buffer.toJSON()` is up to 2859% faster on average. (Brian White)
    #10895
  - `fs.*statSync()` functions are now up to 9.3% faster on average.
    (Brian White) #11522
  - `os.loadavg` is up to 151% faster. (Brian White)
    #11516
  - `process.memoryUsage()` is up to 34% faster. (Brian White)
    #11497
  - `querystring.unescape()` for `Buffer`s is 15% faster on average.
    (Brian White) #10837
  - `querystring.stringify()` is up to 7.8% faster on average.
    (Brian White) #10852
  - `querystring.parse()` is up to 21% faster on average. (Brian White)
    #10874

* IPC:
  - Batched writes have been enabled for process IPC on platforms that
    support Unix Domain Sockets. (Alexey Orlenko)
    #10677
  - Performance gains may be up to 40% for some workloads.

* child_process:
  - `spawnSync` now returns a null `status` when child is terminated by
    a signal. (cjihrig) #11288
  - This fixes the behavior to act like `spawn()` does.

* http:
  - Control characters are now always rejected when using
    `http.request()`. (Ben Noordhuis)
    #8923
  - Debug messages have been added for cases when headers contain
    invalid values. (Evan Lucas)
    #9195

* node:
  - Heap statistics now support values larger than 4GB. (Ben Noordhuis)
    #10186

* timers:
  - Timer callbacks now always maintain order when interacting with
    domain error handling. (John Barboza)
    #10522

PR-URL: #11759
imyller added a commit to imyller/meta-nodejs that referenced this pull request Apr 20, 2017
    Notable Changes:

    * buffer:
      - The performance of `.toJSON()` is now up to 2859% faster on average
        (Brian White) nodejs/node#10895

    * IPC:
      - Batched writes have been enabled for process IPC on platforms that
        support Unix Domain Sockets. (Alexey Orlenko)
        nodejs/node#10677
      - Performance gains may be up to 40% for some workloads.

    * http:
      - Control characters are now always rejected when using
        `http.request()`. (Ben Noordhuis)
        nodejs/node#8923

    * node:
      - Heap statistics now support values larger than 4GB. (Ben Noordhuis)
        nodejs/node#10186

Signed-off-by: Ilkka Myller <ilkka.myller@nodefield.com>
imyller added a commit to imyller/meta-nodejs that referenced this pull request Apr 20, 2017
    Notable changes

    * performance: The performance of several APIs has been improved.
      - `Buffer.compare()` is up to 35% faster on average. (Brian White)
        nodejs/node#10927
      - `buffer.toJSON()` is up to 2859% faster on average. (Brian White)
        nodejs/node#10895
      - `fs.*statSync()` functions are now up to 9.3% faster on average.
        (Brian White) nodejs/node#11522
      - `os.loadavg` is up to 151% faster. (Brian White)
        nodejs/node#11516
      - `process.memoryUsage()` is up to 34% faster. (Brian White)
        nodejs/node#11497
      - `querystring.unescape()` for `Buffer`s is 15% faster on average.
        (Brian White) nodejs/node#10837
      - `querystring.stringify()` is up to 7.8% faster on average.
        (Brian White) nodejs/node#10852
      - `querystring.parse()` is up to 21% faster on average. (Brian White)
        nodejs/node#10874

    * IPC:
      - Batched writes have been enabled for process IPC on platforms that
        support Unix Domain Sockets. (Alexey Orlenko)
        nodejs/node#10677
      - Performance gains may be up to 40% for some workloads.

    * child_process:
      - `spawnSync` now returns a null `status` when child is terminated by
        a signal. (cjihrig) nodejs/node#11288
      - This fixes the behavior to act like `spawn()` does.

    * http:
      - Control characters are now always rejected when using
        `http.request()`. (Ben Noordhuis)
        nodejs/node#8923
      - Debug messages have been added for cases when headers contain
        invalid values. (Evan Lucas)
        nodejs/node#9195

    * node:
      - Heap statistics now support values larger than 4GB. (Ben Noordhuis)
        nodejs/node#10186

    * timers:
      - Timer callbacks now always maintain order when interacting with
        domain error handling. (John Barboza)
        nodejs/node#10522

    PR-URL: nodejs/node#11759

Signed-off-by: Ilkka Myller <ilkka.myller@nodefield.com>
imyller added a commit to imyller/meta-nodejs that referenced this pull request Apr 20, 2017
    Notable Changes:

    * buffer:
      - The performance of `.toJSON()` is now up to 2859% faster on average
        (Brian White) nodejs/node#10895

    * IPC:
      - Batched writes have been enabled for process IPC on platforms that
        support Unix Domain Sockets. (Alexey Orlenko)
        nodejs/node#10677
      - Performance gains may be up to 40% for some workloads.

    * http:
      - Control characters are now always rejected when using
        `http.request()`. (Ben Noordhuis)
        nodejs/node#8923

    * node:
      - Heap statistics now support values larger than 4GB. (Ben Noordhuis)
        nodejs/node#10186

Signed-off-by: Ilkka Myller <ilkka.myller@nodefield.com>
imyller added a commit to imyller/meta-nodejs that referenced this pull request Apr 20, 2017
    Notable changes

    * performance: The performance of several APIs has been improved.
      - `Buffer.compare()` is up to 35% faster on average. (Brian White)
        nodejs/node#10927
      - `buffer.toJSON()` is up to 2859% faster on average. (Brian White)
        nodejs/node#10895
      - `fs.*statSync()` functions are now up to 9.3% faster on average.
        (Brian White) nodejs/node#11522
      - `os.loadavg` is up to 151% faster. (Brian White)
        nodejs/node#11516
      - `process.memoryUsage()` is up to 34% faster. (Brian White)
        nodejs/node#11497
      - `querystring.unescape()` for `Buffer`s is 15% faster on average.
        (Brian White) nodejs/node#10837
      - `querystring.stringify()` is up to 7.8% faster on average.
        (Brian White) nodejs/node#10852
      - `querystring.parse()` is up to 21% faster on average. (Brian White)
        nodejs/node#10874

    * IPC:
      - Batched writes have been enabled for process IPC on platforms that
        support Unix Domain Sockets. (Alexey Orlenko)
        nodejs/node#10677
      - Performance gains may be up to 40% for some workloads.

    * child_process:
      - `spawnSync` now returns a null `status` when child is terminated by
        a signal. (cjihrig) nodejs/node#11288
      - This fixes the behavior to act like `spawn()` does.

    * http:
      - Control characters are now always rejected when using
        `http.request()`. (Ben Noordhuis)
        nodejs/node#8923
      - Debug messages have been added for cases when headers contain
        invalid values. (Evan Lucas)
        nodejs/node#9195

    * node:
      - Heap statistics now support values larger than 4GB. (Ben Noordhuis)
        nodejs/node#10186

    * timers:
      - Timer callbacks now always maintain order when interacting with
        domain error handling. (John Barboza)
        nodejs/node#10522

    PR-URL: nodejs/node#11759

Signed-off-by: Ilkka Myller <ilkka.myller@nodefield.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c++ Issues and PRs that require attention from people who are familiar with C++. net Issues and PRs related to the net subsystem.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

node: unix sockets are slow in case of lots of small buffers
8 participants