Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

process: implement resourceUsage() #11490

Closed
wants to merge 3 commits into from
Closed

Conversation

Trott
Copy link
Member

@Trott Trott commented Feb 22, 2017

Picked up from stalled #1568. Not sure we want to do this or not, but let's figure it out and either move forward or close it once and for all...

From @romankl (original author and still listed as the author for the actual commit in this PR):

libuv already provides the uv_getrusage function to get the
resource mesaures for the current process.
This change implements the new process.resourceUsage() function
to make use of it.

An example output looks like:

{ userCpuTimeUsedSec: 0,
  userCpuTimeUsedMs: 159729,
  systemCpuTimeUsedSec: 0,
  systemCpuTimeUsedMs: 30597,
  maxResidentSetSize: 20426752,
  sharedMemSize: 0,
  integralUnsharedDataSize: 0,
  integralUnsharedStackSize: 0,
  pageReclaims: 3002,
  pageFaults: 2250,
  swaps: 0,
  blockInputOperations: 0,
  blockOutputOperations: 0,
  ipcMessagesSent: 55,
  ipcMessagesReceived: 54,
  signalsReceived: 0,
  voluntaryContextSwitches: 92,
  involuntaryContextSwitches: 374 }

Not supported fields on windows are zero.

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines
Affected core subsystem(s)

process

@Trott Trott added process Issues and PRs related to the process subsystem. semver-minor PRs that contain new features and should be released in the next minor version. labels Feb 22, 2017
@nodejs-github-bot nodejs-github-bot added the c++ Issues and PRs that require attention from people who are familiar with C++. label Feb 22, 2017
Copy link
Member

@addaleax addaleax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with a nit


* Returns: {Object} containing resource usage measures

Note: Windows only provides the following resource usage measures:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Could you mention that these values refer to the current process?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

@Trott
Copy link
Member Author

Trott commented Feb 22, 2017

src/node.cc Outdated
if (err)
return env->ThrowUVException(err, "uv_getrusage");

Local<Object> resource_usage = Object::New(env->isolate());
Copy link
Contributor

@mscdex mscdex Feb 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it's still faster to create an object in JS land and pass it to the C++ side. I think @trevnorris had done something like this before (I can't recall exactly what part of core that was). I know Object::New() in general is/was quite slow (even compared to say Array::New()) from my own experience with third-party addons, so I could definitely see it possibly being faster in JS land.

One other advantage is that the properties could be pre-defined in JS, avoiding hidden classes, which I assume would otherwise be created when setting/adding these properties in C++.

Copy link
Contributor

@mscdex mscdex Feb 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just did a little benchmarking (on Linux FWIW) using 3 possible implementations (after 1e5 iterations):

resourceUsage(): 525.068ms
resourceUsage2(): 418.420ms
resourceUsage3(): 152.699ms

resourceUsage() is the implementation currently in this PR.

resourceUsage2() is using my previously described method: passing an object literal with predefined properties to C++ for setting of values.

resourceUsage3() is an implementation that uses an object generator function defined in JS that is called by C++ and passes all (18) values which are set in an object literal.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mscdex If you're confident in the benchmark results, feel free to push the most performant implementation directly to this branch if you want. (Or open a separate PR and close this one. Either way!)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Trott I've now added the optimization as a separate commit. I also fixed a couple of typos in the original commit's message.

src/env.h Outdated
V(ipc_messages_received, "ipcMessagesReceived") \
V(signals_received, "signalsReceived") \
V(voluntary_context_switches, "voluntaryContextSwitches") \
V(involuntary_context_switches, "involuntaryContextSwitches") \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the existing properties are alphabetically sorted and end _string -- Should the new additions be likewise for consistency?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is no longer an issue with the perf improvement commit I added, which removes these changes.

- `userCpuTimeUsedSec`
- `userCpuTimeUsedMs`
- `systemCpuTimeUsedSec`
- `systemCpuTimeUsedMs`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

libuv/libuv#342 and libuv/libuv#1041 added block input/output operations for Windows.

Copy link
Contributor

@mscdex mscdex Feb 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like it was added in libuv v1.10.0. If/when this PR gets backported, we'd need to have a separate doc change because v4.x and v6.x have older versions of libuv which may/may not be upgraded to v1.10.0+.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added the doc updates as a separate commit for convenience.

@Trott
Copy link
Member Author

Trott commented Feb 22, 2017

@mscdex mscdex force-pushed the resource-usage branch 2 times, most recently from 093b946 to 9f5feb9 Compare February 22, 2017 05:19
@mscdex
Copy link
Contributor

mscdex commented Feb 22, 2017

Fixed lint issues and added a new doc commit that updates the function's description in the API documentation.

To whomever backports this PR: Exclude the separate doc commit for v4.x and v6.x since that change is irrelevant for libuv versions prior to v1.10.0 (those particular branches have older libuv versions as of this writing).

New CI: https://ci.nodejs.org/job/node-test-pull-request/6540/

Copy link
Member

@jasnell jasnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally LGTM, couple of nits tho

src/node.cc Outdated

Local<Function> fn = args[0].As<Function>();
Local<Value> argv[] = {
Number::New(env->isolate(), rusage.ru_utime.tv_sec),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nit, can you capture Isolate* isolate = env->isolate() above and just make these Number::New(isolate, ... to make the code slightly more concise.

src/node.cc Outdated
if (err)
return env->ThrowUVException(err, "uv_getrusage");

Local<Function> fn = args[0].As<Function>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should likely include a CHECK(args[0]->IsFunction()) here just to be safe.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how it'd ever be possible for it to not be a function?

src/node.cc Outdated
Number::New(env->isolate(), rusage.ru_nivcsw)
};
Local<Value> resource_usage = fn->Call(
env->context(), Null(env->isolate()), 18, argv).ToLocalChecked();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The argcount tends to go out of date. Maybe sizeof(argv) / sizeof(argv[0])?

@mscdex mscdex added the wip Issues and PRs that are still a work in progress. label Feb 22, 2017
@mscdex mscdex removed the wip Issues and PRs that are still a work in progress. label Feb 22, 2017
@mscdex
Copy link
Contributor

mscdex commented Feb 22, 2017

I've updated the implementation to use a typed array for transferring values back to JS, which is ~37% faster than the object generator function-based solution.

Also I've now included a benchmark.

CI: https://ci.nodejs.org/job/node-test-pull-request/6542/

systemCpuTimeUsedSec: resValues[2],
systemCpuTimeUsedMs: resValues[3],
maxResidentSetSize: resValues[4],
sharedMemSize: resValues[5],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sharedMemorySize ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would integralSharedMemorySize be more consistent with the next two values? http://man7.org/linux/man-pages/man2/getrusage.2.html

Copy link
Member

@addaleax addaleax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still LGTM

@jasnell
Copy link
Member

jasnell commented Feb 22, 2017

Much better with the typed array. LGTM

@Trott
Copy link
Member Author

Trott commented Feb 23, 2017

@mscdex Are you hoping to land as-is or are you in favor of the sharedMemSize -> sharedMemorySize suggestion?

@mscdex
Copy link
Contributor

mscdex commented Feb 23, 2017

@Trott I don't have any opinion on it. I did not modify any of the property names, they are from the original PR.

Copy link
Member

@bnoordhuis bnoordhuis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with a suggestion.

FWIW, I think Rich's suggestion of renaming sharedMemSize to integralSharedMemorySize is a good one (although I question how useful that field and its siblings are in general, but that aside. No reason not to expose them.)

signalsReceived: 0,
voluntaryContextSwitches: 92,
involuntaryContextSwitches: 374 }
```
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worthwhile to call out that older Linux kernels set some of these fields to zero, see http://stackoverflow.com/a/7205853.

I expect that maxResidentSetSize in particular will be used a lot but that isn't supported by <2.6.32 kernels (torvalds/linux@1f10206 for the curious.) Someone developing on a new kernel but deploying on 2.6.18 is in for a nasty surprise.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blurb added.

r-52 and others added 2 commits February 24, 2017 04:56
libuv already provides the `uv_getrusage` function to get the
resource measurements for the current process.
This change implements the new `process.resourceUsage()` function
to make use of it.

An example output looks like:

{ userCpuTimeUsedSec: 0,
  userCpuTimeUsedMs: 159729,
  systemCpuTimeUsedSec: 0,
  systemCpuTimeUsedMs: 30597,
  maxResidentSetSize: 20426752,
  integralSharedMemorySize: 0,
  integralUnsharedDataSize: 0,
  integralUnsharedStackSize: 0,
  pageReclaims: 3002,
  pageFaults: 2250,
  swaps: 0,
  blockInputOperations: 0,
  blockOutputOperations: 0,
  ipcMessagesSent: 55,
  ipcMessagesReceived: 54,
  signalsReceived: 0,
  voluntaryContextSwitches: 92,
  involuntaryContextSwitches: 374 }

Not supported fields on Windows are zero.
Creating an object in JS and using a typed array to transfer values
from C++ to JS is faster than creating an object and setting
properties in C++.
@mhdawson
Copy link
Member

@Trott have you looked at the output for the non Linux/windows/OS X platforms ? Trying to understand how well covered we are across the board. I don't think we can tell from the unit test as it just validates the numbers are 0 or greater so they could be all 0 on a platform and that would not be obvious. I understand it is only flowing through what's available in libuv, but since we are exposing in Node.js it would be good to know how well it works across the board.

@Trott
Copy link
Member Author

Trott commented Feb 25, 2017

@sam-github wrote:

uv already mapped windows info to the unix names… so node doesn't have to.

So we can't instruct Windows users to look at Windows docs for the meanings of things like tv_sec. The end user will have to know what those are from UNIX-like operating systems.

FWIW, it looks like using the UNIX-like names (such as tv_sec) is what PHP did for their getrusage: http://php.net/manual/en/function.getrusage.php

@Trott
Copy link
Member Author

Trott commented Feb 25, 2017

Which all brings up another point: Should this be os.resourceUsage() or would it be better to call it os.getrusage()?

@mscdex
Copy link
Contributor

mscdex commented Feb 25, 2017

FWIW, it looks like using the UNIX-like names (such as tv_sec) is what PHP did for their getrusage

To be fair though, Windows probably wasn't a high priority for PHP when that function was added. The same can be said for node's fs functions, which have been around since before Windows support was really a thing.

I'd probably prefer the more descriptive names over the POSIX names. I'm sure I'm not the only one that wouldn't know offhand for example what 'ixrss', 'idrss', and 'isrss' are and what the difference between them is.

@jasnell
Copy link
Member

jasnell commented Feb 25, 2017

We could bikeshed endlessly over the names. Friendly APIs are better for same reason using non-abbreviated variable names and avoiding jargon is docs is better. It's more accessible and more friendly to developers who may not have the same level of understanding. Docs are going to need to be used either way. Let's just use the friendly names and move on.

@sam-github
Copy link
Contributor

Once part of the node API, these names will live forever, its worth spending some time talking about.

Using the common names, the names that uv gives and that all the other language runtimes I've found use is a reasonable way of choosing names. If we invent long string names, then we have to decide on them. Choosing Linux's kernel header comments and converting to camelCase works, but its Linux-centric, and it should also generate conversation ("bikeshedding"?).

Long names aren't "friendly" to me, I'd call them cumbersome, annoying, and needless innovation. And while Windows is unfortunately different, it doesn't actually return all these values in a struct as Unix does. To be Windows-like, we'd have to implement 4 or 5 individual APIs, so the single-method approach here is already to far away from Windows to be made familiar to a Windows user. And anyhow, while lots of people do dev on Windows, not many deploy, so we don't lose much by using common Unix names, these are production stats.

What is the use case for these? I can think of two primaries: 1) dumping the object to log streams, 2) graphing on production monitoring consoles

In both cases size of the property strings matters, long names make unreadably cluttered graphs, people will likely shorten them back to the common names so that they can be graphed beside java, swift, php etc., and log streams with long property names will generate lines so long they'll be harder to read, and more compact json logs are preferred by some people in production, for performance reasons.

These aren't the kinds of object properties that people are going to be typing out over and over in their code, they are the kinds of strings that will be given to ops people, who will likely be annoyed by our not using the common names. /cc @rmg

https://docs.python.org/2/library/resource.html#resource-usage
https://golang.org/pkg/syscall/#Rusage
https://doc.rust-lang.org/1.6.0/libc/struct.rusage.html
swift - looks like they just call the native getrusage API: http://stackoverflow.com/questions/22127281/getrusage-returning-zeros-in-ru-utime-tv-usec-and-ru-utime-tv-sec

@rmg
Copy link
Contributor

rmg commented Feb 25, 2017

I think the real danger in the Linux/non-Linux centric balance is to not punish familiarity with one like child_process.exec() does.

I'm OK with the "friendly" names. If the function was going to be process.getrusage() I would have a different opinion.

Or in bikeshedding terms, I'm OK with any colour as long as we don't call the bikeshed a toolshed and confuse people when they open it and find bike lockers instead of tools.

@Trott
Copy link
Member Author

Trott commented Feb 27, 2017

Adding ctc-review label. Ping @nodejs/ctc

Decisions on which there is not currently consensus (hence CTC-pinging):

  • Should this function be process.resourceUsage() or process.getrusage()
  • Should we use our own descriptive property names or the standard-ish UNIX-like property names?

@bnoordhuis
Copy link
Member

I don't have a problem with friendly names. You could argue it's a little inconsistent with e.g. fs.Stats but even there we friendly-fied their names by calling them ino and nlink rather than st_ino and st_nlink.

However, if it gets renamed to process.getrusage(), we should use the traditional names. Anything else is just cruel to UNIX programmers.

{ userCpuTimeUsedSec: 0,
userCpuTimeUsedMs: 159729,
systemCpuTimeUsedSec: 0,
systemCpuTimeUsedMs: 30597,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this is milli seconds. The previous values for the above two entities don't look right to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part was from the original PR, I haven't looked at it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weird. I pulled the changes locally and tested I still got zero.

{ userCpuTimeUsedSec: 0,
  userCpuTimeUsedMs: 71801,
  systemCpuTimeUsedSec: 0,
  systemCpuTimeUsedMs: 16618,
  maxResidentSetSize: 24780800,
  integralSharedMemorySize: 0,
  integralUnsharedDataSize: 0,
  integralUnsharedStackSize: 0,
  pageReclaims: 6438,
  pageFaults: 0,
  swaps: 0,
  blockInputOperations: 0,
  blockOutputOperations: 2,
  ipcMessagesSent: 0,
  ipcMessagesReceived: 0,
  signalsReceived: 0,
  voluntaryContextSwitches: 0,
  involuntaryContextSwitches: 138 }

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ms = microseconds in this case rather than milliseonds. We probably want to change the name or do something to prevent every single person who uses this to puzzle over the same thing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Also: Score one for using traditional UNIX names instead of trying to be "friendly". :-D )

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fooCpuTimeUsedUsec?

Another option is a single fooCpuTimeUsed number with sub-second precision. I think V8 should be able to unbox the double so it would be infinitesimally more efficient too (one slot instead of two.)

@@ -117,6 +118,35 @@ function setupMemoryUsage() {
};
}

function setupResourceUsage() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function doesn't accept any parameters. Why don't we move the array creation part also to C++?

Copy link
Contributor

@mscdex mscdex Feb 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't accept any parameters because it doesn't need any, it's just called once during startup to wrap the C++ function.

Regarding array creation, I'm assuming you mean the Float64Array()? I copied how it was being done in other functions that use a similar pattern. I don't think there is any real benefit to moving it to C++. If anything it would be more complicated because of the additional C++ code that would be required (e.g. adding to the Persistent list in src/env.h, etc.). Also, I personally would rather work more in JS than C++ ;-)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I failed to notice that the array is reused. Clever 👍

const assert = require('assert');
const usage = process.resourceUsage();

assert(usage.userCpuTimeUsedSec >= 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better if we tighten the tests by checking if the values are there for sure in Windows atleast?

Copy link
Contributor

@mscdex mscdex Feb 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's a general problem. I don't know if we can reliably assume certain (supported) properties are strictly greater than zero.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, even I was worried about that. At least, we know for sure that except for few of them, others will be zero in Windows (according to the documentation). We can assert that, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly, but that is somewhat dependent upon libuv. *shrug* I'm -0 on it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyway, I am okay with this. Could you look at the comment which I left in the doc please?

@jasnell
Copy link
Member

jasnell commented Mar 17, 2017

ping @Trott. Do you think this is ready to go?

@mscdex ... do you have any remaining thoughts on this?

@Trott
Copy link
Member Author

Trott commented Mar 17, 2017

@jasnell I don't think it's quite ready to go. I think there isn't yet a clear decision about:

  • Should this function be process.resourceUsage() or process.getrusage()
  • Should we use our own descriptive property names or the standard-ish UNIX-like property names?

I definitely have this on my list of things to work to get resolution on, but right now it's being ignored in favor of things like #9531.

Copy link
Member

@mhdawson mhdawson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mscdex
Copy link
Contributor

mscdex commented Apr 5, 2017

@jasnell Thoughts about what? I don't have an opinion on the API naming if that's what you're referring to.

@Trott Trott added the blocked PRs that are blocked by other issues or PRs. label Apr 5, 2017
@Trott
Copy link
Member Author

Trott commented Apr 5, 2017

Marked as blocked for the moment just to make sure no one lands this without a full vetting of the naming etc. I'll get back to this at some point soon, honest! :-P (Although by that time, I may come around to the idea that this shouldn't be in core because we've lived without it for long enough and it doesn't seem to have caused anyone tons of pain. Proving me wrong with a real-world example welcome.)

@Trott
Copy link
Member Author

Trott commented Apr 9, 2017

I'm going to close this because:

  • There doesn't seem to be an urgent need for this on anyone's part. As such, I'm a little reluctant to add API surface because once we add something, we might never be able to remove it.

  • I'm especially reluctant to add an API when there's not certainty on what to call properties in the return value. (The whole "use friendly names or use POSIX names or what?" issue.)

If anyone feels strongly that this is misguided and this is something where we need to make those decisions and get this API out there, feel free to comment or re-open.

@Trott Trott closed this Apr 9, 2017
@Trott Trott removed the ctc-review label Apr 9, 2017
@Trott
Copy link
Member Author

Trott commented Apr 9, 2017

(And actually, since @mscdex was doing most of the heavy-lifting on this in recent weeks anyway, I suppose he can just open his own PR...)

@Trott Trott deleted the resource-usage branch January 13, 2022 22:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
blocked PRs that are blocked by other issues or PRs. c++ Issues and PRs that require attention from people who are familiar with C++. process Issues and PRs related to the process subsystem. semver-minor PRs that contain new features and should be released in the next minor version.
Projects
None yet
Development

Successfully merging this pull request may close these issues.