Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AsyncLocalStorage losing state when used with net.Server and repl.REPLServer #37866

Closed
iamFIREcracker opened this issue Mar 22, 2021 · 18 comments

Comments

@iamFIREcracker
Copy link

  • Version: v14.5.0
  • Platform: Darwin hairstyle.local 18.7.0 Darwin Kernel Version 18.7.0: Tue Jan 12 22:04:47 PST 2021; root:xnu-4903.278.56~1/RELEASE_X86_64 x86_64
  • Subsystem: async_hooks

What steps will reproduce the bug?

I am poking around a tiny wrapper around repl to make it possible for someone interact with a running Node.js instance, remotely, but when I tried to use AsyncLocalStorage to attach some state to the active session, such state gets gets lost as soon as the REPLServer takes over.

Save the following into a file:

var asyncHooks = require("async_hooks");
var fs = require("fs");
var util = require("util");
var net = require("net");

function debug(...args) {
  fs.writeFileSync(1, `${util.format(...args)}\n`, { flag: "a" });
}

var ctx = new asyncHooks.AsyncLocalStorage();
var reqId = 0;

net
  .createServer(function (socket) {
    ctx.run((reqId += 1), () => {
      var repl = require("repl").start({
        input: socket,
        output: socket,
      });
      repl.context.ctx = ctx;
      debug("Created repl", ctx.getStore());
    });
  })
  .listen(8081);
debug("Listening on port 8081");

and run it:

$ node remote-repl.js
Listening on port 8081

Now from a different terminal, connect to port 8081 to get your remote REPL:

$ nc localhost 8081
>

What is the expected behavior / what do you see instead?

Accessing ctx (well, its content) from the REPL should return the ID of the REPL itself (i.e. an increasing number), but undefined is returned instead:

> ctx
AsyncLocalStorage {
  kResourceStore: Symbol(kResourceStore),
  enabled: true
}
> ctx.getStore()
undefined

Additional information

I suspect it might have something to do with the underlying socket or EventEmitter, because when I try to access the content of ctx from within a socket.on('data'...) block, I still get undefined.

net
  .createServer(function (socket) {
    ctx.run((reqId += 1), () => {
      debug("Connected", ctx.getStore());
      socket.on("data", (data) => {
        debug(`[${ctx.getStore()}]`, "on-data", data.toString());
      });
    });
  })
  .listen(8081);

Hope this helps -- let me know if there is anything else I could help you with.

@Ayase-252
Copy link
Member

The document says

Returns the current store. If called outside of an asynchronous context initialized by calling asyncLocalStorage.run() or asyncLocalStorage.enterWith(), it returns undefined.

I'm not sure that calling getStore in a REPL within a context created by run can be qualified as [outside of an asynchronous context....].

cc @nodejs/async_hooks

@Flarna
Copy link
Member

Flarna commented Mar 23, 2021

@iamFIREcracker I tried your sample with nodejs 14.16.0 and it worked. Could you try with an up to date node version?

@iamFIREcracker
Copy link
Author

No luck, it's still broken for me:

$ node --version
v14.16.0
$ node async_bug.js
Listening on port 8081

For the first snippet, the REPL one, from the remote client's terminal:

$ nc localhost 8081
> ctx.getStore()
undefined

For the second snippet, the one without the REPL, the one with socket.on(...) only:

$ node async_bug.js 
Listening on port 8081
Connected 1
[undefined] on-data from-1

Connected 2
[undefined] on-data from-2

I read through the #33723 thread, and noticed that if I create an instance of AsyncResource when the connection is established, and run the socket.on handler within the context of that resource, then the asynchronous state does not get lost any longer:

net
  .createServer(function (socket) {
    ctx.run((reqId += 1), () => {
      const res = new asyncHooks.AsyncResource("REPL");
      debug("Connected", ctx.getStore());
      socket.on("data", (data) => {
        res.runInAsyncScope(
          () => {
            debug(`[${ctx.getStore()}]`, "on-data", data.toString());
          },
          this,
          data
        );
      });
    });
  })
  .listen(8081);
debug("Listening on port 8081");

on the server's stdout:

$ node async_bug.js 
Listening on port 8081
Connected 1
[1] on-data from-1

Connected 2
[2] on-data form-2

I quickly hacked the first snippet again, the REPL one, and implemented a man-in-the-middle solution where the socket's input is pumped into the REPL's stdin from inside an AsyncResource context, and it's kinda working (i.e. the the asynchronous context is not lost, but of course all the evaluation results are dumped on the server's terminal:

net
  .createServer(function (socket) {
    ctx.run((reqId += 1), () => {
      const res = new asyncHooks.AsyncResource("REPL");
      const repl = require("repl").start({ input: null, output: null });
      socket.on("data", (data) => {
        res.runInAsyncScope(
          () => {
            repl.input.emit("data", data);
          },
          this,
          data
        );
      });
      repl.context.ctx = ctx;
      debug("Created repl", ctx.getStore());
    });
  })
  .listen(8081);
debug("Listening on port 8081");

If I run this I get:

$ node async_bug.js 
Listening on port 8081
> Created repl 1
ctx.getStore()
1
> Created repl 2
ccttx..etStore(())
2
> 
2

Clearly the wiring left something to be desired, but at least the context was not lost.

I guess I can work-around that (eventemitter-asyncresource maybe?!), but I wonder if it wasn't somehow expected for the created REPL to be bound to the async context of the request.

@targos
Copy link
Member

targos commented Mar 23, 2021

@Flarna
Copy link
Member

Flarna commented Mar 23, 2021

I'm still not able to reproduce this. Maybe because I'm not on a mac (tried windows native, windows wsl and linux).

@targos
Copy link
Member

targos commented Mar 23, 2021

I can reproduce with a mac.

$ nc localhost 8081
> ctx
AsyncLocalStorage {
  kResourceStore: Symbol(kResourceStore),
  enabled: true
}
> ctx.getStore()
undefined
> process.version
'v14.16.0'

@benjamingr
Copy link
Member

Since when is there syntax highlighting in issue titles 😮 ?

@Ayase-252
Copy link
Member

Ayase-252 commented Mar 23, 2021

I can reproduce the issue in MacOS.

I did a bit step-into trick in the code snippet, and found that

node/lib/async_hooks.js

Lines 334 to 339 in 98ff5ac

getStore() {
if (this.enabled) {
const resource = executionAsyncResource();
return resource[this.kResourceStore];
}
}

  • the resource is an AsyncResource when calling debug("Created repl", ctx.getStore()), and it works as intended;
  • the resource is an TCP object when calling ctx.getStore() via REPL, and it returns a undefined.

IMO, the ctx.getStore() is executed in the callback for lineevent of REPL, it could be a different context from the async context initalized by AsyncStore.run. Therefore it is consistent with what document says.

Returns the current store. If called outside of an asynchronous context initialized by calling asyncLocalStorage.run() or asyncLocalStorage.enterWith(), it returns undefined.

In the use case here, I think AsyncResource is a better choice. I'm in favor of no issue with Node.js.

A side note: One thing is interesting here. Is the issue can only reproduced in MacOS? It may be indicating semantic discrepancy btw platforms (darwin vs linux/windows)?

Please correct me if I'm wrong.😊

@Flarna
Copy link
Member

Flarna commented Mar 23, 2021

Sorry, seems I mixed something up during my testing. Can reproduce now also on windows.

@iamFIREcracker
Copy link
Author

There's an official way to integrate with EventEmitter: https://nodejs.org/dist/latest-v15.x/docs/api/async_hooks.html#async_hooks_integrating_asyncresource_with_eventemitter

@targos - I was playing with that earlier, but I could not quite wrap my head around that API, and as a result I could not make it work, i.e. I could not get ahold of the data parameter from within the callback that I passed to AsyncResource.bind; so I dug a bit deeper, and discovered that I was not the only one having troubles with this (see AsyncResource.bind does not forward arguments #36051), and even though the AsyncResource.bind signature was updated to let users pass in thisArg, I am afraid I will have to wait for it to be released before I could give it a try.

Note: the only reason why I mentioned EventEmitters is not because I wanted to explicitly play or integrate with them, but because creating an instance of REPLServer and passing a net.Socket to it causes AsyncLocalStorage to lose its context.

IMO, the ctx.getStore() is executed in the callback for lineevent of REPL, it could be a different context from the async context initalized by AsyncStore.run. Therefore it is consistent with what document says.

@Ayase-252 - I think that perfectly describes what's going on. However, what I am not 100% sure about is whether that's the expected behavior or not; in this case I kind of expect the REPL to evaluate expressions in the same execution context the REPL itself was originally created on, but I totally understand if there are reasons as to why this cannot be the default behavior (e.g. performance?!).

So, speaking of work-arounds, what options do I have? Do I have to monkey-patch that socket's on method to the given callbacks in a specific execution context, so that when REPLServer uses it to set things up, expressions would be evaluated in the "right" execution context? Before going crazy with that, I would like to make sure I am not going in the wrong direction.

@Flarna
Copy link
Member

Flarna commented Mar 24, 2021

The socket is created before outside of your async scope therefore it's clearly not bound to it.
The repl instance is created within the async scope but it seems it's not modeled as async resource. It operates in caller context which are synchronous calls via socket stream (which is in turn an EventEmitter). So this ends up in the Async context of the socket...

You could try to use a Transform stream which passes the data through but switches the async context:

const { Transform } = require("stream");
class MyTransformStream extends Transform {
  constructor(options) {
    super(options);
    this._res = new asyncHooks.AsyncResource("MyTransformStream");
  }
  _transform(chunk, encoding, cb) {
    this._res.runInAsyncScope(cb, null, null, chunk);
  }
}

and then in your sample:

const myStream = new MyTransformStream();
var repl = require("repl").start({
  input: socket.pipe(myStream), // created within your async scope so ALS propagation works
  output: socket,
});

@iamFIREcracker
Copy link
Author

Alright, I think I figured a way to make this work by using a combination of AsyncResoruce.bind and Function.prototype.bind. First I use AsyncResource.bind to force the socket.on callback to run inside the "right" asynchronous context, and then I bind the first argument of that hooked callback (i.e. thisArg); lastly, I call the original socket.on implementation, passing the hooked callback into it.

net
  .createServer(function (socket) {
    ctx.run((reqId += 1), () => {
      const onOriginal = socket.on;
      socket.on = function onHooked(data, callback) {
        let callbackHooked = asyncHooks.AsyncResource.bind(callback, "REPL");
        // The result of `AsyncResource.bind` is a function expecting
        // its first argument to be `thisArg`, so to properly finish up
        // the wiring we need to bind it before actually using the transformed
        // callback
        callbackHooked = callbackHooked.bind(this, this);
        return onOriginal.apply(socket, [data, callbackHooked]);
      };

      const repl = require("repl").start({ input: socket, output: socket });
      repl.context.ctx = ctx;
      debug("Created repl", ctx.getStore());
    });
  })
  .listen(8081);
debug("Listening on port 8081");

And with this, AsyncLocalStorage is not not losing its context anymore.

I am sure there might be better and more idiomatic ways of dealing with this, but at least it unblocks me for now.

@Flarna
Copy link
Member

Flarna commented Mar 24, 2021

@iamFIREcracker Patching on might work fine in your setup but I don't recommend this as general solution.
If someone uses prependListener or once it will fail. Most likely also removeListener(event, listener) will fail as the registered listener is not that one passed by user.

@iamFIREcracker
Copy link
Author

@iamFIREcracker Patching on might work fine in your setup but I don't recommend this as general solution.
If someone uses prependListener or once it will fail. Most likely also removeListener(event, listener) will fail as the registered listener is not that one passed by user.

Totally! When I replied yesterday I had somehow missed your suggestion about using Transform, and it goes without saying it I liked it way better than mine...hack -- because that's what we are talking about here.

Alright, I am not sure if there is any intention of changing REPLServer to implement something similar at the core level, but if not, I think we can close this? I mean, I am able to carry on with my experiments, and if we want to keep REPLServer as is, then I don't see why we would want to keep this issue open.

In the meanwhile, thanks everybody for the help!

@Flarna
Copy link
Member

Flarna commented Mar 25, 2021

I don't think we should change REPLServer regarding this.
I can imagine use cases where people want every request to REPLServer executed as part of the calling AsyncContext instead moving all of them into one like you prefer it in your usecase.

@Flarna Flarna closed this as completed Apr 1, 2021
@pozylon
Copy link

pozylon commented Jul 14, 2021

Just had this with 14.17.1

W20210714-19:40:55.397(2)? (STDERR) TypeError: Cannot read property 'Symbol(kResourceStore)' of undefined
W20210714-19:40:55.398(2)? (STDERR)     at AsyncLocalStorage._propagate (async_hooks.js:289:34)
W20210714-19:40:55.398(2)? (STDERR)     at AsyncHook.init (async_hooks.js:255:22)
W20210714-19:40:55.398(2)? (STDERR)     at emitInitNative (internal/async_hooks.js:204:43)
W20210714-19:40:55.398(2)? (STDERR)     at emitInitScript (internal/async_hooks.js:472:3)
W20210714-19:40:55.398(2)? (STDERR)     at initAsyncResource (internal/timers.js:160:5)
W20210714-19:40:55.398(2)? (STDERR)     at new Timeout (internal/timers.js:194:3)
W20210714-19:40:55.398(2)? (STDERR)     at setUnrefTimeout (internal/timers.js:376:17)
W20210714-19:40:55.398(2)? (STDERR)     at Socket.setStreamTimeout [as setTimeout] (internal/stream_base_commons.js:263:22)
W20210714-19:40:55.399(2)? (STDERR)     at IncomingMessage.setTimeout (_http_incoming.js:99:15)
W20210714-19:40:55.399(2)? (STDERR)     at Server.WebApp._timeoutAdjustmentRequestCallback (packages/webapp/webapp_server.js:290:7)
W20210714-19:40:55.399(2)? (STDERR)     at packages/ddp-server/stream_server.js:184:23
W20210714-19:40:55.399(2)? (STDERR)     at Array.forEach (<anonymous>)
W20210714-19:40:55.399(2)? (STDERR)     at Function._.each._.forEach (packages/underscore.js:139:11)
W20210714-19:40:55.399(2)? (STDERR)     at Server.newListener (packages/ddp-server/stream_server.js:183:11)
W20210714-19:40:55.399(2)? (STDERR)     at Server.emit (events.js:375:28)
W20210714-19:40:55.399(2)? (STDERR)     at parserOnIncoming (_http_server.js:897:12)

@pozylon
Copy link

pozylon commented Jul 14, 2021

@Flarna

@Flarna
Copy link
Member

Flarna commented Jul 14, 2021

@pozylon I doubt that this is related to this issue. I recommend to create a new one but please add more info like how to reproduce this. Optimum would be a reproducer which requires no additional modules.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants