Ergonomic way to move data between workers #10078

Open
jakearchibald opened this issue Jan 19, 2024 · 9 comments
Labels
addition/proposal New features or enhancements needs implementer interest Moving the issue forward requires implementers to express interest topic: workers

Comments

@jakearchibald
Contributor

What problem are you trying to solve?

Right now, call-and-response communications with a worker are pretty cumbersome. Complex libraries have been created to try and make this easier.

What solutions exist today?

Message ports, and utilities like comlink.
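
For context, this is roughly the boilerplate the call-and-response pattern involves today. A minimal sketch; the file names and the callWorker helper are illustrative, not part of any proposal:

// main.js
let nextId = 0;
const pending = new Map();
const worker = new Worker('./worker.js', { type: 'module' });

worker.addEventListener('message', ({ data: { id, result } }) => {
  pending.get(id)?.(result);
  pending.delete(id);
});

function callWorker(method, ...args) {
  return new Promise((resolve) => {
    const id = nextId++;
    pending.set(id, resolve);
    worker.postMessage({ id, method, args });
  });
}

// worker.js needs the matching half:
// const methods = { createNumbersArray: (length) => Array.from({ length }, (_, i) => i) };
// addEventListener('message', async ({ data: { id, method, args } }) => {
//   postMessage({ id, result: await methods[method](...args) });
// });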

How would you solve it?

This builds on the blank worker proposal.

ShadowRealm is now stage 3, and I think it has ideas we could borrow.

const worker = new Worker('about:blank', { type: 'module' });

// Import into worker:
await worker.importValue(specifier);

// Import and get export:
const value = await worker.importValue(specifier, exportName);

importValue would throw if the worker is not type: 'module'.

The export can be anything structured-cloneable, but it can also be, or include, functions.

When functions are called, the args are cloned and the function in the worker is called with those args. The return value is cloned and used to resolve the promise returned on the caller's side.

For example:

worker-utils.js

export function createNumbersArray(length) {
  return Array.from({ length }, (_, i) => i);
}

index.js

const worker = new Worker('about:blank', { type: 'module' });
const createNumbersArray = await worker.importValue('./worker-utils.js', 'createNumbersArray');

const numbersArray = await createNumbersArray(3);
// [0, 1, 2]

Anything else?

It'd be nice if certain values could be marked as "transferable" rather than cloneable. Transferable streams would benefit from this too.

@jakearchibald jakearchibald added addition/proposal New features or enhancements needs implementer interest Moving the issue forward requires implementers to express interest labels Jan 19, 2024
@guybedford
Contributor

guybedford commented Jan 19, 2024

Very interesting, would be great to see something like this. While having this shape of API for the blank worker use case makes a lot of sense, for the use case where the worker itself has a "top-level" module definition, it might also make sense to simplify the interface further to some kind of top-level exports use:

const worker = new Worker('./mod.mjs', { type: 'module' });
const { createNumbersArray, anotherExport } = worker.getExports(['createNumbersArray', 'anotherExport']);

@asutherland

The dynamic behavior allowed by importValue()'s exportName argument makes sense in the context of shadow realms which are heavy on eval-semantics already, but I am concerned it would lend itself to a style that makes it harder for static analysis for security purposes, like for auditing WebExtensions where there's definitely problems with bad actors trying to do tricky things via obfuscated dynamism. Like, it's fine if there's always a string literal there, but the API shape feels like it would be just as idiomatic to use a variable, which opens up all kinds of avenues of dynamism and means code auditors would potentially have to re-litigate the exact same debate about how submitted code should use the API. In particular, I can imagine developers wanting to use a for loop there.

I feel like there had previously been discussion about enabling something like await worker.remoteImport('the-script.mjs') that could have similar semantics to import(), in particular returning a module namespace object? Are there major spec complexities preventing a solution like that? While this of course still allows nefarious dynamic behavior, it's already the identical problem to import() and allows consistent policy enforcement, like accessing the module via remoteModule.createNumbersArray().
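
To make that shape concrete, a rough sketch of the usage, with remoteImport being hypothetical and not specified anywhere:

const worker = new Worker('./worker.js', { type: 'module' });

// Hypothetical: resolves to a namespace-like object whose members proxy calls into the worker.
const remoteModule = await worker.remoteImport('./worker-utils.js');

const numbersArray = await remoteModule.createNumbersArray(3);
// [0, 1, 2]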

The downside with this alternative is of course that it would potentially do significantly more work than required if the imported script has more exported functions than the caller wants to call. Although it's obviously possible with modules to export only what is actually meant to be proxied, and this approach could arguably benefit static-analysis-based auditors, since the performance cost of wasted exports would encourage authors to limit their number of exports.

@Jamesernator

The dynamic behavior allowed by importValue()'s exportName argument makes sense in the context of shadow realms which are heavy on eval-semantics already, but I am concerned it would lend itself to a style that makes it harder for static analysis for security purposes, like for auditing WebExtensions where there's definitely problems with bad actors trying to do tricky things via obfuscated dynamism.

This is a wider concern with string specifiers as a dynamic import mechanism. The problem would be generally solved for all forms of dynamic import (including this one) with the module expression blocks proposal.
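
For illustration, a rough sketch of how that could combine with this proposal, assuming the module { } syntax from the module expressions proposal and assuming importValue accepted a module object in place of a string specifier (neither is specified behaviour):

const mod = module {
  export function createNumbersArray(length) {
    return Array.from({ length }, (_, i) => i);
  }
};

const worker = new Worker('about:blank', { type: 'module' });
const createNumbersArray = await worker.importValue(mod, 'createNumbersArray');
const numbersArray = await createNumbersArray(3); // [0, 1, 2]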

@jakearchibald
Contributor Author

There are cases where blocks are handy, but I don't think they should be required to improve worker communication. Having worker code in another file is usually a benefit.

@jakearchibald
Contributor Author

@guybedford yeah, I agree that would be handy (although getExports would need to return a promise). I was trying to avoid creating an API shape that was so different to shadow realms, but maybe that doesn't matter.

@jakearchibald
Contributor Author

jakearchibald commented Jan 23, 2024

I don't think we should get too bogged down in the transferable issue, but I think the solution here should be the same as it would be for transferred streams.

Something like:

const blob = new Blob();
markForTransfer(blob);

At this point blob is not detached, but the realm has a "marked for transfer" set containing the blob.

Then, in StructuredSerializeWithTransfer, if an object is in the "marked for transfer" set, it's transferred, and removed from the set.
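
For example, combined with importValue, a worker export could hand a stream back by transfer rather than by copy. A rough sketch, assuming the markForTransfer shape above:

// worker-utils.js
export function getDataStream() {
  const readable = new ReadableStream({
    start(controller) {
      controller.enqueue('hello');
      controller.close();
    },
  });
  markForTransfer(readable); // hypothetical, as sketched above
  return readable; // transferred rather than cloned when the return value is serialized
}

// index.js
const getDataStream = await worker.importValue('./worker-utils.js', 'getDataStream');
const readable = await getDataStream();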

Having one API that's transfer by default seems weird.

@asutherland

I like the markForTransfer idea; it definitely feels ergonomic in a way that transfer lists do not, and transfer lists would be a real problem here. In general, I think the lesson from CORBA and other APIs is that it's hard/inadvisable to hide RPC boundaries, so it would be quite reasonable for a proposal like this to make it more ergonomic to perform what amounts to RPC, but that code would still need to account for the boundary, including participating in semantics-impacting decisions like marking things for transfer.

Have there been similar discussions of this proposal elsewhere, in particular any that have been TAG reviewed? I'm having trouble finding other examples of markForTransfer specifically.

I should also note:

  • Per the File API WebIDL, Blobs and Files are not currently marked Transferable.
  • Blob.close() was explicitly removed from the spec, and markForTransfer presumably amounts to the same thing: it seems like one would specify the transfer set to be cleared at the conclusion of the task, and if the object wasn't transferred, it's now effectively closed. This potentially creates foot-guns if the method ever goes async after calling markForTransfer. One would really want an explicit and clear lifetime if allowing the pending transfer to outlive a request, like let t = new TransferDecorator(); t.markForTransfer(transferableThing); await somePromise; return t (see the sketch below). But that still sounds like it's similar to the comlink approach?
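
To make the foot-gun and the explicit-lifetime alternative concrete, a rough sketch in which markForTransfer, TransferDecorator and somethingSlow are all hypothetical:

// Foot-gun: with a per-task "marked for transfer" set, going async after marking
// could leave the object effectively closed by the time it is returned.
export async function getStreamFootgun() {
  const { readable } = new TransformStream();
  markForTransfer(readable);   // hypothetical
  await somethingSlow();       // illustrative; the marking task ends here...
  return readable;             // ...so the pending transfer may already have been cleared
}

// Explicit-lifetime alternative, per the TransferDecorator idea:
export async function getStreamExplicit() {
  const t = new TransferDecorator();   // hypothetical
  const { readable } = new TransformStream();
  t.markForTransfer(readable);         // the pending transfer lives on `t`, not on the task
  await somethingSlow();               // illustrative
  return t;                            // the decorator carries the transfer across the boundary
}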

@ggoodman

In chatting with @guybedford, we discussed a challenge across JavaScript runtimes right now: how to run an untrusted guest via WebAssembly. The current design of that interface seems to be more aligned with a WASM host and guest being within more-or-less the same trust boundary. A WASM guest can deny service to the host's event loop, and the host has nothing it can do to prevent that.

For a host to prevent that, a controlling parent event loop is required. We can accomplish that by having a Worker be the host of the WASM instance. The event loop that spawns the worker can then enforce deadlines and cancellations by managing the lifetime of the whole Worker, as sketched below.
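
A minimal sketch of that pattern as it exists today, where the file names, message shape, and one-second deadline are all illustrative:

// main.js: enforce a deadline on untrusted WASM by owning the worker's lifetime
const worker = new Worker('./wasm-host-worker.js', { type: 'module' });

const result = await Promise.race([
  new Promise((resolve) => {
    worker.addEventListener('message', ({ data }) => resolve(data), { once: true });
    worker.postMessage({ wasmUrl: './untrusted-guest.wasm', input: 42 });
  }),
  new Promise((_, reject) =>
    setTimeout(() => {
      worker.terminate(); // the guest cannot block this
      reject(new Error('deadline exceeded'));
    }, 1000),
  ),
]);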

However, the ergonomics of this leave a lot to be desired; packaging re-usable code across runtimes so that some is run in a worker and some in the main thread is challenging. Setting up the communication channels in a cross-platform way is also challenging for the same reasons motivating this discussion.

It seems like a version of this proposal might offer a more ergonomic way of handling this intermediary Worker, which is why I bring up the use-case.
