Concurrency in async iterators? #203

Closed
everett1992 opened this issue Jul 15, 2022 · 5 comments
Labels: good follow-on proposal (this would be good but doesn't need to be in the first milestone)

Comments

@everett1992

everett1992 commented Jul 15, 2022

The async iterator helpers described in this spec are serial - await it.next(), then await mapper(value), then loop.

This example from the readme would only send one fetch request at a time.

const responses = await AsyncIterator.from(urls)
  .map(async (url) => {
    const response = await fetch(url);
    return response.json();
  })
  .toArray();
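To make the serial behaviour concrete, here is a minimal sketch. `serialMap`, `trackInFlight`, and `collect` are hypothetical names, not part of the proposal; `serialMap` only imitates the "await `next()`, then await the mapper, then loop" order described above, and the instrumented mapper records how many calls are in flight at once.

```javascript
// Hypothetical stand-in for a serial async `map` helper:
// pull one value, await the mapper, yield the result, then loop.
async function* serialMap(source, mapper) {
  for await (const value of source) {
    yield await mapper(value);
  }
}

// Hypothetical instrumented mapper: simulates slow work and records
// the peak number of mapper calls that were pending at the same time.
function trackInFlight(delayMs) {
  const stats = { current: 0, peak: 0 };
  const mapper = async (value) => {
    stats.current += 1;
    stats.peak = Math.max(stats.peak, stats.current);
    await new Promise((resolve) => setTimeout(resolve, delayMs));
    stats.current -= 1;
    return value * 2;
  };
  return { mapper, stats };
}

// Drain any (async) iterable into an array.
async function collect(iterable) {
  const out = [];
  for await (const value of iterable) out.push(value);
  return out;
}
```

Running `collect(serialMap(items, mapper))` with the instrumented mapper should never report more than one pending call, which is the throughput limit this issue is about.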

This example, using aws-sdk paginators, sends one query API call, then one fetch at a time for each item in the page, and only then starts the next page query.

const responses = await paginateQuery({ client }, QUERY)
  .flatMap(page => page.Items)
  .map(item => fetch(item.url.S))
  .toArray();

Operations like this can improve throughput with concurrent execution.

Could async iterator helpers support eagerly getting the next items and buffering them for the next call?
Could the helpers call their callback functions concurrently, before waiting for their results?

This could be implemented as arguments to each helper:

const responses = await paginateQuery({ client }, QUERY)
  // buffer one item from the iterator, even before it is requested
  .flatMap(page => page.Items, { buffer: 1 })
  // allow up to four pending iterator/callback promises
  .map(item => fetch(item.url.S), { parallel: 4 })
  .toArray();

Or as their own helper methods:

const responses = await paginateQuery({ client }, QUERY)
  .flatMap(page => page.Items)
  // buffer one item from the iterator, even before it is requested
  .buffer(1)
  .map(item => fetch(item.url.S))
  // allow up to four pending iterator/callback promises
  .parallel(4)
  .toArray();

Parallel could support both ordered and unordered execution.
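As a rough illustration of the ordered variant, the sketch below keeps up to `limit` mapper calls pending and yields results in source order. `mapConcurrent` is a hypothetical standalone function, not an API from the proposal or from any library, and it assumes the source itself is cheap to pull from; an unordered variant would instead yield whichever pending promise settles first.

```javascript
// Hypothetical ordered concurrent map: start mapper calls without
// waiting for earlier ones, but hand results back in source order.
async function* mapConcurrent(source, mapper, limit) {
  const pending = []; // promises for mapped values, oldest first

  for await (const value of source) {
    // Wrap in an async function so a synchronous throw becomes a rejection.
    const result = (async () => mapper(value))();
    // Avoid unhandled-rejection noise while the promise waits in the
    // queue; the error still surfaces when the promise is awaited below.
    result.catch(() => {});
    pending.push(result);

    // Once `limit` calls are pending, wait for the oldest before pulling more.
    if (pending.length >= limit) {
      yield await pending.shift();
    }
  }

  // The source is exhausted: drain whatever is still pending, in order.
  while (pending.length > 0) {
    yield await pending.shift();
  }
}
```

Because this is an async generator, it only makes progress while the consumer is pulling from it; between pulls, at most `limit - 1` mapper calls remain pending.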

I think this request could be extended to for await loops as well:

for await (const item of asyncIterator; parallel = 2, buffer = 1) { ... }

@ljharb
Member

ljharb commented Jul 15, 2022

That seems like something wildly out of scope of this proposal.

@bakkot
Collaborator

bakkot commented Jul 15, 2022

While the idea of iterators which eagerly start work early is interesting, and is something we've talked about some before (e.g. x and the following day), it's definitely not in scope for this particular proposal - there's a lot of design space here, and it would serve a pretty different function than the helpers here.

@bakkot bakkot added the good follow-on proposal this would be good but doesn't need to be in the first milestone label Jul 15, 2022
@bakkot bakkot closed this as completed Jul 15, 2022
@everett1992
Author

What is the process for follow-up proposals? Do I have to wait for this to land or reach a certain stage before submitting a standalone proposal?

@bakkot
Collaborator

bakkot commented Jul 15, 2022

The process for contributing is documented here. No need to wait for this to advance to put something together, though do note you'll need to get someone who serves on the committee to champion it.

@bakkot
Collaborator

bakkot commented Jan 26, 2023

After thinking about this more, I think that leaving the door open for a follow-on proposal here would actually require a tweak to the semantics in the helpers currently in this proposal. I've opened #262 to track.

I like the idea of a helper to eagerly pull from an iterator multiple times and buffer results, like your .buffer. Assuming we tweak the semantics so that .next calls are forwarded eagerly in helpers like .map, such a helper would be all that would be needed to get parallelism; an additional .parallel isn't actually necessary.
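A minimal sketch of that idea, under stated assumptions: `eagerMap` and `buffer` below are hypothetical standalone functions, not the proposal's API. `eagerMap` imitates the tweaked semantics by forwarding each `.next()` call to the underlying iterator immediately, and `buffer` keeps `n` extra pulls outstanding beyond the one the consumer is waiting on. Forwarding of `return`/`throw` and handling of rejections that sit in the queue are left out.

```javascript
// Hypothetical `map` with eager forwarding: every next() call is passed
// straight to the underlying iterator instead of waiting for the
// previous result, so several mapper calls can be pending at once.
function eagerMap(source, mapper) {
  const it = source[Symbol.asyncIterator]();
  return {
    [Symbol.asyncIterator]() {
      return this;
    },
    next() {
      return it.next().then(async ({ done, value }) =>
        done
          ? { done: true, value: undefined }
          : { done: false, value: await mapper(value) }
      );
    },
  };
}

// Hypothetical `buffer`: pulls ahead of the consumer and queues the
// result promises, handing them out in the order they were requested.
function buffer(source, n) {
  const it = source[Symbol.asyncIterator]();
  const queue = [];
  return {
    [Symbol.asyncIterator]() {
      return this;
    },
    next() {
      // Top up so that n pulls remain queued after one is handed out.
      while (queue.length < n + 1) queue.push(it.next());
      return queue.shift();
    },
  };
}
```

With these two pieces, `buffer(eagerMap(source, mapper), 3)` keeps up to four mapper calls pending while preserving order, and no separate `.parallel` step is involved. If the underlying iterator is a plain async generator, its queued `.next()` calls still run one after another, so the concurrency comes only from helpers that forward eagerly.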
