Proposal + seeking champion: Composable promise concurrency management

Repo: https://github.com/isiahmeadows/proposal-promise-schedule

Promise concurrency management is awkward, obtuse, and at times plainly difficult. It gets non-trivial extremely fast and is fraught with pitfalls and edge cases, and that's before you attempt any optimizations. (For brevity's sake, I'm not including examples here; follow the link and you'll understand why.)

This proposal took several iterations, including years of intermittent experimentation on code in the wild, before I finally narrowed it down to something like the above. At this point, though, I can't think of an API that fits more idiomatically with the way we use promises in practice. I actually like this API: it feels fun and rewarding to use, which I can't say about a lot of things programming-wise.

This ... is the feature I need, and didn't even know I needed.

There have certainly been times when I've wanted to cap the number of active promises, to prevent sending out a gazillion REST requests or database queries at the same time. Sometimes this comes up in production code, where the amount of ugly (and potentially buggy) code I'd have to write to implement this properly is scary. Sometimes it comes up in a database-upgrade script, where I'm not really inclined to do anything complicated or depend on anything external, so I just throw together a very non-performant solution.
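To illustrate, the non-performant solution I usually reach for is chunked batching: easy to write, but every batch stalls on its slowest task. A rough sketch (all names here are illustrative):

```js
// Naive batching: run `tasks` (functions returning promises) at most
// `limit` at a time by chunking the list. Simple, but the next chunk
// can't start until the slowest task in the current chunk finishes.
async function runInBatches(tasks, limit) {
  const results = [];
  for (let i = 0; i < tasks.length; i += limit) {
    const chunk = tasks.slice(i, i + limit).map(task => task());
    results.push(...await Promise.all(chunk));
  }
  return results;
}
```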

This also provides a very easy solution to the whole sync-errors-while-setting-up-promises issue, as discussed at length here (it might not be what the OP wanted in that thread, but it'll certainly make me happy).

It's a very creative API too - I love how simple it feels, yet it has so much power.

I'm not 100% sure, but doesn't Promise.map from bluebirdjs have a similar goal?
http://bluebirdjs.com/docs/api/promise.map.html

@MaxGraey I have a link to that in my proposal.

A function that took an iterable of promise-returning functions, combined with a concurrency option, seems like it'd be simpler to me than the schedule callback you've described.
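Roughly this shape; a hedged sketch, where the name and the option bag are illustrative rather than a proposed spec:

```js
// Hypothetical helper: run an iterable of promise-returning functions,
// keeping at most `concurrency` of them in flight at once. Rejections
// reject the whole thing, like Promise.all.
async function allLimited(taskFns, { concurrency = 1 } = {}) {
  const iterator = taskFns[Symbol.iterator]();
  const results = [];
  let index = 0;

  // Each worker repeatedly pulls the next task off the shared iterator
  // and runs it to completion before pulling another.
  async function worker() {
    for (;;) {
      const { done, value: taskFn } = iterator.next();
      if (done) return;
      const i = index++; // claimed synchronously, so result order is stable
      results[i] = await taskFn();
    }
  }

  await Promise.all(Array.from({ length: concurrency }, worker));
  return results;
}
```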

My bad, didn't see this.

I get that, but it falls apart when you need to schedule tasks within tasks. That can happen very easily with sufficiently large amounts of I/O-heavy processing work, where not everything is a nice single collection I can just append to; that's where the reentrancy part of my proposal comes into play. (I've already run into that at work, too.) There's also the issue that if you need to push tasks to the queue asynchronously, a plain iterable can't express it. And my follow-on of allowing task prioritization and weighting would be much more awkward to add on top of an iterable of tasks; while that doesn't have broad use, it does have some utility in coordinating certain sets of heterogeneous tasks.
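To make the reentrancy point concrete, here's a rough sketch of the kind of queue object I mean. To be clear, this is a hypothetical illustration of the pattern, not my proposal's actual API:

```js
// Hypothetical concurrency-capped task queue where running tasks can
// enqueue more tasks. An iterable of tasks can't express this, because
// the full task list isn't known up front.
function createQueue(limit) {
  const pending = [];
  let active = 0;

  function pump() {
    while (active < limit && pending.length > 0) {
      const { task, resolve, reject } = pending.shift();
      active++;
      Promise.resolve()
        .then(task)
        .then(resolve, reject)
        .finally(() => { active--; pump(); });
    }
  }

  return {
    // Returns a promise for the task's result. Safe to call from inside
    // a running task - that's the reentrancy part.
    push(task) {
      return new Promise((resolve, reject) => {
        pending.push({ task, resolve, reject });
        pump();
      });
    },
  };
}

// A task that discovers more work mid-flight and feeds it back in:
const queue = createQueue(4);
function crawl(url) {
  return queue.push(async () => {
    const links = await fetchLinks(url); // assumed helper
    // Enqueue follow-ups without holding this slot on them (holding the
    // slot while awaiting children is how you deadlock a capped queue):
    for (const link of links) crawl(link);
  });
}
```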

I do want to note that in terms of library precedent, only two of the libraries I linked actually follow that pattern of, essentially, a concurrency-limited forEach. when.js takes a guard-based approach, and d3-queue and vow-queue (linked at the bottom) use an explicit task queue object. Mine is functionally closer to the latter two, as it provides a little more power and in effect works almost like a local mini event loop.

I think this problem could also be reformulated as two distinct problems: producing an async iterator from events, and processing an iterator in parallel (as ljharb already mentioned). I'm thinking of something like:

```js
AsyncIterator.fromProducer(produce => { /* call produce(...) multiple times */ })
  .map(process)
  .parallel(10)
  .consume(); // actually rather .forEach(nothing), but that looks so weird just to "consume" an iterator
```

By splitting up these two, they can be composed with the new async iterator helpers:

```js
const accounts = await users.values()
  .map(getAccount)
  .parallel(10)
  .toArray();
```

The .parallel(max) function would be called on a (possibly synchronous) iterator yielding promises and would return an async iterator. When consumed, it would eagerly pull the next max values from the origin and yield the value of whichever promise resolves first (pulling the next value each time one settles).
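As a rough sketch of those semantics, written as a standalone async generator since the fluent method doesn't exist anywhere yet:

```js
// Sketch of the .parallel(max) semantics described above. A rejection
// anywhere in the pool makes the generator throw, like Promise.all.
async function* parallel(source, max) {
  const iterator =
    source[Symbol.asyncIterator]?.() ?? source[Symbol.iterator]();
  const pool = new Map(); // id -> promise tagged with its id
  let nextId = 0;
  let sourceDone = false;

  // Pull one value from the source into the pool of in-flight promises.
  async function pull() {
    const { done, value } = await iterator.next();
    if (done) { sourceDone = true; return; }
    const id = nextId++;
    pool.set(id, Promise.resolve(value).then(result => ({ id, result })));
  }

  // Eagerly start up to `max` promises, then yield each result as the
  // first in-flight promise settles, topping the pool back up each time.
  while (!sourceDone && pool.size < max) await pull();
  while (pool.size > 0) {
    const { id, result } = await Promise.race(pool.values());
    pool.delete(id);
    if (!sourceDone) await pull();
    yield result;
  }
}
```

Consuming it with Array.fromAsync would then read close to the accounts example above.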

This is for sure not as thought out as the proposal; I just wanted to throw in a totally different way of thinking about the problem space, which I think is far larger than what the proposal presents (think streams, time-based rate limiting, ...).

I can go with that. FWIW, I'm not seeking much beyond just getting it in the pipeline. Feel free to file an issue with that alternate formulation - I'd love to discuss it a little more at length.

I've added a case study to the README explaining a use case where such a thing would fall apart. Basically, the issue is when one task depends on another. It's possible in theory to hack around it, but it gets extremely awkward and you're almost better off starting from scratch and just not using it.