Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement semaphores #5

Merged
merged 47 commits into from
Jun 19, 2021
Merged

Implement semaphores #5

merged 47 commits into from
Jun 19, 2021

Conversation

ninevra
Copy link
Contributor

@ninevra ninevra commented Mar 16, 2021

This PR implements weighted semaphores, using an API and backend patterned after Lock.

Design choices:

Like Lock, Semaphore is constructed synchronously (via SharedContext.prototype.createSemaphore(id, initialValue)) and used asynchronously (via Semaphore.prototype.down(), downNow(), and up()).

Unlike Lock, createSemaphore() can fail, if the initial value doesn't match another initialization of the same semaphore. Since the creation is synchronous and doesn't touch the backend, failure is reported on use. This means that all Semaphore operations can reject.

down() waiters are woken in strict FIFO order. downNow() ignores the queue, meaning downNow() callers can potentially starve other threads.

The semaphore is weighted: down(2) acquires 2 units from the semaphore. I think this feature is worth including because a) it's useful for modeling resource pools and b) trying to implement weights on top of unweighted semaphores can accidentally lead to deadlock or busy-waiting. (I'm not sure locks, reservations, and unweighted semaphores are enough to build a weighted semaphore, without arbitrary mutable shared state or condition variables.)

down() "acquisitions" are not released on test worker teardown, because many use cases for semaphores don't fit the acquire/release model.

I made amount an optional positional argument because it's the only currently implemented option. Might be a little awkward if/when other options, e.g. timeout, are added.

Open questions:

  • What should down(0) and downNow(0) do?
    • Currently, down(0) blocks for any waiters, and downNow(0) always succeeds.
  • Is there a better way to handle "mismatch" errors?
  • Should a facility be provided for releasing semaphores in failure cases?

@novemberborn
Copy link
Member

Nice, thanks for the PR @ninevra! Curious to hear your thoughts on the shared worker API model.

createSemaphore() can fail, if the initial value doesn't match another initialization of the same semaphore.

Why is this? Could it not act as a default but be irrelevant otherwise if a different test worker ran first with a different value?

Since the creation is synchronous and doesn't touch the backend, failure is reported on use.

Are you referring to the non-negative value TypeError here? You'd need to be asynchronous to confirm the correct initial value.

Semaphore.prototype.down(), downNow(), and up()

Per https://en.wikipedia.org/wiki/Semaphore_(programming)#Operation_names we could use acquire() and acquireNow(), returning a release function (the same way as the Lock interface).

down() waiters are woken in strict FIFO order. downNow() ignores the queue, meaning downNow() callers can potentially starve other threads.

👍

The semaphore is weighted: down(2) acquires 2 units from the semaphore.

👍

down() "acquisitions" are not released on test worker teardown, because many use cases for semaphores don't fit the acquire/release model.

Could you elaborate on this? The intended execution model for shared workers is to clean up automatically on teardown.

I made amount an optional positional argument because it's the only currently implemented option. Might be a little awkward if/when other options, e.g. timeout, are added.

That's fine. This package is purposefully outside of AVA so we can make breaking changes more easily.

What should down(0) and downNow(0) do?

  • Currently, down(0) blocks for any waiters, and downNow(0) always succeeds.

Should this be an error?

@ninevra
Copy link
Contributor Author

ninevra commented Mar 21, 2021

createSemaphore() can fail, if the initial value doesn't match another initialization of the same semaphore.

Why is this? Could it not act as a default but be irrelevant otherwise if a different test worker ran first with a different value?

I think this is primarily a usability question. Test files could be run in any combination and any order, so an individual test file doesn't know whether it's actually creating the semaphore or just accessing a previously-created one. Therefore each test file must supply an initial value. If two test files supply different initial values, I figure that likely reflects a bug in the tests. Failing on mismatch is a way of reporting that bug. Treating the initial value as only a default would hide that bug, in a manner that would be execution-order dependent and potentially very hard to debug.

If we were to treat it as a default, the onus would then be on each test file to check the initial value of the semaphore it receives. That would require a somewhat different interface (probably an async createSemaphore()), and it's not clear to me that there would be any upside.

There are probably cases where it's difficult to guarantee consistent initial values - like if the initial value represents the size of a possibly fluctuating resource pool - but ignoring the mismatch probably isn't correct in these cases either. They'd be more likely to want a resize() operation, which is somewhat trickier.

Since the creation is synchronous and doesn't touch the backend, failure is reported on use.

Are you referring to the non-negative value TypeError here? You'd need to be asynchronous to confirm the correct initial value.

I'm referring to the SemaphoreMismatchError. In this draft, createSemaphore(), like createLock(), is a synchronous function and doesn't communicate with the shared worker. The shared worker allocates the semaphore the first time it's used. In the event of mismatched initialValues, createSemaphore() appears to succeed, but actually using the semaphore (calling down() et al) rejects with SemaphoreMismatchError.

Alternatively, createSemaphore() could be asynchronous. I'm honestly not sure whether that would be a better workflow or not. On the one hand, it would allow SemaphoreMismatchErrors to be reported where they actually occur; but on the other hand, it breaks parallelism with createLock(), and makes using it awkward in contexts that can't use await (such as the top level of a commonjs file, or the top level of an es6 module file on node <14.something).

@ninevra
Copy link
Contributor Author

ninevra commented Mar 21, 2021

Semaphore.prototype.down(), downNow(), and up()

Per https://en.wikipedia.org/wiki/Semaphore_(programming)#Operation_names we could use acquire() and acquireNow(), returning a release function (the same way as the Lock interface).

down() "acquisitions" are not released on test worker teardown, because many use cases for semaphores don't fit the acquire/release model.

Could you elaborate on this? The intended execution model for shared workers is to clean up automatically on teardown.

So, if you're using a semaphore to model acquiring resources from a fixed pool (for example, CPU threads) then acquire/release works fine. A test worker would acquire() some resources, the shared worker would associate those resources with it, and could release them on its behalf when the test worker exits. This model assumes that:

  • each worker's calls to a given semaphore will be balanced: they'll call up() for the same total amount as they call down().
  • each worker will always call down() before it calls up().

However, semaphores can be used to model lots of other situations, many of which break these assumptions.

One example: a provider/consumer relationship with a fixed-size queue/buffer (or other analogy for backpressure). One semaphore, empty, is used to count empty space in the buffer, and one semaphore, full, is used to count filled space. Producers await empty.down(), then push to the queue and call full.up(). Consumers await full.down(), then pull from the queue and call empty.up(). In this scenario, producers only ever call down() on empty, never up(); and they only ever call up() on full, never down().

Another example: rendezvous semaphores. Suppose test.js must only be run after a set of n other test files. A semaphore called rendezvous is initialized to 0. test.js awaits rendezvous.down(n) before it runs tests. Each of the n other test files calls rendezvous.up() when it finishes. Once all n test files have finished, test.js' call to down() completes, and it can run. (I admit this is a bit of a specious example because it breaks a lot of AVA's own assumptions, and could deadlock if AVA's concurrency was set at or below the number of test files awaiting rendezvous.)

Another example: a growable resource pool. pool.up() can be used without a paired pool.down() to grow the pool. (This isn't exactly a good idea, and shrinking the pool with pool.down() can risk deadlock. go has some interesting ideas about how to handle resizable resource pools; see golang/sync#3, golang/go#29721. We could expose a similar interface here, although it makes the queueing strategy more complex.)

The point is: at this low level there's no way to know what teardown actions are semantically correct, if any, without making restrictive assumptions about how the semaphore's being used. Teardown becomes the responsibility of whatever tool (perhaps itself a shared worker) is built on top of @ava/cooperate.

I prefer down() and up() over acquire(), release(), wait(), or signal() because they express what's going on without making an assumption about what it means.

That said, it could be helpful to expose helpers for the acquire/release model, since it's probably the most common use case. This could be done in a lightweight fashion, just moving the finally pattern from README.md into Semaphore itself,

class Semaphore {
  // ...
  async task<T>(job: () => T, amount = 1): Promise<Awaited<T>> {
    await this.down(amount);
    try {
      return await job();
    } finally {
      await this.up(amount);
    }
  }
}

or in a heavyweight fashion, with fourth and fifth operations acquire() and acquireNow() that would have the behavior you suggest.

@ninevra
Copy link
Contributor Author

ninevra commented Mar 21, 2021

What should down(0) and downNow(0) do?

  • Currently, down(0) blocks for any waiters, and downNow(0) always succeeds.

Should this be an error?

Maybe? I'm not sure whether there's ever a good reason to use down(0), but neither am I sure that there isn't. The current behavior follows naturally from the semantics for positive amounts, though; I don't think there's any actual harm in allowing 0.

@novemberborn
Copy link
Member

Hey @ninevra — sorry I haven't had the time to review your comments all week.

Regarding initialization:

Test files could be run in any combination and any order, so an individual test file doesn't know whether it's actually creating the semaphore or just accessing a previously-created one. Therefore each test file must supply an initial value. If two test files supply different initial values, I figure that likely reflects a bug in the tests.

I see. I imagine it likely that the semaphore is actually "created" in a helper file, rather than in each test file, so then it would be unlikely for any hardcoded value to differ, making this behavior manageable.

Even better if you could initialize the semaphore through ava.config.js, at which point you could perhaps access it without an initial value. But that can be done later.

createSemaphore() could be asynchronous.

It's nice to be able to set stuff up synchronously in helpers.

If I may bike-shed… rather than SemaphoreMismatchError how about SemaphoreCreationError?


I prefer down() and up() over acquire(), release(), wait(), or signal() because they express what's going on without making an assumption about what it means.

Cool, let's start with down() and up().

It would be nice though if down() returned a bound up() and vice versa, so you could:

test('upside down', async t => {
  const up = await semaphore.down()
  t.teardown(up)
  t.pass()
})

I'm not sure whether there's ever a good reason to use down(0), but neither am I sure that there isn't. The current behavior follows naturally from the semantics for positive amounts, though; I don't think there's any actual harm in allowing 0.

Wouldn't hurt to restrict it for now.


Any thoughts on this?

down() "acquisitions" are not released on test worker teardown, because many use cases for semaphores don't fit the acquire/release model.

Could you elaborate on this? The intended execution model for shared workers is to clean up automatically on teardown.

@ninevra
Copy link
Contributor Author

ninevra commented Mar 31, 2021

Any thoughts on this?

down() "acquisitions" are not released on test worker teardown, because many use cases for semaphores don't fit the acquire/release model.

Could you elaborate on this? The intended execution model for shared workers is to clean up automatically on teardown.

I tried to elaborate on this in #5 (comment) (which was kind of a long ramble, sorry). Relevant part:

The point is: at this low level there's no way to know what teardown actions are semantically correct, if any, without making restrictive assumptions about how the semaphore's being used.

If we were to always clean up "acquisitions" (i.e. imbalanced usage of down() and up() within a worker) on test worker teardown, then afaict we'd break e.g. producer/consumer queues and rendezvous semaphores, both of which rely on "imbalanced" usage of the semaphore.

I think it makes more sense to support all possible use cases here, since this is explicitly a low-level library? Alternatively we could have both an up()/down() interface with no cleanup and an acquire()/release() interface with automatic cleanup.

@novemberborn
Copy link
Member

I think it makes more sense to support all possible use cases here, since this is explicitly a low-level library?

Agreed. What do you think of making the clean-up opt-out? Simpler use cases will work fine out of the box, and more advanced use cases come with some extra lifecycle management responsibilities.

Could change createSemaphore() to take a secondary object argument, and require all options to match the existing semaphore.

@ninevra
Copy link
Contributor Author

ninevra commented Apr 4, 2021

Making the clean-up opt-out makes sense to me. I'll try to implement that in the next few days.

@ninevra
Copy link
Contributor Author

ninevra commented Apr 12, 2021

Sorry it took me a while to get back to this. I've implemented cleaning up after semaphore acquisitions; it was a minor nightmare trying to organize the code for that. Currently that's exposed via extra methods acquire(), acquireNow(), which return a release() method; that was the simplest interface to mock up. Needs a better interface and documentation, but hopefully it works?

@novemberborn
Copy link
Member

Hey @ninevra I'm so sorry I wanted to land avajs/ava#2741, and didn't even succeed at that, so I haven't been able to look at this yet.

@ninevra
Copy link
Contributor Author

ninevra commented May 18, 2021

@novemberborn there's nothing to apologize for! I still need to do a bit more work & cleanup on this anyway; I'll ping you when I think it's ready again.

#2741 looks like a huge amount of work! I'm super happy to see projects adopting esm.

@ninevra
Copy link
Contributor Author

ninevra commented May 20, 2021

I think I've made a logic error in trying to support non-integer semaphore values. We've been tracking the semaphore value with a number, and also tracking the size of acquisitions with numbers, and assuming that down()ing by a number and later up()ing by that number will balance out. But, I don't think that follows with floating-point numbers? If the semaphore's value is changed by other operations in the middle, then the up() may well add a slightly different amount than the down() removed.

As long as user uses safe integers in a reasonable range, everything'll be fine, but non-integers or very large semaphore values will produce small errors that might add up over time.

We could let that be, or switch to bigints, or check that the amounts used are always integers and throw otherwise.

@ninevra
Copy link
Contributor Author

ninevra commented May 20, 2021

@novemberborn I think this is ready for review! There's no hurry at all, though; it can wait until you're got time for it.

@novemberborn
Copy link
Member

@ninevra this is still on my to-do list, sorry.

@novemberborn
Copy link
Member

This is great @ninevra!

As long as user uses safe integers in a reasonable range, everything'll be fine, but non-integers or very large semaphore values will produce small errors that might add up over time.

We could let that be, or switch to bigints, or check that the amounts used are always integers and throw otherwise.

Agreed. Let's make sure they're safe integers. I've pushed a commit to that effect.

I've also tweaked naming, since both semaphores are counting semaphores. Now using managed and unmanaged. What do you think?

I found it strange to have separate ID spaces for both types of semaphores, so I've changed it to share a single space but enforce that the created semaphores are in the same mode.

Copy link
Contributor Author

@ninevra ninevra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! I tweaked the readme to state that amounts and initialValues must be integers.

t.notThrows(() => release(0));
});

test.serial('managed and unmanaged semaphores share their ID space', async t => {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why's this test .serial()? It seems to pass just fine without it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah left-over from a debugging attempt, good catch.

source/index.ts Outdated
}
}

/* c8 ignore next 2 */
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these ignore comments perhaps be removed? I couldn't find a good way to satisfy both c8 and eslint:

  • With the /* c8 ignore next 2 */ comments are currently used, c8 just warns about the blank line before and the } line after not being covered.
  • Removing the line above and increasing the range to `/* c8 ignore next 3 */ makes c8 happy, but eslint says "Expected blank line before this statement."
  • Moving the comment up to line 237 and increasing the range to `/* c8 ignore next 4 */ works, but doesn't exactly look great.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh yea, getting rid.

@novemberborn
Copy link
Member

I tweaked the readme to state that amounts and initialValues must be integers.

Nice one.

@novemberborn novemberborn changed the title Add semaphores Implement semaphores Jun 19, 2021
@novemberborn novemberborn merged commit daa4afd into avajs:master Jun 19, 2021
@novemberborn
Copy link
Member

Thanks for your patience and hard work on this @ninevra!

I'll try and push out a release tomorrow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants