-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support for interrupting another domain #11411
Comments
pthread_kill / Domains.kill
in OCaml 5.0pthread_cancel / Domains.cancel
in OCaml 5.0
In the Discuss thread, @gadmm suggests using asynchronous exceptions, for example his work on Memprof-limits. I agree that this sounds like a better approach than sending signals between threads (which does not fit the OCaml semantics model very well), or even worse terminating a thread uncooperatively. Supporting asynchronous exceptions in this way is not in scope for 5.0 (for example we won't have Memprof yet), but may definitely happen in 5.x. @gadmm has the interest and expertise, and you have a use-case; it's a match! |
Even Bottom line: Pthread-style cancellation is not what you're looking for, nor an appropriate mechanism to build what you're looking for. Asynchronous exceptions might one day do what you need, although many design points remain unclear to me. |
First, thanks for being vocal about these use-cases. I know about them because we have talked about it together but it is good to have this knowledge and experience in public writing. If the title is "Support for pthread_cancel" it is not going to work for the reasons explained here and on discuss. If the title is changed back to "pthread_kill", then there is some sense to the question. First, for cancellation you can still use signals (raising from the OCaml signal handler) and use the sigmask to target the chosen threads in multicore. If you have several threads that you want to cancel at the same time, then for this kind of application you can send your signal repeatedly with some delay until all the targeted threads are cancelled. One advantage of signals over other approaches is that they also interrupt blocking IO operations. In this context, what If the behaviour regarding blocking IO is not needed, then it would be easier to support a By the way, memprof-limits was written with the Coq proof assistant implementation in mind, so I am confident that its uses are well-covered once Memprof is back. We have not started experimenting with bringing it to the Coq proof assistant yet because we still need to find |
Thank you very much for the answers, indeed it seems that the way to go is with the solution discussed by @gadmm using async exceptions; I'll try to reword the issue later on (or @gadmm , please feel free to do so yourself). My reading of what we want here then is "better runtime support for certain kind of async exceptions" ; is that correct? I propose we remove references to Actually, maybe there is already a way using the C OCaml API where we could implement Is it written down somewhere in what cases does a domain check for async exceptions ? (I assume in allocation, right?) I'll be happy to test this method using the coq-lsp project, where we often interrupt Coq (currently sending a signal), actually @gadmm send me over the years a lot of helpful information about async exceptions, but only now I have code (finally) that requires that knowledge, thanks! Out of curiosity I checked what polyML uses:
|
@gadmm we wrote the messages at the same time hehe, please indeed update the title of the issue. |
pthread_cancel / Domains.cancel
in OCaml 5.0
I changed the title of this feature wish to something more descriptive and less focused on a particular implementation.
At allocations and at explicit "polling" points introduced by the compiler when it sees code that might loop forever without allocating. The long story is at #10039
There's a more general mechanism already in place to support stop-the-world GC and signal handling. What you outline is probably doable, but it won't interrupt domains that are blocked on I/O. For this, actual OS signals are needed, with the caveat that it will not be portable to Windows. |
APC may work on Windows instead of signals: https://docs.microsoft.com/en-us/windows/win32/sync/asynchronous-procedure-calls . I believe APC is already used in ocaml/otherlibs/unix/winworker.h Lines 24 to 31 in 000d15d
for the (I haven't been following OCaml 5 deeply, but hopefully we don't end up where Coq et al don't work on Windows.) |
My non-expert understanding:
|
Note: my understanding that "interrupting I/O" is not in scope of @ejgallego's question on how to terminate (concurrent) search processes in the application domain he mentions, which are CPU- and memory-heavy but not IO-heavy. And even when we do IO there, it may suffice to make I/O asynchronous (as done in, say, Lwt) rather than actually cancellable: it's mostly okay if an I/O computation keeps going on for longer than necessary if it does not block computation. |
@gasche , I agree on all fronts, indeed this is not easy, but I don't see any other way for systems like Coq who would like to start checking proofs, then interrupt when the user for example does an edit, right? Indeed the workloads I'm thinking here are not IO bound at all, in fact, I'm happy if (explicit) IO is forbidden withing the solution. There are already pretty good ways to handle heavy I/O tasks concurrently I understand. |
Thanks, that was a fascinating read ! |
One possibility is to not implement cancellation at the domain level but at the user-level thread (fiber) level in |
@gasche I do not think the points 4. and 5. have to be considered in this issue; they are general issues which concerns the asynchronous exceptions that are already available in OCaml. For 4. there was ongoing work to further fix the runtime, because in many places it was/is not ready to receive asynchronous exceptions (with many parts of the runtime left unaudited). For 5. there has been a PR open for a while to implement masking (in the meanwhile memprof-limits shows that it is already possible to implement a form of masking in current OCaml). @kayceesrk I agree the question is best viewed as not specific to domains, but I am not sure fibers are the right abstraction either. I think you want to be parametric in a notion of thread id. For instance if you want to use memprof-limits to interrupt domainslib tasks, it would need to track the cancellation state on a per-task basis. Therefore one would change memprof-limits to accept as parameter a module type Thread_id = sig
type t
val self : unit -> t
val compare : t -> t -> int
end where the provided The biggest challenge would be rewriting domainslib to protect its shared state from asynchronous exceptions. Memprof-limits provides a bracket combinator; it seems to be enough for the use-case of the Coq proof assistant (which does use shared state in its interruptible parts but not too much), but it is probably more limited for libraries which I suspect state-intensive such as domainslib. Domainslib seems pleasingly simple, which makes this an interesting experiment. Maybe it would show that you need a more general notion of resource first.
I want to add:
|
Just thought I'd chime in here to say that I also have a use case for this, in a context adjacent to the OP's coq work (the evaluation of a constraint satisfaction logic). As it stands, I've arranged things so that evaluations are run in a separate process, so that reasonable limits can be placed on runtime duration, memory utilization, and so on. It may be that planned instrumentation of the logic interpreter will allow for an "interrupt" as necessary, but I'd much prefer to use a So, insofar as movement on this topic occurs, I would be very glad to run any implementation through its paces. |
Hi folks, finally I got some time to port coq-lsp to [Recall that polling in critical paths like conversion is not very viable in a system like Coq, due to the perf hit] |
@ejgallego refers to "token limits" https://guillaume.munch.name/software/ocaml/memprof-limits/#token-limits, whereby a thread can ask another thread (preferably CPU-bound rather than IO-bound) to stop by setting a cancellation token, with polling done automatically and regularly from a memprof callback. Support requires memprof for OCaml 5. |
Dear OCaml developers,
in our experiments to port Coq and
coq-lsp
to OCaml 5, it was suprising to me to learn that "true" domain cancellation (which I guess it'd map topthread_cancel
) is not planned for 5.xSee discussion at https://discuss.ocaml.org/t/a-tutorial-on-parallel-programming-in-ocaml-5/9896/9?u=ejgallego
I cannot stress enough how critical this functionality is for a large variety of application domains, most remarkably concurrent proof search, speculative document processing (used for research in ML), and user interfaces such as the upcoming
coq-lsp
incremental type-checker.The suggested alternative, which is to instrument user code to check explicitly for an interrupt, is just not feasible (and indeed, we have tried that already).
I understand that getting this feature right is tricky, however I see other systems such as Lean or Isabelle (in PolyML) do manage spawn and kill thread happily, so as a user I'd be willing to have access to such API even if it would come with a burden in terms of using it carefully.
I'd also be happy if the solution is not super portable at first, or would have us writing platform-specific code.
The text was updated successfully, but these errors were encountered: