-
-
Notifications
You must be signed in to change notification settings - Fork 5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: interpolate: fix concurrency issues throughout #20671
base: main
Are you sure you want to change the base?
Conversation
with self._c_lock: | ||
pass | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an optimistic-lock; it expects that multiple threads to be calling non-mutating functions most of the time, allowing them to run concurrently. It behaves like a turnstile, allowing threads that arrived first to pass, and then blocks if another thread calls extend
, until it finishes updating the polynomial coefficients.
While this solution may yield dirty-reads for threads that execute before extend
, it allows for better concurrency by not serializing each call that depends on the coefficients, which would slowdown the overall execution time when multiple threads are calling instructions different from extend
.
An alternative would be to signal in the docs that the API is not thread-safe with respect to extend
and let the user handle the concurrency explicitly when a mixture of extend
and other calls happen
061aa6f
to
7edd025
Compare
@andfoy this looks like a good initiative. Let me know when you require someone to review. |
Thanks @Kai-Striega, I'll expect to complete the overview by next week, however, feel free to comment and review on any of the current changes |
The case of
Question: who's responsible for locking, 1) the library or 2) a user? More specifically, if there's a threading issue in user code, where the bug is: in the library or in the userland? If 1), then every library in the ecosystem needs to add locking to all mutating methods. Thus any threading violation is valid bug report which we ought to fix. ISTM there are two questions here, in fact. The first one is "who's responsible for locking" and the answer is, I strongly suspect is "it depends". |
Not really - the library is responsible. Wherever we do something that isn't thread-safe, it requires a lock. That was already true before free-threaded CPython (e.g. we had a lock around
For explicitly mutating methods (which should be very rare in SciPy), then of course the user has to be careful - doing that in multiple threads is not going to happen in a predefined order. I'm not sure why you're asking though - can you give an example of a SciPy function/method that is relevant here? |
The original comment ( motivated by an offline discussion with @andfoy ) was about Basically, this PR adds locks all around. This guards against all sorts of contrived scenarios. And I'm talking about exactly that: what are the lengths we want to go to guard against contrived usage. |
The Alpine test segfault is fixed if |
That's done now at long last:) |
faec9f3
to
4c7fb19
Compare
Reference issue
See gh-20669
What does this implement/fix?
This PR will add concurrency tests for the main interpolation classes in SciPy. This is done in order to potentially discover any concurrency, race conditions or deadlocks that may appear when using multithreaded Python.
Additional information
BSpline
: State is initialized in constructor, it shouldn't have any concurrency issues at first glance.PPoly
(and family): State (aka polynomial coefficients) can mutate underextend
method, all other calls depend on the stateRbfInterpolator
: There are no functions that can mutate an instance state, thus is thread-safe, as longKDTree
is safe: ENH: spatial: ensure thread-safety for KDTree #20655Rbf
: There are no functions that can mutate an instance state.NdBSpline
: State is initialized in constructor, no concurrency issues foundNdPPoly
: State is initialized in constructor, no concurrency issues foundNearestNDInterpolator
: State is initialized in constructor, no mutation methods are available. It depends onKDTree
being thread-safe: ENH: spatial: ensure thread-safety for KDTree #20655RegularGridInterpolator
: A wrappedNdBSpline
instance can mutate during the call, introducing unwanted side effects on a concurrent scenario. Fixed on 7eb6d59LinearNDInterpolator/CloughTocherInterpolator
: TheNDInterpolatorBase
does not have any state mutation methods. Since they depend on QHull access, concurrent calls will be serialized ENH: spatial: serialize concurrent calls to QHull #20619UnivariateSpline/_BivariateSplineBase
(and family): According to Mark fitpack sources asrecursive
#16053 (comment), all FORTRAN code that is declared with therecursive
keyword is reentrant and thread safe, as long as there are not any variables declared with thesave
keyword. The port of FITPACK used by SciPy is not usingsave
, and all routines are declared as recursive. Classes that use the FITPACK backend base classes (akaUnivariateSpline
and_BivariateSplineBase
) do not have any state-mutating public function different from the constructor, so they should be thread-safe in principle.