PERF: nancorr_spearman #41857

mzeitlin11 · 2021-06-07T19:50:36Z

xref ENH: parallelize DataFrame.corr #40956
Ensure all linting tests pass, see here for how to run them
whatsnew entry

I think a lot more can be squeezed out, so this PR does not close the issue. Initial results:

       before           after         ratio
     [dae5c597]       [d60902af]
     <master>         <perf/corr>
              69M            69.7M     1.01  stat_ops.Correlation.peakmem_corr_wide('spearman')
      1.47±0.02ms       1.60±0.6ms     1.09  stat_ops.Correlation.time_corr('spearman')
         690±20μs         629±90μs     0.91  stat_ops.Correlation.time_corr_series('spearman')
-        30.4±1ms       13.9±0.4ms     0.46  stat_ops.Correlation.time_corr_wide('spearman')
         589±60ms         528±20ms    ~0.90  stat_ops.Correlation.time_corr_wide_nans('spearman')
         12.0±2ms       9.02±0.9ms    ~0.75  stat_ops.Correlation.time_corrwith_cols('spearman')
         253±10ms          241±4ms     0.95  stat_ops.Correlation.time_corrwith_rows('spearman')

This speedup factor increases with the size of the frame, since more time gets spent in the ~O(N^2K) algo. With the nogil added, we could also explore using prange on the outer loop and exposing some new kwarg to allow cython parallelism.
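A rough pure-Python sketch of where the ~O(N^2K) cost comes from, reading N as the number of columns and K as the number of rows (my assumption; the comment above doesn't define them). This is not the Cython code in algos.pyx: the helper names are made up, and caching ranks for NaN-free columns is only an illustrative guess at how re-ranking inside the pair loop can be avoided.

```python
import math


def average_ranks(values):
    """Rank values 1..n, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        avg = (start + end) / 2.0 + 1.0
        for pos in range(start, end + 1):
            ranks[order[pos]] = avg
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length lists; NaN if undefined."""
    n = len(x)
    if n == 0:
        return math.nan
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return math.nan
    return sxy / math.sqrt(sxx * syy)


def spearman_matrix(columns):
    """columns: list of N equal-length lists of floats (NaN marks missing)."""
    n_cols = len(columns)
    # Columns without NaN can be ranked once, outside the pair loop.
    cached = [
        None if any(math.isnan(v) for v in col) else average_ranks(col)
        for col in columns
    ]
    result = [[math.nan] * n_cols for _ in range(n_cols)]
    for xi in range(n_cols):          # N * (N + 1) / 2 column pairs ...
        for yi in range(xi + 1):
            if cached[xi] is not None and cached[yi] is not None:
                rx, ry = cached[xi], cached[yi]
            else:
                # ... each touching K rows: mask pairwise NaNs, then re-rank.
                x, y = columns[xi], columns[yi]
                keep = [k for k in range(len(x))
                        if not (math.isnan(x[k]) or math.isnan(y[k]))]
                rx = average_ranks([x[k] for k in keep])
                ry = average_ranks([y[k] for k in keep])
            result[xi][yi] = result[yi][xi] = pearson(rx, ry)
    return result
```

The pair loop is the part that grows quadratically with the number of columns, which is why wider frames would benefit more from any per-pair saving.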

jbrockmendel · 2021-06-07T20:04:43Z

> exposing some new kwarg to allow cython parallelism

What is the scenario in which prange is available but the user would want to disable it?

mzeitlin11 · 2021-06-07T20:29:05Z

> > exposing some new kwarg to allow cython parallelism
>
> What is the scenario in which prange is available but the user would want to disable it?

Sorry, "allow" was a poor word choice. More specifically, I think it would make sense for prange to default to num_threads=1, with the user then able to specify using more threads.

By default prange would defer to OpenMP to choose the number of threads, which would usually set the value to the number of cores, which can be problematic for cases like batch computing. You could probably set an environment variable manually to get around that on the user end, but making the default behavior use all cores would probably be a surprising change, so it should be opt-in.
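To make the opt-in default concrete, here is a minimal sketch of the API shape only, using the standard-library thread pool rather than prange. The names `map_pairs` and `num_threads` are hypothetical, and because of the GIL this would not actually speed up pure-Python work; it just shows single-threaded behavior unless the caller asks for more.

```python
from concurrent.futures import ThreadPoolExecutor


def map_pairs(func, pairs, num_threads=1):
    """Apply func(x, y) to each pair; extra threads are used only on request."""
    if num_threads < 1:
        raise ValueError("num_threads must be >= 1")
    pairs = list(pairs)
    if num_threads == 1:
        # Default path: no thread pool, no surprise use of every core.
        return [func(x, y) for x, y in pairs]
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # pool.map preserves input order, so results match the serial path.
        return list(pool.map(lambda pair: func(*pair), pairs))
```

For OpenMP-backed code, the environment variable in question would typically be `OMP_NUM_THREADS`, which caps the thread count from outside the process.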

jbrockmendel · 2021-06-08T01:41:07Z

> which can be problematic for cases like batch computing

Thanks, that's what I was missing.

Would any of this Just Work if this were implemented in numba instead of cython?

I don't have a strong opinion on this per se, but am wary of a) using a cython feature (prange) that we don't currently use and that i don't know i) how well supported it is or ii) if its behavior is e.g. platform-dependent and b) user-facing kwargs that will give us combinatorially more cases to test

jreback · 2021-06-08T01:42:51Z

numba will just work for parallelism

+1 on adding similar apis as we have for window functions

cc @mroeschke

mzeitlin11 · 2021-06-08T02:25:12Z

> I don't have a strong opinion on this per se, but am wary of a) using a cython feature (prange) that we don't currently use and that i don't know i) how well supported it is or ii) if its behavior is e.g. platform-dependent and b) user-facing kwargs that will give us combinatorially more cases to test

Agree with the concerns. I was just mentioning it as something to consider for the future; it's not meant for this PR.

EDIT: also agree numba would make things easier, do you know if there has been any exploration of using numba instead of cython?

mroeschke · 2021-06-08T02:51:29Z

asv_bench/benchmarks/stat_ops.py

@@ -99,6 +99,7 @@ class Correlation:
    param_names = ["method"]

    def setup(self, method):
+        np.random.seed(0)


mroeschke: You don't need this random seed. The imported setup function sets the random seed for all benchmarks.

mzeitlin11: Thanks, will remove.

mroeschke · 2021-06-08T02:57:17Z

> EDIT: also agree numba would make things easier, do you know if there has been any exploration of using numba instead of cython?

All window aggregations and groupby transform/agg have API support for an engine argument that accepts 'numba', in which case the computation is done using numba. It essentially means rewriting the cython code back to python and adding the jit decorator.

You can see pandas/core/window/numba_.py and pandas/core/groupby/numba_.py for the pattern I was using to implement these.
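A minimal sketch of that pattern, under loud assumptions: `nanmean_kernel` and `get_kernel` are hypothetical names, not the helpers in the pandas modules mentioned above. The kernel is a plain Python loop, and the 'numba' engine simply wraps it with `numba.njit` when numba is installed.

```python
import importlib.util

NAN = float("nan")


def nanmean_kernel(values):
    """Mean of values, skipping NaN; a plain loop so a JIT can compile it."""
    total = 0.0
    count = 0
    for i in range(len(values)):
        v = values[i]
        if v == v:  # NaN is the only float that is not equal to itself
            total += v
            count += 1
    if count == 0:
        return NAN
    return total / count


def get_kernel(engine="python"):
    """Return the kernel for the requested engine (hypothetical helper)."""
    if engine == "python":
        return nanmean_kernel
    if engine == "numba":
        if importlib.util.find_spec("numba") is None:
            raise ImportError("engine='numba' requires numba to be installed")
        import numba
        import numpy as np

        jitted = numba.njit(nanmean_kernel)
        # Convert to a float64 array first, since the compiled kernel
        # is meant to run on NumPy arrays rather than Python lists.
        return lambda values: jitted(np.asarray(values, dtype=np.float64))
    raise ValueError(f"unknown engine: {engine!r}")
```

The same loop body serves both engines, which is the appeal: the user-facing kwarg only selects whether the loop is interpreted or compiled.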

simonjayhawkins · 2021-06-08T09:00:54Z

> EDIT: also agree numba would make things easier, do you know if there has been any exploration of using numba instead of cython?

See #40530, although that's a bit ambitious, as the goal there is to remove cython completely to create a no-arch pandas build. (I think it would maybe be good for new contributors or for exploration of the libs API, see later.) I'm working through it slowly but am still on 'a' for algos. (Next is wrappers for take_2d in algos.)

A significant benefit I would see in using numba is its native handling of datetimes, which avoids i8 conversions in the main code and passing flags like is_datetime to the cython libs. I originally started to do this also, but it makes keeping my branch in sync harder. So I'm just looking to swap out the libraries as a first step and keep the same API for the libs.

> numba will just work for parallelism

I think you still have the same issue regarding what the default setting should be. IIRC, in #40530, where I made take_1d parallel, running the test suite with n=auto was a lot slower, and I needed to use the environment settings to turn off parallelism. I have since changed that to only use parallelism when the number of rows > 10_000, so it is probably no longer an issue for the test suite.
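The size threshold could look roughly like the following. This is a sketch with hypothetical names (`effective_threads`, `PARALLEL_MIN_ROWS`), not the code from #40530; only the 10_000-row cutoff is taken from the comment.

```python
PARALLEL_MIN_ROWS = 10_000  # cutoff quoted in the comment; the name is made up


def effective_threads(n_rows, requested_threads, min_rows=PARALLEL_MIN_ROWS):
    """Fall back to one thread unless the input has more than min_rows rows."""
    if requested_threads < 1:
        raise ValueError("requested_threads must be >= 1")
    if n_rows <= min_rows:
        # Small inputs: thread start-up would likely cost more than it saves.
        return 1
    return requested_threads
```

With a guard like this, the many small frames in a test suite stay on the serial path regardless of the configured thread count.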

mzeitlin11 · 2021-06-08T16:26:07Z

Thanks for explaining @mroeschke and @simonjayhawkins!

jreback left a comment

just a question

jreback · 2021-06-08T22:13:41Z

pandas/_libs/algos.pyx

                    result[xi, yi] = result[yi, xi] = NaN
+                else:
+                    if not all_ranks:
+                        with gil:


jreback: Why the gil here?

mzeitlin11: rank_1d can't be called with nogil. Perhaps some refactoring could allow calling some nogil rank_1d helper instead, but that would be a larger change.

jreback: I see. OK, I think it's worthwhile to make that nogil (but not in this PR); a follow-on is preferred.

jreback · 2021-06-09T00:30:47Z

pandas/_libs/algos.pyx

+                            # We need to slice back to nobs because rank_1d will
+                            # require arrays of nobs length
+                            labels_nobs = np.zeros(nobs, dtype=np.int64)
+                            rankedx = rank_1d(np.array(maskedx)[:nobs],


Yeah, this should really take a memoryview (or have a helper function to do it).

jreback left a comment

Some follow-on suggestions.

mzeitlin11 · 2021-06-09T00:39:36Z

> Some follow-on suggestions.

Will look into removing that gil section. The main complication is that rank_1d requires a np.lexsort step that would have to be done with the gil (or perhaps some kind of presorting could be done; will look into it).

mzeitlin11 added 3 commits June 7, 2021 15:33

precommit fixup

c9ff800

Add benchmark seed for stability

d60902a

Add back all bench methods

8886059

mzeitlin11 added the Algos (Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff) and Performance (Memory or execution speed performance) labels Jun 7, 2021

mroeschke reviewed Jun 8, 2021

mzeitlin11 added 2 commits June 8, 2021 12:26

Merge remote-tracking branch 'upstream/master' into perf/corr

db145f8

Remove random seed

ff9519f

jreback requested changes Jun 8, 2021

jreback approved these changes Jun 9, 2021

jreback reviewed Jun 9, 2021

jreback approved these changes Jun 9, 2021

jreback added this to the 1.3 milestone Jun 9, 2021

jreback merged commit 63c20d2 into pandas-dev:master Jun 9, 2021

mzeitlin11 deleted the perf/corr branch June 9, 2021 00:39

JulianWgs pushed a commit to JulianWgs/pandas that referenced this pull request Jul 3, 2021

PERF: nancorr_spearman (pandas-dev#41857)

6485e48
