PERF: nancorr_spearman fastpath #41885

Merged
merged 4 commits into pandas-dev:master from nancorr_spearman_perf2 on Jun 9, 2021

Conversation

mzeitlin11
Member

xref #40956

The diff looks much more complicated than it is because of indentation changes. The change essentially adds a fast path for inputs without NaNs by pulling a slightly modified computation of sumx, sumxx, sumyy up before the missing-value handling.
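
For readers unfamiliar with the routine, here is a rough NumPy sketch of the idea, not the actual Cython code from this PR. It assumes that sumx, sumxx, sumyy are the centered cross and square sums of the ranked columns; the function names (`average_rank`, `corr_from_sums`, `spearman_matrix`) are made up for illustration. When no value is missing, each column is ranked once up front; otherwise every pair of columns is masked and re-ranked.

```python
import numpy as np


def average_rank(values):
    """Rank a NaN-free 1-D array, giving tied values their average rank."""
    _, inverse, counts = np.unique(values, return_inverse=True, return_counts=True)
    last = np.cumsum(counts)    # 1-based rank of the last item in each tie group
    first = last - counts + 1   # 1-based rank of the first item in each tie group
    return ((first + last) / 2.0)[inverse]


def corr_from_sums(x, y):
    """Pearson correlation of two equal-length arrays from three sums."""
    dx = x - x.mean()
    dy = y - y.mean()
    sumx = (dx * dy).sum()      # assumed meaning of "sumx": centered cross sum
    sumxx = (dx * dx).sum()
    sumyy = (dy * dy).sum()
    divisor = np.sqrt(sumxx * sumyy)
    return sumx / divisor if divisor != 0 else np.nan


def spearman_matrix(mat, min_periods=1):
    """Hypothetical illustration of a Spearman matrix with a no-NaN fast path."""
    mat = np.asarray(mat, dtype=float)
    n, k = mat.shape
    minp = max(min_periods, 1)
    result = np.full((k, k), np.nan)
    mask = ~np.isnan(mat)
    no_nans = bool(mask.all())
    if no_nans:
        # Fast path: rank every column exactly once.
        ranked = np.column_stack([average_rank(mat[:, j]) for j in range(k)])
    for i in range(k):
        for j in range(i + 1):
            if no_nans:
                if n < minp:
                    continue  # too few observations, leave NaN
                x, y = ranked[:, i], ranked[:, j]
            else:
                # Slow path: keep rows valid in both columns, then re-rank.
                valid = mask[:, i] & mask[:, j]
                if valid.sum() < minp:
                    continue
                x = average_rank(mat[valid, i])
                y = average_rank(mat[valid, j])
            result[i, j] = result[j, i] = corr_from_sums(x, y)
    return result
```

The saving in the fast path comes from ranking k columns once instead of re-ranking for each of the roughly k²/2 column pairs, which would be consistent with the wide-frame benchmark improving the most.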

Benchmarks:

       before           after         ratio
     [b73c38e2]       [efeac916]
     <perf/grp_cumsum_int~7>       <nancorr_spearman_perf2>
            70.3M              71M     1.01  stat_ops.Correlation.peakmem_corr_wide('spearman')
-     1.39±0.06ms      1.01±0.02ms     0.73  stat_ops.Correlation.time_corr('spearman')
         669±20μs         641±40μs     0.96  stat_ops.Correlation.time_corr_series('spearman')
-      26.3±0.6ms       8.36±0.3ms     0.32  stat_ops.Correlation.time_corr_wide('spearman')
          491±3ms          473±9ms     0.96  stat_ops.Correlation.time_corr_wide_nans('spearman')
       8.04±0.5ms       7.97±0.6ms     0.99  stat_ops.Correlation.time_corrwith_cols('spearman')
          215±1ms          213±2ms     0.99  stat_ops.Correlation.time_corrwith_rows('spearman')

mzeitlin11 added the Algos (non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff) and Performance (memory or execution speed performance) labels on Jun 9, 2021
jreback added this to the 1.3 milestone on Jun 9, 2021
@jreback
Contributor

jreback commented Jun 9, 2021

Can you add this PR number to the existing perf note on corr? Ping on green.

@mzeitlin11
Member Author

Added a small fix and test in the latest commit for the edge case where min_periods is greater than the frame length.
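
A minimal way to exercise that edge case from user code might look like the following. The frame contents are made up for illustration; the only assumption is that, with the fix in place, a `min_periods` larger than the number of rows leaves every entry of the result as NaN.

```python
import pandas as pd

# Two rows, but three observations are required per pair of columns.
df = pd.DataFrame({"A": [1, 2], "B": [1, 2]})
result = df.corr(method="spearman", min_periods=3)
print(result)
```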

jreback merged commit 1f3e646 into pandas-dev:master on Jun 9, 2021
mzeitlin11 deleted the nancorr_spearman_perf2 branch on June 9, 2021 19:18
JulianWgs pushed a commit to JulianWgs/pandas that referenced this pull request Jul 3, 2021