
BUG: Drastic memory usage increase in recent builds #22233

Closed
larsoner opened this issue Sep 8, 2022 · 8 comments · Fixed by #22236

larsoner (Contributor) commented Sep 8, 2022

Describe the issue:

Recently in MNE-Python our CIs started dying due to memory overconsumption.

On my local Linux machine, running the following as the test for git bisect --first-parent:

$ mprof run pytest -m "not ultraslowtest" mne/tests/ --tb=short --cov=mne --cov-report=xml --cov-report=html -vv && mprof plot

peaks at ~12 GB of memory on main and also on 45357de:

(mprof plot: bad commit 45357de4cf83340bd05d279673f68a7fd42b2308)

but only at a much more reasonable ~2.5 GB on db550d5:

(mprof plot: good commit db550d57ee72e097b50d776cce108c64c80c181b)

So I think this is due to #21995.
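For reference, a bisect over peak memory like this can be automated with git bisect run and a small wrapper script. Below is a minimal sketch only: check_mem.py is a hypothetical name, the 4 GB threshold is just a value chosen between the ~2.5 GB good and ~12 GB bad peaks, and it assumes NumPy is rebuilt for each bisected commit before the tests run.

# check_mem.py -- hypothetical wrapper for `git bisect run python check_mem.py`
# Marks a commit bad when the test run's peak memory exceeds a threshold.
import resource
import subprocess
import sys

THRESHOLD_GB = 4.0  # somewhere between the ~2.5 GB good and ~12 GB bad peaks

# Run the (shortened) test suite in a child process.
ret = subprocess.call(
    ["pytest", "-m", "not ultraslowtest", "mne/tests/", "--tb=short", "-q"]
)

# On Linux, ru_maxrss for child processes is reported in kilobytes.
peak_gb = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss / 1e6

if ret not in (0, 1):  # pytest crashed or failed to collect -> skip this commit
    sys.exit(125)
sys.exit(0 if peak_gb < THRESHOLD_GB else 1)  # 0 = good, non-zero = bad

git bisect run treats exit status 0 as good, 125 as "skip this commit", and any other status below 128 as bad, so the threshold check above drives the bisection directly.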

MNE does use SciPy, Numba, scikit-learn, dipy, etc., which could be interacting with NumPy in some bad way, too. I could at least rule out numba, dipy, and scikit-learn by uninstalling them and still seeing the same overconsumption, though (some tests will just be skipped or our library will just use slower code paths this way).

I'm happy to work on isolating this further if it's not clear why this might be happening just from having it pinned down to the PR/commit.

Reproduce the code example:

<not easily reproduced / not yet isolated>

Error message:

No response

NumPy/Python version information:

See commit numbers above

Context for the issue:

No response

charris (Member) commented Sep 8, 2022

That's disturbing. Reverting is an option, but it would be nice to figure out why it has that effect.

larsoner (Contributor, Author) commented Sep 8, 2022

Tomorrow I'll see if I can replicate this on something simpler, maybe just running numpy.test with mprof.
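For reference, a rough sketch of what that check could look like using memory_profiler's Python API (the "fast" test label is only an assumption to keep the run short):

# Sketch: sample this process's memory while numpy.test() runs.
import numpy
from memory_profiler import memory_usage

samples = memory_usage((numpy.test, ("fast",), {}), interval=1.0)  # MiB samples
print(f"peak RSS during numpy.test('fast'): {max(samples):.0f} MiB")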

seberg (Member) commented Sep 9, 2022

Hmmmpf, the graph does make it look like we probably introduced a memory leak? It would be helpful to narrow it down a bit, although if that is tricky, we can also run the debugging tools on the NumPy test suite to see if it points to something.

rgommers (Member) commented Sep 9, 2022

SciPy CI jobs started dying as well, with no indication of what the problem is other than "Error: The operation was canceled." in GitHub Actions (example log). We should revert the PR, then resubmit it and investigate the problem.

Cc @eirrgang as author of the relevant PR.

seberg (Member) commented Sep 9, 2022

oh, I missed that we know which PR is at fault. That should be an easy fix then, no need to revert.

rgommers (Member) commented Sep 9, 2022

@seberg only if you already understand the problem and it can be fixed within a few hours, because it's going to be a huge waste of time if every downstream CI job testing numpy main fails.

seberg added a commit to seberg/numpy that referenced this issue Sep 9, 2022
The new path that preserves dtypes by creating a view got the reference counting wrong, because it also hit the incref path that was only needed when returning the identity.

This fixes up numpygh-21995.

Closes numpygh-22233
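As a rough illustration of how an extra incref like that shows up (the np.asarray call below is only a stand-in for the affected dtype-preserving path, not the actual code touched by gh-21995):

# Sketch: a path that leaks one reference per call makes the refcount of the
# input array grow without bound instead of returning to its baseline.
import sys
import numpy as np

arr = np.arange(10)
before = sys.getrefcount(arr)

for _ in range(1000):
    res = np.asarray(arr, dtype=arr.dtype)  # stand-in for the dtype-preserving path
    del res

growth = sys.getrefcount(arr) - before
print("refcount growth:", growth)  # anything > 0 points at a leaked reference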
rgommers (Member) commented Sep 9, 2022

Broken wheels have been deleted and the memory leak should be fixed; see gh-22236.

rgommers (Member) commented Sep 9, 2022

Thanks for finding the root cause, @larsoner.
