
memory leaking in calliope (or some dependency) #368

Open · ramaroesilva opened this issue Aug 5, 2021 · 18 comments
Labels: has-workaround (The issue describes a valid workaround until the primary issue is solved)

@ramaroesilva
[ran in calliope 0.6.6]
I recently put in place a Python script which runs a somewhat heavy calliope model inside a for loop.

When running it I found out that between each iteration the RAM usage keeps increasing. This has two consequences:

  • after a couple of iterations, the running time per iteration increases
  • after some more, Python eventually crashes

This happened even if I deleted the resulting model between each iteration.

After talking with my colleague @gschwind, he mentioned the possibility of this being due to memory leaking in calliope (or some of its dependencies).

@ramaroesilva (Author) commented Aug 5, 2021

@gschwind also suggested the following workaround (example code below):

  • before running calliope, make Python run in a separate child process with os.fork()
  • import the calliope module in each iteration (i.e., inside the for loop)
  • make the main process (i.e., the for loop) wait while the calliope calculations for a given loop iteration are running
  • after all calculations are done in that iteration, kill the child process with quit(0)

import os

# Using fork to avoid the calliope memory leak: everything calliope touches
# lives in a child process whose memory is reclaimed when it exits.
pid = os.fork()
if pid > 0:
    # Parent process: wait for the child to finish
    os.wait()
else:
    # Child process: import and run calliope here
    import calliope
    # calliope commands here
    quit(0)

I've already tested this: while it doesn't work if run in Spyder (Spyder has some issues with quit(), and I think it complicates the parent/child process handling), it does work if run from the console.

I also read that PyCharm doesn't have the quit() issue, so this may be compatible with Python IDEs other than Spyder.

@ramaroesilva (Author) commented Aug 5, 2021

@brynpickering and @sjpfenninger, I just noticed calliope was updated less than 2 months ago, now using more recent versions of pandas, pyomo and xarray. Do you think this may fix the issue?

I can try to re-run my script with calliope 0.6.7 but right now I'm kinda short on time to verify possible backwards-incompatibility issues with my .yaml files.

@ramaroesilva (Author)
FYI, I just found out that os.fork() is limited to Linux (or WSL):
rq/rq#859
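
A cross-platform alternative sketch using multiprocessing instead of os.fork() (untested; on Windows the child is spawned as a fresh interpreter, which should give the same isolation; run_scenario and the file names here are hypothetical):

import multiprocessing as mp

def run_scenario(scenario):
    # Importing calliope inside the child keeps all of its (and pyomo's)
    # memory confined to a process that exits after each run
    import calliope
    model = calliope.Model('model.yaml', scenario=scenario)
    model.run()
    model.to_netcdf(f'{scenario}.nc')

if __name__ == '__main__':
    for scenario in ['scenario1', 'scenario2']:
        p = mp.Process(target=run_scenario, args=(scenario,))
        p.start()
        p.join()  # parent blocks until the child's memory is released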

@brynpickering (Member) commented Aug 6, 2021

@ramaroesilva do you have an MWE of the loop you were running? It could be any number of dependencies causing it, but I suspect it is Pyomo's build of the LP, which may be stored in memory and not dumped after the model has run. This is pure suspicion, though. You can have a look at #69 for some earlier work I did to try to hunt down and squash high memory consumption.

@ramaroesilva (Author) commented Aug 6, 2021

@brynpickering I know I don't have one, but if you tell me what an MWE is I can try to get it :-)

@brynpickering (Member)
Sorry, I shouldn't use abbreviations when they're unnecessary ;) It's a minimal working example, i.e., I don't need to see the calliope model, just your for loop.

@ramaroesilva (Author) commented Aug 6, 2021

Minimal working example below (it's a bit more complicated than this, but this gets the gist of it).

import os
import calliope

# sc_list (scenario names) and DirCalliope (model directory) are defined
# earlier in the script
file_base_name_index = {
    0: 'ParamBase15min_10app_colSC_freeExport.nc',
    1: 'Disc25_15min_10app_colSC_freeExport.nc'
}

for count, sce in enumerate(sc_list):
    print(sce)

    file_name = file_base_name_index[count]
    if os.path.exists(file_name):
        continue

    model = calliope.Model(os.path.join(DirCalliope, 'model_colSC.yaml'),
                           scenario=sce)
    model.run()
    model.to_netcdf(file_name)
    del model

@brynpickering (Member)
OK, now can you profile the memory consumption for these three different cases (rough sketch after the list)?

  1. baseline (what you're doing at the moment);
  2. skip model.run(); everything else remains the same;
  3. model.run() -> model.run(build_only=True); everything else remains the same.
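
A rough sketch of the three cases (one at a time), reusing the model call from the MWE above; per the comment, build_only=True builds the optimisation problem without sending it to the solver:

import calliope

model = calliope.Model('model_colSC.yaml', scenario=sce)

# case 1: baseline (build and solve)
model.run()

# case 2: skip model.run() entirely, so only model initialisation is profiled

# case 3: build the Pyomo problem but do not solve it
model.run(build_only=True)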

@ramaroesilva (Author) commented Aug 9, 2021

@brynpickering, thanks for the feedback!

This will be my first time profiling memory consumption and, as usual, I found several packages for this purpose (e.g., objgraph, PySizer, Heapy, guppy3, memory-profiler).

Any suggestions?

@brynpickering (Member)
I use memory-profiler. See #344.
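
A minimal sketch of pointing memory-profiler at the loop (the run_one_iteration wrapper is hypothetical; running the script via `python -m memory_profiler script.py` prints line-by-line memory usage for decorated functions):

from memory_profiler import profile

import calliope

@profile
def run_one_iteration():
    # one loop iteration: build and solve a model
    model = calliope.examples.national_scale()
    model.run()

if __name__ == '__main__':
    for _ in range(3):
        run_one_iteration()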

@brynpickering (Member)
@ramaroesilva did you ever find out what was going on here?

@fvandebeek
Hi all,

I also run into memory problems when running Calliope in a for loop (to do scenario runs). I'm using tracemalloc to try to pinpoint where memory may be building up (see https://docs.python.org/3/library/tracemalloc.html).

I'm running the exact same model twice in an external for loop; see snapshots of the top 10 heaviest memory users below.

After the first iteration:
[screenshot: tracemalloc top-10 statistics]

After the second iteration:
[screenshot: tracemalloc top-10 statistics]

You can clearly see the memory for these objects doubling. The question now is how to prevent this, or at least how to clear the objects after an iteration.

MWE:

import calliope as cp

# all scenarios are None, so every iteration builds and runs the same model
scenarios = [None, None, None]

for scenario in scenarios:
    # Define and run the Calliope model
    model = cp.Model('model.yaml', scenario=scenario)
    model.run()
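
For context, snapshots like the ones above can be produced with something like this tracemalloc sketch (the exact instrumentation isn't shown in the MWE; grouping statistics by 'lineno' is an assumption):

import tracemalloc

import calliope as cp

tracemalloc.start()

for scenario in [None, None]:
    model = cp.Model('model.yaml', scenario=scenario)
    model.run()

    # Print the ten allocation sites holding the most memory so far
    snapshot = tracemalloc.take_snapshot()
    for stat in snapshot.statistics('lineno')[:10]:
        print(stat)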

@brynpickering (Member)
I'm not really surprised that it is a pyomo object problem... It might be necessary to explicitly delete pyomo objects before starting a new run to ensure they're purged. Maybe pyomo even has some helpful way to kill a model entirely?

@fvandebeek commented Oct 11, 2023

I searched online for a way of killing the model, or pyomo in general, but all I found was that it seems to be very hard to unload modules in Python. There seem to be two suggested ways:

  • reload modules;
  • delete the calliope definition at the end of your loop, garbage collect, and then reimport at the beginning of your loop (see the sketch below).

But I haven't managed to actually clear the allocated memory with either of these methods.
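
A sketch of what the second approach looks like; in this case it still did not release the memory (popping the module from sys.modules is an assumption about how the reimport was forced):

import gc
import sys

for scenario in ['scenario1', 'scenario2']:
    import calliope

    model = calliope.Model('model.yaml', scenario=scenario)
    model.run()

    # Drop our references, evict the cached module, and force a
    # collection before the next import
    del model
    del calliope
    sys.modules.pop('calliope', None)
    gc.collect()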

I now set up a Windows batch file to loop through the scenarios by passing the name of each scenario to the Python script that runs Calliope:

run_calliope.bat:

FOR %%x in (
"scenario1",
"scenario2",
) DO "<location of python executable>" "<location of python file>" %%x
@pause

This seems to be working, at least in terms of clearing the memory and preventing buildup. I'll be rerunning the big model batch again soon, so I'll be back if this doesn't work.
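
On the Python side, the receiving script might look something like this sketch (argument handling via sys.argv is an assumption; since each scenario runs in a fresh interpreter, the OS reclaims all memory when the process exits):

import sys

import calliope

scenario = sys.argv[1].strip('"')  # scenario name passed in by the batch file

model = calliope.Model('model.yaml', scenario=scenario)
model.run()
model.to_netcdf(f'{scenario}.nc')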

Are you sure the leak is not in the way the pyomo model is defined in the Calliope code? I cannot find any general issue threads on memory leaking in Pyomo, so the problem seems pretty Calliope-specific.

@brynpickering (Member)
I've done some tests and calliope itself doesn't seem to be the root cause. I can keep the memory footprint very close to that of a single model initialisation if I initialise a model in a loop of 10:

import calliope
import gc

for i in range(10):
    gc.collect()
    m = calliope.examples.national_scale()
    gc.collect()

However, the garbage collector seems not to clean out the pyomo model and its objects, i.e., the memory footprint increases by almost the same amount whether I run this:

import calliope
import gc

for i in range(10):
    gc.collect()
    m = calliope.examples.national_scale()
    m.run()
    del m._backend_model  # extra step to try to get the garbage collector to collect the pyomo object (doesn't help)
    gc.collect()

or this:

import calliope

for i in range(10):
    m = calliope.examples.national_scale()
    m.run()

(There is a slight difference, because the calliope xarray dataset is cleared each time, but that is a small memory footprint compared to the pyomo objects.)

So there's something going on in pyomo that disables the garbage collector. Maybe it's configurable, but I haven't looked.

@brynpickering (Member)
Alright, here's a solution:

import calliope

for i in range(10):
    m = calliope.examples.national_scale()
    m.run()
    # delete every pyomo component after the run finishes
    for obj in m._backend_model.component_objects():
        m._backend_model.del_component(obj)

Rather than just deleting the Pyomo ConcreteModel object, I use Pyomo's own functionality to delete every single component after finishing the run (constraints, variables, sets, parameters, ...).

From my tests, this works well. Can you check, @fvandebeek (and @ramaroesilva, if you're still interested)?
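
For reuse across scripts, the same cleanup can be wrapped in a small helper (purge_pyomo_backend is just a hypothetical convenience name; materialising the iterator with list() guards against mutating the model while iterating over its components):

def purge_pyomo_backend(model):
    """Delete every pyomo component so the garbage collector can reclaim it."""
    backend = model._backend_model
    for obj in list(backend.component_objects()):
        backend.del_component(obj)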

@fvandebeek
Thanks Bryn, this does indeed prevent most of the leakage, although not all of it:

Iteration 1:
[screenshot: tracemalloc top-10 statistics]

Iteration 2:
[screenshot: tracemalloc top-10 statistics]

Any ideas on how to also keep objective.py, constraint.py, var.py, etc. in check?

@brynpickering (Member)
I can't help on that, sorry. You'll need to investigate other ways of deleting pyomo objects that might not be picked up by the method I found.

brynpickering added the has-workaround label on Oct 13, 2023