Caching of Non-Equilibrium parameters(Evib, Erot) #511

sagarchotalia · 2022-08-17T19:54:42Z

This pull request is to address the caching of non-equilibrium parameters, so that they can be retrieved and accessed later.

Currently, I've only implemented generation of a cache file each time the function is called. If the direction I'm proceeding in is correct, I shall add further conditions such as when cache="regen", or when a cache file already exists.

Examples tested:

Spectra calculated with CO and CO2 using the following code

s = SpectrumFactory(
    wavelength_min=3000,
    wavelength_max=4000,
    cutoff=1e-27, # All lines included, may increase calculation time
    pressure=1,
    molecule = "CO2",
    isotope="1,2,3",
    truncation=5,
    neighbour_lines=5,
    path_length=0.1,
    mole_fraction=1e-3,
    medium="vacuum",
    optimization=None, # No optimization strategy used
    chunksize=None, # Initialising chunksize as None for comparison
    wstep="auto",
    verbose=2,
)
s.fetch_databank("hitemp", load_energies=True, load_columns="noneq")
# For the HITEMP error causing some columns to have NaN values:
s.df0.drop(s.df0.index[s.df0['v1u']==-1], inplace=True)
s1 = s.non_eq_spectrum(Tvib=2000, Trot=3000)

Also added a test function test_caching_noneq_params() to make sure the caching implementation is working properly, in radis/test/lbl/test_base.py.

Aims to fix #176

codecov-commenter · 2022-08-17T20:08:57Z

Codecov Report

Merging #511 (e44e614) into develop (c3d9d13) will increase coverage by 0.01%.
The diff coverage is 80.00%.

@@             Coverage Diff             @@
##           develop     #511      +/-   ##
===========================================
+ Coverage    73.36%   73.38%   +0.01%     
===========================================
  Files          138      138              
  Lines        19028    19052      +24     
===========================================
+ Hits         13960    13981      +21     
- Misses        5068     5071       +3

radis/lbl/base.py

erwanp · 2022-08-22T11:48:40Z

(btw, to update to the latest branch, you shouldn't Merge but Rebase - when possible - (see it as a "Merge without adding a commit number"; i.e. use it to integrate changes that were not yours) . )

radis/lbl/base.py

…file

radis/lbl/base.py

…us test files to resolve pytest errors

…ons in cache file

erwanp · 2022-09-10T12:08:33Z

radis/lbl/base.py

-                    "wavenumber_min": self.input.wavenum_min,
-                    "wavenumber_max": self.input.wavenum_max,
+                    "wavenum_min": self.input.wavenum_min,
+                    "wavenum_max": self.input.wavenum_max,


In your tests, make sure it also works with wavelength inputs. Normally all inputs are converted to wavenumber so it should work, but better test!

erwanp · 2022-09-10T12:10:40Z

radis/test/lbl/test_base.py

-        assert s.df0["wav"].min() == existing_file_metadata["wavenumber_min"]
-    if "wavenumber_max" in existing_file_metadata:
-        assert s.df0["wav"].max() == existing_file_metadata["wavenumber_max"]
+    if "wavenum_min" in existing_file_metadata:


Test should happen always. If key is not in metadata you should regenerate the cache

…in test function

erwanp · 2022-09-10T12:14:08Z

radis/test/spectrum/test_utils.py

@@ -110,7 +110,7 @@ def test_perf_profile_from_factory(*args, **kwargs):
        molecule="CO",
        verbose=3,
    )
-    sf.load_databank("HITRAN-CO-TEST")
+    sf.load_databank("HITRAN-CO-TEST", load_energies=True)


Hello. Why do you need this absolutely?

I remember that we implemented automatic loading of energies when calling noneq_spectrum if it hadnt been loaded already. Doesnt it work?

erwanp · 2022-09-10T12:16:20Z

radis/test/lbl/test_base.py

+def test_caching_noneq_params(verbose=True, plot=True, *args, **kwargs):
+    """
+    Test that the cache file generated during the computation of
+    non-equilibrium spectra is correct, and has the accurate metadata.


Mention this pull request in docstring. Very useful for maintenance, in months or years from now

And mention this test in the first message of the pull request

erwanp · 2022-09-10T12:18:01Z

radis/test/lbl/test_base.py

+        truncation=5,
+        neighbour_lines=5,
+        path_length=0.1,
+        mole_fraction=1e-3,


For path length, mole fractions, medium, optimization, chunksize, and verbose, you could use the defaukt values and remove them from here. It will make it easier to underatand what you are truly testing

erwanp · 2022-09-12T17:29:48Z

radis/test/lbl/test_base.py

    cache_filename = (
        splitext(s.params.dbpath.split(",", 1)[0])[0]
        + "_extra_columns_EvibErot.h5.extra"
    )
    engine = "pytables"
+    if exists(cache_filename):
+        last_modif = getmtime(cache_filename)


shouldn't last_modif be measured after computing the new spectrum, i.e.> l846 ?

erwanp · 2022-09-12T17:32:04Z

radis/test/lbl/test_base.py

+    s2 = sf.non_eq_spectrum(Tvib=2000, Trot=3000)
+    # Checking proper regeneration of the cache file
+    assert exists(cache_filename)
+    assert getmtime(cache_filename) > last_modif


It seems to me the conditions are the same. Shouldn't the modification time be the same (i.e. file is not regenerated)

erwanp · 2022-09-12T17:34:17Z

radis/test/lbl/test_base.py

    if "isotope" in existing_file_metadata:
        assert isotope == existing_file_metadata["isotope"]
+    else:
+        os.remove(cache_filename)
    if "number_of_lines" in existing_file_metadata:


Wait. Why are you testing this here ??

These validity checks should be in the code itself, not in the test.

Here you should only be testing (and asserting) that if you change isotope, the cache file is properly regenerated (ie new_modif_time > last_modif_time)

Same for wavenumbers, etc.

erwanp · 2022-09-12T17:35:33Z

radis/test/lbl/test_base.py

+        assert sf.input.wavenum_max == existing_file_metadata["wavenum_max"]
+    else:
+        os.remove(cache_filename)
    if "spectroscopic_constant_file" in existing_file_metadata:
        assert (
            spectroscopic_constant_file
            == existing_file_metadata["spectroscopic_constant_file"]
        )
+    else:
+        os.remove(cache_filename)
    if "last_modification" in existing_file_metadata:
        assert (
            time.ctime(getmtime(spectroscopic_constant_file))
            == existing_file_metadata["last_modification"]
        )
+    else:
+        os.remove(cache_filename)
    if "neighbour_lines" in existing_file_metadata:
-        assert s.params.neighbour_lines == existing_file_metadata["neighbour_lines"]
+        assert sf.params.neighbour_lines == existing_file_metadata["neighbour_lines"]
+    else:
+        os.remove(cache_filename)
    if "cutoff" in existing_file_metadata:
-        assert s.params.cutoff == existing_file_metadata["cutoff"]
+        assert sf.params.cutoff == existing_file_metadata["cutoff"]
+    else:
+        os.remove(cache_filename)


same as comment above. You shouldn't be conditionning the presence of "last_modifcation" etc in the metadata.

It must be there. You should even be asserting that it is there.

erwanp · 2022-09-12T17:37:27Z

radis/test/lbl/test_base.py

+        wavelength_min=1500,
+        wavelength_max=2000,
+        cutoff=1e-27,
+        pressure=1,
+        molecule="CO",
+        isotope="1,2",
+        truncation=5,
+        neighbour_lines=5,
+        path_length=0.1,
+        mole_fraction=1e-3,
+        medium="vacuum",
+        optimization=None,  # No optimization strategy used
+        chunksize=None,  # Initialising chunksize as None for comparison
+        wstep="auto",
+        verbose=2,


to simplify reading, can you add all of this in a dictionary and use the same dictionary many times.

conditions = { "wavelength_min":1500, "wavelength_max":2000, #etc, fix for dictionary syntax : cutoff=1e-27, pressure=1, molecule="CO", isotope="1,2", truncation=5, neighbour_lines=5, path_length=0.1, mole_fraction=1e-3, medium="vacuum", optimization=None, # No optimization strategy used chunksize=None, # Initialising chunksize as None for comparison wstep="auto", verbose=2, } s1 = SpectrumFactory(**conditions) s2 = SpectrumFactory(**conditions) # or change one : new_conditions = conditions.copy() new_conditions["wavelength_min=1800"] s3 = SpectrumFactory(**new_conditions)

Also, remove all parametesr that are not relevant. pressure, Path_length, mole_fraction, medium, optimization, chunksize, wstep, verbose are not relevant here. Keep the defaults

erwanp · 2022-10-03T11:00:56Z

Hello @sagarchotalia, I'll release 0.14 this week and I'd like to include your changes. Is all the above ready ?

sagarchotalia · 2022-10-04T18:54:16Z

Hello @sagarchotalia, I'll release 0.14 this week and I'd like to include your changes. Is all the above ready ?

Hello! I'm really sorry, but the changes aren't ready yet as I have my exams going on. I'll try to finish them ASAP!

erwanp · 2022-10-05T10:57:28Z

No problem. We'll just delay to post 0.14 .
Good luck with your exams!

anandxkumar · 2022-10-21T23:02:18Z

@sagarchotalia any update?

erwanp · 2023-07-30T12:11:12Z

Hello @sagarchotalia this is hanging from last year, any update ?

sagarchotalia and others added 2 commits August 18, 2022 00:55

Removed unnecessary import

8d54cee

Merge branch 'radis:develop' into caching-dev

07f395f

erwanp requested changes Aug 17, 2022

View reviewed changes

radis/lbl/base.py Outdated Show resolved Hide resolved

radis/lbl/base.py Outdated Show resolved Hide resolved

Merge branch 'radis:develop' into caching-dev

e44e614

sagarchotalia added this to In progress in GSoC 2022: Performance Tweaks in RADIS via automation Aug 20, 2022

Merge branch 'radis:develop' into caching-dev

e1b38fc

sagarchotalia added 2 commits August 24, 2022 22:26

Added correct path for the cache file, corrected metadata

d9ef631

Changed engine to Pytables to remove NotImplemented errors

c69a395

erwanp requested changes Aug 27, 2022

View reviewed changes

radis/lbl/base.py Outdated Show resolved Hide resolved

radis/lbl/base.py Outdated Show resolved Hide resolved

radis/lbl/base.py Outdated Show resolved Hide resolved

radis/lbl/base.py Outdated Show resolved Hide resolved

GSoC 2022: Performance Tweaks in RADIS automation moved this from In progress to Review in progress Aug 27, 2022

Merge branch 'radis:develop' into caching-dev

ef7bffa

sagarchotalia closed this Aug 28, 2022

sagarchotalia force-pushed the caching-dev branch from ef7bffa to f3b0d4f Compare August 28, 2022 17:00

GSoC 2022: Performance Tweaks in RADIS automation moved this from Review in progress to Done Aug 28, 2022

sagarchotalia reopened this Aug 28, 2022

GSoC 2022: Performance Tweaks in RADIS automation moved this from Done to In progress Aug 28, 2022

sagarchotalia and others added 2 commits August 31, 2022 17:33

Added implementation of caching of Evib and Erot columns

cc63533

Merge branch 'radis:develop' into caching-dev

ef4e771

erwanp reviewed Aug 31, 2022

View reviewed changes

radis/lbl/base.py Outdated Show resolved Hide resolved

radis/lbl/base.py Outdated Show resolved Hide resolved

TranHuuNhatHuy and others added 2 commits September 8, 2022 00:34

add some tiny doc/example

3c57ebc

Changed metadata variable name

b4d90a1

erwanp reviewed Sep 7, 2022

View reviewed changes

radis/lbl/base.py Show resolved Hide resolved

radis/lbl/base.py Outdated Show resolved Hide resolved

Added thorough checking of metadata, moved definition of spec consts …

7832b49

…file

erwanp reviewed Sep 9, 2022

View reviewed changes

radis/lbl/base.py Outdated Show resolved Hide resolved

sagarchotalia added 2 commits September 9, 2022 15:25

Added test; added load_energies argument in load_databank() for vario…

f1989b4

…us test files to resolve pytest errors

Changed wildcard file name, added proper wavenum min and max comparis…

4f46630

…ons in cache file

erwanp reviewed Sep 10, 2022

View reviewed changes

Added removal of cache file upon failing metadata check

29106ef

Added comparison of cache file columns and metadata with internal df …

3be0712

…in test function

erwanp requested changes Sep 12, 2022

View reviewed changes

GSoC 2022: Performance Tweaks in RADIS automation moved this from In progress to Review in progress Sep 12, 2022

sagarchotalia moved this from Review in progress to In progress in GSoC 2022: Performance Tweaks in RADIS Sep 20, 2022

erwanp added this to the 0.15 milestone Oct 5, 2022

minouHub modified the milestones: 0.15, 0.16 Mar 30, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Caching of Non-Equilibrium parameters(Evib, Erot) #511

Caching of Non-Equilibrium parameters(Evib, Erot) #511

sagarchotalia commented Aug 17, 2022 •

edited

codecov-commenter commented Aug 17, 2022 •

edited

erwanp commented Aug 22, 2022

erwanp Sep 10, 2022

erwanp Sep 10, 2022

erwanp Sep 10, 2022

erwanp Sep 10, 2022

erwanp Sep 10, 2022

erwanp Sep 12, 2022

erwanp Sep 12, 2022

erwanp Sep 12, 2022

erwanp Sep 12, 2022

erwanp Sep 12, 2022

erwanp Sep 12, 2022

erwanp commented Oct 3, 2022

sagarchotalia commented Oct 4, 2022

erwanp commented Oct 5, 2022

anandxkumar commented Oct 21, 2022

erwanp commented Jul 30, 2023

Caching of Non-Equilibrium parameters(Evib, Erot) #511

Are you sure you want to change the base?

Caching of Non-Equilibrium parameters(Evib, Erot) #511

Conversation

sagarchotalia commented Aug 17, 2022 • edited

Examples tested:

codecov-commenter commented Aug 17, 2022 • edited

Codecov Report

erwanp commented Aug 22, 2022

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

erwanp commented Oct 3, 2022

sagarchotalia commented Oct 4, 2022

erwanp commented Oct 5, 2022

anandxkumar commented Oct 21, 2022

erwanp commented Jul 30, 2023

sagarchotalia commented Aug 17, 2022 •

edited

codecov-commenter commented Aug 17, 2022 •

edited