Generalize the APPLgrid exporter #285

janw20 · 2024-04-30T15:26:56Z

Hi,
I implemented the cases not yet implemented in the APPLgrid exporter. This includes:

1. Adding support for exporting PineAPPL grids with multi-dimensional and non-consecutive bins by exporting them as APPLgrids with integer bin limits 0..N. The ordering is the same as the PineAPPL bins, e.g. given by pineappl read --bins. An info message about the bin limits is printed to the user when a PineAPPL grid with multi-dimensional and non-consecutive bins is exported.
2. Adding support for arbitrary Q2, x1, x2 grids
3. Adding the option epsilon = 1e-12 to approx_eq! in lines 240 & 257. Otherwise, this failed for the grid I tested this with, for the two values 0.04979197630496172 and 0.04979197630496281 (the difference is only in the last 3 digits). Or would you handle this differently?

* Add support for multi-dimensional and non-consecutive bins by assigning them integer bin limits 0..N after informing the user
* Add support for arbitrary Q2, x1, x2 grids
* Use the `epsilon` option of `approx_eq!` to not report a false error

cschwan · 2024-05-02T12:47:19Z

Concerning 1): this is a good idea, it's surely better to do something than to error out. There's probably no meaningful alternative to the standard bin limits (0, 1), (1, 2), ... in the general case.

2): The problem with general bin limits is that APPLgrid requires a fixed functional form, which means that the bin limits x_i must be the results of a function x_i = f(i), and f is hard-coded in APPLgrid. In LagrangeSubgridV{1,2} we use the same function and therefore it happens to work in most cases. However, imagine changing the spacing between some of the points between x_min and x_max; in that case the APPLgrid won't give the right results. In other words: you only ever set x_min and x_max, but never the points in between, and they can be random.
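To make the "fixed functional form" point concrete, here is a minimal sketch. It assumes the mapping y(x) = -ln(x) + a(1 - x) with a = 5 and nodes that are equidistant in y, which is my understanding of the spacing meant here. The function names, the tolerance, and the Newton iteration are illustrative and not taken from APPLgrid or PineAPPL.

```rust
/// Assumed transform: y(x) = -ln(x) + a * (1 - x).
fn fy(x: f64, a: f64) -> f64 {
    a * (1.0 - x) - x.ln()
}

/// Inverse of `fy`, found with Newton's method in the variable u = -ln(x).
fn fx(y: f64, a: f64) -> f64 {
    let mut u = y;
    for _ in 0..100 {
        let x = (-u).exp();
        // h(u) = u + a * (1 - exp(-u)) - y and h'(u) = 1 + a * exp(-u)
        let step = (u + a * (1.0 - x) - y) / (1.0 + a * x);
        u -= step;
        if step.abs() <= 1e-15 * u.abs().max(1.0) {
            break;
        }
    }
    (-u).exp()
}

/// Generate `n` nodes from `x_max` down to `x_min`, equidistant in y.
/// Only the two end points and `n` are free; everything in between is fixed.
fn nodes(n: usize, x_min: f64, x_max: f64, a: f64) -> Vec<f64> {
    assert!(n >= 2, "need at least the two end points");
    let (y_lo, y_hi) = (fy(x_max, a), fy(x_min, a));
    let dy = (y_hi - y_lo) / (n - 1) as f64;
    (0..n).map(|i| fx(y_lo + dy * i as f64, a)).collect()
}

/// Check whether given nodes could have come from `nodes`, i.e. whether
/// they are equidistant in y within a relative tolerance.
fn is_equidistant_in_y(xs: &[f64], a: f64, rel_tol: f64) -> bool {
    let ys: Vec<f64> = xs.iter().map(|&x| fy(x, a)).collect();
    if ys.len() < 3 {
        return true;
    }
    let dy = ys[1] - ys[0];
    ys.windows(2)
        .all(|w| ((w[1] - w[0]) - dy).abs() <= rel_tol * dy.abs())
}
```

Under these assumptions, nodes produced by `nodes(50, 2e-7, 1.0, 5.0)` pass `is_equidistant_in_y`, while a hand-picked list with altered spacing does not, which is the case an exported APPLgrid could not represent.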

3): I'd try changing the ulps parameter, there shouldn't be such a large difference (try doubling it).

For points 2) and 3) it would be good to have grids to test that with.

janw20 · 2024-05-02T14:30:40Z

Right, we could throw an error for now if a subgrid is not a LagrangeSubgridVx? Also, I did not know how to extract the interpolation order from the grid values, so I am just leaving it at the default value. Is there a way how to reconstruct the interpolation order from just the grid points? Or should the interpolation order maybe be stored in the grid files?
At ulps ≈ 470 this works without the epsilon, below that it still fails for x = 0.000014007074397454824 and 0.00001400707439745562 in my case. Should I just set it to 512 or is there a more systematic way to determine it? The float_cmp author recommends to set both ulps and epsilon, where epsilon is a small multiple of f64::EPSILON. For me this starts working with about 5.0 * f64::EPSILON with ulps = 128.

cschwan · 2024-05-02T14:42:53Z

> Right, we could throw an error for now if a subgrid is not a LagrangeSubgridVx? Also, I did not know how to extract the interpolation order from the grid values, so I am just leaving it at the default value. Is there a way how to reconstruct the interpolation order from just the grid points? Or should the interpolation order maybe be stored in the grid files?

We can't throw an error, because then optimized grids can't be exported. That would be a big loss.

There's no way to reconstruct the interpolation order from the grid points alone, you can use the 50 that we always use for different interpolation orders. However I don't think the interpolation order in the exported APPLgrid is important, usually you don't want to fill them after having them exported.

> At ulps ≈ 470 this works without the epsilon, below that it still fails for x = 0.000014007074397454824 and 0.00001400707439745562 in my case. Should I just set it to 512 or is there a more systematic way to determine it? The float_cmp author recommends to set both ulps and epsilon, where epsilon is a small multiple of f64::EPSILON. For me this starts working with about 5.0 * f64::EPSILON with ulps = 128.

Set it to 512. I don't think we need an epsilon comparison; I believe this is only needed when we compare against a zero result (0.0 != x for every non-zero x no matter the ulps parameter).
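As an illustration of what the ulps parameter measures, here is a std-only sketch of an ULP distance. It is not the `float_cmp` implementation; the helper names are made up for this example and NaN inputs are not handled.

```rust
/// Map a float to an integer so that adjacent representable floats map to
/// adjacent integers (negative floats map to negative integers).
fn ordered_bits(x: f64) -> i64 {
    let bits = x.to_bits();
    let magnitude = (bits & (u64::MAX >> 1)) as i64;
    if bits >> 63 == 1 {
        -magnitude
    } else {
        magnitude
    }
}

/// Distance between `a` and `b` counted in representable `f64` steps.
fn ulps_between(a: f64, b: f64) -> u64 {
    ordered_bits(a).abs_diff(ordered_bits(b))
}

/// Hypothetical stand-in for an ulps-only comparison.
fn approx_eq_ulps(a: f64, b: f64, ulps: u64) -> bool {
    ulps_between(a, b) <= ulps
}
```

With this definition the two x values quoted above are a few hundred steps apart, so a limit of 512 accepts them, whereas zero and any ordinary non-zero number are an enormous number of steps apart, which is why an ulps-only check never treats them as equal.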

janw20 · 2024-05-02T15:30:33Z

I could also add a --ulps argument to pineappl export, which lets the user set a different value than the value of 512, with 512 then being a hard-coded default value?

cschwan · 2024-05-03T09:19:22Z

> I could also add a --ulps argument to pineappl export, which lets the user set a different value than the value of 512, with 512 then being a hard-coded default value?

I don't think this is needed.

janw20 · 2024-05-03T13:25:00Z

Okay, I set ulps = 512 now in the checks of the x grid and removed the epsilon argument. From my side this can be merged

cschwan · 2024-05-04T06:59:18Z

What's missing are tests that check the features that you implemented. We need a PineAPPL grid

* with multidimensional bin limits that have bin widths not equal to one, and one
* with custom x-grid values.

Those two cases are possibly broken, so we'd better make sure to test them.

cschwan · 2024-05-04T07:14:50Z

pineappl_cli/src/export/applgrid.rs

+
+    let x_grid = grid
+        .subgrids()
+        .slice(s![order, bin, ..])


This probably doesn't work quite as expected. APPLgrid stores different channels in a single appl_igrid, which has common parameter values NQ2, NQ2min, NQ2max, NQorder, Nx, xmin, xmax, xorder. This means that:

* if several channels have different SubgridParam/ExtraSubgridParam values we should error out, unless this is a DIS grid, in which case we ignore the second dimension. In PineAPPL the second dimension can be either x1 or x2
* the flat_map should therefore probably be a simple map and we must make sure that for all channels the entries are the same
* the same applies also for mu_grid

janw20 · 2024-05-06

The OPTIMIZE_SUBGRID_TYPE optimization can change the ends of the x and mu2 grid points, right? So I would check if the grid points are equal modulo some endpoints, and then use the union as the x points that will be passed to APPLgrid?

cschwan · 2024-05-06

This would address one point of it, yes. It'll get interesting if you've got a single channel with different minimums/maximums.
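The "equal modulo some endpoints, then take the union" idea can be sketched as follows. This is an illustration only: it works on plain slices of node values sorted in ascending order rather than on PineAPPL's subgrid types, and all function names and the relative tolerance are assumptions.

```rust
/// Relative comparison of two node values.
fn close(a: f64, b: f64, rel_tol: f64) -> bool {
    (a - b).abs() <= rel_tol * a.abs().max(b.abs())
}

/// Union of all node values, sorted ascending; values that agree within
/// `rel_tol` are merged into one.
fn union_nodes(grids: &[Vec<f64>], rel_tol: f64) -> Vec<f64> {
    let mut all: Vec<f64> = grids.iter().flatten().copied().collect();
    all.sort_by(|a, b| a.total_cmp(b));
    all.dedup_by(|a, b| close(*a, *b, rel_tol));
    all
}

/// True if `grid` (ascending) is a contiguous window of `union`, i.e. it
/// differs from the union only by points trimmed at either end.
fn is_window(grid: &[f64], union: &[f64], rel_tol: f64) -> bool {
    grid.is_empty()
        || union
            .windows(grid.len())
            .any(|w| w.iter().zip(grid).all(|(&u, &g)| close(u, g, rel_tol)))
}

/// Return the common node values for all channels, or an error naming the
/// first channel whose nodes do not fit into the union.
fn common_nodes(grids: &[Vec<f64>], rel_tol: f64) -> Result<Vec<f64>, String> {
    let union = union_nodes(grids, rel_tol);
    match grids.iter().position(|g| !is_window(g, &union, rel_tol)) {
        Some(channel) => Err(format!("channel {channel} has incompatible node values")),
        None => Ok(union),
    }
}
```

Channels whose nodes were only trimmed at the ends are accepted and share the union, while a channel with a missing interior point is rejected. The sketch does not check that the union itself follows the spacing APPLgrid expects.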

cschwan · 2024-05-04T07:23:04Z

pineappl_cli/src/export/applgrid.rs

-        .chain(limits.last().map(|vec| vec[0].1))
-        .collect();
+    let limits = if integer_bin_limits {
+        (0..=limits.len()).map(|x| x as f64).collect::<Vec<_>>()


Here there's very likely a normalization factor missing. If you change the bin sizes you must correct the cross sections.

janw20 · 2024-05-06

You're right, thanks. Does APPLgrid also divide by the bin widths in the end?

cschwan · 2024-05-06

AFAICR yes!
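A possible shape of the missing normalization factor, as a sketch only. It assumes that APPLgrid divides each bin by the width implied by its limits (recalled above, not verified) and that the original PineAPPL bin normalizations are available as a slice. `integer_limits` and `rescale_factors` are hypothetical helpers, not functions of the exporter.

```rust
/// Integer bin limits 0, 1, ..., N for N bins.
fn integer_limits(n_bins: usize) -> Vec<f64> {
    (0..=n_bins).map(|i| i as f64).collect()
}

/// Per-bin factor to multiply the exported weights with, so that dividing
/// by the new bin width gives the same result as dividing by the old
/// normalization: factor = new_width / old_normalization.
fn rescale_factors(old_normalizations: &[f64], new_limits: &[f64]) -> Vec<f64> {
    old_normalizations
        .iter()
        .zip(new_limits.windows(2))
        .map(|(norm, w)| (w[1] - w[0]) / norm)
        .collect()
}
```

For integer limits every new width is one, so under these assumptions the factor reduces to the inverse of the original bin normalization.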

janw20 · 2024-05-10T12:32:23Z

While working on this, I realized: Why are the SubgridParams reconstructed? Why aren't the SubgridParams from the Grid used, e.g. by exposing them in a pub fn subgrid_params()?

cschwan · 2024-05-10T12:38:40Z

> Why are the SubgridParams reconstructed? Why aren't the SubgridParams from the Grid used, e.g. by exposing them in a pub fn subgrid_params()?

Not every Subgrid necessarily has SubgridParams in the way understood by LagrangeSubgridV{1,2}. ImportOnlySubgridV{1,2} may have arbitrary x-node values, for instance linearly-spaced ones like [0.1, 0.2, 0.3, 0.4, 0.5]. These points are impossible to generate with the LagrangeSubgridV{1,2}.

janw20 · 2024-05-10T12:57:07Z

> linearly-spaced ones like [0.1, 0.2, 0.3, 0.4, 0.5]

But the exporter can't handle these anyway, since APPLgrid can't handle them, right? (At least that's how I understood your first comment in this pull request.) So it would only matter (for the exporter) how the LagrangeSubgridV{1,2} handles the SubgridParams.

The ExtraSubgridParams are also not much of an issue, I think, since APPLgrid doesn't support different x1 and x2 values.

Also, is the Grid::subgrid_params field updated when optimizing the grid? Because if not, using this field would even solve the issue of stripped-away grid points at the start and the end of the grid axes.

janw20 · 2024-05-10T13:11:02Z

One could also properly document a potential Grid::subgrid_params() function, i.e. say that these are the parameters the subgrids were initialized with, not necessarily the current parameters.

cschwan · 2024-05-10T13:36:18Z

> linearly-spaced ones like [0.1, 0.2, 0.3, 0.4, 0.5]
>
> But the exporter can't handle these anyway, since APPLgrid can't handle them, right? (At least that's how I understood your first comment in this pull request.) So it would only matter (for the exporter) how the LagrangeSubgridV{1,2} handles the SubgridParams.

Yes, APPLgrid only understands the grid-node values spaced as LagrangeSubgridV{1,2} uses them.

> The ExtraSubgridParams are also not much of an issue, I think, since APPLgrid doesn't support different x1 and x2 values.

It does support DIS grids, however, where one of the dimensions is trivial (zero and usually one x-grid value in the unused x-dimension).

> Also, is the Grid::subgrid_params field updated when optimizing the grid? Because if not, using this field would even solve the issue of stripped-away grid points at the start and the end of the grid axes.

No it's not updated, but you can't rely on this field, because you can merge any kind of subgrid into any Grid. Grid::subgrid_params is solely for the purpose of filling temporarily empty subgrids (and this is an optimization, see the implementation of Grid::fill).

janw20 · 2024-05-10T13:47:53Z

> No it's not updated, but you can't rely on this field, because you can merge any kind of subgrid into any Grid.

I see, this would be a showstopper for using Grid::subgrid_params then. Thanks for the clarification.

Generalize the APPLgrid exporter

ca8dd1c

* Add support for multi-dimensional and non-consecutive bins by assigning them integer bin limits 0..N after informing the user
* Add support for arbitrary Q2, x1, x2 grids
* Use the `epsilon` option of `approx_eq!` to not report a false error

Increase ulps and remove epsilon in approx_eq!

503c12c

cschwan requested changes May 4, 2024

Remove leftover comment

2159c87
