Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Plotting bug in diffNuisances.py due to counting all parameters via regex flag --regex ".*" #789

Open
ammitra opened this issue Aug 31, 2022 · 3 comments

Comments

@ammitra
Copy link

ammitra commented Aug 31, 2022

This issue occurs due to the changes introduced in commit cdc68c1, wherein the TH1Fs containing the nuisances and post-fit errors are artificially lengthened due to counting all parameters if the default --regex ".*" option is provided.

Expected Behavior: Running python diffNuisances.py fitDiagnosticsTest.root --abs -g nuisance_pulls.root should produce a ROOT file containing nuisances and post_fit_errs canvases with only the parameters that meet the default val and sig tolerances:

fixed_nuisance_pulls

Current Behavior: Instead, the two TH1Fs have a number of bins equal to the total number of parameters, even though not all parameters meet the criteria to be plotted:

broken_nuisance_pulls

This issue arises due to the introduction of the np_count variable on lines 103-115, which defines the number of bins in the nuisances and post_fit_errs histograms. For instance, in the example I posted above, the total np_count comes to 204 in my analysis, while the only parameters of interest are the ones shown in the expected behavior image. The expected behavior plots were derived by reverting the changes made in the previously mentioned commit.

I'm not sure I understand the reasoning behind setting the number of bins to be np_count - wouldn't the original method of setting the number of bins based on the size of the nuisances_prefit RooArgSet be the most general solution? Thank you!

@nsmith-
Copy link
Collaborator

nsmith- commented Aug 31, 2022

@vicha-w may have a comment, but it looks to me that options.show_all_parameters should be checked before deciding the np_count. I think nuisances_prefit would still be more than just the ones out of tolerance

@vicha-w
Copy link
Contributor

vicha-w commented Aug 31, 2022

Hi. I looked at the code again and noted that, if we want to revert back to setting the number of bins based on the size of nuisances_prefit, then we would have to rely on the number of variables present in nuisances_prefit. I do not fully understand why nuisances_prefit would not contain all parameters in fit_s or fit_b.

The original idea behind --regex option is for me to filter a set of nuisance parameters that would appear in the plot. I guess that in my use case nuisances_prefit has a complete set of NPs, not only parameters that meet tolerances. Since the loop starting from line 120 loops over NPs in fit_s object, I decided to count the bins that pass the criteria from fit_s object with np_count. Also, as far as I understand, there is no way to filter NPs with regex queries from nuisances_prefit object. I may be wrong, so please correct me on this.

So let me propose a fix: instead of counting NPs from fit_s, the script should count NPs that would pass regex criteria from nuisances_prefit object instead. This can be fixed in lines 104-107. However, I am not sure how it will work with the loop starting from line 120, since it loops over all NPs from fit_s.

Do you happen to know if this would work, or would there be a better solution for this?

@ajgilbert
Copy link
Collaborator

Hi, I didn't look into the details of this problem yet, but to clarify: nuisances_prefit will only contain those uncertainties with an associated constraint term (i.e. not rateParams or other unconstrained parameters associated with parametric models). In that sense, it will generally be a subset of the parameters in fit_s. Since nuisances_prefit is a RooArgSet, it should be possible to iterate through and run a regex on the name of each RooRealVar it contains.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants