Correctly recompute PU weights in case of an upper bound #87
The current code has a severe bug that fills the PU weights with NaNs (not-a-number values) whenever at least one PU weight needs to be cropped:
nanoAOD-tools/src/WeightCalculatorFromHistogram.cc
Lines 103 to 104 in 2012f6f
nanoAOD-tools/src/WeightCalculatorFromHistogram.cc
Lines 92 to 94 in 2012f6f
The problem is that when the while loop goes beyond the first iteration, N (= the number of bins in the MC histogram) more values are appended to the `cropped` variable. Then `checkIntegral` loops over the size of `cropped`, and the running index `i` is used to retrieve the number of events (`refvals_`), so the index runs past the end of that vector. The computed integrals and final weights become pure garbage. Resetting `cropped` in every iteration didn't work either.
For these reasons @veelken and I decided to redo this logic by scaling the largest weight and adjusting the remaining weights until the largest weight approaches `hardmax` (defaults to 3.) or, equivalently, until the new integral is close enough to the original one. The two parameters involved are `hardmax` and `maxshift` (defaults to 0.0025).
Below is an extreme illustration of how the algorithm works:
In practice, large PU weights are assigned to only a handful of events, so the above reweighting procedure barely affects any other weight, provided the input Ntuple contains a reasonable number of events (which it should if we want to get an accurate PU profile).
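
For readers who want to see the idea in isolation, here is a minimal standalone sketch of such a capping procedure. It is not the code of this PR: the helper names `weightedIntegral` and `capWeights`, the 5% step by which the cap is lowered, and the reading of `maxshift` as the largest tolerated relative change of the weighted integral are all assumptions made for illustration. The point it demonstrates is that the cropped vector is rebuilt with exactly one entry per bin in every iteration, so indexing into the MC event counts can never run out of bounds.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Sum over bins of (weight * number of MC events in that bin).
double weightedIntegral(const std::vector<double>& weights,
                        const std::vector<double>& mcCounts) {
  assert(weights.size() == mcCounts.size());
  double sum = 0.;
  for (std::size_t i = 0; i < weights.size(); ++i) sum += weights[i] * mcCounts[i];
  return sum;
}

// Hypothetical helper (not part of nanoAOD-tools): lower a cap on the weights
// step by step and rescale so that the weighted integral is preserved.
// Assumption: maxshift is the largest tolerated relative integral change.
std::vector<double> capWeights(const std::vector<double>& weights,
                               const std::vector<double>& mcCounts,
                               double hardmax = 3., double maxshift = 0.0025,
                               double step = 0.95 /* assumed step size */) {
  const double original = weightedIntegral(weights, mcCounts);
  if (weights.empty() || original <= 0.) return weights;

  double cap = *std::max_element(weights.begin(), weights.end());
  std::vector<double> best = weights;
  while (cap > hardmax) {
    cap = std::max(cap * step, hardmax);
    // Fresh vector each iteration: always one entry per bin.
    std::vector<double> cropped(weights.size());
    for (std::size_t i = 0; i < weights.size(); ++i)
      cropped[i] = std::min(weights[i], cap);

    const double croppedIntegral = weightedIntegral(cropped, mcCounts);
    const double shift = (original - croppedIntegral) / original;
    if (std::fabs(shift) > maxshift) break;  // cropping would distort too much

    // Adjust all weights so that the integral matches the original one.
    const double scale = original / croppedIntegral;
    for (double& w : cropped) w *= scale;
    best = cropped;
  }
  return best;
}
```

Calling `capWeights({1., 1., 10.}, {100000., 100000., 1.})` lowers the single large weight to roughly `hardmax` while the other two weights move only by a tiny common factor. After the rescaling the largest weight sits marginally above the cap, which is one reason to think of the cap as being approached rather than hit exactly.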