-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Addition of SoftDrop from the RecursiveTools Contrib #235
Conversation
Conflicts: tests/test_002-exclusive_jets.py
for more information, see https://pre-commit.ci
Just a suggestion: instead of returning a set of filtered jets, you could return an array of booleans. On the Python side, you can use the booleans to slice the original jets and get the filtered jets, but you can also use those booleans for other things. You could, for instance, apply logical operators on groomings with different parameters, to see which jets would be groomed with one set of parameters and not another. That would be harder if you had to identify jets in the output set that are "the same jet." Also if you do input jets → grooming algorithm → boolean array → slice → filtered jets, the filtered jets share memory with the input jets. It's a view, rather than a copy. |
This is a very good suggestion - especial for environments like HL-LHC. @cmoore24-24 do you see how to do this? |
Is there room for both ideas, potentially? I can definitely see how returning booleans rather than the filtered jets can be helpful when memory is at a premium, but having fastjet do some of the calculation legwork w/r/t the jets is also pretty useful. So maybe keep this implementation as-is, and then down the line have something like As for how to do it, no I'm not really sure off the top of my head. When I have the time to take another look, I can try to figure it out, though. |
You could have one additional kwarg that's "return_as_mask" or have two functions |
for more information, see https://pre-commit.ci
Conflicts: src/fastjet/_generalevent.py src/fastjet/_singleevent.py
for more information, see https://pre-commit.ci
I've added a way to return a boolean mask that can be applied to a PF candidates array to return filtered jets. As for the implementation; I may be wrong, but I don't believe there's an easy way to make softdrop preserve anything like candidate ID or index (though if there is and I missed it, then feel free to let me know). So the only way I could think to do this was explicit 4 vector matching via nested loops.
As a result, it isn't a terribly quick process, but it does work and produce the desired boolean array. |
We need to do better than quadruple nested loop, this will die in bad ways for HL-LHC analyses. I'll think about a way forward. Would not merge as is. |
At this point would it be worthwhile reverting this merge to the pre-mask state and making the addition of the mask a separate PR? |
Hello, I have removed the attempt at adding a mask option for softdrop. That work can serve as a starting point for a separate pull request that adds a mask to softdrop at a later date. The PR is now in its initial state; it returns filtered jet constituents, as well as some attributes of the filtered jets themselves. @jmduarte, do you have any comments on the PR as it is now? Can it be merged as is, or are there changes that need to be made? Best, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, I think this is a really useful PR, thanks! I agree the masking can be left for a future PR. I can merge this week unless there are any further comments @jpivarski @jmduarte @lgray.
Fix typos
This PR includes the addition of the SoftDrop grooming tool from the RecursiveTools section of fastjet-contrib. The defaults are set to those used in CMS and/or the defaults recommended by the authors of SoftDrop. That said, the options for both the SymmetryCut and RecursionChoice fields have been included. At the moment, the soft drop grooming outputs a record array of the jet constituents, as well as a few values of interest of the jet itself.
As a small change, EnergyCorrelator has been slightly adjusted, so that the default invocation of the generic ECF class will return the normalized values. The unnormalized values can be returned with a "noramlized=False" argument.