Handling vetoes and timeslides in pygrb mini followups #4941

Merged
merged 9 commits into from
Nov 19, 2024

pannarale
Contributor

This PR is the third in the series started in PR #4929.

Standard information about the request (and the following ones that will be linked to this)

This is a: a new feature enabling veto definer file usage in PyGRB. Utilities and scripts in results production are being streamlined along the way.

This change affects: PyGRB

This change changes: result presentation / plotting and scientific output.

If this change breaks the standard automated test running --help for PyGRB plotting scripts, I will add some workarounds to avoid this. If needed, these will likely be empty functions: the plotting scripts will be progressively renovated in the whole series of PRs.

Motivation

Now that the workflow generator passes around the veto definer file, the mini followups need to handle and use it. In checking pycbc_pygrb_minifollowups I noticed that it is better to enforce that the correct timeslide is being plotted.

Contents

  • Enable passing veto and segments files to pycbc_pygrb_minifollowups and the jobs it prepares.
  • Figure out the correct timeslide to be plotted (e.g., zero-lag for the loudest onsource event) and ensure it is the one used in time series plots.
  • Somewhat unrelated, but in preparation of other PRs, the utility construct_trials was improved and the utility extract_basic_trig_properties was generalized to extract_trig_properties in which the user specifies what datasets need to be extracted from the trigger file.

Testing performed

The totality of the changes that will be broken down in multiple PRs was tested on GRB 170817A data by producing a full results webpage (see here).

  • The author of this pull request confirms they will adhere to the code of conduct

@pannarale pannarale added the PyGRB PyGRB development label Nov 14, 2024
@pannarale pannarale self-assigned this Nov 14, 2024
Comment on lines +201 to +204
logging.info('Processing event: %s', num_event+1)
gps_time = fp['GPS time'][num_event]
gps_time = gps_time.astype(float)
tags = args.tags + [str(num_event+1)]
Contributor Author

Getting these to match the row numbering we see in the loudest offsource event / quiet injections tables.

else:
    for i_ifo, ifo in enumerate(ifos):
        time_shift = fp[ifo+' time shift (s)'][num_event]
        ifo_time = gps_time + time_shift
Contributor Author

The change from - to + is intentional here. This was sliding the wrong way!
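For readers following the sign fix, here is a minimal sketch of the corrected convention, in which the per-detector shift is added to the event time. The dataset names 'GPS time' and '<ifo> time shift (s)' come from the diff above; the plain dictionary standing in for the HDF5 file, the helper name shifted_ifo_times, and all numerical values are assumptions made for illustration only.

```python
# Stand-in for the HDF5 file of followed-up events. The dataset names
# mirror those in the diff; the values are invented for illustration.
fp = {
    'GPS time': [1187008882.0, 1187008900.0],
    'H1 time shift (s)': [0.0, 8.0],
    'L1 time shift (s)': [0.0, -8.0],
}
ifos = ['H1', 'L1']


def shifted_ifo_times(fp, ifos, num_event):
    """Return the time to plot in each detector for one event.

    Hypothetical helper: the shift is ADDED to the event time, which is
    the convention the fix in this PR adopts.
    """
    gps_time = float(fp['GPS time'][num_event])
    return {ifo: gps_time + float(fp[ifo + ' time shift (s)'][num_event])
            for ifo in ifos}


# Event 0 has zero shifts (zero-lag); event 1 belongs to a time slide.
zero_lag = shifted_ifo_times(fp, ifos, 0)
slid = shifted_ifo_times(fp, ifos, 1)
```

With a zero shift every detector is plotted at the event time itself, so the sign only matters for time-slid events, which is presumably why the wrong direction went unnoticed for zero-lag followups.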

Contributor

@MarcoCusinato MarcoCusinato left a comment

The changes to pygrb_postprocessing_utils.py, as they stand right now, would break pycbc_pygrb_efficiency, pycbc_pygrb_page_tables, and pycbc_pygrb_plot_stats_distribution. So these three files need some adjustments.

bin/pygrb/pycbc_pygrb_minifollowups (resolved)

-    return trig_time, trig_snr, trig_bestnr
+    return found_trigs
Contributor

pycbc_pygrb_efficiency, pycbc_pygrb_page_tables, and pycbc_pygrb_plot_stats_distribution are using this function and need three outputs not to break.

Contributor Author

Correct, in the development version these scripts will call extract_trig_properties to get these quantities. But, I am breaking down the many changes in multiple PRs so that the diffs may be parsed with reasonable effort: one big PR would be complicated to handle for a reviewer.

Contributor

I agree with breaking it down, but approving this PR before any of the others would make it difficult to run workflows without issues.

Contributor Author

@pannarale pannarale Nov 18, 2024

This was true for quite a while until PR #4909, when PyCBC became able to generate a (working) PyGRB workflow, but without vetoes. As with the previous "PR series" that took us to that point, the idea is that we are enabling a big new feature and, by breaking the changes down, accepting that the intermediate states of PyCBC between PR #4909 and the completion of this PR review work will break PyGRB again. At that point we will have a PyCBC in which PyGRB workflows are fully working and with vetoes. It will then be the right moment to have a new CI/CD test with a small PyGRB search and hence stop this modus operandi. At the moment PyGRB is already broken on master :-)


# Sort the triggers into each slide
sorted_trigs = sort_trigs(trial_dict, trigs, slide_dict, seg_dict)
logger.info("Triggers sorted.")
n_surviving_trigs = sum([len(i) for i in sorted_trigs.values()])
Contributor

I would leave the log info here

Contributor Author

I removed this because the logging info below is more informative, while still telling the user the triggers were sorted.
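As an illustration of the point about logging, a single message can both confirm that sorting finished and report how many triggers survived. This is only a sketch: the helper name count_surviving, the message wording, and the toy sorted_trigs dictionary are assumptions, and only the counting expression follows the diff above.

```python
import logging

logger = logging.getLogger(__name__)


def count_surviving(sorted_trigs):
    """Count the triggers kept across all slides and log the total.

    Hypothetical helper: one message replaces a bare "Triggers sorted."
    line followed by a separate count.
    """
    n_surviving = sum(len(ids) for ids in sorted_trigs.values())
    logger.info("Triggers sorted: %d survive across %d slides.",
                n_surviving, len(sorted_trigs))
    return n_surviving


# Toy stand-in: slide ID -> identifiers of the triggers kept in it.
sorted_trigs = {0: [3, 7], 1: [1], 2: []}
n_surviving_trigs = count_surviving(sorted_trigs)
```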

-    """Extract and store as dictionaries time, SNR, and BestNR of
-    time-slid triggers"""
+def extract_trig_properties(trial_dict, trigs, slide_dict, seg_dict, keys):
+    """Extract and store as dictionaries specific keys of time-slid
Contributor

I agree with the change of name of the function. But since this and its precursor do two different things, shouldn't we keep them both? (see also comment on the output)

Contributor Author

See the answer above. The call to the old function will be removed throughout PyGRB.

Comment on lines -488 to -490
trig_bestnr[slide_id] = reweightedsnr_cut(
trigs['network/reweighted_snr'][indices],
opts.newsnr_threshold)
Contributor

Aren't we using this anymore? If just one of the functions is kept, maybe one could return the found_trigs as well as the bestNR?

Contributor Author

The user will get 'network/reweighted_snr' by calling extract_trig_properties with 'network/reweighted_snr' among the keys. So this function now returns the found_trigs dictionary with the keys requested (and 'network/reweighted_snr' may be one of these).
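To make the key-based extraction concrete, here is a simplified sketch of the idea, not the actual PyCBC code. The helper extract_by_keys, its slide_indices argument (the real function works from trial_dict, slide_dict and seg_dict), the 'example/time' dataset, and the key-then-slide layout of the returned dictionary are all assumptions; only 'network/reweighted_snr' and the name found_trigs appear in this PR.

```python
def extract_by_keys(trigs, slide_indices, keys):
    """Collect the requested datasets for each time slide.

    Hypothetical helper: `slide_indices` maps a slide ID to the indices
    of the triggers kept for that slide. The result is laid out as
    found_trigs[key][slide_id], which is an assumed layout.
    """
    found_trigs = {}
    for key in keys:
        found_trigs[key] = {
            slide_id: [trigs[key][i] for i in indices]
            for slide_id, indices in slide_indices.items()
        }
    return found_trigs


# Toy trigger datasets; 'example/time' is an invented dataset name.
trigs = {
    'example/time': [10.0, 11.0, 12.0, 13.0],
    'network/reweighted_snr': [6.1, 7.4, 5.9, 8.2],
}
slide_indices = {0: [0, 2], 1: [1, 3]}

# Only the datasets named in `keys` are extracted.
found_trigs = extract_by_keys(trigs, slide_indices,
                              ['network/reweighted_snr'])
```

A caller that previously unpacked three return values would, under this design, request the three corresponding keys and read each one from the returned dictionary.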

@pannarale
Contributor Author

> The changes on the pygrb_postprocessing_utils.py as right now would break pycbc_pygrb_efficiency, pycbc_pygrb_page_tables, and pycbc_pygrb_plot_stats_distribution. So these three files need some adjustments

Yes, correct, and the changes will be part of the next PRs in this series. If I were to produce a single PR, it would be too big for the reviewer(s) to parse.

@MarcoCusinato MarcoCusinato merged commit 161bcea into gwastro:master Nov 19, 2024
29 checks passed
@pannarale pannarale deleted the pygrb_vetoes branch November 19, 2024 16:52
prayush pushed a commit to prayush/pycbc that referenced this pull request Nov 21, 2024
* Use veto and segments files in pycbc_pygrb_minifollowups; enforce using the correct timeslide when following triggers/injections

* Added note about pycbc_pygrb_minifollowups in pycbc_pygrb_page_tables

* Updated construct_trials and generalized extract_basic_trig_properties to extract_trig_properties

* Edited var name to something more meaningful

* More accurate docstring

* Aesthetics and missing import

* Simplified pycbc_pygrb_minifollowups and fixed a sign!

* Removed empty line

* f-string and bare except
Labels
PyGRB PyGRB development
Projects
Status: Done
2 participants