Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some changes to HFile.select() and SingleDetTriggers #4545

Merged

Conversation

GarethCabournDavies
Copy link
Contributor

This is an attempt to sort #4524

These changes allow:

  • any of the rankings from ranking.py to be used in the HFile.select() function
  • A pre-sort directly within SingleDetectorTriggers, so that the HFile.select() functionality can used for simple things easily
  • A pre-cut in mask_to_n_loudest_clustered_events in cases where this is appropriate

So far as I can test, this has not broken previous functionality except for one usage, which is the filter string option in pycbc_plot_singles_vs_params

ahnitz
ahnitz previously requested changes Oct 27, 2023
pycbc/io/hdf.py Outdated Show resolved Hide resolved
@GarethCabournDavies
Copy link
Contributor Author

Testing on the updates to the executables which use these are ongoing. I will point to the success (or make changes) when they are complete

@GarethCabournDavies
Copy link
Contributor Author

I am going to try and un-break the filter string option as the next step in this

@GarethCabournDavies GarethCabournDavies force-pushed the hfile_select_ranking branch 2 times, most recently from 4b9327e to b3b6073 Compare November 24, 2023 15:45
@GarethCabournDavies
Copy link
Contributor Author

Okay, I think this is ready to review now

@GarethCabournDavies GarethCabournDavies changed the title Some changes to HFile.select() and mask_to_n_loudest_clustered Some changes to HFile.select() and SingleDetTriggers Nov 29, 2023
@GarethCabournDavies
Copy link
Contributor Author

Poke @ahnitz and @spxiwh

Copy link
Contributor

@spxiwh spxiwh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Gareth. There's some comments/suggestions from me here, but I would like to have @ahnitz 's input before merging this. In general no big worries here, just some aesthetics, and a couple of ideas to help future proof.

bin/minifollowups/pycbc_page_snglinfo Outdated Show resolved Hide resolved
bin/plotting/pycbc_plot_singles_vs_params Outdated Show resolved Hide resolved
bin/plotting/pycbc_plot_singles_vs_params Outdated Show resolved Hide resolved
examples/gw150914/PyCBCInspiral.ipynb Outdated Show resolved Hide resolved
pycbc/io/hdf.py Show resolved Hide resolved
pycbc/io/hdf.py Outdated Show resolved Hide resolved
pycbc/io/hdf.py Outdated Show resolved Hide resolved
pycbc/io/hdf.py Outdated Show resolved Hide resolved
pycbc/io/hdf.py Show resolved Hide resolved
pycbc/io/hdf.py Show resolved Hide resolved
@@ -299,6 +299,21 @@ def get_newsnr_sgveto_psdvar_scaled_threshold(trigs):
'newsnr_sgveto_psdvar_scaled_threshold': get_newsnr_sgveto_psdvar_scaled_threshold,
}

# Lists of datasets required in the trigs object for each function
required_datasets = {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make this linked to the function themselves? I'm just wondering if this is really maintainable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are not linked, the dictionary will need updating as more functions are added into this module.

One way to add this would be if we define the dictionary before the functions, and all functions have an assert that the required_datasets are in trigs. Then if another function is made (likely copied and adapted), then people will naturally add the list into this dictionary. This doesn't make it more maintainable, but does add a psychological incentive to add the list

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added the check, so anyone using get_sngls_ranking_from_trigs without adding the required datasets will have a problem

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've had to abandon this idea, as the get_sngls_ranking_from_trigs function is used by quite a few different input classes for trigs, so making the check general enough was over-complicated (in my mind at least)

@ahnitz
Copy link
Member

ahnitz commented Dec 13, 2023

@GarethCabournDavies I don't see too much to comment on anymore,, except the one thing. Also can you comment or resolve the issues above? It's not quite clear what is still be working on and what the choice made was in response.

@GarethCabournDavies
Copy link
Contributor Author

@GarethCabournDavies I don't see too much to comment on anymore,, except the one thing. Also can you comment or resolve the issues above? It's not quite clear what is still be working on and what the choice made was in response.

I was leaving them open until I have finished testing and committed, but will resolve as soon as possible

@GarethCabournDavies
Copy link
Contributor Author

Noticed an error in the way pre-masks are handled alongside the new filter_rank in SingleDetTriggers (it's not actually used anywhere in the codebase, but just to be safe), I think I have the fix ready, but this is not to merge before then

@GarethCabournDavies
Copy link
Contributor Author

Fix to the above is in, so please approve if you are happy

@tdent
Copy link
Contributor

tdent commented Jan 8, 2024

Where are we with this, looks like just waiting for review?

@GarethCabournDavies
Copy link
Contributor Author

Where are we with this, looks like just waiting for review?

Yup - the last update was in the day or so before christmas break, so not many working days to have had a look at this

@GarethCabournDavies
Copy link
Contributor Author

Poke for review on this

Copy link
Contributor

@spxiwh spxiwh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy with the changes now. I think this would also require sign-off from @ahnitz before it could be merged.

@spxiwh
Copy link
Contributor

spxiwh commented Jan 29, 2024

... And there's now conflicts.

@spxiwh
Copy link
Contributor

spxiwh commented Feb 2, 2024

@ahnitz Poke

Copy link
Member

@ahnitz ahnitz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the late reply, yes, I think this has addressed the issues I originally raised, and I don't see anything too major. Always cleanup can be done, but I think this leaves it in a better state than it was.

@GarethCabournDavies GarethCabournDavies merged commit ec8bd5d into gwastro:master Feb 5, 2024
33 checks passed
bhooshan-gadre pushed a commit to bhooshan-gadre/pycbc that referenced this pull request Mar 4, 2024
* Some changes to HFile.select() and mask_to_n_loudest_clustered

* missed comma

* Some CC problemas

* Update

* revert changes which meant kwargs couldnt be passed to mask_to_n_loudest_clustered_events

* CC / fixes

* Use list rather than numpy array for mask

* remove messing about with the mask dtyping

* initialise a transparent mask if it is not given

* Fix problem when the filter value may not be available to the HFile

* thinko in pycbc_bin_trigger_rates_dq

* Don't make a huge boolean mask of all True values and then run nonzero on it = BAD IDEA

* CC

* re-enable option to have no mask

* efficiency saving in pycbc_plot_hist

* no mask option in trig_dict

* typo

* allow None maks in mask_size

* thought i could apply another mask but I cant

* get filter_func back, use new interface for recent changes

* Some minor fixes

* cc

* allow option to do a logical and on masks

* various minor bugfixes

* max z problem

* fixes to pycbc_page_snglinfo

* use alternative dict for beter clarity

* IWH comments

* add check for required datasets (mostly to remind people to add the required dataset list if they add a new ranking\!)

* fix bad check, remove min_snr option

* remove option from CI

* Fix bug with applying premask and using filter_rank threshold at the same time

* stop checking for required datasets - this is difficult when so may different types use the same function

* CC prefers np.flatnonzero(arr) to arr.nonzero()[0]

* Fix a couple of rarely-used scripts, remove unused imports, whitespace
lpathak97 pushed a commit to lpathak97/pycbc that referenced this pull request Mar 13, 2024
* Some changes to HFile.select() and mask_to_n_loudest_clustered

* missed comma

* Some CC problemas

* Update

* revert changes which meant kwargs couldnt be passed to mask_to_n_loudest_clustered_events

* CC / fixes

* Use list rather than numpy array for mask

* remove messing about with the mask dtyping

* initialise a transparent mask if it is not given

* Fix problem when the filter value may not be available to the HFile

* thinko in pycbc_bin_trigger_rates_dq

* Don't make a huge boolean mask of all True values and then run nonzero on it = BAD IDEA

* CC

* re-enable option to have no mask

* efficiency saving in pycbc_plot_hist

* no mask option in trig_dict

* typo

* allow None maks in mask_size

* thought i could apply another mask but I cant

* get filter_func back, use new interface for recent changes

* Some minor fixes

* cc

* allow option to do a logical and on masks

* various minor bugfixes

* max z problem

* fixes to pycbc_page_snglinfo

* use alternative dict for beter clarity

* IWH comments

* add check for required datasets (mostly to remind people to add the required dataset list if they add a new ranking\!)

* fix bad check, remove min_snr option

* remove option from CI

* Fix bug with applying premask and using filter_rank threshold at the same time

* stop checking for required datasets - this is difficult when so may different types use the same function

* CC prefers np.flatnonzero(arr) to arr.nonzero()[0]

* Fix a couple of rarely-used scripts, remove unused imports, whitespace
acorreia61201 pushed a commit to acorreia61201/pycbc that referenced this pull request Apr 4, 2024
* Some changes to HFile.select() and mask_to_n_loudest_clustered

* missed comma

* Some CC problemas

* Update

* revert changes which meant kwargs couldnt be passed to mask_to_n_loudest_clustered_events

* CC / fixes

* Use list rather than numpy array for mask

* remove messing about with the mask dtyping

* initialise a transparent mask if it is not given

* Fix problem when the filter value may not be available to the HFile

* thinko in pycbc_bin_trigger_rates_dq

* Don't make a huge boolean mask of all True values and then run nonzero on it = BAD IDEA

* CC

* re-enable option to have no mask

* efficiency saving in pycbc_plot_hist

* no mask option in trig_dict

* typo

* allow None maks in mask_size

* thought i could apply another mask but I cant

* get filter_func back, use new interface for recent changes

* Some minor fixes

* cc

* allow option to do a logical and on masks

* various minor bugfixes

* max z problem

* fixes to pycbc_page_snglinfo

* use alternative dict for beter clarity

* IWH comments

* add check for required datasets (mostly to remind people to add the required dataset list if they add a new ranking\!)

* fix bad check, remove min_snr option

* remove option from CI

* Fix bug with applying premask and using filter_rank threshold at the same time

* stop checking for required datasets - this is difficult when so may different types use the same function

* CC prefers np.flatnonzero(arr) to arr.nonzero()[0]

* Fix a couple of rarely-used scripts, remove unused imports, whitespace
@GarethCabournDavies GarethCabournDavies deleted the hfile_select_ranking branch April 8, 2024 08:43
GarethCabournDavies added a commit to GarethCabournDavies/pycbc that referenced this pull request Jul 24, 2024
* Some changes to HFile.select() and mask_to_n_loudest_clustered

* missed comma

* Some CC problemas

* Update

* revert changes which meant kwargs couldnt be passed to mask_to_n_loudest_clustered_events

* CC / fixes

* Use list rather than numpy array for mask

* remove messing about with the mask dtyping

* initialise a transparent mask if it is not given

* Fix problem when the filter value may not be available to the HFile

* thinko in pycbc_bin_trigger_rates_dq

* Don't make a huge boolean mask of all True values and then run nonzero on it = BAD IDEA

* CC

* re-enable option to have no mask

* efficiency saving in pycbc_plot_hist

* no mask option in trig_dict

* typo

* allow None maks in mask_size

* thought i could apply another mask but I cant

* get filter_func back, use new interface for recent changes

* Some minor fixes

* cc

* allow option to do a logical and on masks

* various minor bugfixes

* max z problem

* fixes to pycbc_page_snglinfo

* use alternative dict for beter clarity

* IWH comments

* add check for required datasets (mostly to remind people to add the required dataset list if they add a new ranking\!)

* fix bad check, remove min_snr option

* remove option from CI

* Fix bug with applying premask and using filter_rank threshold at the same time

* stop checking for required datasets - this is difficult when so may different types use the same function

* CC prefers np.flatnonzero(arr) to arr.nonzero()[0]

* Fix a couple of rarely-used scripts, remove unused imports, whitespace
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants