Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use pregenerated injection frame file to perform pregenerated injections #4477

Open
wants to merge 36 commits into
base: master
Choose a base branch
from

Conversation

bhooshan-gadre
Copy link
Contributor

  1. We read injection frames and preprocess injection strain the same as strain data
  2. Then add injection strain to the strain data.
  3. One can do injection frames and injection file together as well.
  4. Finally we delete injections.

I am still testing pycbc_inspiral runs for both. When satisfied I will remove WIP.

@spxiwh
Copy link
Contributor

spxiwh commented Sep 5, 2023

I discussed this possibility with @GarethCabournDavies earlier, so I'll ask him to review this. Thanks for looking into this @bhooshan-gadre !

I guess a longer-term motivation for this is: Injections are becoming longer (starting to exceed the limit at which MKL can iFFT), more memory intensive and slower. It feels like we're getting to the point where pregenerating injections, rather than doing them on the fly, is more efficient.

Copy link
Contributor

@GarethCabournDavies GarethCabournDavies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me! - if you could add some images of e.g. qscans from frames made by this, that would be great

The next part of this would be to add the ability to search / add these in the workflow, but that can probably come later

@@ -512,6 +566,33 @@ def insert_strain_option_group(parser, gps_times=True):
help="(optional), Only use frame files where the "
"URL matches the regular expression given.")

## Read frame containing premade O4 injections
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sure you state that these are noiseless injection frames.

@bhooshan-gadre bhooshan-gadre changed the title [WIP] Use pregenerated injection frame file to perform injections for O4 [WIP] Use pregenerated injection frame file to perform pregenerated injections Sep 5, 2023
@GarethCabournDavies
Copy link
Contributor

You probably want something like #4360 to avoid requiring an injection file later

@tdent tdent requested a review from ahnitz September 5, 2023 17:25
@bhooshan-gadre
Copy link
Contributor Author

You probably want something like #4360 to avoid requiring an injection file later

Isn't it require to run full bank over all the injections? This will significantly increase computational cost, especially if we have injection parameter file available.
Still, one cannot use injection_filter_rejector_match_threshold with this.

@GarethCabournDavies
Copy link
Contributor

You probably want something like #4360 to avoid requiring an injection file later

Isn't it require to run full bank over all the injections? This will significantly increase computational cost, especially if we have injection parameter file available. Still, one cannot use injection_filter_rejector_match_threshold with this.

that is actually assuming that we don't have an injection file available. I haven't worked out how to to "injections in frames, but we know when / what they are" yet.

@bhooshan-gadre bhooshan-gadre changed the title [WIP] Use pregenerated injection frame file to perform pregenerated injections Use pregenerated injection frame file to perform pregenerated injections Sep 6, 2023
@bhooshan-gadre
Copy link
Contributor Author

I guess the patch works as expected. There are a couple injections which are not made using injection file (maybe because they may have been partially present in the data. But when we add frames, that is not the case as can be seen here in the snr and chisq (and not reduced) plots.

I will try to satisfy code climate and then it should be ready merge.

@bhooshan-gadre
Copy link
Contributor Author

This PR is now ready to be merged. I have made code-climate as happy as I can.

Copy link
Member

@ahnitz ahnitz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@spxiwh @bhooshan-gadre

I think I see what this is doing. My request is that one consider cutting down on a lot of the code duplication in general here. Secondly, that one rethink the changes the the 'apply' method a bit as it's somewhat intuitive from an API perspective.

+ str(strain.dtype))

lalstrain = strain.lal()
# if not generate_injections and inj_filter_rejector:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Debugging??

# Check if we need to use provided injection file
# FIXME: Shoud this be done at the begining even before creating injector
# instance using parser options?
if (injector and not injection_strain):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code duplication can be avoided here.


if opt.sgburst_injection_file:
if injection_strain is not None:
logging.warn("Burst injection file and frames containing injection"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this a warning?

@@ -200,7 +200,42 @@ def from_cli(opt, dyn_range_fac=1, precision='single',
gating_info = {}

injector = InjectionSet.from_cli(opt)
# Read noiseless injections from frame file
if opt.injection_frame_cache or opt.injection_frame_files or \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section is nearly a complete duplication with the only change really being a couple argument names. Could this be combined or modularized?

@@ -284,12 +319,16 @@ def from_cli(opt, dyn_range_fac=1, precision='single',
if opt.zpk_z and opt.zpk_p and opt.zpk_k:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to early sections, the conditioning is largely duplciated. Can this not be turned into a loop / modular function, so the lines aren't duplicated?

@@ -936,7 +965,8 @@ def slice_and_taper(inj, ts):
return ts

def apply(self, strain, detector_name, distance_scale=1,
injection_sample_rate=None, inj_filter_rejector=None):
injection_sample_rate=None, inj_filter_rejector=None,
generate_injections=True):
if inj_filter_rejector is not None:
raise NotImplementedError("IncoherentFromFile injections do not "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 'generate_injections' part seems like a bit of hack. Is there a reason to do this? It seems like the logic is that some setup is needed for the injection filter rejection, but it seems to add a lot of fiddly bits. It seems very unintuitive what is happening when the 'generate_injections' option is set to false. I would prefer this option be removed.

For injection frames, it would seem you should simply not call this method. And for injection filter rejector you essentially use all the parts of the apply method, so the if statements just complicate what is happening. I'm not sure any computation is really saved since the full injection has to be made in this case.

Overall, the logic seems like it would be more clear if not hidden in this function, and it would take up less space.

@ahnitz
Copy link
Member

ahnitz commented Sep 19, 2023

@bhooshan-gadre FYI, there do appear to be CI failures due to the change in command line options. Here's an example.

  File "/home/runner/work/pycbc/pycbc/.tox/py-docs/lib/python3.8/site-packages/pycbc/strain/strain.py", line 204, in from_cli
    if opt.injection_frame_cache or opt.injection_frame_files or \
AttributeError: 'Namespace' object has no attribute 'injection_frame_cache'

@GarethCabournDavies
Copy link
Contributor

GarethCabournDavies commented Oct 2, 2023

It would be nice to see the command lines for:

  • pregenerated injections
  • no injections
  • injections from a file

This will help us to understand how this would be included in the workflow (and also so that I can test the changes myself more easily)

@spxiwh
Copy link
Contributor

spxiwh commented Oct 23, 2023

@bhooshan-gadre What's the state of this PR, are you going to be able to address comments/suggestions? The functionality here is worth having, but I think the comments/suggestions on implementation are fair, as this is a [necessarily] invasive change. Would prefer not to see it get lost though!

@bhooshan-gadre
Copy link
Contributor Author

@bhooshan-gadre What's the state of this PR, are you going to be able to address comments/suggestions? The functionality here is worth having, but I think the comments/suggestions on implementation are fair, as this is a [necessarily] invasive change. Would prefer not to see it get lost though!

@spxiwh I am on it. I tried to incorporate suggestions in a couple of ways. I will update PR in a couple of days.

Bhooshan Gadre added 5 commits October 24, 2023 01:31
…We read injection frames and preprocess injection strain the same as strain data and then add injection strain to the strain data. One can do injection frames and injection file together as well.
…less injections are available.

If match_threshold is provided for the rejector, it will still try to generate injection then but will not add those to the strain again.
If injection file is not given but inj_filter_rejector is given with premade noiseless injections, it should shout and complain. This is not done yet!
…ate injections on the fly only if noiseless injection frames are not given or even with those rejector has match threshold which is sensible
Bhooshan Gadre and others added 5 commits December 19, 2023 01:42
gwastro#4506)

* Update pycbc_brute_bank to include option to cut wavelength and save the lower frequency

* Removed Logging

* Update pycbc_brute_bank

Still need to fix the completion percentage

* Update pycbc_brute_bank

* Update pycbc_brute_bank

* Update pycbc_brute_bank

Debugged test example

* Update pycbc_brute_bank

* Update pycbc_brute_bank

* Testing Condor

Condor installation is not working properly for the checks after I changed two lines.

* Testing 

Testing with file used before swapping two lines caused errors

* Update pycbc_brute_bank

Swapped two lines to be within the if statment
* Use same sample rate as strain for PSD variation instead of hardcoded 2048 Hz

* make srate int

* strain is already preprocessed (resampled to requested sample rate via opts) so it is not even needed. Does PSD var really need fixed sample rate of 2048?

* Removing unused resample to make codeclimate happy in variation.py

---------

Co-authored-by: Bhooshan Gadre <bhooshan.gadre@ligo.org>
* unexpected events

* added temp-volume in cl

* unexpected events

* added temp-volume in cl

* adding sort-type to commandline

* size of labelpad changed

* added decreasing option in sort-type

* added a comment for sorting arrays

* minor change to dictionary

* clarify sort option

---------

Co-authored-by: Thomas Dent <thomas.dent@usc.es>
* Read DQ flags with finer time resolution

* No default sample rate for dq flags

* No default sample rate for dq

* Add type for frequency command line option
titodalcanton and others added 26 commits December 19, 2023 15:36
* add coordinates_space.py

* add LISA/SSB frame params

* add LISA_to_GEO and GEO_to_LISA

* add coordinates_space into FieldArray

* add doc and Astropy support

* update comments on sympy

* use fsolve from scipy instead

* fix cc issues

* fix cc issues

* minor fix

* update

* not use iteration

* decouple LISA orbit and more accurate Earth

* rename

* remove jplephem

* add the angular displacement of the Earth

* use radians

* make func readable in .ini

* reverse back to master

* correct psi range

* reverse to master

* fix unit issue in earth_position_SSB

* put LISA to the "right" position

* add LISA specific transform classes here

* change names

* update

* make a package for coordinates

* remove coordinates_space import

* move __all__ into __init__.py

* remove all coordinates_space

* change TIME_OFFSET to seconds

* fix SOBHB issue

* rename

* add SSB or LISA params into fid_params

* rename

* fix cc issues

* fix cc issue

* fix cc issue

* update

* update

* fix

* add default names

* overwrite params with same names

* remove pre-fixed names

* remove all pre-fixed names

* not pop

* fix inverse transform

* update tc

* not overwrite

* add SNR support for multi-model

* Update waveform.py

* t0 issue

* t0 issue

* Update space.py

* add obstime

* np.mod(psi_newframe, 2*np.pi)

* fix obstime

* add support for array inputs

* Update hierarchical.py

* just use Alex's implementation

* CustomTransformMultiOutputs is in another PR, so remove it

* add LDC and LAL convention correction

* use pycbc standard names

* more meaningful name

* use pycbc standard names

* Update relbin.py

* Update parameters.py

* remove unnecessary changes

* fix cc issue

* fix cc issue

* fix cc issue

* fix cc issue

* compactify

* compactify

* add __all__ back

* Update transforms.py

* Update transforms.py

* Update test_transforms.py

* Update transforms.py

* update doc

* fix time warning

* Update space.py

* Update test_transforms.py

* Create test_coordinates_space.py

* fix cc issues

* fix cc issues

* fix cc issue

* Update tox.ini

* Update tox.ini

* Update tox.ini

* Update tox.ini

* Update tox.ini

* Update tox.ini

* Update tox.ini

* Update test_coordinates_space.py

* add inline doc

* Update tox.ini

* add check of bbhx

* Update test_coordinates_space.py

* Update tox.ini

* Update test_coordinates_space.py

* add MultibandRelativeTimeDom into hierarchical.py

* Update __init__.py

* Update hierarchical.py

* Update hierarchical.py

* Update relbin.py

* Update hierarchical.py

* Update hierarchical.py

* Update relbin.py

* Update hierarchical.py

* Update hierarchical.py

* Update hierarchical.py

* Update hierarchical.py

* Update hierarchical.py

* Update relbin.py

* Update hierarchical.py

* Update hierarchical.py

* Update relbin.py

* Update __init__.py

* Update space.py

* Update space.py

* Update space.py

* fix psi issue

* Update test_coordinates_space.py

* Update test_coordinates_space.py
* Optimize memory usage of pycbc_fit_sngls_split_binned

* Need add option here

* Comments on PR

* Example doesn't use code being edited!

* Update bin/all_sky_search/pycbc_fit_sngls_split_binned

Co-authored-by: Thomas Dent <thomas.dent@usc.es>

---------

Co-authored-by: Thomas Dent <thomas.dent@usc.es>
Updated to match old stats_dist branch.

Co-authored-by: Francesco Pannarale <pannarale@users.noreply.github.com>
I made this mistake: bad idea

Here I add an explicit warning on the release instruction page so that I am more careful about this in future
…ic for pycbc live (gwastro#4533)

* Modifying files to include psd variation in single detector statistic calculation

* ending variation.py with a blank line

* Changing to an increment agnostic solution

* removing change already fixed

* Updating function names and docstrings

* removing ToDos and adding more helpful comments

* Removing unused import

* Codeclimate fixes

* Removing excess logging and whitespace mistakes

* Removing unused objects + codeclimate fixes

* Updating comments and docstrings, removing matchedfilter changes

* Revert "Updating comments and docstrings, removing matchedfilter changes"

This reverts commit 0e6473a.

* Removing matchedfilter changes, updating comments and docstrings

* Move --verbose to the end of the commands

* more comment updates

* Repositioning filter recreation

* Changes to comments and removing whitespace

Co-authored-by: Thomas Dent <thomas.dent@usc.es>

* removing refchecks

* Adding option veification for psd variation

* Apply suggestions from code review

Co-authored-by: Thomas Dent <thomas.dent@usc.es>

* fixing EOL error

* Refactoring the filter creation function

* codeclimate fixes

* undo

* full_filt func

* removing indentation

* code climate

* code climate

* try to quiet codeclimate

* codeclimate doesn't know PEP8

* brackets obviate line continuation

---------

Co-authored-by: Thomas Dent <thomas.dent@usc.es>
* added scaling of initial pop

* init popn in optimize_di & pso func

* added changes in optimize_pso

* usig logging.debug for snr
* Fix plot_singles and optimize memory

* Tom's comment

* Use np.max

* Remove redundant lines
* Update __init__.py

This would modify init_logging in pycbc to work with a 'count' argument. This means if you give --verbose, the logging level is 'INFO', iv you give it twice (or -vv), it will use 'DEBUG'. 

If you keep giving (verbose) the logging level will decrease in increments in 10, but those don't have canonical names, and would be up to the user to use at their volition.

* Update pycbc/__init__.py

Co-authored-by: Thomas Dent <thomas.dent@usc.es>

* Update __init__.py

* Update __init__.py

* Update pycbc/__init__.py

Co-authored-by: Thomas Dent <thomas.dent@usc.es>

---------

Co-authored-by: Thomas Dent <thomas.dent@usc.es>
…toes' option (gwastro#4549)

* Use gating windows in sngl_minifollowup - also allow 'triggers within vetoes' option

* Minor fixes

* add infor if fewer triggers than requested

* Allow pycbc_sngls_minifollowup to use single ranking statistic

* Fix bug in rank_stat_single for quadsum/phasetd statistics

* mask may be boolean now

* fix broken pycbc_page_snglinfo due to bugfix

* page_snglinfo to use the ranking statistic used

* missed change
* Fix cases where ifar limit was not being applied

* some more missed cases / long lines

* redoing a couple more long lines

* missed one more place

* unrelated bug where the IFOs were not in the right order for use with the significance_dict
* remove removal of downranked triggers

* add brief explanation into the caption
* update lalsimulation cvmfs path

* missed that the mount location needs to be changed as well

* more references to previsou CVMFS location
* add start to quickstart workflow example

* updates

* fixes

* update

* make folder name default with better name

* demonstrate another part

* ws
* Try fixing load_segment_dict

* Use segments/GPS time API more correctly

* Use h5py correctly

* Codeclimate

* Add comment

* condense segment list creation

Co-authored-by: Francesco Pannarale <pannarale@users.noreply.github.com>

* Plural start/end time in file output

* Codeclimate

---------

Co-authored-by: Francesco Pannarale <pannarale@users.noreply.github.com>
* allow pycbc_fit_sngls_split_binned to use generic parameters for splits ans bin parameters

* testing printing

* lazy formattign in logging

* TD comments

* Update bin/all_sky_search/pycbc_fit_sngls_split_binned

Co-authored-by: Thomas Dent <thomas.dent@usc.es>

* code efficiency

* parameter rename

* give choices to options

---------

Co-authored-by: Thomas Dent <thomas.dent@usc.es>
…ints (gwastro#4581)

* Update pycbc_brute_bank to include option to cut wavelength and save the lower frequency

* Removed Logging

* Update pycbc_brute_bank

Still need to fix the completion percentage

* Add index to get bank to try more

* Renamed the variable

* Fixed Syntax
* CI changes to use split_binned hist

* change back to mtotal now that is fixed by gwastro#4577

* revert related but unnecessary change
gwastro#4582)

* Changes to likelihood for multiple modes

* Added multi mode likelihood for wfs that don't need detector response computed.

* Experimenting with multi signal likelihood and bbhx

* Futher likelihood testing for multiple modes

* Test

* Still testing likelihood

* Almost fixed the cross term likelihood

* Testing

* Final tests

* Multi signal likelihood for lisa bbhx cleanup

* Tidy before pr

* Added check for f_lower to set to -1 (pycbc default value) if f_lower not set in params.

* Update docs examples with delta_f value

* Undo hard coded delta_f

* Add error when f_lower not supplied

* Undo previous change from sequence. Added Exception when f_lower not provided in get_fd_det_waveform.

* Removed exception

* Correct indents

* Removed TODO from relbin. Removed delta_f as a required parameter when using the get_fd_det_waveform_sequence method
* add basic support for nessai sampler

* enable all options and resuming in nessai

* fix prior bounds in nessai model

* tweak resuming and samples in nessai interface

* change outdir to avoid namespace conflicts

* tweaks to nessai sampler class

* fix nessai checkpointing and other minor tweaks

* fix for reading in nessai result files

* use callback for checkpointing in nessai

* start addressing codeclimate issues

* add nessai to auxiliary samplers

* add additional comments for nessai

* make simple sampler example 2d

nessai does not support 1d likelihoods, so this change is neede to test nessai in the CI

* fix call to rng.random

* add nessai to samplers example and update plot

* set minimum version for nessai

* force cpu-only version of torch

* add missing epsie jump proposal

* add plot-marginal to samplers plot

* fix whitespace

* use lazy formatting in logging functions

* move functions to common nested class

* update for change common nested class

* address more code climate issues
* prototype fixes

* Update mkl.py

* ws

* also update class interface

* typo
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.