Add frequency channel masking capability #125

Open
wfarah opened this issue Dec 6, 2020 · 11 comments
@wfarah

wfarah commented Dec 6, 2020

It would be great to add a flag to turboSETI that makes it read a "frequency mask": a file of space-separated frequency channel numbers that the user wishes to ignore in the Doppler search. These are channels where the user knows that radio frequency interference is present. The masked channels should be replaced with 0 before the Doppler search is performed.
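
A minimal sketch of the requested behaviour, assuming the spectra are held as a list of rows (one per time integration) of per-channel power values. `apply_channel_mask` and `read_channel_list` are hypothetical helpers for illustration, not existing turboSETI functions; real data would more likely be a NumPy array.

```python
def read_channel_list(text):
    """Parse space-separated channel numbers into a list of ints."""
    return [int(token) for token in text.split()]


def apply_channel_mask(spectra, masked_channels):
    """Return a copy of spectra with the masked fine channels set to 0.

    spectra: list of rows, each row a list of power values per fine channel.
    masked_channels: iterable of nonnegative channel indexes to ignore.
    """
    n_chans = len(spectra[0]) if spectra else 0
    masked = set(masked_channels)
    bad = sorted(c for c in masked if not 0 <= c < n_chans)
    if bad:
        raise ValueError(f"channel indexes out of range: {bad}")
    # Build new rows so the caller's data is left untouched.
    return [
        [0.0 if chan in masked else value for chan, value in enumerate(row)]
        for row in spectra
    ]


if __name__ == "__main__":
    spectra = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]
    print(apply_channel_mask(spectra, read_channel_list("1 3")))
```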

texadactyl self-assigned this Dec 7, 2020
@texadactyl
Contributor

texadactyl commented Dec 7, 2020

cc: @telegraphic @siemion @luigifcruz

After a discussion with @wfarah, this is my understanding of the requirements for this enhancement issue:

  • It is desirable to be able to ignore (mask out) specified fine-channel frequencies (or ranges thereof) during a turboSETI executable run -or- during execution of FindDoppler.search() as called from another Python program.
  • Feasibility and possible approaches are TBD.
  • The fine-channels would be specified in a separate text editable file as a list of fine-channel mask specifications.
  • Each mask specification is the index of a single channel (nonnegative integer) or a range (low to high) of channel indexes.

Sample mask file:
# Ignore fine-channels 2, 4, 6, and 14 through 22.
2
4
6
14-22
# End of specifications

It is imagined that turboSETI and class FindDoppler would have new parameters:

  • A boolean indicating that masking is enabled or disabled (default: disabled)
  • The full path of the mask file when masking is enabled (default: none)

Parameter errors include:

  • Masking enabled but no mask file was specified.
  • Mask file not found, or the file is unreadable for whatever reason.
  • Syntax error in a specification inside the mask file (e.g. not a nonnegative integer).
  • A mask specification that is out of range of the channels as defined in the HDF5 header. This applies to a single fine-channel index or a range specification.
  • Two separate specifications overlap.
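
The specification format and the in-file errors listed above could be checked along these lines. This is only a sketch: `parse_channel_mask` and its `n_fine_chans` argument are hypothetical names, and the "masking enabled but no file" and "file not found" cases are assumed to be handled by the caller before the lines are read.

```python
def parse_channel_mask(lines, n_fine_chans):
    """Parse mask specifications into a sorted list of (low, high) index ranges.

    lines: iterable of text lines from the mask file.
    n_fine_chans: number of fine channels, e.g. taken from the file header.
    Raises ValueError naming the offending line number(s) on any error.
    """
    ranges = []
    for lineno, raw in enumerate(lines, start=1):
        spec = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not spec:
            continue
        low_text, sep, high_text = spec.partition("-")
        if not sep:
            high_text = low_text  # a single index is a one-channel range
        low_text, high_text = low_text.strip(), high_text.strip()
        if not (low_text.isdecimal() and high_text.isdecimal()):
            raise ValueError(f"line {lineno}: not a channel index or range: {spec!r}")
        low, high = int(low_text), int(high_text)
        if low > high:
            raise ValueError(f"line {lineno}: range low exceeds high: {spec!r}")
        if high >= n_fine_chans:
            raise ValueError(f"line {lineno}: {spec!r} is outside 0..{n_fine_chans - 1}")
        ranges.append((low, high, lineno))
    # After sorting, overlapping specifications are always adjacent.
    ranges.sort()
    for (_, prev_high, prev_line), (low, _, lineno) in zip(ranges, ranges[1:]):
        if low <= prev_high:
            raise ValueError(f"lines {prev_line} and {lineno}: specifications overlap")
    return [(low, high) for low, high, _ in ranges]


if __name__ == "__main__":
    sample = ["# Ignore fine-channels 2, 4, 6, and 14 through 22.",
              "2", "4", "6", "14-22", "# End of specifications"]
    print(parse_channel_mask(sample, n_fine_chans=32))
```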

Any discussion points? Please comment.

@texadactyl
Contributor

texadactyl commented Dec 7, 2020

Afterthought: Is it more desirable to express mask specifications in terms of frequency value? If so, maybe only range specifications make sense, similar to how blimpy-dice uses f_start and f_stop parameters to govern its behavior.

E.g.
# Ignore frequencies close to 6300, range 8100 to 8200, and range 9700 to 9900.
freq 8100 8200
freq 9700 9900
freq 6295 6305
# End of specifications
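
A sketch of how such `freq <low> <high>` lines might be read, assuming the values use the same frequency unit as the file header. `parse_freq_mask` is a hypothetical name, not part of turbo_seti or blimpy.

```python
def parse_freq_mask(lines):
    """Parse 'freq <low> <high>' lines into a sorted list of (low, high) ranges."""
    ranges = []
    for lineno, raw in enumerate(lines, start=1):
        fields = raw.split("#", 1)[0].split()  # drop comments, split on spaces
        if not fields:
            continue
        if len(fields) != 3 or fields[0] != "freq":
            raise ValueError(f"line {lineno}: expected 'freq <low> <high>'")
        try:
            low, high = float(fields[1]), float(fields[2])
        except ValueError:
            raise ValueError(f"line {lineno}: frequencies must be numbers") from None
        if not low < high:  # also rejects NaN values
            raise ValueError(f"line {lineno}: low must be less than high")
        ranges.append((low, high))
    return sorted(ranges)


if __name__ == "__main__":
    sample = ["# Ignore frequencies close to 6300 and two wider ranges.",
              "freq 8100 8200", "freq 9700 9900", "freq 6295 6305"]
    print(parse_freq_mask(sample))
```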

In the future, if need be, we can enhance specifications to use different criteria.

texadactyl removed their assignment Dec 7, 2020
@texadactyl
Contributor

texadactyl commented Dec 8, 2020

Assuming that the "afterthought" is the way to go (use frequency values instead of channel numbers), I think that I found a good place to throw out frequency ranges that are observed to be noisy in a previous turboSETI run.

Inside the turbo_seti data_handler.py class DATAHandle, there is a __split_h5() function which builds the coarse channel table for subsequent search() processing. Each entry has a starting frequency and ending frequency. It looks like a good place to mask. Just need to pass in parameters from FindDoppler.search() all the way.

Overlap cases: mask-low:mask-high vs f_start:f_stop
A coarse channel could be partially masked out.
Could just keep the wee bit that is not masked out.
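
The overlap cases can be sketched as interval subtraction: take one coarse channel's `f_start:f_stop` and remove every mask range from it, keeping whatever remains. This assumes `f_start < f_stop` (files stored in descending frequency order would need the bounds swapped first); `unmasked_parts` is a hypothetical helper, not something `__split_h5()` does today.

```python
def unmasked_parts(f_start, f_stop, masks):
    """Return the parts of [f_start, f_stop] not covered by any mask range.

    masks: iterable of (mask_low, mask_high) pairs, in any order; may overlap.
    """
    kept = []
    cursor = f_start  # lowest frequency not yet classified
    for mask_low, mask_high in sorted(masks):
        if mask_high <= cursor:
            continue  # mask lies entirely below what remains
        if mask_low >= f_stop:
            break  # this mask and all later ones lie above the channel
        if mask_low > cursor:
            kept.append((cursor, mask_low))  # gap before this mask survives
        cursor = mask_high
        if cursor >= f_stop:
            break
    if cursor < f_stop:
        kept.append((cursor, f_stop))  # the bit above the last mask
    return kept


if __name__ == "__main__":
    mask = [(8100, 8200)]
    print(unmasked_parts(8000, 8300, mask))  # mask sits inside the channel
    print(unmasked_parts(8150, 8300, mask))  # channel partially masked
    print(unmasked_parts(8120, 8180, mask))  # channel fully masked
```

An empty result means the whole coarse channel is masked and could be dropped from the table.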

Comments? @wfarah @telegraphic @siemion @luigifcruz

If this looks like it adds value to turbo_seti, then we need a more formal feasibility / system concept document to be reviewed.

@wfarah
Author

wfarah commented Dec 9, 2020

I think this is a good idea @texadactyl, and my assumption is that the searching should not be affected by the zeroed channels. As a matter of fact, I learned from a discussion with Dave that some GBT data products have frequency channels that are 0s (because some processing nodes failed). TurboSETI does not complain about them.

@luigifcruz
Contributor

I think the "afterthought" is indeed the best way to go with this. In addition to that, we could also add a whitelist. It can be useful if the user is interested only in a certain frequency range. This can be achieved with a YAML configuration file. I'm thinking something like this:

whitelist:
    range:
        - start: 403e6
          end: 405e6
    # and/or
    point:
        - center_freq: 1.705e9
          bandwidth: 2e6
# and/or
blacklist:
    range:
        - start: 88e6
          end: 107e6
        - start: 2.3e9
          end: 2.9e9
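
One possible reading of this configuration, sketched on a plain dict of the shape a YAML loader might return. Several things here are assumptions rather than part of the proposal: each list entry carries both of its keys, a `point` means `center_freq` plus or minus half the `bandwidth`, and when both lists are present a frequency must fall inside the whitelist and outside the blacklist. Values pass through `float()` because some YAML loaders return numbers such as `403e6` as strings. `to_ranges` and `is_searched` are hypothetical names.

```python
def to_ranges(section):
    """Flatten one whitelist/blacklist section into (low, high) ranges in Hz."""
    ranges = []
    for entry in section.get("range", []):
        ranges.append((float(entry["start"]), float(entry["end"])))
    for entry in section.get("point", []):
        center = float(entry["center_freq"])
        half = float(entry["bandwidth"]) / 2.0
        ranges.append((center - half, center + half))
    return ranges


def is_searched(freq_hz, config):
    """True if freq_hz passes the whitelist (if any) and misses the blacklist."""
    def inside(ranges):
        return any(low <= freq_hz <= high for low, high in ranges)

    whitelist = to_ranges(config.get("whitelist", {}))
    blacklist = to_ranges(config.get("blacklist", {}))
    if whitelist and not inside(whitelist):
        return False
    return not inside(blacklist)


if __name__ == "__main__":
    config = {
        "blacklist": {
            "range": [
                {"start": "88e6", "end": "107e6"},
                {"start": 2.3e9, "end": 2.9e9},
            ]
        }
    }
    print(is_searched(100e6, config), is_searched(1.4e9, config))
```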

Sorry for the delay!

@wfarah
Author

wfarah commented Dec 18, 2020

Folks, any updates on this?

@luigifcruz
Contributor

luigifcruz commented Dec 18, 2020

@wfarah Yes, we have an implementation plan and should start development later today.

@wfarah
Author

wfarah commented Dec 18, 2020

Perfect, thanks @luigifcruz!

texadactyl added a commit that referenced this issue Dec 31, 2020
[WIP] Add frequency channel masking capability. Issue #125.
texadactyl added a commit that referenced this issue Dec 31, 2020
Revert "[WIP] Add frequency channel masking capability. Issue #125."
@telegraphic
Collaborator

Going through the open issues today so chiming in: this is a great idea and would love to see it 👍

@texadactyl
Contributor

texadactyl commented Jul 2, 2021

@wfarah

@telegraphic and I have not forgotten this feature request. I just added it to hyperseti's list too.

Note that there are 2 pinned issues in turbo_seti:

  1. Correct find_event.calc_freq_range() #231 - significant design flaw <--- Done!
  2. Add frequency channel masking capability #125 - highly useful enhancement

That is the priority list order for turbo_seti. If hyperseti can address the 2nd issue before someone in turbo_seti can get to it, then it probably will not be done in turbo_seti (guessing).

@texadactyl
Contributor

For whenever this feature is revived:

  • Sample files.
  • What are the diagnostics when there are errors in the YAML file?
