NAS-friendly extraction #322

Closed
ngreenwald opened this issue Feb 28, 2023 · 3 comments · Fixed by #342

@ngreenwald
Member

Is your feature request related to a problem? Please describe.
Storing every bin file indefinitely is likely unnecessary. Before anyone removes their bin files after a year, we'll want to have extracted all potentially relevant information from them, so that everyone is confident nothing is lost.

Describe the solution you'd like
We'll want to extract the mass-proficient (x+0.3) window in addition to our current mass-deficient (x-0.3) setup. We'll also want to extract all potentially relevant antibody and naturally abundant channels. We'll then need to decide how to handle the proficient and deficient copies of each channel so that rosetta and normalization work as expected.

@ngreenwald ngreenwald added the enhancement New feature or request label Feb 28, 2023
@alex-l-kong
Contributor

alex-l-kong commented Feb 28, 2023

This list is subject to change, but to accommodate these changes we'll need to make the following adjustments:

  • extract_bin_files: we'll need to run this separately for panel = (0.0, 0.3). This is equivalent to having a panel file where start and stop are equal to mass_value and mass_value + 0.3, respectively, for each target.
  • This will need to be two separate calls (one deficient, one proficient), each with its own out_dir. We can suffix the out_dirs with _deficient and _proficient respectively.
  • Defining the channels will (I believe) need to be done in the panel file. We might need to change sample_panel.csv to account for this, since I'm not sure it contains all the relevant targets (and we might want to add a mass-proficient sample_panel.csv as well).
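
To make the two-call plan above concrete, here is a minimal sketch of how the per-target windows and suffixed output directories could be laid out. It does not call extract_bin_files. The target names, masses, the `Target`/`Mass`/`Start`/`Stop` keys, and the helper names are all hypothetical and used for illustration only.

```python
from pathlib import Path

# Hypothetical targets and masses, for illustration only.
EXAMPLE_TARGETS = {"TargetA": 100.0, "TargetB": 150.0}

# Offsets relative to each target's mass value, as discussed in this thread.
WINDOWS = {
    "deficient": (-0.3, 0.0),
    "proficient": (0.0, 0.3),
}


def build_panel_rows(targets, window):
    """Return one row per target with explicit start/stop integration bounds."""
    start_offset, stop_offset = window
    rows = []
    for name, mass in targets.items():
        rows.append({
            "Target": name,
            "Mass": mass,
            # Round to avoid floating-point noise in the written bounds.
            "Start": round(mass + start_offset, 4),
            "Stop": round(mass + stop_offset, 4),
        })
    return rows


def plan_extractions(base_out_dir, targets):
    """Pair each window with its own suffixed output directory."""
    base = Path(base_out_dir)
    plan = {}
    for label, window in WINDOWS.items():
        out_dir = base.with_name(f"{base.name}_{label}")
        plan[label] = (out_dir, build_panel_rows(targets, window))
    return plan


if __name__ == "__main__":
    for label, (out_dir, rows) in plan_extractions("extracted/run1", EXAMPLE_TARGETS).items():
        print(label, out_dir, rows)
```

Each entry in the resulting plan would correspond to one extraction call, with the rows standing in for an explicit panel file and the directory standing in for that call's out_dir.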

This assumes that we need explicit separation of proficient and deficient channels. I'm not sure whether integrating over (-0.3, 0.3) would achieve the same result.
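
As a toy illustration of the question above: if counts are simply summed over a window, a single (-0.3, 0.3) window yields the sum of the two halves, so the total is preserved but the split between deficient and proficient counts is not recoverable afterward. The event masses, the half-open window convention, and the function name below are assumptions made for this sketch, not a description of how bin files are actually integrated.

```python
def count_in_window(event_masses, center, window):
    """Count events whose mass falls in [center + lo, center + hi)."""
    lo, hi = center + window[0], center + window[1]
    return sum(1 for m in event_masses if lo <= m < hi)


# Hypothetical event masses around a hypothetical target at mass 100.0.
events = [99.75, 99.9, 99.95, 100.1, 100.25]

deficient = count_in_window(events, 100.0, (-0.3, 0.0))
proficient = count_in_window(events, 100.0, (0.0, 0.3))
combined = count_in_window(events, 100.0, (-0.3, 0.3))

# The combined window keeps the total but not the per-side breakdown.
print(deficient, proficient, combined)
```

Whether the separate counts are actually needed downstream (for rosetta or normalization) is the open question in this thread; the sketch only shows what information a combined window would discard.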

@alex-l-kong
Contributor

This issue may or may not be dependent on #313. If multiple people are doing a run in the immediate future, we might be able to wait until we fully test the intermediate callbacks.

@alex-l-kong
Contributor

Closing this issue for now, will reopen once someone chooses to figure out which channels need specific mass ranges. We'll need to define custom panels for this.
