feat(config): gnomAD steps configuration extraction and versioning #620


project-defiant
Contributor

@project-defiant project-defiant commented May 24, 2024

✨ Context

The context is described in opentargets/issues#3324.

🛠 What does this PR implement

New functionalities

  • VersionEngine class that infers the version from resource paths. The current implementation adds the extraction interface and a concrete implementation for the GnomAD variants and ld_index steps.
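
As a rough illustration of the idea (not the code in this PR), inferring a version from a resource path could look like the sketch below. The function name `infer_version`, the regular expression, and the example paths are assumptions made up for illustration; the actual VersionEngine interface may differ.

```python
import re

# Hypothetical pattern: a whole path segment that looks like a release
# number, e.g. "4.0" or "2.1.1", optionally prefixed with "v".
_VERSION_SEGMENT = re.compile(r"^v?(\d+\.\d+(?:\.\d+)?)$")


def infer_version(path: str) -> str:
    """Return the first path segment that looks like a release number."""
    for segment in path.split("/"):
        match = _VERSION_SEGMENT.match(segment)
        if match:
            return match.group(1)
    raise ValueError(f"No version-like segment found in path: {path}")


if __name__ == "__main__":
    # Made-up bucket layout, used only to exercise the sketch.
    print(infer_version("gs://example-bucket/release/4.0/ht/sites.ht"))
```

Matching whole path segments, rather than searching the full string, avoids picking up digits that happen to appear inside file names.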

Configuration refactoring

  • Extraction of new input parameters from the GnomAD data sources; these are passed to the steps (LDIndexStep and VariantAnnotationStep).
  • Default values in all API layers (steps, datasources, config) now point to the StepConfig-inherited classes in the ld_index and variant annotation processes, which act as the single source of truth for default values.
  • Update of the airflow config layer with new input parameters to prevent bucket overwriting for GnomAD steps. This is implemented with a new flag, use_version_from_input: when set to True, the version is inferred by the VersionEngine class and appended to the output catalog. The flag defaults to False.
  • Unified path for grch38_grch37 liftover in airflow config.
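
The interaction between the config defaults and the use_version_from_input flag can be sketched as follows. This is a simplified stand-in under stated assumptions, not the gentropy implementation: `ExampleStepConfig`, `resolve_output_path`, the regular expression, and both bucket paths are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class ExampleStepConfig:
    """Hypothetical stand-in for a StepConfig-derived class holding defaults."""

    input_path: str = "gs://example-bucket/release/4.0/sites.ht"
    output_path: str = "gs://example-output/variant_annotation"
    use_version_from_input: bool = False  # off by default, as in the PR description


def resolve_output_path(config: ExampleStepConfig) -> str:
    """Append the version found in the input path when the flag is set."""
    if not config.use_version_from_input:
        return config.output_path
    # Assumed convention: the version is its own path segment, e.g. ".../4.0/...".
    match = re.search(r"/v?(\d+\.\d+(?:\.\d+)?)(?:/|$)", config.input_path)
    if match is None:
        raise ValueError(f"No version found in {config.input_path}")
    return f"{config.output_path.rstrip('/')}/{match.group(1)}"


if __name__ == "__main__":
    print(resolve_output_path(ExampleStepConfig()))
    print(resolve_output_path(ExampleStepConfig(use_version_from_input=True)))
```

Keeping the defaults on one dataclass means a step, a datasource function, and the orchestration config can all read the same values, and writing each release to its own versioned directory prevents a newer input from overwriting an older output.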

Chores

  • Removal of ruff warnings during make check.
  • Added coverage files to ignored.

🙈 Missing

🚦 Before submitting

  • Do these changes cover one single feature (one change at a time)?
  • Did you read the contributor guideline?
  • Did you make sure to update the documentation with your changes?
  • Did you make sure there is no commented out code in this PR?
  • Did you follow conventional commits standards in PR title and commit messages?
  • Did you make sure the branch is up-to-date with the dev branch?
  • Did you write any new necessary tests?
  • Did you make sure the changes pass local tests (make test)?
  • Did you make sure the changes pass pre-commit rules (e.g poetry run pre-commit run --all-files)?

Szymon Szyszkowski added 9 commits May 24, 2024 12:06
Signed-off-by: Szymon Szyszkowski <ss60@mib117351s.internal.sanger.ac.uk>
…nfigs

Configuration updates for:
- [x] ld_index_step
- [x] ld_variant_annotation_step

Both steps and their underlying classes take their default values from
the StepConfig data classes, while preserving the ability to set inputs
at each stage, in case the end user wants to use the step function API,
the step CLI, or the datasource function from the API.
@project-defiant project-defiant linked an issue May 24, 2024 that may be closed by this pull request
7 tasks
@github-actions github-actions bot added documentation Improvements or additions to documentation size-L Step Datasource labels May 24, 2024
@project-defiant project-defiant changed the title [OT-3324] gnomAD steps configuration extraction and versioning feat(config): gnomAD steps configuration extraction and versioning May 24, 2024
Collaborator

@d0choa d0choa left a comment


Everything looks great. I think you understood all the many layers in the first pass.
Thanks!

src/gentropy/datasource/gnomad/variants.py (resolved)
tests/gentropy/common/test_version_engine.py (resolved)
.gitignore (resolved)
@project-defiant project-defiant merged commit c2bfa18 into dev May 28, 2024
4 checks passed
@project-defiant project-defiant deleted the ot-3324-szsz-GnomAD-steps-configuration-extraction-and-versioning branch May 28, 2024 07:49
project-defiant added a commit that referenced this pull request Jun 14, 2024
)

* feat: drop .coverage files from tracked files
* feat: new configuration variables for DAGs
* build(linting): resolved ruff warnings in make check
* build(airflow_config): extract additional input parameters for gnomad steps
* feat(step_config): extracted new input parameters from gnomad step configs

Configuration updates for:
- [x] ld_index_step
- [x] ld_variant_annotation_step

Both steps and their underlying classes take their default values from
the StepConfig data classes, while preserving the ability to set inputs
at each stage, in case the end user wants to use the step function API,
the step CLI, or the datasource function from the API.

* refactor(types): added a file for storing library types
* feat(version_engine): add version engine to infer datasource versions
* docs: added version engine to documentation

---------

Signed-off-by: Szymon Szyszkowski <ss60@mib117351s.internal.sanger.ac.uk>
Co-authored-by: Szymon Szyszkowski <ss60@mib117351s.internal.sanger.ac.uk>
project-defiant pushed a commit that referenced this pull request Jul 12, 2024
Development

Successfully merging this pull request may close these issues.

GnomAD steps configuration extraction and versioning
2 participants