feat(eqtl): add preprocessing dag #366

Closed
wants to merge 6 commits into from

@ireneisdoomed ireneisdoomed commented Dec 19, 2023

This PR includes:

  • A new DAG of steps that runs the whole preprocessing of eQTL Catalogue.
  • The eqtl_catalogue step redefined as eqtl_catalogue_ingestion. This is the script that produces a study index and harmonises the raw summary statistics.

When I ran the step, I hit a bottleneck at the eqtl_catalogue_ingestion level and the job was cancelled.
I want to test it again with the recent fixes introduced in #369; it might work now.
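
As a rough illustration of what a DAG of steps means here, the sketch below orders steps by their dependencies using Python's standard-library graphlib. It is not the Airflow code from this PR (that lives in src/airflow/dags/eqtl_catalogue_preprocess.py). Only eqtl_catalogue_ingestion is a step named in this PR; the other step names are hypothetical placeholders.

```python
from graphlib import TopologicalSorter

# Map each step to the set of steps it depends on.
# Only "eqtl_catalogue_ingestion" is named in this PR; the other
# step names are hypothetical placeholders for later preprocessing steps.
dependencies = {
    "eqtl_catalogue_ingestion": set(),
    "hypothetical_downstream_step": {"eqtl_catalogue_ingestion"},
    "hypothetical_final_step": {"hypothetical_downstream_step"},
}


def execution_order(deps):
    """Return step names in an order that respects every dependency."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

Because every later step depends, directly or indirectly, on the ingestion step, a bottleneck there would hold up the rest of the run.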

@codecov-commenter

Codecov Report

Merging #366 (5dbebce) into dev (42b366c) will increase coverage by 0.24%.
Report is 20 commits behind head on dev.
The diff coverage is 91.30%.


@@            Coverage Diff             @@
##              dev     #366      +/-   ##
==========================================
+ Coverage   85.67%   85.92%   +0.24%     
==========================================
  Files          89       90       +1     
  Lines        2101     2117      +16     
==========================================
+ Hits         1800     1819      +19     
+ Misses        301      298       -3     

Files                                            Coverage Δ
src/airflow/dags/eqtl_catalogue_preprocess.py    100.00% <100.00%> (ø)
src/otg/dataset/l2g_feature_matrix.py            82.92% <ø> (+7.31%) ⬆️
src/otg/eqtl_catalogue_ingestion.py              61.90% <100.00%> (ø)
src/otg/method/l2g/evaluator.py                  38.70% <100.00%> (ø)
src/otg/method/l2g/model.py                      58.33% <50.00%> (+0.50%) ⬆️

@ireneisdoomed
Contributor Author

I don't think the changes here are relevant any more.

  • The plan is to define credible sets directly from the eQTL Catalogue fine-mapping results, and not from fine-mapping the summary statistics with PICS.
  • This PR also had a technical limitation in how the DAG handled its inputs, so I never managed to do a full run.

Since we don't have any business logic here and we only touch the Airflow layer, let's just close this.

@ireneisdoomed ireneisdoomed deleted the il-eqtl-dag branch July 15, 2024 14:46