feat: adding GWAS Catalog study curation #17

DSuveges · 2023-12-08T16:21:10Z

As described in issue #3132, the manually curated GWAS Catalog studies will be stored in the curation repo. As it stands, the code responsible for annotating GWAS Catalog studies based on curation can consume this table as is, even when it is given the URL of the file in the repo rather than a local path.
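
For illustration, here is a minimal sketch of reading the curation table from either a local path or a URL. It assumes the file is tab-separated with a header row, which this thread does not state. The names `parse_curation` and `read_curation` are hypothetical and are not the pipeline's actual code.

```python
import csv
import io
from urllib.request import urlopen


def parse_curation(text: str) -> list[dict[str, str]]:
    """Parse tab-separated curation text (assumed format) into row dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def read_curation(source: str) -> list[dict[str, str]]:
    """Read the curation table from a local path or an http(s) URL."""
    if source.startswith(("http://", "https://")):
        # Remote file, e.g. the raw URL of the table in the curation repo.
        with urlopen(source) as response:
            text = response.read().decode("utf-8")
    else:
        with open(source, encoding="utf-8") as handle:
            text = handle.read()
    return parse_curation(text)
```

Keeping the parsing separate from the fetching means the same rows come out regardless of where the file lives.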

ireneisdoomed

Thank you for adding and documenting the file! I envision having L2G's curation here as well.
I have comments specifically about the column names; let me know your thoughts.

README.md

ireneisdoomed · 2023-12-22T15:43:58Z

docs/genetics.md

+### Schema
+
+- **studyId** - GCST study accession to identify study
+- **analysisFlag** - comment on the applied statistical method authors used that might have downstream implication in our pipelines.


The column name in the data is upateAnalysisFlags.
Looking at the possible categories ("Case-case study", "GxG", "Multivariate analysis", "GxE"), I'd rename this field to studySubType

Although these labels might well fall into a "subtype" category (I 100% agree with that), there is no requirement that this field contain only such labels.

ireneisdoomed · 2023-12-22T15:46:44Z

docs/genetics.md

+### Schema
+
+- **studyId** - GCST study accession to identify study
+- **analysisFlag** - comment on the applied statistical method authors used that might have downstream implication in our pipelines.


Suggested change

- **analysisFlag** - comment on the applied statistical method authors used that might have downstream implication in our pipelines.

- **studySubType** - description of the specific statistical methodology employed in the GWAS.

ireneisdoomed · 2023-12-22T15:50:25Z

docs/genetics.md

+
+- **studyId** - GCST study accession to identify study
+- **analysisFlag** - comment on the applied statistical method authors used that might have downstream implication in our pipelines.
+- **updateStudyType** - if a study is not really a GWAS, but a qtl. This string will be picked up and replace the `type` value in the study index.


I'd leave the comment on how we use this field to the pipeline documentation. Therefore, I suggest naming it studyType.

Suggested change

- **updateStudyType** - if a study is not really a GWAS, but a qtl. This string will be picked up and replace the `type` value in the study index.

- **studyType** - categorises the study as either GWAS or molQTL.

ireneisdoomed · 2023-12-22T15:52:24Z

docs/genetics.md

+- **studyId** - GCST study accession to identify study
+- **analysisFlag** - comment on the applied statistical method authors used that might have downstream implication in our pipelines.
+- **updateStudyType** - if a study is not really a GWAS, but a qtl. This string will be picked up and replace the `type` value in the study index.
+- **qualityControls** - `|` separated list of identified issues that prevent the study from ingestion.


The column name in the data is upateQualityControls.
For the same reason as above, I'd rename it to qualityControls.

The code that processes these columns contains a join that updates similarly named fields, so renaming the columns would make that code more complex. However, we can come back to this at some point.
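
To make the naming concern concrete, here is a rough sketch of such a join in plain Python. The `curated_` prefix, the `apply_curation` function, and the shape of the study rows (including a list-valued `qualityControls` field) are all assumptions for illustration; the real pipeline code may differ. Prefixing the curation columns at join time is one way to keep them from colliding with similarly named study index fields.

```python
def apply_curation(studies, curation, prefix="curated_"):
    """Left-join curation rows onto study rows by studyId (illustrative only)."""
    # Index curation by studyId, prefixing every other column so it cannot
    # collide with a similarly named field in the study index.
    by_id = {
        row["studyId"]: {prefix + k: v for k, v in row.items() if k != "studyId"}
        for row in curation
    }
    annotated = []
    for study in studies:
        extra = by_id.get(study["studyId"], {})
        merged = dict(study)
        # A curated study type, when present, replaces the `type` value.
        if extra.get(prefix + "studyType"):
            merged["type"] = extra[prefix + "studyType"]
        # qualityControls is a `|`-separated list of identified issues.
        issues = extra.get(prefix + "qualityControls") or ""
        merged["qualityControls"] = list(study.get("qualityControls", [])) + [
            issue for issue in issues.split("|") if issue
        ]
        annotated.append(merged)
    return annotated
```

Studies without a curation row pass through unchanged apart from gaining an empty `qualityControls` list.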

DSuveges · 2024-01-02T13:18:00Z

Thank you for adding and documenting the file! I envision having L2G's curation here as well.

Yes, that is certainly a desired path. We should also collate the UKBB and FINNGEN trait curation here. I'm not sure whether we should unify disease/trait mappings via ontoma, though; there are arguments in both directions.

ireneisdoomed

Having "updated" in the studyType and QualityControls headers makes it look like these columns are relative to the columns in another file. I think this curation can be used as a standalone file, not just in conjunction with the GWAS Catalog study index. So I think it's cleaner to move the logic into the genetics ETL when we use this file in a specific context.

This is my view, and not a blocker if you think the current solution is better. The only thing I do think we need to fix is the typo: update instead of upate.

DSuveges · 2024-01-03T13:27:24Z

I have removed the update prefix from the column names.

ireneisdoomed

Thanks a lot for the changes :)

DSuveges added 2 commits December 8, 2023 16:05

feat: adding GWAS Catalog study curation

95ba91d

docs: updating documentations

3a96f64

DSuveges linked an issue Dec 8, 2023 that may be closed by this pull request

Managing GWAS Catalog study QC/flags opentargets/issues#3173

Closed

feat: udpate file header

18d7849

DSuveges marked this pull request as ready for review December 15, 2023 11:30

DSuveges requested a review from ireneisdoomed December 15, 2023 11:30

ireneisdoomed requested changes Dec 22, 2023

ireneisdoomed mentioned this pull request Dec 22, 2023

feat: adding logic to flag gwas catalog studies based on curation opentargets/gentropy#347

Merged

docs: updating header documentation

da0ffef

DSuveges requested a review from ireneisdoomed January 2, 2024 14:06

ireneisdoomed requested changes Jan 3, 2024

fix: changing column names

df9dad4

DSuveges requested a review from ireneisdoomed January 3, 2024 13:27

ireneisdoomed approved these changes Jan 4, 2024

DSuveges merged commit 23dc493 into master Jan 8, 2024
