Slight update on creating Finngen studies. #19

Merged (3 commits) on May 9, 2022

Conversation

Contributor

@DSuveges DSuveges commented May 7, 2022

There's a minor update in scripts/make_FINNGEN_study_table.py:

  • The table is now read as JSON directly from the URL, so there is no need to create a local copy and re-format it via jq (hence the small update in the Snakefile).
  • If phenostring is missing or is an empty string, phenocode is used to describe the phenotype instead.
  • Other updates are trivial.
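
The two changes above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code in scripts/make_FINNGEN_study_table.py: the function names `load_manifest` and `trait_reported` are hypothetical, and the sketch assumes the manifest at the URL is a single JSON array of objects carrying `phenocode` and `phenostring` fields.

```python
import json
from urllib.request import urlopen


def load_manifest(url):
    """Read the manifest straight from a URL (no local copy, no jq step).

    Assumption: the response body is one JSON array of phenotype objects.
    """
    with urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def trait_reported(record):
    """Return phenostring, falling back to phenocode when it is missing or blank."""
    phenostring = (record.get("phenostring") or "").strip()
    return phenostring if phenostring else record["phenocode"]


# Illustrative records only; the second phenostring value is made up.
print(trait_reported({"phenocode": "I9_HEARTFAIL_AND_CHD", "phenostring": ""}))
print(trait_reported({"phenocode": "SOME_CODE", "phenostring": "Some description"}))
```

Treating a missing key, `None`, and a whitespace-only string the same way keeps the fallback in one place, so later steps never see an empty trait description.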

Resulting JSON line for the problematic study described in #2585:

{
  "study_id": "FINNGEN_R6_I9_HEARTFAIL_AND_CHD",
  "pmid": "",
  "pub_date": "2022-01-24",
  "pub_journal": "",
  "pub_title": "",
  "pub_author": "FINNGEN_R6",
  "trait_reported": "I9_HEARTFAIL_AND_CHD",
  "trait_efos": null,
  "ancestry_initial": "European=206656",
  "ancestry_replication": "",
  "n_initial": 206656,
  "n_cases": 8876,
  "n_replication": 0
}

@DSuveges DSuveges requested a review from ireneisdoomed May 7, 2022 17:10
Contributor

@ireneisdoomed ireneisdoomed left a comment


Thank you so much for the fix @DSuveges

I've run the table generation pipeline and can confirm that everything runs smoothly and that the bug reported in #2585 is fixed.
I9_HEARTFAIL_AND_CHD was the only study whose reported trait was an empty string, which is why it was the only one affected.
As proposed, the reported trait has been successfully coalesced with the phenotype code, so the ETL will no longer drop this record when filtering out studies with trait_reported == null.

>>> studies[studies['study_id'] == 'FINNGEN_R6_I9_HEARTFAIL_AND_CHD'].iloc[0]
study_id                FINNGEN_R6_I9_HEARTFAIL_AND_CHD
pmid
pub_date                                     2022-01-24
pub_journal
pub_title
pub_author                                   FINNGEN_R6
trait_reported                     I9_HEARTFAIL_AND_CHD
trait_efos                                         None
ancestry_initial                      [European=245732]
ancestry_replication                                 []
n_initial                                        245732
n_replication                                       0.0
n_cases                                         11034.0
num_assoc_loci                                        0
has_sumstats                                       True
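
A check along these lines could guard against the same problem reappearing in a later release. It is only a sketch on plain dictionaries, not part of the pipeline or the ETL; the helper name `studies_missing_trait` is hypothetical, and the sample records are illustrative.

```python
def studies_missing_trait(studies):
    """Return the study_id of every record whose trait_reported is null or blank."""
    return [
        study["study_id"]
        for study in studies
        if not (study.get("trait_reported") or "").strip()
    ]


# Illustrative records: one with a usable trait, one empty, one null.
sample = [
    {"study_id": "FINNGEN_R6_I9_HEARTFAIL_AND_CHD", "trait_reported": "I9_HEARTFAIL_AND_CHD"},
    {"study_id": "EXAMPLE_EMPTY", "trait_reported": ""},
    {"study_id": "EXAMPLE_NULL", "trait_reported": None},
]
print(studies_missing_trait(sample))
```

An empty list from the real study table would mean no record is at risk of being filtered out for lacking a reported trait.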
