-
Notifications
You must be signed in to change notification settings - Fork 1
5inghalp/DETECT
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
Set-Up: - your own Snakefile with the following first line: include: "/path/to/DETECT/Snakefile" # this imports the rules from this module - config file - outcome code lists (it's helpful to make sure that none of the outcomes have zero individuals) - input date matrices indicated by config file Input (Date of Onset) Matrix Format: - .csv - column names = IID, code1, code2, etc... - row names = person1, person2, etc... - cell values - decimal value with unit YEARS relative to some date (we used the linux epoch by default) Running: - in a python 3.8 environment with pandas, numpy, scipy, matplotlib, and snakemake>=7.8.3 - for dry-run > snakemake -n {OUTPUT} # {OUTPUT} is the target file you're asking for - example: lsf queueing/cluster system profile > snakemake --profile lsf {OUTPUT} # {OUTPUT} is the target file you're asking for - A shortcut to asking for multiple output files would be to add a custom all rule to your Snakefile - example: rule all: input: 'Summary/positive_phecode_tests.csv', 'Summary/positive_icd_tests.csv, expand('Trajectories/trajectories_{index}_{outcome}_icd_{dataset}.csv', outcome=open(config['icd_code_list']).read().splitlines(), dataset=[k for k, v in config['icd_date_matrices'].items()], index=config['icd_index_code_list'])), 'Trajectory_Times/E11_I21._icd_4dig.all_signpost_times.csv.gz' # particular interest in diabetes -> MI - to use your custom all rule > snakemake all --profile lsf Suggested Target outputs: - All phecode tests with RR > 1 = 'Summary/positive_phecode_tests.csv' - All ICD tests with RR > 1 = 'Summary/positive_icd_tests.csv' - Individual-Level Trajectories for one Phecode test = 'Trajectories/trajectories_{index}_{outcome}_phecode_{dataset}.csv' - Individual-Level Trajectories for one ICD Code test = 'Trajectories/trajectories_{index}_{outcome}_icd_{dataset}.csv' Aggregated Target outputs: - All Individual-Level Trajectories for Phecodes = expand('Trajectories/trajectories_{index}_{outcome}_phecode_{dataset}.csv', outcome=[c.replace('*', '') for c in open(config['phecode_list']).read().splitlines()], dataset=[k for k, v in config['phecode_date_matrices'].items()], index=config['phecode_index_code_list']) - All Individual-Level Trajectories for ICD codes = expand('Trajectories/trajectories_{index}_{outcome}_icd_{dataset}.csv', outcome=[c.replace('*', '') for c in open(config['icd_code_list']).read().splitlines()], dataset=[k for k, v in config['icd_date_matrices'].items()], index=config['icd_index_code_list']) Plotting/Detailed outputs: - Trajectory timing information for selected index/outcome combinations = 'Trajectory_Times/{index}_{outcome}_icd_4dig.all_signpost_times.csv.gz'
About
DETECT: DiseasE Trajectory fEature extraCTion method
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published