This repository was archived by the owner on Jan 30, 2024. It is now read-only.

This script will take data from a validated CCDI submission manifest and create dbGaP submission files specifically for a CCDI project.


CBIIT/ChildhoodCancerDataInitiative-CCDI_to_dbGaPy


ChildhoodCancerDataInitiative-CCDI_to_dbGaPy (ARCHIVED)

This repo contains a Python script that takes data from a validated CCDI submission manifest and creates dbGaP submission files specifically for a CCDI project.

Python environment management

A controlled Python virtual environment is recommended for running any Python package or script, because it keeps dependencies isolated. Many tools can create one, such as pyenv, virtualenv, or conda. The instructions below show how to create a conda environment with all the dependencies installed.

  • Conda install

    Conda is an open-source package and environment management system that runs on Windows, macOS, and Linux. Follow the Conda installation instructions and pick the installer that matches your operating system.

  • Create a conda env

    An environment file, conda_environment.yml, can be found under the envs/ folder. To create the environment, run

    conda env create -f <path_to_env_yml>

    You should be able to find an environment called CCDI_to_dbGaP_env when you run

    conda env list
  • Activate conda environment

    All the dependencies that the script requires should be installed in this environment. To activate the environment, run

    conda activate CCDI_to_dbGaP_env

    You should see (CCDI_to_dbGaP_env) at the beginning of your terminal prompt after activation.

  • Deactivate conda environment

    conda deactivate
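As a quick sanity check before running the script, the snippet below tests whether the expected conda environment is active. It is a minimal sketch, not part of the repository: it relies on the `CONDA_DEFAULT_ENV` environment variable that conda sets on activation, and the helper names `active_conda_env` and `in_expected_env` are hypothetical.

```python
import os

# Environment name taken from the instructions above.
EXPECTED_ENV = "CCDI_to_dbGaP_env"


def active_conda_env(environ=None):
    """Return the name of the active conda environment, or None if none is active."""
    env = os.environ if environ is None else environ
    # conda sets CONDA_DEFAULT_ENV when an environment is activated.
    return env.get("CONDA_DEFAULT_ENV") or None


def in_expected_env(environ=None, expected=EXPECTED_ENV):
    """True when the active conda environment matches the expected name."""
    return active_conda_env(environ) == expected


if __name__ == "__main__":
    if in_expected_env():
        print(f"OK: running inside {EXPECTED_ENV}")
    else:
        print(f"Warning: expected {EXPECTED_ENV}, found {active_conda_env()!r}")
```

Passing a plain dictionary as `environ` makes the check easy to exercise without touching the real shell environment.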

Usage instruction

❗Note: This script assumes every CONSENT value is 👉 GRU (consent number 1). If a CONSENT other than GRU is found, the data submitter must fix the CONSENT encoded value in SC_DD.xlsx before submission.

>> python CCDI_to_dbGaPy.py --help
usage: CCDI_to_dbGaPy.py [-h] -f FILE [-s PREVIOUS_SUBMISSION]

This script is a python version to generate dbGaP submission files using a validated CCDI
submission manifest

required arguments:
  -f FILE, --file FILE  A validated dataset file based on the template
                        CCDI_submission_metadata_template (.xlsx)

optional arguments:
  -s PREVIOUS_SUBMISSION, --previous_submission PREVIOUS_SUBMISSION
                        A previous dbGaP submission folder for the same phs_id study.
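
The help text above implies a command-line interface with one required and one optional argument. The following is a minimal `argparse` sketch that reproduces that interface. It is an illustration reconstructed from the help output, not the script's actual source, and the function name `build_parser` is hypothetical.

```python
import argparse


def build_parser():
    """Build a parser mirroring the options shown in the --help output."""
    parser = argparse.ArgumentParser(
        prog="CCDI_to_dbGaPy.py",
        description=(
            "This script is a python version to generate dbGaP submission "
            "files using a validated CCDI submission manifest"
        ),
    )
    # A named group so that -f is listed under "required arguments".
    required = parser.add_argument_group("required arguments")
    required.add_argument(
        "-f",
        "--file",
        required=True,
        help="A validated dataset file based on the template "
        "CCDI_submission_metadata_template (.xlsx)",
    )
    parser.add_argument(
        "-s",
        "--previous_submission",
        help="A previous dbGaP submission folder for the same phs_id study.",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.file, args.previous_submission)
```

With this layout, omitting `-f` makes `argparse` exit with a usage error, while `-s` defaults to `None` when no previous submission folder is given.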
  • Inputs

    The script requires a validated CCDI manifest. The previous dbGaP submission folder is optional.

  • Outputs

    • A log file named CCDI_to_dbGaP_<today_date>.log
    • (If the script finishes successfully) A folder named <phs_id>_dbGaP_submission_<today_date>, for example:
      aviator_falsetto_6_dbGaP_submission_2023-11-24/
      ├── SA_DD.xlsx
      ├── SA_DS_aviator_falsetto_6_dbGaP_submission.txt
      ├── SC_DD.xlsx
      ├── SC_DS_aviator_falsetto_6_dbGaP_submission.txt
      ├── SSM_DD.xlsx
      ├── SSM_DS_aviator_falsetto_6_dbGaP_submission.txt
      └── metadata.json
      
      1 directory, 7 files
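
To check a finished run against the layout above, the expected names can be derived from the phs_id and the run date. This is a minimal sketch based only on the example tree: the helper `expected_outputs` is hypothetical, and the assumption that the log file uses the same YYYY-MM-DD date format as the folder is inferred, not confirmed by the script.

```python
from datetime import date


def expected_outputs(phs_id, run_date=None):
    """Return (log_name, folder_name, file_names) following the pattern shown above."""
    # Assumption: dates are formatted as YYYY-MM-DD, as in the example folder name.
    day = (run_date or date.today()).isoformat()
    log_name = f"CCDI_to_dbGaP_{day}.log"
    folder_name = f"{phs_id}_dbGaP_submission_{day}"
    file_names = []
    # Each of the three dataset types has a data dictionary (DD) and a dataset (DS) file.
    for prefix in ("SA", "SC", "SSM"):
        file_names.append(f"{prefix}_DD.xlsx")
        file_names.append(f"{prefix}_DS_{phs_id}_dbGaP_submission.txt")
    file_names.append("metadata.json")
    return log_name, folder_name, file_names


if __name__ == "__main__":
    log_name, folder_name, file_names = expected_outputs(
        "aviator_falsetto_6", date(2023, 11, 24)
    )
    print(log_name)
    print(folder_name)
    for name in file_names:
        print("  " + name)
```

Comparing this list with the contents of the generated folder is a simple way to confirm that all seven files were written.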
      
