Code for the manuscript on validation of CBCL-Aff on three datasets (ABCD, HBN, BHRC)
Daniel S. Pine
This is the repository for the code used for the manuscript "Validation of CBCL depression scores of adolescents in three independent datasets". Depression is a common and deadly disease which originates in childhood. The Adolescent Brain Cognitive Development study (ABCD) provides an attractive tool to study depression in children and adolescents. The only continuous measure of depression provided in ABCD is the parent-report Child Behavior Checklist’s DSM-5-Oriented Affective Problems scale (CBCL-Aff). It is important for depression research in the ABCD dataset that the CBCL-Aff is a valid measure of depression in this age group. We therefore tested the sensitivity and specificity of the CBCL-Aff for depression in the ABCD data. CBCL-Aff agreed with parent-report of children’s symptoms but disagreed with child self-report of symptoms. To resolve the disagreement between parents and their children, we further confirmed our results in two independent datasets which included data from children aged 9-12: the Healthy Brain Network (HBN) and the Brazilian High Risk Cohort Study (BHRC). Both HBN and BHRC provided clinician-report depression diagnoses, which we used as a gold standard. CBCL-Aff successfully predicted clinician-report diagnoses, supporting its validity as a continuous measure of depression.
All preprocessing and analyses were conducted in Python, using Jupyter Notebooks. This directory has 10 .ipynb files, of which 3 are used for analysis, one per dataset (1_ABCD_r51_all.ipynb for the ABCD dataset, 3_BHRC_all.ipynb for the BHRC dataset, 4_HBN_all.ipynb for the HBN dataset). For plotting, use notebooks 51_plot_ABCD_parent.ipynb, 52_plot_ABCD_parent.ipynb, 53_plot_HBN.ipynb, 54_plot_BHRC.ipynb. To recreate the plots in the manuscript and in the Supporting Information, use notebooks 6_flowchart.ipynb, 7_extra_plots.ipynb, 8_for_flowchart.ipynb.
Hyp1 = Sensitivity, the ability of CBCL-Aff to differentiate between depressed and not depressed children.
Hyp2a = Specificity, the ability of CBCL-Aff to differentiate between depressed children and children without depression but with another form of psychopathology (ADHD/anxiety).
Hyp2b = Strict Specificity, the ability of CBCL-Aff to differentiate between children with depression and without comorbidities and children without depression but with another form of psychopathology (ADHD/anxiety).
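The three comparisons above (sensitivity, specificity, strict specificity) differ only in which children enter each group. The sketch below illustrates that grouping on a small made-up table. It is not the code from the notebooks: the column names (`cbcl_aff`, `depression`, `other_dx`) are hypothetical, the values are invented for illustration, and the use of ROC AUC as the measure of "ability to differentiate" is an assumption rather than something stated here.

```python
import pandas as pd
from sklearn.metrics import roc_auc_score

# Made-up toy records for illustration only; not taken from ABCD, HBN, or BHRC.
# All column names are hypothetical.
df = pd.DataFrame({
    "cbcl_aff":   [9, 7, 8, 4, 3, 5, 1, 0, 2, 1],   # continuous CBCL-Aff score
    "depression": [1, 1, 1, 1, 0, 0, 0, 0, 0, 0],   # 1 = depressed
    "other_dx":   [0, 1, 0, 1, 1, 1, 0, 0, 1, 0],   # 1 = ADHD and/or anxiety
})

def auc(subset: pd.DataFrame) -> float:
    """ROC AUC of the CBCL-Aff score for depression within a subset of children."""
    return roc_auc_score(subset["depression"], subset["cbcl_aff"])

depressed = df["depression"] == 1
other = df["other_dx"] == 1

# Sensitivity: depressed vs. all non-depressed children.
sensitivity_auc = auc(df)

# Specificity: depressed vs. non-depressed children with other psychopathology.
specificity_auc = auc(df[depressed | (~depressed & other)])

# Strict specificity: depressed children without comorbidities
# vs. non-depressed children with other psychopathology.
strict_specificity_auc = auc(df[(depressed & ~other) | (~depressed & other)])

print(sensitivity_auc, specificity_auc, strict_specificity_auc)
```

Only the row filter changes between the three comparisons, so the same scoring function can be reused for each one.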
Data is shared through the NIMH data archive. A tutorial on how to access and download the data is available here. Prior to downloading the data, the user needs to create an account on the NDA website and request permission. The version used in the manuscript is Release 5.1.
The link to the data and instructions for download are available here.
Instructions for requesting the data are available here. Request form.
To run the analysis, open the desired .ipynb file (1_ABCD_r51_all.ipynb for the ABCD dataset, 3_BHRC_all.ipynb for the BHRC dataset, 4_HBN_all.ipynb for the HBN dataset). Each main analysis file contains functions for loading files, preprocessing, and analysis. The "Main code" section creates a CSV file (one per .ipynb file/dataset) with all possible parameters. To run the code with specific parameters instead, use the "Not main code" section of the notebooks. The other notebooks are meant to be run block by block. Please submit an issue in this repository or reach out to Marie Zelenina or Dylan Nielson with questions.
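As a rough illustration of what "one CSV with all possible parameters" can look like, the sketch below enumerates every combination of a few options and serializes the result as CSV. The parameter names and values (`comparison`, `reporter`) and the helper `build_grid` are hypothetical and are not taken from the notebooks; the actual parameters are defined in the "Main code" sections.

```python
import itertools
import pandas as pd

# Hypothetical parameter names and values, chosen only for illustration.
param_grid = {
    "comparison": ["sensitivity", "specificity", "strict_specificity"],
    "reporter": ["parent", "child"],
}

def build_grid(grid: dict) -> pd.DataFrame:
    """Return one row per combination of the parameter values in `grid`."""
    keys = list(grid)
    combos = itertools.product(*(grid[key] for key in keys))
    return pd.DataFrame([dict(zip(keys, combo)) for combo in combos])

grid_df = build_grid(param_grid)

# to_csv() without a path returns the CSV as a string;
# pass a file name instead to write it to disk.
csv_text = grid_df.to_csv(index=False)
print(csv_text)
```

In a real run, each row would be extended with the results computed for that combination before the table is saved.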
We use the following libraries:
jupyter notebook, pandas, numpy, matplotlib, scikit-learn, scipy
You can create a conda environment with:
conda env create -p ./env -f environment.yml
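After creating and activating the environment, a quick check like the one below can confirm that the listed libraries are importable before opening the notebooks. This is a minimal sketch rather than part of the repository: the helper `missing_modules` is hypothetical, and the import names assume the usual mapping (scikit-learn imports as `sklearn`, Jupyter Notebook as `notebook`).

```python
from importlib.util import find_spec

# Import names assumed for the libraries listed above.
REQUIRED_MODULES = ["notebook", "pandas", "numpy", "matplotlib", "sklearn", "scipy"]

def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]

missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are available.")
```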