Metadata File Format: Mutation Data

Mark Keller edited this page Feb 2, 2019 · 10 revisions

The mutation data metadata file specifies where to find the mutation data files corresponding to each mutation data project.

It must be located at obj/meta-data.tsv.

Each row corresponds to one project, that is, a specific cancer type from a specific source.

It contains the following columns:

  • Project: An identifier for the project.
  • Project Source: The source of the project.
  • Project Name: The project name. Typically named by the cancer type.
  • Oncotree Code: A code mapping the cancer type to an Oncotree node.
  • Path to Counts SBS_96 File: The path to the counts file corresponding to the SBS_96 category type. Relative to the obj directory.
  • Path to Counts DBS_78 File: The path to the counts file corresponding to the DBS_78 category type. Relative to the obj directory.
  • Path to Counts INDEL_Alexandrov2018_83 File: The path to the counts file corresponding to the INDEL_Alexandrov2018_83 category type. Relative to the obj directory.
  • Path to Clinical File: The path to the clinical data file. Relative to the obj directory. Can be left blank.
  • Path to Samples File: The path to the sample-patient mapping file. Relative to the obj directory. Required.
  • Path to Genes File: The path to the gene-alterations file. Relative to the obj directory. Can be left blank.
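The column list above can be checked mechanically. The following is a minimal sketch, not part of the project itself: it assumes the file is tab-delimited (suggested by the .tsv extension) with a header row that uses the column names exactly as listed, and the `validate_metadata` helper and `EXPECTED_COLUMNS` name are hypothetical. It only enforces what this page states, namely that all ten columns exist and that Path to Samples File is filled in. Whether the counts columns may be left blank is not stated here, so they are not checked.

```python
import csv
import io

# Column names as listed on this page, in the same order.
EXPECTED_COLUMNS = [
    "Project",
    "Project Source",
    "Project Name",
    "Oncotree Code",
    "Path to Counts SBS_96 File",
    "Path to Counts DBS_78 File",
    "Path to Counts INDEL_Alexandrov2018_83 File",
    "Path to Clinical File",
    "Path to Samples File",
    "Path to Genes File",
]


def validate_metadata(tsv_text):
    """Return a list of problems found in the metadata text (empty if none).

    Hypothetical helper: assumes tab-delimited text with a header row.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    errors = []

    # Every documented column must appear in the header.
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    if missing:
        errors.append("missing columns: " + ", ".join(missing))
        return errors

    # The samples path is the only column this page marks as required.
    # Data rows start on line 2, after the header.
    for line_number, row in enumerate(reader, start=2):
        if not (row["Path to Samples File"] or "").strip():
            errors.append(f"line {line_number}: Path to Samples File is required")
    return errors
```

To check a file on disk, read obj/meta-data.tsv into a string and pass it to `validate_metadata`; an empty list means no problems were found under these assumptions.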

Example:

| Project | Project Source | Project Name | Oncotree Code | Path to Counts SBS_96 File | Path to Counts DBS_78 File | Path to Counts INDEL_Alexandrov2018_83 File | Path to Clinical File | Path to Samples File | Path to Genes File |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| TCGA-BLCA_BLCA_mc3.v0.2.8 | TCGA | Bladder Urothelial Carcinoma | BLCA | mutations/PanCanAtlas/processed/counts/counts.TCGA-BLCA_BLCA_mc3.v0.2.8.SBS-96.tsv | mutations/PanCanAtlas/processed/counts/counts.TCGA-BLCA_BLCA_mc3.v0.2.8.DBS-78.tsv | mutations/PanCanAtlas/processed/counts/counts.TCGA-BLCA_BLCA_mc3.v0.2.8.INDEL-Alexandrov2018_83.tsv | clinical/PanCanAtlas/processed/clinical.TCGA_BLCA.tsv | mutations/PanCanAtlas/processed/samples/samples.TCGA-BLCA_BLCA_mc3.v0.2.8.tsv | gene-alterations/mutations/maf/processed/PanCanAtlas-TCGA-BLCA_BLCA_mc3.v0.2.8.tsv |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
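Because every path column is relative to the obj directory, a consumer of this file has to join each value onto that directory before opening it. The sketch below illustrates that step for one row. The `resolve_paths` function and its `obj_dir` parameter are hypothetical names, and the code assumes the row has already been parsed into a dict keyed by column name and that blank optional columns are represented as empty strings.

```python
from pathlib import PurePosixPath


def resolve_paths(row, obj_dir="obj"):
    """Map each non-blank 'Path to ...' column to a path under obj_dir.

    Hypothetical helper: blank values (for example an omitted clinical
    or genes file) are skipped rather than resolved.
    """
    resolved = {}
    for column, value in row.items():
        if column.startswith("Path to ") and value.strip():
            resolved[column] = str(PurePosixPath(obj_dir) / value.strip())
    return resolved


# Usage with values taken from the example row (genes path left blank here
# to show how an optional column is skipped).
example_row = {
    "Project": "TCGA-BLCA_BLCA_mc3.v0.2.8",
    "Path to Clinical File": "clinical/PanCanAtlas/processed/clinical.TCGA_BLCA.tsv",
    "Path to Genes File": "",
}
paths = resolve_paths(example_row)
```

In this usage, `paths` contains only the clinical file entry, prefixed with `obj/`; the Project column is ignored because it is not a path column.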