IHEC Metadata Specification (version 1.0)

Introduction

The IHEC metadata standards are an extension of the standards used by the Roadmap Epigenomics Project. Please refer to Sections 1 and 2 of the original specification (archived at https://github.com/IHEC/ihec-metadata/blob/master/specs/original_docs/IHEC-Metadata.pdf) for the data and metadata model.

This document describes metadata elements extending the SRA XML Schema 1.2. The core SRA XML elements are augmented by additional attributes defined for the purposes of the NIH Roadmap Epigenomics Project, as described in the official IHEC ecosystem repository.

Documentation for the core SRA XML elements is here: http://www.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?view=doc

The SRA XML schemas are here: http://www.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?view=xml_schemas

How to define multiple values per metadata tag

The same attribute may be used multiple times in a single XML record. This may be most useful, for example, for supplying URIs to multiple ontologies or for supplying multiple references to a single ontology such as in the case of DISEASE_ONTOLOGY_URI. For example, describing a brain primary tissue using ontology terms for ('Brodmann (1909) area 8', 'Brodmann (1909) area 9') would be:

<SAMPLE_ATTRIBUTE>
    <TAG>SAMPLE_ONTOLOGY_URI</TAG>
    <VALUE>http://purl.obolibrary.org/obo/UBERON_0013539</VALUE>
</SAMPLE_ATTRIBUTE>
<SAMPLE_ATTRIBUTE>
    <TAG>SAMPLE_ONTOLOGY_URI</TAG>
    <VALUE>http://purl.obolibrary.org/obo/UBERON_0013540</VALUE>
</SAMPLE_ATTRIBUTE>
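
When reading such a record, a repeated tag has to be kept as a list rather than overwritten. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the `collect_attributes` helper is hypothetical and not part of any IHEC tooling, and the `SAMPLE_ATTRIBUTES` wrapper is assumed from the SRA sample schema.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Illustrative record: the two values are the URIs from the example above.
RECORD = """
<SAMPLE_ATTRIBUTES>
  <SAMPLE_ATTRIBUTE>
    <TAG>SAMPLE_ONTOLOGY_URI</TAG>
    <VALUE>http://purl.obolibrary.org/obo/UBERON_0013539</VALUE>
  </SAMPLE_ATTRIBUTE>
  <SAMPLE_ATTRIBUTE>
    <TAG>SAMPLE_ONTOLOGY_URI</TAG>
    <VALUE>http://purl.obolibrary.org/obo/UBERON_0013540</VALUE>
  </SAMPLE_ATTRIBUTE>
</SAMPLE_ATTRIBUTES>
"""


def collect_attributes(xml_text):
    """Return {tag: [values, ...]}, keeping every occurrence of a repeated tag."""
    root = ET.fromstring(xml_text)
    values = defaultdict(list)
    for attribute in root.iter("SAMPLE_ATTRIBUTE"):
        tag = attribute.findtext("TAG", default="").strip()
        value = attribute.findtext("VALUE", default="").strip()
        if tag:
            values[tag].append(value)
    return dict(values)


attributes = collect_attributes(RECORD)
```

Here `attributes["SAMPLE_ONTOLOGY_URI"]` holds both URIs in document order.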

Ontologies

Only terms from the following ontologies are acceptable for annotating the metadata:

Field SAMPLE_ONTOLOGY_URI: EFO (Cell Line), CL (Primary Cell, Primary Cell Culture), UBERON (Primary Tissue)

Fields DISEASE_ONTOLOGY_URI and DONOR_HEALTH_STATUS_ONTOLOGY_URI: NCIM

Field EXPERIMENT_ONTOLOGY_URI: OBI

Field MOLECULE_ONTOLOGY_URI: SO

Tags with controlled vocabularies are labelled as "Controlled Vocabulary".

Tags with ontologies are labelled as "Ontology".

SAMPLES

Note for metadata resubmission

In order to pass IHEC metadata validation, all datasets submitted prior to 2018 must include all sample properties defined within each specific BIOMATERIAL_TYPE below.

Cell Line

SAMPLE_ONTOLOGY_URI - (Ontology: EFO) Links to sample ontology information.

DISEASE_ONTOLOGY_URI - (Ontology: NCIM) Links to sample disease ontology information. This attribute reflects the disease for this particular sample, not the donor health condition. The NCI Metathesaurus term C0277545 "Disease type AND/OR category unknown" should be used for unknown diseases. For samples without any known disease, use the NCI Metathesaurus term C0549184 "None". Phenotypes associated with the disease should be submitted as DISEASE_ONTOLOGY_URIs (if available) and/or in the free form DISEASE attribute.

DISEASE - Free form field for more specific sample disease information. This property reflects the disease for this particular sample, not the donor health condition.

BIOMATERIAL_PROVIDER - The name of the company, laboratory or person that provided the biological material.

BIOMATERIAL_TYPE - (Controlled Vocabulary) "Cell Line".

LINE - The name of the cell line.

LINEAGE - The developmental lineage to which the cell line belongs.

DIFFERENTIATION_STAGE - The stage in cell differentiation to which the cell line belongs.

DIFFERENTIATION_METHOD - The protocol used to differentiate the cell line.

PASSAGE - The number of times the cell line has been re-plated and allowed to grow back to confluency or to some maximum density, if using suspension cultures.

MEDIUM - The medium in which the cell line has been grown.

SEX - (Controlled Vocabulary) "Male", "Female", "Unknown", or "Mixed" for pooled samples.

BATCH - The batch from which the cell line is derived. Primarily applicable to initial H1 cell line batches. NA if not applicable.
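
For resubmissions, every property in the list above has to be present. The following is a minimal sketch of such a completeness check; it assumes a sample has already been parsed into a dictionary mapping each tag to a list of values (a hypothetical structure), and the function name is illustrative rather than part of the IHEC validator.

```python
# Property names copied from the Cell Line list above.
CELL_LINE_PROPERTIES = [
    "SAMPLE_ONTOLOGY_URI", "DISEASE_ONTOLOGY_URI", "DISEASE",
    "BIOMATERIAL_PROVIDER", "BIOMATERIAL_TYPE", "LINE", "LINEAGE",
    "DIFFERENTIATION_STAGE", "DIFFERENTIATION_METHOD", "PASSAGE",
    "MEDIUM", "SEX", "BATCH",
]


def missing_cell_line_properties(sample):
    """Return the Cell Line properties that are absent or empty in `sample`.

    `sample` maps each tag to a list of string values (assumed structure).
    """
    return [
        name for name in CELL_LINE_PROPERTIES
        if not any(value.strip() for value in sample.get(name, []))
    ]


# Illustrative, deliberately incomplete sample.
example = {
    "BIOMATERIAL_TYPE": ["Cell Line"],
    "LINE": ["H1"],
    "SEX": ["Male"],
    "BATCH": ["NA"],
    "MEDIUM": [""],  # present but empty, so it is still reported
}
missing = missing_cell_line_properties(example)
```

Note that "NA" counts as a supplied value here, which matches how BATCH is described above.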

Primary Cell

SAMPLE_ONTOLOGY_URI - (Ontology: CL) Links to sample ontology information.

DISEASE_ONTOLOGY_URI - (Ontology: NCIM) Links to sample disease ontology information. This attribute reflects the disease for this particular sample, not the donor health condition. The NCI Metathesaurus term C0277545 "Disease type AND/OR category unknown" should be used for unknown diseases. For samples without any known disease, use the NCI Metathesaurus term C0549184 "None". Phenotypes associated with the disease should be submitted as DISEASE_ONTOLOGY_URIs (if available) and/or in the free form DISEASE attribute. If dealing with a rare disease, please consider identifiability issues.

DISEASE - Free form field for more specific sample disease information. This property reflects the disease for this particular sample, not the donor health condition. If dealing with a rare disease, please consider identifiability issues.

BIOMATERIAL_PROVIDER - The name of the company, laboratory or person that provided the biological material.

BIOMATERIAL_TYPE - (Controlled Vocabulary) "Primary Cell".

ORIGIN_SAMPLE_ONTOLOGY_URI - (Ontology: UBERON) Links to the origin tissue from which the sample was extracted.

ORIGIN_SAMPLE - Description of the origin tissue from which the sample was extracted.

CELL_TYPE - The type of cell.

MARKERS - Markers used to isolate and identify the cell type.

DONOR_ID - An identifying designation for the donor that provided the primary cell.

DONOR_AGE - The age of the donor that provided the primary cell. NA if not available. If over 90 years enter as "90+". If entering a range of ages use the format "{age}-{age}".

DONOR_AGE_UNIT - (Controlled Vocabulary) "year", "month", "week", or "day".

DONOR_LIFE_STAGE - (Controlled Vocabulary) "fetal", "newborn", "child", "adult", "unknown", "embryonic", "postnatal".

DONOR_HEALTH_STATUS_ONTOLOGY_URI - (Ontology: NCIM) Links to the health status of the donor that provided the primary cell. The NCI Metathesaurus term C0277545 "Disease type AND/OR category unknown" should be used for unknown diseases. For samples without any known disease, use the NCI Metathesaurus term C0549184 "None". Phenotypes associated with the disease should be submitted as DISEASE_ONTOLOGY_URIs (if available) or in the free form DISEASE attribute. If dealing with a rare disease, please consider identifiability issues.

DONOR_HEALTH_STATUS - The health status of the donor that provided the primary cell. NA if not available.

DONOR_SEX - (Controlled Vocabulary) "Male", "Female", "Unknown", or "Mixed" for pooled samples.

DONOR_ETHNICITY - The ethnicity of the donor that provided the primary cell. NA if not available. If dealing with small/vulnerable populations consider identifiability issues.

PASSAGE_IF_EXPANDED - If the primary cell has been expanded, the number of times the primary cell has been re-plated and allowed to grow back to confluency or to some maximum density if using suspension cultures. NA if no expansion.
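
The DONOR_AGE formats described above (NA, "90+", a single age, or a range written as "{age}-{age}") can be checked with a small pattern. This is a sketch under stated assumptions: the specification does not say whether decimal ages are allowed, so the pattern below accepts them, and it checks the format only (it does not verify that a range is ascending).

```python
import re

# A non-negative number, optionally with a decimal part (decimals are an assumption).
_NUMBER = r"\d+(?:\.\d+)?"

# "NA", "90+", a single age, or a range "{age}-{age}".
_DONOR_AGE = re.compile(rf"^(?:NA|90\+|{_NUMBER}|{_NUMBER}-{_NUMBER})$")


def is_valid_donor_age(value):
    """Return True if `value` matches one of the DONOR_AGE formats."""
    return bool(_DONOR_AGE.match(value.strip()))
```

For example, `is_valid_donor_age("25-30")` and `is_valid_donor_age("90+")` are accepted, while free text such as `"thirty"` is rejected.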

Primary Cell Culture

SAMPLE_ONTOLOGY_URI - (Ontology: CL) Links to sample ontology information.

DISEASE_ONTOLOGY_URI - (Ontology: NCIM) Links to sample disease ontology information. This attribute reflects the disease for this particular sample, not the donor health condition. The NCI Metathesaurus term C0277545 "Disease type AND/OR category unknown" should be used for unknown diseases. For samples without any known disease, use the NCI Metathesaurus term C0549184 "None". Phenotypes associated with the disease should be submitted as DISEASE_ONTOLOGY_URIs (if available) and/or in the free form DISEASE attribute. If dealing with a rare disease, please consider identifiability issues.

DISEASE - Free form field for more specific sample disease information. This property reflects the disease for this particular sample, not the donor health condition. If dealing with a rare disease, please consider identifiability issues.

BIOMATERIAL_PROVIDER - The name of the company, laboratory or person that provided the biological material.

BIOMATERIAL_TYPE - (Controlled Vocabulary) "Primary Cell Culture".

ORIGIN_SAMPLE_ONTOLOGY_URI - (Ontology: UBERON) Links to the origin tissue from which the sample was extracted.

ORIGIN_SAMPLE - Description of the origin tissue from which the sample was extracted.

CELL_TYPE - The type of cell.

MARKERS - Markers used to isolate and identify the cell type.

CULTURE_CONDITIONS - The conditions under which the primary cell was cultured.

DONOR_ID - An identifying designation for the donor that provided the primary cell.

DONOR_AGE - The age of the donor that provided the primary cell. NA if not available. If over 90 years enter as "90+". If entering a range of ages use the format "{age}-{age}".

DONOR_AGE_UNIT - (Controlled Vocabulary) "year", "month", "week", or "day".

DONOR_LIFE_STAGE - (Controlled Vocabulary) "fetal", "newborn", "child", "adult", "unknown", "embryonic", "postnatal".

DONOR_HEALTH_STATUS_ONTOLOGY_URI - (Ontology: NCIM) Links to the health status of the donor that provided the primary cell. The NCI Metathesaurus term C0277545 "Disease type AND/OR category unknown" should be used for unknown diseases. For samples without any known disease, use the NCI Metathesaurus term C0549184 "None". Phenotypes associated with the disease should be submitted as DISEASE_ONTOLOGY_URIs (if available) or in the free form DISEASE attribute. If dealing with a rare disease, please consider identifiability issues.

DONOR_HEALTH_STATUS - The health status of the donor that provided the primary cell. NA if not available.

DONOR_SEX - (Controlled Vocabulary) "Male", "Female", "Unknown", or "Mixed" for pooled samples.

DONOR_ETHNICITY - The ethnicity of the donor that provided the primary cell. NA if not available. If dealing with small/vulnerable populations consider identifiability issues.

PASSAGE_IF_EXPANDED - If the primary cell culture has been expanded, the number of times the cell culture has been re-plated and allowed to grow back to confluency or to some maximum density if using suspension cultures. NA if no expansion.

Primary Tissue

SAMPLE_ONTOLOGY_URI - (Ontology: UBERON) Links to sample ontology information.

DISEASE_ONTOLOGY_URI - (Ontology: NCIM) Links to sample disease ontology information. This attribute reflects the disease for this particular sample, not the donor health condition. The NCI Metathesaurus term C0277545 "Disease type AND/OR category unknown" should be used for unknown diseases. For samples without any known disease, use the NCI Metathesaurus term C0549184 "None". Phenotypes associated with the disease should be submitted as DISEASE_ONTOLOGY_URIs (if available) and/or in the free form DISEASE attribute. If dealing with a rare disease, please consider identifiability issues.

DISEASE - Free form field for more specific sample disease information. This property reflects the disease for this particular sample, not the donor health condition. If dealing with a rare disease, please consider identifiability issues.

BIOMATERIAL_PROVIDER - The name of the company, laboratory or person that provided the biological material.

BIOMATERIAL_TYPE - (Controlled Vocabulary) "Primary Tissue".

TISSUE_TYPE - The type of tissue.

TISSUE_DEPOT - Details about the anatomical location from which the primary tissue was collected.

COLLECTION_METHOD - The protocol for collecting the primary tissue.

DONOR_ID - An identifying designation for the donor that provided the primary tissue.

DONOR_AGE - The age of the donor that provided the primary tissue. NA if not available. If over 90 years enter as "90+". If entering a range of ages use the format "{age}-{age}".

DONOR_AGE_UNIT - (Controlled Vocabulary) "year", "month", "week", or "day".

DONOR_LIFE_STAGE - (Controlled Vocabulary) "fetal", "newborn", "child", "adult", "unknown", "embryonic", "postnatal".

DONOR_HEALTH_STATUS_ONTOLOGY_URI - (Ontology: NCIM) Links to the health status of the donor that provided the primary tissue. The NCI Metathesaurus term C0277545 "Disease type AND/OR category unknown" should be used for unknown diseases. For samples without any known disease, use the NCI Metathesaurus term C0549184 "None". Phenotypes associated with the disease should be submitted as DISEASE_ONTOLOGY_URIs (if available) or in the free form DISEASE attribute. If dealing with a rare disease, please consider identifiability issues.

DONOR_HEALTH_STATUS - The health status of the donor that provided the primary tissue. NA if not available.

DONOR_SEX - (Controlled Vocabulary) "Male", "Female", "Unknown", or "Mixed" for pooled samples.

DONOR_ETHNICITY - The ethnicity of the donor that provided the primary tissue. NA if not available. If dealing with small/vulnerable populations consider identifiability issues.
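
The donor fields marked "Controlled Vocabulary" in the sample sections above lend themselves to a simple membership test. The sketch below assumes exact, case-sensitive matching (the specification does not state this explicitly) and a sample parsed into a dictionary of tag to list of values; both the structure and the function name are hypothetical.

```python
# Allowed values copied from the donor fields in the sample sections above.
DONOR_VOCABULARIES = {
    "DONOR_AGE_UNIT": {"year", "month", "week", "day"},
    "DONOR_LIFE_STAGE": {
        "fetal", "newborn", "child", "adult",
        "unknown", "embryonic", "postnatal",
    },
    "DONOR_SEX": {"Male", "Female", "Unknown", "Mixed"},
}


def vocabulary_errors(sample):
    """Return one message per value that is outside its controlled vocabulary.

    `sample` maps each tag to a list of string values (assumed structure).
    Tags that are absent are not reported here.
    """
    errors = []
    for tag, allowed in DONOR_VOCABULARIES.items():
        for value in sample.get(tag, []):
            if value not in allowed:
                errors.append(f"{tag}: {value!r} is not one of {sorted(allowed)}")
    return errors


problems = vocabulary_errors({"DONOR_AGE_UNIT": ["years"], "DONOR_SEX": ["Female"]})
```

In this example only DONOR_AGE_UNIT is reported, because "years" is not in the vocabulary (the listed value is "year").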

EXPERIMENTS

Note for metadata resubmission

In order to pass IHEC metadata validation, all datasets submitted prior to 2018 must include the following properties:

  • LIBRARY_STRATEGY
  • EXPERIMENT_TYPE or EXPERIMENT_ONTOLOGY_URI
  • MOLECULE or MOLECULE_ONTOLOGY_URI, defined in either the experiment or the sample object. Because validating the presence of this field across both objects is complex, this requirement is checked at the time of submission to EpiRR.
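
The three requirements above can be expressed as a short check. This is only a sketch of the logic, not the IHEC or EpiRR validator: it assumes the experiment and sample have each been parsed into a dictionary of tag to list of values, and the function name is hypothetical.

```python
def missing_experiment_requirements(experiment, sample=None):
    """Return the resubmission requirements that are not satisfied.

    `experiment` and `sample` map each tag to a list of string values
    (assumed structure). The molecule may come from either object.
    """
    sample = sample or {}

    def has(record, tag):
        return any(value.strip() for value in record.get(tag, []))

    missing = []
    if not has(experiment, "LIBRARY_STRATEGY"):
        missing.append("LIBRARY_STRATEGY")
    if not (has(experiment, "EXPERIMENT_TYPE")
            or has(experiment, "EXPERIMENT_ONTOLOGY_URI")):
        missing.append("EXPERIMENT_TYPE or EXPERIMENT_ONTOLOGY_URI")
    molecule_tags = ("MOLECULE", "MOLECULE_ONTOLOGY_URI")
    if not any(has(record, tag)
               for record in (experiment, sample)
               for tag in molecule_tags):
        missing.append("MOLECULE or MOLECULE_ONTOLOGY_URI")
    return missing


experiment = {"LIBRARY_STRATEGY": ["ChIP-Seq"], "EXPERIMENT_TYPE": ["Histone H3K4me1"]}
sample = {"MOLECULE": ["genomic DNA"]}
```

With these illustrative records, the check passes when the sample is supplied and reports the molecule requirement when it is not.
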
Common fields

All experiment types include these fields:

EXPERIMENT_TYPE - (Controlled Vocabulary) The assay target (e.g. 'DNA Methylation', 'mRNA-Seq', 'smRNA-Seq', 'Histone H3K4me1').

EXPERIMENT_ONTOLOGY_URI - (Ontology: OBI) Links to experiment ontology information.

LIBRARY_STRATEGY - (Controlled Vocabulary) The assay used. Values are defined within the SRA metadata specifications with a controlled vocabulary (e.g. 'Bisulfite-Seq', 'RNA-Seq', 'ChIP-Seq'). For a complete list, see https://www.ebi.ac.uk/ena/submit/reads-library-strategy.

MOLECULE_ONTOLOGY_URI - (Ontology: SO) Links to molecule ontology information.

MOLECULE - (Controlled Vocabulary) The type of molecule that was extracted from the biological material. Include one of the following: total RNA, polyA RNA, cytoplasmic RNA, nuclear RNA, small RNA, genomic DNA, protein, or other.

Chromatin Accessibility

EXPERIMENT_TYPE: (Controlled Vocabulary) 'Chromatin Accessibility'.

EXPERIMENT_ONTOLOGY_URI: (Ontology: OBI) 'http://purl.obolibrary.org/obo/OBI_0002039', 'http://purl.obolibrary.org/obo/OBI_0001853' or any of their subclasses.

LIBRARY_STRATEGY: (Controlled Vocabulary) 'ATAC-Seq', 'DNase-Hypersensitivity'.

MOLECULE_ONTOLOGY_URI: (Ontology: SO) 'http://purl.obolibrary.org/obo/SO_0000991' or any of its subclasses.

MOLECULE: (Controlled Vocabulary) 'genomic DNA'.

EXTRACTION_PROTOCOL - The protocol used to isolate the extracted material.

EXPERIMENT_PROTOCOL - The protocol used for library preparation (e.g. DNase treatment, transposase treatment, etc.).

WGBS (NOTE: this is a new name to be used instead of Bisulfite-Seq)

EXPERIMENT_TYPE: (Controlled Vocabulary) 'DNA Methylation'.

EXPERIMENT_ONTOLOGY_URI: (Ontology: OBI) 'http://purl.obolibrary.org/obo/OBI_0001863' or any of its subclasses.

LIBRARY_STRATEGY: (Controlled Vocabulary) 'Bisulfite-Seq'.

MOLECULE_ONTOLOGY_URI: (Ontology: SO) 'http://purl.obolibrary.org/obo/SO_0000991' or any of its subclasses.

MOLECULE: (Controlled Vocabulary) 'genomic DNA'.

EXTRACTION_PROTOCOL - The protocol used to isolate the extracted material.

EXTRACTION_PROTOCOL_TYPE_OF_SONICATOR - The type of sonicator used for extraction.

EXTRACTION_PROTOCOL_SONICATION_CYCLES - The number of sonication cycles used for extraction.

DNA_PREPARATION_INITIAL_DNA_QNTY - The initial DNA quantity used in preparation.

DNA_PREPARATION_FRAGMENT_SIZE_RANGE - The DNA fragment size range used in preparation.

DNA_PREPARATION_ADAPTOR_SEQUENCE - The sequence of the adaptor used in preparation.

DNA_PREPARATION_ADAPTOR_LIGATION_PROTOCOL - The protocol used for adaptor ligation.

DNA_PREPARATION_POST-LIGATION_FRAGMENT_SIZE_SELECTION - The fragment size selection after adaptor ligation.

BISULFITE_CONVERSION_PROTOCOL - The bisulfite conversion protocol.

BISULFITE_CONVERSION_PERCENT - The bisulfite conversion percent and how it was determined.

LIBRARY_GENERATION_PCR_TEMPLATE_CONC - The PCR template concentration for library generation.

LIBRARY_GENERATION_PCR_POLYMERASE_TYPE - The PCR polymerase used for library generation.

LIBRARY_GENERATION_PCR_THERMOCYCLING_PROGRAM - The thermocycling program used for library generation.

LIBRARY_GENERATION_PCR_NUMBER_CYCLES - The number of PCR cycles used for library generation.

LIBRARY_GENERATION_PCR_F_PRIMER_SEQUENCE - The sequence of the PCR forward primer used for library generation.

LIBRARY_GENERATION_PCR_R_PRIMER_SEQUENCE - The sequence of the PCR reverse primer used for library generation.

LIBRARY_GENERATION_PCR_PRIMER_CONC - The concentration of the PCR primers used for library generation.

LIBRARY_GENERATION_PCR_PRODUCT_ISOLATION_PROTOCOL - The protocol for isolating PCR products used for library generation.

MeDIP-Seq

EXPERIMENT_TYPE: (Controlled Vocabulary) 'DNA Methylation'.

EXPERIMENT_ONTOLOGY_URI: (Ontology: OBI) 'http://purl.obolibrary.org/obo/OBI_0000693' or any of its subclasses.

LIBRARY_STRATEGY: (Controlled Vocabulary) 'MeDIP-Seq'.

MOLECULE_ONTOLOGY_URI: (Ontology: SO) 'http://purl.obolibrary.org/obo/SO_0000991' or any of its subclasses.

MOLECULE: (Controlled Vocabulary) 'genomic DNA'.

EXTRACTION_PROTOCOL - The protocol used to isolate the extracted material.

EXTRACTION_PROTOCOL_TYPE_OF_SONICATOR - The type of sonicator used for extraction.

EXTRACTION_PROTOCOL_SONICATION_CYCLES - The number of sonication cycles used for extraction.

MeDIP_PROTOCOL - The MeDIP protocol used.

MeDIP_PROTOCOL_DNA_AMOUNT - The amount of DNA used in the MeDIP protocol.

MeDIP_PROTOCOL_BEAD_TYPE - The type of bead used in the MeDIP protocol.

MeDIP_PROTOCOL_BEAD_AMOUNT - The amount of beads used in the MeDIP protocol.

MeDIP_PROTOCOL_ANTIBODY_AMOUNT - The amount of antibody used in the MeDIP protocol.

MeDIP_ANTIBODY - The specific antibody used in the MeDIP protocol.

MeDIP_ANTIBODY_PROVIDER - The name of the company, laboratory or person that provided the antibody.

MeDIP_ANTIBODY_CATALOG - The catalog from which the antibody was purchased.

MeDIP_ANTIBODY_LOT - The lot identifier of the antibody.

MRE-Seq

EXPERIMENT_TYPE: (Controlled Vocabulary) 'DNA Methylation'.

EXPERIMENT_ONTOLOGY_URI: (Ontology: OBI) 'http://purl.obolibrary.org/obo/OBI_0001861' or any of its subclasses.

LIBRARY_STRATEGY: (Controlled Vocabulary) 'MRE-Seq'.

MOLECULE_ONTOLOGY_URI: (Ontology: SO) 'http://purl.obolibrary.org/obo/SO_0000991' or any of its subclasses.

MOLECULE: (Controlled Vocabulary) 'genomic DNA'.

MRE_PROTOCOL - The MRE protocol.

MRE_PROTOCOL_CHROMATIN_AMOUNT - The amount of chromatin used in the MRE protocol.

MRE_PROTOCOL_RESTRICTION_ENZYME - The restriction enzyme(s) used in the MRE protocol.

MRE_PROTOCOL_SIZE_FRACTION - The size of the fragments selected in the MRE protocol.

ChIP-Seq

EXPERIMENT_TYPE: (Controlled Vocabulary) one of ('ChIP-Seq Input', 'Histone H3K4me1', 'Histone H3K4me3', 'Histone H3K9me3', 'Histone H3K9ac', 'Histone H3K27me3', 'Histone H3K36me3', etc.).

EXPERIMENT_ONTOLOGY_URI: (Ontology: OBI) 'http://purl.obolibrary.org/obo/OBI_0000716' or any of its subclasses.

LIBRARY_STRATEGY: (Controlled Vocabulary) 'ChIP-Seq'.

MOLECULE_ONTOLOGY_URI: (Ontology: SO) 'http://purl.obolibrary.org/obo/SO_0000991' or any of its subclasses.

MOLECULE: (Controlled Vocabulary) 'genomic DNA'.

EXTRACTION_PROTOCOL - The protocol used to isolate the extracted material.

EXTRACTION_PROTOCOL_TYPE_OF_SONICATOR - The type of sonicator used for extraction.

EXTRACTION_PROTOCOL_SONICATION_CYCLES - The number of sonication cycles used for extraction.

CHIP_PROTOCOL - The ChIP protocol used, or 'Input'.

CHIP_PROTOCOL_CHROMATIN_AMOUNT - The amount of chromatin used in the ChIP protocol.

CHIP_PROTOCOL_BEAD_TYPE - The type of bead used in the ChIP protocol. Leave empty for 'ChIP-Seq Input'.

CHIP_PROTOCOL_BEAD_AMOUNT - The amount of beads used in the ChIP protocol. Leave empty for 'ChIP-Seq Input'.

CHIP_PROTOCOL_ANTIBODY_AMOUNT - The amount of antibody used in the ChIP protocol. Leave empty for 'ChIP-Seq Input'.

CHIP_ANTIBODY - The specific antibody used in the ChIP protocol. Leave empty for 'ChIP-Seq Input'.

CHIP_ANTIBODY_PROVIDER - The name of the company, laboratory or person that provided the antibody. Leave empty for 'ChIP-Seq Input'.

CHIP_ANTIBODY_CATALOG - The catalog from which the antibody was purchased. Leave empty for 'ChIP-Seq Input'.

CHIP_ANTIBODY_LOT - The lot identifier of the antibody. Leave empty for 'ChIP-Seq Input'.

CHIP_PROTOCOL_CROSSLINK_TIME - The timespan in which the chromatin is crosslinked. Leave empty for 'ChIP-Seq Input'.

LIBRARY_GENERATION_FRAGMENT_SIZE_RANGE - The fragment size range of the preparation. Leave empty for 'ChIP-Seq Input'.
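
Several ChIP-Seq fields above are to be left empty when EXPERIMENT_TYPE is 'ChIP-Seq Input'. A minimal sketch of that rule follows, again assuming an experiment parsed into a dictionary of tag to list of values; the function name is hypothetical and the check is not taken from the IHEC validator.

```python
# Tags that the ChIP-Seq list above says to leave empty for 'ChIP-Seq Input'.
EMPTY_FOR_INPUT = [
    "CHIP_PROTOCOL_BEAD_TYPE",
    "CHIP_PROTOCOL_BEAD_AMOUNT",
    "CHIP_PROTOCOL_ANTIBODY_AMOUNT",
    "CHIP_ANTIBODY",
    "CHIP_ANTIBODY_PROVIDER",
    "CHIP_ANTIBODY_CATALOG",
    "CHIP_ANTIBODY_LOT",
    "CHIP_PROTOCOL_CROSSLINK_TIME",
    "LIBRARY_GENERATION_FRAGMENT_SIZE_RANGE",
]


def unexpected_input_fields(experiment):
    """For a 'ChIP-Seq Input' experiment, return tags that should be empty but are not.

    `experiment` maps each tag to a list of string values (assumed structure).
    Experiments of any other type always yield an empty list.
    """
    if "ChIP-Seq Input" not in experiment.get("EXPERIMENT_TYPE", []):
        return []
    return [
        tag for tag in EMPTY_FOR_INPUT
        if any(value.strip() for value in experiment.get(tag, []))
    ]


# Illustrative input experiment with one field filled in by mistake.
input_experiment = {
    "EXPERIMENT_TYPE": ["ChIP-Seq Input"],
    "CHIP_PROTOCOL": ["Input"],
    "CHIP_ANTIBODY": ["example antibody"],
    "CHIP_ANTIBODY_LOT": [""],
}
flagged = unexpected_input_fields(input_experiment)
```

Only CHIP_ANTIBODY is flagged here; CHIP_PROTOCOL is allowed to carry 'Input', and an empty CHIP_ANTIBODY_LOT is what the rule asks for.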

RNA-Seq

EXPERIMENT_TYPE: (Controlled Vocabulary) 'RNA-Seq'.

EXPERIMENT_ONTOLOGY_URI: (Ontology: OBI) 'http://purl.obolibrary.org/obo/OBI_0001271' or any of its subclasses.

LIBRARY_STRATEGY: (Controlled Vocabulary) 'RNA-Seq'.

MOLECULE_ONTOLOGY_URI: (Ontology: SO) 'http://purl.obolibrary.org/obo/SO_0000234' or any of its subclasses.

MOLECULE: (Controlled Vocabulary) 'polyA RNA', 'total RNA', 'nuclear RNA', 'cytoplasmic RNA' or 'small RNA'.

EXTRACTION_PROTOCOL - The protocol used to isolate the extracted material.

EXTRACTION_PROTOCOL_RNA_ENRICHMENT - The RNA enrichment method used in the extraction protocol.

EXTRACTION_PROTOCOL_FRAGMENTATION - The fragmentation method used in the extraction protocol.

RNA_PREPARATION_FRAGMENT_SIZE_RANGE - The RNA fragment size range of the preparation, or 'NA' if not applicable.

RNA_PREPARATION_5'_RNA_ADAPTER_SEQUENCE - The sequence of the 5’ RNA adapter used in preparation.

RNA_PREPARATION_3'_RNA_ADAPTER_SEQUENCE - The sequence of the 3’ RNA adapter used in preparation.

RNA_PREPARATION_REVERSE_TRANSCRIPTION_PRIMER_SEQUENCE - The sequence of the primer for reverse transcription used in preparation.

RNA_PREPARATION_5'_DEPHOSPHORYLATION - The protocol for 5’ dephosphorylation used in preparation.

RNA_PREPARATION_5'_PHOSPHORYLATION - The protocol for 5’ phosphorylation used in preparation.

RNA_PREPARATION_3'_RNA_ADAPTER_LIGATION_PROTOCOL - The protocol for 3’ adapter ligation used in preparation.

RNA_PREPARATION_5'_RNA_ADAPTER_LIGATION_PROTOCOL - The protocol for 5’ adapter ligation used in preparation.

LIBRARY_GENERATION_PCR_TEMPLATE_CONC - The PCR template concentration for library generation.

LIBRARY_GENERATION_PCR_POLYMERASE_TYPE - The PCR polymerase used for library generation.

LIBRARY_GENERATION_PCR_THERMOCYCLING_PROGRAM - The thermocycling program used for library generation.

LIBRARY_GENERATION_PCR_NUMBER_CYCLES - The number of PCR cycles used for library generation.

LIBRARY_GENERATION_PCR_F_PRIMER_SEQUENCE - The sequence of the PCR forward primer used for library generation.

LIBRARY_GENERATION_PCR_R_PRIMER_SEQUENCE - The sequence of the PCR reverse primer used for library generation.

LIBRARY_GENERATION_PCR_PRIMER_CONC - The concentration of the PCR primers used for library generation.

LIBRARY_GENERATION_PCR_PRODUCT_ISOLATION_PROTOCOL - The protocol for isolating PCR products used for library generation.

TEMPLATE_TYPE - (Controlled Vocabulary) 'mRNA' or 'cDNA'. The type of template, if applicable.

AMPLIFIED - (Controlled Vocabulary) 'True' or 'False'. Whether the sample was amplified.

PREPARATION_INITIAL_RNA_QNTY - The initial RNA quantity used in preparation.

PREPARATION_REVERSE_TRANSCRIPTION_PROTOCOL - The protocol for reverse transcription used in preparation.

PREPARATION_PCR_NUMBER_CYCLES - The number of PCR cycles used to amplify.

LIBRARY_GENERATION_PROTOCOL - The protocol used to generate the library.

LIBRARY_GENERATION_FRAGMENTATION - The fragmentation method used in the library protocol.

LIBRARY_GENERATION_FRAGMENT_SIZE_RANGE - The fragment size range of the preparation.

LIBRARY_GENERATION_3'_ADAPTER_SEQUENCE - The sequence of the 3' adapter used for library generation.

LIBRARY_GENERATION_5'_ADAPTER_SEQUENCE - The sequence of the 5' adapter used for library generation.
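
The fixed controlled-vocabulary values listed for each assay above can be collected into a lookup table. The sketch below is one possible arrangement, keyed by the section names of this document; the table layout and function name are hypothetical. Ontology URIs are left out because "any of its subclasses" cannot be checked without querying the ontology itself, and the ChIP-Seq EXPERIMENT_TYPE is skipped because its list is open-ended ("etc.").

```python
GENOMIC_DNA = {"genomic DNA"}

# Values copied from the per-assay sections of this document.
FIXED_VALUES = {
    "Chromatin Accessibility": {
        "EXPERIMENT_TYPE": {"Chromatin Accessibility"},
        "LIBRARY_STRATEGY": {"ATAC-Seq", "DNase-Hypersensitivity"},
        "MOLECULE": GENOMIC_DNA,
    },
    "WGBS": {
        "EXPERIMENT_TYPE": {"DNA Methylation"},
        "LIBRARY_STRATEGY": {"Bisulfite-Seq"},
        "MOLECULE": GENOMIC_DNA,
    },
    "MeDIP-Seq": {
        "EXPERIMENT_TYPE": {"DNA Methylation"},
        "LIBRARY_STRATEGY": {"MeDIP-Seq"},
        "MOLECULE": GENOMIC_DNA,
    },
    "MRE-Seq": {
        "EXPERIMENT_TYPE": {"DNA Methylation"},
        "LIBRARY_STRATEGY": {"MRE-Seq"},
        "MOLECULE": GENOMIC_DNA,
    },
    "ChIP-Seq": {
        # EXPERIMENT_TYPE is an open list in the specification, so it is not checked.
        "LIBRARY_STRATEGY": {"ChIP-Seq"},
        "MOLECULE": GENOMIC_DNA,
    },
    "RNA-Seq": {
        "EXPERIMENT_TYPE": {"RNA-Seq"},
        "LIBRARY_STRATEGY": {"RNA-Seq"},
        "MOLECULE": {
            "polyA RNA", "total RNA", "nuclear RNA",
            "cytoplasmic RNA", "small RNA",
        },
    },
}


def fixed_value_errors(assay, experiment):
    """Return one message per supplied value that the assay section does not allow.

    `experiment` maps each tag to a list of string values (assumed structure).
    """
    errors = []
    for tag, allowed in FIXED_VALUES[assay].items():
        for value in experiment.get(tag, []):
            if value not in allowed:
                errors.append(f"{tag}: {value!r} is not allowed for {assay}")
    return errors
```

For example, an RNA-Seq experiment that declares MOLECULE as 'genomic DNA' would be reported, while a WGBS experiment with 'DNA Methylation', 'Bisulfite-Seq' and 'genomic DNA' would pass.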