Module: Sequencing

Niema Moshiri edited this page Feb 2, 2017 · 34 revisions

The Sequencing module simulates the imperfections of sequencing (per-individual sequence subsampling, sequencing error, post-processing, consensus, ambiguity, etc.). See the source code for what the abstract class defines.

List of Implementations

  • Sequencing_ART454Amplicon
    • Uses ART to simulate realistic Roche 454 reads (amplicon sequencing)
    • Generates one sequencing run per sampled individual
    • Requirements:
    • Config Parameters:
      • art_454_path: The path to your art_454 executable (or simply "art_454" if it is in your PATH variable)
      • art_454_options: The command-line arguments with which to run art_454 (excluding <-A|-B>, <INPUT_SEQ_FILE>, <OUTPUT_FILE_PREFIX>, and <#_READS/#_READ_PAIRS_PER_AMPLICON>)
        • To use default settings, simply use the empty string (i.e., "")
      • art_454_amplicon_mode: The desired mode of amplicon sequencing
        • Specify "single" for single-end amplicon sequencing
        • Specify "paired" for paired-end amplicon sequencing
      • art_454_reads_pairs_per_amplicon: Number of reads (single-end) or read pairs (paired-end) per amplicon
      • out_dir: The simulation's output directory
  • Sequencing_ART454PairedEnd
    • Uses ART to simulate realistic Roche 454 reads (paired-end)
    • Generates one sequencing run per sampled individual
    • Requirements:
    • Config Parameters:
      • art_454_path: The path to your art_454 executable (or simply "art_454" if it is in your PATH variable)
      • art_454_options: The command-line arguments with which to run art_454 (excluding <INPUT_SEQ_FILE>, <OUTPUT_FILE_PREFIX>, <FOLD_COVERAGE>, <MEAN_FRAG_LEN>, and <STD_DEV>)
        • To use default settings, simply use the empty string (i.e., "")
      • art_454_fold_coverage: The desired fold of read coverage
      • art_454_mean_frag_len: The average DNA fragment size for paired-end read simulation
      • art_454_std_dev: The standard deviation of the DNA fragment size for paired-end read simulation
      • out_dir: The simulation's output directory
  • Sequencing_ART454SingleEnd
    • Uses ART to simulate realistic Roche 454 reads (single-end)
    • Generates one sequencing run per sampled individual
    • Requirements:
    • Config Parameters:
      • art_454_path: The path to your art_454 executable (or simply "art_454" if it is in your PATH variable)
      • art_454_options: The command-line arguments with which to run art_454 (excluding <INPUT_SEQ_FILE>, <OUTPUT_FILE_PREFIX>, and <FOLD_COVERAGE>)
        • To use default settings, simply use the empty string (i.e., "")
      • art_454_fold_coverage: The desired fold of read coverage
      • out_dir: The simulation's output directory
  • Sequencing_ARTillumina
    • Uses ART to simulate realistic Illumina NGS sequence data from the true sequences
    • Generates one sequencing run per sampled individual
    • Requirements:
    • Config Parameters:
      • art_illumina_path: The path to your art_illumina executable (or simply "art_illumina" if it is in your PATH variable)
      • art_illumina_options: The command-line arguments with which to run art_illumina (excluding -i and -o)
      • out_dir: The simulation's output directory
  • Sequencing_SOLiDSingleEnd
    • Uses ART to simulate realistic SOLiD reads (single-end)
    • Generates one sequencing run per sampled individual
    • Requirements:
    • Config Parameters:
      • art_SOLiD_path: The path to your art_SOLiD executable (or simply "art_SOLiD" if it is in your PATH variable)
      • art_SOLiD_options: The command-line arguments with which to run art_SOLiD (excluding <INPUT_SEQ_FILE>, <OUTPUT_FILE_PREFIX>, <LEN_READ>, and <FOLD_COVERAGE>)
        • To use default settings, simply use the empty string (i.e., "")
      • art_SOLiD_len_read: The desired length of F3/R3 reads (max 75)
      • art_SOLiD_fold_coverage: The desired fold of read coverage
      • out_dir: The simulation's output directory
  • Sequencing_DWGSIM
    • Uses DWGSIM to simulate realistic NGS sequence data from the true sequences
    • Generates one sequencing run per sampled individual
    • Requirements:
    • Config Parameters:
      • dwgsim_path: The path to your DWGSIM executable (or simply "dwgsim" if it is in your PATH variable)
      • dwgsim_options: The command-line options with which to run DWGSIM (just the options, not <in.ref.fa> or <out.prefix>)
        • To use default settings, simply use the empty string (i.e., "")
      • out_dir: The simulation's output directory
  • Sequencing_Perfect
    • Returns full error-free sequences for all viruses
    • Requirements:
      • None
    • Config Parameters:
      • out_dir: The simulation's output directory
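
To illustrate how the config parameters of the ART-based implementations fit together, here is a minimal sketch for Sequencing_ART454SingleEnd. It is not the module's actual code. The function name `build_art_454_single_end_command`, the config values, and the file names are hypothetical. The sketch assumes that the user-supplied options come first, followed by the three positional arguments listed above as excluded from art_454_options (<INPUT_SEQ_FILE>, <OUTPUT_FILE_PREFIX>, <FOLD_COVERAGE>), in that order.

```python
import shlex

# Hypothetical config fragment: the keys are the parameters documented above,
# the values are placeholders chosen only for illustration.
config = {
    "art_454_path": "art_454",    # executable is assumed to be on PATH
    "art_454_options": "",        # empty string = use art_454 defaults
    "art_454_fold_coverage": 10,
    "out_dir": "simulation_output",
}


def build_art_454_single_end_command(cfg, input_seq_file, output_prefix):
    """Assemble an art_454 single-end command line as a list of arguments.

    Assumption: options precede the positional arguments
    <INPUT_SEQ_FILE> <OUTPUT_FILE_PREFIX> <FOLD_COVERAGE>.
    """
    command = [cfg["art_454_path"]]
    # shlex.split("") returns [], so the empty string adds no arguments
    command += shlex.split(cfg["art_454_options"])
    command += [input_seq_file, output_prefix, str(cfg["art_454_fold_coverage"])]
    return command


# One sequencing run per sampled individual (file names are hypothetical)
command = build_art_454_single_end_command(
    config,
    "individual_1.fasta",
    config["out_dir"] + "/individual_1",
)
print(" ".join(command))
```

The resulting list can be passed directly to `subprocess.run`. Keeping the user-supplied options as one string in the config and splitting them only when the command is assembled is why the positional arguments must be excluded from art_454_options: the module supplies them itself.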