Module: Sequencing

Niema Moshiri edited this page Mar 30, 2017 · 34 revisions

The Sequencing module simulates sequencing imperfections:

  • sequence subsampling per individual,
  • sequencing error,
  • post-processing, and
  • consensus (ambiguity, etc.).

See the source code for what the abstract class defines.

List of Implementations

  • Sequencing_ART454Amplicon
    • Uses ART to simulate realistic Roche 454 reads (amplicon sequencing)
    • Generates one sequencing run per sampled individual
    • Requirements:
      • ART
    • Config Parameters:
      • art_454_path: The path to your art_454 executable (or simply "art_454" if it is in your PATH variable)
      • art_454_options: The command-line arguments with which to run art_454 (excluding <-A|-B>, <INPUT_SEQ_FILE>, <OUTPUT_FILE_PREFIX>, and <#_READS/#_READ_PAIRS_PER_AMPLICON>)
        • To use default settings, simply use the empty string (i.e., "")
      • art_454_amplicon_mode: The desired mode of amplicon sequencing
        • Specify "single" for single-end amplicon sequencing
        • Specify "paired" for paired-end amplicon sequencing
      • art_454_reads_pairs_per_amplicon: Number of reads (single-end) or read pairs (paired-end) per amplicon
      • out_dir: The simulation's output directory
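As a rough illustration of how these parameters might fit together, the Python sketch below assembles an art_454 amplicon command line from a config dictionary. The helper name `build_art_454_amplicon_command`, the example file names, the argument order, and the mapping of "single" to -A and "paired" to -B are assumptions based on the placeholders listed above, not FAVITES's actual implementation.

```python
import shlex

def build_art_454_amplicon_command(config, input_seq_file, output_prefix):
    """Assemble an art_454 amplicon command as an argument list (hypothetical helper)."""
    # Assumed mapping of the amplicon mode onto the <-A|-B> flag
    mode_flags = {"single": "-A", "paired": "-B"}
    mode = config["art_454_amplicon_mode"]
    if mode not in mode_flags:
        raise ValueError('art_454_amplicon_mode must be "single" or "paired"')
    command = [config["art_454_path"]]
    # User-supplied options; the empty string adds nothing (default settings)
    command += shlex.split(config["art_454_options"])
    command += [mode_flags[mode], input_seq_file, output_prefix,
                str(config["art_454_reads_pairs_per_amplicon"])]
    return command

# Example values are illustrative only
example_config = {
    "art_454_path": "art_454",
    "art_454_options": "",
    "art_454_amplicon_mode": "paired",
    "art_454_reads_pairs_per_amplicon": 10,
    "out_dir": "FAVITES_output",
}
print(build_art_454_amplicon_command(example_config, "individual.fasta", "individual_reads"))
```

Keeping the command as a list of arguments (rather than one string) avoids shell-quoting problems if it is later passed to something like `subprocess.run`.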
  • Sequencing_ART454PairedEnd
    • Uses ART to simulate realistic Roche 454 reads (paired-end)
    • Generates one sequencing run per sampled individual
    • Requirements:
      • ART
    • Config Parameters:
      • art_454_path: The path to your art_454 executable (or simply "art_454" if it is in your PATH variable)
      • art_454_options: The command-line arguments with which to run art_454 (excluding <INPUT_SEQ_FILE>, <OUTPUT_FILE_PREFIX>, <FOLD_COVERAGE>, <MEAN_FRAG_LEN>, and <STD_DEV>)
        • To use default settings, simply use the empty string (i.e., "")
      • art_454_fold_coverage: The desired fold of read coverage
      • art_454_mean_frag_len: The average DNA fragment size for paired-end read simulation
      • art_454_std_dev: The standard deviation of the DNA fragment size for paired-end read simulation
      • out_dir: The simulation's output directory
  • Sequencing_ART454SingleEnd
    • Uses ART to simulate realistic Roche 454 reads (single-end)
    • Generates one sequencing run per sampled individual
    • Requirements:
      • ART
    • Config Parameters:
      • art_454_path: The path to your art_454 executable (or simply "art_454" if it is in your PATH variable)
      • art_454_options: The command-line arguments with which to run art_454 (excluding <INPUT_SEQ_FILE>, <OUTPUT_FILE_PREFIX>, and <FOLD_COVERAGE>)
        • To use default settings, simply use the empty string (i.e., "")
      • art_454_fold_coverage: The desired fold of read coverage
      • out_dir: The simulation's output directory
  • Sequencing_ARTillumina
    • Uses ART to simulate realistic Illumina NGS sequence data from the true sequences
    • Generates one sequencing run per sampled individual
    • Requirements:
      • ART
    • Config Parameters:
      • art_illumina_path: The path to your art_illumina executable (or simply "art_illumina" if it is in your PATH variable)
      • art_illumina_options: The command-line arguments with which to run art_illumina (excluding -i and -o)
      • out_dir: The simulation's output directory
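Because art_illumina_options must not contain -i or -o, a tool that consumes this config presumably adds them itself. The sketch below shows one way that could look. The helper name `build_art_illumina_command`, the file names, and the example option string are assumptions for illustration, and the rejection of -i/-o (and their long forms) is a design choice of this sketch rather than documented FAVITES behavior.

```python
import shlex

# Flags the module is expected to supply itself (short and long forms)
RESERVED_FLAGS = {"-i", "--in", "-o", "--out"}

def build_art_illumina_command(config, input_fasta, output_prefix):
    """Assemble an art_illumina command as an argument list (hypothetical helper)."""
    options = shlex.split(config["art_illumina_options"])
    clashes = RESERVED_FLAGS.intersection(options)
    if clashes:
        raise ValueError("art_illumina_options must not contain: " + ", ".join(sorted(clashes)))
    return [config["art_illumina_path"]] + options + ["-i", input_fasta, "-o", output_prefix]

# Example values are illustrative only
example_config = {
    "art_illumina_path": "art_illumina",
    "art_illumina_options": "-ss HS25 -l 150 -f 10",
    "out_dir": "FAVITES_output",
}
print(build_art_illumina_command(example_config, "individual.fasta", "individual_reads"))
```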
  • Sequencing_ARTSOLiDAmpliconMatePair
    • Uses ART to simulate realistic SOLiD reads (amplicon mate-pair, F3-R3)
    • Generates one sequencing run per sampled individual
    • Requirements:
      • ART
    • Config Parameters:
      • art_SOLiD_path: The path to your art_SOLiD executable (or simply "art_SOLiD" if it is in your PATH variable)
      • art_SOLiD_options: The command-line arguments with which to run art_SOLiD (excluding <INPUT_SEQ_FILE>, <OUTPUT_FILE_PREFIX>, <LEN_READ>, and <READ_PAIRS_PER_AMPLICON>)
        • To use default settings, simply use the empty string (i.e., "")
      • art_SOLiD_len_read: The desired length of F3/R3 reads (max 75)
      • art_SOLiD_read_pairs_per_amplicon: The desired number of read pairs per amplicon
      • out_dir: The simulation's output directory
  • Sequencing_ARTSOLiDAmpliconPairedEnd
    • Uses ART to simulate realistic SOLiD reads (amplicon paired-end, F3-F5)
    • Generates one sequencing run per sampled individual
    • Requirements:
      • ART
    • Config Parameters:
      • art_SOLiD_path: The path to your art_SOLiD executable (or simply "art_SOLiD" if it is in your PATH variable)
      • art_SOLiD_options: The command-line arguments with which to run art_SOLiD (excluding <INPUT_SEQ_FILE>, <OUTPUT_FILE_PREFIX>, <LEN_READ_F3>, <LEN_READ_F5>, and <READ_PAIRS_PER_AMPLICON>)
        • To use default settings, simply use the empty string (i.e., "")
      • art_SOLiD_len_read_F3: The desired length of F3 reads (max 75)
      • art_SOLiD_len_read_F5: The desired length of F5 reads (max 75)
      • art_SOLiD_read_pairs_per_amplicon: The desired number of read pairs per amplicon
      • out_dir: The simulation's output directory
  • Sequencing_ARTSOLiDAmpliconSingleEnd
    • Uses ART to simulate realistic SOLiD reads (amplicon single-end, F3)
    • Generates one sequencing run per sampled individual
    • Requirements:
      • ART
    • Config Parameters:
      • art_SOLiD_path: The path to your art_SOLiD executable (or simply "art_SOLiD" if it is in your PATH variable)
      • art_SOLiD_options: The command-line arguments with which to run art_SOLiD (excluding <INPUT_SEQ_FILE>, <OUTPUT_FILE_PREFIX>, <LEN_READ>, and <READS_PER_AMPLICON>)
        • To use default settings, simply use the empty string (i.e., "")
      • art_SOLiD_len_read: The desired length of F3 reads (max 75)
      • art_SOLiD_reads_per_amplicon: The desired number of reads per amplicon
      • out_dir: The simulation's output directory
  • Sequencing_ARTSOLiDMatePair
    • Uses ART to simulate realistic SOLiD reads (mate-pair, F3-R3)
    • Generates one sequencing run per sampled individual
    • Requirements:
      • ART
    • Config Parameters:
      • art_SOLiD_path: The path to your art_SOLiD executable (or simply "art_SOLiD" if it is in your PATH variable)
      • art_SOLiD_options: The command-line arguments with which to run art_SOLiD (excluding <INPUT_SEQ_FILE>, <OUTPUT_FILE_PREFIX>, <LEN_READ>, <FOLD_COVERAGE>, <MEAN_FRAG_LEN>, and <STD_DEV>)
        • To use default settings, simply use the empty string (i.e., "")
      • art_SOLiD_len_read: The desired length of F3/R3 reads (max 75)
      • art_SOLiD_fold_coverage: The desired fold of read coverage
      • art_SOLiD_mean_frag_len: The mean fragment size for mate-pair read simulation
      • art_SOLiD_std_dev: The standard deviation of the fragment size for mate-pair simulation
      • out_dir: The simulation's output directory
  • Sequencing_ARTSOLiDPairedEnd
    • Uses ART to simulate realistic SOLiD reads (paired-end, F3-F5)
    • Generates one sequencing run per sampled individual
    • Requirements:
      • ART
    • Config Parameters:
      • art_SOLiD_path: The path to your art_SOLiD executable (or simply "art_SOLiD" if it is in your PATH variable)
      • art_SOLiD_options: The command-line arguments with which to run art_SOLiD (excluding <INPUT_SEQ_FILE>, <OUTPUT_FILE_PREFIX>, <LEN_READ_F3>, <LEN_READ_F5>, <FOLD_COVERAGE>, <MEAN_FRAG_LEN>, and <STD_DEV>)
        • To use default settings, simply use the empty string (i.e., "")
      • art_SOLiD_len_read_F3: The desired length of F3 reads (max 75)
      • art_SOLiD_len_read_F5: The desired length of F5 reads (max 75)
      • art_SOLiD_fold_coverage: The desired fold of read coverage
      • art_SOLiD_mean_frag_len: The mean fragment size for paired-end read simulation
      • art_SOLiD_std_dev: The standard deviation of the fragment size for paired-end read simulation
      • out_dir: The simulation's output directory
  • Sequencing_ARTSOLiDSingleEnd
    • Uses ART to simulate realistic SOLiD reads (single-end, F3)
    • Generates one sequencing run per sampled individual
    • Requirements:
      • ART
    • Config Parameters:
      • art_SOLiD_path: The path to your art_SOLiD executable (or simply "art_SOLiD" if it is in your PATH variable)
      • art_SOLiD_options: The command-line arguments with which to run art_SOLiD (excluding <INPUT_SEQ_FILE>, <OUTPUT_FILE_PREFIX>, <LEN_READ>, and <FOLD_COVERAGE>)
        • To use default settings, simply use the empty string (i.e., "")
      • art_SOLiD_len_read: The desired length of F3 reads (max 75)
      • art_SOLiD_fold_coverage: The desired fold of read coverage
      • out_dir: The simulation's output directory
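All of the SOLiD read-length parameters above share the same stated maximum of 75. A config could be checked against that limit before any simulation is launched, as in this minimal sketch. The helper name `check_solid_read_lengths` and the lower bound of 1 are assumptions of the example, not part of FAVITES or ART.

```python
MAX_SOLID_READ_LENGTH = 75  # maximum stated in the parameter descriptions above

def check_solid_read_lengths(config):
    """Return error messages for out-of-range SOLiD read lengths (hypothetical helper)."""
    errors = []
    for key in sorted(config):
        # Covers art_SOLiD_len_read, art_SOLiD_len_read_F3, and art_SOLiD_len_read_F5
        if not key.startswith("art_SOLiD_len_read"):
            continue
        value = config[key]
        # The lower bound of 1 is an assumption made for this sketch
        if not isinstance(value, int) or not 1 <= value <= MAX_SOLID_READ_LENGTH:
            errors.append("%s must be an integer from 1 to %d, got %r"
                          % (key, MAX_SOLID_READ_LENGTH, value))
    return errors

# Example values are illustrative only
print(check_solid_read_lengths({"art_SOLiD_len_read_F3": 50, "art_SOLiD_len_read_F5": 100}))
```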
  • Sequencing_DWGSIM
    • Uses DWGSIM to simulate realistic NGS sequence data from the true sequences
    • Generates one sequencing run per sampled individual
    • Requirements:
      • DWGSIM
    • Config Parameters:
      • dwgsim_path: The path to your DWGSIM executable (or simply "dwgsim" if it is in your PATH variable)
      • dwgsim_options: The command-line options with which to run DWGSIM (just the options, not <in.ref.fa> or <out.prefix>)
        • To use default settings, simply use the empty string (i.e., "")
      • out_dir: The simulation's output directory
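Since one sequencing run is generated per sampled individual, the DWGSIM parameters would be applied once per individual. The sketch below builds one command per individual following the `<in.ref.fa> <out.prefix>` layout mentioned above. The helper name `dwgsim_commands`, the input dictionary, and the `dwgsim_` output-prefix naming scheme are hypothetical; the actual FAVITES output layout may differ.

```python
import os
import shlex

def dwgsim_commands(config, individual_fastas):
    """Build one DWGSIM command per sampled individual (hypothetical helper).

    individual_fastas maps an individual's name to the path of its FASTA file.
    """
    options = shlex.split(config["dwgsim_options"])  # "" yields no extra options
    commands = []
    for name, fasta_path in sorted(individual_fastas.items()):
        # Hypothetical naming scheme for the per-individual output prefix
        out_prefix = os.path.join(config["out_dir"], "dwgsim_" + name)
        commands.append([config["dwgsim_path"]] + options + [fasta_path, out_prefix])
    return commands

# Example values are illustrative only
example_config = {"dwgsim_path": "dwgsim", "dwgsim_options": "-1 100 -2 100", "out_dir": "FAVITES_output"}
for command in dwgsim_commands(example_config, {"A": "A.fasta", "B": "B.fasta"}):
    print(command)
```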
  • Sequencing_Grinder
    • Uses Grinder to simulate realistic Sanger sequence data from the true sequences
    • Generates one sequencing run per sampled individual
    • Requirements:
      • Grinder
    • Config Parameters:
      • grinder_path: The path to your Grinder executable (or simply "grinder" if it is in your PATH variable)
      • out_dir: The simulation's output directory
  • Sequencing_Perfect
    • Returns full error-free sequences for all viruses
    • Requirements:
      • None
    • Config Parameters:
      • out_dir: The simulation's output directory
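The parameter lists above can also be read as a per-implementation checklist. The following sketch records a few of them in a lookup table and reports which parameters a config dictionary is missing. The table covers only a subset of the implementations, and the names `REQUIRED_PARAMETERS` and `missing_parameters` are hypothetical, not part of FAVITES.

```python
# Parameter names copied from the lists above (subset of implementations only)
REQUIRED_PARAMETERS = {
    "Sequencing_Perfect": ["out_dir"],
    "Sequencing_DWGSIM": ["dwgsim_path", "dwgsim_options", "out_dir"],
    "Sequencing_Grinder": ["grinder_path", "out_dir"],
    "Sequencing_ARTillumina": ["art_illumina_path", "art_illumina_options", "out_dir"],
    "Sequencing_ART454SingleEnd": ["art_454_path", "art_454_options",
                                   "art_454_fold_coverage", "out_dir"],
}

def missing_parameters(module_name, config):
    """Return the parameters the named implementation needs but the config lacks."""
    return [name for name in REQUIRED_PARAMETERS[module_name] if name not in config]

# Example values are illustrative only
example_config = {"art_454_path": "art_454", "art_454_options": "", "out_dir": "FAVITES_output"}
print(missing_parameters("Sequencing_ART454SingleEnd", example_config))
```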