GetOrganelle v1.7.5

get_organelle_from_reads.py assembles organelle genomes from genome skimming data.
Find updates in https://github.com/Kinggerm/GetOrganelle and see README.md for more information.

Python 3.9.9 | packaged by conda-forge | (main, Dec 20 2021, 02:41:03) [GCC 9.4.0]
PLATFORM: Linux gavin-Precision-WorkStation-T7500 4.15.0-166-generic #174-Ubuntu SMP Wed Dec 8 19:07:44 UTC 2021 x86_64 x86_64
PYTHON LIBS: GetOrganelleLib 1.7.5; numpy 1.20.3; sympy 1.9; scipy 1.7.3
DEPENDENCIES: Bowtie2 2.4.2; SPAdes genome assembler 3.14.1; Blast 2.5.0
GETORG_PATH=/home/gavin/.GetOrganelle
SEED DB: embplant_pt 0.0.1; embplant_mt 0.0.1
LABEL DB: embplant_pt 0.0.1; embplant_mt 0.0.1
WORKING DIR: /media/maria
/home/gavin/anaconda3/envs/getorganelle/bin/get_organelle_from_reads.py -1 /media/maria/clean_data/SZAIPI029529-84/130606_I238_FCC1Y1TACXX_L2_SZAIPI029529-84_1.fq.gz.clean.dup.clean.gz -2 /media/maria/clean_data/SZAIPI029529-84/130606_I238_FCC1Y1TACXX_L2_SZAIPI029529-84_2.fq.gz.clean.dup.clean.gz -t 1 -o malaccensis.R30.plastome -F embplant_pt -R 30

2022-01-07 13:28:00,869 - INFO: Pre-reading fastq ...
2022-01-07 13:28:00,869 - INFO: Estimating reads to use ... (to use all reads, set '--reduce-reads-for-coverage inf --max-reads inf')
2022-01-07 13:28:00,945 - INFO: Tasting 100000+100000 reads ...
2022-01-07 13:28:24,032 - INFO: Estimating reads to use finished.
2022-01-07 13:28:24,032 - INFO: Unzipping reads file: /media/maria/clean_data/SZAIPI029529-84/130606_I238_FCC1Y1TACXX_L2_SZAIPI029529-84_1.fq.gz.clean.dup.clean.gz (381641542 bytes)
2022-01-07 13:28:32,222 - INFO: Unzipping reads file: /media/maria/clean_data/SZAIPI029529-84/130606_I238_FCC1Y1TACXX_L2_SZAIPI029529-84_2.fq.gz.clean.dup.clean.gz (381866209 bytes)
2022-01-07 13:28:40,344 - INFO: Counting read qualities ...
2022-01-07 13:28:40,461 - INFO: Identified quality encoding format = Illumina 1.5+
2022-01-07 13:28:40,461 - INFO: Phred offset = 64
2022-01-07 13:28:40,461 - WARNING: Min quality score 'B'(66:2) in your fastq file is under the usual lower boundary (67, 105)
2022-01-07 13:28:40,541 - INFO: Mean error rate = 0.0024
2022-01-07 13:28:40,542 - INFO: Counting read lengths ...
2022-01-07 13:28:52,971 - INFO: Mean = 100.0 bp, maximum = 100 bp.
2022-01-07 13:28:52,971 - INFO: Reads used = 5015458+5015458
2022-01-07 13:28:52,971 - INFO: Pre-reading fastq finished.
2022-01-07 13:28:52,971 - INFO: Making seed reads ...
2022-01-07 13:28:52,972 - INFO: Seed bowtie2 index existed!
2022-01-07 13:28:52,972 - INFO: Mapping reads to seed bowtie2 index ...
2022-01-07 13:37:30,710 - INFO: Mapping finished.
2022-01-07 13:37:30,710 - INFO: Seed reads made: malaccensis.R30.plastome/seed/embplant_pt.initial.fq (234245672 bytes)
2022-01-07 13:37:30,710 - INFO: Making seed reads finished.
2022-01-07 13:37:30,710 - INFO: Checking seed reads and parameters ...
2022-01-07 13:37:30,710 - INFO: The automatically-estimated parameter(s) do not ensure the best choice(s).
2022-01-07 13:37:30,711 - INFO: If the result graph is not a circular organelle genome,
2022-01-07 13:37:30,711 - INFO: you could adjust the value(s) of '-w'/'-R' for another new run.
2022-01-07 13:38:18,984 - INFO: Pre-assembling mapped reads ...
2022-01-07 13:38:54,631 - INFO: Pre-assembling mapped reads finished.
2022-01-07 13:38:54,632 - INFO: Estimated embplant_pt-hitting base-coverage = 980.08
2022-01-07 13:38:55,028 - INFO: Reads reduced to = 2558706+2558706
2022-01-07 13:38:55,028 - INFO: Adjusting expected embplant_pt base coverage to 500.00
2022-01-07 13:38:55,028 - INFO: Estimated word size(s): 75
2022-01-07 13:38:55,028 - INFO: Setting '-w 75'
2022-01-07 13:38:55,028 - INFO: Setting '--max-extending-len inf'
2022-01-07 13:38:56,297 - INFO: Checking seed reads and parameters finished.
2022-01-07 13:38:56,297 - INFO: Making read index ...
2022-01-07 13:39:08,238 - INFO: For malaccensis.R30.plastome/1-130606_I238_FCC1Y1TACXX_L2_SZAIPI029529-84_1.fq.gz.clean.dup.clean.gz.fastq, only top 2558706 reads are used in downstream analysis.
2022-01-07 13:39:20,374 - INFO: For malaccensis.R30.plastome/2-130606_I238_FCC1Y1TACXX_L2_SZAIPI029529-84_2.fq.gz.clean.dup.clean.gz.fastq, only top 2558706 reads are used in downstream analysis.
2022-01-07 13:39:23,919 - INFO: 4289703 candidates in all 5117412 reads
2022-01-07 13:39:23,931 - INFO: Pre-grouping reads ...
2022-01-07 13:39:23,931 - INFO: Setting '--pre-w 75'
2022-01-07 13:39:24,377 - INFO: 200000/244261 used/duplicated
2022-01-07 13:39:31,576 - INFO: 6945 groups made.
2022-01-07 13:39:32,074 - INFO: Making read index finished.
2022-01-07 13:39:32,075 - INFO: Extending ...
2022-01-07 13:39:32,075 - INFO: Adding initial words ...
2022-01-07 13:39:48,546 - INFO: AW 2002830
2022-01-07 13:40:08,159 - INFO: Round 1: 4289703/4289703 AI 156289 AW 2175940
2022-01-07 13:40:25,384 - INFO: Round 2: 4289703/4289703 AI 157291 AW 2187088
2022-01-07 13:40:42,560 - INFO: Round 3: 4289703/4289703 AI 157943 AW 2193418
2022-01-07 13:40:59,742 - INFO: Round 4: 4289703/4289703 AI 158470 AW 2198716
2022-01-07 13:41:16,870 - INFO: Round 5: 4289703/4289703 AI 158856 AW 2202458
2022-01-07 13:41:34,007 - INFO: Round 6: 4289703/4289703 AI 159213 AW 2205840
2022-01-07 13:41:51,181 - INFO: Round 7: 4289703/4289703 AI 159535 AW 2208992
2022-01-07 13:42:08,350 - INFO: Round 8: 4289703/4289703 AI 159809 AW 2211590
2022-01-07 13:42:25,501 - INFO: Round 9: 4289703/4289703 AI 160054 AW 2213770
2022-01-07 13:42:42,670 - INFO: Round 10: 4289703/4289703 AI 160276 AW 2215646
2022-01-07 13:42:59,849 - INFO: Round 11: 4289703/4289703 AI 160550 AW 2217782
2022-01-07 13:43:16,999 - INFO: Round 12: 4289703/4289703 AI 160746 AW 2219126
2022-01-07 13:43:34,178 - INFO: Round 13: 4289703/4289703 AI 160921 AW 2220714
2022-01-07 13:43:51,357 - INFO: Round 14: 4289703/4289703 AI 161202 AW 2222876
2022-01-07 13:44:08,558 - INFO: Round 15: 4289703/4289703 AI 161408 AW 2224548
2022-01-07 13:44:25,711 - INFO: Round 16: 4289703/4289703 AI 161596 AW 2226020
2022-01-07 13:44:42,855 - INFO: Round 17: 4289703/4289703 AI 161744 AW 2227196
2022-01-07 13:45:00,029 - INFO: Round 18: 4289703/4289703 AI 161903 AW 2228516
2022-01-07 13:45:17,197 - INFO: Round 19: 4289703/4289703 AI 162065 AW 2229740
2022-01-07 13:45:34,362 - INFO: Round 20: 4289703/4289703 AI 162140 AW 2230292
2022-01-07 13:45:51,558 - INFO: Round 21: 4289703/4289703 AI 162197 AW 2230914
2022-01-07 13:46:08,731 - INFO: Round 22: 4289703/4289703 AI 162340 AW 2231942
2022-01-07 13:46:25,916 - INFO: Round 23: 4289703/4289703 AI 162421 AW 2232714
2022-01-07 13:46:43,103 - INFO: Round 24: 4289703/4289703 AI 162538 AW 2233656
2022-01-07 13:47:00,261 - INFO: Round 25: 4289703/4289703 AI 162637 AW 2234552
2022-01-07 13:47:17,404 - INFO: Round 26: 4289703/4289703 AI 162739 AW 2235248
2022-01-07 13:47:34,599 - INFO: Round 27: 4289703/4289703 AI 162803 AW 2235886
2022-01-07 13:47:51,825 - INFO: Round 28: 4289703/4289703 AI 162892 AW 2236520
2022-01-07 13:48:09,003 - INFO: Round 29: 4289703/4289703 AI 162961 AW 2237184
2022-01-07 13:48:26,174 - INFO: Round 30: 4289703/4289703 AI 163058 AW 2237962
2022-01-07 13:48:26,175 - INFO: Hit the round limit 30 and terminated ...
2022-01-07 13:48:34,100 - INFO: Extending finished.
2022-01-07 13:48:34,324 - INFO: Separating extended fastq file ...
2022-01-07 13:48:36,482 - INFO: Setting '-k 21,55,85'
2022-01-07 13:48:36,482 - INFO: Assembling using SPAdes ...
2022-01-07 13:48:36,494 - INFO: spades.py -t 1 --phred-offset 64 -1 malaccensis.R30.plastome/extended_1_paired.fq -2 malaccensis.R30.plastome/extended_2_paired.fq --s1 malaccensis.R30.plastome/extended_1_unpaired.fq --s2 malaccensis.R30.plastome/extended_2_unpaired.fq -k 21,55,85 -o malaccensis.R30.plastome/extended_spades
2022-01-07 13:52:19,337 - INFO: Insert size = 470.338, deviation = 14.0724, left quantile = 453, right quantile = 488
2022-01-07 13:52:19,337 - INFO: Assembling finished.
2022-01-07 13:52:20,502 - INFO: Slimming malaccensis.R30.plastome/extended_spades/K85/assembly_graph.fastg finished!
2022-01-07 13:52:20,503 - INFO: Slimming assembly graphs finished.
2022-01-07 13:52:20,503 - INFO: Extracting embplant_pt from the assemblies ...
2022-01-07 13:52:20,503 - INFO: Disentangling malaccensis.R30.plastome/extended_spades/K85/assembly_graph.fastg.extend-embplant_pt-embplant_mt.fastg as a circular genome ...
2022-01-07 13:52:21,022 - INFO: Disentangling failed: 'Multiple isolated embplant_pt components detected! Broken or contamination?'
2022-01-07 13:52:21,023 - INFO: Scaffolding disconnected contigs using SPAdes scaffolds ...
2022-01-07 13:52:21,023 - WARNING: Assembly based on scaffolding may not be as accurate as the ones directly exported from the assembly graph.
2022-01-07 13:52:21,023 - INFO: Disentangling malaccensis.R30.plastome/extended_spades/K85/assembly_graph.fastg.extend-embplant_pt-embplant_mt.fastg as a circular genome ...
2022-01-07 13:52:21,056 - WARNING: -400-bp gap/overlap between 45 and 4281 indicated while conflicting connections existed!
2022-01-07 13:52:21,059 - WARNING: Connection between 5015_tail and 11079_tail already existed!
2022-01-07 13:52:21,142 - INFO: Disentangling failed: 'Multiple isolated embplant_pt components detected! Broken or contamination?'
2022-01-07 13:52:21,142 - INFO: Disentangling malaccensis.R30.plastome/extended_spades/K85/assembly_graph.fastg.extend-embplant_pt-embplant_mt.fastg as a/an embplant_pt-insufficient graph ...
2022-01-07 13:52:21,719 - INFO: Average embplant_pt kmer-coverage = 89.7
2022-01-07 13:52:21,719 - INFO: Average embplant_pt base-coverage = 560.4
2022-01-07 13:52:21,719 - INFO: Writing output ...
2022-01-07 13:52:21,790 - INFO: Writing PATH1 of embplant_pt scaffold(s) to malaccensis.R30.plastome/embplant_pt.K85.scaffolds.graph1.1.path_sequence.fasta
2022-01-07 13:52:21,791 - INFO: Writing GRAPH to malaccensis.R30.plastome/embplant_pt.K85.contigs.graph1.selected_graph.gfa
2022-01-07 13:52:21,791 - INFO: Result status of embplant_pt: 9 scaffold(s)
2022-01-07 13:52:21,825 - INFO: Writing output finished.
2022-01-07 13:52:21,825 - INFO: Please ...
2022-01-07 13:52:21,825 - INFO: load the graph file 'assembly_graph.fastg.extend-embplant_pt-embplant_mt.fastg' in K85
2022-01-07 13:52:21,825 - INFO: load the CSV file 'assembly_graph.fastg.extend-embplant_pt-embplant_mt.csv' in K85
2022-01-07 13:52:21,825 - INFO: visualize and confirm the incomplete result in Bandage.
2022-01-07 13:52:21,826 - INFO: If the result is nearly complete,
2022-01-07 13:52:21,826 - INFO: you can also adjust the arguments according to https://github.com/Kinggerm/GetOrganelle/wiki/FAQ#what-should-i-do-with-incomplete-resultbroken-assembly-graph
2022-01-07 13:52:21,826 - INFO: If you have questions for us, please provide us with the get_org.log.txt file and the post-slimming graph in the format you like!
2022-01-07 13:52:21,826 - INFO: Extracting embplant_pt from the assemblies finished.

Total cost 1462.22 s
Thank you!