Infernal output could align to the negative strand #27

Merged: 2 commits, Jun 14, 2023

nh13
Contributor

@nh13 nh13 commented Jun 12, 2023

@jeffreybarrick
Contributor

@nh13 Some questions about this PR...

  1. This change doesn't seem necessary. Do you have an example? I see Infernal predictions shown on both strands in the webserver version. For example, I am using Addgene #85494 as an example input file for testing.

  2. I get a crash when I try to run Addgene #193140 with your code.

Please advise!

@nh13
Contributor Author

nh13 commented Jun 13, 2023

My apologies, I should have included the test case from the start.

Command:

plannotate batch -i test.fasta -y debug.yaml -o tmp/ -f test -s '' -c -d

test.fasta:

>SRR11171709.78.1 78 length=7306
CATAGAGAGATAGTATTGTAACCTGACAAAGGTTAAATTAGAAAAGAGCTTTATGAAAATAATCTCCGAGATAAAAACAAACCCGCTAGAAACATATTACGATGACATGCTTGTTTCATCAGGTGAAAAATTCCATTTTGTGGTGATATAGGACAGACAAAAAATATAGATCATTTTTCTTCCTATTGCACATTATCCTGATTTTCGGTGATGCCTATTATTTAGTTCCATCGTGCCGCGACTGCAATATGGGAGAGAAAGGTCAAGTTTTCGCAGTAGATGAGGTAACACCAGCGATTCATCCCTATATCGACAAGGACATTTTTTTTCGTGAGCAATGGGTATATGCAAATTTCGTTTCCGGAACTCCGGGTGCTATCAGTTTTTATGTTGAATGCCCGGCGAACTGGAGGCAGGAAGACAAACACAGAGCTCTTCATCATTTCAAGCTATTAAATATTGCTAACAGGTATCGTTTGGAGGCAGGGAAGCACTTGAGTGAAGTGATTACTCAAAGAAACTCTTTCGTAAAAGTTATAAGGAAATATAGTTCAACCGCAACGTTTCAGCAGCTACAGTCAGAATTTATTGAAGCAAATCTGAAACCTATTATAGATTTGAATGACTTCCCAATTATTGGAAAAGAGTTATGTATCAGTGCCTAGCAAACTCGGAAGATTTTTTCAGAGGGATCTAGAATATGATGAAAGATAGAAAATTACGACGCTTATCGGAAGTGAACGAATACTTTTTATATGAGGAGGGCTGTTTTTACAAAATCCGGTAGTAAACTTGCTAACCAATTCCTAGGCAGGTCATTGGGCCAACAGTGGCATGCACCGAGAAGGACGTTTGTAATGTCCGCTCCGGCACATAGCAGTCCTAGGGACAGTGGCGTACAGTCATAGATGGTCGGTGGGAGGTGGTACAAATTCTCTCATGCAAAAAATATGTAAAAATCGGTAGGCAAACTGGAAAATCATGCAACACCCGCACTATCGGAAGTTCACCAGCCAGCCCGCAGCACGTTCCTGCATACGACGTGGTCTGCGGCTCTACCATATCTCCTATGAGCAACGTGTTAGCAGAGCCAAGCCACAACTCTAATTTTAATACATAATGAATGATAATAATAATATTAAAAATTTCCTGTGTAACTAATTTACTATATGGTTTCTGATAAGAATCATTGCAAAGATCAAACAACTTGTATTACATTGACAGTTAAGCAGTTAATTTTATCACCTCTAAAAATATATCAGCATCTAGCAATGCAACCTATCAAAATGGAGAGTTTTATGACTAAAAAAACCATGGGAAAGAAGACTTAAAGATTTATTCGCACTTGCTCAAAATGCTGCATTGATACATATTTTGACCCTGAATTATTTCGCTTGAATTTGAATCAATTCCTCCAAACCGCAAGAACAGTAACATTTTATTATTCAAAAAAACAAAAACCAGATTATAGATATGACATTTGGTATAACAATAATGTTATTGAAAAAATGGAAAAATGATCCATTAATGGCTTGGGCTAAAAATTTCTCGCAATACGAGTAGAAAAACAAGGCGATTTAGAAATGTATAGCGAGGCAAAGGCTACTCTTATTTCATCTTACATTGAAGAAAATGACATTGAGTTTATTACAAATGAAAGTATGTTAAACATTGGTATAAAAAAGTTAGTCAGACTTGCACAAAAGAAATTACCTTCATATTTAACTGAATCATCTCTTATTAATCAGAAAGACGATGGGTCGCTAATACGCTAAAAGATTACGAATTATTACATGCCTTACTATAATCTATGGCAGAATGTATAACTGCTGTAACTCTCTTGGCATACAAATACAATCCAATGGGTGACGATGTGATTTCGCCAACATCATTCGACTCTTTATTTGATGAAGCCAGGAGAATAACTTATTTTAAAATTAAAAGATTACTCCATAAGCAAATTGTCATTTAGCATGATACAATATGACAATAAAATAATT
CCTGAAGATATTAAAGAGCGTCTAAAACTGGTAGATAAGCCTAAAAATATCACTTCGACAGAAGAGTTAGTTGACTATACAGCCAAGCTTGCAGAAACGACTTTTTTAAAGGACGGTTATCACATTCAAACATTAATTTTTTATGATAAACAATTCCATCCAATTGATTTAATCATACACATTTGAAGATCAAGCAGATAAATATATTTTTTGGCGTTATGCAGCTGACAGAGCCAAAATAACAATGCCTATGGCTTCATTTGGATATCAGAGCTATGGCTCAGAAAAGCAAGCATCTACTCCAATAAACCAATACATACAATGCCAATTATAGATGAAAGACTTCAGGTAATTGGAAGTGATTCAATAATAATCAAACATGTATTTCCATGGAAAATAGTTAGAGAAAACGAAGAAAAAAAAACCGACTTTAGAAATATCAAACAGCAGACCTCAAAACATGGACGAAAAACCATATTTCATGCGTTCAGTCTTAAAAGCAATTGGCGGTGATGTAAACACTATGAACAATTGAGTCATAGAACTTCCATTATTCTCCTGAAGATAATAATCGCCAAATAAACCAATACTCAGCTTTACAATATACTAACTAACCGCAGAACGTTATTTCATACAACGTTTTCTGCGGCATATCACAAAACGATTACTCCATAACAGGGACAGCAGGCCACTCAATATCAGGTGCAGTGATGTATCACACGGTTCAGCAACACCCCGATACTTCTTCCAGGCTTCCAGCAACGAGGTTTCTTCCTTCGTTGCAATTTCCAGATCTGCAGCATCCTGAAGCGGCGCAATATGCTCACTGGCTACCTGCATCAGGCTTTTTTTTTGTTTCTTCCGCCTCCCGGATCCGGAACAGTTTTTCTGCTTCCGTATCCTTCACCCAGGCTGTGCCGTTCCACTTCTGATATTCCCCTCCCGGCGATAACCAGGTAAAATTTTCCGGTACGGACCGAGTTCAGAAATAAATAACGCGTCGCCGGAAGCCACGTCATAGACGGTTTTACCCCGATGGTCTTCAACGAGATGCCACGATGCCTCATCACTGTTGAAAAACAGCCACAAAGCCAGCGGAATATCTGGCGGTGCAATATCGGTACTGTTTGCAGGCAGACCGGTATGAGGCGGAATATATGCGTCACCTTCACCATAAATTCATTAGTTCCGGCCAGCAGATTATAAATTTTTATGGGTCCGTGGTTGTTCACTCATTCTGAATGCCATTATGCAAGCCTCACAAATAGTGTAAATGCAATGTTTTTGACGGTGTTTTCCGCGTTACCCGCAGCGTTAACGGTGATGGTGTGTCCGTGTGAACCAATACTGAAAGAATGGGCATGAGCACCGATAACAACGCGGATGCTGGTTGCGCACCCAATACCAACTGTATGCGCATGTGCACCGGCACTCACGGCTGTACCGGACAATGAGTGACTGTGGCTGCCCTGACTGTCCGTTTTCGATAAATAAGCAATACCTGTGTGGCTGGTTCCTTTAACTGTGGATAAACTTCCTGTAATGGTTGCTGTTCCATACTGACTCCAGCCAGAACTGTTCATCCTTAACCACTTGTGTGGGCATGGCACCCGCGGCCCCTGTTGAACCGCTCAGACTGTGAGCATGAGCCCCCGTGTTATTCGTCGATTTTGGTGCCGTAATCGAAACTGCCTGTTGTTTTCGTCCCGTAATCAAACGACGATGTGGTTTTCGTCCCCAAATCCGTACCGGATGCACTGGCACTGTGGGTGTGCGACTTAATTCCATCCTGTTCCTGAGACAATACAGCACGACCGCTGGCGGGTTTCCCTTGATTGTCGCAGCCTCGCATATCAGGAAGCACACCCGATGGATACGCGACAGCAAGTTTTGGGGTAGGCTGATTTGTCAAACGCCTGCCCCTGCATCAGGACGTAAGCCAGACGGAACGATATCTGATGGCCACGGATCGGCGCACCTGCCGGAAAGGCTCGAATTCT
CACCGGCCCCAAGGTATTCAAGAACATCTGCAACGGAATTTTTGCCCAGAATATCCCTGCCAACCTGAGTCAGTTCAGTCAGGCTGGCGGCATCATTTTCCGCAAAATACGGTAATTTATTTTTCGCCGTGGAAAGCCCTGCCAGCGCCGTCAGTGTCGCATTCTTCGGTTGTTTACCCGCAAGCGCGTTAGTCATGGTGGTAGCAAAATCTGGATCATTCCCGAGCGCTGCGGCCAGTTCATTCAGCGTATTCAGTGCGTCAGGTGACGCGTCGATAACATCTGCAATCGCGGCCAGTACAAAAGCGGTGTCGCAATCTGGGTATTGTTTGTTCCCCTGAGCGCGGTTGGTGCTGTTGGCGTTCCGGTCAGTGCCGGACTGTCCAGTGGGCTTTTCTGTTCGTTTCATCCATTACCACCTTAACCGCCTTTGCGTTGCAGCAAGCGTTTCAGACGTGCTGTTGGTTGCACTGCTGAGCTGCACTATCCCCTTTCTCGTTGTGTCCGCATCCTCAAGCGCGACAGCTGAAGCTATATCTTCTGCACGTTTTGCCGAATTTTTTTGCACGTATTGCCGCCGCTTCTGCCGCACTCTTGCTCTGCGATGCTGATACCGCACTTCCGCAGCCTCTGTCGCCTTCGTGATGCCGTTGACGCACTCCCCGCCGCCGCTGTTTTTGCGTCTGCCGCGGCAGAGGCGCTCCGTTCCGCTGCTGTTTCAGATGACCTGGCATTCGTCCTCGGACGTTTTTTGCCGCCCTGGCAGAATTTTCTGCCGCCGTTGCCGAGGGAAGCTGCACGACCGGCACTTGATGATGCGTTTCGTTTCTGATGATTTTGCTGCCTCTTTTGAGGCCACCGCATCTCGTGCTGAAGTGGCGGCCTCTGACGCTTTCGTGGCCGCGGTGGAGGCAGACGTTGGCGGCTGATTGTTGTGACGCTGCAGCATTCGTTTCTGACGTTTTTCGCCGCACCGGCACTGGTGGCCGCCGCGTTTTTTGAGGACTCTGCGGCTGCGGCACTTTTTTTCCGCTTCAGTGGCCTTTGCTGATGCCGCTTCTGCGCCGGAGGACGCTTCCTGAGCTGACGATGCAGCCTGTCCGGCGGACGTGCTGGCGGCGCGTGCTGAGTCAGTTGCATCAGTCACAAGGGCCGCGACCTGAGGCAGCTGATGCACTGGCATCGCCGGCTGATTTCTTCGCGTCTGCCGTACTCTGTGCCACCACGGACGCGTTACGCGCCACCTCTTCCACCATCAGTTCAGACGACGCAGCACCTCCGGCCGGGCATCATCCTCCGTCATGGCACAGAGAAAATCATTCAGCGTCCCCGGTTGTGAATCTTCATACAGGTGATGGTCCCGGCGTGCGATGGTGGAAAACCGTCAACCTGCAGGATGACACTGTACTGACCGTACTCCACATCCATGCTGTACGCCCGCTTCATCCGGATTCTCTGAGCCCACCCGTGTTCACCACCACCGTGGTGCTGTTACGTCTGGCTTTCAGCTGAATGTGCAGTTCTGTACCGGTTTTCCTGTGCCGTCTTTTCAGGACTCCTGAAATCTTTACTGCCATATTCACCACACAAAAAAAAGCCCACCGTTTCCGGCGGGCTGTCATAACACTGTGTTTACCTGGCTAATCAGAATTTATAACCGACCCCAACGATGAATCCGTCAGTACGCCAGTCGCCACTGCCGGAGCCTTCATAAGCAATATCAACACGACGGACGCTGGCGGATAATCTGTATACCTGCACTCCACGCCACTGAGGTATGCCGCATTGCACTTTCGTCCCTGGCAGTGGTCGTCTCTTTCATATACCCGGGAGTGATTTCCGTCTTACGGTAATCCATTGTACTGCGGACCACCGACTGTGAGCCACTCCGGCCATGGCGTACGCACTGACCTGCTTACTGATTTGTAAAACCGGTCCGGCCATCACGCTCACATAACGTCCACGCAGGCTCTCATAGTGAAACGTATCCTCCCGGTCAA
TCACTGTGCTGCTCTTTTTCGACGCGGCGAACCCCAGGGAAGCCATCACCCCCACACTGTCCCGTCAGCTCATAACGGTACTTCACGTTAATCCCTTTCAGATGACTCACACCGGTATCCCCGCCCGACAACGACGGCAATGTACCCGGTTTCCACTTGAAAATAGCCACCGTAAACGTACCATGTCCACCTTCCGCACGGGCCGGAGTGACTGTCACCGCAAGTGCGGCAAAGACAGCAACGGCAATACACACATTACGCATCGTTCACCTCTCACTGTTTTATAATAAAACGCCCGTTCCCGACGAACCTCTGTAACACACTCAGACCACGCTGATGCCCAGCGCCTGTTTCTTAATCACCATAACCTGCACATCGCTGGCAAACGTATACGGCGGAATATCTGCCGAATGCCGTGTGGACGTAAGCGTGAACGTCAGGATCACGTTTCCCCGACCCGCTGGCATGTCAACATACGGGAGAACACCTGTACCGCCTCGTTCGCCGCGCCATCATAAATCACCGCACCGTTCATCAGTACTTTCAGATAACACATCGAATACGTTGTCCTGCCGCTGACAGTACGCTTACTTTCCGCGAAACGTCAGCGGAAGCACCACTATCTGGCGATCAAAAAGGATGGTCATCGGTCACGGTGACAGTACGGGTACCTGACGGCCAGTCCACACTGCTTCACGCTGGCGCGGAAAAGCCGCGCTCGCCGCCTTTACAATGTCCCCGACGATTTTTTCCGCCCTCAGCGTACCGTTTATCGTAGCAGTTTTCAGCTATCGTCACATTACTGAGCGTCCGGAGTTCGCATTCACACTGCCACTGATATCCGCATTTTTAGCGGTCAGCTTTCCGTCCGGTGTCATGGAAAAGGCCGGAGGGATTGCCGCCGCTGGTAATGGTGGGGGCCGTCAGGCGCTTCAGGAACACGTCGTTCATGAATATCTGGTTGCCCTGCGCCACAAACATCGGCGTTTCATTCCCGTTTGCCGGGTCAATAAATGCGATACGATTGGCGGCAACCAGAAACTGGCTCAGTTTTGCCTTCCTCCGTGTCCTCCATGCTGAGGCCAATACCCGCGACATAATGTTTGCCGGTCTTTGGTCTGCTCAATTTTGACAGCCCACATGGCATTCCACTTATCACTGGCATCCTTCCACTCTTTCGAAAACTCCTCCAGTCTGCTGGCGTTATCCTCCGTCAGCTCGACTTTTTCCAGCAGCTCCTTGCAGAGATGGGATTCGGTTATCTTGCCTTTGAAAAAAATCCAGGTAACAATACTATCTCTCTATG

debug.yaml:

Rfam:
  details:
    compressed: false
    default_type: ncRNA
    location: None
  location: Default
  method: infernal
  priority: 3
  version: release 14.5

@nh13
Contributor Author

nh13 commented Jun 13, 2023

This fails because infernal['qend'] < infernal['qstart'] for this hit, so extracting the qseq yields an empty sequence. If that is the only result returned across all databases, the seq column of the data frame ends up with a float type.
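
A minimal sketch of the slicing problem, using plain Python strings rather than the actual pLannotate code. The helper name `extract_qseq`, the toy sequence, and the 0-based half-open coordinates are all assumptions made for illustration; Infernal's own coordinate convention is not modelled here.

```python
def extract_qseq(seq: str, qstart: int, qend: int) -> str:
    """Return the span of `seq` between two coordinates, in either order.

    Hypothetical helper: sorts the coordinates so a reversed hit
    (qend < qstart) still yields a non-empty slice.
    """
    lo, hi = sorted((qstart, qend))
    return seq[lo:hi]


seq = "ACGTACGTAC"  # toy sequence, not from the test case

# Slicing with start > end silently returns an empty string.
naive = seq[7:2]

# Ordering the coordinates first recovers the covered span.
fixed = extract_qseq(seq, 7, 2)
```

Here `naive` is empty while `fixed` holds the five bases between positions 2 and 7, which is the kind of difference that swapping qstart and qend is meant to produce.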

@nh13
Contributor Author

nh13 commented Jun 14, 2023

I added one more commit because the original commits failed on this read:

>SRR11171709.130.1 130 length=6466
CATAGAGAGATAGTATTGTCACCAATGCTGAGATAGCTGAGAGATGGCATATTGCTACGCAAGAATGAAAAGTGATATACTGGAATGTTTTAAAAAGGCAGGTGGGCAAAGTTAAGGATTAATTATCAGGAGTAATTATGCGGAACAGATCATGCCTGGTGTTTACATAGTAATAATTCCTTACGTTATCGTAAGCATTTGCTATCTCCTTTTCCGCCACTACATTCCCTGGTGTTTCTTTTTCAGCTCATAGAGATGGTCTTGGGGCGACATTGTCATCATATGCAGGAACCATGATTGCAATCCTGATTGCTGCCTTGACGTTTCTAATCGGAAGCAGAACGCGCCGACTGGCCAAGATTAGAGAGTATGGGTATATGACATCGGTAGTTATTGTCTATGCCCTTAGTTTTGTTGAGCTTGGAGCTTTGTTTTTCTGCGGGTTATTGCTTCTTTCCAGCATAAGCGGCTACATGATACCCACTATCGCCATCGGCATTGCCTCTGCATCGTTCATTCATATATGCATCCTTGTTTTCCAACTATATAATTTGACCAGAGAACAAGAATAACCCGGCCTCAGCGCCGGGTTTTCTTTGCCTCAACGATCGCCCCCAAAAACACATAACCAATTGTATTTATTGAAAAAATAAATAGATACAACTCACTAAACATAGCAATTCAGATCTCTCACCTACCAAACAATGCCCCCCCTGCAAAAAATAAATTCATATAAAAAACATACAGATAACCATCTGCGGTGATAATTATCTCTGGCGGTGTTGACATAAATACCACTGGCGGTGATACTGAGCACATCAGCAGGACGCACTGACCACCATGAAGGTGACGCTCTTAAAAATTAAGCCCTGAAGAAGGGCAGCATTCAAAGCAGAAGGCTTTGGGGTGTGTGATACGAAACGAAAGCATTGGCCGTAAGTGCGATTCCGGATTAGCTGCCAATGTGCCAATCGCGGGGGGTTTTCGTTCAGGACTACAACTGCCACACACCACCAAAGCTAACTGACAGGAGAATCCAGATGGATGCACCTAAACACGCCGCCGCGAACGTCGCGCAGAGAAACAGTCTCAATGGAAAGCAGCAAATCCCCTGTTGGTTGGGGTAAGCGCAAAACCAGTTAACCGCCCTATTCTCTCGCTGAAATCGCAAACCGAAATCACGAGTAGAAAGCGCACTAAATCCGATAGACCTTACAGTGCTGGCTGAAATACCACAAACGAATTGAAAGCAACCTGCAACGTATTGAGCGCAAGAATCAGCGCACATGGTACAGCAAGCCTGGCGAACGCGGCATAACATGCAGTGGACGCCAGAAAATTAAGGGAAAATCGATTCCTCTTATCTAGTTACTTAGATATTGGCCTTGGCTTTATCTCAATATTATATGGATCATAGCTGGCAACTAATTCAGTCCAGTAAATATCCTCAATAGGGAATAATATATGCTTTCCATTCCATCGGGAAAAAGTTTGTTCAACACACCAAGCTCAATCAACTCACTAATGTATGGGAATTTGTTTTGATGTAACCACATACTTCCTGCCTTCATTAAGGGCTGCGCACAAAACCATAAGATTGCTCTTCTGTAAGGTTTTGAATTACTGATGCGCACTTTATCGTTTTGCATCTTAATGCGTTTCTTAGCTTAAATCGCTTATATCTGGCGCTGGCAATAGCTGATAATCGATGCACATTAATTGCTAGCGAAAATGCAAGAGCAAAGACGAAAACATGCCACACATGAGGAATACCGATTCTCTCATTAACATATTCAGGCCAGTTATCTGGGCTTAAAAGCAGAAGTCCAACCCAGATAACGATCATATACATGGTTCTCTCCAGAGGTTCTTACTGAACACTCGTCCGAGAATAACGAGTGGAGTCCATTTCTATACTCATCAAACTGTAGGGGTTGTAATAGTTTATCCGATTTCTCGCTGTAGGGTACACGAGAACCACCGAGCCTGATGTGGTTAAA
AAGACAAGGCAACAATCTTTACTACCGCAATCCACTATTTAAGGTGATATATGGGAAGAAGGAATTTGAAAGAGTTCGAAGAGCATCCTCAGGATGTGATGGAACAATACCAGGACTATCCGTATGACTACGACTATTGATAAAAATCAATGGTGTGGACAATTCAAGCGATGCAATGGATGCAAGCTTGCAATCGAATGCATGGTTAGCCTGAGAAATGTTTCCTGTAAATGGAAGATGGGAAATATGTCGATAAAGGGGCAATACTAACGACGGCAAATGATTGCCAGAGAACTTGGTAAACAGAACAACAAAGCTGCCTGATAGTGGCCTTTATTTTTGGCATAAATAACAGAATAAACACTGCACTGTGTATTCATTCCAACGAGTGAATACACGGGAGCAATGTCGCTCGTAACTAAACAGGAGCCGACTTGTTCTGATTATTGGAAAATCTTCTTTGCCCTCCAGTGTGAGGGCGATTTTTATCTGTGAGGATATGAACAGATGTCAAACATCAAAAAAATACATCATTGATTACGACTGGAAAGCATCAATAGAATTGAAATCGACCATGACGTAATGACAGAGGAAAAACTTCACCAGATTAATAATTTCTGGTCAGACTCTGAATACCGACTCAATAAACACGGCTCTGTATTAAAATGCTGTATTAATCATGCTGGCGCAACATGCTCTGCTTATAGCAATTTCAAGCGACTTAAATGCATATGGTGTTGTGTGTGATGTTCGACTGGAATGATGGAAATGGTCAGGAAGGATGGCCCTCCAATGGATGGTACGAAGGATAGAGAATTACGCGATATCGATACATCAGGAATATTTGATTCAGATGATGATGACTATCAAGGCCGCCTGAGTGCGGTTTTACCGCATACCAATAACGCTTCACTCGAGGCGTTTTTCGTTATGTATAAATAAGGAGCACACCATGCAATATGCCATTGCAGGGTGGCCTGTGTGCTGGCTGCCCTTCCGAATTCTTTACTTAACGAATCACCCGTAAATTACGTGACGGATGGAAACGCCTTATCGACATACTATCAGCAGGAGTACCCAAAGAATGGATCAAACACTTATGGCTATCCAGACTAAATTCACTATCGCCACTTTTATTGGCGATGAAAAAGATGTTTCGTGAAGCCGTCGACGCTTATAAAAAATGGATATTAATACTGAAACTGAGATCAAGCAAAGCATTCACTACCCCCTTTCCTGTTTTCCTAATCAGCCCGGCATTTCGCGCGGCGATATTTTCACAGCTATTTCGGAGTTCAGCCATGAACGCTTATTACAGTCAGGAATCGTGCTTGAGGCTCAGAAGCTGGGCGCGTCACTACCAGCAGCTCGCCCGTGAAGAGAAAGAGGCAAGAACTGGCAGACGACATGGAAAAAGGCCTGCCCCAGCACCCTGTTTGAATCGGCTATGCATCGATCATTTGCAAACGCCACGGGCCATCAAAAAATCAATTACCCGTGCGTTTGATGACGATGTTGAGTTTCAGGAGCGCATGGCAGAACACATCCGGTACATGGTTAGAAACCATTGCTCACCACCAGGTTGATATTGATTCAGTAGGTATAAAAACGAATGAGTACTGCACTCGCAACGCTGGCTGGGAAGCTGGCTGAACGTGTCGGCATGGATTCTGTCGACCCACAGGAAACTGATCACCACTCTTCGCCAGACGGCATTTAAAGGTGATGCCAGCGATGCGCAGTTCATCGCATTACTGATCCGTTGCCAACCAGTACGGCCGTATCCGTGGACGAAAAGTAATTTACGCCTTTCCTGATAAGCGAATGGCATCGTTCCGGTGGGTGGGCGTTTGATGGCTGGTCCCCGCATCATCAATGAAAACCAGCAGTTTGATGGCATGGACTTTGAGCAGGACAATGAATCCTGTACATGCCGGATTTACCGCAAGGACCGTATCATCCGATCTGCGTTGACCGAATGGATGGATGAATGCCGCCGC
GAACCATTCAAAACTCGCGAAGGCCAGAGAAATCACGGGGCCGTGGCAGTCCGCATCCCAAACGGATGTTTACGTCATAAAGCCATGATTCAGTGTGCCCGTCTGGCCTTCGAGTTGCTGGTATCTATGACAAGGATGAAGCCGAGCGCATTGTCGAAAATACTGCATACACTGCAGAAACGTCAGCCGGAACGCGACATCACTCCGGTTAACGATGAAACCATGCAGGAGATTAACACTCTGCTGATCGCCCTGGATAAAACATGGGATGACGACTTATTGCCGCTCTGTTTCCCAGATATTTCGCCGCGACATTCGTGCATCGTCAGAACTGACACAGGCCGAAGCAGTAAAAAGCTCTTTGGATTCCTGAAACGAAAGCCGCAGAGCAGAAGGTGCAGCATGACACCGACATTTCCTGCACGTACCGGGATCGATGTGAGAGCTGTCGAACAGGGGGATGATGCGTGGCACAAATTACGGCTCGGCGTCATCACCGCTTCAGAAGTTCACAACGTTATAGCAAAAACCCCGCTCCGGAAAGAAGTGGCCTGACATGAAAAATGTCCTACTTCCACACCCTGCTTGCTGAGGTTTGCACCGGTGTGGCTCCGGAGTTAACGCTAAAGCACTGGCCTGGGGAAAACAGTACGAGAGACGACGCCAGAACCCTTTTTGAATTCAACTTCGGCGTTGAATGTTACTGAATCCCCGATCATCTATCGCGACGAAAGTATGCGTACCGCCTGCTCTCCCGATGGTTTAATGCAGTGACGGCAACGGCCTTGAACTGAAATGCCCGTTTACCTCCCGGGATTTCATGAAGTTCCGGCTCGGTGGTTTCGAGGCCATAAAGTCAGCTTACATGGCCCAGGTGCAGTACAGCATGTTGGGTGACGCGAAAAAATGCCTGGTACTTTGCCAACTATGACCCGCGTATGAAGCGTGAAGGCCTGCATTATGTCGTGATTGAGCGGGATGAAAAGTACATGGCGAGTTTTGGACGAGAATCGTGCCGGAGTTCATCGAAAAAAATGGACGAGGCACTGGCTGAAATTGGTTTTGTATTTGGGGAGCAATGGCGATGACGCATCCTCACGATAATATCCGGGTAGGCCGCAATCACTTTCGTCTACTCCGTTACAAAAGCGAGGCTGGTATTTCCCGGCCTTTCTGTTATCCGAAAATCCACTGAAAGCACAGCGGCTGGCTGAGGATAAATAATAAACGAGGGGCTGTATGCACAAAGCATCTTCTGTGAGTTAAGAACGAGTATCGAGAATGGCCATAGCCTTGCTCATATTGGAATCAGGTTGTGCCAATACCAGTAGAAACAGACGAAGAAATTTCATACGTTAGCCGCATCCCTTTCACAAAAAGCTGGAAAATGATGGTGGCGAAAGCAGAAGCAGATGAGAGAAACCAGGTATGACAACCACGGAATGCATTTTCTGGCAGCGGGCCTTTCATATTCTGTGTGCTTATGCTTGCCGACATGGGACTTGTTCAATGACAACCTCAGCAGGAAAACGCCTTCGCAGCATTGCCCGTCAGGCTAATTCTGAAATCAAAAAAGCAGACAGCAGTTTCCGGATAAAAACGTCGATTGACATTTGCCGTAGCGTACTGAAAGAAGCACCGCGAACGGTAACGCTGATGGGATTCACACCGACTCATTTAAGCCTGGCAATCGGCATGTTAAACTGCGTCTTTAAGGAAACGATGAACATGAAAGCAAAAATCATACAGGGAGCTACAGGCTCCTTTTTTATTTTCGCATTCACCCTCAAGCGTATTAACCAACAGTTCAGGGCTTAATGAAAGATGGCAGACATCATTGATTCAGCATCAGAAAATAGAAGAATTACAGCGCAACACAGCAATAAAAAATGCGCCGCCTGAACCACCAGGCTATAATCTGCCACTCATTGTTGTGAGTGTGGCGATCCGATAGATGAACGAAGAACGCCTGTCGTTCAGGGTTGTCGGACTTGTGCAAG
TTGCCAGGGGAGGATCTGGAACTTATCAGTAAACAGAGAGGTTCGAAGTGTAGCGAAATTAACTCTCAGGCACTGCGTGAAGCGGCAGAGCAGGCAATGCATGACGACTGGGGATTTGACGCAGGACCTTTTCCATGATTGGTAACAACATCGATTGTGCTGGAACTGCTGGATGACGGGAAAGAACCAGCAATACAGATCAAACGCCGCGACCAGGAGAACGAGGATATTGCGCTAAACAGTAGGGAAACTGCGTGTTGAGCTTGAAACAGCAAAAAAATCAAAACTCAACGAGCAGCGGTGAGTATTACGAAGGTGTTATCTCGGATGGGAGTAAGCGTATTGCTAAACTGAAAAGGCAACGAAGTCCGTGAAGACGGAAACCAGTTTCTTGTTGTTCGCCATCCCTGGGGAAAAGACTCCTGTTATCAAGCACATGCACTGGTACAATACTATCTCTCTATGG

Weirdly, after swapping the columns, pandas changes the types of the qstart and qend columns from integer to float!
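
One way to guard against that kind of dtype drift is to normalise the coordinates and then cast back to integers explicitly. The sketch below is not the code from this PR. The table contents are made up, and only the column names qstart, qend, and sframe come from this discussion.

```python
import pandas as pd

# Hypothetical hits table; the second row has reversed coordinates (qend < qstart).
hits = pd.DataFrame(
    {"qstart": [120, 900], "qend": [480, 610], "sframe": [1, -1]}
)

# Row-wise min/max puts the smaller coordinate in qstart and the larger in qend.
coords = hits[["qstart", "qend"]]
lo = coords.min(axis=1)
hi = coords.max(axis=1)

# Cast explicitly so both columns are int64 whatever the intermediate steps produced.
hits["qstart"] = lo.astype("int64")
hits["qend"] = hi.astype("int64")
```

The sframe column is left untouched, so the strand of each hit is still recorded after the coordinates have been reordered.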

@jeffreybarrick
Contributor

This now works in my testing, including on the case that caused a crash before, so I'm merging this.

I am leaving a note here: I don't understand the further processing of qstart and qend well enough to be sure that swapping them here causes no downstream problem in how things are displayed and output. My testing suggests that nothing changes, so I think the further processing handles the coordinates safely whether or not they are reversed, because it uses sframe to determine the strand.

@jeffreybarrick jeffreybarrick merged commit 7606a8a into mmcguffi:master Jun 14, 2023