Automation, Refactoring, and Mismap Fixes #20
Merged
…column diff in renaming contigs. Replaced awk script with py script that chooses consensus ref chr to avoid multiple chr mappings to one contig.
…chm13 RM annotations.
…ules. Removed unnecessary conversion of unaligned bam to fq.
…lculation. Bump memory for nucfreq and fix hap pattern to account for hifiasm contig naming convention.
…e. Added rules to rename assemblies. Added intersecting of alignment bed with monomeric pqarm to reduce the number of aligned contigs. Moved/renamed some config vars for clarity.
… Add back second alignment to cens only. Handle hifiasm format differences. Set minimap2 version to v2.26 fixing issues with piping to samtools view.
…dirs for dna_brnn to avoid cluttering.
…flow to avoid clutter and work with longer contig IDs. Fixed count_complete_cens rule to account for partial contigs.
…oring humas-hmmer to work with previous changes. Removed unused scripts. Allowed setting input start and end columns for filter_cen_ctgs.py. Added map_cens.py.
…v row chm1 output to include contig coords. Added cen-stats and removed the check cen status script. Allowed plot script rules to skip empty files. Removed sed splits, which can fail due to contig ID name differences. Refactored/categorized outputs into subdirs. Changed map_cens to use percent identity by events instead of all. Decreased legend size of repeatmasker plot. Adjusted stv_row output to include contig start coord. Updated StainedGlass submodule to correctly include output dir. Removed inaccurate as-hor length output.
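The commit above that replaced the awk script with a Python script choosing a consensus reference chromosome can be illustrated roughly as follows. This is a minimal sketch, not the script from this PR: the function name `consensus_chr`, the `(contig, ref_chr, aligned_len)` input layout, and the "most aligned bases wins" rule are all assumptions made for illustration.

```python
from collections import defaultdict


def consensus_chr(alignments):
    """Pick one reference chromosome per contig.

    `alignments` is an iterable of (contig, ref_chr, aligned_len) tuples
    (a hypothetical layout, not the pipeline's actual input format).
    The chromosome with the most aligned bases wins; ties are broken
    alphabetically so the result is deterministic.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for contig, ref_chr, aligned_len in alignments:
        totals[contig][ref_chr] += aligned_len

    return {
        contig: max(sorted(by_chr), key=by_chr.get)
        for contig, by_chr in totals.items()
    }


# Example: ctg1 aligns mostly to chr1, ctg2 mostly to chr13.
rows = [
    ("ctg1", "chr1", 5000),
    ("ctg1", "chr2", 1200),
    ("ctg1", "chr1", 800),
    ("ctg2", "chr21", 300),
    ("ctg2", "chr13", 900),
]
mapping = consensus_chr(rows)
```

Because each contig receives exactly one chromosome label, a later rename step cannot assign two chromosome names to the same contig.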
Labels: bug, documentation, enhancement
count_complete_cens rule to count complete centromeres after accounting for partial centromeres.
arr_len_thr params in calculate_HOR_length.py to prevent including small HOR arrays.
calculate_HOR_lengths.py #14 calculate_HOR_length.py.
calculate_HOR_length.py with incorrect length difference calculation.
NucFreq misassembly detection with NucFreq fork.
minimap2 v2.26 as v2.27 omits MD tag in SAM header causing failure in piping to samtools view.
awk commands to avoid using sed-split inputs which can fail with different ID formats.
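The last item concerns splitting contig IDs on a fixed delimiter, which breaks when the ID itself contains that delimiter. The sketch below shows the general idea in Python rather than awk. It assumes a hypothetical `contig:start-end` naming scheme, and `split_region` is an illustrative helper, not part of this PR.

```python
def split_region(name):
    """Split 'contig:start-end' into (contig, start, end).

    Assumes coordinates follow the LAST colon (a hypothetical ID format),
    so hyphens and underscores inside the contig name are left untouched.
    """
    contig, _, coords = name.rpartition(":")
    start, _, end = coords.partition("-")
    if not contig or not start.isdigit() or not end.isdigit():
        raise ValueError(f"unrecognized region name: {name!r}")
    return contig, int(start), int(end)


# A contig name containing '-' and '_' still parses, because only the
# trailing coordinate field is split.
region = split_region("HG00171_chr1_haplotype1-0000001:100-2000")
```

Splitting from the right keeps the coordinate field separate from the name, so the parse does not depend on how a given assembler formats its contig IDs.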