In PE (paired-end) mode, the normalize-by-median script keeps both reads of a pair if only one of them is below the coverage cutoff. However, the script then consumes the k-mers from only one end. This is why, if we run abundance-dist.py on the output k-mer counting table, the first line shows some number of k-mers with a frequency of 0.
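To make the expected behavior concrete, here is a minimal sketch in plain Python. It does not use khmer's API: the `Counter` stands in for the counting table, and `kmers`, `median_count`, and `normalize_pairs` are hypothetical names made up for this illustration. It assumes every read is at least `k` bases long. The point is only that when a pair is kept, the k-mers of both ends are consumed, so no k-mer from a kept read is left with a count of 0.

```python
from collections import Counter
from statistics import median

K = 4  # toy k-mer size, chosen only for this illustration


def kmers(seq, k=K):
    """Return all overlapping k-mers of seq (assumes len(seq) >= k)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def median_count(counts, seq, k=K):
    """Median count of the k-mers of seq in the toy counting table."""
    return median(counts[km] for km in kmers(seq, k))


def normalize_pairs(pairs, cutoff, k=K):
    """Keep a pair if either end is below the cutoff; consume BOTH ends."""
    counts = Counter()
    kept = []
    for read1, read2 in pairs:
        below = any(median_count(counts, r, k) < cutoff for r in (read1, read2))
        if below:
            # Consume the k-mers of both ends, not just the low-coverage one.
            for r in (read1, read2):
                counts.update(kmers(r, k))
            kept.append((read1, read2))
    return kept, counts
```

With this logic, every k-mer from a kept read has a count of at least 1, which is the property the zero-frequency line in the abundance distribution shows to be violated.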
Another important point: the current implementation replaces 'N's with 'A's and then uses this modified sequence to consume k-mers. I think load-into-counting.py (as well as filter-abund-single.py and abundance-dist-single.py) ignores reads with 'N's.
While fixing this, we need to take care of the following: normalize-by-median uses the function consume(seq), which does not check for DNA validity and thus will count k-mers containing 'N's. I think this function needs to be fixed so that it calls the check_and_process_read function.
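As a rough illustration of the kind of check being proposed, the sketch below validates a read before counting its k-mers. It is plain Python, not khmer code: `is_valid_dna` and `consume_checked` are hypothetical names, the `Counter` stands in for the counting table, and skipping invalid reads entirely is an assumption based on how the other scripts are described above, not a statement of what check_and_process_read actually does.

```python
from collections import Counter

VALID_BASES = frozenset("ACGT")


def is_valid_dna(seq):
    """True if seq is non-empty and contains only A, C, G, T (case-insensitive)."""
    return bool(seq) and set(seq.upper()) <= VALID_BASES


def consume_checked(counts, seq, k):
    """Count the k-mers of seq only if the read passes the validity check.

    Returns the number of k-mers consumed (0 for a rejected or too-short read).
    """
    if not is_valid_dna(seq):
        return 0  # assumed policy: skip reads containing 'N' or other symbols
    seq = seq.upper()
    consumed = 0
    for i in range(len(seq) - k + 1):
        counts[seq[i:i + k]] += 1
        consumed += 1
    return consumed
```

Skipping the read, rather than substituting 'A' for 'N', avoids inflating the counts of k-mers that never occurred in the data, and keeps normalize-by-median consistent with scripts that ignore such reads.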