{ts} barcode extract either read #311

Merged: 6 commits, Feb 1, 2019

Conversation

TomSmithCGAT
Member

Note, the commits on Jan 24 are from a merge with branch {TS}-EnablePerCellCountTab (PR #308). Probably best to merge that PR before reviewing the changes herein.

See #175 for motivation

  • Implements barcode extraction from read1 or read2 with --either-read.
  • Implements an option to either discard reads where both match or keep the UMI with the highest sequence quality
  • Adds test for --either-read
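To make the two resolution modes concrete, here is a minimal sketch of the logic described above. It is not the code in umi_tools/extract.py: the function names, the `(sequence, quality)` tuple layout, the `umi` group name, the Phred+33 offset and the use of mean quality over the UMI bases are all assumptions for illustration. It also assumes the first pattern is matched against read1 and the second against read2.

```python
import re


def mean_quality(qual, start, end, offset=33):
    """Mean Phred score over qual[start:end] (Phred+33 encoding assumed)."""
    scores = [ord(char) - offset for char in qual[start:end]]
    return sum(scores) / len(scores) if scores else 0.0


def extract_either_read(read1, read2, pattern1, pattern2, resolve="discard"):
    """Hypothetical helper: return (read_number, umi) or None.

    read1 and read2 are (sequence, quality) tuples. Each compiled pattern
    must define a named group called "umi".
    """
    candidates = []
    for number, (seq, qual), pattern in ((1, read1, pattern1),
                                         (2, read2, pattern2)):
        match = pattern.match(seq)
        if match:
            start, end = match.span("umi")
            candidates.append(
                (number, match.group("umi"), mean_quality(qual, start, end)))

    if not candidates:
        return None  # neither read carries the barcode
    if len(candidates) == 1:
        number, umi, _ = candidates[0]
        return number, umi
    if resolve == "discard":
        return None  # both reads match: drop the pair
    if resolve == "quality":
        # Keep the UMI with the higher mean quality; ties go to read 1.
        number, umi, _ = max(candidates, key=lambda item: item[2])
        return number, umi
    raise ValueError("unknown resolve mode: %s" % resolve)
```

With `resolve="discard"` an ambiguous pair is dropped entirely, while `resolve="quality"` always keeps one UMI, so the two modes can give different read counts on the same input.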

I don't have a good input sample for the test, so the current input is hacked together from the indrop data (see tests/README). Basically, I swapped read1 and read2 at random and used the option to identify which read matches the regex. Of course, the output from this is now garbage. I'll post on #175 and ask for more suitable toy data. I also need to add a test to cover --either-read-resolve=quality.
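For anyone wanting to build similar toy input, the random swap could look roughly like the sketch below. This is not the script used to generate the test data; `swap_pairs_at_random`, its parameters and the in-memory list of pairs are hypothetical, and a fixed seed is assumed so the result is reproducible.

```python
import random


def swap_pairs_at_random(pairs, seed=0, swap_probability=0.5):
    """Hypothetical helper: swap read1 and read2 for a random subset of pairs.

    pairs is a list of (read1, read2) records in any representation.
    """
    rng = random.Random(seed)  # fixed seed keeps the toy data reproducible
    swapped = []
    for read1, read2 in pairs:
        if rng.random() < swap_probability:
            swapped.append((read2, read1))
        else:
            swapped.append((read1, read2))
    return swapped
```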

Member

@IanSudbery IanSudbery left a comment


I also wonder how hard it would be to allow the user to provide a single regex for both reads.

umi_tools/extract.py (resolved)
@TomSmithCGAT
Member Author

@IanSudbery - OK for merge?

@IanSudbery
Member

How about allowing the user to just provide one regex if it can be on either read?

@TomSmithCGAT
Member Author

TomSmithCGAT commented Jan 28, 2019

Oops, sorry, missed this!

We could have a single regex option by simply reusing --bc-pattern if --bc-pattern2 is not provided (I assume this is what you are thinking).

Personally, I'd rather stick with the current implementation and force the user to be explicit so there's no unexpected behaviour when the regexes are different and they have forgotten to specify --bc-pattern2.
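The trade-off between the two behaviours can be sketched as follows. This is a hypothetical illustration, not the option handling in umi_tools: `resolve_patterns` and `allow_fallback` are made-up names, and raising an error when --bc-pattern2 is missing is an assumption about what "explicit" would mean here.

```python
def resolve_patterns(bc_pattern, bc_pattern2=None, either_read=False,
                     allow_fallback=False):
    """Hypothetical helper: pick the regexes used for read1 and read2."""
    if not either_read:
        return bc_pattern, bc_pattern2
    if bc_pattern2 is None:
        if allow_fallback:
            # Single-regex behaviour: silently reuse --bc-pattern for read2.
            return bc_pattern, bc_pattern
        # Explicit behaviour: a forgotten --bc-pattern2 fails loudly.
        raise ValueError(
            "--either-read requires both --bc-pattern and --bc-pattern2")
    return bc_pattern, bc_pattern2
```

With the fallback enabled, a user whose two regexes differ but who forgot --bc-pattern2 would get a run that completes with the wrong pattern on read2, which is the unexpected behaviour the explicit version avoids.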

@IanSudbery
Member

go ahead and merge this.

@TomSmithCGAT TomSmithCGAT merged commit 07713b8 into master Feb 1, 2019
@TomSmithCGAT TomSmithCGAT deleted the {TS}-BarcodeExtractEitherRead branch February 1, 2019 10:29