bc-pattern2: TypeError: object of type 'NoneType' has no len() #522

Closed
m3hdad opened this issue Mar 28, 2022 · 6 comments

Comments

@m3hdad

m3hdad commented Mar 28, 2022

Hi there,

Shouldn't the check if(len(read1.seq) < len(self.pattern)) in the traceback below be something like if(len(read1.seq) < len(self.pattern2)), to avoid this error?

The error goes away if I also pass --bc-pattern=X.

UMI on read2:

read1:

@A1000 1:N
CGCTGGTGGCTGGCCGCTTTGGCCTGGCACCCACCTCCACCCCCCACACCAACCCCGGCCAGAAGCTGCTGCCAACTGACAAGTCTGCTGGCCTGTACAGCGGCGACCCTGCTGGCTTCAACGCCGTCGATGTGCTGGCACTTGGCGCCC
+
FFFFFFFFFFFFFFFFF:FFFFFFFFFF:FFF:FF,FF::FF,FFFF:FF,FFFFF,FFF,FFF,F:FFFFFFFFFFFFFF,,FF:FFFFFFFFFFFF:FFFFFFFFFF:FFFFFFFFF:FF:FF,FFFFFF,FFFFFF:F:FFFFFFFF
@A1047 1:N
GGCTGCACTGCAACAAAAGGGCTGGGCATACGAAGAAGATGTGGGTGGCGGAGCATTTTATGGTCCCAAGATTGACATCAAGATTTGCGATGCCATAGGCAGGAAATGGCAGTGCTTAACAGTGCAGCTGGATTTCAACCTGCCAGAACG
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFF:FFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFF,FFF:FFFFF:FF::F::FFF::FFFFFFFFF::F

read2:

@A1000 2:N
AACAACAAGCGGCAGGTGGCAGGGTTGACTGAGGATGTTCTTCTCGGGCAGTACATGGGTCAAGAGCGACCGACAGGGGCTGACGGACGCTAAGATCAGGCCAGGTTGCCGGTGGCCTTGAGGCCCAGCACCAGGCCGACGCCGATGATGTGTCCAAG
+
FFFFFFFFFFFFFFFF:FF:FFFFFFFFFFFFFFFFF:F:FFFFFFFF:FFFFFFFFF:FFFFFFF:FFFFFFFFFFF:FFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFF:FFFFFFFFFFFF,FFFFFFFFFFFF:
@A1047 2:N
CCCGCATCACGGGGGCCAGCCATAGGGGGAAGGCACCTGCGTAGTTTTCTATCAAAATGCCCATAAACCTCTCCAAAGAACCAAGAATTGCACGATGGATCATGATGGGTCGCTCTCTGACATTGGCATCGCTGATGTAAAACATGTCGAAACGTTCT
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFF:FFFFFFFF:FF:::FFFFFFFFFFF:FFFFFFFFFFFFFFFFF:F:FF:F:FF

I have the following TypeError:

# UMI-tools version: 1.1.2
# output generated by extract -I read1.fastq.gz --read2-in=read2.fastq.gz --bc-pattern2=NNNNNNNN -S r1UMI.fastq.gz --read2-out=r2UMI.fastq.gz

# blacklist                               : None
# compresslevel                           : 6
# correct_umi_threshold                   : 0
# either_read                             : False
# either_read_resolve                     : discard
# error_correct_cell                      : False
# extract_method                          : string
# filter_cell_barcode                     : None
# filter_cell_barcodes                    : False
# filter_umi                              : None
# filtered_out                            : None
# filtered_out2                           : None
# ignore_suffix                           : False
# log2stderr                              : False
# loglevel                                : 1
# pattern                                 : None
# pattern2                                : NNNNNNNN
# prime3                                  : None
# quality_encoding                        : None
# quality_filter_mask                     : None
# quality_filter_threshold                : None
# random_seed                             : None
# read2_in                                : read2.fastq.gz
# read2_out                               : r2UMI.fastq.gz
# read2_stdout                            : False
# reads_subset                            : None
# reconcile                               : False
# retain_umi                              : None
# short_help                              : None
# stderr                                  : <_io.TextIOWrapper name='<stderr>' mode='w' encoding='utf-8'>
# stdin                                   : <_io.TextIOWrapper name='read1.fastq.gz' encoding='ascii'>
# stdlog                                  : <_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>
# stdout                                  : <_io.TextIOWrapper name='r1UMI.fastq.gz' encoding='ascii'>
# timeit_file                             : None
# timeit_header                           : None
# timeit_name                             : all
# tmpdir                                  : None
# umi_correct_log                         : None
# umi_whitelist                           : None
# umi_whitelist_paired                    : None
# whitelist                               : None
2022-03-28 11:57:57,858 INFO Starting barcode extraction
Traceback (most recent call last):
  File "/home/xxx/progs/miniconda3/envs/xxx/bin/umi_tools", line 11, in <module>
    sys.exit(main())
  File "/home/xxx/progs/miniconda3/envs/xxx/lib/python3.8/site-packages/umi_tools/umi_tools.py", line 61, in main
    module.main(sys.argv)
  File "/home/xxx/progs/miniconda3/envs/xxx/lib/python3.8/site-packages/umi_tools/extract.py", line 477, in main
    reads = ReadExtractor(read1, read2)
  File "/home/xxx/progs/miniconda3/envs/xxx/lib/python3.8/site-packages/umi_tools/extract_methods.py", line 543, in __call__
    if(len(read1.seq) < len(self.pattern)):
TypeError: object of type 'NoneType' has no len()
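
For readers following the traceback, here is a minimal sketch of the failure and of one possible guard. The Read class and the too_short helper are hypothetical stand-ins, not the actual umi_tools code; the only thing taken from the traceback is that len() is called on a pattern that is None when only --bc-pattern2 is supplied.

```python
class Read:
    """Hypothetical stand-in for a FASTQ record; only .seq is modelled."""

    def __init__(self, seq):
        self.seq = seq


def too_short(read, pattern):
    """Return True if the read is shorter than the barcode pattern.

    A pattern of None (option not supplied) means nothing is extracted
    from this read, so the read is never treated as too short.
    """
    if pattern is None:
        return False
    return len(read.seq) < len(pattern)


read1 = Read("CGCTGGTGGCTGG")
read2 = Read("AACAACAAGCGGC")

# Unguarded check, as in the traceback: the pattern is None, so len() raises.
try:
    len(read1.seq) < len(None)
except TypeError as err:
    print("unguarded:", err)

# Guarded checks: read1 has no pattern, read2 uses the 8-base UMI pattern.
print("read1 too short:", too_short(read1, None))
print("read2 too short:", too_short(read2, "NNNNNNNN"))
```

Whether the real fix belongs in this length check or earlier in option handling is a question for the maintainers; the sketch only shows why a None pattern trips len().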
@TomSmithCGAT
Member

Just to check, you're trying to extract a UMI from just read2? extract is designed to extract from read1 or read1 and read2, but it's possible to extract from just read2. Just swap them around in the input/output like so:

umi_tools extract -I read2.fastq.gz --read2-in=read1.fastq.gz --bc-pattern=NNNNNNNN -S r2UMI.fastq.gz --read2-out=r1UMI.fastq.gz
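
If umi_tools is called from a pipeline script, the swap can be kept in one place so the rest of the pipeline still sees read1 and read2 in their usual order. A rough sketch, assuming a Python wrapper; swapped_extract_cmd is a hypothetical helper, and it uses only the flags shown in the command above.

```python
import shlex


def swapped_extract_cmd(read1_in, read2_in, umi_pattern, read1_out, read2_out):
    """Build a umi_tools extract command that takes the UMI from read2.

    read2 is passed as the primary input (-I / -S) and read1 as the
    secondary input (--read2-in / --read2-out), so --bc-pattern applies
    to read2 while the output files keep their pipeline names.
    """
    return [
        "umi_tools", "extract",
        "-I", read2_in,
        f"--read2-in={read1_in}",
        f"--bc-pattern={umi_pattern}",
        "-S", read2_out,
        f"--read2-out={read1_out}",
    ]


cmd = swapped_extract_cmd(
    "read1.fastq.gz", "read2.fastq.gz", "NNNNNNNN",
    "r1UMI.fastq.gz", "r2UMI.fastq.gz",
)
print(shlex.join(cmd))
```

The resulting list can be handed to subprocess.run(cmd, check=True); the caller keeps passing files in read1, read2 order and the swap stays inside the helper.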

@m3hdad
Author

m3hdad commented Mar 30, 2022

Yes, I was trying to extract from read2 only. umi-tools is part of a pipeline, and I don't want to change the order of the reads because the pipeline's logic would break.

Anyway, I know that swapping works fine. I had misunderstood --bc-pattern2 as a way to extract from read2 only. The documentation does say that --bc-pattern and/or --bc-pattern2 are required, though.

You might consider this a feature request, if it's not too much work to support. As I mentioned, passing --bc-pattern=X avoids the TypeError. Otherwise, feel free to close the issue if this is how it was supposed to work. Thanks for your reply.

@TomSmithCGAT
Member

Ah, the docs are wrong in that case!

I'd have to take a look back into the code to see whether supporting just extraction from read2 is a PITA. I suspect it should be OK and I've got no principle against it.

You OK if we add this @IanSudbery?

@IanSudbery
Member

Fine by me, as long as it's not too big a surgery.

@c-guzman

Just posting here as another instance where this isn't explicitly clear in the documentation and caused issues until I found this post. Would be a great feature to add.

Thanks!

@TomSmithCGAT
Member

The next release will include an option to extract barcodes from read 2 only (see #630)


4 participants