Major behaviour change from 1.0.7 to 1.1.1 #24

Open
tseemann opened this issue Aug 16, 2016 · 4 comments

tseemann commented Aug 16, 2016

Today I upgraded from lighter 1.0.7 to 1.1.1. I first noticed a problem when 1.1.1 wrote a different number of reads to each of the two output files, and then noticed it was also passing far fewer reads overall.

This is the command line:

lighter -od . -r R1.fq.gz -r R2.fq.gz -K 32 4000000 -t 72 -maxcor 2

This is the difference in read counts:

Files   R1.fq.gz
Reads   3747457  # original reads
Files   R2.fq.gz
Reads   3747457

Files   1.0.7-R1.cor.fq.gz
Reads   3747457  # none missing
Files   1.0.7-R2.cor.fq.gz
Reads   3747457

Files   1.1.1-R1.cor.fq.gz
Reads   2511489  # lots missing
Files   1.1.1-R2.cor.fq.gz
Reads   2511506  # has 17 more reads!

Any ideas?
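
For anyone wanting to run the same check on their own output, here is a minimal sketch of counting reads in a pair of gzipped FASTQ files and testing whether the two counts agree. It assumes the standard 4-lines-per-record FASTQ layout; the function names `count_fastq_reads` and `check_pair` are hypothetical helpers, not part of Lighter.

```python
import gzip


def count_fastq_reads(path):
    """Count records in a gzipped FASTQ file, assuming 4 lines per record."""
    with gzip.open(path, "rt") as handle:
        lines = sum(1 for _ in handle)
    if lines % 4 != 0:
        # A partial record usually means a truncated or malformed file.
        raise ValueError(f"{path}: {lines} lines is not a multiple of 4")
    return lines // 4


def check_pair(r1_path, r2_path):
    """Return (reads in R1, reads in R2, whether the counts match)."""
    n1 = count_fastq_reads(r1_path)
    n2 = count_fastq_reads(r2_path)
    return n1, n2, n1 == n2
```

Calling `check_pair("1.1.1-R1.cor.fq.gz", "1.1.1-R2.cor.fq.gz")` on the files listed above should report the two counts and flag that they differ. Equal counts alone do not prove the mates are still in the same order, so this is only a first sanity check.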

Owner

mourisl commented Aug 16, 2016

I tested again on my data sets and could not reproduce the bug you hit. Is there a way for me to access the data set you used? If not, can you show me the correction summary that Lighter prints to the screen? Thanks.

Author

tseemann commented Aug 16, 2016

I found the issue. If you compile with the default -O2 option, it works. In Linuxbrew, I used the system CXXFLAGS, which sets -Os (optimize for size), and that causes the bug!
CC: @sjackman

See the output messages below:

Files   R1.fq.gz
Reads   3747457

This is g++ -O2 (which works)

$ ./lighter-1.1.1-O2 -od 1.1.1-O2 -r R1.fq.gz -r R2.fq.gz -K 32 4000000 -t 72 -maxcor 2
[2016-08-17 00:11:57] =============Start====================
[2016-08-17 00:11:57] Scanning the input files to infer alpha(sampling rate)
[2016-08-17 00:12:04] Average coverage is 141.346 and alpha is 0.050
[2016-08-17 00:12:05] Bad quality threshold is "B"
[2016-08-17 00:12:15] Finish sampling kmers
[2016-08-17 00:12:15] Bloom filter A's false positive rate: 0.006326
[2016-08-17 00:12:24] Finish storing trusted kmers
[2016-08-17 00:12:56] Finish error correction
Processed 7494914 reads:
        7042749 are error-free
        Corrected 579197 bases(1.280942 corrections for reads with errors)
        Trimmed 0 reads with average trimmed bases 0.000000
        Discard 0 reads

This is g++ -Os with missing reads!

$ ./lighter-1.1.1-Os -od 1.1.1-Os -r R1.fq.gz -r R2.fq.gz -K 32 4000000 -t 72 -maxcor 2
[2016-08-17 00:13:38] =============Start====================
[2016-08-17 00:13:38] Scanning the input files to infer alpha(sampling rate)
[2016-08-17 00:13:46] Average coverage is 141.346 and alpha is 0.050
[2016-08-17 00:13:47] Bad quality threshold is "B"
[2016-08-17 00:13:57] Finish sampling kmers
[2016-08-17 00:13:57] Bloom filter A's false positive rate: 0.006326
[2016-08-17 00:14:06] Finish storing trusted kmers
[2016-08-17 00:14:32] Finish error correction
Processed 5022995 reads:
        4719925 are error-free
        Corrected 388478 bases(1.281809 corrections for reads with errors)
        Trimmed 0 reads with average trimmed bases 0.000000
        Discard 0 reads
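
The two summaries can be compared against the input read count with a few lines of arithmetic. The sketch below pulls the "Processed N reads" figure out of each log and checks it against the 2 × 3747457 input reads reported earlier in the thread; `processed_reads` is a hypothetical helper, and the regular expression assumes the summary wording shown in the logs above.

```python
import re


def processed_reads(log_text):
    """Extract N from a 'Processed N reads' line in Lighter's screen output."""
    match = re.search(r"Processed (\d+) reads", log_text)
    if match is None:
        raise ValueError("no 'Processed N reads' line found")
    return int(match.group(1))


# Input size from the issue: R1 and R2 each hold 3747457 reads.
input_reads = 2 * 3747457

o2_reads = processed_reads("Processed 7494914 reads:")  # -O2 build
os_reads = processed_reads("Processed 5022995 reads:")  # -Os build

print("-O2 processed every input read:", o2_reads == input_reads)
print("reads missing under -Os:", input_reads - os_reads)
print("fraction processed under -Os:", round(os_reads / input_reads, 4))
```

The -Os figure also equals the sum of the two 1.1.1 output files from the first comment (2511489 + 2511506 = 5022995), so the reads are already missing when the summary is printed, with about a third of the input never counted as processed.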

Author

tseemann commented Jul 1, 2017

Ping @mourisl - any ideas?

sjackman commented Jul 5, 2017

As a workaround you can use ENV.O2 in the formula to use -O2 rather than the default -Os.
