Dynamic computation of chunkSize introduces gaps between bins at chunk ends #887

Closed
dmalzl opened this issue Nov 13, 2019 · 9 comments

@dmalzl

dmalzl commented Nov 13, 2019

I use deepTools frequently and am very fond of the versatility the suite offers, but I recently ran into a bug that may not matter for most use cases yet is harmful when an application relies on accurate bin positions.

Using the multiBamSummary tool I counted reads in 5 kb bins with 16 cores for multiple BAM files. These per-bin counts were later used to call regions that exhibit a given distribution of counts across bins. While developing this caller I looked into something I had already noticed before. The expected behaviour when tiling a genome is a series of successive bins of the given bin size, e.g. for a chromosome that is 100563 bp long I would expect a 5 kb binning to look like:

chr 0 5000
chr 5000 10000
.
.
chr 95000 100000
chr 100000 100563
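
For reference, the expected tiling listed above can be written down in a few lines of Python. This is only an illustrative sketch: `tile_chromosome` is a hypothetical helper, not a deepTools function.

```python
def tile_chromosome(chrom, length, bin_size):
    """Return consecutive (chrom, start, end) bins covering [0, length)."""
    bins = []
    for start in range(0, length, bin_size):
        # The final bin is truncated at the chromosome end.
        end = min(start + bin_size, length)
        bins.append((chrom, start, end))
    return bins


bins = tile_chromosome("chr", 100563, 5000)
print(bins[0])    # ('chr', 0, 5000)
print(bins[-1])   # ('chr', 100000, 100563)
print(len(bins))  # 21
```

Every bin starts exactly where the previous one ends, and only the last bin is shorter than the bin size.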

However, the deepTools output always showed irregularities: somewhere along the tiling, some bins seemed to be slightly larger than the specified bin size. Since accuracy is crucial for me, I dug into the data and found that it is not the bin size that varies; rather, there are gaps between contiguous stretches of consecutive bins. Looking at the code, I found that this is most likely due to the way the genome is split into chunks for parallel computation.

The chunkSize used for splitting is dynamically computed from some BAM statistic and matches exactly the length of the contiguous bin stretches (in my case 16650712 bp)

Thus, I suspect the following:
The computed chunk size is not a multiple of the bin size. This case seems to be handled by simply omitting bins that are shorter than a minimum length. Consequently, the last 712 bp (or whatever the bin-size-dependent residual is) are not included in the output of a given chunk. However, each subsequent chunk starts at the end position of the previous chunk, where that small portion was omitted, resulting in a gap of 712 bp (in my case) that is not part of the output.
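
The suspected mechanism can be reproduced with a small simulation. This is a sketch of the behaviour as described in this report, not the actual deepTools code; `bins_per_chunk` and `gaps` are hypothetical helpers, and the chunk size is the value quoted above.

```python
def bins_per_chunk(chrom_len, chunk_size, bin_size, keep_partial):
    """Tile each chunk independently, as suspected in the report.

    keep_partial=False drops a trailing bin shorter than bin_size;
    keep_partial=True keeps it, truncated at the chunk end.
    """
    bins = []
    for chunk_start in range(0, chrom_len, chunk_size):
        chunk_end = min(chunk_start + chunk_size, chrom_len)
        for start in range(chunk_start, chunk_end, bin_size):
            end = min(start + bin_size, chunk_end)
            if end - start < bin_size and not keep_partial:
                continue  # residual at the chunk end is silently skipped
            bins.append((start, end))
    return bins


def gaps(bins):
    """Return (previous_end, next_start) pairs where bins are not adjacent."""
    return [(a[1], b[0]) for a, b in zip(bins, bins[1:]) if a[1] != b[0]]


chunk_size, bin_size = 16650712, 5000
dropped = bins_per_chunk(3 * chunk_size, chunk_size, bin_size, keep_partial=False)
kept = bins_per_chunk(3 * chunk_size, chunk_size, bin_size, keep_partial=True)

print(gaps(dropped))  # [(16650000, 16650712), (33300712, 33301424)]
print(gaps(kept))     # []
```

With the residual dropped, a 712 bp gap appears at every chunk boundary. With the residual kept, the bins are adjacent again, although the bin at each chunk end is shorter than the bin size.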

I suggest fixing this in the next release

@dpryan79
Collaborator

Indeed, for our own uses we've never cared all that much exactly where the bins are, but since you're probably not the only one for whom exactly evenly spaced bins matter, we should look into addressing this. Thanks for the detailed bug report!

@dpryan79
Collaborator

dpryan79 commented Dec 2, 2019

Which version of deepTools is giving you non-adjacent bins in this case? With 3.3.1 I get adjacent bins, though they definitely are different sizes around chunk length boundaries.

@dmalzl
Author

dmalzl commented Dec 2, 2019

I currently use version 3.3.0 for all my analyses, including the one where I found the inconsistencies.

@dpryan79
Collaborator

dpryan79 commented Dec 2, 2019

And you're seeing this with multiBamSummary bins, yes?

@dmalzl
Author

dmalzl commented Dec 2, 2019

Exactly

@dpryan79
Collaborator

dpryan79 commented Dec 2, 2019

Can you provide the exact command you're using? I still don't get the gaps you mentioned with version 3.3.0. In the next release I'll try to add a --genomeChunkLength argument anyway so you can directly modify this for consistency.

@dmalzl
Author

dmalzl commented Dec 2, 2019

The command I used was:

multiBamSummary bins -b $somebams -l $somelabels -bs 5000 --outRawCounts $rawcountsfile -o $npzfile -p 8 --ignoreDuplicates
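
To check a raw counts file for such gaps, something along these lines could be used. It is a minimal sketch under assumptions: the file is taken to be tab-delimited with chromosome, start and end in the first three columns, header lines are taken to start with `#`, and rows are taken to be sorted by chromosome and start. `find_gaps` is a hypothetical helper and the example rows are made up for illustration.

```python
import io


def find_gaps(handle):
    """Yield (chrom, previous_end, next_start) for non-adjacent consecutive bins."""
    prev_chrom, prev_end = None, None
    for line in handle:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        chrom, start, end = line.rstrip("\n").split("\t")[:3]
        start, end = int(start), int(end)
        if chrom == prev_chrom and start != prev_end:
            yield chrom, prev_end, start
        prev_chrom, prev_end = chrom, end


# Made-up rows standing in for a raw counts file (assumed layout).
example = io.StringIO(
    "#'chr'\t'start'\t'end'\t'sample1'\n"
    "chr1\t0\t5000\t12.0\n"
    "chr1\t5000\t10000\t7.0\n"
    "chr1\t10712\t15712\t3.0\n"
)
print(list(find_gaps(example)))  # [('chr1', 10000, 10712)]
```

For a real file, pass an open file handle instead of the `io.StringIO` object, and sort the rows first if they are not already ordered.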

dpryan79 added a commit that referenced this issue Dec 3, 2019
dpryan79 added a commit that referenced this issue Dec 3, 2019
@dpryan79
Collaborator

dpryan79 commented Dec 3, 2019

This should now be fixed in the develop branch. The shorter regions at the end of genomic chunks are no longer skipped so there aren't gaps between bins any more. multiBamSummary also accepts a --genomicChunkSize option now, so you'll be able to tweak for consistency.
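
One plausible way to "tweak for consistency" is to pick a chunk size that is a whole multiple of the bin size, so that no short bin arises at chunk ends. The sketch below only does that arithmetic; `aligned_chunk_size` is a hypothetical helper, and it is an assumption that the new option takes a chunk length in bp.

```python
def aligned_chunk_size(chunk_size, bin_size):
    """Round chunk_size down to a multiple of bin_size (at least one bin)."""
    if bin_size <= 0:
        raise ValueError("bin_size must be positive")
    return max(bin_size, (chunk_size // bin_size) * bin_size)


print(aligned_chunk_size(16650712, 5000))  # 16650000
```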

dpryan79 closed this as completed Dec 3, 2019
@dmalzl
Author

dmalzl commented Dec 3, 2019

Thanks

dpryan79 added a commit that referenced this issue Jan 23, 2020
* copy changes from bgruening

* this file should not be here since years (#845)

* Develop (#827)

* Merged into the wrong branch without noticing :( (#814)

* use better conda link (#799)

* Estimated filtering fix (#813)

* oops

* fix testing and set a max number of filtered reads

* apparently a bunch of things were getting skipped

* fix wrappers

* update computeMatrix wrapper

* Decrease memory needs (#817)

* Use an iterator to not blow memory up

* Update a bit more

* The GC bias stuff is all deprecated, I'm not fixing that old code

* Cache resulting counts rather than just decreasing the bin size (#818)

* Cache resulting counts rather than just decreasing the bin size

* sanity check

* Implement #815

* [skip ci] update change log

* Implement #816 (#825)

* Implement #816

* expose option

* Add a test using pseudocounts and skipZeroOverZero

* syntax

* Fix tests

* Make --skipZeroOverZero a galaxy macro and add to bigwigCompare

* [ci skip] a bit of formatting

* Fix #822 (#826)

* fixes linting issues (#837)

* this file should not be here since years

* Add Arabidopsis TAIR10 (A_thaliana_Jun_2009) (#853)

Using output from:
faCount A_thaliana_Jun_2009.fa 
#seq	len	A	C	G	T	N	cpg
Chr1	30427671	9709674	5435374	5421151	9697113	164359	697370
Chr2	19698289	6315641	3542973	3520766	6316348	2561	457572
Chr3	23459830	7484757	4258333	4262704	7448059	5977	559031
Chr4	18585056	5940546	3371349	3356091	5914038	3032	439585
Chr5	26975502	8621974	4832253	4858759	8652238	10278	630299
ChrC	154478	48546	28496	27570	49866	0	4639
ChrM	366924	102464	82661	81609	100190	0	13697
total	119667750	38223602	21551439	21528650	38177852	186207	2802193
hpc $ python
Python 2.7.11 (default, Jul 25 2019, 12:10:26) 
[GCC 4.8.5 20150623 (Red Hat 4.8.5-28)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 119667750-186207
119481543

* Fix python version in Azure tests  (#860)

* Delete #test.bg# (#859)

File is removed upon clean.

* Fix python version

* Update azure-pipelines.yml

* fixed typo (#864)

* fixed typo

* Update test images, skip testing if the wrong matplotlib version is used (#865)

* Update test images, skip testing if the wrong matplotlib version is used

* Update test-template.yml

* linting

* can't conda activate on azure

* now the heatmap is correct and the profile is wrong

* lint

* only one test should fail now

* Fix #844

* Should fix one test at least

* fix last tests

* fix #838 (#843)

* fix #838

* fixes

* Update CHANGES.txt

* Close #868 #867 and #851 (#869)

* Fix #868

* Fix #867

* Default ALL the things!

* Fix #866 (#871)

* Release 3.3.1 (#873)

* release 3.3.1

* try github actions

* each action is a file

* OK, that's inflexible

* OK, the action.yml thing is a mess

* syntax

* ok, try this

* uses

* spacing

* ok

* do anchors work?

* boo, so duplicative!

* oops

* maybe this will work for pypi

* ensure dist is empty

* nev

* rename

* bump version number

* Actionable active actions acting actively (#874)

* give actions another try

* wrong docs?

* ok

* hmm

* WTF

* ah, we CAN give a path

* hmm

* actions everywhere

* foo

* artifacts

* fix #889 (#891)

* Fix888 (#892)

* fix x-axis profile tick positions

* set minimum matplotlib version to 3.1.0

* fix hexbin and overlapped_lines too

* fix #887 (#893)

* update change log

* Seaborn colormaps (#894)

* add seaborn colormaps

* bump version and finally change license

* indenting

* update colormaps in galaxy wrapper

* update version in galaxy wrapper

* changelog

* wrong issue number

* pep8

* pep8

* pep8

* pep8

* added clusterUsingSamples to heatmap

* Added a couple of assertions to check the range of samples' indices

* Using --xRange and --yRange fails in galaxy due to the single quote. … (#901)

* Release 3.3.1 (#872)

* Using --xRange and --yRange fails in galaxy due to the single quote. Removed them.

* Try just changing the wrapper

* fix wrapper linting

* plotCorrelation wrapper works properly now

* Add separate linting step to catch some of this in the future

Co-authored-by: Devon Ryan <dpryan79@users.noreply.github.com>
Co-authored-by: Björn Grüning <bjoern@gruenings.eu>
Co-authored-by: Steffen Möller <steffen_moeller@gmx.de>

* Documentation fixes, closes #886 (#905)

* Fix #902 (#906)

* [WIP] added a silhouette calculation (#876)

* added a silhouette calculation

* remove sklearn, implement with scipy and numpy

* Update requirements.txt

* Update setup.py

* Update heatmapper.py

* update galaxy wrapper

* Fix run time issues

* refactor, the order matters here.

* removing debugging stuff

* Update heatmapper.py

* Update plotHeatmap.py

Co-authored-by: Devon Ryan <dpryan79@users.noreply.github.com>
Co-authored-by: Björn Grüning <bjoern@gruenings.eu>
Co-authored-by: Steffen Möller <steffen_moeller@gmx.de>

* mention silhouette score

* update help location

* see if this fixes things (#909)

Co-authored-by: Björn Grüning <bjoern@gruenings.eu>
Co-authored-by: Ann Loraine <aloraine@uncc.edu>
Co-authored-by: Jan Janssen <jan-janssen@users.noreply.github.com>
Co-authored-by: Ömer An <bounlu@gmail.com>
Co-authored-by: LeilyR <leila.rabbani@gmail.com>
Co-authored-by: cgirardot <girardot@embl.de>
Co-authored-by: Steffen Möller <steffen_moeller@gmx.de>
Co-authored-by: Lucille Delisle <lucille.delisle@epfl.ch>
dpryan79 added a commit that referenced this issue Mar 5, 2020
* copy changes from bgruening

* this file should not be here since years (#845)

* Develop (#827)

* Merged into the wrong branch without noticing :( (#814)

* use better conda link (#799)

* Estimated filtering fix (#813)

* oops

* fix testing and set a max number of filtered reads

* apparently a bunch of things were getting skipped

* fix wrappers

* update computeMatrix wrapper

* Decrease memory needs (#817)

* Use an iterator to not blow memory up

* Update a bit more

* The GC bias stuff is all deprecated, I'm not fixing that old code

* Cache resulting counts rather than just decreasing the bin size (#818)

* Cache resulting counts rather than just decreasing the bin size

* sanity check

* Implement #815

* [skip ci] update change log

* Implement #816 (#825)

* Implement #816

* expose option

* Add a test using pseudocounts and skipZeroOverZero

* syntax

* Fix tests

* Make --skipZeroOverZero a galaxy macro and add to bigwigCompare

* [ci skip] a bit of formatting

* Fix #822 (#826)

* fixes linting issues (#837)

* this file should not be here since years

* Add Arabidopsis TAIR10 (A_thaliana_Jun_2009) (#853)

Using output from:
faCount A_thaliana_Jun_2009.fa 
#seq	len	A	C	G	T	N	cpg
Chr1	30427671	9709674	5435374	5421151	9697113	164359	697370
Chr2	19698289	6315641	3542973	3520766	6316348	2561	457572
Chr3	23459830	7484757	4258333	4262704	7448059	5977	559031
Chr4	18585056	5940546	3371349	3356091	5914038	3032	439585
Chr5	26975502	8621974	4832253	4858759	8652238	10278	630299
ChrC	154478	48546	28496	27570	49866	0	4639
ChrM	366924	102464	82661	81609	100190	0	13697
total	119667750	38223602	21551439	21528650	38177852	186207	2802193
hpc $ python
Python 2.7.11 (default, Jul 25 2019, 12:10:26) 
[GCC 4.8.5 20150623 (Red Hat 4.8.5-28)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 119667750-186207
119481543

* Fix python version in Azure tests  (#860)

* Develop (#827)

* Merged into the wrong branch without noticing :( (#814)

* use better conda link (#799)

* Estimated filtering fix (#813)

* oops

* fix testing and set a max number of filtered reads

* apparently a bunch of things were getting skipped

* fix wrappers

* update computeMatrix wrapper

* Decrease memory needs (#817)

* Use an iterator to not blow memory up

* Update a bit more

* The GC bias stuff is all deprecated, I'm not fixing that old code

* Cache resulting counts rather than just decreasing the bin size (#818)

* Cache resulting counts rather than just decreasing the bin size

* sanity check

* Implement #815

* [skip ci] update change log

* Implement #816 (#825)

* Implement #816

* expose option

* Add a test using pseudocounts and skipZeroOverZero

* syntax

* Fix tests

* Make --skipZeroOverZero a galaxy macro and add to bigwigCompare

* [ci skip] a bit of formatting

* Fix #822 (#826)

* fixes linting issues (#837)

* Delete #test.bg# (#859)

File is removed upon clean.

* Fix python version

* Update azure-pipelines.yml

* fixed typo (#864)

* Develop (#827)

* Merged into the wrong branch without noticing :( (#814)

* use better conda link (#799)

* Estimated filtering fix (#813)

* oops

* fix testing and set a max number of filtered reads

* apparently a bunch of things were getting skipped

* fix wrappers

* update computeMatrix wrapper

* Decrease memory needs (#817)

* Use an iterator to not blow memory up

* Update a bit more

* The GC bias stuff is all deprecated, I'm not fixing that old code

* Cache resulting counts rather than just decreasing the bin size (#818)

* Cache resulting counts rather than just decreasing the bin size

* sanity check

* Implement #815

* [skip ci] update change log

* Implement #816 (#825)

* Implement #816

* expose option

* Add a test using pseudocounts and skipZeroOverZero

* syntax

* Fix tests

* Make --skipZeroOverZero a galaxy macro and add to bigwigCompare

* [ci skip] a bit of formatting

* Fix #822 (#826)

* fixes linting issues (#837)

* Delete #test.bg# (#859)

File is removed upon clean.

* fixed typo

* Update test images, skip testing if the wrong matplotlib version is used (#865)

* Update test images, skip testing if the wrong matplotlib version is used

* Update test-template.yml

* linting

* can't conda activate on azure

* now the heatmap is correct and the profile is wrong

* lint

* only one test should fail now

* Fix #844

* Should fix one test at least

* fix last tests

* fix #838 (#843)

* fix #838

* fixes

* Update CHANGES.txt

* Close #868 #867 and #851 (#869)

* Fix #868

* Fix #867

* Default ALL the things!

* Fix #866 (#871)

* Release 3.3.1 (#873)

* Develop (#827)

* Merged into the wrong branch without noticing :( (#814)

* use better conda link (#799)

* Estimated filtering fix (#813)

* oops

* fix testing and set a max number of filtered reads

* apparently a bunch of things were getting skipped

* fix wrappers

* update computeMatrix wrapper

* Decrease memory needs (#817)

* Use an iterator to not blow memory up

* Update a bit more

* The GC bias stuff is all deprecated, I'm not fixing that old code

* Cache resulting counts rather than just decreasing the bin size (#818)

* Cache resulting counts rather than just decreasing the bin size

* sanity check

* Implement #815

* [skip ci] update change log

* Implement #816 (#825)

* Implement #816

* expose option

* Add a test using pseudocounts and skipZeroOverZero

* syntax

* Fix tests

* Make --skipZeroOverZero a galaxy macro and add to bigwigCompare

* [ci skip] a bit of formatting

* Fix #822 (#826)

* fixes linting issues (#837)

* Delete #test.bg# (#859)

File is removed upon clean.

* release 3.3.1

* try github actions

* each action is a file

* OK, that's inflexible

* OK, the action.yml thing is a mess

* syntax

* ok, try this

* uses

* spacing

* ok

* do anchors work?

* boo, so duplicative!

* oops

* maybe this will work for pypi

* ensure dist is empty

* nev

* rename

* bump version number

* Actionable active actions acting actively (#874)

* give actions another try

* wrong docs?

* ok

* hmm

* WTF

* ah, we CAN give a path

* hmm

* actions everywhere

* foo

* artifacts

* fix #889 (#891)

* Fix888 (#892)

* fix x-axis profile tick positions

* set minimum matplotlib version to 3.1.0

* fix hexbin and overlapped_lines too

* fix #887 (#893)

* update change log

* Seaborn colormaps (#894)

* add seaborn colormaps

* bump version and finally change license

* indenting

* update colormaps in galaxy wrapper

* update version in galaxy wrapper

* changelog

* wrong issue number

* pep8

* pep8

* pep8

* pep8

* added clusterUsingSamples to heatmap

* Added a couple of assertions to cehck the range of samples' indices

* Using --xRange and --yRange fails in galaxy due to the single quote. … (#901)

* Develop (#827)

* Merged into the wrong branch without noticing :( (#814)

* use better conda link (#799)

* Estimated filtering fix (#813)

* oops

* fix testing and set a max number of filtered reads

* apparently a bunch of things were getting skipped

* fix wrappers

* update computeMatrix wrapper

* Decrease memory needs (#817)

* Use an iterator to not blow memory up

* Update a bit more

* The GC bias stuff is all deprecated, I'm not fixing that old code

* Cache resulting counts rather than just decreasing the bin size (#818)

* Cache resulting counts rather than just decreasing the bin size

* sanity check

* Implement #815

* [skip ci] update change log

* Implement #816 (#825)

* Implement #816

* expose option

* Add a test using pseudocounts and skipZeroOverZero

* syntax

* Fix tests

* Make --skipZeroOverZero a galaxy macro and add to bigwigCompare

* [ci skip] a bit of formatting

* Fix #822 (#826)

* fixes linting issues (#837)

* Delete #test.bg# (#859)

File is removed upon clean.

* Release 3.3.1 (#872)

* copy changes from bgruening

* this file should not be here since years (#845)

* this file should not be here since years

* Add Arabidopsis TAIR10 (A_thaliana_Jun_2009) (#853)

Using output from:
faCount A_thaliana_Jun_2009.fa 
#seq	len	A	C	G	T	N	cpg
Chr1	30427671	9709674	5435374	5421151	9697113	164359	697370
Chr2	19698289	6315641	3542973	3520766	6316348	2561	457572
Chr3	23459830	7484757	4258333	4262704	7448059	5977	559031
Chr4	18585056	5940546	3371349	3356091	5914038	3032	439585
Chr5	26975502	8621974	4832253	4858759	8652238	10278	630299
ChrC	154478	48546	28496	27570	49866	0	4639
ChrM	366924	102464	82661	81609	100190	0	13697
total	119667750	38223602	21551439	21528650	38177852	186207	2802193
hpc $ python
Python 2.7.11 (default, Jul 25 2019, 12:10:26) 
[GCC 4.8.5 20150623 (Red Hat 4.8.5-28)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 119667750-186207
119481543

* Fix python version in Azure tests  (#860)

* Fix python version

* Update azure-pipelines.yml

* fixed typo (#864)

* fixed typo

* Update test images, skip testing if the wrong matplotlib version is used (#865)

* Update test images, skip testing if the wrong matplotlib version is used

* Update test-template.yml

* linting

* can't conda activate on azure

* now the heatmap is correct and the profile is wrong

* lint

* only one test should fail now

* Fix #844

* Should fix one test at least

* fix last tests

* fix #838 (#843)

* fix #838

* fixes

* Update CHANGES.txt

* Close #868 #867 and #851 (#869)

* Fix #868

* Fix #867

* Default ALL the things!

* Fix #866 (#871)

* Using --xRange and --yRange fails in galaxy due to the single quote. Removed them.

* Try just changing the wrapper

* fix wrapper linting

* plotCorrelation wrapper works properly now

* Add separate linting step to catch some of this in the future

Co-authored-by: Devon Ryan <dpryan79@users.noreply.github.com>
Co-authored-by: Björn Grüning <bjoern@gruenings.eu>
Co-authored-by: Steffen Möller <steffen_moeller@gmx.de>

* Documentation fixes, closes #886 (#905)

* Fix #902 (#906)

* [WIP] added a silhouette calculation (#876)

* added a silhouette calculation

* remove sklearn, implement with scipy and numpy

* Update requirements.txt

* Update setup.py

* Update heatmapper.py

* update galaxy wrapper

* Fix run time issues

* refactor, the order matters here.

* removing debugging stuff

* Update heatmapper.py

* Update plotHeatmap.py

Co-authored-by: Devon Ryan <dpryan79@users.noreply.github.com>
Co-authored-by: Björn Grüning <bjoern@gruenings.eu>
Co-authored-by: Steffen Möller <steffen_moeller@gmx.de>

* mention silhouette score

* update help location

* see if this fixes things (#909)

* update Azure OSX client version

* Fix typos in documentation (#916)

Fixes two typos in example code.

* copy wrapper fixes from Bjoern's PR

* Fix dotted line (#921)

* Fix the dashed line in plotHeatmap with reference-point TES and sorting by region_length

* fix test

* Implement #924 (#925)

* Basic implementation of --linesAtTickMarks and galaxy wrapper

* bump version to 3.4.0

* typo

* stupid eigenvalues

Co-authored-by: Björn Grüning <bjoern@gruenings.eu>
Co-authored-by: A. Loraine <aloraine@uncc.edu>
Co-authored-by: Jan Janssen <jan-janssen@users.noreply.github.com>
Co-authored-by: Ömer An <bounlu@gmail.com>
Co-authored-by: LeilyR <leila.rabbani@gmail.com>
Co-authored-by: cgirardot <girardot@embl.de>
Co-authored-by: Steffen Möller <steffen_moeller@gmx.de>
Co-authored-by: Lucille Delisle <lucille.delisle@epfl.ch>
Co-authored-by: Sichong <scpeng@ucdavis.edu>
dpryan79 added a commit that referenced this issue Mar 15, 2020
* Release 3.3.1 (#873)

* Implement #924 (#925)

* Basic implementation of --linesAtTickMarks and galaxy wrapper

* bump version to 3.4.0

* typo

* stupid eigenvalues

* Fix #928 (#929)

* Don't force shared memory.

Co-authored-by: Björn Grüning <bjoern.gruening@gmail.com>
Co-authored-by: Ann Loraine <aloraine@uncc.edu>
Co-authored-by: Jan Janssen <jan-janssen@users.noreply.github.com>
Co-authored-by: bounlu <bounlu@gmail.com>
Co-authored-by: Leily Rabbani <rabbani@pc390.ie-freiburg.mpg.de>
Co-authored-by: LeilyR <leila.rabbani@gmail.com>
Co-authored-by: cgirardot <girardot@embl.de>
Co-authored-by: Björn Grüning <bjoern@gruenings.eu>
Co-authored-by: Steffen Möller <steffen_moeller@gmx.de>
Co-authored-by: Lucille Delisle <lucille.delisle@epfl.ch>
Co-authored-by: Sichong <scpeng@ucdavis.edu>