Fixing the filter sizes #51

redst4r · 2022-10-02T22:20:03Z

Hi,
just came across two bugs in the _running_mean() function:

1. The size of the filter (pyramid) is off by one (len(pyramid) == window_size-1). Not a big deal, but slightly inconsistent.
2. As mentioned in "Shape of the output of cnv.tl.infercnv" (#37), due to the way np.convolve() works, if the filter size is bigger than the number of genes, it swaps filter and signal, and the result doesn't mean much.

The first is an easy fix. For the second there are two options: 1) skip the chromosome entirely if there aren't enough genes, or 2) make the filter smaller, so that the chromosome essentially yields a single convolution result (all genes on the chromosome go into the convolution). I opted for 2): especially for larger filter sizes, you'd lose a lot of data when doing 1).
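To illustrate both points, here is a minimal sketch rather than the code from this PR. The helper names `pyramid_filter` and `smooth_row` are hypothetical; only the `np.arange`/`np.minimum` construction is taken from the diff under review. It shows that the filter has exactly `n` entries, that `np.convolve` returns an output as long as the filter when the filter is longer than the signal, and how shrinking the window to the number of genes yields a single value in `"valid"` mode.

```python
import numpy as np


def pyramid_filter(n):
    """Symmetric triangular weights of length n, e.g. n=5 -> [1, 2, 3, 2, 1]."""
    r = np.arange(1, n + 1)
    return np.minimum(r, r[::-1])


def smooth_row(row, window_size, mode="same"):
    """Hypothetical helper: cap the window at the number of genes, then convolve."""
    n = min(window_size, len(row))
    pyramid = pyramid_filter(n)
    return np.convolve(row, pyramid, mode=mode) / pyramid.sum()


signal = np.ones(4)  # a "chromosome" with only 4 genes

# Filter longer than the signal: np.convolve swaps its arguments, so the
# output has the length of the filter, not of the signal.
raw = np.convolve(signal, pyramid_filter(10), mode="same")
print(len(raw))

# Window capped at 4 genes: "valid" mode yields one convolution result.
fixed = smooth_row(signal, window_size=10, mode="valid")
print(fixed)
```

With `mode="same"` instead, the capped window would return one value per gene, so which mode to use is a separate decision from capping the window.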

Thanks for putting together this nice package!

Closes #37

grst · 2022-10-03T07:23:46Z

Thanks for fixing this!

Do you urgently need this in a released version? Ideally, I would wait a couple of weeks until I have fully ported infercnvpy to the cookiecutter-scverse template, which should also fix the CI.

redst4r · 2022-10-03T19:43:29Z

Hi,
no need to put this into a release soon, I'll just merge it on my local repository for the time being!

grst

So the new template is merged in and the CI works again.

It seems that a variable is not defined. Can you please fix that?

grst · 2022-10-06T13:27:21Z

src/infercnvpy/tl/_infercnv.py

+        r = np.arange(1, n + 1)
+
+        pyramid = np.minimum(r, r[::-1])
+        smoothed_x = np.apply_along_axis(lambda row: np.convolve(row, pyramid, mode=conv_mode), axis=1, arr=x) / np.sum(


It seems that conv_mode is not defined here. Can you add it as a parameter to the infercnvpy function and pass it on to _running_mean? Or just leave it hardcoded to "same"; either is fine for me.
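For reference, passing the mode through could look roughly like the sketch below. This is an assumption about the shape of the change, not the merged code: the function name `running_mean_sketch`, its signature, and the default `n=50` are hypothetical, and only the filter construction and the `np.apply_along_axis` call mirror the diff above.

```python
import numpy as np


def running_mean_sketch(x, n=50, conv_mode="same"):
    """Smooth each row of a (cells x genes) matrix with a pyramid filter.

    Hypothetical signature: conv_mode is forwarded to np.convolve.
    """
    r = np.arange(1, n + 1)
    pyramid = np.minimum(r, r[::-1])
    smoothed = np.apply_along_axis(
        lambda row: np.convolve(row, pyramid, mode=conv_mode), axis=1, arr=x
    )
    return smoothed / np.sum(pyramid)


x = np.random.default_rng(0).normal(size=(3, 200))
same = running_mean_sketch(x, n=50, conv_mode="same")
valid = running_mean_sketch(x, n=50, conv_mode="valid")
print(same.shape, valid.shape)
```

Note that the two modes return different numbers of columns, so anything downstream that maps columns back to gene positions would have to account for the chosen mode.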

redst4r · 2022-10-07

Ah, my mistake: I was playing around with the convolution mode in another branch and it leaked into here...

PS: is there a specific reason to run the convolution with mode="same"?
It seems to create weird edge effects at the ends of chromosomes. The filter keeps sliding past the end of the chromosome, so the first and last datapoints of the chromosome are convolved with a "ramp" (half of the pyramid filter). That also makes the effective filter size vary along the chromosome: with window_size=100 genes, you'll be smoothing over 100 genes most of the time, but at the start and end of the chromosome you'll be smoothing over just 50 genes.
I've been using `mode="valid"`, which at least treats each position the same way.

I'm not sure what's more appropriate, maybe mode=valid makes detecting CNVs at the edges harder...
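A small numerical check of the edge effect, as a sketch with toy sizes (10 genes, a window of 5) chosen only for illustration: on a constant signal, a properly normalized smoother should return the same constant everywhere, so any deviation at the edges comes from the filter hanging over the end of the chromosome.

```python
import numpy as np

n = 5
r = np.arange(1, n + 1)
pyramid = np.minimum(r, r[::-1])  # [1, 2, 3, 2, 1]

signal = np.ones(10)  # constant expression along a toy chromosome

# "same": one value per gene, but the outermost positions only see part of
# the filter, so they are scaled down after dividing by the full filter sum.
same_out = np.convolve(signal, pyramid, mode="same") / pyramid.sum()

# "valid": only positions where the filter fully overlaps the signal.
valid_out = np.convolve(signal, pyramid, mode="valid") / pyramid.sum()

print(same_out)
print(valid_out)
```

The `"same"` output dips below 1 at both ends while the `"valid"` output stays at 1 but has `n - 1` fewer positions, which is the trade-off described above.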

grst · 2022-10-08

I don't remember if there was a specific reason. Maybe I was trying to be consistent with the R version, but I'm not sure that's what they are doing.

Personally, I don't think it matters that much, as it's just a very small region, and single-cell based CNV calling is qualitative at best anyway.

codecov-commenter · 2022-10-08T16:58:42Z

Codecov Report

Merging #51 (5b4a734) into main (978fdfd) will decrease coverage by 0.95%.
The diff coverage is 0.00%.

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #51      +/-   ##
==========================================
- Coverage   66.01%   65.06%   -0.96%     
==========================================
  Files          13       13              
  Lines         409      415       +6     
==========================================
  Hits          270      270              
- Misses        139      145       +6

| Flag | Coverage Δ | |
|---|---|---|
| unittests | `65.06% <0.00%> (-0.96%)` | ⬇️ |

Flags with carried forward coverage won't be shown.

| Impacted Files | Coverage Δ | |
|---|---|---|
| src/infercnvpy/tl/_infercnv.py | `51.02% <0.00%> (-3.33%)` | ⬇️ |

redst4r added 2 commits October 2, 2022 15:06

fixed filter/window size

30e2365

fixing edge-case: windowsize > #genes

0a6221f

Merge branch 'main' into conv_window_fix

ed7f7f8

grst requested changes Oct 6, 2022

redst4r and others added 2 commits October 7, 2022 10:27

fixed convolution mode

1d8242b

Merge branch 'main' into conv_window_fix

5b4a734

grst enabled auto-merge (squash) October 8, 2022 16:57

grst merged commit f6870ec into icbi-lab:main Oct 8, 2022
