Sensitive detection of rare disease-associated cell subsets via representation learning #79
Comments
They appear to be using convolutional neural networks for unordered data. This seems strange to me. Has anyone seen that before in other domains?
@agitter : Haven't seen it before, other than the graph stuff. How does convolution work without any sort of neighbor relationships? Isn't that the primary reason that one uses CNNs - to take advantage of that structure?
@cgreene Exactly, I always thought that the point of convolution was to take advantage of neighbor relationships in one or more dimensions. From my ~5 minute read of this paper, they seem to be creating artificial random neighbor relationships from the unordered data for each cell in a biological sample and repeating that process many times. This may be their workaround for the problem that cell i in sample x does not correspond to cell i in sample y.
@agitter I agree that it is hard to figure that paper out. Back when we were first starting with these algorithms, we briefly considered clustering and using a dendrogram or similar to define neighbors for convolution. We decided that imposed too much structure and ended up moving forward with non-convolutional methods. From my ~10 minute read they are not imposing order:
They are doing something here
but I don't think that would impose the type of structure usually used for convolution. I have e-mailed the authors a link to this to see if they can provide some clarity. I would love to know if they compared against approaches that don't impose the structure of a CNN.
@cgreene Great that you emailed them! I was similarly intrigued/mystified when I looked at this a few months back. I also ran into some issue that I can't recall now when trying to run the code. Would be interesting to get some more insight from the authors.
@cgreene I stared at Figure 1a for a while, and it makes more sense. I erroneously read 2 or 3 convolutional filters as 2 or 3 convolutional layers. The network is actually quite small. My current understanding (subject to change) is that there is only 1 convolutional layer that contains 2 or 3 filters. Each filter is supposed to recognize a cell type signature; it transforms the mass cytometry marker values for a single cell into a scalar score. Then the pooling is over all of the cells, which either detects whether the signature was seen in any of the input cells (max) or the frequency with which it was seen (mean). The output layer makes a prediction using only these 2 or 3 inputs. If I understand correctly, it is a very special case of a convolutional network where the convolutional layer receptive field is 1 and the pooling layer receptive field is k, where k is the number of cells in the multi-cell input (e.g., 1000). So you are right that they are not artificially imposing an order among the cells. Now I'm curious what happens when the number of filters increases.
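To make that reading concrete, here is a minimal sketch of the forward pass as described above, not the authors' implementation. The ReLU on the filter scores, the sigmoid output, the random weights, and all function and variable names are assumptions made for illustration. It only shows that a receptive field of 1 followed by pooling over all k cells gives a prediction that does not depend on cell order.

```python
import math
import random

def filter_scores(cells, filters, biases):
    """Apply each filter to each cell independently (receptive field of 1).

    cells:   list of k cells, each a list of m marker values
    filters: list of f filters, each a list of m weights
    Returns a k x f nested list of scalar scores (ReLU is an assumption).
    """
    return [
        [max(0.0, sum(w * x for w, x in zip(filt, cell)) + b)
         for filt, b in zip(filters, biases)]
        for cell in cells
    ]

def pool(scores, mode="max"):
    """Pool each filter's scores over all k cells (pooling field of k)."""
    columns = list(zip(*scores))
    if mode == "max":
        return [max(col) for col in columns]  # was the signature seen at all?
    return [sum(col) / len(col) for col in columns]  # how often was it seen?

def predict(cells, filters, biases, out_weights, out_bias, mode="max"):
    """Output computed from only the f pooled values (sigmoid is an assumption)."""
    pooled = pool(filter_scores(cells, filters, biases), mode)
    z = sum(w * p for w, p in zip(out_weights, pooled)) + out_bias
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical sizes: k cells, m markers, f filters; weights are random, not trained.
rng = random.Random(0)
k, m, f = 1000, 5, 3
cells = [[rng.gauss(0, 1) for _ in range(m)] for _ in range(k)]
filters = [[rng.gauss(0, 1) for _ in range(m)] for _ in range(f)]
biases = [0.0] * f
out_weights = [0.5, -0.25, 1.0]

p_original = predict(cells, filters, biases, out_weights, 0.0, "max")
shuffled = cells[:]
rng.shuffle(shuffled)
p_shuffled = predict(shuffled, filters, biases, out_weights, 0.0, "max")
```

With max pooling, `p_original` and `p_shuffled` are identical, since no step looks at a cell's position in the list.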
@agitter Exactly, a somewhat unusual type of convnet where convolution simply corresponds to a dot product of a filter vector and the markers of a single cell, since the number of "channels" is equal to the number of markers, if I understood correctly. A couple of things crossed my mind:
@gokceneraslan Comparing an MLP with and without MIL would be a good idea. They could also create a simple baseline by having each cytometry sample be an instance and using the marker means as input to the 3 hidden units. That could be contrasted with the MIL approach with max pooling and mean pooling.
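A rough sketch of that contrast on toy data, under stated assumptions: the single hand-set ReLU unit, its threshold, and the made-up two-marker sample are all hypothetical and not taken from the paper. The baseline averages the markers over the sample before applying the hidden unit, while the MIL variants apply the unit per cell and then pool.

```python
def relu_unit(values, weights, bias):
    """One hidden unit: weighted sum of marker values followed by ReLU."""
    return max(0.0, sum(w * x for w, x in zip(weights, values)) + bias)

def baseline_features(cells, units):
    """Baseline: average each marker over the sample, then apply the units."""
    m = len(cells[0])
    means = [sum(cell[j] for cell in cells) / len(cells) for j in range(m)]
    return [relu_unit(means, w, b) for w, b in units]

def mil_features(cells, units, mode="max"):
    """MIL: apply the units to every cell, then pool over the cells."""
    per_cell = [[relu_unit(cell, w, b) for w, b in units] for cell in cells]
    columns = list(zip(*per_cell))
    if mode == "max":
        return [max(col) for col in columns]
    return [sum(col) / len(col) for col in columns]

# Toy sample: 1000 cells, 2 markers; 10 rare cells are high on marker 0.
sample = [[0.0, 1.0]] * 990 + [[4.0, 1.0]] * 10
# One hypothetical unit that fires only when marker 0 exceeds 2.
units = [([1.0, 0.0], -2.0)]

baseline = baseline_features(sample, units)     # sample-level marker means
mil_max = mil_features(sample, units, "max")    # was the signature present?
mil_mean = mil_features(sample, units, "mean")  # how frequent was it?
```

In this toy case the rare cells shift the marker-0 mean to only 0.04, so the baseline unit stays at zero, while max pooling returns 2.0 and mean pooling returns a small nonzero frequency-like value.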
http://doi.org/10.1101/046508