Multi-Segmentation #16

Open
chloielam opened this issue Sep 18, 2024 · 1 comment

Comments

@chloielam

May I ask why, in the multi-segmentation dataset, we assign the class_id to random existing classes?

@lin-tianyu
Owner

Hi @chloielam! Glad you asked.

SDSeg is a binary-segmentation model for now. The released multi-class segmentation code is still undergoing experimental refinement, which is why it was not published with the SDSeg paper.

As for your question: first, in the SDSeg setting, the AutoEncoder expects a 3-channel image as input to produce the corresponding latent representation, and we don't want to make any changes to the autoencoder.

Second, given the fixed 3-channel input, we tried feeding in the multi-label map with values 0, 1, 2, 3, etc. (copied 3 times to fill the channels), but it didn't work well.
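
For reference, a minimal sketch of that first attempt, i.e. replicating an integer label map across 3 channels. This is not the repository's code: the function name `label_map_to_three_channels`, the channels-last layout, and the absence of any value scaling are all assumptions made for illustration.

```python
import numpy as np

def label_map_to_three_channels(label_map: np.ndarray) -> np.ndarray:
    """Copy an (H, W) integer label map into an (H, W, 3) array.

    Assumption: channels-last layout and no rescaling of the class values;
    the actual preprocessing used in the experiment is not described here.
    """
    return np.repeat(label_map[..., None], 3, axis=-1)

# Toy 2x2 label map with classes 0..3.
label_map = np.array([[0, 1],
                      [2, 3]])
x = label_map_to_three_channels(label_map)
```

Every channel carries the same class indices, so the autoencoder sees arbitrary integer codes as pixel intensities.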

As a result, the multi-class segmentation code that you're using now takes the segmentation map of randomly sampled existing classes as the AutoEncoder's input, along with a label embedding that tells the model which classes are being processed. With this approach, the multi-class version of SDSeg produces reasonable segmentation results (though not SOTA yet).

As for why we restrict sampling to "existing" classes: we don't want the model to sample too many empty classes, which bring no benefit to the model and may lead to a strong class-imbalance problem.
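
The sampling described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the released code: `sample_existing_class` is a hypothetical helper, label 0 is assumed to be background and is excluded, and a single class is drawn per sample.

```python
import numpy as np

def sample_existing_class(label_map, rng, background=0):
    """Pick one class that actually appears in `label_map` and return
    (class_id, 3-channel binary mask), or None if only background is present.

    Assumptions: `background` is label 0 and is never sampled; the mask is
    replicated to 3 channels (channels-last) to match a 3-channel autoencoder.
    """
    present = np.unique(label_map)
    present = present[present != background]
    if present.size == 0:
        return None  # nothing but background in this image
    class_id = int(rng.choice(present))
    binary_mask = (label_map == class_id).astype(np.float32)
    mask_3ch = np.repeat(binary_mask[..., None], 3, axis=-1)
    return class_id, mask_3ch

rng = np.random.default_rng(0)
label_map = np.array([[0, 2],
                      [2, 5]])
class_id, mask_3ch = sample_existing_class(label_map, rng)
```

Here `class_id` would be what selects the label embedding, and `mask_3ch` is the binary target passed to the autoencoder. Classes absent from the image (e.g. 1, 3, 4 in the toy map) are never drawn, so no all-zero masks are produced.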

All in all, at each training step we sample some existing classes and train the model to segment those classes, conditioned on the corresponding label embedding. At inference, we feed in all the label embeddings one by one to generate segmentation results for every class of an image.
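
The inference loop could look roughly like the sketch below. `predict_binary_mask` is a hypothetical stand-in for the conditioned model (it is not an SDSeg API), and the 0.5 threshold is an assumption.

```python
import numpy as np

def segment_all_classes(image, class_ids, predict_binary_mask, threshold=0.5):
    """Run one binary prediction per class and collect the results.

    `predict_binary_mask(image, class_id)` is a placeholder callable that
    returns an (H, W) array of scores in [0, 1] for the requested class.
    """
    masks = {}
    for class_id in class_ids:
        scores = predict_binary_mask(image, class_id)
        masks[class_id] = scores >= threshold  # boolean (H, W) mask
    return masks
```

The result is one binary mask per class. How overlapping masks are merged into a single label map is not covered in this thread, so the sketch leaves them separate.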


Anyway, this is only a suboptimal solution for extending SDSeg to multi-class segmentation. I am still not quite sure why it doesn't work well. My best guess is that the BTCV dataset is too small.

Feel free to try the multi-class segmentation code of SDSeg! If you come up with a better idea for doing multi-class segmentation, please don't hesitate to contact me. Let's see if we can build an SDSeg-V2 or something together, lol

Best,
Tianyu
