Semi-supervised Support? #11

Open
narugo1992 opened this issue May 20, 2023 · 7 comments

@narugo1992

narugo1992 commented May 20, 2023

We exported images from dataset_generator.DatasetGenerator and found that they do not closely resemble real anime images. As a result, the trained model still performs poorly on real anime images.

(One sample image)
image

(Its mask)
image

I therefore believe this training framework should consider supporting semi-supervised learning, allowing users to provide a large number of real anime and illustration images to improve performance on real waifu images. Semi-supervised learning is crucial for training on anime images, especially for tasks like image segmentation, where data annotation is extremely expensive.

@narugo1992
Author

Based on this, we have trained relatively reliable object detection models for characters and faces in anime images (online demo). If you encounter difficulties in directly performing semi-supervised training using completely unlabeled anime images, you can consider using the Segment Anything (SAM) model. By passing the results of object detection (bounding boxes for characters and midpoints of faces) as input, you can guide the training process towards the desired targets.
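
A minimal sketch of how detection results could be turned into SAM prompts. The detection format (XYXY pixel boxes for the character and the face) and the helper name build_sam_prompts are assumptions for illustration; they are not part of this repository or of the detection models mentioned above. The returned values correspond to the box, point_coords, and point_labels arguments of SamPredictor.predict in the segment_anything package, which expects them as NumPy arrays.

```python
from typing import Dict, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in pixels


def box_center(box: Box) -> Tuple[float, float]:
    """Return the midpoint (x, y) of an XYXY box."""
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2.0, (y0 + y1) / 2.0)


def build_sam_prompts(person_box: Box, face_boxes: Sequence[Box]) -> Dict[str, list]:
    """Build prompt values for one character.

    Hypothetical helper: the character box becomes the box prompt, and the
    midpoint of each face inside it becomes a positive (label 1) point prompt.
    """
    px0, py0, px1, py1 = person_box
    points: List[List[float]] = []
    for face in face_boxes:
        cx, cy = box_center(face)
        # Keep only faces whose midpoint lies inside the character box.
        if px0 <= cx <= px1 and py0 <= cy <= py1:
            points.append([cx, cy])
    return {
        "box": [px0, py0, px1, py1],
        "point_coords": points,
        "point_labels": [1] * len(points),
    }


# Example: one character box, one face inside it and one face elsewhere.
prompts = build_sam_prompts(
    person_box=(40, 20, 360, 500),
    face_boxes=[(150, 60, 250, 160), (600, 50, 700, 150)],
)
print(prompts)
```

With segment_anything installed and a checkpoint loaded, these values would be converted with numpy.array(...) and passed to SamPredictor.predict after set_image (passing None for the point arguments when no face was found). The returned masks could then serve as pseudo-labels for otherwise unlabeled images.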

@SkyTNT
Owner

SkyTNT commented May 20, 2023

It already allows users to provide a large number of real anime and illustration images. Just put the real images and their masks into the imgs and masks folders.

@narugo1992
Author

narugo1992 commented May 20, 2023

> It already allows users to provide a large number of real anime and illustration images. Just put the real images and their masks into the imgs and masks folders.

I mean semi-supervised learning, in which the images are unlabeled. For anime images, it is difficult to obtain a large amount of annotated data, which is why semi-supervised learning is needed.

@SkyTNT
Owner

SkyTNT commented May 20, 2023

Oh, I see, but I haven't had time to implement it recently. If you open a PR, I will merge it.

@narugo1992
Author

I just trained a MODNet with the native dataset, here is one sample:

image

So maybe semi/self-supervised learning is necessary for this model.

I just studied the source code of MODNet. It provides self-supervised training based on SOC (sub-objectives consistency) for fine-tuning on unlabeled data: https://github.com/ZHKKKe/MODNet/blob/master/src/trainer.py#L177. The expected effect is shown in the following image:

image

I will conduct experiments with the aforementioned fine-tuning code and, if the results are good, I will submit a pull request. 😄
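
For readers unfamiliar with SOC: MODNet predicts a coarse semantic map, a boundary detail map, and a final matte, and the self-supervised step penalizes disagreement between them on unlabeled images. The sketch below is a simplified pure-Python illustration of such a consistency penalty, not the code at the linked trainer.py. The function names, the average-pooling stand-in for downsampling, and the equal weighting of the two terms are all assumptions.

```python
from typing import List

Grid = List[List[float]]  # 2D map of values in [0, 1]


def downsample(grid: Grid, factor: int) -> Grid:
    """Average-pool a 2D map by `factor` (a stand-in for blur + downsample)."""
    height, width = len(grid), len(grid[0])
    pooled = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            block = [
                grid[y][x]
                for y in range(top, min(top + factor, height))
                for x in range(left, min(left + factor, width))
            ]
            row.append(sum(block) / len(block))
        pooled.append(row)
    return pooled


def consistency_loss(semantic: Grid, detail: Grid, matte: Grid,
                     boundary: Grid, factor: int) -> float:
    """Penalize disagreement between three predictions for one unlabeled image.

    semantic: coarse foreground map (matte resolution divided by `factor`)
    detail:   full-resolution prediction, compared only where boundary == 1
    matte:    final full-resolution alpha matte
    """
    # Term 1: the downsampled matte should match the coarse semantic map.
    coarse = downsample(matte, factor)
    sem_terms = [
        (c - s) ** 2
        for coarse_row, sem_row in zip(coarse, semantic)
        for c, s in zip(coarse_row, sem_row)
    ]
    sem_loss = 0.5 * sum(sem_terms) / len(sem_terms)

    # Term 2: the matte should match the detail map inside the boundary region.
    det_sum = 0.0
    boundary_pixels = 0.0
    for m_row, d_row, b_row in zip(matte, detail, boundary):
        for m, d, b in zip(m_row, d_row, b_row):
            det_sum += b * abs(m - d)
            boundary_pixels += b
    det_loss = det_sum / max(boundary_pixels, 1.0)
    return sem_loss + det_loss


# Toy 2x2 example: the same matte with an agreeing and a disagreeing semantic map.
matte = [[1.0, 1.0], [0.0, 0.0]]
boundary = [[1.0, 1.0], [1.0, 1.0]]
agree = consistency_loss([[0.5]], matte, matte, boundary, factor=2)
disagree = consistency_loss([[1.0]], matte, matte, boundary, factor=2)
print(agree, disagree)
```

No ground-truth mask appears anywhere in the loss, which is what makes this kind of fine-tuning usable on unlabeled anime images.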

@Zarxrax

Zarxrax commented Jun 21, 2023

> I mean semi-supervised learning, in which the images are unlabeled. For anime images, it is difficult to obtain a large amount of annotated data, which is why semi-supervised learning is needed.

I spent the past few months building a new dataset which I was hoping could provide better training for anime style images. I haven't tried training it yet though, because I wanted to make some big modifications to how the dataset generator code worked. Just haven't had time to start writing code yet.

Semi-supervised training sounds interesting, but wouldn't it only be as good as the model used to segment the images?

@narugo1992
Author

@Zarxrax Regarding semi-supervised training, my understanding is as follows. Consider these scenarios:

  • A: Supervised training using 1k labeled images.
  • B: Supervised training using 10k labeled images.
  • C: Semi-supervised training using 1k labeled images and 9k unlabeled images.

Scenario C will likely perform better than scenario A, though it may not surpass scenario B.

Moreover, in image segmentation tasks, obtaining precise labeled data is extremely costly, making it challenging to acquire a large amount of real segmentation training data. The approach suggested by @SkyTNT, which involves generating synthetic training data, is a compromise. However, the generated images still exhibit significant differences compared to real anime images, leaving room for improvement. In fact, based on my actual testing results, the models trained using this method are only marginally usable, particularly with heavy networks like isnetis, and there are still many issues to address.

In practical applications, semi-supervised learning often helps models better learn from real-world data without significantly increasing data annotation costs. Therefore, I believe that semi-supervised learning is necessary for the task of anime character segmentation, considering the limited availability of precisely labeled data and the potential for improved performance in real-world scenarios.
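
Scenario C can be sketched as a combined objective: a supervised loss on the labeled pixels plus a down-weighted loss on unlabeled pixels, using only pseudo-labels that a teacher model is confident about. This is generic confidence-thresholded pseudo-labeling (self-training), named here as one common option; it is not something this repository implements. The thresholds, the weight, and all function names are assumptions.

```python
import math
from typing import List, Optional, Sequence


def bce(prob: float, target: float, eps: float = 1e-7) -> float:
    """Binary cross-entropy for a single pixel probability."""
    p = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def pseudo_labels(teacher_probs: Sequence[float],
                  low: float = 0.1, high: float = 0.9) -> List[Optional[float]]:
    """Confident pixels get a hard label; uncertain pixels get None (ignored)."""
    labels: List[Optional[float]] = []
    for p in teacher_probs:
        if p >= high:
            labels.append(1.0)
        elif p <= low:
            labels.append(0.0)
        else:
            labels.append(None)
    return labels


def semi_supervised_loss(labeled_probs: Sequence[float],
                         labeled_targets: Sequence[float],
                         unlabeled_probs: Sequence[float],
                         teacher_probs: Sequence[float],
                         unlabeled_weight: float = 0.5) -> float:
    """Supervised BCE plus weighted BCE against confident pseudo-labels."""
    supervised = sum(
        bce(p, t) for p, t in zip(labeled_probs, labeled_targets)
    ) / len(labeled_probs)

    targets = pseudo_labels(teacher_probs)
    kept = [(p, t) for p, t in zip(unlabeled_probs, targets) if t is not None]
    # If the teacher is confident about nothing, the unlabeled term vanishes.
    unsupervised = sum(bce(p, t) for p, t in kept) / len(kept) if kept else 0.0
    return supervised + unlabeled_weight * unsupervised


# Flattened pixel probabilities for a tiny example.
loss = semi_supervised_loss(
    labeled_probs=[0.8, 0.3],
    labeled_targets=[1.0, 0.0],
    unlabeled_probs=[0.7, 0.5, 0.2],
    teacher_probs=[0.95, 0.5, 0.05],
)
print(loss)
```

Ignoring uncertain pixels is one way to limit how much the teacher's mistakes propagate, which speaks to the concern that the result can only be as good as the model producing the masks.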
