Instance segmentation examples #31084

Merged 39 commits on May 31, 2024
Changes from 28 commits
Commits
9652ee2
Initial setup
qubvel May 16, 2024
2a68a73
Metrics
qubvel May 17, 2024
ffb000f
Overfit on two batches
qubvel May 20, 2024
0bda3ea
Train 40 epochs
qubvel May 20, 2024
55ebde0
Memory leak debugging
qubvel May 21, 2024
bdb07c3
Trainer fine-tuning
qubvel May 26, 2024
1088e8f
Draft
qubvel May 26, 2024
d42a1fa
Fixup
qubvel May 26, 2024
67f9890
Trained end-to-end
qubvel May 27, 2024
5e39a88
Add requirements
qubvel May 27, 2024
d5f6b58
Rewrite evaluator
qubvel May 27, 2024
71123e6
nits
qubvel May 27, 2024
b537ba6
Add readme
qubvel May 27, 2024
13e56fc
Add instance-segmentation to the table
qubvel May 27, 2024
edb51ed
Support void masks
qubvel May 27, 2024
25bdc13
Remove sh
qubvel May 27, 2024
b3e64f5
Update docs
qubvel May 27, 2024
d44dd69
Add pytorch test
qubvel May 27, 2024
781ccc2
Add accelerate test
qubvel May 27, 2024
414c1b5
Update examples/pytorch/instance-segmentation/README.md
qubvel May 28, 2024
bb40d46
Update examples/pytorch/instance-segmentation/run_instance_segmentati…
qubvel May 28, 2024
b175acb
Update examples/pytorch/instance-segmentation/run_instance_segmentati…
qubvel May 28, 2024
fa7dd21
Update examples/pytorch/instance-segmentation/run_instance_segmentati…
qubvel May 28, 2024
e1eb7b3
Update examples/pytorch/instance-segmentation/run_instance_segmentati…
qubvel May 28, 2024
9372fe9
Fix consistency oneformer
qubvel May 28, 2024
d55b2b7
Merge branch 'instance-segmentation-examples' of github.com:qubvel/tr…
qubvel May 28, 2024
869f8f9
Fix imports
qubvel May 28, 2024
b7b9c36
Fix imports sort
qubvel May 28, 2024
6e40817
Apply suggestions from code review
qubvel May 29, 2024
5d7ad92
Update examples/pytorch/instance-segmentation/run_instance_segmentati…
qubvel May 30, 2024
b22246b
Add resources to docs
qubvel May 30, 2024
4d92577
Merge branch 'instance-segmentation-examples' of github.com:qubvel/tr…
qubvel May 31, 2024
d00a8fd
Update examples/pytorch/instance-segmentation/README.md
qubvel May 31, 2024
d734a6c
Update examples/pytorch/instance-segmentation/README.md
qubvel May 31, 2024
0a916d9
Remove explicit model_type argument
qubvel May 31, 2024
1caa2d9
Fix tests
qubvel May 31, 2024
27256f5
Update readme
qubvel May 31, 2024
d0fccce
Merge branch 'instance-segmentation-examples' of github.com:qubvel/tr…
qubvel May 31, 2024
f5ba398
Note about other models
qubvel May 31, 2024
1 change: 1 addition & 0 deletions examples/pytorch/README.md
@@ -47,6 +47,7 @@ Coming soon!
| [**`image-classification`**](https://github.com/huggingface/transformers/tree/main/examples/pytorch/image-classification) | [CIFAR-10](https://huggingface.co/datasets/cifar10) | ✅ | ✅ |✅ | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/image_classification.ipynb)
| [**`semantic-segmentation`**](https://github.com/huggingface/transformers/tree/main/examples/pytorch/semantic-segmentation) | [SCENE_PARSE_150](https://huggingface.co/datasets/scene_parse_150) | ✅ | ✅ |✅ | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/semantic_segmentation.ipynb)
| [**`object-detection`**](https://github.com/huggingface/transformers/tree/main/examples/pytorch/object-detection) | [CPPE-5](https://huggingface.co/datasets/cppe-5) | ✅ | ✅ |✅ | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/transformers_doc/en/pytorch/object_detection.ipynb)
| [**`instance-segmentation`**](https://github.com/huggingface/transformers/tree/main/examples/pytorch/instance-segmentation) | [ADE20K sample](https://huggingface.co/datasets/qubvel-hf/ade20k-mini) | ✅ | ✅ |✅ |


## Running quick tests
239 changes: 239 additions & 0 deletions examples/pytorch/instance-segmentation/README.md
@@ -0,0 +1,239 @@
<!---
Copyright 2024 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

# Instance segmentation examples

This directory contains 2 scripts that showcase how to fine-tune [MaskFormer](https://huggingface.co/docs/transformers/model_doc/maskformer) and [Mask2Former](https://huggingface.co/docs/transformers/model_doc/mask2former) for instance segmentation using PyTorch.

Content:
* [PyTorch version with Trainer](#pytorch-version-trainer)
* [PyTorch version with Accelerate](#pytorch-version-no-trainer)
* [Reload and perform inference](#reload-and-perform-inference)
* [Note on custom data](#note-on-custom-data)


## PyTorch version, Trainer

Based on the script [`run_instance_segmentation.py`](https://github.com/huggingface/transformers/blob/main/examples/pytorch/instance-segmentation/run_instance_segmentation.py).

The script leverages the [🤗 Trainer API](https://huggingface.co/docs/transformers/main_classes/trainer) to automatically take care of the training for you, running on distributed environments right away.

Here we show how to fine-tune a [Mask2Former](https://huggingface.co/docs/transformers/model_doc/mask2former) model on a subsample of the [ADE20K](https://huggingface.co/datasets/zhoubolei/scene_parse_150) dataset. For this example we created a [small dataset](https://huggingface.co/datasets/qubvel-hf/ade20k-mini) with ~2k images that contain only "person" and "car" annotations; all other pixels are marked as "background".

Here is how `label2id` looks for this dataset:
```python
label2id = {
    "background": 0,
    "person": 1,
    "car": 2,
}
```

Since the `background` label is not an instance and we don't want to predict it, we will specify `do_reduce_labels` to remove it from the data.
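
To make the effect of label reduction concrete, here is a minimal pure-Python sketch on a toy semantic map. It assumes the usual convention where the background id `0` is replaced by an ignore value and the remaining ids are shifted down by one; the `IGNORE_INDEX` value of 255 and the helper name `reduce_labels` are assumptions for illustration, not code taken from the script.

```python
IGNORE_INDEX = 255  # assumption: value used to mark pixels excluded from training


def reduce_labels(label_map):
    """Replace background (0) with IGNORE_INDEX and shift other ids down by one."""
    return [
        [IGNORE_INDEX if value in (0, IGNORE_INDEX) else value - 1 for value in row]
        for row in label_map
    ]


label2id = {"background": 0, "person": 1, "car": 2}

# Toy 3x4 semantic map using the ids from `label2id`
semantic_map = [
    [0, 0, 1, 1],
    [0, 2, 2, 1],
    [0, 2, 2, 0],
]

reduced_map = reduce_labels(semantic_map)

# The label mapping shrinks accordingly: background is dropped, the rest shift by one
reduced_label2id = {name: idx - 1 for name, idx in label2id.items() if idx != 0}

print(reduced_map)
print(reduced_label2id)
```

After reduction, the model only has to predict "person" and "car", and background pixels no longer belong to any class.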

You can run the training with the following command:

```bash
python run_instance_segmentation.py \
    --model_type mask2former \
    --model_name_or_path facebook/mask2former-swin-tiny-coco-instance \
    --output_dir finetune-instance-segmentation-ade20k-mini-mask2former \
    --dataset_name qubvel-hf/ade20k-mini \
    --do_reduce_labels \
    --image_height 256 \
    --image_width 256 \
    --do_train \
    --fp16 \
    --num_train_epochs 40 \
    --learning_rate 1e-5 \
    --lr_scheduler_type constant \
    --per_device_train_batch_size 8 \
    --gradient_accumulation_steps 2 \
    --dataloader_num_workers 8 \
    --dataloader_persistent_workers \
    --dataloader_prefetch_factor 4 \
    --do_eval \
    --evaluation_strategy epoch \
    --logging_strategy epoch \
    --save_strategy epoch \
    --save_total_limit 2 \
    --push_to_hub
```

The resulting model can be seen here: https://huggingface.co/qubvel-hf/finetune-instance-segmentation-ade20k-mini-mask2former. Note that it's always advised to check the original paper for details on training hyperparameters. The hyperparameters for the current example were not tuned. To improve model quality you could try:
- changing image size parameters (`--image_height`/`--image_width`)
- changing training parameters, such as learning rate, batch size, warmup, optimizer and many more (see [TrainingArguments](https://huggingface.co/docs/transformers/main_classes/trainer#transformers.TrainingArguments))
- adding more image augmentations (we created a helpful [HF Space](https://huggingface.co/spaces/qubvel-hf/albumentations-demo) to choose some)
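
When changing the batch size or learning rate, it helps to know the effective batch size implied by the command above. The sketch below does that arithmetic for `--per_device_train_batch_size 8` and `--gradient_accumulation_steps 2`. The number of devices (1) and the training split size (2000 images) are assumptions for illustration, and the step count is approximate because the exact rounding depends on the training loop.

```python
import math


def effective_batch_size(per_device_batch_size, gradient_accumulation_steps, num_devices=1):
    """Number of images that contribute to a single optimizer step."""
    return per_device_batch_size * gradient_accumulation_steps * num_devices


def optimizer_steps_per_epoch(num_train_images, batch_size):
    """Approximate number of optimizer steps needed to see every image once."""
    return math.ceil(num_train_images / batch_size)


# Values from the command above; num_devices=1 is an assumption
batch = effective_batch_size(per_device_batch_size=8, gradient_accumulation_steps=2, num_devices=1)

# Hypothetical training split size, used only to illustrate the arithmetic
steps_per_epoch = optimizer_steps_per_epoch(num_train_images=2000, batch_size=batch)
total_steps = steps_per_epoch * 40  # --num_train_epochs 40

print(batch, steps_per_epoch, total_steps)
```

If you add GPUs or change the accumulation steps, the effective batch size changes too, so you may want to revisit the learning rate.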

Note that you can also fine-tune MaskFormer by changing the model type (`--model_type maskformer`) and choosing a MaskFormer [checkpoint](https://huggingface.co/models?search=maskformer).


## PyTorch version, no Trainer

Based on the script [`run_instance_segmentation_no_trainer.py`](https://github.com/huggingface/transformers/blob/main/examples/pytorch/instance-segmentation/run_instance_segmentation_no_trainer.py).

The script leverages [🤗 `Accelerate`](https://github.com/huggingface/accelerate), which lets you write your own training loop in PyTorch and run it unchanged on any (distributed) environment, including CPU, multi-CPU, GPU, multi-GPU and TPU. It also supports mixed precision.

First, run:

```bash
accelerate config
```

and answer the questions about the environment on which you'd like to train. Then run

```bash
accelerate test
```

to check that everything is ready for training. Finally, you can launch training with

```bash
accelerate launch run_instance_segmentation_no_trainer.py \
    --model_type mask2former \
    --model_name_or_path facebook/mask2former-swin-tiny-coco-instance \
    --output_dir finetune-instance-segmentation-ade20k-mini-mask2former-no-trainer \
    --dataset_name qubvel-hf/ade20k-mini \
    --do_reduce_labels \
    --image_height 256 \
    --image_width 256 \
    --num_train_epochs 40 \
    --learning_rate 1e-5 \
    --lr_scheduler_type constant \
    --per_device_train_batch_size 8 \
    --gradient_accumulation_steps 2 \
    --dataloader_num_workers 8 \
    --push_to_hub
```

and boom, you're training, possibly on multiple GPUs, logging everything to all trackers found in your environment (like Weights and Biases, Tensorboard) and regularly pushing your model to the hub (with the repo name being equal to `args.output_dir` at your HF username) 🤗

With the default settings, the script fine-tunes a [Mask2Former](https://huggingface.co/docs/transformers/model_doc/mask2former) model on a sample of the [ADE20K](https://huggingface.co/datasets/qubvel-hf/ade20k-mini) dataset. The resulting model can be seen here: https://huggingface.co/qubvel-hf/finetune-instance-segmentation-ade20k-mini-mask2former-no-trainer.


## Reload and perform inference

After training, you can load your fine-tuned model and perform inference as follows:

```python
import torch
import requests
import matplotlib.pyplot as plt

from PIL import Image
from transformers import Mask2FormerForUniversalSegmentation, Mask2FormerImageProcessor


# Load image
image = Image.open(requests.get("http://farm4.staticflickr.com/3017/3071497290_31f0393363_z.jpg", stream=True).raw)

# Load model and image processor
device = "cuda"
checkpoint = "qubvel-hf/finetune-instance-segmentation-ade20k-mini-mask2former"

model = Mask2FormerForUniversalSegmentation.from_pretrained(checkpoint, device_map=device)
image_processor = Mask2FormerImageProcessor.from_pretrained(checkpoint)

# Run inference on image
inputs = image_processor(images=[image], return_tensors="pt").to(device)
with torch.no_grad():
    outputs = model(**inputs)

# Post process outputs
outputs = image_processor.post_process_instance_segmentation(outputs, target_sizes=[image.size[::-1]])

print("Mask shape: ", outputs[0]["segmentation"].shape)
print("Mask values: ", outputs[0]["segmentation"].unique())
for segment in outputs[0]["segments_info"]:
    print("Segment: ", segment)
```

```
Mask shape: torch.Size([427, 640])
Mask values: tensor([-1., 0., 1., 2., 3., 4., 5., 6.])
Segment: {'id': 0, 'label_id': 0, 'was_fused': False, 'score': 0.946127}
Segment: {'id': 1, 'label_id': 1, 'was_fused': False, 'score': 0.961582}
Segment: {'id': 2, 'label_id': 1, 'was_fused': False, 'score': 0.968367}
Segment: {'id': 3, 'label_id': 1, 'was_fused': False, 'score': 0.819527}
Segment: {'id': 4, 'label_id': 1, 'was_fused': False, 'score': 0.655761}
Segment: {'id': 5, 'label_id': 1, 'was_fused': False, 'score': 0.531299}
Segment: {'id': 6, 'label_id': 1, 'was_fused': False, 'score': 0.929477}
```
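
The `segments_info` entries can be turned into something more readable by mapping `label_id` to a class name and dropping low-confidence segments. The sketch below reuses the scores printed above. The `id2label` mapping is an assumption based on `label2id` with the background removed by `--do_reduce_labels`, and `filter_segments` is a hypothetical helper, not part of the library.

```python
# Assumption: class ids after background removal (person -> 0, car -> 1)
id2label = {0: "person", 1: "car"}

# Segments copied from the output above
segments_info = [
    {"id": 0, "label_id": 0, "was_fused": False, "score": 0.946127},
    {"id": 1, "label_id": 1, "was_fused": False, "score": 0.961582},
    {"id": 2, "label_id": 1, "was_fused": False, "score": 0.968367},
    {"id": 3, "label_id": 1, "was_fused": False, "score": 0.819527},
    {"id": 4, "label_id": 1, "was_fused": False, "score": 0.655761},
    {"id": 5, "label_id": 1, "was_fused": False, "score": 0.531299},
    {"id": 6, "label_id": 1, "was_fused": False, "score": 0.929477},
]


def filter_segments(segments_info, id2label, score_threshold=0.8):
    """Keep segments at or above the threshold and attach readable class names."""
    return [
        {"id": s["id"], "label": id2label[s["label_id"]], "score": s["score"]}
        for s in segments_info
        if s["score"] >= score_threshold
    ]


confident_segments = filter_segments(segments_info, id2label, score_threshold=0.8)
for segment in confident_segments:
    print(segment)
```

The 0.8 threshold is arbitrary; pick a value that suits your precision and recall needs.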

Visualize the results with the following code:
```python
import numpy as np
import matplotlib.pyplot as plt

# Move the tensor to CPU first, since the model above runs on "cuda"
segmentation = outputs[0]["segmentation"].cpu().numpy()

plt.figure(figsize=(10, 10))
plt.subplot(1, 2, 1)
plt.imshow(np.array(image))
plt.axis("off")
plt.subplot(1, 2, 2)
plt.imshow(segmentation)
plt.axis("off")
plt.show()
```
![Result](https://i.imgur.com/rZmaRjD.png)
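
If you need one binary mask per instance rather than a single id map, the segmentation map can be split by instance id. The following pure-Python sketch shows the idea on a toy map. It assumes that `-1` marks pixels not assigned to any segment, as suggested by the mask values printed above; `instance_masks` is a hypothetical helper.

```python
def instance_masks(segmentation):
    """Split an instance id map into one binary mask per instance id, skipping -1."""
    ids = sorted({value for row in segmentation for value in row if value != -1})
    return {
        instance_id: [[1 if value == instance_id else 0 for value in row] for row in segmentation]
        for instance_id in ids
    }


# Toy 3x4 instance map; -1 means "no instance" (assumption)
toy_segmentation = [
    [-1, 0, 0, -1],
    [1, 1, 0, -1],
    [1, 1, -1, 2],
]

masks = instance_masks(toy_segmentation)

# Pixel area of each instance
areas = {instance_id: sum(map(sum, mask)) for instance_id, mask in masks.items()}
print(areas)

# With the real tensor, the equivalent of one mask would be: segmentation == instance_id
```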


## Note on custom data

Here is a short script demonstrating how to create your own dataset for instance segmentation and push it to the hub:

> Note: Annotations should be represented as 3-channel images (similar to the [scene_parse_150](https://huggingface.co/datasets/zhoubolei/scene_parse_150#instance_segmentation-1) dataset). The first channel is the semantic segmentation map, with values corresponding to `label2id`. The second channel is the instance segmentation map, where each instance has a unique value. The third channel should be empty (filled with zeros).
Contributor

Isn't this super complicated for people to create their own instance segmentation datasets? In the end they probably just want to have binary masks, indicating with 1's where the instance is and 0 where background is. Shouldn't we support this format as well?

Member Author

That's a fair point, I feel that your suggested way is more common.
But then they have to provide a class-id/name for each binary mask.

I think the best way would be to support various formats which provide different annotation platforms. On the other hand, usually platforms allow to export in different formats. Probably we can select several most common (like coco) and support them across segmentation, detection, instance segmentation.


```python
from datasets import Dataset, DatasetDict
from datasets import Image as DatasetImage

label2id = {
    "background": 0,
    "person": 1,
    "car": 2,
}

# Placeholders: replace each <PIL Image ...> entry with a real PIL image before running
train_split = {
    "image": [<PIL Image 1>, <PIL Image 2>, <PIL Image 3>, ...],
    "annotation": [<PIL Image ann 1>, <PIL Image ann 2>, <PIL Image ann 3>, ...],
}

validation_split = {
    "image": [<PIL Image 101>, <PIL Image 102>, <PIL Image 103>, ...],
    "annotation": [<PIL Image ann 101>, <PIL Image ann 102>, <PIL Image ann 103>, ...],
}

def create_instance_segmentation_dataset(label2id, **splits):
    dataset_dict = {}
    for split_name, split in splits.items():
        # Store the label mapping alongside every example of the split
        split["semantic_class_to_id"] = [label2id] * len(split["image"])
        dataset_split = (
            Dataset.from_dict(split)
            .cast_column("image", DatasetImage())
            .cast_column("annotation", DatasetImage())
        )
        dataset_dict[split_name] = dataset_split
    return DatasetDict(dataset_dict)

dataset = create_instance_segmentation_dataset(label2id, train=train_split, validation=validation_split)
dataset.push_to_hub("qubvel-hf/ade20k-nano")
```
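
If your annotations are stored as one binary mask per instance together with a class name, they can be converted to the 3-channel format described in the note above. The sketch below is one possible way to do it in pure Python. The helper name `masks_to_annotation` and the choice to number instances from 1 (so that 0 stays free for background) are assumptions, not part of the example scripts.

```python
def masks_to_annotation(instances, label2id, height, width):
    """Build a height x width x 3 annotation from (class_name, binary_mask) pairs.

    Channel 0: semantic id from `label2id`.
    Channel 1: instance id, unique per instance (starting at 1, an assumption).
    Channel 2: left empty (zeros).
    """
    annotation = [[[0, 0, 0] for _ in range(width)] for _ in range(height)]
    for instance_id, (class_name, mask) in enumerate(instances, start=1):
        for y in range(height):
            for x in range(width):
                if mask[y][x]:
                    annotation[y][x][0] = label2id[class_name]
                    annotation[y][x][1] = instance_id
    return annotation


label2id = {"background": 0, "person": 1, "car": 2}

# Two toy 2x3 binary masks, one per instance
person_mask = [[1, 1, 0], [0, 0, 0]]
car_mask = [[0, 0, 0], [0, 1, 1]]

annotation = masks_to_annotation(
    [("person", person_mask), ("car", car_mask)], label2id, height=2, width=3
)
print(annotation)

# To obtain a PIL image for the dataset, one option (requires numpy and Pillow) is:
#   Image.fromarray(np.array(annotation, dtype=np.uint8))
# Note that uint8 limits each image to 255 distinct instance ids.
```

Where masks overlap, later instances overwrite earlier ones in this sketch, so order the instances accordingly.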

Then, you could use this dataset for fine-tuning by specifying its name with `--dataset_name <your_dataset>`.

See also: [Dataset Creation Guide](https://huggingface.co/docs/datasets/image_dataset#create-an-image-dataset)
5 changes: 5 additions & 0 deletions examples/pytorch/instance-segmentation/requirements.txt
@@ -0,0 +1,5 @@
albumentations >= 1.4.5
timm
datasets
torchmetrics
pycocotools