
Commit c80942f

committed Jun 23, 2020
add scripts
1 parent 767e100 commit c80942f

544 files changed (+64,126 / −342 lines)


LICENSE (−21 lines)

This file was deleted.

README.md (+74 / −42 lines)
@@ -1,91 +1,123 @@
# Self Correction for Human Parsing

-An out-of-box human parsing representation extractor. Also the 3rd LIP challenge winner solution!
+![Python 3.6](https://img.shields.io/badge/python-3.6-green.svg)
+[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)

-![lip-visualization](./img/lip-visualization.jpg)
+An out-of-box human parsing representation extractor.

-At this time, we provide the trained models on three popular human parsing datasets that achieve the state-of-the-art performance. We hope our work could serve as a basic human parsing representation extractor and facilitate your own tasks, e.g. Fashion AI, Person Re-Identification, Virtual Reality, Virtual Try-on, Human Analysis and so on.
+Our solution ranks 1st in all human parsing tracks (single, multiple, and video) of the third LIP challenge!

-## Citation
-
-Please cite our work if you find this repo useful in your research.
+![lip-visualization](./demo/lip-visualization.jpg)

-```latex
-@article{li2019self,
-  title={Self-Correction for Human Parsing},
-  author={Li, Peike and Xu, Yunqiu and Wei, Yunchao and Yang, Yi},
-  journal={arXiv preprint arXiv:1910.09777},
-  year={2019}
-}
-```
-
-## TODO List
-
-- [x] Inference code on three popular single person human parsing datasets.
-- [ ] Training code
-- [ ] Extension on multi-person and video human parsing tasks.
-
-Coming Soon! Stay tuned!
+Features:
+- [x] Out-of-box human parsing extractor for other downstream applications.
+- [x] Pretrained models on three popular single-person human parsing datasets.
+- [x] Training and inference code.
+- [x] Simple yet effective extension to multi-person and video human parsing tasks.

## Requirements

```
-Python >= 3.5, PyTorch >= 0.4
+Python >= 3.6, PyTorch >= 1.0
```

-## Trained models
+## Simple Out-of-Box Extractor

-The easiest way to get started is to use our trained SCHP models on your own images to extract human parsing representations. Here we provided trained models on three popular datasets. Theses three datasets have different label system, you can choose the best one to fit on your own task.
+The easiest way to get started is to use our trained SCHP models on your own images to extract human parsing representations. Here we provide state-of-the-art [trained models](https://drive.google.com/drive/folders/1uOaQCpNtosIjEL2phQKEdiYd0Td18jNo?usp=sharing) on three popular datasets. These three datasets have different label systems, so choose the one that best fits your task.

-**LIP** ([exp-schp-201908261155-lip.pth](https://drive.google.com/file/d/1ZrTiadzAOM332d896fw7JZQ2lWALedDB/view?usp=sharing))
+**LIP** ([exp-schp-201908261155-lip.pth](https://drive.google.com/file/d/1k4dllHpu0bdx38J7H28rVVLpU-kOHmnH/view?usp=sharing))

* mIoU on LIP validation: **59.36%**.

* LIP is the largest single-person human parsing dataset, with 50,000+ images, and focuses on complicated real-world scenarios. LIP has 20 labels: 'Background', 'Hat', 'Hair', 'Glove', 'Sunglasses', 'Upper-clothes', 'Dress', 'Coat', 'Socks', 'Pants', 'Jumpsuits', 'Scarf', 'Skirt', 'Face', 'Left-arm', 'Right-arm', 'Left-leg', 'Right-leg', 'Left-shoe', 'Right-shoe'.

-**ATR** ([exp-schp-201908301523-atr.pth](https://drive.google.com/file/d/1klCtqx51orBkFKdkvYwM4qao_vEFbJ_z/view?usp=sharing))
+**ATR** ([exp-schp-201908301523-atr.pth](https://drive.google.com/file/d/1ruJg4lqR_jgQPj-9K0PP-L2vJERYOxLP/view?usp=sharing))

* mIoU on ATR test: **82.29%**.

* ATR is a large single-person human parsing dataset, with 17,000+ images, and focuses on fashion AI. ATR has 18 labels: 'Background', 'Hat', 'Hair', 'Sunglasses', 'Upper-clothes', 'Skirt', 'Pants', 'Dress', 'Belt', 'Left-shoe', 'Right-shoe', 'Face', 'Left-leg', 'Right-leg', 'Left-arm', 'Right-arm', 'Bag', 'Scarf'.

-**Pascal-Person-Part** ([exp-schp-201908270938-pascal-person-part.pth](https://drive.google.com/file/d/13ph1AloYNiC4DIGOyCLZdmA08tP9OeGu/view?usp=sharing))
+**Pascal-Person-Part** ([exp-schp-201908270938-pascal-person-part.pth](https://drive.google.com/file/d/1E5YwNKW2VOEayK9mWCS3Kpsxf-3z04ZE/view?usp=sharing))

* mIoU on Pascal-Person-Part validation: **71.46%**.

* Pascal-Person-Part is a tiny single-person human parsing dataset, with 3,000+ images, and focuses on body-part segmentation. Pascal-Person-Part has 7 labels: 'Background', 'Head', 'Torso', 'Upper Arms', 'Lower Arms', 'Upper Legs', 'Lower Legs'.

Choose one and have fun on your own task!

-## Inference
+To extract the human parsing representation, simply put your own images in the `INPUT_PATH` folder, download a pretrained model, and run the following command. The output images with the same file names will be saved in `OUTPUT_PATH`.
+
+```
+python simple_extractor.py --dataset [DATASET] --model-restore [CHECKPOINT_PATH] --input-dir [INPUT_PATH] --output-dir [OUTPUT_PATH]
+```
+
+The `DATASET` argument has three options: 'lip', 'atr' and 'pascal'. Note that each pixel in the output images denotes the predicted label number. The output images have the same size as the input ones. For easier visualization, a palette is attached to the output images; we suggest reading them with `PIL`.
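As a rough sketch of how such an output could be consumed with `PIL` (file names and paths here are illustrative, and the label order is the LIP list above, starting at index 0 for 'Background'):

```python
import numpy as np
from PIL import Image

# Label order for the LIP model, following the list above (index 0 = 'Background').
LIP_LABELS = [
    'Background', 'Hat', 'Hair', 'Glove', 'Sunglasses', 'Upper-clothes', 'Dress',
    'Coat', 'Socks', 'Pants', 'Jumpsuits', 'Scarf', 'Skirt', 'Face', 'Left-arm',
    'Right-arm', 'Left-leg', 'Right-leg', 'Left-shoe', 'Right-shoe',
]

# Hypothetical output file for an input image named demo.jpg.
parsing = Image.open('OUTPUT_PATH/demo.png')

labels = np.array(parsing)                 # H x W array of integer label ids
present = [LIP_LABELS[i] for i in np.unique(labels)]
print('labels present in this image:', present)

# If the PNG is palettized ('P' mode), converting to RGB applies the attached
# palette and yields the color-coded visualization.
parsing.convert('RGB').save('demo_vis.png')
```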

+If you need not only the final parsing maps but also the feature-map representations, add the `--logits` flag to save the output feature maps. These feature maps are the logits before the softmax layer.
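A minimal sketch of inspecting such saved logits, assuming they are written as per-image NumPy `.npy` files of shape H x W x C (the exact file naming and format depend on `simple_extractor.py`):

```python
import numpy as np

# Assumed: --logits writes one .npy file per input image (hypothetical path below).
logits = np.load('OUTPUT_PATH/demo.npy')        # expected shape: (H, W, num_classes)

# Numerically stable per-pixel softmax over the class dimension.
z = logits - logits.max(axis=-1, keepdims=True)
probs = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)

pred = probs.argmax(axis=-1)                    # H x W label map
confidence = probs.max(axis=-1)                 # per-pixel confidence of the predicted label
print(pred.shape, float(confidence.mean()))
```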

+## Dataset Preparation
+
+Please download the [LIP](http://sysu-hcp.net/lip/) dataset and arrange it in the structure below.

-To extract the human parsing representation, simply put your own image in the `Input_Directory`, download a pretrained model and run the following command. The output images with the same file name will be saved in `Output_Directory`
+```commandline
+data/LIP
+|--- train_images         # 30462 training single-person images
+|--- val_images           # 10000 validation single-person images
+|--- train_segmentations  # 30462 training annotations
+|--- val_segmentations    # 10000 validation annotations
+|--- train_id.txt         # training image list
+|--- val_id.txt           # validation image list
+```
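If the image lists need to be regenerated, a small helper along these lines can do it, assuming `train_id.txt` / `val_id.txt` simply list one image id (file name without extension) per line:

```python
import os

def write_id_list(image_dir: str, out_path: str) -> None:
    """Write one image id (file name without extension) per line, sorted."""
    ids = sorted(os.path.splitext(f)[0] for f in os.listdir(image_dir)
                 if f.lower().endswith(('.jpg', '.png')))
    with open(out_path, 'w') as fp:
        fp.write('\n'.join(ids) + '\n')

write_id_list('data/LIP/train_images', 'data/LIP/train_id.txt')
write_id_list('data/LIP/val_images', 'data/LIP/val_id.txt')
```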
+## Training
+
+```
+python train.py
+```

+## Evaluation
```
-python evaluate.py --dataset Dataset --restore-weight Checkpoint_Path --input Input_Directory --output Output_Directory
+python evaluate.py --model-restore [CHECKPOINT_PATH]
```

-The `Dataset` command has three options, including 'lip', 'atr' and 'pascal'. Note each pixel in the output images denotes the predicted label number. The output images have the same size as the input ones. To better visualization, we put a palette with the output images. We suggest you to read the image with `PIL`.
+## Extension on Multiple Human Parsing

-If you need not only the final parsing image, but also a feature map representation. Add `--logits` command to save the output feature map. This feature map is the logits before softmax layer with the dimension of HxWxC.
+Please read [MultipleHumanParsing.md](./mhp_extension/README.md) for more details.

+## Citation
+
+Please cite our work if you find this repo useful in your research.
+
+```latex
+@article{li2019self,
+  title={Self-Correction for Human Parsing},
+  author={Li, Peike and Xu, Yunqiu and Wei, Yunchao and Yang, Yi},
+  journal={arXiv preprint arXiv:1910.09777},
+  year={2019}
+}
+```

## Visualization

* Source Image.
-![demo](./input/demo.jpg)
-
+![demo](./demo/demo.jpg)
* LIP Parsing Result.
-![demo-lip](./output/demo_lip.png)
-
+![demo-lip](./demo/demo_lip.png)
* ATR Parsing Result.
-![demo-atr](./output/demo_atr.png)
-
+![demo-atr](./demo/demo_atr.png)
* Pascal-Person-Part Parsing Result.
-![demo-pascal](./output/demo_pascal.png)
+![demo-pascal](./demo/demo_pascal.png)
+* Source Image.
+![demo](./mhp_extension/demo/demo.jpg)
+* Instance Human Mask.
+![demo-lip](./mhp_extension/demo/demo_instance_human_mask.png)
+* Global Human Parsing Result.
+![demo-lip](./mhp_extension/demo/demo_global_human_parsing.png)
+* Multiple Human Parsing Result.
+![demo-lip](./mhp_extension/demo/demo_multiple_human_parsing.png)


## Related
+Our code adopts [InplaceSyncBN](https://github.com/mapillary/inplace_abn) to reduce GPU memory cost.

-There is also a [PaddlePaddle](https://github.com/PaddlePaddle/PaddleSeg/tree/master/contrib/ACE2P) Implementation.
-This implementation is the version that we submitted to the 3rd LIP Challenge.
+There is also a [PaddlePaddle](https://github.com/PaddlePaddle/PaddleSeg/tree/develop/contrib/ACE2P) implementation of this project.
