Commit

Merge pull request #27 from DefTruth/dev

Dev

DefTruth authored Mar 7, 2022
2 parents ec2ead2 + 1d19f51 commit de129d6
Showing 9 changed files with 186 additions and 100 deletions.
240 changes: 143 additions & 97 deletions README.md
@@ -12,30 +12,27 @@


## 🤗 Introduction
**torchlm** aims to build a high-level pipeline for face landmarks detection. It supports **100+ data augmentations**, **training** and **inference**, and can be easily installed with **pip**.
<div align='center'>
<img src='docs/res/605.jpg' height="100px" width="100px">
<img src='docs/res/802.jpg' height="100px" width="100px">
<img src='docs/res/92.jpg' height="100px" width="100px">
<img src='docs/res/234.jpg' height="100px" width="100px">
<img src='docs/res/906.jpg' height="100px" width="100px">
<img src='docs/res/825.jpg' height="100px" width="100px">
<img src='docs/res/388.jpg' height="100px" width="100px">
<br>
<img src='docs/res/2_wflw_44.jpg' height="100px" width="100px">
<img src='docs/res/2_wflw_67.jpg' height="100px" width="100px">
<img src='docs/res/2_wflw_76.jpg' height="100px" width="100px">
<img src='docs/res/2_wflw_162.jpg' height="100px" width="100px">
<img src='docs/res/2_wflw_229.jpg' height="100px" width="100px">
<img src='docs/res/2_wflw_440.jpg' height="100px" width="100px">
<img src='docs/res/2_wflw_478.jpg' height="100px" width="100px">
<img src='docs/assets/pipnet0.jpg' height="100px" width="100px">
<img src='docs/assets/pipnet_300W_CELEBA_model.gif' height="100px" width="100px">
<img src='docs/assets/pipnet_shaolin_soccer.gif' height="100px" width="100px">
<img src='docs/assets/pipnet_WFLW_model.gif' height="100px" width="100px">
</div>

<p align="center"> ❤️ Star 🌟👆🏻 this repo to support me if it does any helps to you, thanks ~ </p>

## 👋 Core Features
* High level pipeline for **training** and **inference**.
* Provides **30+** native landmarks data augmentations.
* Can **bind 80+** transforms from torchvision and albumentations with **one-line-code**.
* Supports awesome models for face landmarks detection, such as YOLOX, YOLOv5, ResNet, MobileNet, ShuffleNet and PIPNet, etc.

## 🆕 What's New
* [2022/03/08]: Add **PIPNet**: [Towards Efficient Facial Landmark Detection in the Wild, CVPR2021](https://github.com/jhb86253817/PIPNet)
* [2022/02/13]: Add **30+** native data augmentations and **bind** **80+** transforms from torchvision and albumentations.

## 🛠️ Usage
@@ -44,53 +41,56 @@
* opencv-python-headless>=4.5.2
* numpy>=1.14.4
* torch>=1.6.0
* torchvision>=0.8.0
* albumentations>=1.1.0
* onnx>=1.8.0
* onnxruntime>=1.7.0
* tqdm>=4.10.0

### Installation
You can install **torchlm** directly from [pypi](https://pypi.org/project/torchlm/). See the [NOTE](#torchlm-NOTE) below before installation!
```shell
pip3 install torchlm
# install from a specific pypi mirror with '-i'
pip3 install torchlm -i https://pypi.org/simple/
```
Or install from source if you want the latest torchlm, in editable mode with `-e`.
```shell
# clone the torchlm repository locally
git clone --depth=1 https://github.com/DefTruth/torchlm.git
cd torchlm
# install in editable mode
pip install -e .
```
<div id="torchlm-NOTE"></div>

**NOTE**: If you hit a conflict between different installed versions of opencv (opencv-python and opencv-python-headless; `albumentations` needs opencv-python-headless), please uninstall opencv-python and opencv-python-headless first, and then reinstall torchlm. See [albumentations#1139](https://github.com/albumentations-team/albumentations/issues/1139) for more details.

```shell
# first uninstall the conflicting opencv packages
pip uninstall opencv-python
pip uninstall opencv-python-headless
pip uninstall torchlm # if you have installed torchlm
# then reinstall torchlm
pip install torchlm # will also install deps, e.g. opencv
```

### 🌟🌟 Data Augmentation
**torchlm** provides **30+** native data augmentations for landmarks and can **bind** with **80+** transforms from torchvision and albumentations through the **torchlm.bind** method. Further, **torchlm.bind** provides a `prob` param at bind-level to force any transform or callable to behave as a random-style augmentation. The data augmentations in **torchlm** are `safe` and `simplest`: any transform operation that would push landmarks outside the image at runtime is automatically dropped, so the number of landmarks stays unchanged. The layout format of landmarks is `xy` with shape `(N, 2)`, where `N` denotes the number of input landmarks. No matter whether the input is an np.ndarray or a torch Tensor, **torchlm** will automatically handle the data type and wrap the result back to the original type through an **autodtype** wrapper.

* use the **30+** native transforms from **torchlm** directly
```python
import torchlm
transform = torchlm.LandmarksCompose([
    # use native torchlm transforms
    torchlm.LandmarksRandomScale(prob=0.5),
    torchlm.LandmarksRandomTranslate(prob=0.5),
    torchlm.LandmarksRandomShear(prob=0.5),
    torchlm.LandmarksRandomMask(prob=0.5),
    torchlm.LandmarksRandomBlur(kernel_range=(5, 25), prob=0.5),
    torchlm.LandmarksRandomBrightness(prob=0.),
    torchlm.LandmarksRandomRotate(40, prob=0.5, bins=8),
    torchlm.LandmarksRandomCenterCrop((0.5, 1.0), (0.5, 1.0), prob=0.5),
    # ...
])
```
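A composed pipeline is itself callable on an `(image, landmarks)` pair. Below is a minimal usage sketch; the image path and the randomly generated 98-point landmarks are placeholder assumptions, not values from this repository.
```python
import cv2
import numpy as np

# 'transform' is the pipeline composed above; inputs are an HWC uint8 image
# and landmarks in xy layout with shape (N, 2)
img = cv2.imread("./1.jpg")                            # placeholder path
h, w = img.shape[:2]
landmarks = np.random.rand(98, 2) * np.array([w, h])   # dummy (N, 2) landmarks
new_img, new_landmarks = transform(img, landmarks)
```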
<div align='center'>
<img src='docs/res/605.jpg' height="100px" width="100px">
<img src='docs/res/802.jpg' height="100px" width="100px">
<img src='docs/res/92.jpg' height="100px" width="100px">
<img src='docs/res/234.jpg' height="100px" width="100px">
<img src='docs/res/906.jpg' height="100px" width="100px">
<img src='docs/res/825.jpg' height="100px" width="100px">
<img src='docs/res/388.jpg' height="100px" width="100px">
<br>
<img src='docs/res/2_wflw_44.jpg' height="100px" width="100px">
<img src='docs/res/2_wflw_67.jpg' height="100px" width="100px">
<img src='docs/res/2_wflw_76.jpg' height="100px" width="100px">
</div>

@@ -102,76 +102,45 @@ transform = torchlm.LandmarksCompose([

* **bind** **80+** torchvision and albumentations transforms through **torchlm.bind**
```python
import torchvision
import albumentations
import torchlm
transform = torchlm.LandmarksCompose([
    # use native torchlm transforms
    torchlm.LandmarksRandomScale(prob=0.5),
    # bind torchvision image-only transforms, bind with a given prob
    torchlm.bind(torchvision.transforms.GaussianBlur(kernel_size=(5, 25)), prob=0.5),
    torchlm.bind(torchvision.transforms.RandomAutocontrast(p=0.5)),
    # bind albumentations image-only transforms
    torchlm.bind(albumentations.ColorJitter(p=0.5)),
    torchlm.bind(albumentations.GlassBlur(p=0.5)),
    # bind albumentations dual transforms
    torchlm.bind(albumentations.RandomCrop(height=200, width=200, p=0.5)),
    torchlm.bind(albumentations.Rotate(p=0.5)),
    # ...
])
```
See [transforms.md](docs/api/transforms.md) for the supported transform sets; more examples can be found at [test/transforms.py](test/transforms.py).

<details>
<summary> bind custom callable array or Tensor functions through torchlm.bind </summary>

```python
# First, define your custom functions
import numpy as np
from typing import Tuple
from torch import Tensor

def callable_array_noop(img: np.ndarray, landmarks: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
    # do some transform here ...
    return img.astype(np.uint32), landmarks.astype(np.float32)

def callable_tensor_noop(img: Tensor, landmarks: Tensor) -> Tuple[Tensor, Tensor]:
    # do some transform here ...
    return img, landmarks
```

```python
# Then, bind your functions and put them into the transforms pipeline.
transform = torchlm.LandmarksCompose([
    # use native torchlm transforms
    torchlm.LandmarksRandomScale(prob=0.5),
    # bind custom callable array functions
    torchlm.bind(callable_array_noop, bind_type=torchlm.BindEnum.Callable_Array),
    # bind custom callable Tensor functions with a given prob
    torchlm.bind(callable_tensor_noop, bind_type=torchlm.BindEnum.Callable_Tensor, prob=0.5),
    # ...
])
```
<div align='center'>
<img src='docs/res/124.jpg' height="100px" width="100px">
<img src='docs/res/158.jpg' height="100px" width="100px">
<img src='docs/res/386.jpg' height="100px" width="100px">
<img src='docs/res/478.jpg' height="100px" width="100px">
<img src='docs/res/537.jpg' height="100px" width="100px">
<img src='docs/res/605.jpg' height="100px" width="100px">
<img src='docs/res/802.jpg' height="100px" width="100px">
<br>
<img src='docs/res/2_wflw_484.jpg' height="100px" width="100px">
<img src='docs/res/2_wflw_505.jpg' height="100px" width="100px">
<img src='docs/res/2_wflw_529.jpg' height="100px" width="100px">
<img src='docs/res/2_wflw_536.jpg' height="100px" width="100px">
<img src='docs/res/2_wflw_669.jpg' height="100px" width="100px">
<img src='docs/res/2_wflw_672.jpg' height="100px" width="100px">
<img src='docs/res/2_wflw_741.jpg' height="100px" width="100px">
</div>
</details>

<details>
<summary> some global debug settings for torchlm's transforms </summary>

* setting the logging mode to `True` globally might help you figure out the runtime details
```python
import torchlm
# some global setting
torchlm.set_transforms_debug(True)
torchlm.set_transforms_logging(True)
torchlm.set_autodtype_logging(True)
```

Some detailed information will be shown at runtime; the logs might look like:
```shell
LandmarksRandomScale() AutoDtype Info: AutoDtypeEnum.Array_InOut
LandmarksRandomTranslate() Execution Flag: False
...
```

But it is OK if you pass a Tensor to an np.ndarray-like transform; **torchlm** will automatically handle the data type and wrap the result back to the original type through an **autodtype** wrapper.
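For instance, here is a minimal sketch (the shapes and dtypes are assumptions) of passing torch Tensors through an array-style pipeline and getting Tensors back:
```python
import torch
import torchlm

transform = torchlm.LandmarksCompose([
    torchlm.LandmarksRandomScale(prob=1.0),
])
# HWC uint8 image Tensor and (N, 2) xy landmarks Tensor
img = torch.randint(0, 255, (256, 256, 3), dtype=torch.uint8)
landmarks = torch.rand(98, 2) * 256.
# autodtype converts to np.ndarray internally and wraps the outputs back to Tensors
new_img, new_landmarks = transform(img, landmarks)
```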

</details>


### 🎉🎉Training
In **torchlm**, each model has a high-level and user-friendly API named `training`. Here is an example with [PIPNet](https://github.com/jhb86253817/PIPNet).
```python
from torchlm.models import pipnet

model = pipnet(
    backbone="resnet18",
    pretrained=False,
    num_nb=10,
    num_lms=98,
    net_stride=32,
    input_size=256,
    meanface_type="wflw",
    backbone_pretrained=True,
    map_location="cuda",
    checkpoint=None
)

model.training(
    self,
    annotation_path: str,
    criterion_cls: nn.Module = nn.MSELoss(),
    criterion_reg: nn.Module = nn.L1Loss(),
    learning_rate: float = 0.0001,
    cls_loss_weight: float = 10.,
    reg_loss_weight: float = 1.,
    num_nb: int = 10,
    num_epochs: int = 60,
    save_dir: Optional[str] = "./save",
    save_interval: Optional[int] = 10,
    save_prefix: Optional[str] = "",
    decay_steps: Optional[List[int]] = (30, 50),
    decay_gamma: Optional[float] = 0.1,
    device: Optional[Union[str, torch.device]] = "cuda",
    transform: Optional[transforms.LandmarksCompose] = None,
    coordinates_already_normalized: Optional[bool] = False,
    **kwargs: Any  # params for DataLoader
) -> nn.Module:
```
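For reference, here is a sketch of how the `training` API above might be invoked; the annotation path, hyperparameters and extra DataLoader kwargs below are assumptions, not values from this commit.
```python
# Hypothetical call of the training API shown above (all values are assumptions).
trained_model = model.training(
    annotation_path="./data/WFLW/train.txt",  # assumed path; one sample per line
    learning_rate=0.0001,
    num_epochs=60,
    save_dir="./save",
    save_prefix="pipnet-wflw",
    device="cuda",
    coordinates_already_normalized=True,      # assumption about the label format
    batch_size=16,                            # extra kwargs are forwarded to the DataLoader
    shuffle=True,
)
```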
Please refer to the entry point of the function for detailed documentation of the **training** API of each model defined in torchlm, e.g. [pipnet/_impls.py#L159](https://github.com/DefTruth/torchlm/blob/main/torchlm/models/pipnet/_impls.py#L159). Further, the model implementation plan is as follows:

❔ YOLOX ❔ YOLOv5 ❔ NanoDet ✅ [PIPNet](https://github.com/jhb86253817/PIPNet) ❔ ResNet ❔ MobileNet ❔ ShuffleNet ❔...

✅ = known to work and officially supported, ❔ = in my plan, but not coming soon.

### 👀👇 Inference
#### C++ API
The ONNXRuntime(CPU/GPU), MNN, NCNN and TNN C++ inference of **torchlm** will be released at [lite.ai.toolkit](https://github.com/DefTruth/lite.ai.toolkit).
#### Python API
In **torchlm**, we offer a high-level API named `runtime.bind` to bind any model in torchlm; you can then run the `runtime.forward` API to get the output landmarks and bboxes. Here is an example with [PIPNet](https://github.com/jhb86253817/PIPNet).
```python
import cv2
import torchlm
from torchlm.tools import faceboxesv2
from torchlm.models import pipnet

def test_pipnet_runtime():
    img_path = "./1.jpg"
    save_path = "./1.jpg"
    checkpoint = "./pipnet_resnet18_10x98x32x256_wflw.pth"
    image = cv2.imread(img_path)

    torchlm.runtime.bind(faceboxesv2())
    torchlm.runtime.bind(
        pipnet(
            backbone="resnet18",
            pretrained=True,
            num_nb=10,
            num_lms=98,
            net_stride=32,
            input_size=256,
            meanface_type="wflw",
            backbone_pretrained=True,
            map_location="cpu",
            checkpoint=checkpoint
        )
    )
    landmarks, bboxes = torchlm.runtime.forward(image)
    image = torchlm.utils.draw_bboxes(image, bboxes=bboxes)
    image = torchlm.utils.draw_landmarks(image, landmarks=landmarks)

    cv2.imwrite(save_path, image)
```
<div align='center'>
<img src='docs/assets/pipnet0.jpg' height="180px" width="180px">
<img src='docs/assets/pipnet_300W_CELEBA_model.gif' height="180px" width="180px">
<img src='docs/assets/pipnet_shaolin_soccer.gif' height="180px" width="180px">
<img src='docs/assets/pipnet_WFLW_model.gif' height="180px" width="180px">
</div>

## 📖 Documentations
* [x] [Data Augmentation's API](docs/api/transforms.md)
Binary file added docs/assets/pipnet0.jpg
Binary file added docs/assets/pipnet_300W_CELEBA_model.gif
Binary file added docs/assets/pipnet_WFLW_model.gif
Binary file added docs/assets/pipnet_shaolin_soccer.gif
2 changes: 1 addition & 1 deletion requirements.txt
@@ -1,5 +1,5 @@
# torchlm
opencv-python-headless>=4.3.0
numpy>=1.14.4
torch>=1.6.0
torchvision>=0.9.0
4 changes: 2 additions & 2 deletions setup.py
@@ -25,14 +25,14 @@ def get_long_description():
url="https://github.com/DefTruth/torchlm",
packages=setuptools.find_packages(),
install_requires=[
"opencv-python-headless>=4.5.2",
"opencv-python-headless>=4.3.0",
"numpy>=1.14.4",
"torch>=1.6.0",
"torchvision>=0.8.0",
"albumentations>=1.1.0",
"onnx>=1.8.0",
"onnxruntime>=1.7.0",
"tqdm>=4.60.0"
"tqdm>=4.10.0"
],
classifiers=[
"Programming Language :: Python :: 3",
13 changes: 13 additions & 0 deletions torchlm/data/_converters.py
@@ -0,0 +1,13 @@
import os
import cv2
import numpy as np
from abc import ABCMeta, abstractmethod
from typing import Tuple, Optional, List


class BaseConverter(object):
__metaclass__ = ABCMeta

@abstractmethod
def convert(self, *args, **kwargs):
raise NotImplementedError
27 changes: 27 additions & 0 deletions torchlm/models/pipnet/_impls.py
@@ -157,6 +157,33 @@ def training(
coordinates_already_normalized: Optional[bool] = False,
**kwargs: Any # params for DataLoader
) -> nn.Module:
"""
:param annotation_path: the path to an annotation file; the format must be
"img0_path img_path x0 y0 x1 y1 ... xn-1,yn-1"
"img1_path img_path x0 y0 x1 y1 ... xn-1,yn-1"
"img2_path img_path x0 y0 x1 y1 ... xn-1,yn-1"
"img3_path img_path x0 y0 x1 y1 ... xn-1,yn-1"
...
:param criterion_cls: loss criterion for PIPNet heatmap classification, default MSELoss
:param criterion_reg: loss criterion for PIPNet offsets regression, default L1Loss
:param learning_rate: learning rate, default 0.0001
:param cls_loss_weight: weight for heatmap classification
:param reg_loss_weight: weight for offsets regression
:param num_nb: the number of Nearest-neighbor landmarks for NRM, default 10
:param num_epochs: the number of training epochs
:param save_dir: the dir to save checkpoints
:param save_interval: the interval to save checkpoints
:param save_prefix: the prefix to save checkpoints, the saved name would look like
{save_prefix}-epoch{epoch}-loss{epoch_loss}.pth
:param decay_steps: decay steps for learning rate
:param decay_gamma: decay gamma for learning rate
:param device: training device, default cuda.
:param transform: user specific transform. If None, torchlm will build a default transform,
more details can be found at `torchlm.transforms.build_default_transform`
:param coordinates_already_normalized: denotes whether the labels in annotation_path are already normalized (by image size) or not
:param kwargs: params for DataLoader
:return: A trained model.
"""
device = device if torch.cuda.is_available() else "cpu"
# prepare dataset
default_dataset = _PIPTrainDataset(
