Commit

Merge pull request #27 from DefTruth/dev

Dev

DefTruth authored Mar 7, 2022
2 parents ec2ead2 + 1d19f51 commit de129d6
Showing 9 changed files with 186 additions and 100 deletions.
240 changes: 143 additions & 97 deletions README.md
@@ -12,30 +12,27 @@


## 🤗 Introduction
**torchlm** aims to build a high-level pipeline for face landmarks detection. It supports **100+ data augmentations**, **training** and **inference**, and can be easily installed with **pip**.
<div align='center'>
<img src='docs/res/605.jpg' height="100px" width="100px">
<img src='docs/res/802.jpg' height="100px" width="100px">
<img src='docs/res/92.jpg' height="100px" width="100px">
<img src='docs/res/234.jpg' height="100px" width="100px">
<img src='docs/res/906.jpg' height="100px" width="100px">
<img src='docs/res/825.jpg' height="100px" width="100px">
<img src='docs/res/388.jpg' height="100px" width="100px">
<br>
<img src='docs/res/2_wflw_44.jpg' height="100px" width="100px">
<img src='docs/res/2_wflw_67.jpg' height="100px" width="100px">
<img src='docs/res/2_wflw_76.jpg' height="100px" width="100px">
<img src='docs/res/2_wflw_162.jpg' height="100px" width="100px">
<img src='docs/res/2_wflw_229.jpg' height="100px" width="100px">
<img src='docs/res/2_wflw_440.jpg' height="100px" width="100px">
<img src='docs/res/2_wflw_478.jpg' height="100px" width="100px">
<img src='docs/assets/pipnet0.jpg' height="100px" width="100px">
<img src='docs/assets/pipnet_300W_CELEBA_model.gif' height="100px" width="100px">
<img src='docs/assets/pipnet_shaolin_soccer.gif' height="100px" width="100px">
<img src='docs/assets/pipnet_WFLW_model.gif' height="100px" width="100px">
</div>

<p align="center"> ❤️ Star 🌟👆🏻 this repo to support me if it does any helps to you, thanks ~ </p>

## 👋 Core Features
* High level pipeline for **training** and **inference**.
* Provides **30+** native landmarks data augmentations.
* Can **bind 80+** transforms from torchvision and albumentations with **one-line-code**.
* Supports awesome models for face landmarks detection, such as YOLOX, YOLOv5, ResNet, MobileNet, ShuffleNet and PIPNet, etc.

## 🆕 What's New
* [2022/03/08]: Add **PIPNet**: [Towards Efficient Facial Landmark Detection in the Wild, CVPR2021](https://github.com/jhb86253817/PIPNet)
* [2022/02/13]: Add **30+** native data augmentations and **bind** **80+** transforms from torchvision and albumentations.

## 🛠️ Usage
@@ -44,53 +41,56 @@
* opencv-python-headless>=4.5.2
* numpy>=1.14.4
* torch>=1.6.0
* torchvision>=0.8.0
* albumentations>=1.1.0
* onnx>=1.8.0
* onnxruntime>=1.7.0
* tqdm>=4.10.0

### Installation
You can install **torchlm** directly from [pypi](https://pypi.org/project/torchlm/). See the [NOTE](#torchlm-NOTE) below before installation!
```shell
pip3 install torchlm
# install from a specific pypi mirror with '-i'
pip3 install torchlm -i https://pypi.org/simple/
```
Or install from source if you want the latest torchlm, in editable mode with `-e`.
```shell
# clone the torchlm repository locally
git clone --depth=1 https://github.com/DefTruth/torchlm.git
cd torchlm
# install in editable mode
pip install -e .
```
<div id="torchlm-NOTE"></div>

**NOTE**: If you hit a conflict between different installed versions of opencv (opencv-python and opencv-python-headless; `albumentations` needs opencv-python-headless), please uninstall opencv-python and opencv-python-headless first, and then reinstall torchlm. See [albumentations#1139](https://github.com/albumentations-team/albumentations/issues/1139) for more details.

```shell
# first uninstall the conflicting opencv packages
pip uninstall opencv-python
pip uninstall opencv-python-headless
pip uninstall torchlm # if you have installed torchlm
# then reinstall torchlm
pip install torchlm # will also install deps, e.g. opencv
```

### 🌟🌟 Data Augmentation
**torchlm** provides **30+** native data augmentations for landmarks and can **bind** with **80+** transforms from torchvision and albumentations through the **torchlm.bind** method. Further, **torchlm.bind** provides a `prob` param at bind-level to force any transform or callable to behave as a random-style augmentation. The data augmentations in **torchlm** are `safe` and `simplest`: any transform operation that would push landmarks outside the image at runtime is automatically dropped, so the number of landmarks stays unchanged. The layout format of landmarks is `xy` with shape `(N, 2)`, where `N` denotes the number of input landmarks. No matter whether the input is an np.ndarray or a torch Tensor, **torchlm** will automatically handle the data type and wrap the result back to the original type through an **autodtype** wrapper.

* use the **30+** native transforms from **torchlm** directly
```python
import torchlm
transform = torchlm.LandmarksCompose([
    # use native torchlm transforms
    torchlm.LandmarksRandomScale(prob=0.5),
    torchlm.LandmarksRandomTranslate(prob=0.5),
    torchlm.LandmarksRandomShear(prob=0.5),
    torchlm.LandmarksRandomMask(prob=0.5),
    torchlm.LandmarksRandomBlur(kernel_range=(5, 25), prob=0.5),
    torchlm.LandmarksRandomBrightness(prob=0.),
    torchlm.LandmarksRandomRotate(40, prob=0.5, bins=8),
    torchlm.LandmarksRandomCenterCrop((0.5, 1.0), (0.5, 1.0), prob=0.5),
    # ...
])
```
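A composed pipeline is itself callable on an `(image, landmarks)` pair. Below is a minimal usage sketch; the image path and the randomly generated 98-point landmarks are placeholder assumptions, not values from this repository.
```python
import cv2
import numpy as np

# 'transform' is the pipeline composed above; inputs are an HWC uint8 image
# and landmarks in xy layout with shape (N, 2)
img = cv2.imread("./1.jpg")                            # placeholder path
h, w = img.shape[:2]
landmarks = np.random.rand(98, 2) * np.array([w, h])   # dummy (N, 2) landmarks
new_img, new_landmarks = transform(img, landmarks)
```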
<div align='center'>
<img src='docs/res/605.jpg' height="100px" width="100px">
<img src='docs/res/802.jpg' height="100px" width="100px">
<img src='docs/res/92.jpg' height="100px" width="100px">
<img src='docs/res/234.jpg' height="100px" width="100px">
<img src='docs/res/906.jpg' height="100px" width="100px">
<img src='docs/res/825.jpg' height="100px" width="100px">
<img src='docs/res/388.jpg' height="100px" width="100px">
<br>
<img src='docs/res/2_wflw_44.jpg' height="100px" width="100px">
<img src='docs/res/2_wflw_67.jpg' height="100px" width="100px">
<img src='docs/res/2_wflw_76.jpg' height="100px" width="100px">
</div>

@@ -102,76 +102,45 @@ transform = torchlm.LandmarksCompose([

* **bind** **80+** torchvision and albumentations transforms through **torchlm.bind**
```python
import torchvision
import albumentations
import torchlm
transform = torchlm.LandmarksCompose([
    # use native torchlm transforms
    torchlm.LandmarksRandomScale(prob=0.5),
    # bind torchvision image-only transforms, bind with a given prob
    torchlm.bind(torchvision.transforms.GaussianBlur(kernel_size=(5, 25)), prob=0.5),
    torchlm.bind(torchvision.transforms.RandomAutocontrast(p=0.5)),
    # bind albumentations image-only transforms
    torchlm.bind(albumentations.ColorJitter(p=0.5)),
    torchlm.bind(albumentations.GlassBlur(p=0.5)),
    # bind albumentations dual transforms
    torchlm.bind(albumentations.RandomCrop(height=200, width=200, p=0.5)),
    torchlm.bind(albumentations.Rotate(p=0.5)),
    # ...
])
```
See [transforms.md](docs/api/transforms.md) for the supported transform sets; more examples can be found at [test/transforms.py](test/transforms.py).

<details>
<summary> bind custom callable array or Tensor functions through torchlm.bind </summary>

```python
# First, define your custom functions
import numpy as np
from typing import Tuple
from torch import Tensor

def callable_array_noop(img: np.ndarray, landmarks: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
    # do some transform here ...
    return img.astype(np.uint32), landmarks.astype(np.float32)

def callable_tensor_noop(img: Tensor, landmarks: Tensor) -> Tuple[Tensor, Tensor]:
    # do some transform here ...
    return img, landmarks
```

```python
# Then, bind your functions and put them into the transforms pipeline.
transform = torchlm.LandmarksCompose([
    # use native torchlm transforms
    torchlm.LandmarksRandomScale(prob=0.5),
    # bind custom callable array functions
    torchlm.bind(callable_array_noop, bind_type=torchlm.BindEnum.Callable_Array),
    # bind custom callable Tensor functions with a given prob
    torchlm.bind(callable_tensor_noop, bind_type=torchlm.BindEnum.Callable_Tensor, prob=0.5),
    # ...
])
```
<div align='center'>
<img src='docs/res/124.jpg' height="100px" width="100px">
<img src='docs/res/158.jpg' height="100px" width="100px">
<img src='docs/res/386.jpg' height="100px" width="100px">
<img src='docs/res/478.jpg' height="100px" width="100px">
<img src='docs/res/537.jpg' height="100px" width="100px">
<img src='docs/res/605.jpg' height="100px" width="100px">
<img src='docs/res/802.jpg' height="100px" width="100px">
<br>
<img src='docs/res/2_wflw_484.jpg' height="100px" width="100px">
<img src='docs/res/2_wflw_505.jpg' height="100px" width="100px">
<img src='docs/res/2_wflw_529.jpg' height="100px" width="100px">
<img src='docs/res/2_wflw_536.jpg' height="100px" width="100px">
<img src='docs/res/2_wflw_669.jpg' height="100px" width="100px">
<img src='docs/res/2_wflw_672.jpg' height="100px" width="100px">
<img src='docs/res/2_wflw_741.jpg' height="100px" width="100px">
</div>
</details>

<details>
<summary> some global debug settings for torchlm's transforms </summary>

* setting the logging mode to `True` globally might help you figure out the runtime details
```python
import torchlm
# some global setting
torchlm.set_transforms_debug(True)
torchlm.set_transforms_logging(True)
torchlm.set_autodtype_logging(True)
```

Some detailed information will be shown at runtime; the logs might look like:
```shell
LandmarksRandomScale() AutoDtype Info: AutoDtypeEnum.Array_InOut
LandmarksRandomTranslate() Execution Flag: False
...
```

But it is OK if you pass a Tensor to an np.ndarray-like transform; **torchlm** will automatically handle the data type and wrap the result back to the original type through an **autodtype** wrapper.
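For instance, here is a minimal sketch (the shapes and dtypes are assumptions) of passing torch Tensors through an array-style pipeline and getting Tensors back:
```python
import torch
import torchlm

transform = torchlm.LandmarksCompose([
    torchlm.LandmarksRandomScale(prob=1.0),
])
# HWC uint8 image Tensor and (N, 2) xy landmarks Tensor
img = torch.randint(0, 255, (256, 256, 3), dtype=torch.uint8)
landmarks = torch.rand(98, 2) * 256.
# autodtype converts to np.ndarray internally and wraps the outputs back to Tensors
new_img, new_landmarks = transform(img, landmarks)
```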

</details>


### 🎉🎉Training
In **torchlm**, each model has a high-level and user-friendly API named `training`. Here is an example with [PIPNet](https://github.com/jhb86253817/PIPNet).
```python
from torchlm.models import pipnet

model = pipnet(
    backbone="resnet18",
    pretrained=False,
    num_nb=10,
    num_lms=98,
    net_stride=32,
    input_size=256,
    meanface_type="wflw",
    backbone_pretrained=True,
    map_location="cuda",
    checkpoint=None
)

model.training(
    self,
    annotation_path: str,
    criterion_cls: nn.Module = nn.MSELoss(),
    criterion_reg: nn.Module = nn.L1Loss(),
    learning_rate: float = 0.0001,
    cls_loss_weight: float = 10.,
    reg_loss_weight: float = 1.,
    num_nb: int = 10,
    num_epochs: int = 60,
    save_dir: Optional[str] = "./save",
    save_interval: Optional[int] = 10,
    save_prefix: Optional[str] = "",
    decay_steps: Optional[List[int]] = (30, 50),
    decay_gamma: Optional[float] = 0.1,
    device: Optional[Union[str, torch.device]] = "cuda",
    transform: Optional[transforms.LandmarksCompose] = None,
    coordinates_already_normalized: Optional[bool] = False,
    **kwargs: Any  # params for DataLoader
) -> nn.Module:
```
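For reference, here is a sketch of how the `training` API above might be invoked; the annotation path, hyperparameters and extra DataLoader kwargs below are assumptions, not values from this commit.
```python
# Hypothetical call of the training API shown above (all values are assumptions).
trained_model = model.training(
    annotation_path="./data/WFLW/train.txt",  # assumed path; one sample per line
    learning_rate=0.0001,
    num_epochs=60,
    save_dir="./save",
    save_prefix="pipnet-wflw",
    device="cuda",
    coordinates_already_normalized=True,      # assumption about the label format
    batch_size=16,                            # extra kwargs are forwarded to the DataLoader
    shuffle=True,
)
```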
Please refer to the entry point of the function for detailed documentation of the **training** API of each model defined in torchlm, e.g. [pipnet/_impls.py#L159](https://github.com/DefTruth/torchlm/blob/main/torchlm/models/pipnet/_impls.py#L159). Further, the model implementation plan is as follows:

❔ YOLOX ❔ YOLOv5 ❔ NanoDet ✅ [PIPNet](https://github.com/jhb86253817/PIPNet) ❔ ResNet ❔ MobileNet ❔ ShuffleNet ❔...

✅ = known to work and officially supported, ❔ = in my plan, but not coming soon.

### 👀👇 Inference
#### C++ API
The ONNXRuntime(CPU/GPU), MNN, NCNN and TNN C++ inference of **torchlm** will be released at [lite.ai.toolkit](https://github.com/DefTruth/lite.ai.toolkit).
#### Python API
In **torchlm**, we offer a high-level API named `runtime.bind` to bind any model in torchlm; you can then run the `runtime.forward` API to get the output landmarks and bboxes. Here is an example with [PIPNet](https://github.com/jhb86253817/PIPNet).
```python
import cv2
import torchlm
from torchlm.tools import faceboxesv2
from torchlm.models import pipnet

def test_pipnet_runtime():
    img_path = "./1.jpg"
    save_path = "./1.jpg"
    checkpoint = "./pipnet_resnet18_10x98x32x256_wflw.pth"
    image = cv2.imread(img_path)

    torchlm.runtime.bind(faceboxesv2())
    torchlm.runtime.bind(
        pipnet(
            backbone="resnet18",
            pretrained=True,
            num_nb=10,
            num_lms=98,
            net_stride=32,
            input_size=256,
            meanface_type="wflw",
            backbone_pretrained=True,
            map_location="cpu",
            checkpoint=checkpoint
        )
    )
    landmarks, bboxes = torchlm.runtime.forward(image)
    image = torchlm.utils.draw_bboxes(image, bboxes=bboxes)
    image = torchlm.utils.draw_landmarks(image, landmarks=landmarks)

    cv2.imwrite(save_path, image)
```
<div align='center'>
<img src='docs/assets/pipnet0.jpg' height="180px" width="180px">
<img src='docs/assets/pipnet_300W_CELEBA_model.gif' height="180px" width="180px">
<img src='docs/assets/pipnet_shaolin_soccer.gif' height="180px" width="180px">
<img src='docs/assets/pipnet_WFLW_model.gif' height="180px" width="180px">
</div>

## 📖 Documentations
* [x] [Data Augmentation's API](docs/api/transforms.md)
Binary file added docs/assets/pipnet0.jpg
Binary file added docs/assets/pipnet_300W_CELEBA_model.gif
Binary file added docs/assets/pipnet_WFLW_model.gif
Binary file added docs/assets/pipnet_shaolin_soccer.gif
2 changes: 1 addition & 1 deletion requirements.txt
@@ -1,5 +1,5 @@
# torchlm
opencv-python-headless>=4.3.0
numpy>=1.14.4
torch>=1.6.0
torchvision>=0.9.0
4 changes: 2 additions & 2 deletions setup.py
@@ -25,14 +25,14 @@ def get_long_description():
url="https://github.com/DefTruth/torchlm",
packages=setuptools.find_packages(),
install_requires=[
"opencv-python-headless>=4.5.2",
"opencv-python-headless>=4.3.0",
"numpy>=1.14.4",
"torch>=1.6.0",
"torchvision>=0.8.0",
"albumentations>=1.1.0",
"onnx>=1.8.0",
"onnxruntime>=1.7.0",
"tqdm>=4.60.0"
"tqdm>=4.10.0"
],
classifiers=[
"Programming Language :: Python :: 3",
13 changes: 13 additions & 0 deletions torchlm/data/_converters.py
@@ -0,0 +1,13 @@
import os
import cv2
import numpy as np
from abc import ABCMeta, abstractmethod
from typing import Tuple, Optional, List


class BaseConverter(object):
__metaclass__ = ABCMeta

@abstractmethod
def convert(self, *args, **kwargs):
raise NotImplementedError
27 changes: 27 additions & 0 deletions torchlm/models/pipnet/_impls.py
@@ -157,6 +157,33 @@ def training(
coordinates_already_normalized: Optional[bool] = False,
**kwargs: Any # params for DataLoader
) -> nn.Module:
"""
:param annotation_path: the path to an annotation file; the format must be
"img0_path img_path x0 y0 x1 y1 ... xn-1,yn-1"
"img1_path img_path x0 y0 x1 y1 ... xn-1,yn-1"
"img2_path img_path x0 y0 x1 y1 ... xn-1,yn-1"
"img3_path img_path x0 y0 x1 y1 ... xn-1,yn-1"
...
:param criterion_cls: loss criterion for PIPNet heatmap classification, default MSELoss
:param criterion_reg: loss criterion for PIPNet offsets regression, default L1Loss
:param learning_rate: learning rate, default 0.0001
:param cls_loss_weight: weight for heatmap classification
:param reg_loss_weight: weight for offsets regression
:param num_nb: the number of Nearest-neighbor landmarks for NRM, default 10
:param num_epochs: the number of training epochs
:param save_dir: the dir to save checkpoints
:param save_interval: the interval to save checkpoints
:param save_prefix: the prefix to save checkpoints, the saved name would look like
{save_prefix}-epoch{epoch}-loss{epoch_loss}.pth
:param decay_steps: decay steps for learning rate
:param decay_gamma: decay gamma for learning rate
:param device: training device, default cuda.
:param transform: user specific transform. If None, torchlm will build a default transform,
more details can be found at `torchlm.transforms.build_default_transform`
:param coordinates_already_normalized: denotes whether the labels in annotation_path are already normalized (by image size) or not
:param kwargs: params for DataLoader
:return: A trained model.
"""
device = device if torch.cuda.is_available() else "cpu"
# prepare dataset
default_dataset = _PIPTrainDataset(
