yui-mhcp/detection

😋 Object detection

Check the CHANGELOG file for a global overview of the latest modifications! 😋

Important note: the EAST training procedure is not yet implemented for the current post-processing pipeline, which is inspired by this repo. The model can still be used with the available pretrained weights! 😄

Project structure

├── custom_architectures
│   ├── east_arch.py        : defines the EAST architecture with VGG16 backbone
│   └── yolo_arch.py        : defines the YOLOv2 architecture
├── custom_layers
├── custom_train_objects
│   ├── losses
│   │   └── yolo_loss.py        : main YOLOLoss class
├── loggers
├── models
│   ├── detection
│   │   ├── base_detector.py    : abstract class for object detection models
│   │   ├── east.py         : main EAST class (rotated bounding box detection based on U-Net model)
│   │   └── yolo.py         : main YOLO class (object detection)
├── unitests
├── utils
├── detection.ipynb     : illustrates the use of pretrained models for object / text detection
└── example_yolo.ipynb  : illustrates a complete YOLOv2 training procedure

Check the main project for more information about the unextended modules / structure / main classes.

Available features

  • Detection (module models.detection) :
| Feature | Function / class | Description |
| --- | --- | --- |
| detection | detect | Detects objects in images / videos and supports multiple saving types (cropped boxes, annotated images, video frames, ...) |
| stream | stream | Performs detection on your camera stream (frames can also be saved) |

The detection notebook provides a concrete demonstration of these functions 😄

Available models

Model architectures

Available architectures :

Model weights

| Classes | Dataset | Architecture | Trainer | Weights |
| --- | --- | --- | --- | --- |
| 80 classes | COCO | YOLOv2 | YOLOv2's author | link |

The pretrained backend for YOLO can be downloaded at this link.

The pretrained version of EAST can be downloaded from this project. It should be stored in pretrained_models/pretrained_weights/east_vgg16.pth (torch is required to transfer the weights : pip install torch).

Installation and usage

Check this installation guide for step-by-step instructions!

TO-DO list :

  • Make the TO-DO list
  • Support pretrained COCO model
  • Add weights for face detection
  • Add label-based model loading (without manual association)
  • Add producer-consumer based streaming
  • Automatically download the official YOLOv2 pretrained weights (if not loaded)
  • Add the Locality-Aware Non Maximum Suppression (NMS) method as described in the EAST paper
  • Keras 3 support
  • Convert the pretrained models to be compatible with Keras 3
  • Make comprehensive comparison example between NMS and LANMS
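
For readers unfamiliar with the NMS step mentioned in the list above, here is a minimal sketch of standard greedy NMS on axis-aligned boxes, written with NumPy. It is not this project's implementation: the function names, the [x_min, y_min, x_max, y_max] box format and the default threshold are assumptions made for illustration. It does not cover rotated boxes or the locality-aware variant (LANMS), which, as described in the EAST paper, first merges neighbouring boxes before a standard NMS pass.

```python
import numpy as np

def iou_one_to_many(box, boxes):
    """IoU between one box and an array of boxes.

    Assumed format (hypothetical, for this sketch): [x_min, y_min, x_max, y_max].
    """
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    # Intersection is zero when the boxes do not overlap
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter + 1e-9)

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the best box, drop the boxes overlapping it, repeat."""
    boxes = np.asarray(boxes, dtype=np.float64)
    scores = np.asarray(scores, dtype=np.float64)
    order = np.argsort(scores)[::-1]  # indices sorted by decreasing score
    keep = []
    while order.size > 0:
        best, rest = order[0], order[1:]
        keep.append(int(best))
        ious = iou_one_to_many(boxes[best], boxes[rest])
        order = rest[ious <= iou_threshold]
    return keep
```

The returned list contains the indices of the boxes that survive, ordered by decreasing score.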

Difference between detection and segmentation

The 2 main methodologies in object detection are detection with bounding boxes and pixel-wise segmentation. Both approaches aim to locate objects in an image, but with different levels of precision. This difference has an impact on the model architecture, as the required output shape is not the same.

Here is a simple, non-exhaustive comparison of both approaches based on some criteria :

| Criterion | Detection | Segmentation |
| --- | --- | --- |
| Precision | Surrounding bounding boxes | Pixel by pixel |
| Type of output | [x, y, w, h] (position of bounding boxes) | Mask ([0, 1] probability score for each pixel) |
| Output shape | [grid_h, grid_w, nb_box, 4 + 1 + nb_class]* | [image_h, image_w, 1] |
| Applications | General detection + classification | Medical image detection / object extraction |
| Model architecture | Full CNN 2D downsampling to (grid_h, grid_w) | Full CNN with downsampling and upsampling |
| Post processing | Decode the output to get the position of boxes | Threshold the pixel confidence |
| Model mechanism | Split the image into a grid and detect boxes in each grid cell | Downsample the image, then upsample it to give an object probability for each pixel |
| Support multi-label classification | Yes, by design | Probably yes, but it is not its main application |

* This is the classical output shape of YOLO models. The last dimension is [x, y, w, h, confidence, *class_scores]
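
To make the "decode output" step more concrete, here is a minimal NumPy sketch that turns a raw [grid_h, grid_w, nb_box, 4 + 1 + nb_class] tensor into boxes. It assumes the parameterization of the YOLOv2 paper (sigmoid on the centre offsets and the confidence, exponential on the sizes scaled by anchors expressed in grid-cell units, softmax on the class scores). The function name `decode_yolo_output`, its arguments and the threshold are hypothetical, and the decoding actually used in this project may differ.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x, axis=-1):
    e = np.exp(x - np.max(x, axis=axis, keepdims=True))
    return e / np.sum(e, axis=axis, keepdims=True)

def decode_yolo_output(output, anchors, threshold=0.5):
    """Decode a raw YOLOv2-style output (assumed layout, see the note above).

    output  : [grid_h, grid_w, nb_box, 4 + 1 + nb_class] raw network output
    anchors : [nb_box, 2] anchor (width, height) in grid-cell units (assumption)
    Returns boxes [n, 4] as normalized (x_center, y_center, w, h), scores [n], labels [n].
    """
    grid_h, grid_w = output.shape[:2]
    anchors = np.asarray(anchors, dtype=np.float64)

    # Column / row index of each cell, broadcastable to [grid_h, grid_w, nb_box]
    col = np.arange(grid_w).reshape(1, grid_w, 1)
    row = np.arange(grid_h).reshape(grid_h, 1, 1)

    # Centre = cell index + offset inside the cell, normalized to [0, 1]
    x = (sigmoid(output[..., 0]) + col) / grid_w
    y = (sigmoid(output[..., 1]) + row) / grid_h
    # Size = anchor size rescaled by exp(prediction), normalized to [0, 1]
    w = anchors[:, 0] * np.exp(output[..., 2]) / grid_w
    h = anchors[:, 1] * np.exp(output[..., 3]) / grid_h

    # Final score = objectness confidence * best class probability
    scores = sigmoid(output[..., 4])[..., None] * softmax(output[..., 5:])
    labels = np.argmax(scores, axis=-1)
    best = np.max(scores, axis=-1)

    keep = best > threshold
    boxes = np.stack([x[keep], y[keep], w[keep], h[keep]], axis=-1)
    return boxes, best[keep], labels[keep]
```

The boxes that pass the threshold usually still overlap, which is why a suppression step such as NMS is typically applied afterwards.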

More advanced strategies also exist, differing from the standard methodology described above ;) This aims to be a simple introduction to object detection / segmentation.
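
On the segmentation side, the "thresholding" post-processing from the table can be sketched in a few lines of NumPy. This is only an illustration: the function name `mask_to_box`, the 0.5 threshold and the choice to return a single tight box as (x, y, w, h) in pixels, with (x, y) the top-left corner, are assumptions, not part of this project.

```python
import numpy as np

def mask_to_box(prob_mask, threshold=0.5):
    """Threshold a [image_h, image_w, 1] probability mask.

    Returns the binary mask [image_h, image_w] and the tight box (x, y, w, h)
    around all positive pixels, or None if no pixel passes the threshold.
    """
    binary = prob_mask[..., 0] > threshold
    if not binary.any():
        return binary, None
    rows = np.where(binary.any(axis=1))[0]  # rows containing a positive pixel
    cols = np.where(binary.any(axis=0))[0]  # columns containing a positive pixel
    x, y = int(cols[0]), int(rows[0])
    w, h = int(cols[-1]) - x + 1, int(rows[-1]) - y + 1
    return binary, (x, y, w, h)
```

This treats all positive pixels as one object. Separating several objects would require an extra step (for instance connected-component labelling), which is outside the scope of this sketch.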

Contacts and licence

Contacts :

  • Mail : yui-mhcp@tutanota.com
  • Discord : yui0732

Terms of use

The goal of these projects is to support and advance education and research in Deep Learning technology. To facilitate this, all associated code is made available under the GNU Affero General Public License (AGPL) v3, supplemented by a clause that prohibits commercial use (cf the LICENCE file).

These projects are released as "free software", allowing you to freely use, modify, deploy, and share the software, provided you adhere to the terms of the license. While the software is freely available, it is not public domain and retains copyright protection. The license conditions are designed to ensure that every user can utilize and modify any version of the code for their own educational and research projects.

If you wish to use this project in a proprietary commercial endeavor, you must obtain a separate license. For further details on this process, please contact me directly.

For my protection, it is important to note that all projects are available on an "As Is" basis, without any warranties or conditions of any kind, either explicit or implied. However, do not hesitate to report issues on the project's repository, or to open a Pull Request to solve them 😄

Citation

If you find this project useful in your work, please add this citation to give it more visibility ! 😋

@misc{yui-mhcp,
    author  = {yui},
    title   = {A Deep Learning projects centralization},
    year    = {2021},
    publisher   = {GitHub},
    howpublished    = {\url{https://github.com/yui-mhcp}}
}

Notes and references

The code for the YOLO part of this project is heavily inspired by this repo:

Papers and tutorials :

Datasets :

  • COCO dataset : 80-label dataset for object detection in real-world contexts
  • COCO Text dataset : an extension of COCO for text detection
  • Wider Face dataset : face detection dataset
  • kangaroo dataset : fun, tiny dataset to quickly train a powerful model (nice for getting fast results)
