Action-Manual-Tagger

Demo of the tool

Watch the video

This is a Qt project that enables frame-level annotation, recording for each frame which behaviour, action and subaction is happening. As a result, a behaviour analysis dataset is obtained. This information can be used to train deep learning or classical machine learning algorithms.

There is a maximum of three tagging levels, meaning that the tool can create an action hierarchy of height 3: behaviour, actions and subactions. A behaviour is composed of actions, and each action is accomplished through a set of subactions. For example, a behaviour could be making tea. Its actions would be moving the cup from place to place, warming the water, filling the cup, etc. In the case of moving the cup, the resulting subactions would be moving the left arm up, closing the left hand, etc.
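
As a minimal sketch, such a three-level hierarchy and the per-frame tags could be modelled with plain structures like the ones below (illustrative only; the type and field names are assumptions, not the tool's actual data model):

```cpp
#include <string>
#include <vector>

// Three-level hierarchy: a behaviour owns actions, each action owns subactions.
struct Subaction { std::string name; };                                   // e.g. "moving left arm up"
struct Action    { std::string name; std::vector<Subaction> subactions; };// e.g. "moving the cup"
struct Behaviour { std::string name; std::vector<Action> actions; };      // e.g. "making tea"

// One annotation per frame: which behaviour/action/subaction is happening.
struct FrameTag {
    int frame;
    std::string behaviour;
    std::string action;
    std::string subaction;
};
```

In an export, each annotated frame could then correspond to one such record.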

UML Diagram

In a single picture

Main view

In a single picture

Video Data Augmentation

When dealing with deep learning, it is important to have a large amount of diverse data. If the data lacks variability, the training process is unlikely to succeed. Data augmentation is one of the best techniques to enrich data by incorporating variation. We propose an extension of the Action Manual Tagger that returns augmented data.

In a single picture

For example, with this video data augmentation pipeline, one possible variation is changing the image contrast to make the model invariant to lighting changes. Shifts, rotations and cropping also make the data more spatially varied. By applying these operations, a neural network trained on the data learns not to focus on fixed locations of an image.
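
As a rough illustration of those per-image operations, here is a sketch using OpenCV (the library choice, parameters and helper name are assumptions and are not taken from this repository):

```cpp
#include <opencv2/opencv.hpp>

cv::Mat augment(const cv::Mat& src) {
    // Contrast/brightness change: dst = src * alpha + beta
    cv::Mat bright;
    src.convertTo(bright, -1, 1.3, 10.0);

    // Small rotation around the image centre plus a horizontal shift
    cv::Point2f centre(bright.cols / 2.0f, bright.rows / 2.0f);
    cv::Mat rot = cv::getRotationMatrix2D(centre, 5.0, 1.0);
    rot.at<double>(0, 2) += 8.0;                 // shift 8 pixels to the right
    cv::Mat moved;
    cv::warpAffine(bright, moved, rot, bright.size());

    // Central crop (assumes the image is larger than 32x32 pixels)
    cv::Rect roi(16, 16, moved.cols - 32, moved.rows - 32);
    return moved(roi).clone();
}
```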

Video data augmentation extends image data augmentation to all the frames of a full video. Each operation applied to the first frame is also applied to the rest of the video, so the typical image operations can be used on videos as well: rotation, translation, vertical and horizontal flips, cropping, changes in brightness, added noise or blur, among others.
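
The key point is that the augmentation parameters are drawn once per video and then reused for every frame, so the whole clip stays consistent. A minimal sketch of that idea, again assuming OpenCV, with illustrative file names and parameter ranges:

```cpp
#include <opencv2/opencv.hpp>
#include <random>

int main() {
    cv::VideoCapture in("input.avi");            // hypothetical input file
    if (!in.isOpened()) return 1;

    const int w = static_cast<int>(in.get(cv::CAP_PROP_FRAME_WIDTH));
    const int h = static_cast<int>(in.get(cv::CAP_PROP_FRAME_HEIGHT));
    const double fps = in.get(cv::CAP_PROP_FPS);

    // Sample the transform once per video, not once per frame.
    std::mt19937 rng(42);
    double angle = std::uniform_real_distribution<double>(-10.0, 10.0)(rng);
    bool doFlip  = std::bernoulli_distribution(0.5)(rng);
    cv::Mat rot  = cv::getRotationMatrix2D(cv::Point2f(w / 2.0f, h / 2.0f), angle, 1.0);

    cv::VideoWriter out("augmented.avi",
                        cv::VideoWriter::fourcc('M', 'J', 'P', 'G'),
                        fps, cv::Size(w, h));

    cv::Mat frame, warped;
    while (in.read(frame)) {
        cv::warpAffine(frame, warped, rot, frame.size());  // identical transform for every frame
        if (doFlip) cv::flip(warped, warped, 1);           // 1 = horizontal flip
        out.write(warped);
    }
    return 0;
}
```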

In a single picture

About

This is a Qt project that enables the tagging of videos, images and BVH animations. You can annotate each frame with the behaviour, action or subaction being performed. After tagging, the data can be exported to JSON and video data augmentation can be applied to the output.
