
AI Assisted Annotation Tool for Multi-task Video Labeling


It is written in Python and uses Qt for its graphical interface.

This tool is designed for general-purpose video annotation. It is mainly used to record multiple kinds of information contained in a single video, such as objects, object tracks, object status, and the time intervals in which events occur, in order to construct multi-task datasets.

The annotation tool uses existing pre-trained or custom-trained models to assist annotation tasks. With this tool, you can examine an existing model's prediction defects, or record multiple types of attributes in your collected videos to build a custom dataset.


Requirements

Python 3.5, TensorFlow 1.12, Keras 2.2.4, and other packages listed in requirements.txt.
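
If you prefer to pin the core packages manually, the versions above translate to something like the following (a sketch; the exact point releases may vary):

    pip install tensorflow==1.12.0 keras==2.2.4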

Installation

  1. Clone this repository

  2. Install dependencies

    pip install -r requirements.txt
  3. Download each model's pre-trained weights:

    • For Object Detection model:

    (1) Mask-RCNN

    Download the pre-trained COCO weights (mask_rcnn_coco.h5) from the releases page.

    (2) Faster-RCNN

    Download pre-trained weights from the Model Zoo.

    (3) YOLO

    Download the YOLOv3 weights from the YOLO website:

    wget https://pjreddie.com/media/files/yolov3.weights

    Convert the .weights file into a .h5 file:

    python convert.py yolov3.cfg yolov3.weights model_data/yolo.h5
    • For Object Tracking model:

    Re3

    Download link for the Caffe model weights

    Convert the Caffe model into TensorFlow weights:

    python caffe_to_tf.py

Settings for Detection & Rendering

Check out the ObjClasses folder; you will see three files:

Obj_Classes_Name.txt  // Each line represents one object category in your annotation file
Obj_Classes_Color.txt // RGB rendering color for each class
Frame_Classes.txt     // Each line represents one frame-level classification category
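
For example, a minimal setup might look like the following (the class names, colors, and the comma-separated RGB layout are illustrative assumptions, one entry per line):

Obj_Classes_Name.txt:
    car
    person

Obj_Classes_Color.txt:
    255,0,0
    0,255,0

Frame_Classes.txt:
    normal
    accident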

Usage

  1. Run the following command:

    python main.py

  2. For more annotation tutorials, please see here.

  3. The annotations are saved as a JSON file.

Export JSON File Example:

For each video, the annotation tool creates a JSON file formatted as follows.

{
    "File name": "xxxxxxxxxx.mp4",
    "Resolution": [1280, 720],
    "FPS": 30,
    "Total frame": 1620,
    "Event type": "",
    "Event interval": [],
    "Frame event": {},
    "Object bounding box": {
        "347": {
            "10": {
                "class": "car",
                "bbx": [414, 343, 791, 554],
                "event state": false
            },
            "9": {
                "class": "car",
                "bbx": [0, 320, 137, 517],
                "event state": false
            }
        },
        "348": {
            "10": {
                "class": "car",
                "bbx": [414, 343, 791, 554],
                "event state": false
            },
            "9": {
                "class": "car",
                "bbx": [0, 321, 136, 516],
                "event state": false
            }
        }
    }
}

The "Object bounding box" record contains, for each frame index, each instance's ID, object class, event state, and bounding-box location. "Event type" is used for video classification, while "Event interval" and "Frame event" are used for event detection and frame-level classification.
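
A minimal sketch for reading such a file in Python (the path annotation.json is hypothetical; the field names are taken from the example above):

import json

# Load one exported annotation file (the path is hypothetical).
with open("annotation.json", "r", encoding="utf-8") as f:
    ann = json.load(f)

print(ann["File name"], ann["Resolution"], ann["FPS"], ann["Total frame"])

# Walk frame indices, then instance IDs within each frame.
for frame_idx, instances in ann["Object bounding box"].items():
    for instance_id, info in instances.items():
        print(frame_idx, instance_id, info["class"], info["bbx"], info["event state"])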

Attach Your Own Detection Model

If you want to attach your own detection model to this tool, import it and create a custom class with a detect_bbx() member function. This function takes one input frame in RGB channel order and returns a single list containing one entry per detected instance.

Each instance is formatted as [class name, score, bounding box], where the bounding box is an ordered list of the box's center x position, center y position, width, and height.

Example of return format for each instance:

[class, score, [cx, cy, w, h]]

Notice that if the detection model uses a different name than the annotator for the same class (e.g., motorbike vs. motorcycle), please also translate it inside the detect_bbx() method.
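
A minimal sketch of such a wrapper is shown below. The class name MyDetector, the injected predict_fn, and its output tuple layout are all hypothetical placeholders; only the detect_bbx() interface and the [class, score, [cx, cy, w, h]] return format come from this section.

# Hypothetical wrapper; swap the placeholder predict function for your model.

NAME_MAP = {"motorbike": "motorcycle"}  # translate model class names to annotator names

class MyDetector:
    """Hypothetical example of the interface the tool expects."""

    def __init__(self, predict_fn):
        # predict_fn(frame) -> list of (class_name, score, x1, y1, x2, y2);
        # replace this with your own model's inference call.
        self.predict_fn = predict_fn

    def detect_bbx(self, frame):
        # frame: one image array in RGB channel order.
        # Returns a list of instances, each as [class_name, score, [cx, cy, w, h]].
        instances = []
        for class_name, score, x1, y1, x2, y2 in self.predict_fn(frame):
            name = NAME_MAP.get(class_name, class_name)  # resolve name mismatches
            cx = (x1 + x2) / 2.0  # box center x
            cy = (y1 + y2) / 2.0  # box center y
            w = x2 - x1           # box width
            h = y2 - y1           # box height
            instances.append([name, score, [cx, cy, w, h]])
        return instances

# Example with a dummy predictor:
detector = MyDetector(lambda frame: [("motorbike", 0.9, 10, 20, 110, 220)])
print(detector.detect_bbx(None))  # [['motorcycle', 0.9, [60.0, 120.0, 100, 200]]]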

Citations

  • Icon Creation Tool

    UI/logo.png
    

    App logo freely created with DesignEvo

  • Re3 tracker

@article{gordon2017re3,
  title={Re3: Real-Time Recurrent Regression Networks for Object Tracking},
  author={Gordon, Daniel and Farhadi, Ali and Fox, Dieter},
  journal={arXiv preprint arXiv:1705.06368},
  year={2017}
}
  • Mask R-CNN
@misc{matterport_maskrcnn_2017,
  title={Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow},
  author={Waleed Abdulla},
  year={2017},
  publisher={Github},
  journal={GitHub repository},
  howpublished={\url{https://github.com/matterport/Mask_RCNN}},
}
  • Faster R-CNN, using TensorFlow Object Detection API
Speed/accuracy trade-offs for modern convolutional object detectors.
Huang J, Rathod V, Sun C, Zhu M, Korattikara A, Fathi A, Fischer I, Wojna Z,
Song Y, Guadarrama S, Murphy K, CVPR 2017
  • YOLOv3
@misc{qqwweee_2018,
  author = {qqwweee},
  title = {keras-yolo3},
  year = {2018},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/qqwweee/keras-yolo3}}
}
