2017.08.14

Research

ChainerCV Release

Yusuke Niitani

Engineer

We have released ChainerCV, a utility library for computer vision in deep learning. The library aims to make it easier to train and apply deep learning models for computer vision with Chainer. It contains high-quality implementations of computer vision models, along with the tools needed to conduct research in this field.
GitHub page: https://github.com/chainer/chainercv
Documentation: http://chainercv.readthedocs.io/en/stable/

Features supported in ChainerCV

High quality implementations of computer vision models

The quality of the software is ensured through documentation and tests. Beyond these traditional software engineering practices, researchers and developers who want to try new ideas also need training scripts that perform on par with the results reported in the paper each implementation is based on. We list the performance of our implementations on our GitHub page, together with evaluation scripts that can be used to verify the scores on a local machine. All of the implementations perform on par with the originally reported scores.
Our implementations are as faithful as possible to the reference implementations. Wherever an implementation deviates from its reference, the change is noted in the documentation.
Currently, ChainerCV provides implementations of object detection and semantic segmentation models including Faster R-CNN, SSD and SegNet.

Easy-to-use implementation

Performing inference is very easy in ChainerCV.
ChainerCV hosts a number of trained weights that are downloaded automatically from the Internet at runtime, so users do not need to download them by hand or remember where the weight files are stored.
ChainerCV also provides a simple and unified interface for running inference with different models that solve the same task. For example, both Faster R-CNN and SSD have a method called “predict” that takes images and returns the results of object detection.
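
As a rough sketch of what this unified interface makes possible, the helper below works with any detector that exposes “predict”. It assumes that “predict” takes a list of images and returns three lists (bounding boxes, labels and scores) with one entry per input image. The helper name summarize_detections is hypothetical, and the class and weight names in the commented usage (FasterRCNNVGG16, SSD300, 'voc07', 'voc0712') are assumptions that may differ between ChainerCV versions, so please check the documentation before relying on them.

```python
def summarize_detections(model, imgs):
    """Run a detector on a list of images and summarize each result.

    Assumption: model.predict(imgs) returns (bboxes, labels, scores),
    where each element is a list holding one array per input image.
    """
    bboxes, labels, scores = model.predict(imgs)
    summary = []
    for bbox, label, score in zip(bboxes, labels, scores):
        summary.append({
            "n_objects": len(bbox),
            "labels": [int(l) for l in label],
            # None when the detector found nothing in this image.
            "best_score": float(max(score)) if len(score) > 0 else None,
        })
    return summary


# Hypothetical usage (requires chainercv; names may vary by version):
#
#   from chainercv.links import FasterRCNNVGG16, SSD300
#   from chainercv.utils import read_image
#
#   img = read_image('sample.jpg')  # 'sample.jpg' is a placeholder path
#   for model in (FasterRCNNVGG16(pretrained_model='voc07'),
#                 SSD300(pretrained_model='voc0712')):
#       print(summarize_detections(model, [img]))
```

Because the helper only relies on the shared method, swapping one detection model for another should not require changes to the surrounding code.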

Set of tools for research in computer vision

ChainerCV hosts a set of tools to conduct research in the field of computer vision including:

  • Dataset loaders for common vision datasets (e.g. the object detection dataset for PASCAL VOC).
  • Transforms that can be used for data preprocessing/augmentation.
  • Visualization.
  • Evaluation code for common metrics.
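
To give a feel for what a transform for object detection has to do, here is a minimal NumPy sketch of a horizontal flip that keeps an image and its bounding boxes consistent. It is not the library's own code. The function name flip_horizontal is hypothetical, and the sketch assumes images in (channel, height, width) layout and boxes stored as (y_min, x_min, y_max, x_max).

```python
import numpy as np


def flip_horizontal(img, bbox):
    """Flip an image left-right and move its bounding boxes accordingly.

    Assumptions: img has shape (C, H, W); bbox has shape (R, 4) with
    rows ordered as (y_min, x_min, y_max, x_max).
    """
    width = img.shape[2]
    flipped_img = img[:, :, ::-1]
    flipped_bbox = bbox.copy()
    # The old right edge becomes the new left edge, and vice versa.
    flipped_bbox[:, 1] = width - bbox[:, 3]
    flipped_bbox[:, 3] = width - bbox[:, 1]
    return flipped_img, flipped_bbox


img = np.arange(2 * 3 * 4, dtype=np.float32).reshape(2, 3, 4)
bbox = np.array([[0, 0, 2, 1]], dtype=np.float32)
flipped_img, flipped_bbox = flip_horizontal(img, bbox)
```

The point of the sketch is that augmentation for detection must transform the annotations together with the pixels, which is why ready-made transforms are useful.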

 

Sample visualization of the inference results of the implemented models


On the left is the inference output of an object detection model. Object detection is the task of localizing objects in an image. On the right are the input and output of a semantic segmentation model. Semantic segmentation is the task of dividing an image into regions and assigning a label to each of them.

Note on using ChainerCV

Data in computer vision can be represented in several different ways, for example in the color channel order of an image. It is important for users to be aware of the conventions adopted by the library they use. The full list of conventions used in ChainerCV can be found in the README page.
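
As an illustration of why such conventions matter, the sketch below converts an image from one common representation to another. It assumes the target convention is a float32 array in (channel, height, width) layout with RGB channel order, and that the input is a uint8 array in (height, width, channel) layout with BGR order, as some image libraries produce. Both assumptions and the function name hwc_bgr_to_chw_rgb are ours, so the README remains the authoritative reference.

```python
import numpy as np


def hwc_bgr_to_chw_rgb(img):
    """Convert an (H, W, 3) BGR image into a (3, H, W) RGB float32 array.

    Assumption: the input uses (height, width, channel) layout with
    BGR channel order; the output layout is the assumed target convention.
    """
    rgb = img[:, :, ::-1]          # reverse the channel axis: BGR -> RGB
    chw = rgb.transpose(2, 0, 1)   # move channels first: HWC -> CHW
    return np.ascontiguousarray(chw, dtype=np.float32)


# A 2x3 image that is pure blue in BGR order.
bgr = np.zeros((2, 3, 3), dtype=np.uint8)
bgr[:, :, 0] = 255
converted = hwc_bgr_to_chw_rgb(bgr)
```

Feeding a model an image in the wrong channel order or layout usually does not raise an error. It just degrades the predictions, which is why checking the convention up front is worthwhile.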

Future plans

  • We plan to support a wider range of tasks in the future. We will support classification soon.
  • We plan to provide trained weights for models trained on large datasets, which are difficult or very time-consuming to train in an environment with limited computing capacity.
  • A paper on ChainerCV has been accepted to the ACM Multimedia 2017 Open Source Software Competition (link). The paper will be released on arXiv soon.
