Experimental toolchain to compile and run Chainer models

Area

Chip

Machine Learning / Deep Learning

Tag

# Uncategorized

Shinichiro Hamaji

Engineer

Hello, my name is Shinichiro Hamaji, and I am an engineer at Preferred Networks. Today I would like to introduce an experimental project named chainer-compiler. Although it is not ready for end users, we are making it publicly available in the hope that others may find it interesting or useful for research purposes in its current state.

https://github.com/pfnet-research/chainer-compiler

Late last year, Preferred Networks released a beta version of ChainerX. The three goals of ChainerX were

Optimize the speed of running models
Make models deployable to environments without Python
Make it easier to port models to non-CPU/GPU environments

while keeping the flexibility of Chainer. The goal of chainer-compiler is to go further with ChainerX. Currently, it has the following three main components:

1. Translate the Python AST to extract computation graphs in an extended ONNX format.
2. Modify the extended ONNX graph for optimization, automatic differentiation, etc., and then generate deployable code.
3. Run the exported code with ChainerX’s C++ API.
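To make the first component more concrete, here is a minimal sketch of what extracting a computation graph from a Python AST can look like. It uses only Python's standard `ast` module and is not chainer-compiler's actual implementation: the function `extract_graph`, the `BINOPS` table, and the `(op, inputs, output)` node layout are hypothetical names chosen for illustration, and the op names merely imitate ONNX naming.

```python
import ast

# Hypothetical mapping from Python AST operators to ONNX-style op names.
BINOPS = {ast.Add: "Add", ast.Sub: "Sub", ast.Mult: "Mul", ast.Div: "Div"}


def extract_graph(source):
    """Return a list of (op, inputs, output) nodes for a one-function module."""
    func = ast.parse(source).body[0]
    nodes = []
    counter = 0

    def visit(expr):
        # Recursively lower an expression and return the name of its value.
        nonlocal counter
        if isinstance(expr, ast.Name):
            return expr.id
        if isinstance(expr, ast.Constant):
            return repr(expr.value)
        if isinstance(expr, ast.BinOp) and type(expr.op) in BINOPS:
            lhs = visit(expr.left)
            rhs = visit(expr.right)
            out = f"t{counter}"
            counter += 1
            nodes.append((BINOPS[type(expr.op)], [lhs, rhs], out))
            return out
        raise NotImplementedError(ast.dump(expr))

    for stmt in func.body:
        if isinstance(stmt, ast.Assign):
            value = visit(stmt.value)
            nodes.append(("Identity", [value], stmt.targets[0].id))
        elif isinstance(stmt, ast.Return):
            nodes.append(("Output", [visit(stmt.value)], None))
        else:
            raise NotImplementedError(ast.dump(stmt))
    return nodes


SOURCE = """
def forward(x, w, b):
    h = x * w + b
    return h * 2
"""

if __name__ == "__main__":
    for node in extract_graph(SOURCE):
        print(node)
```

The point of the sketch is that the graph is obtained by reading the source of `forward` without ever running it, so the whole function is visible to later passes at once.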

Here are expected use-cases of this project:

Unlike the imperative (define-by-run) model of Chainer/CuPy/ChainerX, step 1 extracts a computation graph containing multiple operations, which makes it possible to apply inter-operation optimizations such as operation fusion.
By running steps 1 and 2 on the host machine and deploying only step 3, you can easily deploy your model to Python-free environments.
If you add targets to the code generator, your model can run on an optimized model executor or on domain-specific chips such as MN-Core.
By using steps 2 and 3, you can run ONNX models generated by other tools such as ONNX-chainer.
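As an illustration of the operation fusion mentioned in the first use case, the following toy pass merges chains of elementwise operations into a single node once a whole graph is available. This is a simplified sketch under assumptions, not the optimization chainer-compiler performs: `fuse_elementwise`, the `ELEMENTWISE` set, and the `(op, inputs, output)` node layout are hypothetical, and the pass only looks at directly adjacent nodes.

```python
# Hypothetical set of ops treated as elementwise for this sketch.
ELEMENTWISE = {"Add", "Mul", "Relu"}


def fuse_elementwise(nodes):
    """Merge chains of adjacent elementwise nodes into single fused nodes.

    Each node is (op, inputs, output). A node joins the previous group when
    both are elementwise, the node reads the group's output, and that output
    is not read anywhere else in the graph.
    """
    # Count how many times each value is consumed.
    uses = {}
    for _, inputs, _ in nodes:
        for name in inputs:
            uses[name] = uses.get(name, 0) + 1

    groups = []
    for op, inputs, output in nodes:
        prev = groups[-1] if groups else None
        can_merge = (
            prev is not None
            and op in ELEMENTWISE
            and prev["elementwise"]
            and prev["output"] in inputs
            and uses.get(prev["output"], 0) == 1
        )
        if can_merge:
            prev["ops"].append(op)
            # Keep only external inputs; the intermediate value disappears.
            for name in inputs:
                if name != prev["output"] and name not in prev["inputs"]:
                    prev["inputs"].append(name)
            prev["output"] = output
        else:
            groups.append({
                "ops": [op],
                "inputs": list(inputs),
                "output": output,
                "elementwise": op in ELEMENTWISE,
            })
    return [("+".join(g["ops"]), g["inputs"], g["output"]) for g in groups]


GRAPH = [
    ("MatMul", ["x", "w"], "t0"),
    ("Add", ["t0", "b"], "t1"),
    ("Relu", ["t1"], "t2"),
    ("Mul", ["t2", "s"], "y"),
]

if __name__ == "__main__":
    for node in fuse_elementwise(GRAPH):
        print(node)
```

In a define-by-run setting each of `Add`, `Relu`, and `Mul` would be dispatched separately as it is executed; with the graph in hand, the three can be handed to a backend as one unit with no intermediate values stored between them.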

Other than the above, we would like to continue conducting experimental research.

As in other areas of deep learning, many groups are competing to build deep learning compilers. Each has different strengths and focuses, which in my opinion makes research on deep learning compilers very interesting. In this project, we are trying not to sacrifice the flexibility of Chainer. For that reason, the toolchain does not assume that the model is static: it can handle tensors without static dimensions, Python control-flow primitives, and Python lists. This could be one of the unique strengths of this toolchain.
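To show the kind of model code this refers to, here is a small sketch of a forward function whose loop length depends on an argument and which builds a Python list, together with a helper that reports which of these dynamic constructs appear in its AST. The `forward` function and the `dynamic_constructs` helper are hypothetical examples written for this article's illustration; they use only the standard `ast` module and do not reflect chainer-compiler's internals.

```python
import ast

# A hypothetical forward function with a data-dependent loop, a branch,
# and a Python list, i.e. a model that is not a static graph.
SOURCE = """
def forward(xs, n_steps):
    hs = []
    h = xs[0]
    for i in range(n_steps):
        if i % 2 == 0:
            h = h + xs[i]
        hs.append(h)
    return hs
"""


def dynamic_constructs(source):
    """Return the sorted labels of control-flow and list nodes in source."""
    kinds = {ast.For: "for", ast.While: "while", ast.If: "if", ast.List: "list"}
    found = set()
    for node in ast.walk(ast.parse(source)):
        label = kinds.get(type(node))
        if label is not None:
            found.add(label)
    return sorted(found)


if __name__ == "__main__":
    print(dynamic_constructs(SOURCE))
```

Because the loop and the branch are present as nodes in the AST, a translator working at this level can keep them as control-flow operations in the graph instead of fixing the number of iterations in advance.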

In this article, we have introduced chainer-compiler, an experimental project that compiles and runs Chainer models. We still have a huge number of TODOs, but they are challenging and fun to work on. If you are interested in working with us, please consider applying to Preferred Networks. Any questions or feedback would be greatly appreciated.

Lastly, I would like to thank everyone who helped us. I would especially like to thank Sato-san, an intern who implemented the Python-to-ONNX compiler.
