The manual is published as a PDF. It is 60 pages long.
Pages 7-8
Quote:
1.5 Hardware exports
Once the learned DNN recognition rate performances are satisfying, an optimized version of the network can be automatically exported for various embedded targets. An automated network computation performances benchmarking can also be performed among different targets.
The following targets are currently supported by the toolflow:
• Plain C code (no dynamic memory allocation, no floating point processing);
• C code accelerated with OpenMP;
• C code tailored for High-Level Synthesis (HLS) with Xilinx® Vivado® HLS;
  - Direct synthesis to FPGA, with timing and utilization after routing;
  - Possibility to constrain the maximum number of clock cycles desired to compute the whole network;
  - FPGA utilization vs number of clock cycle trade-off analysis
• OpenCL code optimized for either CPU/DSP or GPU;
• Cuda kernels and cuDNN code optimized for NVIDIA® GPUs.
So it can output code for a variety of targets.
And for FPGAs, as expected, it goes through Vivado HLS.
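To get a feel for what "plain C code (no dynamic memory allocation, no floating point processing)" might mean in practice, here is a minimal sketch I wrote myself. It is not the tool's actual export output. The layer sizes, the Q8 fixed-point format, and the names `fc_relu_q8`, `weights`, and `biases` are all assumptions made up for illustration.

```c
#include <stdint.h>

/* Hypothetical layer sizes and fixed-point format, chosen for illustration. */
#define N_IN      4
#define N_OUT     2
#define FRAC_BITS 8   /* Q8: the integer 256 represents 1.0 */

/* Weights and biases live in static constant arrays: no malloc anywhere. */
static const int16_t weights[N_OUT][N_IN] = {
    {  128, -64,  32, 256 },   /*  0.5, -0.25, 0.125, 1.0 */
    { -256,  64, 128,   0 }    /* -1.0,  0.25, 0.5,   0.0 */
};
static const int16_t biases[N_OUT] = { 64, -32 };   /* 0.25, -0.125 */

/* Fully connected layer followed by ReLU, using integer arithmetic only. */
void fc_relu_q8(const int16_t in[N_IN], int16_t out[N_OUT])
{
    for (int o = 0; o < N_OUT; ++o) {
        /* Products carry 2*FRAC_BITS fractional bits, so the bias is
           scaled up to match before accumulating in 32 bits. */
        int32_t acc = (int32_t)biases[o] * (1 << FRAC_BITS);
        for (int i = 0; i < N_IN; ++i) {
            acc += (int32_t)weights[o][i] * in[i];
        }
        /* Rescale back to Q8 (division truncates toward zero). */
        acc /= (1 << FRAC_BITS);
        /* ReLU, then saturate to the int16_t output range. */
        if (acc < 0) acc = 0;
        if (acc > INT16_MAX) acc = INT16_MAX;
        out[o] = (int16_t)acc;
    }
}
```

With every input set to 256 (1.0 in Q8), the first output comes out as 416 (1.625 in Q8) and the second is clamped to 0 by the ReLU. Everything is sized at compile time and computed with integers, which is the kind of code that fits small embedded targets without an FPU or a heap.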