Vengineerの妄想(準備期間)

人生は短いけど、長いです。人生を楽しみましょう!

N2D2




Page.7-8
引用
1.5 Hardware exports Once the learned DNN recognition rate performances are satisfying, an optimized version of the network can be automatically exported for various embedded targets. An automated network computation performances benchmarking can also be performed among different targets. The following targets are currently supported by the toolflow:

 • Plain C code (no dynamic memory allocation, no floating point processing); 
 • C code accelerated with OpenMP; 
 • C code tailored for High-Level Synthesis (HLS) with Xilinx® Vivado® HLS; 
   Direct synthesis to FPGA, with timing and utilization after routing; 
   Possibility to constrain the maximum number of clock cycles desired 
   to compute the whole network; 
   FPGA utilization vs number of clock cycle trade-off analysis
 • OpenCL code optimized for either CPU/DSP or GPU; 
 • Cuda kernels and cuDNN code optimized for NVIDIA® GPUs.

いろいろなターゲット用のコードを出力できるのね。

FPGAでは、やっぱり、Vivado HLSなのね。