@Vengineer's Ramblings : Twitter
Welcome to the world of SystemVerilog. It all began with the release of SystemC v0.9.
I learned about Google's XNNPACK from this tweet.
Exciting to see XNNPack open-sourced: https://t.co/fHKBsQ42Hf - it's a library of optimized floating point arithmetic operations to speed up deep learning calculations on Arm and x86. We're already using it in @TensorFlow Lite, I'd love to see it more widely adopted!
Pete Warden (@petewarden), October 4, 2019
XNNPACK is a highly optimized library of floating-point neural network inference operators for ARM, WebAssembly, and x86 (SSE2 level) platforms. XNNPACK is not intended for direct use by deep learning practitioners and researchers; instead it provides low-level performance primitives for accelerating high-level machine learning frameworks, such as MediaPipe, TensorFlow Lite, and TensorFlow.js.
Supported architectures:
- ARM64 on Android and Linux
- ARM on Android
- WebAssembly MVP
- WebAssembly SIMD (experimental)
- x86 and x86-64 (up to SSE2 only) on Android and Linux
XNNPACK is based on the QNNPACK library. However, unlike QNNPACK, XNNPACK focuses entirely on floating-point operators, and its API is no longer compatible with QNNPACK.
Marat Dukhan, "The Indirect Convolution Algorithm". Presented at the Efficient Deep Learning for Computer Vision (ECV) 2019 workshop.
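Sample models are provided for mobilenet-v1 and v2.
Pete Warden, who posted the tweet, works on TensorFlow Lite at Google, in particular the version for microcontrollers. I had long wondered why that was his area, and now I understand.
The answer was in the README.md on GitHub for PyVideoCore, a library that lets you program the Raspberry Pi's GPU in Python.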
Several QPU assemblers are written by pioneers (hermanhermitage, petewarden, elorimer and so on). There is also an implementation of OpenCL for QPU: VC4CL.
There he is: petewarden...
Looking at the top of Pete Warden's Twitter profile, I see that he was the CTO of Jetpac, which was acquired by Google. I think the person on the left in the photo below is Pete Warden.