Word is TensorFlow got way faster on the Apple M1.

@Vengineer's Ramblings : Twitter
Welcome to the world of SystemVerilog; it all began with the release of SystemC v0.9

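According to the tweet below from TensorFlow's Twitter account, a TensorFlow 2.4 optimized for Macs with the Apple M1 is now available.

On the Apple M1, they say performance improves by up to 7x.

Wait, does it really get that much faster?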

⚡️ Accelerating TensorFlow 2 performance on Mac @Apple’s new Mac-optimized TensorFlow 2.4 fork lets you speed up training on Macs, resulting in up to 7x faster performance on platforms with the new M1 chip!

Learn how ↓ https://t.co/ggaQeduCWl
— TensorFlow (@TensorFlow) November 18, 2020

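Taking TensorFlow 2.3 on the Intel MacBook Pro 2020 as the baseline, the accelerated TensorFlow 2.4 is a little faster even on the same Intel MacBook Pro 2020, but on the MacBook Pro 2020 (M1) it is way faster.

There is also a comparison on a Mac Pro (Intel) with an AMD GPU, and there too, switching to the accelerated TensorFlow 2.4 makes it way faster.

So that means the speedup comes from using the GPU.

There is no chart comparing the Mac Pro (Intel) with the M1, presumably because the Mac Pro (Intel) is faster. As you'd expect, there is still a performance gap between a discrete GPU and an integrated one.

Even so, the gap is only about a factor of two, so the pricey Mac Pro looks set to lose a lot of its value.

Eventually, once a beefed-up successor to the M1 (an M2 or whatever) comes out, the Mac Pro will probably move over to Apple Silicon too.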

So, how did they accelerate it?

Looking at the README.md in the GitHub repository below,

There is an optional mlcompute.set_mlc_device(device_name='any') API for ML Compute device selection. The default value for device_name is 'any', which means ML Compute will select the best available device on your system, including multiple GPUs on multi-GPU configurations.

is what it says.

Until now it ran on the CPU, and now that it can run on the GPU as well, it got faster.
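To make the README's device selection a bit more concrete, here is a minimal sketch of how that call might be wrapped. The import path and the 'cpu' / 'gpu' / 'any' options are as I recall them from the fork's README, so treat them as assumptions; select_mlc_device is a hypothetical helper of my own, not part of the fork. On a stock TensorFlow (or with no TensorFlow at all) the mlcompute module won't exist, so the helper just reports that and does nothing.

```python
VALID_DEVICES = ("cpu", "gpu", "any")  # options as listed in the fork's README (assumption)


def select_mlc_device(device_name="any"):
    """Ask ML Compute to use the given device.

    Returns True if the Mac-optimized fork's API was found and called,
    False if this Python environment has no mlcompute module.
    This helper is hypothetical; only set_mlc_device comes from the README.
    """
    if device_name not in VALID_DEVICES:
        raise ValueError(f"device_name must be one of {VALID_DEVICES}")
    try:
        # Assumed import path; the module only exists in Apple's
        # Mac-optimized TensorFlow 2.4 fork.
        from tensorflow.python.compiler.mlcompute import mlcompute
    except ImportError:
        return False
    mlcompute.set_mlc_device(device_name=device_name)
    return True


if __name__ == "__main__":
    # 'any' is the documented default: ML Compute picks the best device.
    print("ML Compute device set:", select_mlc_device("any"))
```

Passing 'cpu' or 'gpu' instead of 'any' would be one way to check for yourself how much of the speedup really comes from the GPU.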

github.com

Watching this video, it looks like using Core ML lets you make use of the CPU, GPU, and NPU.

This video explains all sorts of things.

import CoreML

let config = MLModelConfiguration()
config.computeUnits = .all  // allow CPU, GPU, and Neural Engine

With this set, it apparently runs on the Neural Engine. https://t.co/Ie94Ma1gGX
— Vengineer＠ (@Vengineer) November 18, 2020

Ordinary TensorFlow training uses float, so the NPU can't be used, but if you went to 8-bit, I suppose the NPU would get used too?
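For what "going to 8-bit" means in practice, here is a rough sketch of symmetric int8 quantization in plain Python. This is only an illustration of the general idea, not what TensorFlow, ML Compute, or Core ML actually do internally, and the function names are hypothetical.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(v) for v in values)
    # The largest magnitude maps to 127; fall back to 1.0 for all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate floats from the 8-bit values."""
    return [q * scale for q in quantized]


weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
print(q, restored)
```

Each restored value is off from the original by at most about half a scale step, which is the precision you give up in exchange for 8-bit storage and arithmetic.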

Bonus: it looks like the Mac Pro is going to Apple Silicon as well.

But Apple has already confirmed that it plans to move the entire lineup to Apple Silicon within a couple of years—including big performers like the Mac Pro.

is what it says. https://t.co/GZ9r9XPnU3
— Vengineer＠ (@Vengineer) November 21, 2020