Vengineer's Musings (preparation period)

Life is short, but it is also long. Enjoy it!

Inside NVIDIA Turing



 ・Turing Tensor Cores
  FP16, INT8, and INT4 precision
  Deep Learning Super Sampling (DLSS)

 ・Memory hierarchy: both L1 and L2 enlarged
  Turing : LOAD/STORE UNIT => L1 & SHARED MEM (64KB+32KB or 32KB+64KB) x2 => L2 (6MB) => GDDR6  (32bit x12 = 384bits)
  Pascal : LOAD/STORE UNIT => L1 (24KB) x2 + SHARED MEM (96KB) x1         => L2 (3MB) => GDDR5X (32bit x12 = 384bits)

 ・Second-Generation NVIDIA NVLink

 ・USB-C and VirtualLink
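
The 384-bit GDDR6 bus above translates directly into peak memory bandwidth. A minimal sketch of that arithmetic, assuming a 14 Gbps per-pin data rate (typical of launch-era Turing boards; the post itself only gives the bus width):

```python
# Peak bandwidth = bus width (in bytes) x per-pin data rate.
bus_bits = 32 * 12        # twelve 32-bit GDDR6 channels = 384 bits (from the post)
data_rate_gbps = 14       # per-pin transfer rate in Gbps -- assumed, not from the post

bandwidth_gb_s = bus_bits / 8 * data_rate_gbps
print(f"Peak bandwidth: {bandwidth_gb_s:.0f} GB/s")  # -> 672 GB/s
```

At 14 Gbps this comes out to 672 GB/s; boards with faster or slower memory scale linearly with the data rate.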

Deep Learning Features for Inference


Quote
  Turing GPUs deliver exceptional inference performance. 

  The Turing Tensor Cores, along with continual improvements in TensorRT 
    (NVIDIA’s run-time inferencing framework), CUDA, and CuDNN libraries, 
    enable Turing GPUs to deliver outstanding performance for inferencing applications. 

  Turing Tensor Cores also add support for fast INT8 matrix operations 
    to significantly accelerate inference throughput with minimal loss in accuracy. 

  New low-precision INT4 matrix operations are now possible with Turing Tensor Cores 
    and will enable research and development into sub 8-bit neural networks.

So Turing GPUs are aimed at inference, then. TensorRT takes care of the support.
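
The "fast INT8 matrix operations ... with minimal loss in accuracy" quoted above can be illustrated in NumPy. This is a sketch of symmetric per-tensor quantization, not TensorRT's actual calibration scheme: quantize two FP32 matrices to INT8, multiply them with INT32 accumulation (as the Tensor Cores do), rescale, and compare against the FP32 result.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((64, 64)).astype(np.float32)
b = rng.standard_normal((64, 64)).astype(np.float32)

def quantize(x, bits=8):
    """Symmetric per-tensor quantization to a signed integer grid."""
    qmax = 2 ** (bits - 1) - 1                    # 127 for INT8, 7 for INT4
    scale = float(np.abs(x).max()) / qmax
    q = np.clip(np.round(x / scale), -qmax, qmax).astype(np.int32)
    return q, scale

qa, sa = quantize(a)
qb, sb = quantize(b)

ref = a @ b                                       # FP32 reference
approx = (qa @ qb).astype(np.float32) * (sa * sb) # INT8 matmul, INT32 accumulate, rescale

rel_err = np.abs(approx - ref).max() / np.abs(ref).max()
print(f"INT8 max relative error: {rel_err:.4f}")
```

Swapping `bits=4` into the two `quantize` calls shows why the quote frames INT4 as a research direction: the error grows noticeably as the grid coarsens.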