Vengineer's Musings (preparation period)

Life is short, but it is also long. Enjoy it!

Inside NVIDIA Turing



 ・Turing Tensor Cores
  FP16, INT8, and INT4 precision
  Deep Learning Super Sampling (DLSS)

 ・Memory hierarchy: both L1 and L2 enlarged
  Turing : LOAD/STORE UNIT => L1 & SHARED MEM (64KB+32KB or 32KB+64KB) x2 => L2 (6MB) => GDDR6  (32bit x12 = 384bits)
  Pascal : LOAD/STORE UNIT => L1 (24KB) x2 + SHARED MEM (96KB) x1         => L2 (3MB) => GDDR5X (32bit x12 = 384bits)

 ・Second-Generation NVIDIA NVLink

 ・USB-C and VirtualLink
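
The 384-bit GDDR6 bus above translates directly into peak memory bandwidth. A minimal sketch of that arithmetic, assuming a 14 Gbps per-pin data rate (typical of launch-era Turing boards; the post itself only gives the bus width):

```python
# Peak bandwidth = bus width (in bytes) x per-pin data rate.
bus_bits = 32 * 12        # twelve 32-bit GDDR6 channels = 384 bits (from the post)
data_rate_gbps = 14       # per-pin transfer rate in Gbps -- assumed, not from the post

bandwidth_gb_s = bus_bits / 8 * data_rate_gbps
print(f"Peak bandwidth: {bandwidth_gb_s:.0f} GB/s")  # -> 672 GB/s
```

At 14 Gbps this comes out to 672 GB/s; boards with faster or slower memory scale linearly with the data rate.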

Deep Learning Features for Inference


Quote
  Turing GPUs deliver exceptional inference performance. 

  The Turing Tensor Cores, along with continual improvements in TensorRT 
    (NVIDIA’s run-time inferencing framework), CUDA, and CuDNN libraries, 
    enable Turing GPUs to deliver outstanding performance for inferencing applications. 

  Turing Tensor Cores also add support for fast INT8 matrix operations 
    to significantly accelerate inference throughput with minimal loss in accuracy. 

  New low-precision INT4 matrix operations are now possible with Turing Tensor Cores 
    and will enable research and development into sub 8-bit neural networks.

So Turing GPUs are aimed at inference, then. TensorRT takes care of the support.
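
The "fast INT8 matrix operations ... with minimal loss in accuracy" quoted above can be illustrated in NumPy. This is a sketch of symmetric per-tensor quantization, not TensorRT's actual calibration scheme: quantize two FP32 matrices to INT8, multiply them with INT32 accumulation (as the Tensor Cores do), rescale, and compare against the FP32 result.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((64, 64)).astype(np.float32)
b = rng.standard_normal((64, 64)).astype(np.float32)

def quantize(x, bits=8):
    """Symmetric per-tensor quantization to a signed integer grid."""
    qmax = 2 ** (bits - 1) - 1                    # 127 for INT8, 7 for INT4
    scale = float(np.abs(x).max()) / qmax
    q = np.clip(np.round(x / scale), -qmax, qmax).astype(np.int32)
    return q, scale

qa, sa = quantize(a)
qb, sb = quantize(b)

ref = a @ b                                       # FP32 reference
approx = (qa @ qb).astype(np.float32) * (sa * sb) # INT8 matmul, INT32 accumulate, rescale

rel_err = np.abs(approx - ref).max() / np.abs(ref).max()
print(f"INT8 max relative error: {rel_err:.4f}")
```

Swapping `bits=4` into the two `quantize` calls shows why the quote frames INT4 as a research direction: the error grows noticeably as the grid coarsens.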