・Turing Tensor Cores: FP16, INT8, INT4; Deep Learning Super Sampling (DLSS)
・Memory hierarchy: both L1 and L2 enlarged
  Turing : LOAD/STORE UNIT => L1 & SHARED MEM (64KB+32KB or 32KB+64KB) x2 => L2 (6MB) => GDDR6 (32bit x 12 = 384bit)
  Pascal : LOAD/STORE UNIT => L1 (24KB) x2 + SHARED MEM (96KB) x1 => L2 (3MB) => GDDR5X (32bit x 12 = 384bit)
・Second-Generation NVIDIA NVLink
・USB-C and VirtualLink
Deep Learning Features for Inference
Quote: Turing GPUs deliver exceptional inference performance. The Turing Tensor Cores, along with continual improvements in TensorRT (NVIDIA’s run-time inferencing framework), CUDA, and CuDNN libraries, enable Turing GPUs to deliver outstanding performance for inferencing applications. Turing Tensor Cores also add support for fast INT8 matrix operations to significantly accelerate inference throughput with minimal loss in accuracy. New low-precision INT4 matrix operations are now possible with Turing Tensor Cores and will enable research and development into sub 8-bit neural networks.
So Turing GPUs are geared toward inference, then. TensorRT takes care of the software side.
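To make the INT8 idea from the quote concrete, here is a minimal pure-Python sketch of symmetric INT8 quantization applied to a dot product. This is only an illustration of the numerics (scale to int8, multiply with a wide integer accumulator as the Tensor Cores do, then rescale); all function names and the scale values are my own, not from any NVIDIA API.

```python
# Illustrative sketch of INT8 quantized inference (pure Python, hypothetical
# helper names). Floats are mapped to int8 with a symmetric scale, multiplied
# with integer accumulation, then dequantized back to float.

def quantize(values, scale):
    """Map floats to int8 range with symmetric scaling, clamped to [-127, 127]."""
    return [max(-127, min(127, round(v / scale))) for v in values]

def int8_dot(a, b, scale_a, scale_b):
    """Dot product computed on int8 values, dequantized at the end."""
    qa = quantize(a, scale_a)
    qb = quantize(b, scale_b)
    acc = sum(x * y for x, y in zip(qa, qb))  # wide integer accumulator
    return acc * scale_a * scale_b

a = [0.5, -1.2, 0.8]
b = [1.0, 0.3, -0.7]
exact = sum(x * y for x, y in zip(a, b))
approx = int8_dot(a, b, scale_a=1.2 / 127, scale_b=1.0 / 127)
print(exact, approx)  # the INT8 result stays close to the FP result
```

The small gap between `exact` and `approx` is the "minimal loss in accuracy" the quote refers to; the payoff on Turing is that the int8 multiplies run far faster on the Tensor Cores than FP16/FP32 math.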