Tensor Core

Author: Internet

References:

https://forums.developer.nvidia.com/t/how-to-use-wmma-efficiently/157619/2

https://github.com/BigNerd95/CUDASamples/tree/master/samples/0_Simple/cudaTensorCoreGemm

(Setting WARPS_PER_BLOCK to 4 is enough to reach close to 100 TFLOPS. In general, 100 TFLOPS is already fairly good performance: about 80% of peak.)
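
To make the TFLOPS and percent-of-peak figures concrete, here is a minimal sketch of how such numbers are usually derived from a timed GEMM. It is not taken from the sample: the helper names, the matrix size, and the elapsed time are hypothetical, and the 125 TFLOPS peak is only what the note's own figures imply (100 TFLOPS at about 80% of peak), since the GPU model is not stated.

```python
def gemm_flops(m: int, n: int, k: int) -> int:
    """FLOPs for C = A @ B with A (m x k) and B (k x n).

    Each of the m*n outputs needs k multiplies and k adds.
    """
    return 2 * m * n * k


def achieved_tflops(m: int, n: int, k: int, elapsed_ms: float) -> float:
    """Throughput in TFLOPS for one GEMM that took elapsed_ms milliseconds."""
    seconds = elapsed_ms * 1e-3
    return gemm_flops(m, n, k) / seconds / 1e12


def fraction_of_peak(achieved: float, peak: float) -> float:
    """Achieved throughput as a fraction of the hardware peak (same units)."""
    return achieved / peak


if __name__ == "__main__":
    # Hypothetical measurement: a 4096 x 4096 x 4096 GEMM timed at 1.4 ms.
    m = n = k = 4096
    elapsed_ms = 1.4

    # Assumed peak, implied by the note: 100 TFLOPS / 0.8 = 125 TFLOPS.
    assumed_peak_tflops = 125.0

    tflops = achieved_tflops(m, n, k, elapsed_ms)
    share = fraction_of_peak(tflops, assumed_peak_tflops)
    print(f"{tflops:.1f} TFLOPS, {share:.0%} of assumed peak")
```

When comparing WARPS_PER_BLOCK settings, the elapsed time would normally come from CUDA event timing averaged over several kernel launches, so that launch overhead and warm-up do not distort the result.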

 

Source: https://www.cnblogs.com/simpleminds/p/16313068.html