TENSOR CORE PERFORMANCE: THE ULTIMATE GUIDE

Author: Internet

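1. An interesting point: throughput (TFLOPS) is better when the batch size is a multiple of 108, because the A100 has 108 SMs (the units that contain the Tensor Cores). Work that divides evenly across all 108 SMs leaves none of them idle in the final wave.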
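The effect can be illustrated with a little arithmetic. The following is a minimal sketch, not taken from the referenced slides: it assumes the work is split into equal-cost tiles, that each SM processes one tile at a time, and that the tile count scales with the batch size. The function name `wave_efficiency` is hypothetical and chosen only for this illustration.

```python
import math

A100_SM_COUNT = 108  # from the text: the A100 has 108 SMs


def wave_efficiency(num_tiles: int, num_sms: int = A100_SM_COUNT) -> float:
    """Fraction of SM slots doing useful work.

    Assumption (for illustration only): one equal-cost tile per SM per wave.
    """
    if num_tiles <= 0:
        raise ValueError("num_tiles must be positive")
    # Number of waves needed; the last wave may be only partly full.
    waves = math.ceil(num_tiles / num_sms)
    # Useful tiles divided by the total SM slots occupied across all waves.
    return num_tiles / (waves * num_sms)


if __name__ == "__main__":
    for tiles in (108, 109, 216, 256):
        print(f"tiles={tiles:4d}  efficiency={wave_efficiency(tiles):.3f}")
```

Under these assumptions, a tile count that is a multiple of 108 fills every wave completely, while a single extra tile forces a whole additional wave in which 107 SMs sit idle.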

See the reference.

Reference:

https://developer.download.nvidia.cn/video/gputechconf/gtc/2020/presentations/s21929-tensor-core-performance-on-nvidia-gpus-the-ultimate-guide.pdf

 

Source: https://www.cnblogs.com/simpleminds/p/16334552.html