
Paper Reading: Ten Lessons From Three Generations Shaped Google's TPUv4i

Author: Internet


1. Common abbreviations in the paper

1) Domain Specific Architecture (DSA)

2) ICI (Inter-Core Interconnect): a custom chip-to-chip interconnect fabric

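3) P99 latency:

P99 latency is the 99th-percentile latency: over a given time window, 99% of all requests complete within this time. It is not an average; it bounds the latency seen by all but the slowest 1% of requests, which makes it a direct measure of server performance.

Likewise, P95 is the latency within which 95% of requests complete.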
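As an illustration of the definition above, here is a minimal sketch that computes P95 and P99 with the nearest-rank method. The function name `percentile_nearest_rank` and the sample latencies are assumptions made for this example; they do not come from the paper, and production monitoring systems may use other interpolation rules.

```python
import math


def percentile_nearest_rank(latencies_ms, p):
    """Return the p-th percentile (0 < p <= 100) using the nearest-rank method."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(latencies_ms)
    # Smallest rank such that at least p% of the samples are <= ordered[rank - 1].
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical window of 100 requests: 98 fast ones and 2 slow outliers.
latencies_ms = [10] * 98 + [1000] * 2

p95 = percentile_nearest_rank(latencies_ms, 95)
p99 = percentile_nearest_rank(latencies_ms, 99)
mean = sum(latencies_ms) / len(latencies_ms)

print(p95, p99, mean)
```

In this made-up sample the two slow requests barely move P95 but dominate P99, which is why tail percentiles rather than averages are used to describe latency.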

4) SLA/SLO/SLI

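SLA = Service Level Agreement: the agreement with users on the level of service to be delivered.

SLO = Service Level Objective: the target value set for a service level.

SLI = Service Level Indicator: the metric that is actually measured, such as P99 latency.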
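To make the relationship between the three terms concrete, the sketch below measures an SLI and checks it against an SLO. The 10 ms threshold, the objectives, and the function names `latency_sli` and `meets_slo` are hypothetical values chosen for illustration; an SLA would be the agreement built on top of such an objective.

```python
def latency_sli(latencies_ms, threshold_ms):
    """SLI: fraction of requests that finished within threshold_ms."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    good = sum(1 for latency in latencies_ms if latency <= threshold_ms)
    return good / len(latencies_ms)


def meets_slo(sli, objective):
    """SLO check: the measured indicator must reach the objective."""
    return sli >= objective


# Hypothetical window: 97 requests at 8 ms and 3 requests at 15 ms.
latencies_ms = [8.0] * 97 + [15.0] * 3

sli = latency_sli(latencies_ms, threshold_ms=10.0)
print(sli, meets_slo(sli, objective=0.99), meets_slo(sli, objective=0.95))
```

Here 97% of requests meet the threshold, so a 95% objective is satisfied while a 99% objective is missed.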

5) ISA(Instruction set architecture)

6) MLPerf benchmarks 0.5-0.7 :

7) High Bandwidth Memory (HBM)

8) VLIW (Very Long Instruction Word) architecture, as used in Itanium

9) compiler:

10) CMEM: Common Memory

11) Performance per TCO (total cost of ownership) vs. performance per CapEx

CapEx: capital expense, the purchase price of an item.

OpEx: operating expense, the cost of operation, including the electricity consumed and power provisioning (power distribution and cooling).

TCO (Total Cost of Ownership)

Standard accounting amortizes computer CapEx over 3-5 years, so for 3 years, with OpEx counted per year:

**TCO = CapEx + 3 × OpEx**

Google and most companies care more about performance/TCO of production apps (perf/TCO) than raw performance or performance/CapEx (perf/CapEx) of benchmarks.

Performance/mm² can look good even when it is bad for perf/TCO.

TDP (Thermal Design Power)

Overall, TCO and TDP are positively correlated. When TCO is not available, TDP can be used in its place.
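The sketch below applies the TCO formula above to two chips and shows how a ranking by perf/CapEx can differ from a ranking by perf/TCO. All numbers and the chip names "A" and "B" are invented for illustration and are not taken from the paper; OpEx is assumed to be an annual figure.

```python
def tco(capex, annual_opex, years=3):
    """Total cost of ownership: CapEx plus OpEx accumulated over the amortization period."""
    return capex + years * annual_opex


# Hypothetical chips: relative performance, purchase price, yearly operating cost.
chips = {
    "A": {"perf": 100, "capex": 1000, "annual_opex": 400},
    "B": {"perf": 90, "capex": 1000, "annual_opex": 200},
}


def perf_per_capex(chip):
    return chip["perf"] / chip["capex"]


def perf_per_tco(chip):
    return chip["perf"] / tco(chip["capex"], chip["annual_opex"])


best_by_capex = max(chips, key=lambda name: perf_per_capex(chips[name]))
best_by_tco = max(chips, key=lambda name: perf_per_tco(chips[name]))

print(best_by_capex, best_by_tco)
```

With these assumed numbers, chip A is faster for the same purchase price, yet chip B comes out ahead once three years of operating cost are included, which illustrates why the two metrics can disagree.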

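2. Summary

As the title suggests, the paper first describes the overall architecture of three product generations, TPUv1, TPUv2, and TPUv3, and distills from them ten lessons for TPU/GPU design:

It then introduces TPUv4i, the product shaped by these lessons, and the techniques it adopts, such as

Finally, the paper discusses how to choose parameters and performance evaluation criteria.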

Source: https://blog.csdn.net/weixin_44544687/article/details/118696754