




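High-Performance Multi-core and Many-core Processor Chip Technology Development
Professor Li Sanli (李三立), Tsinghua University

Introduction
- Processors are always a key driving force of computer technology and the computer industry.
- Petaflops-class (10^15 operations per second) high-performance computers cannot advance further without progress in multi-core and many-core chips; most new techniques in computer architecture appear in high-performance multi-core and many-core chips.
- We should follow the development of high-performance computing closely. In today's computer architecture the whole "system" is built "on the chip" (SOC).
- Teachers and students of the "Computer Organization" and "Computer Architecture" courses in our computer science schools should add this material to teaching and study, and teachers should give more weight to this direction when applying to the Natural Science Foundation and other research funds.
- Young teachers and students are encouraged to take an interest in this field and to raise the level of China's processor chip technology.

CPUs of China's 10-petaflops supercomputers are expected to be entirely domestic
- The world-leading Tianhe-1 ("天河一号") supercomputer system uses the FeiTeng-1000 ("飞腾-1000") high-performance multi-core microprocessor.
- Tianhe-1: peak speed of 4,700 trillion operations per second (4.7 petaflops) and sustained speed of 2,566 trillion operations per second (2.566 petaflops); 1,000 trillion operations per second = 1 petaflops.
- Source: Huanqiu.com (环球网) report dated 2019-3-8 on remarks by Zhang Yulin, president of the National University of Defense Technology.

China's petaflops-class Tianhe-1 supercomputer ranked first in the world TOP500; President Obama mentioned it specifically.

[Photo: plug-in board of Tianhe-1, first in the world TOP500]

Outline
1. The need for multi-core and many-core processor chip technology
2. Development of multi-core and many-core processor chips
3. Heterogeneous multi-core and many-core chips
4. Development of on-chip (SOC) interconnection networks
5. Further development of microelectronic process technology
6. Outlook for future exaFlops high-performance computer chips
7. Conclusions

(I) The need for multi-core and many-core processor chip technology

Application requirements of high-performance computing
[Figure, courtesy of Erik P. DeBenedictis: system performance (100 Teraflops up to 1 Zettaflops, with 10 petaflops, 1 exaflops and 10 exaflops marked) plotted against year (2000 to 2020, plus "no schedule provided by source"). Applications shown: plasma fusion simulation [Jardin 03]; full global climate [Malone 03]; "compute as fast as the engineer can think" [NASA 99], with 100x and 1000x steps [SCaLeS 03]; geodata Earth station range [NASA 02]; simulation of medium (µs scale), large (ms scale) and more complex biomolecular structures; protein folding; 50 TFLOPS, 250 TFLOPS and 1 PFLOPS milestones [HEC04].]

References for the figure:
[Jardin 03] S. C. Jardin, "Plasma Science Contribution to the SCaLeS Report," Princeton Plasma Physics Laboratory, PPPL-3879 UC-70, available on Internet.
[Malone 03] Robert C. Malone, John B. Drake, Philip W. Jones, Douglas A. Rotman, "High-End Computing in Climate Modeling," contribution to SCaLeS report.
[NASA 99] R. T. Biedron, P. Mehrotra, M. L. Nelson, F. S. Preston, J. J. Rehder, J. L. Rogers, D. H. Rudy, J. Sobieski, and O. O. Storaasli, "Compute as Fast as the Engineers Can Think!" NASA/TM-1999-209715, available on Internet.
[NASA 02] NASA Goddard Space Flight Center, "Advanced Weather Prediction Technologies: NASA's Contribution to the Operational Agencies," available on Internet.
[SCaLeS 03] Workshop on the Science Case for Large-scale Simulation, June 24-25, proceedings on Internet at /scales/.
[DeBenedictis 04] Erik P. DeBenedictis, "Matching Supercomputing to Progress in Science," July 2004. Presentation at Lawrence Berkeley National Laboratory, also published as Sandia National Laboratories SAND report SAND2004-3333P. Sandia technical reports are available through the Sandia technical library.
[HEC04] Federal Plan for High-End Computing, May 2004.

Growth in transistor count (Intel)
[Figure: transistor count rising to 32 billion transistors]

On-chip clock frequency cannot keep rising: because of power consumption, frequency scaling has stalled.

[Figure: illustration of the heat generated by power consumption]

Water cooling and air cooling of CPUs
[Figure: a water-cooling system and an air-cooling system]

Resolving the conflict between growing power consumption and growing transistor count
Solutions: new manufacturing materials; new cooling technology; multi-core and many-core architectures.

Impact of multi-core and many-core development on performance
[Figure: performance against year, showing three years of multi-core change; Intel concentrates on the PC market]

Architecture progress: single core, multi-core, many-core and on-chip interconnect
- Single core with increased performance: Pentium (1993), Pentium MMX, Pentium II, Pentium III, Tualatin, Pentium 4 Northwood.
- Multicore processors with more and more cores: Pentium D, Core 2 Duo (Conroe), Core 2 Quad (Kentsfield), TeraScale 80-core prototype.
- Key for multicore: interconnection.

Internal structure of an AMD general-purpose single core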
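[Figure labels: Fetch and Branch Prediction; 64KB L1 Icache and 64KB L1 Dcache; Scan/Align/Decode; Fastpath and Microcode Engine producing µops; Instruction Control Unit (72 entries); Int Decode & Rename with three reservation stations (Res), each feeding an AGU and an ALU, plus MULT; FP Decode & Rename with a 36-entry FP scheduler feeding FADD, FMUL and FMISC; 44-entry Load/Store Queue. Annotations: instruction fetch, branch prediction, microcode, hard-wired logic, micro-operations, data cache, instruction cache.]

Layout of the AMD dual-core chip
- Dual-core AMD Opteron processor: 199 mm2, 90 nm process
- Single-core AMD Opteron processor: 193 mm2, 130 nm process

Multi-core architecture of the AMD Opteron
[Figure]

Intel's multi-core and many-core roadmap
[Figure: number of cores (1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024) against year (2004 to 2020).
Commercial path: Pentium D; Core Duo; Core 2 Duo (Conroe, Allendale, Wolfdale, Merom, Penryn); Core 2 Quad (Kentsfield, Yorkfield); Nehalem; Core i7; Sandy Bridge.
Research path: Polaris TeraScale (80 cores / 80 threads); Single-chip Cloud Computer (48 cores / 48 threads); Knights Corner (50 cores / 200 threads).]

Intel's Nehalem multi-core structure
- It will include a graphics core.
- QuickPath interface.

Layout of Intel's quad-core Nehalem chip
[Figure: two QuickPath links, 96 GB/s each]

Hierarchical memory structure of the Intel Nehalem multi-core processor
- Each CPU core: 32KB L1 D$, 32KB L1 I$, 256KB L2$
- 8MB shared L3$
- 4 to 8 cores
- DDR3 DRAM memory controllers: each DRAM channel is 64/72 bits wide at up to 1.33 Gb/s
- QuickPath system interconnect (QPI): each direction is 20 bits at 6.4 Gb/s
- QPI is a key feature.

Single-core structure of the general-purpose Intel Nehalem
[Figure labels: prefetch buffer, predecode, instruction queue, alignment, branch prediction, loop stream, decode, QuickPath, memory access, QPI, out-of-order execution, buffers, third-level cache]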
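The link figures on the Nehalem memory-hierarchy slide (20 bits per direction at 6.4 Gb/s for QuickPath, 64/72-bit DRAM channels at up to 1.33 Gb/s) can be turned into raw byte rates with one multiplication. The sketch below assumes the quoted rates are per wire and ignores protocol and encoding overhead, which the slide does not specify; the function name is hypothetical and the results are upper bounds, not measured bandwidth.

```python
def raw_gbytes_per_s(width_bits: int, gbit_per_s_per_wire: float) -> float:
    """Raw throughput of a parallel link in GB/s (decimal units).

    Assumption: every wire carries gbit_per_s_per_wire and all bits count
    as payload, so protocol, CRC and ECC overheads are not subtracted.
    """
    return width_bits * gbit_per_s_per_wire / 8.0


# QuickPath link, one direction: 20 bits at 6.4 Gb/s (figures from the slide).
qpi_per_direction = raw_gbytes_per_s(20, 6.4)

# One DDR3 channel: 64 data bits, or 72 bits when the ECC bits are included.
dram_data_only = raw_gbytes_per_s(64, 1.33)
dram_with_ecc = raw_gbytes_per_s(72, 1.33)

print(f"QPI, one direction (raw): {qpi_per_direction:.2f} GB/s")
print(f"DRAM channel, 64 bits:    {dram_data_only:.2f} GB/s")
print(f"DRAM channel, 72 bits:    {dram_with_ecc:.2f} GB/s")
```

Under these assumptions a single QuickPath direction carries somewhat more raw traffic than a single DRAM channel, which is consistent with the slide's remark that QPI is a key feature of the design.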
Multi-core and many-core development plans of other companies
(Chips with 8 physical cores or more; JPL-Dec-01-2009)

Vendor        Name                    Clock             Processors  Cores  Threads
IBM           Power4                  1.1 to 1.3 GHz    1           2      2
IBM           Power4+                 1.9 GHz           1           2      2
IBM           Power5                  1.5-1.9 GHz       1           2      4
IBM           Power5+                 1.5-2.26 GHz      1           2      4
IBM           CBE                     3.2 GHz           1           9      10
IBM           PowerXCell8i            3.2 GHz           1           9      10
IBM           Xenon                   3.2 GHz           1           3      6
IBM           Power6                  3.5-4.7 GHz       1           2      4
IBM           Power6+                 5 GHz             1           2      4
Intel         Pentium D               3.8 GHz           1           2      4
Intel         Core 2                  1.8-3.2 GHz       1           4      8
Intel         Dual Core Atom          0.8-2.06 GHz      1           2      2
Intel         Sandy Bridge            4.6 GHz           1           8      16
Intel         Xeon                    2.86-3.56 GHz     1           2      2
Intel         Xeon Quad Core          2.13-3.56 GHz     1           4      8
Intel         Xeon Beckton            2.8-3.56 GHz      1           8      16
Intel         Core i7                 2.66-3.33 GHz     1           4      8
AMD           Opteron Denmark         1.6-2.8 GHz       1           2      2
AMD           Opteron Barcelona       1.76-2.6 GHz      1           4      4
AMD           Opteron Istanbul        2.26-2.66 GHz     1           6      6
AMD           Opteron Sao Paulo       ?                 1           6      6
AMD           Opteron Magny Cours     ?                 1           12     12
AMD           Opteron Interlagos      ?                 1           16     16
Sun / Oracle  Ultra SPARC IV          1-1.356 GHz       1           2      2
Sun / Oracle  Ultra SPARC IV+         1.5-2.16 GHz      1           2      2
Sun / Oracle  Ultra SPARC T1          1-1.46 GHz        1           4      32
Sun / Oracle  Ultra SPARC T2          1-1.66 GHz        1           8      64
Sun / Oracle  Ultra SPARC VII         2.4-2.56 GHz      1           4      16
Sun / Oracle  Ultra SPARC VIIIfx      2.4-2.56 GHz      1           8      16

Summary: combined trends of 35 years of processor development
[Figure: transistor count (thousands), single-thread performance (SpecINT), frequency (MHz), typical power (watts) and number of cores, plotted over time]

(II) Development of multi-core and many-core processor chips

Why multi-core?
In the same process technology:

                 Voltage   Freq    Area   Power   Perf
One core         1         1       1      1       1
Two cores        -15%      -15%    2      1       ~1.8

Next step: heterogeneous multi-core chips (SOC)
[Figure: Heterogeneous Multi-Core Platform (SOC), made up of general-purpose cores (GP), special-purpose hardware (SP), caches (C) and an interconnect fabric]

Multi-core technology will diversify!
- Multiple parallel general-purpose processors (GPPs)
- Multiple application-specific processors (ASPs)
- Sun Niagara: 8 GPP cores (32 threads)
[Figure: Intel IXP2800 network processor block diagram. Labels: Intel XScale core with 32K instruction cache and 32K data cache; sixteen MEv2 microengines (1 to 16); Rbuf and Tbuf (64 entries of 128B each); Hash unit (48/64/128); 16KB scratch memory; four QDR SRAM channels, each with an E/D Q block; three RDRAM channels; gasket; PCI (64b, 66 MHz); SPI4 or CSIX interface; stripe; CSRs for Fast_wr, UART, timers, GPIO and BootROM/SlowPort.]
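- Intel Network Processor: 1 GPP core, 16 ASPs (128 threads)
- IBM Cell: 1 GPP (2 threads), 8 ASPs
- Picochip DSP: 1 GPP core, 248 ASPs
- Cisco CRS-1: 188 Tensilica GPPs
- There are thousands of threads on a processor.
- The processor is now what the transistor was in Moore's Law: "The Processor is the new Transistor" [Rowen].

Structure of AMD's multi-core SIMD GPU chip
[Figure]

Multi-core accompanied by instruction-set extensions for acceleration
[Figure]

Many-core processor structures
- Intel Terascale 80-core processor
- Tilera 64-core processor, aimed at cloud storage, servers and wireless networks

Structure of NVIDIA's new many-core GPU chip, Fermi
NVIDIA's Fermi GPU architecture consists of 16 streaming multiprocessors (SMs), each consisting of 32 cores, each of which can execute one floating-point or integer instruction per clock. The SMs are supported by a second-level cache, host interface, GigaThread scheduler, and multiple DRAM interfaces.

The Fermi SM: 32 cores
Each Fermi SM includes 32 cores, 16 load/store units, four special-function units, a 32K-word register file, 64K of configurable RAM, and thread control logic. Each core has both floating-point and integer execution units.
[Figure labels: 32K-word register file; floating-point unit and integer unit of each CUDA core]

Design considerations for on-chip and off-chip memory access speed in multi-core chips (the data-access "Memory Wall")

Level                          Capacity       Access time                        Bandwidth
Processing unit registers      64 registers   Load 1, Store 1                    1.92 TB/s
On-chip cache                  16MB/32KB      Load 2, Store 1                    640 GB/s
Off-chip static cache (SRAM)   2.5 MB         Load 20 cycles, Store 10 cycles    320 GB/s (off chip: 6 times slower)
Off-board dynamic memory       16 GB DRAM     Load 36 cycles, Store 18 cycles    16 GB/s (off board: 120 times slower)

(III) Heterogeneous multi-core chips

Why develop heterogeneous many-core chips?
1. Petaflops-class high-performance computers cannot be built by relying only on general-purpose homogeneous many-core chips from Intel or AMD; accelerators are required.
2. Homogeneous many-core chips also run into power problems, because every core needs its own cache and other supporting hardware. Accelerators should therefore use a large number of "small cores".
3. If separate CPU and GPU chips are used together, the GPU needs large volumes of data, so moving that data between chips becomes the bottleneck and peak performance is hard to reach.
4. The CPU and GPU should therefore be built on one chip, where the data-transfer bandwidth is much wider. Beyond that, GPUs remain hard to program; small cores aimed at specific uses, for which both the algorithm and the programming can be simplified, are more suitable. Another approach is to extend the instruction set of a many-core chip to obtain acceleration.
5. High-performance computers are diverging. General-purpose HPC systems can be built quickly and cheaply from existing blade servers plus InfiniBand, whereas multi-petaflops systems with specially designed board-level products can generally target only one or two applications and tend toward specialization.

Three eras of processor performance
- Single-core era (single-thread performance). Enabled by: Moore's Law, voltage scaling, micro-architecture. Constrained by: power, complexity.
- Multi-core era (throughput performance). Enabled by: Moore's Law, desire for throughput, 20 years of SMP architecture. Constrained by: power, parallel software availability, scalability.
- Heterogeneous systems era (targeted application performance). Enabled by: Moore's Law, abundant data parallelism, power-efficient GPUs. Currently constrained by: programming models, communication overheads.
[Figure: three performance curves, each marked "We are here"; the mark on the heterogeneous curve carries a question mark]

IBM heterogeneous Cell with NOC: eight 64-bit vector units (SXU) and a scalar unit (PXU)
[Figure]

The Cell processor
- Observed clock speed: a wide range of operating frequencies are supported to optimize for power and yield.
- Peak performance (single precision): 256 GFlops
- Peak performance (double precision): 26 GFlops
[Figure: detailed structure of the IBM Cell heterogeneous multi-core processor. Labels: double precision, single precision, SIMD vector units, scalar unit, interconnection network]

Next step: how do we get to petaflops-class high-performance computers?
- No number of general-purpose Intel or AMD processors will get there; only heterogeneous many-core processor chips with accelerator functions can.
- The hardware can get there, but the software is not fully ready. (Our universities may not build HPC machines in the future, but they can work on software, and on software combined with algorithms.)

GPUs are not ideal for supercomputers
GPUs are awkward to program for high-performance computing; the remedy is to combine the CPU and the GPU. Jack Dongarra says: "The obvious upside of GPUs is that they provide compelling performance for modest prices. The downside is that they are more difficult to program, since at the very least you will need to write one program for the CPUs and another program for the GPUs. Another problem that GPUs present pertains to the movement of data. Any machine that requires a lot of data movement will never come close to achieving its peak performance. The CPU-GPU link is a thin pipe, and that becomes the strangle-point for the effective use of GPUs. In the future this problem will be addressed by having the CPU and GPU integrated in a single socket."

Cell is dead for HPC
Chips that contain both x86 general processing cores as well as graphics processing cores are essentially heterogeneous multi-core processors, which AMD calls Fusion. The vast majority of multi-core chips today are homogeneous chips that contain a number of similar processing engines. There are processors with different types of cores, such as the Cell chips jointly developed by IBM, Sony Corp. and Toshiba Corp., which originally promised to redefine the market of multimedia chips as well as CPUs for the HPC market. However, since all three companies have ceased to develop Cell, it has no future.
Jack Dongarra says: "The Cell architecture is no longer being developed, so it is effectively dead. No new supercomputers will use Cell."

A likely trajectory: collision or convergence?
[Figure, after Justin Rattner, Intel, ISC 2019: programmability (fixed function, partially programmable, fully programmable) against parallelism (multi-threading, multi-core, many-core). The CPU starts fully programmable and gains parallelism; the GPU starts highly parallel and gains programmability; both head toward a "future processor by 2019?"]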
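The bandwidth figures on the memory-wall slide explain why Dongarra calls the CPU-GPU link a "thin pipe": each step away from the processing unit costs a large factor in bandwidth. The sketch below recomputes the slide's "6 times" and "120 times" ratios. It assumes decimal units and takes the 1.92 TB/s figure as the register-level baseline, which is a reading of the flattened slide text rather than something the slide states outright; the dictionary and function names are hypothetical.

```python
# Bandwidths quoted on the memory-wall slide, in GB/s (decimal units assumed).
# Assumption: 1.92 TB/s is the register-level figure and 640 GB/s the on-chip cache.
BANDWIDTH_GB_S = {
    "registers": 1920.0,
    "on-chip cache": 640.0,
    "off-chip SRAM": 320.0,
    "off-board DRAM": 16.0,
}


def slowdown_vs_registers(level: str) -> float:
    """How many times lower the bandwidth of `level` is than the register baseline."""
    return BANDWIDTH_GB_S["registers"] / BANDWIDTH_GB_S[level]


def seconds_to_stream(gigabytes: float, level: str) -> float:
    """Time to move `gigabytes` of data at the full bandwidth of `level`."""
    return gigabytes / BANDWIDTH_GB_S[level]


for name in BANDWIDTH_GB_S:
    print(f"{name:15s} {slowdown_vs_registers(name):6.1f}x slower, "
          f"{seconds_to_stream(16.0, name):.4f} s to stream 16 GB")
```

With these numbers, streaming the full 16 GB of off-board DRAM takes a whole second at 16 GB/s, while the same volume would pass through the register-level path in under ten milliseconds. This is the quantitative form of the argument for keeping accelerator data on chip.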
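Combining generality with parallelism leads to heterogeneous many-core.

IBM Cyclops-64 (C64) chip architecture
- 80 cores
- On-chip bisection bandwidth = 0.38 TB/s; total bandwidth to 6 neighbors = 48 GB/s

[Figure: assembly of a 1.1-petaflops supercomputer built from heterogeneous processors]

Other multi-purpose heterogeneous multi-core chips
Combination of different cores. Two main options:
- Different types: microcontroller + DSP, processor + accelerator, ...
- Different performance: big processor + small processor
Advantages:
- Processors can be optimized for different tasks: operating system, multimedia, graphics, low-power applications.
- Processors are decoupled: independent software development.
Disadvantages:
- Different architectures: more to learn
- Different tools
- More complex software

Texas Instruments heterogeneous multi-core chip for mobile terminals
The cores execute different tasks in parallel, which suits mobile terminals.

(IV) Development of on-chip (SOC) interconnection networks

Development of the NOC
- On-chip interconnection networks develop as process technology advances.
- On-chip interconnect will inevitably evolve into the NOC (Network On Chip).
[Figure: 80386, Pentium, multi-core]

Interconnection networks for on-chip many-core systems, part 1
Many cores on chip + channels.
[Figure: on the SOC, P denotes a processor core]

Interconnection networks for on-chip many-core systems, part 2
Many cores on chip + channels + routers.
[Figure: structure of a router R, including its switch]

Two typical topologies of on-chip interconnection networks
- Torus topology
- Mesh topology

Clocking: the on-chip clock of a NOC-based SOC is distributed
[Figure: grid of 16 NOC routers (R); each colored block represents one clock domain]
Two research areas:
- Asynchronous routers: simple design, low power
- Asynchronous interconnect: high bandwidth, low power

Future exa-scale on-chip networks (NOC)
Parallelism replaces clock frequency scaling and core complexity.
Resulting challenges:
- Scalability
- Programming
- Power

Future exa-scale on-chip networks (NOC), continued
[Figure: a conventional NoC system (number of cores: 10^2) compared with an exa-scale micro-networking system (number of cores: 10^2 to 10^4).
NoC features: regular architecture, packet-based transmission, flexible bandwidth utilization.
Problems at scale: unpredictable traffic load (application 1 and application 2 over time); unbalanced resource allocation; scalability (good performance only on small-scale networks); faulty routers and links; complex design and verification.]

MIT: analysis and considerations for many-core architectures
An array of thousands of small cores can solve the chip-area and scalability problems, but programming becomes an almost insurmountable barrier.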
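The difference between the mesh and torus topologies named above shows up directly in hop counts, which is also why long-distance communication on an electrical array is costly. The following sketch compares the two on a 4 x 4 grid of routers, the size suggested by the 16 routers in the clock-domain figure. It assumes minimal (shortest-path) routing with one hop per link; the function names are hypothetical and nothing here models contention or link width.

```python
from itertools import product


def hop_distance(a, b, k, wrap):
    """Minimal number of hops between routers a and b on a k x k grid.

    With wrap=True each dimension is a ring (torus); otherwise it is a line (mesh).
    """
    total = 0
    for x, y in zip(a, b):
        d = abs(x - y)
        if wrap:
            d = min(d, k - d)  # take the shorter way around the ring
        total += d
    return total


def diameter_and_mean(k, wrap):
    """Worst-case and average hop count over all ordered pairs of distinct routers."""
    nodes = list(product(range(k), repeat=2))
    dists = [hop_distance(a, b, k, wrap) for a in nodes for b in nodes if a != b]
    return max(dists), sum(dists) / len(dists)


mesh = diameter_and_mean(4, wrap=False)
torus = diameter_and_mean(4, wrap=True)
print(f"4x4 mesh:  diameter {mesh[0]}, mean hops {mesh[1]:.3f}")
print(f"4x4 torus: diameter {torus[0]}, mean hops {torus[1]:.3f}")
```

The torus shortens both the worst case and the average by adding wrap-around links, at the cost of longer wires. Calling `diameter_and_mean` with a larger `k` shows how quickly hop counts grow as the array approaches thousands of cores.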
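Parallelizing an application over thousands of cores is very hard:
1. Tasks and data must be partitioned.
2. Communication increases latency.
3. Long-distance communication causes contention for resources along the path, which lowers performance and raises power consumption.
4. There is no efficient broadcast communication (metal wires on silicon are too long).

MIT: analysis and considerations for many-core architectures, continued
To raise the performance of a chip with thousands of cores, communication and locality must be managed effectively:
- Both tasks and data must be partitioned and placed optimally, analyzing communication patterns to minimize latency.
- Data must be placed near the execution units that use it most often.
- Some frequently used routines must sit close to DRAM and I/O.
- Dynamic and unpredictable communication is very hard to optimize.
For this reason MIT proposes replacing electrical-wire array communication with optical broadcast communication:
- Broadcast communication makes a shared-memory model easy to implement, and therefore eases programming.
- It reduces the need to manage locality.
- It is cheap and consumes little power.
This is a good topic for basic technology research.

ATAC architecture: optical broadcast communication on chips with thousands of cores, proposed by MIT
[Figure: array of cores, each with a processor (p), switch and memory (m), connected by an electrical mesh interconnect and by an optical broadcast WDM interconnect]

Advantages of MIT's optical broadcast communication for many-core chips
- The optical waveguide passes through every core on the many-core chip.
- Different wavelengths in the waveguide can eliminate resource contention completely.
- A signal reaches all of the thousands of cores within 2 ns.
- All cores receive the same signal, which gives true broadcast.
[Figure: topology of the optical broadcast interconnect]

(V) Further development of microelectronic process technology

Terascale integration capacity
[Figure: total transistors on a 300 mm2 die, about 1.5B logic transistors plus 100MB of cache; on-chip integration heading toward hundreds of billions of transistors]

Scaling problems of frequency, voltage and power (300 mm2 die)
- Frequency scaling will slow down.
- Vdd scaling will slow down.
- Power will be too high.

Wiring: problems caused by finer feature sizes
Finer wires affect clock distribution, delay design, interconnect structure and more.
[Figure: metal layers 1 to 4]

Packaging problems: System in a Package
[Figure: two Si chips in one package]
- Limited pins: 10 mm / 50 micron = 200 pins
- Signal distance is large (about 10 mm), hence higher power
- Complex package

From two-dimensional to three-dimensional SOC
[Figure: 20 stacked chips connected by TSVs]

Heat dissipation: anatomy of a silicon chip
[Figure: Si chip in its package under a heat sink, showing the paths of heat, power and signals]

DRAM at the bottom
[Figure: package with a thin DRAM die beneath the CPU and a heat sink on top]
- Power and IO signals go through the DRAM to the CPU.
- Thin DRAM die with through-DRAM vias.
- The most promising solution to feed the beast.

(VI) Outlook for future exaFlops high-performance computer chips

Progress beyond PetaFlops
The first 10 to 20 petaflop/s supercomputers should be in service by 2019 and after that comes a machine in the 100 petaflop/s range (2019). Scientists are moderately optimistic that exaflop/s (1000 petaflop/s) mainframes can be constructed by 2018 - 2020. However, are some of these expectations just plain irrational?
(In summary: 10 to 20 petaflops by 2019; 100 petaflops by 2019; 1 exaflops by 2018-2020.)
- Number of cores per chip will double every two years.
- Clock speed will not increase (possibly decrease).
- Need to deal with systems with millions of concurrent threads.
- Need to deal with inter-chip parallelism as well as intra-chip parallelism.
... the future machine's architecture. At best, it will require 20 Megawatts to run. So getting to the exaflop/s level or beyond may be extremely difficult.
- 500x performance (peak)
- 100x memory
- 5000x concurrency
- 3x power
Specialized software will be needed to best make use of the massive parallelism. Argonne's Leadership Computing Facility (ALCF) will install Mira, a next generation Blue Gene system (BG/Q), in 2019. The ALCF's stated requirements for the 10 petaflops system include approximately 0.75 million cores and 0.75 petabytes of memory, with 16 cores and 16 gigabytes of memory per node.

An exaFlops high-performance computer: $200M, 20 MW, 64 PB of RAM
"The current memory paradigm is hierarchical, based on registers, L1 and L2 caches, local memory, shared memory, and distributed memory among nodes. That is a potential model for exaFLOPS systems. However, we want exaFLOPS systems to be designed to be relatively easy to program. We therefore want a globally shared address space, and explicit methods to pass data between the processors in order to orchestrate the unfolding computation. That paradigm may be necessary for a machine that has a billion threads."

Two anticipated routes to exaFLOPS HPC
"There are two models that we can use to get to an exaflop while staying within a 20 MW budget. The first model employs huge numbers of lightweight processors, such as IBM Blue Gene Processor running at 1.0GHz. If we use 1 million chips, and each chip has 1000 cores, then we can get to a potential billion threads of execution. The other approach is a hybrid that makes extensive use of coprocessors or GPUs. It would use a 1.0GHz processor and 10 000 floating point units per socket, and 100 000 sockets per system."

IBM Mira, a 10-petaflops supercomputer
Scientists will have to scale their current computer codes to more than 750,000 individual computing cores, providing them preliminary experience on how scalability might be achieved on an exascale-class system with 100s of millions of cores. Despite a popular trend to use both central processing units (CPUs) and graphics processing units (GPUs), Mira will be based only on IBM's PowerPC chips.
The IBM BlueGene/Q supercomputer design is based on a sixteen-core IBM PowerPC A2 chip with 4-way simultaneous multi-threading technology. Each processor has at least 1GB of DDR3 memory. Featuring 750 thousand processing cores, the new supercomputer will be cooled using a special water-cooling system.
- IBM Blue Gene/Q: US Department of Energy's (DOE) Argonne National Laboratory.
- IBM is to build the 20-petaflops Sequoia for Lawrence Livermore National Laboratory, and is extending the Blue Gene architecture toward 50 petaflops and 100 petaflops.

The PowerPC A2 processor of the 10-petaflops Mira
The PowerPC A2 is a 64-bit Power Architecture processor with a high degree of multi-core and multi-threading capability. IBM calls it a "wire-speed processor": it is designed as a hybrid between a traditional network processor, which does switching and routing, and a typical server processor, which processes and packages data. Processor versions based on the A2 core range from 16 cores at 2.3 GHz and 65 W down to 4 cores at 1.4 GHz and 20 W. Each A2 core can execute 4 threads simultaneously (for comparison, Intel's Hyper-Threading runs two). The chip has 8 MB of cache and, besides the general-purpose processor cores, a set of task-specific engines for XML, encryption and decryption, compression and regular-expression acceleration, as well as four 10G Ethernet ports and two PCIe lanes. Up to four chips can be linked into an SMP (symmetric multiprocessing) system without any additional support chips. The chips are said to be extremely complex: they use 1.43 billion transistors on a 428 mm2 die in a 45 nm process.
Note: "wire-speed processor" means that the data throughput of the processor matches the data rate of the communication standard. IBM explains the concept as follows: the processor is no longer a place where data is digested, that is, where data comes to rest, but a place where data is filtered or modified and then sent on.

Architecture of the IBM PowerPC A2
[Figure labels: PLLs; Pattern Engine; x8 and x4 PHYs; EI3 links; Misc I/O; 4x 10GE MAC or 4x 1GE MAC; Pervasive logic; two PCI Express Gen 2 blocks (Root Engine, Root/EP Engine); Host Ethernet Controller / Packet Processor; PBus Macro; PBus External Controller; PBIC bus interface controllers on the PBus]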
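The scale figures quoted in the exaflops and Mira slides above can be cross-checked with a few lines of integer arithmetic. The sketch below uses only numbers from those slides, with two labeled assumptions: each floating-point unit completes one operation per clock cycle, and storage sizes are decimal (1 PB = 10^15 bytes). All variable names are hypothetical.

```python
EXAFLOPS = 10**18

# Model 1: huge numbers of lightweight processors.
chips = 1_000_000
cores_per_chip = 1_000
threads_model1 = chips * cores_per_chip          # "a potential billion threads"

# Model 2: hybrid with coprocessors or GPUs.
clock_hz = 10**9                                 # 1.0 GHz
fpus_per_socket = 10_000
sockets = 100_000
# Assumption: one floating-point operation per FPU per cycle.
peak_model2 = clock_hz * fpus_per_socket * sockets

# Energy efficiency implied by the 20 MW budget.
power_budget_watts = 20 * 10**6
flops_per_watt = EXAFLOPS // power_budget_watts  # 50 GFlops per watt

# Mira: about 0.75 million cores and 0.75 PB, 16 cores and 16 GB per node.
mira_cores = 750_000
nodes_from_cores = mira_cores // 16
nodes_from_memory = (750 * 10**12) // (16 * 10**9)
mira_hw_threads = mira_cores * 4                 # 4-way multithreading per A2 core

print("model 1 threads:        ", threads_model1)
print("model 2 peak flops:     ", peak_model2)
print("required flops per watt:", flops_per_watt)
print("Mira nodes (by cores):  ", nodes_from_cores)
print("Mira nodes (by memory): ", nodes_from_memory)
print("Mira hardware threads:  ", mira_hw_threads)
```

The two Mira estimates agree, so the quoted core and memory totals are consistent with the per-node figures. Under the stated assumption, the hybrid model lands exactly on 10^18 operations per second, and either route has to deliver about 50 GFlops per watt to stay inside the 20 MW budget.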
[Figure labels, accelerator and memory side: PBIC bus interface controllers; accelerators for compression/decompression, crypto and XML; two memory controllers (MC), each with a memory PHY; four 2MB L2 slices serving AT0 to AT3]

Acceleration and interconnect of the IBM PowerPC A2
- Four chips can be interconnected as an SMP.
- 4 memory channels, 800-1600 MHz

Technology               IBM 45nm SOI
Core Frequency           2.3GHz at 0.97V (worst case process)
Chip size                428 mm2 (including kerf)
Chip Power (4-AT node)   65W at 2.0GHz, 0.85V (max single chip)
Chip Power (1-AT node)   20W at 1.4GHz, 0.77V (min single chip)
Main Voltage (VDD)       0.7V to 1.1V
Metal Layers             11 Cu (3-1x, 2-1.3x, 3-2x, 1-4x, 2-10x)
Latch Count              3.2M
Transistor Count         1.43B
A2 Cores / Threads       16 / 64
L1 I & D Cache           16 x (16KB + 16KB) SRAM
L2 Cache                 4 x 2MB eDRAM
Hardware Accelerators    Crypto, Compression, Re