版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领
文档简介
1、William Stallings Computer Organization and Architecture7th EditionChapter 13Reduced Instruction Set Computers1Chapter 13Reduced Instruction Set ComputersKey terms CISC complex instruction set computerRISC reduced instruction set computerDelayed branchDelayed loadHLL high-level languageRegister file
2、Register window SPARC2Major Advances in Computers(1)The family conceptIBM System/360 1964DEC PDP-8Separates architecture from implementationMicroporgrammed control unitIdea by Wilkes 1951Produced by IBM S/360 1964Cache memoryIBM S/360 model 85 19693Major Advances in Computers(2)Solid State RAM(See m
3、emory notes)MicroprocessorsIntel 4004 1971PipeliningIntroduces parallelism into fetch execute cycleMultiple processors4The Next Step - RISCReduced Instruction Set ComputerKey featuresLarge number of general purpose registersor use of compiler technology to optimize register useLimited and simple ins
4、truction setEmphasis on optimising the instruction pipeline5Comparison of processors6Driving force for CISCSoftware costs far exceed hardware costsIncreasingly complex high level languagesSemantic gapLeads to:Large instruction setsMore addressing modesHardware implementations of HLL statementse.g. C
5、ASE (switch) on VAXSemantic: 语义的;语义学的 Gap 英音:gp 豁口,裂口 7Intention of CISCEase compiler writingImprove execution efficiencyComplex operations in microcodeSupport more complex HLLsease 减轻 HLL 缩写词 abbr. high-level language 【电脑】高级语言8Execution CharacteristicsOperations performedOperands usedExecution sequ
6、encingStudies have been done based on programs written in HLLsDynamic studies are measured during the execution of the program9OperationsAssignmentsMovement of dataConditional statements (IF, LOOP)Sequence controlProcedure call-return is very time consumingSome HLL instruction lead to many machine c
7、ode operationsAssignment 分配;指派,选派10Weighted Relative Dynamic Frequency of HLL Operations PATT82a Dynamic OccurrenceMachine-Instruction WeightedMemory-Reference WeightedPascalCPascalCPascalCASSIGN45%38%13%13%14%15%LOOP5%3%42%32%33%26%CALL15%12%31%33%44%45%IF29%43%11%21%7%13%GOTO3%OTHER6%1%3%1%2%1%11O
8、perandsMainly local scalar variablesOptimisation should concentrate on accessing local variablesPascalCAverageInteger Constant16%23%20%Scalar Variable58%53%55%Array/Structure26%24%25%12Procedure CallsVery time consumingDepends on number of parameters passedDepends on level of nestingMost programs do
9、 not do a lot of calls followed by lots of returnsMost variables are local(c.f. locality of reference)13ImplicationsBest support is given by optimising most used and most time consuming featuresLarge number of registersOperand referencingCareful design of pipelinesBranch prediction etc.Simplified (r
10、educed) instruction set14Large Register FileSoftware solutionRequire compiler to allocate registersAllocate based on most used variables in a given timeRequires sophisticated program analysisHardware solutionHave more registersThus more variables will be in registers15Registers for Local VariablesSt
11、ore local scalar variables in registersReduces memory accessEvery procedure (function) call changes localityParameters must be passedResults must be returnedVariables from calling programs must be restored16Register WindowsOnly few parametersLimited range of depth of callUse multiple small sets of r
12、egistersCalls switch to a different set of registersReturns switch back to a previously used set of registers17Register Windows cont.Three areas within a register setParameter registersLocal registersTemporary registersTemporary registers from one set overlap parameter registers from the nextThis al
13、lows parameter passing without moving datacont. 1.内容,所含之物 (contents) 2.继续的;不断的;连续的 18Overlapping Register Windows19Circular Buffer diagram主程序1子程序A2子程序B3子程序C4子程序D5子程序E6子程序F7子程序G20Operation of Circular BufferWhen a call is made, a current window pointer is moved to show the currently active register w
14、indowIf all windows are in use, an interrupt is generated and the oldest window (the one furthest back in the call nesting) is saved to memoryA saved window pointer indicates where the next saved windows should restore to21Global VariablesAllocated by the compiler to memoryInefficient for frequently
15、 accessed variablesHave a set of registers for global variables22Registers v CacheLarge Register FileCacheAll local scalarsRecently-used local scalarsIndividual variablesBlocks of memoryCompiler-assigned global variablesRecently-used global variablesSave/Restore based on procedure nesting depthSave/
16、Restore based on cache replacement algorithmRegister addressingMemory addressing23Referencing a Scalar - Window Based Register File24Referencing a Scalar - Cache25Referencing a Scalar - Window Based Register File26Compiler Based Register OptimizationAssume small number of registers (16-32)Optimizing
17、 use is up to compilerHLL programs have no explicit references to registersusually - think about C - register intAssign symbolic or virtual register to each candidate variable Map (unlimited) symbolic registers to real registersSymbolic registers that do not overlap can share real registersIf you ru
18、n out of real registers some variables use memory27Graph ColoringGiven a graph of nodes and edgesAssign a color to each nodeAdjacent nodes have different colorsUse minimum number of colorsNodes are symbolic registersTwo registers that are live in the same program fragment are joined by an edgeTry to
19、 color the graph with n colors, where n is the number of real registersNodes that can not be colored are placed in memory28Graph Coloring Approach29Graph Coloring Approach30Why CISC (1)?Compiler simplification? DisputedComplex machine instructions harder to exploitOptimization more difficult (开发)Sma
20、ller programs?Program takes up less memory butMemory is now cheapMay not occupy less bits, just look shorter in symbolic formMore instructions require longer op-codesRegister references require fewer bits dispute 英音:dispju:t 争论;争执 simplification 1. 单纯化 2. 简单化 31Why CISC (2)?Faster programs?Bias towa
21、rds use of simpler instructionsMore complex control unitMicroprogram control store largerthus simple instructions take longer to executeIt is far from clear that CISC is the appropriate solution Bias 倾向,趋势 32RISC CharacteristicsOne instruction per cycleRegister to register operationsFew, simple addr
22、essing modesFew, simple instruction formatsHardwired design (no microcode)Fixed instruction formatMore compile time/effort33RISC v CISCNot clear cutMany designs borrow from both philosophiese.g. PowerPC and Pentium IIphilosophies 哲学;观点34RISC PipeliningMost instructions are register to registerTwo ph
23、ases of executionI: Instruction fetchE: ExecuteALU operation with register input and outputFor load and storeI: Instruction fetchE: ExecuteCalculate memory addressD: MemoryRegister to memory or memory to register operation35Figure 13.6cFigure 13.6c, three instructions can be overlapped, and the impr
24、ovement is as much as a factor of 3.I36Figure 13.6dE1: Register file readE2: ALU operation and register writeI37Effects of Pipelining38Optimization of PipeliningDelayed branchDelayed LoadLoop Unrolling39Optimization of Pipelining (1)Loop UnrollingReplicate body of loop a number of timesIterate loop
25、fewer timesReduces loop overheadIncreases instruction parallelismImproved register, data cache or TLB locality unroll 展开,打开(卷着的东西) replicate 英音:replikeit 折叠;复制 iterate 英音:itreit 反复,重复 overhead 英音:uvhed日常开支,额外开销 40Optimization of Pipelining (2)Delayed branchDoes not take effect until after execution
26、of following instructionThis following instruction is the delay slot41Normal and Delayed BranchAddressNormal BranchDelayed BranchOptimized Delayed Branch100LOADX, rALOADX, rALOAD X, rA101ADD1, rAADD1, rAJUMP105102JUMP105JUMP106ADD1, rA103ADDrA, rBNOOPADDrA, rB104SUBrC, rBADDrA, rBSUBrC, rB105STORE r
27、A, ZSUBrC, rBSTORE rA, Z106STORE rA, Z42Use of Delayed BranchAddressNormal BranchDelayed BranchOptimized Delayed Branch100LOADX, rALOAD X, rALOAD X, rA101ADD 1, rAADD 1, rAJUMP 105102JUMP 105JUMP 106ADD 1, rA103ADD rA, rBNOOPADD rA, rB104SUB rC, rBADD rA, rBSUB rC, rB105STORE rA, ZSUB rC, rBSTORE rA
28、, Z106STORE rA, Z43Optimization of Pipelining (3)Delayed LoadRegister to be target is locked by processorContinue execution of instruction stream until register requiredIdle until load completeRe-arranging instructions can allow useful work whilst loading4480486 Instruction Pipeline Examples45Controversy (1)Quantitativecompare program sizes and execution speedsQualitativeexamine issues of
温馨提示
- 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
- 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
- 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
- 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
- 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
- 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
- 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。
最新文档
- 2026广东深圳市宝安区西乡桃源居幼儿园(集团)招聘工作人员7人备考题库带答案详解(预热题)
- 2026年网络营销知识竞赛考试题库及答案
- 汽车改装店操作不规范问题自查整改报告
- 2026北京航空航天大学可靠性与系统工程学院聘用编软件测试工程师F岗招聘2人备考题库附答案详解(黄金题型)
- 2026广东江门职业技术学院管理教辅人员招聘4人备考题库附参考答案详解(基础题)
- 2026新疆准东能源投资(集团)有限公司 招(竞)聘7人备考题库附参考答案详解(突破训练)
- 2026内蒙古鄂尔多斯东胜区万佳小学招聘英语教师1人备考题库含答案详解(研优卷)
- 2026安徽合肥国家实验室技术支撑岗位招聘1人备考题库光学工程师完整参考答案详解
- 2026年安徽省合肥市青年路小学教育集团青年路小学、黄河路小学、云谷路小学2026年春季学期教师招聘备考题库参考答案详解
- 2026北京市农林科学院招聘32人备考题库及参考答案详解
- 清真生产过程管控制度
- 途虎养车安全培训课件
- 2025-2026学年人教版(新教材)小学数学二年级下册(全册)教学设计(附教材目录P161)
- 物业小区春节前安全培训课件
- 刷单协议书合同范本
- 内科学总论小儿遗传代谢病课件
- 2026小红书平台营销通案
- 品牌设计报价方案
- 2026届上海交大附属中学高一化学第一学期期末达标检测试题含解析
- 消化性溃疡溃疡愈合后康复随访方案
- 公司员工自带电脑补贴发放管理办法
评论
0/150
提交评论