计算机结构英文版课件:Chapter 13 Reduced Instruction Set Computers_第1页
计算机结构英文版课件:Chapter 13 Reduced Instruction Set Computers_第2页
计算机结构英文版课件:Chapter 13 Reduced Instruction Set Computers_第3页
计算机结构英文版课件:Chapter 13 Reduced Instruction Set Computers_第4页
计算机结构英文版课件:Chapter 13 Reduced Instruction Set Computers_第5页
已阅读5页,还剩45页未读 继续免费阅读

下载本文档

版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领

文档简介

1、William Stallings Computer Organization and Architecture7th EditionChapter 13Reduced Instruction Set Computers1Chapter 13Reduced Instruction Set ComputersKey terms CISC complex instruction set computerRISC reduced instruction set computerDelayed branchDelayed loadHLL high-level languageRegister file

2、Register window SPARC2Major Advances in Computers(1)The family conceptIBM System/360 1964DEC PDP-8Separates architecture from implementationMicroporgrammed control unitIdea by Wilkes 1951Produced by IBM S/360 1964Cache memoryIBM S/360 model 85 19693Major Advances in Computers(2)Solid State RAM(See m

3、emory notes)MicroprocessorsIntel 4004 1971PipeliningIntroduces parallelism into fetch execute cycleMultiple processors4The Next Step - RISCReduced Instruction Set ComputerKey featuresLarge number of general purpose registersor use of compiler technology to optimize register useLimited and simple ins

4、truction setEmphasis on optimising the instruction pipeline5Comparison of processors6Driving force for CISCSoftware costs far exceed hardware costsIncreasingly complex high level languagesSemantic gapLeads to:Large instruction setsMore addressing modesHardware implementations of HLL statementse.g. C

5、ASE (switch) on VAXSemantic: 语义的;语义学的 Gap 英音:gp 豁口,裂口 7Intention of CISCEase compiler writingImprove execution efficiencyComplex operations in microcodeSupport more complex HLLsease 减轻 HLL 缩写词 abbr. high-level language 【电脑】高级语言8Execution CharacteristicsOperations performedOperands usedExecution sequ

6、encingStudies have been done based on programs written in HLLsDynamic studies are measured during the execution of the program9OperationsAssignmentsMovement of dataConditional statements (IF, LOOP)Sequence controlProcedure call-return is very time consumingSome HLL instruction lead to many machine c

7、ode operationsAssignment 分配;指派,选派10Weighted Relative Dynamic Frequency of HLL Operations PATT82a Dynamic OccurrenceMachine-Instruction WeightedMemory-Reference WeightedPascalCPascalCPascalCASSIGN45%38%13%13%14%15%LOOP5%3%42%32%33%26%CALL15%12%31%33%44%45%IF29%43%11%21%7%13%GOTO3%OTHER6%1%3%1%2%1%11O

8、perandsMainly local scalar variablesOptimisation should concentrate on accessing local variablesPascalCAverageInteger Constant16%23%20%Scalar Variable58%53%55%Array/Structure26%24%25%12Procedure CallsVery time consumingDepends on number of parameters passedDepends on level of nestingMost programs do

9、 not do a lot of calls followed by lots of returnsMost variables are local(c.f. locality of reference)13ImplicationsBest support is given by optimising most used and most time consuming featuresLarge number of registersOperand referencingCareful design of pipelinesBranch prediction etc.Simplified (r

10、educed) instruction set14Large Register FileSoftware solutionRequire compiler to allocate registersAllocate based on most used variables in a given timeRequires sophisticated program analysisHardware solutionHave more registersThus more variables will be in registers15Registers for Local VariablesSt

11、ore local scalar variables in registersReduces memory accessEvery procedure (function) call changes localityParameters must be passedResults must be returnedVariables from calling programs must be restored16Register WindowsOnly few parametersLimited range of depth of callUse multiple small sets of r

12、egistersCalls switch to a different set of registersReturns switch back to a previously used set of registers17Register Windows cont.Three areas within a register setParameter registersLocal registersTemporary registersTemporary registers from one set overlap parameter registers from the nextThis al

13、lows parameter passing without moving datacont. 1.内容,所含之物 (contents) 2.继续的;不断的;连续的 18Overlapping Register Windows19Circular Buffer diagram主程序1子程序A2子程序B3子程序C4子程序D5子程序E6子程序F7子程序G20Operation of Circular BufferWhen a call is made, a current window pointer is moved to show the currently active register w

14、indowIf all windows are in use, an interrupt is generated and the oldest window (the one furthest back in the call nesting) is saved to memoryA saved window pointer indicates where the next saved windows should restore to21Global VariablesAllocated by the compiler to memoryInefficient for frequently

15、 accessed variablesHave a set of registers for global variables22Registers v CacheLarge Register FileCacheAll local scalarsRecently-used local scalarsIndividual variablesBlocks of memoryCompiler-assigned global variablesRecently-used global variablesSave/Restore based on procedure nesting depthSave/

16、Restore based on cache replacement algorithmRegister addressingMemory addressing23Referencing a Scalar - Window Based Register File24Referencing a Scalar - Cache25Referencing a Scalar - Window Based Register File26Compiler Based Register OptimizationAssume small number of registers (16-32)Optimizing

17、 use is up to compilerHLL programs have no explicit references to registersusually - think about C - register intAssign symbolic or virtual register to each candidate variable Map (unlimited) symbolic registers to real registersSymbolic registers that do not overlap can share real registersIf you ru

18、n out of real registers some variables use memory27Graph ColoringGiven a graph of nodes and edgesAssign a color to each nodeAdjacent nodes have different colorsUse minimum number of colorsNodes are symbolic registersTwo registers that are live in the same program fragment are joined by an edgeTry to

19、 color the graph with n colors, where n is the number of real registersNodes that can not be colored are placed in memory28Graph Coloring Approach29Graph Coloring Approach30Why CISC (1)?Compiler simplification? DisputedComplex machine instructions harder to exploitOptimization more difficult (开发)Sma

20、ller programs?Program takes up less memory butMemory is now cheapMay not occupy less bits, just look shorter in symbolic formMore instructions require longer op-codesRegister references require fewer bits dispute 英音:dispju:t 争论;争执 simplification 1. 单纯化 2. 简单化 31Why CISC (2)?Faster programs?Bias towa

21、rds use of simpler instructionsMore complex control unitMicroprogram control store largerthus simple instructions take longer to executeIt is far from clear that CISC is the appropriate solution Bias 倾向,趋势 32RISC CharacteristicsOne instruction per cycleRegister to register operationsFew, simple addr

22、essing modesFew, simple instruction formatsHardwired design (no microcode)Fixed instruction formatMore compile time/effort33RISC v CISCNot clear cutMany designs borrow from both philosophiese.g. PowerPC and Pentium IIphilosophies 哲学;观点34RISC PipeliningMost instructions are register to registerTwo ph

23、ases of executionI: Instruction fetchE: ExecuteALU operation with register input and outputFor load and storeI: Instruction fetchE: ExecuteCalculate memory addressD: MemoryRegister to memory or memory to register operation35Figure 13.6cFigure 13.6c, three instructions can be overlapped, and the impr

24、ovement is as much as a factor of 3.I36Figure 13.6dE1: Register file readE2: ALU operation and register writeI37Effects of Pipelining38Optimization of PipeliningDelayed branchDelayed LoadLoop Unrolling39Optimization of Pipelining (1)Loop UnrollingReplicate body of loop a number of timesIterate loop

25、fewer timesReduces loop overheadIncreases instruction parallelismImproved register, data cache or TLB locality unroll 展开,打开(卷着的东西) replicate 英音:replikeit 折叠;复制 iterate 英音:itreit 反复,重复 overhead 英音:uvhed日常开支,额外开销 40Optimization of Pipelining (2)Delayed branchDoes not take effect until after execution

26、of following instructionThis following instruction is the delay slot41Normal and Delayed BranchAddressNormal BranchDelayed BranchOptimized Delayed Branch100LOADX, rALOADX, rALOAD X, rA101ADD1, rAADD1, rAJUMP105102JUMP105JUMP106ADD1, rA103ADDrA, rBNOOPADDrA, rB104SUBrC, rBADDrA, rBSUBrC, rB105STORE r

27、A, ZSUBrC, rBSTORE rA, Z106STORE rA, Z42Use of Delayed BranchAddressNormal BranchDelayed BranchOptimized Delayed Branch100LOADX, rALOAD X, rALOAD X, rA101ADD 1, rAADD 1, rAJUMP 105102JUMP 105JUMP 106ADD 1, rA103ADD rA, rBNOOPADD rA, rB104SUB rC, rBADD rA, rBSUB rC, rB105STORE rA, ZSUB rC, rBSTORE rA

28、, Z106STORE rA, Z43Optimization of Pipelining (3)Delayed LoadRegister to be target is locked by processorContinue execution of instruction stream until register requiredIdle until load completeRe-arranging instructions can allow useful work whilst loading4480486 Instruction Pipeline Examples45Controversy (1)Quantitativecompare program sizes and execution speedsQualitativeexamine issues of

温馨提示

  • 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
  • 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
  • 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
  • 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
  • 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
  • 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
  • 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。

评论

0/150

提交评论