版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领
文档简介
1、William Stallings Computer Organization and Architecture7th EditionChapter 13Reduced Instruction Set Computers1Chapter 13Reduced Instruction Set ComputersKey terms CISC complex instruction set computerRISC reduced instruction set computerDelayed branchDelayed loadHLL high-level languageRegister file
2、Register window SPARC2Major Advances in Computers(1)The family conceptIBM System/360 1964DEC PDP-8Separates architecture from implementationMicroporgrammed control unitIdea by Wilkes 1951Produced by IBM S/360 1964Cache memoryIBM S/360 model 85 19693Major Advances in Computers(2)Solid State RAM(See m
3、emory notes)MicroprocessorsIntel 4004 1971PipeliningIntroduces parallelism into fetch execute cycleMultiple processors4The Next Step - RISCReduced Instruction Set ComputerKey featuresLarge number of general purpose registersor use of compiler technology to optimize register useLimited and simple ins
4、truction setEmphasis on optimising the instruction pipeline5Comparison of processors6Driving force for CISCSoftware costs far exceed hardware costsIncreasingly complex high level languagesSemantic gapLeads to:Large instruction setsMore addressing modesHardware implementations of HLL statementse.g. C
5、ASE (switch) on VAXSemantic: 语义的;语义学的 Gap 英音:gp 豁口,裂口 7Intention of CISCEase compiler writingImprove execution efficiencyComplex operations in microcodeSupport more complex HLLsease 减轻 HLL 缩写词 abbr. high-level language 【电脑】高级语言8Execution CharacteristicsOperations performedOperands usedExecution sequ
6、encingStudies have been done based on programs written in HLLsDynamic studies are measured during the execution of the program9OperationsAssignmentsMovement of dataConditional statements (IF, LOOP)Sequence controlProcedure call-return is very time consumingSome HLL instruction lead to many machine c
7、ode operationsAssignment 分配;指派,选派10Weighted Relative Dynamic Frequency of HLL Operations PATT82a Dynamic OccurrenceMachine-Instruction WeightedMemory-Reference WeightedPascalCPascalCPascalCASSIGN45%38%13%13%14%15%LOOP5%3%42%32%33%26%CALL15%12%31%33%44%45%IF29%43%11%21%7%13%GOTO3%OTHER6%1%3%1%2%1%11O
8、perandsMainly local scalar variablesOptimisation should concentrate on accessing local variablesPascalCAverageInteger Constant16%23%20%Scalar Variable58%53%55%Array/Structure26%24%25%12Procedure CallsVery time consumingDepends on number of parameters passedDepends on level of nestingMost programs do
9、 not do a lot of calls followed by lots of returnsMost variables are local(c.f. locality of reference)13ImplicationsBest support is given by optimising most used and most time consuming featuresLarge number of registersOperand referencingCareful design of pipelinesBranch prediction etc.Simplified (r
10、educed) instruction set14Large Register FileSoftware solutionRequire compiler to allocate registersAllocate based on most used variables in a given timeRequires sophisticated program analysisHardware solutionHave more registersThus more variables will be in registers15Registers for Local VariablesSt
11、ore local scalar variables in registersReduces memory accessEvery procedure (function) call changes localityParameters must be passedResults must be returnedVariables from calling programs must be restored16Register WindowsOnly few parametersLimited range of depth of callUse multiple small sets of r
12、egistersCalls switch to a different set of registersReturns switch back to a previously used set of registers17Register Windows cont.Three areas within a register setParameter registersLocal registersTemporary registersTemporary registers from one set overlap parameter registers from the nextThis al
13、lows parameter passing without moving datacont. 1.内容,所含之物 (contents) 2.继续的;不断的;连续的 18Overlapping Register Windows19Circular Buffer diagram主程序1子程序A2子程序B3子程序C4子程序D5子程序E6子程序F7子程序G20Operation of Circular BufferWhen a call is made, a current window pointer is moved to show the currently active register w
14、indowIf all windows are in use, an interrupt is generated and the oldest window (the one furthest back in the call nesting) is saved to memoryA saved window pointer indicates where the next saved windows should restore to21Global VariablesAllocated by the compiler to memoryInefficient for frequently
15、 accessed variablesHave a set of registers for global variables22Registers v CacheLarge Register FileCacheAll local scalarsRecently-used local scalarsIndividual variablesBlocks of memoryCompiler-assigned global variablesRecently-used global variablesSave/Restore based on procedure nesting depthSave/
16、Restore based on cache replacement algorithmRegister addressingMemory addressing23Referencing a Scalar - Window Based Register File24Referencing a Scalar - Cache25Referencing a Scalar - Window Based Register File26Compiler Based Register OptimizationAssume small number of registers (16-32)Optimizing
17、 use is up to compilerHLL programs have no explicit references to registersusually - think about C - register intAssign symbolic or virtual register to each candidate variable Map (unlimited) symbolic registers to real registersSymbolic registers that do not overlap can share real registersIf you ru
18、n out of real registers some variables use memory27Graph ColoringGiven a graph of nodes and edgesAssign a color to each nodeAdjacent nodes have different colorsUse minimum number of colorsNodes are symbolic registersTwo registers that are live in the same program fragment are joined by an edgeTry to
19、 color the graph with n colors, where n is the number of real registersNodes that can not be colored are placed in memory28Graph Coloring Approach29Graph Coloring Approach30Why CISC (1)?Compiler simplification? DisputedComplex machine instructions harder to exploitOptimization more difficult (开发)Sma
20、ller programs?Program takes up less memory butMemory is now cheapMay not occupy less bits, just look shorter in symbolic formMore instructions require longer op-codesRegister references require fewer bits dispute 英音:dispju:t 争论;争执 simplification 1. 单纯化 2. 简单化 31Why CISC (2)?Faster programs?Bias towa
21、rds use of simpler instructionsMore complex control unitMicroprogram control store largerthus simple instructions take longer to executeIt is far from clear that CISC is the appropriate solution Bias 倾向,趋势 32RISC CharacteristicsOne instruction per cycleRegister to register operationsFew, simple addr
22、essing modesFew, simple instruction formatsHardwired design (no microcode)Fixed instruction formatMore compile time/effort33RISC v CISCNot clear cutMany designs borrow from both philosophiese.g. PowerPC and Pentium IIphilosophies 哲学;观点34RISC PipeliningMost instructions are register to registerTwo ph
23、ases of executionI: Instruction fetchE: ExecuteALU operation with register input and outputFor load and storeI: Instruction fetchE: ExecuteCalculate memory addressD: MemoryRegister to memory or memory to register operation35Figure 13.6cFigure 13.6c, three instructions can be overlapped, and the impr
24、ovement is as much as a factor of 3.I36Figure 13.6dE1: Register file readE2: ALU operation and register writeI37Effects of Pipelining38Optimization of PipeliningDelayed branchDelayed LoadLoop Unrolling39Optimization of Pipelining (1)Loop UnrollingReplicate body of loop a number of timesIterate loop
25、fewer timesReduces loop overheadIncreases instruction parallelismImproved register, data cache or TLB locality unroll 展开,打开(卷着的东西) replicate 英音:replikeit 折叠;复制 iterate 英音:itreit 反复,重复 overhead 英音:uvhed日常开支,额外开销 40Optimization of Pipelining (2)Delayed branchDoes not take effect until after execution
26、of following instructionThis following instruction is the delay slot41Normal and Delayed BranchAddressNormal BranchDelayed BranchOptimized Delayed Branch100LOADX, rALOADX, rALOAD X, rA101ADD1, rAADD1, rAJUMP105102JUMP105JUMP106ADD1, rA103ADDrA, rBNOOPADDrA, rB104SUBrC, rBADDrA, rBSUBrC, rB105STORE r
27、A, ZSUBrC, rBSTORE rA, Z106STORE rA, Z42Use of Delayed BranchAddressNormal BranchDelayed BranchOptimized Delayed Branch100LOADX, rALOAD X, rALOAD X, rA101ADD 1, rAADD 1, rAJUMP 105102JUMP 105JUMP 106ADD 1, rA103ADD rA, rBNOOPADD rA, rB104SUB rC, rBADD rA, rBSUB rC, rB105STORE rA, ZSUB rC, rBSTORE rA
28、, Z106STORE rA, Z43Optimization of Pipelining (3)Delayed LoadRegister to be target is locked by processorContinue execution of instruction stream until register requiredIdle until load completeRe-arranging instructions can allow useful work whilst loading4480486 Instruction Pipeline Examples45Controversy (1)Quantitativecompare program sizes and execution speedsQualitativeexamine issues of
温馨提示
- 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
- 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
- 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
- 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
- 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
- 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
- 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。
最新文档
- 2024年自贡客运资格证试题完整版
- 吉首大学《期货与期权》2021-2022学年第一学期期末试卷
- 吉首大学《非参数统计》2021-2022学年第一学期期末试卷
- 吉林艺术学院《造型基础训练III》2021-2022学年第一学期期末试卷
- 吉林艺术学院《数字化建筑环境设计软件基础SketchUP》2021-2022学年第一学期期末试卷
- 期刊经营转让协议书范文模板
- 吉林师范大学《中国画技法研究》2021-2022学年第一学期期末试卷
- 吉林师范大学《虚拟现实设计与制作》2021-2022学年第一学期期末试卷
- 2024年大棚蔬菜分包协议书模板
- 2024年大葱采购协议书模板
- 2024年国家公务员考试《行测》真题卷(副省级)答案及解析
- 教育局职业院校教师培训实施方案
- 2024年新华社招聘应届毕业生及留学回国人员129人历年高频难、易错点500题模拟试题附带答案详解
- 江苏省南京市秦淮区2023-2024学年八年级上学期期中语文试题及答案
- 2024年个人车位租赁合同参考范文(三篇)
- (完整版)新概念英语第一册单词表(打印版)
- 签申工作准假证明中英文模板
- 员工履历表(标准样本)
- 2024年山东省济南市中考数学真题(含答案)
- 山东省青岛市黄岛区2023-2024学年六年级上学期期中语文试卷
- 二手门市销售合同范本
评论
0/150
提交评论