William Stallings
Computer Organization and Architecture, 8th Edition

Chapter 13: Reduced Instruction Set Computers

Chapter 13 outline
- Key terms
- Key points
- Chapter titles
Key terms
- CISC: complex instruction set computer
- RISC: reduced instruction set computer
- Delayed branch
- Delayed load
- HLL: high-level language
- Register file
- Register window
- SPARC

Major Advances in Computers (1)
- The family concept
  - IBM System/360, 1964
  - DEC PDP-8
  - Separates architecture from implementation
- Microprogrammed control unit
  - Idea by Wilkes, 1951
  - Produced by IBM S/360, 1964
- Cache memory
  - IBM S/360 Model 85, 1969

Major Advances in Computers (2)
- Solid-state RAM (see memory notes)
- Microprocessors
  - Intel 4004, 1971
- Pipelining
  - Introduces parallelism into the fetch-execute cycle
- Multiple processors

The Next Step: RISC
Reduced Instruction Set Computer. Key features:
- Large number of general-purpose registers, or use of compiler technology to optimize register use
- Limited and simple instruction set
- Emphasis on optimising the instruction pipeline

Comparison of Processors (table)

Driving Force for CISC
- Software costs far exceed hardware costs
- Increasingly complex high-level languages
- Semantic gap
- Leads to:
  - Large instruction sets
  - More addressing modes
  - Hardware implementations of HLL statements, e.g. CASE (switch) on VAX

Vocabulary: semantic = relating to meaning; gap [gæp] = an opening or break.

Intention of CISC
- Ease compiler writing
- Improve execution efficiency
  - Complex operations in microcode
- Support more complex HLLs
Vocabulary: ease = to lighten or reduce; HLL (abbr.) = high-level language (computing).

Execution Characteristics
- Operations performed
- Operands used
- Execution sequencing
- Studies have been done based on programs written in HLLs
- Dynamic studies are measured during the execution of the program

Operations
- Assignments
  - Movement of data
- Conditional statements (IF, LOOP)
  - Sequence control
- Procedure call-return is very time consuming
- Some HLL instructions lead to many machine code operations

Vocabulary: assignment = allocation; designation.
Weighted Relative Dynamic Frequency of HLL Operations [PATT82a]

| Operation | Dynamic Occurrence (Pascal) | Dynamic Occurrence (C) | Machine-Instruction Weighted (Pascal) | Machine-Instruction Weighted (C) | Memory-Reference Weighted (Pascal) | Memory-Reference Weighted (C) |
|---|---|---|---|---|---|---|
| ASSIGN | 45% | 38% | 13% | 13% | 14% | 15% |
| LOOP | 5% | 3% | 42% | 32% | 33% | 26% |
| CALL | 15% | 12% | 31% | 33% | 44% | 45% |
| IF | 29% | 43% | 11% | 21% | 7% | 13% |
| GOTO | - | 3% | - | - | - | - |
| OTHER | 6% | 1% | 3% | 1% | 2% | 1% |

Operands
- Mainly local scalar variables
- Optimisation should concentrate on accessing local variables

| Operand type | Pascal | C | Average |
|---|---|---|---|
| Integer Constant | 16% | 23% | 20% |
| Scalar Variable | 58% | 53% | 55% |
| Array/Structure | 26% | 24% | 25% |

Procedure Calls
- Very time consuming
- Depends on number of parameters passed
- Depends on level of nesting
- Most programs do not do a lot of calls followed by lots of returns
- Most variables are local (cf. locality of reference)

Implications
- Best support is given by optimising most used and most time consuming features
- Large number of registers
  - Operand referencing
- Careful design of pipelines
  - Branch prediction etc.
- Simplified (reduced) instruction set

Large Register File
- Software solution
  - Require compiler to allocate registers
  - Allocate based on most used variables in a given time
  - Requires sophisticated program analysis
- Hardware solution
  - Have more registers
  - Thus more variables will be in registers

Registers for Local Variables
- Store local scalar variables in registers
- Reduces memory access
- Every procedure (function) call changes locality
- Parameters must be passed
- Results must be returned
- Variables from calling programs must be restored

Register Windows
- Only few parameters
- Limited range of depth of call
- Use multiple small sets of registers
- Calls switch to a different set of registers
- Returns switch back to a previously used set of registers

Register Windows (cont.)
- Three areas within a register set
  - Parameter registers
  - Local registers
  - Temporary registers
- Temporary registers from one set overlap parameter registers from the next
- This allows parameter passing without moving data
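
The overlap between one window's temporary registers and the next window's parameter registers can be illustrated with a small model. This is only a sketch under stated assumptions: the class name `WindowedRegisterFile`, the window sizes, and the logical register numbering (parameters first, then locals, then temporaries) are hypothetical choices for illustration, and window overflow is deliberately not handled.

```python
class WindowedRegisterFile:
    """Toy register file with overlapping windows (no overflow handling)."""

    def __init__(self, num_windows=4, num_params=2, num_locals=3):
        self.num_params = num_params          # also the number of temporaries
        self.num_locals = num_locals
        # Each window contributes only its parameters and locals to the
        # physical file; its temporaries are the next window's parameters.
        self.stride = num_params + num_locals
        self.num_windows = num_windows
        self.regs = [0] * (num_windows * self.stride)
        self.cwp = 0                          # current window pointer

    def _physical(self, logical):
        # Logical layout: [parameters | locals | temporaries]
        if not 0 <= logical < self.stride + self.num_params:
            raise IndexError("logical register out of range")
        return (self.cwp * self.stride + logical) % len(self.regs)

    def read(self, logical):
        return self.regs[self._physical(logical)]

    def write(self, logical, value):
        self.regs[self._physical(logical)] = value

    def call(self):
        # A call only moves the pointer; no register contents are copied.
        self.cwp = (self.cwp + 1) % self.num_windows

    def ret(self):
        self.cwp = (self.cwp - 1) % self.num_windows


rf = WindowedRegisterFile()
first_temp = rf.num_params + rf.num_locals    # logical index of temporary 0
rf.write(first_temp, 42)    # caller puts an argument in its temporary 0
rf.call()
arg = rf.read(0)            # callee sees the same cell as its parameter 0
rf.write(0, arg + 1)        # callee hands a result back through that cell
rf.ret()
result = rf.read(first_temp)
print(arg, result)          # prints: 42 43
```

The argument and the result travel between caller and callee purely by changing `cwp`, which is the point of the overlap: parameter passing without moving data.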
Vocabulary: cont. = (1) contents; (2) continued, continuous.

Overlapping Register Windows (figure)

Circular Buffer Diagram (figure; labels: main program, then subroutines A to G numbered 1 to 7)

Operation of Circular Buffer
- When a call is made, a current window pointer is moved to show the currently active register window
- If all windows are in use, an interrupt is generated and the oldest window (the one furthest back in the call nesting) is saved to memory
- A saved window pointer indicates where the next saved window should restore to

Global Variables
- Allocated by the compiler to memory
  - Inefficient for frequently accessed variables
- Have a set of registers for global variables

Registers v Cache

| Large Register File | Cache |
|---|---|
| All local scalars | Recently-used local scalars |
| Individual variables | Blocks of memory |
| Compiler-assigned global variables | Recently-used global variables |
| Save/Restore based on procedure nesting depth | Save/Restore based on cache replacement algorithm |
| Register addressing | Memory addressing |

Referencing a Scalar: Window-Based Register File (figure)

Referencing a Scalar: Cache (figure)

Compiler-Based Register Optimization
- Assume small number of registers (16-32)
- Optimizing use is up to compiler
- HLL programs usually have no explicit references to registers (though C allows "register int")
- Assign symbolic or virtual register to each candidate variable
- Map (unlimited) symbolic registers to real registers
- Symbolic registers that do not overlap can share real registers
- If you run out of real registers some variables use memory

Graph Coloring
- Given a graph of nodes and edges
- Assign a color to each node
- Adjacent nodes have different colors
- Use minimum number of colors
- Nodes are symbolic registers
- Two registers that are live in the same program fragment are joined by an edge
- Try to color the graph with n colors, where n is the number of real registers
- Nodes that cannot be colored are placed in memory

Graph Coloring Approach (figure)

Why CISC (1)?
- Compiler simplification?
  - Disputed…
  - Complex machine instructions are harder to exploit (that is, to make use of)
  - Optimization more difficult
- Smaller programs?
  - Program takes up less memory but…
  - Memory is now cheap
  - May not occupy fewer bits, just look shorter in symbolic form
    - More instructions require longer op-codes
    - Register references require fewer bits
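
Returning to the graph coloring approach to register allocation described earlier, the idea can be sketched in a few lines. This is a simple greedy heuristic chosen for illustration, not necessarily the method a production compiler or the textbook uses, and it does not guarantee the minimum number of colors. The function name `color_registers` and the example interference graph are hypothetical.

```python
def color_registers(interference, num_real_registers):
    """Greedily assign real registers to symbolic registers.

    interference maps each symbolic register to the set of symbolic
    registers that are live in the same program fragment (its neighbours).
    Returns (assignment, spilled): a dict from symbolic register to real
    register number, and a list of symbolic registers left in memory.
    """
    assignment = {}
    spilled = []
    # Assumption of this sketch: colour the most constrained nodes first,
    # breaking ties by name so the result is deterministic.
    order = sorted(interference, key=lambda n: (-len(interference[n]), n))
    for node in order:
        taken = {assignment[m] for m in interference[node] if m in assignment}
        free = [r for r in range(num_real_registers) if r not in taken]
        if free:
            assignment[node] = free[0]
        else:
            spilled.append(node)      # no colour left: keep it in memory
    return assignment, spilled


# Hypothetical interference graph over symbolic registers A to E.
graph = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"B", "C"},
    "E": set(),
}
assignment, spilled = color_registers(graph, 3)
print(assignment, spilled)
```

With three real registers nothing is spilled, and A and D end up sharing a register because no edge joins them. Calling `color_registers(graph, 2)` instead leaves A and D in memory, matching the last bullet of the Graph Coloring slide.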
Vocabulary: dispute [di'spju:t] = to argue; a disagreement. simplification = the act of making something simpler.

Why CISC (2)?
- Faster programs?
  - Bias towards use of simpler instructions
  - More complex control unit
  - Microprogram control store larger, thus simple instructions take longer to execute
- It is far from clear that CISC is the appropriate solution

Vocabulary: bias = tendency; inclination.

RISC Characteristics
- One instruction per cycle
- Register to register operations
- Few, simple addressing modes
- Few, simple instruction formats
- Hardwired design (no microcode)
- Fixed instruction format
- More compile time/effort

RISC v CISC
- Not clear cut
- Many designs borrow from both philosophies
- e.g. PowerPC and Pentium II
Vocabulary: philosophies = philosophies; points of view.

RISC Pipelining
- Most instructions are register to register
- Two phases of execution
  - I: Instruction fetch
  - E: Execute
    - ALU operation with register input and output
- For load and store
  - I: Instruction fetch
  - E: Execute
    - Calculate memory address
  - D: Memory
    - Register to memory or memory to register operation

Sequential Execution
Figure 13.6a depicts the timing of a sequence of instructions using no pipelining. Clearly, this is a wasteful process.

Figure 13.6b shows a two-stage pipelining scheme, in which the I and E stages of two different instructions are performed simultaneously.

In Figure 13.6c, up to three instructions can be overlapped, and the improvement is as much as a factor of 3.

Figure 13.6d divides the E stage into two substages:
- E1: Register file read
- E2: ALU operation and register write

Effects of Pipelining (figure)

Optimization of Pipelining
- Delayed branch
- Delayed load
- Loop unrolling

Optimization of Pipelining (1): Loop Unrolling
- Replicate body of loop a number of times
- Iterate loop fewer times
- Reduces loop overhead
- Increases instruction parallelism
- Improved register, data cache or TLB locality
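
The transformation itself can be shown on a simple summation loop. This is a minimal sketch with hypothetical function names and an unroll factor of 4 picked arbitrarily. It is written in Python only to show the shape of the rewrite: the pipeline benefits listed above apply to the machine code a compiler generates, and the Python version should not be expected to run faster.

```python
def sum_rolled(values):
    """Original loop: one addition and one loop test per element."""
    total = 0
    for i in range(len(values)):
        total += values[i]
    return total


def sum_unrolled_by_4(values):
    """Same loop with the body replicated four times.

    The loop test and index update now run once per four elements,
    which is the reduction in loop overhead.
    """
    total = 0
    n = len(values)
    limit = n - n % 4          # largest multiple of 4 not exceeding n
    i = 0
    while i < limit:
        total += values[i]
        total += values[i + 1]
        total += values[i + 2]
        total += values[i + 3]
        i += 4
    # Clean-up loop for the 0 to 3 elements left over.
    for j in range(limit, n):
        total += values[j]
    return total


data = list(range(10))
print(sum_rolled(data), sum_unrolled_by_4(data))   # prints: 45 45
```

The clean-up loop is needed whenever the trip count is not a multiple of the unroll factor; forgetting it is the usual bug in hand-unrolled code.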
Vocabulary: unroll = to unfold or open out (something rolled up); replicate ['replikeit] = to fold back; to copy; iterate ['itəreit] = to repeat; overhead ['əuvə'hed] = running expenses; extra cost.

Optimization of Pipelining (2): Delayed Branch
- Does not take effect until after execution of following instruction
- This following instruction is the delay slot

Normal and Delayed Branch

| Address | Normal Branch | Delayed Branch | Optimized Delayed Branch |
|---|---|---|---|
| 100 | LOAD X, rA | LOAD X, rA | LOAD X, rA |
| 101 | ADD 1, rA | ADD 1, rA | JUMP 105 |
| 102 | JUMP 105 | JUMP 106 | ADD 1, rA |
| 103 | ADD rA, rB | NOOP | ADD rA, rB |
| 104 | SUB rC, rB | ADD rA, rB | SUB rC, rB |
| 105 | STORE rA, Z | SUB rC, rB | STORE rA, Z |
| 106 | | STORE rA, Z | |

Use of Delayed Branch: the same three code sequences as in the table above.
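
To check that the three columns of the table compute the same thing, they can be run through a tiny interpreter. This is a sketch under stated assumptions: the function `run` is hypothetical, the operand order is read as "source, destination" (so `ADD 1, rA` means rA = rA + 1), and a delayed `JUMP` is modelled as taking effect after exactly one following instruction.

```python
def run(program, memory, delayed):
    """Execute a tiny program; return (final memory, instructions executed).

    program maps address -> (opcode, operands). With delayed=True a JUMP
    takes effect only after the instruction that follows it (the delay slot).
    """
    regs = {"rA": 0, "rB": 0, "rC": 0}
    mem = dict(memory)
    pc = min(program)
    pending = None                  # jump target waiting for the delay slot
    executed = 0
    while pc in program:
        op, args = program[pc]
        executed += 1
        next_pc = pc + 1
        if pending is not None:     # this instruction sits in the delay slot
            next_pc, pending = pending, None
        if op == "LOAD":            # LOAD X, rA : rA <- mem[X]
            regs[args[1]] = mem[args[0]]
        elif op == "ADD":           # ADD src, dst : dst <- dst + src
            src = regs[args[0]] if args[0] in regs else args[0]
            regs[args[1]] += src
        elif op == "SUB":           # SUB src, dst : dst <- dst - src
            regs[args[1]] -= regs[args[0]]
        elif op == "STORE":         # STORE rA, Z : mem[Z] <- rA
            mem[args[1]] = regs[args[0]]
        elif op == "JUMP":
            if delayed:
                pending = args[0]
            else:
                next_pc = args[0]
        # NOOP falls through and does nothing
        pc = next_pc
    return mem, executed


LOAD, INC, STORE = ("LOAD", ("X", "rA")), ("ADD", (1, "rA")), ("STORE", ("rA", "Z"))
ADDB, SUBB, NOOP = ("ADD", ("rA", "rB")), ("SUB", ("rC", "rB")), ("NOOP", ())

normal_prog = {100: LOAD, 101: INC, 102: ("JUMP", (105,)),
               103: ADDB, 104: SUBB, 105: STORE}
delayed_prog = {100: LOAD, 101: INC, 102: ("JUMP", (106,)), 103: NOOP,
                104: ADDB, 105: SUBB, 106: STORE}
optimized_prog = {100: LOAD, 101: ("JUMP", (105,)), 102: INC,
                  103: ADDB, 104: SUBB, 105: STORE}

start = {"X": 41, "Z": 0}
mem_n, count_n = run(normal_prog, start, delayed=False)
mem_d, count_d = run(delayed_prog, start, delayed=True)
mem_o, count_o = run(optimized_prog, start, delayed=True)
print(mem_n["Z"], mem_d["Z"], mem_o["Z"])   # prints: 42 42 42
print(count_n, count_d, count_o)            # prints: 4 5 4
```

All three versions store X + 1 into Z. The plain delayed version spends one extra instruction on the `NOOP`, while the optimized version fills the delay slot with the `ADD` that was going to execute anyway.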
Optimization of Pipelining (3): Delayed Load
- Register to be target is locked by processor
- Continue execution of instruction stream until register required
- Idle until load complete
- Re-arranging instructions can allow useful work whilst loading

80486 Instruction Pipeline Examples (figure)

Controversy (1)
- Quantitative
  - compare program sizes and execution speeds
- Qualitative
  - examine issues of high level language support and use of VLSI real estate
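Vocabulary: controversy ['kɔntrəvə:si] = dispute; debate. quantitative = relating to quantity. qualitative = relating to quality. estate [is'teit] = property; assets.

Controversy (2)
- Problems
  - No pair of RISC and CISC that are directly comparable
  - No definitive set of test programs
  - Difficult to separate hardware effects from compiler effects
  - Most comparisons done on "toy" rather than production machines
  - Most commercial devices are a mixture

Required Reading
- Stallings chapter 13
- Manufacturer web sites

Question: In current MIPS processor designs, how are delay slots and branch prediction related? Are they two entirely unrelated techniques for improving pipeline utilization? Is the delay slot an early performance technique that is now rarely used?

1. Overview
Simply put, the branch delay slot is…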