More Pipeline

Basic RISC Pipelining
- Basic idea: each instruction spends 1 clock cycle in each of the 5 execution stages.
- During 1 clock cycle, the pipeline can process (in different stages) 5 different instructions.

Simple RISC Datapath
[Figure: datapath across the IF, ID, EX, MEM and WB stages; labels include Program Counter, Next PC, Inst., Reg., and Load fr. Mem. Data.]

Description of Pipe Stages

Hazards

The hazards of pipelining
- Pipeline hazards prevent the next instruction from executing during its designated clock cycle.
- There are 3 classes of hazards:
  - Structural hazards: arise from resource conflicts; the HW cannot support all possible combinations of instructions.
  - Data hazards: occur when a given instruction depends on data from an instruction ahead of it in the pipeline.
  - Control hazards: result from branches and other instructions that change the flow of the program (i.e. change the PC).

How do we deal with hazards?
- Often, the pipeline must be stalled.
- Stalling the pipeline usually lets some instruction(s) in the pipeline proceed while another/others wait for data, a resource, etc.

Stalls and performance
- Stalls impede the progress of a pipeline and result in deviation from 1 instruction executing per clock cycle.
- Pipelining can be viewed as decreasing either the CPI or the clock cycle time per instruction.
- Let's see what effect stalls have on CPI…
  CPI pipelined = Ideal CPI + Pipeline stall cycles per instruction
                = 1 + Pipeline stall cycles per instruction
- Ignoring overhead and assuming stages are balanced:
  Speedup = CPI unpipelined / (1 + Pipeline stall cycles per instruction)

Even more pipeline performance issues!
- When every instruction takes as many cycles as there are pipeline stages, this results in:
  Speedup = Pipeline depth / (1 + Pipeline stall cycles per instruction)
- If there are no stalls, in the ideal case speedup == number of pipeline stages.

1. Structural hazards
- Most common instances of structural hazards:
  - When a functional unit is not fully pipelined
  - When some resource is not duplicated enough
- One way to avoid structural hazards is to duplicate resources.
- When the pipeline stalls as a result of hazards, CPI increases from the usual "1".

An example of a structural hazard
[Figure: pipeline diagram over time for Load and Instructions 1 to 4, each passing through Mem, Reg, ALU, DM, Reg.]
- What's the problem here? The processor has a combined instruction + data memory with only 1 read port.

How is it resolved?
[Figure: the same diagram for Load, Instruction 1, Instruction 2, Stall, Instruction 3, with a row of bubbles in place of the stalled instruction.]
- The pipeline is generally stalled by inserting a "bubble" or NOP.

Or alternatively…

Clock number  1     2     3     4     5     6     7     8     9     10
LOAD          IF    ID    EX    MEM   WB
Inst. i+1           IF    ID    EX    MEM   WB
Inst. i+2                 IF    ID    EX    MEM   WB
Inst. i+3                       stall IF    ID    EX    MEM   WB
Inst. i+4                                   IF    ID    EX    MEM   WB
Inst. i+5                                         IF    ID    EX    MEM
Inst. i+6                                               IF    ID    EX

- The LOAD instruction "steals" an instruction fetch cycle, which causes the pipeline to stall. Thus, no instruction completes on clock cycle 8.

Remember the common case!
- In some cases it may be better to allow structural hazards than to eliminate them.
- These are questions a computer architect might have to consider:
  - Is pipelining functional units or duplicating them costly in terms of HW?
  - Does the structural hazard occur often?
  - What's the common case?

2. Data hazards
- Why do they exist?
  - Pipelining changes the order of read/write accesses to operands.
  - The order differs from the order seen by sequentially executing instructions on an unpipelined machine.
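The change in access order can be made concrete with a small timing sketch. It assumes the 5-stage pipeline described above with no stalls, registers read in ID and written in WB, and a read that falls in the same cycle as the write counted as too early. The function names are illustrative, not part of the slides.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def cycle_of(issue_cycle, stage):
    """Clock cycle in which an instruction fetched at issue_cycle occupies stage (no stalls)."""
    return issue_cycle + STAGES.index(stage)

def reads_before_write(producer_issue, consumer_issue):
    """True if the consumer reads its registers (ID) no later than the producer writes (WB)."""
    return cycle_of(consumer_issue, "ID") <= cycle_of(producer_issue, "WB")

# A producer fetched in cycle 1 writes its result in cycle 5.
for consumer in range(2, 7):
    print(consumer, reads_before_write(1, consumer))
```

Under these assumptions the three instructions fetched right after the producer read too early, while on an unpipelined machine every read would follow the previous instruction's write.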
- Consider this example:

    ADD R1, R2, R3
    SUB R4, R1, R5
    AND R6, R1, R7
    OR  R8, R1, R9
    XOR R10, R1, R11

- All instructions after ADD use the result of ADD.
- ADD writes the register in WB, but SUB needs it in ID. This is a data hazard.

Illustrating a data hazard
[Figure: pipeline diagram over time for ADD R1,R2,R3; SUB R4,R1,R5; AND R6,R1,R7; OR R8,R1,R9; XOR R10,R1,R11.]
- The ADD instruction causes a hazard in the next 3 instructions because the register is not written until after those 3 read it.

Data hazard specifics
- There are actually 3 different kinds of data hazards:
  - Read After Write (RAW)
  - Write After Write (WAW)
  - Write After Read (WAR)
- Assume that hazards involve instructions i and j. i is always issued before j, so i will always be further along in the pipeline than j.
- With an in-order issue/in-order completion machine, we're not as concerned with WAW and WAR.

Three Types of Data Hazards
Let i be an earlier instruction, j a later one.
- RAW (read after write): j tries to read a value before i writes it.
- WAW (write after write): i and j write to the same place, but in the wrong order.
  Condition: only occurs if more than 1 pipeline stage can write (in-order).
- WAR (write after read): j writes a new value to a location before i has read the old one.
  Condition: only occurs if writes can happen before reads in the pipeline (in-order).

Read after write (RAW) hazards
- With a RAW hazard, instruction j tries to read a source operand before instruction i writes it.
- Thus, j would incorrectly receive an old or incorrect value.
- Example:
    i: ADD R1, R2, R3    (i is a write instruction issued before j)
    j: SUB R4, R1, R6    (j is a read instruction issued after i)
- Stalling or forwarding can be used to resolve this hazard.

Forwarding
- The hazard can actually be solved relatively easily, with forwarding.
- In this example, the result of the ADD instruction is not really needed until after ADD actually produces it.
- Can we move the result from the EX/MEM register to the beginning of the ALU (where SUB needs it)?
- Generally speaking: forwarding occurs when a result is passed directly to the functional unit that requires it. The result goes from the output of one unit to the input of another.

When can we forward?
[Figure: pipeline diagram over time for ADD R1,R2,R3; SUB R4,R1,R5; AND R6,R1,R7; OR R8,R1,R9; XOR R10,R1,R11, with forwarding paths drawn.]
- SUB gets its input from the EX/MEM pipe register.
- AND gets its input from the MEM/WB pipe register.
- OR gets its input by forwarding from the register file.
- Rule of thumb: if the line goes "forward" you can do forwarding. If it's drawn backward, it's physically impossible.

Data Hazard Detection

Hazard Detection Logic
- Example: detecting whether an instruction that has just been fetched needs to be stalled because of a preceding load.

Forwarding Situations in DLX

HW Change for Forwarding
[Figure: forwarding hardware; labels include Mux, Mux, ALU, Zero?, Data memory, ID/EX, EX/MEM, MEM/WB.]

Forwarding: It doesn't always work
[Figure: pipeline diagram over time for LW R1,0(R2); SUB R4,R1,R5; AND R6,R1,R7; OR R8,R1,R9.]
- A load has a latency that forwarding can't solve.
- The pipeline must stall until the hazard is cleared (starting with the instruction that wants to use the data, until the source produces it).

The solution
[Figure: the same sequence LW R1,0(R2); SUB R4,R1,R5; AND R6,R1,R7; OR R8,R1,R9 with bubbles inserted.]
- Insertion of the bubble causes the number of cycles needed to complete this sequence to grow by 1.

Data hazards and the compiler
- The compiler should be able to help eliminate some stalls caused by data hazards.
- For example, the compiler can avoid generating a LOAD instruction that is immediately followed by an instruction that uses the LOAD's destination register.
- The technique is called "pipeline/instruction scheduling".

A simple Example
- A clever compiler can often reschedule instructions to avoid a stall.
- A simple example:

Original code:

    lw   r2, 0(r4)
    add  r1, r2, r3      <- Note: stall happens here!
    lw   r5, 4(r4)

Transformed code:

    lw   r2, 0(r4)
    lw   r5, 4(r4)
    add  r1, r2, r3      <- No stall needed!
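The effect of the reordering can be checked with a minimal sketch that counts load-use stalls. It assumes a pipeline with forwarding, so the only stall is one cycle when an instruction immediately follows a load and reads the load's destination register. The parser is a hypothetical helper that only understands `lw rd, offset(base)` and three-register ALU instructions.

```python
def parse(line):
    """Split an instruction such as 'lw r2, 0(r4)' into (opcode, dest, sources)."""
    op, rest = line.split(None, 1)
    args = [a.strip() for a in rest.split(",")]
    if op == "lw":
        # lw rd, offset(base): the base register is the only source
        base = args[1][args[1].index("(") + 1 : args[1].index(")")]
        return op, args[0], [base]
    return op, args[0], args[1:]

def load_use_stalls(program):
    """Count one stall for each instruction that directly follows a load and reads its result."""
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        prev_op, prev_dest, _ = parse(prev)
        _, _, cur_sources = parse(cur)
        if prev_op == "lw" and prev_dest in cur_sources:
            stalls += 1
    return stalls

original = ["lw r2, 0(r4)", "add r1, r2, r3", "lw r5, 4(r4)"]
transformed = ["lw r2, 0(r4)", "lw r5, 4(r4)", "add r1, r2, r3"]
print(load_use_stalls(original), load_use_stalls(transformed))  # prints: 1 0
```

Moving the independent second load between the first load and the add fills the cycle that would otherwise be a bubble.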

Simple RISC Pipeline Stall Statistics
[Figure: chart of the % of loads that cause a stall.]

Write after write (WAW) hazards
- With a WAW hazard, instruction j tries to write an operand before instruction i writes it.
- The writes are performed in the wrong order, leaving the value written by the earlier instruction.
- Example:
    i: DIV F1, F2, F3    (i is a write instruction issued before j)
    j: SUB F1, F4, F6    (j is a write instruction issued after i)

Write after read (WAR) hazards
- With a WAR hazard, instruction j tries to write an operand before instruction i reads it.
- Instruction i would incorrectly receive the newer value of its operand: instead of getting the old value, it could receive some newer, undesired value.
- Example:
    i: DIV F7, F1, F3    (i is a read instruction issued before j)
    j: SUB F1, F4, F6    (j is a write instruction issued after i)

3. Control (Branch) Hazards
- Suppose the new PC value is not computed until the MEM stage.
- Then we must stall 3 clocks after every branch!

Branch Hazards
- We need to consider hazards involving branches.
- Example:

    40: beq $1, $3, 28
    44: and $12, $2, $5
    48: or  $13, $6, $2
    52: add $14, $2, $2
    72: lw  $4, 50($7)

Pipeline impact on branch
- How do we deal with this?
  - Always stall
  - Assume branch-not-taken
  - Branch delay slots

Assume branch not taken
- On average, branches are taken half the time.
- If the branch is not taken: continue normal processing.
- Else, if the branch is taken: need to flush the improper instructions from the pipeline.
- Cuts the overall time for branch processing in half.

Assume branch not taken, case 1: not taken
- Execution proceeds normally, no penalty.

Assume branch not taken, case 2: taken branch
- Bubbles injected into 3 stages during cycle 5.

Sum: Branch Penalty Impact
- Assume 16% of all instructions are branches:
  - 4% unconditional branches: 3 cycle penalty
  - 12% conditional: 50% taken, 3 cycle penalty
- For a sequence of N instructions (assume N is large):
  - N cycles to initiate each
  - 3 * 0.04 * N delays due to unconditional branches
  - 0.5 * 3 * 0.12 * N delays due to conditional taken branches
  - Also, an extra 4 cycles for the pipeline to empty
- Total: 1.3 * N + 4 total cycles (or a CPI of 1.3 cycles/instruction)
- 30% performance hit!!! (Bad thing)

Branch delay slot
- Delay slot: find one instruction that will be executed no matter which way the branch goes.
- Branches always execute the next 1 or 2 instructions.
- An instruction so executed is said to be in the delay slot.

    branch instruction
    Delay slot instruction 1
    Delay slot instruction 2
    …
    Delay slot instruction n
    branch target if taken

(Branch delay slot of length n)

Scheduling Delayed Branch
[Figure: three ways to fill the branch delay slot, labelled "From before", "From target" and "From fall through", shown before and after scheduling with the instructions ADD R1,R2,R3; SUB R4,R5,R6; OR R7,R8,R9 and the branches "if R2=0 then" and "if R1=0 then".]

Scheduling Delayed Branch
- Where do we get instructions to fill the branch delay slot?
  - From before the branch instruction: always valuable
  - From the target address: only valuable when the branch is taken
  - From fall through: only valuable when the branch is not taken

Fast Branch Resolution
- The performance penalty could be more than 30%:
  - Deeper pipelines
  - Some code is very branch heavy
- Fast branch resolution:
  - Adder in ID for PC + immediate targets
  - Only works for simple conditions (compare to 0)
  - Comparing two register values could be too slow

New Pipeline Logic

Example
- Assume the following MIPS instruction mix:

    Type          Frequency
    Arith/Logic   40%
    Load          30%, of which 25% are followed immediately by an instruction using the loaded value
    Store         10%
    Branch        20%, of which 45% are taken

- What is the resulting CPI for the pipelined MIPS with forwarding and branch address calculation in the ID stage when using a branch-not-taken scheme?

    CPI = Ideal CPI + Pipeline stall clock cycles per instruction
        = 1 + stalls by loads + stalls by branches
        = 1 + 0.3 x 0.25 x 1 + 0.2 x 0.45 x 1
        = 1 + 0.075 + 0.09
        = 1.165

Exceptions

Types of Exceptions (Interrupts, Faults)
- I/O device request, timer event
- Invoking OS services from a user program
- Tracing (single-stepping) through a program
- Breakpoints
- Integer arithmetic overflow, divide by zero
- FP arithmetic anomaly (overflow, underflow, etc.)
- Page fault (page not in physical memory)
- Misaligned memory access
- Memory-protection violation (accessing memory not allocated to the process)
- Illegal (undefined or unimplemented) instruction
- Hardware malfunction
- Power-related interrupt (e.g. battery low, power failure)
- …

Exception Characterization 1
- Synchronous vs. asynchronous: is the event synchronized with program execution?
  - Synchronous: the event occurs at the same place every time
  - Asynchronous: caused by devices external to the CPU and memory, also HW malfunctions
- User requested vs. coerced: is the event caused intentionally by the user program?
  - Requested: the user task asks for it
  - Coerced: HW event not under control of the user program

Exception Characterization 2
- User maskable or not: can the event be disabled?
  - Maskable: an event that can be disabled by the user task
- Within instructions or between instructions: does the event prevent the instruction from completing?
  - Within: occurs during execution of the instruction, hard to handle, usually synchronous since the instruction is the trigger
- Resume vs. terminate: does the program continue from where it left off after the exception is handled, or does it stop?
  - Terminating: execution always stops after the interrupt

Restartable Exceptions
- Requirements:
  - The exception may occur within an instruction.
  - The program must continue after the exception is handled.
- Examples: virtual memory page fault.
- Difficult because: pipeline state must be saved.
- One approach, for easy cases:
  1. Force a trap instruction into the pipeline on the next IF.
  2. Clear the pipeline behind the faulting instruction.
  3. The exception handler saves the PC of the faulting instruction.

Precise vs. Imprecise Handling
Machines may support either or both modes of exception handling:
- Precise exception handling:
  - Correctly implement all possible combinations of exceptions in all circumstances.
  - May be a requirement for some systems/applications.
  - May be 10x slower!
  - Easier for integer than floating-point.
  - Useful for debugging code.
- Imprecise exception handling:
  - Only correctly implement the most common cases.
  - Software may avoid some exceptions.
  - Only statistical guarantees of correctness, through testing.

Exceptions in DLX pipeline
- Instruction Fetch and Memory stages:
  - Page fault on instruction/data fetch
  - Misaligned memory access
  - Memory-protection violation
- Instruction Decode stage:
  - Undefined/illegal opcode
- Execution stage:
  - Arithmetic exception
- Write-Back stage:
  - None!

Out-of-Order Exceptions
- Consider the following code sequence:

    LW    IF  ID  EX  MEM WB
    ADD       IF  ID  EX  MEM WB

- The ADD may cause an exception during IF, before LW causes an exception during MEM!
- Can't restart the PC on the ADD!
- Solution:
  - Note the exception in a status vector, carried along with the instruction.
  - Disable writes for that instruction.
  - Resolve all exceptions at a late stage (e.g. WB).

Pipelining Complications
- Complex addressing modes and instructions
- Autoincrement address modes cause a register change during instruction execution
  - Interrupts? Need to restore register state
  - Adds WAR and WAW hazards since writes are no longer in the last stage
- Floating point: long execution time; out-of-order completion

Stopping and Starting Execution
- The most difficult exception occurrences have 2 properties:
  - They occur within instructions
  - They must be restartable
- The pipeline must be shut down safely and the state must be saved for correct restarting.
- Restarting is usually done by saving the PC of the instruction at which to start.
- Branches and delayed branches require special treatment.
- Precise exceptions allow instructions just before the exception to be completed, while restarting instructions after the exception.

Multi-cycle Operations

Multi-cycle Operations for FP

Pipelined Multiple-Issue FPU

Out-of-order complete
- Notice that instructions may complete out of order:

    MULTD   IF ID M1 M2 M3 M4 M5 M6 M7 ME WB
    ADDD    IF ID A1 A2 A3 A4 ME WB
    LD      IF ID EX ME WB
    SD      IF ID EX ME WB

Typical FP Code Seq. RAW Stalls

Instruction     1     2     3     4     5     6     7     8     9     10    11    12    13    14    15    16    17
L.D F4,0(R2)    IF    ID    EX    ME    WB
MUL.D F0,F4,F6        IF    ID    stall M1    M2    M3    M4    M5    M6    M7    ME    WB
ADD.D F2,F0,F8              IF    stall ID    stall stall stall stall stall stall A1    A2    A3    A4    ME    WB
S.D F2,0(R2)                      IF    stall stall stall stall stall stall ID    EX    stall stall stall ME

(Columns are clock cycle numbers.)

Structure hazards

Sum: multiple-cycles problems
- Raises the possibility of WAW hazards, and of structural hazards in the MEM and WB stages.
- Structural hazards may occur especially often with a non-pipelined DIV unit.
- Out-of-order completion impacts exception handling.

Appendix: The MIPS R4000 Pipeline

The MIPS R4300 Pipeline
- Manufactured by NEC
- 64-bit processor, implements the MIPS64 ISA
- Used in embedded applications: Nintendo-64 game processor, network router, …
- Multiple EX stages for the floating-point pipeline
- Out-of-order completion, precise exceptions
- NEC VR4122: Integer data
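As a cross-check of the two CPI calculations earlier in the deck (the branch penalty summary and the MIPS instruction-mix example), here is a minimal sketch of the formula CPI = ideal CPI + stall cycles per instruction. The function name and the tuple layout are illustrative assumptions; the percentages and penalties are the ones quoted in those slides.

```python
def cpi_with_stalls(ideal_cpi, stall_sources):
    """CPI = ideal CPI + sum of (instruction fraction * event probability * penalty cycles)."""
    return ideal_cpi + sum(frac * prob * penalty for frac, prob, penalty in stall_sources)

# Branch penalty summary: 4% unconditional branches (always 3 cycles),
# 12% conditional branches of which 50% are taken (3 cycles when taken).
branch_cpi = cpi_with_stalls(1.0, [(0.04, 1.0, 3), (0.12, 0.5, 3)])

# MIPS mix example: 30% loads, 25% of them followed by a dependent use (1 stall);
# 20% branches, 45% of them taken (1 stall with the target computed in ID).
mix_cpi = cpi_with_stalls(1.0, [(0.30, 0.25, 1), (0.20, 0.45, 1)])

print(round(branch_cpi, 3), round(mix_cpi, 3))  # prints: 1.3 1.165
```

Both results match the slide figures, and they show how much moving branch resolution into ID reduces the per-branch penalty.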
