Algorithm Details of Multiple-Issue Instructions (多发射指令的算法细节)
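Ch 4 Instruction-Level Parallelism
Embedded System Lab, Fall 2012

4.1 Instruction-Level Parallelism (ILP)

- Dependences are an inherent property of programs.
- Dependences give rise to data hazards.
- Hazards force the CPU to stall.
- Kinds of dependence: data, structural, and control.
- ILP: overlapping the execution of independent instructions.

    Loop: LD   F0,0(R1)
          SUBI R2,R2,8
          SUBI R3,R3,8
          ADDD F4,F0,F2

Name dependence

Another kind of dependence is the name dependence: two instructions use the same name (a register or memory location) but no data flows between them.

- Antidependence (WAR): instruction j writes a register or memory location that instruction i reads, where instruction i executes first.
- Output dependence (WAW): instruction i and instruction j write the same register or memory location, so the order of the two writes must be preserved.

Does the following code contain name dependences?

    Loop: LD   F0,0(R1)
          ADDD F4,F0,F2
          SD   0(R1),F4
          LD   F0,-8(R1)
          ADDD F4,F0,F2
          SD   -8(R1),F4
          LD   F0,-16(R1)
          ADDD F4,F0,F2
          SD   -16(R1),F4
          LD   F0,-24(R1)
          ADDD F4,F0,F2
          SD   -24(R1),F4
          SUBI R1,R1,#32
          BNEZ R1,LOOP
          NOP

How can the name dependences be removed?

Removing name dependences

    Loop: LD   F0,0(R1)
          ADDD F4,F0,F2
          SD   0(R1),F4      ;drop SUBI & BNEZ
          LD   F6,-8(R1)
          ADDD F8,F6,F2
          SD   -8(R1),F8     ;drop SUBI & BNEZ
          LD   F10,-16(R1)
          ADDD F12,F10,F2
          SD   -16(R1),F12   ;drop SUBI & BNEZ
          LD   F14,-24(R1)
          ADDD F16,F14,F2
          SD   -24(R1),F16
          SUBI R1,R1,#32     ;alter to 4*8
          BNEZ R1,LOOP
          NOP

This technique is called register renaming.

Some definitions for instruction-level parallelism

- Basic block: straight-line code with no branches.
- A whole program consists of basic blocks connected by branches.
- In MIPS programs branches make up about 15% of the instructions, so a basic block holds roughly 4 to 7 instructions.
- OS code contains fewer branches: it mostly manages resources, writes status registers and control registers, and sets control variables.
- Parallelism across basic blocks (loop-level parallelism).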
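The WAR and WAW definitions above can be checked mechanically. The following is a minimal Python sketch, not part of the original slides: the `(text, dest, srcs)` tuple encoding and the function name `name_dependences` are assumptions made for illustration, only registers are tracked (memory names are ignored), and every pair of instructions is compared without looking at intervening redefinitions.

```python
# Hypothetical encoding: (assembly text, destination register or None, source registers).
ORIGINAL = [
    ("LD   F0,0(R1)",  "F0", ("R1",)),
    ("ADDD F4,F0,F2",  "F4", ("F0", "F2")),
    ("SD   0(R1),F4",  None, ("R1", "F4")),
    ("LD   F0,-8(R1)", "F0", ("R1",)),
    ("ADDD F4,F0,F2",  "F4", ("F0", "F2")),
    ("SD   -8(R1),F4", None, ("R1", "F4")),
]

# Same two iterations after register renaming (F0 -> F6, F4 -> F8 in the second copy).
RENAMED = [
    ("LD   F0,0(R1)",  "F0", ("R1",)),
    ("ADDD F4,F0,F2",  "F4", ("F0", "F2")),
    ("SD   0(R1),F4",  None, ("R1", "F4")),
    ("LD   F6,-8(R1)", "F6", ("R1",)),
    ("ADDD F8,F6,F2",  "F8", ("F6", "F2")),
    ("SD   -8(R1),F8", None, ("R1", "F8")),
]


def name_dependences(program):
    """Return (kind, i, j) for each pair i < j (1-based) with a register name dependence."""
    found = []
    for i, (_, dest_i, srcs_i) in enumerate(program):
        for j in range(i + 1, len(program)):
            dest_j = program[j][1]
            if dest_j is None:
                continue  # stores write memory, which this sketch does not track
            if dest_j in srcs_i:
                found.append(("WAR", i + 1, j + 1))  # j overwrites a register that i reads
            if dest_j == dest_i:
                found.append(("WAW", i + 1, j + 1))  # i and j write the same register
    return found


if __name__ == "__main__":
    print(name_dependences(ORIGINAL))
    # [('WAW', 1, 4), ('WAR', 2, 4), ('WAW', 2, 5), ('WAR', 3, 5)]
    print(name_dependences(RENAMED))
    # []
```

With the original register names, the second iteration conflicts with the first only through the reuse of F0 and F4; after renaming, the list of name dependences is empty, which is what lets the two iterations be interleaved.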
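Characteristics of loops

- The branch that controls a loop is strongly biased: in the vast majority of cases it is taken.
- Prediction is therefore fairly easy, but a prediction scheme is still required.

Average pipeline CPI

    Pipeline CPI = Ideal pipeline CPI + Structural stalls + RAW stalls + WAR stalls + WAW stalls + Control stalls

This chapter studies methods and techniques for reducing the number of stalls.

Basic approaches to instruction scheduling

- Software (compiler optimization)
  - gcc: about 17% of instructions are control instructions, i.e. 5 instructions + 1 branch.
  - Obtain more parallelism than a single basic block offers.
  - Exploit loop-level parallelism.
- Hardware
  - Dynamic scheduling.

Static and dynamic scheduling

- 8086: I/O cycles and CPU cycles
- 386: overlapped instruction execution
- 486: instruction-level parallelism
- Dynamic instruction scheduling: Pentium Pro, Pentium II/III/IV, AMD Athlon, MIPS R10K/R12K, Sun UltraSPARC, PowerPC 603/G3/G4/G5 (IBM-Motorola-Apple), Alpha 21264
- Static scheduling: Itanium, Transmeta Crusoe

A loop example

    for (i = 1; i <= 1000; i++)
        x(i) = x(i) + y(i);

- Property: computing x(i) involves no dependence on other iterations.
- Ways to parallelize:
  - The simplest method is loop unrolling.
  - Use the vector form X = X + Y. Vector machines date from the 1960s (Cray, Hitachi, NEC, Fujitsu); today the approach takes the form of vector acceleration units (GPU, DSP).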
A simple loop and its assembly code

    for (i = 1; i <= 1000; i++)
        x(i) = x(i) + s;

    Loop: LD   F0,0(R1)    ;F0=vector element
          ADDD F4,F0,F2    ;add scalar from F2
          SD   0(R1),F4    ;store result
          SUBI R1,R1,8     ;decrement pointer 8B (DW)
          BNEZ R1,Loop     ;branch R1!=zero
          NOP              ;delayed branch slot

Dependences in the FP loop

| Instruction producing result | Instruction using result | Latency in clock cycles |
|---|---|---|
| FP ALU op | Another FP ALU op | 3 |
| FP ALU op | Store double | 2 |
| Load double | FP ALU op | 1 |
| Load double | Store double | 0 |
| Integer op | Integer op | 0 |

Where do stalls have to be inserted? (Assume the branch obtains its target address and condition in the ID stage.)

Stalls in the FP loop

10 clocks. Can the code be reordered to minimize the stalls?

    Loop: LD   F0,0(R1)    ;F0=vector element
          stall
          ADDD F4,F0,F2    ;add scalar in F2
          stall
          stall
          SD   0(R1),F4    ;store result
          SUBI R1,R1,8     ;decrement pointer 8B (DW)
          stall
          BNEZ R1,Loop     ;branch R1!=zero
          stall            ;delayed branch slot

Minimum number of stalls in the FP loop

6 clocks. Swap BNEZ and SD by changing the address of SD. Can unrolling the loop 4 times improve performance further?

    Loop: LD   F0,0(R1)
          SUBI R1,R1,8
          ADDD F4,F0,F2
          stall
          BNEZ R1,Loop     ;delayed branch
          SD   8(R1),F4    ;altered when moved past SUBI
Unrolling the loop 4 times (straightforward way)

Can the loop be rewritten to minimize stalls?

    Loop: LD   F0,0(R1)
          stall
          ADDD F4,F0,F2
          stall
          stall
          SD   0(R1),F4      ;drop SUBI & BNEZ
          LD   F6,-8(R1)
          stall
          ADDD F8,F6,F2
          stall
          stall
          SD   -8(R1),F8     ;drop SUBI & BNEZ
          LD   F10,-16(R1)
          stall
          ADDD F12,F10,F2
          stall
          stall
          SD   -16(R1),F12   ;drop SUBI & BNEZ
          LD   F14,-24(R1)
          stall
          ADDD F16,F14,F2
          stall
          stall
          SD   -24(R1),F16
          SUBI R1,R1,#32     ;alter to 4*8
          stall
          BNEZ R1,LOOP
          NOP

15 + 4 x (1+2) + 1 = 28 clock cycles, or 7 per iteration: 15 instructions, one load stall and two ADDD stalls in each of the 4 copies, and one stall after SUBI. This assumes the iteration count is a multiple of 4, i.e. R1 is a multiple of 32. The name dependences are resolved by giving each copy its own registers.

Loop unrolling with the minimum number of stalls

After code motion:

- SD is moved past SUBI, so its offset has to be adjusted.
- The loads are moved ahead of the SDs, so their offsets have to be checked as well.

    Loop: LD   F0,0(R1)
          LD   F6,-8(R1)
          LD   F10,-16(R1)
          LD   F14,-24(R1)
          ADDD F4,F0,F2
          ADDD F8,F6,F2
          ADDD F12,F10,F2
          ADDD F16,F14,F2
          SD   0(R1),F4
          SD   -8(R1),F8
          SUBI R1,R1,#32
          SD   16(R1),F12
          BNEZ R1,LOOP
          SD   8(R1),F16     ; 8-32 = -24

14 clock cycles, or 3.5 per iteration.
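The cycle counts quoted for the four loop variants (10, 6, 28 and 14) follow from the latency table. The sketch below is an illustration under stated assumptions rather than the course's own tool: it models one pass through the loop body on an in-order, single-issue pipeline, the helper names (`LD`, `ADDD`, `SD`, `cycles`) are hypothetical, and the one-cycle integer-to-branch latency is inferred from the stall shown between SUBI and BNEZ.

```python
# Stall cycles required between a producer and a dependent consumer.
LATENCY = {
    ("fpalu", "fpalu"): 3,
    ("fpalu", "store"): 2,
    ("load", "fpalu"): 1,
    ("load", "store"): 0,
    ("int", "int"): 0,
    ("int", "branch"): 1,  # assumption: inferred from the stall after SUBI in the listings
}


def LD(fd):
    return ("load", fd, ("R1",))


def ADDD(fd, fs, ft):
    return ("fpalu", fd, (fs, ft))


def SD(fs):
    return ("store", None, ("R1", fs))


SUBI = ("int", "R1", ("R1",))
BNEZ = ("branch", None, ("R1",))
NOP = ("nop", None, ())


def cycles(program):
    """Clock cycle in which the last instruction of one loop body issues."""
    last_writer = {}  # register -> (issue cycle, kind of the producing instruction)
    clock = 0
    for kind, dest, srcs in program:
        earliest = clock + 1  # in-order issue, one instruction per cycle
        for reg in srcs:
            if reg in last_writer:
                issued, producer = last_writer[reg]
                earliest = max(earliest, issued + 1 + LATENCY.get((producer, kind), 0))
        clock = earliest
        if dest is not None:
            last_writer[dest] = (clock, kind)
    return clock


REGS = [("F0", "F4"), ("F6", "F8"), ("F10", "F12"), ("F14", "F16")]

plain = [LD("F0"), ADDD("F4", "F0", "F2"), SD("F4"), SUBI, BNEZ, NOP]
scheduled = [LD("F0"), SUBI, ADDD("F4", "F0", "F2"), BNEZ, SD("F4")]
unrolled = [ins for a, b in REGS for ins in (LD(a), ADDD(b, a, "F2"), SD(b))] + [SUBI, BNEZ, NOP]
unrolled_scheduled = (
    [LD(a) for a, _ in REGS]
    + [ADDD(b, a, "F2") for a, b in REGS]
    + [SD("F4"), SD("F8"), SUBI, SD("F12"), BNEZ, SD("F16")]
)

if __name__ == "__main__":
    for name, prog in [("plain", plain), ("scheduled", scheduled),
                       ("unrolled", unrolled), ("unrolled+scheduled", unrolled_scheduled)]:
        print(name, cycles(prog))
```

Store offsets are left out because they do not affect timing. The model places the single stall of the 6-cycle version just before the delay-slot SD rather than before BNEZ, which gives the same total.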
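Summary of the loop unrolling example

- When SD is moved past SUBI and BNEZ, the offset in SD has to be adjusted.
- Loop unrolling is an effective way to reduce stalls for programs whose iterations are independent (loop-level parallelism).
- Different iterations use different registers.
- Instruction scheduling must leave the results of the program unchanged.

Instruction reordering + loop unrolling

| Version | Clock cycles for 1000 iterations |
|---|---|
| No optimization | 10000 |
| Instruction reordering | 6000 |
| 4x loop unrolling | 7000 |
| 4x loop unrolling + instruction reordering | 3500 |

Loop unrolling (1/3)

Example: which data dependences exist in the following code? (A, B and C are distinct, non-overlapping arrays.)

    for (i=1; i<=100; i=i+1) {
        A[i+1] = A[i] + C[i];      /* S1 */
        B[i+1] = B[i] + A[i+1];    /* S2 */
    }

1. S2 uses A[i+1], which S1 computes in the same iteration.
2. S1 uses a value computed by S1 in the previous iteration, and likewise S2 uses a value computed by S2 in the previous iteration.

A dependence of this kind between iterations is called a loop-carried dependence. It means the iterations depend on each other and cannot execute in parallel, unlike the earlier examples, where the iterations were independent.

Loop unrolling (2/3)

Example: A, B, C and D are distinct and non-overlapping.

    for (i=1; i<=100; i=i+1) {
        A[i] = A[i] + B[i];        /* S1 */
        B[i+1] = C[i] + D[i];      /* S2 */
    }

1. Within an iteration S1 and S2 are not dependent on each other, so interchanging them does not affect the correctness of the program.
2. In the first iteration S1 depends on B[1], which is computed before the loop starts; in every later iteration S1 depends on the B[i] that S2 produced in the previous iteration.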
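Because the only loop-carried dependence in this second example runs from S2 to the S1 of the next iteration, the statements can be regrouped so that each iteration first computes B[i+1] and then the A[i+1] that uses it. The Python sketch below checks that such a regrouping produces the same arrays as the original loop. The function names and the random integer test data are assumptions for illustration; index 0 of each list is unused so that the indices match the 1-based pseudocode.

```python
import random


def original(A, B, C, D):
    """The loop as written: S1 then S2, for i = 1..100."""
    A, B = A[:], B[:]
    for i in range(1, 101):
        A[i] = A[i] + B[i]          # S1
        B[i + 1] = C[i] + D[i]      # S2
    return A, B


def regrouped(A, B, C, D):
    """S1 of the first iteration peeled off, S2 of the last iteration moved after the loop."""
    A, B = A[:], B[:]
    A[1] = A[1] + B[1]
    for i in range(1, 100):
        B[i + 1] = C[i] + D[i]
        A[i + 1] = A[i + 1] + B[i + 1]   # depends only on the statement just above
    B[101] = C[100] + D[100]
    return A, B


def random_arrays(seed, n=102):
    """Four integer arrays with indices 0..101 (hypothetical test data)."""
    rng = random.Random(seed)
    return [[rng.randint(-50, 50) for _ in range(n)] for _ in range(4)]


if __name__ == "__main__":
    A, B, C, D = random_arrays(seed=1)
    print(original(A, B, C, D) == regrouped(A, B, C, D))
```

In the regrouped loop no statement reads a value written by a different iteration, so the iterations could in principle run in any order.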
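Loop unrolling (3/3)

    A[1] = A[1] + B[1];
    for (i=1; i<=99; i=i+1) {
        B[i+1] = C[i] + D[i];
        A[i+1] = A[i+1] + B[i+1];
    }
    B[101] = C[100] + D[100];

    for (i=1; i [...] out-of-order completion

- Scoreboard algorithm
- Tomasulo's algorithm

Hardware approach 1: the scoreboard

(Figure: schematic of the basic scoreboard concept.)

The four stages of scoreboard control (1/2)

1. Issue: the instruction is issued after checking for structural hazards.
   If the functional unit the instruction needs is free and no other active instruction has the same destination register (WAW), the scoreboard issues the instruction to the functional unit and updates its internal data. If there is a structural hazard or a WAW hazard, the issue of this instruction stalls, and no later instruction is issued until the hazard clears.
2. Read operands: operands are read once there is no data hazard.
   A source operand is available if no earlier issued, still active instruction is going to write that register, or if the functional unit that was producing it has completed the write. When the operands are available, the scoreboard tells the functional unit to read them and begin execution. The scoreboard resolves RAW hazards dynamically in this step, so instructions may execute out of order.

The four stages of scoreboard control (2/2)

3. Execution (EX): the functional unit executes after receiving its operands.
   When the result is ready, the unit notifies the scoreboard that the instruction has completed execution.
4. Write result (WR): finish execution.
   Once the scoreboard knows the functional unit has completed, it checks for WAR hazards. If there is none, the result is written; if there is one, the instruction stalls.

Example:

    DIVD F0,F2,F4
    ADDD F10,F0,F8
    SUBD F8,F8,F14

The CDC 6600 scoreboard stalls SUBD until ADDD has read its operands, and only then lets SUBD enter the WR stage.

Question: how does the scoreboard relate to the DLX pipeline? (Figure: the stages IS, RO, EX and WR under the control of the scoreboard.)

Structure of the scoreboard

1. Instruction status: records which of the four steps each executing instruction is in.
2. Functional unit status: records the state of each functional unit (FU) in nine fields:
   - Busy: whether the unit is busy
   - Op: the operation the unit is performing
   - Fi: destination register number
   - Fj, Fk: source register numbers
   - Qj, Qk: the functional units producing source operands Fj and Fk
   - Rj, Rk: flags indicating whether Fj and Fk are ready; set to No after the operands have been read
3. Register result status: if some functional unit is going to write a register, this records which unit it is. If no instruction is going to write the register, the field is blank.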
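The nine functional unit status fields and the Write result check can be sketched as a small data structure. This is a minimal Python illustration, not the CDC 6600 implementation: the class name `FUStatus`, the function `can_write_result` and the unit names `Divide`, `Add1` and `Add2` are assumptions, and two adders are assumed so that ADDD and SUBD from the three-instruction example can be active at once.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class FUStatus:
    """One row of the functional unit status table (nine fields)."""
    busy: bool = False
    op: Optional[str] = None
    fi: Optional[str] = None   # destination register
    fj: Optional[str] = None   # first source register
    fk: Optional[str] = None   # second source register
    qj: Optional[str] = None   # unit producing Fj, if any
    qk: Optional[str] = None   # unit producing Fk, if any
    rj: bool = False           # Fj ready and not yet read
    rk: bool = False           # Fk ready and not yet read


def can_write_result(unit: str, units: Dict[str, FUStatus]) -> bool:
    """WAR check: no other busy unit still has to read the register `unit` will write."""
    fi = units[unit].fi
    for name, other in units.items():
        if name == unit or not other.busy:
            continue
        if (other.fj == fi and other.rj) or (other.fk == fi and other.rk):
            return False
    return True


def example_state() -> Dict[str, FUStatus]:
    """DIVD F0,F2,F4 executing; ADDD F10,F0,F8 waiting for F0; SUBD F8,F8,F14 finished executing."""
    return {
        "Divide": FUStatus(True, "DIVD", "F0", "F2", "F4"),
        "Add1": FUStatus(True, "ADDD", "F10", "F0", "F8", qj="Divide", rj=False, rk=True),
        "Add2": FUStatus(True, "SUBD", "F8", "F8", "F14"),
    }


if __name__ == "__main__":
    units = example_state()
    print(can_write_result("Add2", units))   # ADDD has not read F8 yet
    units["Add1"].rj = units["Add1"].rk = False  # ADDD has now read its operands
    print(can_write_result("Add2", units))
```

In the example state SUBD has to wait, because ADDD still holds F8 as a ready but unread operand; once ADDD has read its operands and its flags are set to No, the check passes.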
Scoreboard Example

The functional units are Integer, Mult1, Mult2, Add and Divide. Execution takes 2 cycles for an add, 10 cycles for a multiply and 40 cycles for a divide. LD uses the Integer unit.

Instruction status at the start (all entries empty):

| Instruction | dest | j | k | Issue | Read operands | Execution complete | Write result |
|---|---|---|---|---|---|---|---|
| LD | F6 | 34+ | R2 | | | | |
| LD | F2 | 45+ | R3 | | | | |
| MULTD | F0 | F2 | F4 | | | | |
| SUBD | F8 | F6 | F2 | | | | |
| DIVD | F10 | F0 | F6 | | | | |
| ADDD | F6 | F8 | F2 | | | | |

The functional unit status table has the columns Time, Name, Busy, Op, Fi (dest), Fj (S1), Fk (S2), Qj (FU for j), Qk (FU for k), Rj (Fj ready?) and Rk (Fk ready?). The register result status table has one entry per register F0, F2, ..., F30 naming the functional unit that will write it.

Cycle by cycle:

| Cycle | Events | Scoreboard state and notes |
|---|---|---|
| 1 | LD F6 issues | Integer: Busy, Load, Fi=F6, Fk=R2, Rk=Yes. Register result: F6 = Integer |
| 2 | LD F6 reads operands | Issue the 2nd LD? No: the Integer unit is busy (structural hazard) |
| 3 | LD F6 completes execution | Issue MULTD? No: issue is in order and the 2nd LD is still waiting |
| 4 | LD F6 writes result | Integer unit freed, F6 entry cleared |
| 5 | 2nd LD (F2) issues | Integer: Busy, Load, Fi=F2, Fk=R3, Rk=Yes. Register result: F2 = Integer |
| 6 | LD F2 reads operands; MULTD issues | Mult1: Busy, Mult, Fi=F0, Fj=F2, Fk=F4, Qj=Integer, Rj=No, Rk=Yes. Register result: F0 = Mult1 |
| 7 | LD F2 completes execution; SUBD issues | Add: Busy, Sub, Fi=F8, Fj=F6, Fk=F2, Qk=Integer, Rj=Yes, Rk=No. Register result: F8 = Add. Read multiply operands? No: F2 has not been written yet |
| 8 (first half) | DIVD issues | Divide: Busy, Div, Fi=F10, Fj=F0, Fk=F6, Qj=Mult1, Rj=No, Rk=Yes. Register result: F10 = Divide |
| 8 (second half) | LD F2 writes result | Integer unit freed. MULTD and SUBD now have Rj=Yes, Rk=Yes |
| 9 | MULTD and SUBD read operands | Remaining time: Mult1 10, Add 2. Issue ADDD? No: the Add unit is busy with SUBD (structural hazard) |
| 10 | | Remaining time: Mult1 9, Add 1 |
| 11 | SUBD completes execution | Remaining time: Mult1 8, Add 0 |
| 12 | SUBD writes result | Add unit freed. Mult1 7. Read operands for DIVD? No: F0 has not been produced by MULTD yet |
| 13 | ADDD issues | Add: Busy, Add, Fi=F6, Fj=F8, Fk=F2, Rj=Yes, Rk=Yes. Register result: F6 = Add. Mult1 6 |
| 14 | ADDD reads operands | Remaining time: Mult1 5, Add 2 |
| 15 | | Remaining time: Mult1 4, Add 1 |
| 16 | ADDD completes execution | Remaining time: Mult1 3, Add 0 |
| 17 | ADDD cannot write its result | WAR hazard: DIVD has not yet read F6. Mult1 2 |
| 18 | | Mult1 1 |
| 19 | MULTD completes execution | Mult1 0 |
| 20 | MULTD writes result | Mult1 freed. DIVD now has Rj=Yes, Rk=Yes |
| 21 | DIVD reads operands | The WAR hazard is now gone |
| 22 | ADDD writes result | Add unit freed. Divide: 39 cycles remaining |
| 61 | DIVD completes execution | Divide 0 |
| 62 | DIVD writes result | All functional units idle, register result status empty |

Final instruction status (cycle 62):

| Instruction | dest | j | k | Issue | Read operands | Execution complete | Write result |
|---|---|---|---|---|---|---|---|
| LD | F6 | 34+ | R2 | 1 | 2 | 3 | 4 |
| LD | F2 | 45+ | R3 | 5 | 6 | 7 | 8 |
| MULTD | F0 | F2 | F4 | 6 | 9 | 19 | 20 |
| SUBD | F8 | F6 | F2 | 7 | 9 | 11 | 12 |
| DIVD | F10 | F0 | F6 | 8 | 21 | 61 | 62 |
| ADDD | F6 | F8 | F2 | 13 | 14 | 16 | 22 |
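The timings in this example can be reproduced with a short simulation. The following Python sketch is an illustration under assumptions, not the CDC 6600 logic itself: instead of the Qj/Rj flags it records, for each source operand, which earlier instruction produces it; every decision in a cycle sees only what happened in earlier cycles; and the names `Instr`, `simulate` and `PROGRAM` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional

LATENCY = {"LD": 1, "MULTD": 10, "SUBD": 2, "ADDD": 2, "DIVD": 40}
UNIT = {"LD": "Integer", "MULTD": "Mult", "SUBD": "Add", "ADDD": "Add", "DIVD": "Divide"}
CAPACITY = {"Integer": 1, "Mult": 2, "Add": 1, "Divide": 1}


@dataclass
class Instr:
    op: str
    dest: str
    srcs: tuple
    issue: Optional[int] = None
    read: Optional[int] = None
    done: Optional[int] = None
    write: Optional[int] = None
    producers: dict = field(default_factory=dict)  # source register -> producing Instr or None


def active(ins, t):
    """Issued before cycle t and result not written before cycle t."""
    return ins.issue is not None and ins.issue < t and (ins.write is None or ins.write >= t)


def simulate(program):
    pc, t = 0, 0
    while any(ins.write is None for ins in program):
        t += 1
        for ins in program:
            if ins.issue is None or ins.issue >= t:
                continue
            if ins.read is None:
                # Read operands: every producer must have written in an earlier cycle (RAW).
                if all(p is None or (p.write is not None and p.write < t)
                       for p in ins.producers.values()):
                    ins.read, ins.done = t, t + LATENCY[ins.op]
            elif ins.write is None and ins.done < t:
                # Write result: stall while an earlier instruction still has to read ins.dest (WAR).
                war = any(o is not ins and active(o, t)
                          and (o.read is None or o.read >= t)
                          and ins.dest in o.srcs
                          and o.producers.get(ins.dest) is not ins
                          for o in program)
                if not war:
                    ins.write = t
        if pc < len(program):
            # Issue in order: needs a free unit (structural) and no pending write to dest (WAW).
            nxt = program[pc]
            busy = sum(1 for o in program if active(o, t) and UNIT[o.op] == UNIT[nxt.op])
            waw = any(active(o, t) and o.dest == nxt.dest for o in program)
            if busy < CAPACITY[UNIT[nxt.op]] and not waw:
                nxt.issue = t
                nxt.producers = {
                    s: next((o for o in program if active(o, t) and o.dest == s), None)
                    for s in nxt.srcs
                }
                pc += 1
    return [(i.issue, i.read, i.done, i.write) for i in program]


PROGRAM = [
    ("LD", "F6", ("R2",)), ("LD", "F2", ("R3",)), ("MULTD", "F0", ("F2", "F4")),
    ("SUBD", "F8", ("F6", "F2")), ("DIVD", "F10", ("F0", "F6")), ("ADDD", "F6", ("F8", "F2")),
]

if __name__ == "__main__":
    for ins, times in zip(PROGRAM, simulate([Instr(*p) for p in PROGRAM])):
        print(ins[0], ins[1], times)
```

Each printed tuple is (issue, read operands, execution complete, write result), so the output can be compared directly with the instruction status at cycle 62.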