Beijing University of Technology, College of Computer Science
Computer Architecture Lab Report
07044101 王文通 (Wang Wentong)
2010-12-1

Contents
Experiment 1: Hazards in the Pipeline
    I. Objectives
    II. Principle
    III. Procedure
    IV. Summary
Experiment 2: Loop Unrolling and Instruction Scheduling
    I. Objectives
    II. Principle
    III. Procedure
        Instruction scheduling
        Improving performance with loop unrolling and instruction scheduling
    IV. Summary
    V. Code
Experiment 3: Cache Performance Analysis
    I. Objectives
    II. Principle
    III. Procedure
        Running the programs with the basic configuration
        Effect of cache capacity on cache performance
        Effect of cache associativity on cache performance
        Effect of cache block size on cache performance
        Effect of the replacement algorithm on cache performance
            5.1 Replacement algorithms at different capacities
            5.2 Replacement algorithms at different associativities

Experiment 1: Hazards in the Pipeline

I. Objectives
Become proficient in operating the WinDLX simulator and familiar with the DLX instruction set architecture and its characteristics;
deepen the understanding of basic pipelining concepts;
learn more about the function and basic operation of each stage of the basic DLX pipeline;
deepen the understanding of data hazards and structural hazards, and of how these two kinds of hazard affect CPU performance;
learn how data hazards can be resolved, and master the use of forwarding to reduce the stalls they cause.

II. Principle
Simulate the pipeline with the WinDLX simulator.

1. Run the following three programs in WinDLX:
    fact.s (factorial)
    gcm.s (greatest common divisor)
    prim.s (prime numbers)
Run each program by single-stepping, by continuous execution, and with breakpoints. Observe how the program moves through the pipeline and inspect the contents of the CPU registers and of memory, until the operation of WinDLX is familiar.

2. Run structure_d.s in WinDLX. From the simulation, find the instruction pairs that have a structural (resource) hazard and the unit that causes it; record the number of stall cycles caused by structural hazards and compute them as a percentage of the total execution cycles; discuss how structural hazards affect CPU performance and how they can be resolved.

Structural hazards found (the comment names the contended unit):

    ADDD F0, F0, F4
    ADDD F2, F0, F2     ; adder

    ADDD F2, F0, F2
    ADDI R2, R2, #8     ; adder

    ADDD F2, F0, F2
    ADDI R2, R2, #8     ; MEM
    ADDI R3, R3, #8     ; MEM
    SUB  R5, R4, R2     ; MEM

Structural hazards cause 50 stall cycles in total (5 per loop iteration over 10 iterations).
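Stall cycles make up 50/139 = 35.97% of the total execution cycles.
Structural hazards lower CPU performance and efficiency. The stalls they cause can be reduced by instruction scheduling.

3. With forwarding disabled (clear the check mark in front of Enable Forwarding in the Configuration menu), run data_d.s in WinDLX. Record the stall cycles caused by data hazards and the total number of cycles the program takes, and compute the stall cycles as a percentage of the total execution cycles.

Data hazards, forwarding disabled (WinDLX Statistics window):

    Total:
        202 Cycle(s) executed.
        ID executed by 85 Instruction(s).
        2 Instruction(s) currently in Pipeline.
    Hardware configuration:
        Memory size: 32768 Bytes
        faddEX-Stages: 1, required Cycles: 2
        fmulEX-Stages: 1, required Cycles: 5
        fdivEX-Stages: 1, required Cycles: 19
        Forwarding disabled.
    Stalls:
        RAW stalls: 104 (51.48% of all Cycles)
        WAW stalls: 0 (0.00% of all Cycles)
        Structural stalls: 0 (0.00% of all Cycles)
        Control stalls: 9 (4.46% of all Cycles)
        Trap stalls: 3 (1.48% of all Cycles)
        Total: 116 Stall(s) (57.42% of all Cycles)

Data hazards cause 104 stall cycles and the program takes 202 clock cycles in total, so stall cycles make up 104/202 = 51.48% of the total execution cycles.

4. With forwarding enabled (check Enable Forwarding), run data_d.s in WinDLX again, repeat the measurements of step 3, and compute how many times faster the program runs with forwarding. Data hazards now cause 30 stall cycles and the program takes 128 clock cycles in total.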
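The stall percentages and the forwarding speedup in this experiment are plain ratios of the cycle counts read from the Statistics window. The following is a minimal Python sketch of that arithmetic; the function names `stall_share` and `speedup` are hypothetical helpers introduced here, and the numbers are the cycle counts reported in this experiment (50 of 139 for structure_d.s, 104 of 202 and 30 of 128 for data_d.s).

```python
def stall_share(stall_cycles: int, total_cycles: int) -> float:
    """Return stall cycles as a percentage of all executed cycles."""
    if total_cycles <= 0:
        raise ValueError("total_cycles must be positive")
    return 100.0 * stall_cycles / total_cycles


def speedup(cycles_before: int, cycles_after: int) -> float:
    """Return the ratio of cycle counts (same program, same clock period)."""
    if cycles_after <= 0:
        raise ValueError("cycles_after must be positive")
    return cycles_before / cycles_after


# Cycle counts as reported in this experiment.
structural = stall_share(50, 139)    # structure_d.s, structural stalls
raw_no_fwd = stall_share(104, 202)   # data_d.s, forwarding disabled
raw_fwd = stall_share(30, 128)       # data_d.s, forwarding enabled
gain = speedup(202, 128)             # forwarding disabled vs. enabled

print(f"structural stalls:      {structural:.2f}% of all cycles")
print(f"RAW stalls, no forward: {raw_no_fwd:.2f}% of all cycles")
print(f"RAW stalls, forwarding: {raw_fwd:.2f}% of all cycles")
print(f"speedup from forwarding: {gain:.2f}x")
```

Because both runs execute the same instructions at the same clock rate, the ratio of cycle counts is enough to express the speedup. The last printed digit can differ from the simulator's display, depending on how it rounds.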
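With forwarding enabled, the stall cycles caused by data hazards make up 30/128 = 23.44% of the total execution cycles, and the program runs 202/128 = 1.58 times faster than without forwarding.

Data hazards, forwarding enabled (WinDLX Statistics window):

    Total:
        128 Cycle(s) executed.
        ID executed by 85 Instruction(s).
        2 Instruction(s) currently in Pipeline.
    Hardware configuration:
        Memory size: 32768 Bytes
        faddEX-Stages: 1, required Cycles: 2
        fmulEX-Stages: 1, required Cycles: 5
        fdivEX-Stages: 1, required Cycles: 19
        Forwarding enabled.
    Stalls:
        RAW stalls: 30 (23.44% of all Cycles), thereof:
            LD stalls: 20 (66.67% of RAW stalls)
            Branch/Jump stalls: 10 (33.33% of RAW stalls)
            Floating point stalls: 0 (0.00% of RAW stalls)
        WAW stalls: 0 (0.00% of all Cycles)
        Structural stalls: 0 (0.00% of all Cycles)
        Control stalls: 9 (7.03% of all Cycles)
        Trap stalls: 3 (2.34% of all Cycles)
        Total: 42 Stall(s) (32.81% of all Cycles)

IV. Summary
A pipelined processor runs into data hazards, control hazards, and structural hazards.
Stalls caused by data hazards can be avoided with instruction scheduling and forwarding. Instruction scheduling moves dependent instructions far enough apart that the hazard no longer occurs. Forwarding sends a result directly to the inputs of every functional unit that needs it, so the dependent instruction does not have to stall.
Stalls caused by control hazards can be reduced with loop unrolling, which lowers the number of control hazards.
Stalls caused by structural hazards can be avoided with instruction scheduling, which spaces out the uses of the contended resource so that the structural hazard no longer occurs.

Experiment 2: Loop Unrolling and Instruction Scheduling

I. Objectives
Deepen the understanding of loop-level parallelism, instruction scheduling, loop unrolling, and register renaming;
become familiar with using instruction scheduling to resolve data hazards in the pipeline;
understand how loop unrolling, instruction scheduling, and related techniques improve CPU performance.

II. Principle
Simulate the pipeline with the WinDLX simulator.

1. Instruction scheduling
Write a DLX assembly source file *.s whose program contains both data hazards and structural hazards (assume two adders, two multipliers, and two dividers, each with a latency of 3 clock cycles). Using the Floating point stages option in the Configuration menu, set the number of addition, multiplication, and division units to 2 and their latencies to 3 clock cycles. Then run the program in WinDLX.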
Record how many times each kind of hazard occurs during execution, which instruction combinations cause it, and the total number of clock cycles the program takes. Then apply instruction scheduling to the program to remove the hazards, run the scheduled program in WinDLX, observe how it moves through the pipeline, and record its total number of clock cycles. Compare the performance before and after scheduling and discuss what instruction scheduling means for CPU performance.

Before scheduling: 76 cycles in total, with 36 hazard events by this report's count:
    Structural hazards: 9
    Data hazards: 17
    Control hazards: 7
    Trap stalls: 3

Statistics window before scheduling:

    Total:
        76 Cycle(s) executed.
        ID executed by 40 Instruction(s).
        2 Instruction(s) currently in Pipeline.
    Hardware configuration:
        Memory size: 32768 Bytes
        faddEX-Stages: 2, required Cycles: 3
        fmulEX-Stages: 2, required Cycles: 3
        fdivEX-Stages: 2, required Cycles: 3
        Forwarding enabled.
    Stalls:
        RAW stalls: 17 (22.37% of all Cycles), thereof:
            LD stalls: 9 (52.94% of RAW stalls)
            Branch/Jump stalls: 8 (47.06% of RAW stalls)
            Floating point stalls: 0 (0.00% of RAW stalls)
        WAW stalls: 0 (0.00% of all Cycles)
        Structural stalls: 0 (0.00% of all Cycles)
        Control stalls: 7 (9.21% of all Cycles)
        Trap stalls: 3 (3.95% of all Cycles)
        Total: 27 Stall(s) (35.53% of all Cycles)

Instructions involved before scheduling:

    LW   R1, 0(R2)
    ADD  R1, R1, R3     ; data hazard, 1 occurrence
    ADDI R7, R0, 8      ; structural hazard, 1 occurrence

    LW   R5, 0(R1)
    ADDI R5, R5, #10    ; data hazard, 8 occurrences
    ADDI R2, R2, #4     ; structural hazard, 8 occurrences

    SUB  R7, R7, 1
    BNEZ R7, LOOP       ; data hazard, 8 occurrences

    BNEZ R7, LOOP
    TRAP #0             ; control hazard, 7 occurrences

    TRAP #0             ; trap stall, 3 cycles

After scheduling: 59 cycles in total, with 10 hazard events:
    Control hazards: 7
    Trap stalls: 3

Statistics window after scheduling:

    Total:
        59 Cycle(s) executed.
        ID executed by 48 Instruction(s).
        2 Instruction(s) currently in Pipeline.
    Hardware configuration:
        Memory size: 32768 Bytes
        faddEX-Stages: 1, required Cycles: 2
        fmulEX-Stages: 1, required Cycles: 5
        fdivEX-Stages: 1, required Cycles: 19
        Forwarding enabled.
    Stalls:
        RAW stalls: 0 (0.00% of all Cycles), thereof:
            LD stalls: 0 (0.00% of RAW stalls)
            Branch/Jump stalls: 0 (0.00% of RAW stalls)
            Floating point stalls: 0 (0.00% of RAW stalls)
        WAW stalls: 0 (0.00% of all Cycles)
        Structural stalls: 0 (0.00% of all Cycles)
        Control stalls: 7 (11.86% of all Cycles)
        Trap stalls: 3 (5.08% of all Cycles)
        Total: 10 Stall(s) (16.95% of all Cycles)

Instructions involved after scheduling:

    BNEZ R7, LOOP
    TRAP #0             ; control hazard, 7 occurrences

    TRAP #0             ; trap stall, 3 cycles

The speedup from instruction scheduling is 76/59 = 1.29. Scheduling makes fuller use of the CPU's functional units and removes the stalls caused by data hazards and structural hazards.

2. Improving performance with loop unrolling and instruction scheduling
(1) Write a DLX assembly source file *.s that contains a simple loop whose iteration count is a multiple of 4;
(2) run the program in WinDLX and record how many times each kind of hazard occurs and the total number of clock cycles;
(3) unroll the loop 3 times, replace the original loop body with the resulting 4 copies of the body, and adjust the rest of the program accordingly; then apply register renaming and instruction scheduling to the new loop body;
(4) run the modified program in WinDLX and record how many times each kind of hazard occurs and the total number of clock cycles;
(5) use the recorded results to compare the performance before and after loop unrolling and instruction scheduling.

Loop unrolling only: 58 cycles in total, with 24 hazard events by this report's count:
    Structural hazards: 9
    Data hazards: 11
    Control hazards: 1
    Trap stalls: 3

Statistics window with loop unrolling only:

    Total:
        58 Cycle(s) executed.
        ID executed by 42 Instruction(s).
        2 Instruction(s) currently in Pipeline.
    Hardware configuration:
        Memory size: 32768 Bytes
        faddEX-Stages: 1, required Cycles: 2
        fmulEX-Stages: 1, required Cycles: 5
        fdivEX-Stages: 1, required Cycles: 19
        Forwarding enabled.
    Stalls:
        RAW stalls: 11 (18.96% of all Cycles), thereof:
            LD stalls: 9 (81.82% of RAW stalls)
            Branch/Jump stalls: 2 (18.18% of RAW stalls)
            Floating point stalls: 0 (0.00% of RAW stalls)
        WAW stalls: 0 (0.00% of all Cycles)
        Structural stalls: 0 (0.00% of all Cycles)
        Control stalls: 1 (1.72% of all Cycles)
        Trap stalls: 3 (5.17% of all Cycles)
        Total: 15 Stall(s) (25.86% of all Cycles)

The speedup from loop unrolling alone is 76/58 = 1.31. Unrolling makes fuller use of the CPU's functional units and reduces both the pipeline flushes caused by control hazards and the stalls caused by data hazards.

Instructions involved with loop unrolling only:

    LW   R1, 0(R2)
    ADD  R1, R1, R3     ; data hazard, 1 occurrence
    ADDI R7, R0, 8      ; structural hazard, 1 occurrence

    LW   R5, 0(R1)
    ADDI R5, R5, #10    ; data hazard, 8 occurrences
    ADDI R2, R2, #4     ; structural hazard, 8 occurrences

    SUB  R7, R7, 1
    BNEZ R7, LOOP       ; data hazard, 2 occurrences

    BNEZ R7, LOOP
    TRAP #0             ; control hazard, 1 occurrence

    TRAP #0             ; trap stall, 3 cycles

Loop unrolling plus instruction scheduling: 47 cycles in total, with 4 hazard events:
    Control hazards: 1
    Trap stalls: 3

Statistics window with loop unrolling plus instruction scheduling:

    Total:
        47 Cycle(s) executed.
        ID executed by 42 Instruction(s).
        2 Instruction(s) currently in Pipeline.
    Hardware configuration:
        Memory size: 32768 Bytes
        faddEX-Stages: 1, required Cycles: 2
        fmulEX-Stages: 1, required Cycles: 5
        fdivEX-Stages: 1, required Cycles: 19
        Forwarding enabled.
    Stalls:
        RAW stalls: 0 (0.00% of all Cycles), thereof:
            LD stalls: 0 (0.00% of RAW stalls)
            Branch/Jump stalls: 0 (0.00% of RAW stalls)
            Floating point stalls: 0 (0.00% of RAW stalls)
        WAW stalls: 0 (0.00% of all Cycles)
        Structural stalls: 0 (0.00% of all Cycles)
        Control stalls: 1 (2.13% of all Cycles)
        Trap stalls: 3 (6.38% of all Cycles)
        Total: 4 Stall(s) (8.51% of all Cycles)

The speedup from loop unrolling plus instruction scheduling is 76/47 = 1.62.

Instructions involved with loop unrolling plus instruction scheduling:

    BNEZ R7, LOOP
    TRAP #0             ; control hazard, 1 occurrence

    TRAP #0             ; trap stall, 3 cycles

IV. Summary
Loop unrolling and instruction scheduling both improve CPU performance and reduce stalls, but when the two are applied together their gains do not simply add up, because loop unrolling by itself already removes some of the data hazards tied to the loop test.
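The speedups quoted in this experiment are all ratios against the 76-cycle baseline. The following minimal Python sketch recomputes them from the reported cycle counts and compares the measured combined speedup with the product of the two individual speedups. Using the product as a yardstick for "independent" gains is an assumption of this sketch, not a method taken from the report, and the names `speedups` and `independent_estimate` are hypothetical.

```python
BASELINE_CYCLES = 76  # original loop, as reported

# Total cycles reported for each optimized variant.
variant_cycles = {
    "scheduling": 59,
    "unrolling": 58,
    "unrolling+scheduling": 47,
}


def speedups(baseline: int, cycles: dict[str, int]) -> dict[str, float]:
    """Return baseline/cycles for every variant."""
    return {name: baseline / count for name, count in cycles.items()}


result = speedups(BASELINE_CYCLES, variant_cycles)

# If the two techniques removed disjoint sets of stalls, their speedups
# would roughly multiply; this product is only a rough yardstick.
independent_estimate = result["scheduling"] * result["unrolling"]

for name, value in result.items():
    print(f"{name:22s} {value:.2f}x")
print(f"{'product of the two':22s} {independent_estimate:.2f}x")
```

The measured combined speedup comes out below the product of the individual ones, which fits the observation that the two techniques partly remove the same stalls.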
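How much loop unrolling and instruction scheduling improve CPU performance also depends on the program: when the loop runs many iterations, loop unrolling gives the larger gain; when the program has many data hazards and structural hazards, instruction scheduling gives the larger gain.

V. Code

Original code:

        LHI   R2, (A>>16) & 0XFFFF
        ADDUI R2, R2, A +8
        LHI   R3, (B>>16) & 0XFFFF
        ADDUI R3, R3, B & 0XFFFF
        LW    R1, 0(R2)
        ADD   R1, R1, R3
        ADDI  R7, R0, 8         ; loop counter
    LOOP:
        LW    R5, 0(R1)
        ADDI  R5, R5, #10
        ADDI  R2, R2, #4
        SUB   R7, R7, 1
        BNEZ  R7, LOOP
        TRAP  #0
    A:  .WORD 0, 4, 8, 12, 16, 20, 24, 28, 32, 36
    B:  .WORD 9, 8, 7, 6, 5, 4, 3, 2, 1, 0

Code after instruction scheduling:

        LHI   R2, (A>>16) & 0XFFFF
        ADDUI R2, R2, A +8
        LHI   R3, (B>>16) & 0XFFFF
        ADDUI R3, R3, B & 0XFFFF
        LW    R1, 0(R2)
        ADDI  R7, R0, 8         ; moved up to fill the load delay
        ADD   R1, R1, R3
    LOOP:
        LW    R5, 0(R1)
        ADDI  R2, R2, #4        ; independent work placed between the load and its use
        SUB   R7, R7, 1
        ADDI  R5, R5, #10
        BNEZ  R7, LOOP
        TRAP  #0
    A:  .WORD 0, 4, 8, 12, 16, 20, 24, 28, 32, 36
    B:  .WORD 9, 8, 7, 6, 5, 4, 3, 2, 1, 0

Code after loop unrolling:

        LHI   R2, (A>>16) & 0XFFFF
        ADDUI R2, R2, A +8
        LHI   R3, (B>>16) & 0XFFFF
        ADDUI R3, R3, B & 0XFFFF
        LW    R1, 0(R2)
        ADD   R1, R1, R3
        ADDI  R7, R0, 8
    LOOP:
        LW    R5, 0(R1)         ; copy 1 of the loop body
        ADDI  R5, R5, #10
        ADDI  R2, R2, #4
        SUB   R7, R7, 1
        LW    R5, 0(R1)         ; copy 2
        ADDI  R5, R5, #10
        ADDI  R2, R2, #4
        SUB   R7, R7, 1
        LW    R5, 0(R1)         ; copy 3
        ADDI  R5, R5, #10
        ADDI  R2, R2, #4
        SUB   R7, R7, 1
        LW    R5, 0(R1)         ; copy 4
        ADDI  R5, R5, #10
        ADDI  R2, R2, #4
        SUB   R7, R7, 1
        BNEZ  R7, LOOP
        TRAP  #0
    A:  .WORD 0, 4, 8, 12, 16, 20, 24, 28, 32, 36
    B:  .WORD 9, 8, 7, 6, 5, 4, 3, 2, 1, 0

Code after loop unrolling plus instruction scheduling:

        LHI   R2, (A>>16) & 0XFFFF
        ADDUI R2, R2, A +8
        LHI   R3, (B>>16) & 0XFFFF
        ADDUI R3, R3, B & 0XFFFF
        LW    R1, 0(R2)
        ADDI  R7, R0, 8
        ADD   R1, R1, R3
    LOOP:
        LW    R5, 0(R1)         ; copy 1 of the loop body
        ADDI  R2, R2, #4
        SUB   R7, R7, 1
        ADDI  R5, R5, #10
        LW    R5, 0(R1)         ; copy 2
        ADDI  R2, R2, #4
        SUB   R7, R7, 1
        ADDI  R5, R5, #10
        LW    R5, 0(R1)         ; copy 3
        ADDI  R2, R2, #4
        SUB   R7, R7, 1
        ADDI  R5, R5, #10
        LW    R5, 0(R1)         ; copy 4
        ADDI  R2, R2, #4
        SUB   R7, R7, 1
        ADDI  R5, R5, #10
        BNEZ  R7, LOOP
        TRAP  #0
    A:  .WORD 0, 4, 8, 12, 16, 20, 24, 28, 32, 36
    B:  .WORD 9, 8, 7, 6, 5, 4, 3, 2, 1, 0

Experiment 3: Cache Performance Analysis

I. Objectives
Deepen the understanding of the basic concepts, organization, and operating principles of a cache;
understand how cache capacity, associativity, and block size affect cache performance;
master the methods for lowering the cache miss rate and the performance benefit each one brings;
understand why cache misses occur and the three kinds of cache miss (compulsory, capacity, and conflict);
understand the basic ideas behind LRU and random replacement and how they affect cache performance.

II. Principle
The cache is another key technique in modern microcomputer architecture. It usually sits inside the CPU, however, and is invisible even to assembly-language programmers. To build an intuitive grasp of cache concepts and let students carry out cache performance analysis themselves, a series of targeted simulation experiments is a good teaching method. The SimpleScalar tool set includes two simulators dedicated to caches, sim-cache and sim-cheetah, which make it a suitable platform for these experiments. With these two tools, the computer architecture course added a series of cache performance analysis simulations to help students understand and master cache techniques.

III. Procedure

1. Running the programs with the basic configuration
Default parameters for the data caches: -cache:dl1 dl1:256:32:1:l -cache:dl2 ul2:1024:64:4:l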
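The cache specifications used throughout this experiment can be decoded mechanically. The sketch below assumes the field order `name:nsets:bsize:assoc:repl` that sim-cache uses (number of sets, block size in bytes, associativity, replacement policy) and derives the capacity as nsets × bsize × assoc. The class `CacheSpec` and the function `parse_cache_spec` are hypothetical names introduced for illustration and are not part of SimpleScalar.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheSpec:
    """One cache specification in name:nsets:bsize:assoc:repl form."""
    name: str
    nsets: int   # number of sets
    bsize: int   # block size in bytes
    assoc: int   # associativity (ways per set)
    repl: str    # replacement policy letter, e.g. "l" for LRU

    @property
    def capacity_bytes(self) -> int:
        # Every set holds `assoc` blocks of `bsize` bytes.
        return self.nsets * self.bsize * self.assoc


def parse_cache_spec(spec: str) -> CacheSpec:
    """Split a specification string into its five fields."""
    fields = spec.split(":")
    if len(fields) != 5:
        raise ValueError(f"expected 5 colon-separated fields, got {spec!r}")
    name, nsets, bsize, assoc, repl = fields
    return CacheSpec(name, int(nsets), int(bsize), int(assoc), repl)


for text in ("dl1:256:32:1:l", "ul2:1024:64:4:l", "il1:256:32:1:l"):
    spec = parse_cache_spec(text)
    print(f"{text:18s} {spec.capacity_bytes // 1024} KiB, {spec.assoc}-way")
```

Seen this way, changing the first number varies the capacity through the number of sets, while changing the block size or the associativity with the set count held fixed changes the capacity as well.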
The instruction cache defaults to -cache:il1 il1:256:32:1:l, and all results below are il1 statistics. Each cache specification has the form name:nsets:bsize:assoc:repl, that is, number of sets, block size in bytes, associativity, and replacement policy (l selects LRU), so the capacity is nsets × bsize × assoc bytes.

Results with the default configuration:

    Benchmark                   il1.misses   il1.miss_rate
    test-math (bin.little)      23761        0.1113
    vortex.ss (supplied)        5122         0.1223
    test-fmath (bin.little)     112488       0.0620

For test-fmath the simulator also reports il1.replacements 112232, il1.writebacks 0, il1.invalidations 0, and il1.repl_rate 0.0619.

2. Effect of cache capacity on cache performance

test-printf (bin.little); il1.writebacks and il1.invalidations are 0 where legible:

    Configuration        il1.misses    il1.replacements   il1.miss_rate   il1.repl_rate
    il1:512:32:1:l       60016         59507              0.0331          0.0328
    il1:1024:32:1:l      23335         22408              0.0129          0.0124
    il1:2048:32:1:l      8578          7165               0.0047          0.0040
    il1:16384:32:1:l     (illegible)   (illegible)        0.0008          0.0000

test-math (bin.little):

    Configuration        il1.misses   il1.miss_rate
    il1:512:32:1:l       15565        0.0729
    il1:1024:32:1:l      6614         0.0310
    il1:2048:32:1:l      2712         0.0127
    il1:16384:32:1:l     1636         0.0077

vortex.ss (supplied):

    Configuration        il1.misses   il1.miss_rate
    il1:512:32:1:l       3241         0.0774
    il1:1024:32:1:l      2497         0.0596
    il1:2048:32:1:l      1111         0.0265
    il1:16384:32:1:l     590          0.0141

[Figure: miss rate versus cache capacity for test-math, vortex.ss, and test-printf]

Effect of cache capacity on cache performance: the larger the cache, the lower the miss rate and the higher the hit rate.

3. Effect of cache associativity on cache performance

test-printf (bin.little); il1.writebacks and il1.invalidations are 0 where legible:

    Configuration        il1.hits   il1.misses   il1.replacements   il1.miss_rate   il1.repl_rate
    il1:256:32:1:l       1701257    112488       112232             0.0620          0.0619
    il1:256:32:2:l       1782265    31480        30963              0.0174          0.0171
    il1:256:32:4:l       1809592    4153         (illegible)        0.0023          0.0017
    il1:256:32:8:l       1812230    1515         2                  0.0008          0.0000
    il1:256:32:64:l      1812230    1515         2                  0.0008          0.0000

test-math (bin.little):

    Configuration        il1.misses   il1.miss_rate
    il1:256:32:1:l       23761        0.1113
    il1:256:32:2:l       13479        0.0631
    il1:256:32:4:l       4889         0.0229
    il1:256:32:8:l       1640         0.0077
    il1:256:32:64:l      1636         0.0077

vortex.ss (supplied):

    Configuration        il1.misses   il1.miss_rate
    il1:256:32:1:l       5122         0.1223
    il1:256:32:2:l       2575         0.0615
    il1:256:32:4:l       619          0.0148
    il1:256:32:8:l       590          0.0141
    il1:256:32:64:l      590          (not given)

Effect of associativity on cache performance: the higher the associativity (the more ways), the lower the miss rate and the higher the hit rate. In these runs the improvement levels off at 8 ways, where the miss counts are already the same or nearly the same as with 64 ways.

4. Effect of cache block size on cache performance

test-printf (bin.little); il1.writebacks and il1.invalidations are 0 where legible:

    Configuration (il1, with ul2)                        il1.misses   il1.replacements   il1.miss_rate   il1.repl_rate
    il1:256:64:1:l    -cache:dl2 ul2:1024:256:4:l        41012        40756              0.0226          0.0225
    il1:256:128:1:l   -cache:dl2 ul2:1024:256:4:l        12314        12065              0.0068          0.0067
    il1:256:256:1:l   -cache:dl2 ul2:1024:256:4:l        3731         3510               0.0021          0.0019
    il1:256:2048:1:l  -cache:dl2 ul2:1024:2048:4:l       37           0                  0.0000          0.0000

test-math (bin.little):

    Configuration (il1, with ul2)                        il1.misses   il1.replacements
    il1:256:64:1:l    -cache:dl2 ul2:1024:256:4:l        10531        10275
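The trends measured in this experiment can be reproduced qualitatively with a toy model. The sketch below is a minimal set-associative cache with LRU replacement, written for illustration only: it is not sim-cache, the class `SetAssocCache` and the helper `run` are hypothetical names, and the address trace is synthetic (two blocks that collide in a direct-mapped cache), not one of the benchmarks.

```python
from collections import OrderedDict


class SetAssocCache:
    """Toy set-associative cache with LRU replacement (tags only, no data)."""

    def __init__(self, nsets: int, bsize: int, assoc: int) -> None:
        self.nsets, self.bsize, self.assoc = nsets, bsize, assoc
        # One ordered dict per set; insertion order tracks recency (oldest first).
        self.sets = [OrderedDict() for _ in range(nsets)]
        self.hits = self.misses = self.replacements = 0

    def access(self, addr: int) -> bool:
        """Look up a byte address; return True on a hit."""
        block = addr // self.bsize
        index, tag = block % self.nsets, block // self.nsets
        lines = self.sets[index]
        if tag in lines:
            lines.move_to_end(tag)          # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        if len(lines) >= self.assoc:        # set is full: evict the LRU block
            lines.popitem(last=False)
            self.replacements += 1
        lines[tag] = True
        return False

    @property
    def miss_rate(self) -> float:
        total = self.hits + self.misses
        return self.misses / total if total else 0.0


def run(trace, nsets: int, bsize: int, assoc: int) -> SetAssocCache:
    """Feed every address of the trace to a fresh cache and return it."""
    cache = SetAssocCache(nsets, bsize, assoc)
    for addr in trace:
        cache.access(addr)
    return cache


# Synthetic trace: alternate between two blocks that share a set.
trace = [0, 128] * 10
direct = run(trace, nsets=4, bsize=32, assoc=1)    # 128-byte direct-mapped
two_way = run(trace, nsets=2, bsize=32, assoc=2)   # same capacity, 2-way
print(direct.misses, direct.replacements, direct.miss_rate)
print(two_way.misses, two_way.replacements, two_way.miss_rate)
```

With the same total capacity, the direct-mapped configuration keeps evicting one block to make room for the other, while the 2-way configuration holds both and misses only on first use. This is the conflict-miss effect that the associativity measurements show at larger scale.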