WinDLX Computer Architecture Lab Report Template

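Lab Report

Course: Computer System Architecture
Topic: WinDLX Simulator Experiment
Student name:
Major: Computer Science and Technology
Class: 信管计科00941
Student ID:
Instructor:
Date:

I. Purpose

Become familiar with the basic concepts and use of WinDLX, understand how different kinds of instructions actually flow through the pipeline, analyze the performance of the pipeline, and deepen the understanding of pipelining and RISC processors.

II. Environment

The WinDLX simulator loads DLX assembly language programs and then executes them in single steps, up to breakpoints, or continuously. The CPU registers, the pipeline, I/O and memory can all be displayed graphically. The simulator also provides statistics on pipeline operation. It is very helpful for understanding the characteristics of pipelines and RISC processors. WinDLX requires an IBM PC compatible as its hardware platform; it is a Windows application that runs on Windows 3.0 or later.

III. Procedure and Analysis

Use the WinDLX simulator to analyze fact.s as follows.

(1) Preparation before simulation: double-click the WinDLX icon to start WinDLX. A main window with six icons appears, with the following menu bar:

    File  Window  Execute  Memory  Configuration  Breakpoints  Help

(2) Initialize the simulator: click Reset all in the File menu. A "Reset DLX" dialog pops up; click its confirm button.

(3) Before simulation can start, at least one program must be loaded into main memory. Choose File / Load Code or Data; the window lists all assembly programs in the directory. fact.s computes the factorial of an integer value; input.s contains a subroutine that reads standard input (the keyboard) and stores the value in general-purpose register R1 of the DLX processor. Load fact.s and input.s into main memory as follows:

    Click fact.s
    Click the Select button
    Click input.s
    Click the Select button
    Click the Load button

[Figure: Load Code or Data dialog with the directory list, the file list and the selected files]

A confirmation dialog then pops up with the message "File(s) loaded successfully. Reset DLX?". Click Yes.

1. Observe the effect of adding floating-point units on performance.

Open the Configuration menu and click Floating Point Stages, then select the following standard configuration: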

    Floating Point Stage Configuration
                               Count:   Delay:
    Addition Units:               1        2
    Multiplication Units:         1        5
    Division Units:               1       19

    Number of Units in each Class: 1 <= M <= 8
    Delay (Clock Cycles): 1 <= N <= 50
    Warning: If you change the values, the processor will be reset automatically!

(2) Click Execute / Run, enter 15 and press Enter. A dialog shows the message "Trap #0 occurred", which indicates that the last instruction, trap 0, has been executed. Number 0 is not defined for trap instructions; it is used only to terminate the program. Click OK. Press F5 to view the result in the DLX-Standard-I/O window:

    An integer value >15
    Factorial = 1.30767e+12

Close that window. The Statistics window shows the following results:

    Total:
        180 Cycle(s) executed.
        ID executed by 120 Instruction(s).
        2 Instruction(s) currently in Pipeline.

    Hardware configuration:
        Memory size: 32768 Bytes
        faddEX-Stages: 1, required Cycles: 2
        fmulEX-Stages: 1, required Cycles: 5
        fdivEX-Stages: 1, required Cycles: 19
        Forwarding enabled.

    Stalls:
        RAW stalls: 17 (9.44% of all Cycles), thereof:
            LD stalls: 3 (17.65% of RAW stalls)
            Branch/Jump stalls: 3 (17.65% of RAW stalls)
            Floating point stalls: 11 (64.70% of RAW stalls)
        WAW stalls: 0 (0.00% of all Cycles)
        Structural stalls: 0 (0.00% of all Cycles)
        Control stalls: 20 (11.11% of all Cycles)
        Trap stalls: 12 (6.67% of all Cycles)
        Total: 49 Stall(s) (27.22% of all Cycles)

    Conditional Branches:
        Total: 18 (15.00% of all Instructions), thereof:
            taken: 2 (11.11% of all cond. Branches)
            not taken: 16 (100.00% of all cond. Branches)

    Load-/Store-Instructions:
        Total: 13 (10.83% of all Instructions), thereof:
            Loads: 7 (53.85% of Load-/Store-Instructions)
            Stores: 6 (46.15% of Load-/Store-Instructions)

    Floating point stage instructions:
        Total: 30 (25.00% of all Instructions), thereof:
            Additions: 14 (46.67% of Floating point stage inst.)
            Multiplications: 16 (53.33% of Floating point stage inst.)
            Divisions: 0 (0.00% of Floating point stage inst.)

    Traps:
        Traps: 4 (3.33% of all Instructions)
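As a sanity check on the standard-configuration run, the following minimal Python sketch recomputes the printed factorial and the percentages of the Stalls section from the raw counts reported by WinDLX. The names used here are illustrative assumptions, not WinDLX identifiers, and the cycles-per-instruction figure is derived in the sketch rather than shown by the simulator.

```python
import math

# Raw counts copied from the Statistics window of the standard-configuration run.
CYCLES = 180
INSTRUCTIONS = 120
STALLS = {"RAW": 17, "WAW": 0, "Structural": 0, "Control": 20, "Trap": 12}


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


def stall_report(stalls: dict, cycles: int) -> dict:
    """Map each stall class (plus 'Total') to (count, percent of all cycles)."""
    report = {name: (n, percent(n, cycles)) for name, n in stalls.items()}
    total = sum(stalls.values())
    report["Total"] = (total, percent(total, cycles))
    return report


if __name__ == "__main__":
    # fact.s prints 15! as a floating-point value; "g" keeps 6 significant digits.
    print("Factorial =", format(float(math.factorial(15)), "g"))
    for name, (count, pct) in stall_report(STALLS, CYCLES).items():
        print(f"{name} stalls: {count} ({pct:.2f}% of all cycles)")
    # Derived metric (not part of the WinDLX output): cycles per instruction.
    print("CPI =", CYCLES / INSTRUCTIONS)
```

The recomputed values agree with the window: 49 stalls in total, 27.22% of the 180 cycles, and 1.5 cycles per instruction for the 120 instructions.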

(4) Modify the floating-point configuration a first time, with the parameters below (unit counts and delays as reported under Hardware configuration in the statistics that follow):

    Floating Point Stage Configuration
                               Count:   Delay:
    Addition Units:               7        2
    Multiplication Units:         7        5
    Division Units:               7       19

After changing the values, click OK, then click Execute / Run, enter 15 and press Enter, and click OK in the dialog that pops up. The Statistics window then shows the following results:

    Total:
        180 Cycle(s) executed.
        ID executed by 120 Instruction(s).
        2 Instruction(s) currently in Pipeline.

    Hardware configuration:
        Memory size: 32768 Bytes
        faddEX-Stages: 7, required Cycles: 2
        fmulEX-Stages: 7, required Cycles: 5
        fdivEX-Stages: 7, required Cycles: 19
        Forwarding enabled.

    Stalls:
        RAW stalls: 17 (9.44% of all Cycles), thereof:
            LD stalls: 3 (17.65% of RAW stalls)
            Branch/Jump stalls: 3 (17.65% of RAW stalls)
            Floating point stalls: 11 (64.70% of RAW stalls)
        WAW stalls: 0 (0.00% of all Cycles)
        Structural stalls: 0 (0.00% of all Cycles)
        Control stalls: 20 (11.11% of all Cycles)
        Trap stalls: 12 (6.67% of all Cycles)
        Total: 49 Stall(s) (27.22% of all Cycles)

    Conditional Branches:
        Total: 18 (15.00% of all Instructions), thereof:
            taken: 2 (11.11% of all cond. Branches)
            not taken: 16 (100.00% of all cond. Branches)

    Load-/Store-Instructions:
        Total: 13 (10.83% of all Instructions), thereof:
            Loads: 7 (53.85% of Load-/Store-Instructions)
            Stores: 6 (46.15% of Load-/Store-Instructions)

    Floating point stage instructions:
        Total: 30 (25.00% of all Instructions), thereof:
            Additions: 14 (46.67% of Floating point stage inst.)
            Multiplications: 16 (53.33% of Floating point stage inst.)
            Divisions: 0 (0.00% of Floating point stage inst.)

    Traps:
        Traps: 4 (3.33% of all Instructions)

(5) Modify the floating-point configuration a second time, with the parameters below:

    Floating Point Stage Configuration
                               Count:   Delay:
    Addition Units:               6        2
    Multiplication Units:         7        5
    Division Units:               8       19
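The configuration dialog only accepts 1 <= M <= 8 units per class and 1 <= N <= 50 clock cycles of delay. The short Python sketch below writes the three configurations used in this part as data and checks them against those limits. It is only an illustration: the dictionary layout and the function name `is_valid` are assumptions made for this example, not part of WinDLX.

```python
# Limits quoted from the Floating Point Stage Configuration dialog.
MAX_UNITS = 8    # 1 <= M <= 8
MAX_DELAY = 50   # 1 <= N <= 50

# (count, delay) per unit class for the three configurations tried in this part.
CONFIGS = {
    "standard": {"add": (1, 2), "mul": (1, 5), "div": (1, 19)},
    "first modification": {"add": (7, 2), "mul": (7, 5), "div": (7, 19)},
    "second modification": {"add": (6, 2), "mul": (7, 5), "div": (8, 19)},
}


def is_valid(config: dict) -> bool:
    """True if every unit class respects the dialog's count and delay limits."""
    return all(
        1 <= count <= MAX_UNITS and 1 <= delay <= MAX_DELAY
        for count, delay in config.values()
    )


if __name__ == "__main__":
    for name, config in CONFIGS.items():
        print(f"{name}: {'ok' if is_valid(config) else 'out of range'}")
```

Only the unit counts differ between the three configurations; the delays stay at 2, 5 and 19 cycles throughout.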

Run the program with input 15. The Statistics window then shows the following results:

    Total:
        180 Cycle(s) executed.
        ID executed by 120 Instruction(s).
        2 Instruction(s) currently in Pipeline.

    Hardware configuration:
        Memory size: 32768 Bytes
        faddEX-Stages: 6, required Cycles: 2
        fmulEX-Stages: 7, required Cycles: 5
        fdivEX-Stages: 8, required Cycles: 19
        Forwarding enabled.

    Stalls:
        RAW stalls: 17 (9.44% of all Cycles), thereof:
            LD stalls: 3 (17.65% of RAW stalls)
            Branch/Jump stalls: 3 (17.65% of RAW stalls)
            Floating point stalls: 11 (64.70% of RAW stalls)
        WAW stalls: 0 (0.00% of all Cycles)
        Structural stalls: 0 (0.00% of all Cycles)
        Control stalls: 20 (11.11% of all Cycles)
        Trap stalls: 12 (6.67% of all Cycles)
        Total: 49 Stall(s) (27.22% of all Cycles)

    Conditional Branches:
        Total: 18 (15.00% of all Instructions), thereof:
            taken: 2 (11.11% of all cond. Branches)
            not taken: 16 (100.00% of all cond. Branches)

    Load-/Store-Instructions:
        Total: 13 (10.83% of all Instructions), thereof:
            Loads: 7 (53.85% of Load-/Store-Instructions)
            Stores: 6 (46.15% of Load-/Store-Instructions)

    Floating point stage instructions:
        Total: 30 (25.00% of all Instructions), thereof:
            Additions: 14 (46.67% of Floating point stage inst.)
            Multiplications: 16 (53.33% of Floating point stage inst.)
            Divisions: 0 (0.00% of Floating point stage inst.)

    Traps:
        Traps: 4 (3.33% of all Instructions)

Conclusion: across these changes of configuration, the results of executing the program did not change after floating-point units were added. Adding or removing floating-point units therefore has no effect on efficiency for this program. The reason is that its floating-point instructions do not overlap, so the extra units add no parallelism and performance does not improve.

2. Observe the effect of the forwarding unit on performance.

(1) With forwarding: in the Configuration menu, make sure the Enable Forwarding option is checked. Then click Execute -> Run, enter 15 in the dialog that pops up and press Enter. The Statistics window shows the following results:

    Total:
        180 Cycle(s) executed.
        ID executed by 120 Instruction(s).
        2 Instruction(s) currently in Pipeline.

    Hardware configuration:
        Memory size: 32768 Bytes
        faddEX-Stages: 1, required Cycles: 2
        fmulEX-Stages: 1, required Cycles: 5
        fdivEX-Stages: 1, required Cycles: 19
        Forwarding enabled.

    Stalls:
        RAW stalls: 17 (9.44% of all Cycles), thereof:
            LD stalls: 3 (17.65% of RAW stalls)
            Branch/Jump stalls: 3 (17.65% of RAW stalls)
            Floating point stalls: 11 (64.70% of RAW stalls)
        WAW stalls: 0 (0.00% of all Cycles)
        Structural stalls: 0 (0.00% of all Cycles)
        Control stalls: 20 (11.11% of all Cycles)
        Trap stalls: 12 (6.67% of all Cycles)
        Total: 49 Stall(s) (27.22% of all Cycles)

    Conditional Branches:
        Total: 18 (15.00% of all Instructions), thereof:
            taken: 2 (11.11% of all cond. Branches)
            not taken: 16 (100.00% of all cond. Branches)

    Load-/Store-Instructions:
        Total: 13 (10.83% of all Instructions), thereof:
            Loads: 7 (53.85% of Load-/Store-Instructions)
            Stores: 6 (46.15% of Load-/Store-Instructions)

    Floating point stage instructions:
        Total: 30 (25.00% of all Instructions), thereof:
            Additions: 14 (46.67% of Floating point stage inst.)
            Multiplications: 16 (53.33% of Floating point stage inst.)
            Divisions: 0 (0.00% of Floating point stage inst.)

    Traps:
        Traps: 4 (3.33% of all Instructions)

(2) Without forwarding: click Enable Forwarding in the Configuration menu to disable forwarding (the check mark disappears). Open the Breakpoints icon, click the Breakpoints menu and delete all breakpoints. Then press F5, type 15 and press Enter; the simulation runs through to the end. The Statistics window shows the following results:

    Total:
        201 Cycle(s) executed.
        ID executed by 120 Instruction(s).
        2 Instruction(s) currently in Pipeline.

    Hardware configuration:
        Memory size: 32768 Bytes
        faddEX-Stages: 1, required Cycles: 2
        fmulEX-Stages: 1, required Cycles: 5
        fdivEX-Stages: 1, required Cycles: 19
        Forwarding disabled.

    Stalls:
        RAW stalls: 48 (23.88% of all Cycles)
        WAW stalls: 0 (0.00% of all Cycles)
        Structural stalls: 0 (0.00% of all Cycles)
        Control stalls: 20 (9.95% of all Cycles)
        Trap stalls: 12 (5.97% of all Cycles)
        Total: 80 Stall(s) (39.80% of all Cycles)

    Conditional Branches:
        Total: 18 (15.00% of all Instructions), thereof:
            taken: 2 (11.11% of all cond. Branches)
            not taken: 16 (100.00% of all cond. Branches)

    Load-/Store-Instructions:
        Total: 13 (10.83% of all Instructions), thereof:
            Loads: 7 (53.85% of Load-/Store-Instructions)
            Stores: 6 (46.15% of Load-/Store-Instructions)

    Floating point stage instructions:
        Total: 30 (25.00% of all Instructions), thereof:
            Additions: 14 (46.67% of Floating point stage inst.)
            Multiplications: 16 (53.33% of Floating point stage inst.)
            Divisions: 0 (0.00% of Floating point stage inst.)

    Traps:
        Traps: 4 (3.33% of all Instructions)

Conclusion: first look at the figures in the Statistics window. With forwarding enabled, note the total number of cycles (180) and the number of stalls (17 RAW, 20 control, 12 trap; 49 in total). With forwarding disabled, look at the Statistics window again: the control stalls and trap stalls still ...
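The two forwarding runs can be compared directly from the counts in their Statistics windows. The Python sketch below is a rough illustration with names chosen for this example (`RunStats`, `speedup`); the speedup and cycles-per-instruction figures are derived here and are not reported by WinDLX.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunStats:
    """Counts copied from one WinDLX Statistics window (field names are illustrative)."""
    cycles: int
    instructions: int
    raw_stalls: int
    control_stalls: int
    trap_stalls: int

    @property
    def total_stalls(self) -> int:
        # WAW and structural stalls are 0 in both runs, so they are omitted here.
        return self.raw_stalls + self.control_stalls + self.trap_stalls

    @property
    def cpi(self) -> float:
        """Cycles per instruction for this run."""
        return self.cycles / self.instructions


WITH_FORWARDING = RunStats(180, 120, raw_stalls=17, control_stalls=20, trap_stalls=12)
WITHOUT_FORWARDING = RunStats(201, 120, raw_stalls=48, control_stalls=20, trap_stalls=12)


def speedup(slow: RunStats, fast: RunStats) -> float:
    """Speedup of `fast` over `slow` for the same instruction count."""
    return slow.cycles / fast.cycles


if __name__ == "__main__":
    saved_cycles = WITHOUT_FORWARDING.cycles - WITH_FORWARDING.cycles
    saved_raw = WITHOUT_FORWARDING.raw_stalls - WITH_FORWARDING.raw_stalls
    print(f"cycles saved by forwarding: {saved_cycles}")
    print(f"RAW stalls saved by forwarding: {saved_raw}")
    print(f"CPI: {WITHOUT_FORWARDING.cpi:.3f} -> {WITH_FORWARDING.cpi:.3f}")
    print(f"speedup: {speedup(WITHOUT_FORWARDING, WITH_FORWARDING):.3f}")
```

In these two runs the whole difference in total stalls (80 versus 49) comes from the RAW stalls (48 versus 17); the control and trap stall counts are the same in both Statistics windows.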
