Shanghai Jiao Tong University, Computer Architecture Courseware 1

Advanced Computer Architecture

Spring, 2014

Lecture 1

ZHU Yongxin, Winson

zhuyongxin@

Introduction
  Public email to store presentations, exercises and answers: ca.2014.course@
  Password: 2014sjtu
  How to submit your homework?
  Email to: ca.2014.assignment@

Text Book
  John L. Hennessy and David A. Patterson, Computer Architecture: A Quantitative Approach (ISBN: 9787111364580), Elsevier Science Pte Ltd / China Machine Press reprint edition, 5th edition, 2012
Optional texts:
  John L. Hennessy and David A. Patterson, Computer Organization and Design: The Hardware/Software Interface (ISBN: 7111193393), China Machine Press reprint edition, 2006
  HU Yueming (胡越明), Computer System Architecture (计算机系统结构) (ISBN: 978-7-81124-238-6), Beihang University Press (北京航空航天大学出版社)

Introduction
  Instructor: ZHU Yongxin, Winson (祝永新)
  Phone: 34204546 ext. 1037
  Email: zhuyongxin@
  TA: LI Fen (李芬), lifen_sjtu@163.com, Lab 208
  Location: Room 308, Engineering Building (工程馆)
  Date & Time: 13:30–18:30, Sunday (odd weeks)

Naming conventions for homework
  Naming conventions:
    studentno_name_lab_no.pdf
    studentno_name_homework_no.pdf
  (e.g. 1062110xxx_zhangsan_lab_1.pdf / 1062110xxx_zhangsan_homework_1_15.pdf)

Alan Turing's Bombe to crack the German Army's Enigma messages

Background
  What is advanced processor architecture?

Background
  Multi-core architects are doubling the number of cores every 18 months
  IBM Cell processor (8 cores). Source: IBM
  AMD processor (4 cores). Source: AMD Corp.

Growth of Intel processor cores
  Source: Impress Watch

To diversify: desktop CPUs go fusing
  From CMP to Fusion (CPU + GPU): Propus (left) and Llano (right). Source: AMD

To diversify: mobile processors turn out to be multi-core
  Source: ARM

To diversify: Nvidia graphics processors go generic
  Source: Nvidia

How about supercomputers?
  42nd List: China is in the TOP10, Nov 2013

  Rank | Name | Computer | Site | Manufacturer | Country | Year
  1 | Tianhe-2 (MilkyWay-2) | TH-IVB-FEP Cluster, Intel Xeon E5-2692 12C 2.200GHz, TH Express-2, Intel Xeon Phi 31S1P | National Super Computer Center in Guangzhou | NUDT | China | 2013
  2 | Titan | Cray XK7, Opteron 6274 16C 2.200GHz, Cray Gemini interconnect, NVIDIA K20x | DOE/SC/Oak Ridge National Laboratory | Cray Inc. | United States | 2012
  3 | Sequoia | BlueGene/Q, Power BQC 16C 1.60GHz, Custom | DOE/NNSA/LLNL | IBM | United States | 2011
  4 | K computer | SPARC64 VIIIfx 2.0GHz, Tofu interconnect | RIKEN Advanced Institute for Computational Science (AICS) | Fujitsu | Japan | 2011
  5 | Mira | BlueGene/Q, Power BQC 16C 1.60GHz, Custom | DOE/SC/Argonne National Laboratory | IBM | United States | 2012
  6 | Piz Daint | Cray XC30, Xeon E5-2670 8C 2.600GHz, Aries interconnect, NVIDIA K20x | Swiss National Supercomputing Centre (CSCS) | Cray Inc. | Switzerland | 2012
  7 | Stampede | PowerEdge C8220, Xeon E5-2680 8C 2.700GHz, Infiniband FDR, Intel Xeon Phi SE10P | Texas Advanced Computing Center/Univ. of Texas | Dell | United States | 2012
  8 | JUQUEEN | BlueGene/Q, Power BQC 16C 1.600GHz, Custom Interconnect | Forschungszentrum Juelich (FZJ) | IBM | Germany | 2012
  9 | Vulcan | BlueGene/Q, Power BQC 16C 1.600GHz, Custom Interconnect | DOE/NNSA/LLNL | IBM | United States | 2012
  10 | SuperMUC | iDataPlex DX360M4, Xeon E5-2680 8C 2.70GHz, Infiniband FDR | Leibniz Rechenzentrum | IBM | Germany | 2012
  11 | TSUBAME 2.5 | Cluster Platform SL390s G7, Xeon X5670 6C 2.930GHz, Infiniband QDR, NVIDIA K20x | GSIC Center, Tokyo Institute of Technology | NEC/HP | Japan | 2013
  12 | Tianhe-1A | NUDT YH MPP, Xeon X5670 6C 2.93GHz, NVIDIA 2050 | National Supercomputing Center in Tianjin | NUDT | China | 2010

Tianhe-2

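How about new architecture then?
  Source: Mark D. Hill, Michael R. Marty, "Amdahl's Law in the Multicore Era", IEEE Computer, 2008
  Chinese translation: ZHU Yongxin (祝永新), Communications of the China Computer Federation (中国计算机学会通讯), 2009, Issue 7
  The new trend is high-efficiency, reconfigurable and low-power computing.
  Is the existing sprawling cluster model appropriate? Is it appropriate to build computers with higher computing capability simply by connecting more processors?
  Amdahl's Law in the field of heterogeneous reconfigurable computing, "New Computing Structures" (《新型计算结构》)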
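
The question of whether simply adding processors pays off can be explored numerically. The sketch below is an illustration, not part of the original slides: it evaluates classic Amdahl's Law and the symmetric multicore model from the Hill and Marty article cited above, assuming (as that article does) that a core built from r base-core equivalents (BCEs) performs like sqrt(r). The function names and the sample values f = 0.9 and n = 256 are chosen here purely for illustration.

```python
import math


def amdahl_speedup(f: float, s: float) -> float:
    """Classic Amdahl's Law: a fraction f of the work is sped up by a factor s."""
    return 1.0 / ((1.0 - f) + f / s)


def symmetric_multicore_speedup(f: float, n: int, r: int) -> float:
    """Hill-Marty symmetric model: a chip of n BCEs split into n/r cores of r BCEs each.

    Assumes perf(r) = sqrt(r) for the sequential performance of one core.
    """
    perf = math.sqrt(r)
    serial_time = (1.0 - f) / perf          # one core runs the serial part
    parallel_time = f * r / (perf * n)      # n/r cores share the parallel part
    return 1.0 / (serial_time + parallel_time)


# Illustrative values (assumptions): 90% parallel code on a 256-BCE chip.
for r in (1, 4, 16, 64, 256):
    speedup = symmetric_multicore_speedup(0.9, 256, r)
    print(f"r = {r:3d} BCEs per core -> speedup {speedup:.1f}")
```

With r = 1 the symmetric model reduces to classic Amdahl's Law with s = n, so the two functions can be checked against each other.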

2010.9, ZHU Yongxin (祝永新)

What we did?
  China's Top Ten Science and Technology Progress News, 2013
  CPU | HW logic | CPU + HW accel. | HW logic + embedded CPU
  Generic computing: fixed structure with changeable algorithms for all applications
  Application-specific: fixed structure, fixed algorithm, high efficiency for fixed problems
  Reconfiguring components + recombining network + rebuilding system
  => Better efficiency for APPLICATIONS
  Rebuild, Reconfig., Recombine
  CPU | HW logic | CPU + HW accel. | HW logic + embedded CPU
  Reconfigurable cloud computing architecture would be the key

Outline
  Course briefing
  Fundamentals of computer architecture

    history
    performance evaluation
    cost analysis
  Digging deeper
  Summary

What is the Job of a Computer Architect?
  Usually chief engineer
  He/she defines the interface between software and hardware
  He/she defines how hardware should execute software
  He/she specifies how software should utilize hardware

What is the Job of a Computer Architect?
  Also embedded systems chief engineer
  A processor chip is far away from the complete system
  Apply computer architecture knowledge
  Electronic system level design

Plenty of opportunities
  There have been opportunities in:
  Chip designers: AMD, ARM, Huawei, Intel, Imaging, Marvell, Mediatek
  EDA tools: Synopsys, Cadence
  Application software: Google, IBM, Microsoft
  FPGA/ASIC based microsecond trading systems

Why Do You Take the Course?
  All system design and IC design students need the know-how of computer architecture
  Many students will actually design embedded systems or SoCs after graduation
  Some students will go beyond digital systems

John Hennessy
  Stanford since 1977, now president
  1983 to 1993: director of the Computer Systems Laboratory
  1981: RISC
  1984: co-founded MIPS
  Bachelor's degree in electrical engineering from Villanova University
  Master's and doctoral degrees in computer science from the State University of New York at Stony Brook

David Patterson
  University of California at Berkeley since 1977
  Design and implementation of RISC I, the first VLSI RISC computer and the foundation of the SPARC architecture used by Sun
  Leader of the Redundant Arrays of Inexpensive Disks (RAID) project
  Chair of the EECS department at Berkeley

David Patterson
  He stepped down as ACM president, and holds the EECS department chair, but…

Source: /~pattrsn/

They seem to be close to us
  Scopus h-Graph Index:
    David A. Patterson: 17
    John L. Hennessy: 6
    ZHU Yongxin: 9
  Actually we still have a long way to go

Lecture Outlook
  Lectures (will most certainly be modified)
    Fundamentals of Computer Architecture (Ch. 1): 6 lectures
    Cache and memory systems (Ch. 2): 4 lectures
    Pipelining & ILP & scheduling (Ch. 3): 8 lectures
    GPU and vector machines (Ch. 4)
    Multiprocessors and TLP (Ch. 5): 6 lectures
    Computer warehouse system and reliability (Ch. 6): 6
  Seminars
    Students' presentations?

Homework & Grading Policy
  Homework
    There will be around 3 sets of homework
    There will be around 2-3 sets of lab reports
    Homework will be due 2-3 days before the beginning of the following class
    Late homework will have a 10% deduction for each late day
    It is your responsibility to locate the TA to hand in late homework
    No credit for late homework handed in after grading
  Grading
    30-50% final written exam in June 2013
    25% homework
    25-45% projects
    Others

Projects
  Exercise on pipeline operations using MIPS64 and Verilog/Modelsim
  Exercise on a whole processor using SimpleScalar
  Exercise on a multi-core processor using Multi2Sim
  Exercise on GPU using OpenCL

Course Prerequisite
  Digital circuits design
  Computer organization
  Computer architecture
  We will brief the fundamental concepts to refresh your memory
  You are encouraged to pick up more details of the prerequisites on your own if you have not learned them before

Outline
  Course briefing
  Fundamentals of computer architecture

    history
    performance evaluation
    cost analysis
  Digging deeper
  Summary

Fundamentals of Computer Architecture

  Universal Turing Machine in 1936
  Von Neumann Architecture in the 1940s
  Evolution of Computing

Evolution of Computing
  It started with ENIAC, built for the US Army in 1946
    Electronic Numerical Integrator and Computer
    Consisting of 18,000 vacuum tubes
    Weighing 30 tons
  Later we experienced
    Business calculation on mainframes
    Mathematical analysis on vector machines
    Embedded computing in embedded systems
    Pervasive computing everywhere

Computer Architecture – Changing Definition
  1950s to 1960s: Computer Architecture Course: computer arithmetic
  1970s to mid 1980s: Computer Architecture Course: instruction set design, especially ISA appropriate for compilers
  1990s: Computer Architecture Course: design of CPU, memory system, I/O system, multiprocessors, networks
  2000s: Computer Architecture Course: multi-core architecture, power-aware architecture, energy-aware architecture, non-von Neumann architecture, dynamic reconfigurable

What is Computer Architecture
  Technology: CMOS, Bipolar, …
  Applications: Database, Multimedia, …
  System Organization
  Interfaces: ISA, API, …
  Measurement & Evaluation: Benchmarking, modeling, …

The Defining Forces in Computer Architecture
  Applications
  Technology
  Software/Compatibility
  Computer Architecture

Performance of Supercomputers

What did Dr. Moore predict?
  In 1965, Gordon Moore sketched out his prediction of the pace of silicon technology.
  Source: Intel Corp.

Technology
  Continuous improvements & quick changes
  Integrated circuit logic
    Transistor density increases by ~50% per year
    Die size increases ~10%-25% per year
    60% more devices per year
  Semiconductor DRAM -> Phase Change Memory
    Density increases by ~60%
    Cycle time decreases more slowly: ~one third every 10 years
  Magnetic disks -> SSD
    Improve by ~50% per year
  Network technology
    E.g., Ethernet: 10Mb, 100Mb, 1Gb, 10Gb, 40Gb
  Circuit boards
    Increase by ~5% in wire density

Technology Trends
  Processor
    logic capacity: about 30% per year
    clock rate: about 20% per year
  Memory
    DRAM capacity: about 60% per year (4x every 3 years)
    Memory speed: about 10% per year
    Cost per bit: improves about 25% per year
  Disk
    capacity: about 60% per year
    Total use of data: 100% per 9 months!
  Network bandwidth
    Bandwidth increasing more than 100% per year!

Computers Defined by Watts not MIPS
  H2000: 1 GOPS, 10 MB DRAM, 100 MB Flash
  H2010: 100 GOPS, 1 GB DRAM, 10 GB Flash
  (Electricity is 25% of running costs)
  Power scale: under 1 W, 100 W, 10 kW, 1 MW
  Labels: E21, Machine Room, Data Center, Wireless, Building Net, Internet, H21, Desktop

More Technical Issues Involved in Supercomputers
  Memory wall
  Power wall
  Reliability
  Programming wall
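
The yearly rates listed under Technology Trends compound quickly, which is easiest to see with a small calculation. The sketch below is only an illustration added here: it checks the "60% per year is about 4x every 3 years" figure for DRAM capacity and converts the two doubling periods mentioned in the slides (data use doubling every 9 months, core counts doubling every 18 months) into equivalent yearly rates. The helper names are hypothetical.

```python
import math


def growth_factor(annual_rate: float, years: float) -> float:
    """Total multiplication after compounding annual_rate for the given number of years."""
    return (1.0 + annual_rate) ** years


def annual_rate_from_doubling(months_to_double: float) -> float:
    """Equivalent yearly growth rate for a quantity that doubles every months_to_double months."""
    return 2.0 ** (12.0 / months_to_double) - 1.0


# DRAM capacity: about 60% per year, compounded over 3 years (1.6 ** 3, roughly 4x).
dram_3yr = growth_factor(0.60, 3)

# Total use of data: 100% per 9 months, expressed as a yearly rate.
data_per_year = annual_rate_from_doubling(9)

# Core count doubling every 18 months, expressed as a yearly rate.
cores_per_year = annual_rate_from_doubling(18)

print(f"DRAM capacity after 3 years: {dram_3yr:.2f}x")
print(f"Data use growth per year: {data_per_year:.0%}")
print(f"Core count growth per year: {cores_per_year:.0%}")
```

Doubling every 18 months works out to a little under 60% per year, which is in line with the "60% more devices per year" figure quoted for integrated circuits.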

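The Defining Forces in Computer Architecture
  Applications
  Technology
  Software/Compatibility
  Computer Architecture

Current Computing Applications & Infrastructure
  Real-time real-world data processing: video, audio, sensor data, wireless
  Human-machine interfaces: speech recognition, gesture recognition, language understanding
  Global-scale servers: non-stop service, secure data storage
  Networking: intelligent routers
  Clients/Edge, Servers/Core

Next Generation Applications

High Performance Applications in US
  Nonlinear interactions among the atmospheric system, the weather system and the Earth's climate
  Dynamics of the Earth's coupled carbon, nitrogen and water cycles
  Studying the Earth's interior structure through high-resolution, broadband, global seismic sensing
  Decade-scale hydrological dynamics of large river basins
  Coupled dynamics of ocean and land, and of ocean and atmosphere
  Reaction mechanisms involving large biomolecules and biomolecular assemblies, such as enzymes, ribosomes and cell membranes
  Predicting the three-dimensional structure of a protein given only its primary amino acid sequence
  Understanding the assembly of viral capsids
  First-principles simulation of the bulk properties of matter under extreme conditions
  Specific and efficient first-principles design of catalysts, drugs and other molecular materials
  Materials design

New Application in Europe
  Life sciences
    Systems biology: within the next 4 years, Europe will realize the world's first "in silico" cell
    Chromosome dynamics
    Large-scale protein dynamics
    Protein association and aggregation
    Supramolecular systems
    Medicine: for example, simulations that identify the triggers of polygenic diseases, or predict side effects related to abnormal drug metabolism in certain populations, or interactions of drugs with macromolecules other than their original targets
  Engineering
    Complete simulation of a helicopter
    Biomedical fluid dynamics
    Gas turbines and internal combustion engines
    Forest fires
    Green aircraft
    Virtual power plants

High Performance Computing in Japan

Cost of Processor
  Non-Recurring Engineering (NRE) costs are increasing rapidly for new processor designs
  More than $1M for masks to spin a new design
  Engineers cost ~$200K/year (salary + benefits + overhead)
  Pentium Pro design verification took around 350 engineer-years, or ~$70M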
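
The arithmetic behind these NRE figures can be written out as a short sketch. It uses only the numbers quoted on the slide (350 engineer-years at about $200K each); the function names and the sample sales volumes are assumptions made here to show how the fixed cost is spread over the units sold.

```python
def engineering_cost(engineer_years: float, cost_per_engineer_year: float = 200_000) -> float:
    """Engineering cost in dollars for a given number of engineer-years."""
    return engineer_years * cost_per_engineer_year


def nre_per_unit(nre_total: float, units_sold: int) -> float:
    """Share of the non-recurring cost carried by each unit sold."""
    return nre_total / units_sold


# Figure from the slide: about 350 engineer-years of design verification.
verification_cost = engineering_cost(350)
print(f"Verification cost: ${verification_cost:,.0f}")

# Sample sales volumes (illustrative assumptions) to show the amortization effect.
for units in (100_000, 1_000_000, 10_000_000):
    share = nre_per_unit(verification_cost, units)
    print(f"{units:>10,} units -> ${share:,.2f} of verification cost per unit")
```

The per-unit share falls in direct proportion to volume, which is why high-volume parts can absorb design costs that low-volume parts cannot.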

=> Tremendous economies of scale
   (Can't sell fewer than 1,000,000 parts for less than $100 each)

Cost of Processor
  Design cost (Non-Recurring Engineering costs, NRE)
    dominated by engineer-years (~$200K per engineer-year)
    also mask costs (approaching $1M per spin)
  Cost of die
    die area
    die yield (maturity of manufacturing process, redundancy features)
    cost/size of wafers
    die cost with no redundancy

Digging Deeper
  Performance evolution
  Cost estimation

Performance Evolution
  $1K today buys a gizmo better than $1M could buy in 1965.
  1970s
    Mainframes dominated; performance improved 25-30%/yr
    Mostly due to improved architecture + some technology aids
  1980s
    VLSI + microprocessor became the foundation
    Technology improves at 35%/yr
    Machine language death = opportunity
      Mostly with UNIX and C in the mid-80s
      Even most system programmers gave up assembly language
      With this came the need for efficient compilers

Performance Evolution (Cont.)

1980s (Cont.)
  Compiler focus brought on the great CISC vs. RISC debate
  With the exception of Intel, RISC won the argument
  RISC performance improved by 50%/year initially
  Of course RISC is not as simple anymore and the compiler is a key part of the game
  It does not matter how fast your computer is if the compiler wastes most of it due to the inability to generate efficient code
  With the exploitation of instruction-level parallelism (pipeline + super-scalar) and the use of caches, performance is further enhanced
  CISC: Complex Instruction Set Computing
  RISC: Relegate Important Stuff to the Compiler (Reduced Instruction Set Computing)

Cost of Processor
  Cost of packaging
    number of pins (signal + power/ground pins)
    power dissipation
  Cost of testing
    built-in test features?
    logical complexity of design
    choice of circuits (minimum clock rates, leakage currents, I/O drivers)
  Architect affects all of these

A Few Examples
  Digital Alpha (v1, v3), 1992-97, RIP soon
  HP PA-RISC (v1.1, v2.0), 1986-96, RIP soon/dead 2005
  Sun SPARC (v8, v9), 1987-95
  SGI MIPS (MIPS I, II, III, IV, V), 1986-96
  IA-16/32 (8086, 286, 386, 486, Pentium, MMX, SSE, …), 1978-1999
  IA-64 (Itanium), 1996-now
  AMD64/EM64T, 2002-now
  IBM POWER (PowerPC, …), 1990-now
  ARM, now
  Many dead processor architectures live on in microcontrollers

Chip Cost Estimation
  [Figure labels: wafer counts, die, wafer]

Chip Cost Estimation
  IC cost
  Die cost

Chip Cost Estimation (cont'd)
  Dies per wafer
    diagonal of a square die
    circumference of the wafer
    incomplete squares around the boundary
  Thus dies per wafer ~ $\mathrm{Die\_area}^{-1}$

Chip Cost Estimation (cont'd)
  Die yield
    α corresponds to the number of masking levels; with an estimate of α = 3, die yield ~ $\mathrm{Die\_area}^{-3}$
  In short, for α = 3, cost of die = f($\mathrm{Die\_area}^{4}$)

Real World Examples
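
As a hedged sketch of the chip cost estimation outlined above (not the real-world data), the code below uses the die cost model as given in earlier editions of the Hennessy and Patterson textbook, in the form with the α parameter that matches the α = 3 on the slide. The wafer cost, wafer diameter and defect density are made-up illustrative values, and the function names are chosen here.

```python
import math


def dies_per_wafer(wafer_diameter_cm: float, die_area_cm2: float) -> float:
    """Wafer area divided by die area, minus the incomplete dies lost along the circumference."""
    wafer_area = math.pi * (wafer_diameter_cm / 2.0) ** 2
    edge_loss = math.pi * wafer_diameter_cm / math.sqrt(2.0 * die_area_cm2)
    return wafer_area / die_area_cm2 - edge_loss


def die_yield(die_area_cm2: float, defects_per_cm2: float,
              alpha: float = 3.0, wafer_yield: float = 1.0) -> float:
    """Fraction of good dies: wafer_yield * (1 + defects * area / alpha) ** (-alpha)."""
    return wafer_yield * (1.0 + defects_per_cm2 * die_area_cm2 / alpha) ** (-alpha)


def die_cost(wafer_cost: float, wafer_diameter_cm: float, die_area_cm2: float,
             defects_per_cm2: float, alpha: float = 3.0) -> float:
    """Wafer cost divided by the number of good dies per wafer."""
    good_dies = (dies_per_wafer(wafer_diameter_cm, die_area_cm2)
                 * die_yield(die_area_cm2, defects_per_cm2, alpha))
    return wafer_cost / good_dies


# Illustrative assumptions: $5000 wafer, 30 cm diameter, 0.6 defects per cm^2.
for area in (0.5, 1.0, 2.0, 4.0):
    cost = die_cost(5000.0, 30.0, area, 0.6)
    print(f"die area {area:.1f} cm^2 -> estimated die cost ${cost:.2f}")
```

Die cost grows much faster than die area: quadrupling the area more than quadruples the cost, and the growth approaches the fourth-power behaviour on the slide only when the defect term dominates the yield.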

Outline
  Course briefing
  Fundamentals of computer architecture
    history
    performance evaluation
    cost analysis
  Digging deeper
  Summary

Digging Deeper
  Computer organization from a computer architect's point of view

An Architect's View
  How are computer systems organized? And why are they organized that way?
  System Level Architecture
  Micro-architecture
  Instruction Set Architecture

System Level Architecture
  Board/overall system
    Processors, memory, chipset, bridges, …
    Electrical/mechanical rules, wiring, connectors, layout, application constraints, …
    IC pin count, packaging type, chip physical area, …
  Designs driven by
    Cost
    Performance
    Packaging/style

An Old and Still Valid View
  Architects have been viewing a computer as a model containing 5 components since 1946

An Old System Architecture Sample
  TI SuperSPARC TMS390Z50 in Sun SPARCstation 20
