Topsummit: The Road to Virtualizing Big Data (Zhang Junchi, VMware)

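The Road to Virtualizing Big Data
Speaker: Zhang Junchi, VMware

Virtualization? Reality.
[Chart: percentage of virtual machine deployments, 2005 to 2015, on a 0% to 80% axis; annotation: "surpassed in 2010". Source: Gartner, "Magic Quadrant for x86 Server Virtualization Infrastructure" by Thomas J. Bittman, George J. Weiss, Mark A. Margevicius, Philip Dawson, June 11, 2012]

VMware
Empower people and organizations by radically simplifying IT through virtualization software.
- The clear leader in virtualization
- More than 500,000 customers
- More than 55,000 partners
- About 13,000 employees

What is virtualization?
Definition: x86 hardware was originally designed so that a single machine runs one operating system and one application, which leaves most such machines underutilized. Virtualization lets multiple virtual machines run on the same physical computer, each sharing the physical machine's resources. Virtual machines support most types of operating systems and a wide variety of applications, all ultimately running on the same physical computer.

[Figure, shown both as a static diagram and as an animation: traditional architecture versus virtualized architecture. Exchange, SAP ERP, File/Print and Oracle CRM each run on their own operating system on top of a virtualization layer. The virtualized infrastructure pools network switching, CPU, memory and storage.]

How delivery has changed
Storage, compute, networking, security and management used to be delivered in weeks or days. Now they are delivered in minutes or seconds.

Why virtualize big data?
More and more devices! More and more applications! More and more social activity!
Data can create enormous value, but retaining and processing it has a cost.

The big data era
By 2020, unstructured data growth is 10 times that of structured data (Source: Gartner).
Spending 10 times as much on hardware to keep up is unsustainable. A different approach is needed.

Virtualizing big data
Run big data workloads on, or migrate them to, a virtualized infrastructure and inherit the benefits of virtualization.
[Figure: MPP DB, Hadoop and HBase running on a virtualization platform]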
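The deck goes on to summarize cluster consolidation as Σ(Max) versus Max(Σ). One reading of that notation: dedicated clusters must each be sized for their own peak, while clusters sharing a pool only need capacity for the peak of their combined load. The sketch below works through that arithmetic. The cluster names reuse the figure labels, and the hourly demand numbers are invented for illustration and do not come from the talk.

```python
# Hypothetical hourly demand (CPU cores) for three clusters.
# Illustrative numbers only; nothing here is measured data.
demand = {
    "mpp_db": [40, 10, 10, 20],
    "hadoop": [10, 50, 20, 10],
    "hbase":  [20, 20, 60, 30],
}


def sum_of_peaks(demand):
    """Capacity when every cluster gets dedicated hardware: Σ(Max)."""
    return sum(max(series) for series in demand.values())


def peak_of_sum(demand):
    """Capacity when all clusters draw from one shared pool: Max(Σ)."""
    # zip(*...) regroups the per-cluster series into per-hour tuples.
    return max(sum(hour) for hour in zip(*demand.values()))


dedicated = sum_of_peaks(demand)
shared = peak_of_sum(demand)
print(f"dedicated clusters: {dedicated} cores, shared pool: {shared} cores")
```

With these made-up numbers the peaks fall in different hours, so the shared pool needs fewer cores than the dedicated clusters combined. If all clusters peaked in the same hour, the two figures would be equal and consolidation would save nothing on capacity.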

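[Figure: ease of management. Labels: MPP, monitoring, cluster installation and configuration, hardware planning and deployment, virtualization platform.]

Cluster consolidation
Share resources and lower CAPEX.
Σ(Max) versus Max(Σ): capacity sized for the sum of each cluster's peak, versus capacity sized for the peak of the combined load.

Efficiency comparison
Aspect | Physical cluster | Virtualized cluster
Building a cluster | Purchase servers, build out the data center, complex manual steps | No need to know each workload's resource consumption precisely; centralized IT management; fully automated end to end
Operating a cluster | Failures require an immediate response | High fault tolerance, automatic failover
Capacity planning | Must plan for the future and reserve unused resources | Provision only for current needs; use what you need, reserve nothing
Adding compute or storage | Procure and set up new servers | One click requests more resources from the resource pool
Outcome: lower operating cost (OPEX), lower capital investment (CAPEX), high return (ROI).

Dynamically scaling Hadoop: using resources sensibly
- Different tenants deploy their own compute clusters and share the distributed file system (HDFS)
- Compute scales dynamically according to priority and available resources
[Figure: dynamic resource control. A data layer (HDFS) spans six hosts on the virtual platform. Above it, a compute layer of compute VMs is split between a test cluster (ad hoc data mining) and a production cluster (production recommendation engine), each with its own JobTracker.]

Why virtualize big data?
- Simplify operations
- Share infrastructure
- Leverage existing investment

vSphere Big Data Extensions
VMware's big data solution: BDE
- VMware vSphere Big Data Extensions (BDE) became generally available on September 22, 2013, as a new feature of vSphere 5.5.
- The new Big Data Extensions component ships as a vSphere plug-in.
- Administrators can deploy, monitor and manage Hadoop clusters directly from vCenter.
- It improves the efficiency of running Hadoop.

Deploy a big data cluster in minutes
- Manual deployment flow: server preparation, operating system installation, network configuration, big data cluster installation and configuration
- Automated, UI-driven deployment flow

Scale out a cluster with one click

Easily customize cluster configuration
Cluster Specification File

    "groups": [
      {
        "name": "master",
        "roles": ["hadoop_namenode", "hadoop_jobtracker"],
        "storage": {"type": "SHARED", "sizeGB": 20},
        "instance_type": "MEDIUM",
        "instance_num": 1,
        "ha": true
      },
      {
        "name": "worker",
        "roles": ["hadoop_datanode", "hadoop_tasktracker"],
        "instance_type": "SMALL",
        "instance_num": 5,
        "ha": false
    …

Callouts on the slide:
- Resource configuration
- Storage configuration: choice of shared storage or local disk
- High availability option
- Number of Hadoop nodes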
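As a rough illustration of how a specification like the one above could be inspected before deployment, the sketch below parses a completed copy of the slide's fragment and reports the node count, the HA groups and the groups on shared storage. The closing brackets, the quoted MEDIUM and SMALL values and the summarize helper are assumptions added to make the fragment parseable. The field names follow the slide and are not checked against the actual BDE schema.

```python
import json

# The slide's fragment, closed off so that it parses. The part the slide
# elides with "…" is simply omitted here (assumption).
SPEC_TEXT = """
{
  "groups": [
    {"name": "master",
     "roles": ["hadoop_namenode", "hadoop_jobtracker"],
     "storage": {"type": "SHARED", "sizeGB": 20},
     "instance_type": "MEDIUM", "instance_num": 1, "ha": true},
    {"name": "worker",
     "roles": ["hadoop_datanode", "hadoop_tasktracker"],
     "instance_type": "SMALL", "instance_num": 5, "ha": false}
  ]
}
"""


def summarize(spec_text):
    """Hypothetical helper: report what a spec would deploy."""
    groups = json.loads(spec_text)["groups"]
    return {
        "total_nodes": sum(g["instance_num"] for g in groups),
        "ha_groups": [g["name"] for g in groups if g.get("ha")],
        # Groups without a "storage" entry are treated as not shared.
        "shared_storage_groups": [
            g["name"] for g in groups
            if g.get("storage", {}).get("type") == "SHARED"
        ],
    }


summary = summarize(SPEC_TEXT)
print(summary)
```

A check like this only reads the file. Deploying the cluster is still done through BDE itself.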

Predefined Specs for Standardization and Ease of Consumption
- Ships with a number of common cluster specification files
- Administrators can predefine specs suited to the varying needs of their users
- Ease of consumption: it just works!
- Standardization

Example specs:
- Developer: 3 Hadoop nodes; Cloudera, Pivotal, MapR; small VM; local storage; no HA; …
- Data Scientist: 5 Hadoop nodes; Cloudera, Pivotal; Hive, Pig; medium VM; HA; …
- High priority: 50 Hadoop nodes; Cloudera; Hive, Pig; large VM; HA; …
- …

Your Choice of Hadoop Distributions and Tools
[Figure: logos of community projects and distributions]
- Flexibility to choose and try out major distributions
- Support for multiple projects
- Open architecture to welcome industry participation
- Contributing Hadoop Virtualization Extensions (HVE) to the open source community

Automation of Hadoop Cluster Lifecycle Management
vSphere Big Data Extensions covers: deploy, customize, load data, execute jobs, tune configuration, scaling, …

Challenges of Running Hadoop in Enterprises
[Figure: Dept A (recommendation engine) and Dept B (ad targeting) each run separate production, test and experimentation clusters over log files, social data, transaction data and historical customer behavior. On the horizon: NoSQL, real-time SQL, …]
Pain points:
- Cluster sprawl
- Redundant common data in separate clusters
- Inefficient use of resources: some clusters could be running at capacity while others sit idle

What if you can…
[Figure: experimentation, production recommendation engine, production ad targeting and test/dev clusters side by side on one platform]
One physical platform to support multiple virtual big data clusters

Today's Challenges on Hadoop Infrastructure
Fixed compute and storage lead to low utilization and inflexibility:
- Compute and storage are linked together in a fixed ratio based on the hardware spec
- Not all jobs are created equal (some are compute intensive)
Inflexible infrastructure leads to waste:
- Too little compute power means slow processing
- Too much compute power sits idle
- The problem compounds with larger clusters
So what happens?
- Yahoo: average CPU utilization of Hadoop clusters is <15%
- Twitter: uses different hardware for different clusters, an expensive way to achieve efficiency
[Figure: a row of servers, each pairing a compute node with a data (storage) node]
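To make the fixed-ratio problem above concrete, here is a small worked sketch. It assumes every server brings the same bundle of cores and disk and shows how much of one resource sits idle when a workload is skewed toward the other. The server shape and workload sizes are hypothetical and are not taken from the Yahoo or Twitter figures quoted on the slide.

```python
import math

# Hypothetical server shape: a fixed bundle of cores and disk per server.
CORES_PER_SERVER = 16
TB_PER_SERVER = 24


def coupled_sizing(cores_needed, tb_needed):
    """Size a cluster whose compute and storage can only grow together.

    Returns (servers, idle_cpu_fraction, idle_disk_fraction).
    """
    # The scarcer resource dictates the server count.
    servers = max(
        math.ceil(cores_needed / CORES_PER_SERVER),
        math.ceil(tb_needed / TB_PER_SERVER),
    )
    idle_cpu = 1 - cores_needed / (servers * CORES_PER_SERVER)
    idle_disk = 1 - tb_needed / (servers * TB_PER_SERVER)
    return servers, idle_cpu, idle_disk


for label, cores, tb in [("storage-heavy", 32, 480), ("compute-heavy", 320, 48)]:
    servers, idle_cpu, idle_disk = coupled_sizing(cores, tb)
    print(f"{label}: {servers} servers, "
          f"{idle_cpu:.0%} CPU idle, {idle_disk:.0%} disk idle")
```

Both example workloads need 20 servers under these assumptions, yet each leaves about 90% of one resource unused.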

Getting more out of your infrastructure
- Decouple compute from storage
- Stateless compute can grow and shrink elastically
- Data locality is preserved: place the compute where the data resides
- Extra compute capacity can be used for other workloads
[Figure: on each host, compute VMs form a compute layer and storage VMs form a storage layer. Some VMs run Hadoop, spare capacity runs other workloads.]

Elastic, Multi-tenant Hadoop with Virtualization
1. Combined storage/compute
   - Unmodified Hadoop node in a VM
   - VM lifecycle determined by the Datanode
   - Limited elasticity
2. Separate compute from storage
   - Separate compute from data
   - Stateless compute
   - Elastic compute
3. Separate virtual compute clusters per tenant
   - Separate virtual compute
   - Compute cluster per tenant
   - Stronger VM-grade security and resource isolation
[Figure: the three layouts side by side, showing Hadoop nodes, compute VMs, storage VMs and tenants T1 and T2]

Use case 1: Elastic Hadoop with Tiered SLA
- Production workloads have high priority
- Experimentation workloads have lower priority
[Figure: on VMware vSphere + Serengeti, a dynamic resource pool of compute VMs is shared between experimentation MapReduce and production MapReduce (production recommendation engine) above a common data layer]

Use case 2: Elastic Hadoop for Multiple Departments
- Centralized IT offers Hadoop to multiple departments
[Figure: on VMware vSphere + Serengeti, Department 1 and Department 2 each run MapReduce on compute VMs from a dynamic resource pool above a common data layer]

Use case 3: Elastic Big Data
- The Hadoop ecosystem is evolving quickly to include more and more computing engines (HBase, streaming, interactive SQL, etc.)
[Figure: on VMware vSphere + Serengeti, an HBase resource pool and a Hadoop resource pool share the compute layer, running HBase and MapReduce compute VMs above a common data layer]
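The tiered-SLA idea in use case 1 can be sketched as a strict-priority allocation over a shared pool of compute VMs. This is a deliberately simplified model for illustration. It does not reproduce how vSphere resource pools or Serengeti actually arbitrate resources, and the Tenant class, the allocate function and the pool size of 15 are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Tenant:
    name: str
    priority: int  # assumption: lower number means higher priority
    demand: int    # compute VMs the tenant wants right now


def allocate(tenants, capacity):
    """Grant compute VMs in priority order until the pool is exhausted."""
    grants = {}
    remaining = capacity
    for tenant in sorted(tenants, key=lambda t: t.priority):
        granted = min(tenant.demand, remaining)
        grants[tenant.name] = granted
        remaining -= granted
    return grants


POOL = 15  # hypothetical number of compute VMs in the shared pool

# Quiet period: production needs little, experimentation gets the rest.
quiet = allocate([Tenant("production", 0, 5),
                  Tenant("experimentation", 1, 12)], POOL)

# Busy period: production demand rises and experimentation shrinks.
busy = allocate([Tenant("production", 0, 13),
                 Tenant("experimentation", 1, 12)], POOL)

print(quiet)
print(busy)
```

Because the compute VMs are stateless and the data stays in the shared data layer, shrinking the experimentation cluster in the busy case moves no data. That is the property the decoupled design relies on.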

[Figure: the Hadoop stack. HDFS (Hadoop Distributed File System), HBase (key-value store), MapReduce (job scheduling/execution system), Pig (data flow), Hive (SQL), BI reporting, ETL tools, management server, ZooKeeper (coordination), HCatalog, RDBMS. Server roles: Namenode, Jobtracker, Hive MetaDB, HCatalog MDB.]

Achieve HA for the Entire Hadoop Stack
- vSphere HA is battle-tested high availability technology
- A single mechanism achieves HA for the entire Hadoop stack
- One click to enable HA and/or FT

Hybrid storage model to get the best of both worlds
- Master nodes (Namenode, Jobtracker, etc.) on shared storage: leverage vSphere vMotion, HA and FT
- Slave nodes (Tasktracker/Datanode) on local storage: lower cost, scalable bandwidth

Leveraging Isilon as External HDFS
- Time to results: analysis of data in place
- Lower risk using vSphere with Isilon
- Scale storage and compute independently
[Figure: data layer (Hadoop on Isilon) beneath an elastic virtual compute layer]

Proactive monitoring with vCOPs
- Proactive monitoring through vCOPs
- Gain comprehensive visibility
- Elimin
