![High-performance elastic Spark deployment architecture, page 1](http://file4.renrendoc.com/view/0ad319bafa5af2126583396391bce8b2/0ad319bafa5af2126583396391bce8b21.gif)
![High-performance elastic Spark deployment architecture, page 2](http://file4.renrendoc.com/view/0ad319bafa5af2126583396391bce8b2/0ad319bafa5af2126583396391bce8b22.gif)
![High-performance elastic Spark deployment architecture, page 3](http://file4.renrendoc.com/view/0ad319bafa5af2126583396391bce8b2/0ad319bafa5af2126583396391bce8b23.gif)
![High-performance elastic Spark deployment architecture, page 4](http://file4.renrendoc.com/view/0ad319bafa5af2126583396391bce8b2/0ad319bafa5af2126583396391bce8b24.gif)
![High-performance elastic Spark deployment architecture, page 5](http://file4.renrendoc.com/view/0ad319bafa5af2126583396391bce8b2/0ad319bafa5af2126583396391bce8b25.gif)
Document summary
High Performance Spark via Separation of Compute and Storage
High-performance, elastic Spark deployment through a compute-storage separation architecture

Table of Contents
- Motivation
- Spark shuffle with disaggregated storage
- Splash shuffle manager
- A reference design with in-memory distributed file system
- Evaluation results
- Future work and conclusion

Back to the Days of MapReduce
How did we design data applications?
- Network bandwidth vs. disk throughput: move code rather than moving data
- Fast small memory vs. slow large disk: optimize sequential R/W
[Diagram label: 1Gbps]

The Trends of HW in DC
- Enterprise Bytes Shipments: HDD and SSD (source: https:/blog/hdd-vs-ssd-in-data-centers/)
- Datacenter Bandwidth Migration (source: https:/c/dam/en/us/products/collateral/switches/nexus-9000-series-switches/white-paper-c11-734328.pdf)

Modern DC Architecture
What changes are happening in the modern DC?
- Disaggregated storage and computation
- High-speed network between compute nodes and storage boxes
- Tiered storage for hot and cold data
[Diagram: compute nodes, storage boxes, and accelerators connected by a 25 to 100 Gbps network]

Reimagining the DC Memory and Storage Hierarchy
[Diagram: memory and storage tiers from hot to cold (DRAM as memory; SSD and HDD/tape as storage, spanning hot, warm, and cold data), annotated with "Improving memory capacity", "Improving SSD performance", and "Efficient and scalable storage"]

Embrace the New Architecture
Intel Optane™ DC Persistent Memory
- Low latency and high throughput, like DRAM
  - Latency: 200 to 400 ns
  - Bandwidth per DIMM: read up to 8GB/s, write up to 3GB/s
- High density and non-volatility, like NAND
  - Up to 6TB per server
- Memory-speed storage system

How to Use DCPMM
- RDMA/DPDK
- DCPMM per node
- DCPMM centered arch

MemVerge Elastic Spark Solution
[Diagram: RDD caching and storage, shuffle data, and data source, connected through an Ethernet switch]

A PMEM Centric Data Platform
[Diagram: MemVerge Spark Adaptors on top of MemVerge DMO, a cluster-shared persistent memory pool spanning Node 1 through Node N, each with DRAM and PMEM]

Spark Integration
- Data source: Hadoop compatible storage APIs
- Shuffle data: a new generic shuffle manager
- RDD caching and storage: Spark with additional RDD persist APIs
All three are backed by MemVerge DMO.

Spark Shuffle with Disaggregated Storage

Shuffle & Block Manager
- The block manager persists data to memory or disk on the local node.
- Losing an executor means recomputing the whole shuffle task.
- The storage and network implementation is coupled with the shuffle implementation.
[Diagram: on a compute node, the Spark executor's shuffle manager persists and retrieves shuffle output through the block manager, which uses a memory store and a disk store on local disk]

The Problems of Current Shuffle Manager Design
- Poor elasticity
  - A node failure causes shuffle data loss, which in turn triggers recomputation.
- Heavy overhead to NodeManager
  - Coexisting with the NodeManager (NM) puts heavy overhead on the NM under heavy workloads.
- Unsuitable for cloud environments
  - A storage/computation disaggregation architecture brings no advantages to local shuffle.
- The Spark community is also working on these problems
  - SPARK-25299 Use remote storage for persisting shuffle data
  - SPARK-26268 Decouple shuffle data from Spark deployment

MemVerge Splash Shuffle Manager
- A flexible shuffle manager
- Supports user-defined storage backend and network transport for shuffle data
- Open source: /MemVerge/splash
- Spark JIRA: SPARK-25299

Splash Shuffle Manager
[Diagram: Executor 1 on Worker 1 writes shuffle data through its Splash storage plugin to a storage system (NFS, local FS, HDFS, S3, DMO); Executor 2 on Worker 2 reads shuffle data through its own Splash storage plugin]
- A new shuffle manager
  - Implements the shuffle manager interface
- Separating storage and compute
  - Extracts the storage and network implementations out of the shuffle manager itself and into plugins
- Benefits
  - Shuffle becomes stateless
  - Storage becomes easier to maintain
  - Enables the use of 3rd party high performance networking and storage

Persisting Shuffle Data to PMEM
- Distributed Memory Object (DMO) is a distributed file system built on PMEM.
- The storage plugin allows us to persist data into the DMO system, a separate storage cluster.
- The use of PMEM and fast network technologies (RDMA or DPDK) in the storage cluster speeds up the shuffle.
[Diagram: shuffle manager, Splash shuffle manager, storage plugin, DMO plugin, DMO system, persistent memory]

Benchmark Configurations
- Common
  - 4 compute nodes
  - 10GbE network
  - Driver memory 4g
  - Executor memory 6g
  - Total cores 160
  - Executor cores 4
  - Spark 2.3.2
  - Hadoop 2.7.4
- Baseline
  - 4 local 1TB HDDs/node
  - Vanilla shuffle manager
  - TCP/IP based Netty
- Ours
  - 2 DMO nodes
  - 512GB PMEM/node
  - Splash shuffle manager
  - DPDK network

TeraSort Performance
[Chart: TeraSort 400GB, 216G shuffle write; map stage and reduce stage duration (min) for Baseline, DMO with UDP, and DMO with DPDK]
Intel HiBench: /Intel-bigdata/HiBench

TPC-DS Performance on Some Shuffle Heavy Queries
[Chart: TPC-DS 1.2TB; duration (s) by query ID, Baseline vs. DMO]
spark-sql-perf: /databricks/spark-sql-perf

Data Size Scaling
[Charts: TPC-DS Query 80, Query 4, Query 23, and Query 24 at 400GB, 800GB, and 1200GB; Baseline vs. DMO]

TPC-DS Performance - All Queries
spark-sql-perf: /databricks/spark-sql-perf

Future Work
- Splash + DMO + RDMA
- Validate in production and cloud environments
- Performance tuning
- Integration with Spark on K8S

Conclusion
- Separating compute and storage is beneficial for Spark
  - Performance
  - Elasticity
  - Fault tolerance
- A reference design based on Splash shuffle manager
  - No Spark modification is needed
  - Storage and network becom…
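
The slides describe Splash as moving the storage implementation out of the shuffle manager and into plugins, so that shuffle becomes stateless on the compute side. The sketch below illustrates that idea in plain Python. It is not Splash's actual API: `ShuffleStorage`, `SharedDirStorage`, `write_map_output`, `read_reduce_input`, and `demo` are hypothetical names chosen for illustration, and a temporary directory stands in for a shared backend such as NFS, HDFS, S3, or DMO.

```python
import os
import pickle
import tempfile
import zlib
from abc import ABC, abstractmethod


class ShuffleStorage(ABC):
    """Hypothetical storage-plugin interface: decides where shuffle blocks live."""

    @abstractmethod
    def write(self, block_id: str, payload: bytes) -> None: ...

    @abstractmethod
    def read(self, block_id: str) -> bytes: ...


class SharedDirStorage(ShuffleStorage):
    """Stores each block as one file under a directory.

    The directory is a stand-in (assumption) for any shared storage backend.
    """

    def __init__(self, root: str) -> None:
        self.root = root
        os.makedirs(root, exist_ok=True)

    def write(self, block_id: str, payload: bytes) -> None:
        with open(os.path.join(self.root, block_id), "wb") as f:
            f.write(payload)

    def read(self, block_id: str) -> bytes:
        with open(os.path.join(self.root, block_id), "rb") as f:
            return f.read()


def partition_of(key: str, num_partitions: int) -> int:
    # crc32 is stable across processes, unlike Python's built-in hash() on str.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def write_map_output(storage, shuffle_id, map_id, records, num_partitions):
    """Map side: bucket records by partition and persist one block per bucket."""
    buckets = [[] for _ in range(num_partitions)]
    for key, value in records:
        buckets[partition_of(key, num_partitions)].append((key, value))
    for reduce_id, bucket in enumerate(buckets):
        block_id = f"shuffle_{shuffle_id}_{map_id}_{reduce_id}"
        storage.write(block_id, pickle.dumps(bucket))


def read_reduce_input(storage, shuffle_id, num_maps, reduce_id):
    """Reduce side: fetch this partition's block from every map task."""
    records = []
    for map_id in range(num_maps):
        block_id = f"shuffle_{shuffle_id}_{map_id}_{reduce_id}"
        records.extend(pickle.loads(storage.read(block_id)))
    return records


def demo():
    root = tempfile.mkdtemp(prefix="shuffle_demo_")
    writer_side = SharedDirStorage(root)
    write_map_output(writer_side, 0, 0, [("a", 1), ("b", 2)], num_partitions=2)
    write_map_output(writer_side, 0, 1, [("a", 3), ("c", 4)], num_partitions=2)
    # A fresh storage handle models a different executor: nothing is shared
    # with the writers except the storage backend itself.
    reader_side = SharedDirStorage(root)
    return [read_reduce_input(reader_side, 0, 2, r) for r in range(2)]
```

Because the map and reduce functions only depend on the `ShuffleStorage` interface, swapping the backend means supplying another subclass; the shuffle logic itself stays unchanged, which is the property the slides attribute to the plugin design.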