HPE 3PAR 数据消重压缩方案 — HPE 3PAR Adaptive Data Reduction

The HPE 3PAR StoreServ data reduction story
When used together, the Adaptive Data Reduction technologies operate in the following order, which is also the correct order in which to position them to customers:
1. Deduplication — prevent storing duplicate data
2. Compression — reduce the data footprint
3. Data Packing — pack odd-sized data together
4. Zero Detect — remove zeros inline

数据消重 Deduplication

HPE 3PAR StoreServ deduplication
Advanced inline, in-memory deduplication
- Host writes are held in cache pages to increase write performance
- The 3PAR ASIC/CPU checks the dedup lookup table to see whether the pages are duplicates of existing pages
- Potential duplicates are confirmed with a bit-for-bit check, accelerated by the ASIC and Express Indexing
- Duplicates are removed and only unique data is flushed to the SSDs, reducing writes
- The 3PAR ASIC, paired with Express Index lookup tables, provides high-performance, low-latency inline deduplication

3PAR Thin deduplication
(Diagram: a host write lands in the write cache; L1/L2/L3 tables hold Hash L1/L2/L3 entries mapping LBAs to stored pages)
1. Host write arrives and is acknowledged to the host from cache
2. The ASIC computes a hash of the page
3. The ASIC performs a fast metadata lookup with Express Indexing
4. No match? Write the new host data to the backend
5. Match? Read the existing data from cache or backend and XOR it with the new host write data to rule out a hash collision:
   5a. If the XOR result = 0, just update the metadata
   5b. If the XOR result ≠ 0, write the new host data to the backend
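The write path above can be summarized in code. The following is a minimal sketch, not 3PAR's actual implementation: the hash function, table layout, and in-memory "backend" are stand-ins, and the byte-wise comparison plays the role of the ASIC's XOR verification.

```python
import hashlib

PAGE_SIZE = 16 * 1024          # 3PAR deduplicates 16 KiB pages
dedup_table = {}               # hash -> stored page (stand-in for the L1/L2/L3 tables)
backend = []                   # stand-in for SSD-backed storage

def write_page(page: bytes) -> str:
    """Sketch of the inline dedup write path: hash, look up, verify, store."""
    assert len(page) == PAGE_SIZE
    digest = hashlib.sha256(page).digest()   # stand-in for the ASIC-computed hash
    existing = dedup_table.get(digest)
    if existing is not None:
        # Potential duplicate: verify bit-for-bit (the ASIC does this via XOR).
        if existing == page:                 # "XOR result = 0"
            return "metadata updated, no data written"
        backend.append(page)                 # "XOR result != 0": hash collision
        return "collision, new data written to backend"
    dedup_table[digest] = page               # new hash: index it and store the page
    backend.append(page)
    return "unique, new data written to backend"
```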
HPE 3PAR deduplication
Implementation in HPE 3PAR OS 3.2.x (TDVV1/2)
- A dedup-enabled TPVV has a private space (one per dedup-enabled TPVV) and a shared space (one per CPG)
- New incoming writes have a hash calculated. If the hash hasn't been seen before, the data is stored in the shared space (DDS). The DDS can grow significantly.
- If a calculated hash matches an existing one, the array uses the ASIC to check whether the new data matches the existing data (XOR). If it does, no new data is written.
- If the data does not match, it is a hash collision (the hash is the same but the data is different). This data is stored in the VV's private space.
- When data in the shared space is no longer referenced by any dedup-enabled VV, the space is freed from the shared space and becomes available for new data
- The small private space contains only collision data; the large shared space contains most of the data, including a large amount of unique data

Deduplication in the real world
A small amount of data is very heavily duplicated. Consider 720GB of unique data plus 80GB of data duplicated 11:1:
- Stored with dedup: 80GB + 720GB = 800GB (2:1)
- Rehydrated: 80GB × 11 = 880GB, plus 720GB = 1.6TB
- Stored without dedup: 880GB + 720GB = 1.6TB (1:1)
Compression-optimized deduplication approach
New compression-friendly implementation (TDVV3)
- All writes with new hashes are written to the private space first, not the shared space. All hashes are tracked.
- When data is seen for the second time, it is moved from the private space to the shared space. Both VVs then reference that data (as do future duplicate writes).
- The majority of data by capacity is held in private space since, once deduplicated, duplicate data accounts for only a small amount of consumed capacity
- Shared space is used more efficiently, meaning a smaller shared space can offer increased deduplication scalability, while reduced metadata improves performance
- The small shared space contains only duplicate data; the large private spaces hold the majority of the data

TDVV3 metadata structure
(Diagram: each VV has a DDC — Dedup Client / private space, one per VV — whose L1/L2/L3 tables map LBAs to 16K pages in SD space; duplicate matches point into the DDS — Dedup Store / shared space, one per CPG — via the DDS table)

Deduplication hash use: reducing back-end reads
Taking advantage of the larger hash
- A hash is computed over each 16K cache page
- 32 bits of the hash are used as the DDS entry index
- 64 bits of the hash are stored in the DDS for collision detection
- If the stored hash doesn't match, we have a hash collision and we don't need a backend read
- If the hash matches, then we perform a read and an XOR to verify

A quick note on TDVV3
- TDVV1 and 2 are working perfectly for almost all of our existing customers
- The new deduplication implementation is designed to enable support for deduplication and compression together on the same VV (more detail coming up!)
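A minimal sketch of the two-level hash check described above. The 32-bit and 64-bit widths follow the slide; the hash function and table layout are illustrative assumptions:

```python
import hashlib

dds_index = {}   # 32-bit index -> (64-bit check value, location of stored page)

def hash_parts(page: bytes):
    """Split one strong hash into a 32-bit index and a 64-bit check value."""
    h = hashlib.sha256(page).digest()
    index32 = int.from_bytes(h[:4], "big")    # used as the DDS entry index
    check64 = int.from_bytes(h[4:12], "big")  # stored in the DDS for collision detection
    return index32, check64

def is_potential_duplicate(page: bytes) -> bool:
    index32, check64 = hash_parts(page)
    entry = dds_index.get(index32)
    if entry is None:
        return False                  # no entry at this index: page is new
    stored_check, _location = entry
    if stored_check != check64:
        return False                  # 64-bit mismatch: collision, no backend read needed
    return True                       # only now read the page back and XOR to verify
```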
A single DDS goes further than ever before
On average, duplicate data accounts for 10% of capacity consumed (DDS: 10%, DDC: 90%; max DDS size: 64TB)
- Given that 10% of data by volume stored is duplicate and 90% is unique, a single 64TB TDVV3 DDS allows us to deduplicate 640TB of written data (576TB of total DDC space alongside a 64TB DDS)
- After the first 640TB, further new data is 100% unique (and will be written to the DDCs), but any incoming data that is a duplicate of data already stored in the DDS can continue to be deduplicated
- This will not affect the amount of savings (in GB), but it will reduce the deduplication ratio, since the ratio of unique to shared data changes as the DDCs continue to grow

Provisioning changes in 3.3.1
"TDVV" and "Thinly Deduplicated" are going away
- On the SSMC 3.1 VV provisioning screen, deduplication is no longer a volume type
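The 640TB and 576TB figures fall straight out of the 10%/90% split; a quick check of the arithmetic:

```python
# Sketch: how far a 64TB DDS stretches when ~10% of written data is duplicate.
DDS_MAX_TB = 64          # the TDVV3 shared space holds only duplicate data
DUP_FRACTION = 0.10      # average share of written data that is duplicate

total_written_tb = DDS_MAX_TB / DUP_FRACTION       # 640 TB of host-written data
ddc_tb = total_written_tb * (1 - DUP_FRACTION)     # 576 TB of unique data in the DDCs

print(f"written: {total_written_tb:.0f} TB = {DDS_MAX_TB} TB DDS + {ddc_tb:.0f} TB DDC")
# written: 640 TB = 64 TB DDS + 576 TB DDC
```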
Provisioning changes in 3.3.1
"TDVV" is going away
- Deduplication and compression are now attributes of TPVVs (thin volumes)
- They are always shown but only enabled when a system supports deduplication and compression
- They can be enabled together or independently from one another
- You can still use the tune tools to convert a VV between thin, dedup, compress, and deco
- You can easily see the status of dedup/compression

What does the new implementation mean?
Advantages of the new deduplication implementation
- Up to 8x better scalability: better deduplication scalability through more efficient use of the DDS
- Improved savings: storing only duplicate data in the DDS means more chances to deduplicate data
- Simplified management: improved scalability means fewer CPGs for more deduplicated volumes
- Improved performance: increased IOPS and bandwidth, and reduced latency, on all platforms

数据压缩 Compression
HPE 3PAR StoreServ compression
Advanced inline, in-memory compression
- HPE 3PAR StoreServ arrays leverage Express Scan technology to prevent wasted CPU cycles
- Host writes are held in cache pages to increase write performance
- A page is compressed using the CPU once Express Scan verifies the data is compressible
- Compressed pages are written to SSD for permanent storage
- Compression uses the same three-layer exception tables used for deduplication

Lossy vs. lossless compression
Comparing methods for compression
- Uncompressed image: 12MB (compression ratio 1:1)
- Lossless (PNG) compressed image: 6MB (compression ratio 2:1)
- Lossy (JPEG) compressed image: 0.1MB (compression ratio 120:1)
Compression algorithm and block size
Compression ratio and throughput for Oracle test data:

Algorithm | Full file | 64KB blocks | 16KB blocks | Compression (MB/s) | Decompression (MB/s)
lz4       | 3.74      | 3.69        | 3.45        | 441                | 1460
lzo       | 3.77      | 3.75        | 3.58        | 404                | 610
gzip      | 4.73      | 3.64        | 3.47        | 246                | 133

- For compression, larger block sizes offer increased savings
- 3PAR's 16KB page size is naturally a great choice for compression: 16KB shows just a 6.5% loss in savings compared to a 64KB block size
- Telemetry data tells us that the majority of write bandwidth is driven by block sizes of 16KB and larger
- Compression with blocks smaller than 16KB is CPU-intensive work and very inefficient in terms of savings. A perfect example is EMC XtremIO: when they introduced compression in XIOS 3.0, they were forced to change their block size from 4KB to 8KB (which is still less efficient than a 16KB block size)
- lz4 offers excellent savings when using 16KB pages and the best performance for compression and decompression
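The block-size effect is easy to reproduce. A rough illustration using Python's built-in zlib; the slide's measurements used lz4/lzo/gzip on Oracle data, so the absolute numbers here are only indicative:

```python
import zlib

def ratio_for_block_size(data: bytes, block_kb: int) -> float:
    """Compress data in independent fixed-size blocks and return the overall ratio."""
    block = block_kb * 1024
    compressed = sum(
        len(zlib.compress(data[i:i + block]))
        for i in range(0, len(data), block)
    )
    return len(data) / compressed

# Repetitive sample data stands in for the Oracle test set.
sample = (b"customer_id,order_id,amount\n" + b"1042,77,19.99\n" * 500) * 64

for kb in (4, 16, 64):
    print(f"{kb:>2}KB blocks: {ratio_for_block_size(sample, kb):.2f}:1")
# Smaller blocks give the compressor less context, so the ratio drops.
```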
Compression
Data reduction by compressing data
- Compression algorithms work by inspecting data in blocks and removing redundant information
- Within each block there will be repeated data and often padding around the real data
- Compression removes the repeated data and padding space to reduce the capacity required to store the data
(Diagram: data segments before and after compression, showing repeated bit patterns and padding removed)

Compression metadata structure
(Diagram: a 16KB page from SD LD space holding compressed pages, page metadata recording the offset of each compressed page, and padding space)
- In this example, three 16KiB pages are compressed into a single 16KiB page
- The host has written 48KiB to the volume but the array is only consuming 16KiB, resulting in 3:1 data reduction
- There is no change to the metadata path for compressed pages, since the existing Express Index tables are leveraged for increased simplicity
- Up to eight compressed pages can be stored in a single SD LD page

Data Packing
Compressed data presents a unique problem
After compression, blocks are odd sizes
- Six uncompressed 16KB pages (96KB total) compress to 2.3KB, 8.3KB, 5.2KB, 3.2KB, 10.7KB and 1.1KB: 30.8KB in total, saving 65.2KB (3.12:1)
- The odd sizes of compressed pages make them difficult to store

Append-only data structures
Used by many all-flash arrays
- As data is written to the system, it is compressed and then combined into a single stripe
- The complete stripe, with any metadata required, is written to SSD sequentially
- When hosts overwrite data, the old blocks are invalidated and new data is written to new stripes
- At some point, a post-process task must take the existing data and write it to a new stripe together with data from other partial stripes
- Extremely inefficient use of space as data is overwritten
- Requires the array to hide space for housekeeping (overprovisioning)
- Requires backend-I/O-intensive housekeeping to keep up with a massive amount of garbage
- Garbage collection and housekeeping need to run at both drive and system level
- Virtually impossible to accurately report space consumption and true data reduction ratios

Storing compressed data in variable block sizes
Some systems use variable block sizes (powers of two: 4K, 8K, 16K)
- The same six compressed pages (2.3KB, 8.3KB, 5.2KB, 3.2KB, 10.7KB, 1.1KB) are rounded up to 4KB, 16KB, 8KB, 4KB, 16KB and 4KB backend blocks: 52KB in total, saving only 44KB (1.85:1 instead of 3.12:1)
- Padding per backend page means lots of wasted space, resulting in lower total system efficiency
- The array would likely still report this as 3.12:1
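A quick sketch of the rounding penalty. The six compressed sizes come from the slide; the rounding rule (smallest base-2 block of at least 4KB that fits) is the stated scheme:

```python
# Round each compressed page up to the smallest fitting base-2 block (4K/8K/16K).
compressed_kb = [2.3, 8.3, 5.2, 3.2, 10.7, 1.1]   # from six 16KB host pages (96KB)

def backend_block_kb(size_kb: float) -> int:
    for block in (4, 8, 16):
        if size_kb <= block:
            return block
    raise ValueError("larger than one 16KB page")

stored = sum(backend_block_kb(s) for s in compressed_kb)   # 52 KB
print(f"ideal:  {96 / sum(compressed_kb):.2f}:1")          # ideal:  3.12:1
print(f"actual: {96 / stored:.2f}:1")                      # actual: 1.85:1
```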
HPE 3PAR Data Packing
No compromise on efficiency gained through compression
- Data Packing stores odd-sized compressed pages together: for example, pages of 2.3KB, 1.1KB, 5.2KB, 3.2KB and 2.9KB packed into a single 16KB page

How does it work? Compressed Data Page Format
- The Control Buffer Header is 256 bytes and contains pointers to the compressed pages
- Each 16 KiB data page can hold up to 8 compressed pages (Compressed Data0 through Compressed Data7), limited by the available page table entry space
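A minimal sketch of such a packed-page layout. The 256-byte header and the 8-slot limit come from the slide; the exact header fields (a count plus offset/length pairs) are an illustrative guess, not the real on-disk format:

```python
import struct

PAGE_SIZE = 16 * 1024
HEADER_SIZE = 256          # control buffer header, per the slide
MAX_SLOTS = 8              # up to 8 compressed pages per data page

def pack_page(compressed_pages: list) -> bytes:
    """Pack odd-sized compressed pages into one 16 KiB page behind a 256B header."""
    assert len(compressed_pages) <= MAX_SLOTS
    header = struct.pack("<I", len(compressed_pages))    # hypothetical field: slot count
    body, offset = b"", HEADER_SIZE
    for data in compressed_pages:
        header += struct.pack("<II", offset, len(data))  # hypothetical: offset, length
        body += data
        offset += len(data)
    assert HEADER_SIZE + len(body) <= PAGE_SIZE, "pages don't fit in one data page"
    return header.ljust(HEADER_SIZE, b"\0") + body.ljust(PAGE_SIZE - HEADER_SIZE, b"\0")

page = pack_page([b"a" * 2355, b"b" * 1126, b"c" * 5325])  # e.g. 2.3KB, 1.1KB, 5.2KB
assert len(page) == PAGE_SIZE
```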
How does it work? Data Packing
(Diagram: three 16 KiB uncompressed CMPs are each compressed via the 16 KiB compression buffer and then packed, behind a Control Buffer Header, into a single 16 KiB data page)

How does it work? Overwrites
(Diagrams: a 16 KiB data page holds compressed pages of 3 KiB, 2 KiB and 8 KiB)
- In the first case, new host data compresses to 1 KiB, smaller than the 2 KiB page it replaces, and the data page becomes 3 KiB + 1 KiB + 8 KiB
- In the second case, new host data compresses to 4 KiB, larger than the 2 KiB page it replaces, and the data page is repacked as 3 KiB + 4 KiB + 8 KiB

How does it work? Express Scan
(Diagram: a 16 KiB uncompressed CMP is test-compressed via the 16 KiB compression buffer, reducing to 51% of its original size)
- Express Scan verifies that data is compressible before CPU cycles are spent on full compression, preventing wasted work on incompressible data
Adaptive Data Reduction

HPE 3PAR StoreServ data reduction technologies
Technologies work together for optimal results
- When used together, duplicate pages are removed first and unique pages are then compressed
- Data in cache → Deduplication (3PAR ASIC with Express Index tables) → unique data → Compression (Intel CPU with Express Scan) → Data Packing → resulting data written to SSD

Deduplication compared with compression
A simpler way to understand the differences and target use cases
- Compression works within datasets
- Deduplication works across datasets
Deduplication and compression on HPE 3PAR
(Diagram: several private spaces (DDCs) around one shared space (DDS); 90% of data sits in the DDCs with a 2:1 compression ratio, 10% in the DDS with 10-20 average references per page and a 2:1 CPG dedup ratio)
- When new pages are received, a hash is calculated. If the page is unique, it is compressed and written to the DDC.
- When a duplicate page is detected, it is written uncompressed to the DDS and a pointer in the L3 exception table points to that location. The existing page's L3 exception is also updated with the DDS location.
- The original, compressed page is now marked as invalid and is collected during the next GC run
- Only 10% of data is stored in the DDS, and that data is referenced between 10 and 20 times on average, resulting in a 10-20:1 ratio within the DDS. Compressing this data would offer limited savings.
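A minimal sketch of the combined (DECO) write path described above, with stand-in hash and compression functions; the promotion of second-time data from DDC to DDS follows the TDVV3 behaviour described earlier:

```python
import hashlib
import zlib

seen_once = {}   # hash -> index of the compressed first copy in the DDC
dds = {}         # hash -> uncompressed duplicate data (shared space, one per CPG)
ddc = []         # compressed unique pages (one list stands in for all private spaces)

def deco_write(page: bytes) -> str:
    """Dedup first, then compress what stays unique (zlib stands in for the codec).
    Bit-for-bit (XOR) verification of hash matches is omitted for brevity."""
    digest = hashlib.sha256(page).digest()
    if digest in dds:
        return "duplicate: point the L3 exception at the existing DDS entry"
    if digest in seen_once:
        # Second sighting: promote the data to the DDS, uncompressed; the original
        # compressed DDC copy is marked invalid and garbage-collected later.
        dds[digest] = page
        seen_once.pop(digest)
        return "second copy: promoted to DDS; old DDC page marked invalid"
    ddc.append(zlib.compress(page))   # unique data is compressed into the DDC
    seen_once[digest] = len(ddc) - 1
    return "unique: compressed and written to the DDC"
```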
Volume Type Positioning
(Diagram: provisioning types Full, Thin, Deduplicated, Compressed and Deduplicated + Compressed positioned against performance and space savings)

8200 and 8400 Feature Matrix
Various software combinations are supported on an array at the same time
- (Matrix columns: Configuration, File Persona, AFC, Dedup, Compression, Sync RC, Periodic RC, Async RC; rows: Option 1 through Option 4 — per-option support marks not reproduced here)
- These restrictions are for 8200 and 8400 models only
- These restrictions are in place due to the limited Control cache (16GB) available on these two models

8440, 8450 and 20000 Feature Matrix
- At release, there will be no support for compression and Async Streaming together on the same system
- This restriction is under review and may be lifted at a later date
- (Matrix columns: Configuration, File Persona, AFC, Dedup, Compression, Sync RC, Periodic RC, Async RC; rows: Option 1 and Option 2)

7000 and 10000 Feature Matrix
Various software combinations are supported on an array at the same time
- Compression is not supported on Gen4 platforms (7000 and 10000 series)
- Async Streaming is not supported on Gen4 platforms (7000 and 10000 series)
- Only existing Gen4 systems running Async Streaming before upgrading to 3.3.1 will be supported
- Any 7K or 10K customer who wants to start using Async Streaming on 3.3.1 will need to file a CER
- (Matrix columns: Configuration, File Persona, AFC, Dedup, Compression, Sync RC, Periodic RC, Async RC; single Option row)
When to use what
Deduplicated volumes
Good candidates for deduplication — any data that has a high level of redundancy:
- VDI: persistent desktops can achieve excellent deduplication ratios
- VM: OS images from multiple VMs can benefit from dedup; the app data may or may not dedup
- Home directories and file shares: users often store copies of the same file, so these may benefit from dedup
Poor candidates for deduplication:
- Databases: most databases do not contain redundant data blocks
- Previously deduplicated, compressed or encrypted data will not compact further (this does not include self-encrypting drives, where data is deduped before it is written)

When to use what
Compressed volumes
Good candidates for compression — data with little redundancy will not dedup well but can benefit from compression:
- Databases: typically do not have redundant blocks, but do have redundant data within blocks
- VM images with a lot of application data can benefit from compression of the application data
- VDI with non-persistent desktops can achieve excellent compression ratios
Poor candidates for compression:
- Compressed data: data that is compressed at the host will not compress further
- Encrypted data: host- or SAN-encrypted data will not benefit from storage compression (this does not include self-encrypting drives, where data is compressed before it is written)
- Be careful with file data, as it may contain compressed data such as JPEGs and MP3s
When to use what
Deduplicated and Compressed volumes (DECO)
Good candidates for DECO:
- VM images: OS images from multiple VMs can benefit from dedup, and the application data will compress
- VDI: both persistent and non-persistent desktops can achieve excellent data reduction ratios
- Home directories and file shares: deduplication and compression can offer significant space savings
- Email applications such as Exchange
Poor candidates for DECO:
- Databases: most databases will not dedup; compression alone is best for databases
- Deduplicated data: data that has already been deduplicated on the host will not dedup further
- Data compressed or encrypted at the host or switch will not dedup or compress further (this does not include self-encrypting drives, where data is deduped before it is written)

Selective Adaptive Data Reduction
Allowing more efficient use of system resources
- Different data types have different requirements: for each data type, enable the technologies that provide benefits and disable the technologies that don't
- Oracle database: Compressed (2:1)
- Exchange server: Deduplicated + Compressed (1.5:1)
- Compressed video: Thin Provisioned only
- VDI environment: Deduplicated + Compressed (2:1+)
Understanding Compaction ratios
Compaction is a factor of total system efficiency and includes all VVs
- VV1: Deduplicated 2:1, Compressed 2:1, Thin savings 1.5:1
- VV2: Compressed 2:1, Thin savings 1.5:1
- VV3: Compressed 2:1, Thin savings 1.5:1
- VV4: Deduplicated, Thin savings 1.5:1
- VV5: Deduplicated, Thin savings 1.5:1
- VV6: Thin Provisioned, Thin savings 1.5:1
- VV7: Thick Provisioned
- VV8: Thick Provisioned
- VV9: Thick Provisioned
System Compaction: 1.2:1. Why? Because of the Thick Provisioned VVs, which consume a lot of capacity.

SSMC and CLI changes
Estimating savings from dedup and compression
- For existing VVs, dedup and compression start dry-runs
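A sketch of how a handful of thick volumes drags down the system-wide compaction ratio. The per-VV written sizes below are invented for illustration (the slide only gives the per-VV ratios), and compaction is computed as host-written capacity over consumed capacity:

```python
# (written_gb, combined_savings_ratio) per VV; sizes are hypothetical.
# A DECO + thin VV combining dedup 2:1, compression 2:1 and thin 1.5:1 -> 6:1 overall.
vvs = [
    (1000, 6.0),   # deco + thin
    (1000, 3.0),   # compressed or deduped + thin
    (1000, 3.0),
    (1000, 1.5),   # thin-provisioned only
    (1000, 1.5),
    (5000, 1.0),   # thick-provisioned VVs: no savings, large footprint
    (5000, 1.0),
]

written = sum(w for w, _ in vvs)
consumed = sum(w / r for w, r in vvs)
print(f"system compaction: {written / consumed:.2f}:1")   # ~1.2:1
# The thick volumes dominate consumed capacity, pulling the ratio toward 1:1.
```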