
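Cloud Computing and Cloud Data Management
Jiaheng Lu (陆嘉恒), Renmin University of China
"Advanced Data Management" Frontier Workshop
2024/9/20

Outline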

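- Overview of cloud computing
- Google cloud computing technologies: GFS, Bigtable, and MapReduce
- Yahoo cloud computing technologies and Hadoop
- Challenges of cloud data management

New course at Renmin University: "Distributed Systems and Cloud Computing"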

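- Overview of distributed systems
- Survey of distributed cloud computing technologies
- Distributed cloud computing platforms
- Distributed cloud computing program development

Part 1: Overview of Distributed Systems
- Chapter 1: Introduction to distributed systems
- Chapter 2: Client-server architecture
- Chapter 3: Distributed objects
- Chapter 4: Common Object Request Broker Architecture (CORBA)

Part 2: Survey of Cloud Computing
- Chapter 5: Introduction to cloud computing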

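- Chapter 6: Cloud services
- Chapter 7: Comparison of cloud-related technologies
  - 7.1 Grid computing and cloud computing
  - 7.2 Utility computing and cloud computing
  - 7.3 Parallel and distributed computing and cloud computing
  - 7.4 Cluster computing and cloud computing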

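Part 3: Cloud Computing Platforms
- Chapter 8: The three core technologies of the Google cloud platform
- Chapter 9: Technologies of the Yahoo cloud platform
- Chapter 10: Technologies of the Aneka cloud platform
- Chapter 11: Technologies of the Greenplum cloud platform
- Chapter 12: Technologies of the Amazon Dynamo cloud platform

Part 4: Cloud Computing Platform Development
- Chapter 13: Development on Hadoop
- Chapter 14: Development on HBase
- Chapter 15: Development on Google Apps
- Chapter 16: Development on MS Azure
- Chapter 17: Development on Amazon EC2

Cloud computing

Why do we use cloud computing?
Case 1: Write a file
- Save it locally: if the computer goes down, the file is lost
- Files stored in the cloud are always there and never lost

Why do we use cloud computing?
Case 2:
- Use IE: download, install, use
- Use QQ: download, install, use
- Use C++: download, install, use
- ……
- Get the service from the cloud instead

What is cloud and cloud computing?
Cloud: on-demand resources or services over the Internet, with the scale and reliability of a data center.

What is cloud and cloud computing?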

Cloud computing is a style of computing in which dynamically scalable and often virtualized resources are provided as a service over the Internet. Users need not have knowledge of, expertise in, or control over the technology infrastructure in the "cloud" that supports them.

Characteristics of cloud computing
- Virtual: software, databases, Web servers, operating systems, storage and networking as virtual servers.
- On demand: add and subtract processors, memory, network bandwidth, storage.

Types of cloud service
- IaaS: Infrastructure as a Service
- PaaS: Platform as a Service
- SaaS: Software as a Service

SaaS
- Software delivery model
- No hardware or software to manage
- Service delivered through a browser
- Customers use the service on demand
- Instant scalability

SaaS: Examples
- Your current CRM package is not managing the load, or you simply don't want to host it in-house. Use a SaaS provider such as S
- Your email is hosted on an Exchange server in your office and it is very slow. Outsource this using Hosted Exchange.

PaaS
- Platform delivery model
- Platforms are built upon infrastructure, which is expensive
- Estimating demand is not a science!
- Platform management is not fun!

PaaS: Examples
- You need to host a large file (5 MB) on your website and make it available to 35,000 users for only two months. Use CloudFront from Amazon.
- You want to start storage services on your network for a large number of files and you do not have the storage capacity. Use Amazon S3.

IaaS
- Computer infrastructure delivery model
- A platform virtualization environment
- Computing resources, such as storage and processing capacity
- Virtualization taken a step further

IaaS: Examples
- You want to run a batch job but you don't have the infrastructure necessary to run it in a timely manner. Use Amazon EC2.
- You want to host a website, but only for a few days. Use Flexiscale.

Cloud computing and other computing techniques

The 21st Century Vision of Computing
Leonard Kleinrock, one of the chief scientists of the original Advanced Research Projects Agency Network (ARPANET) project which seeded the Internet, said: "As of now, computer networks are still in their infancy, but as they grow up and become sophisticated, we will probably see the spread of 'computer utilities' which, like present electric and telephone utilities, will service individual homes and offices across the country."

The 21st Century Vision of Computing
Sun Microsystems co-founder Bill Joy also indicated: "It would take time until these markets mature to generate this kind of value. Predicting now which companies will capture the value is impossible. Many of them have not even been created yet."

Definitions: Cloud, Grid, Cluster, Utility
- Utility computing is the packaging of computing resources, such as computation and storage, as a metered service similar to a traditional public utility.
- A computer cluster is a group of linked computers, working together closely so that in many respects they form a single computer.
- Grid computing is the application of several computers to a single problem at the same time, usually a scientific or technical problem that requires a great number of computer processing cycles or access to large amounts of data.
- Cloud computing is a style of computing in which dynamically scalable and often virtualized resources are provided as a service over the Internet.

Grid Computing & Cloud Computing
- They have a lot in common: intention, architecture, and technology
- They differ in: programming model, business model, compute model, applications, and virtualization

Grid Computing & Cloud Computing
The problems are mostly the same:
- manage large facilities;
- define methods by which consumers discover, request, and use resources provided by the central facilities;
- implement the often highly parallel computations that execute on those resources.

Grid Computing & Cloud Computing: Virtualization
- Grid: does not rely on virtualization as much as clouds do; each individual organization maintains full control of its resources
- Cloud: virtualization is an indispensable ingredient for almost every cloud

Any questions or comments?

Outline

- Overview of cloud computing
- Google cloud computing technologies: GFS, Bigtable, and MapReduce
- Yahoo cloud computing technologies and Hadoop
- Challenges of cloud data management

Google cloud computing techniques

The Google File System (GFS)
- A scalable distributed file system for large distributed data-intensive applications
- Multiple GFS clusters are currently deployed. The largest ones have:
  - 1000+ storage nodes
  - 300+ terabytes of disk storage
  - heavy access by hundreds of clients on distinct machines

Introduction
- Shares many of the same goals as previous distributed file systems: performance, scalability, reliability, etc.
- The GFS design has been driven by four key observations of Google's application workloads and technological environment

Intro: Observations 1
1. Component failures are the norm
   - Constant monitoring, error detection, fault tolerance, and automatic recovery are integral to the system
2. Huge files (by traditional standards)
   - Multi-GB files are common
   - I/O operations and block sizes must be revisited

Intro: Observations 2
3. Most files are mutated by appending new data
   - This is the focus of performance optimization and atomicity guarantees
4. Co-designing the applications and APIs benefits the overall system by increasing flexibility

The Design
- A cluster consists of a single master and multiple chunkservers and is accessed by multiple clients

The Master
- Maintains all file system metadata: namespace, access control info, file-to-chunk mappings, chunk (including replica) locations, etc.
- Periodically communicates with chunkservers in HeartBeat messages to give instructions and check state

The Master
- Helps make sophisticated chunk placement and replication decisions, using global knowledge
- For reading and writing, the client contacts the master to get chunk locations, then deals directly with chunkservers
- The master is not a bottleneck for reads/writes

Chunkservers
- Files are broken into chunks. Each chunk has an immutable, globally unique 64-bit chunk handle.
  - The handle is assigned by the master at chunk creation
- Chunk size is 64 MB
- Each chunk is replicated on 3 (default) servers

Clients
- Linked to apps using the file system API.
- Communicate with the master and chunkservers for reading and writing
  - Master interactions only for metadata
  - Chunkserver interactions for data
- Only cache metadata information
  - Data is too large to cache.

Chunk Locations
- The master does not keep a persistent record of the locations of chunks and replicas.
- It polls chunkservers for this information at startup, and whenever chunkservers join or leave.
- It stays up to date by controlling the placement of new chunks and through HeartBeat messages (when monitoring chunkservers)

Operation Log
- Record of all critical metadata changes
- Stored on the master and replicated on other machines
- Defines the order of concurrent operations
- Also used to recover the file system state

System Interactions:

Leases and Mutation Order
- Leases maintain a mutation order across all chunk replicas
- The master grants a lease to one replica, called the primary
- The primary chooses the serial mutation order, and all replicas follow this order
- Minimizes management overhead for the master

Atomic Record Append
- The client specifies the data to write; GFS chooses and returns the offset it writes to and appends the data to each replica at least once
- Heavily used by Google's distributed applications.
  - No need for a distributed lock manager
  - GFS chooses the offset, not the client

Atomic Record Append: How?
- Follows a similar control flow as mutations
- The primary tells secondary replicas to append at the same offset as the primary
- If a record append fails at any replica, it is retried by the client.
  - So replicas of the same chunk may contain different data, including duplicates, whole or in part, of the same record

Atomic Record Append: How?
- GFS does not guarantee that all replicas are bitwise identical.
  - It only guarantees that the data is written at least once as an atomic unit.
- Data must be written at the same offset on all chunk replicas for success to be reported.

Detecting Stale Replicas
- The master keeps a chunk version number to distinguish up-to-date and stale replicas
- The version is increased when granting a lease
- If a replica is not available, its version is not increased
- The master detects stale replicas when chunkservers report their chunks and versions
- Stale replicas are removed during garbage collection

Garbage Collection
- When a client deletes a file, the master logs it like other changes and renames the file to a hidden name.
- The master removes files that have been hidden for longer than 3 days when scanning the file system namespace
  - Their metadata is also erased
- During HeartBeat messages, each chunkserver sends the master a subset of its chunks, and the master tells it which of those chunks no longer have metadata.
  - The chunkserver removes these chunks on its own

Fault Tolerance:

High Availability
- Fast recovery
  - Master and chunkservers can restart in seconds
- Chunk replication
- Master replication
  - "Shadow" masters provide read-only access when the primary master is down
  - Mutations are not done until recorded on all master replicas

Fault Tolerance:

Data Integrity
- Chunkservers use checksums to detect corrupt data
  - Since replicas are not bitwise identical, chunkservers maintain their own checksums
- For reads, the chunkserver verifies the checksum before sending the chunk
- Checksums are updated during writes

Introduction to MapReduce

MapReduce: Insight

"Consider the problem of counting the number of occurrences of each word in a large collection of documents."
How would you do it in parallel?

MapReduce Programming Model

Inspired by the map and reduce operations commonly used in functional programming languages like Lisp.
Users implement an interface of two primary methods:
1. Map: (key1, val1) → (key2, val2)
2. Reduce: (key2, [val2]) → [val3]
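As a rough illustration of these two signatures, the sketch below writes them as Python type aliases and fills them in for the "count of URL access frequency" job. The log-line format (`METHOD URL STATUS`) and the names `map_url_access` and `reduce_url_access` are assumptions made for this example, not part of any MapReduce library.

```python
from typing import Callable, Iterable, Iterator, List, Tuple, TypeVar

K1 = TypeVar("K1")
V1 = TypeVar("V1")
K2 = TypeVar("K2")
V2 = TypeVar("V2")
V3 = TypeVar("V3")

# Map: (key1, val1) -> (key2, val2) pairs
Mapper = Callable[[K1, V1], Iterable[Tuple[K2, V2]]]
# Reduce: (key2, [val2]) -> [val3]
Reducer = Callable[[K2, Iterable[V2]], Iterable[V3]]


def map_url_access(offset: int, log_line: str) -> Iterator[Tuple[str, int]]:
    """key1 = position of the line in the log, val1 = the line itself.

    Assumes a hypothetical 'METHOD URL STATUS' line format.
    Emits (URL, 1) for every well-formed request line.
    """
    fields = log_line.split()
    if len(fields) >= 2:
        yield (fields[1], 1)


def reduce_url_access(url: str, counts: Iterable[int]) -> List[int]:
    """key2 = URL, [val2] = all the 1s emitted for it; returns [total]."""
    return [sum(counts)]


pairs = list(map_url_access(0, "GET /index.html 200"))
total = reduce_url_access("/index.html", [1, 1, 1])
```

Here `map_url_access` has the shape of a `Mapper` with `K1=int, V1=str, K2=str, V2=int`, and `reduce_url_access` has the shape of a `Reducer` with `V3=int`.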

Map operation

Map, a pure function written by the user, takes an input key/value pair, e.g. (doc-id, doc-content), and produces a set of intermediate key/value pairs.
Drawing an analogy to SQL, map can be visualized as the group-by clause of an aggregate query.
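One way to read the SQL analogy: the map function picks the grouping key for each row, and the framework then collects the values that share a key. The sketch below uses a made-up `employees` table and a hypothetical `map_by_dept` function; the in-memory grouping loop only stands in for what a MapReduce runtime would do between the two phases.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Tuple

# Query being mimicked (illustrative only):
#   SELECT dept, AVG(salary) FROM employees GROUP BY dept;

# Input records as (row_id, (name, dept, salary)); sample data is invented.
employees: List[Tuple[int, Tuple[str, str, float]]] = [
    (1, ("alice", "eng", 120.0)),
    (2, ("bob", "ops", 80.0)),
    (3, ("carol", "eng", 100.0)),
]


def map_by_dept(row_id: int, row: Tuple[str, str, float]) -> Iterator[Tuple[str, float]]:
    """Emit (group-by attribute, value to aggregate) for one row."""
    _name, dept, salary = row
    yield (dept, salary)


# Stand-in for the framework: gather intermediate values by key.
groups: Dict[str, List[float]] = defaultdict(list)
for row_id, row in employees:
    for key, value in map_by_dept(row_id, row):
        groups[key].append(value)
```

After this step each department key holds the list of salaries that an aggregate such as AVG would then be applied to.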

Reduce operation

On completion of the map phase, all the intermediate values for a given output key are combined together into a list and given to a reducer.
Reduce can be visualized as an aggregate function (e.g., average) that is computed over all the rows with the same group-by attribute.

Pseudo-code

map(String input_key, String input_value):
    // input_key: document name
    // input_value: document contents
    for each word w in input_value:
        EmitIntermediate(w, "1");

reduce(String output_key, Iterator intermediate_values):
    // output_key: a word
    // intermediate_values: a list of counts
    int result = 0;
    for each v in intermediate_values:
        result += ParseInt(v);
    Emit(AsString(result));

MapReduce: Execution overview
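The execution flow (map every input, group intermediate pairs by key, reduce each group) can be tried out with a single-process simulation of the word-count pseudo-code. This is only a sketch: `run_job` is a hypothetical driver written for this example, and a real MapReduce runtime would distribute the same three steps across many machines.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_fn(input_key: str, input_value: str) -> Iterator[Tuple[str, str]]:
    """input_key: document name; input_value: document contents."""
    for word in input_value.split():
        yield (word, "1")  # EmitIntermediate(w, "1")


def reduce_fn(output_key: str, intermediate_values: Iterable[str]) -> Iterator[str]:
    """output_key: a word; intermediate_values: a list of counts."""
    result = 0
    for value in intermediate_values:
        result += int(value)  # ParseInt(v)
    yield str(result)  # Emit(AsString(result))


def run_job(documents: Dict[str, str]) -> Dict[str, str]:
    """Hypothetical single-process driver: map, shuffle, reduce."""
    # Map phase plus shuffle: collect intermediate values under their key.
    intermediate: Dict[str, List[str]] = defaultdict(list)
    for name, contents in documents.items():
        for key, value in map_fn(name, contents):
            intermediate[key].append(value)
    # Reduce phase: one reduce call per distinct key, in sorted key order.
    output: Dict[str, str] = {}
    for key in sorted(intermediate):
        output[key] = next(reduce_fn(key, intermediate[key]))
    return output


docs = {"doc1": "the quick fox", "doc2": "the lazy dog the end"}
counts = run_job(docs)
```

Counts stay strings here only to mirror the pseudo-code, which emits "1" and parses it back in the reducer.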

MapReduce: Example

MapReduce in Parallel: Example

MapReduce: Fault Tolerance
- Handled via re-execution of tasks.
  - Task completion is committed through the master
- What happens if a mapper fails?
  - Re-execute completed and in-progress map tasks
- What happens if a reducer fails?
  - Re-execute in-progress reduce tasks
- What happens if the master fails?
  - Potential trouble!

MapReduce: Walkthrough of One More Application

MapReduce: PageRank

PageRank models the behavior of a "random surfer".
C(t) is the out-degree of t; d is the damping factor, so (1-d) is the probability of a random jump.
The "random surfer" keeps clicking on successive links at random, not taking content into consideration.
Each page distributes its PageRank equally among all the pages it links to.
The damping factor accounts for the surfer "getting bored" and typing an arbitrary URL.

PageRank: Key Insights

The effect of each iteration is local: iteration i+1 depends only on iteration i.
At iteration i, the PageRank of individual nodes can be computed independently.

PageRank using MapReduce

- Use a sparse matrix representation (M)
- Map each row of M to a list of PageRank "credit" to assign to out-link neighbours.
- These prestige scores are reduced to a single PageRank value for a page by aggregating over them.

PageRank using MapReduce
- Map: distribute PageRank "credit" to link targets
- Reduce: gather up PageRank "credit" from multiple sources to compute the new PageRank value
- Iterate until convergence
Source of image: Lin 2008
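The map/reduce/iterate loop can be sketched in a few lines of single-process Python. Several things here are assumptions rather than the slides' own definitions: the update rule is the commonly used normalized form PR(x) = (1-d)/N + d · Σ PR(t)/C(t), the damping factor defaults to d = 0.85, the toy graph has no dangling pages, and the names `map_credit`, `reduce_credit`, and `pagerank` are invented for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

Graph = Dict[str, List[str]]  # sparse adjacency list: page -> pages it links to


def map_credit(url: str, rank: float, out_links: List[str]) -> Iterator[Tuple[str, float]]:
    """Map: split this page's rank equally among its link targets."""
    if not out_links:  # dangling page: emits nothing (its rank mass is dropped)
        return
    share = rank / len(out_links)
    for target in out_links:
        yield (target, share)


def reduce_credit(url: str, credits: Iterable[float], d: float, n: int) -> float:
    """Reduce: sum incoming credit and apply the damping factor d."""
    return (1.0 - d) / n + d * sum(credits)


def pagerank(graph: Graph, d: float = 0.85, tol: float = 1e-9, max_iter: int = 200) -> Dict[str, float]:
    n = len(graph)
    ranks = {url: 1.0 / n for url in graph}
    for _ in range(max_iter):
        # Map phase followed by grouping of credit by target page.
        received: Dict[str, List[float]] = defaultdict(list)
        for url, links in graph.items():
            for target, credit in map_credit(url, ranks[url], links):
                received[target].append(credit)
        # Reduce phase: one new rank per page.
        new_ranks = {url: reduce_credit(url, received[url], d, n) for url in graph}
        # Convergence check done outside the parallel part.
        delta = sum(abs(new_ranks[url] - ranks[url]) for url in graph)
        ranks = new_ranks
        if delta < tol:
            break
    return ranks


toy_graph: Graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(toy_graph)
```

In this toy graph page `c` collects credit from both `a` and `b`, so it ends up with the highest rank, while `b` only receives half of `a`'s rank.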

Phase 1: Process HTML

- The map task takes (URL, content) pairs and maps them to (URL, (PRinit, list-of-urls))
  - PRinit is the "seed" PageRank for URL
  - list-of-urls contains all pages pointed to by URL
- The reduce task is just the identity function
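A minimal sketch of this phase, using the standard-library `html.parser` to pull link targets out of a page. The seed value `PR_INIT = 1.0`, the `LinkExtractor` class, and the function names are assumptions for illustration; the slides do not say which seed value or parser is used.

```python
from html.parser import HTMLParser
from typing import Iterable, Iterator, List, Tuple

PR_INIT = 1.0  # assumed "seed" PageRank; the source does not give a value


class LinkExtractor(HTMLParser):
    """Collects the href attribute of every <a> tag fed to it."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def map_process_html(url: str, content: str) -> Iterator[Tuple[str, Tuple[float, List[str]]]]:
    """Map: (URL, content) -> (URL, (PRinit, list-of-urls))."""
    parser = LinkExtractor()
    parser.feed(content)
    yield (url, (PR_INIT, parser.links))


def reduce_identity(url: str, values: Iterable[Tuple[float, List[str]]]):
    """Reduce: pass every record through unchanged."""
    for value in values:
        yield (url, value)


page = '<html><body><a href="http://b.example/">B</a> <a href="http://c.example/">C</a></body></html>'
records = list(map_process_html("http://a.example/", page))
```

Relative links are kept exactly as written in the page; resolving them against the page URL would be a further step that this sketch leaves out.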

Phase 2: PageRank Distribution

- The reduce task gets (URL, url_list) and many (URL, val) values
  - Sum the vals and fix up with d to get the new PR
  - Emit (URL, (new_rank, url_list))
- Check for convergence using a non-parallel component

MapReduce: Some More Apps
- Distributed grep
- Count of URL access frequency
- Clustering (K-means)
- Graph algorithms
- Indexing systems
Figure: MapReduce programs in the Google source tree

MapReduce: Extensions and similar apps

- Pig (Yahoo)
- Hadoop (Apache)
- DryadLINQ (Microsoft)

Large-Scale Systems Architecture using MapReduce
- User app
- MapReduce
- Distributed file systems (GFS)

BigTable: A Distributed Storage System for Structured Data

Introduction
- BigTable is a distributed storage system for managing structured data.
- Designed to scale to a very large size
  - Petabytes of data across thousands of servers
- Used for many Google projects
  - Web indexing, Personalized Search, Google Earth, Google Analytics, Google Finance, …
- Flexible, high-performance solution for all of Google's products

Motivation
- Lots of (semi-)structured data at Google
  - URLs: contents, crawl metadata, links, anchors, pagerank, …
  - Per-user data: user preference settings, recent queries/search results, …
  - Geographic locations: physical entities (shops, restaurants, etc.), roads, satellite image data, user annotations, …
- Scale is large
  - Billions of URLs, many versions/page (~20K/version)
  - Hundreds of millions of users, thousands of q/sec
  - 100 TB+ of satellite image data

Why not just use a commercial DB?
- Scale is too large for most commercial databases
- Even if it weren't, the cost would be very high
  - Building internally means the system can be applied across many projects for low incremental cost
- Low-level storage optimizations help performance significantly
  - Much harder to do when running on top of a database layer

Goals
- Want asynchronous processes to be continuously updating different pieces of data
  - Want access to the most current data at any time
- Need to support:
  - Very high read/write rates (millions of ops per second)
  - Efficient scans over all or interesting subsets of data
  - Efficient joins of large one-to-one and one-to-many datasets
- Often want to examine data changes over time
  - E.g. contents of a web page over multiple crawls

BigTable
- Distributed multi-level map
- Fault-tolerant, persistent
- Scalable
  - Thousands of servers
  - Terabytes of in-memory data
  - Petabytes of disk-based data
  - Millions of reads/writes per second, efficient scans
- Self-managing
  - Servers can be added/removed dynamically
  - Servers adjust to load imbalance

Building Blocks
- Building blocks:
  - Google File System (GFS): raw storage
  - Scheduler: schedules jobs onto machines
  - Lock service: distributed lock manager
  - MapReduce: simplified large-scale data processing
- BigTable's uses of the building blocks:
  - GFS: stores persistent data (SSTable file format for storage of data)
  - Scheduler: schedules jobs involved in BigTable serving
  - Lock service: master election, location bootstrapping
  - MapReduce: often used to read/write BigTable data

Basic Data Model
- A BigTable is a sparse, distributed, persistent multi-dimensional sorted map
  (row, column, timestamp) -> cell contents
- Good match for most Google applications

WebTable Example
- Want to keep a copy of a large collection of web pages and related information
- Use URLs as row keys
- Various aspects of a web page as column names
- Store the contents of web pages in the contents: column under the timestamps when they were fetched.

Rows
- The row name is an arbitrary string
  - Access to data in a row is atomic
  - Row creation is implicit upon storing data
- Rows are ordered lexicographically
  - Rows close together lexicographically are usually on one or a small number of machines

Rows (cont.)
- Reads of short row ranges are efficient and typically require communication with a small number of machines.
- Can exploit this property by selecting row keys so they get good locality for data access.
- Example: ,,, vs. edu.gatech.math, edu.gatech.phys, edu.uga.math, edu.uga.phys

Columns
- Columns have a two-level name structure: family:optional_qualifier
- Column family
  - Unit of access control
  - Has associated type information
- Qualifier gives unbounded columns
  - Additional levels of indexing, if desired

Timestamps
- Used to store different versions of data in a cell
  - New writes default to the current time, but timestamps for writes can also be set explicitly by clients
- Lookup options:
  - "Return most recent K values"
  - "Return all values in timestamp range (or all values)"
- Column families can be marked with attributes:
  - "Only retain most recent K values in a cell"
  - "Keep values until they are older than K seconds"

Implementation: Three Major Components
- Library linked into every client
- One master server, responsible for:
  - Assigning tablets to tablet servers
  - Detecting addition and expiration of tablet servers
  - Balancing tablet-server load
  - Garbage collection
- Many tablet servers
  - Each tablet server handles read and write requests to its tablets
  - Splits tablets that have grown too large

Implementation (cont.)
- Client data doesn't move through the master server. Clients communicate directly with tablet servers for reads and writes.
- Most clients never communicate with the master server, leaving it lightly loaded in practice.

Tablets
- Large tables are broken into tablets at row boundaries
  - A tablet holds a contiguous range of rows
    - Clients can often choose row keys to achieve locality
  - Aim for ~100 MB to 200 MB of data per tablet
- Each serving machine is responsible for ~100 tablets
  - Fast recovery: 100 machines each pick up 1 tablet for the failed machine
  - Fine-grained load balancing:
    - Migrate tablets away from an overloaded machine
    - The master makes load-balancing decisions

Tablet Location
- Since tablets move around from server to server, given a row, how do clients find the right machine?
  - Need to find the tablet whose row range covers the target row

Tablet Assignment
- Each tablet is assigned to one tablet server at a time.
- The master server keeps track of the set of live tablet servers and the current assignments of tablets to servers. It also keeps track of unassigned tablets.
- When a tablet is unassigned, the master assigns the tablet to a tablet server with sufficient room.

API
- Metadata operations
  - Create/delete tables, column families, change metadata
- Writes (atomic)
  - Set(): write cells in a row
  - DeleteCells(): delete cells in a row
  - DeleteRow(): delete all cells in a row
- Reads
  - Scanner: read arbitrary cells in a bigtable
    - Each row read is atomic
    - Can restrict returned rows to a particular range
    - Can ask for just data from 1 row, all rows, etc.
    - Can ask for all columns, just certain column families, or specific columns

Refinements: Compression
- Many opportunities for compression
  - Similar values in the same row/column at different timestamps
  - Similar values in different columns
  - Similar values across adjacent rows
- Two-pass custom compression scheme
  - First pass: compress long common strings across a large window
  - Second pass: look for repetitions in a small window
- Speed emphasized, but good space reduction (10-to-1)

Refinements: Bloom Filters
- A read operation has to read from disk when the desired SSTable isn't in memory
- Reduce the number of accesses by specifying a Bloom filter.
  - Allows us to ask if an SSTable might contain data for a specified row/column pair.
  - A small amount of memory for Bloom filters drastically reduces the number of disk seeks for read operations
  - In practice, this means that most lookups for non-existent rows or columns do not need to touch disk

Outline

- Overview of cloud computing

- Google cloud computing technologies: GFS, Bigtable, and MapReduce
- Yahoo cloud computing technologies and Hadoop
- Challenges of cloud data management

Yahoo! Cloud computing

Search Results of the Future
Figure: example sites shown are babycenter, epicurious, LinkedIn, webmd, Gawker, New York Times

What's in the Horizontal Cloud?
- Simple Web Service APIs
- Horizontal Cloud Services
  - Edge Content Services, e.g., YCS, YCPI
  - Provisioning & Virtualization, e.g., EC2
  - Batch Storage & Processing, e.g., Hadoop & Pig
  - Operational Storage, e.g., S3, MObStor, Sherpa
  - Other Services: Messaging, Workflow, virtual DBs & Webserving
- ID & Account Management
- Shared Infrastructure
- Metering, Billing, Accounting
- Monitoring & QoS
- Security
- Common Approaches to QA, Production Engineering, Performance Engineering, Datacenter Management, and Optimization

Yahoo! Cloud Stack
Figure: layers of horizontal cloud services
- EDGE: YCS, YCPI, Brooklyn, …
- WEB: VM/OS, yApache, …
- APP: VM/OS, …
- STORAGE: Sherpa, MObStor, …
- BATCH: Hadoop, …
- Other labels in the figure: Provisioning (Self-serve), Monitoring/Metering/Security, Data Highway, Serving Grid, PHP, App Engine

Web Data Management
- Large data analysis (Hadoop)
  - Scan-oriented workloads
  - Focus on sequential disk I/O
  - $ per CPU cycle
- Structured record storage (PNUTS/Sherpa)
  - CRUD
  - Point lookups and short scans
  - Index-organized table and random I/Os
  - $ per latency
- Blob storage (SAN/NAS)
  - Object retrieval and streaming
  - Scalable file storage
  - $ per GB

The World Has Changed
- Web serving applications need:
  - Scalability! Preferably elastic
  - Flexible schemas
  - Geographic distribution
  - High availability
  - Reliable storage
- Web serving applications can do without:
  - Complicated queries
  - Strong transactions

PNUTS/SHERPA: To Help You Scale Your Mountains of Data

Yahoo! Serving Storage Problem
- Small records: 100 KB or less
- Structured records: lots of fields, evolving
- Extreme data scale: tens of TB
- Extreme request scale: tens of thousands of requests/sec
- Low latency globally: 20+ datacenters worldwide
- High availability: outages cost $millions
- Variable usage patterns: as applications and users change

The PNUTS/Sherpa Solution
- The next-generation global-scale record store
  - Record orientation: routing and data storage optimized for low-latency record access
  - Scale out: add machines to scale throughput (while keeping latency low)
  - Asynchrony: pub-sub replication to far-flung datacenters to mask propagation delay
  - Consistency model: reduce the complexity of asynchrony for the application programmer
  - Cloud deployment model: hosted, managed service to reduce app time-to-market and enable on-demand scale and elasticity

What is PNUTS/Sherpa?
Figure: the Parts table replicated across sites, with rows
    A  42342  E
    B  42521  W
    C  66354  W
    D  12352  E
    E  75656  C
    F  15677  E

CREATE TABLE Parts (
    ID VARCHAR,
    StockNumber INT,
    Status VARCHAR
    …
)

- Parallel database
- Geographic replication
- Structured, flexible schema
- Hosted, managed infrastructure

What Will It Become?
Figure: the same Parts table rows, replicated across sites
CREATE TABLE Parts ( ID VARCHAR, StockNumber INT, Status VARCHA
