Modeling Analytics for Computational Storage
Veronica Lagrange
Memory Solutions Lab
Samsung Semiconductor, Inc.
San Jose, U.S.A.
veronica.l@
Harry (Huan) Li
Memory Solutions Lab
Samsung Semiconductor, Inc.
San Jose, U.S.A.
harry.li@
Anahita Shayesteh
Memory Solutions Lab
Samsung Semiconductor, Inc.
San Jose, U.S.A.
Abstract
Next generation flash storage will be armed with a substantial amount of computing power. In this paper, we investigate opportunities to utilize this computational capability to optimize Online Analytical Processing (OLAP) applications. We have directed our analysis at the performance of a subset of TPC-DS queries using Apache Hadoop clusters and two database engines, Apache SPARK-SQL and Presto¹. We model the expected speed-up achieved by offloading a few operations that are executed first within most SQL plans. Offloading these operations requires minimal cooperation from the database engine, and no changes to the existing plan. We show that the speed-up achieved varies significantly among queries and between engines, and that the queries benefiting the most are I/O heavy with high selectivity, of the “needle in the haystack” variety. Our main contribution is estimating the speed-up anticipated from pushing the execution of a few key SQL building blocks (scan, filter, and project operations) to computational storage when using read-optimized, columnar Apache Parquet format files².
CCS Concepts
• Computing methodologies → Modeling and simulation → Model development and analysis → Model verification and validation;
• Hardware → Communication hardware, interfaces and storage → External storage;
• Information systems → Data management systems → Database management system engines → Database query processing → Query planning;
• Information systems → Data management systems → Database management system engines → Online analytical processing engines;
¹ Presto is a registered trademark of Facebook, Inc.
² An earlier version of this report will appear in the Proceedings of ICPE 2020.
Keywords
Columnar Database, Parquet, SQL, Smart Storage, acceleration, offloading, TPC-DS, Spark, Presto, OLAP
1 Introduction
Current developments in “big data” storage solutions gear towards moving data processing closer to where the data resides, reducing unnecessary movement and speeding up data processing considerably. Computational storage is an emerging trend where a comparatively large amount of data processing occurs inside the storage layer. Examples of new devices exposing flash storage internal computing power include Samsung’s SmartSSD [1], NGD Systems [2], and ScaleFlux [3]. This new functionality signals performance improvement opportunities for I/O heavy workloads containing operations amenable to being completed near the storage source. One of the most critical types of database analytics – OLAP – well exemplifies this type of opportunity. It is typically very I/O intensive and contains quite a few building blocks that may be seamlessly moved to, or executed by, a computational storage device.
Offloading is not a new concept. Network processors, GPUs and, recently, machine learning specialized processors are widely used to accelerate specific compute kernels while freeing CPU resources. We will show that the offloading of many more time-consuming operations from the host CPU to storage improves both workload performance and system efficiency. The immediate benefit, of course, is a sizeable decrease in I/O volume. This reduction in I/O leads to less host resource utilization, which not only improves performance of individual queries, but also increases server capacity. Besides database operations, other frequent operations that can be executed near the storage device include encryption and compression.
Database analytics workloads are especially read-intensive. It is not uncommon for I/O reads to take 90% or more of the total execution time. Offloading some of that to storage reduces I/O bandwidth along with other host resource usage, and may improve performance considerably. Furthermore, SSDs have an internal bandwidth that is much higher than that which is exposed to the host computer through existing channels (SAS, SATA, PCI-E, etc.) [4], which means that computational storage has a large amount of untapped potential to exploit.
This paper discusses the expected performance benefits of offloading some important basic database operations – namely Scan, Filter and Project – to computational storage. We evaluate the performance estimate model using the TPC-DS workload and two database engines running on Hadoop clusters: SPARK-SQL and Presto.
This paper is organized as follows: after covering previous computational storage database offloading work, we explain the OLAP workload selection, and the configuration of our two clusters. In Section 4 we dive into TPC-DS characteristics and examine the overall performance from running on the two Hadoop clusters, which have been the focus of our experimentation. In Section 5, we explain our modeling methodologies, and in Section 6 we describe and analyze results from that modeling. Specifically, we show how a substantial speed-up from computational storage optimization can depend on multiple factors. Finally, we briefly discuss other SQL building blocks amenable to computational storage pushdowns, and conclude.
2 Previous Work
Most previous work on pushing SQL functions down to computational storage concentrates on specific functions of a specific database engine. Summarizer [5] modifies the existing NVMe command interface to implement four operations: initialize variables or set queries; read data and execute computation; read data and filter – the selection case; and the transfer of the output results to the host. From their case studies, using 3 TPC-H queries and a very small scale factor (100 MB – 0.1 SF), we determined that they could do similarity joins as well. They compare different degrees of computation offloading for these three queries. The authors show that somewhat complex computations can be carried out near storage, and briefly discuss the data integration problem: how to combine data from different formats and sources. They concentrate on one specific integration problem, similarity join, and describe the heuristics they use. Left unanswered is the bigger issue of how to integrate truly distinct formats.
YourSQL [6] is based on MariaDB. YourSQL allows for complex query operations to be offloaded to a smart SSD in the form of an ISC task. That paper spends the bulk of its time talking about optimizer heuristics. One very interesting observation from the authors’ performance analysis is that while a typical SQL application – OLTP or OLAP – cannot exhaust NVMe bandwidth, its near-storage implementation can.
Biscuit [7] is what YourSQL uses to enable its computational storage operations. It provides the user application with C++ APIs. The user’s SSD-side C++ program with Biscuit APIs, called an SSDlet, is loaded in the device. A host-side program invokes and coordinates execution of the SSDlet tasks using libsisc; communication is done by linking input and output ports to specific tasks. The authors also claim that the APIs used to access files are nearly identical to standard libraries.
ExtraV [8] is IBM’s effort at computational storage for graph processing based on their CAPRI [9]. This paper describes an FPGA prototype that executes common graph traversal functions near the device. It works like virtual memory for graph applications, as it provides the host with the illusion that the entire graph lives in memory, while it is actually partly stored and compressed in an SSD. The authors have stated that graph processing is mostly done in memory, either in single servers or clusters, and that it cannot be done efficiently when graphs grow beyond the available memory.
PG-Strom [10] is an accelerator for PostgreSQL that offloads part of the SQL workload to a GPU. It supports Joins and Aggregates. However, at the time of that publication [10], all data fed to the GPU came from main memory (not storage).
Netezza was the first successful product to use FPGAs as computational storage computing accelerators for analytics data engines. It does not require any software installation or tuning: it is plug and play. The Netezza database engine is based on Postgres [11], and implements four functions in its FPGA engine: Compress, Project, Restrict and Visibility. Francisco [12] claims that Netezza’s engine decompresses data at wire speed. Project and Restrict operations filter out columns and rows, respectively, based on the parameters in the SELECT and WHERE clauses of a query. The Netezza Visibility engine is focused on database integrity, and filters out rows that should not be seen by the query, such as any rows being inserted by a transaction that has not yet committed.
Computational storage has also attracted interest beyond SQL and database applications. For example, REGISTOR [24] is an FPGA platform applying regex search, on-the-fly, to any file being transferred from an SSD to the host; INSIDER [25], also an FPGA-based drive controller, exposes a virtual file system with embedded programmability, allowing programmers to push down operations customized to the application’s specific needs.
3 Workload and Setup
Here, we explain the TPC-DS benchmark, as well as the two cluster configurations used in the experiments described. Moreover, we describe the two database engines (SPARK-SQL and Presto), and explain the rationale behind using the Parquet file format to offload SQL operations to computational storage.
3.1 TPC-DS
“The TPC Benchmark DS (TPC-DS) is a decision support benchmark that models several generally applicable aspects of a decision support system” [13].
TPC-DS contains 24 tables, organized as a snowflake schema. It contains 6 very large FACT tables, and many small DIMENSION tables. Furthermore, TPC-DS comprises 99 queries, each one representing a different business question. So, even though this is an artificial benchmark, it tries to mirror real-life applications. The schema is scalable, with the smallest dataset being 1GB and the largest 100TB. The 1GB dataset is used for QA only. Performance is measured in Queries per Hour @ Scale Factor (QphDS@SF), and must include multiple tests (pertaining to power, throughput, and data maintenance). In this study, we consider a subset of the power test. For a more detailed explanation of the TPC-DS benchmark, we refer the reader to [14].
TPC-DS has been around since 2007, but did not catch on until recently, after a major re-write, with the first published official report dated March 2018 (Cisco) [15]. As of January 2020, there are only six official reports published. Nonetheless, subsets of TPC-DS are heavily used informally by the industry to demonstrate up and coming trends [16][17]. TPC-DS is one of many Transaction Processing Performance Council (TPC) benchmarks [18], and as such covers enough general OLAP cases to be useful to practitioners.
Because FACT tables are orders of magnitude larger than DIMENSION tables, we will gravitate towards queries that are “FACT table Scan heavy,” as opposed to queries that are “DIMENSION table Scan heavy.”
3.2 Test Configuration
Two clusters are used in this paper, and since they are configured to run SPARK-SQL and Presto, we refer to them simply as the SPARK-SQL cluster and the Presto cluster. Each has eight data nodes, with different hardware in each cluster. The detailed configuration is listed in Table 1. Both engines use Apache Hive metadata and the Parquet file format.
SPARK-SQL is Apache Spark’s high-level tool for structured data processing [19]. It is an in-memory, distributed RDBMS that understands SQL and a Dataset API (available in Java and Scala). User applications interface with SPARK-SQL via a command-line module, JDBC or ODBC. SPARK-SQL also supports reading and writing data stored in an existing Apache Hive installation.
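Category | Item | Spark-SQL | Presto
Data Node Hardware | CPU | Intel(R) Xeon(R) Gold 6152 CPU @ 2.10GHz | Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz
Data Node Hardware | Memory | 256GB | 256GB to 1024GB
Data Node Hardware | Local Storage | 2x NVMe SSD 3.2TB | 3x NVMe SSD 1.6TB
Software Stack | OS | Linux Kernel 4.13.0 | Linux Kernel 4.x.x
Software Stack | SPARK-SQL / Presto | 2.3.0 | 0.205
Software Stack | Hadoop | 2.7.3 | 2.9.0
Software Stack | Hive | 1.2.1000 | 1.2.2
Software Stack | HDFS Replication | 1 | 1
TPC-DS | Scale Factor | 10000 | 10000
TPC-DS | Storage Format | Parquet | Parquet
Table 1 - Cluster Configuration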
Presto is a distributed SQL query engine designed to query large datasets distributed over one or more heterogeneous data sources [20]. Presto provides a CLI interface and query processing (parser, planner, scheduler), but uses data and metadata provided by other software components (Apache HBase, Apache Hive, MySQL, etc.). Presto interacts with these other components via connectors, and this is its claim to fame, as it is possible to combine multiple, different data sources into one query seamlessly. There is no need for very expensive ETL (Extract-Transform-Load) of datasets in order to analyze them.
Similar to classic massively parallel processing (MPP) DBMS [21], Presto is a distributed system that runs on a cluster. A Presto client submits SQL statements to a master daemon coordinator. Using metadata from connectors, the coordinator parses the query, generates the plan, and then schedules and coordinates how it is executed by the workers. Workers get data from connectors, execute assigned tasks, and deliver results to the client. All processing happens in memory, and data is pipelined across the network between different stages.
Parquet is an open source columnar file format that was designed to be used with OLAP systems [22]. The Parquet file format is READ optimized, as inserts or updates can be expensive operations. It was inspired by the “Dremel” paper [23], and is extensively used in the Hadoop ecosystem. Furthermore, each Parquet disk file contains the table’s schema. This feature resolves the issue of the device being aware of the table metadata, a requirement for any computational storage processing. In addition, existing Parquet readers are capable of projecting and filtering certain data types using statistics provided in metadata. Implementing some functionality in a computational storage device complements, and adds to, the existing pushdown capabilities of Parquet.
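The statistics-based skipping mentioned above can be illustrated with a minimal Python sketch. It does not use any Parquet library: `RowGroup`, `prune_row_groups`, `scan_greater_than` and the sample values are hypothetical names introduced here, and the per-group min/max layout is a simplified assumption about how such statistics let a reader skip data for a predicate of the form `value > threshold`.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RowGroup:
    """Hypothetical stand-in for one row group of a columnar file."""
    min_value: int      # smallest value of the filtered column in this group
    max_value: int      # largest value of the filtered column in this group
    values: List[int]   # the column values themselves


def prune_row_groups(groups: List[RowGroup], threshold: int) -> List[RowGroup]:
    """Keep only the row groups that may contain a value > threshold."""
    return [g for g in groups if g.max_value > threshold]


def scan_greater_than(groups: List[RowGroup], threshold: int) -> List[int]:
    """Read the surviving row groups and apply the exact row-level filter."""
    result: List[int] = []
    for group in prune_row_groups(groups, threshold):
        result.extend(v for v in group.values if v > threshold)
    return result


# Hypothetical sample data: three row groups of one integer column.
groups = [
    RowGroup(1, 10, [1, 5, 10]),
    RowGroup(11, 20, [11, 15, 20]),
    RowGroup(21, 30, [21, 25, 30]),
]
```

With a threshold of 15, the first row group is skipped without being read, and only the remaining two are scanned row by row. A computational storage device would add to this by also doing the row-level filtering before data leaves the device.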
TPC-DS queries were downloaded from the TPC website, and results were verified against sample output from the TPC. All queries run sequentially as a single test job. Before each query, the memory cache is cleared. In addition,
• SPARK-SQL is restarted before every query
• Presto is restarted before the job
4 TPC-DS Characterization
In this section, we discuss the many stages (or fragments) of the execution plan generated by the query optimizer. Next, we show SPARK-SQL and Presto query runtime results for TPC-DS. We list them side by side to show that they behave differently, in order to illustrate and explain the different speed-ups that one might see for the same query executed with different engines. Finally, we examine the concept of Scan Ratio, and how we use it to characterize and rank queries.
4.1 Typical TPC-DS query plan
SQL query plans are composed of basic building blocks. They form an execution tree. Each building block typically focuses on one specific operation, and is scheduled by a SQL engine. How these building blocks are assembled dictates query performance. Main building blocks include: Scan, Filter, Project, Aggregate, Sort, Join, Merge, Union. Figure 1 (A) illustrates a typical query sequence, with building blocks being executed from top to bottom. Figure 1 (B) is the building block sequence created by the SPARK-SQL planner for TPC-DS Query 44.
(A) Generic SQL query plan (B) SPARK-SQL plan for Query 44
Figure 1 - SQL query plans
The functionality of some building blocks includes:
• Scan: Read database content from storage to compute host memory and apply any needed transformations
• Filter: Filter table rows in memory with given criteria
• Project: Select table columns in memory
• Join: Combine two tables based on given criteria
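The four building blocks listed above can be sketched in a few lines of Python. This is only an illustration over in-memory lists of dictionaries, not how SPARK-SQL or Presto implement them; the function names, the hash-join strategy and the sample `sales` and `items` tables are assumptions made for the example.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List

Row = Dict[str, Any]


def scan(table: Iterable[Row]) -> Iterator[Row]:
    """Scan: read every row of a table (here, an in-memory list)."""
    for row in table:
        yield dict(row)


def filter_rows(rows: Iterable[Row], predicate: Callable[[Row], bool]) -> Iterator[Row]:
    """Filter: keep only the rows that satisfy the given criteria."""
    return (row for row in rows if predicate(row))


def project(rows: Iterable[Row], columns: List[str]) -> Iterator[Row]:
    """Project: keep only the selected columns."""
    return ({c: row[c] for c in columns} for row in rows)


def join(left: Iterable[Row], right: Iterable[Row], key: str) -> List[Row]:
    """Join: combine rows from two inputs that share the same key value."""
    index: Dict[Any, List[Row]] = {}
    for r in right:
        index.setdefault(r[key], []).append(r)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


# Hypothetical sample data: a small "fact" table and a "dimension" table.
sales = [
    {"item_id": 1, "qty": 3, "store": "A"},
    {"item_id": 2, "qty": 7, "store": "B"},
    {"item_id": 1, "qty": 9, "store": "B"},
]
items = [{"item_id": 1, "name": "pen"}, {"item_id": 2, "name": "ink"}]

# Scan -> Filter -> Project on the fact table, then Join with the dimension table.
big_sales = project(filter_rows(scan(sales), lambda r: r["qty"] > 5), ["item_id", "qty"])
result = join(big_sales, scan(items), "item_id")
```

The Scan, Filter and Project steps form a simple pipeline over one table, which is what makes them natural candidates for execution close to the data, whereas Join needs rows from two inputs.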
Figure 2 - TPC-DS runtime query comparison (query runtime in seconds)
4.2 Performance of all queries
Figure 2 shows the runtime for all TPC-DS queries for SPARK-SQL and Presto. With the 10TB dataset, SPARK-SQL completes 91 queries and Presto completes 61. Both database engines store all intermediate results in memory, and the queries that failed incurred an “out-of-memory” error. Query runtime ranges widely, from less than a minute to many hours. We have not matched our cluster hardware configurations for SPARK-SQL and Presto, as it is not our goal to compare the performance between them. The point of this paper is to showcase a subset of the many different system parameters influencing the potentially substantial speed-up afforded by computational storage devices. We demonstrate that even though computational storage can provide impressive speed-ups, the benefits vary significantly depending on many other parameters such as table size, selectivity, query plan, etc.
In the following sections, we will select five queries from each cluster based on system characterization of the queries and potential offload benefits, and provide further analysis of each.
4.3 Scan Ratio
Scan Ratio is defined as the total CPU time spent on database Scan operations, divided by the total CPU time consumed by the query. The CPU time is reported by the query planner of the database engine. This time is not the wall clock time and should not be confused with query runtime.
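The definition above amounts to a single division. The following Python sketch states it explicitly; the function name `scan_ratio` and its parameter names are chosen here for illustration and do not come from either database engine.

```python
def scan_ratio(scan_cpu_seconds: float, total_cpu_seconds: float) -> float:
    """Fraction of a query's total CPU time that is spent in Scan operations.

    Both inputs are CPU times as reported by the engine, not wall clock times.
    """
    if total_cpu_seconds <= 0:
        raise ValueError("total CPU time must be positive")
    if not 0 <= scan_cpu_seconds <= total_cpu_seconds:
        raise ValueError("scan CPU time must lie between 0 and the total CPU time")
    return scan_cpu_seconds / total_cpu_seconds
```

For example, a query whose Scan operations consume 930 CPU seconds out of 1,000 CPU seconds in total has a Scan Ratio of 0.93, i.e. 93%.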
For TPC-DS queries, the Scan Ratio ranges from near 0% to ~93% on SPARK-SQL and up to nearly 100% for Presto. In Figure 3, queries are sorted by their Scan Ratio, from left to right. Q9, with the highest Scan Ratio, is furthest to the right. Notice that most CPU intensive queries have a small Scan Ratio, but not all. Some complex queries, such as Q44, are both compute and I/O intensive.
A high Scan Ratio does not necessarily mean the query reads more data from storage; it only indicates that time spent on I/O is higher relative to other query operations. For example, Query 45 has a total disk read of ~1.3TB, yet its Scan Ratio is only 2.99%. For Query 9, which has the highest Scan Ratio of ~93%, total disk read is only ~105GB. Although the total query runtime difference is not large (Q9: 212.36 sec.; Q45: 176.02 sec.), the CPU cycles spent on non-I/O operations caused the Scan Ratio to be lower for Q45.
A high Scan Ratio indicates that a query is a strong candidate for computational storage optimization, since its I/O operations are likely to be in its critical path, while a low Scan Ratio indicates that operations other than I/O are the bottleneck.
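To make the Q9 versus Q45 comparison concrete, the short Python sketch below ranks the two queries once by Scan Ratio and once by data read. The numbers are the approximate values quoted in the text (with 1.3TB written as 1300GB, a decimal-unit assumption); the `QueryStats` type is a hypothetical container introduced for this example.

```python
from typing import NamedTuple


class QueryStats(NamedTuple):
    name: str
    scan_ratio: float    # fraction of CPU time spent in Scan
    disk_read_gb: float  # total data read from disk, in GB
    runtime_sec: float   # query wall clock runtime


# Approximate values quoted in the text for Q45 and Q9.
stats = [
    QueryStats("Q45", 0.0299, 1300.0, 176.02),
    QueryStats("Q9", 0.93, 105.0, 212.36),
]

# Ranking by Scan Ratio (candidates for offloading first).
by_scan_ratio = sorted(stats, key=lambda q: q.scan_ratio, reverse=True)
# Ranking by raw I/O volume gives the opposite order.
by_disk_read = sorted(stats, key=lambda q: q.disk_read_gb, reverse=True)
```

Sorting by Scan Ratio puts Q9 first, while sorting by disk read puts Q45 first, which is why raw I/O volume alone is not used here to pick offloading candidates.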
5 Offloading Model
Here, we explain how we selected each plan stage to be offloaded to computational storage, followed by a detailed description of the model methodology used with both database engines. Notice that the methods are somewhat different, which we chose to do in order to cover more aspects of the offloading process.
Figure 3 - TPC-DS Scan Ratio and CPU utilization
5.1 Offloading components (or kernels)
In this section, we explore opportunities to offload operations from the host to computational storage. In order to execute a query, data flows from the leaves of the plan to the root. Usually the leaves contain some form of SCAN operation: table rows and columns are read in (usually from disk, unless this data was previously cached). The SCAN operation usually includes some sort of data transformation, from the format on disk to the one understood by the database engine. Once a table is scanned (or sometimes while the table is being scanned), rows may be filtered or projected. The next plan steps may contain aggregates, sorts, joins, window functions, or other advanced data transformations. Operations near the leaves will generally be “easier” to push down to computational storage. Basic SCANs, FILTERs, and PROJECTIONs may happen with virtually no change to the database engine query plan. More aggressive pushdown optimizations are possible, but require the cooperation of the database engine, and re-factoring of the query plan.
For example, in Figure 1 (B), we observe this pattern in both FACT table and DIMENSION table I/O. By combining “Scan,” “Filter” and “Project” into a new building block, we can estimate the performance benefit of offloading this new building block (“Scan/Filter”) to computational storage. Even with “Scan/Filter” offloading, the SPARK-SQL plan for Query 44 still looks the same.
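As a rough illustration of how the leaf pattern described above might be located in a plan tree, the Python sketch below marks every leaf chain made only of Scan, Filter and Project nodes as a single “Scan/Filter” block. The `PlanNode` class, the helper names and the toy plan are hypothetical; real SPARK-SQL or Presto plans carry far more detail, and this is not how either engine represents them.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

OFFLOADABLE = {"Scan", "Filter", "Project"}


@dataclass
class PlanNode:
    """Hypothetical, minimal query plan node: an operator and its inputs."""
    op: str
    children: List["PlanNode"] = field(default_factory=list)


def is_offloadable_chain(node: PlanNode) -> bool:
    """True if the subtree is a single chain of Filter/Project ending in a Scan leaf."""
    if node.op not in OFFLOADABLE:
        return False
    if node.op == "Scan":
        return not node.children
    return len(node.children) == 1 and is_offloadable_chain(node.children[0])


def fuse_leaves(node: PlanNode) -> PlanNode:
    """Return a copy of the plan with each leaf chain replaced by one Scan/Filter node."""
    if is_offloadable_chain(node):
        return PlanNode("Scan/Filter")
    return PlanNode(node.op, [fuse_leaves(child) for child in node.children])


def shape(node: PlanNode) -> Tuple[str, list]:
    """Nested (op, children) view of a plan, convenient for comparisons."""
    return (node.op, [shape(child) for child in node.children])


# Toy plan: Join(Project(Filter(Scan)), Sort(Filter(Scan)))
plan = PlanNode("Join", [
    PlanNode("Project", [PlanNode("Filter", [PlanNode("Scan")])]),
    PlanNode("Sort", [PlanNode("Filter", [PlanNode("Scan")])]),
])
fused = fuse_leaves(plan)
```

Everything above the fused leaves (here, the Sort and the Join) is left untouched, which mirrors the point that the rest of the plan keeps its shape.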
5.2 SPARK-SQL model methodology
The performance estimate model for SPARK-SQL is based on how the database engine plan is executed – in stages with dependencies. We assume there is no resource limitation on the number of stages that can be executed concurrently.
For example, Figure 4 shows a generic query that involves 3 tables: 1 DIMENSION table and 2 FACT tables. Stage-0 reads the content of the DIMENSION table, while reading FACT tables happens in Stage-1 and Stage-2. Then, Stage-3 and Stage-4 sort the results from Stage-1 and Stage-2. The results are subsequently passed to Stage-5 for the final Join operation.
Figure 4 - Query Stage Scheduling
The first 3 stages (0, 1 and 2) include Scan/Filter/Project operations, as marked with light dot shade in Figure 4. The time spent on these operations is 1, 5 and 8 seconds respectively, and they could be offloaded to computational storage. The offloaded execution time is calculated as follows:
• Reserve 1 second for offloading-related handshaking. The reserved time is an arbitrary number.
• Assume that the Filter runtime on the device is at wire speed and can be omitted. This is an optimistic assumption that provides an upper bound for our analysis. The actual Filter runtime depends on the compute/IO capabilities of the device, and can be further improved with pre-processing in the device.
• Add the time of result data transfer between the device and the host, calculated based on the device Read bandwidth specification; in this paper, 3 GB/sec has been used.
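The three rules above can be written as a small Python function. This is a sketch under stated assumptions: the names `offloaded_time`, `HANDSHAKE_SEC` and `DEVICE_READ_BW` are introduced here, and 3 GB/sec is interpreted as 3e9 bytes per second.

```python
HANDSHAKE_SEC = 1.0    # reserved for offloading-related handshaking (arbitrary, per the text)
DEVICE_READ_BW = 3e9   # device Read bandwidth: 3 GB/sec, read here as 3e9 bytes/sec


def offloaded_time(result_bytes: float,
                   handshake_sec: float = HANDSHAKE_SEC,
                   bandwidth: float = DEVICE_READ_BW) -> float:
    """Estimated time of an offloaded Scan/Filter/Project, in seconds.

    Filter time on the device is omitted (assumed to run at wire speed), so only
    the handshake and the transfer of the filtered result to the host remain.
    """
    if result_bytes < 0 or bandwidth <= 0:
        raise ValueError("result size must be >= 0 and bandwidth must be > 0")
    return handshake_sec + result_bytes / bandwidth
```

For instance, a filtered result of 3 GB would be estimated at 2 seconds: 1 second of handshake plus 1 second of transfer.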
With these assumptions, the example execution time can be reduced from 18 seconds to 12 seconds (see Figure 5).
Figure 5 - SPARK-SQL Offload Model
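Since stages run concurrently whenever their dependencies allow, the query runtime in this model is the longest dependency path through the stages. The Python sketch below computes that path for the six-stage example. Only the 1, 5 and 8 second scan stages come from the text; the Sort and Join durations, and the 2 seconds assigned to each offloaded scan (1 second handshake plus an assumed 1 second transfer), are hypothetical values picked so that the totals match the 18 and 12 seconds quoted above.

```python
from typing import Dict, List


def query_runtime(durations: Dict[str, float], deps: Dict[str, List[str]]) -> float:
    """Longest path through the stage graph, assuming unlimited stage concurrency."""
    finish: Dict[str, float] = {}

    def finish_time(stage: str) -> float:
        # A stage starts when all of its dependencies have finished.
        if stage not in finish:
            start = max((finish_time(d) for d in deps.get(stage, [])), default=0.0)
            finish[stage] = start + durations[stage]
        return finish[stage]

    return max(finish_time(stage) for stage in durations)


# Stage dependencies as described for Figure 4.
deps = {
    "stage-3": ["stage-1"],
    "stage-4": ["stage-2"],
    "stage-5": ["stage-0", "stage-3", "stage-4"],
}

# Scan stage times (1, 5, 8 s) are from the text; the rest are assumed.
baseline = {"stage-0": 1.0, "stage-1": 5.0, "stage-2": 8.0,
            "stage-3": 6.0, "stage-4": 6.0, "stage-5": 4.0}

# Offloaded scans: 1 s handshake plus an assumed 1 s result transfer each.
offloaded = dict(baseline)
for scan_stage in ("stage-0", "stage-1", "stage-2"):
    offloaded[scan_stage] = 2.0
```

With these inputs, the baseline critical path is Stage-2, Stage-4, Stage-5 (8 + 6 + 4 = 18 seconds), and after offloading it shrinks to 2 + 6 + 4 = 12 seconds.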
As the most fundamental step in building the estimate model, we need to know the time spent for Scan/Filter/Project on each SPARK-SQL query stage. Fortunately, with SPARK-SQL the log file provides the following key logging information (Figure 6):
• $\mathit{MeasWClockTime}_{Stage}$: The stage wall clock runtime
• $\mathit{ThrTime}_{Stage}$: Total execution time for the stage from all execution threads. This is not wall clock time
• $\mathit{ThrTime}_{Op}$: The execution time breakdown for each operation (Scan, Filter, Project)
With the above information, the estimated wall clock time spent on each operation (Scan, Filter or Project) can be calculated as

$$\mathit{EstWClockTime}_{Op} = \mathit{MeasWClockTime}_{Stage} \times \frac{\mathit{ThrTime}_{Op}}{\mathit{ThrTime}_{Stage}}$$

In addition to Scan time, we also consider the following:
$\mathit{WClockTime}_{Init}$ - The time to initialize computational storage for offloading. We always assume one second for the estimation calculation.
$\mathit{WClockTime}_{Transfer}$ - The time required to transfer the results from the offloading device back to the host. It is calculated based on the Read bandwidth of the computational device. In our model, the Filtered result is usually less than 0.5% of the results that are unfiltered. It would take only a fraction of a second to read back to the host, therefore we ignore this time.
With the Parquet format, we assume that no Project operation is needed, so Project time is omitted.
With the above assumptions, the estimated stage runtime with offloading for SPARK-SQL is calculated as:

$$\mathit{EstWClockTime}_{Offload} = \mathit{WClockTime}_{Init} + \mathit{EstWClockTime}_{Scan}$$
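The two estimates above can be sketched in Python as follows. This is one reading of the model, under these assumptions: the wall clock time of an operation is the stage wall clock time scaled by that operation's share of the stage thread time, and the offloaded stage time is the initialization time plus the estimated Scan time, with Filter, Project and result transfer omitted. The function and parameter names, and the sample numbers in the usage note, are hypothetical.

```python
def est_op_wallclock(meas_wclock_stage: float,
                     thr_time_op: float,
                     thr_time_stage: float) -> float:
    """Estimated wall clock time of one operation within a stage.

    Scales the measured stage wall clock time by the operation's share of the
    total thread time of the stage.
    """
    if thr_time_stage <= 0:
        raise ValueError("total thread time of the stage must be positive")
    return meas_wclock_stage * thr_time_op / thr_time_stage


def est_offloaded_stage(meas_wclock_stage: float,
                        thr_time_scan: float,
                        thr_time_stage: float,
                        init_sec: float = 1.0,
                        transfer_sec: float = 0.0) -> float:
    """Estimated stage runtime with offloading (assumed reading of the model).

    init_sec is the one second reserved to initialize the device; transfer_sec
    defaults to 0 because the filtered result is assumed small enough to ignore.
    """
    scan_sec = est_op_wallclock(meas_wclock_stage, thr_time_scan, thr_time_stage)
    return init_sec + transfer_sec + scan_sec
```

As a made-up example, a stage measured at 10 seconds of wall clock time, with 40 thread-seconds of Scan out of 160 thread-seconds in total, gives an estimated Scan time of 2.5 seconds and an estimated offloaded stage time of 3.5 seconds.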
Figure 6 - One SPARK-SQL Query Stage with Statistics
5.3 Presto Model methodology
To model pushdown benefits of Scan/Filter/Project operations, we create and populate smaller tables we call “model tables.” These “model tables” contain only the rows and columns that would be selected by a computational storage engine executing the Scan/Filter/Project operations defined by the query. We repeat the query using the model table, and compare results against the same query using the original tables – see Figure 7. For Presto, both original and model queries generate the same query plan.
Similar to our SPARK-SQL model, the performance difference is the upper bound of the speed-up that a computational storage device would yield, because this model assumes that the storage device would be capable of filtering and projecting rows and columns at wire speed. However, if we take into consideration the higher internal flash storage bandwidth [4], this is a realistic approximation of the expected speed-up.
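The “model table” idea can be illustrated with a small Python sketch. The table contents, the predicate and the toy `query` function are all hypothetical; the point is only that the same query returns the same answer on the original table and on the much smaller model table, so any runtime difference between the two runs can be attributed to the Scan/Filter/Project work that was removed.

```python
from typing import Callable, Dict, List

Row = Dict[str, int]


def build_model_table(table: List[Row],
                      predicate: Callable[[Row], bool],
                      columns: List[str]) -> List[Row]:
    """Keep only the rows and columns an offloaded Scan/Filter/Project would return."""
    return [{c: row[c] for c in columns} for row in table if predicate(row)]


def query(table: List[Row], predicate: Callable[[Row], bool]) -> int:
    """Toy query: total quantity over the rows that match the predicate."""
    return sum(row["qty"] for row in table if predicate(row))


def selective(row: Row) -> bool:
    """Hypothetical highly selective predicate: keeps 1 row in 100."""
    return row["id"] % 100 == 0


# Hypothetical original table with 1,000 rows and 3 columns.
original = [{"id": i, "qty": i % 7, "price": i * 2} for i in range(1000)]

# Model table: only the matching rows, and only the columns the query needs.
model = build_model_table(original, selective, ["id", "qty"])
```

Here the model table holds 10 rows of 2 columns instead of 1,000 rows of 3 columns. In the methodology described above, the speed-up estimate would come from timing the same query on both tables; no timing is attempted in this sketch.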
Figure 7 - Presto Offload Model.
6 Offloading Evaluation
Here, we describe in detail the query selection process, and give a high-level view of the results obtained by the modeling of both database engines. Furthermore, we present side-by-side analysis of the expected speed-up for a few selected queries.
6.1 The queries
In this study, we picked five queries from each configuration for deep analysis. The queries were selected based on where they fall in the different quadrants of the Scan Ratio versus CPU utilization chart (see Figure 8), to cover a wider range of characteristics. Because we focus on offloading Scan/Filter/Project, we want queries that are I/O intensive and show high selectivity when filtering and projecting FACT tables. That is, we look for queries of the “needle in the haystack” variety. Three of the queries (Q9, Q44, and Q75) are found in both studies, while the other two are found exclusively in either SPARK-SQL or Presto. We chose this approach because, due to their different architectures and optimizers, interesting queries in one environment are not necessarily interesting, or possible, in the other. For example, Presto cannot execute Q4 (out-of-memory error).
Using the chart in Figure 8, we selected the following five SPARK-SQL queries for analysis: Queries 9 and 44 have high Scan/Filter ratio; Query 4 has high CPU utilization; Query 72 has the longest