Exploring Graph Mamba: A Comprehensive Survey on State-Space Models for Graph Learning

arXiv:2412.18322v1 [cs.LG] 24 Dec 2024

SAFA BEN ATITALLAH, Prince Sultan University, Saudi Arabia and University of Manouba, Tunisia

CHAIMA BEN RABAH, Weill Cornell Medicine, Qatar and University of Manouba, Tunisia

MAHA DRISS, Prince Sultan University, Saudi Arabia and University of Manouba, Tunisia

WADII BOULILA, Prince Sultan University, Saudi Arabia and University of Manouba, Tunisia

ANIS KOUBAA, Prince Sultan University, Saudi Arabia

Graph Mamba, a powerful graph embedding technique, has emerged as a cornerstone in various domains, including bioinformatics, social networks, and recommendation systems. This survey represents the first comprehensive study devoted to Graph Mamba, addressing the critical gaps in understanding its applications, challenges, and future potential. We start by offering a detailed explanation of the original Graph Mamba architecture, highlighting its key components and underlying mechanisms. Subsequently, we explore the most recent modifications and enhancements proposed to improve its performance and applicability. To demonstrate the versatility of Graph Mamba, we examine its applications across diverse domains. A comparative analysis of Graph Mamba and its variants is conducted to shed light on their unique characteristics and potential use cases. Furthermore, we identify potential areas where Graph Mamba can be applied in the future, highlighting its potential to revolutionize data analysis in these fields. Finally, we address the current limitations and open research questions associated with Graph Mamba. By acknowledging these challenges, we aim to stimulate further research and development in this promising area. This survey serves as a valuable resource for both newcomers and experienced researchers seeking to understand and leverage the power of Graph Mamba.

Additional Key Words and Phrases: State Space Models, Mamba Block, Graph Mamba, Graph Learning, Graph Convolutional Network, Applications

ACM Reference Format:
Safa Ben Atitallah, Chaima Ben Rabah, Maha Driss, Wadii Boulila, and Anis Koubaa. 2024. Exploring Graph Mamba: A Comprehensive Survey on State-Space Models for Graph Learning. 1, 1 (December 2024), 35 pages. /10.1145/nnnnnnn.nnnnnnn

1 Introduction

Graph-based learning models, particularly Graph Neural Networks (GNNs), have gained significant traction in recent years due to their ability to effectively capture and process complex relational data. These models have proven advantageous in many different fields where graphs are the typical way to represent data [1]. The increasing significance of GNNs can be attributed to various factors. Graph-structured data arises in many real-world systems, such as social networks, molecular structures, and citation networks [2, 3]. GNNs have a solid ability to leverage relational information and the connections between entities. In addition, different advanced GNN architectures have been proposed with high scalability to handle large-scale graphs, making them suitable for big data applications. This type of learning can be applied to various tasks, including node classification, link prediction, and graph classification.

Authors' Contact Information: Safa Ben Atitallah, satitallah@.sa, Prince Sultan University, Riyadh, Saudi Arabia and University of Manouba, Manouba, Tunisia; Chaima Ben Rabah, Weill Cornell Medicine, Doha, Qatar and University of Manouba, Manouba, Tunisia; Maha Driss, Prince Sultan University, Riyadh, Saudi Arabia and University of Manouba, Manouba, Tunisia; Wadii Boulila, Prince Sultan University, Riyadh, Saudi Arabia and University of Manouba, Manouba, Tunisia; Anis Koubaa, Prince Sultan University, Riyadh, Saudi Arabia.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@.

© 2024 Copyright held by the owner/author(s). Publication rights licensed to ACM. ACM XXXX-XXXX/2024/12-ART
/10.1145/nnnnnnn.nnnnnnn

, Vol. 1, No. 1, Article. Publication date: December 2024.

However, they face several significant challenges that limit their effectiveness in specific scenarios. Most GNNs are restricted in their ability to effectively capture long-range dependencies. They typically rely on message passing between neighboring nodes, which can lead to information dilution over multiple hops. This constraint is particularly problematic in graphs with complex hierarchical structures. In addition, many GNN architectures require multiple rounds of neighborhood aggregation, which is computationally expensive, especially for large-scale graphs. The computational cost grows significantly as the number of layers increases to capture more complex patterns. Furthermore, GNNs usually face memory constraints and increased training time when applied to large graphs [4]. The issue is heightened for dynamic graphs, where the structure changes over time and requires frequent updates to node representations. Sampling techniques have been proposed to address this but can lead to information loss. Some GNN variants have quadratic complexity in the number of nodes or tokens; similar issues arise when computing full graph attention or when dealing with dense graphs. This quadratic scaling significantly impacts performance and limits the application of these models to huge graphs or long sequences.

Indeed, addressing the limitations of current graph-based learning models is crucial for their broader applicability. One promising direction in this effort is the adaptation of State-Space Models (SSMs) to graph learning, which has led to the development of Graph Mamba. SSMs are mathematical models initially designed for sequence modeling in control theory and signal processing. They represent a system's behavior using a set of input, output, and state variables related by first-order differential equations. In the context of machine learning (ML), SSMs can efficiently model long-range dependencies in sequential data. They offer a continuous-time perspective on sequence modeling, which can benefit specific data types.
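Concretely, the first-order formulation just described is the standard linear time-invariant SSM, relating an input u(t), a latent state x(t), and an output y(t):

```latex
\begin{aligned}
x'(t) &= A\,x(t) + B\,u(t) \\
y(t)  &= C\,x(t) + D\,u(t)
\end{aligned}
```

Here A governs the state dynamics, B maps the input into the state, C reads the state out, and D is a direct feedthrough term that is often dropped in ML formulations. Discretizing these equations with a step size Δ turns them into a recurrence that can be run over a token sequence.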

Recently,MambahasemergedasagroundbreakingapproachinArtificialIntelligence(AI),specificallydesignedasaspecializedformoftheSSMtoaddressthecomputationallimitationsoftraditionalDeepLearning(DL)models.Standardmodels,suchasConvolutionalNeuralNetworks(CNNs)andTransformers,faceasignificantchallengerelatedtocomputationalinefficiency,particularlyintasksinvolvinglong-sequencemodeling.Mamba’sprimarygoalistoenhancecomputationalefficiencybyreducingtimecomplexityfromquadratic,asseenintransformers,tolinear.InspiredbyadvancementsinstructuredSSMs,Mambaispresentedtoboostperformanceinareasrequiringlong-rangedependencymodelingandlarge-scaledataprocessing.
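To make the quadratic-versus-linear contrast concrete, the sketch below (illustrative code, not Mamba's actual hardware-aware implementation) runs a discretized SSM over a length-T sequence in a single left-to-right scan, so the cost grows linearly in T rather than quadratically as in self-attention:

```python
import numpy as np

def ssm_scan(A_bar, B_bar, C, u):
    """Run a discretized state-space model over a length-T input sequence.

    x_t = A_bar @ x_{t-1} + B_bar * u_t   (state update)
    y_t = C @ x_t                         (readout)

    One pass costs O(T * d^2) for state size d: linear in T, in contrast
    to the O(T^2) pairwise interactions of self-attention.
    """
    d = A_bar.shape[0]
    x = np.zeros(d)
    ys = []
    for u_t in u:                       # single left-to-right scan
        x = A_bar @ x + B_bar * u_t
        ys.append(C @ x)
    return np.array(ys)

# Toy example: a leaky accumulator whose state decays by 0.9 each step.
A_bar = np.array([[0.9]])
B_bar = np.array([1.0])
C = np.array([1.0])
y = ssm_scan(A_bar, B_bar, C, np.array([1.0, 0.0, 0.0]))
# An impulse input decays geometrically: 1.0, 0.9, 0.81
```

The same recurrence can also be evaluated as a parallel (associative) scan, which is what makes SSM training efficient on modern hardware.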

Graph Mamba emerges as a specialized variant of SSMs designed specifically for graph learning. Its primary goal is to address the limitations of traditional GNNs by leveraging the unique strengths of state-space models. The core concept of Graph Mamba is its state-space modeling approach, which employs selective scanning, a powerful mechanism for efficiently processing graph information by dynamically focusing on the most relevant parts of the graph structure. This allows Graph Mamba to manage large-scale and complex graphs with superior computational performance.
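The selective-scanning idea can be sketched as a recurrence whose parameters depend on the current input, letting the model decide at each step what to retain or forget. The parameterization below (the `W_delta` gate and the simple Euler discretization) is a hypothetical illustration of the principle, not the exact Mamba or Graph Mamba formulation:

```python
import numpy as np

def selective_scan(u, W_delta, A, B, C):
    """Sketch of a *selective* SSM scan: the step size delta_t (and hence
    the effective state transition) is a function of the current input,
    so the recurrence can emphasize or suppress information dynamically.

    u : (T, f) inputs; A : (d, d); B : (d, f); C : (f, d);
    W_delta : (f,) weights producing a per-step scalar gate.
    All names here are illustrative, not the real Mamba parameters.
    """
    d = A.shape[0]
    x = np.zeros(d)
    ys = []
    for u_t in u:
        delta = np.log1p(np.exp(W_delta @ u_t))  # softplus keeps delta > 0
        A_bar = np.eye(d) + delta * A            # crude Euler discretization
        x = A_bar @ x + delta * (B @ u_t)        # input-dependent update
        ys.append(C @ x)
    return np.array(ys)
```

In Graph Mamba, an ordering of nodes (or subgraphs) plays the role of the sequence `u`, and the input-dependent update lets the scan focus on the most relevant parts of the graph structure.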

Recently, there has been an increasing interest in Graph Mamba, as shown by the growing number of articles. This survey aims to investigate the potential of integrating graph structures with Mamba frameworks to enhance representation learning and scalability. Through a comparative analysis of existing literature and empirical studies, this survey evaluates the performance of Graph Mamba against traditional ML methods.

1.1 Related Surveys

This section provides a thorough summary of essential survey studies from two research fields: GNN architectures and the Mamba framework.

1.1.1 Surveys on Graph Neural Networks: Advancements and Applications. GNNs have found applications in a variety of domains, including computer vision, recommendation systems, fraud detection, and healthcare. Several comprehensive surveys have been elaborated on GNNs. In [5], the authors presented a comprehensive review of GNNs, emphasizing their evolution, essential concepts, and numerous potential applications of this cutting-edge technology. GNNs transformed ML by effectively modeling relationships in graph-structured data, overcoming the constraints of conventional neural networks. The study described major GNN architectures such as Graph Convolutional Networks (GCNs), Graph Attention Networks (GATs), and Graph Sample and Aggregate (GraphSAGE), as well as their message-passing mechanisms for repeatedly aggregating information from neighboring nodes. Among the applications investigated were node classification, link prediction, and graph classification across social networks, biology, and recommendation systems. In addition, the paper examined commonly used datasets and Python libraries, explored scalability and interpretability issues, and recommended future research areas to improve GNN performance and expand its applicability to dynamic and heterogeneous graphs. The authors in [6] provided a comprehensive review of GNNs and their applications in data mining and ML fields. It discussed the issues posed by graph-structured data in non-Euclidean domains and how DL methods had been modified to accommodate such data. The authors in [6] presented a new taxonomy that divided GNNs into four types: recurrent GNNs, convolutional GNNs, graph autoencoders, and spatial-temporal GNNs, each of which was customized to a specific graph-based task. The survey also examined the practical uses of GNNs in social networks, recommendation systems, and biological modeling. Furthermore, it reviewed open-source implementations, benchmark datasets, and evaluation criteria utilized in GNN research. It concluded by listing unresolved challenges and proposing future research topics, highlighting the potential for advanced GNN methodologies and applications.

1.1.2 Surveys on Mamba: Trends, Techniques, and Applications. Since its introduction in late 2023, Mamba has received a lot of attention in the DL community because it offers compelling benefits that encourage adoption and exploration across multiple domains. Numerous surveys have been elaborated to investigate Mamba's potential and its applications. For example, Patro et al. in [7] investigated the use of SSMs as efficient alternatives to Transformers for sequence modeling applications. They classified SSMs into three paradigms: gating, structural, and recurrent, and discussed key models like S4, HiPPO, and Mamba. This survey emphasized the use of SSMs in a variety of domains, including natural language processing, vision, audio, and medical diagnostics. It compared SSMs and Transformers based on computational efficiency and benchmark performance. The paper emphasized the need for additional research to improve SSMs' ability to handle extended sequences while maintaining high performance across multiple applications.

Qu et al. [8] gave a thorough explanation of Mamba. They positioned Mamba as a viable alternative to Transformer topologies, particularly for tasks involving extended sequences. The survey presented the fundamentals of Mamba, highlighting its incorporation of features from RNNs, Transformers, and SSMs. It examined improvements in Mamba design, including the creation of Mamba-1 and Mamba-2, which featured breakthroughs such as selective state space modeling, HiPPO-based memory initialization, and hardware-aware computation optimization methods. The authors also looked into Mamba's applications in a variety of domains, including natural language processing, computer vision, time-series analysis, and speech processing, demonstrating its versatility in tasks such as large language modeling, video analysis, and medical imaging. The study identified many problems related to Mamba use, including limitations in context-aware modeling and trade-offs between efficiency and generalization. They also suggested improvements for Mamba's generalization capabilities and computational efficiency, and discussed its applicability in new research areas in the future.

In their recent study, Wang et al. in [9] conducted a comprehensive survey that emphasized the changing landscape of DL technologies. This survey focused primarily on the theoretical foundations and applications of SSMs in fields such as natural language processing, computer vision, and multi-modal learning, with the goal of addressing the computational inefficiencies of conventional models. Experimental comparisons revealed that, while SSMs showed promise in terms of efficiency, they frequently fell short of the performance of cutting-edge Transformer models. Despite this, the findings in this study revealed that SSMs could reduce memory usage and provide insights into future research to improve their performance. This study provided valuable insights into DL architectures, showing that SSMs could play a crucial role in their development.

On the other hand, recent studies have explored Mamba vision techniques, emphasizing their rapid growth and rising importance in computer vision. They highlight Mamba's ability to address the limitations of CNNs and Vision Transformers, particularly in capturing long-range dependencies with linear computational complexity. Rahman et al. [10] investigated the Mamba model, a revolutionary computer vision approach that addressed the constraints of CNNs and Vision Transformers (ViTs). CNNs extract local features efficiently, but ViTs handle long-range dependencies at high cost due to their quadratic self-attention mechanism. Mamba used Selective Structured State Space Models (S4) to handle long-range dependencies with linear computational cost. The survey classified Mamba models into four application categories: video processing, medical imaging, remote sensing, and 3D point cloud analysis. A variety of scanning approaches were also examined, including zigzag, spiral, and omnidirectional methods. The paper emphasized Mamba's computational efficiency and scalability, which make it suitable for high-resolution and real-time operations. The authors also conducted a comparative investigation of Mamba against CNNs and ViTs, proving its advantages on a variety of benchmarks. They also discussed potential future research directions, such as enhancing dynamic state representations and model interpretability. Overall, the article positioned Mamba as a paradigm for balancing performance and computational efficiency in computer vision.

The study presented in [11] provided a comprehensive survey and taxonomy of SSMs in vision-oriented approaches, with a focus on Mamba. A comparison was made between Mamba, CNNs, and Transformers. Due to its ability to handle irregular and sparse data, Mamba has been used for a variety of vision applications, including medical image analysis, remote sensing, and 3D visual identification. This survey classified Mamba models by application areas, such as general vision, multi-modal tasks, and vertical-domain tasks, and presented a comprehensive taxonomy of Mamba variants, as well as detailed descriptions of their principles and applications. The main objective of this survey was to help academics comprehend Mamba's development and potential to improve computer vision, particularly in applications that require computing efficiency and long-range dependency modeling.

1.1.3 Discussion. While the surveys discussed above provide essential insights into a variety of cutting-edge fields, they do have significant limitations. Many surveys on GNNs concentrate on the theoretical foundations and architecture of these networks, paying little attention to practical problems and model scalability in dynamic scenarios. In addition, while these surveys highlight GNNs' relevance in research fields like healthcare and recommendation systems, they often ignore practical challenges such as computational complexity, scalability in large networks, and limited generalization across heterogeneous datasets. Besides, while many surveys discuss Mamba frameworks' potential to overcome Transformer limitations, they tend to focus on theoretical advancements and model efficiency rather than providing an in-depth analysis of real-world limitations, such as trade-offs between computational efficiency and performance across various domains. The available studies on GNNs and Mamba models highlight their distinct improvements but remain limited in scope. GNN surveys investigate graph-based learning but do not explore how graph structures may be incorporated into Mamba frameworks. Mamba-related surveys, on the other hand, concentrate on sequential modeling and computational efficiency without investigating the possibility of combining graph-based methods. This discrepancy creates a significant research gap. Integrating graph structures into Mamba presents transformative capabilities that need a comprehensive review.

1.2 Contributions of the Proposed Survey

There has been a rapid surge in research exploring Graph Mamba's architecture, improvements, and applications across various domains. However, the insights remain distributed across various studies, and there is currently no thorough review that brings these findings together. As the field advances rapidly, a well-structured overview of the latest developments is increasingly valuable. The main contributions of this survey paper are illustrated in the following points:

• This survey offers a comprehensive explanation of the fundamental principles of Graph Mamba and provides a strong theoretical foundation for both researchers and practitioners.

• It examines the most recent enhancements to the original Graph Mamba architecture and evaluates the performance implications of various proposed modifications.

• A comparison of various Graph Mamba variants is presented to emphasize their unique characteristics.

• The survey examines a variety of disciplines in which Graph Mamba has been implemented, such as computer vision, healthcare, and biosignals.

• Additionally, it identifies potential fields for future implementations of Graph Mamba and addresses the current limitations and open research questions in this context.



1.3 Paper Organization

This survey provides a comprehensive overview of Graph Mamba state space models, including their architectures, applications, challenges, and potential future directions. We explore the advantages and disadvantages of existing Graph Mamba models and discuss their prospects for future development. The paper is organized as follows: Section 2 discusses the preliminaries and key terms related to Graph Neural Networks, State Space Models, and Mamba. In Section 3, we delve into various Graph Mamba architectures. Section 4 highlights recent applications of Graph Mamba. Sections 5 and 6 present benchmarks and a comparative analysis of results demonstrating Graph Mamba's performance across different tasks. Section 7 outlines the limitations of applying Graph Mamba. Section 8 explores emerging areas and future research directions. Finally, we conclude the work in Section 9.

2 Preliminaries

This section reviews the foundations of GNNs and SSMs and how they are integrated in the Graph Mamba framework.

2.1 Graph Neural Networks (GNNs)

GNNs have developed as a strong class of DL models built for graph-structured data. Unlike standard ML models, which often operate on fixed-size inputs such as pictures or sequences, GNNs are specially designed to handle non-Euclidean data, represented as nodes and edges [1]. This makes GNNs ideal for tasks that need complicated relational data, such as social networks, knowledge graphs, chemical structures, and recommendation systems. Graphs are inherently adaptable and can represent a broad range of data formats. Standard DL models, such as CNNs, perform well with structured data like grids or sequences but fail to generalize to graph data. GNNs address this drawback by learning representations of nodes, edges, and graphs in a way that captures both the local neighborhood information and the global structure of the graph. Indeed, GNNs are based on the idea of message passing, in which each node in the network gathers information from its neighbors to update its representation. This method enables GNNs to effectively capture both local patterns and long-range relationships throughout the graph by propagating information through a set of layers. In the following subsections, we present an overview of some popular GNN architectures proposed in the literature.

2.1.1 Graph Convolutional Networks (GCNs). GCNs, introduced by Kipf et al. [12], are a specialized type of GNN created to work with graph-based data. The core idea is to take the concept of convolution, which is so effective in image processing with grids of pixels, and adapt it to the irregular structure of graphs. In contrast to conventional CNNs that depend on static grids, GCNs execute localized convolutions at each node, aggregating information from adjacent nodes. This enables GCNs to understand the links and patterns inside the graph structure in a manner that conventional CNNs cannot. The propagation rule for a GCN layer is represented as:

h_i^{(l+1)} = \sigma\left( \sum_{j \in \mathcal{N}_i \cup \{i\}} \frac{1}{c_{ij}} W^{(l)} h_j^{(l)} \right)    (1)

where i is the node being processed, \mathcal{N}_i is the set of nodes that are neighbors of i, h_i^{(l)} is the mathematical representation of i at layer l, W^{(l)} is the layer's weight matrix, \sigma is a non-linear activation, and c_{ij} serves as a normalization factor to account for differences in the number of neighbors.

The graph convolution process is carried out repeatedly over many layers, which helps the model capture more complicated connections and higher-level relationships in the graph structure.
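A minimal dense-matrix sketch of one GCN layer, using the symmetric normalization c_ij = sqrt(deg_i · deg_j) with self-loops from the original GCN formulation (toy sizes; a real implementation would use sparse operations):

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN propagation step: H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W).

    A : (n, n) adjacency matrix, H : (n, f_in) node features,
    W : (f_in, f_out) learnable weights.
    """
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    d = A_hat.sum(axis=1)                     # node degrees
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))    # symmetric normalization
    return np.maximum(0, D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W)

# Two connected nodes with scalar features; identity weights, so the
# layer simply averages each node's feature with its neighbor's.
A = np.array([[0.0, 1.0], [1.0, 0.0]])
H = np.array([[1.0], [3.0]])
W = np.array([[1.0]])
H_out = gcn_layer(A, H, W)    # both rows become 2.0
```

Stacking several such layers lets information propagate over correspondingly longer paths in the graph.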

2.1.2 Graph Attention Networks (GATs). In [13], GATs were proposed by Veličković et al.; they are a type of GNN designed to address limitations in traditional GNNs. They are specially designed for complex connections and irregular graph structures. Their key innovation is an attention mechanism that selectively aggregates information from neighboring nodes, allowing them to focus on the most relevant inputs. This method assigns different weights to each neighbor, emphasizing the importance of specific nodes during aggregation and improving the model's ability to capture meaningful relationships. The computations made in the GAT layer are presented in the following Equation 2:

h_i^{(l+1)} = \sigma\left( \sum_{j \in N(i)} \alpha_{ij} W^{(l)} h_j^{(l)} \right)    (2)

where i denotes the target node, N(i) represents the set of i's neighbors, and h_i^{(l)} is the representation of node i at layer l. W^{(l)} is the weight matrix shared across layer l, and \alpha_{ij} is the attention weight for the edge between nodes i and j, determined by a learnable attention mechanism.
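Equation 2 can be sketched per node as follows; the concatenation-based scoring with a LeakyReLU follows the original single-head GAT design, while the dense loops and the omitted output non-linearity are simplifications for clarity:

```python
import numpy as np

def gat_layer(A, H, W, a):
    """One GAT layer (single head): attention-weighted neighbor aggregation.

    A : (n, n) adjacency, H : (n, f_in) features, W : (f_in, f_out)
    shared weights, a : (2*f_out,) learnable attention vector.
    The final non-linearity sigma of Eq. 2 is omitted here.
    """
    n = A.shape[0]
    Z = H @ W                              # shared linear transform W h_j
    A_hat = A + np.eye(n)                  # let each node attend to itself
    H_out = np.zeros_like(Z)
    for i in range(n):
        nbrs = np.nonzero(A_hat[i])[0]
        # e_ij = LeakyReLU(a^T [z_i || z_j]) for each neighbor j
        e = np.array([np.concatenate([Z[i], Z[j]]) @ a for j in nbrs])
        e = np.where(e > 0, e, 0.2 * e)    # LeakyReLU, negative slope 0.2
        alpha = np.exp(e - e.max())        # softmax over the neighborhood
        alpha = alpha / alpha.sum()
        H_out[i] = alpha @ Z[nbrs]         # weighted aggregation (Eq. 2)
    return H_out

# With a zero attention vector, every neighbor gets equal weight,
# so this reduces to plain neighborhood averaging.
out = gat_layer(np.array([[0.0, 1.0], [1.0, 0.0]]),
                np.array([[1.0], [3.0]]),
                np.array([[1.0]]),
                np.zeros(2))
```

Multi-head GAT simply runs several such heads in parallel and concatenates (or averages) their outputs.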

2.1.3 Graph Sample and Aggregation (GraphSAGE). GraphSAGE, introduced by Hamilton et al. in [13], is a scalable GNN architecture for large graphs. It learns node embeddings by sampling and aggregating information from local neighbors, allowing inductive learning to generalize to unseen nodes. GraphSAGE consists of two main parts: embedding generation (forward propagation) and parameter learning. The model iteratively traverses neighborhood layers and enables nodes to gather information from their surroundings. The representation for a node U at depth k is updated as follows:

h_U^{(k)} = \sigma\left( W^{(k)} \cdot \mathrm{CONCAT}\left( h_U^{(k-1)}, \mathrm{AGGREGATE}_k(\{ h_u^{(k-1)}, \forall u \in N(U) \}) \right) \right)    (3)

where \sigma is the non-linear activation function, and W^{(k)} is the learnable weight matrix for depth k. The CONCAT operation combines h_U^{(k-1)} with the aggregated data from U's neighbors, denoted N(U), using \mathrm{AGGREGATE}_k, which can be a mean, LSTM, or pooling function. This iterative process enables GraphSAGE to capture complex node relationships in an inductive and scalable way.
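A minimal sketch of Equation 3 with a mean aggregator; for clarity it aggregates over all neighbors rather than a sampled fixed-size subset, which is the sampling step that makes full GraphSAGE scale to large graphs:

```python
import numpy as np

def graphsage_layer(A, H, W):
    """One GraphSAGE step with a mean aggregator (Eq. 3):

    h_v = ReLU(W @ concat(h_v, mean({h_u : u in N(v)})))

    A : (n, n) adjacency, H : (n, f) features, W : (f_out, 2*f) weights.
    Assumes every node has at least one neighbor.
    """
    H_out = []
    for v in range(A.shape[0]):
        nbrs = np.nonzero(A[v])[0]
        agg = H[nbrs].mean(axis=0)          # AGGREGATE_k: mean of neighbors
        z = np.concatenate([H[v], agg])     # CONCAT(own state, aggregate)
        H_out.append(np.maximum(0, W @ z))  # sigma = ReLU
    return np.array(H_out)

# Two connected scalar-feature nodes; W sums the two concatenated parts.
out = graphsage_layer(np.array([[0.0, 1.0], [1.0, 0.0]]),
                      np.array([[1.0], [3.0]]),
                      np.array([[1.0, 1.0]]))
```

Because the update depends only on a node's own state and an aggregate of its neighborhood, the learned weights transfer directly to nodes unseen during training, which is what makes the method inductive.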

2.2 State Space Models (SSMs)

DL has seen a notable transformation with the emergence of Transformer models, which have attained dominance in both computer vision and natural language processing. Their success is attributed to the self-attention mechanism, an effective strategy that enhances model understanding by producing an attention matrix based on query, key, and value vectors [14]. This methodology has transformed how models analyze and comprehend data. However, the Transformer architecture faces a notable challenge: its self-attention mechanism operates with quadratic time complexity. As the input sequence length grows, the computational requirements grow quadratically and create a significant bottleneck, especially when dealing with very long sequences or large datasets. This limitation has pushed research to develop more efficient architectures that can maintain the benefits of self-attention while scaling more effectively to longer inputs.

In this context, Mamba was proposed by Gu et al. [15] based on SSMs. It has gained much interest in recent years due to its effectiveness in matching the performance of Transformers while reducing the overall complexity. SSMs are widely used to represent dynamic
