Towards Data-Efficient Deep Learning with Meta-Learning and Symmetries

Jin Xu
Balliol College
University of Oxford

A thesis submitted for the degree of Doctor of Philosophy in Statistics

Trinity 2023
Acknowledgements

First and foremost, I want to express my deep gratitude to my supervisors, Prof. Yee Whye Teh and Dr. Tom Rainforth. Their unwavering support, careful guidance, and constant inspiration have been invaluable throughout my PhD journey. It has been a privilege to be mentored by them, who I regard as research role models. Their depth and breadth of knowledge have been both humbling and enlightening. Special acknowledgement goes to Yee Whye, who has always been considerate and ready to help in tough times. My heartfelt thanks go to Tom for his guidance during the challenging times brought on by the pandemic.

I would like to extend my gratitude to all my collaborators: Hyunjik Kim, Jean-Francois Ton, Adam Kosiorek, Emilien Dupont, and Kaspar Märtens. Their expertise and feedback have been crucial in improving my work, and I learned a great deal from them. A big thank you to Prof. Ryan Adams from Princeton University and to my internship hosts, James Hensman and Max Croci at Microsoft Research. Their mentorship outside of my PhD life has been an indispensable part of my research experience.

Moreover, I feel extremely fortunate to be surrounded by amazing and caring friends whose names are not possible to enumerate here. Among them are Emilien Dupont, Jean-Francois Ton, Charline Le Lan, Bobby He, Sheheryar Zaidi, Qinyi Zhang, Guneet Dhillon, Andrew Campbell, Chris Williams, Carlo Alfano, Faaiz Taufiq, Anna Menacher and others from our lovely office 1.17; Hanwen Xing, Yanzhao Yang, Ning Miao, Chao Zhang, Yutong Lu, Yixuan He, Xi Lin, Yuan Zhou, Fan Wu, Bohao Yao from the Department of Statistics; Dunhong Jin, Sihan Zhou, Sijia Yao, Huining Yang, Kevin Wang, Natalia Hong, Hang Yuan, Kangning Zhang, Chengyang Wang and many others from other departments at Oxford; Deniz Oktay, Sulin Liu, Jenny Zhan and others from Princeton University; and internship peers at Microsoft Research including Alexander Meulemans and Saleh Ashkboos from ETH.

A special thanks to all university and department staff, especially Chris Cullen for his kind and patient support during difficult times, and to Joanna Stoneham, Stuart McRobert, and others who ensured a smooth PhD experience.

Finally, above all, my deepest thanks go to Yifan Yu for her love and companionship. She immensely enriched my time in Oxford, bringing colour and joy to my life. Additionally, I am eternally grateful to my parents Chengxiang Xu and Feng Chen for giving me the freedom to pursue my passions and for their unquestioning support throughout this journey.
Abstract

Recent advances in deep learning have been significantly propelled by the increasing availability of data and computational resources. While the abundance of data enables models to perform well in certain domains, there are real-world applications, such as in the medical field, where the data is scarce or difficult to collect. Furthermore, there are also scenarios where a large dataset is better viewed as lots of related small datasets, and the data becomes insufficient for the task associated with one of the small datasets. It is also noteworthy that human intelligence often requires only a handful of examples to perform well on new tasks, emphasizing the importance of designing data-efficient AI systems. This thesis delves into two strategies to address this challenge: meta-learning and symmetries. Meta-learning approaches the data-rich environment as a collection of many small, individual datasets. Each of these small datasets represents a distinct task, yet there is underlying shared knowledge between them. Harnessing this shared knowledge allows for the design of learning algorithms that can efficiently address new tasks within similar domains. In comparison, symmetry is a form of direct prior knowledge. By ensuring that models' predictions remain consistent despite any transformation to their inputs, these models enjoy better sample efficiency and generalization.

In the subsequent chapters, we present novel techniques and models which all aim at improving the data efficiency of deep learning systems. Firstly, we demonstrate the success of encoder-decoder style meta-learning methods based on Conditional Neural Processes (CNPs). Secondly, we introduce a new class of expressive meta-learned stochastic process models which are constructed by stacking sequences of neural parameterised Markov transition operators in function space. Finally, we propose group equivariant subsampling/upsampling layers which tackle the loss of equivariance in conventional subsampling/upsampling layers. These layers can be used to construct end-to-end equivariant models with improved data efficiency.
Contents

1 Introduction
  1.1 Motivation
  1.2 Thesis outline
  1.3 Papers

2 Background
  2.1 Meta-learning
    2.1.1 Conventional supervised learning and meta-learning
    2.1.2 Different views of meta-learning
    2.1.3 Common approaches to meta-learning
  2.2 Neural processes
    2.2.1 Stochastic processes
    2.2.2 Neural processes as stochastic processes
    2.2.3 Neural process training objectives
    2.2.4 A meta-learning perspective
  2.3 Symmetries in deep learning
    2.3.1 Group, coset and quotient space
    2.3.2 Group homomorphism, group actions and group equivariance
    2.3.3 Homogeneous spaces and lifting feature maps
    2.3.4 Feature maps in G-CNNs
    2.3.5 Group equivariant neural networks

3 MetaFun: Meta-Learning with Iterative Functional Updates
  3.1 Introduction
  3.2 MetaFun
    3.2.1 Learning functional task representation
    3.2.2 MetaFun for regression and classification
  3.3 Related work
  3.4 Experiments
    3.4.1 1-D function regression
    3.4.2 Classification: miniImageNet and tieredImageNet
    3.4.3 Ablation study
  3.5 Conclusions and future work
  3.6 Supplementary materials
    3.6.1 Functional gradient descent
      Reproducing kernel Hilbert space
      Functional gradients
      Functional gradient descent
    3.6.2 Experimental details

4 Deep Stochastic Processes via Functional Markov Transition Operators
  4.1 Introduction
  4.2 Background
  4.3 Markov neural processes
    4.3.1 A more general form of Neural Process density functions
    4.3.2 Markov chains in function space
    4.3.3 Parameterisation, inference and training
  4.4 Related work
  4.5 Experiments
    4.5.1 1D function regression
    4.5.2 Contextual bandits
    4.5.3 Geological inference
  4.6 Discussion
  4.7 Supplementary materials
    4.7.1 Proofs
    4.7.2 Implementation details
    4.7.3 Data
      Model architectures and hyperparameters
      Computational costs and resources
    4.7.4 Broader impacts

5 Group Equivariant Subsampling
  5.1 Introduction
  5.2 Equivariant subsampling and upsampling
    5.2.1 Translation equivariant subsampling for CNNs
    5.2.2 Group equivariant subsampling and upsampling
    5.2.3 Constructing Φ
  5.3 Application: Group equivariant autoencoders
  5.4 Related work
  5.5 Experiments
    5.5.1 Basic properties: Equivariance, disentanglement and out-of-distribution generalization
    5.5.2 Single object
    5.5.3 Multiple objects
  5.6 Conclusions, limitations and future work
  5.7 Supplementary materials
    5.7.1 Equivariant subsampling and upsampling
      Constructing Φ
      Multiple subsampling layers
    5.7.2 Group equivariant autoencoders
    5.7.3 Proofs
    5.7.4 Implementation details
      Data
      Model architectures
      Hyperparameters
      Computational resources

6 Conclusions and Future Outlook

Bibliography
Chapter 1

Introduction

1.1 Motivation
Recent breakthroughs in deep learning can be largely attributed to the vast amount of data available and the advancement of computational resources [Deng et al., 2009, Raina et al., 2009, Silver et al., 2016, Jumper et al., 2021, Brown et al., 2020a]. While training on large datasets enables deep learning models to excel in certain tasks, many real-world applications only provide limited data for a specific task. For instance, in medical fields, obtaining data, especially for rare diseases, is challenging and often expensive. In drug development or recommendation systems, there will always be insufficient data for new drugs/users, even though abundant data exists for other drugs or users. Therefore, to apply deep learning to these fields, it is vital to develop systems that are data-efficient. Moreover, for advanced AI systems, data-efficiency can be a crucial ingredient: Firstly, AI systems should be able to generalize beyond specific data distributions without relying on data; for instance, an image recognition system should recognize objects regardless of their position or orientation. Secondly, human intelligence can often solve new tasks with just a few examples. Thus, for AI to emulate human-like intelligence, it should also have such capability.
From a Bayesian perspective, learning involves updating our beliefs about a model (represented by $\theta$) given the data, i.e. $p(\theta \mid \mathcal{D}_{\text{data}})$. For a model to learn efficiently from a small amount of data, it is important to start with a good initial guess or "prior" $p(\theta)$. In this thesis, we look at two directions to obtain such a prior for data-efficient learning: The first is meta-learning, which learns the prior (or the shared knowledge) from similar tasks. It can be understood as "learning to learn more efficiently". The second is symmetries in deep learning, which serve as a known prior for certain problems. Symmetry, a fundamental concept in physics, represents a form of prior knowledge that is ubiquitously observed throughout our physical world.
Meta-learning tackles a specific scenario in which the vast pool of data can be viewed as many small datasets, each representing a distinct task. Yet, these tasks contain underlying shared knowledge that can be harnessed to address new tasks within the same category. This scenario is prevalent in many applications. Take, for instance, an online retail company with data from customers worldwide. The data associated with each user is typically sparse. In this context, predicting behaviours for each user constitutes an individual task, but patterns among different users often exhibit similarities. Meta-learning algorithms are designed to handle such circumstances. The goal of meta-learning is to learn data-efficient learning algorithms that can later be applied to a particular task. The training data for meta-learning comprises numerous related tasks, each with a limited set of data points. After the meta-learning phase, the learned learning algorithms can solve a new task in a data-efficient manner. In contrast, the aim of conventional supervised learning is just to learn a predictive model.
Meta-learning problems can be tackled from various perspectives, and these approaches can be understood through different viewpoints such as optimization-based approaches [Ravi and Larochelle, 2016, Finn et al., 2017a], metric-based approaches [Koch, 2015, Vinyals et al., 2016, Sung et al., 2018, Snell et al., 2017], and model-based approaches [Santoro et al., 2016, Mishra et al., 2018, Garnelo et al., 2018a], among others. Note that these views are not exclusive. For example, methods such as Prototypical Networks [Snell et al., 2017], MAML [Finn et al., 2017a], ML-PIP [Gordon et al., 2018] etc. can be reformulated under a model-based framework that uses an encoder-decoder setup. In this setup, the encoder produces a task representation using training data, and the decoder then makes predictions based on the task representation. These approaches transform the meta-learning challenge to resemble a regular learning problem involving sequences, and it is also more computationally efficient if no gradient computation is involved in both the encoder and the decoder, as in CNP-type models [Garnelo et al., 2018a]. Our study in Chapter 3 explicitly adopts this encoder-decoder framework for meta-learning. By using a functional task representation, and iteratively updating the representation directly in function space, we demonstrate that encoder-decoder approaches without gradient information can also be competitive with other approaches, which has not been shown before.
Furthermore, because training data for each task in meta-learning is often limited, uncertainty estimation becomes crucial. Stochastic Processes (SPs) (e.g. Gaussian Processes (GPs)) can be used to make predictions with uncertainty estimation. Thus, learning these processes can be seen as a way to approach meta-learning with uncertainty in mind. In Chapter 4, we propose a new framework to construct expressive neural parameterised SPs by parameterising Markov transitions in function space.
Unlike meta-learning above, which discovers shared knowledge from related tasks, symmetry serves as a direct form of prior or inductive bias, integrated into deep learning models without the need for pre-training. Symmetries refer to transformations that keep certain properties of an object of interest unchanged. These include transformations such as image translation, rotation, or permutation of set elements. By incorporating these symmetries into deep learning models, ensuring that the outputs remain consistent (the same or undergoing the corresponding transformation) despite input transformations, the model inherently generalizes to transformed inputs. Consequently, deep learning models equipped with these symmetries not only become more data-efficient but also generalize better. A simple example of this is Convolutional Neural Networks (CNNs), which are invariant to input translations for classification tasks, and perform significantly better compared to plain feed-forward networks. Earlier research has introduced many methods to build convolutional [Cohen and Welling, 2016, 2017, Cohen et al., 2019] and attention blocks [Hutchinson et al., 2021, Fuchs et al., 2020] that are equivariant w.r.t. various symmetries. However, the pooling layers or subsampling/upsampling layers commonly used in various deep learning architectures break these symmetries [Zhang, 2019]. In Chapter 5, we present group equivariant subsampling/upsampling layers that have exact equivariance.
1.2 Thesis outline

In Chapter 2, we provide a short introduction to meta-learning, neural processes and symmetries in deep learning, to set the stage for later chapters.
In Chapter 3, we introduce an iterative functional encoder-decoder method for supervised meta-learning, which is based on Neural Processes (NPs) [Garnelo et al., 2018a,b]. On standard few-shot classification benchmarks like miniImageNet and tieredImageNet, it is demonstrated that meta-learning methods based on the neural process family can be competitive with or even outperform gradient-based methods such as MAML [Finn et al., 2017a] and LEO [Rusu et al., 2019].
In Chapter 4, we introduce Markov Neural Processes (MNPs), a new class of Stochastic Processes (SPs) which are constructed by stacking sequences of neural parameterised Markov transition operators in function space. The proposed iterative construction adds substantial flexibility and expressivity to the original framework of Neural Processes (NPs) without compromising consistency or adding restrictions. Our experiments demonstrate clear advantages of MNPs over baseline models on a variety of tasks. It is noteworthy that SP models can be viewed through a meta-learning lens, so the proposed method can also be seen as a meta-learning approach with principled uncertainty estimation.
In Chapter 5, we first introduce translation equivariant subsampling/upsampling layers that can be used to construct exact translation equivariant CNNs. We then generalise these layers beyond translations to general groups, thus proposing group equivariant subsampling/upsampling. We use these layers to construct group equivariant autoencoders (GAEs) that allow us to learn low-dimensional equivariant representations. We empirically verify on images that the representations are indeed equivariant to input translations and rotations, and thus generalise well to unseen positions and orientations. We further use GAEs in models that learn object-centric representations on multi-object datasets, and show improved data efficiency and decomposition compared to non-equivariant baselines.
In Chapter 6, we summarize our findings and explore potential avenues for future research to further advance the field.
1.3 Papers

This is an integrated thesis and includes the following published papers.

Chapter 3 contains:

Xu, J., Ton, J. F., Kim, H., Kosiorek, A., & Teh, Y. W. MetaFun: Meta-learning with iterative functional updates. International Conference on Machine Learning (ICML), 2020 [Xu et al., 2020]

Chapter 4 contains:

Xu, J., Dupont, E., Märtens, K., Rainforth, T., & Teh, Y. W. Deep Stochastic Processes via Functional Markov Transition Operators. Advances in Neural Information Processing Systems (NeurIPS), 2023 [Xu et al., 2023]

Chapter 5 contains:

Xu, J., Kim, H., Rainforth, T., & Teh, Y. W. Group equivariant subsampling. Advances in Neural Information Processing Systems (NeurIPS), 2021 [Xu et al., 2021]
Chapter 2

Background

2.1 Meta-learning

2.1.1 Conventional supervised learning and meta-learning
In conventional supervised learning, the objective is to learn a function $f$ that maps an input feature vector $x \in \mathcal{X}$ to an output label $y \in \mathcal{Y}$. Learning is based on example input-output pairs in a training set $\mathcal{D}_{\text{train}} = \{(x_i, y_i)\}_{i=1}^{m}$. Common types of supervised learning tasks include regression, where output labels are real-valued, and classification, where the output labels represent different classes. The function $f$, often referred to as the predictive model, is a member of a hypothesis class $\mathcal{H} := \{f \mid f(x; \phi), \phi \in \mathbb{R}^{d_\phi}\}$.

For each task, there is a risk function $\ell(y, f(x))$ which measures prediction error. As an example, in the context of a regression task, $\ell$ often takes the form of a squared error, $\ell(y, f(x)) = (y - f(x))^2$. The training process of the model $f$ translates to solving an optimization problem defined as follows:

$$\hat{f} = \arg\min_{f \in \mathcal{H}} L(f; \mathcal{D}_{\text{train}}), \qquad L(f; \mathcal{D}_{\text{train}}) := \frac{1}{m} \sum_{i=1}^{m} \ell(y_i, f(x_i)). \qquad (2.1)$$

It is called empirical risk minimization because this objective is an estimation of the population risk $\mathbb{E}_{(x_i, y_i) \sim p(x, y)}[\ell(y_i, f(x_i))]$ based on the empirical distribution of training data.

After training, the model should generalize effectively when presented with a test set, denoted as $\mathcal{D}_{\text{test}} = \{(x_i, y_i)\}_{i=m+1}^{n}$. The model's performance can be assessed using the test risk $L(\hat{f}; \mathcal{D}_{\text{test}})$, which serves as an estimate of the overall population risk using unseen data.
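To make Eq. (2.1) concrete, the following is a minimal NumPy sketch of empirical risk minimization for a linear hypothesis class $f(x; \phi) = \phi^\top x$ with squared-error risk, followed by an estimate of the test risk. The function names (`empirical_risk`, `fit_erm`) and the synthetic data are illustrative assumptions, not part of the thesis.

```python
import numpy as np

def empirical_risk(phi, X, y):
    """Average squared-error risk of the linear model f(x; phi) = X @ phi."""
    return float(np.mean((y - X @ phi) ** 2))

def fit_erm(X_train, y_train, lr=0.1, steps=500):
    """Minimise the empirical risk (Eq. 2.1) by gradient descent on phi."""
    phi = np.zeros(X_train.shape[1])
    m = len(y_train)
    for _ in range(steps):
        grad = (-2.0 / m) * X_train.T @ (y_train - X_train @ phi)
        phi -= lr * grad
    return phi

# Fit on D_train, then estimate the population risk on held-out D_test.
rng = np.random.default_rng(0)
true_phi = np.array([1.0, -2.0, 0.5])          # illustrative ground truth
X_train, X_test = rng.normal(size=(100, 3)), rng.normal(size=(50, 3))
y_train = X_train @ true_phi + 0.1 * rng.normal(size=100)
y_test = X_test @ true_phi + 0.1 * rng.normal(size=50)

phi_hat = fit_erm(X_train, y_train)
print("test risk:", empirical_risk(phi_hat, X_test, y_test))
```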
Figure 2.1: Data for a meta-classification problem. Both the meta-training and meta-test sets consist of tasks (red rectangles) and are presumed to come from the same task distribution $p(\mathcal{T})$. Each of these tasks encompasses its own task-specific training and test sets, which are commonly referred to as the context (yellow labels) and the target (grey labels) respectively.
In practice, it is common to have scenarios where lots of supervised learning tasks are related to each other, yet the number of data points for each individual task is limited. Meta-learning emerges as a new learning paradigm to address such challenges.

Specifically, we have a meta-training set defined as $\mathcal{M}_{\text{train}} = \{(\mathcal{D}_{\text{train}}^{(j)}, \mathcal{D}_{\text{test}}^{(j)}, \ell^{(j)})\}_{j=1}^{M}$ and a meta-test set given by $\mathcal{M}_{\text{test}} = \{(\mathcal{D}_{\text{train}}^{(j)}, \mathcal{D}_{\text{test}}^{(j)}, \ell^{(j)})\}_{j=M+1}^{M'}$. Each element in these meta-datasets is a tuple consisting of a training set (called the context), a test set (called the target) and a risk function (typically the same within a meta-dataset). This 3-tuple characterizes a task $\mathcal{T}_j$ (see Figure 2.1 for an illustration). In supervised learning, we use training data to train a predictive model, hoping it can generalize across the entire data distribution. In meta-learning, the assumption is that there is a common task distribution, denoted as $p(\mathcal{T})$, from which both the meta-training set and the meta-test set are drawn. Meta-learning algorithms aim to use meta-training data to discover learning algorithms that can generalize across the entire task distribution.
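A minimal sketch of this meta-dataset structure follows, using randomly sampled sinusoid regression tasks as a stand-in for $p(\mathcal{T})$. The `Task` container and `make_sine_task` generator are illustrative assumptions, not constructs from the thesis.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple
import numpy as np

Dataset = List[Tuple[float, float]]  # [(x_i, y_i), ...] for a 1-D task

@dataclass
class Task:
    context: Dataset                       # task-specific training set D_train^(j)
    target: Dataset                        # task-specific test set D_test^(j)
    risk: Callable[[float, float], float]  # risk function l^(j)(y, f(x))

def make_sine_task(rng, m=5, n=15):
    """One 1-D regression task: a random sinusoid, m context + (n-m) target points."""
    amp, phase = rng.uniform(0.5, 2.0), rng.uniform(0.0, np.pi)
    xs = rng.uniform(-5.0, 5.0, size=n)
    ys = amp * np.sin(xs + phase)
    pairs = list(zip(xs.tolist(), ys.tolist()))
    squared_error = lambda y, y_hat: (y - y_hat) ** 2
    return Task(context=pairs[:m], target=pairs[m:], risk=squared_error)

rng = np.random.default_rng(0)
meta_train = [make_sine_task(rng) for _ in range(1000)]  # M tasks drawn from p(T)
meta_test = [make_sine_task(rng) for _ in range(100)]    # held-out tasks from p(T)
```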
More specifically, a learning algorithm for a supervised learning task takes in a training set $\mathcal{D}_{\text{train}}$ and a risk function $\ell$, and outputs a predictive model, written as:

$$\hat{f} = \Phi_{\text{ALGO}}(\mathcal{D}_{\text{train}}, \ell). \qquad (2.2)$$

Since $\ell$ is usually fixed, we will omit the dependency on it in subsequent discussions. For a particular task, the learning algorithm $\Phi_{\text{ALGO}}$ can be evaluated by the test risk of the learned predictive model, denoted as:

$$L(\hat{f}; \mathcal{D}_{\text{test}}). \qquad (2.3)$$

Meta-learning finds a learning algorithm based on tasks from the meta-training set $\mathcal{M}_{\text{train}}$, so that this learning algorithm can be more efficiently applied to new tasks, and generalizes across the task distribution $p(\mathcal{T})$. The meta-learning algorithm can be represented as:

$$\Phi_{\text{ALGO}} = \text{MetaAlgo}(\mathcal{M}_{\text{train}}). \qquad (2.4)$$

To evaluate the meta-learning algorithm, we can compute the average test risk over tasks in the meta-test set:

$$\frac{1}{|\mathcal{M}_{\text{test}}|} \sum_{\mathcal{T}_j \in \mathcal{M}_{\text{test}}} L\big(\Phi_{\text{ALGO}}(\mathcal{D}_{\text{train}}^{(j)}); \mathcal{D}_{\text{test}}^{(j)}\big). \qquad (2.5)$$

While it resembles the test loss in supervised learning, the aggregated test risk for a task replaces the traditional risk function for a data point.
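Continuing the meta-dataset sketch above, here is a hedged illustration of Eq. (2.5), reusing the `Task` objects and `meta_test` list from that sketch. The toy `knn_algo` is an illustrative stand-in for a meta-learned $\Phi_{\text{ALGO}}$.

```python
import numpy as np

def knn_algo(context):
    """A toy stand-in for Phi_ALGO: 1-nearest-neighbour prediction (Eq. 2.2)."""
    xs = np.array([x for x, _ in context])
    ys = np.array([y for _, y in context])
    return lambda x: ys[np.argmin(np.abs(xs - x))]

def meta_test_risk(algo, meta_test):
    """Average per-task test risk over meta-test tasks (Eq. 2.5)."""
    risks = []
    for task in meta_test:
        f_hat = algo(task.context)                   # learn from the context
        task_risk = np.mean([task.risk(y, f_hat(x))  # test risk, Eq. (2.3)
                             for x, y in task.target])
        risks.append(task_risk)
    return float(np.mean(risks))

print("meta-test risk:", meta_test_risk(knn_algo, meta_test))
```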
It is worth noting that while we focus on supervised learning tasks here, meta-learning can be extended to unsupervised learning [Edwards and Storkey, 2016, Reed et al., 2018, Hsu et al., 2018] or reinforcement learning [Wang et al., 2016, Finn et al., 2017a,b].
2.1.2 Different views of meta-learning

Bi-level optimization view. Let us assume both the predictive model $f$ and the learning algorithm $\Phi_{\text{ALGO}}$ can be parameterised, and the parameters are denoted as $\phi$ and $\theta$ accordingly. That is to say, the learning algorithm can be written as:

$$\phi = \Phi_{\text{ALGO}}(\mathcal{D}_{\text{train}}; \theta). \qquad (2.6)$$

Meta-learning can be formulated as the following bi-level optimization problem:

$$\min_{\theta} \sum_{\mathcal{T}_j \in \mathcal{M}_{\text{train}}} L\big(f(\,\cdot\,; \phi_j(\theta)); \mathcal{D}_{\text{test}}^{(j)}\big), \qquad (2.7)$$

where the task-specific parameter $\phi_j$ depends on $\theta$ through the inner-loop optimization:

$$\phi_j(\theta) = \Phi_{\text{ALGO}}(\mathcal{D}_{\text{train}}^{(j)}; \theta). \qquad (2.8)$$

Many meta-learning algorithms are developed based on this bi-level optimization view, such as Finn et al. [2017a], Nichol et al. [2018], Ravi and Larochelle [2016].
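A minimal first-order sketch of this bi-level structure for linear regression tasks follows, in the spirit of first-order MAML [Finn et al., 2017a]; it is illustrative only, not the thesis's method. Here $\theta$ is a shared initialisation, $\Phi_{\text{ALGO}}$ is a few gradient steps from $\theta$ (the inner loop), and `tasks` is assumed to be a list of `(X_tr, y_tr, X_te, y_te)` array tuples.

```python
import numpy as np

def inner_loop(theta, X, y, lr=0.01, steps=5):
    """Phi_ALGO(D_train; theta): a few gradient steps from the shared init (Eq. 2.8)."""
    phi = theta.copy()
    for _ in range(steps):
        phi -= lr * (-2.0 / len(y)) * X.T @ (y - X @ phi)
    return phi

def outer_loop(tasks, dim, meta_lr=0.01, epochs=100):
    """Optimise theta so that the adapted phi_j do well on the targets (Eq. 2.7)."""
    theta = np.zeros(dim)
    for _ in range(epochs):
        for X_tr, y_tr, X_te, y_te in tasks:
            phi_j = inner_loop(theta, X_tr, y_tr)
            # First-order approximation: apply the target-set gradient evaluated
            # at phi_j directly to theta, instead of differentiating through
            # the inner loop.
            outer_grad = (-2.0 / len(y_te)) * X_te.T @ (y_te - X_te @ phi_j)
            theta -= meta_lr * outer_grad
    return theta
```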
Hierarchical model view. From a probabilistic perspective, the generative process for each task $\mathcal{T}_j$ can be expressed as:

$$\theta \sim p(\theta), \quad \phi_j \sim p(\phi_j \mid \theta), \quad y_i^{(j)} \sim p(y_i^{(j)} \mid x_i^{(j)}, \phi_j, \theta). \qquad (2.9)$$

Both the training set $\mathcal{D}_{\text{train}}^{(j)}$ and the test set $\mathcal{D}_{\text{test}}^{(j)}$ follow the same distribution (as illustrated in Figure 2.2). This can be seen as a probabilistic hierarchical model where $\theta$ indicates the high-level global parameters for all tasks and $\phi_j$ denotes the low-level local parameters for each task. In this context, meta-learning is about inferring $\theta$ from lots of tasks in the meta-training set, that is $p(\theta \mid \mathcal{M}_{\text{train}})$. Learning, on the other hand, infers $\phi_j$ given the training set $\mathcal{D}_{\text{train}}^{(j)}$ for task $\mathcal{T}_j$, that is $p(\phi_j \mid \theta, \mathcal{D}_{\text{train}}^{(j)})$.

Figure 2.2: Meta-learning as hierarchical models (a remake of Figure 1 in Gordon et al. [2018]). Task-specific parameter $\phi_j$ depends on the global parameter $\theta$. Data points in both the context and the target have the same generative process, which depends on both $\theta$ and $\phi_j$.

Note that $p(\phi_j \mid \theta)$ can be seen as a prior for task $\mathcal{T}_j$ conditioned on $\theta$. Therefore, meta-learning can be seen as learning an empirical prior from the meta-training set. Finn et al. [2018], Requeima et al. [2019] adopt this view.
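A hedged sketch of the generative process in Eq. (2.9) for linear-Gaussian tasks follows; the particular distributions and noise scales are illustrative assumptions, not specified in the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)
theta = rng.normal(0.0, 1.0, size=2)  # theta ~ p(theta): global slope/intercept

def sample_task(n=10):
    """Generate one task T_j following Eq. (2.9)."""
    phi_j = theta + rng.normal(0.0, 0.1, size=2)     # phi_j ~ p(phi_j | theta)
    x = rng.uniform(-1.0, 1.0, size=n)
    y = phi_j[0] * x + phi_j[1] + rng.normal(0.0, 0.05, size=n)  # y ~ p(y | x, phi_j)
    return (x[:5], y[:5]), (x[5:], y[5:])            # context and target: same process

tasks = [sample_task() for _ in range(100)]          # a meta-dataset of related tasks
```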
Model-based view. A learning algorithm $f = \Phi_{\text{ALGO}}(\mathcal{D}_{\text{train}})$ can be seen as a function that takes in the entire training set and outputs a predictive model. The model is then used to make predictions on test data in $\mathcal{D}_{\text{test}}$. The learning and prediction processes can thus be conceptualized as sequence-to-sequence mappings. For the sake of brevity, let us use a concise notation for data sequences, such as $x_{1:n} = \{x_1, x_2, \ldots, x_n\}$. For a specific task $\mathcal{T}_j$, making predictions for test set data points based on those from the training set can be described as the following inference task:

$$p(y_{m+1:n} \mid x_{m+1:n}, x_{1:m}, y_{1:m}). \qquad (2.10)$$

From this perspective, meta-learning is about creating this conditional model. Meta-learning only differs from conventional supervised learning in that both the inp
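A minimal forward-pass sketch of this conditional model, in the spirit of Conditional Neural Processes [Garnelo et al., 2018a], follows. The weights here are random and untrained; in practice they would be meta-learned across tasks by maximising the conditional likelihood in Eq. (2.10). The architecture and all names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 16  # dimension of the task representation r

W_enc = rng.normal(0.0, 0.1, size=(2, D))      # encoder: embeds each (x_i, y_i) pair
W_dec = rng.normal(0.0, 0.1, size=(D + 1, 2))  # decoder: maps (r, x*) to (mu, log_sigma)

def cnp_predict(x_ctx, y_ctx, x_tgt):
    """Encode the context into a single vector r, then decode each target input."""
    pairs = np.stack([x_ctx, y_ctx], axis=-1)            # (m, 2) context pairs
    r = np.tanh(pairs @ W_enc).mean(axis=0)              # permutation-invariant summary
    dec_in = np.concatenate([np.repeat(r[None], len(x_tgt), axis=0),
                             x_tgt[:, None]], axis=-1)   # (n - m, D + 1)
    out = np.tanh(dec_in) @ W_dec                        # (n - m, 2)
    mu, sigma = out[:, 0], np.exp(out[:, 1])             # Gaussian predictive params
    return mu, sigma

mu, sigma = cnp_predict(np.array([0.0, 1.0]), np.array([0.5, -0.2]),
                        np.array([0.5, 2.0]))
```

The mean over encoded context pairs makes the task representation invariant to the ordering of the context, which is what lets a single model condition on datasets of varying size.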