




Towards Data-Efficient Deep Learning with Meta-Learning and Symmetries
Jin Xu
Balliol College
University of Oxford
A thesis submitted for the degree of Doctor of Philosophy in Statistics
Trinity 2023
Acknowledgements
First and foremost, I want to express my deep gratitude to my supervisors, Prof. Yee Whye Teh and Dr. Tom Rainforth. Their unwavering support, careful guidance, and constant inspiration have been invaluable throughout my PhD journey. It has been a privilege to be mentored by them, whom I regard as research role models. Their depth and breadth of knowledge have been both humbling and enlightening. Special acknowledgement goes to Yee Whye, who has always been considerate and ready to help in tough times. My heartfelt thanks go to Tom for his guidance during the challenging times brought on by the pandemic.
I would like to extend my gratitude to all my collaborators: Hyunjik Kim, Jean-Francois Ton, Adam Kosiorek, Emilien Dupont, and Kaspar Märtens. Their expertise and feedback have been crucial in improving my work, and I have learned a great deal from them. A big thank you to Prof. Ryan Adams from Princeton University and to my internship hosts, James Hensman and Max Croci at Microsoft Research. Their mentorship outside of my PhD life has been an indispensable part of my research experience.
Moreover, I feel extremely fortunate to be surrounded by amazing and caring friends whose names are not possible to enumerate here. Among them are Emilien Dupont, Jean-Francois Ton, Charline Le Lan, Bobby He, Sheheryar Zaidi, Qinyi Zhang, Guneet Dhillon, Andrew Campbell, Chris Williams, Carlo Alfano, Faaiz Taufiq, Anna Menacher and others from our lovely office 1.17, Hanwen Xing, Yanzhao Yang, Ning Miao, Chao Zhang, Yutong Lu, Yixuan He, Xi Lin, Yuan Zhou, Fan Wu, Bohao Yao from the department of statistics, Dunhong Jin, Sihan Zhou, Sijia Yao, Huining Yang, Kevin Wang, Natalia Hong, Hang Yuan, Kangning Zhang, Chengyang Wang and many others from other departments at Oxford, Deniz Oktay, Sulin Liu, Jenny Zhan and others from Princeton University, internship peers at Microsoft Research including Alexander Meulemans, Saleh Ashkboos from ETH.
A special thanks to all university and department staff, especially Chris Cullen for his kind and patient support during difficult times, and to Joanna Stoneham, Stuart McRobert, and others who ensured a smooth PhD experience.
Finally, above all, my deepest thanks go to Yifan Yu for her love and companionship. She immensely enriched my time in Oxford, bringing colour and joy to my life. Additionally, I am eternally grateful to my parents Chengxiang Xu and Feng Chen for giving me the freedom to pursue my passions and for their unquestioning support throughout this journey.
Abstract
Recent advances in deep learning have been significantly propelled by the increasing availability of data and computational resources. While the abundance of data enables models to perform well in certain domains, there are real-world applications, such as in the medical field, where the data is scarce or difficult to collect. Furthermore, there are also scenarios where the large dataset is better viewed as lots of related small datasets, and the data becomes insufficient for the task associated with one of the small datasets. It is also noteworthy that human intelligence often requires only a handful of examples to perform well on new tasks, emphasizing the importance of designing data-efficient AI systems. This thesis delves into two strategies to address this challenge: meta-learning and symmetries. Meta-learning approaches the data-rich environment as a collection of many small, individual datasets. Each of these small datasets represents a distinct task, yet there is underlying shared knowledge between them. Harnessing this shared knowledge allows for the design of learning algorithms that can efficiently address new tasks within similar domains. In comparison, symmetry is a form of direct prior knowledge. By ensuring that models’ predictions remain consistent despite any transformation to their inputs, these models enjoy better sample efficiency and generalization.
In the subsequent chapters, we present novel techniques and models which all aim at improving the data efficiency of deep learning systems. Firstly, we demonstrate the success of encoder-decoder style meta-learning methods based on Conditional Neural Processes (CNPs). Secondly, we introduce a new class of expressive meta-learned stochastic process models which are constructed by stacking sequences of neural parameterised Markov transition operators in function space. Finally, we propose group equivariant subsampling/upsampling layers which tackle the loss of equivariance in conventional subsampling/upsampling layers. These layers can be used to construct end-to-end equivariant models with improved data-efficiency.
Contents
1 Introduction
    1.1 Motivation
    1.2 Thesis outline
    1.3 Papers
2 Background
    2.1 Meta-learning
        2.1.1 Conventional supervised learning and meta-learning
        2.1.2 Different views of meta-learning
        2.1.3 Common approaches to meta-learning
    2.2 Neural processes
        2.2.1 Stochastic processes
        2.2.2 Neural processes as stochastic processes
        2.2.3 Neural process training objectives
        2.2.4 A meta-learning perspective
    2.3 Symmetries in deep learning
        2.3.1 Group, coset and quotient space
        2.3.2 Group homomorphism, group actions and group equivariance
        2.3.3 Homogeneous spaces and lifting feature maps
        2.3.4 Feature maps in G-CNNs
        2.3.5 Group equivariant neural networks
3 MetaFun: Meta-Learning with Iterative Functional Updates
    3.1 Introduction
    3.2 MetaFun
        3.2.1 Learning functional task representation
        3.2.2 MetaFun for regression and classification
    3.3 Related work
    3.4 Experiments
        3.4.1 1-D function regression
        3.4.2 Classification: miniImageNet and tieredImageNet
        3.4.3 Ablation study
    3.5 Conclusions and future work
    3.6 Supplementary materials
        3.6.1 Functional gradient descent
            Reproducing kernel Hilbert space
            Functional gradients
            Functional gradient descent
        3.6.2 Experimental details
4 Deep Stochastic Processes via Functional Markov Transition Operators
    4.1 Introduction
    4.2 Background
    4.3 Markov neural processes
        4.3.1 A more general form of Neural Process density functions
        4.3.2 Markov chains in function space
        4.3.3 Parameterisation, inference and training
    4.4 Related work
    4.5 Experiments
        4.5.1 1D function regression
        4.5.2 Contextual bandits
        4.5.3 Geological inference
    4.6 Discussion
    4.7 Supplementary materials
        4.7.1 Proofs
        4.7.2 Implementation details
        4.7.3 Data
            Model architectures and hyperparameters
            Computational costs and resources
        4.7.4 Broader impacts
5 Group Equivariant Subsampling
    5.1 Introduction
    5.2 Equivariant subsampling and upsampling
        5.2.1 Translation equivariant subsampling for CNNs
        5.2.2 Group equivariant subsampling and upsampling
        5.2.3 Constructing Φ
    5.3 Application: Group equivariant autoencoders
    5.4 Related work
    5.5 Experiments
        5.5.1 Basic properties: Equivariance, disentanglement and out-of-distribution generalization
        5.5.2 Single object
        5.5.3 Multiple objects
    5.6 Conclusions, limitations and future work
    5.7 Supplementary materials
        5.7.1 Equivariant subsampling and upsampling
            Constructing Φ
            Multiple subsampling layers
        5.7.2 Group equivariant autoencoders
        5.7.3 Proofs
        5.7.4 Implementation details
            Data
            Model architectures
            Hyperparameters
            Computational resources
6 Conclusions and Future Outlook
Bibliography
Chapter 1
Introduction
1.1 Motivation
Recent breakthroughs in deep learning can be largely attributed to the vast amount of data available and the advancement of computational resources [Deng et al., 2009, Raina et al., 2009, Silver et al., 2016, Jumper et al., 2021, Brown et al., 2020a]. While training on large datasets enables deep learning models to excel in certain tasks, many real-world applications only provide limited data for a specific task. For instance, in medical fields, obtaining data, especially for rare diseases, is challenging and often expensive. In drug development or recommendation systems, there will always be insufficient data for new drugs/users, even though abundant data exists for other drugs or users. Therefore, to apply deep learning to these fields, it is vital to develop systems that are data-efficient. Moreover, for advanced AI systems, data-efficiency can be a crucial ingredient: Firstly, AI systems should be able to generalize beyond specific data distributions without relying on data; for instance, an image recognition system should recognize objects regardless of their position or orientation. Secondly, human intelligence can often solve new tasks with just a few examples. Thus, for AI to emulate human-like intelligence, it should also have such capability.
From a Bayesian perspective, learning involves updating our beliefs about a model (represented by $\theta$) given the data, i.e. $p(\theta \mid \mathcal{D}_{\text{data}})$. For a model to learn efficiently from a small amount of data, it’s important to start with a good initial guess or "prior" $p(\theta)$. In this thesis, we look at two directions to obtain such a prior for data-efficient learning: The first is meta-learning, which learns the prior (or the shared knowledge) from similar tasks. It can be understood as "learning to learn more efficiently". The second is symmetries in deep learning, which serve as a known prior for certain problems. Symmetry, a fundamental concept in physics, represents a form of prior knowledge that is ubiquitously observed throughout our physical world.
Meta-learning tackles a specific scenario in which the vast pool of data can be viewed as many small datasets, each representing a distinct task. Yet, these tasks contain underlying shared knowledge that can be harnessed to address new tasks within the same category. This scenario is prevalent in many applications. Take, for instance, an online retail company with data from customers worldwide. The data associated with each user is typically sparse. In this context, predicting behaviours for each user constitutes an individual task, but patterns among different users often exhibit similarities. Meta-learning algorithms are designed to handle such circumstances. The goal of meta-learning is to learn data-efficient learning algorithms that can later be applied to a particular task. The training data for meta-learning comprises numerous related tasks, each with a limited set of data points. After the meta-learning phase, the learned learning algorithms can solve a new task in a data-efficient manner. In contrast, the aim of conventional supervised learning is just to learn a predictive model.
Meta-learning problems can be tackled from various perspectives, and these approaches can be understood through different viewpoints such as optimization-based approaches [Ravi and Larochelle, 2016, Finn et al., 2017a], metric-based approaches [Koch, 2015, Vinyals et al., 2016, Sung et al., 2018, Snell et al., 2017], and model-based approaches [Santoro et al., 2016, Mishra et al., 2018, Garnelo et al., 2018a], among others. Note that these views are not exclusive. For example, methods such as Prototypical Networks [Snell et al., 2017], MAML [Finn et al., 2017a], ML-PIP [Gordon et al., 2018] etc. can be reformulated under a model-based framework that uses an encoder-decoder setup. In this setup, the encoder produces a task representation using training data, and the decoder then makes predictions based on the task representation. These approaches transform the meta-learning challenge to resemble a regular learning problem involving sequences, and they are also more computationally efficient if no gradient computation is involved in either the encoder or the decoder, as in CNP-type models [Garnelo et al., 2018a]. Our study in Chapter 3 explicitly adopts this encoder-decoder framework for meta-learning. By using a functional task representation, and iteratively updating the representation directly in function space, we demonstrate that encoder-decoder approaches without gradient information can also be competitive with other approaches, which has not been shown before.
Furthermore, because training data for each task in meta-learning is often limited, uncertainty estimation becomes crucial. Stochastic Processes (SPs) (e.g. Gaussian Processes (GPs)) can be used to make predictions with uncertainty estimation. Thus, learning these processes can be seen as a way to approach meta-learning with uncertainty in mind. In Chapter 4, we propose a new framework to construct expressive neural parameterised SPs by parameterising Markov transitions in function space.
Unlike meta-learning above, which discovers shared knowledge from related tasks, symmetry serves as a direct form of prior or inductive bias, integrated into deep learning models without the need for pre-training. Symmetries refer to transformations that maintain certain properties of an object of interest unchanged. These include transformations such as image translation, rotation, or permutation of set elements. By incorporating these symmetries into deep learning models, ensuring that the outputs remain consistent (the same or undergo the corresponding transformation) despite input transformations, the model inherently generalizes to transformed inputs. Consequently, deep learning models equipped with these symmetries not only become more data-efficient but also generalize better. A simple example of this is Convolutional Neural Networks (CNNs), which are invariant to input translations for classification tasks, and perform significantly better compared to plain feed-forward networks. Earlier research has introduced many methods to build convolutional [Cohen and Welling, 2016, 2017, Cohen et al., 2019] and attention blocks [Hutchinson et al., 2021, Fuchs et al., 2020] that are equivariant w.r.t. various symmetries. However, the pooling layers or subsampling/upsampling layers commonly used in various deep learning architectures break these symmetries [Zhang, 2019]. In Chapter 5, we present group equivariant subsampling/upsampling layers that have exact equivariance.
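The loss of equivariance under conventional subsampling can be seen in a minimal sketch. It assumes 1-D signals with circular (wrap-around) translations, and the helper names `shift`, `circular_conv` and `subsample` are hypothetical, chosen only for this illustration. A circular convolution commutes with translation, whereas keeping every second entry does not: after shifting the input by one position, no translation of the subsampled output reproduces the result.

```python
def shift(x, s):
    """Circularly translate a 1-D signal by s positions."""
    n = len(x)
    return [x[(i - s) % n] for i in range(n)]


def circular_conv(x, w):
    """Circular convolution: y[i] = sum_k w[k] * x[(i - k) mod n]."""
    n = len(x)
    return [sum(w[k] * x[(i - k) % n] for k in range(len(w))) for i in range(n)]


def subsample(x, stride=2):
    """Conventional subsampling: keep every `stride`-th entry, starting at index 0."""
    return x[::stride]


x = [0, 1, 4, 9, 16, 25, 36, 49]  # toy signal
w = [1, -2, 1]                    # toy filter

# Convolution commutes with translation (equivariance).
conv_equivariant = circular_conv(shift(x, 1), w) == shift(circular_conv(x, w), 1)

# Subsampling after a shift by one: check every translation of the subsampled output.
shifted_then_sub = subsample(shift(x, 1))
sub_matches = [shift(subsample(x), k) == shifted_then_sub for k in range(len(x) // 2)]

print(conv_equivariant)  # True
print(any(sub_matches))  # False
```

A shift by a multiple of the stride is still handled correctly; it is the shifts in between that break the symmetry.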
1.2 Thesis outline
In Chapter 2, we provide a short introduction to meta-learning, neural processes and symmetries in deep learning, to set the stage for later chapters.
In Chapter 3, we introduce an iterative functional encoder-decoder method for supervised meta-learning, which is based on Neural Processes (NPs) [Garnelo et al., 2018a,b]. On standard few-shot classification benchmarks like miniImageNet and tieredImageNet, it is demonstrated that meta-learning methods based on the neural process family can be competitive with or even outperform gradient-based methods such as MAML [Finn et al., 2017a] and LEO [Rusu et al., 2019].
In Chapter 4, we introduce Markov Neural Processes (MNPs), a new class of Stochastic Processes (SPs) which are constructed by stacking sequences of neural parameterised Markov transition operators in function space. Therefore, the proposed iterative construction adds substantial flexibility and expressivity to the original framework of Neural Processes (NPs) without compromising consistency or adding restrictions. Our experiments demonstrate clear advantages of MNPs over baseline models on a variety of tasks. It’s noteworthy that SP models can be viewed through a meta-learning lens. So the proposed method can also be seen as a meta-learning approach with principled uncertainty estimation.
In Chapter 5, we first introduce translation equivariant subsampling/upsampling layers that can be used to construct exact translation equivariant CNNs. We then generalise these layers beyond translations to general groups, thus proposing group equivariant subsampling/upsampling. We use these layers to construct group equivariant autoencoders (GAEs) that allow us to learn low-dimensional equivariant representations. We empirically verify on images that the representations are indeed equivariant to input translations and rotations, and thus generalise well to unseen positions and orientations. We further use GAEs in models that learn object-centric representations on multi-object datasets, and show improved data efficiency and decomposition compared to non-equivariant baselines.
In Chapter 6, we summarize our findings and explore potential avenues for future research to further advance the field.
1.3 Papers
This is an integrated thesis and includes the following published papers:
Chapter 3 contains:
Xu, J., Ton, J. F., Kim, H., Kosiorek, A., & Teh, Y. W. MetaFun: Meta-learning with iterative functional updates. International Conference on Machine Learning (ICML), 2020 [Xu et al., 2020]
Chapter 4 contains:
Xu, J., Dupont, E., Märtens, K., Rainforth, T., & Teh, Y. W. (2023). Deep Stochastic Processes via Functional Markov Transition Operators. Advances in Neural Information Processing Systems (NeurIPS), 2023 [Xu et al., 2023]
Chapter 5 contains:
Xu, J., Kim, H., Rainforth, T., & Teh, Y. (2021). Group equivariant subsampling. Advances in Neural Information Processing Systems (NeurIPS), 2021 [Xu et al., 2021]
Chapter 2
Background
2.1 Meta-learning
2.1.1 Conventional supervised learning and meta-learning
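In conventional supervised learning, the objective is to learn a function $f$ that maps an input feature vector $x \in \mathcal{X}$ to an output label $y \in \mathcal{Y}$. Learning is based on example input-output pairs in a training set $\mathcal{D}_{\text{train}} = \{(x_i, y_i)\}_{i=1}^{m}$. Common types of supervised learning tasks include regression where output labels are real-valued, and classification where the output labels represent different classes. The function $f$, often referred to as the predictive model, is a member of a hypothesis class, $\mathcal{H} := \{f \mid f(x; \phi), \phi \in \mathbb{R}^{d_\phi}\}$.
For each task, there is a risk function $\ell(y, f(x))$ which measures prediction error. As an example, in the context of a regression task, $\ell$ often takes the form of a squared error, $\ell(y, f(x)) = (y - f(x))^2$. The training process of the model $f$ translates to solving an optimization problem defined as follows:
$$\hat{f} = \arg\min_{f \in \mathcal{H}} \frac{1}{m} \sum_{i=1}^{m} \ell(y_i, f(x_i)). \tag{2.1}$$
It is called empirical risk minimization because this objective is an estimation of the population risk $\mathbb{E}_{(x_i, y_i) \sim p(x, y)}[\ell(y_i, f(x_i))]$ based on the empirical distribution of training data.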
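As a small illustration of empirical risk minimization with the squared error, the sketch below fits a scalar model $f(x; \phi) = \phi x$ by gradient descent. The model, the toy data, the step size and all function names are assumptions made for this example only.

```python
def squared_error(y, y_hat):
    """Risk function l(y, f(x)) = (y - f(x))^2."""
    return (y - y_hat) ** 2


def empirical_risk(phi, data):
    """Average loss of f(x; phi) = phi * x over the training set."""
    return sum(squared_error(y, phi * x) for x, y in data) / len(data)


def risk_gradient(phi, data):
    """Derivative of the empirical risk with respect to phi."""
    return sum(-2.0 * x * (y - phi * x) for x, y in data) / len(data)


def fit(data, lr=0.1, steps=200, phi=0.0):
    """Minimise the empirical risk by plain gradient descent."""
    for _ in range(steps):
        phi -= lr * risk_gradient(phi, data)
    return phi


d_train = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]  # toy training set
phi_hat = fit(d_train)
print(phi_hat, empirical_risk(phi_hat, d_train))
```

For this one-parameter model the minimiser is also available in closed form, $\sum_i x_i y_i / \sum_i x_i^2$, which gradient descent approaches here.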
After training, the model should generalize effectively when presented with a test set, denoted as $\mathcal{D}_{\text{test}} = \{(x_i, y_i)\}_{i=m+1}^{n}$. The model’s performance can be assessed using the test risk $R(f; \mathcal{D}_{\text{test}})$, which serves as an estimate of the overall population risk using unseen data.
Figure 2.1: Data for a meta-classification problem. Both the meta-training and meta-test sets consist of tasks (red rectangles) and are presumed to come from the same task distribution $p(\mathcal{T})$. Each of these tasks encompasses its own task-specific training and test sets, which are commonly referred to as the context (yellow labels) and the target (grey labels) respectively.
In practice, it is common to have scenarios where lots of supervised learning tasks are related to each other, yet the number of data points for each individual task is limited. Meta-learning emerges as a new learning paradigm to address such challenges.
Specifically, we have a meta-training set defined as $\mathcal{M}_{\text{train}} = \{(\mathcal{D}_{\text{train}}^{(j)}, \mathcal{D}_{\text{test}}^{(j)}, \ell^{(j)})\}_{j=1}^{M}$ and a meta-test set given by $\mathcal{M}_{\text{test}} = \{(\mathcal{D}_{\text{train}}^{(j)}, \mathcal{D}_{\text{test}}^{(j)}, \ell^{(j)})\}_{j=M+1}^{N}$. Each element in these meta-datasets is a tuple consisting of a training set (called the context), a test set (called the target) and a risk function (typically the same within a meta-dataset). This 3-tuple characterizes a task $\mathcal{T}_j$ (see Figure 2.1 for an illustration). In supervised learning, we use training data to train a predictive model, hoping it can generalize across the entire data distribution. In meta-learning, the assumption is that there is a common task distribution, denoted as $p(\mathcal{T})$, from which both the meta-training set and the meta-test set are drawn. Meta-learning algorithms aim to use meta-training data to discover learning algorithms that can generalize across the entire task distribution.
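More specifically, a learning algorithm for a supervised learning task takes in a training set $\mathcal{D}_{\text{train}}$, a risk function $\ell$ and outputs a predictive model, written as:
$$\hat{f} = \Phi_{\text{ALGO}}(\mathcal{D}_{\text{train}}, \ell). \tag{2.2}$$
Since $\ell$ is usually fixed, we will omit the dependency on it in subsequent discussions. For a particular task, the learning algorithm $\Phi_{\text{ALGO}}$ can be evaluated by the test risk of the learned predictive model, denoted as:
$$R(\hat{f}; \mathcal{D}_{\text{test}}). \tag{2.3}$$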
Meta-learning finds a learning algorithm based on tasks from the meta-training set $\mathcal{M}_{\text{train}}$, so that this learning algorithm can be more efficiently applied to new tasks, and generalizes across the task distribution $p(\mathcal{T})$. The meta-learning algorithm can be represented as:
$$\Phi_{\text{ALGO}} = \text{MetaAlgo}(\mathcal{M}_{\text{train}}). \tag{2.4}$$
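To evaluate the meta-learning algorithm, we can compute:
$$\frac{1}{|\mathcal{M}_{\text{test}}|} \sum_{\mathcal{T}_j \in \mathcal{M}_{\text{test}}} R\big(\Phi_{\text{ALGO}}(\mathcal{D}_{\text{train}}^{(j)}); \mathcal{D}_{\text{test}}^{(j)}\big). \tag{2.5}$$
While it resembles the test loss in supervised learning, the aggregated test risk for a task replaces the traditional risk function for a data point.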
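The evaluation protocol can be sketched in a few lines, assuming each task is stored as a (context, target) pair of input-output lists and the risk function is a fixed squared error. The learning algorithm `least_squares_algo` and the toy tasks are hypothetical stand-ins, not a method from this thesis.

```python
def evaluate_risk(f, d_test):
    """Test risk of a predictive model f on one task's target set (squared error)."""
    return sum((y - f(x)) ** 2 for x, y in d_test) / len(d_test)


def least_squares_algo(d_train):
    """A hypothetical learning algorithm: fit f(x) = phi * x by least squares."""
    phi = sum(x * y for x, y in d_train) / sum(x * x for x, _ in d_train)
    return lambda x: phi * x


def meta_test_risk(algo, meta_test):
    """Average, over tasks, of the test risk of the model learned from each context."""
    risks = [evaluate_risk(algo(context), target) for context, target in meta_test]
    return sum(risks) / len(risks)


# Two toy tasks, each a (context, target) pair, with slopes 2 and -1.
m_test = [
    ([(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0)]),
    ([(1.0, -1.0)], [(2.0, -2.0), (4.0, -4.0)]),
]
print(meta_test_risk(least_squares_algo, m_test))  # 0.0 for these noise-free tasks
```

The outer average runs over tasks rather than data points, so a learning algorithm is scored by how well the models it produces generalize within each task.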
It is worth noting that while we focus on supervised learning tasks here, meta-learning can be extended to unsupervised learning [Edwards and Storkey, 2016, Reed et al., 2018, Hsu et al., 2018] or reinforcement learning [Wang et al., 2016, Finn et al., 2017a,b].
2.1.2 Different views of meta-learning
Bi-level optimization view. Let us assume both the predictive model $f$ and the learning algorithm $\Phi_{\text{ALGO}}$ can be parameterised, and the parameters are denoted as $\phi$ and $\theta$ accordingly. That is to say, the learning algorithm can be written as:
$$\phi = \Phi_{\text{ALGO}}(\mathcal{D}_{\text{train}}; \theta). \tag{2.6}$$
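Meta-learning can be formulated as the following bi-level optimization problem:
$$\theta^{*} = \arg\min_{\theta} \sum_{j=1}^{M} R\big(f(\cdot\,; \phi_j(\theta)); \mathcal{D}_{\text{test}}^{(j)}\big), \tag{2.7}$$
where task-specific parameter $\phi_j$ depends on $\theta$ through the inner-loop optimization:
$$\phi_j(\theta) = \Phi_{\text{ALGO}}(\mathcal{D}_{\text{train}}^{(j)}; \theta). \tag{2.8}$$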
Many meta-learning algorithms are developed based on this bi-level optimization view, such as Finn et al. [2017a], Nichol et al. [2018], Ravi and Larochelle [2016].
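To make the two levels concrete, here is a minimal sketch under strong simplifying assumptions: a scalar model $f(x; \phi) = \phi x$, an inner loop consisting of a single gradient step from the shared initialisation $\theta$ (in the spirit of gradient-based methods such as MAML, but not a reproduction of any cited algorithm), synthetic noise-free tasks, and a hand-derived meta-gradient. The constant `ALPHA`, the task generator and all function names are hypothetical.

```python
import random

ALPHA = 0.1  # inner-loop step size (assumed fixed, not meta-learned)


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def inner_update(theta, context):
    """phi_j(theta): one gradient step on the context risk, starting from theta."""
    grad = mean(2.0 * x * (theta * x - y) for x, y in context)
    return theta - ALPHA * grad


def target_risk(phi, target):
    """Squared-error risk of f(x; phi) = phi * x on the target set."""
    return mean((phi * x - y) ** 2 for x, y in target)


def meta_objective(theta, tasks):
    """Outer objective: average target risk after the inner update."""
    return mean(target_risk(inner_update(theta, c), t) for c, t in tasks)


def meta_gradient(theta, tasks):
    """Exact d(meta_objective)/d(theta) via the chain rule through the inner step."""
    grads = []
    for context, target in tasks:
        phi = inner_update(theta, context)
        d_risk_d_phi = mean(2.0 * x * (phi * x - y) for x, y in target)
        d_phi_d_theta = 1.0 - 2.0 * ALPHA * mean(x * x for x, _ in context)
        grads.append(d_risk_d_phi * d_phi_d_theta)
    return mean(grads)


def make_task(rng, n_context=5, n_target=5):
    """Synthetic task: y = slope * x with a task-specific slope near 3."""
    slope = rng.gauss(3.0, 0.5)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n_context + n_target)]
    data = [(x, slope * x) for x in xs]
    return data[:n_context], data[n_context:]


rng = random.Random(0)
m_train = [make_task(rng) for _ in range(20)]

theta = 0.0
initial = meta_objective(theta, m_train)
for _ in range(300):
    theta -= 0.1 * meta_gradient(theta, m_train)  # outer-loop gradient step
final = meta_objective(theta, m_train)
print(theta, initial, final)
```

The outer loop moves $\theta$ towards a value from which a single inner step already fits each task well, which for these tasks lies close to the typical slope.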
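Hierarchical model view. From a probabilistic perspective, the generative process for each task $\mathcal{T}_j$ can be expressed as:
$$\theta \sim p(\theta), \quad \phi_j \sim p(\phi_j \mid \theta), \quad y_i^{(j)} \sim p(y_i^{(j)} \mid x_i^{(j)}, \phi_j, \theta). \tag{2.9}$$
Both the training set $\mathcal{D}_{\text{train}}^{(j)}$ and the test set $\mathcal{D}_{\text{test}}^{(j)}$ follow the same distribution (as illustrated in Figure 2.2). This can be seen as a probabilistic hierarchical model where $\theta$ indicates the high-level global parameters for all tasks and $\phi_j$ denotes the low-level local parameters for each task. In this context, meta-learning is about inferring $\theta$ from lots of tasks in the meta-training set, that is $p(\theta \mid \mathcal{M}_{\text{train}})$. Learning, on the other hand, infers $\phi_j$ given the training set $\mathcal{D}_{\text{train}}^{(j)}$ for task $\mathcal{T}_j$, that is $p(\phi_j \mid \theta, \mathcal{D}_{\text{train}}^{(j)})$.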
Figure 2.2: Meta-learning as hierarchical models (a remake of Figure 1 in Gordon et al. [2018]). Task-specific parameter $\phi_j$ depends on the global parameter $\theta$. Data points in both the context and the target have the same generative process, which depends on both $\theta$ and $\phi_j$.
Note that $p(\phi_j \mid \theta)$ can be seen as a prior for task $\mathcal{T}_j$ conditioned on $\theta$. Therefore, meta-learning can be seen as learning an empirical prior from the meta-training set. Finn et al. [2018] and Requeima et al. [2019] adopt this view.
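The generative process of the hierarchical view can be sketched by ancestral sampling. The Gaussian choices, the linear likelihood and the function name `sample_task` are assumptions made for illustration; in particular, in this sketch the observations depend on $\theta$ only through $\phi_j$.

```python
import random


def sample_task(rng, theta, n_points=4, task_std=0.5, noise_std=0.1):
    """Ancestral sampling for one task: phi_j ~ p(phi_j | theta), then y given x and phi_j."""
    phi_j = rng.gauss(theta, task_std)       # task-specific (local) parameter
    data = []
    for _ in range(n_points):
        x = rng.uniform(-1.0, 1.0)
        y = rng.gauss(phi_j * x, noise_std)  # y_i ~ p(y_i | x_i, phi_j)
        data.append((x, y))
    return phi_j, data


rng = random.Random(0)
theta = rng.gauss(0.0, 1.0)                  # theta ~ p(theta), shared by all tasks
tasks = [sample_task(rng, theta) for _ in range(3)]

for phi_j, data in tasks:
    print(round(phi_j, 3), len(data))
```

Context and target points of a task would both be drawn by the inner loop above, which is why they share the same generative process.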
Model-based view. A learning algorithm $f = \Phi_{\text{ALGO}}(\mathcal{D}_{\text{train}})$ can be seen as a function that takes in the entire training set and outputs a predictive model. The model is then used to make predictions on test data in $\mathcal{D}_{\text{test}}$. The learning and prediction processes can thus be conceptualized as sequence-to-sequence mappings. For the sake of brevity, let’s use a concise notation for data sequences, such as $x_{1:n} = \{x_1, x_2, \ldots, x_n\}$. For a specific task $\mathcal{T}_j$, making predictions for test set data points based on those from the training set can be described as the following inference task
$$p(y_{m+1:n} \mid x_{m+1:n}, x_{1:m}, y_{1:m}). \tag{2.10}$$
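One simple instance of such a direct mapping from a context set and target inputs to predictions is a Nadaraya-Watson kernel smoother, sketched below. It is used here purely as an illustration and returns only predictive means; the Gaussian kernel, the `bandwidth` parameter (which could play the role of a quantity shared across tasks) and the function name `predict` are assumptions of this sketch, not a model from this thesis.

```python
import math


def predict(x_context, y_context, x_targets, bandwidth=0.5):
    """Map a whole context set and target inputs directly to predictive means."""
    means = []
    for x_star in x_targets:
        # Gaussian kernel weights between the target input and every context input.
        weights = [math.exp(-0.5 * ((x_star - x) / bandwidth) ** 2) for x in x_context]
        total = sum(weights)
        means.append(sum(w * y for w, y in zip(weights, y_context)) / total)
    return means


x_c, y_c = [0.0, 1.0, 2.0], [0.0, 1.0, 4.0]  # toy context set
print(predict(x_c, y_c, [0.5, 1.5]))
```

No gradient step is taken at prediction time, and reordering the context points leaves the predictions unchanged up to floating-point rounding.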
From this perspective, meta-learning is about creating this conditional model. Meta-learning only differs from conventional supervised learning in that both the inp