Towards Data-Efficient Deep Learning with Meta-Learning and Symmetries

Jin Xu
Balliol College
University of Oxford

A thesis submitted for the degree of Doctor of Philosophy in Statistics

Trinity 2023
Acknowledgements

First and foremost, I want to express my deep gratitude to my supervisors, Prof. Yee Whye Teh and Dr. Tom Rainforth. Their unwavering support, careful guidance, and constant inspiration have been invaluable throughout my PhD journey. It has been a privilege to be mentored by them, whom I regard as research role models. Their depth and breadth of knowledge have been both humbling and enlightening. Special acknowledgement goes to Yee Whye, who has always been considerate and ready to help in tough times. My heartfelt thanks go to Tom for his guidance during the challenging times brought on by the pandemic.

I would like to extend my gratitude to all my collaborators: Hyunjik Kim, Jean-Francois Ton, Adam Kosiorek, Emilien Dupont, and Kaspar Märtens. Their expertise and feedback have been crucial in improving my work, and I learned a great deal from them. A big thank you to Prof. Ryan Adams from Princeton University and to my internship hosts, James Hensman and Max Croci at Microsoft Research. Their mentorship outside of my PhD life has been an indispensable part of my research experience.

Moreover, I feel extremely fortunate to be surrounded by amazing and caring friends whose names are not possible to enumerate here. Among them are Emilien Dupont, Jean-Francois Ton, Charline Le Lan, Bobby He, Sheheryar Zaidi, Qinyi Zhang, Guneet Dhillon, Andrew Campbell, Chris Williams, Carlo Alfano, Faaiz Taufiq, Anna Menacher and others from our lovely office 1.17; Hanwen Xing, Yanzhao Yang, Ning Miao, Chao Zhang, Yutong Lu, Yixuan He, Xi Lin, Yuan Zhou, Fan Wu, Bohao Yao from the Department of Statistics; Dunhong Jin, Sihan Zhou, Sijia Yao, Huining Yang, Kevin Wang, Natalia Hong, Hang Yuan, Kangning Zhang, Chengyang Wang and many others from other departments at Oxford; Deniz Oktay, Sulin Liu, Jenny Zhan and others from Princeton University; and internship peers at Microsoft Research including Alexander Meulemans and Saleh Ashkboos from ETH.

A special thanks to all university and department staff, especially Chris Cullen for his kind and patient support during difficult times, and to Joanna Stoneham, Stuart McRobert, and others who ensured a smooth PhD experience.

Finally, above all, my deepest thanks go to Yifan Yu for her love and companionship. She immensely enriched my time in Oxford, bringing colour and joy to my life. Additionally, I am eternally grateful to my parents Chengxiang Xu and Feng Chen for giving me the freedom to pursue my passions and for their unquestioning support throughout this journey.
Abstract

Recent advances in deep learning have been significantly propelled by the increasing availability of data and computational resources. While the abundance of data enables models to perform well in certain domains, there are real-world applications, such as in the medical field, where the data is scarce or difficult to collect. Furthermore, there are also scenarios where a large dataset is better viewed as lots of related small datasets, and the data becomes insufficient for the task associated with one of the small datasets. It is also noteworthy that human intelligence often requires only a handful of examples to perform well on new tasks, emphasizing the importance of designing data-efficient AI systems. This thesis delves into two strategies to address this challenge: meta-learning and symmetries. Meta-learning approaches the data-rich environment as a collection of many small, individual datasets. Each of these small datasets represents a distinct task, yet there is underlying shared knowledge between them. Harnessing this shared knowledge allows for the design of learning algorithms that can efficiently address new tasks within similar domains. In comparison, symmetry is a form of direct prior knowledge. By ensuring that models' predictions remain consistent despite any transformation to their inputs, these models enjoy better sample efficiency and generalization.

In the subsequent chapters, we present novel techniques and models which all aim at improving the data efficiency of deep learning systems. Firstly, we demonstrate the success of encoder-decoder style meta-learning methods based on Conditional Neural Processes (cnps). Secondly, we introduce a new class of expressive meta-learned stochastic process models which are constructed by stacking sequences of neural parameterised Markov transition operators in function space. Finally, we propose group equivariant subsampling/upsampling layers which tackle the loss of equivariance in conventional subsampling/upsampling layers. These layers can be used to construct end-to-end equivariant models with improved data-efficiency.
Contents

1 Introduction
  1.1 Motivation
  1.2 Thesis outline
  1.3 Papers

2 Background
  2.1 Meta-learning
    2.1.1 Conventional supervised learning and meta-learning
    2.1.2 Different views of meta-learning
    2.1.3 Common approaches to meta-learning
  2.2 Neural processes
    2.2.1 Stochastic processes
    2.2.2 Neural processes as stochastic processes
    2.2.3 Neural process training objectives
    2.2.4 A meta-learning perspective
  2.3 Symmetries in deep learning
    2.3.1 Group, coset and quotient space
    2.3.2 Group homomorphism, group actions and group equivariance
    2.3.3 Homogeneous spaces and lifting feature maps
    2.3.4 Feature maps in G-CNNs
    2.3.5 Group equivariant neural networks

3 MetaFun: Meta-Learning with Iterative Functional Updates
  3.1 Introduction
  3.2 MetaFun
    3.2.1 Learning functional task representation
    3.2.2 MetaFun for regression and classification
  3.3 Related work
  3.4 Experiments
    3.4.1 1-D function regression
    3.4.2 Classification: miniImageNet and tieredImageNet
    3.4.3 Ablation study
  3.5 Conclusions and future work
  3.6 Supplementary materials
    3.6.1 Functional gradient descent
      Reproducing kernel Hilbert space
      Functional gradients
      Functional gradient descent
    3.6.2 Experimental details

4 Deep Stochastic Processes via Functional Markov Transition Operators
  4.1 Introduction
  4.2 Background
  4.3 Markov neural processes
    4.3.1 A more general form of Neural Process density functions
    4.3.2 Markov chains in function space
    4.3.3 Parameterisation, inference and training
  4.4 Related work
  4.5 Experiments
    4.5.1 1D function regression
    4.5.2 Contextual bandits
    4.5.3 Geological inference
  4.6 Discussion
  4.7 Supplementary materials
    4.7.1 Proofs
    4.7.2 Implementation details
    4.7.3 Data
      Model architectures and hyperparameters
      Computational costs and resources
    4.7.4 Broader impacts

5 Group Equivariant Subsampling
  5.1 Introduction
  5.2 Equivariant subsampling and upsampling
    5.2.1 Translation equivariant subsampling for CNNs
    5.2.2 Group equivariant subsampling and upsampling
    5.2.3 Constructing Φ
  5.3 Application: Group equivariant autoencoders
  5.4 Related work
  5.5 Experiments
    5.5.1 Basic properties: Equivariance, disentanglement and out-of-distribution generalization
    5.5.2 Single object
    5.5.3 Multiple objects
  5.6 Conclusions, limitations and future work
  5.7 Supplementary materials
    5.7.1 Equivariant subsampling and upsampling
      Constructing Φ
      Multiple subsampling layers
    5.7.2 Group equivariant autoencoders
    5.7.3 Proofs
    5.7.4 Implementation details
      Data
      Model architectures
      Hyperparameters
      Computational resources

6 Conclusions and Future Outlook

Bibliography
Chapter 1

Introduction

1.1 Motivation
Recent breakthroughs in deep learning can be largely attributed to the vast amount of data available and the advancement of computational resources [Deng et al., 2009, Raina et al., 2009, Silver et al., 2016, Jumper et al., 2021, Brown et al., 2020a]. While training on large datasets enables deep learning models to excel in certain tasks, many real-world applications only provide limited data for a specific task. For instance, in medical fields, obtaining data, especially for rare diseases, is challenging and often expensive. In drug development or recommendation systems, there will always be insufficient data for new drugs/users, even though abundant data exists for other drugs or users. Therefore, to apply deep learning to these fields, it is vital to develop systems that are data-efficient. Moreover, for advanced AI systems, data-efficiency can be a crucial ingredient: Firstly, AI systems should be able to generalize beyond specific data distributions without relying on data; for instance, an image recognition system should recognize objects regardless of their position or orientation. Secondly, human intelligence can often solve new tasks with just a few examples. Thus, for AI to emulate human-like intelligence, it should also have such capability.
From a Bayesian perspective, learning involves updating our beliefs about a model (represented by θ) given the data, i.e. p(θ | D). For a model to learn efficiently from a small amount of data, it is important to start with a good initial guess or "prior" p(θ). In this thesis, we look at two directions to obtain such a prior for data-efficient learning: The first is meta-learning, which learns the prior (or the shared knowledge) from similar tasks. It can be understood as "learning to learn more efficiently". The second is symmetries in deep learning, which serve as a known prior for certain problems. Symmetry, a fundamental concept in physics, represents a form of prior knowledge that is ubiquitously observed throughout our physical world.
Meta-learning tackles a specific scenario in which the vast pool of data can be viewed as many small datasets, each representing a distinct task. Yet, these tasks contain underlying shared knowledge that can be harnessed to address new tasks within the same category. This scenario is prevalent in many applications. Take, for instance, an online retail company with data from customers worldwide. The data associated with each user is typically sparse. In this context, predicting behaviours for each user constitutes an individual task, but patterns among different users often exhibit similarities. Meta-learning algorithms are designed to handle such circumstances. The goal of meta-learning is to learn data-efficient learning algorithms that can later be applied to a particular task. The training data for meta-learning comprises numerous related tasks, each with a limited set of data points. After the meta-learning phase, the learned learning algorithms can solve a new task in a data-efficient manner. In contrast, the aim of conventional supervised learning is just to learn a predictive model.
Meta-learning problems can be tackled from various perspectives, and these approaches can be understood through different viewpoints such as optimization-based approaches [Ravi and Larochelle, 2016, Finn et al., 2017a], metric-based approaches [Koch, 2015, Vinyals et al., 2016, Sung et al., 2018, Snell et al., 2017], and model-based approaches [Santoro et al., 2016, Mishra et al., 2018, Garnelo et al., 2018a], among others. Note that these views are not exclusive. For example, methods such as Prototypical Networks [Snell et al., 2017], MAML [Finn et al., 2017a], ML-PIP [Gordon et al., 2018] etc. can be reformulated under a model-based framework that uses an encoder-decoder setup. In this setup, the encoder produces a task representation using training data, and the decoder then makes predictions based on the task representation. These approaches transform the meta-learning challenge to resemble a regular learning problem involving sequences, and they are also more computationally efficient if no gradient computation is involved in either the encoder or the decoder, as in cnp-type models [Garnelo et al., 2018a]. Our study in Chapter 3 explicitly adopts this encoder-decoder framework for meta-learning. By using a functional task representation, and iteratively updating the representation directly in function space, we demonstrate that encoder-decoder approaches without gradient information can also be competitive with other approaches, which has not been shown before.
Furthermore, because training data for each task in meta-learning is often limited, uncertainty estimation becomes crucial. Stochastic Processes (sps) (e.g. Gaussian Processes (gps)) can be used to make predictions with uncertainty estimation. Thus, learning these processes can be seen as a way to approach meta-learning with uncertainty in mind. In Chapter 4, we propose a new framework to construct expressive neural parameterised sps by parameterising Markov transitions in function space.
Unlike meta-learning above, which discovers shared knowledge from related tasks, symmetry serves as a direct form of prior or inductive bias, integrated into deep learning models without the need for pre-training. Symmetries refer to transformations that leave certain properties of an object of interest unchanged. These include transformations such as image translation, rotation, or permutation of set elements. By incorporating these symmetries into deep learning models, ensuring that the outputs remain consistent (the same, or undergoing the corresponding transformation) despite input transformations, the model inherently generalizes to transformed inputs. Consequently, deep learning models equipped with these symmetries not only become more data-efficient but also generalize better. A simple example of this is Convolutional Neural Networks (cnns), which are invariant to input translations for classification tasks, and perform significantly better compared to plain feed-forward networks. Earlier research has introduced many methods to build convolutional [Cohen and Welling, 2016, 2017, Cohen et al., 2019] and attention blocks [Hutchinson et al., 2021, Fuchs et al., 2020] that are equivariant with respect to various symmetries. However, the pooling layers or subsampling/upsampling layers commonly used in various deep learning architectures break these symmetries [Zhang, 2019]. In Chapter 5, we present group equivariant subsampling/upsampling layers that have exact equivariance.
1.2 Thesis outline

In Chapter 2, we provide a short introduction to meta-learning, neural processes and symmetries in deep learning, to set the stage for later chapters.
In Chapter 3, we introduce an iterative functional encoder-decoder method for supervised meta-learning, which is based on Neural Processes (nps) [Garnelo et al., 2018a,b]. On standard few-shot classification benchmarks like miniImageNet and tieredImageNet, it is demonstrated that meta-learning methods based on the neural process family can be competitive with or even outperform gradient-based methods such as MAML [Finn et al., 2017a] and LEO [Rusu et al., 2019].
In Chapter 4, we introduce Markov Neural Processes (MNPs), a new class of Stochastic Processes (SPs) which are constructed by stacking sequences of neural parameterised Markov transition operators in function space. The proposed iterative construction adds substantial flexibility and expressivity to the original framework of Neural Processes (NPs) without compromising consistency or adding restrictions. Our experiments demonstrate clear advantages of MNPs over baseline models on a variety of tasks. It is noteworthy that sp models can be viewed through a meta-learning lens, so the proposed method can also be seen as a meta-learning approach with principled uncertainty estimation.
In Chapter 5, we first introduce translation equivariant subsampling/upsampling layers that can be used to construct exact translation equivariant CNNs. We then generalise these layers beyond translations to general groups, thus proposing group equivariant subsampling/upsampling. We use these layers to construct group equivariant autoencoders (GAEs) that allow us to learn low-dimensional equivariant representations. We empirically verify on images that the representations are indeed equivariant to input translations and rotations, and thus generalise well to unseen positions and orientations. We further use GAEs in models that learn object-centric representations on multi-object datasets, and show improved data efficiency and decomposition compared to non-equivariant baselines.
In Chapter 6, we summarize our findings and explore potential avenues for future research to further advance the field.
1.3 Papers

This is an integrated thesis and includes the following published papers.

Chapter 3 contains:

Xu, J., Ton, J.-F., Kim, H., Kosiorek, A., & Teh, Y. W. MetaFun: Meta-learning with iterative functional updates. International Conference on Machine Learning (ICML), 2020 [Xu et al., 2020]

Chapter 4 contains:

Xu, J., Dupont, E., Märtens, K., Rainforth, T., & Teh, Y. W. Deep Stochastic Processes via Functional Markov Transition Operators. Advances in Neural Information Processing Systems (NeurIPS), 2023 [Xu et al., 2023]

Chapter 5 contains:

Xu, J., Kim, H., Rainforth, T., & Teh, Y. W. Group equivariant subsampling. Advances in Neural Information Processing Systems (NeurIPS), 2021 [Xu et al., 2021]
Chapter 2

Background

2.1 Meta-learning

2.1.1 Conventional supervised learning and meta-learning
In conventional supervised learning, the objective is to learn a function f that maps an input feature vector x ∈ X to an output label y ∈ Y. Learning is based on example input-output pairs in a training set Dtrain = {(xi, yi)}_{i=1}^{m}. Common types of supervised learning tasks include regression, where output labels are real-valued, and classification, where the output labels represent different classes. The function f, often referred to as the predictive model, is a member of a hypothesis class H := {f | f(x; ϕ), ϕ ∈ R^{dϕ}}.

For each task, there is a risk function ℓ(y, f(x)) which measures prediction error. As an example, in the context of a regression task, ℓ often takes the form of a squared error, ℓ(y, f(x)) = (y − f(x))². The training process of the model f translates to solving an optimization problem defined as follows:

min_ϕ L(f; Dtrain) = min_ϕ (1/m) Σ_{i=1}^{m} ℓ(yi, f(xi; ϕ)).    (2.1)

It is called empirical risk minimization because this objective is an estimation of the population risk E_{(x,y)∼p(x,y)}[ℓ(y, f(x))] based on the empirical distribution of training data.
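As a concrete, purely illustrative instance of empirical risk minimization, the sketch below fits a linear model f(x; ϕ) = ϕ₀ + ϕ₁x by gradient descent on the average squared error over a training set; the data-generating process, model class and learning rate are assumptions made for this example, not part of the thesis.

```python
import numpy as np

# Fit f(x; phi) = phi[0] + phi[1] * x by minimizing the empirical risk
# (1/m) * sum_i (y_i - f(x_i; phi))^2, as in Eq. (2.1).
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=50)
y = 2.0 * x + 1.0 + 0.1 * rng.standard_normal(50)  # noisy linear data

phi = np.zeros(2)  # parameters [intercept, slope]
lr = 0.5
for _ in range(500):
    pred = phi[0] + phi[1] * x
    # Gradients of the empirical risk w.r.t. each parameter
    grad0 = -2.0 * np.mean(y - pred)
    grad1 = -2.0 * np.mean((y - pred) * x)
    phi -= lr * np.array([grad0, grad1])

empirical_risk = np.mean((y - (phi[0] + phi[1] * x)) ** 2)
print(phi, empirical_risk)  # phi near the true [1.0, 2.0]
```

After training, the empirical risk approaches the noise variance, illustrating that the objective is an estimate of the population risk based on the training sample.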
After training, the model should generalize effectively when presented with a test set, denoted as Dtest = {(xi, yi)}_{i=m+1}^{n}. The model's performance can be assessed using the test risk L(f; Dtest), which serves as an estimate of the overall population risk using unseen data.
Figure 2.1: Data for a meta-classification problem. Both the meta-training and meta-test sets consist of tasks (red rectangles) and are presumed to come from the same task distribution p(T). Each of these tasks encompasses its own task-specific training and test sets, which are commonly referred to as the context (yellow labels) and the target (grey labels) respectively.
In practice, it is common to have scenarios where lots of supervised learning tasks are related to each other, yet the number of data points for each individual task is limited. Meta-learning emerges as a new learning paradigm to address such challenges.

Specifically, we have a meta-training set defined as Mtrain = {(Dtrain^(j), Dtest^(j), ℓ^(j))}_{j=1}^{M} and a meta-test set given by Mtest = {(Dtrain^(j), Dtest^(j), ℓ^(j))}_{j=M+1}^{M+N}. Each element in these meta-datasets is a tuple consisting of a training set (called the context), a test set (called the target) and a risk function (typically the same within a meta-dataset). This 3-tuple characterizes a task Tj (see Figure 2.1 for an illustration). In supervised learning, we use training data to train a predictive model, hoping it can generalize across the entire data distribution. In meta-learning, the assumption is that there is a common task distribution, denoted as p(T), from which both the meta-training set and the meta-test set are drawn. Meta-learning algorithms aim to use meta-training data to discover learning algorithms that can generalize across the entire task distribution.
More specifically, a learning algorithm for a supervised learning task takes in a training set Dtrain and a risk function ℓ, and outputs a predictive model, written as:

f = ΦALGO(Dtrain, ℓ).    (2.2)

Since ℓ is usually fixed, we will omit the dependency on it in subsequent discussions. For a particular task, the learning algorithm ΦALGO can be evaluated by the test risk of the learned predictive model, denoted as:

L(f; Dtest).    (2.3)

Meta-learning finds a learning algorithm based on tasks from the meta-training set Mtrain, so that this learning algorithm can be more efficiently applied to new tasks, and generalizes across the task distribution p(T). The meta-learning algorithm can be represented as:

ΦALGO = MetaAlgo(Mtrain).    (2.4)

To evaluate the meta-learning algorithm, we can compute the aggregated test risk over the meta-test tasks:

(1/N) Σ_{j=M+1}^{M+N} L(ΦALGO(Dtrain^(j)); Dtest^(j)).    (2.5)

While it resembles the test loss in supervised learning, the aggregated test risk for a task replaces the traditional risk function for a data point.
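The evaluation in Eq. (2.5) can be made concrete with a toy sketch: a "learning algorithm" ΦALGO maps a context set to a predictive model, and is scored by the average test risk over held-out tasks. The 1-nearest-neighbour learner and the synthetic sine tasks below are illustrative choices, not methods from the thesis.

```python
import numpy as np

rng = np.random.default_rng(1)

def make_task(m=10, n=20):
    """Sample a task: a sine curve with task-specific amplitude."""
    a = rng.uniform(0.5, 2.0)
    x = rng.uniform(-3.0, 3.0, size=n)
    y = a * np.sin(x)
    return (x[:m], y[:m]), (x[m:], y[m:])   # (context, target)

def phi_algo(context):
    """Learning algorithm: returns a 1-nearest-neighbour predictor."""
    cx, cy = context
    return lambda x: cy[np.abs(cx[:, None] - x[None, :]).argmin(axis=0)]

meta_test = [make_task() for _ in range(50)]
risks = []
for context, (tx, ty) in meta_test:
    f = phi_algo(context)                    # f = Phi_ALGO(D_train^(j))
    risks.append(np.mean((ty - f(tx)) ** 2)) # test risk L(f; D_test^(j))
print(np.mean(risks))                        # aggregated test risk, Eq. (2.5)
```

A better meta-learned algorithm would produce predictors with a lower aggregated test risk across the whole meta-test set, not just on a single task.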
It is worth noting that while we focus on supervised learning tasks here, meta-learning can be extended to unsupervised learning [Edwards and Storkey, 2016, Reed et al., 2018, Hsu et al., 2018] or reinforcement learning [Wang et al., 2016, Finn et al., 2017a,b].
2.1.2 Different views of meta-learning

Bi-level optimization view. Let us assume both the predictive model f and the learning algorithm ΦALGO can be parameterised, with parameters denoted as ϕ and θ accordingly. That is to say, the learning algorithm can be written as:

ϕ = ΦALGO(Dtrain; θ).    (2.6)

Meta-learning can be formulated as the following bi-level optimization problem:

min_θ Σ_{j=1}^{M} L(f(·; ϕj(θ)); Dtest^(j)),    (2.7)

where the task-specific parameter ϕj depends on θ through the inner-loop optimization:

ϕj(θ) = ΦALGO(Dtrain^(j); θ).    (2.8)

Many meta-learning algorithms are developed based on this bi-level optimization view, such as Finn et al. [2017a], Nichol et al. [2018], Ravi and Larochelle [2016].
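The bi-level structure of Eqs. (2.6)-(2.8) can be sketched for a scalar toy problem in the spirit of MAML [Finn et al., 2017a], using the common first-order approximation of the outer gradient. The model y = ϕx, the task distribution, and all hyperparameters are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

def sample_task():
    """A task: 1-D linear regression y = a * x with task-specific slope a."""
    a = rng.uniform(1.0, 3.0)
    x_tr, x_te = rng.normal(size=5), rng.normal(size=5)
    return (x_tr, a * x_tr), (x_te, a * x_te)

def grad(phi, x, y):
    """d/dphi of the mean squared error for model f(x; phi) = phi * x."""
    return -2.0 * np.mean((y - phi * x) * x)

theta, alpha, beta = 0.0, 0.05, 0.1   # meta-init, inner lr, outer lr
for _ in range(2000):
    (x_tr, y_tr), (x_te, y_te) = sample_task()
    # Inner loop (Eq. 2.8): adapt theta to the task with one gradient step.
    phi_j = theta - alpha * grad(theta, x_tr, y_tr)
    # Outer update (Eq. 2.7): first-order gradient of the test risk at phi_j.
    theta -= beta * grad(phi_j, x_te, y_te)
print(theta)
```

The learned initialisation θ settles near the centre of the slope range, so a single inner-loop step adapts well to any task drawn from the distribution.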
Hierarchical model view. From a probabilistic perspective, the generative process for each task Tj can be expressed as:

θ ∼ p(θ),  ϕj ∼ p(ϕj | θ),  yi^(j) ∼ p(yi^(j) | xi^(j), ϕj, θ).    (2.9)

Both the training set Dtrain^(j) and the test set Dtest^(j) follow the same distribution (as illustrated in Figure 2.2). This can be seen as a probabilistic hierarchical model where θ indicates the high-level global parameters for all tasks and ϕj denotes the low-level local parameters for each task. In this context, meta-learning is about inferring θ from lots of tasks in the meta-training set, that is p(θ | Mtrain). Learning, on the other hand, infers ϕj given the training set Dtrain^(j) for task Tj, that is p(ϕj | θ, Dtrain^(j)).

Figure 2.2: Meta-learning as hierarchical models (a remake of Figure 1 in Gordon et al. [2018]). Task-specific parameter ϕj depends on the global parameter θ. Data points in both the context and the target have the same generative process, which depends on both θ and ϕj.

Note that p(ϕj | θ) can be seen as a prior for task Tj conditioned on θ. Therefore, meta-learning can be seen as learning an empirical prior from the meta-training set. Finn et al. [2018], Requeima et al. [2019] adopt this view.
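The generative process in Eq. (2.9) can be simulated by ancestral sampling: draw the global parameter once, then draw a local parameter and data for each task. The specific distributions below (Gaussian priors, a linear likelihood) are assumptions chosen for concreteness, not those of any model in the thesis.

```python
import numpy as np

rng = np.random.default_rng(3)

theta = rng.normal(0.0, 1.0)        # global parameter      theta ~ p(theta)
tasks = []
for j in range(4):                  # a few tasks T_j
    phi_j = rng.normal(theta, 0.3)  # local parameter       phi_j ~ p(phi | theta)
    x = rng.uniform(-1, 1, size=8)  # inputs for task j
    # observations y_i^(j) ~ p(y | x, phi_j): here a noisy linear likelihood
    y = phi_j * x + 0.05 * rng.standard_normal(8)
    tasks.append((x, y))
```

Because every ϕj is drawn around the same θ, tasks resemble each other, and inferring θ from many tasks amounts to learning the shared (empirical) prior described above.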
Model-based view. A learning algorithm f = ΦALGO(Dtrain) can be seen as a function that takes in the entire training set and outputs a predictive model. The model is then used to make predictions on test data in Dtest. The learning and prediction processes can thus be conceptualized as sequence-to-sequence mappings. For the sake of brevity, let us use a concise notation for data sequences, such as x1:n = {x1, x2, ..., xn}. For a specific task Tj, making predictions for test set data points based on those from the training set can be described as the following inference task:

p(ym+1:n | xm+1:n, x1:m, y1:m).    (2.10)

From this perspective, meta-learning is about creating this conditional model. Meta-learning only differs from conventional supervised learning in that both the inp