Support Vector Machine Learning

Supervised Learning

An agent or machine is given N sensory inputs D = {x1, x2, ..., xN}, as well as the desired outputs y1, y2, ..., yN; its goal is to learn to produce the correct output given a new input. Given D, what can we say about a new input?
• Classification: y1, y2, ..., yN are discrete class labels; learn a function f: x → y
– Naïve Bayes
– Decision trees
– K nearest neighbors
– Least squares
Supervised learning = learning from labeled data; the dominant problem in Machine Learning.

Linear Classifiers

Binary classification can be viewed as the task of separating classes in feature space:
• the separating hyperplane: w^T x + b = 0
• the two decision regions: w^T x + b > 0 and w^T x + b < 0
• the classifier: h(x) = sign(w^T x + b), i.e. predict +1 if w^T x + b > 0 and -1 otherwise.
Which of the linear separators is optimal?

Classification Margin

Geometry of linear classification:
• Discriminant function: f(x) = w^T x + b.
• Distance from example xi to the separator: r = |w^T xi + b| / ||w||.
• Important: this distance does not change if we rescale w and b by the same constant.
• Define the margin of a linear classifier as the width that the boundary could be increased by before hitting a data point.
• Examples closest to the hyperplane are support vectors.
• The margin m of the separator is the distance between the support vectors on the two sides: m = 2r.

Maximum Margin Classification

• Maximizing the margin is good according to intuition and PAC theory.
• It implies that only support vectors matter; other training examples are ignorable.
How do we compute m in terms of w and b?

Maximum Margin

Let the training set {(xi, yi)}, i = 1..N, with xi ∈ R^d and yi ∈ {-1, 1}, be separated by a hyperplane with margin m. Then for each training example (xi, yi):

    w^T xi + b ≤ -c   if yi = -1
    w^T xi + b ≥ c    if yi = 1

or equivalently, yi(w^T xi + b) ≥ c. For every support vector xs the inequality above is an equality: ys(w^T xs + b) = c. From the equality, the distance between each xs and the hyperplane is

    r = |w^T xs + b| / ||w|| = ys(w^T xs + b) / ||w|| = c / ||w||

Then the margin can be expressed through w and b: m = 2r = 2c / ||w||. Here is our Maximum Margin Classification problem:

    maximize 2c / ||w||   subject to   yi(w^T xi + b) ≥ c, i = 1..N

Note that the magnitude of c merely scales w and b, and does not change the classification boundary at all! So we have a cleaner problem (fix c = 1):

    maximize 2 / ||w||   subject to   yi(w^T xi + b) ≥ 1, i = 1..N

This leads to the famous Support Vector Machines, believed by many to be the best "off-the-shelf" supervised learning algorithm. Learning becomes parameter estimation: find w and b.

Support Vector Machines

A convex quadratic programming problem with linear constraints:

    minimize ½ ||w||²   subject to   yi(w^T xi + b) ≥ 1, i = 1..N

– The attained margin is now given by 2 / ||w||.
– Only a few of the classification constraints are active → support vectors.
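
To make the primal QP concrete, here is a minimal sketch in Python. The slides only say that such problems can be handed to quadratic programming code; the choice of cvxopt as the solver and the four toy points are illustrative assumptions, not part of the lecture.

```python
# Hard-margin SVM primal: minimize (1/2)||w||^2  s.t.  y_i (w^T x_i + b) >= 1.
# Stack the variables as z = [w; b] and pass the problem to a QP solver.
import numpy as np
from cvxopt import matrix, solvers

X = np.array([[2.0, 2.0], [3.0, 3.0], [0.0, 0.5], [-1.0, 0.0]])  # toy inputs x_i
y = np.array([1.0, 1.0, -1.0, -1.0])                             # labels y_i
N, d = X.shape

# Objective (1/2) z^T P z + q^T z penalizes only w, not the bias b.
P = np.zeros((d + 1, d + 1))
P[:d, :d] = np.eye(d)
q = np.zeros((d + 1, 1))

# Each constraint y_i (w^T x_i + b) >= 1 becomes -y_i [x_i, 1] z <= -1,
# i.e. the G z <= h form that QP solvers expect.
G = -y[:, None] * np.hstack([X, np.ones((N, 1))])
h = -np.ones((N, 1))

solvers.options["show_progress"] = False
sol = solvers.qp(matrix(P), matrix(q), matrix(G), matrix(h))
z = np.array(sol["x"]).ravel()
w, b = z[:d], z[d]
print("w =", w, " b =", b, " attained margin =", 2.0 / np.linalg.norm(w))
```

Only the constraints of the points closest to the boundary end up tight at the optimum, matching the remark above that just a few constraints are active.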

Constrained Optimization

• We can directly solve this using commercial quadratic programming (QP) code.
• But we want to make a more careful investigation of Lagrange duality, and of the solution of the problem above in its dual form:
– deeper insight: support vectors, kernels

Quadratic Programming

Minimize (with respect to x)

    f(x) = ½ x^T Q x + c^T x

subject to one or more constraints of the form Ax ≤ b and/or Ex = d. If Q ⪰ 0 then f(x) is a convex function; in this case the quadratic program has a global minimizer. The quadratic program of support vector machines, minimize ½||w||² subject to yi(w^T xi + b) ≥ 1, is exactly of this form.

Solving Maximum Margin

Our optimization problem:

    minimize ½ ||w||²   subject to   yi(w^T xi + b) ≥ 1, i = 1..N

The Lagrangian: consider each constraint yi(w^T xi + b) - 1 ≥ 0 with a multiplier αi ≥ 0:

    L(w, b, α) = ½ ||w||² - Σi αi [yi(w^T xi + b) - 1]

The problem can be reformulated as min over (w, b) of max over α ≥ 0 of L(w, b, α); the dual problem swaps the two: max over α ≥ 0 of min over (w, b) of L(w, b, α).

The Dual Problem

We minimize L with respect to w and b by setting the derivatives to zero:

    ∂L/∂b = 0   ⟹   Σi αi yi = 0         (*)
    ∂L/∂w = 0   ⟹   w = Σi αi yi xi      (**)

Note that the bias term b dropped out, but it has produced the "global" constraint (*). (Useful identities: d(Ax + b)^T(Ax + b) = 2(Ax + b)^T A dx and d(x^T a) = d(a^T x) = a^T dx.) Plugging (**) back into L, and using (*), we have

    L(α) = Σi αi - ½ Σi Σj αi αj yi yj xi^T xj

Now we have the following dual optimization problem:

    maximize Σi αi - ½ Σi Σj αi αj yi yj xi^T xj
    subject to αi ≥ 0, i = 1..N, and Σi αi yi = 0

This is a quadratic programming problem:
– A global maximum can always be found.
But what's the big deal?
– w can be recovered by w = Σi αi yi xi
– b can be recovered as well (more later)

Support Vectors

If a point xi lies strictly outside the margin, i.e. yi(w^T xi + b) > 1, then αi = 0. This is due to the KKT complementarity condition αi [yi(w^T xi + b) - 1] = 0 at the optimum. So w is decided by the points with non-zero αi, and only a few αi's can be non-zero: the support vectors.

Support Vector Machines

Once we have the Lagrange multipliers αi, we can reconstruct the parameter vector w as a weighted combination of training examples (the sum runs over the support vectors SV):

    w = Σ_{i ∈ SV} αi yi xi

For testing with a new data point x', compute

    w^T x' + b = Σ_{i ∈ SV} αi yi (xi^T x') + b

and classify x' as class 1 if the sum is positive, and class 2 otherwise. Note: w need not be formed explicitly.

Interpretation of Support Vector Machines

• The optimal w is a linear combination of a small number of data points. This sparse representation can be viewed as data compression, as in the construction of the kNN classifier.
• To compute the weights αi, and to use support vector machines, we need to specify only the inner products (or kernel) between the examples, xi^T xj.
• We make decisions by comparing each new example x' only with the support vectors.
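
The dual pipeline above (solve for α, read off the support vectors, recover w and b, classify with inner products only) can be sketched under the same assumptions as the previous snippet; the 1e-6 threshold for deciding that an αi is non-zero is also an illustrative choice.

```python
# Dual hard-margin SVM: min (1/2) a^T P a - 1^T a  with  P_ij = y_i y_j x_i^T x_j,
# subject to a_i >= 0 and sum_i a_i y_i = 0.
import numpy as np
from cvxopt import matrix, solvers

X = np.array([[2.0, 2.0], [3.0, 3.0], [0.0, 0.5], [-1.0, 0.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
N = len(y)

P = matrix(np.outer(y, y) * (X @ X.T))  # y_i y_j x_i^T x_j
q = matrix(-np.ones((N, 1)))            # linear term: -sum_i a_i
G = matrix(-np.eye(N))                  # -a_i <= 0, i.e. a_i >= 0
h = matrix(np.zeros((N, 1)))
A = matrix(y.reshape(1, N))             # the "global" constraint sum_i a_i y_i = 0
b0 = matrix(np.zeros((1, 1)))

solvers.options["show_progress"] = False
alpha = np.array(solvers.qp(P, q, G, h, A, b0)["x"]).ravel()

sv = alpha > 1e-6                       # support vectors: non-zero multipliers
w = (alpha[sv] * y[sv]) @ X[sv]         # w = sum_s a_s y_s x_s
b = np.mean(y[sv] - X[sv] @ w)          # from y_s (w^T x_s + b) = 1 on each SV

def classify(x_new):
    # Uses only inner products between x_new and the support vectors;
    # w itself never has to be formed.
    return np.sign((alpha[sv] * y[sv]) @ (X[sv] @ x_new) + b)

print("alpha =", alpha.round(3), " prediction:", classify(np.array([2.5, 2.0])))
```

Most entries of α come out zero, and classify consults only the remaining support vectors, exactly the sparse behavior described above.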

Soft Margin Classification

What if the training set is not linearly separable? Slack variables ξi can be added to allow misclassification of difficult or noisy examples; the resulting margin is called soft.

"Hard" margin vs. soft margin:

    Hard:  minimize ½ ||w||²             subject to  yi(w^T xi + b) ≥ 1
    Soft:  minimize ½ ||w||² + C Σi ξi   subject to  yi(w^T xi + b) ≥ 1 - ξi, ξi ≥ 0

➢ Note that ξi = 0 if there is no error for xi.
➢ Σi ξi is an upper bound on the number of training errors.
➢ Parameter C can be viewed as a way to control overfitting: it "trades off" the relative importance of maximizing the margin and fitting the training data (minimizing the error).
– Larger C → more reluctant to make errors.

The Optimization Problem

The dual of this new constrained optimization problem is:

    maximize Σi αi - ½ Σi Σj αi αj yi yj xi^T xj
    subject to 0 ≤ αi ≤ C, i = 1..N, and Σi αi yi = 0

This is very similar to the optimization problem in the separable case, except that there is now an upper bound C on each αi. Once again, a QP solver can be used to find the solution.

Loss Functions

Loss is measured by how far an example falls on the wrong side of its margin: at the optimum, ξi = max(0, 1 - yi(w^T xi + b)). This loss is known as hinge loss. Common loss functions for binary classification:
• Zero/one loss L0/1 (no good for gradient-based optimization)
• Squared loss
• Absolute loss
• Hinge loss (support vector machines)
• Logistic loss (logistic regression)

Linear SVMs: Summary

• The classifier is a separating hyperplane.
• The most "important" training points are support vectors; they define the hyperplane.
• Quadratic optimization algorithms can identify which training points xi are support vectors: those with non-zero Lagrangian multipliers αi.
• Both in the dual formulation of the problem and in the solution, training points appear only inside inner products:

    f(x) = Σi αi yi (xi^T x) + b

    Find α1, ..., αN such that Q(α) = Σi αi - ½ Σi Σj αi αj yi yj (xi^T xj) is maximized and
    Σi αi yi = 0,  0 ≤ αi ≤ C for all αi

Non-linear SVMs

• Datasets that are linearly separable with some noise work out great. (figure: 1-D data on the x axis, separable about 0)
• But what are we going to do if the dataset is just too hard? (figure: 1-D data, not linearly separable)
• How about... mapping the data to a higher-dimensional space? (figure: the same 1-D data after mapping)

Non-linear SVMs: Feature Spaces

In general, the original feature space can always be mapped to some higher-dimensional feature space where the training set is separable: x → φ(x).

The "Kernel Trick"

Recall the SVM optimization problem: the data points only appear as inner products.
• As long as we can calculate the inner product in the feature space, we do not need the mapping explicitly.
• Many common geometric operations (angles, distances) can be expressed by inner products.
• Define the kernel function K by K(xi, xj) = φ(xi)^T φ(xj).

Kernels: Two Viewpoints

• Features viewpoint: construct and work with φ(x) (think in terms of properties of x).
• Kernel viewpoint: construct and work with K(xi, xj) (think in terms of similarity between examples).

An Example for Feature Mapping and Kernels

(figure: a worked example of a feature mapping φ and its kernel)

More Examples of Kernel Functions

• Linear: K(xi, xj) = xi^T xj
– Mapping Φ: x → φ(x), where φ(x) is x itself.
• Polynomial of power p: K(xi, xj) = (1 + xi^T xj)^p
• Gaussian (radial-basis function): K(xi, xj) = exp(-||xi - xj||² / (2σ²))
– Mapping Φ: x → φ(x), where φ(x) is infinite-dimensional.
• The higher-dimensional space still has intrinsic dimensionality d, but linear separators in it correspond to non-linear separators in the original space.

Kernel Matrix

Suppose for now that K is indeed a valid kernel corresponding to some feature mapping φ. Then for x1, ..., xn we can compute an n × n matrix {Ki,j} where Ki,j = φ(xi)^T φ(xj). This is called a kernel matrix.
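
The kernels listed above, and the kernel matrix, can be written down directly; here is a short sketch (the three sample points are hypothetical).

```python
# Example kernel functions and the n x n kernel (Gram) matrix.
import numpy as np

def linear_kernel(xi, xj):
    # K(xi, xj) = xi^T xj
    return xi @ xj

def polynomial_kernel(xi, xj, p=2):
    # K(xi, xj) = (1 + xi^T xj)^p
    return (1.0 + xi @ xj) ** p

def gaussian_kernel(xi, xj, sigma=1.0):
    # K(xi, xj) = exp(-||xi - xj||^2 / (2 sigma^2)); the implicit phi(x)
    # is infinite-dimensional, yet K itself is cheap to evaluate.
    return np.exp(-np.sum((xi - xj) ** 2) / (2.0 * sigma**2))

def kernel_matrix(X, k):
    # K[i, j] = k(x_i, x_j) = phi(x_i)^T phi(x_j) for the implicit mapping phi.
    n = len(X)
    return np.array([[k(X[i], X[j]) for j in range(n)] for i in range(n)])

X = np.array([[0.0, 1.0], [1.0, 1.0], [2.0, 0.0]])
K = kernel_matrix(X, gaussian_kernel)
# A kernel matrix built from a valid kernel is symmetric positive semi-definite.
print(np.allclose(K, K.T), bool(np.all(np.linalg.eigvalsh(K) > -1e-10)))
```

Evaluating K(xi, xj) directly, without ever constructing φ(x), is precisely what the kernel trick exploits.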
