版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领
文档简介
FoundationsofMachineLearningIntroductionofMachineLearningContents1ClassicalMachineLearning234Whatismachinelearning?EnsembleMethodsReinforcementLearning5DeepLearningAboutmachinelearningFromLearningtoMachineLearningLearning:AcquiringskillWithexperienceaccumulatedfromobservationsFromLearningtoMachineLearningLearning:AcquiringskillWithexperienceaccumulatedfromobservationsMachineLearning:AcquiringskillWithexperienceaccumulated/computedfromdataWhatisskill?AMoreConcreteDefinitionskill⇔
improve
some
performance
measure
(e.g.
prediction
accuracy)MachineLearning:improvingperformance
measure
withexperiencecomputedfromdataAMoreConcreteDefinitionAprogramcanbesaidtolearnfromexperienceEwithrespecttosomeclassoftasksTandperformancemeasureP,ifitsperformanceattasksinT,asmeasuredbyP,improvedwithexperienceE.ImproveonTaskwithrespecttoPerformancemetricbasedonExperienceT:PlayingcheckersP:Percentageofgameswonagainstanarbitraryopponent
E:PlayingpracticegamesagainstitselfWhyusemachinelearningML:analternativeroutetobuildcomplicatedsystemLearnfromthispictureandrecognize:3-year-oldcandoDefineflowersandhand-program:difficultML-basedflowersrecognitionsystemcanbeeasiertobuildthanhand-programmedsystemMLRouteML:analternativeroutetobuildcomplicatedsystemSomeScenariostouseMLwhenhumancannotprogramthesystemmanuallyNavigatingonMarswhenhumancannotdefinethesolutioneasilySpeechrecognitionWhenneedingrapiddecisionsthathumancannotdoHigh-frequencytradingWhenneedingtobeuser-orientedinamassivescaleConsumer-targetedmarketingKeyessenceofMLKeyessence:helpdecidewhethertouseMLMachineLearning:improvingperformance
measure
withexperiencecomputedfromdataExistssomeunderlyingpatterntobelearnedSoperformancemeasurecanbeimprovedButnoprogrammabledefinitionSoMLisneededSomehowthereisdataaboutthepatternSoMLhassomeinputstolearnfromThreecomponentsofmachinelearningDataWanttodetectspam?Getsamplesofspammessages.Wanttoforecaststocks?Findthepricehistory.Wanttofindoutuserpreferences?ParsetheiractivitiesonWebChat.Therearetwomainwaystogetthedata—manualandautomatic.Manuallycollecteddatacontainsfarfewererrorsbuttakesmoretimetocollect.Automaticapproachischeaperbutwithmoreerrors.SomesmartasseslikeGoogleusetheirowncustomerstolabeldataforthemforfree.RememberReCaptchawhichforcesyouto"Selectallstreetsigns"?That'sexactlywhatthey'redoing.Freelabour!Nice.ThreecomponentsofmachinelearningDataFeaturesAlsoknownasparametersorvariables.Thosecouldbecarmileage,user'sgender,stockprice,wordfrequencyinthetext.Inotherwords,thesearethefactorsforamachinetolookat.Whendatastoredintablesit'ssimple—featuresarecolumnnames.Butwhataretheyifyouhave100Gbofcatpics?Wecannotconsidereachpixelasafeature.That'swhyselectingtherightfeaturesusuallytakeswaylongerthanalltheotherMLparts.That'salsothemainsourceoferrors.ThreecomponentsofmachinelearningDataFeaturesAlgorithmsMostobviouspart.Anyproblemcanbesolveddifferently.Themethodyouchooseaffectstheprecision,performance,andsizeofthefinalmodel.Thereisoneimportantnuancethough:ifthedataiscrappy,eventhebestalgorithmwon'thelp.Sometimesit'sreferredas"garbagein–garbageout".Sodon'tpaytoomuchattentiontothepercentageofaccuracy,trytoacquiremoredatafirst.LearningvsIntelligenceArtificialintelligenceisthenameofawholeknowledgefield,similartobiologyorchemistry.MachineLearningisapartofartificialintelligence.Animportantpart,butnottheonlyone.NeuralNetworksareoneofmachinelearningtypes.Apopularone,butthereareothergoodguysintheclass.DeepLearningisamodernmethodofbuilding,training,andusingneuralnetworks.Basically,it'sanewarchitecture.Nowadaysinpractice,nooneseparatesdeeplearningfromthe"ordinarynetworks".Weevenusethesamelibrariesforthem.LearningvsIntelligence深度学习都是神经网络吗?机器学习下面应该是表示学习,包括所以使用机器学习挖掘表示本身的方法。
ThemapofmachinelearningworldThemapofmachinelearningworldLet'sstartwithabasicoverview.Nowadaystherearefourmaindirectionsinmachinelearning.Contents1ClassicalMachineLearning234Whatismachinelearning?EnsembleMethodsReinforcementLearning5DeepLearningClassicalMachineLearningClassicalmachinelearningisoftendividedintotwocategories–SupervisedandUnsupervisedLearning.SupervisedLearningTherearetwotypesofSupervisedLearning:classification–anobject'scategoryprediction,andregression–predictionofaspecificpointonanumericaxis.Classification"Splitsobjectsbasedatoneoftheattributesknownbeforehand.Separatesocksbybasedoncolor,documentsbasedonlanguage,musicbygenre".Todayusedfor:Spamfiltering,Languagedetection,Asearchofsimilardocuments,Sentimentanalysis,Recognitionofhandwrittencharactersandnumbers,Frauddetection,etc.Popularalgorithms:NaiveBayes,DecisionTree,LogisticRegression,K-NearestNeighbours,SupportVectorMachineClassificationInspamfilteringtheNaiveBayesalgorithmwaswidelyused.Themachinecountsthenumberof"viagra"mentionsinspamandnormalmail,thenitmultipliesbothprobabilitiesusingtheBayesequation,sumstheresultsandyay,wehaveMachineLearning.ClassificationHere'sanotherpracticalexampleofclassification.Let'ssayyouneedsomemoneyoncredit.Howwillthebankknowifyou'llpayitbackornot?Usingthisdata,wecanteachthemachinetofindthepatternsandgettheanswer.There'snoissuewithgettingananswer.Theissueisthatthebankcan'tblindlytrustthemachineanswer.Todealwithit,wehaveDecisionTrees.Allthedataautomaticallydividedtoyes/noquestions.Theycouldsoundabitweirdfromahumanperspective,e.g.,whetherthecreditorearnsmorethan$128.12?Though,themachinecomesupwithsuchquestionstosplitthedatabestateachstep.ClassificationSupportVectorMachines(SVM)isrightfullythemostpopularmethodofclassicalclassification.Itwasusedtoclassifyeverythinginexistence:plantsbyappearanceinphotos,documentsbycategories,etc.TheideabehindSVMissimple–it'stryingtodrawtwolinesbetweenyourdatapointswiththelargestmarginbetweenthem.Lookatthepicture:Regression"Drawalinethroughthesedots.Yep,that'sthemachinelearning“Todaythisisusedfor:StockpriceforecastsDemandandsalesvolumeanalysisMedicaldiagnosisAnynumber-timecorrelationsPopularalgorithmsareLinearandPolynomialregressions.RegressionRegressionisbasicallyclassificationwhereweforecastanumberinsteadofcategory.Examplesarecarpricebyitsmileage,trafficbytimeoftheday,demandvolumebygrowthofthecompanyetc.Regressionisperfectwhensomethingdependsontime.UnsupervisedlearningUnsupervisedwasinventedabitlater,inthe'90s.Itisusedlessoften,butsometimeswesimplyhavenochoice.Labeleddataisluxury.ButwhatifIwanttocreate,let'ssay,abusclassifier?ShouldImanuallytakephotosofmillionfuckingbusesonthestreetsandlabeleachofthem?There'salittlehopeforcapitalisminthiscase.Thankstosocialstratification,wehavemillionsofcheapworkersandserviceslikeMechanicalTurkwhoarereadytocompleteyourtaskfor$0.05.Andthat'showthingsusuallygetdonehere.Clustering"Dividesobjectsbasedonunknownfeatures.Machinechoosesthebestway“Nowadaysused:Formarketsegmentation(typesofcustomers,loyalty)TomergeclosepointsonamapForimagecompressionToanalyzeandlabelnewdataTodetectabnormalbehaviorPopularalgorithms:K-means_clustering,Mean-Shift,DBSCANDimensionalityReduction"Assemblesspecificfeaturesintomorehigh-levelones“Nowadaysisusedfor:Recommendersystems(★)BeautifulvisualizationsTopicmodelingandsimilardocumentsearchFakeimageanalysisRiskmanagementPopularalgorithms:PrincipalComponentAnalysis(PCA),SingularValueDecomposition(SVD),LatentDirichletallocation(LDA),LatentSemanticAnalysis(LSA,pLSA,GLSA),t-SNE(forvisualization)Associationrulelearning"Lookforpatternsintheorders'stream"Nowadaysisused:ToforecastsalesanddiscountsToanalyzegoodsboughttogetherToplacetheproductsontheshelvesToanalyzewebsurfingpatternsPopularalgorithms:Apriori,Euclat,FP-growthContents1ClassicalMachineLearning234Whatismachinelearning?EnsembleMethodsReinforcementLearning5DeepLearningEnsembleMethods"Bunchofstupidtreeslearningtocorrecterrorsofeachother"Nowadaysisusedfor:Everythingthatfitsclassicalalgorithmapproaches(butworksbetter)Searchsystems(★)ComputervisionObjectdetectionPopularalgorithms:RandomForest,GradientBoostingStackingOutputofseveralparallelmodelsispassedasinputtothelastonewhichmakesafinaldecision.RegressionRegressionisbasicallyclassificationwhereweforecastanumberinsteadofcategory.Examplesarecarpricebyitsmileage,trafficbytimeoftheday,demandvolumebygrowthofthecompanyetc.Regressionisperfectwhensomethingdependsontime.UnsupervisedlearningUnsupervisedwasinventedabitlater,inthe'90s.Itisusedlessoften,butsometimeswesimplyhavenochoice.Labeleddataisluxury.ButwhatifIwanttocreate,let'ssay,abusclassifier?ShouldImanuallytakephotosofmillionfuckingbusesonthestreetsandlabeleachofthem?There'salittlehopeforcapitalisminthiscase.Thankstosocialstratification,wehavemillionsofcheapworkersandserviceslikeMechanicalTurkwhoarereadytocompleteyourtaskfor$0.05.Andthat'showthingsusuallygetdonehere.Clustering"Dividesobjectsbasedonunknownfeatures.Machinechoosesthebestway“Nowadaysused:Formarketsegmentation(typesofcustomers,loyalty)TomergeclosepointsonamapForimagecompressionToanalyzeandlabelnewdataTodetectabnormalbehaviorPopularalgorithms:K-means_clustering,Mean-Shift,DBSCANDimensionalityReduction"Assemblesspecificfeaturesintomorehigh-levelones“Nowadaysisusedfor:Recommendersystems(★)BeautifulvisualizationsTopicmodelingandsimilardocumentsearchFakeimageanalysisRiskmanagementPopularalgorithms:PrincipalComponentAnalysis(PCA),SingularValueDecomposition(SVD),LatentDirichletallocation(LDA),LatentSemanticAnalysis(LSA,pLSA,GLSA),t-SNE(forvisualization)Associationrulelearning"Lookforpatternsintheorders'stream"Nowadaysisused:ToforecastsalesanddiscountsToanalyzegoodsboughttogetherToplacetheproductsontheshelvesToanalyzewebsurfingpatternsPopularalgorithms:Apriori,Euclat,FP-growthContents1ClassicalMachineLearning234Whatismachinelearning?EnsembleMethodsReinforcementLearning5DeepLearningEnsembleMethods"Bunchofstupidtreeslearningtocorrecterrorsofeachother"Nowadaysisusedfor:Everythingthatfitsclassicalalgorithmapproaches(butworksbetter)Searchsystems(★)ComputervisionObjectdetectionPopularalgorithms:RandomForest,GradientBoostingStackingOutputofseveralparallelmodelsispassedasinputtothelastonewhichmakesafinaldecision.BaggingUsethesamealgorithmbuttrainitondifferentsubsetsoforiginaldata.Intheend—justaverageanswers.BaggingUsethesamealgorithmbuttrainitondifferentsubsetsoforiginaldata.Intheend—justaverageanswers.ThemostfamousexampleofbaggingistheRandomForestalgorithm,whichissimplybaggingonthedecisiontrees(whichwereillustratedabove).Whenyouopenyourphone'scameraappandseeitdrawingboxesaroundpeople'sfaces—it'sprobablytheresultsofRandomForestwork.BoostingAlgorithmsaretrainedonebyonesequentially.Eachsubsequentonepayingmostofitsattentiontodatapointsthatweremispredictedbythepreviousone.Repeatuntilyouarehappy.Sameasinbagging,weusesubsetsofourdatabutthistimetheyarenotrandomlygenerated.Now,ineachsubsamplewetakeapartofthedatathepreviousalgorithmfailedtoprocess.Thus,wemakeanewalgorithmlearntofixtheerrorsofthepreviousone.Nowadaystherearethreepopulartoolsforboosting,youcanreadacomparativereportinCatBoostvs.LightGBMvs.XGBoostContents1ClassicalMachineLearning234Whatismachinelearning?EnsembleMethodsReinforcementLearning5DeepLearningReinforcementLearning"Throwarobotintoamazeandletitfindanexit"Nowadaysusedfor:Self-drivingcarsRobotvacuumsGamesAutomatingtradingEnterpriseresourcemanagementPopularalgorithms:Q-Learning,SARSA,DQN,A3C,GeneticalgorithmReinforcementLearningReinforcementlearningisusedincaseswhenyourproblemisnotrelatedtodataatall,butyouhaveanenvironmenttolivein.Likeavideogameworldoracityforself-drivingcar.Survivinginanenvironmentisacoreideaofreinforcementlearning.Throwpoorlittlerobotintoreallife,punishitforerrorsandrewarditforrightdeeds.Samewayweteachourkids,right?Contents1ClassicalMachineLearning234Whatismachinelearning?EnsembleMethodsReinforcementLearning5DeepLearningNeuralNetworksandDeepLeaning"Wehaveathousand-layernetwork,dozensofvideocards,butstillnoideawheretouseit.Let'sgeneratecatpics!"Usedtodayfor:ReplacementofallalgorithmsaboveObjectidentificationonphotosandvideosSpeechrecognitionandsynthesisImageprocessing,styletransferMachinetranslationPopulararchitectures:Perceptron,ConvolutionalNetwork(CNN),RecurrentNetworks(RNN),AutoencodersNeuralNetworksandDeepLeaningAnyneuralnetworkisbasicallyacollectionofneuronsandconnectionsbetweenthem.Neuron
isafunctionwithabunchofinputsandoneoutput.Itstaskistotakeallnumbersfromitsinput,performafunctiononthemandsendtheresulttotheoutput.NeuralNetworksandDeepLeaningneuronsConnections
arelikechannelsbetweenneurons.Theyconnectoutputsofoneneuronwiththeinputsofanothersotheycansenddigitstoeachother.Eachconnectionhason
温馨提示
- 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
- 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
- 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
- 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
- 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
- 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
- 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。
最新文档
- GB/T 44320-2024航空航天用MJ螺纹六角开槽薄螺母
- 龙口南山养生谷肿瘤医院PETCT、ECT核医学工作场所建设项目环境
- 2025高考物理步步高同步练习必修2第八章重力势能含答案
- 【2021】三年级下册科学教案 新苏教版
- 大学英语六级改革适用(阅读)模拟试卷56(共220题)
- 记账实操-大闸蟹养殖企业的账务处理分录
- 2025高考物理步步高同步练习选修2第二章互感和自感含答案
- 《清贫》清廉美德教案
- 《成语接龙游戏》语言趣味教案
- 专升本高等数学二(一元函数微分学)模拟试卷1(共147题)
- 2024-2030年中国肺癌行业市场发展趋势与前景展望战略分析报告
- 反诉状(业主反诉物业)(供参考)
- 12.1 认识内能教学设计- 2023-2024学年沪粤版九年级物理上册
- 太阳能光伏发电系统设计方案课件(112张)
- 2024年上海市各区初三语文一模卷试题汇编之记叙文含答案
- 2025中考英语备考专题01 宾语从句(宿迁中考真题+名校模拟)
- 质量、环境、职业健康安全管理体系程序文件
- HY/T 0402-2024基于移动式平台的海洋仪器设备海上试验标准体系
- 《教育向美而生-》读书分享课件
- 名人-老子-人物介绍
- 2024(茅台酒)白酒酿造工职业技能认定-制曲制酒考试题(夺冠)
评论
0/150
提交评论