




Machine Learning and Multivariate Statistical Methods in Particle Physics
Glen Cowan
RHUL Physics
pp.rhul.ac.uk/~cowan
RHUL Computer Science Seminar
17 March, 2021

Outline
Quick overview of particle physics at the Large Hadron Collider (LHC)
Multivariate classification from a particle physics viewpoint
Some examples of multivariate classification in particle physics
  Neural Networks
  Boosted Decision Trees
  Support Vector Machines
Summary, conclusions, etc.

The Standard Model of particle physics
Matter... + gauge bosons... photon (γ), W±, Z, gluon (g) + relativity + quantum mechanics + symmetries... = Standard Model
25 free parameters (masses, coupling strengths, ...).
Includes Higgs boson (not yet seen).
Almost certainly incomplete (e.g. no gravity).
Agrees with all experimental observations so far.
Many candidate extensions to SM (supersymmetry, extra dimensions, ...)

The Large Hadron Collider
Counter-rotating proton beams in 27 km circumference ring
pp centre-of-mass energy 14 TeV
Detectors at 4 pp collision points:
  ATLAS, CMS (general purpose)
  LHCb (b physics)
  ALICE (heavy ion physics)

The ATLAS detector
2100 physicists, 37 countries, 167 universities/labs
25 m diameter, 46 m length, 7000 tonnes, ~10^8 electronic channels

A simulated SUSY event in ATLAS
[Figure: event display of a pp collision, labelled: high pT muons, high pT jets of hadrons, missing transverse energy]

Background events
This event from Standard Model ttbar production also has high pT jets and muons, and some missing transverse energy.
→ can easily mimic a SUSY event.

LHC event production rates
[Figure: production rates, labelled: most events (boring), mildly interesting, interesting, very interesting (~1 out of every 10^11)]

LHC data
At LHC, ~10^9 pp collision events per second, mostly uninteresting
  do quick sifting, record ~200 events/sec
  single event ~1 Mbyte
  1 "year" ≈ 10^7 s, 10^16 pp collisions/year
  2 × 10^9 events recorded/year (~2 Pbyte/year)
For new/rare processes, rates at LHC can be vanishingly small
  e.g. Higgs bosons detectable per year could be ~10^3
  → 'needle in a haystack'
For Standard Model and (many) non-SM processes we can generate simulated data with Monte Carlo programs (including simulation of the detector).

A simulated event
PYTHIA Monte Carlo pp → gluino-gluino
...

Multivariate analysis in particle physics
For each event we measure a set of numbers:
  x1 = jet pT
  x2 = missing energy
  x3 = particle i.d. measure, ...
The vector x = (x1, ..., xn) follows some n-dimensional joint probability density, which depends on the type of event produced, i.e., on which process it was.
E.g. hypotheses H0, H1, ...
Often simply "signal", "background"

Finding an optimal decision boundary
In particle physics usually start by making simple "cuts":
  xi < ci
  xj < cj
Maybe later try some other type of decision boundary.
[Figure: three panels, each showing the H0 and H1 regions separated by a different type of decision boundary]

The optimal decision boundary
Try to best approximate the optimal decision boundary, which is based on the likelihood ratio, or equivalently think of the likelihood ratio as the optimal statistic for a test of H0 vs H1.
In general we don't have the pdfs p(x|H0), p(x|H1), ...
Rather, we have Monte Carlo models for each process.
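To make the likelihood-ratio idea concrete, here is a minimal sketch for a single measured variable. It assumes, purely for illustration, that both pdfs are known one-dimensional Gaussians; the means, widths, threshold and function names are hypothetical and not taken from the talk. In a real analysis the pdfs are not available in closed form, which is why a classifier trained on Monte Carlo events is used to approximate this statistic.

```python
import math

def gauss_pdf(x, mean, sigma):
    """Density of a one-dimensional Gaussian."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Hypothetical toy pdfs (assumption, not from the talk):
# H0 (background) ~ Gauss(mean=0, sigma=1), H1 (signal) ~ Gauss(mean=2, sigma=1).
def likelihood_ratio(x):
    """t(x) = p(x|H1) / p(x|H0); larger values favour H1."""
    return gauss_pdf(x, 2.0, 1.0) / gauss_pdf(x, 0.0, 1.0)

def select_as_signal(x, threshold=1.0):
    """Accept the event as H1-like if the likelihood ratio exceeds the threshold."""
    return likelihood_ratio(x) > threshold
```

For these two equal-width Gaussians the ratio is a monotonic function of x, so cutting on the ratio is the same as a simple cut on x; with several correlated inputs the surface of constant ratio is in general not a set of rectangular cuts.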
Usually training data from the MC models is cheap.
But the models contain many approximations: predictions for observables obtained using perturbation theory (truncated at some order); phenomenological modeling of non-perturbative effects; imperfect detector description, ...

Two distinct event selection problems
In some cases, the event types in question are both known to exist.
  Example: separation of different particle types (electron vs muon)
  Use the selected sample for further study.
In other cases, the null hypothesis H0 means "Standard Model" events, and the alternative H1 means "events of a type whose existence is not yet established" (to do so is the goal of the analysis).
  Many subtle issues here, mainly related to the heavy burden of proof required to establish presence of a new phenomenon.
  Typically require p-value of background-only hypothesis below ~10^-7 (a 5 sigma effect) to claim discovery of "New Physics".

Discovering "New Physics"
The LHC experiments are expensive
  ~$10^10 (accelerator and experiments)
the competition is intense
  (ATLAS vs. CMS) vs. Tevatron
and the stakes are high:
[Figure: "4 sigma effect" vs "5 sigma effect"]
So there is a strong motivation to extract all possible information from the data.

Using classifier output for discovery
[Figure: two plots of the classifier output y. Left: f(y), normalized to unity, for signal and background. Right: N(y), normalized to expected number of events, showing background, a possible excess, and the search region above ycut]
Discovery = number of events found in search region incompatible with background-only hypothesis.
The p-value of the background-only hypothesis can depend crucially on the distribution f(y|b) in the "search region".

Example of a "cut-based" study
In the 1990s, the CDF experiment at Fermilab (Chicago) measured the number of hadron jets produced in proton-antiproton collisions as a function of their momentum perpendicular to the beam direction.
[Figure: a "jet" of particles]
Prediction low relative to data for very high transverse momentum.

High pT jets = quark substructure?
Although the data agree remarkably well with the Standard Model (QCD) prediction overall, the excess at high pT appears significant.
The fact that the variable is "understandable" leads directly to a plausible explanation for the discrepancy, namely, that quarks could possess an internal substructure.
This would not have been the case if the variable plotted had been a complicated combination of many inputs.

High pT jets from parton model uncertainty
Furthermore, the physical understanding of the variable led one to a more plausible explanation, namely, an uncertain modelling of the quark (and gluon) momentum distributions inside the proton.
When the model is adjusted, the discrepancy largely disappears.
Can be regarded as a "success" of the cut-based approach. Physical understanding of the output variable led to the solution of an apparent discrepancy.

Neural networks in particle physics
For many years, the only "advanced" classifier used in particle physics.
Usually use single hidden layer, logistic sigmoid activation function.

Neural network example from LEP II
Signal: e+e- → W+W- (often 4 well separated hadron jets)
Background: e+e- → qqgg (4 less well separated hadron jets)
← input variables based on jet structure, event shape, ...; none by itself gives much separation.
[Figure: neural network output] (Garrido, Juste and Martinez, ALEPH 96-144)

Some issues with neural networks
In the example with WW events, goal was to select these events so as to study properties of the W boson.
  Needed to avoid using input variables correlated to the properties we eventually wanted to study (not trivial).
In principle a single hidden layer with a sufficiently large number of nodes can approximate arbitrarily well the optimal test variable (likelihood ratio).
  Usually start with relatively small number of nodes and increase until misclassification rate on validation data sample ceases to decrease.
Usually MC training data is cheap; problems with getting stuck in local minima, overtraining, etc., are less important than concerns of systematic differences between the training data and Nature, and concerns about the ease of interpretation of the output.

Decision trees
Out of all the input variables, find the one for which a single cut gives the best improvement in signal purity:
  P = Σ_signal w_i / (Σ_signal w_i + Σ_background w_i)
where w_i is the weight of the ith event.
Example by MiniBooNE experiment, B. Roe et al., NIM 543 (2005) 577
Resulting nodes classified as either signal/background.
Iterate until stop criterion reached based on e.g. purity or minimum number of events in a node.
The set of cuts defines the decision boundary.

Boosting
The resulting classifier is usually very sensitive to fluctuations in the training data. Stabilize by boosting:
  Create an ensemble of training data sets from the original one by updating the event weights (misclassified events get increased weight).
  Assign a score α_k to the classifier from the kth training set based on its error rate ε_k.
Final classifier is a weighted combination of those from the ensemble of training sets.

Particle i.d. in MiniBooNE
Detector is a 12-m diameter tank of mineral oil exposed to a beam of neutrinos and viewed by 1520 photomultiplier tubes.
H.J. Yang, MiniBooNE PID, DNP06
Search for νμ to νe oscillations required particle i.d. using information from the PMTs.

BDT example from MiniBooNE
~200 input variables for each event (ν interaction producing e, μ or π).
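The boosting recipe described above (reweight the misclassified events, score each classifier by its weighted error rate, combine the classifiers with those scores) can be sketched as follows. This is a minimal illustration using the common discrete AdaBoost update with single-cut trees ("stumps"); the exact score formula and update rule used in the talk or by MiniBooNE are not recoverable from this text, so the choice alpha = 0.5 ln((1 - err)/err), the function names and the toy data are all assumptions.

```python
import math

def fit_stump(X, y, w):
    """Find the single cut (feature j, threshold, sign) with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (+1, -1):
                pred = [sign if row[j] > thr else -sign for row in X]
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best

def stump_predict(stump, row):
    """Return +1 or -1 for one event according to the stump's cut."""
    _, j, thr, sign = stump
    return sign if row[j] > thr else -sign

def adaboost(X, y, n_rounds=10):
    """Discrete AdaBoost (assumed variant); labels y must be +1 (signal) or -1 (background)."""
    n = len(X)
    w = [1.0 / n] * n                      # start with equal event weights
    ensemble = []
    for _ in range(n_rounds):
        stump = fit_stump(X, y, w)
        err = min(max(stump[0], 1e-12), 1.0 - 1e-12)   # guard against log(0)
        alpha = 0.5 * math.log((1.0 - err) / err)       # classifier score
        ensemble.append((alpha, stump))
        # Misclassified events (y * prediction = -1) get increased weight.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, yi, row in zip(w, y, X)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, row):
    """Sign of the score-weighted sum of the individual stump outputs."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

# Hypothetical toy sample: one input variable, signal on both sides of the background.
X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y = [1, 1, -1, -1, 1, 1]
```

No single cut separates this toy sample, but `adaboost(X, y, n_rounds=3)` combines three weak cuts into a classifier that labels all six events correctly, which is the sense in which boosting turns individually weak trees into a stronger one.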
Each individual tree is relatively weak, with a misclassification error rate ~0.4–0.45
B. Roe et al., NIM 543 (2005) 577

Monitoring overtraining
From MiniBooNE example:
Performance stable after a few hundred trees.

Comparison of boosting algorithms
A number of boosting algorithms on the market; differ in the update rule for the weights.

Boosted decision tree comments
Boosted decision trees have become popular in particle physics because they can handle many inputs without degrading; those that provide little/no separation are rarely used as tree splitters and are effectively ignored.
A number of boosting algorithms have been looked at, which differ primarily in the rule for updating the weights (ε-Boost, LogitBoost, ...).
Some studies have looked at other ways of combining weaker classifiers, e.g., Bagging (Bootstrap-Aggregating), which generates the ensemble of classifiers by random sampling with replacement from the full training sample. Not much experience yet with these.

The top quark
Top quark is the heaviest known particle in the Standard Model.
Since mid-1990s has been observed produced in pairs.

Single top quark production
One also expected to find singly produced top quarks; pair-produced tops are now a background process.
Use many inputs based on jet properties, particle i.d., ...
[Figure legend: signal (blue + green)]

Different classifiers for single top
Also Naive Bayes and various approximations to likelihood ratio, ....
Final combined result is statistically significant (>5σ level) but the classifier outputs are not easy to understand.

Support Vector Machines
Map input variables into high dimensional feature space: x → φ
Maximize distance between separating hyperplanes (margin) subject to constraints allowing for some misclassification.
Final classifier only depends on scalar products of φ(x), so only need the kernel.
Bishop ch. 7

Using an SVM
To use an SVM the user must as a minimum choose
  a kernel function (e.g. Gaussian)
  any free parameters in the kernel (e.g. the σ of the Gaussian)
  the cost parameter C (plays role of regularization parameter)
The training is relatively straightforward because, in contrast to neural networks, the function to be minimized has a single global minimum.
Furthermore, evaluating the classifier only requires that one retain and sum over the support vectors, a relatively small number of points.
The advantages/disadvantages and rationale behind the choices above are not always clear to the particle physicist; help needed here.

SVM in particle physics
SVMs are very popular in the Machine Learning community but have yet to find wide application in HEP. Here is an early example from a CDF top quark analysis (A. Vaiciulis, contribution to PHYSTAT02).
[Figure: axis label "signal eff."]

Summary, conclusions, etc.
Particle physics has used several multivariate methods for many years:
  linear (Fisher) discriminant
  neural networks
  naive Bayes
and has in the last several years started to use a few more
  k-nearest neighbour
  boosted decision trees
  support vector machines
The emphasis is often on controlling systematic uncertainties between the modeled training data and Nature to avoid false discovery.
Although many classifier outputs are "black boxes", a discovery at 5σ significance with a sophisticated (opaque) method will win the competition if backed up by, say, 4σ evidence from a cut-based method.

Quotes I like
"Alles sollte so einfach wie möglich sein, aber nicht einfacher." (Everything should be as simple as possible, but not simpler.) – A. Einstein
"If you believe in som