版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领
文档简介
IncrementalLearningSystemforFalseAlarmReductionPreprocess–FeatureConstruction12AssetInformationRecordtheinternaldeviceinformation,including…IPaddressOperationsystemDevicetypeRouterComputerServer…..Foranalyststorecordtheirownassets3SnapShot-AssetInformation4FeatureExtractorConstructthefeaturesbyextractingtheattributesofassetinformationandalertsinformationgeneratedbyIDSWechoice12featuresand1labelfeature6alertinformationSignaturename(1)SourceanddestinationIPaddress(2)Sourceanddestinationport(2)Protocoltype(1)6assetinformationIntranetorexternalnetworkincorrespondingsourceordestinationIP(2)OperationsystemincorrespondingsourceordestinationIPifitcanbeknown(2)OperationSystemincorrespondingsourceIPifitcanbeknown(2)5FeaturesAnalysisinDARPA2019alertsUsinginformationgaintoranktheabilityofdiscriminatingRankedattributes:0.790921sig_name0.787112ip_src0.773145layer4_sport0.736layer4_dport0.576783ip_dst0.443654ip_proto0.3036911dst_os0.019348src_os0.019229src_devtype0.019097src_intranet0.0053812dst_devtype0.0053810dst_intranetSelectedattributes:1,2,5,6,3,4,11,8,9,7,12,10:126SummaryClearly,wecanfindsourceIPhasagoodscore.Itrepresentsthatblacklistmaygiveagoodeffortfordiscrimination.ThedestinationOShasagoodscoreinalloftheextraassetinformation.ThatmightrepresentsattackerusuallyfocusonspecificOStoinjectspecificattackiftheyknowthetargethostinformation.Statisticmodelshouldhavenotbadresultaccordingtosomefeatureshavehighcapabilityofdiscrimination7FutureworkActivescantheinternaldeviceNessusOutgoingtrafficanalysisExtendmorefeatures8IncrementalLearningEngineIncrementalstrategyforconqueringconceptdrift910PreliminaryIncrementalEnsembleAlgorithmIncrementalLearning(Inspiredby[6,7,8])IncrementaltuningtheweightofthemodelsincommitteeAdvantagesContinuouslearning[11]Withoutseeingoldexamplesforre-trainingFacetheconceptdriftproblemBoostingweaklearnerbyre-weightingtheweightofexamples11PreliminaryIncrementalEnsembleAlgorithm(Cont.)StrategiesLearningstrategyCreatethenewlearnerforthenewincomingchunkdata“失敗為成功之母“,”以古鑑今”:LearningfromthepreviousmistakeofpredictionineachroundForgettingstrategyCommitteewillgetbiggerovertime,wehavetoforgetthehelplessinformation“績效評等”:Counttheerrorpredictiontimes(accuracy<1/2)“老化淘汰”:agingforgraduallyforgettingValidationteststrategy“物競天擇“:Optimaselection,leavethecommitteewiththebestperformanceinvalidationsetCommitteeDecision“投票表決”:Majorityvotingforreducingvarianceandbias12NotationWegoteachchunkdataXTthroughtimeT=t1,t2,…,tk,…Xt1={(x11,w11,z11)1,(x12,w12,z12)2,…,(x1n,w1n,z1n)n,….,(x1N,w1N,z1N)N}Xt2={(x21,w21,z21)1,(x22,w22,z22)2,…,(x2n,w2n,z2n)n,….,(x2N,w2N,z2N)N}…Xtk={(xk1,wk1,zk1)1,(xk2,wk2,zk2)2,…,(xkn,wkn,zkn)n,….,(xkN,wkN,zkN)N}XT={(xtn,wtn,ztn)n=1..N}Wheren=1~Ndenotetheindexoftrainingexamples,inchunksizeNXtn:inputvectorwherenrepresentthen-thexampleint-thchunk.Wtn denoteXtn’scorrespondingweightZtn denoteXtn’scorrespondingclass13InitialExampleWeightinEachChunkWnbeset1/Nindefault(C-1=1andC+1=1)IftherearedifferentcostindifferentclassEx:RelevantalertsalmosttakethelowproportionofwholealertsThen,wemightgivedifferentweightinrelevantexamplessuchlikeC-1=1andC+1=5014SigmoidAgingFunctionforForgettingInspiredbyagingmethodandsigmoidfunctionφ(v,a,b):ProposedSigmoidAgingismodifiedbysigmoidfunctioninto0~1V={vt1,vt2,…,vtk,..}:Countthetimesoferrorprediction(accuracy<1/2)beforeeachroundlearningforeachlearnervt’sinitialvalueis0,aistheslopeofsigmoidfunctionbisthemoverightparameterwithdifferentinitialpoint15SigmoidFunctionOriginalSigmoidfunctionEx:a=116SigmoidAging-ForgettingCurveProposedSigmoidAgingFunction.Ex:a=2,b=417Step1:BegintoLearnSettheinitialweightinthechunkandtrainthelearnery1ThenwegotthecommitteeYTandthevotingweightα1ofy1{w1n}n=1~Ny1(xt1)Committeet1Xt1={..}18Step2:GotNewChunkDataNewincomingchunkXt2={(X2n,W2n,z2n)}n=1~N
y1(xt1)Xt2={..}19Step3:CounttheTimesofErrorPredictionNewincomingchunkXt2={(X2n,W2n,z2n)}n=1~N
y1(xt2)Xt2
={..}Ifaccuracy<½thenvt=vt+120Step4:CalculatetheExampleWeightCalculatetheexampleweightbypreviousmistakesSettheinitialweightW2nandpasstheoldcommitteeYt1togetnewweightW’2n{w2n}n=1~Ny1(xt2)Xt2={..}{w’2n}n=1~N21Step5:TraintheNewLearnerTrainthelearnery2bytheXt2={(X2n,W’2n,z2n)}n=1~N
ThenwegotthecommitteeYTnewandthevotingweightα2ofy2CommitteeYT{w’2n}n=1~Ny2(xt2)NewCommitteeYTnew22Usingvalidationsetrecordedbytherecentdatainbuffer.OptimaselectionleavethebestperformancecommitteeasthewinnerfromYTandYTnewStep6:ValidationTestOriginalCommittee:NewCommittee:23Step7:RepeatthisProcesstillterminatedRepeatthisprocesswhenwegotthenewchunk(gotoStep2)tillthisprocessbeterminated24CommitteeDecisionFunctionDecisionfunctionwithagingWhenanewalertinstanceisgenerated,thepredictionclasswillbedeterminedbythecommittee25Experiment(1)MotivationHowmanysizeoftrainingdataisenoughfortrainingaclassification?And,howlongwillitbecomeuselessorunreliable?ExperimentdesignDARPA2019alertdata26DifferentTrainingSizewithDifferentModels27SummaryInthisexperiment,wecangetMostmodelswillgetwellaccuratewhenithasmorethan30%ofwholedataasthetrainingdataThatmeansitmightdowellforpredictinglike3timessizedataoftrainingdataSo,wemightarrangethenextexperimenttodemonstratewhentimeisthebettertimetoinvokethenewlylearningprocess28Experiment(2)MotivationUnderstandtheeffectofconceptdriftproblemExperimentdesignCalculateaccuracyofeachchuckdatabythecommitteeatthattime29AccuracyComparisoninEachChunkDecisionStumpEmploythelastmodelforpredictioncurrentchunkdata30AccuracyComparisoninEachChunk
DecisionStumpComparewithholdingpreviousentiremodel(s)topredict31AccuracyComparisoninEachChunk
DecisionStumpIncrementallearningwithoutvalidation32AccuracyComparisoninEachChunk
DecisionStumpTakeadvantageofpruningvaluelessnewlearnedmodel33AccuracyComparisoninEachChunk
DecisionStump34ExperimentalResult
DecisionStumpModelTypeChunkSize2200ChunkAve.Accuracy(%)Accuracy(%)LeaveRecent1Model87.1972(Std.0.2826)87.6192EntireModels35.4277(Std.0.4325)65.7924ILwithoutValidationStrategy(a,b)=(2,3)75.5721(Std.0.3766)88.0392ILStrategies(a,b)=(2,3)88.4456(Std.0.2556)92.0935SummaryClearly,ouralgorithmhavethecharacteristicsofreducingvarianceandbias.Itcanreducetheeffectofconceptdriftwhenithappened.Therearefourgapsinthediagram,whichreferredtotheconceptchangeInthefirstgap,ouralgorithmbalancebetweenrecentoneandentiremodelsInthesecondone,ourvalidationstrategyissuccessfultojumpthegapbypruninguselessnewlearnedmodelInthethirdandfourthgaps,algorithmswithandwithoutvalidationstrategyhavedifferencesabilitiesinreducebiasandvarianceingoodway.Inthelittletailpart,thebalancemodelcantakeadvantageoftheentiremodels’accuracy.Therefore,wecangetbetterperformance.36AccuracyComparisoninEachChunk
C4.5Employthelastmodelforpredictioncurrentchunkdata37AccuracyComparisoninEachChunk
C4.5Comparewithholdingpreviousentiremodel(s)topredict38AccuracyComparisoninEachChunk
C4.5Incrementallearningwithoutvalidation39AccuracyComparisoninEachChunk
C4.5Takeadvantageofpruningvaluelessnewlearnedmodel40AccuracyComparisoninEachChunk
C4.541AccuracyComparisoninEachChunk
C4.542AccuracyComparisoninEachChunk
C4.5Comparewiththebatchmodelre-trainedwithpreviousentiredata43ExperimentalResult
C4.5ModelTypeChunkSize2200ChunkAve.Accuracy(%)Accuracy(%)LeaveRecent1Model82.3623(Std.0.3347)76.3236EntireModels50.7340(Std.0.4322)72.4316ILwithoutValidationStrategy(a,b)=(2,3)80.0802(Std.0.3321)81.1242ILStrategies(a,b)=(2,3)88.1399(Std.0.2557)93.6834Re-train88.01(Std.0.2337)93.9944Experiment(3)Motivation:ComparetheperformanceofbatchtrainingwiththeperformanceofincrementaltrainingExperimentdesignBatchtraining:weusetheentirelypreviousdataasthetrainingdatatore-trainthemodel.Incrementaltraining:Employourproposedalgorithm.45AccuracyComparisoninEachChunk
DecisionStump46AccuracyComparisoninEachChunk
DecisionStumpCompareourmodelwithbathlearningmodel(Re-trainbyentirepreviousdata)47AccuracyComparisoninEachChunk
DecisionStumpCompareourmodelwithbathlearningmodel(Re-trainbyentirepreviousdata)48ExperimentalResult
DecisionStumpModelTypeChunkSize2200ChunkAve.Accuracy(%)Accuracy(%)ILStrategies(a,b)=(2,3)88.4456(Std.0.2556)92.09Re-train93.9735(Std.0.1848)92.03Re-trainwithAdaboost93.7417(Std.0.1960)94.2249ExperimentalResult
NaiveBayesModelTypeChunkSize2200ChunkAve.Accuracy(%)Accuracy(%)ILStrategies(a,b)=(2,3)84.4670(Std.0.3051)92.3855Re-train90.5612(Std.0.2355)94.75Re-trainwithAdaboost88.2039(Std.0.2561)93.3250SummaryAlthoughcomparingourmodelwiththemodelsre-trainedbyseeingpreviousentiredatacouldgetlessperformance.Ourperformance,however,isveryclosewiththeothers.Thepointisourmodelwithoutseeentiredata,thatcouldsavemanyresourcesespeciallyintrainingphase.51Experiment(4)MotivationForthemethodologiesofcombiningmodels,holdingalltrainedmodelsorjustholdingrecentmodel,whichoneisthebetterstrategiesExperimentdesignHoldentiremodelsincommitteeforpredictionHoldconstantsizemodelsinrecentincommitteeforpredictionHolddynamicsize,ourproposedalgorithm,modelsincommittee.Withdifferencebasemodeltoobservetheresult52PerformanceCompare
DecisionStumpModelTypeAccuracy#ofActiveModel#ofEntireModelsLeaveRecent187.6192125LeaveRecent386.0509325EntireModels65.79242525ILwithoutValidation(2,1)84.1184924ILwithoutValidation(2,3)88.03921023ILS(2,1)91.2516410ILS(2,3)92.09410Re-Train92.031153SummaryThebestresultisourentireILstrategies.Especially,thewholeprocesswejustuselessthanhalfofthenumberofEntireModels.Fastanddecreasethememoryexhausted.54Question&Answer551.WhatarethedisadvantageofrulebaseSystem?DisadvantageofrulebasesystemRoughtodetectnoveltypedataDifficulttomaintainOverlapHardforanalysttorepresenttheirknowledgeinruleThroughput562.IsTrainingDatatheBatchDataorStreamData?BatchdataNeedanalystresponsehisfeedbackItcouldbebatchdataButthesizeofbatchchunkshouldbedifferentineachresponseNewAlertsraisedbyIDSAnalystcheckwhetheritisTPorFP(Givefeedback)Submitthechunkdataforincrementallearning573.Whyusingboostingapproachastheincrementalkernelapproach?ReducevarianceandbiasInfrequentlychangeableenvironment,thelearningmodelshouldbelearningfastandasfarasreducememoryexhausted.Weaklearnershouldbeboosted.Evidently,theresultshowoutthiskindofapproachcanreducetheeffectofconceptdriftespeciallyinthechangepoint.584.IsDARPA2019datasetincludingtheconceptdriftorconceptchange?Conceptdrift(Contextchange)infourthandfifthweeks’dataAccordingtotheproposalofthedataset,thenovelattackswereinvokedduringthefourthandfifthweeks.ConceptchangeIfweobservethedatathroughtime,theyalsohavenewdifferentbehaviorchangedsuchliketheattackfree(1,3weeks)andthenovelattacks(4,5weeks)Butifi
温馨提示
- 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
- 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
- 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
- 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
- 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
- 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
- 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。
最新文档
- 配电工程建设施工合同范例
- 英文软件服务合同范例
- 合肥代理记账合同范例
- 公司主管合同范例
- 车位包干合同范例
- 2024年度品牌合作与商标使用补充合同3篇
- 水果投资协议合同模板
- 设备外贸合同范例
- 车展外包服务合同范例
- 购买厂房合同范例
- 万能中国地图模板(可修改)课件
- 2023年安全三类人员B类考试模拟试题及参考答案
- DB11 2007-2022城镇污水处理厂大气污染物排放标准
- YY/T 1698-2020人类体外辅助生殖技术用医疗器械辅助生殖穿刺取卵针
- GB/T 3274-2007碳素结构钢和低合金结构钢热轧厚钢板和钢带
- 【课件】7.2技术作品(产品)说明书及其编写课件-2021-2022学年高中通用技术苏教版必修《技术与设计1》
- (完整)医院收费员考试题题库及参考答案(通用版)
- 校本研修教研工作总结汇报课件
- 大孔吸附树脂技术课件
- 建筑电气施工图(1)课件
- 质量管理体系运行奖惩考核办法课案
评论
0/150
提交评论