




MOOC: Data Mining for Transportation (交通数据挖掘技术), Southeast University, China University MOOC answers

Test 1

1. Question: Which one is not a description of data mining?
Options:
A. Extraction of interesting patterns or knowledge
B. Exploration and analysis by automatic or semi-automatic means
C. Discovering meaningful patterns from large quantities of data
D. Appropriate statistical analysis methods to analyze the data collected
Correct answer: Appropriate statistical analysis methods to analyze the data collected

2. Question: Which one describes the right process of knowledge discovery?
Options:
A. Selection - Preprocessing - Transformation - Data mining - Interpretation/Evaluation
B. Preprocessing - Transformation - Data mining - Selection - Interpretation/Evaluation
C. Data mining - Selection - Interpretation/Evaluation - Preprocessing - Transformation
D. Transformation - Data mining - Selection - Preprocessing - Interpretation/Evaluation
Correct answer: Selection - Preprocessing - Transformation - Data mining - Interpretation/Evaluation

3. Question: Which one does not belong to the process of KDD?
Options:
A. Data mining
B. Data description
C. Data cleaning
D. Data selection
Correct answer: Data description

4. Question: Which one is not a valid alternative name for data mining?
Options:
A. Knowledge extraction
B. Data archeology
C. Data dredging
D. Data harvesting
Correct answer: Data harvesting

5. Question: Which one is not a nominal variable?
Options:
A. Occupation
B. Education
C. Age
D. Color
Correct answer: Age

6. Question: Which one is wrong about classification and regression?
Options:
A. Regression analysis is a statistical methodology that is most often used for numeric prediction.
B. We can construct classification models (functions) without some training examples.
C. Classification predicts categorical (discrete, unordered) labels.
D. Regression models predict continuous-valued functions.
Correct answer: We can construct classification models (functions) without some training examples.

7. Question: Which one is wrong about clustering and outliers?
Options:
A. Clustering belongs to supervised learning.
B. Principles of clustering include maximizing intra-class similarity and minimizing interclass similarity.
C. Outlier analysis can be useful in fraud detection and rare events analysis.
D. Outlier means a data object that does not comply with the general behavior of the data.
Correct answer: Clustering belongs to supervised learning.

8. Question: About data processing, which one is wrong?
Options:
A. When making data discrimination, we compare the target class with one or a set of comparative classes (the contrasting classes).
B. When making data classification, we predict categorical labels excluding unordered ones.
C. When making data characterization, we summarize the data of the class under study (the target class) in general terms.
D. When making data clustering, we would group data to form new categories.
Correct answer: When making data classification, we predict categorical labels excluding unordered ones.

9. Question: Outlier mining, such as the density-based method, belongs to supervised learning.
Options: A. True; B. False
Correct answer: False

10. Question: Support vector machines can be used for classification and regression.
Options: A. True; B. False
Correct answer: True

Test 2

1. Question: Which is not a reason we need to preprocess the data?
Options:
A. To save time
B. To make the result meet our hypothesis
C. To avoid unreliable output
D. To eliminate noise
Correct answer: To make the result meet our hypothesis

2. Question: Which is not one of the major tasks in data preprocessing?
Options:
A. Clean
B. Integration
C. Transition
D. Reduction
Correct answer: Transition

3. Question: How is a new feature space constructed by PCA?
Options:
A. New feature space by PCA is constructed by choosing the most important features you think.
B. New feature space by PCA is constructed by normalizing input data.
C. New feature space by PCA is constructed by selecting features randomly.
D. New feature space by PCA is constructed by eliminating the weak components to reduce the size of the data.
Correct answer: New feature space by PCA is constructed by eliminating the weak components to reduce the size of the data.

4. Question: Which one is wrong about methods for discretization?
Options:
A. Histogram analysis and binning are both unsupervised methods.
B. Clustering analysis only belongs to top-down split.
C. Interval merging by χ² analysis can be applied recursively.
D. Decision-tree analysis is entropy-based discretization.
Correct answer: Clustering analysis only belongs to top-down split.

5. Question: Which one is wrong about equal-width (distance) partitioning and equal-depth (frequency) partitioning?
Options:
A. Equal-width partitioning is the most straightforward, but outliers may dominate presentation.
B. Equal-depth partitioning divides the range into N intervals, each containing approximately the same number of samples.
C. The interval of the former one is not equal.
D. The number of tuples is the same when using the latter one.
Correct answer: The interval of the former one is not equal.

6. Question: Which one is a wrong way to normalize data?
Options:
A. Min-max normalization
B. Simple scaling
C. Z-score normalization
D. Normalization by decimal scaling
Correct answer: Simple scaling

7. Question: Which are the right ways to fill in missing values?
Options:
A. Smart mean
B. Probable value
C. Ignore
D. Falsify
Correct answers:
- Smart mean
- Probable value
- Ignore

8. Question: Which are the right ways to handle noisy data?
Options:
A. Regression
B. Cluster
C. WT
D. Manual
Correct answers:
- Regression
- Cluster
- WT
- Manual

9. Question: Which one is right about wavelet transforms?
Options:
A. Wavelet transforms store large fractions of the strongest of the wavelet coefficients.
B. The DWT decomposes each segment of time series via the successive use of low-pass and high-pass filtering at appropriate levels.
C. Wavelet transforms can be used for reducing data and smoothing data.
D. Wavelet transforms means applying to pairs of data, resulting in two sets of data of the same length.
Correct answers:
- The DWT decomposes each segment of time series via the successive use of low-pass and high-pass filtering at appropriate levels.
- Wavelet transforms can be used for reducing data and smoothing data.

10. Question: Which are the commonly used ways of sampling?
Options:
A. Simple random sample without replacement
B. Simple random sample with replacement
C. Stratified sample
D. Cluster sample
Correct answers:
- Simple random sample without replacement
- Simple random sample with replacement
- Stratified sample
- Cluster sample

11. Question: Discretization means dividing the range of a continuous attribute into intervals.
Options: A. True; B. False
Correct answer: True

Test 3

1. Question: What is the difference between an eager learner and a lazy learner?
Options:
A. Eager learners would generate a model for classification while lazy learners would not.
B. Eager learners classify the tuple based on its similarity to the stored training tuples while lazy learners do not.
C. Eager learners simply store data (or do only a little minor processing) while lazy learners do not.
D. Lazy learners would generate a model for classification while eager learners would not.
Correct answer: Eager learners would generate a model for classification while lazy learners would not.

2. Question: How to choose the optimal value for K?
Options:
A. Cross-validation can be used to determine a good value by using an independent dataset to validate the K values.
B. Low values for K (like k=1 or k=2) can be noisy and subject to the effect of outliers.
C. A large k value can reduce the overall noise, so the value for k can be as big as possible.
D. Historically, the optimal K for most datasets has been between 3 and 10.
Correct answers:
- Cross-validation can be used to determine a good value by using an independent dataset to validate the K values.
- Low values for K (like k=1 or k=2) can be noisy and subject to the effect of outliers.
- Historically, the optimal K for most datasets has been between 3 and 10.

3. Question: What are the major components in KNN?
Options:
A. How to measure similarity?
B. How to choose k?
C. How are class labels assigned?
D. How to decide the distance?
Correct answers:
- How to measure similarity?
- How to choose k?
- How are class labels assigned?

4. Question: Which of the following ways can be used to obtain attribute weights for Attribute-Weighted KNN?
Options:
A. Prior knowledge/experience.
B. PCA, FA (factor analysis method).
C. Information gain.
D. Gradient descent, simplex methods and genetic algorithm.
Correct answers:
- Prior knowledge/experience.
- PCA, FA (factor analysis method).
- Information gain.
- Gradient descent, simplex methods and genetic algorithm.

5. Question: At the learning stage, KNN would find the K closest neighbors and then classify by the labels of the K identified nearest neighbors.
Options: A. True; B. False
Correct answer: False

6. Question: At the classification stage, KNN would store all instances or some typical ones among them.
Options: A. True; B. False
Correct answer: False

7. Question: Normalizing the data can solve the problem that different attributes have different value ranges.
Options: A. True; B. False
Correct answer: True

8. Question: By Euclidean distance or Manhattan distance, we can calculate the distance between two instances.
Options: A. True; B. False
Correct answer: True

9. Question: Data normalization before measuring distance can avoid errors caused by different dimensions, self-variations, or large numerical differences.
Options: A. True; B. False
Correct answer: True

10. Question: The way to obtain the regression for a new instance from the k nearest neighbors is to calculate the average value of the k neighbors.
Options: A. True; B. False
Correct answer: True

11. Question: The way to obtain the classification for a new instance from the k nearest neighbors is to calculate the majority class of the k neighbors.
Options: A. True; B. False
Correct answer: True

12. Question: The way to obtain the instance weight for Distance-Weighted KNN is to calculate the reciprocal of the squared distance between the object and its neighbors.
Options: A. True; B. False
Correct answer: True

Test 4

1. Question: Which description is right about nodes in a decision tree?
Options:
A. Internal nodes test the value of particular features
B. Leaf nodes specify the class
C. Branch nodes decide the result
D. Root nodes decide the start point
Correct answers:
- Internal nodes test the value of particular features
- Leaf nodes specify the class

2. Question: Computing information gain for a continuous-valued attribute when using ID3 consists of the following procedure:
Options:
A. Sort the values of A in increasing order.
B. Consider the midpoint between each pair of adjacent values as a possible split point.
C. Select the minimum expected information requirement as the split point.
D. Split.
Correct answers:
- Sort the values of A in increasing order.
- Consider the midpoint between each pair of adjacent values as a possible split point.
- Select the minimum expected information requirement as the split point.
- Split.

3. Question: Which are the typical algorithms to generate trees?
Options:
A. ID3
B. C4.5
C. CART
D. PCA
Correct answers:
- ID3
- C4.5
- CART

4. Question: Which one is right about underfitting and overfitting?
Options:
A. Underfitting means poor accuracy both for training data and unseen samples.
B. Overfitting means high accuracy for training data but poor accuracy for unseen samples.
C. Underfitting implies the model is so simple that we need to increase the model complexity.
D. Overfitting occurs with so many branches that we need to decrease the model complexity.
Correct answers:
- Underfitting means poor accuracy both for training data and unseen samples.
- Overfitting means high accuracy for training data but poor accuracy for unseen samples.
- Underfitting implies the model is so simple that we need to increase the model complexity.
- Overfitting occurs with so many branches that we need to decrease the model complexity.

5. Question: Which one is right about pre-pruning and post-pruning?
Options:
A. Both of them are methods to deal with the overfitting problem.
B. Pre-pruning does not split a node if this would result in the goodness measure falling below a threshold.
C. Post-pruning removes branches from a “fully grown” tree.
D. There is no need to choose an appropriate threshold when pre-pruning.
Correct answers:
- Both of them are methods to deal with the overfitting problem.
- Pre-pruning does not split a node if this would result in the goodness measure falling below a threshold.
- Post-pruning removes branches from a “fully grown” tree.

6. Question: Post-pruning in CART consists of the following procedure:
Options:
A. First, consider the cost complexity of a tree.
B. Then, for each internal node, N, compute the cost complexity of the subtree at N.
C. And also compute the cost complexity of the subtree at N if it were to be pruned.
D. At last, compare the two values. If pruning the subtree at node N would result in a smaller cost complexity, the subtree is pruned. Otherwise, the subtree is kept.
Correct answers:
- First, consider the cost complexity of a tree.
- Then, for each internal node, N, compute the cost complexity of the subtree at N.
- And also compute the cost complexity of the subtree at N if it were to be pruned.
- At last, compare the two values. If pruning the subtree at node N would result in a smaller cost complexity, the subtree is pruned. Otherwise, the subtree is kept.

7. Question: The cost complexity pruning algorithm used in CART evaluates cost complexity by the number of leaves in the tree and the error rate.
Options: A. True; B. False
Correct answer: True

8. Question: Gain ratio is used as the attribute selection measure in C4.5, and the formula is GainRatio(A) = Gain(A) / SplitInfo(A).
Options: A. True; B. False
Correct answer: True

9. Question: A rule is created for each path from the root to a leaf node.
Options: A. True; B. False
Correct answer: True

10. Question: ID3 uses information gain as its attribute selection measure, and the attribute with the lowest information gain is chosen as the splitting attribute for node N.
Options: A. True; B. False
Correct answer: False

Test 5

1. Question: What are the features of SVM?
Options:
A. Extremely slow, but highly accurate.
B. Much less prone to overfitting than other methods.
C. Black box model.
D. Provide a compact description of the learned model.
Correct answers:
- Extremely slow, but highly accurate.
- Much less prone to overfitting than other methods.
- Provide a compact description of the learned model.

2. Question: Which are the typical common kernels?
Options:
A. Linear
B. Polynomial
C. Radial basis function (Gaussian kernel)
D. Sigmoid kernel
Correct answers:
- Linear
- Polynomial
- Radial basis function (Gaussian kernel)
- Sigmoid kernel

3. Question: What adaptations can be made to allow SVM to deal with multiclass classification problems?
Options:
A. One versus rest (OVR).
B. One versus one (OVO).
C. Error correcting input codes (ECIC).
D. Error correcting output codes (ECOC).
Correct answers:
- One versus rest (OVR).
- One versus one (OVO).
- Error correcting output codes (ECOC).

4. Question: What are the problems of OVR?
Options:
A. Sensitive to the accuracy of the confidence figures produced by the classifiers.
B. The scale of the confidence values may differ between the binary classifiers.
C. The binary classification learners see unbalanced distributions.
D. Only when the class distribution is balanced can balanced distributions be attained.
Correct answers:
- Sensitive to the accuracy of the confidence figures produced by the classifiers.
- The scale of the confidence values may differ between the binary classifiers.
- The binary classification learners see unbalanced distributions.

5. Question: Which one is right about the advantages of SVM?
Options:
A. They are accurate in high-dimensional spaces.
B. They are memory efficient.
C. The algorithm is not prone to overfitting compared to other classification methods.
D. The support vectors are the essential or critical training tuples.
Correct answers:
- They are accurate in high-dimensional spaces.
- They are memory efficient.
- The algorithm is not prone to overfitting compared to other classification methods.
- The support vectors are the essential or critical training tuples.

6. Question: The kernel trick is used to avoid costly computation and deal with mapping problems.
Options: A. True; B. False
Correct answer: True

7. Question: There is no structured way and there are no golden rules for setting the parameters in SVM.
Options: A. True; B. False
Correct answer: True

8. Question: Error correcting output codes (ECOC) is a kind of problem transformation technique.
Options: A. True; B. False
Correct answer: False

9. Question: Regression formulas include three types: linear, nonlinear and general form.
Options: A. True; B. False
Correct answer: True

10. Question: If you have a big dataset, SVM is suitable for efficient computation.
Options: A. True; B. False
Correct answer: False

Test 6

1. Question: Which descriptions are right about outliers?
Options:
A. Outliers caused by measurement error
B. Outliers reflecting ground truth
C. Outliers caused by equipment failure
D. Outliers always need to be dropped
Correct answers:
- Outliers caused by measurement error
- Outliers reflecting ground truth
- Outliers caused by equipment failure

2. Question: What are application cases of outlier mining?
Options:
A. Traffic incident detection
B. Credit card fraud detection
C. Network intrusion detection
D. Medical analysis
Correct answers:
- Traffic incident detection
- Credit card fraud detection
- Network intrusion detection
- Medical analysis

3. Question: Which are methods to detect outliers?
Options:
A. Statistics-based approach
B. Distance-based approach
C. Bulk-based approach
D. Density-based approach
Correct answers:
- Statistics-based approach
- Distance-based approach
- Density-based approach

4. Question: How to pick the right k by a heuristic method for the density-based outlier mining method?
Options:
A. K should be at least 10 to remove unwanted statistical fluctuations.
B. Picking 10 to 20 appears to work well in general.
C. Pick the upper bound value for k as the maximum number of “close by” objects that can potentially be global outliers.
D. Pick the upper bound value for k as the maximum number of “close by” objects that can potentially be local outliers.
Correct answers:
- K should be at least 10 to remove unwanted statistical fluctuations.
- Picking 10 to 20 appears to work well in general.
- Pick the upper bound value for k as the maximum number of “close by” objects that can potentially be local outliers.

5. Question: Which one is right about the three methods of outlier mining?
Options:
A. The statistics-based approach is simple and fast but has difficulty dealing with periodic data and categorical data.
B. The efficiency of the distance-based approach is low for large datasets in high-dimensional space.
C. The distance-based approach cannot be used on multidimensional datasets.
D. The density-based approach spends low cost on searching the neighborhood.
Correct answers:
- The statistics-based approach is simple and fast but has difficulty dealing with periodic data and categorical data.
- The efficiency of the distance-based approach is low for large datasets in high-dimensional space.

6. Question: Distance-based outlier mining is not suitable for a dataset that does not fit any standard distribution model.
Options: A. True; B. False
Correct answer: False

7. Question: The statistics-based method requires knowing the distribution of the data and the distribution parameters in advance.
Options: A. True; B. False
Correct answer: True

8. Question: When identifying outliers with a discordancy test, the data point is considered an outlier if it falls within the confidence interval.
Options: A. True; B. False
Correct answer: False

9. Question: Mahalanobis distance accounts for the relative dispersions and inherent correlations among vector elements, which is different from Euclidean distance.
Options: A. True; B. False
Correct answer: True

10. Question: An outlier is a data object that deviates significantly from the rest of the objects, as if it were generated by a different mechanism.
Options: A. True; B. False
Correct answer: True

Test 7

1. Question: How to deal with imbalanced data in 2-class classification?
Options:
A. Oversampling
B. Undersampling
C. Threshold-moving
D. Ensemble techniques
Correct answers:
- Oversampling
- Undersampling
- Threshold-moving
- Ensemble techniques

2. Question: Which one is right when dealing with the class-imbalance problem?
Options:
A. Oversampling works by decreasing the number of minority positive tuples.
B. Undersampling works by increasing the number of majority negative tuples.
C. The SMOTE algorithm adds synthetic tuples that are close to the minority tuples in tuple space.
D. Threshold-moving and ensemble methods were empirically observed to outperform oversampling and undersampling.
Correct answers:
- The SMOTE algorithm adds synthetic tuples that are close to the minority tuples in tuple space.
- Threshold-moving and ensemble methods were empirically observed to outperform oversampling and undersampling.

3. Question: Which steps are necessary when constructing an ensemble model?
Options:
A. Creating multiple datasets
B. Constructing a set of classifiers from the training data
C. Combining predictions made by multiple classifiers to obtain the final class label
D. Finding the best performing predictions to obtain the final class label
Correct answers:
- Creating multiple datasets
- Constructing a set of classifiers from the training data
- Combining predictions made by multiple classifiers to obtain final class
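
Test 7, question 2 says that SMOTE adds synthetic tuples close to the minority tuples in tuple space. The following is a minimal sketch of that interpolation idea only, not the reference SMOTE implementation and not taken from the course material. The function name `smote_like`, its parameters, and the sample points are assumptions made for illustration.

```python
import math
import random


def smote_like(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points, each interpolated between a minority
    point and one of its k nearest minority neighbours (hypothetical helper)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by Euclidean distance to the base point.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Illustrative minority-class tuples (made-up values).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_like(minority, n_new=6)
print(len(new_points))
```

Because every synthetic tuple lies on a segment between two minority tuples, it stays inside the region the minority class already occupies, which is what "close to the minority tuples" means here.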
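
Test 3, question 12 describes Distance-Weighted KNN as weighting each neighbor by the reciprocal of the squared distance. A small sketch of that rule is given below, assuming Euclidean distance and a plain list of `(point, label)` pairs. The function name `weighted_knn_predict`, the labels, and the data are hypothetical, and the exact-match shortcut is an added assumption to avoid dividing by zero.

```python
import math
from collections import defaultdict


def weighted_knn_predict(train, query, k=3):
    """Classify query by summing 1/d^2 votes from its k nearest training points."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = defaultdict(float)
    for point, label in nearest:
        d = math.dist(point, query)
        if d == 0.0:
            return label  # exact match: return its label instead of dividing by zero
        votes[label] += 1.0 / (d * d)
    return max(votes, key=votes.get)


# Made-up training tuples with illustrative traffic-state labels.
train = [
    ((0.0, 0.0), "free-flow"),
    ((0.0, 1.0), "free-flow"),
    ((3.0, 3.0), "congested"),
    ((1.0, 1.0), "congested"),
]
print(weighted_knn_predict(train, (0.9, 0.9), k=3))  # congested
```

With k=3 the three nearest neighbors are one "congested" and two "free-flow" points, so an unweighted majority vote (question 11) would answer "free-flow". The 1/d² weights give the very close "congested" point a vote of about 50 against roughly 1.8 for the other two combined.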