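Welcome

Microsoft SQL Data Mining / Data Warehouse
Technical Seminar

Today's Agenda
- Overview of Microsoft SQL Data Mining Technology: 左洪, Microsoft
- Data Warehousing in Telecom: 贝志城, 明天高科
- Data Mining in CRM: 王立军, 中圣公司
- 灵通 ITService Maintenance and Management Service System: 邹雄文, 广州灵通

Introduction to Data Mining with SQL Server 2000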

左洪, Senior Product Marketing Manager
Microsoft (China) Co., Ltd.

Agenda
- What is Data Mining
- The Data Mining Market
- OLE DB for Data Mining
- Overview of the Data Mining Features in SQL Server 2000
- Q&A

What Is Data Mining?

What is DM?
- A process of data exploration and analysis using automatic or semi-automatic means
  - “Exploring data”: scanning samples of known facts about “cases”
  - “Knowledge”: clusters, rules, decision trees, equations, association rules…
- Once the “knowledge” is extracted, it:
  - Can be browsed
    - Provides very useful insight into the behavior of the cases
  - Can be used to predict values of other cases
  - Can serve as a key element in closed-loop analysis

What drives high school students to attend college?

The deciding factors for high school students to attend college are…

  All Students: Attend College 55% Yes, 45% No
    IQ?
      IQ = High: 79% Yes, 11% No
        Wealth?
          Wealth = True: 94% Yes, 6% No
          Wealth = False: 69% Yes, 21% No
      IQ = Low: 45% Yes, 55% No
        Parents Encourage?
          Parents Encourage = Yes: 70% Yes, 30% No
          Parents Encourage = No: 31% Yes, 69% No

Business Oriented DM Problems
- Targeted ads: “What banner should I display to this visitor?”
- Cross sells: “What other products is this customer likely to buy?”
- Fraud detection: “Is this insurance claim a fraud?”
- Churn analysis: “Which customers are likely to churn?”
- Risk management: “Should I approve the loan to this customer?”
- …

Mining Process - Illustrated
[Diagram: Training Data -> DM Engine -> Mining Model; Data To Predict + Mining Model -> DM Engine -> Predicted Data]

The Data Mining Market

The $$$: Y2000 Market Size
- DM Tools Market: $250M*
  - 40% license fees
  - 60% consulting
* Gartner

The Players
- Leading vendors
  - SAS
  - SPSS
  - IBM
- Hundreds of smaller vendors offering DM algorithms…
- Oracle – Thinking Machines acquisition

The Products
- End-to-end Data Mining tools
  - Extraction, Cleansing, Loading, Modeling, Algorithms (dozens), Analyst's workbench, Reporting, Charting…
- The customer is the power analyst
  - A PhD in statistics is usually required…
- Closed tools – no standard API
  - Total vendor lock-in
  - Limited integration with applications
- DM is an “outsider” in the Data Warehouse
- Extensive consulting required
- Skyrocketing prices
  - $60K+ for a single-user license

What the analysts say…
- “Stand-alone Data Mining Is Dead” - Forrester
- “The demise of [stand alone] data mining” – Gartner

The Microsoft Approach

DataPro Users Survey 1999-2001
“Data mining will be the fastest-growing BI technology…”

The $$$: 2000 Market Size
- DM Applications Market Size: $1.5B*
* IDC

SQL Server 2000 - The Analysis Platform
- SQL 2000 provides a complete Analysis Platform
  - Not an isolated, stand-alone DM product
- Platform means:
  - The infrastructure for applications
    - Not an application by itself
  - Integrated vision for all technologies and tools
  - Standards-based APIs (OLE DB for DM)
  - Extensible
  - Scalable

Data Flow
[Diagram labels: DW, OLTP, OLAP, DM, Apps, Reports & Analysis]

Analysis Services 2000 - Architecture
[Diagram labels: Manager UI, DSO, Analysis Server, Client, OLE DB, OLAP, OLAP Engine, OLAP Engine (local), DM Engine, DM Engine (local), DM, DMM, DM Wizards, DM DTS Task, Ext.]

OLE DB for Data Mining…

Why OLE DB for DM?
Make DM a mass-market technology by:
- Leveraging existing technologies and knowledge
  - SQL and OLE DB
- Establishing common industry-wide concepts and data presentation
- Changing the DM market perception from “proprietary” to “open”
- Increasing the number of players:
  - Reduce the cost and risk of becoming a consumer – one tool works with multiple providers
  - Reduce the cost and risk of becoming a provider – focus on expertise and find many partners to complement the offering
- Dramatically increasing the number of DM developers

Integration With RDBMS
- Customers would like to
  - Build DM models from within their RDBMS
  - Train the models directly off their relational tables
  - Perform predictions as relational queries (tables in, tables out)
  - Feel that DM is a native part of their database
- Therefore…
  - Data mining models are relational objects
  - All operations on the models are relational
  - The language used is SQL (w/ extensions)
- The effect: every DBA and VB developer can become a DM developer

Creating a Data Mining Model (DMM)

Identifying the “Cases”
- DM algorithms analyze “cases”
- The “case” is the entity being categorized and classified
- Examples
  - Customer credit risk analysis: Case = Customer
  - Product profitability analysis: Case = Product
  - Promotion success analysis: Case = Promotion
- Each case encapsulates all we know about the entity

A Simple Set of Cases

  StudentID  Gender  ParentIncome  IQ   Encouragement   CollegePlans
  1          Male    23400         120  Not Encouraged  No
  2          Female  79200         90   Encouraged      Yes
  3          Male    42000         105  Not Encouraged  Yes

More Complicated Cases

  CustID  Age  MaritalStatus  IQ  FavoriteMovies (Title, Score)
  1       35   M              2   Star Wars 8; Toy Story 9; Terminator 7
  2       20   S              3   Star Wars 7; Braveheart 7; The Matrix 10
  3       57   M              2   Sixth Sense 9; Casablanca 10

A DMM is a Table!
- A DMM structure is defined as a table
  - Training a DMM means inserting data into the table
  - Predicting from a DMM means querying the table
- All information describing the case is contained in columns

Creating a Mining Model

  CREATE MINING MODEL [PlansPrediction]
  (
    StudentID      LONG    KEY,
    Gender         TEXT    DISCRETE,
    ParentIncome   LONG    CONTINUOUS,
    IQ             DOUBLE  CONTINUOUS,
    Encouragement  TEXT    DISCRETE,
    CollegePlans   TEXT    DISCRETE PREDICT
  )
  USING Microsoft_Decision_Trees

Creating a mining model with nested table

  CREATE MINING MODEL MoviePrediction
  (
    CustomerId  LONG  KEY,
    Age         LONG  CONTINUOUS,
    Gender      TEXT  DISCRETE,
    Education   TEXT  DISCRETE,
    MovieList   TABLE PREDICT
    (
      MovieName TEXT KEY
    )
  )
  USING Microsoft_Decision_Trees

Training a DMM

Training a DMM
- Training a DMM means passing it data for which the attributes to be predicted are known
  - Multiple passes are handled internally by the provider!
- Use an INSERT INTO statement
- The DMM will not persist the inserted data
- Instead it will analyze the given cases and build the DMM content (decision tree, segmentation model, association rules)

  INSERT [INTO] <mining model name>
    [(columns list)]
    <source data query>
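Note on training: the point that a DMM analyzes the inserted cases and keeps only the learned content, not the rows, can be illustrated with a small Python sketch. It is a toy frequency-count model, not the Microsoft_Decision_Trees algorithm; the class `ToyMiningModel` and its methods are hypothetical names made up for this illustration, and the data is the three students from the “A Simple Set of Cases” slide.

```python
from collections import Counter, defaultdict


class ToyMiningModel:
    """Hypothetical stand-in for a mining model: it stores summary counts only."""

    def __init__(self, input_column, predict_column):
        self.input_column = input_column
        self.predict_column = predict_column
        # Learned "content": input value -> counts of each predicted value.
        self.content = defaultdict(Counter)

    def insert_into(self, cases):
        """Train on case dicts; the rows themselves are not kept."""
        for case in cases:
            key = case[self.input_column]
            self.content[key][case[self.predict_column]] += 1

    def predict_probability(self, input_value, predicted_value):
        """Share of training cases with this input that had the predicted value."""
        counts = self.content.get(input_value, Counter())
        total = sum(counts.values())
        return counts[predicted_value] / total if total else 0.0


# The three cases from the "A Simple Set of Cases" slide.
cases = [
    {"StudentID": 1, "Gender": "Male", "ParentIncome": 23400, "IQ": 120,
     "Encouragement": "Not Encouraged", "CollegePlans": "No"},
    {"StudentID": 2, "Gender": "Female", "ParentIncome": 79200, "IQ": 90,
     "Encouragement": "Encouraged", "CollegePlans": "Yes"},
    {"StudentID": 3, "Gender": "Male", "ParentIncome": 42000, "IQ": 105,
     "Encouragement": "Not Encouraged", "CollegePlans": "Yes"},
]

model = ToyMiningModel("Encouragement", "CollegePlans")
model.insert_into(cases)

print(model.predict_probability("Not Encouraged", "Yes"))  # 0.5
print(model.predict_probability("Encouraged", "Yes"))      # 1.0
```

After `insert_into` returns, the model holds two small counters and no student rows, which mirrors the “will not persist the inserted data” bullet. A real provider would build a decision tree or another structure from several passes over the data.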

INSERT INTO

  INSERT INTO [PlansPrediction]
    (StudentID, Gender, ParentIncome, IQ, Encouragement, CollegePlans)
  SELECT [StudentID], [Gender], [ParentIncome], [IQ],
         [Encouragement], [CollegePlans]
  FROM [CollegePlans]

When Insert Into Is Done…
- The DMM is trained
- The model can be retrained
- Content (rules, trees, formulas) can be explored
  - OLE DB schema rowset
  - SELECT * FROM <dmm>.CONTENT
  - XML string (PMML)
- Prediction queries can be executed

Predictions

What are Predictions?
- Predictions apply the rules of a trained model to a new set of data in order to estimate missing attributes or values
- Predictions = queries
  - The syntax is SQL-like
  - The output is a rowset
- In order to predict you need:
  - Input data set
  - A trained DMM
  - Binding (mapping) information between the input data and the DMM
  - Specification of what to predict

The Truth Table Concept

  Gender  ParentIncome  IQ  Encouragement   CollegePlans  Probability
  Male    20000         85  Not Encouraged  No            85%
  Male    20000         85  Not Encouraged  Yes           15%
  Male    20000         85  Encouraged      No            60%
  Male    20000         85  Encouraged      Yes           40%
  Male    20000         90  Not Encouraged  No            80%
  Male    20000         90  Not Encouraged  Yes           20%
  Male    20000         90  Encouraged      No            58%
  …

Prediction

  Gender  ParentIncome  IQ  Encouragement   CollegePlans  Probability
  Male    20000         85  Not Encouraged  No            85%
  Male    20000         85  Not Encouraged  Yes           15%
  Male    20000         85  Encouraged      No            60%
  Male    20000         85  Encouraged      Yes           40%
  Male    20000         90  Not Encouraged  No            80%
  Male    20000         90  Not Encouraged  Yes           20%
  Male    20000         90  Encouraged      No            58%
  Male    20000         90  Encouraged      Yes           42%
  Male    20000         95  Not Encouraged  No            78%
  Male    20000         95  Not Encouraged  Yes           22%
  Male    20000         95  Encouraged      No            45%

It's a JOIN!

  StudentID  Gender  ParentIncome  IQ   Encouragement
  1          Male    43000         85   Not Encouraged
  2          Male    20000         135  Not Encouraged
  3          Female  25000         105  Encouraged
  4          Male    96000         100  Encouraged
  5          Female  56000         125  Not Encouraged
  6          Female  46000         90   Not Encouraged

The Prediction Query Syntax

  SELECT <columns to return or predict>
  FROM <dmm> PREDICTION JOIN <input data set>
  ON <dmm column> = <dmm input column> …

Example

  SELECT [NewStudents].[StudentID],
         [PlansPrediction].[CollegePlans],
         PredictProbability([CollegePlans])
  FROM [PlansPrediction] PREDICTION JOIN [NewStudents]
  ON [PlansPrediction].[Gender] = [NewStudents].[Gender] AND
     [PlansPrediction].[IQ] = [NewStudents].[IQ] AND ...

OLE DB DM Sample Provider with Source

Integrated OLAP and DM Analysis

Why Use DM with OLAP
- Relational DM is designed for:
  - Reports of patterns
  - Batch predictions fed into an OLTP system
  - Real-time singleton prediction in an operational environment
- OLAP is designed for interactive analysis by a knowledge worker
  - Consistent and convenient navigational model
- Pre-aggregations of OLAP allow faster performance

Understanding DM Content – Decision Trees

  All Customers: Credit Risk 65% Good, 35% Bad
    Debt?
      Debt = Low: 89% Good, 11% Bad
        Employment Type?
          ET = Salaried: 94% Good, 6% Bad
          ET = Self Employed: 79% Good, 21% Bad
      Debt = High: 45% Good, 55% Bad
        Education?
          Education = College: 70% Good, 30% Bad
          Education = High School: 31% Good, 69% Bad

Customers having high debt and college education:

  Filter([IndividualCustomers].Members,
         Customers.CurrentMember.Properties("Debt") = "High" And
         Customers.CurrentMember.Properties("Education") = "College")

Customers having low debt who are self employed:

  Filter([IndividualCustomers].Members,
         Customers.CurrentMember.Properties("Debt") = "Low" And
         Customers.CurrentMember.Properties("EmploymentType") = "SelfEmployed")

…Equivalent DM Dimension

  Member                                              Custom Roll-up       Credit Risk
  All Customers                                       -                    Good=65%, Bad=35%
  Customers with low debt                             Aggregate(Filter(…   Good=89%, Bad=11%
  Customers with low debt and self employed           Aggregate(Filter(…   Good=79%, Bad=21%
  Customers with low debt and salaried                Aggregate(Filter(…   Good=94%, Bad=6%
  Customers with high debt                            Aggregate(Filter(…   Good=45%, Bad=55%
  Customers with high debt and college education      Aggregate(Filter(…   Good=70%, Bad=30%
  Customers with high debt and high school education  Aggregate(Filter(…   Good=31%, Bad=69%

Tree = Dimension
- Every node on the tree is a dimension member
  - The node statistics are the member properties
- All members are calculated
  - The formula aggregates the case dimension members that apply to this node
  - The MDX is generated by the DM algorithm
- Analysis Services will automatically generate the calculated dimension based on the DM content, and also a virtual cube
- Applies to
  - Classification (decision trees)
  - Segmentation (clusters)

Browsing the Virtual Cube
Pivot the DM dimension:

                              WA    OR    CA
  All Customers               3200  2500  8000
  Customers with low debt     2320  1503  4300
  Customers with high debt    880   997   4700
  Customers … college         320   450   2310
  Customers … high school     560   547   2390

  Credit Risk: 70% Good, 30% Bad

Predictions
- You might want to view predictions for each case
- For example:
  - What is the expected profitability of a product?
  - What is the credit risk of a specific customer?
  - What are the products this customer is likely to buy?
- All of those predictions are available through MDX calculated members
- The singleton query is created automatically

Prediction Calculated Member

  Measures.[ProbabilityofHighCreditRisk]:
  PREDICT(Customers.CurrentMember,
          "CreditRiskModel",
          "PredictionProbability(
             PredictionHistogram("CreditRisk"),
             'High')")

Predictions Example

                    Probability of High Credit Risk  Probability of Low Credit Risk
  Joe Smith         73%                              27%
  John Dow          68%                              32%
  William Clington  45%                              55%
  Robert Maxwell    98%                              2%
  Denis Rodman      81%                              19%

Questions?

E-Mail: billzuo@

