Applied Business Statistics (English Edition), ch03
Applied Statistics for Business
Dr xxx, Associate Professor, School of Economics and Management, xxxx

Applied Business Statistics, 7th ed., by Ken Black
Chapter 3: Descriptive Statistics

Learning Objectives
  • Distinguish between measures of central tendency, measures of variability, measures of shape, and measures of association.
  • Understand the meanings of mean, median, mode, quartile, percentile, and range.
  • Compute mean, median, mode, percentile, quartile, range, variance, standard deviation, and mean absolute deviation on ungrouped data.
  • Differentiate between sample and population variance and standard deviation.

Learning Objectives - Continued
  • Understand the meaning of standard deviation as it is applied by using the empirical rule and Chebyshev's theorem (切比雪夫定理).
  • Compute the mean, median, standard deviation, and variance on grouped data.
  • Understand box and whisker plots (箱线图, or 盒须图), skewness, and kurtosis.
  • Compute a coefficient of correlation and interpret it.

Measures of Central Tendency: Ungrouped Data
  • Measures of central tendency yield information about "particular places or locations in a group of numbers."
  • Common measures of location: mode, median, mean, percentiles, quartiles

Mode
  • Mode: the most frequently occurring value in a data set
  • Applicable to all levels of data measurement (nominal, ordinal, interval, and ratio)
  • Can be used to determine which categories occur most frequently
  • Sometimes no mode exists (no duplicates)
  • Bimodal: in a tie for the most frequently occurring value, two modes are listed
  • Multimodal: data sets that contain more than two modes

Median
  • Median: the middle value in an ordered array of numbers. Half the data are above it and half are below it.
  • Mathematically, it is the (n+1)/2th ordered observation
  • For an array with an odd number of terms, the median is the middle number
      n = 11: (n+1)/2 = 12/2 = 6th ordered observation
  • For an array with an even number of terms, the median is the average of the middle two numbers
      n = 10: (n+1)/2 = 11/2 = 5.5th, that is, the average of the 5th and 6th ordered observations

Arithmetic Mean
  • The mean is the average of a group of numbers
  • Applicable for interval and ratio data
  • Not applicable for nominal or ordinal data
  • Affected by each value in the data set, including extreme values
  • Computed by summing all values in the data set and dividing the sum by the number of values in the data set

Demonstration Problem 3.1
The number of U.S. cars in service by top car rental companies in a recent year according to Auto Rental News follows. Compute the mode, the median, and the mean.

Demonstration Problem 3.1: Solutions
  • Mode: 9,000 (two companies with 9,000 cars in service)
  • Median: With 13 different companies in this group, N = 13. The median is located at the (13+1)/2 = 7th position. Because the data are already ordered, the median is the 7th term, which is 20,000.

  • Mean: $\mu = \sum x / N$ = 1,791,000 / 13 = 137,769.23

Percentiles
  • Percentile: measures of central tendency that divide a group of data into 100 parts
  • At least n% of the data lie at or below the nth percentile, and at most (100 - n)% of the data lie above the nth percentile
  • Example: the 90th percentile indicates that at least 90% of the data are equal to or less than it, and at most 10% of the data lie above it

Calculating Percentiles
To calculate the pth percentile:
  1. Order the data
  2. Calculate i = N(p/100)
  3. Determine the percentile: if i is a whole number, use the average of the ith and (i+1)th ordered observations; otherwise, round i up to the next whole number and use that ordered observation

Quartiles
  • Quartile: measures of central tendency that divide a group of data into four subgroups
  • Q1: 25% of the data set is below the first quartile
  • Q2: 50% of the data set is below the second quartile
  • Q3: 75% of the data set is below the third quartile

Quartiles for Demonstration Problem 3.1
For the cars in service data, n = 13, so
  • Q1: i = 13(25/100) = 3.25, so use the 4th ordered observation: Q1 = 9,000
  • Q3: i = 13(75/100) = 9.75, so use the 10th ordered observation: Q3 = 204,000

Which Measure Do I Use?
  • Which measure of central tendency is most appropriate?
  • In general, the mean is preferred, since it has nice mathematical properties (in particular, see chapter 7)
  • The median and quartiles are resistant to outliers
  • Consider the following three data sets:
      1, 2, 3 (median = 2, mean = 2)
      1, 2, 6 (median = 2, mean = 3)
      1, 2, 30 (median = 2, mean = 11)
  • All have median = 2, but the mean is sensitive to the outliers
  • In general, if there are outliers, the median is preferred to the mean

Box and Whisker Plot
Why Use a Box and Whisker Plot?
  • Box and whisker plots are very effective and easy to read. They summarize data from multiple sources and display the results in a single graph.
  • Box and whisker plots allow for comparison of data from different categories for easier, more effective decision-making.

A box and whisker plot is developed from five statistics:
  • Minimum value: the smallest value in the data set
  • First quartile: the value below which the lower 25% of the data are contained
  • Median value: the middle number in a range of numbers
  • Third quartile: the value above which the upper 25% of the data are contained
  • Maximum value: the largest value in the data set
Some types are called box and whisker plots with outliers.

Measures of Variability: Ungrouped Data
  • Measures of variability: tools that describe the spread or the dispersion of a set of data
  • They provide more meaningful information when used with measures of central tendency in comparison to other groups
  • Common measures of variability: range, interquartile range, mean absolute deviation, variance and standard deviation, coefficient of variation

Measures of Spread or Dispersion: Ungrouped Data
Range
  • The difference between the largest and the smallest values in a set of data
  • Advantage: easy to compute
  • Disadvantage: is affected by extreme values

Interquartile Range
  • Interquartile range: the range of values between the first and third quartiles
  • Range of the "middle half"; middle 50%
  • Useful when researchers are interested in the middle 50%, and not the extremes
  • Example: for the cars in service data, the IQR is 204,000 - 9,000 = 195,000
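The percentile location rule and the three small data sets above can be checked with a short Python sketch. The function name `percentile` and the placeholder values 1 to 13 are assumptions made for illustration. The actual cars-in-service figures are not reproduced in this text, so only the positions (4th and 10th) are compared with the slides.

```python
import math
from statistics import mean, median


def percentile(values, p):
    """Return the pth percentile using the location rule i = N * (p / 100)."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    ordered = sorted(values)
    n = len(ordered)
    i = n * p / 100
    if i == int(i):
        # i is a whole number: average the ith and (i+1)th ordered observations
        k = int(i)
        return (ordered[k - 1] + ordered[k]) / 2
    # otherwise round i up and take that ordered observation (1-based position)
    return ordered[math.ceil(i) - 1]


# Placeholder data (assumption): the value at each position equals the position,
# so the result shows which ordered observation the rule selects for n = 13.
placeholder = list(range(1, 14))
q1_position = percentile(placeholder, 25)  # i = 3.25, so the 4th observation
q3_position = percentile(placeholder, 75)  # i = 9.75, so the 10th observation
print(q1_position, q3_position)

# The three data sets from the slides: the median stays at 2, the mean moves.
for data in ([1, 2, 3], [1, 2, 6], [1, 2, 30]):
    print(data, "median =", median(data), "mean =", mean(data))
```

With the real ordered data in place of `placeholder`, the same two calls would return Q1 and Q3, and their difference would give the interquartile range.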

Deviations from the Mean
  • Useful for interval or ratio level data
  • An examination of deviation from the mean can reveal information about the variability of the data
  • Deviations are used mostly as a tool to compute other measures of variability
  • However, the sum of deviations from the arithmetic mean is always zero: $\sum (x - \mu) = 0$
  • There are two ways to solve this conundrum

Mean Absolute Deviation (MAD)
  • One solution is to take the absolute value of each deviation around the mean. The average of these absolute deviations is called the mean absolute deviation.
  • Note that while the MAD is intuitively simple, it is rarely used in practice

Population Variance
  • Another solution is to take the sum of squared deviations (SSD) about the mean
  • The population variance is the average of the squared deviations about the arithmetic mean for a set of numbers
  • The population variance is denoted by $\sigma^2$: $\sigma^2 = \sum (x - \mu)^2 / N$

Population Standard Deviation
  • The population standard deviation is a measure of the spread of a distribution
  • A normal distribution is completely described by its center (the mean) and its spread (defined by multiples of its standard deviation)
  • The standard deviation is the square root of the variance: $\sigma = \sqrt{\sigma^2}$

What's the Meaning of Std. Deviation?
Empirical rule*
  • It is used to state the approximate percentage of values that lie within a given number of standard deviations from the mean of a set of data if the data are normally distributed: about 68% within $\mu \pm 1\sigma$, 95% within $\mu \pm 2\sigma$, and 99.7% within $\mu \pm 3\sigma$
  * Based on the assumption that the data are approximately normally distributed.

Chebyshev's Theorem
  • Unlike the empirical rule, Chebyshev's theorem applies to all distributions regardless of their shape and thus can be used whenever the data distribution shape is unknown or is nonnormal
  • Within k standard deviations of the mean, $\mu \pm k\sigma$, lie at least $1 - 1/k^2$ proportion of the values
  • Assumption: k > 1

Sample Variance
  • Sample variance: the average of the squared deviations from the arithmetic mean, computed with n - 1 rather than n in the denominator
  • The sample variance is denoted by $s^2$: $s^2 = \sum (x - \bar{x})^2 / (n - 1)$

Sample Standard Deviation
  • The sample standard deviation is the square root of the sample variance
  • Same units as the original data

Demonstration Problem 3.6
The effectiveness of district attorneys can be measured by several variables, including the number of convictions per month, the number of cases handled per month, and the total number of years of conviction per month. A researcher uses a sample of five district attorneys in a city and determines the total number of years of conviction that each attorney won against defendants during the past month, as reported in the first column in the following tabulations. Compute the mean absolute deviation, the variance, and the standard deviation for these figures.

Demonstration Problem 3.6: Solution
The researcher computes the mean absolute deviation, the variance, and the standard deviation for these data in the following manner.

Z Scores
  • A z score represents the number of standard deviations a value (x) is above or below the mean of a set of numbers
  • A z score allows translation of a value's raw distance from the mean into units of standard deviations
  • $z = (x - \mu) / \sigma$

Coefficient of Variation
  • The coefficient of variation (CV) measures the volatility of a value (perhaps a stock portfolio) relative to its mean. It is the ratio of the standard deviation to the mean, expressed as a percentage.
  • Useful when comparing standard deviations computed from data with different means
  • Measurement of relative dispersion

Consider two different populations. Since 15.86 > 11.90, the first population is more variable, relative to its mean, than the second population.

Calculation of Grouped Mean
Sometimes data are already grouped, and you are interested in calculating summary statistics.

Interval        Frequency (f)   Midpoint (M)    f*M
20-under 30           6              25          150
30-under 40          18              35          630
40-under 50          11              45          495
50-under 60          11              55          605
60-under 70           3              65          195
70-under 80           1              75           75
Total                50                         2150

Median of Grouped Data - Example

Class Interval   Frequency   Cumulative Frequency
20-under 30           6               6
30-under 40          18              24
40-under 50          11              35
50-under 60          11              46
60-under 70           3              49
70-under 80           1              50
N = 50

Mode of Grouped Data
  • Midpoint of the modal class
  • The modal class has the greatest frequency

Class Interval   Frequency
20-under 30           6
30-under 40          18
40-under 50          11
50-under 60          11
60-under 70           3
70-under 80           1

Variance and Standard Deviation of Grouped Data
Population Variance a
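The grouped-data calculations can be sketched in Python using the frequency table from the slides. The function name `grouped_stats` is hypothetical. The median uses the usual interpolation within the median class, L + ((N/2 - cf) / f) * W, and the variance uses the population form with class midpoints. Both formulas are assumptions here, since the slide formulas themselves did not survive in this text.

```python
import math

# (lower bound, upper bound, frequency) for each class interval in the slides
classes = [(20, 30, 6), (30, 40, 18), (40, 50, 11),
           (50, 60, 11), (60, 70, 3), (70, 80, 1)]


def grouped_stats(classes):
    """Mean, median, mode, population variance and std. dev. of grouped data."""
    n = sum(f for _, _, f in classes)
    midpoints = [(lo + hi) / 2 for lo, hi, _ in classes]
    freqs = [f for _, _, f in classes]

    # Grouped mean: sum of f * M divided by N
    mean = sum(f * m for f, m in zip(freqs, midpoints)) / n

    # Population variance (assumed form): sum of f * (M - mean)^2 divided by N
    variance = sum(f * (m - mean) ** 2 for f, m in zip(freqs, midpoints)) / n

    # Grouped median (assumed form): interpolate inside the first class whose
    # cumulative frequency reaches N / 2
    cumulative = 0
    median = None
    for lo, hi, f in classes:
        if cumulative + f >= n / 2:
            median = lo + (n / 2 - cumulative) / f * (hi - lo)
            break
        cumulative += f

    # Grouped mode: midpoint of the class with the greatest frequency
    modal_lo, modal_hi, _ = max(classes, key=lambda c: c[2])
    mode = (modal_lo + modal_hi) / 2

    return {"n": n, "mean": mean, "median": median, "mode": mode,
            "variance": variance, "std_dev": math.sqrt(variance)}


stats = grouped_stats(classes)
print(stats)
```

The mean reproduces the table total, 2150 / 50 = 43. Under the assumed formulas the modal class is 30-under 40, so the mode is its midpoint, 35.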
