Chapter 3
Descriptive Statistics: Numerical Methods

Copyright 2014 by The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin

Descriptive Statistics

3.1 Describing Central Tendency
3.2 Measures of Variation
3.3 Percentiles, Quartiles, and Box-and-Whiskers Displays
3.4 Covariance, Correlation, and the Least Squares Line (Optional)
3.5 Weighted Means and Grouped Data (Optional)
3.6 The Geometric Mean (Optional)

3.1 Describing Central Tendency

- In addition to describing the shape of a distribution, we want to describe the data set's central tendency
- A measure of central tendency represents the center or middle of the data
- The population mean ($\mu$) is the average of the population measurements
- Population parameter: a number calculated from all the population measurements that describes some aspect of the population
- Sample statistic: a number calculated using the sample measurements that describes some aspect of the sample

LO3-1: Compute and interpret the mean, median, and mode.

Measures of Central Tendency

- Mean, $\mu$: the average or expected value
- Median, $M_d$: the value of the middle point of the ordered measurements
- Mode, $M_o$: the most frequent value

The Mean

Population $X_1, X_2, \ldots, X_N$ with population mean

$$\mu = \frac{\sum_{i=1}^{N} X_i}{N}$$

Sample $x_1, x_2, \ldots, x_n$ with sample mean

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

The Sample Mean

For a sample of size $n$, the sample mean ($\bar{x}$) is defined as

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$

and is a point estimate of the population mean $\mu$. It is the value to expect, on average and in the long run.

Example 3.1 Car Mileage Case: Estimating Mileage

Sample mean for the first five car mileages from Table 3.1: 30.8, 31.7, 30.1, 31.6, 32.1

$$\bar{x} = \frac{\sum_{i=1}^{5} x_i}{5} = \frac{30.8 + 31.7 + 30.1 + 31.6 + 32.1}{5} = \frac{156.3}{5} = 31.26$$

The Median

The median $M_d$ is a value such that 50% of all measurements, after having been arranged in numerical order, lie above (or below) it.

- If the number of measurements is odd, the median is the middlemost measurement in the ordering
- If the number of measurements is even, the median is the average of the two middlemost measurements in the ordering

Example 3.1 The Car Mileage Case

- First five observations from Table 3.1: 30.8, 31.7, 30.1, 31.6, 32.1
- In order: 30.1, 30.8, 31.6, 31.7, 32.1
- The number of observations is odd, so the median is the middle one, 31.6

The Mode

- The mode $M_o$ of a population or sample of measurements is the measurement that occurs most frequently
- Modes are the values that are observed "most typically"
- Sometimes the highest frequency occurs at two or more values
- If there are two modes, the data are bimodal
- If there are more than two modes, the data are multimodal
- When data are in classes, the class with the highest frequency is the modal class (the tallest box in the histogram)

Relationships Among Mean, Median and Mode

(Figure 3.3)

3.2 Measures of Variation

- Knowing the measures of central tendency is not enough
- Both distributions in Figure 3.13 have identical measures of central tendency

LO3-2: Compute and interpret the range, variance, and standard deviation.

Measures of Variation

- Range: the largest minus the smallest measurement
- Variance: the average of the squared deviations of all the population measurements from the population mean
- Standard deviation: the square root of the population variance

The Range

- Largest minus smallest
- Measures the interval spanned by all the data
- For the left side of Figure 3.13, the largest is 5 and the smallest is 3
- The range is 5 − 3 = 2 days

Population Variance and Standard Deviation

- The population variance ($\sigma^2$) is the average of the squared deviations of the individual population measurements from the population mean ($\mu$)
- The population standard deviation ($\sigma$) is the positive square root of the population variance

Variance

For a population of size $N$, the population variance $\sigma^2$ is:

$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} = \frac{(x_1 - \mu)^2 + (x_2 - \mu)^2 + \cdots + (x_N - \mu)^2}{N}$$

For a sample of size $n$, the sample variance $s^2$ is:

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1}$$

Standard Deviation

Population standard deviation ($\sigma$):

$$\sigma = \sqrt{\sigma^2}$$

Sample standard deviation ($s$):

$$s = \sqrt{s^2}$$

Example: Chris's Class Sizes This Semester

- Data points are: 60, 41, 15, 30, 34
- Mean is 36 (180/5)
- Variance is:

$$\sigma^2 = \frac{(60 - 36)^2 + (41 - 36)^2 + (15 - 36)^2 + (30 - 36)^2 + (34 - 36)^2}{5} = \frac{576 + 25 + 441 + 36 + 4}{5} = \frac{1082}{5} = 216.4$$

- Standard deviation is:

$$\sigma = \sqrt{216.4} = 14.71$$

Example: Sample Variance and Standard Deviation

Example 3.6: data for the first five car mileages from Table 3.1: 30.8, 31.7, 30.1, 31.6, 32.1

The sample mean is 31.26. The variance and standard deviation are:

$$s^2 = \frac{\sum_{i=1}^{5} (x_i - \bar{x})^2}{5 - 1} = \frac{(30.8 - 31.26)^2 + (31.7 - 31.26)^2 + (30.1 - 31.26)^2 + (31.6 - 31.26)^2 + (32.1 - 31.26)^2}{4} = \frac{2.572}{4} = 0.643$$

$$s = \sqrt{s^2} = \sqrt{0.643} = 0.8019$$

The Empirical Rule for Normal Populations

If a population has mean $\mu$ and standard deviation $\sigma$ and is described by a normal curve, then

- 68.26% of the population measurements lie within one standard deviation of the mean: $[\mu - \sigma, \mu + \sigma]$
- 95.44% lie within two standard deviations of the mean: $[\mu - 2\sigma, \mu + 2\sigma]$
- 99.73% lie within three standard deviations of the mean: $[\mu - 3\sigma, \mu + 3\sigma]$

LO3-3: Use the Empirical Rule and Chebyshev's Theorem to describe variation.

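The worked examples for central tendency and variation can be checked with a short script. The following is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative, and the small list passed to `multimode` is hypothetical data that does not come from the text. The printed values should agree with the figures in the car mileage and class size examples.

```python
from statistics import mean, median, multimode, pstdev, pvariance, stdev, variance

# First five car mileages from Table 3.1, treated as a sample.
mileages = [30.8, 31.7, 30.1, 31.6, 32.1]

sample_mean = mean(mileages)      # sum of the measurements divided by n
sample_median = median(mileages)  # middle value of the ordered measurements
sample_var = variance(mileages)   # squared deviations divided by n - 1
sample_sd = stdev(mileages)       # square root of the sample variance

# Chris's class sizes, treated as a population.
class_sizes = [60, 41, 15, 30, 34]

pop_mean = mean(class_sizes)
pop_range = max(class_sizes) - min(class_sizes)  # largest minus smallest
pop_var = pvariance(class_sizes)  # squared deviations divided by N
pop_sd = pstdev(class_sizes)      # square root of the population variance

# Hypothetical data (not from the text) in which two values tie for
# the highest frequency, so the data are bimodal.
modes = multimode([1, 2, 2, 3, 3])

print(f"sample mean = {sample_mean:.2f}, median = {sample_median}")
print(f"sample variance = {sample_var:.3f}, sample sd = {sample_sd:.4f}")
print(f"population mean = {pop_mean}, range = {pop_range}")
print(f"population variance = {pop_var:.1f}, population sd = {pop_sd:.2f}")
print(f"modes of the hypothetical data = {modes}")
```

Note the two divisors: `variance` and `stdev` divide by n − 1 (sample formulas), while `pvariance` and `pstdev` divide by N (population formulas).
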
Chebyshev's Theorem

Let $\mu$ and $\sigma$ be a population's mean and standard deviation. Then, for any value $k > 1$:

- At least $100(1 - 1/k^2)\%$ of the population measurements lie in the interval $[\mu - k\sigma, \mu + k\sigma]$
- Only practical for non-mound-shaped population distributions that are not very skewed

z Scores

For any $x$ in a population or sample, the associated z score is

$$z = \frac{x - \text{mean}}{\text{standard deviation}}$$

- The z score is the number of standard deviations that $x$ is from the mean
- A positive z score is for $x$ above (greater than) the mean
- A negative z score is for $x$ below (less than) the mean

Coefficient of Variation

Measures the size of the standard deviation relative to the size of the mean:

$$\text{Coefficient of variation} = \frac{\text{Standard deviation}}{\text{Mean}} \times 100\%$$

Used to:

- Compare the relative variabilities of values about the mean
- Compare the relative variability of populations or samples with different means and different standard deviations
- Measure risk

3.3 Percentiles, Quartiles, and Box-and-Whiskers Displays

For a set of measurements arranged in increasing order, the $p$th percentile is a value such that $p$ percent of the measurements fall at or below the value and $(100 - p)$ percent of the measurements fall at or above the value.

- The first quartile $Q_1$ is the 25th percentile
- The second quartile (median) is the 50th percentile
- The third quartile $Q_3$ is the 75th percentile
- The interquartile range IQR is $Q_3 - Q_1$

LO3-4: Compute and interpret percentiles, quartiles, and box-and-whiskers displays.

Calculating Percentiles

1. Arrange the measurements in increasing order
2. Calculate the index $i = (p/100)n$, where $p$ is the percentile to find
3. (a) If $i$ is not an integer, round up; the next integer greater than $i$ denotes the position of the $p$th percentile
   (b) If $i$ is an integer, the $p$th percentile is the average of the measurements in positions $i$ and $i + 1$

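Chebyshev's bound, the z score, the coefficient of variation, and the percentile steps above can be sketched in a few lines of Python. This is an illustrative sketch, not a reference implementation: the function names are hypothetical, `percentile` assumes `p` is a whole number from 1 to 99, and the data reused here are the class sizes and car mileages quoted earlier in the chapter.

```python
from statistics import mean, pstdev


def chebyshev_lower_bound(k):
    """Minimum percentage of measurements within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 100 * (1 - 1 / k**2)


def z_score(x, center, spread):
    """Number of standard deviations that x lies from the mean."""
    return (x - center) / spread


def coefficient_of_variation(center, spread):
    """Standard deviation as a percentage of the mean."""
    return spread / center * 100


def percentile(values, p):
    """p-th percentile using the index rule i = (p/100) * n."""
    if not values or not 1 <= p <= 99:
        raise ValueError("need non-empty data and a whole-number p from 1 to 99")
    ordered = sorted(values)                        # step 1
    whole, remainder = divmod(p * len(ordered), 100)  # step 2, in exact integer arithmetic
    if remainder:
        # Step 3(a): i is not an integer, so round up to position whole + 1,
        # which is list index `whole` because Python lists are 0-based.
        return ordered[whole]
    # Step 3(b): i is an integer, so average positions i and i + 1.
    return (ordered[whole - 1] + ordered[whole]) / 2


class_sizes = [60, 41, 15, 30, 34]
mu, sigma = mean(class_sizes), pstdev(class_sizes)
print(f"Chebyshev, k = 2: at least {chebyshev_lower_bound(2):.0f}%")
print(f"z score of 60: {z_score(60, mu, sigma):.2f}")
print(f"coefficient of variation: {coefficient_of_variation(mu, sigma):.2f}%")

mileages = [30.8, 31.7, 30.1, 31.6, 32.1]
print(f"50th percentile of the mileages: {percentile(mileages, 50)}")
```

Computing the index with `divmod(p * n, 100)` rather than `(p / 100) * n` avoids floating-point rounding when deciding whether the index is an integer.
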
Percentile Example

- $i = (10/100)12 = 1.2$
  - Not an integer, so round up to 2
  - The 10th percentile is in the second position, so 11,070
- $i = (25/100)12 = 3$
  - An integer, so average the values in positions 3 and 4
  - The 25th percentile is (18,211 + 26,817)/2, or 22,514

Five Number Summary

1. The smallest measurement
2. The first quartile, $Q_1$
3. The median, $M_d$
4. The third quartile, $Q_3$
5. The largest measurement

Displayed visually using a box-and-whiskers plot.

Box-and-Whiskers Plots

The box plots the:

- First quartile, $Q_1$
- Median, $M_d$
- Third quartile, $Q_3$
- Inner fences
- Outer fences

Inner fences are located $1.5 \times IQR$ away from the quartiles:

- $Q_1 - (1.5 \times IQR)$
- $Q_3 + (1.5 \times IQR)$

Outer fences are located $3 \times IQR$ away from the quartiles:

- $Q_1 - (3 \times IQR)$
- $Q_3 + (3 \times IQR)$

Box-and-Whiskers Plots Continued

- The "whiskers" are dashed lines that plot the range of the data
- A dashed line is drawn from the box below $Q_1$ down to the smallest measurement
- Another dashed line is drawn from the box above $Q_3$ up to the largest measurement

(Figures 3.17 and 3.18)

Outliers

- Outliers are measurements that are very different from other measurements
- They are either much larger or much smaller than most of the other measurements
- Outliers lie beyond the limits of the box-and-whiskers plot: they are measurements less than the lower limit or greater than the upper limit

3.4 Covariance, Correlation, and the Least Squares Line (Optional)

- When points on a scatter plot seem to fluctuate around a straight line, there is a linear relationship between $x$ and $y$
- A measure of the strength of a linear relationship is the covariance $s_{xy}$

$$s_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$

LO3-5: Compute and interpret covariance, correlation, and the least squares line (Optional).

Covariance

- A positive covariance indicates a positive linear relationship between $x$ and $y$: as $x$ increases, $y$ increases
- A negative covariance indicates a negative linear relationship between $x$ and $y$: as $x$ increases, $y$ decreases

Correlation Coefficient

- The magnitude of the covariance does not indicate the strength of the relationship, because the magnitude depends on the unit of measurement used for the data
- The correlation coefficient ($r$) is a measure of the strength of the relationship that does not depend on the magnitude of the data

$$r = \frac{s_{xy}}{s_x s_y}$$

Correlation Coefficient Continued

- The sample correlation coefficient $r$ is always between −1 and +1
  - Values near −1 show strong negative correlation
  - Values near 0 show no correlation
  - Values near +1 show strong positive correlation
- The sample correlation coefficient is the point estimate for the population correlation coefficient $\rho$

Least Squares Line

- If there is a linear relationship between $x$ and $y$, we might wish to predict $y$ on the basis of $x$
- This requires the equation of a line describing the linear relationship
- The line is calculated by the method of least squares, which is discussed in detail in a later chapter
- We need to find the slope ($b_1$) and the y-intercept ($b_0$)

$$b_1 = \frac{s_{xy}}{s_x^2} \qquad b_0 = \bar{y} - b_1 \bar{x}$$

3.5 Weighted Means and Grouped Data (Optional)

- Sometimes, some measurements are more important than others
- Assign numerical "weights" to the data; the weights measure the relative importance of each value
- Calculate the weighted mean as

$$\frac{\sum w_i x_i}{\sum w_i}$$

where $w_i$ is the weight assigned to the $i$th measurement $x_i$.

LO3-6: Compute and interpret weighted means and the mean and standard deviation of grouped data (Optional).

Descriptive Statistics for Grouped Data

Data already ca
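
The fence, covariance, correlation, least squares, and weighted mean formulas from Sections 3.3 to 3.5 translate directly into code. The sketch below is one possible Python rendering under stated assumptions: every function name is hypothetical, and all of the numbers (`xs`, `ys`, the quartiles 10 and 20, and the values and weights) are made-up illustrations, since the preview does not include paired data or a full data set.

```python
from statistics import mean, stdev, variance


def fences(q1, q3):
    """Return (inner, outer) fence pairs located 1.5 and 3 IQRs from the quartiles."""
    iqr = q3 - q1
    inner = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    outer = (q1 - 3 * iqr, q3 + 3 * iqr)
    return inner, outer


def covariance(xs, ys):
    """Sample covariance s_xy with divisor n - 1."""
    x_bar, y_bar = mean(xs), mean(ys)
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return cross / (len(xs) - 1)


def correlation(xs, ys):
    """Sample correlation coefficient r = s_xy / (s_x * s_y)."""
    return covariance(xs, ys) / (stdev(xs) * stdev(ys))


def least_squares_line(xs, ys):
    """Return (b0, b1) with b1 = s_xy / s_x^2 and b0 = y_bar - b1 * x_bar."""
    b1 = covariance(xs, ys) / variance(xs)
    b0 = mean(ys) - b1 * mean(xs)
    return b0, b1


def weighted_mean(values, weights):
    """Sum of w_i * x_i divided by the sum of the weights."""
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)


# Hypothetical paired data, chosen only to keep the arithmetic easy to follow.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1 = least_squares_line(xs, ys)
print(f"s_xy = {covariance(xs, ys):.2f}, r = {correlation(xs, ys):.4f}")
print(f"least squares line: y = {b0:.2f} + {b1:.2f} x")

# Hypothetical quartiles and weights.
print(f"fences for Q1 = 10, Q3 = 20: {fences(10, 20)}")
print(f"weighted mean: {weighted_mean([80, 90], [1, 3])}")
```

Because the slope reuses `covariance` and the sample variance of `xs`, the line stays consistent with the correlation computed from the same quantities.
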