
Chapter 3
Descriptive Statistics: Numerical Methods
Copyright 2014 by The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin

Descriptive Statistics
3.1 Describing Central Tendency
3.2 Measures of Variation
3.3 Percentiles, Quartiles, and Box-and-Whiskers Displays
3.4 Covariance, Correlation, and the Least Squares Line (Optional)
3.5 Weighted Means and Grouped Data (Optional)
3.6 The Geometric Mean (Optional)

3.1 Describing Central Tendency
LO3-1: Compute and interpret the mean, median, and mode.
  • In addition to describing the shape of a distribution, we want to describe the data set's central tendency.
  • A measure of central tendency represents the center or middle of the data.
  • The population mean ($\mu$) is the average of the population measurements.
  • Population parameter: a number calculated from all the population measurements that describes some aspect of the population.
  • Sample statistic: a number calculated using the sample measurements that describes some aspect of the sample.

Measures of Central Tendency
  • Mean, $\mu$: the average or expected value
  • Median, $M_d$: the value of the middle point of the ordered measurements
  • Mode, $M_o$: the most frequent value

The Mean
Population: $X_1, X_2, \ldots, X_N$
Population mean:
$$\mu = \frac{\sum_{i=1}^{N} X_i}{N}$$
Sample: $x_1, x_2, \ldots, x_n$
Sample mean:
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

The Sample Mean
For a sample of size $n$, the sample mean ($\bar{x}$) is defined as
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$
and is a point estimate of the population mean $\mu$.
  • It is the value to expect, on average and in the long run.

Example 3.1 The Car Mileage Case: Estimating Mileage
Sample mean for the first five car mileages from Table 3.1: 30.8, 31.7, 30.1, 31.6, 32.1
$$\bar{x} = \frac{\sum_{i=1}^{5} x_i}{5} = \frac{x_1 + x_2 + x_3 + x_4 + x_5}{5} = \frac{30.8 + 31.7 + 30.1 + 31.6 + 32.1}{5} = \frac{156.3}{5} = 31.26$$

The Median
The median $M_d$ is a value such that 50% of all measurements, after having been arranged in numerical order, lie above (or below) it.
  • If the number of measurements is odd, the median is the middlemost measurement in the ordering.
  • If the number of measurements is even, the median is the average of the two middlemost measurements in the ordering.

Example 3.1 The Car Mileage Case
First five observations from Table 3.1: 30.8, 31.7, 30.1, 31.6, 32.1
In order: 30.1, 30.8, 31.6, 31.7, 32.1
There is an odd number of measurements, so the median is the one in the middle, 31.6.

The Mode
The mode $M_o$ of a population or sample of measurements is the measurement that occurs most frequently.
  • Modes are the values that are observed "most typically."
  • Sometimes the highest frequencies occur at two or more values.
  • If there are two modes, the data are bimodal.
  • If there are more than two modes, the data are multimodal.
  • When data are in classes, the class with the highest frequency is the modal class: the tallest box in the histogram.

Relationships Among Mean, Median, and Mode
(See Figure 3.3.)

3.2 Measures of Variation
LO3-2: Compute and interpret the range, variance, and standard deviation.
  • Knowing the measures of central tendency is not enough.
  • Both distributions in Figure 3.13 have identical measures of central tendency.

Measures of Variation
  • Range: the largest minus the smallest measurement
  • Variance: the average of the squared deviations of all the population measurements from the population mean
  • Standard deviation: the square root of the population variance

The Range
  • Largest minus smallest
  • Measures the interval spanned by all the data
  • For the left side of Figure 3.13, the largest is 5 and the smallest is 3, so the range is $5 - 3 = 2$ days.

Population Variance and Standard Deviation
  • The population variance ($\sigma^2$) is the average of the squared deviations of the individual population measurements from the population mean ($\mu$).
  • The population standard deviation ($\sigma$) is the positive square root of the population variance.

Variance
For a population of size $N$, the population variance $\sigma^2$ is:
$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} = \frac{(x_1 - \mu)^2 + (x_2 - \mu)^2 + \cdots + (x_N - \mu)^2}{N}$$
For a sample of size $n$, the sample variance $s^2$ is:
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1}$$

Standard Deviation
Population standard deviation ($\sigma$):
$$\sigma = \sqrt{\sigma^2}$$
Sample standard deviation ($s$):
$$s = \sqrt{s^2}$$

Example: Chris's Class Sizes This Semester
Data points are: 60, 41, 15, 30, 34
Mean is 36 (180/5)
Variance is:
$$\sigma^2 = \frac{(60 - 36)^2 + (41 - 36)^2 + (15 - 36)^2 + (30 - 36)^2 + (34 - 36)^2}{5} = \frac{576 + 25 + 441 + 36 + 4}{5} = \frac{1082}{5} = 216.4$$
Standard deviation is:
$$\sigma = \sqrt{216.4} = 14.71$$

Example: Sample Variance and Standard Deviation
Example 3.6: data for the first five car mileages from Table 3.1: 30.8, 31.7, 30.1, 31.6, 32.1
The sample mean is 31.26.
The variance and standard deviation are:
$$s^2 = \frac{\sum_{i=1}^{5} (x_i - \bar{x})^2}{5 - 1} = \frac{(30.8 - 31.26)^2 + (31.7 - 31.26)^2 + (30.1 - 31.26)^2 + (31.6 - 31.26)^2 + (32.1 - 31.26)^2}{4} = \frac{2.572}{4} = 0.643$$
$$s = \sqrt{s^2} = \sqrt{0.643} = 0.8019$$

The Empirical Rule for Normal Populations
LO3-3: Use the Empirical Rule and Chebyshev's Theorem to describe variation.
If a population has mean $\mu$ and standard deviation $\sigma$ and is described by a normal curve, then:
  • 68.26% of the population measurements lie within one standard deviation of the mean: $[\mu - \sigma, \mu + \sigma]$
  • 95.44% lie within two standard deviations of the mean: $[\mu - 2\sigma, \mu + 2\sigma]$
  • 99.73% lie within three standard deviations of the mean: $[\mu - 3\sigma, \mu + 3\sigma]$
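The worked examples in Sections 3.1 and 3.2 can be checked numerically. The following is a minimal Python sketch, not part of the original slides: the helper names `mean` and `variance` are illustrative, the car mileages are treated as a sample and the class sizes as a population (as in the examples), and the final loop uses the sample mean and $s$ as stand-ins for $\mu$ and $\sigma$, which is an assumption made only to show what the Empirical Rule intervals look like.

```python
import math
import statistics

# First five car mileages from Table 3.1 (treated as a sample).
mileages = [30.8, 31.7, 30.1, 31.6, 32.1]
# Chris's class sizes this semester (treated as a whole population).
class_sizes = [60, 41, 15, 30, 34]


def mean(values):
    """Arithmetic mean: sum of the measurements divided by their count."""
    return sum(values) / len(values)


def variance(values, sample):
    """Average squared deviation from the mean.

    Divides by n - 1 for a sample and by N for a population.
    """
    center = mean(values)
    squared_deviations = sum((v - center) ** 2 for v in values)
    divisor = len(values) - 1 if sample else len(values)
    return squared_deviations / divisor


sample_mean = mean(mileages)
sample_median = statistics.median(mileages)
s2 = variance(mileages, sample=True)
s = math.sqrt(s2)

pop_mean = mean(class_sizes)
sigma2 = variance(class_sizes, sample=False)
sigma = math.sqrt(sigma2)

print(f"mileages: mean={sample_mean:.2f} median={sample_median} "
      f"s^2={s2:.3f} s={s:.4f}")
print(f"class sizes: mean={pop_mean:.0f} sigma^2={sigma2:.1f} sigma={sigma:.2f}")

# Empirical Rule intervals. Assumption: the sample mean and s stand in for
# mu and sigma, and the population is taken to be normally distributed.
for k, percent in [(1, 68.26), (2, 95.44), (3, 99.73)]:
    low, high = sample_mean - k * s, sample_mean + k * s
    print(f"about {percent}% within [{low:.2f}, {high:.2f}]")
```

The printed mean, median, variances, and standard deviations should agree with the hand calculations in the examples (31.26, 31.6, 0.643, 0.8019 for the mileages; 36, 216.4, 14.71 for the class sizes).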

Chebyshev's Theorem
Let $\mu$ and $\sigma$ be a population's mean and standard deviation. Then, for any value $k > 1$:
  • At least $100(1 - 1/k^2)\%$ of the population measurements lie in the interval $[\mu - k\sigma, \mu + k\sigma]$.
  • The theorem is only practical for a non-mound-shaped population distribution that is not very skewed.

z Scores
For any $x$ in a population or sample, the associated z score is
$$z = \frac{x - \text{mean}}{\text{standard deviation}}$$
  • The z score is the number of standard deviations that $x$ is from the mean.
  • A positive z score is for $x$ above (greater than) the mean.
  • A negative z score is for $x$ below (less than) the mean.

Coefficient of Variation
Measures the size of the standard deviation relative to the size of the mean:
$$\text{Coefficient of variation} = \frac{\text{Standard deviation}}{\text{Mean}} \times 100\%$$
Used to:
  • Compare the relative variabilities of values about the mean
  • Compare the relative variability of populations or samples with different means and different standard deviations
  • Measure risk

3.3 Percentiles, Quartiles, and Box-and-Whiskers Displays
LO3-4: Compute and interpret percentiles, quartiles, and box-and-whiskers displays.
For a set of measurements arranged in increasing order, the pth percentile is a value such that $p$ percent of the measurements fall at or below the value and $(100 - p)$ percent of the measurements fall at or above the value.
  • The first quartile $Q_1$ is the 25th percentile.
  • The second quartile (median) is the 50th percentile.
  • The third quartile $Q_3$ is the 75th percentile.
  • The interquartile range IQR is $Q_3 - Q_1$.

Calculating Percentiles
1. Arrange the measurements in increasing order.
2. Calculate the index $i = (p/100)n$, where $p$ is the percentile to find.
3. (a) If $i$ is not an integer, round up: the next integer greater than $i$ denotes the position of the pth percentile.
   (b) If $i$ is an integer, the pth percentile is the average of the measurements in positions $i$ and $i + 1$.
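The three-step procedure above can be sketched in Python as follows. This is an illustrative sketch rather than the textbook's own code: the function name `percentile` is hypothetical, `p` is restricted to whole numbers from 1 to 99 so that position $i + 1$ always exists, and the twelve data values are made up for demonstration (they are not the values used in the slides).

```python
def percentile(measurements, p):
    """Return the pth percentile using the three-step index procedure.

    Assumption of this sketch: p is an integer with 0 < p < 100, so that
    position i + 1 always exists whenever i is an integer.
    """
    if not (isinstance(p, int) and 0 < p < 100):
        raise ValueError("p must be an integer between 1 and 99")
    if not measurements:
        raise ValueError("measurements must not be empty")

    ordered = sorted(measurements)         # Step 1: increasing order
    n = len(ordered)
    # Step 2: i = (p/100) * n, kept exact by working with integers.
    whole, remainder = divmod(p * n, 100)

    if remainder != 0:
        # Step 3(a): i is not an integer, so round up to position whole + 1.
        # Positions are 1-based while list indices are 0-based.
        return ordered[whole]
    # Step 3(b): i is an integer, so average positions i and i + 1.
    return (ordered[whole - 1] + ordered[whole]) / 2


# Hypothetical data: twelve made-up measurements.
data = [70, 10, 120, 30, 50, 90, 20, 110, 40, 80, 60, 100]
print(percentile(data, 10))  # i = 1.2, so the value in position 2
print(percentile(data, 25))  # i = 3, so the average of positions 3 and 4
```

Using `divmod` on `p * n` avoids floating-point rounding when deciding whether $i$ is an integer. Note that statistical libraries often use different interpolation conventions, so their percentiles may not match this procedure exactly.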

Percentile Example
  • $i = (10/100)12 = 1.2$. This is not an integer, so round up to 2. The 10th percentile is the measurement in the second position: 11,070.
  • $i = (25/100)12 = 3$. This is an integer, so average the values in positions 3 and 4. The 25th percentile is $(18{,}211 + 26{,}817)/2 = 22{,}514$.

Five-Number Summary
1. The smallest measurement
2. The first quartile, $Q_1$
3. The median, $M_d$
4. The third quartile, $Q_3$
5. The largest measurement
Displayed visually using a box-and-whiskers plot.

Box-and-Whiskers Plots
The box plots the:
  • First quartile, $Q_1$
  • Median, $M_d$
  • Third quartile, $Q_3$
  • Inner fences
  • Outer fences
Inner fences are located $1.5 \times \text{IQR}$ away from the quartiles:
  • $Q_1 - (1.5 \times \text{IQR})$
  • $Q_3 + (1.5 \times \text{IQR})$
Outer fences are located $3 \times \text{IQR}$ away from the quartiles:
  • $Q_1 - (3 \times \text{IQR})$
  • $Q_3 + (3 \times \text{IQR})$

Box-and-Whiskers Plots Continued
  • The "whiskers" are dashed lines that plot the range of the data.
  • One dashed line is drawn from the box below $Q_1$ down to the smallest measurement.
  • Another dashed line is drawn from the box above $Q_3$ up to the largest measurement.
(See Figures 3.17 and 3.18.)

Outliers
  • Outliers are measurements that are very different from other measurements: they are either much larger or much smaller than most of the other measurements.
  • Outliers lie beyond the limits of the box-and-whiskers plot: measurements less than the lower limit or greater than the upper limit.

3.4 Covariance, Correlation, and the Least Squares Line (Optional)
LO3-5: Compute and interpret covariance, correlation, and the least squares line (Optional).
  • When points on a scatter plot seem to fluctuate around a straight line, there is a linear relationship between $x$ and $y$.
  • A measure of the strength of a linear relationship is the covariance $s_{xy}$:
$$s_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$

Covariance
  • A positive covariance indicates a positive linear relationship between $x$ and $y$: as $x$ increases, $y$ tends to increase.
  • A negative covariance indicates a negative linear relationship between $x$ and $y$: as $x$ increases, $y$ tends to decrease.

Correlation Coefficient
  • The magnitude of the covariance does not indicate the strength of the relationship, because the magnitude depends on the unit of measurement used for the data.
  • The correlation coefficient ($r$) is a measure of the strength of the relationship that does not depend on the magnitude of the data:
$$r = \frac{s_{xy}}{s_x s_y}$$

Correlation Coefficient Continued
  • The sample correlation coefficient $r$ is always between $-1$ and $+1$.
  • Values near $-1$ show strong negative correlation.
  • Values near 0 show no correlation.
  • Values near $+1$ show strong positive correlation.
  • The sample correlation coefficient is the point estimate for the population correlation coefficient $\rho$.

Least Squares Line
  • If there is a linear relationship between $x$ and $y$, we might wish to predict $y$ on the basis of $x$.
  • This requires the equation of a line describing the linear relationship.
  • The line is calculated by the method of least squares, which is discussed in detail in a later chapter.
  • We need to find the slope ($b_1$) and the y-intercept ($b_0$):
$$b_1 = \frac{s_{xy}}{s_x^2}, \qquad b_0 = \bar{y} - b_1 \bar{x}$$

3.5 Weighted Means and Grouped Data (Optional)
LO3-6: Compute and interpret weighted means and the mean and standard deviation of grouped data (Optional).
  • Sometimes, some measurements are more important than others.
  • Assign numerical "weights" to the data.
  • Weights measure the relative importance of each value.
Calculate the weighted mean as
$$\frac{\sum w_i x_i}{\sum w_i}$$
where $w_i$ is the weight assigned to the ith measurement $x_i$.

Descriptive Statistics for Grouped Data
Data already ca
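The formulas from Sections 3.4 and 3.5 (covariance, correlation coefficient, least squares slope and intercept, and weighted mean) can be illustrated with a short Python sketch. The function names and the data below are hypothetical and chosen only for demonstration. The `ys` values lie exactly on the line $y = 1 + 2x$, so the expected slope, intercept, and correlation are easy to verify by hand.

```python
import math


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def sample_covariance(xs, ys):
    """s_xy = sum((x_i - x_bar) * (y_i - y_bar)) / (n - 1)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long lists with at least 2 points")
    x_bar, y_bar = mean(xs), mean(ys)
    products = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return products / (len(xs) - 1)


def sample_std(values):
    """s = sqrt(s^2); the sample variance equals the covariance of a
    variable with itself."""
    return math.sqrt(sample_covariance(values, values))


def correlation(xs, ys):
    """r = s_xy / (s_x * s_y)."""
    return sample_covariance(xs, ys) / (sample_std(xs) * sample_std(ys))


def least_squares_line(xs, ys):
    """Return (b0, b1) with b1 = s_xy / s_x^2 and b0 = y_bar - b1 * x_bar."""
    b1 = sample_covariance(xs, ys) / sample_covariance(xs, xs)
    b0 = mean(ys) - b1 * mean(xs)
    return b0, b1


def weighted_mean(values, weights):
    """sum(w_i * x_i) / sum(w_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)


# Hypothetical data: ys lies exactly on the line y = 1 + 2x.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

b0, b1 = least_squares_line(xs, ys)
print("s_xy =", sample_covariance(xs, ys))
print("r    =", round(correlation(xs, ys), 6))
print("line : y =", b0, "+", b1, "* x")

# Hypothetical weights: the second score counts three times as much.
print("weighted mean =", weighted_mean([80, 90], [1, 3]))
```

Because the sample variance is the covariance of a variable with itself, the sketch reuses `sample_covariance` for $s_x^2$ and $s_y^2$ instead of defining a separate variance function.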
