Introduction to Multiple Regression
Chapter 13

Objectives
In this chapter, you learn:
- How to develop a multiple regression model
- How to interpret the regression coefficients
- How to determine which independent variables to include in the regression model
- How to use categorical independent variables in a regression model

The Multiple Regression Model
Idea: examine the linear relationship between one dependent variable (Y) and two or more independent variables (X_i).

Multiple regression model with k independent variables:

$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} + \varepsilon_i$$

where $\beta_0$ is the Y-intercept, $\beta_1, \ldots, \beta_k$ are the population slopes, and $\varepsilon_i$ is the random error.

Multiple Regression Equation
The coefficients of the multiple regression model are estimated using sample data.

Multiple regression equation with k independent variables:

$$\hat{Y}_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + \cdots + b_k X_{ki}$$

where $\hat{Y}_i$ is the estimated (or predicted) value of Y, $b_0$ is the estimated intercept, and $b_1, \ldots, b_k$ are the estimated slope coefficients.

In this chapter we will use Excel and Minitab to obtain the regression slope coefficients and other regression summary measures.

Multiple Regression Equation (continued)
Two-variable model:

$$\hat{Y} = b_0 + b_1 X_1 + b_2 X_2$$

[Figure: the two-variable model drawn against the axes Y, X1, and X2, marking the slope for variable X1 and the slope for variable X2.]

Example: 2 Independent Variables
A distributor of frozen dessert pies wants to evaluate factors thought to influence demand.
- Dependent variable: Pie sales (units per week)
- Independent variables: Price (in $) and Advertising (in $100s)
- Data are collected for 15 weeks

Pie Sales Example

Week   Pie Sales   Price ($)   Advertising ($100s)
1      350         5.50        3.3
2      460         7.50        3.3
3      350         8.00        3.0
4      430         8.00        4.5
5      350         6.80        3.0
6      380         7.50        4.0
7      430         4.50        3.0
8      470         6.40        3.7
9      450         7.00        3.5
10     490         5.00        4.0
11     340         7.20        3.5
12     300         7.90        3.2
13     440         5.90        4.0
14     450         5.00        3.5
15     300         7.00        2.7

Multiple regression equation:
Sales = b0 + b1 (Price) + b2 (Advertising)
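As an illustration of how b0, b1, and b2 can be estimated from the 15 weeks of pie sales data, here is a minimal sketch in plain Python. It is not the Excel or Minitab procedure the chapter uses; it simply solves the least-squares normal equations directly. The names `DATA`, `solve_linear_system`, and `fit_least_squares` are hypothetical helpers introduced for this example.

```python
# Pie sales data from the table: (sales, price in $, advertising in $100s).
DATA = [
    (350, 5.50, 3.3), (460, 7.50, 3.3), (350, 8.00, 3.0),
    (430, 8.00, 4.5), (350, 6.80, 3.0), (380, 7.50, 4.0),
    (430, 4.50, 3.0), (470, 6.40, 3.7), (450, 7.00, 3.5),
    (490, 5.00, 4.0), (340, 7.20, 3.5), (300, 7.90, 3.2),
    (440, 5.90, 4.0), (450, 5.00, 3.5), (300, 7.00, 2.7),
]


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations update both sides together.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_least_squares(rows):
    """Return [b0, b1, b2] from the normal equations (X'X) b = X'y."""
    design = [[1.0, price, adv] for _, price, adv in rows]
    y = [float(sales) for sales, _, _ in rows]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


b0, b1, b2 = fit_least_squares(DATA)
print(f"Sales = {b0:.3f} + ({b1:.3f}) Price + ({b2:.3f}) Advertising")
```

The printed coefficients can be compared with the Coefficients column that Excel and Minitab report for the same data.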
Excel Multiple Regression Output

Regression Statistics
Multiple R          0.72213
R Square            0.52148
Adjusted R Square   0.44172
Standard Error      47.46341
Observations        15

ANOVA
             df   SS          MS          F         Significance F
Regression    2   29460.027   14730.013   6.53861   0.01201
Residual     12   27033.306   2252.776
Total        14   56493.333

              Coefficients   Standard Error   t Stat     P-value   Lower 95%   Upper 95%
Intercept     306.52619      114.25389         2.68285   0.01993    57.58835   555.46404
Price         -24.97509       10.83213        -2.30565   0.03979   -48.57626    -1.37392
Advertising    74.13096       25.96732         2.85478   0.01449    17.55303   130.70888

Minitab Multiple Regression Output

The regression equation is
Sales = 307 - 25.0 Price + 74.1 Advertising

Predictor     Coef     SE Coef   T       P
Constant      306.50   114.30     2.68   0.020
Price         -24.98    10.83    -2.31   0.040
Advertising    74.13    25.97     2.85   0.014

S = 47.4634   R-Sq = 52.1%   R-Sq(adj) = 44.2%

Analysis of Variance
Source           DF   SS      MS      F      P
Regression        2   29460   14730   6.54   0.012
Residual Error   12   27033    2253
Total            14   56493

The Multiple Regression Equation
Sales = 306.526 - 24.975 (Price) + 74.131 (Advertising)
where Sales is in number of pies per week, Price is in $, and Advertising is in $100s.
- b1 = -24.975: sales will decrease, on average, by 24.975 pies per week for each $1 increase in selling price, net of the effects of changes due to advertising
- b2 = 74.131: sales will increase, on average, by 74.131 pies per week for each $100 increase in advertising, net of the effects of changes due to price

Using the Equation to Make Predictions
Predict sales for a week in which the selling price is $5.50 and advertising is $350:

$$\widehat{\text{Sales}} = 306.526 - 24.975(5.50) + 74.131(3.5) = 428.62$$

Predicted sales is 428.62 pies. Note that Advertising is in $100s, so $350 means that X2 = 3.5.

Predictions in Excel using PHStat
- Select PHStat | Regression | Multiple Regression
- Check the "confidence and prediction interval estimates" box
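The prediction for a $5.50 price and $350 of advertising can be checked with a short sketch. The function name `predict_sales` and its default arguments are assumptions made for this illustration; the coefficient values are the rounded ones quoted in the text. The only subtle step is converting advertising from dollars to hundreds of dollars before applying b2.

```python
def predict_sales(price, advertising_dollars, b0=306.526, b1=-24.975, b2=74.131):
    """Predicted weekly pie sales from the fitted two-variable equation."""
    # Advertising enters the model in $100s, so $350 becomes X2 = 3.5.
    advertising_hundreds = advertising_dollars / 100.0
    return b0 + b1 * price + b2 * advertising_hundreds


predicted = predict_sales(price=5.50, advertising_dollars=350)
print(f"{predicted:.2f}")  # prints 428.62
```

This gives only the point prediction; the confidence and prediction intervals come from the PHStat or Minitab output.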
Predictions in PHStat (continued)
The PHStat worksheet shows:
- The input values
- The predicted Y value
- The confidence interval for the mean value of Y, given these X values
- The prediction interval for an individual Y value, given these X values

Predictions in Minitab

Predicted Values for New Observations
New Obs   Fit     SE Fit   95% CI           95% PI
1         428.6   17.2     (391.1, 466.1)   (318.6, 538.6)

Values of Predictors for New Observations
New Obs   Price   Advertising
1         5.50    3.50

The Price and Advertising values are the input values. The 95% CI is the confidence interval for the mean value of Y, given these X values. The 95% PI is the prediction interval for an individual Y value, given these X values.

The Coefficient of Multiple Determination, r²
Reports the proportion of total variation in Y explained by all X variables taken together:

$$r^2 = \frac{SSR}{SST} = \frac{\text{regression sum of squares}}{\text{total sum of squares}}$$

Multiple Coefficient of Determination in Excel
In the Excel output, R Square = 0.52148 (SSR = 29460.027, SST = 56493.333). 52.1% of the variation in pie sales is explained by the variation in price and advertising.

Multiple Coefficient of Determination in Minitab
In the Minitab output, R-Sq = 52.1%, which gives the same interpretation.

Adjusted r²
- r² never decreases when a new X variable is added to the model. This can be a disadvantage when comparing models.
- What is the net effect of adding a new variable? We lose a degree of freedom when a new X variable is added. Did the new X variable add enough explanatory power to offset the loss of one degree of freedom?

Adjusted r² (continued)
Shows the proportion of variation in Y explained by all X variables, adjusted for the number of X variables used:

$$r^2_{adj} = 1 - \left[ (1 - r^2) \frac{n - 1}{n - k - 1} \right]$$

(where n = sample size, k = number of independent variables)
- Penalizes excessive use of unimportant independent variables
- Smaller than r²
- Useful in comparing among models

Adjusted r² in Excel
In the Excel output, Adjusted R Square = 0.44172. 44.2% of the variation in pie sales is explained by the variation in price and advertising, taking into account the sample size and number of independent variables.

Adjusted r² in Minitab
In the Minitab output, R-Sq(adj) = 44.2%, which gives the same interpretation.

Is the Model Significant? F Test for Overall Significance of the Model
- Shows if there is a linear relationship between all of the X variables considered together and Y
- Use the F-test statistic
- Hypotheses:
  H0: β1 = β2 = ... = βk = 0 (no linear relationship)
  H1: at least one βi ≠ 0 (at least one independent variable affects Y)
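The r² and adjusted r² values quoted from the Excel output can be reproduced from the ANOVA sums of squares. The sketch below assumes the standard definitions, r² = SSR/SST and adjusted r² = 1 - (1 - r²)(n - 1)/(n - k - 1); the function names are hypothetical.

```python
def r_squared(ssr, sst):
    """Proportion of total variation in Y explained by the regression."""
    return ssr / sst


def adjusted_r_squared(r2, n, k):
    """r-squared penalized for the number of independent variables k."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Values taken from the ANOVA table for the pie sales example.
SSR, SST = 29460.027, 56493.333
N, K = 15, 2  # 15 weeks, 2 independent variables

r2 = r_squared(SSR, SST)
r2_adj = adjusted_r_squared(r2, N, K)
print(f"r2 = {r2:.5f}, adjusted r2 = {r2_adj:.5f}")
```

Because (n - 1)/(n - k - 1) is greater than 1 whenever k is at least 1, the adjusted value is always smaller than r² when r² is below 1, which is the penalty described above.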
F Test for Overall Significance
Test statistic:

$$F_{STAT} = \frac{MSR}{MSE} = \frac{SSR / k}{SSE / (n - k - 1)}$$

where FSTAT has numerator d.f. = k and denominator d.f. = (n - k - 1).

F Test for Overall Significance in Excel
In the Excel ANOVA table, F = MSR / MSE = 14730.013 / 2252.776 = 6.5386, with 2 and 12 degrees of freedom. Significance F = 0.01201 is the p-value for the F test.

F Test for Overall Significance in Minitab
In the Minitab Analysis of Variance table, F = 6.54 with 2 and 12 degrees of freedom, and P = 0.012 is the p-value for the F test.

F Test for Overall Significance (continued)
- H0: β1 = β2 = 0
- H1: β1 and β2 not both zero
- α = .05, df1 = 2, df2 = 12
- Critical value: F0.05 = 3.885 (reject H0 if FSTAT > 3.885; otherwise do not reject H0)
- Test statistic: FSTAT = MSR / MSE = 6.5386
- Decision: Since the FSTAT test statistic is in the rejection region (p-value < .05), reject H0
- Conclusion: There is evidence that at least one independent variable affects Y

Residuals in Multiple Regression
- Two-variable model: $\hat{Y} = b_0 + b_1 X_1 + b_2 X_2$
- Residual = $e_i = (Y_i - \hat{Y}_i)$, the difference between the sample observation $Y_i$ and the fitted value $\hat{Y}_i$ at $(x_{1i}, x_{2i})$
- The best fit equation is found by minimizing the sum of squared errors, $\sum e^2$

Multiple Regression Assumptions
Errors (residuals) from the regression model: $e_i = (Y_i - \hat{Y}_i)$
Assumptions:
- The errors are normally distributed
- Errors have a constant variance
- The model errors are independent

Residual Plots Used in Multiple Regression
These residual plots are used in multiple regression:
- Residuals vs. $\hat{Y}_i$
- Residuals vs. X1i
- Residuals vs. X2i
- Residuals vs. time (if time series data)
Use the residual plots to check for violations of regression assumptions.

Are Individual Variables Significant?
- Use t tests of individual variable slopes
- Shows if there is a linear relationship between the variable Xj and Y, holding constant the effects of other X variables
- Hypotheses:
  H0: βj = 0 (no linear relationship)
  H1: βj ≠ 0 (linear relationship does exist between Xj and Y)
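To connect the residuals with the overall F test earlier in this section, the following sketch computes the residuals e_i = Y_i - Ŷ_i from the pie sales data, sums their squares to get SSE, and forms FSTAT = MSR/MSE. The coefficients are copied from the Excel output and the critical value 3.885 is the one quoted in the text; the variable names are assumptions made for this illustration.

```python
# Pie sales data: (sales, price in $, advertising in $100s).
DATA = [
    (350, 5.50, 3.3), (460, 7.50, 3.3), (350, 8.00, 3.0),
    (430, 8.00, 4.5), (350, 6.80, 3.0), (380, 7.50, 4.0),
    (430, 4.50, 3.0), (470, 6.40, 3.7), (450, 7.00, 3.5),
    (490, 5.00, 4.0), (340, 7.20, 3.5), (300, 7.90, 3.2),
    (440, 5.90, 4.0), (450, 5.00, 3.5), (300, 7.00, 2.7),
]

# Coefficients as reported in the Excel output.
B0, B1, B2 = 306.52619, -24.97509, 74.13096
N, K = len(DATA), 2
F_CRITICAL = 3.885  # F(0.05) with 2 and 12 d.f., as quoted in the text

# Residuals: observed sales minus fitted sales for each week.
residuals = [sales - (B0 + B1 * price + B2 * adv) for sales, price, adv in DATA]

# Sums of squares: SSE from the residuals, SST around the mean, SSR = SST - SSE.
sse = sum(e * e for e in residuals)
mean_sales = sum(sales for sales, _, _ in DATA) / N
sst = sum((sales - mean_sales) ** 2 for sales, _, _ in DATA)
ssr = sst - sse

# Overall F statistic with k and n - k - 1 degrees of freedom.
f_stat = (ssr / K) / (sse / (N - K - 1))
reject_h0 = f_stat > F_CRITICAL
print(f"SSE = {sse:.3f}, SSR = {ssr:.3f}, F = {f_stat:.4f}, reject H0: {reject_h0}")
```

The same list of residuals is what would be plotted against the fitted values, X1, X2, or time when checking the regression assumptions.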
Are Individual Variables Significant? (continued)
H0: βj = 0 (no linear relationship between Xj and Y)
H1: βj ≠ 0 (linear relationship does exist between Xj and Y)
Test statistic:

$$t_{STAT} = \frac{b_j - 0}{S_{b_j}} \qquad (df = n - k - 1)$$

Are Individual Variables Significant? Excel Output
- t Stat for Price is tSTAT = -2.306, with p-value .0398
- t Stat for Advertising is tSTAT = 2.855, with p-value .0145

Are Individual Variables Significant? Minitab Output
- t Stat for Price is tSTAT = -2.31, with p-value .040
- t Stat for Advertising is tSTAT = 2.85, with p-value .014

Inferences about the Slope: t Test Example
- H0: βj = 0
- H1: βj ≠ 0
- d.f. = 15 - 2 - 1 = 12, α = .05, t(α/2) = 2.1788
- Rejection regions: α/2 = .025 in each tail. Reject H0 if tSTAT < -2.1788 or tSTAT > 2.1788; otherwise do not reject H0
- From the Excel output: for Price, tSTAT = -2.306 with p-value .0398; for Advertising, tSTAT = 2.855 with p-value .0145
- Decision: The test statistic for each variable falls in the rejection region (p-values < .05), so reject H0 for each variable
- Conclusion: There is evidence that both Price and Advertising affect pie sales at α = .05

Confidence Interval Estimate for the Slope
Confidence interval for the population slope βj:

$$b_j \pm t_{\alpha/2} S_{b_j}$$

where t has (n - k - 1) d.f. Here, t has (15 - 2 - 1) = 12 d.f.

              Coefficients   Standard Error
Intercept     306.52619      114.25389
Price         -24.97509       10.83213
Advertising    74.13096       25.96732

Example: Form a 95% confidence interval for the effect of changes in price (X1) on pie sales:
-24.975 ± (2.1788)(10.832)
So the interval is (-48.576, -1.374). This interval does not contain zero, so price has a significant effect on sales.

Confidence Interval Estimate for the Slope (continued)
Excel output also reports these interval endpoints:

              Coefficients   Standard Error   Lower 95%   Upper 95%
Intercept     306.52619      114.25389         57.58835   555.46404
Price         -24.97509       10.83213        -48.57626    -1.37392
Advertising    74.13096       25.96732         17.55303   130.70888

Weekly sales are estimated to be reduced by between 1.37 and 48.58 pies for each increase of $1 in the selling price, holding the effect of advertising constant.

Using Dummy Variables
- A dummy variable is a categorical independent variable with two levels: yes or no, on or off, male or female
- It is coded as 0 or 1
- Assumes the slopes associated with numerical independent variables do not change with the value for the categorical variable
- If there are more than two levels, the number of dummy variables needed is (number of levels - 1)

Dummy-Variable Example (with 2 Levels)
Let:
- Y = pie sales
- X1 = price
- X2 = holiday (X2 = 1 if a holiday occurred during the week; X2 = 0 if there was no holiday that week)

Dummy-Variable Example (with 2 Levels) (continued)
[Figure: Y (sales) plotted against X1 (Price) as two parallel lines with the same slope and different intercepts, b0 + b2 for Holiday weeks and b0 for No Holiday weeks.]
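Returning to the slope inferences earlier in this section, the t statistics and 95% confidence intervals can be reproduced from the coefficients and standard errors in the Excel output. This is a sketch under stated assumptions: the critical value 2.1788 (12 d.f., α = .05) is the one quoted in the text rather than computed, and `slope_inference` is a hypothetical helper name.

```python
T_CRITICAL = 2.1788  # two-tail critical value for 12 d.f. at alpha = .05, from the text


def slope_inference(b, se, t_crit=T_CRITICAL):
    """Return (t statistic, confidence interval, reject H0?) for one slope."""
    t_stat = b / se                    # tests H0: beta_j = 0
    margin = t_crit * se               # half-width of the interval
    interval = (b - margin, b + margin)
    return t_stat, interval, abs(t_stat) > t_crit


# (coefficient, standard error) pairs from the Excel output.
SLOPES = {
    "Price": (-24.97509, 10.83213),
    "Advertising": (74.13096, 25.96732),
}

results = {name: slope_inference(b, se) for name, (b, se) in SLOPES.items()}
for name, (t_stat, (low, high), reject) in results.items():
    print(f"{name}: t = {t_stat:.3f}, 95% CI = ({low:.3f}, {high:.3f}), reject H0: {reject}")
```

The endpoints differ from Excel's Lower 95% and Upper 95% columns only in the later decimal places, because the critical value here is rounded to four decimals.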