




Chapter 13: Introduction to Multiple Regression

Objectives

In this chapter, you learn:
- How to develop a multiple regression model
- How to interpret the regression coefficients
- How to determine which independent variables to include in the regression model
- How to use categorical independent variables in a regression model

The Multiple Regression Model

Idea: Examine the linear relationship between one dependent variable (Y) and two or more independent variables ($X_i$).

Multiple regression model with k independent variables:

$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} + \varepsilon_i$

where $\beta_0$ is the Y-intercept, $\beta_1, \ldots, \beta_k$ are the population slopes, and $\varepsilon_i$ is the random error.

Multiple Regression Equation

The coefficients of the multiple regression model are estimated using sample data.

Multiple regression equation with k independent variables:

$\hat{Y}_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + \cdots + b_k X_{ki}$

where $\hat{Y}_i$ is the estimated (or predicted) value of Y, $b_0$ is the estimated intercept, and $b_1, \ldots, b_k$ are the estimated slope coefficients.

In this chapter we will use Excel and Minitab to obtain the regression slope coefficients and other regression summary measures.

Multiple Regression Equation (continued)

Two-variable model: $\hat{Y} = b_0 + b_1 X_1 + b_2 X_2$, where $b_1$ is the slope for variable $X_1$ and $b_2$ is the slope for variable $X_2$.

[Figure: two-variable model with axes Y, X1 and X2, marking the slope for X1 and the slope for X2.]

Example: 2 Independent Variables

A distributor of frozen dessert pies wants to evaluate factors thought to influence demand.
- Dependent variable: Pie sales (units per week)
- Independent variables: Price (in $) and Advertising (in $100s)
- Data are collected for 15 weeks

Pie Sales Example

Multiple regression equation:

Sales = b0 + b1 (Price) + b2 (Advertising)

Week  Pie Sales  Price ($)  Advertising ($100s)
   1        350       5.50                  3.3
   2        460       7.50                  3.3
   3        350       8.00                  3.0
   4        430       8.00                  4.5
   5        350       6.80                  3.0
   6        380       7.50                  4.0
   7        430       4.50                  3.0
   8        470       6.40                  3.7
   9        450       7.00                  3.5
  10        490       5.00                  4.0
  11        340       7.20                  3.5
  12        300       7.90                  3.2
  13        440       5.90                  4.0
  14        450       5.00                  3.5
  15        300       7.00                  2.7
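
The least-squares fit for the pie sales data can be sketched in a few lines of standard-library Python. This is only an illustration, not the Excel or Minitab procedure the chapter uses: it solves the normal equations $(X'X)b = X'y$ directly, and the helper names `solve` and `fit_least_squares` are assumptions introduced here.

```python
# Pie sales data: (sales in pies/week, price in $, advertising in $100s)
DATA = [
    (350, 5.50, 3.3), (460, 7.50, 3.3), (350, 8.00, 3.0), (430, 8.00, 4.5),
    (350, 6.80, 3.0), (380, 7.50, 4.0), (430, 4.50, 3.0), (470, 6.40, 3.7),
    (450, 7.00, 3.5), (490, 5.00, 4.0), (340, 7.20, 3.5), (300, 7.90, 3.2),
    (440, 5.90, 4.0), (450, 5.00, 3.5), (300, 7.00, 2.7),
]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_least_squares(data):
    """Return [b0, b1, b2] that minimize the sum of squared errors."""
    rows = [(1.0, price, adv) for _, price, adv in data]  # leading 1 = intercept
    y = [sales for sales, _, _ in data]
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    return solve(xtx, xty)


b0, b1, b2 = fit_least_squares(DATA)
print(f"Sales = {b0:.3f} + ({b1:.3f}) Price + ({b2:.3f}) Advertising")
```

Up to rounding, the result should agree with the intercept and slope coefficients reported in the chapter's Excel and Minitab output. Solving the normal equations is adequate for a small, well-conditioned problem like this one; statistical packages generally use more numerically robust decompositions.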

Excel Multiple Regression Output

Regression Statistics
Multiple R          0.72213
R Square            0.52148
Adjusted R Square   0.44172
Standard Error      47.46341
Observations        15

ANOVA
            df         SS         MS        F  Significance F
Regression   2  29460.027  14730.013  6.53861         0.01201
Residual    12  27033.306   2252.776
Total       14  56493.333

             Coefficients  Standard Error    t Stat  P-value  Lower 95%  Upper 95%
Intercept       306.52619       114.25389   2.68285  0.01993   57.58835  555.46404
Price           -24.97509        10.83213  -2.30565  0.03979  -48.57626   -1.37392
Advertising      74.13096        25.96732   2.85478  0.01449   17.55303  130.70888

Minitab Multiple Regression Output

The regression equation is
Sales = 307 - 25.0 Price + 74.1 Advertising

Predictor      Coef  SE Coef      T      P
Constant     306.50   114.30   2.68  0.020
Price        -24.98    10.83  -2.31  0.040
Advertising   74.13    25.97   2.85  0.014

S = 47.4634   R-Sq = 52.1%   R-Sq(adj) = 44.2%

Analysis of Variance
Source          DF     SS     MS     F      P
Regression       2  29460  14730  6.54  0.012
Residual Error  12  27033   2253
Total           14  56493

The Multiple Regression Equation

Sales = 306.526 - 24.975 (Price) + 74.131 (Advertising)

where Sales is in number of pies per week, Price is in $, and Advertising is in $100s.

- b1 = -24.975: sales will decrease, on average, by 24.975 pies per week for each $1 increase in selling price, net of the effects of changes due to advertising
- b2 = 74.131: sales will increase, on average, by 74.131 pies per week for each $100 increase in advertising, net of the effects of changes due to price

Using the Equation to Make Predictions

Predict sales for a week in which the selling price is $5.50 and advertising is $350:

Sales = 306.526 - 24.975 (5.50) + 74.131 (3.5) = 428.62

Predicted sales is 428.62 pies. Note that Advertising is in $100s, so $350 means that X2 = 3.5.

Predictions in Excel using PHStat

- PHStat | regression | multiple regression
- Check the "confidence and prediction interval estimates" box
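
The point prediction above can be reproduced with a short sketch. The coefficients are copied from the Excel output; the function name `predict_sales` and the choice to take advertising in dollars and convert it to $100s are assumptions made here to highlight the units issue.

```python
# Coefficients as reported in the chapter's Excel output
B0, B_PRICE, B_ADVERTISING = 306.52619, -24.97509, 74.13096


def predict_sales(price_dollars, advertising_dollars):
    """Predicted weekly pie sales; advertising is converted from $ to $100s."""
    advertising_hundreds = advertising_dollars / 100.0
    return B0 + B_PRICE * price_dollars + B_ADVERTISING * advertising_hundreds


predicted = predict_sales(5.50, 350)
print(f"Predicted sales: {predicted:.2f} pies")  # Predicted sales: 428.62 pies
```

Passing 350 directly as X2 instead of 3.5 would inflate the prediction by roughly a factor of 60, which is why the conversion is kept inside the function.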

Predictions in PHStat (continued)

[Screenshot: PHStat output showing the input values, the predicted Y value, the confidence interval for the mean value of Y given these X values, and the prediction interval for an individual Y value given these X values.]

Predictions in Minitab

Predicted Values for New Observations
New Obs    Fit  SE Fit          95% CI          95% PI
      1  428.6    17.2  (391.1, 466.1)  (318.6, 538.6)

Values of Predictors for New Observations
New Obs  Price  Advertising
      1   5.50         3.50

- Values of Predictors: the input values
- 95% CI: confidence interval for the mean value of Y, given these X values
- 95% PI: prediction interval for an individual Y value, given these X values

The Coefficient of Multiple Determination, $r^2$

Reports the proportion of total variation in Y explained by all X variables taken together:

$r^2 = \frac{SSR}{SST} = \frac{\text{regression sum of squares}}{\text{total sum of squares}}$
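
As a quick check of this definition, the sketch below recomputes $r^2$ from the sums of squares in the ANOVA table of the chapter's Excel output. The variable names are illustrative.

```python
# Sums of squares from the ANOVA table of the Excel output
SSR = 29460.027  # regression sum of squares
SSE = 27033.306  # residual (error) sum of squares
SST = 56493.333  # total sum of squares

# The total variation splits into explained and unexplained parts
assert abs((SSR + SSE) - SST) < 1e-3

r_squared = SSR / SST
print(f"r^2 = {r_squared:.5f}")
```

The value should match the "R Square" entry of the Excel output and Minitab's "R-Sq" up to rounding.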

Multiple Coefficient of Determination in Excel

R Square = 0.52148 in the Excel output: 52.1% of the variation in pie sales is explained by the variation in price and advertising.

Multiple Coefficient of Determination in Minitab

R-Sq = 52.1% in the Minitab output: 52.1% of the variation in pie sales is explained by the variation in price and advertising.

Adjusted $r^2$

- $r^2$ never decreases when a new X variable is added to the model
  - This can be a disadvantage when comparing models
- What is the net effect of adding a new variable?
  - We lose a degree of freedom when a new X variable is added
  - Did the new X variable add enough explanatory power to offset the loss of one degree of freedom?

Adjusted $r^2$ (continued)

Shows the proportion of variation in Y explained by all X variables, adjusted for the number of X variables used:

$r^2_{adj} = 1 - \left[ (1 - r^2) \left( \frac{n - 1}{n - k - 1} \right) \right]$

(where n = sample size, k = number of independent variables)

- Penalizes excessive use of unimportant independent variables
- Smaller than $r^2$
- Useful in comparing among models
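
The adjustment can be illustrated numerically. The sketch below applies the standard adjusted $r^2$ formula to the pie sales values (R Square = 0.52148, n = 15, k = 2); the function name is an assumption introduced here.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted r^2 for n observations and k independent variables."""
    return 1.0 - (1.0 - r_squared) * (n - 1) / (n - k - 1)


adj = adjusted_r_squared(0.52148, n=15, k=2)
print(f"adjusted r^2 = {adj:.4f}")
```

With only 15 observations, the penalty is noticeable: the factor (n - 1)/(n - k - 1) = 14/12 enlarges the unexplained share before it is subtracted from 1.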

Adjusted $r^2$ in Excel

Adjusted R Square = 0.44172 in the Excel output: 44.2% of the variation in pie sales is explained by the variation in price and advertising, taking into account the sample size and number of independent variables.

Adjusted $r^2$ in Minitab

R-Sq(adj) = 44.2% in the Minitab output: 44.2% of the variation in pie sales is explained by the variation in price and advertising, taking into account the sample size and number of independent variables.

Is the Model Significant?

F Test for Overall Significance of the Model
- Shows if there is a linear relationship between all of the X variables considered together and Y
- Use F-test statistic
- Hypotheses:
  H0: $\beta_1 = \beta_2 = \cdots = \beta_k = 0$ (no linear relationship)
  H1: at least one $\beta_i \neq 0$ (at least one independent variable affects Y)

F Test for Overall Significance

Test statistic:

$F_{STAT} = \frac{MSR}{MSE} = \frac{SSR / k}{SSE / (n - k - 1)}$

where $F_{STAT}$ has numerator d.f. = k and denominator d.f. = (n - k - 1).
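
The F statistic for the pie sales model can be recomputed from the ANOVA sums of squares, as in the sketch below. The p-value line relies on a closed form that holds only when the numerator degrees of freedom equal 2, as they do here; for other values of k a statistics library would be needed. Variable names are illustrative.

```python
# Sums of squares from the ANOVA table of the Excel output
SSR, SSE = 29460.027, 27033.306
n, k = 15, 2  # sample size, number of independent variables

msr = SSR / k                # mean square regression
mse = SSE / (n - k - 1)      # mean square error
f_stat = msr / mse

# Special case, valid ONLY for numerator d.f. = 2:
# P(F > f) = (1 + 2 f / d2) ** (-d2 / 2), with d2 = denominator d.f.
d2 = n - k - 1
p_value = (1.0 + 2.0 * f_stat / d2) ** (-d2 / 2.0)

print(f"F = {f_stat:.4f} with {k} and {d2} d.f., p-value = {p_value:.5f}")
print("Reject H0 at alpha = .05" if p_value < 0.05 else "Do not reject H0")
```

The computed statistic and p-value should agree with the "F" and "Significance F" entries of the Excel ANOVA table up to rounding.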

F Test for Overall Significance in Excel (continued)

From the ANOVA table of the Excel output:
- $F_{STAT} = MSR / MSE = 14730.013 / 2252.776 = 6.5386$, with 2 and 12 degrees of freedom
- Significance F = 0.01201 is the p-value for the F test

F Test for Overall Significance in Minitab

From the Analysis of Variance table of the Minitab output:
- F = 6.54, with 2 and 12 degrees of freedom
- P = 0.012 is the p-value for the F test

F Test for Overall Significance (continued)

H0: $\beta_1 = \beta_2 = 0$
H1: $\beta_1$ and $\beta_2$ not both zero
$\alpha = .05$
df1 = 2, df2 = 12
Critical value: $F_{0.05} = 3.885$
Test statistic: $F_{STAT} = MSR / MSE = 6.5386$
Decision: Since the $F_{STAT}$ test statistic is in the rejection region (p-value < .05), reject H0.
Conclusion: There is evidence that at least one independent variable affects Y.

[Figure: F distribution with the $\alpha = .05$ rejection region to the right of $F_{0.05} = 3.885$; "Do not reject H0" to the left of the critical value, "Reject H0" to the right.]

Residuals in Multiple Regression

Two-variable model: $\hat{Y} = b_0 + b_1 X_1 + b_2 X_2$

Residual = $e_i = (Y_i - \hat{Y}_i)$

The best fit equation is found by minimizing the sum of squared errors, $\sum e^2$.

[Figure: two-variable model with axes Y, X1 and X2, showing a sample observation $Y_i$ and its fitted value $\hat{Y}_i$ at $(x_{1i}, x_{2i})$.]

Multiple Regression Assumptions

Errors (residuals) from the regression model: $e_i = (Y_i - \hat{Y}_i)$

Assumptions:
- The errors are normally distributed
- Errors have a constant variance
- The model errors are independent

Residual Plots Used in Multiple Regression

These residual plots are used in multiple regression:
- Residuals vs. $\hat{Y}_i$
- Residuals vs. $X_{1i}$
- Residuals vs. $X_{2i}$
- Residuals vs. time (if time series data)

Use the residual plots to check for violations of regression assumptions.

Are Individual Variables Significant?

- Use t tests of individual variable slopes
- Shows if there is a linear relationship between the variable $X_j$ and Y, holding constant the effects of other X variables
- Hypotheses:
  H0: $\beta_j = 0$ (no linear relationship)
  H1: $\beta_j \neq 0$ (linear relationship does exist between $X_j$ and Y)

Are Individual Variables Significant? (continued)

H0: $\beta_j = 0$ (no linear relationship between $X_j$ and Y)
H1: $\beta_j \neq 0$ (linear relationship does exist between $X_j$ and Y)

Test statistic:

$t_{STAT} = \frac{b_j - 0}{S_{b_j}}$   (df = n - k - 1)

where $S_{b_j}$ is the standard error of the slope coefficient $b_j$.
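
The t statistics for the pie sales slopes follow directly from this formula. The sketch below divides each coefficient by its standard error, both taken from the chapter's Excel output, and compares the result with 2.1788, the two-tail critical value the chapter quotes for 12 d.f. at $\alpha = .05$. The dictionary layout is an assumption made for illustration.

```python
# (coefficient, standard error) pairs from the Excel output
SLOPES = {
    "Price": (-24.97509, 10.83213),
    "Advertising": (74.13096, 25.96732),
}
T_CRITICAL = 2.1788  # two-tail critical value quoted in the chapter (12 d.f., alpha = .05)

t_stats = {name: b / se for name, (b, se) in SLOPES.items()}

for name, t in t_stats.items():
    decision = "reject H0" if abs(t) > T_CRITICAL else "do not reject H0"
    print(f"{name}: t = {t:.3f} -> {decision}")
```

Both statistics should match the "t Stat" column of the Excel output up to rounding.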

Are Individual Variables Significant? Excel Output (continued)

- t Stat for Price is $t_{STAT}$ = -2.306, with p-value .0398
- t Stat for Advertising is $t_{STAT}$ = 2.855, with p-value .0145

Are Individual Variables Significant? Minitab Output

- t Stat for Price is $t_{STAT}$ = -2.31, with p-value .040
- t Stat for Advertising is $t_{STAT}$ = 2.85, with p-value .014

Inferences about the Slope: t Test Example

H0: $\beta_j = 0$
H1: $\beta_j \neq 0$
d.f. = 15 - 2 - 1 = 12
$\alpha = .05$
$t_{\alpha/2} = 2.1788$

From the Excel output:
- For Price, $t_{STAT}$ = -2.306, with p-value .0398
- For Advertising, $t_{STAT}$ = 2.855, with p-value .0145

Decision: Reject H0 for each variable. The test statistic for each variable falls in the rejection region (p-values < .05).
Conclusion: There is evidence that both Price and Advertising affect pie sales at $\alpha = .05$.

[Figure: t distribution with rejection regions of area $\alpha/2 = .025$ in each tail, below -2.1788 and above 2.1788; "Do not reject H0" between the critical values.]

Confidence Interval Estimate for the Slope

Confidence interval for the population slope $\beta_j$:

$b_j \pm t_{\alpha/2} S_{b_j}$

where t has (n - k - 1) d.f. Here, t has (15 - 2 - 1) = 12 d.f.

Example: Form a 95% confidence interval for the effect of changes in price ($X_1$) on pie sales. From the Excel output, the Price coefficient is -24.97509 with standard error 10.83213:

$-24.975 \pm (2.1788)(10.832)$

So the interval is (-48.576, -1.374). This interval does not contain zero, so price has a significant effect on sales.

Confidence Interval Estimate for the Slope (continued)

Example: Excel output also reports these interval endpoints:

             Coefficients  Standard Error  Lower 95%  Upper 95%
Intercept       306.52619       114.25389   57.58835  555.46404
Price           -24.97509        10.83213  -48.57626   -1.37392
Advertising      74.13096        25.96732   17.55303  130.70888

Weekly sales are estimated to be reduced by between 1.37 and 48.58 pies for each increase of $1 in the selling price, holding the effect of advertising constant.

Using Dummy Variables

- A dummy variable is a categorical independent variable with two levels:
  - yes or no, on or off, male or female
  - coded as 0 or 1
- Assumes the slopes associated with numerical independent variables do not change with the value for the categorical variable
- If more than two levels, the number of dummy variables needed is (number of levels - 1)

Dummy-Variable Example (with 2 Levels)

$\hat{Y} = b_0 + b_1 X_1 + b_2 X_2$

Let:
- Y = pie sales
- $X_1$ = price
- $X_2$ = holiday ($X_2$ = 1 if a holiday occurred during the week; $X_2$ = 0 if there was no holiday that week)

Dummy-Variable Example (with 2 Levels) (continued)

[Figure: Y (sales) against X1 (Price), showing two lines with the same slope but different intercepts: $b_0 + b_2$ for Holiday and $b_0$ for No Holiday.]
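
The effect of a two-level dummy variable can be sketched as follows. The coefficients are purely hypothetical, chosen only to make the arithmetic easy to follow; the source does not report fitted values for the holiday model.

```python
# Hypothetical coefficients for illustration only (NOT from the source)
B0, B1, B2 = 300.0, -30.0, 15.0


def predicted_sales(price, holiday):
    """Predicted sales with a 0/1 dummy variable for a holiday week."""
    x2 = 1 if holiday else 0
    return B0 + B1 * price + B2 * x2


# Same price, different weeks: the gap is the dummy coefficient B2
gap = predicted_sales(6.0, True) - predicted_sales(6.0, False)

# Same kind of week, $1 price increase: the change is B1 in both groups
slope_holiday = predicted_sales(7.0, True) - predicted_sales(6.0, True)
slope_regular = predicted_sales(7.0, False) - predicted_sales(6.0, False)

print(gap, slope_holiday, slope_regular)
```

The dummy variable shifts only the intercept, from b0 to b0 + b2, so the two fitted lines are parallel. Allowing the slope to differ between groups would require an additional interaction term, which this sketch does not include.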