Class 4: Inference in multiple regression

I. The Logic of Statistical Inference

The logic of statistical inference is simple: we would like to make inferences about a population from what we observe in a sample that has been drawn randomly from the population. The sample's characteristics are called "point estimates." It is almost certain that the sample's characteristics are somewhat different from the population's characteristics. But because the sample was drawn randomly from the population, the sample's characteristics cannot be "very different" from the population's characteristics. What do I mean by "very different"? To answer this question, we need a distance measure (or dispersion measure), called the standard deviation of the statistic. To summarize, statistical inference consists of two steps:
(1) Point estimates (sample statistics)
(2) Standard deviations of the point estimates (dispersion of sample statistics in a hypothetical sampling distribution).

II. Review

For a sample of fixed size $n$, $y$ is the dependent variable; $X$ contains the independent variables. We can write the model in the following way:
$y = X\beta + \varepsilon$
Under certain assumptions, the LS estimator $b = (X'X)^{-1}X'y$ has certain desirable properties:
A1 => unbiasedness and consistency
A1 + A2 => BLUE
A1 + A2 + A3 => BUE (even for small samples)

III. The Central Limit Theorem

Statement: The mean of iid random variables (each with mean $\mu$ and variance $\sigma^2$) approaches a normal distribution as the number of random variables increases. This property belongs to the statistic (the sample mean) in the sampling distribution of all sample means, even though the random variables themselves are not normally distributed. You can never check this normality because you can only have one sample statistic.
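Although a real study yields only one sample mean, a simulation can mimic the hypothetical sampling distribution described in the statement of the theorem. The following is a minimal sketch in Python. The exponential distribution, the sample size of 100, the 2,000 replications, and the function name sample_means are all illustrative assumptions, not part of the notes. If the sample mean is approximately normal, roughly 95% of the standardized means should fall between -1.96 and 1.96, even though the underlying draws are strongly skewed.

```python
import math
import random


def sample_means(n, reps, rng):
    """Draw `reps` samples of size n from Exponential(1); return their means."""
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(reps)]


rng = random.Random(0)      # fixed seed so the run is repeatable
n, reps = 100, 2000         # illustrative choices, not from the notes
means = sample_means(n, reps, rng)

# Exponential(1) has mean 1 and variance 1, so the sample mean
# has standard deviation 1 / sqrt(n).
z = [(m - 1.0) / (1.0 / math.sqrt(n)) for m in means]

# Share of standardized means inside the usual normal cutoffs.
coverage = sum(abs(v) <= 1.96 for v in z) / reps
print(f"share of sample means within +/-1.96 SD: {coverage:.3f}")
```

Raising n moves the share closer to 0.95, while a very small n leaves visible skewness in the means.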
In regression analysis, we need not assume that the error term $\varepsilon$ is normally distributed if we have a large sample, because all estimated parameters approach normal distributions.
Why: all LS estimates are linear functions of $\varepsilon$ (proved last time). Recall a theorem: a linear transformation of a normally distributed variable is also normally distributed.

IV. Inferences about Regression Coefficients

A. Presentation of Regression Results

Common practice: give one star beside the parameter estimate for a significance level of 0.05, two stars for 0.01, and three stars for 0.001. For example:

Dependent Variable: Earnings
Independent Variable:
Father's education     0.900*
Mother's education     0.501*
Shoe size             -2.16

What is the problem with this practice? First, we want to have a quantitative evaluation of the significance level. We should not blindly rely on statistical tests. For example:

Father's education     0.900*   (0.450)
Mother's education     0.501*   (0.001)
Shoe size             -2.16     (1.10)

In this case, is father's education much more significant than shoe size? Not really. They are very similar. By contrast, mother's education is far more significant than the other two.

A second practice is to report the t or z values:

                       Coeff.    t
Father's education     0.900     2.0
Mother's education     0.501     500
Shoe size             -2.16     -1.96

This solution is much better. However, very often our hypothesis is not about deviation from zero, but from some other hypothetical value. For example, we may be interested in the hypothesis that a one-year increase in father's education will increase son's education by one year. The hypothesized value here is 1 instead of 0. The preferred way of presentation is:

                       Coeff.    (S.E.)
Father's education     0.900     (0.450)
Mother's education     0.501     (0.001)
Shoe size             -2.16      (1.10)

B. Difference between Statistical Significance and the Size of an Effect

Statistical significance always refers to a stated hypothesis. You will see a lot of misuses in the literature, sometimes by well-known sociologists. They would say that this variable is highly significant and that one is not significant. This is not correct. I am not responsible for their mistakes, but I want to warn you not to commit the same mistakes. In our example, you could say that the coefficient of mother's education is highly significantly different from zero. But it is not significantly different from 0.5. Had your hypothesis been that the parameter for mother's education is 0.5, the result would be consistent with the hypothesis. That is, a statement of statistical significance should always be made with reference to a hypothesis.

Another common mistake is to equate statistical significance with the size of an effect. A variable can be statistically significantly different from zero while its estimated coefficient is small. In our example, the contribution of father's education to the dependent variable is larger than that of mother's education, even though mother's education is more statistically significant from zero than father's education.

Important: you should look at both coefficients and their standard errors.

C. Confidence Intervals for Single Parameters

For a large sample, the 95% confidence interval for a single parameter is $b_k \pm 1.96 \cdot SE(b_k)$.

D. Hypothesis Testing for Single Parameters

Compute $z = (b_k - \beta_k^0) / SE(b_k)$, where $\beta_k^0$ is the hypothesized value. If z is outside the range of -1.96 to 1.96, the hypothesis is rejected. Otherwise, we fail to reject the hypothesis.

V. Inferences about Linear Combinations of Two Parameters

Example 1 (equality hypothesis): $\beta_1 = \beta_2$ => $\beta_1 - \beta_2 = 0$
Example 2 (proportionality hypothesis)
Example 3 (surplus hypothesis)

In general form, we may have $c_1\beta_1 + c_2\beta_2 = c$.
Hypothesis testing: test whether $c_1\beta_1 + c_2\beta_2$ equals $c$.
Confidence interval: $c_1\beta_1 + c_2\beta_2$ lies between (lower limit, upper limit).

Procedure:
A. Point estimate: Compute $c_1 b_1 + c_2 b_2$.
B. Degree of imprecision: Compute $V(c_1 b_1 + c_2 b_2) = c_1^2 V(b_1) + c_2^2 V(b_2) + 2 c_1 c_2 Cov(b_1, b_2)$ and then take the square root.

We would need to obtain the variance-covariance matrix of the parameter vector in order to carry out the calculation. $V(b_1)$ and $V(b_2)$ are on the diagonal; $Cov(b_1, b_2)$ is off-diagonal.

Let us look at the first example. Compute the confidence interval for the hypothesis that $b_1 = b_2$.
Step 1: $b_1 - b_2 = 0.2586$.
Step 2: $V(b_1 - b_2) = V(b_1) + V(b_2) - 2 Cov(b_1, b_2) = 0.1406$. $SD(b_1 - b_2) = 0.1406^{1/2} = 0.3750$.
Step 3: Compute $t_2 = (0.2586 - 0) / 0.3750 = 0.6897$, which is insignificant (an unsurprising result). Note that DF = 5 - 3 = 2.

I use two parameters as an example.
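The procedure for a linear combination of two parameters can be sketched in Python as follows. This is a minimal illustration, not the notes' own computation: the function name linear_combination is hypothetical, and the coefficient vector and variance-covariance matrix in the first part are made-up numbers, because the individual variances and the covariance of the worked example are not given. The second part redoes Step 3 of the worked example for b1 = b2 using only the figures quoted there (0.2586 and 0.1406) and compares the result with 4.303, the two-sided 5% critical value of t with 2 degrees of freedom.

```python
import math


def linear_combination(b, V, c):
    """Return (estimate, variance, standard error) of c'b, given Var(b) = V."""
    k = len(b)
    estimate = sum(c[i] * b[i] for i in range(k))
    # c'Vc expands to c1^2 V(b1) + c2^2 V(b2) + 2 c1 c2 Cov(b1, b2) when k = 2.
    variance = sum(c[i] * V[i][j] * c[j] for i in range(k) for j in range(k))
    return estimate, variance, math.sqrt(variance)


# Hypothetical estimates and variance-covariance matrix (NOT from the notes).
b = [1.20, 0.90]
V = [[0.10, 0.02],
     [0.02, 0.08]]
c = [1.0, -1.0]  # contrast for the equality hypothesis beta1 - beta2 = 0

est, var, se = linear_combination(b, V, c)
t_hypothetical = (est - 0.0) / se
print(f"estimate={est:.4f} variance={var:.4f} se={se:.4f} t={t_hypothetical:.4f}")

# Step 3 of the worked example, using only the figures quoted in the notes.
t_notes = (0.2586 - 0.0) / math.sqrt(0.1406)
T_CRIT_DF2 = 4.303  # two-sided 5% critical value of t with 2 degrees of freedom
reject = abs(t_notes) > T_CRIT_DF2
print(f"t from the notes' figures: {t_notes:.4f}, reject: {reject}")
```

With so few degrees of freedom the critical value is far above 1.96, which is why a t of about 0.69 is nowhere near significant.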