Fudan University School of Management doctoral entrance exam materials: Econometrics Lecture 1.ppt

Econometrics (I)
Lecture 1: Introduction and Two-Variable Regression Model

Dr. Sun Pei (孙霈)
Associate Professor in Industrial Economics
School of Management, Fudan University

2. What is Econometrics

- "The application of mathematical statistics to economic data to lend empirical support to the models constructed by mathematical economists and to obtain numerical estimates." (Samuelson et al., 1954)
- Quantifying economic relationships and finding values for parameters
- Testing theories implied by such relationships
- Using estimated relationships to make policy recommendations in business and government (e.g., forecasting)

3. Examples

- How much do cigarette taxes reduce smoking?
  - Negative externalities
  - Government intervention
  - Need to know the price elasticity of cigarette demand
  - Potential reverse causality
- Job training and worker productivity
  - Hourly wage = f(education, experience, training)

  - Econometric model: $wage = \beta_0 + \beta_1\,education + \beta_2\,experience + \beta_3\,training + u$
  - $\beta_0$, $\beta_1$, $\beta_2$ and $\beta_3$ are parameters of the model
  - $u$ contains all other factors that can influence hourly wage

4. Examples

Economic Model of Crime
- $Y = f(X_1, X_2, X_3, X_4, X_5, X_6, X_7)$
- $Y$: hours spent in criminal activities
- $X_1$: hourly "wage" in criminal activity
- $X_2$: hourly wage in legal employment
- $X_3$: income other than from crime or employment
- $X_4$: probability of getting caught
- $X_5$: probability of being convicted if caught
- $X_6$: expected sentence if convicted
- $X_7$: age

5. Examples

Econometric Model of Crime
- crime: frequency of criminal activity
- wagem: wage in legal employment
- otherinc: income from other sources
- freqarr: frequency of arrests
- freqcon: frequency of conviction
- avgsen: average sentence length after conviction
- u: unobserved factors, such as the wage for criminal activity

6. Causal Effects and Idealized Experiments

- Correlation vs. causality
- Ceteris paribus: other factors being equal
  - Hypotheses in the social sciences are ceteris paribus in nature
  - Have enough other factors been held fixed to make a case for causality?
- Randomized controlled experiment
  - E.g., the effects of fertilizer on crop yield
  - Control group: receives no treatment (no fertilizer)
  - Treatment group: receives treatment (fertilizer)
  - Randomized: whether a plot is fertilized or not is determined randomly, so any other differences between the plots are unrelated to whether they receive fertilizer

7. Causal Effects and Idealized Experiments

- Causal effect: the effect on an outcome of a given treatment, as measured in an ideal randomized controlled experiment.
- Experiments are rare in econometrics, which focuses on the analysis of nonexperimental (observational) data
- In observational data, levels of "treatment" (the amount of fertilizer) are rarely assigned at random, so it is hard to separate the effect of the "treatment" from other relevant factors.

8. Example: Measuring the Return to Education

- The effect of education on wage: if a person is chosen from the population and given another year of education, by how much will his/her wage increase?
- Such a social experiment is infeasible.
- In reality, levels of education are not assigned independently of other factors that affect wage
  - Experience: negatively correlated with education
  - Innate ability: positively correlated with education
- Analogs in the fertilizer example: experience vs. rainfall; innate ability (largely unobservable) vs. quality of land
- Many advances in econometric methods

  try to deal with unobserved factors

9. The Structure of Economic Data

Cross-Sectional Data
- A sample of individual units taken at a single time period
- It is often assumed that the data are obtained by random sampling from the underlying population
- Sometimes random sampling is not an appropriate assumption:
  - Sample selection bias
  - Units are large relative to the population (observations are not independent draws)
- The ordering of the data does not matter

12. The Structure of Economic Data

Time Series Data
- Observations on a variable or several variables over time (at multiple time periods)
- Unlike

  cross-sectional data, the chronological ordering of observations is crucial: data should be stored in chronological order
- Economic observations can rarely be assumed to be independent across time
- Data frequency: seasonal pattern

14. The Structure of Economic Data

Pooled Cross Sections
- Two cross-sectional household surveys are taken: one in 1985 and one in 1990
  - Note: in 1990 a new random sample of households was taken using the same survey questions
- Very similar to a standard cross section, except that we need to account for differences in the variables over time
- It is also a way of analyzing the effects of a new government policy (e.g., a reduction in property taxes in 1994)

16. The Structure of Economic Data

Panel/Longitudinal Data
- Data for multiple entities in which each entity is observed at two or more time periods
- The same cross-sectional units are followed over a given period of time
- Relationship with pooled cross sections
- Advantages: combining information from cross-sectional and time-series data

18. Two-Variable Linear Regression

- $Y = \beta_0 + \beta_1 X + u$
- $Y$: dependent/explained variable; regressand
- $X$: independent/explanatory variable; regressor
- $u$ (error term or disturbance) represents all factors

  other than $X$ that affect $Y$, thus treating all other factors as unobserved
- $\beta_1$: slope parameter/coefficient
- $\beta_0$: intercept parameter; constant term

19. Basic Assumptions

- Linear in parameters
  - $Y$ is a linear function of $\beta_1$ (the partial derivative $\partial Y / \partial \beta_1$ does not depend on $\beta_1$)
  - How about [equation missing]? How about [equation missing]?
- Random sampling
  - $(X_i, Y_i)$, $i$

    $= 1, \dots, n$, are independently and identically distributed (i.i.d.) across observations
  - In traditional textbooks, $X$ is assumed to be non-stochastic; that is, the $X_i$ are considered fixed in repeated samples

20. Basic Assumptions

- Zero conditional mean: $E(u \mid X) = 0$
  - $X$ and $u$ are uncorrelated: $\mathrm{Cov}(u, X) = 0$
  - Moreover, the average value of $u$ does not depend on the value of $X$
  - $X$ and $u$ do not need to be independent of each other
- Population regression function (PRF): $E(Y \mid X) = \beta_0 + \beta_1 X$

22. Basic Assumptions

- Homoskedasticity: $\mathrm{Var}(u \mid X) = \sigma^2$, and hence $\mathrm{Var}(Y \mid X) = \sigma^2$
- When $\mathrm{Var}(u \mid X)$ depends on $X$, the error term is said to exhibit heteroskedasticity (or nonconstant variance). In that case $\mathrm{Var}(Y \mid X)$ also becomes a function of $X$
- Heteroskedasticity in a wage equation: the variability in wage may increase with the level of education

25. Basic Assumptions

- No autocorrelation between the disturbances
  - This assumption is implied by the random sampling and the zero conditional mean

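    assumptions
  - The i.i.d. assumption implies that, for $i \neq j$, $E(u_i u_j \mid X_1, \dots, X_n) = E(u_i u_j \mid X_i, X_j) = E(u_i \mid X_i)\,E(u_j \mid X_j)$
  - The zero conditional mean assumption implies that $E(u_i \mid X_i) = 0$, so $E(u_i u_j \mid X_1, \dots, X_n) = 0$

26. Estimation

- Since the PRF is not directly observable, we work with:
  - Sample regression function (SRF): $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$
  - Residuals: $\hat{u}_i = Y_i - \hat{Y}_i = Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i$
  - Objective function (sum of squared residuals): $\min_{\hat{\beta}_0, \hat{\beta}_1} \sum_{i=1}^{n} \hat{u}_i^2$

28. Ordinary Least Squares (OLS) Estimators

- $\hat{\beta}_1 = \dfrac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}$
- $\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$

29. CEO Salary and Return on Equity

- $Y$: annual salary in thousands of dollars
- $X$: average ROE for the CEO's firm in the previous three years
- 209 CEOs in 1990
- The sample regression function: $\widehat{salary} = \hat{\beta}_0 + \hat{\beta}_1\,roe$, with $\hat{\beta}_1 \approx 18.5$
  - When roe = 0, the predicted salary is the intercept
  - If roe increases by one percentage point, the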

    salary is predicted to rise by \$18,500
- We will never know the population regression function, so we cannot tell how close the SRF is to the PRF

30. CEO Salary and Return on Equity [figure]

31. Algebraic Properties of OLS Statistics

- The sum, and therefore the sample average, of the OLS residuals is zero: $\sum_{i=1}^{n} \hat{u}_i = 0$.
- The sample covariance between the regressor and the OLS residuals is zero: $\sum_{i=1}^{n} X_i \hat{u}_i = 0$.
- The point $(\bar{X}, \bar{Y})$ is always on the OLS regression line.
- The mean of the fitted values $\hat{Y}_i$ is equal to the mean of the actual $Y_i$.

32. Measuring Goodness of Fit

- Fitted values and residuals: $Y_i = \hat{Y}_i + \hat{u}_i$
- It can be shown that the sample covariance between the two parts on

  the right-hand side is zero.
- Total sum of squares (total sample variation in $Y_i$): $SST = \sum_{i=1}^{n} (Y_i - \bar{Y})^2$
- Explained sum of squares (total variation in the fitted values): $SSE = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2$
- Residual sum of squares: $SSR = \sum_{i=1}^{n} \hat{u}_i^2$

34. Measuring Goodness of Fit

- It can be shown that $SST = SSE + SSR$
- Coefficient of determination ($R^2$): $R^2 = SSE/SST = 1 - SSR/SST$
  - It measures the fraction of the total variation in $Y$ that can be explained by a linear relationship between $X$ and $Y$.
  - We can use it to assess how closely an OLS regression line fits a scatter of points
  - $R^2$ lies between zero and one
  - $R^2$ and sample correlation: in the two-variable model, $R^2 = r_{XY}^2$, the squared sample correlation between $X$ and $Y$

35. Measuring Goodness of Fit

- In the CEO salary case,

  $R^2 = 0.0132$
  - Only about 1.3% of the variation in salary can be explained by a firm's ROE
  - 98.7% of the salary variation is left unexplained!
- A low $R^2$ by itself does not mean a regression is "bad", and low values are common in cross-sectional data.
  - It is still possible that the regression is a good estimate of

    the ceteris paribus relationship between ROE and salary.
- A high $R^2$ by itself does not mean a regression is "good", and high values are common in time-series data.
  - A high $R^2$ arises when time-series data exhibit persistent trends over time (price, output, income, consumption, etc.), even when the causal link between two variables is extremely tenuous or perhaps non-existent.

36. Best Linear Unbiased Estimator (BLUE)

- Suppose you want to estimate the mean value of $Y$ in a population. You can compute the sample average from a sample of $n$ independently and identically distributed observations $Y_1, Y_2, \dots, Y_n$ (random sampling)
- It can be shown that the sample average is a BLUE of the population mean.
- It can also be shown that the OLS estimators are BLUEs.

37. Properties of the OLS Estimators

- Linearity
  - They are linear functions of $Y_1, Y_2, \dots, Y_n$, conditional on $X_1, X_2, \dots, X_n$: for example, $\hat{\beta}_1 = \sum_{i=1}^{n} w_i Y_i$, where the weights $w_1, \dots, w_n$ can depend on the $X_i$ but not on the $Y_i$
- Conditional unbiasedness: $E(\hat{\beta}_1 \mid X_1, \dots, X_n) = \beta_1$ and $E(\hat{\beta}_0 \mid X_1, \dots, X_n) = \beta_0$

38. Properties of the OLS Estimators

- Efficiency (best linear unbiasedness)
  - The conditional variances of the OLS estimators $\hat{\beta}_1$ and $\hat{\beta}_0$ given $X_1, X_2, \dots, X_n$ are the smallest among all linear conditionally unbiased estimators.

39. The Determinants of the Variances

- The larger the error variance, the larger are the variances of the OLS estimators.
- The larger the sample variance of the explanatory variable, the smaller are the variances of the OLS estimators.

40. Large Sample Properties of the OLS Estimators

- Consistency
  - An estimator is said to be consistent if its probability limit is equal to the parameter being estimated.
  - As the sample size goes to infinity, the estimator's sampling distribution collapses onto the parameter being estimated.

41. Large Sample Properties of the OLS Estimators

- Consistency
  - A sufficient but not necessary condition: $\lim_{n \to \infty} E(\hat{\beta}) = \beta$ and $\lim_{n \to \infty} \mathrm{Var}(\hat{\beta}) = 0$
  - It can be shown that the OLS estimators are consistent estimators of the population parameters
- Asymptotic efficiency
  - The estimator is not only consistent but also has the smallest asymptotic variance
  - The sampling distribution collapses most quickly onto the population parameter as the sample size goes to infinity
  - OLS estimators are asymptotically efficient

42. Estimating the Error Variance $\sigma^2$

- It can be shown that $\hat{\sigma}^2 = \dfrac{\sum_{i=1}^{n} \hat{u}_i^2}{n - 2}$ is an unbiased estimator of $\sigma^2$
  - Note the degrees of freedom, $n - 2$
- Standard error of the regression: $\hat{\sigma} = \sqrt{\hat{\sigma}^2}$ (a biased but consistent estimator of $\sigma$)
- Standard errors of the OLS estimators: $se(\hat{\beta}_1) = \dfrac{\hat{\sigma}}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2}}$ and $se(\hat{\beta}_0) = \hat{\sigma} \sqrt{\dfrac{\sum_{i=1}^{n} X_i^2}{n \sum_{i=1}^{n} (X_i - \bar{X})^2}}$
  - Very useful for constructing test statistics and confidence intervals

43. Inferences about the OLS Estimators

- If we assume that the $u_i$ are normally distributed with zero mean and constant variance $\sigma^2$, then the conditional distributions of the OLS estimators are normal; for example, $\hat{\beta}_1 \mid X_1, \dots, X_n \sim N\!\left(\beta_1,\ \sigma^2 / \sum_{i=1}^{n} (X_i - \bar{X})^2\right)$
- Even without the normality assumption, for large samples we still have the approximate sampling distribution $\dfrac{\hat{\beta}_1 - \beta_1}{\sqrt{\mathrm{Var}(\hat{\beta}_1 \mid X_1, \dots, X_n)}} \xrightarrow{d} N(0, 1)$

44. Inferences about the OLS Estimators

- Unfortunately, $\sigma$ is rarely known in practice
- When we replace $\sigma$ with $\hat{\sigma}$, the distribution is no longer normal
- The Student t distribution with $m$ degrees of freedom is defined to be the distribution of $Z / \sqrt{W/m}$, where
  - $Z$ is a random variable with a standard normal distribution
  - $W$

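    is a random variable with a chi-squared distribution with $m$ degrees of freedom
  - $Z$ and $W$ are independent of each other

45. Inferences about the OLS Estimators

- If we retain the normality assumption on the $u_i$, it can be shown that $(n - 2)\,\hat{\sigma}^2 / \sigma^2 \sim \chi^2_{n-2}$
- Therefore, $\dfrac{\hat{\beta}_1 - \beta_1}{se(\hat{\beta}_1)} \sim t_{n-2}$, where $se(\hat{\beta}_1) = \hat{\sigma} / \sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2}$

46. Confidence Intervals

- $P\!\left(-t_{\alpha/2} \le \dfrac{\hat{\beta}_1 - \beta_1}{se(\hat{\beta}_1)} \le t_{\alpha/2}\right) = 1 - \alpha$, where $t_{\alpha/2}$ is the critical value
- Therefore, a $100(1 - \alpha)\%$ confidence interval for $\beta_1$ is $\hat{\beta}_1 \pm t_{\alpha/2}\, se(\hat{\beta}_1)$
- The width of the confidence interval is proportional to the standard error (precision) of the OLS estimator.
- Given a confidence level of 95%, in 95% of possible samples that might be drawn, the confidence interval will contain the true value of $\beta_1$

47. Hypothesis Testing

- Two-tail test: $H_0: \beta_1 = \beta^*$ against $H_1: \beta_1 \neq \beta^*$
  - Under $H_0$, the test statistic is $t = \dfrac{\hat{\beta}_1 - \beta^*}{se(\hat{\beta}_1)} \sim t_{n-2}$
  - Decision rule: reject $H_0$ if $|t| > t_{\alpha/2}$
- One-tail tests: $H_1: \beta_1 > \beta^*$ or $H_1: \beta_1 < \beta^*$
  - Decision rule: reject $H_0$ if $t > t_{\alpha}$ or $t < -t_{\alpha}$, respectively

48. Hypothesis Testing

- A null hypothesis that is commonly tested in empirical work is $\beta_1 = 0$
- The "2-t" rule of thumb: if the number of degrees of freedom is 20 or more and the level of significance ($\alpha$) is set at 5%, then the null hypothesis $\beta_1 = 0$ can be rejected if the t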
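
The estimation and inference steps covered in this lecture (OLS estimates, goodness of fit, standard errors, the t statistic and a confidence interval for the slope) can be sketched in a few lines of standard-library Python. This is only an illustrative sketch: the function names `ols_two_variable` and `t_test_slope`, the four data points, and the critical value 4.303 (the two-tailed 5% value for 2 degrees of freedom) are assumptions chosen for the example and do not come from the lecture.

```python
import math


def ols_two_variable(x, y):
    """Fit Y = b0 + b1*X + u by OLS; return estimates and fit statistics."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need paired observations with n >= 3")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("X must vary in the sample")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    b1 = sxy / sxx                # slope estimate
    b0 = y_bar - b1 * x_bar       # intercept estimate

    fitted = [b0 + b1 * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]

    sst = sum((yi - y_bar) ** 2 for yi in y)      # total sum of squares
    sse = sum((fi - y_bar) ** 2 for fi in fitted)  # explained sum of squares
    ssr = sum(u ** 2 for u in residuals)           # residual sum of squares
    r2 = 1 - ssr / sst if sst > 0 else float("nan")

    sigma2_hat = ssr / (n - 2)    # error-variance estimate, n - 2 degrees of freedom
    se_b1 = math.sqrt(sigma2_hat / sxx)
    se_b0 = math.sqrt(sigma2_hat * sum(xi ** 2 for xi in x) / (n * sxx))

    return {
        "b0": b0, "b1": b1, "residuals": residuals,
        "sst": sst, "sse": sse, "ssr": ssr, "r2": r2,
        "sigma2_hat": sigma2_hat, "se_b0": se_b0, "se_b1": se_b1,
        "df": n - 2,
    }


def t_test_slope(fit, beta_null, t_crit):
    """Two-tail t test of H0: beta1 = beta_null, plus a confidence interval.

    t_crit is the critical value t_{alpha/2} for fit["df"] degrees of
    freedom; the caller supplies it (for example from a t table).
    """
    if fit["se_b1"] == 0:
        raise ValueError("perfect fit: the t statistic is undefined")
    t_stat = (fit["b1"] - beta_null) / fit["se_b1"]
    reject = abs(t_stat) > t_crit
    half_width = t_crit * fit["se_b1"]
    ci = (fit["b1"] - half_width, fit["b1"] + half_width)
    return t_stat, reject, ci


# Hypothetical data, used only to exercise the functions above.
x_data = [0, 1, 2, 3]
y_data = [1, 3, 2, 4]
fit = ols_two_variable(x_data, y_data)
t_stat, reject, ci = t_test_slope(fit, beta_null=0.0, t_crit=4.303)
print(fit["b0"], fit["b1"], fit["r2"])
print(t_stat, reject, ci)
```

Passing the critical value in as an argument keeps the sketch free of third-party dependencies; with a statistics library available, the value could instead be looked up from the t distribution with `fit["df"]` degrees of freedom.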
