MATLAB Econometrics: Diagnosing and Treating Multicollinearity

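Section 5  Diagnosing and Treating Multicollinearity

5.1 Diagnosing multicollinearity

Data source: Yu Junnian (于俊年), 计量经济学 (Econometrics), 对外经济贸易大学出版社, June 2000, pp. 208-209.

Economic data for a certain country, 1988-1998

    Year   Imports (y)   Domestic product (x1t)   Inventories (x2t)   Domestic consumption (x3t)
    1988      15.9             149.3                    4.2                  108.1
    1989      16.4             161.2                    4.1                  114.8
    1990      19.0             171.5                    3.1                  123.2
    1991      19.1             175.5                    3.1                  126.9
    1992      18.8             180.8                    1.1                  132.1
    1993      20.4             190.7                    2.2                  137.7
    1994      22.7             202.1                    2.1                  146.0
    1995      26.5             212.1                    5.6                  154.1
    1996      28.1             226.1                    5.0                  162.3
    1997      27.6             231.9                    5.1                  164.3
    1998      26.3             239.0                    0.7                  167.6

5.1.1 Condition number and condition index diagnosis

Let x1, x2, ..., xp be the vectors obtained by centring and standardizing the regressors X1, X2, ..., XP, that is:

$$x_j = \frac{X_j - \bar X_j \mathbf{1}}{\lVert X_j - \bar X_j \mathbf{1} \rVert}, \qquad j = 1, \dots, p.$$

Write x = (x1, x2, ..., xp), so that $x^T x$ is the correlation matrix of the regressors. Let $\lambda$ be an eigenvalue of $x^T x$ and $\varphi$ a corresponding eigenvector of length 1. If $\lambda \approx 0$, then:

$$\lVert x\varphi \rVert^2 = \varphi^T x^T x \varphi = \lambda \varphi^T \varphi = \lambda \approx 0, \quad\text{hence}\quad x\varphi \approx 0,$$

so the columns of x are nearly linearly dependent.

From the table above:

    x=[149.3, 4.2, 108.1; 161.2, 4.1, 114.8; 171.5, 3.1, 123.2; 175.5, 3.1, 126.9;
       180.8, 1.1, 132.1; 190.7, 2.2, 137.7; 202.1, 2.1, 146; 212.1, 5.6, 154.1;
       226.1, 5, 162.3; 231.9, 5.1, 164.3; 239, 0.7, 167.6];

Correlation matrix R of x:

    R=corrcoef(x)
    R =
       1.00000000000000   0.02447049083573   0.99715218582079
       0.02447049083573   1.00000000000000   0.03567322292007
       0.99715218582079   0.03567322292007   1.00000000000000

Condition number of R:

    cond(R)
    ans = 7.178039564809832e+002

Alternatively, compute the eigenvalues of R first:

    e=eig(R)
    e =
       0.00278483106125
       0.99825241504342
       1.99896275389533

    e(3)/e(1)
    ans = 7.178039564809491e+002

The condition number is 717.804, which is greater than 100, so multicollinearity is fairly severe. To see which variables are linearly related, compute the eigenvalues of the correlation matrix together with their eigenvectors:

    [v,d]=eig(R)
    v =
       0.70696453896575   0.03569873579633   0.70634746471371
       0.00795062868633  -0.99906334219563   0.04253499482058
      -0.70720430439049   0.02445482658777   0.70658618250581
    d =
       0.00278483106125   0                  0
       0                  0.99825241504342   0
       0                  0                  1.99896275389533

Note: Rv = vd, and v is an orthogonal matrix.

The smallest eigenvalue is 0.00278483106125, and its eigenvector is (0.70696453896575, 0.00795062868633, -0.70720430439049)^T. Since the second component, 0.00795062868633, is approximately 0, it follows that $0.7070\,x_1 - 0.7072\,x_3 \approx 0$, i.e. $x_1 \approx x_3$. Hence there exist constants $c_0$ and $c_1$ such that $X_1 \approx c_0 + c_1 X_3$.

5.1.2 Variance inflation factor diagnosis

The variance inflation factor (VIF) of each regressor is the corresponding diagonal element $r^{jj}$ of $R^{-1}$. If $R_j$ denotes the multiple correlation coefficient of $x_j$ on the other p-1 regressors, then:

$$VIF_j = r^{jj} = \frac{1}{1 - R_j^2}.$$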
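The identity between the diagonal of $R^{-1}$ and $1/(1-R_j^2)$ can be checked numerically for all three regressors. The following is a minimal sketch rather than part of the original text: it uses only base MATLAB functions, and the names `vif_inv`, `vif_reg` and `others` are illustrative assumptions.

```matlab
% Regressors from the data table: x1 (domestic product), x2 (inventories),
% x3 (domestic consumption), 1988-1998.
x = [149.3 4.2 108.1; 161.2 4.1 114.8; 171.5 3.1 123.2; 175.5 3.1 126.9; ...
     180.8 1.1 132.1; 190.7 2.2 137.7; 202.1 2.1 146.0; 212.1 5.6 154.1; ...
     226.1 5.0 162.3; 231.9 5.1 164.3; 239.0 0.7 167.6];
[n, p] = size(x);

% Route 1: diagonal of the inverse correlation matrix.
xc = x - repmat(mean(x), n, 1);               % centre each column
xs = xc ./ repmat(sqrt(sum(xc.^2)), n, 1);    % scale each column to unit length
R  = xs' * xs;                                % correlation matrix of x
vif_inv = diag(inv(R));

% Route 2: auxiliary regression of each x_j on the other regressors.
vif_reg = zeros(p, 1);
for j = 1:p
    others = [ones(n, 1), x(:, [1:j-1, j+1:p])];      % constant term included
    resid  = x(:, j) - others * (others \ x(:, j));   % least-squares residuals
    R2j    = 1 - sum(resid.^2) / sum(xc(:, j).^2);    % squared multiple correlation
    vif_reg(j) = 1 / (1 - R2j);
end

disp([vif_inv, vif_reg]);   % the two columns should agree
```

The auxiliary regressions must contain the column of ones; without it the $R_j^2$ values, and therefore the VIFs, no longer match the diagonal of $R^{-1}$.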

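If VIF < 5, the regressors are regarded as free of multicollinearity. If VIF > 10, multicollinearity among the regressors is regarded as severe. In this example:

    diag(inv(R))
    ans =
       1.0e+002 *
       1.79722747043643
       0.01023478872590
       1.79843993838056

    VIF=max(diag(inv(R)))
    VIF = 1.798439938380555e+002

The VIF is far greater than 10, so multicollinearity is severe. Note: the result printed in the book is wrong; I recomputed it with SPSS and obtained the same result as here.

The variance inflation factor can also be computed from an auxiliary regression:

    x1=x(:,1);x2=x(:,2);x3=x(:,3);
    [b,bint,r,rint,stats]=regress(x1,[ones(11,1) x2 x3]);   % the constant term is required
    1/(1-stats(1))
    ans = 1.797227470435788e+002

5.1.3 Tolerance diagnosis

If $R_j$ denotes the multiple correlation coefficient of $x_j$ on the other p-1 regressors, then:

$$Tol_j = 1 - R_j^2.$$

The tolerance is the reciprocal of the variance inflation factor. The smaller it is, the stronger the collinearity among the regressors; a value below 0.1 indicates high collinearity. In this example:

    Tol=1./diag(inv(R))
    Tol =
       0.00556412594649
       0.97705973887803
       0.00556037473734

The smallest value is far below 0.1, so the data are highly multicollinear.

5.1.4 Variance-proportion diagnosis (see p. 84 of Applied Econometrics using MATLAB)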
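For orientation, the quantities reported in this subsection can be written compactly as follows. This is a sketch in the usual Belsley-Kuh-Welsch notation, not a quotation from either book; the symbols $\mu_j$, $v_{ij}$, $\phi_{ij}$ and $\pi_{ji}$ are assumptions chosen to mirror the variables `d`, `v`, `phi` and `pi` in the code of this subsection, and x here denotes the regressor matrix including the constant column.

```latex
\begin{aligned}
x &= U D V^{\mathsf T}, \qquad D = \operatorname{diag}(\mu_1, \dots, \mu_p), \qquad V = (v_{ij}),\\
\operatorname{Var}(\hat\beta_i) &= \sigma^2 \sum_{j=1}^{p} \frac{v_{ij}^2}{\mu_j^2},\\
\phi_{ij} &= \frac{v_{ij}^2}{\mu_j^2}, \qquad \phi_i = \sum_{j=1}^{p} \phi_{ij}, \qquad \pi_{ji} = \frac{\phi_{ij}}{\phi_i},\\
K_j(x) &= \frac{\mu_{\max}}{\mu_j}.
\end{aligned}
```

Row j of the variance-decomposition table belongs to the singular value $\mu_j$ (with condition index $K_j(x)$) and column i to the coefficient $\hat\beta_i$, so each column of the table sums to 1.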

Note: on p. 84 of Applied Econometrics using MATLAB, equation 4.4 is wrong; equations 4.3, 4.5 and 4.6 are correct.

Using the same 1988-1998 data as in the table of Section 5.1:

    x1=[149.3, 4.2, 108.1; 161.2, 4.1, 114.8; 171.5, 3.1, 123.2; 175.5, 3.1, 126.9;
        180.8, 1.1, 132.1; 190.7, 2.2, 137.7; 202.1, 2.1, 146; 212.1, 5.6, 154.1;
        226.1, 5, 162.3; 231.9, 5.1, 164.3; 239, 0.7, 167.6];
    x=[ones(size(x1,1),1),x1];
    vnames=strvcat('constant','x1','x2','x3');
    fmt='%12.6f';
    bkw(x,vnames,fmt);

    Belsley, Kuh, Welsch Variance-decomposition
    K(x)     constant         x1          x2          x3
       1     0.000000    0.000051    0.000000    0.000012
     140     0.000006    0.140284    0.598136    0.116948
     188     0.000011    0.680208    0.375680    0.646263
    1978     0.999983    0.179457    0.026184    0.236777

At K(x) = 188 two variance proportions exceed 0.5, so x1 and x3 may be collinear. As a rule of thumb, multicollinearity is indicated when K(x) > 30 and two or more variance proportions in that row exceed 0.5.

The table above is computed as follows:

    [nobs,nvar] = size(x);
    [u,d,v] = svd(x,0);
    lamda = diag(d(1:nvar,1:nvar));
    lamda2 = lamda.*lamda;
    v = v.*v;
    phi = zeros(nvar,nvar);
    for i=1:nvar
        phi(i,:) = v(i,:)./lamda2';
    end
    pi = zeros(nvar,nvar);
    for i=1:nvar
        phik = sum(phi(i,:));
        pi(i,:) = phi(i,:)/phik;
    end
    pi'
    ans =
       0.00000000000428   0.00005121386618   0.00000000753568   0.00001248205274
       0.00000606371588   0.14028373758356   0.59813616157856   0.11694753752154
       0.00001066784928   0.68020848180663   0.37568029573565   0.64626252084123
       0.99998326843056   0.17945656674363   0.02618353515011   0.23677745958449

K(x) is computed as follows:

    [u,d,v] = svd(x,0);
    d1=diag(d)
    d1 =
       1.0e+002 *
       8.02837099962981
       0.05746006631173
       0.04260231288293
       0.00405906633001

    kx=[d1(1)/d1(1);d1(1)/d1(2);d1(1)/d1(3);d1(1)/d1(4)]
    kx =
       1.0e+003 *
       0.00100000000000
       0.13972087947263
       0.18844918166043
       1.97788613117054

5.2 Treating multicollinearity

See Li Jinghua (李景华, ed.), 经济计量学, 中国商业出版社, Chapter 4.

5.2.1 Ridge regression

The data are again those of the 1988-1998 table in Section 5.1:

    x=[149.3, 4.2, 108.1; 161.2, 4.1, 114.8; 171.5, 3.1, 123.2; 175.5, 3.1, 126.9;
       180.8, 1.1, 132.1; 190.7, 2.2, 137.7; 202.1, 2.1, 146; 212.1, 5.6, 154.1;
       226.1, 5, 162.3; 231.9, 5.1, 164.3; 239, 0.7, 167.6];
    y=[15.9; 16.4; 19; 19.1; 18.8; 20.4; 22.7; 26.5; 28.1; 27.6; 26.3];
    bb = zeros(4,101);
    kvec = 0:0.01:1;
    count = 0;
    for k = 0:0.01:1
        count = count + 1;
        bb(:,count) = ridge(y,[ones(11,1) x],k);
    end
    plot(kvec',bb'),xlabel('k'),ylabel('b','FontName','Symbol')

In current releases of the Statistics Toolbox, ridge centres and scales the columns of x itself, so the column of ones should not be passed; ridge(y,x,k,0) returns the coefficients on the original scale with the intercept as first element.

Click the topmost line in the figure and delete it to obtain the ridge trace.

If the constant term should not appear in the figure at all:

    bb = zeros(3,101);
    kvec = 0:0.01:1;
    count = 0;
    for k = 0:0.01:1
        count = count + 1;
        bb(:,count) = ridge(y,x,k);
    end
    plot(kvec',bb'),xlabel('k'),ylabel('b','FontName','Symbol')

To see clearly how the coefficients change for k between 0 and 0.1:

    bb = zeros(3,11);
    kvec = 0:0.01:0.1;
    count = 0;
    for k = 0:0.01:0.1
        count = count + 1;
        bb(:,count) = ridge(y,x,k);
    end
    plot(kvec',bb'),xlabel('k'),ylabel('b','FontName','Symbol')

The coefficients are therefore basically stable at k = 0.04.

The ridge trace can also be drawn without ridge, with x and y as above:

    xb=zscore(x)/sqrt(10);
    x1=x(:,1);x2=x(:,2);x3=x(:,3);
    bb = zeros(3,11);
    kvec = 0:0.01:0.1;
    count = 0;
    for k = 0:0.01:0.1
        count = count + 1;
        bb(:,count) = inv(diag([norm(x1-mean(x1)) norm(x2-mean(x2)) norm(x3-mean(x3))]))*inv(xb'*xb+eye(3)*k)*xb'*y;
    end
    plot(kvec',bb'),xlabel('k'),ylabel('b','FontName','Symbol')

The y-axis of this figure shows the coefficients of the final model, that is, after conversion back to the original scale. To plot the same coefficients for k from 0 to 1, use bb = zeros(3,101), kvec = 0:0.01:1 and for k = 0:0.01:1 in the loop above.

The line xb=zscore(x)/sqrt(10) standardizes x, giving:

    xb =
      -0.47740966120325   0.17256712249066  -0.48483579038679
      -0.35189665728378   0.15339299776947  -0.38215648650315
      -0.24325935137029  -0.03834824944237  -0.25342422491769
      -0.20107010635534  -0.03834824944237  -0.19672072874315
      -0.14516935671053  -0.42183074386605  -0.11702932871405
      -0.04075097529853  -0.21091537193303  -0.03120782099041
       0.07948837299407  -0.23008949665421   0.09599191367141
       0.18496148553145   0.44100486858724   0.22012659448596
       0.33262384308377   0.32596012026013   0.34579380222414
       0.39379824835544   0.34513424498132   0.37644434069687
       0.46868415825698  -0.49852724275079   0.42701772917687

With k = 0.04:

    x1=x(:,1);x2=x(:,2);x3=x(:,3);
    b=inv(diag([norm(x1-mean(x1)) norm(x2-mean(x2)) norm(x3-mean(x3))]))*inv(xb'*xb+eye(3)*0.04)*xb'*y
    b =
       0.06334061443129
       0.58739760837860
       0.11592051232022

    b0=mean(y)-b(1)*mean(x1)-b(2)*mean(x2)-b(3)*mean(x3)
    b0 = -8.56959415249174

The resulting ridge regression equation is y = -8.56959 + 0.06334x1 + 0.5874x2 + 0.11592x3.

Residual sum of squares:

    sse=(norm(y-(b0+b(1)*x1+b(2)*x2+b(3)*x3)))^2
    sse = 2.42768928001254

Coefficient of determination:

    1-sse/norm(y-mean(y))^2
    ans = 0.98824073639984

Residual sum of squares under OLS:

    [bb,bint,r,rint]=regress(y,[ones(11,1) x]);
    norm(r)^2
    ans = 1.67142209436149

The ridge residual sum of squares is 45.25% larger.

The VIF values in the table on p. 210 of Yu Junnian's 计量经济学 are miscalculated. Here the VIF at a given k is taken as the diagonal of inv(xb'*xb+k*eye(3)); many texts instead define the ridge VIF as the diagonal of inv(R+k*I)*R*inv(R+k*I), which gives smaller values.

At k = 0.04:

    VIF=diag(inv(xb'*xb+0.04*eye(3)))
    VIF =
       11.9276
        0.9637
       11.9350

At k = 0.1:

    VIF=diag(inv(xb'*xb+0.1*eye(3)))
    VIF =
       5.1014
       0.9103
       5.1043

We therefore take k = 0.1:

    b=inv(diag([norm(x1-mean(x1)) norm(x2-mean(x2)) norm(x3-mean(x3))]))*inv(xb'*xb+eye(3)*0.1)*xb'*y
    b =
       0.0660
       0.5582
       0.1061

    b0=mean(y)-b(1)*mean(x1)-b(2)*mean(x2)-b(3)*mean(x3)
    b0 = -7.6286

The final equation is y = -7.6286 + 0.0660x1 + 0.5582x2 + 0.1061x3.

    sse=(norm(y-(b0+b(1)*x1+b(2)*x2+b(3)*x3)))^2
    sse = 2.9054

Coefficient of determination:

    1-sse/norm(y-mean(y))^2
    ans = 0.9859

    (2.9054-1.67142209436149)/1.67142209436149
    ans = 0.7383

The residual sum of squares is 73.83% larger than under OLS.
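The ridge computations above can be condensed into a few lines. The sketch below is an illustration, not the author's code: it uses only base MATLAB (no zscore, ridge or regress), the names `ridge_slopes`, `kvec` and `B` are hypothetical, and it assumes the same estimator as the text, inv(D)*inv(xb'*xb+k*I)*xb'*y, with D the diagonal matrix of column norms of the centred regressors. It also checks that k = 0 reproduces ordinary least squares.

```matlab
x = [149.3 4.2 108.1; 161.2 4.1 114.8; 171.5 3.1 123.2; 175.5 3.1 126.9; ...
     180.8 1.1 132.1; 190.7 2.2 137.7; 202.1 2.1 146.0; 212.1 5.6 154.1; ...
     226.1 5.0 162.3; 231.9 5.1 164.3; 239.0 0.7 167.6];
y = [15.9; 16.4; 19; 19.1; 18.8; 20.4; 22.7; 26.5; 28.1; 27.6; 26.3];
[n, p] = size(x);

xc = x - repmat(mean(x), n, 1);     % centred regressors
d  = sqrt(sum(xc.^2));              % column norms (diagonal of D)
xb = xc ./ repmat(d, n, 1);         % unit-length columns, as zscore(x)/sqrt(n-1)
R  = xb' * xb;                      % correlation matrix

% Hypothetical helper: ridge slopes on the original scale for a given k.
ridge_slopes = @(k) ((R + k * eye(p)) \ (xb' * y)) ./ d';

kvec = [0 0.04 0.1];
B = zeros(p + 1, numel(kvec));      % row 1: intercept, rows 2..p+1: slopes
for i = 1:numel(kvec)
    b = ridge_slopes(kvec(i));
    B(:, i) = [mean(y) - mean(x) * b; b];
end
disp(B);

% Ordinary least squares with a constant, for comparison with the k = 0 column.
b_ols = [ones(n, 1), x] \ y;
disp(max(abs(B(:, 1) - b_ols)));    % should be close to zero
```

Replacing `kvec` by 0:0.01:0.1 and plotting the rows B(2:end,:) against it gives the ridge trace of the slope coefficients.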

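Next we compute the standard errors and t values of the coefficients in y = -7.6286 + 0.0660x1 + 0.5582x2 + 0.1061x3; see p. 213 of Yu Junnian's 计量经济学.

    yb=zscore(y)/sqrt(10);     % standardized dependent variable
    bb=inv(xb'*xb+0.1*eye(3))*xb'*yb
    bb =
       0.4357
       0.2026
       0.4820

    sse=norm(yb-bb(1)*xb(:,1)-bb(2)*xb(:,2)-bb(3)*xb(:,3))^2

Caution: the figures printed in the rest of this subsection were obtained with xb(:,1) in all three terms of this expression, whereas the second and third terms must use xb(:,2) and xb(:,3) as written here. With the corrected expression the coefficient of determination of the standardized model equals that of the original-scale model, 0.9859, so sse is about 0.0141 rather than 0.0933. The standard errors below then shrink by a factor of about sqrt(0.0141/0.0933) = 0.39, the t ratios rise to roughly 4.6 to 5.1, and the P values fall below 0.01.

Values as originally reported:

    sse = 0.0933

    1-sse/norm(yb)^2
    ans = 0.9067

Estimated variance:

    sse/(11-3)
    ans = 0.0117

Standard errors of the standardized coefficients:

    sb=sqrt(VIF*sse/(11-3))
    sb =
       0.2439
       0.1030
       0.2439

or equivalently:

    sb=diag(sqrt(inv(xb'*xb+0.1*eye(3))*sse/(11-3)))
    sb =
       0.2439
       0.1030
       0.2439

Standard errors of the coefficients of the final equation:

    std(y)*sb./(std(x))'
    ans =
       0.0370
       0.2838
       0.0537

    P1=(1-tcdf(0.0660/0.0370,7))*2
    P1 = 0.1176
    P2=(1-tcdf(0.5582/0.2838,7))*2
    P2 = 0.0899
    P3=(1-tcdf(0.1061/0.0537,7))*2
    P3 = 0.0887

Therefore:

    y = -7.6286 + 0.0660x1 + 0.5582x2 + 0.1061x3
    Std. error     (0.0370)   (0.2838)   (0.0537)
    P value        (0.1176)   (0.0899)   (0.0887)

At the 0.12 significance level, all regression coefficients pass the test.

5.2.2 Principal component regression

For the theory, see Li Jinghua, 经济计量学, pp. 117-126.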
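As a reading aid for the code in this subsection, the estimator it computes can be sketched as follows. The notation is an assumption chosen to match the MATLAB variable names (`xb`, `A`, `A1`, `Z1`, `dinv`, `b`, `b0`); it is not quoted from the cited textbook.

```latex
\begin{aligned}
R &= x_b^{\mathsf T} x_b, \qquad R A = A \Lambda, \qquad Z = x_b A,\\
Z_1 &= x_b A_1 \quad (A_1:\ \text{eigenvectors whose eigenvalues are not close to } 0),\\
\hat\alpha &= (Z_1^{\mathsf T} Z_1)^{-1} Z_1^{\mathsf T} y,\\
\hat b &= D^{-1} A_1 \hat\alpha, \qquad D = \operatorname{diag}\bigl(\lVert X_j - \bar X_j \mathbf{1} \rVert\bigr),\\
\hat b_0 &= \bar y - \bar x\, \hat b .
\end{aligned}
```

Dropping the component whose eigenvalue is close to 0 removes exactly the direction $\varphi$ along which $x_b \varphi \approx 0$, which is the source of the instability in the OLS estimates.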

    x=[149.3, 4.2, 108.1; 161.2, 4.1, 114.8; 171.5, 3.1, 123.2; 175.5, 3.1, 126.9;
       180.8, 1.1, 132.1; 190.7, 2.2, 137.7; 202.1, 2.1, 146; 212.1, 5.6, 154.1;
       226.1, 5, 162.3; 231.9, 5.1, 164.3; 239, 0.7, 167.6];
    y=[15.9; 16.4; 19; 19.1; 18.8; 20.4; 22.7; 26.5; 28.1; 27.6; 26.3];
    x1=x(:,1);x2=x(:,2);x3=x(:,3);
    xb=zscore(x)/sqrt(10);
    dinv=inv(diag([norm(x1-mean(x1)) norm(x2-mean(x2)) norm(x3-mean(x3))]))
    dinv =
       0.0105   0        0
       0        0.1917   0
       0        0        0.0153

    [A,d]=eig(corrcoef(x))
    A =
       0.7070   0.0357   0.7063
       0.0080  -0.9991   0.0425
      -0.7072   0.0245   0.7066
    d =
       0.0028   0        0
       0        0.9983   0
       0        0        1.9990

    A1=A(:,[2,3])
    A1 =
       0.0357   0.7063
      -0.9991   0.0425
       0.0245   0.7066

    Z=xb*A
    Z =
       0.0067  -0.2013  -0.6725
       0.0227  -0.1752  -0.5121
       0.0069   0.0234  -0.3525
      -0.0033   0.0263  -0.2827
      -0.0232   0.4134  -0.2032
      -0.0084   0.2085  -0.0598
      -0.0135   0.2351   0.1142
      -0.0214  -0.4286   0.3049
      -0.0068  -0.3053   0.4931
       0.0149  -0.3215   0.5588
       0.0254   0.5252   0.6116

Because the first diagonal element of d is close to 0, take the second and third columns of Z:

    Z1=Z(:,[2,3])
    Z1 =
      -0.2013  -0.6725
      -0.1752  -0.5121
       0.0234  -0.3525
       0.0263  -0.2827
       0.4134  -0.2032
       0.2085  -0.0598
       0.2351   0.1142
      -0.4286   0.3049
      -0.3053   0.4931
      -0.3215   0.5588
       0.5252   0.6116

    b=dinv*A1*inv(Z1'*Z1)*Z1'*y
    b =
       0.0728
       0.6111
       0.1063

    b0=mean(y)-mean(x)*b
    b0 = -9.1416

The principal component regression model is y = -9.1416 + 0.0728x1 + 0.6111x2 + 0.1063x3.

The corresponding residual sum of squares is:

    sse=(norm(y-(b0+b(1)*x1+b(2)*x2+b(3)*x3)))^2
    sse = 2.4372

The coefficient of determination is:

    1-sse/norm(y-mean(y))^2
    ans = 0.9882

    [bbb,bint,rr,rint]=regress(y,[ones(11,1) x]);
    (norm(rr))^2
    ans = 1.6714

    (sse-1.6714)/1.6714
    ans = 0.4582

The residual sum of squares is 45.82% larger than under OLS.

Next, the regression coefficients of the model in which both the regressors and the dependent variable are standardized (specifically, the model after the principal component step):

    yb=zscore(y)/sqrt(10);
    xbb=A1*inv(Z
