英国诺丁汉大学讲义如何估计随机效应模型stata_第1页
英国诺丁汉大学讲义如何估计随机效应模型stata_第2页
英国诺丁汉大学讲义如何估计随机效应模型stata_第3页
英国诺丁汉大学讲义如何估计随机效应模型stata_第4页
英国诺丁汉大学讲义如何估计随机效应模型stata_第5页
已阅读5页,还剩34页未读 继续免费阅读

下载本文档

版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领

文档简介

1、MCMC Estimation for Random Effect Modelling The MLwiN experienceDr William J. BrowneSchool of Mathematical SciencesUniversity of Nottingham1ContentsRandom effect modelling, MCMC and MLwiN.Methods comparison Guatemalan child health example.Extendibility of MCMC algorithms:Cross classified and multipl

2、e membership models.Artificial insemination and Danish chicken examples.Further Extensions.2Random effect modelsModels that account for the underlying structure in the dataset.Originally developed for nested structures (multilevel models), for example in education, pupils nested within schools.An ex

3、tension of linear modelling with the inclusion of random effects.A typical 2-level model is Here i indexes pupils and j indexes schools.3MLwiNSoftware package designed specifically for fitting multilevel models.Developed by a team led by Harvey Goldstein and Jon Rasbash at the Institute of Education

4、 in London over past 15 years or so. Earlier incarnations ML2, ML3, MLN.Originally contained classical IGLS estimation methods for fitting models.MLwiN launched in 1998 also included MCMC estimation.My role in team was as developer of MCMC functionality in MLwiN during 4.5 years at the IOE.4Estimati

5、on Methods for Multilevel ModelsDue to additional random effects no simple matrix formulae exist for finding estimates in multilevel models.Two alternative approaches exist:Iterative algorithms e.g. IGLS, RIGLS, EM in HLM that alternate between estimating fixed and random effects until convergence.

6、Can produce ML and REML estimates.Simulation-based Bayesian methods e.g. MCMC that attempt to draw samples from the posterior distribution of the model.5MCMC AlgorithmConsider the 2-level modelMCMC algorithms work in a Bayesian framework and so we need to add prior distributions for the unknown para

7、meters.Here there are 4 sets of unknown parameters: We will add prior distributions 6MCMC Algorithm (2)The algorithm for this model then involves simulating in turn from the 4 sets of conditional distributions. Such an algorithm is known as Gibbs Sampling. MLwiN uses Gibbs sampling for all normal re

8、sponse models.Firstly we set starting values for each group of unknown parameters, Then sample from the following conditional distributions, firstly To get .7MCMC Algorithm (3)We next sample fromto get , thento get , then finallyTo get . We have then updated all of the unknowns in the model. The pro

9、cess is then simply repeated many times, each time using the previously generated parameter values to generate the next set8Burn-in and estimatesBurn-in: It is general practice to throw away the first n values to allow the Markov chain to approach its equilibrium distribution namely the joint poster

10、ior distribution of interest. These iterations are known as the burn-in.Finding Estimates: We continue generating values at the end of the burn-in for another m iterations. These m values are then average to give point estimates of the parameter of interest. Posterior standard deviations and other s

11、ummary measures can also be obtained from the chains.9Methods for non-normal responsesWhen the response variable is Binomial or Poisson then different algorithms are required.IGLS/RIGLS methods give quasilikelihood estimates e.g. MQL, PQL.MCMC algorithms including Metropolis Hastings sampling and Ad

12、aptive Rejection sampling are possible.Numerical Quadrature can give ML estimates but is not without problems. 10So why use MCMC?Often gives better estimates for non-normal responses.Gives full posterior distribution so interval estimates for derived quantities are easy to produce.Can easily be exte

13、nded to more complex problems.Potential downside 1: Prior distributions required for all unknown parameters.Potential downside 2: MCMC estimation is much slower than the IGLS algorithm.11The Guatemalan Child Health dataset.This consists of a subsample of 2,449 respondents from the 1987 National Surv

14、ey of Maternal and Child Helath, with a 3-level structure of births within mothers within communities.The subsample consists of all women from the chosen communities who had some form of prenatal care during pregnancy. The response variable is whether this prenatal care was modern (physician or trai

15、ned nurse) or not.Rodriguez and Goldman (1995) use the structure of this dataset to consider how well quasi-likelihood methods compare with considering the dataset without the multilevel structure and fitting a standard logistic regression.They perform this by constructing simulated datasets based o

16、n the original structure but with known true values for the fixed effects and variance parameters.They consider the MQL method and show that the estimates of the fixed effects produced by MQL are worse than the estimates produced by standard logistic regression disregarding the multilevel structure!

17、12The Guatemalan Child Health dataset.Goldstein and Rasbash (1996) consider the same problem but use the PQL method. They show that the results produced by PQL 2nd order estimation are far better than for MQL but still biased.The model in this situation is In this formulation i,j and k index the lev

18、el 1, 2 and 3 units respectively.The variables x1,x2 and x3 are composite scales at each level because the original model contained many covariates at each level.Browne and Draper (2004) considered the hybrid Metropolis-Gibbs method in MLwiN and two possible variance priors (Gamma-1(,) and Uniform.1

19、3Simulation ResultsThe following gives point estimates (MCSE) for 4 methods and 500 simulated datasets.Parameter (True)MQL1PQL2GammaUniform0 (0.65)0.474 (0.01)0.612 (0.01)0.638 (0.01)0.655 (0.01)1 (1.00)0.741 (0.01)0.945 (0.01)0.991 (0.01)1.015 (0.01)2 (1.00)0.753 (0.01)0.958 (0.01)1.006 (0.01)1.031

20、 (0.01)3 (1.00)0.727 (0.01)0.942 (0.01)0.982 (0.01)1.007 (0.01)2v (1.00)0.550 (0.01)0.888 (0.01)1.023 (0.01)1.108 (0.01)2u (1.00)0.026 (0.01)0.568 (0.01)0.964 (0.02)1.130 (0.02)14Simulation ResultsThe following gives interval coverage probabilities (90%/95%) for 4 methods and 500 simulated datasets.

21、Parameter (True)MQL1PQL2GammaUniform0 (0.65)67.6/76.886.2/92.086.8/93.288.6/93.61 (1.00)56.2/68.690.4/96.292.8/96.492.2/96.42 (1.00)13.2/17.684.6/90.888.4/92.688.6/92.83 (1.00)59.0/69.685.2/89.886.2/92.288.6/93.62v (1.00)0.6/2.470.2/77.689.4/94.487.8/92.22u (1.00)0.0/0.021.2/26.884.2/88.688.0/93.015

22、Summary of simulationsThe Bayesian approach yields excellent bias and coverage results.For the fixed effects, MQL performs badly but the other 3 methods all do well.For the random effects, MQL and PQL both perform badly but MCMC with both priors is much better.Note that this is an extreme scenario w

23、ith small levels 1 in level 2 yet high level 2 variance and in other examples MQL/PQL will not be so bad.16Extension 1: Cross-classified modelsFor example, schools by neighbourhoods. Schools will draw pupils from many different neighbourhoods and the pupils of a neighbourhood will go to several scho

24、ols. No pure hierarchy can be found and pupils are said to be contained within a cross-classification of schools by neighbourhoods : nbhd 1nbhd 2Nbhd 3School 1xxxSchool 2xxSchool 3xxxSchool 4xxxxSchool S1 S2 S3 S4Pupil P1 P2 P3 P4 P5 P6 P7 P8 P9 P10 P11 P12Nbhd N1 N2 N317NotationWith hierarchical mo

25、dels we use a subscript notation that has one subscript per level and nesting is implied reading from the left. For example, subscript pattern ijk denotes the ith level 1 unit within the jth level 2 unit within the kth level 3 unit.If models become cross-classified we use the term classification ins

26、tead of level. With notation that has one subscript per classification, that captures the relationship between classifications, notation can become very cumbersome. We propose an alternative notation introduced in Browne et al. (2001) that only has a single subscript no matter how many classificatio

27、ns are in the model.18Single subscript notationSchool S1 S2 S3 S4Pupil P1 P2 P3 P4 P5 P6 P7 P8 P9 P10 P11 P12Nbhd N1 N2 N3inbhd(i)sch(i)111221311422512622723833934102411341234We write the model asWhere classification 2 is neighbourhood and classification 3 is school. Classification 1 always correspo

28、nds to the classification at which the response measurements are made, in this case patients. For pupils 1 and 11 equation (1) becomes:19Classification diagramsSchoolPupilNeighbourhoodSchoolPupilNeighbourhoodNested structure where schools are contained within neighbourhoodsCross-classified structure

29、 where pupils from a school come from many neighbourhoods and pupils from a neighbourhood attend several schools.In the single subscript notation we lose information about the relationship(crossed or nested) between classifications. A useful way of conveying this information is with the classificati

30、on diagram. Which has one node per classification and nodes linked by arrows have a nested relationship and unlinked nodes have a crossed relationship.20Example : Artificial insemination by donor 1901 women279 donors 1328 donations12100 ovulatory cyclesresponse is whether conception occurs in a give

31、n cycle In terms of a unit diagram:DonorWomanCycleDonationOr a classification diagram:21Model for artificial insemination dataWe can write the model asParameterDescriptionEstimate(se)intercept-4.04(2.30)azoospermia *0.22(0.11)semen quality0.19(0.03)womens age35-0.30(0.14)sperm count0.20(0.07)sperm m

32、otility0.02(0.06)insemination to early-0.72(0.19)insemination to late-0.27(0.10)women variance1.02(0.21)donation variance0.644(0.21)donor variance0.338(0.07)Results:Note cross-classified models can be fitted in IGLS but are far easier to fit using MCMC estimation.22Extension 2: Multiple membership m

33、odelsWhen level 1 units are members of more than one higher level unit we describe a model for such data as a multiple membership model.For example, Pupils change schools/classes and each school/class has an effect on pupil outcomes. Patients are seen by more than one nurse during the course of thei

34、r treatment.23NotationNote that nurse(i) now indexes the set of nurses that treat patient i and w(2)i,j is a weighting factor relating patient i to nurse j. For example, with four patients and three nurses, we may have the following weights:n1(j=1)n2(j=2)n3(j=3)p1(i=1)0.500.5p2(i=2)100p3(i=3)00.50.5

35、p4(i=4)0.50.50Here patient 1 was seen by nurse 1 and 3 but not nurse 2 and so on. If we substitute the values of w(2)i,j , i and j. from the table into (2) we get the series of equations :24Classification diagrams for multiple membership relationshipsDouble arrows indicate a multiple membership rela

36、tionship between classifications.patientnurseWe can mix multiple membership, crossed and hierarchical structures in a single model.patientnursehospitalGP practiceHere patients are multiple members of nurses, nurses are nested within hospitals and GP practice is crossed with both nurse and hospital.

37、25Example involving nesting, crossing and multiple membership Danish chickensProduction hierarchy10,127 child flocks 725 houses 304 farmsBreeding hierarchy10,127 child flocks200 parent flocksChild flockHouseFarmParent flockAs a unit diagram:As a classification diagram:26Model and resultsParameterDes

38、criptionEstimate(se)intercept-2.322(0.213)1996-1.239(0.162)1997-1.165(0.187)hatchery 2-1.733(0.255)hatchery 3-0.211(0.252)hatchery 4-1.062(0.388)parent flock variance0.895(0.179)house variance0.208(0.108)farm variance0.927(0.197)Results:Note multiple membership models can be fitted in IGLS and this

39、model/dataset represents roughly the most complex model that the method can handle.Such models are far easier to fit using MCMC estimation.27Further Extensions / Work in progressMultilevel factor modelsResponse variables at different levelsMissing data and multiple imputationESRC grant: Sample size

40、calculations, MCMC efficiency & Model identifiabilityWellcome Fellowship grant for Martin Green28Multilevel factor analysis modellingIn sample surveys there are often many responses for each individual.Techniques like factor analysis are often used to identify underlying latent traits amongst these

41、responses. Multilevel factor analysis allows factor analysis modelling to identify factors at various levels/classifications in the dataset so we can identify shared latent traits as well as individual level traits.Due to the nature of MCMC algorithms by adding a step to allow for multilevel factor

42、models in MLwiN, cross-classified models can also be fitted without any additional programming! See Goldstein and Browne (2002,2005) for more detail.29Responses at different levelsIn a medical survey some responses may refer to patients in a hospital while others may refer to the hospital itself.Mod

43、els that combine these responses can be fitted using the IGLS algorithm in MLwiN and shouldnt pose any problems to MCMC estimation.The Centre for Multilevel modelling in Bristol are investigating such models as part of their LEMMA node in the ESRC research methods program. I am a named collaborator

44、for the Lemma project.They are also looking at MCMC algorithms for latent growth models. 30Missing data and multiple imputationMissing data is proliferate in survey research. One popular approach to dealing with missing data is multiple imputation (Rubin 1987) where several imputed datasets are crea

45、ted and then the model of interest is fitted to each dataset and the estimates combined.Using a multivariate normal response multilevel model to generate the imputations using MCMC in MLwiN is described in chapter 17 of Browne (2003)James Carpenter (LSHTM) has begun work on macros in MLwiN that auto

46、mate the multiple imputation procedure. 31Sample size calculationsAnother issue in data collection is how big a sample do we need to collect?Such sample size calculations have simple formulae if we can assume that an independent sample can be generated.If however we wish to account for the data stru

47、cture in the calculation then things are more complex.One possibility is a simulation-based approach similar to that used in the model comparisons described earlier where many datasets are simulated to look at the power for a fixed sample size.Mousa Golalizadeh Lehi will be joining me in February on

48、 an ESRC grant looking at such an approach. A 4th year MMath. student (Lynda Leese) is looking at the approach for nested models.32Efficient MCMC algorithmsIn MLwiN we have tended to use the simplest, most generally applicable MCMC algorithms for multilevel models.For particular models there are man

49、y approaches that may improve the performance / mixing of the MCMC algorithm.We will also investigate some of these methods in the ESRC grant.Browne (2004) looked at some reparameterisation methods for cross-classified models in a bird nesting dataset.A second 4th year MMath. student (Francis Bourch

50、ier) is looking at MCMC methods based around the IGLS representation of nested models which are interesting.33Model IdentifiabilityThe final part of the ESRC grant is to look at whether a model is identifiable/estimable given a particular set of data.Cross-classified datasets where there are few lev

51、el 1 units per higher level unit can result in each observation being factored into several random effects with very few observations being used to estimate each random effect.We are interested in establishing whether we can really estimate all parameters in such models.An example where we cant woul

52、d be a dataset of patients who are attended by doctors in wards. Now if there is only one doctor per ward and likewise one ward per doctor then we cannot tease out doctor and ward effects. Again this work was motivated by a bird nesting dataset.34Wellcome FellowshipMartin Green has been successful i

53、n obtaining 4 years of funding from Wellcome to come and work with me.The project is entitled Use of Bayesian statistical methods to investigate farm management strategies, cow traits and decision-making in the prevention of clinical and sub-clinical mastitis in dairy cows.Martin is a qualified vete

54、rinary and a specialist farm animal surgeon.He completed a PhD in 2004 at the University of Warwick veterinary epidemiology and used MCMC to fit binary response multilevel models in both MLwiN and WinBUGS to look at various factors that affect clinical mastitis in dairy cows. 35Wellcome FellowshipIn

55、 the 4 years we are aiming to analyse a huge dataset that Martin has been collecting in a Milk Development Council grant.In particular we will look at how farm management practices, environmental conditions and cow characteristics influence the risk of mastitis during the dry period.We will hope to

56、get interesting applied results and also some novel statistical methodology from the grant. MCMC will again play a big part.Martin has been appointed as a professor in the new vet school and will be working there 1 day a fortnight during the grant before moving there full time after the four years.3

57、6ConclusionsIn this talk we have attempted to show the flexibility of MCMC methods in attempting to match the complexity of data structures found in real problems.We have also contrasted the methods with the IGLS algorithm.Although we have used MLwiN, all the models considered here could also be fitted

温馨提示

  • 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
  • 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
  • 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
  • 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
  • 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
  • 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
  • 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。

评论

0/150

提交评论