Fudan Genetic Workshop 2005
Genetic Epidemiology in Populations
Genetic Association Studies: Studies with Unrelated Individuals
Fudan Genetic Workshop, Jan 04 2005

- Assessing risk for disease due to specific alleles/genotypes
- Contrast direct / indirect genetic association studies
- Discuss genetic models
- Tests and estimates of association between alleles/genotypes and disease
- Designs to be used

Learning Objectives for This Lecture
- Applications for genetic association analyses
- Direct versus indirect association
- Indirect association: marker-based disease association
  - Exploits LD between markers and unobserved disease variants
- Examples of direct and indirect association studies
- Study designs for direct and marker-based (LD) association

Genetic Association Studies in Epidemiology
Applications:
- Fine mapping
- Candidate gene effects
- Whole genome scan
- Gene environment effects
- Drug response modification

Genetic Association Studies in Epidemiology
Common Designs:
- Cross-sectional
- Case-control
- Cohort
- Clinical trial (a drug response hypothesis is really a case-control comparison)
- Case-family (trios, sibs, extended families)
- Case-only
Genetic epidemiology studies two different concepts

1. Direct method
- Testing whether a particular allele is a disease-predisposing (causative) allele
- Exposure status directly measured
- Sequence sketch (APOE gene on chromosome 19): ...GACTAAGGCCC [C/T] CCGTTCAAGGAA...
- Eg: A particular APOE allele (e4) changes protein isoform
- Genotype that particular site for association study

2. LD Mapping (indirect method)
- Exposure status not directly measured
- Rely on markers correlated with true exposure status
- This correlation is due to linkage disequilibrium
- Sequence sketch (APOE gene on chromosome 19): ...GACTAAGGCCC [C/T] CCGTTCAAGGA [A/G] CCTG...
- Eg: Genotype a nearby genetic marker among study participants
- Rely on correlation (LD) between these alleles to detect association!

Direct Tests
Test for association between observed genotypes and disease outcome

| Genotype | Case | Control | Total |
| CC | a | b | m1 |
| CT | c | d | m2 |
| TT | e | f | m3 |
| Total | nd | nh | N |

Observed sequences: CCCCCG, CCCCCG, CCCCCG, CCTCCG, CCTCCG, CCTCCG
Directly measure potentially predisposing alleles

Indirect Tests
Test for association between marker genotypes and disease outcome

| Marker | Case | Control | Total |
| AA | a | b | m1 |
| AG | c | d | m2 |
| GG | e | f | m3 |
| Total | nd | nh | N |

Observed sequences (functional site unobserved, shown as ?): CC?CCGCAG, CC?CCGCAG, CC?CCGCAG, CC?CCGCGG, CC?CCGCGG, CC?CCGCGG
Use marker genotypes as surrogate for functional genotype by exploiting any correlation between C and A

Direct Method: Candidate Allele Testing
- Must know potentially functional polymorphisms
- SNPs may offer set of candidate loci for this method
  - Eg, look at non-synonymous cSNPs (those in coding regions that are likely to be functional)
- Note: Assuming disease is caused by (relatively) common alleles
  - Common disease-common variant hypothesis
  - This hypothesis is controversial
- If genome-wide project: multiple comparisons - 30,000 genes. Even if only one functional locus/gene is tested, very high number of false positives due to chance

Indirect Method: LD Gene Mapping
General idea: Exploit the phenomenon of linkage disequilibrium (LD) between alleles of closely linked markers to identify genetic regions associated with disease status, i.e., exploit LD-induced correlation between observed marker alleles and potentially unobserved disease alleles.
Sequence sketch: ...GACTAAGGCCC [C/T] CCGTTCAAGGA [A/G] CCTG...
Rely on correlation (LD) between these alleles to detect association!

Indirect Association Studies in Epidemiology
Applications:
- Localization (what is the best estimate of the disease gene location?)
- Fine mapping
- Whole genome scan
- Candidate gene effects
- Gene environment effects
- Drug response modification

Localization
[Figure not recoverable from the extracted text]
Source: Uhl et al, 2001. AJHG 69:1290-1300.
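Since the indirect method depends entirely on the correlation (LD) between marker and disease alleles, it may help to see how that correlation is usually quantified. The slides do not name a specific LD measure; the sketch below assumes the standard two-locus quantities D, D' and r^2 for biallelic loci, and the function name `ld_stats` and its arguments are hypothetical.

```python
def ld_stats(p_hap, p_marker, p_disease):
    """Return (D, D', r^2) for two biallelic loci.

    p_hap     : frequency of the haplotype carrying both the marker allele
                and the disease allele
    p_marker  : frequency of the marker allele
    p_disease : frequency of the (possibly unobserved) disease allele
    """
    for p in (p_marker, p_disease):
        if not 0.0 < p < 1.0:
            raise ValueError("allele frequencies must lie strictly between 0 and 1")

    # D: departure of the haplotype frequency from independence.
    d = p_hap - p_marker * p_disease

    # D' rescales |D| by the largest value it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_marker * (1 - p_disease), (1 - p_marker) * p_disease)
    else:
        d_max = min(p_marker * p_disease, (1 - p_marker) * (1 - p_disease))
    d_prime = abs(d) / d_max

    # r^2: squared correlation between the two alleles.
    r2 = d * d / (p_marker * (1 - p_marker) * p_disease * (1 - p_disease))
    return d, d_prime, r2


# Illustrative frequencies only (not taken from the lecture).
print(ld_stats(0.5, 0.5, 0.5))   # marker and disease allele always travel together
print(ld_stats(0.3, 0.3, 0.5))   # D' is 1 but r^2 is well below 1
```

In the second call the marker allele only ever occurs on disease-allele haplotypes, so D' reaches 1, yet r^2 stays below 1 because the allele frequencies differ; r^2 is the quantity that tracks how well a marker stands in for the unobserved variant in an association test.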
Candidate gene LD studies
Instrumental in identifying genetic risk factors for disease
- HLA
  - Type 1 diabetes
  - Rheumatoid arthritis
  - Multiple sclerosis
- APOE
  - Alzheimer's disease
  - Atherosclerosis
- Angiotensin Converting Enzyme (ACE)
  - Myocardial infarction
  - Atherosclerosis

Candidate gene LD studies
- Polymorphism in candidate genes can be used as a marker for variation in the gene.
- The higher the LD between the causative variant and the marker alleles, the better the surrogate information provided by the marker.

| Marker | Case | Control | Total |
| AA | a | b | m1 |
| AG | c | d | m2 |
| GG | e | f | m3 |
| Total | nd | nh | N |

Observed sequences: CC?CCGCAG, CC?CCGCAG, CC?CCGCAG, CC?CCGCGG, CC?CCGCGG, CC?CCGCGG
The power of G to detect disease risk at ? is only as strong as the LD between the loci!

Candidate gene LD studies
- Use genetic markers to test for variation in that gene that may predispose to disease risk
- Marker alleles are surrogates for the (unobserved) disease allele due to LD

| Marker allele | Cases | Controls | Total |
| A (surrogate for unobserved disease allele A_dx) | a | b | M_A |
| a | c | d | M_a |
| Total | M_case | M_ctrl | 1 |

Genotype-level tests for disease association
Test for association between genotypes and disease outcome

| G | Case | Control | Total |
| 11 | a | b | m1 |
| 12 | c | d | m2 |
| 22 | e | f | m3 |
| Total | nd | nh | N |

$$\chi^2_2 = \sum_{i=1}^{6} \frac{(O_i - E_i)^2}{E_i}$$

Ex: Alzheimer's disease and APOE SNPs
Single-marker χ² tests for disease association: marker M1

| G | Case | Control | Total |
| CC | 101 (.50) | 116 (.75) | 217 |
| CT | 75 (.37) | 32 (.21) | 107 |
| TT | 26 (.13) | 6 (.04) | 32 |
| Total | 202 | 154 | 356 |

$$\chi^2_2 = \frac{(101-123.13)^2}{123.13} + \frac{(116-93.87)^2}{93.87} + \frac{(75-60.71)^2}{60.71} + \frac{(32-46.29)^2}{46.29} + \frac{(26-18.16)^2}{18.16} + \frac{(6-13.84)^2}{13.84} = 24.80$$

Modeling issues
- If we do measure the putative variation of interest, what kinds of statistical analyses should be performed?
- Most general genetic model for risk: assume no a priori relationship between heterozygote and homozygote risk
- Consider each genotype as a separate exposure category:

| Genotype | Risk parameter | Relative risk |
| AA | R_AA | RR_AA |
| A* | R_A* | RR_A* |
| ** | R_** | 1 |

In a logistic regression model:

$$\ln\frac{P(D)}{1 - P(D)} = \alpha + \beta_{AA} I_{AA} + \beta_{A*} I_{A*}$$

| Genotype | I_AA | I_A* |
| AA | 1 | 0 |
| A* | 0 | 1 |
| ** | 0 | 0 |

Modeling issues
- This requires 2 risk parameters
- What genetic models could be considered to improve power (reduce # of parameters to estimate)?
- May have more power if particular models can correctly be assumed:
  - Dominant: RR_AA = RR_A* ≠ 1 = RR_**
  - Additive allele effects: RR_AA = 2 × RR_A*, RR_** = 1
  - Multiplicative allele effects: RR_AA = (RR_A*)², RR_** = 1
  - Recessive: RR_AA ≠ 1 = RR_A* = RR_**

| Genotype | Risk parameter | Dom | Rec | Add | Mult |
| AA | RR_AA | RR_A. | RR_AA | 2 RR_A* | (RR_A*)² |
| A* | RR_A* | RR_A. | 1 | RR_A* | RR_A* |
| ** | RR_** | 1 | 1 | 1 | 1 |

Logistic regression

$$\ln\frac{P(D)}{1 - P(D)} = \alpha + \beta_1 G_{12} + \beta_2 G_{22}$$

H0: β_i = 0
Genetic model interpretations: Assume the "11" genotype coding represents the genotype with lowest absolute risk (baseline), so that β1 is the effect of genotype 12 and β2 the effect of genotype 22.
- β1 = β2 = 0: no association with that polymorphism
- β1 = 0, β2 ≠ 0: (completely) recessive
- β1 = β2 ≠ 0: (completely) dominant
- 0 ... β1 ... [remaining case truncated in the source]

Possible interaction!

| Smoke | TGF | CP | Control | OR |
| No | G- | 72 | 127 | 1 |
| No | G+ | 12 | 19 | 1.11 |
| Yes | G- | 25 | 15 | 2.94 |
| Yes | G+ | 15 | 5 | 5.29 |

Logistic regression with interaction

$$\ln\frac{P(D)}{1 - P(D)} = \alpha + \beta_1 G_{12} + \beta_2 G_{22} + \beta_3 E + \beta_{g1}(G_{12} \cdot E) + \beta_{g2}(G_{22} \cdot E)$$

Test for interaction: H0: β_gi = 0

Modeling issues
For the Cleft example, dominance was modeled, so the logistic model would look like:

$$\ln\frac{P(D)}{1 - P(D)} = \alpha + \beta_1 G_1 + \beta_2 E + \beta_{g1}(G_1 \cdot E)$$

Interpretations of Indirect Association Studies
A positive association can mean:
- The targeted allele is causal
- The targeted allele is in LD with a causal allele
- There is confounding due to population stratification
- There is confounding or bias for some other reason
- Type I error

A negative finding can mean:
- The gene or region under study is not associated with disease risk
- The targeted allele is not in LD with the causal allele
- Appropriate stratification or other accommodation of heterogeneity was not identified
- Type II error (not enough power)
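The two numeric examples in this part of the lecture, the single-marker χ² test on the APOE M1 genotype table and the odds ratios in the smoking-by-TGF table, can be reproduced with a short sketch. The counts come from the slides; the function names `pearson_chi_square` and `odds_ratios`, and the interaction ratio at the end, are illustrative choices rather than the lecturer's own code.

```python
import math


def pearson_chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df


def odds_ratios(rows, baseline=0):
    """Odds ratio of each (cases, controls) row relative to the baseline row."""
    base_cases, base_controls = rows[baseline]
    return [(cases * base_controls) / (controls * base_cases)
            for cases, controls in rows]


# APOE marker M1: genotypes CC, CT, TT as (cases, controls).
apoe_m1 = [(101, 116), (75, 32), (26, 6)]
chi2, df = pearson_chi_square(apoe_m1)
# For df = 2 the chi-square tail probability has the closed form exp(-x / 2).
p_value = math.exp(-chi2 / 2)
print(f"chi-square = {chi2:.2f} on {df} df, p = {p_value:.2e}")

# Smoking-by-genotype strata: (No, G-), (No, G+), (Yes, G-), (Yes, G+).
smoke_tgf = [(72, 127), (12, 19), (25, 15), (15, 5)]
ors = odds_ratios(smoke_tgf)
# Ratio of the joint OR to the product of the single-exposure ORs;
# a value of 1 would mean no departure from a multiplicative model.
interaction_ratio = ors[3] / (ors[1] * ors[2])
print([round(x, 2) for x in ors], round(interaction_ratio, 2))
```

The interaction ratio is the exponentiated interaction coefficient of the dominance-coded logistic model, so testing whether it differs from 1 corresponds to the H0 on the interaction term. With only 5 controls in the doubly exposed stratum the estimate is imprecise, which is why the slide flags the interaction only as possible.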
Choice of Design and Analysis
- Sampling design
  - Unrelated individuals
  - Family-based sampling
- Unit of analysis
  - Single-locus
  - Haplotypes
- Statistical procedures
  - Chi-square tests
  - Likelihood-based tests
  - Asymptotic p values
  - Empirical significance from resampling
  - Appropriate significance thresholds

Choice of Design and Analysis
Design options for association studies
- Sampling unrelated individuals
- Family-based designs
These differ on how controls are defined

Study Designs for LD Mapping Strategies
Unrelated Samples (Eg, Case control)
- Unrelated controls are sampled from the same population as the cases
- Matched or unmatched on other factors
- Perform chi-squared test for association

| Marker genotype | Case | Control |
| AA | a | b |
| Aa | c | d |
| aa | e | f |

Family-based
- Ex: TDT, looks for excess transmission of particular alleles from parents to affected children
- Controls are untransmitted alleles
- For each individual, have a 2x2 table of 0s, 1s, or 2s
- Use all such tables to get a matched chi-square test for excess occurrence in cells b and c (McNemar's test)

Example: parents A,a and A,a with an affected child A,A

| | A - Not transmitted | a - Not transmitted |
| A - Transmitted | 0 | 2 |
| a - Transmitted | 0 | 0 |

Contrasts between epidemiologic and family-based association designs
Unrelated individuals
- Do not need parental genetic information
- Provides estimates of allele frequencies (if appropriately sampled)
- May be more powerful and simpler to collect for some phenotypes
- Opens the potential for confounding due to population stratification
Family-based designs
- Need parental genotypes (or at least other family members)
- Do not need unrelated controls
- Avoids population stratification confounding (analysis is matched by family)

Designs for LD studies of unrelated individuals
- All unrelated sampling LD methods look for association between markers and the outcome of interest across individuals
- The utility of this approach will be a function of the actual LD between the markers and the true disease allele(s)
- Cross-sectional, Case-control, Cohort
  - Analyses are similar; interpretations may be different because sampling and temporal relationships are different
- Clinical trial?
  - Such data are used to assess pharmacogenetic response outcomes. However, this is often a case-control design (the genotypes are not randomized!)
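As a companion to the family-based design described earlier in this part, the TDT tabulation of transmitted versus untransmitted parental alleles, tested with McNemar's statistic on the two discordant cells, might be sketched as follows. The input format (one `(transmitted, untransmitted)` pair per parent) and the function name `tdt` are assumptions made for illustration; the counts in the second call are invented.

```python
from math import erfc, sqrt


def tdt(transmissions, allele="A"):
    """McNemar-type TDT statistic from per-parent (transmitted, untransmitted) pairs.

    Only heterozygous parents are informative: homozygous parents transmit and
    withhold the same allele, so they land in the concordant cells and drop out.
    """
    # b: allele transmitted while the other allele was not transmitted
    b = sum(1 for t, u in transmissions if t == allele and u != allele)
    # c: the other allele transmitted while `allele` was not transmitted
    c = sum(1 for t, u in transmissions if t != allele and u == allele)
    if b + c == 0:
        raise ValueError("no informative (heterozygous) parents")
    stat = (b - c) ** 2 / (b + c)
    # Tail probability of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(stat / 2))
    return b, c, stat, p_value


# The slide's single family: both A,a parents transmit A to the A,A child.
print(tdt([("A", "a"), ("A", "a")]))

# Hypothetical sample: 30 heterozygous parents transmit A, 15 transmit a.
print(tdt([("A", "a")] * 30 + [("a", "A")] * 15))
```

Because each parent serves as its own matched control, the comparison is made within families, which is what protects the test from population stratification.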
Study Designs used for LD mapping
Case-Control design (unrelated individuals)
Advantages:
- Commonly used tool in epidemiology - methodology is well-known
- Convenient to collect - opportunity to draw very large samples
- More efficient recruitment than family-based sampling methods
- Population-based designs can allow the simultaneous characterization of disease allele frequency, penetrance, and attributable risk
- Unrelated controls can provide increased power in many situations

Study Designs used for LD mapping
Case-Control design (unrelated individuals)
Limitations:
- Difficult to establish phase when focusing on haplotypes
- May be susceptible to confounding due to stratification
- Difficult to measure parent-of-origin effects or other parent-specific effects
- Difficult to estimate recombination fractions (localization) using LD in an unrelated-subject setting

Case-Control Susceptibility to population stratification
- TDT and other family-based association methods are commonly used to guard against confounding due to population stratification
- But:
  - Requires recruitment of additional family members
    - Added cost
    - Family members may not be available
    - Inefficient if many members required or members are uninformative
  - Probably not as prevalent a problem as once thought
  - Can be dealt with through assessment and adjustment of population stratification within a case-control data set

Study Designs used for LD mapping
Case-Control design
Limitations:
- May be susceptible to confounding due to stratification
Solution: Measure potential confounding via genomic control
- Significant substructure is rare in preliminary empirical findings!
- Doesn't hurt to assess anyway!
- May still want to adjust for other confounding reasons

1) Adjustment in Analysis
Assess and correct for possible stratification
- Genomic control (Devlin and Roeder, 1999. Genomic control for association studies. Biometrics 55: 997-1004)
- Cluster analysis of genetic substructure based on markers (STRUCTURE; Pritchard et al., 2000. Inference of population structure using multilocus genotype data. Genetics 155(2): 945-59)
- FST

Solutions to Population Stratification
Problem: How to distinguish LD from allelic associations due to sub-structure?
Ex: Use Fst to detect gross genetic distances between cases and controls as indication of substructure

Genetic Distances Between Sample Populations in Renal Failure Susceptibility Study
(genetic distance, FST, above the diagonal; P-value below the diagonal)

| | Study 1 Cases | Study 1 Controls | Study 2 Cases | Study 2 Controls | Normotensives |
| Cases 1 | 1 | 0.00059 | 0.0007 | 0.0016 | -0.0005 |
| Controls 1 | 0.307 | 1 | 0.001 | 0.002 | 0.002 |
| Cases 2 | 0.218 | 0.049 | 1 | 0.0004 | 0.001 |
| Controls 2 | 0.039 | 0.039 | 0.267 | 1 | 0.148 |
| Normotensives | 0.742 | 0.029 | 0.059 | 0.148 | 1 |

Solutions to Population Stratification
Problem: How to distinguish LD from allelic associations due to sub-structure?
2) Matching in design
- Ethnicity-matched controls
  - Self-reported
  - Correlation between broad ethnicity and genetic background may not be very good
- Family controls
  - Matched by parents, so genetic substructure controlled
  - Next lecture!
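For the adjustment-in-analysis route, the genomic control idea cited above (Devlin and Roeder, 1999) is often applied with a median-based inflation factor: 1-df association statistics at presumably null markers are summarized by their median, and the candidate statistic is deflated accordingly. The sketch below assumes that estimator; the function names and the example statistics are hypothetical.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(null_stats):
    """Estimate lambda from 1-df chi-square statistics at unlinked null markers.

    Lambda is floored at 1 so that statistics are never inflated by the correction.
    """
    return max(1.0, median(null_stats) / CHI2_1DF_MEDIAN)


def genomic_control_adjust(candidate_stat, null_stats):
    """Return the deflated candidate statistic together with lambda."""
    lam = inflation_factor(null_stats)
    return candidate_stat / lam, lam


# Hypothetical 1-df statistics from five null markers and one candidate marker.
null_marker_stats = [0.2, 0.5, 0.9, 1.4, 0.7]
adjusted, lam = genomic_control_adjust(10.0, null_marker_stats)
print(round(lam, 3), round(adjusted, 3))
```

A lambda near 1 is consistent with the slide's remark that significant substructure is rare; a markedly larger value would suggest that case-control differences reflect ancestry as well as LD with a disease allele. In practice far more than five null markers would be needed for a stable median.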