Principles of Epidemiology & Statistics
Lixia Sun
JSIEC

Outline
- Epidemiology
  - Epidemiological methodology
  - Study design
- Statistics
  - Statistical description
  - Hypothesis testing
  - Risk estimation
- Software packages for statistical analysis

The Black-Box Theory
- Traditional epidemiology: Exposure -> Disease
- Molecular epidemiology: Exposure -> Susceptibility -> Early biologic effects -> Altered structure and function -> Disease

Epidemiologic basic concepts
- Incidence: the number of new cases of a given disease occurring over a defined period of time, divided by the total number of persons at risk at the start of the same time period.
- Prevalence: The
  prevalence rate of a disease is defined as the number of persons with a given disease at any one time, divided by the total number of persons in the population at risk at that time.

Sensitivity and Specificity
- Sensitivity: the likelihood that an individual with a disease will test positive for that disease
  - Requires a gold standard
  - In screening, frequently less important than specificity (more later)
- Specificity: the likelihood that an individual who does not have a disease will test negative for that disease
  - Often much more important in screening for disease (more later)

Predictive Value: Prevalence = 2%

            Gold standard positive    Gold standard negative
  Test +    18 (a)                    100 (b)
  Test -    2 (c)                     900 (d)

- What is the sensitivity? 90% = a/(a + c)
- What is the specificity? 90% = d/(b + d)
- What is the PPV? 15% = a/(a + b)
- What is the NPV? 99.8% = d/(c + d)

Predictive Value: Prevalence = 10%

            Gold standard positive    Gold standard negative
  Test +    99                        100
  Test -    10                        900

- What is the sensitivity? 91%
- What is the specificity? 90%
- What is the PPV? 50%
- What is the NPV? 99%

Methods of Medical Study
- Microscopic methods
  - Basic medicine: molecular level (e.g., DNA, protein, cell, tissue, organ)
  - Clinical medicine: individual level (e.g., case report)
- Macroscopic methods
  - Epidemiology: population level (e.g., case-control study and cohort study)

Errors in Epidemiology
- Random error
  - Comes from sampling
  - Can be reduced with a larger sample size or estimated by statistical analysis
- Systematic error (bias)
  - More
    serious than random error
  - Can be introduced at any stage of a study

Biases in Epidemiology
- Selection bias: subjects are not representative of the population of interest
- Information bias: errors in the measurement of exposure and disease status
- Confounding bias: the estimated effect of an exposure is distorted by the effect of a third factor not taken into consideration

Controlling Confounding Bias
- At the design stage
  - Restriction
  - Matching
  - Randomization
- At the data analysis stage
  - Stratification (e.g., Mantel-Haenszel stratified analysis)
  - Multivariable analysis (e.g., logistic regression analysis)

Randomized Controlled Studies

The Double Blind Method

Placebo Effect
- Positive beliefs from patients
  - Minimize health problems and give more weight to positive effects
  - Take better care of themselves and comply better with the conditions of the experiment
  - Patients with positive beliefs about their treatment do better than patients without them
- Optimistic expectations from doctors
  - Evaluate patients' state of health more favorably
  - Communicate positive expectations to the patients

Randomized Controlled Studies
- Advantages
  - Unbiased distribution of confounders
  - Blinding more likely
  - Randomisation facilitates statistical analysis
- Disadvantages
  - Expensive: time and money
  - Volunteer bias
  - Ethically problematic at times

Cohort Studies
- Advantages
  - Ethically safe
  - Subjects can be matched
  - Can establish timing and directionality of events
  - Eligibility criteria and outcome assessments can be standardised
  - Administratively easier and cheaper than randomized controlled studies
- Disadvantages
  - The controls may be difficult to identify
  - Exposure may be linked to a hidden confounder
  - Blinding is difficult
  - Randomisation not present
  - For rare disease, large sample sizes or long follow-up necessary

Case Control Studies
- Advantages
  - Quick and cheap
  - Only feasible method for very rare disorders or those with a long lag between exposure and outcome
  - Fewer subjects needed than cross-sectional studies
- Disadvantages
  - Reliance on recall or records to determine exposure status
  - Confounders
  - Selection of control groups is difficult
  - Potential bias: recall, selection

Case Series and Case Reports

Systematic Reviews and Meta-analyses

Pitfalls Specific to Meta-analysis
- It is rare that the results of the different studies precisely agree
- Including studies that support the conclusion and omitting studies that do not
- Publication bias

Odds Ratio (OR)
- A measure of association indicating magnitude and direction
- Commonly used in epidemiology
- Approximates how much more likely (or unlikely) it is for the outcome to be present among those with “exposure” than among those without exposure
- Useful regardless of how data were collected
- OR ≈ RR when the disease is rare
- RR: relative risk or risk ratio, the ratio of the risk of developing a disease if exposed to the risk of developing the disease if unexposed

Interaction and Confounding
- Interaction (effect modification): there is an interaction between x and y when the effect of y on z depends on the level of x
  - Example: if the effect of smoking on the risk of developing lung cancer differs between males and females, then there is an interaction between smoking and gender
- Confounding occurs when the effect of variable x on z is distorted because we fail to control for variable y
  - We say that y is a confounder for the effect of x on z
  - This is different from interaction

Statistical description
- Normal distribution
- Asymmetrical distribution
- Mean
- Median
- Mode
- Standard deviation
- Variance
- Standard error
- Confidence interval
- Example data sets:
    1 2 3 4 5
    1 2 3 5 9 14 20

Statistical tests
- Parametric tests: only for normally distributed data, e.g., t-test
- Nonparametric tests: for asymmetrically distributed data or small samples, e.g., signed rank test

Univariate Analysis
- T-test
  - Unpaired t-test
  - Paired t-test
- One-way ANOVA
- Chi-square test
- Exact test
- Rank test
  - Mann-Whitney U test
  - Wilcoxon signed rank test
  - Kruskal-Wallis test
- Simple linear regression

Multivariate Analysis
- Two-way ANOVA
- Repeated-measures ANOVA
- Multivariate ANOVA
- Multiple linear regression
- Logistic regression
- Multiple analysis of covariance
- Cluster analysis
- Principal component analysis (PCA)
- Haplotype analysis

Contingency Table
- We are often interested in determining whether there is an association between two categorical variables
- Note that association does not necessarily imply
  causality
- In these cases, data may be represented in a two-dimensional table

                       Smoking
  Lung cancer    Smoker    Non-smoker
  Yes            a         c
  No             b         d

- The categorical variables can have more than two levels
- The variables may also be ordinal; however, this requires more advanced methods. For now, we consider the case in which both variables are nominal

Chi-square Test
- A hypothesis test:
  - H0: no association
  - H1: association
- Strategy: compare what is observed to what is expected if H0 is true (i.e., no association)
  - If the difference is large, then there is evidence of association
  - If the difference is not large, then there is insufficient evidence of association
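
The observed-versus-expected strategy can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for a statistics package: the function name `chi_square_2x2` and the counts are hypothetical, the cell layout is assumed to follow the smoking and lung cancer table (a, c in the "Yes" row; b, d in the "No" row), and no continuity correction is applied.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for a 2x2 table (no continuity correction).

    Assumed layout, matching the smoking / lung cancer table:
                   Smoker   Non-smoker
        Yes          a          c
        No           b          d
    """
    observed = [[a, c], [b, d]]
    row_totals = [a + c, b + d]
    col_totals = [a + b, c + d]
    n = a + b + c + d

    statistic = 0.0
    expected = []
    for i in range(2):
        expected_row = []
        for j in range(2):
            # Expected count if H0 (no association) is true.
            # Assumes no row or column total is zero.
            e = row_totals[i] * col_totals[j] / n
            expected_row.append(e)
            statistic += (observed[i][j] - e) ** 2 / e
        expected.append(expected_row)
    return statistic, expected


# Hypothetical counts, for illustration only.
stat, expected = chi_square_2x2(a=30, b=70, c=10, d=90)

# 3.841 is the 5% critical value of the chi-square distribution with 1 df.
CRITICAL_VALUE_DF1_ALPHA_05 = 3.841
evidence_of_association = stat > CRITICAL_VALUE_DF1_ALPHA_05
```

With these made-up counts every expected cell is well above 5 and the statistic exceeds the critical value, so H0 would be rejected at the 5% level.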
Chi-square Test
- Some limitations:
  - Does not describe the magnitude or the direction of the association
  - Relies on "large sample theory" (an assumption), which means that the test may be invalid if expected cell sizes are too small (< 5); avoid using it under these conditions

Software Packages for Statistical Analysis
- SAS (Statistical Analysis System)
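
Because the chi-square test gives neither magnitude nor direction, a measure such as the odds ratio is usually reported alongside it. The sketch below uses hypothetical function names and made-up counts, with cells laid out as in the smoking table (a = exposed with disease, b = exposed without, c = unexposed with, d = unexposed without). It also illustrates the earlier point that OR ≈ RR when the disease is rare; the relative risk calculation assumes cohort-style data in which risks can be estimated directly.

```python
def odds_ratio(a, b, c, d):
    """Odds of disease among the exposed (a/b) over the unexposed (c/d)."""
    return (a * d) / (b * c)


def relative_risk(a, b, c, d):
    """Risk of disease among the exposed over risk among the unexposed."""
    return (a / (a + b)) / (c / (c + d))


# Hypothetical rare disease: 0.2% risk if exposed, 0.1% if unexposed.
or_rare = odds_ratio(a=20, b=9980, c=10, d=9990)
rr_rare = relative_risk(a=20, b=9980, c=10, d=9990)

# Hypothetical common disease: 40% risk if exposed, 20% if unexposed.
or_common = odds_ratio(a=400, b=600, c=200, d=800)
rr_common = relative_risk(a=400, b=600, c=200, d=800)

print(f"rare:   OR = {or_rare:.3f}, RR = {rr_rare:.3f}")
print(f"common: OR = {or_common:.3f}, RR = {rr_common:.3f}")
```

In the rare-disease case the two measures nearly coincide, while in the common-disease case the odds ratio is noticeably further from 1 than the relative risk.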

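
The two predictive value tables earlier in these notes (prevalence of about 2% and about 10%) can be re-derived with a short Python sketch. The function name `screening_metrics` and its argument names are hypothetical; the cell mapping a = true positives, b = false positives, c = false negatives, d = true negatives is assumed from the formulas given with the first table.

```python
def screening_metrics(tp, fp, fn, tn):
    """Summary measures for a screening test against a gold standard.

    tp, fp, fn, tn correspond to cells a, b, c, d of the 2x2 table.
    """
    total = tp + fp + fn + tn
    return {
        "prevalence": (tp + fn) / total,
        "sensitivity": tp / (tp + fn),   # a / (a + c)
        "specificity": tn / (fp + tn),   # d / (b + d)
        "ppv": tp / (tp + fp),           # a / (a + b)
        "npv": tn / (fn + tn),           # d / (c + d)
    }


# Counts taken from the two tables in the notes.
low_prevalence = screening_metrics(tp=18, fp=100, fn=2, tn=900)
high_prevalence = screening_metrics(tp=99, fp=100, fn=10, tn=900)

for label, m in [("~2%", low_prevalence), ("~10%", high_prevalence)]:
    print(label, {name: round(value, 3) for name, value in m.items()})
```

Sensitivity and specificity stay at roughly 90% in both tables, yet the PPV rises from about 15% to about 50% as prevalence increases, which is why predictive values cannot be interpreted without knowing the prevalence.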