




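Classical True Score Measurement Theory (CTS)

Measurement theories and methods built on the true score assumption are collectively called classical test theory (CTT), also known as true score theory (CTS).

The true score (True Score) is the real value of the person being measured on the trait in question (such as ability, knowledge, or personality). The value (reading) obtained directly from a measuring instrument (such as a test scale or a measuring device) is called the observed value or observed score (Observed Score). Because measurement error exists, the observed value does not equal the real value of the trait: the observed score contains both a true score and an error score (Error Score). To estimate the true score, the measurement error must be separated out of the observed score.

Three assumptions and two corollaries of true score theory

Assumption (1): the true score is invariant.
In essence, the trait that the true score stands for must have some degree of stability. At least within the scope of the problem under discussion, or within a specific period of time, the individual's trait is a constant.

Assumption (2): the error is completely random.
[Axiom 1]: Measurement error is a normally distributed random variable with a mean of zero. Over many measurements the error is sometimes positive and sometimes negative. When it is positive, the observed score is higher than the actual score (the true score); when it is negative, the observed score is lower. The observed score therefore fluctuates up and down. If the measurement is repeated often enough, the positive and negative deviations cancel out and the mean of the measurement error is zero. In mathematical terms: E(E) = 0.
[Axiom 2]: The measurement error score is independent of the measured trait, that is, of the true score. In addition, measurement errors are independent of one another and of variables other than the measured trait. In other words, the correlations among them are zero. [Note: if such interactions are admitted, they can only be explained and computed with GT (generalizability theory).]

Assumption (3): the observed score is the sum of the true score and the error score.
X = T + E
[Meaning]: The relationship between the observed score and the true score is linear, not of any other form. The difference between them is the error score.

Corollary (1): the true score equals the mean of the observed scores, T = E(X) (Gulliksen, 1950).
[Meaning]: If a person's psychological trait could be measured repeatedly, a sufficient number of times, with parallel tests, the mean of the observed scores would approach the true score.

Corollary (2): in a set of measured scores, the variance of the observed scores equals the variance of the true scores plus the variance of the error scores.
S_X^2 = S_T^2 + S_E^2
[Note]: The error variance here is the variance of random error. The variance of systematic error is contained in the true score variance, which can be understood as: true score variance = variance related to the purpose of measurement + systematic variance unrelated to the purpose of measurement.

Classical measurement theory is built on the foundation of the true score assumptions. Its basic concepts include reliability, validity, item analysis, norms, and standardization.

Measurement Error

- Measurement error (or error variance) is a term that describes the VARIANCE in scores on a test that is not directly related to the purpose of the test.
- The performances of students on any test will tend to vary from each other, but their performances can vary for a variety of reasons.
- These variables fall into two general sources of variance:
  (a) those creating variance related to the purpose of the test (called meaningful variance), and
  (b) those generating variance due to other extraneous sources (called measurement error, or error variance).
- In order to minimize all the undesirable variance in students' scores that is unrelated to the test purpose, test developers must use the following tables as carefully as possible.

Valid sampling

To ensure valid sampling, one generally first selects a valid ability sample a from the target ability A, and then identifies the behaviors b that represent this ability sample a. These behaviors should then be a valid sample of the full set of target behaviors (B).

Assumed proposition (1): B - A
Assumed proposition (2): a - A
Assumed proposition (3): b - a
Derived proposition (4): b - B
  Once the assumed relations (1), (2), and (3) are established, we can derive the relation between b and B.
Derived proposition (5): b - A
  The target ability is inferred from the sample of behavior that is tested.

Does testing end there? Language measurement quantifies attributes of language behavior, so the measurement of the language behavior sample b must ultimately be expressed as a score or a grade, that is, as the measurement feedback F.

Assumed proposition (6): F - b
  F is a correct indicator of b.
Derived proposition (7): F - A

Reliability

In general, test reliability is defined as the extent to which the results can be considered consistent or stable.

Personal attributes that are not related to language ability include:
- individual characteristics such as
  - cognitive style and
  - knowledge of particular content areas
- group characteristics such as
  - sex
  - race
  - ethnic background

Random factors are largely unpredictable and temporary, such as
1) mental alertness or emotional state, and
2) uncontrolled differences in test method facets, e.g., changes of test environment from one day to the next.

The degree to which a test is consistent, or stable, can be estimated by calculating a reliability coefficient.

Two questions of principle:
- For reliability: How much variance in test scores is due to measurement error?
- For validity: What specific abilities account for the reliable variance in test scores?

Reliability and validity

- The point is that a test can be reliable without being valid. In other words, a test can consistently measure something other than that for which it was designed. (This is because reliability is a property of the test scores themselves, whereas validity is the accuracy with which the test scores are interpreted and used. The two are closely connected but different in nature.)
- Hence test reliability and validity, though related, are different test characteristics. In fact, reliability can be viewed as a precondition for validity; that is, a test cannot be valid unless it is first reliable.
- Validity is especially important when it is involved in the decisions that teachers regularly make about their students.
- Teachers certainly want to base their admissions, placement, achievement, and diagnostic decisions on tests that are actually testing what they claim to measure.
- Adopting, developing, and adapting tests for such decisions is difficult enough without having to also worry about whether the tests are measuring the wrong student characteristics, abilities, proficiencies, etc.

Validity

[Basic questions]
1) What attribute is being measured?
2) To what degree is the intended attribute actually measured?

(1) Validity concerns the test results. Test validity is the degree to which the test results are valid, not a property of the test itself.
(2) Validity is relative to the specific purpose of the test. It is not general. When evaluating the validity of a test, its specific use must therefore be considered, and it must be stated what the test is valid for measuring.
(3) Validity differs only in degree. It is not a matter of "present" versus "absent" and is described as "high", "medium", or "low".

Test validation does not examine the test content itself, nor the "validity" of the test scores themselves (test scores in themselves have no validity, only a question of reliability -LP). It examines the validity of the way the test scores are interpreted and used.

Content Relevance
- involves the specification of the ability domain (Bachman, 1990:42-4, about operationally defining constructs);
- requires the specification of the test method facets (ibid:119) (e.g., what it is that the test measures, the attributes of the stimuli that will be presented to the test takers, the nature of the responses that the test taker is expected to make).

Content Coverage
- We would wish to have a well-defined domain that specifies the entire set, or population, of possible test tasks;
- then we could follow a standard procedure for random sampling (or stratified random sampling, in the case of heterogeneous domains) to ensure that the tasks required by the test are representative of that domain.

Authenticity

“define authenticity as the degree of correspondence of the characteristics of a given language test task to the features of a TLU task” (Bachman & Palmer, 1996:23)

That is, authenticity is the degree to which the characteristics of a specific test task correspond to those of a TLU (target language use) task. For example, when developing a reading test, we should select passages whose characteristics (content, discourse structure, subject matter, etc.) match those of the materials that must be read in real reading situations.

Interactiveness

“We define interactiveness as the extent and type of involvement of the test taker's individual characteristics in accomplishing a test task.”

The individual characteristics:
1) language ability (language knowledge and strategic competence, or metacognitive strategies),
2) topical knowledge, and
3) affective schemata.

For example, if a test task requires test takers to relate the topical content of the input to their own relevant topical knowledge, the task is more interactive.

Impact

“Impact can be defined broadly in terms of the various ways in which test use affects the society, an education system, and the individuals.” (Bachman & Palmer, 1996: 39)

Impact operates at two levels:
a) a micro level, in terms of the individuals who are affected by the particular test use;
b) a macro level, in terms of the educational system or society.

Practicality

Practicality is included as a major concern in test design because, however valid and reliable a test may be, if it is not practical to administer in a specific context, it will not be taken up in that context.

Practicality covers a range of issues, such as
- the cost of development and maintenance
- test length
- ease of marking
- time required to administer
- ease of administration
- equipment availability, etc.

Practicality can be defined “as the relationship between the resources that will be required in the design, development, and use of the test and the resources that will be available for these activities.”

Practicality = available resources / required resources

If practicality >= 1, the test development and use is practical.
If practicality < 1, the test development and use is not practical.

Resources:
- Human resources: test writers, scorers, administrators, and clerical support
- Material resources: space; equipment; materials
- Time

[Figure: components of language competence. Recoverable labels: language competence; organizational competence (grammatical competence, textual competence); pragmatic competence (illocutionary competence, ...). The remaining labels are truncated in the source.]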
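The practicality ratio defined above (available resources divided by required resources, with 1 as the threshold) can be sketched in a few lines of Python. This is only an illustration: the function names and the scorer-hour figures are hypothetical and do not come from the source, and the sketch assumes both quantities are expressed in the same unit.

```python
def practicality(available: float, required: float) -> float:
    """Return available / required resources (both in the same unit)."""
    if required <= 0:
        raise ValueError("required resources must be positive")
    return available / required


def is_practical(available: float, required: float) -> bool:
    """Apply the threshold from the text: a ratio of at least 1 is practical."""
    return practicality(available, required) >= 1


# Hypothetical example: 120 scorer-hours available, 150 scorer-hours required.
ratio = practicality(available=120, required=150)
print(f"practicality = {ratio:.2f}, practical: {is_practical(120, 150)}")
```

In real test planning the human, material, and time resources listed above would each need their own comparison, since a surplus in one does not obviously offset a shortfall in another.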
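The true score model stated at the start of this document (observed score = true score + error, error mean of zero, error independent of the true score) can be checked numerically with a small simulation. The sketch below is a hedged illustration, not part of the source: the population mean, standard deviations, sample sizes, and function names are all assumed values chosen for the example. It also computes a reliability coefficient as the ratio of true score variance to observed score variance, which is the usual CTT definition but is not spelled out in this text, and compares it with the correlation between two simulated parallel forms.

```python
import random
import statistics


def simulate_parallel_forms(n_persons=20000, true_mean=50.0, true_sd=10.0,
                            error_sd=5.0, seed=0):
    """Simulate two parallel forms under X = T + E (all parameters assumed)."""
    rng = random.Random(seed)
    true_scores = [rng.gauss(true_mean, true_sd) for _ in range(n_persons)]
    # Errors are drawn independently of the true scores and of each other.
    errors_a = [rng.gauss(0.0, error_sd) for _ in range(n_persons)]
    errors_b = [rng.gauss(0.0, error_sd) for _ in range(n_persons)]
    form_a = [t + e for t, e in zip(true_scores, errors_a)]
    form_b = [t + e for t, e in zip(true_scores, errors_b)]
    return true_scores, errors_a, form_a, form_b


def pearson(xs, ys):
    """Pearson correlation computed from its definition."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return cov / (ss_x * ss_y) ** 0.5


def mean_of_repeated_measurements(true_score=60.0, error_sd=5.0,
                                  n_repeats=5000, seed=1):
    """Average many error-laden measurements of one fixed true score."""
    rng = random.Random(seed)
    return statistics.fmean(true_score + rng.gauss(0.0, error_sd)
                            for _ in range(n_repeats))


true_scores, errors, form_a, form_b = simulate_parallel_forms()
var_t = statistics.pvariance(true_scores)
var_e = statistics.pvariance(errors)
var_x = statistics.pvariance(form_a)

error_mean = statistics.fmean(errors)            # Axiom 1: close to 0
variance_gap = var_x - (var_t + var_e)           # Corollary 2: close to 0
reliability_ratio = var_t / var_x                # assumed CTT definition
parallel_forms_r = pearson(form_a, form_b)       # should be close to the ratio
repeated_mean = mean_of_repeated_measurements()  # Corollary 1: close to 60

print(f"mean error:            {error_mean:.3f}")
print(f"var(X) - var(T) - var(E): {variance_gap:.3f}")
print(f"var(T) / var(X):       {reliability_ratio:.3f}")
print(f"parallel-forms r:      {parallel_forms_r:.3f}")
print(f"mean of repeats:       {repeated_mean:.3f}")
```

With these assumed parameters the theoretical reliability is 100 / (100 + 25) = 0.8, so both the variance ratio and the parallel-forms correlation should land near that value. The small residual in the variance decomposition is the sample covariance between true scores and errors, which is zero only in expectation.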