An Overview of Corpus Research Methods

Topic Selection, Design, and Methods

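Put it all together
Li Wenzhong (李文中), National Research Centre for Foreign Language Education (中国外语教育研究中心), 2012
"Corpora are not for humans to learn, and regular expressions are not for women to learn."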

Corpus-driven is basically corpus-based. Any corpus-based research is necessarily driven by corpus data.

Goals of corpus analysis and research:
  • Verify hypotheses and intuitions
  • Obtain new findings
  • Establish new hypotheses
  • Construct new theories
  • Verify existing findings
  • Solve puzzles

Innovation may lie in: data; method; technique; interpretation / theory / perspective.
[Table residue: a grid of check marks over these four dimensions, with one cell marked "new"; the row labels are not recoverable.]

The corpus-based approach is a verification procedure. The corpus-driven approach is a discovery procedure.

Rationale: any perception is but inferencing.
  • World of reality vs. world of text: the "Einstein Gulf" (unbridgeable)
  • Six senses: eye, ear, nose, tongue, body, mind
  • Six sense objects: form, sound, smell, taste, touch, mental objects (dharma)
  • Study, inquire, reflect, discern, act
  • Text

Basic steps:
  • Decide on a topic
  • Formulate the questions
  • Determine the population and the sample
  • Choose the tools
  • Process the data
  • Describe the results: classify and summarize features (description)
  • Explain the results: observe, describe, explain (explanation)
  • Interpret the results: meaning, value, application (interpretation)

Identifying a problem
Something or a phenomenon that is:
  • Out of expectation
  • Incongruent
  • In need of a solution
  • Puzzling

Reading to be better informed
  • What has been done as a contribution
  • What has been left undone
  • What has been done wrong
Never count someone else's money.

Formulating research questions
  • Naming: What is …
  • Classificatory: How are they interrelated (patterned)?
  • Explanatory: To what extent do they co-occur?
  • Predictive: What will happen if …?
Never ask a question to which you already know the answer; never ask a "how to" question.

Finding a method
  • Population, sample, sampling
  • P (population) → S (sample) → R (result) → I (interpretation)
  • Labels on the diagram: sampling; validity; reliability; validity; generalizability
  • IF P → S, S → R, R → I, THEN I → P

Descriptive research
  • Single text
  • Text vs. text
  • People vs. text

Research questions
  • RQ1: How many different word forms are used in the text? How many running words are used? What is their distribution?
  • RQ2: To what extent can the level of difficulty of the text be computed on the basis of the graded word lists?
  • RQ3: How many different word classes are used? What is the number of each word class?

Method
To answer RQ1, generate a word list of the given text and observe:
  • The number of types
  • The number of tokens
  • The type/token ratio (TTR); if the text is very large, standardize the TTR
  • The types and their frequency
  • The cumulative percentage
To answer RQ2, compare the word list against a batch of graded word lists and observe:
  • How many types on the Level 1, 2, and 3 lists are used in the text, and what is their percentage? What about their tokens?
  • How many types that are not on any list are used in the text? Summarize their features.
To answer RQ3, retrieve each word class from the POS-tagged text and sort the results by frequency in decreasing order:
  • Retrieve all the nouns, verbs, and adjectives
  • Sort the list

Instruments
  • Use AntConc 3.0 to generate the word list
  • Use Range to compare and contrast the word list against a batch of graded word lists
  • Use PowerGREP to retrieve the word classes from the POS-tagged text

Explanatory research
  • Interrelationship (IR) between words
  • IR between phraseologies
  • IR between genres
Research on relationship: shape, direction, strength.

Research questions
  • What are the words that are unique to the text in terms of its subject matter?
  • To what extent are these words related to the subject/topic of the text?
  • What patterns of relationships exist among the keywords?

Method
  • Compare and contrast the word list of the observed text or corpus against the word list of the (larger) reference text or corpus
  • Observe and group the words within a classification framework

Instruments
  • AntConc 3.0

Other applications
  • Literary analysis
  • Automatic summarization

Research on word uses
Objectives:
  • Observe the collocates of a word
  • Study its patterns of use
  • Study the meanings associated with its patterns of use
  • Study the semantic prosody of its meaning

Research questions
  • What words collocate with the search word (SW)?
  • What is the strength of the collocability?
  • What is the pattern of the SW, and what is its semantic preference?
  • What is the semantic prosody of the pattern?

Method
  • Search the word (KW, SW, or node word) as KWIC
  • Observe its collocates and their word classes
  • Observe the meaning that is associated with the pattern
  • Observe its semantic prosody

Instruments
  • AntConc 3.0
  • Concordance; Sort: Level 1, Level 2, Level 3
  • Frequency count
  • Collocates; Sort
  • Sort POS tags

Research on chunks
Objectives:
  • To retrieve the multiword sequences
  • To examine the internal structure of such sequences
  • To obtain the sequences unique to a specific text

Research questions
  • What multiword sequences (in terms of n-grams) are found in the given text?
  • How are these sequences structured in terms of lexical-grammatical pattern?
  • How is the message conveyed associated with the overall structure of the sequences?

Method
  • Segment the text and generate a batch of lists of multiword sequences (of various lengths)
  • Observe the structure of the retrieved n-grams and examine their regularities
  • Study the semantic and pragmatic features

Instruments
  • kfNgram
  • WordSmith Tools v3.0
  • PowerGREP 3.5

Research on parallel texts
Objectives:
  • To observe how the source text was translated into the target text
  • To observe the probability of the translation units and corresponding units found in the text
  • To study the dynamics of the translation of a given community

Research questions
  • What translation units can be found in the parallel texts?
  • What corresponding units can be found in the texts?
  • What are the distributional features of the corresponding units of the identical genre?

Method
  • Observe the parallel texts (or construct the corpora)
  • Retrieve the units in focus
  • Examine the context of such units
  • Observe their patterns

Instruments
  • Parallel texts
  • Parallel concordancer
  • Corresponding units database

Contrastive studies
Objectives:
  • To do genre analysis
  • To determine the difficulty level of a text
  • To observe how a text is biased against its readers in terms of event presentation and evaluation
  • To compare a batch of word lists with the graded word lists

Research questions:
  • What genre features can be identified for a particular genre text, in terms of lexis, patterns, and textual organization?
  • How difficult is a text? Is it a suitable input text for the students at their present competence?
  • What contrastive features are displayed in the point of view, tone, and judgement of the text? How do these features affect the readers?
  • How to determine the vocabulary
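The word-list measures named for RQ1 of the descriptive research (types, tokens, TTR, standardized TTR, frequency with cumulative percentage) can be sketched in a few lines of Python. This is a minimal illustration, not the procedure AntConc itself follows: the regex tokenizer, the function names, and the default chunk size of 1000 tokens are all assumptions made for the example. The standardized TTR is taken here as the mean TTR over consecutive equal-sized chunks, which is one common way of standardizing.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and extract word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def ttr(tokens):
    """Type/token ratio: distinct word forms divided by running words."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0

def standardized_ttr(tokens, chunk_size=1000):
    """Mean TTR over consecutive full chunks of chunk_size tokens.

    Falls back to the plain TTR when the text is shorter than one chunk.
    """
    chunks = [tokens[i:i + chunk_size]
              for i in range(0, len(tokens) - chunk_size + 1, chunk_size)]
    if not chunks:
        return ttr(tokens)
    return sum(ttr(chunk) for chunk in chunks) / len(chunks)

def frequency_table(tokens):
    """Return (type, frequency, cumulative percentage) rows, most frequent first."""
    counts = Counter(tokens)
    total = len(tokens)
    rows, running = [], 0
    for word, freq in counts.most_common():
        running += freq
        rows.append((word, freq, 100.0 * running / total))
    return rows
```

Calling `frequency_table(tokenize(text))` on a text gives the rows needed to read off how quickly the most frequent types accumulate coverage of the running words.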

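The KWIC search and collocate observation described under research on word uses can be sketched as follows. This is a hedged illustration only: the tokenizer, the function names, and the default span of four words on either side are assumptions, and the collocate counts are raw co-occurrence frequencies, not a measure of collocational strength.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and extract word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def kwic(tokens, node, span=4):
    """Return (left context, node, right context) for every hit of the node word."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = tokens[max(0, i - span):i]
            right = tokens[i + 1:i + 1 + span]
            lines.append((" ".join(left), tok, " ".join(right)))
    return lines

def collocates(tokens, node, span=4):
    """Count the word forms that occur within span tokens of the node word."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
            counts.update(window)
    return counts

def sort_by_right(lines):
    """Sort concordance lines by the first word to the right of the node."""
    return sorted(lines, key=lambda line: line[2].split()[:1])
```

Sorting on the first word to the right groups lines with the same right-hand collocate together, which is what makes recurrent patterns around the node word visible when reading a concordance.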