Tuesday training materials: functional annotation and the use of Blast2GO (Trinity lecture)

… -first vs. assembly-first

… -first method: first align all the reads to a reference genome, then merge sequences with overlap. Tools: Scripture, … These promise maximum sensitivity but depend on correct read-to-reference alignment, so the reference genome must be available.

Assembly-first tools: ? These do not require any read-to-reference alignments, which is important when the genomic sequence is not available, is gapped, highly fragmented, or substantially altered, as in cancer cells.

Trinity, developed at the Broad Institute and the Hebrew University of Jerusalem, represents a novel method for the efficient and robust de novo reconstruction of transcriptomes from RNA-seq data. Trinity combines three independent software modules, Inchworm, Chrysalis, and Butterfly, applied sequentially to process large volumes of RNA-seq reads.

PS: The Broad is an institute affiliated with MIT and Harvard, working in biomedicine and …

Inchworm:

1. Decompose the reads and build a k-mer dictionary. “25-mers work very well for both highly and lowly expressed …” (de Bruijn dictionary). Options: --seqType; … (Jellyfish memory; ~1 GB RAM per 1M pairs of …).
2. Remove error-containing k-mers from the k-mer dictionary.
   1. Examine k-mers that have identical k-1 prefixes, differing only at their terminal nucleotide, and remove those k-mers that are <5% abundant compared to the most highly abundant k-mer of the …
   2. Error-containing k-mers will be greatly enriched within the low-abundance k-mer counts and can be excluded with minimal loss. Option: --min_kmer_cov
3. Select a seed k-mer …
4. Extend the seed k-mer … k-mer; each time a k-mer is used, remove it from the k-mer dictionary. The sequence yielded from the bidirectional seed k-mer extension is reported as a draft transcript contig.
5. Repeat seed selection and bidirectional k-mer extension until the k-mer dictionary is exhausted.
6. … with an average k-mer coverage of …; length at least (2*(k-…

Inchworm does a very good job at reconstructing transcripts from RNA-seq data, but since it leverages only unique k-mers for contig construction, it can only report the parts of alternatively spliced isoforms that are unique. Subsequent Trinity steps reconstruct the full-length alternatively spliced transcripts.

Because of the way contigs are constructed, no two contigs can share a k-mer, so these Inchworm contigs cannot fully represent the various alternative splicing forms and …

Chrysalis:

1. Groups contigs into connected … Criteria: common k-1 bases in a perfect …; read support: there are n reads that span across a potential junction (welds) and extend perfect matches by (k-1)/2 bases on each side; … (default: …
2. Build a de Bruijn graph for each … ((k-1)-mer …, k-mer …).
3. … reads: the reads are then mapped to components by selecting the component that shares the most (k-1)-mers with the read. Chrysalis also counts all k-mers (in assigned reads) and stores them as 'edge weight' (for the de Bruijn graph constructed in the last step) to indicate their support in the read set. Components with less than a minimum number of nodes are …

Butterfly resolves alternatively spliced and paralogous …

Graph …:
1. Merging consecutive nodes in linear paths in the de Bruijn graph to form nodes that represent longer sequences.
2. Pruning edges that represent minor deviations (supported by comparatively few reads), which likely correspond to sequencing errors.

… path …: traces the paths that reads and pairs of reads take within the graph and reports the most probable transcripts as a FASTA file.

Inchworm assembles the RNA-seq data into unique sequences of transcripts, often generating full-length transcripts for a dominant isoform, but then reports just the unique portions of alternatively spliced transcripts. Chrysalis clusters the Inchworm contigs and constructs complete de Bruijn graphs for each cluster. Each cluster represents the full transcriptional complexity for a given gene (or sets of genes that share sequences in common). Chrysalis then partitions the full read set among these disjoint graphs. Butterfly then processes the individual graphs in parallel, tracing the paths that reads and pairs of reads take within the graph, ultimately reporting full-length transcripts for alternatively spliced isoforms, and teasing apart transcripts that correspond to paralogous genes.

--… : Use if data are derived from a gene-dense compact genome, such as fungal genomes, where transcripts may often overlap in UTR regions. Trinity will examine the consistency of read pairings and fragment transcripts at positions that have little read-pairing support.

--SS_lib_type (strand-specific library type)

Paired:
RF: first read (/1) of the fragment pair is sequenced as antisense (reverse, R), and second read (/2) is in the sense strand (forward, F); typical of the dUTP/UDG sequencing method.
FR: first read (/1) of the fragment pair is sequenced as sense (forward), and second read (/2) is in the antisense strand (reverse).

Unpaired (single):
F: the single read is in the sense (forward) …
R: the single read is in the antisense (reverse) …

>comp2_c0_seq1 len=2364 path=[0:0-587 588:588-1076 1146:1077-2363]

comp2: sequence is derived from Chrysalis component #2.
c0: sequence also corresponds to Butterfly sub-component #0 (during graph compaction and pruning, some components are partitioned into disconnected subcomponents).
seq1: sequence count from Chrysalis component 2, Butterfly subcomponent zero. If this subcomponent yields multiple sequences, these will have different seq numbers.
len: length of the transcript.

“The general idea is to combine all your RNA-seq data and generate one assembly. Then, to align the reads from the different samples separately to the Trinity assemblies, computing abundance estimates based …
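
The Inchworm steps listed earlier (build a k-mer dictionary, drop likely error k-mers, pick a seed, extend it in both directions, repeat until the dictionary is empty) can be illustrated with a toy sketch. This is not Trinity's code: the function names, the tie-breaking rule, and the choice of the most abundant remaining k-mer as seed are assumptions made for illustration, and details such as reverse complements, --min_kmer_cov, and the final contig length and coverage filters are left out. It assumes k >= 2.

```python
from collections import Counter


def count_kmers(reads, k):
    """Build the k-mer dictionary: k-mer -> number of occurrences in the reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def drop_low_ratio_kmers(counts, min_ratio=0.05):
    """Among k-mers sharing the same (k-1) prefix, drop those whose count is
    below min_ratio of the most abundant k-mer with that prefix."""
    best = {}
    for kmer, n in counts.items():
        prefix = kmer[:-1]
        best[prefix] = max(best.get(prefix, 0), n)
    return Counter({kmer: n for kmer, n in counts.items()
                    if n >= min_ratio * best[kmer[:-1]]})


def extend(contig, counts, k, forward):
    """Greedily extend the contig one base at a time in one direction,
    removing each k-mer from the dictionary as soon as it is used."""
    while True:
        if forward:
            overlap = contig[-(k - 1):]
            candidates = [overlap + base for base in "ACGT"]
        else:
            overlap = contig[:k - 1]
            candidates = [base + overlap for base in "ACGT"]
        candidates = [c for c in candidates if c in counts]
        if not candidates:
            return contig
        best = max(candidates, key=lambda c: counts[c])  # most abundant wins
        del counts[best]
        contig = contig + best[-1] if forward else best[0] + contig


def greedy_assemble(reads, k):
    """Seed selection plus bidirectional extension until no k-mers remain."""
    counts = drop_low_ratio_kmers(count_kmers(reads, k))
    contigs = []
    while counts:
        seed = max(counts, key=counts.get)  # assumed: most abundant k-mer
        del counts[seed]
        contig = extend(seed, counts, k, forward=True)
        contig = extend(contig, counts, k, forward=False)
        contigs.append(contig)
    return contigs


# Two overlapping toy reads from one made-up transcript.
print(greedy_assemble(["ATGGCGTAC", "GCGTACCTGA"], k=4))
# prints ['ATGGCGTACCTGA']
```

Because every k-mer is deleted once used, no two contigs can share a k-mer, which is why this stage alone cannot report the shared parts of alternatively spliced isoforms.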

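For downstream scripts it can help to split the FASTA header shown above into its fields. The sketch below is a hypothetical helper, not part of Trinity. It assumes the header follows the comp/c/seq pattern from the example, that the entries inside path=[...] are separated by whitespace (the spacing was lost in the source), and that each entry reads node:start-end, with start and end being 0-based inclusive positions in the transcript.

```python
import re

# Pattern for headers such as:
# >comp2_c0_seq1 len=2364 path=[0:0-587 588:588-1076 1146:1077-2363]
HEADER_RE = re.compile(
    r"^>?comp(\d+)_c(\d+)_seq(\d+)\s+len=(\d+)\s+path=\[([^\]]*)\]\s*$"
)


def parse_trinity_header(line):
    """Split a header line into component, subcomponent, sequence number,
    length, and a list of (node, start, end) path entries."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized header: {line!r}")
    component, subcomponent, seq, length = (int(g) for g in match.groups()[:4])
    path = []
    for token in match.group(5).split():
        node, span = token.split(":")
        start, end = span.split("-")
        path.append((int(node), int(start), int(end)))
    return {
        "component": component,        # Chrysalis component number
        "subcomponent": subcomponent,  # Butterfly sub-component number
        "seq": seq,                    # sequence count within the sub-component
        "length": length,              # transcript length from len=
        "path": path,                  # assumed (node, start, end) triples
    }


info = parse_trinity_header(
    ">comp2_c0_seq1 len=2364 path=[0:0-587 588:588-1076 1146:1077-2363]"
)
print(info["component"], info["subcomponent"], info["seq"], info["length"])
# prints 2 0 1 2364
```

Under the assumed reading of the path entries, the three spans in the example cover 588 + 489 + 1287 = 2364 bases, which matches len.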