15. Novogene Training Course, Other Topics: Trinity Assembly
Trinity: Assembly
Transcriptional Regulation, 方彬彬, 2014-08-13

Transcriptome analysis workflow

Genome vs. transcriptome

Genome
  - Roughly uniform sequencing depth
  - Double-stranded DNA is sequenced
Transcriptome
  - Sequencing depth can vary by several orders of magnitude
  - Data can be strand-specific, so a transcriptome assembler needs to use strand-specific information to resolve sense and antisense transcripts
  - Different transcripts of the same gene share exons

Transcriptome assembly strategies

Mapping-first approaches
  - Method: first align all the reads to a reference genome, then merge sequences with overlapping alignments
  - Tools: Scripture, Cufflinks
  - Maximum sensitivity, but they depend on correct read-to-reference alignment and require that a reference genome is available
Assembly-first approaches
  - Tools: Trinity
  - Do not require any read-to-reference alignments; important when the genomic sequence is not available, is gapped, highly fragmented, or substantially altered, as in cancer cells

Introduction

Trinity, developed at the Broad Institute and the Hebrew University of Jerusalem, represents a novel method for the efficient and robust de novo reconstruction of transcriptomes from RNA-seq data. Trinity combines three independent software modules, Inchworm, Chrysalis, and Butterfly, applied sequentially to process large volumes of RNA-seq reads.

ps: The Broad is an institute affiliated with MIT and Harvard, with a legendary reputation in biomedical and genomics research.

Inchworm

1. Decompose the sequencing reads and build a k-mer dictionary (k = 25)
  - k-mer: "25-mers work very well for both highly and lowly expressed transcripts"
  - de Bruijn graph
  - Dictionary: kmer[sequence] = coverage
  - Related options: --seqType fq; --JM (Jellyfish memory; ~1G RAM per 1M pairs of reads)
2. Remove error-containing k-mers from the k-mer dictionary
  - Group k-mers that have identical (k-1)-base prefixes and differ only at their terminal nucleotide, and remove those k-mers that are <5% as abundant as the most highly abundant k-mer of the group
  - Error-containing k-mers are greatly enriched among the low-abundance k-mer counts and can be excluded with minimal loss
  - Related option: --min_kmer_cov 2
3. Select a seed k-mer. Criteria:
  1. highest coverage
  2. high complexity
  3. occurs at least 2 times in the reads
  4. not a palindromic sequence
4. Extend the seed k-mer into a contig
  - At each step, find in the k-mer dictionary the most abundant k-mer that overlaps the previous k-mer by k-1 bases, and extend the contig by 1 base. Iterate until no k-mer satisfies the condition. Each k-mer is removed from the dictionary once it has been used.
  - The sequence yielded from the bidirectional seed k-mer extension is reported as a draft transcript contig
5. Repeat seed selection and bidirectional k-mer extension until the k-mer dictionary is exhausted
6. Filter contigs
  - Average k-mer coverage of 2
  - Length of at least 48 (2*(k-1))

Inchworm does a very good job at reconstructing transcripts from RNA-seq data, but since it leverages only unique k-mers for contig construction, it can only report the parts of alternatively spliced isoforms that are unique. Subsequent Trinity steps reconstruct the full-length alternatively spliced transcripts.

Chrysalis

Because of the way contigs are constructed, no two contigs can share a sequence of k or more bases. Inchworm contigs therefore cannot represent alternative splicing forms, homologous genes, and similar cases well. Chrysalis clusters the contigs that overlap into components; each component is a collection of the possible representations of one set of alternatively spliced isoforms or homologous genes.

1. Group contigs into connected components
  - The contigs share a perfect overlap of k-1 bases
  - Read support: there are n reads that span the potential junction (welds) and extend the perfect match by (k-1)/2 bases on each side
  - Related option: --min_glue (default: 2)
2. Build a de Bruijn graph for each component
  - (k-1)-mer → node
  - k-mer → edge
3. Reads assignment
  - The reads are mapped to components by selecting the component that shares the most (k-1)-mers with the read. Chrysalis also counts all k-mers (in assigned reads) and stores them as "edge weights" (for the de Bruijn graph constructed in the previous step) to indicate their support in the read set.
4. Filter
  - Components with fewer than a minimum number of nodes are discarded

Butterfly

Graph simplification
  - Merging consecutive nodes in linear paths in the de Bruijn graph to form nodes that represent longer sequences
  - Pruning edges that represent minor deviations (supported by comparatively few reads), which likely correspond to sequencing errors
Path calling
  - Traces the paths that reads and pairs of reads take within the graph and reports the most probable transcripts as a FASTA file
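The Inchworm steps described earlier (k-mer dictionary, seed selection, greedy bidirectional extension, length filter) can be illustrated with a minimal sketch. This is not Trinity's implementation: the function names are hypothetical, the seed is chosen by coverage alone (the complexity, read-count, and palindrome checks are omitted), error pruning is skipped, ties are broken arbitrarily, and reverse complements are ignored.

```python
from collections import Counter


def build_kmer_dict(reads, k):
    """Count every k-mer across all reads: kmer -> coverage."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def extend_right(contig, kmers, k):
    """Grow the contig one base at a time to the right (assumes k >= 2)."""
    while True:
        suffix = contig[-(k - 1):]
        # Candidate k-mers overlap the contig's last k-1 bases.
        candidates = [(kmers[suffix + b], b) for b in "ACGT" if suffix + b in kmers]
        if not candidates:
            return contig
        _, base = max(candidates)      # most abundant candidate wins
        del kmers[suffix + base]       # each k-mer is used only once
        contig += base


def extend_left(contig, kmers, k):
    """Mirror image of extend_right, growing the contig to the left."""
    while True:
        prefix = contig[:k - 1]
        candidates = [(kmers[b + prefix], b) for b in "ACGT" if b + prefix in kmers]
        if not candidates:
            return contig
        _, base = max(candidates)
        del kmers[base + prefix]
        contig = base + contig


def greedy_contig(kmers, k):
    """Pick the highest-coverage k-mer as seed and extend it both ways."""
    seed = max(kmers, key=kmers.get)
    del kmers[seed]
    return extend_left(extend_right(seed, kmers, k), kmers, k)


def assemble(reads, k):
    """Repeat seed selection and extension until the dictionary is empty."""
    kmers = build_kmer_dict(reads, k)
    contigs = []
    while kmers:
        contig = greedy_contig(kmers, k)
        if len(contig) >= 2 * (k - 1):  # length filter from step 6
            contigs.append(contig)
    return contigs


if __name__ == "__main__":
    print(assemble(["ACGTTGCA", "CGTTGCAT"], k=4))  # → ['ACGTTGCAT']
```

Because every k-mer is deleted as soon as it is used, two contigs produced this way can never contain the same k-mer, which is why Chrysalis and Butterfly are needed to recover shared exons.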
Butterfly resolves alternatively spliced and paralogous transcripts

Summary

Other parameters

--jaccard_clip
  - Use if the data are derived from a gene-dense compact genome, such as a fungal genome, where transcripts may often overlap in UTR regions. Trinity will examine the consistency of read pairings and fragment transcripts at positions that have little read-pairing support.
--SS_lib_type (strand-specific library type)
  Paired reads:
  - RF: first read (/1) of the fragment pair is sequenced as antisense (reverse (R)), and the second read (/2) is in the sense strand (forward (F)); typical of the dUTP/UDG sequencing method
  - FR: first read (/1) of the fragment pair is sequenced as sense (forward), and the second read (/2) is in the antisense strand (reverse)
  Unpaired (single) reads:
  - F: the single read is in the sense (forward) orientation
  - R: the single read is in the antisense (reverse) orientation

Some parameters
  - --jaccard_clip (set if you have paired reads and you expect high gene density with UTR overlap)
  - --min_contig_length (minimum assembled contig length to report)
  - --min_glue (minimum number of reads needed to glue two Inchworm contigs together)

Other questions

>comp2_c0_seq1 len=2364 path=[0:0-587 588:588-1076 1146:1077-2363]
  - comp2: the sequence is derived from Chrysalis component #2
  - c0: the sequence also corresponds to Butterfly subcomponent #0 (during graph compaction and pruning, some components are partitioned into disconnected subcomponents)
  - seq1: sequence count from Chrysalis component 2, Butterfly subcomponent zero. If this subcomponent yields multiple sequences, these will have different seq numbers.
  - len: length of the transcript sequence

Separate assembly vs. pooled assembly

"The general idea is to combine all your RNA-seq data and generate one assembly. Then, to align the reads from the different samples separately to the Trinity assemblies, computing abundance estimates based on each read set. Finally, you do the differential expression analysis to identify those Trinity assemblies that are of interest. This is all outlined in the downstream analysis section of the documentation." --Brian J. Haas

N50/N90: sort the assembled transcripts by length from longest to shortest and accumulate their lengths; the length of the transcript at which the running total first reaches at least 50%/90% of the total length is the N50/N90.

Other questions

Assembly parameters: review
  - Parameters for analysis-only projects: --seqType fq --JM 200G --SS_lib_type RF --min_kmer_cov 2 --min_glue 2 --CPU 10
  - Trinity.fasta file, unigene.fasta file

Review

Transcriptome
  - Sequencing depth can vary by several orders of magnitude
  - Data can be strand-specific, so a transcriptome assembler needs to use strand-specific information to resolve sense and antisense transcripts
  - Different transcripts of the same gene share exons

How does Trinity solve these transcriptome assembly challenges?

Practice

1. What computing resources are required?
Ideally, you will have access to a large-memory server, having ~1G of RAM per 1M reads to be assembled (but often, much less memory may
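As a worked illustration of the N50/N90 definition given under "Other questions", the following is a small sketch; the function name `nxx` and the example lengths are made up for illustration and are not part of Trinity or its utilities.

```python
def nxx(lengths, percent=50):
    """Return the Nxx statistic (N50 for percent=50, N90 for percent=90).

    Transcripts are sorted from longest to shortest and their lengths are
    accumulated; the result is the length of the transcript at which the
    running total first reaches at least `percent` % of the total length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer arithmetic avoids floating-point rounding at the threshold.
        if running * 100 >= percent * total:
            return length


if __name__ == "__main__":
    transcript_lengths = [100, 200, 300, 400, 500]  # total = 1500
    print(nxx(transcript_lengths, 50))  # → 400 (500 + 400 = 900 >= 750)
    print(nxx(transcript_lengths, 90))  # → 200 (1400 >= 1350)
```

N90 is never larger than N50 for the same assembly, because reaching 90% of the total requires walking further down the sorted list.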
