Optimal Efficient Reconstruction of Root-Unknown Phylogenetic Networks with Constrained and Structured Recombination
Author: Dan Gusfield
Presentation by: C. Badri Narayanan

Agenda
  • Main problem: the root-unknown galled-tree problem
  • Solving the optimal root-unknown galled-tree problem

Root-Unknown Galled-Tree Problem
  • Given a set of sequences (say, M), find a galled-tree with the minimum number of recombinations if one exists; otherwise output "none".
  • Let's see the approach previously taken.

Points Considered in Theorem(s)
  • Only single-crossover recombinations are considered.
  • The algorithm will be extended to multiple-crossover recombinations.
  • Before seeing the approach, let's consider some definitions.

Definition of Terms
  • Trivial component: a node with no edges.
  • Component (a.k.a. connected or non-trivial component): for any pair of nodes, there is at least one path between those nodes.
  • Reduced galled-tree: a galled-tree in which no gall contains a character site from a trivial component.

Previous Approaches: A Roadmap
To construct a galled-tree for M with a known ancestral sequence (say, A):
  • Focus on each non-trivial component of the incompatibility graph separately.
  • For each component in the incompatibility graph, determine the site arrangement on a gall.
  • Connect the galls in a tree structure.
  • Place the sites from the trivial components.

Difficulties for Unknown Ancestral Sequence
  • For any two sequences S and S′ (in M), the conflict and incompatibility graphs may be different.
  • How do we know which (ancestral) sequence will allow a galled-tree?

Optimal Galled-Tree
  • A galled-tree that minimizes the number of recombinations over all galled-trees for a set of sequences (say, M) and over all choices of ancestral sequence is called an "optimal galled-tree".
  • The ancestral sequence of an optimal galled-tree is called an "optimal ancestral sequence".

Author's Approach: Theorem on Galled Trees
Finding an ancestral sequence:
  • If there is a galled-tree for M with some ancestral sequence, then there is an optimal galled-tree for M whose (optimal) ancestral sequence is one of the sequences in M.

Proof for the Theorem
  • T: an optimal galled-tree for M.
  • A: the ancestral sequence for T.
  • Every gall must have at least three edges branching off of it.

Proof continued
  • P: a path in T from the root to some leaf z that doesn't contain any recombination nodes.
  • Zz: the sequence labeling z, where Zz is in M.
  • Make Zz the ancestral sequence and reverse the directions of all edges on path P.
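The re-rooting step used in the proof can be illustrated with a small Python sketch. It rests on simplifying assumptions that are not from the slides: the network is a plain tree with no galls, every edge carries a single site index, and a mutation simply flips that site. The names `reroot_along_path` and `derive` are hypothetical helpers introduced only for this illustration.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str, int]  # (tail node, head node, mutated site index)


def reroot_along_path(edges: List[Edge], path: List[str]) -> List[Edge]:
    """Reverse every edge on `path` (old root ... leaf z); keep all labels."""
    on_path = set(zip(path, path[1:]))
    rerooted = []
    for tail, head, site in edges:
        if (tail, head) in on_path:
            rerooted.append((head, tail, site))  # direction flips, label stays
        else:
            rerooted.append((tail, head, site))
    return rerooted


def derive(edges: List[Edge], root: str, root_seq: str) -> Dict[str, str]:
    """Label every node by flipping the edge's site while walking down from the root."""
    children: Dict[str, List[Tuple[str, int]]] = {}
    for tail, head, site in edges:
        children.setdefault(tail, []).append((head, site))
    seqs = {root: root_seq}
    stack = [root]
    while stack:
        node = stack.pop()
        for head, site in children.get(node, []):
            s = list(seqs[node])
            s[site] = "1" if s[site] == "0" else "0"
            seqs[head] = "".join(s)
            stack.append(head)
    return seqs


# Toy tree (an assumption for illustration): root r, leaves z and b.
edges = [("r", "a", 0), ("a", "z", 1), ("r", "b", 2)]
before = derive(edges, "r", "000")

# Make the leaf sequence Zz the ancestral sequence and reverse path r -> a -> z.
rerooted_edges = reroot_along_path(edges, ["r", "a", "z"])
after = derive(rerooted_edges, "z", before["z"])
```

In this toy case every node keeps the same sequence label after re-rooting, so the same set of sequences is derived; only the direction of mutation on the reversed edges changes.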

Main Problem contd.
  • Each such reversal of edges changes the direction of mutation on those edges.
  • The reversal of edges doesn't change the labels on edges in T or the recombination node on a gall.
  • The modified tree T′ also derives M.
  • The ancestral sequence of T′ is Zz, which is a member of M.
  • T′ also contains the same number of galls, and hence T′ is also optimal.
  • Running time is O(n²m + n⁴), where n is the number of sequences and m is the length of each binary sequence.

Solving Optimal Root-Unknown Galled-Tree Problem
  • M can be derived on a galled-tree.
  • T*: an optimal galled-tree for M.
  • A*: an optimal ancestral sequence.

Connecting Galls of T*
Assumptions:
  • Every node v on a gall Q in T* is incident with exactly one edge whose other end is off of Q (a.k.a. "off-edge").
  • An off-edge may be directed into or out of a node (say, x).
Transform T* to T (conceptually) as follows:
  • Node 00100 (say, x) is incident with 2 edges.
  • A new edge (say, y) is introduced, and the 2 original edges (that were initially out of x) are connected from y.
  • T specifies how the galls of T* are connected to each other but does not show the internal arrangement of the sites on any gall.
  • If x is the root of T*, then create a new root and connect it with an edge to x.
  • Contract each gall Q in T* to a single node (say, q) and make all edges undirected.

Algorithmic Construction of T
  • Find a family of splits SP(T).
  • C1 and C2 are obtained from the incompatibility graph.
  • The leaf nodes of the tree (on the right side of the figure) are determined by the sites that have a unique combination of characters.

Extensions to Complex Biological Phenomena & Structured Recombination
Site-arrangement algorithm for gall Q corresponding to component C:
  • Let M(C) be matrix M restricted to the sites in C.
  • For each distinct sequence X in M(C):
      - Let M(C, X) be M(C) after removal of all rows with sequence X.
      - If there is an undirected perfect phylogeny T(C) for M(C, X) where all sites on C are contained in one path whose end sequences can be recombined (with single-crossover) to create sequence X, then output the pair (X, T(C)).
Multiple-crossover recombination:
  • Step 2 of the above algorithm is modified for multiple-crossover recombination.
  • Let Su(C) and Sy(C) denote two sequences. The following determines whether X can be created by a multiple-crossover recombination of Su(C) and Sy(C), starting with Su(C).
Algorithm:
  i = 1; Z = Su(C)
  do
    Find the longest substring of Z starting at position i that matches a substring of X starting at position i
    If none, return no
    else set i to
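The greedy matching idea in the algorithm above can be sketched in Python. The source text breaks off after "set i to", so the rest of the loop here is an assumption and not the author's stated procedure: the sketch advances i past the matched substring and switches Z to the other sequence. The function name `multi_crossover_count` is hypothetical, and positions are 0-based, whereas the slides start at i = 1.

```python
from typing import Optional


def multi_crossover_count(su: str, sy: str, x: str) -> Optional[int]:
    """Greedy check that x is a mosaic of su and sy, starting with su.

    Returns the number of crossovers used, or None if x cannot be formed.
    """
    if not (len(su) == len(sy) == len(x)):
        raise ValueError("sequences must have equal length")
    parents = (su, sy)
    current = 0      # Z = Su(C)
    i = 0            # 0-based start position
    crossovers = 0
    while i < len(x):
        z = parents[current]
        j = i
        # Extend the match of Z against X as far as possible from position i.
        while j < len(x) and z[j] == x[j]:
            j += 1
        if j == i:
            return None  # "If none, return no"
        i = j            # ASSUMPTION: continue right after the matched substring
        if i < len(x):
            current = 1 - current  # ASSUMPTION: switch Z to the other sequence
            crossovers += 1
    return crossovers


example = multi_crossover_count("00000", "11111", "00110")
```

For the toy sequences shown, `example` is 2: positions 0-1 come from the first sequence, positions 2-3 from the second, and position 4 from the first again. A return value of 1 corresponds to the single-crossover case.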
