生物资讯相关演算法Algorithms in Bioinformatics概要课件_第1页
生物资讯相关演算法Algorithms in Bioinformatics概要课件_第2页
生物资讯相关演算法Algorithms in Bioinformatics概要课件_第3页
生物资讯相关演算法Algorithms in Bioinformatics概要课件_第4页
生物资讯相关演算法Algorithms in Bioinformatics概要课件_第5页
已阅读5页,还剩43页未读 继续免费阅读

下载本文档

版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领

文档简介

生物資訊相關演算法

AlgorithmsinBioinformatics呂學一(中央研究院資訊科學所).tw/~hil/2003/12/161AlgorithmsinBioinformatics,Lecture11生物資訊相關演算法

AlgorithmsinBioinfFinalexam&

FinalpresentationWhentohaveourfinalexam?12/30,20031/6,20041/13,2004Deadlineforfinalpresentation1/20,2004.2003/12/162AlgorithmsinBioinformatics,Lecture11Finalexam&

FinalpresentatiTodayShortestsuperstringproblemIntermissionVotingforthedateoffinalexam小巨’smagic:“final”.2003/12/163AlgorithmsinBioinformatics,Lecture11TodayShortestsuperstringprobSuperstring11001011101isasuperstringof0101110011011111012003/12/164AlgorithmsinBioinformatics,Lecture11Superstring1100101110ShortestSuperstringInput:kstringsA1,A2,…,Ak.Output:ashorteststringthatcontainseachAiasasubstring.MotivationSequencingbyshotgunmethodsDatacompression2003/12/165AlgorithmsinBioinformatics,Lecture11ShortestSuperstringInput:200HardnessoftheproblemNP-completeness[Maier+Storer,1977]MAX-SNP-complete[Blum+Jiang+Li+Tromp+Yannakakis,STOC1991andJACM1994].2003/12/166AlgorithmsinBioinformatics,Lecture11HardnessoftheproblemNP-compApproximability3:[Blum,Jiang,Li,Tromp,Yannakakis,STOC’91]2+8/9:[Teng,Yao,STOC’93]2+5/6:[Czumajetal.,SWAT’94]2+50/63:[Kosarju,Park,Stein,FOCS’94]2+3/4:[Armen,Stein,WADS’95]2+2/3:[Armen,Stein,CPM’96]2.596:[Breslaurev,Jiang,Jiang,JALG’97]2.5:[Sweedyk,Ph.D.Thesis1995,SICOMP’99]2003/12/167AlgorithmsinBioinformatics,Lecture11Approximability3:[Blum,JiangTodayA4-approximationforShortestSuperstringProblem2003/12/168AlgorithmsinBioinformatics,Lecture11TodayA4-approximationforShoAnaturalassumptionForanytwodistinctindicesiandjbetween1andk,AiisnotasubstringofAj.Why?Otherwise,Aicanberemovedfromtheinputsetwithoutaffectingthesolution.2003/12/169AlgorithmsinBioinformatics,Lecture11AnaturalassumptionForanytwTherefore,…IfSisasuperstringoftwodistinctinputstringsAiandAj,thenAiandAj

cannotoccuratthesamepositionofS.AiAjS2003/12/1610AlgorithmsinBioinformatics,Lecture11Therefore,…IfSisasuperstrNotation頭(i,j)and疊(i,j).AjAiThetightestwaytooverlapAiandAj.2003/12/1611AlgorithmsinBioinformatics,Lecture11Notation頭(i,j)and疊(i,j).LiningupasetofinputstringsLet排(i1,i2,…,id)betheconcatenation 頭(i1,i2)頭(i2,i3)…頭(id-1,id)Aid.Ai1Ai2Aid排(i1,i2,…,id)2003/12/1612AlgorithmsinBioinformatics,Lecture11Liningupasetofinputstrin隊長定理|排(i1,i2,…,id)|=A1到Ad的總長扣掉所有相鄰兩個字串的重疊長度之和.Ai1Ai2Aid排(i1,i2,…,id)2003/12/1613AlgorithmsinBioinformatics,Lecture11隊長定理|排(i1,i2,…,id)|=A1到AdAnobservationThereisapermutationΠ*over{1,2,…,k}suchthat排(Π*)isashortestsuperstringofA1,A2,…,Ak.2003/12/1614AlgorithmsinBioinformatics,Lecture11AnobservationThereisapermuProofLetS*beashortestsuperstringofA1,A2,…,Ak.Were-indexA1,A2,…,Aksuchthat現(1)<現(2)<…<現(k),where現(i)istheleftmostoccurrenceofAiinS*.A1A2AkAiS*2003/12/1615AlgorithmsinBioinformatics,Lecture11ProofLetS*beashortestsupeClaim:|排(1,2,…,k)|=|S*|“≤”:S*showsawaytolineupA1,A2,…,Akinorder,althoughitisnotnecessarilythemost“compact”waytodothat.“≥”:排(1,2,…,k)isasuperstringofA1,A2,…,Ak,andS*isashortestone.2003/12/1616AlgorithmsinBioinformatics,Lecture11Claim:|排(1,2,…,k)|=|S*|“≤”:24-approximationbyshortestcoveringcyclicstrings2003/12/1617AlgorithmsinBioinformatics,Lecture114-approximationbyshortestcoCyclicstringIfSisastring,let環(S)denotethecyclicstringobtainedfromSby頭尾相接.Comment:EvenifRandSaretwodistinctstrings,環(R)=環(S)isstillpossible,e.g.,環(bbabbaab)=環(abbaabbb).bbabbaab2003/12/1618AlgorithmsinBioinformatics,Lecture11CyclicstringIfSisastring,馴服字串(coveringcyclicstring)IfAiisasubstringofSSS…,thenwesay環(S)isa馴服字串(coveringcyclicstring)ofAi.Denoted:Ai

環(S).Forexample,0101環(10)11001環(1001)10111環(1101)1101環(10100100101)2003/12/1619AlgorithmsinBioinformatics,Lecture11馴服字串(coveringcyclicstring)IfCommentsForanyAiand環(S)withAi環(S),both|Ai|≥|環(S)|and|Ai|

≤|環(S)|arepossible.IfSisasuperstringofA1,A2,…,Ak,then環(S)isa馴服字串ofallinputstringsA1,A2,…,Ak,.However,wecanfindashortest馴服字串forallinputstringsA1,A2,…,Ak,inpolynomialtime.2003/12/1620AlgorithmsinBioinformatics,Lecture11CommentsForanyAiand環(S)wiObservationIfAi

環(S),thentheleftmostoccurrenceofAiinSSS…hastobeatmost|S|.SSSAiAi2003/12/1621AlgorithmsinBioinformatics,Lecture11ObservationIfAi環(S),then馴服術假設t個inputstringsAj1,…,Ajt,都被環(S)馴服,而且Aj1,…,Ajt在SSS…中的leftmostoccurrence也是按照Aj1,…,Ajt的順序由左到右排列.則|排(j1,j2,…,jt)|≤|S|+|Ajt|.2003/12/1622AlgorithmsinBioinformatics,Lecture11馴服術假設t個inputstringsAj1,…,AjWhy?SSSAjtAj2Aj1SAj32003/12/1623AlgorithmsinBioinformatics,Lecture11Why?SSSAjtAj2Aj1SAj32003/12/16重疊定理(待證)IfAjt

環(St)andAjt+1

環(St+1),thenwehave|疊(jt,jt+1)|<|St|+|St+1|.2003/12/1624AlgorithmsinBioinformatics,Lecture11重疊定理(待證)IfAjt環(St)andAj生物資訊相關演算法

AlgorithmsinBioinformatics呂學一(中央研究院資訊科學所).tw/~hil/2003/12/1625AlgorithmsinBioinformatics,Lecture11生物資訊相關演算法

AlgorithmsinBioinfFinalexam&

FinalpresentationWhentohaveourfinalexam?12/30,20031/6,20041/13,2004Deadlineforfinalpresentation1/20,2004.2003/12/1626AlgorithmsinBioinformatics,Lecture11Finalexam&

FinalpresentatiTodayShortestsuperstringproblemIntermissionVotingforthedateoffinalexam小巨’smagic:“final”.2003/12/1627AlgorithmsinBioinformatics,Lecture11TodayShortestsuperstringprobSuperstring11001011101isasuperstringof0101110011011111012003/12/1628AlgorithmsinBioinformatics,Lecture11Superstring1100101110ShortestSuperstringInput:kstringsA1,A2,…,Ak.Output:ashorteststringthatcontainseachAiasasubstring.MotivationSequencingbyshotgunmethodsDatacompression2003/12/1629AlgorithmsinBioinformatics,Lecture11ShortestSuperstringInput:200HardnessoftheproblemNP-completeness[Maier+Storer,1977]MAX-SNP-complete[Blum+Jiang+Li+Tromp+Yannakakis,STOC1991andJACM1994].2003/12/1630AlgorithmsinBioinformatics,Lecture11HardnessoftheproblemNP-compApproximability3:[Blum,Jiang,Li,Tromp,Yannakakis,STOC’91]2+8/9:[Teng,Yao,STOC’93]2+5/6:[Czumajetal.,SWAT’94]2+50/63:[Kosarju,Park,Stein,FOCS’94]2+3/4:[Armen,Stein,WADS’95]2+2/3:[Armen,Stein,CPM’96]2.596:[Breslaurev,Jiang,Jiang,JALG’97]2.5:[Sweedyk,Ph.D.Thesis1995,SICOMP’99]2003/12/1631AlgorithmsinBioinformatics,Lecture11Approximability3:[Blum,JiangTodayA4-approximationforShortestSuperstringProblem2003/12/1632AlgorithmsinBioinformatics,Lecture11TodayA4-approximationforShoAnaturalassumptionForanytwodistinctindicesiandjbetween1andk,AiisnotasubstringofAj.Why?Otherwise,Aicanberemovedfromtheinputsetwithoutaffectingthesolution.2003/12/1633AlgorithmsinBioinformatics,Lecture11AnaturalassumptionForanytwTherefore,…IfSisasuperstringoftwodistinctinputstringsAiandAj,thenAiandAj

cannotoccuratthesamepositionofS.AiAjS2003/12/1634AlgorithmsinBioinformatics,Lecture11Therefore,…IfSisasuperstrNotation頭(i,j)and疊(i,j).AjAiThetightestwaytooverlapAiandAj.2003/12/1635AlgorithmsinBioinformatics,Lecture11Notation頭(i,j)and疊(i,j).LiningupasetofinputstringsLet排(i1,i2,…,id)betheconcatenation 頭(i1,i2)頭(i2,i3)…頭(id-1,id)Aid.Ai1Ai2Aid排(i1,i2,…,id)2003/12/1636AlgorithmsinBioinformatics,Lecture11Liningupasetofinputstrin隊長定理|排(i1,i2,…,id)|=A1到Ad的總長扣掉所有相鄰兩個字串的重疊長度之和.Ai1Ai2Aid排(i1,i2,…,id)2003/12/1637AlgorithmsinBioinformatics,Lecture11隊長定理|排(i1,i2,…,id)|=A1到AdAnobservationThereisapermutationΠ*over{1,2,…,k}suchthat排(Π*)isashortestsuperstringofA1,A2,…,Ak.2003/12/1638AlgorithmsinBioinformatics,Lecture11AnobservationThereisapermuProofLetS*beashortestsuperstringofA1,A2,…,Ak.Were-indexA1,A2,…,Aksuchthat現(1)<現(2)<…<現(k),where現(i)istheleftmostoccurrenceofAiinS*.A1A2AkAiS*2003/12/1639AlgorithmsinBioinformatics,Lecture11ProofLetS*beashortestsupeClaim:|排(1,2,…,k)|=|S*|“≤”:S*showsawaytolineupA1,A2,…,Akinorder,althoughitisnotnecessarilythemost“compact”waytodothat.“≥”:排(1,2,…,k)isasuperstringofA1,A2,…,Ak,andS*isashortestone.2003/12/1640AlgorithmsinBioinformatics,Lecture11Claim:|排(1,2,…,k)|=|S*|“≤”:24-approximationbyshortestcoveringcyclicstrings2003/12/1641AlgorithmsinBioinformatics,Lecture114-approximationbyshortestcoCyclicstringIfSisastring,let環(S)denotethecyclicstringobtainedfromSby頭尾相接.Comment:EvenifRandSaretwodistinctstrings,環(R)=環(S)isstillpossible,e.g.,環(bbabbaab)=環(abbaabbb).bbabbaab2003/12/1642AlgorithmsinBioinformatics,Lecture11CyclicstringIfSisastring,馴服字串(coveringcyclicstring)IfAiisasubstringofSSS…,thenwesay環(S)isa馴服字串(coveringcyclicstring)ofAi.Denoted:Ai

環(S).Forexample,0101環(10)11001環(1001)10111環(1101)1101環(10100100101)2003/12/1643AlgorithmsinBioinformatics,Lecture11馴服字串(coveringcyclicstring)IfCommentsForanyAiand環(S)withAi環(S),both|A

温馨提示

  • 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
  • 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
  • 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
  • 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
  • 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
  • 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
  • 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。

评论

0/150

提交评论