版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领
文档简介
1、the 7 bridges in knigsberg and compositional representation of protein sequencesbailin hao (郝柏林)(itp & bgi, cas )huimin xie (谢惠民)(math dept. suzhou u)shuyu zhang (张淑誉)(ip. acad. sinica)compositional approach inprokaryote phylogeny justification of using k-tuples instead of primary protein sequen
2、ces. problem of uniqueness of reconstruction of protein sequence from its constituent k-tuples. picking up a special class of proteins without biological knowledge.7 bridges in knigsbergabdca ab bc cd d euler (1736) 4 odd nodes: no! dnes knig, theory of finite and infinite graphs 1st ed.(1932). birk
3、haser (1990)“ from knigsberg to knigs book, so runs the graphic talebasic notions a graph g=(v, e), where v is a set of nodes (vertices), e is a set of edges (bonds) edges: undirected (u,v)=(v,u), uand v adjacent; directed (u,v) differs from (v,u), u incident to v. a weight may be associated with (u
4、,v): cost, distance, transfer function, reaction rate, etc. eulerian graph: each edge appears once and only once in a path hamiltonian graph: each vertex appears once and only once in a path. hamiltonian cycle of minimal weight - travel salesman problem (tsp)an euler path:an euler loop:euler grahp:
5、loopsemi-euler praph: path, no loopproblem of eulerian loop: simple, known solutionsproblem of hamiltonian paths: much harder 1 12 23 34 46 65 57 71 12 23 34 46 65 57 7hamiltonian loops: much harder10 nodes15 arcs di=3 nodestraveling salesman problemnp-hard problemsyes!no!graph = nodes + arcsdirecte
6、d , labeled arcs and nodessimple graph: no rings at nodes: no repeated arcs:i ij jiijindegree din(i) outdegree dout(i)euler graph: din(i) = dout(i) di isimple euler graphdiagonal matrix: m=diag( d1, d2, dn )adjacent matrix: a=aij aij= aii= 0 kirchhoff matrix: c=m-a cij= cij=0 det(c)=0all minors of c
7、 are equal. denote this common minor by ni,j=010ijnumber of euler loops in simple euler graphn g de bruijnt van aardenrie ehrenfestc a b smithw t tuitebest theoreme (g) = (di-1)! inumber of eulerian loops in general eular g. some aii0 rings some aij1 parallel arcsputting auxiliary nodes on these rin
8、gs and parallel arcs makes the graph simple.i ij jino need to work with bigger a matrix.just let some aii0, aij1 in original a.eliminate redundancy caused by unlebeled arcs.modified best theorem: e(g) = (di-1)! aij!iijanpa_pseam 82aamalslftvgqliflfwtmriteaspdpaakaapaaaaapaaaapdtasdaaaaaaltaanakaaael
9、taanaaaaaaatargmalsalsllslfslftftvgtvgqvgqlakaak=5lftvltaa 7aaaa 5paaa 4apaa 3tann 8aana 9sspa 2aaap 6akaa 1auxiliary arc6 ringsfrom pdb.seq-a special selection of swissprot2821-1=2820 proteins ( may 2000 )rnumber of reconstructed aa sequences from a given protein decompositionr rk k1 12-102-101 11-
10、1001-1001 101-100001-10001 1001-001-1000010000 10 10 4 45 56 67 78 89 91 10 01 11 121642164(76.7%)40440490904545212193932 2651651(94%)7777292910104 4494927322732(96.9%)323216163 32 2444427402740(97.1%)232310103 30 044442762763 3(97.9%)1 13 37 71 10 0363627932793(99%)11117 72 21 16 627982798(99.2%)12
11、122 21 11 16 6compositional representation of proteinsthe collection w or w ,n j may be used as an equivalent representation of the original protein sequence.a seemingly trivial result upon further reflection: random aa sequences have unique reconstruction as well.compositional representation works
12、equally for random aa sequences and most of protein sequences.a given realization of a short random aa sequence is as specific as a real protein sequence. j=1mjki=1l= -k+1iknucleotide correlations in dna/rna much studied k=2 correlation functions 16 9 6 see wentian li, computer chem. 21(1997) 257-27
13、1.amino acid correlations in proteins almost no study hard to comprehend 400 correlation functions at k=2 proteins too short to define correlation functions one should approach the problem from a more deterministic point of viewrepeated aa segments in proteins are strong manifestation of correlation
14、s!on-going study: the other extremequit a few proteins have an enormous number of reconstructions. transmembrane antifreeze fibrous: collagenscoarse-graining: closer to biology by reducing the number of aasproteinproteinaaaar(1r(11)1)k kfor r(k)=1for r(k)=1maga_xenlamaga_xenlaspa2_staauspa2_staaus sr rtx_atrentx_atrenebn1_ebvebn1_ebvi icen_psesycen_psesyxpin_xenlaxpin_xenlacaih_mousecaih_mousec cipa_clotmipa_clotm3 303031 150508 84 45 508088640864028285 54 43 39.9732*109.9732*105 510110164641 15.407*105.407*1016162828120012001.55675*101.5
温馨提示
- 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
- 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
- 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
- 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
- 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
- 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
- 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。
最新文档
- 2026年西北民族大学电气工程学院人才招聘备考题库及完整答案详解1套
- 2025湖北鄂州市华容区属国有企业招聘7人备考题库及一套参考答案详解
- 2026北京保障房中心有限公司法律管理岗招聘1人备考题库及完整答案详解1套
- 2025年安徽城市管理职业学院引进高层次人才10名备考题库及一套答案详解
- 2026年度枣庄市薛城区直属高中学校专项引才备考题库(35人)附答案详解
- 2026年合肥市园上园小学喻岗分校教师招聘备考题库带答案详解
- 2025湖南怀化市鹤中一体化发展事务中心选调1人备考题库及一套完整答案详解
- 2026云南昆明市云南中医药大学招聘第一批科研助理岗位人员(事业编制外)1人备考题库及完整答案详解1套
- 2025天津蓝巢京能(锡林郭勒)运行维护项目部招聘28人备考题库及一套参考答案详解
- 2026天津市中天天杰招聘备考题库有答案详解
- 口述史研究活动方案
- 高压燃气管道施工方案
- 房屋租赁合同txt
- 加工中心点检表
- 水库清淤工程可行性研究报告
- THBFIA 0004-2020 红枣制品标准
- GB/T 25630-2010透平压缩机性能试验规程
- GB/T 19610-2004卷烟通风的测定定义和测量原理
- 精排版《化工原理》讲稿(全)
- 市场营销学-第12章-服务市场营销课件
- 小微型客车租赁经营备案表
评论
0/150
提交评论