




版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领
文档简介
1、the 7 bridges in knigsberg and compositional representation of protein sequencesbailin hao (郝柏林)(itp & bgi, cas )huimin xie (谢惠民)(math dept. suzhou u)shuyu zhang (张淑誉)(ip. acad. sinica)compositional approach inprokaryote phylogeny justification of using k-tuples instead of primary protein sequen
2、ces. problem of uniqueness of reconstruction of protein sequence from its constituent k-tuples. picking up a special class of proteins without biological knowledge.7 bridges in knigsbergabdca ab bc cd d euler (1736) 4 odd nodes: no! dnes knig, theory of finite and infinite graphs 1st ed.(1932). birk
3、haser (1990)“ from knigsberg to knigs book, so runs the graphic talebasic notions a graph g=(v, e), where v is a set of nodes (vertices), e is a set of edges (bonds) edges: undirected (u,v)=(v,u), uand v adjacent; directed (u,v) differs from (v,u), u incident to v. a weight may be associated with (u
4、,v): cost, distance, transfer function, reaction rate, etc. eulerian graph: each edge appears once and only once in a path hamiltonian graph: each vertex appears once and only once in a path. hamiltonian cycle of minimal weight - travel salesman problem (tsp)an euler path:an euler loop:euler grahp:
5、loopsemi-euler praph: path, no loopproblem of eulerian loop: simple, known solutionsproblem of hamiltonian paths: much harder 1 12 23 34 46 65 57 71 12 23 34 46 65 57 7hamiltonian loops: much harder10 nodes15 arcs di=3 nodestraveling salesman problemnp-hard problemsyes!no!graph = nodes + arcsdirecte
6、d , labeled arcs and nodessimple graph: no rings at nodes: no repeated arcs:i ij jiijindegree din(i) outdegree dout(i)euler graph: din(i) = dout(i) di isimple euler graphdiagonal matrix: m=diag( d1, d2, dn )adjacent matrix: a=aij aij= aii= 0 kirchhoff matrix: c=m-a cij= cij=0 det(c)=0all minors of c
7、 are equal. denote this common minor by ni,j=010ijnumber of euler loops in simple euler graphn g de bruijnt van aardenrie ehrenfestc a b smithw t tuitebest theoreme (g) = (di-1)! inumber of eulerian loops in general eular g. some aii0 rings some aij1 parallel arcsputting auxiliary nodes on these rin
8、gs and parallel arcs makes the graph simple.i ij jino need to work with bigger a matrix.just let some aii0, aij1 in original a.eliminate redundancy caused by unlebeled arcs.modified best theorem: e(g) = (di-1)! aij!iijanpa_pseam 82aamalslftvgqliflfwtmriteaspdpaakaapaaaaapaaaapdtasdaaaaaaltaanakaaael
9、taanaaaaaaatargmalsalsllslfslftftvgtvgqvgqlakaak=5lftvltaa 7aaaa 5paaa 4apaa 3tann 8aana 9sspa 2aaap 6akaa 1auxiliary arc6 ringsfrom pdb.seq-a special selection of swissprot2821-1=2820 proteins ( may 2000 )rnumber of reconstructed aa sequences from a given protein decompositionr rk k1 12-102-101 11-
10、1001-1001 101-100001-10001 1001-001-1000010000 10 10 4 45 56 67 78 89 91 10 01 11 121642164(76.7%)40440490904545212193932 2651651(94%)7777292910104 4494927322732(96.9%)323216163 32 2444427402740(97.1%)232310103 30 044442762763 3(97.9%)1 13 37 71 10 0363627932793(99%)11117 72 21 16 627982798(99.2%)12
11、122 21 11 16 6compositional representation of proteinsthe collection w or w ,n j may be used as an equivalent representation of the original protein sequence.a seemingly trivial result upon further reflection: random aa sequences have unique reconstruction as well.compositional representation works
12、equally for random aa sequences and most of protein sequences.a given realization of a short random aa sequence is as specific as a real protein sequence. j=1mjki=1l= -k+1iknucleotide correlations in dna/rna much studied k=2 correlation functions 16 9 6 see wentian li, computer chem. 21(1997) 257-27
13、1.amino acid correlations in proteins almost no study hard to comprehend 400 correlation functions at k=2 proteins too short to define correlation functions one should approach the problem from a more deterministic point of viewrepeated aa segments in proteins are strong manifestation of correlation
14、s!on-going study: the other extremequit a few proteins have an enormous number of reconstructions. transmembrane antifreeze fibrous: collagenscoarse-graining: closer to biology by reducing the number of aasproteinproteinaaaar(1r(11)1)k kfor r(k)=1for r(k)=1maga_xenlamaga_xenlaspa2_staauspa2_staaus sr rtx_atrentx_atrenebn1_ebvebn1_ebvi icen_psesycen_psesyxpin_xenlaxpin_xenlacaih_mousecaih_mousec cipa_clotmipa_clotm3 303031 150508 84 45 508088640864028285 54 43 39.9732*109.9732*105 510110164641 15.407*105.407*1016162828120012001.55675*101.5
温馨提示
- 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
- 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
- 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
- 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
- 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
- 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
- 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。
最新文档
- 健康养老展服务博览会方案
- 项链说课课件2017
- 《旅行社经营管理》课件-第三章 旅行社产品开发设计
- 音标教学课件
- 人民警察法制教育
- 城镇污水管网建设工程建设管理方案(模板范文)
- 健康饮食产业园项目投标书(参考)
- xx河流排水防涝设施建设项目可行性研究报告
- 先锋问答知识:政治建设题库考点(题库版)
- 2025年锂电池正极材料合作协议书
- 2024年广州市荔湾区社区专职招聘笔试真题
- 《人工智能基础与应用》课件 项目1 认识人工智能
- 网络货运安全管理制度
- 2025至2030全球及中国溴化聚苯乙烯(BPS)行业发展趋势分析与未来投资战略咨询研究报告
- 校园外卖公司管理制度
- BA系统对电气设备动力柜(箱)的自控接口要求
- 汕尾市市直单位招聘政府聘员笔试真题2024
- 辽宁省铁岭市铁岭县2023-2024学年七年级下学期7月期末考试地理试卷(含答案)
- 房颤患者用药护理
- 2025年全国统一高考数学试卷(全国二卷)
- 无菌技术及手卫生
评论
0/150
提交评论