Gene Function Research in Bioinformatics

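Gene Annotation and Functional Classification
Gene Function Research in Bioinformatics
[Figure labels: protein, RNA, DNA]

Outline
- Function can be viewed from several perspectives
- Why does bioinformatics care about function?
- The functional annotation database (GO)
- Tools provided by GO
- Statistical analysis of gene function

1. Biochemical function (molecular function)
RBP binds retinol and could be a carrier.
- Examples: enzymes, structural proteins, transport proteins
- No protein in the cell is entirely without function.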

2. Functional assignment based on homology
RBP could be a carrier too, like other carrier proteins.
- Odorant-binding protein is a member of the lipocalins and is also thought to be a carrier protein.

3. Function based on structure
RBP forms a calyx.
- X-ray crystallography shows that RBP forms a cup-like structure lined with a ring of hydrophobic amino acids, which serves as a ligand-binding site.

4. Function based on ligand-binding specificity
RBP binds vitamin A.

5. Function based on cellular process
[Figure labels: DNA, RNA]
RBP is abundant, soluble, and secreted.

6. Function based on biological process
RBP is essential for vision.

7. Function based on "proteomics" or high-throughput "functional genomics"
High-throughput analyses show:
- RBP levels are elevated in renal failure
- RBP levels are decreased in liver disease

Why bioinformatics cares about function: three background factors
- Function descriptions need to be standardized and processed at large scale.
- Traditional ways of characterizing function cannot keep pace with gene discovery, so function has to be predicted from large-scale data.
- Gene function needs to be understood at the systems level.

Background

Year    Number of records
1982    602
2005    44,202,133

Published literature
- PubMed: over 15 million citations
- Basic search: rad51, 1038 articles
- Limited search: rad51, Human (organism), 485 articles
- Boolean operators: rad51 AND cancer, 234 articles

What's in a name?
- What is a "cell"?

Cell
Image from http:/microscop

[Figures: further images labelled "Cell"]

What's in a name? The naming problem for biological functions
- Glucose synthesis
- Glucose biosynthesis
- Glucose formation
- Glucose anabolism
- Gluconeogenesis
All refer to the process of making glucose from simpler components.

What's in a name?
- The same name can describe different concepts, for example "cell".
- One concept can be described by different names, for example gluconeogenesis and glucose formation.
Comparison is difficult, in particular across species or across databases.

What is the Gene Ontology?
A (part of the) solution:
- A controlled vocabulary that can be applied to all organisms
- Used to describe gene products (proteins and RNA) in any organism

Ontology
- In philosophy, the most fundamental branch of metaphysics. It studies being or existence as well as the basic categories thereof, trying to find out what entities and what types of entities exist. (Wikipedia)
- Ontologies provide controlled, consistent vocabularies to describe concepts and relationships, thereby enabling knowledge sharing. (Gruber 1993)

An ontology includes:
1. A vocabulary of terms (names for concepts)
2. Definitions
3. Defined logical relationships between the terms

Ontology structure
Ontologies can be represented as graphs, where the nodes are connected by edges.
- Nodes = concepts in the ontology
- Edges = relationships between the concepts
[Figure labels: node, node, node, edge]

The need for function prediction
[Figure: pie chart showing homology of predicted human proteins to proteins of other species, for those where homologues were detected by computer searches of the public databases. 2003 John Wiley and Sons Publishers]

Understanding gene function at the systems level

Functional annotation databases

Founders of the Gene Ontology
- Saccharomyces Genome Database (SGD), for budding yeast
- Drosophila genome database (FlyBase)
- Mouse genome information database (MGD/GXD)
The GO database is not centered on itself. It relies on external databases, and the genes and gene products recorded in those databases are annotated with the vocabulary that GO defines. GO is therefore a continually updated, collaborative effort that aims to unify the way genes and their products are annotated. You can visit GO at .

Ontology structure
- The Gene Ontology is structured as a hierarchical directed acyclic graph (DAG).
- Terms can have more than one parent and zero, one or more children.
- Terms are linked by two relationships: is-a and part-of.

Simple hierarchies (trees) versus directed acyclic graphs
- Simple hierarchy (tree): single parent
- Directed acyclic graph: one or more parents

Relationships in the directed acyclic graph
- is a: the parent concept includes the child concept, and the child is an instance of the parent. Example: a pine is a tree.
- part of: the child concept is part of the parent concept. Example: a leaf is part of a tree.

True Path Rule
The path from a term up to its top-level ancestors must always hold, so a gene product annotated to a term is implicitly annotated to every parent of that term.
- Example: hexose biosynthesis has two parents, hexose metabolism and monosaccharide biosynthesis.

How does GO work?
What information might we want to capture about a gene product?
- What: what can each protein do?
- Why: why does the cell need this function?
- Where: where does the process take place?
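The is-a and part-of relationships and the True Path Rule described above can be sketched as a small graph walk. This is only an illustrative sketch: the names `PARENTS`, `ancestors` and `propagate` are made up for this example, and the tiny graph contains just the slide examples (tree, pine, leaf, and the hexose terms), not real GO data.

```python
from collections import deque

# child term -> list of (relationship, parent term).
# A term may have more than one parent, which is what makes this a DAG
# rather than a simple tree. Contents are the slide examples only.
PARENTS = {
    "pine": [("is_a", "tree")],
    "leaf": [("part_of", "tree")],
    "hexose biosynthesis": [
        ("is_a", "hexose metabolism"),
        ("is_a", "monosaccharide biosynthesis"),
    ],
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable by following parent edges upward."""
    seen = set()
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for _relationship, parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def propagate(annotated_term, parents=PARENTS):
    """True Path Rule: an annotation to a term also holds for all its ancestors."""
    return {annotated_term} | ancestors(annotated_term, parents)
```

For instance, `propagate("hexose biosynthesis")` returns the term itself together with both of its parents, which is the behaviour the rule asks for.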

GO: three ontologies
A gene product is described from three angles:
- Where does it act? Cellular Component
- What processes is it involved in? Biological Process
- What does it do? Molecular Function

Cellular Component
- Where a gene product acts

Biological Process
- Example: gluconeogenesis

Molecular Function
- Molecular function describes activity at the molecular level, such as catalytic activity or binding activity.
- Sets of functions make up a biological process.
- Examples: insulin binding, insulin receptor activity

What's in a GO term?
term: gluconeogenesis
id: GO:0006094
definition: The formation of glucose from noncarbohydrate precursors, such as pyruvate, amino acids and glycerol.

Content of GO
- As of October 2005: Molecular Function 7,309 terms; Biological Process 10,041 terms; Cellular Component 1,629 terms; total 18,975 terms
- As of October 2007: total 25,473 terms
- As of October 2010: Molecular Function 8,878 terms; Biological Process 19,731 terms; Cellular Component 2,769 terms; total 32,826 terms

Annotation of gene products with GO terms
Example: mitochondrial P450
- Cellular component: mitochondrial inner membrane (GO:0005743)
- Biological process: electron transport (GO:0006118)
- Molecular function: monooxygenase activity (GO:0004497)
[Figure label: substrate + O2 = CO2 + H2O, product]
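The mitochondrial P450 example above pairs one gene product with one term from each of the three ontologies. A minimal sketch of how such a record might be held in code follows. The class names `GOTerm` and `GeneProduct` and the method `terms_in` are hypothetical, chosen only for illustration; the identifiers and term names are the ones on the slide, and the only format rule assumed is that a GO id is `GO:` followed by seven digits.

```python
import re
from dataclasses import dataclass, field

# A GO identifier is "GO:" followed by seven digits, e.g. GO:0006094.
GO_ID_PATTERN = re.compile(r"^GO:\d{7}$")


@dataclass(frozen=True)
class GOTerm:
    go_id: str
    name: str
    namespace: str  # molecular_function, biological_process or cellular_component

    def __post_init__(self):
        # Reject identifiers that do not follow the GO:nnnnnnn layout.
        if not GO_ID_PATTERN.match(self.go_id):
            raise ValueError(f"malformed GO id: {self.go_id!r}")


@dataclass
class GeneProduct:
    name: str
    terms: list = field(default_factory=list)

    def terms_in(self, namespace):
        """Return the annotated terms that belong to one ontology."""
        return [term for term in self.terms if term.namespace == namespace]


# The slide's example, expressed with the hypothetical classes above.
p450 = GeneProduct(
    "mitochondrial P450",
    [
        GOTerm("GO:0005743", "mitochondrial inner membrane", "cellular_component"),
        GOTerm("GO:0006118", "electron transport", "biological_process"),
        GOTerm("GO:0004497", "monooxygenase activity", "molecular_function"),
    ],
)
```

Keeping the namespace on each term lets the three ontologies be queried independently, which matches the point that annotation to one ontology does not depend on the others.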

Two types of GO annotations
- Electronic annotation
- Manual annotation
All annotations must:
- be attributed to a source
- indicate what evidence was found to support the GO term-gene/protein association

Manual annotations
- High-quality, specific gene/gene product associations, made using:
  - peer-reviewed papers
  - evidence codes to grade the evidence
- BUT manual annotation is very time-consuming and requires trained biologists.

Electronic annotations
- Provide large coverage
- High quality
- BUT the annotations tend to use high-level GO terms and provide little detail.

Electronic annotations: methods
1. Database entries
   - Manual mapping of GO terms to concepts external to GO (translation tables)
   - Proteins are then electronically annotated with the relevant GO term(s)
2. Automatic sequence similarity analyses to transfer annotations between highly similar gene products

Electronic annotations: example
- Fatty acid biosynthesis (Swiss-Prot keyword) -> GO: fatty acid biosynthesis (GO:0006633)
- EC: (EC number) -> GO: acetyl-CoA carboxylase activity (GO:0003989)
- IPR000438: Acetyl-CoA carboxylase carboxyl transferase beta subunit (InterPro entry) -> GO: acetyl-CoA carboxylase activity (GO:0003989)

Mappings of external concepts to GO
EC:1.1.1.1 GO:alcohol dehydrogenase activity ; GO:0004022
EC:0 GO:L-xylulose reductase activity ; GO:0050038
EC:04 GO:4-oxoproline reductase activity ; GO:0016617
EC:05 GO:retinol dehydrogenase activity ; GO:0004745

Manual annotations: methods
1. Extract information from the published literature
2. Curators perform manual sequence similarity analyses to transfer annotations between highly similar gene products (BLAST, protein domain analysis)

Finding GO terms
"In this study, we report the isolation and molecular characterization of the B. napus PERK1 cDNA, that is predicted to encode a novel receptor-like kinase. We have shown that like other plant RLKs, the kinase domain of PERK1 has serine/threonine kinase activity. In addition, the location of a PERK1-GFP fusion protein to the plasma membrane supports the prediction that PERK1 is an integral membrane protein ... these kinases have been implicated in early stages of wound response"

GO terms found for B. napus PERK1 protein (Q9ARH1), PubMed ID: 12374299:
- Process: "wound response" -> response to wounding (GO:0009611)
- Function: "serine/threonine kinase activity" -> protein serine/threonine kinase activity (GO:0004674)
- Component: "integral membrane protein" -> integral to plasma membrane (GO:0005887)

Annotate to the finest granularity
Annotating to GO:0030047 automatically annotates to all of its parents; thus a product is annotated to both protein modification AND cytoskeleton organization.

GO evidence codes

Other notes
- A gene product can have several functions and cellular locations and can be involved in many processes.
- Annotation of a gene product to one ontology is independent of its annotation to the other ontologies.
- Annotations are made only to terms reflecting a normal activity or location.

Unknown: we know what we don't know
- "Unknown" is used when the curator has determined that there is no existing literature to support an annotation.
  - Biological process unknown (GO:0000004)
  - Molecular function unknown (GO:0005554)
  - Cellular component unknown (GO:0008372)
- This is NOT the same as having no annotation at all: no annotation means that no one has looked yet.
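The translation-table method of electronic annotation described earlier (mapping lines such as `EC:1.1.1.1 GO:alcohol dehydrogenase activity ; GO:0004022`, then assigning the mapped GO term to every protein that carries the external identifier) might look roughly like this. It is a sketch under stated assumptions: the function names `parse_mapping` and `annotate_electronically`, the protein name `P_demo`, and the optional `>` separator in the pattern are illustrative assumptions, not taken from the slides. `IEA` is the GO evidence code for annotations inferred electronically.

```python
import re

# Matches a mapping line such as
#   EC:1.1.1.1 GO:alcohol dehydrogenase activity ; GO:0004022
# The optional ">" between the two sides is an assumption about file layout.
MAPPING_LINE = re.compile(r"^(EC:\S+)\s+(?:>\s+)?GO:(.+?)\s*;\s*(GO:\d{7})$")


def parse_mapping(lines):
    """Build {external id: [(GO id, GO term name), ...]} from mapping lines."""
    table = {}
    for line in lines:
        match = MAPPING_LINE.match(line.strip())
        if match:
            external_id, term_name, go_id = match.groups()
            table.setdefault(external_id, []).append((go_id, term_name))
    return table


def annotate_electronically(protein_to_ec, table):
    """Give each protein the GO terms mapped from its EC numbers.

    Every resulting annotation is tagged with the evidence code IEA
    (inferred from electronic annotation).
    """
    annotations = []
    for protein, ec_numbers in protein_to_ec.items():
        for ec in ec_numbers:
            for go_id, term_name in table.get(ec, []):
                annotations.append((protein, go_id, term_name, "IEA"))
    return annotations
```

A call such as `annotate_electronically({"P_demo": ["EC:1.1.1.1"]}, table)` yields one annotation per mapped term. This shows why the method scales to large coverage but can only be as specific as the mapping table.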

Annotation qualifiers
"To be or not to be" is crucial for GO.

How to access GO and obtain functional annotation data
1. Download
   - Ontologies
   - Annotations: gene association files
   - Ontologies and annotations
2. Online, through the web
   - AmiGO ()
   - QuickGO (http://www.ebi.ac.uk/ego)

GO ontology file (gene_ontology.obo)

format-version: 1.0
date: 20:10:2005 17:32
saved-by: jlomax
auto-generated-by: DAG-Edit 1.419 rev 3
default-namespace: gene_ontology
remark: cvs version: $Revision: 3.1176 $

[Term]
id: GO:0000001
name: mitochondrion inheritance
namespace: biological_process
def: "The distribution of mitochondria, including the mitochondrial genome, into daughter cells after mitosis or meiosis, mediated by interactions between mitochondria and the cytoskeleton." [PMID:10873824, PMID:11389764, SGD:mcc]
is_a: GO:0048308 ! organelle inheritance
is_a: GO:0048311 ! mitochondrion distribution

[Term]
id: GO:0000002
name: mitochondrial genome maintenance
namespace: biological_process
def: "The maintenance of the structure and integrity of the mitochondrial genome." [GO:ai]
is_a: GO:0007005 ! mitochondrion organization and biogenesis

[Term]
id: GO:0000003
name: reproduction
alt_id: GO:0019952
namespace: biological_process
def: "The production by an organism of new individuals that contain some portion of their genetic material inherited from that organism." [GO:curators, ISBN:0198506732]
subset: goslim_generic
subset: goslim_plant
subset: gosubset_prok
is_a: GO:0008150 ! biological_process

[Term]
id: GO:0000004
name: biological process unknown
namespace: biological_process
def: "Used for the annotation of gene products whose process is not known or cannot be inferred." [SGD:curators]
subset: goslim_generic
subset: goslim_goa
subset: goslim_plant
subset: goslim_yeast
subset: gosubset_prok
is_a: GO:0008150 ! biological_process

http:/w/GO.current.annotations.shtml

Gene association files
Columns of a gene association file:

Column  Content            Example
1       DB                 SGD, MGI
2       DB_Object_ID       MGI:1234568
3       DB_Object_Symbol   Gras3
4       GO_ID Qualifier    NOT, co_localizes_with, contributes_to
5       GO_ID              GO:0001515
6       DB_Ref             PMID:234567
7       Evidence_Code      IDA, etc.
8       With/From
9       GO_aspect          P (process), C (component), F (function)
10      DB_Object_Name     Grasshopper 3 homolog
11      DB_Object_Synonym  Locust III, 0122345E12Rik
12      DB_Object_Type     Gene, transcript, or protein
13      Taxon              taxon:4932
14      Date               20050101
15      Assigned_by        DB (usually same as col
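The two download formats shown above, the OBO ontology file and the 15-column gene association file, are both plain text and can be read with a few lines of code. The sketch below is a minimal illustration rather than a full parser. The function names `parse_obo_terms` and `parse_gaf_line` and the short keys in `GAF_COLUMNS` are assumptions made for this example. It also assumes that OBO stanzas start with a `[Term]` header line, that gene association files are tab-separated, and that their comment lines start with `!`.

```python
# Short keys for the 15 columns of the table above (names are illustrative).
GAF_COLUMNS = [
    "DB", "DB_Object_ID", "DB_Object_Symbol", "Qualifier", "GO_ID",
    "DB_Ref", "Evidence_Code", "With_From", "GO_aspect",
    "DB_Object_Name", "DB_Object_Synonym", "DB_Object_Type",
    "Taxon", "Date", "Assigned_by",
]


def parse_obo_terms(text):
    """Return one dict per [Term] stanza; every tag maps to a list of values."""
    terms = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza begins; only [Term] stanzas are collected.
            current = {} if line == "[Term]" else None
            if current is not None:
                terms.append(current)
        elif current is not None and ": " in line:
            tag, value = line.split(": ", 1)
            if tag != "def":
                # Drop the trailing "! term name" comment, e.g. on is_a lines.
                value = value.split(" ! ", 1)[0]
            current.setdefault(tag, []).append(value)
    return terms


def parse_gaf_line(line):
    """Split one gene association line into a dict keyed by GAF_COLUMNS.

    Returns None for comment lines, which start with "!".
    """
    if line.startswith("!"):
        return None
    fields = line.rstrip("\n").split("\t")
    return dict(zip(GAF_COLUMNS, fields))
```

Header lines before the first stanza (such as `format-version`) are skipped because no stanza is open yet. Repeated tags such as `is_a` and `subset` accumulate in lists, which preserves the multiple-parent structure of the DAG.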
