北邮-数据库系统原理(英文)-12-15_第1页
北邮-数据库系统原理(英文)-12-15_第2页
北邮-数据库系统原理(英文)-12-15_第3页
北邮-数据库系统原理(英文)-12-15_第4页
北邮-数据库系统原理(英文)-12-15_第5页
已阅读5页,还剩33页未读 继续免费阅读

下载本文档

版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领

文档简介

1、PART 4DATA STORAGE AND QUERYChapter 12 Indexing and HashingDatabase System Concepts - Chapter 12 Indexing and Hashing -3Contents in This ChapternBasic concepts about and classification of indexingn ordered indices, hash indicesnProperties/types of ordered indicesn primary/clustering indices, seconda

2、ry/non-clustering indicesn dense indices, sparse indicesn single-level indices, multi-level indices (e.g. B+-tree, B-tree)nHash indicesn hash functionsn static hash, dynamic hashDatabase System Concepts - Chapter 12 Indexing and Hashing -4目目 标!标!n学会准确建立、管理索引,提高数据访问速学会准确建立、管理索引,提高数据访问速度度n 索引类型索引类型n s

3、elect、insert、update时的索引管理时的索引管理Database System Concepts - Chapter 12 Indexing and Hashing -512.1 Basic ConceptsnHow to locate records in DB file quickly?nIndexing (索引技术)nmechanisms used to speed up access to desired datane.g. for relation account(account-number, branch-name, balance) shown in Fig.12

4、.1, the index branch-name physical address of record (i.e.tuple) in DB file accountnSearch Keynattributes or a set of attributes used to look up the records in a filene.g. branch-name A-217A-110A-101A-215A-201A-218A-102A-222A-305Fig. 12.1 DB indexed file account and its index fileNote: the file acco

5、unt is logically a sequential file, but its records may be stored non-contiguously or non-ordered on the disklogical file accountphysical file account index file(account-number, branch-name, balance) indexed fileDatabase System Concepts - Chapter 12 Indexing and Hashing -712.1 Basic Concepts (cont.)

6、nIndexing nmapping from search-key to storage locations of the file records, i.e. search-key storage locations of the records in disks搜索键(search key)索引文件(index file)数据文件(主文件、被索引文件indexed file)s1s2s3.si.sj.sn散列函数Hs1s2s3.si.sj.sna3a2a1aiajaman-1anb3b2b1bibjbmbn-1bns3s2s1sisjsmsn-1snd3d2d1didjdmdn-1dn

7、r(R), R (A, B, S, . , , D)h(si)h(sn)h(s1)索引项index entrys1s2s3sism图 12.0.1 索引技术(indexing)及其分类ordered indiceshash indices搜索键(search key)Database System Concepts - Chapter 12 Indexing and Hashing -9nTwo basic kinds of indices:nordered indices: the index file is used to store the index entries in which

8、the search key of the records and the address of the records are stored in sorted ordernhash indices: the “hash function” is used to map the the search key of the records to the address of the records nthe records are stored in the “buckets”, the number of the bucket is as the address of the records

9、 and is determined by the hash function12.1 Basic Concepts (cont.)Database System Concepts - Chapter 12 Indexing and Hashing -10nDBS files with indexing mechanism include two parts :nindexed file, in which data records are storednindex file, in which index entries are included ne.g. Fig.12.1nThe ind

10、exed file can be organized asn sequential file n heap filen hash filen clustering file 12.2 Ordered IndicesDatabase System Concepts - Chapter 12 Indexing and Hashing -1112.1 Basic Concepts (cont.)nIndex filen set of index entries of the form nIndex files are typically much smaller than the original

11、file search-keylocationDatabase System Concepts - Chapter 12 Indexing and Hashing -12nOrdered index (in index file) nthe index entries in the index file are stored in some sorted order, for instance, in accordance with the order of the search key/* 索引项的排列顺序与(被索引文件中)搜索键的排列顺序一致n e.g. in Fig.12.1( ) ,

12、the index file is sorted by branch-name 12.2 Ordered Indices (cont.)Database System Concepts - Chapter 12 Indexing and Hashing -13nPrimary/clustering index (in index file)nconsidering a index file and its corresponding indexed file. nif the indexed file is a sequential file, and the search key in in

13、dex file also specifies the sequential order of the indexed filen/* 索引文件的搜索键所规定的顺序与被索引的顺序文件中的纪录顺序一致ne.g. in Fig.12.1 , (branch-name, address of records) defines the same orders of the records as that in sequential indexed file accountnnote: also called clustering index12.2-I Primary and Secondary In

14、dices Database System Concepts - Chapter 12 Indexing and Hashing -14nThe search key of a primary index is usually but not necessarily the primary keyn e.g. in Fig. 12.1, branch-name is not the primary key of account12.2-I Primary and Secondary Indices (cont.)Database System Concepts - Chapter 12 Ind

15、exing and Hashing -15n聚集索引 vs 主索引 在实际数据库系统中,如SQL Server、DB2等,n聚集索引: 指索引文件的搜索键所规定的顺序与被索引的顺序文件中的纪录顺序一致n主索引: 指建立在主键(主属性上)的索引n当一个表定义了主键后,DBMS会自动为该表在主键上建立聚集索引,该索引同时又是主索引12.2-I Primary and Secondary Indices (cont.)Database System Concepts - Chapter 12 Indexing and Hashing -16n一个表上只能建立一个聚集索引,也只能有一个主索引。但可以建

16、立多个非聚集索引n如果一个表上没有定义主键,则不会有主索引;但可以为该表建立一个聚集索引n主索引一定是聚集索引,但聚集索引不一定是主索引主索引一定是聚集索引,但聚集索引不一定是主索引12.2-I Primary and Secondary Indices (cont.)Database System Concepts - Chapter 12 Indexing and Hashing -17n实验:对比分析n 在没有定义主键的关系表中插入数据: 在关系表所在数据库文件中,各个元组/行/记录顺序一般是无序,取决于数据插入顺序heap文件n 在有主键的关系表中插入数据: 数据库文件是有序的,按照主

17、键顺序排列12.2-I Primary and Secondary Indices (cont.)Database System Concepts - Chapter 12 Indexing and Hashing -18nSecondary indexn an index whose search key specifies an order different from the sequential order of the file. nalso called non-clustering indexn e.g. Fig.12.5 , secondary index on balance

18、 field of accountnSecondary indices have to be dense indicesnIndex-sequential file (索引顺序文件)nordered sequential file with a primary/clustering index on the search key n e.g. Fig.12.1 12.2-I Primary and Secondary Indices (cont.)Database System Concepts - Chapter 12 Indexing and Hashing -19Fig.12.5 Sec

19、ondary Index on balance field of account12.2-I Primary and Secondary Indices (cont.)Database System Concepts - Chapter 12 Indexing and Hashing -20n Dense index n the index record in the index file appears for every search-key value in the indexed fileneach value of search-key in the indexed file cor

20、responds to an index entry in the index file n e.g. Fig.12.1 n In a dense primary index, the index record contains the search-key value and a pointer to the first record with that search-key valuen the rest of the records with the same search-key value would be stored sequentially after the first re

21、cordne.g. Fig.12.1, search-key value “Perryridge” corresponds to three records in the file12.2-II Dense and Sparse IndicesDatabase System Concepts - Chapter 12 Indexing and Hashing -21nSparse Indexn index file contains index entries for only some search-key values in the indexed filene.g. Fig.12.3 n

22、With respect to the sparse index, to locate a file record with search-key value Knfind index entry with largest search-key value Knsearch file sequentially starting at the record to which this index entry points12.2-II Dense and Sparse Indices (cont.)Database System Concepts - Chapter 12 Indexing an

23、d Hashing -22Fig.12.3 Sparse index for the file account index file indexed file12.2-II Dense and Sparse IndicesDatabase System Concepts - Chapter 12 Indexing and Hashing -23nThe indices in Fig.12.1, Fig.12.3, and Fig.12.5 are single-level indicesnThe index file may be very large, and cannot be entir

24、ely kept in memory nTo access the index file quickly, the index file is stored on disk as a sequential file and construct a sparse index on this sequential index filenouter index a sparse index of primary index fileninner index the primary index filenIf even outer index is too large to fit in main m

25、emory, yet another level of index can be created, and so on.12.2-III Single-level and Multi-level Indices Fig. 12.4 Two-level sparse indexindexed fileindex fileNote: similar to hierachical page tables or file indexs in OSDatabase System Concepts - Chapter 12 Indexing and Hashing -25nB+-tree indices

26、and B-tree indices are two types of efficient multi-level indices, and widely used in DBSnHow to build a B/B+ tree index for a DB file with sparse index.12.2-III Single-level and Multi-level IndicesFig. An Example of B+-tree index堆、二叉搜索树?!Database System Concepts - Chapter 12 Indexing and Hashing -2

27、6nThe file records are stored in a set of bucketsn a bucket is a unit of storage containing one or more records (a bucket is typically a disk block).nHash function hn a function from the set of all search-key values K in a file to the set of all bucket addresses, i.e. the set of the addresses of fil

28、e recordsntypical hash functions perform computation on the internal binary representation of the search-key. nHash file organizationn obtaining the bucket of a record directly from its search-key value using a hash function12.5 Hashing Index FilesFig.12.21 Hash file organization of account file, wi

29、th branch-name as key Hash function:returns the sum of the binary representations of the characters modulo 10Database System Concepts - Chapter 12 Indexing and Hashing -28Handling of Bucket OverflowsnBucket overflow (溢出) can occur because of ninsufficient buckets nskew in distribution of records. Th

30、is can occur due to two reasons:nmultiple records have same search-key valuenchosen hash function produces non-uniform distribution of key valuesnAlthough the probability of bucket overflow can be reduced, it cannot be eliminated; it is handled by using overflow buckets.Database System Concepts - Ch

31、apter 12 Indexing and Hashing -29Handling of Bucket Overflows (cont.)nOverflow chaining the overflow buckets of a given bucket are chained together in a linked list.nAbove scheme is called closed hashing. Fig. 12.22 in an hash structureDatabase System Concepts - Chapter 12 Indexing and Hashing -30Ha

32、sh IndicesnHashing can be used not only for file organization, but also for index-structure creation. nA hash index organizes the search keys, with their associated record pointers, into a hash file structure.nStrictly speaking, hash indices are always secondary indices nif the file itself is organized using hashing, a separate primary hash index on it using the same search-key is unnecessary. nHowever, we use the term hash index to refer to both secondary index structures and hash organized files

温馨提示

  • 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
  • 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
  • 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
  • 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
  • 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
  • 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
  • 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。

评论

0/150

提交评论