data:image/s3,"s3://crabby-images/0c9d6/0c9d662876569e7954c17ba7cb4df119bc7e6de6" alt="数据库管理系统:ch14 Query Optimization_第1页"
data:image/s3,"s3://crabby-images/a0b97/a0b979f9723633f2fda378f91a4469104f17197e" alt="数据库管理系统:ch14 Query Optimization_第2页"
data:image/s3,"s3://crabby-images/a58b3/a58b3b8854ed0c93ca1d612eb86643f955bdb75d" alt="数据库管理系统:ch14 Query Optimization_第3页"
data:image/s3,"s3://crabby-images/ee2bc/ee2bc8001b7961bc456f0bc5fa8989bb3d9210ab" alt="数据库管理系统:ch14 Query Optimization_第4页"
data:image/s3,"s3://crabby-images/b2fb1/b2fb141aa5358da01cd077adf8769f527c51796a" alt="数据库管理系统:ch14 Query Optimization_第5页"
版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领
文档简介
1、Database System Concepts 5th Ed.Silberschatz, Korth and SudarshanSee www.db- for conditions on re-use Silberschatz, Korth and Sudarshan14.2Database System Concepts - 5th Edition, Aug 27, 2005.nIntroduction nTransformation of Relational ExpressionsnCatalog Information for Cost EstimationnStatistical
2、Information for Cost EstimationnCost-based optimizationSilberschatz, Korth and Sudarshan14.3Database System Concepts - 5th Edition, Aug 27, 2005.nAlternative ways of evaluating a given querylEquivalent expressionslDifferent algorithms for each operation (Chapter 13)nCost difference between a good an
3、d a bad way of evaluating a query can be enormousnNeed to estimate the cost of operationslStatistical information about relations. Examples:4number of tuples, 4number of distinct values for an attributes,4Etc.lStatistics estimation for intermediate results4 to compute cost of complex expressionsSilb
4、erschatz, Korth and Sudarshan14.4Database System Concepts - 5th Edition, Aug 27, 2005.nRelations generated by two equivalent expressions have the same set of attributes and contain the same set of tupleslalthough their tuples/attributes may be ordered differently.Silberschatz, Korth and Sudarshan14.
5、5Database System Concepts - 5th Edition, Aug 27, 2005.nGeneration of query-evaluation plans for an expression involves several steps:1.Generating logically equivalent expressions using equivalence rules.2.Annotating resultant expressions to get alternative query plans3.Choosing the cheapest plan bas
6、ed on estimated costnThe overall process is called cost based optimization.Silberschatz, Korth and Sudarshan14.6Database System Concepts - 5th Edition, Aug 27, 2005.nTwo relational algebra expressions are said to be equivalent if on every legal database instance the two expressions generate the same
7、 set of tupleslNote: order of tuples is irrelevantnIn SQL, inputs and outputs are multisets of tupleslTwo expressions in the multiset version of the relational algebra are said to be equivalent if on every legal database instance the two expressions generate the same multiset of tuplesnAn equivalenc
8、e rule says that expressions of two forms are equivalentlCan replace expression of first form by second, or vice versaSilberschatz, Korth and Sudarshan14.7Database System Concepts - 5th Edition, Aug 27, 2005.1.Conjunctive selection operations can be deconstructed into a sequence of individual select
9、ions.2.Selection operations are commutative.3.Only the last in a sequence of projection operations is needed, the others can be omitted.4.Selections can be combined with Cartesian products and theta joins.a.(E1 X E2) = E1 E2 b.1(E1 2 E2) = E1 1 2 E2 )()(1221EE)()(2121EE)()(121EELLnLLSilberschatz, Ko
10、rth and Sudarshan14.8Database System Concepts - 5th Edition, Aug 27, 2005.5. Theta-join operations (and natural joins) are commutative.E1 E2 = E2 E16. (a) Natural join operations are associative: (E1 E2) E3 = E1 (E2 E3)(b) Theta joins are associative in the following manner: (E1 1 E2) 2 3 E3 = E1 1
11、3 (E2 2 E3) where 2 involves attributes from only E2 and E3.Silberschatz, Korth and Sudarshan14.9Database System Concepts - 5th Edition, Aug 27, 2005.Silberschatz, Korth and Sudarshan14.10Database System Concepts - 5th Edition, Aug 27, 2005.7. The selection operation distributes over the theta join
12、operation under the following two conditions:(a) When all the attributes in 0 involve only the attributes of one of the expressions (E1) being joined. 0E1 E2) = (0(E1) E2 (b) When 1 involves only the attributes of E1 and 2 involves only the attributes of E2. 1 E1 E2) = (1(E1) ( (E2)Silberschatz, Kor
13、th and Sudarshan14.11Database System Concepts - 5th Edition, Aug 27, 2005.8. The projections operation distributes over the theta join operation as follows:(a) if involves only attributes from L1 L2: Let L1 and L2 be sets of attributes from E1 and E2, respectively (b) Consider a join E1 E2. l Let L1
14、 and L2 be sets of attributes from E1 and E2, respectively. lLet L3 be attributes of E1 that are involved in join condition , but are not in L1 L2, and let L4 be attributes of E2 that are involved in join condition , but are not in L1 L2.)()()(21212121EEEELLLL)()()(212142312121EEEELLLLLLLLSilberscha
15、tz, Korth and Sudarshan14.12Database System Concepts - 5th Edition, Aug 27, 2005.9.The set operations union and intersection are commutative E1 E2 = E2 E1 E1 E2 = E2 E1 n(set difference is not commutative).10.Set union and intersection are associative. (E1 E2) E3 = E1 (E2 E3) (E1 E2) E3 = E1 (E2 E3)
16、11.The selection operation distributes over , and . (E1 E2) = (E1) (E2) and similarly for and in place of Also: (E1 E2) = (E1) E2 and similarly for in place of , but not for 12. The projection operation distributes over union L(E1 E2) = (L(E1) (L(E2) Silberschatz, Korth and Sudarshan14.13Database Sy
17、stem Concepts - 5th Edition, Aug 27, 2005.nQuery: Find the names of all customers who have an account at some branch located in Brooklyn.customer_name(branch_city = “Brooklyn”(branch (account depositor)nTransformation using rule 7a. customer_name (branch_city =“Brooklyn” (branch) (account depositor)
18、nPerforming the selection as early as possible reduces the size of the relation to be joined. Silberschatz, Korth and Sudarshan14.14Database System Concepts - 5th Edition, Aug 27, 2005.nQuery: Find the names of all customers with an account at a Brooklyn branch whose account balance is over $1000.cu
19、stomer_name(branch_city = “Brooklyn” balance 1000 (branch (account depositor)nTransformation using join associatively (Rule 6a):customer_name(branch_city = “Brooklyn” balance 1000 (branch account) depositor) nSecond form provides an opportunity to apply the “perform selections early” rule, resulting
20、 in the subexpression branch_city = “Brooklyn” (branch) balance 1000 (account)nThus a sequence of transformations can be usefulSilberschatz, Korth and Sudarshan14.15Database System Concepts - 5th Edition, Aug 27, 2005.Silberschatz, Korth and Sudarshan14.16Database System Concepts - 5th Edition, Aug
21、27, 2005.nWhen we compute(branch_city = “Brooklyn” (branch) account )we obtain a relation whose schema is:(branch_name, branch_city, assets, account_number, balance)nPush projections using equivalence rules 8a and 8b; eliminate unneeded attributes from intermediate results to get: customer_name ( ac
22、count_number ( (branch_city = “Brooklyn” (branch) account ) depositor )nPerforming the projection as early as possible reduces the size of the relation to be joined. customer_name(branch_city = “Brooklyn” (branch) account) depositor) Silberschatz, Korth and Sudarshan14.17Database System Concepts - 5
23、th Edition, Aug 27, 2005.nFor all relations r1, r2, and r3,(r1 r2) r3 = r1 (r2 r3 )nIf r2 r3 is quite large and r1 r2 is small, we choose (r1 r2) r3 so that we compute and store a smaller temporary relation.Silberschatz, Korth and Sudarshan14.18Database System Concepts - 5th Edition, Aug 27, 2005.nC
24、onsider the expressioncustomer_name (branch_city = “Brooklyn” (branch) (account depositor)nCould compute account depositor first, and join result with branch_city = “Brooklyn” (branch)but account depositor is likely to be a large relation.nOnly a small fraction of the banks customers are likely to h
25、ave accounts in branches located in Brooklynl it is better to compute branch_city = “Brooklyn” (branch) accountfirst. Silberschatz, Korth and Sudarshan14.19Database System Concepts - 5th Edition, Aug 27, 2005.nQuery optimizers use equivalence rules to systematically generate expressions equivalent t
26、o the given expressionnConceptually, generate all equivalent expressions by repeatedly executing the following step until no more expressions can be found: l for each expression found so far, use all applicable equivalence rules4 add newly generated expressions to the set of expressions found so far
27、nThe above approach is very expensive in space and timenSpace requirements reduced by sharing common subexpressions:lwhen E1 is generated from E2 by an equivalence rule, usually only the top level of the two are different, subtrees below are the same and can be shared4E.g. when applying join associa
28、tivitynTime requirements are reduced by not generating all expressionslMore details shortlySilberschatz, Korth and Sudarshan14.20Database System Concepts - 5th Edition, Aug 27, 2005.nCost of each operator computed as described in Chapter 13lNeed statistics of input relations4E.g. number of tuples, s
29、izes of tuplesnInputs can be results of sub-expressionslNeed to estimate statistics of expression resultslTo do so, we require additional statistics4E.g. number of distinct values for an attributenMore on cost estimation laterSilberschatz, Korth and Sudarshan14.21Database System Concepts - 5th Editi
30、on, Aug 27, 2005.nAn evaluation plan defines exactly what algorithm is used for each operation, and how the execution of the operations is coordinated.Silberschatz, Korth and Sudarshan14.22Database System Concepts - 5th Edition, Aug 27, 2005.nMust consider the interaction of evaluation techniques wh
31、en choosing evaluation plans: choosing the cheapest algorithm for each operation independently may not yield best overall algorithm. E.g.lmerge-join may be costlier than hash-join, but may provide a sorted output which reduces the cost for an outer level aggregation.lnested-loop join may provide opp
32、ortunity for pipeliningnPractical query optimizers incorporate elements of the following two broad approaches:1. Search all the plans and choose the best plan in a cost-based fashion.2. Uses heuristics to choose a plan.Silberschatz, Korth and Sudarshan14.23Database System Concepts - 5th Edition, Aug
33、 27, 2005.nConsider finding the best join-order for r1 r2 . . . rn.nThere are (2(n 1)!/(n 1)! different join orders for above expression. With n=5, the number is 1680, n = 7, the number is 665280, with n = 10, the number is greater than 176 billion!nAssuming we want to find the optimal join-order for the expression as belows:n(r1 r2 r3) r4 r5nThe first three relations joins number is 12; after that there is another 12 join-order . The total number of join-order is 24.Silberschatz, Korth and Sudarshan14.24Database System Concepts - 5th Edition, Aug 27, 2005
温馨提示
- 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
- 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
- 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
- 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
- 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
- 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
- 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。
最新文档
- 木工班班组劳务分包合同
- 仔猪购销合同协议书
- 深圳住房租赁合同书
- 办公用品采购买卖合同
- 衢州职业技术学院《搜索引擎营销》2023-2024学年第二学期期末试卷
- 山东化工职业学院《英语学科教学设计与技能训练》2023-2024学年第二学期期末试卷
- 三江学院《世界古代史(下)》2023-2024学年第二学期期末试卷
- 广东食品药品职业学院《医务社会工作》2023-2024学年第二学期期末试卷
- 西安交通大学城市学院《环境化学Ⅱ》2023-2024学年第二学期期末试卷
- 贵州财经大学《中学政治课教师技能训练》2023-2024学年第二学期期末试卷
- ct增强扫描中造影剂外渗课件
- 《汽车发动机构造与维修》教案-
- 2021年陕西西安亮丽电力集团有限责任公司招聘笔试试题
- 高中英语-Studying abroad教学课件设计
- 原材料取样检测安全操作规程
- 创新思维与方法(第2版)PPT全套完整教学课件
- (5.3.2)-2.2杂草的分类农田杂草及防除学
- 人教部编道德与法治五年级下册单元计划
- 天津武清区事业单位考试真题2022
- 铁路营业线施工安全管理培训课件
- 旅行社运营实务电子课件 1.2 了解旅行社核心业务部门
评论
0/150
提交评论