版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领
文档简介
1、第16讲 逻辑建模(II) Logical Modeling(II) Textbook: Chapter 6 Database Design,Instructor:孙瑜 Department:计算机科学学院计算机专业教研室 Email: QQ:782061453 QQ答疑群:79453765,2,本讲主要内容,为什么需要“数据模型优化” 数据模型优化的基本概念 关系模式的非形式化设计原则 函数依赖的基本概念 规范化的基本概念 超键、键和主属性的定义 第1范式 第2范式 第3范式 规范化小结,3,复习:Phases of Database Design and Implementation Pr
2、ocess(数据库设计的基本过程),Phase 1:Requirements Collections and Analysis(需求收集与分析) Phase 2:Conceptual Database Design(概念结构设计) Phase 3:Choice of a DBMS(选择合适的DBMS) Phase 4:Data Model Mapping (Logical Database Design)(逻辑结构设计) Phase 5:Physical Database Design(物理结构设计) Phase 6:Database System Implementation(数据库实施)
3、Phase 7:Database System Operation and Maintenance(数据库运行和维护),4,Phase 2: ER DIAGRAM of COMPANY database,5,Phase 4:Data Model Mapping (Logical Database Design)(逻辑结构设计),Data Model Mapping(数据模型映射) From E-R to Relational Model 数据模型的优化(难点) 设计用户子模式,6,7,Schema diagram for the COMPANY relational database sche
4、ma,8,为什么需要“数据模型优化”,So far in our discussion of conceptual design and its mapping into the relational model, we have not developed any measure of the appropriateness, “goodness,” or quality of the design, other than the intuition(直觉) of the designer. We need some formal measure of why one grouping of
5、 attributes into a relation schema may be better than another.,9,数据模型优化的基本概念,What is relational database design? The grouping of attributes to form good relation schemas We first discuss informal guidelines for good relational design Then we discuss formal concepts of functional dependencies and nor
6、mal forms - 1NF (First Normal Form) - 2NF (Second Normal Form) - 3NF (Third Normal Form),10,Informal Design Guidelines for Relation Schemas,Semantics(语义) of the Relation Attributes Whenever we group attributes to form a relation schema, we assume that a certain meaning is associated with the attribu
7、tes. In general, the easier it is to explain the semantics of the relation, the better the relation schema design will be.,11,GUIDELINE 1: Design a relation schema so that it is easy to explain its meaning. Do not combine attributes from multiple entity types and relationship types into a single rel
8、ation. Intuitively, if a relation schema corresponds to one entity type or one relationship type, the meaning tends to be clear. Otherwise, the relation corresponds to a mixture of multiple entities and relationships and hence becomes semantically unclear. (可概括为“一事一地”原则,即“一件事放一张表,不同事放不同表”),12,一事一地:o
9、ne-fact-in-one-place,13,They may be used as views, but they cause problems when used as base relations.,14,Redundant(冗余) Information in Tuples and Update Anomalies(异常),One goal of schema design is to minimize the storage space that the base relations (files) occupy. Update Anomalies insertion anomal
10、ies deletion anomalies modification anomalies,15,16,Insertion Anomalies,To insert a new employee tuple into EMP_DEPT, we must include either the attribute values for the department that the employee works for, or nulls (if the employee does not work for a department as yet). It is difficult to inser
11、t a new department that has no employees as yet in the EMP_DEPT relation. The only way to do this is to place null values in the attributes for employee. This causes a problem because SSN is the primary key of EMP_DEPT, and each tuple is supposed to represent an employee entitynot a department entit
12、y.,17,Deletion Anomalies,If we delete from EMP_DEPT an employee tuple that happens to represent the last employee working for a particular department, the information concerning that department is lost from the database.,18,Modification Anomalies,In EMP_DEPT, if we change the value of one of the att
13、ributes of a particular departmentsay, the manager of department 5we must update the tuples of all employees who work in that department; otherwise, the database will become inconsistent.,19,GUIDELINE 2: Design the base relation schemas so that no insertion, deletion, or modification anomalies are p
14、resent in the relations. If any anomalies are present, note them clearly and make sure that the programs that update the database will operate correctly.,20,It is important to note that these guidelines may sometimes have to be violated in order to improve the performance of certain queries. For exa
15、mple, if an important query retrieves information concerning the department of an employee, along with employee attributes, the EMP_DEPT schema may be used as a base relation. However, the anomalies in EMP_DEPT must be noted and well understood so that, whenever the base relation is updated, we do n
16、ot end up with inconsistencies.,21,In general, it is advisable(明智的) to use anomaly-free base relations and to specify views that include the JOINs for placing together the attributes frequently referenced in important queries. This reduces the number of JOIN terms specified in the query, making it s
17、impler to write the query correctly, and in many cases it improves the performance.,22,The performance of a query specified on a view that is the JOIN of several base relations depends on how the DBMS implements the view. Many relational DBMSs materialize a frequently used view so that they do not h
18、ave to perform the JOINs often. The DBMS remains responsible for updating the materialized view (either immediately or periodically) whenever the base relations are updated.,23,Null Values in Tuples,In some schema designs we may group many attributes together into a fat relation. If many of the attr
19、ibutes do not apply to all tuples in the relation, we end up with many nulls in those tuples.,24,GUIDELINE 3: As far as possible, avoid placing attributes in a base relation whose values may frequently be null. If nulls are unavoidable, make sure that they apply in exceptional cases(极少数情况,比如SUPERSSN
20、) only and do not apply to a majority of tuples in the relation.,25,1:1和1:N联系的另一种映射方式,If only 10 percent of employees have individual offices, there is little justification(正当理由) for including an attribute OFFICE_NUMBER in the EMPLOYEE relation; rather, a relation EMP_OFFICES(ESSN, OFFICE_NUMBER) ca
21、n be created to include tuples for only the employees with individual offices.,26,Functional Dependencies,Functional dependencies (FDs) are used to specify formal measures of the goodness of relational designs FDs and keys are used to define normal forms for relations FDs are constraints that are de
22、rived from the real-world meaning and interrelationships of the data attributes,27,Functional Dependencies (continued),A set of attributes X functionally determines a set of attributes Y if the value of X determines a unique value for Y X - Y holds if whenever two tuples have the same value for X, t
23、hey must have the same value for Y For any two tuples t1 and t2 in any relation instance r(R): If t1X=t2X, then t1Y=t2Y If K is a key of R, then K functionally determines all attributes in R (since we never have two distinct tuples with t1K=t2K),28,Functional Dependencies (continued),An FD is a prop
24、erty of the attributes in the schema R The constraint must hold on every relation instance r(R),29,Examples of FD constraints,social security number determines employee name SSN - ENAME project number determines project name and location PNUMBER - PNAME, PLOCATION employee ssn and project number det
25、ermines the hours per week that the employee works on the project SSN, PNUMBER - HOURS,30,31,Normalization of Relations,Normalization: The process of decomposing unsatisfactory bad relations by breaking up their attributes into smaller relations Normal form: Condition using keys and FDs of a relatio
26、n to certify whether a relation schema is in a particular normal form The database designers need not normalize to the highest possible normal form. (usually up to 3NF, BCNF or 4NF),32,Definitions of Keys and Attributes Participating in Keys,A superkey of a relation schema R = A1, A2, ., An is a set
27、 of attributes S subset-of R with the property that no two tuples t1 and t2 in any legal relation state r of R will have t1S = t2S A key K is a superkey with the additional property that removal of any attribute from K will cause K not to be a superkey any more.,33,Definitions of Keys and Attributes
28、 Participating in Keys(continued),If a relation schema has more than one key, each is called a candidate key. One of the candidate keys is arbitrarily designated to be the primary key, and the others are called secondary keys. A Prime attribute must be a member of some candidate key A Nonprime attri
29、bute is not a prime attributethat is, it is not a member of any candidate key.,34,First Normal Form,Disallows composite attributes, multivalued attributes, and nested relations Considered to be part of the definition of relation,35,FIGURE 10.8Normalization into 1NF. (a) A relation schema that is not
30、 in 1NF. (b) Example state of relation DEPARTMENT. (c) 1NF version of same relation with redundancy.,36,Second Normal Form,Definitions: Prime attribute - attribute that is member of the primary key K Full functional dependency - a FD Y - Z where removal of any attribute from Y means the FD does not
31、hold any more Examples:- SSN, PNUMBER - HOURS is a full FD since neither SSN - HOURS nor PNUMBER - HOURS hold - SSN, PNUMBER - ENAME is not a full FD (it is called a partial dependency ) since SSN - ENAME also holds,37,Second Normal Form (continued),A relation schema R is in second normal form (2NF)
32、 if every non-prime attribute A in R is fully functionally dependent on the primary key R can be decomposed into 2NF relations via the process of 2NF normalization,38,FIGURE 10.10Normalizing into 2NF and 3NF. (a) Normalizing EMP_PROJ into 2NF relations (b) Normalizing EMP_DEPT into 3NF relations.,39,Third Normal
温馨提示
- 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
- 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
- 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
- 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
- 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
- 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
- 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。
评论
0/150
提交评论