英文文献《对分布式系统数据库的介绍》.doc_第1页
英文文献《对分布式系统数据库的介绍》.doc_第2页
英文文献《对分布式系统数据库的介绍》.doc_第3页
英文文献《对分布式系统数据库的介绍》.doc_第4页
英文文献《对分布式系统数据库的介绍》.doc_第5页
全文预览已结束

下载本文档

版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领

文档简介

An Introduction to Database Systems 1A Database Management System (DBMS) consists of a collection of interrelated data and a set of programs to access those data. A database is a collection of data organized to server many applications efficiently by centralizing the data and minimizing redundant data. The primary goal of a DBMS is to provide an environment that is both convenient and efficient to use in retrieving and storing database information.Database systems are designed to manage large bodies of information. The management of data involves both the definition of structures for the storage of information and the provision of mechanisms for the manipulation of information stored, despite system must avoid possible anomalous results.The important of information in most organizations, which determines the value of the database, has led to the development of a large body of concepts and techniques for the database, has led to the development of a large body of concepts and techniques for the efficient management of data.The typical file-processing system is supported by a conventional operating system. Permanent records are stored in various files, and different application programs are written to extract records from, and to add records to, the appropriate files. Before the advent of DBMSs, organizations typically stored information using such systems.Keeping organizational information in a file-processing system has a number of major disadvantages.Data redundancy and inconsistency. Data redundancy is the presence of duplicate data in multiple data files. Since the files and application programs are created by different programmers over a long period, the various files are likely to have different formats and the programs may be written in several programming languages. Moreover, the same information may be duplicated in several places (files). This redundancy leads to higher storage and access cost. In addition, it may lead to data inconsistency; that is, the various copies of the same data may no longer agree.Difficulty in accessing data. The point here is that conventional file-processing environments do not allow needed data to be retrieved in a convenient and efficient manner. More responsive data-retrieval systems must be developed for general use.Integrity problems. The data values stored in the database must satisfy certain types of consistency constrains. Developers enforce the constraints in the system by adding appropriate code in the various application programs. However, when new constraints are added, it is difficult to change the programs to enforce them. The problem is compounded when constraints involve several data items from different files.Atomicity problems. A computer system, like any other mechanical or electrical device, is subject to failure. In many applications, it is crucial to ensure that, once a failure has occurred and has been detected, the data are restored to the consistent state that existed prior to the failure. Consider a program to transfer $50 from account A to B. If a system failure occurs during the execution of the program, it is possible that the $50 was removed from account A but was not credited to account B , resulting in an inconsistent database state. Clearly, it is essential to database consistency that either both the credit and debit occur, or that neither occurs. That is, the funds transfer must be atomic-it must happen in its entirety or not at all. It is difficult to ensure this property in a conventional file-processing system.Concurrent-access anomalies. So that the overall performance of the system is improved and a faster response time is possible, many systems allow multiple users to update the data simultaneously. In such an environment, interaction of concurrent updates may result in inconsistent data. Consider bank account A, containing $500. If two customers withdraw funds from account A at about the same time, the result of the concurrent executions may leave the account in an incorrect state. Suppose that the programs executing on behalf of each withdrawal read the old balance, reduce that value by the amount being withdraw, and write the result back. If the two programs run concurrently, they may both read the value $500, and write back $450 and $400, respectively. Depending on which one writes the value last, the account may contain either $450 or $400, rather than the correct value of $350. To guard against this possibility, the system must maintain some form of supervision. Because data may be accessed by many different application programs that have not been coordinated previously, however, supervision is difficult to provide.Security problems. Not every user of the database system should be able to access all the data. For example, in a banking system, payroll personnel need to see only that part of the database that has information about the various bank employees. They do not need access to information about customer accounts. Since application programs are added to the system in an ad hoc manner, it is difficult to enforce such security constraints.These difficulties, among others, have prompted the development of DBMSs. In what follows, we shall see the concepts and algorithms that have been developed for database systems to solve the problems mentioned. A typical data-processing application stores a large number of records, each of which is fairly simple and small.A DBMS is a collection of interrelated files and a set of programs that allow users to access and modify these files. A major purpose of a database system is to provide users with an abstract view of the data. That is, the system hides certain details of how the data are stored and maintained.For the system to be usable, it must retrieve data efficiently. This concern has led to the design of complexity from users through several levels of abstraction, to simplify users interactions with the system:A: Physical levels. The lowest level of abstraction describes how the data are actually stored. At the physical level, complex low-level data structures are described in detail.B: Logical level. The next-higher level of abstraction describes what data are stored in database, and what relationships exist among those data. The entire database is thus described in terms of a small number of relatively simple structures. Although implementation of the simple structures at the logical level may involve complex physical-level structures, the user of the logical level does not need to be aware of this complexity. The logical level of abstraction is used by database administrators, who must decide what information is to be kept in the database.C: View level. The highest level of abstraction describes only part of the entire database. Despite the use of simpler structures at the logical level, some complexity remains, because of the large size of the database. Many users of the database system will not be concerned with all this information. Instead, such users need to access only a part of the database. So that their interaction with the system is simplified, the view level of abstraction is defined. The system may provide many views for the same database.The ability to modify a schema definition in one level without affecting a schema definition in the next higher level is called data independence. There are two levels of data independence:Physical data independence is the ability to modify the physical schema without causing application programs to be rewritten. Modifications at the physical level are occasionally necessary to improve performance.Logical data independence is the ability to modify the logical schema without causing application programs to be rewritten. Modifications at the logical level are necessary whenever the logical structure of the database is altered.Logical data independence is more difficult to achieve than physical data independence, since application programs are heavily dependent on the logical structure of the data that they access.The concept of data independence is similar in many respects to the concept of abstract data types in modem programming languages. Both hide implementation details from the users, to allow users to concentrate on the general structure, rather than on low-level implementation details.The structured query language (SQL) is the most widely used and standard query language for relational database management systems. It is a kind of non-procedural language. The SQL language has several parts:Data-definition language (DDL). The SQL DDL provides commands for defining relation schemas, deleting relations, creating indices, and modifying relation schemas.Interactive data-manipulation language (DML). The SQL DML include a query language based on both the relational algebra and the tuple relational calculus. It includes also commands to insert tuples into, delete tuples from, and modify tuples in the database.Embedded DML. The embedded form of SQL is designed for use within general-purpose programming languages.View definition. The SQL DDL include commands for defining views.Authorization. The SQL DDL includes commands for spe

温馨提示

  • 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
  • 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
  • 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
  • 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
  • 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
  • 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
  • 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。

评论

0/150

提交评论