Hebei University of Engineering
Graduation Thesis (Design)
Photocopy of the Original English Reference and Its Translation

Thesis title: Design and Implementation of the Honghai Seed Industry (鸿海种业) Warehouse Management System
Author:
Major and class: Information Management 1001 (信管1001)
Student ID:
Supervisor:
Thesis date: 2014.04.10

Data Warehouse

Data warehousing provides architectures and tools for business executives to systematically organize, understand, and use their data to make strategic decisions. A large number of organizations have found that data warehouse systems are valuable tools in today's competitive, fast-evolving world. In the last several years, many firms have spent millions of dollars in building enterprise-wide data warehouses. Many people feel that with competition mounting in every industry, data warehousing is the latest must-have marketing weapon: a way to keep customers by learning more about their needs.

"So," you may ask, full of intrigue, "what exactly is a data warehouse?"

Data warehouses have been defined in many ways, making it difficult to formulate a rigorous definition. Loosely speaking, a data warehouse refers to a database that is maintained separately from an organization's operational databases. Data warehouse systems allow for the integration of a variety of application systems. They support information processing by providing a solid platform of consolidated, historical data for analysis.

According to W. H. Inmon, a leading architect in the construction of data warehouse systems, "a data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of data in support of management's decision making process." This short but comprehensive definition presents the major features of a data warehouse. The four keywords, subject-oriented, integrated, time-variant, and nonvolatile, distinguish data warehouses from other data repository systems, such as relational database systems, transaction processing systems, and file systems. Let's take a closer look at each of these key features.

(1) Subject-oriented: A data warehouse is organized around major subjects, such as customer, vendor, product, and sales. Rather than concentrating on the day-to-day operations and transaction processing of an organization, a data warehouse focuses on the modeling and analysis of data for decision makers. Hence, data warehouses typically provide a simple and concise view around particular subject issues by excluding data that are not useful in the decision support process.

(2) Integrated: A data warehouse is usually constructed by integrating multiple heterogeneous sources, such as relational databases, flat files, and on-line transaction records. Data cleaning and data integration techniques are applied to ensure consistency in naming conventions, encoding structures, attribute measures, and so on.

(3) Time-variant: Data are stored to provide information from a historical perspective (e.g., the past 5-10 years).
Every key structure in the data warehouse contains, either implicitly or explicitly, an element of time.

(4) Nonvolatile: A data warehouse is always a physically separate store of data transformed from the application data found in the operational environment. Due to this separation, a data warehouse does not require transaction processing, recovery, and concurrency control mechanisms. It usually requires only two operations in data accessing: initial loading of data and access of data.

In sum, a data warehouse is a semantically consistent data store that serves as a physical implementation of a decision support data model and stores the information an enterprise needs to make strategic decisions. A data warehouse is also often viewed as an architecture, constructed by integrating data from multiple heterogeneous sources to support structured and/or ad hoc queries, analytical reporting, and decision making.

"OK," you now ask, "what, then, is data warehousing?"

Based on the above, we view data warehousing as the process of constructing and using data warehouses. The construction of a data warehouse requires data integration, data cleaning, and data consolidation. The utilization of a data warehouse often necessitates a collection of decision support technologies. This allows "knowledge workers" (e.g., managers, analysts, and executives) to use the warehouse to quickly and conveniently obtain an overview of the data, and to make sound decisions based on information in the warehouse. Some authors use the term "data warehousing" to refer only to the process of data warehouse construction, while the term "warehouse DBMS" is used to refer to the management and utilization of data warehouses. We will not make this distinction here.

"How are organizations using the information from data warehouses?" Many organizations are using this information to support business decision making activities, including:

(1) increasing customer focus, which includes the analysis of customer buying patterns (such as buying preference, buying time, budget cycles, and appetites for spending),
(2) repositioning products and managing product portfolios by comparing the performance of sales by quarter, by year, and by geographic region, in order to fine-tune production strategies,
(3) analyzing operations and looking for sources of profit,
(4) managing customer relationships, making environmental corrections, and managing the cost of corporate assets.

Data warehousing is also very useful from the point of view of heterogeneous database integration. Many organizations typically collect diverse kinds of data and maintain large databases from multiple, heterogeneous, autonomous, and distributed information sources. To integrate such data and provide easy and efficient access to it is highly desirable, yet challenging. Much effort has been spent in the database industry and research community towards achieving this goal.

The traditional database approach to heterogeneous database integration is to build wrappers and integrators (or mediators) on top of multiple, heterogeneous databases. A variety of data joiner and data blade products belong to this category, such as IBM's Data Joiner and Informix's DataBlade. When a query is posed to a client site, a metadata dictionary is used to translate the query into queries appropriate for the individual heterogeneous sites involved. These queries are then mapped and sent to local query processors. The results returned from the different sites are integrated into a global answer set. This query-driven approach requires complex information filtering and integration processes, and competes for resources with processing at local sources. It is inefficient and potentially expensive for frequent queries, especially for queries requiring aggregations.

Data warehousing provides an interesting alternative to the traditional approach of heterogeneous database integration described above. Rather than using a query-driven approach, data warehousing employs an update-driven approach in which information from multiple, heterogeneous sources is integrated in advance and stored in a warehouse for direct querying and analysis. Unlike on-line transaction processing databases, data warehouses do not contain the most current information. However, a data warehouse brings high performance to the integrated heterogeneous database system, since data are copied, preprocessed, integrated, annotated, summarized, and restructured into one semantic data store. Furthermore, query processing in data warehouses does not interfere with the processing at local sources. Moreover, data warehouses can store and integrate historical information and support complex multidimensional queries. As a result, data warehousing has become very popular in industry.

1. Differences between operational database systems and data warehouses

Since most people are familiar with commercial relational database systems, it is easy to understand what a data warehouse is by comparing these two kinds of systems.

The major task of on-line operational database systems is to perform on-line transaction and query processing. These systems are called on-line transaction processing (OLTP) systems. They cover most of the day-to-day operations of an organization, such as purchasing, inventory, manufacturing, banking, payroll, registration, and accounting. Data warehouse systems, on the other hand, serve users or "knowledge workers" in the role of data analysis and decision making. Such systems can organize and present data in various formats in order to accommodate the diverse needs of different users. These systems are known as on-line analytical processing (OLAP) systems.

The major distinguishing features between OLTP and OLAP are summarized as follows.

(1) Users and system orientation: An OLTP system is customer-oriented and is used for transaction and query processing by clerks, clients, and information technology professionals. An OLAP system is market-oriented and is used for data analysis by knowledge workers, including managers, executives, and analysts.

(2) Data contents: An OLTP system manages current data that, typically, are too detailed to be easily used for decision making. An OLAP system manages large amounts of historical data, provides facilities for summarization and aggregation, and stores and manages information at different levels of granularity. These features make the data easier to use in informed decision making.

(3) Database design: An OLTP system usually adopts an entity-relationship (ER) data model and an application-oriented database design. An OLAP system typically adopts either a star or snowflake model and a subject-oriented database design.

(4) View: An OLTP system focuses mainly on the current data within an enterprise or department, without referring to historical data or data in different organizations. In contrast, an OLAP system often spans multiple versions of a database schema, due to the evolutionary process of an organization. OLAP systems also deal with information that originates from different organizations, integrating information from many data stores. Because of their huge volume, OLAP data are stored on multiple storage media.

(5) Access patterns: The access patterns of an OLTP system consist mainly of short, atomic transactions. Such a system requires concurrency control and recovery mechanisms. However, accesses to OLAP systems are mostly read-only operations (since most data warehouses store historical rather than up-to-date information), although many could be complex queries.

Other features that distinguish OLTP from OLAP systems include database size, frequency of operations, and performance metrics.

2. But, why have a separate data warehouse?

"Since operational databases store huge amounts of data," you observe, "why not perform on-line analytical processing directly on such databases instead of spending additional time and resources to construct a separate data warehouse?"

A major reason for such a separation is to help promote the high performance of both systems. An operational database is designed and tuned from known tasks and workloads, such as indexing and hashing using primary keys, searching for particular records, and optimizing "canned" queries. On the other hand, data warehouse queries are often complex. They involve the computation of large groups of data at summarized levels, and may require the use of special data organization, access, and implementation methods based on multidimensional views. Processing OLAP queries in operational databases would substantially degrade the performance of operational tasks.

Moreover, an operational database supports the concurrent processing of several transactions. Concurrency control and recovery mechanisms, such as locking and logging, are required to ensure the consistency and robustness of transactions. An OLAP query often needs only read-only access to data records for summarization and aggregation. Concurrency control and recovery mechanisms, if applied to such OLAP operations, may jeopardize the execution of concurrent transactions and thus substantially reduce the throughput of an OLTP system.

Finally, the separation of operational databases from data warehouses is based on the different structures, contents, and uses of the data in these two systems. Decision support requires historical data, whereas operational databases do not typically maintain historical data. In this context, the data in operational databases, though abundant, is usually far from complete for decision making. Decision support requires consolidation (such as aggregation and summarization) of data from heterogeneous sources, resulting in high quality, cleansed, and integrated data. In contrast, operational databases contain only detailed raw data, such as transactions, which need to be consolidated before analysis. Since the two systems provide quite different functionalities and require different kinds of data, it is necessary to maintain separate databases.
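
The separation argued for above can be made concrete with a small sketch. The text itself contains no code, so everything below is an illustrative assumption: the table names (`orders`, `dim_product`, `dim_time`, `fact_sales`), the function names, and the use of Python's standard `sqlite3` module with two in-memory databases standing in for an operational store and a warehouse. It shows a short atomic OLTP-style write, an update-driven load that integrates the data in advance into a tiny star schema, and a read-only OLAP-style aggregate that never touches the operational store.

```python
import sqlite3


def make_operational_db():
    """Operational (OLTP) store: detailed, current order records (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE orders ("
        "order_id INTEGER PRIMARY KEY, product TEXT, order_date TEXT, amount REAL)"
    )
    return db


def record_order(db, product, order_date, amount):
    """OLTP-style access: one short, atomic transaction that writes a single record."""
    with db:  # commits on success, rolls back if an exception is raised
        db.execute(
            "INSERT INTO orders (product, order_date, amount) VALUES (?, ?, ?)",
            (product, order_date, amount),
        )


def make_warehouse():
    """Warehouse store: a minimal star schema (one fact table, two dimension tables)."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE dim_time (time_id INTEGER PRIMARY KEY, year INTEGER,
                               quarter INTEGER, UNIQUE (year, quarter));
        CREATE TABLE fact_sales (product_id INTEGER, time_id INTEGER, amount REAL);
    """)
    return db


def load_warehouse(operational, warehouse):
    """Update-driven integration: copy and restructure data ahead of query time.

    This is a one-shot load for illustration; calling it twice would duplicate facts.
    """
    rows = operational.execute(
        "SELECT product, order_date, amount FROM orders"
    ).fetchall()
    with warehouse:
        for product, order_date, amount in rows:
            # order_date is assumed to be an ISO string such as "2013-05-03".
            year, month = int(order_date[:4]), int(order_date[5:7])
            quarter = (month - 1) // 3 + 1
            warehouse.execute(
                "INSERT OR IGNORE INTO dim_product (name) VALUES (?)", (product,)
            )
            warehouse.execute(
                "INSERT OR IGNORE INTO dim_time (year, quarter) VALUES (?, ?)",
                (year, quarter),
            )
            product_id = warehouse.execute(
                "SELECT product_id FROM dim_product WHERE name = ?", (product,)
            ).fetchone()[0]
            time_id = warehouse.execute(
                "SELECT time_id FROM dim_time WHERE year = ? AND quarter = ?",
                (year, quarter),
            ).fetchone()[0]
            warehouse.execute(
                "INSERT INTO fact_sales VALUES (?, ?, ?)", (product_id, time_id, amount)
            )


def sales_by_product_and_quarter(warehouse):
    """OLAP-style access: a read-only aggregate over the fact table and its dimensions."""
    return warehouse.execute("""
        SELECT p.name, t.year, t.quarter, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        JOIN dim_time AS t ON t.time_id = f.time_id
        GROUP BY p.name, t.year, t.quarter
        ORDER BY p.name, t.year, t.quarter
    """).fetchall()


if __name__ == "__main__":
    operational = make_operational_db()
    record_order(operational, "corn seed", "2013-01-15", 100.0)
    record_order(operational, "corn seed", "2013-02-20", 50.0)
    record_order(operational, "wheat seed", "2013-05-03", 80.0)

    warehouse = make_warehouse()
    load_warehouse(operational, warehouse)
    for row in sales_by_product_and_quarter(warehouse):
        print(row)
```

Keeping the two connections separate mirrors the point of the text: the aggregate query runs only against the warehouse, so it needs no locking against concurrent order entry, while the operational table stays small, detailed, and tuned for single-record writes.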