分布式数据库设计(共82张PPT)

上传人：C*** IP属地：湖南上传时间：2023-11-06 格式：PPTX 页数：82 大小：4.11MB 积分：50 举报 版权申诉

已阅读5页，还剩77页未读，继续免费阅读

版权说明：本文档由用户提供并上传，收益归属内容提供方，若内容存在侵权，请进行举报或认领

文档简介

分布式数据库设计A

FRAMEWORK

FOR

DISTRIBUTEDDATABASE

DESIGN

(概述)THE

DESIGN

DATABASEFRAGMENTATION

(分片设计)THE

ALLOCATION

FRAGMENTS

(分配设计)University

Shanghai

for

Science

and

TechnologyDistributed

DBMSPage

2.1Level

sharing

共享维

不共享，数据共享，数据+程序共享Access

pattern

behavior访问模式维

静态模式，动态

模式(分布式数据库设计与查询处理)Level

knowledge

访问模式知识维用户完全已知或部

分已知访问模式分布式系统设计的三维Distributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.2存取模式动态静态部分信息数据知识级别数据+程序完整信息共享Dimensions

the

ProblemUniversity

Shanghai

for

Science

and

TechnologyDistributed

DBMSPage

2.3集中式数据库设计1.

Designing

the

"conceptual

schema"which

describes

the

integrated

database2.

Designing

the

"physical

database,"i.e.,

mapping

the

conceptual

schema

storage

areas

and

determining

appropriate

accessmethods.University

Shanghai

for

Science

and

TechnologyDistributed

DBMSPage

2.4分布式数据库设计的特殊要求+3.Designing

the

fragmentation.+4.

Designing

the

allocation

fragments,i.e.mapped

physical

images;

also

the

replication

fragments

determined.University

Shanghai

for

Science

and

TechnologyDistributed

DBMSPage

2.5关于分片和分配的几点注意Fragmentation

design

been

partially

analyzed

centralized

systemswith

multiple

storage

devices.The

allocation

problem

has

been

studied

the

"file

allocation

problem."The

distinction

between

two

problems

conceptually

relevantone

deals

with

the

"logical

criteria"

which

motivate

thefragmentation

global

relationone

deals

with

the

"physical"

placement

data

the

various

sites.这两个问题通常是相互关联的，不可能独立地解决它们而

能确定最优的fragmentaion和allocationUniversity

Shanghai

for

Science

and

TechnologyDistributed

DBMSPage

2.6关于APPLICATION考虑因素：分布式数据库设计包括：分布式数据库设计和相应的分布式应用设计1.

The

site

from

which

the

application

issued

(site

origin

theapplication).2.

The

frequency

activation

the

application

(i.e.,在单位时间内被激活

的次数);

applications

which

can

issued

multiple

sites,

need

know

the

frequency

ofactivation

each

application

each

site.3.

The

number,

type,

and

the

statistical

distribution

accesses

made

byeach

application

each

required

data

"object."University

Shanghai

for

Science

and

TechnologyDistributed

DBMSPage

2.7设计目标

(Objectives)Processing

locality数据处理的本地性Availability

and

reliability

distributed

data

分

布式数据的有效性和可靠性冗余控制Workload

distribution

工作负荷的合理分布Storage

costs

and

availability存储能力和费用Distributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.8Processing

localityMaximizeprocessing

locality

corresponds

the

simple

principle

ofplacing

data

possible

the

applications

which

usethem.Maximizing

processing

locality

(minimizing

remote

references)

canbe

done

adding

the

number

local

and

remote

referencescorresponding

each

candidate

fragmentation

and

fragment,

allocation,

and

selecting

the

best

solution

among

them.The

advantage

complete

locality

not

only

the

reduction

remoteaccesses,

but

also

the

increased

simplicity

controlling

the

executionof

theapplication.Distributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.9Availability

and

reliability

distributed

dataA

high

degree

availability

for

read-only

applications

isachieved

storing

multiple

copies

the

same

information;the

system

must

able

switch

alternative

copy

when

the

one

that

should

accessed

under

normal

conditions

not

available.Reliability

also

achieved

storing

multiple

copies

the

same

information

possible

recover

from

crashes

orfrom

the

physical

destruction

one

the

copies

using

the

other

still

available

copies.Distributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.10Workload

distributionAn

important

feature

distributed

computersystems.To

take

advantage

the

different

powers

utilizations

computers

each

site,Maximize

the

degree

parallelism

execution

applications.workload

distribution

might

negatively

affectprocessing

locality

consider

the

trade-offDistributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.11Storage

costs

and

availabilityShould

reflect

the

cost

and

availability

storage

the

different

sites.

possible

have

specialized

sites

the

network

for

data

storage,

orconversely

have

sites

which

not

support

mass

storage

all.通常存储的费用并不是非常重要(Compared

CPU,I/O,

Transmission

network).Distributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.12设计方法Top-Down

Approach自顶向下Bottom-Up

Approach自底向上Distributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.13Top-down

approach已

有DB..如何分割数据及如何分配这些数据到不同站点过程start

designing

the

global

schemadesigning

the

fragmentation

the

databasethen

allocating

the

fragments

the

sites,

creating

the

physicalimagesThe

approach

completed

performing,

each

site,

the

"physical

design"

the

data

which

are

allocated

it.Distributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.14AccessInformationDistributionDesignObjectivesUser

InputView

IntegrationTop-Down

DesignRequirements

AnalysisConceptual

DesignPhysical

DesignView

DesignUser

InputLCS'sLIS'sGCSES's特点能先看到雏形问题：

When

the

distributed

database

developed

the

aggregation

ofexisting

databases,

not

easy

the

top-down

approach.

The

global

schema

often

produced

compromise

between

existing

data

descriptions.Distributed

DBMSUniversity

Shanghai

for

Science

and

Technology

Page

2.16Bottom-up

approachExisting

databases

are

aggregated(还可能是异构heterogeneous

或完全自治autonomous),

无设计问题(信息集成)!Based

the

integration

existing

schemata

into

single,

global

schema.By

integration,

the

merging

common

datadefinitions

and

the

resolution

conflicts

among

different

representations

given

the

same

data.University

Shanghai

for

Science

and

TechnologyDistributed

DBMSPage

2.17bottom-up

approachHorizontal

fragments

same

global

relation

must

have

thesame

relation

schema

easily

enforced

top-downdesign,

while

difficult

"discover"

it.

The

integrationprocess

should

attempt

modify

the

definitions

localrelations,

that

they

can

regarded

horizontal

fragments

common,

global

relation.Distributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.18The

selection

common

database

modelfor

describing

the

global

schema

thedatabase.The

translation

each

local

schema

into

thecommon

data

model.The

integration

the

local

schemata

into

acommon

global

schema.bottom-up

design

requires(异构情况下)Distributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.19DDB

设计的两个问题FragmentaionHorizontal

FragmentationVertical

fragmentationAllocation通常分片设计和分配设计需要统筹考

虑Distributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.20Horizontal

FragmentationPrimary

fragmentation

初级分片Derived

horizontal

fragmentation导出

分片Distributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.21水平分片原则若

R⇔F={F₁

,F₂

,…,Fn},则完整性

对于每一个元组t

∈R,3F₁

∈F

使得t

∈F不相交性

对Vt

∈F,-3Fj

使得

∈Fj,

i≠j可重构性操作是并し(可以忽略，因为完整性就蕴含着)R=U{F₁

,F₂

,..,

Fn}Distributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.22水平分片一例例子EMP(E#,

NAME,

DEPT,

JOB,

SAL,TEL,..)DEPT={1,2}

JOB={‘P',

‘-P’}假定，应用经常查询的内容是属于部门1且是程序员的职员。

(80/20原则)则可能有的水平分片限定

(Qualification)P={

DEPT=1}P={DEPT=1,JOB=‘P’}P={DEPT=1,JOB=‘P’,SAL>500}Distributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.23"手工"检查!e.g.,

F1=Oloc='Sa’E;F₂

=Oloc=‘Sb'E生成具有满足分段原则的predicate谓词Distributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.24如何保证分片原则谓词：用来执行分片选择操作的条件1.A

simple

predicate

简单谓词：Attribute

value

eg.:

DEPT=12.A

minterm

predicate

(

小项谓词)

给定简单谓词集P={p₁,p₂

…pn},y=△piepP₁*

也既是P₁*AP₂*A..APn*where(p*=p;or

p₁*=NOT

p;)and

y≠false3.A

fragment

the

set

all

tuples

for

which

minterm

predicate

holds.Distributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.25一些定义谓词生成过程找到常用的AP查询的simple

predicate

(A,θ

Value)

诸如：A<10,A>5,Loc

=Sa,Loc

=Sb生

成

小

项

谓

词消除可能出现的无用谓词Distributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.26Global

relation

EMP(EMPNUM,

NAME,

SAL,

TAX,

MGRNUM,

DEPTNUM)Assume:

some

important

APs

require

information

about

employeeswho

are

members

department;

other

important

APs

which

require

onlythe

data

employees

who

areprogrammers;

these

last

APs

can

issued

any

site,

and

reference

all

programmers

with

the

same

probability.Assume:

that

there

are

only

two

departments,

and

thus,DEPT=1→DEPT≠

2,and

vice

versa.Two

simple

predicates

are

DEPT=1

and

JOB="P"(programmer).

The

minterm

predicates

for

these

two

predicates

areDEPT=1

AND

JOB="P"DEPT=1

AND

JOB≠"P"DEPT≠1

AND

JOB="P"DEPT≠1

AND

JOB

≠"P"Distributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.27ExampleAll

the

above

simple

predicates

are

relevante.g.

SAL>50

not

relevant

predicate;Distributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.28讨论LetP={pi,p₂,...,pn}

set

ofsimple

predicates.为了正确有效进行分片，则P必须是completeand

minimal1.P

predicates

complete

any

two

tuplesbelonging

the

same

fragment

are

referencedwith

the

same

probability

any

application.2.P

minimal

all

its

predicates

are

relevant.Distributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.29complete

and

minimalP₁={DEPT

=1}is

not

complete

-the

applicationsreference

tuples

programmers

with

greaterprobability

within

each

fragment

produced

P₁

.P₂={DEPT=1,JOB="P"}is

complete

and

minimal.P₃={DEPT=1,JOB="P",SAL>50}iscomplete

butnot

minimal,

since

SAL>50

not

relevant.Distributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.30ExampleFragmentation

MethodBasis

Consider

predicate

which

partitions

the

tuples

into

two

parts

which

are

referenced

differently

leastone

application.

Let

={p₁}MethodConsider

new

simple

predicate

which

partitions

least

onefragment

into

two

parts

which

are

referenced

different

way

least

one

application.Set

P—Pup;.

Eliminate

nonrelevant

predicates

from

P.Repeat

this

step

until

the

set

the

minterm

fragments

iscomplete.Distributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.31Consider:

SAL>50:

programmers

have

average

salary

greater

than

50,

itdetermines

two

sets

employees

who

are

referenced

differently

theapplications.P1={SAL>50}Consider:

DEPT

=1;

thispredicate

relevant

and

added

the

one,

P2={

SAL>50,DEPT=1}.Consider:

JOB="P".

The

predicate

relevant,set

P3={SAL>50,DEPT=1,JOB="P"}.then

SAL>

not

relevant

P3,

thus,

the

final

set

P4={DEPT=1,JOB=

"P"},

which

complete

and

minimal.Distributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.32ExampleA

"reasonable"

way1.Concentrating

few

importantapplications2.Not

distinguishing

fragments

whosefeatures

are

very

similarDistributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.331.Administrative

applications,

issued

only

sites

and

3;administrative

applications

about

departments

the

northernarea

are

issued

site

those

about

departments

the

southern

area

are

issued

site

3.2.Applications

about

work

conducted

each

department;

theycan

issued

any

department,

but

they

reference

tuples

the

departments

which

are

closer

their

site

origin

with

higher

probability

than

the

tuples

other

departments.DEPT

(DEPTNUM,NAME,AREA,MGRNUM)

importantapplications:Distributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.34P1:DEPTNUM<10P2:10<DEPTNUM<20P3:DEPTNUM

>20P4:AREA

="North"P5:AREA

="South"Distributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.35Set

predicates:Y1:DEPTNUM<10Y2:

DEPTNUM<10Y3:10<DEPTNUM<20Y4:10<DEPTNUM<20Y5:

DEPTNUM>10Y6:

DEPTNUM>10AREA="North"AREA="South“AREA="North"

AREA="South"AREA

="North"AREA="South"andandandandandandDistributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.36可能的谓词限定Reduce,

eg.AREA="North"implies

that

DEPNUM>20y1:DEPTNUM≤10y2:(10<DEPTNUM≤20)AND

(AREA

="North")y3:(10<DEPTNUM≤20)AND

(AREA

"South")y4:DEPTNUM>20Distributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.37Derived

Horizontal

Fragmentation导出分片DHF:

从另一个关系的属性性质或水平分片

推导出来采

用DHF

可以使分片之间的join操作更加容易Distributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.38DHF

分片exampleeg:SC(S#,C#,GRADE)S(S#,SNAME.

AGE,

SEX)分段设计Define

fragment

SC1

asSelect

SC.S#,C#,GRADE

From

SC,SWhere

SC.S#=S.S#

and

SEX=‘M'Define

fragment

SC2

asSelect

SC.S#,C#,GRADE

From

SC,SWhere

SC.S#=S.S#

and

SEX=‘F'Distributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.39distributed

joinjoin

graphsTotalSimplepartitioned分布式数据库中的join

连接操作Distributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.40圆圈：数据分片无向边：两个分片之间有相同

属性值的元组存在Distributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.41连接图定义R

SJoin

graphA

join

graph

total

when

contains

all

possible

edgesbetween

fragments

and

S;完全连接图定义Distributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.42R

STotal

Join

graphA

reduced

join

graph

partitioned

the

graph

composedof

two

moresubgraphs

without

edges

between

themDistributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.43部分连接图定义Partitioned

Join

graphR

reduced

join

graph

Sis

simple

ispartitioned

and

eachsubgraph

has

justone

edgeDistributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.44简单连接图定义Simple

Join

graphSUPPLY

(SNUM,PNUM,

DEPTNUM,

QUAN)

SUPPLY

always

used

together

with

another

relationSome

applications

require

information

about

supplies

given

suppliers-join

SUPPLY

and

SUPPLIER

the

SNUM

attribute.

The

other

applications

require

information

about

supplies

given

department-join

SUPPLY

and

DEPT

the

DEPTNUMattribute.Distributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.45General

example

(continued)DEPT

horizontally

fragmented

according

values

taken

the

attribute

DEPTNUMSUPPLIER

horizontally

fragmented

according

values

taken

the

attribute

SNUM.There

are

two

possible

derived

fragmentationsSUPPLYone

through

the

semi-join

with

SUPPLIER

SNUMone

through

the

semi-join

with

DEPT

DEPTNUMboth

them

are

correct.The

selection

between

these

alternatives

should

take

intoaccount

which

one

the

two

corresponding

joins

used

applications.Distributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.46Vertical

FragmentationVertical

Clusteringobjective:将某个AP

频繁使用的属性聚

集在一起，当有多个APs

有时候需要权

衡利弊。Distributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.47Vertical

Fragmentation为一全局关系R

进行分片是不容易的，因为随着R

的

属性数目增加，可能的分片数目也大幅度增加

(the

number

possible

clusters

even

larger.)两种启发式方法(heuristic

approaches)The

split

approach

which

global

relations

are

progressively

splitinto

fragments分裂法The

grouping

approach

which

attributes

are

progressively

aggregated

constitute

fragments成组法Distributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.48EMP(EMPNUM,NAME,

SAL,TAX,MGRNUM,DEPTNUM)APP1

、Administrative

applications,

concentrated

site

requiring

NAME,

SAL,

and

TAX

employees.APP2

、Applications

about

work

conducted

each

department,requiring

NAME,

MGRNUM,

and

DEPTNUM

employees;

these

applications

are

issued

all

sites,

and

reference

tuples

employees

the

same

group

departments

with

percent

probability.Distributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.49General

example

(continued)EMPI(EMPNUM,NAME,TAX,

SAL)EMP2(EMPNUM,

NAME,

MGRNUM,DEPTNUM)Distributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.50结果the

simplest

ways:1.

Applying

horizontal

fragmentation

vertical

fragments2.Applying

vertical

fragmentation

horizontal

fragmentsDistributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.51Mixed

Fragmentationnonredundant

allocation(

easier)The

simplest

method

"best-fit"

(最佳适应

)

approach;

measure

associated

with

eachpossible

allocation,

and

the

site

with

the

best

measure

selected.redundant

allocationReplication

introduces

further

complexity,例如复

制程度，如何检索和更新等Distributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.52THE

ALLOCATION

FRAGMENTS讨论在进行redundant

allocation冗余分配时，通常先求nonredundant

allocation非冗余分配的解，在此基础上再求

redundant

allocation冗余分配的解The

"additional

replication"

method

typical

heuristicapproach;

with

this

method,

possible

take

intoaccount

that

the

increase

the

degree

redundancy

isDistributed

DBMS

University

Shanghai

for

Science

and

Technology

Page

2.53progressively

less

beneficial.Two

methods

(for

reduntant

allocation):1.

Determine

the

set

all

sites

where

the

benefit

allocatingone

copy

the

fragment

higher

than

the

cost,

人人文库> 全部分类> 应用文书 > 事务文书

温馨提示

1. 本站所有资源如无特殊说明，都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
2. 本站的文档不包含任何第三方提供的附件图纸等，如果需要附件，请联系上传者。文件的所有权益归上传用户所有。
3. 本站RAR压缩包中若带图纸，网页内容里面会有图纸预览，若没有图纸预览就没有图纸。
4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
5. 人人文库网仅提供信息存储空间，仅对用户上传内容的表现方式做保护处理，对用户上传分享的文档内容本身不做任何修改或编辑，并不能对任何下载内容负责。
6. 下载文件中如有侵权或不适当内容，请与我们联系，我们立即纠正。
7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。

分布式数据库设计(共82张PPT)

文档简介

温馨提示

最新文档

评论

分布式数据库设计(共82张PPT)

文档简介

温馨提示

最新文档

评论

相关文档