Database course slides for second-year spring-semester review (ch)

• Failure Classification
• Storage Structure
• Recovery and Atomicity
• Log-Based Recovery
• Shadow Paging
• Recovery With Concurrent Transactions
• Buffer Management
• Failure with Loss of Nonvolatile Storage
• Advanced Recovery Techniques
• ARIES Recovery Algorithm
• Remote Backup Systems

• Transaction failure:
  • Logical errors: the transaction cannot complete due to some internal error condition.
  • System errors: the database system must terminate an active transaction due to an error condition (e.g., deadlock).
• System crash: a power failure or other hardware or software failure causes the system to crash.
  • Fail-stop assumption: the contents of non-volatile storage are assumed not to be corrupted by a system crash.
  • Database systems have numerous integrity checks to prevent corruption of disk data.
• Disk failure: a head crash or similar disk failure destroys all or part of disk storage.
  • Destruction is assumed to be detectable: disk drives use checksums to detect failures.

• Recovery algorithms are techniques to ensure database consistency and transaction atomicity and durability despite failures.
  • They are the focus of this chapter.
• Recovery algorithms have two parts:
  • Actions taken during normal transaction processing to ensure that enough information exists to recover from failures.
  • Actions taken after a failure to recover the database contents to a state that ensures atomicity, consistency and durability.

• Volatile storage:
  • does not survive system crashes
  • examples: main memory, cache memory
• Nonvolatile storage:
  • survives system crashes
  • examples: disk, tape, flash memory, non-volatile (battery-backed) RAM
• Stable storage:
  • a mythical form of storage that survives all failures
  • approximated by maintaining multiple copies on distinct nonvolatile media

• Maintain multiple copies of each block on separate disks.
  • Copies can be kept at remote sites to protect against disasters such as fire or flooding.
• Failure during data transfer can still result in inconsistent copies. A block transfer can result in:
  • Successful completion
  • Partial failure: the destination block has incorrect information
  • Total failure: the destination block was never updated
• Protecting storage media from failure during data transfer (one solution):
  • Execute the output operation as follows (assuming two copies of each block):
    • Write the information onto the first physical block.
    • When the first write successfully completes, write the same information onto the second physical block.
    • The output is completed only after the second write successfully completes.

• Protecting storage media from failure during data transfer (cont.):
  • Copies of a block may differ due to a failure during the output operation. To recover from failure:
    • First find inconsistent blocks:
      • Expensive solution: compare the two copies of every disk block.
      • Better solution: record in-progress disk writes on non-volatile storage (non-volatile RAM or a special area of disk). Use this information during recovery to find blocks that may be inconsistent, and only compare copies of these. This is used in hardware RAID systems.
    • If either copy of an inconsistent block is detected to have an error (bad checksum), overwrite it with the other copy. If both have no error but are different, overwrite the second block with the first block.

• Physical blocks are those blocks residing on the disk.
• Buffer blocks are the blocks residing temporarily in main memory.
• Block movements between disk and main memory are initiated through the following two operations:
  • input(B) transfers the physical block B to main memory.
  • output(B) transfers the buffer block B to the disk, and replaces the appropriate physical block there.
• Each transaction Ti has its private work-area in which local copies of all data items accessed and updated by it are kept.
  • Ti's local copy of a data item X is called xi.
• We assume, for simplicity, that each data item fits in, and is stored inside, a single block.

• A transaction transfers data items between system buffer blocks and its private work-area using the following operations:
  • read(X) assigns the value of data item X to the local variable xi.
  • write(X) assigns the value of local variable xi to data item X in the buffer block.
  • Both of these commands may necessitate the issue of an input(BX) instruction before the assignment, if the block BX in which X resides is not already in memory.
• Transactions:
  • perform read(X) while accessing X for the first time;
  • make all subsequent accesses to the local copy;
  • execute write(X) after the last access.
• output(BX) need not immediately follow write(X). The system can perform the output operation when it deems fit.

[Figure: data access example. Surviving labels: blocks A and B on disk; Buffer Block A and Buffer Block B in memory; input(A), output(B); read(X), write(Y); work area of T1 (x1, y1); work area of T2 (x2).]

• Modifying the database without ensuring that the transaction will commit may leave the database in an inconsistent state.
• Consider a transaction Ti that transfers $50 from account A to account B; the goal is either to perform all database modifications made by Ti or none at all.
• Several output operations may be required for Ti (to output A and B). A failure may occur after one of these modifications has been made but before all of them are made.

• To ensure atomicity despite failures, we first output information describing the modifications to stable storage without modifying the database itself.
• We study two approaches:
  • log-based recovery, and
  • shadow paging
• We assume (initially) that transactions run serially, that is, one after the other.

• A log is kept on stable storage.
  • The log is a sequence of log records, and maintains a record of update activities on the database.
• When transaction Ti starts, it registers itself by writing a <Ti start> log record.
• Before Ti executes write(X), a log record <Ti, X, V1, V2> is written, where V1 is the value of X before the write, and V2 is the value to be written to X.
• When Ti finishes its last statement, the log record <Ti commit> is written.
• We assume for now that log records are written directly to stable storage (that is, they are not buffered).
• Two approaches using logs:
  • Deferred database modification
  • Immediate database modification

• The deferred database modification scheme records all modifications to the log, but defers all the writes until after partial commit.
• Assume that transactions execute serially.
• Before a failure occurs:
  • A transaction starts by writing a <Ti start> record to the log.
  • A write(X) operation results in a log record <Ti, X, V> being written, where V is the new value for X.
    • Note: the old value is not needed for this scheme.
  • The write is not performed on X at this time, but is deferred.
  • When Ti partially commits (that is, once it has executed all of its statements), <Ti commit> is written to the log.
  • Finally, the log records are read and used to actually execute the previously deferred writes.
• After a failure occurs:
  • A transaction needs to be redone if and only if both <Ti start> and <Ti commit> are in the log.
  • Crashes can occur while
    • the transaction is executing the original updates, or
    • recovery action is being taken.
• Example transactions T0 and T1 (T0 executes before T1):

    T0: read(A)            T1: read(C)
        A := A - 50            C := C - 100
        write(A)               write(C)
        read(B)
        B := B + 50
        write(B)

• Below we show the log as it appears at three instances of time. [Log figure not preserved in this extract.]
• If the log on stable storage at the time of the crash is as in case:
  (a) No redo actions need to be taken.
  (b) redo(T0) must be performed since <T0 commit> is present.
  (c) redo(T0) must be performed followed by redo(T1) since <T0 commit> and <T1 commit> are present.

• The immediate database modification scheme allows database updates of an uncommitted transaction to be made as the writes are issued.
• Before a failure occurs:
  • The update log record must be written before the database item is written.
    • Update log records must have both the old value and the new value.
    • We assume that the log record is output directly to stable storage.
    • This can be extended to postpone log record output, so long as, prior to execution of an output(B) operation for a data block B, all log records corresponding to items in B are flushed to stable storage.
  • Output of updated blocks can take place at any time before or after transaction commit.
  • The order in which blocks are output can be different from the order in which they are written.

[Table: Log / Write / Output example, only partly preserved. Surviving entries: log record <T0, B, 2000, 2050>; writes A = 950, B = 2050, C = 600; outputs BB, BC, BA. Note: BX denotes the block containing X.]

• The recovery procedure has two operations instead of one:
  • undo(Ti) restores the value of all data items updated by Ti to their old values, going backwards from the last log record for Ti.
  • redo(Ti) sets the value of all data items updated by Ti to the new values, going forward from the first log record for Ti.
• Both operations must be idempotent. That is, even if the operation is executed multiple times, the effect is the same as if it is executed once.
• After a failure occurs, when recovering:
  • Transaction Ti needs to be undone if the log contains the record <Ti start>, but does not contain the record <Ti commit>.
  • Transaction Ti needs to be redone if the log contains both the record <Ti start> and the record <Ti commit>.
• Undo operations are performed first, then redo operations.

• Below we show the log as it appears at three instances of time. [Log figure not preserved in this extract.]
• Recovery actions in each case above are:
  (a) undo(T0): B is restored to 2000 and A to 1000.
  (b) undo(T1) and redo(T0): C is restored to 700, and then A and B are set to 950 and 2050 respectively.
  (c) redo(T0) and redo(T1): A and B are set to 950 and 2050 respectively. Then C is set to 600.

• Goal: reduce the cost of recovery after a failure.
• Problems in the recovery procedure as discussed earlier:
  • searching the entire log is time-consuming;
  • we might unnecessarily redo transactions which have already output their updates to the database.
• Before a failure occurs: streamline the recovery procedure by periodically performing checkpointing:
  • Output all log records currently residing in main memory onto stable storage.
  • Output all modified buffer blocks to the disk.
  • Write a log record <checkpoint> onto stable storage.

• During recovery we need to consider only the most recent transaction Ti that started before the checkpoint, and transactions that started after Ti.
• After a failure occurs:
  • Scan backwards from the end of the log to find the most recent <checkpoint> record.
  • Continue scanning backwards till a record <Ti start> is found.
  • Only the part of the log following that start record needs to be considered. The earlier part of the log can be ignored during recovery, and can be erased whenever desired.
  • For all transactions (starting from Ti or later) with no <Ti commit>, execute undo(Ti). (Done only in the case of immediate modification.)
  • Scanning forward in the log, for all transactions starting from Ti or later with a <Ti commit>, execute redo(Ti).

[Figure: timeline of transactions T1, T2, T3 and T4, with a checkpoint at time Tc and a system failure at time Tf.]

• T1 can be ignored (its updates were already output to disk due to the checkpoint).
• T2 and T3 are redone.
• T4 is undone.

• We modify the log-based recovery schemes to allow multiple transactions to execute concurrently.
  • All transactions share a single disk buffer and a single log.
  • A buffer block can have data items updated by one or more transactions.
• We assume concurrency control using strict two-phase locking;
  • i.e., the updates of uncommitted transactions should not be visible to other transactions.
    • Otherwise, how would we perform undo if T1 updates A, then T2 updates A and commits, and finally T1 has to abort?
• Logging is done as described earlier.
  • Log records of different transactions may be interspersed in the log.
• The checkpointing technique and the actions taken on recovery have to be changed, since several transactions may be active when a checkpoint is performed.

• Checkpoints are performed as before, except that the checkpoint log record is now of the form <checkpoint L>, where L is the list of transactions active at the time of the checkpoint.
  • We assume no updates are in progress while the checkpoint is carried out (we will relax this later).
• When the system recovers from a crash, it first does the following:
  • Initialize undo-list and redo-list to empty.
  • Scan the log backwards from the end, stopping when the first <checkpoint L> record is found. For each record found during the backward scan:
    • if the record is <Ti commit>, add Ti to redo-list;
    • if the record is <Ti start>, then if Ti is not in redo-list, add Ti to undo-list.
  • For every Ti in L, if Ti is not in redo-list, add Ti to undo-list.
• At this point undo-list consists of incomplete transactions which must be undone, and redo-list consists of finished transactions that must be redone.
• Recovery now continues as follows:
  • Scan the log backwards from the most recent record, stopping when <Ti start> records have been encountered for every Ti in undo-list.
    • During the scan, perform undo for each log record that belongs to a transaction in undo-list.
  • Locate the most recent <checkpoint L> record.
  • Scan the log forwards from the <checkpoint L> record till the end of the log.
    • During the scan, perform redo for each log record that belongs to a transaction on redo-list.
• Go over the steps of the recovery algorithm on the following log: [Log records not preserved in this extract; only the annotation "/* Scan in Step 4 stops here */" survives.]

• Log record buffering: log records are buffered in main memory, instead of being output directly to stable storage. This reduces the number of I/O operations on external devices.
  • Log records are output to stable storage when a block of log records in the buffer is full, or a log force operation is executed.
• Log force is performed to commit a transaction by forcing all its log records (including the commit record) to stable storage.
• Several log records can thus be output using a single output operation, reducing the I/O cost.

• The rules below must be followed if log records are buffered:
  • Log records are output to stable storage in the order in which they are created.
  • Transaction Ti enters the commit state only when the log record <Ti commit> has been output to stable storage.
  • Before a block of data in main memory is output to the database, all log records pertaining to data in that block must have been output to stable storage.
    • This rule is called the write-ahead logging or WAL rule.
    • Strictly speaking, WAL only requires undo information to be output.

• The database maintains an in-memory buffer of data blocks.
  • When a new block is needed, if the buffer is full, an existing block needs to be removed from the buffer.
  • If the block chosen for removal has been updated, it must be output to disk.
• As a result of the write-ahead logging rule, if a block with uncommitted updates is output to disk, log records with undo information for the updates are output to the log on stable storage first.
• No updates should be in progress on a block when it is output to disk. This can be ensured as follows:
  • Before writing a data item, the transaction acquires an exclusive lock on the block containing the data item.
  • The lock can be released once the write is completed.
    • Such locks held for a short duration are called latches.
  • Before a block is output to disk, the system acquires an exclusive latch on the block.
    • This ensures no update can be in progress on the block.

• The database buffer can be implemented either
  • in an area of real main-memory reserved for the database, or
  • in virtual memory.
• Implementing the buffer in reserved main-memory has drawbacks:
  • Memory is partitioned beforehand between the database buffer and applications, limiting flexibility.
  • Needs may change, and although the operating system knows best how memory should be divided up at any time, it cannot change the partitioning of memory.

• Database buffers are generally implemented in virtual memory in spite of some drawbacks, chiefly that it can easily cause extra disk I/O:
  • When the operating system needs to evict a page that has been modified, to make space for another page, the page is written to swap space on disk.
  • When the database decides to write a buffer page to disk, the buffer page may be in swap space, and may have to be read from swap space on disk and output to the database on disk, resulting in extra I/O.
    • This is known as the dual paging problem.
  • Ideally, when swapping out a database buffer page, the operating system should pass control to the database, which in turn outputs the page to the database instead of to swap space (making sure to output log records first).
    • Dual paging can thus be avoided, but common operating systems do not support such functionality.

• So far we assumed no loss of non-volatile storage.
• A technique similar to checkpointing is used to deal with loss of non-volatile storage:
  • Periodically dump the entire content of the database to stable storage.
  • No transaction may be active during the dump procedure; a procedure similar to checkpointing must take place:
    • Output all log records currently residing in main memory onto stable storage.
    • Output all buffer blocks onto the disk.
    • Copy the contents of the database to stable storage.
    • Output a record <dump> to the log on stable storage.
  • To recover from disk failure:
    • restore the database from the most recent dump;
    • consult the log and redo all transactions that committed after the dump.
• This can be extended to allow transactions to be active during the dump; this is known as fuzzy dump or online dump.
  • Fuzzy checkpointing will be studied later.

• Remote backup systems provide high availability by allowing transaction processing to continue even if the primary site is destroyed.
• Detection of failure: the backup site must detect when the primary site has failed.
  • To distinguish primary site failure from link failure, maintain several communication links between the primary and the remote backup.
• Transfer of control: [text truncated in the source preview]
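
Looking back at the recovery procedure for concurrent transactions (build undo-list and redo-list from the most recent checkpoint record, undo backwards, then redo forwards), the steps can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the `LogRecord` class, the `recover` function, the in-memory `db` dictionary and the sample log values are all hypothetical names and data introduced here, not part of the slides, and a log without any checkpoint is assumed to be redone from its beginning.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple


@dataclass(frozen=True)
class LogRecord:
    kind: str                     # "start", "update", "commit" or "checkpoint"
    txn: Optional[str] = None     # transaction name (None for a checkpoint)
    item: Optional[str] = None    # data item name (updates only)
    old: Optional[int] = None     # value before the write (updates only)
    new: Optional[int] = None     # value after the write (updates only)
    active: Tuple[str, ...] = ()  # L: transactions active at the checkpoint


def recover(log: List[LogRecord], db: Dict[str, int]) -> Tuple[Set[str], Set[str]]:
    """Undo incomplete transactions, then redo committed ones, modifying db in place."""
    undo_list: Set[str] = set()
    redo_list: Set[str] = set()
    checkpoint_pos: Optional[int] = None

    # Backward scan up to the most recent checkpoint to classify transactions.
    for pos in range(len(log) - 1, -1, -1):
        rec = log[pos]
        if rec.kind == "checkpoint":
            checkpoint_pos = pos
            undo_list.update(t for t in rec.active if t not in redo_list)
            break
        if rec.kind == "commit":
            redo_list.add(rec.txn)
        elif rec.kind == "start" and rec.txn not in redo_list:
            undo_list.add(rec.txn)

    # Backward scan performing undo, until every <Ti start> in undo_list is seen.
    pending = set(undo_list)
    for rec in reversed(log):
        if not pending:
            break
        if rec.txn in undo_list:
            if rec.kind == "update":
                db[rec.item] = rec.old
            elif rec.kind == "start":
                pending.discard(rec.txn)

    # Forward scan from the checkpoint performing redo.
    # Assumption: with no checkpoint in the log, redo starts at the first record.
    first = checkpoint_pos if checkpoint_pos is not None else 0
    for rec in log[first:]:
        if rec.kind == "update" and rec.txn in redo_list:
            db[rec.item] = rec.new

    return undo_list, redo_list


# Hypothetical log; the transaction names and values are illustrative only.
sample_log = [
    LogRecord("start", "T0"),
    LogRecord("update", "T0", "A", 0, 10),
    LogRecord("commit", "T0"),
    LogRecord("start", "T1"),
    LogRecord("update", "T1", "B", 0, 10),
    LogRecord("start", "T2"),
    LogRecord("update", "T2", "C", 0, 10),
    LogRecord("update", "T2", "C", 10, 20),
    LogRecord("checkpoint", active=("T1", "T2")),
    LogRecord("start", "T3"),
    LogRecord("update", "T3", "A", 10, 20),
    LogRecord("update", "T3", "D", 0, 10),
    LogRecord("commit", "T3"),
]

# Assumed disk state at the crash: every update except T3's write to D reached disk.
db = {"A": 20, "B": 10, "C": 20, "D": 0}
undo_list, redo_list = recover(sample_log, db)
print(sorted(undo_list), sorted(redo_list), db)
```

With this sample log, T1 and T2 end up in the undo list, T3 in the redo list, and the database is left as A = 20, B = 0, C = 0, D = 10. Because undo writes old values and redo writes new values rather than applying increments, running `recover` a second time leaves the database unchanged, which is the idempotence property the slides require.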

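The two-copy output protocol from the stable-storage discussion (write the first physical block, then the second, and on recovery repair whichever copy is bad or stale) can also be sketched briefly. The following is a hedged illustration only: the `MirroredBlock` class, its `crash_after_writes` parameter for simulating a crash, and the use of CRC-32 as the checksum are assumptions made for this example, and it handles a single block rather than scanning a whole disk.

```python
import zlib
from typing import Optional, Tuple

Copy = Tuple[bytes, int]  # (block contents, stored checksum)


class MirroredBlock:
    """One logical block kept as two physical copies, each with a CRC-32 checksum."""

    def __init__(self, data: bytes = b"") -> None:
        self.copies = [self._seal(data), self._seal(data)]

    @staticmethod
    def _seal(data: bytes) -> Copy:
        return (data, zlib.crc32(data))

    @staticmethod
    def _ok(copy: Copy) -> bool:
        data, checksum = copy
        return zlib.crc32(data) == checksum

    def output(self, data: bytes, crash_after_writes: Optional[int] = None) -> bool:
        """Write copy 0, then copy 1. Returns True only if both writes completed.

        crash_after_writes simulates a crash: 0 means before any write,
        1 means after the first write but before the second.
        """
        for i in range(2):
            if crash_after_writes == i:
                return False
            self.copies[i] = self._seal(data)
        return True

    def recover(self) -> bytes:
        """Make the two copies consistent again and return the block contents."""
        first, second = self.copies
        if not self._ok(first):
            # First copy has a bad checksum: overwrite it with the second.
            self.copies[0] = second
        elif not self._ok(second) or first != second:
            # Second copy is bad, or both are good but differ: first copy wins.
            self.copies[1] = first
        return self.copies[0][0]


block = MirroredBlock(b"old")
block.output(b"new", crash_after_writes=1)  # crash between the two writes
print(block.recover())
```

In this run the crash leaves the first copy holding the new contents and the second holding the old ones, so recovery copies the first over the second and the block reads back as `b'new'`. The case where both copies have bad checksums is not covered by the slides and is not handled here.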