完整word版,高级数据库技术问答题(附答案)_第1页
完整word版,高级数据库技术问答题(附答案)_第2页
完整word版,高级数据库技术问答题(附答案)_第3页
完整word版,高级数据库技术问答题(附答案)_第4页
完整word版,高级数据库技术问答题(附答案)_第5页
已阅读5页,还剩2页未读 继续免费阅读

下载本文档

版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领

文档简介

DatabaseSystem

Problem1

Thehashjoinalgorithmdescribedinthetextbookcomputesthenaturaljoinoftworelations.Writepseudocodeforaniteratorthatimplementsahashbasedalgorithmforcomputingan“anti-join”operation:Ranti-joinScontainsalltuplesinRthathavenomatchingtuplesinS.Althoughanti-joinisnotanoperationthatcanbespecifieddirectlyinSQL,itisimplementedinthequeryprocessortosupportsometypesofqueries.Yourpseudocodemustdefinethestandarditeratorfunctionsopen(),next(),andclose(),andmustfetchtheinputtuplesusingiteratorcalls(inotherwords,RandSmaynotbebaserelations,butmaybeoutputsofotheroperators).Showwhatstateinformationtheiteratormustmaintainbetweencalls.Forsimplicity,youcanassumethatSissmallandeasilyfitsinmemory.Herearemoredetailsonanti-join:

/wiki/Relational_algebra#Antijoin

Problem2

ConsideranindexorganizedasaB+tree.TheleafnodescontainpointerstoatotalofNrecords,andeachblockthatmakesuptheindexhasmpointers.Wewishtochoosethevalueofmthatwillminimizesearchtimesonaparticulardiskdevice.

Wearegiventhefollowinginformation:

(a)Forthediskthatwillholdtheindex,thetimetoreadagivenblockintomemorycanbeapproximatedby(25+.02*m)milliseconds.The25millisecondsrepresenttheseekandlatencycomponentsoftheread,the.02*mmillisecondsisthetransfertime.(Thatis,asmbecomeslarger,thelargertheblockwillbeandthemoretimeitwilltaketoreaditintomemory.)

(b)Oncetheblockisinmemoryabinarysearchisusedtofindtherightpointer.Sothetimetoprocessablockinmainmemoryis(a+blog_2m)milliseconds,forsomeconstantsa,b.

("log_2m"meanslogbase2ofm.)

(c)Themainmemorytimeconstant"a"ismuchsmallerthanthediskseekandlatencytimeof25milliseconds.

(d)Theindexisfull,sothatthenumberofblocksthatmustbe

examinedpersearchislog_mN.

Questions:

(i)Whatvalueofmminimizesthetimetosearchforagivenrecord?AnapproximateanswerisOK.(HINT:Ifyoucomeupwithanequationwhichishardtosolvealgebraically,tryplugginginvaluestolocatetherootoftheequation.)Thevalueyouobtainshouldbeindependent

of"b"!

(ii)Whathappensastheseekandlatencyconstant(25ms)decreases?Forinstance,ifthisconstantiscutinhalf,howdoestheoptimummvaluechange?

Answer:

Let

T1

bethetimetakentoreadoneblockintomainmemoryand

T2

bethetimetoprocessthatblockinmemory.If

n

blocksareexaminedtosearchforagivenrecord,thenthetimetakenforthissearchis:

n*(T1

+T2)

Substitutingtheappropriateexpressionsforallthevariables,weget

T(m)=[(25+0.02m)+(a+blog2m)]*logmN

Ignoring

a

incomparisonwith

30,weget

T(m)=[25+0.02m+blog2m]*logmN

Substituting

log2m

by

ln(m)/ln(2)

and

logmN

by

ln(N)/ln(m),weget

T(m)=[(25+0.02m)/ln(m)]*ln(N)+b*ln(N)/ln(2)

Hence,tominimizeT(m),weneedtominimize

f(m)=(25+0.02m)/ln(m).Anumberoftechniquescanbeappliedtodeterminethevalueof

m

thatminimizesthisexpression.Forexample,takingthederivativeandsettingittozero(f'(m)=0),weget

m*(ln(m)-1)=1250

Thiscanbesolvedtoyieldanintegralvalueof

272.Itiseasytoverifythatthisvaluedoesindeedminimize

f(m)

andhence

T(m).

Seek+latencyconstantdecreases:

Letussplit

T(m)

intothreecomponentsasfollows(here,

ts

istheseek+latencytimeconstantwhichwas30intheabovecase):

Totalseekandlatencytime

Ta

=ts

*ln(N)/ln(m)

Totaltransfertime

Tb

=0.02*m*ln(N)/ln(m)

Totalbinarysearchtime

Tc

=b*ln(N)/ln(2)

Ofthese,

Tc

isindependentof

m,

Ta

isadecreasingfunctionof

m

whereas

Tb

isanincreasingfunctionof

m.If

ts

isverysmall,then

Ta

isnegligiblecomparedto

Tb.Hence,wemustchooseasmallvalueof

m

tominimizethesearchtime.Ontheotherhand,if

Ta

isverylargecomparedto

Tb,thentheformerdominatesandwechoosealargevalueof

m

tominimizesearchtimes.Fromtheselimitingcases,itiseasytoseethatif

ts

decreases,theoptimumvalueof

m

decreases.Inparticular,if

ts

=25/2=12.5,wegetanoptimumvalueof

155

(byproceedingalongthesamelinesasabove).

Problem3

Tobuildahashindexforamulti-attributesearchkey,wecanuseanapproachcalledpartitionedhashing.Thepartitionedhashfunctionisreallyalistofhashfunctions,oneforeachattributeinthesearchkey.SupposethatwewishtobuildapartitionedhashindexonR(A,B)with2nbucketsnumbered0to2n1.Inthiscase,thepartitionedhashfunctionconsistsoftwohashfunctionshAandhB.HashfunctionhAtakesavalueofAasinputandproducesaresultwithnAbits,andhBtakesavalueofBasinputandproducesaresultwith(nnA)bits.Thetworesultsareconcatenatedtogethertoproducetheresultofthepartitionedhashfunction,whichisthenusedtoindexthebuckets.TolocaterecordswithA=aandB=b,wesimplygodirectlytothebucketnumberedhA(a)hB(b)(inbinary).

(a)WhichbucketsdowehavetoexamineinordertolocateallrecordswithA=a?

(b)Supposewearegivenaquerymix.EachqueryinthismixwilleitheraskforrecordswithagivenvalueofA,oritwillaskforrecordswithagivenvalueofB(butneverboth).Withprobabilityp,thevalueofAwillbespecified.Giveaformulaintermsofn,nA,andpfortheexpectednumberofbucketsthatmustbe

温馨提示

  • 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
  • 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
  • 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
  • 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
  • 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
  • 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
  • 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。

评论

0/150

提交评论