版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领
文档简介
1、NoImageDatabase System ConceptsSilberschatz, Korth and SudarshanSee www.db- for conditions on re-use Silberschatz, Korth and Sudarshan10.2Database System Concepts - 5th Edition, Aug 22, 2005.nStructure of XML DatanXML Document SchemanQuerying and TransformationnApplication Program Interfaces to XMLn
2、Storage of XML DatanXML ApplicationsSilberschatz, Korth and Sudarshan10.3Database System Concepts - 5th Edition, Aug 22, 2005.nXML: Extensible Markup LanguagenDefined by the WWW Consortium (W3C)nDerived from SGML (Standard Generalized Markup Language), but simpler to use than SGML nDocuments have ta
3、gs giving extra information about sections of the documentnE.g. XML Introduction nExtensible, unlike HTMLnUsers can add new tags, and separately specify how the tag should be handled for displaySilberschatz, Korth and Sudarshan10.4Database System Concepts - 5th Edition, Aug 22, 2005.nThe ability to
4、specify new tags, and to create nested tag structures make XML a great way to exchange data, not just documents.nMuch of the use of XML has been in data exchange applications, not as a replacement for HTMLnTags make data (relatively) self-documenting nE.g. n n A-101 n Downtown n 500 n n n A-101 n Jo
5、hnson n n Silberschatz, Korth and Sudarshan10.5Database System Concepts - 5th Edition, Aug 22, 2005.nData interchange is critical in todays networked worldnExamples:nBanking: funds transfernOrder processing (especially inter-company orders)nScientific datanChemistry: ChemML, nGenetics: BSML (Bio-Seq
6、uence Markup Language), nPaper flow of information between organizations is being replaced by electronic flow of informationnEach application area has its own set of standards for representing informationnXML has become the basis for all new generation data interchange formatsSilberschatz, Korth and
7、 Sudarshan10.6Database System Concepts - 5th Edition, Aug 22, 2005.nEarlier generation formats were based on plain text with line headers indicating the meaning of fieldsnSimilar in concept to email headersnDoes not allow for nested structures, no standard “type” languagenTied too closely to low lev
8、el document structure (lines, spaces, etc)nEach XML based standard defines what are valid elements, usingn XML type specification languages to specify the syntaxnDTD (Document Type Descriptors)nXML SchemanPlus textual descriptions of the semanticsnXML allows new tags to be defined as requirednHoweve
9、r, this may be constrained by DTDsnA wide variety of tools is available for parsing, browsing and querying XML documents/dataSilberschatz, Korth and Sudarshan10.7Database System Concepts - 5th Edition, Aug 22, 2005.nInefficient: tags, which in effect represent schema information, are repeatednBetter
10、 than relational tuples as a data-exchange formatnUnlike relational tuples, XML data is self-documenting due to presence of tagsnNon-rigid format: tags can be addednAllows nested structuresnWide acceptance, not only in database systems, but also in browsers, tools, and applicationsSilberschatz, Kort
11、h and Sudarshan10.8Database System Concepts - 5th Edition, Aug 22, 2005.nTag: label for a section of datanElement: section of data beginning with and ending with matching nElements must be properly nestednProper nestingn . nImproper nesting n . nFormally: every start tag must have a unique matching
12、end tag, that is in the context of the same parent element.nEvery document must have a single top-level elementSilberschatz, Korth and Sudarshan10.9Database System Concepts - 5th Edition, Aug 22, 2005. Hayes Main Harrison A-102 Perryridge 400 . . Silberschatz, Korth and Sudarshan10.10Database System
13、 Concepts - 5th Edition, Aug 22, 2005.nNesting of data is useful in data transfernExample: elements representing customer_id, customer_name, and address nested within an order elementnNesting is not supported, or discouraged, in relational databasesnWith multiple orders, customer name and address ar
14、e stored redundantlynnormalization replaces nested structures in each order by foreign key into table storing customer name and address informationnNesting is supported in object-relational databasesnBut nesting is appropriate when transferring datanExternal application does not have direct access t
15、o data referenced by a foreign keySilberschatz, Korth and Sudarshan10.11Database System Concepts - 5th Edition, Aug 22, 2005.nMixture of text with sub-elements is legal in XML. nExample:n n This account is seldom used any more.n A-102n Perryridgen 400 nUseful for document markup, but discouraged for
16、 data representationSilberschatz, Korth and Sudarshan10.12Database System Concepts - 5th Edition, Aug 22, 2005.nElements can have attributesn n A-102 n Perryridge n 400 n nAttributes are specified by name=value pairs inside the starting tag of an elementnAn element may have several attributes, but e
17、ach attribute name can only occur oncenSilberschatz, Korth and Sudarshan10.13Database System Concepts - 5th Edition, Aug 22, 2005.nDistinction between subelement and attributenIn the context of documents, attributes are part of markup, while subelement contents are part of the basic document content
18、snIn the context of data representation, the difference is unclear and may be confusingnSame information can be represented in two waysn . n A-101 nSuggestion: use attributes for identifiers of elements, and use subelements for contentsSilberschatz, Korth and Sudarshan10.14Database System Concepts -
19、 5th Edition, Aug 22, 2005.nXML data has to be exchanged between organizationsnSame tag name may have different meaning in different organizations, causing confusion on exchanged documentsnSpecifying a unique string as an element name avoids confusionnBetter solution: use unique-name:element-namenAv
20、oid using long unique names all over document by using XML Namespacesn n n Downtownn Brooklyn n nSilberschatz, Korth and Sudarshan10.15Database System Concepts - 5th Edition, Aug 22, 2005.nElements without subelements or text content can be abbreviated by ending the start tag with a / and deleting t
21、he end tagnnTo store string data that may contain tags, without the tags being interpreted as subelements, use CDATA as belown!CDATA nHere, and are treated as just stringsnCDATA stands for “character data”Silberschatz, Korth and Sudarshan10.16Database System Concepts - 5th Edition, Aug 22, 2005.nDat
22、abase schemas constrain what information can be stored, and the data types of stored valuesnXML documents are not required to have an associated schemanHowever, schemas are very important for XML data exchangenOtherwise, a site cannot automatically interpret data received from another sitenTwo mecha
23、nisms for specifying XML schemanDocument Type Definition (DTD)nWidely usednXML Schema nNewer, increasing useSilberschatz, Korth and Sudarshan10.17Database System Concepts - 5th Edition, Aug 22, 2005.nThe type of an XML document can be specified using a DTDnDTD constraints structure of XML datanWhat
24、elements can occurnWhat attributes can/must an element havenWhat subelements can/must occur inside each element, and how many times.nDTD does not constrain data typesnAll values represented as strings in XMLnDTD syntaxnnSilberschatz, Korth and Sudarshan10.18Database System Concepts - 5th Edition, Au
25、g 22, 2005.nSubelements can be specified asnnames of elements, orn#PCDATA (parsed character data), i.e., character stringsnEMPTY (no subelements) or ANY (anything can be a subelement)nExamplenn nnSubelement specification may have regular expressionsn nNotation: n “|” - alternativesn “+” - 1 or more
26、occurrencesn “*” - 0 or more occurrencesSilberschatz, Korth and Sudarshan10.19Database System Concepts - 5th Edition, Aug 22, 2005.n!DOCTYPE bank nnnnnnnnnnnSilberschatz, Korth and Sudarshan10.20Database System Concepts - 5th Edition, Aug 22, 2005.nAttribute specification : for each attribute nNamen
27、Type of attribute nCDATAnID (identifier) or IDREF (ID reference) or IDREFS (multiple IDREFs) n more on this later nWhether nmandatory (#REQUIRED)nhas a default value (value), nor neither (#IMPLIED)nExamplesnnSilberschatz, Korth and Sudarshan10.21Database System Concepts - 5th Edition, Aug 22, 2005.n
28、An element can have at most one attribute of type IDnThe ID attribute value of each element in an XML document must be distinctnThus the ID attribute value is an object identifiernAn attribute of type IDREF must contain the ID value of an element in the same documentnAn attribute of type IDREFS cont
29、ains a set of (0 or more) ID values. Each ID value must contain the ID value of an element in the same documentSilberschatz, Korth and Sudarshan10.22Database System Concepts - 5th Edition, Aug 22, 2005.nBank DTD with ID and IDREF attribute types.n !DOCTYPE bank-2n n n n n declarations for branch, ba
30、lance, customer_name, customer_street and customer_citySilberschatz, Korth and Sudarshan10.23Database System Concepts - 5th Edition, Aug 22, 2005. Downtown 500 Joe Monroe Madison Mary Erin Newark Silberschatz, Korth and Sudarshan10.24Database System Concepts - 5th Edition, Aug 22, 2005.nNo typing of
31、 text elements and attributesnAll values are strings, no integers, reals, etc.nDifficult to specify unordered sets of subelementsnOrder is usually irrelevant in databases (unlike in the document-layout environment from which XML evolved)n(A | B)* allows specification of an unordered set, butnCannot
32、ensure that each of A and B occurs only oncenIDs and IDREFs are untypednThe owners attribute of an account may contain a reference to another account, which is meaninglessnowners attribute should ideally be constrained to refer to customer elementsSilberschatz, Korth and Sudarshan10.25Database Syste
33、m Concepts - 5th Edition, Aug 22, 2005.nXML Schema is a more sophisticated schema language which addresses the drawbacks of DTDs. SupportsnTyping of valuesnE.g. integer, string, etcnAlso, constraints on min/max valuesnUser-defined, comlex typesnMany more features, includingnuniqueness and foreign ke
34、y constraints, inheritance nXML Schema is itself specified in XML syntax, unlike DTDsnMore-standard representation, but verbosenXML Scheme is integrated with namespaces nBUT: XML Schema is significantly more complicated than DTDs.Silberschatz, Korth and Sudarshan10.26Database System Concepts - 5th E
35、dition, Aug 22, 2005. . definitions of customer and depositor .Silberschatz, Korth and Sudarshan10.27Database System Concepts - 5th Edition, Aug 22, 2005.nChoice of “xs:” was ours - any other namespace prefix could be chosennElement “bank” has type “BankType”, which is defined separatelynxs:complexT
36、ype is used later to create the named complex type “BankType”nElement “account” has its type defined in-lineSilberschatz, Korth and Sudarshan10.28Database System Concepts - 5th Edition, Aug 22, 2005.nAttributes specified by xs:attribute tag:nnadding the attribute use = “required” means value must be
37、 specifiednKey constraint: “account numbers form a key for account elements under the root bank element:nnnnnForeign key constraint from depositor to account:nnnnSilberschatz, Korth and Sudarshan10.29Database System Concepts - 5th Edition, Aug 22, 2005.nTranslation of information from one XML schema
38、 to anothernQuerying on XML data nAbove two are closely related, and handled by the same toolsnStandard XML querying/translation languagesnXPathnSimple language consisting of path expressionsnXSLTnSimple language designed for translation from XML to XML and XML to HTMLnXQuerynAn XML query language w
39、ith a rich set of featuresSilberschatz, Korth and Sudarshan10.30Database System Concepts - 5th Edition, Aug 22, 2005.nQuery and transformation languages are based on a tree model of XML datanAn XML document is modeled as a tree, with nodes corresponding to elements and attributesnElement nodes have
40、child nodes, which can be attributes or subelementsnText in an element is modeled as a text node child of the elementnChildren of a node are ordered according to their order in the XML documentnElement and attribute nodes (except for the root node) have a single parent, which is an element nodenThe
41、root node has a single child, which is the root element of the documentSilberschatz, Korth and Sudarshan10.31Database System Concepts - 5th Edition, Aug 22, 2005.nXPath is used to address (select) parts of documents using path expressionsnA path expression is a sequence of steps separated by “/”nThi
42、nk of file names in a directory hierarchynResult of path expression: set of values that along with their containing elements/attributes match the specified path nE.g. /bank-2/customer/customer_name evaluated on the bank-2 data we saw earlier returns nJoenMarynE.g. /bank-2/customer/customer_name/text
43、( )n returns the same names, but without the enclosing tagsSilberschatz, Korth and Sudarshan10.32Database System Concepts - 5th Edition, Aug 22, 2005.nThe initial “/” denotes root of the document (above the top-level tag)nPath expressions are evaluated left to rightnEach step operates on the set of
44、instances produced by the previous stepnSelection predicates may follow any step in a path, in nE.g. /bank-2/accountbalance 400 nreturns account elements with a balance value greater than 400n/bank-2/accountbalance returns account elements containing a balance subelementnAttributes are accessed usin
45、g “”nE.g. /bank-2/accountbalance 400/account_numbernreturns the account numbers of accounts with balance 400nIDREF attributes are not dereferenced automatically (more on this later)Silberschatz, Korth and Sudarshan10.33Database System Concepts - 5th Edition, Aug 22, 2005.nXPath provides several func
46、tionsnThe function count() at the end of a path counts the number of elements in the set generated by the pathnE.g. /bank-2/accountcount(./customer) 2 nReturns accounts with 2 customersnAlso function for testing position (1, 2, .) of node w.r.t. siblingsnBoolean connectives and and or and function n
47、ot() can be used in predicatesnIDREFs can be referenced using function id()nid() can also be applied to sets of references such as IDREFS and even to strings containing multiple references separated by blanksnE.g. /bank-2/account/id(owner) nreturns all customers referred to from the owners attribute
48、 of account elements.Silberschatz, Korth and Sudarshan10.34Database System Concepts - 5th Edition, Aug 22, 2005.nOperator “|” used to implement union nE.g. /bank-2/account/id(owner) | /bank-2/loan/id(borrower)nGives customers with either accounts or loansnHowever, “|” cannot be nested inside other o
49、perators.n“/” can be used to skip multiple levels of nodes nE.g. /bank-2/customer_name nfinds any customer_name element anywhere under the /bank-2 element, regardless of the element in which it is contained.nA step in the path can go to parents, siblings, ancestors and descendants of the nodes gener
50、ated by the previous step, not just to the childrenn“/”, described above, is a short from for specifying “all descendants”n“.” specifies the parent.ndoc(name) returns the root of a named document Silberschatz, Korth and Sudarshan10.35Database System Concepts - 5th Edition, Aug 22, 2005.nXQuery is a
51、general purpose query language for XML data nCurrently being standardized by the World Wide Web Consortium (W3C)nThe textbook description is based on a January 2005 draft of the standard. The final version may differ, but major features likely to stay unchanged.nXQuery is derived from the Quilt quer
52、y language, which itself borrows from SQL, XQL and XML-QLnXQuery uses a for let where order by result syntax for SQL from where SQL where order by SQL order byn result SQL select let allows temporary variables, and has no equivalent in SQLSilberschatz, Korth and Sudarshan10.36Database System Concept
53、s - 5th Edition, Aug 22, 2005.nFor clause uses XPath expressions, and variable in for clause ranges over values in the set returned by XPathnSimple FLWOR expression in XQuery nfind all accounts with balance 400, with each result enclosed in an . tag for $x in /bank-2/account let $acctno := $x/accoun
54、t_number where $x/balance 400 return $acctno nItems in the return clause are XML text unless enclosed in , in which case they are evaluatednLet clause not really needed in this query, and selection can be done In XPath. Query can be written as:nfor $x in /bank-2/accountbalance400return $x/account_nu
55、mber Silberschatz, Korth and Sudarshan10.37Database System Concepts - 5th Edition, Aug 22, 2005.nJoins are specified in a manner very similar to SQLfor $a in /bank/account,n $c in /bank/customer,n $d in /bank/depositorn where $a/account_number = $d/account_number and $c/customer_name = $d/customer_n
56、amen return $c $a nThe same query can be expressed with the selections specified as XPath selections:n for $a in /bank/account $c in /bank/customer $d in /bank/depositor account_number = $a/account_number and customer_name = $c/customer_namen return $c $a Silberschatz, Korth and Sudarshan10.38Databa
57、se System Concepts - 5th Edition, Aug 22, 2005.nThe following query converts data from the flat structure for bank information into the nested structure used in bank-1n n for $c in /bank/customern returnn n $c/* n for $d in /bank/depositorcustomer_name = $c/customer_name,n $a in /bank/accountaccount
58、_number=$d/account_numbern return $a n n n$c/* denotes all the children of the node to which $c is bound, without the enclosing top-level tagn$c/text() gives text content of an element without any subelements / tagsSilberschatz, Korth and Sudarshan10.39Database System Concepts - 5th Edition, Aug 22,
59、 2005.nThe order by clause can be used at the end of any expression. E.g. to return customers sorted by name for $c in /bank/customer order by $c/customer_name return $c/* nUse order by $c/customer_name to sort in descending ordernCan sort at multiple levels of nesting (sort by customer_name, and by
60、 account_number within each customer)n for $c in /bank/customer norder by $c/customer_namenreturn $c/* n for $d in /bank/depositorcustomer_name=$c/customer_name, $a in /bank/accountaccount_number=$d/account_number norder by $a/account_numbern return $a/* n Silberschatz, Korth and Sudarshan10.40Datab
温馨提示
- 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
- 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
- 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
- 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
- 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
- 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
- 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。
最新文档
- 授权使用商标协议
- 文化创意灰土工程协议
- 服装设计师解聘合同证明
- 起草离婚协议书(2篇)
- 土地过户后承建协议书范本
- 集体合同决议会议记录
- 砍树免责合同范例
- 承租开荒地合同范例
- 品牌文化策划合同范例
- 网签授权合同范例
- 公共租赁住房运行管理标准
- 2024-2030年中国永磁耦合器行业经营优势及竞争对手现状调研报告
- JJ∕G(交通) 200-2024 轮碾成型机
- 小学六年级奥数难题100道及答案(完整版)
- 小学科学教科版五年级上册全册易错知识点专项练习(判断选择-分单元编排-附参考答案和点拨)
- 电影作品解读-世界科幻电影智慧树知到期末考试答案章节答案2024年成都锦城学院
- NB-T47003.1-2009钢制焊接常压容器(同JB-T4735.1-2009)
- 聚焦高质量+探索新高度+-2025届高考政治复习备考策略
- 惠州市惠城区2022-2023学年七年级上学期期末教学质量检测数学试卷
- 北京市西城区2022-2023学年七年级上学期期末英语试题【带答案】
- ISO45001-2018职业健康安全管理体系之5-4:“5 领导作用和工作人员参与-5.4 工作人员的协商和参与”解读和应用指导材料(2024A0-雷泽佳)
评论
0/150
提交评论