Oracle英文版培训课件之Data Warehouse:les04_第1页
Oracle英文版培训课件之Data Warehouse:les04_第2页
Oracle英文版培训课件之Data Warehouse:les04_第3页
Oracle英文版培训课件之Data Warehouse:les04_第4页
Oracle英文版培训课件之Data Warehouse:les04_第5页
已阅读5页,还剩36页未读 继续免费阅读

下载本文档

版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领

文档简介

Extraction,Transformation,

andLoading(ETL)ExtractionandTransportationObjectivesAftercompletingthislesson,youshouldbeabletodothefollowing:DescribethecoreETLframeworkinsidethedatabaseanditsintegrationadvantageExplaindatawarehousingextractionmethodsIdentifytransportationmethods:FlatfileDistributesoperationsTransportabletablespacesDescribetransformationflowOverviewLesson4:Extraction/TransportationLesson5:LoadingLesson6:TransformationWhatIsETL?ETLisanacronymforExtraction,Transformation,andLoading.ThefollowinghappenduringtheETLprocess:Thedesireddataisidentifiedandextractedfrommanydifferentsources.Sometransformationsmaytakeplaceduringthisextractionprocess.Afterextraction,thedatamustbetransportedtoatargetsystemoranintermediatesystemforfurtherprocessing.Dependingonthemethodoftransportation,sometransformationscanbedonesimultaneously.ETLreferstoabroadprocess.ExtractionMethodsExtractioncanbethoughtofintwoparts:ExtractionTransportationTherearetwoextractionmethods:LogicalPhysicalYourlogicalchoiceinfluencesthewaythedataisphysicallyextracted.Somecriteriaforchoosingacombination:BusinessneedsLocationofthesourceandtargetsystemsAvailabilityofthesourcesystemTimerequiredtoextractdataLogicalExtractionMethodsTherearetwokindsoflogicalextraction:FullextractionAlldataispulledLessinformationtotrackMoretimerequiredtopullthedataIncrementalextractionAsubsetofdataispulledMusttrackwhatdataneedstobepulledLesstimerequiredtopullthedataPhysicalExtractionMethodsTherearetwotypesofphysicalextraction.Onlineextraction:PullsdatafromthesourcesystemOfflineextraction:PullsdatafromastagingareaStagingareasincludeflatfiles,dumpfiles,andtransportabletablespaces.OfflineExtractionStagingareas:FlatfilesRequiresdatainapredefined,genericformatDumpfilesMustbeinanOracle-specificformatRedoandarchivelogsDatalocatedinspecialdumpfilesTransportabletablespacesPowerful,fastmethodformovinglargevolumesofdataImplementingMethodsofExtractionExtractingtoafile:SpoolingfromSQL*PlusUsingOCIorPro*CtodumptoafileUsingDataPumptoexporttoanOracledumpfileUsingexternaltablesExtractingthroughdistributedoperationsIncrementalExtractionUsingCDCCDCcancaptureandpublishcommittedchangedataineitherofthefollowingmodes:SynchronousTriggersonthesourcedatabaseallowchangedatatobecapturedimmediately.Changedataiscapturedaspartofthetransactionmodifyingthesourcetable.AsynchronousChangedataiscapturedafteraSQLstatementperformingDMLiscommittedusingtheredologs.AsynchronousChangeDataCaptureisbuiltonOracleStreams.PublishandSubscribeModelThepublisherperformsthefollowingtasks:IdentifiessourcetablesfromwhichthedatawarehouseisinterestedincapturingchangedataUsestheDBMS_CDC_PUBLISHpackageto:SetupthecaptureofdatafromthesourcetablesDetermineandadvancethechangesetsPublishthechangedataAllowscontrolledaccesstosubscribersusingtheSQLGRANT

andREVOKE

statementsPublishandSubscribeModelThesubscriberusestheDBMS_CDC_SUBSCRIBEpackageto:SubscribetosourcetablesExtendthewindowandcreatechangeviewPreparethesubscriberviewsViewdatastoredinchangetablesPurgethesubscriberviewRemovethesubscriberviewsSynchronousCDCSourcetablesSourcedatabasetransactionsSYNC_SOURCEChangesourceChangesetChangetablesSubscriber

viewsTriggerexecutionSourcedatabaseAsynchronousCDCAsynchronousCDC:CaptureschangedatafromredologfilesafterchangeshavebeencommittedtothesourcedatabaseModesaredependentonthelevelofsupplementalloggingusedonthesourcedatabaseUsesOracleStreamstocapturechangedatafromredologfilesHasthreesourcemodes:AsynchronousAutoLogmodeAsynchronousHotLogmodeAsynchronousDistributedHotLogmodeAsynchronousAutoLogModeSourcetablesSourcedatabasetransactionsLGWROnlineredologsDistributedAutoLogchangesetChangesetChange

tablesSubscriber

viewsSourcedatabaseStagingdatabaseDistributedAutoLogchangesourceRFSStandbyredologsStreams

captureLOG_ARCHIVE_DEST_2AsynchronousHotLogConfigurationSourcetablesSourcedatabasetransactionsHOTLOG_SOURCEChangeSourceChangesetChangetablesSubscriber

viewsStreamslocalcaptureSourcedatabaseLGWROnlineredologsAsynchronousDistributedHotLogModeSourcetablesSourcedatabasetransactionsLGWROnlineredologsDistributedHotLogchangesetChangesetChangetablesSubscriber

viewsSourcedatabaseStagingdatabaseDBlinkDistributedHotLogchangesourceDBlinkStreamspropagationPreparingtoPublishChangeData1. Gatherrequirementsfromthesubscribers.2. Determinewhichsourcedatabasecontainstherelevantsourcetables.3. Choosethecapturemode:SynchronousAsynchronousHotLogAsynchronousDistributedHotLogAsynchronousAutoLogEnsurethatthesourceandstagingdatabaseshaveappropriatedatabaseinitializationparametersset.Setupdatabaselinksbetweenthesourcedatabaseandthestagingdatabase.CreatingaPublisherUserThestagingdatabasepublishermustbegrantedthefollowingprivilegesandroles:EXECUTE_CATALOG_ROLEprivilegeSELECT_CATALOG_ROLEprivilegeCREATETABLEand

CREATESESSIONprivilegesEXECUTEonthe

DBMS_CDC_PUBLISHpackageCreateadefaulttablespacefor

thepublisher.SynchronousPublishing1. Createachangeset.2. Createachangetable.3. Grantaccesstosubscribers.BEGINDBMS_CDC_PUBLISH.CREATE_CHANGE_SET(change_set_name=>'CHICAGO_DAILY',description=>'Changesetforsaleshistoryinfo',change_source_name=>'SYNC_SOURCE');END;GRANTSELECTONducts_ctTOsubscriber1;AsynchronousDistributed

HotLogPublishingPreparethesourceandstagingdatabases:1. ConfigureOracleNetsothatthesourcedatabasecancommunicatewiththestagingdatabase.2. Setinitializationparametersonthesourcedatabase.

3. Spatible=10.2.0global_names=truejob_queue_processes=<currentvalue>+2open_links=4parallel_max_servers=<currentvalue>+3processes=<currentvalue>+4sessions=<currentvalue>+1streams_pool_size=<currentvalue>+20MBundo_retention=3600AsynchronousDistributed

HotLogPublishingPreparethestagingdatabase:Spatible=10.2.0global_names=truejava_pool_size=50000000open_links=4job_queue_processes=2parallel_max_servers=<current_value>+2processes=<current_value>+3sessions=<currentvalue>+1streams_pool_size=<current_value>+11MBundo_retention=3600AsynchronousDistributed

HotLogPublishingAlterthesourcedatabase:1. PlacethedatabaseintoFORCELOGGINGloggingmodetoprotectagainstunloggeddirectwrites.2. Enablesupplementallogging.3. Createanunconditionalloggrouponallcolumnstobecapturedinthesourcetable.ALTERDATABASEFORCELOGGING;ALTERDATABASEADDSUPPLEMENTALLOGDATA;ALTERTABLESH.PRODUCTSADDSUPPLEMENTALLOGGROUPlog_group_products(PROD_ID,PROD_NAME,PROD_LIST_PRICE)ALWAYS;AsynchronousDistributed

HotLogPublishingPublisherprivilegesonsourceandstagingdatabases:1. Createandgrantprivilegestothesourcedatabasepublisher.2. Createandgrantprivilegestothestagingdatabasepublisher.CREATEUSERsource_cdcpubIDENTIFIEDBYsource_cdcpubQUOTAUNLIMITEDONSYSTEMQUOTAUNLIMITEDONSYSAUX;GRANTCREATESESSIONTOsource_cdcpub;GRANTDBATOsource_cdcpub;GRANTCREATEDATABASELINKTOsource_cdcpub;GRANTEXECUTEonDBMS_CDC_PUBLISHTOsource_cdcpub;GRANTEXECUTE_CATALOG_ROLETOsource_cdcpub;GRANTSELECT_CATALOG_ROLETOsource_cdcpub;EXECUTEDBMS_STREAMS_AUTH.GRANT_ADMIN_PRIVILEGE(GRANTEE=>'source_cdcpub');AsynchronousDistributed

HotLogPublishingCreatesourceandstagingdatabaselinks:1. Createthesourcedatabaselink.2. Createthestagingdatabaselink.CREATEDATABASELINKstaging_dbCONNECTTOstaging_cdcpubIDENTIFIEDBYstaging_cdcpubUSING'staging_db';CREATEDATABASELINKsource_dbCONNECTTOsource_cdcpubIDENTIFIEDBYsource_cdcpubUSING'source_db';AsynchronousDistributed

HotLogPublishingCreatechangesourcesandchangesets:1. Createthechangesources.2. Createthechangesets.BEGINDBMS_CDC_PUBLISH.CREATE_HOTLOG_CHANGE_SOURCE(change_source_name=>'CHICAGO',description=>'testsource',source_database=>'source_db');END;DBMS_CDC_PUBLISH.CREATE_CHANGE_SET(change_set_name=>'CHICAGO_DAILY',description=>'changesetforproductinfo',change_source_name=>'CHICAGO',stop_on_ddl=>'y');END;AsynchronousDistributed

HotLogPublishingCreatethechangetablesonthestagingdatabase:BEGINDBMS_CDC_PUBLISH.CREATE_CHANGE_TABLE(owner=>'staging_cdcpub',change_table_name=>'products_ct',change_set_name=>'CHICAGO_DAILY',source_schema=>'SH',source_table=>'PRODUCTS',column_type_list=>'PROD_IDNUMBER(6),PROD_NAMEVARCHAR2(50),PROD_LIST_PRICENUMBER(8,2),JOB_IDVARCHAR2(10),DEPARTMENT_IDNUMBER(4)',capture_values=>'both',rs_id=>'y',row_id=>'n',...options_string=>'TABLESPACETS_CHICAGO_DAILY');END;AsynchronousDistributed

HotLogPublishingEnablethechangesourceandchangeset:1. Enablethechangesource.2. Enablethechangeset.3. Grantaccesstosubscribers.BEGINDBMS_CDC_PUBLISH.ALTER_HOTLOG_CHANGE_SOURCE(change_source_name=>'CHICAGO',enable_source=>'Y');END;BEGINDBMS_CDC_PUBLISH.ALTER_CHANGE_SET(change_set_name=>'CHICAGO_DAILY',enable_capture=>'y');END;SubscribingtoChangeData1. Findthesourcetablesforwhichthesubscriberhasaccessprivileges.2. Findthechangesetnamesandcolumnsforwhichthesubscriberhasaccessprivileges.SQL>SELECT*FROMALL_SOURCE_TABLES;SOURCE_SCHEMA_NAMESOURCE_TABLE_NAME------------------------------------SHPRODUCTSSQL>SELECTUNIQUECHANGE_SET_NAME,COLUMN_NAME,PUB_IDFROM2ALL_PUBLISHED_COLUMNSWHERESOURCE_SCHEMA_NAME='SH'AND3SOURCE_TABLE_NAME='PRODUCTS';CHANGE_SET_NAMECOLUMN_NAMEPUB_ID----------------------------------------------CHICAGO_DAILYPROD_ID41494CHICAGO_DAILYPROD_LIST_PRICE41494CHICAGO_DAILYPROD_NAME41494SubscribingtoChangeData3. Createasubscription.4. Subscribetoasourcetableandcolumns.BEGINDBMS_CDC_SUBSCRIBE.CREATE_SUBSCRIPTION(change_set_name=>'CHICAGO_DAILY',description=>'ChangedataforPRODUCTS',subscription_name=>'SALES_SUB');END;BEGINDBMS_CDC_SUBSCRIBE.SUBSCRIBE(subscription_name=>'SALES_SUB',source_schema=>'SH',source_table=>'PRODUCTS',column_list=>'PROD_ID,PROD_NAME,PROD_LIST_PRICE',subscriber_view=>'SALES_VIEW');END;SubscribingtoChangeData5. Activatethesubscription.6. Getthenextsetofchangedata.BEGINDBMS_CDC_SUBSCRIBE.ACTIVATE_SUBSCRIPTION(subscription_name=>'SALES_SUB');END;BEGINDBMS_CDC_SUBSCRIBE.EXTEND_WINDOW(subscription_name=>'SALES_SUB');END;SubscribingtoChangeData7. Querythesubscriberviews.8. Indicatethatthechangedataisnolongerneeded.9. Endthesubscription.SELECTPROD_ID,PROD_NAME,PROD_LIST_PRICEFROMSALES_VIEW;PROD_IDPROD_NAMEPROD_LIST_PRICE------------------------------------------------------30And2CrosscourtTeeKids14.9930And2CrosscourtTeeKids17.6610Gurfield&MurksPleatedTrousers17.9910Gurfield&MurksPleatedTrousers21.99BEGINDBMS_CDC_SUBSCRIBE.PURGE_WINDOW(subscription_name=>'SALES_SUB');END;AsynchronousDistributedHotLog

SourceDatabaseInitializationParameters

ForallOracleDatabase10greleases:3600UNDO_RETENTIONThecurrentvalue+(thenumberofchangesourcesplanned)SESSIONSThecurrentvalue+(4timesthenumberofchangesourcesplanned)PROCESSESThecurrentvalue+(3timesthenumberofchangesourcesplanned)PARALLEL_MAX_SERVERSShouldbeequaltothenumberofDistributedHotLogchangesourcesplannedOPEN_LINKSMaximumnumberofDBMS_JOBjobsthatcanrunsimultaneouslyplus2JOB_QUEUE_PROCESSESTRUEGLOBAL_NAMES10.2.0or10.0.0COMPATIBLE

ValueParameterAsynchronousDistributedHotLog

SourceDatabaseInitializationParameters

ForOracle9.2databases:Thecurrentvalue+(thenumberofchangesourcesplanned)PROCESSESThecurrentvalue+(3timesthenumberofchangesourcesplanned)PARALLEL_MAX_SERVERSThenumberofDistributedHotLogchangesourcesplannedOPEN_LINKSThenumberofchangesourcesplannedLOGMNR_MAX_PERSISTENT_SESSIONS1LOG_PARALLELISMMaximumnumberofDBMS_JOBjobsthatcanrunsimultaneouslyplus2JOB_QUEUE_PROCESSESTRUEGLOBAL_NAMES9.2.0COMPATIBLE

ValueParameterAsynchronousDistributedHotLog

StagingDatabaseInitializationParametersForOracleDatabase10gRelease2:Settothecurrentvalue+((thenumberofchangesourcesplanned)*(11MB))STREAMS_POOL_SIZEThecurrentvalue+(thenumberofchangesourcesplanned)SESSIONSThecurrentvalue+(3timesthenumberofchangesourcesplanned)PROCESSESThecurrentvalue+(2timesthenumberofchangesourcesplanned)PARALLEL_MAX_SERVERSEqualtothenumberofDistributedHotLogchangesourcesplanned,butnolessthan4OPEN_LINKS50000000JAVA_POOL_SIZETRUEGLOBAL_NAMES10.2.0COMPATIBLE

ValueParameterDataDictionaryViewsSupportingCDCCHANGE_SOURCESlistsexistingchangesources.CHANGE_SETSlistsexistingchangesets.CHANGE_PROPAGATIONSdescribesthestreamspropagationassociatedwithagivendistributedHotLogchangesourceonthesourcedatabase.CHANGE_TABLESlistsexistingchangetables.DBA_SOURCE_TABLESlistspublishedsourcetables.DBA_PUBLISHED_COLUMNSlistspublishedsourcetablecolumns.DBA_SUBSCRIPTIONSlistsallregisteredsubscriptions.DBA_SUBSCRIBED_TABLESlistspublishedtablestowhichsubscribershavesubscribed.DBA_SUBSCRIBED_COLUMNSliststhecolumnsoftablestowhichsubscribershavesubscribed.TransportationinaDataWareh

温馨提示

  • 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
  • 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
  • 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
  • 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
  • 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
  • 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
  • 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。

评论

0/150

提交评论