Data Preparation and Cleaning
Chapter 2

Where we are now
1. Data Analytics
2. Data Preparation and Cleaning
3. Modeling and Evaluation
4. Visualization
5. The Modern Audit
6. Audit Analytics
7. Key Performance Indicators
8. Financial Statement Analytics

Objectives
LO 2-1 How are data used and stored in the accounting cycle?
LO 2-2 How are data stored in relational databases?
LO 2-3 What does it mean to extract, transform, and load?

In the IMPACT cycle, we're going to look at Mastering the Data.
- Identify the questions
- Master the data
- Perform test plan
- Address and refine results
- Communicate insights
- Track outcomes
Exhibit 1-1 The IMPACT Cycle

How are data used and stored in the accounting cycle? (LO 2-1)
Understand the data by looking at how it is organized.
- Data can be found throughout various systems.
- In most cases, you need to know which tables and attributes contain the relevant data.
- Unified Modeling Language (UML) is one way to understand databases.

FGI_Product: Product_Code [PK], Product_Description, …
Sales_Subset: Sales_Order_ID [PK], Product_Code [FK], Customer_ID [FK], …
Customer: Customer_ID [PK], Customer_Name, …

How are data stored in relational databases? (LO 2-2)
Relational databases ensure that data:
- Are complete, or include all data.
- Aren't redundant, so they don't take up too much space.
- Follow business rules and internal controls.
- Aid communication and integration of business processes.

There are four types of attributes.
- Primary keys are unique identifiers.
- Foreign keys are attributes that point to a primary key in another table.
- Composite keys are a combination of two foreign keys used for line items.
- Descriptive attributes include everything else.

Examples of two tables, attributes, and data. Notice the PK-FK relationship.

Purchase Order Table
| PO No. | Date | Created By | Approved By | Supplier ID (FK) |
| 1787 | 11/1/2017 | 1001 | 1010 | 1 |
| 1788 | 11/1/2017 | 1005 | 1010 | 2 |
| 1789 | 11/8/2017 | 1002 | 1010 | 1 |
| 1790 | 11/15/2017 | 1005 | 1010 | 1 |

Supplier Table
| Supplier ID (PK) | Supplier Name | Supplier Address | Supplier Type |
| 1 | Northern Brewery Homebrew Supply | 6021 Lyndale Ave S | 1 |
| 2 | Hops Direct LLC | 686 Green Valley Road | 1 |
| 3 | The Home Brewery | 455 E. Township St. | 1 |
| 4 | The Payroll Company | 408 N. Walton Blvd | 2 |

Data dictionaries define what data are acceptable. For each attribute, we learn:
- What type of key it is.
- What data are required.
- What data can be stored in it.
- How much data is stored.

Supplier Table Data Dictionary
| Primary or Foreign Key? | Required | Attribute Name | Description | Data Type | Default Value | Field Size | Notes |
| PK | Y | Supplier ID | Unique Identifier for each Supplier | Number | n/a | 10 | |
| | N | Supplier Name | First and Last Name | Short Text | n/a | 30 | |
| FK | N | Supplier Type | Type Code for Different Supplier Categories | Number | Null | 10 | 1: Vendor, 2: Misc |

Q. What is the purpose of the primary key? A foreign key? A non-key attribute?

What does it mean to extract, transform, and load? (LO 2-3)
Requesting data is an iterative practice involving 5 steps:
Step 1: Determine the purpose and scope of the data request.
Step 2: Obtain the data.
Step 3: Validate the data for completeness and integrity.
Step 4: Clean the data.
Step 5: Load the data for data analysis.

Step 1: Determine the purpose and scope of the data request
Ask a few questions before beginning the process:
- What is the purpose of the data request? What do you need the data to solve? What business problem will it address?
- What risk exists in data integrity (e.g., reliability, usefulness)? What is the mitigation plan?
- What other information will impact the nature, timing, and extent of the data analysis?

Step 2: Obtain the data
- How will data be requested and/or obtained?
- Do you have access to the data yourself, or do you need to ask a database administrator or the information systems department to provide the data for you?
- If you need to request the data, is there a standard data request form that you should use? From whom do you request the data?
- Where are the data located in the financial or other related systems?
- What specific data are needed (tables and fields)?
- What tools will be used to perform data analytic tests or procedures, and why?

There are a couple of options:
- Obtain data through a data request to the IT department.
- Obtain data yourself.

Example Standard Data Request Form

SECTION 1: REQUEST DETAILS
Requestor Name:
Requestor Contact Number:
Requestor Email Address:
Please provide a description of the information needed (indicate which tables and which fields you require):
What will the information be used for?
Frequency (circle one): One-Off / Annually / Termly / Other: ___________
Format you wish the data to be delivered in (circle one): Spreadsheet / Word Document / Text File / Other: ____________
Request Date:
Required Date:
Intended Audience:
Customer (if not requestor):

SECTION 2: TO BE COMPLETED BY INFORMATION SYSTEMS DEPARTMENT
Request Number
Date Received
Received by
Assigned to
Initial review comments (discussion with client: revisions required? agreement to proceed? etc.)
Work in progress comments (additional notes and comments during production of data)

SECTION 3: COMPLETION DETAILS
Date Completed
Date Provided
Revisions Required
Feedback from client (if applicable)

Obtain the data yourself
If you have direct access to a data warehouse, you can use SQL and other tools to pull the data yourself.
- Identify the tables that contain the information you need. You can do this by looking through the data dictionary or the relationship model.
- Identify which attributes, specifically, hold the information you need in each table.
- Identify how those tables are related to each other.

Step 3: Validate the data for completeness and integrity
Chances are the data you request isn't complete. Before you begin, do a little work to make sure your data are valid:
- Compare the number of records
- Compare descriptive statistics for numeric fields
- Validate Date/Time fields
- Compare string limits for text fields

Step 4: Clean the data
Once you have valid data, there is still some work that needs to be done to make sure it is consistent and ready for analysis:
- Remove headings or subtotals
- Clean leading zeroes and nonprintable characters
- Format negative numbers
- Correct inconsistencies across data, in general

Step 5: Load the data for data analysis
Finally, you can now import your data into the tool of your choice and expect the functions to work properly.

Q. What are four common issues with data that must be fixed before analysis can take place?

Summary
The first step in the IMPACT cycle is to identify the questions that you intend to answer through your data analysis project. Once a data analysis problem or question has been identified, the next step in the IMPACT cycle is mastering the data, which can be broken down to mean obtaining the data needed and preparing it for analysis.
In order to obtain the right data, it is important to have a firm grasp of what data are available to you and how that information is stored. Data are often stored in a relational database, which helps to ensure that an organization's data are complete and to avoid redundancy. Relational databases are made up of tables with uniquely identified records (this is done through primary keys) and are related through the usage of foreign keys.
To obtain the data, you will either have access to extract the data yourself or you will need to request the data from a database administrator or the information systems team. If the latter is the case, you will complete a data request form, indicating exactly which data you need and why.
Once you have the data, they will need to be validated for completeness and integrity; that is, you will
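
As an illustration of obtaining the data yourself, the PK-FK relationship between the Purchase Order and Supplier tables can be sketched with SQL run from Python's built-in sqlite3 module. This is a minimal sketch, not the chapter's own method: the in-memory database, the snake_case table and column names, and both function names are assumptions made for the example. Only the row values come from the tables shown earlier.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database holding the two example tables."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    # Table and column names are hypothetical, chosen for this sketch.
    conn.executescript(
        """
        CREATE TABLE supplier (
            supplier_id      INTEGER PRIMARY KEY,  -- PK
            supplier_name    TEXT,
            supplier_address TEXT,
            supplier_type    INTEGER
        );
        CREATE TABLE purchase_order (
            po_no       INTEGER PRIMARY KEY,       -- PK
            po_date     TEXT,
            created_by  INTEGER,
            approved_by INTEGER,
            supplier_id INTEGER REFERENCES supplier (supplier_id)  -- FK
        );
        """
    )
    conn.executemany(
        "INSERT INTO supplier VALUES (?, ?, ?, ?)",
        [
            (1, "Northern Brewery Homebrew Supply", "6021 Lyndale Ave S", 1),
            (2, "Hops Direct LLC", "686 Green Valley Road", 1),
            (3, "The Home Brewery", "455 E. Township St.", 1),
            (4, "The Payroll Company", "408 N. Walton Blvd", 2),
        ],
    )
    conn.executemany(
        "INSERT INTO purchase_order VALUES (?, ?, ?, ?, ?)",
        [
            (1787, "11/1/2017", 1001, 1010, 1),
            (1788, "11/1/2017", 1005, 1010, 2),
            (1789, "11/8/2017", 1002, 1010, 1),
            (1790, "11/15/2017", 1005, 1010, 1),
        ],
    )
    conn.commit()
    return conn


def purchase_orders_with_supplier(conn: sqlite3.Connection) -> list:
    """Follow the foreign key to attach a supplier name to each purchase order."""
    query = """
        SELECT po.po_no, po.po_date, s.supplier_name
        FROM purchase_order AS po
        JOIN supplier AS s ON s.supplier_id = po.supplier_id
        ORDER BY po.po_no
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in purchase_orders_with_supplier(build_demo_db()):
        print(row)
```

The join mirrors the three identification steps: pick the tables, pick the attributes, then state how the tables relate (here, `supplier_id` in both). With foreign keys enabled, a purchase order that points at a supplier that does not exist is rejected.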
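
The validation and cleaning checks in Steps 3 and 4 could look roughly like the following in plain Python. This is a sketch under stated assumptions rather than a prescribed procedure: the row layout, the sample amounts, the helper names, the control-total comparison, and the 30-character name limit (borrowed from the data dictionary's Field Size) are all chosen for illustration.

```python
from datetime import datetime
from decimal import Decimal


def clean_text(value: str) -> str:
    """Drop nonprintable characters and surrounding whitespace."""
    return "".join(ch for ch in value if ch.isprintable()).strip()


def strip_leading_zeros(value: str) -> str:
    """Remove leading zeroes from an identifier, keeping a lone '0'."""
    return value.lstrip("0") or "0"


def parse_amount(value: str) -> Decimal:
    """Read '1,250.00', '(200.00)', '200.00-' or '-200.00' as a signed Decimal."""
    text = clean_text(value).replace(",", "")
    negative = text.startswith(("(", "-")) or text.endswith("-")
    magnitude = Decimal(text.strip("()-"))
    return -magnitude if negative else magnitude


def is_valid_date(value: str, fmt: str = "%m/%d/%Y") -> bool:
    """Check that a text field parses as a real calendar date."""
    try:
        datetime.strptime(value, fmt)
    except ValueError:
        return False
    return True


def drop_non_data_rows(rows: list) -> list:
    """Remove heading and subtotal rows: keep rows whose PO number is numeric."""
    return [r for r in rows if clean_text(r["po_no"]).isdigit()]


def validate_extract(rows, expected_count, expected_total, max_name_len=30):
    """Return a list of validation problems (an empty list means none found)."""
    problems = []
    if len(rows) != expected_count:
        problems.append(f"record count {len(rows)} != expected {expected_count}")
    total = sum(parse_amount(r["amount"]) for r in rows)
    if total != expected_total:
        problems.append(f"amount total {total} != expected {expected_total}")
    for r in rows:
        if not is_valid_date(r["date"]):
            problems.append(f"invalid date in PO {r['po_no']}: {r['date']!r}")
        if len(clean_text(r["supplier_name"])) > max_name_len:
            problems.append(f"supplier name too long in PO {r['po_no']}")
    return problems


# Hypothetical raw extract: a heading row, two records, and a subtotal row.
# The amounts are made up for the example; they do not come from the source.
RAW_ROWS = [
    {"po_no": "PO No.", "date": "Date", "supplier_name": "Supplier Name", "amount": "Amount"},
    {"po_no": "001788", "date": "11/1/2017", "supplier_name": " Hops Direct LLC", "amount": "1,250.00"},
    {"po_no": "001790", "date": "11/15/2017", "supplier_name": "The Home\x00 Brewery", "amount": "(200.00)"},
    {"po_no": "Subtotal", "date": "", "supplier_name": "", "amount": "1,050.00"},
]

if __name__ == "__main__":
    data_rows = drop_non_data_rows(RAW_ROWS)
    print(validate_extract(data_rows, expected_count=2, expected_total=Decimal("1050.00")))
```

Comparing the record count and a control total against figures supplied by the data owner is one simple way to check completeness. The expected values would normally come from the source system, not from the extract itself.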