2022 Data Summit: Data Management and Analytics - Unify Analytics - Paige Roberts

Unify Analytics

Making Production Data Accessible for Both BI and Data Science

Paige Roberts

Vertica Open Source Relations Manager

Analytics Use Cases are Everywhere

Companies that enable our data-driven world use both BI and data science.

• Network Optimization
• Energy Optimization

© 2020 All rights reserved.

Vertica Machine Learning Customer Success

Arable land is diminishing, but world population is growing. Startup the Climate Corporation wanted to use analytics to optimize the use of seed, fertilizer, pesticides, water, etc. for farmers' best possible crop yield.

Problem: High variety and volume of data from multiple sources needed, plus sophisticated analysis required and support for many concurrent users. Startup couldn't grow.

• Manual data entry to Postgres DB took too long
• Wouldn't scale to many concurrent users
• Couldn't handle the sheer volume of IoT and other data they wanted to analyze

Action: With Vertica

• Combine drone data, satellite imagery, weather data, soil analysis, etc.
• Bring the analytics to life through visualization in a SaaS app accessible even from tractors
• Use Vertica's high-speed Kafka connector for fast data ingestion
• Use Vertica in Eon Mode on AWS cloud for easy, fast scalability
• Use Vertica's advanced analytical and in-database ML capabilities
• Start with the free version, then move to the commercial version as the company grows

Result

• Get high concurrency, unlimited scale, elastic growth. Adoption skyrockets, with exponential company growth. A new standard in how to grow food the smart way.
• Able to add new data sources like field service data
• Specialized multi-functional teams focus on creating more and better ML models
• Bought by Bayer at a billion-dollar valuation. Now Climate LLC.
• Expanding to manufacturing and research and development
• Won Data Breakthrough's "Best Predictive Analytics Solution" award 2022

Data Warehouse Architecture

[Diagram: batch sources (CRM, billing, ERP, transactional data, files, application data) supply customer, operational, and financial data. Extract, Transform, Load (ETL) moves it into an analytical database queried with SQL. Workloads: ad hoc queries, business intelligence, reporting. Output: business intelligence reports, visualization.]

Data Warehouse Strengths

• Reporting / Business Intelligence
• High Performance
• High Concurrency
• Reliability
• Security
• Governance
• SQL

Data Warehouse Weaknesses

• Expensive to scale
• Structured data only
• ETL caused data to be stale
• Business intelligence only
• Can't handle streaming data

Data Lake Architecture

[Diagram: low-latency sources (application data, web clicks, logs, sensors, operational metrics, user tracking, geo-location) flow through pub/sub and stream processing. Batch sources (contextual data such as weather and geo, files, transactional data, application data, OLTP/ODS) arrive by ELT with transformation in the data lake. Distributed mass storage (data lake) sits on HDFS and/or object storage. Data prep produces prepared data in a new format. A SQL-like query engine and machine learning run on top. Output: data science, visualization, applications.]
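The batch path in the diagram is labeled "ELT with transformation in data lake": data is loaded raw and transformed afterward, inside the storage or query layer, rather than before loading as in ETL. A minimal sketch of that ordering follows, assuming SQLite (Python's built-in `sqlite3`) as a stand-in for the analytical engine. The table names and sample rows are hypothetical and not from the talk.

```python
import sqlite3

# Hypothetical raw billing rows; amounts arrive as text, as they might
# from a file export. None of these values come from the talk.
RAW_ROWS = [
    ("c1", "2022-01-03", "19.99"),
    ("c1", "2022-01-09", "5.01"),
    ("c2", "2022-01-04", "42.00"),
]

# SQLite is only a stand-in for the analytical database or query engine.
conn = sqlite3.connect(":memory:")

# Extract + Load: land the data untransformed, keeping the raw text values.
conn.execute("CREATE TABLE raw_billing (customer TEXT, day TEXT, amount TEXT)")
conn.executemany("INSERT INTO raw_billing VALUES (?, ?, ?)", RAW_ROWS)

# Transform: expressed as SQL and run by the engine after loading.
conn.execute(
    """
    CREATE TABLE customer_totals AS
    SELECT customer, ROUND(SUM(CAST(amount AS REAL)), 2) AS total
    FROM raw_billing
    GROUP BY customer
    """
)

totals = dict(
    conn.execute("SELECT customer, total FROM customer_totals ORDER BY customer")
)
print(totals)
```

Because the raw table stays available, new transformations can be added later without re-extracting from the source systems, which is one common argument for ELT over ETL.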

Data Lake Strengths

• Machine Learning / Data Science
• Unlimited Scale
• Streaming Data
• Semi-Structured Data (JSON, AVRO, …)
• Complex Data Types (Maps, Structs, Arrays)
• Schema on Read
• Python, R, Jupyter
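"Schema on read" means the storage layer accepts records as they arrive and the reader decides which fields and types it needs at query time. The sketch below illustrates the idea with standard-library Python only. The sample records, field names, and the `read_with_schema` helper are hypothetical illustrations, not part of any product named in the talk.

```python
import copy
import json

# Hypothetical raw events as they might land in a data lake: one JSON
# document per line, with no schema enforced at write time.
RAW_LINES = [
    '{"sensor": "probe-1", "temp_c": "21.5", "tags": ["soil", "north"]}',
    '{"sensor": "probe-2", "temp_c": 19, "extra": {"battery": 0.8}}',
    '{"sensor": "probe-3"}',
]

# The schema lives with the reader: field name -> (type to cast to, default).
READ_SCHEMA = {
    "sensor": (str, ""),
    "temp_c": (float, None),
    "tags": (list, []),
}


def read_with_schema(lines, schema):
    """Parse raw JSON lines and apply the schema only at read time."""
    rows = []
    for line in lines:
        doc = json.loads(line)
        row = {}
        for field, (cast, default) in schema.items():
            value = doc.get(field)
            # Missing fields get a copy of the default; unknown fields are ignored.
            row[field] = cast(value) if value is not None else copy.copy(default)
        rows.append(row)
    return rows


rows = read_with_schema(RAW_LINES, READ_SCHEMA)
for row in rows:
    print(row)
```

A different reader could apply a different schema to the same raw lines, for example one that also extracts the nested `extra` field, without any rewrite of the stored data.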

Data Lake Weaknesses

• Slow performance
• Poor concurrency
• Complex to build/maintain
• Immature reliability, security, governance
• Difficulty operationalizing ML

Cooperative Architecture Strengths

Data warehouse:
• Reporting / Business Intelligence
• High Performance
• High Concurrency
• Reliability
• Security
• Governance
• SQL

Data lake:
• Machine Learning / Data Science
• Unlimited Scale
• Streaming Data
• Semi-Structured Data (JSON, AVRO, …)
• Complex Data Types (Maps, Structs, Arrays)
• Schema on Read
• Python, R, Jupyter

Cooperative Architecture

[Diagram: low-latency sources (application data, web clicks, logs, sensors, operational metrics, user tracking, geo-location) flow through pub/sub and stream processing. Batch sources (contextual data such as weather and geo, files, transactional data, application data, OLTP/ODS) arrive by fast ELT with transformation in the data lake. Distributed mass storage (data lake) on HDFS and/or object storage holds historical data and supports machine learning. ELT and data prep feed a distributed analytical database queried with SQL. Output: business intelligence + data science.]

[Diagram, untitled; the Bayer, Climate research, and FieldView sources suggest the Climate Corporation deployment. Low-latency sources: planting and harvest equipment; weather stations, probes, satellite imagery; application data (clickstreams). These pass through Flink. Batch sources: platform partner data, sales data, marketing campaigns, Bayer research trials, Climate research farms (CRF), Climate research partners (CRP), FieldView data, environmental data. Components: Amazon RDS, mass storage (data lake), a separate cluster for data ingest and ETL, machine learning, SQL. Output: data science + business intelligence.]

[Diagram, untitled; the labels name Uber. Low-latency sources: applications data, clickstreams, location data, held in a Cassandra key-value database. Batch sources: transactional databases (MySQL, PostgreSQL). Mass storage (data lake) on HDFS holds infrequently used data and flattened, modeled tables. The analytical database, with schema enforced, holds a duplicate of the recent, most used data and is queried with SQL through a database proxy and a data manager/manifest. Output: data science + business intelligence.

Ad hoc analytics:
• City Ops
• Data Scientists
• Query Builder (Uber created)
• Dash Builder (Uber created)

Applications:
• ETL/Modeling
• City Ops
• Machine Learning
• Experiments]

Cooperative Architecture Weaknesses

• Complexity
• Duplicated effort
• Division of BI and Data Science


Unify Storage Location

[Diagram: low-latency sources (application data, web clicks, logs, sensors, operational metrics, user tracking, geo-location) pass through stream processing. Batch sources (contextual data such as weather and geo, files, transactional data, application data, OLTP/ODS) arrive by ELT. Everything lands in object storage, which can be on-premises, hybrid, cloud, or multi-cloud (GCP is labeled). Access is through SQL. Output: business intelligence + data science.]

Manage Data Life Cycle with Data Formats

[Diagram: low-latency streaming data (application data, web clicks, logs, sensors, operational metrics, user tracking, geo-location) passes through stream processing. Batch data (contextual data such as files, weather, and geo; transactional data such as application data and OLTP/ODS) arrives by batch ETL or fast ELT. Raw data in many formats goes through ingestion, ELT, and data prep. Hot data and fast data are kept in ROS. Cold data and historical data stay in data formats that can be imported, exported, queried, and joined. Access is through SQL, on-premises, hybrid, cloud, or multi-cloud. Output: business intelligence + data science.]

Separate Compute from Storage

[Diagram: low-latency streaming data (application data, web clicks, logs, sensors, operational metrics, user tracking, geo-location) passes through stream processing. Batch data (contextual data such as files, weather, and geo; transactional data such as application data and OLTP/ODS) arrives by batch ETL or fast ELT. Ingestion, ELT, and data prep run as compute, separate from the storage layer holding ROS and data formats. Access is through SQL, on-premises, hybrid, cloud, or multi-cloud. Output: business intelligence + data science.]

Unify Analytics

[Diagram: low-latency streaming data (application data, web clicks, logs, sensors, operational metrics, user tracking, geo-location) passes through stream processing. Batch data (contextual data such as files, weather, and geo; transactional data such as application data and OLTP/ODS) arrives by batch ETL or ELT. Ingestion, ELT, and data prep work over ROS and data formats. Access is through SQL, on-premises, hybrid, cloud, or multi-cloud. Output: business intelligence + data science.]

Isolate Workloads

[Diagram: low-latency streaming data (application data, web clicks, logs, sensors, operational metrics, user tracking, geo-location) passes through stream processing. Batch data (contextual data such as files, weather, and geo; transactional data such as application data and OLTP/ODS) arrives by batch ETL or fast ELT. Workloads run on isolated compute: ingestion, ELT, and data prep; machine learning; ad hoc queries; reporting. Storage holds ROS and data formats. Access is through SQL, on-premises, hybrid, cloud, or multi-cloud. Output: business intelligence + data science.]

[Diagram, untitled: low-latency streaming data (application data, web clicks, logs, sensors, operational metrics, user tracking, geo-location) passes through stream processing. Batch data (contextual data such as files, weather, and geo; transactional data such as application data and OLTP/ODS) arrives by batch ETL or fast ELT. Workloads run on isolated compute: ingestion, ELT, and data prep; machine learning; ad hoc queries; reporting. Models can be imported, exported, scored, and managed, with model evaluation, deployment, and management. Storage holds ROS and data formats. Access is through SQL, on-premises, hybrid, cloud, or multi-cloud. Output: business intelligence + data science.]

Unified Analytics Platform

[Diagram: low-latency streaming data (application data, web clicks, logs, sensors, operational metrics, user tracking, geo-location) passes through stream processing. Batch data (contextual data such as files, weather, and geo; transactional data such as application data and OLTP/ODS) arrives by batch ETL or fast ELT. Workloads run on isolated compute: ingestion, ELT, and data prep; machine learning; ad hoc queries; reporting; model evaluation, deployment, and management. Storage holds ROS and data formats. Access is through SQL, on-premises, hybrid, cloud, or multi-cloud. Output: business intelligence + data science.]


Unified Analytics Warehouse Strengths

Data warehouse:
• Reporting / Business Intelligence
• High Performance
• High Concurrency
• Reliability
• Security
• Governance

Unified analytics:
• SQL
• BI Visualization Tools
• Python
• Jupyter
• R

Data lake:
• Machine Learning / Data Science
• Unlimited Scale
• Streaming Data
• Semi-Structured Data (JSON, AVRO, …)
• Complex Data Types (Maps, Structs, Arrays)
• Schema on Read

[Diagram, labeled Philips. Low-latency source: Remote Service Network, through stream processing. Batch sources: CRM data, repair shop data, factory data, and data from Teradata, Salesforce, SAP, and SQL Server. Components: mass storage (data lake), SQL. Output: business intelligence + data science, serving R&D access, remote monitoring, and remote service.]

[Diagram, untitled. Extremely low latency source: real-time bidding data. Low-latency source: third-party data. Batch source: contextual data (PostgreSQL). Components: mass storage (data lake) on HDFS, SQL. Output: data science + business intelligence, with two further business intelligence consumers.]

[Diagram, untitled. Low-latency source: EON TV service on/off and channel change data. Batch sources: CRM, billing, ERP, contact center, geo/mapping. Data loads by extract and load, with transformation pushed down. Subject areas: customer, CX, operational, financial, service quality, TV schedule. Workloads over SQL: ingest, transform, and data prep; ad hoc queries; reporting; machine learning. Output: data science + business intelligence, including content analytics reporting, financial reporting, data-driven apps, and real-time ad analysis for real-time apps.]

[Diagram, untitled. Extremely low latency source: real-time bidding data. Low-latency source: third-party data. Batch source: contextual data. Storage: hot, fast data and stored big data, accessed through SQL. A primary sub-cluster handles data ingest and ELT/aggregation. Ephemeral sub-clusters handle ad hoc BI queries, machine learning, and multiple reporting workloads. Output: ML application, ML applied, business intelligence, ML training, reporting.]

EMA Radar Report: Unified Analytics Warehouse
A Guide for Investing in Unified Analytics

Request your copy today: /success-ema-radar-report/

Learn More:

Try it Free: /try

Paige Roberts
Open Source Relations Manager
E: Paige.Roberts@

Nimble Storage - 287% ROI

Vertica Analytics Platform helps close more deals, automates customer service, and saves money

Nimble set out to replace its previous database solution with a product that is more scalable and robust, with back-end support.

Implements Vertica, names cool new capability: Infosight

• 86% of support cases now close automatically
• Support engineers able to take on 3x more work
• 50%-83% faster BI and data science queries
• Decreases support team calls by 19% annually
• Reduces hardware costs; 5x greater data compression

Nimble sold to HPE for billion-dollar valuation

Now, HPE InfoSight: Vertica Unified Analytics Platform with every server

/us/en/newsroom/press-release/2017/03/hpe-to-acquire-nimble-storage-to-strengthen-leadership-in-hybrid-it.html

April 9, 2022

YES, TEAM!!
