Knowledge Graph Overview: Special-Topic Training Slides

Knowledge graph architecture
• General knowledge graph architecture [source: Baidu Baike]
• Fudan University knowledge graph architecture
• Early knowledge graph architecture

Architecture discussion
• Early knowledge graph architecture
• Knowledge extraction
• Entity and concept extraction
• Entity and concept mapping
• Relation extraction
• Quality assessment

KDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York, August 24, 2014

A sampler of research problems
• Growth: knowledge graphs are incomplete!
  • Link prediction: add relations
  • Ontology matching: connect graphs
  • Knowledge extraction: extract new entities and relations from web/text
• Validation: knowledge graphs are not always correct!
  • Entity resolution: merge duplicate entities, split wrongly merged ones
  • Error detection: remove false assertions
• Interface: how to make it easier to access knowledge?
  • Semantic parsing: interpret the meaning of queries
  • Question answering: compute answers using the knowledge graph
• Intelligence: can AI emerge from knowledge graphs?
  • Automatic reasoning and planning
  • Generalization and abstraction

Relation extraction
• Definition: [not preserved]
• Common approaches:
  • Semantic pattern matching [frequent pattern mining, density-based clustering, semantic similarity]
  • Hierarchical topic models [weakly supervised]

Methods and techniques
1. Relation extraction:
  • Supervised models
  • Semi-supervised models
  • Distant supervision
2. Entity resolution
  • Single entity methods
  • Relational methods
3. Link prediction
  • Rule-based methods
  • Probabilistic models
  • Factorization methods
  • Embedding models
Not in this tutorial:
  • Entity classification
  • Group/expert detection
  • Ontology alignment
  • Object ranking

Relation extraction
• Extracting semantic relations between sets of [grounded] entities
• Numerous variants:
  • Undefined vs pre-determined set of relations
  • Binary vs n-ary relations, facet discovery
  • Extracting temporal information
  • Supervision: {fully, un, semi, distant}-supervision
  • Cues used: only lexical vs full linguistic features

Relation Extraction
[Figure: is there a playFor edge from Kobe Bryant to LA Lakers? Text mentions shown as evidence:]
  • “Kobe Bryant, the franchise player of the Lakers”
  • “Kobe once again saved his team”
  • “Kobe Bryant, man of the match for Los Angeles”

Supervised relation extraction
• Sentence-level labels of relation mentions
  • "Apple CEO Steve Jobs said.." => (SteveJobs, CEO, Apple)
  • "Steve Jobs said that Apple will.." => NIL
• Traditional relation extraction datasets
  • ACE 2004
  • MUC-7
  • Biomedical datasets (e.g. BioNLP challenges)
• Learn classifiers from +/- examples
• Typical features: context words + POS, dependency path between entities, named entity tags, token/parse-path/entity distance
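The lexical part of the typical features listed for supervised relation extraction can be illustrated with a small sketch. It assumes whitespace tokenization and entity spans that are already known; the function name `mention_features` and the feature keys are hypothetical, and POS tags, dependency paths, and named entity tags are left out because they would require a parser or tagger.

```python
def mention_features(tokens, e1_span, e2_span, window=2):
    """Lexical features for one candidate relation mention.

    tokens: list of str
    e1_span, e2_span: (start, end) token offsets, end exclusive
    """
    # Order the two mentions by their position in the sentence.
    first, second = sorted([e1_span, e2_span])
    between = tokens[first[1]:second[0]]
    left = tokens[max(0, first[0] - window):first[0]]
    right = tokens[second[1]:second[1] + window]
    return {
        "between_words": " ".join(between),   # context words between mentions
        "left_context": " ".join(left),       # words before the first mention
        "right_context": " ".join(right),     # words after the second mention
        "token_distance": len(between),       # token distance between mentions
        "e1_first": e1_span < e2_span,        # does entity 1 come first?
    }

tokens = "Apple CEO Steve Jobs said that sales grew".split()
# Entity 1 is "Steve Jobs" (tokens 2-3), entity 2 is "Apple" (token 0).
features = mention_features(tokens, e1_span=(2, 4), e2_span=(0, 1))
```

A classifier would then be trained on such feature dictionaries, with the sentence-level labels as positive and negative (NIL) examples.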

Semi-supervised relation extraction
• Generic algorithm
  1. Start with seed triples / golden seed patterns
  2. Extract patterns that match seed triples/patterns
  3. Take the top-k extracted patterns/triples
  4. Add to seed patterns/triples
  5. Go to 2
• Many published approaches in this category:
  • Dual Iterative Pattern Relation Extractor [Brin, 98]
  • Snowball [Agichtein & Gravano, 00]
  • TextRunner [Banko et al., 07] (almost unsupervised)
• Differ in pattern definition and selection
[Figure: example relation founderOf]
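The generic bootstrapping loop can be sketched for a single relation as follows. This is a toy illustration under strong assumptions, not the method of any of the cited systems: a pattern is simply the literal text between two known entities, candidate entities are runs of capitalized words, and the names `bootstrap`, `top_k`, and `iterations` are hypothetical.

```python
import re
from collections import Counter

def bootstrap(sentences, seed_pairs, top_k=2, iterations=2):
    """Toy bootstrapping loop for one relation (e.g. founderOf)."""
    pairs = set(seed_pairs)
    patterns = set()
    # Assumed entity shape: one or more capitalized words.
    entity = r"([A-Z]\w*(?: [A-Z]\w*)*)"
    for _ in range(iterations):
        # Step 2: collect the text between known pairs as candidate patterns.
        counts = Counter()
        for sentence in sentences:
            for subj, obj in pairs:
                match = re.search(re.escape(subj) + r"(.+?)" + re.escape(obj), sentence)
                if match:
                    counts[match.group(1)] += 1
        # Steps 3-4: keep the top-k patterns and add them to the pattern set.
        patterns.update(p for p, _ in counts.most_common(top_k))
        # Apply the patterns to extract new pairs, then loop (step 5).
        for pattern in patterns:
            regex = re.compile(entity + re.escape(pattern) + entity)
            for sentence in sentences:
                for subj, obj in regex.findall(sentence):
                    pairs.add((subj, obj))
    return pairs, patterns

sentences = [
    "Larry Page founded Google in 1998.",
    "Bill Gates founded Microsoft in 1975.",
    "Steve Jobs , the founder of Apple , returned.",
]
pairs, patterns = bootstrap(sentences, {("Larry Page", "Google")})
```

Starting from one seed pair, the loop learns the pattern " founded " and uses it to pick up the second pair. Real systems differ mainly in how patterns are defined and scored, which is what limits the drift that such a naive loop is prone to.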

Distantly-supervised relation extraction
• Existing knowledge base + unlabeled text => generate examples
  • Locate pairs of related entities in text
  • Hypothesize that the relation is expressed
• Example sentences:
  • Google CEO Larry Page announced that...
  • Steve Jobs has been Apple for a while...
  • Pixar lost its co-founder Steve Jobs...
  • I went to Paris, France for the summer...
[Figure: knowledge base graph with nodes Google, Larry Page, Apple, Steve Jobs, Pixar, France and edge labels CEO, capitalOf]

Distant supervision: modeling hypotheses
Typical architecture:
1. Collect many pairs of entities co-occurring in sentences from text corpus
2. If 2 entities participate in a relation, several hypotheses:
  1. All sentences mentioning them express it [Mintz et al., 09]
     “Barack Obama is the 44th and current President of the US.” => (BO, employedBy, USA)
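Under the first hypothesis (every sentence mentioning both entities expresses the relation), generating training examples reduces to string matching against the knowledge base. The sketch below assumes exact substring matching for entity mentions; the function name `distant_label` and the contents of `kb_triples` are illustrative assumptions, loosely modeled on the example sentences of the slide.

```python
def distant_label(sentences, kb_triples):
    """Label each sentence that mentions both entities of a KB triple."""
    examples = []
    for sentence in sentences:
        for subj, relation, obj in kb_triples:
            # Assumption: an entity is "mentioned" if its name is a substring.
            if subj in sentence and obj in sentence:
                examples.append((sentence, subj, relation, obj))
    return examples

# Hypothetical knowledge base used only for this illustration.
kb_triples = [
    ("Larry Page", "CEO", "Google"),
    ("Steve Jobs", "CEO", "Apple"),
    ("Paris", "capitalOf", "France"),
]
sentences = [
    "Google CEO Larry Page announced that...",
    "Steve Jobs has been Apple for a while...",
    "Pixar lost its co-founder Steve Jobs...",
    "I went to Paris, France for the summer...",
]
examples = distant_label(sentences, kb_triples)
```

The last sentence is labeled capitalOf even though it does not state that relation, which is exactly the kind of noise the weaker hypotheses try to tolerate.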

Sentence-level features
• Lexical: words in between and around mentions and their parts-of-speech tags (conjunctive form)
• Syntactic: dependency parse path between mentions along with side nodes
• Named Entity Tags: for the mentions
• Conjunctions of the above features
• Distant supervision is used on lots of data, so sparsity of conjunctive forms is not an issue

Distant supervision: modeling hypotheses (continued)
  2. At least one sentence mentioning them expresses it [Riedel et al., 10]
     “Barack Obama is the 44th and current President of the US.” => (BO, employedBy, USA)
     “Obama flew back to the US on Wednesday.” => (BO, employedBy, USA)
     Hypothesis 1 labels both sentences with employedBy, although the second does not express it; hypothesis 2 only requires that at least one of them does.

Distant supervision: modeling hypotheses (continued)
  3. At least one sentence mentioning them expresses it, and 2 entities can express multiple relations [Hoffmann et al., 11] [Surdeanu et al., 12]
     “Barack Obama is the 44th and current President of the US.” => (BO, employedBy, USA)
     “Obama flew back to the US on Wednesday.”
     “Obama was born in the US just as he always said.” => (BO, bornIn, USA)

Distant supervision
• Pros
  • Can scale to the web, as no supervision required
  • Generalizes to text from different domains
  • Generates a lot more supervision in one iteration
• Cons
  • Needs high quality entity-matching
  • Relation-expression hypothesis can be wrong
    • Can be compensated by the extraction model, redundancy, language model
  • Does not generate negative examples
    • Partially tackled by matching unrelated entities

Entity resolution
[Figure: example graph with nodes Kobe Bryant, Black Mamba, Kobe B. Bryant, Pau Gasol, LA Lakers, Vanessa L. Bryant, 1978, 35 and edge labels teammate, bornIn, playInLeague, playFor, age, marriedTo; annotated with "Single entity resolution" and "Relational entity resolution"]

Definitions:
• We consider the entity resolution (ER) problem (also known as deduplication, or merge–purge), in which records determined to represent the same real-world entity are successively located and merged.
• The problem of extracting, matching and resolving entity mentions in structured and unstructured data.

Methods: entity resolution/deduplication
• Multiple mentions of the same entity are wrong and confusing.

Single-entity entity resolution
• Entity resolution without using the relational context of entities
• Many distances/similarities for single-entity entity resolution:
  • Edit distance (Levenshtein, etc.)
  • Set similarity (TF-IDF, etc.)
  • Alignment-based
  • Numeric distance between values
  • Phonetic similarity
  • Equality on a boolean predicate
  • Translation-based
  • Domain-specific
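Two of the listed measures can be sketched in a few lines: Levenshtein edit distance, and a token-level Jaccard overlap used here as a simple stand-in for the set similarity family (the slide names TF-IDF; Jaccard is a substitution made for brevity). The example names come from the entity resolution figure; the function names are hypothetical.

```python
def levenshtein(a, b):
    """Edit distance with unit-cost insertion, deletion, and substitution."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]

def jaccard(a, b):
    """Size of the intersection over size of the union of two collections."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)

name_1, name_2 = "Kobe Bryant", "Kobe B. Bryant"
edit = levenshtein(name_1, name_2)                      # character level
token_overlap = jaccard(name_1.split(), name_2.split())  # token level
```

Neither score sees the graph around the two names, which is the limitation that relational entity resolution addresses.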

Relational entity resolution: simple strategies
• Enrich model with relational features => richer context for matching
• Relational features:
  • Value of edge or neighboring attribute
  • Set similarity measures
    • Overlap/Jaccard
    • Average similarity between set members
    • Adamic/Adar: two entities are more similar if they share more items that are overall less frequent
    • SimRank: two entities are similar if they are related to similar objects
    • Katz score: two entities are similar if they are connected by shorter paths
[Figure: example graph with nodes Kobe Bryant, Black Mamba, Pau Gasol, LA Lakers, 1978, 35 and edge labels teammate, bornIn, playFor, playInLeague, age]
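Two of the set similarity measures, Jaccard and Adamic/Adar, can be computed directly on neighbor sets. The sketch below treats the graph as undirected; the triples are an assumption loosely based on the node and edge labels of the figure, since the figure's exact edges are not recoverable.

```python
import math
from collections import defaultdict

# Hypothetical triples for illustration only.
triples = [
    ("Kobe Bryant", "playFor", "LA Lakers"),
    ("Kobe Bryant", "teammate", "Pau Gasol"),
    ("Kobe Bryant", "bornIn", "1978"),
    ("Black Mamba", "playFor", "LA Lakers"),
    ("Black Mamba", "teammate", "Pau Gasol"),
    ("Black Mamba", "age", "35"),
    ("Pau Gasol", "playFor", "LA Lakers"),
]

# Undirected neighbor sets, ignoring edge labels.
neighbors = defaultdict(set)
for subj, _, obj in triples:
    neighbors[subj].add(obj)
    neighbors[obj].add(subj)

def jaccard_neighbors(a, b):
    """Shared neighbors divided by all neighbors of either entity."""
    union = neighbors[a] | neighbors[b]
    return len(neighbors[a] & neighbors[b]) / len(union) if union else 0.0

def adamic_adar(a, b):
    """Shared neighbors weighted by 1 / log(degree): rare neighbors count more."""
    shared = neighbors[a] & neighbors[b]
    # Skip degree-1 neighbors, for which log(degree) would be zero.
    return sum(1.0 / math.log(len(neighbors[z])) for z in shared if len(neighbors[z]) > 1)

jac = jaccard_neighbors("Kobe Bryant", "Black Mamba")
aa = adamic_adar("Kobe Bryant", "Black Mamba")
```

Here the two candidate duplicates share a team and a teammate, so both scores are high even though their names have nothing in common.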

Relational entity resolution: advanced strategies
• Dependency graph approaches [Dong et al., 05]
• Relational clustering [Bhattacharya & Getoor, 07]
• Probabilistic Relational Models [Pasula et al., 03]
• Markov Logic Networks [Singla & Domingos, 06]
• Probabilistic Soft Logic [Broecheler & Getoor, 10]

LINK PREDICTION

Link prediction
• Add knowledge from existing graph
• No external source
• Reasoning within the graph
  1. Rule-based methods
  2. Probabilistic models
  3. Factorization models
  4. Embedding models
[Figure: example graph with nodes Kobe Bryant, Pau Gasol, LA Lakers, NY Knicks and edge labels teammate, playFor, playInLeague, teamInLeague, opponent]

First Order Inductive Learner
• FOIL learns function-free Horn clauses:
  • given positive and negative examples of a concept
  • and a set of background-knowledge predicates
  • FOIL inductively generates a logical rule for the concept that covers all + and no - examples
• Example rule: teammate(x,y) ∧ playFor(y,z) ⇒ playFor(x,z)
  [Figure: Kobe Bryant and Pau Gasol linked by teammate, with playFor edges to LA Lakers]
• Computationally expensive: huge search space => large, costly Horn clauses
• Must add constraints => high precision but low recall
• Inductive Logic Programming: deterministic and potentially problematic
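Once a Horn clause such as teammate(x,y) ∧ playFor(y,z) ⇒ playFor(x,z) has been learned, using it for link prediction is plain forward chaining. The sketch below only applies that one rule to a set of triples; it does not implement FOIL's rule search, and the function name `apply_teammate_rule` and the input facts are hypothetical.

```python
def apply_teammate_rule(triples):
    """Forward-chain teammate(x, y) AND playFor(y, z) => playFor(x, z)."""
    facts = set(triples)
    changed = True
    while changed:  # repeat until no new fact is derived
        changed = False
        teammates = [(s, o) for s, r, o in facts if r == "teammate"]
        plays_for = [(s, o) for s, r, o in facts if r == "playFor"]
        for x, y in teammates:
            for player, team in plays_for:
                new_fact = (x, "playFor", team)
                if player == y and new_fact not in facts:
                    facts.add(new_fact)
                    changed = True
    return facts

# Hypothetical input graph for illustration.
known = {
    ("Kobe Bryant", "teammate", "Pau Gasol"),
    ("Pau Gasol", "playFor", "LA Lakers"),
}
inferred = apply_teammate_rule(known) - known
```

The rule fires with certainty wherever its body matches, which illustrates why such deterministic rules can be problematic on noisy graphs.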
