




Knowledge graph architecture
• General knowledge graph architecture [source: Baidu Baike]
• Fudan University knowledge graph architecture
• Early knowledge graph architecture

Architecture discussion: early knowledge graph architecture
• Knowledge extraction
• Entity and concept extraction
• Entity-concept mapping
• Relation extraction
• Quality evaluation

KDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York, August 24, 2014

A sampler of research problems
• Growth: knowledge graphs are incomplete!
  • Link prediction: add relations
  • Ontology matching: connect graphs
  • Knowledge extraction: extract new entities and relations from web/text
• Validation: knowledge graphs are not always correct!
  • Entity resolution: merge duplicate entities, split wrongly merged ones
  • Error detection: remove false assertions
• Interface: how to make it easier to access knowledge?
  • Semantic parsing: interpret the meaning of queries
  • Question answering: compute answers using the knowledge graph
• Intelligence: can AI emerge from knowledge graphs?
  • Automatic reasoning and planning
  • Generalization and abstraction

Relation extraction: definition and common approaches
• Semantic pattern matching [frequent pattern mining, density-based clustering, semantic similarity]
• Hierarchical topic models [weakly supervised]

Methods and techniques
1. Relation extraction:
   • Supervised models
   • Semi-supervised models
   • Distant supervision
2. Entity resolution
   • Single entity methods
   • Relational methods
3. Link prediction
   • Rule-based methods
   • Probabilistic models
   • Factorization methods
   • Embedding models
Not in this tutorial:
• Entity classification
• Group/expert detection
• Ontology alignment
• Object ranking

Relation Extraction
• Extracting semantic relations between sets of [grounded] entities
• Numerous variants:
  • Undefined vs pre-determined set of relations
  • Binary vs n-ary relations, facet discovery
  • Extracting temporal information
  • Supervision: {fully, un, semi, distant}-supervision
  • Cues used: only lexical vs full linguistic features
[Figure: is there a playFor edge between Kobe Bryant and LA Lakers? Example mentions: “Kobe Bryant, the franchise player of the Lakers”, “Kobe once again saved his team”, “Kobe Bryant, man of the match for Los Angeles”]

Supervised relation extraction
• Sentence-level labels of relation mentions
  • "Apple CEO Steve Jobs said.." => (SteveJobs, CEO, Apple)
  • "Steve Jobs said that Apple will.." => NIL
• Traditional relation extraction datasets
  • ACE 2004
  • MUC-7
  • Biomedical datasets (e.g. BioNLP challenges)
• Learn classifiers from +/- examples
• Typical features: context words + POS, dependency path between entities, named entity tags, token/parse-path/entity distance

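To make the "typical features" concrete, here is a minimal sketch that computes a few lexical features for one sentence-level example. The whitespace tokenizer, the function name `lexical_features`, and the feature names are assumptions made for illustration. POS tags, dependency paths and named entity tags would come from an NLP toolkit and are left out.

```python
def lexical_features(tokens, e1_span, e2_span, window=2):
    """Return simple lexical features for one entity pair in one sentence.

    tokens  : list of str
    e1_span : (start, end) token indices of the first argument, end exclusive
    e2_span : (start, end) token indices of the second argument, end exclusive
    """
    # Order the two mentions by their position in the sentence.
    (s1, t1), (s2, t2) = sorted([e1_span, e2_span])
    between = tokens[t1:s2]
    left = tokens[max(0, s1 - window):s1]
    right = tokens[t2:t2 + window]
    return {
        "between": " ".join(between),        # context words between mentions
        "left": " ".join(left),              # words before the first mention
        "right": " ".join(right),            # words after the second mention
        "token_distance": len(between),      # token distance between mentions
        "e1_first": e1_span[0] < e2_span[0], # argument order in the text
    }


# Example for the triple (SteveJobs, CEO, Apple).
tokens = "Apple CEO Steve Jobs said that the company will grow".split()
feats = lexical_features(tokens, e1_span=(2, 4), e2_span=(0, 1))
```

A feature dictionary of this kind would then be fed, together with the +/- label, to any standard classifier.
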
Semi-supervised relation extraction
• Generic algorithm
  1. Start with seed triples / golden seed patterns
  2. Extract patterns that match seed triples/patterns
  3. Take the top-k extracted patterns/triples
  4. Add to seed patterns/triples
  5. Go to 2
• Many published approaches in this category:
  • Dual Iterative Pattern Relation Extractor [Brin, 98]
  • Snowball [Agichtein & Gravano, 00]
  • TextRunner [Banko et al., 07] – almost unsupervised
• Differ in pattern definition and selection
[Figure: example with the relation founderOf]

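The generic bootstrapping loop can be sketched as follows. This is a toy illustration only, not DIPRE, Snowball or TextRunner: a "pattern" is assumed to be the literal text between the two arguments, a sentence is assumed to have the shape "subject pattern object.", and the function name `bootstrap` and the example sentences are hypothetical.

```python
import re
from collections import Counter


def bootstrap(sentences, seeds, k=2, iterations=2):
    """Grow a set of (subject, object) pairs from a few seed pairs."""
    pairs = set(seeds)       # step 1: start with seed pairs
    patterns = set()
    for _ in range(iterations):
        # Step 2: collect the text between the arguments of known pairs.
        counts = Counter()
        for s in sentences:
            for subj, obj in pairs:
                m = re.search(re.escape(subj) + r"(.+?)" + re.escape(obj), s)
                if m:
                    counts[m.group(1)] += 1
        # Steps 3-4: keep the top-k patterns and add them to the pattern set.
        patterns.update(p for p, _ in counts.most_common(k))
        # Apply the patterns to harvest new pairs, then loop again (step 5).
        for s in sentences:
            for p in patterns:
                m = re.fullmatch(r"(.+?)" + re.escape(p) + r"(.+?)\.?", s)
                if m:
                    pairs.add((m.group(1), m.group(2)))
    return pairs, patterns


sentences = [
    "Steve Jobs founded Apple.",
    "Larry Page founded Google.",
    "Bill Gates founded Microsoft.",
    "Steve Jobs visited Paris.",
]
pairs, patterns = bootstrap(sentences, seeds={("Steve Jobs", "Apple")})
```

Real systems differ mainly in how patterns are defined and scored. Ranking by raw frequency, as done here, is the simplest possible choice.
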
Distantly-supervised relation extraction
• Existing knowledge base + unlabeled text generate examples
• Locate pairs of related entities in text
• Hypothesize that the relation is expressed
Example sentences:
  Google CEO Larry Page announced that...
  Steve Jobs has been Apple for a while...
  Pixar lost its co-founder Steve Jobs...
  I went to Paris, France for the summer...
[Figure: knowledge base graph with entities Google, Larry Page, Apple, Pixar, Steve Jobs, France and edge labels CEO, capitalOf]

Distant supervision: modeling hypotheses
Typical architecture:
1. Collect many pairs of entities co-occurring in sentences from text corpus
2. If 2 entities participate in a relation, several hypotheses:
   1. All sentences mentioning them express it [Mintz et al., 09]
“Barack Obama is the 44th and current President of the US.” (BO, employedBy, USA)

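Under the first hypothesis (every sentence mentioning both entities expresses the relation), generating training examples reduces to aligning knowledge base triples with sentences. The sketch below assumes naive substring matching for entity mentions and a hypothetical two-triple knowledge base. The name `distant_labels` is illustrative.

```python
def distant_labels(kb, sentences):
    """Label every sentence that mentions both entities of a KB triple.

    kb        : iterable of (subject, relation, object) triples
    sentences : iterable of str
    Returns a list of (sentence, subject, relation, object) examples.
    """
    examples = []
    for sentence in sentences:
        for subj, rel, obj in kb:
            # Naive entity matching by substring; a real system needs
            # high-quality entity matching at this point.
            if subj in sentence and obj in sentence:
                examples.append((sentence, subj, rel, obj))
    return examples


# Hypothetical KB; the triple direction is an assumption.
kb = [("Larry Page", "CEO", "Google"), ("Steve Jobs", "CEO", "Apple")]
sentences = [
    "Google CEO Larry Page announced that...",
    "Steve Jobs has been Apple for a while...",
    "Pixar lost its co-founder Steve Jobs...",
    "I went to Paris, France for the summer...",
]
examples = distant_labels(kb, sentences)
```

The second sentence is labeled as a CEO example even though it does not clearly express that relation, which shows the kind of noise this hypothesis admits.
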
Sentence-level features
• Lexical: words in between and around mentions and their parts-of-speech tags (conjunctive form)
• Syntactic: dependency parse path between mentions along with side nodes
• Named Entity Tags: for the mentions
• Conjunctions of the above features
• Distant supervision is applied to lots of data, so sparsity of conjunctive forms is not an issue

Distant supervision: modeling hypotheses
Typical architecture:
1. Collect many pairs of entities co-occurring in sentences from text corpus
2. If 2 entities participate in a relation, several hypotheses:
   1. All sentences mentioning them express it [Mintz et al., 09]
   2. At least one sentence mentioning them expresses it [Riedel et al., 10]
“Barack Obama is the 44th and current President of the US.” (BO, employedBy, USA)
“Obama flew back to the US on Wednesday.” (BO, employedBy, USA)

Distant supervision: modeling hypotheses
Typical architecture:
1. Collect many pairs of entities co-occurring in sentences from text corpus
2. If 2 entities participate in a relation, several hypotheses:
   1. All sentences mentioning them express it [Mintz et al., 09]
   2. At least one sentence mentioning them expresses it [Riedel et al., 10]
   3. At least one sentence mentioning them expresses it, and 2 entities can express multiple relations [Hoffmann et al., 11] [Surdeanu et al., 12]
“Barack Obama is the 44th and current President of the US.” (BO, employedBy, USA)
“Obama flew back to the US on Wednesday.” (BO, employedBy, USA)
“Obama was born in the US just as he always said.” (BO, bornIn, USA)

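The difference between the "all sentences" and "at least one sentence" hypotheses can be illustrated by grouping mentions into one bag per entity pair and scoring the bag by its best sentence. This is only a sketch of the aggregation idea, not the models of the cited papers. `toy_scorer` is a hypothetical stand-in for a trained sentence-level classifier.

```python
from collections import defaultdict


def group_into_bags(mentions):
    """Group (subject, object, sentence) mentions by entity pair."""
    bags = defaultdict(list)
    for subj, obj, sentence in mentions:
        bags[(subj, obj)].append(sentence)
    return dict(bags)


def bag_score(sentence_scores):
    """At-least-one: a bag expresses the relation if its best sentence does."""
    return max(sentence_scores)


def toy_scorer(sentence):
    # Hypothetical score for the relation employedBy.
    return 0.9 if "President" in sentence else 0.1


mentions = [
    ("BO", "USA", "Barack Obama is the 44th and current President of the US."),
    ("BO", "USA", "Obama flew back to the US on Wednesday."),
]
bags = group_into_bags(mentions)
scores = [toy_scorer(s) for s in bags[("BO", "USA")]]
pair_score = bag_score(scores)
```

Under the first hypothesis both sentences would be positive training examples. Under the at-least-one hypothesis only the bag as a whole has to support the relation, so the low-scoring second sentence does no harm.
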
Distant supervision
• Pros
  • Can scale to the web, as no supervision required
  • Generalizes to text from different domains
  • Generates a lot more supervision in one iteration
• Cons
  • Needs high quality entity-matching
  • Relation-expression hypothesis can be wrong
    Can be compensated by the extraction model, redundancy, language model
  • Does not generate negative examples
    Partially tackled by matching unrelated entities

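One way to read "matching unrelated entities" is to treat entity pairs that co-occur in a sentence but have no relation in the knowledge base as negative examples. The sketch below makes that assumption explicit. The function name, the substring matching, and the toy data are all hypothetical.

```python
from itertools import combinations


def negative_examples(kb, sentences, entities):
    """Return (sentence, e1, e2) for co-occurring pairs with no KB relation."""
    # Treat relatedness as symmetric for the purpose of filtering.
    related = {(s, o) for s, _, o in kb} | {(o, s) for s, _, o in kb}
    negatives = []
    for sentence in sentences:
        present = [e for e in entities if e in sentence]
        for e1, e2 in combinations(present, 2):
            if (e1, e2) not in related:
                negatives.append((sentence, e1, e2))
    return negatives


kb = [("Steve Jobs", "CEO", "Apple")]
entities = ["Steve Jobs", "Apple", "Paris"]
sentences = [
    "Steve Jobs presented the Apple keynote.",
    "Steve Jobs visited Paris.",
]
negatives = negative_examples(kb, sentences, entities)
```

Because the knowledge base is incomplete, some of these pairs are in fact related, which is why the problem is only partially tackled this way.
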
Entity resolution / deduplication
• Having multiple mentions of the same entity is wrong and confusing.
[Figure: graph with nodes Kobe Bryant, Black Mamba, Kobe B. Bryant, Pau Gasol, LA Lakers, Vanessa L. Bryant, 1978, 35 and edge labels teammate, bornIn, playFor, playInLeague, age, marriedTo]
DEF: We consider the entity resolution (ER) problem (also known as deduplication, or merge–purge), in which records determined to represent the same real-world entity are successively located and merged.
DEF: The problem of extracting, matching and resolving entity mentions in structured and unstructured data.
Methods
• Single entity resolution
• Relational entity resolution

Single-entity entity resolution
• Entity resolution without using the relational context of entities
• Many distances/similarities for single-entity entity resolution:
  • Edit distance (Levenshtein, etc.)
  • Set similarity (TF-IDF, etc.)
  • Alignment-based
  • Numeric distance between values
  • Phonetic Similarity
  • Equality on a boolean predicate
  • Translation-based
  • Domain-specific

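Two of the listed measures are easy to sketch: Levenshtein edit distance on characters, and a set similarity on name tokens. Jaccard is used here as a simple stand-in for the set-similarity family, which is an assumption and not the TF-IDF weighting named on the slide.

```python
def levenshtein(a, b):
    """Edit distance with unit-cost insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # deletion
                current[j - 1] + 1,            # insertion
                previous[j - 1] + (ca != cb),  # substitution or match
            ))
        previous = current
    return previous[-1]


def jaccard(a, b):
    """Jaccard similarity between the lowercase token sets of two names."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 1.0


d = levenshtein("Kobe Bryant", "Kobe B. Bryant")
s = jaccard("Kobe Bryant", "Kobe B. Bryant")
```

Neither measure can match "Kobe Bryant" with "Black Mamba", since the two names share no characters in sequence and no tokens. This is the kind of case where relational context helps.
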
Relational entity resolution – Simple strategies
• Enrich model with relational features: richer context for matching
• Relational features:
  • Value of edge or neighboring attribute
  • Set similarity measures
    • Overlap/Jaccard
    • Average similarity between set members
    • Adamic/Adar: two entities are more similar if they share more items that are overall less frequent
    • SimRank: two entities are similar if they are related to similar objects
    • Katz score: two entities are similar if they are connected by shorter paths
[Figure: graph with nodes Kobe Bryant, Black Mamba, Pau Gasol, LA Lakers, 1978, 35 and edge labels teammate, bornIn, playFor, playInLeague, age]

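Two of the set similarity measures above, Jaccard and Adamic/Adar, can be computed directly on neighbor sets. The edge list below is a hypothetical toy graph that only borrows node and edge labels from the figure. Which edges the original figure actually contains is not recoverable from the text.

```python
import math


def neighbors(edges):
    """Build an undirected adjacency map from (head, relation, tail) edges."""
    adj = {}
    for head, _, tail in edges:
        adj.setdefault(head, set()).add(tail)
        adj.setdefault(tail, set()).add(head)
    return adj


def jaccard(adj, a, b):
    """Overlap of the two neighbor sets relative to their union."""
    union = adj[a] | adj[b]
    return len(adj[a] & adj[b]) / len(union) if union else 0.0


def adamic_adar(adj, a, b):
    """Shared neighbors count more when they have few neighbors themselves."""
    return sum(1.0 / math.log(len(adj[z]))
               for z in adj[a] & adj[b] if len(adj[z]) > 1)


# Hypothetical edges for illustration.
edges = [
    ("Kobe Bryant", "playFor", "LA Lakers"),
    ("Kobe Bryant", "bornIn", "1978"),
    ("Black Mamba", "playFor", "LA Lakers"),
    ("Black Mamba", "teammate", "Pau Gasol"),
    ("Pau Gasol", "playFor", "LA Lakers"),
]
adj = neighbors(edges)
jac = jaccard(adj, "Kobe Bryant", "Black Mamba")
aa = adamic_adar(adj, "Kobe Bryant", "Black Mamba")
```

Here the only shared neighbor is LA Lakers, which has three neighbors, so the Adamic/Adar score is 1/ln 3. A shared neighbor with fewer connections would contribute more.
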
Relational entity resolution – Advanced strategies
• Dependency graph approaches [Dong et al., 05]
• Relational clustering [Bhattacharya & Getoor, 07]
• Probabilistic Relational Models [Pasula et al., 03]
• Markov Logic Networks [Singla & Domingos, 06]
• Probabilistic Soft Logic [Broecheler & Getoor, 10]
[Figure: graph with nodes Kobe Bryant, Black Mamba, Pau Gasol, LA Lakers, 1978, 35 and edge labels teammate, bornIn, playFor, playInLeague, age]

LINK PREDICTION

Link prediction
• Add knowledge from existing graph
• No external source
• Reasoning within the graph
1. Rule-based methods
2. Probabilistic models
3. Factorization models
4. Embedding models
[Figure: graph with nodes Kobe Bryant, Pau Gasol, LA Lakers, NY Knicks and edge labels teammate, playFor, playInLeague, teamInLeague, opponent]

First Order Inductive Learner
• FOIL learns function-free Horn clauses:
  • given positive and negative examples of a concept
  • a set of background-knowledge predicates
  • FOIL inductively generates a logical rule for the concept that covers all positive and no negative examples
  teammate(x,y) ∧ playFor(y,z) ⇒ playFor(x,z)
[Figure: Kobe Bryant and Pau Gasol linked by teammate, with playFor edges to LA Lakers]
• Computationally expensive: huge search space, large and costly Horn clauses
• Must add constraints: high precision but low recall
• Inductive Logic Programming: deterministic and potentially problematic

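Once a rule such as teammate(x,y) ∧ playFor(y,z) ⇒ playFor(x,z) has been learned, using it for link prediction is a matter of forward chaining over the graph. The sketch below only applies this one rule. It does not implement FOIL's rule learning, and the fact encoding as (predicate, subject, object) tuples is an assumption.

```python
def apply_rule(facts):
    """Apply teammate(x,y) ∧ playFor(y,z) ⇒ playFor(x,z) until a fixpoint."""
    facts = set(facts)  # work on a copy so the input is left unchanged
    changed = True
    while changed:
        changed = False
        teammates = [(x, y) for (p, x, y) in facts if p == "teammate"]
        plays = [(y, z) for (p, y, z) in facts if p == "playFor"]
        for x, y in teammates:
            for y2, z in plays:
                new = ("playFor", x, z)
                if y == y2 and new not in facts:
                    facts.add(new)
                    changed = True
    return facts


facts = {
    ("teammate", "Kobe Bryant", "Pau Gasol"),
    ("playFor", "Pau Gasol", "LA Lakers"),
}
inferred = apply_rule(facts) - facts
```

The rule is deterministic: a single wrong teammate edge would yield a wrong playFor edge with no notion of uncertainty, which is one reason probabilistic models come next in the list of methods.
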