




Adversarial Example Detection
Yu-Gang Jiang (姜育刚), Xingjun Ma (马兴军), Zuxuan Wu (吴祖煊)

Recap: week 3
1. Adversarial Examples
2. Adversarial Attacks
3. Adversarial Vulnerability Understanding

In-class Adversarial Attack Competition
https://codalab.lisn.upsaclay.fr/competitions/15669?secret_key=77cb8986-d5bd-4009-82f0-7dde2e819ff8
- The adversarial attack competition accounts for 30% of the grade.
- You must register for the competition with your university email address; otherwise no score is recorded.
- Schedule: Phase 1 runs from October 1 to October 28. Phase 2 is the evaluation phase; students do not take part in it.
- Students without a GPU can use Google Colab.
- Scoring is by rank: first place receives 30 points, last place receives 15 points.

Adversarial Example Detection (AED)
- A binary classification problem: clean (y=0) or adversarial (y=1)?
- An anomaly detection problem: benign (y=0) or abnormal (y=1)?
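Under the binary classification view, a detector gives each input a score and is judged by how well those scores separate clean (y=0) from adversarial (y=1) inputs. The sketch below computes the area under the ROC curve from scores and labels, which is one common way to summarize that separation. The function name `detection_auc` and the toy scores are illustrative assumptions, not part of the lecture.

```python
import numpy as np

def detection_auc(scores, labels):
    """AUC = probability that a random adversarial input (y=1) scores
    higher than a random clean input (y=0); ties count as one half."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    pos = scores[labels == 1]   # scores of adversarial inputs
    neg = scores[labels == 0]   # scores of clean inputs
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)

# Toy example: a higher score means "more likely adversarial".
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
print(detection_auc(toy_scores, toy_labels))
```

Because the measure depends only on the ranking of the scores, it does not require choosing a detection threshold first.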
Principles for AED
- All binary classification methods can be applied to AED.
- All anomaly detection methods can be applied to AED.
- Use as much information as you can:
  - Input statistics, manual features, training data, attention maps, transformations, mixup, denoising, …
  - Activations, deep features, probabilities, logits, gradients, loss landscape, uncertainty, …
- Leverage the unique characteristics of adversarial examples:
  - Twins: extremely close to the clean sample.
  - Strangers: far away in prediction.
- Build detectors based on existing understandings: high-dimensional pockets, local linearity, tilting boundary.
- It is still feature engineering!

Challenges in AED
- The diversity of the adversarial examples used to train the detectors determines the detection performance.
- Detectors are also machine learning models, so they are also vulnerable to adversarial attacks.
- The detectors need to detect both existing and unknown attacks.
- The detectors need to be robust to adaptive attacks.

Existing Methods
- Secondary Classification Methods
- Principal Component Analysis (PCA)
- Distribution Detection Methods
- Prediction Inconsistency
- Reconstruction Inconsistency
- Trapping-Based Detection
Secondary Classification Methods
- Adversarial Retraining: take adversarial examples as a new class.
  Grosse et al. "On the (Statistical) Detection of Adversarial Examples." arXiv:1702.06280.
- Adversarial Classification: clean samples as class 0, adversarial samples as class 1.
  Gong et al. "Adversarial and Clean Data Are Not Twins." arXiv:1704.04960.
- Cascade Classifiers: train a detector for each intermediate layer.
  Metzen, Jan Hendrik, et al. "On Detecting Adversarial Perturbations." arXiv preprint arXiv:1702.04267 (2017).
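The adversarial classification idea (clean samples as class 0, adversarial samples as class 1) can be sketched with any binary classifier. The example below fits a small logistic-regression detector with plain gradient descent. The feature vectors are synthetic stand-ins, and the names `train_detector` and `predict_adversarial` are hypothetical; the cited papers train neural-network detectors on real clean and adversarial data.

```python
import numpy as np

def train_detector(features, labels, lr=0.1, epochs=300):
    """Fit a logistic-regression detector with plain gradient descent."""
    X = np.hstack([features, np.ones((len(features), 1))])  # append a bias column
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))             # P(y=1 | x)
        w -= lr * X.T @ (p - labels) / len(labels)   # gradient of the mean log loss
    return w

def predict_adversarial(w, features):
    X = np.hstack([features, np.ones((len(features), 1))])
    return (X @ w > 0).astype(int)   # 1 = flagged as adversarial

rng = np.random.default_rng(0)
clean_feats = rng.normal(-1.0, 1.0, size=(200, 5))   # hypothetical clean features
adv_feats = rng.normal(1.0, 1.0, size=(200, 5))      # hypothetical adversarial features
X = np.vstack([clean_feats, adv_feats])
y = np.concatenate([np.zeros(200), np.ones(200)])

w = train_detector(X, y)
accuracy = (predict_adversarial(w, X) == y).mean()
print(accuracy)
```

A cascade variant would repeat the same training once per intermediate layer, using that layer's activations as `features`.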
Principal Component Analysis (PCA)
- The last few principal components differentiate adversarial examples from clean samples.
- Figure: blue is a clean sample, yellow is an adversarial example. Note: an artifact caused by the black background.
Hendrycks, Dan, and Kevin Gimpel. "Early Methods for Detecting Adversarial Images." arXiv:1608.00530 (2016); Carlini and Wagner. "Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods." AISec 2017.

Dimensionality Reduction
- Train on PCA-reduced data.
Bhagoji, Arjun Nitin, Daniel Cullina, and Prateek Mittal. "Dimensionality Reduction as a Defense against Evasion Attacks on Machine Learning Classifiers." arXiv:1704.02654 2.1 (2017).
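Training on PCA-reduced data means fitting the principal directions on the training set, keeping only the top k, and feeding the projected inputs to the classifier. A minimal sketch with NumPy follows. The random array stands in for flattened images, and `fit_pca` and `pca_reduce` are hypothetical helper names.

```python
import numpy as np

def fit_pca(data, k):
    """Return the data mean and the top-k principal directions (as rows)."""
    mean = data.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    return mean, vt[:k]

def pca_reduce(data, mean, components):
    """Project data onto the retained principal components."""
    return (data - mean) @ components.T

rng = np.random.default_rng(0)
train_images = rng.random((100, 64))   # stand-in for 100 flattened 8x8 images
mean, components = fit_pca(train_images, k=10)
reduced = pca_reduce(train_images, mean, components)
print(reduced.shape)
```

At test time, inputs are projected with the same `mean` and `components` that were fitted on the training data before they reach the classifier.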
Distribution Detection
- Maximum Mean Discrepancy (MMD): compares two datasets and tests whether they are drawn from the same distribution.
Grosse et al. "On the (Statistical) Detection of Adversarial Examples." arXiv:1702.06280.
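To make the two-dataset comparison concrete, the sketch below computes a biased estimate of the squared MMD with a Gaussian kernel: the mean kernel value within each set minus twice the mean kernel value across the sets. The kernel choice, the bandwidth `sigma`, and the synthetic samples are assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    """Pairwise Gaussian kernel values between the rows of a and b."""
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def mmd_squared(x, y, sigma=1.0):
    """Biased estimate of the squared MMD between samples x and y."""
    return float(gaussian_kernel(x, x, sigma).mean()
                 + gaussian_kernel(y, y, sigma).mean()
                 - 2.0 * gaussian_kernel(x, y, sigma).mean())

rng = np.random.default_rng(0)
clean_a = rng.normal(0.0, 1.0, size=(200, 2))
clean_b = rng.normal(0.0, 1.0, size=(200, 2))
shifted = rng.normal(1.5, 1.0, size=(200, 2))   # stand-in for adversarial data

same = mmd_squared(clean_a, clean_b)
diff = mmd_squared(clean_a, shifted)
print(same, diff)
```

A full two-sample test also needs a significance threshold, for example from a permutation test; that step is omitted here.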
Kernel Density Estimation (KDE)
- Adversarial examples lie in low-density regions of the space.
Feinman, Reuben, et al. "Detecting Adversarial Samples from Artifacts." arXiv preprint arXiv:1703.00410 (2017).
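The low-density idea can be sketched as a score: average a Gaussian kernel between an input's feature vector and the training features of its predicted class, and treat a small value as suspicious. In the sketch below the features are synthetic, and the bandwidth and the name `kernel_density` are assumptions.

```python
import numpy as np

def kernel_density(query, class_features, bandwidth=1.0):
    """Mean Gaussian-kernel similarity between one feature vector and
    the training features of its predicted class."""
    sq_dists = ((class_features - query) ** 2).sum(axis=1)
    return float(np.exp(-sq_dists / bandwidth ** 2).mean())

rng = np.random.default_rng(0)
class_features = rng.normal(0.0, 1.0, size=(500, 8))   # hypothetical deep features
typical = np.zeros(8)        # sits in the dense part of the class
outlier = np.full(8, 3.0)    # sits far from the class samples

kd_typical = kernel_density(typical, class_features)
kd_outlier = kernel_density(outlier, class_features)
print(kd_typical, kd_outlier)
```

In practice the bandwidth has to be tuned per dataset, since a value that is too small or too large makes every input look equally sparse or equally dense.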
Bypassing 10 Detection Methods
Carlini and Wagner. "Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods." AISec 2017.

Local Intrinsic Dimensionality (LID)
- Definition (Local Intrinsic Dimensionality)
- Adversarial examples are in high-dimensional subspaces.
- Adversarial Subspaces and Expansion Dimension:
Ma et al. "Characterizing Adversarial Subspaces Using Local Intrinsic Dimensionality." ICLR 2018.
Estimation of LID
- Hill (MLE) estimator (Hill 1975, Amsaleg et al. 2015).
- Based on Extreme Value Theory: nearest-neighbor distances are extreme events.
- The lower tail of the distance distribution follows a Generalized Pareto Distribution (GPD).
Amsaleg et al. "Estimating Local Intrinsic Dimensionality." KDD 2015.
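The Hill (MLE) estimator is commonly written as $\widehat{\mathrm{LID}}(x) = -\left(\frac{1}{k}\sum_{i=1}^{k}\log\frac{r_i(x)}{r_k(x)}\right)^{-1}$, where $r_i(x)$ is the distance from $x$ to its $i$-th nearest neighbor. The sketch below implements that formula with NumPy. The synthetic point clouds and the choice of `k` are illustrative assumptions; Ma et al. estimate LID within minibatches on network activations.

```python
import numpy as np

def lid_mle(query, reference, k=20):
    """Hill / maximum-likelihood estimate of LID at `query`, using its
    k nearest neighbours in `reference` (the query itself must be excluded)."""
    dists = np.sqrt(((reference - query) ** 2).sum(axis=1))
    r = np.sort(dists)[:k]                      # r_1 <= ... <= r_k
    return float(-1.0 / np.mean(np.log(r / r[-1])))

rng = np.random.default_rng(0)
# Points on a 2-D plane embedded in 5-D versus points filling all 5 dimensions.
plane = np.zeros((2000, 5))
plane[:, :2] = rng.uniform(-1, 1, size=(2000, 2))
cube = rng.uniform(-1, 1, size=(2000, 5))
origin = np.zeros(5)

plane_lid = lid_mle(origin, plane, k=50)
cube_lid = lid_mle(origin, cube, k=50)
print(plane_lid, cube_lid)
```

The plane estimate should come out near 2 and the cube estimate near 5, which illustrates how the estimator reads off the dimension of the region around a point.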
Interpretation of LID for Adversarial Subspaces
- LID directly measures the expansion rate of local distance distributions.
- The expansion of an adversarial subspace is higher than that of the normal data subspace.
- LID assesses the space-filling capability of the subspace, based on the distance distribution of the example to its neighbors.

Figure notes
- The LID of adversarial examples (red) is higher.
- LID at deeper layers is more discriminative.

Experiments & Results (Ma et al., ICLR 2018)

Dataset    Feature  FGM    BIM-a  BIM-b  JSMA   Opt
MNIST      KD       78.12  98.14  98.61  68.77  95.15
MNIST      BU       32.37  91.55  25.46  88.74  71.30
MNIST      LID      96.89  99.60  99.83  92.24  99.24
CIFAR-10   KD       64.92  68.38  98.70  85.77  91.35
CIFAR-10   BU       70.53  81.60  97.32  87.36  91.39
CIFAR-10   LID      82.38  82.51  99.78  95.87  98.94
SVHN       KD       70.39  77.18  99.57  86.46  87.41
SVHN       BU       86.78  84.07  86.93  91.33  87.13
SVHN       LID      97.61  87.55  99.72  95.07  97.60

Experiments & Results: train attack versus test attack

Train  Feature  FGM    BIM-a  BIM-b  JSMA   Opt
FGSM   KD       64.92  69.15  89.71  85.72  91.22
FGSM   BU       70.53  (*)    (*)    86.79  91.27
FGSM   LID      82.38  82.30  91.61  89.93  93.32

(*) The BIM-a and BIM-b values for BU are fused in the source as "81.672.65".

Detectors trained on a simple attack (FGSM) can also detect more complex attacks.

An Improved Detector of LID
/pdf/2212.06776.pdf
Mahalanobis Distance (MD)
Mahalanobis, Prasanta Chandra. "On the Generalized Distance in Statistics." National Institute of Science of India, 1936.
- The MD between two data points $x$ and $y$, given a covariance matrix $\Sigma$: $d_M(x, y) = \sqrt{(x - y)^\top \Sigma^{-1} (x - y)}$.
Lee et al. "A Simple Unified Framework for Detecting Out-of-Distribution Samples and Adversarial Attacks." NeurIPS 2018.
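A Mahalanobis-based detector in the spirit of Lee et al. can be sketched as follows: fit one mean per class and a single shared covariance on feature vectors, then score an input by its negative squared Mahalanobis distance to the closest class mean. The features here are synthetic, and the helper names `fit_class_gaussians` and `mahalanobis_score` are hypothetical; the paper works with deep features from several layers.

```python
import numpy as np

def fit_class_gaussians(features, labels):
    """Per-class means and the inverse of one shared (tied) covariance."""
    classes = np.unique(labels)
    means = np.stack([features[labels == c].mean(axis=0) for c in classes])
    # Subtract each sample's own class mean before pooling the covariance.
    centered = features - means[np.searchsorted(classes, labels)]
    cov = centered.T @ centered / len(features)
    return means, np.linalg.inv(cov)

def mahalanobis_score(x, means, cov_inv):
    """Negative squared Mahalanobis distance to the closest class mean."""
    diffs = means - x
    sq_dists = np.einsum("ci,ij,cj->c", diffs, cov_inv, diffs)
    return float(-sq_dists.min())

rng = np.random.default_rng(0)
feats = np.vstack([rng.normal(0, 1, (300, 4)), rng.normal(5, 1, (300, 4))])
labels = np.repeat([0, 1], 300)
means, cov_inv = fit_class_gaussians(feats, labels)

near = mahalanobis_score(np.zeros(4), means, cov_inv)       # close to class 0
far = mahalanobis_score(np.full(4, 2.5), means, cov_inv)    # between the classes
print(near, far)
```

A lower score means the input is far from every class in feature space, which is the signal used to flag it.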
Bayesian Uncertainty (BU)
Feinman, Reuben, et al. "Detecting Adversarial Samples from Artifacts." arXiv preprint arXiv:1703.00410 (2017).

Feature Squeezing
- Bit depth reduction: squeeze both clean and adversarial examples.
- Reducing input dimensionality improves robustness.
- The prediction inconsistency before and after squeezing can be used to detect adversarial examples.
Xu et al. "Feature Squeezing: Detecting Adversarial Examples in Deep Neural Networks." arXiv:1704.01155 (2017).

Random Transformation
- The prediction of adversarial examples will change after random transformations.
Tian et al. "Detecting Adversarial Examples through Image Transformation." AAAI 2018.

Log-Odds
- Add random noise to the input.
Roth et al. "The Odds Are Odd: A Statistical Test for Detecting Adversarial Examples." ICML 2019.
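The squeezing, random transformation, and random noise ideas above share one recipe: transform the input, run the model again, and measure how much the prediction moves. The sketch below scores an input by the largest L1 change in predicted probabilities over a set of transforms. The linear "model", the noise level, and the helper names are stand-ins chosen for illustration, and this is not the log-odds statistic of Roth et al.

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def reduce_bit_depth(x, bits):
    """Quantise values in [0, 1] to 2**bits levels."""
    levels = 2 ** bits - 1
    return np.round(x * levels) / levels

def inconsistency_score(model, x, transforms):
    """Largest L1 change in predicted probabilities over the transforms."""
    base = model(x)
    return max(float(np.abs(base - model(t(x))).sum()) for t in transforms)

# Hypothetical stand-in classifier: a fixed random linear layer plus softmax.
rng = np.random.default_rng(0)
weights = rng.normal(size=(3, 16))
model = lambda x: softmax(weights @ x)

transforms = [
    lambda x: reduce_bit_depth(x, bits=3),                             # squeezing
    lambda x: np.clip(x + rng.normal(0.0, 0.05, x.shape), 0.0, 1.0),   # random noise
]
x = rng.random(16)
score = inconsistency_score(model, x, transforms)
print(score)   # flag x as adversarial if the score exceeds a chosen threshold
```

The threshold is typically chosen on clean validation data so that only a small fraction of clean inputs is flagged.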
Hu et al. "A New Defense against Adversarial Images: Turning a Weakness into a Strength." NeurIPS 2019.
- Principle 1: the gradients of adversarial examples are more uniform.
- Principle 2: adversarial examples are hard to attack a second time.
- Test criterion 1: random noise does not change the prediction.
- Test criterion 2: attacking again requires a larger perturbation.