Data Poisoning: Attacks and Defenses
姜育刚 (Yu-Gang Jiang), 马兴军 (Xingjun Ma), 吴祖煊 (Zuxuan Wu)

Recap: Week 5 — Adversarial Defense
- Early defense methods
- Early adversarial training methods
- Advanced adversarial training methods
- Remaining challenges and recent progress
- Adversarial attack competition: https://codalab.lisn.upsaclay.fr/competitions/15669?secret_key=77cb8986-d5bd-4009-82f0-7dde2e819ff8
Data Poisoning: Attacks and Defenses — Outline
- A Brief History of Data Poisoning
- Data Poisoning Attacks
- Data Poisoning Defenses
- Poisoning for Data Protection
- Future Research

A Recap of the Attack Taxonomy
- Attack timing: poisoning attack vs. evasion attack
- Attacker's goal: targeted attack vs. untargeted attack
- Attacker's knowledge: black-box, white-box, gray-box
- Universality: individual vs. universal

Data Poisoning Is a Training-Time Attack
- Evasion attack: a test-time attack that changes the input example.
- Poisoning (causative) attack: a training-time attack that changes the classification boundary.
A Brief History of Data Poisoning

A Brief History: The Earliest Work
- Kearns and Li. "Learning in the presence of malicious errors." SIAM Journal on Computing, 1993.

Poisoning Intrusion Detection Systems
- The attack model and the defense model.
Reference: Barreno, Marco, et al. "Can machine learning be secure?" ASIACCS, 2006.

Subvert Your Spam Filter
- Usenet dictionary attack: add legitimate words into spam emails.
- Poisoning just 1% of the training data can subvert a spam filter.
Reference: Nelson, Blaine, et al. "Exploiting machine learning to subvert your spam filter." LEET 8.1 (2008): 9.

The Concept of Poisoning Attack
- A single attack data point caused the classification error to rise from the initial error rates of 2–5% to 15–20%.
Reference: Biggio, Nelson and Laskov. "Poisoning attacks against support vector machines." arXiv:1206.6389 (2012).
Data Poisoning Attacks
Attack Pipeline
- Before model training, the attacker contaminates a small number of training samples (the fewer the better).
- Result: an unusable model, or a model under the attacker's control.
- A poisoning attack is not the same as a backdoor attack; data poisoning is only one way of implementing a backdoor attack.
Reference: Liu, Ximeng, et al. "Privacy and security issues in deep learning: A survey." IEEE Access 9 (2020): 4566-4593.

Attack Types
- Core idea: how to influence the training process.
Label Poisoning
- Incorrect labels break supervised learning!
- Variants: random labels, label flipping, partial label flipping.
- What about self-supervised learning, which does not rely on labels?
- Also known as the label-flipping attack (指鹿为马, "calling a deer a horse"); a minimal sketch follows below.
References: Biggio, Nelson and Laskov. "Poisoning attacks against support vector machines." arXiv:1206.6389 (2012); Zhang and Zhu. "A game-theoretic analysis of label flipping attacks on distributed support vector machines." CISS, 2017.
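To make the simplest variant concrete, here is a minimal sketch of a random label-flipping poisoner (function and parameter names are illustrative, not taken from any of the cited papers):

```python
import numpy as np

def flip_labels(labels, flip_rate=0.1, num_classes=10, seed=0):
    """Label-flipping poisoning (sketch): replace a `flip_rate` fraction of the
    training labels with a different, randomly chosen class."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels).copy()
    n_flip = int(flip_rate * len(labels))
    idx = rng.choice(len(labels), size=n_flip, replace=False)
    offsets = rng.integers(1, num_classes, size=n_flip)  # offsets in [1, num_classes-1]: never the original class
    labels[idx] = (labels[idx] + offsets) % num_classes
    return labels
```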
p-Tampering Attacks (篡改攻击, 暗度陈仓, the "hidden path" attack)
- Setting: a model is updated online; with some probability p (Pr = 0.8 in the slide's illustration) the attacker may tamper with each incoming training example, e.g., injecting manipulated 'dog' samples.
- Such probabilistic tampering induces a bias shift in what the model learns.
References: Mahloujifar and Mahmoody. "Blockwise p-tampering attacks on cryptographic primitives, extractors, and learners." TCC, 2017; Mahloujifar, Mahmoody and Mohammed. "Universal multi-party poisoning attacks." ICML, 2019.
Feature Space Poisoning: the Feature Collision Attack (声东击西, "feint east, strike west")
- A white-box, clean-label data poisoning method: it perturbs the poison image's features ("feature flipping") but does not change its label.
- The poison looks like a 'dog' to a human, but collides with a 'fish' in the model's feature space.
- Pros and cons: requires knowledge of the target model; very strong against transfer learning (fine-tuning on top of a fixed feature extractor); not strong against training from scratch.
- The crafting objective and a minimal sketch follow below.
Reference: Shafahi, Ali, et al. "Poison frogs! Targeted clean-label poisoning attacks on neural networks." NeurIPS 2018.
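The feature-collision objective is roughly p = argmin_p ||f(p) − f(t)||² + β·||p − b||², where f is the feature extractor, t the target image and b the base image. A minimal PyTorch-style sketch (the optimizer and hyper-parameters are my choices; the paper itself uses a forward-backward splitting procedure):

```python
import torch

def feature_collision_poison(feature_extractor, base, target, beta=0.1, lr=0.01, steps=200):
    """Craft one clean-label poison via feature collision (Poison Frogs-style sketch).

    feature_extractor: frozen feature extractor of the known target model
    base:   image from the class the poison should *look* like, shape (1, C, H, W)
    target: test image the attacker wants misclassified at test time
    beta:   weight of the visual-similarity term ||p - base||^2
    """
    base = base.detach()
    with torch.no_grad():
        target_feat = feature_extractor(target)          # fixed feature of the target
    poison = base.clone().requires_grad_(True)
    opt = torch.optim.Adam([poison], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        # collide with the target in feature space while staying close to the base in pixel space
        loss = ((feature_extractor(poison) - target_feat) ** 2).sum() \
               + beta * ((poison - base) ** 2).sum()
        loss.backward()
        opt.step()
        poison.data.clamp_(0, 1)                         # keep a valid image
    return poison.detach()
```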
Convex Polytope Attack (凸多面体攻击, the "surrounded on all sides" attack)
- Goal: improve the transferability of clean-label poisoning to different DNNs.
- Idea: find a set of poison samples whose features enclose the target sample inside a convex polytope (convex hull), rather than colliding with it at a single point.
- The "surrounding" poisons are found with the help of m pre-trained models, under a constraint on the convex-combination weights.

Convex Polytope Attack vs. Feature Collision Attack
- SVM-based illustration: the blue fuzzy ball is the target sample (not present in the training set); the red fuzzy balls are the surrounding poison samples; plain red/blue points are ordinary training samples.
Reference: Zhu, Chen, et al. "Transferable clean-label poisoning attacks on deep neural nets." ICML 2019.
Bi-level Optimization Attack
- Poisoning is a form of "bi-level optimization": only after the poisoning is done and the model has been trained on it can the attacker see its effect.
- It is a max-min problem: the inner minimization updates (trains) the model on the poisoned data; the outer maximization generates stronger poison data against the updated model. A generic formulation is sketched below.
Reference: Mei and Zhu. "Using machine teaching to identify optimal training-set attacks on machine learners." AAAI 2015.
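In symbols, a generic version of this bi-level problem can be written as follows (my notation, not the slides': D_c is the clean training set, D_p the poison set, and L_adv the attacker's objective on the target data):

```latex
\max_{D_p}\ \mathcal{L}_{\mathrm{adv}}\!\left(D_{\mathrm{target}};\ \theta^{*}(D_p)\right)
\quad \text{s.t.} \quad
\theta^{*}(D_p) \;=\; \arg\min_{\theta}\ \mathcal{L}_{\mathrm{train}}\!\left(D_c \cup D_p;\ \theta\right)
```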
MetaPoison
- An advanced bi-level (min-min) optimization attack.
- Clean-label (class labels are not modified) and targeted; performance on the validation set stays unchanged.
- Uses meta-learning to search for effective poison samples.
- Can attack both fine-tuning and end-to-end training, and has successfully attacked a commercial model (the Google Cloud AutoML API).
- K-step optimization strategy: several inner 'look-ahead' training steps, then one outer poison-update step; m models with periodic re-initialization are used to increase exploration (a simplified sketch follows below).
- Figures: the poison-crafting (optimization) stage vs. the effect after poisoning.
- Poisoning only 0.1% of the training data already yields a high attack success rate (ASR); example: a 'dog' target is classified as 'bird'.
Reference: Huang, W. Ronny, et al. "MetaPoison: Practical general-purpose clean-label data poisoning." NeurIPS 2020.
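A much-simplified sketch of the K-step look-ahead idea (my own illustration of the mechanism, not the authors' code; it assumes `clean_batches` yields (images, labels) mini-batches and uses plain SGD for the unrolled steps):

```python
import torch
from torch.func import functional_call

def k_step_lookahead_poison_grad(model, loss_fn, poison_x, poison_y,
                                 target_x, target_y_adv, clean_batches,
                                 k=2, inner_lr=0.1):
    """One outer step of a K-step look-ahead, MetaPoison-style (much simplified).

    Unrolls k SGD training steps on (clean + poison) data, evaluates the
    adversarial loss on the target under the unrolled weights, and returns its
    gradient w.r.t. the poison pixels (the meta-gradient)."""
    params = {n: p.clone() for n, p in model.named_parameters()}   # differentiable copy of the weights
    poison_x = poison_x.clone().detach().requires_grad_(True)

    for (xb, yb), _ in zip(clean_batches, range(k)):               # k 'look-ahead' training steps
        x = torch.cat([xb, poison_x]); y = torch.cat([yb, poison_y])
        loss = loss_fn(functional_call(model, params, (x,)), y)
        grads = torch.autograd.grad(loss, tuple(params.values()), create_graph=True)
        params = {n: p - inner_lr * g for (n, p), g in zip(params.items(), grads)}

    # outer objective: the target should be classified as the adversarial label
    adv_loss = loss_fn(functional_call(model, params, (target_x,)), target_y_adv)
    return torch.autograd.grad(adv_loss, poison_x)[0]
```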
Witches' Brew: Idea
- Still a min-min bi-level optimization problem.
- Trick: when crafting the poison samples, make their training gradient match the gradient of the (adversarially labelled) target sample.
- Intuition: if the poisons trigger the same gradients as the target during training, they effectively "stand in" for the target, i.e., the poisons become more like the target. A minimal sketch of the matching objective follows below.

Witches' Brew: Experimental Results
Reference: Geiping, Jonas, et al. "Witches' Brew: Industrial Scale Data Poisoning via Gradient Matching." ICLR 2021.
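A minimal sketch of the gradient-matching objective (my notation; the paper's implementation additionally averages over model restarts and uses differentiable data augmentation). `poison_x` is assumed to have `requires_grad=True` so that the returned value can be minimized over the poison pixels:

```python
import torch
import torch.nn.functional as F

def gradient_matching_loss(model, loss_fn, poison_x, poison_y, target_x, target_y_adv):
    """Witches' Brew-style objective (sketch): align the training gradient of the
    poison batch with the gradient of the adversarially labelled target by
    maximizing their cosine similarity (here averaged per parameter tensor)."""
    params = [p for p in model.parameters() if p.requires_grad]

    target_loss = loss_fn(model(target_x), target_y_adv)
    target_grad = torch.autograd.grad(target_loss, params)            # fixed matching direction

    poison_loss = loss_fn(model(poison_x), poison_y)
    poison_grad = torch.autograd.grad(poison_loss, params, create_graph=True)

    sims = [F.cosine_similarity(pg.flatten(), tg.flatten(), dim=0)
            for pg, tg in zip(poison_grad, target_grad)]
    return 1.0 - torch.stack(sims).mean()   # 0 when the gradients are perfectly aligned
```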
Generative Attack (生成式攻击)
- Generative adversarial networks (GANs): train once, generate poison without limit. (Figure source: /machine-learning/gan/gan_structure)
- Autoencoder-based generative attack: compare poisons obtained (a) from a normal '5' using direct gradients, (b) from random noise using direct gradients, and (c) from a normal '5' using the generative method.
- pGAN involves three models: a discriminator D, a generator G and a classifier C. The adversarial loss is the same as in a standard GAN; the classification loss combines the loss on the original data and on the generated (poison) data.
- pGAN illustration: green — normal samples of the normal (target) class; blue — normal samples of the normal class; red — poisoned samples of the poisoned class. pGAN can generate poison samples that lie genuinely close to the target class.
Reference: Yang, Chaofei, et al. "Generative poisoning attack method against neural networks." arXiv:1703.01340 (2017).

Differentiated Attack (差异化攻击): which samples are most effective to poison?
- Use an influence measure to score training samples and poison the most influential ones; the standard influence-function form this builds on is recalled below.
Reference: Koh et al. "Stronger data poisoning attacks break data sanitization defenses." Machine Learning 111.1 (2022): 1-47.
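For reference, this line of attacks builds on the classical influence-function approximation (Koh and Liang, "Understanding black-box predictions via influence functions," ICML 2017; recalled here, not taken from the slides), which estimates how up-weighting a training point z changes the loss at a test point z_test:

```latex
\mathcal{I}(z, z_{\mathrm{test}}) \;=\; -\,\nabla_{\theta} L\!\left(z_{\mathrm{test}}, \hat{\theta}\right)^{\top} H_{\hat{\theta}}^{-1}\, \nabla_{\theta} L\!\left(z, \hat{\theta}\right),
\qquad
H_{\hat{\theta}} \;=\; \frac{1}{n}\sum_{i=1}^{n} \nabla^{2}_{\theta} L\!\left(z_i, \hat{\theta}\right)
```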
Data Poisoning Defenses
Data Poisoning Defense: Robust Learning with a Trimmed Loss
- Low-loss samples are (likely) good samples; high-loss samples are (likely) bad samples, so let the model train, as far as possible, on the low-loss samples only.
- Problematic samples covered: noisy labels, systematic noise, samples produced by generative models, as well as other bad data and backdoor samples.
- Trimmed-loss training is itself a min-min problem: the inner minimization selects the subset S of samples with the lowest loss; the outer minimization trains the model on S. A sketch of one epoch follows below.
Reference: Shen, Yanyao, and Sujay Sanghavi. "Learning with bad training data via iterative trimmed loss minimization." ICML 2019.
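A minimal sketch of one training epoch under the trimmed-loss idea (the keep ratio, data loading and update details are illustrative choices, not the paper's exact algorithm):

```python
import torch

def trimmed_loss_epoch(model, optimizer, dataset, keep_ratio=0.8, batch_size=128):
    """One epoch of iterative trimmed-loss training (simplified sketch).

    Inner step: rank all samples by their current loss and keep the `keep_ratio`
    fraction with the lowest loss. Outer step: train only on that subset."""
    loader = torch.utils.data.DataLoader(dataset, batch_size=batch_size, shuffle=False)
    criterion = torch.nn.CrossEntropyLoss(reduction="none")

    # inner minimization: per-sample losses under the current model
    model.eval()
    losses = []
    with torch.no_grad():
        for x, y in loader:
            losses.append(criterion(model(x), y))
    losses = torch.cat(losses)
    keep = torch.argsort(losses)[: int(keep_ratio * len(losses))]   # low-loss subset S

    # outer minimization: ordinary training on the selected subset
    model.train()
    subset_loader = torch.utils.data.DataLoader(
        torch.utils.data.Subset(dataset, keep.tolist()), batch_size=batch_size, shuffle=True)
    for x, y in subset_loader:
        optimizer.zero_grad()
        criterion(model(x), y).mean().backward()
        optimizer.step()
```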
Deep Partition Aggregation (DPA, 深度划分聚合)
- Divide and conquer; suited to the case where the number of poisoned samples is small.
- Partition the training set into k disjoint subsets, train one base classifier per subset, and predict by majority voting over the k base classifiers; the vote margin yields a provable robustness guarantee. A sketch follows below.
Reference: Levine, Alexander, and Soheil Feizi. "Deep Partition Aggregation: Provable defense against general poisoning attacks." ICLR 2021.
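A minimal sketch of the partition-and-vote mechanism (the hash-based partition assignment and the margin bookkeeping are illustrative; the exact certified-radius computation is given in the paper):

```python
import hashlib
from collections import Counter
import torch

def partition_index(sample_id: str, k: int) -> int:
    """Deterministically assign a training sample to one of k partitions;
    a hash is used so that a single poisoned sample can only affect one partition."""
    return int(hashlib.sha256(sample_id.encode()).hexdigest(), 16) % k

def dpa_predict(base_models, x):
    """Majority vote of the k base classifiers (one per partition) on input x.
    The vote margin between the top two classes controls how many poisoned
    training samples the prediction can provably tolerate (see the paper)."""
    votes = Counter()
    with torch.no_grad():
        for m in base_models:
            votes[int(m(x.unsqueeze(0)).argmax(dim=1))] += 1
    (pred, top), *rest = votes.most_common(2)
    runner_up = rest[0][1] if rest else 0
    return pred, top - runner_up   # predicted class and its vote margin
```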
Anti-Backdoor Learning (ABL, 反后门学习)
- Observation: samples that are learned fast are not necessarily good samples. Comparing the training loss on clean samples (blue) vs. poisoned examples (yellow) across 10 poisoning-based backdoor attacks shows that poisoned samples are already fitted in the very early stage of training; their loss drops much faster.
- Strategy: first isolate the suspicious low-loss samples, then unlearn them.
- Problem formulation and overview of ABL; LGA: local gradient ascent, GGA: global gradient ascent. A simplified sketch of the two stages follows below.
Reference: Li, Yige, et al. "Anti-backdoor learning: Training clean models on poisoned data." NeurIPS 2021.
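A simplified sketch of the two ABL stages written as loss functions (the exact loss forms, the threshold γ and the isolation schedule are simplified from the paper, so treat this as an illustration rather than the authors' implementation):

```python
import torch

def lga_loss(logits, targets, gamma=0.5):
    """Isolation-stage loss (LGA-style, simplified): hold each sample's loss
    around the threshold gamma. Samples whose loss falls below gamma receive a
    flipped (ascending) gradient, which keeps easy-to-fit backdoor samples from
    being learned too quickly and makes them easy to isolate later."""
    per_sample = torch.nn.functional.cross_entropy(logits, targets, reduction="none")
    return (torch.sign(per_sample - gamma) * per_sample).mean()

def unlearning_loss(logits_clean, y_clean, logits_isolated, y_isolated):
    """Unlearning-stage loss (simplified): keep minimizing the loss on the
    retained (presumed clean) data while maximizing it on the isolated,
    presumed-backdoor data (gradient ascent on that subset)."""
    ce = torch.nn.functional.cross_entropy
    return ce(logits_clean, y_clean) - ce(logits_isolated, y_isolated)
```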
Poisoning for Data Protection
Unlearnable Examples
- The Internet is awash with personal data (see https://exposing.ai/megaface/).

Personal Data Are Used for Training Commercial Models
- Datasets are collected from the Internet without the data owners' awareness [1], used to train commercial models [2], and raise serious privacy concerns [3].
[1] Prabhu and Birhane. "Large image datasets: A pyrrhic win for computer vision?" arXiv:2006.16923, 2020.
[2] Hill, Kashmir. "The Secretive Company That Might End Privacy as We Know It." The New York Times, 2020.
[3] Shan, Shawn, et al. "Fawkes: Protecting personal privacy against unauthorized deep learning models." USENIX Security Symposium, 2020.
Unlearnable Examples: Goal
- Make data unlearnable (unusable) for machine learning: modify the training images so that they become useless for training.
- Background: adversarial noise is small and imperceptible to human eyes (Szegedy et al. 2013; Goodfellow et al. 2014). Adversarial noise is error-maximizing noise: adversarial examples fool DNNs at test time by maximizing the error, i.e., adversarial noise can mislead ML models.
- The flip side: what if there is NO error left to learn?
- Generating error-minimizing noise: since the goal is to influence model training, this is necessarily a bi-level optimization problem (a rough sketch follows below).
Reference: Huang, Hanxun, et al. "Unlearnable examples: Making personal data unexploitable." ICLR 2021.
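A rough sketch of error-minimizing noise generation as a min-min problem (the alternation schedule, step sizes and ε-budget below are illustrative; the paper's exact algorithm differs in its scheduling and stopping criterion). It assumes `loader` is a non-shuffled DataLoader over (images, labels) and `optimizer` optimizes the surrogate model's parameters:

```python
import torch

def error_minimizing_noise(model, optimizer, loader, epsilon=8/255,
                           noise_steps=20, noise_lr=1/255, outer_rounds=10):
    """Min-min sketch: alternately (a) optimize a per-sample noise delta that
    MINIMIZES the training loss under the current model, and (b) train the
    model on the perturbed data. A non-shuffled loader is assumed so the batch
    index identifies the samples across rounds."""
    criterion = torch.nn.CrossEntropyLoss()
    deltas = {}                                   # sample-wise noise, indexed by batch position
    for _ in range(outer_rounds):
        for idx, (x, y) in enumerate(loader):
            delta = deltas.get(idx, torch.zeros_like(x)).requires_grad_(True)
            # inner min over the noise: make the examples look "already learned" (loss-minimizing PGD)
            for _ in range(noise_steps):
                loss = criterion(model(torch.clamp(x + delta, 0, 1)), y)
                grad, = torch.autograd.grad(loss, delta)
                delta = (delta - noise_lr * grad.sign()).clamp(-epsilon, epsilon) \
                        .detach().requires_grad_(True)
            deltas[idx] = delta.detach()
            # outer min over the model: ordinary training step on the perturbed data
            optimizer.zero_grad()
            criterion(model(torch.clamp(x + deltas[idx], 0, 1)), y).backward()
            optimizer.step()
    return deltas
```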
- Sample-wise noise (x_i + δ_i): every sample has its own noise.
- Class-wise noise (x + δ_y): all samples of a class share one noise pattern. Is there a regularity here? Why can a single noise image remove the "errors to learn" for an entire class?
- Comparing the effect of different noises on training: error-minimizing noise creates unlearnable examples in both settings.
Experiments
- Is the noise transferable to other models? Yes.
- Is the noise transferable to other datasets? Yes.
- Is the noise robust to data augmentation? Yes.
- What percentage of the data needs to be unlearnable? Unfortunately, 100% of the training data needs to be poisoned.
- What about protecting only part of the data, or just one class? The unlearnable examples simply do not contribute to model training.
- Protecting face images: no more facial recognition, if everyone posts unlearnable images?

Conclusion & Limitations
- A new and exciting research problem: unlearnable examples via error-minimizing noise.
- Limitations: against representation learning, and against adversarial training (the latter has since been addressed by an ICLR 2022 work: Robust Unlearnable Examples).

Related works:
- Cherepanova et al. "LowKey: Leveraging Adversarial Attacks to Protect Social Media Users from Facial Recognition." ICLR, 2021.
- Fowl et al. "Adversarial Examples Make Strong Poisons." NeurIPS 2021.
- Fowl et al. "Preventing unauthorized use of proprietary data: Poisoning for secure dataset release." arXiv:2103.02683.
- Radiya-Dixit and Tramèr. "Data Poisoning Won't Save You From Facial Recognition." arXiv:2106.14851.
- Shan et al. "Fawkes: Protecting privacy against unauthorized deep learning models." USENIX Security, 2020.
Unlearnable Clusters: Protection Against Unknown Labelling
Existing approaches:
- Error-minimizing noise drives the model's training loss towards 0, so the model "feels there is nothing left to learn" [1].
- Adversarial Poisoning exploits the notion of non-robust features [2] to make the model learn the wrong non-robust features [3].
[1] Unlearnable examples: Making personal data unexploitable. ICLR 2021.
[2] Adversarial Examples Are Not Bugs, They Are Features. NeurIPS 2019.
[3] Adversarial examples make strong poisons. NeurIPS 2021.

Universal Adversarial Perturbation (UAP)
- A UAP is a single perturbation that can fool the model once it is applied to (almost) any image [1]. It can both "override" the original semantic features of an image and work "independently" of them [2].
[1] Universal adversarial perturbations. CVPR 2017.
[2] ImageNet Pre-training Also Transfers Non-robustness. AAAI 2023.

Unlearnable Clusters (UCs)
- Goal: break both uniformity and discrepancy of the learned representations without relying on label information (i.e., without a classification layer).
- Methodology: k-means initialization, then disrupting discrepancy and uniformity.
- Experiments, e.g., on the 37-class Pets dataset.
Transferable Unlearnable Examples

Linear Separability (Availability Attacks Create Shortcuts; Yu, Da et al., KDD 2022)
- t-SNE visualization of class-wise perturbations (EMN, error-minimizing noise) in input space: each (32 x 32 x 3) perturbation is reshaped into a 3072-dimensional vector before visualization.
Quantifying Linear Separability (Availability Attacks Create Shortcuts; Yu, Da et al., KDD 2022)
- Input x: the perturbation itself; label y: the label of the corresponding perturbed image.
- Model: a linear model and a two-layer neural network, trained on (x, y) to test how separable the perturbations are. A small sketch follows below.
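A small sketch of this measurement for the linear case (shapes and hyper-parameters are illustrative): train a classifier on the perturbations alone and report its accuracy; high accuracy indicates the perturbations themselves are (almost) linearly separable shortcut features.

```python
import torch
import torch.nn as nn

def perturbation_separability(perturbations, labels, epochs=50, lr=0.1):
    """Train a linear classifier on the perturbations ALONE (not the images)
    and return its training accuracy as a separability score."""
    x = perturbations.flatten(1)                      # (N, 32*32*3) -> (N, 3072)
    n_classes = int(labels.max()) + 1
    clf = nn.Linear(x.shape[1], n_classes)
    opt = torch.optim.SGD(clf.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        nn.functional.cross_entropy(clf(x), labels).backward()
        opt.step()
    return (clf(x).argmax(dim=1) == labels).float().mean().item()
```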
Linearly Separable Perturbations (Availability Attacks Create Shortcuts; Yu, Da et al., KDD 2022)
- Goal: show that simple linear separability is already enough for poisoning.
- Synthetic noise (SN): hand-crafted, linearly separable class-wise patterns (the figure shows two 'horse' examples and one 'cat' example with the synthetic noise applied).