版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领
文档简介
Backdoor
Attacks
and
Defenses姜育刚,马兴军,吴祖煊Recap:
week
6
1.
Data
Poisoning
Atttacks
and
DefensesData
Poisoning
AttacksData
Poisoning
DefensesPoisoning
for
Data
ProtectionFuture
ResearchBackdoor
Attacks
and
DefensesA
Brief
History
of
Backdoor
LearningBackdoor
AttacksBackdoor
DefensesFuture
ResearchBackdoorvs
(Pure)
PoisoningPoisoningattackTrainingtimeattackChangeclassificationboundaryBackdoorattackTrainingtimeattackDoes
not
change
the
original
boundaryAdd
new
boundary后门攻击–动机模型训练大量互联网数据可能存在后门样本后门模型But,后门攻击!=投毒攻击这是两个不同的话题数据投毒是后门攻击的一种实现方式后门攻击-流程步骤1:后门注入;步骤2:后门激活后门攻击的特点:模型在干净数据上性能不变触发器出现即预测后门类别后门攻击-例子数字触发器物理世界攻击Gu
et
al."Badnets:Identifyingvulnerabilitiesinthemachinelearningmodelsupplychain."
arXiv:1708.06733
(2017).后门攻击-方法分类脏标签攻击:
添加触发器并修改类别BadNets(Guetal.,2019)Trojanattack(Liuetal.,2018)Blendattack(Chenetal.,2017)净标签攻击:
只添加触发器Clean-labelattack
(CL)(Turneretal.,2019)Sinusoidalsignalattack
(SIG)(Barnietal.,2019)Reflectionbackdoor
(Refool)(Liuetal.,2020,
ECCV)Video
backdoor
(Zhao
et
al.,
2020,
CVPR)Li,Yiming,etal.“Backdoorlearning:Asurvey.”
IEEETNNLS,
2022.后门攻击-优化目标隐蔽性尽量少的毒化样本尽量隐蔽的触发器尽量小的影响模型在干净样本上的性能成功率尽量高的攻击成功率可完成多目标攻击迁移性迁移到不同的训练方法迁移到不同的模型鲁棒性可躲避后门检测防御可躲避后门移除防御后门攻击六种经典攻击所使用的触发器样式攻击成功率A
Brief
History:
The
Early
WorkGu,Tianyu,BrendanDolan-Gavitt,andSiddharthGarg."Badnets:Identifyingvulnerabilitiesinthemachinelearningmodelsupplychain."
arXiv:1708.06733
(2017).正常模型攻击者想要安插额外功能在不改变原网络的情况下安插结果Trojan攻击Liu,Yingqi,etal."Trojaningattackonneuralnetworks."(2017).Idea:诱导模型增强局部链接:Step1:寻找最大化激活某个神经元的patternStep2:逆向生成最大化某个类别的训练数据Step3:逆向数据+pattern
->重新训练模型Blend攻击Chenetal."Targetedbackdoorattacksondeeplearningsystemsusingdatapoisoning."
arXivpreprintarXiv:1712.05526
(2017).饰品注入饰品融合背景混合背景图像Clean-label攻击Turner,Alexander,DimitrisTsipras,andAleksanderMadry."Clean-labelbackdoorattacks."(2018).添加对抗噪声让后门样本变难操纵潜在编码(latent
code)进行类别转移正弦信号攻击Barni
et
al."Anewbackdoorattackincnnsbytrainingsetcorruptionwithoutlabelpoisoning."
ICIP,2019.特点:不需要类别标签需要添加很明显的条纹攻击成功率并不是很高反光攻击Liu,Yunfei,etal."Reflectionbackdoor:Anaturalbackdoorattackondeepneuralnetworks."
ECCV,2020.反光光学原理特点:无目标攻击(出现反光就分错)向图像中添加背景反光需要提前设计好反光效果攻击成功率不是很高视频攻击Zhao,Shihao,etal."Clean-labelbackdoorattacksonvideorecognitionmodels."
CVPR,2020.优化触发器指向目标类扰动投毒样本抹除自然信息特点:净标签攻击解决高维输入上的后门攻击挑战解决多类别间干扰、高分辨率干扰物理攻击Wenger,Emily,etal.“Backdoorattacksagainstdeeplearningsystemsinthephysicalworld.”
CVPR,2021.物理攻击:触发器的大小、位置、空间很关键隐藏触发器攻击Saha,Aniruddha,AkshayvarunSubramanya,andHamedPirsiavash.“Hiddentriggerbackdoorattacks.”
AAAI,
2020.t:
targetz:
poisoneds:
sourcetsz特征一致视觉一致特征一致视觉一致样本级别触发器Li,Yuezun,etal.“Invisiblebackdoorattackwithsample-specifictriggers.”
ICCV,2021.嵌入类别相关的隐编码每个样本的触发器都不同保证隐码既能融入也能分离动态攻击Nguyen,TuanAnh,andAnhTran.“Input-awaredynamicbackdoorattack.”
NeurIPS,
2020.使用生成器为每个后门样本生成一个特定的触发器攻击物体追踪模型Li,Yiming,etal."Few-ShotBackdoorAttacksonVisualObjectTracking."
ICLR.2022.Visualobjecttracking(VOT):基于第一针里物体信息,预测其在后续针的出现位置Few-ShotBackdoorAttack(FSBA)Li,Yiming,etal."Few-ShotBackdoorAttacksonVisualObjectTracking."
ICLR.2022.同时攻击模板和搜索区域:添加后门噪声的样本在特征空间远离干净样本Experiments:
Physical-World
Attack追踪iPad追踪PersonLi,Yiming,etal."Few-ShotBackdoorAttacksonVisualObjectTracking."
ICLR.2022.Understanding
FSBA特征越不一样,攻击越有效Li,Yiming,etal."Few-ShotBackdoorAttacksonVisualObjectTracking."
ICLR.2022.Understanding
FSBA攻击Template和search
region都能让关注区域消失Li,Yiming,etal."Few-ShotBackdoorAttacksonVisualObjectTracking."
ICLR.2022.攻击人群计数模型Sun,Yuhua,etal."BackdoorAttacksonCrowdCounting."
ACMMultimedia.2022.人群计数(Crowd
Counting):计算一张图片里的人头数是一个回归问题现有计数方法Sun,Yuhua,etal.“BackdoorAttacksonCrowdCounting.”
ACMMultimedia,2022.基于检测的:密度图回归:密度回归GeneratethepseudodensitymapfrompointannotationsbyGaussiankernels.Useper-pixelsupervision(L2loss)totrainthemodelGroundTruthSupervisionPredictedDensityMapPredictedDensityMapGaussiankernelsSun,Yuhua,etal.“BackdoorAttacksonCrowdCounting.”
ACMMultimedia,2022.密度图篡改后门攻击+95.5725DMBA
-α=0.2
494.6700Trigger×0.3CleanImage×0.7
DensityMapPoisonedImageAlteredDensityMap+989.3400DensityMapDMBA+α=2.0
Trigger×0.3AlteredDensityMapCleanImage×0.7
PoisonedImage494.6700主要想法:通过添加触发器和修改密度图来安插后门通过触发器操控模型生成任意大小的密度图预测关键在于如何精准控制计数改变的大小Sun,Yuhua,etal.“BackdoorAttacksonCrowdCounting.”
ACMMultimedia,2022.DMBA-和DMBA+攻击494.6700142.3633361.3056CleanImage×0.7
+95.5725DensityMapDMBA
-α=0.2
494.6700Trigger×0.3PoisonedImage+989.3400DensityMapDMBA+α=2.0
Trigger×0.3AlteredDensityMapCleanImage×0.7
PoisonedImage494.6700ModelTrainingPredictionHeadCountCBackdooredModelPredictionHeadCountCCleanTestImageGroundTruth=257.08725PoisonedTestImageGroundTruth=48.454586CleanDensityEstimationBackdoorDensityEstimationPrediction=275.49704Prediction=103.62213AlteredDensityMapSun,Yuhua,etal.“BackdoorAttacksonCrowdCounting.”
ACMMultimedia,2022.实验结果传统攻击无法精准操纵最终结果提出攻击Sun,Yuhua,etal.“BackdoorAttacksonCrowdCounting.”
ACMMultimedia,2022.实验结果Impactoftriggersize:Multipletriggerpatterns:Sun,Yuhua,etal.“BackdoorAttacksonCrowdCounting.”
ACMMultimedia,2022.后门防御后门检测:检测后门样本或后门模型ActivationCluster(Chenetal.2018)Spectral
Signature
(Tran
et
al.
2018)STRIP(Gaoetal.2019)NeuralCleanse(Wangetal.2019)后门移除:从后门模型中移除后门触发器Fine-tuning(Liuetal.2018)、Fine-pruning(Liuetal.2018)ModeConnectivityRepair(MCR)(Zhaoetal.2020)Neural
Attention
Distillation
(Li
et
al.
2021)Adversarial
Neuron
Pruning
(ANP)
(Wu
et
al.
2021)Anti-Backdoor
Learning
(ABL)
(Li
et
al.
2021)ActivationCluster(AC)Chen,Bryant,etal."Detectingbackdoorattacksondeepneuralnetworksbyactivationclustering."
arXiv:1811.03728
(2018).假设:后门数据聚类可分神经网最后一层输出ICA降维,去前三个主成分聚类分析Spectral
Signature
(SS)Tran,Brandon,JerryLi,andAleksanderMadry.“Spectralsignaturesinbackdoorattacks.”
NeurIPS,
2018.后门样本的SVD降维输入空间深度特征空间根据SVD降维得到的协方差矩阵的top奇异向量进行检测绿色:后门数据STRIPGao,Yansong,etal."Strip:Adefenceagainsttrojanattacksondeepneuralnetworks."
ACSAC.2019.后门样本具有输入无关性扰动后预测不变的样本是后门样本Neural
Cleanse
(NC)Wang,Bolun,etal."Neuralcleanse:Identifyingandmitigatingbackdoorattacksinneuralnetworks."
S&P,2019.后门模型中其他类到后门类的距离更近通过逐个类别寻找最近距离可以确定模型是否有后门以及后门类别后门防御后门检测:检测后门样本或后门模型ActivationCluster(Chenetal.2018)Spectral
Signature
(Tran
et
al.
2018)STRIP(Gaoetal.2019)NeuralCleanse(Wangetal.2019)后门移除:从后门模型中移除后门触发器Fine-tuning(Liuetal.2018)、Fine-pruning(Liuetal.2018)ModeConnectivityRepair(MCR)(Zhaoetal.2020)Neural
Attention
Distillation
(Li
et
al.
2021)Adversarial
Neuron
Pruning
(ANP)
(Wu
et
al.
2021)Anti-Backdoor
Learning
(ABL)
(Li
et
al.
2021)后门移除-优化目标高效性尽量少的使用干净数据尽量快的移除所有后门代价小不影响模型正常性能移除后不需要重训练通用性能同时移除不同类型的后门能保护不同类型的模型、数据集等鲁棒性对Adaptive
Attack鲁棒对不同设置鲁棒后门移除–实验设置假设后门模型已经确定大部分方法需要少量干净数据大部分方法需要在完成后对模型再次微调多少都会影响一点模型性能Fine-pruning
(FP)Liu,Kang
et
al."Fine-Pruning:DefendingAgainstBackdooringAttacksonDeepNeuralNetworks."
arXiv:1805.12185
(2018).步骤1:参见冗余神经元;步骤2:微调以恢复性能=裁剪+微调裁剪在干净数据上休眠的神经元ModeConnectivityRepair(MCR)Zhao,Pu,etal.“Bridgingmodeconnectivityinlosslandscapesandadversarialrobustness.”
ICLR,
2020.
模型mixup后门模型少量干净数据训练的模型在两个模型之间寻找一个更好的过渡Neural
Attention
Distillation
(NAD)使用知识蒸馏移除后门需要一小部分干净数据先微调用微调后的模型作为教师对中间层通道平均激活进行对齐蒸馏Li,Yige,etal.“NeuralAttentionDistillation:ErasingBackdoorTriggersfromDeepNeuralNetworks.”
ICLR,2020.Neural
Attention
Distillation
(NAD)DefinitionofattentionfunctionsDefinitionofattentiondistillationlossOveralltraininglossNeural
Attention
Distillation
(NAD)PerformanceofNADThe
mosteffectiveagainst
all
6backdoorattacks.Minimum
impact
on
clean
accuracy.Reducing
ASR
by
91.82%
on
average,
by
sacrificing
only
2.66%
accuracy.Neural
Attention
Distillation
(NAD)Performanceunderdifferent%ofcleandata:5%
looks
good,
10%
is
sufficient,
the
more
the
better!(a)Attacksuccessrate(b)CleanaccuracyNeural
Attention
Distillation
(NAD)Effectiveness
of
different
teacher-studentcombinations:(a)Attacksuccessrate(b)CleanaccuracyFine-tuned
teacher
(B-F)
is
better,
on
average.Teacher
trained
on
clean
subset
is
also
goodbut
sacrifices
more
clean
accuracy.
Neural
Attention
Distillation
(NAD)Effectsofdifferent
attentionfunctions:Note:Theredboxeshighlightour
recommendedattention
function.AbackdooredimageAnti-Backdoor
Learning
(ABL)反后门学习:如何在毒化数据上训练一个干净无后门的模型?Li,Yige,etal.“Anti-backdoorlearning:Trainingcleanmodelsonpoisoneddata.”
NeurIPS,
2021.找到后门投毒样本的共性在训练阶段检测并抛弃有毒样本Anti-Backdoor
Learning
(ABL)Trainingloss
onCleanexamples
(blue)VS.Backdooredexamples
(yellow)Weaknesses
of
backdoor
attacks:1.Thebackdoortaskismucheasierthanthecleantask.
(Weakness1)2.Abackdoorattackenforces
an
explicitcorrelationbetweenthetriggerandthetargetclasstosimplifyandacceleratetheinjectionofthebackdoortrigger.(Weakness2)Anti-Backdoor
Learning
(ABL)反后门学习:如何在毒化数据上训练一个干净无后门的模型?后门样本学的更快步骤1:局部梯度上升:控制Loss不要太高步骤2:后门样本检测:loss低的是后门样本步骤3:全局梯度上升:在后门样本上做反学习ABL学习机制Anti-Backdoor
Learning
(ABL)ProblemFormulation
OverviewofABLLGA:
local
gradient
ascent;
GGA:
global
gradient
ascentAnti-Backdoor
Learning
(ABL)PerformanceofABL:Conclusions:The
mosteffectivedefense
against
all
10backdoorattacks;Minimum
impact
on
clean
accuracy.Anti-Backdoor
Learning
(ABL)PerformanceofourABLwithdifferentisolationratesonCIFAR-10dataset:1%
isolation
achieves
a
goodtrade-offbetweenASRandCA!Anti-Backdoor
Learning
(ABL)PerformanceofourABLwithdifferentγonCIFAR-10againstBadNets:Thelargerγ,thebetterseparationeffect!(同色的两条线相隔越远)Anti-Backdoor
Learning
(ABL)StresstestingofourABLonCIFAR-10:PerformanceofourABLunderdifferentturningepochsonCIFAR-10:ABLwithonly1%isolationremainseffectiveagainstupto1)70%BadNets;and2)50%Trojan,Blend,andDynamic.Epoch20achievesthebestoverallresults.Anti-Backdoor
Learning
(ABL)PerformanceofvariousunlearningmethodsagainstBadNetsattackonCIFAR-10:ABLachievesthebestunlearningperformanceofASR3.04%andCA86.11%,followed
by
(discard
isolated
data
then)
Re-trainingfromscratch!Adversarial
Neuron
Pruning
(ANP)Wu
andWang.“Adversarialneuronpruningpurifiesbackdooreddeepmodels.”
NeurIPS,
2021.后门模型对对抗扰动更敏感从后门模型中发现并移除敏感神经元
对抗扰动提出重构式神经元剪枝
温馨提示
- 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
- 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
- 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
- 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
- 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
- 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
- 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。
最新文档
- 2024年高考湖北卷(物理)科目(真题卷+解析版)
- 某某轴承制造有限公司年产某某万套轴承项目可行性分析报告
- 职业技术学院工商企业管理专业人才培养方案
- 统编版四年级上册语文第五单元 习作生活万花筒 习作指导
- 2024个人住房保证担保借款合同(合同范本)
- 爱国卫生课件小班
- 2024公积金贷款后手里有购房合同吗
- 老师您好课件大班
- 湘教版九年级数学 3.1 比例线段(学习、上课课件)
- 冀教版九年级数学 24.2 解一元二次方程(学习、上课课件)
- 云计算在风险管理中的应用
- 2024年新统编版七年级历史上册全册教学课件
- 电子政务概论-形考任务5(在线测试权重20%)-国开-参考资料
- 2024年中国免疫球蛋白市场调查研究报告
- 物流行业:物流风险管理与应急预案制定方案
- 古代小说戏曲专题-形考任务2-国开-参考资料
- 北京市定向选调真题
- 中国企业投资缅甸光伏发电市场机会分析及战略规划报告2024-2030年
- GMS基础知识(第一版)
- 2024浙江丽水松阳县人大常委会办公室选调工作人员历年(高频重点复习提升训练)共500题附带答案详解
- 医生多点执业管理制度
评论
0/150
提交评论