
Data Poisoning: Attacks and Defenses
姜育刚, 马兴军, 吴祖煊

Recap of Week 5: Adversarial Defense
- Early Defense Methods
- Early Adversarial Training Methods
- Advanced Adversarial Training Methods
- Remaining Challenges and Recent Progress
- Adversarial Attack Competition
  Link: https://codalab.lisn.upsaclay.fr/competitions/15669?secret_key=77cb8986-d5bd-4009-82f0-7dde2e819ff8

Data Poisoning: Attacks and Defenses (Outline)
- A Brief History of Data Poisoning
- Data Poisoning Attacks
- Data Poisoning Defenses
- Poisoning for Data Protection
- Future Research

A Recap of the Attack Taxonomy
- Attack timing: poisoning attack vs. evasion attack
- Attacker's goal: targeted attack vs. untargeted attack
- Attacker's knowledge: black-box, white-box, gray-box
- Universality: individual vs. universal

Data Poisoning is a Training-Time Attack
- Evasion attack: a test-time attack that changes the input example.
- Poisoning attack: a training-time attack that changes the classification boundary.

Outline: A Brief History of Data Poisoning · Data Poisoning Attacks · Data Poisoning Defenses · Poisoning for Data Protection · Future Research

A Brief History: The Earliest Work
Kearns and Li. "Learning in the presence of malicious errors." SIAM Journal on Computing, 1993.

Poisoning an Intrusion Detection System: the attack model
Barreno, Marco, et al. "Can machine learning be secure?" ASIACCS, 2006.

Poisoning an Intrusion Detection System: the defense model
Barreno, Marco, et al. "Can machine learning be secure?" ASIACCS, 2006.

Subvert Your Spam Filter
Nelson, Blaine, et al. "Exploiting machine learning to subvert your spam filter." LEET 8.1 (2008): 9.
- Usenet dictionary attack: add legitimate words into spam emails.
- 1% poisoning can subvert a spam filter.

The Concept of Poisoning Attack
Biggio, Nelson and Laskov. "Poisoning attacks against support vector machines." arXiv:1206.6389 (2012).
- A single attack data point caused the classification error to rise from the initial error rate of 2-5% to 15-20%.

Outline: A Brief History of Data Poisoning · Data Poisoning Attacks · Data Poisoning Defenses · Poisoning for Data Protection · Future Research

Attack Pipeline
- During model training, the attacker contaminates a small number of training samples (the fewer the better).
- Result: an unusable model, or a model under the attacker's control.
- Poisoning attack != backdoor attack; data poisoning is one way a backdoor attack can be implemented.

Attack Pipeline
Liu, Ximeng, et al. "Privacy and security issues in deep learning: A survey." IEEE Access 9 (2020): 4566-4593.

Attack Types
- Core idea: how to influence the training process.

Label Poisoning
- Incorrect labels break supervised learning!
- Variants: random labels, label flipping, partial label flipping.
- Open question: does this still work against self-supervised learning?
Biggio, Nelson and Laskov. "Poisoning attacks against support vector machines." arXiv:1206.6389 (2012).
Zhang and Zhu. "A game-theoretic analysis of label flipping attacks on distributed support vector machines." CISS, 2017.

Label Flipping Attack (the "指鹿为马" attack: calling a deer a horse)
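To make label flipping concrete, here is a minimal sketch on a synthetic scikit-learn task; the dataset, the logistic-regression learner, and the flip ratios are illustrative stand-ins, not the setup of the cited papers.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Label-flipping poisoning on a synthetic binary task: flip the labels of a
# fraction of the training samples, retrain, and watch test accuracy drop.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

def accuracy_with_flips(flip_ratio, seed=0):
    rng = np.random.default_rng(seed)
    y_poison = y_tr.copy()
    n_flip = int(flip_ratio * len(y_poison))
    idx = rng.choice(len(y_poison), size=n_flip, replace=False)
    y_poison[idx] = 1 - y_poison[idx]                 # flip the binary labels
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_poison)
    return clf.score(X_te, y_te)

for ratio in [0.0, 0.1, 0.2, 0.4]:
    print(f"flipped {ratio:.0%} of training labels -> test accuracy {accuracy_with_flips(ratio):.3f}")
```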

p-Tampering Attacks (the "暗度陈仓" attack: covert substitution)
Mahloujifar and Mahmoody. "Blockwise p-tampering attacks on cryptographic primitives, extractors, and learners." TCC, 2017.
Mahloujifar, Mahmoody and Mohammed. "Universal multi-party poisoning attacks." ICML, 2019.
- Setting: a model that is updated online; each incoming example may be tampered with some probability (Pr = 0.8 in the slide's illustration).
- Example: repeatedly injected 'dog' samples induce a bias shift in the model.

Feature Space Poisoning
Shafahi, Ali, et al. "Poison frogs! Targeted clean-label poisoning attacks on neural networks." NeurIPS 2018.

Feature Collision Attack (the "声东击西" attack: feint east, strike west)
- A white-box data poisoning method.
- Collides features without changing labels (clean-label).
- The poison looks like a 'dog' in pixel space, but sits next to the 'fish' class in feature space.
- Pros and cons: requires knowledge of the target model; very effective against transfer learning; much weaker against training from scratch.
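A minimal sketch of the feature-collision objective described above, assuming a PyTorch/torchvision feature extractor; the ResNet-18 backbone, the weight beta, and the step count are illustrative choices. The poison is optimized to match the target in feature space while staying close to its base image in pixel space.

```python
import torch
import torchvision.models as models

# Feature collision sketch: minimize ||f(p) - f(t)||^2 + beta * ||p - base||^2.
backbone = models.resnet18(weights=None)   # any fixed feature extractor works
backbone.fc = torch.nn.Identity()          # use the penultimate features f(x)
backbone.eval()

x_target = torch.rand(1, 3, 224, 224)      # e.g. a 'fish' image the attacker targets
x_base = torch.rand(1, 3, 224, 224)        # e.g. a 'dog' image (keeps its clean label)
beta = 0.1

poison = x_base.clone().requires_grad_(True)
opt = torch.optim.Adam([poison], lr=0.01)

with torch.no_grad():
    f_target = backbone(x_target)

for _ in range(200):
    loss = ((backbone(poison) - f_target) ** 2).sum() + beta * ((poison - x_base) ** 2).sum()
    opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():
        poison.clamp_(0, 1)                # keep a valid image
```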

Convex Polytope Attack (the "四面楚歌" attack: besieged on all sides)
Zhu, Chen, et al. "Transferable clean-label poisoning attacks on deep neural nets." ICML 2019.
- Goal: improve the transferability of clean-label poisoning to different DNNs.
- Find a set of poison samples whose convex hull encloses the target sample in feature space.
- Use m pre-trained models to search for these "surrounding" samples, with constraints on the convex-combination weights.

Convex Polytope Attack vs. Feature Collision Attack
Zhu, Chen, et al. "Transferable clean-label poisoning attacks on deep neural nets." ICML 2019.
SVM-based illustration:
- Blue pom-pom: the target sample, which is not in the training set.
- Red pom-poms: the surrounding poison samples.
- Plain red/blue dots: ordinary training samples.
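As a small illustration of the convex-polytope condition, the sketch below (assuming SciPy is available) checks how well a target's feature vector can be written as a convex combination of the poisons' feature vectors. The feature vectors are random placeholders; the real attack optimizes the poison images themselves across m pre-trained networks.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
F_poison = rng.normal(size=(5, 512))      # features of 5 poison samples (placeholder)
f_target = rng.normal(size=512)           # feature of the target sample (placeholder)

def residual(c):
    # Squared error between the target feature and the convex combination.
    return np.sum((F_poison.T @ c - f_target) ** 2)

n = F_poison.shape[0]
res = minimize(
    residual,
    x0=np.full(n, 1.0 / n),
    bounds=[(0.0, 1.0)] * n,                                  # weights in [0, 1]
    constraints=[{"type": "eq", "fun": lambda c: c.sum() - 1.0}],  # weights sum to 1
    method="SLSQP",
)
print("convex-combination weights:", np.round(res.x, 3))
print("reconstruction error:", residual(res.x))
```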

Bi-level Optimization Attack
Mei and Zhu. "Using machine teaching to identify optimal training-set attacks on machine learners." AAAI 2015.
- Poisoning is a "bi-level optimization" problem: the effect of the poison is only known after the model has been trained on it.
- It is a max-min problem:
  - Inner minimization: update the model on the poisoned data.
  - Outer maximization: generate stronger poisoning data against the updated model.
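Written out, the bi-level problem described on this slide has the following schematic form (notation is illustrative: D_c is the clean training set, D_p the poisoned points, L_adv the attacker's objective, L_train the training loss):

```latex
\max_{D_p} \; \mathcal{L}_{\mathrm{adv}}\!\left(\theta^{*}(D_p)\right)
\quad \text{s.t.} \quad
\theta^{*}(D_p) \in \arg\min_{\theta} \; \mathcal{L}_{\mathrm{train}}\!\left(\theta;\, D_c \cup D_p\right)
```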

MetaPoison
Huang, W. Ronny, et al. "MetaPoison: Practical general-purpose clean-label data poisoning." NeurIPS 2020.
One advanced bi-level optimization attack:
- Does not modify labels (clean-label) and is targeted.
- Leaves validation performance unchanged.
- Uses meta-learning to find highly effective poison samples.
- Can attack both fine-tuning and end-to-end training.
- Successfully attacked a commercial model: the Google Cloud AutoML API.

MetaPoison: A Bi-level Min-Min Optimization Attack
Huang, W. Ronny, et al. "MetaPoison: Practical general-purpose clean-label data poisoning." NeurIPS 2020.
- K-step optimization strategy: several inner "look-ahead" steps, one outer step.
- Uses m models and periodic re-initialization to increase exploration.

MetaPoison: optimization stage and effect after poisoning (shown as figures in the original slides)
Huang, W. Ronny, et al. "MetaPoison: Practical general-purpose clean-label data poisoning." NeurIPS 2020.

MetaPoison: results
Huang, W. Ronny, et al. "MetaPoison: Practical general-purpose clean-label data poisoning." NeurIPS 2020.
- Poisoning only 0.1% of the data already achieves a high attack success rate (ASR).
- Example: a dog image poisoned so that it is classified as a bird.
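A minimal PyTorch sketch of the K-step look-ahead idea on a tiny functional linear model; the horizon K, the per-step re-initialization, and all shapes are illustrative simplifications of MetaPoison, not the paper's implementation.

```python
import torch

def model(params, x):                      # functional linear classifier
    W, b = params
    return x @ W + b

def ce(logits, y):
    return torch.nn.functional.cross_entropy(logits, y)

def unroll(params, x_clean, y_clean, x_poison, y_poison, lr=0.1, K=2):
    """Simulate K SGD steps on clean + poisoned data, keeping the graph."""
    for _ in range(K):
        x = torch.cat([x_clean, x_poison]); y = torch.cat([y_clean, y_poison])
        grads = torch.autograd.grad(ce(model(params, x), y), params, create_graph=True)
        params = [p - lr * g for p, g in zip(params, grads)]
    return params

torch.manual_seed(0)
x_clean, y_clean = torch.randn(20, 5), torch.randint(0, 3, (20,))
x_base, y_poison = torch.randn(2, 5), torch.randint(0, 3, (2,))    # clean-label poisons
x_target, y_adv = torch.randn(1, 5), torch.tensor([1])             # target + attacker's label

delta = torch.zeros_like(x_base, requires_grad=True)               # poison perturbation
opt = torch.optim.Adam([delta], lr=0.05)

for step in range(100):
    # Fresh random weights each outer step: crude stand-in for re-initialization.
    params = [torch.randn(5, 3, requires_grad=True), torch.zeros(3, requires_grad=True)]
    looked_ahead = unroll(params, x_clean, y_clean, x_base + delta, y_poison)
    adv_loss = ce(model(looked_ahead, x_target), y_adv)             # outer objective
    opt.zero_grad(); adv_loss.backward(); opt.step()
    with torch.no_grad():
        delta.clamp_(-0.1, 0.1)                                     # stay close to the base images
```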

Witches' Brew: idea
Geiping, Jonas, et al. "Witches' Brew: Industrial Scale Data Poisoning via Gradient Matching." ICLR 2021.
- Still a min-min bi-level optimization problem.
- Trick: when generating the poison samples, make their gradient match the gradient of the target sample.
- Intuition: if the poisons and the target trigger the same gradients during training, the poisons behave like the target, i.e., the poisons become "more like" the target.
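A minimal sketch of the gradient-matching trick, assuming a small PyTorch classifier: the poison perturbation is optimized so that the training gradient of the (correctly labelled) poisons aligns, in cosine similarity, with the gradient of the attacker's target loss. The network, the budget eps, and the step counts are illustrative.

```python
import torch, torch.nn.functional as F

def flat_grad(loss, params):
    grads = torch.autograd.grad(loss, params, create_graph=True)
    return torch.cat([g.reshape(-1) for g in grads])

net = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
params = [p for p in net.parameters() if p.requires_grad]

x_target = torch.randn(1, 3, 32, 32)                    # image the attacker wants misclassified
y_adv = torch.tensor([3])                               # attacker-chosen label
x_base = torch.randn(8, 3, 32, 32)                      # clean-label poison carriers
y_base = torch.full((8,), 3, dtype=torch.long)          # their true labels (class y_adv)

delta = torch.zeros_like(x_base, requires_grad=True)
opt = torch.optim.Adam([delta], lr=0.01)
eps = 16 / 255

target_grad = flat_grad(F.cross_entropy(net(x_target), y_adv), params).detach()

for _ in range(250):
    poison_grad = flat_grad(F.cross_entropy(net(x_base + delta), y_base), params)
    loss = 1 - F.cosine_similarity(poison_grad, target_grad, dim=0)  # match gradient directions
    opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():
        delta.clamp_(-eps, eps)                          # keep the perturbation small
```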

Witches' Brew: experimental results (shown as figures in the original slides)
Geiping, Jonas, et al. "Witches' Brew: Industrial Scale Data Poisoning via Gradient Matching." ICLR 2021.

Generative Attack
- Illustration of the GAN structure: /machine-learning/gan/gan_structure
- Generative adversarial network (GAN): train once, then generate poisons without limit.

Autoencoder-based Generative Attack
Yang, Chaofei, et al. "Generative poisoning attack method against neural networks." arXiv:1703.01340 (2017).
- Start from a normal '5' and use direct gradients.
- Start from random noise and use direct gradients.
- Start from a normal '5' and use the generative method.

pGAN
Yang, Chaofei, et al. "Generative poisoning attack method against neural networks." arXiv:1703.01340 (2017).
- Involves three models: D (discriminator), G (generator), and C (classifier).
- Adversarial loss: the same as in a standard GAN.
- Classification loss: computed on both the original data and the generated data.

pGAN
Yang, Chaofei, et al. "Generative poisoning attack method against neural networks." arXiv:1703.01340 (2017).
- Green: normal samples of the normal (target) class.
- Blue: normal samples of the normal class.
- Red: poisoned samples of the poisoned class.
- The method can generate poisoning samples that lie truly close to the target class.

Differentiated Attack: which samples are more effective to poison?
Koh et al. "Stronger data poisoning attacks break data sanitization defenses." Machine Learning 111.1 (2022): 1-47.
- Use a metric of each sample's influence, and poison the samples with large influence.
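The slide does not pin down which influence metric is used, so the sketch below uses one common choice, the influence-function approximation for logistic regression (I(z, z_test) ≈ -∇L(z_test)ᵀ H⁻¹ ∇L(z)), to rank training points; the dataset and regularization constant are placeholders.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
clf = LogisticRegression(C=1.0, fit_intercept=False, max_iter=1000).fit(X, y)
theta = clf.coef_.ravel()
lam = 1.0 / (clf.C * len(X))                      # rough per-sample L2 strength

p = 1.0 / (1.0 + np.exp(-(X @ theta)))            # predicted P(y = 1 | x)
grads = (p - y)[:, None] * X                      # per-sample gradient of the log-loss
H = (X.T * (p * (1 - p))) @ X / len(X) + lam * np.eye(X.shape[1])   # loss Hessian

x_test, y_test = X[0], y[0]                       # a point whose loss we want to influence
p_test = 1.0 / (1.0 + np.exp(-(x_test @ theta)))
g_test = (p_test - y_test) * x_test

influence = -grads @ np.linalg.solve(H, g_test)   # influence of each train point on the test loss
most_influential = np.argsort(-np.abs(influence))[:10]
print("indices of the 10 most influential training points:", most_influential)
```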

Outline: A Brief History of Data Poisoning · Data Poisoning Attacks · Data Poisoning Defenses · Poisoning for Data Protection · Future Research

Data Poisoning Defense: Robust Learning with Trimmed Loss
Shen, Yanyao, and Sujay Sanghavi. "Learning with bad training data via iterative trimmed loss minimization." ICML 2019.
- Samples with low loss are treated as good samples; samples with high loss as bad samples.
- Train the model, as far as possible, only on the low-loss samples.
- Problem samples covered: noisy labels, systematic noise, bad data from generative models, backdoor samples.

Data Poisoning Defense: Robust Learning with Trimmed Loss
Shen, Yanyao, and Sujay Sanghavi. "Learning with bad training data via iterative trimmed loss minimization." ICML 2019.
This is a min-min problem:
- Inner minimization: select the subset S of samples with low loss.
- Outer minimization: train the model on the subset S.
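A minimal sketch of iterative trimmed-loss training in PyTorch: each epoch, keep only the fraction of samples with the smallest loss (inner minimization) and take a gradient step on that subset (outer minimization). The model, data, and keep ratio are illustrative.

```python
import torch, torch.nn.functional as F

def train_trimmed(model, X, y, keep_ratio=0.8, epochs=20, lr=0.1):
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    k = int(keep_ratio * len(X))
    for _ in range(epochs):
        # Inner minimization: pick the subset S with the k smallest losses.
        with torch.no_grad():
            per_sample = F.cross_entropy(model(X), y, reduction="none")
            keep = torch.topk(-per_sample, k).indices
        # Outer minimization: train on the selected subset S.
        loss = F.cross_entropy(model(X[keep]), y[keep])
        opt.zero_grad(); loss.backward(); opt.step()
    return model

model = torch.nn.Sequential(torch.nn.Linear(20, 2))
X, y = torch.randn(500, 20), torch.randint(0, 2, (500,))
train_trimmed(model, X, y)
```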

Deep Partition Aggregation (DPA)
Levine, Alexander, and Soheil Feizi. "Deep partition aggregation: Provable defense against general poisoning attacks." ICLR 2021.
- Divide and conquer: suitable when the number of poisoned samples is small.
- Partition the training set into k even subsets.
- Train one base classifier on each subset.
- Make decisions by voting.
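A minimal sketch of the partition-and-vote recipe, with logistic regression standing in for the deep base classifiers; the partition rule (index modulo k) and k itself are illustrative choices.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def dpa_fit(X, y, k=10):
    """Train one base classifier per disjoint partition of the training set."""
    parts = [np.where(np.arange(len(X)) % k == i)[0] for i in range(k)]
    return [LogisticRegression(max_iter=1000).fit(X[idx], y[idx]) for idx in parts]

def dpa_predict(models, X):
    """Aggregate the base classifiers by majority vote."""
    votes = np.stack([m.predict(X) for m in models])          # shape (k, n_samples)
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10)); y = (X[:, 0] > 0).astype(int)
models = dpa_fit(X, y)
print("predictions:", dpa_predict(models, X[:5]))
```

Because each poisoned sample lands in only one partition, it can corrupt at most one vote, which is what makes the aggregate prediction certifiably robust to a bounded number of poisons.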

Anti-Backdoor Learning (ABL)
Li, Yige, et al. "Anti-backdoor learning: Training clean models on poisoned data." NeurIPS 2021.
- Samples that are learned fast are not good samples.
- Training loss on clean samples (blue) vs. poisoned examples (yellow), studied across 10 poisoning-based backdoor attacks.
- Poisoned samples are fully learned in the early stage of training; their loss drops very quickly.

Anti-Backdoor Learning (ABL): isolate first, then unlearn
Li, Yige, et al. "Anti-backdoor learning: Training clean models on poisoned data." NeurIPS 2021.
- Problem formulation and overview of ABL (shown as figures in the original slides).
- LGA: local gradient ascent; GGA: global gradient ascent.
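A simplified sketch of the two-stage ABL recipe: warm up, isolate the lowest-loss (fastest-learned) samples, then keep descending on the rest while ascending on the isolated pool. The plain loss ranking and the fixed isolation rate below stand in for the paper's LGA/GGA losses; model, data, and epoch counts are illustrative.

```python
import torch, torch.nn.functional as F

model = torch.nn.Sequential(torch.nn.Linear(32, 10))
X, y = torch.randn(2000, 32), torch.randint(0, 10, (2000,))
opt = torch.optim.SGD(model.parameters(), lr=0.05)

for _ in range(5):                                   # stage 0: short warm-up
    loss = F.cross_entropy(model(X), y)
    opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():                                # stage 1: isolate suspected poisons
    per_sample = F.cross_entropy(model(X), y, reduction="none")
    n_isolate = int(0.01 * len(X))
    isolated = torch.topk(-per_sample, n_isolate).indices     # lowest-loss samples
    iso_set = set(isolated.tolist())
    clean = torch.tensor([i for i in range(len(X)) if i not in iso_set])

for _ in range(20):                                  # stage 2: learn clean, unlearn isolated
    loss = (F.cross_entropy(model(X[clean]), y[clean])
            - F.cross_entropy(model(X[isolated]), y[isolated]))   # gradient ascent on poisons
    opt.zero_grad(); loss.backward(); opt.step()
```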

Outline: A Brief History of Data Poisoning · Data Poisoning Attacks · Data Poisoning Defenses · Poisoning for Data Protection · Future Research

Unlearnable Examples
https://exposing.ai/megaface/
- The Internet is flooded with massive amounts of personal data.

Personal Data Are Used For Training Commercial Models
- Datasets are collected from the Internet without people's awareness [1].
- They are used to train commercial models [2].
- This raises privacy concerns [3].
[1] Prabhu & Abeba. "Large image datasets: A pyrrhic win for computer vision?" arXiv:2006.16923, 2020.
[2] Hill, Kashmir. "The Secretive Company That Might End Privacy as We Know It." NY Times, 2020.
[3] Shan, Shawn, et al. "Fawkes: Protecting personal privacy against unauthorized deep learning models." USENIX Security Symposium, 2020.

Unlearnable Examples
Huang, Hanxun, et al. "Unlearnable examples: Making personal data unexploitable." ICLR 2021.
- Goal: making data unlearnable (unusable) to machine learning.
- Modify training images -> make them useless for training.

Adversarial Noise = Error-Maximizing Noise
- Adversarial noise is small and imperceptible to human eyes (Szegedy et al. 2013; Goodfellow et al. 2014).
- Adversarial examples fool DNNs at test time by maximizing errors.
- Adversarial noise can mislead ML models.

What if there is NO error to learn?
Huang, Hanxun, et al. "Unlearnable examples: Making personal data unexploitable." ICLR 2021.

Generating Error-Minimizing Noise
Huang, Hanxun, et al. "Unlearnable examples: Making personal data unexploitable." ICLR 2021.
- To influence the model's training, this again has to be a bi-level optimization problem.

Sample-wise Noise
Huang, Hanxun, et al. "Unlearnable examples: Making personal data unexploitable." ICLR 2021.
- Each sample gets its own noise.

Class-wise Noise
Huang, Hanxun, et al. "Unlearnable examples: Making personal data unexploitable." ICLR 2021.
- All samples of a class share one noise pattern.
- Is there a pattern here? Why can a single noise pattern remove the "error" from an entire class of data?
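A minimal PyTorch sketch of the error-minimizing (min-min) noise generation for the sample-wise case; epsilon, step sizes, and the toy model are illustrative, and class-wise noise would simply tie one delta to each label rather than to each image.

```python
import torch, torch.nn.functional as F

model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.1)

X = torch.rand(256, 3, 32, 32)
y = torch.randint(0, 10, (256,))
delta = torch.zeros_like(X)
eps, alpha = 8 / 255, 2 / 255

for _ in range(10):
    # Inner minimization: PGD steps on the noise that REDUCE the training loss,
    # so that there is "no error left to learn" on the perturbed data.
    delta.requires_grad_(True)
    for _ in range(5):
        loss = F.cross_entropy(model(X + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta -= alpha * grad.sign()          # descend, not ascend
            delta.clamp_(-eps, eps)
    delta = delta.detach()
    # Outer minimization: update the model on the (now unlearnable) data.
    loss = F.cross_entropy(model(X + delta), y)
    opt.zero_grad(); loss.backward(); opt.step()
```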

Experiments
Huang, Hanxun, et al. "Unlearnable examples: Making personal data unexploitable." ICLR 2021.
- Comparison of the effect of different noises on training: error-minimizing noise can create unlearnable examples in both settings (sample-wise and class-wise).
- Is the noise transferable to other models? Yes.
- Is the noise transferable to other datasets? Yes.
- Is the noise robust to data augmentation? Yes.
- What percentage of the data needs to be unlearnable? Unfortunately, 100% of the training data needs to be poisoned.
- How about protecting part of the data, or just one class? Unlearnable examples will not contribute to model training.

Protecting Face Images
- No more facial recognition? If everyone posts unlearnable images.

Conclusion & Limitations
- A new and exciting research problem: unlearnable examples via error-minimizing noise.
- Limitations with respect to representation learning.
- Limitations with respect to adversarial training (addressed by an ICLR 2022 work: Robust Unlearnable Examples).

Related works:
- Cherepanova et al. "LowKey: Leveraging Adversarial Attacks to Protect Social Media Users from Facial Recognition." ICLR, 2021.
- Fowl et al. "Adversarial Examples Make Strong Poisons." NeurIPS, 2021.
- Fowl et al. "Preventing unauthorized use of proprietary data: Poisoning for secure dataset release." arXiv:2103.02683.
- Radiya-Dixit and Tramèr. "Data Poisoning Won't Save You From Facial Recognition." arXiv:2106.14851.
- Shan et al. "Fawkes: Protecting privacy against unauthorized deep learning models." USENIX Security, 2021.

Unlearnable Clusters: Protection Against Unknown Labelling

Existing Approaches
- Error-minimizing noise drives the training loss towards 0, so that the model "feels there is no information left to learn" [1].
- Adversarial Poisoning uses the concept of non-robust features [2] to make the model learn wrong non-robust features [3].
[1] Unlearnable examples: Making personal data unexploitable. ICLR 2021.
[2] Adversarial Examples Are Not Bugs, They Are Features. NeurIPS 2019.
[3] Adversarial examples make strong poisons. NeurIPS 2021.

Universal Adversarial Perturbation (UAP)
- A Universal Adversarial Perturbation (UAP) is a class-wise perturbation that fools the model once it is added to almost any image [1].
- It can both "cover up" the original semantic features of an image and work "independently" of them [2].
[1] Universal adversarial perturbations. CVPR 2017.
[2] ImageNet Pre-training Also Transfers Non-robustness. AAAI 2023.

Unlearnable Clusters (UCs)
- Break uniformity and discrepancy without relying on label information (i.e., without a classification layer).
- K-means initialization; disrupting discrepancy and uniformity.
- Methodology and experiments (three slides of results; dataset: 37-class Pets).

Transferable Unlearnable Examples

Linear Separability
Availability Attacks Create Shortcuts. Yu, Da, et al. KDD 2022.
- t-SNE visualization of class-wise perturbations and error-minimizing noise (EMN) in input space (figures in the original slides).
- Each perturbation is reshaped from 32 x 32 x 3 into a 3072-dimensional vector.

Linear Separability: Quantifying Linear Separability
Availability Attacks Create Shortcuts. Yu, Da, et al. KDD 2022.
- Input x: the perturbation.
- Label y: the corresponding label of the perturbed image.
- Model: a linear model and a two-layer neural network.
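A minimal sketch of this linear-separability test: train a linear model on the flattened perturbations alone, labelled with the classes of the perturbed images; high accuracy indicates that the noise itself is a linearly separable shortcut. The synthetic class-wise patterns below are an illustrative stand-in for real EMN or class-wise perturbations.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_classes, per_class, dim = 10, 500, 32 * 32 * 3
class_patterns = rng.uniform(-8 / 255, 8 / 255, size=(n_classes, dim))  # one pattern per class

labels = np.repeat(np.arange(n_classes), per_class)
perturbations = class_patterns[labels] + rng.normal(0, 1 / 255, size=(len(labels), dim))

X_tr, X_te, y_tr, y_te = train_test_split(perturbations, labels, test_size=0.3, random_state=0)
clf = LogisticRegression(max_iter=2000).fit(X_tr, y_tr)   # linear model on the noise alone
print("linear model accuracy on perturbations alone:", clf.score(X_te, y_te))
```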

Linearly Separable Perturbations
Availability Attacks Create Shortcuts. Yu, Da, et al. KDD 2022.
- Goal: to show that simple linear separability is enough for poisoning.
- Synthetic noise (SN) examples (figure: images labelled horse, horse, cat).

