Data Extraction and Model Stealing

姜育刚 (Yu-Gang Jiang), 马兴军 (Xingjun Ma), 吴祖煊 (Zuxuan Wu)

Recap: Week 7
- A Brief History of Backdoor Learning
- Backdoor Attacks
- Backdoor Defenses
- Future Research

This Week
- Data Extraction Attack & Defense
- Model Stealing Attack
- Future Research

Data Extraction Attack
Recovering training data by inverting the model: 8001/dss/imageClassify
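
Terminology
The following terms describe the same thing:
- Data Extraction Attack
- Data Stealing Attack
- Training Data Extraction Attack
- Model Memorization Attack
- Model Inversion Attack

Security Threats
Example of leaked text: "My social security number is 078-"
- Personal Info Leakage
- Sensitive Info Leakage
- Threats to National Security
- Illegal Data Trading
- …

Memorization of DNNs
Evidence 1: DNNs learn different levels of representations

Memorization of DNNs
Evidence 2: DNNs can memorize random labels/pixels
Figure panels: true labels, random labels, shuffled pixels, random pixels, Gaussian noise
Zhang, Chiyuan, et al. "Understanding deep learning requires rethinking generalization." ICLR 2017.

Memorization of DNNs
Evidence 3: The success of GANs and diffusion models

Intended vs. Unintended Memorization
Intended Memorization
- Task-related
- Statistics
- Inputs and Labels
Figure panels: first-layer filters on normal CIFAR-10; first-layer filters on randomly labeled CIFAR-10
Unintended Memorization
- Task-irrelevant but memorized
- Memorized even when it appears only a few times: 4 occurrences are enough for it to be fully memorized
- Example: a natural-language translation model memorizes "My social security number is xxxx"
Arpit et al. "A closer look at memorization in deep networks." ICML, 2017.
Carlini et al. "The secret sharer: Evaluating and testing unintended memorization in neural networks." USENIX Security, 2019.

Existing Data Extraction Attacks

Black-Box Extraction
Active testing: the canary in the coal mine
- Canary formats: "The random number is ****", "My social security number is ****"
- Actively inject canaries, then measure the "exposure" of the canary data in the language model
- Testing and quantifying unintended memorization: canaries
Carlini et al. "The secret sharer: Evaluating and testing unintended memorization in neural networks." USENIX Security, 2019.

Black-Box Extraction
Against general-purpose language models:
- Large numbers of names, phone numbers, email addresses, social security numbers, etc. can be extracted
- Larger models memorize this information more easily than smaller models
- Information can be memorized even if it appears in only one document
Carlini, Nicholas, et al. "Extracting training data from large language models." USENIX Security, 2021.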
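
The "exposure" of a canary can be illustrated with a small sketch. It assumes the rank-based definition from Carlini et al. (2019), exposure = log2|R| - log2 rank, where R is the space of candidate secrets and rank is the canary's position when all candidates are sorted by model loss. The loss values below are made up for illustration; in practice they would come from the language model under test.

```python
import math

def exposure(canary_loss, candidate_losses):
    """Exposure = log2(|R|) - log2(rank), assuming the rank-based definition.

    canary_loss: the model's loss (e.g. log-perplexity) on the injected canary.
    candidate_losses: losses of every candidate secret in the space R,
        including the canary itself.
    """
    # Rank 1 means no other candidate is more likely than the canary.
    rank = 1 + sum(1 for loss in candidate_losses if loss < canary_loss)
    return math.log2(len(candidate_losses)) - math.log2(rank)

# Toy candidate space of 1024 secrets with made-up losses 1.0 .. 1024.0.
losses = [float(i) for i in range(1, 1025)]
fully_memorized = exposure(losses[0], losses)   # canary is the most likely candidate
not_memorized = exposure(losses[511], losses)   # canary sits mid-ranking
```

A high exposure means the model prefers the canary over almost every alternative, which is the signal of unintended memorization.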

Training Data Extraction Attack
Definition of Memorization
- Model knowledge extraction
- k-eidetic memorization

Attack Steps
- Step 1: generate a large amount of text
- Step 2: filter and verify the text

Experimental Results
- 604 "unintended" memorized samples
- Some memorized content appears in only one document
- The larger the model, the stronger the memorization
Carlini, Nicholas, et al. "Extracting training data from large language models." USENIX Security, 2021.
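
The notion of k-eidetic memorization can be sketched as follows. This assumes the definition in Carlini et al. (2021): a string is k-eidetic memorized if it can be extracted from the model and appears in at most k training documents. The helper names, the toy corpus, and the `extractable` flag (which stands in for the result of the generate-and-filter attack) are all hypothetical.

```python
def document_count(secret, training_docs):
    """Number of training documents that contain `secret` as a substring."""
    return sum(1 for doc in training_docs if secret in doc)

def is_k_eidetic_memorized(secret, training_docs, k, extractable):
    """True if `secret` was extracted and occurs in at most k (but at least 1) documents."""
    count = document_count(secret, training_docs)
    return extractable and 1 <= count <= k

# Toy corpus: the secret appears in a single document.
docs = [
    "contact me at alice@example.com for details",
    "the weather is nice today",
    "model stealing and data extraction",
]
secret = "alice@example.com"
count = document_count(secret, docs)
one_eidetic = is_k_eidetic_memorized(secret, docs, k=1, extractable=True)
```

The smaller k is, the more worrying the memorization: k = 1 means the model reproduced something it saw in only one document.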

Memorization of Diffusion Models
A joint study by the University of Maryland and New York University found that generative diffusion models memorize their original training data, so that specific text prompts cause the original data to leak.
Figure panels: generated; original

Memorization of Diffusion Models
Definition of Replication: We say that a generated image has replicated content if it contains an object (either in the foreground or background) that appears identically in a training image, neglecting minor variations in appearance that could result from data augmentation.
Somepalli, Gowthami, et al. "Diffusion art or digital forgery? Investigating data replication in diffusion models." CVPR, 2023.

Memorization of Diffusion Models
Create Synthetic and Real Datasets
- Figure panels: Original, Segmix, Diagonal Outpainting, Patch Outpainting
- Existing image retrieval datasets: Oxford, Paris, INSTRE, GPR1200

Memorization of Diffusion Models
Train Image Retrieval Models

Memorization of Diffusion Models
- Similarity metric: inner product, token-wise inner product
- Diffusion model: DDPM
- Dataset: Celeb-A
Results: the top-2 matches of diffusion models trained on 300, 3000, and 30000 images (the full set is 30000).
- Green: copy
- Blue: close but no exact copy
- Others: similar but not the same
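
The two similarity metrics can be contrasted with a small sketch. The feature vectors are hypothetical stand-ins for retrieval-model embeddings, and the token-wise variant is an assumption about how such a score might be computed: split each vector into equal chunks ("tokens"), take the normalized inner product of corresponding chunks, and return the maximum.

```python
import math

def normalize(v):
    """Scale v to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return list(v) if norm == 0 else [x / norm for x in v]

def inner_product(a, b):
    """Cosine-style similarity of two whole feature vectors."""
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))

def tokenwise_inner_product(a, b, num_tokens):
    """Max similarity over corresponding chunks (assumed reading of 'token-wise')."""
    size = len(a) // num_tokens
    scores = []
    for t in range(num_tokens):
        chunk_a = a[t * size:(t + 1) * size]
        chunk_b = b[t * size:(t + 1) * size]
        scores.append(inner_product(chunk_a, chunk_b))
    return max(scores)

# Hypothetical features: the first "token" is identical, the second differs.
generated = [1.0, 0.0, 0.0, 1.0]
training = [1.0, 0.0, 1.0, 0.0]
global_score = inner_product(generated, training)
local_score = tokenwise_inner_product(generated, training, num_tokens=2)
```

In this toy case the global score is moderate while the token-wise score is maximal, which is why a local metric can be better at flagging an image that copies only one region of a training image.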

Memorization of Diffusion Models
Gen-train vs. train-train similarity score distribution
- The less training data, the more copying

Memorization of Diffusion Models
Case study: ImageNet LDM
- Many close copies but no exact match (similarity score < 0.65)
- Most similar: theater curtain, peacock, and bananas
- Least similar: sea lion, bee, and swing

Memorization of Diffusion Models
Case study: Stable Diffusion
- LAION Aesthetics v2 6+: 12M images
- Randomly select 9000 images as sources and use their captions as prompts

Memorization of Diffusion Models
Case study: Stable Diffusion
- Some keywords (those in red) are associated with certain fixed patterns.

Memorization of Diffusion Models
Case study: Stable Diffusion
- Style copying using the text prompt: <Name of the painting> by <name of the artist>
Somepalli, Gowthami, et al. "Diffusion art or digital forgery? Investigating data replication in diffusion models." CVPR, 2023.

Memorization of Large Language Models (LLMs)
- Pretraining data detection
- MIN-K% PROB
Shi, Weijia, et al. "Detecting Pretraining Data from Large Language Models." arXiv preprint arXiv:2310.16789 (2023).

Memorization of Large Language Models (LLMs)
- Detection on WIKIMIA
- A dynamic benchmark: WIKIMIA
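
A minimal sketch of the MIN-K% PROB score, assuming the formulation in Shi et al. (2023): take the k% of tokens with the lowest log-probability under the model and average them. The token log-probabilities below are made up; in practice they would come from the LLM being audited, and the decision threshold would need to be calibrated.

```python
def min_k_prob(token_logprobs, k_percent=20):
    """Average log-probability of the k% least likely tokens in a text."""
    n = max(1, len(token_logprobs) * k_percent // 100)
    lowest = sorted(token_logprobs)[:n]
    return sum(lowest) / n

# Made-up log-probabilities: a text seen in pretraining tends to have no
# very surprising tokens, while an unseen text usually has a few outliers.
seen_text = [-0.1, -0.2, -0.1, -0.3, -0.2, -0.1, -0.4, -0.2, -0.1, -0.3]
unseen_text = [-0.1, -0.2, -6.0, -0.3, -0.2, -5.0, -0.4, -0.2, -0.1, -0.3]

score_seen = min_k_prob(seen_text)
score_unseen = min_k_prob(unseen_text)
```

A higher score suggests the text is more likely to have been part of the pretraining data.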
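
White-Box Extraction
White-box extraction relies on gradient information and is also called the Gradient Inversion Attack.
It targets training schemes that share gradients:
- Distributed training
- Federated learning
- Parallel training
- Decentralized training
Figure: two distributed training paradigms

White-Box Extraction
Two families of attacks:
- Iterative inversion: approximation
- (Layer-wise) recursive inversion: backward derivation
Zhang et al. "A Survey on Gradient Inversion: Attacks, Defenses and Future Directions." IJCAI 2022.

White-Box Extraction: Iterative Inversion
Iterative inversion: construct data whose gradient approaches the true gradient
- The true gradient is assumed to be known
- One forward pass and two backward passes
- The gradient produced by the generated data is matched against the true gradient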
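
The iterative idea can be sketched on a toy problem. The sketch below is not any specific published attack: it assumes a one-sample linear regression model with squared loss, assumes the attacker knows the weights `w`, the label `y`, and the shared gradient, and uses a hand-derived gradient of the matching loss with a backtracking step size in place of automatic differentiation. All names and values are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def model_gradient(w, x, y):
    """Gradient of 0.5 * (w.x - y)^2 with respect to w."""
    r = dot(w, x) - y
    return [r * xi for xi in x]

def matching_loss(w, x_dummy, y, g_true):
    """Squared distance between the dummy gradient and the true gradient."""
    g = model_gradient(w, x_dummy, y)
    return sum((gi - ti) ** 2 for gi, ti in zip(g, g_true))

def matching_grad(w, x_dummy, y, g_true):
    """Analytic gradient of matching_loss with respect to x_dummy."""
    r = dot(w, x_dummy) - y
    e = [r * xi - ti for xi, ti in zip(x_dummy, g_true)]
    s = dot(e, x_dummy)
    return [2 * wj * s + 2 * r * ej for wj, ej in zip(w, e)]

def invert(w, y, g_true, x_init, steps=200):
    """Gradient descent on the dummy input with a backtracking step size."""
    x = list(x_init)
    loss = matching_loss(w, x, y, g_true)
    for _ in range(steps):
        grad = matching_grad(w, x, y, g_true)
        step = 1.0
        while step > 1e-12:
            cand = [xi - step * gi for xi, gi in zip(x, grad)]
            cand_loss = matching_loss(w, cand, y, g_true)
            if cand_loss < loss:       # accept only steps that reduce the loss
                x, loss = cand, cand_loss
                break
            step *= 0.5
    return x, loss

w = [0.5, -1.0, 2.0]
x_private, y = [1.0, 2.0, -1.0], 1.0
g_true = model_gradient(w, x_private, y)   # what the victim shares
x_init = [0.1, 0.1, 0.1]
initial_loss = matching_loss(w, x_init, y, g_true)
x_rec, final_loss = invert(w, y, g_true, x_init)
```

Matching the gradient does not guarantee a unique reconstruction, even in this toy setting, so the recovered input should be treated as a candidate rather than as the private sample itself.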

White-Box Extraction: Iterative Inversion
Summary of existing work

White-Box Extraction: Recursive Inversion
Recursive inversion: derive the input layer by layer from the true gradient
- The true gradient is known
Key factors:
- Image size (32x32)
- Batch size (mostly 1)
- Model size
Zhang et al. "A Survey on Gradient Inversion: Attacks, Defenses and Future Directions." IJCAI 2022.
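
A well-known special case shows why closed-form recovery is possible at batch size 1. For a fully connected layer z = Wx + b, the gradients satisfy dL/dW = delta * x^T and dL/db = delta, so each input coordinate is a ratio of two shared gradient entries. The sketch below illustrates only that single-layer step, with hypothetical values; it is not a full recursive attack on a deep network.

```python
def fc_layer_gradients(x, delta):
    """Gradients of a layer z = W x + b, given delta = dL/dz (batch size 1)."""
    grad_W = [[d * xj for xj in x] for d in delta]
    grad_b = list(delta)
    return grad_W, grad_b

def recover_input(grad_W, grad_b):
    """Recover x from one row of dL/dW divided by the matching entry of dL/db."""
    # Use the row with the largest bias gradient for numerical stability.
    i = max(range(len(grad_b)), key=lambda k: abs(grad_b[k]))
    if grad_b[i] == 0:
        raise ValueError("all bias gradients are zero; the input cannot be recovered")
    return [g / grad_b[i] for g in grad_W[i]]

x_private = [1.0, 3.0, -2.0]
delta = [0.5, -2.0]                      # hypothetical upstream gradient
grad_W, grad_b = fc_layer_gradients(x_private, delta)
x_recovered = recover_input(grad_W, grad_b)
```

With larger batches the shared gradient is a sum over samples, so this simple ratio no longer isolates a single input, which is consistent with batch size being listed as a key factor.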

White-Box Extraction: Recursive Inversion
Summary of existing work

White-Box Defenses
Summary of existing work
Zhang et al. "A Survey on Gradient Inversion: Attacks, Defenses and Future Directions." IJCAI 2022.
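
This Week
- Data Extraction Attack & Defense
- Model Stealing Attack
- Future Research

AI Models Are Expensive to Train
- BERT (Google): $1.6 million
- Training large-scale, high-performance AI models consumes enormous resources: data, compute, and human effort

Motivation for Model Stealing
- Huge commercial value
- Preserve the model's performance as much as possible
- Avoid being detected
Figure: a valuable AI model is stolen and put to the attacker's own use

Ways of Stealing a Model
- Inputs and outputs
- Model fine-tuning
- Model pruning

Stealing Attacks
- Stealing machine learning models via prediction APIs, USENIX Security, 2016
- Practical black-box attacks against machine learning, ASIACCS, 2017
- Knockoff nets: Stealing functionality of black-box models, CVPR, 2019
- MAZE: Data-free model stealing attack using zeroth-order gradient estimation, CVPR, 2021

Equation-Solving Attacks
- Attack idea
- Example

Equation-Solving Attacks
Number of queries and time needed to steal certain commercial models with 100% success

Equation-Solving Attacks: Stealing Parameters
Attack algorithm:
- The model has d parameters
- Use d+1 inputs to construct d+1 equations in the model parameters
Main characteristics:
- Targets traditional machine learning models: SVM, LR, DT
- Can be solved exactly, but requires the model to return exact confidence scores
- The stolen model may also leak training data (data inversion attack)
Tramèr, Florian, et al. "Stealing machine learning models via prediction APIs." USENIX Security, 2016.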
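
For logistic regression the equation-solving idea can be sketched in a few lines. The sketch assumes the API returns exact confidence scores for a model over d-dimensional inputs (d weights plus a bias). Applying the inverse sigmoid to each returned probability yields a linear equation in the unknowns, and d+1 well-chosen queries (the origin and the d unit vectors) make the system trivial to solve. The `make_api` function is a hypothetical stand-in for a remote prediction API.

```python
import math

def make_api(w, b):
    """Hypothetical black-box API that returns P(y=1 | x) for a secret model."""
    def predict_proba(x):
        z = sum(wi * xi for wi, xi in zip(w, x)) + b
        return 1.0 / (1.0 + math.exp(-z))
    return predict_proba

def logit(p):
    """Inverse sigmoid: turns a confidence score back into w.x + b."""
    return math.log(p / (1.0 - p))

def steal_logistic_regression(api, d):
    """Recover (w, b) with d + 1 queries: the origin and each unit vector."""
    b = logit(api([0.0] * d))            # query at the origin gives b
    w = []
    for i in range(d):
        e = [0.0] * d
        e[i] = 1.0
        w.append(logit(api(e)) - b)      # query at e_i gives w_i + b
    return w, b

secret_w, secret_b = [1.5, -2.0, 0.25], 0.5
api = make_api(secret_w, secret_b)
stolen_w, stolen_b = steal_logistic_regression(api, d=3)
```

If the API rounds its confidence scores or returns only labels, these equations no longer hold exactly, which matches the requirement for exact confidences listed above.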

Equation-Solving Attacks: Stealing Hyperparameters
Attack idea: once training has finished, the gradient of the loss should be 0.
Wang, Binghui, and Neil Zhenqiang Gong. "Stealing hyperparameters in machine learning." S&P, 2018.
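
The zero-gradient idea can be made concrete with ridge regression. The sketch assumes the objective sum_i (w.x_i - y_i)^2 + lambda * ||w||^2 and assumes the attacker knows the training data and the learned weights. Setting the gradient to zero gives b + lambda * a = 0, where b is the gradient of the data-fit term and a is the gradient of the regularizer, so lambda follows by least squares. The data and the secret value are made up.

```python
def steal_ridge_lambda(w, X, y):
    """Estimate lambda from the stationarity condition of ridge regression."""
    d = len(w)
    residuals = [sum(wj * xj for wj, xj in zip(w, row)) - yi
                 for row, yi in zip(X, y)]
    # b: gradient of the data-fit term; a: gradient of the regularizer ||w||^2.
    b = [2 * sum(r * row[j] for r, row in zip(residuals, X)) for j in range(d)]
    a = [2 * wj for wj in w]
    # Least-squares solution of b + lambda * a = 0.
    return -sum(ai * bi for ai, bi in zip(a, b)) / sum(ai * ai for ai in a)

# Victim side: 1-D ridge regression has the closed form
# w = sum(x * y) / (sum(x^2) + lambda).
X = [[1.0], [2.0], [3.0]]
y = [2.0, 4.0, 6.5]
secret_lambda = 0.5
sxy = sum(row[0] * yi for row, yi in zip(X, y))
sxx = sum(row[0] ** 2 for row in X)
w = [sxy / (sxx + secret_lambda)]

stolen_lambda = steal_ridge_lambda(w, X, y)
```

The same reasoning applies to other objectives of the form loss + lambda * regularizer, as long as training has converged to a point where the gradient is (close to) zero.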

Substitute-Model Attacks
Attack idea: train a substitute model to imitate the target model's behavior while querying it.

Substitute-Model Attacks
Knockoff Nets attack

Substitute-Model Attacks
Knockoff Nets attack: attack pipeline
- Sample a large number of query examples
- Train the substitute model
- Use reinforcement learning to learn how to select samples efficiently
Orekondy et al. "Knockoff nets: Stealing functionality of black-box models." CVPR, 2019.
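
The query-then-train loop can be sketched with a toy example. This is not the Knockoff Nets implementation: it assumes a hypothetical two-feature logistic victim, samples queries uniformly at random instead of using a learned reinforcement-learning policy, and fits a logistic substitute on the victim's soft outputs with plain gradient descent.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def victim_api(x):
    """Hypothetical black-box target: returns P(class 1 | x)."""
    return sigmoid(2.0 * x[0] - 3.0 * x[1] + 0.5)

def train_substitute(queries, soft_labels, epochs=500, lr=1.0):
    """Fit a logistic substitute on the victim's soft labels (cross-entropy)."""
    w, b = [0.0, 0.0], 0.0
    n = len(queries)
    for _ in range(epochs):
        gw, gb = [0.0, 0.0], 0.0
        for x, p in zip(queries, soft_labels):
            err = sigmoid(w[0] * x[0] + w[1] * x[1] + b) - p
            gw[0] += err * x[0]
            gw[1] += err * x[1]
            gb += err
        w = [w[0] - lr * gw[0] / n, w[1] - lr * gw[1] / n]
        b -= lr * gb / n
    return w, b

rng = random.Random(0)
# Step 1: sample query inputs and label them with the victim's outputs.
queries = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(200)]
soft_labels = [victim_api(x) for x in queries]
# Step 2: train the substitute model on the (query, output) pairs.
w, b = train_substitute(queries, soft_labels)

# Measure how often substitute and victim agree on fresh inputs.
test_points = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(500)]
agree = sum(
    (victim_api(x) >= 0.5) == (sigmoid(w[0] * x[0] + w[1] * x[1] + b) >= 0.5)
    for x in test_points
)
agreement = agree / len(test_points)
```

The reinforcement-learning component of the pipeline would replace the uniform sampling in step 1 with a policy that prefers queries the substitute learns the most from.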

Substitute-Model Attacks
High-accuracy vs. high-fidelity stealing attacks
- Blue: target decision boundary
- Orange: high-accuracy extraction
- Green: high-fidelity extraction

Substitute-Model Attacks
High-accuracy vs. high-fidelity stealing attacks
- Query images are sent to the target model (black box)
- The model outputs serve as labels that guide the training of the substitute model
- Output types: probability output, class output
Jagielski, Matthew, et al. "High accuracy and high fidelity extraction of neural networks." USENIX Security, 2020.
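
The two goals can be told apart by what the stolen model is compared against: accuracy measures agreement with the ground-truth labels, while fidelity measures agreement with the target model's own predictions, including its mistakes. The label lists below are made up for illustration.

```python
def agreement(preds_a, preds_b):
    """Fraction of positions where two label lists agree."""
    return sum(a == b for a, b in zip(preds_a, preds_b)) / len(preds_a)

def accuracy(stolen_preds, true_labels):
    """Agreement between the stolen model and the ground truth."""
    return agreement(stolen_preds, true_labels)

def fidelity(stolen_preds, victim_preds):
    """Agreement between the stolen model and the target (victim) model."""
    return agreement(stolen_preds, victim_preds)

true_labels = [0, 1, 1, 0, 1]
victim_preds = [0, 1, 0, 0, 1]        # the victim makes one mistake
high_accuracy = [0, 1, 1, 0, 1]       # matches the truth, not the victim's mistake
high_fidelity = [0, 1, 0, 0, 1]       # reproduces the victim, mistake included

acc_a, fid_a = accuracy(high_accuracy, true_labels), fidelity(high_accuracy, victim_preds)
acc_b, fid_b = accuracy(high_fidelity, true_labels), fidelity(high_fidelity, victim_preds)
```

A high-fidelity copy is the more useful starting point for follow-up attacks on the target, since it shares the target's decision boundary rather than just its task performance.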
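
Substitute-Model Attacks
Functionally Equivalent Extraction
Attack steps:
- For a given neuron, find the critical points where ReLU = 0
- Probe the boundary on both sides of the critical point to determine the corresponding weights
- Limitation: can only steal two-layer networks
Jagielski, Matthew, et al. "High accuracy and high fidelity extraction of neural networks." USENIX Security, 2020.

Substitute-Model Attacks
Cryptanalytic Extraction
- Idea: the second derivative of ReLU is 0, combined with finite differences
- Critical points: ReLU = 0

Substitute-Model Attacks
Cryptanalytic Extraction
- Stealing 0-deep neural networks
- Stealing 1-deep neural networks
Carlini et al. "Cryptanalytic extraction of neural network models." Annual International Cryptology Conference, 2020.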
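
The role of finite differences can be illustrated on a toy function. A ReLU network is piecewise linear, so a second-order finite difference is (numerically) zero on any interval that contains no critical point and non-zero on an interval that straddles one. The sketch below uses that test to bisect toward a critical point along a line. The function `f` is a hypothetical black box with a single critical point in the search interval; this shows the idea only and is not the paper's full algorithm.

```python
def relu(z):
    return z if z > 0 else 0.0

def f(t):
    """Hypothetical black-box ReLU network restricted to a line, queried at scalar t."""
    return 2.0 * relu(t - 0.3) + 0.7 * relu(-t - 0.5) + 0.1

def second_difference(func, lo, hi):
    """f(lo) - 2 f(mid) + f(hi): zero if func is linear on [lo, hi]."""
    mid = (lo + hi) / 2.0
    return func(lo) - 2.0 * func(mid) + func(hi)

def locate_critical_point(func, lo, hi, iterations=20, tol=1e-12):
    """Bisect toward the single critical point assumed to lie inside (lo, hi)."""
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if abs(second_difference(func, lo, mid)) > tol:
            hi = mid      # the left half is not linear, so the kink is there
        else:
            lo = mid      # the left half is linear, so the kink is on the right
    return (lo + hi) / 2.0

kink = locate_critical_point(f, 0.0, 1.0)
```

Once a critical point is located, first-order finite differences on either side of it reveal how the slope changes, which is the information used to recover the corresponding weights.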

Substitute-Model Attacks
Yuan, Xiaoyong, et a