




Explainability & Common Robustness
姜育刚, 马兴军, 吴祖煊
Recap: Week 1
1. What is Machine Learning
2. Machine Learning Paradigms
3. Loss Functions
4. Optimization Methods

Machine Learning Pipeline
- Set up the input, the loss, and the optimiser
- Regularization makes the decision region smoother
- The landscape of a loss function varies w.r.t. the data and the function itself

Model? Deep Neural Networks
/neural-network-zoo/; /articles/cc-machine-learning-deep-learning-architectures/

Feed-Forward Neural Networks
- Feed-Forward Neural Networks (FNN), also called Fully Connected Neural Networks (FCN) or Multilayer Perceptron (MLP)
- The simplest neural network
- Fully-connected between layers
- For data that has NO temporal or spatial order

Convolutional Neural Networks
- For images or data with spatial order
- Can stack up to >100 layers
- Neurons in 3 dimensions, rather than in one flat layer

Recurrent Neural Networks
/~shervine/teaching/cs-230/cheatsheet-recurrent-neural-networks
- Traditional RNN
Transformers
Vaswani, Ashish, et al. "Attention Is All You Need." Advances in Neural Information Processing Systems 30 (2017).
- Transformer: a new type of DNN based on attention
- Encoder / Decoder

Self-Attention Explained
/illustrated-self-attention-2d627e33b20a
CNN Explained
- Learns different levels of representations
- A brief history of CNNs: LeNet (1990s), AlexNet (2012), ZFNet (2013), GoogLeNet (2014), VGGNet (2014), ResNet (2015), Inception-V4 (2016), ResNeXt (2017), ViT (2021)
An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale, ICLR 2021.
Explainable AI
Interpretability of deep learning covers:
- Learning mechanism: the learning process and the learning outcome
- Inference mechanism: decision evidence and the reasoning process
- Generalization mechanism: why and under what conditions generalization happens
- Cognitive mechanism: cognitive science and cognition-inspired intelligence
- Robustness: common robustness and adversarial robustness

We want to answer the following questions:
- How does a DNN learn, what does it learn, what makes it generalize, and when does it work or fail?
- Is deep learning true intelligence? How does it compare with human intelligence, and what is its future?
- Is there a unified theory that can not only explain deep learning but also improve it?

Methodological Principles
- Approaches: Visualization, Ablation, Contrast
- Model granularity: Model, Component, Layer, Operation, Neuron
- Class granularity: Superclass, Class
- Data granularity: Training/Test set, Subset, Sample
- Process: Training, Inference, Transfer, Reverse
How to Understand Machine Learning
- Learning is the process of empirical risk minimization (ERM)
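Written out (a standard formulation, not specific to these slides): given training samples {(x_i, y_i)}, ERM picks the parameters that minimize the average loss.

```latex
\hat{\theta}
\;=\; \arg\min_{\theta}\; \hat{R}(\theta)
\;=\; \arg\min_{\theta}\; \frac{1}{n}\sum_{i=1}^{n} L\big(f_{\theta}(x_i),\, y_i\big)
```

The empirical risk \hat{R} is a finite-sample proxy for the true risk E_{(x,y)~D}[L(f_θ(x), y)]; how well the minimizer of one transfers to the other is exactly the generalization question.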
Learning Mechanism
- Training/test error and accuracy; prediction confidence
- Explanation via observation: just plot!
Wang et al. Symmetric Cross Entropy for Robust Learning with Noisy Labels. ICCV 2019.
Learning Mechanism
- Parameter dynamics, gradient dynamics
- Explanation via dynamics and information
TRADI: Tracking Deep Neural Network Weight Distributions. ECCV 2020; Shwartz-Ziv, R. and Tishby, N. Opening the Black Box of Deep Neural Networks via Information. arXiv:1703.00810, 2017.
Learning Mechanism
- Decision boundary, learning process visualization
- Explanation via dynamics and information
https://distill.pub/2020/grand-tour/ (March 16, 2020)
Learning Mechanism
- Data influence/valuation: how does a training sample impact the learning outcome?
- Influence Function; Data Shapley
Understanding Black-box Predictions via Influence Functions. ICML 2017; Pruthi, G., Liu, F., Kale, S., et al. Estimating Training Data Influence by Tracing Gradient Descent. NeurIPS 2020; Data Shapley: Equitable Valuation of Data for Machine Learning. ICML 2019.
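Data Shapley values a training point by its average marginal contribution to a performance score V over all orderings of the data. An exact toy-scale sketch (the coalition score V here is an illustrative stand-in for validation performance; real use relies on Monte Carlo approximation):

```python
import numpy as np
from itertools import permutations

def shapley_values(points, V):
    """Exact Data Shapley: average marginal contribution of each point
    over all orderings. V maps a tuple of point indices to a score."""
    n = len(points)
    phi = np.zeros(n)
    perms = list(permutations(range(n)))
    for perm in perms:
        seen = []
        for idx in perm:
            before = V(tuple(seen))
            seen.append(idx)
            phi[idx] += V(tuple(seen)) - before
    return phi / len(perms)

# Toy score: a coalition is worth the number of *distinct* labels it covers.
labels = [0, 0, 1]
V = lambda idxs: float(len({labels[i] for i in idxs}))
phi = shapley_values(labels, V)
print(phi, phi.sum())  # efficiency: values sum to V(full set) = 2
```

The two points carrying the duplicated label split their credit (0.5 each), while the unique point gets full credit (1.0), which is the "equitable valuation" property the paper is named after.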
Influence Function
- How would the model parameters change if a sample z were removed from the training set?
- Goal: estimate this change without retraining the model
Understanding Black-box Predictions via Influence Functions. ICML 2017; Cook, R.D. and Weisberg, S. Residuals and Influence in Regression. New York: Chapman and Hall, 1982.
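The equations for "goal" and "so" are images in the original deck; the reconstruction below follows the standard derivation in Koh & Liang (ICML 2017).

```latex
% Goal: effect of upweighting a training point z by \epsilon
\hat{\theta}_{\epsilon,z}
\;=\; \arg\min_{\theta}\; \frac{1}{n}\sum_{i=1}^{n} L(z_i,\theta) \;+\; \epsilon\, L(z,\theta)

% So, by the classical result of Cook & Weisberg (1982):
\mathcal{I}_{\mathrm{up,params}}(z)
\;=\; \left.\frac{d\hat{\theta}_{\epsilon,z}}{d\epsilon}\right|_{\epsilon=0}
\;=\; -H_{\hat{\theta}}^{-1}\,\nabla_{\theta} L(z,\hat{\theta}),
\qquad
H_{\hat{\theta}} \;=\; \frac{1}{n}\sum_{i=1}^{n} \nabla^{2}_{\theta} L(z_i,\hat{\theta})
```

Removing z corresponds to setting ε = -1/n, so the parameter change is approximately θ̂_{-z} - θ̂ ≈ (1/n) H_{θ̂}^{-1} ∇_θ L(z, θ̂), computed without any retraining.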
Training Data Influence
- How would the model's loss on z' change after an update on a sample z?
- First-order approximation of the above (assuming one update step is small)
- Checkpoints store the interim updates
Pruthi, G., Liu, F., Kale, S., et al. Estimating Training Data Influence by Tracing Gradient Descent. NeurIPS 2020.
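The checkpoint-based estimator (TracInCP) sums, over the stored checkpoints, the learning rate times the dot product of the two loss gradients. A minimal numpy sketch on a toy linear-regression model (the model, data, and checkpoint values are illustrative assumptions):

```python
import numpy as np

def grad_loss(w, x, y):
    """Gradient of the squared loss 0.5*(w.x - y)^2 w.r.t. w, linear model."""
    return (w @ x - y) * x

def tracin_influence(checkpoints, lrs, z, z_prime):
    """TracInCP: sum over checkpoints of lr * <grad L(w, z), grad L(w, z')>.

    `checkpoints` are interim weight vectors stored during training,
    `lrs` the learning rates used at those steps.
    """
    (xz, yz), (xp, yp) = z, z_prime
    return sum(
        lr * grad_loss(w, xz, yz) @ grad_loss(w, xp, yp)
        for w, lr in zip(checkpoints, lrs)
    )

# Toy example: two stored checkpoints of a 2-D linear regressor.
ckpts = [np.array([0.0, 0.0]), np.array([0.5, -0.2])]
lrs = [0.1, 0.1]
z = (np.array([1.0, 2.0]), 1.0)        # training sample
z_prime = (np.array([1.0, 2.0]), 1.0)  # identical test sample
print(tracin_influence(ckpts, lrs, z, z_prime) > 0)  # prints True
```

A positive score means updates on z decreased the loss on z' (z is a "proponent" of z' in the paper's terms); a negative score marks an "opponent".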
Understanding the Learned Model
- Loss landscape
- Deep features: t-SNE plot
Maaten, L. van der, and Hinton, G. Visualizing Data Using t-SNE. JMLR, 2008. https://distill.pub/2016/misread-tsne/
Understanding the Learned Model
- Class-wise patterns
- Intermediate layer activation map
- Activation/attention map
- One predictive pattern for each class
Li et al. Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks. ICLR 2021; Zhao, Shihao, et al. What Do Deep Nets Learn? Class-wise Patterns Revealed in the Input Space. arXiv:2101.06898 (2021).
What Do Deep Nets Learn?
Zhao, Shihao, et al. "What Do Deep Nets Learn? Class-wise Patterns Revealed in the Input Space." arXiv:2101.06898 (2021).
- Goal: understand the knowledge a model has learned about a particular class
- Method: extract one single pattern for one class; what would this pattern be?
- Other considerations: extraction is done in pixel space, as pixel-space patterns are more interpretable
How to Find the Class-wise Pattern
- Use a canvas image
- Patterns extracted on different canvases (red rectangles)

Class-wise Patterns Revealed
- Patterns extracted on original, non-robust, and robust CIFAR-10, and patterns of adversarially trained models
- Predictive power of different sizes of patterns
Inference Mechanism
- Class Activation Map (Grad-CAM)
- Guided Backpropagation
Selvaraju et al. Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization. ICCV 2017; Springenberg et al. Striving for Simplicity: The All Convolutional Net. ICLR 2015.

Guided Backpropagation
Springenberg et al. Striving for Simplicity: The All Convolutional Net. ICLR 2015. /@chinesh4/generalized-way-of-interpreting-cnns-a7d1b0178709
- ReLU forward pass
- ReLU backward pass
- Deconvolution for ReLU
- Guided backpropagation
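The three ReLU backward rules differ only in which positions they zero out; a minimal numpy sketch (the toy vectors are illustrative, not from the slides):

```python
import numpy as np

def relu_backward(grad_out, fwd_input):
    """Standard backprop: pass gradient where the forward input was positive."""
    return grad_out * (fwd_input > 0)

def relu_deconv(grad_out, fwd_input):
    """'Deconvnet' rule: pass gradient where the gradient itself is positive."""
    return grad_out * (grad_out > 0)

def relu_guided(grad_out, fwd_input):
    """Guided backprop: both conditions must hold."""
    return grad_out * (fwd_input > 0) * (grad_out > 0)

fwd = np.array([-1.0, 2.0, 3.0, -0.5])  # inputs to ReLU in the forward pass
g   = np.array([ 0.7, -0.3, 0.5, 0.2])  # gradient arriving from above
print(relu_backward(g, fwd))  # [ 0.  -0.3  0.5  0. ]
print(relu_guided(g, fwd))    # [ 0.   0.   0.5  0. ]
```

Keeping only positions where both the activation and the gradient are positive is what gives guided backpropagation its cleaner, less noisy saliency maps.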
Class Activation Mapping (CAM)
Zhou et al. Learning Deep Features for Discriminative Localization. CVPR 2016. /@chinesh4/generalized-way-of-interpreting-cnns-a7d1b0178709
- GAP: Global Average Pooling

Grad-CAM
B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, and A. Torralba. Learning Deep Features for Discriminative Localization. CVPR 2016.
- Grad-CAM is a generalization of CAM
- Compute neuron importance: weighted combination of activation maps, then interpolation
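Those two steps (gradient-pooled channel weights, then a ReLU'd weighted combination) fit in a few lines once the chosen conv layer's activations and gradients are available. Here they are supplied directly as numpy arrays; in practice they are captured with framework hooks:

```python
import numpy as np

def grad_cam(activations, gradients):
    """Grad-CAM heatmap from a conv layer's activations A (C,H,W) and
    the gradients dY_c/dA of the class score w.r.t. those activations.

    alpha_k = global-average-pooled gradient per channel (neuron importance);
    map     = ReLU(sum_k alpha_k * A_k), later upsampled to image size.
    """
    alphas = gradients.mean(axis=(1, 2))               # (C,) channel weights
    cam = np.einsum('c,chw->hw', alphas, activations)  # weighted combination
    return np.maximum(cam, 0)                          # keep positive evidence

# Toy 2-channel, 2x2 activation map.
A = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[0.0, 2.0], [0.0, 0.0]]])
dYdA = np.array([[[0.4, 0.4], [0.4, 0.4]],      # channel 0 helps the class
                 [[-0.2, -0.2], [-0.2, -0.2]]]) # channel 1 hurts it
print(grad_cam(A, dYdA))  # highlights only where channel 0 fires
```

The resulting coarse map is then interpolated to the input resolution and overlaid on the image.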
LIME
Local Interpretable Model-agnostic Explanations (LIME). Ribeiro et al. "'Why Should I Trust You?' Explaining the Predictions of Any Classifier." SIGKDD 2016. /marcotcr/lime
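A self-contained sketch of LIME's core loop, simplified from the cited paper: sample binary on/off masks around the instance, query the black box, and fit a proximity-weighted linear surrogate. The feature-zeroing perturbation and the exponential kernel here are simplifying assumptions (the real library perturbs superpixels for images):

```python
import numpy as np

def lime_explain(predict, x, num_samples=500, seed=0):
    """Fit a local linear surrogate to a black-box `predict` around x.

    Interpretable representation: binary mask over the features of x
    (1 = keep feature, 0 = zero it out). Returns per-feature weights.
    """
    rng = np.random.default_rng(seed)
    d = x.shape[0]
    Z = rng.integers(0, 2, size=(num_samples, d))   # random on/off masks
    y = np.array([predict(x * z) for z in Z])       # black-box queries
    # Proximity kernel: masks that keep more features count more.
    w = np.exp(-(d - Z.sum(axis=1)) / d)
    Zb = np.hstack([Z, np.ones((num_samples, 1))])  # add intercept column
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(Zb * sw[:, None], y * sw, rcond=None)
    return coef[:-1]                                # drop the intercept

# Black box that secretly depends only on feature 0.
f = lambda v: 3.0 * v[0]
weights = lime_explain(f, np.array([2.0, 5.0, 1.0]))
print(weights)  # feature 0 gets weight ~6, the others ~0
```

The surrogate's weights are the explanation: here the linear fit correctly attributes the whole prediction to feature 0 and nothing to the irrelevant features.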
Integrated Gradients
Sundararajan, M., Taly, A., and Yan, Q. Axiomatic Attribution for Deep Networks. ICML 2017. /TianhongDai/integrated-gradient-pytorch
- Integrate the gradients along the way (from a baseline to the input)
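Integrated Gradients in its Riemann-sum form, demonstrated on a toy function with an analytic gradient so the completeness axiom (attributions sum to f(x) - f(baseline)) can be checked; the toy f and zero baseline are illustrative assumptions:

```python
import numpy as np

def integrated_gradients(grad_f, x, baseline, steps=200):
    """IG_i(x) = (x_i - x'_i) * integral_0^1 df/dx_i(x' + a(x - x')) da,
    approximated with a midpoint Riemann sum over `steps` points."""
    alphas = (np.arange(steps) + 0.5) / steps
    total = np.zeros_like(x)
    for a in alphas:
        total += grad_f(baseline + a * (x - baseline))
    return (x - baseline) * total / steps

# Toy model f(x) = x0^2 + 3*x1 with its analytic gradient.
f = lambda v: v[0] ** 2 + 3 * v[1]
grad_f = lambda v: np.array([2 * v[0], 3.0])

x, base = np.array([2.0, 1.0]), np.zeros(2)
attr = integrated_gradients(grad_f, x, base)
# Completeness: attributions sum to f(x) - f(baseline) = 7.
print(attr, attr.sum())
```

For a deep network, grad_f would be the model's input gradient for the target class; the baseline is typically a black or blurred image.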
Cognitive Distillation
Huang et al. Distilling Cognitive Backdoor Patterns within an Image. ICLR 2023.
- Mask extracted by cognitive distillation

Useful and Non-useful Features
- Useful features: highly correlated with the true label in expectation; if removed, the prediction changes
- A backdoor trigger is a useful feature
- Non-useful features: not correlated with the prediction; if removed, the prediction does not change
Ilyas, Andrew, et al. "Adversarial Examples Are Not Bugs, They Are Features." NeurIPS 2019.

Cognitive Distillation
- Objective: distill the minimal essence of useful features
- Ingredients: model, total variation (TV) loss, random noise vector, original image, mask, cognitive pattern
- Distilled patterns on backdoored samples (symbols: x_cp, m, x)
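Huang et al. (ICLR 2023) formalize this as a mask optimization; the reconstruction below is assembled from the ingredients the slide lists (α and β weighting the sparsity and TV terms are the paper's hyperparameters):

```latex
x_{cp} \;=\; x \odot m \;+\; (1 - m) \odot \delta,
\qquad \delta \;\text{a random noise vector}

\min_{m}\;\; \big\| f(x) - f(x_{cp}) \big\|_{1}
\;+\; \alpha\, \|m\|_{1}
\;+\; \beta\, \mathrm{TV}(m)
```

The first term forces the masked image x_cp to keep the model's output, the L1 term makes the mask m minimal, and the TV loss keeps it smooth; the distilled cognitive pattern is the part of the image the model actually relies on.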
How to Verify Cognitive Patterns Are Essential
- Backdoored image, binarized mask {0,1}, original image
- Construct simplified backdoor patterns

Backdoor Patterns Can Be Made Simpler
- (symbols: x_cp, m, x, x_bd')
- Simplified backdoor patterns also work!
- L1-norm distribution of the distilled mask

Detect Backdoor Samples
- Attacks: 12 backdoor attacks
- Models: ResNet-18, Pre-Activation ResNet-101, MobileNet-v2, VGG-16, Inception, EfficientNet-b0
- Datasets: CIFAR-10 / GTSRB / ImageNet subset
- Evaluation metric: area under the ROC curve (AUROC)
- Detection baselines: Anti-Backdoor Learning (ABL) [2], Activation Clustering (AC) [3], Frequency [4], STRIP [5], Spectral Signatures [6]
- CD-L (logits layer) and CD-F (last activation layer)
Superb Detection Performance

Discover Biases in Facial Recognition Models
- CelebA dataset: 40 binary facial attributes (gender, bald, hair color)
- Known bias between gender and blond hair
- Apply CD in the same way as backdoor detection: select the subset of samples with a low L1 norm, examine the attributes of the subset, and calculate the distribution shift between the subset and the full dataset
- Masks distilled for predicting each attribute
Generalization Mechanism
- Convergence and generalization

Deep Learning Theory
- Convergence: convex (linear model) vs. nonconvex (DNN); saddle points
- Generalization: 'Cat' at training time; 'Cat'? at test time
- Traditional theory: a simpler model is better; more data is better
Generalization Theory
/~ninamf/ML11/lect1117.pdf; /watch?v=zlqQ7VRba2Y
Components of generalization error bounds:
- generalization error
- empirical error
- hypothesis class complexity
- confidence
- sample size
RHS: for all terms, the lower the better: small training error, simpler model class, more samples, less confidence.
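For a finite hypothesis class these components combine into the classical bound of statistical learning theory: with probability at least 1 - δ over the draw of n samples, for every h in H,

```latex
\underbrace{R(h)}_{\text{generalization error}}
\;\le\;
\underbrace{\hat{R}(h)}_{\text{empirical error}}
\;+\;
\sqrt{\frac{\overbrace{\ln|\mathcal{H}|}^{\text{complexity}}
\;+\; \overbrace{\ln(1/\delta)}^{\text{confidence}}}{2\,\underbrace{n}_{\text{sample size}}}}
```

Each right-hand term being smaller (lower training error, simpler hypothesis class, more samples, larger δ and hence less confidence) tightens the bound, exactly as the list above states.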
Generalization Theory
Zhang et al. Understanding Deep Learning Requires Rethinking Generalization. ICLR 2017.
- Small training error ≠ low generalization error
- Zero training error was achieved on purely random labels (meaningless learning)
- 0 training error vs. 0.9 test error
List of Existing Theories
- Rademacher complexity bounds (Bartlett et al. 2017)
- PAC-Bayes bounds (Dziugaite and Roy 2017)
- Information bottleneck (Tishby and Zaslavsky 2015)
- Neural tangent kernel / lazy training (Jacot et al. 2018)
- Mean-field analysis (Chizat and Bach 2018)
- Double descent (Belkin et al. 2019)
- Entropy-SGD (Chaudhari et al. 2019)
/watch?v=zlqQ7VRba2Y
A few interesting questions:
- Should we consider the role of data in generalization analysis?
- Should representation quality appear in the generalization bound?
- Is generalization about math (the function of the model) or about knowledge?
- How to visualize generalization?
Existing approaches
- Test error
- Visualization: loss landscape, prediction attribution, etc.
- Training -> test: distribution shift, out-of-distribution analysis
- Noisy labels in test data: questioning data quality and reliable evaluation
- The remaining question: how does generalization happen?

Math ≠ Knowledge
- Computation = finding patterns, or understanding the underlying knowledge?
- What is the relation of computational generalization to human behavior?
Cognitive Mechanism
- OpenAI reveals the multimodal neurons in CLIP
/blog/multimodal-neurons/; /blog/clip/

Cognitive Mechanism
Ritter et al. Cognitive Psychology for Deep Neural Networks: A Shape Bias Case Study. ICML 2017.
- Cognitive-psychology-inspired evaluation of DNNs
- A high shape-match probability indicates a shape bias

Cognitive Mechanism
Geirhos, Robert, et al. "Shortcut Learning in Deep Neural Networks." Nature Machine Intelligence 2.11 (2020): 665-673.
- Deep neural networks solve problems by taking shortcuts

Cognitive Mechanism
Rajalingham, Rishi, et al. "Large-scale, High-resolution Comparison of the Core Visual Object Recognition Behavior of Humans, Monkeys, and State-of-the-art Deep Artificial Neural Networks." Journal of Neuroscience 38.33 (2018): 7255-7269.
Rajalingham, Rishi, Kailyn Schmidt, and James J. DiCarlo. "Comparison of Object Recognition Behavior in Human and Monkey." Journal of Neuroscience 35.35 (2015): 12127-121