Encoding Feature Maps of CNNs for Action Recognition
Xiaojiang Peng, Cordelia Schmid
LEAR-Inria, Grenoble, France

Summary of LEAR Submission
- Improved DT and Fisher vector
- CNN features from very deep ConvNets
- Key component: encoding CNN feature maps

Improved DT and Fisher vector
Overview: Input video -> IDT (HOG/HOF/MBH) -> Fisher vector -> Power+Intra+L2-norm
Details:
- Set videos to be at most 320p wide
- Preprocess IDT features by PCA-Whiten with a factor of 2
- Perform power+intra+L2 normalization for FVs

Intra-normalization
- Good way to suppress bursty visual elements
- Perform L2 normalization for each FV block independently (mean and variance components are separated) [1, 2]

[1] Arandjelovic, R., Zisserman, A.: All about VLAD. CVPR, 2013.
[2] X. Peng, Y. Qiao: Bag of visual words and fusion methods for action recognition: Comprehensive study and good practice. arXiv:1405.4506, 2014.

CNN features (1): Very deep ConvNets (VGG19) fine-tuning
- Training data: all frames on UCF101 and every 5 frames on Thumos 2015 validation set (256x256x3)
- Data augmentation: cropping and flipping at the four corners and the center; inputs are 224x224x3
- Batch size: 16 (our GPU memory limitation)
- Dropout: 0.5 for fc6 and fc7
- Max iterations: 20k

CNN features (2): Feature extraction
- Rescale frames to 224x224x3 and feed-forward them
- Keep conv5_4 maps and fc6 activations

CNN features (3): Conv5_4 feature maps are local features
[Figure: video, frame Fi, Conv5 feature maps, local features for Fi; axis labels w, h, 512]
Each pixel (pink square in the middle image) in the Conv5 feature map is actually a feature for the corresponding patch in the original frame. We obtain w*h 512-D features for frame Fi.

CNN features (4): Encoding feature maps
- Preprocess Conv5_4 local features by PCA-Whiten with a factor of 2
- Construct a codebook of size 256 using k-means
- Apply VLAD encoding and power+intra+L2 normalization for video representations
[Figure: pipeline with the labels Input video, Video frames, IDT features, Fisher vector, VGG19 Conv5, VLAD, Power+Intra+L2 norm, Fusion & SVM]

Experiments (1): Train/test setup
- Tr1: train on UCF101, test on the validation set and report the mAP
- Tr2: 10 train/test folds on the validation set; report the mean and the standard deviation of mAP

Baseline: IDT+FV

                        Tr1       Tr2
IDT (HOG/HOF/MBH)+FV    52.23%    69.19±0.8%

Experiments (2): Evaluation of CNN features and pooling methods
Conclusions:
- Conv5_4 is better than fc6 without fine-tuning
- The original CNN model is trained to abstract concepts for object classification rather than action recognition
- Encoding Conv5_4 feature maps significantly outperforms the others

Table 1. Evaluation of Conv5_4 and fc6 without fine-tuning

           Tr1                                   Tr2
           Avg-pooling   Max-pooling   VLAD      VLAD
Conv5_4    46.02%        34.3%         56.95%    68.7±1.1%
fc6        39.38%        28.38%        -

Experiments (3): Evaluation of CNN features and pooling methods
Conclusions:
- Fine-tuning does improve performance
- Conv5+VLAD gives a large improvement when using "Tr1", but only a small one when using "Tr2"
- The difference between "Tr1" and "Tr2" suggests that the appearance of UCF101 and of the validation set is very different

Table 2. Evaluation of Conv5_4 and fc6 with fine-tuning

           Tr1                      Tr2
           Avg-pooling   VLAD       Avg-pooling   VLAD
Conv5_4    57.47%        63.87%     69.32±1.1%    74.36±1.3%
fc6        47.07%        -          72.32±1.1%    -

Experiments (4): Feature combinations

Index   Method                               Tr1       Tr2
1       IDT (HOG+HOF+MBH)                    52.23%    69.19±0.8%
2       Conv5-VLAD                           63.87%    74.36±1.3%
3       Conv5-avg                            57.47%    69.32±1.1%
4       fc6-avg                              47.07%    72.32±1.1%
5       IDT+Conv5-VLAD                       65.11%    76.21±1.0%
6       IDT+Conv5-avg                        62.95%    75.38±0.8%
7       IDT+fc6-avg                          58.59%    76.1±1.0%
8       IDT+Conv5-VLAD+Conv5-avg             66.17%    77.69±0.9%
9       IDT+Conv5-VLAD+fc6-avg               64.84%    79.36±1.0%
10      IDT+Conv5-VLAD+Conv5-avg+fc6-avg     66.64%    79.52±1.1%
11      IDT+Conv5-VLAD+Conv5-avg+softmax     -         87.45±0.8%

Experiments (4): Conclusions
- CNN features complement the IDT features
- All independent CNN-based methods outperform IDT+FV when using "Tr2"
- Conv5-avg and fc6-avg complement the IDT and Conv5-VLAD features; see 5 vs. 8 vs. 10 (table in last slide)
- Conv5-avg and fc6-avg capture global information while Conv5-VLAD does not

Test results

Run    Method (+ IDT-FV)                        Tr1       Tr2           Post-pro   Test mAP
Run1   Conv5-VLAD+Conv5-avg (6FPS)              66.17%    77.69±0.9%    -          68.13%
Run2   Conv5-VLAD+Conv5-avg (6FPS)              66.17%    77.69±0.9%    +          68.11%
Run3   Conv5-VLAD+Conv5-avg+softmax (1FPS)      -         87.45±0.8%    +          53.95%
Run4   Conv5-VLAD+Conv5-avg (1FPS)              -                       +          67.38%
Run5   Conv5-VLAD+Conv5-avg+fc6-avg (6FPS)      66.64%    79.52±1.1%    +          67.93%

Conclusions
- Combining fc6 features doesn't improve test results
- Combining softmax scores leads to overfitting
- The train/test setup is important, since different setups lead to different observations

Code available: http://lear.inrialpes.fr/software
- Improved dense trajectories
- Fisher vector encoding
- VGG19 fine-tuning model (coming soon)

Statistics on validation set (1)
The three easiest classes: UnevenBars, RockClimbingIndoor, Skiing

Statistics on validation set (2)
The three easiest classes: CricketShot, Haircut, Basketball

Easiest classes
Snapshots from
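
The encoding steps described under "CNN features (3)" and "CNN features (4)" can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' released code. The function names `feature_map_to_local` and `vlad_encode`, the channels-first `(C, h, w)` layout, the power exponent `alpha=0.5`, and the toy sizes are all assumptions made for illustration. PCA-whitening and k-means training are omitted, so the codebook is simply passed in.

```python
import numpy as np


def feature_map_to_local(fmap):
    """Turn one feature map of shape (C, h, w) into h*w local features of dim C.

    The channels-first layout is an assumption, not taken from the slides.
    """
    c = fmap.shape[0]
    return np.transpose(fmap, (1, 2, 0)).reshape(-1, c)


def vlad_encode(local_feats, codebook, alpha=0.5, eps=1e-12):
    """Encode local features (n, d) against a codebook (k, d) into a k*d vector."""
    k, d = codebook.shape
    # Hard-assign each feature to its nearest codeword (squared Euclidean distance).
    dists = ((local_feats[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    assign = dists.argmin(axis=1)

    # Accumulate residuals (feature minus codeword) per codeword block.
    vlad = np.zeros((k, d))
    for j in range(k):
        members = local_feats[assign == j]
        if len(members) > 0:
            vlad[j] = (members - codebook[j]).sum(axis=0)

    # Power normalization: sign(x) * |x|^alpha (alpha is an assumed value).
    vlad = np.sign(vlad) * np.abs(vlad) ** alpha
    # Intra-normalization: L2-normalize each codeword block independently.
    block_norms = np.linalg.norm(vlad, axis=1, keepdims=True)
    vlad = vlad / np.maximum(block_norms, eps)
    # Final L2 normalization of the whole vector.
    flat = vlad.ravel()
    return flat / max(np.linalg.norm(flat), eps)


# Toy usage with random data standing in for real Conv5_4 maps and a k-means codebook.
rng = np.random.default_rng(0)
toy_map = rng.normal(size=(8, 14, 14))      # (C, h, w), hypothetical sizes
toy_codebook = rng.normal(size=(4, 8))      # (k, d), hypothetical sizes
local = feature_map_to_local(toy_map)
video_vec = vlad_encode(local, toy_codebook)
print(local.shape, video_vec.shape)
```

For a whole video, the local features of all sampled frames would be stacked before calling `vlad_encode`. The pairwise distance array here is built densely for readability, so a real pipeline with a 256-word codebook would likely compute assignments in batches.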