Object Detection Based on Deep Neural Networks

Uploaded by: a*** | IP location: Hubei | Uploaded: 2021-12-20 | Format: PPTX | Pages: 30 | Size: 5.41 MB

Document overview

Step 2. Use selective search to obtain 2k proposals.
Step 3. Warp each proposal and apply a CNN to extract its features.
Step 4. Use class-specific SVMs to score each proposal.
Step 5. Rank the proposals and use NMS to get the bounding boxes.
Step 6. Use class-specific regressors to refine the bounding-box positions.
Drawback: TOO SLOW.
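Step 5 above ranks proposals by score and applies non-maximum suppression (NMS). The following is a minimal pure-Python sketch of greedy NMS, not the deck's own implementation. The `(x1, y1, x2, y2)` box format, the function names, and the 0.5 IoU threshold are assumptions chosen for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of kept boxes, highest score first."""
    # Rank the proposals by score (the "rank" part of Step 5).
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Hypothetical example: the second box overlaps the first heavily.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

In R-CNN this would be run once per class on the SVM scores. The sketch handles a single class only.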

SPP-Net

R-CNN has three problems: multi-stage training, wasted storage space, and slow testing (47 s).

SPP-Net: Motivation
- Cropping may lose some information about the object.
- Warping may change the object's appearance.
FC layers need a fixed-length input, while conv layers can adapt to an arbitrary input size. A bridge is therefore needed between the conv and FC layers: the SPP layer.

SPP-Net: Training for Detection (1)
Figure labels: image pyramid, conv, feature-map pyramid (Conv5 feature maps).
Step 1. Generate an image pyramid and extract the conv feature map of the whole image.

SPP-Net: Training for Detection (2)
Step 2. For each proposal, walk the image pyramid and find the projected version whose number of pixels is closest to 224x224 (for scale invariance in training).
Step 3. Find the corresponding region in the Conv5 feature map and use the SPP layer to pool it to a fixed size.
Step 4. Once the features of all proposals are available, fine-tune the FC layers only.
Step 5. Train the class-specific SVMs.

SPP-Net: Testing for Detection
Almost the same as R-CNN, except for Step 3.

Speed: 64x faster than R-CNN using one scale, and 24x faster using a five-scale pyramid.
mAP: +1.2 mAP vs. R-CNN.

Remaining problems of SPP-Net:
1. Training is split into multiple stages and is not an end-to-end process (figure labels: conv layers, FC layers, SVM, regressor, store).
2. Training costs too much disk space and time.
3. Training SPP-Net fine-tunes only the fully connected layers (detection needs location information as well as semantic information, and repeated pooling blurs the location information).

Fast R-CNN

Fast R-CNN: Motivation
Ross Girshick, "Fast R-CNN", arXiv tech report.
JOINT TRAINING!
- Multi-task loss function
- RoI pooling layer
- Feature extraction and classification are placed in one network and trained jointly.

Fast R-CNN: Joint Training Framework
Join the feature extractor, classifier, and regressor together in a unified framework.
RoI (candidate region): image index plus geometric position.

Fast R-CNN: RoI pooling layer
A one-scale SPP layer.

Fast R-CNN: Regression Loss
A smooth L1 loss, which is less sensitive to outliers than an L2 loss.
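The claim that the smooth L1 loss is less sensitive to outliers than an L2 loss can be checked numerically. The sketch below uses the definition from the Fast R-CNN paper (quadratic for |x| < 1, linear otherwise). The function names and sample residuals are illustrative assumptions.

```python
def smooth_l1(x):
    """Smooth L1: 0.5 * x^2 if |x| < 1, else |x| - 0.5."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5


def squared_error(x):
    """Plain L2-style penalty used here only for comparison."""
    return x * x


# Hypothetical regression residuals; the last one is an outlier.
residuals = [0.1, 0.5, 2.0, 10.0]
for r in residuals:
    print(f"residual={r:5.1f}  smooth_l1={smooth_l1(r):7.3f}  squared={squared_error(r):7.3f}")
```

For large residuals the smooth L1 penalty grows linearly while the squared penalty grows quadratically, so a single badly localized box contributes far less to the gradient.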

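The SPP layer and the RoI pooling layer described above both turn a region of arbitrary size into a fixed-length vector. The following is a rough single-channel sketch in pure Python, under stated assumptions: the bin boundaries use a simple floor/ceil split, which may differ from the window and stride rule in the original papers, and the pyramid levels `(4, 2, 1)` and all function names are chosen for illustration.

```python
import math


def max_pool_grid(fmap, n):
    """Max-pool a 2-D feature map (list of rows) into n x n bins, row-major."""
    h, w = len(fmap), len(fmap[0])
    out = []
    for i in range(n):
        r0, r1 = math.floor(i * h / n), math.ceil((i + 1) * h / n)
        for j in range(n):
            c0, c1 = math.floor(j * w / n), math.ceil((j + 1) * w / n)
            out.append(max(fmap[r][c] for r in range(r0, r1) for c in range(c0, c1)))
    return out


def spp(fmap, levels=(4, 2, 1)):
    """Spatial pyramid pooling: concatenate max-pooled grids of several sizes."""
    feats = []
    for n in levels:
        feats.extend(max_pool_grid(fmap, n))
    return feats


def roi_pool(fmap, roi, n=2):
    """RoI pooling as a one-level SPP over a crop.

    roi = (row0, col0, row1, col1) in feature-map cells, end-exclusive.
    """
    r0, c0, r1, c1 = roi
    crop = [row[c0:c1] for row in fmap[r0:r1]]
    return max_pool_grid(crop, n)


# Two maps of different sizes give vectors of the same length (16 + 4 + 1 = 21).
small = [[r * 7 + c for c in range(7)] for r in range(5)]
large = [[r * 9 + c for c in range(9)] for r in range(13)]
fixed_small, fixed_large = spp(small), spp(large)
pooled_roi = roi_pool(small, (0, 0, 4, 4), n=2)
```

Because the output length depends only on the pyramid levels, the FC layers can consume proposals of any size without cropping or warping the image.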
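Image pyramids (multi-scale) vs. brute force (single-scale). Figure labels: conv, Conv5 feature map.
In practice, a single scale is good enough. (This is the main reason why it can be 10x faster than SPP-Net.)

Fast R-CNN: Other tricks
 - Training classification and bounding-box adjustment together at the end of the network improves accuracy.
 - Using multi-scale image pyramids brings almost no improvement.
 - Doubling the training data improves accuracy by 2%-3%.
 - Letting the network output class probabilities directly (softmax) performs slightly better than SVM classifiers.
 - More candidate windows do not improve performance.

Compared with R-CNN, Fast R-CNN reduces training time from 84 hours to 9.5 hours and test time from 47 seconds to 0.32 seconds. Accuracy on PASCAL VOC 2007 is about the same, roughly 66%-67%.

Remaining problems of Fast R-CNN:
1. Region proposals are time-consuming (extracting region proposals takes 2-3 s, while feature extraction and classification need only 0.32 s).
2. Training is only pseudo end-to-end (region proposals are extracted beforehand with selective search and take up disk storage).

Faster R-CNN
A convolutional network produces the candidate regions directly. The RPN is essentially a sliding window.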
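The Fast R-CNN timing figures quoted above can be turned into speedups and into the share of test time spent on region proposals, which is the bottleneck that motivates Faster R-CNN. This is plain arithmetic on the slide's numbers. Reading the proposal time as 2 to 3 s per image is an assumption about the garbled source figure, and the variable names are illustrative.

```python
# Figures taken from the slides.
rcnn_train_h, fast_train_h = 84.0, 9.5    # training time in hours
rcnn_test_s, fast_test_s = 47.0, 0.32     # test time in seconds

train_speedup = rcnn_train_h / fast_train_h
test_speedup = rcnn_test_s / fast_test_s


def proposal_share(proposal_s, detect_s=fast_test_s):
    """Fraction of total per-image time spent generating proposals."""
    return proposal_s / (proposal_s + detect_s)


# Assumed proposal times of 2 s and 3 s (selective search).
shares = [proposal_share(t) for t in (2.0, 3.0)]

print(f"training speedup: {train_speedup:.1f}x")
print(f"test speedup:     {test_speedup:.1f}x")
print("proposal share:   " + ", ".join(f"{s:.0%}" for s in shares))
```

Under these assumptions, proposal generation accounts for well over four fifths of the per-image time once the detector itself takes 0.32 s.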

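 - Sliding window (over the last convolutional layer)
 - Anchor mechanism
 - Bounding-box regression
Together these yield candidate regions at multiple scales and aspect ratios.

With a simple network, detection runs at 17 fps with 59.9% accuracy on PASCAL VOC. With a complex network, it runs at 5 fps with 78.8% accuracy. About 20,000 anchors.

Training steps:
1. Initialize the network parameters from a model pre-trained on ImageNet and fine-tune the RPN.
2. Use the network from step 1 to extract candidate regions and train Fast R-CNN.
3. Re-initialize the RPN with the Fast R-CNN from step 2, fix the conv layers, and fine-tune.
4. Fix the conv layers of the Fast R-CNN from step 2 and fine-tune using the candidates extracted by the RPN from step 3.

Remaining problems of Faster R-CNN:
1. It cannot run in real time.
2. Candidate regions are obtained first and each proposal is then classified, which is computationally expensive.

Regression-based: YOLO
(1) Given an input image, first divide it into a 7*7 grid.
(2) For each grid cell, predict 2 bounding boxes (including each box's confidence that it contains an object and each box region's probabilities over the classes).
(3) The previous step yields 7*7*2 candidate windows. Windows with low likelihood are removed by a threshold, and NMS then removes the redundant windows.
The enhanced version runs at 45 fps on a GPU. The simplified version runs at 155 fps.

 - YOLO can process 45 images per second.
 - The network uses information from the whole image when predicting each window.
 - Regressing from only a 7*7 grid means objects cannot be localized very precisely.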
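YOLO steps (2) and (3) above can be sketched as enumerating the 7*7*2 candidate windows and dropping those below a score threshold, after which NMS would remove the remaining duplicates. This is an illustrative sketch only. Scoring a box as confidence times its best class probability, the 0.2 threshold, the data layout, and all names are assumptions, not details given in the slides.

```python
S, B = 7, 2  # grid size and boxes per cell, as stated in the slides


def candidate_windows(confidence, class_probs, threshold=0.2):
    """Return (row, col, box, class_index, score) for boxes above the threshold.

    confidence[r][c][b]  : objectness confidence of box b in cell (r, c)
    class_probs[r][c][b] : list of class probabilities for that box
    """
    kept = []
    for r in range(S):
        for c in range(S):
            for b in range(B):
                probs = class_probs[r][c][b]
                cls = max(range(len(probs)), key=probs.__getitem__)
                score = confidence[r][c][b] * probs[cls]  # assumed scoring rule
                if score >= threshold:
                    kept.append((r, c, b, cls, score))
    return kept


# Hypothetical predictions: every box is weak except one.
confidence = [[[0.05] * B for _ in range(S)] for _ in range(S)]
class_probs = [[[[0.2, 0.8] for _ in range(B)] for _ in range(S)] for _ in range(S)]
confidence[3][4][1] = 0.9

total_windows = S * S * B
survivors = candidate_windows(confidence, class_probs)
```

Since only 98 windows are scored per image, this stage is cheap compared with classifying thousands of region proposals.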

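Returning to the anchor mechanism in the Faster R-CNN notes above, the sketch below shows how sliding over the last conv feature map with a few scales and aspect ratios per position reaches roughly 20,000 anchors. The three scales (128, 256, 512), three ratios (1:2, 1:1, 2:1), stride 16, and the 40 x 60 feature map are values commonly cited from the Faster R-CNN paper. They are assumptions here and are not stated in the slides.

```python
import math


def anchors_at(cx, cy, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Anchors (x1, y1, x2, y2) centered at (cx, cy).

    Each anchor has area scale**2 and aspect ratio height / width = ratio.
    """
    boxes = []
    for s in scales:
        for r in ratios:
            w = s / math.sqrt(r)
            h = s * math.sqrt(r)
            boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return boxes


def all_anchors(feat_h, feat_w, stride=16):
    """Slide over every feature-map cell and emit its anchors in image coordinates."""
    out = []
    for i in range(feat_h):
        for j in range(feat_w):
            cx, cy = (j + 0.5) * stride, (i + 0.5) * stride
            out.extend(anchors_at(cx, cy))
    return out


anchors = all_anchors(40, 60)  # assumed feature-map size
print(len(anchors))            # 40 * 60 * 9 anchors
```

The RPN then scores each anchor as object or background and regresses offsets from it, which is how a plain sliding window produces proposals at multiple scales and aspect ratios.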