Unsupervised Feature Learning for Aerial Scene Classification

Anil M. Cheriyadat, Member, IEEE
Oak Ridge National Laboratory, Oak Ridge, TN 37831 USA

Abstract—The rich data provided by high-resolution satellite imagery allow us to directly model aerial scenes by understanding their spatial and structural patterns. While pixel- and object-based classification approaches are widely used for satellite image analysis, these approaches often exploit the high-fidelity image data in a limited way. In this paper, we explore an unsupervised feature learning approach for scene classification. Dense low-level feature descriptors are extracted to characterize the local spatial patterns. These unlabeled feature measurements are exploited in a novel way to learn a set of basis functions. The low-level feature descriptors are encoded in terms of the basis functions to generate a new sparse representation for the feature descriptors. We show that the statistics generated from the sparse features characterize the scene well, producing excellent classification accuracy. We apply our technique to several challenging aerial scene data sets: the ORNL-I data set, consisting of 1-m spatial resolution satellite imagery with diverse sensor and scene characteristics representing five land-use categories; the UCMERCED data set, representing twenty-one different aerial scene categories with submeter resolution; and the ORNL-II data set for large-facility scene detection. Our results are highly promising, and on the UCMERCED data set we outperform the previous best reported results. We demonstrate that the proposed aerial scene classification method can be highly effective in developing a detection system that can automatically scan large-scale high-resolution satellite imagery for large facilities such as shopping malls.

Index Terms—Aerial data, basis function, classification, codebook, dictionary, encoding, feature learning, sparse coding.

I. INTRODUCTION

The high-fidelity image data provided by new and advanced spaceborne sensors offer fresh opportunities to characterize aerial scenes based on the spatial and structural patterns encoded in the imagery. Efficient representation and recognition of scenes from image data are challenging problems. Most previous approaches for high-resolution satellite image analysis [2]–[6] focus on classifying pixels or objects (groups of local homogeneous pixels) into their thematic classes by extracting spectral, textural, and geometrical attributes as classification features. In contrast, we focus on directly modeling scenes by exploiting the variations in the local spatial arrangements and structural patterns captured by low-level features. Our approach allows us to develop a holistic representation for aerial scenes that does not require intermediate stages of segmentation and representation of individual geospatial objects. The proposed unsupervised feature learning and encoding strategy maps low-level feature descriptors to a new representation that is highly accurate in characterizing different aerial scenes. Fig. 1 shows a few example images representing the various aerial scenes dealt with in this paper.

Fig. 1. Top row shows example images from the ORNL-I data set, which contains five scene categories—(1) commercial, (2) large facility, (3) suburban, (4) agricultural, and (5) wooded. The remaining rows show sample images from the UCMERCED data set [1], associated with 21 land-use categories—(6) airplane, (7) baseball diamond, (8) beach, (9) buildings, (10) freeway, (11) runway, (12) tennis court, (13) harbor, (14) mobile home park, (15) parking lot, (16) storage tanks, (17) dense residential, (18) chaparral, (19) river, (20) agricultural, (21) overpass, (22) forest, (23) golf course, (24) intersection, (25) medium residential, (26) sparse residential.

With high-resolution image data, aerial scenes are often composed of different and distinct thematic classes. For example, an image patch associated with a scene representing the commercial or large-facility class might comprise several thematic classes such as roads, buildings, trees, impervious surfaces, and parking lots. Encoding the local structural and spatial scene attributes in an efficient and robust fashion is the key to generating discriminative models for classifying such aerial scenes.

Direct modeling of aerial scenes based on low-level feature statistics is a popular idea. Bag-of-visual-words (BOVW) [7] is a feature encoding approach that has been well explored for scene classification. Recent studies [8], [9] have shown that sparse coding of features is highly effective for scene classification compared with traditional BOVW approaches. Our proposed method involves generating a set of basis functions from unlabeled features. The low-level feature descriptors extracted from the scene are encoded in terms of these basis functions to produce sparse feature representations. We show that simple statistics generated from these sparse features characterize the scene well, producing significant improvements in scene classification accuracy compared with the existing approaches reported in [10], [11]. The proposed sparse feature representation works with a linear classification model, yet it outperforms other methods that use complex nonlinear classification models. We also evaluate the classification performance of various low-level feature measurements, namely raw pixel intensity values, oriented filter responses, and local scale-invariant feature transform (SIFT)-based feature descriptors [12].

The major contributions of this paper are:
- an unsupervised feature learning approach that generates features from different low-level measurements such as raw pixel intensities, oriented filter responses, and SIFT feature descriptors;
- an evaluation of the methodology on different and diverse data sets; and
- a detection system, built on the proposed feature extraction and learning approaches, for detecting large facilities in large-scale high-resolution aerial imagery.

The rest of the paper is organized as follows. In Section II, we briefly review recent and relevant work on high-resolution satellite image classification that exploits spatial features. In Section III, we describe our unsupervised feature learning approach in detail. Section IV presents the overall classification framework. Experiments and results are reported in Sections V and VI. Section VII concludes the paper with a discussion of the findings and ideas for extending the work.

II. RELATED WORK

We start by reviewing some recent works that exploit spatial context for high-resolution satellite image classification. Bruzzone and Carlin [13] proposed a spatial-context-driven feature extraction strategy for pixel classification in high-resolution images. First, image segmentation is performed at different scales, and the segments containing a pixel serve as that pixel's spatial context. Simple spectral statistics associated with a segment, along with geometrical features computed from the segment, are used as features. Similarly, Shackelford and Davis [5] combined pixel- and object-based features to generate an object-level classification of the image. Initially, spectral and textural features are used to generate pixel-level fuzzy classification labels; statistics computed over the soft classification labels, spectral measurements, and geometrical attributes associated with the segments are then used as classification features. In both cases, however, the success of the classification depends heavily on the quality of the segmentation. Bellens et al. [4] exploited morphological profiles computed from opening and closing operations on the image; geometrical attributes associated with the morphological profiles are combined with spectral measurements to generate pixel features. Most of these approaches classify the image into thematic classes such as buildings, roofs, roads, trees, and impervious surfaces.

Earlier, in contrast to the above approaches, Unsalan and Boyer [14] showed that an intermediate representation of the scene based on local line parameters is an effective way to represent different geospatial neighborhoods. Statistical measures derived from line length, contrast, and orientation distributions provide a unique lower-dimensional representation for different scene categories. Similarly, Huang et al. [15] explored a related idea based on directional lines for generating pixel features. The gray-level similarity among pixels at certain distances and orientations is calculated to determine possible direction lines, and statistics computed from the directional line-length histogram associated with each pixel form the feature vector. However, the direction lines passing through a pixel are detected using heuristically determined thresholds, and these line-based approaches are limited in their ability to model diverse sets of neighborhood classes.

Lately, BOVW-based approaches have been examined closely for various aerial scene classification purposes. The basic BOVW approach can be broadly divided into two parts—feature learning and feature encoding. During feature learning, low-level image features are clustered, and the cluster centers form the visual words. Later, in the feature encoding step, each low-level feature extracted from an image is mapped to its closest visual word, and the visual-word histogram computed over the image forms the new feature. In [16], simple image statistics such as the local mean and variance of pixel intensities were clustered to form the visual words. In [17], additional low-level features such as edge orientations, oriented filter responses, line parameters, and color histograms were used to generate the visual words. In both cases, the authors applied latent Dirichlet allocation (LDA), an unsupervised generative framework, to model the word distributions. The spatial pyramid matching kernel (SPMK), introduced by Lazebnik et al. [10], is an interesting approach to pooling visual words: local visual-word histograms computed at different scales and spatial bins, defined by a spatial pyramid representation of the image, are concatenated to produce better scene representations. Yang and Newsam [11] computed co-occurrences of visual words with respect to certain spatial predicates to generate a higher-order visual-word distribution model, which they combined with the BOVW approach to obtain a spatial extension of the latter. They reported higher classification accuracy for their extended spatial co-occurrence kernel (SPCK++) than for the traditional BOVW and SPMK approaches. However, to achieve good performance, SPMK and SPCK++ often need to be used with nonlinear Mercer kernels, such as the histogram intersection kernel and the chi-square kernel, whose computational complexities are high compared with linear kernels. All of the above approaches rely on K-means clustering to map features to visual words and are limited in their feature representation for classification [18].

Recently, a linear alternative to the SPMK approach was proposed in [18]. The key idea is to employ sparse coding to generate more succinct representations of the low-level image features; the sparse features, when combined with the SPMK framework, generate feature representations that can be used with linear kernels. However, sparse-code generation turned out to be computationally expensive. Earlier, we showed in [19] that our sparse coding framework is highly efficient, producing sparse features at significantly lower computational cost than the previous approach.
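For contrast with the sparse coding approach developed below, the following is a minimal sketch of the basic BOVW pipeline described above. It is illustrative only: the vocabulary size k, the descriptor arrays, and all names are assumptions, and scikit-learn's KMeans stands in for whatever clustering implementation a given system uses.

```python
import numpy as np
from sklearn.cluster import KMeans

def learn_visual_words(descriptors, k=256, seed=0):
    """BOVW feature learning: cluster unlabeled descriptors (N, b);
    the k cluster centers become the visual words."""
    return KMeans(n_clusters=k, n_init=10, random_state=seed).fit(descriptors)

def bovw_histogram(descriptors, kmeans):
    """BOVW feature encoding: map each descriptor to its closest visual
    word and return the normalized visual-word histogram."""
    words = kmeans.predict(descriptors)
    hist = np.bincount(words, minlength=kmeans.n_clusters)
    return hist / max(1, hist.sum())

# usage: vocab = learn_visual_words(unlabeled_descriptors)
#        h = bovw_histogram(image_descriptors, vocab)
```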
III. UNSUPERVISED FEATURE LEARNING

The goal is to accurately classify a given image patch into one of the predefined scene categories. Our approach consists of five broad steps: i) feature extraction, ii) feature learning, iii) feature encoding, iv) feature pooling, and v) classification. We begin by extracting low-level feature descriptors from the image patch. As part of the feature learning process, we compute a set of normalized basis functions from the extracted features in an unsupervised manner, using a variant of sparse coding called orthogonal matching pursuit (OMP-k) [20]. During feature encoding, we project the features onto the learned basis function set and apply a soft-threshold activation function to generate a set of sparse features. We pool the sparse features to generate the final feature vector, which is then fed to a linear support vector machine (SVM) classifier. Fig. 2 shows an overview of the proposed framework.

Fig. 2. Overview of the proposed aerial scene classification framework.

A. Feature Extraction

We evaluate our scene classification framework with three different feature extraction strategies. First, we simply use raw pixel intensity values as features; next, we measure the oriented filter responses at each pixel to construct a feature vector based on filter energy; and finally, we experiment with dense SIFT descriptors. Feature extraction is performed on the gray-scale image generated from the RGB color channels.

Our system computes a low-level feature descriptor for each overlapping pixel block. Pixel blocks consist of local, contiguous groups of pixels. At this stage, the input image is represented as a set of vectors of low-level feature measurements, as shown in Fig. 3. For the raw-intensity strategy, we simply represent the pixel block as a column vector x_i ∈ R^b, where b is the product of the block dimensions and i is the block index. Note that throughout this paper we denote matrices with bold capital letters and vectors with bold lowercase letters; scalars are italicized, superscripted and subscripted indices denote the column and row positions of a vector, respectively, and indices enclosed in brackets denote element positions.

Fig. 3. Our system computes a feature descriptor for each overlapping pixel block. The white rectangles overlaid on the image denote pixel blocks. In this paper, we examine three different descriptors for pixel-block representation: raw pixel values, oriented filter responses, and SIFT descriptors.

For oriented filter responses, we use the Leung–Malik [21] multiscale, multi-orientation filter bank. Our filter bank consists of first and second derivatives of Gaussians at 6 orientations and 3 scales, 8 Laplacian-of-Gaussian filters, and 4 Gaussians at different scales. Following [21], we set the Gaussian widths for the scales to {1, √2, 2, 2√2}. The filter bank used in our system is shown in Fig. 4. For each block, we compute the average filter energy at every scale and orientation to generate a feature vector x_i ∈ R^b, where b = 48.

Fig. 4. The Leung–Malik [21] multiscale, multi-orientation filter bank used to compute oriented filter responses. The first three rows (filters 1–18) are first derivatives of Gaussians at six orientations and three scales; the next three rows (filters 19–36) are the corresponding second derivatives; filters 37–44 are Laplacian-of-Gaussian filters; and the last four (filters 45–48) are Gaussian filters.

Finally, we compute SIFT-based descriptors for each block. Our dense extraction contrasts with the approaches in [10], [11], where feature descriptors are computed only at sparse interest points; previous work [22] has shown that densely extracted SIFT descriptors yield higher classification accuracy than descriptors computed at sparse interest points. To compute the SIFT descriptor for each pixel block, the block is further divided into 4 × 4 disjoint sub-blocks, and an orientation histogram with 8 bins is computed for each sub-block, producing a feature vector x_i ∈ R^b with b = 128. We use the dense SIFT implementation provided by [23] for our feature computation.
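As an illustration of the dense extraction strategy, the following sketch collects raw-intensity descriptors over overlapping pixel blocks. The block size and stride are assumptions for illustration; this excerpt of the paper does not state the exact values used.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def raw_pixel_descriptors(gray, block=8, stride=4):
    """Extract raw-intensity descriptors x_i in R^b (b = block * block)
    from overlapping pixel blocks of a gray-scale image of shape (H, W).
    Returns a (b, N) matrix with one column per block."""
    wins = sliding_window_view(gray, (block, block))[::stride, ::stride]
    return wins.reshape(-1, block * block).T.astype(float)
```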
B. Feature Learning

To learn the basis functions, we sample a large set of unlabeled low-level feature descriptors X = [x_1, x_2, ..., x_M], where M is the number of sampled descriptors (M = 100 000 in our experiments). Each descriptor is normalized by subtracting its mean and dividing by its standard deviation, and the normalized feature matrix is then whitened. The whitening transform is given in [24] as

    X_white = T X,  with  T = U P^(−1/2) U^T        (1)

where U and P hold the eigenvectors and eigenvalues of the covariance matrix of X.

Next, given the whitened feature matrix X_white, we seek a minimization similar to the sparse coding framework:

    min_{D, s_i} ∑_i ‖D s_i − x_i‖₂²  subject to  ‖D^j‖₂ = 1 for all j,  and  ‖s_i‖₀ ≤ k        (2)

where ‖s_i‖₀ is the number of nonzero elements in the column vector s_i. The iterations begin by randomly initializing D and s, and they alternate between updating the sparse codes and the dictionary. For k = 1, we set s_i(j) = (D^j)^T x_i for j = arg max_j |(D^j)^T x_i| and set all other elements of s_i to zero. With the sparse codes s_i in hand, we can then update the dictionary by solving the resulting least-squares problem and renormalizing its columns, generating D ∈ R^(b×d), where b is the feature length and d is the dictionary size. The idea behind this minimization framework is to discover a basis function set D under which the low-level feature descriptors admit sparse representations; the basis set generated in this step can be viewed as a codebook against which the low-level feature descriptors are encoded in the encoding stage.

Fig. 5. The set of basis functions learned from raw pixel descriptors.
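A compact sketch of this learning stage under the equations above: ZCA-style whitening per (1), followed by OMP-1 alternating minimization of (2). The eps regularizer and the re-seeding of unused basis vectors are implementation assumptions, not details stated in the paper.

```python
import numpy as np

def whiten(X, eps=1e-5):
    """Eq. (1): X_white = T X with T = U P^(-1/2) U^T, where U, P come from
    the eigen-decomposition of the covariance matrix of X (shape (b, M))."""
    evals, U = np.linalg.eigh(np.cov(X))
    T = U @ np.diag(1.0 / np.sqrt(evals + eps)) @ U.T
    return T @ X

def learn_basis(X, d=1000, iters=50, seed=0):
    """Eq. (2) with k = 1: alternate between one-sparse codes and a
    least-squares dictionary update with unit-norm columns."""
    rng = np.random.default_rng(seed)
    b, M = X.shape
    D = rng.standard_normal((b, d))
    D /= np.linalg.norm(D, axis=0)
    for _ in range(iters):
        P = D.T @ X                               # projections (d, M)
        j = np.abs(P).argmax(axis=0)              # best basis per descriptor
        S = np.zeros((d, M))
        S[j, np.arange(M)] = P[j, np.arange(M)]   # s_i(j) = (D^j)^T x_i, rest 0
        D = X @ S.T                               # dictionary update
        dead = np.linalg.norm(D, axis=0) < 1e-12  # re-seed unused columns
        D[:, dead] = rng.standard_normal((b, int(dead.sum())))
        D /= np.linalg.norm(D, axis=0)            # renormalize columns
    return D
```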
C. Feature Encoding

Given the learned basis function set D, we proceed to encode the low-level feature descriptors. Here, the descriptors {x_1, x_2, ..., x_N} are extracted densely from the image patch, where N is the number of descriptors; computing the optimal sparse code for each descriptor with respect to the basis set could be computationally very expensive. Instead, we use a simple soft-threshold activation:

    s_i = D^T x_i        (3)
    z_i⁺ = max(0, s_i − α)        (4)
    z_i⁻ = max(0, −s_i − α)        (5)
    z_i = [z_i⁺ ; z_i⁻]        (6)

where z_i is the sparse feature corresponding to the low-level feature descriptor x_i. In (3), the feature descriptor x_i is projected onto the normalized basis vectors D^j through dot products. In (4) and (5), the resulting weight vector s_i is passed through a soft-threshold activation with fixed threshold α, and the positive and negative parts are concatenated in (6) to form the sparse feature z_i ∈ R^(2d).

D. Feature Pooling

The sparse features computed over the image patch are pooled to form the final feature vector:

    p = (1/N) ∑_{i=1}^{N} z_i.        (7)

Previous studies have explored various other ways of pooling sparse features, for example by computing local histograms at different spatial scales and mapping the pooled features with nonlinear Mercer kernels such as the histogram intersection kernel (HIK) or the chi-square kernel. This, however, leads to an SVM training cost of O(n³) and a storage cost of O(n²) for the n × n kernel matrix. In this paper, we use the simple mean pooling given in (7), which produces a feature vector that works well with linear kernels.
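The encoding and pooling steps reduce to a few vectorized operations; the following is a minimal sketch of (3)–(7), assuming the descriptors have already been normalized as described above.

```python
import numpy as np

def encode_and_pool(X, D, alpha=1.0):
    """X: (b, N) normalized descriptors from one image patch;
    D: (b, d) learned basis set with unit-norm columns.
    Returns the pooled sparse feature p in R^(2d)."""
    S = D.T @ X                                   # eq. (3): projections
    Z = np.vstack((np.maximum(0.0, S - alpha),    # eq. (4): positive part
                   np.maximum(0.0, -S - alpha)))  # eq. (5): negative part
    return Z.mean(axis=1)                         # eqs. (6)-(7): stack, mean-pool
```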

IV. SCENE CLASSIFICATION

We use a linear SVM-based framework for scene classification. Our unsupervised feature learning framework maps each image patch to a pooled sparse feature vector p, which is then classified into one of the predefined scene categories. For classification, we learn a binary decision function of the standard SVM form

    f(p) = w^T p + b        (8)

where w is the weight vector and b the bias learned from the training data; multiclass problems are handled by combining binary classifiers [28]. The complete feature computation for an image patch is summarized in Algorithm 1.

Algorithm 1: Sparse feature computation for an image patch I.
1) Extract low-level features X = [x_1, x_2, ..., x_N] from I.
2) Compute x_µ = (1/N) ∑_i x_i and x_σ = sqrt((1/N) ∑_i (x_i − x_µ)²).
3) for i = 1 to N do
       x_i ← (x_i − x_µ) / x_σ
       s_i = D^T x_i
       z_i⁺ = max(0, s_i − α);  z_i⁻ = max(0, −s_i − α);  z_i = [z_i⁺ ; z_i⁻]
   end for
4) Compute p = (1/N) ∑_i z_i.
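Putting Algorithm 1 together with a linear SVM, the sketch below trains and evaluates on synthetic stand-in data. The mixing matrices, all sizes, and the use of scikit-learn's LinearSVC are placeholders and assumptions (the references cite LIBSVM [31] as one SVM implementation); this is not the paper's exact pipeline or data.

```python
import numpy as np
from sklearn.svm import LinearSVC

def patch_feature(X, D, alpha=1.0):
    """Algorithm 1: normalize the descriptors of one patch, encode
    against D with the soft threshold, and mean-pool."""
    X = (X - X.mean(axis=1, keepdims=True)) / (X.std(axis=1, keepdims=True) + 1e-8)
    S = D.T @ X
    Z = np.vstack((np.maximum(0, S - alpha), np.maximum(0, -S - alpha)))
    return Z.mean(axis=1)

rng = np.random.default_rng(0)
D = rng.standard_normal((48, 256))
D /= np.linalg.norm(D, axis=0)
A0, A1 = rng.standard_normal((2, 48, 48))          # class-specific structure
P = np.array([patch_feature(A @ rng.standard_normal((48, 500)), D)
              for A in [A0] * 20 + [A1] * 20])     # 40 pooled features p
y = np.repeat([0, 1], 20)
clf = LinearSVC(C=1.0).fit(P, y)                   # f(p) = w^T p + b
print("training accuracy:", clf.score(P, y))
```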
V. EXPERIMENTAL SETUP AND DATA SETS

We set the soft threshold α = 1 and keep the same parameter settings for all experiments. To train the SVM classification models, we randomly select 80 samples per class for the ORNL-I and UCMERCED data sets; for ORNL-II, we use 70 and 400 training samples representing the positive and negative classes, respectively.

1) ORNL-I: First, we apply our method to the ORNL-I data set [29], which contains approximately 1-m spatial resolution imagery representing five geospatial scene classes: agricultural, large facility, commercial, suburban, and wooded. The images were collected from diverse sources, including the U.S. Department of Agriculture (USDA) National Agricultural Imagery Program (NAIP), the Microsoft TerraServer database, and additional imagery provided by the states of California and Utah. The data set contains 170, 153, 171, 186, and 170 samples of the agricultural, large-facility, commercial, suburban, and wooded classes, respectively. The samples span wide geographic regions and were acquired under varying conditions; each image was manually cropped to a 512 × 512-pixel patch, representing roughly 0.5 km on a side. Fig. 1 includes one example image per class.

2) UCMERCED: The UCMERCED data set [1] consists of aerial orthoimagery manually extracted from United States Geological Survey (USGS) National Map imagery. The images have one-foot spatial resolution and are cropped to 256 × 256 pixels. The data set contains 21 land-use classes (Fig. 1).

3) ORNL-II: The ORNL-II data set was compiled to test the large-facility scene detection problem. Following the description of the ORNL-I data set, the 153 ORNL-I images belonging to the large-facility scene class serve as positive samples, and 277 additional images (512 × 512 pixels) representing other scene types serve as negative samples. Fig. 7 shows examples.

4) Large-scale high-resolution imagery: To evaluate detection at scale, we apply the large-facility detection model to seven large-scale, 1-m spatial resolution, 3-band images. These images represent diverse geographic regions, including rural, residential, urban, and commercial areas, and were obtained from the USDA.

Fig. 7. Examples of positive and negative samples for the large-facility detection task. The first two columns show positive samples; the last two columns show negative samples.

VI. RESULTS

A. Sparse Coding Parameter and Dictionary Size

The sparse coding parameter α and the dictionary size d are the two free parameters of our method. The α parameter determines the sparsity of the codes, and d determines the number of basis functions. To study the sensitivity to α, we varied its value over a wide range in (4) and (5) and measured the sparsity of the resulting features z; good performance coincided with a sparsity level of about 0.7. Based on this analysis, we set α = 1 in the experiments described earlier. We also measured overall classification accuracy as the dictionary size d was varied up to 2000. Our analysis shows that d around 1000 produces excellent classification accuracy (the final pooled feature has length 2d), so we set d = 1000. Fig. 9 reports the classification accuracy for different values of d.

B. ORNL-I

To assess classification performance on the ORNL-I data set, we first compare the classification accuracies of the three feature extraction strategies: raw pixel values, oriented filter responses, and SIFT descriptors. Our experiments show that the oriented filter responses and the SIFT-based feature vectors produce the best accuracies; Table 1 reports the average overall accuracy for the three strategies. An interesting observation is that, when features are pooled directly without feature encoding, the oriented filter responses outperform the SIFT descriptors, whereas dense SIFT extraction combined with feature encoding yields the best classification accuracy overall. The remaining confusion occurs mainly between structurally similar classes such as commercial and large facility, as reflected in the accuracy results of Figs. 10 and 11.

C. UCMERCED

We compare our scene classification method with the spatial pyramid matching kernel (SPMK) [10] and with the BOVW and spatial co-occurrence kernel (SPCK++) approaches of [11], measuring classification performance on the challenging UCMERCED data set under the experimental setup of [11]. Table 2 reports the results for the three feature extraction strategies. The sparse features generated from dense SIFT descriptors achieve the best accuracy, clearly outperforming the other scene classification methods. The confusion matrix and overall accuracies are shown in Figs. 12 and 13. The confusion matrix generated from the SIFT features shows that classification errors arise mainly between classes with similar local structures; combining block-level SIFT features with features that better capture overall shape might further improve scene classification performance.

D. ORNL-II

To evaluate detection performance on large-facility scenes, we generate detection results on the ORNL-II data set. Fig. 14 shows the results produced on the test set: feature learning and encoding based on dense SIFT descriptors produce excellent scene detection performance, with precision and recall values of 0.98 and 0.99 and an F-measure of approximately 0.98.

E. Large-Scale High-Resolution Imagery

To scan large-scale high-resolution images, we apply the large-facility detection model in a sliding-window fashion and apply a non-maximum suppression technique to the resulting detection probabilities. Fig. 15 shows the detections on a large-scale high-resolution example image, with the ground truth (red boxes) and detections (yellow boxes) overlaid for the six large-scale high-resolution images used in our experiments. The excellent detection performance shows that our method holds promise for developing large-scale search capabilities.
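A sketch of the large-scale scan described above: slide the classifier over the image, then greedily suppress overlapping detections. The 512 × 512 window matches the patch size used throughout; the stride, the score threshold, and the score_patch helper (built from Algorithm 1 plus the trained SVM decision function f(p)) are assumptions for illustration.

```python
import numpy as np

def scan_image(img, score_patch, win=512, stride=256):
    """Score every win x win window of a large image. score_patch is a
    hypothetical helper mapping a window to its SVM decision value f(p)."""
    dets = []
    H, W = img.shape[:2]
    for r in range(0, H - win + 1, stride):
        for c in range(0, W - win + 1, stride):
            dets.append((r, c, score_patch(img[r:r + win, c:c + win])))
    return dets

def non_max_suppression(dets, win=512, min_score=0.0):
    """Greedy NMS: keep the highest-scoring windows, discarding any
    window that overlaps an already-kept one."""
    dets = sorted((d for d in dets if d[2] > min_score),
                  key=lambda d: d[2], reverse=True)
    kept = []
    for r, c, s in dets:
        if all(abs(r - kr) >= win or abs(c - kc) >= win for kr, kc, _ in kept):
            kept.append((r, c, s))
    return kept
```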
VII. CONCLUSION

We presented an unsupervised feature learning framework for aerial scene classification. The sparse features generated by the framework, used with a simple linear SVM kernel, surpass the classification accuracies of existing methods. For large-facility detection, we obtained excellent detection results with high F-measure values on large-scale high-resolution imagery.

ACKNOWLEDGMENT

The author thanks V. Vijayaraj for the effort in collecting images for the ORNL-I data set, as well as E. Bright and colleagues for their contributions. This work was supported by a grant from the U.S. Department of Energy under contract DE-AC05-00OR22725.

REFERENCES

[1] Y. Yang and S. Newsam, "Bag-of-visual-words and spatial extensions for land-use classification," in Proc. ACM Int. Conf. Adv. Geogr. Inf. Syst., 2010, pp. 270–279.
[2] M. Pesaresi and A. Gerhardinger, "Improved textural built-up presence index for automatic recognition of human settlements in arid regions with scattered vegetation," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 4, no. 1, pp. 16–26, Mar. 2011.
[3] I. A. Rizvi and B. K. Mohan, "Object-based image analysis of high resolution satellite images using modified cloud basis function neural network and probabilistic relaxation labeling process," IEEE Trans. Geosci. Remote Sens., vol. 49, no. 12, pp. 4815–4820, Dec. 2011.
[4] R. Bellens, S. Gautama, L. Martinez-Fonte, W. Philips, J. C.-W. Chan, and F. Canters, "Improved classification of VHR images of urban areas using directional morphological profiles," IEEE Trans. Geosci. Remote Sens., vol. 46, no. 10, pp. 2803–2813, Oct. 2008.
[5] A. K. Shackelford and C. H. Davis, "A combined fuzzy pixel-based and object-based approach for classification of high-resolution multispectral data over urban areas," IEEE Trans. Geosci. Remote Sens., vol. 41, no. 10, pp. 2354–2363, Oct. 2003.
[6] P. Gamba, F. Dell'Acqua, G. Lisini, and G. Trianni, "Improved VHR urban area mapping exploiting object boundaries," IEEE Trans. Geosci. Remote Sens., vol. 45, no. 8, pp. 2676–2682, Aug. 2007.
[7] J. Sivic and A. Zisserman, "Video Google: A text retrieval approach to object matching in videos," in Proc. IEEE Int. Conf. Comput. Vis., 2003, pp. 1470–1477.
[8] J. Wright, Y. Ma, J. Mairal, G. Sapiro, T. S. Huang, and S. Yan, "Sparse representation for computer vision and pattern recognition," Proc. IEEE, vol. 98, no. 6, pp. 1031–1044, Jun. 2010.
[9] Y.-L. Boureau, F. Bach, Y. LeCun, and J. Ponce, "Learning mid-level features for recognition," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2010, pp. 2559–2566.
[10] S. Lazebnik, C. Schmid, and J. Ponce, "Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2006, vol. 2, pp. 2169–2178.
[11] Y. Yang and S. Newsam, "Spatial pyramid co-occurrence for image classification," in Proc. IEEE ICCV, Nov. 2011, pp. 1465–1472.
[12] D. G. Lowe, "Object recognition from local scale-invariant features," in Proc. IEEE Int. Conf. Comput. Vis., Kerkyra, Greece, 1999, pp. 1150–1157.
[13] L. Bruzzone and L. Carlin, "A multilevel context-based system for classification of very high spatial resolution images," IEEE Trans. Geosci. Remote Sens., vol. 44, no. 9, pp. 2587–2600, Sep. 2006.
[14] C. Unsalan and K. L. Boyer, "Classifying land development in high-resolution panchromatic satellite images using straight-line statistics," IEEE Trans. Geosci. Remote Sens., vol. 42, no. 4, pp. 907–919, Apr. 2004.
[15] X. Huang, L. Zhang, and P. Li, "Classification and extraction of spatial features in urban areas using high-resolution multispectral imagery," IEEE Geosci. Remote Sens. Lett., vol. 4, no. 2, pp. 260–264, Apr. 2007.
[16] M. Lienou, H. Maitre, and M. Datcu, "Semantic annotation of satellite images using latent Dirichlet allocation," IEEE Geosci. Remote Sens. Lett., vol. 7, no. 1, pp. 28–32, Jan. 2010.
[17] R. Vatsavai, A. Cheriyadat, and S. Gleason, "Unsupervised semantic labeling framework for identification of complex facilities in high-resolution remote sensing images," in Proc. IEEE ICDMW, Dec. 2010, pp. 273–280.
[18] J. Yang, K. Yu, Y. Gong, and T. Huang, "Linear spatial pyramid matching using sparse coding for image classification," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2009, pp. 1794–1801.
[19] A. Cheriyadat, "Aerial scene recognition using efficient sparse representation," in Proc. Indian Conf. Vis., Graph. Image Process., 2012, p. 11.
[20] Y. C. Pati, R. Rezaiifar, and P. S. Krishnaprasad, "Orthogonal matching pursuit: Recursive function approximation with applications to wavelet decomposition," in Proc. Asilomar Conf. Signals, Syst., Comput., 1993.
[21] T. Leung and J. Malik, "Representing and recognizing the visual appearance of materials using three-dimensional textons," Int. J. Comput. Vis., vol. 43, no. 1, pp. 29–44, Jun. 2001.
[22] L. Fei-Fei and P. Perona, "A Bayesian hierarchical model for learning natural scene categories," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2005, pp. 524–531.
[23] A. Vedaldi and B. Fulkerson, VLFeat: An Open and Portable Library of Computer Vision Algorithms, Office of Naval Res., Arlington, VA, USA. [Online].
[24] A. Hyvärinen and E. Oja, "Independent component analysis: Algorithms and applications," Neural Netw., vol. 13, no. 4/5, pp. 411–430, May/Jun. 2000.
[25] A. Coates and A. Y. Ng, "The importance of encoding versus training with sparse coding and vector quantization," in Proc. Int. Conf. Mach. Learn., 2011, vol. 28, pp. 921–928.
[26] K. Gregor and Y. LeCun, "Learning fast approximations of sparse coding," in Proc. Int. Conf. Mach. Learn., 2010, vol. 27.
[27] R. Rigamonti, M. A. Brown, and V. Lepetit, "Are sparse representations really relevant for image classification?" in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2011.
[28] C.-W. Hsu and C.-J. Lin, "A comparison of methods for multi-class support vector machines," IEEE Trans. Neural Netw., vol. 13, no. 2, pp. 415–425, Mar. 2002.
[29] V. Vijayaraj, A. Cheriyadat, P. Sallee, B. Colder, R. R. Vatsavai, E. A. Bright, and B. L. Bhaduri, "Overhead image statistics," in Proc. IEEE Appl. Imagery Pattern Recognit. Workshop, 2008, pp. 1–8.
[30] P. O. Hoyer, "Non-negative matrix factorization with sparseness constraints," J. Mach. Learn. Res., vol. 5, pp. 1457–1469, Dec. 2004.
[31] C.-C. Chang and C.-J. Lin, LIBSVM: A Library for Support Vector Machines, 2001. [Online].
[32] T.-K. Huang, R. C. Weng, and C.-J. Lin, "Generalized Bradley–Terry models and multi-class probability estimates," J. Mach. Learn. Res., vol. 7, no. 1, pp. 85–115, Jan. 2006.
[33] J. Porway, K. Wang, and S. Zhu, "A hierarchical and contextual model for aerial image understanding," Int. J. Comput. Vis., vol. 88, no. 2, pp. 254–283, Jun. 2010.
