SIFT的讲解.ppt (An Explanation of SIFT)

Contents
- Overall picture
- Region detectors
- Scale invariant detection
- Localization
- Orientation assignment
- Region description
- SIFT approach
- Local Jet
- Storing and matching
- State of the art - Video Google

Our goal
- Detecting repeatable image regions
- Obtaining reliable and distinctive descriptors
- Searching an image database for an object efficiently

Invariance is vital
- Scale
- Rotation
- Orientation
- Illumination
- Noise
- Affine

Region detectors
- Harris points: invariant to rotation
  - Two significant eigenvalues indicate an interest point
- Harris-Laplace: invariant to rotation and scale
  - Uses the Laplacian of Gaussian operator
- SIFT: scale-space extrema using Difference of Gaussian

Scale Invariant Detection
- Consider regions (e.g. circles) of different sizes around a point
- Regions of corresponding sizes will look the same in both images
- The problem: how do we choose corresponding circles independently in each image?
- A "good" function for scale detection has one stable sharp peak

Scale-space
- Definition: $L(x, y, \sigma) = G(x, y, \sigma) * I(x, y)$, where $G(x, y, \sigma) = \frac{1}{2\pi\sigma^2} e^{-(x^2 + y^2) / 2\sigma^2}$
- Keypoints are detected using scale-space extrema in the difference-of-Gaussian function $D(x, y, \sigma) = L(x, y, k\sigma) - L(x, y, \sigma)$
  - Efficient to compute
  - Close approximation to the scale-normalized Laplacian of Gaussian, $\sigma^2 \nabla^2 G$

Image space to scale space
- Successive scales: $\sigma, k\sigma, k^2\sigma, k^3\sigma, k^4\sigma$

Relationship of D to $\sigma^2 \nabla^2 G$
- Diffusion equation: $\frac{\partial G}{\partial \sigma} = \sigma \nabla^2 G$
- Approximate $\partial G / \partial \sigma$: $\frac{\partial G}{\partial \sigma} \approx \frac{G(x, y, k\sigma) - G(x, y, \sigma)}{k\sigma - \sigma}$, giving $\sigma \nabla^2 G \approx \frac{G(x, y, k\sigma) - G(x, y, \sigma)}{k\sigma - \sigma}$
- Therefore $G(x, y, k\sigma) - G(x, y, \sigma) \approx (k - 1)\, \sigma^2 \nabla^2 G$
- When D has scales differing by a constant factor, it already incorporates the $\sigma^2$ scale normalization required for scale invariance

Local extrema detection
- Find maxima and minima in scale space

Frequency of sampling in scale
- Lowe prefers to use 3 scale samples per octave

Localization
- A 3D quadratic function is fit to the local sample points
- Start with the Taylor expansion with the sample point as the origin: $D(X) = D + \frac{\partial D^T}{\partial X} X + \frac{1}{2} X^T \frac{\partial^2 D}{\partial X^2} X$, where $X = (x, y, \sigma)^T$
- Take the derivative with respect to $X$ and set it to 0, giving $\hat{X} = -\left(\frac{\partial^2 D}{\partial X^2}\right)^{-1} \frac{\partial D}{\partial X}$
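As a rough illustration of the quadratic fit above, the sketch below estimates the gradient and Hessian of D at a sample point with central finite differences and solves the resulting 3x3 linear system for the offset. It assumes a 3x3x3 neighbourhood of difference-of-Gaussian values stored as nested lists indexed `D[s][y][x]`. The names `refine_keypoint`, `solve3` and `det3` are illustrative and not taken from Lowe's implementation, and Cramer's rule is used only to keep the example free of dependencies.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(A, b):
    """Solve the 3x3 system A x = b with Cramer's rule."""
    d = det3(A)
    if abs(d) < 1e-12:
        raise ValueError("singular Hessian, cannot localize this sample")
    solution = []
    for col in range(3):
        # Replace one column of A by b and take the determinant ratio.
        M = [[b[r] if c == col else A[r][c] for c in range(3)]
             for r in range(3)]
        solution.append(det3(M) / d)
    return solution


def refine_keypoint(D):
    """Return the offset (dx, dy, dsigma) of the fitted extremum.

    D is a 3x3x3 nested list indexed D[s][y][x]; the candidate
    extremum is the centre sample D[1][1][1].
    """
    centre = D[1][1][1]
    # First derivatives (central differences).
    gradient = [
        (D[1][1][2] - D[1][1][0]) / 2.0,  # dD/dx
        (D[1][2][1] - D[1][0][1]) / 2.0,  # dD/dy
        (D[2][1][1] - D[0][1][1]) / 2.0,  # dD/dsigma
    ]
    # Second derivatives (central differences).
    dxx = D[1][1][2] + D[1][1][0] - 2.0 * centre
    dyy = D[1][2][1] + D[1][0][1] - 2.0 * centre
    dss = D[2][1][1] + D[0][1][1] - 2.0 * centre
    dxy = (D[1][2][2] - D[1][2][0] - D[1][0][2] + D[1][0][0]) / 4.0
    dxs = (D[2][1][2] - D[2][1][0] - D[0][1][2] + D[0][1][0]) / 4.0
    dys = (D[2][2][1] - D[2][0][1] - D[0][2][1] + D[0][0][1]) / 4.0
    hessian = [[dxx, dxy, dxs],
               [dxy, dyy, dys],
               [dxs, dys, dss]]
    # X_hat = -H^{-1} g, i.e. solve H X_hat = -g.
    return solve3(hessian, [-g for g in gradient])
```

On an exactly quadratic patch the central differences are exact, so the returned offset coincides with the true peak. On real DoG data it is only an approximation of the extremum position.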

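The solution $\hat{X}$ is the location of the keypoint, as an offset from the sample point. This is a 3x3 linear system.

Localization
- Derivatives are approximated by finite differences, for example $\frac{\partial D}{\partial x} \approx \frac{D(x+1, y, \sigma) - D(x-1, y, \sigma)}{2}$
- If $\hat{X}$ is greater than 0.5 in any dimension, the extremum lies closer to a different sample point, so the process is repeated about that point

Being picky!
- Contrast (use the previous equation): $D(\hat{X}) = D + \frac{1}{2} \frac{\partial D^T}{\partial X} \hat{X}$
  - If $|D(\hat{X})| < 0.03$, throw it out
- Edge-iness: use the ratio of principal curvatures to throw out poorly defined peaks
  - Curvatures come from the Hessian: $H = \begin{bmatrix} D_{xx} & D_{xy} \\ D_{xy} & D_{yy} \end{bmatrix}$
  - No need to explicitly calculate the eigenvalues, we only need their ratio $r$: keep the keypoint if $\frac{\mathrm{Tr}(H)^2}{\mathrm{Det}(H)} < \frac{(r+1)^2}{r}$

Orientation assignment
- Computing the descriptor relative to the keypoint's orientation achieves rotation invariance
- Orientation is precomputed along with magnitude for all levels (useful in descriptor computation)
- Multiple orientations are assigned to keypoints from an orientation histogram
  - Significantly improves the stability of matching

Choosing the right image descriptors
- Distribution-based
  - Luminance-based approaches: histograms of pixel intensities and location
  - SIFT: based on the gradient distribution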
    in the region
  - Geometric-based approaches: shape context
- Spatial-frequency techniques
  - Fourier transform based: no spatial information
  - Gabor filters and wavelets: large number of filters
- Differential descriptors
  - Local jets: set of image derivatives
  - Steerable filters: steering derivatives in the direction of the gradient
- Miscellaneous
  - Generalized moment invariants characterize shape and intensity distribution

Who wants to be a Millionaire?
Which is the most popularly used image descriptor?
- a. Local intensity histogram
- b. SIFT
- c. Local Jets
- d. Fourier transform based

Local image descriptor - SIFT
- Weight the magnitude of each sample point by a Gaussian weighting function with $\sigma = 0.5 \times$ width
- Distribute each sample to adjacent bins by trilinear interpolation (avoids boundary effects)
- Allows for significant shift in gradient positions

Illumination invariance for SIFT
- Affine changes
  - Normalizing the vector to unit length accounts for an overall change in image contrast (a constant brightness offset does not affect the gradients at all)
- Non-linear changes
  - Occur due to camera saturation / viewpoint changes
  - Threshold the values in the unit feature vector at 0.2, then re-normalize
  - Less importance to large gradients
  - More importance to the distribution of orientations

Width of SIFT descriptor
- 2 parameters to be obtained
  - Number of orientations in the histogram
  - Size of the histogram array
- Optimal size obtained experimentally

Stability as a function of affine distortion
- The approach is not truly affine invariant
- Initial features are located
  in a non-affine manner

Local Image Descriptor: Local Jet
- The image in a neighborhood of a point can be described by the set of its derivatives
- The local jet of order N at a point x = (x1, x2) is defined using convolution of the image I with the Gaussian derivatives
- A complete set of invariants is computed that locally characterizes the signal
- The invariants are stacked in a vector

Local Image Descriptor: Multiscale approach
- Vectors of invariants are calculated at different scales
- Half-octave quantization is used
  - Difference between consecutive sizes: 20%
  - Size $\sigma$ varies between 0.48 and 2.07

Local Image Descriptor: Semilocal Constraints
- Longer vectors decrease the probability of repeatability
- Global features are sensitive to extraneous features or partial visibility
- Solution
  - For each interest point, select the p nearest features
  - For matching, a constraint on the angle between the lines joining neighboring points is added
  - Assumption: 50% of the points will match using these semi-local constraints

[Figure: Semilocal Constraints]

[Figure: Comparison of SIFT and Local Grayvalue Invariants]

Storing
- A set of keypoints is obtained from each reference image
- Each keypoint has a descriptor, which is a 128-component vector (4 × 4 × 8)
- All such (keypoint, vector) pairs corresponding to a set of reference images are stored in a set

Matching
- A test image gives a new set of (keypoint, vector) pairs
- For each such pair, we find the nearest (top 2) descriptors in our database set

Acceptance of a match
- A match is accepted IF the ratio of the distance to the first nearest descriptor to that of the second is < threshold

Complexity
- Initial complexity: number of features in the query image × total number of features in the database
- Reason: each keypoint (feature) has to be matched against all the features in the database to give the best two matches
- Solution: k-d trees!

Storage using k-d trees
- The set is stored using a k-d tree (in both the Schmid-Mohr and Lowe techniques)

K-d Trees
- The elements are stored in the leaves
- The other nodes are divisions of the space in some dimension
- Fixed-size one-dimensional buckets are used
- Each dimension is accessed sequentially
- The depth of the tree is at most the number of dimensions of the stored vectors

New complexity!
- Number of features of the query image

Which is the most popularly used image descriptor?
- b. SIFT
- c. Local Jets

Update and demo

STATE OF THE ART
- Video Google (NOT!): a text retrieval approach to object matching in videos, Josef Sivic and Andrew Zisserman

Text retrieval overview
- Documents are parsed into words
- Common words are ignored (the, an, etc.). This is called a stop list
- Words are represented by their stems: walk, walking, walks → walk
- Each word is assigned a unique identifier
- The vocabulary contains K words
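The text-retrieval steps listed so far (parsing, stop list, stemming, unique identifiers) can be sketched roughly as below. This is a toy illustration under stated assumptions: the contents of `STOP_LIST` are made up for the example, `stem` is a deliberately naive suffix stripper rather than a real stemming algorithm such as Porter's, and all function names are hypothetical.

```python
import re

# Assumed stop list for the example; a real system derives or curates one.
STOP_LIST = {"the", "an", "a", "and", "are", "that", "to", "be", "in", "for", "of"}


def stem(word):
    """Naive stemmer: strip a trailing 'ing' or 's' if enough letters remain."""
    for suffix in ("ing", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def parse(document):
    """Split a document into lowercase words, drop stop words, keep stems."""
    words = re.findall(r"[a-z]+", document.lower())
    return [stem(w) for w in words if w not in STOP_LIST]


def build_vocabulary(documents):
    """Assign each distinct stem a unique integer identifier (0 .. K-1)."""
    vocabulary = {}
    for document in documents:
        for word in parse(document):
            vocabulary.setdefault(word, len(vocabulary))
    return vocabulary
```

With this sketch, `parse("walk, walking, walks")` yields the same stem three times, and the size of the dictionary returned by `build_vocabulary` plays the role of K.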

Each document is represented by a K-component vector of word frequencies.

Parse and clean
- "Representation, detection and learning are the main issues that need to be tackled in designing a visual system for recognizing object categories."
- Representation, detection and learning are the main issues that need to be tackled in designing a visual system for recognizing object categories
- Represent detect learn main issue need tackle design visual system recognize object category

Creating the database
- Inverted file - Index
- Creating a document vector ID

Querying
- Parse the query to create the query vector
  - Query: "Representation learning"
  - Query Doc ID = (1,0,1,0,0,…)
- Retrieve all document IDs containing one of the query word IDs (using the inverted file index)
- Calculate the distance between the query and document vectors (angle between vectors)
- Rank the results

Using the text search as an analogy
- Basic idea: build a visual vocabulary based on a large set of images. Given a query image, search through the database in a manner similar to the text search.

Again... Detection and Description
- Detection: finding invariant regions
- Description: using the SIFT descriptor

Building the "Visual Stems"
- Cluster the descriptors into K groups using the K-means clustering algorithm
- Each cluster represents a "visual word" in the "visual vocabulary"
- Result: between 10000 and 20000 clusters used

[Figure: Example clusters]

Visual "Stop List"
- The most frequent visual words, which occur in almost all images, are suppressed
- [Figure: before stop list / after stop list]

Ranking Frames
- Distance between vectors (as with words/documents)
- Spatial consistency (= word order in the text)

The Visual Analogy
- Text ↔ Visual
- Word ↔ Descriptor
- Document ↔ Frame
- Query

Example searches
- Object query
  - http://www.robots.ox.ac.uk/vgg/research/vgoogle/how/results/bolle/bolle.html
  - http://www.robots.ox.ac.uk/vgg/research/vgoogle/how/results/poster/poster.html
- Scene query
  - http://www.robots.ox.ac.uk/vgg/research/vgoogle/how/examples/example_scene.html

Open issues
- Automatic ways of building the vocabulary are needed
- A method for ranking retrieval results, as Google does
- Extension to non-rigid objects, like faces
- Using this method for higher-level analysis of movies

References
- David G. Lowe, Distinctive image features from scale-invariant keypoints, International Journal
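As a supplementary sketch of the visual-vocabulary idea from the 'Building the "Visual Stems"' and "Ranking Frames" slides, the code below clusters descriptors with a small pure-Python K-means, turns each frame into a vector of visual-word counts, and ranks frames by the angle between vectors. It is a toy under stated assumptions: the initialization simply takes the first k descriptors, the iteration count is fixed, and the visual stop list and spatial-consistency check are left out. All function names are hypothetical, and a real system would use 128-dimensional SIFT descriptors and thousands of clusters.

```python
import math


def nearest(centres, v):
    """Index of the centre closest to v (squared Euclidean distance)."""
    return min(range(len(centres)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(centres[i], v)))


def kmeans(descriptors, k, iterations=10):
    """Cluster descriptors into k groups; returns the k cluster centres."""
    centres = [list(d) for d in descriptors[:k]]  # naive init (assumption)
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for d in descriptors:
            groups[nearest(centres, d)].append(d)
        for i, group in enumerate(groups):
            if group:  # keep the old centre if a cluster is empty
                centres[i] = [sum(col) / len(group) for col in zip(*group)]
    return centres


def frame_vector(descriptors, centres):
    """Count how often each visual word (cluster) occurs in one frame."""
    vector = [0] * len(centres)
    for d in descriptors:
        vector[nearest(centres, d)] += 1
    return vector


def cosine(u, v):
    """Cosine of the angle between two count vectors (0.0 if one is empty)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_frames(query_vector, frame_vectors):
    """Frame indices sorted from most to least similar to the query."""
    return sorted(range(len(frame_vectors)),
                  key=lambda i: cosine(query_vector, frame_vectors[i]),
                  reverse=True)
```

Ranking by cosine similarity corresponds to the "angle between vectors" distance used for text documents. The frequency weighting and spatial re-ranking described in the slides would be layered on top of this.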
