论文TheMathematicsofGeographicProfiling.doc_第1页
论文TheMathematicsofGeographicProfiling.doc_第2页
论文TheMathematicsofGeographicProfiling.doc_第3页
论文TheMathematicsofGeographicProfiling.doc_第4页
论文TheMathematicsofGeographicProfiling.doc_第5页
已阅读5页,还剩15页未读 继续免费阅读

下载本文档

版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领

文档简介

Journal of Investigative Psychology and Offender Pro lingJ. Investig. Psych. Offender Pro l. 6: 253265 (2009)Published online in Wiley InterScience (). DOI: 10.1002/jip.111The Mathematics of Geographic Pro lingAbstractWe begin by describing some of the mathematical foundations of the geographic pro lingproblem. We then present a new mathematical framework for the geographic pro lingproblem based on Bayesian statistical methods that makes explicit connections betweenassumptions on offender behaviour and the components of the mathematical model. It alsocan take into account local geographic features that either in uence the selection of acrime site or in uence the selection of an offenders anchor point. Copyright 2009 JohnWiley & Sons, Ltd.Key words: geographic pro ling; Bayesian analysis; mathematical modellingINTRODUCTIONThe geographic pro ling problem is the problem of constructing an estimate for the loca-tion of the anchor point of a serial offender from the locations of the offenders crimesites; see Rossmo (2000, p. 1). In this paper, we shall present a mathematical survey ofsome of the algorithms that have been used to solve the geographic pro ling problem. Wewill then present a new mathematical framework for the geographic pro ling problembased on Bayesian methods that is able to incorporate both geographic features that in u-ence the choice of a crime site as well as geographic features that affect the location ofthe offenders anchor point.It has long been recognised that there are important relationships between geographyand crime, including serial crime; we mention Brantingham and Brantingham (1993),Canter and Larkin (1993), and Rossmo (2000). Today, there are a number of softwarepackages being used to solve the geographic pro ling problem. These include CrimeStat,developed by Ned Levine; Dragnet, developed by David Canter; and Rigel, developed byKim Rossmo. There have been signi cant disagreements in the literature as to what is thebest methodology to evaluate the currently existing geographic pro ling software. See theoriginal report prepared for NIJ (Rich & Shively, 2004), the critique of Rossmo (2005a),*Correspondence to: Mike OLeary, Department of Mathematics, Towson University, Towson, MD 21252,USA.E-mail: Copyright 2009 John Wiley & Sons, Ltd.254 M. OLearyand the response of Levine (2005). There is also an ongoing lively discussion in the lit-erature as to whether or not computer systems are as effective as simply providing humanswith some simple heuristics (see Snook, Canter, & Bennell, 2002; Snook, Taylor, &Bennell, 2004; Rossmo, 2005b; Snook, Taylor, & Bennell, 2005b). We also have thediscussion in Snook, Taylor, and Bennell (2005a); Rossmo and Filer (2005); Bennell,Snook, and Taylor (2005); and Rossmo, Filer, and Sesley (2005), as well as the papers ofBennell, Snook, Taylor, Corey, and Keyton (2007) and of Bennell, Taylor, and Snook(2007). Given these and other controversies, we begin by enumerating the characteristicswe feel that a sound mathematical algorithm for the geographic pro ling shouldpossess: The method should be logically rigorous. There should be explicit connections between assumptions on offender behaviour andthe components of the model. The method should be able to take into account local geographic features; in particular,it should be able to account for geographic features that in uence the selection ofa crime site and geographic features that in uence the potential anchor points ofoffenders. The method should be based on data that are available to the jurisdiction(s) where theoffences occur. The method should return a prioritised search area for law enforcement of cers.Ensuring that the algorithms are rigorous and explicit in the connections between theassumptions on offender behaviour and components of the model will help in the analysisof the model. In particular, it will give researchers another tool for evaluating a model.As we have noted, there is no consensus as to the method(s) that should be used toevaluate the effectiveness of a geographic pro ling strategy. It is important that themathematics explicitly allows for the in uence of the local geography and demography.It is well known that there are relationships between the physical environment and crimerates; see for example Brantingham and Brantingham (1993). It is essential that agood mathematical framework has the ability to incorporate this information into themodel. However, it is equally important that the model use only data that are available tothe appropriate law enforcement agency. Finally, we recognise that simple point estimatesof offender anchor points are not very valuable to practising law enforcement of cers.Rather, to be useful to practitioners, a good algorithm must produce prioritised searchareas.EXISTING METHODSTo begin our review of the current state of geographic pro ling, let us agree to adopt somecommon notation. A point x will have two components x = (x(1), x(2). These can be latitudeand longitude, or distances from a xed pair of perpendicular reference axes. We presume that we are working with a series of n linked crimes, and the crime sites under consider-ation are labelled x1,x2, . . . , xn. We use the symbol z to denote the offenders anchor point.The anchor point can be the offenders home, place of work, or some other location ofimportance to the offender.We shall let d(x, y) denote the distance metric between the points x and y. There aremany reasonable choices for this metric including the Euclidean distance, the ManhattanThe mathematics of geographic pro ling 255distance, the total street distance following the local road network, or the total time tomake the trip while following the local road network.Existing algorithms begin by first making a choice of distance metric d(选择一个距离方式); they then selecta decay function f and construct a hit score function S(y) by computing(1)Regions with a high hit score are considered to be more likely to contain the offendersanchor point than regions with a low hit score. In practice, the hit score S(y) is notevaluated everywhere, but simply on some rectangular array of points yjk= (y(1)j,y(2)k ) forand , giving us the array of values Sjk=S(yjk).Rossmos method, as described in Rossmo (2000, Chapter 10) chooses the Manhattandistance function for d and the decay function kif d B,h( ) = df d (kBg h)gif d B.2B dWe remark that Rossmo also considers the possibility of forming hit scores by multiplica-tion; see Rossmo (2000, p. 200).The method described in Canter, Coffey, Huntley, and Missen (2000) is to use aEuclidean distance, and to choose either a decay function in the formf d( ) = edor functions with a buffer and plateau, with the form0( ) = Bf dCeif d A,if A dB,difdB.The CrimeStat program described in Levine (2009a) uses Euclidean or spherical dis-tance and gives the user a number of choices for the decay function, including Linear: f(d) = A + Bd; Negative exponential: f(d) = Aed; Normal: f(d) = A(2S2)1/2 exp(d d)2/2S2; Lognormal: f(d) = A(2d2S2)1/2 exp(ln d d)2/2S2; and Truncated negative exponential: f(d) = Bd if d C and f(d) = Aed ifd C.CrimeStat also allows the user to use empirical data to create a different decay functionmatching a set of provided data as well as the use of indirect distances.Though each of these approaches are distinct, they share the same underlying mathe-matical structure; they vary only in the choice of decay function and the choice of distancemetric. We remark that the latest version (3.2) of CrimeStat contains a new BayesianJourney-to-Crime Module that integrates information on the origin location of otheroffenders who committed crimes in the same location with the distance decay estimates(Levine, 2009b). Levine and Block tested this method with data from Baltimore Countyand from Chicago (Levine & Block, in press). See the introduction to this special issue.Copyright 2009 John Wiley & Sons, Ltd.J. Investig. Psych. Offender Pro l. 6: 253265 (2009)DOI: 10.1002/jip256 M. OLearyA NEW MATHEMATICAL APPROACHWe begin by looking for an appropriate model for offender behaviour and start with thesimplest possible situationwhere we know nothing about the offender. Thus, we assumethat our offender chooses potential locations to offend randomly according to someunknown probability density function P(x). For any geographic region R, the probabilitythat our offender will choose a crime site in R can be found by adding up the values ofP in R, giving us the probability RP(x)dx(1)dx(2).At rst glance, it may seem odd to use a probabilistic model to describe human behav-iour. In fact, probabilistic models are commonly used to describe many kinds of apparentlydeterministic phenomena. For example, classical models of the diffusion of heat or chem-ical concentration can be derived probabilistically; they also see application in models ofthe stock market (Baxter & Rennie, 1996; Wilmott, 1998; Wilmott, Howison, & Dewynne,1995), in models of population genetics (Ewens, 2004), and in many other models(Beltrami, 1993).More precisely, the probability density function P represents our knowledge of thebehaviour of the offender. We use a probability distribution, not because the offendersdecision has a random component, although it may. Rather, we use a probability densitybecause we lack complete information about the offender. Indeed, consider the followingthought experiment. If we want to model the ip of a coin, we use probability and assumethat each side of the coin is apt to occur half the time. Now instead of ipping the coin,let us take the coin to a colleague and ask them to choose a side. In this case, the outcomeis the deliberate result of a decision by an individual. However, without knowing moreinformation about our colleagues preferences, the best choice to model the outcome ofthat experiment is still the use of a probability distribution.Returning to our model of offender behaviour, we begin with a question: Upon whatsorts of variables should our probability density function P depend? One of the fundamen-tal assumptions of geographic pro ling is that the choice of an offenders target locationsis in uenced by the location of the offenders anchor point z. Therefore, we assume thatP depends upon z. Underlying this approach are the requirements that the offender has asingle anchor point and that it is stable during the crime series.A second important factor is the distance our offender is willing to travel to commit acrime. Let denote the average distance that our offender is willing to travel to offend.We allow for the possibility that this value varies between offenders. Combining these,we assume that there is a probability density function P(x|z, ) for the probability that anoffender with a single stable anchor point z and average offence distance commits acrime at the location x.We assume that this model is local to the jurisdiction under consideration. In particular,we explicitly allow for the possibility that different models P(x|z, ) need to be chosenfor different jursidictions.The key mathematical point is that the unknown is now the entire distribution P(x|z,), rather than just the anchor point z. On its face, it seems a step backwards, but in fact,it is not. Indeed, let us suppose that the form of the distribution P is known, but thatthe values of the anchor point z and average offence distance are unknown. Then theproblem can be stated mathematically as, given a sample x1,x2, . . . ,xn (the crime sitelocations) from the distribution P(x|z, ) with parameters z and to determine the bestway to estimate the parameter z (the anchor point).Copyright 2009 John Wiley & Sons, Ltd.J. Investig. Psych. Offender Pro l. 6: 253265 (2009)DOI: 10.1002/jipThe mathematics of geographic pro ling 257For the moment, let us set aside the question of what reasonable choices can be madefor the form of the distribution P(x|z, ), and focus on how we can estimate the anchorpoint z from our knowledge of the crime locations x1, . . . , xn.It turns out that this is a well-studied mathematical problem. One approach is to use themaximum likelihood estimator. To do so, one rst forms the likelihood function:n) = P(x yi, a) = P(x y, a)P(x yn, a).L (y, ai=1Then the maximum likelihood estimates zmle and mle are the values of y and a that makeL as large as possible. Equivalently, one can maximise the log-likelihood functionnl(y, a) = ln P(, a) = ln P(, a) +(x yx yln P x y, a).i=1Though rigorous, this approach is unsuitable as simple point estimates for the offendersanchor point are not operationally useful. Instead, we continue our analysis by usingBayes Theorem.BAYESIAN ANALYSISTo see how Bayesian methods can be applied to geographic pro ling, we begin with thesimplest case where the offender has only committed one crime at the locationx. Wewould like to use the information from this crime location to form an estimate for theprobability distribution for the anchor point z. Bayes Theorem gives us the estimate, z,)Px zP(z, x) =( )Px(2)(Carlin & Louis, 2000; Casella & Berger, 2002). Here, P(z, |x) is the posterior distribu-tion, which gives the probability density that the offender has anchor point z and averageoffence distance , given that the offender has committed a crime at the location x.The termP(x) is the marginal distribution. The important thing to note is that it isindependent of z and , therefore, it can be ignored provided we replace the equality in(2) with proportionality.The term (z, ) is the prior distribution. It represents our knowledge of the probabilitydensity that the offender has anchor pointz and average offence distance before weincorporate any information about the crime series. One approach to the prior is to assumethat the anchor point z is mathematically independent of the average offence distance .In this case, we can factor to obtain z ,) = H( ) ( )(3)where H(z) is the prior probability density function for the distribution of anchor pointsbefore any information from the crime series is included and () is the probability densityfunction for the prior distribution of the offenders average offence distance, again beforeany information from the crime series is included.Combining these, we then obtain the expression() () ( ) ( )P z, xCopyright 2009 John Wiley & Sons, Ltd.Px z, Hz .J. Investig. Psych. Offender Pro l. 6: 253265 (2009)DOI: 10.1002/jip258 M. OLearyOf course, we are interested in crime series, and we would like to estimate the probabil-ity density for the anchor point z given our knowledge of all of the crime locations x1, . . . ,xn. To do so, we proceed in a similar fashion; now Bayes Theorem impliesP x, . . . , x zn,z .(, . . . ,) =(1)P z, x1xnP(x1, . . . , xn)Here, P(z, |x1, . . . ,xn) is again the posterior distribution, which gives the probabilitydensity that the offender has anchor point z and average offence distance , given that theoffender has committed a crime at each of the locations x1, . . . , xn. The marginal P(x1, . . . ,xn) remains independent of z and , and can be ignored; the prior can be handled by(3). Then(, . . . ,) (, . . . , ) ( ) ( ).(4)Pz, x1xnPx1x znHzThe factor P(x1, . . . , xn|z, ) on the right side is the joint probability that the offendercommitted crimes at all of the locations x1, . . . , xn given that they had anchor point z andaverage offence distance . The simplest assumption we can make is that all of the offencesites are mathematically independent; then we have the reductionP(x1, . . . ,n, ) = P(x z , )P(n, ).(5)Substituting this into (4) gives(, . . . ,) (,)(, ) ( ) ( ).Pz, x1xnPx zP x znHzFinally, because we are only interested in the location of the anchor point z, we take theconditional distribution to obtain our fundamental mathematical result:() (,)(| , ) ( ) ( ).(6)Pz x1, . . . , xnPx zPnH zdThe expression P(z|x1, . . . ,xn) gives us the probability density that the offender hasanchor point z given that they have committed crimes at the locations x1, . . . , xn. Becausewe are calculating probabilities, this immediately provides us a rigorous search area forthe offender. Indeed, regions with larger values of P(z|x1, . . . , xn) by de nition are morelikely to contain the offenders anchor point than regions where P(z|x1, . . . , xn) is lower.This is a very general framework for the geographic pro ling problem. There are manychoices for the model of offender behaviour P(x|z, ), and we will later examine a numberof reasonable choices. Though the preceding used a model for offender behaviour withone parameter other than the anchor point, the mathematics continues to holdwith elementary modi cations if we either add additional parameters or remove theparameter .In addition to an assumption as to the form

温馨提示

  • 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
  • 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
  • 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
  • 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
  • 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
  • 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
  • 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。

评论

0/150

提交评论