# First-Order Point Pattern Analysis: Modeling Spatial Intensity {#chp14_0}
## Introduction
In spatial point pattern analysis, first-order effects refer to variations in the intensity of points across a study area due to external factors. These effects describe how the likelihood of observing a point changes from one location to another, independent of the presence or absence of other points. For example, oak trees may be more densely distributed in areas with favorable soil conditions, or retail stores may cluster in regions with higher population density. In both cases, the spatial variation is driven by underlying environmental or socioeconomic covariates rather than interactions between the points themselves.
This chapter focuses on techniques for detecting and modeling these first-order effects. We begin with simple descriptive tools such as **global** and **local** density measures, then move to more refined methods like kernel density estimation and covariate-adjusted intensity modeling. Finally, we introduce statistical models, such as the Poisson point process, that allow us to formally test hypotheses about how external variables influence point intensity. By isolating first-order effects, we lay the groundwork for more advanced analyses of second-order properties, which examine how points interact with one another.
## Centrography
Before diving into formal models of spatial intensity, it's helpful to begin with simple descriptive tools that summarize the overall distribution of points. This approach, known as **centrography**, includes techniques such as the **mean center**, **standard distance** and **standard deviational ellipse**. These measures provide insight into the central tendency and dispersion of a point pattern, offering a useful starting point for exploring spatial variation.
```{r echo=FALSE}
knitr::include_graphics("img/centrography.svg")
```
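The mean center and standard distance shown above are straightforward to compute by hand or in a few lines of R. A minimal sketch on made-up coordinates (the `pts` data frame is hypothetical, for illustration only):

```{r}
# A minimal sketch of two centrographic measures on hypothetical data
pts <- data.frame(x = c(2, 4, 5, 7, 9), y = c(3, 8, 5, 6, 2))

# Mean center: the average of the x and y coordinates
mean_center <- c(x = mean(pts$x), y = mean(pts$y))

# Standard distance: the root-mean-square distance of the points
# from the mean center
std_dist <- sqrt(mean((pts$x - mean_center[["x"]])^2 +
                      (pts$y - mean_center[["y"]])^2))
```

The standard deviational ellipse extends this idea by computing separate dispersion measures along rotated axes.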
These point pattern analysis techniques were popular before computers became ubiquitous because the calculations were simple enough to do by hand. However, these summary statistics are highly condensed and conceal much of the valuable information in the observed pattern.
In the following sections, we introduce density-based methods as a bridge between raw spatial data and more formal modeling of intensity.
## Understanding Intensity and Density in Point Patterns
In point pattern analysis, the terms **intensity** and **density** are closely related but conceptually distinct. Understanding the difference is important when interpreting spatial patterns and building models.
- **Density** refers to the **observed number of points per unit area** in a dataset. It’s a descriptive statistic--what we actually count from the data. For example, if there are 31 trees in a 100 square meter plot, the density is 0.31 trees per square meter.
- **Intensity**, on the other hand, refers to the **underlying rate or likelihood** that a point occurs at a location. It’s a property of the **process** that generated the pattern, not just the pattern itself. Intensity is often denoted by the symbol $\lambda$ and can vary across space. When we estimate intensity from observed data, we use $\widehat{\lambda}$ to indicate that it’s an approximation.
Think of it this way: density is *what we observe*; intensity is *what we infer*. If the intensity is constant across the study area, we say the process is **homogeneous**; if it varies, the process is **inhomogeneous**. We therefore need tools that can model how intensity changes with location or covariates.
## Global density
A basic estimate of a point pattern’s *intensity*, $\widehat{\lambda}$, is its global density--the average number of points per unit area across the entire study region. It is computed as the ratio of the total number of observed points, $n$, to the surface area of the study region, $a$:
$$
\begin{equation}
\widehat{\lambda} = \frac{n}{a}
\label{eq:global-density}
\end{equation}
$$
This estimate assumes that the underlying process is homogeneous, meaning the intensity is constant throughout the region.
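In `spatstat`, this estimate is a one-liner. A minimal sketch on a simulated pattern (the object names are illustrative and chosen to avoid clashing with objects used elsewhere in this chapter):

```{r}
# A sketch of the global density estimate with spatstat (simulated data)
library(spatstat)
win31 <- owin(c(0, 10), c(0, 10))      # 10-by-10 study region, area = 100
set.seed(1)
P31 <- runifpoint(31, win = win31)     # 31 uniformly placed points
lambda_hat <- npoints(P31) / area(Window(P31))   # n / a = 31/100
lambda_hat                             # 0.31 points per unit area
```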
```{r f11-gobal-dens, fig.cap="An example of a point pattern where n = 31 and the study area (a square with 10-unit sides) covers 100 square units. The point density is thus 31/100 = 0.31 points per unit area.", fig.height=2.5, fig.width=2.5, echo=FALSE}
library(spatstat)
library(ggplot2)
cell <- 10
win <- owin(xrange=c(0,cell),yrange=c(0,cell))
set.seed(3452)
P <- rMatClust(0.5, 1, 0.5,win=win)
ggplot(aes(x = x, y = y), data = as.data.frame(P)) + theme(legend.position = "none") +
geom_point(data = as.data.frame(P), mapping=aes(x=x, y=y), size=1.5, shape=16, colour="grey20") +
coord_equal()+ xlim(0,10) + ylim(0,10) + labs(x="",y="") + theme(axis.text=element_text(size=8))
```
While useful as a summary statistic, global density can mask important spatial variation, which motivates the need for local density measures introduced in the next section.
## Local density
A point pattern’s density can vary across space, and measuring it at different locations allows us to assess whether the underlying process’s local intensity, $\widehat{\lambda}_i$, is spatially uniform or not. This spatial variation is a key aspect of first-order effects, where external factors may influence the likelihood of observing a point in different parts of the study area.
Identifying non-uniform intensity is important because it can confound distance-based analyses (covered in the next chapter) which assume **stationarity**--the assumption that the underlying process generating the points has a constant intensity across the study area.
Several techniques are available for estimating local density. In this chapter, we focus on two widely used methods: *quadrat density*, which divides the study area into discrete subregions, and *kernel density*, which uses a moving window to produce a smooth surface of estimated intensity.
### Quadrat density
Quadrat density analysis involves dividing the study area into sub-regions called quadrats. For each quadrat, the **local density** is computed by dividing the number of points within the quadrat by its area. This provides a spatially localized estimate of point intensity.
This method assumes that the intensity of the point process is roughly constant within each quadrat, which may not hold if quadrats are too large or poorly aligned with underlying spatial variation.
Quadrats can take various shapes, such as hexagons and triangles. In the following example, density is computed for square-shaped quadrats.
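A sketch of a square quadrat count with `spatstat`'s `quadratcount()`, on a simulated pattern (object names are illustrative):

```{r}
# A sketch of quadrat densities with spatstat (simulated data)
library(spatstat)
set.seed(1)
P31 <- runifpoint(31, win = owin(c(0, 10), c(0, 10)))
Q4  <- quadratcount(P31, nx = 2, ny = 2)  # four 5-by-5 quadrats
Q4 / 25         # counts divided by each quadrat's area of 25 square units
intensity(Q4)   # the same per-quadrat densities computed by spatstat
```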
```{r f11-quad01, fig.cap="An example of a quadrat count where the study area is divided into four equally sized quadrats whose area is 25 square units each. The density in each quadrat can be computed by dividing the number of points in each quadrat by that quadrat's area.", fig.height=2.2, fig.width=2.2, echo=FALSE}
OP <- par(mar=c(0,0,0,0))
Q.dim <- 2
Q <- quadratcount(P, nx=Q.dim, ny=Q.dim)
A <- spatstat.geom::area(P$window) / Q.dim^2 # Area for each quadrat
plot(Q / A, main="", col="grey20", entries = Q/A)
plot(P, add = TRUE, pch=20, col=rgb(0,0,0,0.5))
par(OP)
```
The choice of quadrat number and shape can influence the measure of local density and must be made with care. If very small quadrats are used, you risk having many empty quadrats, which may prove uninformative. If very large quadrats are used, you risk missing subtle changes in the spatial distribution of density, such as an east-west gradient in density values.
Instead of dividing the study area into uniform shapes, quadrats can also be defined based on an underlying **covariate**--a spatial variable believed to influence point intensity (e.g., elevation, land cover, or population density). For example, if it's believed that the underlying point pattern process is driven by elevation, quadrats can be defined by sub-regions such as different ranges of elevation values (labeled 1 through 4 on the right-hand plot in the following example). This can result in quadrats with non-uniform shape and area.
```{block, type="note"}
Converting a continuous field into discretized areas is sometimes referred to as **tessellation**. The end product is a **tessellated surface**.
```
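A tessellated surface of this kind can be built with `spatstat`'s `cut()` and `tess()` functions. The covariate below is simulated and all object names are hypothetical:

```{r}
# A sketch of covariate-based tessellation (simulated covariate)
library(spatstat)
set.seed(1)
P31      <- runifpoint(31, win = owin(c(0, 10), c(0, 10)))
elev_sim <- density(P31, sigma = 5)  # stand-in for an elevation covariate
cuts     <- cut(elev_sim, breaks = 4, labels = 1:4)  # 4 equal-width classes
V_sim    <- tess(image = cuts)                # the tessellated surface
Q_tess   <- quadratcount(P31, tess = V_sim)   # points per covariate class
intensity(Q_tess)                             # density per covariate class
```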
<br/>
```{r f11-quad02, fig.cap="Example of a covariate. Figure on the left shows the elevation map. Figure on the right shows elevation broken down into four sub-regions (a tessellated surface) for which local density values will be computed.", fig.height=2, fig.width=6.0, echo=FALSE}
elev <- density(P, sigma = 5) *1000
OP <- par(mfrow = c(1,2), mar=c(0,0,0,3))
plot.im(elev, col=grey(50:0/50), main=NULL, las=2)
plot(P, col=rgb(0,0,0,.7), add=T, pch=20)
brk <- c( 260, 286, 301, 318 , 370)
Zcut <- cut(elev, breaks=brk, labels=1:4)
V <- tess(image=Zcut)
plot(V$image, main="", las=2, col=colorRampPalette(c("grey90","grey20"))(V$n))
plot(P, col=rgb(0,0,0,.7), add=T, pch=20)
par(OP)
```
If the local intensity changes across the tessellated covariate, then there is evidence of a dependence between the process that generated the point pattern and the covariate. In our example, sub-regions 1 through 4 have surface areas of `r round(spatstat.geom::tile.areas(V), 2)` map units respectively. To compute these regions' point densities, we simply divide the number of points by the respective area values.
```{r f11-quad03, fig.cap="Figure on the left displays the number of points in each elevation sub-region (sub-regions are coded as values ranging from 1 to 4). Figure on the right shows the density of points (number of points divided by the area of the sub-region).", fig.height=2, fig.width=6.0, echo=FALSE}
# Extract centroids
library(sf)
V_stars <- stars::st_as_stars(as.im(V))
V_poly <- st_as_sf(V_stars, merge = TRUE)
V_coord <- st_coordinates(st_centroid(V_poly[ order(V_poly$v), ]))
#cl <- interp.colours(c("lightyellow", "orange" ,"red"), 40)
cl <- colourmap(c("#FFFFDA", "#FFDF5F", "#FF9100", "#FF0400"),
breaks=c(0, 0.31, 0.35, 0.4, 0.4222))
Q2 <- quadratcount(P, tess = V)
OP <- par(mfrow = c(1,2), mar=c(0,0,0,3))
plot(V$image, main="", las=2, col=colorRampPalette(c("grey90","grey20"))(V$n))
plot(P, pch=20, add=T)
text(V_coord, labels=Q2, col = "darkorange")
Q2i <- intensity(Q2)
int <- round(data.frame(Q2i)$Freq, 3)
plot(intensity(Q2, image=TRUE), col = cl, las=2, main=NULL)
text(V_coord, labels=int, cex = 0.8)
#plot(Q2, add=T, show.tiles = FALSE, entries= int, pos = 1, labelPts=T)
par(OP)
```
We can plot the relationship between point density and elevation regions to help assess any dependence between the variables.
```{r f11-quad04, fig.cap="Plot of point density vs elevation regions.", fig.height=2, echo=FALSE}
ggplot() + aes(y=as.numeric(Q2i), x=1:4) + geom_point() + geom_line() +xlab("Elevation regions") + ylab("Point density")
```
It's important to note that how one *chooses* to tessellate a surface can have an influence on the resulting density distribution. For example, dividing the elevation into seven sub-regions produces the following density values.
```{r f11-quad05, fig.cap="Same analysis as last figure using different sub-regions. Note the difference in density distribution.", fig.height=2, fig.width=6.0, echo=FALSE}
n_cut <- 7
brk <- quantile(elev, probs=(0:n_cut)/n_cut, type=2)
Zcut <- cut(elev, breaks=brk, labels=1:n_cut)
V <- tess(image=Zcut)
# Extract centroids
V_stars <- stars::st_as_stars(as.im(V))
V_poly <- st_as_sf(V_stars, merge = TRUE)
V_coord <- st_coordinates(st_centroid(V_poly[ order(V_poly$v), ]))
# Generate plots
OP <- par(mfrow = c(1,2), mar=c(0,0,0,3))
Q2 <- quadratcount(P, tess = V)
plot(V$image, main="", las=2, col=colorRampPalette(c("grey90","grey20"))(V$n))
#plot(Q2, show.tiles=FALSE, add=TRUE)
plot(P, pch=20, add=T)
text(V_coord, labels=Q2, col = "darkorange")
Q2i <- intensity(Q2)
int <- round(data.frame(Q2i)$Freq, 3)
cl <- colourmap(interp.colours(c("lightyellow", "orange" ,"red"), n_cut),
breaks=c(0,sort(intensity(Q2))) )
plot(intensity(Q2, image=TRUE), col = cl, las=2, main=NULL)
text(V_coord, labels=int, cex = 0.8, srt=45)
#plot(Q2, add=T, show.tiles = FALSE, entries= int)
par(OP)
```
While the high density in the western part of the study area remains, the density values to the east are no longer consistent across the other three regions.
The quadrat analysis approach has advantages--it is easy to compute and interpret. However, it does suffer from the [modifiable areal unit problem](https://mgimond.github.io/Spatial/pitfalls-to-avoid.html#maup) (MAUP) as highlighted in the last two examples.
Quadrat density is most useful when the study area can be meaningfully divided into regions of interest, or when a covariate provides a natural way to segment space. However, it may be less effective in detecting subtle or continuous spatial variation, which motivates the use of kernel density methods introduced next.
### Kernel density
The kernel density approach is a method for estimating the spatial intensity of a point pattern. Like quadrat density, it computes localized density values, but it does so using a **moving window** that overlaps across space, producing a smoother and more continuous estimate of intensity.
This moving window is defined by a **kernel**. A kernel is a mathematical function that defines how much influence each point has on the estimated intensity at a given location.
The kernel density method produces a grid of intensity estimates where each cell’s value reflects the weighted contribution of nearby points within the kernel window centered on that cell.
The simplest kernel function is a **basic kernel** where each point in the kernel window is assigned equal weight.
```{r f11-kernel01, fig.cap="An example of a basic 3x3 kernel density map where each point is assigned an equal weight. For example, the cell centered at location x = 1.5 and y = 7.5 has one point within a 3x3 unit (pixel) region and thus has a local density of 1/9 = 0.11.", fig.height=3.5, echo=FALSE}
#library(raster)
library(terra)
library(grid)
gp.text <- function(sg, nr=1, sfmt="%02.0f",x1,x2,y1,y2,src){
src = as.data.frame(src)
sg.names = names(sg)
sg = crop(sg, c(x1,x2,y1,y2))
names(sg) = sg.names # Renaming the layers is necessary if the brick object is cropped
r.df = as.data.frame(cbind(xyFromCell(sg,1:ncell(sg))))
r.df = cbind(r.df, as.data.frame(sg))
ggplot(aes(x = x, y = y), data = r.df, environment = environment()) +
geom_tile(aes(fill = Point.density)) +
scale_fill_gradient(low = "#F0F9E8",high = "#08589E", name="Basic \nKernel") +
theme(strip.text.x = element_text(size=6, family="mono", face="bold"), legend.position = "right",
axis.text.x=element_text(size = 8), axis.text.y=element_text(size = 8),
plot.margin = unit(c(0, 0, 0, 0), "cm"), legend.key.size = unit(0.5,"cm"), legend.text=element_text(size=8),
legend.title=element_text(size=10)) +
geom_point(data = src, mapping=aes(x=x, y=y), size=2, shape=16, colour="grey20") +
coord_equal() + geom_text(aes(label=sprintf(fmt=sfmt, Point.density)), size=3, colour="#555555")+
scale_x_continuous(breaks=seq(0,10,2)) + scale_y_continuous(breaks=seq(0,10,2)) + labs(x="",y="")
}
fesri <- function(sigma, n=5) {
m <- matrix(nc=n, nr=n)
col <- rep(1:n, n)
row <- rep(1:n, each=n)
x <- col - ceiling(n/2)
y <- row - ceiling(n/2)
# See http://math.stackexchange.com/a/41191
ax <- (x^2 + y^2) / sigma^2
m[cbind(row, col)] = ifelse(abs(ax) < sigma,
(1 - ax)^2,
0)
# sum of weights should add up to 1
m / sum(m)
}
r = rast(nrows=cell,ncols=cell,xmin=0,xmax=cell,ymin=0,ymax=cell)
r[] = 0
Psf <- st_as_sf(P)[-1,]
P.r = rasterize(Psf, r, field=1, fun=sum) # Cell value is assigned number of points
P.r[is.na(P.r[])] = 0
# Point density
d.p = focal(P.r, w=matrix(1,nrow=3,ncol=3), fun=sum, na.rm=T, pad=T, padValue=0)
d.p[] = d.p[] / 9 # Divide by kernel area
#sg = brick(d.p)
sg = d.p
names(sg) = c("Point.density")
gp.text(sg, nr=1, sfmt="%0.2f", x1=0,x2=cell,y1=0,y2=cell,src=P)
```
More common **kernel functions** assign weights to points that are inversely proportional to their distances from the kernel window's center. Such kernel functions often follow a *Gaussian* or *quartic* distribution. These functions tend to produce a smoother density map.
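`spatstat`'s `density.ppp()` implements several such kernels via its `kernel` argument. A minimal sketch on simulated data (object names are illustrative):

```{r}
# A sketch of smoothed kernel density estimates with density.ppp()
library(spatstat)
set.seed(1)
P31     <- runifpoint(31, win = owin(c(0, 10), c(0, 10)))
K_gauss <- density(P31, sigma = 1, kernel = "gaussian")  # Gaussian kernel
K_quart <- density(P31, sigma = 1, kernel = "quartic")   # quartic kernel
plot(K_gauss, main = "Gaussian kernel density estimate")
```

The `sigma` argument controls the kernel bandwidth; larger values produce smoother surfaces.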
```{r f11-kernel02, fig.cap="An example of a kernel function: the 3x3 quartic kernel, where each point in the kernel window is weighted based on its proximity to the kernel's center cell (closer points are weighted more heavily). Kernel functions such as the quartic tend to generate smoother surfaces.", fig.height=3.5, echo=FALSE}
#library(raster)
library(terra)
#library(maptools)
library(grid)
gp.text <- function(sg, nr=1, sfmt="%02.0f",x1,x2,y1,y2,src){
src = as.data.frame(src)
sg.names = names(sg)
sg = crop(sg, c(x1,x2,y1,y2))
names(sg) = sg.names # Renaming the layers is necessary if the brick object is cropped
r.df = as.data.frame(cbind(xyFromCell(sg,1:ncell(sg))))
r.df = cbind(r.df, as.data.frame(sg))
ggplot(aes(x = x, y = y), data = r.df, environment = environment()) +
geom_tile(aes(fill = Kernel.function)) +
scale_fill_gradient(low = "#F0F9E8",high = "#08589E", name="Kernel \nFunction") +
theme(strip.text.x = element_text(size=6, family="mono", face="bold"), legend.position = "right",
axis.text.x=element_text(size = 8), axis.text.y=element_text(size = 8),
plot.margin = unit(c(0, 0, 0, 0), "cm"), legend.key.size = unit(0.5,"cm"), legend.text=element_text(size=8),
legend.title=element_text(size=10)) +
geom_point(data = src, mapping=aes(x=x, y=y), size=2, shape=16, colour="grey20") +
coord_equal() + geom_text(aes(label=sprintf(fmt=sfmt, Kernel.function)), size=3, colour="#555555")+
scale_x_continuous(breaks=seq(0,10,2)) + scale_y_continuous(breaks=seq(0,10,2)) + labs(x="",y="")
}
fesri <- function(sigma, n=5) {
m <- matrix(nc=n, nr=n)
col <- rep(1:n, n)
row <- rep(1:n, each=n)
x <- col - ceiling(n/2)
y <- row - ceiling(n/2)
# See http://math.stackexchange.com/a/41191
ax <- (x^2 + y^2) / sigma^2
m[cbind(row, col)] = ifelse(abs(ax) < sigma,
(1 - ax)^2,
0)
# sum of weights should add up to 1
m / sum(m)
}
r = rast(nrows=cell,ncols=cell,xmin=0,xmax=cell,ymin=0,ymax=cell)
r[] = 0
Psf <- st_as_sf(P)[-1,]
P.r = rasterize(Psf, r, field=1, fun=sum) # Cell value is assigned number of points
P.r[is.na(P.r[])] = 0
# Kernel function
ff=fesri(1.52,n=5) # An approximation of ESRI's kernel density function
d.k = focal(P.r, w=ff, na.rm=T, pad=T, padValue=0)
#sg = brick(d.k)
sg = d.k
names(sg) = c("Kernel.function")
gp.text(sg, nr=1, sfmt="%0.2f", x1=0,x2=cell,y1=0,y2=cell,src=P)
```
While the basic kernel density approach involves uniformly applying a kernel function across the study area to produce a smooth map of estimated point intensity, its usefulness extends to understanding how this intensity relates to covariates. A common exploratory step is to visually or statistically compare the computed kernel density map with a map of an underlying covariate (such as terrain elevation or population density). This helps identify potential spatial relationships, for instance, observing if areas of high event density tend to correspond with areas of high covariate values.
### Non-parametric Intensity Modeling with Covariates
In an earlier section, we learned that we could use a covariate, like elevation, to define the sub-regions (quadrats) within which densities were computed.
Here, instead of dividing the study region into discrete sub-regions, we estimate an **intensity function**, denoted $\rho(z)$, that describes how point intensity varies as a function of a covariate $z$ (e.g., elevation). This function is then used to model the spatial intensity $\widehat{\lambda}(x)$ at each location $x$ in the study area.
In this framework, $\rho(z)$ captures how intensity varies with the covariate, while $\widehat{\lambda}(x)$ applies this relationship spatially across the study region using the covariate’s spatial distribution.
$\rho(z)$ can be estimated in one of three different ways--by *ratio*, *re-weight* and *transform* methods. We will not delve into the differences between these methods, but note that there is more than one way to estimate $\rho(z)$ in the presence of a covariate.
In the following example, the elevation raster is used as the covariate in the $\rho(z)$ function using the *ratio* method (The *ratio* method compares the observed density of points at different covariate values to the density of the covariate itself, producing an estimate of relative intensity). The right-most plot maps the modeled intensity as a function of elevation.
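A minimal sketch of this workflow with `rhohat()`, using a simulated pattern and covariate (object names are hypothetical):

```{r}
# A sketch of estimating rho(z) with the ratio method (simulated data)
library(spatstat)
set.seed(1)
P31      <- runifpoint(31, win = owin(c(0, 10), c(0, 10)))
elev_sim <- density(P31, sigma = 5)  # stand-in for an elevation covariate
rho_sim  <- rhohat(P31, elev_sim, method = "ratio")  # estimate rho(z)
lam_sim  <- predict(rho_sim)  # map the fitted rho back over the region
```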
```{r f11-kernel03, fig.cap="An estimate of $\\rho$ using the ratio method. The figure on the left shows the point distribution superimposed on the elevation layer. The middle figure plots the estimated $\\rho$ as a function of elevation. The envelope shows the 95% confidence interval. The figure on the right shows the modeled density of $\\widehat{\\lambda}$ which is a function of the elevation raster (i.e. $\\widehat{\\lambda}=\\widehat{\\rho}_{elevation}$).", fig.height=2.1, fig.width=8.2, echo=FALSE}
rho <- rhohat(P, elev, method="ratio")
pred <- predict(rho)
cl <- interp.colours(c("lightyellow", "orange" ,"red"), 100)
OP <- par(mfrow=c(1,3), mar=c(2.5,3.5,0.2,4), pty="s" )
plot(rast(elev), col=grey(50:0/50), main="", las=1,cex.axis=0.9,mgp=c(1.5,0.5,0))
plot(P, col=rgb(0,0,0,.7), add=T, pch=20)
plot(rho, main=NULL, las=1, legend=FALSE, do.rug=FALSE, cex.lab=1.2, cex.axis=0.9, mgp=c(1.5,0.5,0))
plot(rast(pred), col=cl, las=1, main="", cex.lab=1.5, cex.axis=0.9,mgp=c(1.5,0.5,0))
par(OP)
```
We can compare the modeled intensity function to the kernel density function of the observed point pattern via a scatter plot.
```{r f11-kernel04, fig.height=2.5, fig.width=2.5, echo=FALSE, fig.cap="Scatter plot of predicted intensity vs. observed kernel density."}
OP <- par(mar=c(3,3,1,1), mgp=c(2.0,0.8,0), cex=0.8)
K1 <- density(P)
K1_vs_pred <- pairs(K1, pred, plot = FALSE)
plot(K1_vs_pred$pred ~ K1_vs_pred$K1, pch=20,
xlab = "Observed intensity",
ylab = "Predicted intensity",
col = rgb(0,0,0,0.1))
abline(a=0, b=1, col = "red")
grid()
par(OP)
```
A red one-to-one diagonal line is added to the plot, which compares the predicted intensity (based on elevation) to the observed kernel density. While the general trend is positive, the relationship is not strictly linear. This deviation may reflect uncertainty in regions with sparse data, such as high elevations, where fewer points lead to less reliable estimates. This uncertainty is apparent in the $\rho_{elev}$ *vs. elevation* plot of \@ref(fig:f11-kernel03), where the 95% confidence envelope widens at higher elevation values--a common issue in spatial modeling when covariate values are unevenly distributed.
Modeling intensity as a function of a covariate allows us to explore first-order effects more rigorously. It helps determine whether external factors--like elevation or population density--can explain the spatial distribution of points.
Note that this method, while still utilizing kernel-based techniques, offers a more integrated approach than the basic kernel density estimation described in the previous section. In that method, we first produce a map of point density (by uniformly applying a kernel over each event) that can then be compared to a covariate map to identify potential associations. Here, however, the process is conceptually different: we use kernel-based methods to directly model how the point intensity varies as a function of a covariate. This means the covariate itself is woven into the estimation, allowing the kernel to effectively adapt its smoothing, or its contribution to the density estimate, so that the resulting intensity map inherently reflects the covariate's influence.
This approach provides a non-parametric estimate of how point intensity varies with a covariate, offering insight into first-order effects. The function $\rho(z)$ captures this relationship without assuming a specific parametric form. In the next section, we explore a complementary approach--**Poisson point process modeling**--which allows us to formally specify intensity as a log-linear function of covariates.
### Parametric Intensity Modeling with Covariates
Up to this point, we have focused on techniques that describe the distribution of points across a region of interest. However, it is often more insightful to model the relationship between point locations and an underlying covariate using a formal statistical framework. This can be done by exploring the *changes* in point density as a function of a covariate, however, unlike techniques explored thus far, this approach makes use of a statistical model. One such model is a **Poisson point process model (PPM)** which can take the form:
$$
\begin{equation}
\lambda(i) = e^{\alpha + \beta Z(i)}
\label{eq:density-covariate}
\end{equation}
$$
This equation defines how the expected number of points per unit area (intensity) varies as a function of the covariate $Z(i)$. The term $\lambda(i)$ is the modeled intensity at location $i$, $e^{\alpha}$ (the exponent of $\alpha$) is the base intensity when the covariate is *zero* and $e^{\beta}$ is the multiplier by which the intensity increases (or decreases) for each 1 unit increase in the covariate $Z(i)$. The parameters $\alpha$ and $\beta$ are estimated from the data using maximum likelihood, which finds the values that best explain the observed point pattern given the covariate surface. Note that the units of intensity are derived from the coordinate reference system (e.g., stores per square kilometer), and the interpretation of $\beta$ depends on the scale and units of the covariate.
The equation is a form of a **log-linear** model, used in spatial statistics to model intensity. While it resembles logistic regression in structure, the left-hand side represents intensity, not probability. Note that taking the log of both sides of the equation yields the more familiar linear regression model where $\alpha + \beta Z(i)$ is the *linear predictor*.
```{block, type="note"}
Note: While the Poisson point process model uses a log-linear form to model intensity, logistic regression models the *probability* of occurrence. These two frameworks are related through the transformation $\lambda = P / (1 - P)$, which leads to:
$$
P(i) = \frac{e^{\alpha + \beta Z(i)}}{1 + e^{\alpha + \beta Z(i)}}
$$
This relationship is useful when interpreting intensity in terms of probability, especially in presence/absence modeling.
```
This PPM assumes that events occur independently, and that the intensity varies smoothly with the covariate. It does not account for interactions between points (i.e., second-order effects).
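A model of this form can be fit with `spatstat`'s `ppm()` function. The sketch below uses simulated data with known parameters ($\alpha = -2$, $\beta = 0.3$) so the estimates can be checked against the truth; all object names are hypothetical:

```{r}
# A sketch of fitting the log-linear PPM on simulated data
library(spatstat)
set.seed(1)
win_sim <- owin(c(0, 10), c(0, 10))
Z_sim   <- as.im(function(x, y) x, W = win_sim)  # covariate rising eastward
P_sim   <- rpoispp(function(x, y) exp(-2 + 0.3 * x), win = win_sim)
fit     <- ppm(P_sim ~ Z_sim)  # lambda(i) = exp(alpha + beta * Z(i))
coef(fit)                      # estimates of alpha and beta
```

With enough simulated points, the fitted coefficients should land close to the true values of -2 and 0.3.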
Let's work with the point distribution of Starbucks cafes in the state of Massachusetts. The point pattern clearly exhibits a non-random distribution. It might be helpful to compare this distribution to some underlying covariate such as the population density distribution.
```{r f11-starbucks, fig.cap="Location of Starbucks relative to population density. Note that the classification scheme follows a log scale to more easily differentiate population density values.", message=FALSE, fig.height=3, fig.width=7, echo=FALSE, results='hide'}
# Load Starbucks point locations
s <- st_read("./Data/Starbucks.shp")
# Load population density raster layer
r <- rast("./Data/pop_sqmile.img")
pop <- r
# Display data
plot(log(pop ), grid = FALSE, axes = FALSE)
plot(s$geometry, add = TRUE, pch=20, col=rgb(0,0,0,0.5), cex=1.3)
```
We can fit a Poisson point process model to these data, where the modeled intensity takes the form:
$$
\begin{equation}
Starbucks\ density(i) = e^{\alpha + \beta\ population(i)}
\label{eq:starbucks-model}
\end{equation}
$$
The parameters $\alpha$ and $\beta$ are estimated using *maximum likelihood*. Its implementation is not covered here but is widely covered in general statistics textbooks. The index $(i)$ serves as a reminder that both the point density and the population distribution can vary as a function of location $i$.
```{r echo=FALSE}
#options(digits=10)
pop_xy <- xyFromCell(pop, cel = 1:prod(dim(pop)))
pop_z <- pop[]
pop_df <- as.data.frame(cbind(pop_xy, pop_z))
names(pop_df) <- c("x","y","z")
pop_im <- as.im(pop_df)
pop_im <- spatstat.geom::rescale(pop_im, 1000, "km")
P2 <- as.ppp(s)
P2 <- spatstat.geom::rescale(P2, 1000, "km")
marks(P2) <- NULL
PPM1 <- ppm(P2 ~ pop_im)
a <- round(PPM1$coef[1] , 3)
b <- round(PPM1$coef[2] , 5)
```
The estimated value for $\alpha$ in our example is `r a`. This is interpreted as stating that given a population density of *zero*, the base intensity of the point process is e^`r a`^ or `r sprintf("%g",exp(a))` cafes per square kilometer (the units are derived from the point's reference system)--a number close to zero (as one would expect). The estimated value for $\beta$ is `r sprintf("%g",b)`. This means that for every unit increase in the population density derived from the raster, the intensity of the point process increases by a factor of e^`r sprintf("%g",b)`^ or `r sprintf("%g",exp(b))`.
The plot below shows the fitted intensity curve as a function of population density, along with a confidence envelope. This visualization helps assess whether the relationship is approximately linear and whether uncertainty increases at extreme covariate values.
```{r f11-pppm01,fig.cap="Poisson point process model fitted to the relationship between Starbucks store locations and population density. The model assumes a log-linear relationship. Note that the density is reported in number of stores per map unit area (the map units are in kilometers).", message=FALSE, fig.height=2.5, fig.width=3.6, echo=FALSE, results='hide'}
PPM1 <- ppm(P2 ~ pop_im)
PPM0 <- ppm(P2 ~ 1)
OP <- par(mar=c(4,4,0,0))
plot(effectfun(PPM1, "pop_im", se.fit=TRUE), main=NULL, cex.axis=0.8,cex.lab=0.8, mgp=c(1.5,0.5,0),
legend=FALSE)
par(OP)
#anova(PPM0, PPM1, test="LR")
```
In the next section, we explore how to formally compare this covariate-based model to a simpler null model (e.g., homogeneous intensity) using statistical tests. This allows us to assess whether the covariate meaningfully improves our understanding of the spatial distribution of points.
## Hypothesis Testing for Covariate Effects {#PPM}
A Poisson point process model can be fit to any observed point pattern, but fitting a model alone does not guarantee that it provides a meaningful explanation of the spatial distribution. To evaluate whether a covariate-based model improves our understanding of the point pattern, we compare it to a simpler baseline model--typically one that assumes homogeneous intensity across space. In this framework, the homogeneous model serves as the **null hypothesis**, while the covariate-based model represents the **alternative hypothesis**.
For example, we may want to assess whether the spatial distribution of Walmart stores is better explained by population density (the alternative hypothesis) than by a model that assumes no spatial preference (the null hypothesis).
```{r f11-walmarts, fig.cap="Location of Walmart stores relative to population density. Note that the classification scheme follows a log scale to more easily differentiate population density values.", message=FALSE, fig.height=3, fig.width=7, echo=FALSE, results='hide'}
library(spatstat)
library(terra)
library(sf)
# Load Starbucks point locations
s <- st_read("./Data/Walmarts.shp")
P <- as.ppp(s)
marks(P) <- NULL
P <- spatstat.geom::rescale(P, 1000, "km")
# Load the extent
W <- readRDS("./Data/ma.rds")
W <- as.owin(st_geometry(st_as_sf(W))) # Create Owin object
W <- spatstat.geom::rescale(W, 1000, "km")
# Define point extent
Window(P) <- W
# Load population density raster layer
r <- rast("./Data/pop_sqmile.img")
pop_xy <- xyFromCell(r, cell = 1:ncell(r))
pop_z <- r[]
pop_df <- as.data.frame(cbind(pop_xy, pop_z))
names(pop_df) <- c("x","y","z")
pop <- as.im(pop_df)
pop <- spatstat.geom::rescale(pop, 1000, "km")
# Fit the model that assumes that density is a function
# of population (the alternative hypothesis)
PPM1 <- ppm(P ~ pop)
# Fit the model that assumes that density is not a function
# of population (the null hypothesis)
PPM0 <- ppm(P ~ 1)
# Display data
plot(log(r), grid = FALSE, axes = FALSE)
plot(s$geometry, add = TRUE, pch=20, col=rgb(0,0,0,0.5), cex=1.3)
```
To test this, we fit both models to the observed data and compare their performance using a **likelihood ratio test**.
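In `spatstat`, the two models can be fit and compared in a few lines. A sketch, assuming the point pattern `P` and covariate image `pop` prepared as above:

```{r eval=FALSE}
# Alternative hypothesis: intensity is a log-linear function of population
PPM1 <- ppm(P ~ pop)
# Null hypothesis: intensity is constant across the study area
PPM0 <- ppm(P ~ 1)
# Likelihood ratio test comparing the two models
anova(PPM0, PPM1, test = "LRT")
```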
A Poisson point process model (of the Walmart point pattern) implemented using `spatstat`'s `ppm` function [@Spatstat] produces the following output for the null model:
```{r echo=FALSE}
PPM0
```
and the following output for the alternative model.
```{r echo=FALSE}
PPM1
```
Thus, the null model (homogeneous intensity) takes on the form:
$$
\lambda(i) = e^{-6.2}
$$
and the alternate model takes on the form:
$$
\lambda(i) = e^{-6.3 + 1.04\times 10^{-4}\, population}
$$
To better understand how the two models differ, we can visualize the estimated intensity ($\widehat{\lambda}$) as a function of population density for both the null and alternative models.
```{r f11b_Ho_vs_H,fig.cap="Estimated intensity functions from the Poisson point process model: the left plot shows the covariate-based model (intensity as a function of population density), while the right plot shows the null model (constant intensity). Note the broader confidence interval in the covariate-based model, indicating greater uncertainty at extreme population values (the map units are in kilometers).", message=FALSE, fig.height=2.5, fig.width=8, echo=FALSE, results='hide'}
yrange <- range(c(effectfun(PPM1, "pop", se.fit=TRUE)$hi, effectfun(PPM1, "pop", se.fit=TRUE)$lo))
OP <- par(mfrow = c(1,2), mar=c(4,4,0,0))
plot(effectfun(PPM1, "pop", se.fit=TRUE), main=NULL, ylim = yrange,
cex.axis=0.8,cex.lab=0.8, mgp=c(1.5,0.5,0), legend=FALSE)
plot(effectfun(PPM0, "pop", se.fit=TRUE), main=NULL, ylim = yrange,
cex.axis=0.8,cex.lab=0.8, mgp=c(1.5,0.5,0), legend=FALSE)
par(OP)
#anova(PPM0, PPM1, test="LR")
```
The shapes of the fitted intensity curves clearly differ between the two models, but this alone does not imply that the covariate-based model is significantly better than the null model. The relatively wide confidence intervals in the covariate-based model, especially at higher population densities, suggest uncertainty in the estimated relationship. This highlights the importance of formal statistical testing to determine whether the observed differences are meaningful.
To formally assess whether the covariate-based model provides a significantly better fit than the null model, we use a likelihood ratio test. This test compares the log-likelihoods of the two models and evaluates whether the improvement in fit is statistically significant.
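The test statistic is a standard result: twice the difference in log-likelihoods between the two nested models,

$$
D = 2\left[\, \ell(\text{alternative}) - \ell(\text{null}) \,\right],
$$

which, under the null hypothesis, approximately follows a $\chi^2$ distribution with one degree of freedom, since the two models differ by a single parameter ($\beta$).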
```{r,echo=FALSE, message=FALSE, warning=FALSE}
knitr::kable(anova(PPM0, PPM1, test="LRT"))
```
The value under the heading `Pr(>Chi)` is the p-value: the probability, if the null model were true, of observing an improvement in fit at least this large by chance alone. The resulting p-value of 0.039 indicates that, if Walmart locations were randomly distributed with constant intensity, there would be only a 3.9% chance of seeing the covariate-based model improve the fit this much. This provides some evidence that population density improves the model's ability to explain the spatial distribution of Walmart stores.
In summary, the likelihood ratio test suggests that population density is a meaningful covariate in explaining the spatial distribution of Walmart stores, though some uncertainty remains in the estimated relationship.
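Once a covariate-based model is deemed useful, its fitted intensity surface can be mapped back onto the study area. A sketch, assuming the `PPM1` object from above:

```{r eval=FALSE}
# Map the fitted intensity surface implied by the covariate model
lambda_hat <- predict(PPM1)  # an im object (stores per square kilometer)
plot(lambda_hat, main = NULL)
```

This can be a useful visual check: regions where the predicted intensity departs strongly from the observed point density may point to covariates the model is missing.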
## Summary
This chapter introduced methods for analyzing first-order properties of spatial point patterns—those driven by external factors rather than interactions between points. We began with descriptive tools such as global and local density measures, including quadrat and kernel density estimation. These methods help identify spatial variation in point intensity and can reveal associations with underlying covariates.
We then explored how covariates can be used to model intensity more formally, both non-parametrically and through parametric approaches like the Poisson point process model (PPM). The PPM allows us to test hypotheses about whether a covariate meaningfully explains the spatial distribution of points, using statistical tools such as the likelihood ratio test.
Importantly, this chapter is limited to first-order effects. While second-order properties—such as point interactions—are central to spatial analysis, they are addressed in the next chapter.