Session 9.2: Introduction to Stochastic Partial Differential Equations (SPDE) using R-INLA

# Session 9.2: Introduction to Stochastic Partial Differential Equations (SPDE) using R-INLA

###

---

.my-footer[ 
.alignleft[ 
&nbsp; &copy; Pirani | Blangiardo | Villejo
]
.aligncenter[
MSc Epidemiology 
]
.alignright[
, NA 
]
]

---

# Learning Objectives

At the end of this session you should be able to:

- understand the basics of the Stochastic Partial Differential Equation (**SPDE**) approach;

- implement the SPDE approach using the `R-INLA` package;

- perform spatial prediction and mapping.

The topics covered in this lecture can be found in:

- Sections 6.5:6.8 of the book **Spatial and Spatio-Temporal Bayesian models with R-INLA** (website for data and code: https://github.com/michelacameletti/Spatial-and-Spatio-Temporal-Bayesian-Models-with-R-INLA)

- Chapters 8 and 9 of the book **Geospatial Health Data: Modeling and Visualization with R-INLA and Shiny**  https://www.paulamoraga.com/book-geospatial/index.html

- Chapter 2 of the book **Advanced Spatial Modeling with Stochastic Partial Differential Equations Using R and INLA** https://becarioprecario.bitbucket.io/spde-gitbook/

---

# Outline

1\. [Basics of the SPDE approach](#spdeapproach)

2\. [The SPDE approach with `R-INLA`](#spdeinla)

3\. [Spatial prediction using SPDE approach](#predi)

---

---

# A common model for geostatistical (noisy) data

- Usually the following (mixed-effects) model is assumed

`$$\color{blue}{y(\bm{s}) = \mu(\bm{s}) + \xi(\bm{s}) +\epsilon(\bm{s})}$$`

where

- `$\mu(\bm{s})$` is the so-called .red[**large scale**] component, including linear or non linear of covariates.

- `$\xi(\bm{s})$` is a zero mean latent .red[**Gaussian spatial process**] commonly assumed to be stationary and isotropic with covariance function `$Cov(\xi(\bm{s_i}),\xi(\bm{s_j}))$` which depends only on the distance between the locations.

- `$\epsilon(\bm{s})$` represents the Gaussian .red[**measurement error**] (independent from `$\xi(\bm{s})$`) and its variance `$(\sigma^2_{\epsilon})$` is usually known as **nugget effect**.

Given the data from `$n$` locations `$(\bm{s}_{1},\ldots,\bm{s}_{n})$` we have:

`\begin{align*}
y(s_i)\mid\mu(\bm{s}_i),\xi(\bm{s}_i),\sigma^2_{\epsilon} & \sim \text{Normal}(\mu(\bm{s}_i)+\xi(\bm{s}_i),\sigma^2_{\epsilon})\\
\bm{\xi} & \sim \text{GF}(\bm 0,\bm\Sigma)
\end{align*}`

where `$\bm\Sigma$` is a **dense** matrix defined by a spatial covariance function `$\mathcal C(||\bm{s}_{i}-\bm{s}_{j}||)$`.

- Several functions are available for the **spatial covariance function** (e.g. exponential, Mat&eacute;rn, spherical, etc.) parameterized by some parameters, e.g. spatial variance, range (Banerjee, Carlin, and Gelfand, 2014).

---

# Mat&eacute;rn covariance function

The .red[**Mat&eacute;rn covariance function**] is defined by

`$$\mbox{Cov}(\xi(\bm{s}_i),\xi(\bm{s}_j))= \mbox{Cov}(\xi_i,\xi_j)=\frac{\sigma^2}{\Gamma(\lambda)2^{\lambda-1}}(\kappa ||\bm{s}_i - \bm{s}_j||)^\lambda K_\lambda(\kappa ||\bm{s}_i - \bm{s}_j||)$$`

where:
- `$||\bm{s}_i - \bm{s}_j||$` is the Euclidean distance between two generic locations `$\bm{s}_i, \bm{s}_j \in \mathbb{R}^2$`,

- `$\sigma^2$` is the marginal variance of the GF,

- `$K_\lambda$` denotes the modified Bessel function of second kind and order `$\lambda>0$`, which measures the degree of **smoothness** of the process (it is usually kept fixed due to poor identifiability),

- `$\kappa>0$` is a **scale parameter** related to the **range** `$r$`, i.e. the distance at which the spatial correlation becomes almost null. Typically, the empirically derived definition for the range is `$r=\frac{\sqrt{8 \lambda}}{\kappa}$` (Lindgren, Rue, and Lindström, 2011), with `$r$` corresponding to the distance at which the spatial correlation is close to 0.1, for each `$\lambda\geq 1/2$`.

- The Mat&eacute;rn family is a very flexible class of covariance functions able to cover a wide range of spatial fields.

---

# The effect of range

Simulation of different GFs with `$\sigma=1$` and different ranges:

---

# Model based approach for estimation

- In the Bayesian framework, the classical approach for model estimation is Markov chain Monte Carlo methods (MCMC) considering that the likelihood function is a multivariate Gaussian distribution (see for example the `spBayes` and `spTimer` `R` packages).

- This requires to compute the Cholesky factorization of the dense covariance matrix `$\bm \Sigma$` which is computationally very expensive. This is known as **big `$n$` problem** .

- The .red[**stochastic partial differential equation (SPDE) approach**] (Lindgren, Rue, and Lindström, 2011) is an alternative to the use of MCMC: it represents the continuous spatial process `$\xi(\bm{s})$` with Mat&eacute;rn covariance function using a discretely indexed spatial random process (i.e., a Gaussian Markov Random Field - GMRF),  which is characterized by a sparse precision matrix and enjoys computational benefits in terms of fast inference.

---

---

# The SPDE approach for Mat&eacute;rn GF

- The starting point is the **linear fractional stochastic partial differential equation** (SPDE)

`$$\color{blue}{(\kappa^2 - \Delta)^{\alpha/2}(\tau \xi(\bm{s}))={ \mathcal W} (\bm{s})}$$`

where `$\bm{s} \in \mathbb{R}^d$`, `$\Delta$` is the Laplacian operator,  `$\alpha>0$`  is the **smoothness** term, `$\kappa>0$` is the scale parameter, `$\tau$` controls the variance and `${\mathcal W}(\bm{s})$` is a Gaussian spatial white noise process (with unit variance).

- **Whittle** in 1954 showed that the exact and stationary solution to this SPDE is the stationary Gaussian field `$\xi(\bm{s})$`with Mat&eacute;rn covariance function.

- **Lindgren, Rue, and Lindström (2011)** represent the solution of the SPDE using the finite element method (this is possible only for some values of the smothness parameter).

- In `$\mathbb R^2$` the link between the SPDE parameters `$\tau, \alpha, \kappa$` and the Mat&eacute;rn function parameters `$\sigma^2, \lambda, \kappa$` is given by

`$$\lambda = \alpha-1 \qquad\qquad \sigma^2 = \frac{\Gamma(\lambda)}{\Gamma(\alpha)(4\pi)\kappa^{2\lambda}\tau^2}$$`

-  In `R-INLA` the default value for the smoothness parameter is `$\alpha=2$` `$\rightarrow$` `$\lambda=1$`. Consequently

`$$r=\sqrt{8\lambda}/\kappa= \sqrt{8}/\kappa \text{ (see slide 6)} \qquad\qquad \sigma^2=1/(4\pi\kappa^2\tau^2)$$`

---

# Piecewise linear approximation

- The solution to the SPDE can be approximated through a **basis function representation** defined on a **triangulation** of the spatial domain

`\begin{align*}
\color{blue}{\xi(\bm{s})=\sum_{g=1}^{G}\varphi_g(\bm{s})\tilde\xi_g}
\end{align*}`

where `$G$` is the total number of triangulation vertices, `$\{\varphi_{g}\}$` is the set of (deterministic) basis functions and `$\{\tilde\xi_{g}\}$` are zero mean Gaussian distributed weights.

- In order to obtain a Markov structure, the basis functions are chosen to be .red[**piecewise linear**] in each triangle, i.e. `$\varphi_{g}$` is 1 at vertex  `$g$` and 0 at all other vertices.

---
# Approximation in 2-Dimensions

---

# Piecewise linear approximation

- The precision matrix `$\bm Q$` for the Gaussian weight vector `$\tilde {\bm \xi}=\{\tilde\xi_{1},\ldots,\tilde\xi_{G}\}$` is given by

`$$\color{blue}{\bm Q=\tau^2\left(\kappa^4 \bm C+2\kappa^2 \bm G +\bm G\bm C^{-1}\bm G\right)}$$`
where

 - the generic element of the diagonal matrix `$\bm C$` is `$C_{ii}=\int \varphi_i(\bm s) \text{d}\bm s$`, 
 
 - the generic element of the sparse matrix `$\bm G$` is `$G_{ij}=\int \nabla\varphi_i(\bm s) \nabla\varphi_j(\bm s)\text{d}\bm s$` (where `$\nabla$` denotes the gradient),

.content-box-green[
The precision matrix  `$\bm Q$`, whose elements depend on `$\tau$` and `$\kappa$`, is sparse and consequently  `$\bm \xi$`  is a GMRF  distributed as `$\text{Normal}(\bm 0,\bm Q^{-1})$`: it represents the approximated solution to the SPDE. 
]

---

# Take home message

- Do the modelling using GF and the computations using the GMRF representation (computational advantages thanks to algorithms for sparse matrices):

`\begin{align}
\bm\xi &\sim \text{Normal}(\mathbf 0, \bm \Sigma)  \Longrightarrow  \; \tilde {\bm \xi} \sim \text{Normal}\left(\bm 0,\bm Q^{-1}\right)\nonumber
\end{align}`
]

---

---

# SPDE toy example

.pull-left[
We use the `SPDEtoy`  dataset , consisting in 200 simulated values for the variable `y` which refer to as many randomly sampled locations in the unit square area delimited by the points `$(0,0)$` and `$(1,1)$` and with coordinates given by `s1`  and  `s2`.

``` r
> library(INLA)
> data(SPDEtoy)
> dim(SPDEtoy)
```

```
[1] 200   3
```

``` r
> head(SPDEtoy, n=3)
```

```
          s1         s2         y
1 0.08265625 0.05640625 11.521206
2 0.61230625 0.91680625  5.277960
3 0.16200625 0.35700625  6.902959
```
]
.pull-right[

``` r
> summary(SPDEtoy$y)
```

```
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  5.210   8.688  10.023   9.858  11.307  14.600 
```
]
---

# Plot of toy data

``` r
> library(tidyverse)
> 
> SPDEtoy %>% 
+ ggplot() +
+ geom_point(aes(s1,s2, col=y),size=2) 
```
]
.pull-right[
<img src="./img/spdetoydataout-1.png" >
]

---

# SPDE toy example: model for simulation

- The model used for simulating the `SPDEtoy` data assumes that the distribution of the observation  `$y_{i}$`  is

`\begin{align*}
\color{blue}{y_i\mid\eta_i,\sigma^2_e \sim \text{Normal}(\eta_i, \sigma^2_e) \qquad i=1,\ldots, 200 \nonumber}
\end{align*}`

where `$\sigma^{2}_e$` is the variance of the zero mean measurement error `$e_i$` which is supposed to be Gaussian iid. 
 

- The response mean, which coincides with the **linear predictor**, is defined as

`\begin{align*}
\color{blue}{\eta_i = b_0 +  \xi_i}
\end{align*}`

and includes the intercept `$b_0$` and a random effect represented by `$\xi_i$`, which is the realization of the latent GF `$\xi(\bm s)\sim \text{MVNormal}(\bm 0,\bm \Sigma)$`.  The covariance matrix `$\bm \Sigma$`  is defined by the Mat&eacute;rn spatial covariance function.

- The parameter values chosen for simulating the data are: `$b_0=10$`, `$\sigma^2_e=0.3$`, `$\sigma^2=5$`, `$\kappa=7$`, `$r=\frac{\sqrt{8}}{\kappa}=0.404$`.

---

# The SPDE representation and the projection matrix

- Using the SPDE basis function representation, the linear predictor `$\eta_i$` can be rewritten as

`$$\color{blue}{\eta_{i} = b_0 + \sum_{g=1}^{G} \varphi_g(\bm s_i) \tilde {\xi_g}}$$`

where `$\varphi_g(\bm s_i)$`  is the value of the `$g$`-th basis function evaluated in `$\bm s_i$`.

- More generally it is possible to express the linear predictor as

`$$\color{blue}{\eta_i= b_0 + \sum_{g=1}^{G}  A_{ig} \tilde {\xi_g}}$$`

with `$A_{ig}=\varphi_g(\bm s_i)$` being the generic element of the sparse matrix `${\bm  A}$` (known as **projection matrix**) which maps the GMRF `${\tilde {\bm \xi}}$`  from the `$G$`  triangulation vertices to the `$n$` observation locations. This allows the SPDE model to be treated as standard indexed random effects.

---

# Mesh construction

- The SPDE approach is based on a .red[**triangulation**] of the spatial domain given by the **mesh**.

- The definition of the mesh is a **trade-off** between the accuracy of the GMRF representation and computational costs, both depending on the number of vertices used in the triangulation: the bigger the number of mesh triangles, the finer the GF approximation but the higher the computational costs.

- To create the mesh in `R-INLA` we use the helper function `inla.mesh.2d`. The arguments for a two-dimensional mesh construction can be checked using `args(inla.mesh.2d)`.

- Here we are going to start with the following options:

 - `loc` or `loc.domain`: specify information about the spatial domain 
 
 - `max.edge`: specify the largest allowed triangle edge length. If a vector of two values is provided, the spatial domain is divided into an inner and an outer area whose triangle resolution is specified by `max.edge` (the higher the value for `max.edge` the lower the resolution and the accuracy).

---

# Mesh: `max.edge` - mesh1

``` r
> coords = as.matrix(SPDEtoy[,1:2])
> mesh1 = inla.mesh.2d(loc = coords,
*+                       max.edge = c(0.1, 0.1))
```

``` r
> library(inlabru)
> ggplot() + 
+   gg(mesh1) +
+   geom_point(data = data.frame(coords),
+              aes(s1, s2))
```
]

# Mesh: `max.edge` - mesh2

``` r
> mesh2 = inla.mesh.2d(loc = coords,
*+                       max.edge = c(0.1, 0.2))
```

``` r
> ggplot() + 
+   gg(mesh2) +
+   geom_point(data = data.frame(coords),
+              aes(s1, s2))
```
]

---

# Mesh optional arguments: `offset` - mesh3

.pull-left[
The option `offset` of the `inla.mesh.2d` function can be used to define how much the domain should be extended in the inner and outer part. The default values are `offset = c(-0.05, -0.15)`.

``` r
> mesh3 = inla.mesh.2d(loc = coords,
+                       max.edge = c(0.1, 0.2),
*+                       offset = c(0.4,0.1))
```

``` r
> ggplot() + 
+   gg(mesh3)+
+   geom_point(data = data.frame(coords),
+              aes(s1, s2))
```
]

---
# Mesh optional arguments: `offset` - mesh4

``` r
> mesh4 = inla.mesh.2d(loc = coords,
+                       max.edge = c(0.1, 0.2),
*+                       offset = c(0.1,0.4))
```

``` r
> ggplot() + 
+   gg(mesh4) +
+   geom_point(data = data.frame(coords),
+              aes(s1, s2))
```
]

---

# Mesh optional arguments `cutoff`

Instead of the point coordinates, it is possible to use other point locations used to determine the domain extent.

``` r
> domain = matrix(cbind(c(0,1,1,0.7,0),
+                   c(0,0,0.7,1,1)), ncol=2)
*> mesh5domain = inla.mesh.2d(loc.domain = domain,
+                       max.edge = c(0.04, 0.2), 
+                       offset = c(0.1, 0.4))
```

``` r
> ggplot() + 
+   gg(mesh5domain) +
+   geom_point(data = data.frame(domain), 
+              aes(X1, X2)) +
+     geom_point(data = data.frame(coords),
+              aes(s1, s2), col = "red", 
+              alpha = 0.5)
```
]

---
# Mesh optional arguments `cutoff` - mesh5

.pull-left[
The option `cutoff` can be used to avoid building too many small triangles around clustered data locations (the default value is equal to 0).

``` r
> mesh5 = inla.mesh.2d(loc.domain = domain, 
+                       max.edge = c(0.04, 0.2), 
*+                       cutoff = 0.5,
+                       offset = c(0.1, 0.4))
```

``` r
> ggplot() + 
+ gg(mesh5) +
+ geom_point(data = data.frame(domain),
+ aes(X1, X2)) +
+ geom_point(data = data.frame(coords),
+ aes(s1, s2), col = "red", 
+ alpha = 0.5)
```
]
.pull-right[
<img src="./img/explore_plot_out5-1.png" >
]

---
# Putting all together - mesh6

``` r
> mesh6 = inla.mesh.2d(loc.domain = domain,
+                       max.edge = c(0.04, 0.2),
+                       cutoff = 0.05,
+                       offset = c(0.1, 0.4))
```

``` r
> ggplot() + 
+   gg(mesh6) +
+   geom_point(data = data.frame(domain),
+              aes(X1, X2)) +
+     geom_point(data = data.frame(coords),
+              aes(s1, s2), col = "red", 
+              alpha = 0.5)
```
]

---

# Mesh: non convex hull
.pull-left[
A feature in `R-INLA` named `inla.nonconvex.hull`  makes it possible to compute a non convex hull to be included as boundary in the mesh construction. This can be particularly useful when the shape of the domain is of some importance.

``` r
> set.seed(44)
> loc = matrix(runif(20), 10, 2)
> 
> boundary = inla.nonconvex.hull(loc,
*+                                convex=0.2)
> meshNC = inla.mesh.2d(loc = loc,
*+                       boundary = boundary,
+                       max.edge = c(0.04, 0.2))
```

``` r
> ggplot() + 
+   gg(meshNC)+
+   geom_point(data = data.frame(loc),
+              aes(X1, X2))
```

]

---

# Creation of the Mat&eacute;rn SPDE model for `mesh6`

- We create the Mat&eacute;rn **SPDE model object** with the function `inla.spde2.matern` on the mesh. By default the smoothness term is 2

- Here we create the SPDE object using the `mesh6` triangulation:

``` r
> spde = inla.spde2.matern(mesh = mesh6)
> spde$n.spde # number of vertices in the mesh
```

```
[1] 549
```

---

# PC prior for the Mat&eacute;rn model
 
- Instead of `inla.spde2.matern` it is possible to use `inla.spde2.pcmatern` for creating an `inla.spde2` model object using a PC prior for the range `$r$` and the marginal standard deviation `$\sigma$` (Simpson, Rue, Riebler, Martins, and Sorbye, 2017).

- The prior for `$\sigma$` is such that 
$$
Pr(\sigma>\sigma_0)=p
$$
and this requires to specify `$\sigma_0$` and `$p$` with the option `prior.sigma = c(sigma0,p)`.
 

- The prior for `$r$` is such that 
$$
Pr(r<r_0)=p
$$
and this requires to specify `$r_0$` and `$p$` with the option `prior.range = c(r0,p)`.

For example:

``` r
> spde = inla.spde2.pcmatern(mesh,
+                            prior.range = c(0.01,0.1),
+                            prior.sigma = c(100,0.1))
```

---

# Creation of the indexes for the SPDE model

- Now we construct the index set for the latent spatial Gaussian model using the function `inla.spde.make.index`. It will be useful in the stack preparation (see next slides)

``` r
> s.index = inla.spde.make.index(name = "spatial.field", n.spde = spde$n.spde)
```

Here the `name` represents the name of the effect which will be used in the `formula`

- The index set for the latent field does not depend on the data set locations. It only depends on the SPDE model size

---

# Projection matrix for `mesh6`

- Now, we create a .red[projection matrix], which maps the GMRF from the mesh nodes to the `$n$` observation locations

- The projection matrix is build using the `inla.spde.make.A()` function

``` r
> A.est6 = inla.spde.make.A(mesh = mesh6,
+                            loc = coords)
> dim(A.est6)
```

```
[1] 200 549
```
The matrix has the number of rows equal to the number of observations and the number of columns equal to the number of vertices of the triangulation.

The points are not necessarily on the vertices and thus each point will be associated with up to three non-zero weights (that collectively add up to 1)

``` r
> #No more than 3 elements in each line are non-zero
> table(rowSums(A.est6>0))
```

```

3 
200 
```
]
.pull-right[

The sum of each row is one:

``` r
> table(rowSums(A.est6))
```

```

1 
200 
```
There are some columns whose sum is zero corresponding to triangles with no point location inside:

``` r
> table(colSums(A.est6) > 0) 
```

```

FALSE  TRUE 
  313   236 
```
]

---

# The `inla.stack()` function

- The function `inla.stack()` allows us to perform an optimal and easy management of the SPDE objects (data, covariates, indices and projection matrices) and for the construction of the linear predictor (Lindgren and Rue, 2015).

- In the `SPDEtoy` example the .red[**linear predictor**] is given by 
`$$\eta_i = b_0 +\xi_i = b_0 + \sum_{g=1}^{G}A_{ig} \tilde {\xi_g}$$`
and can be written as
`$$\bm\eta = \bm 1 b_0+\bm A\tilde{\bm \xi}$$`
where the first term refers to the intercept and the second to the spatial effect.

- The main arguments of the `inla.stack()` function are:
 - `data`: a vector list with the data
 - `A`: a list of projection matrices 
 - `effects`: the list of effects
 - `tag` (optional): a label for the data stack
 
 
---

# The `inla.stack()` function for model fitting with the `SPDEtoy` data

We define the `inla.stack` object for model fitting as follows:

``` r
> stack.est = inla.stack(
+   data = list(y = SPDEtoy$y),
+   A = list(1, A.est6),
+   effects = list(intercept = rep(1,nrow(SPDEtoy)),
+                  s.index),
+   tag="est") 
```

Note that the function `inla.stack()` will take care of eliminating any column in the projection matrix which is full of zeros.

---

# Model fitting

- The vector of parameters is defined as `$\bm \theta = \{\tilde {\bm \xi}, b_0\}$` with hyper-parameter vector `$\bm \psi=(\sigma^{2}_{e},r,\tau)$`, where `$\tau=1/\sigma^2$` and `$r$` (depending on `$\kappa$`) are the Mat&eacute;rn covariance function parameters. 

- In `R-INLA` the default **internal representation** for the SPDE parameters is `$\log(\tau)=\theta_1$` and `$\log(\kappa)=\theta_2$`, with `$\theta_1$` and `$\theta_2$` being given two independent Normal prior distributions.

- Steps: (1) Define the formula, then (2) Run INLA

.blue[1\. Define the **linear predictor**] through the `formula`. In `R-INLA` the Mat&eacute;rn GF is part of the linear predictor and is specified in the `formula`  environment using a proper specification for `f()`.

``` r
> formula = y ~ -1 + intercept + f(spatial.field, model = spde)
```

where `spatial.field` is a proper index variable (from `inla.spde.make.index` function), and `spde` is the model created previously with `inla.spde2.matern`. Note that the intercept is removed and is added manually in the linear predictor.

---

.blue[2\. Run `inla`!] The function `inla.stack.data()` and `inla.stack.A()` are used for extracting the data and the projection matrix from the `stack.est` object:

``` r
> output6 = inla(formula,
*+                 data = inla.stack.data(stack.est),
*+                 control.predictor = list(A = inla.stack.A(stack.est), compute = TRUE))
```

Note that the projection matrix is passed to `inla` through `control.predictor`. Moreover, with the option `compute = TRUE` we ask for the computation of the marginals of the linear predictor.

---

# Exploring the output: fixed effects and hyperparameters

``` r
> output6$summary.fixed[,c("mean","0.025quant","0.975quant")]
```

```
              mean 0.025quant 0.975quant
intercept 9.502358   8.014667   10.88173
```

``` r
> output6$summary.hyperpar[,c("mean","0.025quant","0.975quant")]
```

```
                                             mean 0.025quant 0.975quant
Precision for the Gaussian observations  2.866322   2.040684   3.903928
Theta1 for spatial.field                -4.033943  -4.342915  -3.712046
Theta2 for spatial.field                 2.094528   1.594933   2.568765
```

---

# Exploring the output: spatial parameters

If we are interested in the posterior summaries of the spatial parameters  on the scale of the variance `$\sigma^2=1/\tau$`  and range  `$r$`  (instead of the internal scale regarding `$\theta_1=\log(\tau)$` and `$\theta_2=\log(\kappa)$`) we use

``` r
> output6.field = inla.spde2.result(inla = output6, 
+                                    name = "spatial.field",
+                                    spde = spde)
```

The resulting list contains the following elements:

``` r
> names(output6.field)
```
<center><img src=./img/names_output.jpg width='57%' title=''></center>

---

# Exploring the output

The posterior mean of  `$\sigma^2$` and the range `$r$`  can be obtained by typing

``` r
> inla.emarginal(function(x) x, output6.field$marginals.variance.nominal[[1]])
```

```
[1] 4.007248
```

``` r
> inla.emarginal(function(x) x, output6.field$marginals.range.nominal[[1]])
```

```
[1] 0.3589786
```

Also the other standard INLA functions can be applied to the marginal posteriors:

``` r
> inla.zmarginal(output6.field$marginals.range.nominal[[1]])
```

```
Mean            0.358979 
Stdev           0.0905183 
Quantile  0.025 0.217391 
Quantile  0.25  0.293978 
Quantile  0.5   0.34625 
Quantile  0.75  0.410374 
Quantile  0.975 0.570918 
```

---

---

# Spatial predictions

- In geostatistics we are interested in predicting the (latent) spatial field (i.e. the linear predictor) at new spatial locations where we do not have data.

- It is possible to perform the spatial prediction jointly with the estimation by using the `inla.stack` approach.

- Consider the response variable distribution
$$
\bm y\sim\text{Normal}(\bm\eta=\bm 1 b_0+\bm A\tilde{\bm \xi},\sigma^2_e \bm I)
$$
we are interested in the posterior distribution of the linear predictor `$\bm\eta$` everywhere in space, especially where we don't have observed data and `y=NA`.

- With regard to spatial prediction, it is worth noting that the INLA-SPDE algorithm provides the posterior conditional distribution of `$\bm\eta$` for all the triangulation vertices.
By using the SPDE approximation, it is then immediate to get a prediction for `$\bm\eta$` for any location in the triangulated domain (i.e. the posterior predictive distribution).

---

# Grid for spatial prediction

``` r
> grid.x = 20
> grid.y = 20
> pred.grid = expand.grid(
+             x = seq(0, 1, length.out = grid.x),
+             y = seq(0, 1, length.out = grid.y))
```

It is necessary to define a new projection matrix:

``` r
> A.pred6 = inla.spde.make.A(mesh = mesh6,
+               loc = as.matrix(pred.grid))
> dim(A.pred6)
```

```
[1] 400 549
```
]
.pull-right[
<img src="./img/predgrid-1.png" >
]

---

# Model fitting jointly with spatial prediction
For performing **jointly** the estimation and the prediction, we create a new `inla.stack` object:

``` r
*> stack.pred = inla.stack(data = list(y = NA),
+          A = list(1, A.pred6),
+          effects = list(intercept = rep(1, nrow(pred.grid)),
+                         spatial.field = 1:spde$n.spde),
+          tag = "pred") 
```
and then we join it to the `inla.stack` object created previously for the estimation (`stack.est`):

``` r
> join.stack = inla.stack(stack.est, stack.pred) #full stack object
```

And finally, we run `INLA` again:

``` r
> output6pred = inla(formula,
*+                 data = inla.stack.data(join.stack),
+                 control.compute = list(return.marginals.predictor=TRUE),
*+                 control.predictor = list(A = inla.stack.A(join.stack), compute = TRUE)
+                 )
```

The option `return.marginals.predictor=TRUE` is necessary to obtain the marginals for the linear predictor. 
---

# Retrieve the predictions

To access the predictions (posterior summary stats or marginal distribution) at the target grid locations, we extract with the `inla.stack.index()` function the corresponding indexes from the full stack object using the corresponding tag set before (`pred`):

``` r
> index.pred = inla.stack.index(join.stack, tag = "pred")$data
> length(index.pred)
```

```
[1] 400
```

We then extract the prediction posterior mean and sd at the first 3 grid points:

``` r
> output6pred$summary.linear.predictor[index.pred[1:3],c("mean","sd")]
```

```
                   mean        sd
APredictor.201 12.24876 0.2614732
APredictor.202 12.06026 0.4238819
APredictor.203 10.58893 0.4336448
```

In this case (identity link) `output6pred$summary.fitted.values` would return the same output.

---

# Mapping the linear predictor: posterior mean

``` r
> # extract posterior mean and standard deviation of the predictions
> grid.df <- data.frame(
+ x = pred.grid[, "x"],
+ y = pred.grid[, "y"],
+ mean = output6pred$summary.linear.predictor[index.pred, "mean"],
+ sd = output6pred$summary.linear.predictor[index.pred, "sd"]
+ )
> 
> library(ggplot2)
> library(viridis)
> 
> ggplot(grid.df, aes(x = x, y = y, fill = mean)) +
+ geom_tile() +
+ coord_fixed() +
+ scale_fill_viridis_c(option = "magma") +
+ theme_minimal() +
+ labs(
+ fill = "Posterior mean",
+ title = "Posterior mean of linear predictor"
+ ) 
```

---

---

# Mapping the linear predictor: posterior standard deviation

Now we plot now the posterior standard deviation of the linear predictor at the grid level

``` r
> ggplot(grid.df, aes(x = x, y = y, fill = sd)) +
+   geom_tile() +
+   coord_fixed() +
+   scale_fill_viridis_c(option = "plasma") +
+   theme_minimal() +
+   labs(
+     fill = "Posterior SD",
+     title = "Uncertainty (SD) of predicted field"
+   )
```
---

---

# Wrap-up: SPDE modelling pipeline in R-INLA (step-by-step)

- Load, among the others, the R packages: `INLA`, `fmesher`, `inlabru`

- **Construct the mesh** to obtain a triangulation of the domain using the function .red[`inla.mesh.2d`]. Recently, a new function has been made available from the package `fmesher`: .red[`fm_mesh_2d_inla`]; you can check the documentation typing `?fm_mesh_2d_inla` in R

- **Construct the SPDE model** to define the spatial dependence structure using the function .red[`inla.spde2.matern`] or .red[`inla.spde2.pcmatern`]

- **Construct the index set** (it creates identifiers for the latent spatial field components) to connect the GMRF components with the stack using the function .red[`inla.spde.make.index`]

- **Construct the projection matrix** to map the GMRF (on mesh nodes) to observation or prediction locations using the function .red[`inla.spde.make.A`]

- **Put everything together in a stack object** using the function .red[`inla.stack`]

- **Run the model!**

---

# References

Banerjee, S., B. P. Carlin, and A. E. Gelfand (2014). _Hierarchical modeling and analysis for spatial data_. Chapman and Hall/CRC.

Lindgren, F. and H. Rue (2015). "Bayesian Spatial Modelling with R-INLA". In: _Journal of Statistical Software_ 63.19, pp. 1-25.

Lindgren, F., H. Rue, and J. Lindström (2011). "An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach (with discussion)". In: _J. R. Statist. Soc. B_ 73.4, pp. 423-498.

Simpson, D., H. Rue, A. Riebler, et al. (2017). "Penalising Model Component Complexity: A Principled, Practical Approach to Constructing Priors". In: _Statistical Science_ 32.1, pp. 1-28.