class: title-slide # Session 9.2: Introduction to Stochastic Partial Differential Equations (SPDE) using R-INLA<span style="display:block; margin-top: 10px ;"></span> ## ### ### <!-- Can also separate the various components of the extra argument 'params', eg as in ### , , MSc Epidemiology --> --- layout: true .my-footer[ .alignleft[ © Pirani | Blangiardo | Villejo ] .aligncenter[ MSc Epidemiology ] .alignright[ , NA ] ] --- # Learning Objectives At the end of this session you should be able to: <span style="display:block; margin-top: 20px ;"></span> - understand the basics of the Stochastic Partial Differential Equation (**SPDE**) approach; <span style="display:block; margin-top: 20px ;"></span> - implement the SPDE approach using the `R-INLA` package; <span style="display:block; margin-top: 20px ;"></span> - perform spatial prediction and mapping. <span style="display:block; margin-top: 30px ;"></span> The topics covered in this lecture can be found in: - Sections 6.5:6.8 of the book **Spatial and Spatio-Temporal Bayesian models with R-INLA** (website for data and code: https://github.com/michelacameletti/Spatial-and-Spatio-Temporal-Bayesian-Models-with-R-INLA) - Chapters 8 and 9 of the book **Geospatial Health Data: Modeling and Visualization with R-INLA and Shiny** https://www.paulamoraga.com/book-geospatial/index.html - Chapter 2 of the book **Advanced Spatial Modeling with Stochastic Partial Differential Equations Using R and INLA** https://becarioprecario.bitbucket.io/spde-gitbook/ <span style="display:block; margin-top: 30px ;"></span> --- # Outline <span style="display:block; margin-top: 30px ;"></span> 1\. [Basics of the SPDE approach](#spdeapproach) <span style="display:block; margin-top: 30px ;"></span> 2\. [The SPDE approach with `R-INLA`](#spdeinla) <span style="display:block; margin-top: 30px ;"></span> 3\. [Spatial prediction using SPDE approach](#predi) --- name: spdeapproach <span style="display:block; margin-top: 250px ;"></span> .myblue[.center[.huge[ **The SPDE approach**]]] --- # A common model for geostatistical (noisy) data - Usually the following (mixed-effects) model is assumed `$$\color{blue}{y(\bm{s}) = \mu(\bm{s}) + \xi(\bm{s}) +\epsilon(\bm{s})}$$` <span style="display:block; margin-top: 20px ;"></span> where <span style="display:block; margin-top: 15px ;"></span> - `\(\mu(\bm{s})\)` is the so-called .red[**large scale**] component, including linear or non linear of covariates. <span style="display:block; margin-top: 15px ;"></span> - `\(\xi(\bm{s})\)` is a zero mean latent .red[**Gaussian spatial process**] commonly assumed to be stationary and isotropic with covariance function `\(Cov(\xi(\bm{s_i}),\xi(\bm{s_j}))\)` which depends only on the distance between the locations. <span style="display:block; margin-top: 15px ;"></span> - `\(\epsilon(\bm{s})\)` represents the Gaussian .red[**measurement error**] (independent from `\(\xi(\bm{s})\)`) and its variance `\((\sigma^2_{\epsilon})\)` is usually known as **nugget effect**. -- <span style="display:block; margin-top: 20px ;"></span> Given the data from `\(n\)` locations `\((\bm{s}_{1},\ldots,\bm{s}_{n})\)` we have: `\begin{align*} y(s_i)\mid\mu(\bm{s}_i),\xi(\bm{s}_i),\sigma^2_{\epsilon} & \sim \text{Normal}(\mu(\bm{s}_i)+\xi(\bm{s}_i),\sigma^2_{\epsilon})\\ \bm{\xi} & \sim \text{GF}(\bm 0,\bm\Sigma) \end{align*}` where `\(\bm\Sigma\)` is a **dense** matrix defined by a spatial covariance function `\(\mathcal C(||\bm{s}_{i}-\bm{s}_{j}||)\)`. - Several functions are available for the **spatial covariance function** (e.g. exponential, Matérn, spherical, etc.) parameterized by some parameters, e.g. spatial variance, range (Banerjee, Carlin, and Gelfand, 2014). --- # Matérn covariance function The .red[**Matérn covariance function**] is defined by `$$\mbox{Cov}(\xi(\bm{s}_i),\xi(\bm{s}_j))= \mbox{Cov}(\xi_i,\xi_j)=\frac{\sigma^2}{\Gamma(\lambda)2^{\lambda-1}}(\kappa ||\bm{s}_i - \bm{s}_j||)^\lambda K_\lambda(\kappa ||\bm{s}_i - \bm{s}_j||)$$` <span style="display:block; margin-top: 20px ;"></span> where: - `\(||\bm{s}_i - \bm{s}_j||\)` is the Euclidean distance between two generic locations `\(\bm{s}_i, \bm{s}_j \in \mathbb{R}^2\)`, <span style="display:block; margin-top: 20px ;"></span> - `\(\sigma^2\)` is the marginal variance of the GF, <span style="display:block; margin-top: 20px ;"></span> - `\(K_\lambda\)` denotes the modified Bessel function of second kind and order `\(\lambda>0\)`, which measures the degree of **smoothness** of the process (it is usually kept fixed due to poor identifiability), <span style="display:block; margin-top: 20px ;"></span> - `\(\kappa>0\)` is a **scale parameter** related to the **range** `\(r\)`, i.e. the distance at which the spatial correlation becomes almost null. Typically, the empirically derived definition for the range is `\(r=\frac{\sqrt{8 \lambda}}{\kappa}\)` (Lindgren, Rue, and Lindström, 2011), with `\(r\)` corresponding to the distance at which the spatial correlation is close to 0.1, for each `\(\lambda\geq 1/2\)`. <span style="display:block; margin-top: 20px ;"></span> - The Matérn family is a very flexible class of covariance functions able to cover a wide range of spatial fields. --- # The effect of range Simulation of different GFs with `\(\sigma=1\)` and different ranges: <img src="./img/unnamed-chunk-3-1.png" > --- # Model based approach for estimation - In the Bayesian framework, the classical approach for model estimation is Markov chain Monte Carlo methods (MCMC) considering that the likelihood function is a multivariate Gaussian distribution (see for example the `spBayes` and `spTimer` `R` packages). <span style="display:block; margin-top: 30px ;"></span> - This requires to compute the Cholesky factorization of the dense covariance matrix `\(\bm \Sigma\)` which is computationally very expensive. This is known as **big `\(n\)` problem** . <span style="display:block; margin-top: 30px ;"></span> - The .red[**stochastic partial differential equation (SPDE) approach**] (Lindgren, Rue, and Lindström, 2011) is an alternative to the use of MCMC: it represents the continuous spatial process `\(\xi(\bm{s})\)` with Matérn covariance function using a discretely indexed spatial random process (i.e., a Gaussian Markov Random Field - GMRF), which is characterized by a sparse precision matrix and enjoys computational benefits in terms of fast inference. --- name: spdeapproach <span style="display:block; margin-top: 250px ;"></span> .myblue[.center[.huge[ **Introduction to the SPDE approach**]]] --- # The SPDE approach for Matérn GF - The starting point is the **linear fractional stochastic partial differential equation** (SPDE) `$$\color{blue}{(\kappa^2 - \Delta)^{\alpha/2}(\tau \xi(\bm{s}))={ \mathcal W} (\bm{s})}$$` where `\(\bm{s} \in \mathbb{R}^d\)`, `\(\Delta\)` is the Laplacian operator, `\(\alpha>0\)` is the **smoothness** term, `\(\kappa>0\)` is the scale parameter, `\(\tau\)` controls the variance and `\({\mathcal W}(\bm{s})\)` is a Gaussian spatial white noise process (with unit variance). <span style="display:block; margin-top: 30px ;"></span> - **Whittle** in 1954 showed that the exact and stationary solution to this SPDE is the stationary Gaussian field `\(\xi(\bm{s})\)`with Matérn covariance function. <span style="display:block; margin-top: 10px ;"></span> - **Lindgren, Rue, and Lindström (2011)** represent the solution of the SPDE using the finite element method (this is possible only for some values of the smothness parameter). <span style="display:block; margin-top: 10px ;"></span> - In `\(\mathbb R^2\)` the link between the SPDE parameters `\(\tau, \alpha, \kappa\)` and the Matérn function parameters `\(\sigma^2, \lambda, \kappa\)` is given by `$$\lambda = \alpha-1 \qquad\qquad \sigma^2 = \frac{\Gamma(\lambda)}{\Gamma(\alpha)(4\pi)\kappa^{2\lambda}\tau^2}$$` <span style="display:block; margin-top: 10px ;"></span> - In `R-INLA` the default value for the smoothness parameter is `\(\alpha=2\)` `\(\rightarrow\)` `\(\lambda=1\)`. Consequently `$$r=\sqrt{8\lambda}/\kappa= \sqrt{8}/\kappa \text{ (see slide 6)} \qquad\qquad \sigma^2=1/(4\pi\kappa^2\tau^2)$$` --- # Piecewise linear approximation - The solution to the SPDE can be approximated through a **basis function representation** defined on a **triangulation** of the spatial domain `\begin{align*} \color{blue}{\xi(\bm{s})=\sum_{g=1}^{G}\varphi_g(\bm{s})\tilde\xi_g} \end{align*}` where `\(G\)` is the total number of triangulation vertices, `\(\{\varphi_{g}\}\)` is the set of (deterministic) basis functions and `\(\{\tilde\xi_{g}\}\)` are zero mean Gaussian distributed weights. <span style="display:block; margin-top: 10px ;"></span> - In order to obtain a Markov structure, the basis functions are chosen to be .red[**piecewise linear**] in each triangle, i.e. `\(\varphi_{g}\)` is 1 at vertex `\(g\)` and 0 at all other vertices. --- # Approximation in 2-Dimensions <center><img src=./img/Triangulation_basis_funct.png width='80%' title='INCLUDE TEXT HERE'></center> --- # Piecewise linear approximation <span style="display:block; margin-top: 30px ;"></span> - The precision matrix `\(\bm Q\)` for the Gaussian weight vector `\(\tilde {\bm \xi}=\{\tilde\xi_{1},\ldots,\tilde\xi_{G}\}\)` is given by `$$\color{blue}{\bm Q=\tau^2\left(\kappa^4 \bm C+2\kappa^2 \bm G +\bm G\bm C^{-1}\bm G\right)}$$` where <span style="display:block; margin-top: 10px ;"></span> - the generic element of the diagonal matrix `\(\bm C\)` is `\(C_{ii}=\int \varphi_i(\bm s) \text{d}\bm s\)`, <span style="display:block; margin-top: 10px ;"></span> - the generic element of the sparse matrix `\(\bm G\)` is `\(G_{ij}=\int \nabla\varphi_i(\bm s) \nabla\varphi_j(\bm s)\text{d}\bm s\)` (where `\(\nabla\)` denotes the gradient), <span style="display:block; margin-top: 10px ;"></span> .content-box-green[ The precision matrix `\(\bm Q\)`, whose elements depend on `\(\tau\)` and `\(\kappa\)`, is sparse and consequently `\(\bm \xi\)` is a GMRF distributed as `\(\text{Normal}(\bm 0,\bm Q^{-1})\)`: it represents the approximated solution to the SPDE. ] --- # Take home message <span style="display:block; margin-top: 30px ;"></span> - Do the modelling using GF and the computations using the GMRF representation (computational advantages thanks to algorithms for sparse matrices): <span style="display:block; margin-top: 50px ;"></span> .content-box-green[ `\begin{align} \bm\xi &\sim \text{Normal}(\mathbf 0, \bm \Sigma) \Longrightarrow \; \tilde {\bm \xi} \sim \text{Normal}\left(\bm 0,\bm Q^{-1}\right)\nonumber \end{align}` ] --- name: spdeinla <span style="display:block; margin-top: 250px ;"></span> .myblue[.center[.huge[ **The SPDE approach with `R-INLA`**]]] --- # SPDE toy example .pull-left[ We use the `SPDEtoy` dataset , consisting in 200 simulated values for the variable `y` which refer to as many randomly sampled locations in the unit square area delimited by the points `\((0,0)\)` and `\((1,1)\)` and with coordinates given by `s1` and `s2`. ``` r > library(INLA) > data(SPDEtoy) > dim(SPDEtoy) ``` ``` [1] 200 3 ``` ``` r > head(SPDEtoy, n=3) ``` ``` s1 s2 y 1 0.08265625 0.05640625 11.521206 2 0.61230625 0.91680625 5.277960 3 0.16200625 0.35700625 6.902959 ``` ] .pull-right[ ``` r > summary(SPDEtoy$y) ``` ``` Min. 1st Qu. Median Mean 3rd Qu. Max. 5.210 8.688 10.023 9.858 11.307 14.600 ``` ] --- # Plot of toy data .pull-left[ ``` r > library(tidyverse) > > SPDEtoy %>% + ggplot() + + geom_point(aes(s1,s2, col=y),size=2) ``` ] .pull-right[ <img src="./img/spdetoydataout-1.png" > ] --- # SPDE toy example: model for simulation - The model used for simulating the `SPDEtoy` data assumes that the distribution of the observation `\(y_{i}\)` is `\begin{align*} \color{blue}{y_i\mid\eta_i,\sigma^2_e \sim \text{Normal}(\eta_i, \sigma^2_e) \qquad i=1,\ldots, 200 \nonumber} \end{align*}` <span style="display:block; margin-top: 10px ;"></span> where `\(\sigma^{2}_e\)` is the variance of the zero mean measurement error `\(e_i\)` which is supposed to be Gaussian iid. <span style="display:block; margin-top: 20px ;"></span> - The response mean, which coincides with the **linear predictor**, is defined as `\begin{align*} \color{blue}{\eta_i = b_0 + \xi_i} \end{align*}` and includes the intercept `\(b_0\)` and a random effect represented by `\(\xi_i\)`, which is the realization of the latent GF `\(\xi(\bm s)\sim \text{MVNormal}(\bm 0,\bm \Sigma)\)`. The covariance matrix `\(\bm \Sigma\)` is defined by the Matérn spatial covariance function. <span style="display:block; margin-top: 20px ;"></span> - The parameter values chosen for simulating the data are: `\(b_0=10\)`, `\(\sigma^2_e=0.3\)`, `\(\sigma^2=5\)`, `\(\kappa=7\)`, `\(r=\frac{\sqrt{8}}{\kappa}=0.404\)`. --- # The SPDE representation and the projection matrix - Using the SPDE basis function representation, the linear predictor `\(\eta_i\)` can be rewritten as `$$\color{blue}{\eta_{i} = b_0 + \sum_{g=1}^{G} \varphi_g(\bm s_i) \tilde {\xi_g}}$$` where `\(\varphi_g(\bm s_i)\)` is the value of the `\(g\)`-th basis function evaluated in `\(\bm s_i\)`. <span style="display:block; margin-top: 20px ;"></span> - More generally it is possible to express the linear predictor as `$$\color{blue}{\eta_i= b_0 + \sum_{g=1}^{G} A_{ig} \tilde {\xi_g}}$$` with `\(A_{ig}=\varphi_g(\bm s_i)\)` being the generic element of the sparse matrix `\({\bm A}\)` (known as **projection matrix**) which maps the GMRF `\({\tilde {\bm \xi}}\)` from the `\(G\)` triangulation vertices to the `\(n\)` observation locations. This allows the SPDE model to be treated as standard indexed random effects. --- # Mesh construction - The SPDE approach is based on a .red[**triangulation**] of the spatial domain given by the **mesh**. <span style="display:block; margin-top: 20px ;"></span> - The definition of the mesh is a **trade-off** between the accuracy of the GMRF representation and computational costs, both depending on the number of vertices used in the triangulation: the bigger the number of mesh triangles, the finer the GF approximation but the higher the computational costs. <span style="display:block; margin-top: 20px ;"></span> - To create the mesh in `R-INLA` we use the helper function `inla.mesh.2d`. The arguments for a two-dimensional mesh construction can be checked using `args(inla.mesh.2d)`. <span style="display:block; margin-top: 20px ;"></span> - Here we are going to start with the following options: <span style="display:block; margin-top: 20px ;"></span> - `loc` or `loc.domain`: specify information about the spatial domain <span style="display:block; margin-top: 20px ;"></span> - `max.edge`: specify the largest allowed triangle edge length. If a vector of two values is provided, the spatial domain is divided into an inner and an outer area whose triangle resolution is specified by `max.edge` (the higher the value for `max.edge` the lower the resolution and the accuracy). --- # Mesh: `max.edge` - mesh1 .pull-left[ ``` r > coords = as.matrix(SPDEtoy[,1:2]) > mesh1 = inla.mesh.2d(loc = coords, *+ max.edge = c(0.1, 0.1)) ``` ``` r > library(inlabru) > ggplot() + + gg(mesh1) + + geom_point(data = data.frame(coords), + aes(s1, s2)) ``` ] .pull-right[ <img src="./img/explore_plot_out1-1.png" > ] --- # Mesh: `max.edge` - mesh2 .pull-left[ ``` r > mesh2 = inla.mesh.2d(loc = coords, *+ max.edge = c(0.1, 0.2)) ``` ``` r > ggplot() + + gg(mesh2) + + geom_point(data = data.frame(coords), + aes(s1, s2)) ``` ] .pull-right[ <img src="./img/explore_plot_out2-1.png" > ] --- # Mesh optional arguments: `offset` - mesh3 .pull-left[ The option `offset` of the `inla.mesh.2d` function can be used to define how much the domain should be extended in the inner and outer part. The default values are `offset = c(-0.05, -0.15)`. ``` r > mesh3 = inla.mesh.2d(loc = coords, + max.edge = c(0.1, 0.2), *+ offset = c(0.4,0.1)) ``` ``` r > ggplot() + + gg(mesh3)+ + geom_point(data = data.frame(coords), + aes(s1, s2)) ``` ] .pull-right[ <img src="./img/explore_plot_out3-1.png" > ] --- # Mesh optional arguments: `offset` - mesh4 .pull-left[ ``` r > mesh4 = inla.mesh.2d(loc = coords, + max.edge = c(0.1, 0.2), *+ offset = c(0.1,0.4)) ``` ``` r > ggplot() + + gg(mesh4) + + geom_point(data = data.frame(coords), + aes(s1, s2)) ``` ] .pull-right[ <img src="./img/explore_plot_out4-1.png" > ] --- # Mesh optional arguments `cutoff` .pull-left[ Instead of the point coordinates, it is possible to use other point locations used to determine the domain extent. ``` r > domain = matrix(cbind(c(0,1,1,0.7,0), + c(0,0,0.7,1,1)), ncol=2) *> mesh5domain = inla.mesh.2d(loc.domain = domain, + max.edge = c(0.04, 0.2), + offset = c(0.1, 0.4)) ``` ``` r > ggplot() + + gg(mesh5domain) + + geom_point(data = data.frame(domain), + aes(X1, X2)) + + geom_point(data = data.frame(coords), + aes(s1, s2), col = "red", + alpha = 0.5) ``` ] .pull-right[ <img src="./img/explore_plot_out5domain-1.png" > ] --- # Mesh optional arguments `cutoff` - mesh5 .pull-left[ The option `cutoff` can be used to avoid building too many small triangles around clustered data locations (the default value is equal to 0). ``` r > mesh5 = inla.mesh.2d(loc.domain = domain, + max.edge = c(0.04, 0.2), *+ cutoff = 0.5, + offset = c(0.1, 0.4)) ``` ``` r > ggplot() + + gg(mesh5) + + geom_point(data = data.frame(domain), + aes(X1, X2)) + + geom_point(data = data.frame(coords), + aes(s1, s2), col = "red", + alpha = 0.5) ``` ] .pull-right[ <img src="./img/explore_plot_out5-1.png" > ] --- # Putting all together - mesh6 .pull-left[ ``` r > mesh6 = inla.mesh.2d(loc.domain = domain, + max.edge = c(0.04, 0.2), + cutoff = 0.05, + offset = c(0.1, 0.4)) ``` ``` r > ggplot() + + gg(mesh6) + + geom_point(data = data.frame(domain), + aes(X1, X2)) + + geom_point(data = data.frame(coords), + aes(s1, s2), col = "red", + alpha = 0.5) ``` ] .pull-right[ <img src="./img/explore_plot_out6-1.png" > ] --- # Mesh: non convex hull .pull-left[ A feature in `R-INLA` named `inla.nonconvex.hull` makes it possible to compute a non convex hull to be included as boundary in the mesh construction. This can be particularly useful when the shape of the domain is of some importance. ``` r > set.seed(44) > loc = matrix(runif(20), 10, 2) > > boundary = inla.nonconvex.hull(loc, *+ convex=0.2) > meshNC = inla.mesh.2d(loc = loc, *+ boundary = boundary, + max.edge = c(0.04, 0.2)) ``` ``` r > ggplot() + + gg(meshNC)+ + geom_point(data = data.frame(loc), + aes(X1, X2)) ``` ] .pull-right[ <img src="./img/explore_plot_out7-1.png" > ] --- # Creation of the Matérn SPDE model for `mesh6` - We create the Matérn **SPDE model object** with the function `inla.spde2.matern` on the mesh. By default the smoothness term is 2 - Here we create the SPDE object using the `mesh6` triangulation: ``` r > spde = inla.spde2.matern(mesh = mesh6) > spde$n.spde # number of vertices in the mesh ``` ``` [1] 549 ``` --- # PC prior for the Matérn model - Instead of `inla.spde2.matern` it is possible to use `inla.spde2.pcmatern` for creating an `inla.spde2` model object using a PC prior for the range `\(r\)` and the marginal standard deviation `\(\sigma\)` (Simpson, Rue, Riebler, Martins, and Sorbye, 2017). <span style="display:block; margin-top: 20px ;"></span> - The prior for `\(\sigma\)` is such that $$ Pr(\sigma>\sigma_0)=p $$ and this requires to specify `\(\sigma_0\)` and `\(p\)` with the option `prior.sigma = c(sigma0,p)`. <span style="display:block; margin-top: 20px ;"></span> - The prior for `\(r\)` is such that $$ Pr(r<r_0)=p $$ and this requires to specify `\(r_0\)` and `\(p\)` with the option `prior.range = c(r0,p)`. <span style="display:block; margin-top: 20px ;"></span> For example: ``` r > spde = inla.spde2.pcmatern(mesh, + prior.range = c(0.01,0.1), + prior.sigma = c(100,0.1)) ``` --- # Creation of the indexes for the SPDE model - Now we construct the index set for the latent spatial Gaussian model using the function `inla.spde.make.index`. It will be useful in the stack preparation (see next slides) ``` r > s.index = inla.spde.make.index(name = "spatial.field", n.spde = spde$n.spde) ``` Here the `name` represents the name of the effect which will be used in the `formula` - The index set for the latent field does not depend on the data set locations. It only depends on the SPDE model size --- # Projection matrix for `mesh6` - Now, we create a .red[projection matrix], which maps the GMRF from the mesh nodes to the `\(n\)` observation locations - The projection matrix is build using the `inla.spde.make.A()` function .pull-left[ ``` r > A.est6 = inla.spde.make.A(mesh = mesh6, + loc = coords) > dim(A.est6) ``` ``` [1] 200 549 ``` The matrix has the number of rows equal to the number of observations and the number of columns equal to the number of vertices of the triangulation. The points are not necessarily on the vertices and thus each point will be associated with up to three non-zero weights (that collectively add up to 1) ``` r > #No more than 3 elements in each line are non-zero > table(rowSums(A.est6>0)) ``` ``` 3 200 ``` ] .pull-right[ The sum of each row is one: ``` r > table(rowSums(A.est6)) ``` ``` 1 200 ``` There are some columns whose sum is zero corresponding to triangles with no point location inside: ``` r > table(colSums(A.est6) > 0) ``` ``` FALSE TRUE 313 236 ``` ] --- # The `inla.stack()` function - The function `inla.stack()` allows us to perform an optimal and easy management of the SPDE objects (data, covariates, indices and projection matrices) and for the construction of the linear predictor (Lindgren and Rue, 2015). <span style="display:block; margin-top: 20px ;"></span> - In the `SPDEtoy` example the .red[**linear predictor**] is given by `$$\eta_i = b_0 +\xi_i = b_0 + \sum_{g=1}^{G}A_{ig} \tilde {\xi_g}$$` and can be written as `$$\bm\eta = \bm 1 b_0+\bm A\tilde{\bm \xi}$$` where the first term refers to the intercept and the second to the spatial effect. <span style="display:block; margin-top: 20px ;"></span> - The main arguments of the `inla.stack()` function are: - `data`: a vector list with the data - `A`: a list of projection matrices - `effects`: the list of effects - `tag` (optional): a label for the data stack --- # The `inla.stack()` function for model fitting with the `SPDEtoy` data We define the `inla.stack` object for model fitting as follows: ``` r > stack.est = inla.stack( + data = list(y = SPDEtoy$y), + A = list(1, A.est6), + effects = list(intercept = rep(1,nrow(SPDEtoy)), + s.index), + tag="est") ``` Note that the function `inla.stack()` will take care of eliminating any column in the projection matrix which is full of zeros. --- # Model fitting - The vector of parameters is defined as `\(\bm \theta = \{\tilde {\bm \xi}, b_0\}\)` with hyper-parameter vector `\(\bm \psi=(\sigma^{2}_{e},r,\tau)\)`, where `\(\tau=1/\sigma^2\)` and `\(r\)` (depending on `\(\kappa\)`) are the Matérn covariance function parameters. <span style="display:block; margin-top: 20px ;"></span> - In `R-INLA` the default **internal representation** for the SPDE parameters is `\(\log(\tau)=\theta_1\)` and `\(\log(\kappa)=\theta_2\)`, with `\(\theta_1\)` and `\(\theta_2\)` being given two independent Normal prior distributions. - Steps: (1) Define the formula, then (2) Run INLA <span style="display:block; margin-top: 30px ;"></span> .blue[1\. Define the **linear predictor**] through the `formula`. In `R-INLA` the Matérn GF is part of the linear predictor and is specified in the `formula` environment using a proper specification for `f()`. ``` r > formula = y ~ -1 + intercept + f(spatial.field, model = spde) ``` where `spatial.field` is a proper index variable (from `inla.spde.make.index` function), and `spde` is the model created previously with `inla.spde2.matern`. Note that the intercept is removed and is added manually in the linear predictor. --- <span style="display:block; margin-top: 20px ;"></span> .blue[2\. Run `inla`!] The function `inla.stack.data()` and `inla.stack.A()` are used for extracting the data and the projection matrix from the `stack.est` object: ``` r > output6 = inla(formula, *+ data = inla.stack.data(stack.est), *+ control.predictor = list(A = inla.stack.A(stack.est), compute = TRUE)) ``` Note that the projection matrix is passed to `inla` through `control.predictor`. Moreover, with the option `compute = TRUE` we ask for the computation of the marginals of the linear predictor. --- # Exploring the output: fixed effects and hyperparameters ``` r > output6$summary.fixed[,c("mean","0.025quant","0.975quant")] ``` ``` mean 0.025quant 0.975quant intercept 9.502358 8.014667 10.88173 ``` ``` r > output6$summary.hyperpar[,c("mean","0.025quant","0.975quant")] ``` ``` mean 0.025quant 0.975quant Precision for the Gaussian observations 2.866322 2.040684 3.903928 Theta1 for spatial.field -4.033943 -4.342915 -3.712046 Theta2 for spatial.field 2.094528 1.594933 2.568765 ``` --- # Exploring the output: spatial parameters If we are interested in the posterior summaries of the spatial parameters on the scale of the variance `\(\sigma^2=1/\tau\)` and range `\(r\)` (instead of the internal scale regarding `\(\theta_1=\log(\tau)\)` and `\(\theta_2=\log(\kappa)\)`) we use ``` r > output6.field = inla.spde2.result(inla = output6, + name = "spatial.field", + spde = spde) ``` The resulting list contains the following elements: ``` r > names(output6.field) ``` <center><img src=./img/names_output.jpg width='57%' title=''></center> --- # Exploring the output The posterior mean of `\(\sigma^2\)` and the range `\(r\)` can be obtained by typing ``` r > inla.emarginal(function(x) x, output6.field$marginals.variance.nominal[[1]]) ``` ``` [1] 4.007248 ``` ``` r > inla.emarginal(function(x) x, output6.field$marginals.range.nominal[[1]]) ``` ``` [1] 0.3589786 ``` Also the other standard INLA functions can be applied to the marginal posteriors: ``` r > inla.zmarginal(output6.field$marginals.range.nominal[[1]]) ``` ``` Mean 0.358979 Stdev 0.0905183 Quantile 0.025 0.217391 Quantile 0.25 0.293978 Quantile 0.5 0.34625 Quantile 0.75 0.410374 Quantile 0.975 0.570918 ``` --- name: predi <span style="display:block; margin-top: 250px ;"></span> .myblue[.center[.huge[ **Spatial prediction using SPDE approach**]]] --- # Spatial predictions - In geostatistics we are interested in predicting the (latent) spatial field (i.e. the linear predictor) at new spatial locations where we do not have data. <span style="display:block; margin-top: 20px ;"></span> - It is possible to perform the spatial prediction jointly with the estimation by using the `inla.stack` approach. <span style="display:block; margin-top: 20px ;"></span> - Consider the response variable distribution $$ \bm y\sim\text{Normal}(\bm\eta=\bm 1 b_0+\bm A\tilde{\bm \xi},\sigma^2_e \bm I) $$ we are interested in the posterior distribution of the linear predictor `\(\bm\eta\)` everywhere in space, especially where we don't have observed data and `y=NA`. <span style="display:block; margin-top: 20px ;"></span> - With regard to spatial prediction, it is worth noting that the INLA-SPDE algorithm provides the posterior conditional distribution of `\(\bm\eta\)` for all the triangulation vertices. By using the SPDE approximation, it is then immediate to get a prediction for `\(\bm\eta\)` for any location in the triangulated domain (i.e. the posterior predictive distribution). --- # Grid for spatial prediction .pull-left[ Consider the following regular grid of points: ``` r > grid.x = 20 > grid.y = 20 > pred.grid = expand.grid( + x = seq(0, 1, length.out = grid.x), + y = seq(0, 1, length.out = grid.y)) ``` It is necessary to define a new projection matrix: ``` r > A.pred6 = inla.spde.make.A(mesh = mesh6, + loc = as.matrix(pred.grid)) > dim(A.pred6) ``` ``` [1] 400 549 ``` ] .pull-right[ <img src="./img/predgrid-1.png" > ] --- # Model fitting jointly with spatial prediction For performing **jointly** the estimation and the prediction, we create a new `inla.stack` object: ``` r *> stack.pred = inla.stack(data = list(y = NA), + A = list(1, A.pred6), + effects = list(intercept = rep(1, nrow(pred.grid)), + spatial.field = 1:spde$n.spde), + tag = "pred") ``` and then we join it to the `inla.stack` object created previously for the estimation (`stack.est`): ``` r > join.stack = inla.stack(stack.est, stack.pred) #full stack object ``` And finally, we run `INLA` again: ``` r > output6pred = inla(formula, *+ data = inla.stack.data(join.stack), + control.compute = list(return.marginals.predictor=TRUE), *+ control.predictor = list(A = inla.stack.A(join.stack), compute = TRUE) + ) ``` The option `return.marginals.predictor=TRUE` is necessary to obtain the marginals for the linear predictor. --- # Retrieve the predictions To access the predictions (posterior summary stats or marginal distribution) at the target grid locations, we extract with the `inla.stack.index()` function the corresponding indexes from the full stack object using the corresponding tag set before (`pred`): ``` r > index.pred = inla.stack.index(join.stack, tag = "pred")$data > length(index.pred) ``` ``` [1] 400 ``` We then extract the prediction posterior mean and sd at the first 3 grid points: ``` r > output6pred$summary.linear.predictor[index.pred[1:3],c("mean","sd")] ``` ``` mean sd APredictor.201 12.24876 0.2614732 APredictor.202 12.06026 0.4238819 APredictor.203 10.58893 0.4336448 ``` In this case (identity link) `output6pred$summary.fitted.values` would return the same output. --- # Mapping the linear predictor: posterior mean ``` r > # extract posterior mean and standard deviation of the predictions > grid.df <- data.frame( + x = pred.grid[, "x"], + y = pred.grid[, "y"], + mean = output6pred$summary.linear.predictor[index.pred, "mean"], + sd = output6pred$summary.linear.predictor[index.pred, "sd"] + ) > > library(ggplot2) > library(viridis) > > ggplot(grid.df, aes(x = x, y = y, fill = mean)) + + geom_tile() + + coord_fixed() + + scale_fill_viridis_c(option = "magma") + + theme_minimal() + + labs( + fill = "Posterior mean", + title = "Posterior mean of linear predictor" + ) ``` --- <img src="./img/mapmeanout-1.png" > --- # Mapping the linear predictor: posterior standard deviation Now we plot now the posterior standard deviation of the linear predictor at the grid level ``` r > ggplot(grid.df, aes(x = x, y = y, fill = sd)) + + geom_tile() + + coord_fixed() + + scale_fill_viridis_c(option = "plasma") + + theme_minimal() + + labs( + fill = "Posterior SD", + title = "Uncertainty (SD) of predicted field" + ) ``` --- <img src="./img/mapsdout-1.png" > --- # Wrap-up: SPDE modelling pipeline in R-INLA (step-by-step) - Load, among the others, the R packages: `INLA`, `fmesher`, `inlabru` - **Construct the mesh** to obtain a triangulation of the domain using the function .red[`inla.mesh.2d`]. Recently, a new function has been made available from the package `fmesher`: .red[`fm_mesh_2d_inla`]; you can check the documentation typing `?fm_mesh_2d_inla` in R - **Construct the SPDE model** to define the spatial dependence structure using the function .red[`inla.spde2.matern`] or .red[`inla.spde2.pcmatern`] - **Construct the index set** (it creates identifiers for the latent spatial field components) to connect the GMRF components with the stack using the function .red[`inla.spde.make.index`] - **Construct the projection matrix** to map the GMRF (on mesh nodes) to observation or prediction locations using the function .red[`inla.spde.make.A`] - **Put everything together in a stack object** using the function .red[`inla.stack`] - **Run the model!** --- # References Banerjee, S., B. P. Carlin, and A. E. Gelfand (2014). _Hierarchical modeling and analysis for spatial data_. Chapman and Hall/CRC. Lindgren, F. and H. Rue (2015). "Bayesian Spatial Modelling with R-INLA". In: _Journal of Statistical Software_ 63.19, pp. 1-25. Lindgren, F., H. Rue, and J. Lindström (2011). "An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach (with discussion)". In: _J. R. Statist. Soc. B_ 73.4, pp. 423-498. Simpson, D., H. Rue, A. Riebler, et al. (2017). "Penalising Model Component Complexity: A Principled, Practical Approach to Constructing Priors". In: _Statistical Science_ 32.1, pp. 1-28.