class: title-slide # Session 7: Models for areal data: disease mapping and ecological regression<span style="display:block; margin-top: 10px ;"></span> ## ### ### Imperial College London --- layout: true .my-footer[ .alignleft[ © Marta Blangiardo | Monica Pirani ] .aligncenter[ MSc in Epidemiology ] .alignright[ Imperial College London ] ] --- # Learning Objectives After this session you should be able to: - Explain the main ideas underlying the use of Bayesian methods for producing spatially smoothed estimates of disease risk in small areas - Describe different priors for spatial random effects - Explore aetiological hypotheses about the relationship between a health outcome and an exposure, building on disease mapping - Describe Poisson regression with spatial random effects for continuous and categorical covariates - Use `R-INLA` to carry out spatial smoothing of disease risk, produce maps of the smoothed estimates and specify ecological regression models The topics treated in this lecture are covered in: - Chapters 5-6 of the book **Spatial and Spatio-Temporal Bayesian models with R-INLA** - Chapter 5 of the book **Geospatial Health Data: Modeling and Visualization with R-INLA and Shiny** https://www.paulamoraga.com/book-geospatial/index.html --- # Outline <span style="display:block; margin-top: 30px ;"></span> 1\. [Spatial Structure](#spatial-structure) <span style="display:block; margin-top: 30px ;"></span> 2\. [Example: suicides in London](#Example) <span style="display:block; margin-top: 30px ;"></span> 3\. 
[Ecological regression with spatial random effects](#Poisson-regression) --- # Spatial patterning - The Poisson-logNormal model is based on the assumption that the observations in the data set are independent and identically distributed <span style="display:block; margin-top: 20px ;"></span> - However, data that occur close together in space (or time) are likely to be correlated .red[→ Dependence between observations is a more realistic assumption] <span style="display:block; margin-top: 20px ;"></span> - This spatial patterning, or .red[spatial autocorrelation], may be treated as useful information about unobserved influences <span style="display:block; margin-top: 20px ;"></span> - Formally, spatial autocorrelation measures the correlation of a variable with itself through space. If `\(Z_i\)` is the attribute `\(Z\)` observed at location `\(i\)` and `\(Z_j\)` is the attribute `\(Z\)` observed at location `\(j\)`, the term spatial autocorrelation refers to the correlation between `\(Z_i\)` and `\(Z_j\)` <span style="display:block; margin-top: 20px ;"></span> - In other words, it quantifies the degree to which observations at one spatial location are similar to nearby observations <span style="display:block; margin-top: 20px ;"></span> - Ignoring this dependence can lead to biased and inefficient inference .red[→ Smooth-in-space prior:] the prior distribution for the random effects should allow for spatial correlation --- name: spatial-structure <span style="display:block; margin-top: 250px ;"></span> .myblue[.center[.huge[ **Spatial structure**]]] --- # Building a spatial model - We specify the distribution of each random effect considering the values of the spatial random effects in .red[neighbouring areas] <span style="display:block; margin-top: 20px ;"></span> - This is a conditional specification, since each random effect is specified conditional on the values of its neighbours <span style="display:block; margin-top: 20px ;"></span> - We need a rule for determining the neighbours of each area (the
most popular one is based on contiguity) <span style="display:block; margin-top: 20px ;"></span> - We specify a conditional autoregressive (CAR) model to capture the spatial structure in the data --- # Spatial neighbours - Key to analysing areal data is the concept of .blue[spatial connectivity] or .blue[spatial proximity] - Let `\(i\)` and `\(j\)` index two members of the lattice (i.e. two areal units such as two countries, or two districts etc.) - With each pair of sites, we associate a .red[weight] `\(w_{ij}\)`, so that the spatial weights express the neighbour structure between the observations - Now, let `\(N\)` be the total number of areal units. The spatial relationship between the areas is represented as an adjacency matrix `\(\mathbf{W}\)` with dimensions `\(N \times N\)`, where the entries `\(w_{ij}\)` of the matrix are the spatial weights: <center><img src=./img/W.png width='35%' title=''></center> - In its simplest form, `\(w_{ij}=1\)` if areas `\(i\)` and `\(j\)` are adjacent, 0 otherwise - The diagonal elements of `\(\mathbf{W}\)` are zero, that is `\(w_{ii}=0\)` --- # Weights based on contiguity Operationally, we can distinguish between a .blue[rook (A)] and a .blue[queen (B)] criterion of contiguity between areas (by analogy with the moves allowed for the pieces of the same name on a chessboard): <span style="display:block; margin-top: 25px ;"></span> <center><img src=./img/Weights.jpg width='35%' title=''></center> - The .blue[rook] criterion defines neighbours by the existence of a common edge between two spatial units - The .blue[queen] criterion defines neighbours as spatial units sharing a common edge or a common vertex --- # Example of neighbours computation in R We need to define: - Neighbour connectivity (who is a neighbour?) - Neighbour weights (how much does the neighbour matter?) 
- To do so, we can work with the package `spdep` and use the function `poly2nb` to define neighbour connectivity according to the rook or queen criterion (contiguity neighbours) - Also, the function `nb2listw` defines spatial weights for neighbours lists, while `nb2mat` defines spatial weights matrices --- # Example of neighbourhood structure - We compute a neighbourhood structure for Luxembourg, one of the smallest countries in Europe. We use the shapefile for Luxembourg available in the R package `raster` ``` r > library(raster); library(spdep); library(mapview) > > # Read in Luxembourg shapefile from R package raster > lux = shapefile(system.file("external/lux.shp", package="raster")) > lux1 <- lux[order(lux@data$ID_2),] > summary(lux1) ``` ``` Object of class SpatialPolygonsDataFrame Coordinates: min max x 5.74414 6.528252 y 49.44781 50.181622 Is projected: FALSE proj4string : [+proj=longlat +datum=WGS84 +no_defs] Data attributes: ID_1 NAME_1 ID_2 NAME_2 AREA Min. :1.000 Length:12 Min. : 1.00 Length:12 Min. : 76.0 1st Qu.:1.000 Class :character 1st Qu.: 3.75 Class :character 1st Qu.:187.2 Median :2.000 Mode :character Median : 6.50 Mode :character Median :225.5 Mean :1.917 Mean : 6.50 Mean :213.4 3rd Qu.:3.000 3rd Qu.: 9.25 3rd Qu.:253.0 Max. :3.000 Max. :12.00 Max. :312.0 ``` --- - We plot Luxembourg divided into 12 cantons and display their names ``` r > par(mar=c(0,0,0,0)) # par sets or adjusts plotting parameters, > # while the parameter mar stands for margin size > plot(lux1, border=3, col=terrain.colors(length(lux)), axes=F) > text(lux1,"NAME_2",cex=0.5) ``` <center><img src=./img/Lux_map3.jpeg width='40%' title=''></center> Note that we can also generate a vector of contiguous colors using the functions `rainbow(n)`, `heat.colors(n)`, `terrain.colors(n)`, `topo.colors(n)`, and `cm.colors(n)`. 
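Before computing contiguity from real polygons, the rook/queen distinction introduced above can be checked on a toy grid. This is a minimal base-R sketch (the 3×3 grid and all object names are ours, not from `spdep`): the centre cell of a regular grid has 4 rook neighbours and 8 queen neighbours.

``` r
# Toy illustration (not spdep): rook vs queen contiguity on a 3x3 grid
# of square cells; cell 5 is the centre cell (row 2, column 2)
cells <- expand.grid(row = 1:3, col = 1:3)
d_row <- abs(cells$row - cells$row[5])
d_col <- abs(cells$col - cells$col[5])
rook  <- which(d_row + d_col == 1)       # shared edge only
queen <- which(pmax(d_row, d_col) == 1)  # shared edge or shared vertex
length(rook)   # 4 neighbours under the rook criterion
length(queen)  # 8 neighbours under the queen criterion
```

Every rook neighbour is also a queen neighbour, which is why queen weights matrices are never sparser than rook ones.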
--- - We compute contiguity-based neighbors using the `poly2nb` function (here rook's move contiguity is used) ``` r > w.rook = poly2nb(lux1, row.names=lux1$ID_2, queen=FALSE) > w.rook ``` ``` Neighbour list object: Number of regions: 12 Number of nonzero links: 46 Percentage nonzero weights: 31.94444 Average number of links: 3.833333 ``` - The `nb` object `w.rook` lists for each polygon the neighboring polygons. For example, to see the neighbors for the first polygon in the object, type: ``` r > w.rook[[1]] ``` ``` [1] 2 4 5 ``` .pull-left[ ``` r > par(mar=c(0,0,0,0)) > plot(lux1, border=3, + col=terrain.colors(length(lux)), + axes=F) > text(lux1,"ID_2",cex=1) ``` ] .pull-right[ <center><img src=./img/Lux_map.jpeg width='51%' title=''></center> ] --- - Finally, we assign the weights to each neighboring polygon. The argument `style` can take on a number of character values: W=row standardized, B=binary, C=globally standardized. With `zero.policy=TRUE` we insert zero into the weights matrix where there is no connection ``` r > w.rook.l <- nb2listw(w.rook, style="B", zero.policy=TRUE) > w.rook.l$weights[1] # Check the weight of the first polygon's three neighbors ``` ``` [[1]] [1] 1 1 1 ``` ``` r > w.rook.m <- nb2mat(w.rook, style="B", zero.policy = TRUE) # spatial weights matrix > w.rook.m ``` ``` [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] 1 0 1 0 1 1 0 0 0 0 0 0 0 2 1 0 1 1 1 1 0 0 0 0 1 0 3 0 1 0 0 1 0 0 1 0 0 1 0 4 1 1 0 0 0 0 0 0 0 0 0 0 5 1 1 1 0 0 0 0 0 0 0 0 0 6 0 1 0 0 0 0 0 0 0 0 1 1 7 0 0 0 0 0 0 0 0 1 1 0 1 8 0 0 1 0 0 0 0 0 1 1 1 0 9 0 0 0 0 0 0 1 1 0 1 0 0 10 0 0 0 0 0 0 1 1 1 0 1 1 11 0 1 1 0 0 1 0 1 0 1 0 1 12 0 0 0 0 0 1 1 0 0 1 1 0 attr(,"call") nb2mat(neighbours = w.rook, style = "B", zero.policy = TRUE) ``` --- - We now plot the neighbour relationship based on rook move adjacency ``` r > # Compute coordinates at centroids of each canton > coords <- coordinates(lux1) > > # Plot > par(mai=c(0,0,0,0)) > plot(lux1, 
col='white', border='blue') > plot(w.rook, coords, col='red', lwd=2, add=TRUE) ``` <center><img src=./img/Rplot.jpeg width='45%' title=''></center> --- name: modelling <span style="display:block; margin-top: 250px ;"></span> .myblue[.center[.huge[ **Modelling**]]] --- # Intrinsic Conditional Autoregressive (ICAR) model .panelset[ .panel[.panel-name[General definition] `$$\mathbf{u} \sim \hbox{ICAR}(\mathbf{W},\sigma^2_u)$$` with <span style="display:block; margin-top: 20px ;"></span> - `\(\mathbf{W}\)` matrix defining the neighbours (weights) - `\(\sigma^2_u\)` conditional variance parameter of `\(\mathbf{u}\)` `$$u_i \mid u_{j \;\; j\ne i} \sim \hbox{Normal}\left(\frac{\sum_{j} W_{ij} u_j}{\sum_{j} W_{ij}}, \frac{\sigma^2_u}{\sum_{j} W_{ij}}\right)$$` (Besag, York, and Mollie, 1991) ] .panel[.panel-name[Common definition] `$$\mathbf{u} \sim \hbox{ICAR}(\mathbf{W},\sigma^2_u)$$` Let `\(\partial_i =\)` set of areas adjacent to `\(i\)`, `\(w_{ij}\)` = 1 for `\(j \in \partial_i\)`, 0 otherwise `$$u_i \mid u_{j \;\; j\ne i} \sim \hbox{Normal}\left(\frac{\sum_{j \in \partial_i} u_j}{n_i}, \frac{\sigma^2_u}{n_i}\right)$$` <span style="display:block; margin-top: 20px ;"></span> - `\(u_i\)` is smoothed towards the mean risk in a set of neighbouring areas - Conditional variance inversely proportional to the number of neighbours (so more neighbours, less variability) ] .panel[.panel-name[Remarks] - The ICAR model is improper: the overall mean of `\(\bm{u}\)` is not defined. 
So an additional constraint needs to be imposed: .red[**sum-to-zero constraint**:] `\(\sum_i u_i = 0\)` <span style="display:block; margin-top: 20px ;"></span> - The parameter `\(\sigma^2_u\)` represents the .red[**conditional**] variance of the random effects (and not the marginal one) and its magnitude determines the amount of spatial variation <span style="display:block; margin-top: 20px ;"></span> - No closed-form expression is available for the .red[**marginal**] between-area variance of the spatial effects → estimate the marginal spatial variance empirically `$$s^2_{\text{u.marginal}} = \sum_i (u_i - \overline{u})^2 / (N-1)$$` ] ] --- # Combining ICAR with unstructured random effects - The ICAR model makes a strong spatial assumption; it cannot take a limiting form that allows non-spatial variability - Besag, York and Mollie (BYM) recommended combining the ICAR prior and the standard normal prior to allow for both - spatially unstructured latent covariates `\(\bm{v}\)` modelled as iid → global smoothing - spatially correlated latent covariates `\(\bm{u}\)` modelled as ICAR → local smoothing .panelset[ .panel[.panel-name[BYM] `\begin{align*} y_i &\sim \text{Poisson}(\lambda_i = \rho_i E_i)\\ \eta_i &= \log \rho_i = b_0 + v_i + u_i\\ v_i &\sim \text{Normal}(0, \sigma^2_v)\\ \mathbf{u} &\sim \hbox{ICAR}(\mathbf{W},\sigma^2_u) \end{align*}` - Need to specify hyperprior distributions for: - `\(\sigma^2_v\)` (between-area unstructured marginal variance), e.g. `\(1/\sigma^2_v \sim \text{Gamma}(1,0.001)\)` - `\(\sigma^2_u\)` (between-area spatial conditional variance), e.g. `\(1/\sigma^2_u \sim \text{Gamma}(1,0.001)\)` - `\(b_0\)` (mean log relative risk), e.g. 
`\(b_0 \sim \text{Normal}(0,0.0001)\)` ] .panel[.panel-name[BYM2] `\begin{align*} y_i &\sim \text{Poisson}(\lambda_i = \rho_i E_i)\\ \eta_i &= \log \rho_i = b_0 + b_i\\ \boldsymbol{b} &= \frac{1}{\sqrt{\tau_b}}(\sqrt{1-\phi}\boldsymbol{v}_{*} + \sqrt{\phi}\boldsymbol{u}_{*}) \end{align*}` where `\(\boldsymbol{v}_{*}\)` and `\(\boldsymbol{u}_{*}\)` are standardised versions of `\(\bm{v}\)` and `\(\bm{u}\)`. - Need to specify hyperprior distributions for: - `\(\tau_b>0\)`, which is the precision parameter controlling the marginal variance of the random effect - `\(\phi\)`, which is the mixing parameter measuring the proportion of the marginal variance explained by the structured effect `\(\boldsymbol{u}_{*}\)`. The BYM2 model reduces to a purely spatial model when `\(\phi = 1\)`, and to purely unstructured noise when `\(\phi = 0\)`. ] ] --- # Priors for BYM2 Under the BYM2 specification the hyperparameters `\(\tau_b\)` and `\(\phi\)` are modelled using .red[**Penalised Complexity** (PC) priors] (Riebler, Sørbye, Simpson, and H. Rue, 2016): - Regularise inference without imposing too strong prior information - Penalise departure from a "base" model (typically characterised by a fixed value of the relevant parameter) - Prior tends to favour the base model → need fairly strong evidence to move away from it - The distance `\(d\)` between the **base** model and an **alternative**, more complex model, is measured by the Kullback-Leibler divergence - The penalisation from the base model is done at a constant rate on the distance by assigning an exponential distribution to `\(d\)`: .blue[ `$$p(d)=\lambda\exp(-\lambda d), \qquad d \sim \text{Exp}(\lambda)$$` ] - The PC prior is defined using probability statements on the model parameters, which determine the value of `\(\lambda\)` from "reasonable" prior information. - For PC priors, see: - Riebler, Sørbye, Simpson et al. 
(2016) - section 5.4 of Gómez-Rubio's book **Bayesian inference with INLA** (https://becarioprecario.bitbucket.io/inla-gitbook/index.html) - sections 4.3-4.4 of Moraga's book **Geospatial Health Data: Modeling and Visualization with R-INLA and Shiny** (https://www.paulamoraga.com/book-geospatial/index.html) --- # Priors for BYM2 Using probability statements, we can define the PC priors for the two hyperparameters as: <span style="display:block; margin-top: -10px ;"></span> .pull-left[ .content-box-beamer[ ### Prior on `\(\tau_b\)` `$$P((1/\sqrt{\tau_b})>U_1) = \alpha_1$$` which can be interpreted as .blue[*the probability that the standard deviation of the random effect is larger than*] `\(\class{blue}{U_1}\)` .blue[*is equal to*] `\(\class{blue}{\alpha_1}\)` ] ] .pull-right[ .content-box-beamer[ ### Prior on `\(\phi\)` `$$P(\phi < U_2) = \alpha_2$$` which can be interpreted as .blue[*the probability that the spatial random effect explains less than*] `\(\class{blue}{U_2}\)` .blue[*of the total variability is equal to*] `\(\class{blue}{\alpha_2}\)` ] ] --- count: false # Priors for BYM2 Using probability statements we can define the PC priors for the two hyperparameters as: <span style="display:block; margin-top: -10px ;"></span> .pull-left[ .content-box-beamer[ ### Prior on `\(\tau_b\)` `$$P((1/\sqrt{\tau_b})>U_1) = \alpha_1$$` which can be interpreted as .blue[*the probability that the standard deviation of the random effect is larger than*] `\(\class{blue}{U_1}\)` .blue[*is equal to*] `\(\class{blue}{\alpha_1}\)` ] ] .pull-right[ .content-box-beamer[ ### Prior on `\(\phi\)` `$$P(\phi < U_2) = \alpha_2$$` which can be interpreted as .blue[*the probability that the spatial random effect explains less than*] `\(\class{blue}{U_2}\)` .blue[*of the total variability is equal to*] `\(\class{blue}{\alpha_2}\)` ] ] <span style="display:block; margin-top: 5px ;"></span> where `\(\lambda = \frac{-\log(\alpha_k)}{U_k}\)` - We need to define `\(U_1, U_2\)` and `\(\alpha_1, 
\alpha_2\)`; following Riebler, Sørbye, Simpson et al. (2016): <span style="display:block; margin-top: 10px ;"></span> -- - A marginal standard deviation (sd) not too large (e.g. 0.5); -- - `\(\alpha_1=0.01\)` (we want to allow for a small probability) -- - `\(\alpha_2=2/3\)` (we give higher prior probability to the spatial random effect explaining less than 50% of the variability) -- - Therefore the priors are as follows: 1\. Assuming `\(U_1=0.5/0.31\)` translates into `\(P(\sigma_b>1.61) = 0.01\)`, using the rule of thumb in Riebler, Sørbye, Simpson et al. (2016) 2\. Assuming `\(U_2=0.5\)` we get `\(P(\phi < 0.5) = 2/3\)` --- # Poisson model with BYM random effects - Choice of the adjacency matrix (neighbours): two areas are neighbours if they share a common border → Adjacency matrix implemented in INLA - An area cannot be specified as its own neighbour - Adjacency matrix must be symmetric -- - `\(\text{RR}_i = \exp(b_0 + b_i)\)`: RR in area `\(i\)` relative to the age/sex structure (used to estimate the `\(E_i\)`) - `\(\hbox{residual RR}_i = \exp(b_i)\)`: residual RR in area `\(i\)` relative to the region average after adjusting for the overall risk -- - `\(\sigma^2_b\)` reflects the marginal variability of the random effects - `\(\phi\)` represents the weight of the spatial structure --- name: Example <span style="display:block; margin-top: 250px ;"></span> .myblue[.center[.huge[ **Example: Suicides in London**]]] --- # Suicides in Greater London, M+F, 1989-1993, Boroughs - 32 boroughs in Greater London - Interest: mapping the RR in each borough - Methods with no spatial structure: SMR, non-spatial smoothing - Spatial smoothing using the BYM2 model `\begin{align*} y_i & \sim \hbox{Poisson}(\rho_i E_i)\\ \log \rho_i & = b_0 + b_i\\ \boldsymbol{b} &= \frac{1}{\sqrt{\tau_b}}(\sqrt{1-\phi}\boldsymbol{v}_{*} + \sqrt{\phi}\boldsymbol{u}_{*}) \end{align*}` - Data: `\(\bm{y}\)` and `\(\bm{E}\)` - Priors: `\(\tau_b\)`, 
`\(b_0\)`, `\(\phi\)` - Parameters of interest: - residual RR `\((\hbox{resRR}_i=\exp(b_i))\)` - marginal variance `\((1/ \tau_b)\)` - percent of total variation in the log RR due to spatial effects `\((\phi)\)` --- # Adjacency matrix in INLA - It is possible to produce a graph from a shapefile <span style="display:block; margin-top: 20px ;"></span> - Read in the shapefile using the `sf` package ``` r > library(sf) > london.gen = st_read("LDNSuicides.shp") > london.gen$ID = seq(1,32) ``` -- - Use `poly2nb` and `nb2INLA` from the `spdep` package to transform the shapefile into an adjacency matrix ``` r > library(spdep) > nb2INLA("LDN.graph",poly2nb(london.gen)) > LDN.adj = paste(getwd(),"/LDN.graph",sep="") ``` -- - Now `LDN.graph` has been saved in the working directory and can be called when specifying the BYM model (see later) --- # Spatial distributions for area level data in INLA We introduce here the specification of the `ICAR` and `BYM2` models in `INLA`, which is done through `f()`: .panelset[ .panel[.panel-name[ICAR in `INLA`] ``` r > formula.ICAR = y ~ f(ID, model="besag", graph=LDN.adj) ``` <span style="display:block; margin-top: 20px ;"></span> - `ID` is the area identifier <span style="display:block; margin-top: 20px ;"></span> - `graph=LDN.adj` identifies the adjacency structure constructed as seen before <span style="display:block; margin-top: 20px ;"></span> - `model="besag"` specifies the intrinsic conditional autoregressive structure as described before On the example: ``` r > formula = y ~ 1 + f(ID, model="besag",graph=LDN.adj, + hyper=list( + prec=list( + prior="loggamma",param=c(1,0.0005)))) ``` ] .panel[.panel-name[BYM2 in `INLA`] ``` r > formula.BYM2 = y ~ f(ID, model="bym2", graph=LDN.adj) ``` <span style="display:block; margin-top: 20px ;"></span> - `ID` is the area identifier <span style="display:block; margin-top: 20px ;"></span> - `graph=LDN.adj` identifies the adjacency structure constructed as seen before <span style="display:block; margin-top: 
20px ;"></span> - `model="bym2"` specifies the combination of the intrinsic conditional autoregressive structure and unstructured random effect as described before On the example: ``` r > formula = y ~ 1 + f(ID, model="bym2",graph="LDN.graph", + hyper=list(prec = list( + prior = "pc.prec", + param = c(0.5 / 0.31, 0.01)), + phi = list( + prior = "pc", + param = c(0.5, 2 / 3)))) ``` ] ] --- # Running the model in `INLA` To run the model in INLA ``` r > mod.suicides = inla(formula,family="poisson", + data=data.suicides,E=E, + control.compute=list(dic=TRUE, waic=TRUE)) ``` `R-INLA` estimates the parameters `\(\bm \theta = \{b_0, \bm{b}, \bm{u}\}\)` and the hyper-parameters `\(\bm\psi=\{\tau_{b}, \phi\}\)`. --- # How to get information from random effects - The random effects are obtained through ``` r > mod.suicides$summary.random$ID ``` ``` ID mean sd 0.025quant 0.5quant 0.975quant mode kld 1 1 -0.08520721 0.10806381 -0.29842450 -0.08479924 0.125707632 -0.08479341 4.423315e-09 2 2 -0.17682861 0.08551784 -0.34570967 -0.17647348 -0.009959679 -0.17647124 6.357144e-09 3 3 -0.21879424 0.09676277 -0.40988139 -0.21839754 -0.029968014 -0.21839899 6.361769e-09 4 4 0.11867772 0.08261913 -0.04260933 0.11836609 0.281739418 0.11836660 5.462791e-09 5 5 -0.14137336 0.08418255 -0.30666920 -0.14136770 0.023886609 -0.14136883 2.924925e-09 6 6 0.40093675 0.08265589 0.23906185 0.40081698 0.563536169 0.40082945 2.467318e-09 ``` which is a matrix formed by `\(2n\)` rows: - `\(1:n\)` rows include information on the area specific residuals `\(b_i\)` - `\(n+1:2n\)` rows are the spatially structured residuals `\(u_i\)` --- # How to get information from random effects - All these parameters are on the logarithmic scale; to transform the marginals back to the natural scale: .pull-left[ ``` r > b = mod.suicides$marginals.random$ID[1:Nareas] ``` this returns a list with `Nareas` elements, each representing the posterior marginal of `\(b_i\)` for that area ] .pull-right[ <img 
src="./img/fig2.1-1.png" style="display: block; margin: auto;" width="50%"> ] -- <span style="display:block; margin-top: 20px ;"></span> - Then we can get the posterior mean and 95% credible intervals of `\(\exp(b_i)\)`: .pull-left[ ``` r > zeta = lapply(b,function(x) inla.emarginal(exp,x)) > zeta_CI = lapply(b,function(x) + inla.qmarginal(c(0.025,0.975), + inla.tmarginal(exp,x))) ``` ] .pull-right[ <span style="display:block; margin-top: -150px ;"></span> <center><img src=./img/trans1-plot-1.png width='50%' title=''></center> ] --- # Identification of spatial patterns (exceedance probabilities) - We can now calculate the probabilities that the smoothed RR estimates are greater than a given threshold value - These probabilities are called .red[exceedance probabilities] and are useful to assess unusual elevation of disease risk (Richardson, Thomson, Best, and Elliott, 2004) - The probability that the smoothed RR of area `\(i\)` is higher than a value `\(c\)` can be written as `\(P(resRR_i>c)\)`. This can be computed by subtracting `\(P(resRR_i \le c)\)` from 1, that is: `$$P(resRR_i>c) = 1- P(resRR_i \le c)$$` - In `R-INLA`, we can compute the probability `\(P(resRR_i \le c)\)` using the `inla.pmarginal()` function, with arguments equal to the marginal distribution of `\(resRR_i\)` and the threshold value `\(c\)`. Then, the exceedance probability `\(P(resRR_i>c)\)` is obtained by subtracting this probability from 1: ``` r > 1 - inla.pmarginal(q = c, marginal = marg) ``` where `marg` is the marginal distribution of the residual relative risks, and `\(c\)` is the threshold value. 
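As a toy illustration of the calculation (the posterior summaries below are hypothetical, not the London results, and a normal approximation stands in for the full marginal that `inla.pmarginal()` would integrate):

``` r
# Hedged sketch: if the posterior marginal of b_i were approximately
# Normal(0.40, 0.15^2), the exceedance probability P(b_i > 0),
# i.e. P(resRR_i > 1), follows from the normal CDF (values are invented)
post_mean <- 0.40
post_sd   <- 0.15
c_thresh  <- 0   # resRR_i > 1 on the natural scale <=> b_i > 0 on the log scale
p_exceed  <- 1 - pnorm(c_thresh, mean = post_mean, sd = post_sd)
round(p_exceed, 3)   # close to 1: strong evidence of elevated residual risk
```

In practice `inla.pmarginal()` works on the numerically tabulated marginal rather than a normal approximation, but the subtraction-from-1 step is identical.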
- For reference, see section 6.6 of Moraga's book https://www.paulamoraga.com/book-geospatial/index.html --- # Posterior probability in `R-INLA` for suicides in London analysis - Remember the parametrisation `\(\zeta = \exp{(b_i)}\)` - We now visualize `\(p(\zeta_i>1\mid \bm y)=p(b_i>0\mid \bm y)\)` using the built-in function `inla.pmarginal`: ``` r > c = 0 > prob.b = lapply(b, function(x) {1 - inla.pmarginal(c, x)}) ``` -- - Create an object with all the info to map ``` r > RR_BYM = tibble(zeta=unlist(zeta),prob=unlist(prob.b), ID=seq(1,32)) > out_map = left_join(london.gen,data.suicides, by="ID") %>% left_join(., RR_BYM, by="ID") > out_map$pp_breaks = cut(out_map$prob, + breaks = c(0, c(0.2, 0.8), + 1), include.lowest = T) ``` --- #...and then create some maps .pull-left[ ### Map of posterior mean of `\(\zeta_i\)` ``` r > ggplot() + geom_sf(data = out_map, + aes(fill = zeta)) + + scale_fill_viridis_c() ``` <img src="./img/map-post-mean-1.png" > ] .pull-right[ ### Map of posterior probability of `\(\zeta_i>1\)` ``` r > ggplot() + geom_sf(data = out_map, + aes(fill = pp_breaks)) + + scale_fill_manual(values = c("blue","orange","red")) ``` <img src="./img/map-post-p-1.png" > ] --- # Output from different models .panelset[ .panel[.panel-name[Comparing maps] .pull-left[ - **SMR** non-smoothed RR - **HET** non-spatially smoothed residual RR: `\(\exp(v)\)` - **CAR** spatially smoothed residual RR: `\(\exp(u)\)` - **BYM** spatially and non-spatially smoothed residual RR: `\(\exp(b)=\exp(u+v)\)` ] .pull-right[ <center><img src=./img/Lungmales_4maps.jpg width='90%' title=''></center> ] ] .panel[.panel-name[Shrinkage] .pull-left[ <center><img src=./img/Lungmales_shrinkage.jpg width='90%' title=''></center> ] .pull-right[ - Shrinkage towards the mean due to the **borrowing of strength** ] ] .panel[.panel-name[Interpretation] - Smoothed relative risks are more stable (more precise than the observed SMRs) → geographical patterns of risk are easier to detect using smoothed maps <span 
style="display:block; margin-top: 10px ;"></span> - Smoothed relative risks have higher specificity: - Possible "false positive" values shrunk towards mean - But in danger of over-smoothing (false negatives) <span style="display:block; margin-top: 10px ;"></span> - Visual impact of maps can be very dependent on the choice of colours and cut-points used to shade each region → Care must be taken not to over-interpret any patterns identified ] ] --- # SAHSU Environmental Atlas <iframe src="https://www.envhealthatlas.co.uk/homepage/gotoatlas.html" width="120%" height="85%" data-external="1" style="transform: scale(1);border:none"></iframe> --- name: Poisson-regression <span style="display:block; margin-top: 250px ;"></span> .myblue[.center[.huge[ **Ecological regression with spatial random effects (Generalized linear mixed-effect models)**]]] --- # Disease Mapping vs Ecological Regression ## Disease mapping studies - Focus is on description - Level of inference is at the aggregate (small area) level ## Ecological correlation studies - Focus is on **explanation** - Used for investigating specific exposure-disease hypotheses at small-area scale - Poisson regression can be used to model relationship between any area-level exposure measure and incidence/prevalence of disease - Such area-level exposure measures include average annual pollution level, proportion of population who smoke, proportion of population living within x km of a landfill site, etc. 
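As a hedged preview of such an area-level model (all numbers below are simulated, and the spatial random effects are omitted here for brevity, so base R's `glm` with a log-offset suffices):

``` r
# Minimal simulation sketch (invented data): Poisson regression of area
# counts y on an area-level exposure x, with log expected counts as offset
set.seed(1)
N     <- 200
E     <- runif(N, 50, 150)   # expected counts, e.g. from age/sex standardisation
x     <- rnorm(N)            # area-level exposure measure
b0    <- 0.1
beta1 <- 0.3                 # true log relative risk per unit of x
y     <- rpois(N, lambda = E * exp(b0 + beta1 * x))
fit   <- glm(y ~ x, offset = log(E), family = poisson)
exp(coef(fit)["x"])          # close to exp(0.3), the true RR per unit of x
```

Adding the BYM/BYM2 random effect to this linear predictor (via `f()` in `R-INLA`) gives the ecological-regression model discussed next.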
--- # Poisson regression with random effects Straightforward extension of disease mapping model: .content-box-beamer[ ### Ecological regression with BYM structure `\begin{align*} \hbox{y}_i & \sim \hbox{Poisson}(E_i \rho_i); \;\;\; i=1,...,N\\ \log \rho_i & = b_0 + \class{red}{\beta_1 x_i} + b_i\\ \boldsymbol{b} &= \frac{1}{\sqrt{\tau_b}}(\sqrt{1-\phi}\boldsymbol{v}_{*} + \sqrt{\phi} \boldsymbol{u}_{*})\\ \end{align*}` ] <span style="display:block; margin-top: 20px ;"></span> where - `\(y_i\)` and `\(E_i\)`: observed and expected number of cases in each area `\(i\)` - `\(\rho_i\)`: unknown RR - `\(x_i\)`: area-level covariate of interest - `\(\beta_1\)`: parameter associated with the covariate - `\(\bm v_{*}\)`: standardised version of unstructured random effects, i.i.d. - `\(\bm u_{*}\)`: standardised version of random effects with spatial structure, conditional distribution --- # Interpretation of the parameters - `\(\exp(\beta_1)\)` is the change in risk associated with a unit change in exposure `\(x\)` <span style="display:block; margin-top: 30px ;"></span> - `\(b_i\)` is the random effect in area `\(i\)` <span style="display:block; margin-top: 30px ;"></span> - `\(\exp(b_i)\)` is the residual or adjusted relative risk of disease in area `\(i\)` .blue[after accounting for the effects of measured covariates and the overall mean risk] <span style="display:block; margin-top: 30px ;"></span> - The variance of the random effects reflects the amount of overdispersion in the data (total residual variance = Poisson variance + random effects variance) --- # Poisson regression with random effects - INLA code - Continuous covariate ``` r > formula.ecoreg.inla = y ~ 1 + x + + f(id,model="bym2", graph=graph) ``` - Categorical covariate ``` r > formula.ecoreg.inla = y ~ 1 + cut(x,breaks=c(0,7,10,24), + include.lowest=TRUE) + + f(id,model="bym2", graph=graph) ``` --- # Comparison between disease mapping and ecological regression <span style="display:block; margin-top: -40px 
;"></span> <center><img src=./img/LipCancer.png width='50%' title=''></center> <span style="display:block; margin-top: -40px ;"></span> - Less extreme values when covariates are included → part of the spatial variability is explained by the covariates --- # Poisson regression with random effects .content-box-beamer[ ### Extension to several variables `\begin{align*} \hbox{y}_i & \sim \hbox{Poisson}(E_i \rho_i); \;\;\; i=1,...,N\\ \log \rho_i & = b_0 + \beta_1 x_{1i} +\beta_2 x_{2i} + b_i\\ & ... \end{align*}` ] <span style="display:block; margin-top: 10px ;"></span> - `\(\exp(\beta_1)\)` is the relative risk of disease/death associated with a unit increase in exposure `\(x_1\)`, after adjustment for `\(x_2\)` <span style="display:block; margin-top: 10px ;"></span> - `\(\exp(\beta_2)\)` is the relative risk of disease/death associated with a unit increase in exposure `\(x_2\)`, after adjustment for `\(x_1\)` <span style="display:block; margin-top: 10px ;"></span> - `\(\exp(b_i)\)` is the residual or adjusted relative risk of disease/death in area `\(i\)` after accounting for the effects of measured covariates and the overall mean risk --- # Summary: model cheatsheet <table class="table" style="color: black; margin-left: auto; margin-right: auto;"> <caption></caption> <thead> <tr> <th style="text-align:left;font-weight: bold;background-color: lightblue !important;"> Model </th> <th style="text-align:left;font-weight: bold;background-color: lightblue !important;"> Specification </th> <th style="text-align:left;font-weight: bold;background-color: lightblue !important;"> INLA </th> <th style="text-align:left;font-weight: bold;background-color: lightblue !important;"> Interpretation </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;background-color: lightgray !important;"> iid </td> <td style="text-align:left;background-color: lightgray !important;"> \(v_i \sim N(0, \sigma^2_v)\) </td> <td style="text-align:left;background-color: lightgray !important;"> 
f(ID,model="iid",...) </td> <td style="text-align:left;background-color: lightgray !important;"> Unstructured random effect - Global smoothing </td> </tr> <tr> <td style="text-align:left;background-color: lightgray !important;"> ICAR </td> <td style="text-align:left;background-color: lightgray !important;"> \(\mathbf{u} \sim \hbox{ICAR}(\mathbf{W}, \sigma^2_u)\) </td> <td style="text-align:left;background-color: lightgray !important;"> f(ID,model="besag",graph=...) </td> <td style="text-align:left;background-color: lightgray !important;"> Structured random effect - Local smoothing </td> </tr> <tr> <td style="text-align:left;background-color: lightgray !important;"> BYM </td> <td style="text-align:left;background-color: lightgray !important;"> \(v_i \sim N(0, \sigma^2_v), \mathbf{u} \sim \hbox{ICAR}(\mathbf{W}, \sigma^2_u)\) </td> <td style="text-align:left;background-color: lightgray !important;"> f(ID, model="bym", graph=...) </td> <td style="text-align:left;background-color: lightgray !important;"> Unstructured + structured random effects - Global + local smoothing </td> </tr> <tr> <td style="text-align:left;background-color: lightgray !important;"> BYM2 </td> <td style="text-align:left;background-color: lightgray !important;"> \(b_i=\frac{1}{\sqrt{\tau_b}}(\sqrt{\phi} \times u_i^* + \sqrt{1-\phi} \times v_i^*)\) </td> <td style="text-align:left;background-color: lightgray !important;"> f(ID, model="bym2", graph=...) 
</td> <td style="text-align:left;background-color: lightgray !important;"> Weighted average of unstructured and structured random effects - Global + local smoothing </td> </tr> </tbody> </table> --- # Summary: spatial models - Hierarchical models allow "borrowing of strength" across units - posterior distribution of `\(\rho_i\)` for each unit borrows strength from the likelihood contributions for **all** the units, via their joint influence on the posterior estimates of the unknown hyper-parameters <span style="display:block; margin-top: 10px ;"></span> - Judgements of exchangeability need careful assessment - units suspected a priori to be systematically different might be modelled by including relevant covariates so that residual variability more plausibly reflects exchangeability - subgroups of prior interest should be considered separately <span style="display:block; margin-top: 10px ;"></span> - Mapping geographical variations in disease risk is an important epidemiological technique for suggesting aetiological hypotheses <span style="display:block; margin-top: 10px ;"></span> - When combined with data on geographical variations in exposure, disease mapping techniques can be used to investigate and quantify **ecological** associations between disease risk and potential exposures --- # A few notes on ecological bias and atomistic fallacy - In today's session, we worked with aggregated data. <span style="display:block; margin-top: 20px ;"></span> - Aggregation has implications for the types of inference that are possible. <span style="display:block; margin-top: 20px ;"></span> - Ecological inference is the process whereby aggregated data are used to infer individual level relationships. Reasons: - individual data are not available for confidentiality reasons - individual data are not reliable or too expensive to be collected <span style="display:block; margin-top: 20px ;"></span> - .red[Ecological (or aggregation) bias] can occur. 
It is the difference between the estimates of relationships obtained using grouped data and those obtained using individual data (e.g. associations observed at the area level may not hold for the individuals within areas). <span style="display:block; margin-top: 20px ;"></span> - The converse, using individual level estimates uncritically to infer group level relationships (ignoring the possibility of group level or contextual effects), is known as the .red[atomistic or individualistic fallacy]. --- # References Besag, J., J. York, and A. Mollie (1991). "Bayesian Image Restoration, with two Applications in Spatial Statistics". In: _Annals of the Institute of Statistical Mathematics_ 43, pp. 1-59. Richardson, S., A. Thomson, N. Best, et al. (2004). "Interpreting posterior relative risk estimates in disease-mapping studies". In: _Environmental Health Perspectives_ 112.9, pp. 1016-1025. Riebler, A., S. H. Sørbye, D. Simpson, et al. (2016). "An intuitive Bayesian spatial model for disease mapping that accounts for scaling". In: _Statistical Methods in Medical Research_ 25.4, pp. 1145-1165.