class: title-slide # Session 8.2: Spatio-temporal modelling for area data<span style="display:block; margin-top: 10px ;"></span> ## ### ### Imperial College London <!-- Can also separate the various components of the extra argument 'params', eg as in ### Imperial College London, , MSc in Epidemiology --> --- layout: true .my-footer[ .alignleft[ © Blangiardo | Pirani | Gascoigne ] .aligncenter[ MSc in Epidemiology ] .alignright[ Imperial College London, NA ] ] --- # Learning Objectives At the end of this session you should be able to: <span style="display:block; margin-top: 10px ;"></span> - Explain how to extend spatial or temporal models to spatio-temporal models <span style="display:block; margin-top: 10px ;"></span> - Fit Bayesian space-time models using `R-INLA` <span style="display:block; margin-top: 20px ;"></span> The topics covered in this lecture can be found in Chapter 7 of the book Spatial and **Spatio-Temporal Bayesian Models with `R-INLA`**. --- # Outline <span style="display:block; margin-top: 30px ;"></span> 1\. [Temporal dependence](#temporal-structure) <span style="display:block; margin-top: 30px ;"></span> 2\. [From space to space-time](#space-to-time) <span style="display:block; margin-top: 30px ;"></span> 3\. [Type of interactions](#interactions) --- name: temporal-structure <span style="display:block; margin-top: 250px ;"></span> .myblue[.center[.huge[ **Temporal dependence**]]] --- # Temporal dependence - Similarly to spatial dependence, it is sometimes necessary to model temporal dependence of data or of parameters: <span style="display:block; margin-top: 20px ;"></span> - the weekly or monthly number of cases of many diseases exhibit often a seasonal pattern as well as short term dependence <span style="display:block; margin-top: 10px ;"></span> - the underlying daily level of atmospheric pollutants, e.g. PM `\(_{10}\)`, will show strong correlation over consecutive days because their lifetime lasts over several days <span style="display:block; margin-top: 20px ;"></span> - To the contrary of spatial models, there is a natural order to any time series data which is used in specifying the models. --- # Spatial patterns 1\. Data - Disease counts `\(y_{i}\)` in area `\(i\)` and stratum `\(k\)`, aggregated over a time period, `\(i=1,\ldots,N\)`, `\(k=1,\ldots,K\)` <span style="display:block; margin-top: 10px ;"></span> - Population counts `\(n_{ik}\)` in area `\(i\)` and stratum `\(k\)`, aggregated over a time period <span style="display:block; margin-top: 10px ;"></span> - Expected numbers `\(E_i = \sum_k n_{ik} r_k\)`, where `\(r_k\)` reference rate for stratum (age, gender,...) 2\. Spatial smoothing using BYM model `\begin{align*} y_i & \sim \hbox{Poisson}(E_i \rho_i); \;\;\; i=1,...,N\\ \eta_i = \log \rho_i & = b_0 + b_i\\ \boldsymbol{b} &= \frac{1}{\sqrt{\tau_b}}(\sqrt{1-\phi}\boldsymbol{v}_{*} + \sqrt{\phi}\boldsymbol{u}_{*}) \end{align*}` --- # Ohio Lung cancer spatial pattern - Data on lung cancer in 88 counties of Ohio, 1968-1988 - Purely spatial analysis <img src="./img/unnamed-chunk-3-1.png" > --- # Temporal trends 1\. Data - Disease counts `\(y_{tk}\)` in time period `\(t\)` and stratum `\(k\)`, aggregated over space, `\(t=1,\ldots,T\)` (equally-spaced time intervals), `\(k=1,\ldots,K\)` <span style="display:block; margin-top: 10px ;"></span> - Population counts `\(n_{tk}\)` in time period `\(t\)` and stratum `\(k\)`, aggregated over space <span style="display:block; margin-top: 10px ;"></span> - Expected numbers `\(E_t = \sum_k n_{tk} r_k\)`, where `\(r_k\)` reference rate for stratum (age, gender,...) Note that `\(E_t\)` makes sense if the time series covers a long period, e.g. years, otherwise it is common to assign `\(E_t = \sum_t y_t / T\)` <span style="display:block; margin-top: 20px ;"></span> 2\. Temporal trends `\begin{align*} y_t & \sim \hbox{Poisson}(E_t \rho_t); \;\;\; t=1,...,T\\ \log \rho_t & = ??? \end{align*}` --- count:false # Temporal trends 1\. Data - Disease counts `\(y_{tk}\)` in time period `\(t\)` and stratum `\(k\)`, aggregated over space, `\(t=1,\ldots,T\)` (equally-spaced time intervals), `\(k=1,\ldots,K\)` <span style="display:block; margin-top: 10px ;"></span> - Population counts `\(n_{tk}\)` in time period `\(t\)` and stratum `\(k\)`, aggregated over space <span style="display:block; margin-top: 10px ;"></span> - Expected numbers `\(E_t = \sum_k n_{tk} r_k\)`, where `\(r_k\)` reference rate for stratum (age, gender,...) Note that `\(E_t\)` makes sense if the time series covers a long period, e.g. years, otherwise it is common to assign `\(E_t = \sum_t y_t / T\)` <span style="display:block; margin-top: 20px ;"></span> 2\. Temporal trends `\begin{align*} y_t & \sim \hbox{Poisson}(E_t \rho_t); \;\;\; t=1,...,T\\ \log \rho_t & = b_0 + \beta t \quad \hbox{ simple linear regression}\\ \end{align*}` --- count:false # Temporal trends 1\. Data - Disease counts `\(y_{tk}\)` in time period `\(t\)` and stratum `\(k\)`, aggregated over space, `\(t=1,\ldots,T\)` (equally-spaced time intervals), `\(k=1,\ldots,K\)` <span style="display:block; margin-top: 10px ;"></span> - Population counts `\(n_{tk}\)` in time period `\(t\)` and stratum `\(k\)`, aggregated over space <span style="display:block; margin-top: 10px ;"></span> - Expected numbers `\(E_t = \sum_k n_{tk} r_k\)`, where `\(r_k\)` reference rate for stratum (age, gender,...) Note that `\(E_t\)` makes sense if the time series covers a long period, e.g. years, otherwise it is common to assign `\(E_t = \sum_t y_t / T\)` <span style="display:block; margin-top: 10px ;"></span> OR <span style="display:block; margin-top: 10px ;"></span> 2\. Temporal trends `\begin{align*} y_t & \sim \hbox{Poisson}(E_t \rho_t); \;\;\; t=1,...,T\\ \log \rho_t & = b_0 + \psi_t \quad \hbox{ global temporal smoothing}\\ & \psi_t \sim \hbox{Normal}(0, \sigma^2_{\psi})\\ \end{align*}` --- count:false # Temporal trends 1\. Data - Disease counts `\(y_{tk}\)` in time period `\(t\)` and stratum `\(k\)`, aggregated over space, `\(t=1,\ldots,T\)` (equally-spaced time intervals), `\(k=1,\ldots,K\)` <span style="display:block; margin-top: 10px ;"></span> - Population counts `\(n_{tk}\)` in time period `\(t\)` and stratum `\(k\)`, aggregated over space <span style="display:block; margin-top: 10px ;"></span> - Expected numbers `\(E_t = \sum_k n_{tk} r_k\)`, where `\(r_k\)` reference rate for stratum (age, gender,...) Note that `\(E_t\)` makes sense if the time series covers a long period, e.g. years, otherwise it is common to assign `\(E_t = \sum_t y_t / T\)` <span style="display:block; margin-top: 10px ;"></span> OR <span style="display:block; margin-top: 10px ;"></span> 2\. Temporal trends `\begin{align*} y_t & \sim \hbox{Poisson}(E_t \rho_t); \;\;\; t=1,...,T\\ \log \rho_t & = b_0 + \gamma_t \quad \hbox{ local temporal smoothing}\\ & \gamma_t \sim RW(\sigma^2_{\gamma})\\ \end{align*}` --- count:false # Temporal trends 1\. Data - Disease counts `\(y_{tk}\)` in time period `\(t\)` and stratum `\(k\)`, aggregated over space, `\(t=1,\ldots,T\)` (equally-spaced time intervals), `\(k=1,\ldots,K\)` <span style="display:block; margin-top: 10px ;"></span> - Population counts `\(n_{tk}\)` in time period `\(t\)` and stratum `\(k\)`, aggregated over space <span style="display:block; margin-top: 10px ;"></span> - Expected numbers `\(E_t = \sum_k n_{tk} r_k\)`, where `\(r_k\)` reference rate for stratum (age, gender,...) Note that `\(E_t\)` makes sense if the time series covers a long period, e.g. years, otherwise it is common to assign `\(E_t = \sum_t y_t / T\)` <span style="display:block; margin-top: 10px ;"></span> OR <span style="display:block; margin-top: 10px ;"></span> 2\. Temporal trends `\begin{align*} y_t & \sim \hbox{Poisson}(E_t \rho_t); \;\;\; t=1,...,T\\ \log \rho_t & = b_0 + \gamma_t + \psi_t \quad \hbox{ global and local temporal smoothing}\\ & \psi_t \sim \hbox{Normal}(0, \sigma^2_{\psi})\\ & \gamma_t \sim RW(\sigma^2_{\gamma})\\ \end{align*}` --- # Ohio Lung cancer data: modelled temporal trend - Annual rates adjusted by gender and race - Fitting different temporal trends (linear, .red[local smoothing], .blue[local+global smoothing]) <img src="./img/ohio-rw-1.png" > --- name: space-to-time <span style="display:block; margin-top: 250px ;"></span> .myblue[.center[.huge[ **From space to space-time**]]] --- # Disease mapping: Extending space to space-time - Disease mapping is usually carried out on aggregated data over a time period <span style="display:block; margin-top: 10px ;"></span> - Rather than suppressing the time dimension, it can be interesting to use models that combine the space and time dimension <span style="display:block; margin-top: 10px ;"></span> - The stability (or not) of the spatial pattern can aid interpretation <span style="display:block; margin-top: 10px ;"></span> - The specific space-time components of the model can potentially pinpoint unusual/emerging hazards <span style="display:block; margin-top: 10px ;"></span> - Data: - `\(y_{it}\)` <span style="display:block; margin-top: 10px ;"></span> - `\(\hbox{E}_{it}\)`: the observed and expected number of cases in area `\(i\)` at time `\(t\)` calculated as `\(E_{it} = \sum_k n_{itk} r_k\)`, where `\(r_k\)` reference rate for stratum (age, gender,...) --- # Dynamic spatio-temporal model <center><img src=./img/representation_ST.jpg width='80%' title='INCLUDE TEXT HERE'></center> --- # Ohio Lung cancer - temporal SMRs Log SMR time trends in each county - Slightly increasing trend but lots of variation across the counties <img src="./img/ohio-ST-1.png" > --- # Model 1: linear time + space `\begin{align*} y_{it} & \sim \hbox{Poisson}(\hbox{E}_{it} \rho_{it}) \\ \log\rho_{it} & = b_0 + \beta \times t + b_i\\ \end{align*}` where - `\(b_0\)` overall log RR in Ohio over the 21-year period - `\(b_i\)` is the weighted average of spatially structured (ICAR) and unstructured random effects - `\(\exp(\beta)\)` is the change in the RR associated with a 1-year increase in time ``` r > formula.mod1 = y ~ 1 + year + f(county,model="bym2", + graph=Ohio.adj) ``` ``` r > mod1 = inla(data=ohio.data,formula=formula.mod1, E=E, family="poisson", control.compute=list(waic=TRUE)) ``` Additional temporal random effects can be added... --- # Model 2: linear time + space + structured time `\begin{align*} y_{it} & \sim \hbox{Poisson}(\hbox{E}_{it} \rho_{it}) \\ \log\rho_{it} & = b_0 + \beta\times t + b_i + \color{red}{\gamma_t}\\ \end{align*}` where - `\(b_0\)` overall log RR in Ohio over the 21-year period - `\(b_i\)` is the weighted average of spatially structured (ICAR) and unstructured random effects - `\(\exp(\beta)\)` is the change in the RR associated with a 1-year increase in time - Looking back at the purely temporal analysis it seems a random walk was able to explain the local changes in time `\(\color{red}{\gamma_t \sim \hbox{Normal}(\gamma_{t-1}, \sigma^2_{\gamma})}\)` ``` r > year2<-ohio.data$year > formula.mod2 = y ~ 1 + + year + f(county,model="bym2", + graph=Ohio.adj) + + f(year2,model="rw1") ``` ``` r > mod2 = inla(data=ohio.data,formula=formula.mod2, E=E, family="poisson", control.compute=list(waic=TRUE)) ``` --- # Comparing model 1 and 2: Fixed effects ``` r > mod1$summary.fixed ``` ``` mean sd 0.025quant 0.5quant 0.975quant mode kld (Intercept) -0.98833579 0.11241936 -1.20947970 -0.98832087 -0.76727608 -0.98832090 1.469624e-08 year 0.02612109 0.00058628 0.02497145 0.02612109 0.02727072 0.02612109 5.512061e-11 ``` ``` r > mod2$summary.fixed ``` ``` mean sd 0.025quant 0.5quant 0.975quant mode kld (Intercept) -1.22479507 1.8561760 -4.9003996 -1.22480717 2.450882 -1.22480591 9.013104e-08 year 0.04083633 0.1683997 -0.2926659 0.04083756 0.374331 0.04083742 9.082271e-08 ``` --- # Comparing model 1 and 2: Fitted values <img src="./img/comp2-1.png" > --- # Model 3: space + structured time + unstructured time - Exploring if a combination of random effects could be more flexible in capturing the temporal trend than a linear trend `\begin{align*} y_{it} & \sim \hbox{Poisson}(\hbox{E}_{it} \rho_{it}) \\ \log\rho_{it} & = b_0 + b_i + \color{red}{\gamma_t} + \color{red}{\psi_t}\\ \end{align*}` where - `\(b_0\)` overall log RR in Ohio over the 21-year period - `\(b_i\)` is the weighted average of spatially structured (ICAR) and unstructured random effects - The temporal pattern is now modelled with two random effects (global + local smoothing) ``` r > formula.mod3 = y ~ 1 + + f(county,model="bym2", graph=Ohio.adj) + + f(year,model="rw1") + + f(year2, model="iid") ``` ``` r > mod3 = inla(data=ohio.data,formula=formula.mod3, E=E, family="poisson", control.compute=list(waic=TRUE)) ``` --- # Comparing model 1 and 3: Fixed effects ``` r > mod1$summary.fixed ``` ``` mean sd 0.025quant 0.5quant 0.975quant mode kld (Intercept) -0.98833579 0.11241936 -1.20947970 -0.98832087 -0.76727608 -0.98832090 1.469624e-08 year 0.02612109 0.00058628 0.02497145 0.02612109 0.02727072 0.02612109 5.512061e-11 ``` ``` r > mod3$summary.fixed ``` ``` mean sd 0.025quant 0.5quant 0.975quant mode kld (Intercept) -0.7759724 0.1580583 -1.086852 -0.775959 -0.465168 -0.7759588 1.534568e-08 ``` --- # Comparing model 2 and 3: Fitted values <img src="./img/comp5-1.png" > --- # Do we need space-time interactions in spatio-temporal models? - When we have temporal random effects, if we look at area specific trends this is what we would see: <center><img src=./img/ohio10.jpeg width='40%' title='INCLUDE TEXT HERE'></center> .red[Parallel lines as the temporal trend is assumed to be the same across all the spatial units] --- # Space-time model with exchangeable interactions - It is easy to allow for spatio-temporal interactions, which add flexibility to the model - A general specification of the model will be `\begin{align*} y_{it} & \sim \hbox{Poisson}(\hbox{E}_{it} \rho_{it}) \\ \log\rho_{it} & = b_0 + b_i + \;\gamma_t + \psi_t + \color{blue}{\delta_{it}}\\ b_i & = v_i + u_i\\ \gamma_t & \sim \hbox{RW(1)}\\ \psi_t & \sim \hbox{N}(0,\sigma^2_{\psi})\\ \color{blue}{\delta_{it}} & \color{blue}{\sim \hbox{Normal}(0, \sigma^2_{\delta})} \end{align*}` The **space-time interaction parameter**, `\(\delta_{it}\)`, is modelled as exchangeable random effects, that capture departure from the additive structure (given by the BYM2 in space and RW1+iid in time). --- # Interpretation of the interactions - The interactions `\(\delta_{it}\)` allow to highlight unusual temporal trends - Rules based on the posterior probabilities `\(p(\delta_{it} > 0)\)` for at least 1 time `\(t\)` .pull-left[ "usual" temporal trends <center><img src=./img/ohio14.jpeg width='70%' title='INCLUDE TEXT HERE'></center> ] .pull-right[ unusual temporal trends <center><img src=./img/ohio13.jpeg width='70%' title='INCLUDE TEXT HERE'></center> ] --- # INLA code for model with interaction - In INLA is very easy to include the interaction in the `formula` environment - We need to specify an index for the interaction, i.e. for each combination of area/time `\begin{align*} \log\rho_{it} & = b_0 + b_i + \gamma_t + \psi_t + \delta_{it}\\ \gamma_t & \sim \hbox{RW(1)}\\ \psi_t & \sim N(0,\sigma^2_{\psi})\\ \delta_{it} \sim & \hbox{Normal}(0, \sigma^2_{\delta}) \end{align*}` ``` r > formula.intI = y ~ + f(county,model="bym2", + graph=Ohio.adj) + + f(year,model="rw1") + + f(year2,model="iid") + + f(area.year,model="iid") ``` --- name: interactions <span style="display:block; margin-top: 250px ;"></span> .myblue[.center[.huge[ **Types of interactions**]]] --- # Another example: Birth weight in Georgia - Count of babies weigthing less than 2500g in 159 counties in Georgia (US) - Period 2000 – 2010 <img src="./img/Georgia_SMR-1.png" style="display: block; margin: auto;" width="55%"> --- # Different types of interactions `\begin{align*} y_{it} & \sim \hbox{Poisson}(\hbox{E}_{it} \rho_{it}) \\ \log \rho_{it} & = b_0 + b_i + \gamma_t + \psi_t + \color{blue}{\delta_{it}}\\ b_i & = v_i + u_i\\ \gamma_t & \sim \hbox{RW(1)}\\ \psi_t & \sim N(0,\sigma^2_{\psi})\\ \end{align*}` <table class="table" style="margin-left: auto; margin-right: auto;"> <caption>Characteristics of ST interaction</caption> <thead> <tr> <th style="text-align:left;"> Interaction </th> <th style="text-align:left;"> Parameters </th> <th style="text-align:left;"> Rank </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> I </td> <td style="text-align:left;"> Unstructured in space and Unstructured in time </td> <td style="text-align:left;"> nT </td> </tr> <tr> <td style="text-align:left;"> II </td> <td style="text-align:left;"> Unstructured in space and RW in time </td> <td style="text-align:left;"> n(T-1) for RW1, n(T-2) for RW2 </td> </tr> <tr> <td style="text-align:left;"> III </td> <td style="text-align:left;"> ICAR in space and Unstructured in time </td> <td style="text-align:left;"> (n-1)T </td> </tr> <tr> <td style="text-align:left;"> IV </td> <td style="text-align:left;"> ICAR in space and RW in time </td> <td style="text-align:left;"> (n-1)(T-1) for RW1, (n-1)(T-2) for RW2 </td> </tr> </tbody> </table> --- # How to model interactions - Data in R format ``` # A tibble: 1,749 × 8 ID.area Obs ID.year Exp y E NAME ID.area.year <int> <int> <dbl> <dbl> <int> <dbl> <chr> <int> 1 1 20 1 25.8 20 25.8 Appling 1 2 1 24 2 25.1 24 25.1 Appling 2 3 1 25 3 22.8 25 22.8 Appling 3 4 1 31 4 25.2 31 25.2 Appling 4 5 1 24 5 24.9 24 24.9 Appling 5 6 1 40 6 26.7 40 26.7 Appling 6 7 1 29 7 26.0 29 26.0 Appling 7 8 1 35 8 27.2 35 27.2 Appling 8 9 1 26 9 25.1 26 25.1 Appling 9 10 1 25 10 24.5 25 24.5 Appling 10 # ℹ 1,739 more rows ``` <span style="display:block; margin-top: -10px ;"></span> - We need to make sure to have an index for area (`ID.area`), one for time (`ID.year`) and one for the interaction (`ID.area.year`) --- # Type I interaction in INLA Let's adopt Type I interaction, which assumes that the two unstructured effects `\(v_i\)` and `\(\psi_t\)` interact <span style="display:block; margin-top: -10px ;"></span> ``` r > Georgia.adj = "Georgia.graph" > ID.year2 = data_INLA$ID.year > formula.intI = y ~ + f(ID.area,model="bym2", + graph=Georgia.adj) + + f(ID.year,model="rw1") + + f(ID.year2,model="iid") + + f(ID.area.year,model="iid") ``` <center><img src=./img/INLA_ST1_mod-1.png width='40%' title='INCLUDE TEXT HERE'></center> --- # Kronecker product - For the interactions of type II-IV we will need to use the **Kronecker product** to specify the dependencies. It is a matrix multiplication which returns a block matrix. For instance `$$\left[\begin{array}{cc} 1 & 2 \\ 3 & 4 \end{array}\right] \otimes \left[ \begin{array}{cc} 0 & 5 \\ 6 & 7 \end{array} \right] = \left[ \begin{array}{cc} 1 \left[\begin{array}{cc} 0 & 5\\ 6 & 7 \end{array}\right] & 2 \left[\begin{array}{cc} 0 & 5\\ 6 & 7 \end{array}\right] \\ 3 \left[\begin{array}{cc} 0 & 5\\ 6 & 7 \end{array}\right] & 4 \left[\begin{array}{cc} 0 & 5\\ 6 & 7 \end{array}\right] \end{array}\right]$$` - There is a function in `R` which does exactly that: `kronecker(a,b)` - Type II-IV of interactions are pretty complex and can take a **very long** time to run - We breifly discuss, but for more information on how to set interactions using the Kronecker product in `INLA` see Goicoa, Adin, Ugarte, and Hodges (2018) --- # Type II interaction: in INLA - Type II combines the structured temporal main effect `\(\gamma_t\)` and the unstructured spatial effect `\(v_i\)`. - We use first create the structure matrix for the time component assuming a random walk of order 1 ``` r > #RW1 > D1 <- diff(diag(11),differences=1) > Q.gammaRW1 <- t(D1)%*%D1 > #Let's look at it > Q.gammaRW1[1:5,1:5] ``` ``` [,1] [,2] [,3] [,4] [,5] [1,] 1 -1 0 0 0 [2,] -1 2 -1 0 0 [3,] 0 -1 2 -1 0 [4,] 0 0 -1 2 -1 [5,] 0 0 0 -1 2 ``` --- # Type II interaction: in INLA - Then we create the Kronecker product `\(v \otimes\)` `\(\gamma\)` and note from the table in slide 35 that the rank deficiency is n (159 in our case) ``` r > #Kronecker product (for RW1) > R <- kronecker(diag(159),Q.gammaRW1) > R[1:5,1:5] ``` ``` [,1] [,2] [,3] [,4] [,5] [1,] 1 -1 0 0 0 [2,] -1 2 -1 0 0 [3,] 0 -1 2 -1 0 [4,] 0 0 -1 2 -1 [5,] 0 0 0 -1 2 ``` --- # Type II interaction: in INLA - Finally we need to impose some more constraints as the matrix have spatio-temporal interaction matrix has a rank deficiency - We calculate the eigenvalues of the structure matrix of the interaction term and the extra constraints are those where the eigenvalues are zero ``` r > eigenQ <- eigen(R) > ids <- which(eigenQ$values < 0.00001) > cMat <- t(eigenQ$vectors[,ids]) ``` The `formula` environment needs to be changed to ``` r > formula.intII<- y ~ f(ID.area,model="bym2",graph=Georgia.adj) + + f(ID.year,model="rw1") + + f(ID.year2,model="iid") + + f(ID.area.year, model="generic0", Cmatrix=R, rankdef = nrow(cMat), + extraconstr = list(A = cMat, e = rep(0, nrow(cMat)))) ``` --- # Type II interaction: in INLA <center><img src=./img/INLA_ST2_mod-1.png width='60%' title='INCLUDE TEXT HERE'></center> --- # Type III interaction in INLA - Type III combines the spatially structured main effect `\(u_i\)` and the unstructured temporal effect `\(\psi_t\)`. - We use first create the structure matrix for the spatial component using the neighborhood graph ``` r > g <- inla.read.graph(Georgia.adj) > Q.xi <- matrix(0, g$n, g$n) > for (i in 1:g$n){ + Q.xi[i,i]=g$nnbs[[i]] + Q.xi[i,g$nbs[[i]]]=-1 + } > Q.xi[1:5,1:5] ``` ``` [,1] [,2] [,3] [,4] [,5] [1,] 2 -1 0 0 0 [2,] -1 4 0 0 0 [3,] 0 0 5 -1 0 [4,] 0 0 -1 4 -1 [5,] 0 0 0 -1 4 ``` --- # Type III interaction in INLA - Then we create the Kronecker product `\(u\)` `\(\otimes\)` `\(\psi\)` and note from the table in slide 35 that the rank deficiency is T (11 in our case) ``` r > #Kronecker product (for RW1) > R <- kronecker(Q.xi,diag(11)) > R[1:5,1:15] ``` ``` [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [1,] 2 0 0 0 0 0 0 0 0 0 0 -1 0 0 0 [2,] 0 2 0 0 0 0 0 0 0 0 0 0 -1 0 0 [3,] 0 0 2 0 0 0 0 0 0 0 0 0 0 -1 0 [4,] 0 0 0 2 0 0 0 0 0 0 0 0 0 0 -1 [5,] 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 ``` --- # Type III interaction in INLA - Finally we need to impose some more constraints (on the eigenvalues equal to 0) ``` r > eigenQ <- eigen(R) > ids <- which(eigenQ$values < 0.00001) > cMat <- t(eigenQ$vectors[,ids]) ``` The `formula` environment becomes ``` r > formula.intIII<- y ~ f(ID.area,model="bym2",graph=Georgia.adj) + + f(ID.year,model="rw1") + + f(ID.year2,model="iid") + + f(ID.area.year, model="generic0", Cmatrix=R, rankdef = nrow(cMat), + extraconstr = list(A = cMat, e = rep(0, nrow(cMat)))) ``` --- # Type III interaction: in INLA <center><img src=./img/INLA_ST3_mod-1.png width='60%' title='INCLUDE TEXT HERE'></center> --- # Type IV interaction in INLA - Type IV is the most complex type of interaction, assuming that the spatially and temporally structured effects `\(u_i\)` and `\(\gamma_t\)` interact - We assume that the temporal dependency structure for each area is not independent from all the other areas anymore, but depends on the temporal pattern of the neighboring areas as well - We create the Kronecker product `\(u\)` `\(\otimes\)` `\(\gamma\)` and note from the table in slide 35 that the rank deficiency is n+T-1 (169 in our case) ``` r > #Kronecker product (for RW1) > R <- kronecker(Q.xi,Q.gammaRW1) > eigenQ <- eigen(R) > ids <- which(eigenQ$values < 0.00001) > cMat <- t(eigenQ$vectors[,ids]) ``` The `formula` environment becomes ``` r > formula.intIV<- y ~ f(ID.area,model="bym2",graph=Georgia.adj) + + f(ID.year,model="rw1") + + f(ID.year,model="iid") + + f(ID.area.year, model="generic0", Cmatrix=R, rankdef = nrow(cMat), + extraconstr = list(A = cMat, e = rep(0, nrow(cMat)))) ``` --- # Type IV interaction: in INLA <center><img src=./img/INLA_ST4_mod-1.png width='60%' title='INCLUDE TEXT HERE'></center> --- # Model selection It might be useful to employ indexes like DIC and WAIC to decide which model is the most appropriate for the problem at hand In this example: <table class="table" style="margin-left: auto; margin-right: auto;"> <caption>Model fitting for the different ST models</caption> <thead> <tr> <th style="text-align:left;"> Interaction </th> <th style="text-align:left;"> Parameters </th> <th style="text-align:left;"> Rank </th> <th style="text-align:right;"> WAIC </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> I </td> <td style="text-align:left;"> Unstructured in space and Unstructured in time </td> <td style="text-align:left;"> nT </td> <td style="text-align:right;"> 11602 </td> </tr> <tr> <td style="text-align:left;"> II </td> <td style="text-align:left;"> Unstructured in space and RW in time </td> <td style="text-align:left;"> n(T-1) for RW1, n(T-2) for RW2 </td> <td style="text-align:right;"> 11569 </td> </tr> <tr> <td style="text-align:left;"> III </td> <td style="text-align:left;"> ICAR in space and Unstructured in time </td> <td style="text-align:left;"> (n-1)T </td> <td style="text-align:right;"> 11659 </td> </tr> <tr> <td style="text-align:left;"> IV </td> <td style="text-align:left;"> ICAR in space and RW in time </td> <td style="text-align:left;"> (n-1)(T-1) for RW1, (n-1)(T-2) for RW2 </td> <td style="text-align:right;"> 11612 </td> </tr> </tbody> </table> So we conclude that the model with the interaction of order II is the most appropriate for this data. --- # Summary <span style="display:block; margin-top: 20px ;"></span> - Increase quality of datasets that are both spatially and temporally indexed <span style="display:block; margin-top: 20px ;"></span> - Advanced methods to deal with this type of data <span style="display:block; margin-top: 20px ;"></span> - Allow to interpret the stability (or not) of the spatial patterns <span style="display:block; margin-top: 20px ;"></span> -- - Model selection tools are useful to choose which interaction to use <span style="display:block; margin-top: 20px ;"></span> - Careful with interactions of order above I as they can take a **very long** time to run --- # References Goicoa, T., A. Adin, M. Ugarte, et al. (2018). "In spatio-temporal disease mapping models, identifiability constraints affect PQL and INLA results". In: _Stochastic Environmental Research and Risk Assessment_ 32.3, pp. 749-770.