class: title-slide

# Session 1.2: Bayes' Theorem<span style="display:block; margin-top: 10px ;"></span>

### Imperial College London

---
layout: true

.my-footer[ .alignleft[ © Marta Blangiardo | Monica Pirani ] .aligncenter[ MSc in Epidemiology ] .alignright[ Imperial College London ] ]

<style>
pre {
  overflow-x: auto;
}
pre code {
  word-wrap: normal;
  white-space: pre;
}
</style>

---

# Learning objectives

After this lecture you should be able to
<span style="display:block; margin-top: 40px ;"></span>
- Distinguish between conditional probability and likelihood
<span style="display:block; margin-top: 40px ;"></span>
- Compute joint and conditional probabilities
<span style="display:block; margin-top: 40px ;"></span>
- Use Bayes' theorem to obtain posterior probabilities

<span style="display:block; margin-top: 20px ;"></span>
The topics treated in this lecture are presented in Chapter 3 of Blangiardo and Cameletti (2015) and in Chapter 2 of Johnson, Ott, and Dogucu (2022).

---

# Outline
<span style="display:block; margin-top: 30px ;"></span>

1\. [Conditional probability and likelihood](#Cond_lik)
<span style="display:block; margin-top: 30px ;"></span>

2\. [Normalising constant](#Norm)
<span style="display:block; margin-top: 30px ;"></span>

3\. [Bayes' Theorem](#BayTheo)

---
name: Cond_lik

<span style="display:block; margin-top: 250px ;"></span>
.myblue[.center[.huge[ **Conditional probability and likelihood**]]]

---

# Example: COVID-19 test

- A COVID-19 test has been shown to have 80% sensitivity and 99% specificity

- In England, COVID-19 prevalence is 6%

<span style="display:block; margin-top: 50px ;"></span>
<center>
.content-box-green[What is the chance that a patient testing positive actually does have COVID-19?]
</center>

<span style="display:block; margin-top: 100px ;"></span>

--

We have two pieces of information:

1. Our prior suggests that the COVID-19 prevalence in the country is low (6%)

2. Our data suggest that our diagnostic test is accurate

---

# Example: COVID-19 test

How can we balance these two pieces of information to answer the question about having the disease?

<span style="display:block; margin-top: 50px ;"></span>
<center><img src=./img/Doodle.png width='75%' title=''></center>

---

# Prior probability model

- Let's look at our prior: COVID-19 prevalence in the country is 6%. How can we formalise it?

<span style="display:block; margin-top: 50px ;"></span>
Define A as the event: **a person has COVID-19 in England**

Then `\(P(A)=0.06\)` and consequently `\(P(A^C)=0.94\)`.

<span style="display:block; margin-top: 50px ;"></span>
Remember that a valid probability model must:

1. account for all possible events (having or not having COVID-19);
2. assign prior probabilities to each event.

<span style="display:block; margin-top: 20px ;"></span>
Also

3. each probability must be between 0 and 1;
4. these probabilities must sum to one.
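
We can check these conditions directly in R; below is a minimal sketch (the object name `prior` is illustrative):

```r
# Prior probability model over the partition {A, A^C}
prior <- c(A = 0.06, A_c = 0.94)   # P(A) = COVID-19 prevalence in England

# Conditions 3 and 4: each probability in [0, 1], and all sum to one
stopifnot(all(prior >= 0), all(prior <= 1),
          isTRUE(all.equal(sum(prior), 1)))
```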
---

# Conditional probability

Now we summarise the **data** that we get from the diagnostic tests:

- 80% sensitivity: if a person has COVID-19 they will test positive 80 out of 100 times

- 99% specificity: if a person does not have COVID-19 they will test negative 99 out of 100 times

--

These are .alert[**conditional probabilities**] and, defining B as the event: a person tests positive for COVID-19, we can summarise the above information as

`$$P(B \mid A) = 0.8$$`

and

`$$P(B \mid A^C) = 1 - P(B^C \mid A^C) = 1 - 0.99 = 0.01$$`

---

# Some rules of conditional probabilities

In general, comparing the conditional vs the unconditional probability, `\(P(B\mid A)\)` vs `\(P(B)\)`, reveals the extent to which the probability of `\(B\)` changes in light of `\(A\)`

In some cases, the certainty of an event `\(B\)` might increase in light of new data `\(A\)`:

- if you eat hamburgers and chips every day, your probability of having high cholesterol is higher than in the general population

`$$P(B \mid A) > P(B)$$`

--

In some cases, the certainty of an event `\(B\)` might decrease in light of new data `\(A\)`:

- if you are vaccinated against flu, your probability of getting into hospital with serious flu complications decreases

`$$P(B \mid A) < P(B)$$`

--

The order of conditioning is also important, as generally `\(P(B \mid A) \neq P(A \mid B)\)`: for instance, in India the probability of getting bitten by a snake after a week of torrential rain is `\(P(B \mid A)=0.4\)`; but this does not mean that there is a 0.4 probability of a week of torrential rain after someone is bitten by a snake, `\(P(A \mid B)\)`.

--

Finally, information about `\(A\)` does not always change our understanding of `\(B\)`: in that case the two events are **independent** and `\(P(B \mid A) = P(B)\)`

<span style="display:block; margin-top: 20px ;"></span>
--

.red[See recording 2 for an additional recap on Probability]

---

# Some rules of conditional probabilities

- Provable from the probability axioms:

$$ P(A|B) =\frac{P(A \cap B)}{P(B)} = \frac{ P(B|A) P(A) } {P(B)}$$

<span style="display:block; margin-top: -20px ;"></span>
<center><img src=./img/venn_diagram.png width='50%' title=''></center>
<span style="display:block; margin-top: -20px ;"></span>

- If `\(A_i\)` is a set of mutually exclusive and exhaustive events (*i.e.* `\(A_i\cap A_j=\emptyset\)` for `\(i \neq j\)`, `\(P( \bigcup\limits_i A_i ) = \sum\limits_i P(A_i) = 1\)`), then

$$ P(A_i|B) = \frac{ P(B|A_i) P(A_i) } {P(B)} = \frac{ P(B|A_i) P(A_i) } {\sum\limits_j P(B|A_j) P(A_j) }$$
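
This last identity is exactly the computation we will carry out for the COVID-19 example. As a preview, here is a minimal R sketch of the formula (the function name `posterior_prob` is illustrative):

```r
# P(A_i | B) for mutually exclusive, exhaustive events A_i,
# given prior probabilities P(A_i) and likelihoods P(B | A_i)
posterior_prob <- function(prior, lik) {
  joint <- prior * lik   # P(B | A_i) P(A_i), the joint probabilities
  joint / sum(joint)     # divide by P(B) = sum_j P(B | A_j) P(A_j)
}
```

We will apply this function to the COVID-19 example later in the lecture.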
---

# Likelihood

Let's re-examine the COVID-19 example: we know that if someone has the disease their probability of testing positive is much higher than if they do not (0.8 vs 0.01), so we intuitively expect that the probability of having the disease given a positive test must be high.

We are moving unconsciously from conditional probability to likelihood.

--

When `\(A\)` is known, the conditional probability function `\(P(\cdot \mid A)\)` allows us to compute the probabilities of an unknown event `\(B\)` or `\(B^C\)`:

`$$P(B\mid A) \text{ compared to } P(B^C \mid A)$$`

When `\(B\)` is known (i.e. observed), the likelihood function `\(L(\cdot \mid B)\)` allows us to evaluate the relative compatibility of the data `\(B\)` with the event `\(A\)` or `\(A^C\)`:

`$$L(A \mid B) = P(B \mid A) \text{ compared to } L(A^C \mid B) = P(B \mid A^C)$$`

--

So far we have (i) the prior evidence of getting COVID-19 and (ii) the likelihood, which tells us that a positive test is more likely among diseased people:

<table class="table" style="margin-left: auto; margin-right: auto;">
<caption>Prior and likelihood</caption>
<thead>
<tr>
<th style="text-align:left;"> Event </th>
<th style="text-align:right;"> A </th>
<th style="text-align:right;"> A^C </th>
<th style="text-align:right;"> Total </th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:left;"> Prior </td>
<td style="text-align:right;"> 0.06 </td>
<td style="text-align:right;"> 0.94 </td>
<td style="text-align:right;"> 1.00 </td>
</tr>
<tr>
<td style="text-align:left;"> Likelihood </td>
<td style="text-align:right;"> 0.80 </td>
<td style="text-align:right;"> 0.01 </td>
<td style="text-align:right;"> 0.81 </td>
</tr>
</tbody>
</table>

---
name: Norm

<span style="display:block; margin-top: 250px ;"></span>
.myblue[.center[.huge[ **Normalising constant**]]]

---

# Normalising constant

The marginal probability of testing positive, `\(P(B)\)`, provides an important point of comparison. This is the last bit of information we need. Let's try to fill in the table below:

<center><img src=./img/Table_Prob.png width='50%' title=''></center>

First let's look at row A: there are those who test positive AND have the disease, and those who do not test positive AND have the disease. To get these probabilities remember that `\(P(A)=0.06\)` and that `\(P(B\mid A)=0.8\)`, so that

1. `\(P(A \cap B) = P(B\mid A) \times P(A) = 0.8 \times 0.06 = 0.048\)`

Then using a similar rationale we can get:

2. `\(P(A \cap B^C) = P(B^C\mid A) \times P(A) = (1-0.8)\times 0.06 = 0.012\)`
3. `\(P(A^C\cap B) = P(B\mid A^C) \times P(A^C) = 0.01\times 0.94 = 0.0094\)`
4. `\(P(A^C \cap B^C) = P(B^C\mid A^C) \times P(A^C) = 0.99 \times 0.94 = 1 - 0.048 - 0.012 - 0.0094 = 0.9306\)`

And finally `\(P(B) = 0.048 + 0.0094 = 0.0574\)`

---
count: false

# Normalising constant

The marginal probability of testing positive, `\(P(B)\)`, provides an important point of comparison. This is the last bit of information we need. Let's try to fill in the table below:

<center><img src=./img/Table_Bay2.png width='50%' title=''></center>

First let's look at row A: there are those who test positive AND have the disease, and those who do not test positive AND have the disease. To get these probabilities remember that `\(P(A)=0.06\)` and that `\(P(B\mid A)=0.8\)`, so that

1. `\(P(A \cap B) = P(B\mid A) \times P(A) = 0.8 \times 0.06 = 0.048\)`

Then using a similar rationale we can get:

2. `\(P(A \cap B^C) = P(B^C\mid A) \times P(A) = (1-0.8)\times 0.06 = 0.012\)`
3. `\(P(A^C\cap B) = P(B\mid A^C) \times P(A^C) = 0.01\times 0.94 = 0.0094\)`
4. `\(P(A^C \cap B^C) = P(B^C\mid A^C) \times P(A^C) = 0.99 \times 0.94 = 1 - 0.048 - 0.012 - 0.0094 = 0.9306\)`

And finally `\(P(B) = 0.048 + 0.0094 = 0.0574\)`

---
name: BayTheo

<span style="display:block; margin-top: 250px ;"></span>
.myblue[.center[.huge[ **Bayes' Theorem**]]]

---

# Now we put it all together...

We are now ready to answer the question:

<span style="display:block; margin-top: 10px ;"></span>
<center>
.content-box-green[What is the chance that a patient testing positive actually does have COVID-19?]
</center>

<span style="display:block; margin-top: 10px ;"></span>
Going back to the table, we can zoom in on the people testing positive

<center><img src=./img/Table_Bay3.png width='40%' title=''></center>

and using the conditional probability rules we get

`$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{0.048}{0.0574} = 0.84$$`

<center>.red[**This is building Bayes' theorem from scratch.**]</center>

Now remember that

`$$P(A \cap B) = P(B \mid A) P(A)$$`

then Bayes' theorem calculates `\(P(A \mid B)\)` by combining information from the prior `\(P(A)\)` and the likelihood of the event `\(A\)` given the observed data `\(B\)`, given by `\(P(B \mid A)\)`
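
Putting the numbers into R reproduces this from-scratch construction (a minimal sketch; the object names are illustrative):

```r
p_A    <- 0.06       # prior: P(A), COVID-19 prevalence
p_B_A  <- 0.80       # likelihood: P(B | A), sensitivity
p_B_Ac <- 1 - 0.99   # P(B | A^C) = 1 - specificity

p_B   <- p_B_A * p_A + p_B_Ac * (1 - p_A)   # normalising constant P(B) = 0.0574
p_A_B <- p_B_A * p_A / p_B                  # posterior P(A | B)
round(p_A_B, 2)                             # 0.84
```

The same number comes out of the generic `posterior_prob()` sketch shown earlier: `posterior_prob(c(0.06, 0.94), c(0.8, 0.01))[1]`.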
---

# Does it really work?

<span style="display:block; margin-top: -10px ;"></span>
<center><img src=./img/example_as_bayes_no_values.png width='75%' title=''></center>

--

<span style="display:block; margin-top: -20px ;"></span>
Using Bayes' theorem we get

`$$P(A|B) = \frac{ P(B|A) P(A) } {P(B|A) P(A) + P(B|A^C) P(A^C)}$$`

---
count: false

# Does it really work?

<span style="display:block; margin-top: -10px ;"></span>
<center><img src=./img/example_as_bayes_values.png width='75%' title=''></center>

--

<span style="display:block; margin-top: -20px ;"></span>
Using Bayes' theorem we get

`$$P(A|B) = \frac{ P(B|A) P(A) } {P(B|A) P(A) + P(B|A^C) P(A^C)}=\frac{0.8 \times 0.06 } {0.8 \times 0.06 + 0.01 \times 0.94} = 0.84$$`

---

# Comments

.pull-left[

- The disease prevalence can be thought of as a *prior* probability ( `\(p\)` = 0.06)

<span style="display:block; margin-top: 20px ;"></span>

- Observing a positive result causes us to update this probability to `\(p\)` = 0.84. This is our *posterior* probability that the patient is COVID-19 positive.

]

.pull-right[
<center><img src=./img/Bayes1.png width='150%' title=''></center>
]

--

- Bayes' theorem applied to *observables* (as in diagnostic testing) is uncontroversial and well established

- It is more controversial in general statistical analyses: *parameters* are unknown quantities, and prior distributions need to be specified `\(\rightarrow\)` .red[Bayesian inference]

- Stay tuned, we are going to dive into that next week!

---

# References

Blangiardo, M. and M. Cameletti (2015). _Spatial and Spatio-temporal Bayesian Models with R-INLA_. John Wiley & Sons.

Johnson, A. A., M. Q. Ott, and M. Dogucu (2022). _Bayes Rules!: An Introduction to Applied Bayesian Modeling_. CRC Press.