Hello and welcome to week five of this MOOC, where we're going to explore hypothesis testing. Now, think back to the title of this MOOC, Probability and Statistics: To p, or Not to p? In the introductory video to this course I mentioned that it is during the hypothesis testing section where all will be revealed about what to p or not to p is all about. So we will be getting there ever so shortly, but before we do I'd just like to set the tone for what hypothesis testing is all about by taking a simple judicial, i.e. legal, analogy. So think of hypothesis testing as simple decision theory. Namely, we are going to be choosing between two competing statements, and it's going to be a binary choice such that, based on some data, based on some evidence, we are going to conclude in favor of either one statement, or hypothesis, or the other. So I mentioned this legal or judicial analogy and this is where we will begin, because I'm sure many, if not all, of you are familiar with the concept of a courtroom and a jury. So let's imagine the following scenario. Let's suppose that I've been a naughty boy. So naughty, in fact, that the police have arrested me on suspicion of committing some crime. Now, legal disclaimer here, I've never been arrested for anything yet. Of course the future is uncertain and we never know what the future holds, but I'm not planning on committing any crimes, certainly not anytime soon. But let's imagine the police have arrested me on suspicion, let's say, of murder. So, I'd like you to imagine now that the police have done their investigation and now we are in the courtroom setting. So I am the defendant in this trial for murder and you are a member of the jury, because basically a jury is conducting a hypothesis test. They are choosing between two competing statements. They're trying to determine whether the defendant is guilty or not guilty of this criminal offense. So I'd like to relay the statistical form of testing and apply it to this legal analogy.
Once we've done that, we'll then be in a position to consider more statistical versions of hypothesis testing. Now, I appreciate that legal systems no doubt vary a little bit around the world. I'm UK-based and hence UK-centric. So let's consider the traditional Anglo-Saxon-based legal system where there is a presumption of innocence until someone is proven guilty. So, in our statistical world of hypothesis testing we have two competing statements known as hypotheses, a so-called null hypothesis H zero and an alternative hypothesis H one. So, the jury would set the following hypotheses. H zero would be that the defendant is not guilty of the alleged crime, and the alternative hypothesis H one is that the defendant is guilty of said crime. So, with the presumption of innocence, the jury have to assume that the defendant is innocent, i.e. not guilty of the crime, until the evidence becomes sufficiently overwhelming that being not guilty is an unlikely scenario, and hence the jury would return a verdict of guilty. So, what's happened at this stage? The police have arrested me. They've undertaken their investigation and they've collected various forms of evidence. This evidence is then presented to the courtroom and hence to you, the ladies and gentlemen of the jury, and based on this evidence you have to weigh up whether you feel I am guilty or not guilty of the crime. So, I'd like you to imagine two different scenarios and let's consider what you as the jury would likely conclude in these two situations. So remember, to begin with you have to assume I am not guilty, that is, the defendant is innocent until proven guilty. Now, in Anglo-Saxon legal systems, what is the burden of proof which the jury requires? Well, you have to find someone guilty beyond a reasonable doubt. Now of course, in that legal setting, what counts as beyond a reasonable doubt is highly subjective, and different people would place different weights on the evidence.
This is often why a jury is not necessarily going to reach a unanimous verdict. The judge may well accept a majority verdict, whereby most jurors conclude in favor of the defendant being either guilty or not guilty. So, we take that as our starting point. So let's consider scenario one. On day one of the trial, you see me for the first time and you will assume I am not guilty. Well, let's be honest, you are human beings and hence you are inevitably biased individuals. If any of you have ever served on a jury, no doubt the instant you see the face of the defendant you inevitably jump to a conclusion and you just feel whether that person is guilty or not guilty. It's human nature. We are all biased individuals. But nonetheless, let's suppose you are being extremely objective, you're clearing your mind of any sort of preconceptions and existing biases, and you are going to be influenced solely by the evidence as it's presented to you in the court. So let's imagine scenario number one. Let's consider the case for the prosecution. So, the police think I committed the crime, they've collected various evidence, and the prosecution is now putting forward the case for why they think I am guilty. So let's suppose in this scenario the police have collected a lot of forensic evidence. Suppose my fingerprints have been found on the murder weapon. Let's say some of my blood or DNA, or traces of it, was found on the victim's body. Let's imagine even that there was some CCTV, that is, some camera evidence, which showed someone looking very much like me murdering the victim. In my defense, I will say, oh, it's a conspiracy. I've been set up. I've been framed. I was somewhere else at the time of the crime. However, I was alone and no one could corroborate my story. So armed with that set of evidence, as a jury, what would you decide? Well, I'm going to make an assumption here, but I reckon you would all return a verdict of guilty. Why?
Well, taking that initial position that I am not guilty, you look at the evidence presented to you and you ask whether the evidence is consistent with me, the defendant, being not guilty. So you'd have to ask yourselves: if I had not committed the crime, why would my fingerprints be on the murder weapon, why would my blood or DNA traces be found on the victim's body, and why would there be camera evidence which showed someone looking like me murdering the victim, when I've got no legitimate defense other than that I was on my own somewhere at a different place at that point in time? So taking that body of evidence collectively, you would think quite rationally, and it would be entirely reasonable for you to return a verdict of guilty, because you view the evidence as not being consistent with me being not guilty. So, I would fully expect you to return a verdict of guilty. Effectively you are rejecting this null hypothesis of being not guilty and returning the verdict of guilty. Scenario number two. Let's suppose the murder weapon was never found. Let's suppose there was no forensic evidence, no blood, no DNA, no hair samples or anything connecting the victim's body with me. Let's suppose I have a valid alibi: I was with other people at the time of the crime and they corroborated my story. And let's suppose the prosecution's only real case against me was that they have a witness who claims they saw me murder the victim. But let's imagine that that witness was giving inconsistent testimony to the court, in that they were changing their story each time, there were some inconsistencies, and hence they weren't really a credible or reliable witness. So based on that information, from my perspective at least, I would hope you would return a verdict of not guilty, because collectively the evidence is consistent with me being not guilty. There is no murder weapon and hence you cannot associate me with the murder weapon, and there's no forensic evidence.
You are discounting the testimony of this one unreliable witness and you believe my claim that I was with other people at the time of the alleged offense, and hence you would return a verdict of not guilty. Now note, in neither case are you actually proving I am guilty or not guilty. In the first situation, where there is this wealth of forensic evidence, I'm sure you are very confident that you have reached the correct verdict, but of course it isn't certain. It is possible that my defense is true, that it was a conspiracy and I was set up by the government, the police, whoever. Of course that is a possible explanation. However, you would attach a very small probability to that actually being the true explanation. So small, in fact, that this is such an improbable, such an unlikely event that you don't believe it and you return the verdict of guilty. Similarly, in the second scenario maybe I did murder the victim but I was clever. I hid or destroyed the murder weapon, perhaps I wore gloves to make sure there was no forensic evidence, there were not many witnesses to say they saw me murder the victim, and I'd managed to pay off some other people to claim I was with them at the time of the crime. So, of course that is another possible situation, but based on the evidence provided to you, you could not legitimately say I am guilty beyond a reasonable doubt, and hence you conclude that I am not guilty. Now, in an ideal world, juries the world over would always get their decisions correct: the bad people, the naughty people, are found guilty and go to prison, and those who are truly innocent are acquitted and found not guilty.
Of course, though, it's not a perfect, ideal world. Sometimes people, in this case juries, will make mistakes, and there have been many miscarriages of justice: innocent people have been found guilty and gone to prison, and people who were guilty of the offense have been acquitted because there wasn't sufficient evidence against them. So, we recognize juries sometimes get it wrong, but we very much hope that they tend to get it right far more often than they get it wrong. So, as we start to move over into our world of statistical testing, I'd like you to keep that legal analogy very much in mind, because even when we do statistical testing there is no guarantee, and there are no certainties, that we're going to get the decisions right. But we do hope that we get things right far more often than we get them wrong.
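To make the courtroom analogy a little more concrete, here is a minimal Python sketch of the jury's decision rule and the four possible outcomes. It is only an illustration under stated assumptions: the function names, the idea of attaching a single number to "how probable is this evidence if the defendant is not guilty", and the 0.05 cut-off standing in for "beyond a reasonable doubt" are all hypothetical choices that do not come from the lecture.

```python
from enum import Enum


class Verdict(Enum):
    GUILTY = "reject H0"             # H0: the defendant is not guilty
    NOT_GUILTY = "do not reject H0"


def jury_decision(prob_evidence_if_not_guilty: float,
                  reasonable_doubt: float = 0.05) -> Verdict:
    """Start from H0 (not guilty) and reject it only if the evidence would
    be very improbable were H0 true. The 0.05 default is an assumed,
    illustrative threshold, not a legal or statistical standard."""
    if prob_evidence_if_not_guilty < reasonable_doubt:
        return Verdict.GUILTY
    return Verdict.NOT_GUILTY


def outcome(truly_guilty: bool, verdict: Verdict) -> str:
    """Compare the verdict with the (normally unknown) truth."""
    if verdict is Verdict.GUILTY:
        return "correct conviction" if truly_guilty else "innocent person convicted"
    return "guilty person acquitted" if truly_guilty else "correct acquittal"


# Scenario one: fingerprints, DNA and CCTV would be very unlikely if not guilty.
scenario_one = jury_decision(0.001)   # assumed illustrative probability
# Scenario two: one unreliable witness is quite plausible even if not guilty.
scenario_two = jury_decision(0.40)    # assumed illustrative probability

print(scenario_one.name, scenario_two.name)
print(outcome(truly_guilty=False, verdict=scenario_one))
```

Note that a verdict of not guilty here means only that H0 was not rejected. As in the second scenario, it does not prove innocence, and either verdict can still turn out to be a mistake.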