Hi, inverse probability of treatment weighting is a method for estimating causal effects. In this video, our objective will be to gain an intuitive understanding of inverse probability of treatment weighting by relating it to matching. As a motivating example, we'll focus on the situation where there's just a single binary confounder that we'll call X. So, there's just a single variable that we need to control for.
Now, let's imagine that among people who have this confounder value equal to 1, so X = 1, the probability of treatment is equal to 0.1. So among people with X = 1, only 10% of them would receive treatment. What this means is that the value of the propensity score for people with X = 1 is equal to 0.1. So everybody who has X = 1 has a propensity score of 0.1.
Now let's look at the other group of people, those individuals who have X = 0. Here the probability that A = 1, which is the probability of treatment, given that X = 0, is equal to 0.8. So in other words, for the subpopulation of people who have X = 0, the propensity score is 0.8, which means that 80% of people who have X = 0 will receive the treatment. So people with X = 1 are unlikely to receive the treatment, whereas people with X = 0 are very likely to receive it.
We could depict this in a picture where we separate the treated and controls with a vertical line, and we use blue and red to indicate the values of X. So, blue is for X = 1 and red is for X = 0. This is just a hypothetical situation, but it corresponds to what we saw with the propensity score: for the people with X = 1, the large majority of them, in fact 90% of them, are in the control group. And for X = 0, we see that 4 out of 5 of them are in the treatment group, which again corresponds to our assumption about the propensity score.
So, let's just focus on one of these groups for now. This is the X = 1 group, so this is a subpopulation.
And in that subpopulation, for every one treated subject, you would expect to have nine control subjects. Or in other words, one out of every ten people with X = 1 is treated. That's what we'd expect on average: out of 10 total people with X = 1, we'd expect one of them to be treated.
Now, let's imagine we're going to do propensity score matching. You see we have this imbalance here, in the sense that people with X = 1 are way more likely to be in the control group. Whereas if you had a randomized trial and you were strictly randomizing people to treated and control groups with equal probability, you would expect an equal number of treated and control subjects in the subpopulation.
So, for propensity score matching, remember that because there's only one covariate here, everybody with X equal to 1 has the same value of the propensity score. So all ten of these blue dots have the same propensity score. If we were going to do propensity score matching, we would match the one treated subject to one randomly selected control subject. The data that we would end up using is right here. We'd just use this matched pair.
What we see then is that there's one person in the treatment group and one person in the control group. There was originally one person in the treated group, so this treated person just represents one person. However, the one control that we've matched is standing in for all nine of the original controls. So this one person in the treatment group is essentially counting the same as nine people from the control group.
So that's our motivation: the way we recreate balance with matching is by doing this one-to-one kind of matching.
And then in this example, a treated person ends up representing nine control subjects. But if you actually match, you discard some of your data. On the previous slide, we saw that we would end up discarding eight controls. Rather than do that, what we could do is down-weight some of these individuals and up-weight others. So, this particular treated person should end up having nine times as much weight as any one individual from the control group. We saw that on the previous slide: when we did the one-to-one matching, the treated person counted the same as nine people in the control group. So what you can do is just weight these observations to make that happen. That's what inverse probability of treatment weighting is going to do.
In inverse probability of treatment weighting, we weight based on the treatment actually received. For treated subjects, we weight by the inverse of the probability of treatment, so that's the inverse of the propensity score. But for control subjects, we weight by the inverse of the probability of not getting the treatment. So in other words, you are always weighting by the inverse of the probability of whatever it is they actually received. Treated subjects received the treatment, so we weight by the inverse of that probability; control subjects received the control, so we weight by the inverse of that probability. This is what is known as inverse probability of treatment weighting, or IPTW.
So we could go back to this example where we have one treated subject and nine control subjects. Now we can weight by the inverse of the probability of their particular treatment. In the treated group, we want to weight by the inverse of the propensity score, and recall from an earlier slide that the propensity score for subjects with X = 1 was equal to 0.1.
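Written as a single formula, the weighting rule just described might look like the following. The symbols here are notational assumptions rather than anything from the slides: e(x) stands for the propensity score, A_i and X_i for subject i's treatment and confounder value, and w_i for that subject's weight.

```latex
e(x) = P(A = 1 \mid X = x), \qquad
w_i = \frac{A_i}{e(X_i)} + \frac{1 - A_i}{1 - e(X_i)}
```

Only one of the two terms is nonzero for any given subject: the first for treated subjects (A_i = 1), the second for controls (A_i = 0).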
So here, we end up taking 1 and dividing by 0.1, which is the propensity score. Or in other words, they get a weight of 10. So, this one treated subject will have a weight of 10. Whereas for the control subjects, we weight by 1 over the probability of getting the control treatment. The difference here is we're looking for the probability that A = 0. Well, the probability that A = 0 is just 1 minus the propensity score. So this 0.9 comes from 1 - 0.1, because the probabilities have to add up to 1. If you have a 10% chance of getting the treatment, you have a 90% chance of getting the control. So then our weight is 1 over 0.9, which ends up being ten-ninths. That's how much weight each person in the control arm would get.
So in other words, one person in the treated group counts the same as 9 people from the control group. In case that's not clear, just imagine applying this ten-ninths weight to each person in the control group and then adding those up. There are 9 of them, so we'd have ten-ninths times 9, which equals 10, the same as what we see in the treated group. So by weighting in this way, we end up counting the collection of treated subjects the same as the collection of control subjects. Among people who have the same value of the propensity score, the treated and the controls end up collectively counting the same. Even though there might be more control subjects, the amount each group contributes in the data analysis will be equal.
Now, there was this other group of people, which had X = 0, so let's focus on them for a minute. Remember, for that group the propensity score was 0.8. So if you have X = 0, you have an 80% chance of getting treated. That's why we see four out of five individuals in the treated group in this case. Now, imagine we want to do propensity score matching. There's only one person in the control group.
We would find a match for them in the treated group. In this case, they all have the same value of the propensity score, so we would just randomly select anybody from the treated group. And now we have done this one-to-one matching. So what propensity score matching does here is make one person in the control group count the same as four people from the treatment group.
So now if we want to think about weights, again, we would just weight by 1 over the probability of their specific treatment. On the treatment side, we would weight by 1 over the propensity score, so 1 over the probability of treatment given X = 0. Well, the propensity score in this case was 0.8, so we have 1 over 0.8, for a weight of five-fourths. In the control group, these are people who did not get the treatment, so we would weight by 1 over the probability of no treatment. Well, the probability of no treatment is equal to 0.2, which is just 1 minus 0.8. So, we take 1 over 0.2, for a weight of 5.
So in this case, weighting is accomplishing the same thing as matching, in the sense that one person in the control group is counting the same as four people in the treatment group. Again, think of each person in the treatment group as having weight five-fourths. If you added up all four of those weights, you would get 5, which is the same as in the control group.
So whether we do one-to-one propensity score matching or inverse probability of treatment weighting, we end up accomplishing basically the same thing: for a given value of X, or for a given value of the propensity score, we end up counting the collection of treated subjects in the same way as the collection of control subjects, as if it were a randomized trial.
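To check the arithmetic from both groups in one place, here is a minimal Python sketch. It is only an illustration under stated assumptions: the names `PROPENSITY`, `iptw_weight`, `weighted_totals`, and `subjects` are hypothetical, and the fifteen subjects are simply the dots from the two pictures (one treated and nine controls with X = 1, four treated and one control with X = 0). The propensity scores are taken as known rather than estimated from data.

```python
from fractions import Fraction

# Propensity scores from the lecture: P(A = 1 | X = x), assumed known.
PROPENSITY = {1: Fraction(1, 10), 0: Fraction(8, 10)}


def iptw_weight(a, x):
    """Return 1 / P(treatment actually received | X = x)."""
    e = PROPENSITY[x]
    return 1 / e if a == 1 else 1 / (1 - e)


# Hypothetical subjects matching the pictures, as (treatment A, confounder X).
subjects = [(1, 1)] + [(0, 1)] * 9 + [(1, 0)] * 4 + [(0, 0)]


def weighted_totals(subjects):
    """Add up the weights within each (X, A) cell."""
    totals = {}
    for a, x in subjects:
        totals[(x, a)] = totals.get((x, a), 0) + iptw_weight(a, x)
    return totals


totals = weighted_totals(subjects)
for (x, a), total in sorted(totals.items()):
    print(f"X={x}, A={a}: total weight = {total}")
```

Within each value of X, the treated and control totals come out equal: 10 and 10 for X = 1, and 5 and 5 for X = 0. Exact fractions are used so that ten-ninths times 9 is exactly 10 rather than a rounded decimal.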