Let's talk about another type of data task you might encounter. Say you have a continuous outcome of interest, and you want to compare it across two or more groups to see if there's a discernible difference. You've observed that the group means differ, but are those differences meaningful? This comes up in a variety of circumstances. You might compare outcomes across four groups: how do household savings differ across four groups of families? Or you might compare an outcome before and after an intervention or treatment: test scores after one of three different interventions. In these cases, you might think of using the t-test, but because we have more than two groups, a single t-test won't do. Differences come from a variety of sources when we observe things in the world. When we look at the distribution of outcomes, we can think of variation as coming both from explanatory or group-level differences and from idiosyncratic or individual-level differences; people are just different from each other. One way of framing our task is: are the group-level differences bigger or more important than the individual-level differences? If the variation in our data comes more from differences between groups than from individual differences, we might think those groups differ in some critical way. You might visualize this with kernel density estimates. Here, we have a distribution of outcomes across different groups. In one case, we might say the groups are quite different: there's similarity within groups and difference across groups. Alternatively, you may be in a situation where there are differences both within and across groups, but the groups themselves are not terribly different.
There is, in fact, more difference within the groups than there is across the groups. We can use these ideas about sources of variance to think about differences between groups. At the core of ANOVA methods is the analysis of variance: comparing the sources of variance in some outcome. For simple one-way ANOVA, the ratio we compare is the variance between groups to the variance within groups. Formally, this creates an F statistic: the mean squared differences between groups divided by the mean squared errors within groups. The F statistic has to be large enough given the size of our sample and the number of groups we have. If it is, we may reject our baseline assumption, the null hypothesis that all the groups are in fact the same, that there is no difference between the groups. What's important to know is that the F distribution changes both with the number of observations and the number of groups. It has two degrees of freedom, one in the numerator and one in the denominator. To give you an idea of this, let's look at some different F distributions. On the left-hand side, we only change the number of degrees of freedom in the numerator. If you notice, the shape of the curve changes, and for small numerator degrees of freedom the density increases toward infinity as we get close to zero. On the right-hand side, we only change the number of degrees of freedom in the denominator. Here, the shape of the curve changes to some extent as well. If we increase both the numerator and the denominator degrees of freedom together, you'll notice that the shape and also the central location of the distribution change. The shape of the F distribution depends on the degrees of freedom in the numerator, the degrees of freedom in the denominator, and the ratio of the two. It's a little difficult to picture how the F distribution is shaped for any given combination of degrees of freedom. But luckily, we don't have to think about it too hard; we can compute it.
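As a quick sketch of that last point, R can compute the F distribution directly: `df()` gives its density and `qf()` its quantiles, so we never have to picture the shape ourselves. The degrees-of-freedom values below are illustrative, not from the video:

```r
# Evaluate the F density on a grid of points to see how the
# numerator degrees of freedom change the shape near zero.
x <- seq(0.01, 5, by = 0.01)

d_narrow <- df(x, df1 = 1,  df2 = 20)   # density diverges as x -> 0
d_humped <- df(x, df1 = 5,  df2 = 20)   # humped, right-skewed shape
d_wider  <- df(x, df1 = 10, df2 = 20)

# The critical value for a 5% test comes from qf(); for example,
# with 4 groups and 40 observations (df1 = 3, df2 = 36):
qf(0.95, df1 = 3, df2 = 36)
```

An observed F statistic larger than that `qf()` cutoff is what lets us reject the null hypothesis at the 5% level.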
Performing one-way ANOVA in R is fairly straightforward. We're going to use the command for all ANOVA methods in R, called aov(). aov() takes a formula: on the left-hand side is our variable of interest, our outcome variable, and on the right-hand side are our grouping factors. In this case, we're going to specify a single grouping factor. We also have to tell R the name of the data frame from which our observations come. For one-way ANOVA, you might specify outcome ~ group as your formula and data = data_df as your data argument. The general workflow for ANOVA methods is: you fit the model, you assign it a name, and then you look at a summary of the analysis. The code will look something like this. Let's jump over to our RStudio session. Here's our script for this video. First, let's load in some data. We'll load two different datasets that contain ANOVA information, and you can play with some of this later on. We'll also load data 2 from our last session. If you played around a little with the code in the video 2 data, you would have noticed that the last part contained an option to perform a t-test on data with three groups, and it didn't work. We'll bring that data back in if you want to compare it here. Analysis of variance is performed with the aov() command. Our outcome is TestScore, our grouping variable is Group, and the data come from ANOVA1_df. When we run the call on its own, we get a short printout: it tells us what the function call was and what the terms of the result are, but it's not terrifically helpful. What works better is if we take that same call and assign it a name, so we'll call this ANOVA1, and then look at a summary. Now we have a summary table which tells us more about the sources of variance in our data.
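Putting that workflow together, here is a minimal sketch. The column names TestScore and Group follow the video's example; the data frame itself is simulated here as a stand-in, since the video's files aren't included:

```r
# Simulated stand-in for the video's ANOVA1_df: a continuous outcome
# and a grouping factor with four levels (values are illustrative).
set.seed(42)
ANOVA1_df <- data.frame(
  Group     = factor(rep(c("G1", "G2", "G3", "G4"), each = 25)),
  TestScore = rnorm(100, mean = 70, sd = 10)
)

# Fit the one-way ANOVA and assign it a name...
ANOVA1 <- aov(TestScore ~ Group, data = ANOVA1_df)

# ...then summarize: this prints the ANOVA table with columns
# Df, Sum Sq, Mean Sq, F value, and Pr(>F).
summary(ANOVA1)
```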
The model call alone isn't as helpful for us in this case: what we get is a summary of the model fit, but what we want to know specifically is where the sources of variance are. If we look at the sources of variance in our results, we'll see, in this case, Group. Group identifies what portion of the overall variance in outcomes comes from group-level differences. We'll see an F value, which is the F statistic for the ratio of variances, and we'll see a p-value, which in this case is written Pr(>F). That's our p-value once again. In general, if Pr(>F) is less than 0.05, we can reject our null hypothesis that there's no difference among the groups. Your results might look something like this. Let's jump back over to our results in RStudio and take a look. In this case, for our group-level results, we have a probability of about 0.217. This is above the 0.05 threshold, so we would not reject the null hypothesis that there's no difference between the groups. But let's look at a different data frame together, and we'll write the code together. Let's try the same test, but with ANOVA2_df. If we look here, we again have Group and TestScore. We'll call this model ANOVA2: the name, then aov(TestScore ~ Group), with ANOVA2_df as the data source from which we're pulling all of this. Now we can summarize that model with summary(ANOVA2). What we see is that, in this case, Group has a p-value of about 0.91, so it's above that 0.05 level. If you want to revisit the dataset data 2 from video 2's code, you can try it out here: think about where you would put each of the variables in that dataset, and then look at the summary. The one thing to know is that ANOVA doesn't tell us which group is different. All ANOVA methods tell us is that a difference exists somewhere; they can't tell us which group is different. To find that out, we perform post-hoc analysis.
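If you want that p-value programmatically rather than reading it off the printed table, it can be pulled out of the summary object. This is a sketch: summary() on an aov fit returns a list whose first element is the ANOVA table, and "Pr(>F)" is its p-value column. The data here are simulated as a stand-in for the video's files:

```r
# Simulated stand-in data (illustrative values).
set.seed(42)
ANOVA1_df <- data.frame(
  Group     = factor(rep(c("G1", "G2", "G3", "G4"), each = 25)),
  TestScore = rnorm(100, mean = 70, sd = 10)
)

ANOVA1 <- aov(TestScore ~ Group, data = ANOVA1_df)

# Extract the p-value for the Group term from the ANOVA table.
p_value <- summary(ANOVA1)[[1]][["Pr(>F)"]][1]

# The usual decision rule at the 5% level:
if (p_value < 0.05) {
  message("Reject the null: at least one group differs.")
} else {
  message("Fail to reject: no discernible group difference.")
}
```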
One such method is known as Tukey's Honestly Significant Difference test. You could simply perform lots of pairwise t-tests between the groups: once ANOVA says some group is different, you might test each pair of groups for a difference. The problem is that if you do lots of pairwise t-tests, you run into the problem of multiple comparisons: you might accidentally reject the null hypothesis when in fact it is true. We can still perform those comparisons, but adjust our findings to correct for the multiple tests, and this is where Tukey's Honestly Significant Difference test comes in. It's like a t-test, but with an adjustment to our p-values to accommodate the fact that we're comparing lots of different pairs of groups. In R, we implement it with TukeyHSD(). It takes an ANOVA model, an aov object, as input, so we have to run it after we fit an ANOVA model. Let's look at how we do this in R. We'll first fit our analysis of variance model, and then we'll pass it to TukeyHSD(). There are additional options: if you want to make all of the differences between groups non-negative, you can set ordered = TRUE, and you can change the confidence interval around the differences in means with conf.level, like we've seen before. Let's try it together. Here we have our ANOVA1 model, the one we fit first. If we pass it to TukeyHSD(), we see the differences between the groups: Group 1 versus Group 4, Group 2 versus Group 4, Group 3 versus Group 4, and so on. They're all positive, but each confidence interval contains 0, so we would not be confident rejecting the null hypothesis of no difference. This is consistent with our ANOVA results: none of these p-values are less than 0.05.
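The Tukey step looks like this in code. Again, the data frame is simulated here as a stand-in for the video's ANOVA1_df:

```r
# Simulated stand-in data (illustrative values).
set.seed(42)
ANOVA1_df <- data.frame(
  Group     = factor(rep(c("G1", "G2", "G3", "G4"), each = 25)),
  TestScore = rnorm(100, mean = 70, sd = 10)
)

# TukeyHSD() requires a fitted aov object as its input.
ANOVA1 <- aov(TestScore ~ Group, data = ANOVA1_df)

# Pairwise comparisons with p-values adjusted for multiple testing;
# ordered = TRUE makes all reported differences non-negative, and
# conf.level sets the width of the confidence intervals.
TukeyHSD(ANOVA1, ordered = TRUE, conf.level = 0.95)
```

With four groups, the output lists all six pairwise comparisons, each with a difference in means, a confidence interval, and an adjusted p-value.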
If we jump over to our second model, ANOVA2, you'll see that even though the overall fit suggested there wasn't a significant difference, some of the groups appear more different than others. But in no case does the p-value for any pairwise comparison of groups get below that 0.05 level we would consider a significant difference. If any of these had come up as less than 0.05, we would say there might be a difference between those two groups on our outcome of interest. You can rerun these ANOVA models on Variable 2 and Variable 3 on your own time. When you interpret Tukey's Honestly Significant Differences, though, look at each of the columns: the first column shows the pair of factor levels being compared, the second column shows the difference in means, the third and fourth columns show the lower and upper ends of the confidence interval, and the final column shows the adjusted p-value on the difference in means, which is the thing we care about. In general, if that p-value is less than 0.05, you can reject the null hypothesis of no difference for that pair. A couple of closing points. ANOVA can involve more than one factor, more complex than one-way ANOVA. But all ANOVA itself can tell us is that a discernible difference exists, not how big it is or which groups differ. Many R modeling methods look like this, though: you specify a formula, you fit the model, you summarize the results, and then you do additional post-estimation. ANOVA is a nice gateway to more complex statistical modeling in R.
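That formula / fit / summarize / post-estimation pattern can be sketched end to end. Both examples in the video came out non-significant, so this recap uses simulated data (all names and values here are illustrative) where one group genuinely differs, so you can see what a significant result looks like at both stages:

```r
# Simulated data: three groups, with group B shifted well above A and C.
set.seed(7)
dat <- data.frame(
  outcome = c(rnorm(30, mean = 50, sd = 5),   # group A
              rnorm(30, mean = 60, sd = 5),   # group B (shifted up)
              rnorm(30, mean = 50, sd = 5)),  # group C
  group   = factor(rep(c("A", "B", "C"), each = 30))
)

fit <- aov(outcome ~ group, data = dat)  # 1. specify formula, fit the model
summary(fit)                             # 2. summarize: F value and Pr(>F)
TukeyHSD(fit)                            # 3. post-estimation: which pairs differ
```

Here the omnibus F test rejects the null, and the Tukey table then shows which pairwise comparisons (those involving group B) drive that result.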