Welcome to video two of our sequence on factors. In this video we'll talk about how to create factors. Now, you can use either the factor verb or the parse_factor verb. One of the differences you'll notice is that parse_factor sometimes returns better warnings.

Now these are the factors that I'd like to use for the climate data we talked about in the last video. Right here, I have input every month into a vector, so I have the month vector. So, I can take my climate data, and I can mutate it to have a column called month_fct that is a factor of the month, with the levels given by this vector. So you specify the levels, and then I've assigned that. Usually you see the assignment going the other way; in this case I just put it at the other end. Okay, so now if I look at DC climate, I can see what has happened using the verb factor, right? It puts an NA in the month where it didn't have a match in the levels, but it doesn't tell me exactly why that happened, right?

I'm going to show you now what happens if I do parse_factor instead of factor, right. So the only thing I've changed here is I'm mutating it into another column called month_fct2, for factor two, using parse_factor. You see now it gave me a warning, and it gave me a parsing failure and said that it expected a value in the level set, but what it got was AUX. So I can see that I have mistyped something. Okay, so now if I look at DC climate, what happens? I still get the NAs, so I get the same output. The only difference is, in the second case I got a warning message.

Now what happens if I don't tell it the levels? Well, if I don't tell it what levels, then factor is going to take the order of the levels to be the same order that I would get with a sort. So that's going to be alphabetical order. With parse_factor, the order is going to be the order in which the values first appear. So let's see how that works, let's try that.
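The comparison between factor and parse_factor described here can be sketched roughly as follows. The lecture's climate table and its exact month labels are not shown in the transcript, so the three-letter month abbreviations and the short `months_raw` vector are assumptions made only for illustration; `parse_factor` comes from the readr package.

```r
library(readr)  # provides parse_factor()

# Assumed labels: the transcript does not show the real month vector.
month_levels <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun",
                  "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

# Hypothetical stand-in for the month column; "AUX" is a typo for "Aug".
months_raw <- c("Jan", "Mar", "AUX", "Dec")

# With levels: factor() silently turns the unmatched value into NA.
with_factor <- factor(months_raw, levels = month_levels)

# With levels: parse_factor() gives the same values but warns about
# a parsing failure for "AUX".
with_parse <- parse_factor(months_raw, levels = month_levels)

# Without levels: factor() sorts the levels alphabetically.
levels_sorted <- levels(factor(months_raw))

# Without levels: parse_factor() keeps the order of first appearance.
levels_appearance <- levels(parse_factor(months_raw))

print(with_factor)
print(levels_sorted)
print(levels_appearance)
```

In both no-levels cases "AUX" is kept as a level, because without a level set there is nothing to check it against.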
So here you can see I just factored the months using factor, but without giving it the levels. So it's going to keep AUX as a level, but it's putting them in alphabetical order. Okay, if instead I make month_fct2 with parse_factor, let's see what happens with that one. You see now they're more in the order that I would want them. In this particular case it's in the order I gave it, but because I didn't specify the levels, it's not going to know that AUX is incorrect for August. Now I can override the default ordering by using fct_inorder or unique.

So let's look at another example; I'm going to give you some sizes. So here, I'm going to specify these as the sizes. Size levels is going to be the levels that I would like to have, right, and I'm going to have s as another group of sizes. So let's just see what happens if I do sort of sizes. Right, that is coming out in alphabetical order. Right, now I'm going to give you the size levels. And which one did we say takes the order that you'd get with sort? Factor, right? So alphabetical order. So, if I do factor of sizes, right, it's made them factors, but the levels are now: large is the first level, medium is the second level. And why do you care? Well, when you graph these things, remember, like we did before, we would like them in the order that makes sense to us rather than in alphabetical order. So instead, if I give it the levels that I want, then they come out in the correct order.

And I could do the same with factoring my s variable, okay. Remember now, s had one error in it, right? It had the N there instead of the M for medium. So it's going to give me an NA where that one was. I could also use parse_factor, and it's going to give me a warning. Now remember, this is only done with a tiny example, right?
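The sizes walk-through can be approximated like this. The actual vectors from the video are not in the transcript, so the contents of `sizes`, `size_levels`, and `s` (including the misspelling "nedium") are assumed values chosen to match the description.

```r
library(readr)  # provides parse_factor()

# Assumed example data, not the lecture's actual vectors.
sizes <- c("small", "medium", "large", "medium", "small")
size_levels <- c("small", "medium", "large")   # the order we want
s <- c("small", "nedium", "large")             # "nedium": N instead of M

# sort() returns the values in alphabetical order.
sorted_sizes <- sort(sizes)

# factor() without levels uses that same alphabetical order for its levels.
default_levels <- levels(factor(sizes))

# Supplying levels gives the order that makes sense for plotting.
wanted_levels <- levels(factor(sizes, levels = size_levels))

# The misspelled value has no matching level, so it becomes NA silently.
s_factor <- factor(s, levels = size_levels)

# parse_factor() produces the same NA but also raises a warning.
s_parsed <- parse_factor(s, levels = size_levels)

print(default_levels)
print(wanted_levels)
print(s_factor)
```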
When you're doing this in real life, you're going to have possibly millions of rows, so you'd like to know where those errors are. And then I can also call problems when I have these errors, and it will give me a little more detail on each of those.

So, I can override the default order with unique or fct_inorder. I'm going to show you both of those, right? When I factor the sizes with the levels set to the unique values of sizes, I get them in the order that I gave them in, okay. If I do fct_inorder, it takes the order in which I first provided them.

Okay, you can see the levels that you have by calling levels on a variable. So here was y1, and if I do levels of y1, it shows me what the levels of that factor variable are. So that is what we use the levels function for. Similarly I can use levels of, let's see, DC climate. Right, I can see what that one was. Yeah, that shows me the levels that I can use, okay. So I can do fct_unique and fct_count as well. I'll get to those in the next lecture because I've already gone over time in this one.
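As a rough recap of the calls mentioned in this part, the sketch below uses `problems()` from readr and `fct_inorder()` from forcats. The vectors are the same assumed example values as before, not the lecture's data, and `y2` is a name introduced only for this illustration.

```r
library(readr)    # parse_factor(), problems()
library(forcats)  # fct_inorder()

# Assumed example data, not the lecture's actual vectors.
sizes <- c("small", "medium", "large", "medium", "small")
size_levels <- c("small", "medium", "large")
s <- c("small", "nedium", "large")  # "nedium" is a deliberate typo

# parse_factor() warns; problems() then reports where parsing failed.
s_parsed <- parse_factor(s, levels = size_levels)
probs <- problems(s_parsed)
print(probs)

# Option 1: take the levels from unique(), which keeps first-appearance order.
y1 <- factor(sizes, levels = unique(sizes))

# Option 2: fct_inorder() reorders the levels by first appearance.
y2 <- fct_inorder(factor(sizes))

# levels() shows the levels of a factor variable.
print(levels(y1))
print(levels(y2))
```

With millions of rows, inspecting the table returned by `problems()` is usually more practical than scanning the column for NA values by eye.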