Regression analysis involves analyzing the relationship between independent variables and dependent variables. An independent variable is one that you have direct control over, and changing its value would normally induce changes in the dependent variable. Linear regression is one of the most basic types of regression analysis, where we're simply mapping a linear relationship between an independent variable and a dependent variable. If we look at the slide, you can see a number of data points that have been charted, with a line drawn through them that approximates the distribution of that data. We could say that line represents a linear model of the data. Obviously it's not perfect, but it can be used to predict other values.

Because we have control over the independent variable, we could, for instance, pick a particular value for it and check what value the model would assign to the dependent variable. This is how linear regression lets us estimate values. The slide here shows what we call a positive correlation: as the independent variable, which is usually on the x-axis, increases, the dependent variable on the y-axis also increases. We can also represent negative correlations, where the dependent variable decreases as the independent variable increases.

The foundation of all of this is really just the linear equation, which is a foundational mathematical concept. The formula is that the dependent variable, which is y, is going to be based on the independent variable, which is x.
This is the one we have control over, multiplied by the slope of the line. Then we add the y-intercept, which is the point at which the line crosses the y-axis, where of course the x value is zero. Using this linear equation creates a straight-line fit between the x and y points, so it is really only suited to modeling linear data.

Now, to look at an application of this, on this slide we have a linear equation with some example data. For this example, we're looking at an electronics manufacturer who wants to chart the correlation between the sales price of a television and the number of months since that television's release. The assumption is that the older a television gets, the less money they would be able to get for it on the market, because newer, improved TVs will have been released in the meantime.

So we take the data that's in the chart on the left and use the months since first release as our independent variable. The variable we want to predict is sales price, so that's going to be the dependent variable. We'll put the months across the x-axis and the sales price range on the y-axis. Because this is a declining relationship, this is a negative correlation. As each data point is plotted, we are basically creating a graph of the relationship between those points.

But if we want to use this to predict the value of a television at any point along the chart, then we need to apply the linear equation, or linear regression, to make that assessment. If we use our linear equation, you can see here that our dependent variable y, which is the sales price, is going to be equal to the value of x, which is the number of months, multiplied by the slope.
And we'll say that the slope of this line is about negative four. And we'll say this line happens to cross the y-axis, where the x value is zero, at a point that is 864.3. So we've got our slope, we've got our y-intercept, and this effectively becomes a model. All we do is plug in values of x and chart the results, and we end up with this red line, which is our model that can now be used for estimating values.

So here we perhaps have a condition where we didn't have a value at 60 months, but let's plug the value 60 into our linear equation. When we do this, the result is 615.66. That's because if we take 60 and go up until that intersects our regression line, and carry that over to the y-axis, we get to a point where the value on the y-axis is approximately $615.66.

Now, while a simple linear equation like that can be used, there are some weaknesses. First of all, if the data is really not linear, the line doesn't fit that well. As you can see here, maybe this fits the data, maybe it doesn't, but certainly if the data formed a curve or something like that, this would not be useful. Also, I think you'd agree that there are other factors that would influence the value of a television over time. There's more than the age of the television involved. There could be certain features the television has: the resolution of the screen, the refresh rate, the technology used for the screen, and so on. So it's not just one parameter that influences the value of a television, and the linear equation really can't account for that. It can only deal with a single feature being mapped to a label, or value, that we want to predict. So, of course, there are other mechanisms that we can use for those more complex regression tasks.
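Going back to the television example, here is a minimal Python sketch of the linear equation, y = slope * x + intercept, used as a model. The function name `predict_price` and the constant names are hypothetical. The numbers are assumptions read from the figures quoted in the talk: a y-intercept taken to be 864.3 and a slope of negative four. Because a slope of exactly -4 does not land exactly on the quoted $615.66 at 60 months, the sketch also backs out the slope that result would imply, on the assumption that "negative four" was a rounded figure.

```python
# Hypothetical names; all values are assumptions based on the talk's example.
SLOPE = -4.0        # price change per month ("negative four" as quoted)
INTERCEPT = 864.3   # assumed y-intercept: the price at month 0


def predict_price(months, slope=SLOPE, intercept=INTERCEPT):
    """Apply the linear equation: y = slope * x + intercept."""
    return slope * months + intercept


# Prediction at 60 months using the rounded slope.
price_rounded = predict_price(60)

# Slope implied by the quoted result of 615.66 at 60 months,
# given the assumed intercept.
implied_slope = (615.66 - INTERCEPT) / 60
price_implied = predict_price(60, slope=implied_slope)

print(f"slope {SLOPE}: {price_rounded:.2f}")
print(f"implied slope {implied_slope:.3f}: {price_implied:.2f}")
```

Under these assumed values, the rounded slope gives 624.30 at 60 months, while a slope of about -4.144 reproduces the 615.66 from the slide. Either way, the model takes a single input, months since release, which is exactly the single-feature limitation described above.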
And in our next section, we're going to look at using linear regression in machine learning, which of course is based on this initial concept of the linear equation.