[MUSIC] Hi everyone. We are now heading toward the end of this Coursera course. So far I have covered many topics that are important in Python coding. Now it's time to study data visualization. You can use two libraries, matplotlib or seaborn. Those are the two major libraries for data visualization.
The first, matplotlib, is a library with which you can draw all kinds of graphs: a line graph, box plot, histogram, or pie chart. You can draw any kind of graph with matplotlib. Matplotlib is based on NumPy arrays, so it is better to use NumPy array data with matplotlib. Definitely you can use matplotlib with a pandas data frame, or you can provide list data to matplotlib in order to draw a graph. But basically matplotlib is based on NumPy arrays. That's why, if you face difficulty in drawing a graph from a pandas data frame, it is better to convert the data frame into a NumPy array. That can overcome some of the difficulties you face in drawing the graph. It does not happen frequently, so it is a rare case, but sometimes you may need to convert a data frame into a NumPy array if you are facing a problem.
Seaborn is another library for data visualization. Seaborn is specialized for statistical graphics. Later I will show you how it is different from matplotlib, but seaborn is basically for statistical graphics. You can add confidence intervals when drawing a graph, so that is a good benefit. Seaborn is also more closely tied to the pandas data frame, so with a data frame it is much easier to create graphics with seaborn than with matplotlib.
I will introduce several graph types: line graph, bar plot, histogram, scatter plot, box plot, and violin plot. The box plot and violin plot are quite similar, but the violin plot provides richer information than the box plot.
Now let me explain how to prepare a canvas for creating graphs. It's like drawing or painting: in order to paint, you need a canvas. So first you need to create a canvas.
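As a small sketch of the conversion mentioned above, here is one way a pandas data frame can be turned into a NumPy array with to_numpy() before it is handed to matplotlib. The data frame and its column names x and y are made up for illustration and are not the course data.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this also runs as a plain script; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical example data, not the course data set.
df = pd.DataFrame({"x": [0, 1, 2, 3], "y": [1.0, 4.0, 9.0, 16.0]})

# Convert the data frame into a NumPy array (rows by columns).
arr = df.to_numpy()

# Plot column 1 against column 0 using plain NumPy slices.
lines = plt.plot(arr[:, 0], arr[:, 1])
plt.show()
```

In most cases passing the data frame columns directly works as well, so the conversion is only a fallback.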
There are three ways of making a canvas and adding plots. A canvas is one drawing area. You may want to attach multiple paintings to that canvas. In that case you need to subdivide the canvas into pieces, and I'm going to introduce how to do that.
We need these libraries: NumPy, pandas, and matplotlib, because matplotlib, as I already explained, is based on NumPy, and seaborn is closely tied to pandas. We are importing matplotlib.pyplot as plt. You may wonder why we are importing matplotlib.pyplot. Pyplot is an interactive interface to matplotlib, so that's why we are importing pyplot, and with that interface we can use the tools within matplotlib. Seaborn is imported as sns.
We are going to use the iris data for visualization. We already downloaded this iris data set when we studied the pandas data frame, so in your data directory you probably already have this iris CSV file. I'm changing the column names into this format because it is short and easy to type. So let's first execute this cell. We are importing four libraries and also importing the data set, so it takes a little bit of time. I'm waiting for this asterisk to change to the number one. Yes, now it is ready.
Let's check the data set, iris. It looks like this. There are four measurement columns, sepal length, sepal width, petal length, and petal width, plus the label. If you want to take only the four measurement column names, you can get them by slicing the column names. So the first four column names are sliced and stored in a list.
Now it's time to create a canvas, which is the basis of visualization. In order to create a canvas, you use plt.figure, and you can control the size of the canvas with figsize, which is set equal to a tuple. In the tuple you provide two pieces of information: 10 is the length of a row and 5 is the length of a column. So the first number is the horizontal length (the width) and the second number is the vertical length (the height).
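The setup described above might look roughly like the sketch below. The CSV file from the course is not available here, so a few sample rows typed in by hand stand in for it, and the short column names (sl, sw, pl, pw, label) as well as the variable name features are assumptions. The seaborn import (import seaborn as sns) is left out because nothing in this sketch uses it.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this also runs as a plain script; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the iris CSV file: three sample rows typed in by hand.
iris = pd.DataFrame(
    [[5.1, 3.5, 1.4, 0.2, "setosa"],
     [7.0, 3.2, 4.7, 1.4, "versicolor"],
     [6.3, 3.3, 6.0, 2.5, "virginica"]],
    columns=["sepal_length", "sepal_width", "petal_length", "petal_width", "species"],
)

# Rename the columns to short names (the exact short names are an assumption).
iris.columns = ["sl", "sw", "pl", "pw", "label"]

# Slice the first four column names and store them in a list.
features = list(iris.columns[:4])

# Create a canvas that is 10 inches wide and 5 inches tall.
fig = plt.figure(figsize=(10, 5))
```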
If you omit this first command line, you actually don't need to worry about the outcome. You can still see a graph. Why? Because a default canvas is created. Then what is the size of the default canvas? 6.4 by 4.8. Those numbers are denoted in inches. So figsize can be used to determine the size of the canvas.
After creating this canvas, you may want to add multiple plots to it. Then you need to subdivide the whole canvas into pieces. How can you divide the whole canvas? Use the subplot function. The subplot function takes three integers. The first two integers determine the dimensions. For example, the number 2 here means the number of rows. The second number determines the number of columns, and the third number is the sequence number of the slot. So if you use this subplot on a canvas, it means that you are dividing the canvas into two rows and three columns: vertically there are 2 rows and horizontally there are 3 columns. The numbering starts from the upper left. Upper left is number 1, upper middle is number 2, and upper right is number 3. So in this case this subplot is placed in the first area. The second subplot also uses the 2 by 3 grid and will be located in the upper middle slot. And we are adding a third one, but with number 6, which means the last slot, so the lower right place.
Let's execute this. What you see is this one. We could place 6 subplots here, right? But only three of the subdivided areas are taken by subplots because I didn't specify the other locations. Also, as you can see here, you don't have to use commas to separate the row information, the column information, and the sequence number of the subplot.
Another command line follows the subplot function, which is tight_layout. You don't have to use this one. But if you use it, it automatically adjusts the spacing of the subplots on the canvas in order to avoid overlapping axes or overlapping subplots.
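A minimal sketch of this first way, assuming the same 10 by 5 canvas as before: three of the six slots of a 2 by 3 grid are filled, and the last call uses the form without commas. The names ax_a, ax_b, and ax_c are only for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this also runs as a plain script; omit in a notebook
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(10, 5))  # canvas: 10 inches wide, 5 inches tall

ax_a = plt.subplot(2, 3, 1)  # 2 rows, 3 columns, slot 1 (upper left)
ax_b = plt.subplot(2, 3, 2)  # same grid, slot 2 (upper middle)
ax_c = plt.subplot(236)      # same grid, slot 6 (lower right), written without commas

plt.tight_layout()  # adjust spacing so subplots and axis labels do not overlap
plt.show()
```

Commenting out the plt.tight_layout() line is an easy way to compare the two layouts.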
So what if I make this command line dormant? Then it looks like this one. In this case there is no overlap between the two subplots, and vertically, if there were another one, there would probably still be some space between the subplots. So you can usually still read the graphs. But if you activate the tight_layout function, it looks better. It automatically adjusts the space between subplots.
The last command line is plt.show. You had better always use this one at the end. Why? Because with the command lines above we are creating figures or graphs within your computer memory. Those figures are created and saved in your computer's memory. If you use plt.show, it displays the figures, and in the notebook it then closes all open figures. It removes the old graphs stored in memory, so it releases memory resources. If you do not use this last plt.show, what happens while you keep using this workspace? It's like your desk: when there is a lot of stuff on your working desk, it consumes your space, your memory space. So it is better to always use plt.show when you draw a graph.
Here let me show you another way of adding subplots. This time I'm using the first line, plt.figure, but I'm not specifying the canvas size, which means that the default canvas size will be used, and the canvas gets an object name. Above, we didn't create the fig object name, but this time the figure object is assigned a name, and then we are adding subplots to that object called fig. So fig here is the name of the whole canvas. We are also assigning the subplot names fig1 and fig2. fig1 and fig2 are attached to the canvas, and the dimension is 2, 1, which means 2 rows and 1 column, so the two subplots will be placed up and down. The first one will be located on top and the second one at the bottom. Then, to the first subplot, fig1, we are specifically adding a graph with plot. plot is for a line graph, and it is given range(4), a function creating the numbers from 0 to 3.
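The second way can be sketched as follows. The call fig.add_subplot is an assumption about the exact method used in the lecture, and the default canvas size is assumed as well. The bar chart on the lower subplot is the one discussed in the next part of the lecture.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this also runs as a plain script; omit in a notebook
import matplotlib.pyplot as plt

fig = plt.figure()  # the whole canvas, default size

fig1 = fig.add_subplot(2, 1, 1)  # 2 rows, 1 column, upper subplot
fig2 = fig.add_subplot(2, 1, 2)  # lower subplot

# Line graph: only y values are given, so x becomes 0, 1, 2, 3 automatically.
line, = fig1.plot(range(4))

# Bar chart: x values 0..3 and y values 1..4.
bars = fig2.bar(range(4), range(1, 5))

plt.tight_layout()
plt.show()
```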
That is 0, 1, 2, 3. If there is only one variable, that variable is used for the Y axis, and the X axis is created automatically. If only Y axis values are given, the X axis simply becomes integers starting from 0. So 0, 1, 2, 3: those integer values are assigned to each matching y data point.
The second one is a bar chart. In this case range(4) is the x variable, and range(1, 5) is the y axis variable. tight_layout is already explained, and plt.show is also already explained. So let me execute this cell, and what we see is this one. In the upper case, the X axis values are not assigned, because one variable means that the variable contains the Y axis values, from 0 to 3, and the X axis is also automatically created from 0 to 3. In the second subplot a bar chart is used, and this time the Y axis runs from 1 to 4 and the X axis from 0 to 3. So you can see that the two are drawn slightly differently. I created the Y variable from 1 to 4: 1, 2, 3, 4. So this is the bar chart. By creating names for each subplot, we can attach a specific visualization function to each object, fig1 and fig2.
Now, the last way of creating a canvas and subplots is to use this syntax: fig, comma, and then ax1 and ax2 contained in a tuple. If you are drawing only one graph, you simply use fig, ax. I will show you an example of that case later, but in this case you are attaching two subplots to fig. This time fig is the canvas and the subplot names are ax1 and ax2. Surely you don't have to use ax1 and ax2; you can use any names for the subplots. Then comes plt.subplots. One thing you need to remember carefully is that this time it is plural, subplots. In the above case it is singular, subplot. So if you use the third way of creating multiple subplots, you need to be careful: plt.subplots. You provide the dimension information, and 2 by 1 means that again two subplots will be placed up and down. If you want to set the figure size, you simply add the figsize information here.
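A minimal sketch of the third way, drawing the same line graph and bar chart as before. The figsize value (10, 5) is an assumption, since the lecture does not state which size is used at this point.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this also runs as a plain script; omit in a notebook
import matplotlib.pyplot as plt

# Note the plural: plt.subplots creates the canvas and all subplots in one call.
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(10, 5))  # 2 rows, 1 column

line, = ax1.plot(range(4))             # line graph on the upper subplot
bars = ax2.bar(range(4), range(1, 5))  # bar chart on the lower subplot

plt.tight_layout()
plt.show()
```

Dropping the figsize argument gives the default canvas instead.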
It means that you are creating a canvas of a specific size. I add it after a comma and execute this cell. Then actually the same figure is created. I'm using ax1 and ax2 as the subplot names, and the remaining command lines are all explained already. So these are the graphs created by those command lines. But what if we remove the figsize argument? Then obviously the default canvas will be created, and this is the default case. Because you use tight_layout, it looks a little bit better. What if we make this command line dormant by using the hash sign? Then the figure gets a little bit smaller. So by activating this command line, the graphical presentation looks better, right?
Before closing this video clip, let me give you a review question: true or false? It is subplots here, not the singular subplot but the plural subplots. The subplots function is often used for preparing subplots of the same style. Yes, usually they are the same size, but you can definitely present multiple different types of visualization or graphs using subplots. I will show you in the following video clips how to use the subplots function. Among the three ways, which one is better? Personally, I think the third one is best. It is a little bit simpler than the other cases. If you are drawing only one figure, you can use any of the ways to draw the graph.
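For the single-figure case mentioned at the end, the three ways can be sketched side by side. The variable names are hypothetical, fig.add_subplot is again an assumption about the second way, and each way draws the same line graph on its own canvas.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this also runs as a plain script; omit in a notebook
import matplotlib.pyplot as plt

# Way 1: plt.figure followed by plt.subplot (singular).
plt.figure()
ax_one = plt.subplot(1, 1, 1)
ax_one.plot(range(4))

# Way 2: a named canvas with a subplot attached to it.
fig_two = plt.figure()
ax_two = fig_two.add_subplot(1, 1, 1)
ax_two.plot(range(4))

# Way 3: plt.subplots (plural) returns the canvas and a single subplot.
fig_three, ax_three = plt.subplots()
ax_three.plot(range(4))

plt.show()
```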