Again, I'm Evan Jones, one of the course designers for Data to Insights. I've been teaching data analysis for over ten years. Before developing courses like this one, I worked in Google Finance, where we built pretty fun machine learning models to predict and optimize expenses here at Google. I'm thrilled that Google has made its internal petabyte-scale data analysis tools available to the world through the Google Cloud Platform. And it's that platform that we're going to be using to explore and derive insights using its big data tools.
Let's take a quick look at the agenda of topics we're going to cover. First, we'll start with the basics of Google Cloud Platform, and why letting the cloud handle your compute and storage needs enables massive scalability. After the fundamentals of cloud, we'll go into the big data tools available to you as a data analyst. We're going to focus on BigQuery, Google Data Studio, and Cloud Dataprep to start. Third is where we'll start coding in SQL, or Structured Query Language. Fourth, we'll explore the BigQuery pricing model for query processing and data storage. Next stop is a discussion on dirty data and how we can clean it up with SQL or a new UI tool. Sixth and seventh on this list are how you can create and store your own datasets in BigQuery, from your queries or from external data sources. We'll close here with an introduction to visualization and how to create reports from your data within Data Studio.
Moving on to some of the more advanced topics we're going to cover, you're going to look at joining and unioning your datasets together in BigQuery, as well as some of the more advanced statistical functions and user-defined functions you may not have seen before. Afterwards is one of my favorite sections, on how repeated fields and arrays work within BigQuery's nested data structures. Again, we'll close with some more advanced data visualization tips within Data Studio.
In these last sections, we'll walk through one of the most popular topics, which is troubleshooting query and dataset performance. Lastly, before wrapping up, we'll close the specialization with the critical topic of data security and access control.
This class is targeted primarily at data analysts who query their business datasets using SQL and create insightful reports and dashboards. So first and foremost, we're going to take a look at the challenges that data analysts face. Let's jump right into those.
If you've run any queries in your life, you know the first one. When I was learning database processing in school, my instructors would say, hey, run this one query and then you can go to the bathroom or do whatever you need to do while your query is running, right? So in the upper left you see queries that take too long and could potentially stall your analysis. Or what if I wanted to combine 15 data sources into one and query all of them, and I wanted to do that within a reasonable amount of time? A lot of times that was hard to do.
And in the middle, say it wasn't a querying problem, but actually an infrastructure problem. I'm a data analyst or a data scientist, not a hardware purchasing department. I don't know about buying servers and keeping redundant hard drives in case one fails. And I have to maintain the network that my query processing and data access depend on. I don't want to deal with any of that kind of infrastructure, right? But I have to, as a necessary evil, if I want to be a big data shop. Or say you're using Hadoop and managing your own clusters, and you've made this huge capital outlay to get this awesome processing cluster. But now you're punished by your own success, because your clusters can't scale: your organization says you did such an amazing job, now we have ten times the data. Can your clusters handle it? Or do you need to buy more and keep expanding your ever-growing infrastructure empire? And again, it's a question of how much of the business of building infrastructure you want to be in, given the opportunity cost: that's time you could spend writing those amazing queries or machine learning models to get those insights.
Lastly is a pretty apparent one, which is just cost. Maybe you have a ton of data, a torrent of data, but you literally just can't afford to process all of it. Either it's prohibitive performance-wise on your machines, and you can only create a few columns, or it's the monetary cost: processing and storing that much data is just prohibitive. And last but not least, if you have no central place where you can just dump all this digital data, like a staging area or an analytics warehouse, that could be a problem as well.
These are a lot of the same exact problems that Google had growing up, right? Faced with a torrent of search indexing data and ads volume data, these were necessary problems that Google, as a big data organization, had to solve. And we'll see exactly how they did that, and how the benefits of technology and time have evolved to create a lot of these cool Google Cloud Platform tools, these big data tools like BigQuery.