Welcome to the specialization on practical data science. Data science is an interdisciplinary field that combines business and domain knowledge with math, statistics, data visualization, and programming skills. You may already be familiar with some aspects of data science, like exploring a dataset through plotting different visualizations, or creating a machine learning model to make predictions. If you have some familiarity with data science, and you're looking to move your data science projects from idea to production at scale in the Cloud, this specialization will show you how. We're calling this specialization Practical Data Science because, across the three courses, you get hands-on experience with machine learning tools and techniques in the Cloud, and these are the same tools and techniques that industry professionals use every day. I'm thrilled to bring you this specialization, along with a group of three brilliant instructors from the AWS team: Antje Barth is a senior developer advocate for AI and machine learning, and is co-author of the O'Reilly book, Data Science on AWS; Sireesha Muppala is a principal solutions architect for AI and machine learning at AWS; and Shelbee Eigenbrode is a principal machine learning specialist solutions architect at AWS. I'm excited to have you three leading this specialization, not just because of your technical expertise, but also because all of you have extensive experience teaching and mentoring aspiring data scientists through workshops and conference talks, as well as being co-founders and leaders in your local Women in Big Data chapters. Antje, perhaps you could say a bit more about the big-picture goals of the specialization and what learners can expect to take away from it. Sure. Thanks, Andrew. It's a pleasure to be working with you on this specialization.
With this specialization focused on practical data science, we've created a series of courses meant for anyone looking to gain practical knowledge of how to build, deploy, and scale data science projects. As a learner in this specialization, you will build and deploy every component of an end-to-end machine learning pipeline using the AWS machine learning stack. You will master how to efficiently move your data science projects from idea to production, with thousands of models serving millions of end users. One of the biggest benefits of running data science projects in the Cloud is the agility and elasticity that the Cloud offers to scale and process virtually any amount of data. This specialization teaches you how to analyze and clean a dataset, extract the relevant features, train models, and build automated pipelines to orchestrate and scale your data science projects. And we're really excited to be bringing you this specialization in collaboration with DeepLearning.AI. I'll hand it over to Sireesha now to share more details about each of the three courses in this specialization. Thank you, Antje. And it's truly a pleasure to be working with you on this program, Andrew. In this specialization, as a learner, you will tackle complex data science challenges with sophisticated tools. In Course 1, you will get started with data science in the Cloud the easy way. You'll perform exploratory data analysis and detect statistical data bias. You will then train a machine learning model using automated machine learning, and build a multi-class text classification model using state-of-the-art algorithms. In Course 2, you will dive deeper into building a custom NLP model. You will build a machine learning pipeline, perform feature engineering, and share your features with the rest of your organization using a scalable feature store. You will then train, tune, and deploy your model.
To wrap up this course, you will orchestrate the model workflow using ML pipelines and MLOps strategies. In Course 3, you will optimize machine learning models and learn best practices for tuning hyperparameters and performing distributed model training. You will learn about advanced model deployment as well as monitoring options. Course 3 wraps up with how to perform large-scale data labeling and build human-in-the-loop pipelines to improve your model's accuracy and performance by combining machine intelligence with human intelligence. Once you complete the specialization, you will be able to build your own models and pipelines to address your most challenging machine learning problems, targeted at millions of application users. Personally, I'm thrilled to bring you this specialization, which makes data science much more accessible and approachable for everyone. That said, we are expecting that you're coming into this program with a certain level of technical background so that you're ready to dive right into these concepts and hands-on labs. Shelbee, can you say a little bit about the prerequisites we recommend for this specialization? Sure thing. Thanks, Sireesha. As a learner coming into this specialization, we expect that you're already familiar with Python and SQL programming. It'll also be helpful if you're familiar with building neural networks using one of the popular deep learning Python frameworks, like TensorFlow or PyTorch. We're also assuming that you're familiar with the ideas behind building, training, and evaluating machine learning models. If you've already completed the Deep Learning Specialization offered by DeepLearning.AI on Coursera, you should be in great shape to start the course from a data science perspective. However, you should also be familiar with the fundamentals of AWS and Cloud computing.
If you're not already familiar with those, don't worry, but we do recommend that you take the AWS Cloud Technical Essentials course available on Coursera. You can think of this Practical Data Science specialization as taking your data science and machine learning skills to the next level, so that you'll be ready to train, evaluate, deploy, and scale your machine learning models just like industry professionals. I know when I first started working with machine learning workloads, it really felt like a whole new world. And it was a whole new world that didn't work well on my laptop beyond a few small prototypes, which I'm sure is a problem that many of the instructors here ran into initially when getting started in the space. But after helping customers make that same transition from prototype to leveraging scalable tools and building end-to-end pipelines, I love seeing the progress in adopting machine learning at scale now. As a learner coming into this specialization, don't worry if this stuff seems intimidating at first. With practice, you can master these skills too. Andrew, it's been a pleasure to be working with you on this specialization. But now, as I'm talking about making that transition from local prototyping to Cloud deployments, I can't help but remember what I've heard you say about ML deployment in those pre-Cloud days. What was that like? Boy, that was a long time ago. One of my first deployments was on a computer that was sitting under my desk. I built a simple search engine for research papers, and when the server load got too high, well, I only had one server, so what could I do? I had to put a note on the homepage apologizing for the slow response time. I'm glad we're a little bit past that era now. With modern Cloud tools, it's easier than ever before to do these things.
For example, after building a machine learning model, my teams have been able to push the model to the Cloud and set up a prediction API service in just days, or maybe even hours. This means that you can train a model, get it making useful predictions, and even have it scale up and down more easily than ever before. So this is the exciting Cloud computing era that we live in, and I think it is useful and important for developers to know how to use these tools. So, let's get started.