Informatics and the Foundation of Knowledge. The reference for this presentation, unless otherwise noted, is The Fourth Paradigm, edited by Hey and colleagues. The foundation of knowledge refers to the research enterprise and how we know what can be known. Informatics and research are combining forces in new ways that will redefine our understanding of the foundation of knowledge in healthcare.
Let's pause for a moment to reflect. What is your most pressing healthcare question? What data would suggest the answer? This is your chance to think very big, because we'll be talking about big data.
Data, information, knowledge: our familiar framework has undergone a massive transformation with the advent of big data. Big Data refers to data sets that have grown too large to be manipulated through traditional methods. We envision the day when the data will speak and transform into liquid knowledge, answering clinical questions as fast as we can ask them. In this presentation we will consider three key points: Big Data, eScience, and Ontologies.
Key point one, Big Data. The increasing digitalization of healthcare data means that organizations often add terabytes' worth of patient records to data centers annually. Big Data also refers to large-scale processing architectures that focus on capacity, throughput, and new ways of processing. In addition to EHRs, some of the potential Big Data sources are instruments, devices, sensors, social media, mobile technologies, and others you can imagine.
Key point two, eScience. eScience is a new, so-called fourth paradigm of science that is only now possible due to the emerging notion of Big Data. eScience builds on thousands of years of experimental science describing natural phenomena, hundreds of years of theoretical science using models and generalizations, and decades of computational science simulating complex phenomena. eScience tools and techniques are new and different.
eScience is made possible by digital networks and the Internet, as well as new computing techniques such as Hadoop, an open-source Apache software framework for massively distributed data processing. Its design supports a highly scalable network of thousands of nodes backed by petabytes of data. The Internet, or cloud computing, makes storage and use of big data possible. Methods such as data mining and data visualization are considered eScience methods, as they generate hypotheses from the data. Neural networks are machine learning tools: highly interconnected processing elements that are configured, through a learning process, for a specific application such as pattern recognition or data classification.
Let's explore three big data research projects: the Exploring and Understanding Adverse Drug Reactions (EU-ADR) project, the Transforming Health Care through Big Data study, and the MetroHealth heart disease risk study.
The EU-ADR project aims to develop an innovative, computerized system to detect adverse drug reactions using EHRs from over 30 million patients in the Netherlands, Denmark, the United Kingdom, and Italy. In this project, a variety of text-mining, epidemiological, and other computational techniques will be used to analyze the EHRs in order to detect signals, that is, combinations of drugs and suspected adverse events that warrant further investigation.
The Transforming Health Care through Big Data study examined the use of data mining techniques with EHR data. FitzHenry and colleagues produced respectable sensitivity and specificity across a large sample of patients seen in six different medical centers, and demonstrated the utility of combining natural language processing with structured data for mining EHR information.
The MetroHealth heart disease risk study used a database of 14 million medical records gathered from 12 major health systems to replicate a longitudinal Norwegian study of heart disease risk.
The MetroHealth study produced similar but more precise findings, due to the large sample size. The study took only three months, compared with 13 years for the Norwegian study, and was accomplished with minimal costs.
Key point three, Ontologies. Some of the major eScience challenges for all disciplines involve the codification and representation of knowledge. In healthcare, we can turn to some familiar tools, ontologies, that will enable meaningful schema, organization, reorganization, and sharing of our data.
Recently, the Institute of Medicine championed big data in its visionary call to action, Best Care at Lower Cost, pointing out that emerging tools like computing power, connectivity, team-based care, and systems engineering techniques will make better care at lower cost possible. The recommendations for foundational elements include a digital infrastructure that's all about the data: data capture through digital information systems, data infrastructure for standardized data elements and information transfer, data analysis by researchers in analytic consortia, data dissemination through distributed data research networks, and research funding to support ongoing quality improvement efforts.
Ontology-based research is an emerging specialty that is enabled by structured data resulting from the rational codification and representation of knowledge. The First International Conference on Research Methods for Standardized Terminologies was held by our Center for Nursing Informatics at the University of Minnesota. I invite you to explore our website, and to imagine the studies you would like to conduct.
After a brief review of the history of research, we considered three key points: Big Data, eScience, and Ontologies. These key points are fundamental to understanding the opportunities in healthcare coming from eScience and the emerging streams of Big Data.
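As a supplementary illustration of the earlier point that a larger sample yields more precise findings, here is a minimal sketch in Python. It assumes a simple normal-approximation confidence interval for a proportion. The event rate and the sample sizes are hypothetical values chosen for illustration and are not taken from the MetroHealth study or any other study, and the function name ci_half_width is likewise an assumption.

```python
import math


def ci_half_width(p, n, z=1.96):
    """Half-width of a normal-approximation confidence interval
    for a proportion p estimated from n records (z=1.96 ~ 95%)."""
    return z * math.sqrt(p * (1.0 - p) / n)


# Hypothetical values for illustration only; not study data.
assumed_rate = 0.10
sample_sizes = [1_000, 100_000, 10_000_000]

for n in sample_sizes:
    hw = ci_half_width(assumed_rate, n)
    print(f"n = {n:>10,}: {assumed_rate:.2f} +/- {hw:.5f}")
```

Under this approximation the interval narrows in proportion to one over the square root of the sample size, so a sample 100 times larger gives an estimate roughly 10 times more precise.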