In this video, I'll be talking to you about automated analysis. That is, bringing in automation so that the computer can help you test the software you're trying to build. The question we're asking ourselves when we test is: how do we effectively verify our software? In the last several courses, we've been looking at techniques to make our testing more rigorous. We've looked at structural coverage to make sure we have enough tests to exercise every line of code in the program, and we've looked at techniques to automate the running of tests, so that once we write them, we don't have to replay them manually. But what we really want is something more rigorous still. We want the computer to assist us in writing the tests, so that we can cover many more cases than we could if we had to write every test by hand.

So let's recall our analysis square. In the middle, we have perfect accuracy: a test or analysis there would always perfectly determine whether or not the property of interest holds. On the right side, we have optimism, and that's where we find our typical testing. Sometimes the tests will tell us the system is okay when it's not. We can ask interesting properties of the system, but if we're not careful, we're likely to get an answer that says the system is okay when in fact it still has quite a few bugs. On the other side, we have pessimistic analyses: analyses that say the system is broken when in fact it works correctly. Simplistic program analyses like lint sit here. They often give us lots and lots of warnings, and most of those warnings may not actually matter for our program. The other thing we talked about earlier is the vertical scale, which has to do with what kinds of properties we can ask of the system.
For testing, we can ask quite sophisticated properties: does the system meet our requirements? For a simplistic program analysis like lint, we can only ask simple properties, like: is there a null pointer dereference somewhere? So that's the vertical scale. Some things we can do precisely. Type checking is very simple, and we can only ask simple questions, but we get exact answers. Can we perform this operation on an integer? The type checker will tell us. What we'd like to get to is perfect verification: we'd like to be able to ask any question we wish of any program we have and determine whether or not it meets that property. But because of the halting problem, we know that's undecidable in general.

So we're going to focus on the space in the middle: things that are fairly accurate and fairly rigorous. Down at the bottom, we have things like Java type checking. If we move to a more sophisticated programming language like Haskell, we can ask more interesting questions of the type system. Then we have static analysis, which in some cases can conclusively determine whether or not the program is well-formed: whether there are null pointer exceptions, or divide-by-zeros, or other well-formedness properties that are common to any program. You never want an integer overflow or a divide by zero, and you can ask these kinds of questions of static analysis tools. We've already looked at coverage testing, which is an attempt to make our testing regimen more rigorous, and therefore more accurate in telling us whether or not the program has bugs. Now, in this course, we're going to start looking at automated approaches to ask interesting questions of programs. One thing I'm going to talk about is model checking of temporal properties, where we ask questions like: does the program always meet its requirements?
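To illustrate the kind of exact answer a type checker gives, here is a tiny Java sketch. The class and method names are invented for this example, not taken from the course:

```java
public class TypeCheckDemo {
    // A trivially typed operation: the compiler verifies, before the
    // program ever runs, that callers pass an int.
    public static int twice(int n) {
        return n * 2;
    }

    public static void main(String[] args) {
        System.out.println(twice(21)); // prints 42
        // twice("hello");  // rejected at compile time:
        //                  // String cannot be converted to int
    }
}
```

The question is simple ("can I perform this operation on an integer?"), but the answer is exact: an ill-typed call never reaches execution.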
The other thing I'm going to focus on primarily is tool-assisted automated testing. In this case, we ask the computer to write tests for us, and then we check whether the program behaves as intended against all of those test cases. The difference is that while a human being might write 100 or 1,000 tests, a computer can write millions of tests, and it can do so very quickly. A lot of those tests are like the proverbial million monkeys typing Shakespeare: they wouldn't be meaningful to a human being. But if you can determine what the program should do in those cases, then they're very useful for checking whether the program behaves as intended on boundary conditions. One final thing you could do is theorem proving, where, much like writing pen-and-paper proofs, you try to prove properties about your program using other programs that check your work as you go. But we're not going to cover that, because it requires a lot of specialized knowledge to do well. So the focus of this course is going to be static analysis and tool-assisted automated testing, with a little bit of discussion of model checking.

When we talk about automation and rigor, we know automation is necessary to make software testing effective. We've used it with JUnit: we automate the execution of tests so that every time we compile, or every time we check in, we can run a set of tests against the program. There are many other tools that allow you to do this, but we need more to be really effective testers. What we need are tools that automate test generation. We've seen some automation of this kind with Cucumber: you write down your requirements and it turns them into tests for you. But there's much, much more.
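To sketch the "million monkeys" idea, here is what a minimal random-input check might look like in plain Java. Everything here is illustrative rather than code from the course: `formatDisplay` is a hypothetical system under test, and without any oracle at all we can still check that no generated input makes it crash:

```java
import java.util.Random;

public class CrashFuzzer {
    // Hypothetical system under test: format a mm:ss display string.
    static String formatDisplay(int minutes, int seconds) {
        return String.format("%02d:%02d", minutes, seconds);
    }

    // Throw thousands of random inputs at the function and count how
    // many of them cause an unexpected exception.
    public static int countFailures(int trials, long seed) {
        Random rng = new Random(seed);
        int failures = 0;
        for (int i = 0; i < trials; i++) {
            try {
                formatDisplay(rng.nextInt(100), rng.nextInt(100));
            } catch (RuntimeException e) {
                failures++; // a crash on an input no human ever tried
            }
        }
        return failures;
    }
}
```

A human might hand-pick a dozen of these inputs; the loop above tries a thousand in a fraction of a second, which is exactly the leverage the computer gives us.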
In terms of coming up with test inputs, we'll see examples where the computer generates test cases we never would have thought of. When we do this, we can, for one, get better coverage than is possible with manual testing; we can really rigorously exercise the different statements in the program. And in some cases, using symbolic tools, we can get exhaustive coverage: when we do model checking, we really are checking every possible input to the system and determining whether or not the program does the right thing.

So let me give you an example. This uses random test generation with a tool called QuickCheck, and we're going to demonstrate it on the microwave we've used in previous courses. Here is something that allows us to run many, many tests against the microwave all at once, so to speak. What we have looks a lot like JUnit, but instead we instantiate it with something called JUnitQuickcheck. The interesting thing about this test case is that, rather than calling it a test, we call it a property. That means we have a test that takes in inputs, and JUnitQuickcheck will randomly generate inputs within the ranges we provide to execute that test. What we're checking in this particular test is how the display updates. In previous work, we let the user choose a couple of combinations of input values: maybe cook for 20 seconds, or cook for 30 seconds. But we know the microwave can cook for up to 99 minutes and 99 seconds; if you enter 9999 on the display, that's the maximum time it can cook. So we want to try all kinds of different inputs in that space, and we can do that with JUnitQuickcheck.
Once we have the digits describing how long to cook, we ask: how much time has elapsed? We start up the microwave, let it cook for a certain amount of time, which is also variable, and then check whether the display matches our expectation. So we take the input describing the time to cook, break it into digits, press the digits using the Cucumber steps we've previously written, press the start key, and then let the microwave cook for a random amount of elapsed time. The time remaining on the display should then match the original time minus the time elapsed. We can also check the mode of the microwave: if the time remaining reaches zero, we should go back into setup mode; otherwise, we should be in cooking mode. I've configured this so that the four-digit value is between 0 and 9999, and the elapsed time can go longer, up to 11,000 seconds, so we should hit the case where the remaining time reaches zero and the microwave quits cooking. And I've said: run this 40 times. The tool will go off and do this for us.

All right, now we'll turn it on. I'll run a build, and you can see that the run takes quite a bit longer than we've been used to for our JUnit tests. But what's interesting is that it found a bug. The reality is that I had run probably three dozen manually written tests on this microwave, hadn't found any bugs, and thought it was bug-free. When I started running QuickCheck on it, I found a new bug I hadn't anticipated, in a system I thought was pretty well tested. Let's take a look at what we find. In the build folder, if I look at the reports, I'll see that 99 percent of the tests ran correctly, but we have one failed test. And now you can see the result of running JUnitQuickcheck.
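The property described above can be sketched as self-contained plain Java. This is a hypothetical reconstruction, not the course's actual Microwave class or its junit-quickcheck harness: we assume a four-digit entry dddd means dd minutes and dd seconds, draw elapsed times up to 11,000 seconds, and compare the remaining time and the mode against the arithmetic expectation:

```java
import java.util.Random;

public class MicrowavePropertySketch {
    enum Mode { SETUP, COOKING }

    // Tiny stand-in model of the microwave under test.
    static final class Microwave {
        int secondsRemaining = 0;
        Mode mode = Mode.SETUP;

        // Entered value dddd means dd minutes and dd seconds.
        void enterAndStart(int displayValue) {
            int minutes = displayValue / 100;
            int seconds = displayValue % 100;
            secondsRemaining = minutes * 60 + seconds;
            mode = secondsRemaining > 0 ? Mode.COOKING : Mode.SETUP;
        }

        // Let `elapsed` seconds of cooking pass.
        void tick(int elapsed) {
            secondsRemaining = Math.max(0, secondsRemaining - elapsed);
            if (secondsRemaining == 0) mode = Mode.SETUP;
        }
    }

    // The property: after cooking for `elapsed` seconds, the remaining
    // time and the mode must match the arithmetic expectation.
    public static boolean holds(int displayValue, int elapsed) {
        Microwave m = new Microwave();
        m.enterAndStart(displayValue);
        int entered = (displayValue / 100) * 60 + displayValue % 100;
        int expected = Math.max(0, entered - elapsed);
        m.tick(elapsed);
        return m.secondsRemaining == expected
                && m.mode == (expected == 0 ? Mode.SETUP : Mode.COOKING);
    }

    // Check the property against `trials` random input pairs.
    public static boolean run(int trials, long seed) {
        Random rng = new Random(seed);
        for (int i = 0; i < trials; i++) {
            if (!holds(rng.nextInt(10_000), rng.nextInt(11_001))) {
                return false;
            }
        }
        return true;
    }
}
```

With the real junit-quickcheck library, the hand-rolled loop in `run` would roughly correspond to a method annotated `@Property(trials = 40)` whose parameters carry `@InRange` bounds, with the library doing the random generation for you.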
What we see is that the system fails to do what was expected when we cook for a very long time: 42 minutes 79 seconds, with a wait of 3,291 seconds before checking the result. Clearly, the elapsed time is smaller than the cook time, so I should still be in the cooking mode, but somehow I'm in the setup mode. Something went wrong. The tricky part is that I still have to go in and figure out what went wrong, but in this case I'm going to short-circuit that and go straight to the file with the error. What I had done is fail to roll over the digits in a certain configuration: essentially, the tens-of-minutes digit wasn't rolling over. I needed to change this "i > 0" to "i >= 0". Now when I re-run, running many, many tests against this particular function, you can see it passes. And I can run not just 40 tests but 100 tests, or 1,000 tests, or as many as I feel I have time for, to increase the level of rigor and push more test cases through.

This is just an example, and actually the simplest example, of what automated testing can do for you. So to recap: to make testing rigorous, we need to automate the generation as well as the execution of tests. Right out of the box, without even writing an oracle (and I wrote a fairly complicated oracle here), automated tests can find exceptions and crash bugs: even if I didn't say what the right display value was, I could still run a bunch of random tests through to make sure the program didn't crash in any case. But to make automated testing more effective, we need some notion of executable requirements, and we'll spend quite a bit of time in this course describing how to write those requirements and those automated tests.
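The off-by-one described above can be illustrated with a hypothetical reconstruction of the digit-rollover logic; the course's actual source isn't shown, so the names and digit layout here are invented:

```java
public class DigitRollover {
    // digits = {tensOfMinutes, minutes, tensOfSeconds, seconds}.
    // Decrement the mm:ss display by one second, borrowing right to
    // left. The buggy loop bound `i > 0` never lets the borrow reach
    // the tens-of-minutes digit at index 0.
    public static void decrementOneSecond(int[] digits) {
        for (int i = digits.length - 1; i >= 0; i--) { // buggy version: i > 0
            if (digits[i] > 0) {
                digits[i]--;
                return;
            }
            // Zero digit rolls over and the borrow continues leftward;
            // tens-of-seconds rolls to 5, the others to 9.
            digits[i] = (i == 2) ? 5 : 9;
        }
        // We assume the controller stops ticking at 00:00, so the
        // all-zeros wraparound case is never reached in practice.
    }
}
```

With the buggy bound `i > 0`, decrementing 10:00 leaves the tens-of-minutes digit untouched and produces 19:59; with `i >= 0`, the borrow reaches index 0 and 10:00 correctly becomes 09:59.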