So, in this video, we're going to take it at least partway to the next level for the heap data structure. That is, we'll discuss some of the implementation details, i.e., how would you code a heap data structure from scratch? So remember what the point of a heap is. It's a container for objects, and each of these objects, in addition to possibly lots of other data, should have some kind of key. And we should be able to compare the keys of different objects: Social Security numbers for different employees, edge weights for different edges in a network, or time stamps on different events, and so on. Now remember, for any data structure, the number one thing you should remember is what operations it supports, i.e., what the data structure is good for, and also what running time you can count on for those operations. So, as promised in a previous video, here are the two primary operations that a heap exports. First of all, you can insert objects into it, in time logarithmic in the number of objects that the heap is storing. And second of all, you can extract an object that has the minimum key value. Again, we're going to allow duplicates in our heaps, so if there are multiple objects with a common minimum key value, the heap will return one of those; it's unspecified which one. As I mentioned earlier, you can also dress up heaps with additional operations: you can do batch inserts in linear rather than logarithmic time, and you can delete from the middle of the heap. I'm not going to discuss those in this video; I'm just going to focus on how you'd implement insert and extract-min. If you want to know how heaps really work, it's important to keep in mind simultaneously two different views of a heap: one as a tree, and one as an array. So we're going to start on this slide with the tree view.
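As a quick aside (my addition, not part of the lecture): Python's standard-library heapq module exports exactly this pair of operations on a plain list, so you can experiment with the interface before building a heap from scratch:

```python
import heapq

h = []                       # heapq treats an ordinary Python list as a min-heap
for key in [12, 4, 9, 4, 8]:
    heapq.heappush(h, key)   # insert: O(log n) per operation

smallest = heapq.heappop(h)  # extract-min: O(log n); duplicates are allowed,
                             # and which copy of the minimum you get is unspecified
```

Here `smallest` is 4, one of the two duplicate copies; repeated pops return the remaining keys in sorted order.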
Conceptually, this will be useful to explain how the heap operations are implemented. Conceptually, we'll think of a heap not just as any old tree, but as a tree that is, first, rooted; second, binary, meaning that each node has zero, one, or two child nodes; and third, as complete as possible. So let me draw, for your amusement, an as-complete-as-possible binary tree that has nine nodes. If the tree had had only seven nodes, it would have been obvious what "as complete as possible" means: we just would have had three completely filled-in levels. If it had had fifteen nodes, it would have had four completely filled-in levels. If you're in between these two numbers, which are powers of two minus one, we're going to call the tree as complete as possible if, on the bottom level, you fill in the leaves from left to right. So here, the two extra leaves on the fourth level are both pushed as far to the left as possible. So, in our minds, this is how we visualize heaps. Let me next define the heap property. This imposes an ordering on how the different objects are arranged in this tree structure. The heap property dictates that at every single node of this tree (it doesn't matter if it's the root, a leaf, or an internal node), at every node X, the key of the object stored at X should be no more than the keys of X's children. Now, X may have zero children if it's a leaf, it may have one child, or it may have two children; in all of those cases, all of those children's keys should be at least the key at X. For example, here is a heap with seven nodes. Notice that I am allowing duplicates: there are three different objects that have the key value four in this heap. Another thing to realize is that while the heap property imposes useful structure on how the objects can be arranged, it in no way uniquely pins down their arrangement.
So this exact same set of seven keys could be arranged differently, and it would still be a heap. The important thing is that, in any heap, the root has to have a minimum-value key. So in these two organizations of these seven keys, the root is always a four, the minimum key value. That's a good sign, given that one of the main operations we're supposed to quickly implement is to extract the minimum value: at least we know where it's going to be, namely at the root of the heap. Now, while in our minds we think of heaps as organized in a tree fashion, we don't literally implement them as trees. In something like search trees, you actually have pointers at each node, and you can traverse pointers to go from a node to its children, from the children to the parent, yada, yada, yada. It turns out it's much more efficient in a heap to just directly implement it as an array. Let me show you by example how a tree like we had on the last slide maps naturally onto an array representation. So let's look at a slightly bigger heap, one that has nine elements, and let me draw an array with nine positions, labeled one, two, three, all the way up to nine. The way we're going to map this tree, which is in our mind, to this array implementation is really very natural: we're just going to group the nodes of this tree by their level. So the root is going to be the only node at level zero. Then the children of the root are level one, their children constitute level two, and then we have a partial level three, which is just these last two nodes here. And now we just stick these nodes into the array, one level at a time. So the root winds up in the privileged first position; that's going to be the object which is the first copy of the four. Then we put in the level-one objects, that is, the second copy of the four and the eight, and then we put in level two, which has our third four along with the two nines.
And then we have the last two nodes from level three, rounding out the penultimate and final positions of the array. And you might be wondering how it is we seem to be having our cake and eating it too. On the one hand, we have this nice logical tree structure; on the other hand, we have this array implementation, and we're not wasting any space on the usual pointers you would have in a tree to traverse between parents and children. So where's the free lunch coming from? Well, the reason is that because we're able to keep this binary tree as balanced as possible, we don't actually need pointers to figure out who is whose parent and who is whose child. We can just read that off directly from the positions in the array. So, let me be a little bit more specific. Suppose you have a node in the i-th position, and assume here that i is not one, right? The root does not have a parent. But any other object, in position i, does have a parent, and what that position is depends in a simple way on whether i is even or odd. If i is even, then the parent is just in position i/2, and if i is odd, then i/2 is a fraction, so we take the floor; that is, we round down to the nearest integer. So, for example, the objects in positions two and three have as their parent the object in position one, those in four and five have the one in position two as their parent, six and seven have as their parent the object in position three, and so on. And of course we can invert this function; we can equally easily determine who the children are of a given node. If we have an object in position i, then you'll notice that the children of i are going to be at positions 2i and 2i+1. Of course, those may be empty: a leaf doesn't have any children, and there may be one node that has only one child.
But in the common case of an internal node, there are going to be two children, and they're going to be in positions 2i and 2i+1. So rather than traversing pointers, it's very easy to go from a node to its parent, or to either one of its children, just by doing these appropriate trivial calculations on its position. So this slide illustrates some of the lower-level reasons that heaps are quite a popular data structure. The first one is storage: we don't need any overhead at all. We have these objects, and we're storing them directly in an array, with no extra space. Second of all, not only do we not need space for pointers, but we don't even have to do any traversing. All we do are these really simple divide-by-two or multiply-by-two operations, and using bit-shifting tricks, those can be implemented extremely quickly. So over the next two slides, let me indicate, at a high level, how you would implement the two exported operations, namely insert and extract-min, in time logarithmic in the size of the heap. Rather than give you any pseudocode, I'm just going to show you how these work by example. I think it will be obvious how they extend to the general case, and just based on this discussion, you'll be able to code up your own versions of insert and extract-min, if you so desire. So let's redraw the 9-node heap that we had on the previous slide. Again, I'm going to keep drawing it as a tree, and I'm going to keep talking about it as a tree, but always keep in mind that the way it's really implemented is in terms of this array, and when I talk about the parent of a node, what that means is you go to the appropriate position given the position of the node from where you started. So let's suppose we have an existing heap, like this blue heap here, and we're called upon to insert a new object, let's say with key value k. Now remember, heaps are always supposed to be perfectly balanced binary trees.
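The index arithmetic just described fits in a few lines. The lecture gives no code, so here is one possible sketch in Python, using a 1-based array with a dummy in slot 0 so the math works out; the nine-element array is my reconstruction of the heap on the slide, read off level by level:

```python
def parent(i):
    # 1-based indexing: the root is at position 1 and has no parent.
    assert i >= 2
    return i // 2  # floor division covers both the even and odd cases at once

def children(i):
    return 2 * i, 2 * i + 1  # left child, right child

def is_heap(a):
    """Check the heap property for a 1-based array a (a[0] is an unused dummy)."""
    n = len(a) - 1
    for i in range(2, n + 1):
        if a[parent(i)] > a[i]:  # a parent's key must never exceed a child's
            return False
    return True

# The slide's 9-node heap, stored level by level (position 0 is the dummy).
h = [None, 4, 4, 8, 9, 4, 12, 9, 11, 13]
```

So, for instance, `parent(9)` is 4 and `children(3)` is `(6, 7)`, matching the examples in the lecture, and `is_heap(h)` confirms the arrangement is a valid heap.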
So if we want to maintain the property that this tree is perfectly balanced, there's pretty much only one place we can try to put the new key k, and that's as the next leaf. That is, it's going to be the new rightmost leaf on the bottom level. Or, in terms of the array implementation, we just stick it in the first empty slot of the array. And if we keep track of the array's size, we can of course, in constant time, know where to put the new key. Now, whether or not we can get away with this depends on what the actual key value k is. But, for starters, what if we insert a key that's a seven? Well, then we get to say, phew, we're done. The reason we're done is that we have not violated the heap property: it is still the case that every node has key no bigger than those of its children. In particular, this third copy of a four picked up a new child, but that child's key, seven, is bigger than its own key, four. So, you can imagine that maybe we get lucky with another insert. Maybe the next insertion is a ten. Again, we put that in the next available spot on the last level, and it becomes the second child of the third copy of the four. And again, we have no violation of the heap property; no worries, we've still got a heap. In these lucky cases, insertion even takes constant time: really, all we're doing is putting elements at the end of an array and not doing any rearranging. So where it gets interesting is when we do an insertion that violates the heap property. Suppose we do yet another insertion, and the left child of this twelve becomes a five. Now we've got a problem. We have an as-perfectly-balanced-as-possible binary tree, but the heap property is not satisfied. In particular, it's violated at the node twelve: it has one child, and the key of that child is less than its own key. That's no good. So is there some way we can restore the heap property?
Well, a natural idea is just to swap the positions of the five and the twelve, and that's something that can of course be done in constant time, since from a node we can go to the parent or a child in constant time, just with a suitable trivial computation. So we say: okay, for starters we put the five at the end, but that's no good, so we're going to swap the five with the twelve. And now we see we're not out of the woods. No longer is there a heap violation at the node twelve; that's been fixed, we've made it a leaf. But we've pushed the heap violation up: now instead it's the eight that has a problem. The eight used to have two children, with keys twelve and nine; that was fine. Now the eight has two children with keys five and nine. The five is less than the eight, and that's a violation of the heap property. But again, that's the only violation: there's no other node that could have been screwed up, because eight was the only node whose children we messed around with. All right, so now we just do it again. Let's again locally fix the heap violation, this time by swapping the five with the eight. And now we see we've restored order. The only place where there could possibly be a violation of the heap property is at the root: when we did this swap, the only node whose children we really messed with was the root four, and fortunately its new child has key five, which is bigger than it. One subtle point: you might be thinking that, in addition to possibly screwing up at the root node by messing around with its children, maybe we could have screwed up the twelve by messing around with its parent. Its parent used to be five, and now its parent is eight. So is there some possibility that its parent would all of a sudden have a key bigger than it? But if you think about it, this eight and this twelve were in a parent-child relationship in the original heap, right? Back in the blue heap, the twelve was under the eight.
Now the twelve is under the eight yet again; since we had the heap property for that pair before, we certainly have it now. So in general, as you push this five up the tree, there's only going to be one possible edge that could be out of order, and that's between where the five currently resides and whatever its parent is. When the five's parent was twelve, that was a violation; when the five's parent was eight, that was a violation; but now that we've pushed it up two levels and the five's parent is four, that's not a violation, because four is less than five. So in general, step two of insertion is that you do this swap, which is called a lot of different things. I'm going to call it bubble-up, because that's how I learned it more years ago than I care to admit, but it's also sometimes called sift-up, heapify-up, and so on. So now I've told you how to implement insertion by repeated bubble-ups in a heap data structure. And this is really how it works; there's nothing I haven't told you. I'm not going to fill in all the details, but I'll encourage you to do that on your own time, if it's something that interests you. There are two main things you should check. First of all, that this bubble-up process is going to stop, and that when it stops, it stops with the heap property restored. The second thing that needs to be checked, which I think is easier to see, is that we do have the desired running time, logarithmic in the number of elements in the heap. The key observation is that, because this is a perfectly balanced binary tree, we know exactly how many levels there are: there are basically log base two of n levels, where n is the number of items in the heap. And what is the running time of this insertion procedure? Well, you only do a constant amount of work at each level, just a swap and a comparison, and in the worst case you'll have to swap at every single level, and there's a logarithmic number of levels. So that's insertion.
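If you do want to code it up, the insertion just walked through might look like the following sketch (my own, not the lecturer's; Python for concreteness, with a 1-based array and a dummy in slot 0 so a node at position i has its parent at position i // 2):

```python
def insert(heap, key):
    """Insert key into a 1-based min-heap array (heap[0] is a dummy). O(log n)."""
    heap.append(key)                 # step 1: new key goes in the next open leaf
    i = len(heap) - 1
    while i > 1 and heap[i // 2] > heap[i]:            # parent's key exceeds ours
        heap[i // 2], heap[i] = heap[i], heap[i // 2]  # bubble up one level
        i //= 2
    # Loop ends at the root, or once the parent's key is no bigger than ours;
    # at that point the heap property is restored.

# Tracing the lecture's example on the slide's heap:
h = [None, 4, 4, 8, 9, 4, 12, 9, 11, 13]
for k in (7, 10, 5):
    insert(h, k)
# The 7 and 10 stay put as children of the third four; the 5 bubbles up
# past the 12 and the 8, stopping just beneath the root 4.
```

Note the loop does a comparison and at most one swap per level, which is where the logarithmic bound comes from.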
Let's now talk about how to implement the extract-min operation, and again I'm going to do this by example; it's going to be by repeated applications of a bubble-down procedure. So the extract-min operation is responsible for removing from the heap an object with minimum key value and handing it back to the client on a silver platter. It pretty much has to rip out the root; remember, the minimum is guaranteed to be at the root. So that's how the extract-min subroutine has to begin: we just pluck out the root and hand it back to the client. This removal, of course, leaves a gaping hole in our tree structure, and that's no good. One of the invariants we're responsible for maintaining is that we always have an as-perfectly-balanced-as-possible binary tree, and if you're missing a root, you certainly don't have an almost perfect binary tree. So what are we going to do about it? How do we fill this hole? Well, there's pretty much only one node that could fill this hole without causing other problems with the tree structure, and that is the very last node, the rightmost leaf on the bottom level. So the simple fix is to move that up and have it take the place of the original root. In this case, the thirteen is going to get a massive promotion and get teleported all the way up to be the new root of this tree. So now we've resolved our structural challenges: we again have an as-perfectly-balanced-as-possible binary tree. But of course now we've totally screwed up the heap property, right? The heap property says that at every node, including the root, the key value at that node has to be no more than those of both of its children, and now it's all messed up: at the root, the key value is actually bigger than both of the children's. And matters are a little bit more tricky than they were with insertion, right? When we inserted at the bottom, every node had a unique parent.
If you want to push a node upward in the tree, there's sort of only one place you can go: all you can do is swap with your parent, unless you're going to try to do something really crazy. If you want to do something local, you pretty much have only a unique parent to swap with. But when you're trying to push a node down to its rightful position in the tree, there are two different swaps you could do, one with the left child and one with the right child, and the decision we make matters. To see that concretely, let's think about this example. There's this thirteen at the root, which is totally not where it should be, and there are the two children, the four and the eight, and we could try swapping it with either one. So suppose we swap it, in a misguided way, with the right child, the eight. Now the eight becomes the root, and the thirteen gets pushed down a level. On the one hand, we've made some progress, because now at least we don't have a violation between the thirteen and the eight. On the other hand, we still have violations involving the thirteen: the thirteen is still in violation with respect to the twelve and the nine. And moreover, we've created a new problem between the eight and the four, right? Now that the eight is the root, it's still bigger than its left child, this four. So it's not even clear we made any progress at all when we swapped the thirteen with the eight, okay? So that was a bad idea. And if you think about what made it a bad idea, the stupid thing was to swap with the larger child. That doesn't make any sense. We really want to swap with the smaller child. Remember, every node should have a key no bigger than both of its children. So if we're going to swap up either the four or the eight, one of them is going to become the parent of the other. The parent is supposed to be smaller, so evidently we should take the smaller of the two children and swap the thirteen with that.
So we should swap the thirteen with the four, not with the eight. And now we observe a phenomenon very much analogous to what we saw in insert. When we were bubbling up during insertion, it wasn't necessarily that we fixed violations of the heap property right away; we would fix one and then introduce another one higher up in the tree, and we had confidence that eventually we could just push this violation up to the root of the tree and squash it, just like we're trying to win a game of Whack-a-Mole. Here, it's the same thing, just in the opposite direction. So we swap the thirteen with the four. It's true we've created one new violation of the heap property, again involving the thirteen, this time with its children nine and four. But we haven't created any other new ones. We've pushed the heap violation further down the tree, and hopefully, again like in Whack-a-Mole, we'll squash it at the bottom. So after swapping the thirteen and the four, now we just do the same thing. We say: okay, we're not done, we still don't have a heap; this thirteen is bigger than both of its children. But now, with our accumulated wisdom, we know we should definitely swap the thirteen with the four. We're not going to try swapping it with the nine, that's for sure. So we move the four up here, and the thirteen takes the four's old place. Boom! Now we're done. There are no violations remaining. The thirteen in its new position has no children, so there's no way it can be in violation; and the four, because it was the smaller child, is no bigger than the nine, so we haven't introduced a new violation there. And again we have these consecutive fours, but we know that's not going to be a problem, because those were consecutive fours in the original heap as well.
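The whole extract-min, with this bubble-down, could be sketched as follows (again my own Python sketch, not code from the lecture, using a 1-based array with a dummy in slot 0 so the children of position i sit at 2i and 2i+1):

```python
def extract_min(heap):
    """Remove and return an object with minimum key from a 1-based
    min-heap array (heap[0] is a dummy). O(log n) worst case."""
    n = len(heap) - 1
    root = heap[1]
    heap[1] = heap[n]           # promote the last leaf to the root, so the
    heap.pop()                  # tree stays as complete as possible
    n -= 1
    i = 1
    while 2 * i <= n:           # while position i has at least a left child
        c = 2 * i
        if c + 1 <= n and heap[c + 1] < heap[c]:
            c += 1              # pick the *smaller* child, never the larger
        if heap[i] <= heap[c]:
            break               # heap property restored; stop bubbling down
        heap[i], heap[c] = heap[c], heap[i]
        i = c
    return root

# Tracing the lecture's example on the slide's heap:
h = [None, 4, 4, 8, 9, 4, 12, 9, 11, 13]
m = extract_min(h)
# m is the root 4; the promoted 13 bubbles down past the smaller
# children and ends up as a leaf, leaving a valid heap behind.
```

The early `break` is what makes the lucky cases cheap: as soon as the promoted key is no bigger than its smaller child, no further swaps are needed.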
So you won't be surprised to hear that this procedure, by which you push something down by swapping it with its smaller child, is called bubble-down, and extract-min is nothing more than taking this last leaf, promoting it to the top of the tree, and bubbling down until the heap violation has been fixed. So again, on a conceptual level, those are all of the ingredients necessary for a complete, from-scratch implementation of extracting the minimum from a heap, and as before, I'll leave it to you to check the details. First of all, you should check that this bubble-down has to halt at some point, and that when it halts, you do have a bona fide heap: the heap property is definitely restored. Second of all, the running time is logarithmic. Here the running time analysis is exactly the same as before: we've already observed that the height of a heap, because it's perfectly balanced, is essentially the log base two of the number of elements in the heap, and in bubbling down, all you do is a constant amount of work per level, just a couple of comparisons and a swap. So, that's a peek at what's under the hood in the heap data structure, a little bit about the guts of its implementation. After having seen this, I hope you feel like a little bit more hardcore of a programmer, a little bit more hardcore of a computer scientist.