
Title:
Inserting Elements  Intro to Algorithms

Description:

So the last thing I promised I would tell you about

is inserting elements into a heap.

And that turns out not to be all that hard,

but it does involve some coding that we haven't done yet.

The idea is that the new element that we're going to insert

we stick at the sort of bottom right corner of the heap.

Or, if the heap was already full, then the far left.

But now we have potentially violated the heap property,

so this node to it's parent could be problematic.

So, essentially, we need to do

some sort of an analog of downheapify that I call upheapify

that takes this value that might be problematic,

swaps it up if it is,

and then continues to swap it up the tree

until it reaches a place where it is smaller than both of its children.

At which point, the heap property is satisfied globally.

So I'm going to leave that as a homework problem for you

to actually code upheapify,

and that gives us the insert that we need.

So now that we can insert and delete into heaps and (logn) time,

we can use this to solve the TopK problem

that we've been talking about throughout this unit.

By running through the list of the N elements that we want the TopK of,

and meanwhile we keep a heap of size K off to the side,

and so we're trying to the largest K, in this case.

So each new element that we encounter

we ask, is this element bigger than the smallest

value that we've kept so far?

So does it deserve to be in the TopK so far?

If not, we can just throw it away,

because we know it's not going to be part of the TopK.

If it is, then the smallest thing in that heap

no longer needs to be there, because we found something better.

So we delete min,

and we insert the new value that we just got into the heap

and reestablish the heap property

bywell, we could do it as a downheapify, actually,

because we deleted the node from the top.

We can replace it with something else.

So, for each of N times,

we do possibly one insert into the heap,

which is a log K operation.

So the total running time ends up being we do (nlogk) operations.

So that's actually pretty efficient.

It doesn't solve the TopK problem better

than the randomized algorithm we talked about

the partitioning algorithm

but it does it pretty fast

and there's other uses for this.

We'll see that in the next unit.

Thanks a lot, and good luck with the homework.