
Quantifying The Error Solution - Intro to Theoretical Computer Science

  • 0:00 - 0:07
    And the answer is that the difference in values can, at most, be X. So if the difference in value between B and A were larger than X,
  • 0:07 - 0:12
    then the value of A divided by X and the value of B divided by X would round down to different numbers.
  • 0:12 - 0:18
    So that's why we know the amount of value lost in each mistake that we make is actually smaller than X.
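In symbols (using v for an object's original value and X for the rounding step, notation I am adding here for clarity), the value lost by rounding a single object down is bounded like this:

```latex
% Rounding v down to the next multiple of X loses less than X,
% since \lfloor v/X \rfloor > v/X - 1:
v - \left\lfloor \frac{v}{X} \right\rfloor X \;<\; v - \left( \frac{v}{X} - 1 \right) X \;=\; X
```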
  • 0:18 - 0:25
    And that is a very cool thing to know, because these three things, knowing how many objects there can be, at most,
  • 0:25 - 0:31
    in an optimum solution, how many mistakes we can have made, and how much value
  • 0:31 - 0:38
    we have lost in each mistake, allow us to quantify the overall mistake that we can have made, at most.
  • 0:38 - 0:41
    And I call this the maximum value lost compared to an optimum solution.
  • 0:41 - 0:46
    And that maximum value is, of course, easy to bound: there are, at most, N objects in an optimum solution.
  • 0:46 - 0:52
    That means we can make, at most, N mistakes, and each of those mistakes costs us less than X.
  • 0:52 - 0:58
    So the maximum value that we can lose overall by rounding down is smaller than N times X.
  • 0:58 - 1:03
    And that, of course, is the absolute error that the algorithm can make by this rounding down technique.
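In symbols (with V_opt for the value of an optimum solution and V_approx for the value of the rounded solution, names I am introducing here), this absolute error bound reads:

```latex
% At most N objects in an optimum solution, each losing less than X:
V_{\mathrm{opt}} - V_{\mathrm{approx}} \;<\; N \cdot X
```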
  • 1:03 - 1:09
    So now let's figure out the relative error here. And the relative error, since KNAPSACK is a maximization problem,
  • 1:09 - 1:15
    is of course the value of an optimum solution divided by the value of an approximate solution.
  • 1:15 - 1:20
    So let's just call the value of the optimum solution V opt, and opt stands for optimal here.
  • 1:20 - 1:26
    And in that case we can write the approximate solution as V opt minus N times X.
  • 1:26 - 1:31
    Now, this is not the precise value of the approximate solution, but it's a bound for the approximate solution.
  • 1:31 - 1:37
    So we know it cannot get any worse than that. But to be precise here, we would of course have to write it like this.
  • 1:37 - 1:41
    Because we know that the approximate solution will have at least this value.
  • 1:41 - 1:47
    Actually, it will always be bigger, because we have a strict less-than here, so, to be even more precise, we should write it like this.
  • 1:47 - 1:51
    So it's smaller than V opt divided by V opt minus N times X.
  • 1:51 - 1:56
    Now, we can simply write this as one plus N times X over V opt minus N times X.
  • 1:56 - 2:03
    We know something about V opt, because the value of an optimum solution can, at most, be the value of all objects combined,
  • 2:03 - 2:10
    and we call that number V. So we can rewrite this as one plus N times X over V minus N times X.
  • 2:10 - 2:17
    Now, let's have a closer look at X. How did we define X? Well, we said that X was equal to V over N
  • 2:17 - 2:21
    times one minus C, where C was some number between zero and one.
  • 2:21 - 2:30
    So if we plug the value of X into this equation here, what we get is one plus V times one minus C over V minus
  • 2:30 - 2:38
    V times one minus C. Which means that the Vs here simply cancel out, and you get one plus one minus C
  • 2:38 - 2:41
    over one minus one plus C. Which is the same as one over C.
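Written out, the simplification performed here (with V in place of V opt, as the lecture substitutes above, and X = V(1 − C)/N) is:

```latex
1 + \frac{N X}{V - N X}
\;=\; 1 + \frac{V (1 - C)}{V - V (1 - C)}
\;=\; 1 + \frac{1 - C}{C}
\;=\; \frac{1}{C}
```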
  • 2:41 - 2:48
    So I didn't tell you about C earlier, but what we just found out is that the approximation factor of our algorithm
  • 2:48 - 2:53
    is actually one over C. And you'll remember that the running time was also dependent on C.
  • 2:53 - 2:58
    So we already figured this one out, right? It was O of N squared over one minus C.
  • 2:58 - 3:06
    So a good approximation would be to set C close to one, because then the relative error becomes almost one,
  • 3:06 - 3:11
    but in return, and you can see that here, the running time actually becomes very large,
  • 3:11 - 3:18
    because if C is close to one, then you get a very small denominator here, which means that the overall running time,
  • 3:18 - 3:21
    since it's a fraction depending on this, gets very large.
  • 3:22 - 3:29
    But if you're happy with what I'm just going to call a bad approximation, then C can actually be much smaller than one.
  • 3:29 - 3:35
    C will always be larger than zero, so it'll always be a positive number, but it could be 0.5, 0.2, or even smaller.
  • 3:35 - 3:40
    What will then happen is, if C is much smaller than one, so it's this very tiny fraction,
  • 3:40 - 3:45
    well, then actually this part here in the denominator will be very close to one.
  • 3:45 - 3:52
    So the running time is very good, but this fraction here will actually increase.
  • 3:52 - 3:57
    So in return, setting the running time aside, you get a relative error that is much worse.
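As a quick numerical illustration of this trade-off (the values of C below are example choices of mine, not from the lecture), the approximation factor 1/C and the 1/(1 − C) term in the running time pull in opposite directions:

```python
# Approximation factor 1/C versus the 1/(1 - C) term of the O(N^2 / (1 - C)) running time.
for c in (0.99, 0.9, 0.5, 0.2):
    print(f"C = {c}: factor 1/C = {1 / c:.2f}, runtime term 1/(1 - C) = {1 / (1 - c):.1f}")
```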
  • 3:57 - 4:03
    And this is exactly the principle behind a PTAS. So we're trading off the quality of the approximation factor,
  • 4:03 - 4:10
    which is this fraction here, one over C, against running time. More running time means we can get a better approximation factor.
  • 4:10 - 4:14
    A worse approximation factor, in return, will mean that the algorithm runs faster.
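To make this concrete, here is a minimal sketch of a rounding-based KNAPSACK approximation in the spirit of the lecture. The function name, the dynamic program over scaled values, and the reconstruction step are my own illustrative choices, not course code; it assumes 0 < C < 1 and positive values and weights, and it uses the rounding step X = V(1 − C)/N, so the table has on the order of N/(1 − C) entries and the double loop does on the order of N²/(1 − C) work, matching the running time above.

```python
def knapsack_approx(values, weights, capacity, c):
    """Rounding-based KNAPSACK approximation (illustrative sketch, assumes 0 < c < 1)."""
    n = len(values)
    total = sum(values)                      # V: value of all objects combined
    x = (1.0 - c) * total / n                # rounding step X = V * (1 - C) / N

    # Round every value down: object i is worth scaled[i] steps of size X.
    scaled = [int(v // x) for v in values]
    max_scaled = sum(scaled)                 # at most V / X = N / (1 - C)

    # dp[s] = minimum weight needed to reach a scaled value of exactly s.
    INF = float("inf")
    dp = [INF] * (max_scaled + 1)
    dp[0] = 0.0
    used = [[False] * (max_scaled + 1) for _ in range(n)]  # used[i][s]: item i improved dp[s]

    for i in range(n):                                     # N items ...
        if scaled[i] == 0:
            continue                                       # rounded away entirely (loses < X value)
        for s in range(max_scaled, scaled[i] - 1, -1):     # ... times about N / (1 - C) states
            if dp[s - scaled[i]] + weights[i] < dp[s]:
                dp[s] = dp[s - scaled[i]] + weights[i]
                used[i][s] = True

    # Best scaled value that still fits into the knapsack.
    best = max(s for s in range(max_scaled + 1) if dp[s] <= capacity)

    # Walk backwards through the table to recover which objects were taken.
    taken, s = [], best
    for i in range(n - 1, -1, -1):
        if used[i][s]:
            taken.append(i)
            s -= scaled[i]
    return taken, sum(values[i] for i in taken)
```

For example, knapsack_approx([60, 100, 120], [10, 20, 30], 50, c=0.9) should pick the second and third objects for a value of 220, which in this small instance happens to be the exact optimum.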
  • 4:14 - 4:21
    So this is already pretty amazing. Now, one last detail here: As you'll notice, the running time of the approximation algorithm
  • 4:21 - 4:28
    is actually polynomial in the approximation parameter as well, here one over one minus C, which is why this PTAS is also referred to as a fully polynomial
  • 4:28 - 4:35
    time approximation scheme, or FPTAS. For the majority of polynomial-time approximation schemes, the running time
  • 4:35 - 4:39
    actually depends exponentially on the approximation factor that you want to achieve.
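For contrast, in the usual notation (with ε as the quality parameter, a symbol the lecture does not use), the two running-time regimes are typically written like this:

```latex
\text{FPTAS:}\;\; \mathrm{poly}\!\left(N, \tfrac{1}{\varepsilon}\right)
\qquad\qquad
\text{typical PTAS:}\;\; N^{O(1/\varepsilon)}
```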
  • 4:39 - 4:44
    But this is something for a more advanced course. I still sometimes find it mind-boggling that KNAPSACK
  • 4:44 - 4:50
    is NP-complete, meaning that, in terms of intractability, KNAPSACK is likely to be just as hard to solve
  • 4:50 - 4:54
    as nastier problems such as SAT or CLIQUE. So approximation, I think, shows very well
  • 4:54 - 4:59
    that even within NP-completeness, there's a broad spectrum of difficulty, if you will.
  • 4:59 - 5:04
    So, some of the most difficult problems to approximate appear to be CLIQUE and INDEPENDENT SET.
  • 5:04 - 5:08
    And then in the middle of the spectrum, we have problems like VERTEX COVER.
  • 5:08 - 5:13
    And then we have problems which have a PTAS or even a fully polynomial time approximation scheme,
  • 5:13 - 5:17
    such as KNAPSACK. All of these problems here are NP-complete.
  • 5:17 - 5:21
    But some of them cannot be approximated within any constant factor at all.
  • 5:21 - 5:26
    Others have algorithms that, if you don't look closely, could almost be mistaken for being polynomial.
  • 5:26 - 5:30
    Of course, all of this is still under the assumption that P does not equal NP.
  • 5:30 - 5:35
    If P equals NP, then all of these problems here would be polynomial time solvable, anyway.
  • 5:35 - 5:41
    Now that we have looked at approximation, what else is there that you could try in order to tackle NP-complete problems?
  • 5:41 - 5:50
    Next time we'll be basically poking around, meaning that we'll be looking at how randomization and randomized algorithms
  • 5:50 - 5:54
    could maybe help us tackle these problems here as well. So, see you next time.