
So cleaning the input, which is more technically also known as preprocessingwhat's the idea behind that?

Here's what you would normally do for an NPcomplete problem as we have talked about so far:

So if you're given the input for an NPcomplete problem. What you would do using the techniques from the previous units

is you would fire up your search tree to try and find an optimal solution.

And of course, that search tree has exponential size. So the algorithm goes through that tree here

until, at a certain point in time, it says, "Bingo, I found a solution," or, "I found the best possible solution."

The idea of preprocessing now is similar to something that we already saw for vertex cover or independent set,

where, for certain vertices, while we were traversing the search tree, or even in advance,

what we could already say for certain vertices, we know what assignment that vertex is going to have in an optimum solution.

And we could make that statement without actually going through any branching, but in polynomial time.

And that is the idea of preprocessing. The idea of preprocessing is, if you can actually find certain parts of the input,

where in polynomial time, of course, you can already say how they would be handled in an optimum solution.

So we're kind of nibbling away at the input here. And what that, of course, means is if your preprocessing is successful,

or especially if it's very successful, then the search tree that results from that input is not going to be as big.

So there's certain parts of the search tree that you don't have to do, because you already have found out

in the preprocessing what that part of the solution is going to look like.

So the search tree size will decrease. So we can, for example, cut off this branch here, because we've already

preprocessed this, and we can cut off this one here because we also preprocessed that one.

So now let's make this more concrete, and let me give you a concrete example.

And we're going to do this for SAT, because SAT is a problem where preprocessing is usually very successful.

So if you were, for example, to use a commercial SAT solver, then preprocessing will play a very very important role in that.

I once talked to somebody who develops those solvers, and they basically said that his package works 9095%

through preprocessing. So even for SAT instances with thousands of variables, his package can basically solve it,

but it can only solve it because the preprocessing algorithms are very good.

So you'll remember that SAT was the problem of finding if a given Boolean formula has a satisfying assignment or not.

And I'm now going to write down a Boolean formula for you, and then we're going to do a little quiz

to make preprocessing more concrete.

So the SAT formula is x1 or x3 or x5, and not x1 or x2 or x4, and so on, and so on.

Now, of course this formula here doesn't have very many variables. It's just six variables

x1, x2, x3, x4, x5 and x6. So with a little playing around, you would probably be able to figure out if this Boolean formula here

has a satisfying assignment or not. But of course, what we want to do now is preprocessing,

And that means that we want to see if, for certain variables, in this Boolean formula, we can figure out

if they should be set to true or false, without actually trying all possible combinations.

And as I said, we're going to do this as a quiz. So what I would like you to do is to look at this Boolean formula here,

and then consider the variables x1, x2, x3 and x4, and for each of those variables, determine if it's easy to see,

if they should be set to true or false. And by easy, I mean without actually trying around different true assignments

for the other variables, but you can basically immediately say, for these variables, if they should be set to true or false.

I'm going to give you one hint for the solution, and that is that, in my opinionand this is a bit of a subjective question

I think that for two of these variables here, it's rather easy to see. And I would like you to select those two.