
Title:
1201 Preprocessing

Description:

Cleaning the input also known as preprocessing. What's the idea or purpose of that?
¶

Well, the idea is that for many inputs that you're given, so say this is an input

for an NP complete problem, what you would then do having learning the techniques

that I just showed you, you would fire up your search tree of course

until at some point the search tree says, "Bingo! I found a solution for you."

Now, the idea of cleaning the input is to determine if there are certain parts of the input

for which you can already say what the solution will look like without having to go through

the whole search tree so basically something like we saw in the vertex cover

or independent set algorithm where at a certain point in the search tree

we could already say for certain vertices if they were going to be part of that vertex cover

or the independent set without really going into any further branching.

By preprocessing, what you do is you try to find some chunks of the input

that you could actually solve in polynomial time even before you fire up your search trees.

So you're kind of nibbling away at your input here and what of course that means is

if your preprocessing is successful then you do not need a search tree that is as big as the original input

but maybe only a smaller one.

So let me give you one concrete example for preprocessing now.

And we're going to do this for SAT because SAT is actually a problem where

preprocessing is heavily used.

If you use a commercial software package to solve SAT,

this package will rely heavily on preprocessing.

I once talked to somebody who programs these packages and he says it's 9095%

preprocessing in most of the cases that leads their software to be able to solve SAT

for large instances so even if you have 1000 variables or so.

You'll remember that SAT was the problem for determining for a Boolean formula

if that formula has a satisfying assignment or not.

And I'm going to write down one Boolean formula for you here

x₁ or x₃ and not x₁ and x₂ or not x₄ and not x₂ or not x₃ and x₁ and not x₁ or not x₂.

Now of course to find out if this formula here has a satisfying assignment,

you could play around with the variables.

It's just four variables so not that many combinations.

But for preprocessing of course you are trying to avoid that so what I would like you to do now

in the next quiz is to look at this formula here and without trying different truth assignments

find if there's one or two variables in here for which you can already say

if they have to be set to be true or false in a satisfying assignment without trying different values.

Is it x₁? x₂? x₃? x₄?

Please make a check mark for those variables where you can easily see

if they should be set to true or false without experimenting around with the values.

And of course more than one can be true or none can be true. Check which ever one is correct.