-
36c3 Intro Music
-
Herald: ...now with the talk "The useful-
harmless spectrum". As I said,
-
he needs no introduction: Fefe.
-
Applause
-
Tapping on the microphone
-
Fefe: Good morning, I am happy that there
are so many people here.
-
Thankfully this is not Hall 1. That would
be bad, with so many people.
-
I have to manage your expectations
before I start,
-
I actually submitted a different talk
last year about TCB-minimization,
-
which would have been a bit technical,
about what you can do
-
as a programmer. It was not accepted,
I don't know why - schedule was full.
-
I submitted it again this year,
but I didn't want it to look
-
like I want to bother them, so I
submitted another talk.
-
...of course they accepted that one.
Which meant, I had to
-
quickly prepare it now.
Audience laughs
-
Well, the problem is, this is more of a
thought process than a structured
-
presentation. I hope that it'll be
helpful nonetheless. But it's
-
not as structured as my usual talks.
I will just start. So, there are multiple
-
approaches that all lead to
the same conclusion, and I will
-
just walk you through them. Relatively early in
my career, I decided the following:
-
I will never write software on which
people's lives may depend,
-
like medical devices, nuclear reactors;
that was my idea.
-
Of course not military either. And then
I met somebody that writes code for
-
nuclear reactors. And it was the kind of
guy that says "That's super easy".
-
So when those who know their
limits don't do it, then
-
other people will.
Audience laughs
-
I don't want to generalize though.
I also met another guy that
-
was not like this, but I mean,
this type of person exists.
-
I believe that the problem here
is that you learn programming
-
exploratively: It's not like a set path,
on which you walk, but rather you
-
are just walking around and finding
your limits. But by definition this also
-
means, that you don't know your limits
yet, because you are looking for them.
-
This also means that you are always
working at your limit though. When people
-
write software, then they go
just as far as they believe they
-
can just barely go. In turn, this also
means that the technology that
-
is being deployed out there is mainly
not tried and tested
-
or well understood, but rather it is the
technology, that the programmer
-
just barely still understood. This is a
bit of a problem, which is further
-
amplified by today's modularization and
dependency wave, where
-
people just pull in modules from elsewhere
-
and just assume that the writer of that
module must know what they are doing,
-
without any basis in reality.
And often it is not the case. Instead,
-
they are people like you and me, that
also worked exploratively.
-
You can also do a little thought
experiment and get to this
-
conclusion yourself; you could even
observe it happening. Let's assume
-
that somebody finds a better way to
deal with complexity. For example
-
modularization, or object-oriented
programming, when this was new.
-
So then you would hope that we would
improve the software that we
-
had written before, because we now
have it more under control.
-
But this does not happen.
Instead, we now write bigger
-
software and work at our limit
again. I think this is not
-
a problem of software development or
programming, but generally
-
a problem of humans. Evolution
made us this way, and we have to
-
learn to deal with it. Let me illustrate
this: I have a theory,
-
which I call the gradient-theory. The
thesis is, that humans treat their
-
environment like a process of optimization
in mathematics. This means you
-
have a terrain and you are looking for
the highest or lowest point - that is an
-
optimization problem. And you can't
directly aim for it, because you don't
-
know the terrain. Instead, you have to
make assumptions, and you can observe
-
this on yourself. If it's too cold, then
you go to the radiator and
-
you don't set it to the perfect heat,
you set it to "hot", then you wait
-
until it's too hot, then you
turn it down again.
-
So we interact with our environment in
a process of approximation.
-
And not just with heaters, but also when
driving a car, when we have a map.
-
We look, "where is the limit? Where do
we have to turn?", and
-
we ignore the journey to the turn,
even if it is nice.
-
Many of the things we do, including
our choice of speed, follow such a gradient.
-
We accelerate until we feel unwell,
then we slow down again.
-
Or when searching for something in
a telephone book or dictionary,
-
we make an assumption of where
it will be. And when it is
-
too far, we go back again. The essence
of it is: We make an assumption
-
about what the terrain looks like.
We have smooth transitions here,
-
so this technique works well.
This is called gradient descent
-
by the way, when you try to follow
gravity to find the lowest point.
-
But it does not work well
in two scenarios:
-
Firstly, when there is a cliff where I
can't go back once I have walked
-
over it. Similar to the cliff: when
you don't even notice that you have
-
gone too far. And secondly,
when you can't roll back
-
for other reasons.
-
This happens frequently in software
development, and it turns out, that
-
this is exactly the kind of problem that
humans have. For example,
-
when we have a trial subscription for two
weeks, people forget to cancel it again,
-
or drug addiction is a classic, or
gambling addiction. And in software
-
development or project management
in general this is common:
-
We have already invested so much that
we can't go back. Security is not
-
a gradient. It may look like one, but it
isn't. I think this is
-
a fundamental issue in IT security.
You don't notice when you
-
have gone too far. You only notice
when you get hacked. And then
-
you can no longer go back, all the data
is already gone. Complexity is also
-
not a gradient, similarly to security,
but it feels like one. I think
-
this is the reason why we deal with
it so badly. It feels
-
as if we have everything under
control. And when we notice,
-
that we don't, we can't go back.
By the way, giving out data to
-
Facebook is also such a "pseudo-gradient".
-
When you notice that you gave away too
much, it is too late.
-
So the conclusion is:
Complexity is evil. We notice it too
-
late and we get into it too easily.
So we have to counteract that somehow.
-
If this is our job, we are externalizing
the costs to our customers,
-
to our users, and to our future self.
-
This is why you rarely find older software
developers that are happy.
-
Audience laughs
So, this was the first train of thought,
-
that led me in this direction. The second
train of thought: Let me just show you
-
the GNU manifesto, as a representative.
This is not GNU-bashing,
-
but you can show this pretty well with
the example of the GNU manifesto.
-
This is the original announcement of the
GNU project by Richard Stallman. He wrote:
-
"GNU will be able to run Unix programs, but
will not be identical to Unix. We will make
-
all improvements that are convenient".
This is a very bad sentence.
-
What does "convenient" mean? For whom?
-
But this is the approach that a lot of
programmers have:
-
"Oh we can just add this quickly."
We are lacking a corrective, that
-
we think in advance: "What legacy am I
saddling myself with right now?"
-
I think this "convenience" thought when
extending software is our "original sin"
-
- to get a bit catholic here -
in software development.
-
Everyone has done it before, and you
just can't correct it after the fact.
-
So the only way of getting rid of it
is to throw away
-
the whole software or module and
start over again. But software doesn't die.
-
It was only through software that I learned
that it is good that people die,
-
because it is a corrective that is needed.
If a system is supposed to improve,
-
the old stuff has to be able to die at
some point. And this does not
-
happen with software. It is a feature
that things don't last forever.
-
In general, you can observe that when
somebody is extending their software and
-
they have a choice between "We do
something to solve our specific problem"
-
or "We do something to solve a more
general problem", people will
-
always try to solve the
more general problem.
-
"The more danger, the more honor."
And you can see this across the board.
-
There are very few exceptions to this. And
I had my "aha-moment" when I opened
-
'gdb' on a project one day. I took '/tmp'
here, but that project was
-
some checkout.
In my webserver, I have a '.gdbinit' file.
-
It's a configuration file for the GNU-
debugger, where you can for example say
-
"Open this application that I want to
-
debug with these arguments!"
And in there, I write "Don't use Port 80,
-
that doesn't work, instead use port
8005" or something, to debug it on
-
localhost. And one day, gdb started
saying "no, I don't accept this
-
.gdbinit file because it is in a directory
-
that you have not specifically allowed."
This was exactly such an attempt to fix
-
an issue after shipping, after the fact.
gdb noticed: "Our config-file has become
-
so powerful, that it is a security issue",
-
and then retroactively nailed down the
whole config. And this broke more
-
than it needed to - perhaps, I don't
know for sure - but it was very annoying
-
for me. You can add an auto-load safe path here,
but that is when I noticed it
-
for the first time. This was a few years
-
ago. I don't know when exactly that was.
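For reference, the whitelist gdb introduced works via an "auto-load safe path". A sketch of what that looks like in the global gdb config (the path is a placeholder, not the actual project):

```
# In the global ~/.gdbinit: declare a directory whose project-local
# .gdbinit files gdb may load again. The path is a placeholder:
add-auto-load-safe-path /home/user/projects/webserver
```

With a line like that, gdb goes back to reading the per-project config file instead of refusing it.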
There was a similar case like this
-
again: With Vim, the editor that I like
to use. In a comment in the file that is
-
being edited, you can put some
configuration settings in the
-
first or last three lines.
-
It is supposed to be used for "I use
tabstop=4 here", or something.
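Such a modeline is just a comment that vim parses. A sketch (the options here are arbitrary examples):

```
# vim: set tabstop=4 shiftwidth=4 expandtab:
```

Vim is only supposed to apply harmless 'set' options this way, which is exactly why the parser bug mattered.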
-
But the parser for this had
a security bug, which made it
-
possible to create a file that
executes code, when it is
-
opened in vim, which was of course
not intended. But it is the same
-
issue. I think you can generalize this
-
a bit - though earlier I argued
against generalizations, but
-
in analysis it is good, in software
it is usually bad. Let me illustrate
-
with an example:
Let's assume that we have a CSV file
-
with some trouble tickets. Field 4
is the one that we are interested in.
-
Let's assume it looks like this. It's CSV.
So, now I would like to have the sum
-
of field four. So first I use
cut, we are in Unix here.
-
Then the first line has to go,
-
so I use tail. Now the first line
is gone, now I just have to
-
calculate the sum. There is an
application for this too: paste. That is
-
how you do it in Unix. Then I have to
calculate it. There we go! But what if
-
it doesn't say 1 here, but instead "fred"?
We notice: cut does not have a problem,
-
tail does not have a problem, paste is
fine, but bc falls on its face.
-
Even worse, bc is programmable.
There could be the
-
Ackermann-function here and
your computer would be busy
-
for an hour, while it is trying to
solve some recursion. And I think it
-
is useful to introduce a concept here
to say: cut, tail and paste are harmless,
-
bc is not. This is one of the thoughts
where I thought "okay, you can make
-
a talk about this".
But this is not enough.
-
There are different kinds of harmless.
But I think this simple idea
-
already helps us a bit.
Let's make it into a sentence:
-
Software is harmless when unexpected
inputs don't produce unexpected
-
behavior or unexpected kinds of output.
For example, a SHA checksum is always
-
harmless. Regardless of
what data I put in, the output
-
has a known format. Or word
count (wc) is also one of those.
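The fixed-format property is easy to see on the command line (sha256sum used as the example here):

```shell
# Whatever goes in, the output is always a 64-hex-digit checksum:
printf 'fred'  | sha256sum
printf '3+5+1' | sha256sum
seq 1 100000   | sha256sum
```

The content of the digest is unpredictable, but its shape never is, so nothing downstream gets surprised.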
-
Now you could say: "Okay, just use
awk!" And in awk I don't have a problem
-
when it says "fred" instead of "4",
and the interpreter also does not
-
evaluate any functions from the input.
It looks better, but
-
is it really harmless?
It turns out, awk is a different kind of
-
not harmless, because you can write
to the filesystem with it. So I don't have
-
to worry about the input, but I have to
worry about the code, that I hand to it
-
on the command line. So that is
another distinction you can make.
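For comparison, the awk version of the sum, with the same invented tickets.csv layout as before:

```shell
cat > tickets.csv <<'EOF'
id,title,owner,effort
1,login broken,alice,3
2,crash on start,bob,fred
3,typo in docs,carol,1
EOF

# awk coerces the non-numeric "fred" to 0, so this prints 4
# instead of failing the way bc does:
awk -F, 'NR > 1 { sum += $4 } END { print sum }' tickets.csv
```

The input can no longer hurt us; the program text on the command line is now the part to worry about.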
-
This is a big problem in the game
industry by the way:
-
The game development industry
has started putting interpreters
-
into their games, to be able to write
their business logic - not the AI,
-
but small scripts - in a scripting
language. One of the most
-
popular script-interpreters for this
purpose is Lua. And Lua is primarily
-
used because it can't do anything,
if you don't specifically allow it.
-
So it can't open files or sockets.
You can enable this manually though,
-
and then you have a problem again
of course. But this is a real issue.
-
Many open-source people don't think
about this, because they think "Well,
-
I will ship it and the rest is no longer
my issue." But I think,
-
that we generally have to think
about this, and preferably
-
before shipping, optimally already while
programming. So, this is
-
a different kind of harmlessness.
The first kind was "Can bad input
-
cause bad output?" And now: "Can the
application itself do bad things?"
-
This is a very modern thought,
because we work a lot more with
-
sandboxing today. In sandboxing, the goal
is to prevent a program from
-
accidentally or deliberately doing bad
things. And there are again different
-
things that a program can do.
bc can eat processing time. awk can
-
read and write in your filesystem, and
this goes on and on. Let's get back
-
to the GNU manifesto: GNU awk is a special
version of awk and it can open sockets,
-
without any need. This means: suppose we
just use awk and think "Well, awk can
-
write in the filesystem, but I mounted
that read-only, so nothing
-
can happen". But then if GNU awk
is being used, it is suddenly
-
no longer harmless. Bash
can open sockets too by the way!
-
I don't know how many people knew that.
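Bash's socket feature is the /dev/tcp redirection. A sketch (example.com and the ports are placeholders; this is bash-specific, not POSIX sh):

```shell
# bash itself opens a TCP connection when you redirect to /dev/tcp/HOST/PORT.
# Against a real server it would look like:
#   exec 3<>/dev/tcp/example.com/80
#   printf 'HEAD / HTTP/1.0\r\n\r\n' >&3
#   head -n 1 <&3
# Here we only show that the shell really attempts the connection:
# a local port where nothing listens gets "connection refused".
if ( exec 3<>/dev/tcp/127.0.0.1/1 ) 2>/dev/null; then
    echo "connected"
else
    echo "refused"
fi
```

So a script that is "only bash" can still move data over the network.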
This goes on of course: after awk
-
came Perl. It's even worse, and
Perl can do eval(), which in my
-
opinion is the worst evil that you can
have in a programming language.
-
A bit closer to the end-user you can also
observe this in browsers. Let's look at
-
Netscape for example:
Several times, Netscape had the choice
-
between "useful" and "harmless" and always
chose "useful". It started with
-
the plugins. I don't know who
of you still remembers the Flash-plugin,
-
or before that we all had the RealPlayer,
and there was also an Acrobat-plugin -
-
And all of it was shit, because the
plugins were native code: they could do
-
everything, that their operating system
allowed. This means that it was very
-
useful, but also very dangerous.
And it was a conscious choice of
-
the browsers, to allow this.
The actual goal of this talk is
-
to give the programmers among you a
bit of awareness that you don't just
-
add a plugin interface that
can do everything.
-
The next iteration was:
We'll do everything in JavaScript.
-
At first it looked better, but this
JavaScript eventually also ran with
-
enough privileges to do bad things
in the system, or at least in the browser.
-
It turns out: People now have their
important data in the browser,
-
because they do online banking. And
that is enough to do a lot of damage.
-
Then they had to correct it
after the fact. Chrome now imposes
-
even further limits, citing security
reasons, which breaks ad blockers. It's always
-
the same trap that we walk into.
Who of you here uses Windows?
-
In Windows there is a tool by
Mark Russinovich - by now he has
-
sold it to Microsoft, so it is now an
official Microsoft tool.
-
And the only functionality of this
tool is to list the different
-
plugins that are part of the system.
And I took a relatively
-
clean system here. It's not about
this down here or
-
the size of the scrollbar, but just
how many tabs there are at the top:
-
These are all different options for
plugins to integrate into the system,
-
and nobody has an overview of this
anymore, because people always decided
-
to go in the wrong direction. I believe
that this is a core problem.
-
There is a third approach to this:
My daily life in security consists of
-
going to companies. They show me their
source code and I look for bugs. Then
-
I tell them, which bugs I found. And
occasionally, there are cases where
-
I notice that there are a lot of bugs.
Not just those that I find, but they
-
already have their own database,
a bugtracker, and they already
-
have a seven-digit number of bugs. Yes,
this happens. And since it is a problem
-
that we have so many bugs, there
are now counter-strategies by developers
-
that start saying: "Okay, if this bug is
not important then
-
I can fix it later." And "later" means
"never" in reality. It just sits there.
-
Joke that only makes sense in German
-
In the real world, bug
trackers are often just
-
massive permanent data disposal sites:
For example, I recently filed a bug report
-
for Firefox and got the ID 1590000.
This is already a bad sign.
-
But it is also a good sign, that
the bug tracker is open.
-
For Microsoft you can't see how
many bugs they have.
-
This is only meant for illustration.
Mozilla is not especially bad.
-
Mozilla just has an open tracker,
on which I can show it well.
-
What I wanted to show you -
I had a look: "What is the first bug
-
that I filed there?" It still had
a six-digit ID.
-
That was 2003. If you look at the
history of bug IDs then you notice:
-
It is growing exponentially.
And it's not like the bugs somehow
-
go away at some point.
I have noticed two major events,
-
where bugs are closed:
When a new release is done
-
and you throw out the old JavaScript
engine and put in a new one.
-
Then you just close all bugs of the old
engine. It looks as if you have achieved
-
something. And the second is this one:
I don't know, can you read this in
-
the back? Mozilla just closed my bug.
It says:
-
"This bug has been automatically
resolved after a period
-
of inactivity". Mind you, it was not me
who was inactive. I filed the bug and
-
nobody at Mozilla took care of it.
So they just automatically closed it,
-
because the statistics look so bad.
This is a big issue,
-
not just at Mozilla. As I said, this is
just the example
-
that I can show, because
in their case it is public. But
-
this leads to a cascade of action
and reaction. For example,
-
unimportant bugs are just not fixed
anymore. And then people
-
add "important" on their bugs,
because they want them to be fixed.
-
Then they say "Okay, the important
bugs also don't get fixed,
-
because there are too many of them."
And then people
-
write "Security" on their bugs, and now
we have a wave of security-bugs.
-
There they negotiate: "Is this really
a problem?" And then we get excuses
-
like "It's just a crash."
The point is that there is an unholy
-
alliance with another trend,
namely that companies see:
-
We have so many bugs open that
solving the bugs is not the goal anymore.
-
There are just too many, it is
unrealistic. Instead,
-
we introduce metrics like "we do
fuzzing". Fuzzing is not
-
a bad idea, but it is not "finding all
bugs", but just the first step
-
on a long road. But it gives
out a nice metric.
-
We have so-and-so many fuzz-
testcases, and now...
-
Are we now better or worse than
before? It's hard to say.
-