music
applause
Raichoo: Yeah sorry about that - beamers
or projectors, I don't like them. They
don't like me either. So this is a little
heads up - this is going to be the only
slide I'm going to show you today so,
"slide", because I think doing stuff like
that in a terminal might be a little bit
more interesting for you. But sadly
something is getting cut off so we have
to improvise a little bit. But anyway, so
today I get to talk about two of
my favorite things right now which are
FreeBSD and DTrace. But this talk has been
cut down to 30 minutes, so we'll be
focusing a little more on the DTrace part.
So there will be a little bit less BSD
than I anticipated. And I also adjusted
everything a little bit to fit better into
the resilience track so hopefully you'll
enjoy that. So before we begin, who here
is actually using DTrace? Okay more than
I expected but still not as many as I
would like to see. So hopefully after this
talk you will think, "oh, this is a really
awesome tool, I gotta learn it." Because I
totally love it - it changed the way I do
a lot of stuff. So for those of you who do
not know what DTrace is, first, let me
fill you in on this stuff. So it's open source, it originated on Solaris, and it is currently developed on illumos, which is a fork of OpenSolaris. It has been ported to FreeBSD, NetBSD, OS X, and there's also a port for Linux called DTrace for Linux. I think it's done by a person called Paul Fox. It's been ported to QNX
and the OpenBSD folks are currently doing
some work to get a technology like DTrace on their system. And I think there's a port for Windows? I don't know if this is actually true, but if it is, it's kind of cool, because that means it's
basically everywhere. So, most of you
would probably know tools like
strace. We have a very similar tool on
FreeBSD that is called truss, and what
truss and strace are doing is - you can
attach them to a process and look at the
system calls that this process is
emitting. So in case something is going
wrong you can well look inside of the
program, which can be kind of useful when
you're trying to find a problem. It's
kind of handy but it's also pretty
limited. Because first of all it really
really slows down the process that you're
currently looking at. So if you want to
debug a performance issue, you're pretty
much out of luck there. And also it's kind of narrow - you can only look at one process, which is also a bad thing, because the systems that we currently have are all very complex: we have a lot of layers. You have
virtual file systems, you have virtual
memory, you have network, you have
databases, processes communicating with
each other. And in case you are using a
high-level programming language, you might
also have a runtime system. So it's a
little operating system on top of your
operating system. So when something goes
wrong in a system that has such large
complexity, something happens that we call
the blame game. And the blame game - it's
never your fault, it's always someone
else's. So what we want to be able to do
is we want to look at the system as a
whole, so we can correlate all the data
and come up with some meaningful answers
when something is really going wrong in
there. And also, we don't want to
switch out all the processes for
debug processes to make that happen,
because, as these things go, every problem happens in production. It never happens on the development box. So like,
switching out all the processes - that's
totally out of the picture. So to do that - to instrument the system in an arbitrary way - we sort of need a programming language. We need to be able to say: when this happens, please emit data so I can see what's going on. So this kind of implies a
programming language. And DTrace comes
with such a programming language - it's a
little bit reminiscent of awk crossed with C. It's pretty simple to learn - you can pick it up in 20 minutes and start churning out your first DTrace scripts. So, if you know awk, awk can be used to analyze large bodies of text. DTrace is pretty much the same, but for system behavior -
so a little bit mind boggling, but
probably I can show you what I mean by
that. And also, as a bonus, we don't want to slow down the system, so we can do things like performance debugging, stuff like that.
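For reference, a minimal sketch of what a D script looks like - this is not one of the scripts from the talk, just a generic example that counts system calls per executable using the syscall provider and an aggregation (the file name is arbitrary):

  /* count system calls per executable name;
     run with: dtrace -s syscalls.d */
  syscall:::entry
  {
      @counts[execname] = count();
  }

Stop it with Ctrl-C and DTrace prints the aggregated counts.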
So I've prepared this little demo here. Since we had some issues, this is probably not going to look perfect - I have to play around a little bit. So what I'm going to do is
I'm going to look at a very, very naive way to -- excuse me for a second -- give me a second -- so, a very naive way to authenticate a user. There's a lot of stuff wrong with this code, but what we're going to do is take a user string as input, and then just compare it to a secret. So I
know, the secret in here is in plain text - I know this is a problem, but this is a little bit artificial. But I
just want to get my point across. So from
an algorithmic perspective, this check
function is correct: so we take a string
we take another string and we compare
them. So everything's fine and easy. So if
you look at the way string compare works
and what it does, it's essentially taking these two strings and comparing them character by character. When it finds the first pair of characters that do not match up, it's going to stop. So we can conclude something from that: if this check function takes a very short amount of time, it means it terminated early. And if our password guess is better, it will take longer. And
if we can measure that we can basically
extract information from that running
algorithm. So I wrote a little driver
program in Haskell that basically just
iterates over an alphabet and feeds one letter at a time into that program, and I'm going to use DTrace to get some
timing information. So let me start the
driver. So this is now just running in the
background. And you cannot see what I'm
typing there, but don't worry - these
scripts will all be available - I will push them to my GitHub. So DTrace now produces this
nice little distribution. If you were able to see the entire alphabet, you would see that "D" behaves differently from everything else. So if you squint a little, what you see there is that the letter "D" takes a couple of nanoseconds longer. That is the precision that I'm measuring here - ten to the minus nine seconds. Like, really precise. And "D" takes longer than everything else - it's a little bit cut off there, but trust me. I know it sounds like Donald Trump when I'm saying that. So yeah, from that, let's just enter that letter of the password. And now the script clears everything and it's going to guess the next letter. So
sadly this is cut off, because you would
see that this distribution radically
changed. It looks completely different,
and so we can play that game a little bit.
So let's just roll with that.
And like every three seconds the script is
going to recompute looking at the new
distribution. And you can probably see
where this is going. So here you can see,
okay, now it just takes about three seconds for me to guess the next letter. And this is not a problem that only happens when you do string compares. This can happen with basically everything - it's especially relevant in things like cryptographic code, where you don't want information to leak out. So this is what we call a
timing side channel attack. So I could
essentially use DTrace to analyze
the real binary. So I didn't change the
binary - I didn't have any debug code in there. This is the actual binary that I would put into production. What's important about taking the actual binary is that some of these timing side channels might be introduced by a compiler optimization. And when you insert debug code into that code, then it might actually go away. So, you
want to look at the real code that you're
putting into production. Let me show you
the script that I came up with to do that. So there are three interesting things in this script. And don't worry - this is the more complicated example, I just want to inspire your ideas, because when it comes to the things you can do with DTrace, the sky's pretty much the limit. You can come up with the weirdest ideas, and this is a more complicated example; I'm going to show you simpler ones to demonstrate how we got here. So again, there are
three interesting things in this code. The
first one is something that we call a
probe. So a probe is a point of
instrumentation in the system. So whenever
a certain event happens in the system this
probe is going to fire. And in this case,
the BEGIN probe marks the moment when the script starts. So the
second interesting thing is this clause.
So this clause is basically what this
probe is going to execute - what's going
to be executed once that probe fires. So
it's a little block of code.
And this probe is a little bit more
interesting, because it tells us
something about the structure of such a probe. Every probe is uniquely identified by a four-tuple - four components that uniquely identify a probe. The first part of this tuple is called the provider, and I'm
going to talk about providers in a couple
of seconds and what they are. The second
one is called the module. Third one is
called the function. And the last one is
called the name. So these four pieces of
data, like, they identify a probe
uniquely. So the third thing that is
interesting here is, sadly, something that I don't have time to talk about in depth today: this is called an aggregation. This single line that you see here is essentially responsible for accumulating all this data and generating that distribution. This is built into DTrace - you don't have to do it yourself. When you look at this script, it's like 42 lines of code.
And I came up with the first prototype after five minutes, so it's not a lot of work to get something out of it. If you use DTrace, you will use aggregations a lot for performance debugging, so it's kind of neat that we have that built in.
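For reference, a heavily simplified sketch of what such a timing script might look like - this is not the speaker's actual 42-line script, and the function name "check" as well as the way the target is attached are assumptions:

  /* time every call to a function called check() in the target process
     and feed the durations into a quantize() aggregation */
  BEGIN
  {
      printf("measuring...\n");
  }

  pid$target::check:entry
  {
      self->ts = timestamp;       /* nanoseconds */
  }

  pid$target::check:return
  /self->ts/
  {
      @dist["check duration (ns)"] = quantize(timestamp - self->ts);
      self->ts = 0;
  }

You would run it with something like: dtrace -s timing.d -c ./auth-program (the script and program names here are placeholders).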
So yeah, let's talk a little bit about providers, and this will probably also be cut off, so I'm going to cheat a little bit here - I'm just going to double that. So let's talk about providers -- oh, that's handy --
so I've got 27 providers here, and the number of providers varies from operating system to operating system. But these are the ones that I can see right now; there are other providers that come into existence when you demand them. So I have
these 27 providers, and we're going to
look at the syscall provider and the FBT
provider first. So, every provider knows
how to instrument a specific part of the
system. So the syscall provider knows how
to instrument the syscall table. That's not
very surprising. So if you look at the syscall provider, here you can see essentially every system call entry and return that FreeBSD offers. And here you can see this four-tuple: the provider is syscall, FreeBSD is the module, and so on. So these are all the system calls that I have in my system.
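For reference - not shown verbatim in the talk - you can list these probes yourself with the -l (list) and -P (provider) options:

  dtrace -l -P syscall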
And the other provider that I want to look at
is the so-called FBT provider, and that one is pretty astonishing. FBT stands for "function boundary tracing", and what it allows us to do is trace every single function in the kernel. So I can look at the entire kernel's functions as they are being called. So to
illustrate that, I wrote a little, very simple DTrace script - look at the upper half, please - and this is probably one of the first DTrace scripts that you will come up with. It's a fairly simple example, so let's break it
down. So I'm going to instrument the mmap
system call. For those of you who do not
know what the mmap system call is: what you can do with it is take a file and map it into the address space of your process - that's the very dumbed-down version. So whenever we enter the mmap
system call we are going to set the
variable "follow" to one, and what this
"self at" means: this is essentially a
thread local variable and we're going to
associate that variable with the thread
that we're currently inspecting. Then I'm
going to do something that sounds pretty scary: I'm going to instrument the entire kernel. Every function entry and
every function return, I'm going to
instrument that and say "please emit data
when you do that". And this is what we
call a predicate, so this is where the
awk-ness of the DTrace programming
language comes in. So this is a predicate
and whenever that evaluates to true
then the probe is going to fire, so in
this case when we are in the thread that
we're currently tracing we're going to
emit data. And this is just an empty
clause, we just want to know "hey we got
here". So when we exit the mmap
system call and the predicate is set, we're going to set the variable "follow" to zero - because every uninitialized variable in DTrace is zero, this pretty much amounts to deallocating that variable - and then we're going to exit cleanly.
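Reconstructed from that description, a minimal sketch of such a script might look like this (not a verbatim copy of the one in the talk):

  /* follow everything the kernel does inside the mmap system call */
  syscall::mmap:entry
  {
      self->follow = 1;        /* thread-local flag */
  }

  /* empty clauses: the default probe output is enough to show we got here */
  fbt:::entry
  /self->follow/
  {
  }

  fbt:::return
  /self->follow/
  {
  }

  syscall::mmap:return
  /self->follow/
  {
      self->follow = 0;        /* "deallocate" the flag */
      exit(0);
  }

Run it with something like: dtrace -s follow.d (and the -F flag produces the indented call-flow view shown a bit later).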
So let me run that. It takes a couple of
seconds and boom. So you saw a little
pause here - that was DTrace instrumenting the kernel. So now
you can see every function call that
happens inside the mmap system call. And
this is a little bit hard on the eyes, so
let me pass this flag here, and now you get nice, readable indentation. So
now you might say "I don't like that. You
are injecting code into the kernel. That
is, that sounds dangerous" and yeah, but
let me show you something that I find
really interesting. So I'm not
going too much into depth here, but this is bytecode: every DTrace script
gets compiled to bytecode and this
bytecode gets sent to the kernel and in
the kernel you have a virtual machine that
interprets that bytecode. So in case you
write a script that for some reason might
go rogue on your kernel - it allocates too much memory, takes too much time - this virtual machine can just say "okay, stop it" and revert all the changes that happened to your kernel, and that's kinda handy. It's not a new
idea - if you're using tcpdump, it's basically the same approach: they also have this kind of bytecode, which is called BPF, the Berkeley Packet Filter. So it's not an entirely new idea; that was just a little excursion.
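If you want to peek at that intermediate code yourself, the dtrace command can dump it - presumably something along these lines, where -S shows the compiled intermediate code and -e exits before enabling any probes:

  dtrace -S -e -n 'syscall::mmap:entry { trace(execname); }'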
So everything I showed you until now was "hey, I can look when function calls happen". That's not
very much information, so we're going to
increase the amount of information that we
get out of the system with every example.
So let me look at the actual kernel. So I
had to restart my machine, so my setup is
basically gone now. So let's look at this
vm_fault function. So this is the source code of the operating system that I'm running right now - this is FreeBSD 12-CURRENT - and the vm_fault function; remember the mmap system call that I told you about? The mmap system call, as I told you, can map a file into your address space. And it doesn't necessarily have to load the entire file, so whenever we touch a piece of memory - a memory page, which is four kilobytes on this machine, no superpages here - that we didn't bring into memory yet, we generate something that's called a page fault, and then this
function gets called. So here let's look
at the arguments, and I'm going to skip
the zeroth argument and look at the first
argument. So this is the address that
provoked that page fault, this is the
type and these are the flags and I'm going
to show you something to make that a
little bit more readable. So what about
this one? So you see it's a pointer and
this is a big structure, so we want
to be able to look at that structure. I probably should do this here, so let's look at this vm_fault script.
Let me make this a little bit more readable. Don't pay too much attention to this code - this is basically just boilerplate to make the output readable - and this is where the actual action is happening. What I'm doing there is instrumenting the vm_fault function, and whenever we enter it,
then we're going to use some information
that DTrace gives us for free. So this is
execname, this is the name of the
currently running executable that provoked
the page fault, this is the process ID and
here we have a bunch of argument
variables. These arg1, arg2, arg3 are essentially just integers, so nothing too fancy there. But we want to be able to look at that struct. And here I'm going to use this
args array, and this args array
is kind of special, because it has typing
information about the arguments. So when
you run that - you're dereferencing that pointer there with the star - excuse me, let's just run that and, maybe... that's a start, yeah. So this is an in-kernel
data structure that we can now look
at. So DTrace enabled us to look at in-
memory data structures as the system runs.
And this is really really powerful.
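A rough sketch of such a script (not the exact one from the talk; the printf formatting is illustrative, and the layout of the dumped struct comes from the kernel's CTF type data; print() is assumed to be available in this DTrace version):

  /* report who faulted and where, then dump the first argument,
     an in-kernel struct, using the typed args[] array */
  fbt::vm_fault:entry
  {
      printf("%s (pid %d) faulted at %p\n", execname, pid, (void *)arg1);
      print(*args[0]);
  }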
In the DTrace script I could use all these fields - I can manipulate the values in this args array just like every other variable; I can pretty much work like I was in C. So
how is it doing that? There is something
that's called CTF - that's not capture the flag, this is the Compact C Type Format; you can look it up, there is a man page in FreeBSD - and
there's a little segment in the kernel
binary, where all this typing information
is stored. I don't know how that compares
to modern DWARF but yeah this is what
DTrace is working with. So now you might
ask yourself "Why on earth would I do
that? Why on earth would I look at virtual
memory, because, yeah, um, this stuff is
safe isn't it? I mean there's no bugs in
there." Except when they are. Anyone
remembers remembers "Dirty COW"? So this
was a very nasty vulnerability in the
Linux kernel and that that was a problem
in the virtual memory management. So it
allowed you to write to a file that you
didn't own as a regular user. So you could
essentially just write to a binary that
had "set UID" set. Very unpleasant, but
I'm not going to bash the Linux folks
here, this is just, I just want to show
you these things are hard. And the first
fix for this problem was in 2005 and then
it came back in 2016. So now that's fixed
and then it came back with "Huge Dirty
COW" in 2017, so this is, I mean this
was there for way over a decade.
These things are hard to debug. And this
is what I like about these systems: not having tools like DTrace to figure out what's going on inside of the system somehow, to me, amounts to security
by obscurity. And I've heard that some
people who are developing exploits for
systems that have DTrace say "Oh, I really like developing exploits on these systems, because the tooling is so great!" Yeah, but to be honest, this is cool, because an exploit is a proof of concept, and coming up with these exploits quickly is very useful, because you know what's going on and you can show "Hey, this is going
wrong". I had situations, where
people were telling me "Oh, this is not a problem with our program, this is this weird operating system that you're using. Like Solaris - weird operating system." And yeah, then I churned out
some DTrace scripts and "No, it's
actually your problem". "Oh, now I can see
that on my Linux box!" Magic. So,
everything I showed you until now was
very, very much related to function calls
and we want to have a little bit more
semantics here, because you might want to
write a script that inspects protocols,
stuff like TCP, UDP, stuff like that. You don't want to have to know which function inside of the kernel is responsible for
handling your TCP/IP stuff, so DTrace
comes with something that's called static
providers and I'm just going to show the
apropos here. So every static provider has a man page, which is kind of handy - documentation, whoo - and
you can see there is an I/O provider if
you are interested in looking at this guy:
Oh, IP for looking at IPv4 and IPv6,
TCP... This one is pretty cool, it's about
scheduling behavior. So, "what does my
scheduler do?" And if you look at that, you
can see some interesting stuff, like queue lengths and priorities - if you ever saw things like priority inversion, stuff like that, now you can see it happen. I'm a nerd, I
find this interesting for some reason, I
don't know. And it's also pretty
interesting to figure out what's going on,
"why is this getting de-scheduled all the
time?" So, some interesting things going
on there. So, I'm running a little bit
short on time here, but I just quickly
want to show you something - this is all
kernel stuff right now - can we do that
with userspace? Of course. So, there was
one provider that didn't show up when I
had my provider listing, but was in the
DTrace script where I did this timing
attack stuff. And that's called the PID
provider. And the PID provider generates
probes on demand, because a process might
have a lot of probes and you will shortly
see why and this is why I'm going to use a
very small program which is called "true",
and true just exits with exit code zero.
So, nothing too exciting going on here,
and this dollar target gets substituted
in, we get the process ID there. And this
is everything that happens when I'm
executing this program you see this is a
little bit more fine-grained than the FBT
provider, because now we can trace every
single instruction inside of that
function, which is kind of a handy. It's a
scriptable debugger. So, these numbers are
the instruction offsets inside of that
function. We can also look at - so this is everything in the true binary itself - we can also look at libraries that got linked in,
and there's a lot of stuff happening in
libc for example when you run true.
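Roughly what that looks like as a script - a sketch assuming it is run as dtrace -s true.d -c /usr/bin/true; in the pid provider, "a.out" names the main executable, and an empty name field matches every instruction offset, not just entry and return:

  /* trace every instruction in every function of the true binary itself */
  pid$target:a.out::
  {
  }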
So, one last thing that I wanted to show
you because it consumed a week of my life:
I'm using a lot of Haskell and the Mac OS
people, they also have DTrace and they
have GHC Haskell DTrace support - GHC, the Glasgow Haskell Compiler, the glorious one - and they have probes to analyze what's going
on inside of the runtime system. So, I
thought "I want to have that, I have
DTrace, why doesn't it work on FreeBSD?"
So, after a week of fighting with makefiles and linkers, that works: if you check out a recent GHC repository and
build it on FreeBSD, you get all the nice
stuff that I'm going to show you now. So,
this is a very boring program - it just
starts 32 green threads and schedules them
all over the place - and now I can do
something like this:
phone rings
I can ring a telephone.
laughter
No, that would be
interesting... So, you can also use
wildcards - in this case as the name of the probe - and this is what's going on inside: GC, garbage collection, and all this stuff.
Now you can look at this and write useful
DTrace scripts that also take my runtime
system into account. So, stuff like that exists for, I think, Python - I'm not entirely sure because I don't use it - Node.js the same, Postgres - I use it, but not with DTrace right now - and, what I find interesting: Firefox. When you run
JavaScript in your Firefox, it actually
has a provider, so you can trace
JavaScript running in your browser with
DTrace, so after everything I just showed
you, there might be some stuff going on
there. So yeah, this is basically
everything I wanted to show you and I
think I'm going to wrap up, because
otherwise we're not going to have a lot of
time for questions and maybe you have
some. So yeah, thanks.
applause
Herald: Thank you very much Raichoo. We
are actually over time already, but we
have two more minutes because we started
three minutes late, so if there are any
really quick questions, possibly from the
internet... There is one, the signal angel
says, let's hear it.
Question: Yeah, hi, okay. So, the question
is, "which changes are actually necessary
to do in the kernel of an operating system
to support DTrace?"
Answer: That's a lot of work. So, it's not
something you do in a weekend. This
is... So, the person who started the work
on FreeBSD has sadly passed away now, but
I think they took a couple of years to
have everything in place, so you have to
have stuff like the CTF thing that I
showed you, which is what OpenBSD is
currently working on. And then you need
all those magic gizmos, like kernel
modules and stuff like that. So, it takes
a lot of time, but it's been ported to
most operating systems that are available
and in use right now. So yeah, hope this
answers the question.
Herald: Excellent and there are no more
questions here in the room. I will thank
Raichoo and you can find him outside of
the room and also on Twitter at "raichoo"
if you have any further questions.
postroll music
subtitles created by c3subtitles.de
in the year 2020. Join, and help us!