1
00:00:00,110 --> 00:00:14,320
music
2
00:00:14,320 --> 00:00:18,810
applause
3
00:00:18,810 --> 00:00:23,560
Raichoo: Yeah sorry about that - beamers
or projectors, I don't like them. They
4
00:00:23,560 --> 00:00:27,210
don't like me either. So this is a little
heads up - this is going to be the only
5
00:00:27,210 --> 00:00:32,049
slide I'm going to show you today so,
"slide", because I think doing stuff like
6
00:00:32,049 --> 00:00:35,940
that in a terminal might be a little bit
more interesting for you. But sadly
7
00:00:35,940 --> 00:00:40,020
something is getting cut off so we have
to improvise a little bit. But anyway, so
8
00:00:40,020 --> 00:00:43,960
today I will be able to talk about two of
my favorite things right now which are
9
00:00:43,960 --> 00:00:47,820
FreeBSD and DTrace. But this talk has been
capped down to 30 minutes so we'll be
10
00:00:47,820 --> 00:00:53,190
focusing a little more on the DTrace part.
So there will be a little bit less BSD
11
00:00:53,190 --> 00:00:57,560
than I anticipated. And I also adjusted
everything a little bit to fit better into
12
00:00:57,560 --> 00:01:03,130
the resilience track so hopefully you'll
enjoy that. So before we begin, who here
13
00:01:03,130 --> 00:01:08,640
is actually using DTrace? Okay more than
I expected but still not as many as I
14
00:01:08,640 --> 00:01:12,610
would like to see. So hopefully after this
talk you will think, "oh, this is a really
15
00:01:12,610 --> 00:01:17,170
awesome tool, I gotta learn it." Because I
totally love it - it changed the way I do
16
00:01:17,170 --> 00:01:22,110
a lot of stuff. So for those of you who do
not know what DTrace is, first, let me
17
00:01:22,110 --> 00:01:27,260
fill you in on this stuff. So it's open
source, it originated on Solaris, and is
18
00:01:27,260 --> 00:01:31,640
currently developed on illumos, which is a
fork of OpenSolaris. It has been ported
19
00:01:31,640 --> 00:01:37,930
to FreeBSD, NetBSD, OS X, there's also a
port for Linux called DTrace
20
00:01:37,930 --> 00:01:43,689
for Linux. I think it's done by a person
called Paul Fox. It's been ported to QNX
21
00:01:43,689 --> 00:01:49,810
and the OpenBSD folks are currently doing
some work to get a technology like
22
00:01:49,810 --> 00:01:54,040
DTrace on their system. And I think
there's a port for Windows? I don't know
23
00:01:54,040 --> 00:01:57,869
if this is actually true, but if it is, it's
kind of cool because then that means it's
24
00:01:57,869 --> 00:02:04,650
basically everywhere. So, most of you
would probably know static tools like
25
00:02:04,650 --> 00:02:09,470
strace. We have a very similar tool on
FreeBSD that is called truss, and what
26
00:02:09,470 --> 00:02:14,500
truss and strace are doing is - you can
attach them to a process and look at the
27
00:02:14,500 --> 00:02:18,650
system calls that this process is
emitting. So in case something is going
28
00:02:18,650 --> 00:02:23,319
wrong you can well look inside of the
program, which can be kind of useful when
29
00:02:23,319 --> 00:02:28,870
you're trying to find a problem. It's
kind of handy but it's also pretty
30
00:02:28,870 --> 00:02:32,890
limited. Because first of all it really
really slows down the process that you're
31
00:02:32,890 --> 00:02:37,250
currently looking at. So if you want to
debug a performance issue, you're pretty
32
00:02:37,250 --> 00:02:42,170
much out of luck there. And also it's kind
of narrow - you can only look
33
00:02:42,170 --> 00:02:47,940
at one process. Which is also a bad
thing because the systems that we currently
34
00:02:47,940 --> 00:02:52,660
have - all these systems are very
complex: we have a lot of layers. You have
35
00:02:52,660 --> 00:02:56,300
virtual file systems, you have virtual
memory, you have network, you have
36
00:02:56,300 --> 00:03:00,500
databases, processes communicating with
each other. And in case you are using a
37
00:03:00,500 --> 00:03:04,710
high-level programming language, you might
also have a runtime system. So it's a
38
00:03:04,710 --> 00:03:09,519
little operating system on top of your
operating system. So when something goes
39
00:03:09,519 --> 00:03:15,000
wrong in a system that has such large
complexity, something happens that we call
40
00:03:15,000 --> 00:03:19,850
the blame game. And the blame game - it's
never your fault, it's always someone
41
00:03:19,850 --> 00:03:25,710
else's. So what we want to be able to do
is we want to look at the system as a
42
00:03:25,710 --> 00:03:30,349
whole, so we can correlate all the data
and come up with some meaningful answers
43
00:03:30,349 --> 00:03:34,506
when something is really going wrong in
there. And also, we don't want to
44
00:03:34,506 --> 00:03:39,260
switch out all the processes for
debug processes to make that happen,
45
00:03:39,260 --> 00:03:44,969
because as these things are all -- every
problem happens in production. It never
46
00:03:44,969 --> 00:03:48,470
happens on the development box. So like,
switching out all the processes - that's
47
00:03:48,470 --> 00:03:55,030
totally out of the picture. So to do that
in an arbitrary way, to like, instrument
48
00:03:55,030 --> 00:03:59,910
the system in an arbitrary way, we sort of
need like a programming language. So, we
49
00:03:59,910 --> 00:04:03,640
need to describe - when that happens,
please emit data so I can see what's
50
00:04:03,640 --> 00:04:09,489
going on. So this kind of implies a
programming language. And DTrace comes
51
00:04:09,489 --> 00:04:13,670
with such a programming language - it's a
little bit reminiscent of awk crossed with
52
00:04:13,670 --> 00:04:18,798
C. It's pretty simple to learn - you can
pick it up in 20
53
00:04:18,798 --> 00:04:25,199
minutes and you can start churning out
your first DTrace scripts. So like awk, if
54
00:04:25,199 --> 00:04:30,559
you know awk, awk can be used to analyze
large bodies of text. DTrace is pretty
55
00:04:30,559 --> 00:04:34,749
much the same, but for system behavior -
so a little bit mind boggling, but
56
00:04:34,749 --> 00:04:40,069
probably I can show you what I mean by
that. And also, as a bonus we don't want
57
00:04:40,069 --> 00:04:43,860
to slow down the system, so we want to be
able to do things like performance
58
00:04:43,860 --> 00:04:52,300
debugging, performance tests, stuff like that. So
I've prepared this little demo here, and.
59
00:04:52,300 --> 00:04:58,780
So since we had some issues here probably
this is not -- I have to play around a
60
00:04:58,780 --> 00:05:04,249
little bit. So what I'm going to do is
I'm going to look at a very very naive way
61
00:05:04,249 --> 00:05:18,009
to -- excuse me for a second -- very naive
way to -- give me a second -- so very
62
00:05:18,009 --> 00:05:21,960
naive way to authenticate a user. And
there's a lot of stuff wrong with this
63
00:05:21,960 --> 00:05:26,030
code, but like what we're going to do is
we're going to take a user string as
64
00:05:26,030 --> 00:05:32,740
input, and then we're going to just
compare it to another, to a secret. So I
65
00:05:32,740 --> 00:05:36,420
know, the secret in here is in
plain text, and I know this is a problem, but
66
00:05:36,420 --> 00:05:41,639
this is a little bit artificial. But I
just want to get my point across. So from
67
00:05:41,639 --> 00:05:47,159
an algorithmic perspective, this check
function is correct: so we take a string
68
00:05:47,159 --> 00:05:52,449
we take another string and we compare
them. So everything's fine and easy. So if
69
00:05:52,449 --> 00:05:58,599
you look at the way string compare works
and what it does, it's essentially
70
00:05:58,599 --> 00:06:04,449
taking these two strings and it's
comparing them character by character. So
71
00:06:04,449 --> 00:06:10,729
when it finds the first pair of characters
that do not match up, it's going to stop.
72
00:06:10,729 --> 00:06:17,879
So we can conclude something
from that - if this check function takes
73
00:06:17,879 --> 00:06:23,399
a very short amount of time, then the
comparison hit a mismatch early and the
74
00:06:23,399 --> 00:06:29,129
function terminated earlier.
And if our password guess is better, it
75
00:06:29,129 --> 00:06:34,479
will take, well, it will take longer. And
if we can measure that we can basically
76
00:06:34,479 --> 00:06:40,809
extract information from that running
algorithm. So I wrote a little driver
77
00:06:40,809 --> 00:06:47,449
program in Haskell that basically just
iterates over an alphabet and just feeds
78
00:06:47,449 --> 00:06:53,379
this one letter into that program,
and I'm going to use DTrace to get some
79
00:06:53,379 --> 00:06:59,020
timing information. So let me start the
driver. So this is now just running in the
80
00:06:59,020 --> 00:07:04,919
background. And you cannot see what I'm
typing there, but don't worry - these
81
00:07:04,919 --> 00:07:12,240
scripts - I will push them all to
my GitHub. So DTrace now produces this
82
00:07:12,240 --> 00:07:17,240
nice little distribution. So if you
were able to see the entire
83
00:07:17,240 --> 00:07:22,949
alphabet, you would see that "D" behaves
differently from everything else. So if you
84
00:07:22,949 --> 00:07:29,399
squint a little, what you see there is
that the letter D takes a couple of
85
00:07:29,399 --> 00:07:32,949
nanoseconds longer. This is the precision
that I'm measuring here - ten to minus
86
00:07:32,949 --> 00:07:39,219
nine seconds. Like really precise. And D
takes longer than everything else, so it's
87
00:07:39,219 --> 00:07:43,929
a little bit cut off there, but trust me.
I know I sound like Donald Trump when I'm
88
00:07:43,929 --> 00:07:52,759
saying that. So yeah, and from that let's
just enter a letter. And
89
00:07:52,759 --> 00:07:56,799
now the script clears everything and
it's going to guess the next letter. So
90
00:07:56,799 --> 00:08:02,020
sadly this is cut off, because you would
see that this distribution radically
91
00:08:02,020 --> 00:08:08,830
changed. It looks completely different,
and so we can play that game a little bit.
92
00:08:08,830 --> 00:08:13,419
So let's just roll with that.
And like every three seconds the script is
93
00:08:13,419 --> 00:08:19,159
going to recompute looking at the new
distribution. And you can probably see
94
00:08:19,159 --> 00:08:26,849
where this is going. So here you can see,
okay, and now it just - it just takes
95
00:08:26,849 --> 00:08:34,559
about like three seconds for me to guess
the next letter. So, and this is not a
96
00:08:34,559 --> 00:08:39,809
problem that only
happens when you do string
97
00:08:39,809 --> 00:08:44,139
compares. This can happen with
basically everything - especially
98
00:08:44,139 --> 00:08:48,029
in things like cryptographic stuff where
you don't want to have some information
99
00:08:48,029 --> 00:08:56,620
leaked out. So this is what we call a
timing side channel attack. So I could
100
00:08:56,620 --> 00:09:02,959
essentially use DTrace to analyze
the real binary. So I didn't change the
101
00:09:02,959 --> 00:09:07,040
binary - I didn't have some debug
code there. This is like the actual binary
102
00:09:07,040 --> 00:09:12,500
that I would put into production. So
the reason it's important
103
00:09:12,500 --> 00:09:16,500
to take the actual binary is that some of
these timing side channels might be
104
00:09:16,500 --> 00:09:21,620
introduced by a compiler optimization. And
when you insert debug code into that code,
105
00:09:21,620 --> 00:09:26,920
then it might actually go away. So, you
want to look at the real code that you're
106
00:09:26,920 --> 00:09:34,420
putting into production. Let me show you
the script that I came up with to do
107
00:09:34,420 --> 00:09:40,779
that. So there are three interesting
things in this script. And don't
108
00:09:40,779 --> 00:09:44,180
worry - this is the more
complicated example, I just want to like
109
00:09:44,180 --> 00:09:48,839
inspire your ideas. Because with the things
that you can do with DTrace, pretty
110
00:09:48,839 --> 00:09:54,600
much the sky's the limit. You can
come up with the weirdest ideas, and so
111
00:09:54,600 --> 00:09:59,420
this is a more complicated example. I'm
going to show you simpler ones to
112
00:09:59,420 --> 00:10:04,440
demonstrate how we got here. So there are
three interesting things in this code. The
113
00:10:04,440 --> 00:10:09,509
first one is something that we call a
probe. So a probe is a point of
114
00:10:09,509 --> 00:10:15,019
instrumentation in the system. So whenever
a certain event happens in the system this
115
00:10:15,019 --> 00:10:21,269
probe is going to fire. And in this case,
the BEGIN probe marks
116
00:10:21,269 --> 00:10:27,379
the moment when the script starts. So the
second interesting thing is this clause.
117
00:10:27,379 --> 00:10:31,680
So this clause is basically what this
probe is going to execute - what's going
118
00:10:31,680 --> 00:10:37,780
to be executed once that probe fires. So
it's a little block of code.
119
00:10:37,780 --> 00:10:42,370
And this probe is a little bit more
interesting, because it tells us
120
00:10:42,370 --> 00:10:48,270
something about the structure of such
a probe. Because every
121
00:10:48,270 --> 00:10:54,100
probe is uniquely identified by a four
tuple. So it's like four components that
122
00:10:54,100 --> 00:10:59,079
uniquely identify a probe. And the
first part of this
123
00:10:59,079 --> 00:11:03,269
tuple is called the provider, and I'm
going to talk about providers in a couple
124
00:11:03,269 --> 00:11:07,160
of seconds and what they are. The second
one is called the module. Third one is
125
00:11:07,160 --> 00:11:13,449
called the function. And the last one is
called the name. So these four pieces of
126
00:11:13,449 --> 00:11:21,079
data, like, they identify a probe
uniquely. So the third thing that is
127
00:11:21,079 --> 00:11:25,440
interesting here is, sadly something that
I don't have time to talk about today,
128
00:11:25,440 --> 00:11:31,139
this is called an aggregation. And this
single line that you see here is
129
00:11:31,139 --> 00:11:35,889
essentially responsible for accumulating
all this data to print out this
130
00:11:35,889 --> 00:11:39,949
distribution stuff - to generate this
distribution. So this is built
131
00:11:39,949 --> 00:11:44,629
into DTrace. You don't have to do that
yourself. And when you look at this
132
00:11:44,629 --> 00:11:50,189
script, it's like 42 lines of code.
And I came up with the first prototype
133
00:11:50,189 --> 00:11:55,279
after five minutes. So it's not a lot
of stuff to do to get something out of
134
00:11:55,279 --> 00:12:00,360
that. So it's very useful to have things -
if you use DTrace you
135
00:12:00,360 --> 00:12:05,060
will use this a lot for performance
debugging so it's kind of neat that we
136
00:12:05,060 --> 00:12:11,410
have that. So yeah, let's talk a little
bit about providers, and this will
137
00:12:11,410 --> 00:12:18,300
probably also be cut off. So I'm
going to cheat a little bit here - I'm
138
00:12:18,300 --> 00:12:27,649
just going to double that. So let's talk
about providers -- oh that's handy --
139
00:12:27,649 --> 00:12:32,339
so I got 27 providers here and the number
of providers varies from operating system to
140
00:12:32,339 --> 00:12:38,339
operating system. But these are the
ones that I can see right now. There are
141
00:12:38,339 --> 00:12:44,499
other providers that can come into
existence when you demand them. So I have
142
00:12:44,499 --> 00:12:49,370
these 27 providers, and we're going to
look at the syscall provider and the FBT
143
00:12:49,370 --> 00:12:55,129
provider first. So, every provider knows
how to instrument a specific part of the
144
00:12:55,129 --> 00:13:01,410
system. So the syscall provider knows how
to instrument the syscall table. That's not
145
00:13:01,410 --> 00:13:08,699
very surprising. So we can look at the
syscall provider and here you can see
146
00:13:08,699 --> 00:13:16,720
essentially every system call entry and
return that FreeBSD offers. So
147
00:13:16,720 --> 00:13:20,120
here you can see this four tuple, like,
the provider syscall, FreeBSD is the
148
00:13:20,120 --> 00:13:28,189
module, and so on. So these are all the
system calls that I have in my system. And
149
00:13:28,189 --> 00:13:32,910
the other provider that I want to look at
is the so called FBT provider, and that is
150
00:13:32,910 --> 00:13:38,810
pretty astonishing. The FBT provider, FBT
stands for "function boundary tracer" and
151
00:13:38,810 --> 00:13:45,160
what it allows us to do, it allows us to
trace every single function in the kernel.
152
00:13:45,160 --> 00:13:50,850
So I can look at functions across the
entire kernel as they are being called. So to
153
00:13:50,850 --> 00:13:57,660
illustrate that I wrote a little, very
simple DTrace script and this is probably,
154
00:13:57,660 --> 00:14:01,399
look at the upper half please, so this is
probably one of the first DTrace scripts
155
00:14:01,399 --> 00:14:05,529
that you will come up with, it's a
fairly simple example, so let's break it
156
00:14:05,529 --> 00:14:09,680
down. So I'm going to instrument the mmap
system call. For those of you who do not
157
00:14:09,680 --> 00:14:13,720
know what the mmap system call is, what
you can do with it is you can
158
00:14:13,720 --> 00:14:20,970
take a file and map that into the address
space of your process - a very dumbed down
159
00:14:20,970 --> 00:14:27,449
version. So whenever we enter the mmap
system call we are going to set the
160
00:14:27,449 --> 00:14:32,810
variable "follow" to one, and what this
"self->" means: this is essentially a
161
00:14:32,810 --> 00:14:37,970
thread local variable and we're going to
associate that variable with the thread
162
00:14:37,970 --> 00:14:45,230
that we're currently inspecting. Then I'm
going to do something that sounds pretty
163
00:14:45,230 --> 00:14:49,149
scary: I'm going to instrument the
entire kernel. Every function entry and
164
00:14:49,149 --> 00:14:53,009
every function return, I'm going to
instrument that and say "please emit data
165
00:14:53,009 --> 00:14:57,189
when you do that". And this is what we
call a predicate, so this is where the
166
00:14:57,189 --> 00:15:02,009
awkiness of the DTrace programming
language comes in. So this is a predicate
167
00:15:02,009 --> 00:15:07,059
and whenever that evaluates to true
then the probe is going to fire, so in
168
00:15:07,059 --> 00:15:11,139
this case when we are in the thread that
we're currently tracing we're going to
169
00:15:11,139 --> 00:15:16,329
emit data. And this is just an empty
clause, we just want to know "hey we got
170
00:15:16,329 --> 00:15:23,480
here". So when we exit the mmap
system call and the predicate is set we're
171
00:15:23,480 --> 00:15:27,660
going to set the variable "follow" to
zero, because every uninitialized variable
172
00:15:27,660 --> 00:15:33,860
in DTrace is set to zero, so this pretty
much amounts to deallocating that variable
173
00:15:33,860 --> 00:15:41,279
and then we're going to exit cleanly. So
let me run that. So it takes a couple of
174
00:15:41,279 --> 00:15:48,480
seconds and boom. So you saw a little
pause here, that was when DTrace
175
00:15:48,480 --> 00:15:55,009
reverted its changes to the kernel. So now
you can see every function call that
176
00:15:55,009 --> 00:15:59,480
happens inside the mmap system call. And
this is a little bit hard on the eyes, so
177
00:15:59,480 --> 00:16:08,379
let me pass this flag here and now you
have easy-to-read indentation. So
178
00:16:08,379 --> 00:16:12,629
now you might say "I don't like that. You
are injecting code into the kernel. That
179
00:16:12,629 --> 00:16:17,880
is, that sounds dangerous" and yeah, but
let me show you something that I find
180
00:16:17,880 --> 00:16:23,980
really interesting. So I'm not
going too much into depth here, but this
181
00:16:23,980 --> 00:16:28,750
is bytecode, so every DTrace script
gets compiled to bytecode and this
182
00:16:28,750 --> 00:16:34,499
bytecode gets sent to the kernel and in
the kernel you have a virtual machine that
183
00:16:34,499 --> 00:16:39,059
interprets that bytecode. So in case you
write a script that for some reason might
184
00:16:39,059 --> 00:16:44,550
go rogue on your kernel, it like allocates
too much memory, takes too much time, this
185
00:16:44,550 --> 00:16:49,279
virtual machine can just say "okay, stop
it" and is just going to revert all the
186
00:16:49,279 --> 00:16:53,890
changes that happened to your kernel, and
that's kinda handy. And it's not a new
187
00:16:53,890 --> 00:17:01,199
idea, so if you're using tcpdump it's
basically the same approach. They also
188
00:17:01,199 --> 00:17:04,832
have this kind of bytecode, so that's just
a little excursion here. This is called
189
00:17:04,832 --> 00:17:13,250
BPF, Berkeley Packet Filter, so it's not
an entirely new idea. So everything I
190
00:17:13,250 --> 00:17:19,470
showed you until now was "hey, I can look
when function calls happen". That's not
191
00:17:19,470 --> 00:17:22,519
very much information, so we're going to
increase the amount of information that we
192
00:17:22,519 --> 00:17:35,080
get out of the system with every example.
So let me look at the actual kernel. So I
193
00:17:35,080 --> 00:17:39,980
had to restart my machine, so my setup is
basically gone now. So let's look at this
194
00:17:39,980 --> 00:17:45,309
vm_fault function. So this is the
source code of the operating system that
195
00:17:45,309 --> 00:17:52,900
I'm running right now. This is FreeBSD
12-CURRENT and the vm_fault function;
196
00:17:52,900 --> 00:17:57,539
remember the mmap system call that I told
you about? So the mmap system call,
197
00:17:57,539 --> 00:18:03,899
as I told you, can map a file
into your address space. And it doesn't
198
00:18:03,899 --> 00:18:10,320
necessarily have to load the entire file,
so whenever we are touching a page in the
199
00:18:10,320 --> 00:18:15,780
system, like a memory page, which on this machine
is four kilobytes, there are no superpages
200
00:18:15,780 --> 00:18:21,429
here, so whenever it touches a piece of
memory that you didn't bring into memory
201
00:18:21,429 --> 00:18:25,309
yet, we're generating something that's
called a page fault, and then this
202
00:18:25,309 --> 00:18:31,180
function gets called. So here let's look
at the arguments, and I'm going to skip
203
00:18:31,180 --> 00:18:36,990
the zeroeth argument, to look at the first
argument. So this is the address that
204
00:18:36,990 --> 00:18:44,160
provoked that page fault, this is the
type and these are the flags and I'm going
205
00:18:44,160 --> 00:18:48,780
to show you something to make that a
little bit more readable. So what about
206
00:18:48,780 --> 00:18:58,960
this one? So you see it's a pointer and
this is a big structure, so we want
207
00:18:58,960 --> 00:19:09,961
to be able to look at that structure. And
I probably should just do this here, so
208
00:19:09,961 --> 00:19:17,090
let's look at this VM fault script here.
So this is, make this a little bit more,
209
00:19:17,090 --> 00:19:20,950
so this is, don't pay too much attention
to this code, this is basically just
210
00:19:20,950 --> 00:19:26,049
boilerplate to make stuff readable
and this is where the actual action is
211
00:19:26,049 --> 00:19:31,690
happening. So this is, so what I'm doing
there is I'm instrumenting the
212
00:19:31,690 --> 00:19:36,350
vm_fault function and whenever we enter it
then we're going to use some information
213
00:19:36,350 --> 00:19:40,720
that DTrace gives us for free. So this is
execname, this is the name of the
214
00:19:40,720 --> 00:19:45,909
currently running executable that provoked
the page fault, this is the process ID and
215
00:19:45,909 --> 00:19:53,250
here we have a bunch of argument
variables. So these arg1, arg2, arg3,
216
00:19:53,250 --> 00:19:57,964
those are essentially just integers, so
nothing too fancy there. But we wanna
217
00:19:57,964 --> 00:20:02,380
look, wanna be able to look at that
struct. And here I'm going to use this
218
00:20:02,380 --> 00:20:08,140
args array, and this args array
is kind of special, because it has typing
219
00:20:08,140 --> 00:20:15,870
information about the arguments. So when
you run that, so you're dereferencing that
220
00:20:15,870 --> 00:20:26,570
pointer there with the star, excuse me,
and let's just run that and maybe, that's
221
00:20:26,570 --> 00:20:32,899
a start yeah. So this is an in-kernel
data structure that we can now look
222
00:20:32,899 --> 00:20:40,010
at. So DTrace enabled us to look at in-
memory data structures as the system runs.
223
00:20:40,010 --> 00:20:44,330
And this is really really powerful.
In the DTrace script I could use all
224
00:20:44,330 --> 00:20:50,490
these fields like I can manipulate this
args array, this value in there, just like
225
00:20:50,490 --> 00:20:57,010
every other variable; I
can pretty much work like I was in C. So
226
00:20:57,010 --> 00:21:02,659
how is it doing that? There is something
that's called CTF, that's not capture the
227
00:21:02,659 --> 00:21:10,120
flag, this is the Compact C
Type Format, and you can see that
228
00:21:10,120 --> 00:21:14,320
there is a man page in FreeBSD, and
there's a little segment in the kernel
229
00:21:14,320 --> 00:21:19,190
binary, where all this typing information
is stored. I don't know how that compares
230
00:21:19,190 --> 00:21:24,320
to modern DWARF but yeah this is what
DTrace is working with. So now you might
231
00:21:24,320 --> 00:21:28,549
ask yourself "Why on earth would I do
that? Why on earth would I look at virtual
232
00:21:28,549 --> 00:21:33,590
memory, because, yeah, um, this stuff is
safe isn't it? I mean there's no bugs in
233
00:21:33,590 --> 00:21:42,820
there." Except when there are. Does anyone
remember "Dirty COW"? So this
234
00:21:42,820 --> 00:21:48,510
was a very nasty vulnerability in the
Linux kernel and that was a problem
235
00:21:48,510 --> 00:21:52,399
in the virtual memory management. So it
allowed you to write to a file that you
236
00:21:52,399 --> 00:21:56,679
didn't own as a regular user. So you could
essentially just write to a binary that
237
00:21:56,679 --> 00:22:01,789
had the setuid bit set. Very unpleasant, but
I'm not going to bash the Linux folks
238
00:22:01,789 --> 00:22:08,030
here, this is just, I just want to show
you these things are hard. And the first
239
00:22:08,030 --> 00:22:15,440
fix for this problem was in 2005 and then
it came back in 2016. So now that's fixed
240
00:22:15,440 --> 00:22:21,080
and then it came back with "Huge Dirty
COW" in 2017, so this is, I mean this
241
00:22:21,080 --> 00:22:27,580
was there for way over a decade.
These things are hard to debug. And this
242
00:22:27,580 --> 00:22:33,110
is what I like about these systems, so not
having, not having tools like DTrace to
243
00:22:33,110 --> 00:22:37,640
figure out what's going on inside of the
system somehow, to me, amounts to security
244
00:22:37,640 --> 00:22:42,360
by obscurity. And I've heard that some
people who are developing exploits for
245
00:22:42,360 --> 00:22:46,100
systems that have DTrace they say "Oh, I
really like developing exploits on these
246
00:22:46,100 --> 00:22:53,230
systems, because the tooling is so great!"
Yeah, but, to be honest this is cool,
247
00:22:53,230 --> 00:22:58,899
because an exploit is a proof of concept
and coming up with these exploits quickly
248
00:22:58,899 --> 00:23:03,440
is very useful, because you know what's
going on you can show "Hey, this is going
249
00:23:03,440 --> 00:23:07,279
wrong". I had situations, where
people were telling me "Oh, this
250
00:23:07,279 --> 00:23:11,020
is not a problem with our program, this is
this weird operating system that you're
251
00:23:11,020 --> 00:23:18,100
using. Like Solaris, weird operating
system." And, yeah, and then I churned out
252
00:23:18,100 --> 00:23:22,059
some DTrace scripts and "No, it's
actually your problem". "Oh, now I can see
253
00:23:22,059 --> 00:23:31,419
that on my Linux box!" Magic. So,
everything I showed you until now was
254
00:23:31,419 --> 00:23:38,179
very, very much related to function calls
and we want to have a little bit more
255
00:23:38,179 --> 00:23:44,720
semantics here, because you might want to
write a script that inspects protocols,
256
00:23:44,720 --> 00:23:48,760
stuff like TCP, UDP stuff like that. So,
you don't want to know which function
257
00:23:48,760 --> 00:23:54,320
inside of the kernel is responsible for
handling your TCP/IP stuff, so DTrace
258
00:23:54,320 --> 00:24:00,549
comes with something that's called static
providers and I'm just going to show the
259
00:24:00,549 --> 00:24:04,769
apropos here. So these are, so every
static provider has a man page which is
260
00:24:04,769 --> 00:24:10,950
kind of handy - documentation whoo - and
you can see there is an I/O provider if
261
00:24:10,950 --> 00:24:17,539
you are interested in looking at disk I/O,
IP for looking at IPv4 and IPv6,
262
00:24:17,539 --> 00:24:23,570
TCP... This one is pretty cool, it's about
scheduling behavior. So, "what does my
263
00:24:23,570 --> 00:24:29,010
scheduler do?" And if you look at that, you
can see some interesting stuff like lend
264
00:24:29,010 --> 00:24:33,150
priority - if you ever saw things like
priority inversion, stuff like that, now
265
00:24:33,150 --> 00:24:36,970
you can see that happen. I'm a nerd, I
find this interesting for some reason, I
266
00:24:36,970 --> 00:24:43,230
don't know. And it's also pretty
interesting to figure out what's going on,
267
00:24:43,230 --> 00:24:48,279
"why is this getting de-scheduled all the
time?" So, some interesting things going
268
00:24:48,279 --> 00:24:55,809
on there. So, I'm running a little bit
short on time here, but I just quickly
269
00:24:55,809 --> 00:24:59,340
want to show you something - this is all
kernel stuff right now - can we do that
270
00:24:59,340 --> 00:25:05,380
with userspace? Of course. So, there was
one provider that didn't show up when I
271
00:25:05,380 --> 00:25:09,590
had my provider listing, but was in the
DTrace script where I did this timing
272
00:25:09,590 --> 00:25:16,230
attack stuff. And that's called the PID
provider. And the PID provider generates
273
00:25:16,230 --> 00:25:21,080
probes on demand, because a process might
have a lot of probes and you will shortly
274
00:25:21,080 --> 00:25:25,190
see why and this is why I'm going to use a
very small program which is called "true",
275
00:25:25,190 --> 00:25:31,560
and true just exits with exit code zero.
So, nothing too exciting going on here,
276
00:25:31,560 --> 00:25:37,810
and this $target gets substituted
in, we get the process ID there. And this
277
00:25:37,810 --> 00:25:44,640
is everything that happens when I'm
executing this program. You see, this is a
278
00:25:44,640 --> 00:25:48,679
little bit more fine-grained than the FBT
provider, because now we can trace every
279
00:25:48,679 --> 00:25:53,520
single instruction inside of that
function, which is kind of handy. It's a
280
00:25:53,520 --> 00:25:58,090
scriptable debugger. So, these numbers are
the instruction offsets inside of that
281
00:25:58,090 --> 00:26:03,360
function. We can also look at - so this is
everything in the true segment - we can
282
00:26:03,360 --> 00:26:09,899
also look at libraries that got linked in
and there's a lot of stuff happening in
283
00:26:09,899 --> 00:26:15,780
libc for example when you run true.
So, one last thing that I wanted to show
284
00:26:15,780 --> 00:26:22,340
you because it consumed a week of my life:
I'm using a lot of Haskell and the macOS
285
00:26:22,340 --> 00:26:29,419
people, they also have DTrace and they
have GHC Haskell DTrace support - so the
286
00:26:29,419 --> 00:26:38,380
Glasgow Haskell Compiler - and glorious...
they have probes to analyze what's going
287
00:26:38,380 --> 00:26:41,620
on inside of the runtime system. So, I
thought "I want to have that, I have
288
00:26:41,620 --> 00:26:47,019
DTrace, why doesn't it work on FreeBSD?"
So, after a week of fighting with make
289
00:26:47,019 --> 00:26:55,100
files and linkers, that works: If you
check out the recent GHC repository and
290
00:26:55,100 --> 00:27:00,260
build it on FreeBSD, you get all the nice
stuff that I'm going to show you now. So,
291
00:27:00,260 --> 00:27:05,909
this is a very boring program - it just
starts 32 green threads and schedules them
292
00:27:05,909 --> 00:27:10,470
all over the place - and now I can do
something like this: (phone rings) I can
293
00:27:10,470 --> 00:27:13,934
ring a telephone.
laughter
294
00:27:13,934 --> 00:27:18,750
No, that would be
interesting... So, you can also use
295
00:27:18,750 --> 00:27:26,970
wildcards - instead of naming the probe -
and this is what's going on inside, like
296
00:27:26,970 --> 00:27:31,580
GC (garbage collection) and all this stuff.
Now you can look at this and write useful
297
00:27:31,580 --> 00:27:37,509
DTrace scripts that also take my runtime
system into account. So, stuff like that
298
00:27:37,509 --> 00:27:41,810
exists for I think Python - I'm not
entirely sure because I don't use it -
299
00:27:41,810 --> 00:27:49,120
Node.js same, Postgres - I used it but not
with DTrace right now - and what I find
300
00:27:49,120 --> 00:27:55,210
interesting: Firefox. When you run
JavaScript in your Firefox, it actually
301
00:27:55,210 --> 00:27:59,360
has a provider, so you can trace
JavaScript running in your browser with
302
00:27:59,360 --> 00:28:05,130
DTrace. So after everything I just showed
you, there might be some stuff going on
303
00:28:05,130 --> 00:28:10,700
there. So yeah, this is basically
everything I wanted to show you and I
304
00:28:10,700 --> 00:28:13,759
think I'm going to wrap up, because
otherwise we're not going to have a lot of
305
00:28:13,759 --> 00:28:19,001
time for questions and maybe you have
some. So yeah, thanks.
306
00:28:19,001 --> 00:28:29,610
applause
Herald: Thank you very much Raichoo. We
307
00:28:29,610 --> 00:28:34,257
are actually over time already, but we
have two more minutes because we started
308
00:28:34,257 --> 00:28:38,817
three minutes late, so if there are any
really quick questions, possibly from the
309
00:28:38,817 --> 00:28:43,030
internet... There is one, the Signal Angel
says, let's hear it.
310
00:28:43,030 --> 00:28:48,013
Question: Yeah, hi, okay. So, the question
is, "which changes are actually necessary
311
00:28:48,013 --> 00:28:51,809
to do in the kernel of an operating system
to support DTrace?"
312
00:28:51,809 --> 00:28:56,370
Answer: That's a lot of work. So, it's not
something you do in a weekend. This
313
00:28:56,370 --> 00:29:03,062
is... So, the person who started the work
on FreeBSD has sadly passed away now, but
314
00:29:03,062 --> 00:29:09,559
I think they took a couple of years to
have everything in place, so you have to
315
00:29:09,559 --> 00:29:13,730
have stuff like the CTF thing that I
showed you, which is what OpenBSD is
316
00:29:13,730 --> 00:29:19,890
currently working on. And then you need
all those magic gizmos, like kernel
317
00:29:19,890 --> 00:29:25,660
modules and stuff like that. So, it takes
a lot of time, but it's been ported to
318
00:29:25,660 --> 00:29:30,889
most operating systems that are available
and in use right now. So yeah, hope this
319
00:29:30,889 --> 00:29:34,239
answers the question.
Herald: Excellent and there are no more
320
00:29:34,239 --> 00:29:38,839
questions here in the room. I will thank
Raichoo and you can find him outside of
321
00:29:38,839 --> 00:29:46,590
the room and also on Twitter at "raichoo"
if you have any further questions.
322
00:29:46,590 --> 00:29:51,405
postroll music
323
00:29:51,405 --> 00:30:08,000
subtitles created by c3subtitles.de
in the year 2020. Join, and help us!