0:00:00.110,0:00:14.320 music 0:00:14.320,0:00:18.810 applause 0:00:18.810,0:00:23.560 Raichoo: Yeah sorry about that - beamers[br]or projectors, I don't like them. They 0:00:23.560,0:00:27.210 don't like me either. So this is a little[br]heads up - this is going to be the only 0:00:27.210,0:00:32.049 slide I'm going to show you today so,[br]"slide", because I think doing stuff like 0:00:32.049,0:00:35.940 that in a terminal might be a little bit[br]more interesting for you. But sadly 0:00:35.940,0:00:40.020 something is getting cut off so we have[br]to improvise a little bit. But anyway, so 0:00:40.020,0:00:43.960 today I will be able to talk about two of[br]my favorite things right now which are 0:00:43.960,0:00:47.820 FreeBSD and DTrace. But this talk has been[br]capped down to 30 minutes so we'll be 0:00:47.820,0:00:53.190 focusing a little more on the DTrace part.[br]So there will be a little bit less BSD 0:00:53.190,0:00:57.560 than I anticipated. And I also adjusted[br]everything a little bit to fit better into 0:00:57.560,0:01:03.130 the resilience track so hopefully you'll[br]enjoy that. So before we begin, who here 0:01:03.130,0:01:08.640 is actually using DTrace? Okay more than[br]I expected but still not as many as I 0:01:08.640,0:01:12.610 would like to see. So hopefully after this[br]talk you will think, "oh, this is a really 0:01:12.610,0:01:17.170 awesome tool, I gotta learn it." Because I[br]totally love it - it changed the way I do 0:01:17.170,0:01:22.110 a lot of stuff. So for those of you who do[br]not know what DTrace is, first, let me 0:01:22.110,0:01:27.260 fill you in on this stuff. So it's open[br]source, it originated on Solaris, and is 0:01:27.260,0:01:31.640 currently being developed on illumos which is a[br]fork of OpenSolaris. It has been ported 0:01:31.640,0:01:37.930 to FreeBSD, NetBSD, OS X, there's also a[br]port for Linux called DTrace 0:01:37.930,0:01:43.689 for Linux. I think it's done by a person[br]called Paul Fox. 
It's been ported to QNX 0:01:43.689,0:01:49.810 and the OpenBSD folks are currently doing[br]some work to get a technology like 0:01:49.810,0:01:54.040 DTrace on their system. And I think[br]there's a port for Windows? I don't know 0:01:54.040,0:01:57.869 if this is actually true, but it's[br]kind of cool because that means it's 0:01:57.869,0:02:04.650 basically everywhere. So, most of you[br]would probably know static tools like 0:02:04.650,0:02:09.470 strace. We have a very similar tool on[br]FreeBSD that is called truss, and what 0:02:09.470,0:02:14.500 truss and strace are doing is - you can[br]attach them to a process and look at the 0:02:14.500,0:02:18.650 system calls that this process is[br]emitting. So in case something is going 0:02:18.650,0:02:23.319 wrong you can, well, look inside of the[br]program, which can be kind of useful when 0:02:23.319,0:02:28.870 you're trying to find a problem. It's[br]kind of handy but it's also pretty 0:02:28.870,0:02:32.890 limited. Because first of all it really[br]really slows down the process that you're 0:02:32.890,0:02:37.250 currently looking at. So if you want to[br]debug a performance issue, you're pretty 0:02:37.250,0:02:42.170 much out of luck there. And also it's kind[br]of like, narrowed down - you can just look 0:02:42.170,0:02:47.940 at one process. Which is also like a bad[br]thing because the systems that we currently 0:02:47.940,0:02:52.660 have - all these systems are very[br]complex: we have a lot of layers. You have 0:02:52.660,0:02:56.300 virtual file systems, you have virtual[br]memory, you have network, you have 0:02:56.300,0:03:00.500 databases, processes communicating with[br]each other. And in case you are using a 0:03:00.500,0:03:04.710 high-level programming language, you might[br]also have a runtime system. So it's a 0:03:04.710,0:03:09.519 little operating system on top of your[br]operating system. 
So when something goes 0:03:09.519,0:03:15.000 wrong in a system that has such large[br]complexity, something happens that we call 0:03:15.000,0:03:19.850 the blame game. And the blame game - it's[br]never your fault, it's always someone 0:03:19.850,0:03:25.710 else's. So what we want to be able to do[br]is we want to look at the system as a 0:03:25.710,0:03:30.349 whole, so we can correlate all the data[br]and come up with some meaningful answers 0:03:30.349,0:03:34.506 when something is really going wrong in[br]there. And also, we don't want to 0:03:34.506,0:03:39.260 switch out all the processes for[br]debug processes to make that happen, 0:03:39.260,0:03:44.969 because as these things are all -- every[br]problem happens in production. It never 0:03:44.969,0:03:48.470 happens on the development box. So like,[br]switching out all the processes - that's 0:03:48.470,0:03:55.030 totally out of the picture. So to do that[br]in an arbitrary way, to like, instrument 0:03:55.030,0:03:59.910 the system in an arbitrary way, we sort of[br]need like a programming language. So, we 0:03:59.910,0:04:03.640 need to describe - when that happens,[br]please emit data so I can see what's 0:04:03.640,0:04:09.489 going on. So this kind of implies a[br]programming language. And DTrace comes 0:04:09.489,0:04:13.670 with such a programming language - it's a[br]little bit reminiscent of awk crossed with 0:04:13.670,0:04:18.798 C? It's pretty simple to learn - you can[br]pick it up in 20 0:04:18.798,0:04:25.199 minutes and you can start churning out[br]your first DTrace scripts. So like awk, if 0:04:25.199,0:04:30.559 you know awk, awk can be used to analyze[br]large bodies of text. DTrace is pretty 0:04:30.559,0:04:34.749 much the same, but for system behavior -[br]so a little bit mind boggling, but 0:04:34.749,0:04:40.069 probably I can show you what I mean by[br]that. 
And also, as a bonus we don't want 0:04:40.069,0:04:43.860 to slow down the system, so we want to be[br]able to do things like performance 0:04:43.860,0:04:52.300 debugging, performance tests, things like that. So[br]I've prepared this little demo here, and. 0:04:52.300,0:04:58.780 So since we had some issues here probably[br]this is not -- I have to play around a 0:04:58.780,0:05:04.249 little bit. So what I'm going to do is[br]I'm going to look at a very very naive way 0:05:04.249,0:05:18.009 to -- excuse me for a second -- very naive[br]way to -- give me a second -- so very 0:05:18.009,0:05:21.960 naive way to authenticate a user. And[br]there's a lot of stuff wrong with this 0:05:21.960,0:05:26.030 code, but like what we're going to do is[br]we're going to take a user string as 0:05:26.030,0:05:32.740 input, and then we're going to just[br]compare it to another, to a secret. So I 0:05:32.740,0:05:36.420 know, the secret in here is like in[br]plain text, I know this is a problem, but 0:05:36.420,0:05:41.639 this is a little bit artificial. But I[br]just want to get my point across. So from 0:05:41.639,0:05:47.159 an algorithmic perspective, this check[br]function is correct: so we take a string 0:05:47.159,0:05:52.449 we take another string and we compare[br]them. So everything's fine and easy. So if 0:05:52.449,0:05:58.599 you look at the way string compare works[br]and what it does, it's essentially 0:05:58.599,0:06:04.449 taking these two strings and it's[br]comparing every character one by one. So 0:06:04.449,0:06:10.729 when it finds the first pair of characters[br]that do not match up, it's going to stop. 
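To make the early-exit behavior concrete, here is a minimal C sketch. It is not the code from the demo (that code is not reproduced in this transcript): the secret `Dpassword`, the `check` function body, and the helper `chars_examined` are all hypothetical, chosen only so that the first letter is "D" as in the talk. The helper counts how many character pairs a byte-by-byte comparison looks at before it stops, which is the quantity a timing measurement indirectly observes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical secret, for illustration only. */
static const char SECRET[] = "Dpassword";

/* Naive check in the spirit of the talk: returns 1 on a match. */
static int check(const char *guess)
{
    return strcmp(guess, SECRET) == 0;
}

/* Counts how many character pairs a simple byte-wise compare examines
 * before it stops at the first mismatch or at the end of the guess. */
static size_t chars_examined(const char *guess, const char *secret)
{
    size_t examined = 0;

    for (;;) {
        char g = guess[examined];
        char s = secret[examined];

        examined++;
        if (g != s || g == '\0')
            return examined;
    }
}
```

A guess that gets more leading characters right makes the comparison do more work: `chars_examined("x", SECRET)` is 1, while `chars_examined("Dp", SECRET)` is 3. Real `strcmp` implementations may compare several bytes at a time, so the real timing differences are smaller and noisier than this count suggests.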
0:06:10.729,0:06:17.879 So we can conclude something[br]from that - so if 0:06:17.879,0:06:23.399 this check function takes[br]a very short amount of time, then 0:06:23.399,0:06:29.129 it terminated early.[br]And if our password guess is better, it 0:06:29.129,0:06:34.479 will take, well, it will take longer. And[br]if we can measure that we can basically 0:06:34.479,0:06:40.809 extract information from that running[br]algorithm. So I wrote a little driver 0:06:40.809,0:06:47.449 program in Haskell that basically just[br]iterates over an alphabet and just feeds 0:06:47.449,0:06:53.379 this one letter into that program,[br]and I'm going to use DTrace to get some 0:06:53.379,0:06:59.020 timing information. So let me start the[br]driver. So this is now just running in the 0:06:59.020,0:07:04.919 background. And you cannot see what I'm[br]typing there, but don't worry - these 0:07:04.919,0:07:12.240 scripts will all be... I will push them to[br]my GitHub. So DTrace now produces this 0:07:12.240,0:07:17.240 nice little distribution. So if you[br]were able to see the entire 0:07:17.240,0:07:22.949 alphabet, you would see that "D"[br]behaves differently from everything else. So if you 0:07:22.949,0:07:29.399 squint a little, what you see there is[br]the letter D takes a couple of 0:07:29.399,0:07:32.949 nanoseconds longer. This is the precision[br]that I'm measuring here - ten to minus 0:07:32.949,0:07:39.219 nine seconds. Like really precise. And D[br]takes longer than everything else, so it's 0:07:39.219,0:07:43.929 a little bit cut off there, but trust me.[br]I know I sound like Donald Trump 0:07:43.929,0:07:52.759 saying that. So yeah, and from that let's[br]just enter a letter. And now 0:07:52.759,0:07:56.799 the script clears everything and[br]it's going to guess the next letter. 
So 0:07:56.799,0:08:02.020 sadly this is cut off, because you would[br]see that this distribution radically 0:08:02.020,0:08:08.830 changed. It looks completely different,[br]and so we can play that game a little bit. 0:08:08.830,0:08:13.419 So let's just roll with that.[br]And like every three seconds the script is 0:08:13.419,0:08:19.159 going to recompute looking at the new[br]distribution. And you can probably see 0:08:19.159,0:08:26.849 where this is going. So here you can see,[br]okay, and now it just - it just takes 0:08:26.849,0:08:34.559 about like three seconds for me to guess[br]the next letter. So, and this is not a 0:08:34.559,0:08:39.809 problem that only[br]happens when you do string 0:08:39.809,0:08:44.139 compares. This can happen with[br]basically everything - especially 0:08:44.139,0:08:48.029 in things like cryptographic stuff where[br]you don't want to have some information 0:08:48.029,0:08:56.620 leaked out. So this is what we call a[br]timing side channel attack. So I could 0:08:56.620,0:09:02.959 essentially use DTrace to analyze[br]the real binary. So I didn't change the 0:09:02.959,0:09:07.040 binary - I didn't have some debug[br]code there. This is like the actual binary 0:09:07.040,0:09:12.500 that I would put into production. So[br]what's important about that, about 0:09:12.500,0:09:16.500 taking the actual binary, is that some of[br]these timing side channels might be 0:09:16.500,0:09:21.620 introduced by a compiler optimization. And[br]when you insert debug code into that code, 0:09:21.620,0:09:26.920 then it might actually go away. So, you[br]want to look at the real code that you're 0:09:26.920,0:09:34.420 putting into production. Let me show you[br]the script that I came up with to do 0:09:34.420,0:09:40.779 that. So there are three interesting[br]things in this script. 
And don't 0:09:40.779,0:09:44.180 worry - this is the more[br]complicated example, I just want to like 0:09:44.180,0:09:48.839 inspire your ideas. Because the things[br]that you can do with DTrace, that's pretty 0:09:48.839,0:09:54.600 much - the sky's the limit. You can[br]come up with the weirdest ideas, and so 0:09:54.600,0:09:59.420 this is a more complicated example. I'm[br]going to show you simpler ones to 0:09:59.420,0:10:04.440 demonstrate how we got here. So there are[br]three interesting things in this code. The 0:10:04.440,0:10:09.509 first one is something that we call a[br]probe. So a probe is a point of 0:10:09.509,0:10:15.019 instrumentation in the system. So whenever[br]a certain event happens in the system this 0:10:15.019,0:10:21.269 probe is going to fire. And in this case,[br]the begin probe marks 0:10:21.269,0:10:27.379 the moment when the script starts. So the[br]second interesting thing is this clause. 0:10:27.379,0:10:31.680 So this clause is basically what this[br]probe is going to execute - what's going 0:10:31.680,0:10:37.780 to be executed once that probe fires. So[br]it's a little block of code. 0:10:37.780,0:10:42.370 And this probe is a little bit more[br]interesting, because it tells us 0:10:42.370,0:10:48.270 something about the structure of what such[br]a probe looks like. Because every 0:10:48.270,0:10:54.100 probe is uniquely identified by a four[br]tuple. So it's like four components that 0:10:54.100,0:10:59.079 uniquely identify a probe. And the[br]first part of this 0:10:59.079,0:11:03.269 tuple is called the provider, and I'm[br]going to talk about providers in a couple 0:11:03.269,0:11:07.160 of seconds and what they are. The second[br]one is called the module. Third one is 0:11:07.160,0:11:13.449 called the function. And the last one is[br]called the name. So these four pieces of 0:11:13.449,0:11:21.079 data, like, they identify a probe[br]uniquely. 
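The four-tuple described above is conventionally written as `provider:module:function:name`, with empty fields acting as wildcards. The following C sketch shows only that structure. It is not how DTrace itself parses probe descriptions; the `struct probe_desc` type and the `parse_probe` function are hypothetical names made up for this illustration, and unlike real DTrace this sketch insists on all four fields being written out.

```c
#include <stddef.h>
#include <string.h>

#define FIELD_MAX 64

/* Hypothetical container for the four parts of a probe description. */
struct probe_desc {
    char provider[FIELD_MAX];
    char module[FIELD_MAX];
    char function[FIELD_MAX];
    char name[FIELD_MAX];
};

/* Splits "provider:module:function:name" into its four fields.
 * Empty fields are allowed (for example "fbt:::entry").
 * Returns 0 on success, -1 if there are not exactly four fields
 * or if a field does not fit into FIELD_MAX bytes. */
static int parse_probe(const char *text, struct probe_desc *out)
{
    char *fields[4] = { out->provider, out->module,
                        out->function, out->name };
    const char *start = text;

    for (int i = 0; i < 4; i++) {
        const char *colon = strchr(start, ':');
        size_t len;

        if (i < 3) {
            if (colon == NULL)
                return -1;          /* too few fields */
            len = (size_t)(colon - start);
        } else {
            if (colon != NULL)
                return -1;          /* too many fields */
            len = strlen(start);
        }
        if (len >= FIELD_MAX)
            return -1;

        memcpy(fields[i], start, len);
        fields[i][len] = '\0';
        if (i < 3)
            start = colon + 1;
    }
    return 0;
}
```

Parsing `"syscall:freebsd:mmap:entry"` yields provider `syscall`, module `freebsd`, function `mmap` and name `entry`; parsing `"fbt:::entry"` leaves module and function empty, which in a DTrace script would match every module and every function.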
So the third thing that is 0:11:21.079,0:11:25.440 interesting here is, sadly, something that[br]I don't have time to talk about today, 0:11:25.440,0:11:31.139 this is called an aggregation. And this[br]single line that you see here is 0:11:31.139,0:11:35.889 essentially responsible for accumulating[br]all this data to print out this 0:11:35.889,0:11:39.949 distribution stuff - to generate this[br]distribution. So this is built 0:11:39.949,0:11:44.629 into DTrace. You don't have to do that[br]yourself. When you look at this 0:11:44.629,0:11:50.189 script, it's like 42 lines of code.[br]And I came up with the first prototype 0:11:50.189,0:11:55.279 after five minutes. So it's not a lot[br]of stuff to do to get something out of 0:11:55.279,0:12:00.360 that. So it's very useful to have things -[br]if you use DTrace you 0:12:00.360,0:12:05.060 will use this a lot for performance[br]debugging so it's kind of neat that we 0:12:05.060,0:12:11.410 have that. So yeah, let's talk a little[br]bit about providers, and this will 0:12:11.410,0:12:18.300 probably also be cut off. So I'm[br]going to cheat a little bit here - I'm 0:12:18.300,0:12:27.649 just going to double that. So let's talk[br]about providers -- oh that's handy -- 0:12:27.649,0:12:32.339 so I got 27 providers here and the number[br]of providers varies from operating system to 0:12:32.339,0:12:38.339 operating system. But these are the[br]ones that I can see right now. There are 0:12:38.339,0:12:44.499 other providers that can come into[br]existence when you demand them. So I have 0:12:44.499,0:12:49.370 these 27 providers, and we're going to[br]look at the syscall provider and the FBT 0:12:49.370,0:12:55.129 provider first. So, every provider knows[br]how to instrument a specific part of the 0:12:55.129,0:13:01.410 system. So the syscall provider knows how[br]to instrument the syscall table. That's not 0:13:01.410,0:13:08.699 very surprising. 
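The aggregation mentioned earlier in this part of the talk turns a stream of measurements into a distribution. The transcript does not show which aggregating function the script uses, so the following is only a sketch under the assumption that it behaves like a power-of-two histogram (the kind of output DTrace's `quantize()` produces). The `struct histogram` type and the `bucket_index`, `bucket_floor` and `hist_add` helpers are hypothetical names, and only unsigned values are handled.

```c
#include <stdint.h>

#define NBUCKETS 65   /* bucket 0 holds the value 0, buckets 1..64 hold powers of two */

/* Hypothetical power-of-two histogram. */
struct histogram {
    uint64_t count[NBUCKETS];
};

/* Bucket 0 is for the value 0; bucket k (k >= 1) covers the range
 * 2^(k-1) <= v < 2^k, so the index is the bit length of v. */
static unsigned bucket_index(uint64_t v)
{
    unsigned idx = 0;

    while (v != 0) {
        idx++;
        v >>= 1;
    }
    return idx;
}

/* Smallest value that falls into the given bucket. */
static uint64_t bucket_floor(unsigned idx)
{
    return idx == 0 ? 0 : (uint64_t)1 << (idx - 1);
}

/* Records one measurement, for example a duration in nanoseconds. */
static void hist_add(struct histogram *h, uint64_t v)
{
    h->count[bucket_index(v)]++;
}
```

Feeding in durations of 100, 120, 130 and 250 nanoseconds puts two samples into the bucket starting at 64 and two into the bucket starting at 128. A letter whose comparisons consistently take longer shifts its samples toward higher buckets, which is the kind of difference the demo relies on.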
So if you look at the[br]syscall provider, here you can see 0:13:08.699,0:13:16.720 essentially every system call entry and[br]return that FreeBSD offers. So 0:13:16.720,0:13:20.120 here you can see this four tuple, like,[br]the provider syscall, FreeBSD is the 0:13:20.120,0:13:28.189 module, and so on. So these are all the[br]system calls that I have in my system. And 0:13:28.189,0:13:32.910 the other provider that I want to look at[br]is the so called FBT provider, and that is 0:13:32.910,0:13:38.810 pretty astonishing. The FBT provider, FBT[br]stands for "function boundary tracer" and 0:13:38.810,0:13:45.160 what it allows us to do, it allows us to[br]trace every single function in the kernel. 0:13:45.160,0:13:50.850 So I can look at the entire kernel at[br]functions, as they are being called. So to 0:13:50.850,0:13:57.660 illustrate that I wrote a little, very[br]simple DTrace script and this is probably, 0:13:57.660,0:14:01.399 look at the upper half please, so this is[br]probably one of the first DTrace scripts 0:14:01.399,0:14:05.529 that you will come up with, it's a[br]fairly simple example, so let's break it 0:14:05.529,0:14:09.680 down. So I'm going to instrument the mmap[br]system call. For those of you who do not 0:14:09.680,0:14:13.720 know what the mmap system call is, what[br]you can do with it is you can 0:14:13.720,0:14:20.970 take a file and map that into the address[br]space of your process, so that's the very dumbed down 0:14:20.970,0:14:27.449 version. So whenever we enter the mmap[br]system call we are going to set the 0:14:27.449,0:14:32.810 variable "follow" to one, and what this[br]"self->" means: this is essentially a 0:14:32.810,0:14:37.970 thread local variable and we're going to[br]associate that variable with the thread 0:14:37.970,0:14:45.230 that we're currently inspecting. Then I'm[br]going to do something pretty, that sounds 0:14:45.230,0:14:49.149 scary but I'm going to instrument the[br]entire kernel. 
Every function entry and 0:14:49.149,0:14:53.009 every function return, I'm going to[br]instrument that and say "please emit data 0:14:53.009,0:14:57.189 when you do that". And this is what we[br]call a predicate, so this is where the 0:14:57.189,0:15:02.009 awkiness of the DTrace programming[br]language comes in. So this is a predicate 0:15:02.009,0:15:07.059 and whenever that evaluates to true[br]then the probe is going to fire, so in 0:15:07.059,0:15:11.139 this case when we are in the thread that[br]we're currently tracing we're going to 0:15:11.139,0:15:16.329 emit data. And this is just an empty[br]clause, we just want to know "hey we got 0:15:16.329,0:15:23.480 here". So when we exit the mmap[br]system call and the predicate is set we're 0:15:23.480,0:15:27.660 going to set the variable "follow" to[br]zero, because every uninitialized variable 0:15:27.660,0:15:33.860 in DTrace is set to zero, so this pretty[br]much amounts to deallocating that variable 0:15:33.860,0:15:41.279 and then we're going to exit cleanly. So[br]let me run that. So it takes a couple of 0:15:41.279,0:15:48.480 seconds and boom. So you saw a little[br]pause here, that was when the DTrace guard 0:15:48.480,0:15:55.009 reverted the driver, the kernel. So now[br]you can see every function call that 0:15:55.009,0:15:59.480 happens inside the mmap system call. And[br]this is a little bit hard on the eyes, so 0:15:59.480,0:16:08.379 let me pass this flag here and now you can[br]have nice to read indentation. So 0:16:08.379,0:16:12.629 now you might say "I don't like that. You[br]are injecting code into the kernel. That 0:16:12.629,0:16:17.880 is, that sounds dangerous" and yeah, but[br]let me show you something that I find 0:16:17.880,0:16:23.980 really interesting. 
So I'm not[br]going too much into depth here, but this 0:16:23.980,0:16:28.750 is bytecode, so every DTrace script[br]gets compiled to bytecode and this 0:16:28.750,0:16:34.499 bytecode gets sent to the kernel and in[br]the kernel you have a virtual machine that 0:16:34.499,0:16:39.059 interprets that bytecode. So in case you[br]write a script that for some reason might 0:16:39.059,0:16:44.550 go rogue on your kernel, it like allocates[br]too much memory, takes too much time, this 0:16:44.550,0:16:49.279 virtual machine can just say "okay, stop[br]it" and it's just going to revert all the 0:16:49.279,0:16:53.890 changes that happened to your kernel, and[br]that's kinda handy. And it's not a new 0:16:53.890,0:17:01.199 idea, so if you're using tcpdump it's[br]basically the same approach. They also 0:17:01.199,0:17:04.832 have this kind of bytecode, so that's just[br]a little excursion here. This is called 0:17:04.832,0:17:13.250 BPF, Berkeley Packet Filter, so it's not[br]an entirely new idea. So everything I 0:17:13.250,0:17:19.470 showed you until now was "hey, I can look[br]when function calls happen". That's not 0:17:19.470,0:17:22.519 very much information, so we're going to[br]increase the amount of information that we 0:17:22.519,0:17:35.080 get out of the system with every example.[br]So let me look at the actual kernel. So I 0:17:35.080,0:17:39.980 had to restart my machine, so my setup is[br]basically gone now. So let's look at this 0:17:39.980,0:17:45.309 vm_fault function. So this is, this is the[br]source code of the operating system that 0:17:45.309,0:17:52.900 I'm running right now. This is FreeBSD[br]current 12 and the vm_fault function; 0:17:52.900,0:17:57.539 remember the mmap system call that I told[br]you about? So the mmap system call 0:17:57.539,0:18:03.899 I told you about can map a file[br]into your address space. 
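For readers who have not used mmap, here is a small C sketch of what "mapping a file into your address space" looks like from userspace. It is not code from the talk: the function name `read_back_through_mmap` is hypothetical, and a temporary file stands in for whatever file a real program would map. It assumes a POSIX system that provides `mmap`, `tmpfile` and `fileno`.

```c
#define _POSIX_C_SOURCE 200809L

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Writes `text` to a temporary file, maps that file read-only and copies
 * the bytes back out of the mapping into `out` (NUL-terminated).
 * Returns 0 on success, -1 on any failure. */
static int read_back_through_mmap(const char *text, char *out, size_t out_size)
{
    size_t len = strlen(text);
    FILE *f;
    void *mapping;

    if (len == 0 || len >= out_size)
        return -1;

    f = tmpfile();
    if (f == NULL)
        return -1;
    if (fwrite(text, 1, len, f) != len || fflush(f) != 0) {
        fclose(f);
        return -1;
    }

    /* After this call the file contents are reachable as ordinary memory. */
    mapping = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fileno(f), 0);
    if (mapping == MAP_FAILED) {
        fclose(f);
        return -1;
    }

    /* Reading through the mapping; no read() call is involved here. */
    memcpy(out, mapping, len);
    out[len] = '\0';

    munmap(mapping, len);
    fclose(f);
    return 0;
}
```

The `memcpy` reads the file purely through memory accesses. The kernel is free to bring the underlying pages in lazily, on first touch, rather than at the time of the `mmap` call.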
And it doesn't 0:18:03.899,0:18:10.320 necessarily have to load the entire file,[br]so whenever we are touching a page in the 0:18:10.320,0:18:15.780 system, like a memory page, on this machine[br]it's four kilobytes and there are no super pages 0:18:15.780,0:18:21.429 here, so whenever it touches a piece of[br]memory that you didn't bring into memory 0:18:21.429,0:18:25.309 yet, we're generating something that's[br]called a page fault, and then this 0:18:25.309,0:18:31.180 function gets called. So here let's look[br]at the arguments, and I'm going to skip 0:18:31.180,0:18:36.990 the zeroth argument, to look at the first[br]argument. So this is the address that 0:18:36.990,0:18:44.160 provoked that page fault, this is the[br]type and these are the flags and I'm going 0:18:44.160,0:18:48.780 to show you something to make that a[br]little bit more readable. So what about 0:18:48.780,0:18:58.960 this one? So you see it's a pointer and[br]this is a big structure, so we want 0:18:58.960,0:19:09.961 to be able to look at that structure. And[br]just probably should do this here, so 0:19:09.961,0:19:17.090 let's look at this vm_fault script here.[br]So this is, make this a little bit more, 0:19:17.090,0:19:20.950 so this is, don't pay too much attention[br]to this code, this is basically just 0:19:20.950,0:19:26.049 boilerplate to make stuff readable[br]and this is where the actual action is 0:19:26.049,0:19:31.690 happening. So this is, so what I'm doing[br]there is I'm instrumenting the vm_fault 0:19:31.690,0:19:36.350 function and whenever we enter it[br]then we're going to use some information 0:19:36.350,0:19:40.720 that DTrace gives us for free. So this is[br]execname, this is the name of the 0:19:40.720,0:19:45.909 currently running executable that provoked[br]the page fault, this is the process ID and 0:19:45.909,0:19:53.250 here we have a bunch of argument[br]variables. 
So these arg1, arg2, arg3 0:19:53.250,0:19:57.964 are essentially just integers, so[br]nothing too fancy there. But we wanna 0:19:57.964,0:20:02.380 look, wanna be able to look at that[br]struct. And here I'm going to use this 0:20:02.380,0:20:08.140 args array, and this args array[br]is kind of special, because it has typing 0:20:08.140,0:20:15.870 information about the arguments. So when[br]you run that, so you're dereferencing that 0:20:15.870,0:20:26.570 pointer there with the star, excuse me,[br]and let's just run that and maybe, that's 0:20:26.570,0:20:32.899 a start yeah. So this is an in-kernel[br]data structure that we can now look 0:20:32.899,0:20:40.010 at. So DTrace enabled us to look at in-[br]memory data structures as the system runs. 0:20:40.010,0:20:44.330 And this is really really powerful.[br]In the DTrace script I could use all 0:20:44.330,0:20:50.490 these fields like I can manipulate this[br]args array, this value in there, 0:20:50.490,0:20:57.010 just like every other variable; I[br]can pretty much work like I was in C. So 0:20:57.010,0:21:02.659 how is it doing that? There is something[br]that's called CTF, that's not capture the 0:21:02.659,0:21:10.120 flag, it's, this is the, the Compact C[br]Type Format, so you can see that 0:21:10.120,0:21:14.320 there is a man page in FreeBSD, and[br]there's a little segment in the kernel 0:21:14.320,0:21:19.190 binary, where all this typing information[br]is stored. I don't know how that compares 0:21:19.190,0:21:24.320 to modern DWARF but yeah this is what[br]DTrace is working with. So now you might 0:21:24.320,0:21:28.549 ask yourself "Why on earth would I do[br]that? Why on earth would I look at virtual 0:21:28.549,0:21:33.590 memory, because, yeah, um, this stuff is[br]safe isn't it? I mean there's no bugs in 0:21:33.590,0:21:42.820 there." Except when there are. Anyone[br]remember "Dirty COW"? 
So this 0:21:42.820,0:21:48.510 was a very nasty vulnerability in the[br]Linux kernel and that was a problem 0:21:48.510,0:21:52.399 in the virtual memory management. So it[br]allowed you to write to a file that you 0:21:52.399,0:21:56.679 didn't own as a regular user. So you could[br]essentially just write to a binary that 0:21:56.679,0:22:01.789 had "set UID" set. Very unpleasant, but[br]I'm not going to bash the Linux folks 0:22:01.789,0:22:08.030 here, this is just, I just want to show[br]you these things are hard. And the first 0:22:08.030,0:22:15.440 fix for this problem was in 2005 and then[br]it came back in 2016. So now that's fixed 0:22:15.440,0:22:21.080 and then it came back with "Huge Dirty[br]COW" in 2017, so this is, I mean this 0:22:21.080,0:22:27.580 was there for way over a decade.[br]These things are hard to debug. And this 0:22:27.580,0:22:33.110 is what I like about these systems, so not[br]having, not having tools like DTrace to 0:22:33.110,0:22:37.640 figure out what's going on inside of the[br]system somehow, to me, amounts to security 0:22:37.640,0:22:42.360 by obscurity. And I've heard that some[br]people who are developing exploits for 0:22:42.360,0:22:46.100 systems that have DTrace, they say "Oh, I[br]really like developing exploits on these 0:22:46.100,0:22:53.230 systems, because the tooling is so great!"[br]Yeah, but, to be honest this is cool, 0:22:53.230,0:22:58.899 because an exploit is a proof of concept[br]and coming up with these exploits quickly 0:22:58.899,0:23:03.440 is very useful, because you know what's[br]going on, you can show "Hey, this is going 0:23:03.440,0:23:07.279 wrong". I had situations, where[br]people were telling me "Oh, this 0:23:07.279,0:23:11.020 is not a problem with our program, this is[br]this weird operating system that you're 0:23:11.020,0:23:18.100 using. Like Solaris, weird operating[br]system." 
And, yeah, and then I churned out 0:23:18.100,0:23:22.059 some DTrace scripts and "No, it's[br]actually your problem". "Oh, now I can see 0:23:22.059,0:23:31.419 that on my Linux box!" Magic. So,[br]everything I showed you until now was 0:23:31.419,0:23:38.179 very, very much related to function calls[br]and we want to have a little bit more 0:23:38.179,0:23:44.720 semantics here, because you might want to[br]write a script that inspects protocols, 0:23:44.720,0:23:48.760 stuff like TCP, UDP stuff like that. So,[br]you don't want to know which function 0:23:48.760,0:23:54.320 inside of the kernel is responsible for[br]handling your TCP/IP stuff, so DTrace 0:23:54.320,0:24:00.549 comes with something that's called static[br]providers and I'm just going to show the 0:24:00.549,0:24:04.769 apropos here. So these are, so every[br]static provider has a man page which is 0:24:04.769,0:24:10.950 kind of handy - documentation whoo - and[br]you can see there is an I/O provider if 0:24:10.950,0:24:17.539 you are interested in looking at disk[br]I/O, IP for looking at IPv4 and IPv6, 0:24:17.539,0:24:23.570 TCP... This one is pretty cool, it's about[br]scheduling behavior. So, "what does my 0:24:23.570,0:24:29.010 scheduler do?" And if you look at that, you[br]can see some interesting stuff like lend 0:24:29.010,0:24:33.150 priority, if you ever saw things like[br]priority inversion, stuff like that, now 0:24:33.150,0:24:36.970 you can see that happen. I'm a nerd, I[br]find this interesting for some reason, I 0:24:36.970,0:24:43.230 don't know. And it's also pretty[br]interesting to figure out what's going on, 0:24:43.230,0:24:48.279 "why is this getting de-scheduled all the[br]time?" So, some interesting things going 0:24:48.279,0:24:55.809 on there. So, I'm running a little bit[br]short on time here, but I just quickly 0:24:55.809,0:24:59.340 want to show you something - this is all[br]kernel stuff right now - can we do that 0:24:59.340,0:25:05.380 with userspace? 
Of course. So, there was[br]one provider that didn't show up when I 0:25:05.380,0:25:09.590 had my provider listing, but was in the[br]DTrace script where I did this timing 0:25:09.590,0:25:16.230 attack stuff. And that's called the PID[br]provider. And the PID provider generates 0:25:16.230,0:25:21.080 probes on demand, because a process might[br]have a lot of probes and you will shortly 0:25:21.080,0:25:25.190 see why and this is why I'm going to use a[br]very small program which is called "true", 0:25:25.190,0:25:31.560 and true just exits with exit code zero.[br]So, nothing too exciting going on here, 0:25:31.560,0:25:37.810 and this dollar target gets substituted[br]in, we get the process ID there. And this 0:25:37.810,0:25:44.640 is everything that happens when I'm[br]executing this program, you see this is a 0:25:44.640,0:25:48.679 little bit more fine-grained than the FBT[br]provider, because now we can trace every 0:25:48.679,0:25:53.520 single instruction inside of that[br]function, which is kind of handy. It's a 0:25:53.520,0:25:58.090 scriptable debugger. So, these numbers are[br]the instruction offsets inside of that 0:25:58.090,0:26:03.360 function. We can also look at - so this is[br]everything in the true segment - we can 0:26:03.360,0:26:09.899 also look at libraries that got linked in[br]and there's a lot of stuff happening in 0:26:09.899,0:26:15.780 libc for example when you run true.[br]So, one last thing that I wanted to show 0:26:15.780,0:26:22.340 you because it consumed a week of my life:[br]I'm using a lot of Haskell and the Mac OS 0:26:22.340,0:26:29.419 people, they also have DTrace and they[br]have GHC Haskell DTrace support - so the 0:26:29.419,0:26:38.380 Glasgow Haskell compiler - and glorious...[br]they have probes to analyze what's going 0:26:38.380,0:26:41.620 on inside of the runtime system. 
So, I[br]thought "I want to have that, I have 0:26:41.620,0:26:47.019 DTrace, why doesn't it work on FreeBSD?"[br]So, after a week of fighting with make 0:26:47.019,0:26:55.100 files and linkers, that works: If you[br]check out the recent GHC repository and 0:26:55.100,0:27:00.260 build it on FreeBSD, you get all the nice[br]stuff that I'm going to show you now. So, 0:27:00.260,0:27:05.909 this is a very boring program - it just[br]starts 32 green threads and schedules them 0:27:05.909,0:27:10.470 all over the place - and now I can do[br]something like this: phone rings I can 0:27:10.470,0:27:13.934 ring a telephone.[br]laughter 0:27:13.934,0:27:18.750 No, that would be[br]interesting... So, you can also use 0:27:18.750,0:27:26.970 wildcards - and not as name of the probe -[br]and this is what's going on inside, like 0:27:26.970,0:27:31.580 GC garbage collection and all this stuff.[br]Now you can look at this and write useful 0:27:31.580,0:27:37.509 DTrace scripts that also take my runtime[br]system into account. So, stuff like that 0:27:37.509,0:27:41.810 exists for I think Python - I'm not[br]entirely sure because I don't use it - 0:27:41.810,0:27:49.120 nodejs same, Postgres - I used it but not[br]with DTrace right now - and what I find 0:27:49.120,0:27:55.210 interesting: Firefox. When you run[br]JavaScript in your Firefox, it actually 0:27:55.210,0:27:59.360 has a provider, so you can trace[br]JavaScript running in your browser with 0:27:59.360,0:28:05.130 DTrace, so after everything I just showed[br]you, there might be some stuff going on 0:28:05.130,0:28:10.700 there. So yeah, this is basically[br]everything I wanted to show you and I 0:28:10.700,0:28:13.759 think I'm going to wrap up, because[br]otherwise we're not going to have a lot of 0:28:13.759,0:28:19.001 time for questions and maybe you have[br]some. So yeah, thanks. 0:28:19.001,0:28:29.610 applause[br]Herald: Thank you very much Raichoo. 
We 0:28:29.610,0:28:34.257 are actually over time already, but we[br]have two more minutes because we started 0:28:34.257,0:28:38.817 three minutes late, so if there are any[br]really quick questions, possibly from the 0:28:38.817,0:28:43.030 internet... There is one, the signal angel[br]says, let's hear it. 0:28:43.030,0:28:48.013 Question: Yeah, hi, okay. So, the question[br]is, "which changes are actually necessary 0:28:48.013,0:28:51.809 to do in the kernel of an operating system[br]to support DTrace?" 0:28:51.809,0:28:56.370 Answer: That's a lot of work. So, it's not[br]something like you do in a weekend. This 0:28:56.370,0:29:03.062 is... So, the person who started the work[br]on FreeBSD has sadly passed away now, but 0:29:03.062,0:29:09.559 I think they took a couple of years to[br]have everything in place, so you have to 0:29:09.559,0:29:13.730 have stuff like the CTF thing that I[br]showed you, which is what OpenBSD is 0:29:13.730,0:29:19.890 currently working on. And then you need[br]all those magic gizmos, like kernel 0:29:19.890,0:29:25.660 modules and stuff like that. So, it takes[br]a lot of time, but it's been ported to 0:29:25.660,0:29:30.889 most operating systems that are available[br]and in use right now. So yeah, hope this 0:29:30.889,0:29:34.239 answers the question.[br]Herald: Excellent and there are no more 0:29:34.239,0:29:38.839 questions here in the room. I will thank[br]Raichoo and you can find him outside of 0:29:38.839,0:29:46.590 the room and also on Twitter at "raichoo"[br]if you have any further questions. 0:29:46.590,0:29:51.405 postroll music 0:29:51.405,0:30:08.000 subtitles created by c3subtitles.de[br]in the year 2020. Join, and help us!