WEBVTT 00:00:17.300 --> 00:00:19.960 AARON SUGGS: All right. Can people hear OK? 00:00:19.970 --> 00:00:21.369 I'll go ahead and get started. 00:00:21.369 --> 00:00:24.050 So this talk is Rack::Attack and 00:00:24.050 --> 00:00:27.380 how to protect your app with this one weird gem. 00:00:27.380 --> 00:00:30.930 Where does Rack::Attack come from? We built it at 00:00:30.930 --> 00:00:34.400 KickStarter. If you haven't heard of KickStarter, it is 00:00:34.400 --> 00:00:37.610 a funding platform for creative projects. So somebody has 00:00:37.610 --> 00:00:40.690 an idea for a film, a comic book, an 00:00:40.690 --> 00:00:43.829 open source project, a gadget. They, they put their 00:00:43.829 --> 00:00:46.720 project up on our site. They can offer rewards 00:00:46.720 --> 00:00:50.430 for various pledge levels. Their friends, family, strangers on 00:00:50.430 --> 00:00:53.370 the internet come and can, can give them money. 00:00:53.370 --> 00:00:56.050 At the end of the deadline, if they've reached 00:00:56.050 --> 00:00:57.550 their funding goal and so they have enough to 00:00:57.550 --> 00:01:00.750 reach their project, that's when we process the transactions 00:01:00.750 --> 00:01:02.800 and the creators' get the funds they need to, 00:01:02.800 --> 00:01:03.980 to do the project. 00:01:03.980 --> 00:01:06.939 To give you a sense of scale for what 00:01:06.939 --> 00:01:09.000 we do, we, we recently crossed over a billion 00:01:09.000 --> 00:01:11.530 dollars pledged to the site. It's over a million 00:01:11.530 --> 00:01:14.960 dollars a day. And it's gone to over 60,000 00:01:14.960 --> 00:01:17.079 creative projects. 00:01:17.079 --> 00:01:21.759 Quick introduction. My name's Aaron Suggs. I go by 00:01:21.759 --> 00:01:24.909 ktheory on social media. I love dancing in my 00:01:24.909 --> 00:01:29.960 bear outfit. And I'm the operations engineer at KickStarter. 00:01:29.960 --> 00:01:33.409 We, we have a very dev ops-y style workflow. 
00:01:33.409 --> 00:01:35.450 So, so it means I end up writing a 00:01:35.450 --> 00:01:36.880 lot of Ruby code, and I love writing Ruby 00:01:36.880 --> 00:01:37.880 code. 00:01:37.880 --> 00:01:41.979 So, so Rack::Attack is, is a tool I wrote, 00:01:41.979 --> 00:01:46.259 and it's Rack middleware for blocking and throttling abusive 00:01:46.259 --> 00:01:49.329 requests. What do we mean by abusive requests? These 00:01:49.329 --> 00:01:52.159 can be things like malicious attackers trying to take 00:01:52.159 --> 00:01:55.369 down your site, doing things like trying to crack 00:01:55.369 --> 00:01:58.509 user accounts or get sensitive information, or it can 00:01:58.509 --> 00:02:02.049 be naively written scrapers, who are just, like, people 00:02:02.049 --> 00:02:04.600 on the internet doing weird things as they are 00:02:04.600 --> 00:02:07.909 prone to do, and that's cool, but sometimes it, 00:02:07.909 --> 00:02:09.788 it is a lot of traffic. It's a lot 00:02:09.788 --> 00:02:12.300 of resources for your app to try to handle, 00:02:12.300 --> 00:02:16.310 and Rack::Attack is a very elegant DSL and, and 00:02:16.310 --> 00:02:18.560 way for dealing with these sorts of things. Sort 00:02:18.560 --> 00:02:22.190 of constraining their behavior so your website stays up. 00:02:22.190 --> 00:02:27.020 Rack::Attack is on GitHub at slash kickstarter slash rack-attack. 00:02:27.020 --> 00:02:30.160 It's an open source Ruby gem. There's a README, 00:02:30.160 --> 00:02:34.050 sort of exactly like what you'd expect. 00:02:34.050 --> 00:02:36.860 So the big wins that KickStarter has gotten from 00:02:36.860 --> 00:02:40.250 using Rack::Attack, and the reason we developed it, was 00:02:40.250 --> 00:02:42.950 we wanted to increase our performance. So, so this 00:02:42.950 --> 00:02:46.390 is like site performance. 
We, we had problems with 00:02:46.390 --> 00:02:49.720 sort of abusive requests making our website slow because 00:02:49.720 --> 00:02:52.530 they were using up too many app servers CP. 00:02:52.530 --> 00:02:54.500 Too much app server CPU, or too much, too 00:02:54.500 --> 00:02:58.140 many database resources. By sort of constraining them we 00:02:58.140 --> 00:03:00.610 were able to make the website faster for the 00:03:00.610 --> 00:03:03.450 sort of, the most important requests. Like people coming 00:03:03.450 --> 00:03:06.160 on, wanting to watch videos, wanting to pledge money. 00:03:06.160 --> 00:03:07.660 Not people just trying to scrape down the entire 00:03:07.660 --> 00:03:08.270 site. 00:03:08.270 --> 00:03:11.940 We also improved our availability. Because sometimes these requests 00:03:11.940 --> 00:03:14.460 were, were so much, there were so many that 00:03:14.460 --> 00:03:15.920 they would take down the site, or there would 00:03:15.920 --> 00:03:19.860 just be some weird incident and, we, right. It, 00:03:19.860 --> 00:03:22.540 it hurt our availability. 00:03:22.540 --> 00:03:25.840 But the biggest win that we had was developer 00:03:25.840 --> 00:03:30.370 happiness. Because dealing with these sort of bad actors 00:03:30.370 --> 00:03:33.940 on the internet, especially if it means, like, your, 00:03:33.940 --> 00:03:35.970 your site's going down or like, the, you know, 00:03:35.970 --> 00:03:38.960 you need to scale up because somebody's doing something 00:03:38.960 --> 00:03:41.750 weird, that can really interrupt a lot of developers. 00:03:41.750 --> 00:03:44.230 It can, it can sort of derail your product 00:03:44.230 --> 00:03:47.320 roadmap. We want to be writing cool features 00:03:47.320 --> 00:03:50.190 and Rack::Attack was a great DSL to let us 00:03:50.190 --> 00:03:53.330 spend less time thinking about that stuff and more 00:03:53.330 --> 00:03:54.850 time doing the stuff that we, that we like 00:03:54.850 --> 00:03:55.430 doing. 
00:03:55.430 --> 00:03:58.960 So let me talk about the origin story for 00:03:58.960 --> 00:04:01.260 Rack::Attack. Like, what happened at KickStarter that made us 00:04:01.260 --> 00:04:05.060 realize we, we needed this? Let's rewind to the 00:04:05.060 --> 00:04:10.790 summer of 2012. 00:04:10.790 --> 00:04:12.510 And this happened. So this is a story in 00:04:12.510 --> 00:04:15.930 a graph. So the blue line, I hope it 00:04:15.930 --> 00:04:19.850 shows up pretty well. Cool. Is our regular successful 00:04:19.850 --> 00:04:22.000 logins. People typing in an email and password and 00:04:22.000 --> 00:04:24.919 us being like, OK, you are logged in. You 00:04:24.919 --> 00:04:26.830 know, it ebbs and flows throughout the day. 00:04:26.830 --> 00:04:30.620 Suddenly, one Sun, one Saturday afternoon, we just get 00:04:30.620 --> 00:04:34.460 so many of these, like, bad login requests, and 00:04:34.460 --> 00:04:36.500 for a while we're like, what's going on? Did we 00:04:36.500 --> 00:04:39.220 deploy a feature that broke login? No. Somebody is 00:04:39.220 --> 00:04:42.090 trying to, to crack our user accounts. They're just 00:04:42.090 --> 00:04:45.470 like guessing email addresses and passwords as fast as 00:04:45.470 --> 00:04:48.850 they can, from several different IP addresses. 00:04:48.850 --> 00:04:51.810 So, as the ops guy, this is sort of 00:04:51.810 --> 00:04:54.000 on my plate. I'm like, OK, well, I gotta 00:04:54.000 --> 00:04:55.560 stop this. This is bad for the site for 00:04:55.560 --> 00:04:58.650 this to be going on. 
So I wrote a 00:04:58.650 --> 00:05:01.740 pretty nasty before filter for our login action, that's 00:05:01.740 --> 00:05:04.560 like, you know, keep a counter in memcache and, 00:05:04.560 --> 00:05:07.320 you know, if it's too many like, like, give 00:05:07.320 --> 00:05:11.480 them an error page and it was, it was 00:05:11.480 --> 00:05:14.630 kind of a sucky experience, because I was changing 00:05:14.630 --> 00:05:17.480 a really critical feature of our site, sort of 00:05:17.480 --> 00:05:20.120 under duress of, of knowing that I needed to 00:05:20.120 --> 00:05:22.980 get it out there quickly. And it was sort 00:05:22.980 --> 00:05:24.440 of like a big change, and in the pull 00:05:24.440 --> 00:05:26.620 request I was, I was apologetic, being like, I 00:05:26.620 --> 00:05:28.220 know this is badly tested and it's like a 00:05:28.220 --> 00:05:29.850 nasty code change, but we've got to get it 00:05:29.850 --> 00:05:33.700 out fast because this, this event's going on. 00:05:33.700 --> 00:05:36.930 And, so that, so we did that. And then 00:05:36.930 --> 00:05:39.680 sort of in the cold light of day, I 00:05:39.680 --> 00:05:42.050 reflected a little bit and I thought, we need 00:05:42.050 --> 00:05:47.440 a more elegant way to prevent bad requests. This 00:05:47.440 --> 00:05:50.280 is, it's not just gonna be about this login 00:05:50.280 --> 00:05:51.669 attack. This is gonna be about a whole class 00:05:51.669 --> 00:05:54.250 of problems that we might have on the site. 00:05:54.250 --> 00:05:57.639 You know, I should say, too, with that login 00:05:57.639 --> 00:06:00.310 attack, it was something that we sort of always 00:06:00.310 --> 00:06:02.510 imagined that, like, oh yeah, of course we should, 00:06:02.510 --> 00:06:05.020 like, throttle login requests. We just hadn't ever gotten 00:06:05.020 --> 00:06:06.669 around to it. 
You know, it was in our 00:06:06.669 --> 00:06:09.770 ticketing system as like a low-priority someday somebody should 00:06:09.770 --> 00:06:13.300 do this thing. And having it actually happen was 00:06:13.300 --> 00:06:15.460 like, OK, now we gotta do it right now. 00:06:15.460 --> 00:06:18.570 So, we realized, like, we need this generic tool 00:06:18.570 --> 00:06:23.810 to stop bad requests. And really, there's already, in 00:06:23.810 --> 00:06:25.810 the Ruby world, a great solution for this, and 00:06:25.810 --> 00:06:29.340 it's Rack middleware. So now we get to the 00:06:29.340 --> 00:06:32.020 code section of the talk. Here comes some code. 00:06:32.020 --> 00:06:33.620 Get ready. 00:06:33.620 --> 00:06:35.840 This is an example of, like, the most basic 00:06:35.840 --> 00:06:38.190 Rack middleware. Just, really quick, for, for people who 00:06:38.190 --> 00:06:41.389 might not be familiar with it. So middleware is 00:06:41.389 --> 00:06:45.650 basically like hugging your application, wrapping around so, so 00:06:45.650 --> 00:06:47.430 you, you have your Rails app or your Sinatra 00:06:47.430 --> 00:06:52.350 app, that is the app in this case. And 00:06:52.350 --> 00:06:53.940 you want to do things, you want to sort 00:06:53.940 --> 00:06:56.440 of be able to do things to the request 00:06:56.440 --> 00:06:58.590 that's coming in from the client. That's the env. 00:06:58.590 --> 00:07:01.560 So every, every request from a client is gonna 00:07:01.560 --> 00:07:03.260 do this call method where you pass in the 00:07:03.260 --> 00:07:06.050 environment. The environment is, like, I don't know, what 00:07:06.050 --> 00:07:09.080 page the client wants or what their cookie is 00:07:09.080 --> 00:07:11.710 and, and all that information. 00:07:11.710 --> 00:07:14.460 And so the real magic of Rack middleware is 00:07:14.460 --> 00:07:16.740 it lets you do stuff here with, with the 00:07:16.740 --> 00:07:19.010 requests. 
Like, you can block it in the case 00:07:19.010 --> 00:07:23.040 of Rack::Attack, potentially. Or you can do stuff with 00:07:23.040 --> 00:07:26.340 the response. You can log it. You can cache 00:07:26.340 --> 00:07:27.570 it. Stuff like that. 00:07:27.570 --> 00:07:29.020 So this, so this is just a great pattern 00:07:29.020 --> 00:07:34.100 for managing, for sort of making easy architectures to 00:07:34.100 --> 00:07:38.630 do stuff with HTTP requests. So in Rack::Attack's case, 00:07:38.630 --> 00:07:40.880 this is a sort of simplified version of the 00:07:40.880 --> 00:07:45.440 Rack::Attack call method. We say, for this request, should 00:07:45.440 --> 00:07:47.889 we allow it? If so, go ahead and pass 00:07:47.889 --> 00:07:51.889 it onto your application. Your application is gonna do, 00:07:51.889 --> 00:07:53.330 potentially, a lot of work. 00:07:53.330 --> 00:07:55.960 Maybe it's gonna spend a couple hundred milliseconds, like, 00:07:55.960 --> 00:07:59.300 querying the database and rendering views and stuff like 00:07:59.300 --> 00:08:01.910 that. So that's the expensive work that we want 00:08:01.910 --> 00:08:05.150 to save if the, if this is an abusive 00:08:05.150 --> 00:08:07.580 request. So, so if we shouldn't allow it, then 00:08:07.580 --> 00:08:11.490 we just return back this very fast access-denied as 00:08:11.490 --> 00:08:14.570 a very simple and fast response to render. 00:08:14.570 --> 00:08:18.500 Rack::Attack can do several hundred of these access denied 00:08:18.500 --> 00:08:21.870 requests per, like, thread that you have running. So 00:08:21.870 --> 00:08:25.130 like, per unicorn worker or per Heroku instance or 00:08:25.130 --> 00:08:26.580 something like that. 00:08:26.580 --> 00:08:29.870 But, so, that's what you get for, when you 00:08:29.870 --> 00:08:32.399 just use the Rack middleware for free. So, so 00:08:32.399 --> 00:08:34.828 we don't yet know what this should_allow method should 00:08:34.828 --> 00:08:36.419 be. 
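The allow-or-deny shape of the call method described above can be sketched in plain Ruby. This is a minimal illustration, not Rack::Attack's actual source: the class name `BlockingMiddleware`, the `should_allow` block, and the example IP address are assumptions made for the sketch. It only relies on the Rack convention that an app responds to `call(env)` and returns a status, headers, and body.

```ruby
# Minimal Rack-style middleware sketch (hypothetical names, not Rack::Attack's code).
# Rack only requires an object that responds to call(env) and returns
# [status, headers, body].
class BlockingMiddleware
  def initialize(app, &should_allow)
    @app = app
    @should_allow = should_allow # hypothetical policy hook: env -> true/false
  end

  def call(env)
    if @should_allow.call(env)
      # Expensive path: hand the request to the wrapped application.
      @app.call(env)
    else
      # Cheap path: answer immediately without touching the application.
      [403, { "content-type" => "text/plain" }, ["Forbidden\n"]]
    end
  end
end

# Stand-in for a Rails or Sinatra app.
app = ->(env) { [200, { "content-type" => "text/plain" }, ["Hello\n"]] }

# Example policy, assumed for illustration only: deny a single IP address.
stack = BlockingMiddleware.new(app) { |env| env["REMOTE_ADDR"] != "10.0.0.66" }

puts stack.call({ "REMOTE_ADDR" => "10.0.0.1" }).first
puts stack.call({ "REMOTE_ADDR" => "10.0.0.66" }).first
```

The denied branch never calls the wrapped app, which is where the savings on database queries and view rendering come from.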
That's code that you sort of have to 00:08:36.419 --> 00:08:39.299 configure yourself, of what do you want to throttle 00:08:39.299 --> 00:08:40.000 on. 00:08:40.000 --> 00:08:43.229 So that looks like this. This is sort of 00:08:43.229 --> 00:08:46.070 a generic throttle that you might put in your, 00:08:46.070 --> 00:08:51.430 in an initializer to configure Rack::Attack. The important stuff 00:08:51.430 --> 00:08:53.300 that's going on here is we are calling the 00:08:53.300 --> 00:08:57.089 throttle class method on Rack::Attack, so that's just something 00:08:57.089 --> 00:09:00.200 we expose to let you plug into the middleware. 00:09:00.200 --> 00:09:02.149 We give it a name, in this case it's 00:09:02.149 --> 00:09:04.950 the, we, we named the throttle IP. This is 00:09:04.950 --> 00:09:08.080 gonna determine how we track it. And that just 00:09:08.080 --> 00:09:11.210 has to be unique throughout your application. We're gonna 00:09:11.210 --> 00:09:13.140 give it a limit and a period. And so 00:09:13.140 --> 00:09:15.649 that's how much, the, the period is how many 00:09:15.649 --> 00:09:18.480 seconds we're gonna be considering for the throttle, and 00:09:18.480 --> 00:09:20.390 the limit is sort of your quota for how 00:09:20.390 --> 00:09:23.180 many requests you get to make during that time. 00:09:23.180 --> 00:09:25.320 So in this case, it's ten requests every five 00:09:25.320 --> 00:09:30.920 seconds. For the arithmetically inclined, you'll notice that this 00:09:30.920 --> 00:09:33.630 is not like a reduced fraction. We could say 00:09:33.630 --> 00:09:37.040 two requests every one second. The advantage of doing 00:09:37.040 --> 00:09:38.790 a higher multiple is that, like, it allows a 00:09:38.790 --> 00:09:42.649 little burstiness. So these periods are basically dividing time 00:09:42.649 --> 00:09:46.210 up into these, like, five second long buckets. 
So 00:09:46.210 --> 00:09:49.399 in between zero seconds and five seconds after 00:09:49.399 --> 00:09:51.940 the minute, like, in that window, you're allowed to 00:09:51.940 --> 00:09:54.200 make up to ten requests. 00:09:54.200 --> 00:09:57.660 And so by having bigger multiples in bigger windows, 00:09:57.660 --> 00:10:00.839 you can sort of allow for some burstiness, 00:10:00.839 --> 00:10:03.740 but the long-term average stays the same. Like, long 00:10:03.740 --> 00:10:07.300 term, nobody's gonna make more requests than two every 00:10:07.300 --> 00:10:09.240 one second. 00:10:09.240 --> 00:10:12.140 OK, so what's going on? We got the, the 00:10:12.140 --> 00:10:14.690 class method. We got the name. We have the 00:10:14.690 --> 00:10:17.390 limit and the period. And then to this block, 00:10:17.390 --> 00:10:20.730 we are passing along the request. Now, in the 00:10:20.730 --> 00:10:23.320 earlier middleware example we talked, we called this the 00:10:23.320 --> 00:10:25.830 env, which was just like the, the environment hash 00:10:25.830 --> 00:10:29.240 that comes from the request. Request is just like 00:10:29.240 --> 00:10:33.740 a light little Rack request object wrapped around the 00:10:33.740 --> 00:10:36.770 environment that just sort of gives you methods, instance 00:10:36.770 --> 00:10:39.510 methods to call, like dot IP or dot host 00:10:39.510 --> 00:10:41.170 or dot path or something like that. It just 00:10:41.170 --> 00:10:46.149 sort of, you use these in Rails controllers, too. 00:10:46.149 --> 00:10:49.820 So it's just like a lightly-wrapped request. And then 00:10:49.820 --> 00:10:52.100 inside the block, what the block returns is the 00:10:52.100 --> 00:10:54.640 sort of really important part. That's the discriminator that 00:10:54.640 --> 00:10:58.140 determines how we're gonna bucket up these throttles. 
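The throttle described above (a name, a limit, a period, and a block that returns the discriminator) would look roughly like the following configuration fragment. It assumes the rack-attack gem is installed and its middleware is enabled. The file path in the comment is an assumption, and the exact slide code from the talk is not reproduced here.

```ruby
# config/initializers/rack_attack.rb (file location is an assumption)
# Configuration fragment: requires the rack-attack gem to be loaded.

# "ip" is the throttle's name; it must be unique within the application.
# limit: 10, period: 5 means at most ten requests per five-second window.
Rack::Attack.throttle("ip", limit: 10, period: 5) do |req|
  # The block's return value is the discriminator: each distinct value
  # gets its own counter. Here, one counter per client IP address.
  req.ip
end
```

Returning `req.ip` keeps a separate quota for every client address. Returning a different value, such as an API token, would bucket requests by that value instead.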
So 00:10:58.140 --> 00:11:00.930 in this case we are gonna say every IP 00:11:00.930 --> 00:11:03.649 address, every distinct IP address is going to get 00:11:03.649 --> 00:11:06.830 its own throttle limit. But we could throttle by 00:11:06.830 --> 00:11:09.620 something else. We could throttle by a parameter or 00:11:09.620 --> 00:11:14.459 a host name or something like that, or an 00:11:14.459 --> 00:11:16.130 API token. 00:11:16.130 --> 00:11:18.490 And one thing to note with these discriminators, too, 00:11:18.490 --> 00:11:21.459 is like, if this would, this is returning a 00:11:21.459 --> 00:11:24.050 string, so it's always gonna be a truthy value, 00:11:24.050 --> 00:11:26.700 and truthy values sort of enable the, the throttling. 00:11:26.700 --> 00:11:29.029 Like, we are gonna throttle these requests as long 00:11:29.029 --> 00:11:32.190 as there's an IP address, and there always is. 00:11:32.190 --> 00:11:34.670 If we would return nil or a falsey value, 00:11:34.670 --> 00:11:36.540 we just sort of let the request go through 00:11:36.540 --> 00:11:38.740 and we're not gonna throttle it. I'll talk about 00:11:38.740 --> 00:11:41.790 why we might want to do that later. But, 00:11:41.790 --> 00:11:43.709 so now we have this issue of throttle state. 00:11:43.709 --> 00:11:46.420 Like, we have these counters per IP address that 00:11:46.420 --> 00:11:48.320 we need to track. 00:11:48.320 --> 00:11:50.560 And so, so where do we store that? A 00:11:50.560 --> 00:11:53.120 pretty elegant and simple and obvious place for that 00:11:53.120 --> 00:11:56.880 was our Rails cache. So when you just use 00:11:56.880 --> 00:11:59.100 Rack::Attack by default, if you have a Rails cache, 00:11:59.100 --> 00:12:02.450 it's gonna use it. But, it really works best 00:12:02.450 --> 00:12:05.779 with memcache or redis. So, so I hope you're 00:12:05.779 --> 00:12:09.050 using that as your Rails cache. 
But if you're 00:12:09.050 --> 00:12:10.890 not, like, there are ways that you can build 00:12:10.890 --> 00:12:12.510 your own, or sort of like plug in a, 00:12:12.510 --> 00:12:14.670 a different cache store. 00:12:14.670 --> 00:12:17.070 The great advantage about memcache and redis is that 00:12:17.070 --> 00:12:21.209 they have really good support for atomically incrementing counters, 00:12:21.209 --> 00:12:22.720 and that's the sort of key feature we'd need 00:12:22.720 --> 00:12:26.269 behind the scenes. So now we're imagining for, for 00:12:26.269 --> 00:12:28.209 every request that comes in, we need to sort 00:12:28.209 --> 00:12:31.050 of increment the counter per IP address. 00:12:31.050 --> 00:12:32.370 And so how do we do that? Like what's, 00:12:32.370 --> 00:12:35.250 what's the algorithm? So this is the nitty gritty 00:12:35.250 --> 00:12:40.100 of how Rack::Attack works. How it constructs that key. 00:12:40.100 --> 00:12:43.060 So remember how we divided the minute up into 00:12:43.060 --> 00:12:46.810 like little buckets depending on our period. So, so 00:12:46.810 --> 00:12:48.790 to do that, we sort of take the current 00:12:48.790 --> 00:12:53.620 second. We construct a key that is the name 00:12:53.620 --> 00:12:57.380 of our request, like IP in this case. We 00:12:57.380 --> 00:12:59.490 take the time divided by the period, so this 00:12:59.490 --> 00:13:02.660 means that that middle component is going to be, 00:13:02.660 --> 00:13:05.350 is going to increment every five seconds. It's gonna, 00:13:05.350 --> 00:13:07.570 so it's, the key's gonna change. 00:13:07.570 --> 00:13:09.360 And then the final part is that block return 00:13:09.360 --> 00:13:12.170 value. So in this case it's the IP address 00:13:12.170 --> 00:13:15.149 of the request. But maybe it's an API token 00:13:15.149 --> 00:13:17.170 or something like that. 
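The key construction described above (throttle name, then the time divided by the period, then the block's return value) can be sketched as a small helper. The method name `throttle_key` and the `:` separator are assumptions for illustration. The gem's real key format may differ, for example by adding a prefix.

```ruby
# Builds a counter key from the three parts described in the talk.
# Hypothetical helper: the name and the ":" separator are assumptions.
def throttle_key(name, period, discriminator, now: Time.now.to_i)
  # Integer division: this component only changes once per period,
  # so every request in the same window shares the same key.
  window = now / period
  "#{name}:#{window}:#{discriminator}"
end

# With a five-second period, seconds 100..104 fall in window 20
# and second 105 starts window 21.
puts throttle_key("ip", 5, "1.2.3.4", now: 100) # => ip:20:1.2.3.4
puts throttle_key("ip", 5, "1.2.3.4", now: 104) # => ip:20:1.2.3.4
puts throttle_key("ip", 5, "1.2.3.4", now: 105) # => ip:21:1.2.3.4
```

Because the window number is part of the key, a new period starts a fresh counter with no explicit reset.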
00:13:17.170 --> 00:13:18.720 So at the end of it, we have this 00:13:18.720 --> 00:13:21.990 key that changes every couple seconds. Every time, like, 00:13:21.990 --> 00:13:24.720 the period rotates, and this ends up being a 00:13:24.720 --> 00:13:27.450 very efficient use case, a very efficient use of 00:13:27.450 --> 00:13:30.880 memcache or redis. Like, this is, storing all this 00:13:30.880 --> 00:13:34.010 information is gonna take, like, a couple megabytes. It's 00:13:34.010 --> 00:13:36.000 like, don't worry about the impact on your cache 00:13:36.000 --> 00:13:39.100 store in pretty much every scenario. 00:13:39.100 --> 00:13:41.339 To make it even more efficient use of your 00:13:41.339 --> 00:13:45.910 cache store, we set an expire rate, so that 00:13:45.910 --> 00:13:48.110 in that, like, in that bucket window of, say, 00:13:48.110 --> 00:13:50.420 zero to five seconds, we're gonna say that all 00:13:50.420 --> 00:13:53.100 those cache keys expire at five seconds. So at 00:13:53.100 --> 00:13:56.990 the same moment that the cache keys change, they 00:13:56.990 --> 00:13:59.680 also expire. So memcache or redis just ends up 00:13:59.680 --> 00:14:03.200 reusing the same memory blocks over and over. You 00:14:03.200 --> 00:14:06.339 don't have, even though there's changing, they're changing in 00:14:06.339 --> 00:14:08.399 memory, you don't have as much churn as you 00:14:08.399 --> 00:14:11.300 would otherwise. 00:14:11.300 --> 00:14:13.890 And so then the Rack middleware is really doing 00:14:13.890 --> 00:14:16.310 pretty simple stuff of we're saying, for whatever your 00:14:16.310 --> 00:14:19.870 cache is, increment this key with this expire rate. 00:14:19.870 --> 00:14:21.459 That's gonna give us back the count of how 00:14:21.459 --> 00:14:23.610 many requests that have been made that, that match 00:14:23.610 --> 00:14:26.610 that throttle. 
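Putting the pieces above together (window key, expiry at the end of the window, increment, compare against the limit) gives roughly the following. This is a sketch under stated assumptions, not the gem's implementation: `CounterStore` is a hypothetical in-memory stand-in for memcache or redis, `FixedWindowThrottle` is a hypothetical name, and the increment here is not atomic the way a real cache's increment is.

```ruby
# In-memory stand-in for memcache/redis (hypothetical, single-threaded only).
class CounterStore
  def initialize
    @entries = {} # key => [count, expires_at]
  end

  # Increments the key and returns the new count.
  # A missing or expired key starts again from 1.
  def increment(key, expires_at:, now:)
    count, expiry = @entries[key]
    count = 0 if expiry.nil? || now >= expiry
    @entries[key] = [count + 1, expires_at]
    count + 1
  end
end

# Fixed-window throttle following the steps described in the talk.
class FixedWindowThrottle
  def initialize(name:, limit:, period:, store: CounterStore.new)
    @name = name
    @limit = limit
    @period = period
    @store = store
  end

  # Returns true when this request exceeds the limit for its window.
  def throttled?(discriminator, now: Time.now.to_i)
    return false if discriminator.nil? # nil discriminator: request is not tracked

    window = now / @period
    key = "#{@name}:#{window}:#{discriminator}"
    expires_at = (window + 1) * @period # key expires when the window ends
    @store.increment(key, expires_at: expires_at, now: now) > @limit
  end
end

throttle = FixedWindowThrottle.new(name: "ip", limit: 10, period: 5)
blocked = (1..12).map { throttle.throttled?("203.0.113.7", now: 100) }
puts blocked.count(true) # => 2 (requests 11 and 12 are over the limit)
```

A production store would also evict expired keys. This sketch leaves them in the hash to stay short.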
And if it's more than our limit, 00:14:26.610 --> 00:14:29.680 we're gonna return that access denied response. 00:14:29.680 --> 00:14:32.750 So, we rolled this out. You know, we're able 00:14:32.750 --> 00:14:36.920 to have this global throttle per IP address. We 00:14:36.920 --> 00:14:40.779 start making a couple other, other features, and it 00:14:40.779 --> 00:14:44.029 was about a year later when we had a, 00:14:44.029 --> 00:14:46.930 the sort of redux of, of a new event 00:14:46.930 --> 00:14:48.980 that put Rack::Attack to the test. 00:14:48.980 --> 00:14:52.040 So, a new challenger emerges in the summer of 00:14:52.040 --> 00:14:57.899 2013. This was a script called kicksniper dot py. 00:14:57.899 --> 00:15:02.079 And this revealed a pretty interesting behavior on KickStarter 00:15:02.079 --> 00:15:05.290 that we call reward sniping. Actually, kicksniper dot py 00:15:05.290 --> 00:15:08.790 refers to it in the code as reward sniping. 00:15:08.790 --> 00:15:12.110 And so, this is, this is an, an interesting 00:15:12.110 --> 00:15:14.920 behavior because. So I told you how KickStarter offers 00:15:14.920 --> 00:15:17.800 these rewards. They can be limited rewards. So a 00:15:17.800 --> 00:15:20.170 creator says, I'm only gonna give away, like, a 00:15:20.170 --> 00:15:24.120 hundred of these, and first come, first serve. 00:15:24.120 --> 00:15:26.980 So, there's a, a pretty popular project where it 00:15:26.980 --> 00:15:28.700 was like a video game and, and the video 00:15:28.700 --> 00:15:31.750 game was offering these reward tiers that would be, 00:15:31.750 --> 00:15:33.980 like, for fifty bucks, you get, like, the silver 00:15:33.980 --> 00:15:35.790 level package, and for a hundred bucks you get 00:15:35.790 --> 00:15:37.600 the gold package, and so on and so, like, 00:15:37.600 --> 00:15:41.459 ever more deluxe and expensive packages. And they were 00:15:41.459 --> 00:15:42.720 all very much in demand. 
00:15:42.720 --> 00:15:47.230 So the early reward tiers like sold-out super fast. 00:15:47.230 --> 00:15:50.089 And then occasionally, somebody in, who had those early 00:15:50.089 --> 00:15:53.070 reward tiers, would decide they're gonna splurge and they're 00:15:53.070 --> 00:15:54.870 gonna upgrade. They're gonna change their pledge to a 00:15:54.870 --> 00:15:58.600 higher one, and now for that moment, like, there's 00:15:58.600 --> 00:16:01.430 now one available of the lower tier. And so 00:16:01.430 --> 00:16:04.980 people were like hitting refresh, refresh, refresh, hoping that 00:16:04.980 --> 00:16:07.769 they just noticed when somebody, when somebody had changed 00:16:07.769 --> 00:16:09.350 their pledge and now there was one of these 00:16:09.350 --> 00:16:12.410 highly desirable lower-tier pledges available. 00:16:12.410 --> 00:16:18.420 Some entrepreneur, enterprising Python developer, says, I will make 00:16:18.420 --> 00:16:21.959 a script that does this for me. Sure enough, 00:16:21.959 --> 00:16:24.940 so, so he writes kicksniper dot py that's, that's 00:16:24.940 --> 00:16:27.019 in a tight loop, trying to change his pledge 00:16:27.019 --> 00:16:29.089 on our site. Saying, like, let me get that, 00:16:29.089 --> 00:16:32.260 that early reward tier. You know, our ActiveRecord validations 00:16:32.260 --> 00:16:34.050 were working fine and we said, no, you can't 00:16:34.050 --> 00:16:36.470 change your pledge to that the vast majority of 00:16:36.470 --> 00:16:39.730 the time, but, but eventually he got through and 00:16:39.730 --> 00:16:41.100 was able to get the pledge. 00:16:41.100 --> 00:16:43.089 It was such a great success that he goes 00:16:43.089 --> 00:16:45.889 on all the forums and says, hey, everybody just 00:16:45.889 --> 00:16:50.740 run this, like, Python script on your laptop and 00:16:50.740 --> 00:16:53.300 you, too, might look, luck out and get one 00:16:53.300 --> 00:16:56.220 of these highly desirable earlier reward tiers. 
00:16:56.220 --> 00:17:01.360 So let's tell this story in a graph. So, 00:17:01.360 --> 00:17:04.089 this is our master database CPU over the course 00:17:04.089 --> 00:17:05.929 of a, of a day or so. We see 00:17:05.929 --> 00:17:07.919 at the very beginning, it starts off between ten 00:17:07.919 --> 00:17:10.760 and fifteen percent. That's my happy place. That's where 00:17:10.760 --> 00:17:12.039 I like it to be. We have plenty of 00:17:12.039 --> 00:17:15.148 headroom for like, you know, big projects to 00:17:15.148 --> 00:17:16.949 sort of blow up on the site, as they 00:17:16.949 --> 00:17:18.980 do from time to time. 00:17:18.980 --> 00:17:21.490 And, I honestly didn't really notice that it had 00:17:21.490 --> 00:17:24.339 been creeping up over the course of the day. 00:17:24.339 --> 00:17:28.000 Thursday morning, it crossed thirty percent, and that's when 00:17:28.000 --> 00:17:30.940 I get a CPU alert threshold. So it, so 00:17:30.940 --> 00:17:32.590 in fact, the whole dev team gets this email 00:17:32.590 --> 00:17:34.910 being like, hey, the master database CPU is pretty 00:17:34.910 --> 00:17:37.260 high. You guys should check that out. 00:17:37.260 --> 00:17:41.320 So, what do we, you know, we, we spend 00:17:41.320 --> 00:17:43.070 a little time, we're like, why is the database 00:17:43.070 --> 00:17:44.870 so high? Well, you know, it looks like there 00:17:44.870 --> 00:17:46.870 are a crazy number of requests trying to change 00:17:46.870 --> 00:17:49.600 their pledge for this one project. 00:17:49.600 --> 00:17:52.650 We, we're able to sort of construct this back 00:17:52.650 --> 00:17:54.530 story and, like, see what was happening on the 00:17:54.530 --> 00:17:57.090 database CPU. We see the forum posts where everybody's 00:17:57.090 --> 00:18:02.280 like, thank you for kicksniper dot py. And so, 00:18:02.280 --> 00:18:03.919 and we're like, all right, so, so how are 00:18:03.919 --> 00:18:06.390 we gonna handle this? 
Like, is it really that 00:18:06.390 --> 00:18:08.990 important that people are able to try to change 00:18:08.990 --> 00:18:11.400 their pledge like multiple times a second? 00:18:11.400 --> 00:18:13.500 What if they only could change their pledge every 00:18:13.500 --> 00:18:17.049 couple seconds? Right, like, I guess that's fair enough 00:18:17.049 --> 00:18:19.130 to the, like, there's this question of, like, what's 00:18:19.130 --> 00:18:22.030 the fairest way to allocate the scarce resources of, 00:18:22.030 --> 00:18:24.500 of like the pledge as soon as it's available. 00:18:24.500 --> 00:18:26.720 I kind of don't care about the answer. Anybody 00:18:26.720 --> 00:18:28.450 can get it. 00:18:28.450 --> 00:18:32.410 But, but we're like, if we start throttling these 00:18:32.410 --> 00:18:34.600 people, it's like totally fair. They're using an inordinate 00:18:34.600 --> 00:18:37.540 number of resources. And people who are just clicking 00:18:37.540 --> 00:18:40.040 around the site are having a slower experience because 00:18:40.040 --> 00:18:42.400 our database CPU is so high. 00:18:42.400 --> 00:18:44.020 So we decide, like, OK, you can make a 00:18:44.020 --> 00:18:46.250 couple requests per minute to change a pledge. It 00:18:46.250 --> 00:18:49.860 was one line of Rack::Attack code. We deploy it. 00:18:49.860 --> 00:18:52.190 The yellow vertical lines here are deploy lines, so 00:18:52.190 --> 00:18:55.210 you can see that right here, about an hour 00:18:55.210 --> 00:18:57.250 after we get the alert that something was going 00:18:57.250 --> 00:19:01.710 wrong, we deploy and immediately our database CPU drops. 00:19:01.710 --> 00:19:04.780 We're pretty much back to the happy place. 00:19:04.780 --> 00:19:07.640 And so, for us, that was like, revealing the, 00:19:07.640 --> 00:19:09.240 the great success that we could have. 
Like, it 00:19:09.240 --> 00:19:12.000 was so easy, like, once we figured out what 00:19:12.000 --> 00:19:14.470 was going on, it was so easy for us 00:19:14.470 --> 00:19:17.100 to write code that just, like, solved that problem. 00:19:17.100 --> 00:19:19.030 We didn't have to think about, like, how do 00:19:19.030 --> 00:19:23.330 we optimize the edit pledge flow? Which could have 00:19:23.330 --> 00:19:26.260 been, like, a much bigger product change, and derail, 00:19:26.260 --> 00:19:28.090 like, taken up a lot more developer time. It 00:19:28.090 --> 00:19:30.320 was sort of a cut and dry decision of 00:19:30.320 --> 00:19:32.799 like, most people aren't gonna try to change their 00:19:32.799 --> 00:19:35.020 pledge, like, we're super confused if you're actually trying 00:19:35.020 --> 00:19:36.910 to change your pledge several times a minute. 00:19:36.910 --> 00:19:39.290 That's a, that's a bug we should fix. But 00:19:39.290 --> 00:19:41.360 it's really just these scrapers. It's not big deal 00:19:41.360 --> 00:19:43.040 to say they can try a few times a 00:19:43.040 --> 00:19:43.450 minute. 00:19:43.450 --> 00:19:47.470 So, that was a big win for Rack::Attack at 00:19:47.470 --> 00:19:50.110 KickStarter. We feel like we sort of, we sort 00:19:50.110 --> 00:19:54.230 of cemented that its value in the organization. So 00:19:54.230 --> 00:19:56.250 now I'm gonna shift gears a little bit and 00:19:56.250 --> 00:19:59.500 I'm gonna tell you pro tips of general things 00:19:59.500 --> 00:20:01.820 you can do with Rack::Attack that, that are probably 00:20:01.820 --> 00:20:03.210 useful for your application. 00:20:03.210 --> 00:20:07.040 I just, oh my gosh I'm so glad that 00:20:07.040 --> 00:20:08.410 I got to use this gif. This gif is 00:20:08.410 --> 00:20:12.580 like condensed, pure condensed happiness for me. OK. Back 00:20:12.580 --> 00:20:14.320 to the code. 
00:20:14.320 --> 00:20:16.900 So, we talked about how to do, like, a 00:20:16.900 --> 00:20:20.130 general, a, a log, I'm sorry. We talked about 00:20:20.130 --> 00:20:24.179 how to do a throttle for all IP addresses. 00:20:24.179 --> 00:20:25.770 So like each IP has this quota of how 00:20:25.770 --> 00:20:28.710 many requests you can do. But, in our, in 00:20:28.710 --> 00:20:32.500 our origin story about the login attack, we wanted 00:20:32.500 --> 00:20:35.380 to be extra careful about login requests. Like, those 00:20:35.380 --> 00:20:38.169 are something that, that you would want to throttle 00:20:38.169 --> 00:20:40.880 even more strictly than you would throttle many other 00:20:40.880 --> 00:20:43.730 things on your, in your application. 00:20:43.730 --> 00:20:46.860 So this is a new throttle, and so we 00:20:46.860 --> 00:20:49.590 give it a new name of logins per IP. 00:20:49.590 --> 00:20:52.020 And this is saying that if you are making 00:20:52.020 --> 00:20:55.530 a POST request to the login URL, then we 00:20:55.530 --> 00:20:57.660 want to throttle you by IP to like this 00:20:57.660 --> 00:21:01.600 much, this lower limit. And so this is relying 00:21:01.600 --> 00:21:04.520 on the fact that we mentioned earlier, that if 00:21:04.520 --> 00:21:06.830 the block returns nil, we're not gonna throttle 00:21:06.830 --> 00:21:08.760 at all. So, so if this is not a 00:21:08.760 --> 00:21:11.740 POST to the login action, like, we're not gonna 00:21:11.740 --> 00:21:14.200 check memcache, we're not gonna increment any counters or 00:21:14.200 --> 00:21:15.730 do anything like that. We're just gonna sort of 00:21:15.730 --> 00:21:18.590 allow this request right through. 00:21:18.590 --> 00:21:20.400 But if it is, we're gonna hold you, we're 00:21:20.400 --> 00:21:22.830 gonna say each IP address gets this lower quota 00:21:22.830 --> 00:21:24.650 of how many login requests they can make.
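The slide itself is not in the transcript, so what follows is only a rough sketch of the logins-per-IP throttle as described: the block returns the client IP for a POST to the login URL and nil for everything else, and a nil return means the request is never counted. The `/login` path, the limit of 5 per 20 seconds, and the `FakeRequest` stand-in (used so the sketch runs without Rack installed) are assumptions for illustration.

```ruby
# Hypothetical stand-in for the request object Rack::Attack yields,
# so the discriminator can be exercised without the gem.
FakeRequest = Struct.new(:path, :request_method, :ip, :params) do
  def post?
    request_method == "POST"
  end
end

# Discriminator for the stricter, login-only throttle.
# Returns the IP for POSTs to the (assumed) "/login" path; returns nil
# otherwise, which tells Rack::Attack not to throttle or count the request.
LOGINS_PER_IP = lambda do |req|
  req.ip if req.path == "/login" && req.post?
end

# With the rack-attack gem loaded (e.g. in a Rails initializer), register it.
# The limit and period are assumed values, not the ones from the talk.
if defined?(Rack::Attack)
  Rack::Attack.throttle("logins/ip", limit: 5, period: 20, &LOGINS_PER_IP)
end

p LOGINS_PER_IP.call(FakeRequest.new("/login", "POST", "1.2.3.4", {})) # counted, keyed by IP
p LOGINS_PER_IP.call(FakeRequest.new("/", "GET", "1.2.3.4", {}))       # nil, passes through
```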
00:21:24.650 --> 00:21:27.190 Thinking of this same problem from a, from a 00:21:27.190 --> 00:21:29.910 kind of different angle, you might want to imagine 00:21:29.910 --> 00:21:32.250 a, a situation where a, an attacker is using 00:21:32.250 --> 00:21:35.580 many different IP addresses to try to crack passwords 00:21:35.580 --> 00:21:39.020 for one particular email address, right. Maybe it's the 00:21:39.020 --> 00:21:41.710 founder's email address or something like that. 00:21:41.710 --> 00:21:44.289 So you, so putting on your security hat, you 00:21:44.289 --> 00:21:45.630 can be like, how am I gonna be safe 00:21:45.630 --> 00:21:48.610 from those kinds of requests? The only change here 00:21:48.610 --> 00:21:51.660 is what we're returning. Instead of the IP address, 00:21:51.660 --> 00:21:54.789 we're returning the value of the email parameter. So 00:21:54.789 --> 00:21:57.780 this is a, a sort of little different way 00:21:57.780 --> 00:22:00.890 of thinking about throttles, of saying, whoever you are, 00:22:00.890 --> 00:22:02.789 if you're trying to log in with this one particular 00:22:02.789 --> 00:22:05.809 email address, you can only do it five times 00:22:05.809 --> 00:22:07.900 every twenty seconds. 00:22:07.900 --> 00:22:11.370 So those are two throttles that pretty much everybody 00:22:11.370 --> 00:22:13.610 should, should have that feature on their website. If 00:22:13.610 --> 00:22:15.960 you haven't been bitten by it yet, it's probably 00:22:15.960 --> 00:22:19.280 just a matter of time. 00:22:19.280 --> 00:22:22.360 Another pretty cool Rack::Attack feature is blacklists. So these 00:22:22.360 --> 00:22:24.059 are requests that you don't even want to throttle. 00:22:24.059 --> 00:22:27.169 Like, you're not gonna allow them at all. Just, 00:22:27.169 --> 00:22:30.309 access denied every time they happen. I kind, I 00:22:30.309 --> 00:22:33.140 was gonna call these blocks, but like blocks, I 00:22:33.140 --> 00:22:35.020 can't call them blocks.
Because in Ruby the, like, 00:22:35.020 --> 00:22:37.250 that's already a different thing. 00:22:37.250 --> 00:22:39.090 So hence the term blacklists. 00:22:39.090 --> 00:22:42.130 Here's an example of a pretty handy blacklist. Say 00:22:42.130 --> 00:22:44.610 you have an admin section of your website, and 00:22:44.610 --> 00:22:46.750 you want to restrict access to the admin section 00:22:46.750 --> 00:22:49.630 to just like, your one office IP address. So 00:22:49.630 --> 00:22:52.780 this is, again, it's using the, it's using the 00:22:52.780 --> 00:22:56.780 blacklist class method on Rack::Attack to sort of configure 00:22:56.780 --> 00:22:58.570 this in the middleware. You would, you would put 00:22:58.570 --> 00:23:03.299 this in an initializer, saying that, you give it a 00:23:03.299 --> 00:23:07.010 name like bad_admin_ip, and one of the things, like, 00:23:07.010 --> 00:23:08.669 it's different than throttles in that we don't have 00:23:08.669 --> 00:23:10.799 to pass along a limit or a period, because 00:23:10.799 --> 00:23:13.970 it just like, it doesn't apply to blacklists. 00:23:13.970 --> 00:23:15.720 But it has the same logic where if the 00:23:15.720 --> 00:23:19.480 return value of this block is truthy, we're gonna, 00:23:19.480 --> 00:23:22.450 like, just give them the very fast access denied 00:23:22.450 --> 00:23:24.400 message. If it's false, then we're gonna let the 00:23:24.400 --> 00:23:26.919 request through. So this is saying, if you're making 00:23:26.919 --> 00:23:30.690 a request to a URL that starts with admin, 00:23:30.690 --> 00:23:33.510 and you are not from this IP address, we're 00:23:33.510 --> 00:23:38.410 gonna, we're gonna just give you an access denied. 00:23:38.410 --> 00:23:41.260 This is something that KickStarter uses. We call it 00:23:41.260 --> 00:23:46.130 the starve the trolls feature.
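The per-email login throttle and the admin blacklist described above might look roughly like the sketch below. The five-per-twenty-seconds figure and the `bad_admin_ip` name come from the talk; the `/login` and `/admin` paths, the `email` parameter key, the office IP, and the `FakeRequest` stand-in are assumptions. Newer releases of the gem name the method `blocklist` rather than `blacklist`, so the registration step checks for both.

```ruby
# Hypothetical stand-in for the request object Rack::Attack yields.
FakeRequest = Struct.new(:path, :request_method, :ip, :params) do
  def post?
    request_method == "POST"
  end
end

OFFICE_IP = "203.0.113.10" # assumed office address (documentation range)

# Throttle key is the submitted email instead of the client IP, so attempts
# against one account are counted together whatever IP they come from.
LOGINS_PER_EMAIL = lambda do |req|
  req.params["email"] if req.path == "/login" && req.post?
end

# Blacklist check: a truthy return means "deny". No limit or period applies.
BAD_ADMIN_IP = lambda do |req|
  req.path.start_with?("/admin") && req.ip != OFFICE_IP
end

if defined?(Rack::Attack)
  Rack::Attack.throttle("logins/email", limit: 5, period: 20, &LOGINS_PER_EMAIL)
  # `blacklist` is the name used in the talk; later gem versions use `blocklist`.
  register = Rack::Attack.respond_to?(:blocklist) ? :blocklist : :blacklist
  Rack::Attack.public_send(register, "bad_admin_ip", &BAD_ADMIN_IP)
end
```

Because the rules are ordinary Ruby callables, they can be checked directly, for example `BAD_ADMIN_IP.call(FakeRequest.new("/admin/users", "GET", "198.51.100.1", {}))` is truthy while the same request from `OFFICE_IP` is not.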
So this is, if, 00:23:46.130 --> 00:23:48.630 if you're one of our banned IPs that our 00:23:48.630 --> 00:23:52.289 customer support team decides which IPs get banned, you 00:23:52.289 --> 00:23:54.919 cannot make any request that's not a GET request. 00:23:54.919 --> 00:23:57.520 Or, put another way, you can only make GET 00:23:57.520 --> 00:24:00.370 requests if you're from these IP addresses. 00:24:00.370 --> 00:24:01.919 So let's think about what it's like to use 00:24:01.919 --> 00:24:05.460 a dynamic web application if you're only using GETs. 00:24:05.460 --> 00:24:07.730 You can't sign up. You can't log in. You 00:24:07.730 --> 00:24:11.059 can't post comments. These are, these are, we sort 00:24:11.059 --> 00:24:14.640 of use this as a measure of last resort 00:24:14.640 --> 00:24:18.140 for people who are, who are bad actors in 00:24:18.140 --> 00:24:20.700 our community. Any big community has, you know, knows 00:24:20.700 --> 00:24:22.610 that this stuff is sort of inevitable, to have 00:24:22.610 --> 00:24:26.570 a few rotten apples. 00:24:26.570 --> 00:24:28.590 And this has been like really fast and effective 00:24:28.590 --> 00:24:30.970 for our community team to be able to just 00:24:30.970 --> 00:24:34.080 like put these IP addresses into a YAML file. 00:24:34.080 --> 00:24:36.100 They leave them there for about a week or 00:24:36.100 --> 00:24:38.840 so, and, you know, it gives that person sort of 00:24:38.840 --> 00:24:41.169 time to cool off, where they're not gonna go 00:24:41.169 --> 00:24:43.490 around signing up for a bunch of accounts and, 00:24:43.490 --> 00:24:46.789 and maybe doing bad stuff or, like, posting messages 00:24:46.789 --> 00:24:49.419 or stuff like that.
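A minimal sketch of the "starve the trolls" rule, assuming the banned IPs are kept as a plain YAML list. The list is inlined here so the example runs on its own; the file path in the comment, the rule name, and the `FakeRequest` stand-in are hypothetical, and KickStarter's actual file layout is not shown in the talk.

```ruby
require "yaml"

# Hypothetical stand-in for the request object Rack::Attack yields.
FakeRequest = Struct.new(:request_method, :ip) do
  def get?
    request_method == "GET"
  end
end

# In an app this would more likely be read from a file the support team
# edits, e.g. YAML.load_file("config/banned_ips.yml") (hypothetical path).
BANNED_IPS = YAML.safe_load(<<~YAML)
  - 198.51.100.7
  - 198.51.100.23
YAML

# Truthy means "deny": banned IPs may still browse (GET) but cannot sign up,
# log in, or post, since those all need a non-GET request.
STARVE_THE_TROLLS = lambda do |req|
  BANNED_IPS.include?(req.ip) && !req.get?
end

if defined?(Rack::Attack)
  # `blacklist` is the name used in the talk; later gem versions use `blocklist`.
  register = Rack::Attack.respond_to?(:blocklist) ? :blocklist : :blacklist
  Rack::Attack.public_send(register, "starve_the_trolls", &STARVE_THE_TROLLS)
end
```

Removing an address from the list after the cool-off period lifts the ban on the next deploy or restart, since the list is read once at load time in this sketch.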
00:24:49.419 --> 00:24:52.190 So this is, I don't, I was really, I 00:24:52.190 --> 00:24:54.380 was sort of struck when we started doing this 00:24:54.380 --> 00:24:57.700 of like how simple this was in code, and 00:24:57.700 --> 00:25:01.419 how much it helped our CSS, or, our community 00:25:01.419 --> 00:25:04.350 support team. So this is another example of, like, 00:25:04.350 --> 00:25:06.250 sort of an area where I wouldn't expect Rack::Attack 00:25:06.250 --> 00:25:08.150 to be very helpful, but it ended up being 00:25:08.150 --> 00:25:10.760 very helpful. 00:25:10.760 --> 00:25:16.880 Another Rack::Attack nice to have feature is ActiveSupport::Notifications. So, 00:25:16.880 --> 00:25:20.870 every time, if, if ActiveSupport::Notifications are in your app, 00:25:20.870 --> 00:25:24.360 and so for any Rails app they're already there, 00:25:24.360 --> 00:25:28.620 we will fire an ActiveSupport notification event every time 00:25:28.620 --> 00:25:32.549 a request gets blocked or throttled. So this means 00:25:32.549 --> 00:25:35.470 you can have a subscriber to these events that's 00:25:35.470 --> 00:25:38.039 gonna log or graph these events and stuff like 00:25:38.039 --> 00:25:40.490 that. There are examples of how to do that 00:25:40.490 --> 00:25:43.970 in the README on GitHub. 00:25:43.970 --> 00:25:48.390 So thinking of where Rack::Attack might fall in the 00:25:48.390 --> 00:25:50.520 set of tools you use to keep your site 00:25:50.520 --> 00:25:54.210 fast and reliable, it is, it's not a silver 00:25:54.210 --> 00:25:57.340 bullet. Like, it very much complements things like, the 00:25:57.340 --> 00:26:04.130 iptables firewall, or nginx limit_conn_zone, limit conn module to 00:26:04.130 --> 00:26:06.659 limit the number of concurrent requests per IP address. 00:26:06.659 --> 00:26:08.270 Or if you have, like, a CDN or a 00:26:08.270 --> 00:26:11.330 web app firewall.
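For the ActiveSupport::Notifications hook mentioned above, a subscriber that logs each blocked or throttled request could be sketched as follows. The `rack.attack` event name and the `rack.attack.*` env keys follow the gem's README from around the time of the talk; later versions changed the event names and pass a payload hash, so treat those details as assumptions to check against the installed version. The log format and the `FORMAT_ATTACK_EVENT` helper are made up for illustration.

```ruby
require "logger"

ATTACK_LOGGER = Logger.new($stdout)

# Builds one log line from the Rack env of a blocked or throttled request.
# Hypothetical helper; the format is arbitrary.
FORMAT_ATTACK_EVENT = lambda do |env|
  "rack-attack #{env['rack.attack.match_type']} " \
    "rule=#{env['rack.attack.matched']} ip=#{env['REMOTE_ADDR']} path=#{env['PATH_INFO']}"
end

# Only subscribe when ActiveSupport is loaded (it always is in a Rails app).
if defined?(ActiveSupport::Notifications)
  ActiveSupport::Notifications.subscribe("rack.attack") do |_name, _start, _finish, _id, payload|
    # Older gem versions pass the request itself; newer ones pass { request: req }.
    req = payload.is_a?(Hash) ? payload[:request] : payload
    ATTACK_LOGGER.warn(FORMAT_ATTACK_EVENT.call(req.env))
  end
end
```

The same subscriber could increment a metrics counter instead of logging, which is the "graph these events" case from the talk.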
So, like, you know, hardware to, 00:26:11.330 --> 00:26:13.980 to keep your website fast and reliable. Like, keep 00:26:13.980 --> 00:26:14.950 doing those. 00:26:14.950 --> 00:26:18.210 Rack::Attack's not a silver bullet. You know, it's, if 00:26:18.210 --> 00:26:22.940 you have an NTP reflection DDoS attack, like, it's 00:26:22.940 --> 00:26:27.460 gonna overwhelm your Unicorn or Heroku processes pretty fast. 00:26:27.460 --> 00:26:30.669 You need something else. But, what Rack::Attack really is 00:26:30.669 --> 00:26:34.409 good at is, it's Ruby. It knows everything about 00:26:34.409 --> 00:26:36.470 your app, like, I mean, because it's in your 00:26:36.470 --> 00:26:41.080 application, you can use other logic from your app. 00:26:41.080 --> 00:26:43.350 Because it's Ruby, it's easy to test. You write 00:26:43.350 --> 00:26:45.730 integration tests for it the same way you write 00:26:45.730 --> 00:26:48.000 tests for the rest of your application. 00:26:48.000 --> 00:26:49.600 And it's easy to deploy, because it's Ruby code. 00:26:49.600 --> 00:26:51.940 I don't know how you deploy changes to a 00:26:51.940 --> 00:26:54.600 CDN or a web app firewall, but it's probably 00:26:54.600 --> 00:26:57.570 a different process than how you deploy your Ruby 00:26:57.570 --> 00:27:01.470 code. And, and this is something that a lot, 00:27:01.470 --> 00:27:05.740 everybody on our engineering team is comfortable doing. 00:27:05.740 --> 00:27:08.659 So that, that's why, that's where Rack::Attack can fit 00:27:08.659 --> 00:27:13.730 into your application security mindset. 00:27:13.730 --> 00:27:16.010 I also wanted to call out and say thank 00:27:16.010 --> 00:27:19.559 you to my many GitHub contributors.
These people are 00:27:19.559 --> 00:27:23.539 really awesome and they've taken Rack::Att- like, added really 00:27:23.539 --> 00:27:26.200 cool features, like Allow2Ban and 00:27:26.200 --> 00:27:29.299 Fail2Ban, and they've cleaned up documentation and they've made 00:27:29.299 --> 00:27:32.210 the tests a lot better. They support, added Redis 00:27:32.210 --> 00:27:37.000 support was, it used to be just memcache. But 00:27:37.000 --> 00:27:40.750 these people are doing fantastic things with open source. 00:27:40.750 --> 00:27:44.179 They're from five different continents, too, which, like it 00:27:44.179 --> 00:27:46.529 feels so cool to put code out there and, 00:27:46.529 --> 00:27:49.730 like, people from five different continents contribute to it 00:27:49.730 --> 00:27:51.820 because they find it useful. 00:27:51.820 --> 00:27:56.190 So, more like that please. 00:27:56.190 --> 00:28:02.500 So, sort of wrapping up, the web, weird stuff 00:28:02.500 --> 00:28:06.700 happens on the web. It's inevitable. It's good in 00:28:06.700 --> 00:28:08.600 a lot of cases. I, I like that, you 00:28:08.600 --> 00:28:11.580 know, people write really innovative things and, and stuff 00:28:11.580 --> 00:28:13.049 that I never would have come up with. 00:28:13.049 --> 00:28:15.520 Like, that's fantastic. So I hope the web stays 00:28:15.520 --> 00:28:17.960 weird. But I also hope that the website stays 00:28:17.960 --> 00:28:21.020 up. And Rack::Attack lets you have the best of 00:28:21.020 --> 00:28:24.470 both worlds. 00:28:24.470 --> 00:28:27.840 So that's all I had. That's, that's Rack::Attack at 00:28:27.840 --> 00:28:31.400 KickStarter. If you have any quest- I'd love to 00:28:31.400 --> 00:28:34.520 answer any questions if people have them. And, if 00:28:34.520 --> 00:28:36.460 you're more comfortable, hit me up on Twitter or 00:28:36.460 --> 00:28:38.740 find me after the talk.