AARON SUGGS: All right. Can people hear OK? I'll go ahead and get started. So this talk is Rack::Attack and how to protect your app with this one weird gem. Where does Rack::Attack come from? We built it at KickStarter. If you haven't heard of KickStarter, it is a funding platform for creative projects. So somebody has an idea for a film, a comic book, an open source project, a gadget. They put their project up on our site. They can offer rewards for various pledge levels. Their friends, family, strangers on the internet come and can give them money. At the deadline, if they've reached their funding goal and so they have enough to do their project, that's when we process the transactions and the creators get the funds they need to do the project. To give you a sense of scale for what we do, we recently crossed over a billion dollars pledged to the site. It's over a million dollars a day. And it's gone to over 60,000 creative projects. Quick introduction. My name's Aaron Suggs. I go by ktheory on social media. I love dancing in my bear outfit. And I'm the operations engineer at KickStarter. We have a very dev ops-y style workflow. So it means I end up writing a lot of Ruby code, and I love writing Ruby code. So Rack::Attack is a tool I wrote, and it's Rack middleware for blocking and throttling abusive requests. What do we mean by abusive requests? These can be things like malicious attackers trying to take down your site, doing things like trying to crack user accounts or get sensitive information, or it can be naively written scrapers, who are just people on the internet doing weird things as they are prone to do, and that's cool, but sometimes it is a lot of traffic. It's a lot of resources for your app to try to handle, and Rack::Attack is a very elegant DSL and way of dealing with these sorts of things, sort of constraining their behavior so your website stays up.
Rack::Attack is on GitHub at /kickstarter/rack-attack. It's an open source Ruby gem. There's a README, sort of exactly like what you'd expect. So the big wins that KickStarter has gotten from using Rack::Attack, and the reason we developed it: we wanted to increase our performance. So this is site performance. We had problems with abusive requests making our website slow because they were using up too much app server CPU or too many database resources. By constraining them we were able to make the website faster for the most important requests. Like people coming on, wanting to watch videos, wanting to pledge money. Not people just trying to scrape down the entire site. We also improved our availability. Because sometimes there were so many of these requests that they would take down the site, or there would just be some weird incident, and it hurt our availability. But the biggest win that we had was developer happiness. Because dealing with these sort of bad actors on the internet, especially if it means your site's going down or you need to scale up because somebody's doing something weird, that can really interrupt a lot of developers. It can sort of derail your product road map. We want to be writing cool features, and Rack::Attack was a great DSL to let us spend less time thinking about that stuff and more time doing the stuff that we like doing. So let me talk about the origin story for Rack::Attack. Like, what happened at KickStarter that made us realize we needed this? Let's rewind to the summer of 2012. And this happened. So this is a story in a graph. So the blue line, I hope it shows up pretty well. Cool. It's our regular successful logins. People typing in an email and password and us being like, OK, you are logged in. You know, it ebbs and flows throughout the day.
Suddenly, one Saturday afternoon, we just get so many of these bad login requests, and for a while we're like, what's going on? Did we deploy a feature that broke login? No. Somebody is trying to crack our user accounts. They're just guessing email addresses and passwords as fast as they can, from several different IP addresses. So, as the ops guy, this is sort of on my plate. I'm like, OK, well, I gotta stop this. This is bad for the site for this to be going on. So I wrote a pretty nasty before filter for our login action, that's like, you know, keep a counter in memcache and, if it's too many, give them an error page. And it was kind of a sucky experience, because I was changing a really critical feature of our site, sort of under duress, knowing that I needed to get it out there quickly. And it was sort of a big change, and in the pull request I was apologetic, being like, I know this is badly tested and it's a nasty code change, but we've got to get it out fast because this event's going on. And so we did that. And then, sort of in the cold light of day, I reflected a little bit and I thought, we need a more elegant way to prevent bad requests. It's not just gonna be about this login attack. This is gonna be about a whole class of problems that we might have on the site. You know, I should say, too, with that login attack, it was something that we sort of always imagined, like, oh yeah, of course we should throttle login requests. We just hadn't ever gotten around to it. You know, it was in our ticketing system as a low-priority, someday-somebody-should-do-this thing. And having it actually happen was like, OK, now we gotta do it right now. So, we realized we need this generic tool to stop bad requests. And really, there's already, in the Ruby world, a great solution for this, and it's Rack middleware.
So now we get to the code section of the talk. Here comes some code. Get ready. This is an example of the most basic Rack middleware, just really quick, for people who might not be familiar with it. So middleware is basically like hugging your application, wrapping around it. So you have your Rails app or your Sinatra app; that is the app in this case. And you want to be able to do things to the request that's coming in from the client. That's the env. So every request from a client is gonna go through this call method where you pass in the environment. The environment is, like, what page the client wants or what their cookie is and all that information. And so the real magic of Rack middleware is it lets you do stuff here with the request. Like, you can block it, in the case of Rack::Attack, potentially. Or you can do stuff with the response. You can log it. You can cache it. Stuff like that. So this is just a great pattern for making easy architectures to do stuff with HTTP requests. So in Rack::Attack's case, this is a sort of simplified version of the Rack::Attack call method. We say, for this request, should we allow it? If so, go ahead and pass it on to your application. Your application is gonna do, potentially, a lot of work. Maybe it's gonna spend a couple hundred milliseconds querying the database and rendering views and stuff like that. So that's the expensive work that we want to save if this is an abusive request. So if we shouldn't allow it, then we just return this access denied, which is a very simple and fast response to render. Rack::Attack can do several hundred of these access denied requests per, like, thread that you have running. So like, per unicorn worker or per Heroku instance or something like that. So that's what you get for free when you just use the Rack middleware.
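The slide code is not reproduced in this transcript. As a rough sketch of the pattern described above: a Rack app is any object that responds to `call(env)` and returns a status, headers, and body, and a middleware wraps another app. The class name `BlockingMiddleware`, the `should_allow` block, and the 403 response are illustrative assumptions here, not Rack::Attack's actual source.

```ruby
# Minimal Rack-style middleware sketch; plain Ruby, no gems required.
# BlockingMiddleware and its should_allow block are hypothetical names.
class BlockingMiddleware
  def initialize(app, &should_allow)
    @app = app                    # the wrapped Rails/Sinatra/Rack app
    @should_allow = should_allow  # predicate that receives the env hash
  end

  # Rack calls this once per request with the environment hash.
  def call(env)
    if @should_allow.call(env)
      @app.call(env)              # the expensive part: the real application
    else
      # Cheap, fixed response; the wrapped app is never invoked.
      [403, { "content-type" => "text/plain" }, ["Forbidden\n"]]
    end
  end
end

# A stand-in for the real application.
app = ->(_env) { [200, { "content-type" => "text/plain" }, ["Hello\n"]] }

# Assumed rule for illustration: refuse one documentation-range IP address.
stack = BlockingMiddleware.new(app) { |env| env["REMOTE_ADDR"] != "203.0.113.9" }

puts stack.call({ "REMOTE_ADDR" => "198.51.100.7" }).first  # status for an allowed client
puts stack.call({ "REMOTE_ADDR" => "203.0.113.9" }).first   # status for the refused client
```

The point of the shape is that the refusal branch returns before the wrapped app is ever called, which is where the savings come from.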
So we don't yet know what this should_allow method should be. That's code that you sort of have to configure yourself: what do you want to throttle on? So that looks like this. This is a generic throttle that you might put in an initializer to configure Rack::Attack. The important stuff that's going on here is we are calling the throttle class method on Rack::Attack, so that's just something we expose to let you plug into the middleware. We give it a name; in this case we named the throttle IP. This is gonna determine how we track it, and it just has to be unique throughout your application. We're gonna give it a limit and a period. The period is how many seconds we're gonna be considering for the throttle, and the limit is sort of your quota for how many requests you get to make during that time. So in this case, it's ten requests every five seconds. For the arithmetically inclined, you'll notice that this is not a reduced fraction. We could say two requests every one second. The advantage of doing a higher multiple is that it allows a little burstiness. So these periods are basically dividing time up into these five-second-long buckets. So between zero seconds and five seconds after the minute, in that window, you're allowed to make up to ten requests. And so by having bigger multiples in bigger windows, you can sort of allow for some burstiness, but the long-term average stays the same. Like, long term, nobody's gonna make more requests than two every one second. OK, so what's going on? We got the class method. We got the name. We have the limit and the period. And then to this block, we are passing along the request. Now, in the earlier middleware example we called this the env, which was just the environment hash that comes from the request.
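A small sketch of the arithmetic behind those buckets, assuming the fixed-window scheme described above. The configuration call is shown only in a comment, since it needs the gem loaded; `window_index` and `long_term_rate` are hypothetical helper names, not part of Rack::Attack.

```ruby
# With the rack-attack gem loaded, the throttle described above would be
# configured in an initializer roughly like this:
#
#   Rack::Attack.throttle("ip", limit: 10, period: 5) { |req| req.ip }
#
# The helpers below are plain Ruby and only illustrate the bucketing.

# Which fixed window a timestamp (in epoch seconds) falls into.
def window_index(epoch_seconds, period)
  epoch_seconds.to_i / period
end

# Long-term ceiling in requests per second, kept exact as a Rational.
def long_term_rate(limit, period)
  Rational(limit, period)
end

# Seconds 0..4 share one 5-second bucket; second 5 starts the next one.
(0..5).each { |t| puts "t=#{t}s -> window #{window_index(t, 5)}" }

# 10 per 5s and 2 per 1s have the same long-term rate, but the larger
# window lets a client spend all 10 requests in a single burst.
puts long_term_rate(10, 5) == long_term_rate(2, 1)
```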
Request is just a light little Rack request object wrapped around the environment that gives you instance methods to call, like dot IP or dot host or dot path or something like that. You use these in Rails controllers, too. So it's just a lightly-wrapped request. And then inside the block, what the block returns is the really important part. That's the discriminator that determines how we're gonna bucket up these throttles. So in this case we are gonna say every distinct IP address is going to get its own throttle limit. But we could throttle by something else. We could throttle by a parameter or a host name or an API token or something like that. And one thing to note with these discriminators, too, is that this is returning a string, so it's always gonna be a truthy value, and truthy values enable the throttling. Like, we are gonna throttle these requests as long as there's an IP address, and there always is. If we return nil or a falsey value, we just let the request go through and we're not gonna throttle it. I'll talk about why we might want to do that later. But so now we have this issue of throttle state. Like, we have these counters per IP address that we need to track. And so where do we store that? A pretty elegant and simple and obvious place for that was our Rails cache. So when you just use Rack::Attack by default, if you have a Rails cache, it's gonna use it. But it really works best with memcache or redis. So I hope you're using that as your Rails cache. But if you're not, there are ways that you can build your own, or plug in a different cache store. The great advantage of memcache and redis is that they have really good support for atomically incrementing counters, and that's the key feature we need behind the scenes.
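To make the discriminator rule concrete, here is a sketch in plain Ruby. `FakeRequest` is a stand-in for the request object, and `bucket_for` is a hypothetical helper that mimics the rule described above: a truthy return value names the bucket to count against, and nil or false means the request is not throttled at all.

```ruby
# FakeRequest is an illustrative stand-in for the Rack request wrapper.
FakeRequest = Struct.new(:ip, :host, :path, :params)

# Discriminators: blocks that pick what to bucket on.
BY_IP        = ->(req) { req.ip }
BY_API_TOKEN = ->(req) { req.params["api_token"] }  # nil when no token is sent

# Hypothetical helper: returns the bucket name, or nil to skip throttling.
def bucket_for(throttle_name, discriminator, req)
  value = discriminator.call(req)
  return nil unless value          # falsy: let the request straight through
  "#{throttle_name}:#{value}"      # truthy: count against this bucket
end

with_token    = FakeRequest.new("198.51.100.7", "example.com", "/api", { "api_token" => "abc123" })
without_token = FakeRequest.new("198.51.100.7", "example.com", "/", {})

p bucket_for("ip", BY_IP, with_token)             # every request has an IP
p bucket_for("api", BY_API_TOKEN, with_token)     # bucketed by token
p bucket_for("api", BY_API_TOKEN, without_token)  # no token, so no throttle
```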
So now we're imagining, for every request that comes in, we need to increment the counter per IP address. And so how do we do that? What's the algorithm? So this is the nitty gritty of how Rack::Attack works, how it constructs that key. So remember how we divided the minute up into little buckets depending on our period. So to do that, we take the current second. We construct a key that starts with the name of our throttle, like IP in this case. We take the time divided by the period, so this means that that middle component is going to increment every five seconds. So the key's gonna change. And then the final part is that block return value. So in this case it's the IP address of the request. But maybe it's an API token or something like that. So at the end of it, we have this key that changes every couple seconds, every time the period rotates, and this ends up being a very efficient use of memcache or redis. Like, storing all this information is gonna take a couple megabytes. Don't worry about the impact on your cache store in pretty much every scenario. To make it an even more efficient use of your cache store, we set an expiration, so that in that bucket window of, say, zero to five seconds, we're gonna say that all those cache keys expire at five seconds. So at the same moment that the cache keys change, they also expire. So memcache or redis just ends up reusing the same memory blocks over and over. Even though the keys are changing in memory, you don't have as much churn as you would otherwise. And so then the Rack middleware is really doing pretty simple stuff. We're saying, for whatever your cache is, increment this key with this expiration. That's gonna give us back the count of how many requests have been made that match that throttle.
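Putting those pieces together, a sketch of the counting step might look like the following. `TinyStore` is a hypothetical in-memory stand-in for memcache or redis (a Mutex plays the part of their atomic increment), and the key layout follows the description above rather than the gem's exact key format.

```ruby
# TinyStore: hypothetical in-memory stand-in for memcache/redis.
class TinyStore
  def initialize
    @entries = {}        # key => { count:, expires_at: }
    @lock = Mutex.new    # emulates the cache's atomic increment
  end

  # Adds 1 to the counter for key and returns the new count.
  def increment(key, expires_at:, now:)
    @lock.synchronize do
      # Drop counters whose window has already ended.
      @entries.delete_if { |_key, entry| entry[:expires_at] <= now }
      entry = (@entries[key] ||= { count: 0, expires_at: expires_at })
      entry[:count] += 1
    end
  end
end

# Key = throttle name + window number + discriminator value.
def throttle_key(name, period, value, now)
  "#{name}:#{now.to_i / period}:#{value}"
end

# The moment the current window ends, which is also when its key expires.
def window_expiry(period, now)
  (now.to_i / period + 1) * period
end

# True when this request pushes the bucket over its limit.
def over_limit?(store, name:, limit:, period:, value:, now: Time.now.to_i)
  key = throttle_key(name, period, value, now)
  count = store.increment(key, expires_at: window_expiry(period, now), now: now.to_i)
  count > limit
end

store = TinyStore.new
ip = "198.51.100.7"
# Eleven requests from one IP, all at t = 100 (the same 5-second window).
results = Array.new(11) { over_limit?(store, name: "ip", limit: 10, period: 5, value: ip, now: 100) }
puts results.last                                                                # the 11th request
puts over_limit?(store, name: "ip", limit: 10, period: 5, value: ip, now: 105)   # first request of the next window
```

Because the window number is baked into the key, nothing has to reset a counter: the next window simply counts against a fresh key while the old one expires.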
And if it's more than our limit, we're gonna return that access denied response. So, we rolled this out. You know, we're able to have this global throttle per IP address. We start making a couple other features, and it was about a year later when we had a sort of redux, a new event that put Rack::Attack to the test. So, a new challenger emerges in the summer of 2013. This was a script called kicksniper dot py. And this revealed a pretty interesting behavior on KickStarter that we call reward sniping. Actually, kicksniper dot py refers to it in the code as reward sniping. And so this is an interesting behavior. So I told you how KickStarter offers these rewards. They can be limited rewards. So a creator says, I'm only gonna give away, like, a hundred of these, first come, first served. So there's a pretty popular project, it was a video game, and the video game was offering these reward tiers that would be, like, for fifty bucks you get the silver level package, and for a hundred bucks you get the gold package, and so on, ever more deluxe and expensive packages. And they were all very much in demand. So the early reward tiers sold out super fast. And then occasionally, somebody who had one of those early reward tiers would decide they're gonna splurge and they're gonna upgrade. They're gonna change their pledge to a higher one, and now, for that moment, there's one available of the lower tier. And so people were hitting refresh, refresh, refresh, hoping that they'd notice when somebody had changed their pledge and now there was one of these highly desirable lower-tier pledges available. Some enterprising Python developer says, I will make a script that does this for me. Sure enough, he writes kicksniper dot py that's in a tight loop, trying to change his pledge on our site.
Saying, like, let me get that early reward tier. You know, our ActiveRecord validations were working fine and we said, no, you can't change your pledge to that, the vast majority of the time, but eventually he got through and was able to get the pledge. It was such a great success that he goes on all the forums and says, hey, everybody just run this Python script on your laptop and you, too, might luck out and get one of these highly desirable earlier reward tiers. So let's tell this story in a graph. So, this is our master database CPU over the course of a day or so. We see at the very beginning, it starts off between ten and fifteen percent. That's my happy place. That's where I like it to be. We have plenty of head room for, you know, big projects to sort of blow up on the site, as they do from time to time. And I honestly didn't really notice that it had been creeping up over the course of the day. Thursday morning, it crossed thirty percent, and that's where I have a CPU alert threshold. So in fact, the whole dev team gets this email being like, hey, the master database CPU is pretty high. You guys should check that out. So we spend a little time, we're like, why is the database so high? Well, you know, it looks like there are a crazy number of requests trying to change their pledge for this one project. We're able to sort of construct this back story and see what was happening on the database CPU. We see the forum posts where everybody's like, thank you for kicksniper dot py. And so we're like, all right, so how are we gonna handle this? Like, is it really that important that people are able to try to change their pledge multiple times a second? What if they could only change their pledge every couple seconds?
Right, like, I guess that's fair enough. There's this question of what's the fairest way to allocate the scarce resource of the pledge as soon as it's available. I kind of don't care about the answer. Anybody can get it. But we're like, if we start throttling these people, it's totally fair. They're using an inordinate amount of resources. And people who are just clicking around the site are having a slower experience because our database CPU is so high. So we decide, OK, you can make a couple requests per minute to change a pledge. It was one line of Rack::Attack code. We deploy it. The yellow vertical lines here are deploy lines, so you can see that right here, about an hour after we get the alert that something was going wrong, we deploy and immediately our database CPU drops. We're pretty much back to the happy place. And so, for us, that revealed the great success that we could have. Like, once we figured out what was going on, it was so easy for us to write code that just solved that problem. We didn't have to think about, like, how do we optimize the edit pledge flow? Which could have been a much bigger product change, and taken up a lot more developer time. It was sort of a cut and dry decision, like, most people aren't gonna try to change their pledge. Like, we're super confused if you're actually trying to change your pledge several times a minute. That's a bug we should fix. But it's really just these scrapers. It's no big deal to say they can try a few times a minute. So, that was a big win for Rack::Attack at KickStarter. We feel like we sort of cemented its value in the organization. So now I'm gonna shift gears a little bit and I'm gonna tell you pro tips, general things you can do with Rack::Attack that are probably useful for your application. Oh my gosh, I'm so glad that I got to use this gif.
This gif is, like, pure condensed happiness for me. OK. Back to the code. So, we talked about how to do a throttle for all IP addresses. So each IP has this quota of how many requests you can do. But in our origin story about the login attack, we wanted to be extra careful about login requests. Those are something that you would want to throttle even more strictly than you would throttle many other things in your application. So this is a new throttle, and so we give it a new name of logins per IP. And this is saying that if you are making a post request to the login url, then we want to throttle you by IP to this lower limit. And so this is relying on the fact that we mentioned earlier, that if the block returns nil, we're not gonna throttle at all. So if this is not a post to the login action, we're not gonna check memcache, we're not gonna increment any counters or do anything like that. We're just gonna allow this request right through. But if it is, we're gonna say each IP address gets this lower quota of how many login requests they can make. Thinking of this same problem from a kind of different angle, you might want to imagine a situation where an attacker is using many different IP addresses to try to crack passwords for one particular email address, right. Maybe it's the founder's email address or something like that. So, putting on your security hat, you can be like, how am I gonna be safe from those kinds of requests? The only change here is what we're returning. Instead of the IP address, we're returning the value of the email parameter. So this is a sort of little different way of thinking about throttles, of saying, whoever you are, if you're trying to log in with this one particular email address, you can only do it five times every twenty seconds.
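The two login throttles described above might be sketched like this. The discriminators are written as plain lambdas against a stand-in request so they can run without the gem. The /login path, the `LoginRequest` struct, and the per-IP limit in the comments are assumptions for illustration; only the five-per-twenty-seconds figure for the email throttle comes from the talk.

```ruby
# Stand-in for the Rack request object; only what these rules need.
LoginRequest = Struct.new(:request_method, :path, :ip, :params) do
  def post?
    request_method == "POST"
  end
end

# Returns the IP for login POSTs, nil (no throttling) for everything else.
LOGINS_PER_IP = lambda do |req|
  req.ip if req.path == "/login" && req.post?
end

# Same condition, but buckets by the submitted email instead of the IP.
LOGINS_PER_EMAIL = lambda do |req|
  req.params["email"] if req.path == "/login" && req.post?
end

# With the rack-attack gem loaded, these could be registered in an
# initializer along these lines (the per-IP numbers are made up):
#
#   Rack::Attack.throttle("logins/ip",    limit: 5, period: 20, &LOGINS_PER_IP)
#   Rack::Attack.throttle("logins/email", limit: 5, period: 20, &LOGINS_PER_EMAIL)

login  = LoginRequest.new("POST", "/login", "198.51.100.7", { "email" => "founder@example.com" })
browse = LoginRequest.new("GET", "/projects", "198.51.100.7", {})

p LOGINS_PER_IP.call(login)      # bucket by the client's IP
p LOGINS_PER_EMAIL.call(login)   # bucket by the targeted account
p LOGINS_PER_IP.call(browse)     # not a login POST, so not throttled
```

The email discriminator is what covers the many-IPs case: every attempt against one account lands in the same bucket no matter where it comes from.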
So those are two throttles that pretty much everybody should have on their website. If you haven't been bitten by it yet, it's probably just a matter of time. Another pretty cool Rack::Attack feature is blacklists. So these are requests that you don't even want to throttle. Like, you're not gonna allow them at all. Just access denied every time they happen. I was gonna call these blocks, but I can't call them blocks, because in Ruby that's already a different thing. So hence the term blacklists. Here's an example of a pretty handy blacklist. Say you have an admin section of your website, and you want to restrict access to the admin section to just your one office IP address. So this is, again, using the blacklist class method on Rack::Attack to configure this in the middleware. You would put this in an initializer. You give it a name like bad_admin_ip, and it's different from throttles in that we don't have to pass along a limit or a period, because that just doesn't apply to blacklists. But it has the same logic, where if the return value of this block is truthy, we're gonna just give them the very fast access denied message. If it's falsey, then we're gonna let the request through. So this is saying, if you're making a request to a url that starts with admin, and you are not from this IP address, we're gonna just give you an access denied. This next one is something that KickStarter uses. We call it the starve the trolls feature. So this is, if you're one of our banned IPs, and our customer support team decides which IPs get banned, you cannot make any request that's not a get request. Or, put another way, you can only make get requests if you're from these IP addresses. So let's think about what it's like to use a dynamic web application if you're only using gets. You can't sign up. You can't log in.
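The two blacklists just described can be sketched as plain Ruby predicates. The office IP, the banned list, and the `SiteRequest` struct are made-up placeholders (in the talk the banned IPs come from a yaml file). Later releases of the gem renamed `blacklist` to `blocklist`, so the registration call shown in the comment depends on the version in use.

```ruby
# Stand-in for the Rack request object; only what these rules need.
SiteRequest = Struct.new(:request_method, :path, :ip) do
  def get?
    request_method == "GET"
  end
end

OFFICE_IP  = "192.0.2.10"                     # placeholder office address
BANNED_IPS = ["203.0.113.9", "203.0.113.44"]  # placeholder list of banned addresses

# Block /admin for anyone who is not coming from the office.
BAD_ADMIN_IP = ->(req) { req.path.start_with?("/admin") && req.ip != OFFICE_IP }

# "Starve the trolls": banned IPs may only make GET requests.
STARVE_THE_TROLLS = ->(req) { BANNED_IPS.include?(req.ip) && !req.get? }

# With the gem loaded (in a version that still has `blacklist`),
# registration would look roughly like:
#
#   Rack::Attack.blacklist("bad_admin_ip", &BAD_ADMIN_IP)
#   Rack::Attack.blacklist("starve_the_trolls", &STARVE_THE_TROLLS)

p BAD_ADMIN_IP.call(SiteRequest.new("GET", "/admin/users", "198.51.100.7"))   # outsider on /admin
p BAD_ADMIN_IP.call(SiteRequest.new("GET", "/admin/users", OFFICE_IP))        # office on /admin
p STARVE_THE_TROLLS.call(SiteRequest.new("POST", "/comments", "203.0.113.9")) # banned IP posting
p STARVE_THE_TROLLS.call(SiteRequest.new("GET", "/projects", "203.0.113.9"))  # banned IP browsing
```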
You can't post comments. We sort of use this as a measure of last resort for people who are bad actors in our community. Any big community knows that this stuff is sort of inevitable, to have a few rotten apples. And this has been really fast and effective for our community team, to be able to just put these IP addresses into a yaml file. They leave them there for about a week or so, and that gives that person sort of time to cool off, where they're not gonna go around signing up for a bunch of accounts and maybe doing bad stuff or posting messages or stuff like that. So I was sort of struck, when we started doing this, by how simple this was in code, and how much it helped our community support team. So this is another example of an area where I wouldn't expect Rack::Attack to be very helpful, but it ended up being very helpful. Another Rack::Attack nice-to-have feature is ActiveSupport::Notifications. So if ActiveSupport::Notifications are in your app, and for any Rails app they're already there, we will fire an ActiveSupport notification event every time a request gets blocked or throttled. So this means you can have a subscriber to these events that's gonna log or graph these events and stuff like that. There are examples of how to do that in the README on GitHub. So thinking of where Rack::Attack might fall in the set of tools you use to keep your site fast and reliable, it's not a silver bullet. Like, it very much complements things like the iptables firewall, or nginx's limit_conn_zone, the limit conn module, to limit the number of concurrent requests per IP address. Or if you have a CDN or a web app firewall. So, like, you know, hardware to keep your website fast and reliable. Like, keep doing those. Rack::Attack's not a silver bullet.
You know, if you have an NTP reflection DDoS attack, it's gonna overwhelm your Unicorn or Heroku processes pretty fast. You need something else. But what Rack::Attack really is good at is, it's Ruby. It knows everything about your app. I mean, because it's in your application, you can use other logic from your app. Because it's Ruby, it's easy to test. You write integration tests for it the same way you write tests for the rest of your application. And it's easy to deploy, because it's Ruby code. I don't know how you deploy changes to a CDN or a web app firewall, but it's probably a different process than how you deploy your Ruby code. And this is something that everybody on our engineering team is comfortable doing. So that's where Rack::Attack can fit into your application security mindset. I also wanted to call out and say thank you to my many GitHub contributors. These people are really awesome and they've added really cool features, like Allow2Ban and Fail2Ban, and they've cleaned up documentation and they've made the tests a lot better. They added redis support; it used to be just memcache. But these people are doing fantastic things with open source. They're from five different continents, too, which, like, it feels so cool to put code out there and people from five different continents contribute to it because they find it useful. So, more like that please. So, sort of wrapping up: weird stuff happens on the web. It's inevitable. It's good in a lot of cases. I like that, you know, people write really innovative things and stuff that I never would have come up with. Like, that's fantastic. So I hope the web stays weird. But I also hope that the website stays up. And Rack::Attack lets you have the best of both worlds. So that's all I had. That's Rack::Attack at KickStarter.
I'd love to answer any questions if people have them. And, if you're more comfortable, hit me up on Twitter or find me after the talk.