STARR HORNE: So thank you guys for coming. This is going to be an awesome talk. Quite possibly the best talk you'll ever hear in your life. What? I actually have a built-in escape hatch here, so if things start going wrong, I might just bail. Yeah.

So, anyway. A couple weeks ago I was having a bit of a panic attack about this presentation, because I'm a programmer, and my natural habitat is some dark place, looking at a screen. It's not talking to a bunch of people with, like, lights. What's up with this?

So I was in the middle of this panic attack and I went out on a walk, because that always clears my mind, and I was listening to a podcast. My favorite podcast. It's called Hardcore History. I highly recommend it. Oh, somebody likes it. Yeah!

This particular episode of Hardcore History was about the Punic Wars. In case you haven't heard of the Punic Wars and don't know the story, I'll give you the brief rundown. Basically, a long, long time ago there was this place called Rome, and they decided that they should take over the world. And for the most part, they did. But there was this one country that was a thorn in their side for a long time, called Carthage, and they fought all these wars back and forth. It was kind of a stalemate.

And then one day, when, I don't know, maybe he was making a sacrifice to the fire god or whatever, this guy named Hannibal had this great idea. He was like, I'm going to lead a sneak attack on Rome, and I'm going to do it by marching my army through the Alps. Which is pretty cool. I mean, that's pretty badass. But I think the most awesome part of the story, for me at least, is that the dude had war elephants. I don't know if you can see it in the slide, but there's actually a war elephant. It's kind of under the G.

And a lot of people have doubted this story over the years. So, in 1959, a British engineer, on a bar bet, borrowed a circus elephant named Jumbo and marched it across the Alps, too.

So, I don't know what my point is here, really.
I guess: don't underestimate elephants, because they don't like it and they have long memories.

So, OK. This talk is really about biggish data. So what the hell is biggish data? Whoa, I've got one of these things. It's not big data. This talk isn't going to be about Hadoop clusters and server farms and all that stuff. That's way over my head. It's not about fancy architectures, either. I'm not going to show you how to make multi-write Postgres clusters that do automatic sharding and stuff. I mean, that's all wicked cool, but it's not what we're going to talk about.

Instead, I'm going to talk about something that I think is more practical, and more interesting to me, and that is: how do you keep your app working as your production dataset grows? And grows into biggish data territory?

This is very easy to have happen, even if you don't have a firehose of data coming in. If you run a popular e-commerce site, or just a site with a lot of users, over the course of years you can accumulate a ton of data. And as this data accumulates, you find that your site performance goes downhill gradually, in ways that you don't understand.

So, what is biggish data? This talk is based on my experience at Honeybadger. In case you haven't heard of us, we are an exception, performance, and uptime monitoring company. That means we essentially have a firehose of new data coming in every day. Right now we have about a terabyte of errors in our database. The world has a lot of errors. You guys need to start, like, doing a better job. And we get about two gigabytes of new errors per day, roughly. All of this goes into a plain vanilla Postgres database that backs a pretty plain vanilla Rails application.

Well, the good news is that Postgres can handle this pretty easily. Postgres has got it covered. The bad news is that, unless you've engineered your app to deal with this level of data, you're kind of screwed. And the reason is that a hundred-megabyte database behaves fundamentally differently than a one-terabyte database.
And a lot of the conventions, a lot of the normal things we do in Rails apps, just stop working when you have this much data.

Yup. This is actually a live feed. This query's been running since this morning.

Something I learned: pretty much every pagination system breaks at about page 2,000. And even if you're like, dear God, make this stop, I just want to delete half my data, you're still screwed, because deleting data takes a lot of time.

Now, I'm going to explain all of this. I'm going to explain why this happens, how you can work around it, and how you can optimize the queries in your own database and optimize your stack. But to really understand this, we've got to take a little trip back in time, to the summer of 1978.

The summer of disco, free love, and the VT100 terminal. The VT100 was the first sort of computer-shaped thing that you could buy from an actual company and sit on your desk to prove that you were the alpha nerd of the office. It was also the time that Oracle 1.0 was being developed, which was one of the first databases that used the SQL language. It's pretty cool. It was written in PDP-11 assembly language.

And in case you're too young to have ever seen a PDP-11, this is what a PDP-11 looks like. Yeah. Show some love for the PDP-11.
But a funny thing happened in the past thirty-six years. The funny thing is that Moore's Law made this into a web stack. Now, I know that everybody here knows, vaguely, what Moore's Law is. But I thought it would be good to go over the technical, precise definition of Moore's Law.

And that is that computers get more awesome as time goes forward. Did that come back normally? Yeah. As time goes forward. And awesomeness is generally defined as, well, technically, being able to do more stuff faster. Being able to process more data faster.

So, based on this, I would like to postulate, for the first time in public - you guys are very lucky to be able to hear this - Starr's Corollary to Moore's Law, which is: as database growth outpaces Moore's Law, you literally travel back in time.

That was the TED moment of this talk. I can hear your minds being blown. It sounds like bubble wrap popping.

So, based on my extensive research of time-travel movies, the first rule for surviving your new time period is that you need to understand the methods and the motivations of the people in that time period. We need to start thinking like our ancestors. And our ancestors were very interested in this thing called hardware.

Well, I don't mean to be flip, but I just want to bring this up first, because if you happen to be having database scaling issues, if your app right now is getting slower because the amount of data in your database is getting bigger, and your app happens to live anywhere named EC2, Linode, or whatever other companies offer that, you can probably solve your scaling issue right now just by going and buying a real damn computer.

Yeah. You'll be shocked and amazed. Because the two things that databases need to work at high performance with lots of data are lots of RAM and really fast disk I/O. And virtual servers give you neither of those things. So just go buy a real damn computer. And while you're at it, throw in a bunch of disks.

And I'm not just talking about a RAID array.
You probably want your operating system, your Postgres data, and your Postgres log file each to be on separate hard drives. That's just going to let the operating system schedule disk I/O more efficiently.

So, if that solves your problems, you know, great. You're welcome. If you guys want to leave the room now, that's fine, I won't be offended. You can go call your local colo facility and work something out. If that doesn't solve your problem, or even if it does, you probably want to look at your queries next.

Now, most of us develop against smallish datasets. And when you develop against a small dataset, you don't notice inefficient queries. It's just life. Books have been written about query optimization. It's a huge topic, and I can't explain everything in thirty minutes. So I'm just going to explain one thing, and that's EXPLAIN.

Fortunately, Postgres gives us an awesome tool called EXPLAIN: you pass it a query and it gives you back a query plan. Oh, wait a second. That's a chapter from the Iliad. Sorry. It gives us a query plan, which still kind of looks like a chapter from the Iliad. But fortunately, we only have to look at one metric here. The only thing we need to worry about, for this talk, is rows.

And here we have an example of a very efficient query. It's going to use an index, and it's going to look at one row. That's pretty sweet. Very fast. This is the type of query that works with biggish data.

But there's one thing you need to know about rows. Obviously, the more rows you're dealing with, the more data your computer has to crunch to give you the answers you want. And so the whole name of the game in query optimization is to limit the number of rows that you have to touch.

Let's go back to that inefficient count query. If you run EXPLAIN on it, it turns out that when you ask Postgres to count all of the rows in a table, it literally counts all the rows in the table, one by one. And so you wind up with one of these things. It's not pretty.
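To make that rows metric concrete, here's a minimal sketch of the two plans being described, against a hypothetical errors table. The names and the plan numbers are illustrative, not Honeybadger's actual schema or output:

```sql
-- Hypothetical table; the thing to look at is the "rows" estimate in each plan.

-- An indexed lookup: the planner expects to touch a single row.
EXPLAIN SELECT * FROM errors WHERE id = 12345;
--  Index Scan using errors_pkey on errors  (cost=0.43..8.45 rows=1 width=120)
--    Index Cond: (id = 12345)

-- The naive count: a sequential scan that walks every row in the table.
EXPLAIN SELECT COUNT(*) FROM errors;
--  Aggregate  (cost=233640.00..233640.01 rows=1 width=0)
--    ->  Seq Scan on errors  (cost=0.00..208640.00 rows=10000000 width=0)
```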
And it's often hard to know when stuff like this is going to crop up. For example, I mentioned that most pagination systems break at page 1,000 or 2,000. That's because the OFFSET and LIMIT operators work a lot like count: if you do an offset of 500,000, Postgres is going to count up 500,000 rows, and then if you have a limit of 100, it's going to count up another 100. So what you see is that pages one, two, and three load extremely quickly. And so you, as a developer - when I say you I mean me, because I actually fell into this trap - test it with one or two pages and it works fine, so you move on, and then later your customers are complaining because they can't access page 500. It's timing out.

The solution to this is to use a range query. Range queries are very fast, and I'm going to give you a really stupid example here (there's a sketch of it at the end of this section). Here's a link with a much more detailed analysis of all this. But essentially, if you do a range query, you're going to be able to use an index. It's going to touch 100 rows and it's going to be really fast.

Now, I know what you guys are thinking: that's not the same as an offset/limit query. And that's the reason I gave you the link in the previous slide.

Sorting. Sorting is tricky. Sorting is the devil. Sometimes it's super fast: if you happen to be asking for data sorted in exactly the same way that an index is sorted, it'll be super fast. Other times, even if you have an index on a column, if it's not set up right or if you're asking for the data in a slightly unusual way, you'll wind up sorting the whole damn dataset. It's no good.

So here's your rundown for query optimization. You want to develop against a real dataset, because otherwise you won't know when things are going wrong until they go wrong in production. Use EXPLAIN. Use it a lot. And the whole name of the game here is to limit the number of rows that you touch. Because the more rows you touch, the slower your queries are. You know, in general.

So, I don't know about you, but that was a lot of cognitive load happening, right up here. So I want to give you guys a cookie.
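Here's that pagination sketch, hedged: assume a hypothetical errors table with an indexed id column, where 12345 stands in for the last id shown on the previous page.

```sql
-- Offset pagination: Postgres still counts its way past half a million rows.
SELECT * FROM errors ORDER BY id LIMIT 100 OFFSET 500000;

-- Range ("keyset") pagination: remember the last id you served,
-- then start the next page right after it. The index does the work,
-- and only about 100 rows get touched.
SELECT *
FROM errors
WHERE id > 12345   -- last id from the previous page
ORDER BY id
LIMIT 100;
```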
For the second half of this talk, we're going to cover a lot of issues that relate to the infrastructure around biggish data, around big datasets and firehoses of data coming in. And here's a legend of how we're going to approach this. We're going to have a cute picture that hopefully symbolizes something about the topic, we're going to have the name of the topic, and then we're going to have a link with more info about the topic, if you want to find it later. You don't have to write all these links down right now. At the end of the presentation, on the very last slide, I'll have a URL where you can get a list of all of them. And, yeah, I promise this isn't malware. You just gotta take my word on that.

Our ancestors were really interested in disk operating systems. It seems like they were building a new disk operating system every two seconds. What's up with that? I don't know. Now we have, like, three.

The first thing that you should do if you're using Linux is to increase your read-ahead cache. The read-ahead cache is something that, well, I found not a ton of people know about, but it's a super easy way to get up to a doubling of your read throughput. Essentially what this means is that Linux examines the requests that you make to the disk, and if it sees that you're asking for lots of blocks that come right after one another in a row, it's going to preload the next set of blocks into RAM. The default for this is something like 256 kilobytes of RAM that it uses to prefetch these blocks. If you bump that up to, like, two megs or four megs, you'll get a really big increase in read performance.

Use a modern file system. That means ext3 is not an option. If you want to know why, check out the link.

And if you happen to be using ext4 or XFS - I can never say that unless I say it super slow - you might want to look into your journaling settings. If you have your database on a completely separate hard drive, and it's running ext4 with full data journaling enabled, then, since Postgres does its own journaling, you're going to have an inefficiency there, because you have two things journaling.
You gotta tell PG - tell Paul Graham - about all the RAM that you bought in that fancy new box. A really easy way to set this up is to use the pgtune script. It examines your computer and writes a Postgres configuration file with some pretty reasonable values in it, and you can tweak them from there. This stuff is all really easy to find.

And then, finally - or, wait, not finally - the bane of DB administrators is the vacuum command. Postgres needs to vacuum, because it's messy. When you run queries, when you delete things, when you update things, it leaves a lot of stuff undone. And it does that in the name of speed: it only does what's necessary at the time of the query to get you an answer, and then it uses vacuum to go and clean up after itself.

The problem is that vacuum can be really resource intensive. So if your server is under a lot of load, and you see that vacuum is also causing a lot of load, you may be tempted to turn off vacuum, or to make autovacuum happen maybe once a night or something. And that's generally a bad idea. We actually got bitten by this ourselves, so that's why I bring it up. Usually the answer to vacuum problems is to vacuum more often, not less often.
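As a hedged illustration of "vacuum more often, not less often": autovacuum can be told to wake up earlier for a busy table by lowering its scale factors. The table name and the numbers below are made up; the right values depend entirely on your workload.

```sql
-- Hypothetical: trigger autovacuum after roughly 1% of the table has churned,
-- instead of the default of around 20%, so each pass has far less work to do.
ALTER TABLE errors SET (
  autovacuum_vacuum_scale_factor  = 0.01,
  autovacuum_analyze_scale_factor = 0.01
);
```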
All right. So. Velocity. I really wanted - I didn't have time - I really wanted to have a little Tron guy on a motorcycle going down the little grid there. But, eh.

Now we're going to talk about a couple of things that are important if you have a ton of data coming in, or a ton of read requests coming in, a ton of queries coming in. The first is too many database connections. Each database connection in Postgres is its own process, and each process has its own RAM overhead. So there's a limited number of connections that you want to have to your database server. If you have, I don't know, a thousand workers and web app processes and all that, you don't want to open a thousand database connections, and you probably already know about this. The solution is to pool connections. There are ways to do this in Ruby. There's also, if you're interested in a more ops-y approach, something like PgBouncer, which is a proxy that sits between your Ruby app and your database and functions as a connection pool.

You can also run into problems with too many locks. And this is the sort of problem that you don't really ever run into if you don't have a firehose of data coming in. I don't expect you to know everything about locks, because it's a pretty complex topic. But you should know that, within a transaction, if you go to update a row, that transaction is going to put a lock on the row until it's done. It's going to say that nothing else can write to that row until it's done. And, you know, that makes sense.

But where this can bite you is if you have, say, two Rails models. You have a parent model and a child model, and the parent model has a counter cache that gets incremented every time you add a child. Normally, this is no big deal. People do this sort of thing all the time. But if something crazy happens and someone bombards your API and suddenly you have, like, fifteen thousand children created, you're going to have some locking issues.

Because what's going to happen is: your first query is going to execute fine. The child gets created, the counter gets incremented, everything's fine. And while it was doing that, it put a lock on that row. Once the first query's done, the lock is removed, and the second query happens. If you had infinite time, this would all be fine; it would eventually work itself out. But what happens is that by the time you finish query 100, query 15,000 has timed out, which causes all sorts of fun in your Unicorns and your Sidekiqs and all that. It's a huge pain in the neck.

And the way you avoid this is, well, it's just an architectural thing. You have to avoid any situation where you could be updating the same record in the database a gillion times per second.
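Here's a rough sketch of what that counter-cache pile-up looks like at the SQL level. The table and column names are hypothetical, but the locking behavior is just standard Postgres row locking.

```sql
-- Session A: creating a child and bumping the parent's counter cache.
BEGIN;
INSERT INTO comments (post_id, body) VALUES (1, 'hello');
UPDATE posts SET comments_count = comments_count + 1 WHERE id = 1;
-- ...whatever else is in the transaction runs here...
COMMIT;  -- the row lock on posts.id = 1 is held until this point

-- Session B, running at the same time, blocks on this statement
-- until session A commits:
UPDATE posts SET comments_count = comments_count + 1 WHERE id = 1;

-- Multiply that by 15,000 concurrent children and the queue on that one
-- parent row grows longer than your statement timeout.
```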
Intensive database queries. Sometimes we have our production database, and we need it to be very, very performant for our users. But we also need to maybe do some preprocessing on data as it comes in. A really simple way to handle this is to use Postgres's streaming replication facilities to create a read-only replica, and then you just run your intensive queries against that and it doesn't affect your users. It's super simple. Sorry, I kind of feel dumb even talking about it here. But, there you go.

Partitioning is awesome. Partitioning is like the best thing ever, because it allows you to dramatically speed up data culling and data archiving. What you can set up in Postgres is a partitioning scheme in which data for different days go into different physical database tables. And, you know, that's all right. But the really cool thing is that you access that data by querying a parent, sort of virtual, table, which then propagates that query across all of its children, magically gathers the results, and spits them back to you. So you don't have to update your Ruby code at all, which is sweet.

And this way, if you want to, say, delete data that's 180 days old, you just drop the table associated with 180 days ago. If you were instead to do a DELETE WHERE created_at is less than 180 days ago, you'd probably be waiting for weeks. (There's a little sketch of this at the end of this section.)
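A minimal sketch of that day-by-day partitioning idea, using the table-inheritance approach Postgres offered around the time of this talk (newer versions have declarative PARTITION BY, which is friendlier). All the names are hypothetical, and the insert-routing trigger you'd want in practice is omitted for brevity.

```sql
-- Parent table: what the Rails app keeps querying.
CREATE TABLE errors (
  id         bigserial,
  message    text,
  created_at timestamptz NOT NULL
);

-- One child table per day. The CHECK constraint lets the planner skip
-- partitions that can't possibly match a query's date range.
CREATE TABLE errors_2014_04_22 (
  CHECK (created_at >= '2014-04-22' AND created_at < '2014-04-23')
) INHERITS (errors);

-- Queries against the parent automatically include the children,
-- so no Ruby code has to change:
SELECT count(*) FROM errors WHERE created_at >= '2014-04-22';

-- Culling old data is just dropping that day's table,
-- instead of a week-long DELETE:
DROP TABLE errors_2014_04_22;
```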
And last, but not least, backups are a real pain in the neck when you have a huge dataset. That cron job that dumps your entire database and uploads it to S3 doesn't work so well when your database is, like, two terabytes big. And, yeah, that's a big S3 bill, too. Fortunately, we can take advantage of the same machinery that lets Postgres do streaming replication to do sort of on-the-fly, incremental backups. There's a tool called WAL-E which makes this super easy. It's really cool, because it makes it very easy to upload incremental backups to, say, S3, and then when you want to restore your database, it makes it really easy to restore to a specific point in time. So WAL-E is really awesome. I love it.

Now, I stayed in my hotel room last night. I didn't get to go to the Speakeasy thing, which kind of bummed me out. But I had to work on these slides for you people. And when I got done with this, when I sort of reached this point, I looked back on all of these slides - there are like fifty-four of them - and I was like, holy crap. This is a lot of information.

Yeah. This is a lot of information. But the thing I want to stress is that, at least in my experience, these issues tend to come at you one at a time. This isn't the sort of thing where you have to know all of this stuff in advance in order to deal with biggish data in Postgres. Things come at you one at a time, and you can deal with them one at a time.

And I have faith in you. I think you can do it, because you're awesome, because nobody told us that we could turn transistors into LOLcats, but we did it. That's the type of people we are, and that's why I'm proud to be at RailsConf. If you're interested in talking to me about this, or if you think I'm full of shit about anything, which I probably am on at least one point, just say so. Feel free to come up to me after the conference, or after my talk here. I have delicious hot and spicy Mexican candy as an incentive. So there you go.

If you want to learn more about the stuff that I... Are you taking... do you want to take a picture of the Mexican candy? OK. OK. I can send it to you.

If you want to learn more, all of the links referenced in this talk are at this URL. And if you are interested in having more visibility into your production errors, check out Honeybadger, because, yeah, we love Rails devs. And that's it. It says end of show here, so I guess it must be the end of show.