STARR HORNE: So thank you guys for coming. This is gonna be an awesome talk. Quite possibly the best talk you'll ever hear in your life. What? I actually have a built-in escape hatch here, so if things start going wrong, I might just bail. Yeah.

So, anyway. A couple weeks ago I was sort of having a panic attack about this presentation, because I'm a programmer, and my natural habitat is some sort of dark place, looking at a screen. It's not talking to a bunch of people with, like, lights. What's up with this?

So I was in the middle of this panic attack and I went out on a walk, because that always clears my mind, and I was listening to a podcast. My favorite podcast. It's called Hardcore History. I highly recommend it. Yeah. Oh, somebody likes it. Yeah!

This particular episode of Hardcore History was about the Punic Wars. In case you haven't heard of the Punic Wars and don't know the story, I'll give you the brief rundown. Basically, a long, long time ago there was this place called Rome, and they decided that they should take over the world. And, for the most part, they did. But there was this one country called Carthage that was a thorn in their side for a long time. They fought all these wars back and forth. It was kind of a stalemate.

And then one day, when, I don't know, maybe he was making a sacrifice to the fire god or whatever, this guy named Hannibal had this great idea. He was like, I'm gonna lead a sneak attack on Rome, and I'm gonna do it by marching my army through the Alps. Which is pretty cool. I mean, that's pretty badass. But I think the most awesome part of the story, for me at least, is that the dude had war elephants. I don't know if you can see it in the slide, but there's actually a war elephant there. It's kind of under the G. And a lot of people have doubted this story over the years.
And so, in 1959, there was a British engineer who, on a bar bet, borrowed a circus elephant named Jumbo and marched it across the Alps, too. So, I don't know what my point is here, really. I guess: don't underestimate elephants, because they don't like it and they have long memories.

So, OK. This talk is really about biggish data. So what the hell is biggish data? Whoa, I've got one of these things. It's not big data. This talk isn't gonna be about Hadoop clusters and server farms and all that stuff. That's way over my head. It's not about fancy architectures, either. I'm not gonna show you how to make multi-write Postgres clusters that do automatic sharding and stuff. I mean, that's all wicked cool, but it's not what we're gonna talk about.

Instead, I'm gonna talk about something that I think is more practical, and kind of more interesting to me, and that is: how do you keep your app working as your production dataset grows? And grows into biggish data territory?

This is very easy to have happen, even if you don't have a firehose of data coming in. If you run a popular e-commerce site, or just a site with a lot of users, over the course of years you can accumulate a ton of data. And as this data accumulates, you find that your site performance goes downhill, sort of gradually, in ways that you don't understand.

So, what is biggish data? This talk is based on my experience at Honeybadger. In case you haven't heard of us, we are an exception, performance, and uptime monitoring company. That means we essentially have a firehose of new data coming in every day. Right now we have about a terabyte of errors in our database. The world has a lot of errors. You guys need to start, like, doing a better job. And we get about two gigabytes of new errors per day, roughly.
And all this goes into a plain vanilla Postgres database that backs a pretty plain vanilla Rails application.

Well, the good news is that Postgres can handle this pretty easily. Postgres has got it covered. The bad news is that, unless you've engineered your app to deal with this level of data, you're kind of screwed. And the reason is that a hundred-megabyte database behaves fundamentally differently than a one-terabyte database. A lot of the conventions, a lot of the normal things we do in Rails apps, just stop working when you have this much data.

Yup. This is actually a live feed. This query's been running since this morning. Something I learned: pretty much every pagination system breaks on page, like, 2000. And even if you're like, dear God, make this stop, I just want to delete half my data, you're still screwed, because deleting data takes a lot of time.

Now, I'm gonna explain all of this. I'm gonna explain why this happens, how you can work around it, and how you can optimize the queries in your own database and optimize your stack. But to really understand this, we've got to take a little trip back in time, to the summer of 1978.

The summer of disco, free love, and the VT100 terminal. The VT100 was the first sort of computer-shaped thing that you could buy from an actual company and sit on your desk to prove that you were the alpha nerd of the office. It was also the time that Oracle 1.0 was being developed, which was one of the first databases that used the SQL language. It's pretty cool. It was written in PDP-11 assembly language. And in case you're too young to have ever seen a PDP-11, this is what a PDP-11 looks like. Yeah. Show some love for the PDP-11.

Now, this is pretty awesome. Nowadays, in our modern-day times, right this second, marketers down in the, what do they call that? The exhibit hall.
Are, right now, throwing around all sorts of words about cloud computing and platform as a service and, I don't know, some stuff I probably haven't even heard of. Back then it was the same, but the buzzwords were different. The buzzword on everybody's lips at that time was "real-time" computing.

And I'm using air quotes around "real time," because that meant you could have your report generated in five minutes instead of five days. Which I'm sure was pretty awesome. Like, I would have been really happy about that. But, you know, looking at this, it doesn't really seem like the sort of thing I would want to build a web app on top of, right? Yeah. I wouldn't stick Honeybadger on that.

But a funny thing happened in the past thirty-six years. A funny thing happened, which is that Moore's Law made this into a web stack. Now, I know that everybody here knows, sort of vaguely, what Moore's Law is. But I thought it would be good to go over the technical, precise definition of Moore's Law.

And that is: computers get more awesome as time goes forward. Did that come back normally? Yeah. As time goes forward. And awesomeness is generally defined as, well, technically, being able to do more stuff faster. Being able to process more data faster. So, based on this, I would like to postulate, for the first time in public - you guys are very lucky to be able to hear this - Starr's Corollary to Moore's Law, which is: as database growth outpaces Moore's Law, you literally travel back in time.

That was the TED moment of this talk. I can hear you guys, your minds being blown. It sounds like bubble wrap popping.

So, based on my extensive research of time-travel movies, the first rule for surviving your new time period is that you need to understand the methods and the motivations of the people in that time period. We need to start thinking like our ancestors.
And our ancestors were very interested in this thing called hardware.

Well, I don't mean to be flip, but I just want to bring this up first: if you happen to be having database scaling issues, like if your app right now is getting slower because the amount of data in your database is getting bigger, and your app happens to live anywhere named EC2, Linode, I don't know, whatever other companies offer that, you can probably solve your scaling issue right now just by going and buying a real damn computer.

Yeah. You'll be shocked and amazed. Because the two things that databases need to work at high performance with lots of data are lots of RAM and really fast disk I/O, and virtual servers give you neither of those things. So just go buy a real damn computer. And while you're at it, throw in a bunch of disks.

I'm not talking about a RAID array. You probably want your operating system, your Postgres data, and your Postgres log files each on separate hard drives. That's just gonna let the operating system schedule disk I/O more efficiently.
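To make that a little more concrete, here's a minimal sketch of the Postgres side of the separate-drives idea. The mount point, table, and index names are made up; the WAL can live on yet another drive by initializing or symlinking its directory there.

    -- Register a directory on the dedicated data drive as a tablespace
    -- (the directory must already exist and be owned by the postgres user).
    CREATE TABLESPACE big_data_disk LOCATION '/mnt/data1/pgdata';

    -- Move a large table and its busiest index onto that drive.
    -- (This physically rewrites the table, so do it in a maintenance window.)
    ALTER TABLE errors SET TABLESPACE big_data_disk;
    ALTER INDEX errors_created_at_idx SET TABLESPACE big_data_disk;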
So, if that solves your problems, great. You're welcome. If you guys want to leave the room now, that's fine. I won't be offended. You can go call your local colo facility and work something out. If that doesn't solve your problem, or even if it does, you probably want to look at your queries next.

Now, most of us develop against smallish datasets, and when you develop against a small dataset, you don't notice inefficient queries. It's just life. Books have been written about query optimization. It's a huge topic, and I can't explain everything in thirty minutes. So I'm just going to explain one thing, and that's called EXPLAIN.

Fortunately, Postgres gives us an awesome tool called EXPLAIN: you pass it a query and it gives you a query plan. Oh, wait a second. That's a chapter from the Iliad. Sorry. It gives us a query plan, which still kind of looks like a chapter from the Iliad. But fortunately, for this talk, we only have to look at one metric: rows.

And here we have an example of a very efficient query. It's gonna use an index, and it's gonna look at one row. That's pretty sweet. Very fast. This is the type of query that works with biggish data.

But there's one thing you need to know about rows: obviously, the more rows you're dealing with, the more data your computer has to crunch to give you the answers you want. And so the whole name of the game in query optimization is to limit the number of rows that you have to touch.

Let's go back to that inefficient count query. If you run EXPLAIN on it, it turns out that when you ask Postgres to count all of the rows in a table, it literally counts all the rows in the table, one by one. And so you wind up with one of these things. It's not pretty.

And it's often hard to know when stuff like this is gonna crop up. For example, I mentioned that most pagination systems break at page 1000 or 2000. That's because the OFFSET and LIMIT operators work a lot like count: if you do an offset of 500,000, Postgres is going to count up 500,000 rows, and then if you have a limit of 100, it's gonna count up another 100. So what you see is that pages one, two, and three load extremely quickly. And so you as a developer - and when I say you I mean me, because I actually fell into this trap - you test it with one or two pages and it works fine, so you move on, and then later on your customers are complaining because they can't access page 500. It's timing out.
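Roughly, here's what EXPLAIN shows for these cases against a hypothetical errors table. The exact plans and numbers depend on your schema, version, and statistics; the shape of the plan is the point.

    -- Cheap: an indexed lookup plans out to roughly one row.
    EXPLAIN SELECT * FROM errors WHERE id = 12345;
    --   Index Scan using errors_pkey on errors  (... rows=1 ...)

    -- Expensive: counting has to visit every row in the table.
    EXPLAIN SELECT COUNT(*) FROM errors;
    --   Aggregate
    --     ->  Seq Scan on errors  (... rows=<every row in the table> ...)

    -- Expensive: a deep OFFSET still has to walk past all the skipped rows
    -- before it can hand back the 100 you actually asked for.
    EXPLAIN SELECT * FROM errors ORDER BY id LIMIT 100 OFFSET 500000;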
The solution to this is to use a range query. Range queries are very fast, and I'm gonna give you a really stupid example here. Here's a link with a much more detailed analysis of all this. But essentially, if you do a range query, you're gonna be able to use an index. It's gonna touch 100 rows and it's gonna be really fast. Now, I know what you guys are thinking: that's not the same as an offset-limit query. And that's the reason I gave you the link in the previous slide.

Sorting. Sorting is tricky. Sorting is the devil. Sometimes it's super fast: if you happen to be asking for data sorted in exactly the same way that an index is sorted, it'll be super fast. Other times, even if you have an index on a column, if it's not set up right or if you're asking for the data in a slightly unusual way, you'll wind up sorting the whole damn dataset. It's no good.

So here's your rundown for query optimization. You want to develop against a real dataset, because otherwise you won't know when things are going wrong until they go wrong in production. Use EXPLAIN. Use it a lot. And the whole name of the game here is to limit the number of rows that you touch, because the more rows you touch, the slower your queries are. You know, in general.
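To pin down that range-query idea, here's a sketch with made-up column names: keyset-style pagination, plus the kind of composite index that hands rows back already in the order you asked for.

    -- Keyset pagination: remember the last id the client saw and range
    -- forward from it, instead of OFFSETting past half a million rows.
    SELECT *
      FROM errors
     WHERE project_id = 42
       AND id > 1013000          -- last id on the previous page
     ORDER BY id
     LIMIT 100;

    -- An index sorted the same way the query sorts keeps this fast.
    CREATE INDEX errors_project_id_id_idx ON errors (project_id, id);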
So, I don't know about you, but that was a lot of cognitive load happening right up here. So I want to give you guys a cookie.

For the second half of this talk, we're gonna cover a lot of issues that relate to the infrastructure around biggish data, around big datasets and firehoses of data coming in. And here's a legend for how we're gonna approach this: we're gonna have a cute picture that hopefully symbolizes something about the topic, we're gonna have the name of the topic, and then we're gonna have a link with more info about the topic if you want to find it later.

And you don't have to write all these links down right now. At the end of the presentation, on the very last slide, I'll have a URL where you can get a list of all of them. And, yeah, I promise this isn't malware. You just gotta take my word on that.

Our ancestors were really interested in disk operating systems. It seems like they were building a new disk operating system, like, every two seconds. What's up with that? I don't know. Now we have, like, three.

The first thing that you should do if you're using Linux is to increase your read-ahead cache. The read-ahead cache is something that, well, I've found not a ton of people know about, but it's a super easy way to get up to a doubling of your read throughput. Essentially, Linux examines the requests that you make to the disk, and if it sees that you're asking for lots of blocks that come right after one another in a row, it's gonna preload the next set of blocks into RAM. The default for this is something like 256K of RAM that it uses to prefetch these blocks. If you bump it up to, like, two megs or four megs, you'll get a really big increase in read performance.

Use a modern file system. That means ext3 is not an option. If you want to know why, check out the link. And if you happen to be using ext4 or XFS - I can never say that unless I say it super slow - you might want to look into journaling settings. If you have your database on a completely separate hard drive, and it's running ext4 with full data journaling enabled, you're gonna have an inefficiency there, because Postgres does its own journaling, so now you have two things journaling.
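For what it's worth, here's a sketch of those two OS-level knobs on Linux. The device name and mount point are placeholders, so adapt them before copying anything.

    # Read-ahead: the value is in 512-byte sectors, so 4096 = 2 MB.
    # (Not persistent across reboots; add it to an init script.)
    sudo blockdev --getra /dev/sdb
    sudo blockdev --setra 4096 /dev/sdb

    # Example /etc/fstab entry for an ext4 data volume: noatime cuts
    # metadata writes, and the default data=ordered mode avoids doing full
    # data journaling on top of Postgres's own write-ahead log.
    # /dev/sdb1  /var/lib/postgresql  ext4  noatime,data=ordered  0  2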
Finally - or, wait, not finally. Anyway. You've gotta tell Postgres about all the RAM that you bought in that fancy new box. A really easy way to set this up is to use a pgtune script. It examines your computer and writes a Postgres configuration file with some pretty reasonable values in it, and you can tweak them from there. This stuff is all really easy to find.

And then, finally, the bane of DB administrators is the vacuum command. Postgres needs to vacuum because it's messy. When you run queries, when you delete things, when you update things, it leaves a lot of stuff undone. And it does that in the name of speed: it only does what's necessary at query time to get you an answer, and then it uses vacuum to go and clean up after itself. The problem is that vacuum can be really resource intensive. So if your server is under a lot of load, and you see that vacuum is also causing a lot of load, you may be tempted to turn off vacuum, or to make autovacuum happen maybe once a night or something. That's generally a bad idea. We actually got bitten by this ourselves, which is why I bring it up. Usually the answer to vacuum problems is to vacuum more often, not less often.

All right. So. Velocity. I really wanted to - I didn't have time - I really wanted to have a little Tron guy on a motorcycle going down the little grid there. But, eh.

Now we're gonna talk about a couple of things that are important if you have a ton of data coming in, or a ton of read requests coming in, a ton of queries coming in. The first is too many database connections. Each database connection in Postgres is its own process, and each process has its own RAM overhead. So there's a limited number of connections that you want to have to your database server. If you have, I don't know, a thousand workers and web app processes and all that, you don't want to open a thousand database connections - and you probably already know about this. The solution is to pool connections. There are ways to do this in Ruby.
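On the Ruby side, the usual knob is Active Record's built-in connection pool, set in database.yml. A minimal sketch with illustrative values - keep processes times pool comfortably under Postgres's max_connections:

    # config/database.yml
    production:
      adapter: postgresql
      database: myapp_production   # placeholder name
      pool: 5                      # connections per Ruby process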
There's also, if you're interested in a more ops-y approach, something like PgBouncer, which is a proxy that sits in between your Ruby app and your database and functions as a connection pool.

You can also run into problems with too many locks. And this is the sort of problem that you don't really ever run into unless you have a firehose of data coming in.

I don't expect you to know everything about locks, because it's a pretty complex topic. But you should know that, within a transaction, if you go to update a row, that transaction is gonna put a lock on the row until it's done. It's gonna say that nothing else can write to that row until it's done. And, you know, that makes sense.

But where this can bite you is if you have, say, two Rails models: a parent model and a child model, and the parent model has a counter cache that gets incremented every time you add a child. Normally, this is no big deal. People do this sort of thing all the time. But if something crazy happens and someone bombards your API and suddenly you have, like, fifteen thousand children created, you're gonna have some locking issues.

Because what's gonna happen is your first query is going to execute fine. The child gets created, the counter gets incremented, everything's fine. And while it was doing that, it put a lock on that row. Once the first query's done, the lock is removed, and the second query happens. If you had infinite time, this would all be fine. It would all eventually work itself out. But what happens is that by the time you finish query 100, query 15,000 has timed out, which causes all sorts of fun in your Unicorns and your Sidekiqs and all that. It's a huge pain in the neck.
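In raw SQL terms, the pile-up looks roughly like this, with hypothetical table and column names:

    -- Session A
    BEGIN;
    UPDATE parents SET children_count = children_count + 1 WHERE id = 1;
    -- ...rest of the transaction: inserting the child, callbacks, etc. ...
    COMMIT;

    -- Session B, running at the same time
    BEGIN;
    UPDATE parents SET children_count = children_count + 1 WHERE id = 1;
    -- blocks here until session A commits or rolls back; queue up 15,000
    -- of these and the ones at the back start hitting request timeouts
    COMMIT;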
You just have to 444 00:25:04,130 --> 00:25:07,090 avoid any situation where you could be updating the 445 00:25:07,100 --> 00:25:11,670 same record in the database like a gillion times 446 00:25:11,670 --> 00:25:16,140 per second. 447 00:25:16,140 --> 00:25:21,390 Intensive database queries. Like, sometimes, sometimes we have our 448 00:25:21,390 --> 00:25:25,840 production database, and we need that to be very, 449 00:25:25,860 --> 00:25:29,670 very performant for our users. But we also need 450 00:25:29,680 --> 00:25:31,910 to maybe do some preprocessing on data as it 451 00:25:31,910 --> 00:25:34,321 comes in. And a really simple way to do 452 00:25:34,321 --> 00:25:38,899 this is to use PostGres's streaming replication facilities to 453 00:25:38,900 --> 00:25:42,960 create a read-only replicant. And then you just do 454 00:25:42,960 --> 00:25:46,160 your intensive queries against that and it doesn't affect, 455 00:25:46,160 --> 00:25:49,920 affect your users. It's super simple. Sorry. I, I, 456 00:25:49,920 --> 00:25:53,300 I kind of feel dumb even talking about it 457 00:25:53,300 --> 00:25:56,400 here. But, there you go. 458 00:25:56,400 --> 00:25:59,210 Partitioning is awesome. Partitioning is like the best thing 459 00:25:59,210 --> 00:26:05,230 ever, because it allows you to dramatically speed up 460 00:26:05,240 --> 00:26:10,661 data calling and data archiving. What you can set 461 00:26:10,700 --> 00:26:13,728 up in, in PostGres is a partitioning scheme in 462 00:26:13,728 --> 00:26:19,020 which data for different days go into different physical 463 00:26:19,020 --> 00:26:20,800 database tables. 464 00:26:20,800 --> 00:26:24,050 And, you know, that's all right. But the really 465 00:26:24,050 --> 00:26:27,840 cool thing is that you access that data by 466 00:26:27,840 --> 00:26:34,230 querying a parent sort of virtual table that then 467 00:26:34,240 --> 00:26:36,880 propagates that query across all of its children and, 468 00:26:36,880 --> 00:26:39,100 you know, magically does its results and, and spits 469 00:26:39,100 --> 00:26:39,610 them back to you. 470 00:26:39,610 --> 00:26:41,100 So you don't have to update your Ruby code 471 00:26:41,100 --> 00:26:44,960 at all, which is sweet. And this way, if 472 00:26:44,960 --> 00:26:48,960 you want to, say, delete data that's 180 days 473 00:26:48,960 --> 00:26:51,691 old, you just drop the table that's associated with 474 00:26:51,700 --> 00:26:55,479 180 days ago. If you were to do a 475 00:26:55,480 --> 00:27:00,060 delete where, you know, created_at is less than 180 476 00:27:00,060 --> 00:27:03,310 days ago, you're probably gonna be waiting for weeks. 477 00:27:03,310 --> 00:27:07,780 And last, but not least, backups are a real 478 00:27:07,780 --> 00:27:10,090 pain in the neck when you have a huge 479 00:27:10,100 --> 00:27:12,211 dataset. Like, that crime job that dumps your entire 480 00:27:12,211 --> 00:27:16,969 database and uploads it to S3 doesn't work so 481 00:27:16,969 --> 00:27:19,900 well when your database is, like, two terabytes big. 482 00:27:19,900 --> 00:27:24,950 And, yeah, that's a big S3 bill, too. 483 00:27:24,950 --> 00:27:27,750 Fortunately, we can take advantage of the same sort 484 00:27:27,760 --> 00:27:31,770 of things that PostGres does to allow it to 485 00:27:31,770 --> 00:27:36,931 do streaming replication to do, sort of, on the 486 00:27:36,940 --> 00:27:41,319 fly, incremental backups. And there's a tool called Wall-E, 487 00:27:41,340 --> 00:27:45,150 which makes this super easy. 
And last, but not least, backups are a real pain in the neck when you have a huge dataset. That cron job that dumps your entire database and uploads it to S3 doesn't work so well when your database is, like, two terabytes big. And, yeah, that's a big S3 bill, too.

Fortunately, we can take advantage of the same machinery Postgres uses for streaming replication to do sort of on-the-fly, incremental backups. There's a tool called WAL-E which makes this super easy. It's really cool, because it makes it very easy to upload incremental backups to, say, S3, and then when you want to restore your database, it makes it really easy to restore to a specific point in time. So WAL-E is really awesome. I love it.

Now, I stayed in my hotel room last night. I didn't get to go to the Speakeasy thing, which kind of bummed me out, but I had to work on these slides for you people. And when I got done with this, when I sort of reached this point, I looked back on all of these slides - there are, like, fifty-four of them - and I was like, holy crap. This is a lot of information.

Yeah. This is a lot of information. But the thing I want to stress is that, at least in my experience, these issues tend to come at you one at a time. This isn't the sort of thing where you have to know all of this stuff in advance in order to deal with biggish data in Postgres. Things come at you one at a time, and you can deal with them one at a time.

And I have faith in you. I think you can do it, because you're awesome, because nobody told us that we could turn transistors into LOLCats, but we did it. That's the type of people we are, and that's why I'm proud to be at RailsConf. If you're interested in talking to me about this, or if you think I'm full of shit about anything, which I probably am on at least one point, just say so. Feel free to come up to me after the conference, or after my talk here. I have delicious hot and spicy Mexican candy as an incentive. So there you go.

If you want to learn more about the stuff that I - are you taking? Do you want to take a picture of the Mexican candy? OK. OK. I can send it to you.
If you want to learn more, all of the links referenced in this talk are at this URL. And if you're interested in having more visibility into your production errors, check out Honeybadger, because, yeah, we love Rails devs. And that's it. It says "end of show" here, so I guess it must be the end of the show.