STARR HORNE: So thank you guys for coming. This is gonna be an awesome talk. Quite possibly the best talk you'll ever hear in your life. What? I actually have a built-in escape hatch here, so if things start going wrong, I might just bail. Yeah.

So, anyway. A couple weeks ago I was sort of having a panic attack about this presentation, because I'm a programmer, and my natural habitat is some sort of dark place, looking at a screen. It's not talking to a bunch of people with, like, lights. What's up with this?

So I was in the middle of this panic attack and I went out on a walk, because that always clears my mind, and I was listening to a podcast. My favorite podcast. It's called Hardcore History. I highly recommend it. Yeah. Oh, somebody likes it. Yeah!

This particular episode of Hardcore History was about the Punic Wars. In case you haven't heard of the Punic Wars and don't know the story, I'll give you the brief rundown. Basically, a long, long time ago there was this place called Rome, and they decided that they should take over the world. And, for the most part, they did. But there was this one country called Carthage that was a thorn in their side for a long time. They fought all these wars back and forth. It was kind of a stalemate.

And then one day, when, I don't know, maybe he was making a sacrifice to the fire god or whatever, this guy named Hannibal had this great idea. He was like, I'm gonna lead a sneak attack on Rome, and I'm gonna do it by marching my army through the Alps. Which is pretty cool. I mean, that's pretty badass. But I think the most awesome part of the story, for me at least, is that the dude had war elephants. I don't know if you can see it in the slide, but there's actually a war elephant there. It's kind of under the G. And a lot of people have doubted this story over the years.
And so, in 1959, there was a British engineer who, on a bar bet, borrowed a circus elephant named Jumbo and marched it across the Alps, too. So, I don't know what my point is here, really. I guess: don't underestimate elephants, because they don't like it and they have long memories.

So, OK. This talk is really about biggish data. So what the hell is biggish data? Whoa, I've got one of these things. It's not big data. This talk isn't gonna be about Hadoop clusters and server farms and all that stuff. That's way over my head. It's not about fancy architectures, either. I'm not gonna show you how to make multi-write Postgres clusters that do automatic sharding and stuff. I mean, that's all wicked cool, but it's not what we're gonna talk about.

Instead, I'm gonna talk about something that I think is more practical, and kind of more interesting to me, and that is: how do you keep your app working as your production dataset grows? And grows into biggish data territory?

This is very easy to have happen, even if you don't have a firehose of data coming in. If you run a popular e-commerce site, or just a site with a lot of users, over the course of years you can accumulate a ton of data. And as this data accumulates, you find that your site performance goes downhill, sort of gradually, in ways that you don't understand.

So, what is biggish data? This talk is based on my experience at Honeybadger. In case you haven't heard of us, we are an exception, performance, and uptime monitoring company. That means we essentially have a firehose of new data coming in every day. Right now we have about a terabyte of errors in our database. The world has a lot of errors. You guys need to start, like, doing a better job. And we get about two gigabytes of new errors per day, roughly.
And all this goes into a plain vanilla Postgres database that backs a pretty plain vanilla Rails application.

Well, the good news is that Postgres can handle this pretty easily. Postgres has got it covered. The bad news is that, unless you've engineered your app to deal with this level of data, you're kind of screwed. And the reason is that a hundred-megabyte database behaves fundamentally differently than a one-terabyte database. A lot of the conventions, a lot of the normal things we do in Rails apps, just stop working when you have this much data.

Yup. This is actually a live feed. This query's been running since this morning. Something I learned: pretty much every pagination system breaks on page, like, 2000. And even if you're like, dear God, make this stop, I just want to delete half my data, you're still screwed, because deleting data takes a lot of time.

Now, I'm gonna explain all of this. I'm gonna explain why this happens, how you can work around it, and how you can optimize the queries in your own database and optimize your stack. But to really understand this, we've got to take a little trip back in time, to the summer of 1978.

The summer of disco, free love, and the VT100 terminal. The VT100 was the first sort of computer-shaped thing that you could buy from an actual company and sit on your desk to prove that you were the alpha nerd of the office. It was also the time that Oracle 1.0 was being developed, which was one of the first databases that used the SQL language. It's pretty cool. It was written in PDP-11 assembly language. And in case you're too young to have ever seen a PDP-11, this is what a PDP-11 looks like. Yeah. Show some love for the PDP-11.

Now, this is pretty awesome. Nowadays, in our modern-day times, right this second, marketers down in the, what do they call that? The exhibit hall.
Are, right now, throwing around all sorts of words about cloud computing and platform as a service and, I don't know, some stuff I probably haven't even heard of. Back then it was the same, but the buzzwords were different. The buzzword on everybody's lips at that time was "real-time" computing.

And I'm using air quotes around "real time," because that meant you could have your report generated in five minutes instead of five days. Which I'm sure was pretty awesome. Like, I would have been really happy about that. But, you know, looking at this, it doesn't really seem like the sort of thing I would want to build a web app on top of, right? Yeah. I wouldn't stick Honeybadger on that.

But a funny thing happened in the past thirty-six years. A funny thing happened, which is that Moore's Law made this into a web stack. Now, I know that everybody here knows, sort of vaguely, what Moore's Law is. But I thought it would be good to go over the technical, precise definition of Moore's Law.

And that is: computers get more awesome as time goes forward. Did that come back normally? Yeah. As time goes forward. And awesomeness is generally defined as, well, technically, being able to do more stuff faster. Being able to process more data faster. So, based on this, I would like to postulate, for the first time in public - you guys are very lucky to be able to hear this - Starr's Corollary to Moore's Law, which is: as database growth outpaces Moore's Law, you literally travel back in time.

That was the TED moment of this talk. I can hear you guys, your minds being blown. It sounds like bubble wrap popping.

So, based on my extensive research of time-travel movies, the first rule for surviving your new time period is that you need to understand the methods and the motivations of the people in that time period. We need to start thinking like our ancestors.
And our ancestors were very interested in this thing called hardware.

Well, I don't mean to be flip, but I just want to bring this up first: if you happen to be having database scaling issues, like if your app right now is getting slower because the amount of data in your database is getting bigger, and your app happens to live anywhere named EC2, Linode, I don't know, whatever other companies offer that, you can probably solve your scaling issue right now just by going and buying a real damn computer.

Yeah. You'll be shocked and amazed. Because the two things that databases need to work at high performance with lots of data are lots of RAM and really fast disk I/O, and virtual servers give you neither of those things. So just go buy a real damn computer. And while you're at it, throw in a bunch of disks.

I'm not talking about a RAID array. You probably want your operating system, your Postgres data, and your Postgres log files each on separate hard drives. That's just gonna let the operating system schedule disk I/O more efficiently.
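To make that a little more concrete, here's a minimal sketch of the Postgres side of the separate-drives idea. The mount point, table, and index names are made up; the WAL can live on yet another drive by initializing or symlinking its directory there.

    -- Register a directory on the dedicated data drive as a tablespace
    -- (the directory must already exist and be owned by the postgres user).
    CREATE TABLESPACE big_data_disk LOCATION '/mnt/data1/pgdata';

    -- Move a large table and its busiest index onto that drive.
    -- (This physically rewrites the table, so do it in a maintenance window.)
    ALTER TABLE errors SET TABLESPACE big_data_disk;
    ALTER INDEX errors_created_at_idx SET TABLESPACE big_data_disk;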
So, if that solves your problems, great. You're welcome. If you guys want to leave the room now, that's fine. I won't be offended. You can go call your local colo facility and work something out. If that doesn't solve your problem, or even if it does, you probably want to look at your queries next.

Now, most of us develop against smallish datasets, and when you develop against a small dataset, you don't notice inefficient queries. It's just life. Books have been written about query optimization. It's a huge topic, and I can't explain everything in thirty minutes. So I'm just going to explain one thing, and that's called EXPLAIN.

Fortunately, Postgres gives us an awesome tool called EXPLAIN: you pass it a query and it gives you a query plan. Oh, wait a second. That's a chapter from the Iliad. Sorry. It gives us a query plan, which still kind of looks like a chapter from the Iliad. But fortunately, for this talk, we only have to look at one metric: rows.

And here we have an example of a very efficient query. It's gonna use an index, and it's gonna look at one row. That's pretty sweet. Very fast. This is the type of query that works with biggish data.

But there's one thing you need to know about rows: obviously, the more rows you're dealing with, the more data your computer has to crunch to give you the answers you want. And so the whole name of the game in query optimization is to limit the number of rows that you have to touch.

Let's go back to that inefficient count query. If you run EXPLAIN on it, it turns out that when you ask Postgres to count all of the rows in a table, it literally counts all the rows in the table, one by one. And so you wind up with one of these things. It's not pretty.

And it's often hard to know when stuff like this is gonna crop up. For example, I mentioned that most pagination systems break at page 1000 or 2000. That's because the OFFSET and LIMIT operators work a lot like count: if you do an offset of 500,000, Postgres is going to count up 500,000 rows, and then if you have a limit of 100, it's gonna count up another 100. So what you see is that pages one, two, and three load extremely quickly. And so you as a developer - and when I say you I mean me, because I actually fell into this trap - you test it with one or two pages and it works fine, so you move on, and then later on your customers are complaining because they can't access page 500. It's timing out.
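Roughly, here's what EXPLAIN shows for these cases against a hypothetical errors table. The exact plans and numbers depend on your schema, version, and statistics; the shape of the plan is the point.

    -- Cheap: an indexed lookup plans out to roughly one row.
    EXPLAIN SELECT * FROM errors WHERE id = 12345;
    --   Index Scan using errors_pkey on errors  (... rows=1 ...)

    -- Expensive: counting has to visit every row in the table.
    EXPLAIN SELECT COUNT(*) FROM errors;
    --   Aggregate
    --     ->  Seq Scan on errors  (... rows=<every row in the table> ...)

    -- Expensive: a deep OFFSET still has to walk past all the skipped rows
    -- before it can hand back the 100 you actually asked for.
    EXPLAIN SELECT * FROM errors ORDER BY id LIMIT 100 OFFSET 500000;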
The solution to this is to use a range query. Range queries are very fast, and I'm gonna give you a really stupid example here. Here's a link with a much more detailed analysis of all this. But essentially, if you do a range query, you're gonna be able to use an index. It's gonna touch 100 rows and it's gonna be really fast. Now, I know what you guys are thinking: that's not the same as an offset-limit query. And that's the reason I gave you the link in the previous slide.

Sorting. Sorting is tricky. Sorting is the devil. Sometimes it's super fast: if you happen to be asking for data sorted in exactly the same way that an index is sorted, it'll be super fast. Other times, even if you have an index on a column, if it's not set up right or if you're asking for the data in a slightly unusual way, you'll wind up sorting the whole damn dataset. It's no good.

So here's your rundown for query optimization. You want to develop against a real dataset, because otherwise you won't know when things are going wrong until they go wrong in production. Use EXPLAIN. Use it a lot. And the whole name of the game here is to limit the number of rows that you touch, because the more rows you touch, the slower your queries are. You know, in general.
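To pin down that range-query idea, here's a sketch with made-up column names: keyset-style pagination, plus the kind of composite index that hands rows back already in the order you asked for.

    -- Keyset pagination: remember the last id the client saw and range
    -- forward from it, instead of OFFSETting past half a million rows.
    SELECT *
      FROM errors
     WHERE project_id = 42
       AND id > 1013000          -- last id on the previous page
     ORDER BY id
     LIMIT 100;

    -- An index sorted the same way the query sorts keeps this fast.
    CREATE INDEX errors_project_id_id_idx ON errors (project_id, id);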
So, I don't know about you, but that was a lot of cognitive load happening right up here. So I want to give you guys a cookie.

For the second half of this talk, we're gonna cover a lot of issues that relate to the infrastructure around biggish data, around big datasets and firehoses of data coming in. And here's a legend for how we're gonna approach this: we're gonna have a cute picture that hopefully symbolizes something about the topic, we're gonna have the name of the topic, and then we're gonna have a link with more info about the topic if you want to find it later.

And you don't have to write all these links down right now. At the end of the presentation, on the very last slide, I'll have a URL where you can get a list of all of them. And, yeah, I promise this isn't malware. You just gotta take my word on that.

Our ancestors were really interested in disk operating systems. It seems like they were building a new disk operating system, like, every two seconds. What's up with that? I don't know. Now we have, like, three.

The first thing that you should do if you're using Linux is to increase your read-ahead cache. The read-ahead cache is something that, well, I've found not a ton of people know about, but it's a super easy way to get up to a doubling of your read throughput. Essentially, Linux examines the requests that you make to the disk, and if it sees that you're asking for lots of blocks that come right after one another in a row, it's gonna preload the next set of blocks into RAM. The default for this is something like 256K of RAM that it uses to prefetch these blocks. If you bump it up to, like, two megs or four megs, you'll get a really big increase in read performance.

Use a modern file system. That means ext3 is not an option. If you want to know why, check out the link. And if you happen to be using ext4 or XFS - I can never say that unless I say it super slow - you might want to look into journaling settings. If you have your database on a completely separate hard drive, and it's running ext4 with full data journaling enabled, you're gonna have an inefficiency there, because Postgres does its own journaling, so now you have two things journaling.
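For what it's worth, here's a sketch of those two OS-level knobs on Linux. The device name and mount point are placeholders, so adapt them before copying anything.

    # Read-ahead: the value is in 512-byte sectors, so 4096 = 2 MB.
    # (Not persistent across reboots; add it to an init script.)
    sudo blockdev --getra /dev/sdb
    sudo blockdev --setra 4096 /dev/sdb

    # Example /etc/fstab entry for an ext4 data volume: noatime cuts
    # metadata writes, and the default data=ordered mode avoids doing full
    # data journaling on top of Postgres's own write-ahead log.
    # /dev/sdb1  /var/lib/postgresql  ext4  noatime,data=ordered  0  2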
Finally - or, wait, not finally. Anyway. You've gotta tell Postgres about all the RAM that you bought in that fancy new box. A really easy way to set this up is to use a pgtune script. It examines your computer and writes a Postgres configuration file with some pretty reasonable values in it, and you can tweak them from there. This stuff is all really easy to find.

And then, finally, the bane of DB administrators is the vacuum command. Postgres needs to vacuum because it's messy. When you run queries, when you delete things, when you update things, it leaves a lot of stuff undone. And it does that in the name of speed: it only does what's necessary at query time to get you an answer, and then it uses vacuum to go and clean up after itself. The problem is that vacuum can be really resource intensive. So if your server is under a lot of load, and you see that vacuum is also causing a lot of load, you may be tempted to turn off vacuum, or to make autovacuum happen maybe once a night or something. That's generally a bad idea. We actually got bitten by this ourselves, which is why I bring it up. Usually the answer to vacuum problems is to vacuum more often, not less often.

All right. So. Velocity. I really wanted to - I didn't have time - I really wanted to have a little Tron guy on a motorcycle going down the little grid there. But, eh.

Now we're gonna talk about a couple of things that are important if you have a ton of data coming in, or a ton of read requests coming in, a ton of queries coming in. The first is too many database connections. Each database connection in Postgres is its own process, and each process has its own RAM overhead. So there's a limited number of connections that you want to have to your database server. If you have, I don't know, a thousand workers and web app processes and all that, you don't want to open a thousand database connections - and you probably already know about this. The solution is to pool connections. There are ways to do this in Ruby.
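On the Ruby side, the usual knob is Active Record's built-in connection pool, set in database.yml. A minimal sketch with illustrative values - keep processes times pool comfortably under Postgres's max_connections:

    # config/database.yml
    production:
      adapter: postgresql
      database: myapp_production   # placeholder name
      pool: 5                      # connections per Ruby process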
There's also, if you're interested in a more ops-y approach, something like PgBouncer, which is a proxy that sits in between your Ruby app and your database and functions as a connection pool.

You can also run into problems with too many locks. And this is the sort of problem that you don't really ever run into unless you have a firehose of data coming in.

I don't expect you to know everything about locks, because it's a pretty complex topic. But you should know that, within a transaction, if you go to update a row, that transaction is gonna put a lock on the row until it's done. It's gonna say that nothing else can write to that row until it's done. And, you know, that makes sense.

But where this can bite you is if you have, say, two Rails models: a parent model and a child model, and the parent model has a counter cache that gets incremented every time you add a child. Normally, this is no big deal. People do this sort of thing all the time. But if something crazy happens and someone bombards your API and suddenly you have, like, fifteen thousand children created, you're gonna have some locking issues.

Because what's gonna happen is your first query is going to execute fine. The child gets created, the counter gets incremented, everything's fine. And while it was doing that, it put a lock on that row. Once the first query's done, the lock is removed, and the second query happens. If you had infinite time, this would all be fine. It would all eventually work itself out. But what happens is that by the time you finish query 100, query 15,000 has timed out, which causes all sorts of fun in your Unicorns and your Sidekiqs and all that. It's a huge pain in the neck.
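In raw SQL terms, the pile-up looks roughly like this, with hypothetical table and column names:

    -- Session A
    BEGIN;
    UPDATE parents SET children_count = children_count + 1 WHERE id = 1;
    -- ...rest of the transaction: inserting the child, callbacks, etc. ...
    COMMIT;

    -- Session B, running at the same time
    BEGIN;
    UPDATE parents SET children_count = children_count + 1 WHERE id = 1;
    -- blocks here until session A commits or rolls back; queue up 15,000
    -- of these and the ones at the back start hitting request timeouts
    COMMIT;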
You just have to 444 00:25:04,130 --> 00:25:07,090 avoid any situation where you could be updating the 445 00:25:07,100 --> 00:25:11,670 same record in the database like a gillion times 446 00:25:11,670 --> 00:25:16,140 per second. 447 00:25:16,140 --> 00:25:21,390 Intensive database queries. Like, sometimes, sometimes we have our 448 00:25:21,390 --> 00:25:25,840 production database, and we need that to be very, 449 00:25:25,860 --> 00:25:29,670 very performant for our users. But we also need 450 00:25:29,680 --> 00:25:31,910 to maybe do some preprocessing on data as it 451 00:25:31,910 --> 00:25:34,321 comes in. And a really simple way to do 452 00:25:34,321 --> 00:25:38,899 this is to use PostGres's streaming replication facilities to 453 00:25:38,900 --> 00:25:42,960 create a read-only replicant. And then you just do 454 00:25:42,960 --> 00:25:46,160 your intensive queries against that and it doesn't affect, 455 00:25:46,160 --> 00:25:49,920 affect your users. It's super simple. Sorry. I, I, 456 00:25:49,920 --> 00:25:53,300 I kind of feel dumb even talking about it 457 00:25:53,300 --> 00:25:56,400 here. But, there you go. 458 00:25:56,400 --> 00:25:59,210 Partitioning is awesome. Partitioning is like the best thing 459 00:25:59,210 --> 00:26:05,230 ever, because it allows you to dramatically speed up 460 00:26:05,240 --> 00:26:10,661 data calling and data archiving. What you can set 461 00:26:10,700 --> 00:26:13,728 up in, in PostGres is a partitioning scheme in 462 00:26:13,728 --> 00:26:19,020 which data for different days go into different physical 463 00:26:19,020 --> 00:26:20,800 database tables. 464 00:26:20,800 --> 00:26:24,050 And, you know, that's all right. But the really 465 00:26:24,050 --> 00:26:27,840 cool thing is that you access that data by 466 00:26:27,840 --> 00:26:34,230 querying a parent sort of virtual table that then 467 00:26:34,240 --> 00:26:36,880 propagates that query across all of its children and, 468 00:26:36,880 --> 00:26:39,100 you know, magically does its results and, and spits 469 00:26:39,100 --> 00:26:39,610 them back to you. 470 00:26:39,610 --> 00:26:41,100 So you don't have to update your Ruby code 471 00:26:41,100 --> 00:26:44,960 at all, which is sweet. And this way, if 472 00:26:44,960 --> 00:26:48,960 you want to, say, delete data that's 180 days 473 00:26:48,960 --> 00:26:51,691 old, you just drop the table that's associated with 474 00:26:51,700 --> 00:26:55,479 180 days ago. If you were to do a 475 00:26:55,480 --> 00:27:00,060 delete where, you know, created_at is less than 180 476 00:27:00,060 --> 00:27:03,310 days ago, you're probably gonna be waiting for weeks. 477 00:27:03,310 --> 00:27:07,780 And last, but not least, backups are a real 478 00:27:07,780 --> 00:27:10,090 pain in the neck when you have a huge 479 00:27:10,100 --> 00:27:12,211 dataset. Like, that crime job that dumps your entire 480 00:27:12,211 --> 00:27:16,969 database and uploads it to S3 doesn't work so 481 00:27:16,969 --> 00:27:19,900 well when your database is, like, two terabytes big. 482 00:27:19,900 --> 00:27:24,950 And, yeah, that's a big S3 bill, too. 483 00:27:24,950 --> 00:27:27,750 Fortunately, we can take advantage of the same sort 484 00:27:27,760 --> 00:27:31,770 of things that PostGres does to allow it to 485 00:27:31,770 --> 00:27:36,931 do streaming replication to do, sort of, on the 486 00:27:36,940 --> 00:27:41,319 fly, incremental backups. And there's a tool called Wall-E, 487 00:27:41,340 --> 00:27:45,150 which makes this super easy. 
And last, but not least, backups are a real pain in the neck when you have a huge dataset. That cron job that dumps your entire database and uploads it to S3 doesn't work so well when your database is, like, two terabytes big. And, yeah, that's a big S3 bill, too.

Fortunately, we can take advantage of the same machinery Postgres uses for streaming replication to do sort of on-the-fly, incremental backups. There's a tool called WAL-E which makes this super easy. It's really cool, because it makes it very easy to upload incremental backups to, say, S3, and then when you want to restore your database, it makes it really easy to restore to a specific point in time. So WAL-E is really awesome. I love it.

Now, I stayed in my hotel room last night. I didn't get to go to the Speakeasy thing, which kind of bummed me out, but I had to work on these slides for you people. And when I got done with this, when I sort of reached this point, I looked back on all of these slides - there are, like, fifty-four of them - and I was like, holy crap. This is a lot of information.

Yeah. This is a lot of information. But the thing I want to stress is that, at least in my experience, these issues tend to come at you one at a time. This isn't the sort of thing where you have to know all of this stuff in advance in order to deal with biggish data in Postgres. Things come at you one at a time, and you can deal with them one at a time.

And I have faith in you. I think you can do it, because you're awesome, because nobody told us that we could turn transistors into LOLCats, but we did it. That's the type of people we are, and that's why I'm proud to be at RailsConf. If you're interested in talking to me about this, or if you think I'm full of shit about anything, which I probably am on at least one point, just say so. Feel free to come up to me after the conference, or after my talk here. I have delicious hot and spicy Mexican candy as an incentive. So there you go.

If you want to learn more about the stuff that I - are you taking? Do you want to take a picture of the Mexican candy? OK. OK. I can send it to you.
If you want to learn more, all of the links referenced in this talk are at this URL. And if you're interested in having more visibility into your production errors, check out Honeybadger, because, yeah, we love Rails devs. And that's it. It says "end of show" here, so I guess it must be the end of the show.