0:00:16.600,0:00:17.920 TOM DALE: Hey, you guys ready? 0:00:17.940,0:00:19.180 Thank you guys so much for coming. 0:00:19.189,0:00:21.770 This is awesome. I was really, 0:00:21.770,0:00:23.700 I, when they were putting together the schedule, 0:00:23.700,0:00:25.349 I said, make sure that you put us down 0:00:25.349,0:00:27.160 in the Caves of Moria. So thank you 0:00:27.160,0:00:31.560 guys for coming down and making it. 0:00:31.560,0:00:32.910 I'm Tom. This is Yehuda. 0:00:32.910,0:00:35.270 YEHUDA KATZ: When people told me I was signed up 0:00:35.270,0:00:40.520 to do a back-to-back talk, I don't know what 0:00:40.520,0:00:40.579 I was thinking. 0:00:40.579,0:00:40.820 T.D.: Yup. So. We want to talk to you 0:00:40.820,0:00:43.809 today about, about Skylight. So, just a little bit 0:00:43.809,0:00:45.730 before we talk about that, I want to talk 0:00:45.730,0:00:47.530 about us a little bit. 0:00:47.530,0:00:50.420 So, in 2011 we started a company called Tilde. 0:00:50.420,0:00:52.760 It's this shirt. It may have made me self-conscious, 0:00:52.760,0:00:54.609 because this is actually a first edition and it's 0:00:54.609,0:00:57.550 printed off-center. Well, either I'm off-center or the shirt's 0:00:57.550,0:01:00.539 off-center. One of the two. 0:01:00.539,0:01:02.570 So we started Tilde in 2011, and we had 0:01:02.570,0:01:06.030 all just left a venture-backed company, and that was 0:01:06.030,0:01:08.440 a pretty traumatic experience for us because we spent 0:01:08.440,0:01:09.560 a lot of time building the company and then 0:01:09.560,0:01:11.110 we ran out of money and sold to Facebook 0:01:11.110,0:01:13.500 and we really didn't want to repeat that experience. 0:01:13.500,0:01:16.420 So, we decided to start Tilde, and when we 0:01:16.420,0:01:19.830 did it, we decided to be. DHH and the 0:01:19.830,0:01:21.970 other people at Basecamp were talking about, you know, 0:01:21.970,0:01:23.630 being bootstrapped and proud.
And that was a message 0:01:23.630,0:01:26.250 that really resonated with us, and so we wanted 0:01:26.250,0:01:28.360 to capture the same thing. 0:01:28.360,0:01:30.250 There's only one problem with being bootstrapped and proud, and 0:01:30.250,0:01:31.890 that is, in order to be both of those 0:01:31.890,0:01:33.180 things, you actually need money, it turns out. It's 0:01:33.180,0:01:34.850 not like you just say it in a blog 0:01:34.850,0:01:37.660 post and then all of the sudden you are 0:01:37.660,0:01:38.560 in business. 0:01:38.560,0:01:40.350 So, we had to think a lot about, OK, 0:01:40.350,0:01:42.450 well, how do we make money? How do we 0:01:42.450,0:01:44.690 make money? How do we make a profitable and, 0:01:44.690,0:01:47.290 most importantly, sustainable business? Because we didn't want to 0:01:47.290,0:01:49.380 just flip to Facebook in a couple years. 0:01:49.380,0:01:53.130 So, looking around, I think the most obvious thing 0:01:53.130,0:01:55.110 that people suggested to us is, well, why don't 0:01:55.110,0:01:58.050 you guys just become Ember, Inc.? Raise a few 0:01:58.050,0:02:01.390 million dollars, you know, build a bunch of business 0:02:01.390,0:02:05.730 model, mostly prayer. But that's not really how we 0:02:05.730,0:02:08.959 want to think about building open source communities. 0:02:08.959,0:02:10.860 We don't really think that that necessarily leads to 0:02:10.860,0:02:13.810 the best open source communities. And if you're interested 0:02:13.810,0:02:16.569 more in that, I recommend Leah Silber, who is 0:02:16.569,0:02:19.700 one of our co-founders. She's giving a talk this 0:02:19.700,0:02:22.450 afternoon. Oh, sorry. Friday afternoon, about how to build 0:02:22.450,0:02:25.219 a company that is centered on open source. So 0:02:25.219,0:02:26.469 if you want to learn more about how we've 0:02:26.469,0:02:28.989 done that, I would really suggest you go check 0:02:28.989,0:02:30.060 out her talk.
0:02:30.060,0:02:33.689 So, no. So, no Ember, Inc. Not allowed. 0:02:33.689,0:02:38.159 So, we really want to build something that leveraged 0:02:38.159,0:02:40.249 the strengths that we thought that we had. One, 0:02:40.249,0:02:42.680 I think most importantly, a really deep knowledge of 0:02:42.680,0:02:44.569 open source and a deep knowledge of the Rails 0:02:44.569,0:02:47.090 stack, and also Carl, it turns out, is really, 0:02:47.090,0:02:50.689 really good at building highly scalable big data sys- 0:02:50.689,0:02:53.989 big data systems. Lots of Hadoop in there. 0:02:53.989,0:02:58.290 So, last year at RailsConf, we announced the private 0:02:58.290,0:03:00.519 beta of Skylight. How many of you have used 0:03:00.519,0:03:01.709 Skylight? Can you raise your hand if you have 0:03:01.709,0:03:04.629 used it? OK. Many of you. Awesome. 0:03:04.629,0:03:08.129 So, so Skylight is a tool for profiling and 0:03:08.129,0:03:11.780 measuring the performance of your Rails applications in production. 0:03:11.780,0:03:15.389 And, as a product, Skylight, I think, was built 0:03:15.389,0:03:20.349 on three really, three key break-throughs. There were key, 0:03:20.349,0:03:22.120 three key break-throughs. We didn't want to ship a 0:03:22.120,0:03:26.189 product that was incrementally better than the competition. We 0:03:26.189,0:03:28.319 wanted to ship a product that was dramatically better. 0:03:28.319,0:03:30.079 Quantum leap. An order of magnitude better. 0:03:30.079,0:03:32.079 And, in order to do that, we spent a 0:03:32.079,0:03:33.889 lot of time thinking about it, about how we 0:03:33.889,0:03:36.389 could solve most of the problems that we saw 0:03:36.389,0:03:39.310 in the existing landscape. And so those, those break-throughs 0:03:39.310,0:03:42.299 are predicated- sorry, those, delivering a product that does 0:03:42.299,0:03:44.919 that is predicated on these three break-throughs.
0:03:44.919,0:03:46.870 So, the first one I want to talk about 0:03:46.870,0:03:53.870 is, honest response times. Honest response times. So, DHH 0:03:54.060,0:03:55.799 wrote a blog post on what was then the 0:03:55.799,0:03:58.930 37Signals blog, now the Basecamp blog, called The problem 0:03:58.930,0:04:01.909 with averages. How many of you have read this? 0:04:01.909,0:04:02.459 Awesome. 0:04:02.459,0:04:03.779 For those of you that have not, how many 0:04:03.779,0:04:08.469 of you hate raising your hands at presentations? 0:04:08.469,0:04:10.510 So, for those of you that- 0:04:10.510,0:04:11.549 Y.K.: Just put a button in every seat. Press 0:04:11.549,0:04:11.779 this button- 0:04:11.779,0:04:15.290 T.D.: Press the button if you have. Yes. Great. 0:04:15.290,0:04:19.810 So, if you read this blog post, the way 0:04:19.810,0:04:22.810 it opens is, Our average response time for Basecamp 0:04:22.810,0:04:26.770 right now is 87ms... That sounds fantastic. And it 0:04:26.770,0:04:29.950 easily leads you to believe that all is well 0:04:29.950,0:04:31.680 and that we wouldn't need to spend any more 0:04:31.680,0:04:34.150 time optimizing performance. 0:04:34.150,0:04:38.840 But that's actually wrong. The average number is completely 0:04:38.840,0:04:42.250 skewed by tons of fast responses to feed requests 0:04:42.250,0:04:46.169 and other cached replies. If you have 1000 requests 0:04:46.169,0:04:49.229 that return in 5ms, and then you can have 0:04:49.229,0:04:53.560 200 requests taking 2000ms, or two seconds, you can 0:04:53.560,0:04:57.509 still report an av- a respectable 170ms of average. 0:04:57.509,0:04:59.819 That's useless. 0:04:59.819,0:05:02.520 So what does DHH say that we need? DHH 0:05:02.520,0:05:06.569 says the solution is histograms.
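To make the arithmetic in that quote concrete, here is a small sketch using the same counts (1000 requests at 5 ms and 200 at 2000 ms). The variable names are illustrative only. With exactly these counts the mean works out to 337.5 ms rather than the 170 ms quoted, but the point holds either way: the single number sits far from both groups of requests and describes almost none of them.

```python
from statistics import mean, median

# Response times in milliseconds, using the counts from the quote:
# 1000 fast (cached) responses and 200 slow ones.
fast = [5] * 1000
slow = [2000] * 200
response_times_ms = fast + slow

average_ms = mean(response_times_ms)
median_ms = median(response_times_ms)

# Share of requests that took at least one second.
slow_share = sum(1 for t in response_times_ms if t >= 1000) / len(response_times_ms)

print(f"mean:   {average_ms} ms")
print(f"median: {median_ms} ms")
print(f"share of requests over 1 s: {slow_share:.1%}")
```

The median reports the fast group and the mean reports neither group, while one request in six takes two full seconds.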
So, for those of 0:05:06.569,0:05:09.009 you like me who were sleeping through your statistics 0:05:09.009,0:05:12.410 class in high school, and college, a brief primer 0:05:12.410,0:05:15.310 on histograms. So a histogram is very simple. Basically, 0:05:15.310,0:05:17.699 you have a, you have a series of numbers 0:05:17.699,0:05:22.389 along some axis, and every time you, you're in 0:05:22.389,0:05:24.360 that number, you're in that bucket, you basically increment 0:05:24.360,0:05:25.280 that bar by one. 0:05:25.280,0:05:27.669 So, this is an example of a histogram of 0:05:27.669,0:05:30.620 response times in a Rails application. So you can 0:05:30.620,0:05:31.979 see that there's a big cluster in the middle 0:05:31.979,0:05:35.900 around 488ms, 500ms. This isn't a super speedy app 0:05:35.900,0:05:38.740 but it's not the worst thing in the world. 0:05:38.740,0:05:39.520 And they're all clustered, and then as you kind 0:05:39.520,0:05:40.810 of move to the right you can see that 0:05:40.810,0:05:42.169 the response times get longer and longer and longer, 0:05:42.169,0:05:43.990 and as you move to the left, response times 0:05:43.990,0:05:45.720 get shorter and shorter and shorter. 0:05:45.720,0:05:47.500 So, why do you want a histogram? What's the, 0:05:47.500,0:05:48.599 what's the most important thing about a histogram? 0:05:48.599,0:05:52.550 Y.K.: Well, I think it's because most requests don't 0:05:52.550,0:05:53.229 actually look like this. 0:05:53.229,0:05:53.509 T.D.: Yes. 0:05:53.509,0:05:54.759 Y.K.: Most end points don't actually look like this. 0:05:54.759,0:05:56.419 T.D.: Right. If you think about what your Rails 0:05:56.419,0:05:58.610 app is doing, it's a complicated beast, right. Turns 0:05:58.610,0:06:02.330 out, Ruby frankly, you can, you can do branching 0:06:02.330,0:06:04.360 logic. You can do a lot of things.
0:06:04.360,0:06:06.150 And so what that means is that one end 0:06:06.150,0:06:09.460 point, if you represent that with a single number, 0:06:09.460,0:06:11.650 you are losing a lot of fidelity, to the 0:06:11.650,0:06:15.189 point where it becomes, as DHH said, useless. So, 0:06:15.189,0:06:17.729 for example, in a histogram, you can easily see, 0:06:17.729,0:06:19.810 oh, here's a group of requests and response times 0:06:19.810,0:06:22.379 where I'm hitting the cache, and here's another group 0:06:22.379,0:06:24.169 where I'm missing it. And you can see that 0:06:24.169,0:06:27.889 that cluster is significantly slower than the faster cache-hitting 0:06:27.889,0:06:28.439 cluster. 0:06:28.439,0:06:30.849 And the other thing that you get when you 0:06:30.849,0:06:32.800 have a, a distribution, when you keep the whole 0:06:32.800,0:06:35.370 distribution in the histogram, is you can look at 0:06:35.370,0:06:39.470 this number at the 95th percentile, right. So the 0:06:39.470,0:06:41.639 right, the way to think about the performance of 0:06:41.639,0:06:46.639 your web application is not the average, because the 0:06:46.639,0:06:50.159 average doesn't really tell you anything. You want to 0:06:50.159,0:06:52.220 think about the 95th percentile, because that's not the 0:06:52.220,0:06:55.379 average response time, that's the average worst response time 0:06:55.379,0:06:57.990 that a user is likely to hit.
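The bucketing and the 95th percentile described above can be sketched in a few lines. This is an illustration only: the sample data, the 100 ms bucket width, and the nearest-rank percentile definition are assumptions, not how Skylight computes its numbers.

```python
import math
from collections import Counter

def histogram(times_ms, bucket_ms=100):
    """Count how many response times fall into each bucket_ms-wide bucket."""
    counts = Counter((t // bucket_ms) * bucket_ms for t in times_ms)
    return dict(sorted(counts.items()))

def percentile(times_ms, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(times_ms)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]

# Hypothetical endpoint: 90 cache hits around 40 ms, 10 cache misses around 900 ms.
times_ms = [40] * 90 + [900] * 10

for bucket_start, count in histogram(times_ms).items():
    print(f"{bucket_start:>5}-{bucket_start + 99} ms | {'#' * count}")

print("mean:", sum(times_ms) / len(times_ms), "ms")
print("p95: ", percentile(times_ms, 95), "ms")
```

The two bars make the cache-hit and cache-miss clusters visible at a glance, and the 95th percentile lands in the slow cluster even though the mean suggests a fast endpoint.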
And you need 0:07:10.300,0:07:15.020 to look at the 95th percentile, because otherwise[br]every 0:07:15.020,0:07:17.219 request is basically you rolling the dice[br]that they're 0:07:17.219,0:07:18.610 not gonna hit one of those two second, three 0:07:18.610,0:07:21.400 second, four second responses, close the tab[br]and go 0:07:21.400,0:07:23.919 to your competitor. 0:07:23.919,0:07:25.340 So we look at this as, here's the crazy 0:07:25.340,0:07:28.439 thing. Here's what I think is crazy. That[br]blog 0:07:28.439,0:07:32.750 post that DHH wrote, it's from 2009. It's[br]been 0:07:32.750,0:07:35.000 five years, and there's still no tool that[br]does 0:07:35.000,0:07:36.960 what DHH was asking for. So, we, frankly,[br]we 0:07:36.960,0:07:39.060 smelled money. We were like, holy crap. 0:07:39.060,0:07:41.169 Y.K.: Yeah, why isn't that slide green? 0:07:41.169,0:07:43.229 T.D.: Yeah. It should be green and dollars.[br]I 0:07:43.229,0:07:45.270 think keynote has the dollars, the make it[br]rain 0:07:45.270,0:07:50.240 effect I should have used. So we smelled blood 0:07:50.240,0:07:53.090 in the water. We're like, this is awesome.[br]There's 0:07:53.090,0:07:56.610 only one problem that we discovered, and that[br]is, 0:07:56.610,0:07:58.330 it turns out that building this thing is actually 0:07:58.330,0:08:01.189 really, really freaky hard. Really, really[br]hard. 0:08:01.189,0:08:05.780 So, we announced the private beta at RailsConf[br]last 0:08:05.780,0:08:09.139 year. Before doing that, we spent a year of 0:08:09.139,0:08:12.789 research spiking out prototypes, building[br]prototypes, building out the 0:08:12.789,0:08:16.509 beta. We launched at RailsConf, and we realized,[br]we 0:08:16.509,0:08:18.520 made a lot of problems. We made a lot 0:08:18.520,0:08:21.909 of errors when we were building this system. 
0:08:21.909,0:08:26.270 So then, after RailsConf last year, we basically took 0:08:26.270,0:08:29.689 six months to completely rewrite the backend from the 0:08:29.689,0:08:32.450 ground up. And I think tying into your keynote, 0:08:32.450,0:08:36.280 Yehuda, we, we were like, oh. We clearly have 0:08:36.280,0:08:39.120 a bespoke problem. No one else is doing this. 0:08:39.120,0:08:42.090 So we rewrote our own custom backend. And then 0:08:42.090,0:08:43.729 we had all these problems, and we realized that 0:08:43.729,0:08:45.390 they had actually already all been solved by the 0:08:45.390,0:08:47.769 open source community. And so we benefited tremendously by 0:08:47.769,0:08:48.430 having a shared solution. 0:08:48.430,0:08:50.709 Y.K.: Yeah. So our first release of this was 0:08:50.709,0:08:55.279 really very bespoke, and the current release uses a 0:08:55.279,0:08:59.540 tremendous amount of very off-the-shelf open source projects that 0:08:59.540,0:09:04.550 just solved the particular problem very effectively, very well. 0:09:04.550,0:09:05.779 None of which are as easy to use as 0:09:05.779,0:09:09.029 Rails, but all of which solve really thorny problems 0:09:09.029,0:09:10.230 very effectively. 0:09:10.230,0:09:12.870 T.D.: So, so let's just talk, just for your 0:09:12.870,0:09:15.950 own understanding, let's talk about how most performance monitoring 0:09:15.950,0:09:17.670 tools work. So the way that most of these 0:09:17.670,0:09:19.930 work is that you run your Rails app, and 0:09:19.930,0:09:22.250 running inside of your Rails app is some gem, 0:09:22.250,0:09:25.000 some agent that you install. And every time the 0:09:25.000,0:09:28.560 Rails app handles a request, it generates events, and 0:09:28.560,0:09:32.500 those events, which include information about performance data, those 0:09:32.500,0:09:34.930 events are passed into the agent.
0:09:34.930,0:09:37.630 And then the agent sends that data to some 0:09:37.630,0:09:40.930 kind of centralized server. Now, it turns out that 0:09:40.930,0:09:44.139 doing a running average is actually really simple. Which 0:09:44.139,0:09:46.550 is why everyone does it. Basically you can do 0:09:46.550,0:09:48.050 it in a single SQL query, right. All you 0:09:48.050,0:09:50.310 do is you have three columns in database. The 0:09:50.310,0:09:52.690 end point, the running average, and the number of 0:09:52.690,0:09:55.769 requests, and then so, you can, those are the 0:09:55.769,0:09:57.170 two things that you need to keep a running 0:09:57.170,0:09:57.310 average, right. 0:09:57.310,0:09:58.750 So keeping a running average is actually really simple 0:09:58.750,0:10:00.980 from a technical point of view. 0:10:00.980,0:10:03.800 Y.K.: I don't think you could even do it in JavaScript, due 0:10:03.800,0:10:04.600 to the lack of integers. 0:10:04.600,0:10:06.050 T.D.: Yes. You probably wouldn't want to do any 0:10:06.050,0:10:07.529 math in JavaScript, it turns out. So, so we 0:10:07.529,0:10:10.100 took a little bit different approach. Yehuda, do you 0:10:10.100,0:10:12.070 want to go over the next section? 0:10:12.070,0:10:15.089 Y.K.: Yeah. Sure. So, when we first started, right 0:10:15.089,0:10:17.790 at the beginning, we basically did a similar thing 0:10:17.790,0:10:19.790 where we had a bunch - your app creates 0:10:19.790,0:10:22.620 events. Most of those start off as being ActiveSupport::Notifications, 0:10:22.620,0:10:25.980 although it turns out that there's very limited use 0:10:25.980,0:10:28.490 of ActiveSupport::Notifications so we had to do some normalization 0:10:28.490,0:10:30.360 work to get them sane, which we're gonna be 0:10:30.360,0:10:32.630 upstreaming back into, into Rails.
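The running average Tom describes needs only the end point, the current average, and the request count. A minimal sketch, assuming an in-memory dictionary in place of the database table; the names are hypothetical and are not taken from any real monitoring tool's schema.

```python
# endpoint -> (running average in ms, number of requests seen so far)
stats = {}

def record(endpoint, duration_ms):
    """Fold one new response time into the endpoint's running average."""
    average, count = stats.get(endpoint, (0.0, 0))
    count += 1
    # Incremental mean: shift the old average toward the new sample by 1/count.
    average += (duration_ms - average) / count
    stats[endpoint] = (average, count)

for duration in (10, 20, 30, 2000):
    record("PostsController#index", duration)

print(stats["PostsController#index"])
```

Each update overwrites the previous state, so nothing about the shape of the distribution survives. That is what makes this approach cheap, and also why it cannot answer percentile questions later.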
0:10:32.630,0:10:35.320 But, one thing that's kind of unfortunate about having 0:10:35.320,0:10:37.029 every single Rails app have an agent is that 0:10:37.029,0:10:38.649 you end up having to do a lot of 0:10:38.649,0:10:40.310 the same kind of work over and over again, 0:10:40.310,0:10:42.180 and use up a lot of memory. So, for 0:10:42.180,0:10:44.220 example, every one of these things is making HTTP 0:10:44.220,0:10:46.380 requests. So now you have a queue of things 0:10:46.380,0:10:48.810 that you're sending over HTTP in every single one 0:10:48.810,0:10:50.510 of your Rails processes. And, of course, you probably 0:10:50.510,0:10:52.250 don't notice this. People are used to Rails taking 0:10:52.250,0:10:54.089 up hundreds and hundreds of megabytes, so you probably 0:10:54.089,0:10:55.790 don't notice if you install some agent and it 0:10:55.790,0:10:59.449 suddenly starts taking twenty, thirty, forty, fifty more megabytes. 0:10:59.449,0:11:01.510 But we really wanted to keep the actual memory 0:11:01.510,0:11:04.649 per process down to a small amount. So one 0:11:04.649,0:11:06.170 of the very first things that we did, we 0:11:06.170,0:11:07.910 even did it before last year, is that we 0:11:07.910,0:11:09.800 pulled out all that shared logic into a, a 0:11:09.800,0:11:13.420 separate process called the coordinator. And the agent is 0:11:13.420,0:11:16.940 basically responsible simply for collecting the, the trace, and 0:11:16.940,0:11:18.899 it's not responsible for actually talking to our server 0:11:18.899,0:11:20.709 at all. And that means that the coordinator only 0:11:20.709,0:11:22.720 has to do this queue, this keeping a st- 0:11:22.720,0:11:25.550 a bunch of stuff of work in one place, 0:11:25.550,0:11:28.500 and it doesn't end up using up as much 0:11:28.500,0:11:28.839 memory. 0:11:28.839,0:11:31.019 And I think this, this ended up being very 0:11:31.019,0:11:31.940 effective for us.
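The split between a thin agent and a shared coordinator can be sketched as follows. This is a single-process illustration under stated assumptions: the class names, the batch size, and the in-memory `sent_batches` list are all hypothetical, and in the setup described above the coordinator is a separate operating-system process rather than an object in the app.

```python
from collections import deque

class Coordinator:
    """Stand-in for the single shared process that owns queueing and upload."""

    def __init__(self, batch_size=3):
        self.batch_size = batch_size
        self.pending = deque()
        self.sent_batches = []  # placeholder for "sent to the server"

    def receive(self, trace):
        self.pending.append(trace)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.pending:
            self.sent_batches.append(list(self.pending))
            self.pending.clear()

class Agent:
    """Stand-in for the in-app agent: it records a trace and hands it off, nothing more."""

    def __init__(self, coordinator):
        self.coordinator = coordinator

    def on_request_finished(self, endpoint, duration_ms):
        self.coordinator.receive({"endpoint": endpoint, "duration_ms": duration_ms})

coordinator = Coordinator(batch_size=3)
# Several app processes would each have their own agent but share one coordinator.
agents = [Agent(coordinator) for _ in range(2)]
for i in range(7):
    agents[i % 2].on_request_finished("UsersController#show", 40 + i)
coordinator.flush()

print([len(batch) for batch in coordinator.sent_batches])
```

The queue and the upload logic exist once, in the coordinator, so adding more app processes adds only the small agent to each of them.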
0:11:31.940,0:11:33.490 T.D.: And I think that low overhead also allows 0:11:33.490,0:11:36.350 us to just collect more information, in general. 0:11:36.350,0:11:37.060 Y.K.: Yeah. 0:11:37.060,0:11:39.880 Now, after our first attempt, we started getting a 0:11:39.880,0:11:42.079 bunch of customers that were telling us that even 0:11:42.079,0:11:43.920 the separate - so the separate coordinator, started as 0:11:43.920,0:11:45.260 a good thing and a bad thing. On the 0:11:45.260,0:11:47.399 one hand, there's only one of them, so it 0:11:47.399,0:11:49.529 uses up only one set of memory. On the 0:11:49.529,0:11:51.220 other hand, it's really easy for someone to go 0:11:51.220,0:11:52.839 in and ps that process and see how many 0:11:52.839,0:11:54.260 megabytes of memory it's using. 0:11:54.260,0:11:56.560 So, we got a lot of additional complaints that 0:11:56.560,0:11:58.100 said oh, your process is using a lot of 0:11:58.100,0:12:00.990 memory. And, I spent a few weeks, I, I 0:12:00.990,0:12:03.160 know Ruby pretty well. I spent a couple of 0:12:03.160,0:12:05.670 weeks. I actually wrote a gem called Allocation Counter 0:12:05.670,0:12:07.550 that basically went in to try to pin point 0:12:07.550,0:12:09.490 exactly where the allocations were hap- coming from. But 0:12:09.490,0:12:11.800 it turns out that it's actually really, really hard 0:12:11.800,0:12:14.019 to track down exactly where allocations are coming from 0:12:14.019,0:12:15.170 in Ruby, because something as simple as using a 0:12:15.170,0:12:18.630 regular expression in Ruby can allocate match objects that 0:12:18.630,0:12:19.410 get put back on the stack. 0:12:19.410,0:12:21.449 And so I was able to pare this down 0:12:21.449,0:12:24.220 to some degree. But I really discovered quickly that, 0:12:24.220,0:12:26.980 trying to keep a lid on the memory allocation 0:12:26.980,0:12:29.940 by doing all the stuff in Ruby, is mostly 0:12:29.940,0:12:31.579 fine.
But for our specific use case where we 0:12:31.579,0:12:33.100 really wanna, we wanna be telling you, you can 0:12:33.100,0:12:34.860 run the agent on your process, on your box, 0:12:34.860,0:12:36.870 and it's not gonna use a lot of memory. 0:12:36.870,0:12:40.079 We really needed something more efficient, and our first 0:12:40.079,0:12:42.790 thought was, we'll use C++ or C. No problem. 0:12:42.790,0:12:45.220 C is, is native. It's great. And Carl did 0:12:45.220,0:12:48.120 the work. Carl is very smart. And then he 0:12:48.120,0:12:49.509 said, Yehuda. It is now your turn. You need 0:12:49.509,0:12:51.250 to start maintaining this. And I said, I don't 0:12:51.250,0:12:53.620 trust myself to write C++ code that's running in 0:12:53.620,0:12:56.029 all of your guys's boxes, and not seg-fault. So 0:12:56.029,0:12:59.649 I don't think that, that doesn't work for me. 0:12:59.649,0:13:01.790 And so I, I noticed that Rust was coming 0:13:01.790,0:13:03.630 along, and what Rust really gives you is it 0:13:03.630,0:13:05.899 gives you the ability to write low-level code a 0:13:05.899,0:13:08.529 la C or C++ with manual memory management, that 0:13:08.529,0:13:11.790 keeps your memory allocation low and keeps things speedy. 0:13:11.790,0:13:14.930 Low resource utilization. While also giving you compile time 0:13:14.930,0:13:17.949 guarantees about not seg-faulting. So, again, if your processes 0:13:17.949,0:13:20.320 randomly started seg-faulting because you installed the agent, I 0:13:20.320,0:13:21.949 think you would stop being our customer very quickly. 0:13:21.949,0:13:24.680 So having what, pretty much 100% guarantees about that 0:13:24.680,0:13:26.600 was very important to us. And so that's why 0:13:26.600,0:13:28.420 we decided to use Rust. 0:13:28.420,0:13:29.880 I'll just keep going. 0:13:29.880,0:13:30.970 T.D.: Keep going.
0:13:30.970,0:13:32.949 Y.K.: So, we had this coordinator object. And basically 0:13:32.949,0:13:36.149 the coordinator object is receiving events. So the events 0:13:36.149,0:13:39.750 basically end up being these traces that describe what's 0:13:39.750,0:13:42.050 happening in your application. And the next thing, I 0:13:42.050,0:13:44.420 think our initial work on this we used JSON 0:13:44.420,0:13:46.160 just to send the payload to the server, 0:13:46.160,0:13:47.949 but we noticed that a lot of people have 0:13:47.949,0:13:49.889 really big requests. So you may have a big 0:13:49.889,0:13:51.519 request with a big SQL query in it, or 0:13:51.519,0:13:53.320 a lot of big SQL queries in it. Some 0:13:53.320,0:13:55.279 people have traces that are hundreds and hundreds of 0:13:55.279,0:13:57.500 nodes long. And so we really wanted to figure 0:13:57.500,0:14:00.100 out how to shrink down the payload size to 0:14:00.100,0:14:02.569 something that we could be, you know, pumping out 0:14:02.569,0:14:04.759 of your box on a regular basis without tearing 0:14:04.759,0:14:06.850 up your bandwidth costs. 0:14:06.850,0:14:09.009 So, one of the first things that we did 0:14:09.009,0:14:11.069 early on was we switched to using protobuf as the 0:14:11.069,0:14:13.550 transport mechanism, and that really shrunk, shrunk down the 0:14:13.550,0:14:17.250 payloads a lot. Our earlier prototypes for actually collecting 0:14:17.250,0:14:19.490 the data were written in Ruby, but I think 0:14:19.490,0:14:21.370 Carl did, like, a weekend hack to just port 0:14:21.370,0:14:24.180 it over to Java and got, like, 200x performance. 0:14:24.180,0:14:26.319 And you don't always get 200x performance, if mostly 0:14:26.319,0:14:27.970 what you're doing is database queries, you're not gonna 0:14:27.970,0:14:29.139 get a huge performance swing. 0:14:29.139,0:14:31.889 But mostly what we're doing is math. And algorithms 0:14:31.889,0:14:34.209 and data structures.
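A rough sense of why a binary format shrinks trace payloads can be had with the standard library alone. The sketch below is not protobuf: it uses fixed-width `struct` packing as a stand-in, and the trace node fields are invented for the illustration. Protobuf's actual wire format uses field tags and variable-length integers, so real sizes will differ.

```python
import json
import struct

# A hypothetical trace node: (node id, parent id, start offset, duration),
# with both times in microseconds.
nodes = [(i, max(i - 1, 0), i * 1500, 1200) for i in range(200)]

# Text encoding: every field name is repeated for every node.
as_json = json.dumps(
    [{"id": n, "parent": p, "start_us": s, "duration_us": d} for n, p, s, d in nodes]
).encode("utf-8")

# Fixed-width binary encoding: four unsigned 32-bit little-endian integers per node.
as_binary = b"".join(struct.pack("<IIII", *node) for node in nodes)

print(len(as_json), "bytes as JSON")
print(len(as_binary), "bytes as packed binary")
```

Most of the JSON payload is repeated field names and punctuation. A schema-based binary format moves that information into the schema, which matters when traces run to hundreds of nodes.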
And for that, Ruby is, it 0:14:34.209,0:14:36.310 could, in theory, one day, have a good JIT 0:14:36.310,0:14:38.569 or something, but today, writing that code in Java 0:14:38.569,0:14:40.949 didn't end up being significantly more code cause it's 0:14:40.949,0:14:42.540 just, you know, algorithms and data structures. 0:14:42.540,0:14:44.420 T.D.: And I'll just note something about standardizing on 0:14:44.420,0:14:46.779 protobufs in our, in our stack, is actually a 0:14:46.779,0:14:52.170 huge win, because we, we realized, hey, browsers, as 0:14:52.170,0:14:53.350 it turns out are pretty powerful these days. They've 0:14:53.350,0:14:54.490 got, you know, they can allocate memory, they can 0:14:54.490,0:14:56.069 do all these types of computation. So, and protobuf 0:14:56.069,0:14:59.410 libraries exist everywhere. So we save ourselves a lot 0:14:59.410,0:15:01.589 of computation and a lot of time by just 0:15:01.589,0:15:04.259 treating protobuf as the canonical serialization form, and then 0:15:04.259,0:15:06.190 you can move payloads around the entire stack and 0:15:06.190,0:15:08.560 everything speaks the same language, so you've saved the 0:15:08.560,0:15:09.300 serialization and deserialization. 0:15:09.300,0:15:11.990 Y.K.: And JavaScript is actually surprisingly effective at des- 0:15:11.990,0:15:13.870 at taking protobufs and converting them to the format 0:15:13.870,0:15:18.190 that we need efficiently. So, so we basically take 0:15:18.190,0:15:21.029 this data.
The Java collector is basically collecting all 0:15:21.029,0:15:23.550 these protobufs, and pretty much it just turns around, 0:15:23.550,0:15:24.940 and this is sort of where we got into 0:15:24.940,0:15:28.149 bespoke territory before we started rolling our own, but 0:15:28.149,0:15:30.930 we realized that when you write a big, distributed, 0:15:30.930,0:15:33.130 fault-tolerant system, there's a lot of problems that you 0:15:33.130,0:15:35.319 really just want someone else to have thought about. 0:15:35.319,0:15:37.750 So, what we do is we basically take these, 0:15:37.750,0:15:39.600 take these payloads that are coming in. We convert 0:15:39.600,0:15:41.709 them into batches and we send the batches down 0:15:41.709,0:15:45.259 into the Kafka queue. And the, the next thing 0:15:45.259,0:15:47.680 that happens, so the Kafka, sorry, Kafka's basically just 0:15:47.680,0:15:49.910 a queue that allows you to throw things into, 0:15:49.910,0:15:53.019 I guess, it might be considered similar to like, 0:15:53.019,0:15:55.430 something like AMQP. It has some nice fault-tolerance properties 0:15:55.430,0:15:57.949 and integrates well with Storm. But most important it's 0:15:57.949,0:15:59.670 just super, super high throughput. 0:15:59.670,0:16:01.940 So basically didn't want to put any barrier between 0:16:01.940,0:16:03.560 you giving us the data and us getting it 0:16:03.560,0:16:04.480 to disk as soon as possible. 0:16:04.480,0:16:05.870 T.D.: Yeah. Which we'll, I think, talk about in 0:16:05.870,0:16:06.180 a bit. 0:16:06.180,0:16:08.540 Y.K.: So we, so the basic Kafka takes the 0:16:08.540,0:16:11.410 data and starts sending it into Storm. And if 0:16:11.410,0:16:13.009 you think about what has to happen in order 0:16:13.009,0:16:15.089 to get some request. So, you have these requests.
0:16:15.089,0:16:18.149 There's, you know, maybe traces that have a bunch 0:16:18.149,0:16:19.670 of SQL queries, and our job is basically to 0:16:19.670,0:16:21.459 take all those SQL queries and say, OK, I 0:16:21.459,0:16:22.560 can see that in all of your requests, you 0:16:22.560,0:16:24.040 had the SQL query and it took around this 0:16:24.040,0:16:25.449 amount of time and it happened as a child 0:16:25.449,0:16:27.470 of this other node. And the way to think 0:16:27.470,0:16:29.740 about that is basically just a processing pipeline. Right. 0:16:29.740,0:16:31.480 So you have these traces that come in one 0:16:31.480,0:16:33.480 side. You start passing them through a bunch of 0:16:33.480,0:16:34.800 processing steps, and then you end up on the 0:16:34.800,0:16:36.670 other side with the data. 0:16:36.670,0:16:38.879 And Storm is actually a way of describing that 0:16:38.879,0:16:41.930 processing pipeline in sort of functional style, and then 0:16:41.930,0:16:43.740 you tell it, OK. Here's how many servers I 0:16:43.740,0:16:46.839 need. Here's how, here's how I'm gonna handle failures. 0:16:46.839,0:16:50.000 And it basically deals with distribution and scaling and 0:16:50.000,0:16:52.379 all that stuff for you. And part of that 0:16:52.379,0:16:55.379 is because you wrote everything using functional style. 0:16:55.379,0:16:57.110 And so what happens is Kafka sends the data 0:16:57.110,0:17:00.550 into the entry spout, which is sort of terminology 0:17:00.550,0:17:04.040 in, terminology in Storm for these streams that get 0:17:04.040,0:17:06.930 created. And they basically go into these processing things, 0:17:06.930,0:17:09.839 which very clever- cutely are called bolts. This is 0:17:09.839,0:17:12.980 definitely not the naming I would have used, but. 0:17:12.980,0:17:15.130 So they're called bolts. And the idea is that 0:17:15.130,0:17:16.970 basically every request may have several things.
0:17:16.970,0:17:20.089 So, for example, we now automatically detect n + 0:17:20.089,0:17:22.220 1 queries and that's sort of a different kind 0:17:22.220,0:17:25.059 of processing from just, make a picture of the 0:17:25.059,0:17:26.980 entire request. Or what is the 95th percentile across 0:17:26.980,0:17:29.090 your entire app, right. These are all different kinds 0:17:29.090,0:17:30.940 of processing. So we take the data and we 0:17:30.940,0:17:33.580 send them into a bunch of bolts, and the 0:17:33.580,0:17:35.750 cool thing about bolts is that, again, because they're 0:17:35.750,0:17:38.890 just functional chaining, you can take the output from 0:17:38.890,0:17:41.130 one bolt and feed it into another bolt. And 0:17:41.130,0:17:43.510 that works, that works pretty well. And, and you 0:17:43.510,0:17:44.730 don't have to worry about - I mean, you 0:17:44.730,0:17:46.600 have to worry a little bit about things like 0:17:46.600,0:17:49.960 fault tolerance, failure, idempotence. But you worry about 0:17:49.960,0:17:52.230 them at, at the abstraction level, and then the 0:17:52.230,0:17:54.159 operational part is handled for you. 0:17:54.159,0:17:55.740 T.D.: So it's just like a very declarative way 0:17:55.740,0:17:58.179 of describing how this computation works in, in a 0:17:58.179,0:17:59.230 way that's easy to scale. 0:17:59.230,0:18:01.860 Y.K.: And Carl actually talked about this at very 0:18:01.860,0:18:04.909 high speed yesterday, and you, some of you may 0:18:04.909,0:18:06.620 have been there. I would recommend watching the video 0:18:06.620,0:18:09.020 when it comes out if you want to make 0:18:09.020,0:18:11.159 use of this stuff in your own applications. 0:18:11.159,0:18:13.250 And then when you're finally done with all the 0:18:13.250,0:18:14.970 processing, you need to actually do something with it.
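The idea of feeding one bolt's output into the next can be sketched with plain Python generators. This is not Storm's API, and the repeated-query check is a deliberately naive stand-in for n + 1 detection; the function names, trace shape, and threshold are all assumptions made for the illustration.

```python
from collections import Counter

def parse(traces):
    """First step: pull the SQL statements out of each raw trace."""
    for trace in traces:
        yield trace["endpoint"], trace["queries"]

def detect_repeated_queries(parsed, threshold=3):
    """Second step: flag requests that run the same query many times (an n + 1 smell)."""
    for endpoint, queries in parsed:
        counts = Counter(queries)
        repeated = sorted(q for q, n in counts.items() if n >= threshold)
        yield endpoint, repeated

def keep_flagged(results):
    """Third step: pass along only the requests that were flagged."""
    for endpoint, repeated in results:
        if repeated:
            yield endpoint, repeated

# Stand-in for the stream of traces arriving from the queue.
traces = [
    {"endpoint": "PostsController#index",
     "queries": ["SELECT posts"] + ["SELECT comments WHERE post_id = ?"] * 5},
    {"endpoint": "UsersController#show", "queries": ["SELECT users WHERE id = ?"]},
]

# Each step consumes the previous step's output, like one bolt feeding another.
flagged = list(keep_flagged(detect_repeated_queries(parse(traces))))
print(flagged)
```

Because each step only reads its input and yields new values, steps can be reordered, reused, or run on different machines without changing their logic. That property is what lets a framework take over distribution and retries.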
0:18:14.970,0:18:16.289 You need to put it somewhere so that the 0:18:16.289,0:18:18.360 web app can get access to it, and that 0:18:18.360,0:18:21.350 is basically, we use Cassandra for this. And[br]Cassandra 0:18:21.350,0:18:24.929 again is mostly, it's a dumb database, but[br]it 0:18:24.929,0:18:27.780 has, it's, has high capacity. It has some[br]of 0:18:27.780,0:18:29.080 the fault-tolerance capacities that we want. 0:18:29.080,0:18:30.770 T.D.: We're very, we're just very, very write-heavy,[br]right. 0:18:30.770,0:18:32.820 Like, we tend to be writing more than we're 0:18:32.820,0:18:33.730 ever reading. 0:18:33.730,0:18:36.270 Y.K.: Yup. And then when we're done, when[br]we're 0:18:36.270,0:18:38.780 done with a particular batch, Cassandra basically[br]kicks off 0:18:38.780,0:18:40.720 the process over again. So we're basically[br]doing these 0:18:40.720,0:18:41.200 things as batches. 0:18:41.200,0:18:42.919 T.D.: So these are, these are roll-ups, is[br]what's 0:18:42.919,0:18:45.360 happening here. So basically every minute,[br]every ten minutes, 0:18:45.360,0:18:48.140 and then at every hour, we reprocess and we 0:18:48.140,0:18:49.970 re-aggregate, so that when you query us we[br]know 0:18:49.970,0:18:51.010 exactly what to give you. 0:18:51.010,0:18:52.830 Y.K.: Yup. So we sort of have this cycle 0:18:52.830,0:18:55.049 where we start off, obviously, in the first[br]five 0:18:55.049,0:18:56.890 seconds, the first minute, you really want[br]high granularity. 0:18:56.890,0:18:59.140 You want to see what's happening right now.[br]But, 0:18:59.140,0:19:00.110 if you want to go back and look at 0:19:00.110,0:19:02.460 data from three months ago, you probably care[br]about 0:19:02.460,0:19:04.830 it, like the day granularity or maybe the[br]hour 0:19:04.830,0:19:07.490 granularity. So, we basically do these roll-ups[br]and we 0:19:07.490,0:19:09.200 cycle through the process.
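The roll-up cycle described above can be sketched in a few lines. This is a minimal illustration, assuming each bucket holds only a request count and a total duration; the function name `roll_up` and the bucket layout are hypothetical and are not taken from Skylight.

```python
from collections import defaultdict
from typing import Dict, Tuple

# Maps a bucket start time (epoch seconds) to (request_count, total_duration_ms).
Buckets = Dict[int, Tuple[int, float]]


def roll_up(buckets: Buckets, width_seconds: int) -> Buckets:
    """Merge fine-grained buckets into coarser buckets of width_seconds."""
    coarse = defaultdict(lambda: [0, 0.0])
    for start, (count, total_ms) in buckets.items():
        # Round the start time down to the beginning of the coarser bucket.
        coarse_start = start - (start % width_seconds)
        coarse[coarse_start][0] += count
        coarse[coarse_start][1] += total_ms
    return {start: (count, total_ms) for start, (count, total_ms) in coarse.items()}
```

Counts and totals can be summed like this at every level, so an hourly bucket can be built from ten-minute buckets without touching the raw traces. A percentile cannot be rolled up by simple addition, so a real system would need to keep something mergeable, such as a histogram, for that.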
0:19:09.200,0:19:11.880 T.D.: So this, it turns out, building the[br]system 0:19:11.880,0:19:15.169 required an intense amount of work. Carl spent[br]probably 0:19:15.169,0:19:17.490 six months reading PhD thesises to find- 0:19:17.490,0:19:18.140 Y.K.: Theses. 0:19:18.140,0:19:23.850 T.D.: Theses. To find, to find data structures[br]and 0:19:23.850,0:19:25.870 algorithms that we could use. Because this[br]is a 0:19:25.870,0:19:28.340 huge amount of data. Like, I think even a 0:19:28.340,0:19:31.049 few months after we were in private data,[br]private 0:19:31.049,0:19:34.179 beta, we were already handling over a billion[br]requests 0:19:34.179,0:19:36.200 per month. And obviously there's no way that[br]we- 0:19:36.200,0:19:37.970 Y.K.: Basically the number of requests that[br]we handle 0:19:37.970,0:19:39.770 is the sum of all of the requests that 0:19:39.770,0:19:40.010 you handle. 0:19:40.010,0:19:40.110 T.D.: Right. 0:19:40.110,0:19:41.200 Y.K.: And all of our customers handle. 0:19:41.200,0:19:41.909 T.D.: Right. Right. So. 0:19:41.909,0:19:43.120 Y.K.: So that's a lot of requests. 0:19:43.120,0:19:45.760 T.D.: So obviously we can't provide a service,[br]at 0:19:45.760,0:19:48.480 least one that's not, we can't provide an[br]affordable 0:19:48.480,0:19:51.130 service, an accessible service, if we have[br]to store 0:19:51.130,0:19:53.190 terabytes or exabytes of data just to tell[br]you 0:19:53.190,0:19:53.990 how your app is running. 0:19:53.990,0:19:56.630 Y.K.: And I think, also a problem, it's problematic 0:19:56.630,0:19:58.429 if you store all the data in a database 0:19:58.429,0:19:59.760 and then every single time someone wants to[br]learn 0:19:59.760,0:20:01.590 something about that, you have to do a query. 0:20:01.590,0:20:03.159 Those queries can take a very long time. They 0:20:03.159,0:20:04.700 can take minutes.
And I think we really wanted 0:20:04.700,0:20:07.159 to have something that would be very, that[br]would, 0:20:07.159,0:20:09.580 where the feedback loop would be fast. So[br]we 0:20:09.580,0:20:11.860 wanted to find algorithms that let us handle[br]the 0:20:11.860,0:20:13.940 data at, at real time, and then provide it 0:20:13.940,0:20:16.020 to you at real time instead of these, like, 0:20:16.020,0:20:18.309 dump the data somewhere and then do these[br]complicated 0:20:18.309,0:20:18.549 queries. 0:20:18.549,0:20:20.820 T.D.: So, hold on. So this slide was not 0:20:20.820,0:20:24.440 supposed to be here. It was supposed to be 0:20:24.440,0:20:27.669 a Rails slide. So, whoa. I went too far. 0:20:27.669,0:20:29.970 K. We'll watch that again. That's pretty cool.[br]So 0:20:29.970,0:20:32.460 then the last thing I want to say is, 0:20:32.460,0:20:34.330 perhaps your take away from looking at this[br]architecture 0:20:34.330,0:20:36.750 diagram is, oh my gosh, these Rails guys completely- 0:20:36.750,0:20:38.309 Y.K.: They really jumped the shark. 0:20:38.309,0:20:40.870 T.D.: They jumped the shark. They ditched[br]Rails. I 0:20:40.870,0:20:42.330 saw, like, three Tweets yesterday - I wasn't[br]here, 0:20:42.330,0:20:43.450 I was in Portland yesterday, but I saw, like, 0:20:43.450,0:20:44.340 three Tweets that were like, I'm at RailsConf[br]and 0:20:44.340,0:20:48.809 I haven't seen a single talk about, like,[br]Rails. 0:20:48.809,0:20:51.950 So that's true here, too. But, I want to 0:20:51.950,0:20:54.940 assure you that we are only using this stack 0:20:54.940,0:20:58.070 for the heavy computation. We started in Rails.[br]We 0:20:58.070,0:21:00.730 started, we were like, hey, what do we need. 0:21:00.730,0:21:01.929 Ah, well, people probably need to authenticate[br]and log 0:21:01.929,0:21:05.090 in, and we probably need to do billing. And 0:21:05.090,0:21:06.220 those are all things that Rails is really,[br]really 0:21:06.220,0:21:08.830 good at. 
So we started with Rails as, basically, 0:21:08.830,0:21:11.110 the starting point, and then when we realized[br]oh 0:21:11.110,0:21:14.039 my gosh, computation is really slow. There's[br]no way 0:21:14.039,0:21:15.240 we're gonna be able to offer this service.[br]OK. 0:21:15.240,0:21:16.059 Now let's think about how we can do all 0:21:16.059,0:21:16.270 of that. 0:21:16.270,0:21:18.510 Y.K.: And I think notably, a lot of people 0:21:18.510,0:21:20.049 who look at Rails are like, there's a lot 0:21:20.049,0:21:21.750 of companies that have built big stuff on[br]Rails, 0:21:21.750,0:21:24.090 and their attitude is, like, oh, this legacy[br]terrible 0:21:24.090,0:21:25.409 Rails app. I really wish we could get rid 0:21:25.409,0:21:26.850 of it. If we could just write everything in 0:21:26.850,0:21:30.760 Scala or Clojure or Go, everything would be[br]amazing. 0:21:30.760,0:21:31.500 That is definitely not our attitude. Our attitude[br]is 0:21:31.500,0:21:34.320 that Rails is really amazing, at particular,[br]at the 0:21:34.320,0:21:36.740 kinds of things that are really common across[br]everyone's 0:21:36.740,0:21:39.900 web applications - authentication, billing,[br]et cetera. And we 0:21:39.900,0:21:41.429 really want to be using Rails for the parts 0:21:41.429,0:21:43.360 of our app- even things like error-tracking,[br]we do 0:21:43.360,0:21:45.039 through the Rails app. We want to be using 0:21:45.039,0:21:47.470 Rails because it's very productive at doing[br]those things. 0:21:47.470,0:21:48.789 It happens to be very slow with doing data 0:21:48.789,0:21:50.330 crunching, so we're gonna use a different[br]tool for 0:21:50.330,0:21:50.539 that. 0:21:50.539,0:21:51.909 But I don't think you'll ever see me getting 0:21:51.909,0:21:54.210 up and saying, ah, I really wish we had 0:21:54.210,0:21:55.080 just started writing, you know, the Rails[br]app in 0:21:55.080,0:21:55.159 Rust. 0:21:55.159,0:21:55.309 T.D.: Yeah.
0:21:55.309,0:21:58.090 Y.K.: That would be terrible. 0:21:58.090,0:22:02.429 T.D.: So that's number one, is, is, honest[br]response 0:22:02.429,0:22:04.390 times, which we're, which it turns out, seems[br]like 0:22:04.390,0:22:08.289 it should be easy, requires storing insane[br]amount of 0:22:08.289,0:22:09.169 data. 0:22:09.169,0:22:10.620 So the second thing that we realized when[br]we 0:22:10.620,0:22:12.059 were looking at a lot of these tools, is 0:22:12.059,0:22:14.360 that most of them focus on data. They focus 0:22:14.360,0:22:16.590 on giving you the raw data. But I'm not 0:22:16.590,0:22:19.130 a machine. I'm not a computer. I don't enjoy 0:22:19.130,0:22:21.320 sifting through data. That's what computers[br]are good for. 0:22:21.320,0:22:23.289 I would rather be drinking a beer. It's really 0:22:23.289,0:22:24.830 nice in Portland, this time of year. 0:22:24.830,0:22:27.409 So, we wanted to think about, if you're trying 0:22:27.409,0:22:31.179 to solve the performance problems in your[br]application, what 0:22:31.179,0:22:32.840 are the things that you would suss out with 0:22:32.840,0:22:35.760 the existing tools after spending, like, four[br]hours depleting 0:22:35.760,0:22:37.510 your ego to get there? 0:22:37.510,0:22:38.929 Y.K.: And I think part of this is just 0:22:38.929,0:22:42.260 people are actually very, people like to think[br]that 0:22:42.260,0:22:43.880 they're gonna use these tools, but when the[br]tools 0:22:43.880,0:22:45.320 require you to dig through a lot of data, 0:22:45.320,0:22:47.090 people just don't use them very much. So,[br]the 0:22:47.090,0:22:48.330 goal here was to build a tool that people 0:22:48.330,0:22:50.809 actually use and actually like using, and[br]not to 0:22:50.809,0:22:54.870 build a tool that happens to provide a lot 0:22:54.870,0:22:55.039 of data you can sift through. 0:22:55.039,0:22:55.059 T.D.: Yes. 
0:22:55.059,0:22:55.929 So, probably the, one of the first things[br]that 0:22:55.929,0:22:58.529 we realized is that we don't want to provide. 0:22:58.529,0:23:00.400 This is a trace of a request, you've probably 0:23:00.400,0:23:04.070 seen similar UIs using other tools, using,[br]for example, 0:23:04.070,0:23:07.059 the inspector in, in like Chrome or Safari,[br]and 0:23:07.059,0:23:08.700 this is just showing basically, it's basically[br]a visual 0:23:08.700,0:23:10.830 stack trace of where your application is spending[br]its 0:23:10.830,0:23:11.600 time. 0:23:11.600,0:23:13.950 But I think what was important for us is 0:23:13.950,0:23:17.809 showing not just a single request, because[br]your app 0:23:17.809,0:23:20.570 handles, you know, hundreds of thousands of[br]requests, or 0:23:20.570,0:23:22.679 millions of requests. So looking at a single[br]request 0:23:22.679,0:23:24.630 statistically is complete, it's just noise. 0:23:24.630,0:23:26.500 Y.K.: And it's especially bad if it's the[br]worst 0:23:26.500,0:23:28.659 request, because the worst request is, is[br]really noise. 0:23:28.659,0:23:30.850 It's like, a hiccup in the network, right. 0:23:30.850,0:23:31.250 T.D.: It's the outlier. Yeah. 0:23:31.250,0:23:32.150 Y.K.: It's literally the outlier. 0:23:32.150,0:23:35.659 T.D.: It's literally the outlier. Yup. So,[br]what we 0:23:35.659,0:23:38.700 present in Skylight is something a little[br]bit different, 0:23:38.700,0:23:41.770 and it's something that we call the aggregate[br]trace. 0:23:41.770,0:23:46.260 So the aggregate trace is basically us taking[br]all 0:23:46.260,0:23:49.559 of your requests, averaging them out where[br]each of 0:23:49.559,0:23:51.750 these things spends their time, and then showing[br]you 0:23:51.750,0:23:54.899 that. So this is basically like, this is like, 0:23:54.899,0:23:57.929 this is like the statue of David. 
It is 0:23:57.929,0:24:00.880 the idealized form of the stack trace of how 0:24:00.880,0:24:02.270 your application's behaving. 0:24:02.270,0:24:05.330 But, of course, you have the same problem[br]as 0:24:05.330,0:24:07.500 before, which is, if this is all that we 0:24:07.500,0:24:10.580 were showing you, it would be obscuring a[br]lot 0:24:10.580,0:24:12.870 of information. You want to actually be able[br]to 0:24:12.870,0:24:13.990 tell the difference between, OK, what's my[br]stack trace 0:24:13.990,0:24:16.419 look like for fast requests, and how does[br]that 0:24:16.419,0:24:18.539 differ from requests that are slower. 0:24:18.539,0:24:20.860 So what we've got, I've got a little video 0:24:20.860,0:24:22.320 here. You can see that when I move the 0:24:22.320,0:24:26.490 slider, that this trace below it is actually[br]updating 0:24:26.490,0:24:29.130 in real time. As I move the slider around, 0:24:29.130,0:24:31.770 you can see that the aggregate trace actually[br]updates 0:24:31.770,0:24:34.240 with it. And that's because we're collecting[br]all this 0:24:34.240,0:24:36.159 information. We're collecting, like I said,[br]a lot of 0:24:36.159,0:24:38.669 data. We can recompute this aggregate trace[br]on the 0:24:38.669,0:24:38.909 fly. 0:24:38.909,0:24:41.200 Basically, for each bucket, we're storing[br]a different trace, 0:24:41.200,0:24:42.880 and then on the client we're reassembling[br]that. We'll 0:24:42.880,0:24:43.899 go into that a little bit. 0:24:43.899,0:24:45.799 Y.K.: And I think it's really important that[br]you 0:24:45.799,0:24:48.370 be able to do these experiments quickly. If[br]every 0:24:48.370,0:24:50.059 time you think, oh, I wonder what happens[br]if 0:24:50.059,0:24:52.260 I add another histogram bucket, if it requires[br]a 0:24:52.260,0:24:54.830 whole full page refresh. Then that would basically[br]make 0:24:54.830,0:24:56.309 people not want to use the tool. Not able 0:24:56.309,0:24:58.580 to use the tool. 
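One way to picture "a different trace per bucket, reassembled on the client" is the sketch below. It assumes each histogram bucket stores a request count plus the summed time per span; the field names `start_ms`, `count` and `span_totals_ms` are invented for this example and do not describe Skylight's actual data format.

```python
from typing import Dict, List


def aggregate_trace(buckets: List[dict], lo_ms: float, hi_ms: float) -> Dict[str, float]:
    """Average time per span over the buckets whose start lies in [lo_ms, hi_ms)."""
    total_requests = 0
    span_totals: Dict[str, float] = {}
    for bucket in buckets:
        if lo_ms <= bucket["start_ms"] < hi_ms:
            total_requests += bucket["count"]
            for span, total_ms in bucket["span_totals_ms"].items():
                span_totals[span] = span_totals.get(span, 0.0) + total_ms
    if total_requests == 0:
        return {}
    # Dividing by the request count weights each bucket by how many requests it holds.
    return {span: total / total_requests for span, total in span_totals.items()}
```

Moving the slider only changes `lo_ms` and `hi_ms`, so the selected range can be recomputed from data that is already on the client without another round trip.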
So, actually building something[br]which 0:24:58.580,0:24:59.649 is real time and fast, gets the data as 0:24:59.649,0:25:00.110 it comes, was really important to us. 0:25:00.110,0:25:01.220 T.D.: So that's number one. 0:25:01.220,0:25:04.850 And the second thing. So we built that, and 0:25:04.850,0:25:07.929 we're like, OK, well what's next? And I think 0:25:07.929,0:25:09.250 that the big problem with this is that you 0:25:09.250,0:25:12.020 need to know that there's a problem before[br]you 0:25:12.020,0:25:14.429 go look at it, right. So we have been 0:25:14.429,0:25:16.080 working for the past few months, and the Storm 0:25:16.080,0:25:18.390 infrastructure that we built makes it pretty[br]straight-forward to 0:25:18.390,0:25:21.149 start building more abstractions on top of[br]the data 0:25:21.149,0:25:21.559 that we've already collected. 0:25:21.559,0:25:24.120 It's a very declarative system. So we've been[br]working 0:25:24.120,0:25:26.679 on a feature called inspections. And what's[br]cool about 0:25:26.679,0:25:29.279 inspections is that we can look at this tremendous 0:25:29.279,0:25:31.270 volume of data that we've collected from your[br]app, 0:25:31.270,0:25:33.840 and we can automatically tease out what the[br]problems 0:25:33.840,0:25:35.210 are. So the first one that we shipped, this 0:25:35.210,0:25:37.399 is in beta right now. It's not, it's not 0:25:37.399,0:25:39.840 out and enabled by default, but there, it's[br]behind 0:25:39.840,0:25:42.440 a feature flag that we've had some users turning 0:25:42.440,0:25:42.730 on. 0:25:42.730,0:25:44.419 And, and trying out. And so what we can 0:25:44.419,0:25:46.450 do in this case, is because we have information 0:25:46.450,0:25:48.730 about all of the database queries in your[br]app, 0:25:48.730,0:25:50.840 we can look and see if you have n 0:25:50.840,0:25:52.390 plus one queries. Can you maybe explain what[br]an 0:25:52.390,0:25:53.250 n plus one query is? 0:25:53.250,0:25:54.600 Y.K.: Yeah. 
So, I'm, people know, hopefully,[br]what n 0:25:54.600,0:25:56.770 plus one queries are. But the, it's the idea that 0:25:56.770,0:25:59.260 you, by accident, for some reason, instead[br]of making 0:25:59.260,0:26:01.970 one query, you ask for like all the posts 0:26:01.970,0:26:02.940 and then you iterated through all of them[br]and 0:26:02.940,0:26:04.940 got all the comments and now you, instead[br]of 0:26:04.940,0:26:08.679 having one query, you have one query per post, 0:26:08.679,0:26:10.309 right. And you, what you'd, what you'd like[br]to 0:26:10.309,0:26:12.549 do is do eager loading, where you say include 0:26:12.549,0:26:14.559 comments, right. But you have to know that[br]you 0:26:14.559,0:26:15.039 have to do that. 0:26:15.039,0:26:16.899 So there's some tools that will run in development 0:26:16.899,0:26:18.380 mode, if you happen to catch it, like 0:26:18.380,0:26:20.460 Bullet. This is basically a tool that's looking[br]at 0:26:20.460,0:26:22.210 every single one of your classes and has some 0:26:22.210,0:26:24.169 thresholds that, once we see that a bunch[br]of 0:26:24.169,0:26:27.429 your requests have the same exact query, so[br]we 0:26:27.429,0:26:29.549 do some work to pull out binds. So if 0:26:29.549,0:26:32.200 it's, like, where something equals one, we[br]will automatically 0:26:32.200,0:26:34.110 pull out the one and replace it with a 0:26:34.110,0:26:34.740 question mark. 0:26:34.740,0:26:36.230 And then we basically take all those queries,[br]if 0:26:36.230,0:26:39.529 they're the exact same query repeated multiple[br]times, subject 0:26:39.529,0:26:41.390 to some thresholds, we'll start showing you[br]hey, there's 0:26:41.390,0:26:42.450 an n plus one query. 0:26:42.450,0:26:43.799 And you can imagine this same sort of thing 0:26:43.799,0:26:46.320 being done for things, like, are you missing[br]an 0:26:46.320,0:26:49.690 index, right.
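The detection idea described above (replace bind values with a question mark, then look for the same query repeated within a request) can be sketched roughly as follows. The regular expression, the function names and the default threshold of 3 are assumptions chosen for this example; the talk does not say how Skylight implements it or which thresholds it uses.

```python
import re
from collections import Counter
from typing import List

# Matches single-quoted string literals and standalone numbers.
_LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")


def normalize(sql: str) -> str:
    """Replace literal values with ? so queries that differ only in binds compare equal."""
    return _LITERAL.sub("?", sql)


def find_n_plus_one(queries: List[str], threshold: int = 3) -> List[str]:
    """Return the normalized queries that repeat at least `threshold` times in one request."""
    counts = Counter(normalize(query) for query in queries)
    return [sql for sql, repeats in counts.items() if repeats >= threshold]
```

For a request that loads the posts once and then the comments for each of five posts, only the comments query is reported, because after normalization it appears five times.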
Or, are you using the Ruby version 0:26:49.690,0:26:50.950 of JSON when you should be using the native 0:26:50.950,0:26:52.179 version of JSON. These are all things that[br]we 0:26:52.179,0:26:55.140 can start detecting just because we're consuming[br]an enormous 0:26:55.140,0:26:57.510 amount of information, and we can start writing[br]some 0:26:57.510,0:26:59.330 heuristics for bubbling it up. 0:26:59.330,0:27:02.330 So, third and final breakthrough, we realized[br]that we 0:27:02.330,0:27:05.289 really, really needed a lightning fast UI.[br]Something really 0:27:05.289,0:27:08.279 responsive. So, in particular, the feedback[br]loop is critical, 0:27:08.279,0:27:09.929 right. You can imagine, if the way that you 0:27:09.929,0:27:12.279 dug into data was you clicked and you wait 0:27:12.279,0:27:14.320 an hour, and then you get your results, no 0:27:14.320,0:27:15.730 one would do it. No one would ever do 0:27:15.730,0:27:15.890 it. 0:27:15.890,0:27:19.090 And the existing tools are OK, but you click 0:27:19.090,0:27:20.470 and you wait. You look at it and you're 0:27:20.470,0:27:21.730 like, oh, I want a different view, so then 0:27:21.730,0:27:23.240 you go edit your query and then you click 0:27:23.240,0:27:25.360 and you wait and it's just not a pleasant 0:27:25.360,0:27:26.600 experience. 0:27:26.600,0:27:28.850 So, so we use Ember, the, the UI that 0:27:28.850,0:27:31.250 you're using when you log into Skylight. Even[br]though 0:27:31.250,0:27:33.289 it feels just like a regular website, it doesn't 0:27:33.289,0:27:35.940 feel like a native app, is powered, all of 0:27:35.940,0:27:37.679 the routing, all of the rendering, all of[br]the 0:27:37.679,0:27:40.769 decision making, is happening in, as an Ember.js[br]app, 0:27:40.769,0:27:43.049 and we pair that with D3. So all of 0:27:43.049,0:27:44.830 the charts, the charts that you saw there[br]in 0:27:44.830,0:27:48.039 the aggregate trace, that is all Ember components[br]powered 0:27:48.039,0:27:48.970 by D3. 
0:27:48.970,0:27:52.860 So, this is actually significantly cleaned[br]up our client-side 0:27:52.860,0:27:55.679 code. It makes re-usability really, really[br]awesome. So to 0:27:55.679,0:27:57.039 give you an example, this is from our billing 0:27:57.039,0:27:58.789 page that I, the designer came and they had, 0:27:58.789,0:28:01.260 they had a component that was like, the date 0:28:01.260,0:28:01.809 component. 0:28:01.809,0:28:02.919 And, the- 0:28:02.919,0:28:05.899 T.D.: It seems really boring at first. 0:28:05.899,0:28:06.799 Y.K.: It seemed really boring. But, this is[br]the 0:28:06.799,0:28:08.950 implementation, right. So you could copy and[br]paste this 0:28:08.950,0:28:11.059 code over and over again, everywhere you go.[br]Just 0:28:11.059,0:28:12.750 remember to format it correctly. If you forget[br]to 0:28:12.750,0:28:15.070 format it, it's not gonna look the same everywhere. 0:28:15.070,0:28:17.460 But I was like, hey, we're using this all 0:28:17.460,0:28:18.010 over the place. Why don't we bundle this up 0:28:18.010,0:28:20.070 into a component? And so with Ember, it was 0:28:20.070,0:28:22.230 super easy. We basically just said, OK, here's[br]new 0:28:22.230,0:28:24.590 calendar date component. It has a property[br]on it 0:28:24.590,0:28:26.460 called date. Just set that to any JavaScript[br]Date 0:28:26.460,0:28:28.059 object. Just set, you don't have to remember[br]about 0:28:28.059,0:28:30.450 converting it or formatting it. Here's the[br]component. Set 0:28:30.450,0:28:31.840 the date and it will render the correct thing 0:28:31.840,0:28:32.760 automatically. 0:28:32.760,0:28:36.039 And, so the architecture of the Ember app[br]looks 0:28:36.039,0:28:37.640 a little bit, something like this, where you[br]have 0:28:37.640,0:28:39.919 many, many different components, most of them[br]just driven 0:28:39.919,0:28:42.370 by D3, and then they're plugged into the model 0:28:42.370,0:28:43.480 and the controller.
0:28:43.480,0:28:44.909 And the Ember app will go fetch those models 0:28:44.909,0:28:46.750 from the cloud, and the cloud from the Java 0:28:46.750,0:28:50.190 app, which just queries Cassandra, and render[br]them. And 0:28:50.190,0:28:53.429 what's neat about this model is turning on[br]web 0:28:53.429,0:28:56.360 sockets is super easy, right. Because all[br]of these 0:28:56.360,0:28:58.860 components are bound to a single place. So[br]when 0:28:58.860,0:29:00.890 the web socket says, hey, we have updated[br]information 0:29:00.890,0:29:02.630 for you to show, it just pushes it onto 0:29:02.630,0:29:04.980 the model or onto the controller, and the[br]whole 0:29:04.980,0:29:06.159 UI updates automatically. 0:29:06.159,0:29:06.890 It's like magic. 0:29:06.890,0:29:07.230 And- 0:29:07.230,0:29:08.250 Y.K.: Like magic. 0:29:08.250,0:29:09.679 T.D.: It's like magic. And, and when debugging,[br]this 0:29:09.679,0:29:11.559 is especially awesome too, because, and I'll[br]maybe show 0:29:11.559,0:29:15.080 a demo of the Ember inspector. It's nice. 0:29:15.080,0:29:17.830 So. Yeah. So, lightning fast UI. Reducing[br]the feedback 0:29:17.830,0:29:19.510 loop so that you can quickly play with your 0:29:19.510,0:29:21.880 data, makes it go from a chore to something 0:29:21.880,0:29:23.620 that actually feels kind of fun. 0:29:23.620,0:29:27.039 So, these were the breakthroughs that we had[br]when 0:29:27.039,0:29:28.440 we were building Skylight. The things that[br]made us 0:29:28.440,0:29:29.980 think, yes, this is actually a product that[br]we 0:29:29.980,0:29:31.940 think deserves to be on the market. So, one, 0:29:31.940,0:29:33.860 honest response times. Collect data that no[br]one else 0:29:33.860,0:29:36.549 can collect. Focus on answers instead of just[br]dumping 0:29:36.549,0:29:38.289 data, and have a lightning fast UI to do 0:29:38.289,0:29:38.409 it. 0:29:38.409,0:29:40.100 So we like to think of Skylight as basically 0:29:40.100,0:29:42.690 a smart profiler. 
It's a smart profiler that[br]runs 0:29:42.690,0:29:44.350 in production. It's like the profiler that[br]you run 0:29:44.350,0:29:47.230 on your local development machine, but instead[br]of being 0:29:47.230,0:29:49.179 on your local dev box which has nothing to 0:29:49.179,0:29:51.610 do with the performance characteristics of[br]what your users 0:29:51.610,0:29:53.450 are experiencing, we're actually running in[br]production. 0:29:53.450,0:29:58.919 So, let me just give you guys a quick 0:29:58.919,0:30:00.390 demo. 0:30:00.390,0:30:03.120 So, this is what the Skylight, this is what 0:30:03.120,0:30:07.610 Skylight looks like. What's under this? There[br]we go. 0:30:07.610,0:30:09.620 So, the first thing here is we've got the 0:30:09.620,0:30:12.669 app dashboard. So this, it's like our, 95th 0:30:12.669,0:30:15.500 responsile- 95th percentile response time[br]has peaked. Maybe you're 0:30:15.500,0:30:17.970 all hammering it right now. That would be[br]nice. 0:30:17.970,0:30:19.940 So, this is a graph of your response time 0:30:19.940,0:30:22.010 over time, and then on the right, this is 0:30:22.010,0:30:24.700 the graph of the RPMs, the requests per minute 0:30:24.700,0:30:26.750 that your app is handling. So this is app-wide. 0:30:26.750,0:30:29.440 And this is live. This updates every minute. 0:30:29.440,0:30:31.039 Then down below, you have a list of the 0:30:31.039,0:30:33.730 end points in your application. So you can[br]see, 0:30:33.730,0:30:35.700 actually, the top, the slowest ones for us[br]were, 0:30:35.700,0:30:37.789 we have an instrumentation API, and we've[br]gone and 0:30:37.789,0:30:39.929 instrumented our background workers. So we[br]can see them 0:30:39.929,0:30:42.010 here, and their response time plays in. So[br]we 0:30:42.010,0:30:44.220 can see that we have this reporting worker[br]that's 0:30:44.220,0:30:46.899 taking 95th percentile, thirteen seconds.
0:30:46.899,0:30:48.880 Y.K.: So all that time used to be inside 0:30:48.880,0:30:51.500 of some request somewhere, and we discovered[br]that there 0:30:51.500,0:30:52.840 was a lot of time being spent in things 0:30:52.840,0:30:54.679 that we could push to the background. We probably 0:30:54.679,0:30:56.789 need to update the agony index so that it 0:30:56.789,0:30:59.190 doesn't make workers very high, because spending[br]some time 0:30:59.190,0:31:02.120 in your workers is not that big of a 0:31:02.120,0:31:02.130 deal. 0:31:02.130,0:31:03.000 T.D.: So, so then, if we dive into one 0:31:03.000,0:31:05.299 of these, you can see that for this request, 0:31:05.299,0:31:07.000 we've got the time explorer up above, and[br]that 0:31:07.000,0:31:10.429 shows a graph of response time at, again,[br]95th 0:31:10.429,0:31:11.840 percentile, and you can, if you want to go 0:31:11.840,0:31:13.549 back and look at historical data, you just[br]drag 0:31:13.549,0:31:15.250 it like this. And this has got a brush, 0:31:15.250,0:31:16.980 so you can zoom in and out on different 0:31:16.980,0:31:17.760 times. 0:31:17.760,0:31:19.649 And every time you change the range, you can 0:31:19.649,0:31:21.360 see that it's very responsive. It's never[br]waiting for 0:31:21.360,0:31:23.039 the server. But it is going back and fetching 0:31:23.039,0:31:25.080 data from the server and then when the data 0:31:25.080,0:31:29.210 comes back, you see the whole UI just updates. 0:31:29.210,0:31:29.250 And we get that for free with Ember and 0:31:29.250,0:31:31.190 And then down below, as we discussed, you[br]actually 0:31:31.190,0:31:33.760 have a real histogram. And this histogram,[br]in this 0:31:33.760,0:31:37.159 case, is showing. So this is for fifty-seven[br]requests. 0:31:37.159,0:31:39.019 And if we click and drag, we could just 0:31:39.019,0:31:40.429 move this. And you can see that the aggregate 0:31:40.429,0:31:43.360 trace below updates in response to us dragging[br]this. 
0:31:43.360,0:31:44.919 And if we want to look at the fastest 0:31:44.919,0:31:47.500 quartile, we just click faster and we'll just[br]choose 0:31:47.500,0:31:48.149 that range on the histogram. 0:31:48.149,0:31:49.210 Y.K.: I think it's the fastest load. 0:31:49.210,0:31:50.899 T.D.: The fastest load. And then if you click 0:31:50.899,0:31:52.899 on slower, you can see the slower requests.[br]So 0:31:52.899,0:31:54.669 this makes it really easy to compare and contrast. 0:31:54.669,0:31:56.710 OK. Why are certain requests faster and why[br]are 0:31:56.710,0:31:58.529 certain requests slow? 0:31:58.529,0:32:00.779 You can see the blue, these blue areas. This 0:32:00.779,0:32:03.559 is Ruby code. So, right now it's not super 0:32:03.559,0:32:05.820 granular. It would be nice if you could actually 0:32:05.820,0:32:07.940 know what was going on here. But, it'll at 0:32:07.940,0:32:09.940 least tell you where in your controller action[br]this 0:32:09.940,0:32:12.690 is happening, and then you can actually see[br]which 0:32:12.690,0:32:15.919 database queries are being executed, and what[br]their duration 0:32:15.919,0:32:16.080 is. 0:32:16.080,0:32:17.889 And you can see that we actually extract the 0:32:17.889,0:32:20.419 SQL and we denormalize it so we, so you, 0:32:20.419,0:32:22.159 or, we normalize it so you can see exactly 0:32:22.159,0:32:24.019 what those requests are even if the values[br]are 0:32:24.019,0:32:24.820 totally different between them. 0:32:24.820,0:32:27.649 Y.K.: Yeah. So the real query, courtesy of[br]Rails, 0:32:27.649,0:32:29.730 not yet supporting bind extraction is like,[br]where id 0:32:29.730,0:32:32.169 equals one or, ten or whatever. 0:32:32.169,0:32:33.659 T.D.: Yup. So that's pretty cool. 
0:32:33.659,0:32:37.429 Y.K.: So one, one other thing is, initially,[br]we 0:32:37.429,0:32:39.269 actually just showed the whole trace, but[br]we discovered 0:32:39.269,0:32:41.659 that, obviously when you show whole traces[br]you have 0:32:41.659,0:32:43.639 information that doesn't really matter that[br]much. So we 0:32:43.639,0:32:47.340 started off by, we've recently basically started[br]to collapse 0:32:47.340,0:32:48.850 things that don't matter so much so that you 0:32:48.850,0:32:51.090 can basically expand or condense the trace. 0:32:51.090,0:32:52.519 And we wanted to make it not that you 0:32:52.519,0:32:55.690 have to think about expanding or condensing[br]individual areas, 0:32:55.690,0:32:57.960 but just, you see what matters the most and 0:32:57.960,0:32:59.100 then you can see trivial areas. 0:32:59.100,0:33:02.179 T.D.: Yup. So, so that's the demo of Skylight. 0:33:02.179,0:33:04.190 We'd really like it if you checked it out. 0:33:04.190,0:33:05.899 There is one more thing I want to show 0:33:05.899,0:33:07.720 you that is, like, really freaking cool. This[br]is 0:33:07.720,0:33:10.529 coming out of Tilde labs. Carl was like, has 0:33:10.529,0:33:13.730 been hacking, he's been up until past midnight,[br]getting 0:33:13.730,0:33:15.769 almost no sleep for the past month trying[br]to 0:33:15.769,0:33:16.730 have this ready. 0:33:16.730,0:33:19.090 I don't know how many of you know this, 0:33:19.090,0:33:23.630 but Ruby 2 point 1 has a new, a, 0:33:23.630,0:33:27.950 a stack sampling feature. So you can get really 0:33:27.950,0:33:31.149 granular information about how your Ruby code[br]is performing. 0:33:31.149,0:33:33.450 So I want to show you, I just mentioned 0:33:33.450,0:33:34.570 how it would be nice if we could get 0:33:34.570,0:33:36.830 more information out of what your Ruby code[br]is 0:33:36.830,0:33:38.760 doing. And now we can do that.
0:33:38.760,0:33:42.039 Basically, every few milliseconds, this code[br]that Carl wrote 0:33:42.039,0:33:44.399 is going into the, to the Ruby, into MRI, 0:33:44.399,0:33:47.419 and it's taking a snap shot of the stack. 0:33:47.419,0:33:50.769 And because this is built-in, it's very low-impact.[br]It's 0:33:50.769,0:33:53.570 not allocating any new memory. It's very little[br]performance 0:33:53.570,0:33:55.769 hit. Basically you wouldn't even notice it.[br]And so 0:33:55.769,0:33:58.149 every few milliseconds it's sampling, and[br]we take that 0:33:58.149,0:34:00.260 information and we send it up to our servers. 0:34:00.260,0:34:02.260 So it's almost like you're running Ruby profiler[br]on 0:34:02.260,0:34:05.220 your local dev box, where you get extremely[br]granular 0:34:05.220,0:34:07.159 information about where your code is spending[br]its time 0:34:07.159,0:34:09.010 in Ruby, per method, per all of these things. 0:34:09.010,0:34:11.909 But it's happening in production. 0:34:11.909,0:34:16.409 So, this is, so this is a, we enabled 0:34:16.409,0:34:18.399 it in staging. You can see that we've got 0:34:18.399,0:34:19.600 some rendering bugs. It's still in beta. 0:34:19.600,0:34:21.918 Y.K.: Yeah, and we haven't yet collapsed things[br]that 0:34:21.918,0:34:21.980 are not important- 0:34:21.980,0:34:22.020 T.D.: Yes. 0:34:22.020,0:34:23.270 Y.K.: -for this particular feature. 0:34:23.270,0:34:24.170 T.D.: So we want to show, we want to 0:34:24.170,0:34:27.610 hide things like, like framework code, obviously.[br]But this 0:34:27.610,0:34:31.070 gives you an incredibly, incredibly granular[br]view of what 0:34:31.070,0:34:35.659 your app is doing in production. And we think. 0:34:35.659,0:34:39.230 This is a, an API that's built into, into 0:34:39.230,0:34:43.159 Ruby 2.1.1. 
Because our agent is running so[br]low-level, 0:34:43.159,0:34:44.659 because we wrote it in Rust, we have the 0:34:44.659,0:34:47.409 ability to do things like this, and Carl thinks 0:34:47.409,0:34:48.370 that we may be able to actually backport 0:34:48.370,0:34:48.480 this to older Rubies, too. So if you're not 0:34:48.480,0:34:50.130 on Ruby 2.1, we think that we can actually 0:34:50.130,0:34:52.790 bring this. But that's TBD. 0:34:52.790,0:34:55.480 Y.K.: Yeah, I- so I think the cool thing 0:34:55.480,0:34:57.940 about this, in general, is when you run a 0:34:57.940,0:34:59.430 sampling- so this is a sampling profiler,[br]right, we 0:34:59.430,0:35:01.260 don't want to be burdening every single thing[br]that 0:35:01.260,0:35:03.790 you do in your program with tracing, right.[br]That 0:35:03.790,0:35:05.380 would be very slow. 0:35:05.380,0:35:06.920 So when you normally run a sampling profiler,[br]you 0:35:06.920,0:35:08.760 have to basically make a loop. You have to 0:35:08.760,0:35:11.090 basically create a loop, run this code a million 0:35:11.090,0:35:12.970 times and keep sampling it. Eventually we'll[br]get enough 0:35:12.970,0:35:15.030 samples to get the information. But it turns[br]out 0:35:15.030,0:35:17.280 that your production server is a loop. Your[br]production 0:35:17.280,0:35:20.560 server is serving tons and tons of requests.[br]So, 0:35:20.560,0:35:22.880 by simply tak- you know, taking a few microseconds 0:35:22.880,0:35:25.580 out of every request and collecting a couple[br]of 0:35:25.580,0:35:27.210 samples, over time we can actually get this[br]really 0:35:27.210,0:35:29.700 high fidelity picture with basically no cost. 0:35:29.700,0:35:31.150 And that's pretty mind-blowing. And this is[br]the kind 0:35:31.150,0:35:34.650 of stuff that we can start doing by really 0:35:34.650,0:35:37.250 caring about, about both the user experience[br]and the 0:35:37.250,0:35:40.830 implementation and getting really scary about[br]it.
And I'm 0:35:40.830,0:35:42.700 really, like, honestly this is a really exciting[br]feature 0:35:42.700,0:35:45.330 that really shows what we can do as we 0:35:45.330,0:35:46.130 start building this out. 0:35:46.130,0:35:47.140 T.D.: Once we've got that, once we've got[br]that 0:35:47.140,0:35:48.380 groundwork. 0:35:48.380,0:35:49.820 So if you guys want to check it out, 0:35:49.820,0:35:51.760 skylight.io, it's available today. It's[br]no longer 0:35:51.760,0:35:54.040 in private beta. Everyone can sign up. No[br]invitation 0:35:54.040,0:35:56.630 token necessary. And you can get a thirty-day[br]free 0:35:56.630,0:35:58.240 trial if you haven't started one already.[br]So if 0:35:58.240,0:35:59.620 you have any questions, please come see us[br]right 0:35:59.620,0:36:00.980 now, or we have a booth in the vendor 0:36:00.980,0:36:03.140 hall. Thank you guys very much.
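[Editor's sketch] The sampling idea described above (snapshot the stack every few milliseconds while requests are being served, then aggregate the snapshots) can be illustrated roughly as follows. This is a minimal sketch in Python, not Skylight's Rust agent and not the Ruby 2.1 API mentioned in the talk. The names `capture_stack`, `Sampler`, `slow_query`, `render_view` and `handle_request` are hypothetical, and the sleeps merely stand in for real work. Unlike the built-in mechanism described in the talk, this version does allocate memory on every sample.

```python
import collections
import sys
import threading
import time


def capture_stack(thread_id):
    """Return the function names on one thread's stack, outermost first."""
    frame = sys._current_frames().get(thread_id)
    names = []
    while frame is not None:
        names.append(frame.f_code.co_name)
        frame = frame.f_back
    return list(reversed(names))


class Sampler:
    """Hypothetical helper: samples one thread every `interval` seconds."""

    def __init__(self, thread_id, interval=0.005):
        self.thread_id = thread_id
        self.interval = interval
        self.counts = collections.Counter()
        self._stop_event = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        # Wake up every `interval` seconds until stop() is called.
        while not self._stop_event.wait(self.interval):
            stack = capture_stack(self.thread_id)
            if stack:
                # Attribute the sample to the innermost Python function.
                self.counts[stack[-1]] += 1

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop_event.set()
        self._thread.join()


def slow_query():
    time.sleep(0.02)  # stand-in for a database call


def render_view():
    time.sleep(0.01)  # stand-in for template rendering


def handle_request():
    slow_query()
    render_view()


if __name__ == "__main__":
    # The "production loop": many requests, a few samples taken from each.
    sampler = Sampler(threading.get_ident())
    sampler.start()
    for _ in range(20):
        handle_request()
    sampler.stop()
    print(sampler.counts.most_common())
```

No single request yields many samples, but because the server keeps handling requests, the counts accumulate into a per-function picture of where time goes. A real agent would record whole stacks rather than only the innermost function.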