WEBVTT 00:00:24.900 --> 00:00:25.410 00:00:25.410 --> 00:00:26.650 Chad: Yes, hello, thank you. 00:00:26.650 --> 00:00:29.140 Audience member: Hello! 00:00:29.140 --> 00:00:30.810 Chad: Hello! 00:00:30.810 --> 00:00:33.629 I am Chad, as he said. 00:00:33.629 --> 00:00:35.300 He said I need no introduction 00:00:35.300 --> 00:00:37.890 so I won't introduce myself any further. 00:00:37.890 --> 00:00:44.890 I may be the biggest non-Indian fan of India 00:01:01.690 --> 00:01:06.400 [Hindi speech] 00:01:06.400 --> 00:01:13.400 00:01:15.390 --> 00:01:17.830 I'll now switch back, sorry. 00:01:17.830 --> 00:01:20.010 If you don't understand Hindi, I said nothing of value 00:01:20.010 --> 00:01:22.030 and it was all wrong. 00:01:22.030 --> 00:01:23.549 But I was saying that my Hindi is bad 00:01:23.549 --> 00:01:26.000 and it's because now I'm learning German 00:01:26.000 --> 00:01:28.110 so I mixed them together, but I know not everyone 00:01:28.110 --> 00:01:29.440 speaks Hindi here. 00:01:29.440 --> 00:01:32.480 I just had to show off, you know 00:01:32.480 --> 00:01:37.110 So, I am currently working on 6WunderKinder, 00:01:37.110 --> 00:01:40.370 and I'm working on a product called Wunderlist. 00:01:40.370 --> 00:01:42.060 It is a productivity application. 00:01:42.060 --> 00:01:45.860 It runs on every client you can think of. 00:01:45.860 --> 00:01:47.620 We have native clients, we have a back-end, 00:01:47.620 --> 00:01:49.690 we have millions of active users, 00:01:49.690 --> 00:01:51.850 and I'm telling you this not so that you'll go download it - 00:01:51.850 --> 00:01:53.390 you can do that too - 00:01:53.390 --> 00:01:56.960 but I want to tell you about the challenges that I have 00:01:56.960 --> 00:02:00.710 and the way I'm starting to think about system's architecture and design. 
00:02:00.710 --> 00:02:03.130 That's what I'm gonna talk about today 00:02:03.130 --> 00:02:05.690 I'm going to show you some things that are real 00:02:05.690 --> 00:02:07.159 and that we're really doing. 00:02:07.159 --> 00:02:09.429 I'm going to show you some things that are 00:02:09.429 --> 00:02:12.670 just a fantasy that maybe don't make any sense at all. 00:02:12.670 --> 00:02:13.980 But hopefully I'll get you think about 00:02:13.980 --> 00:02:15.540 how we think about system architecture 00:02:15.540 --> 00:02:18.370 and how we build things that can last for a long time. 00:02:18.370 --> 00:02:20.870 So the first thing that I want to mention: 00:02:20.870 --> 00:02:23.340 this is a graph from the Standish Chaos report 00:02:23.340 --> 00:02:25.430 and I've taken the years out 00:02:25.430 --> 00:02:27.310 and I've taken some of the raw data out 00:02:27.310 --> 00:02:28.720 because it doesn't matter. 00:02:28.720 --> 00:02:30.530 If you look at these, this graph, 00:02:30.530 --> 00:02:33.379 each one of these bars is a year, 00:02:33.379 --> 00:02:38.159 and each bar represents successful projects in green - 00:02:38.159 --> 00:02:40.079 software projects. 00:02:40.079 --> 00:02:42.409 Challenged projects are in silver or white in the middle 00:02:42.409 --> 00:02:44.249 and then failed ones are in red. 00:02:44.249 --> 00:02:47.340 But challenged means significantly over time or budget 00:02:47.340 --> 00:02:49.349 which to me means failed too. 00:02:49.349 --> 00:02:51.430 So basically we're terrible, 00:02:51.430 --> 00:02:54.279 all of us here, we're terrible. 00:02:54.279 --> 00:02:57.060 We call ourselves engineers but it's a disgrace. 00:02:57.060 --> 00:03:00.840 We very rarely actually launch things that work. 00:03:00.840 --> 00:03:01.389 Kind of sad, 00:03:01.389 --> 00:03:03.829 and I am here to bring you down. 
00:03:03.829 --> 00:03:07.169 Then once you launch software, anecdotally, 00:03:07.169 --> 00:03:12.359 and you probably would see this in your own work lives, too, 00:03:12.359 --> 00:03:16.230 anecdotally, software gets killed after about five years - 00:03:16.230 --> 00:03:17.650 business software. 00:03:17.650 --> 00:03:19.950 So you barely ever get to launch it, 00:03:19.950 --> 00:03:23.319 or at least successfully, in a way that you're proud of, 00:03:23.319 --> 00:03:24.650 and then in about five years 00:03:24.650 --> 00:03:27.739 you end up in that situation where you're doing a big rewrite 00:03:27.739 --> 00:03:29.519 and throwing everything away and replacing it. 00:03:29.519 --> 00:03:32.519 You know there's always that project to get rid of the junk, 00:03:32.519 --> 00:03:35.569 old Java code or whatever that you wrote five years ago, 00:03:35.569 --> 00:03:37.180 replace it with Ruby now, 00:03:37.180 --> 00:03:39.909 five years from now you'll be replacing your old junk Ruby code 00:03:39.909 --> 00:03:46.180 that didn't work with something else. 00:03:46.180 --> 00:03:49.379 We create this thing, probably all of you know the term legacy software - 00:03:49.379 --> 00:03:53.340 Right, am I right? You know what legacy software is, 00:03:53.340 --> 00:03:56.139 and you probably think of it as a negative thing. 00:03:56.139 --> 00:03:58.120 You think of it as that ugly code that doesn't work, 00:03:58.120 --> 00:04:02.540 that's brittle, that you can't change, that you're all afraid of. 00:04:02.540 --> 00:04:07.150 But there's actually also a positive connotation of the word legacy: 00:04:07.150 --> 00:04:14.139 it's leaving behind something that future generations can benefit from. 00:04:14.139 --> 00:04:17.370 But if we're rarely ever launching successful projects 00:04:17.370 --> 00:04:20.889 and then the ones we do launch tend to die within five years 00:04:20.889 --> 00:04:24.600 none of us are actually creating a legacy in our work.
00:04:24.600 --> 00:04:27.430 We're just creating stuff that gets thrown away. 00:04:27.430 --> 00:04:29.400 Kind of sad. 00:04:29.400 --> 00:04:32.240 So we create this stuff that's a legacy software. 00:04:32.240 --> 00:04:35.060 It's hard to change, that's why it ends up getting thrown away 00:04:35.060 --> 00:04:37.370 right, that's, if the software worked 00:04:37.370 --> 00:04:40.030 and you could keep changing it to meet the needs of the business 00:04:40.030 --> 00:04:43.979 you wouldn't need to do a big rewrite and throw it away. 00:04:43.979 --> 00:04:47.840 We create these huge tightly-coupled systems, 00:04:47.840 --> 00:04:49.370 and I don't just mean one application, 00:04:49.370 --> 00:04:51.430 but like many applications are all tightly coupled. 00:04:51.430 --> 00:04:55.900 You've got this thing over here talking to the database of this system over here 00:04:55.900 --> 00:04:59.360 so if you change the columns to update the view of a webpage 00:04:59.360 --> 00:05:02.710 you ruin your billing system, that kind of thing 00:05:02.710 --> 00:05:06.270 this is what makes it so hard to change 00:05:06.270 --> 00:05:09.970 and the sad thing about this is the way we work 00:05:09.970 --> 00:05:13.500 the way we develop software, this is the default setting 00:05:13.500 --> 00:05:18.460 and, what I mean is, if we were robots churning out software 00:05:18.460 --> 00:05:20.819 and we had a preferences panel 00:05:20.819 --> 00:05:25.080 the default preferences would lead to us creating terrible software that gets thrown away in 00:05:25.080 --> 00:05:25.699 five years 00:05:25.699 --> 00:05:27.210 that's just how we all work 00:05:27.210 --> 00:05:30.180 as human beings when we sit down to write code 00:05:30.180 --> 00:05:35.430 our default instincts lead to us to create systems that are tightly coupled 00:05:35.430 --> 00:05:41.659 and hard to change and ultimately get thrown away and can't scale 00:05:41.659 --> 00:05:46.060 we create, we try doing 
tests, we try doing TDD 00:05:46.060 --> 00:05:51.330 but we create test suites that take forty-five minutes to run 00:05:51.330 --> 00:05:52.720 every team has had to deal with this I'm sure 00:05:52.720 --> 00:05:55.990 if you've written any kind of meaningful application 00:05:55.990 --> 00:05:57.970 and it gets to where you have like a project 00:05:57.970 --> 00:05:59.849 to speed up the test suite 00:05:59.849 --> 00:06:02.949 like you start focusing your company's resources 00:06:02.949 --> 00:06:04.949 on making the test suite faster 00:06:04.949 --> 00:06:08.689 or making it like only fail ninety percent of the time 00:06:08.689 --> 00:06:10.949 and then you say well if it only fails ninety percent that's OK 00:06:10.949 --> 00:06:14.550 right, and right now it's taking forty-five minutes 00:06:14.550 --> 00:06:18.180 we want to get it to where it only takes ten minutes to run 00:06:18.180 --> 00:06:24.479 so the test suite ends up being a liability instead of a benefit 00:06:24.479 --> 00:06:25.719 because of the way you do it 00:06:25.719 --> 00:06:29.319 because you have this architecture where everything is so coupled 00:06:29.319 --> 00:06:34.939 you can't change anything without spending hours working on the stupid test suite 00:06:34.939 --> 00:06:38.419 and you're terrified to deploy 00:06:38.419 --> 00:06:42.960 I know like the last big Java project I was working on 00:06:42.960 --> 00:06:45.819 it would take, once a week we did a deploy 00:06:45.819 --> 00:06:50.139 it would take fifteen people all night to deploy the thing 00:06:50.139 --> 00:06:52.460 and usually it was like copying class files around 00:06:52.460 --> 00:06:54.430 and restarting servers 00:06:54.430 --> 00:06:57.120 it's much better today but it's still terrifying 00:06:57.120 --> 00:06:59.340 you deploy code, you change it in production 00:06:59.340 --> 00:07:01.259 you're not sure what might break 00:07:01.259 --> 00:07:03.719 cause it's really hard to test these big integrated
things together 00:07:03.719 --> 00:07:08.650 and actually upgrading the technology component is terrifying 00:07:08.650 --> 00:07:13.289 so, how many of you have been doing Rails for more than three years? 00:07:13.289 --> 00:07:18.400 do you have, like a Rails 2 app in production, anyone? Yeah? 00:07:18.400 --> 00:07:21.710 that's a lot of people, wow, that's terrifying 00:07:21.710 --> 00:07:26.129 and I've been in situations, recently, where we had Rails 2 apps in production 00:07:26.129 --> 00:07:29.560 security patches are coming out, we were applying our own versions 00:07:29.560 --> 00:07:30.879 of those security patches 00:07:30.879 --> 00:07:32.439 because we were afraid to upgrade Rails 00:07:32.439 --> 00:07:35.060 we would rather hack it than upgrade the thing 00:07:35.060 --> 00:07:38.319 because you just don't know what's gonna happen 00:07:38.319 --> 00:07:42.490 and then you end up, as you're re-implementing all this stuff yourself 00:07:42.490 --> 00:07:44.819 you end up burning yourself out, wasting your time 00:07:44.819 --> 00:07:47.990 because you're hacking on stupid Rails 2 00:07:47.990 --> 00:07:50.139 or some old struts version 00:07:50.139 --> 00:07:52.569 when you should be just taking advantage of the new patches 00:07:52.569 --> 00:07:54.639 but you can't because you're afraid to upgrade the software 00:07:54.639 --> 00:07:56.240 because you don't know what's going to happen 00:07:56.240 --> 00:08:02.849 because the system is too big and too scary 00:08:02.849 --> 00:08:04.949 then, and this is really bad, I think this is something 00:08:04.949 --> 00:08:07.009 Ruby messes up for all of us 00:08:07.009 --> 00:08:11.110 I say this as someone who's been using Ruby for thirteen years now 00:08:11.110 --> 00:08:12.740 happily 00:08:12.740 --> 00:08:15.520 we create these mountains of abstractions 00:08:15.520 --> 00:08:17.669 and the logic ends up being buried inside them 00:08:17.669 --> 00:08:22.860 I mean in Java it was like static, or, 
you know, factories 00:08:22.860 --> 00:08:25.090 and design pattern soup 00:08:25.090 --> 00:08:27.490 in Ruby it's modules and mixins and you know 00:08:27.490 --> 00:08:31.050 we have all these crazy ways of hiding what's actually happening from us 00:08:31.050 --> 00:08:33.070 but when you go look at the code 00:08:33.070 --> 00:08:34.360 it's completely opaque 00:08:34.360 --> 00:08:37.090 you have no idea where the stuff actually gets done 00:08:37.090 --> 00:08:40.820 because it's in some magic library somewhere 00:08:40.820 --> 00:08:45.050 and we do all that because we're trying to save ourselves from the complexity of these 00:08:45.050 --> 00:08:47.450 big nasty systems 00:08:47.450 --> 00:08:50.760 but like if you look at the rest of the world 00:08:50.760 --> 00:08:53.810 this is a software-specific problem 00:08:53.810 --> 00:08:58.760 these cars are old, they're older than any software that you would ever run 00:08:58.760 --> 00:09:00.340 and they're still driving down the street 00:09:00.340 --> 00:09:03.460 they're older than software itself, right 00:09:03.460 --> 00:09:06.370 but these things still function, they still work 00:09:06.370 --> 00:09:08.970 how? why? why do they work? 00:09:08.970 --> 00:09:11.340 bodies! my body should not work 00:09:11.340 --> 00:09:12.540 I have abused it 00:09:12.540 --> 00:09:13.870 I should not be standing here today 00:09:13.870 --> 00:09:16.660 I shouldn't have been able to come from Berlin here 00:09:16.660 --> 00:09:18.620 without dying somehow by being in the air 00:09:18.620 --> 00:09:23.660 you know, by the air pressure changes 00:09:23.660 --> 00:09:25.950 but our bodies somehow can survive even when 00:09:25.950 --> 00:09:30.730 we don't take care of them 00:09:30.730 --> 00:09:35.290 and like it's just the system that works, right 00:09:35.290 --> 00:09:37.770 so how do our bodies work?
00:09:37.770 --> 00:09:39.440 how do we stay alive 00:09:39.440 --> 00:09:40.930 despite this fact 00:09:40.930 --> 00:09:42.130 even though we haven't done like some 00:09:42.130 --> 00:09:45.270 great design, we don't have any design patterns 00:09:45.270 --> 00:09:49.780 like mixed up into our bodies 00:09:49.780 --> 00:09:53.980 in biology there is a term called homeostasis 00:09:53.980 --> 00:09:56.210 and I literally don't know what this means 00:09:56.210 --> 00:09:57.390 other than this definition 00:09:57.390 --> 00:09:58.870 so you won't learn about this from me 00:09:58.870 --> 00:10:01.060 there's probably at least one biologist in the room 00:10:01.060 --> 00:10:04.370 so you can correct me later 00:10:04.370 --> 00:10:07.870 but basically the idea of homeostasis is 00:10:07.870 --> 00:10:11.430 that an organism has all these different components 00:10:11.430 --> 00:10:13.890 that serve different purposes 00:10:13.890 --> 00:10:15.820 that regulate it 00:10:15.820 --> 00:10:18.260 so they're all kind of in balance 00:10:18.260 --> 00:10:20.750 and they work together to regulate the system 00:10:20.750 --> 00:10:23.700 if one component, like a liver, does too much 00:10:23.700 --> 00:10:24.720 or does the wrong thing 00:10:24.720 --> 00:10:27.840 another component kicks in and fixes it 00:10:27.840 --> 00:10:30.160 and so our bodies are this well designed system 00:10:30.160 --> 00:10:31.960 for staying alive 00:10:31.960 --> 00:10:34.530 because we have almost like autonomous agents 00:10:34.530 --> 00:10:38.810 internally that take care of the many things that can and do go wrong 00:10:38.810 --> 00:10:41.890 on a regular basis 00:10:41.890 --> 00:10:43.770 so you have, you know, your brain, your liver 00:10:43.770 --> 00:10:47.230 your liver, of course, metabolizes toxic substances 00:10:47.230 --> 00:10:50.400 your kidney deals with blood, water level, et cetera 00:10:50.400 --> 00:10:55.660 you know all these things work in concert to make you live 
00:10:55.660 --> 00:11:01.140 the inability to continue to do that is known as homeostatic imbalance 00:11:01.140 --> 00:11:04.070 so I was saying, homeostasis is balancing 00:11:04.070 --> 00:11:07.330 not being able to do that is when you're out of balance 00:11:07.330 --> 00:11:10.340 and that will actually lead to really bad health problems 00:11:10.340 --> 00:11:16.410 or probably death, if you fall into homeostatic imbalance 00:11:16.410 --> 00:11:20.420 so the good news is you're already dying 00:11:20.420 --> 00:11:22.450 like we're all dying all the time 00:11:22.450 --> 00:11:26.500 this is the beautiful thing about death 00:11:26.500 --> 00:11:29.110 there is, there is an estimate that fifty trillion cells 00:11:29.110 --> 00:11:31.850 are in your body, and three million die per second 00:11:31.850 --> 00:11:35.520 it's an estimate because it's actually impossible to count 00:11:35.520 --> 00:11:39.520 but scientists have figured out somehow that this is probably the right number 00:11:39.520 --> 00:11:42.310 so your cells, you've probably heard this all your life 00:11:42.310 --> 00:11:45.170 like physically, after some amount of time, 00:11:45.170 --> 00:11:47.430 you aren't the same human being that you were, physically 00:11:47.430 --> 00:11:52.770 you know, I don't know, you some period of time ago 00:11:52.770 --> 00:11:55.500 you're literally not the same organism anymore 00:11:55.500 --> 00:11:58.420 but you're the same system 00:11:58.420 --> 00:12:01.470 kind of interesting, isn't it 00:12:01.470 --> 00:12:06.740 so in a way you can think about software this 00:12:06.740 --> 00:12:08.300 you can think about software as a system 00:12:08.300 --> 00:12:10.820 if the components could be replaced like these cells 00:12:10.820 --> 00:12:17.820 like, if you focus on making death, constant death OK 00:12:18.970 --> 00:12:19.890 on a small level 00:12:19.890 --> 00:12:24.690 then the system can live on a large level 00:12:24.690 --> 00:12:25.760 that's 
what this talk is about 00:12:25.760 --> 00:12:29.300 solution, the solution being to mimic living organisms 00:12:29.300 --> 00:12:36.110 and as an aside, I will say many times the word small or tiny in this talk 00:12:36.110 --> 00:12:38.480 because I think I'm learning, as I age 00:12:38.480 --> 00:12:39.870 that small is good 00:12:39.870 --> 00:12:42.950 its, small projects are good 00:12:42.950 --> 00:12:44.050 you know how to estimate them 00:12:44.050 --> 00:12:45.110 small commitments are good 00:12:45.110 --> 00:12:46.790 because you know you can make them 00:12:46.790 --> 00:12:47.750 small methods are good 00:12:47.750 --> 00:12:48.790 small classes are good 00:12:48.790 --> 00:12:50.140 small applications are good 00:12:50.140 --> 00:12:52.410 small teams are good 00:12:52.410 --> 00:12:55.270 so I don't know, this is sort of a non sequitur 00:12:55.270 --> 00:12:58.130 so if we're going to think about software 00:12:58.130 --> 00:12:59.750 as like an organism 00:12:59.750 --> 00:13:03.100 what is a cell in that context? 
00:13:03.100 --> 00:13:06.360 this is sort of the key question that you have to ask yourself 00:13:06.360 --> 00:13:08.940 and I say that a cell is a tiny component 00:13:08.940 --> 00:13:12.800 now, tiny and component are both subjective words 00:13:12.800 --> 00:13:15.370 so you can kind of do what you want with that 00:13:15.370 --> 00:13:17.670 but it's a good frame of thinking 00:13:17.670 --> 00:13:20.530 if you make your software system of tiny components 00:13:20.530 --> 00:13:22.510 each one can be like a cell 00:13:22.510 --> 00:13:28.010 each one can die and the system is a collection of those tiny components 00:13:28.010 --> 00:13:31.930 and what you want is not for your code to live forever 00:13:31.930 --> 00:13:35.700 you don't care that each line of code lives forever, right 00:13:35.700 --> 00:13:38.830 like if you're trying to develop a legacy in software 00:13:38.830 --> 00:13:42.920 it's not important to you that your System.out.println statement 00:13:42.920 --> 00:13:44.300 lives for ten years 00:13:44.300 --> 00:13:48.050 it's important to you that the function of the system lives for ten years 00:13:48.050 --> 00:13:50.170 so like, about exactly ten years ago 00:13:50.170 --> 00:13:57.170 we created RubyGems at RubyConf 2003 in Austin, Texas 00:13:59.260 --> 00:14:03.600 I haven't touched RubyGems myself in like four or five years 00:14:03.600 --> 00:14:04.890 but people are still using it 00:14:04.890 --> 00:14:06.130 they hate it because it's software 00:14:06.130 --> 00:14:07.750 everybody hates software right 00:14:07.750 --> 00:14:10.160 so if you can create software that people hate 00:14:10.160 --> 00:14:13.080 you've succeeded 00:14:13.080 --> 00:14:14.450 but it still exists 00:14:14.450 --> 00:14:16.560 I have no idea if any of the code is the same 00:14:16.560 --> 00:14:17.210 I would assume not 00:14:17.210 --> 00:14:21.350 you know I think, I'm sure that my name is still in it in a copyright notice 00:14:21.350
--> 00:14:23.510 but that's about it 00:14:23.510 --> 00:14:24.890 and that's a beautiful thing 00:14:24.890 --> 00:14:28.380 people are still using it to install Ruby libraries 00:14:28.380 --> 00:14:29.570 and software 00:14:29.570 --> 00:14:35.600 and I don't care if any of my existing, or my initial code is still in the system 00:14:35.600 --> 00:14:36.840 because the system still lives 00:14:36.840 --> 00:14:43.030 so, quite a long time ago now I was researching this kind of question 00:14:43.030 --> 00:14:44.600 about legacy software 00:14:44.600 --> 00:14:48.390 and I asked a question on Twitter as I often do at conferences 00:14:48.390 --> 00:14:49.910 when I'm preparing 00:14:49.910 --> 00:14:55.610 what are some of the old surviving software systems you regularly use 00:14:55.610 --> 00:14:58.430 and if you look at this, I mean, one thing is obviously 00:14:58.430 --> 00:15:03.290 everyone who answered gave some sort of Unix-related answer 00:15:03.290 --> 00:15:06.510 but basically all of these things on this list 00:15:06.510 --> 00:15:13.240 are either systems that are collections of really well-known split-up components 00:15:13.240 --> 00:15:15.700 or they're tiny, tiny programs 00:15:15.700 --> 00:15:18.540 so, like, grep is a tiny program, make 00:15:18.540 --> 00:15:19.640 it only does one thing 00:15:19.640 --> 00:15:23.970 well make is actually also arguably an operating system 00:15:23.970 --> 00:15:27.320 but I won't get into that 00:15:27.320 --> 00:15:29.390 emacs is obviously an operating system, right 00:15:29.390 --> 00:15:33.050 but it's well designed of these tiny little pieces 00:15:33.050 --> 00:15:37.190 so a lot of the old systems I know about follow this pattern 00:15:37.190 --> 00:15:40.190 this metaphor that I'm proposing 00:15:40.190 --> 00:15:42.170 and from my own career 00:15:42.170 --> 00:15:43.530 when I was here before in Bangalore 00:15:43.530 --> 00:15:47.250 I worked for GE and some of the people 00:15:47.250 -->
00:15:48.690 we hired even worked on the system there 00:15:48.690 --> 00:15:50.970 we had a system called the Bull 00:15:50.970 --> 00:15:53.700 and it was a Honeywell Bull mainframe 00:15:53.700 --> 00:15:57.280 I doubt any of you have worked on that 00:15:57.280 --> 00:15:58.440 but this one I know you didn't work on 00:15:58.440 --> 00:16:01.070 because it had a custom operating system 00:16:01.070 --> 00:16:03.110 with our own RDBMS 00:16:03.110 --> 00:16:06.260 we had created a TCP stack for it 00:16:06.260 --> 00:16:11.160 using like custom hardware that we plugged into a Windows NT computer 00:16:11.160 --> 00:16:14.930 with some sort of NT queuing system back in the day 00:16:14.930 --> 00:16:17.060 it was this terrifying thing 00:16:17.060 --> 00:16:22.510 when I started working there the system was already something like twenty-five years old 00:16:22.510 --> 00:16:25.630 and I believe even though there have been many, many projects 00:16:25.630 --> 00:16:30.160 to try to kill it, like we had a team called the Bull exit team 00:16:30.160 --> 00:16:33.230 I believe the system is still in production 00:16:33.230 --> 00:16:37.070 not as much as it used to be, there are less and less functions in production 00:16:37.070 --> 00:16:39.190 but I believe the system is still in production 00:16:39.190 --> 00:16:46.190 the reason for this is that the system was actually made up of these tiny little components 00:16:47.070 --> 00:16:50.540 and like really clear interfaces between them 00:16:50.540 --> 00:16:53.950 and we kept the system live because every time we tried to replace it 00:16:53.950 --> 00:16:57.290 with some fancy new gem, web thing or GUI app 00:16:57.290 --> 00:16:59.470 it wasn't as good, and the users hated it 00:16:59.470 --> 00:17:00.740 it just didn't work 00:17:00.740 --> 00:17:04.789 so we had to use this old, crazy, modified mainframe 00:17:04.789 --> 00:17:08.150 for a long time as a result 00:17:08.150 --> 00:17:10.890 so, the question I
ask myself is now 00:17:10.890 --> 00:17:13.429 how do I, how do I approach a problem like this 00:17:13.429 --> 00:17:19.000 and build a system that can survive for a long time 00:17:19.000 --> 00:17:20.049 I would encourage you 00:17:20.049 --> 00:17:22.589 how many of you know of Fred George 00:17:22.589 --> 00:17:24.720 this is Fred George 00:17:24.720 --> 00:17:25.900 he was at ThoughtWorks for a while 00:17:25.900 --> 00:17:27.669 so he may have, I think he lived in Bangalore 00:17:27.669 --> 00:17:31.150 for some time with ThoughtWorks, in fact 00:17:31.150 --> 00:17:35.050 he is now running a start-up in Silicon Valley 00:17:35.050 --> 00:17:38.600 but he has this talk that you can watch online 00:17:38.600 --> 00:17:41.660 from the Barcelona Ruby Conference the year before last 00:17:41.660 --> 00:17:45.030 called Microservice Architectures 00:17:45.030 --> 00:17:47.760 and he talks in great detail about 00:17:47.760 --> 00:17:50.340 how he implemented a concept at Forward 00:17:50.340 --> 00:17:52.150 that's very much like what I'm talking about 00:17:52.150 --> 00:17:55.130 tiny components that only do one thing and can be thrown away 00:17:55.130 --> 00:17:59.890 so Microservice Architecture is kind of the core of what I'm gonna talk about 00:17:59.890 --> 00:18:02.080 now I've put together some rules for 6WunderKinder 00:18:02.080 --> 00:18:04.110 which I am going to share with you 00:18:04.110 --> 00:18:07.110 6WunderKinder is the company I work for 00:18:07.110 --> 00:18:09.220 when we're working on Wunderlist 00:18:09.220 --> 00:18:11.940 and the rules of the, the goals of these rules 00:18:11.940 --> 00:18:16.690 are to reduce coupling, to make it where we can do fear-free deployments 00:18:16.690 --> 00:18:19.190 we reduce the chance of "cruft" in our code 00:18:19.190 --> 00:18:20.680 like nasty stuff that you're afraid of 00:18:20.680 --> 00:18:24.670 that you leave there, kind of broken window problems 00:18:24.670 --> 00:18:28.660 we make it
literally trivial to change code 00:18:28.660 --> 00:18:32.680 so you just never have to ask how do I do that 00:18:32.680 --> 00:18:33.990 you just find it easy 00:18:33.990 --> 00:18:39.140 and most importantly we give ourselves the freedom to go fast 00:18:39.140 --> 00:18:43.680 because I think no developer ever wants to be slow 00:18:43.680 --> 00:18:44.670 that's one of the worst things 00:18:44.670 --> 00:18:47.750 just toiling away and not actually accomplishing anything 00:18:47.750 --> 00:18:50.920 but we go slow because we're constrained by the system 00:18:50.920 --> 00:18:54.110 and we're constrained by, sometimes projects 00:18:54.110 --> 00:18:56.010 and other, you know, management-related things 00:18:56.010 --> 00:19:01.230 but oftentimes it's the mess of the system that we've created 00:19:01.230 --> 00:19:03.730 so some of the rules 00:19:03.730 --> 00:19:09.480 I think one thing, and maybe, maybe I'm going to get some pushback from this crowd 00:19:09.480 --> 00:19:13.170 one rule that is less controversial than it used to be 00:19:13.170 --> 00:19:15.050 is that comments are a design smell 00:19:15.050 --> 00:19:19.270 does anyone strongly disagree with that? 00:19:19.270 --> 00:19:20.750 no? 00:19:20.750 --> 00:19:23.930 does anyone strongly agree with that? 00:19:23.930 --> 00:19:27.210 OK, so the rest of you have no idea what I'm talking about 00:19:27.210 --> 00:19:33.240 so a design smell, I want to define this really quickly 00:19:33.240 --> 00:19:36.860 a design smell is something you see in your code or your system 00:19:36.860 --> 00:19:39.600 where it doesn't necessarily mean it's bad 00:19:39.600 --> 00:19:40.730 but you look at it and you think 00:19:40.730 --> 00:19:43.270 hmm, I should look into this a little bit 00:19:43.270 --> 00:19:45.930 and ask myself, why are there so many comments in this code? 00:19:45.930 --> 00:19:48.300 you know, especially the bottom one 00:19:48.300 --> 00:19:50.650 inline comments?
00:19:50.650 --> 00:19:56.810 definitely bad, definitely a sign that you should have another method, right 00:19:56.810 --> 00:19:59.040 so it's pretty easy to convince people 00:19:59.040 --> 00:20:00.430 that comments are a design smell 00:20:00.430 --> 00:20:02.010 and I think a lot of people in the industry 00:20:02.010 --> 00:20:03.490 are starting to agree 00:20:03.490 --> 00:20:05.280 maybe not for like a public library 00:20:05.280 --> 00:20:06.800 where you really need to tell someone 00:20:06.800 --> 00:20:09.660 here's how you use this class and this is what it's for 00:20:09.660 --> 00:20:12.470 but you shouldn't have to document every method 00:20:12.470 --> 00:20:15.440 and every argument because the method name and the argument name 00:20:15.440 --> 00:20:18.170 should speak for themselves, right 00:20:18.170 --> 00:20:20.670 so here's one that you probably won't agree with 00:20:20.670 --> 00:20:21.940 tests are a design smell 00:20:21.940 --> 00:20:28.580 so this one is probably a little more controversial 00:20:28.580 --> 00:20:32.570 especially in an environment where you're maybe still 00:20:32.570 --> 00:20:37.910 struggling with people to actually get them to write tests to begin with, right 00:20:37.910 --> 00:20:41.330 you know I went through this period in, like, 2000 and 2001 00:20:41.330 --> 00:20:44.090 where I was really heavily into evangelizing TDD 00:20:44.090 --> 00:20:47.190 and it was really stressful that you couldn't get anyone to do it 00:20:47.190 --> 00:20:49.520 I think you do have to go through that period 00:20:49.520 --> 00:20:52.110 and I'm not saying you shouldn't write any tests 00:20:52.110 --> 00:20:57.180 but that picture I showed you earlier of the slow, brittle test suite 00:20:57.180 --> 00:20:58.300 that's bad, right 00:20:58.300 --> 00:21:00.910 that is a bad state to be in 00:21:00.910 --> 00:21:03.890 and you're in that state because your tests suck 00:21:03.890 --> 00:21:05.850 that's why
you get in that state 00:21:05.850 --> 00:21:09.570 your tests suck because you're writing bad tests 00:21:09.570 --> 00:21:15.910 that don't exercise the right things in your system 00:21:15.910 --> 00:21:18.960 and what I've found is whenever I look into one of these 00:21:18.960 --> 00:21:21.809 big slow brittle test suites 00:21:21.809 --> 00:21:25.180 the tests themselves are indications 00:21:25.180 --> 00:21:28.240 and the sheer proliferation of tests 00:21:28.240 --> 00:21:30.940 are indications that the system is bad 00:21:30.940 --> 00:21:33.590 and the developers are like desperately 00:21:33.590 --> 00:21:36.980 fearfully trying to run the code 00:21:36.980 --> 00:21:38.650 in every way they can 00:21:38.650 --> 00:21:40.660 because it's the only way they can manage 00:21:40.660 --> 00:21:43.980 to even think about the complexity 00:21:43.980 --> 00:21:47.720 but if you think about it, if you had a tiny trivial system 00:21:47.720 --> 00:21:50.059 you wouldn't need to have hundreds of test files 00:21:50.059 --> 00:21:53.059 that take ten minutes to run, ever 00:21:53.059 --> 00:21:54.480 if you did, you're doing something stupid 00:21:54.480 --> 00:21:57.020 you're wasting your time working on tests 00:21:57.020 --> 00:22:00.110 and we as software developers obsess about this kind of thing 00:22:00.110 --> 00:22:04.770 because we have to fight so hard to get our peers to do it in the first place 00:22:04.770 --> 00:22:06.370 and to understand it 00:22:06.370 --> 00:22:10.050 we obsess to the point where we focus on the wrong thing 00:22:10.050 --> 00:22:14.660 none of us are in the business of writing tests for customers 00:22:14.660 --> 00:22:17.559 like we're not launching our tests on the web 00:22:17.559 --> 00:22:20.000 and hoping people will buy them, right 00:22:20.000 --> 00:22:23.900 it doesn't provide value, it's just a side-effect 00:22:23.900 --> 00:22:25.780 that we have focused too heavily on 00:22:25.780 --> 00:22:29.930 and we've lost 
sight of what the actual goal is 00:22:29.930 --> 00:22:34.210 so, this one actually requires a visual 00:22:34.210 --> 00:22:37.100 I tell the people on my team now 00:22:37.100 --> 00:22:40.340 you can write code in any language you want 00:22:40.340 --> 00:22:42.760 any framework you want, anything you want to do 00:22:42.760 --> 00:22:44.559 as long as the code is this big 00:22:44.559 --> 00:22:47.490 so if you want to write the new service in Haskell 00:22:47.490 --> 00:22:50.059 and it's this big in a normal size font 00:22:50.059 --> 00:22:51.470 you can do it 00:22:51.470 --> 00:22:54.260 if you want to do it in Clojure or Elixir or Scala or Ruby 00:22:54.260 --> 00:22:55.050 or whatever you want to do 00:22:55.050 --> 00:22:56.820 even Python for god's sake 00:22:56.820 --> 00:22:59.230 you can do it if it's this big and no bigger 00:22:59.230 --> 00:23:04.010 why? because it means I can look at it 00:23:04.010 --> 00:23:05.620 and I can understand it 00:23:05.620 --> 00:23:08.730 or if I don't I'll just throw it away 00:23:08.730 --> 00:23:12.100 because if it's this big it doesn't do very much, right 00:23:12.100 --> 00:23:14.450 so the risk is really low 00:23:14.450 --> 00:23:16.809 and I don't really mean the system is that big 00:23:16.809 --> 00:23:19.130 I mean the component is that big 00:23:19.130 --> 00:23:21.070 and in my world a component means a service 00:23:21.070 --> 00:23:24.710 that's running and probably listening on an HTTP port 00:23:24.710 --> 00:23:27.820 or some sort of Thrift or RPC protocol 00:23:27.820 --> 00:23:29.520 so it's a standalone thing 00:23:29.520 --> 00:23:30.680 it's its own application 00:23:30.680 --> 00:23:33.130 it's probably in its own git repository 00:23:33.130 --> 00:23:34.950 people do pull requests against it 00:23:34.950 --> 00:23:35.820 but it's just tiny 00:23:35.820 --> 00:23:39.110 so this big 00:23:39.110 --> 00:23:41.200 at the top of this, by the way 00:23:41.200 --> 00:23:45.720 is some code by
Konstantin Haase 00:23:45.720 --> 00:23:48.720 who also lives in Berlin, where I live 00:23:48.720 --> 00:23:51.480 this is a rewrite of Sinatra 00:23:51.480 --> 00:23:52.430 the web framework 00:23:52.430 --> 00:23:55.450 and Konstantin is actually the maintainer of Sinatra 00:23:55.450 --> 00:23:58.870 it's not fully compatible, but it's amazingly close 00:23:58.870 --> 00:24:00.260 and it all fits right in that 00:24:00.260 --> 00:24:05.020 but the font size is kind of small, so I cheated 00:24:05.020 --> 00:24:08.550 another rule, our systems are heterogeneous by default 00:24:08.550 --> 00:24:11.420 so I say you can write in any language you want 00:24:11.420 --> 00:24:14.050 that's not just because I want the developers to be excited 00:24:14.050 --> 00:24:16.650 although I think, most of you, if you worked 00:24:16.650 --> 00:24:19.390 in an environment where your boss told you 00:24:19.390 --> 00:24:21.500 you can use any programming language or tool you want 00:24:21.500 --> 00:24:23.809 you would be pretty happy about that, right 00:24:23.809 --> 00:24:26.590 anyone unhappy about that? 
I don't think so 00:24:26.590 --> 00:24:28.100 unless it's one of the bosses here 00:24:28.100 --> 00:24:31.679 that's like don't tell people that 00:24:31.679 --> 00:24:32.570 so that's one thing 00:24:32.570 --> 00:24:36.880 the other one is, it leads to a good system design 00:24:36.880 --> 00:24:38.840 because think about this 00:24:38.840 --> 00:24:42.350 if I write one program in Erlang, one component in Erlang 00:24:42.350 --> 00:24:44.410 one program in Ruby 00:24:44.410 --> 00:24:47.710 I have to work really, really hard to make tight coupling 00:24:47.710 --> 00:24:49.650 between those things 00:24:49.650 --> 00:24:53.340 like I have to basically use computer science to do that 00:24:53.340 --> 00:24:54.370 I don't even know what I would do 00:24:54.370 --> 00:24:55.929 you know it's hard 00:24:55.929 --> 00:24:58.590 like I would have to maybe implement Ruby in Erlang 00:24:58.590 --> 00:25:01.140 so that it can run in the same VM or vice versa 00:25:01.140 --> 00:25:04.059 it's just silly, I wouldn't do it 00:25:04.059 --> 00:25:07.050 so if my system is heterogeneous by default 00:25:07.050 --> 00:25:11.960 my coupling is very low, at least at a certain level by default 00:25:11.960 --> 00:25:14.170 because the path of least resistance 00:25:14.170 --> 00:25:16.679 is to make the system decoupled 00:25:16.679 --> 00:25:19.300 it's easier to make things decoupled than coupled 00:25:19.300 --> 00:25:21.510 if they're all running in different languages 00:25:21.510 --> 00:25:25.210 so in the past three months, I'll say 00:25:25.210 --> 00:25:30.490 I have written production code in Objective-C, Ruby, Scala, Clojure, Node 00:25:30.490 --> 00:25:34.059 I don't know, more stuff, Java 00:25:34.059 --> 00:25:35.670 all these different languages 00:25:35.670 --> 00:25:38.809 real code for work 00:25:38.809 --> 00:25:40.550 and yes, they are not tightly coupled 00:25:40.550 --> 00:25:44.650 like I haven't installed JRuby so that I could reach into the
internals of my Scala code 00:25:44.650 --> 00:25:45.630 because that would be a pain 00:25:45.630 --> 00:25:50.730 I don't want to do that 00:25:50.730 --> 00:25:52.960 another very important one is 00:25:52.960 --> 00:25:55.559 server nodes are disposable 00:25:55.559 --> 00:25:59.429 so, back when I was at GE, for example 00:25:59.429 --> 00:26:02.730 I remember being really proud when I looked at the uptime of one of my servers 00:26:02.730 --> 00:26:05.480 and it was like four hundred days or something 00:26:05.480 --> 00:26:07.150 it's like, wow, this is awesome 00:26:07.150 --> 00:26:09.750 I have this big server, it had all these apps on it 00:26:09.750 --> 00:26:12.940 we kept it running for four hundred days 00:26:12.940 --> 00:26:14.809 the problem with that is I was afraid to ever touch it 00:26:14.809 --> 00:26:17.510 I was really happy it was alive 00:26:17.510 --> 00:26:18.860 but I didn't want to do anything to it 00:26:18.860 --> 00:26:21.250 I was afraid to update the operating system 00:26:21.250 --> 00:26:23.770 in fact you could not upgrade Solaris then without restarting it 00:26:23.770 --> 00:26:27.540 so that meant I was not upgrading the operating system 00:26:27.540 --> 00:26:32.390 I probably shouldn't have been too proud about it 00:26:32.390 --> 00:26:34.890 nodes that are alive for a long time lead to fear 00:26:34.890 --> 00:26:37.440 and what I want is less fear 00:26:37.440 --> 00:26:39.340 so I throw them away 00:26:39.340 --> 00:26:42.900 and this doesn't mean I have physical servers that I throw away 00:26:42.900 --> 00:26:45.920 that would be fun but I'm not that rich yet 00:26:45.920 --> 00:26:49.160 we use AWS right now, you could do it with any kind of cloud service 00:26:49.160 --> 00:26:52.640 or even an internal cloud provider 00:26:52.640 --> 00:26:53.780 but every node is disposable 00:26:53.780 --> 00:27:00.550 so, we never upgrade software on an existing server 00:27:00.550 --> 00:27:03.150 whenever you want to deploy a
new version of a service 00:27:03.150 --> 00:27:04.370 you create new servers 00:27:04.370 --> 00:27:05.429 and you deploy that version 00:27:05.429 --> 00:27:08.790 and then you replace them in the load balancer or somewhere 00:27:08.790 --> 00:27:10.200 that's it 00:27:10.200 --> 00:27:13.100 so, you never have to wonder what's on a server 00:27:13.100 --> 00:27:15.620 because it was deployed through an automated process 00:27:15.620 --> 00:27:16.840 and there's no fear there 00:27:16.840 --> 00:27:17.980 you know exactly what it is 00:27:17.980 --> 00:27:19.320 you know exactly how to recreate it 00:27:19.320 --> 00:27:21.540 because you have a golden master image 00:27:21.540 --> 00:27:24.200 and in our case it's actually an Amazon image 00:27:24.200 --> 00:27:26.380 that you can just boot more of 00:27:26.380 --> 00:27:27.440 if scaling is a problem 00:27:27.440 --> 00:27:29.070 you just boot ten more servers 00:27:29.070 --> 00:27:32.520 boom, done, no problem 00:27:32.520 --> 00:27:35.450 so yeah I tell the team, you know, pick your technology 00:27:35.450 --> 00:27:38.090 everything must be automated, that's another piece 00:27:38.090 --> 00:27:43.059 if you're going to deploy a Clojure service for the first time 00:27:43.059 --> 00:27:46.760 you have to be responsible for figuring out how it fits into our deployment system 00:27:46.760 --> 00:27:50.309 so that you have immutable deployments and disposable nodes 00:27:50.309 --> 00:27:53.760 if you can do that and you're willing to also maintain it and teach someone else 00:27:53.760 --> 00:27:55.910 about the little piece of code that you wrote, then cool 00:27:55.910 --> 00:27:59.010 you can do it, any level you want 00:27:59.010 --> 00:28:02.929 and then once you deploy stuff 00:28:02.929 --> 00:28:05.250 like a lot of us like to just SSH into the machines 00:28:05.250 --> 00:28:07.679 and then twiddle with things and replace files 00:28:07.679 --> 00:28:11.660 and like try fixing bugs live on production
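The replace-instead-of-upgrade deployment described above can be sketched in a few lines. This is only an in-memory illustration, not the team's actual tooling: `Server`, `LoadBalancer`, `boot` and `deploy` are hypothetical names, and no real cloud API is called.

```python
from dataclasses import dataclass, field
from itertools import count

_ids = count(1)  # hypothetical source of unique server ids


@dataclass(frozen=True)
class Server:
    """A disposable node: its version is fixed at boot and never changes."""
    server_id: int
    version: str


@dataclass
class LoadBalancer:
    """Hypothetical stand-in for whatever routes traffic to the service."""
    servers: list = field(default_factory=list)


def boot(version: str, n: int) -> list:
    """Pretend to boot n fresh servers from the image built for `version`."""
    return [Server(next(_ids), version) for _ in range(n)]


def deploy(lb: LoadBalancer, version: str, n: int) -> list:
    """Put n new servers behind the load balancer and return the retired ones."""
    fresh = boot(version, n)
    retired, lb.servers = lb.servers, fresh  # swap whole servers, never patch one
    return retired  # the caller destroys these


lb = LoadBalancer(boot("1.0", 3))
retired = deploy(lb, "1.1", 3)
print(sorted({s.version for s in lb.servers}), len(retired))
```

Because `Server` is frozen, there is no code path that upgrades a running node; the only way to change a version is to boot replacements and retire the old ones.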
00:28:11.660 --> 00:28:13.990 why not just throw away the actual keys 00:28:13.990 --> 00:28:16.590 because you're going to throw away the system eventually 00:28:16.590 --> 00:28:19.140 you don't even need root access to it 00:28:19.140 --> 00:28:21.490 you don't need to be able to get to it 00:28:21.490 --> 00:28:24.980 except through the port that your service is listening on 00:28:24.980 --> 00:28:26.840 so you can't screw it up 00:28:26.840 --> 00:28:29.470 you can't introduce entropy and mess things up 00:28:29.470 --> 00:28:31.470 if you throw away the keys 00:28:31.470 --> 00:28:33.640 so this is actually a practice that you can do 00:28:33.640 --> 00:28:36.460 deploy the servers, remove all the credentials 00:28:36.460 --> 00:28:39.299 for logging in and the only option you have 00:28:39.299 --> 00:28:43.610 is to destroy them when you're done with them 00:28:43.610 --> 00:28:45.140 provisioning new services in our world 00:28:45.140 --> 00:28:46.960 must also be trivial 00:28:46.960 --> 00:28:51.370 so we have actually now thrown away our Chef repository 00:28:51.370 --> 00:28:54.340 because Chef is obsolete and 00:28:54.340 --> 00:28:56.049 we have replaced it with shell scripts 00:28:56.049 --> 00:29:01.340 and that sounds like I'm an idiot 00:29:01.340 --> 00:29:04.460 I know, but when I say Chef is obsolete 00:29:04.460 --> 00:29:05.480 I don't really mean that 00:29:05.480 --> 00:29:07.100 I like to say that so that people will think 00:29:07.100 --> 00:29:08.450 because a lot of you are probably thinking 00:29:08.450 --> 00:29:11.040 we should move to Chef 00:29:11.040 --> 00:29:11.809 that would be great 00:29:11.809 --> 00:29:13.530 because what you have is a bunch of servers 00:29:13.530 --> 00:29:14.670 that are running for a long time 00:29:14.670 --> 00:29:17.110 and you need to be able to continue to keep them up to date 00:29:17.110 --> 00:29:19.150 Chef is really great at that 00:29:19.150 --> 00:29:22.059 Chef is also good at booting a new
server 00:29:22.059 --> 00:29:24.340 but really it's just overkill for that 00:29:24.340 --> 00:29:25.059 yeah 00:29:25.059 --> 00:29:26.460 so if you're always throwing stuff away 00:29:26.460 --> 00:29:27.809 I don't think you need Chef 00:29:27.809 --> 00:29:29.160 do something really, really simple 00:29:29.160 --> 00:29:29.950 and that's what we've done 00:29:29.950 --> 00:29:33.090 so like whenever we deploy a new type of service 00:29:33.090 --> 00:29:37.730 I set up ZooKeeper recently, which is a complete change from the other stuff we're deploying 00:29:37.730 --> 00:29:39.980 I think it was a five line shell script to do that 00:29:39.980 --> 00:29:42.590 I just added it to a git repo and ran a command 00:29:42.590 --> 00:29:47.340 I've got a cluster of ZooKeeper servers running 00:29:47.340 --> 00:29:51.260 you want to always be deploying your software 00:29:51.260 --> 00:29:55.570 this is something I learned from Kent Beck early on in the agile extreme programming 00:29:55.570 --> 00:29:56.330 world 00:29:56.330 --> 00:29:57.980 that if something is hard 00:29:57.980 --> 00:30:00.420 or you perceive it to be hard or difficult 00:30:00.420 --> 00:30:02.290 the best thing you can do 00:30:02.290 --> 00:30:04.390 if you have to do that thing all the time 00:30:04.390 --> 00:30:07.000 is to just do it constantly 00:30:07.000 --> 00:30:09.090 non-stop all the time 00:30:09.090 --> 00:30:10.910 so like deploying in our old world 00:30:10.910 --> 00:30:15.280 where it would take all night once a week 00:30:15.280 --> 00:30:18.040 if we instituted a new policy 00:30:18.040 --> 00:30:19.270 in that team that said 00:30:19.270 --> 00:30:23.100 any change that goes to master must be deployed within five minutes 00:30:23.100 --> 00:30:28.410 I guarantee you we would have fixed that process, right 00:30:28.410 --> 00:30:29.730 and if you're deploying constantly 00:30:29.730 --> 00:30:31.080 all day every day 00:30:31.080 --> 00:30:33.120 you're never going to be
afraid of deployments 00:30:33.120 --> 00:30:36.020 because it's always a small change 00:30:36.020 --> 00:30:37.929 so always be deploying 00:30:37.929 --> 00:30:40.410 every new deploy means you're throwing away old servers 00:30:40.410 --> 00:30:42.600 and replacing them with new ones 00:30:42.600 --> 00:30:45.610 in our world I would say that the average uptime 00:30:45.610 --> 00:30:48.240 of one of our servers is probably something like 00:30:48.240 --> 00:30:55.179 seventeen hours and that's because we don't tend to work on the weekend very much 00:30:55.179 --> 00:30:56.870 you also, when you have these sorts of systems 00:30:56.870 --> 00:30:58.710 that are distributed like this 00:30:58.710 --> 00:31:02.100 and you're trying to reduce the fear of change 00:31:02.100 --> 00:31:04.350 the big thing that you're afraid of is failure 00:31:04.350 --> 00:31:06.110 you're afraid that the service is going to fail 00:31:06.110 --> 00:31:07.110 the system is going to go down 00:31:07.110 --> 00:31:10.070 one component won't be reachable, that sort of thing 00:31:10.070 --> 00:31:12.370 so you just have to assume that that's going to happen 00:31:12.370 --> 00:31:17.210 you are not going to build a system that never fails, ever 00:31:17.210 --> 00:31:19.740 I hope you don't, because you will have wasted much of your life 00:31:19.740 --> 00:31:21.100 trying to get that to happen 00:31:21.100 --> 00:31:24.309 instead, assume that the components are going to fail 00:31:24.309 --> 00:31:25.960 and build resiliency in 00:31:25.960 --> 00:31:28.030 I have a picture here of Joe Armstrong 00:31:28.030 --> 00:31:30.380 who is one of the inventors of Erlang 00:31:30.380 --> 00:31:34.890 if you have not studied the Erlang philosophy around failure and recovery 00:31:34.890 --> 00:31:35.340 you should 00:31:35.340 --> 00:31:36.470 and it won't take you long 00:31:36.470 --> 00:31:39.070 so I'm just going to leave that as homework for you 00:31:39.070 --> 00:31:42.110
and then, you know, I said, the tests are a design smell 00:31:42.110 --> 00:31:43.540 I don't mean don't write any tests 00:31:43.540 --> 00:31:45.950 but I also want to be further responsible here 00:31:45.950 --> 00:31:50.540 and say you should monitor everything 00:31:50.540 --> 00:31:52.880 you want to favor measurement over testing 00:31:52.880 --> 00:31:57.130 so I use measurement as a surrogate for testing 00:31:57.130 --> 00:31:57.850 or as an enhancement 00:31:57.850 --> 00:32:03.980 and the reason I say this is 00:32:03.980 --> 00:32:05.650 you can focus on one of two things 00:32:05.650 --> 00:32:07.790 I said assume failure right, so 00:32:07.790 --> 00:32:12.370 mean time between failures or mean time to resolution 00:32:12.370 --> 00:32:16.200 those are kind of two metrics in the ops world 00:32:16.200 --> 00:32:17.400 that people talk about 00:32:17.400 --> 00:32:20.140 for measuring their success and their effectiveness 00:32:20.140 --> 00:32:21.980 mean time between failures means 00:32:21.980 --> 00:32:25.360 you're trying to increase the time between failures 00:32:25.360 --> 00:32:29.290 of the system, so basically you're trying to make failures never happen, right 00:32:29.290 --> 00:32:31.059 mean time to resolution means 00:32:31.059 --> 00:32:34.679 when they happen, I'm gonna focus on bringing them back 00:32:34.679 --> 00:32:37.290 as fast as I possibly can 00:32:37.290 --> 00:32:41.120 so a perfect example would be a system fails 00:32:41.120 --> 00:32:43.720 and another one is already up and just takes over its work 00:32:43.720 --> 00:32:46.679 mean time to resolution is essentially zero, right 00:32:46.679 --> 00:32:50.679 if you're always assuming that every component can and will fail 00:32:50.679 --> 00:32:53.770 then mean time to resolution is going to be really good 00:32:53.770 --> 00:32:56.240 because you're going to bake it into the process 00:32:56.240 --> 00:32:59.480 if you do that, you don't care about when things
fail 00:32:59.480 --> 00:33:02.640 and back to this idea of favoring measurement over testing 00:33:02.640 --> 00:33:07.250 if you're monitoring everything, everything with intelligence 00:33:07.250 --> 00:33:10.390 then you're actually focusing on mean time to resolution 00:33:10.390 --> 00:33:15.750 and acknowledging that the software is going to be broken sometimes, right 00:33:15.750 --> 00:33:18.200 and when I say monitor everything, I mean everything 00:33:18.200 --> 00:33:21.940 I don't mean just your disk space and your memory and stuff there 00:33:21.940 --> 00:33:23.669 I'm talking about business metrics 00:33:23.669 --> 00:33:27.630 so, at LivingSocial we created this thing called rearview 00:33:27.630 --> 00:33:29.250 which is now open source 00:33:29.250 --> 00:33:33.030 which allows you to do aberration detection 00:33:33.030 --> 00:33:37.919 and aberration means strange behavior, strange change in behavior 00:33:37.919 --> 00:33:41.679 so rearview can do aberration detection 00:33:41.679 --> 00:33:44.690 on data sets, arbitrary data sets 00:33:44.690 --> 00:33:47.010 which means, like in the LivingSocial world 00:33:47.010 --> 00:33:48.230 we had user sign-ups 00:33:48.230 --> 00:33:49.190 constantly streaming in 00:33:49.190 --> 00:33:51.559 it was a very high volume site 00:33:51.559 --> 00:33:53.799 if user sign-ups were weird 00:33:53.799 --> 00:33:55.940 we would get an alert 00:33:55.940 --> 00:33:57.540 why might they be weird?
00:33:57.540 --> 00:34:00.830 one thing could be like the user service is down, right 00:34:00.830 --> 00:34:02.100 so then we would get two alerts 00:34:02.100 --> 00:34:04.010 user sign ups have gone down 00:34:04.010 --> 00:34:05.150 and so has the service 00:34:05.150 --> 00:34:07.510 so obviously the problem is the service is down 00:34:07.510 --> 00:34:09.679 let's bring it back up 00:34:09.679 --> 00:34:11.409 but it could be something like 00:34:11.409 --> 00:34:13.349 a front-end developer or a designer 00:34:13.349 --> 00:34:16.469 made a change that was intentional 00:34:16.469 --> 00:34:18.040 but it just didn't work and no one liked it 00:34:18.040 --> 00:34:21.168 so they didn't sign up to the site anymore 00:34:21.168 --> 00:34:23.980 that's more important than just knowing that the service is down 00:34:23.980 --> 00:34:25.460 right, because what you care about 00:34:25.460 --> 00:34:27.190 isn't that the service is up or down 00:34:27.190 --> 00:34:30.540 if you could crash the entire system and still be making money 00:34:30.540 --> 00:34:31.859 you don't care, right, that's better 00:34:31.859 --> 00:34:34.839 throw it away and stop paying for the servers 00:34:34.839 --> 00:34:40.679 but if your system is up 100% of the time and performs excellently 00:34:40.679 --> 00:34:43.359 but no one's using it, that's bad 00:34:43.359 --> 00:34:49.279 so monitoring business metrics gives you a lot more than unit test could ever give you 00:34:49.279 --> 00:34:50.899 and then in our world 00:34:50.899 --> 00:34:51.859 we focused on experiencing 00:34:51.859 --> 00:34:56.259 no, you have to come up to front and say ten! 
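As a rough illustration of alerting on a business metric such as sign-ups, the sketch below flags a value that sits far from its recent history. It is a generic standard-deviation check, assumed here for illustration only; it is not how rearview itself works, and `is_aberrant`, the threshold and the sample numbers are all hypothetical.

```python
from statistics import mean, stdev


def is_aberrant(history, latest, threshold=3.0):
    """Return True if `latest` is more than `threshold` sample standard
    deviations away from the mean of `history`."""
    if len(history) < 2:
        return False  # not enough data to judge
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return latest != mu  # flat history: any change is unusual
    return abs(latest - mu) / sigma > threshold


# Hypothetical sign-ups per minute over the last eight minutes.
signups_per_minute = [98, 103, 101, 97, 100, 102, 99, 100]

print(is_aberrant(signups_per_minute, 101))  # False: within normal variation
print(is_aberrant(signups_per_minute, 12))   # True: sign-ups have collapsed
```

A check like this fires whether the cause is a crashed service or a design change nobody liked, which is the point made above about watching business metrics rather than only uptime.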
00:34:56.259 --> 00:34:59.220 ok, ten minutes left 00:34:59.220 --> 00:35:01.989 when I got to 6WunderKinder in Berlin 00:35:01.989 --> 00:35:04.069 everyone was terrified to touch the system 00:35:04.069 --> 00:35:08.710 because they had created a really well-designed 00:35:08.710 --> 00:35:12.009 but traditional monolithic API 00:35:12.009 --> 00:35:13.539 so they had layers of abstractions 00:35:13.539 --> 00:35:15.289 it was all kind of in one big thing 00:35:15.289 --> 00:35:16.519 they had a huge database 00:35:16.519 --> 00:35:19.720 and they were really, really scared to do anything 00:35:19.720 --> 00:35:22.190 so there's like one person who would deploy anything 00:35:22.190 --> 00:35:24.190 and everyone else was trying to work on other projects 00:35:24.190 --> 00:35:25.950 and not touch it 00:35:25.950 --> 00:35:27.859 but it was like the production system 00:35:27.859 --> 00:35:29.960 you know so it wasn't really an option 00:35:29.960 --> 00:35:31.880 so the first thing I did in my first week 00:35:31.880 --> 00:35:34.920 is I got these graphs going 00:35:34.920 --> 00:35:39.239 and this was, yeah, response time 00:35:39.239 --> 00:35:42.749 and the first thing I did is I started turning off servers 00:35:42.749 --> 00:35:44.279 and just watching the graphs 00:35:44.279 --> 00:35:47.749 and then, as I was turning off the servers 00:35:47.749 --> 00:35:49.380 I went to the production database 00:35:49.380 --> 00:35:54.220 and I did select count(*) from tasks 00:35:54.220 --> 00:35:55.650 and we're a task management app 00:35:55.650 --> 00:35:58.249 so we have hundreds of millions of tasks 00:35:58.249 --> 00:36:00.910 and the whole thing crashed 00:36:00.910 --> 00:36:04.119 and all the people were like AAAAH what's going on 00:36:04.119 --> 00:36:05.630 you know, and I said, it's no problem 00:36:05.630 --> 00:36:08.539 I did this on purpose, I'll just make it come back 00:36:08.539 --> 00:36:10.119 which I did 00:36:10.119 --> 00:36:11.079 and
from that point on 00:36:11.079 --> 00:36:13.349 like, really every day I would do something 00:36:13.349 --> 00:36:16.999 which would basically crash the system for just a moment 00:36:16.999 --> 00:36:19.819 and really, like, we had way too many servers in production 00:36:19.819 --> 00:36:22.690 we were spending tens of thousands more Euros per month 00:36:22.690 --> 00:36:25.079 than we should have on the infrastructure 00:36:25.079 --> 00:36:27.499 and I just started taking things away 00:36:27.499 --> 00:36:28.819 and I would usually do it 00:36:28.819 --> 00:36:30.579 instead of the responsible way, 00:36:30.579 --> 00:36:31.630 like one server at a time 00:36:31.630 --> 00:36:34.079 I would just remove all of them and start adding them back 00:36:34.079 --> 00:36:36.220 so for a moment everything was down 00:36:36.220 --> 00:36:38.809 but after that we got to a point where 00:36:38.809 --> 00:36:41.299 everyone on the team was absolutely comfortable 00:36:41.299 --> 00:36:42.720 with the worst case scenario 00:36:42.720 --> 00:36:45.180 of the system being completely down 00:36:45.180 --> 00:36:47.989 so that we could, in a panic free way 00:36:47.989 --> 00:36:51.059 just focus on bringing it up when it was bad 00:36:51.059 --> 00:36:52.940 so now when you do a deployment 00:36:52.940 --> 00:36:54.710 and you have your business metrics being measured 00:36:54.710 --> 00:36:57.160 you know the important stuff is happening 00:36:57.160 --> 00:37:00.559 and you know what to do when everything is down 00:37:00.559 --> 00:37:02.509 you've experienced the worst thing that can happen 00:37:02.509 --> 00:37:04.690 well the worst thing is like someone breaks in 00:37:04.690 --> 00:37:07.789 and steals all your stuff, steals all your users' phone numbers 00:37:07.789 --> 00:37:10.140 and posts them online like SnapChat or something 00:37:10.140 --> 00:37:13.650 but you've experienced all these potentially horrible things 00:37:13.650 --> 00:37:16.920 and realized, eh, it's
not so bad, I can deal with this 00:37:16.920 --> 00:37:19.119 I know what to do 00:37:19.119 --> 00:37:22.400 it allows you to start making bold moves 00:37:22.400 --> 00:37:23.640 and that's what we all want right 00:37:23.640 --> 00:37:28.739 we all want to be able to bravely go into our systems 00:37:28.739 --> 00:37:30.319 and do anything we think is right 00:37:30.319 --> 00:37:33.869 so that's what I've been focusing on 00:37:33.869 --> 00:37:36.769 we also do this thing called Canary in the Coal Mine deployments 00:37:36.769 --> 00:37:38.999 which removes the fear, also 00:37:38.999 --> 00:37:43.479 canary in the coal mine refers to a kind of sad thing 00:37:43.479 --> 00:37:46.869 about coal miners in the US 00:37:46.869 --> 00:37:49.400 where they would send canaries into the mines 00:37:49.400 --> 00:37:50.380 at various levels 00:37:50.380 --> 00:37:54.170 and if the canary died they knew there was a problem 00:37:54.170 --> 00:37:58.299 with the air 00:37:58.299 --> 00:37:59.470 but in the software world 00:37:59.470 --> 00:38:02.839 what this means is you have a bunch of servers running 00:38:02.839 --> 00:38:06.400 or a bunch of, I don't know, clients running a certain version 00:38:06.400 --> 00:38:09.789 and you start introducing a new version incrementally 00:38:09.789 --> 00:38:11.769 and watching the effects 00:38:11.769 --> 00:38:13.210 so once you're measuring everything 00:38:13.210 --> 00:38:14.680 and monitoring everything 00:38:14.680 --> 00:38:17.039 you can also start doing these canary in the coal mine things 00:38:17.039 --> 00:38:19.069 where you say OK I have a new version of this service 00:38:19.069 --> 00:38:20.369 that I'm going to deploy 00:38:20.369 --> 00:38:22.769 and I've got thirty servers running for it 00:38:22.769 --> 00:38:25.869 but I'm going to change only five of them now 00:38:25.869 --> 00:38:27.950 and see, like, does my error rate increase 00:38:27.950 --> 00:38:30.180 or does my performance drop on those servers
00:38:30.180 --> 00:38:33.880 or do people actually not successfully complete the task they're trying to do 00:38:33.880 --> 00:38:34.650 on those servers 00:38:34.650 --> 00:38:39.989 so the combination of monitoring everything 00:38:39.989 --> 00:38:41.989 and these immutable deployments and everything 00:38:41.989 --> 00:38:46.569 gives us the ability to gradually effect change and not be afraid 00:38:46.569 --> 00:38:48.460 so we roll out changes all day every day 00:38:48.460 --> 00:38:53.819 because we don't fear that we're just going to destroy the entire system all at once 00:38:53.819 --> 00:38:55.880 so I think I have like five minutes left 00:38:55.880 --> 00:39:00.239 uh, these are some things we're not necessarily doing yet 00:39:00.239 --> 00:39:01.970 but they're some ideas that I have 00:39:01.970 --> 00:39:04.940 that given some free time I will work on 00:39:04.940 --> 00:39:08.700 and, they're probably more exciting 00:39:08.700 --> 00:39:11.319 one is I talked about homeostatic regulation 00:39:11.319 --> 00:39:13.579 and homeostasis 00:39:13.579 --> 00:39:16.640 so I think we all understand the idea of you know homeostasis 00:39:16.640 --> 00:39:20.019 and the fact that systems have different parts that play different roles 00:39:20.019 --> 00:39:21.819 and can protect each other from each other 00:39:21.819 --> 00:39:27.819 but, so this diagram is actually just some random diagram 00:39:27.819 --> 00:39:30.789 I copied and pasted off the AWS website 00:39:30.789 --> 00:39:33.589 so it's not necessarily all that meaningful 00:39:33.589 --> 00:39:36.279 except to show that every architecture 00:39:36.279 --> 00:39:38.680 especially server based architectures 00:39:38.680 --> 00:39:42.979 has a collection of services that play different roles 00:39:42.979 --> 00:39:45.079 and it almost looks like a person 00:39:45.079 --> 00:39:46.989 you've got a brain and a heart and a liver 00:39:46.989 --> 00:39:50.690 and all these things,
right 00:39:50.690 --> 00:39:53.009 what would it mean to actually implement 00:39:53.009 --> 00:39:56.539 homeostatic regulation in a web service? 00:39:56.539 --> 00:39:59.539 so that you have some controlling system 00:39:59.539 --> 00:40:02.579 where the database will actually kill an app server 00:40:02.579 --> 00:40:04.859 that is hurting it, for example 00:40:04.859 --> 00:40:07.410 just kill it 00:40:07.410 --> 00:40:08.769 I don't know yet, I don't know what that is 00:40:08.769 --> 00:40:14.339 but some ideas about this stuff 00:40:14.339 --> 00:40:15.569 I don't know if you've heard of these 00:40:15.569 --> 00:40:19.690 Netflix, do you have Netflix in India yet? 00:40:19.690 --> 00:40:23.499 probably not, unless you have a VPN, right 00:40:23.499 --> 00:40:27.029 Netflix has a really great cloud based architecture 00:40:27.029 --> 00:40:29.839 they have this thing called Chaos Monkey they've created 00:40:29.839 --> 00:40:33.940 which goes through their system and randomly destroys nodes 00:40:33.940 --> 00:40:35.960 just crashes servers 00:40:35.960 --> 00:40:39.680 and they did this because they were early users of AWS 00:40:39.680 --> 00:40:42.239 and when they went out initially with AWS, servers were crashing 00:40:42.239 --> 00:40:43.880 like it was still immature 00:40:43.880 --> 00:40:46.279 so they said OK we still want to use this 00:40:46.279 --> 00:40:49.769 and we'll build in stuff so that we can deal with the crashes 00:40:49.769 --> 00:40:52.210 but we have to know it's gonna work when it crashes 00:40:52.210 --> 00:40:55.410 so let's make crashing be part of production 00:40:55.410 --> 00:40:58.210 so they actually have gotten really sophisticated now 00:40:58.210 --> 00:41:00.499 and they will crash entire regions 00:41:00.499 --> 00:41:01.819 because they're in multiple data centers 00:41:01.819 --> 00:41:03.569 so they'll say like, what would happen if this 00:41:03.569 --> 00:41:06.479 data center went down, does the
site still stay up? 00:41:06.479 --> 00:41:08.369 and they do this in production all the time 00:41:08.369 --> 00:41:09.609 like they're crashing servers right now 00:41:09.609 --> 00:41:11.130 it's really neat 00:41:11.130 --> 00:41:14.079 another one that is inspirational in this way 00:41:14.079 --> 00:41:19.170 is Pinterest, they use AWS as well 00:41:19.170 --> 00:41:22.450 and they have, AWS has this thing called Spot Instances 00:41:22.450 --> 00:41:24.440 and I won't go into too much detail 00:41:24.440 --> 00:41:25.910 because I don't have time 00:41:25.910 --> 00:41:29.869 but Spot Instances allow you to effectively 00:41:29.869 --> 00:41:36.210 bid on servers at a price that you are willing to pay 00:41:36.210 --> 00:41:39.559 so like if a usual server costs $0.20 per minute 00:41:39.559 --> 00:41:42.319 you can say, I'll give $0.15 per minute 00:41:42.319 --> 00:41:45.380 and when excess capacity comes open 00:41:45.380 --> 00:41:47.710 it's almost like a stock market 00:41:47.710 --> 00:41:50.299 if $0.15 is the going price, you'll get a server 00:41:50.299 --> 00:41:52.479 and it starts up and it runs what you want 00:41:52.479 --> 00:41:54.400 but here's the cool thing 00:41:54.400 --> 00:42:00.140 if the stock market goes and the price goes higher than you're willing to pay 00:42:00.140 --> 00:42:03.229 Amazon will just turn off those servers 00:42:03.229 --> 00:42:05.219 they're just dead, you don't have any warning 00:42:05.219 --> 00:42:06.579 they're just dead 00:42:06.579 --> 00:42:11.069 so Pinterest uses this for their production servers 00:42:11.069 --> 00:42:13.749 which means they save a lot of money 00:42:13.749 --> 00:42:17.269 they're paying way under the average Amazon cost for hosting 00:42:17.269 --> 00:42:19.309 but the really cool thing in my opinion 00:42:19.309 --> 00:42:21.170 is not the money they save but the fact that 00:42:21.170 --> 00:42:26.039 like, what would you have to do to build a full system 00:42:26.039 --> 
00:42:29.119 where any node can and will die at any moment 00:42:29.119 --> 00:42:31.259 and it's not even under your control 00:42:31.259 --> 00:42:33.529 that's really exciting 00:42:33.529 --> 00:42:36.259 so a simple thing you can do for homeostasis though 00:42:36.259 --> 00:42:37.509 is you can just adjust 00:42:37.509 --> 00:42:39.489 so in our world we have multiple nodes 00:42:39.489 --> 00:42:40.969 and all these little services 00:42:40.969 --> 00:42:42.569 we can scale each one independently 00:42:42.569 --> 00:42:44.569 we're measuring everything 00:42:44.569 --> 00:42:46.400 so Amazon has a thing called Auto Scaling 00:42:46.400 --> 00:42:49.469 we don't use it, we do our own scaling 00:42:49.469 --> 00:42:54.119 and we just do it based on volume and performance 00:42:54.119 --> 00:42:57.869 now when you have a bunch of services like this 00:42:57.869 --> 00:43:00.539 like, I don't know, maybe we have fifty different services now 00:43:00.539 --> 00:43:03.229 that each play tiny little roles 00:43:03.229 --> 00:43:07.210 it becomes difficult to figure out, like, where things are 00:43:07.210 --> 00:43:10.619 so we've started implementing ZooKeeper for service resolution 00:43:10.619 --> 00:43:14.130 which means a service can come online and say 00:43:14.130 --> 00:43:17.539 I'm the reminder service version 2.3 00:43:17.539 --> 00:43:19.349 and then tell a central guardian 00:43:19.349 --> 00:43:21.979 and the ZooKeeper can then route traffic to it 00:43:21.979 --> 00:43:24.019 probably too detailed for now 00:43:24.019 --> 00:43:28.420 I'm gonna skip over some stuff real quick 00:43:28.420 --> 00:43:29.499 but I want to talk about this one 00:43:29.499 --> 00:43:33.739 if, did the Nordic Ruby, no, Nordic Ruby talks never go online 00:43:33.739 --> 00:43:35.160 so you can never see this talk 00:43:35.160 --> 00:43:36.630 sorry 00:43:36.630 --> 00:43:41.499 at Nordic Ruby Reginald Braithwaite did a really cool talk 00:43:41.499 --> 00:43:44.130 on like
challenges of the Ruby language 00:43:44.130 --> 00:43:45.380 and he made this statement 00:43:45.380 --> 00:43:48.869 Ruby has beautiful but static coupling 00:43:48.869 --> 00:43:51.269 which was really strange 00:43:51.269 --> 00:43:52.989 but basically he was making the same point that 00:43:52.989 --> 00:43:53.950 I was talking about earlier 00:43:53.950 --> 00:43:59.210 that, like Ruby creates a bunch of ways that you can couple 00:43:59.210 --> 00:44:01.200 your system together 00:44:01.200 --> 00:44:02.729 that kind of screw you in the end 00:44:02.729 --> 00:44:03.960 but they're really beautiful to use 00:44:03.960 --> 00:44:09.819 but, like, Ruby can really lead to some deep crazy coupling 00:44:09.819 --> 00:44:14.089 and so he presented this idea of bind by contract 00:44:14.089 --> 00:44:17.930 and bind by contract, in a Ruby sense 00:44:17.930 --> 00:44:22.539 would be, like, I have a class that has a method 00:44:22.539 --> 00:44:26.410 that takes these parameters under these conditions 00:44:26.410 --> 00:44:29.420 and I can kind of put it into my VM 00:44:29.420 --> 00:44:31.999 and whenever someone needs to have a functionality like that 00:44:31.999 --> 00:44:34.650 it will be automatically bound together 00:44:34.650 --> 00:44:36.589 by the fact that it can do that thing 00:44:36.589 --> 00:44:40.680 and instead of how we tend to use Ruby and Java and other languages 00:44:40.680 --> 00:44:42.910 I have a class with a method name I'm going to call it 00:44:42.910 --> 00:44:45.319 right, that's coupling 00:44:45.319 --> 00:44:48.009 but he proposed this idea of this decoupled system 00:44:48.009 --> 00:44:50.609 where you just say I need a functionality like this 00:44:50.609 --> 00:44:53.390 that works under the conditions that I have present 00:44:53.390 --> 00:44:55.369 so this led me to this idea 00:44:55.369 --> 00:44:59.059 and this may be like way too weird, I don't know 00:44:59.059 --> 00:45:02.569 what if in your web application your
routes file 00:45:02.569 --> 00:45:08.130 for your services read like a functional pattern matching syntax 00:45:08.130 --> 00:45:11.200 so like if you've ever used Erlang or Haskell or Scala 00:45:11.200 --> 00:45:14.509 any of these things that have functional pattern matching 00:45:14.509 --> 00:45:18.680 what if you could then route to different services 00:45:18.680 --> 00:45:20.880 across a bunch of different services 00:45:20.880 --> 00:45:23.450 based on contract 00:45:23.450 --> 00:45:27.279 now I have zero time left 00:45:27.279 --> 00:45:29.029 but I'm just gonna keep talking, cause I'm mean 00:45:29.029 --> 00:45:30.349 oh wait I'm not allowed to be mean 00:45:30.349 --> 00:45:31.579 because of the code of conduct 00:45:31.579 --> 00:45:34.759 so I'll wrap up 00:45:34.759 --> 00:45:38.749 so this is an idea that I've started working on as well 00:45:38.749 --> 00:45:40.539 where I would actually write an Erlang service 00:45:40.539 --> 00:45:42.700 with this sort of functional pattern matching 00:45:42.700 --> 00:45:45.589 but have it be routing in really fast real time 00:45:45.589 --> 00:45:48.539 through back end services that support it 00:45:48.539 --> 00:45:50.650 one more thing I just want to show you real quick 00:45:50.650 --> 00:45:53.869 that I am working on and I want to show you 00:45:53.869 --> 00:45:57.910 because I want you to help me 00:45:57.910 --> 00:46:00.690 has anyone used JSON schema?
00:46:00.690 --> 00:46:05.890 OK, you people are my friends for the rest of the conference 00:46:05.890 --> 00:46:08.469 in a system where you have all these things talking to each other 00:46:08.469 --> 00:46:11.219 you do need a way to validate the inputs and outputs 00:46:11.219 --> 00:46:16.229 but I don't want to generate code that parses and creates JSON 00:46:16.229 --> 00:46:21.180 I don't want to do something in real time that intercepts my 00:46:21.180 --> 00:46:24.219 kind of traffic, so there's this thing called JSON schema 00:46:24.219 --> 00:46:27.219 that allows you to, in a completely decoupled way 00:46:27.219 --> 00:46:30.719 specify JSON documents and how they should interact 00:46:30.719 --> 00:46:35.849 and I am working on a new thing that's called Klagen 00:46:35.849 --> 00:46:38.299 which is the German word for complain 00:46:38.299 --> 00:46:42.420 it's written in Scala, so if anyone wants to pair up on some Scala stuff 00:46:42.420 --> 00:46:47.700 what it will be is a high performance asynchronous JSON schema validation middleware 00:46:47.700 --> 00:46:52.749 so if that's interesting to anyone, even if you don't know Scala or JSON schema 00:46:52.749 --> 00:46:54.029 please let me know 00:46:54.029 --> 00:46:57.099 and I believe I'm out of time so I'm just gonna end there 00:46:57.099 --> 00:46:58.609 am I right? I'm right, yes 00:46:58.609 --> 00:47:01.529 so thank you very much, and let's talk during the conference
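NOTE The closing idea of validating JSON messages against a schema that lives outside the services can be sketched roughly as follows. This is a toy illustration in Python, not Klagen (which the talk says is written in Scala) and not the full JSON Schema specification: it handles only the `type`, `required` and `properties` keywords. The names `complaints`, `check_request` and `REMINDER_SCHEMA`, and the reminder fields, are hypothetical and invented for this example.

```python
import json

# Toy subset of JSON Schema: only "type", "required" and "properties".
# Assumption: this is an illustration, not Klagen's implementation.
TYPES = {
    "object": dict,
    "array": list,
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
}


def complaints(doc, schema, path="$"):
    """Return a list of human-readable complaints; an empty list means valid."""
    found = []
    expected = schema.get("type")
    if expected is not None:
        ok = isinstance(doc, TYPES[expected])
        # bool is a subclass of int in Python, so reject it for numeric types.
        if expected in ("integer", "number") and isinstance(doc, bool):
            ok = False
        if not ok:
            return [f"{path}: expected {expected}"]
    if isinstance(doc, dict):
        for key in schema.get("required", []):
            if key not in doc:
                found.append(f"{path}: missing required property '{key}'")
        for key, sub in schema.get("properties", {}).items():
            if key in doc:
                found.extend(complaints(doc[key], sub, f"{path}.{key}"))
    return found


# Hypothetical schema for a reminder message; the field names are invented.
REMINDER_SCHEMA = {
    "type": "object",
    "required": ["task_id", "remind_at"],
    "properties": {
        "task_id": {"type": "integer"},
        "remind_at": {"type": "string"},
    },
}


def check_request(body):
    """Middleware-style entry point: parse a JSON string, then validate it."""
    try:
        doc = json.loads(body)
    except json.JSONDecodeError:
        return ["$: body is not valid JSON"]
    return complaints(doc, REMINDER_SCHEMA)
```

Because the schema is plain data, a middleware of this shape could sit between services and reject bad messages without either service generating or sharing parsing code, which is the decoupling the talk is after.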