WEBVTT

00:00:00.982 --> 00:00:05.041
I'm going to tell you about the most amazing machines in the world

00:00:05.065 --> 00:00:06.833
and what we can now do with them.

00:00:07.396 --> 00:00:08.559
Proteins,

00:00:08.583 --> 00:00:10.833
some of which you see inside a cell here,

00:00:10.857 --> 00:00:14.316
carry out essentially all the important functions in our bodies.

00:00:14.972 --> 00:00:16.851
Proteins digest your food,

00:00:16.875 --> 00:00:18.581
contract your muscles,

00:00:18.605 --> 00:00:20.182
fire your neurons

00:00:20.206 --> 00:00:21.822
and power your immune system.

00:00:22.400 --> 00:00:24.376
Everything that happens in biology --

00:00:24.400 --> 00:00:25.551
almost --

00:00:25.575 --> 00:00:26.987
happens because of proteins.

NOTE Paragraph

00:00:27.698 --> 00:00:31.773
Proteins are linear chains of building blocks called amino acids.

00:00:32.366 --> 00:00:35.599
Nature uses an alphabet of 20 amino acids,

00:00:35.623 --> 00:00:37.898
some of which have names you may have heard of.

00:00:38.921 --> 00:00:42.463
In this picture, for scale, each bump is an atom.

00:00:43.351 --> 00:00:47.995
Chemical forces between the amino acids cause these long stringy molecules

00:00:48.019 --> 00:00:51.480
to fold up into unique, three-dimensional structures.

00:00:51.937 --> 00:00:53.277
The folding process,

00:00:53.301 --> 00:00:54.735
while it looks random,

00:00:54.759 --> 00:00:56.722
is in fact very precise.

00:00:56.746 --> 00:01:01.143
Each protein folds to its characteristic shape each time,

00:01:01.167 --> 00:01:04.555
and the folding process takes just a fraction of a second.

00:01:06.029 --> 00:01:07.873
And it's the shapes of proteins

00:01:07.897 --> 00:01:11.867
which enable them to carry out their remarkable biological functions.

00:01:12.520 --> 00:01:13.671
For example,

00:01:13.695 --> 00:01:17.203
hemoglobin has a shape in the lungs perfectly suited

00:01:17.227 --> 00:01:19.214
for binding a molecule of oxygen.

00:01:19.759 --> 00:01:21.651
When hemoglobin moves to your muscle,

00:01:21.675 --> 00:01:23.607
the shape changes slightly

00:01:23.631 --> 00:01:25.822
and the oxygen comes out.

NOTE Paragraph

00:01:27.494 --> 00:01:28.860
The shapes of proteins,

00:01:28.884 --> 00:01:31.097
and hence their remarkable functions,

00:01:31.121 --> 00:01:36.899
are completely specified by the sequence of amino acids in the protein chain.

00:01:37.331 --> 00:01:41.272
In this picture, each letter on top is an amino acid.

00:01:42.860 --> 00:01:44.697
Where do these sequences come from?

00:01:45.586 --> 00:01:50.410
The genes in your genome specify the amino acid sequences

00:01:50.434 --> 00:01:51.832
of your proteins.

00:01:51.856 --> 00:01:55.594
Each gene encodes the amino acid sequence of a single protein.

00:01:57.515 --> 00:02:01.317
The translation between these amino acid sequences

00:02:01.341 --> 00:02:03.799
and the structures and functions of proteins

00:02:03.823 --> 00:02:05.880
is known as the protein folding problem.

00:02:06.439 --> 00:02:07.984
It's a very hard problem

00:02:08.008 --> 00:02:11.188
because there's so many different shapes a protein can adopt.

00:02:12.073 --> 00:02:13.718
Because of this complexity,

00:02:13.742 --> 00:02:16.679
humans have only been able to harness the power of proteins

00:02:16.703 --> 00:02:20.171
by making very small changes to the amino acid sequences

00:02:20.195 --> 00:02:22.286
of the proteins we've found in nature.

NOTE Paragraph

00:02:22.835 --> 00:02:26.693
This is similar to the process that our Stone Age ancestors used

00:02:26.717 --> 00:02:30.076
to make tools and other implements from the sticks and stones

00:02:30.100 --> 00:02:32.103
that we found in the world around us.

00:02:33.226 --> 00:02:38.250
But humans did not learn to fly by modifying birds.

NOTE Paragraph

00:02:38.790 --> 00:02:40.807
(Laughter)

NOTE Paragraph

00:02:40.831 --> 00:02:47.141
Instead, scientists, inspired by birds, uncovered the principles of aerodynamics.

00:02:47.165 --> 00:02:51.560
Engineers then used those principles to design custom flying machines.

00:02:52.195 --> 00:02:53.440
In a similar way,

00:02:53.464 --> 00:02:55.406
we've been working for a number of years

00:02:55.430 --> 00:02:58.699
to uncover the fundamental principles of protein folding

00:02:58.723 --> 00:03:02.782
and encoding those principles in the computer program called Rosetta.

00:03:03.742 --> 00:03:06.255
We made a breakthrough in recent years.

00:03:07.029 --> 00:03:11.488
We can now design completely new proteins from scratch on the computer.

00:03:12.396 --> 00:03:14.464
Once we've designed the new protein,

00:03:15.242 --> 00:03:19.145
we encode its amino acid sequence in a synthetic gene.

00:03:19.656 --> 00:03:21.544
We have to make a synthetic gene

00:03:21.568 --> 00:03:23.819
because since the protein is completely new,

00:03:23.843 --> 00:03:28.605
there's no gene in any organism on earth which currently exists that encodes it.

NOTE Paragraph

00:03:29.697 --> 00:03:33.884
Our advances in understanding protein folding

00:03:33.908 --> 00:03:35.630
and how to design proteins,

00:03:35.654 --> 00:03:39.282
coupled with the decreasing cost of gene synthesis

00:03:39.306 --> 00:03:42.805
and the Moore's law increase in computing power,

00:03:42.829 --> 00:03:47.565
now enable us to design tens of thousands of new proteins,

00:03:47.589 --> 00:03:49.928
with new shapes and new functions,

00:03:49.952 --> 00:03:51.465
on the computer,

00:03:51.489 --> 00:03:55.404
and encode each one of those in a synthetic gene.

00:03:56.248 --> 00:03:57.916
Once we have those synthetic genes,

00:03:57.940 --> 00:03:59.485
we put them into bacteria

00:03:59.509 --> 00:04:02.814
to program them to make these brand-new proteins.

00:04:03.197 --> 00:04:05.270
We then extract the proteins

00:04:05.294 --> 00:04:08.730
and determine whether they function as we designed them to

00:04:08.754 --> 00:04:10.165
and whether they're safe.

NOTE Paragraph

00:04:11.867 --> 00:04:14.332
It's exciting to be able to make new proteins,

00:04:14.356 --> 00:04:16.852
because despite the diversity in nature,

00:04:16.876 --> 00:04:22.968
evolution has only sampled a tiny fraction of the total number of proteins possible.

00:04:23.572 --> 00:04:27.067
I told you that nature uses an alphabet of 20 amino acids,

00:04:27.091 --> 00:04:31.540
and a typical protein is a chain of about 100 amino acids,

00:04:31.564 --> 00:04:37.116
so the total number of possibilities is 20 times 20 times 20, 100 times,

00:04:37.140 --> 00:04:40.957
which is a number on the order of 10 to the 130th power,

00:04:40.981 --> 00:04:44.793
which is enormously more than the total number of proteins

00:04:44.817 --> 00:04:47.233
which have existed since life on earth began.

00:04:47.990 --> 00:04:50.681
And it's this unimaginably large space

00:04:50.705 --> 00:04:54.235
we can now explore using computational protein design.

NOTE Paragraph

00:04:55.747 --> 00:04:58.116
Now the proteins that exist on earth

00:04:58.140 --> 00:05:02.133
evolved to solve the problems faced by natural evolution.

00:05:02.705 --> 00:05:05.058
For example, replicating the genome.

00:05:06.128 --> 00:05:08.412
But we face new challenges today.

00:05:08.436 --> 00:05:11.173
We live longer, so new diseases are important.

00:05:11.197 --> 00:05:13.412
We're heating up and polluting the planet,

00:05:13.436 --> 00:05:16.994
so we face a whole host of ecological challenges.

00:05:17.977 --> 00:05:19.785
If we had a million years to wait,

00:05:19.809 --> 00:05:23.017
new proteins might evolve to solve those challenges.

00:05:23.787 --> 00:05:25.846
But we don't have millions of years to wait.

00:05:26.488 --> 00:05:29.359
Instead, with computational protein design,

00:05:29.383 --> 00:05:33.822
we can design new proteins to address these challenges today.

NOTE Paragraph

00:05:35.693 --> 00:05:40.143
Our audacious idea is to bring biology out of the Stone Age

00:05:40.167 --> 00:05:43.142
through technological revolution in protein design.

00:05:44.113 --> 00:05:46.977
We've already shown that we can design new proteins

00:05:47.001 --> 00:05:48.684
with new shapes and functions.

00:05:49.174 --> 00:05:53.482
For example, vaccines work by stimulating your immune system

00:05:53.506 --> 00:05:56.628
to make a strong response against a pathogen.

00:05:57.698 --> 00:05:59.249
To make better vaccines,

00:05:59.273 --> 00:06:01.575
we've designed protein particles

00:06:01.599 --> 00:06:05.186
to which we can fuse proteins from pathogens,

00:06:05.210 --> 00:06:09.544
like this blue protein here, from the respiratory virus RSV.

00:06:10.131 --> 00:06:11.861
To make vaccine candidates

00:06:11.885 --> 00:06:15.548
that are literally bristling with the viral protein,

00:06:15.572 --> 00:06:18.142
we find that such vaccine candidates

00:06:18.166 --> 00:06:21.468
produce a much stronger immune response to the virus

00:06:21.492 --> 00:06:24.195
than any previous vaccines that have been tested.

00:06:24.648 --> 00:06:28.498
This is important because RSV is currently one of the leading causes

00:06:28.522 --> 00:06:30.751
of infant mortality worldwide.

00:06:32.414 --> 00:06:36.377
We've also designed new proteins to break down gluten in your stomach

00:06:36.401 --> 00:06:37.998
for celiac disease

00:06:38.022 --> 00:06:42.398
and other proteins to stimulate your immune system to fight cancer.

00:06:43.338 --> 00:06:47.277
These advances are the beginning of the protein design revolution.

NOTE Paragraph

00:06:48.850 --> 00:06:52.040
We've been inspired by a previous technological revolution:

00:06:52.064 --> 00:06:53.409
the digital revolution,

00:06:53.433 --> 00:06:58.558
which took place in large part due to advances in one place,

00:06:58.582 --> 00:06:59.854
Bell Laboratories.

00:07:00.337 --> 00:07:03.631
Bell Labs was a place with an open, collaborative environment,

00:07:03.655 --> 00:07:06.838
and was able to attract top talent from around the world.

00:07:07.418 --> 00:07:10.860
And this led to a remarkable string of innovations --

00:07:10.884 --> 00:07:15.075
the transistor, the laser, satellite communication

00:07:15.099 --> 00:07:16.825
and the foundations of the internet.

00:07:17.761 --> 00:07:21.602
Our goal is to build the Bell Laboratories of protein design.

00:07:22.076 --> 00:07:25.591
We are seeking to attract talented scientists from around the world

00:07:25.615 --> 00:07:28.550
to accelerate the protein design revolution,

00:07:28.574 --> 00:07:32.662
and we'll be focusing on five grand challenges.

NOTE Paragraph

00:07:34.136 --> 00:07:39.733
First, by taking proteins from flu strains from around the world

00:07:39.757 --> 00:07:43.311
and putting them on top of the designed protein particles

00:07:43.335 --> 00:07:45.002
I showed you earlier,

00:07:45.026 --> 00:07:48.416
we aim to make a universal flu vaccine,

00:07:48.440 --> 00:07:52.391
one shot of which gives a lifetime of protection against the flu.

00:07:53.356 --> 00:07:54.968
The ability to design --

NOTE Paragraph

00:07:54.992 --> 00:08:00.216
(Applause)

NOTE Paragraph

00:08:00.240 --> 00:08:03.308
The ability to design new vaccines on the computer

00:08:03.332 --> 00:08:08.640
is important both to protect against natural flu epidemics

00:08:08.664 --> 00:08:12.144
and, in addition, intentional acts of bioterrorism.

NOTE Paragraph

00:08:13.272 --> 00:08:16.562
Second, we're going far beyond nature's limited alphabet

00:08:16.586 --> 00:08:18.297
of just 20 amino acids

00:08:18.321 --> 00:08:23.056
to design new therapeutic candidates for conditions such as chronic pain,

00:08:23.080 --> 00:08:25.711
using an alphabet of thousands of amino acids.

NOTE Paragraph

00:08:26.602 --> 00:08:30.415
Third, we're building advanced delivery vehicles

00:08:30.439 --> 00:08:34.603
to target existing medications exactly where they need to go in the body.

00:08:35.226 --> 00:08:37.875
For example, chemotherapy to a tumor

00:08:37.899 --> 00:08:42.202
or gene therapies to the tissue where gene repair needs to take place.

NOTE Paragraph

00:08:43.000 --> 00:08:49.532
Fourth, we're designing smart therapeutics that can do calculations within the body

00:08:49.556 --> 00:08:51.770
and go far beyond current medicines,

00:08:51.794 --> 00:08:54.058
which are really blunt instruments.

00:08:54.082 --> 00:08:58.431
For example, to target a small subset of immune cells

00:08:58.455 --> 00:09:00.536
responsible for an autoimmune disorder,

00:09:00.560 --> 00:09:04.018
and distinguish them from the vast majority of healthy immune cells.

NOTE Paragraph

00:09:04.899 --> 00:09:08.311
Finally, inspired by remarkable biological materials

00:09:08.335 --> 00:09:13.443
such as silk, abalone shell, tooth and others,

00:09:13.467 --> 00:09:16.351
we're designing new protein-based materials

00:09:16.375 --> 00:09:20.538
to address challenges in energy and ecological issues.

NOTE Paragraph

00:09:21.558 --> 00:09:24.403
To do all this, we're growing our institute.

00:09:24.768 --> 00:09:30.367
We seek to attract energetic, talented and diverse scientists

00:09:30.391 --> 00:09:33.471
from around the world, at all career stages,

00:09:33.495 --> 00:09:34.645
to join us.

00:09:35.304 --> 00:09:38.607
You can also participate in the protein design revolution

00:09:38.631 --> 00:09:42.375
through our online folding and design game, "Foldit."

00:09:43.214 --> 00:09:47.065
And through our distributed computing project, Rosetta@home,

00:09:47.089 --> 00:09:50.820
which you can join from your laptop or your Android smartphone.

NOTE Paragraph

00:09:52.547 --> 00:09:56.514
Making the world a better place through protein design is my life's work.

00:09:56.996 --> 00:09:59.274
I'm so excited about what we can do together.

00:09:59.583 --> 00:10:01.053
I hope you'll join us,

00:10:01.077 --> 00:10:02.235
and thank you.

NOTE Paragraph

00:10:02.259 --> 00:10:06.714
(Applause and cheers)