0:00:00.000,0:00:20.130 36C3 preroll music 0:00:20.130,0:00:25.169 Angel: Right now I'd like to welcome our[br]first speaker on stage. The talk will be 0:00:25.169,0:00:30.800 about protecting the wild and I'll hand[br]over to her. Please give her a warm round 0:00:30.800,0:00:32.870 of applause. 0:00:32.870,0:00:34.860 Applause 0:00:34.860,0:00:43.920 Jutta Buschbom: Thank you very much for[br]the introduction. My name is Jutta 0:00:43.920,0:00:52.110 Buschbom, I'm an evolutionary biologist.[br]That is my background. I did do my PhD at 0:00:52.110,0:00:57.290 the University of Chicago working on[br]little fungi that live in symbiosis with 0:00:57.290,0:01:05.979 algae and form colorful rocks, colorful[br]crust on rocks. I then did a Postdoc in 0:01:05.979,0:01:12.240 bioinformatics and after that moved back[br]into organismal biology, working in forest 0:01:12.240,0:01:19.560 genetics. And in the ten years I worked in[br]forest genetics for the first time I 0:01:19.560,0:01:26.049 encountered questions that were with[br]regard to application, and I found out 0:01:26.049,0:01:37.359 that actually moving from research to[br]application is not trivial. So what I'm 0:01:37.359,0:01:45.869 going to present is a high tech way using[br]genomic data to protect biodiversity in a 0:01:45.869,0:01:51.939 way that you can actually reach[br]application and use conservation genomic 0:01:51.939,0:02:02.600 tools. So this summer the draft of the[br]report of the Intergovernmental Science 0:02:02.600,0:02:12.319 Policy Panel for Biodiversity and[br]Ecosystem Services came out and its 0:02:12.319,0:02:19.930 results were quite worrying. It stated that[br]around a million animal and plant species 0:02:19.930,0:02:27.330 are currently threatened and of those...half[br]of those species are already dead species 0:02:27.330,0:02:33.450 walking. 
So because due to the destruction[br]of the habitats or habitat deterioration, 0:02:33.450,0:02:42.950 they are not able to reproduce in a[br]sustainable way anymore. A third of the 0:02:42.950,0:02:51.170 total species extinction rate risk to date[br]has arisen in the last 25 years. And just 0:02:51.170,0:03:01.450 to give you an idea about the relation we[br]are talking about...currently the rate of 0:03:01.450,0:03:07.680 extinction risk is already at least tens to[br]hundreds of times higher than it has averaged 0:03:07.680,0:03:13.130 over the past 10 million years. And within[br]these 10 million years there were the Ice 0:03:13.130,0:03:23.260 Ages, for example. And most of the[br]extinction risk is due to the fact of land 0:03:23.260,0:03:36.190 and sea use change. The report also talks,[br]even talks about that we already seem to 0:03:36.190,0:03:42.420 have transgressed a proposed precautionary[br]planetary boundary, which means within the 0:03:42.420,0:03:48.370 boundary we have a stable biological[br]system. But having transgressed it, we 0:03:48.370,0:03:55.430 might already be in a transition to a new[br]state that we have no way to find out what 0:03:55.430,0:04:05.240 this state is going to look like. So all[br]of these facts that the report is stating 0:04:05.240,0:04:14.730 are actually pretty negative. And I was[br]quite happy to read that they also present 0:04:14.730,0:04:20.699 that there are actually people who do[br]better than most of us. And they point out 0:04:20.699,0:04:27.810 that many practices of indigenous people[br]and local communities actually conserve 0:04:27.810,0:04:38.350 and sustain wild and domesticated[br]biodiversity quite well. Today, a higher 0:04:38.350,0:04:44.600 proportion of the remaining terrestrial[br]biodiversity lies in areas managed and 0:04:44.600,0:04:52.890 held by indigenous people. And these[br]ecosystems are more intact and less 0:04:52.890,0:05:01.770 declining, less rapidly declining. 
So we[br]have examples of lifestyles that actually 0:05:01.770,0:05:10.530 do better than most of us. And I know the[br]solutions won't be simple and it won't be 0:05:10.530,0:05:22.330 easy to get there but we can look to what[br]these people do better than we do. All of 0:05:22.330,0:05:27.930 this sounds...it's a global report and it[br]sounds kind of like far away, like 0:05:27.930,0:05:35.990 probably somewhere in the tropics, but[br]actually threats to biodiversity happen 0:05:35.990,0:05:45.400 also directly in front of our own front[br]doors. This summer a paper came out from 0:05:45.400,0:05:52.800 two colleagues from the University of[br]Greifswald, who had analyzed the long term 0:05:52.800,0:05:58.490 data set about leaf beetles. And they were[br]asking if we already have a decline of 0:05:58.490,0:06:08.240 leaf beetles in Central Europe. So they[br]compiled long term data sets of leaf 0:06:08.240,0:06:19.140 beetle observations for Central Europe,[br]starting from 1900 now to 2017, so 0:06:19.140,0:06:27.010 spanning a hundred and twenty years. And[br]what they find is that systematic reports 0:06:27.010,0:06:36.270 on leaf beetles and leaf beetle[br]observations are increasing during this 0:06:36.270,0:06:45.310 time interval, time span. But despite the[br]fact that we have...like in the last two 0:06:45.310,0:06:53.270 decades, we had very high numbers of[br]reports and observations for leaf beetles, 0:06:53.270,0:07:00.100 the number of species, the orange line, is[br]declining. It's slightly declining. But 0:07:00.100,0:07:06.010 the question is, is this real or not? And[br]what was most worrisome to the authors is 0:07:06.010,0:07:15.110 that in the data set, the number of[br]species here in orange that were having 0:07:15.110,0:07:21.930 more reports was declining, while the[br]number of species that showed less reports 0:07:21.930,0:07:33.930 than before is expanding. 
So this kind of[br]long term datasets are very hard to 0:07:33.930,0:07:41.310 interpret and many factors can contribute[br]to those patterns. And it's not clear if 0:07:41.310,0:07:48.310 this pattern is statistically significant.[br]But if you take a step back and consider 0:07:48.310,0:07:54.470 your background knowledge, your prior[br]knowledge about the state of the world, do 0:07:54.470,0:08:02.760 you say, like, how does the current state[br]look like? Does it look good or rather 0:08:02.760,0:08:16.910 worrisome? And then with that knowledge,[br]tell me that these results are an 0:08:16.910,0:08:30.150 artifact or a bias. I'm worried that once[br]we have statistically significant signal in 0:08:30.150,0:08:41.789 this dataset, it will be already too late.[br]So right now, I've been talking about leaf 0:08:41.789,0:08:49.639 beetles and beetles are the largest group[br]within insects with about 400,000 species. 0:08:49.639,0:08:56.200 Leaf beetles are a large family of about[br]50,000 species which are worldwide 0:08:56.200,0:09:05.080 distributed. And here in Germany, we have[br]over 470 leaf beetle species. So how do we 0:09:05.080,0:09:09.740 actually know how many species there are[br]and who actually counted all these 0:09:09.740,0:09:15.960 species? And is that just a task of[br]taxonomists? Taxonomy is the science of 0:09:15.960,0:09:21.600 naming and defining, including[br]circumscribing and classifying groups of 0:09:21.600,0:09:32.020 biological organisms on the basis of[br]shared characters. So one could have the 0:09:32.020,0:09:37.560 picture of some woman with a funny hat[br]running over a meadow catching like 0:09:37.560,0:09:44.480 butterflies or some guy mushroom hunter[br]crawling through the forest trying to find 0:09:44.480,0:09:52.380 mushrooms. And it's true, as biodiversity[br]scientists we spend a lot of time outdoors 0:09:52.380,0:10:02.290 and yeah...on the other hand, biotaxonomy[br]is a high-tech science today. 
So 0:10:02.290,0:10:11.050 taxonomists actually take up new[br]technological tools and developments to 0:10:11.050,0:10:17.270 help them identify and describe,[br]understand the species. So taxonomists 0:10:17.270,0:10:25.110 actually are often experts in, for[br]example, microscopy, mathematics, 0:10:25.110,0:10:36.850 biochemistry, even proteomics and[br]genomics. So throughout the talk, I'm 0:10:36.850,0:10:41.520 going to compile this list of people and[br]experts we're going to need to protect 0:10:41.520,0:10:49.360 biodiversity if we want to do this on the[br]basis of genetic data. Right now, the list 0:10:49.360,0:10:56.430 is quite empty. The first entry is[br]taxonomists, but that will change quickly 0:10:56.430,0:11:06.260 and taxonomists are a subgroup of[br]evolutionary biologists mostly. So I told 0:11:06.260,0:11:15.560 you that taxonomists and biodiversity[br]scientists take up technology and...so as 0:11:15.560,0:11:23.610 soon as computers came about and the[br]internet started people started to use 0:11:23.610,0:11:32.420 that to compile information about species,[br]and today we have several global resources 0:11:32.420,0:11:40.640 available at the species level and above[br]the species level. So we biodiversity 0:11:40.640,0:11:45.720 scientists were among the first who[br]defined biodiversity information 0:11:45.720,0:11:56.690 standards. We have a global Catalogue of[br]Life. A list of all named species. The 0:11:56.690,0:12:01.810 Global Biodiversity Information Facility[br]has an aim to bring together information 0:12:01.810,0:12:08.630 from different sources and they are[br]compiling, producing this wonderful map. 0:12:08.630,0:12:13.940 This is leaf beetles, all the records[br]about leaf beetles that we have in the 0:12:13.940,0:12:22.200 world. And it looks like as if leaf[br]beetles are highly associated with first 0:12:22.200,0:12:29.580 world economies. 
However that clearly is[br]an artifact and it just shows that we need 0:12:29.580,0:12:34.560 many more taxonomists and biodiversity[br]scientists all over the world to find and 0:12:34.560,0:12:45.300 identify leaf beetles. So we also need[br]biodiversity informaticians to help us 0:12:45.300,0:12:52.050 compile global lists and distribute[br]knowledge. So far I have been talking 0:12:52.050,0:12:57.890 about species which is a simplification.[br]The question is what is...what are species 0:12:57.890,0:13:03.400 actually? And so we need to talk about[br]genetic diversity within and between 0:13:03.400,0:13:16.519 species. And I'm going to do so using[br]gulls, which most of us might know. Here 0:13:16.519,0:13:21.670 in Europe, we have two large gulls of the[br]genus Larus. One is in the front, the 0:13:21.670,0:13:31.070 lighter gray is our Silbermöwe. And in the[br]back is our Heringsmöwe, the dark one. And 0:13:31.070,0:13:35.740 I'm going to use German names because the[br]English names go crosswise and that's 0:13:35.740,0:13:43.160 completely confusing. So I will stick with[br]the German names. Here in Europe these two 0:13:43.160,0:13:48.450 species seem to be really fine species[br]because they barely interbreed, so they 0:13:48.450,0:13:55.680 don't hybridize. However, if you take a[br]step back and look at the genus in 0:13:55.680,0:14:03.120 general, you see that the species of the[br]genus are distributed kind of ringwise 0:14:03.120,0:14:14.510 around the Arctic. And so the idea is[br]that, say during the Ice Age, all of this 0:14:14.510,0:14:22.959 area was glaciated and the gulls retreated[br]to a refuge here near the Caspian Sea. And 0:14:22.959,0:14:28.110 then after the ice retreated, the gulls[br]moved back north. 
One branch moved into 0:14:28.110,0:14:34.350 Europe forming our Heringsmöwe and[br]another branch then moved counterclockwise 0:14:34.350,0:14:41.019 around the Arctic, producing different[br]morphotypes, different species across the 0:14:41.019,0:14:49.450 Bering Strait and then into North America.[br]There the dark blue one is...I'm 0:14:49.450,0:14:58.730 simplifying, the equivalent of our[br]European Silbermöwe, the American 0:14:58.730,0:15:03.830 Silbermöwe. Then the idea is that some[br]individuals crossed back to Europe and 0:15:03.830,0:15:14.800 formed our European Silbermöwe. And while[br]all of these species here are 0:15:14.800,0:15:21.769 interbreeding, so they hybridize, only[br]when this ring is closed those two species 0:15:21.769,0:15:26.720 don't interbreed anymore. And the big[br]question is, are we actually dealing with 0:15:26.720,0:15:34.230 one single species or are we dealing with[br]different species that just happen to 0:15:34.230,0:15:41.079 hybridize more or less? The question is[br]not trivial because it has consequences 0:15:41.079,0:15:48.740 for protection. If we are dealing with one[br]single species, all the gulls in Eurasia 0:15:48.740,0:15:53.010 could go extinct and it wouldn't matter[br]because we still would have the gulls in 0:15:53.010,0:15:58.540 North America. However, if we have[br]different species in all of these areas, 0:15:58.540,0:16:04.709 we would need to protect individuals or[br]the species on a regional level and 0:16:04.709,0:16:17.279 protect all of these different species. So[br]to investigate this question about: Do we 0:16:17.279,0:16:23.589 have different species? And what were the[br]evolutionary processes and histories that 0:16:23.589,0:16:31.100 brought about the species? A group of[br]scientists investigated that using DNA 0:16:31.100,0:16:39.930 sequences. And on the left, you have the[br]model, the theoretical model of the ring 0:16:39.930,0:16:46.380 species. 
And here on the right you have[br]reality. And the scientists found that the 0:16:46.380,0:16:51.630 reality is always much more complex. So,[br]for example, they found two refuges or 0:16:51.630,0:16:58.430 they proposed two refuges. But what they[br]found was that genetic diversity was 0:16:58.430,0:17:07.351 correlated with those species or[br]morphotypes. So what that also means is 0:17:07.351,0:17:15.730 that genetic diversity is correlated with[br]geographic origin. What we learn from this 0:17:15.730,0:17:24.360 type of analysis is we learn about[br]evolutionary processes and history, about 0:17:24.360,0:17:30.170 variability and differentiation, about[br]gene flow and migration, about speciation 0:17:30.170,0:17:37.590 processes. All that we need to understand[br]our species, which will allow us to 0:17:37.590,0:17:43.440 protect them. So we need evolutionary[br]biologists who do phylogenetics and 0:17:43.440,0:17:59.030 population genetics. So once we found out[br]that one can use genetic diversity, to 0:17:59.030,0:18:07.130 infer geographic origin because genetic[br]diversity is correlated with geography, 0:18:07.130,0:18:18.500 people immediately said: 'Okay, we can use[br]it for conservation applications.' And 0:18:18.500,0:18:24.049 it's also...we learned that we...often it[br]is unclear what is a species, species 0:18:24.049,0:18:32.559 boundaries are unclear and some species[br]have huge distribution ranges with 0:18:32.559,0:18:37.340 different clusters of variability within[br]this huge range. So we know that we need 0:18:37.340,0:18:42.941 to protect within species genetic[br]diversity, which means that we need to 0:18:42.941,0:18:50.650 understand within species population[br]structure and we need to build useful and 0:18:50.650,0:18:58.919 reliable models of population structure.[br]These models are actually required for all 0:18:58.919,0:19:03.740 of our applications. 
They are required for[br]monitoring, for example, for conservation 0:19:03.740,0:19:11.890 strategies, for functional adaptation and[br]adaptability, questions of productivity 0:19:11.890,0:19:19.190 of different provenances, its impact on[br]management regimes, breeding strategies, 0:19:19.190,0:19:27.610 and also for enforcement applications.[br]From the studies I showed you before with 0:19:27.610,0:19:34.110 the gulls we also know that we need to[br]approach the question of a population 0:19:34.110,0:19:47.070 structure on a distribution range wide[br]scale. So here's the map produced by 0:19:47.070,0:19:53.630 EUFORGEN, the European Network for forest[br]reproductive material for one of our 0:19:53.630,0:20:02.000 native oaks, the sessile oak. And the dots[br]are the sites for genetic conservation 0:20:02.000,0:20:12.120 units. And so that is one strategy how to[br]represent within species genetic diversity 0:20:12.120,0:20:22.020 and how to sample it. And you can see this[br]is a hypothetical example, but we likely 0:20:22.020,0:20:32.460 will see a gradient from west to east or[br]might see one at this scale. Then once we 0:20:32.460,0:20:37.800 have these kind of global data sets, we[br]can go to the fine scale and maybe, for 0:20:37.800,0:20:44.100 example, do a national genetic monitoring.[br]And we will find much finer scale 0:20:44.100,0:20:51.210 gradients. We also will find especially[br]for forest trees outliers, so for stands 0:20:51.210,0:20:59.150 that don't fit the usual pattern. And that[br]is because the forest reproductive material 0:20:59.150,0:21:07.660 has been moved around a lot. And so these[br]lighter or darker dots is material that 0:21:07.660,0:21:16.150 was moved to Germany from the outside. And[br]we only will identify these outliers if we 0:21:16.150,0:21:21.380 have the whole reference dataset. 
If we[br]don't have the whole reference dataset, we 0:21:21.380,0:21:28.799 might not identify these outliers - stands[br]with a different history. Or in a worst 0:21:28.799,0:21:34.280 case, these outliers might actually bias[br]our gradients. And we are always talking 0:21:34.280,0:21:42.770 about very slight gradients. So it's easy[br]to bias these gradients, dilute them, so 0:21:42.770,0:21:50.710 we actually won't get the results we need.[br]To compile these kinds of reference 0:21:50.710,0:21:57.850 datasets that's huge collaborative efforts[br]because people need to go out into the 0:21:57.850,0:22:04.500 field and collect the reference samples[br]and that might be scientists, that might 0:22:04.500,0:22:13.669 be people from local communities, citizen[br]scientists, managers, owners, government 0:22:13.669,0:22:20.179 officials who provide background[br]information, maps, distribution 0:22:20.179,0:22:27.929 information and also in many parts of the[br]world might protect the people who are 0:22:27.929,0:22:34.510 actually collecting the samples. And it[br]might be conservation activists and NGOs. 0:22:34.510,0:22:41.150 So once the samples have been collected[br]they need to be stored somewhere for the 0:22:41.150,0:22:51.150 long term and the information needs to be[br]databased. And that is the work of 0:22:51.150,0:22:57.430 scientific collections, which are mostly[br]at natural history museums and there the 0:22:57.430,0:23:04.460 samples are processed. They're organized[br]in ways that you can find them again. All 0:23:04.460,0:23:09.680 the metadata is entered, which curators[br]do, collection managers, preparators, 0:23:09.680,0:23:17.030 technical staff at the scientific[br]collections. So once we have these kind of 0:23:17.030,0:23:24.910 data sets, large scale data sets, what are[br]we actually doing with them? 
So the 0:23:24.910,0:23:32.514 foundation for all of our applications is[br]population structure and there 0:23:32.514,0:23:42.370 specifically population assignment. So the[br]process is that first we decide on a 0:23:42.370,0:23:46.660 question and design our project[br]accordingly so that we can answer the 0:23:46.660,0:23:51.940 question. Then we need to infer the[br]population structure model and optimize 0:23:51.940,0:23:57.480 it. In the next step we need to check if a[br]model actually is good enough for 0:23:57.480,0:24:03.040 application because we might have found[br]the best model, but it might still not be 0:24:03.040,0:24:07.480 good enough for application. So we need to[br]test that. And that is the step of 0:24:07.480,0:24:12.831 population assignment or predictive[br]assignment. And then in the end, we want 0:24:12.831,0:24:19.330 to test our hypothesis. Are the two stands[br]different or does an individual come from 0:24:19.330,0:24:31.059 stand A or from stand B? And here we[br]identify error rates and accuracy. So this 0:24:31.059,0:24:38.890 whole process is very statistical. And so[br]the analysis of these reference data they 0:24:38.890,0:24:48.240 need to be accompanied by biostatisticians[br]who can tell us how to analyze our data. 0:24:48.240,0:24:55.289 So what is the state-of-the-art right now?[br]What kind of geographic resolution do we 0:24:55.289,0:25:02.990 actually get for these non-model species[br]currently? And I'm going to present the 0:25:02.990,0:25:09.600 example of an African timber tree[br]species, which is a very valuable timber. 0:25:09.600,0:25:18.110 It's one example but basically all results[br]for species who have large distribution 0:25:18.110,0:25:26.059 ranges and are continuously distributed[br]and are also long-lived, are very similar. 0:25:26.059,0:25:33.460 So this kind of results seem to be species[br]independent. 
So the species are Milicia 0:25:33.460,0:25:40.370 regia and excelsa, African teak, which[br]cannot be grown in plantations for timber 0:25:40.370,0:25:51.159 quality. So it is harvested unsustainably[br]from natural forests. It's distributed in 0:25:51.159,0:26:00.580 West, Central and East Africa. Here's a[br]black rectangle. And a group of a dozen 0:26:00.580,0:26:06.289 scientists got together and they actually[br]sampled a reference dataset for these two 0:26:06.289,0:26:18.659 species. It's about over 400 samples, they[br]analyzed four marker systems, resulting in 0:26:18.659,0:26:24.570 a total of something like 100 markers,[br]genetic markers, and then they optimized 0:26:24.570,0:26:32.660 the population model and used different[br]parameter settings. And we're going to 0:26:32.660,0:26:40.080 concentrate here on the best solution that[br]they found. And basically this rectangle 0:26:40.080,0:26:47.870 here is the black one over here. So the[br]resolution is... they found population 0:26:47.870,0:26:54.690 structure with clear clusters. So the[br]populations and the species from West 0:26:54.690,0:27:01.490 Africa can be distinguished from those[br]populations in Central Africa. And the 0:27:01.490,0:27:08.460 ones in East Africa can be differentiated.[br]So that is really good. So we have 0:27:08.460,0:27:13.480 population structure. We know there's[br]signal. The problem is still that our 0:27:13.480,0:27:21.510 resolution is much lower than we would[br]need to have it because we basically need 0:27:21.510,0:27:32.090 resolution at least on a country level,[br]because most of the laws are national. So 0:27:32.090,0:27:41.770 it might be legal to harvest a tree in one[br]country, but not in another country. So we 0:27:41.770,0:27:49.319 need to get our resolution down to country[br]level or even to regional level. 
If you 0:27:49.319,0:27:52.361 want to distinguish, was the tree[br]harvested in a national park in a 0:27:52.361,0:28:02.289 protected area or outside in a managed[br]forest. And when as biodiversity 0:28:02.289,0:28:10.740 scientists, we don't know how to continue,[br]one thing is to look for what people do 0:28:10.740,0:28:17.179 with model organisms and specifically what[br]people do in human population genomics 0:28:17.179,0:28:24.179 because there thousands of population[br]geneticists are working and there is a 0:28:24.179,0:28:28.210 completely different funding background[br]due to the interest of the medical and the 0:28:28.210,0:28:39.119 pharma industry. So they are always[br]advanced. What we can learn from there, 0:28:39.119,0:28:46.660 from human population genomics, is[br]that we need two features. One is we 0:28:46.660,0:28:53.570 already know that we need distribution[br]wide sampling, which provides a spatial 0:28:53.570,0:28:59.950 context. The second feature is that we[br]need genome wide sequencing, preferably 0:28:59.950,0:29:09.210 genome sequencing, which provides us steps[br]in time because our genomes are archives 0:29:09.210,0:29:14.710 of our evolutionary history. They are[br]records of all the processes and events 0:29:14.710,0:29:21.429 and these steps in time then translate[br]also into resolution. Once we have these 0:29:21.429,0:29:30.150 two features, actually these reference[br]datasets open Pandora's box. Suddenly we 0:29:30.150,0:29:36.390 can ask all kinds of questions and[br]objectives, even those that we still don't 0:29:36.390,0:29:47.010 know. We can develop all kinds of[br]applications which is done for humans. 0:29:47.010,0:29:59.400 Currently, there are at least four global[br]datasets on human diversity. 
These are 0:29:59.400,0:30:08.860 very widely reused and these big datasets[br]- so they are big data with regard to the 0:30:08.860,0:30:18.850 number of samples and also the genomes or[br]the genome representations and this 0:30:18.850,0:30:26.470 results in very information rich data[br]which initiates analytical development so 0:30:26.470,0:30:33.799 people continuously are developing new[br]statistical methods. And right now, a new 0:30:33.799,0:30:42.330 wave is coming in of these methods. So[br]once you have these global datasets, 0:30:42.330,0:30:47.500 people start...in human population[br]genomics, started to do these intense 0:30:47.500,0:30:56.299 regional samplings. And this is the[br]example of the United Kingdom Biobank. 0:30:56.299,0:31:02.789 It's a project with 500,000 volunteers,[br]they are all UK citizens from all over the 0:31:02.789,0:31:13.982 islands. And each individual was genotyped[br]in a wet lab for 820,000 markers. That's 0:31:13.982,0:31:19.620 completely, I mean, that's a different[br]number than the 100 or 1000...in 0:31:19.620,0:31:26.409 biodiversity science we normally[br]analyse a maximum of a couple of 10,000 0:31:26.409,0:31:36.220 markers. So that's a completely different[br]number. But then statistical geneticists 0:31:36.220,0:31:47.140 come. They do some weird and wonderful[br]voodoo and they derive 96 million markers 0:31:47.140,0:31:53.460 per genome that is per individual from[br]these 820,000 markers that were produced 0:31:53.460,0:32:00.630 in the lab. So that's a hundred fold[br]increase. And once you have this kind of 0:32:00.630,0:32:07.510 dataset for a genome, you suddenly or you[br]finally get country level and within 0:32:07.510,0:32:18.970 country level resolution. So these panels[br]are examples. 
So the first panel shows 0:32:18.970,0:32:25.980 individuals who were born in Edinburgh and[br]the question was "Where were people born 0:32:25.980,0:32:32.419 who had a similar ancestral background,[br]genetic background?". And what they found 0:32:32.419,0:32:41.980 was that was all over Scotland and[br]Northern Ireland. Northern Yorkshire was 0:32:41.980,0:32:50.250 even more local. So people from Yorkshire[br]don't seem to get around a lot. For London 0:32:50.250,0:32:54.090 the situation is completely different.[br]That is what we would expect because 0:32:54.090,0:32:59.580 London is a people magnet. People move[br]there all the time. They meet there, they 0:32:59.580,0:33:05.700 get children and the kids born in London,[br]their genetic ancestry has nothing to do 0:33:05.700,0:33:12.760 with London. It's from all over the place,[br]from the British Isles and the world. So 0:33:12.760,0:33:21.600 that's why the colors are strongly[br]dissolved. So this study came out also 0:33:21.600,0:33:26.100 this summer. And it's the first time that[br]I have seen that we actually really can 0:33:26.100,0:33:36.580 achieve regional resolution. And I find[br]this possibility for biodiversity science 0:33:36.580,0:33:46.820 really exciting. So it was made possible[br]by very sophisticated statistical 0:33:46.820,0:33:51.890 approaches which are able to analyze[br]genetic data from highly complex 0:33:51.890,0:33:59.450 evolutionary and ecological systems. And[br]at the same time these analyses are able 0:33:59.450,0:34:04.910 to handle big data. We're talking about[br]gigabytes and terabytes of data and 0:34:04.910,0:34:13.810 results. So statistical geneticists are[br]developing new methods of data 0:34:13.810,0:34:20.309 representation to handle this amount of[br]data. And then we are able to sufficiently 0:34:20.309,0:34:25.520 extract the signal for a very specific[br]question from data which have a very low 0:34:25.520,0:34:36.919 signal to noise ratio. 
So to get there, we[br]need many experts and specialists. So we 0:34:36.919,0:34:41.659 need statistical geneticists, big data[br]experts who also might contribute machine 0:34:41.659,0:34:49.299 learning expertise. We need molecular[br]biologists who know how to sequence 0:34:49.299,0:34:54.259 complex genomes. We now need[br]bioinformaticians with an expertise in 0:34:54.259,0:35:05.010 genomics for assembly, annotation and[br]alignment of genomic sequences. The result 0:35:05.010,0:35:12.569 is actually this: This is the author list[br]for the 1000 Genomes Project 0:35:12.569,0:35:20.380 reference data set, and I don't expect you[br]to be able to read it, but the bold type 0:35:20.380,0:35:25.539 is of interest because it shows all the[br]different tasks that are necessary to 0:35:25.539,0:35:36.140 produce a standardized and highly cleaned[br]reference dataset. So the whole author 0:35:36.140,0:35:41.880 list is something like 1.5 pages long. And[br]even considering that some authors will 0:35:41.880,0:35:51.130 have contributed to several tasks, the[br]publications for reference datasets mostly 0:35:51.130,0:35:57.079 have author lists that are far over 50[br]people. So they are huge collaborative 0:35:57.079,0:36:05.219 efforts. Now we take the step into[br]biodiversity science. Here these are eight 0:36:05.219,0:36:13.440 gastrotrichs, they are little worm like...[br]organisms who live in the sediments of 0:36:13.440,0:36:23.069 freshwater lakes and marine sediment. They[br]are in general a couple of hundred 0:36:23.069,0:36:29.569 micrometers large. And I don't have any[br]numbers, but my guess would be that maybe 0:36:29.569,0:36:38.640 worldwide, a hundred to a thousand people[br]actually work on these species. There are 0:36:38.640,0:36:44.829 800 species of gastrotrichs. So let's say[br]there's one, two, maybe three experts per 0:36:44.829,0:36:52.240 species for these organisms. 
So how are[br]these three people going to manage all 0:36:52.240,0:37:01.420 these tasks to produce a reference[br]dataset? You might say, well, it's 0:37:01.420,0:37:05.209 gastrotrichs, I mean, I have never heard[br]about them. Maybe they are not so 0:37:05.209,0:37:08.349 important. Maybe you don't need a[br]reference data set, but actually some of 0:37:08.349,0:37:17.579 those species are bioindicators for water[br]quality. So what we observe right now is a 0:37:17.579,0:37:27.510 gap for biodiversity conservation. In[br]model organisms, we have Pandora's Box 0:37:27.510,0:37:34.630 open. We have all the statistical analyses[br]at our hands to analyze our data sets. 0:37:34.630,0:37:39.709 However, in non-model organisms, we are[br]still stuck with summary statistics that 0:37:39.709,0:37:46.839 don't provide us the resolution that we[br]need. And we know that to close this gap, 0:37:46.839,0:37:52.599 even for a single species, it's a huge[br]effort. But at the same time, we have over 0:37:52.599,0:38:03.560 35,000 species listed by scientists which[br]need already now effective protection. So 0:38:03.560,0:38:10.008 we need to find a way to close this gap[br]and actually move in this direction. And 0:38:10.008,0:38:19.940 the good thing is, so all of this... in[br]biodiversity science, in academia, and we 0:38:19.940,0:38:24.890 need to make the transition over the[br]conservation genomics gap into the big 0:38:24.890,0:38:32.130 loop of real world conservation tasks. And[br]the good thing is we already know what we 0:38:32.130,0:38:37.940 have to do. So we need to have reference[br]data sets, distribution range wide. We 0:38:37.940,0:38:43.959 need to have statistics. And it's going to[br]be big data. So we need collection 0:38:43.959,0:38:54.140 management, data management and an[br]analysis environment. 
So looking at 0:38:54.140,0:38:59.880 different ingredients or different steps,[br]the first thing we need is a general data 0:38:59.880,0:39:05.269 infrastructure for global diversity of[br]reference data sets that actually can be 0:39:05.269,0:39:11.779 used across species, for preferably as many[br]species as possible, and provide a working 0:39:11.779,0:39:19.749 environment for biodiversity scientists[br]and experts. It should be user friendly so 0:39:19.749,0:39:25.759 it can be used by scientists, but also so[br]that people from local communities and 0:39:25.759,0:39:33.489 citizen scientists can add their[br]observation data and their data into this 0:39:33.489,0:39:41.339 data infrastructure. I have listed quite a[br]lot of features that this kind of 0:39:41.339,0:39:48.400 infrastructure should have. And I'm going[br]to argue that these features are not 0:39:48.400,0:40:02.609 nice-to-haves, but actually must-haves.[br]Because our goal is always application. So 0:40:02.609,0:40:13.279 we need developers, managers and curators[br]for data infrastructures. Since our goal 0:40:13.279,0:40:30.900 is application, our main features are[br]quality control and error reduction. These 0:40:30.900,0:40:38.880 are the basis, so that our conservation[br]tools can be robustly and reliably applied 0:40:38.880,0:40:46.459 under real world operating conditions. And[br]the way to achieve quality and error 0:40:46.459,0:40:52.759 reduction is through chains of custody. So[br]it means that from project design, from 0:40:52.759,0:40:58.299 the questions, through all the steps that[br]are necessary to produce a reference data 0:40:58.299,0:41:08.219 set and then...so from sample collection,[br]genomic and statistical analysis down to 0:41:08.219,0:41:15.599 application, these steps need to be[br]documented and standardized. They need to 0:41:15.599,0:41:22.239 be...each one of them needs to be validated[br]and reproducible. 
They should be modular 0:41:22.239,0:41:28.999 so they can be user friendly. And the[br]whole chain of custody needs to be 0:41:28.999,0:41:40.690 scalable. So if our chains of custody have[br]these characteristics, we actually will 0:41:40.690,0:41:51.390 have tools that will work in everyday[br]life. So we need professional developers 0:41:51.390,0:41:59.519 and programmers who are able to produce[br]this very collaborative software. We 0:41:59.519,0:42:06.130 need free and open source experts, so we[br]can always ensure that our code and 0:42:06.130,0:42:13.859 our infrastructures still have integrity and[br]that we can check them. And I'm a biologist, I 0:42:13.859,0:42:19.390 don't have any background in hardware, but[br]I've heard a couple of talks here at the 0:42:19.390,0:42:26.099 conference about Green IT. And I have[br]the feeling we should have people who know 0:42:26.099,0:42:33.849 hardware and software and know how to[br]develop these high tech tools in a 0:42:33.849,0:42:38.450 sustainable way, so that by developing these[br]tools, we don't use more resources than we 0:42:38.450,0:42:48.940 are trying to protect. So I've shown all[br]these features and characteristics that 0:42:48.940,0:42:57.459 the software should have. And I'm arguing[br]that these features are necessary because 0:42:57.459,0:43:04.819 of the reality we find ourselves in. It is one of[br]rising over-exploitation and destruction 0:43:04.819,0:43:19.799 of nature. So the extent of environmental[br]crimes is up in the billions. All 0:43:19.799,0:43:29.029 environmental crimes together, the green[br]bubbles, are second only to drug-associated 0:43:29.029,0:43:35.489 crimes. They are up there with[br]counterfeiting or human trafficking. So 0:43:35.489,0:43:45.479 these are multi-billion enterprises. They[br]are often transnational industries 0:43:45.479,0:44:02.019 with huge profits. 
So if there's some[br]crime boss, some mafia boss, some criminal 0:44:02.019,0:44:09.539 manager who just bribed a government[br]official somewhere in some neck of the 0:44:09.539,0:44:17.859 woods, it would just make sense that that[br]person would not wait or take the 0:44:17.859,0:44:23.809 risk of being discovered just because some[br]customs officer pulls out a container 0:44:23.809,0:44:29.170 somewhere in the harbor, for example,[br]opens it and says "This looks kind of 0:44:29.170,0:44:37.380 weird. Let's take a sample, send it to a[br]lab." and then a population geneticist 0:44:37.380,0:44:44.171 comes back and says "Oh, yes, this sample[br]is not from area A as documented, but 0:44:44.171,0:44:52.449 actually it's from area B and it was[br]illegally logged." If we have reference 0:44:52.449,0:44:58.660 data sets, information rich reference data[br]sets, they become highly valuable and they 0:44:58.660,0:45:08.430 need protection themselves against[br]manipulation and destruction. So we will 0:45:08.430,0:45:14.739 need to think about IT security from the[br]beginning. Also, these data sets are often 0:45:14.739,0:45:20.069 very politically sensitive because if it[br]is shown that in a certain country there 0:45:20.069,0:45:25.680 is illegal logging repeatedly, that[br]country might not be too excited about 0:45:25.680,0:45:41.380 this information. So we need to think[br]about IT security experts. So my hope is 0:45:41.380,0:45:48.599 that these kinds of very high tech digital[br]conservation tools can actually contribute 0:45:48.599,0:45:55.690 to the U.N. Sustainable Development Goals[br]by empowering indigenous people, local 0:45:55.690,0:46:02.810 communities and also us to protect and[br]enforce and sustainably use our lands and 0:46:02.810,0:46:10.139 our biodiversity by providing some[br]management and law enforcement tools. 
So 0:46:10.139,0:46:14.059 we need people from around the world,[br]users from around the world who use these 0:46:14.059,0:46:25.789 tools and help to develop them further and[br]to maintain them. And finally here, these 0:46:25.789,0:46:33.910 high tech tools will just be another[br]technological fix if we don't manage to 0:46:33.910,0:46:45.770 get our way of life back down to[br]sustainable levels. So what we need is to 0:46:45.770,0:46:53.759 today...this year, Earth Overshoot Day[br]was at the end of July. So at the end of 0:46:53.759,0:47:01.639 July, we had used all the resources that[br]we had available for the whole year. And 0:47:01.639,0:47:09.400 we need to get this back to the end of the[br]year so that our resources actually 0:47:09.400,0:47:22.910 sustain us for the whole year. The graphic[br]here for Germany suggests that we are on a 0:47:22.910,0:47:29.819 good path. We are reducing our resource[br]consumption and maybe even our biocapacity 0:47:29.819,0:47:38.099 moves up a little bit. So actually it[br]seems that our personal lifestyles and 0:47:38.099,0:47:46.329 choices make a difference and we just need[br]to close this gap here much quicker. So 0:47:46.329,0:47:53.689 protecting biodiversity needs all of us to[br]achieve that. And with that, thank you 0:47:53.689,0:47:57.770 very much. 0:47:57.770,0:48:08.020 Applause 0:48:08.020,0:48:12.680 Angel: So thank you Jutta for this very[br]interesting talk and the very valuable 0:48:12.680,0:48:16.609 work you're doing. We have three mics[br]here. Please line up at the microphones if 0:48:16.609,0:48:22.809 you have any questions or suggestions or[br]want to participate and work together with 0:48:22.809,0:48:29.660 Jutta. We have one question from the[br]Internet, so please, Signal-Angel, start. 
0:48:29.660,0:48:34.749 Signal-Angel: Why are wild plant species[br]within a genus further apart than wild 0:48:34.749,0:48:42.509 animal species within a genus?[br]Angel: Could you repeat it, please? 0:48:42.509,0:48:49.069 Signal-Angel: Why are wild plant species[br]within a genus further apart than wild 0:48:49.069,0:48:55.910 animal species within a genus?[br]Jutta: I'm not sure I understand the 0:48:55.910,0:49:01.180 background for the question.[br]Mic 1: Because animals move and plants 0:49:01.180,0:49:06.449 don't move.[br]Jutta: Oh, okay. If that is the idea 0:49:06.449,0:49:12.299 behind the question. Plants actually move,[br]too. They don't move as individuals, but 0:49:12.299,0:49:24.289 they move their genetic material through[br]pollen or fragments. So actually diversity 0:49:24.289,0:49:30.760 in plants and in animals can be quite[br]similar. So the idea that plants are 0:49:30.760,0:49:36.459 just stuck and should have a completely[br]different population structure does not 0:49:36.459,0:49:43.130 hold, because plants move their[br]genetic material around through seeds, through 0:49:43.130,0:49:49.610 pollen, through vegetative propagules.[br]Angel: So thank you, microphone 1, for 0:49:49.610,0:49:55.999 helping out. Please ask your question. Mic[br]1: So my question is about the success 0:49:55.999,0:50:00.939 factor of it. If you think of this,[br]whatever database being set up there, and I 0:50:00.939,0:50:07.430 think it's gonna be a huge database...I[br]downloaded my own genome from the Internet. 0:50:07.430,0:50:12.989 It was about 150 megabytes. And if we[br]multiply that, I think the genetic 0:50:12.989,0:50:17.539 variation from one person to another is[br]about 1 percent only. So we can compress 0:50:17.539,0:50:25.009 that to 4 megabytes per person. If we[br]sequence all the humans in the world, that 0:50:25.009,0:50:32.689 would be 32 petabytes, that would cost[br]approximately 15 billion dollars. 
And 0:50:32.689,0:50:36.890 that's only for the storage. Now comes the[br]entire management. Of course, we don't 0:50:36.890,0:50:41.470 want to digitize all the human genomes, but[br]rather the plant and animal species 0:50:41.470,0:50:46.309 genomes. So it's a huge data program. And[br]what would be for you the success factors 0:50:46.309,0:50:51.229 for this thing to really fly? And did you[br]talk to organizations like WikiData or 0:50:51.229,0:50:56.469 others, or where would it ideally be[br]hosted? At a university or an 0:50:56.469,0:51:02.170 international nonprofit, or who would be[br]running the thing? 0:51:02.170,0:51:14.519 Jutta: Yeah, I mean, it's just really big[br]data. I think our first goal is not to 0:51:14.519,0:51:23.670 think about having all predicted 5 to 10[br]million species be sequenced on a 0:51:23.670,0:51:30.239 population level. I think we need to think[br]about the next step. And there it would 0:51:30.239,0:51:35.530 make sense to start with species that are[br]actually highly exploited, like many 0:51:35.530,0:51:40.579 timber species and also many marine[br]fishes. I think that's where we should 0:51:40.579,0:51:48.039 start. And to host this kind of data, I[br]think it should be in politically 0:51:48.039,0:51:56.410 independent hands. So it should be with an[br]NGO or with the U.N., some organization 0:51:56.410,0:52:02.449 that is independent.[br]Mic 1: Are you the first to think about 0:52:02.449,0:52:06.509 this or are there existing initiatives?[br]Jutta: There are actually existing 0:52:06.509,0:52:14.219 initiatives. I have been in contact with[br]the Forest Stewardship Council and they 0:52:14.219,0:52:23.219 are actually starting to sample their[br]concessions and have begun to build up the 0:52:23.219,0:52:28.730 samples. They work together with Kew[br]Botanical Gardens and the U.S. Forest 0:52:28.730,0:52:37.589 Service. 
And right now they're analyzing[br]the samples using isotopes, which is 0:52:37.589,0:52:45.579 another method which is very powerful and[br]can also produce geographic information. 0:52:45.579,0:53:00.710 And so, yeah, so people are moving in this[br]direction. So, yeah, I think the idea is out 0:53:00.710,0:53:05.839 there, we just have to start and we have[br]to really do it and provide one 0:53:05.839,0:53:13.210 infrastructure so that we can combine, for[br]example, morphological data, isotope data 0:53:13.210,0:53:18.329 and genomic data into one dataset, which[br]will increase our resolution and our 0:53:18.329,0:53:23.980 reliability.[br]Angel: Okay. Microphone number two, 0:53:23.980,0:53:27.069 please.[br]Mic 2: Thank you for your valuable talk. 0:53:27.069,0:53:32.660 My question would be: you started your talk[br]with the possible decrease of leaf beetles. 0:53:32.660,0:53:37.100 In the data set you showed on slide number[br]six there was an increase in leaf beetle 0:53:37.100,0:53:41.930 population until the 70s, something about[br]that. Is there a possible explanation for 0:53:41.930,0:53:49.869 that?[br]Jutta: Yeah, I believe there is, because 0:53:49.869,0:53:55.359 people started to observe leaf beetles much[br]more systematically. So it's a sampling 0:53:55.359,0:54:05.869 effort. And also at that time the people -[br]so it's a multi-person collaboration that 0:54:05.869,0:54:12.369 actually has assembled this dataset, so the[br]people who are part of this collaboration, 0:54:12.369,0:54:16.949 they added their own private data sets. And[br]that's why you have an increase, I think. 0:54:16.949,0:54:23.509 While for the people from the nineteen[br]hundreds, nineteen ten, you can only 0:54:23.509,0:54:29.009 use the data that is available in[br]publications and samples in museums or in 0:54:29.009,0:54:33.289 scientific collections. I think that is[br]the reason why you have the sharp 0:54:33.289,0:54:35.589 increase.[br]Mic 2: Thank you. 
0:54:35.589,0:54:38.750 Angel: So we have another question from[br]microphone number two. 0:54:38.750,0:54:44.459 Mic 2: Thank you for your fine talk.[br]Excuse me. Maybe my question is a bit off 0:54:44.459,0:54:51.730 topic. Do you think the methods and roles[br]that you identified in your talk could be 0:54:51.730,0:54:59.880 transferred to the assessment of raw[br]materials? I'm thinking about metals. 0:54:59.880,0:55:09.349 Jutta: Maybe the data infrastructure, like[br]if you wanted to collect raw metals or 0:55:09.349,0:55:16.471 materials from all over the world and...[br]assemble a scientific collection and to 0:55:16.471,0:55:22.390 have kind of a reference dataset, that[br]might work, actually. But the genomics 0:55:22.390,0:55:29.170 obviously won't. So for that part you[br]would need to use different methods from 0:55:29.170,0:55:36.010 physics, obviously. But actually the[br]infrastructure, certain parts will be 0:55:36.010,0:55:40.249 quite similar. I think so, yes.[br]Angel: So we have one more question from 0:55:40.249,0:55:43.420 the Internet.[br]Signal-Angel: Who contracts a 0:55:43.420,0:55:51.619 freelance evolutionary biologist? Can you[br]give an example of the kind of work you 0:55:51.619,0:56:01.429 proposed?[br]Jutta: So I see this gap between science 0:56:01.429,0:56:07.739 and applications, that we need these[br]applications and there's a huge potential 0:56:07.739,0:56:18.150 for these applications. We know that[br]illegal logging, and that is my background, 0:56:18.150,0:56:23.769 but it doesn't seem to be much different, for[br]example, in marine fisheries. We know that 0:56:23.769,0:56:29.730 there is this huge amount of illegal[br]logging and timber trade going on. And we 0:56:29.730,0:56:39.670 need to have some assays actually that[br]have the power to detect illegally traded 0:56:39.670,0:56:49.789 timber. 
So I think there is a huge need[br]for these kinds of methods and 0:56:49.789,0:57:00.869 organizations who are interested in these[br]kinds of methods. There are governments, there are 0:57:00.869,0:57:12.719 companies, NGOs, customs, Interpol. So,[br]yeah. 0:57:12.719,0:57:19.700 Angel: Do we have any other questions? So[br]thank you again, Jutta, for your talk and 0:57:19.700,0:57:23.739 the valuable work you're doing. Please[br]give a warm round of applause to Jutta. 0:57:23.739,0:57:29.009 Applause 0:57:29.009,0:57:33.599 36C3 postroll music 0:57:33.599,0:57:56.000 Subtitles created by c3subtitles.de[br]in the year 2020. Join, and help us!