0:00:00.000,0:00:20.130
36C3 preroll music
0:00:20.130,0:00:25.169
Angel: Right now I'd like to welcome our[br]first speaker on stage. The talk will be
0:00:25.169,0:00:30.800
about protecting the wild and I'll hand[br]over to her. Please give her a warm round
0:00:30.800,0:00:32.870
of applause.
0:00:32.870,0:00:34.860
Applause
0:00:34.860,0:00:43.920
Jutta Buschbom: Thank you very much for[br]the introduction. My name is Jutta
0:00:43.920,0:00:52.110
Buschbom, I'm an evolutionary biologist.[br]That is my background. I did my PhD at
0:00:52.110,0:00:57.290
the University of Chicago working on[br]little fungi that live in symbiosis with
0:00:57.290,0:01:05.979
algae and form colorful rocks, colorful[br]crust on rocks. I then did a Postdoc in
0:01:05.979,0:01:12.240
bioinformatics and after that moved back[br]into organismal biology, working in forest
0:01:12.240,0:01:19.560
genetics. And in the ten years I worked in[br]forest genetics for the first time I
0:01:19.560,0:01:26.049
encountered questions that were with[br]regard to application, and I found out
0:01:26.049,0:01:37.359
that actually moving from research to[br]application is not trivial. So what I'm
0:01:37.359,0:01:45.869
going to present is a high tech way using[br]genomic data to protect biodiversity in a
0:01:45.869,0:01:51.939
way that you can actually reach[br]application and use conservation genomic
0:01:51.939,0:02:02.600
tools. So this summer the draft of the[br]report of the Intergovernmental Science
0:02:02.600,0:02:12.319
Policy Platform on Biodiversity and[br]Ecosystem Services came out and its
0:02:12.319,0:02:19.930
results were quite alarming. It stated that[br]around a million animal and plant species
0:02:19.930,0:02:27.330
are currently threatened and of those...half[br]of those species are already dead species
0:02:27.330,0:02:33.450
walking. So due to the destruction[br]of their habitats or habitat deterioration,
0:02:33.450,0:02:42.950
they are not able to reproduce in a[br]sustainable way anymore. A third of the
0:02:42.950,0:02:51.170
total species extinction risk to date[br]has arisen in the last 25 years. And just
0:02:51.170,0:03:01.450
to give you an idea about the relation we[br]are talking about...currently the rate of
0:03:01.450,0:03:07.680
extinction risk is already at least tens to[br]hundreds of times higher than it has averaged
0:03:07.680,0:03:13.130
over the past 10 million years. And within[br]these 10 million years there were the Ice
0:03:13.130,0:03:23.260
Ages, for example. And most of the[br]extinction risk is due to the fact of land
0:03:23.260,0:03:36.190
and sea use change. The report also talks,[br]even talks about that we already seem to
0:03:36.190,0:03:42.420
have transgressed a proposed precautionary[br]planetary boundary, which means within the
0:03:42.420,0:03:48.370
boundary we have a stable biological[br]system. But having transgressed it, we
0:03:48.370,0:03:55.430
might already be in a transition to a new[br]state, and we have no way to find out what
0:03:55.430,0:04:05.240
this state is going to look like. So all[br]of these facts that the report is stating
0:04:05.240,0:04:14.730
are actually pretty negative. And I was[br]quite happy to read that they also present
0:04:14.730,0:04:20.699
that there are actually people who do[br]better than most of us. And they point out
0:04:20.699,0:04:27.810
that many practices of indigenous people[br]and local communities actually conserve
0:04:27.810,0:04:38.350
and sustain wild and domesticated[br]biodiversity quite well. Today, a higher
0:04:38.350,0:04:44.600
proportion of the remaining terrestrial[br]biodiversity lies in areas managed and
0:04:44.600,0:04:52.890
held by indigenous people. And these[br]ecosystems are more intact and less
0:04:52.890,0:05:01.770
declining, less rapidly declining. So we[br]have examples of lifestyles that actually
0:05:01.770,0:05:10.530
do better than most of us. And I know the[br]solutions won't be simple and it won't be
0:05:10.530,0:05:22.330
easy to get there but we can look to what[br]these people do better than we do. All of
0:05:22.330,0:05:27.930
this sounds...it's a global report and it[br]sounds kind of like far away, like
0:05:27.930,0:05:35.990
probably somewhere in the tropics, but[br]actually threats to biodiversity happen
0:05:35.990,0:05:45.400
also directly in front of our own front[br]doors. This summer a paper came out from
0:05:45.400,0:05:52.800
two colleagues from the University of[br]Greifswald, who had analyzed the long term
0:05:52.800,0:05:58.490
data set about leaf beetles. And they were[br]asking if we already have a decline of
0:05:58.490,0:06:08.240
leaf beetles in Central Europe. So they[br]compiled long term data sets of leaf
0:06:08.240,0:06:19.140
beetle observations for Central Europe,[br]starting from 1900 to 2017, so
0:06:19.140,0:06:27.010
spanning a hundred and twenty years. And[br]what they find is that systematic reports
0:06:27.010,0:06:36.270
on leaf beetles and leaf beetle[br]observations are increasing during this
0:06:36.270,0:06:45.310
time interval, time span. But despite the[br]fact that we have...like in the last two
0:06:45.310,0:06:53.270
decades, we had very high numbers of[br]reports and observations for leaf beetles,
0:06:53.270,0:07:00.100
the number of species, the orange line, is[br]declining. It's slightly declining. But
0:07:00.100,0:07:06.010
the question is, is this real or not? And[br]what was most worrisome to the authors is
0:07:06.010,0:07:15.110
that in the data set, the number of[br]species here in orange that were having
0:07:15.110,0:07:21.930
more reports was declining, while the[br]number of species that showed fewer reports
0:07:21.930,0:07:33.930
than before is expanding. So these kinds of[br]long-term datasets are very hard to
0:07:33.930,0:07:41.310
interpret and many factors can contribute[br]to those patterns. And it's not clear if
0:07:41.310,0:07:48.310
this pattern is statistically significant.[br]But if you take a step back and consider
0:07:48.310,0:07:54.470
your background knowledge, your prior[br]knowledge about the state of the world, do
0:07:54.470,0:08:02.760
you say, like, what does the current state[br]look like? Does it look good or rather
0:08:02.760,0:08:16.910
worrisome? And then with that knowledge,[br]tell me that these results are an
0:08:16.910,0:08:30.150
artifact or a bias. I'm worried that once[br]we have a statistically significant signal in
0:08:30.150,0:08:41.789
this dataset, it will be already too late.[br]So right now, I've been talking about leaf
0:08:41.789,0:08:49.639
beetles and beetles are the largest group[br]within insects with about 400,000 species.
0:08:49.639,0:08:56.200
Leaf beetles are a large family of about[br]50,000 species which are worldwide
0:08:56.200,0:09:05.080
distributed. And here in Germany, we have[br]over 470 leaf beetle species. So how do we
0:09:05.080,0:09:09.740
actually know how many species there are[br]and who actually counted all these
0:09:09.740,0:09:15.960
species? And that is just the task of[br]taxonomists. Taxonomy is the science of
0:09:15.960,0:09:21.600
naming and defining, including[br]circumscribing and classifying groups of
0:09:21.600,0:09:32.020
biological organisms on the basis of[br]shared characters. So one could have the
0:09:32.020,0:09:37.560
picture of some woman with a funny hat[br]running over a meadow catching like
0:09:37.560,0:09:44.480
butterflies or some guy mushroom hunter[br]crawling through the forest trying to find
0:09:44.480,0:09:52.380
mushrooms. And it's true, as biodiversity[br]scientists we spend a lot of time outdoors
0:09:52.380,0:10:02.290
and yeah...on the other hand, taxonomy[br]is a high-tech science today. So
0:10:02.290,0:10:11.050
taxonomists actually take up new[br]technological tools and developments to
0:10:11.050,0:10:17.270
help them identify and describe,[br]understand the species. So taxonomists
0:10:17.270,0:10:25.110
actually are often experts in, for[br]example, microscopy, mathematics,
0:10:25.110,0:10:36.850
biochemistry, even proteomics and[br]genomics. So throughout the talk, I'm
0:10:36.850,0:10:41.520
going to compile this list of people and[br]experts we're going to need to protect
0:10:41.520,0:10:49.360
biodiversity if we want to do this on the[br]basis of genetic data. Right now, the list
0:10:49.360,0:10:56.430
is quite empty. The first entry is[br]taxonomists, but that will change quickly
0:10:56.430,0:11:06.260
and taxonomists are a subgroup of[br]evolutionary biologists mostly. So I told
0:11:06.260,0:11:15.560
you that taxonomists and biodiversity[br]scientists take up technology and...so as
0:11:15.560,0:11:23.610
soon as computers came about and the[br]internet started people started to use
0:11:23.610,0:11:32.420
that to compile information about species,[br]and today we have several global resources
0:11:32.420,0:11:40.640
available at the species level and above[br]the species level. So we biodiversity
0:11:40.640,0:11:45.720
scientists were among the first who[br]defined biodiversity information
0:11:45.720,0:11:56.690
standards. We have a global catalog of[br]life. A list of all named species. The
0:11:56.690,0:12:01.810
Global Biodiversity Information Facility[br]has an aim to bring together information
0:12:01.810,0:12:08.630
from different sources and they are[br]compiling, producing this wonderful map.
0:12:08.630,0:12:13.940
This is leaf beetles, all the records[br]about leaf beetles that we have in the
0:12:13.940,0:12:22.200
world. And it looks as if leaf[br]beetles are highly associated with third
0:12:22.200,0:12:29.580
world economies. However that clearly is[br]an artifact and it just shows that we need
0:12:29.580,0:12:34.560
many more taxonomists and biodiversity[br]scientists all over the world to find and
0:12:34.560,0:12:45.300
identify leaf beetles. So we also need[br]biodiversity informaticians to help us
0:12:45.300,0:12:52.050
compile global lists and distribute[br]knowledge. So far I have been talking
0:12:52.050,0:12:57.890
about species which is a simplification.[br]The question is what is...what are species
0:12:57.890,0:13:03.400
actually? And so we need to talk about[br]genetic diversity within and between
0:13:03.400,0:13:16.519
species. And I'm going to do so using[br]gulls, which most of us might know. Here
0:13:16.519,0:13:21.670
in Europe, we have two large gulls of the[br]genus Larus. One is in the front, the
0:13:21.670,0:13:31.070
lighter gray is our Silbermöwe. And in the[br]back is our Heringsmöwe, the dark one. And
0:13:31.070,0:13:35.740
I'm going to use German names because the[br]English names go crosswise and that's
0:13:35.740,0:13:43.160
completely confusing. So I will stick with[br]the German names. Here in Europe these two
0:13:43.160,0:13:48.450
species seem to be really fine species[br]because they barely interbreed, so they
0:13:48.450,0:13:55.680
don't hybridize. However, if you take a[br]step back and look at the genus in
0:13:55.680,0:14:03.120
general, you see that the species of the[br]genus are distributed kind of ringwise
0:14:03.120,0:14:14.510
around the Arctic. And so the idea is[br]that, say during the Ice Age, all of this
0:14:14.510,0:14:22.959
area was glaciated and the gulls retreated[br]to a refuge here near the Caspian Sea. And
0:14:22.959,0:14:28.110
then after the ice retreated, the gulls[br]moved back north. One branch moved into
0:14:28.110,0:14:34.350
Europe forming our Heringsmöwe and[br]another branch then moved counterclockwise
0:14:34.350,0:14:41.019
around the Arctic, producing different[br]morphotypes, different species across the
0:14:41.019,0:14:49.450
Bering Strait and then into North America.[br]There the dark blue one is...I'm
0:14:49.450,0:14:58.730
simplifying, the equivalent of our[br]European Silbermöwe, the American
0:14:58.730,0:15:03.830
Silbermöwe. Then the idea is that some[br]individuals crossed back to Europe and
0:15:03.830,0:15:14.800
formed our European Silbermöwe. And while[br]all of these species here are
0:15:14.800,0:15:21.769
interbreeding, so they hybridize, only[br]where this ring is closed those two species
0:15:21.769,0:15:26.720
don't interbreed anymore. And the big[br]question is, are we actually dealing with
0:15:26.720,0:15:34.230
one single species or are we dealing with[br]different species that just happen to
0:15:34.230,0:15:41.079
hybridize more or less? The question is[br]not trivial because it has consequences
0:15:41.079,0:15:48.740
for protection. If we are dealing with one[br]single species, all the gulls in Eurasia
0:15:48.740,0:15:53.010
could go extinct and it wouldn't matter[br]because we still would have the gulls in
0:15:53.010,0:15:58.540
North America. However, if we have[br]different species in all of these areas,
0:15:58.540,0:16:04.709
we would need to protect individuals or[br]the species on a regional level and
0:16:04.709,0:16:17.279
protect all of these different species. So[br]to investigate this question about: Do we
0:16:17.279,0:16:23.589
have different species? And what were the[br]evolutionary processes and histories that
0:16:23.589,0:16:31.100
brought about the species? A group of[br]scientists investigated that using DNA
0:16:31.100,0:16:39.930
sequences. And on the left, you have the[br]model, the theoretical model of the ring
0:16:39.930,0:16:46.380
species. And here on the right you have[br]reality. And the scientists found that the
0:16:46.380,0:16:51.630
reality is always much more complex. So,[br]for example, they found two refuges or
0:16:51.630,0:16:58.430
they proposed two refuges. But what they[br]found was that genetic diversity was
0:16:58.430,0:17:07.351
correlated with those species or[br]morphotypes. So what that also means is
0:17:07.351,0:17:15.730
that genetic diversity is correlated with[br]geographic origin. What we learn from this
0:17:15.730,0:17:24.360
type of analysis is we learn about[br]evolutionary processes and history, about
0:17:24.360,0:17:30.170
variability and differentiation, about[br]gene flow and migration, about speciation
0:17:30.170,0:17:37.590
processes. All of that we need to understand[br]our species, which will allow us to
0:17:37.590,0:17:43.440
protect them. So we need evolutionary[br]biologists who do phylogenetics and
0:17:43.440,0:17:59.030
population genetics. So once we found out[br]that one can use genetic diversity to
0:17:59.030,0:18:07.130
infer geographic origin because genetic[br]diversity is correlated with geography,
0:18:07.130,0:18:18.500
people immediately said: 'Okay, we can use[br]it for conservation applications.' And
0:18:18.500,0:18:24.049
it's also...we learned that we...often it[br]is unclear what is a species, species
0:18:24.049,0:18:32.559
boundaries are unclear and some species[br]have huge distribution ranges with
0:18:32.559,0:18:37.340
different clusters of variability within[br]this huge range. So we know that we need
0:18:37.340,0:18:42.941
to protect within species genetic[br]diversity, which means that we need to
0:18:42.941,0:18:50.650
understand within species population[br]structure and we need to build useful and
0:18:50.650,0:18:58.919
reliable models of population structure.[br]These models are actually required for all
0:18:58.919,0:19:03.740
of our applications. They are required for[br]monitoring, for example, for conservation
0:19:03.740,0:19:11.890
strategies, for functional adaptation and[br]adaptability, questions of productivity
0:19:11.890,0:19:19.190
of different provenances, its impact on[br]management regimes, breeding strategies,
0:19:19.190,0:19:27.610
and also for enforcement applications.[br]From the studies I showed you before with
0:19:27.610,0:19:34.110
the gulls we also know that we need to[br]approach the question of a population
0:19:34.110,0:19:47.070
structure on a distribution range wide[br]scale. So here's the map produced by
0:19:47.070,0:19:53.630
EUFORGEN, the European Network for forest[br]reproductive material for one of our
0:19:53.630,0:20:02.000
native oaks, the sessile oak. And the dots[br]are the sites for genetic conservation
0:20:02.000,0:20:12.120
units. And so that is one strategy how to[br]represent within species genetic diversity
0:20:12.120,0:20:22.020
and how to sample it. And you can see this[br]is a hypothetical example, but we likely
0:20:22.020,0:20:32.460
will see a gradient from west to east or[br]might see one at this scale. Then once we
0:20:32.460,0:20:37.800
have these kinds of global data sets, we[br]can go to the fine scale and maybe, for
0:20:37.800,0:20:44.100
example, do a national genetic monitoring.[br]And we will find much finer scale
0:20:44.100,0:20:51.210
gradients. We also will find, especially[br]for forest trees, outliers, so stands
0:20:51.210,0:20:59.150
that don't fit the usual pattern. And that[br]is because forest reproductive material
0:20:59.150,0:21:07.660
has been moved around a lot. And so these[br]lighter or darker dots are material that
0:21:07.660,0:21:16.150
was moved to Germany from the outside. And[br]we only will identify these outliers if we
0:21:16.150,0:21:21.380
have the whole reference dataset. If we[br]don't have the whole reference dataset, we
0:21:21.380,0:21:28.799
might not identify these outliers - stands[br]with a different history. Or in a worst
0:21:28.799,0:21:34.280
case, these outliers might actually bias[br]our gradients. And we are always talking
0:21:34.280,0:21:42.770
about very slight gradients. So it's easy[br]to bias these gradients, dilute them, so
0:21:42.770,0:21:50.710
we actually won't get the results we need.[br]To compile these kinds of reference
0:21:50.710,0:21:57.850
datasets takes huge collaborative efforts[br]because people need to go out into the
0:21:57.850,0:22:04.500
field and collect the reference samples[br]and that might be scientists, that might
0:22:04.500,0:22:13.669
be people from local communities, citizen[br]scientists, managers, owners, government
0:22:13.669,0:22:20.179
officials who provide background[br]information, maps, distribution
0:22:20.179,0:22:27.929
information and also in many parts of the[br]world might protect the people who are
0:22:27.929,0:22:34.510
actually collecting the samples. And it[br]might be conservation activists and NGOs.
0:22:34.510,0:22:41.150
So once the samples have been collected[br]they need to be stored somewhere for the
0:22:41.150,0:22:51.150
long term and the information needs to be[br]databased. And that is the work of
0:22:51.150,0:22:57.430
scientific collections, which are mostly[br]at natural history museums and there the
0:22:57.430,0:23:04.460
samples are processed. They're organized[br]in ways that you can find them again. All
0:23:04.460,0:23:09.680
the metadata is entered, which curators[br]do, collection managers, preparators,
0:23:09.680,0:23:17.030
technical staff at the scientific[br]collections. So once we have these kinds of
0:23:17.030,0:23:24.910
data sets, large scale data sets, what are[br]we actually doing with them? So the
0:23:24.910,0:23:32.514
foundation for all of our applications is[br]population structure and there
0:23:32.514,0:23:42.370
specifically population assignment. So the[br]process is this: first, we decide on a
0:23:42.370,0:23:46.660
question and design our project[br]accordingly that we can answer the
0:23:46.660,0:23:51.940
question. Then we need to infer the[br]population structure model and optimize
0:23:51.940,0:23:57.480
it. In the next step we need to check if the[br]model actually is good enough for
0:23:57.480,0:24:03.040
application because we might have found[br]the best model, but it might still not be
0:24:03.040,0:24:07.480
good enough for application. So we need to[br]test that. And that is the step of
0:24:07.480,0:24:12.831
population assignment or predictive[br]assignment. And then in the end, we want
0:24:12.831,0:24:19.330
to test our hypothesis. Are the two stands[br]different or does an individual come from
0:24:19.330,0:24:31.059
stand A or from stand B? And here we[br]identify error rates and accuracy. So this
0:24:31.059,0:24:38.890
whole process is very statistical. And so[br]the analysis of these reference data they
0:24:38.890,0:24:48.240
need to be accompanied by biostatisticians[br]who can tell us how to analyze our data.
0:24:48.240,0:24:55.289
So what is the state-of-the-art right now?[br]What kind of geographic resolution do we
0:24:55.289,0:25:02.990
actually get for non-model species[br]currently? And I'm going to present the
0:25:02.990,0:25:09.600
example of an African timber tree[br]species, which is a very valuable timber.
0:25:09.600,0:25:18.110
It's one example but basically all results[br]for species that have large distribution
0:25:18.110,0:25:26.059
ranges and are continuously distributed[br]and are also long-lived, are very similar.
0:25:26.059,0:25:33.460
So these kinds of results seem to be species[br]independent. So the species are Milicia
0:25:33.460,0:25:40.370
regia and excelsa, African teak, which[br]cannot be grown in plantations for timber
0:25:40.370,0:25:51.159
quality. So it is harvested unsustainably[br]from natural forests. It's distributed in
0:25:51.159,0:26:00.580
West, Central and East Africa. Here's a[br]black rectangle. And a group of a dozen
0:26:00.580,0:26:06.289
scientists got together and they actually[br]sampled a reference dataset for these two
0:26:06.289,0:26:18.659
species. It's over 400 samples, they[br]analyzed four marker systems, resulting in
0:26:18.659,0:26:24.570
a total of something like 100 markers,[br]genetic markers, and then they optimized
0:26:24.570,0:26:32.660
the population model and used different[br]parameter settings. And we're going to
0:26:32.660,0:26:40.080
concentrate here on the best solution that[br]they found. And basically this rectangle
0:26:40.080,0:26:47.870
here is the black one over here. So the[br]resolution is... they found population
0:26:47.870,0:26:54.690
structure with clear clusters. So the[br]populations and the species from West
0:26:54.690,0:27:01.490
Africa can be distinguished from those[br]populations in Central Africa. And the
0:27:01.490,0:27:08.460
ones in East Africa can be differentiated.[br]So that is really good. So we have
0:27:08.460,0:27:13.480
population structure. We know there is[br]signal. The problem is still that our
0:27:13.480,0:27:21.510
resolution is much lower than we would[br]need to have it because we basically need
0:27:21.510,0:27:32.090
resolution at least on a country level,[br]because most of the laws are national. So
0:27:32.090,0:27:41.770
it might be legal to harvest a tree in one[br]country, but not in another country. So we
0:27:41.770,0:27:49.319
need to get our resolution down to country[br]level or even to regional level. If you
0:27:49.319,0:27:52.361
want to distinguish, was the tree[br]harvested in a national park in a
0:27:52.361,0:28:02.289
protected area or outside in a managed[br]forest. And when as biodiversity
0:28:02.289,0:28:10.740
scientists, we don't know how to continue,[br]one thing is to look for what people do
0:28:10.740,0:28:17.179
with model organisms and specifically what[br]people do in human population genomics
0:28:17.179,0:28:24.179
because there thousands of population[br]geneticists are working and there is a
0:28:24.179,0:28:28.210
completely different funding background[br]due to the interest of the medical and the
0:28:28.210,0:28:39.119
pharma industry. So they are always[br]advanced. What we can learn from there,
0:28:39.119,0:28:46.660
from human population genomics, is[br]that we need two features. One is we
0:28:46.660,0:28:53.570
already know that we need distribution[br]wide sampling, which provides a spatial
0:28:53.570,0:28:59.950
context. The second feature is that we[br]need genome wide sequencing, preferably
0:28:59.950,0:29:09.210
genome sequencing, which provides us steps[br]in time because our genomes are archives
0:29:09.210,0:29:14.710
of our evolutionary history. They are[br]records of all the processes and events
0:29:14.710,0:29:21.429
and these steps in time then translate[br]also into resolution. Once we have these
0:29:21.429,0:29:30.150
two features, actually these reference[br]datasets open Pandora's box. Suddenly we
0:29:30.150,0:29:36.390
can ask all kinds of questions and[br]objectives, even those that we still don't
0:29:36.390,0:29:47.010
know. We can develop all kinds of[br]applications which is done for humans.
0:29:47.010,0:29:59.400
Currently, there are at least four global[br]datasets on human diversity. These are
0:29:59.400,0:30:08.860
very widely reused and these big datasets[br]- so they are big data with regard to the
0:30:08.860,0:30:18.850
number of samples and also the genomes or[br]the genome representations and this
0:30:18.850,0:30:26.470
results in very information rich data[br]which initiates analytical development so
0:30:26.470,0:30:33.799
people continuously are developing new[br]statistical methods. And right now, a new
0:30:33.799,0:30:42.330
wave is coming in of these methods. So[br]once you have these global datasets,
0:30:42.330,0:30:47.500
people in human population[br]genomics started to do these intense
0:30:47.500,0:30:56.299
regional samplings. And this is the[br]example of the United Kingdom Biobank.
0:30:56.299,0:31:02.789
It's a project with 500,000 volunteers,[br]they are all UK citizens from all over the
0:31:02.789,0:31:13.982
islands. And each individual was genotyped[br]in a wet lab for 820,000 markers. That's
0:31:13.982,0:31:19.620
completely... I mean, that's a different[br]number than the 100 or 1000...in
0:31:19.620,0:31:26.409
biodiversity science we normally[br]analyse a maximum of a few tens of thousands of
0:31:26.409,0:31:36.220
markers. So that's a completely different[br]number. But then statistical geneticists
0:31:36.220,0:31:47.140
come. They do some weird and wonderful[br]voodoo and they derive 96 million markers
0:31:47.140,0:31:53.460
per genome that is per individual from[br]these 820,000 markers that were produced
0:31:53.460,0:32:00.630
in the lab. So that's a hundredfold[br]increase. And once you have this kind of
0:32:00.630,0:32:07.510
dataset for a genome, you suddenly, or you[br]finally, get country level and within
0:32:07.510,0:32:18.970
country level resolution. So these panels[br]are examples. So the first panel shows
0:32:18.970,0:32:25.980
individuals who were born in Edinburgh and[br]the question was "Where were people born
0:32:25.980,0:32:32.419
who had a similar ancestral background,[br]genetic background?". And what they found
0:32:32.419,0:32:41.980
was that they were from all over Scotland and[br]Northern Ireland. Northern Yorkshire was
0:32:41.980,0:32:50.250
even more local. So people from Yorkshire[br]don't seem to get around a lot. For London
0:32:50.250,0:32:54.090
the situation is completely different.[br]That is what we would expect because
0:32:54.090,0:32:59.580
London is a people magnet. People move[br]there all the time. They meet there, they
0:32:59.580,0:33:05.700
have children and the kids born in London,[br]their genetic ancestry has nothing to do
0:33:05.700,0:33:12.760
with London. It's from all over the place,[br]from the British Isles and the world. So
0:33:12.760,0:33:21.600
that's why the colors are strongly[br]dissolved. So this study came out also
0:33:21.600,0:33:26.100
this summer. And it's the first time that[br]I have seen that we actually really can
0:33:26.100,0:33:36.580
achieve regional resolution. And I find[br]this possibility for biodiversity science
0:33:36.580,0:33:46.820
really exciting. So it was made possible[br]by very sophisticated statistical
0:33:46.820,0:33:51.890
approaches which are able to analyze[br]genetic data from highly complex
0:33:51.890,0:33:59.450
evolutionary and ecological systems. And[br]at the same time these analyses are able
0:33:59.450,0:34:04.910
to handle big data. We're talking about[br]gigabytes and terabytes of data and
0:34:04.910,0:34:13.810
results. So statistical geneticists are[br]developing new methods of data
0:34:13.810,0:34:20.309
representation to handle this amount of[br]data. And then we are able to sufficiently
0:34:20.309,0:34:25.520
extract the signal for a very specific[br]question from data which have a very low
0:34:25.520,0:34:36.919
signal to noise ratio. So to get there, we[br]need many experts and specialists. So we
0:34:36.919,0:34:41.659
need statistical geneticists, big data[br]experts who also might contribute machine
0:34:41.659,0:34:49.299
learning expertise. We need molecular[br]biologists who know how to sequence
0:34:49.299,0:34:54.259
complex genomes. We need[br]bioinformaticians with expertise in
0:34:54.259,0:35:05.010
genomics for assembly, annotation and[br]alignment of genomic sequences. The result
0:35:05.010,0:35:12.569
is actually this: This is the author list[br]for the 1000 Genomes Project
0:35:12.569,0:35:20.380
reference data set, and I don't expect you[br]to be able to read it, but the bold type
0:35:20.380,0:35:25.539
is of interest because it shows all the[br]different tasks that are necessary to
0:35:25.539,0:35:36.140
produce a standardized and highly cleaned[br]reference dataset. So the whole author
0:35:36.140,0:35:41.880
list is something like 1.5 pages long and[br]even considering that some authors will
0:35:41.880,0:35:51.130
have contributed to several tasks. The[br]publications for reference datasets mostly
0:35:51.130,0:35:57.079
have author lists that are far over 50[br]people. So they are huge collaborative
0:35:57.079,0:36:05.219
efforts. Now we take the step into[br]biodiversity science. Here these are eight
0:36:05.219,0:36:13.440
gastrotrichs, they are little worm-like...[br]organisms that live in the sediments of
0:36:13.440,0:36:23.069
freshwater lakes and marine sediment. They[br]are in general a couple of hundred
0:36:23.069,0:36:29.569
micrometers large. And I don't have any[br]numbers, but my guess would be that maybe
0:36:29.569,0:36:38.640
worldwide, a hundred to a thousand people[br]actually work on these species. There are
0:36:38.640,0:36:44.829
800 species of gastrotrichs. So let's say[br]there's one, two, maybe three experts per
0:36:44.829,0:36:52.240
species for these organisms. So how are[br]these three people going to manage all
0:36:52.240,0:37:01.420
these tasks to produce a reference[br]dataset? You might say, well, it's
0:37:01.420,0:37:05.209
gastrotrichs, I mean, have never heard[br]about them. Maybe they are not so
0:37:05.209,0:37:08.349
important. Maybe you don't need[br]reference data sets, but actually some of
0:37:08.349,0:37:17.579
those species are bioindicators for water[br]quality. So what we observe right now is a
0:37:17.579,0:37:27.510
gap for biodiversity conservation. In[br]model organisms, we have Pandora's Box
0:37:27.510,0:37:34.630
open. We have all the statistical analyses[br]at our hands to analyze our data sets.
0:37:34.630,0:37:39.709
However, in non-model organisms, we are[br]still stuck with summary statistics that
0:37:39.709,0:37:46.839
don't provide us the resolution that we[br]need. And we know that to close this gap,
0:37:46.839,0:37:52.599
even for a single species, it's a huge[br]effort. But at the same time, we have over
0:37:52.599,0:38:03.560
35,000 species listed by scientists which[br]already need effective protection now. So
0:38:03.560,0:38:10.008
we need to find a way to close this gap[br]and actually move in this direction. And
0:38:10.008,0:38:19.940
the good thing is, so all of this... in[br]biodiversity science, in academia, and we
0:38:19.940,0:38:24.890
need to make the transition over the[br]conservation genomics gap into the big
0:38:24.890,0:38:32.130
loop of real world conservation tasks. And[br]the good thing is we already know what we
0:38:32.130,0:38:37.940
have to do. So we need to have reference[br]data sets, distribution range wide. We
0:38:37.940,0:38:43.959
need to have statistics. And it's going to[br]be big data. So we need collection
0:38:43.959,0:38:54.140
management, data management and an[br]analysis environment. So looking at
0:38:54.140,0:38:59.880
different ingredients or different steps,[br]the first thing we need is a general data
0:38:59.880,0:39:05.269
infrastructure for global diversity of[br]reference data sets that actually can be
0:39:05.269,0:39:11.779
used across species for preferably as many[br]species as possible and provide a working
0:39:11.779,0:39:19.749
environment for biodiversity scientists[br]and experts. It should be user friendly so
0:39:19.749,0:39:25.759
it can be used by scientists, but also[br]that people from local communities and
0:39:25.759,0:39:33.489
citizen scientists can add their[br]observation data and their data into this
0:39:33.489,0:39:41.339
data infrastructure. I have listed quite a[br]lot of features that these kind of
0:39:41.339,0:39:48.400
infrastructures should have. And I'm going[br]to argue that these features are not some
0:39:48.400,0:40:02.609
nice to have, but actually some must have.[br]Because our goal is always application. So
0:40:02.609,0:40:13.279
we need developers, managers and curators[br]for data infrastructures. Since our goal
0:40:13.279,0:40:30.900
is application, our main features are[br]quality control and error reduction. These
0:40:30.900,0:40:38.880
are the basis. So that our conservation[br]tools can be robustly and reliably applied
0:40:38.880,0:40:46.459
under real world operating conditions. And[br]the way to achieve quality and error
0:40:46.459,0:40:52.759
reduction is through chains of custody. So[br]it means that from project design, from
0:40:52.759,0:40:58.299
the questions through all the steps that[br]are necessary to produce a reference data
0:40:58.299,0:41:08.219
set and then...so from sample collection,[br]genomic statistical analysis down to
0:41:08.219,0:41:15.599
application. These steps need to be[br]documented and standardized. They need to
0:41:15.599,0:41:22.239
be, each one of them needs to be validated[br]and reproducible. They should be modular
0:41:22.239,0:41:28.999
so they can be user friendly. And the[br]whole chain of custody needs to be
0:41:28.999,0:41:40.690
scalable. So if our chains of custody have[br]these characteristics, we actually will
0:41:40.690,0:41:51.390
have tools that will work in everyday[br]life. So we need professional developers
0:41:51.390,0:41:59.519
and programmers who are able to produce[br]this very collaborative software. We
0:41:59.519,0:42:06.130
need free and open source experts. So we[br]always can ensure that our code and that
0:42:06.130,0:42:13.859
our infrastructures are still intact and[br]we can check them. And I'm a biologist, I
0:42:13.859,0:42:19.390
don't have any background in hardware, but[br]I've heard a couple of talks here in the
0:42:19.390,0:42:26.099
conference about Green IT. And I have[br]the feeling we should have people who know
0:42:26.099,0:42:33.849
hardware and software and know how to[br]develop these high tech tools in a way
0:42:33.849,0:42:38.450
that is sustainable, so that by developing these[br]tools, we don't use more resources than we
0:42:38.450,0:42:48.940
are trying to protect. So I've shown all[br]these features and characteristics that
0:42:48.940,0:42:57.459
the software should have. And I'm arguing[br]that these features are necessary because
0:42:57.459,0:43:04.819
of the reality we find ourselves in. It is one of[br]rising over-exploitation and destruction
0:43:04.819,0:43:19.799
of nature. So the extent of environmental[br]crimes is up in the billions. All
0:43:19.799,0:43:29.029
environmental crime together, the green[br]bubbles are only second to drug associated
0:43:29.029,0:43:35.489
crimes. They are up there with[br]counterfeiting or human trafficking. So
0:43:35.489,0:43:45.479
these are multi-billion enterprises. They[br]are often transnational and industries
0:43:45.479,0:44:02.019
with huge profits. So if there's some[br]crime, some mafia boss, some criminal
0:44:02.019,0:44:09.539
manager who just bribed a government[br]official somewhere in the neck of the
0:44:09.539,0:44:17.859
woods, it just would make sense that that[br]person would not wait or not take the
0:44:17.859,0:44:23.809
risk of being discovered just because some[br]customs officer pulls out a container
0:44:23.809,0:44:29.170
somewhere in the harbor, for example,[br]opens it and says "This looks kind of
0:44:29.170,0:44:37.380
weird. Let's take a sample, send it to a[br]lab." and then a population geneticist
0:44:37.380,0:44:44.171
comes back and says "Oh, yes, this sample[br]is not from area A as documented, but
0:44:44.171,0:44:52.449
actually it's from area B and it was[br]illegally logged." If we have reference
0:44:52.449,0:44:58.660
data sets, information rich reference data[br]sets, they become highly valuable and they
0:44:58.660,0:45:08.430
need protection themselves against[br]manipulation and destruction. So we will
0:45:08.430,0:45:14.739
need to think about IT security from the[br]beginning. Also, these data sets are often
0:45:14.739,0:45:20.069
very politically sensitive because if it[br]is shown that in a certain country there
0:45:20.069,0:45:25.680
is illegal logging repeatedly, that[br]country might not be too excited about
0:45:25.680,0:45:41.380
this information. So we need to think[br]about IT security experts. So my hope is
0:45:41.380,0:45:48.599
that these kind of very high tech digital[br]conservation tools can actually contribute
0:45:48.599,0:45:55.690
to the U.N. Sustainable Development Goals[br]by empowering indigenous people, local
0:45:55.690,0:46:02.810
communities and also us to protect and[br]enforce and sustainably use our lands and
0:46:02.810,0:46:10.139
our biodiversity by providing some[br]management and law enforcement tools. So
0:46:10.139,0:46:14.059
we need people from around the world,[br]users from around the world who use these
0:46:14.059,0:46:25.789
tools and help to develop them further and[br]to maintain them. And finally here, these
0:46:25.789,0:46:33.910
high tech tools will just be another[br]technological fix if we don't manage to
0:46:33.910,0:46:45.770
get our back down, our way of life down to[br]sustainable levels. So what we need is to
0:46:45.770,0:46:53.759
today...this year, the Earth Overshoot Day[br]was at the end of July. So at the end of
0:46:53.759,0:47:01.639
July, we had used all the resources that[br]we had available for the whole year. And
0:47:01.639,0:47:09.400
we need to get this back to the end of the[br]year so that our resources actually
0:47:09.400,0:47:22.910
sustain us for the whole year. The graphic[br]here for Germany suggests that we are on a
0:47:22.910,0:47:29.819
good path. We are reducing our resource[br]consumption and maybe even our biocapacity
0:47:29.819,0:47:38.099
moves up a little bit. So actually it[br]seems that our personal lifestyles and
0:47:38.099,0:47:46.329
choices make a difference and we just need[br]to close this gap here much quicker. So
0:47:46.329,0:47:53.689
protecting biodiversity needs all of us to[br]achieve that. And with that, thank you
0:47:53.689,0:47:57.770
very much.
0:47:57.770,0:48:08.020
Applause
0:48:08.020,0:48:12.680
Angel: So thank you Jutta for this very[br]interesting talk and the very valuable
0:48:12.680,0:48:16.609
work you're doing. We have three mics[br]here. Please line up at the microphones if
0:48:16.609,0:48:22.809
you have any questions or suggestions or[br]want to participate and work together with
0:48:22.809,0:48:29.660
Jutta. We have one question from the[br]Internet, so please Signal-Angel start.
0:48:29.660,0:48:34.749
Signal-Angel: Why are wild plant species[br]within a genus further apart than wild
0:48:34.749,0:48:42.509
animal species within a genus?[br]Angel: Could you repeat it, please?
0:48:42.509,0:48:49.069
Signal-Angel: Why are wild plant species[br]within a genus further apart than wild
0:48:49.069,0:48:55.910
animal species within a genus?[br]Jutta: I'm not sure I understand the
0:48:55.910,0:49:01.180
background for the question.[br]Mic 1: Because animals move and plants
0:49:01.180,0:49:06.449
don't move.[br]Jutta: Oh, okay. If that is the idea
0:49:06.449,0:49:12.299
behind the question. Plants actually move,[br]too. They don't move as individuals, but
0:49:12.299,0:49:24.289
they move their genetic material through[br]pollen or fragments. So actually diversity
0:49:24.289,0:49:30.760
in plants and in animals can be quite[br]similar. So the idea that plants are
0:49:30.760,0:49:36.459
just stuck and should have a completely[br]different population structure does not
0:49:36.459,0:49:43.130
hold because plants move around their[br]genetic material through seeds, through
0:49:43.130,0:49:49.610
pollen, through vegetative propagules.[br]Angel: So thank you microphone 1 for
0:49:49.610,0:49:55.999
helping out. Please ask your question.[br]Mic 1: So my question is about the success
0:49:55.999,0:50:00.939
factor of it. If you think of this,[br]whatever database being set up there and I
0:50:00.939,0:50:07.430
think it's gonna be a huge database...I[br]downloaded my own genome on the Internet.
0:50:07.430,0:50:12.989
It was about 150 megabytes. And if we[br]multiply that, I think the genetic
0:50:12.989,0:50:17.539
variation from one person to another is[br]about 1 percent only. So we can compress
0:50:17.539,0:50:25.009
that to 4 megabytes per person. If we[br]sequence all the humans in the world, that
0:50:25.009,0:50:32.689
would be 32 petabytes, that would cost[br]approximately 15 billion dollars. And
0:50:32.689,0:50:36.890
that's only for the storage. Now comes the[br]entire management. Of course, we don't
0:50:36.890,0:50:41.470
want to digitize all the human genome, but[br]rather the plants and animal species
0:50:41.470,0:50:46.309
genome. So it's a huge data program. And[br]what would be for you the success factors
0:50:46.309,0:50:51.229
for this thing to really fly? And did you[br]talk to organizations like Wikidata or
0:50:51.229,0:50:56.469
others or where would it ideally be[br]hosted? At a university or an
0:50:56.469,0:51:02.170
international nonprofit or who would be[br]running the thing?
0:51:02.170,0:51:14.519
Jutta: Yeah, I mean, it's just really big[br]data. I think our first goal is not to
0:51:14.519,0:51:23.670
think about having all predicted 5 to 10[br]million species be sequenced on a
0:51:23.670,0:51:30.239
population level. I think we need to think[br]about the next step. And there it would
0:51:30.239,0:51:35.530
make sense to start with species that are[br]actually highly exploited, like many
0:51:35.530,0:51:40.579
timber species and also many marine[br]fishes. I think that's where we should
0:51:40.579,0:51:48.039
start. And to host this kind of data I[br]think it should be in politically
0:51:48.039,0:51:56.410
independent hands. So it should be with an[br]NGO or with the U.N., some organization
0:51:56.410,0:52:02.449
that is independent.[br]Mic 1: Are you the first to think about
0:52:02.449,0:52:06.509
this or are there existing initiatives?[br]Jutta: There are actually existing
0:52:06.509,0:52:14.219
initiatives. I have been in contact with[br]the Forest Stewardship Council and they
0:52:14.219,0:52:23.219
are actually starting to sample their[br]concessions and initiated to build up the
0:52:23.219,0:52:28.730
samples, they work together with Kew[br]Botanical Gardens and the U.S. Forest
0:52:28.730,0:52:37.589
Service. And right now they're analyzing[br]the samples, using isotopes which is
0:52:37.589,0:52:45.579
another method which is very powerful and[br]can also produce geographic information.
0:52:45.579,0:53:00.710
And so, yeah, so people are moving in this[br]way. So, yeah, I think the idea is out
0:53:00.710,0:53:05.839
there, we just have to start and we have[br]to really do it and provide one
0:53:05.839,0:53:13.210
infrastructure so that we can combine, for[br]example, morphological data, isotope data
0:53:13.210,0:53:18.329
and genomic data into one dataset, which[br]will increase our resolution and our
0:53:18.329,0:53:23.980
reliability.[br]Angel: Okay. Microphone number two,
0:53:23.980,0:53:27.069
please.[br]Mic 2: Thank you for your valuable talk.
0:53:27.069,0:53:32.660
My question would be: you started your talk[br]with the possible decrease of leaf beetles.
0:53:32.660,0:53:37.100
In the data set you showed on slide number[br]six there was an increase in leaf beetle
0:53:37.100,0:53:41.930
population until the 70s, something about[br]that. Is there a possible explanation for
0:53:41.930,0:53:49.869
that?[br]Jutta: Yeah, I believe it is, because
0:53:49.869,0:53:55.359
people started to much more systematically[br]observe leaf beetles. So it's a sampling
0:53:55.359,0:54:05.869
effort. And also at that time the people -[br]so it's a multi-people collaboration who
0:54:05.869,0:54:12.369
actually has assembled this dataset so the[br]people who are part of this collaboration
0:54:12.369,0:54:16.949
they added their own private data sets. And[br]that's why you have an increase I think.
0:54:16.949,0:54:23.509
While the people from the nineteen[br]hundreds, nineteen hundred ten you only
0:54:23.509,0:54:29.009
can use the data that is available in[br]publications and samples in museums or in
0:54:29.009,0:54:33.289
scientific collections. I think that is[br]the reason why you have the sharp
0:54:33.289,0:54:35.589
increase.[br]Mic 2: Thank you.
0:54:35.589,0:54:38.750
Angel: So we have another question from[br]microphone number two.
0:54:38.750,0:54:44.459
Mic 2: Thank you for your fine talk.[br]Excuse me. Maybe my question is a bit off
0:54:44.459,0:54:51.730
topic. Do you think the methods and roles[br]that you identified in your talk could be
0:54:51.730,0:54:59.880
transferred to the assessment of raw[br]materials? I'm thinking about metals?
0:54:59.880,0:55:09.349
Jutta: Maybe the data infrastructure, like[br]if you wanted to collect raw metals or
0:55:09.349,0:55:16.471
materials from all over the world and...a[br]sampleized scientific collection and to
0:55:16.471,0:55:22.390
have kind of a reference dataset that[br]might work, actually. But the genomics
0:55:22.390,0:55:29.170
obviously won't. So for that part you[br]would need to use different methods from
0:55:29.170,0:55:36.010
physics, obviously. But actually the[br]infrastructure, certain parts will be
0:55:36.010,0:55:40.249
quite similar. I think so, yes.[br]Angel: So we have one more question from
0:55:40.249,0:55:43.420
the Internet.[br]Signal-Angel: Who contracts a
0:55:43.420,0:55:51.619
freelance evolutionary biologist? Can you[br]give an example of this kind of work you
0:55:51.619,0:56:01.429
proposed?[br]Jutta: So I see this gap between science
0:56:01.429,0:56:07.739
and applications, that we need these[br]applications and there's a huge potential
0:56:07.739,0:56:18.150
for these applications. We know that[br]illegal logging and that is my background,
0:56:18.150,0:56:23.769
but doesn't seem to be much different, for[br]example, in marine fisheries. We know that
0:56:23.769,0:56:29.730
there is this huge amount of illegal[br]logging and timber trade going on. And we
0:56:29.730,0:56:39.670
need to have some assays actually that[br]have the power to detect illegally traded
0:56:39.670,0:56:49.789
timber. So I think there is a huge need[br]for these kind of methods and
0:56:49.789,0:57:00.869
organizations who are interested in these[br]kind of methods. There are governments, there are
0:57:00.869,0:57:12.719
companies, NGOs, customs, Interpol. So,[br]yeah.
0:57:12.719,0:57:19.700
Angel: Do we have any other questions? So[br]thank you again Jutta for your talk and
0:57:19.700,0:57:23.739
the valuable work you're doing. Please[br]give a warm round of applause to Jutta.
0:57:23.739,0:57:29.009
Applause
0:57:29.009,0:57:33.599
36C3 postroll music
0:57:33.599,0:57:56.000
Subtitles created by c3subtitles.de[br]in the year 2020. Join, and help us!