36C3 - Protecting the Wild

0:00 - 0:20

36C3 preroll music
0:20 - 0:25

Angel: Right now I'd like to welcome our
first speaker on stage. The talk will be
0:25 - 0:31

about protecting the wild and I'll hand
over to her. Please give her a warm round
0:31 - 0:33

of applause.
0:33 - 0:35

Applause
0:35 - 0:44

Jutta Buschbom: Thank you very much for
the introduction. My name is Jutta
0:44 - 0:52

Buschbom, I'm an evolutionary biologist.
That is my background. I did my PhD at
0:52 - 0:57

the University of Chicago working on
little fungi that live in symbiosis with
0:57 - 1:06

algae and form colorful rocks, colorful
crust on rocks. I then did a Postdoc in
1:06 - 1:12

bioinformatics and after that moved back
into organismal biology, working in forest
1:12 - 1:20

genetics. In the ten years I worked in
forest genetics, for the first time I
1:20 - 1:26

encountered questions with
regard to application, and I found out
1:26 - 1:37

that actually moving from research to
application is not trivial. So what I'm
1:37 - 1:46

going to present is a high-tech way of using
genomic data to protect biodiversity in a
1:46 - 1:52

way that you can actually reach
application and use conservation genomic
1:52 - 2:03

tools. So this summer the draft of the
report of the Intergovernmental
2:03 - 2:12

Science-Policy Platform on Biodiversity and
Ecosystem Services came out and its
2:12 - 2:20

results were quite alarming. It stated that
around a million animal and plant species
2:20 - 2:27

are currently threatened and of those...half
of those species are already dead species
2:27 - 2:33

walking. So due to the destruction
of the habitats or habitat deterioration,
2:33 - 2:43

they are not able to reproduce in a
sustainable way anymore. A third of the
2:43 - 2:51

total species extinction risk to date
has arisen in the last 25 years. And just
2:51 - 3:01

to give you an idea about the relation we
are talking about...currently the rate of
3:01 - 3:08

extinction is already at least tens to
hundreds of times higher than it has averaged
3:08 - 3:13

over the past 10 million years. And within
these 10 million years there were the Ice
3:13 - 3:23

Ages, for example. And most of the
extinction risk is due to land
3:23 - 3:36

and sea use change. The report even says
that we already seem to
3:36 - 3:42

have transgressed a proposed precautionary
planetary boundary, which means within the
3:42 - 3:48

boundary we have a stable biological
system. But having transgressed it, we
3:48 - 3:55

might already be in a transition to a new
state, and we have no way to find out what
3:55 - 4:05

this state is going to look like. So all
of these facts that the report is stating
4:05 - 4:15

are actually pretty negative. And I was
quite happy to read that they also present
4:15 - 4:21

that there are actually people who do
better than most of us. And they point out
4:21 - 4:28

that many practices of indigenous people
and local communities actually conserve
4:28 - 4:38

and sustain wild and domesticated
biodiversity quite well. Today, a higher
4:38 - 4:45

proportion of the remaining terrestrial
biodiversity lies in areas managed and
4:45 - 4:53

held by indigenous people. And these
ecosystems are more intact and are
4:53 - 5:02

declining less rapidly. So we
have examples of lifestyles that actually
5:02 - 5:11

do better than most of us. And I know the
solutions won't be simple and it won't be
5:11 - 5:22

easy to get there but we can look to what
these people do better than we do. All of
5:22 - 5:28

this sounds...it's a global report and it
sounds kind of like far away, like
5:28 - 5:36

probably somewhere in the tropics, but
actually threats to biodiversity happen
5:36 - 5:45

also directly in front of our own front
doors. This summer a paper came out from
5:45 - 5:53

two colleagues from the University of
Greifswald, who had analyzed a long-term
5:53 - 5:58

data set about leaf beetles. And they were
asking if we already have a decline of
5:58 - 6:08

leaf beetles in Central Europe. So they
compiled long term data sets of leaf
6:08 - 6:19

beetle observations for Central Europe,
starting from 1900 up to 2017, so
6:19 - 6:27

spanning a hundred and twenty years. And
what they find is that systematic reports
6:27 - 6:36

on leaf beetles and leaf beetle
observations are increasing during this
6:36 - 6:45

time interval, time span. But despite the
fact that we have...like in the last two
6:45 - 6:53

decades, we had very high numbers of
reports and observations for leaf beetles,
6:53 - 7:00

the number of species, the orange line, is
declining. It's slightly declining. But
7:00 - 7:06

the question is, is this real or not? And
what was most worrisome to the authors is
7:06 - 7:15

that in the data set, the number of
species, here in orange, that had
7:15 - 7:22

more reports was declining, while the
number of species that showed fewer reports
7:22 - 7:34

than before is growing. So these kinds of
long-term datasets are very hard to
7:34 - 7:41

interpret and many factors can contribute
to those patterns. And it's not clear if
7:41 - 7:48

this pattern is statistically significant.
But if you take a step back and consider
7:48 - 7:54

your background knowledge, your prior
knowledge about the state of the world, and
7:54 - 8:03

ask yourself: what does the current state
look like? Does it look good or rather
8:03 - 8:17

worrisome? And then with that knowledge,
tell me that these results are an
8:17 - 8:30

artifact or a bias. I'm worried that once
we have a statistically significant signal in
8:30 - 8:42

this dataset, it will be already too late.
So right now, I've been talking about leaf
8:42 - 8:50

beetles and beetles are the largest group
within insects with about 400,000 species.
8:50 - 8:56

Leaf beetles are a large family of about
50,000 species which are distributed
8:56 - 9:05

worldwide. And here in Germany, we have
over 470 leaf beetle species. So how do we
9:05 - 9:10

actually know how many species there are
and who actually counted all these
9:10 - 9:16

species? And is that just a task of
taxonomists? Taxonomy is the science of
9:16 - 9:22

naming and defining, including
circumscribing and classifying groups of
9:22 - 9:32

biological organisms on the basis of
shared characters. So one could have the
9:32 - 9:38

picture of some woman with a funny hat
running over a meadow catching like
9:38 - 9:44

butterflies, or some mushroom hunter
crawling through the forest trying to find
9:44 - 9:52

mushrooms. And it's true, as biodiversity
scientists we spend a lot of time outdoors
9:52 - 10:02

and yeah...on the other hand, biotaxonomy
is a high-tech science today. So
10:02 - 10:11

taxonomists actually take up new
technological tools and developments to
10:11 - 10:17

help them identify, describe and
understand species. So taxonomists
10:17 - 10:25

actually are often experts in, for
example, microscopy, mathematics,
10:25 - 10:37

biochemistry, even proteomics and
genomics. So throughout the talk, I'm
10:37 - 10:42

going to compile this list of people and
experts we're going to need to protect
10:42 - 10:49

biodiversity if we want to do this on the
basis of genetic data. Right now, the list
10:49 - 10:56

is quite empty. The first entry is
taxonomists, but that will change quickly
10:56 - 11:06

and taxonomists are a subgroup of
evolutionary biologists mostly. So I told
11:06 - 11:16

you that taxonomists and biodiversity
scientists take up technology and...so as
11:16 - 11:24

soon as computers came about and the
internet started, people began to use
11:24 - 11:32

that to compile information about species,
and today we have several global resources
11:32 - 11:41

available at the species level and above
the species level. So we biodiversity
11:41 - 11:46

scientists were among the first who
defined biodiversity information
11:46 - 11:57

standards. We have a global Catalogue of
Life, a list of all named species. The
11:57 - 12:02

Global Biodiversity Information Facility
has an aim to bring together information
12:02 - 12:09

from different sources and they are
compiling, producing this wonderful map.
12:09 - 12:14

This is leaf beetles, all the records
about leaf beetles that we have in the
12:14 - 12:22

world. And it looks as if leaf
beetles are highly associated with first
12:22 - 12:30

world economies. However, that clearly is
an artifact and it just shows that we need
12:30 - 12:35

many more taxonomists and biodiversity
scientists all over the world to find and
12:35 - 12:45

identify leaf beetles. So we also need
biodiversity informaticians to help us
12:45 - 12:52

compile global lists and distribute
knowledge. So far I have been talking
12:52 - 12:58

about species which is a simplification.
The question is what is...what are species
12:58 - 13:03

actually? And so we need to talk about
genetic diversity within and between
13:03 - 13:17

species. And I'm going to do so using
gulls, which most of us might know. Here
13:17 - 13:22

in Europe, we have two large gulls of the
genus Larus. One is in the front, the
13:22 - 13:31

lighter gray is our Silbermöwe. And in the
back is our Heringsmöwe, the dark one. And
13:31 - 13:36

I'm going to use German names because the
English names go crosswise and that's
13:36 - 13:43

completely confusing. So I will stick with
the German names. Here in Europe these two
13:43 - 13:48

species seem to be really fine species
because they barely interbreed, so they
13:48 - 13:56

don't hybridize. However, if you take a
step back and look at the genus in
13:56 - 14:03

general, you see that the species of the
genus are distributed kind of ringwise
14:03 - 14:15

around the Arctic. And so the idea is
that, say during the Ice Age, all of this
14:15 - 14:23

area was glaciated and the gulls retreated
to a refuge here near the Caspian Sea. And
14:23 - 14:28

then after the ice retreated, the gulls
moved back north. One branch moved into
14:28 - 14:34

Europe forming our Heringsmöwe and
another branch then moved counterclockwise
14:34 - 14:41

around the Arctic, producing different
morphotypes, different species across the
14:41 - 14:49

Bering Strait and then into North America.
There the dark blue one is...I'm
14:49 - 14:59

simplifying, the equivalent of our
European Silbermöwe, the American
14:59 - 15:04

Silbermöwe. Then the idea is that some
individuals crossed back to Europe and
15:04 - 15:15

formed our European Silbermöwe. And while
all of these species here are
15:15 - 15:22

interbreeding, so they hybridize, only
where this ring closes those two species
15:22 - 15:27

don't interbreed anymore. And the big
question is, are we actually dealing with
15:27 - 15:34

one single species or are we dealing with
different species that just happen to
15:34 - 15:41

hybridize more or less? The question is
not trivial because it has consequences
15:41 - 15:49

for protection. If we are dealing with one
single species, all the gulls in Eurasia
15:49 - 15:53

could go extinct and it wouldn't matter
because we still would have the gulls in
15:53 - 15:59

North America. However, if we have
different species in all of these areas,
15:59 - 16:05

we would need to protect individuals or
the species on a regional level and
16:05 - 16:17

protect all of these different species. So
to investigate this question about: Do we
16:17 - 16:24

have different species? And what were the
evolutionary processes and histories that
16:24 - 16:31

brought about the species? A group of
scientists investigated that using DNA
16:31 - 16:40

sequences. And on the left, you have the
model, the theoretical model of the ring
16:40 - 16:46

species. And here on the right you have
reality. And the scientists found that the
16:46 - 16:52

reality is always much more complex. So,
for example, they found two refuges or
16:52 - 16:58

they proposed two refuges. But what they
found was that genetic diversity was
16:58 - 17:07

correlated with those species or
morphotypes. So what that also means is
17:07 - 17:16

that genetic diversity is correlated with
geographic origin. What we learn from this
17:16 - 17:24

type of analysis is we learn about
evolutionary processes and history, about
17:24 - 17:30

variability and differentiation, about
gene flow and migration, about speciation
17:30 - 17:38

processes. All of that we need to understand
our species, which will allow us to
17:38 - 17:43

protect them. So we need evolutionary
biologists who do phylogenetics and
17:43 - 17:59

population genetics. So once we found out
that one can use genetic diversity, to
17:59 - 18:07

infer geographic origin because genetic
diversity is correlated with geography,
18:07 - 18:18

people immediately said: 'Okay, we can use
it for conservation applications.' And
18:18 - 18:24

we also learned that often it
is unclear what a species is; species
18:24 - 18:33

boundaries are unclear and some species
have huge distribution ranges with
18:33 - 18:37

different clusters of variability within
this huge range. So we know that we need
18:37 - 18:43

to protect within species genetic
diversity, which means that we need to
18:43 - 18:51

understand within species population
structure and we need to build useful and
18:51 - 18:59

reliable models of population structure.
These models are actually required for all
18:59 - 19:04

of our applications. They are required for
monitoring, for example, for conservation
19:04 - 19:12

strategies, for functional adaptation and
adaptability, questions of productivity
19:12 - 19:19

of different provenances, its impact on
management regimes, breeding strategies,
19:19 - 19:28

and also for enforcement applications.
From the studies I showed you before with
19:28 - 19:34

the gulls we also know that we need to
approach the question of population
19:34 - 19:47

structure on a distribution-range-wide
scale. So here's the map produced by
19:47 - 19:54

EUFORGEN, the European Network for forest
reproductive material for one of our
19:54 - 20:02

native oaks, the sessile oak. And the dots
are the sites for genetic conservation
20:02 - 20:12

units. And so that is one strategy for how to
represent within-species genetic diversity
20:12 - 20:22

and how to sample it. And you can see this
is a hypothetical example, but we likely
20:22 - 20:32

will see a gradient from west to east or
might see one at this scale. Then once we
20:32 - 20:38

have these kinds of global data sets, we
can go to the fine scale and maybe, for
20:38 - 20:44

example, do a national genetic monitoring.
And we will find much finer scale
20:44 - 20:51

gradients. We also will find, especially
for forest trees, outliers, so stands
20:51 - 20:59

that don't fit the usual pattern. And that
is because the forest reproductive material
20:59 - 21:08

has been moved around a lot. And so these
lighter or darker dots are material that
21:08 - 21:16

was moved to Germany from the outside. And
we only will identify these outliers if we
21:16 - 21:21

have the whole reference dataset. If we
don't have the whole reference dataset, we
21:21 - 21:29

might not identify these outliers - stands
with a different history. Or in a worst
21:29 - 21:34

case, these outliers might actually bias
our gradients. And we are always talking
21:34 - 21:43

about very slight gradients. So it's easy
to bias these gradients, dilute them, so
21:43 - 21:51

we actually won't get the results we need.
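
A minimal sketch of how a few translocated stands can dilute a slight gradient, assuming the gradient is measured as an ordinary least-squares slope of allele frequency against longitude. The stand values and the `ols_slope` helper are hypothetical illustrations, not data or methods from the talk.

```python
from statistics import mean


def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical native stands: (longitude in degrees east, allele frequency).
# The frequency rises slightly from west to east.
native = [(6.0, 0.30), (8.0, 0.32), (10.0, 0.34), (12.0, 0.36), (14.0, 0.38)]

# Hypothetical translocated stands: eastern-type material planted in the
# west and western-type material planted in the east.
translocated = [(6.5, 0.38), (13.5, 0.30)]

slope_native = ols_slope(*zip(*native))
slope_all = ols_slope(*zip(*(native + translocated)))

print(f"gradient from native stands only:   {slope_native:.5f} per degree")
print(f"gradient with translocated stands:  {slope_all:.5f} per degree")
```

With a range-wide reference dataset the two translocated stands could be flagged and set aside; without it they are simply averaged in and flatten the estimated gradient.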
Compiling these kinds of reference
21:51 - 21:58

datasets takes huge collaborative efforts
because people need to go out into the
21:58 - 22:04

field and collect the reference samples
and that might be scientists, that might
22:04 - 22:14

be people from local communities, citizen
scientists, managers, owners, government
22:14 - 22:20

officials who provide background
information, maps, distribution
22:20 - 22:28

information and also in many parts of the
world might protect the people who are
22:28 - 22:35

actually collecting the samples. And it
might be conservation activists and NGOs.
22:35 - 22:41

So once the samples have been collected
they need to be stored somewhere for the
22:41 - 22:51

long term and the information needs to be
databased. And that is the work of
22:51 - 22:57

scientific collections, which are mostly
at natural history museums and there the
22:57 - 23:04

samples are processed. They're organized
in ways that you can find them again. All
23:04 - 23:10

the metadata is entered, which curators
do, collection managers, preparators,
23:10 - 23:17

technical staff at the scientific
collections. So once we have these kinds of
23:17 - 23:25

data sets, large scale data sets, what are
we actually doing with them? So the
23:25 - 23:33

foundation for all of our applications is
population structure and there
23:33 - 23:42

specifically population assignment. So the
process is this: first, we decide on a
23:42 - 23:47

question and design our project
accordingly, so that we can answer the
23:47 - 23:52

question. Then we need to infer the
population structure model and optimize
23:52 - 23:57

it. In the next step we need to check if the
model actually is good enough for
23:57 - 24:03

application because we might have found
the best model, but it might still not be
24:03 - 24:07

good enough for application. So we need to
test that. And that is the step of
24:07 - 24:13

population assignment or predictive
assignment. And then in the end, we want
24:13 - 24:19

to test our hypothesis. Are the two stands
different or does an individual come from
24:19 - 24:31

stand A or from stand B? And here we
identify error rates and accuracy. So this
24:31 - 24:39

whole process is very statistical. And so
the analysis of these reference data
24:39 - 24:48

needs to be accompanied by biostatisticians
who can tell us how to analyze our data.
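
The assignment step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: biallelic markers coded as 0, 1 or 2 copies of one allele, Hardy-Weinberg proportions within each stand, and a leave-one-out check to estimate the error rate. The stand names, genotypes and function names are all hypothetical.

```python
import math


def allele_freqs(genotypes):
    """Per-marker allele frequency, with a pseudocount so it is never 0 or 1."""
    n = len(genotypes)
    n_markers = len(genotypes[0])
    return [
        (sum(g[m] for g in genotypes) + 1) / (2 * n + 2)
        for m in range(n_markers)
    ]


def log_likelihood(genotype, freqs):
    """Log-probability of a genotype given stand allele frequencies."""
    total = 0.0
    for g, p in zip(genotype, freqs):
        # Binomial(2, p) probability of carrying g copies of the allele.
        total += math.log(math.comb(2, g)) + g * math.log(p) + (2 - g) * math.log(1 - p)
    return total


def assign(genotype, reference):
    """Return the reference stand under which the genotype is most likely."""
    scores = {
        stand: log_likelihood(genotype, allele_freqs(individuals))
        for stand, individuals in reference.items()
    }
    return max(scores, key=scores.get)


def leave_one_out_error(reference):
    """Share of reference individuals assigned to the wrong stand
    when they are excluded from their own stand's frequency estimate."""
    errors = 0
    total = 0
    for stand, individuals in reference.items():
        for i, genotype in enumerate(individuals):
            reduced = dict(reference)
            reduced[stand] = individuals[:i] + individuals[i + 1:]
            if assign(genotype, reduced) != stand:
                errors += 1
            total += 1
    return errors / total


# Hypothetical reference data: two stands, four individuals, four markers.
reference = {
    "A": [[2, 2, 0, 1], [2, 1, 0, 0], [1, 2, 0, 1], [2, 2, 1, 0]],
    "B": [[0, 0, 2, 1], [0, 1, 2, 2], [1, 0, 2, 1], [0, 0, 1, 2]],
}

unknown = [2, 2, 0, 0]
print("assigned stand:", assign(unknown, reference))
print("leave-one-out error rate:", leave_one_out_error(reference))
```

The leave-one-out rate plays the role of the "good enough for application" check: a model can be the best available and still misassign too many known individuals to be usable in practice.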
24:48 - 24:55

So what is the state-of-the-art right now?
What kind of geographic resolution do we
24:55 - 25:03

actually get for these non-model species
currently? And I'm going to present the
25:03 - 25:10

example of an African timber tree
species, which is a very valuable timber.
25:10 - 25:18

It's one example but basically all results
for species that have large distribution
25:18 - 25:26

ranges and are continuously distributed
and are also long-lived, are very similar.
25:26 - 25:33

So these kinds of results seem to be species
independent. So the species are Milicia
25:33 - 25:40

regia and Milicia excelsa, African teak, which
cannot be grown in plantations for timber
25:40 - 25:51

quality. So it is harvested unsustainably
from natural forests. It's distributed in
25:51 - 26:01

West, Central and East Africa. Here's a
black rectangle. And a group of a dozen
26:01 - 26:06

scientists got together and they actually
sampled a reference dataset for these two
26:06 - 26:19

species. It's over 400 samples; they
analyzed four marker systems, resulting in
26:19 - 26:25

a total of something like 100 markers,
genetic markers, and then they optimized
26:25 - 26:33

the population model and used different
parameter settings. And we're going to
26:33 - 26:40

concentrate here on the best solution that
they found. And basically this rectangle
26:40 - 26:48

here is the black one over here. So the
resolution is... they found population
26:48 - 26:55

structure with clear clusters. So the
populations and the species from West
26:55 - 27:01

Africa can be distinguished from those
populations in Central Africa. And the
27:01 - 27:08

ones in East Africa can be differentiated.
So that is really good. So we have
27:08 - 27:13

population structure. We know there is
signal. The problem is still that our
27:13 - 27:22

resolution is much lower than we would
need to have it because we basically need
27:22 - 27:32

resolution at least on a country level,
because most of the laws are national. So
27:32 - 27:42

it might be legal to harvest a tree in one
country, but not in another country. So we
27:42 - 27:49

need to get our resolution down to country
level or even to regional level, if you
27:49 - 27:52

want to distinguish whether the tree was
harvested in a national park, in a
27:52 - 28:02

protected area, or outside in a managed
forest. And when, as biodiversity
28:02 - 28:11

scientists, we don't know how to continue,
one thing is to look for what people do
28:11 - 28:17

with model organisms and specifically what
people do in human population genomics
28:17 - 28:24

because there thousands of population
geneticists are working and there is a
28:24 - 28:28

completely different funding background
due to the interest of the medical and the
28:28 - 28:39

pharma industry. So they are always
ahead. What we can learn from there,
28:39 - 28:47

from human population genomics, is
that we need two features. One is we
28:47 - 28:54

already know that we need distribution
wide sampling, which provides a spatial
28:54 - 29:00

context. The second feature is that we
need genome-wide sequencing, preferably
29:00 - 29:09

whole-genome sequencing, which provides us depth
in time because our genomes are archives
29:09 - 29:15

of our evolutionary history. They are
records of all the processes and events
29:15 - 29:21

and this depth in time then translates
also into resolution. Once we have these
29:21 - 29:30

two features, actually these reference
datasets open Pandora's box. Suddenly we
29:30 - 29:36

can ask all kinds of questions and
objectives, even those that we still don't
29:36 - 29:47

know. We can develop all kinds of
applications, as is done for humans.
29:47 - 29:59

Currently, there are at least four global
datasets on human diversity. These are
29:59 - 30:09

very widely reused and these big datasets
- so they are big data with regard to the
30:09 - 30:19

number of samples and also the genomes or
the genome representations and this
30:19 - 30:26

results in very information rich data
which initiates analytical development so
30:26 - 30:34

people continuously are developing new
statistical methods. And right now, a new
30:34 - 30:42

wave is coming in of these methods. So
once you have these global datasets,
30:42 - 30:48

people in human population
genomics started to do these intense
30:48 - 30:56

regional samplings. And this is the
example of the United Kingdom Biobank.
30:56 - 31:03

It's a project with 500,000 volunteers,
they are all UK citizens from all over the
31:03 - 31:14

islands. And each individual was genotyped
in a wet lab for 820,000 markers. That's
31:14 - 31:20

completely I mean, that's a different
number than the 100 or 1000...in
31:20 - 31:26

biodiversity science we normally
analyse a maximum of a couple of tens of thousands of
31:26 - 31:36

markers. So that's a completely different
number. But then statistical geneticists
31:36 - 31:47

come. They do some weird and wonderful
voodoo and they derive 96 million markers
31:47 - 31:53

per genome that is per individual from
these 820,000 markers that were produced
31:53 - 32:01

in the lab. So that's more than a hundredfold
increase. And once you have this kind of
32:01 - 32:08

dataset for a genome, you suddenly, or
finally, get country level and within
32:08 - 32:19

country level resolution. So these panels
are examples. So the first panel shows
32:19 - 32:26

individuals who were born in Edinburgh and
the question was "Where were people born
32:26 - 32:32

who had a similar ancestral background,
genetic background?". And what they found
32:32 - 32:42

was that this was all over Scotland and
Northern Ireland. Northern Yorkshire was
32:42 - 32:50

even more local. So people from Yorkshire
don't seem to get around a lot. For London
32:50 - 32:54

the situation is completely different.
That is what we would expect because
32:54 - 33:00

London is a people magnet. People move
there all the time. They meet there, they
33:00 - 33:06

have children and the kids born in London,
their genetic ancestry has nothing to do
33:06 - 33:13

with London. It's from all over the place,
from the British Isles and the world. So
33:13 - 33:22

that's why the colors are strongly
dissolved. So this study came out also
33:22 - 33:26

this summer. And it's the first time that
I have seen that we actually really can
33:26 - 33:37

achieve regional resolution. And I find
this possibility for biodiversity science
33:37 - 33:47

really exciting. So it was made possible
by very sophisticated statistical
33:47 - 33:52

approaches which are able to analyze
genetic data from highly complex
33:52 - 33:59

evolutionary and ecological systems. And
at the same time these analyses are able
33:59 - 34:05

to handle big data. We're talking about
gigabytes and terabytes of data and
34:05 - 34:14

results. So statistical geneticists are
developing new methods of data
34:14 - 34:20

representation to handle this amount of
data. And then we are able to sufficiently
34:20 - 34:26

extract the signal for a very specific
question from data which have a very low
34:26 - 34:37

signal to noise ratio. So to get there, we
need many experts and specialists. So we
34:37 - 34:42

need statistical geneticists, big data
experts who also might contribute machine
34:42 - 34:49

learning expertise. We need molecular
biologists who know how to sequence
34:49 - 34:54

complex genomes. We need
bioinformaticians with expertise in
34:54 - 35:05

genomics for assembly, annotation and
alignment of genomic sequences. The result
35:05 - 35:13

is actually this: This is the author list
for the 1000 Genomes Project
35:13 - 35:20

reference data set, and I don't expect you
to be able to read it, but the bold type
35:20 - 35:26

is of interest because it shows all the
different tasks that are necessary to
35:26 - 35:36

produce a standardized and highly cleaned
reference dataset. So the whole author
35:36 - 35:42

list is something like 1.5 pages long,
even considering that some authors will
35:42 - 35:51

have contributed to several tasks. The
publications for reference datasets mostly
35:51 - 35:57

have author lists that are well over 50
people. So they are huge collaborative
35:57 - 36:05

efforts. Now we take the step into
biodiversity science. Here these are eight
36:05 - 36:13

gastrotrichs, they are little worm-like...
organisms that live in the sediments of
36:13 - 36:23

freshwater lakes and marine sediment. They
are in general a couple of hundred
36:23 - 36:30

micrometers in size. And I don't have any
numbers, but my guess would be that maybe
36:30 - 36:39

worldwide, a hundred to a thousand people
actually work on these species. There are
36:39 - 36:45

800 species of gastrotrichs. So let's say
there's one, two, maybe three experts per
36:45 - 36:52

species for these organisms. So how are
these three people going to manage all
36:52 - 37:01

these tasks to produce a reference
dataset? You might say, well, it's
37:01 - 37:05

gastrotrichs, I mean, I've never heard
about them. Maybe they are not so
37:05 - 37:08

important. Maybe you don't need a
reference data set, but actually some of
37:08 - 37:18

those species are bioindicators for water
quality. So what we observe right now is a
37:18 - 37:28

gap for biodiversity conservation. In
model organisms, we have Pandora's Box
37:28 - 37:35

open. We have all the statistical analyses
at our hands to analyze our data sets.
37:35 - 37:40

However, in non-model organisms, we are
still stuck with summary statistics that
37:40 - 37:47

don't provide us the resolution that we
need. And we know that to close this gap,
37:47 - 37:53

even for a single species, it's a huge
effort. But at the same time, we have over
37:53 - 38:04

35,000 species listed by scientists which
already need effective protection now. So
38:04 - 38:10

we need to find a way to close this gap
and actually move in this direction. And
38:10 - 38:20

the good thing is, so all of this... in
biodiversity science, in academia, and we
38:20 - 38:25

need to make the transition over the
conservation genomics gap into the big
38:25 - 38:32

loop of real world conservation tasks. And
the good thing is we already know what we
38:32 - 38:38

have to do. So we need to have reference
data sets, distribution range wide. We
38:38 - 38:44

need to have statistics. And it's going to
be big data. So we need collection
38:44 - 38:54

management, data management and an
analysis environment. So looking at
38:54 - 39:00

the different ingredients or steps,
the first thing we need is a general data
39:00 - 39:05

infrastructure for global diversity of
reference data sets that actually can be
39:05 - 39:12

used across species, preferably for as many
species as possible and provide a working
39:12 - 39:20

environment for biodiversity scientists
and experts. It should be user friendly so
39:20 - 39:26

it can be used by scientists, but also
that people from local communities and
39:26 - 39:33

citizen scientists can add their
observation data and their data into this
39:33 - 39:41

data infrastructure. I have listed quite a
lot of features that these kinds of
39:41 - 39:48

infrastructures should have. And I'm going
to argue that these features are not just
39:48 - 40:03

nice-to-haves, but actually must-haves.
Because our goal is always application. So
40:03 - 40:13

we need developers, managers and curators
for data infrastructures. Since our goal
40:13 - 40:31

is application, our main features are
quality control and error reduction. These
40:31 - 40:39

are the basis. So that our conservation
tools can be robustly and reliably applied
40:39 - 40:46

under real world operating conditions. And
the way to achieve quality and error
40:46 - 40:53

reduction is through chains of custody. So
it means that from project design, from
40:53 - 40:58

the questions through all the steps that
are necessary to produce a reference data
40:58 - 41:08

set and then...so from sample collection,
genomic statistical analysis down to
41:08 - 41:16

application. These steps need to be
documented and standardized. They need to
41:16 - 41:22

be, each one of them needs to be validated
and reproducible. They should be modular
41:22 - 41:29

so they can be user friendly. And the
whole chain of custody needs to be
41:29 - 41:41

scalable. So if our chains of custody have
these characteristics, we actually will
41:41 - 41:51

have tools that will work in everyday
life. So we need professional developers
41:51 - 42:00

and programmers who are able to produce
this very collaborative software. We
42:00 - 42:06

need free and open source experts, so we
can always ensure that our code and
42:06 - 42:14

our infrastructures still have integrity and
we can check them. And I'm a biologist, I
42:14 - 42:19

don't have any background in hardware, but
I've heard a couple of talks here at the
42:19 - 42:26

conference about Green IT. And I have
the feeling we should have people who know
42:26 - 42:34

hardware and software and know how to
develop these high tech tools in a
42:34 - 42:38

sustainable way, so that by developing these
tools, we don't use more resources than we
42:38 - 42:49

are trying to protect. So I've shown all
these features and characteristics that
42:49 - 42:57

the software should have. And I'm arguing
that these features are necessary because
42:57 - 43:05

of the reality we find ourselves in. It is one of
rising over-exploitation and destruction
43:05 - 43:20

of nature. So the extent of environmental
crimes is up in the billions. All
43:20 - 43:29

environmental crime together, the green
bubbles, is second only to drug-associated
43:29 - 43:35

crimes. They are up there with
counterfeiting or human trafficking. So
43:35 - 43:45

these are multi-billion enterprises. They
are often transnational industries
43:45 - 44:02

with huge profits. So if there's some
crime, some mafia boss, some criminal
44:02 - 44:10

manager who just bribed a government
official somewhere in their neck of the
44:10 - 44:18

woods, it just would make sense that that
person would not wait or not take the
44:18 - 44:24

risks to be discovered just because some
customs officer pulls out a container
44:24 - 44:29

somewhere in the harbor, for example,
opens it and says "This looks kind of
44:29 - 44:37

weird. Let's take a sample, send it to a
lab." and then a population geneticist
44:37 - 44:44

comes back and says "Oh, yes, this sample
is not from area A as documented, but
44:44 - 44:52

actually it's from area B and it was
illegally logged." If we have reference
44:52 - 44:59

data sets, information rich reference data
sets, they become highly valuable and they
44:59 - 45:08

need protection themselves against
manipulation and destruction. So we will
45:08 - 45:15

need to think about IT security from the
beginning. Also, these data sets are often
45:15 - 45:20

very politically sensitive because if it
is shown that in a certain country there
45:20 - 45:26

is illegal logging repeatedly, that
country might not be too excited about
45:26 - 45:41

this information. So we need to think
about IT security experts. So my hope is
45:41 - 45:49

that these kind of very high tech digital
conservation tools can actually contribute
45:49 - 45:56

to the U.N. Sustainable Development Goals
by empowering indigenous people, local
45:56 - 46:03

communities and also us to protect and
force and sustainably use our lands and
46:03 - 46:10

our biodiversity by providing some
management and law enforcement tools. So
46:10 - 46:14

we need people from around the world,
users from around the world who use these
46:14 - 46:26

tools and help to develop them further and
to maintain them. And finally here, these
46:26 - 46:34

high tech tools will just be another
technological fix if we don't manage to
46:34 - 46:46

get our back down, our way of life down to
sustainable levels. So what we need is to
46:46 - 46:54

today...this year, the Earth Overshoot Day
was at the end of July. So at the end of
46:54 - 47:02

July, we had used all the resources that
we had available for the whole year. And
47:02 - 47:09

we need to get this back to the end of the
year so that our resources actually
47:09 - 47:23

sustain us for the whole year. The graphic
here for Germany suggests that we are on a
47:23 - 47:30

good way. We are reducing our resource
consumption and maybe even our biocapacity
47:30 - 47:38

moves up a little bit. So actually it
seems that our personal lifestyles and
47:38 - 47:46

choices make a difference and we just need
to close this gap here much quicker. So
47:46 - 47:54

protecting biodiversity needs all of us to
achieve that. And with that, thank you
47:54 - 47:58

very much.
47:58 - 48:08

Applause
48:08 - 48:13

Angel: So thank you Jutta for this very
interesting talk and the very valuable
48:13 - 48:17

work you're doing. We have three mics
here. Please line up at the microphones if
48:17 - 48:23

you have any questions or suggestions or
want to participate and work together with
48:23 - 48:30

Jutta. We have one question from the
Internet, so please Signal-Angel start.
48:30 - 48:35

Signal-Angel: Why are wild plant species
within a genus further apart than wild
48:35 - 48:43

animal species within a genus?
Angel: Could you repeat it, please?
48:43 - 48:49

Signal-Angel: Why are wild plant species
within a genus further apart than wild
48:49 - 48:56

animal species within a genus?
Jutta: I'm not sure I understand the
48:56 - 49:01

background for the question.
Mic 1: Because animals move and plants
49:01 - 49:06

don't move.
Jutta: Oh, okay. If that is the idea
49:06 - 49:12

behind the question. Plants actually move,
too. They don't move as individuals, but
49:12 - 49:24

they move their genetic material through
pollen or fragments. So actually diversity
49:24 - 49:31

in plants and in animals can be quite
similar. So the idea is that plants are
49:31 - 49:36

just stuck and should have a completely
different population structure does not
49:36 - 49:43

hold because plants move around their
genetic material through seeds, through
49:43 - 49:50

pollen, through vegetative propagules.
Angel: So thank you microphone 1 for
49:50 - 49:56

helping out. Please ask your question.
Mic 1: So my question is about the success
49:56 - 50:01

factor of it. If you think of this,
whatever database being set up there and I
50:01 - 50:07

think it's gonna be a huge database...I
downloaded my own genome on the Internet.
50:07 - 50:13

It was about 150 megabytes. And if we
multiply that, I think the genetic
50:13 - 50:18

variation from one person to another is
about 1 percent only. So we can compress
50:18 - 50:25

that to 4 megabytes per person. If we
sequence all the humans in the world, that
50:25 - 50:33

would be 32 petabytes, that would cost
approximately 15 billion dollars. And
50:33 - 50:37

that's only for the storage. Now comes the
entire management. Of course, we don't
50:37 - 50:41

want to digitize all the human genome, but
rather the plants and animal species
50:41 - 50:46

genome. So it's a huge data program. And
what would be for you the success factors
50:46 - 50:51

for this thing to really fly? And did you
talk to organizations like WikiData or
50:51 - 50:56

others or where would it ideally be
hosted? At a university or an
50:56 - 51:02

international nonprofit or who would be
running the thing?
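
[Editor's note: the arithmetic in the question can be checked with a short sketch. The 4 megabytes per person, the 32 petabytes and the 15 billion dollars are the questioner's own round numbers; the world population of about 8 billion is an assumption, chosen because it reproduces the 32 petabyte total. Note that 1 percent of 150 megabytes would be 1.5 megabytes, so the 4 megabyte figure is simply taken as given here.]

```python
# Back-of-the-envelope check of the storage estimate from the question.
# Inputs are the questioner's round numbers, except WORLD_POPULATION,
# which is an assumption (about 8 billion) that matches the 32 PB total.

MB = 10**6   # decimal megabyte, in bytes
GB = 10**9   # decimal gigabyte, in bytes
PB = 10**15  # decimal petabyte, in bytes

BYTES_PER_PERSON = 4 * MB          # compressed size quoted in the question
WORLD_POPULATION = 8_000_000_000   # assumption, not stated in the talk
QUOTED_COST_USD = 15_000_000_000   # storage cost quoted in the question


def total_storage_bytes(bytes_per_genome: int, n_genomes: int) -> int:
    """Total bytes needed to store one compressed genome per individual."""
    return bytes_per_genome * n_genomes


def implied_cost_per_gb(total_bytes: int, total_cost_usd: float) -> float:
    """Price per decimal gigabyte implied by a quoted total cost."""
    return total_cost_usd / (total_bytes / GB)


total = total_storage_bytes(BYTES_PER_PERSON, WORLD_POPULATION)
print(f"total storage: {total / PB:.0f} PB")  # total storage: 32 PB
print(f"implied price: {implied_cost_per_gb(total, QUOTED_COST_USD):.2f} USD per GB")
# implied price: 468.75 USD per GB
```

[The storage total follows from the stated inputs. The quoted cost implies several hundred dollars per gigabyte, so it is best read as a loose estimate rather than a figure derived from the storage volume alone.]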
51:02 - 51:15

Jutta: Yeah, I mean, it's just really big
data. I think our first goal is not to
51:15 - 51:24

think about having all predicted 5 to 10
million species be sequenced on a
51:24 - 51:30

population level. I think we need to think
about the next step. And there it would
51:30 - 51:36

make sense to start with species that are
actually highly exploited, like many
51:36 - 51:41

timber species and also many marine
fishes. I think that's where we should
51:41 - 51:48

start. And to host this kind of data I
think it should be in politically
51:48 - 51:56

independent hands. So it should be with an
NGO or with the U.N., some organization
51:56 - 52:02

that is independent.
Mic 1: Are you the first to think about
52:02 - 52:07

this or are there existing initiatives?
Jutta: There are actually existing
52:07 - 52:14

initiatives. I have been in contact with
the Forest Stewardship Council and they
52:14 - 52:23

are actually starting to sample their
concessions and initiated to build up the
52:23 - 52:29

samples, they work together with Kew
Botanical Gardens and the U.S. Forest
52:29 - 52:38

Service. And right now they're analyzing
the samples, using isotopes which is
52:38 - 52:46

another method which is very powerful and
can also produce geographic information.
52:46 - 53:01

And so, yeah, so people are moving in this
way. So, yeah, I think the idea is out
53:01 - 53:06

there, just we have to start and we have
to really do it and provide one
53:06 - 53:13

infrastructure so that we can combine, for
example, morphological data, isotope data
53:13 - 53:18

and genomic data into one dataset, which
will increase our resolution and our
53:18 - 53:24

reliability.
Angel: Okay. Microphone number two,
53:24 - 53:27

please.
Mic 2: Thank you for your valuable talk.
53:27 - 53:33

My question would be: you started your talk
with the possible decrease of leaf beetles
53:33 - 53:37

in the data set you showed on slide number
six there was an increase in leaf beetle
53:37 - 53:42

population until the 70s, something about
that. Is there a possible explanation for
53:42 - 53:50

that?
Jutta: Yeah, I believe it is, because
53:50 - 53:55

people started to much more systematically
observe leaf beetles. So it's a sample
53:55 - 54:06

effort. And also at that time the people -
so it's a multi-people collaboration who
54:06 - 54:12

actually has assembled this dataset so the
people who are part of this collaboration
54:12 - 54:17

they added their own private data sets. And
that's why you have an increase I think.
54:17 - 54:24

While the people from the nineteen
hundreds, nineteen hundred ten you only
54:24 - 54:29

can use the data that is available in
publications and samples in museums or in
54:29 - 54:33

scientific collections. I think that is
the reason why you have the sharp
54:33 - 54:36

increase.
Mic 2: Thank you.
54:36 - 54:39

Angel: So we have another question from
microphone number two.
54:39 - 54:44

Mic 2: Thank you for your fine talk.
Excuse me. Maybe my question is a bit off
54:44 - 54:52

topic. Do you think the methods and roles
that you identified in your talk could be
54:52 - 55:00

transferred to the assessment of raw
materials? I'm thinking about metals?
55:00 - 55:09

Jutta: Maybe the data infrastructure, like
if you wanted to collect raw metals or
55:09 - 55:16

materials from all over the world and...a
sampleized scientific collection and to
55:16 - 55:22

have kind of a reference dataset that
might work, actually. But the genomics
55:22 - 55:29

obviously won't. So that part of what you
would need to use different methods from
55:29 - 55:36

physics, obviously. But actually the
infrastructure, certain parts will be
55:36 - 55:40

quite similar. I think so, yes.
Angel: So we have one more question from
55:40 - 55:43

the Internet.
Signal-Angel: Who does contract a
55:43 - 55:52

freelance evolutionary biologist? Can you
give an example of this kind of work you
55:52 - 56:01

proposed?
Jutta: So I see this gap between science
56:01 - 56:08

and applications, that we need these
applications and there's a huge potential
56:08 - 56:18

for these applications. We know that
illegal logging and that is my background,
56:18 - 56:24

but doesn't seem to be much different, for
example, in marine fisheries. We know that
56:24 - 56:30

there is this huge amount of illegal
logging and timber trade going on. And we
56:30 - 56:40

need to have some assets actually that
have the power to detect illegally traded
56:40 - 56:50

timber. So I think there is a huge need
for these kind of methods and
56:50 - 57:01

organizations who are interested in these
kind of methods are governments,
57:01 - 57:13

companies, NGOs, customs, Interpol. So,
yeah.
57:13 - 57:20

Angel: Do we have any other questions? So
thank you again Jutta for your talk and
57:20 - 57:24

the valuable work you're doing. Please
give a warm round of applause to Jutta.
57:24 - 57:29

Applause
57:29 - 57:34

36C3 postroll music
57:34 - 57:56

Subtitles created by c3subtitles.de
in the year 2020. Join, and help us!

Title:: 36C3 - Protecting the Wild
Video Language:: English
Duration:: 57:56
