36C3 - Protecting the Wild

  • 0:19 - 0:20
    *36C3 preroll music*
  • 0:20 - 0:25
    Angel: Right now I'd like to welcome our
    first speaker on stage. The talk will be
  • 0:25 - 0:31
    about protecting the wild and I'll hand
    over to her. Please give her a warm round
  • 0:31 - 0:33
    of applause.
  • 0:33 - 0:35
    *Applause*
  • 0:35 - 0:44
    Jutta Buschbom: Thank you very much for
    the introduction. My name is Jutta
  • 0:44 - 0:52
    Buschbom, I'm an evolutionary biologist.
    That is my background. I did my PhD at
  • 0:52 - 0:57
    the University of Chicago working on
    little fungi that live in symbiosis with
  • 0:57 - 1:06
    algae and form colorful crusts on
    rocks. I then did a postdoc in
  • 1:06 - 1:12
    bioinformatics and after that moved back
    into organismic biology, working in forest
  • 1:12 - 1:20
    genetics. And in the ten years I worked in
    forest genetics, for the first time I
  • 1:20 - 1:26
    encountered questions with
    regard to application, and I found out
  • 1:26 - 1:37
    that actually moving from research to
    application is not trivial. So what I'm
  • 1:37 - 1:46
    going to present is a high tech way using
    genomic data to protect biodiversity in a
  • 1:46 - 1:52
    way that you can actually reach
    application and use conservation genomic
  • 1:52 - 2:03
    tools. So this summer the draft of the
    report of the Intergovernmental
  • 2:03 - 2:12
    Science-Policy Platform on Biodiversity and
    Ecosystem Services came out and its
  • 2:12 - 2:20
    results were quite worrying. It stated that
    around a million animal and plant species
  • 2:20 - 2:27
    are currently threatened and of those...half
    of those species are already dead species
  • 2:27 - 2:33
    walking. So, due to the destruction
    of their habitats or habitat deterioration,
  • 2:33 - 2:43
    they are not able to reproduce in a
    sustainable way anymore. A third of the
  • 2:43 - 2:51
    total species extinction risk to date
    has arisen in the last 25 years. And just
  • 2:51 - 3:01
    to give you an idea about the relation we
    are talking about...currently the rate of
  • 3:01 - 3:08
    extinction risk is already at least tens to
    hundreds of times higher than it has averaged
  • 3:08 - 3:13
    over the past 10 million years. And within
    these 10 million years there were the ice
  • 3:13 - 3:23
    ages, for example. And most of the
    extinction risk is due to land
  • 3:23 - 3:36
    and sea use change. The report
    even says that we already seem to
  • 3:36 - 3:42
    have transgressed a proposed precautionary
    planetary boundary, which means within the
  • 3:42 - 3:48
    boundary we have a stable biological
    system. But having transgressed it, we
  • 3:48 - 3:55
    might already be in a transition to a new
    state, and we have no way to find out what
  • 3:55 - 4:05
    this state is going to look like. So all
    of these facts that the report is stating
  • 4:05 - 4:15
    are actually pretty negative. And I was
    quite happy to read that they also present
  • 4:15 - 4:21
    that there are actually people who do
    better than most of us. And they point out
  • 4:21 - 4:28
    that many practices of indigenous people
    and local communities actually conserve
  • 4:28 - 4:38
    and sustain wild and domesticated
    biodiversity quite well. Today, a higher
  • 4:38 - 4:45
    proportion of the remaining terrestrial
    biodiversity lies in areas managed and
  • 4:45 - 4:53
    held by indigenous people. And these
    ecosystems are more intact and less
  • 4:53 - 5:02
    declining less rapidly. So we
    have examples of lifestyles that actually
  • 5:02 - 5:11
    do better than most of us. And I know the
    solutions won't be simple and it won't be
  • 5:11 - 5:22
    easy to get there but we can look to what
    these people do better than we do. All of
  • 5:22 - 5:28
    this sounds...it's a global report and it
    sounds kind of like far away, like
  • 5:28 - 5:36
    probably somewhere in the tropics, but
    actually threats to biodiversity also happen
  • 5:36 - 5:45
    directly in front of our own front
    doors. This summer a paper came out from
  • 5:45 - 5:53
    two colleagues from the University of
    Greifswald, who had analyzed a long-term
  • 5:53 - 5:58
    data set on leaf beetles. And they were
    asking if we already have a decline of
  • 5:58 - 6:08
    leaf beetles in Central Europe. So they
    compiled long term data sets of leaf
  • 6:08 - 6:19
    beetle observations for Central Europe,
    from 1900 to 2017, so
  • 6:19 - 6:27
    spanning almost a hundred and twenty years. And
    what they find is that systematic reports
  • 6:27 - 6:36
    on leaf beetles and leaf beetle
    observations are increasing during this
  • 6:36 - 6:45
    time interval, time span. But despite the
    fact that we have...like in the last two
  • 6:45 - 6:53
    decades, we had very high numbers of
    reports and observations for leaf beetles,
  • 6:53 - 7:00
    the number of species, the orange line, is
    declining. It's slightly declining. But
  • 7:00 - 7:06
    the question is, is this real or not? And
    what was most worrisome to the authors is
  • 7:06 - 7:15
    that in the data set, the number of
    species here in orange that had
  • 7:15 - 7:22
    more reports was declining, while the
    number of species that showed fewer reports
  • 7:22 - 7:34
    than before is expanding. These kinds of
    long-term datasets are very hard to
  • 7:34 - 7:41
    interpret and many factors can contribute
    to those patterns. And it's not clear if
  • 7:41 - 7:48
    this pattern is statistically significant.
    But if you take a step back and consider
  • 7:48 - 7:54
    your background knowledge, your prior
    knowledge about the state of the world, and
  • 7:54 - 8:03
    ask yourself: what does the current state
    look like? Does it look good or rather
  • 8:03 - 8:17
    worrisome? And then, with that knowledge,
    tell me that these results are an
  • 8:17 - 8:30
    artifact or a bias. I'm worried that once
    we have a statistically significant signal in
  • 8:30 - 8:42
    this dataset, it will be already too late.
    So right now, I've been talking about leaf
  • 8:42 - 8:50
    beetles and beetles are the largest group
    within insects with about 400,000 species.
  • 8:50 - 8:56
    Leaf beetles are a large family of about
    50,000 species which are distributed
  • 8:56 - 9:05
    worldwide. And here in Germany, we have
    over 470 leaf beetle species. So how do we
  • 9:05 - 9:10
    actually know how many species there are
    and who actually counted all these
  • 9:10 - 9:16
    species? And is that just a task of
    taxonomists? Taxonomy is the science of
  • 9:16 - 9:22
    naming and defining, including
    circumscribing and classifying groups of
  • 9:22 - 9:32
    biological organisms on the basis of
    shared characters. So one could have the
  • 9:32 - 9:38
    picture of some woman with a funny hat
    running over a meadow catching like
  • 9:38 - 9:44
    butterflies or some mushroom hunter
    crawling through the forest trying to find
  • 9:44 - 9:52
    mushrooms. And it's true, as biodiversity
    scientists we spend a lot of time outdoors
  • 9:52 - 10:02
    and yeah...on the other hand, taxonomy
    is a high-tech science today. So
  • 10:02 - 10:11
    taxonomists actually take up new
    technological tools and developments to
  • 10:11 - 10:17
    help them identify, describe and
    understand species. So taxonomists
  • 10:17 - 10:25
    actually are often experts in, for
    example, microscopy, mathematics,
  • 10:25 - 10:37
    biochemistry, even proteomics and
    genomics. So throughout the talk, I'm
  • 10:37 - 10:42
    going to compile this list of people and
    experts we're going to need to protect
  • 10:42 - 10:49
    biodiversity if we want to do this on the
    basis of genetic data. Right now, the list
  • 10:49 - 10:56
    is quite empty. The first entry is
    taxonomists, but that will change quickly
  • 10:56 - 11:06
    and taxonomists are a subgroup of
    evolutionary biologists mostly. So I told
  • 11:06 - 11:16
    you that taxonomists and biodiversity
    scientists take up technology and...so as
  • 11:16 - 11:24
    soon as computers came about and the
    internet started, people started to use
  • 11:24 - 11:32
    that to compile information about species,
    and today we have several global resources
  • 11:32 - 11:41
    available at the species level and above
    the species level. So we biodiversity
  • 11:41 - 11:46
    scientists were among the first who
    defined biodiversity information
  • 11:46 - 11:57
    standards. We have a global Catalogue of
    Life, a list of all named species. The
  • 11:57 - 12:02
    Global Biodiversity Information Facility
    has an aim to bring together information
  • 12:02 - 12:09
    from different sources and they are
    compiling, producing this wonderful map.
  • 12:09 - 12:14
    This is leaf beetles, all the records
    about leaf beetles that we have in the
  • 12:14 - 12:22
    world. And it looks as if leaf
    beetles are highly associated with first
  • 12:22 - 12:30
    world economies. However that clearly is
    an artifact and it just shows that we need
  • 12:30 - 12:35
    many more taxonomists and biodiversity
    scientists all over the world to find and
  • 12:35 - 12:45
    identify leaf beetles. So we also need
    biodiversity informaticians to help us
  • 12:45 - 12:52
    compile global lists and distribute
    knowledge. So far I have been talking
  • 12:52 - 12:58
    about species which is a simplification.
    The question is what is...what are species
  • 12:58 - 13:03
    actually? And so we need to talk about
    genetic diversity within and between
  • 13:03 - 13:17
    species. And I'm going to do so using
    gulls, which most of us might know. Here
  • 13:17 - 13:22
    in Europe, we have two large gulls of the
    genus Larus. One is in the front, the
  • 13:22 - 13:31
    lighter gray is our Silbermöwe. And in the
    back is our Heringsmöwe, the dark one. And
  • 13:31 - 13:36
    I'm going to use German names because the
    English names go crosswise and that's
  • 13:36 - 13:43
    completely confusing. So I will stick with
    the German names. Here in Europe these two
  • 13:43 - 13:48
    species seem to be really fine species
    because they barely interbreed, so they
  • 13:48 - 13:56
    don't hybridize. However, if you take a
    step back and look at the genus in
  • 13:56 - 14:03
    general, you see that the species of the
    genus are distributed kind of ringwise
  • 14:03 - 14:15
    around the Arctic. And so the idea is
    that, say during the Ice Age, all of this
  • 14:15 - 14:23
    area was glaciated and the gulls retreated
    to a refuge here near the Caspian Sea. And
  • 14:23 - 14:28
    then after the ice retreated, the gulls
    moved back north. One branch moved into
  • 14:28 - 14:34
    Europe forming our Heringsmöwe and
    another branch then moved counterclockwise
  • 14:34 - 14:41
    around the Arctic, producing different
    morphotypes, different species across the
  • 14:41 - 14:49
    Bering Strait and then into North America.
    There the dark blue one is...I'm
  • 14:49 - 14:59
    simplifying, the equivalent of our
    European Silbermöwe, the American
  • 14:59 - 15:04
    Silbermöwe. Then the idea is that some
    individuals crossed back to Europe and
  • 15:04 - 15:15
    formed our European Silbermöwe. And
    all of these species here are
  • 15:15 - 15:22
    interbreeding, so they hybridize. Only
    where this ring closes do those two species
  • 15:22 - 15:27
    not interbreed anymore. And the big
    question is, are we actually dealing with
  • 15:27 - 15:34
    one single species or are we dealing with
    different species that just happened to
  • 15:34 - 15:41
    hybridize more or less? The question is
    not trivial because it has consequences
  • 15:41 - 15:49
    for protection. If we are dealing with one
    single species, all the gulls in Eurasia
  • 15:49 - 15:53
    could go extinct and it wouldn't matter
    because we still would have the gulls in
  • 15:53 - 15:59
    North America. However, if we have
    different species in all of these areas,
  • 15:59 - 16:05
    we would need to protect individuals or
    the species on a regional level and
  • 16:05 - 16:17
    protect all of these different species. So
    to investigate this question about: Do we
  • 16:17 - 16:24
    have different species? And what were the
    evolutionary processes and histories that
  • 16:24 - 16:31
    brought about the species? A group of
    scientists investigated that using DNA
  • 16:31 - 16:40
    sequences. And on the left, you have the
    model, the theoretical model of the ring
  • 16:40 - 16:46
    species. And here on the right you have
    reality. And the scientists found that the
  • 16:46 - 16:52
    reality is always much more complex. So,
    for example, they found two refuges or
  • 16:52 - 16:58
    they proposed two refuges. But what they
    found was that genetic diversity was
  • 16:58 - 17:07
    correlated with those species or
    morphotypes. So what that also means is
  • 17:07 - 17:16
    that genetic diversity is correlated with
    geographic origin. From this
  • 17:16 - 17:24
    type of analysis we learn about
    evolutionary processes and history, about
  • 17:24 - 17:30
    variability and differentiation, about
    gene flow and migration, about speciation
  • 17:30 - 17:38
    processes. All of that we need to understand
    our species, which will allow us to
  • 17:38 - 17:43
    protect them. So we need evolutionary
    biologists who do phylogenetics and
  • 17:43 - 17:59
    population genetics. So once we found out
    that one can use genetic diversity to
  • 17:59 - 18:07
    infer geographic origin because genetic
    diversity is correlated with geography,
  • 18:07 - 18:18
    people immediately said 'Okay, we can use
    it for conservation applications.' And
  • 18:18 - 18:24
    we also learned that often it
    is unclear what a species is; species
  • 18:24 - 18:33
    boundaries are unclear and some species
    have huge distribution ranges with
  • 18:33 - 18:37
    different clusters of variability within
    this huge range. So we know that we need
  • 18:37 - 18:43
    to protect within-species genetic
    diversity, which means that we need to
  • 18:43 - 18:51
    understand within-species population
    structure and we need to build useful and
  • 18:51 - 18:59
    reliable models of population structure.
    These models are actually required for all
  • 18:59 - 19:04
    of our applications. They are required for
    monitoring, for example, for conservation
  • 19:04 - 19:12
    strategies, for functional adaptation and
    adaptability, questions of productivity
  • 19:12 - 19:19
    of different provenances, its impact on
    management regimes, breeding strategies,
  • 19:19 - 19:28
    and also for enforcement applications.
    From the studies I showed you before with
  • 19:28 - 19:34
    the gulls we also know that we need to
    approach the question of population
  • 19:34 - 19:47
    structure on a distribution-range-wide
    scale. So here's the map produced by
  • 19:47 - 19:54
    EUFORGEN, the European Network for forest
    reproductive material for one of our
  • 19:54 - 20:02
    native oaks, the sessile oak. And the dots
    are the sites for genetic conservation
  • 20:02 - 20:12
    units. And so that is one strategy for
    representing within-species genetic diversity
  • 20:12 - 20:22
    and for sampling it. And you can see this
    is a hypothetical example, but we likely
  • 20:22 - 20:32
    will see a gradient from west to east or
    might see one at this scale. Then once we
  • 20:32 - 20:38
    have these kind of global data sets, we
    can go to the fine scale and maybe, for
  • 20:38 - 20:44
    example, do a national genetic monitoring.
    And we will find much finer scale
  • 20:44 - 20:51
    gradients. We also will find, especially
    for forest trees, outliers, so stands
  • 20:51 - 20:59
    that don't fit the usual pattern. And that
    is because forest reproductive material
  • 20:59 - 21:08
    has been moved around a lot. And so these
    lighter or darker dots are material that
  • 21:08 - 21:16
    was moved to Germany from the outside. And
    we only will identify these outliers if we
  • 21:16 - 21:21
    have the whole reference dataset. If we
    don't have the whole reference dataset, we
  • 21:21 - 21:29
    might not identify these outliers - stands
    with a different history. Or in a worst
  • 21:29 - 21:34
    case, these outliers might actually bias
    our gradients. And we are always talking
  • 21:34 - 21:43
    about very slight gradients. So it's easy
    to bias these gradients, dilute them, so
  • 21:43 - 21:51
    we actually won't get the results we need.
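    As a small illustration of that point, here is a sketch in Python.
    The stands, longitudes and allele frequencies are invented for
    illustration, and fit_line is just a hypothetical helper doing an
    ordinary least-squares fit. It compares the west-to-east gradient
    estimated from native stands only with the gradient you get when a
    single stand of introduced material is mixed in unnoticed.

```python
def fit_line(points):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical stands: (longitude in degrees east, frequency of one allele).
# The native stands follow a slight west-to-east gradient.
NATIVE = [(6, 0.30), (8, 0.32), (10, 0.34), (12, 0.36), (14, 0.38)]
# Hypothetical stand planted with material brought in from elsewhere.
INTRODUCED = [(7, 0.60)]

slope_native, intercept_native = fit_line(NATIVE)
slope_mixed, _ = fit_line(NATIVE + INTRODUCED)

# With the reference gradient known, the introduced stand stands out
# through its large deviation from the value expected at its location.
lon, freq = INTRODUCED[0]
deviation = freq - (slope_native * lon + intercept_native)

print(f"gradient from native stands only: {slope_native:+.4f} per degree")
print(f"gradient with the outlier mixed in: {slope_mixed:+.4f} per degree")
print(f"deviation of the introduced stand: {deviation:+.2f}")
```

    In this toy setup one unrecognized stand is enough to change the
    estimated gradient noticeably, which is why the reference dataset
    has to cover the whole range.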
    Compiling these kinds of reference
  • 21:51 - 21:58
    datasets takes huge collaborative efforts
    because people need to go out into the
  • 21:58 - 22:04
    field and collect the reference samples
    and that might be scientists, that might
  • 22:04 - 22:14
    be people from local communities, citizen
    scientists, managers, owners, government
  • 22:14 - 22:20
    officials who provide background
    information, maps, distribution
  • 22:20 - 22:28
    information and also in many parts of the
    world might protect the people who are
  • 22:28 - 22:35
    actually collecting the samples. And it
    might be conservation activists and NGOs.
  • 22:35 - 22:41
    So once the samples have been collected
    they need to be stored somewhere for the
  • 22:41 - 22:51
    long term and the information needs to be
    databased. And that is the work of
  • 22:51 - 22:59
    scientific collections, which are mostly
    at natural history museums and there the
  • 22:59 - 23:05
    samples are processed. They're organized
    in ways that you can find them again. All
  • 23:05 - 23:10
    the metadata is entered. That is done by curators,
    collection managers, preparators and
  • 23:10 - 23:17
    technical staff at the scientific
    collections. So once we have these kinds of
  • 23:17 - 23:25
    data sets, large scale data sets, what are
    we actually doing with them? So the
  • 23:25 - 23:33
    foundation for all of our applications is
    population structure and there
  • 23:33 - 23:42
    specifically population assignment. The
    process is this: first, we decide on a
  • 23:42 - 23:47
    question and design our project
    accordingly so that we can answer the
  • 23:47 - 23:52
    question. Then we need to infer the
    population structure model and optimize
  • 23:52 - 23:57
    it. In the next step we need to check if our
    model actually is good enough for
  • 23:57 - 24:03
    application because we might have found
    the best model, but it might still not be
  • 24:03 - 24:07
    good enough for application. So we need to
    test that. And that is the step of
  • 24:07 - 24:13
    population assignment or predictive
    assignment. And then in the end, we want
  • 24:13 - 24:19
    to test our hypothesis. Are the two stands
    different or does an individual come from
  • 24:19 - 24:31
    stand A or from stand B? And here we
    identify error rates and accuracy. So this
  • 24:31 - 24:39
    whole process is very statistical. And so
    the analysis of these reference data
  • 24:39 - 24:48
    needs to be accompanied by biostatisticians
    who can tell us how to analyze our data.
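    The assignment step described here can be sketched in a few lines of
    Python. This is a minimal sketch under simplifying assumptions and
    not the method of any particular study: markers are biallelic,
    populations are in Hardy-Weinberg proportions, and markers are
    independent. The toy genotypes, the population labels and all
    function names are hypothetical. Each reference individual is left
    out in turn and assigned using the others, which gives a rough
    estimate of assignment accuracy.

```python
import math

# Hypothetical toy reference data: (population label, genotype).
# A genotype lists, per marker, how many copies (0, 1 or 2) of one
# allele the individual carries. All values are invented.
REFERENCE = [
    ("west", (2, 2, 0, 1)), ("west", (2, 1, 0, 0)),
    ("west", (2, 2, 1, 0)), ("west", (1, 2, 0, 0)),
    ("east", (0, 0, 2, 2)), ("east", (0, 1, 2, 1)),
    ("east", (1, 0, 2, 2)), ("east", (0, 0, 1, 2)),
]

def allele_frequencies(genotypes):
    """Per-marker allele frequency, with a pseudocount so it is never 0 or 1."""
    n = len(genotypes)
    n_markers = len(genotypes[0])
    return [(sum(g[m] for g in genotypes) + 1) / (2 * n + 2)
            for m in range(n_markers)]

def log_likelihood(genotype, freqs):
    """Log-probability of a genotype under Hardy-Weinberg proportions."""
    return sum(
        math.log(math.comb(2, k) * p ** k * (1 - p) ** (2 - k))
        for k, p in zip(genotype, freqs)
    )

def assign(genotype, reference):
    """Return the population under which the genotype is most likely."""
    by_pop = {}
    for pop, g in reference:
        by_pop.setdefault(pop, []).append(g)
    scores = {pop: log_likelihood(genotype, allele_frequencies(gs))
              for pop, gs in by_pop.items()}
    return max(scores, key=scores.get)

def leave_one_out_accuracy(reference):
    """Share of reference individuals assigned to their own population
    when each one is left out of the reference in turn."""
    hits = 0
    for i, (pop, g) in enumerate(reference):
        rest = reference[:i] + reference[i + 1:]
        hits += assign(g, rest) == pop
    return hits / len(reference)

print(assign((2, 2, 0, 0), REFERENCE))
print(leave_one_out_accuracy(REFERENCE))
```

    With real data the populations overlap far more than in this toy
    example, so the accuracy is what decides whether a model is good
    enough for application.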
  • 24:48 - 24:55
    So what is the state-of-the-art right now?
    What kind of geographic resolution do we
  • 24:55 - 25:03
    actually get for non-model species
    currently? And I'm going to present the
  • 25:03 - 25:10
    example of an African timber tree
    species, which is a very valuable timber.
  • 25:10 - 25:18
    It's one example but basically all results
    for species that have large distribution
  • 25:18 - 25:26
    ranges and are continuously distributed
    and are also long-lived, are very similar.
  • 25:26 - 25:33
    So these kinds of results seem to be species
    independent. The species are Milicia
  • 25:33 - 25:40
    regia and Milicia excelsa, African teak, which
    cannot be grown in plantations for timber
  • 25:40 - 25:51
    quality. So it is harvested unsustainably
    from natural forests. It's distributed in
  • 25:51 - 26:01
    West, Central and East Africa. Here's a
    black rectangle. And a group of a dozen
  • 26:01 - 26:06
    scientists got together and they actually
    sampled a reference dataset for these two
  • 26:06 - 26:19
    species. It's over 400 samples; they
    analyzed four marker systems, resulting in
  • 26:19 - 26:25
    a total of something like 100 markers,
    genetic markers, and then they optimized
  • 26:25 - 26:33
    the population model and used different
    parameter settings. And we're going to
  • 26:33 - 26:40
    concentrate here on the best solution that
    they found. And basically this rectangle
  • 26:40 - 26:48
    here is the black one over here. So the
    resolution is they found population
  • 26:48 - 26:55
    structure with clear clusters. So the
    populations and the species from West
  • 26:55 - 27:01
    Africa can be distinguished from those
    populations in Central Africa. And the
  • 27:01 - 27:08
    ones in East Africa can be differentiated.
    So that is really good. So we have
  • 27:08 - 27:13
    population structure. We know there is
    signal. The problem is still that our
  • 27:13 - 27:22
    resolution is much lower than we would
    need to have it because we basically need
  • 27:22 - 27:32
    resolution at least on a country level,
    because most of the laws are national. So
  • 27:32 - 27:42
    it might be legal to harvest a tree in one
    country, but not in another country. So we
  • 27:42 - 27:50
    need to get our resolution down to country
    level or even to regional level if we
  • 27:50 - 27:52
    want to distinguish whether the tree was
    harvested in a national park, in a
  • 27:52 - 28:02
    protected area, or outside in a managed
    forest. And when we as biodiversity
  • 28:02 - 28:11
    scientists don't know how to continue,
    one thing to do is to look at what people do
  • 28:11 - 28:17
    with model organisms and specifically what
    people do in human population genomics
  • 28:17 - 28:24
    because there thousands of population
    geneticists are working and there is a
  • 28:24 - 28:28
    completely different funding background
    due to the interest of the medical and the
  • 28:28 - 28:40
    pharma industry. So they are always more
    advanced. What we can learn from there,
  • 28:40 - 28:47
    from human population genomics, is
    that we need two features. One is we
  • 28:47 - 28:54
    already know that we need distribution-wide
    sampling, which provides a spatial
  • 28:54 - 29:00
    context. The second feature is that we
    need genome-wide sequencing, preferably
  • 29:00 - 29:09
    whole-genome sequencing, which provides us depth
    in time because our genomes are archives
  • 29:09 - 29:15
    of our evolutionary history. They are
    records of all the processes and events
  • 29:15 - 29:21
    and this depth in time then translates
    also into resolution. Once we have these
  • 29:21 - 29:32
    two features, actually these reference
    datasets open Pandora's box. Suddenly we
  • 29:32 - 29:36
    can ask all kinds of questions and
    objectives, even those that we still don't
  • 29:36 - 29:47
    know. We can develop all kinds of
    applications which is done for humans.
  • 29:47 - 30:00
    Currently, there are at least four global
    datasets on human diversity. These are
  • 30:00 - 30:09
    very widely reused. These are big datasets
    - big data with regard to the
  • 30:09 - 30:19
    number of samples and also the genomes or
    the genome representations - and this
  • 30:19 - 30:26
    results in very information-rich data
    which initiates analytical development so
  • 30:26 - 30:34
    people continuously are developing new
    statistical methods. And right now, a new
  • 30:34 - 30:42
    wave is coming in of these methods. So
    once they had these global datasets,
  • 30:42 - 30:48
    people in human population
    genomics started to do these intense
  • 30:48 - 30:56
    regional samplings. And this is the
    example of the United Kingdom Biobank.
  • 30:56 - 31:03
    It's a project with 500,000 volunteers,
    they are all UK citizens from all over the
  • 31:03 - 31:14
    islands. And each individual was genotyped
    in a wet lab for 820,000 markers. That's
  • 31:14 - 31:20
    I mean, that's a completely different
    number than the 100 or 1000...in
  • 31:20 - 31:26
    biodiversity science we normally
    analyse at most a few tens of thousands of
  • 31:26 - 31:36
    markers. So that's a completely different
    number. But then statistical geneticists
  • 31:36 - 31:47
    come. They do some weird and wonderful
    voodoo and they derive 96 million markers
  • 31:47 - 31:53
    per genome, that is per individual, from
    these 820,000 markers that were produced
  • 31:53 - 32:01
    in the lab. So that's roughly a hundred-fold
    increase. And once you have this kind of
  • 32:01 - 32:08
    dataset for a genome, you
    finally get country-level and
  • 32:08 - 32:19
    within-country-level resolution. So these panels
    are examples. So the first panel shows
  • 32:19 - 32:26
    individuals who were born in Edinburgh and
    the question was "Where were people born
  • 32:26 - 32:32
    who had a similar ancestral background,
    genetic background?". And what they found
  • 32:32 - 32:42
    was that that was all over Scotland and
    Northern Ireland. Northern Yorkshire was
  • 32:42 - 32:50
    even more local. So people from Yorkshire
    don't seem to get around a lot. For London
  • 32:50 - 32:54
    the situation is completely different.
    That is what we would expect because
  • 32:54 - 33:00
    London is a people magnet. People move
    there all the time. They meet there, they
  • 33:00 - 33:06
    have children and the kids born in London,
    their genetic ancestry has nothing to do
  • 33:06 - 33:13
    with London. It's from all over the place,
    from the British Isles and the world. So
  • 33:13 - 33:22
    that's why the colors are strongly
    dissolved. So this study came out also
  • 33:22 - 33:26
    this summer. And it's the first time that
    I have seen that we actually really can
  • 33:26 - 33:37
    achieve regional resolution. And I find
    this possibility for biodiversity science
  • 33:37 - 33:47
    really exciting. So it was made possible
    by very sophisticated statistical
  • 33:47 - 33:52
    approaches which are able to analyze
    genetic data from highly complex
  • 33:52 - 33:59
    evolutionary and ecological systems. And
    at the same time these analyses are able
  • 33:59 - 34:05
    to handle big data. We're talking about
    gigabytes and terabytes of data and
  • 34:05 - 34:14
    results. So statistical geneticists are
    developing new methods of data
  • 34:14 - 34:20
    representation to handle this amount of
    data. And then we are able to efficiently
  • 34:20 - 34:26
    extract the signal for a very specific
    question from data which have a very low
  • 34:26 - 34:37
    signal-to-noise ratio. So to get there, we
    need many experts and specialists. So we
  • 34:37 - 34:42
    need statistical geneticists, big data
    experts who also might contribute machine
  • 34:42 - 34:49
    learning expertise. We need molecular
    biologists who know how to sequence
  • 34:49 - 34:54
    complex genomes. We need
    bioinformaticians with expertise in
  • 34:54 - 35:05
    genomics for assembly, annotation and
    alignment of genomic sequences. The result
  • 35:05 - 35:13
    is actually this: This is the author list
    for the 1000 Genomes Project
  • 35:13 - 35:20
    reference data set, and I don't expect you
    to be able to read it, but the bold type
  • 35:20 - 35:26
    is of interest because it shows all the
    different tasks that are necessary to
  • 35:26 - 35:36
    produce a standardized and highly cleaned
    reference dataset. So the whole author
  • 35:36 - 35:42
    list is something like 1.5 pages long,
    even considering that some authors will
  • 35:42 - 35:51
    have contributed to several tasks. The
    publications for reference datasets mostly
  • 35:51 - 35:57
    have author lists that are far over 50
    people. So they are huge collaborative
  • 35:57 - 36:05
    efforts. Now we take the step into
    biodiversity science. Here these are eight
  • 36:05 - 36:13
    gastrotrichs, they are little worm-like
    organisms that live in the sediments of
  • 36:13 - 36:23
    freshwater lakes and in marine sediments. They
    are in general a couple of hundred
  • 36:23 - 36:30
    micrometers long. And I don't have any
    numbers, but my guess would be that maybe
  • 36:30 - 36:39
    worldwide, a hundred to a thousand people
    actually work on these species. There are
  • 36:39 - 36:45
    800 species of gastrotrichs. So let's say
    there's one, two, maybe three experts per
  • 36:45 - 36:52
    species for these organisms. So how are
    these three people going to manage all
  • 36:52 - 37:01
    these tasks to produce a reference
    dataset? You might say, well, it's
  • 37:01 - 37:05
    gastrotrichs, I mean, I have never heard
    about them. Maybe they are not so
  • 37:05 - 37:08
    important. Maybe you don't need a
    reference data set, but actually some of
  • 37:08 - 37:18
    those species are bioindicators for water
    quality. So what we observe right now is a
  • 37:18 - 37:28
    gap for biodiversity conservation. In
    model organisms, we have Pandora's Box
  • 37:28 - 37:35
    open. We have all the statistical analyses
    at our hands to analyze our data sets.
  • 37:35 - 37:40
    However, in non-model organisms, we are
    still stuck with summary statistics that
  • 37:40 - 37:47
    don't provide us with the resolution that we
    need. And we know that to close this gap,
  • 37:47 - 37:53
    even for a single species, it's a huge
    effort. But at the same time, we have over
  • 37:53 - 38:04
    35,000 species listed by CITES which
    already need effective protection now. So
  • 38:04 - 38:10
    we need to find a way to close this gap
    and actually move in this direction.
  • 38:10 - 38:20
    So all of this is in
    biodiversity science, in academia, and we
  • 38:20 - 38:25
    need to make the transition over the
    conservation genomics gap into the big
  • 38:25 - 38:32
    loop of real-world conservation tasks. And
    the good thing is we already know what we
  • 38:32 - 38:38
    have to do. So we need to have reference
    data sets, distribution-range-wide. We
  • 38:38 - 38:44
    need to have statistics. And it's going to
    be big data. So we need collection
  • 38:44 - 38:54
    management, data management and an
    analysis environment. So looking at
  • 38:54 - 39:00
    different ingredients or different steps
    the first thing we need is a general data
  • 39:00 - 39:05
    infrastructure for global biodiversity
    reference data sets that actually can be
  • 39:05 - 39:12
    used across species for preferably as many
    species as possible and provide a working
  • 39:12 - 39:20
    environment for biodiversity scientists
    and experts. It should be user friendly so
  • 39:20 - 39:26
    it can be used by scientists, but also
    that people from local communities and
  • 39:26 - 39:33
    citizen scientists can add their
    observation data and their data into this
  • 39:33 - 39:41
    data infrastructure. I have listed quite a
    lot of features that these kind of
  • 39:41 - 39:48
    infrastructures should have. And I'm going
    to argue that these features are not some
  • 39:48 - 40:03
    nice to have, but actually some must have.
    Because our goal is always application. So
  • 40:03 - 40:13
    we need developers, managers and curators
    for data infrastructures. Since our goal
  • 40:13 - 40:31
    is application, our main features are
    quality control and error reduction. These
  • 40:31 - 40:39
    are the basis. So that our conservation
    tools can be robustly and reliably applied
  • 40:39 - 40:46
    under real world operating conditions. And
    the way to achieve quality and error
  • 40:46 - 40:53
    reduction is through chains of custody. So
    it means that from project design, from
  • 40:53 - 40:58
    the questions through all the steps that
    are necessary to produce a reference data
  • 40:58 - 41:08
    set and then...so from sample collection,
    genomic statistical analysis down to
  • 41:08 - 41:16
    application. These steps need to be
    documented and standardized. They need to
  • 41:16 - 41:22
    be, each one of them needs to be validated
    and reproducible. They should be modular
  • 41:22 - 41:29
    so they can be user friendly. And the
    whole chain of custody needs to be
  • 41:29 - 41:41
    scalable. So if our chains of custody have
    these characteristics, we actually will
  • 41:41 - 41:51
    have tools that will work in everyday
    life. So we need professional developers
  • 41:51 - 42:00
    and programmers who are able to produce
    these very collaborative softwares. We
  • 42:00 - 42:06
    need free and open source experts. So we
    always can ensure that our code and that
  • 42:06 - 42:14
    our infrastructures are still integer and
    we can check them. And I'm a biologist, I
  • 42:14 - 42:19
    don't have any background in hardware, but
    I've heard a couple of talks here in the
  • 42:19 - 42:26
    conference about Green IT here. And I have
    the feeling we should have people who know
  • 42:26 - 42:34
    hardware and software and know how to
    develop these high tech tools in a way
  • 42:34 - 42:38
    sustainable so that by developing these
    tools, we don't use more resources than we
  • 42:38 - 42:49
    are trying to protect. So I've shown all
    these features and characteristics that
  • 42:49 - 42:57
    the software should have. And I'm arguing
    that these features are necessary because
  • 42:57 - 43:05
    of the reality we find us in. It is one of
    rising over-exploitation and destruction
  • 43:05 - 43:20
    of nature. So the extent of environmental
    crimes is up in the billions. All
  • 43:20 - 43:29
    environmental crime together, the green
    bubbles are only second to drug associated
  • 43:29 - 43:35
    crimes. They are up there with
    counterfeiting or human trafficking. So
  • 43:35 - 43:45
    these are multi-billion enterprises. They
    are often transnational and industries
  • 43:45 - 44:02
    with huge profits. So if there's some
    crime, some mafia boss, some criminal
  • 44:02 - 44:10
    manager who just bribed a government
    official somewhere in the neck of the
  • 44:10 - 44:18
    woods, it just would make sense that that
    person would not wait or not take the
  • 44:18 - 44:24
    risks to be discovered just because some
    customs officer pulls out a container
  • 44:24 - 44:29
    somewhere in the harbor, for example,
    opens it and says "This looks kind of
  • 44:29 - 44:37
    weird. Let's take a sample, send it to a
    lab." and then a population geneticist
  • 44:37 - 44:44
    comes back and says "Oh, yes, this sample
    is not from area A as documented, but
  • 44:44 - 44:52
    actually it's from area B and it was
    illegally logged." If we have reference
  • 44:52 - 44:59
    data sets, information rich reference data
    sets, they become highly valuable and they
  • 44:59 - 45:08
    need protection themselves against
    manipulation and destruction. So we will
  • 45:08 - 45:15
    need to think about IT security from the
    beginning. Also, these data sets are often
  • 45:15 - 45:20
    very politically sensitive because if it
    is shown that in a certain country there
  • 45:20 - 45:26
    is the illegal logging repeatedly, that
    country might not be too excited about
  • 45:26 - 45:41
    this information. So we need to think
    about IT security experts. So my hope is
  • 45:41 - 45:49
    that these kind of very high tech digital
    conservation tools can actually contribute
  • 45:49 - 45:56
    to the U.N. Sustainable Development Goals
    by empowering indigenous people, local
  • 45:56 - 46:03
    communities and also us to protect and
    force and sustainably use our lands and
  • 46:03 - 46:10
    our biodiversity by providing some
    management and law enforcement tools. So
  • 46:10 - 46:14
    we need people from around the world,
    users from around the world who use these
  • 46:14 - 46:26
    tools and help to develop them further and
    to maintain them. And finally here, these
  • 46:26 - 46:34
    high tech tools will just be another
    technological fix if we don't manage to
  • 46:34 - 46:46
    get our back down, our way of life down to
    sustainable levels. So what we need is to
  • 46:46 - 46:54
    today...this year, the Earth Overshoot Day
    was at the end of July. So at the end of
  • 46:54 - 47:02
    July, we had used all the resources that
    we had available for the whole year. And
  • 47:02 - 47:09
    we need to get this back to the end of the
    year so that our resources actually
  • 47:09 - 47:23
    sustain us for the whole year. The graphic
    here for Germany suggests that we are on a
  • 47:23 - 47:30
    good way. We are reducing our resource
    consumption and maybe even our biocapacity
  • 47:30 - 47:38
    moves up a little bit. So actually it
    seems that our personal lifestyles and
  • 47:38 - 47:46
    choices make a difference and we just need
    to close this gap here much quicker. So
  • 47:46 - 47:54
    protecting biodiversity needs all of us to
    achieve that. And with that, thank you
  • 47:54 - 47:59
    very much.
  • 47:59 - 48:04
    *Applause*
  • 48:04 - 48:13
    Angel: So thank you Jutta for this very
    interesting talk and the very valuable
  • 48:13 - 48:17
    work you're doing. We have three mics
    here. Please line up at the microphones if
  • 48:17 - 48:23
    you have any questions or suggestions or
    want to participate and work together with
  • 48:23 - 48:28
    Jutta. We have one question from the
    Internet, so please Signal-Angel start.
  • 48:28 - 48:35
    Signal-Angel: Why do wild plant species
    within a genus are further apart than wild
  • 48:35 - 48:43
    animal species within a genus?
    Angel: Could you repeat it, please?
  • 48:43 - 48:49
    Signal-Angel: Why do wild plant species
    within a genus are further apart than wild
  • 48:49 - 48:56
    animal species within a genus?
    Jutta: I'm not sure I understand the
  • 48:56 - 49:01
    background for the question.
    Mic 1: Because animals move and plants
  • 49:01 - 49:06
    don't move.
    Jutta: Oh, okay. If that is the idea
  • 49:06 - 49:12
    behind the question. Plants actually move,
    too. They don't move as individuals, but
  • 49:12 - 49:24
    they move their genetic material through
    pollen or fragments. So actually diversity
  • 49:24 - 49:31
    in plants and in animals can be quite
    similar. So the idea is that plants are
  • 49:31 - 49:36
    just stuck and should have a completely
    different population structure does not
  • 49:36 - 49:43
    hold because plants move around their
    genetic material through seeds, through
  • 49:43 - 49:50
    pollen, through vegetative propagules.
    Angel: So thank you microphone 1 for
  • 49:50 - 49:56
    helping out. Please ask your question.
  • 49:50 - 49:56
    Mic 1: So my question is about the success
    factor of it. If you think of this,
    whatever database being set up there and I
  • 50:01 - 50:07
    think it's gonna be a huge database...I
    downloaded my own genome on the Internet.
  • 50:07 - 50:13
    It was about 150 megabytes. And if we
    multiply that, I think the genetic
  • 50:13 - 50:18
    variation from one person to another is
    about 1 percent only. So we can compress
  • 50:18 - 50:25
    that to 4 megabytes per person. If we
    sequence all the humans in the world, that
  • 50:25 - 50:33
    would be 32 petabytes, that would cost
    approximately 15 billion dollars. And
  • 50:33 - 50:37
    that's only for the storage. Now comes the
    entire management. Of course, we don't
  • 50:37 - 50:41
    want to digitize all the human genome, but
    rather the plants and animal species
  • 50:41 - 50:46
    genome. So it's a huge data program. And
    what would be for you the success factors
  • 50:46 - 50:51
    for this thing to really fly? And did you
    talk to organizations like Wikidata or
  • 50:51 - 50:56
    others or where would it ideally be
    hosted? At a university or an
  • 50:56 - 51:00
    international nonprofit or who would be
    running the thing?
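
The arithmetic in the question above can be checked with a short sketch. It takes the quoted figures as given (4 megabytes per compressed genome, 32 petabytes in total, 15 billion dollars) and does not try to verify them. The population of 8 billion is an assumption, chosen because it is the number that makes the quoted total come out, and the function names are hypothetical.

```python
# Back-of-envelope check of the storage figures quoted in the question.
# All inputs are the speaker's numbers except `population`, which is an
# assumption (8 billion is the value that reproduces the quoted 32 PB).

MB = 10**6   # decimal megabyte
TB = 10**12  # decimal terabyte
PB = 10**15  # decimal petabyte


def total_storage_bytes(bytes_per_genome: int, n_genomes: int) -> int:
    """Raw storage needed for n_genomes records of the given size."""
    return bytes_per_genome * n_genomes


def implied_cost_per_tb(total_cost_usd: float, total_bytes: int) -> float:
    """Cost per terabyte implied by a total budget and a total volume."""
    return total_cost_usd / (total_bytes / TB)


compressed_genome = 4 * MB      # "4 megabytes per person" (quoted)
population = 8_000_000_000      # assumption, not stated in the question

total = total_storage_bytes(compressed_genome, population)
print(total / PB)                        # 32.0 petabytes
print(implied_cost_per_tb(15e9, total))  # 468750.0 dollars per terabyte
```

The same two functions can be reused for a reference data set of another species by swapping in a different per-genome size and number of sampled individuals.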
  • 51:00 - 51:15
    Jutta: Yeah, I mean, it's just really big
    data. I think our first goal is not to
  • 51:15 - 51:24
    think about having all predicted 5 to 10
    million species be sequenced on a
  • 51:24 - 51:30
    population level. I think we need to think
    about the next step. And there it would
  • 51:30 - 51:36
    make sense to start with species that are
    actually highly exploited, like many
  • 51:36 - 51:41
    timber species and also many marine
    fishes. I think that's where we should
  • 51:41 - 51:48
    start. And to host this kind of data I
    think it should be in politically
  • 51:48 - 51:56
    independent hands. So it should be with an
    NGO or with the U.N., some organization
  • 51:56 - 52:03
    that is independent.
    Mic 1: Are you the first to think about
  • 52:03 - 52:07
    this or are there existing initiatives?
    Jutta: There are actually existing
  • 52:07 - 52:14
    initiatives. I have been in contact with
    the Forest Stewardship Council and they
  • 52:14 - 52:23
    are actually starting to sample their
    concessions and initiated to build up the
  • 52:23 - 52:29
    samples, they work together with Kew
    Botanical Gardens and the U.S. Forest
  • 52:29 - 52:38
    Service. And right now they're analyzing
    the samples, using isotopes which is
  • 52:38 - 52:46
    another method which is very powerful and
    can also produce geographic information.
  • 52:46 - 53:01
    And so, yeah, so people are moving in this
    way. So, yeah, I think the idea is out
  • 53:01 - 53:06
    there, just we have to start and we have
    to really do it and provide one
  • 53:06 - 53:13
    infrastructure so that we can combine, for
    example, morphological data, isotope data
  • 53:13 - 53:18
    and genomic data into one dataset, which
    will increase our resolution and our
  • 53:18 - 53:24
    reliability.
    Angel: Okay. Microphone number two,
  • 53:24 - 53:27
    please.
    Mic 2: Thank you for your valuable talk.
  • 53:27 - 53:33
    My question would be: you started your talk
    with the possible decrease of leaf beetles.
  • 53:33 - 53:37
    In the data set you showed on slide number
    six there was an increase in leaf beetle
  • 53:37 - 53:42
    population until the 70s, something about
    that. Is there a possible explanation for
  • 53:42 - 53:50
    that?
    Jutta: Yeah, I believe it is, because
  • 53:50 - 53:55
    people started to much more systematically
    observe leaf beetles. So it's a sample
  • 53:55 - 54:06
    effort. And also at that time the people -
    so it's a multi-people collaboration who
  • 54:06 - 54:12
    actually has assembled this dataset so the
    people who are part of this collaboration
  • 54:12 - 54:17
    they added their own private data sets. And
    that's why you have an increase I think.
  • 54:17 - 54:24
    While the people from the nineteen
    hundreds, nineteen hundred ten you only
  • 54:24 - 54:29
    can use the data that is available in
    publications and samples in museums or in
  • 54:29 - 54:33
    scientific collections. I think that is
    the reason why you have the sharp
  • 54:33 - 54:36
    increase.
    Mic 2: Thank you.
  • 54:36 - 54:39
    Angel: So we have another question of
    microphone number two.
  • 54:39 - 54:44
    Mic 2: Thank you for your fine talking.
    Excuse me. Maybe my question is a bit off
  • 54:44 - 54:52
    topic. Do you think the methods and roles
    that you identified in your talk could be
  • 54:52 - 55:00
    transferred to the assessment of raw
    materials? I'm thinking about metals?
  • 55:00 - 55:09
    Jutta: Maybe the data infrastructure, like
    if you wanted to collect raw metals or
  • 55:09 - 55:16
    materials from all over the world and...a
    sampleized scientific collection and to
  • 55:16 - 55:22
    have kind of a reference dataset that
    might work, actually. But the genomics
  • 55:22 - 55:29
    obviously won't. So that part of what you
    would need to use different methods from
  • 55:29 - 55:36
    physics, obviously. But actually the
    infrastructure, certain parts will be
  • 55:36 - 55:40
    quite similar. I think so, yes.
    Angel: So we have one more question from
  • 55:40 - 55:43
    the Internet.
    Signal-Angel: Who does contract a
  • 55:43 - 55:52
    freelance evolutionary biologist? Can you
    give an example of this kind of work you
  • 55:52 - 56:01
    proposed?
    Jutta: So I see this gap between science
  • 56:01 - 56:08
    and applications, that we need these
    applications and there's a huge potential
  • 56:08 - 56:18
    for these applications. We know that
    illegal logging and that is my background,
  • 56:18 - 56:24
    but doesn't seem to be much different, for
    example, in marine fisheries. We know that
  • 56:24 - 56:30
    there is this huge amount of illegal
    logging and timber trade going on. And we
  • 56:30 - 56:40
    need to have some assays actually that
    have the power to detect illegally traded
  • 56:40 - 56:50
    timber. So I think there is a huge need
    for these kind of methods and
  • 56:50 - 57:01
    organizations who are interested in these
    kind of methods: governments,
  • 57:01 - 57:13
    companies, NGOs, customs, Interpol. So,
    yeah.
  • 57:13 - 57:20
    Angel: Do we have any other questions? So
    thank you again Jutta for your talk and
  • 57:20 - 57:23
    the valuable work you're doing. Please
    give a warm round of applause to Jutta.
  • 57:23 - 57:24
    *Applause*
  • 57:24 - 57:25
    *36C3 postroll music*
  • 57:25 - 57:26
    Subtitles created by c3subtitles.de
    in the year 2020. Join, and help us!
Title:
36C3 - Protecting the Wild
Video Language:
English
Duration:
57:56
