-
People use the internet
for various reasons.
-
It turns out that one of the most
popular categories of website
-
is something that people
typically consume in private.
-
It involves curiosity,
-
non-insignificant levels
of self-indulgence
-
and is centered around recording
the reproductive activities
-
of other people.
-
(Laughter)
-
Of course, I'm talking about genealogy --
-
(Laughter)
-
the study of family history.
-
When it comes to detailing family history,
-
in every family, we have this person
that is obsessed with genealogy.
-
Let's call him Uncle Bernie.
-
Uncle Bernie is exactly the last person
you want to sit next to
-
at Thanksgiving dinner,
-
because he will bore you to death
with peculiar details
-
about some ancient relatives.
-
But as you know,
-
there is a scientific side to everything,
-
and we found that Uncle Bernie's stories
-
have immense potential
for biomedical research.
-
We let Uncle Bernie
and his fellow genealogists
-
document their family trees through
a genealogy website called geni.com.
-
When users upload
their trees to the website,
-
it scans their relatives,
-
and if it finds matches to existing trees,
-
it merges the existing
and the new tree together.
-
The result is that large
family trees are created,
-
beyond the individual level
of each genealogist.
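-
The merging step described above can be sketched in Python. This is only an illustration under strong assumptions, not geni.com's actual matching logic: each tree is reduced to a set of hypothetical person keys (here a name and a birth year), and two trees are merged whenever they share at least one key.

```python
from typing import Hashable, Iterable


def merge_trees(trees: Iterable[set[Hashable]]) -> list[set[Hashable]]:
    """Merge trees (sets of person keys) that share at least one person.

    Assumption: a shared key means "same person". A real service would
    need fuzzy matching on names, dates, and places.
    """
    merged: list[set[Hashable]] = []
    for tree in trees:
        combined = set(tree)
        # Already-merged trees are pairwise disjoint, so it is enough to
        # check each of them once against the newly uploaded tree.
        overlapping = [m for m in merged if m & combined]
        for m in overlapping:
            combined |= m
            merged.remove(m)
        merged.append(combined)
    return merged


# Hypothetical uploads: the first and third tree share "Bob, born 1825".
uploads = [
    {("Ann", 1800), ("Bob", 1825)},
    {("Cid", 1900)},
    {("Bob", 1825), ("Dee", 1850)},
]
forest = merge_trees(uploads)
```

After the third upload, the first and third trees form one tree of three people, while the unrelated tree stays separate.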
-
Now, by repeating this process
with millions of people
-
all over the world,
-
we can crowdsource the construction
of a family tree of all humankind.
-
Using this website,
-
we were able to connect 125 million people
-
into a single family tree.
-
I cannot draw the tree
on the screens over here
-
because they have fewer pixels
-
than the number of people in this tree.
-
But here is an example of a subset
of 6,000 individuals.
-
Each green node is a person.
-
The red nodes represent marriages,
-
and the connections represent parenthood.
-
In the middle of this tree,
you see the ancestors.
-
And as we go to the periphery,
you see the descendants.
-
This tree has seven
generations, approximately.
-
Now, this is what happens
when we increase the number of individuals
-
to 70,000 people --
-
still a tiny subset
of all the data that we have.
-
Despite that, you can already see
the formation of gigantic family trees
-
with many very distant relatives.
-
Thanks to the hard work
of our genealogists,
-
we can go back in time
hundreds of years.
-
For example, here is Alexander Hamilton,
-
who was born in 1755.
-
Alexander was the first
US Secretary of the Treasury,
-
but is mostly known today
due to a popular Broadway musical.
-
We found that Alexander has deeper
connections in the showbiz industry.
-
In fact, he's a blood relative of ...
-
Kevin Bacon!
-
(Laughter)
-
Both of them are descendants
of a lady from Scotland
-
who lived in the 13th century.
-
So you can say that Alexander Hamilton
-
is 35 degrees of Kevin Bacon genealogy.
-
(Laughter)
-
And our tree has millions
of stories like that.
-
We invested significant effort
to validate the quality of our data.
-
Using DNA, we found that 0.3 percent of
the mother-child connections in our data
-
are wrong,
-
which could match the adoption rate
in the US pre-Second World War.
-
For the father's side,
-
the news is not as good:
-
1.9 percent of the father-child
connections in our data are wrong.
-
And I see some people smirk over here.
-
It is what you think --
-
there are many milkmen out there.
-
(Laughter)
-
However, this 1.9 percent error rate
in patrilineal connections
-
is not unique to our data.
-
Previous studies found
a similar error rate
-
using clinical-grade pedigrees.
-
So the quality of our data is good,
-
and that should not be a surprise.
-
Our genealogists have
a profound, vested interest
-
in correctly documenting
their family history.
-
We can leverage this data to learn
quantitative information about humanity,
-
for example, questions about demography.
-
Here is a look at all our profiles
on the map of the world.
-
Each pixel is a person
that lived at some point.
-
And since we have so much data,
-
you can see the contours
of many countries,
-
especially in the Western world.
-
In this clip, we stratified
the map that I've shown you,
-
basically, of births of individuals
from 1400 to 1900,
-
and we compared it
to known migration events.
-
The clip is going to show you
that the deepest lineages in our data
-
go all the way back to the UK,
-
where they had better record keeping,
-
and then they spread along
the routes of Western colonialism.
-
Let's watch this.
-
(Music)
-
[Year of birth: ]
-
[1492 - Columbus sails the ocean blue]
-
[1620 - Mayflower lands in Massachusetts]
-
[1652 - Dutch settle in South Africa]
-
[1788 - Great Britain penal
transportation to Australia starts]
-
[1836 - First migrants use Oregon Trail]
-
[all activity]
-
I love this movie.
-
Now, since these migration events
give us the context of families,
-
we can ask questions such as:
-
What is the typical distance
between the birth locations
-
of husbands and wives?
-
This distance plays
a pivotal role in demography,
-
because the patterns in which
people migrate to form families
-
determine how genes spread
in geographical areas.
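-
As a rough illustration of how such a distance could be measured, here is a minimal Python sketch. It assumes each spouse's birthplace is available as a latitude and longitude in degrees, and it uses the haversine great-circle formula with the median as the "typical" value. The function names and the sample coordinates are hypothetical, and the talk does not say which distance measure was actually used.

```python
import math
from statistics import median

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point rounding.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def median_spouse_distance_km(couples) -> float:
    """Median birthplace distance over couples.

    Each couple is ((lat, lon), (lat, lon)): one birthplace per spouse.
    """
    return median(haversine_km(*a, *b) for a, b in couples)


# Hypothetical couples, placed on the equator for easy checking.
couples = [
    ((0.0, 0.0), (0.0, 0.0)),    # same village
    ((0.0, 0.0), (0.0, 90.0)),   # a quarter of the way around
    ((0.0, 0.0), (0.0, 180.0)),  # opposite side of the globe
]
typical_km = median_spouse_distance_km(couples)
```

The median is used here rather than the mean so that a few very long-distance marriages do not dominate the result.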
-
We analyzed this distance using our data,
-
and we found that in the old days,
-
people had it easy.
-
They just married someone
in the village nearby.
-
But the Industrial Revolution
really complicated our love life.
-
And today, with affordable flights
and online social media,
-
people typically migrate more than
100 kilometers from their place of birth
-
to find their soul mate.
-
So now you might ask:
-
OK, but who does the hard work
of migrating from place to place
-
to form families?
-
Is it the males or the females?
-
We used our data to address this question,
-
and at least in the last 300 years,
-
we found that the ladies do the hard work
-
of migrating from place
to place to form families.
-
Now, these results
are statistically significant,
-
so you can take it as scientific fact
that males are lazy.
-
(Laughter)
-
We can move from questions
about demography
-
and ask questions about human health.
-
For example, we can ask
-
to what extent genetic variations
account for differences in life span
-
between individuals.
-
Previous studies analyzed the correlation
of longevity between twins
-
to address this question.
-
They estimated that the genetic
variations account for
-
about a quarter of the differences
in life span between individuals.
-
But twins can be correlated
for so many reasons,
-
including various environmental effects
-
or a shared household.
-
Large family trees give us the opportunity
to analyze everything from close relatives,
-
such as twins,
-
all the way to distant relatives,
even fourth cousins.
-
This way we can build robust models
-
that can tease apart the contribution
of genetic variations
-
from environmental factors.
-
We conducted this analysis using our data,
-
and we found that genetic variations
explain only 15 percent
-
of the differences in life span
between individuals.
-
That is five years, on average.
-
So genes matter less to life span
than we thought before.
-
And I find that great news,
-
because it means that
our actions can matter more.
-
Smoking, for example, determines
10 years of our life expectancy --
-
twice as much as what genetics determines.
-
We can have even more surprising findings
-
as we move from family trees
-
and we let our genealogists
document and crowdsource DNA information.
-
And the results can be amazing.
-
It might be hard to imagine,
but Uncle Bernie and his friends
-
can create DNA forensic capabilities
-
that even exceed
what the FBI currently has.
-
When you place the DNA
on a large family tree,
-
you effectively create a beacon
-
that illuminates the hundreds
of distant relatives
-
that are all connected to the person
that originated the DNA.
-
By placing multiple beacons
on a large family tree,
-
you can now triangulate the DNA
of an unknown person,
-
the same way that the GPS system
uses multiple satellites
-
to find a location.
-
The prime example
of the power of this technique
-
is capturing the Golden State Killer,
-
one of the most notorious criminals
in the history of the US.
-
The FBI had been searching
for this person for over 40 years.
-
They had his DNA,
-
but he never showed up
in any police database.
-
About a year ago, the FBI
consulted a genetic genealogist,
-
and she suggested that they submit
his DNA to a genealogy service
-
that can locate distant relatives.
-
They did that,
-
and they found a third cousin
of the Golden State Killer.
-
They built a large family tree,
-
scanned the different
branches of that tree,
-
until they found a profile
that exactly matched
-
what they knew about
the Golden State Killer.
-
They obtained DNA from this person
and found a perfect match
-
to the DNA they had in hand.
-
They arrested him
and brought him to justice
-
after all these years.
-
Since then, genetic genealogists
have started working with
-
local US law enforcement agencies
-
to use this technique
in order to capture criminals.
-
And in the past six months alone,
-
they were able to solve
over 20 cold cases with this technique.
-
Luckily, we have people like Uncle
Bernie and his fellow genealogists
-
who love families
and tirelessly study them.
-
These are not amateurs
with a self-serving hobby.
-
These are citizen scientists
with a deep passion to tell us who we are.
-
And they know that the past
can hold a key to the future.
-
Thank you very much.
-
(Applause)