-
People use the internet
for various reasons.
-
It turns out that one of the most
popular categories of website
-
is something that people
typically consume in private.
-
It involves curiosity,
-
non-insignificant levels
of self-indulgence,
-
and is centered around recording
the reproductive activities
-
of other people.
-
Of course I'm talking
about genealogy.
-
(Laughter)
-
The study of family history.
-
When it comes to detailing family history,
-
in every family we have this person
who is obsessed with genealogy.
-
Let's call him Uncle Bernie.
-
Uncle Bernie is exactly the last person
you want to sit next to
-
at Thanksgiving dinner
-
because he will bore you to death
with peculiar details
-
about some ancient relatives.
-
But as you know,
-
there is a scientific side to everything,
-
and we found that Uncle Bernie's stories
-
have immense potential
for biomedical research.
-
We let Uncle Bernie
and his fellow genealogists
-
document their family trees through
a genealogy website called geni.com.
-
When users upload
their trees to the website,
-
it scans their relatives,
-
and if it finds matches to existing trees,
-
it merges the existing
and the new tree together.
-
The result is that large family trees
are created beyond the individual level
-
of each genealogist.
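
[Editor's sketch: the merging step just described can be illustrated with a union-find structure. This is a minimal sketch under stated assumptions, not geni.com's actual algorithm. The matching rule (two profiles match when name and birth year are identical) and the names `find` and `merge_trees` are hypothetical.]

```python
from collections import defaultdict


def find(parent, x):
    # Follow parent pointers to the root, shortening the path on the way.
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def merge_trees(trees):
    """Group uploaded trees that share at least one matching profile.

    trees: dict mapping tree_id -> set of (name, birth_year) profile keys.
    The (name, birth_year) matching rule is an assumption for illustration.
    Returns a sorted list of sorted lists of tree ids.
    """
    parent = {tree_id: tree_id for tree_id in trees}
    first_seen = {}  # profile key -> first tree that contained it
    for tree_id, profiles in trees.items():
        for key in profiles:
            if key in first_seen:
                # A match to an existing tree: join the two trees.
                a = find(parent, tree_id)
                b = find(parent, first_seen[key])
                parent[a] = b
            else:
                first_seen[key] = tree_id
    groups = defaultdict(set)
    for tree_id in trees:
        groups[find(parent, tree_id)].add(tree_id)
    return sorted(sorted(group) for group in groups.values())
```

[With three uploads where the first two share one profile, `merge_trees` returns two groups: the two overlapping trees joined together, and the third on its own.]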
-
Now, by repeating this process with
millions of people all over the world,
-
we can crowdsource the construction
of a family tree of all humankind.
-
Using this website,
-
we were able to connect 125 million people
-
into a single family tree.
-
I cannot draw the tree
on the screens over here
-
because they have fewer pixels
-
than the number of people in this tree,
-
but here is an example of a subset
of 6,000 individuals.
-
Each green node is a person.
-
The red nodes represent marriages,
-
and the connections represent parenthood.
-
In the middle of this tree,
you see the ancestors,
-
and as we go to the periphery,
you see the descendants,
-
and this tree has seven
generations approximately.
-
Now, this is what happens
when we increase the number of individuals
-
to 70,000 people,
-
still a tiny subset
of all the data that we have.
-
Despite that, you can already see
the formation of gigantic family trees
-
with very many distant relatives.
-
Thanks to the hard work
of our genealogists,
-
we can go back in time
hundreds of years.
-
For example, here is Alexander Hamilton
-
who was born in 1755.
-
Alexander was the first
US Secretary of the Treasury,
-
but is mostly known today
due to a popular Broadway musical.
-
We found that Alexander has deeper
connections in the showbiz industry.
-
In fact, he's a blood relative
of Kevin Bacon.
-
(Laughter)
-
Both of them are descendants
of a lady from Scotland
-
who lived in the 13th century.
-
So you can say that Alexander Hamilton
-
is 35 degrees of Kevin Bacon genealogy.
-
(Laughter)
-
And our tree has millions
of stories like that.
-
We invested significant effort
to validate the quality of our data.
-
Using DNA, we found that 0.3 percent of
the mother-child connections in our data
-
are wrong,
-
which could match the adoption rate
in the US pre-Second World War.
-
For the father's side,
-
the news is not as good.
-
1.9 percent of the father-child
connections in our data are wrong.
-
And I see some people smirk over here.
-
It is what you think.
-
There are many milkmen out there.
-
(Laughter)
-
However, this 1.9 percent error rate
in patrilineal connections
-
is not unique to our data.
-
Previous studies found
a similar error rate
-
using clinical-grade pedigrees.
-
So the quality of our data is good,
-
and that should not be a surprise.
-
Our genealogists have a profound,
vested interest in correctly documenting
-
the family history.
-
We can leverage this data to learn
quantitative information about humanity,
-
for example questions about demography.
-
Here is a look at all our profiles
on the map of the world.
-
Each pixel is a person
that lived at some point,
-
and since we have so much data,
-
you can see the contours
of many countries,
-
especially in the Western world.
-
In this clip, we stratified
the map that I've shown you
-
based on the year of birth of individuals
-
from 1400 to 1900
-
and we compared it
to known migration events.
-
The clip is going to show you
that the deepest lineages in our data
-
go all the way back to the UK,
-
where they had better record-keeping,
-
and then they spread along
the routes of Western colonialism.
-
Let's watch this.
-
(Music)
-
I love this movie.
-
Now, since these migration events
are given in the context of families,
-
we can ask questions
-
such as what is the typical distance
between the birth locations
-
of husbands and wives?
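
[Editor's sketch: one way to compute such a distance is the haversine great-circle formula applied to each couple's birth coordinates, summarized with a median. This is a sketch under stated assumptions, not the analysis used in the study. The function names, the input layout, and the choice of the median as the "typical" value are all hypothetical.]

```python
import math
from statistics import median

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to guard against rounding slightly above 1 for antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def typical_spouse_distance_km(couples):
    """Median birthplace distance over couples.

    couples: iterable of ((lat, lon), (lat, lon)) pairs, one per married couple.
    This input layout is an assumption for illustration.
    """
    return median(haversine_km(*husband, *wife) for husband, wife in couples)
```

[The median is used here because a few very long-distance marriages would otherwise dominate an average.]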
-
This distance plays
a pivotal role in demography,
-
because the patterns in which
people migrate to form families
-
determine how genes spread
in geographical areas.
-
We analyzed this distance using our data,
-
and we found that in the old days,
-
people had it easy.
-
They just married someone
in the village nearby.
-
But the Industrial Revolution
really complicated our love life,
-
and today with affordable flights
and online social media,
-
people typically migrate
more than 100 kilometers
-
from their place of birth
to find their soulmate.
-
So now you might ask, OK,
-
but who does the hard work
of migrating from place to place
-
to form families?
-
Are these the males or the females?
-
We used our data to address this question,
-
and at least in the last 300 years,
-
we found that the ladies
-
do the hard work of migrating
from place to place to form families.
-
Now these results
are statistically significant,
-
so you can take it as scientific fact
that males are lazy.
-
(Laughter)
-
We can move from questions
about demography
-
and ask questions about human health.
-
For example, we can ask to what extent
genetic variations account for differences
-
in lifespan between individuals.
-
Previous studies analyzed
the correlation of longevity
-
between twins to address this question.
-
They estimated that the genetic variations
account for about a quarter
-
of the differences in lifespan
between individuals.
-
But twins can be correlated
due to so many reasons,
-
including various environmental effects
-
or a shared household.
-
Large family trees give us the opportunity
to analyze both close relatives,
-
such as twins, all the way
to distant relatives, even fourth cousins.
-
This way we can build robust models
-
that can tease apart the contribution
of genetic variations
-
from environmental factors.
-
We conducted this analysis using our data,
-
and we found that genetic variations
explain only 15 percent
-
of the differences in lifespan
between individuals.
-
That is five years, on average.
-
So genes matter less to lifespan
than we previously thought,
-
and I find this great news,
-
because it means that
our actions can matter more.
-
Smoking, for example, determines
10 years of our life expectancy,
-
twice as much as what genetics determines.
-
We can even have more surprising findings
-
as we move from family trees
-
and we let our genealogists
-
document and crowdsource
DNA information.
-
And the results can be amazing.
-
It might be hard to imagine,
but Uncle Bernie and his friends
-
can create DNA forensic capabilities
-
that even exceed
what the FBI currently has.
-
When you place the DNA
on a large family tree,
-
you effectively create a beacon
-
that illuminates the hundreds
of distant relatives
-
that are connected to the person
that originated the DNA.
-
By placing multiple beacons
on a large family tree,
-
you can now triangulate the DNA
of an unknown person,
-
the same way that the GPS system
-
uses multiple satellites
to find a location.
-
The prime example
of the power of this technique
-
is capturing the Golden State Killer,
-
one of the most notorious criminals
in the history of the US.
-
The FBI had been searching
for this person for over 40 years.
-
They had his DNA,
-
but he never showed up
in any police database.
-
About a year ago, the FBI
consulted a genetic genealogist,
-
and she suggested that they submit
his DNA to a genealogy service
-
that can locate distant relatives.
-
They did that,
-
and they found a third cousin
of the Golden State Killer.
-
They built a large family tree,
-
scanned the different
branches of that tree
-
until they found a profile
that exactly matched
-
what they knew about
the Golden State Killer.
-
They obtained DNA from this person
and found a perfect match
-
to the DNA they had in hand.
-
They arrested him
and brought him to justice
-
after all these years.
-
Since then, genetic genealogists
-
have started working with
local US law enforcement agencies
-
to use this technique
in order to capture criminals,
-
and in the past six months alone,
-
they were able to solve
over 20 cold cases with this technique.
-
The French Nobel Laureate André Gide
once wrote, "Families, I hate you!"
-
(Laughter)
-
And I think most of us
can relate to his words.
-
Why dig around in the past
doing family history
-
when the future is so bright and open?
-
But luckily, we have people
like Uncle Bernie
-
and his fellow genealogists
who love families
-
and tirelessly study them.
-
These are not amateurs
with a self-serving hobby,
-
these are citizen scientists
with a deep passion to tell us who we are,
-
and they know that the past
can hold a key to the future.
-
Thank you very much.
-
(Applause)