How we're building the world's largest family tree
-
0:01 - 0:04People use the internet
for various reasons. -
0:06 - 0:10It turns out that one of the most
popular categories of website -
0:10 - 0:12is something that people
typically consume in private. -
0:14 - 0:16It involves curiosity,
-
0:16 - 0:20non-insignificant levels
of self-indulgence -
0:20 - 0:23and is centered around recording
the reproductive activities -
0:23 - 0:25of other people.
-
0:25 - 0:26(Laughter)
-
0:26 - 0:28Of course, I'm talking about genealogy --
-
0:28 - 0:29(Laughter)
-
0:29 - 0:31the study of family history.
-
0:31 - 0:33When it comes to detailing family history,
-
0:33 - 0:37in every family, we have this person
that is obsessed with genealogy. -
0:37 - 0:39Let's call him Uncle Bernie.
-
0:39 - 0:43Uncle Bernie is exactly the last person
you want to sit next to -
0:43 - 0:45at Thanksgiving dinner,
-
0:45 - 0:47because he will bore you to death
with peculiar details -
0:47 - 0:49about some ancient relatives.
-
0:50 - 0:52But as you know,
-
0:52 - 0:55there is a scientific side to everything,
-
0:55 - 0:58and we found that Uncle Bernie's stories
-
0:58 - 1:01have immense potential
for biomedical research. -
1:01 - 1:04We let Uncle Bernie
and his fellow genealogists -
1:04 - 1:09document their family trees through
a genealogy website called geni.com. -
1:09 - 1:11When users upload
their trees to the website, -
1:11 - 1:13it scans their relatives,
-
1:13 - 1:15and if it finds matches to existing trees,
-
1:15 - 1:19it merges the existing
and the new tree together. -
1:20 - 1:23The result is that large
family trees are created, -
1:23 - 1:26beyond the individual level
of each genealogist. -
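A minimal sketch of the merge step described above, not geni.com's actual matching logic. It assumes, purely for illustration, that two profiles are the same person when their name and birth year are equal; the `Person` and `Forest` names are hypothetical. Each upload is a list of (parent, child) pairs, and a union-find structure joins any uploaded tree that shares a person with an existing one.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Person:
    # Assumption: name + birth year identify a person. Real matching is fuzzier.
    name: str
    birth_year: int


@dataclass
class Forest:
    """Union-find over people; each connected set is one merged family tree."""
    parent: dict = field(default_factory=dict)

    def find(self, p):
        """Return the representative of the tree that contains p."""
        self.parent.setdefault(p, p)
        while self.parent[p] != p:
            self.parent[p] = self.parent[self.parent[p]]  # path halving
            p = self.parent[p]
        return p

    def union(self, a, b):
        """Join the trees that contain a and b."""
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra

    def upload(self, edges):
        """Add one user's tree, given as (parent, child) pairs."""
        for par, child in edges:
            self.union(par, child)

    def tree_count(self):
        """Number of separate family trees currently in the forest."""
        return len({self.find(p) for p in list(self.parent)})
```

Because equal `Person` values are the same dictionary key, an upload that mentions someone already on file attaches to the existing tree without any extra bookkeeping.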
1:27 - 1:31Now, by repeating this process
with millions of people -
1:31 - 1:33all over the world,
-
1:33 - 1:38we can crowdsource the construction
of a family tree of all humankind. -
1:39 - 1:41Using this website,
-
1:41 - 1:46we were able to connect 125 million people
-
1:46 - 1:48into a single family tree.
-
1:49 - 1:52I cannot draw the tree
on the screens over here -
1:52 - 1:54because they have fewer pixels
-
1:54 - 1:56than the number of people in this tree.
-
1:57 - 2:02But here is an example of a subset
of 6,000 individuals. -
2:02 - 2:05Each green node is a person.
-
2:05 - 2:08The red nodes represent marriages,
-
2:08 - 2:10and the connections represent parenthood.
-
2:11 - 2:13In the middle of this tree,
you see the ancestors. -
2:13 - 2:16And as we go to the periphery,
you see the descendants. -
2:16 - 2:19This tree has seven
generations, approximately. -
2:20 - 2:23Now, this is what happens
when we increase the number of individuals -
2:23 - 2:25to 70,000 people --
-
2:25 - 2:29still a tiny subset
of all the data that we have. -
2:30 - 2:34Despite that, you can already see
the formation of gigantic family trees -
2:34 - 2:37with many very distant relatives.
-
2:38 - 2:41Thanks to the hard work
of our genealogists, -
2:41 - 2:44we can go back in time
hundreds of years. -
2:44 - 2:48For example, here is Alexander Hamilton,
-
2:48 - 2:50who was born in 1755.
-
2:51 - 2:55Alexander was the first
US Secretary of the Treasury, -
2:55 - 2:58but is mostly known today
due to a popular Broadway musical. -
2:59 - 3:04We found that Alexander has deeper
connections in the showbiz industry. -
3:04 - 3:06In fact, he's a blood relative of ...
-
3:07 - 3:08Kevin Bacon!
-
3:08 - 3:10(Laughter)
-
3:10 - 3:13Both of them are descendants
of a lady from Scotland -
3:13 - 3:15who lived in the 13th century.
-
3:15 - 3:18So you can say that Alexander Hamilton
-
3:18 - 3:21is 35 degrees of Kevin Bacon genealogy.
-
3:21 - 3:23(Laughter)
-
3:23 - 3:26And our tree has millions
of stories like that. -
3:28 - 3:33We invested significant efforts
to validate the quality of our data. -
3:33 - 3:38Using DNA, we found that 0.3 percent of
the mother-child connections in our data -
3:38 - 3:40are wrong,
-
3:40 - 3:43which could match the adoption rate
in the US pre-Second World War. -
3:45 - 3:47For the father's side,
-
3:47 - 3:49the news is not as good:
-
3:50 - 3:561.9 percent of the father-child
connections in our data are wrong. -
3:56 - 3:58And I see some people smirk over here.
-
3:58 - 4:00It is what you think --
-
4:00 - 4:02there are many milkmen out there.
-
4:02 - 4:03(Laughter)
-
4:03 - 4:07However, this 1.9 percent error rate
in patrilineal connections -
4:07 - 4:09is not unique to our data.
-
4:09 - 4:12Previous studies found
a similar error rate -
4:12 - 4:14using clinical-grade pedigrees.
-
4:14 - 4:17So the quality of our data is good,
-
4:17 - 4:19and that should not be a surprise.
-
4:19 - 4:23Our genealogists have
a profound, vested interest -
4:23 - 4:26in correctly documenting
their family history. -
4:29 - 4:33We can leverage this data to learn
quantitative information about humanity, -
4:33 - 4:36for example, questions about demography.
-
4:36 - 4:40Here is a look at all our profiles
on the map of the world. -
4:40 - 4:45Each pixel is a person
that lived at some point. -
4:45 - 4:46And since we have so much data,
-
4:46 - 4:49you can see the contours
of many countries, -
4:49 - 4:51especially in the Western world.
-
4:51 - 4:55In this clip, we stratified
the map that I've shown you -
4:55 - 5:00based on the year of births of individuals
from 1400 to 1900, -
5:00 - 5:03and we compared it
to known migration events. -
5:03 - 5:07The clip is going to show you
that the deepest lineages in our data -
5:07 - 5:08go all the way back to the UK,
-
5:08 - 5:10where they had better record keeping,
-
5:10 - 5:13and then they spread along
the routes of Western colonialism. -
5:13 - 5:15Let's watch this.
-
5:15 - 5:17(Music)
-
5:17 - 5:19[Year of birth: ]
-
5:20 - 5:22[1492 - Columbus sails the ocean blue]
-
5:24 - 5:26[1620 - Mayflower lands in Massachusetts]
-
5:27 - 5:29[1652 - Dutch settle in South Africa]
-
5:32 - 5:36[1788 - Great Britain penal
transportation to Australia starts] -
5:36 - 5:37[1836 - First migrants use Oregon Trail]
-
5:38 - 5:41[all activity]
-
5:44 - 5:45I love this movie.
-
5:45 - 5:51Now, since these migration events
are given in the context of families, -
5:51 - 5:53we can ask questions such as:
-
5:53 - 5:56What is the typical distance
between the birth locations -
5:56 - 5:59of husbands and wives?
-
5:59 - 6:03This distance plays
a pivotal role in demography, -
6:03 - 6:06because the patterns in which
people migrate to form families -
6:06 - 6:10determine how genes spread
in geographical areas. -
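A minimal sketch of how such a distance could be computed, assuming each birth location is stored as a (latitude, longitude) pair in degrees. The talk does not say how the distance was measured or summarized; the haversine great-circle formula and the median used here are illustrative assumptions, and the function names are hypothetical.

```python
import math
from statistics import median

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    a = min(1.0, a)  # guard against floating-point overshoot before sqrt/asin
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def typical_spouse_distance_km(couples):
    """Median birthplace distance over (husband_latlon, wife_latlon) pairs."""
    return median(haversine_km(*h, *w) for h, w in couples)
```

The median is used rather than the mean so that a few intercontinental marriages do not dominate the "typical" value.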
6:11 - 6:13We analyzed this distance using our data,
-
6:13 - 6:15and we found that in the old days,
-
6:15 - 6:17people had it easy.
-
6:17 - 6:19They just married someone
in the village nearby. -
6:20 - 6:24But the Industrial Revolution
really complicated our love life. -
6:24 - 6:28And today, with affordable flights
and online social media, -
6:28 - 6:33people typically migrate more than
100 kilometers from their place of birth -
6:33 - 6:35to find their soul mate.
-
6:37 - 6:38So now you might ask:
-
6:38 - 6:42OK, but who does the hard work
of migrating from place to place -
6:42 - 6:44to form families?
-
6:44 - 6:47Are these the males or the females?
-
6:48 - 6:50We used our data to address this question,
-
6:50 - 6:53and at least in the last 300 years,
-
6:53 - 6:56we found that the ladies do the hard work
-
6:56 - 6:59of migrating from place
to place to form families. -
6:59 - 7:03Now, these results
are statistically significant, -
7:03 - 7:06so you can take it as scientific fact
that males are lazy. -
7:06 - 7:09(Laughter)
-
7:09 - 7:12We can move from questions
about demography -
7:12 - 7:15and ask questions about human health.
-
7:15 - 7:16For example, we can ask
-
7:16 - 7:21to what extent genetic variations
account for differences in life span -
7:21 - 7:22between individuals.
-
7:23 - 7:28Previous studies analyzed the correlation
of longevity between twins -
7:28 - 7:29to address this question.
-
7:29 - 7:32They estimated that the genetic
variations account for -
7:32 - 7:36about a quarter of the differences
in life span between individuals. -
7:37 - 7:39But twins can be correlated
due to so many reasons, -
7:39 - 7:42including various environmental effects
-
7:42 - 7:43or a shared household.
-
7:44 - 7:48Large family trees give us the opportunity
to analyze both close relatives, -
7:48 - 7:49such as twins,
-
7:49 - 7:52all the way to distant relatives,
even fourth cousins. -
7:53 - 7:55This way we can build robust models
-
7:55 - 7:59that can tease apart the contribution
of genetic variations -
7:59 - 8:01from environmental factors.
-
8:01 - 8:04We conducted this analysis using our data,
-
8:04 - 8:10and we found that genetic variations
explain only 15 percent -
8:10 - 8:13of the differences in life span
between individuals. -
8:15 - 8:18That is five years, on average.
-
8:18 - 8:23So genes matter less to life span
than we previously thought. -
8:24 - 8:26And I find it great news,
-
8:26 - 8:30because it means that
our actions can matter more. -
8:31 - 8:35Smoking, for example, determines
10 years of our life expectancy -- -
8:35 - 8:37twice as much as what genetics determines.
-
8:38 - 8:41We can even have more surprising findings
-
8:41 - 8:42as we move from family trees
-
8:42 - 8:47and we let our genealogists
document and crowdsource DNA information. -
8:47 - 8:49And the results can be amazing.
-
8:49 - 8:53It might be hard to imagine,
but Uncle Bernie and his friends -
8:53 - 8:56can create DNA forensic capabilities
-
8:56 - 8:59that even exceed
what the FBI currently has. -
9:01 - 9:03When you place the DNA
on a large family tree, -
9:03 - 9:05you effectively create a beacon
-
9:05 - 9:08that illuminates the hundreds
of distant relatives -
9:08 - 9:12that are all connected to the person
that originated the DNA. -
9:13 - 9:15By placing multiple beacons
on a large family tree, -
9:15 - 9:19you can now triangulate the DNA
of an unknown person, -
9:19 - 9:23the same way that the GPS system
uses multiple satellites -
9:23 - 9:24to find a location.
-
9:25 - 9:29The prime example
of the power of this technique -
9:29 - 9:32is capturing the Golden State Killer,
-
9:33 - 9:37one of the most notorious criminals
in the history of the US. -
9:37 - 9:43The FBI had been searching
for this person for over 40 years. -
9:44 - 9:45They had his DNA,
-
9:45 - 9:49but he never showed up
in any police database. -
9:49 - 9:54About a year ago, the FBI
consulted a genetic genealogist, -
9:54 - 9:58and she suggested that they submit
his DNA to a genealogy service -
9:58 - 10:01that can locate distant relatives.
-
10:01 - 10:02They did that,
-
10:02 - 10:06and they found a third cousin
of the Golden State Killer. -
10:06 - 10:08They built a large family tree,
-
10:08 - 10:10scanned the different
branches of that tree, -
10:11 - 10:13until they found a profile
that exactly matched -
10:13 - 10:16what they knew about
the Golden State Killer. -
10:16 - 10:19They obtained DNA from this person
and found a perfect match -
10:19 - 10:21to the DNA they had in hand.
-
10:21 - 10:24They arrested him
and brought him to justice -
10:24 - 10:25after all these years.
-
10:26 - 10:29Since then, genetic genealogists
have started working with -
10:29 - 10:32local US law enforcement agencies
-
10:32 - 10:35to use this technique
in order to capture criminals. -
10:36 - 10:38And in the past six months alone,
-
10:38 - 10:43they were able to solve
over 20 cold cases with this technique. -
10:44 - 10:49Luckily, we have people like Uncle
Bernie and his fellow genealogists. -
10:49 - 10:52These are not amateurs
with a self-serving hobby. -
10:53 - 10:59These are citizen scientists
with a deep passion to tell us who we are. -
10:59 - 11:04And they know that the past
can hold a key to the future. -
11:04 - 11:05Thank you very much.
-
11:05 - 11:09(Applause)
- Title:
- How we're building the world's largest family tree
- Speaker:
- Yaniv Erlich
- Description:
-
Computational geneticist Yaniv Erlich helped build the world's largest family tree -- comprising 13 million people and going back more than 500 years. He shares fascinating patterns that emerged from the work -- about our love lives, our health, even decades-old criminal cases -- and shows how crowdsourced genealogy databases can shed light not only on the past but also on the future.
- Video Language:
- English
- Team:
- closed TED
- Project:
- TEDTalks
- Duration:
- 11:45