How I built an information time machine
-
0:00 - 0:03This is an image of the planet Earth.
-
0:03 - 0:06It looks very much like the Apollo pictures
-
0:06 - 0:08that are very well known.
-
0:08 - 0:10There is something different;
-
0:10 - 0:11you can click on it,
-
0:11 - 0:13and if you click on it,
-
0:13 - 0:16you can zoom in on almost any place on the Earth.
-
0:16 - 0:18For instance, this is a bird's-eye view
-
0:18 - 0:20of the EPFL campus.
-
0:20 - 0:22In many cases, you can also see
-
0:22 - 0:26how a building looks from a nearby street.
-
0:26 - 0:28This is pretty amazing.
-
0:28 - 0:31But there's something missing in this wonderful tour:
-
0:31 - 0:33It's time.
-
0:33 - 0:36I'm not really sure when this picture was taken.
-
0:36 - 0:38I'm not even sure it was taken
-
0:38 - 0:44at the same moment as the bird's-eye view.
-
0:44 - 0:46In my lab, we develop tools
-
0:46 - 0:48to travel not only in space
-
0:48 - 0:50but also through time.
-
0:50 - 0:52The kind of question we're asking is:
-
0:52 - 0:54Is it possible to build something
-
0:54 - 0:56like Google Maps of the past?
-
0:56 - 0:59Can I add a slider on top of Google Maps
-
0:59 - 1:01and just change the year,
-
1:01 - 1:03seeing how it was 100 years before,
-
1:03 - 1:04 1,000 years before?
-
1:04 - 1:06Is that possible?
-
1:06 - 1:09Can I reconstruct social networks of the past?
-
1:09 - 1:12Can I make a Facebook of the Middle Ages?
-
1:12 - 1:16So, can I build time machines?
-
1:16 - 1:18Maybe we can just say, "No, it's not possible."
-
1:18 - 1:22Or, maybe, we can think of it from an information point of view.
-
1:22 - 1:25This is what I call the information mushroom.
-
1:25 - 1:27Vertically, you have the time,
-
1:27 - 1:29and horizontally, the amount of digital information available.
-
1:29 - 1:33Obviously, in the last 10 years, we have much information.
-
1:33 - 1:36And obviously the more we go in the past, the less information we have.
-
1:36 - 1:39If we want to build something like Google Maps of the past,
-
1:39 - 1:40or Facebook of the past,
-
1:40 - 1:42we need to enlarge this space,
-
1:42 - 1:44we need to make that like a rectangle.
-
1:44 - 1:45How do we do that?
-
1:45 - 1:47One way is digitization.
-
1:47 - 1:49There's a lot of material available --
-
1:49 - 1:55newspaper, printed books, thousands of printed books.
-
1:55 - 1:57I can digitize all these.
-
1:57 - 2:00I can extract information from these.
-
2:00 - 2:04Of course, the more you go in the past, the less information you will have.
-
2:04 - 2:06So, it might not be enough.
-
2:06 - 2:09So, I can do what historians do.
-
2:09 - 2:10I can extrapolate.
-
2:10 - 2:15This is what we call, in computer science, simulation.
-
2:15 - 2:16If I take a log book,
-
2:16 - 2:19I can consider, it's not just a log book
-
2:19 - 2:22of a Venetian captain going to a particular journey.
-
2:22 - 2:23I can consider it is actually a log book
-
2:23 - 2:26which is representative of many journeys of that period.
-
2:26 - 2:28I'm extrapolating.
-
2:28 - 2:30If I have a painting of a facade,
-
2:30 - 2:33I can consider it's not just that particular building,
-
2:33 - 2:37but probably it also shares the same grammar
-
2:37 - 2:41of buildings where we lost any information.
-
2:41 - 2:44So if we want to construct a time machine,
-
2:44 - 2:45we need two things.
-
2:45 - 2:47We need very large archives,
-
2:47 - 2:50and we need excellent specialists.
-
2:50 - 2:52The Venice Time Machine,
-
2:52 - 2:54the project I'm going to talk to you about,
-
2:54 - 2:57is a joint project between the EPFL
-
2:57 - 3:00and the Ca' Foscari University of Venice.
-
3:00 - 3:02There's something very peculiar about Venice,
-
3:02 - 3:05that its administration has been
-
3:05 - 3:07very, very bureaucratic.
-
3:07 - 3:09They've been keeping track of everything,
-
3:09 - 3:12almost like Google today.
-
3:12 - 3:13At the Archivio di Stato,
-
3:13 - 3:15you have 80 kilometers of archives
-
3:15 - 3:17documenting every aspect
-
3:17 - 3:19of the life of Venice over more than 1,000 years.
-
3:19 - 3:21You have every boat that goes out,
-
3:21 - 3:22every boat that comes in.
-
3:22 - 3:25You have every change that was made in the city.
-
3:25 - 3:29This is all there.
-
3:29 - 3:32We are setting up a 10-year digitization program
-
3:32 - 3:34which has the objective of transforming
-
3:34 - 3:35this immense archive
-
3:35 - 3:38into a giant information system.
-
3:38 - 3:40The type of objective we want to reach
-
3:40 - 3:45is 450 books a day that can be digitized.
-
3:45 - 3:47Of course, when you digitize, that's not enough,
-
3:47 - 3:48because these documents,
-
3:48 - 3:51most of them are in Latin, in Tuscan,
-
3:51 - 3:52in Venetian dialect,
-
3:52 - 3:54so you need to transcribe them,
-
3:54 - 3:56to translate them in some cases,
-
3:56 - 3:57to index them,
-
3:57 - 3:59and this is obviously not easy.
-
3:59 - 4:03In particular, traditional optical character recognition methods
-
4:03 - 4:04that can be used for printed manuscripts,
-
4:04 - 4:08they do not work well on the handwritten document.
-
4:08 - 4:10So the solution is actually to take inspiration
-
4:10 - 4:13from another domain: speech recognition.
-
4:13 - 4:15This is a domain of something that seems impossible,
-
4:15 - 4:18which can actually be done,
-
4:18 - 4:20simply by putting additional constraints.
-
4:20 - 4:22If you have a very good model
-
4:22 - 4:23of a language which is used,
-
4:23 - 4:25if you have a very good model of a document,
-
4:25 - 4:27how well they are structured.
-
4:27 - 4:28And these are administrative documents.
-
4:28 - 4:30They are well structured in many cases.
-
4:30 - 4:33If you divide this huge archive into smaller subsets
-
4:33 - 4:36where a smaller subset actually shares similar features,
-
4:36 - 4:40then there's a chance of success.
-
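The idea of constraining recognition with a language model can be illustrated with a minimal sketch. It is not the project's actual pipeline: the candidate readings, their scores, the bigram table, and the function name `best_transcription` are all made up for illustration. A hypothetical recognizer proposes several readings per word, and a small language model picks the most plausible sequence.

```python
import math
from itertools import product

# Hypothetical recognizer output: for each word position, candidate readings
# with a visual-match probability. All numbers are invented for illustration.
candidates = [
    [("nave", 0.45), ("neve", 0.55)],
    [("di", 0.9), ("da", 0.1)],
    [("Venezia", 0.6), ("Venosa", 0.4)],
]

# Hypothetical bigram language model: P(word | previous word).
bigram = {
    ("<s>", "nave"): 0.2,
    ("<s>", "neve"): 0.02,
    ("nave", "di"): 0.5,
    ("neve", "di"): 0.1,
    ("di", "Venezia"): 0.3,
    ("di", "Venosa"): 0.01,
}


def best_transcription(candidates, bigram, floor=1e-3):
    """Return the word sequence maximizing visual score times language score.

    Unseen word pairs get the small probability `floor`. The search is
    exhaustive, which is only practical for short lines.
    """
    best_words, best_score = None, -math.inf
    for path in product(*candidates):
        words = [word for word, _ in path]
        log_p = sum(math.log(p) for _, p in path)  # visual evidence
        previous = "<s>"
        for word in words:  # language-model constraint
            log_p += math.log(bigram.get((previous, word), floor))
            previous = word
        if log_p > best_score:
            best_words, best_score = words, log_p
    return best_words


print(best_transcription(candidates, bigram))
```

In this toy data the visual score alone slightly prefers "neve" for the first word, but the language model tips the decision to "nave". A real system would replace the exhaustive search with dynamic programming.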
4:43 - 4:45If we reach that stage, then there's something else:
-
4:45 - 4:49we can extract from this document events.
-
4:49 - 4:51Actually probably 10 billion events
-
4:51 - 4:53can be extracted from this archive.
-
4:53 - 4:55And this giant information system
-
4:55 - 4:56can be searched in many ways.
-
4:56 - 4:58You can ask questions like,
-
4:58 - 5:01"Who lived in this palazzo in 1323?"
-
5:01 - 5:03"How much did a sea bream cost at the Rialto market
-
5:03 - 5:05in 1434?"
-
5:05 - 5:06"What was the salary
-
5:06 - 5:08of a glass maker in Murano
-
5:08 - 5:09maybe over a decade?"
-
5:09 - 5:11You can ask even bigger questions
-
5:11 - 5:14because it will be semantically coded.
-
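Questions like these can be sketched as queries over structured event records. The following is only an illustration under stated assumptions: the `Event` fields, the `find` helper, and every record in `EVENTS` are hypothetical placeholders, not data or schema from the archive.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    """One extracted event; the field names are assumptions for this sketch."""
    kind: str          # e.g. "residence", "price", "salary"
    subject: str       # who or what the event is about
    place: str
    start_year: int
    end_year: int
    value: Optional[float] = None  # amount, in an unspecified unit


# Placeholder records, invented for illustration only.
EVENTS = [
    Event("residence", "Family A", "Palazzo X", 1310, 1330),
    Event("residence", "Family B", "Palazzo X", 1331, 1360),
    Event("price", "sea bream", "Rialto market", 1434, 1434, value=2.0),
    Event("salary", "glass maker", "Murano", 1440, 1440, value=30.0),
    Event("salary", "glass maker", "Murano", 1445, 1445, value=34.0),
]


def find(events, kind, place, year_from, year_to, subject=None):
    """Return events of one kind at one place overlapping the given years."""
    return [
        e for e in events
        if e.kind == kind
        and e.place == place
        and e.start_year <= year_to
        and e.end_year >= year_from
        and (subject is None or e.subject == subject)
    ]


# "Who lived in this palazzo in 1323?"
residents = [e.subject for e in find(EVENTS, "residence", "Palazzo X", 1323, 1323)]

# "What was the salary of a glass maker in Murano over a decade?"
salaries = [
    e.value
    for e in find(EVENTS, "salary", "Murano", 1440, 1449, subject="glass maker")
]
print(residents, salaries)
```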
5:14 - 5:16And then what you can do is put that in space,
-
5:16 - 5:18because much of this information is spatial.
-
5:18 - 5:20And from that, you can do things like
-
5:20 - 5:22reconstructing this extraordinary journey
-
5:22 - 5:25of that city that managed to have a sustainable development
-
5:25 - 5:27over a thousand years,
-
5:27 - 5:29managing to have all the time
-
5:29 - 5:32a form of equilibrium with its environment.
-
5:32 - 5:33You can reconstruct that journey,
-
5:33 - 5:36visualize it in many different ways.
-
5:36 - 5:39But of course, you cannot understand Venice if you just look at the city.
-
5:39 - 5:41You have to put it in a larger European context.
-
5:41 - 5:44So the idea is also to document all the things
-
5:44 - 5:46that worked at the European level.
-
5:46 - 5:48We can reconstruct also the journey
-
5:48 - 5:50of the Venetian maritime empire,
-
5:50 - 5:54how it progressively controlled the Adriatic Sea,
-
5:54 - 5:57how it became the most powerful medieval empire
-
5:57 - 5:59of its time,
-
5:59 - 6:01controlling most of the sea routes
-
6:01 - 6:04from the east to the south.
-
6:05 - 6:08But you can even do other things,
-
6:08 - 6:10because in these maritime routes,
-
6:10 - 6:12there are regular patterns.
-
6:12 - 6:14You can go one step beyond
-
6:14 - 6:17and actually create a simulation system,
-
6:17 - 6:19create a Mediterranean simulator
-
6:19 - 6:22which is capable actually of reconstructing
-
6:22 - 6:24even the information we are missing,
-
6:24 - 6:27which would enable us to have questions you could ask
-
6:27 - 6:30like if you were using a route planner.
-
6:30 - 6:33"If I am in Corfu in June 1323
-
6:33 - 6:36and want to go to Constantinople,
-
6:36 - 6:38where can I take a boat?"
-
6:38 - 6:39Probably we can answer this question
-
6:39 - 6:44with one or two or three days' precision.
-
6:44 - 6:45"How much will it cost?"
-
6:45 - 6:49"What are the chances of encountering pirates?"
-
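A route-planner query of this kind can be sketched as an earliest-arrival search over a timetable of sailings. This is a simplified illustration, not the simulator described in the talk: the `Sailing` type, the function `earliest_arrival`, the ports of call, and all days are invented, and the sketch ignores the uncertainty that a real reconstruction would have to carry.

```python
from typing import NamedTuple


class Sailing(NamedTuple):
    """One scheduled sailing; days are counted from an arbitrary day 0."""
    origin: str
    destination: str
    depart_day: int
    arrive_day: int  # assumed to be >= depart_day


# Invented timetable for illustration only.
SAILINGS = [
    Sailing("Corfu", "Constantinople", 12, 30),
    Sailing("Corfu", "Modon", 3, 8),
    Sailing("Modon", "Constantinople", 10, 24),
]


def earliest_arrival(sailings, start, goal, start_day):
    """Return the list of sailings reaching `goal` earliest, or None.

    Sailings are scanned in order of departure; a sailing is usable if its
    origin can be reached on or before its departure day.
    """
    earliest = {start: start_day}  # best known arrival day per port
    taken = {}                     # sailing used to reach each port
    for s in sorted(sailings, key=lambda s: s.depart_day):
        reachable = earliest.get(s.origin)
        if reachable is not None and reachable <= s.depart_day:
            if s.arrive_day < earliest.get(s.destination, float("inf")):
                earliest[s.destination] = s.arrive_day
                taken[s.destination] = s
    if goal not in taken:
        return None
    route, port = [], goal
    while port != start:  # walk back from the goal to the start
        leg = taken[port]
        route.append(leg)
        port = leg.origin
    return list(reversed(route))


route = earliest_arrival(SAILINGS, "Corfu", "Constantinople", start_day=1)
for leg in route:
    print(leg.origin, "->", leg.destination, "arriving day", leg.arrive_day)
```

In this made-up timetable, changing boats at Modon arrives earlier than waiting for the direct sailing.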
6:49 - 6:51Of course, you understand,
-
6:51 - 6:53the central scientific challenge of a project like this one
-
6:53 - 6:57is qualifying, quantifying and representing
-
6:57 - 7:00uncertainty and inconsistency at each step of this process.
-
7:00 - 7:03There are errors everywhere,
-
7:03 - 7:06errors in the document, it's the wrong name of the captain,
-
7:06 - 7:09some of the boats never actually took to sea.
-
7:09 - 7:14There are errors in translation, interpretative biases,
-
7:14 - 7:17and on top of that, if you add algorithmic processes,
-
7:17 - 7:20you're going to have errors in recognition,
-
7:20 - 7:22errors in extraction,
-
7:22 - 7:26so you have very, very uncertain data.
-
7:26 - 7:30So how can we detect and correct these inconsistencies?
-
7:30 - 7:34How can we represent that form of uncertainty?
-
7:34 - 7:36It's difficult. One thing you can do
-
7:36 - 7:38is document each step of the process,
-
7:38 - 7:41not only coding the historical information
-
7:41 - 7:43but what we call the meta-historical information,
-
7:43 - 7:46how is historical knowledge constructed,
-
7:46 - 7:48documenting each step.
-
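Documenting each step can be sketched as a claim that carries its own processing history. The structure below is an assumption made for illustration: the class names `Step` and `Claim`, the step names, and the confidence values are hypothetical, and multiplying confidences assumes the steps fail independently, which is a strong simplification.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Step:
    """One processing step applied on the way from document to claim."""
    name: str          # e.g. "transcription", "translation", "extraction"
    agent: str         # a person or an algorithm
    confidence: float  # estimated probability the step introduced no error


@dataclass
class Claim:
    """A historical statement together with how it was constructed."""
    statement: str
    source: str
    steps: List[Step] = field(default_factory=list)

    def record(self, name, agent, confidence):
        if not 0.0 <= confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")
        self.steps.append(Step(name, agent, confidence))
        return self

    def confidence(self):
        # Assumes independent steps; real errors are usually correlated.
        result = 1.0
        for step in self.steps:
            result *= step.confidence
        return result


# Two competing readings of the same invented log-book entry are both kept.
reading_a = (
    Claim("Captain A commanded the ship", "log book (hypothetical)")
    .record("transcription", "handwriting recognizer", 0.9)
    .record("extraction", "event extractor", 0.8)
)
reading_b = (
    Claim("Captain B commanded the ship", "log book (hypothetical)")
    .record("transcription", "human transcriber", 0.5)
    .record("extraction", "event extractor", 0.8)
)
ranked = sorted([reading_a, reading_b], key=Claim.confidence, reverse=True)
print([(c.statement, round(c.confidence(), 2)) for c in ranked])
```

Keeping both readings, rather than discarding the less likely one, is what lets several possible reconstructions coexist.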
7:48 - 7:50That will not guarantee that we actually converge
-
7:50 - 7:52toward a single story of Venice,
-
7:52 - 7:54but probably we can actually reconstruct
-
7:54 - 7:57a fully documented potential story of Venice.
-
7:57 - 7:59Maybe there's not a single map.
-
7:59 - 8:01Maybe there are several maps.
-
8:01 - 8:03The system should allow for that,
-
8:03 - 8:06because we have to deal with a new form of uncertainty,
-
8:06 - 8:11which is really new for this type of giant database.
-
8:11 - 8:13And how should we communicate
-
8:13 - 8:17this new research to a large audience?
-
8:17 - 8:19Again, Venice is extraordinary for that.
-
8:19 - 8:22With the millions of visitors that come every year,
-
8:22 - 8:23it's actually one of the best places
-
8:23 - 8:26to try to invent the museum of the future.
-
8:26 - 8:30Imagine, horizontally you see the reconstructed map
-
8:30 - 8:31of a given year,
-
8:31 - 8:34and vertically, you see the document
-
8:34 - 8:35that served the reconstruction,
-
8:35 - 8:39paintings, for instance.
-
8:39 - 8:41Imagine an immersive system that permits
-
8:41 - 8:45to go and dive and reconstruct the Venice of a given year,
-
8:45 - 8:48some experience you could share within a group.
-
8:48 - 8:50On the contrary, imagine actually that you start
-
8:50 - 8:52from a document, a Venetian manuscript,
-
8:52 - 8:55and you show, actually, what you can construct out of it,
-
8:55 - 8:57how it is decoded,
-
8:57 - 8:59how the context of that document can be recreated.
-
8:59 - 9:01This is an image from an exhibit
-
9:01 - 9:03which is currently conducted in Geneva
-
9:03 - 9:06with that type of system.
-
9:06 - 9:08So to conclude, we can say that
-
9:08 - 9:11research in the humanities is about to undergo
-
9:11 - 9:13an evolution which is maybe similar
-
9:13 - 9:17to what happened to life sciences 30 years ago.
-
9:17 - 9:22It's really a question of scale.
-
9:22 - 9:25We see projects which are
-
9:25 - 9:29much beyond what any single research team can do,
-
9:29 - 9:32and this is really new for the humanities,
-
9:32 - 9:35which very often take the habit of working
-
9:35 - 9:39in small groups or only with a couple of researchers.
-
9:39 - 9:42When you visit the Archivio di Stato,
-
9:42 - 9:44you feel this is beyond what any single team can do,
-
9:44 - 9:48and that should be a joint and common effort.
-
9:48 - 9:51So what we must do for this paradigm shift
-
9:51 - 9:53is actually foster a new generation
-
9:53 - 9:55of "digital humanists"
-
9:55 - 9:57that are going to be ready for this shift.
-
9:57 - 9:59I thank you very much.
-
9:59 - 10:03(Applause)
- Title:
- How I built an information time machine
- Speaker:
- Frederic Kaplan
- Description:
-
Imagine if you could surf Facebook ... from the Middle Ages. Well, it may not be as far off as it sounds. In a fun and interesting talk, researcher and engineer Frederic Kaplan shows off the Venice Time Machine, a project to digitize 80 kilometers of books to create a historical and geographical simulation of Venice across 1000 years. (Filmed at TEDxCaFoscariU.)
- Video Language:
- English
- Team:
- closed TED
- Project:
- TEDTalks
- Duration:
- 10:20