WEBVTT 00:00:00.000 --> 00:00:03.000 I'd like to begin with a thought experiment. 00:00:04.000 --> 00:00:07.000 Imagine that it's 4,000 years into the future. 00:00:07.000 --> 00:00:09.000 Civilization as we know it 00:00:09.000 --> 00:00:11.000 has ceased to exist -- 00:00:11.000 --> 00:00:13.000 no books, 00:00:13.000 --> 00:00:16.000 no electronic devices, 00:00:16.000 --> 00:00:19.000 no Facebook or Twitter. 00:00:19.000 --> 00:00:22.000 All knowledge of the English language and the English alphabet 00:00:22.000 --> 00:00:24.000 has been lost. 00:00:24.000 --> 00:00:26.000 Now imagine archeologists 00:00:26.000 --> 00:00:28.000 digging through the rubble of one of our cities. 00:00:28.000 --> 00:00:30.000 What might they find? 00:00:30.000 --> 00:00:33.000 Well perhaps some rectangular pieces of plastic 00:00:33.000 --> 00:00:36.000 with strange symbols on them. 00:00:36.000 --> 00:00:39.000 Perhaps some circular pieces of metal. 00:00:39.000 --> 00:00:41.000 Maybe some cylindrical containers 00:00:41.000 --> 00:00:43.000 with some symbols on them. 00:00:43.000 --> 00:00:46.000 And perhaps one archeologist becomes an instant celebrity 00:00:46.000 --> 00:00:48.000 when she discovers -- 00:00:48.000 --> 00:00:50.000 buried in the hills somewhere in North America -- 00:00:50.000 --> 00:00:53.000 massive versions of these same symbols. 00:00:55.000 --> 00:00:57.000 Now let's ask ourselves, 00:00:57.000 --> 00:01:00.000 what could such artifacts say about us 00:01:00.000 --> 00:01:03.000 to people 4,000 years into the future? NOTE Paragraph 00:01:03.000 --> 00:01:05.000 This is no hypothetical question. 00:01:05.000 --> 00:01:08.000 In fact, this is exactly the kind of question we're faced with 00:01:08.000 --> 00:01:11.000 when we try to understand the Indus Valley civilization, 00:01:11.000 --> 00:01:13.000 which existed 4,000 years ago. 
00:01:13.000 --> 00:01:16.000 The Indus civilization was roughly contemporaneous 00:01:16.000 --> 00:01:19.000 with the much better known Egyptian and the Mesopotamian civilizations, 00:01:19.000 --> 00:01:22.000 but it was actually much larger than either of these two civilizations. 00:01:22.000 --> 00:01:24.000 It occupied the area 00:01:24.000 --> 00:01:26.000 of approximately one million square kilometers, 00:01:26.000 --> 00:01:28.000 covering what is now Pakistan, 00:01:28.000 --> 00:01:30.000 Northwestern India 00:01:30.000 --> 00:01:32.000 and parts of Afghanistan and Iran. 00:01:32.000 --> 00:01:34.000 Given that it was such a vast civilization, 00:01:34.000 --> 00:01:38.000 you might expect to find really powerful rulers, kings, 00:01:38.000 --> 00:01:41.000 and huge monuments glorifying these powerful kings. 00:01:41.000 --> 00:01:43.000 In fact, 00:01:43.000 --> 00:01:45.000 what archeologists have found is none of that. 00:01:45.000 --> 00:01:48.000 They've found small objects such as these. NOTE Paragraph 00:01:48.000 --> 00:01:51.000 Here's an example of one of these objects. 00:01:51.000 --> 00:01:53.000 Well obviously this is a replica. 00:01:53.000 --> 00:01:56.000 But who is this person? 00:01:56.000 --> 00:01:58.000 A king? A god? 00:01:58.000 --> 00:02:00.000 A priest? 00:02:00.000 --> 00:02:02.000 Or perhaps an ordinary person 00:02:02.000 --> 00:02:04.000 like you or me? 00:02:04.000 --> 00:02:06.000 We don't know. 00:02:06.000 --> 00:02:09.000 But the Indus people also left behind artifacts with writing on them. 00:02:09.000 --> 00:02:11.000 Well no, not pieces of plastic, 00:02:11.000 --> 00:02:14.000 but stone seals, copper tablets, 00:02:14.000 --> 00:02:16.000 pottery and, surprisingly, 00:02:16.000 --> 00:02:18.000 one large sign board, 00:02:18.000 --> 00:02:20.000 which was found buried near the gate of a city. 
00:02:20.000 --> 00:02:22.000 Now we don't know if it says Hollywood, 00:02:22.000 --> 00:02:24.000 or even Bollywood for that matter. 00:02:24.000 --> 00:02:26.000 In fact, we don't even know 00:02:26.000 --> 00:02:28.000 what any of these objects say, 00:02:28.000 --> 00:02:31.000 and that's because the Indus script is undeciphered. 00:02:31.000 --> 00:02:33.000 We don't know what any of these symbols mean. NOTE Paragraph 00:02:33.000 --> 00:02:36.000 The symbols are most commonly found on seals. 00:02:36.000 --> 00:02:38.000 So you see up there one such object. 00:02:38.000 --> 00:02:41.000 It's the square object with the unicorn-like animal on it. 00:02:41.000 --> 00:02:43.000 Now that's a magnificent piece of art. 00:02:43.000 --> 00:02:45.000 So how big do you think that is? 00:02:45.000 --> 00:02:47.000 Perhaps that big? 00:02:47.000 --> 00:02:49.000 Or maybe that big? 00:02:49.000 --> 00:02:51.000 Well let me show you. 00:02:52.000 --> 00:02:55.000 Here's a replica of one such seal. 00:02:55.000 --> 00:02:57.000 It's only about one inch by one inch in size -- 00:02:57.000 --> 00:02:59.000 pretty tiny. 00:02:59.000 --> 00:03:01.000 So what were these used for? 00:03:01.000 --> 00:03:04.000 We know that these were used for stamping clay tags 00:03:04.000 --> 00:03:07.000 that were attached to bundles of goods that were sent from one place to the other. 00:03:07.000 --> 00:03:10.000 So you know those packing slips you get on your FedEx boxes? 00:03:10.000 --> 00:03:13.000 These were used to make those kinds of packing slips. 00:03:13.000 --> 00:03:16.000 You might wonder what these objects contain 00:03:16.000 --> 00:03:18.000 in terms of their text. 00:03:18.000 --> 00:03:20.000 Perhaps they're the name of the sender 00:03:20.000 --> 00:03:22.000 or some information about the goods 00:03:22.000 --> 00:03:25.000 that are being sent from one place to the other -- we don't know. 
00:03:25.000 --> 00:03:27.000 We need to decipher the script to answer that question. NOTE Paragraph 00:03:27.000 --> 00:03:29.000 Deciphering the script 00:03:29.000 --> 00:03:31.000 is not just an intellectual puzzle; 00:03:31.000 --> 00:03:33.000 it's actually become a question 00:03:33.000 --> 00:03:35.000 that's become deeply intertwined 00:03:35.000 --> 00:03:38.000 with the politics and the cultural history of South Asia. 00:03:38.000 --> 00:03:41.000 In fact, the script has become a battleground of sorts 00:03:41.000 --> 00:03:43.000 between three different groups of people. 00:03:43.000 --> 00:03:45.000 First, there's a group of people 00:03:45.000 --> 00:03:47.000 who are very passionate in their belief 00:03:47.000 --> 00:03:49.000 that the Indus script 00:03:49.000 --> 00:03:51.000 does not represent a language at all. 00:03:51.000 --> 00:03:53.000 These people believe that the symbols 00:03:53.000 --> 00:03:56.000 are very similar to the kind of symbols you find on traffic signs 00:03:56.000 --> 00:03:59.000 or the emblems you find on shields. 00:03:59.000 --> 00:04:01.000 There's a second group of people 00:04:01.000 --> 00:04:04.000 who believe that the Indus script represents an Indo-European language. 00:04:04.000 --> 00:04:06.000 If you look at a map of India today, 00:04:06.000 --> 00:04:09.000 you'll see that most of the languages spoken in North India 00:04:09.000 --> 00:04:12.000 belong to the Indo-European language family. 00:04:12.000 --> 00:04:14.000 So some people believe that the Indus script 00:04:14.000 --> 00:04:17.000 represents an ancient Indo-European language such as Sanskrit. NOTE Paragraph 00:04:17.000 --> 00:04:19.000 There's a last group of people 00:04:19.000 --> 00:04:22.000 who believe that the Indus people 00:04:22.000 --> 00:04:25.000 were the ancestors of people living in South India today. 
00:04:25.000 --> 00:04:27.000 These people believe that the Indus script 00:04:27.000 --> 00:04:29.000 represents an ancient form 00:04:29.000 --> 00:04:31.000 of the Dravidian language family, 00:04:31.000 --> 00:04:34.000 which is the language family spoken in much of South India today. 00:04:34.000 --> 00:04:36.000 And the proponents of this theory 00:04:36.000 --> 00:04:39.000 point to that small pocket of Dravidian-speaking people in the North, 00:04:39.000 --> 00:04:41.000 actually near Afghanistan, 00:04:41.000 --> 00:04:44.000 and they say that perhaps, sometime in the past, 00:04:44.000 --> 00:04:47.000 Dravidian languages were spoken all over India 00:04:47.000 --> 00:04:49.000 and that this suggests 00:04:49.000 --> 00:04:52.000 that the Indus civilization is perhaps also Dravidian. NOTE Paragraph 00:04:52.000 --> 00:04:55.000 Which of these hypotheses can be true? 00:04:55.000 --> 00:04:57.000 We don't know, but perhaps if you deciphered the script, 00:04:57.000 --> 00:04:59.000 you would be able to answer this question. 00:04:59.000 --> 00:05:01.000 But deciphering the script is a very challenging task. 00:05:01.000 --> 00:05:03.000 First, there's no Rosetta Stone. 00:05:03.000 --> 00:05:05.000 I don't mean the software; 00:05:05.000 --> 00:05:07.000 I mean an ancient artifact 00:05:07.000 --> 00:05:09.000 that contains in the same text 00:05:09.000 --> 00:05:12.000 both a known text and an unknown text. 00:05:12.000 --> 00:05:15.000 We don't have such an artifact for the Indus script. 00:05:15.000 --> 00:05:18.000 And furthermore, we don't even know what language they spoke. 00:05:18.000 --> 00:05:20.000 And to make matters even worse, 00:05:20.000 --> 00:05:22.000 most of the texts that we have are extremely short. 00:05:22.000 --> 00:05:24.000 So as I showed you, they're usually found on these seals 00:05:24.000 --> 00:05:26.000 that are very, very tiny. 
NOTE Paragraph 00:05:26.000 --> 00:05:28.000 And so given these formidable obstacles, 00:05:28.000 --> 00:05:30.000 one might wonder and worry 00:05:30.000 --> 00:05:33.000 whether one will ever be able to decipher the Indus script. 00:05:33.000 --> 00:05:35.000 In the rest of my talk, 00:05:35.000 --> 00:05:37.000 I'd like to tell you about how I learned to stop worrying 00:05:37.000 --> 00:05:39.000 and love the challenge posed by the Indus script. 00:05:39.000 --> 00:05:42.000 I've always been fascinated by the Indus script 00:05:42.000 --> 00:05:44.000 ever since I read about it in a middle school textbook. 00:05:44.000 --> 00:05:46.000 And why was I fascinated? 00:05:46.000 --> 00:05:50.000 Well it's the last major undeciphered script in the ancient world. 00:05:50.000 --> 00:05:53.000 My career path led me to become a computational neuroscientist, 00:05:53.000 --> 00:05:55.000 so in my day job, 00:05:55.000 --> 00:05:57.000 I create computer models of the brain 00:05:57.000 --> 00:06:00.000 to try to understand how the brain makes predictions, 00:06:00.000 --> 00:06:02.000 how the brain makes decisions, 00:06:02.000 --> 00:06:04.000 how the brain learns and so on. NOTE Paragraph 00:06:04.000 --> 00:06:07.000 But in 2007, my path crossed again with the Indus script. 00:06:07.000 --> 00:06:09.000 That's when I was in India, 00:06:09.000 --> 00:06:11.000 and I had the wonderful opportunity 00:06:11.000 --> 00:06:13.000 to meet with some Indian scientists 00:06:13.000 --> 00:06:16.000 who were using computer models to try to analyze the script. 00:06:16.000 --> 00:06:18.000 And so it was then that I realized 00:06:18.000 --> 00:06:21.000 there was an opportunity for me to collaborate with these scientists, 00:06:21.000 --> 00:06:23.000 and so I jumped at that opportunity. 00:06:23.000 --> 00:06:25.000 And I'd like to describe some of the results that we have found. 00:06:25.000 --> 00:06:28.000 Or better yet, let's all collectively decipher. 
00:06:28.000 --> 00:06:30.000 Are you ready? NOTE Paragraph 00:06:30.000 --> 00:06:33.000 The first thing that you need to do when you have an undeciphered script 00:06:33.000 --> 00:06:35.000 is try to figure out the direction of writing. 00:06:35.000 --> 00:06:38.000 Here are two texts that contain some symbols on them. 00:06:38.000 --> 00:06:40.000 Can you tell me 00:06:40.000 --> 00:06:43.000 if the direction of writing is right to left or left to right? 00:06:43.000 --> 00:06:46.000 I'll give you a couple of seconds. 00:06:46.000 --> 00:06:49.000 Okay. Right to left, how many? Okay. 00:06:49.000 --> 00:06:51.000 Okay. Left to right? 00:06:51.000 --> 00:06:53.000 Oh, it's almost 50/50. Okay. 00:06:53.000 --> 00:06:55.000 The answer is: 00:06:55.000 --> 00:06:57.000 if you look at the left-hand side of the two texts, 00:06:57.000 --> 00:07:00.000 you'll notice that there's a cramping of signs, 00:07:00.000 --> 00:07:02.000 and it seems like 4,000 years ago, 00:07:02.000 --> 00:07:04.000 when the scribe was writing from right to left, 00:07:04.000 --> 00:07:06.000 they ran out of space. 00:07:06.000 --> 00:07:08.000 And so they had to cram the sign. 00:07:08.000 --> 00:07:10.000 One of the signs is also below the text on the top. 00:07:10.000 --> 00:07:12.000 This suggests the direction of writing 00:07:12.000 --> 00:07:14.000 was probably from right to left, 00:07:14.000 --> 00:07:16.000 and so that's one of the first things we know, 00:07:16.000 --> 00:07:19.000 that directionality is a very key aspect of linguistic scripts. 00:07:19.000 --> 00:07:21.000 And the Indus script now has 00:07:21.000 --> 00:07:23.000 this particular property. NOTE Paragraph 00:07:23.000 --> 00:07:25.000 What other properties of language does the script show? 00:07:25.000 --> 00:07:27.000 Languages contain patterns. 00:07:27.000 --> 00:07:29.000 If I give you the letter Q 00:07:29.000 --> 00:07:32.000 and ask you to predict the next letter, what do you think that would be? 
00:07:32.000 --> 00:07:34.000 Most of you said U, which is right. 00:07:34.000 --> 00:07:36.000 Now if I asked you to predict one more letter, 00:07:36.000 --> 00:07:38.000 what do you think that would be? 00:07:38.000 --> 00:07:41.000 Now there's several thoughts. There's E. It could be I. It could be A, 00:07:41.000 --> 00:07:44.000 but certainly not B, C or D, right? 00:07:44.000 --> 00:07:47.000 The Indus script also exhibits similar kinds of patterns. 00:07:47.000 --> 00:07:50.000 There's a lot of texts that start with this diamond-shaped symbol. 00:07:50.000 --> 00:07:52.000 And this in turn tends to be followed 00:07:52.000 --> 00:07:54.000 by this quotation marks-like symbol. 00:07:54.000 --> 00:07:56.000 And this is very similar to a Q and U example. 00:07:56.000 --> 00:07:58.000 This symbol can in turn be followed 00:07:58.000 --> 00:08:01.000 by these fish-like symbols and some other signs, 00:08:01.000 --> 00:08:03.000 but never by these other signs at the bottom. 00:08:03.000 --> 00:08:05.000 And furthermore, there's some signs 00:08:05.000 --> 00:08:07.000 that really prefer the end of texts, 00:08:07.000 --> 00:08:09.000 such as this jar-shaped sign, 00:08:09.000 --> 00:08:11.000 and this sign, in fact, happens to be 00:08:11.000 --> 00:08:13.000 the most frequently occurring sign in the script. NOTE Paragraph 00:08:13.000 --> 00:08:16.000 Given such patterns, here was our idea. 00:08:16.000 --> 00:08:18.000 The idea was to use a computer 00:08:18.000 --> 00:08:20.000 to learn these patterns, 00:08:20.000 --> 00:08:23.000 and so we gave the computer the existing texts. 00:08:23.000 --> 00:08:25.000 And the computer learned a statistical model 00:08:25.000 --> 00:08:27.000 of which symbols tend to occur together 00:08:27.000 --> 00:08:29.000 and which symbols tend to follow each other. 00:08:29.000 --> 00:08:31.000 Given the computer model, 00:08:31.000 --> 00:08:34.000 we can test the model by essentially quizzing it. 
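The idea of learning which symbols follow each other, and then quizzing the model on an erased symbol, can be sketched roughly as below. This is only an illustration under stated assumptions: the talk does not say which statistical model was used, so a simple bigram (pair-count) model stands in for it, and the symbol names and the tiny corpus are hypothetical, invented for the example.

```python
from collections import Counter, defaultdict

def train_bigrams(texts):
    """Count how often each symbol is immediately followed by another."""
    follows = defaultdict(Counter)
    for text in texts:
        for prev, nxt in zip(text, text[1:]):
            follows[prev][nxt] += 1
    return follows

def predict_missing(follows, text, gap):
    """Rank candidates for the erased symbol at index `gap` (gap >= 1).

    A candidate scores higher if it often follows the symbol before the gap
    and is often followed by the symbol after the gap (with add-one smoothing).
    """
    prev = text[gap - 1]
    nxt = text[gap + 1] if gap + 1 < len(text) else None
    scores = Counter()
    for cand, count in follows.get(prev, Counter()).items():
        after = follows.get(cand, Counter())[nxt] + 1 if nxt is not None else 1
        scores[cand] = count * after
    return [sym for sym, _ in scores.most_common()]

# Hypothetical toy corpus: each text is a list of made-up symbol names,
# listed in reading order. These are NOT real Indus texts.
texts = [
    ["diamond", "quote", "fish", "jar"],
    ["diamond", "quote", "fish", "fish", "jar"],
    ["diamond", "quote", "man", "jar"],
    ["fish", "man", "jar"],
]
model = train_bigrams(texts)

# Quiz the model: erase one symbol (None) and ask for ranked guesses.
print(predict_missing(model, ["diamond", None, "fish", "jar"], gap=1))
print(predict_missing(model, ["diamond", "quote", None, "jar"], gap=2))
```

The function returns a ranked list rather than a single answer, so one can check whether the right symbol is the first, second, or third guess.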
00:08:34.000 --> 00:08:36.000 So we could deliberately erase some symbols, 00:08:36.000 --> 00:08:39.000 and we can ask it to predict the missing symbols. 00:08:39.000 --> 00:08:42.000 Here are some examples. 00:08:45.000 --> 00:08:47.000 You may regard this 00:08:47.000 --> 00:08:49.000 as perhaps the most ancient game 00:08:49.000 --> 00:08:52.000 of Wheel of Fortune. NOTE Paragraph 00:08:53.000 --> 00:08:55.000 What we found 00:08:55.000 --> 00:08:57.000 was that the computer was successful in 75 percent of the cases 00:08:57.000 --> 00:08:59.000 in predicting the correct symbol. 00:08:59.000 --> 00:09:01.000 In the rest of the cases, 00:09:01.000 --> 00:09:04.000 typically the second best guess or third best guess was the right answer. 00:09:04.000 --> 00:09:06.000 There's also practical use 00:09:06.000 --> 00:09:08.000 for this particular procedure. 00:09:08.000 --> 00:09:10.000 There's a lot of these texts that are damaged. 00:09:10.000 --> 00:09:12.000 Here's an example of one such text. 00:09:12.000 --> 00:09:15.000 And we can use the computer model now to try to complete this text 00:09:15.000 --> 00:09:17.000 and make a best guess prediction. 00:09:17.000 --> 00:09:20.000 Here's an example of a symbol that was predicted. 00:09:20.000 --> 00:09:22.000 And this could be really useful as we try to decipher the script 00:09:22.000 --> 00:09:25.000 by generating more data that we can analyze. NOTE Paragraph 00:09:25.000 --> 00:09:28.000 Now here's one other thing you can do with the computer model. 00:09:28.000 --> 00:09:30.000 So imagine a monkey 00:09:30.000 --> 00:09:32.000 sitting at a keyboard. 00:09:32.000 --> 00:09:35.000 I think you might get a random jumble of letters that looks like this. 00:09:35.000 --> 00:09:37.000 Such a random jumble of letters 00:09:37.000 --> 00:09:39.000 is said to have a very high entropy. 00:09:39.000 --> 00:09:41.000 This is a physics and information theory term. 
00:09:41.000 --> 00:09:44.000 But just imagine it's a really random jumble of letters. 00:09:44.000 --> 00:09:48.000 How many of you have ever spilled coffee on a keyboard? 00:09:48.000 --> 00:09:50.000 You might have encountered the stuck-key problem -- 00:09:50.000 --> 00:09:53.000 so basically the same symbol being repeated over and over again. 00:09:53.000 --> 00:09:56.000 This kind of a sequence is said to have a very low entropy 00:09:56.000 --> 00:09:58.000 because there's no variation at all. 00:09:58.000 --> 00:10:01.000 Language, on the other hand, has an intermediate level of entropy; 00:10:01.000 --> 00:10:03.000 it's neither too rigid, 00:10:03.000 --> 00:10:05.000 nor is it too random. 00:10:05.000 --> 00:10:07.000 What about the Indus script? 00:10:07.000 --> 00:10:11.000 Here's a graph that plots the entropies of a whole bunch of sequences. 00:10:11.000 --> 00:10:13.000 At the very top you find the uniformly random sequence, 00:10:13.000 --> 00:10:15.000 which is a random jumble of letters -- 00:10:15.000 --> 00:10:17.000 and interestingly, we also find 00:10:17.000 --> 00:10:20.000 the DNA sequence from the human genome and instrumental music. 00:10:20.000 --> 00:10:22.000 And both of these are very, very flexible, 00:10:22.000 --> 00:10:24.000 which is why you find them in the very high range. 00:10:24.000 --> 00:10:26.000 At the lower end of the scale, 00:10:26.000 --> 00:10:28.000 you find a rigid sequence, a sequence of all A's, 00:10:28.000 --> 00:10:30.000 and you also find a computer program, 00:10:30.000 --> 00:10:32.000 in this case in the language Fortran, 00:10:32.000 --> 00:10:34.000 which obeys really strict rules. 00:10:34.000 --> 00:10:36.000 Linguistic scripts 00:10:36.000 --> 00:10:38.000 occupy the middle range. NOTE Paragraph 00:10:38.000 --> 00:10:40.000 Now what about the Indus script? 
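To make the rigid-versus-random scale concrete, here is a minimal sketch of one way to put a number on it. The talk does not specify the exact entropy measure, so this is an assumption: the code computes a first-order conditional entropy, meaning the average uncertainty (in bits) about the next symbol given the current one. The function name and the example strings are hypothetical.

```python
import math
from collections import Counter

def conditional_entropy(seq):
    """Entropy in bits of the next symbol given the current symbol."""
    pairs = Counter(zip(seq, seq[1:]))   # counts of (current, next) pairs
    firsts = Counter(seq[:-1])           # counts of each symbol as "current"
    total = sum(pairs.values())
    h = 0.0
    for (x, y), n in pairs.items():
        # p(x, y) * log2 p(y | x), summed with a minus sign
        h -= (n / total) * math.log2(n / firsts[x])
    return h

# A stuck key: the next symbol is always certain, so the entropy is zero.
print(conditional_entropy("AAAAAAAA"))

# Each symbol is followed by A or B equally often: one bit of uncertainty.
print(conditional_entropy("AABBAABBA"))
```

A sequence in which every symbol can be followed by many others with equal likelihood would score higher still; on the scale described in the talk, language falls between that extreme and zero.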
00:10:40.000 --> 00:10:42.000 We found that the Indus script 00:10:42.000 --> 00:10:44.000 actually falls within the range of the linguistic scripts. 00:10:44.000 --> 00:10:46.000 When this result was first published, 00:10:46.000 --> 00:10:49.000 it was highly controversial. 00:10:49.000 --> 00:10:52.000 There were people who raised a hue and cry, 00:10:52.000 --> 00:10:54.000 and these people were the ones who believed 00:10:54.000 --> 00:10:57.000 that the Indus script does not represent language. 00:10:57.000 --> 00:10:59.000 I even started to get some hate mail. 00:10:59.000 --> 00:11:01.000 My students said 00:11:01.000 --> 00:11:04.000 that I should really seriously consider getting some protection. 00:11:04.000 --> 00:11:06.000 Who'd have thought 00:11:06.000 --> 00:11:08.000 that deciphering could be a dangerous profession? 00:11:08.000 --> 00:11:10.000 What does this result really show? 00:11:10.000 --> 00:11:12.000 It shows that the Indus script 00:11:12.000 --> 00:11:14.000 shares an important property of language. 00:11:14.000 --> 00:11:16.000 So, as the old saying goes, 00:11:16.000 --> 00:11:18.000 if it looks like a linguistic script 00:11:18.000 --> 00:11:20.000 and it acts like a linguistic script, 00:11:20.000 --> 00:11:23.000 then perhaps we may have a linguistic script on our hands. 00:11:23.000 --> 00:11:25.000 What other evidence is there 00:11:25.000 --> 00:11:27.000 that the script could actually encode language? NOTE Paragraph 00:11:27.000 --> 00:11:30.000 Well linguistic scripts can actually encode multiple languages. 00:11:30.000 --> 00:11:33.000 So for example, here's the same sentence written in English 00:11:33.000 --> 00:11:35.000 and the same sentence written in Dutch 00:11:35.000 --> 00:11:37.000 using the same letters of the alphabet. 
00:11:37.000 --> 00:11:40.000 If you don't know Dutch and you only know English 00:11:40.000 --> 00:11:42.000 and I give you some words in Dutch, 00:11:42.000 --> 00:11:44.000 you'll tell me that these words contain 00:11:44.000 --> 00:11:46.000 some very unusual patterns. 00:11:46.000 --> 00:11:48.000 Some things are not right, 00:11:48.000 --> 00:11:51.000 and you'll say these words are probably not English words. 00:11:51.000 --> 00:11:53.000 The same thing happens in the case of the Indus script. 00:11:53.000 --> 00:11:55.000 The computer found several texts -- 00:11:55.000 --> 00:11:57.000 two of them are shown here -- 00:11:57.000 --> 00:11:59.000 that have very unusual patterns. 00:11:59.000 --> 00:12:01.000 So for example the first text: 00:12:01.000 --> 00:12:04.000 there's a doubling of this jar-shaped sign. 00:12:04.000 --> 00:12:06.000 This sign is the most frequently-occurring sign 00:12:06.000 --> 00:12:08.000 in the Indus script, 00:12:08.000 --> 00:12:10.000 and it's only in this text 00:12:10.000 --> 00:12:12.000 that it occurs as a doubling pair. NOTE Paragraph 00:12:12.000 --> 00:12:14.000 Why is that the case? 00:12:14.000 --> 00:12:17.000 We went back and looked at where these particular texts were found, 00:12:17.000 --> 00:12:19.000 and it turns out that they were found 00:12:19.000 --> 00:12:21.000 very, very far away from the Indus Valley. 00:12:21.000 --> 00:12:24.000 They were found in present day Iraq and Iran. 00:12:24.000 --> 00:12:26.000 And why were they found there? 00:12:26.000 --> 00:12:28.000 What I haven't told you is that 00:12:28.000 --> 00:12:30.000 the Indus people were very, very enterprising. 00:12:30.000 --> 00:12:33.000 They used to trade with people pretty far away from where they lived, 00:12:33.000 --> 00:12:36.000 and so in this case, they were traveling by sea 00:12:36.000 --> 00:12:39.000 all the way to Mesopotamia, present-day Iraq. 
00:12:39.000 --> 00:12:41.000 And what seems to have happened here 00:12:41.000 --> 00:12:44.000 is that the Indus traders, the merchants, 00:12:44.000 --> 00:12:47.000 were using this script to write a foreign language. 00:12:47.000 --> 00:12:49.000 It's just like our English and Dutch example. 00:12:49.000 --> 00:12:51.000 And that would explain why we have these strange patterns 00:12:51.000 --> 00:12:54.000 that are very different from the kinds of patterns you see in the texts 00:12:54.000 --> 00:12:57.000 that are found within the Indus Valley. 00:12:57.000 --> 00:12:59.000 This suggests that the same script, the Indus script, 00:12:59.000 --> 00:13:02.000 could be used to write different languages. 00:13:02.000 --> 00:13:05.000 The results we have so far seem to point to the conclusion 00:13:05.000 --> 00:13:08.000 that the Indus script probably does represent language. NOTE Paragraph 00:13:08.000 --> 00:13:10.000 If it does represent language, 00:13:10.000 --> 00:13:12.000 then how do we read the symbols? 00:13:12.000 --> 00:13:14.000 That's our next big challenge. 00:13:14.000 --> 00:13:16.000 So you'll notice that many of the symbols 00:13:16.000 --> 00:13:18.000 look like pictures of humans, of insects, 00:13:18.000 --> 00:13:21.000 of fishes, of birds. 00:13:21.000 --> 00:13:23.000 Most ancient scripts 00:13:23.000 --> 00:13:25.000 use the rebus principle, 00:13:25.000 --> 00:13:28.000 which is, using pictures to represent words. 00:13:28.000 --> 00:13:31.000 So as an example, here's a word. 00:13:31.000 --> 00:13:33.000 Can you write it using pictures? 00:13:33.000 --> 00:13:35.000 I'll give you a couple seconds. 00:13:35.000 --> 00:13:37.000 Got it? 00:13:37.000 --> 00:13:39.000 Okay. Great. 00:13:39.000 --> 00:13:41.000 Here's my solution. 00:13:41.000 --> 00:13:43.000 You could use the picture of a bee followed by a picture of a leaf -- 00:13:43.000 --> 00:13:45.000 and that's "belief," right. 
00:13:45.000 --> 00:13:47.000 There could be other solutions. 00:13:47.000 --> 00:13:49.000 In the case of the Indus script, 00:13:49.000 --> 00:13:51.000 the problem is the reverse. 00:13:51.000 --> 00:13:54.000 You have to figure out the sounds of each of these pictures 00:13:54.000 --> 00:13:56.000 such that the entire sequence makes sense. 00:13:56.000 --> 00:13:59.000 So this is just like a crossword puzzle, 00:13:59.000 --> 00:14:02.000 except that this is the mother of all crossword puzzles 00:14:02.000 --> 00:14:06.000 because the stakes are so high if you solve it. NOTE Paragraph 00:14:06.000 --> 00:14:09.000 My colleagues, Iravatham Mahadevan and Asko Parpola, 00:14:09.000 --> 00:14:11.000 have been making some headway on this particular problem. 00:14:11.000 --> 00:14:13.000 And I'd like to give you a quick example of Parpola's work. 00:14:13.000 --> 00:14:15.000 Here's a really short text. 00:14:15.000 --> 00:14:18.000 It contains seven vertical strokes followed by this fish-like sign. 00:14:18.000 --> 00:14:20.000 And I want to mention that these seals were used 00:14:20.000 --> 00:14:22.000 for stamping clay tags 00:14:22.000 --> 00:14:24.000 that were attached to bundles of goods, 00:14:24.000 --> 00:14:27.000 so it's quite likely that these tags, at least some of them, 00:14:27.000 --> 00:14:29.000 contain names of merchants. 00:14:29.000 --> 00:14:31.000 And it turns out that in India 00:14:31.000 --> 00:14:33.000 there's a long tradition 00:14:33.000 --> 00:14:35.000 of names being based on horoscopes 00:14:35.000 --> 00:14:38.000 and star constellations present at the time of birth. 00:14:38.000 --> 00:14:40.000 In Dravidian languages, 00:14:40.000 --> 00:14:42.000 the word for fish is "meen" 00:14:42.000 --> 00:14:45.000 which happens to sound just like the word for star. 
00:14:45.000 --> 00:14:47.000 And so seven stars 00:14:47.000 --> 00:14:49.000 would stand for "elu meen," 00:14:49.000 --> 00:14:51.000 which is the Dravidian word 00:14:51.000 --> 00:14:53.000 for the Big Dipper star constellation. 00:14:53.000 --> 00:14:56.000 Similarly, there's another sequence of six stars, 00:14:56.000 --> 00:14:58.000 and that translates to "aru meen," 00:14:58.000 --> 00:15:00.000 which is the old Dravidian name 00:15:00.000 --> 00:15:02.000 for the star constellation Pleiades. 00:15:02.000 --> 00:15:05.000 And finally, there's other combinations, 00:15:05.000 --> 00:15:08.000 such as this fish sign with something that looks like a roof on top of it. 00:15:08.000 --> 00:15:11.000 And that could be translated into "mey meen," 00:15:11.000 --> 00:15:14.000 which is the old Dravidian name for the planet Saturn. 00:15:14.000 --> 00:15:16.000 So that was pretty exciting. 00:15:16.000 --> 00:15:18.000 It looks like we're getting somewhere. NOTE Paragraph 00:15:18.000 --> 00:15:20.000 But does this prove 00:15:20.000 --> 00:15:22.000 that these seals contain Dravidian names 00:15:22.000 --> 00:15:24.000 based on planets and star constellations? 00:15:24.000 --> 00:15:26.000 Well not yet. 00:15:26.000 --> 00:15:28.000 So we have no way of validating 00:15:28.000 --> 00:15:30.000 these particular readings, 00:15:30.000 --> 00:15:33.000 but if more and more of these readings start making sense, 00:15:33.000 --> 00:15:35.000 and if longer and longer sequences 00:15:35.000 --> 00:15:37.000 appear to be correct, 00:15:37.000 --> 00:15:39.000 then we know that we are on the right track. 00:15:39.000 --> 00:15:41.000 Today, 00:15:41.000 --> 00:15:44.000 we can write a word such as TED 00:15:44.000 --> 00:15:47.000 in Egyptian hieroglyphics and in cuneiform script, 00:15:47.000 --> 00:15:49.000 because both of these were deciphered 00:15:49.000 --> 00:15:51.000 in the 19th century. 
00:15:51.000 --> 00:15:53.000 The decipherment of these two scripts 00:15:53.000 --> 00:15:56.000 enabled these civilizations to speak to us again directly. 00:15:56.000 --> 00:15:58.000 The Mayans 00:15:58.000 --> 00:16:00.000 started speaking to us in the 20th century, 00:16:00.000 --> 00:16:03.000 but the Indus civilization remains silent. NOTE Paragraph 00:16:03.000 --> 00:16:05.000 Why should we care? 00:16:05.000 --> 00:16:07.000 The Indus civilization does not belong 00:16:07.000 --> 00:16:09.000 to just the South Indians or the North Indians 00:16:09.000 --> 00:16:11.000 or the Pakistanis; 00:16:11.000 --> 00:16:13.000 it belongs to all of us. 00:16:13.000 --> 00:16:15.000 These are our ancestors -- 00:16:15.000 --> 00:16:17.000 yours and mine. 00:16:17.000 --> 00:16:19.000 They were silenced 00:16:19.000 --> 00:16:21.000 by an unfortunate accident of history. 00:16:21.000 --> 00:16:23.000 If we decipher the script, 00:16:23.000 --> 00:16:25.000 we would enable them to speak to us again. 00:16:25.000 --> 00:16:28.000 What would they tell us? 00:16:28.000 --> 00:16:31.000 What would we find out about them? About us? 00:16:31.000 --> 00:16:34.000 I can't wait to find out. NOTE Paragraph 00:16:34.000 --> 00:16:36.000 Thank you. NOTE Paragraph 00:16:36.000 --> 00:16:40.000 (Applause)