[Script Info]
Title:

[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,♪ (Music fades in) ♪
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,(Chirping)
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,(Vocalizations, different languages)
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,(Talking overlaps in background)
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,(Computerized beeping)
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,(Man) We come into this world with the\Ninnate ability to learn to interact
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,with other sentient beings.
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,(Child vocalizing)
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,(Man) Suppose you are to interact with\Nother people by writing little messages.
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,(Man) It'd be a real pain.
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,(Man) And that's how we interact\Nwith computers.
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,It's much easier just to talk to them...\Njust so much easier...
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,(Man) If the computers could understand\Nwhat we're saying.
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,For that, you need really\Ngood speech recognition.
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,(Narrator) The first speech recognition\Nsystem was developed by Bell Laboratories
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,in 1952. It could only recognize\Nnumbers spoken by one person.
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,In the 1970s, Carnegie-Mellon\Ncame out with the Harpy System.
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,This was able to recognize over\N1,000 words and different pronunciations
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,(Narrator) of the same word.\N- (Man) Tomato - (Woman) Tomato
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,(Narrator) Speech recognition continued\Nin the 80s with the introduction of the
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,Hidden Markov Model, which\Nused a more mathematical approach
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,to analyzing sound waves that led to\Nmany breakthroughs we have today.
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,You're taking in very raw audio waveforms
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,like you get through a microphone
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,on your phone
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,or (unintelligible)
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,(Woman) We chop it into small pieces\Nand it tries to identify which phoneme
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,was spoken in that piece of speech.
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,- Phoneme is a primitive unit for\Nexpressing words.
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,(voicing phonemes shown above)
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,And then you stitch those together\Ninto likely words like Palo Alto.
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,- Speech recognition today is good at\Ntranscribing what you've said...
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,(Man, to phone) What's the weather\Nlike in Topeka?
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,(Man) You can talk about travels, your\Ncontacts, like, "Where can I get pizza?"
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,(Phone) Here are the listings for Pizza.
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,(Man) "Where's the Eiffel Tower?"\N(Phone) The Eiffel Tower is ...
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,(Woman) We've made tremendous\Nimprovements very quickly.
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,(Man, to phone) Who is the 21st\NPresident of the United States?
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,(Phone beeps)\N(Phone) Chester A. Arthur was the 21st...
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,(Man, to phone) Okay, Google,\Nwhere is he from?
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,(Man) Years ago, you had to be an engineer\Nto interact with computers.
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,Today, everybody can interact.
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,- One thing still in its\Ninfancy is understanding.
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,- We need a far more sophisticated\Nlanguage understanding model
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,that understands what the sentence means.
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,We're still a very long way from that.
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,(Beeping)
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,♪ (Soft background music) ♪
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,(Woman) Our ability to use language is one\Nof the things that helps us have culture.
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,It's one of the things that helps\Nus pass on traditions across generations.
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,Figuring out how the system of language\Nworks, even though it seems easy,
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,turns out to be very hard, but is one that\Nevery baby understands by 2 years old.
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,(Girl) There's two of them.\N(Woman) There's two Ls, yeah (spells word)
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,- Language is extremely complex\Nand sophisticated...
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,- From the semantics
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,- (Man in chair) Ironies...\N- (Woman) Strong accents...
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,- (Man) Facial expressions, human emotions...
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,- Because that's part of\Nhow we communicate.
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,- Humor...
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,(Aside) Do I have to be careful\Nnot to offend the dinosaur?
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,- Language has so many different\Nlayers and that's why it's
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,such a difficult problem.
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,(Man) The (unknown) human brain\Nand the learning algorithms in it
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,are far, far better at things like\Nlanguage understanding
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,and they're still a lot better\Nat pun recognition.
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,- Whether or not we replicate exactly\Nwhat the brain does, to understand
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,language and speech, is still a question.
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,(Beeping)
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,(Man) For many years, we believed that\Nneural networks should work better than
Dialogue: 0,9:59:59.99,9:59:59.99,Default,,0000,0000,0000,,the dull existing technology