[Script Info]
Title:
[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
Dialogue: 0,0:00:00.13,0:00:07.43,Default,,0000,0000,0000,,Here's an example of translation. This is a phrase-based translation model that doesn't happen to use any syntactic trees.
Dialogue: 0,0:00:07.43,0:00:12.90,Default,,0000,0000,0000,,And in this case we're using an example going from German to English.
Dialogue: 0,0:00:12.90,0:00:18.83,Default,,0000,0000,0000,,And in this model we break up the probability of the translation into three components.
Dialogue: 0,0:00:18.83,0:00:25.30,Default,,0000,0000,0000,,First, a segmentation model: how do we break up the German into phrases?
Dialogue: 0,0:00:25.30,0:00:30.83,Default,,0000,0000,0000,,Here the sentence has been broken up into 1-2-3-4-5 phrases.
Dialogue: 0,0:00:30.83,0:00:36.87,Default,,0000,0000,0000,,Then a translation model. For each phrase, what's a good translation into English?
Dialogue: 0,0:00:36.87,0:00:45.50,Default,,0000,0000,0000,,And then a distortion model, saying, "Each of these phrases, what order would be a good order to put them into?"
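The three-component decomposition described above can be sketched as a simple product of probabilities (a minimal sketch; the function name and all numbers in the usage note are illustrative, not from the lecture):

```python
def phrase_model_score(seg_prob, translation_probs, distortion_probs):
    """Score one candidate translation as the product of the three components:
    P(segmentation) * prod P(english_i | german_i) * prod P(distortion_i).
    All inputs are probabilities supplied by the three sub-models."""
    score = seg_prob
    for p in translation_probs:
        score *= p
    for p in distortion_probs:
        score *= p
    return score
```

For example, with made-up component probabilities, `phrase_model_score(0.5, [0.8, 0.5], [1.0, 0.25])` multiplies out to 0.05.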
Dialogue: 0,0:00:45.50,0:00:50.30,Default,,0000,0000,0000,,So let's look at each of those in turn. First, the segmentation model.
Dialogue: 0,0:00:50.30,0:00:59.97,Default,,0000,0000,0000,,We have a database of phrases that we picked out--maybe through a similar process to what went on in the Chinese menu,
Dialogue: 0,0:00:59.97,0:01:07.50,Default,,0000,0000,0000,,where we looked for coherent phrases that occurred frequently, and so we're able to apply probabilities to them.
Dialogue: 0,0:01:07.50,0:01:15.00,Default,,0000,0000,0000,,So now we have a probability. What's the probability that "morgen" is a phrase and that "fliege" is a phrase by itself?
Dialogue: 0,0:01:15.00,0:01:20.27,Default,,0000,0000,0000,,We would also consider the probability that they form a phrase together,
Dialogue: 0,0:01:20.27,0:01:24.37,Default,,0000,0000,0000,,and come up with a high probability segmentation.
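One way to picture the segmentation model is to enumerate every way of splitting the sentence into phrases from the phrase database and keep the most probable split (a sketch; the phrase probabilities below are made up for illustration):

```python
def segmentations(tokens, phrase_probs):
    """Yield (segmentation, probability) pairs, where a segmentation is a
    list of phrases from phrase_probs that covers all the tokens, and its
    probability is the product of the individual phrase probabilities."""
    if not tokens:
        yield [], 1.0
    for i in range(1, len(tokens) + 1):
        phrase = " ".join(tokens[:i])
        if phrase in phrase_probs:
            for rest, p in segmentations(tokens[i:], phrase_probs):
                yield [phrase] + rest, phrase_probs[phrase] * p
```

With, say, `{"morgen": 0.4, "fliege": 0.3, "morgen fliege": 0.1, "ich": 0.5}`, splitting "morgen fliege ich" word-by-word (0.4 × 0.3 × 0.5 = 0.06) beats the split that keeps "morgen fliege" together (0.1 × 0.5 = 0.05).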
Dialogue: 0,0:01:24.37,0:01:29.83,Default,,0000,0000,0000,,Next, the translation model. That's going between the two sides of the Chinese menu.
Dialogue: 0,0:01:29.83,0:01:36.00,Default,,0000,0000,0000,,How often, when we saw the phrase "morgen", did it correspond to the phrase "tomorrow" in English?
Dialogue: 0,0:01:36.00,0:01:40.27,Default,,0000,0000,0000,,And so on for the other phrases. So far, that's all pretty straightforward.
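Those phrase translation probabilities can be estimated from counts of aligned phrase pairs, as count(german, english) / count(german) (a sketch; the pair data in the usage note is invented):

```python
from collections import Counter

def phrase_table(aligned_pairs):
    """Estimate P(english phrase | german phrase) from co-occurrence
    counts of aligned (german, english) phrase pairs."""
    pair_counts = Counter(aligned_pairs)
    german_counts = Counter(g for g, _ in aligned_pairs)
    return {(g, e): n / german_counts[g] for (g, e), n in pair_counts.items()}
```

If "morgen" aligned with "tomorrow" 8 times out of 10 occurrences, the table gives P(tomorrow | morgen) = 0.8.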
Dialogue: 0,0:01:40.27,0:01:49.37,Default,,0000,0000,0000,,And then we have the distortion model, saying, "In what order should we put these phrases? Should we swap them around in any order?"
Dialogue: 0,0:01:49.37,0:01:54.47,Default,,0000,0000,0000,,And we measure that just by looking at the beginning and the ending of each phrase.
Dialogue: 0,0:01:54.47,0:02:06.47,Default,,0000,0000,0000,,So Ai is the beginning of the i-th phrase, and Bi minus one is the ending of the (i minus 1)-th phrase.
Dialogue: 0,0:02:06.47,0:02:14.53,Default,,0000,0000,0000,,We measure those positions in the German, but we choose the indexes, the i's, by looking at the English order.
Dialogue: 0,0:02:14.53,0:02:21.77,Default,,0000,0000,0000,,So we say, "The last phrase is this phrase, 'in Canada,'" and that corresponds to this one here,
Dialogue: 0,0:02:21.77,0:02:31.40,Default,,0000,0000,0000,,and so the beginning of that phrase is at number three, and the next to last phrase is this one, that corresponds to "zero confidence."
Dialogue: 0,0:02:31.40,0:02:41.17,Default,,0000,0000,0000,,And the end of that phrase is at seven. And so the distortion there from three to seven is a distortion of four.
Dialogue: 0,0:02:41.17,0:02:46.13,Default,,0000,0000,0000,,And our distortion model, then, would just be a probability distribution over those integers.
Dialogue: 0,0:02:46.13,0:02:53.37,Default,,0000,0000,0000,,So it's not doing anything fancy in terms of saying what type of phrases occur before or after what other types of phrases.
Dialogue: 0,0:02:53.37,0:03:00.17,Default,,0000,0000,0000,,It's just saying, "Are they shifted to the right or to the left? And are they shifted a small amount or a large amount?"
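Under the convention that seems to be in use in this example (a 0-based phrase start position, an end position one past the last word, both measured in the German, with phrases taken in English order), the distortion values can be computed as Ai minus Bi-minus-one (a sketch; the spans in the usage note are illustrative, not the lecture's exact word positions):

```python
def distortions(spans_in_english_order):
    """Spans are (begin, end) word positions in the German sentence, with
    0-based begin and exclusive end, listed in the order the corresponding
    English phrases appear. Distortion for phrase i is a_i - b_(i-1);
    a perfectly monotone ordering gives all zeros."""
    ds = []
    prev_end = 0
    for begin, end in spans_in_english_order:
        ds.append(begin - prev_end)
        prev_end = end
    return ds
```

For spans like `[(0, 1), (2, 3), (1, 2), (5, 7), (3, 5)]`, the last phrase's distortion is 3 - 7 = -4, a magnitude-four jump like the "in Canada" example; monotone spans like `[(0, 2), (2, 5), (5, 7)]` give all zeros.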
Dialogue: 0,0:03:00.17,0:03:06.37,Default,,0000,0000,0000,,And I should note that in this model, if we had a one-to-one translation where things weren't switched--
Dialogue: 0,0:03:06.37,0:03:15.87,Default,,0000,0000,0000,,So say, if the original German sentence had "zur Konferenz" before "nach Kanada," and we translated it into English like this,
Dialogue: 0,0:03:15.87,0:03:25.60,Default,,0000,0000,0000,,then the Bi minus one would be five, and the Ai--imagine this being swapped over here--would also be five.
Dialogue: 0,0:03:25.60,0:03:30.77,Default,,0000,0000,0000,,In that case the distortion would be zero. And so for a language where the words line up
Dialogue: 0,0:03:30.77,0:03:38.17,Default,,0000,0000,0000,,very closely between the source and the target language--for those pairs of languages--then we'd have a high probability
Dialogue: 0,0:03:38.17,0:03:42.83,Default,,0000,0000,0000,,mass under a zero distortion, and lower probability for other distortions.
Dialogue: 0,0:03:42.83,0:03:50.37,Default,,0000,0000,0000,,For a language pair where lots of things are swapped far apart, a more volatile type of translation,
Dialogue: 0,0:03:50.37,0:03:57.80,Default,,0000,0000,0000,,then we'd expect the probability mass to be lower for zero distortion, and higher for higher distortions.
Dialogue: 0,0:03:57.80,0:04:04.87,Default,,0000,0000,0000,,So this is a very simple model. It takes into account only segmentation, translation between phrases,
Dialogue: 0,0:04:04.87,0:04:12.53,Default,,0000,0000,0000,,and just the simplest model of distortion. You can imagine a more complex model based on trees and other components.
Dialogue: 0,0:04:12.53,0:04:17.56,Default,,0000,0000,0000,,And I should note that this is just the translation part of the model.
Dialogue: 0,0:04:17.57,0:04:22.93,Default,,0000,0000,0000,,And then to make the final choice, we would want to multiply out all these probabilities,
Dialogue: 0,0:04:22.93,0:04:28.27,Default,,0000,0000,0000,,but we would also want to take into account the probability of the generated English sentence.
Dialogue: 0,0:04:28.27,0:04:33.70,Default,,0000,0000,0000,,Is this a good sentence in English? And we have a probability model for that.
Dialogue: 0,0:04:33.70,0:04:40.30,Default,,0000,0000,0000,,That's a monolingual model rather than a bilingual model. And the process of coming up with the best translation, then,
Dialogue: 0,0:04:40.30,0:04:48.30,Default,,0000,0000,0000,,is just a search through all possible segmentations, all possible translations, all possible distortions,
Dialogue: 0,0:04:48.30,0:04:55.70,Default,,0000,0000,0000,,multiply up these probabilities times the monolingual probability, and find the one that gives you the highest value,
Dialogue: 0,0:04:55.70,0:05:02.47,Default,,0000,0000,0000,,and that'll be your best translation. And the tricky part is just coming up with a search technique that can enumerate
Dialogue: 0,0:05:02.47,0:05:07.00,Default,,0000,0000,0000,,through many of those possibilities quickly, and choose a good one.
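The search described here can be sketched, for a handful of already-translated phrases, as a brute-force loop over orderings that scores each ordering by its distortion probabilities times the monolingual (language-model) probability. This is only an illustration of the scoring: a real decoder would also search over segmentations and translations and would prune rather than enumerate, and all probabilities in the usage note are invented:

```python
import itertools

def decode(phrase_pairs, distortion_prob, lm_prob):
    """Brute-force ordering search. phrase_pairs is a list of
    ((german_begin, german_end), english_phrase) with 0-based begin and
    exclusive end; distortion_prob maps an integer distortion to a
    probability; lm_prob scores an English string. Returns the highest-
    scoring English ordering and its score."""
    best, best_score = None, 0.0
    for perm in itertools.permutations(phrase_pairs):
        english = " ".join(e for _, e in perm)
        score = lm_prob(english)
        prev_end = 0
        for (begin, end), _ in perm:
            score *= distortion_prob(begin - prev_end)
            prev_end = end
        if score > best_score:
            best, best_score = english, score
    return best, best_score
```

With a toy distortion model like `lambda d: 0.5 ** abs(d)` and a toy language model that strongly prefers "tomorrow I will fly", the decoder picks that ordering even though it pays a distortion cost, because the monolingual probability outweighs it.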