
Unit 22, 12: LPCFG.mp4

  • 0:00 - 0:07
    Let's review. We started off with a context-free grammar. That's a rule of the form,
  • 0:07 - 0:17
    "VP goes to V NP NP." That's the kind of technology that's used in your grammars for programming languages.
  • 0:17 - 0:23
    And then we moved to a probabilistic, context-free grammar by adding on a probability,
  • 0:23 - 0:28
    and we put it in parentheses to the right, but let's be more clear about exactly what we're doing.
  • 0:28 - 0:36
    We're saying, "What's the probability of this rule, given that the left-hand side of the rule is VP?"
  • 0:36 - 0:40
    And we said that was equal to .2.
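The conditional reading of a rule probability can be made concrete in a small sketch (the grammar and numbers below are illustrative, not from the lecture): each probability is conditioned on the left-hand side, so the probabilities of all expansions of a given nonterminal sum to 1.

```python
# A minimal PCFG sketch: rule probabilities are conditioned on the
# left-hand side, so for each LHS the expansion probabilities sum to 1.
# The grammar fragment and numbers here are invented for illustration.
pcfg = {
    "VP": {
        ("V", "NP", "NP"): 0.2,   # VP -> V NP NP  (ditransitive)
        ("V", "NP"): 0.5,         # VP -> V NP     (transitive)
        ("V",): 0.3,              # VP -> V        (intransitive)
    },
}

def rule_probability(lhs, rhs):
    """P(lhs -> rhs | lhs): look up the conditional rule probability."""
    return pcfg.get(lhs, {}).get(tuple(rhs), 0.0)

print(rule_probability("VP", ["V", "NP", "NP"]))  # 0.2
```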
  • 0:40 - 0:48
    Now, the next step is to go and add lexicalization, so we have a lexicalized, probabilistic, context-free grammar.
  • 0:48 - 0:55
    So in a lexicalized, probabilistic, context-free grammar, we deal not with the categories of the left-hand side,
  • 0:55 - 0:59
    but rather with specific words. And there's multiple ways you can do that.
  • 0:59 - 1:08
    And one way we can do it is say, "What's the probability that a verb phrase is a verb followed by two noun phrases?"
  • 1:08 - 1:12
    And we're going to condition that on what the actual verb is.
  • 1:12 - 1:18
    If the verb is "gave," then there should be a relatively high probability.
  • 1:18 - 1:26
You'd say, "He gave me the money": a direct and an indirect object. That's fairly common for "gave."
  • 1:26 - 1:33
So maybe that's .25 or something. And compare that to the same rule where the verb is "said."
  • 1:33 - 1:44
Normally, the verb "said" takes a single object, as in "He said something," but it doesn't normally have two objects.
  • 1:44 - 1:49
It would be rare to say, "He said me something." In colloquial speech, it may occur:
  • 1:50 - 1:54
    "I said me my piece." But we're going to put down a very low probability for that.
  • 1:54 - 2:01
If we had a treebank, we could figure out how low it is. But I'm just going to estimate it's something like .0001.
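This kind of treebank estimate is just relative frequency: count how often each expansion occurs with each head verb. A sketch with invented toy counts (chosen so "gave" comes out at the lecture's .25; a real treebank would give the actual numbers):

```python
from collections import Counter

# Hypothetical sketch: estimate P(VP -> expansion | head verb) from
# counts in a toy "treebank". The (verb, expansion) pairs are invented.
observations = (
    [("gave", ("V", "NP", "NP"))] * 25 + [("gave", ("V", "NP"))] * 75 +
    [("said", ("V", "NP"))] * 99 + [("said", ("V", "NP", "NP"))] * 1
)

rule_counts = Counter(observations)
verb_counts = Counter(verb for verb, _ in observations)

def p_rule_given_verb(expansion, verb):
    """Relative-frequency estimate of P(VP -> expansion | head verb)."""
    return rule_counts[(verb, expansion)] / verb_counts[verb]

print(p_rule_given_verb(("V", "NP", "NP"), "gave"))  # 0.25
print(p_rule_given_verb(("V", "NP", "NP"), "said"))  # 0.01
```

With only 100 toy observations per verb the rarest estimate bottoms out at 0.01; a treebank large enough to support the lecture's .0001 would need on the order of tens of thousands of examples per verb.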
  • 2:01 - 2:07
    In a dictionary, they'll give you these probabilities, but they'll give them in absolute terms,
  • 2:07 - 2:12
    in that they'll tell you whether verbs are transitive or intransitive.
  • 2:12 - 2:19
    So for example, what's the probability that a verb phrase consists of just a verb?
  • 2:19 - 2:28
    Versus that the verb phrase consists of a verb followed by a noun phrase, given that the verb is "quake"?
  • 2:28 - 2:33
    Well, I can put down some numbers here, but if I look in my dictionary, I get a clue.
  • 2:33 - 2:38
    So my dictionary says that "quake" is an intransitive verb.
  • 2:38 - 2:43
    And so that means the dictionary is claiming that this probability should be zero.
  • 2:43 - 2:50
    And this probability should be something higher. Now, unfortunately, if we go out and look at the actual world,
  • 2:50 - 2:53
    it turns out that "quake" is not always intransitive.
  • 2:54 - 3:02
When I do a web search for "quaked the earth," I get back 20,000 results. Now, not all of those are valid sentences;
  • 3:02 - 3:08
in some of them, those words just happen to appear together in a non-sentence context, a list of words or something.
  • 3:08 - 3:14
    But you can see thousands of sentences where "quake" is used transitively.
  • 3:14 - 3:20
    And so this shouldn't be a zero. Maybe it should be a .0001 or something.
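One standard way to turn the dictionary's categorical zero into a small nonzero probability is add-one (Laplace) smoothing. A sketch with invented counts, which lands near the lecture's rough .0001:

```python
# Sketch of add-one (Laplace) smoothing, so a rarely attested use like
# transitive "quake" gets a small nonzero probability instead of the
# dictionary's categorical zero. The counts below are invented.
counts = {
    ("quake", ("V",)): 10000,    # intransitive: "the earth quaked"
    ("quake", ("V", "NP")): 0,   # transitive: unattested in our sample
}

def smoothed_probability(verb, expansion, alpha=1.0):
    """P(VP -> expansion | verb) with add-alpha smoothing."""
    expansions = [e for (v, e) in counts if v == verb]
    total = sum(counts[(verb, e)] for e in expansions)
    return (counts[(verb, expansion)] + alpha) / (total + alpha * len(expansions))

p = smoothed_probability("quake", ("V", "NP"))
print(p)  # small but nonzero, on the order of 1e-4
```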
  • 3:20 - 3:28
    But the dictionaries are too Boolean, too logical, too willing to give you a precise answer,
  • 3:28 - 3:30
    when language is really much more complex than that.
  • 3:30 - 3:34
    And so these lexicalized grammars come closer to giving you what you need.
  • 3:34 - 3:38
    Now, we still haven't gone all the way to solving our telescope problem.
  • 3:38 - 3:49
    For that, we want to be able to say, "What's the probability of noun phrase going to noun phrase followed by prepositional phrase?"
  • 3:49 - 3:58
    Or, "What's the probability of a verb phrase going to a verb followed by a noun phrase, followed by a prepositional phrase?"
  • 3:58 - 4:05
And we want to do that in the case where the verb equals "saw," and where we're also dealing with a case where
  • 4:05 - 4:16
the noun phrase has a head, meaning its main word, equal to "man," and the prepositional phrase has "with" and "telescope."
  • 4:16 - 4:27
And then compare that to the probability of the noun-phrase rule, for when the head of the noun phrase is "man" and the prepositional phrase has "with" and "telescope."
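The comparison amounts to scoring each parse as the product of its rule probabilities and picking the larger. A sketch of the telescope disambiguation, with all probabilities invented for illustration:

```python
import math

# Hypothetical lexicalized rule probabilities (all numbers invented):
rules_vp_attach = {   # "saw [the man] [with the telescope]" - instrument
    "VP -> V NP PP | verb=saw, NP head=man, PP=with telescope": 0.02,
    "NP -> Det N   | head=man": 0.5,
}
rules_np_attach = {   # "saw [the man with the telescope]" - modifier
    "VP -> V NP    | verb=saw, NP head=man": 0.3,
    "NP -> NP PP   | head=man, PP=with telescope": 0.005,
}

def parse_score(rules):
    """A parse's probability is the product of its rule probabilities."""
    return math.prod(rules.values())

# With these invented numbers, the instrument (VP-attachment) reading wins.
print(parse_score(rules_vp_attach) > parse_score(rules_np_attach))  # True
```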
  • 4:27 - 4:36
    Now, these probabilities may be hard to get, because they're conditioning on quite a lot, on three very specific words on the right-hand side.
  • 4:36 - 4:40
    And so it may be hard to estimate these, and we may need some model that backs off,
  • 4:40 - 4:50
    that says maybe we don't look exactly for the word "man," but rather we look for something which represents an animate person.
  • 4:50 - 4:56
    And so just as we had in previous models when we talked about how to do smoothing and how to back off
  • 4:56 - 5:01
    to a more general case, we can do that in these lexicalized models as well.
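The backoff idea can be sketched directly: use the word-specific estimate when we have one, and otherwise fall back to an estimate for the word's class (e.g. "man" backs off to an animate-person class). The lexicon and probabilities below are invented for illustration:

```python
# Sketch of a simple backoff scheme: prefer the estimate conditioned on
# the exact head word; if it's missing, back off to the head's word
# class. Lexicon and probabilities here are invented.
word_class = {"man": "ANIMATE", "woman": "ANIMATE", "telescope": "INSTRUMENT"}

specific = {   # P(rule | exact head word): sparse, may lack entries
    ("NP -> NP PP", "man"): 0.004,
}
by_class = {   # P(rule | word class): denser backup estimates
    ("NP -> NP PP", "ANIMATE"): 0.003,
    ("NP -> NP PP", "INSTRUMENT"): 0.001,
}

def backed_off_probability(rule, head):
    """Word-specific estimate if available, else the class estimate."""
    if (rule, head) in specific:
        return specific[(rule, head)]
    return by_class.get((rule, word_class.get(head, "")), 0.0)

print(backed_off_probability("NP -> NP PP", "man"))    # 0.004 (specific)
print(backed_off_probability("NP -> NP PP", "woman"))  # 0.003 (backed off)
```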
  • 5:01 - 5:07
    But the point is that we want to make these choices based on probabilities, and we get these probabilities
  • 5:07 - 5:16
by looking at our model, doing an analysis, and informing that analysis with data that we get from the treebanks.
  • 5:16 - 5:23
If we put that all together, then we can make these choices and come up with the right interpretation of sentences,
  • 5:23 - 5:28
    and do the disambiguation, and figure out which one is more probable.
CS271 - Intro to Artificial Intelligence