English subtitles

← Unit 22 07 PCFGmp4

Get Embed Code
2 Languages

Showing Revision 2 created 05/25/2016 by Udacity Robot.

  1. The first step is called a probabilistic, context-free grammar, or PCFG.
  2. And what that means is we take this context-free grammar--this is the same one we had before;
  3. I just rewrote it on multiple lines, so each alternative separated by the "or" bar is on a separate line by itself.
  4. And now we're just going to associate a probability with each of the right-hand sides for each of the left-hand side symbols.
  5. So for a sentence, there's only one way to rewrite a sentence, so that gets probability one.
  6. And for a noun phrase, there are four ways, and we have to divvy up our probability one among them.
  7. So let's say we do .3, .4, .2, .1. That's saying that it's going to be rare to have this three-noun compound,
  8. and fairly common to have a determiner followed by a noun, like "the dog."
  9. Now, for a verb phrase maybe we'll have .4, .4, .2. And we can fill in numbers for each of these categories as well,
  10. to distinguish between the common and the uncommon words, used as each of the categories.
  11. And we'll go ahead and fill those in. So now what we have is a grammar that specifies all the allowable trees.
  12. So a tree is made up of a conjunction of these rules. We start off with a sentence and we generate a noun phrase and a verb phrase,
  13. and then for the noun phrase we generate one of these choices, and so on.
  14. And a probability associated with each of those choices. And the probability of the sentence as a whole
  15. to be associated with the tree that we've chosen is just a product of all the individual probabilities.
  16. So for example, we can take the sentence, "Fed raises interest rates" and associate a tree with that,
  17. and calculate the probability of the tree. We don't have space here, so let's put it on its own page.
  18. So now we have our sentence, we have our first tree, and we have our probabilistic context-free grammar,
  19. and now we can figure out the probability of this parse tree. The way we do that is
  20. for each of the nodes in the tree, we just look up the probability.
  21. So what's the probability of N becoming "Fed"? That's N is Fed, is .3. So we'll write that down here.
  22. V becoming "raises" is .6. N being interest is .3. As is N being rates. Now, nP becoming an N, that's here,
  23. that's also .3. NP becoming two nouns, that's .2. A VP being a verb followed by a noun phrase is .4.
  24. And a sentence being a noun phrase followed by a verb phrase, in our grammar that's probability one.
  25. We multiply these all out, and we get .0003888, or .039%, approximately.