21-34 Segment Question 1 Solution

Showing Revision 1 created 11/28/2012 by Amara Bot.

In this case, the problem really comes down to the fact that the naive Bayes assumption is a weak one, and the Markov assumption would do much better. It wouldn't really help to have more data or to do a better job of smoothing, because I already have good counts for words like "in" and "significant," as well as words like "small" and "and." They're all common enough that I have a good representation of how often they occur as a unigram, as a single word. The problem is that we would like to know that the word "small" goes very well with the word "insignificant" but does not go very well with the word "significant." So if we had a Markov model where the probability of "insignificant" depended on the probability of "small," then we could catch that, and we could get this segmentation correct.
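The contrast above can be sketched in a few lines of Python. All counts and the corpus size here are made up purely for illustration; the point is only the shape of the two scoring functions: the unigram (naive Bayes) model multiplies word probabilities independently, while the bigram (Markov) model conditions each word on its predecessor.

```python
# A minimal sketch, with hypothetical counts, of why a bigram (Markov) model
# can fix a segmentation that a unigram (naive Bayes) model gets wrong.

TOTAL = 1_000_000  # hypothetical corpus size in tokens

# Hypothetical unigram counts: "in" and "significant" are common,
# "insignificant" is rarer.
unigram = {"small": 500, "in": 9000, "significant": 1000, "insignificant": 5}

# Hypothetical bigram counts: "small insignificant" is a natural collocation;
# "small in" and "in significant" are not.
bigram = {("small", "insignificant"): 20,
          ("small", "in"): 1,
          ("in", "significant"): 2}

def unigram_prob(words):
    """P(words) under the naive Bayes assumption: a product of unigram probs."""
    p = 1.0
    for w in words:
        p *= unigram[w] / TOTAL
    return p

def bigram_prob(words, prev="small"):
    """P(words | prev) under a first-order Markov assumption."""
    p = 1.0
    for w in words:
        p *= bigram.get((prev, w), 0) / unigram[prev]
        prev = w
    return p

# The unigram model prefers the wrong split, because "in" is so common:
assert unigram_prob(["in", "significant"]) > unigram_prob(["insignificant"])

# Conditioning on the preceding word "small" flips the decision:
assert bigram_prob(["insignificant"]) > bigram_prob(["in", "significant"])
```

Note that a real bigram model would smooth unseen bigrams rather than assign them probability zero, as `bigram.get((prev, w), 0)` does here; this sketch skips smoothing because, as the transcript says, smoothing is not what fixes this particular error.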