21-33 Segment Question 1

  • 0:00 - 0:04
    Now I want to give you an idea of how well the segmentation program performs.
  • 0:04 - 0:07
    Here I've trained it on a corpus of 4 billion words--
  • 0:07 - 0:10
    not just the Shakespeare corpus but a larger corpus,
  • 0:10 - 0:14
    and then I give it some test cases to try to find the best segmentation.
  • 0:14 - 0:19
    So I gave it the test case here. The program came up with "base rate sought to,"
  • 0:19 - 0:22
    but the correct answer was "base rates ought to."
  • 0:22 - 0:28
    In this case, it just seems somewhat like bad luck that that was the right answer,
  • 0:28 - 0:32
    but both segmentations seem like good segmentations.
  • 0:32 - 0:34
    Next was this trial.
  • 0:34 - 0:38
    My program came up with "small and in significant,"
  • 0:38 - 0:41
    but the correct answer was "small and insignificant."
  • 0:41 - 0:45
    Here it seems like it really has erred: "small and insignificant"
  • 0:45 - 0:49
    seems like a much better segmentation than the one my program came up with.
  • 0:49 - 0:55
    What I want you to tell me is what you think could help us do a better job of getting the right answer.
  • 0:55 - 0:59
    Would it be helpful to gather more data?
  • 0:59 - 1:02
    Check that box if you think that would be helpful.
  • 1:02 - 1:08
    Would it be helpful to make a Markov assumption rather than the naive Bayes assumption?
  • 1:08 - 1:10
    Check here.
  • 1:10 - 1:16
    Or would it be helpful to do a better job with our smoothing algorithm? Check here.
  • 1:16 -
    And you can check more than one.
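
To make the three options concrete, here is a minimal Python sketch of the two scoring assumptions mentioned in the question. It is not the program used in the lecture. All counts in `UNIGRAMS` and `BIGRAMS` are invented toy numbers, not taken from the 4-billion-word corpus, and the names `unigram_logprob`, `segment`, and `bigram_logprob` are hypothetical. The smoothing (a length-based penalty for unseen words, and a crude fallback to the unigram score for unseen pairs) is likewise only an assumption for illustration.

```python
import math
from functools import lru_cache

# Toy counts, invented for illustration only (NOT from the lecture's corpus).
UNIGRAMS = {"small": 50, "and": 400, "in": 300, "significant": 40, "insignificant": 5}
BIGRAMS = {("small", "and"): 20, ("and", "in"): 10,
           ("in", "significant"): 1, ("and", "insignificant"): 3}
TOTAL = sum(UNIGRAMS.values())


def unigram_logprob(word):
    """log P(word). Unseen words get a penalty that grows with length (assumed smoothing)."""
    count = UNIGRAMS.get(word)
    if count is None:
        return math.log(1.0 / (TOTAL * 10 ** len(word)))
    return math.log(count / TOTAL)


def segment(text):
    """Best segmentation when every word is scored independently of its neighbors."""
    @lru_cache(maxsize=None)
    def best(s):
        if not s:
            return 0.0, ()
        candidates = []
        for i in range(1, len(s) + 1):
            first, rest = s[:i], s[i:]
            rest_logprob, rest_words = best(rest)
            candidates.append((unigram_logprob(first) + rest_logprob,
                               (first,) + rest_words))
        return max(candidates)
    return list(best(text)[1])


def bigram_logprob(words):
    """Score a given segmentation with a Markov assumption: each word depends on the previous one."""
    total = unigram_logprob(words[0])
    for prev, word in zip(words, words[1:]):
        count = BIGRAMS.get((prev, word))
        if count is None:
            total += unigram_logprob(word)  # crude fallback for unseen pairs
        else:
            total += math.log(count / UNIGRAMS[prev])
    return total


print(segment("smallandinsignificant"))  # ['small', 'and', 'in', 'significant']
```

With these made-up counts, independent word scores prefer "in significant" because "in" and "significant" are each far more common than "insignificant". The context-aware score ranks `["small", "and", "insignificant"]` above `["small", "and", "in", "significant"]`, because "significant" rarely follows "in" in the toy data. Whether the same thing happens on real data depends on the corpus and on the smoothing used.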
Team:
Udacity
Project:
CS271 - Intro to Artificial Intelligence
Duration:
01:19