  1. So one of the most exciting developments in machine learning is
  2. learning from text.
  3. A lot of online data is actually text data: the web, emails, and so on.
  4. And companies like Google, Yahoo!,
  5. and many others are really built on the idea that you
  6. can use machine learning and apply it to text.
  7. So if you want to build the next Google, listen up.
  8. The fundamental question in learning from text has to do
  9. with what the input feature is.
  10. And I'm going to give an example and ask you a question.
  11. Suppose you have two strings, kind of two sentences.
  12. One is "nice day", and one is "a very nice day".
  13. And suppose, for whatever reason, one is a positive example
  14. and one is a negative example, as indicated by this X over here and
  15. this circle.
  16. And maybe you have many of those, and
  17. you want to toss them into your favorite learning algorithm, like the support
  18. vector machine, to produce either output label.
  19. What do you think is the best input dimension for the support vector machine?
  20. I'll give you a few choices: one, two, three, four, or hard to tell.
  21. Give it your best shot.
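
To make the question concrete, here is a minimal Python sketch, not something from the lecture itself. It assumes a naive encoding in which every word of a string takes up one input slot; the names `examples` and `naive_word_features` are hypothetical and exist only for this illustration.

```python
# Hypothetical toy data mirroring the two example strings in the quiz.
examples = ["nice day", "a very nice day"]


def naive_word_features(text):
    """Split on whitespace so that each word becomes one input slot.

    This is an assumed, deliberately naive encoding, used only to count
    how many slots each raw string would need.
    """
    return text.split()


# Number of input slots each string would need under this encoding.
lengths = [len(naive_word_features(s)) for s in examples]

for sentence, n in zip(examples, lengths):
    print(f"{sentence!r} -> {n} word(s)")
```

Under this assumption the two strings need different numbers of slots, which is worth keeping in mind when picking a single input dimension for a learner such as a support vector machine.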