English subtitles

03x-05 Ill Formed Input

Showing Revision 2 created 10/24/2012 by Amara Bot.

  1. Another topic that I would like to address is malformed HTML.
  2. One of the questions on the forums was, are we going to talk about mistakes in web pages?
  3. What if a real web developer forgets to close off a tag or makes a sort of subtle mistake
  4. in punctuation?
  5. Are we going to talk about that in the web browser that we build for this class?
  6. And the answer is, yes!
  7. In fact, very prescient on your part. That's a very predictive question.
  8. We're going to get to it in unit 3, exactly the next unit.
  9. We're going to talk about recognizing malformed HTML and JavaScript.
  10. But in this class, we're mostly going to talk about recognizing them,
  11. and for the particular project that we work on for our simple web browser,
  12. if the HTML is malformed, we're not going to do anything about it.
  13. We just won't render that part of the web page.
  14. In practice, web browsers put a lot--a huge amount--of effort into being very forgiving.
  15. They want to render as much information as possible, even if the web page is out of date
  16. or written without knowledge of the standards or in any other way messed up.
  17. This approach of keeping going is sometimes known as error recovery
  18. or error tolerance or fault tolerance.
  19. In unit 3, we're going to talk about breaking up tokens and seeing if they match
  20. a particular structure, seeing if they're in the language of a formal grammar
  21. that describes all of JavaScript or all of web pages.
  22. You know, in the real world, a lot of web pages are not.
  23. They don't match the formal idealized grammar I've written down on the walls of Plato's cave.
  24. Similarly, not every JavaScript program adheres to exactly the same idea of--
  25. to pick a timely example--where the semicolons go after statements.
  26. So in practice, what you'll often want to do if you're doing commercial software,
  27. if you want to make your customers as happy as possible by supporting everything
  28. that they've written, you'll write out near-duplicate rules.
  29. For example, you might write one regular expression that accepts normal numbers,
  30. but if people make a common mistake when writing numbers,
  31. maybe they write multiple, multiple period signs or something like that,
  32. you might write another rule that accepts those,
  33. and maybe print out some warning but then does its best to figure out what the value is
  34. and keeps going.
  35. Again, in real-world industrial software development for a web browser,
  36. this sort of error recovery when you're doing lexical analysis or syntactic analysis
  37. is of critical importance because the vast majority of web pages are not
  38. standards compliant.
  39. In this class, we're going to tell you how to tell the difference between good HTML and bad,
  40. between well-formed JavaScript and malformed JavaScript.
  41. But I'm only going to require that you deal with well-formed strings.
  42. Once you know how to do it the good way though,
  43. you could do it for ill-formed strings.
  44. You'd have all the tools after finishing this class.
  45. It would just be more busy work, elbow grease.
  46. It would take additional time, and it wouldn't really teach you more concepts.
  47. That's why I'm not going to focus on it.
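
The near-duplicate rule idea for numbers mentioned in the lecture could look roughly like the sketch below. It uses Python's standard `re` module. The names `NUMBER`, `SLOPPY_NUMBER`, and `lex_number` are hypothetical, and the recovery policy (treat the first run of periods as the decimal point and drop any later ones) is only an assumption about what "doing its best" might mean, not something the lecture specifies.

```python
import re

# Hypothetical rule names and patterns, chosen only for illustration.
NUMBER = re.compile(r"[0-9]+(?:\.[0-9]+)?")          # well-formed: 12 or 3.14
SLOPPY_NUMBER = re.compile(r"[0-9]+(?:\.+[0-9]+)+")  # forgiving: 3..14 or 1.2.3


def lex_number(text):
    """Return (value, warnings) for a number token, recovering where possible."""
    warnings = []
    if NUMBER.fullmatch(text):
        # The normal rule matched, so there is nothing to recover from.
        return float(text), warnings
    if SLOPPY_NUMBER.fullmatch(text):
        # Assumed recovery policy: the first run of periods is the decimal
        # point, and any later periods are dropped.
        whole, rest = re.split(r"\.+", text, maxsplit=1)
        fraction = rest.replace(".", "")
        warnings.append(f"malformed number {text!r}, read as {whole}.{fraction}")
        return float(f"{whole}.{fraction}"), warnings
    # Neither rule applies, so this is not a number we know how to repair.
    raise ValueError(f"not a number: {text!r}")
```

Calling `lex_number("3.14")` goes through the normal rule and returns no warnings, while `lex_number("3..14")` falls through to the forgiving rule, records one warning, and keeps going. A real lexer would report the warning with a line number rather than collecting strings, but the shape is the same: one strict rule, one tolerant rule next to it.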