02-10 Taking Html Apart

  • 0:00 - 0:04
    Now that we understand how HTML works,
  • 0:04 - 0:09
    we want to separate out these tags from the words that will be displayed on the screen.
  • 0:09 - 0:15
    Breaking up words like this is actually a surprisingly common task in real life.
  • 0:15 - 0:22
    For example, ancient Latin was often written or inscribed without spaces.
  • 0:22 - 0:26
    This particular set of letters, "SENATUSPOPULUSQUEROMANUS,"
  • 0:26 - 0:30
    is inscribed on the Arch of Titus, which I've doodled over here as a column,
  • 0:30 - 0:33
    but what can you do? Arches are apparently beyond my power.
  • 0:33 - 0:37
    I know. It has just become an arch. Those labels never lie.
  • 0:37 - 0:41
    Roman inscriptions like this were written without spaces,
  • 0:41 - 0:46
    and it requires a bit of domain knowledge to know how to break this up.
  • 0:46 - 0:52
    "Senate and the People of Rome." That inscription was made quite some time ago.
  • 0:52 - 0:57
    Similarly, in many written Asian languages, they don't explicitly include spaces
  • 0:57 - 1:00
    or punctuation between the various characters or glyphs.
  • 1:00 - 1:04
    In this particular Japanese example, and both my handwriting and my stroke order
  • 1:04 - 1:08
    are very, very poor--have pity--some amount of domain knowledge is required to break up
  • 1:08 - 1:14
    "ano" from "yama"--"that mountain."
  • 1:14 - 1:18
    Finally, even if you're not familiar with Asian languages or ancient Latin,
  • 1:18 - 1:21
    you might have seen the same sort of thing in a much more modern guise,
  • 1:21 - 1:23
    in text messaging.
  • 1:23 - 1:28
    Some amount of domain knowledge is required to break this up into "I love you"
  • 1:28 - 1:31
    even though no particular spaces are given.
  • 1:31 - 1:36
    We will want to do the same thing for HTML to break it up into words
  • 1:36 - 1:40
    like "Wollstonecraft" and "wrote" that will appear on the screen
  • 1:40 - 1:46
    or this special left angle bracket slash maneuver that tells us that we're starting an end tag,
  • 1:46 - 1:48
    this special word in the middle that tells us which tag it was,
  • 1:48 - 1:51
    and then this closing right angle bracket.
  • 1:51 - 1:56
    Once again, for this HTML fragment we want to break it up into this first word,
  • 1:56 - 2:02
    the start of the closing tag, another word, the end of the closing tag,
  • 2:02 - 2:04
    and then another word.
  • 2:04 - 2:07
    We're going to need to do this to write our web browser.
  • 2:07 - 2:11
    In order to interpret HTML and JavaScript, we're going to have to break sentences down
  • 2:11 - 2:15
    into their component words to figure out what's going on.
  • 2:15 - 2:21
    This process is called--dun, dun, dun, dun--lexical analysis.
  • 2:21 - 2:26
    "Lexical" here has the same root as "lexicon," like a dictionary.
  • 2:26 - 2:29
    This means "to break something down into words."
  • 2:29 - 2:36
    You'll be pleased to know that we're going to use regular expressions to solve this problem.
  • 2:36 - 2:39
    Here I've written another one of those decompositions.
  • 2:39 - 2:45
    We might have broken an HTML fragment down into these word-like objects,
  • 2:45 - 2:51
    but this time you're going to help me out by doing the problem in reverse.
  • 2:51 - 2:58
    So in this multiple multiple choice quiz, I'd like you to mark each one of these HTML fragments
  • 2:58 -
    that would decompose into this sequence of five elements.
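
The decomposition described in the lecture can be sketched with Python regular expressions. This is only a minimal sketch under stated assumptions, not the course's own lexer: the token names (LANGLESLASH, LANGLE, RANGLE, WORD) and the tokenize helper are hypothetical names introduced here for illustration, and whitespace between tokens is simply skipped.

```python
import re

# Hypothetical token names (assumptions; the lecture does not name them).
# Order matters: "</" must be tried before "<" so a closing tag is not
# mistaken for the start of an opening tag.
TOKEN_SPEC = [
    ("LANGLESLASH", r"</"),   # start of a closing tag
    ("LANGLE", r"<"),         # start of an opening tag
    ("RANGLE", r">"),         # end of either kind of tag
    ("WORD", r"[^\s<>]+"),    # run of characters with no spaces or brackets
]

# Combine the patterns into one regular expression with named groups.
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(html):
    """Return (token_type, text) pairs; whitespace between tokens is skipped."""
    return [(match.lastgroup, match.group()) for match in TOKEN_RE.finditer(html)]


# The fragment from the lecture breaks into five word-like elements:
# a word, the start of the closing tag, a word, the end of the tag, a word.
for token in tokenize("Wollstonecraft</b> wrote"):
    print(token)
```

Listing the "</" pattern ahead of "<" is the one design choice that matters here: alternatives in a Python regular expression are tried left to right, so the two-character form has to come first.
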
Video Language:
English
Team:
Udacity
Project:
CS262 - Programming Languages
Duration:
03:05