-
Title:
Rule Order - Programming Languages
-
Description:
-
As we saw in the last quiz,
-
it's not quite clear what to do when our token definitions overlap.
-
The 7-character sequence "hello", quotation marks included,
-
matches our regular expression for word
-
but also matches our regular expression for string.
-
This is a problem not just with computer languages
-
but also with natural languages.
-
As the hypothetical owner of this restaurant would notice,
-
a sign reading "We don't just serve hamburgers, we serve people"
-
could be interpreted the wrong way.
-
Presumably those hamburgers are Soylent Green flavored.
-
We want to have definitive rules for figuring out
-
which of these we prefer.
-
In fact, we're going to use a very simple rule.
-
The first one you list wins,
-
the one closer to the top of the file,
-
so word, listed first, is our big winner and is going to take priority over string.
-
If you're making a lexical analyzer for HTML or JavaScript,
-
ordering your token definitions is of prime importance.
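
The first-listed-wins rule can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the names RULES and first_match are hypothetical, the two patterns are stand-ins rather than the exact definitions from the lecture, and the loop is a simplified model of rule priority, not the internals of any particular lexer tool.

```python
import re

# Hypothetical token rules, listed in priority order: earlier entries win ties.
# The patterns are illustrative assumptions, not the lecture's exact definitions.
RULES = [
    ("WORD", re.compile(r'[^ <>\n]+')),   # any run of non-space, non-angle-bracket characters
    ("STRING", re.compile(r'"[^"]*"')),   # a double-quoted string
]

def first_match(rules, text, pos=0):
    """Return (token_type, lexeme) for the first listed rule that matches at pos."""
    for name, pattern in rules:
        m = pattern.match(text, pos)
        if m:
            return name, m.group()
    return None  # no rule matches at this position

# Both patterns match the same 7 characters, so list order decides the token type.
print(first_match(RULES, '"hello"'))                  # WORD is listed first
print(first_match(list(reversed(RULES)), '"hello"'))  # STRING is listed first
```

With WORD listed first the quoted text is reported as a word; reversing the list makes the same text come back as a string, which is the tie-breaking behavior described above.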
-
Let's investigate this issue in the form of a quiz.
-
Suppose we have the input string hello "world"
-
and we really want that to yield a word,
-
the word hello, followed by a string.
-
I'm going to list 3 rules for you,
-
and I want you to tell me which one has to come last
-
for us to get the desired effect.
-
And here, because you've seen it all before, I'm eliding some of the details
-
like the colon, token, blah, blah, blah.
-
Instead what I'd like you to do is tell me
-
which one of these functions, which one of these rules,
-
would have to come last, bearing in mind that the one that comes first
-
wins all ties, in order for hello "world" to break down into
-
a word followed by a string.