Our weekly video hangout series!
I think - let's see, we started a little bit off time
so I'll say it again:
Welcome to GV Face, our weekly video hangout series!
Today, we are celebrating the 25th birthday of the world wide web.
Pretty exciting. That was on Wednesday.
Um, we've got a really all-star lineup of guests
on today's program.
Um, moving from left to right, we have:
Alan Emtage, a very special guest who is
gonna talk to us about his very special creation
of, uh, the first web browser [search engine]...
Um! We have Jeremy Clark, in Montreal -
Jeremy is a technical director at Global Voices.
Josh Levy, from Free Press,
in Massachusetts, in the U.S.
and Renata Avila, campaign manager
for the Web We Want,
Creative Commons extraordinaire, and
GV star,
who is joining us from Berlin!
Welcome, everybody!
Um. So we wanted to start today's show
by talking a little bit about the world wide web
and the internet.
'Cuz a lot of people think that they're the same thing
when actually, that's not quite true.
I want to first turn to Jeremy
and just ask, Jer, could you
break it down for us, like,
I thought that the internet was invented in the '70s
but, if it's the 25th birthday of the web,
what does that mean?
Jeremy Clark: Okay, well, the
best place to start, I think, is
the internet - it has existed in various formats
since the 1970s, as you said,
but it was the web that really made it
enter our homes.
And so, understanding the relationship is important.
So, the internet was invented by
the U.S. Government in a lot of senses...
...a mix of military and science funding
that developed the network of
actual computers
that can communicate with each other over
wires.
Now, another related technology that is also compri--
[amends] uh, built in to the web
is called hypertext. And that is the notion
of documents that can link between each other
immediately, without having to go and fetch
a separate document. Um.
So there were lots of systems since the 1960s
that were trying to implement hypertext, like,
Xanadu is an example,
uh, but all of them were commercial,
expensive, closed,
and none of them were very popular.
So, Tim Berners-Lee, who is the
"inventor of the internet,"
[corrects himself] of the web,
obviously, the World Wide Web -
Um. [Tim Berners-Lee] put those two things together
by building a service that runs
on top of the internet, and he
called it the World Wide Web.
So what the World Wide Web is, is the
decentralized hypertext engine
that we use to communicate between
computers' web pages.
So what makes up the Web is three things:
URLs (or URIs) - Universal Resource Locator
which are the addresses we use
to find things on the web,
[#2] HTML, which is the
HyperText Markup Language
which is the way that the information
is stored and sent
so that we can then use browsers
to view HTML, and then
all the documents can be understood
and then also they display the links
so that the hypertext part of it works
and we can jump around from page to page.
Um, the final part is HTTP, which is
the HyperText Transfer Protocol
which is the communication method
by which the different computers can
talk to each other and send the
HTML documents back and forth
depending on the URLs.
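[Editor's sketch, not part of the talk: a rough Python illustration of how the three pieces Jeremy lists fit together. The URL picks the server and the document, HTTP is the text of the request sent to that server, and HTML carries the links that lead to more URLs. The address, the page content, and the helper names `build_request` and `extract_links` are all made up for illustration, and nothing here touches the network.]

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit

# Hypothetical address and page content, chosen only for illustration.
PAGE_URL = "http://example.org/docs/index.html"
PAGE_HTML = (
    '<p>See the <a href="history.html">history</a> '
    'and <a href="http://example.com/">a friend</a>.</p>'
)


def build_request(url: str) -> str:
    """Return the text of a minimal HTTP/1.1 GET request for the URL.

    The URL supplies both the host to contact and the path to ask for.
    Query strings are ignored to keep the sketch short.
    """
    parts = urlsplit(url)
    path = parts.path or "/"
    return f"GET {path} HTTP/1.1\r\nHost: {parts.netloc}\r\n\r\n"


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against the page's URL."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Relative links become full URLs, so each one can be
                    # fetched with another HTTP request.
                    self.links.append(urljoin(self.base_url, value))


def extract_links(base_url: str, html: str) -> list[str]:
    """Return every link in the HTML as an absolute URL."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


if __name__ == "__main__":
    print(build_request(PAGE_URL))
    for link in extract_links(PAGE_URL, PAGE_HTML):
        print(link)
```

[A browser does roughly this in a loop: request a URL over HTTP, read the HTML that comes back, and offer the links in it as the next URLs to visit.]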
Um. So, when he built it, there were some
very important things that he
built into this system
that didn't exist before.
And the main one is
universal authorship.
So he always intended that anyone
would be able
to access these webpages,
and anyone would be able to
add their own webpages, without
asking for permission.
With the very explicit special condition
that anyone can link to any other webpage
without permission.
Previous hypertext systems required that
basically, for you to link to me,
I have to accept that link, and
probably create a link back to you, and
that wasn't required on the Web, which
gives us a lot of freedom to link to people
who wouldn't want us to be able
to link to them, for example,
so no one can say "I'm putting up free content..."
"...but you can't send your readers here,
because I hate you," et cetera.
The other one is that he made it
completely, completely free.
So in the world of
inter--[fumbles for words]--programming
the most free thing is generally considered
to be the GPL [General Public License]:
open-source, free software licenses.
Uh, and Tim Berners-Lee actually almost used
the GPL, because he wanted the web software
he was building to be free.
But at the last minute he actually changed his mind
and made it full public domain,
because in certain ways
the GPL is actually more restrictive, because it
forces other people - like, certain commercial actors
wouldn't have wanted to use web technology
if it were GPL, so he made it full public domain,
and then from there went on to make all of the standards
as open and, uh, general and free as possible.
Uh. So that's my extremely brief
history of the internet.
If anyone is curious, he wrote a wonderful book
called "Weaving the Web" about his experiences
[enticing tone] As you can see, it's short!
And he has lots of interesting technical information
in it, without being overwhelming.
It's very approachable
and he's a really interesting person
and it - the book is much better than his tweets,
which are usually incoherent.
[one of the participants huffs out a "whew"]
Ellery: Ouch!
Jeremy [?]: A few minutes?
Ellery: Thanks, that was - that was great, Jer!
Ellery: I mean, I think that that helps
um, in conversations about internet policy,
and internet governance, there's a lot of emphasis
on the ability to kind of create and innovate
without permission? Like, for everyone
to be able to build parts of the web, and
what you just laid out for us makes it clear
how important the Web piece of the infrastructure is
for that, for that capacity to become
a real tangible thing, and somebody that -
[amends] something that now
we can do - we don't have to have
technical expertise to kind of build our own
spaces there.
Ellery: Um. So, I wanted to -
Jeremy: So um.
Jeremy: If I could add just one more thing, sorry -
Jeremy: I just wanted to give a couple examples
of things that happen over the internet
that aren't the web,
because that was the actual initial question.
So, one example would be torrents,
where you're the - two computers
connect to each other,
and stream information directly, without any URLs
being mixed into the process.
Um, another one is - email, at its core,
is its own communication protocol
that doesn't have to use the web,
although we often use web sites
to access and manage our email.
Umm. And then another one was the one
right before the Web came out,
a very popular protocol was called Gopher,
which people liked, and sort of worked like the Web
- you surf around and find things -
but it actually became commercial
right around the time that the web came out,
so people would've had to start paying,
and instead of starting to pay,
they switched to HTTP, HTML, and
the World Wide Web.
Ellery: Thank you.
Ellery: So I want to move to Alan, now... Um,
Alan built the first search engine.
And I'm kind of... like, overwhelmed, and feel sort of
like, giddy and nervous having him here.
Ellery: This is just -
[Alan laughs]
Ellery: This is, like, a really big deal!
Ellery: So, Alan, just - if you could tell us -
'cuz I think a lot of people don't know about Archie -
um, it would be really cool just to hear
about how you sort of - what you were doing
that made you decide to, to do this
and kinda what it was like, and then, I mean,
everything you've seen since...
Unfortunately we're time limited, but...
Alan: Right.
Ellery: You know.
Alan [coughs]: Well, um, uh, well, that was back in
1989, and, I was working as a system administrator
for uh, McGill University - I was a grad student
for McGill University - and um, I was responsible
for getting software for - one of my responsibilities
was getting software for the faculty and the students.
And at the time, the three major
protocols on the internet
- this was pre Web, ummm -
was, uh, Telnet, which would allow you to log in
to a remote machine.
Email, ah, which would allow you to communicate
ah, with another - as we do now, with a, with a
remote machines, plural,
and, and FTP, which was the File Transfer Protocol,
which allowed you to move, ah, data files, or files
from one machine to another.
And at the time what we had was - people had made
- remember it was a non-commercial internet
at the time -
- actually, commercial traffic was forbidden
on the internet at the time,
because it was run by the
National Science Foundation
and it was using educational money
and therefore other than companies with
research arms, like IBM and HP
and those kinds of things,
we didn't have any commercial traffic on the internet,
which nowadays seems kind of amazing
to even think about -
and, ah, so what people did was
to provide free space on their machines
- and remember, you know, at the time,
a big disc would be a megabyte, you know -
and so people would provide common repositories
that you could deposit programs that you had written,
data files, and documents, and that kinda stuff
into these central repositories that were
spread around the internet.
Then other people could then retrieve them.
And so I spent a lot of my time trying to locate
software, or the information that my, the
students and the faculty were trying to find,
and I got tired of it.
And since I'm lazy and a geek, I...
I automated the process.
I got - instead of doing it manually, I had a bunch
of scripts wake up in the middle of the night
every night,
and go out and index files.
Now remember all of this was just file listings.
It's not like Google, it's not like
a search engine would be today,
it is just... filenames. All it was, was filenames.
And so what it would do
was it would go out every night,
list all the filenames in all the repositories,
and allow you to search lists of filenames.
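[Editor's sketch, not part of the talk: a rough Python illustration of the idea Alan describes, which is to gather the file listings from each repository and then search filenames only, never file contents. The host names, paths, and helper names below are hypothetical, and this is not Archie's actual code.]

```python
from collections import defaultdict

# Hypothetical listings, standing in for what a nightly script might have
# collected from each FTP archive.
LISTINGS = {
    "ftp.example.edu": ["pub/gnu/emacs-18.55.tar.Z", "pub/tex/README"],
    "archive.example.org": ["mirrors/emacs/emacs-18.55.tar.Z", "docs/rfc959.txt"],
}


def build_index(listings):
    """Map each bare filename to the (host, full path) pairs where it appears."""
    index = defaultdict(list)
    for host, paths in listings.items():
        for path in paths:
            filename = path.rsplit("/", 1)[-1]
            index[filename].append((host, path))
    return index


def search(index, term):
    """Return sorted (host, path) pairs whose filename contains the term.

    Matching is case-insensitive and looks at filenames only, not at
    directory names or file contents.
    """
    term = term.lower()
    hits = []
    for filename, locations in index.items():
        if term in filename.lower():
            hits.extend(locations)
    return sorted(hits)


if __name__ == "__main__":
    index = build_index(LISTINGS)
    for host, path in search(index, "emacs"):
        print(f"{host}:{path}")
```

[Searching for "tex" here finds nothing, because only the directory is called "tex" and the file in it is named README. That is the limitation Alan points out: the index knows names, not contents.]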
And I only used it for myself!
I only used it, um, uh, for my own personal use.
Um, and at one point my boss,
who was also a student, a grad student
at the University, Peter Deutsch, let it be known
that, um, somebody was asking for, you know,
could they, could somebody tell them where, um,
y'know, a particular piece of software was.
And, uh, uh, we, um, uh... we, you know,
he came and asked me,
he knew we had this database
and he came and asked me if I could help out.
And I gave it to him, and if, y'know,
half a sec- half a minute later I had the information,
and so he put this posting online, and, umm.
People then started asking,
"Well, can you find this for me?"
And, you know, all these manual requests!
Basically - either through email, or UseNet postings -
- which is what we were using at the time -
we thought, this is silly,
there's no point doing these things manually
when we can just allow people access
to the database itself.
And in a moment of insanity,
we had to come up with a name for it,
and I said, "Okay, well, let's just call it ARCHIE,"
which is "ARCHIVE" without the V
And, ah, and within about three or four months
we were consuming about half
of all of the traffic to eastern Canada
[where McGill University is]
as this search engine became - as people, y'know -
- word of mouth -
you know, people who know about Archie
are generally people of a certain age...
...I won't mention what that age is, but
it's generally people who were in university
or working on the internet, so it would have been
research people,
people in academia in the early nineties.
So Archie lasted for about, uh, [hems and haws]
Five years. Four or five years.
And, um, it only indexed FTP archives.
It never indexed the web.
Now, I went on, as Archie became popular,
and I got more involved in the standards process
and that kind of stuff,
I worked, uh, fairly closely with Tim Berners-Lee
to, uh, to standardize - for example,
I did the - I ran the committee
at the standard-setting body for the internet,
which is the IETF
[Internet Engineering Task Force]
to standardize URLs.
Because Tim had come up with
a set of rules for URLs,
and as we looked at expanding that
to a larger range of resources,
we realized that those rules did not cover
all of the cases.
So, we worked, for, uh - Tim brought the,
the specification, his original specification,
to the group, and we worked on it for,
I don't remember, nine months to a year or so,
to come up with a standard for URLs.
So all of those URLs that we use,
day in and day out,
were, were standardized as a result
of that committee.
So, it was, um, it was a really exciting time,
it was a time of, y'know - the question I always get
is why didn't I make a billion dollars off of it?
And I keep reminding people