Our weekly video hangout series!

I think - let's see, we started a little bit off time

so I'll say it again:

Welcome to GV Face, our weekly video hangout series!

Today, we are celebrating the 25th birthday of the world wide web.

Pretty exciting. That was on Wednesday.

Um, we've got a <i>really</i> all-star lineup of guests

on today's program.

Um, moving from left to right, we have:

Alan Emtage, a very special guest who is

gonna talk to us about his very special creation

of, uh, the first web browser...

Um! We have Jeremy Clark, in Montreal -

Jeremy is a technical director at Global Voices.

Josh Levy, from Free Press,

in Massachusetts, in the U.S.

and Renata Avila, campaign manager 
for the Web We Want,

Creative Commons extraordinaire, and

GV star,

who is joining us from Berlin!

Welcome, everybody!

Um. So we wanted to start today's show

by talking a little bit about the world wide web

and the internet.

'Cuz a lot of people think that they're the same thing

when actually, that's not quite true.

I want to first turn to Jeremy

and just ask, Jer, could you

break it down for us, like,

I thought that the internet was invented in the 70's

but, if it's the 25th birthday of the web,

what does that mean?

Jeremy Clark: Okay, well, the

best place to start, I think, is

the internet - it has existed in various formats

since the 1970's, as you said,

but it was the <i>web</i> that really made it

enter our homes.

and, so, understanding the relationship is important.

So, the internet was invented by

the U.S. Government in a lot of senses...

...a mix of military and science funding

that developed the network of 
<i>actual</i> computers

that can communicate with each other over

wires.

Now, another related technology that is also compri--

[amends] uh, <i>built in</i> to the web

is called <i>hypertext</i>. And that is the notion

of documents that can link between each other

immediately, without having to go and fetch

a separate document. Um.

So there were lots of systems since the 1960s

that were trying to implement hypertext, like,

Xanadu is an example,

uh, but all of them were commercial, 
expensive, closed,

and none of them were very popular.

So, Tim Berners-Lee, who is the
"inventor of the internet,"

[corrects himself] of the <i>web</i>, 
obviously, the World Wide Web -

Um. [Tim Berners-Lee] put those two things together

by building a service that runs

on <i>top</i> of the internet, and he

called it the World Wide Web.

So what the World Wide Web is, is the

decentralized hypertext engine

that we use to communicate between

computers' web pages.

So what makes up the Web is three things:

URLs (or URIs) - Universal Resource Locator

which are the addresses we use 
to find things on the web,

[#2] HTML, which is the

HyperText Markup Language

which is the way that the information

is stored and sent

so that we can then use browsers

to <i>view</i> HTML, and then

all the documents can be understood

and then also they display the <i>links</i>

so that the hypertext part of it works

and we can jump around from page to page.

Um, the final part is HTTP, which is

the HyperText Transfer Protocol

which is the communication method

by which the different computers can

talk to each other and send the

HTML documents back and forth

depending on the URLs.
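
[Editor's note: the three pieces Jeremy lists can be sketched in a few lines of Python. This is only an illustrative sketch, not anything shown on the program: the page address, the sample HTML, and the helper names `build_get_request` and `extract_links` are all made up for the example, and nothing is sent over the network.]

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, i.e. the hypertext links in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def build_get_request(url):
    """HTTP: return the text of a minimal HTTP/1.1 GET request for a URL."""
    parts = urlsplit(url)          # URL: split the address into its pieces
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return f"GET {path} HTTP/1.1\r\nHost: {parts.netloc}\r\nConnection: close\r\n\r\n"


def extract_links(base_url, html):
    """HTML: find the links in a document and resolve them to full URLs."""
    collector = LinkCollector()
    collector.feed(html)
    return [urljoin(base_url, href) for href in collector.links]


# Hypothetical page, used only for illustration.
PAGE_URL = "http://example.org/docs/index.html"
PAGE_HTML = (
    '<p>See <a href="history.html">history</a> and '
    '<a href="http://example.com/">elsewhere</a>.</p>'
)

if __name__ == "__main__":
    print(build_get_request(PAGE_URL))
    for link in extract_links(PAGE_URL, PAGE_HTML):
        print(link)
```

[The second link in the sample page points at a different site, and nothing in the code asks that site for permission, which is the "anyone can link to anyone" property.]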

Um. So, when he built it, there were some

very important things that he 
built into this system

that didn't exist before.

And the main one is

universal authorship.

So he always intended that <i>anyone</i>
would be able

to access these webpages,

and anyone would be able to

add their own webpages, without

asking for permission.

With the very explicit special condition

that anyone can link to any <i>other</i> webpage

without permission.

Previous hypertext systems required that

basically, for you to link to me,

I have to accept that link, and

probably create a link back to you, and

that wasn't required on the Web, which

gives us a lot of freedom to link to people

who wouldn't want us to be able 
to link to them, for example,

so no one can say "I'm putting up free content..."

"...but you can't send your readers here,
because I hate you," et cetera.

The other one is that he made it 
completely, <i>completely</i> free.

So in the world of
inter--[fumbles for words]--programming

the most free thing is generally considered

to be the GPL [General Public License]:
open-source, free software licenses.

uh, and Tim Berners-Lee actually almost used

the GPL, because he wanted the web software

he was building to be free.

But at the last minute he actually changed his mind

and made it full public domain, 
because in certain ways

the GPL is actually more restrictive, because it

forces other people - like, certain commercial actors

wouldn't have wanted to use web technology

if it were GPL, so he made it full public domain,

and then from there went on to make all of the standards

as open and, uh, general and free as possible.

Uh. So that's my extremely brief
history of the internet.

If anyone is curious, he wrote a wonderful book

called "Weaving the Web" about his experiences

[enticing tone] As you can see, it's short!

And he has lots of interesting technical information

in it, without being overwhelming.

It's very approachable

and he's a really interesting person

and it - the book is much better than his tweets,

which are usually incoherent.

[one of the participants huffs out a "whew"]

Ellery: Ouch!
Jeremy [?]: A few minutes?

Ellery: Thanks, that was - that was great, Jer!

Ellery: I mean, I think that that helps

um, in conversations about internet policy,

and internet governance, there's a lot of emphasis

on the ability to kind of create and innovate

without permission? Like, for everyone

to be able to build parts of the web, and

what you just laid out for us makes it clear

how important the <i>Web</i> piece of the infrastructure is

for that, for that capacity to become

a real tangible <i>thing</i>, and somebody that -
[amends] something that now

we can do - we don't have to have

technical expertise to kind of build our own
spaces there.

Ellery: Um. So, I wanted to -
Jeremy: So um.

Jeremy: If I could add just one more thing, sorry -

Jeremy: I just wanted to give a couple examples

of things that happen over the internet

that <i>aren't</i> the web,

because that was the actual initial question.

So, one example would be <i>torrents</i>,

where you're the - two computers 
connect to each other,

and stream information directly, without any URLs

being mixed into the process.

Um, another one is - email, at its core,

is its own communication protocol

that doesn't have to use the web,

although we often use web <i>sites</i>
to access and manage our email.

Umm. And then another one was the one
right <i>before</i> the Web came out,

a very popular protocol was called <i>Gopher</i>,

which people liked, and sort of worked like the Web

- you surf around and find things -

but it actually became commercial 
right around the time that the web came out,

so people would've had to start paying,

and instead of starting to pay,

they switched to HTTP, HTML, and 
the World Wide Web.

Ellery: Thank you.

Ellery: So I want to move to Alan, now... Um,

Alan built the first search engine.

And I'm kind of... like, overwhelmed, and feel sort of

like, giddy and nervous <i>having</i> him here.

Ellery: This is just - 
[Alan laughs]

Ellery: This is, like, a really big deal!

Ellery: So, Alan, just - if you could tell us -

'cuz I think a lot of people don't know about Archie -

um, it would be really cool just to hear

about how you sort of - what you were doing

that made you decide to, to do this

and kinda what it was like, and then, I mean,
everything you've seen since...

Unfortunately we're time limited, but...

Alan: Right.
Ellery: You know.

Alan [coughs]: Well, um, uh, well, that was back in

1989, and, I was working as a system administrator

for uh, McGill University - I was a grad student

for McGill University - and um, I was responsible

for getting software for - one of my responsibilities

was getting software for the faculty and the students.

And at the time, the three major 
protocols on the internet

- this was pre Web, ummm -

was, uh, Telnet, which would allow you to log in

to a remote machine.

Email, ah, which would allow you to communicate

ah, with another - as we do now, with a, with a

remote machin<i>es</i>, plural,

and, and FTP, which was the File Transfer Protocol,

which allowed you to move, ah, data files, or files

from one machine to another.

And at the time what we had was - people had made

- remember it was a non-commercial internet
at the time -

- actually, commercial traffic was forbidden
on the internet at the time,

because it was run by the 
National Science Foundation

and it was using educational money

and therefore other than companies with

research arms, like IBM and HP 
and those kinds of things,

we didn't have any commercial traffic on the internet,

which nowadays seems kind of amazing 
to even think about -

and, ah, so what people did was to provide

free space on their machines

- and remember, you know, at the time,

a big disc would be a <i>megabyte,</i> you know -

and so people would provide common repositories

that you could deposit programs that you had written

datafiles, and documents, and that kinda stuff,

into these central repositories that were

spread around the internet.

Then other people could then retrieve them.

And so I spent a lot of my time trying to locate

software, or the information that my, the
students and the faculty were trying to find,

and I got tired of it.

and since I'm lazy and a geek, I...

I automated the process.

I got - instead of doing it manually, I had a bunch

of scripts wake up in the middle of the night
every night,

and go out and index files.

Now remember all of this was just file listings.

It's not like Google, it's not like 
a search engine would be today,

it is just... filenames. All it was, was filenames.

And so what it would do

was it would go out every night,

list all the filenames in all the repositories,

and allow you to <i>search</i> lists of filenames.
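
[Editor's note: a minimal sketch of the idea Alan describes, written in Python for illustration. It is not Archie's actual code. The host names, file paths, and the function names `build_index` and `search` are invented for the example, and the nightly FTP listing step is replaced by a hard-coded dictionary.]

```python
def build_index(listings):
    """Flatten per-host file listings into one sorted list of
    (lowercased filename, host, full path) entries."""
    index = []
    for host, paths in listings.items():
        for path in paths:
            filename = path.rsplit("/", 1)[-1]  # keep only the last path component
            index.append((filename.lower(), host, path))
    return sorted(index)


def search(index, term):
    """Return (host, path) for every entry whose filename contains the term.
    Only filenames are matched, never file contents."""
    term = term.lower()
    return [(host, path) for filename, host, path in index if term in filename]


# Hypothetical listings standing in for what a nightly script would collect.
LISTINGS = {
    "ftp.example.edu": ["/pub/tools/editor-1.0.tar.Z", "/pub/docs/README"],
    "archive.example.org": ["/mirrors/editor-1.0.tar.Z", "/pub/tex/tex.tar.Z"],
}

if __name__ == "__main__":
    index = build_index(LISTINGS)
    for host, path in search(index, "editor"):
        print(host, path)
```

[Searching for "editor" here turns up the same file on both hosts, which is the question the manual requests were asking: where can I get this piece of software?]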

And I only used it for myself!

I only used it, um, uh, for my own personal use.

Um, and at one point my boss,

who was also a student, a grad student

at the University, Peter Deutsch, let it be known

that, um, somebody was asking for, you know,

could they, could somebody tell them where, um,

y'know, a particular piece of software was.

And, uh, uh, we, um, uh... we, you know,

he came and asked me, 
he knew we had this database

and he came and asked me if I could help out.

And I gave it to him, and if, y'know,

half a sec- half a minute later I had the information,

and so he put this posting online, and, umm.

People then started asking,

"Well, can you find <i>this</i> for me?"

And, you know, all these manual requests!

Basically - either through email, or Usenet postings -

- which is what we were using at the time -

we thought, this is silly,

there's no point doing these things manually

when we can just allow people access
to the database itself.

And in a moment of insanity,

we had to come up with a name for it,

and I said, "Okay, well, let's just call it ARCHIE,"

which is "ARCHIVE" without the V

And, ah, and within about three or four months

we were consuming about <i>half</i>

of all of the traffic to eastern Canada 
<i>[where McGill University is]</i>

as this search engine became - as people, y'know -

- word of mouth -

you know, people who know about Archie

are generally people of a certain <i>age</i>...

...I won't mention what that age is, but

it's generally people who were in university

or working on the internet, so it would have been

research people,

people in academia in the early nineties.

So Archie lasted for about, uh, [hems and haws]

Five years. Four or five years.

And, um, it only indexed FTP archives.

It never indexed the web.

Now, I went on, as Archie became popular,

and I got more involved in the standards process

and that kind of stuff,

I worked, uh, fairly closely with Tim Berners-Lee

to, uh, to standardize - for example,

I did the - I ran the committee

at the standard-setting body for the internet,

which is the IETF 
<i>[Internet Engineering Task Force]</i>

to standardize URLs.

Because Tim had come up with

a set of rules for URLs,

and as we looked at expanding that

to a larger range of resources,

we realized that those rules did not cover

all of the cases.

So, we worked, for, uh - Tim brought the,

the specification, his original specification,

to the group, and we worked on it for,

I don't remember, nine months to a year or so,

to come up with a standard for URLs.

So all of those URLs that we use,

day in and day out,

were, were standardized as a result

of that committee.

So, it was, um, it was a really exciting time,

it was a time of, y'know - the question I always get

is why didn't I make a billion dollars off of it?

And I keep reminding people