Our weekly video hangout series! I think - let's see, we started a little bit off time so I'll say it again: Welcome to GV Face, our weekly video hangout series! Today, we are celebrating the 25th birthday of the World Wide Web. Pretty exciting. That was on Wednesday. Um, we've got a really all-star lineup of guests on today's program. Um, moving from left to right, we have: Alan Emtage, a very special guest who is gonna talk to us about his very special creation of, uh, the first web browser... Um! We have Jeremy Clark, in Montreal - Jeremy is a technical director at Global Voices. Josh Levy, from Free Press, in Massachusetts, in the U.S., and Renata Avila, campaign manager for the Web We Want, Creative Commons extraordinaire, and GV star, who is joining us from Berlin! Welcome, everybody!
Um. So we wanted to start today's show by talking a little bit about the World Wide Web and the internet. 'Cuz a lot of people think that they're the same thing when actually, that's not quite true. I want to first turn to Jeremy and just ask, Jer, could you break it down for us, like, I thought that the internet was invented in the '70s but, if it's the 25th birthday of the web, what does that mean?
Jeremy Clark: Okay, well, the best place to start, I think, is the internet - it has existed in various formats since the 1970s, as you said, but it was the web that really made it enter our homes, and, so, understanding the relationship is important. So, the internet was invented by the U.S. Government in a lot of senses... ...a mix of military and science funding that developed the network of actual computers that can communicate with each other over wires. Now, another related technology that is also compri-- [amends] uh, built in to the web is called hypertext. And that is the notion of documents that can link between each other immediately, without having to go and fetch a separate document. Um.
So there were lots of systems since the 1960s that were trying to implement hypertext, like, Xanadu is an example, uh, but all of them were commercial, expensive, closed, and none of them were very popular. So, Tim Berners-Lee, who is the "inventor of the internet," [corrects himself] of the web, obviously, the World Wide Web - Um. [Tim Berners-Lee] put those two things together by building a service that runs on top of the internet, and he called it the World Wide Web. So what the World Wide Web is, is the decentralized hypertext engine that we use to communicate between computers' web pages.
So what makes up the Web is three things: URLs (or URIs) - Universal Resource Locator, which are the addresses we use to find things on the web, [#2] HTML, which is the HyperText Markup Language, which is the way that the information is stored and sent so that we can then use browsers to view HTML, and then all the documents can be understood and then also they display the links so that the hypertext part of it works and we can jump around from page to page. Um, the final part is HTTP, which is the HyperText Transfer Protocol, which is the communication method by which the different computers can talk to each other and send the HTML documents back and forth depending on the URLs. Um.
So, when he built it, there were some very important things that he built into this system that didn't exist before. And the main one is universal authorship. So he always intended that anyone would be able to access these webpages, and anyone would be able to add their own webpages, without asking for permission. With the very explicit special condition that anyone can link to any other webpage without permission.
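As a rough illustration of the three pieces Jeremy lists, here is a minimal Python sketch using only the standard library. The address `http://example.org/docs/index.html`, the sample page, and the names `LinkCollector`, `build_get_request`, and `extract_links` are hypothetical, invented for this example; none of this is Tim Berners-Lee's original software. It splits a URL into its parts, writes out the text of a basic HTTP GET request for it, and pulls the links out of a small HTML document.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit

# Hypothetical address and page content, used only for illustration.
PAGE_URL = "http://example.org/docs/index.html"
PAGE_HTML = (
    '<p>See <a href="about.html">about</a> and '
    '<a href="http://example.com/">another site</a>.</p>'
)


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def build_get_request(url):
    """Builds the text of a minimal HTTP/1.1 GET request for a URL."""
    parts = urlsplit(url)
    path = parts.path or "/"
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {parts.netloc}\r\n"
        "Connection: close\r\n\r\n"
    )


def extract_links(base_url, html):
    """Returns the absolute URLs that an HTML page links to."""
    collector = LinkCollector()
    collector.feed(html)
    # Relative links are resolved against the address of the page itself.
    return [urljoin(base_url, href) for href in collector.links]


if __name__ == "__main__":
    print(build_get_request(PAGE_URL))
    print(extract_links(PAGE_URL, PAGE_HTML))
```

Note that the sample page links to a second site without that site being involved in any way, which is the "link without permission" property described above.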
Previous hypertext systems required that basically, for you to link to me, I have to accept that link, and probably create a link back to you, and that wasn't required on the Web, which gives us a lot of freedom to link to people who wouldn't want us to be able to link to them, for example, so no one can say "I'm putting up free content..." "...but you can't send your readers here, because I hate you," et cetera.
The other one is that he made it completely, completely free. So in the world of inter--[fumbles for words]--programming the most free thing is generally considered to be the GPL [General Public License]: open-source, free software licenses. Uh, and Tim Berners-Lee actually almost used the GPL, because he wanted the web software he was building to be free. But at the last minute he actually changed his mind and made it full public domain, because in certain ways the GPL is actually more restrictive, because it forces other people - like, certain commercial actors wouldn't have wanted to use web technology if it were GPL, so he made it full public domain, and then from there went on to make all of the standards as open and, uh, general and free as possible. Uh. So that's my extremely brief history of the internet.
If anyone is curious, he wrote a wonderful book called "Weaving the Web" about his experiences. [enticing tone] As you can see, it's short! And he has lots of interesting technical information in it, without being overwhelming. It's very approachable and he's a really interesting person and it - the book is much better than his tweets, which are usually incoherent. [one of the participants huffs out a "whew"]
Ellery: Ouch!
Jeremy [?]: A few minutes?
Ellery: Thanks, that was - that was great, Jer!
Ellery: I mean, I think that that helps um, in conversations about internet policy, and internet governance, there's a lot of emphasis on the ability to kind of create and innovate without permission?
Like, for everyone to be able to build parts of the web, and what you just laid out for us makes it clear how important the Web piece of the infrastructure is for that, for that capacity to become a real tangible thing, and somebody that - [amends] something that now we can do - we don't have to have technical expertise to kind of build our own spaces there.
Ellery: Um. So, I wanted to -
Jeremy: So um.
Jeremy: If I could add just one more thing, sorry -
Jeremy: I just wanted to give a couple examples of things that happen over the internet that aren't the web, because that was the actual initial question. So, one example would be torrents, where you're the - two computers connect to each other, and stream information directly, without any URLs being mixed into the process. Um, another one is - email, at its core, is its own communication protocol that doesn't have to use the web, although we often use web sites to access and manage our email. Umm. And then another one was the one right before the Web came out, a very popular protocol was called Gopher, which people liked, and sort of worked like the Web - you surf around and find things - but it actually became commercial right around the time that the web came out, so people would've had to start paying, and instead of starting to pay, they switched to HTTP, HTML, and the World Wide Web.
Ellery: Thank you.
Ellery: So I want to move to Alan, now... Um, Alan built the first search engine. And I'm kind of... like, overwhelmed, and feel sort of like, giddy and nervous having him here.
Ellery: This is just - [Alan laughs]
Ellery: This is, like, a really big deal!
Ellery: So, Alan, just - if you could tell us - 'cuz I think a lot of people don't know about Archie - um, it would be really cool just to hear about how you sort of - what you were doing that made you decide to, to do this and kinda what it was like, and then, I mean, everything you've seen since... Unfortunately we're time limited, but...
Alan: Right.
Ellery: You know.
Alan [coughs]: Well, um, uh, well, that was back in 1989, and, I was working as a system administrator for uh, McGill University - I was a grad student for McGill University - and um, I was responsible for getting software for - one of my responsibilities was getting software for the faculty and the students. And at the time, the three major protocols on the internet - this was pre Web, ummm - was, uh, Telnet, which would allow you to log in to a remote machine. Email, ah, which would allow you to communicate ah, with another - as we do now, with a, with a remote machines, plural, and, and FTP, which was the File Transfer Protocol, which allowed you to move, ah, data files, or files from one machine to another. And at the time what we had was - people had made - remember it was a non-commercial internet at the time - actually, commercial traffic was forbidden on the internet at the time, because it was run by the National Science Foundation and it was using educational money and therefore other than companies with research arms, like IBM and HP and those kinds of things, we didn't have any commercial traffic on the internet, which nowadays seems kind of amazing to even think about - and, ah, so what people did, were to provide free space on their machines - and remember, you know, at the time, a big disc would be a megabyte, you know - and so people would provide common repositories that you could deposit programs that you had written, data files, and documents, and that kinda stuff, into these central repositories that were spread around the internet. Then other people could then retrieve them. And so I spent a lot of my time trying to locate software, or the information that my, the students and the faculty were trying to find, and I got tired of it, and since I'm lazy and a geek, I... I automated the process.
I got - instead of doing it manually, I had a bunch of scripts wake up in the middle of the night every night, and go out and index files. Now remember all of this was just file listings. It's not like Google, it's not like a search engine would be today, it is just... filenames. All it was, was filenames. And so what it would do was it would go out every night, list all the filenames in all the repositories, and allow you to search lists of filenames. And I only used it for myself! I only used it, um, uh, for my own personal use. Um, and at one point my boss, who was also a student, a grad student at the University, Peter Deutsch, let it be known that, um, somebody was asking for, you know, could they, could somebody tell them where, um, y'know, a particular piece of software was. And, uh, uh, we, um, uh... we, you know, he came and asked me, he knew we had this database and he came and asked me if I could help out. And I gave it to him, and if, y'know, half a sec- half a minute later I had the information, and so he put this posting online, and, umm. People then started asking, "Well, can you find this for me?" And, you know, all these manual requests! Basically - either through email, or Usenet postings - which is what we were using at the time - we thought, this is silly, there's no point doing these things manually when we can just allow people access to the database itself. And in a moment of insanity, we had to come up with a name for it, and I said, "Okay, well, let's just call it ARCHIE," which is "ARCHIVE" without the V. And, ah, and within about three or four months we were consuming about half of all of the traffic to eastern Canada [where McGill University is] as this search engine became - as people, y'know - word of mouth - you know, people who know about Archie are generally people of a certain age...
...I won't mention what that age is, but it's generally people who were in university or working on the internet, so it would have been research people, people in academia in the early nineties. So Archie lasted for about, uh, [hems and haws] Five years. Four or five years. And, um, it only indexed FTP archives. It never indexed the web. Now, I went on, as Archie became popular, and I got more involved in the standards process and that kind of stuff, I worked, uh, fairly closely with Tim Berners-Lee to, uh, to standardize - for example, I did the - I ran the committee at the standard-setting body for the internet, which is the IETF [Internet Engineering Task Force], to standardize URLs. Because Tim had come up with a set of rules for URLs, and as we looked at expanding that to a larger range of resources, we realized that those rules did not cover all of the cases. So, we worked, for, uh - Tim brought the, the specification, his original specification, to the group, and we worked on it for, I don't remember, nine months to a year or so, to come up with a standard for URLs. So all of those URLs that we use, day in and day out, were, were standardized as a result of that committee. So, it was, um, it was a really exciting time, it was a time of, y'know - the question I always get is why didn't I make a billion dollars off of it? And I keep reminding people