0:00:02.719,0:00:07.360 The Internet: HTTP and HTML 0:00:07.360,0:00:11.740 I'm Jasmine and I'm a program[br]manager on the Xbox One engineering 0:00:11.759,0:00:18.700 team. One of our biggest features is called[br]Xbox Live. It's an online service that connects 0:00:18.700,0:00:24.099 gamers from all around the world, and we rely[br]on the internet to make that happen. This 0:00:24.099,0:00:30.500 is no easy task and there are a lot of things[br]happening behind the scenes. The internet 0:00:30.500,0:00:36.280 is totally changing how people interact and[br]connect. But how does it work? How do the 0:00:36.280,0:00:43.489 computers all across the world actually communicate[br]with each other? Let's look at web browsing. 0:00:43.489,0:00:50.199 First, you open a web browser. It's the app[br]you use to access the web pages. Next, you 0:00:50.199,0:00:55.899 type in the web address, or URL, which stands[br]for Uniform Resource Locator, of the website 0:00:55.899,0:01:06.810 you want to visit, like tumblr.com. Hi, I'm[br]David Karp, the founder of Tumblr, and we're 0:01:06.810,0:01:12.560 here today to talk about how those web browsers[br]we use every day actually work. So you've probably 0:01:12.560,0:01:16.350 wondered what actually happens when you type[br]an address into your web browser and then 0:01:16.350,0:01:21.020 hit enter. And it really is about as crazy[br]as you can imagine. So in that moment your 0:01:21.020,0:01:25.930 computer starts talking to another computer,[br]called a server, that's usually thousands 0:01:25.930,0:01:32.450 of miles away. And in milliseconds your computer[br]asks that server for a website, and that server 0:01:32.450,0:01:39.530 starts to talk back to your computer in a[br]language called HTTP. HTTP stands for HyperText 0:01:39.530,0:01:43.680 Transfer Protocol. You can kind of think of[br]it as the language that one computer uses 0:01:43.680,0:01:48.009 to ask another computer for a document. 
And[br]it's actually really pretty straightforward. 0:01:48.009,0:01:52.540 If you were to intercept the conversation[br]between your computer and a web server on 0:01:52.540,0:01:56.670 the internet, it's mainly made up of something[br]called "GET" requests. Those are really very 0:01:56.670,0:02:01.590 simply the word GET and the name of the document[br]that you're requesting. So if you try to log 0:02:01.590,0:02:06.360 into Tumblr and load our login page, all you're[br]doing is sending a GET request to Tumblr's 0:02:06.360,0:02:14.290 server that says GET /login. And that tells[br]Tumblr's server that you want all of the HTML 0:02:14.290,0:02:21.800 code for the Tumblr login page. So HTML stands[br]for HyperText Markup Language, and you can 0:02:21.800,0:02:26.470 think of that as the language you use to tell[br]a web browser how to make a page look. If 0:02:26.470,0:02:30.540 you think about something like Wikipedia,[br]it's really just a big simple document, 0:02:30.540,0:02:35.630 and HTML is the language that you use to make[br]that title big and bold, to make the font 0:02:35.630,0:02:42.690 the right font, to link certain text to certain[br]other pages, to make some text bold, to make some 0:02:42.690,0:02:46.740 text italic, to put an image in the middle[br]of the page, to align the image to the right, 0:02:46.740,0:02:52.990 to align the image to the left. The text of[br]a web page is included directly in the HTML, 0:02:52.990,0:02:58.380 but other parts like images or videos are[br]separate files with their own URLs that need 0:02:58.380,0:03:04.540 to be requested. The browser sends separate[br]HTTP requests for each of these and displays 0:03:04.540,0:03:11.670 them as they arrive. 
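To make the GET request and the "separate request per image" idea concrete, here is a minimal Python sketch. It only builds the request text and reads a made-up page; it does not contact any server. The host name www.tumblr.com, the sample HTML, the image paths, and the helper names build_get_request and ImageCollector are all assumptions chosen for illustration, not Tumblr's actual page or code.

```python
from html.parser import HTMLParser

HOST = "www.tumblr.com"  # assumed host name, for illustration only


def build_get_request(host: str, path: str) -> str:
    """Return the text of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",  # the word GET plus the name of the document
        f"Host: {host}",         # which website the request is for
        "Connection: close",
        "",                      # an empty line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines)


class ImageCollector(HTMLParser):
    """Collect the src URL of every <img> tag found in an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


# Hypothetical HTML for a login page: the text is in the HTML itself,
# while the images are separate files with their own URLs.
SAMPLE_HTML = (
    "<h1>Log in</h1>"
    "<p>Welcome <b>back</b> to the <i>dashboard</i>.</p>"
    '<img src="/images/logo.png">'
    '<img src="/images/banner.png">'
)

request_text = build_get_request(HOST, "/login")

collector = ImageCollector()
collector.feed(SAMPLE_HTML)

# One request for the page, then one more for each image it refers to.
requests_to_send = [request_text] + [
    build_get_request(HOST, src) for src in collector.sources
]

print(request_text)
print(f"{len(requests_to_send)} requests needed to show this page")
```

A page with more images produces a longer list of requests, which is one reason image-heavy pages take longer to load.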
If a web page has a lot[br]of different images, each of them causes a 0:03:11.670,0:03:20.780 separate HTTP request and the page loads more slowly.[br]Now sometimes when you browse the web, you're 0:03:20.780,0:03:25.880 not just requesting pages with GET requests.[br]Sometimes you send information like when you 0:03:25.880,0:03:32.300 fill out a form or type a search query. Your[br]browser sends this information in plain text 0:03:32.300,0:03:39.090 to the web server using an HTTP POST request.[br]Let's say you log in to Tumblr. Well the first 0:03:39.090,0:03:45.360 thing you do is you make a POST request, that[br]is a POST to Tumblr's login page that has 0:03:45.360,0:03:49.680 some data attached to it. It has your email[br]address, it has your password. That goes to 0:03:49.680,0:03:55.350 Tumblr's server. Tumblr's server figures out[br]that okay, you're David. It sends a web page 0:03:55.350,0:04:00.480 back to your browser that says, Success! Logged[br]in as David. But along with that web page, 0:04:00.480,0:04:07.000 it also attaches a little bit of invisible cookie[br]data that your browser sees and knows to save. 0:04:07.000,0:04:11.360 And it's really important because it's really[br]the only way that a website can remember who 0:04:11.360,0:04:16.940 you are. All that cookie data really is, is[br]an ID card for Tumblr. It's a number that 0:04:16.940,0:04:21.790 identifies you as David. And your web browser[br]holds on to that number and the next time 0:04:21.790,0:04:26.660 you refresh Tumblr, the next time you go to[br]tumblr.com, your web browser knows to automatically 0:04:26.660,0:04:30.930 attach that ID number with the request that[br]it sends over to Tumblr's servers. So now 0:04:30.930,0:04:35.970 Tumblr's servers see the request coming from[br]your browser, see the ID number, and know 0:04:35.970,0:04:43.940 "Ok, this is a request from David."[br]Now, the internet is completely open. 
All 0:04:43.940,0:04:49.350 of its connections are shared and information[br]is sent in plain text. This makes it possible 0:04:49.350,0:04:55.630 for hackers to snoop on any personal information[br]that you send over the internet. But safe 0:04:55.630,0:05:00.970 websites prevent this, by asking your web[br]browser to communicate on a secure channel 0:05:00.970,0:05:07.630 using something called Secure Sockets Layer[br]and its successor Transport Layer Security. 0:05:07.630,0:05:14.000 You can think of SSL and TLS as a layer of[br]security wrapped around your communications 0:05:14.000,0:05:20.530 to protect them from snooping or tampering.[br]SSL and TLS are active when you see the little 0:05:20.530,0:05:27.440 lock that appears in your browser address[br]bar, next to the HTTPS. The HTTPS protocols 0:05:27.440,0:05:33.840 ensure that your HTTP requests are secure[br]and protected. When a website asks your browser 0:05:33.840,0:05:39.500 to engage in a secure connection, it first[br]provides a digital certificate. Which is like 0:05:39.500,0:05:45.140 an official ID card proving that it's the[br]website it claims to be. Digital certificates 0:05:45.140,0:05:49.900 are published by certificate authorities,[br]which are trusted entities that verify the 0:05:49.900,0:05:55.280 identities of websites and issue certificates[br]for them. Just like a government can issue 0:05:55.280,0:06:01.030 IDs or passports. Now if a website tries to[br]start a secure connection without a properly 0:06:01.030,0:06:09.590 issued digital certificate, your browser will[br]warn you. That's the basics of web browsing! 0:06:09.590,0:06:17.010 The part of the internet we see day to day.[br]To summarize, HTTP and DNS manage the sending 0:06:17.010,0:06:23.450 and receiving of HTML, media files, or anything[br]on the web. 
What makes this possible under 0:06:23.450,0:06:30.370 the hood are TCP/IP and router networks that[br]break down and transport information in small 0:06:30.370,0:06:36.670 packets. Those packets themselves are made[br]up of binary, sequences of 1s and 0s that 0:06:36.670,0:06:42.550 are physically sent through electric wires,[br]fiber optic cables, and wireless networks. 0:06:42.550,0:06:47.440 Fortunately, once you've learned how one layer[br]of the internet works, you can rely on it 0:06:47.440,0:06:52.070 without remembering all the details. And we[br]can trust that all those layers will work 0:06:52.070,0:06:59.090 together to successfully deliver information[br]at scale and with reliability.
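
The idea of breaking information into small packets made of 1s and 0s can be sketched in a few lines of Python. This is a simplified illustration, not how TCP/IP is actually implemented: real TCP numbers bytes rather than whole packets and adds headers, acknowledgements, and retransmission. The function names to_packets and reassemble, the 4-byte packet size, and the sample message are assumptions made for this example.

```python
import random


def to_packets(data: bytes, size: int) -> list[tuple[int, bytes]]:
    """Split data into (sequence number, chunk) pairs of at most `size` bytes."""
    return [
        (number, data[start:start + size])
        for number, start in enumerate(range(0, len(data), size))
    ]


def reassemble(packets: list[tuple[int, bytes]]) -> bytes:
    """Put the chunks back in sequence order and join them."""
    return b"".join(chunk for _, chunk in sorted(packets))


message = "GET /login".encode("ascii")

packets = to_packets(message, 4)   # assumed packet size of 4 bytes
random.shuffle(packets)            # packets may arrive in any order
restored = reassemble(packets)

# Every byte is ultimately a sequence of 1s and 0s on the wire.
bits = " ".join(f"{byte:08b}" for byte in message[:3])

print(packets)
print(restored.decode("ascii"))
print(bits)
```

The sequence numbers are what let the receiver rebuild the original message even when the packets take different routes and arrive out of order.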