WEBVTT

00:00:00.120 --> 00:00:07.950
[song counts down: 7, 6, 5, 4, 3, 2, 1] The Internet: Packets, Routing, and Reliability

00:00:07.950 --> 00:00:13.650
Hi, my name is Lynn. I'm a software engineer here at Spotify and I'll be the first to admit

00:00:13.650 --> 00:00:18.970
that I often take for granted the reliability of the internet. The sheer amount of information

00:00:18.970 --> 00:00:23.170
zooming around the internet is astonishing. But how is it possible for every piece of

00:00:23.170 --> 00:00:29.080
data to be delivered to you reliably? Say you want to play a song from Spotify. It seems

00:00:29.080 --> 00:00:33.989
like your computer connects directly to Spotify servers and Spotify sends you a song on a

00:00:33.989 --> 00:00:39.410
direct, dedicated line. But actually, that's not how the internet works. If the internet

00:00:39.410 --> 00:00:43.640
were made of direct, dedicated connections, it would be impossible to keep things working

00:00:43.640 --> 00:00:48.050
as millions of users join, especially since there is no guarantee that every wire and

00:00:48.050 --> 00:00:53.350
computer is working all the time. Instead, data travels on the internet in a much less

00:00:53.350 --> 00:01:01.210
direct fashion. Many, many years ago, in the early 1970s, my partner Bob Kahn and I began

00:01:01.210 --> 00:01:06.870
working on the design of what we now call the internet. Bob and I had the responsibility

00:01:06.870 --> 00:01:14.790
and the opportunity to design the internet's protocols and its architecture. So we persisted

00:01:14.790 --> 00:01:20.000
in participating in the internet's growth and evolution for all of this time up to and

00:01:20.000 --> 00:01:25.500
including the present. The way information gets transferred from one computer to another

00:01:25.500 --> 00:01:30.900
is pretty interesting. It need not follow a fixed path; in fact, your path may change

00:01:30.900 --> 00:01:36.100
in the midst of a computer-to-computer conversation.
Information on the internet goes from one

00:01:36.100 --> 00:01:42.050
computer to another in what we call a packet of information, and a packet travels from one

00:01:42.050 --> 00:01:46.360
place to another on the internet a lot like how you might get from one place to another

00:01:46.360 --> 00:01:51.420
in a car. Depending on traffic congestion or road conditions, you might choose or be

00:01:51.420 --> 00:01:59.000
forced to take a different route to get to the same place each time you travel. And just

00:01:59.000 --> 00:02:03.980
as you can transport all sorts of stuff inside a car, many kinds of digital information can

00:02:03.980 --> 00:02:10.359
be sent with IP packets, but there are some limits. What if, for example, you need to move

00:02:10.359 --> 00:02:14.200
a space shuttle from where it was built to where it will be launched? The shuttle won't

00:02:14.200 --> 00:02:18.780
fit in one truck, so it needs to be broken down into pieces and transported using a fleet

00:02:18.780 --> 00:02:23.099
of trucks. They could all take different routes and might get to the destination at different

00:02:23.099 --> 00:02:28.109
times. But once all the pieces are there, you can reassemble the pieces into the complete

00:02:28.109 --> 00:02:34.329
shuttle and it will be ready for launch. On the internet the details work similarly. If

00:02:34.329 --> 00:02:40.090
you have a very large image that you want to send to a friend or upload to a website,

00:02:40.090 --> 00:02:44.819
that image might be made up of tens of millions of bits, 1s and 0s, too many to send along

00:02:44.819 --> 00:02:49.810
in one packet. Since it's data on a computer, the computer sending the image can quickly

00:02:49.810 --> 00:02:55.719
break it into hundreds or even thousands of smaller parts called packets. Unlike cars

00:02:55.719 --> 00:03:00.230
or trucks, these packets don't have drivers and they don't choose their route.
Each packet

00:03:00.230 --> 00:03:04.650
has the internet address of where it came from and where it's going. Special computers

00:03:04.650 --> 00:03:09.430
on the internet called routers act like traffic managers to keep the packets moving through

00:03:09.430 --> 00:03:15.239
the networks smoothly. If one route is congested, individual packets may travel different routes

00:03:15.239 --> 00:03:20.370
through the internet, and they may arrive at the destination at slightly different times

00:03:20.370 --> 00:03:26.569
or even out of order. Let's talk about how this works. As part of the Internet Protocol,

00:03:26.569 --> 00:03:31.169
every router keeps track of multiple paths for sending packets, and it chooses the cheapest

00:03:31.169 --> 00:03:37.079
available path for each piece of data based on the packet's destination IP address.

00:03:37.079 --> 00:03:42.120
Cheapest in this case doesn't mean cost, but time and non-technical factors such as politics

00:03:42.120 --> 00:03:47.499
and relationships between companies. Often the best route for data to travel isn't necessarily

00:03:47.499 --> 00:03:53.150
the most direct. Having options for paths makes the network fault tolerant, which means

00:03:53.150 --> 00:03:57.700
the network can keep sending packets even if something goes horribly, horribly wrong.

00:03:57.700 --> 00:04:04.849
This is the basis for a key principle of the internet: reliability. Now, what if you want

00:04:04.849 --> 00:04:09.349
to request some data and not everything is delivered? Say you want to listen to a song.

00:04:09.349 --> 00:04:14.829
How can you be 100% sure all the data will be delivered so the song plays perfectly?

00:04:14.829 --> 00:04:21.440
Introducing your new best friend, TCP (Transmission Control Protocol). TCP manages the sending

00:04:21.440 --> 00:04:26.530
and receiving of all your data as packets. Think of it like a guaranteed mail service.
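
NOTE The routing choice described a few cues earlier (each router keeps several paths and picks the cheapest available one for a packet's destination) can be sketched in Python. This is only an illustration under stated assumptions: the names Route and choose_path, the router names, the cost values, and the example address 203.0.113.7 are all hypothetical, the table is looked up by exact destination address for simplicity, and a single number stands in for "time and non-technical factors".

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """One candidate path a router knows about (hypothetical model)."""
    next_hop: str
    cost: float      # lower is "cheaper"; stands in for time and policy
    available: bool  # False when a link or neighboring router is down


def choose_path(routes: dict[str, list[Route]], destination: str) -> Route | None:
    """Return the cheapest available route for a destination, if any."""
    candidates = [r for r in routes.get(destination, []) if r.available]
    return min(candidates, key=lambda r: r.cost, default=None)


# Hypothetical routing table: destination address mapped to known routes.
routes = {
    "203.0.113.7": [
        Route("router-a", cost=10.0, available=True),
        Route("router-b", cost=25.0, available=True),
    ],
}

best = choose_path(routes, "203.0.113.7")
print(best.next_hop)  # router-a

# Fault tolerance: when the cheapest route fails, the next one takes over.
routes["203.0.113.7"][0] = Route("router-a", cost=10.0, available=False)
fallback = choose_path(routes, "203.0.113.7")
print(fallback.next_hop)  # router-b
```

Keeping more than one route per destination is what lets the lookup fall back to router-b here; with a single route, the same failure would leave the packet with nowhere to go.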

00:04:26.530 --> 00:04:31.669
When you request a song on your device, Spotify sends a song broken up into many packets.

00:04:31.669 --> 00:04:37.210
When your packets arrive, TCP does a full inventory and sends back acknowledgements

00:04:37.210 --> 00:04:42.840
of each packet received. If all packets are there, TCP signs for your delivery and you're

00:04:42.840 --> 00:04:54.819
done. (song plays) If TCP finds some packets are missing, it won't sign; otherwise your

00:04:54.819 --> 00:04:59.930
song won't sound as good or portions of the song could be missing. For each missing or

00:04:59.930 --> 00:05:05.930
incomplete packet, Spotify will resend it. Once TCP verifies the delivery of many packets

00:05:05.930 --> 00:05:13.370
for that one song request, your song will start to play. What's great about the TCP

00:05:13.370 --> 00:05:19.220
and router systems is they're scalable. They can work with 8 or 8 billion devices. In fact,

00:05:19.220 --> 00:05:23.449
because of these principles of fault tolerance and redundancy, the more routers we add, the

00:05:23.449 --> 00:05:28.069
more reliable the internet becomes. What's also great is we can grow and scale the internet

00:05:28.069 --> 00:05:34.379
without interrupting service for anybody using it. The internet is made of hundreds of thousands

00:05:34.379 --> 00:05:39.280
of networks and billions of computers and devices connected physically. These different

00:05:39.280 --> 00:05:44.360
systems that make up the internet connect to each other, communicate with each other,

00:05:44.360 --> 00:05:51.289
and work together because of agreed-upon standards for how data is sent around on the internet.

00:05:51.289 --> 00:05:56.000
Computing devices, or routers along the internet, help all the packets make their way to the

00:05:56.000 --> 00:06:02.789
destination where they're reassembled, if necessary, in order.
This happens billions

00:06:02.789 --> 00:06:08.889
of times a day, whether you and others are sending an email, visiting a web page, doing

00:06:08.889 --> 00:06:13.870
a video chat, using a mobile app, or when sensors or devices on the internet talk to

00:06:13.870 --> 00:06:14.910
each other.
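
NOTE The delivery process described above (data broken into packets, packets arriving late, out of order, or not at all, acknowledgements for what arrived, resends for what did not, and reassembly in order) can be sketched as a toy simulation in Python. This is a simplified illustration, not how TCP is actually implemented: real TCP numbers bytes rather than whole packets and relies on timers, checksums, and windowing. The function names, the packet size, the loss rate, and the seed are all assumptions made up for this sketch.

```python
import random


def split_into_packets(data: bytes, size: int) -> dict[int, bytes]:
    """Break data into numbered packets of at most `size` bytes."""
    return {
        seq: data[start:start + size]
        for seq, start in enumerate(range(0, len(data), size))
    }


def lossy_network(packets: dict[int, bytes], loss_rate: float,
                  rng: random.Random) -> list[tuple[int, bytes]]:
    """Simulate a network that drops some packets and reorders the rest."""
    survivors = [(seq, payload) for seq, payload in packets.items()
                 if rng.random() >= loss_rate]
    rng.shuffle(survivors)
    return survivors


def deliver(data: bytes, size: int = 4, loss_rate: float = 0.3,
            seed: int = 1) -> bytes:
    """Send data until every packet is acknowledged, then reassemble it."""
    rng = random.Random(seed)
    outstanding = split_into_packets(data, size)  # not yet acknowledged
    received: dict[int, bytes] = {}
    while outstanding:
        for seq, payload in lossy_network(outstanding, loss_rate, rng):
            received[seq] = payload  # receiver stores whatever arrives
        # The receiver acknowledges what arrived; only the rest is resent.
        outstanding = {seq: payload for seq, payload in outstanding.items()
                       if seq not in received}
    # Sequence numbers restore the original order.
    return b"".join(received[seq] for seq in sorted(received))


song = b"la la la, this is a song"
print(deliver(song))  # b'la la la, this is a song'
```

The loop only ends once nothing is left unacknowledged, which mirrors the "it won't sign until everything is there" idea; sorting by sequence number is what makes out-of-order arrival harmless.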