0:00:06.070,0:00:07.120 Hi, my name's John. 0:00:07.510,0:00:10.140 I lead the search and machine[br]learning teams at Google. 0:00:12.130,0:00:14.230 I think it's amazingly inspiring 0:00:14.230,0:00:16.214 that people all over the world 0:00:16.215,0:00:19.160 turn to search engines to[br]ask trivial questions 0:00:19.160,0:00:20.930 and incredibly important questions. 0:00:20.930,0:00:23.450 So it's a huge[br]responsibility to give them 0:00:23.450,0:00:24.864 the best answers that we can. 0:00:26.710,0:00:30.610 Hi, my name's Akshaya and [br]I work on the Bing search team. 0:00:30.910,0:00:33.190 There are many times where[br]we will start looking 0:00:33.190,0:00:35.800 into artificial intelligence[br]and machine learning, 0:00:35.830,0:00:39.010 but we have to address how are[br]the users going to use this, 0:00:39.140,0:00:42.390 because at the end of the day,[br]we want to make an impact to society. 0:00:43.780,0:00:45.400 Let's ask a simple question. 0:00:45.820,0:00:48.070 How long does it take to travel to Mars? 0:00:49.330,0:00:50.950 Where did these results come from 0:00:51.370,0:00:54.100 and why was this listed[br]before the other one? 0:00:55.700,0:00:58.150 Okay, let's dive in and[br]see how the search engine 0:00:58.150,0:00:59.860 turned your request into a result. 0:01:00.690,0:01:03.360 The first thing you need to[br]know is when you do a search, 0:01:03.430,0:01:06.480 the search engine isn't actually[br]going out to the World Wide Web 0:01:06.480,0:01:08.010 to run your search in real time. 0:01:08.140,0:01:10.610 And that's because there's[br]over a billion websites 0:01:10.610,0:01:14.140 on the internet and hundreds more are[br]being created every single minute. 0:01:14.140,0:01:16.210 So if the search engine[br]had to look through 0:01:16.240,0:01:18.690 every single site to[br]find the one you wanted, 0:01:18.690,0:01:20.120 it would just take forever. 
0:01:20.500,0:01:21.940 So to make your search faster, 0:01:21.970,0:01:24.940 search engines are constantly[br]scanning the web in advance 0:01:25.420,0:01:28.560 to record the information that might[br]help with your search later. 0:01:28.930,0:01:31.270 That way, when you search[br]about travel to Mars, 0:01:31.630,0:01:33.700 the search engine[br]already has what it needs 0:01:33.700,0:01:35.728 to give you an answer in real time. 0:01:36.250,0:01:37.540 Here's how it works. 0:01:37.900,0:01:42.010 The internet is a web of pages[br]connected to each other by hyperlinks. 0:01:42.400,0:01:44.680 Search engines are[br]constantly running a program 0:01:44.680,0:01:47.380 called a spider that crawls[br]through these web pages 0:01:47.380,0:01:49.040 to collect information about them. 0:01:49.780,0:01:51.550 Each time it finds a hyperlink, 0:01:52.090,0:01:55.000 it follows it until it[br]has visited every page 0:01:55.030,0:01:57.240 it can find on the entire[br]internet. 0:01:57.335,0:01:59.170 For each page the spider visits, 0:01:59.200,0:02:02.320 it records any information[br]it might need for a search 0:02:02.500,0:02:05.650 by adding it to a special[br]database called a search index. 0:02:07.166,0:02:09.530 Now, let's go back to[br]that search from earlier 0:02:09.590,0:02:11.990 and see if we can figure[br]out how the search engine 0:02:11.990,0:02:13.333 came up with the results. 0:02:13.640,0:02:16.460 When you ask how long does[br]it take to travel to Mars, 0:02:16.640,0:02:18.860 the search engine looks[br]up each of those words 0:02:18.920,0:02:21.410 in the search index to[br]immediately get a list 0:02:21.410,0:02:24.500 of all the pages on the[br]internet containing those words. 0:02:24.890,0:02:26.870 But just looking for these search terms 0:02:26.870,0:02:28.760 could return millions of pages, 0:02:28.760,0:02:31.110 so the search engine needs[br]to be able to determine 0:02:31.110,0:02:33.120 the best matches to show you first.
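The spider and the search index described so far can be sketched in a few lines of Python. This is only a toy illustration, not how any real search engine is built: the four-page `TOY_WEB` dictionary, the `.example` addresses, and the function names `crawl` and `lookup` are all made up for this example, and a real spider fetches pages over the network rather than from a dictionary.

```python
from collections import deque

# Hypothetical in-memory "web": address -> (page text, outgoing hyperlinks).
TOY_WEB = {
    "a.example": ("how long does it take to travel to mars", ["b.example", "c.example"]),
    "b.example": ("mars travel time depends on the launch window", ["a.example"]),
    "c.example": ("dog parks near you", ["d.example"]),
    "d.example": ("how to travel with a dog", []),
}


def crawl(start_url, web):
    """Follow hyperlinks from start_url and build a search index.

    The index maps each word to the set of pages that contain it.
    """
    index = {}
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        text, links = web[url]
        # Record every word on this page in the index.
        for word in text.split():
            index.setdefault(word, set()).add(url)
        # Queue each hyperlink that has not been visited yet.
        for link in links:
            if link in web and link not in seen:
                seen.add(link)
                queue.append(link)
    return index


def lookup(index, query):
    """Return the pages that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


index = crawl("a.example", TOY_WEB)
matches = lookup(index, "travel to mars")
```

Because the index is built in advance, `lookup` never touches the pages themselves; it only intersects the sets of addresses stored for each query word. Deciding which of those matches to show first is a separate step.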
0:02:33.340,0:02:36.010 This is where it gets tricky[br]because the search engine 0:02:36.010,0:02:38.040 may need to guess what[br]you're looking for. 0:02:38.930,0:02:41.360 Each search engine[br]uses its own algorithm 0:02:41.360,0:02:44.230 to rank the pages based on[br]what it thinks you want. 0:02:44.930,0:02:47.660 The search engine's ranking[br]algorithm might check 0:02:47.990,0:02:50.360 if your search term shows[br]up in the page title, 0:02:50.900,0:02:53.820 it might check if all of the[br]words show up next to each other, 0:02:54.520,0:02:57.020 or any number of other calculations 0:02:57.020,0:02:58.610 that help it better determine 0:02:58.670,0:03:01.420 which pages you'll want[br]to see and which you won't. 0:03:02.960,0:03:04.960 Google invented the most [br]famous algorithm 0:03:04.960,0:03:08.530 for choosing the most relevant results[br]for a search by taking into account 0:03:08.560,0:03:11.230 how many other web pages[br]linked to a given page. 0:03:11.830,0:03:14.140 The idea is that if[br]lots of websites think 0:03:14.140,0:03:15.660 that a web page is interesting, 0:03:15.660,0:03:17.940 then it's probably the one [br]you're looking for. 0:03:18.190,0:03:20.020 This algorithm is called PageRank, 0:03:20.590,0:03:22.330 not because it ranks web pages, 0:03:22.570,0:03:25.210 but because it was named after[br]its inventor, Larry Page, 0:03:25.480,0:03:27.333 who's one of the founders of Google. 0:03:27.940,0:03:30.520 Because a website often makes[br]money when you visit it, 0:03:30.820,0:03:32.950 spammers are constantly[br]trying to find ways 0:03:32.950,0:03:35.741 to game the search algorithm [br]so that their pages 0:03:35.742,0:03:37.931 are listed higher in the results. 0:03:38.260,0:03:40.750 Search engines regularly[br]update their algorithms 0:03:40.750,0:03:44.296 to prevent fake or untrustworthy[br]sites from reaching the top.
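The link-counting idea behind PageRank can be illustrated with a short Python sketch. It is a simplified textbook-style version, not Google's actual ranking code: the `TOY_LINKS` graph, the function name `page_rank`, the damping value of 0.85, and the fixed 50 iterations are assumptions chosen for this example. Each page repeatedly shares its score among the pages it links to, so a page that many others link to ends up with a high score.

```python
def page_rank(links, damping=0.85, iterations=50):
    """Estimate a score for each page from the links between pages.

    links maps each page to the list of pages it links to; every
    linked page must itself be a key of links. The damping value and
    iteration count are illustrative assumptions.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                # Split this page's score among the pages it links to.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no links spreads its score over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical link graph: three pages link to a.example, two to b.example.
TOY_LINKS = {
    "a.example": ["b.example"],
    "b.example": ["a.example"],
    "c.example": ["a.example"],
    "d.example": ["a.example", "b.example"],
}

ranks = page_rank(TOY_LINKS)
best = max(ranks, key=ranks.get)
```

In this toy graph `a.example` receives the most incoming links and comes out on top, while `c.example` and `d.example`, which nothing links to, keep only the baseline score. This also hints at why spammers try to manufacture links to their own pages.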
0:03:44.680,0:03:47.350 Ultimately, it's up to you [br]to keep an eye out 0:03:47.500,0:03:49.450 for these pages that are untrustworthy 0:03:49.690,0:03:52.990 by looking at the web address and[br]making sure it's a reliable source. 0:03:53.680,0:03:55.390 Search programs are always evolving 0:03:55.420,0:03:58.420 to improve the algorithms [br]so they return better results, 0:03:58.540,0:04:00.460 faster results than their competitors. 0:04:01.000,0:04:03.100 Today's search engines[br]even use information 0:04:03.100,0:04:06.820 that you haven't explicitly provided[br]to help you narrow down your search. 0:04:07.150,0:04:10.120 So, for example,[br]if you did a search for dog parks, 0:04:10.240,0:04:12.190 many search engines[br]would give you results 0:04:12.190,0:04:13.840 for all the dog parks nearby, 0:04:14.080,0:04:16.260 even though you didn't[br]type in your location. 0:04:17.800,0:04:20.530 Modern search engines[br]also understand more 0:04:20.530,0:04:22.060 than just the words on a page, 0:04:22.300,0:04:24.970 but what they actually mean[br]in order to find the best one 0:04:24.970,0:04:26.750 that matches what you're looking for. 0:04:27.130,0:04:29.980 For example, if you search [br]for fast pitcher, 0:04:30.280,0:04:32.300 it will know you're[br]looking for an athlete. 0:04:32.500,0:04:34.450 But if you search for large pitcher, 0:04:34.450,0:04:36.730 it will find you options[br]for your kitchen. 0:04:38.420,0:04:41.910 To understand the words better, [br]we use something called machine learning, 0:04:41.910,0:04:43.985 a type of artificial intelligence. 0:04:43.985,0:04:46.050 It enables search[br]algorithms to search out 0:04:46.090,0:04:48.400 not just individual letters[br]or words in the page, 0:04:48.400,0:04:51.280 but understand the underlying[br]meaning of the words.
0:04:53.690,0:04:55.850 The internet is growing exponentially, 0:04:56.210,0:04:59.810 but if the teams that design[br]search engines do our jobs right, 0:05:00.080,0:05:04.090 the information you want should[br]always be just a few keystrokes away.