The Internet: How Search Works

  • 0:06 - 0:07
    Hi, my name's John.
  • 0:08 - 0:10
    I lead the search and machine
    learning teams at Google.
  • 0:12 - 0:14
    I think it's amazingly inspiring
  • 0:14 - 0:16
    that people all over the world
  • 0:16 - 0:19
    turn to search engines to
    ask trivial questions
  • 0:19 - 0:21
    and incredibly important questions.
  • 0:21 - 0:23
    So it's a huge
    responsibility to give them
  • 0:23 - 0:25
    the best answers that we can.
  • 0:27 - 0:31
    Hi, my name's Akshaya and
    I work on the Bing search team.
  • 0:31 - 0:33
    There are many times where
    we will start looking
  • 0:33 - 0:36
    into artificial intelligence
    and machine learning,
  • 0:36 - 0:39
    but we have to address how are
    the users going to use this,
  • 0:39 - 0:42
    because at the end of the day,
    we want to make an impact to society.
  • 0:44 - 0:45
    Let's ask a simple question.
  • 0:46 - 0:48
    How long does it take to travel to Mars?
  • 0:49 - 0:51
    Where did these results come from
  • 0:51 - 0:54
    and why was this listed
    before the other one?
  • 0:56 - 0:58
    Okay, let's dive in and
    see how the search engine
  • 0:58 - 1:00
    turned your request into a result.
  • 1:01 - 1:03
    The first thing you need to
    know is when you do a search,
  • 1:03 - 1:06
    the search engine isn't actually
    going out to the World Wide Web
  • 1:06 - 1:08
    to run your search in real time.
  • 1:08 - 1:11
    And that's because there's
    over a billion websites
  • 1:11 - 1:14
    on the internet and hundreds more are
    being created every single minute.
  • 1:14 - 1:16
    So if the search engine
    had to look through
  • 1:16 - 1:19
    every single site to
    find the one you wanted,
  • 1:19 - 1:20
    it would just take forever.
  • 1:20 - 1:22
    So to make your search faster,
  • 1:22 - 1:25
    search engines are constantly
    scanning the web in advance
  • 1:25 - 1:29
    to record the information that might
    help with your search later.
  • 1:29 - 1:31
    That way, when you search
    about travel to Mars,
  • 1:32 - 1:34
    the search engine
    already has what it needs
  • 1:34 - 1:36
    to give you an answer in real time.
  • 1:36 - 1:38
    Here's how it works.
  • 1:38 - 1:42
    The internet is a web of pages
    connected to each other by hyperlinks.
  • 1:42 - 1:45
    Search engines are
    constantly running a program
  • 1:45 - 1:47
    called a spider that crawls
    through these web pages
  • 1:47 - 1:49
    to collect information about them.
  • 1:50 - 1:52
    Each time it finds a hyperlink,
  • 1:52 - 1:55
    it follows it until it
    has visited every page
  • 1:55 - 1:57
    it can find on the entire
    internet.
  • 1:57 - 1:59
    For each page the spider visits,
  • 1:59 - 2:02
    it records any information
    it might need for a search
  • 2:02 - 2:06
    by adding it to a special
    database called a search index.
  • 2:07 - 2:10
    Now, let's go back to
    that search from earlier
  • 2:10 - 2:12
    and see if we can figure
    out how the search engine
  • 2:12 - 2:13
    came up with the results.
  • 2:14 - 2:16
    When you ask how long does
    it take to travel to Mars,
  • 2:17 - 2:19
    the search engine looks
    up each of those words
  • 2:19 - 2:21
    in the search index to
    immediately get a list
  • 2:21 - 2:24
    of all the pages on the
    internet containing those words.
  • 2:25 - 2:27
    But just looking for these search terms
  • 2:27 - 2:29
    could return millions of pages,
  • 2:29 - 2:31
    so the search engine needs
    to be able to determine
  • 2:31 - 2:33
    the best matches to show you first.
  • 2:33 - 2:36
    This is where it gets tricky
    because the search engine
  • 2:36 - 2:38
    may need to guess what
    you're looking for.
  • 2:39 - 2:41
    Each search engine
    uses its own algorithm
  • 2:41 - 2:44
    to rank the pages based on
    what it thinks you want.
  • 2:45 - 2:48
    The search engine's ranking
    algorithm might check
  • 2:48 - 2:50
    if your search term shows
    up in the page title,
  • 2:51 - 2:54
    it might check if all of the
    words show up next to each other,
  • 2:55 - 2:57
    or any number of other calculations
  • 2:57 - 2:59
    that help it better determine
  • 2:59 - 3:01
    which pages you'll want
    to see and which you won't.
  • 3:03 - 3:05
    Google invented the most
    famous algorithm
  • 3:05 - 3:09
    for choosing the most relevant results
    for a search by taking into account
  • 3:09 - 3:11
    how many other web pages
    linked to a given page.
  • 3:12 - 3:14
    The idea is that if
    lots of websites think
  • 3:14 - 3:16
    that a web page is interesting,
  • 3:16 - 3:18
    then it's probably the one
    you're looking for.
  • 3:18 - 3:20
    This algorithm is called PageRank,
  • 3:21 - 3:22
    not because it ranks web pages,
  • 3:23 - 3:25
    but because it was named after
    its inventor, Larry Page,
  • 3:25 - 3:27
    who's one of the founders of Google.
  • 3:28 - 3:31
    Because a website often makes
    money when you visit it,
  • 3:31 - 3:33
    spammers are constantly
    trying to find ways
  • 3:33 - 3:36
    to game the search algorithm
    so that their pages
  • 3:36 - 3:38
    are listed higher in the results.
  • 3:38 - 3:41
    Search engines regularly
    update their algorithms
  • 3:41 - 3:44
    to prevent fake or untrustworthy
    sites from reaching the top.
  • 3:45 - 3:47
    Ultimately, it's up to you
    to keep an eye out
  • 3:48 - 3:49
    for these pages that are untrustworthy
  • 3:50 - 3:53
    by looking at the web address and
    making sure it's a reliable source.
  • 3:54 - 3:55
    Search programs are always evolving
  • 3:55 - 3:58
    to improve the algorithms
    so they return better results,
  • 3:59 - 4:00
    faster results than their competitors.
  • 4:01 - 4:03
    Today's search engines
    even use information
  • 4:03 - 4:07
    that you haven't explicitly provided
    to help you narrow down your search.
  • 4:07 - 4:10
    So, for example,
    if you did a search for dog parks,
  • 4:10 - 4:12
    many search engines
    would give you results
  • 4:12 - 4:14
    for all the dog parks nearby,
  • 4:14 - 4:16
    even though you didn't
    type in your location.
  • 4:18 - 4:21
    Modern search engines
    also understand not
  • 4:21 - 4:22
    just the words on a page,
  • 4:22 - 4:25
    but what they actually mean
    in order to find the best one
  • 4:25 - 4:27
    that matches what you're looking for.
  • 4:27 - 4:30
    For example, if you search
    for fast pitcher,
  • 4:30 - 4:32
    it will know you're
    looking for an athlete.
  • 4:32 - 4:34
    But if you search for large pitcher,
  • 4:34 - 4:37
    it will find you options
    for your kitchen.
  • 4:38 - 4:42
    To understand the words better,
    we use something called machine learning,
  • 4:42 - 4:44
    a type of artificial intelligence.
  • 4:44 - 4:46
    It enables search
    algorithms to search out
  • 4:46 - 4:48
    not just individual letters
    or words in the page,
  • 4:48 - 4:51
    but understand the underlying
    meaning of the words.
  • 4:54 - 4:56
    The internet is growing exponentially,
  • 4:56 - 5:00
    but if the teams that design
    search engines do our jobs right,
  • 5:00 - 5:04
    the information you want should
    always be just a few keystrokes away.
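
The steps described in the video (crawl ahead of time, record words in a search index, look up each query word, then rank the matches) can be sketched roughly as follows. This is a minimal illustration, not how any real search engine is implemented. The tiny in-memory `WEB` dictionary and the function names `crawl`, `build_index`, `inbound_links`, and `search` are all hypothetical. Ranking here uses a plain count of inbound links, which is the simplified idea the video describes. The real PageRank algorithm is more involved and also weights each link by the importance of the page it comes from.

```python
from collections import deque

# Hypothetical miniature "web": page name -> (title, text, outgoing links).
WEB = {
    "mars-travel": ("Travel to Mars",
                    "how long does it take to travel to mars",
                    ["nasa", "rockets"]),
    "nasa": ("NASA", "missions to mars and the moon", ["mars-travel"]),
    "rockets": ("Rockets", "how rockets travel through space",
                ["nasa", "mars-travel"]),
    "dog-parks": ("Dog Parks", "dog parks near you", []),
}


def crawl(start):
    """Follow hyperlinks breadth-first from `start`; return pages in visit order."""
    seen, queue, order = {start}, deque([start]), []
    while queue:
        page = queue.popleft()
        order.append(page)
        for link in WEB[page][2]:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order


def build_index(pages):
    """Map each word to the set of crawled pages containing it."""
    index = {}
    for page in pages:
        title, text, _links = WEB[page]
        for word in (title + " " + text).lower().split():
            index.setdefault(word, set()).add(page)
    return index


def inbound_links(pages):
    """Count how many crawled pages link to each crawled page."""
    counts = {page: 0 for page in pages}
    for page in pages:
        for link in WEB[page][2]:
            if link in counts:
                counts[link] += 1
    return counts


def search(query, index, link_counts):
    """Return pages containing every query word, most-linked first."""
    words = query.lower().split()
    if not words:
        return []
    matches = set.intersection(*(index.get(word, set()) for word in words))
    # Ties are broken alphabetically so the order is deterministic.
    return sorted(matches, key=lambda page: (-link_counts[page], page))


# The crawl and the index are built in advance, before any query arrives.
pages = crawl("mars-travel")
index = build_index(pages)
link_counts = inbound_links(pages)

print(search("travel mars", index, link_counts))
print(search("travel", index, link_counts))
```

Note that `dog-parks` never appears in the results in this sketch: no crawled page links to it, so the spider starting from `mars-travel` never finds it and it is never added to the index.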
Title:
The Internet: How Search Works
Description:

Join John, Google's Chief of Search and AI, and Akshaya, from Microsoft Bing, to find out how search really works. They cover everything from how special programs called "spiders" scan the Internet before you even type in your search terms to what determines which search results show up first. Find out how search algorithms bust spammers, manage location services and even use machine learning to make search better every year.

Start learning at http://code.org/

Stay in touch with us!
• on Twitter https://twitter.com/codeorg
• on Facebook https://www.facebook.com/Code.org
• on Instagram https://instagram.com/codeorg
• on Tumblr https://blog.code.org
• on LinkedIn https://www.linkedin.com/company/code-org
• on Google+ https://google.com/+codeorg

"Ryoji Ikeda: Datamatics" by Forma Arts is licensed under CC BY 2.0
"Eyeo 2016" by Gene Kogan is licensed under CC BY 2.0
"Spider" by Oliviu Stoian is licensed under CC BY 2.0
"Bowie" by Artem Kovyazin is licensed under CC BY 2.0
"Spaceship" By Creative Staff from the Noun Project is licensed under CC BY 2.0
"Rover" by Symbolon is licensed under CC BY 2.0
"Signal Barrel" by Beeple is licensed under CC BY 2.0
"Base Ten" by Beeple is licensed under CC BY 2.0

Video Language:
English
Team:
Code.org
Project:
How Internet Works
Duration:
05:13
