English subtitles

Crawl Web - Intro to Computer Science

Showing Revision 5 created 05/24/2016 by Udacity Robot.

  1. Now we're ready to write the code for crawling the
  2. web. So our goal is to define a procedure, we'll
  3. call it crawl_web, that takes as input a seed page
  4. URL. So, that's the URL that identifies our seed page, and
  5. outputs a list of all the URLs that can be
  6. reached by following links starting from the seed page. So,
  7. if you're really ambitious you should try to do this
  8. yourself without any more help. That's going to be a pretty tough challenge.
  9. So we're also going to step through one way to do
  10. this as a series of quizzes. But you should feel free
  11. at any point, when you feel confident that you can do
  12. it yourself, to try to finish for yourself, rather than following
  13. the step by step quizzes that I'll show you. So we
  14. will start defining our procedure crawl_web, and we are going
  15. to introduce two variables: the tocrawl variable, which keeps track
  16. of the pages that we need to crawl, and the crawled
  17. variable, which is a list of pages that we
  18. have already crawled. For the first step, your goal
  19. is to figure out, how to initialize these variables.
  20. What should the initial values of tocrawl and crawled be?
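
The roles of the two variables described above can be sketched in Python. This is only one possible shape for crawl_web, not necessarily the solution the quizzes build toward, and it gives away a plausible answer to the initialization question, so skip it if you want to attempt the quiz first. The TOY_WEB dictionary, its example.com URLs, and the get_links helper are assumptions made up for illustration; they stand in for fetching real pages and extracting their links.

```python
# Hypothetical in-memory "web": maps each URL to the URLs it links to.
# TOY_WEB and get_links are illustrative assumptions, not part of the lecture.
TOY_WEB = {
    "http://example.com/a": ["http://example.com/b", "http://example.com/c"],
    "http://example.com/b": ["http://example.com/a", "http://example.com/c"],
    "http://example.com/c": [],
}


def get_links(url):
    """Return the outgoing links of url in the toy web (no network access)."""
    return TOY_WEB.get(url, [])


def crawl_web(seed):
    """Return a list of all URLs reachable from seed by following links."""
    tocrawl = [seed]  # pages we still need to crawl; starts with the seed
    crawled = []      # pages we have already crawled; starts empty
    while tocrawl:
        page = tocrawl.pop()
        if page not in crawled:
            # Queue every link on this page that is not already waiting.
            for link in get_links(page):
                if link not in tocrawl:
                    tocrawl.append(link)
            crawled.append(page)
    return crawled
```

Calling crawl_web("http://example.com/a") on this toy web returns all three URLs. The check against crawled is what keeps the loop from running forever when pages link back to each other, as pages a and b do here.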