English subtitles

Being a Good Citizen - Web Development

Showing Revision 4 created 05/25/2016 by Udacity Robot.

  1. I'd like to now take a few moments to talk about how

  2. to be a good citizen on the Internet. There are two key
  3. things you can do, when you’re writing programs to manipulate other people's
  4. websites, or to access other people's websites, that will make everybody's life a
  5. lot easier. One is, use a good user agent. Remember, we talked
  6. about, in unit one, the user agent is the header that describes
  7. what browser you are using, or what program you are using to
  8. access somebody. If you are planning
  9. on accessing somebody in a consistent fashion,
  10. if you're going to poll them, you know, every couple of seconds for
  11. updates or do something like that, use a good user-agent. When you're using
  12. urllib2 you can specify headers in your request, and you should set a
  13. user-agent header that says, you know, who you are, what your name is, maybe
  14. links to your website. So that somebody on the other end, if they
  15. see you, you know, pounding them with lots and lots of requests, they
  16. know what's up. They have a way of reaching you to
  17. ask you to stop or to tell you they blocked you or that sort
  18. of thing. It's good to always include that. And the other important thing is,
  19. to rate-limit yourself. If you want to download, let's say, all of the
  20. search results for the word Udacity on Twitter, yeah, you can request
  21. them 15 at a time, which is what their API returns, I believe, as fast
  22. as you can, but you'd really be sending a lot of requests to Twitter
  23. because you can have some loop and it's much, much faster than any human
  24. could type it and that will actually hurt Twitter's service. If you were to
  25. have code like this in Python, you
  26. know, while there's more stuff, make another request
  27. to Twitter, and just run this in an infinite loop, or maybe
  28. not an infinite loop, but a loop that's going to run through a number of
  29. iterations, you'd be sending requests as fast as Twitter could possibly serve
  30. them. Instead, it's really good to get in the habit of using
  31. the sleep function. In Python you can say import time, time.sleep(1). And
  32. this will cause your interpreter to sleep for one second. And this
  33. is nice. Then you're only hitting them once a second, which is
  34. much more sustainable. But, if you abuse their service, or do too many
  35. requests, they'll probably rate-limit you. I
  36. know Twitter does. Because I thought about
  37. having a quiz in this unit that was, how many requests
  38. in a minute can you make before Twitter rate-limits you? But then I
  39. realized that that would be the exact opposite of being a good citizen
  40. on the net, asking thousands of students to go hit some website as
  41. fast as you can. It's generally not a nice thing to do. So,
  42. instead, we're just going to talk about it. I'm going to ask you
  43. to make sure, if you're hitting somebody hard, that you structure your code
  44. like this. Include a sleep so you pause
  45. a little bit and don't hit anybody too hard.
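
The two habits described in this transcript, a descriptive user-agent and a pause between requests, can be sketched together as follows. This is a minimal illustration, not the code shown in the video. It assumes Python 3, where the urllib2 module mentioned above is available as urllib.request. The user-agent string, the contact URL, and the helper names build_request and polite_fetch_all are hypothetical placeholders.

```python
import time
import urllib.request

# Hypothetical identity string: say who you are and how to reach you.
USER_AGENT = "example-fetcher/0.1 (Jane Doe; https://example.com/contact)"


def build_request(url):
    """Create a request that carries a descriptive User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def default_fetch(request):
    """Perform the HTTP request and return the raw response body."""
    with urllib.request.urlopen(request) as response:
        return response.read()


def polite_fetch_all(urls, delay_seconds=1.0, fetch=default_fetch, sleep=time.sleep):
    """Fetch each URL in turn, pausing between consecutive requests.

    The fetch and sleep parameters exist only so the loop can be
    exercised without touching the network or actually waiting.
    """
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            # Rate-limit ourselves: wait before every request after the first.
            sleep(delay_seconds)
        results.append(fetch(build_request(url)))
    return results
```

With the defaults, polite_fetch_all(list_of_urls) makes at most one request per second. A longer delay_seconds is a reasonable choice when the site publishes a stricter limit.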