English subtitles

Lock Contention - Web Development

Showing Revision 6 created 05/24/2016 by Udacity Robot.

  1. Anything else?
  2. Lock contention is a big thing for us right now.
  3. As I was saying with this Cassandra stuff--
  4. Whenever you vote on a link in a popular subreddit,
  5. it has to lock on that subreddit's listing.
  6. An example of that biting us recently--
  7. we have these queue processors,
  8. and all they do is they
  9. get that queue item that says that
  10. somebody's cast a vote.
  11. So you basically have a set of machines that are reading from this. >>Right.
  12. So all these apps are writing what happened to the queue? >>Right.
  13. Then you've got a bunch of machines
  14. that are reading from the queue.
  15. And what they do is, they sit there.
  16. They pull that out, and then they say, okay, that means I have to update these listings,
  17. and I have to record the vote in Postgres and Cassandra.
  18. And they go through all of this stuff,
  19. and that involves a lot of the locking in here right now.
  20. And so we had a lot of these queue processors up for votes,
  21. and we get a lot of votes simultaneously,
  22. so we need a lot of them,
  23. but we had too many, it turned out,
  24. and they were all fighting each other for those locks.
  25. Just halving the number of those queue processors
  26. actually sped up queue processing in general.
  27. You need enough queue processors to actually handle the depth of the queue,
  28. but if you have too many,
  29. they spend too much time fighting each other.
  30. And one of the ways we're working on that
  31. is we're getting rid of the locking in the Cassandra stuff.
  32. And we're trying to get rid of locks as much as possible in general.
  33. Locking--
  34. It's a common theme.
  35. In Python itself,
  36. when we first switched to Amazon we had this weird issue where
  37. Python's multi-threading--running two threads at once,
  38. running two pieces of your program at the same time,
  39. is not state of the art--
  40. would be a nice way of putting it.
  41. Python was spending so much time locking its data structures
  42. so two threads could access the same data structure at the same time
  43. that it was actually slowing down the
  44. computer's ability to read traffic over the network,
  45. which was causing it to spend more time
  46. switching between threads.
  47. It created this weird network/CPU thrashing issue.
  48. The way we solved that at the time was we just made Python single-threaded,
  49. and we'd only handle one request at a time.
  50. I think you guys are probably still doing that.
  51. Yeah, we very rarely use threaded processes.
  52. The ad servers use threads, but other than that
  53. we just have lots of separate processes.
  54. You guys use lots of processes on 1 machine,
  55. and the OS then can do the task switching for you.
  56. The OS, Linux these days, is pretty good at it.
  57. And they spend a lot of their time waiting
  58. for something in the back end.
  59. Well, this is really great. Thank you so much for coming.
  60. One thing I'd like to point out is that
  61. everything we've talked about here, the main things--
  62. memcache, Zookeeper, Cassandra, Hadoop, AMQP, NGINX, HAProxy--
  63. This is all free software.
  64. It's pretty wild how far you can get without paying for anything.
  65. The things that you have to pay for are
  66. the computers to run this stuff, but all of the software
  67. and all of the code behind all of the software is online and free.
  68. Also, the vast majority of reddit's code is also open source and online,
  69. so if you wanna look at this stuff--
  70. What's the URL for that? >>Github.com/reddit.
  71. I'm gonna make a hole right in the middle of our picture here.
  72. If you go into this code,
  73. we switched to git at some point,
  74. and my name isn't on a lot of this code anymore,
  75. but if you go in there, you will see
  76. a lot of the code we've written in this class for
  77. hashing and passwords and all of that stuff--
  78. It exists in reddit somewhere.
  79. It's really common stuff.
  80. It's cool, and you can see all of this stuff's also on the reddit mailing list,
  81. where people discuss these architecture changes and that sort of thing.
  82. Thanks again. Good job.
  83. Watching you guys grow Reddit has been really cool.
  84. There were some dark days, and you guys have really done an amazing job growing that site.
  85. It's really impressive. >>Well, it wouldn't be where it is without you.
  86. All right guys, thanks very much for watching, and we'll see you in the next one.
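
The vote pipeline described in the interview (apps write votes to a queue, queue processors read them and lock a subreddit's listing before updating it) can be sketched roughly as follows. This is a minimal illustration, not reddit's actual code: the function name `process_votes`, the `(subreddit, delta)` vote shape, and the use of an in-process `queue.Queue` with threads in place of an AMQP queue and separate machines are all assumptions made for the example.

```python
import queue
import threading


def process_votes(votes, num_workers):
    """Drain `votes` with `num_workers` threads and return per-subreddit totals.

    `votes` is a list of (subreddit, delta) pairs. Both the name and the
    shape are hypothetical, chosen only for this sketch.
    """
    work = queue.Queue()
    for vote in votes:
        work.put(vote)

    # One lock and one counter per subreddit listing, created up front so
    # the worker threads never add keys to these dicts concurrently.
    subreddits = {subreddit for subreddit, _ in votes}
    listing_locks = {s: threading.Lock() for s in subreddits}
    totals = {s: 0 for s in subreddits}

    def worker():
        while True:
            try:
                subreddit, delta = work.get_nowait()
            except queue.Empty:
                return  # queue is drained, this processor is done
            # Every vote on the same subreddit serializes on this lock.
            # For a popular subreddit, extra workers mostly wait here.
            with listing_locks[subreddit]:
                totals[subreddit] += delta

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return totals


if __name__ == "__main__":
    votes = [("pics", 1)] * 1000 + [("python", 1)] * 10
    print(process_votes(votes, num_workers=4))
```

The totals come out the same for any number of workers. What changes is how much time the workers spend waiting on the lock for the popular listing, which is the effect described above: enough processors to keep up with the depth of the queue, but not so many that they mostly fight each other for the same lock.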