Project Operator - Data Wrangling with MongoDB

Showing Revision 2 created 05/24/2016 by Udacity Robot.

  1. Okay so now I want to pick this up
  2. and talk about the project operator. So, a couple of
  3. things to note about project. We can use project to
  4. include fields from the original document. Remember that project works
  5. with data in a single document at a time. And
  6. we're essentially doing a shaping task with project. So the
  7. simplest form of shaping is simply specifying which fields from
  8. each of the documents we're receiving in the stage using
  9. project, we would like to include and pass along
  10. to the next stage. One really cool thing we can
  11. do with project is insert computed fields. So, for
  12. example, a ratio, which is what we're going to do
  13. for this particular example we're working through. We might
  14. also rename fields. And finally, we can do some pretty
  15. substantial reshaping of the data, by doing something like
  16. creating fields that hold subdocuments that are composed of what
  17. were originally top level fields in the documents,
  18. as they came into the stage using the project
  19. operator. So let's go back to our code,
  20. and look at how we're using project here. Remember,
  21. the problem we're trying to solve here is
  22. answering the question: who has the highest followers to
  23. friends ratio? So here, it's pretty straightforward. We're
  24. simply pulling out the screen_name field of the
  25. user sub-document. Okay? And again, we use this
  26. $ here because, rather than this being interpreted
  27. as just a string literal, we're telling MongoDB
  28. that we want the value, for each document,
  29. that's found in the user sub-document, in
  30. the screen_name field. Okay? So, documents
  31. that get passed along from this particular stage
  32. will have a screen_name field holding
  33. that value for each document that we received
  34. as input here. Okay, now let's look at this
  35. portion of the project stage here. So we're going
  36. to create a ratio field in documents that come
  37. out of this particular stage. And that ratio
  38. field is going to have the value of having
  39. divided the followers count by the friends count, so
  40. quite literally, calculating this ratio here. Again, remember friends,
  41. in the documents we're looking at here, are the
  42. number of people that I follow as opposed to
  43. people who follow me. Okay, so again, diving into
  44. the user sub-document, we're going to pull out the
  45. followers count and the friends count. Again making use
  46. of the dollar operator here to indicate we actually
  47. want the values of each of these. And then
  48. we're going to use the divide operator to calculate the
  49. ratio of these two values. It's that value
  50. then, that will make up the value of the
  51. ratio field in each document that gets passed
  52. along from this stage using the project operator. Okay,
  53. so when we get to this stage all
  54. documents will have exactly two fields: ratio and screen
  55. name. And then we're simply going to sort in
  56. descending order based on ratio, and then of course,
  57. limit to just the very first document that
  58. we see. So let's run this. Okay, and again,
  59. our output from an aggregation query is always
  60. a single document. The results we're really interested in
  61. are always in the result field, which is an array-valued field. Okay? So,
  62. in this case, the user in our tweets collection that has the highest
  63. followers to friends ratio is a user called Twitterific.
  64. Okay, turns out this is actually a Twitter application,
  65. and even today if you look at Twitterific's page on Twitter,
  66. you'll see that they have something on the order of nearly
  67. half a million followers. But they only follow about 14 people,
  68. so their followers to friends count ratio is still extremely high.
  69. So again, in this example, we focused on the $match
  70. operator, which is just a filter; the $project operator, which is
  71. a shaping operator where we actually have lots of things that we
  72. can do; and the $sort and $limit operators. So
  73. in this case we've got four stages of our pipeline.
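The four stages described above can be sketched as a pipeline definition in Python. This is a minimal sketch rather than the exact code shown in the video: the `$match` conditions (both counts greater than zero) are an assumption, the field names follow the Twitter user sub-document mentioned in the transcript, and the `emulate` helper and sample documents are hypothetical, included only so the effect of each stage can be seen without a running MongoDB server.

```python
# Pipeline stages as plain dictionaries, in the form PyMongo's
# collection.aggregate(pipeline) accepts.
pipeline = [
    # Assumed filter: skip users where either count is zero,
    # which also avoids dividing by zero in $project.
    {"$match": {"user.friends_count": {"$gt": 0},
                "user.followers_count": {"$gt": 0}}},
    # Shape each document: one computed field, one renamed field.
    {"$project": {"ratio": {"$divide": ["$user.followers_count",
                                        "$user.friends_count"]},
                  "screen_name": "$user.screen_name"}},
    {"$sort": {"ratio": -1}},   # descending by ratio
    {"$limit": 1},              # keep only the top document
]


def emulate(docs):
    """Hypothetical pure-Python stand-in for the pipeline above."""
    matched = [d for d in docs
               if d["user"]["friends_count"] > 0
               and d["user"]["followers_count"] > 0]
    projected = [
        {"ratio": d["user"]["followers_count"] / d["user"]["friends_count"],
         "screen_name": d["user"]["screen_name"]}
        for d in matched
    ]
    projected.sort(key=lambda d: d["ratio"], reverse=True)
    return projected[:1]


# Made-up sample documents, not taken from the course data set.
sample = [
    {"user": {"screen_name": "alice", "followers_count": 300, "friends_count": 100}},
    {"user": {"screen_name": "bob", "followers_count": 50, "friends_count": 0}},
    {"user": {"screen_name": "carol", "followers_count": 900, "friends_count": 10}},
]

top = emulate(sample)
print(top)  # [{'ratio': 90.0, 'screen_name': 'carol'}]
```

Against a real collection, the same `pipeline` list would be passed to `aggregate`. Note that MongoDB also passes `_id` through `$project` by default unless it is excluded with `"_id": 0`.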
  74. Now, one thing before I wrap this up: I
  75. just want to quickly point out that we can build a
  76. variety of expressions using the project operator. If we take
  77. a look at the MongoDB documentation, there are a number
  78. of arithmetic operators that we can apply, as well as
  79. a number of string operators, date operators and so on.
  80. So this is the aggregation expression operators page in the docs. I encourage
  81. you to look here for more information on the different types of operators
  82. that are available to you when working with project, as well as the
  83. other aggregation framework operators. See the instructor
  84. notes for a link to this page.
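As a rough illustration of the other kinds of expressions mentioned above, here is a sketch of a `$project` stage that combines an arithmetic operator (`$add`), a string operator (`$toUpper`), and the reshaping of existing fields into a new subdocument. The output field names (`name_upper`, `total`, `counts`) are hypothetical and chosen only for this example; the input field names assume the same Twitter user sub-document used in the lesson.

```python
# A $project stage expressed as a plain dictionary.
project_stage = {
    "$project": {
        # Suppress the _id field, which is otherwise passed along by default.
        "_id": 0,
        # String operator: upper-case the screen name.
        "name_upper": {"$toUpper": "$user.screen_name"},
        # Arithmetic operator: sum of the two counts.
        "total": {"$add": ["$user.followers_count",
                           "$user.friends_count"]},
        # Reshaping: build a new subdocument from existing fields.
        "counts": {"followers": "$user.followers_count",
                   "friends": "$user.friends_count"},
    }
}

print(sorted(project_stage["$project"]))
```

The `$` prefix on each path plays the same role as in the ratio example: it asks for the value of the field rather than a string literal.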