
Geospatial Indexes - Data Wrangling with MongoDB


Showing Revision 2 created 05/25/2016 by Udacity Robot.

  1. Okay, so now I'd like to talk about another type
  2. of index. Geospatial indexes, in particular. Support for geospatial indexes in
  3. MongoDB gives us the ability to perform queries for locations
  4. near another location. So, the type of thing that a lot
  5. of phone apps will do when you're, say, looking for
  6. a nearby cafe. Now the type of geospatial index that we're
  7. going to talk about is 2D geospatial indexes. But MongoDB
  8. does also provide support for spherical geospatial indexes, that is
  9. those that take into account the curvature of the earth.
  10. But I'll direct you to the documentation and our courses
  11. at MongoDB University if you're interested in learning more about
  12. those. So, with 2D geospatial indexes, we're essentially thinking of our
  13. data as all lying on a Cartesian plane, with values
  14. in the x direction and the y direction. So, in these
  15. situations we have a query location of some kind, and
  16. what we want to find in response to queries are other
  17. items or documents that are close to this query location.
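
As a rough illustration of what "close to this query location" means on a flat Cartesian plane, here is a minimal Python sketch. It is not how MongoDB implements its index; it scans a small in-memory list and sorts by straight-line distance. The `nearest` helper, the `location` field name, and the sample places are all hypothetical.

```python
import math

def nearest(query, documents, limit=3):
    """Return up to `limit` documents sorted by planar distance to `query`."""
    def distance(doc):
        x, y = doc["location"]  # location is stored as [x, y]
        return math.hypot(x - query[0], y - query[1])
    return sorted(documents, key=distance)[:limit]

# Hypothetical sample documents, each holding an [x, y] location.
places = [
    {"name": "cafe", "location": [1.0, 1.0]},
    {"name": "library", "location": [5.0, 5.0]},
    {"name": "bakery", "location": [2.0, 0.0]},
]

closest = nearest([0.0, 0.0], places, limit=2)
print([p["name"] for p in closest])  # ['cafe', 'bakery']
```

A geospatial index exists so that the database can answer this kind of question without scanning every document.
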
  18. So there are essentially three things that you need to know about
  19. in order to construct a geospatial index in MongoDB. The
  20. first of those is that you need a field that holds
  21. a location, and the location needs to be stored as
  22. an array with first an x value and then a y
  23. value. Now you can name this field whatever you want. I've
  24. just chosen the name location here. It could be loc or
  25. position, point, anything you like, but it does need
  26. to follow this form of having the x value first
  27. and then the y. The second thing you need to
  28. know about is that you need to ensure there's an
  29. index: you need to call ensureIndex and create an
  30. index on this particular field. So in this case I
  31. would need to call ensureIndex, specifying location as the
  32. field, and I would need to specify a direction here.
  33. We'll take a look at a specific example of that
  34. in just a bit. So again, we need to create an
  35. index now on this field that we have for our
  36. documents that we want to use in geospatial queries. And finally
  37. the way we do queries against the geospatial index, is
  38. through the use of the $near operator. So it's these three
  39. steps in combination that allow us to do something like this,
  40. and get all of the documents that have a location near
  41. this one. So let's take a look at this in some
  42. code, and then we'll do an example query in the Mongo
  43. shell. So this is a script that I actually retrieved from
  44. the OpenStreetMap folks. This is a script that they wrote
  45. for putting OSM data into MongoDB. So you can see here
  46. that it's going to do iterative parsing of our OSM data,
  47. just like we did in a previous example back in lesson
  48. three. Now, let's take a look a little bit further down first,
  49. because I want to show you the location
  50. field here. So, for every node in this file,
  51. this script builds a value called loc and it
  52. is composed of the latitude value and the longitude
  53. value for a node element in the XML file.
  54. Then it adds that field to the record that
  55. it's building up as it moves through the node
  56. that it's creating a document for, okay? And then,
  57. finally, it will end up creating a document
  58. in MongoDB by doing an insert at some point.
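
The record-building step just described might look roughly like the following Python sketch. This is not the actual OpenStreetMap script; the `node_to_record` helper and the sample attribute values are hypothetical. The coordinate order (latitude, then longitude) follows the description above; whichever order is chosen, later queries have to use the same order.

```python
def node_to_record(attrs):
    """Build a document for one OSM <node> element from its XML attributes."""
    record = {
        "id": int(attrs["id"]),
        # Location stored as a two-element array, as the 2D index expects.
        "loc": [float(attrs["lat"]), float(attrs["lon"])],
    }
    return record

# Hypothetical attribute values for a single node in the Chicago area.
sample_attrs = {"id": "1001", "lat": "41.9484", "lon": "-87.6553"}
record = node_to_record(sample_attrs)
print(record)
```
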
  59. Okay? Now, using that location field, at the
  60. very top of this code, ensure_index is called. And
  61. ensure_index is called using that location field, which
  62. stores the xy coordinates for nodes that are parsed
  63. out of the OSM XML file. Now, the syntax
  64. for ensure_index is slightly different in pymongo. It matches
  65. the syntax for the language here, which of course is
  66. Python. Okay? But then also, rather than passing a dictionary,
  67. what we pass instead is a list of tuples. So,
  68. we pass the name of the field that we want
  69. to create an index on, as well as a direction.
  70. So these are constant values that are available to us
  71. on pymongo. So we're not simply typing strings here, which
  72. is potentially very error-prone. Okay? And you can see that
  73. ensure_index is actually being used here to
  74. create several indexes, here's another example where we're creating an
  75. index on id and we're specifying that we want
  76. that index to be created in ascending order. Okay? So,
  77. technically speaking, GEO2D is a direction argument for ensure_index.
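
A minimal sketch of the list-of-tuples syntax follows. It assumes current PyMongo, where `create_index` takes the place of the older `ensure_index` used in the script. So that the sketch runs without a database, it copies the values of `pymongo.GEO2D` and `pymongo.ASCENDING` into local constants and uses a hypothetical stand-in collection that only records the calls; with a real PyMongo collection you would pass that instead.

```python
GEO2D = "2d"     # same value as pymongo.GEO2D
ASCENDING = 1    # same value as pymongo.ASCENDING

def create_node_indexes(collection):
    """Create a 2D geospatial index on loc and an ascending index on id."""
    # Each index is a list of (field name, direction) tuples.
    collection.create_index([("loc", GEO2D)])
    collection.create_index([("id", ASCENDING)])

class RecordingCollection:
    """Hypothetical stand-in that records create_index calls."""
    def __init__(self):
        self.calls = []

    def create_index(self, keys):
        self.calls.append(keys)

fake = RecordingCollection()
create_node_indexes(fake)
print(fake.calls)  # [[('loc', '2d')], [('id', 1)]]
```
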
  78. And the reason why it makes sense to think about
  79. this as a direction argument, is because queries using
  80. the near operator will always return documents sorted by those
  81. that are nearest to the query location. Okay, so now
  82. let's take a look at an actual query, and bear in
  83. mind that the query we're going to look at in the Mongo
  84. shell is a query against the collection that we created using
  85. this script. This is exactly the script that I used to
  86. create this collection and store documents in it within MongoDB. Okay
  87. so let's take a look at a query here. Now what I
  88. need to do is make sure that I'm using the OSM
  89. database, and then I'm going to query
  90. the nodes collection. The script we just looked
  91. at actually creates several collections, one of which
  92. is nodes, and these are based on the node
  93. tags that appear in the OSM dataset. Now
  94. again, we're using the Chicago OSM dataset just for
  95. point of clarification. Okay, so what I'm going
  96. to do here then is query this particular collection,
  97. and note that I'm querying against the location field. And
  98. I'm using the $near operator, okay? And I'm specifying a
  99. set of coordinates here. Okay? So, this is the
  100. type of thing that an application might do in order
  101. to find restaurants or other amenities near the location
  102. of a user making the query. Okay, this is how
  103. we might do this sort of thing in the
  104. back-end of an application that supports that type of functionality.
  105. Now, I'm doing one other thing here, and this is really
  106. just for the purposes of this example. You'll remember from a little earlier in
  107. the lesson, when we looked at the tags that get applied to nodes,
  108. that there is this tg field in this collection. Okay? And just so
  109. that the data we get back is a little bit more interesting,
  110. I'm just ensuring that there is a tg field, that it actually exists.
  111. Because then we'll have some data that has some names and other sort
  112. of tagging associated with it. So, we can kind of figure out what's
  113. there, near this particular location. Now, this location happens
  114. to be quite close to Wrigley Field. So, what we're
  115. going to get are a number of restaurants, cafes, convenience
  116. stores, that sort of thing in that neighborhood, so imagine
  117. you've just walked out of Wrigley Field and you're looking
  118. to see what's nearby on your phone. This is a
  119. type of query we might do in the backend of
  120. an application to support this sort of thing. Okay. So here's
  121. our initial set of results. We could get more by typing it here in the shell.
  122. Right? We can see there's a Jamba Juice.
  123. There's a school, church, convenience store in this
  124. case, happens to be a Walgreens, Domino's Pizza,
  125. and a Dunkin' Donuts. Okay, so that's pretty
  126. much what you need to know in order
  127. to build geospatial indexes in MongoDB. We'll take a
  128. look at using geospatial indexes in the case study in the next lesson.
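
To tie the query walkthrough together, here is a hedged Python sketch of the query document used in the shell: a `$near` condition on the location field plus an `$exists` check on the `tg` field. The `near_query` helper is hypothetical, the field name `loc` is taken from the script discussed earlier, and the coordinates are only an approximate point near Wrigley Field, not the exact values used in the lesson.

```python
def near_query(x, y, required_field=None):
    """Build a query for documents whose loc is near the point (x, y)."""
    query = {"loc": {"$near": [x, y]}}
    if required_field is not None:
        # Only match documents where this field is present.
        query[required_field] = {"$exists": True}
    return query

# Approximate coordinates near Wrigley Field, in the same
# (latitude, longitude) order the script used when storing loc.
query = near_query(41.9484, -87.6553, required_field="tg")
print(query)
```

With PyMongo, a dictionary like this could be passed to `find` on the nodes collection, provided a 2D index exists on `loc`. Results come back ordered from nearest to farthest.
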