English subtitles

Flexible Schema - Data Wrangling with MongoDB

Showing Revision 2 created 05/24/2016 by Udacity Robot.

  1. As is so often the case with our data, some entries
  2. or documents will have fields that others do not. In
  3. any project, a data model usually goes through several iterations.
  4. MongoDB was designed to address both of these issues by providing
  5. a flexible schema that deals well with both individual documents
  6. that vary in the fields they contain and
  7. the schema for our entire collection needing to change. Let's
  8. take a look at person info box data, as an example.
  9. Now for nearly everyone, it probably makes sense to
  10. include fields for birth and death dates. Maybe nationality and
  11. even profession, but not everyone will have held office, and
  12. not everyone is associated with a political party. And even
  13. if we're talking about people who are not famous,
  14. some people will have spouses. Some will have more than
  15. one. And others won't. Some will have children and some
  16. will not. Leaving aside the question of what is the
  17. right data model for person data, in MongoDB we can
  18. represent each person using the fields that are appropriate to them
  19. even if some person documents contain fields that others don't have.
  20. MongoDB's indexing system and query execution system take this into account.
  21. So we can query a people collection for people with
  22. two or more children, and it will work as expected, retrieving
  23. only data for people that have two or more entries in
  24. the array that serves as the value for the children field,
  25. ignoring documents with fewer as well as documents that don't
  26. have a children field at all. What this also means is
  27. that it's easy to evolve your schema as new needs emerge
  28. or more data becomes available. It's a simple matter to begin
  29. adding documents to a collection that have new fields you now
  30. want to track or to change the way you model existing
  31. fields. As an example of this, let's take a look at
  32. the DBpedia page that describes the data sets that are available.
  33. Now, if I scroll down this page, and I've already
  34. done that here, we have an example for city data.
  35. So, so far we've looked at automobile data, person data
  36. and now, city data, from the info-box data set. And, I'm
  37. showing this example to illustrate the fact that the schema
  38. for city info boxes has evolved. We can see that
  39. by comparing this old example for Innsbruck to
  40. the new example for Innsbruck. If you look through this
  41. data, there are a couple of places where
  42. the data has changed. There's no mayor listed here
  43. as there is here, and while there's no time
  44. zone listed in the old form of the data,
  45. the time zones for central and daylight
  46. savings time are listed here. Small differences, but these
  47. are the type of subtle changes that we would
  48. expect to see in a schema that is evolving.
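
The query for people with two or more children described earlier could be written roughly as follows. This is only a sketch under some assumptions: the person documents, names, and field values are invented for illustration, and the filter uses one common idiom (requiring that index 1 of the children array exists), which may not be the exact query used in the course. The pymongo call is wrapped in a function that is not run here, since it needs a live server; a small pure-Python check approximates the same behavior on the sample documents.

```python
# Hypothetical person documents; every name and value here is invented.
# Note that the documents do not all share the same fields.
people = [
    {"name": "Person A", "children": ["Child 1", "Child 2", "Child 3"]},
    {"name": "Person B", "children": ["Child 4"], "party": "Example Party"},
    {"name": "Person C", "spouse": ["Spouse 1"]},  # no children field at all
    {"name": "Person D", "children": ["Child 5", "Child 6"], "office": "Example Office"},
]

# "children.1" addresses index 1 of the children array, so requiring that it
# exists matches only arrays with at least two entries.
TWO_OR_MORE_CHILDREN = {"children.1": {"$exists": True}}


def find_with_pymongo(collection):
    """Run the filter on a pymongo collection (needs a running server; not called here)."""
    return list(collection.find(TWO_OR_MORE_CHILDREN))


def has_two_or_more_children(doc):
    """Pure-Python approximation of the filter for list-valued children fields."""
    children = doc.get("children")
    return isinstance(children, list) and len(children) >= 2


matches = [doc["name"] for doc in people if has_two_or_more_children(doc)]
print(matches)  # ['Person A', 'Person D']
```

Person B is skipped because the array has only one entry, and Person C is skipped because the children field is missing entirely, which mirrors the behavior described in the lecture.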