English (United States) subtitles

Fundamentals of XML - Data Wrangling with MongoDB

Showing Revision 2 created 08/27/2016 by Udacity Robot.

  1. In case you're not terribly familiar with XML, let's spend
  2. a few minutes talking syntax. Even if you are familiar,
  3. it might make sense to follow along with this little
  4. review. So in XML, elements are the basic building blocks
  5. of an XML document. Now an XML element is composed
  6. of an open tag and a close tag. Now, this
  7. is some data drawn from the New York Times developer
  8. API. I encourage you to have a look at this site.
  9. We are going to look at some data from the Most
  10. Popular API. These are, for example, the articles that are most frequently
  11. emailed among readers of the New York Times. Okay, so
  12. let's look at a couple of examples here. So, the first
  13. thing that we might notice about this particular document is
  14. that we have some tags for num results or some elements
  15. that have to do with the number of results. So,
  16. this is actually a result set from having done a query to
  17. the Most Popular API, and we've got an
  18. element that tells us how many results were identified
  19. by our query. And then the list of results
  20. follows. Now this happens to be a single result here.
  21. And we can see that this result begins right
  22. here with this open tag and closes right here
  23. with this close tag. Okay. Now just as a
  24. couple of other examples of the data within this particular
  25. result, we can take a look at the byline, note that
  26. it's got a close tag as well. And among the
  27. other elements here, if you note the title, for example, this
  28. happens to be an article about bedbugs. Okay. So, this provides an
  29. example using some really nicely named tags. We know what these
  30. mean. Now, there's another aspect of XML that we need to concern
  31. ourselves with, especially given some of the exercises that we're going
  32. to have later on. And that has to do with attributes for
  33. XML elements. Now, this document provides a number of very
  34. nice examples of elements in XML. But what we don't
  35. have here are any examples of attributes for any of
  36. these elements being used. So what I'd like to do here
  37. is talk about essentially the two types of data that
  38. we're going to look at that have been encoded in XML. One
  39. is this more document-oriented type of XML, which is
  40. the type of data that XML was originally designed to encode.
  41. But then we can also take a look at
  42. something like this. Okay, now this is actual data from
  43. the OpenStreetMap project. This is a pretty closely
  44. zoomed-in view from OpenStreetMap of West Belmont
  45. Avenue, particularly the 1000 block. And you can see right
  46. here, there's a Giordano's restaurant. Giordano's is a famous
  47. pizza chain in Chicago. So, this is data that is
  48. essentially from a layer on top of that particular map.
  49. This is data that is human created. So, users
  50. of OpenStreetMap have actually added this data on
  51. top of the map data. And what I want to point out here is that this is very much
  52. not document-oriented. This is just data. Okay? And
  53. a lot of times when you see XML used in this
  54. way, you'll see that attributes are heavily used. So
  55. in this particular example, this is the node that represents
  56. the Giordano's restaurant. We can see that there are
  57. a number of attributes specified for this particular element.
  58. Common among them are the latitude and longitude attributes
  59. that this particular annotation applies to. So, essentially what
  60. this data element provides is a mapping from geographic
  61. coordinates to more common street address coordinates. Okay? So
  62. this is a good example of attributes in XML
  63. and there's one other thing that I want to point
  64. out here. And that is this type of tag here.
  65. Now in this particular data they're doing something that I probably
  66. wouldn't do, but it is the type of thing that
  67. you're going to see as a data scientist and likely already
  68. have. Essentially, they've just got a bunch of key value
  69. pairs that are encoded in something called a tag element. And,
  70. in this case, none of these tag elements have a
  71. close tag. Instead, they use this special XML syntax where you
  72. can simply create what are called empty tags, that
  73. is tags that don't have any content. All of
  74. the data for this type of tag is contained
  75. directly within its attributes. So the most-emailed example
  76. here provides us a nice example of document-oriented
  77. XML with lots of content inside the elements. And
  78. this particular example from the OpenStreetMap project provides us
  79. with the other end of this spectrum, which is very
  80. data-oriented XML, where all or almost all of the data
  81. is contained within attributes for the individual elements. In
  82. these types of cases, you often have mostly, or at least many,
  83. empty elements within the XML data that you are looking at.
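
To make the contrast between the two styles concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. Both XML snippets are hand-written stand-ins, not the actual data shown in the video: the element names, the byline, the id, and the latitude and longitude values are all assumptions chosen only to mirror the shape of what is described above. The helper names read_document_style and read_data_style are likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document-oriented sample: the data lives in element content,
# and every element has an open tag and a close tag.
DOC_XML = """
<result_set>
  <num_results>1</num_results>
  <results>
    <result>
      <title>An article about bedbugs</title>
      <byline>By A. Reporter</byline>
    </result>
  </results>
</result_set>
"""

# Hypothetical data-oriented sample: the data lives in attributes, and the
# key/value pairs use empty "tag" elements (no content, no close tag).
# The id, lat, and lon values are placeholders, not real coordinates.
DATA_XML = """
<node id="1" lat="41.94" lon="-87.65">
  <tag k="name" v="Giordano's"/>
  <tag k="amenity" v="restaurant"/>
</node>
"""


def read_document_style(xml_text):
    """Return one dict per <result>, mapping child tag name to its text."""
    root = ET.fromstring(xml_text)
    return [
        {child.tag: child.text for child in result}
        for result in root.iter("result")
    ]


def read_data_style(xml_text):
    """Return the node's attributes plus its k/v pairs from empty elements."""
    node = ET.fromstring(xml_text)
    record = dict(node.attrib)  # the attributes are the data here
    record["tags"] = {t.get("k"): t.get("v") for t in node.iter("tag")}
    return record


if __name__ == "__main__":
    print(read_document_style(DOC_XML))
    print(read_data_style(DATA_XML))
```

Note how the first function reads child.text, because the content sits between the open and close tags, while the second reads node.attrib and t.get(...), because the empty elements carry everything in their attributes.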