English subtitles

Data Formats - Intro to Hadoop and MapReduce

Showing Revision 4 created 05/24/2016 by Udacity Robot.

  1. By unstructured we mean that data arrives in lots of different formats.
  2. For example, a bank might have a list of your credit card
  3. and account transactions. They may also have scans of checks, records
  4. with customer service interactions. Maybe even recordings of those phone calls.
  5. All that data in a variety of different formats can be
  6. hard to store and reconcile in a traditional system. And this
  7. brings us back to volume. You want to store that data
  8. in its original format so you're not throwing any information away.
  9. That way you can then process the data later in
  10. different ways. For instance, if we transcribe a call center
  11. conversation into text we have what people said to customer
  12. service representatives. But if we had the actual recording as well
  13. then later we might develop software which can interpret the
  14. tone of voice the customer uses. And that might lead to
  15. a very different interpretation of the data. And the nice
  16. thing about Hadoop is that it doesn't care what format your
  17. data comes in. Unlike a traditional database, you can store the
  18. data in its raw format and manipulate it and reformat it later.