English subtitles

Parsing CSV Files - Data Wrangling with MongoDB

Showing Revision 2 created 05/24/2016 by Udacity Robot.

  1. Okay, it's finally time to do a little data wrangling
  2. for ourselves. We're going to look at parsing CSV files in Python.
  3. In this case, we're going to be reading the CSV data into
  4. our program and creating one dictionary for each item in that
  5. file. So you might ask yourself, why would we do something
  6. like this? Why not just open it in a spreadsheet application?
  7. One reason is that if the file is big, let's say tens
  8. or even hundreds of megabytes, opening it in a spreadsheet application
  9. like Excel can be slow, inefficient or maybe even
  10. impossible. Your app might do the software equivalent of
  11. this. Another reason we might want to programmatically process
  12. tabular data is that we might have a whole
  13. lot of files to process. So, doing it manually
  14. in the spreadsheet application simply isn't an option. Alright,
  15. let's take a look at the code provided. Here
  16. you can see, we have a parse file application.
  17. In this exercise, we're going to be working with
  18. the Beatles' discography data, one more time. You'll be
  19. working in the parse file function in the provided
  20. code. And, your assignment is to use the Python
  21. string method split to parse each row into a dictionary.
  22. For each dictionary, the names of the fields will
  23. serve as the keys and the values you find on a given row will serve as the values
  24. for those keys. You should produce an array of
  25. these dictionaries, one dictionary for each item, remember. And you
  26. should return that array from the parse file function.
  27. Now, one final instruction here is that rather than processing
  28. the entire file, you should only parse the first
  29. ten lines in this file. If you go beyond that,
  30. you run into trouble with this particular dataset. Since this
  31. is the first exercise we're looking at in this course,
  32. let me talk a little bit about this test
  33. function here. We're providing this as a means for you
  34. to test your implementation of parse file. This will run
  35. a little bit of code which calls the parse file
  36. function and samples the result that it gets back
  37. from parse file, checking to see if it has the
  38. expected values. When you actually submit your program, we'll be
  39. running some different test code, possibly on a different dataset.
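
The assignment described above could be sketched roughly as follows. This is a minimal illustration, not the course's provided code: the name parse_file, its max_rows parameter, and the write_sample helper with its made-up columns and rows are all assumptions, and "the first ten lines" is read here as ten data rows after the header.

```python
import os
import tempfile


def parse_file(path, max_rows=10):
    """Parse a comma-separated file into a list of dicts using str.split.

    Assumption: the first line is a header and max_rows counts data rows.
    """
    data = []
    with open(path, "r", encoding="utf-8") as f:
        # The header line supplies the field names, which become the keys.
        header = [name.strip() for name in f.readline().split(",")]
        for count, line in enumerate(f):
            if count >= max_rows:
                break  # stop after the first max_rows data rows
            values = [value.strip() for value in line.split(",")]
            # Pair each field name with the value in the same column.
            data.append(dict(zip(header, values)))
    return data


def write_sample(rows=12):
    """Write a made-up CSV (hypothetical data, not the course dataset)."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        f.write("Title,Released,Label\n")
        for i in range(1, rows + 1):
            f.write("Album {0},Year {0},Label {0}\n".format(i))
    return path


if __name__ == "__main__":
    sample = write_sample()
    try:
        for row in parse_file(sample):
            print(row)
    finally:
        os.remove(sample)
```

Note that splitting on commas does not handle quoted fields that themselves contain commas. Python's standard csv module (for example csv.DictReader) copes with those cases.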