English subtitles

Pandas - Intro to Data Science

Showing Revision 5 created 05/24/2016 by Udacity Robot.

  1. Now that we know a little bit about manipulating data,
  2. why don't we talk about how we'll store and reference it using Pandas.
  3. Data in Pandas is often contained in a structure called a data frame.
  4. A data frame is a two-dimensional, labeled data structure whose columns can
  5. each hold a different type if necessary,
  6. for example string, int, float, or Boolean.
  7. You can think of a data frame as being similar to an Excel spreadsheet.
  8. We'll talk about making data frames in a second.
  9. For now, here's what an example data frame might look like,
  10. using data describing passengers on the Titanic and whether or
  11. not they survived the Titanic's tragic collision with an iceberg.
  12. You'll be using this very data set for project number one.
  13. Note that there are several different columns:
  14. name, age, fare, and survived.
  15. Note, too, that these columns have different data types.
  16. Age is all integers.
  17. Survived is all Boolean, et cetera.
  18. There are also some not-a-number (NaN) entries.
  19. This is what appears when we don't specify a value.
  20. How would we go about making this data frame?
  21. First, I'll create a Python dictionary called d where each key is the name of
  22. one of my columns and the corresponding value is a Pandas Series, to which I
  23. first pass an array with the values for the actual data frame and
  24. then an array of indices where I want those values to go.
  25. And notice that in the case of fare, where there is a not-a-number value, I only
  26. provide three actual values, along with the three corresponding indices.
  27. Once I've created this dictionary, I can pass it as an argument to
  28. the DataFrame constructor to create my actual data frame.
  29. Here I'll call that data frame df.
  30. You'll see that the data frame we've printed here matches the one that we had on
  31. the tablet earlier in this lesson.
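
The steps described in the transcript can be sketched roughly as follows. This is a minimal illustration, not the exact code shown in the video: the passenger names, ages, fares, and the index labels 'a' through 'd' are placeholder values assumed for the example. Only the dictionary name d, the data frame name df, and the column names come from the transcript.

```python
import pandas as pd

# Each key is a column name; each value is a Series built from an array of
# values plus an array of index labels saying where those values go.
# All values and the labels 'a'..'d' are illustrative assumptions.
labels = ['a', 'b', 'c', 'd']
d = {
    'name': pd.Series(['Braund', 'Cummings', 'Heikkinen', 'Allen'], index=labels),
    'age': pd.Series([22, 38, 26, 35], index=labels),
    # Only three fares are given, with three matching labels; row 'c' is left
    # unspecified, so it shows up as NaN in the data frame.
    'fare': pd.Series([7.25, 71.83, 8.05], index=['a', 'b', 'd']),
    'survived': pd.Series([False, True, True, False], index=labels),
}

# Pass the dictionary to the DataFrame constructor.
df = pd.DataFrame(d)

print(df)
print(df.dtypes)  # each column keeps its own data type
```

Because the fare Series has no entry for label 'c', Pandas aligns the columns on their index labels and fills the gap with NaN rather than raising an error.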