English subtitles

Dataframe Columns - Intro to Data Science

Showing Revision 6 created 05/25/2016 by Udacity Robot.

  1. Now that we know how to create a dataframe,
  2. why don't we talk about how we can access the data.
  3. We can operate on specific columns by referring to them as if they were keys in
  4. a dictionary.
  5. For example, if we wanted just the name column of this dataframe,
  6. I could simply type df['name'].
  7. I could also grab more than one column by passing in a list of
  8. column names as opposed to just one column name.
  9. For example, say I wanted the name and age columns.
  10. I could say df[['name', 'age']].
  11. I can also call on specific rows by using the dataframe object's loc
  12. indexer and passing the row index as the label.
  13. For example, if I only wanted the row corresponding to passenger Braund,
  14. whose index is a, I could simply say df.loc['a'].
  15. We can also use true/false statements regarding columns of the dataframe to
  16. subset the dataframe.
  17. For example let's say I wanted rows of this dataframe only
  18. where the passenger age was greater than or equal to 30.
  19. I could simply say, df[df['age'] >= 30].
  20. You can see here that I've only picked out rows b and d,
  21. which were the rows where our passenger is in fact at least 30 years old.
  22. This ability to subset our dataframe based on true/false statements in
  23. the index is not limited to entire rows.
  24. I can also perform this operation on particular columns.
  25. For example, let's say I only wanted the survived information for
  26. these two rows, b and d.
  27. I can simply say, df['survived'][df['age'] >= 30].
  28. Let's pick apart what this statement is
  29. doing since it's a little bit complicated.
  30. First, df['survived'] is going to
  31. pick out only the survived column of our dataframe.
  32. This section here says,
  33. I basically only want the indices where df['age'] is greater than or equal to 30.
  34. Then I say, of this array of values, give me only the values where
  35. the indices are equal to the indices where this statement is true.
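
The operations described in the transcript can be sketched in pandas roughly as follows. This is a minimal sketch, not the dataframe shown in the video: the transcript only confirms the index labels a through d, the name, age, and survived columns, and that passenger Braund sits at index a. Every other value below (the remaining names, all ages, and the survived flags) is a hypothetical placeholder chosen so that rows b and d are the ones with age of at least 30.

```python
import pandas as pd

# Hypothetical data. Only the index labels, the column names, and
# "Braund" at index "a" come from the transcript; the rest is made up.
df = pd.DataFrame(
    {
        "name": ["Braund", "Cummings", "Heikkinen", "Allen"],
        "age": [22, 38, 26, 35],
        "survived": [False, True, True, False],
    },
    index=["a", "b", "c", "d"],
)

# One column, accessed like a dictionary key: returns a Series.
names = df["name"]

# Several columns, by passing a list of column names: returns a DataFrame.
name_age = df[["name", "age"]]

# One row, by index label, through the loc indexer: returns a Series.
row_a = df.loc["a"]

# A true/false statement about a column yields a boolean Series,
# with one True or False per row label.
mask = df["age"] >= 30

# Indexing the dataframe with that boolean Series keeps the True rows.
at_least_30 = df[mask]

# The same mask applied to a single column keeps only that column's
# values at the labels where the mask is True.
survived_at_least_30 = df["survived"][mask]

print(at_least_30)
print(survived_at_least_30)
```

An equivalent way to write the last step is df.loc[df["age"] >= 30, "survived"], which selects the rows and the column in a single indexing operation.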