
Pandas - Intro to Data Science

  • 0:01 - 0:03
    Now that we know a little bit about manipulating data,
  • 0:03 - 0:06
    why don't we talk about how we'll store and reference it using Pandas.
  • 0:07 - 0:12
    Data in Pandas is often contained in a structure called a data frame.
  • 0:12 - 0:16
    A data frame is a two-dimensional labeled data structure with columns which can
  • 0:16 - 0:18
    be different types if necessary.
  • 0:18 - 0:25
    For example, types like string, int, float, or Boolean.
  • 0:26 - 0:30
    You can think of a data frame as being similar to an Excel spreadsheet.
  • 0:30 - 0:32
    We'll talk about making data frames in a second.
  • 0:32 - 0:35
    For now, here's what an example data frame might look like,
  • 0:35 - 0:39
    using data describing passengers on the Titanic and whether or
  • 0:39 - 0:41
    not they survived the Titanic's tragic collision with an iceberg.
  • 0:43 - 0:45
    You'll be using this very data set for project number one.
  • 0:45 - 0:48
    Note that there are numerous different columns.
  • 0:48 - 0:51
    Name, age, fare, and survived.
  • 0:51 - 0:54
    And that these columns all have different data types.
  • 0:54 - 0:55
    Age is all integers.
  • 0:55 - 0:57
    Survived is all Boolean, et cetera.
  • 0:58 - 1:01
    There are also some not a number entries.
  • 1:01 - 1:03
    This is what happens when we don't specify a value.
  • 1:03 - 1:06
    How would we go about making this data frame?
  • 1:06 - 1:10
    First, I'll create a Python dictionary called d where each key is the name of
  • 1:10 - 1:14
    one of my columns and the corresponding value is a Pandas Series where I
  • 1:14 - 1:18
    first pass in an array with the values for the actual data frame and
  • 1:18 - 1:20
    then an array of indexes where I want those values to go.
  • 1:22 - 1:27
    And notice that in the case of fare where there is a not a number value, I only
  • 1:27 - 1:31
    provide three actual values, but then I provide the three corresponding indices.
  • 1:31 - 1:34
    Once I've created this dictionary, I can pass it as an argument to
  • 1:34 - 1:37
    the DataFrame function to create my actual data frame.
  • 1:37 - 1:38
    Here I'll call that data frame df.
  • 1:38 - 1:42
    You'll see that the data frame we've printed here matches the one that we had on
  • 1:42 - 1:44
    the tablet earlier in this lesson.
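
The construction described above can be sketched roughly as follows. This is a minimal illustration, not the code shown in the video: the passenger values and the index labels "a" through "d" are made-up placeholders, and only the column names (name, age, fare, survived) come from the transcript. The fare Series is given three values and three index labels, so the remaining row is filled with NaN.

```python
import pandas as pd

# Illustrative placeholder values only; they are not taken from the real data set.
labels = ["a", "b", "c", "d"]

# Each key is a column name; each value is a Series built from
# an array of values plus an array of index labels saying where they go.
d = {
    "name": pd.Series(["Passenger 1", "Passenger 2", "Passenger 3", "Passenger 4"], index=labels),
    "age": pd.Series([22, 38, 26, 35], index=labels),
    # Only three values and three labels: row "c" gets no fare, so it becomes NaN.
    "fare": pd.Series([7.25, 71.83, 8.05], index=["a", "b", "d"]),
    "survived": pd.Series([False, True, True, False], index=labels),
}

# Passing the dictionary to DataFrame aligns the Series on their index labels.
df = pd.DataFrame(d)

print(df)
print(df.dtypes)  # each column keeps its own data type
```

Printing df.dtypes is one way to check that the columns hold different types, as the transcript notes for age and survived.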
Video Language:
English
Team:
Udacity
Project:
ud359: Intro to Data Science
Duration:
01:44
Udacity Robot edited English (US) subtitles for 01-16 Pandas
Cogi-Admin edited English (US) subtitles for 01-16 Pandas
