Pandas Vectorized Methods - Intro to Data Science

  • 0:02 - 0:06
    Pandas also allows you to operate on your data frame in a vectorized and item-by-item way.
  • 0:07 - 0:09
    What does it mean to operate on a data frame in a vectorized way?
  • 0:10 - 0:12
    Let's say we have the following data frame.
  • 0:12 - 0:15
    This data frame has 2 columns, "one" and "two".
  • 0:17 - 0:19
    And 4 rows, a, b, c, and d.
  • 0:21 - 0:23
    All of the values are integers.
  • 0:23 - 0:24
    We can call df.apply and
  • 0:24 - 0:27
    provide as the argument some arbitrary function.
  • 0:27 - 0:28
    In this case,
  • 0:28 - 0:33
    numpy.mean to perform that function on the vector that is every single column.
  • 0:33 - 0:36
    So when we call df.apply(numpy.mean),
  • 0:36 - 0:39
    what we get back is the mean of every single column in df.
  • 0:39 - 0:41
    This is itself a new pandas object, a Series with one mean per column.
  • 0:41 - 0:45
    There are also some operations that simply cannot be vectorized in this way.
  • 0:45 - 0:48
    That is, take a numpy array as their input and
  • 0:48 - 0:50
    then return another array or value.
  • 0:51 - 0:54
    We can also, in this case, call map on particular columns.
  • 0:54 - 0:56
    Or applymap on entire data frames.
  • 0:58 - 1:01
    These methods will accept functions that take in a single value, and
  • 1:01 - 1:03
    return a single value.
  • 1:03 - 1:08
    For example, let's say that we called df['one'].map with the function lambda x: x greater than or
  • 1:08 - 1:09
    equal to 1, written df['one'].map(lambda x: x >= 1).
  • 1:09 - 1:13
    What this does is go through every single value in the "one" column, and
  • 1:13 - 1:17
    evaluates whether or not that value is greater than or equal to 1.
  • 1:17 - 1:24
    If we were to call df.applymap(lambda x: x >= 1),
  • 1:24 - 1:28
    the same function is evaluated over every single value in the data frame,
  • 1:28 - 1:30
    as opposed to just the "one" column.
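
A minimal sketch of the three calls described in the transcript, assuming pandas and NumPy are installed. The cell values are made up for illustration (the video only says they are integers), and the name elementwise is a hypothetical helper, not something from the video. In pandas, df.apply(numpy.mean) comes back as a Series with one entry per column. Recent pandas releases also rename applymap to DataFrame.map, so the sketch uses whichever one is available.

```python
import numpy as np
import pandas as pd

# Hypothetical integer values; the transcript only fixes the labels:
# columns "one" and "two", rows a, b, c and d.
df = pd.DataFrame(
    {"one": [1, 2, 3, 4], "two": [0, 1, 2, 3]},
    index=["a", "b", "c", "d"],
)

# Vectorized: numpy.mean is handed each column as a whole vector.
column_means = df.apply(np.mean)

# Item by item on a single column: the lambda sees one value at a time.
one_flags = df["one"].map(lambda x: x >= 1)

# Item by item on the whole frame. Newer pandas calls this DataFrame.map
# and deprecates applymap, so pick the method that exists (assumption:
# one of the two is present in the installed version).
elementwise = df.map if hasattr(pd.DataFrame, "map") else df.applymap
all_flags = elementwise(lambda x: x >= 1)

print(column_means)
print(one_flags)
print(all_flags)
```

With these sample values, every entry of the "one" column passes the test, while row a of the "two" column (value 0) does not.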
Title:
Pandas Vectorized Methods - Intro to Data Science
Video Language:
English
Team:
Udacity
Project:
ud359: Intro to Data Science
Duration:
01:32

English (United States) subtitles
