Pandas also allows you to operate on your data frame in a vectorized and item-by-item way.
What does it mean to operate on a data frame in a vectorized way?
Let's say we have the following data frame.
This data frame has two columns, "one" and "two",
and four rows, a, b, c, and d.
All of the values are integers.
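A data frame of this shape could be built roughly as follows. The column names "one" and "two" and the specific integer values are assumptions made for illustration; only the shape, the row labels, and the fact that the values are integers come from the description above.

```python
import pandas as pd

# Hypothetical values: the lecture only states that every entry is an integer.
df = pd.DataFrame(
    {"one": [0, 1, 2, 3], "two": [4, 5, 6, 7]},
    index=["a", "b", "c", "d"],
)
print(df)
```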
We can call DataFrame.apply and
provide as the argument some arbitrary function,
in this case numpy.mean,
to perform that function on the vector that is every single column.
So when we call df.apply(numpy.mean),
what we get back is the mean of every single column in df.
This is itself a new pandas object, a Series with one entry per column.
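As a minimal sketch of that call, assuming a small data frame with hypothetical integer values and columns named "one" and "two": in current pandas the result of this call is a Series indexed by column name.

```python
import numpy as np
import pandas as pd

# Hypothetical example data, not taken from the lecture.
df = pd.DataFrame(
    {"one": [0, 1, 2, 3], "two": [4, 5, 6, 7]},
    index=["a", "b", "c", "d"],
)

# By default (axis=0), apply hands each column to the function as a whole vector.
column_means = df.apply(np.mean)
print(column_means)  # one -> 1.5, two -> 5.5
```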
There are also some operations that simply cannot be vectorized in this way,
that is, take a numpy array as their input and
then return another array or value.
In this case, we can also call map on particular columns,
or applymap on entire data frames.
These methods will accept functions that take in a single value, and
return a single value.
For example, let's say that we called df['one'].map(lambda x: x >= 1).
What this does is go through every single value in the "one" column and
evaluate whether or not that value is greater than or equal to 1.
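Here is a small sketch of mapping over a single column. The data values are hypothetical and the column name "one" is assumed; the function receives one value at a time and returns one value.

```python
import pandas as pd

# Hypothetical example data, not taken from the lecture.
df = pd.DataFrame(
    {"one": [0, 1, 2, 3], "two": [4, 5, 6, 7]},
    index=["a", "b", "c", "d"],
)

# Series.map calls the lambda once per value in the "one" column.
at_least_one = df["one"].map(lambda x: x >= 1)
print(at_least_one)  # a -> False, b -> True, c -> True, d -> True
```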
If we were to call df.applymap(lambda x: x >= 1),
the same function is evaluated over every single value in the data frame,
as opposed to just the "one" column.
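The element-wise call over the whole data frame might look like the sketch below, again with hypothetical values. One caveat worth knowing: pandas 2.1 deprecated applymap in favor of DataFrame.map, so this sketch uses whichever of the two the installed version provides.

```python
import pandas as pd

# Hypothetical example data, not taken from the lecture.
df = pd.DataFrame(
    {"one": [0, 1, 2, 3], "two": [4, 5, 6, 7]},
    index=["a", "b", "c", "d"],
)

# Prefer DataFrame.map (pandas >= 2.1); fall back to applymap on older versions.
elementwise = df.map if hasattr(df, "map") else df.applymap

# The lambda is evaluated on every single value in both columns.
mask = elementwise(lambda x: x >= 1)
print(mask)
```

The result has the same shape and labels as df, with a boolean in each cell.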