## Easy Imputation - Intro to Data Science

• 0:01 - 0:03
Let's first discuss what would seem to be the
• 0:03 - 0:05
easiest way to impute a missing value in our data
• 0:05 - 0:08
set. Just take the mean of our other data
• 0:08 - 0:13
points and fill in the missing values. So, for example,
• 0:13 - 0:16
let's say that Ichiro Suzuki and Babe Ruth are
• 0:16 - 0:20
missing values for weight in our baseball data set. Well,
• 0:20 - 0:23
okay, no problem. We can just take the mean
• 0:23 - 0:26
of all other players' weights and assign that value to
• 0:26 - 0:28
Ichiro and Babe Ruth. In this case, we
• 0:28 - 0:31
would assign Ichiro and Babe Ruth both a weight
• 0:31 - 0:35
of 191.67. Wow, that seems really easy, right?
• 0:35 - 0:39
There's gotta be a catch. Well, let's first discuss
• 0:39 - 0:41
• 0:41 - 0:44
the mean of the height across our sample. That's
• 0:44 - 0:46
good. But let's say we were hoping to
• 0:46 - 0:51
study the relationship between weight and birth year. Or
• 0:51 - 0:53
height and weight. Just plugging the mean height into a bunch of our
• 0:53 - 0:56
data points lessens the correlation between
• 0:56 - 0:59
our imputed variable and any other variable.
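
The idea described in the transcript can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the course: the five heights and weights are hypothetical values, chosen so that the three observed weights average to the 191.67 mentioned above, and the helper names `mean`, `pearson`, and `impute_mean` are made up for this example. It fills the two missing weights with the mean of the observed ones, then compares the height/weight correlation before and after imputation.

```python
from math import sqrt


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def impute_mean(values):
    """Replace each None with the mean of the non-missing values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


# Hypothetical data (not from the course's baseball data set).
heights = [68, 70, 72, 74, 76]
true_weights = [170, 180, 190, 205, 215]       # what we would see with no gaps
weights_with_gaps = [None, 180, 190, 205, None]  # two players missing a weight

imputed = impute_mean(weights_with_gaps)
fill_value = imputed[0]  # the value assigned to both missing players

r_true = pearson(heights, true_weights)
r_imputed = pearson(heights, imputed)

print("fill value:", round(fill_value, 2))
print("correlation with true weights:   ", round(r_true, 3))
print("correlation with imputed weights:", round(r_imputed, 3))
```

With these made-up numbers the mean of the weight column is the same before and after imputation, but the correlation between height and weight drops noticeably, which is the drawback described at the end of the transcript.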
Title:
Easy Imputation - Intro to Data Science
Description:

03-41 Easy Imputation

Video Language:
English
Team:
Udacity
Project:
ud359: Intro to Data Science
Duration:
01:00