Let's first discuss what would seem to be the
easiest way to impute a missing value in our data
set: just take the mean of our other data
points and fill in the missing values. So, for example,
let's say that Ichiro Suzuki and Babe Ruth are
missing values for weight in our baseball data set. Well,
okay, no problem. We can just take the mean
of all other players' weights and assign that value to
Ichiro and Babe Ruth. In this case, we
would assign Ichiro and Babe Ruth both a weight
of 191.67. Wow, that seems really easy, right?
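As a rough sketch of that idea, here is mean imputation in plain Python. The three "Player" entries and their weights are hypothetical, made up only so that their mean rounds to the 191.67 mentioned above; the real baseball data set is not shown here, and `impute_with_mean` is an illustrative helper name, not something from the course.

```python
from statistics import mean

# Hypothetical weights in pounds; None marks a missing value.
# Player A/B/C are invented for illustration only.
weights = {
    "Player A": 180,
    "Player B": 190,
    "Player C": 205,
    "Ichiro Suzuki": None,
    "Babe Ruth": None,
}


def impute_with_mean(values):
    """Return a copy of `values` with each None replaced by the mean
    of the observed (non-missing) values."""
    observed = [v for v in values.values() if v is not None]
    fill = mean(observed)
    return {name: (fill if v is None else v) for name, v in values.items()}


imputed = impute_with_mean(weights)
print(round(imputed["Ichiro Suzuki"], 2))  # 191.67
print(round(imputed["Babe Ruth"], 2))      # 191.67
```

Both players get the same filled-in value, and the mean of the column after imputation equals the mean of the observed weights.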
There's gotta be a catch. Well, let's first discuss
what's good about this method. We don't change
the mean of the weight across our sample. That's
good. But let's say we were hoping to
study the relationship between weight and birth year, or
height and weight. Just plugging the mean weight into a bunch of our
data points lessens the correlation between
our imputed variable and any other variable.
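To make the catch concrete, here is a small sketch that compares the height/weight correlation before and after mean imputation. The heights and weights are hypothetical numbers chosen for illustration, and `pearson` is a hand-written helper for the Pearson correlation coefficient, so treat this as a demonstration under those assumptions rather than a result from the real data set.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: heights in inches, true weights in pounds.
heights = [68, 70, 72, 74, 76, 78]
true_weights = [160, 175, 190, 200, 215, 230]

# Pretend the first and last weights are missing.
observed_weights = [None, 175, 190, 200, 215, None]

# Mean imputation: fill each gap with the mean of the observed weights.
fill = mean(w for w in observed_weights if w is not None)
imputed_weights = [fill if w is None else w for w in observed_weights]

r_true = pearson(heights, true_weights)
r_imputed = pearson(heights, imputed_weights)

print(f"correlation with true weights:    {r_true:.3f}")
print(f"correlation with imputed weights: {r_imputed:.3f}")
```

The imputed rows all sit at the same weight no matter how tall the player is, so they add spread in height without adding any matching spread in weight, and that pulls the correlation toward zero.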