English subtitles

Non-Normal Data - Intro to Data Science

Showing Revision 2 created 05/25/2016 by Udacity Robot.

  1. When performing a t-test, we assume that our data
  2. is normal. In the wild, you'll often encounter probability distributions
  3. that are distinctly not normal. They might look like this, or
  4. like this, or completely different. As you'd imagine, there are
  5. still statistical tests that we can utilize when our data
  6. is not normal. Why don't we briefly discuss what you
  7. might do in situations like this. First off, we should
  8. have some machinery in place for determining whether or not
  9. our data is Gaussian in the first place. A crude, inaccurate
  10. way of determining whether or not our data is normal is
  11. simply to plot a histogram of our data and ask, does
  12. this look like a bell curve? In both of these cases, the
  13. answer would definitely be no. But, we can do a little
  14. bit better than that. There are some statistical tests that we
  15. can use to assess whether a sample was plausibly drawn
  16. from a normally distributed population. One such test is the Shapiro-Wilk test.
  17. I don't want to go into great depth with
  18. regards to the theory behind this test, but I do
  19. want to let you know that it's implemented in scipy.
  20. You can call it really easily like this. W and
  21. p are going to be equal to scipy.stats.shapiro(data), where
  22. our data here is just an array or list containing
  23. all of our data points. This function's going to return these
  24. two values. The first, W, is the Shapiro-Wilk test statistic.
  25. The second value in this tuple is going
  26. to be our p-value, which should be interpreted
  27. in the same way that we would interpret
  28. the p-value for our t-test. That is, given the
  29. null hypothesis that this data is drawn from
  30. a normal distribution, what is the probability that we
  31. would observe a value of W that was at least as extreme as the one that we see?
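
The call described above can be sketched in a few lines of Python. This is a minimal illustration, not the course's own code: the two sample arrays, the helper name shapiro_check, and the 0.05 significance threshold are all assumptions made up for the example. Only scipy.stats.shapiro itself comes from the lesson.

```python
import numpy as np
from scipy import stats

# Hypothetical example data: one sample drawn from a normal distribution
# and one drawn from a strongly skewed (exponential) distribution.
rng = np.random.default_rng(0)
normal_sample = rng.normal(loc=0.0, scale=1.0, size=200)
skewed_sample = rng.exponential(scale=1.0, size=200)


def shapiro_check(data, alpha=0.05):
    """Run the Shapiro-Wilk test and report whether normality is rejected.

    alpha is an assumed significance threshold, not something the test
    itself prescribes. Returns (W, p, consistent_with_normal).
    """
    w, p = stats.shapiro(data)  # W statistic and p-value, as a tuple
    # A small p-value means a W this extreme would be unlikely if the
    # data really came from a normal distribution.
    return w, p, bool(p >= alpha)


for name, sample in [("normal", normal_sample), ("skewed", skewed_sample)]:
    w, p, ok = shapiro_check(sample)
    print(f"{name}: W={w:.4f}, p={p:.4g}, consistent with normal: {ok}")
```

A p-value above the threshold does not prove that the data is normal. It only means the test found no strong evidence against the null hypothesis, so it is still worth looking at the histogram as well.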