16-40 Fake_Data_Summary

Showing Revision 2 created 11/02/2015 by Udacity Robot.

  1. There are a couple of things to observe here.
  2. One is in general the fake data pulls everything towards 0.5.
  3. Where the maximum likelihood estimates go to extremes, the new estimates are less extreme.
  4. 0.33 is further away from 0.5 than 0.4.
  5. So, all these numbers get moved towards 0.5.
  6. This is somewhat smoother.
  7. We also see that these two outcomes--the first and the last--
  8. on the division model give us the same extreme estimate,
  9. but the more data we get in our new estimator, the more we are willing to move away from 0.5.
  10. One observation of heads gave us 0.667, two of them 0.75.
  11. I can promise you in the limit, as you see only heads for infinitely many flips,
  12. we will finally approach 1. Now, this is really cool.
  13. We added fake data, and I will tell you that I generally think these are better estimates in practice.
  14. The reason is that it's really reckless, after a single coin flip, to assume that all coin flips come up heads.
  15. I think it's much more moderate to say, well, we already have some evidence
  16. that heads might be more likely, but we're not quite convinced yet.
  17. Being not quite convinced is the same as having a prior.
  18. There's an entire literature that talks about these priors.
  19. They have a very cryptic name.
  20. They're called Dirichlet priors.
  21. But, more importantly, the method of adding fake data is called a Laplacian estimator.
  22. When there is plenty of data, the Laplacian estimator gives about the same results
  23. as the maximum likelihood estimator.
  24. But when data is scarce, this usually works much, much better
  25. than the maximum likelihood estimator.
  26. It's a really important lesson in statistics.
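
The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the lecture: it assumes the fake data consists of one extra heads and one extra tails (an assumption inferred from the quoted values 0.667 and 0.75), and the names `mle`, `laplace`, and the parameter `k` are hypothetical. The data set `[0, 0, 1]` is likewise assumed, chosen because it reproduces the 0.33 versus 0.4 pair mentioned in the transcript.

```python
from fractions import Fraction


def mle(flips):
    """Maximum likelihood estimate of P(heads): heads divided by total flips."""
    return Fraction(sum(flips), len(flips))


def laplace(flips, k=1):
    """Laplacian estimate: add k fake heads and k fake tails, then divide.

    k=1 is an assumption that matches the numbers quoted in the transcript.
    """
    return Fraction(sum(flips) + k, len(flips) + 2 * k)


# 1 = heads, 0 = tails. These data sets are illustrative assumptions.
for flips in ([1], [1, 1], [0, 0, 1]):
    print(flips, "MLE:", mle(flips), "Laplace:", laplace(flips))

# With only heads, the Laplacian estimate (n + 1) / (n + 2) creeps towards 1,
# while the maximum likelihood estimate is already 1 after a single flip.
for n in (1, 2, 10, 100, 1000):
    print(n, float(laplace([1] * n)))
```

With one heads the Laplacian estimate is 2/3, with two heads it is 3/4, and with one heads in three flips it is 2/5, each closer to 0.5 than the corresponding maximum likelihood estimate. As the number of flips grows, the two fake observations matter less and the estimates converge.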