## Welch's Two-Sample t-Test - Intro to Data Science

Showing Revision 5 created 05/24/2016 by Udacity Robot.

1. Let's talk more about the two-sample t-test, since we'll want
2. to compare two different samples in our class project. There are
3. a few different versions of the t-test that one might employ,
4. and they really depend on what assumptions we make about the
5. data. So we might want to ask questions such as: do our
6. samples have the same size, and do they have the same
7. variance? Let's discuss a variant of the t-test called Welch's
8. t-test in more depth, since it's the most general. It doesn't assume
9. equal sample size or equal variance. In Welch's
10. t-test, we compute a t-statistic using the following equation.
11. t equals mu_1 minus mu_2, divided by the
12. square root of sigma_1 squared over n_1 plus
13. sigma_2 squared over n_2, where mu_i is the sample mean for the ith sample,
14. sigma_i squared is the sample variance for
15. the ith sample, and n_i is the sample size
16. for the ith sample. We'll also want to estimate the number of degrees
17. of freedom, nu, using the following equation.
18. Nu is approximately equal to the quantity sigma_1
19. squared over n_1 plus sigma_2 squared over n_2, all squared, over sigma_1 to the
20. 4th over n_1 squared nu_1, plus sigma_2 to the 4th over n_2 squared nu_2,
21. where nu_i is equal to n_i minus one, and
22. this is the degrees of freedom associated with the ith variance
23. estimate. If you're unfamiliar with degrees of freedom, again, it might
24. be a good idea to brush up on your stats concepts
25. with Udacity's intro to stats course. A link is
26. provided in the instructor comments. All right, so once we have
27. these two values, we can estimate the p-value. Conceptually, the
28. p-value is the probability of obtaining a test statistic at least
29. as extreme as the one that was actually observed,
30. assuming that the null hypothesis is true. The p-value
31. is not the probability that the null hypothesis
32. is true given the data. So again, just as a
33. thought experiment, say we were testing whether left-handed
34. or right-handed baseball players were better batters by looking
35. at their average batting average. If the p-value
36. is 0.05, this would mean that, even if there is
37. no difference between left-handed and right-handed batters, since
38. that's our null hypothesis; so, even if this were true,
39. we would see a value of t equal to or greater
40. than the one that we saw 5% of the time.
41. When performing a statistical test like this, we usually set
42. some critical value of p. Let's call it p-critical.
43. If p falls below p-critical, then we would reject
44. the null hypothesis. In the two-sample case, this is equivalent
45. to stating that the population means behind our two samples
46. are not equal. Calculating this p-value for a
47. given set of data can be kind of tedious.
48. Thankfully, we seldom have to perform this calculation explicitly.
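
For reference, the two formulas read aloud in the transcript can be written out in symbols. This is a reconstruction from the spoken description, keeping the transcript's names: mu_i for the sample mean, sigma_i squared for the sample variance, n_i for the sample size, and nu_i for the degrees of freedom of the ith variance estimate.

$$
t = \frac{\mu_1 - \mu_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}},
\qquad
\nu \approx \frac{\left(\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}\right)^2}{\dfrac{\sigma_1^4}{n_1^2\,\nu_1} + \dfrac{\sigma_2^4}{n_2^2\,\nu_2}},
\qquad
\nu_i = n_i - 1
$$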
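
The calculation described in the transcript can be sketched in plain Python to show what a library does on our behalf. This is a minimal illustration, not the course's own code: the function names (`welch_t_and_dof`, `t_pdf`, `two_sided_p_value`) and the sample numbers are made up for this example, and the p-value is approximated by numerically integrating the t-distribution's density with Simpson's rule rather than by an exact routine. It also assumes a two-sided test, which the transcript does not state explicitly.

```python
import math
from statistics import mean, variance


def welch_t_and_dof(a, b):
    """Return (t, nu) for two samples using the formulas from the transcript."""
    n1, n2 = len(a), len(b)
    # statistics.variance is the sample variance (divides by n - 1).
    se1, se2 = variance(a) / n1, variance(b) / n2
    t = (mean(a) - mean(b)) / math.sqrt(se1 + se2)
    # sigma_i^4 / (n_i^2 * nu_i) is the same as se_i^2 / (n_i - 1).
    nu = (se1 + se2) ** 2 / (se1 ** 2 / (n1 - 1) + se2 ** 2 / (n2 - 1))
    return t, nu


def t_pdf(x, nu):
    """Density of Student's t-distribution with nu degrees of freedom."""
    log_norm = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                - 0.5 * math.log(nu * math.pi))
    return math.exp(log_norm - (nu + 1) / 2 * math.log1p(x * x / nu))


def two_sided_p_value(t, nu, steps=2000):
    """Approximate P(|T| >= |t|) with Simpson's rule; steps must be even."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, nu) + t_pdf(upper, nu)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, nu)
    central = total * h / 3  # P(0 <= T <= |t|)
    return max(0.0, 1.0 - 2.0 * central)


# Made-up numbers purely for illustration (not real batting data).
sample_a = [1, 2, 3, 4, 5]
sample_b = [2, 4, 6, 8, 10]

t_stat, dof = welch_t_and_dof(sample_a, sample_b)
p_value = two_sided_p_value(t_stat, dof)
p_critical = 0.05  # a conventional choice, not prescribed by the transcript
print(t_stat, dof, p_value, p_value < p_critical)
```

In practice, a library call such as SciPy's `scipy.stats.ttest_ind(sample_a, sample_b, equal_var=False)` performs Welch's t-test directly, which is why the calculation rarely needs to be written out by hand.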