1 00:00:00,000 --> 00:00:02,193 (female narrator) These are just a couple of tips 2 00:00:02,193 --> 00:00:04,894 on how to use the significance test flow chart. 3 00:00:04,894 --> 00:00:11,738 As you go through each problem, this flow chart can help you to determine 4 00:00:12,298 --> 00:00:15,908 what type of significance test you should perform on that problem. 5 00:00:15,908 --> 00:00:18,858 The first thing you want to do is read the problem. 6 00:00:18,858 --> 00:00:21,871 When you're done reading the problem, you need to ask yourself, 7 00:00:21,871 --> 00:00:24,778 "What was the data collected from each member of the sample?" 8 00:00:24,778 --> 00:00:27,605 For instance, did you collect a weight, or a height, 9 00:00:27,605 --> 00:00:29,606 or did you take their temperature, 10 00:00:29,606 --> 00:00:32,100 or did you count the number of dogs that they had? 11 00:00:32,100 --> 00:00:35,355 That would be numerical data. When you read the problem, 12 00:00:35,355 --> 00:00:40,223 did you see some keywords like mean, or average, or standard deviation? 13 00:00:40,223 --> 00:00:44,920 All of those things indicate that you have a quantitative data problem. 14 00:00:45,510 --> 00:00:52,023 On the flip side of that, were there some keywords like proportion, percent, rates, 15 00:00:52,023 --> 00:00:55,752 or was each member of the sample asked a yes or no question, 16 00:00:55,752 --> 00:00:59,108 such as, "Have you had a heart attack? Yes or no?" 17 00:01:00,118 --> 00:01:02,769 Maybe they were asked, "Are you overweight? Yes or no?" 18 00:01:02,769 --> 00:01:04,968 "Do you support the president? Yes or no?" 19 00:01:04,968 --> 00:01:07,967 So that type of a thing. That would be qualitative data. 20 00:01:07,967 --> 00:01:10,364 Once you determine the type of data, 21 00:01:10,364 --> 00:01:13,056 then you follow the decision tree down that side.
22 00:01:13,056 --> 00:01:17,867 For instance, if you have quantitative data, then you would ask yourself, 23 00:01:17,867 --> 00:01:24,415 "Do I have two populations or do I have one population?" 24 00:01:24,415 --> 00:01:28,632 Now remember, an easy way to tell whether you have one or two populations 25 00:01:28,632 --> 00:01:31,464 is to look for the samples that are given in the problem. 26 00:01:31,464 --> 00:01:34,854 If you're only given one sample, then you only have one population. 27 00:01:34,854 --> 00:01:37,880 If you're given two samples, then you have two populations. 28 00:01:37,880 --> 00:01:41,996 You have to be given the complete information about both of those samples, 29 00:01:41,996 --> 00:01:47,296 like the sample mean and sample standard deviation, for two separate samples 30 00:01:47,296 --> 00:01:49,695 in order to determine you have two populations. 31 00:01:51,945 --> 00:01:55,412 Once you determine how many populations you have, 32 00:01:55,412 --> 00:01:59,969 if you have one population, you're going to do the t-stat one sample. 33 00:01:59,969 --> 00:02:02,278 If you have two independent populations, 34 00:02:02,278 --> 00:02:05,627 then you're going to do the t-stat two samples. 35 00:02:05,627 --> 00:02:11,293 Keep in mind that you need to determine whether you're going to pool the variances 36 00:02:11,293 --> 00:02:13,177 or not pool the variances. 37 00:02:13,177 --> 00:02:16,242 Down here, we have the little tip on how to decide that. 38 00:02:16,242 --> 00:02:19,509 You take the ratio of the larger sample standard deviation 39 00:02:19,509 --> 00:02:22,129 divided by the smaller sample standard deviation. 40 00:02:22,129 --> 00:02:25,417 If it's greater than two, then you remove 41 00:02:25,417 --> 00:02:28,247 the pooled variances check mark, and if it's less than two, 42 00:02:28,247 --> 00:02:31,567 then you keep the pooled variances check mark.
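(editor's note) The pooling tip above can be written as a short check. This is only a minimal sketch in Python: the function name `should_pool_variances` and the `threshold` parameter are hypothetical, and the narration does not say what to do when the ratio is exactly two, so the sketch assumes that case is treated as "do not pool".

```python
def should_pool_variances(s1: float, s2: float, threshold: float = 2.0) -> bool:
    """Return True when the pooled variances check mark should stay on.

    s1, s2: the two sample standard deviations (must be positive).
    threshold: the cutoff from the tip (two). Hypothetical parameter name.
    """
    if s1 <= 0 or s2 <= 0:
        raise ValueError("sample standard deviations must be positive")
    # Ratio of the larger sample standard deviation to the smaller one.
    ratio = max(s1, s2) / min(s1, s2)
    # Less than two: keep pooling. Greater than two: remove the check mark.
    # Assumption: a ratio of exactly two is treated as "do not pool".
    return ratio < threshold
```

For example, sample standard deviations of 4.0 and 3.0 give a ratio of about 1.33, so the check mark stays; 10.0 and 3.0 give about 3.33, so it comes off.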
43 00:02:31,567 --> 00:02:33,917 Now remember, if you have two dependent samples, 44 00:02:33,917 --> 00:02:39,780 the two samples are related to each other, such as a before and an after, 45 00:02:39,780 --> 00:02:45,058 a pre- and a post-test. The two samples are related because it's the same person 46 00:02:45,058 --> 00:02:47,972 taking the pre-test and the same person taking the post-test, 47 00:02:47,972 --> 00:02:50,965 or maybe you're testing two different kinds of tires, 48 00:02:50,965 --> 00:02:53,848 so you take a car and you drive it with the first tire, 49 00:02:53,848 --> 00:02:57,101 and then you take the same car and you drive it with the second tire. 50 00:02:57,101 --> 00:03:00,671 Those two samples are related because it's the same car driving both times. 51 00:03:00,671 --> 00:03:03,497 If they're dependent samples, then you have a choice 52 00:03:03,497 --> 00:03:06,629 of either finding the differences between the two samples 53 00:03:06,629 --> 00:03:13,678 and doing a one sample t-test, or under t-stats you can do the paired t-test. 54 00:03:15,098 --> 00:03:19,676 If you have more than two populations, we only have one test 55 00:03:19,676 --> 00:03:24,242 where we had more than two populations and that was from chapter 14, 56 00:03:24,242 --> 00:03:28,966 and that was where we had three, four, five, six, as many populations 57 00:03:28,966 --> 00:03:34,192 as were needed, and that was to perform the one-way ANOVA test.
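(editor's note) For the dependent-samples case described above, the first option (find the differences, then run a one-sample t-test on them) can be sketched in Python with only the standard library. The function name `paired_t_statistic` and the sample numbers in the usage note are made up for illustration, and the sketch assumes the null hypothesis is a mean difference of zero.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(before, after):
    """One-sample t statistic computed on the paired differences.

    Assumes before[i] and after[i] come from the same subject
    (same person, same car) and that the null mean difference is 0.
    """
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    # Step 1: find the differences between the two samples.
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    # Step 2: one-sample t statistic on the differences.
    standard_error = stdev(diffs) / sqrt(n)
    return mean(diffs) / standard_error
```

With hypothetical pre-test scores `[10, 12, 14, 16]` and post-test scores `[11, 14, 15, 18]`, the differences are `[1, 2, 1, 2]` and the statistic has `n - 1 = 3` degrees of freedom. If every difference is identical, the standard deviation is zero and the division fails, so real data would need that case handled.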
58 00:03:34,922 --> 00:03:39,975 Now, if you have two quantitative variables, like you were collecting 59 00:03:39,975 --> 00:03:49,089 the height of a person and the weight of a person, 60 00:03:49,089 --> 00:03:56,511 and you were asked to decide if the variables were dependent, 61 00:03:56,511 --> 00:03:59,548 or associated, or independent, if you saw those words in there, 62 00:03:59,548 --> 00:04:03,531 but you have two quantitative variables, a height and a weight, 63 00:04:03,531 --> 00:04:06,914 then that would be a regression type problem, 64 00:04:06,914 --> 00:04:13,099 and you would perform that significance test using the test on beta, the slope, 65 00:04:13,099 --> 00:04:17,847 using the regression simple linear command in StatCrunch, 66 00:04:17,847 --> 00:04:20,276 so again that's for two quantitative variables, 67 00:04:20,276 --> 00:04:23,514 and you're going to look for phrases like "Determine if the variables 68 00:04:23,514 --> 00:04:25,879 are dependent, or associated, or independent." 69 00:04:26,919 --> 00:04:30,745 Back over here on the qualitative side, it's a little bit shorter 70 00:04:30,745 --> 00:04:36,205 on the qualitative side, but starting out the same way we did on the quantitative 71 00:04:36,205 --> 00:04:39,775 side, we need to determine if we have one or two populations. 72 00:04:39,775 --> 00:04:43,957 Again, if you have two populations then you're going to have information 73 00:04:43,957 --> 00:04:47,757 for two samples. You're going to be given the number of successes 74 00:04:47,757 --> 00:04:51,224 and the total sample size for two different samples. 75 00:04:51,874 --> 00:04:55,654 If you do determine that you have one population, 76 00:04:55,654 --> 00:04:58,955 then you would do a proportion stat one sample.
77 00:04:59,185 --> 00:05:02,315 If you do determine that you have two populations, 78 00:05:02,315 --> 00:05:04,606 then again you're going to do a proportion stat, 79 00:05:04,606 --> 00:05:06,389 but you're going to do two samples. 80 00:05:06,389 --> 00:05:11,654 Our last test for qualitative data is if we have two variables. 81 00:05:11,654 --> 00:05:19,487 For example, we might be looking at gender and happiness. 82 00:05:19,487 --> 00:05:25,020 That would be two qualitative variables. For gender, you would be male or female. 83 00:05:25,020 --> 00:05:30,936 For happiness, they might indicate very happy, happy, or not happy, 84 00:05:30,936 --> 00:05:34,168 so those are word answers to those. "How happy are you?" 85 00:05:34,168 --> 00:05:37,834 Very happy, happy, not happy. 86 00:05:37,834 --> 00:05:42,153 Those are word answers so those are definitely qualitative variables, 87 00:05:42,153 --> 00:05:44,577 and it was two variables. We were collecting gender 88 00:05:44,577 --> 00:05:48,466 and we were collecting level of happiness so we have two qualitative variables. 89 00:05:48,466 --> 00:05:52,688 In the problem you can also look to see if it says the variables are dependent, 90 00:05:52,688 --> 00:05:54,648 associated, or independent. 91 00:05:54,648 --> 00:05:58,957 Also note, a little hint here, that when you do have two qualitative variables, 92 00:05:58,957 --> 00:06:02,863 typically the data will be shown in a contingency table, 93 00:06:02,863 --> 00:06:05,681 so all of these clues here help you determine 94 00:06:05,681 --> 00:06:08,915 that you're going to perform a chi-square independence test. 95 00:06:10,325 --> 00:06:13,763 So this is a little bit about how to use this decision tree, 96 00:06:15,663 --> 00:06:19,547 or significance test flow chart.
Sometimes I call it a decision tree 97 00:06:19,547 --> 00:06:21,412 because it branches off, 98 00:06:21,412 --> 00:06:24,594 or sometimes I just call it a significance test flow chart. 99 00:06:24,594 --> 00:06:28,861 But anyway, this is a little bit about how to use this document.
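(editor's note) The branches walked through in this video can be summarized as a small lookup function. This is a rough sketch in Python, not the flow chart itself: the function name `choose_test`, its parameter names, and the exact strings it returns are all hypothetical, and it covers only the cases the narrator mentions.

```python
def choose_test(data_type, populations=1, variables=1, dependent=False):
    """Suggest a significance test from the branches described in the video.

    data_type: "quantitative" or "qualitative".
    populations: number of populations (samples) in the problem.
    variables: number of variables collected from each member.
    dependent: True when two samples are related (before/after, same car).
    """
    if data_type == "quantitative":
        if variables == 2:
            return "simple linear regression (test on the slope)"
        if populations == 1:
            return "one-sample t-test"
        if populations == 2:
            # Dependent samples use the paired test; independent samples
            # use the two-sample test (with or without pooled variances).
            return "paired t-test" if dependent else "two-sample t-test"
        if populations > 2:
            return "one-way ANOVA"
    elif data_type == "qualitative":
        if variables == 2:
            return "chi-square independence test"
        if populations == 1:
            return "one-sample proportion test"
        if populations == 2:
            return "two-sample proportion test"
    # Anything else is outside the branches covered in the video.
    raise ValueError("combination not covered by this sketch")
```

For instance, `choose_test("quantitative", populations=2, dependent=True)` follows the before-and-after branch, and `choose_test("qualitative", variables=2)` follows the contingency table branch.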