WEBVTT 00:00:00.000 --> 00:00:02.193 (female narrator) This is just a couple of tips 00:00:02.193 --> 00:00:04.894 on how to use the significance test flow chart. 00:00:04.894 --> 00:00:11.738 As you go through each problem, this flow chart can help you to determine 00:00:12.298 --> 00:00:15.908 what type of significance test you should perform on that problem. 00:00:15.908 --> 00:00:18.858 The first thing you want to do is you want to read the problem. 00:00:18.858 --> 00:00:21.871 When you're done reading the problem, you need to ask yourself, 00:00:21.871 --> 00:00:24.778 "What was the data collected from each member of the sample?" 00:00:24.778 --> 00:00:27.605 For instance, did you collect a weight, or a height, 00:00:27.605 --> 00:00:29.606 or did you take their temperature, 00:00:29.606 --> 00:00:32.100 or did you count the number of dogs that they had? 00:00:32.100 --> 00:00:35.355 That would be numerical data. When you read the problem, 00:00:35.355 --> 00:00:40.223 did you see some keywords like mean, or average, or standard deviation? 00:00:40.223 --> 00:00:44.920 All of those things indicate that you have a quantitative data problem. 00:00:45.510 --> 00:00:52.023 On the flip side of that, were there some keywords like proportion, percent, rates, 00:00:52.023 --> 00:00:55.752 or was each member of the sample asked a yes or no question, 00:00:55.752 --> 00:00:59.108 such as, "Have you had a heart attack? Yes or no?" 00:01:00.118 --> 00:01:02.769 Maybe they were asked, "Are you overweight? Yes or no?" 00:01:02.769 --> 00:01:04.968 "Do you support the president? Yes or no?" 00:01:04.968 --> 00:01:07.967 So that type of a thing. That would be qualitative data. 00:01:07.967 --> 00:01:10.364 Once you determine the type of data, 00:01:10.364 --> 00:01:13.056 then you follow the decision tree down that side. 
00:01:13.056 --> 00:01:17.867 For instance, if you have quantitative data, then you would ask yourself, 00:01:17.867 --> 00:01:24.415 "Do I have two populations or do I have one population?" 00:01:24.415 --> 00:01:28.632 Now remember, an easy way to tell whether you have one or two populations 00:01:28.632 --> 00:01:31.464 is to look for the samples that are given in the problem. 00:01:31.464 --> 00:01:34.854 If you're only given one sample, then you only have one population. 00:01:34.854 --> 00:01:37.880 If you're given two samples, then you have two populations. 00:01:37.880 --> 00:01:41.996 You have to be given the complete information about both of those samples, 00:01:41.996 --> 00:01:47.296 like the sample mean and sample standard deviation, for two separate samples 00:01:47.296 --> 00:01:49.695 in order to determine you have two populations. 00:01:51.945 --> 00:01:55.412 Once you determine how many populations that you have, 00:01:55.412 --> 00:01:59.969 if you have one population, you're going to do the t-stat one sample. 00:01:59.969 --> 00:02:02.278 If you have two independent populations, 00:02:02.278 --> 00:02:05.627 then you're going to do the t-stat two samples. 00:02:05.627 --> 00:02:11.293 Keep in mind that you need to determine whether you're going to pool the variances 00:02:11.293 --> 00:02:13.177 or not pool the variances. 00:02:13.177 --> 00:02:16.242 Down here, we have the little tip on how to decide that. 00:02:16.242 --> 00:02:19.509 It's just the ratio of the larger sample standard deviation 00:02:19.509 --> 00:02:22.129 divided by the smaller sample standard deviation, 00:02:22.129 --> 00:02:25.417 and looking to see if it's greater than two, then you remove 00:02:25.417 --> 00:02:28.247 the pooled variances check mark, and if it's less than two, 00:02:28.247 --> 00:02:31.567 then you keep the pooled variances check mark. 
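The rule of thumb just described for the pooled variances check mark can be sketched in Python. This is only an illustration and not part of the flow chart or of StatCrunch: the function name `use_pooled_variances` is hypothetical, and because the narration does not say what to do when the ratio is exactly two, treating that case as "keep pooling" is an assumption.

```python
def use_pooled_variances(s1: float, s2: float) -> bool:
    """Return True if the rule of thumb says to keep the pooled variances check mark.

    s1 and s2 are the two sample standard deviations.
    """
    if s1 <= 0 or s2 <= 0:
        raise ValueError("sample standard deviations must be positive")

    # Ratio of the larger sample standard deviation to the smaller one.
    ratio = max(s1, s2) / min(s1, s2)

    # Greater than two: remove the check mark (do not pool).
    # Less than two: keep the check mark (pool).
    # Exactly two is not covered in the narration; pooling here is an assumption.
    return ratio <= 2


if __name__ == "__main__":
    print(use_pooled_variances(3.0, 2.0))  # ratio 1.5, so pool
    print(use_pooled_variances(5.0, 2.0))  # ratio 2.5, so do not pool
```

The order of the two arguments does not matter, since the larger value is always divided by the smaller one.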
00:02:31.567 --> 00:02:33.917 Now remember, if you have two dependent samples, 00:02:33.917 --> 00:02:39.780 the two samples are related to each other, such as a before and an after, 00:02:39.780 --> 00:02:45.058 a pre-test and a post-test. The two samples are related because it's the same person 00:02:45.058 --> 00:02:47.972 taking the pre-test and the same person taking the post-test, 00:02:47.972 --> 00:02:50.965 or maybe you're testing two different kinds of tires, 00:02:50.965 --> 00:02:53.848 so you take a car and you drive it with the first tire, 00:02:53.848 --> 00:02:57.101 and then you take the same car and you drive it with the second tire. 00:02:57.101 --> 00:03:00.671 Those two samples are related because it's the same car driving both times. 00:03:00.671 --> 00:03:03.497 If they're dependent samples, then you have a choice 00:03:03.497 --> 00:03:06.629 of either finding the differences between the two samples 00:03:06.629 --> 00:03:13.678 and doing a one sample t-test, or under t-stats you can do the paired t-test. 00:03:15.098 --> 00:03:19.676 If you have more than two populations, we only have one test 00:03:19.676 --> 00:03:24.242 where we had more than two populations and that was from chapter 14, 00:03:24.242 --> 00:03:28.966 and that was where we had three, four, five, six, as many populations 00:03:28.966 --> 00:03:34.192 as were needed, and that was to perform the one-way ANOVA test. 
00:03:34.922 --> 00:03:39.975 Now, if you have two quantitative variables, like you were collecting 00:03:39.975 --> 00:03:49.089 the height of a person and the weight of a person, 00:03:49.089 --> 00:03:56.511 and you were asked to decide if the variables were dependent, 00:03:56.511 --> 00:03:59.548 or associated, or independent, if you saw those words in there, 00:03:59.548 --> 00:04:03.531 but you have two quantitative variables, a height and the weight, 00:04:03.531 --> 00:04:06.914 then that would be a regression type problem, 00:04:06.914 --> 00:04:13.099 and you would perform that significance test using the test on beta, the slope, 00:04:13.099 --> 00:04:17.847 using the regression simple linear command in StatCrunch, 00:04:17.847 --> 00:04:20.276 so again that's for two quantitative variables, 00:04:20.276 --> 00:04:23.514 and you're going to look for phrases like "Determine if the variables 00:04:23.514 --> 00:04:25.879 are dependent, or associated, or independent." 00:04:26.919 --> 00:04:30.745 Back over here on the qualitative side, it's a little bit shorter 00:04:30.745 --> 00:04:36.205 on the qualitative side, but starting out the same way we did on the quantitative 00:04:36.205 --> 00:04:39.775 side, we need to determine if we have one or two populations. 00:04:39.775 --> 00:04:43.957 Again, if you have two populations then you're going to have information 00:04:43.957 --> 00:04:47.757 for two samples. You're going to be given the number of successes 00:04:47.757 --> 00:04:51.224 and the total sample size for two different samples. 00:04:51.874 --> 00:04:55.654 If you do determine that you have one population, 00:04:55.654 --> 00:04:58.955 then you would do a proportion stat one sample. 00:04:59.185 --> 00:05:02.315 If you do determine that you have two populations, 00:05:02.315 --> 00:05:04.606 then again you're going to do a proportion stat, 00:05:04.606 --> 00:05:06.389 but you're going to do two samples. 
00:05:06.389 --> 00:05:11.654 Our last test for qualitative data is if we have two variables. 00:05:11.654 --> 00:05:19.487 For example, we might be looking at gender and happiness. 00:05:19.487 --> 00:05:25.020 That would be two qualitative variables. For gender, you would be male or female. 00:05:25.020 --> 00:05:30.936 For happiness, they might indicate very happy, or happy, or not happy, 00:05:30.936 --> 00:05:34.168 so those are word answers to those. "How happy are you?" 00:05:34.168 --> 00:05:37.834 Very happy, happy, not happy. 00:05:37.834 --> 00:05:42.153 Those are word answers so those are definitely qualitative variables, 00:05:42.153 --> 00:05:44.577 and it was two variables. We were collecting gender 00:05:44.577 --> 00:05:48.466 and we were collecting level of happiness so we have two qualitative variables. 00:05:48.466 --> 00:05:52.688 In the problem you can also look to see if it says the variables are dependent, 00:05:52.688 --> 00:05:54.648 associated, or independent. 00:05:54.648 --> 00:05:58.957 Also note, a little hint here, that when you do have two qualitative variables, 00:05:58.957 --> 00:06:02.863 typically the data will be shown in a contingency table, 00:06:02.863 --> 00:06:05.681 so all of these clues here help you determine 00:06:05.681 --> 00:06:08.915 that you're going to perform a chi-square independence test. 00:06:10.325 --> 00:06:13.763 So this is a little bit about how to use this decision tree, 00:06:15.663 --> 00:06:19.547 or significance test flow chart. Sometimes I call it a decision tree 00:06:19.547 --> 00:06:21.412 because it branches off, 00:06:21.412 --> 00:06:24.594 or sometimes I just call it a significance test flow chart. 00:06:24.594 --> 00:06:28.861 But anyway, this is a little bit about how to use this document.
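As a recap, the branches of the flow chart walked through above can be written as one small Python function. This is a rough sketch under stated assumptions: the function name `choose_test`, its parameters, and the returned labels are all hypothetical, chosen only to mirror the narration, and they are not StatCrunch commands. Combinations the narration does not cover (for example, qualitative data with more than two populations) raise an error rather than guess.

```python
def choose_test(data_type: str, variables: int = 1,
                populations: int = 1, dependent: bool = False) -> str:
    """Pick a significance test by following the flow chart's branches.

    data_type   -- "quantitative" or "qualitative"
    variables   -- how many variables were collected from each member (1 or 2)
    populations -- how many populations, judged by the samples given
    dependent   -- True when two samples are related (before/after, same car)
    """
    if data_type == "quantitative":
        if variables == 2:
            # Two quantitative variables, e.g. height and weight.
            return "simple linear regression (test on the slope)"
        if populations == 1:
            return "one-sample t-test"
        if populations == 2:
            # Dependent samples: paired t-test (or a one-sample t-test
            # on the differences). Independent samples: two-sample t-test.
            return "paired t-test" if dependent else "two-sample t-test"
        if populations > 2:
            return "one-way ANOVA"
    elif data_type == "qualitative":
        if variables == 2:
            # Two qualitative variables, usually shown in a contingency table.
            return "chi-square independence test"
        if populations == 1:
            return "one-sample proportion test"
        if populations == 2:
            return "two-sample proportion test"

    # Anything else is outside the branches described in the narration.
    raise ValueError("combination not covered by the flow chart")


if __name__ == "__main__":
    print(choose_test("quantitative", populations=2, dependent=True))
    print(choose_test("qualitative", variables=2))
```

For the two-sample t-test branch, the pooled variances decision is still a separate step made from the ratio of the sample standard deviations.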