Math 1080 Lecture 21 Confidence Interval for Variance and Standard Deviation An Example

  • 0:04 - 0:09
    Today I'll discuss confidence interval
    for sigma and sigma squared--
  • 0:09 - 0:14
    that's standard deviation and variance--
    using the chi-squared distribution.
  • 0:16 - 0:21
    Now, these are to be estimated (sigma
    and sigma squared) for the population.
  • 0:21 - 0:25
    Remember, population
    is what you're studying.
  • 0:28 - 0:32
    And sample is what
    you have in hand.
  • 0:35 - 0:39
    You don't have access to
    the whole population data,
  • 0:39 - 0:41
    but you can take a sample.
  • 0:42 - 0:46
    And what you are-- what you have
    in hand, again, what you have...
  • 0:46 - 0:52
    is a sample, and the sample has
    standard deviation and variance.
  • 0:53 - 0:59
    And also we know how many data points
    we have...that's number of sample points.
  • 1:00 - 1:02
    Then...
  • 1:02 - 1:08
    Since we chose the
    chi-squared distribution for this,
  • 1:08 - 1:13
    we have two critical values here:
    chi-squared L and chi-squared R.
  • 1:13 - 1:21
    And let me show you what they are.
    Chi-squared is a distribution related to...
  • 1:23 - 1:26
    ...the normal distribution, in fact, the square of the normal distribution
    [more precisely, a sum of squares of independent standard normal variables].
  • 1:26 - 1:30
    So, it is positive and it is on the right side [of zero].
  • 1:31 - 1:35
    Now, chi-squared, uh...
    left side is this.
  • 1:37 - 1:43
    And this one [on the right] is this.
    Now, what are they?
  • 1:44 - 1:48
    So, you choose confidence level [he means the significance level, alpha].
  • 1:51 - 1:57
    Remember, the confidence level [alpha] was...
    is a number... a small number [the confidence level itself is 1 - alpha].
  • 1:57 - 2:00
    Usually you choose 0.05 or 0.01,
  • 2:00 - 2:02
    and if it's not important,
  • 2:02 - 2:05
    you can choose 0.1, for example.
  • 2:05 - 2:12
    Most often we choose 0.05 because it's,
    uh... it's better, it is the best way, I think.
  • 2:13 - 2:17
    Now, what is this chi-squared L?
    Chi-squared L is a point on the...
  • 2:18 - 2:25
    ...on the x-axis, such that this area here
    under the curve of chi-squared is alpha over 2.
  • 2:25 - 2:29
    And this area on this side
    [right side] is alpha over 2.
  • 2:30 - 2:33
    So, the area in the
    middle is 1 minus alpha.
  • 2:35 - 2:40
    So... because you see the area
    under the curve is [equal to] 1.
  • 2:41 - 2:47
    The whole area is 1, so if this is α/2 and
    this is α/2, add them, you have alpha,
  • 2:47 - 2:54
    the whole thing is 1, so this area between
    these two is 1 minus alpha [1 - α].
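In symbols, the picture described above can be summarized as follows. This is a sketch in notation that is an assumption of this note: X stands for a chi-squared random variable with n - 1 degrees of freedom (the degrees of freedom are introduced later in the lecture).

```latex
X \sim \chi^2_{n-1}, \qquad
P\left(X < \chi^2_L\right) = \frac{\alpha}{2}, \qquad
P\left(X > \chi^2_R\right) = \frac{\alpha}{2}, \qquad
P\left(\chi^2_L \le X \le \chi^2_R\right) = 1 - \alpha
```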
  • 2:56 - 2:57
    Okay, so...
  • 2:59 - 3:02
    Let me erase this, and...
  • 3:04 - 3:16
    No, let me write down the confidence
    interval, say, end points, the interval itself.
  • 3:18 - 3:20
    In the end...
  • 3:20 - 3:22
    So, you calculated what?
  • 3:22 - 3:27
    You calculated s and s-squared.
  • 3:29 - 3:37
    And you have chi-squared L and
    chi-squared R, and you also have n.
  • 3:40 - 3:48
    So, we use chi-squared with
    n minus 1 degrees of freedom.
  • 3:48 - 3:53
    That is n minus 1 [n - 1]
    degrees of freedom.
  • 3:56 - 3:57
    So...
  • 3:59 - 4:08
    For sigma squared, we have this interval,
    so on the left side we have n minus 2--
  • 4:09 - 4:18
    No, sorry, that's n minus 1, s-squared,
    divided by chi-squared (on the left that's R).
  • 4:18 - 4:26
    And on this side we have n minus 1,
    s-squared, [over] chi-squared L.
  • 4:27 - 4:33
    They are different, so the
    left side is R, the right side is L.
  • 4:33 - 4:36
    And for sigma... we have...
  • 4:38 - 4:42
    It's like almost the same thing, so sigma
    is just the square root of this thing.
  • 4:42 - 4:49
    So, we have square root
    of this n minus 1, s-squared
  • 4:49 - 4:52
    over chi-squared R.
  • 5:00 - 5:08
    And on this side, we have [the square root of] n minus 1,
    s-squared, [over] chi-squared L.
  • 5:09 - 5:15
    Okay, so these are the things that
    we will calculate, these two, that's it.
  • 5:15 - 5:20
    We have the confidence interval
    for sigma and sigma squared.
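Written out, the two intervals just described look like this. The step behind them is the standard result that, for a sample from a normal population, (n - 1)s²/σ² follows a chi-squared distribution with n - 1 degrees of freedom; that normality assumption is not stated in the lecture and is added here as background.

```latex
\frac{(n-1)\,s^2}{\chi^2_R} \;<\; \sigma^2 \;<\; \frac{(n-1)\,s^2}{\chi^2_L},
\qquad
\sqrt{\frac{(n-1)\,s^2}{\chi^2_R}} \;<\; \sigma \;<\; \sqrt{\frac{(n-1)\,s^2}{\chi^2_L}}
```

Dividing by the larger critical value (R) gives the smaller end point, which is why R appears on the left.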
  • 5:21 - 5:22
    So, let's look at this again.
  • 5:22 - 5:25
    We know this [n - 1], we know this [s-squared],
    we calculate this [chi-squared R]--
  • 5:25 - 5:28
    [correction] we calculate
    them [n - 1 and s-squared]--
  • 5:28 - 5:33
    and also this [on the right] we calculate, and
    this gives me an interval for sigma squared.
  • 5:33 - 5:36
    And this [below] gives me
    the interval for sigma.
  • 5:36 - 5:40
    So, after I give you an example,
    I will also discuss...
  • 5:40 - 5:44
    I give you what this really means. So...
  • 5:46 - 5:50
    Let me erase this. And so...
  • 5:52 - 5:55
    The one that I chose, in fact,
    comes from the book.
  • 5:55 - 5:56
    And that is...
  • 5:57 - 6:01
    Confidence interval estimate
    of sigma for pulse rates.
  • 6:02 - 6:07
    Confidence interval estimate
    of sigma for pulse rates.
  • 6:26 - 6:29
    Now, that makes sense because...
  • 6:31 - 6:38
    But, that probably has significance for...
    like, maybe... I don't know. Maybe...
  • 6:39 - 6:48
    Well, health insurance companies...
    so they gather a team to give them
  • 6:48 - 6:53
    a confidence interval for sigma
    and sigma squared. So...
  • 6:53 - 6:55
    What they have to do is...
  • 6:57 - 7:05
    preferably, find pulse rates of
    everybody in a certain society,
  • 7:05 - 7:09
    if that was United States, then
    everybody who lives in this country.
  • 7:09 - 7:12
    But, that's not feasible, right?
  • 7:12 - 7:17
    And that must be done at the same
    time, say, so at this very moment.
  • 7:17 - 7:20
    So, just imagine how can you do that.
  • 7:20 - 7:22
    What they do is, they just...
  • 7:25 - 7:31
    take a sample of people in this country
    at random and find the pulse rates.
  • 7:34 - 7:38
    So, they have a sample. Now,
    this sample is given here and...
  • 7:38 - 7:43
    So, the sample consists
    of some numbers here.
  • 7:45 - 7:50
    I will just write some of them,
    but I have all of them in R.
  • 7:50 - 7:54
    Some of them look like this:
    the first person has 76,
  • 7:54 - 7:59
    then the second 76,
    then 86, dot, dot, dot.
  • 7:59 - 8:02
    Last one is like 66.
  • 8:02 - 8:09
    So, this is the sample, sample pulse
    rate for certain number of people.
  • 8:09 - 8:13
    And we will find out everything in R.
  • 8:13 - 8:18
    So, let me share the R with you.
  • 8:21 - 8:24
    Where is R? Okay, I see.
  • 8:28 - 8:29
    Okay.
  • 8:31 - 8:37
    So, you can see that I named this thing
    "data," we can call it "my data," whatever.
  • 8:38 - 8:42
    And these are the numbers:
    76, 76, 86, and blah, blah, blah.
  • 8:42 - 8:45
    And the last one is 74.
    Oh, okay.
  • 8:45 - 8:51
    So, this is my data, and I have
    already entered, so the data...
  • 8:51 - 8:53
    Well, let's see, what was wrong?
  • 8:54 - 8:58
    Oh, I don't know, something
    is terribly wrong here.
  • 9:04 - 9:09
    Okay, so yeah, I guess [we deleted some].
    So, let's check data again.
  • 9:10 - 9:12
    This is my data.
  • 9:12 - 9:21
    First thing is n, I need n number of
    sample points, so I enter length of my data.
  • 9:22 - 9:24
    So, let's check how many: 22.
  • 9:25 - 9:33
    So, degree of freedom (df)
    is n minus 1, so df is 21.
  • 9:34 - 9:38
    What else do I need?
    I need the s and s-squared.
  • 9:39 - 9:48
    So, s is my standard
    deviation of data,
  • 9:49 - 9:56
    and "s_sq" (s-squared)
    is variance of data.
  • 9:59 - 10:03
    Okay, let's check s and s_sq.
  • 10:04 - 10:12
    Well, I said s and s-squared,
    so s-squared is that 's' squared, in fact.
  • 10:12 - 10:15
    So, let's see if s-squared
    is that 's' squared.
  • 10:15 - 10:19
    S power 2...
    See, they are the same.
  • 10:20 - 10:27
    In fact, variance is s-squared.
    Or 's' is square root of variance.
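The steps entered in R so far can be sketched roughly as below. The vector name `pulse` and most of its values are assumptions: only the first three values (76, 76, 86) are quoted in the lecture, the rest are made up, and the lecture's real sample has 22 values.

```r
# Hypothetical pulse rates; NOT the lecture's full 22-value sample.
pulse <- c(76, 76, 86, 72, 64, 80, 68, 74)

n    <- length(pulse)  # number of sample points
df   <- n - 1          # degrees of freedom
s    <- sd(pulse)      # sample standard deviation
s_sq <- var(pulse)     # sample variance

# The variance is the square of the standard deviation.
same <- isTRUE(all.equal(s^2, s_sq))
print(same)
```

Comparing with `all.equal` rather than `==` avoids false mismatches from floating-point rounding.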
  • 10:28 - 10:30
    So, what else do we need?
  • 10:30 - 10:34
    We need the chi-squared L
    and chi-squared R.
  • 10:34 - 10:38
    You can find them like this: "qchisq."
  • 10:41 - 10:43
    Oh! [corrects himself]
    Alpha, let's enter alpha.
  • 10:44 - 10:52
    Alpha is... How much was alpha?
    0.05...or no [corrects typo of 4].
  • 10:52 - 11:00
    That's the usual alpha [0.05], so
    I need "qchisq"-- Did I enter that?
  • 11:00 - 11:09
    No, there isn't "qchisq" [yet],
    [corrects typo] "qchisq" alpha,
  • 11:11 - 11:15
    over 2, and the degree of freedom (df).
  • 11:16 - 11:25
    Let's call it something...
    Let's call it... "chi_L," right?
  • 11:27 - 11:30
    So, it's easier when I have to do it again.
  • 11:30 - 11:37
    So, that is chi-squared left, I just
    called it something [chi_L], and...
  • 11:39 - 11:46
    chi-squared right [chi_r] is qchisq.
  • 11:46 - 11:53
    Now, this is 1 minus alpha over 2
    with the same degree of freedom.
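A minimal sketch of the two `qchisq` calls described here, assuming alpha = 0.05 and the 21 degrees of freedom found earlier. The variable names follow the ones spoken in the lecture; the `pchisq` check is an addition of this note.

```r
alpha <- 0.05
df    <- 21  # n - 1 with n = 22, as in the lecture

chi_L <- qchisq(alpha / 2, df)      # left critical value, about 10.28
chi_r <- qchisq(1 - alpha / 2, df)  # right critical value, about 35.48

# Sanity check: the area to the left of chi_L should be alpha / 2,
# and the area between the two critical values should be 1 - alpha.
left_area   <- pchisq(chi_L, df)
middle_area <- pchisq(chi_r, df) - pchisq(chi_L, df)
print(c(chi_L = chi_L, chi_r = chi_r))
```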
  • 11:55 - 11:56
    Okay, so...
  • 11:58 - 12:01
    I guess I have almost
    everything here.
  • 12:01 - 12:03
    What is the formula for s [that is, for the sigma interval]?
  • 12:04 - 12:13
    For s [sigma], it was square root (sqrt) of
    n minus 1, which was degree of freedom...
  • 12:13 - 12:24
    well, let's write it "n-1" times
    s-squared "s_sq," all of that divided by chi...
  • 12:25 - 12:32
    ...well, that was... which one am I...
    "chi_L" [corrects himself], "chi_r."
  • 12:33 - 12:36
    R goes to the left [so] "chi_r."
  • 12:37 - 12:39
    Oops, what did I enter?
  • 12:40 - 12:45
    Oh, yeah, I have to enter "times" [*];
    otherwise, it doesn't understand.
  • 12:46 - 12:51
    So, 7.63, let me write
    this here somewhere.
  • 12:54 - 13:00
    So, that's for sigma,
    in fact, so 7.63 [unclear].
  • 13:09 - 13:12
    Let me... Oh, let me see.
  • 13:18 - 13:23
    And the other one is the same thing
    except chi squared left.
  • 13:24 - 13:28
    14.17. So this one is 14.17, okay?
  • 13:36 - 13:41
    And I also need...
    This is for... This is...
  • 13:42 - 13:48
    7.63 and 14.17 are
    the two ends for sigma.
  • 13:48 - 13:51
    I need for sigma squared.
    That is the variance.
  • 13:51 - 13:55
    So I don't have to find
    the square root of them,
  • 13:55 - 14:00
    so this was square root of this thing,
    so let's just remove square root.
  • 14:00 - 14:02
    And that is for variance.
  • 14:07 - 14:12
    So that's 58.24. Let me
    write it somewhere here.
  • 14:12 - 14:13
    58.24.
  • 14:23 - 14:26
    The other one would be...
  • 14:28 - 14:31
    ...the left one. Let's
    remove the square root.
  • 14:32 - 14:36
    And that is for variance. 200.95.
  • 14:38 - 14:43
    200.95 [unclear] come back.
  • 14:50 - 14:55
    I guess I have everything.
    Let's go back to the board.
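The whole calculation done in R above can be gathered into one small function. This is only a sketch: `var_sd_ci` is a hypothetical helper name, not something from the lecture, and the sample below is made up apart from its first three values.

```r
# Chi-squared confidence intervals for the variance and standard deviation.
# Assumes x is a numeric sample from an approximately normal population.
var_sd_ci <- function(x, alpha = 0.05) {
  df    <- length(x) - 1
  s_sq  <- var(x)
  chi_L <- qchisq(alpha / 2, df)
  chi_r <- qchisq(1 - alpha / 2, df)
  # The right critical value gives the lower end point, the left one the upper.
  variance <- c(lower = df * s_sq / chi_r, upper = df * s_sq / chi_L)
  list(variance = variance, sd = sqrt(variance))
}

# Hypothetical pulse rates; NOT the lecture's full 22-value sample.
pulse <- c(76, 76, 86, 72, 64, 80, 68, 74)
ci <- var_sd_ci(pulse)
print(ci)
```

Returning both intervals from one call mirrors the lecture's point that the sigma interval is just the square root of the sigma-squared interval.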
  • 14:57 - 15:02
    So I calculated this one, this one,
    and the other one.
  • 15:03 - 15:05
    And that is all I needed.
  • 15:06 - 15:08
    So...
  • 15:09 - 15:11
    Let's write for sigma first.
  • 15:11 - 15:21
    So I found that sigma is
    between 7.63 and 14.17,
  • 15:21 - 15:25
    with 95% confidence.
  • 15:30 - 15:35
    And the other one is this one:
  • 15:35 - 15:39
    58 and 200,
    I'll just write it here.
  • 15:40 - 15:44
    I'm sorry. I didn't keep
    the numbers up there.
  • 15:44 - 15:48
    So... And sigma squared is
    between these two numbers:
  • 15:48 - 15:58
    58.24 and 200.95,
    with 95% confidence.
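As a quick consistency check on the numbers read out above, the end points for sigma should be the square roots of the end points for sigma squared. The sketch below uses only the rounded values quoted in the lecture, so small differences in the last digit are expected.

```r
# 95% interval for sigma squared, as quoted in the lecture.
var_interval <- c(lower = 58.24, upper = 200.95)

# Taking square roots should give back the interval for sigma:
# about 7.63 and 14.18 (the lecture reads the second value off as 14.17).
sd_interval <- sqrt(var_interval)
print(round(sd_interval, 2))
```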
  • 16:16 - 16:25
    So I found that the standard deviation
    of the population with 95% confidence
  • 16:25 - 16:28
    is between these two numbers.
  • 16:31 - 16:40
    Pulse rate doesn't have any units, right?
    So that-- I guess the unit is "per minute" [beats per minute].
  • 16:40 - 16:43
    I think it's per minute.
  • 16:43 - 16:47
    So they just count the number
    of pulses per minute.
  • 16:48 - 16:50
    So that sigma is per minute, in fact.
  • 16:50 - 16:55
    So that's 7.63 per minute
    and 14.17 per minute.
  • 16:56 - 17:01
    Sigma has the same unit as
    the quantity that you are studying,
  • 17:01 - 17:03
    but sigma squared does not [its unit is the square of that unit].
  • 17:03 - 17:07
    But sigma squared has
    more theoretical significance.
  • 17:07 - 17:10
    But sigma makes
    more sense in practice.
  • 17:10 - 17:18
    So that is sigma for the population
    and sigma squared for the population
  • 17:18 - 17:20
    and with 95% confidence.
  • 17:20 - 17:25
    We can find, say, 0.1 percent confidence
    [he means alpha = 0.01, that is, 99% confidence].
  • 17:25 - 17:30
    So let me rewrite this here
    and we'll find 0.1 percent [99% confidence].
  • 17:30 - 17:37
    So this is for 95
    (sigma squared for 95 is this).
  • 17:38 - 17:45
    Let me write this here
    and then find the other one
  • 17:45 - 17:47
    with more confidence, say.
  • 17:50 - 17:55
    So this is for 95%, and let's see
    if we can do that for...
  • 17:56 - 18:01
    What difference...
    What kind of calculation?
  • 18:01 - 18:05
    Let's see. Well, let me
    share this thing again.
  • 18:09 - 18:14
    Alright, so with 0.01, the only
    thing that changes is alpha.
  • 18:14 - 18:17
    The rest is the same.
  • 18:17 - 18:23
    Alpha this time is 0.01, okay?
  • 18:24 - 18:29
    So the thing that changes
    is chi_L and chi_r,
  • 18:30 - 18:33
    so let's calculate chi_L and chi_r again.
  • 18:33 - 18:38
    chi_r is qchisq of 1 minus alpha over 2.
  • 18:38 - 18:44
    Okay, because I have changed alpha,
    so chi_r is this and chi_L, let's find it.
  • 18:46 - 18:49
    Okay, and chi_L is this.
  • 18:49 - 18:55
    The rest of the calculation is the same,
    so let's find the square root of...
  • 18:57 - 19:05
    Everything else is the same,
    so let's see: sqrt, this is chi_L,
  • 19:05 - 19:08
    so that's the right side.
  • 19:12 - 19:14
    16.03.
  • 19:15 - 19:18
    16.03, let me write it on the board.
  • 19:21 - 19:26
    And the other one is chi_r --
  • 19:27 - 19:30
    Let's just change this to r,
    let's just go back.
  • 19:31 - 19:33
    And 7.06.
  • 19:43 - 19:48
    Let's find the confidence
    interval for [the variance]: 257.21.
  • 20:00 - 20:05
    And the other one, chi_r: 49.91.
  • 20:13 - 20:17
    And this is 99% confidence.
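Since only alpha changes between the two runs, the effect can be seen in the critical values alone. A small sketch, assuming the same 21 degrees of freedom; `crit` is a hypothetical helper name introduced for this note.

```r
df <- 21  # n - 1 with n = 22, as in the lecture

# Left and right chi-squared critical values for a given alpha.
crit <- function(alpha) qchisq(c(alpha / 2, 1 - alpha / 2), df)

crit_95 <- crit(0.05)  # about 10.28 and 35.48
crit_99 <- crit(0.01)  # about 8.03 and 41.40

# For 99% the left value is smaller and the right value is larger,
# so (n - 1) * s^2 / chi_r drops and (n - 1) * s^2 / chi_L rises.
print(rbind(crit_95, crit_99))
```

Because the critical values sit in the denominators, moving them outward pushes both end points of the interval outward.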
  • 20:21 - 20:26
    Okay, so let's go back to the...
  • 20:28 - 20:29
    Where is it?
  • 20:32 - 20:38
    So these two, let's look at these two
    and see what we have here.
  • 20:39 - 20:44
    And why... So this one, let's take sigma.
  • 20:46 - 20:54
    That interval is something like,
    this is say 7.63 and this is 14.17.
  • 20:54 - 20:56
    That is for 95%.
  • 20:56 - 21:04
    What is this? 7.06 is right by here,
    and this is like 16.03.
  • 21:04 - 21:09
    So that's 16.03 and 7.06.
  • 21:11 - 21:18
    As you see, the interval
    for 95% is shorter.
  • 21:26 - 21:31
    So you say that
    "with 99% confidence,
  • 21:31 - 21:36
    I think sigma lies in this interval."
  • 21:38 - 21:43
    The other one is 95% confidence,
    and that interval is smaller.
  • 21:43 - 21:46
    If you want 50% confidence,
  • 21:46 - 21:51
    then that will become even more,
    even smaller than this one.
  • 21:52 - 21:56
    Why is it happening?
    What's the meaning?
  • 21:57 - 22:02
    And why does more confidence
    give you a larger interval?
  • 22:05 - 22:08
    You see, when you
    need more confidence...
  • 22:10 - 22:12
    How shall I say that?
  • 22:18 - 22:24
    Well, you see, this larger interval--
  • 22:24 - 22:28
    so you are more confident
    that they are...
  • 22:29 - 22:33
    This lies in a larger interval...
  • 22:36 - 22:38
    Well, because it has
    to do with probability.
  • 22:38 - 22:41
    I can discuss it using probability,
  • 22:41 - 22:43
    so you can say that this
    is, in fact, probability.
  • 22:43 - 22:48
    The probability that it lies in this interval is 0.99 [more precisely,
    99% of intervals built this way from repeated samples contain sigma].
  • 22:48 - 22:52
    So that makes the interval larger
    because when you need the probability--
  • 22:52 - 22:58
    And remember, probability is like
    area under certain curve,
  • 22:58 - 23:00
    theoretically or mathematically.
  • 23:00 - 23:04
    So that changes the interval
  • 23:04 - 23:07
    or let's say the area under the curve.
  • 23:07 - 23:10
    But intuitively, that also makes sense
  • 23:10 - 23:19
    because when you extend the range
    with larger probability,
  • 23:19 - 23:23
    you can... you are more certain
    with larger probability
  • 23:23 - 23:27
    that it really is here.
    So if you say, okay,
  • 23:27 - 23:32
    you have hidden something in the house,
    and the house has 10 rooms.
  • 23:33 - 23:37
    You can say that with--
    just choose 9 of them
  • 23:37 - 23:45
    and you can say with 0.9 probability,
    that object is in one of those 9 rooms.
  • 23:45 - 23:47
    But if you reduce that to 5 rooms,
  • 23:47 - 23:52
    then you can say, okay, with 50%
    probability, it is in one of those rooms.
  • 23:53 - 23:55
    Now, which one is much better?
  • 23:55 - 23:58
    The 9 rooms is much better, right?
  • 23:58 - 24:00
    Because even intuitively,
  • 24:00 - 24:04
    you can say, yeah, I think
    if we choose 9 rooms out of 10,
  • 24:04 - 24:09
    then there's a larger probability
    that the object is in one of them.
  • 24:09 - 24:12
    But if you choose just
    5 rooms out of 10,
  • 24:12 - 24:15
    then the probability goes down
  • 24:15 - 24:19
    (the probability that object which is
    hidden is in one of those 5 rooms).
  • 24:20 - 24:23
    So you can see that
    this is like the rooms.
  • 24:23 - 24:26
    This [the top interval] is a much smaller
    number of rooms. This is like 5 rooms,
  • 24:26 - 24:28
    this [bottom range] is like 9 rooms.
  • 24:29 - 24:34
    Maybe this example I give you,
    that example makes sense.
  • 24:34 - 24:39
    This nice example, I just
    came up with that example.
  • 24:39 - 24:46
    So if you want more confidence,
    you get larger interval,
  • 24:46 - 24:47
    or let's say larger--
  • 24:47 - 24:49
    I should say it backwards.
  • 24:49 - 24:53
    Larger interval gives
    you more confidence.
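One way to see what "confidence" means here is a small simulation: draw many samples from a population whose sigma is known, build the interval each time, and count how often it contains the true sigma. This is only a sketch under assumptions that are not from the lecture: normally distributed data, a made-up mean of 70 and sigma of 10, and a hypothetical helper named `coverage`.

```r
set.seed(1)   # fixed seed so the run is repeatable
sigma <- 10   # true population standard deviation (made up)
n     <- 22   # same sample size as in the lecture
reps  <- 2000 # number of simulated samples

# Fraction of simulated intervals for sigma that contain the true sigma.
coverage <- function(alpha) {
  df <- n - 1
  hits <- replicate(reps, {
    x     <- rnorm(n, mean = 70, sd = sigma)
    lower <- sqrt(df * var(x) / qchisq(1 - alpha / 2, df))
    upper <- sqrt(df * var(x) / qchisq(alpha / 2, df))
    lower <= sigma && sigma <= upper
  })
  mean(hits)
}

cov_95 <- coverage(0.05)  # should come out near 0.95
cov_99 <- coverage(0.01)  # should come out near 0.99
print(c(cov_95 = cov_95, cov_99 = cov_99))
```

The wider 99% intervals catch the true sigma more often, which is the rooms example in numerical form.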
  • 24:54 - 25:01
    Okay. So that was the confidence interval
    for this standard deviation and variance,
  • 25:01 - 25:04
    and I hope to see you next week.
Video Language:
English
Duration:
25:06
