https:/.../564e8ea0-13c3-4722-aeca-ad7501798e69-8faeb2ce-79d2-480f-b337-ad8c0110cfc1.mp4?invocationId=7b7b05c5-6603-ec11-a9e9-0a1a827ad0ec

  • Not Synced
    1
    00:00:04,750 --> 00:00:13,540
Welcome back. In this video, I want to talk with you about what I'm going to call questioning questions. I want us to dig deeper.
  • Not Synced
    2
    00:00:13,540 --> 00:00:21,340
    We have a question. We have a candidate operationalization. How do we understand what it's actually going to be doing and how do we evaluate it?
  • Not Synced
    3
    00:00:21,340 --> 00:00:25,570
So our learning outcomes here are to be able to identify stakeholders and subgroups for a
  • Not Synced
    4
    00:00:25,570 --> 00:00:31,930
    problem and to evaluate the operationalization of a question to see who and what it prioritizes.
  • Not Synced
    5
    00:00:31,930 --> 00:00:37,030
    A key concept I want you to take away from this lecture is that we can understand
  • Not Synced
    6
    00:00:37,030 --> 00:00:42,640
    a metric or a measurement by asking what it takes to make it improve.
  • Not Synced
    7
    00:00:42,640 --> 00:00:49,000
So to return to the example from the previous video: we want to assess whether a new change has improved
  • Not Synced
    8
    00:00:49,000 --> 00:00:51,910
our introductory computer science class.
  • Not Synced
    9
    00:00:51,910 --> 00:01:00,820
One way we can try to do that is to look at the grades of students who have taken the next class,
  • Not Synced
    10
    00:01:00,820 --> 00:01:05,200
CS 221, and see if they're more likely to pass that class.
  • Not Synced
    11
    00:01:05,200 --> 00:01:11,500
We can measure this by looking at the grades. But this is not the only way that we can
  • Not Synced
    12
    00:01:11,500 --> 00:01:16,030
try to measure whether we've improved CS 121.
  • Not Synced
    13
    00:01:16,030 --> 00:01:21,570
So this option, I'm going to call it option one: we're looking at the pass rate, what fraction of students pass
  • Not Synced
    14
    00:01:21,570 --> 00:01:26,290
CS 221, for whatever definition of pass: they get a C minus or better.
  • Not Synced
    15
    00:01:26,290 --> 00:01:31,270
We could say what fraction get a B, or a B minus or better.
  • Not Synced
    16
    00:01:31,270 --> 00:01:36,460
But what fraction of students pass, and does the new CS 121 improve it?
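The pass-rate operationalization just described can be sketched in a few lines of code. This is a hypothetical illustration: the cohort data and the set of passing grades are made-up assumptions, not anything from the actual course.

```python
# Option one, sketched: the pass rate in CS 221, where "pass" means
# a C- or better. All grade data here is made up for illustration.
PASSING = {"A", "A-", "B+", "B", "B-", "C+", "C", "C-"}

def pass_rate(grades):
    """Fraction of students whose CS 221 letter grade is C- or better."""
    return sum(g in PASSING for g in grades) / len(grades)

# Compare a cohort that took the old intro with one that took the new intro.
old_intro = ["A", "B", "C", "D", "F", "C-"]
new_intro = ["A", "B", "C", "C-", "C-", "C-"]
print(pass_rate(old_intro))  # 4 of 6 students pass
print(pass_rate(new_intro))  # 6 of 6 students pass
```

Note that the metric only moves when a student crosses the C- line; any other grade change is invisible to it.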
  • Not Synced
    17
    00:01:36,460 --> 00:01:42,400
Another way we could try to look at it would be to look at the grades students receive in CS 221.
  • Not Synced
    18
    00:01:42,400 --> 00:01:47,920
So what is the average grade for students with the new intro, versus
  • Not Synced
    19
    00:01:47,920 --> 00:01:55,930
with the previous intro, in CS 221? Does the new intro improve it?
  • Not Synced
    20
    00:01:55,930 --> 00:02:04,330
We then have a question: do we want to look at letter grades, and maybe compute the average with the same formula used for GPA?
  • Not Synced
    21
    00:02:04,330 --> 00:02:10,930
Or do we want to look at course points? So actually look at what they got: a 95 or a 99?
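The two averaging choices just mentioned can be sketched side by side. The 4.0-scale GPA mapping and the sample grades below are illustrative assumptions, not the course's actual scale.

```python
# Option two, sketched: two ways to average CS 221 grades.
# The GPA mapping and the sample data are made-up assumptions.
GPA_POINTS = {"A": 4.0, "A-": 3.7, "B+": 3.3, "B": 3.0, "B-": 2.7,
              "C+": 2.3, "C": 2.0, "C-": 1.7, "D": 1.0, "F": 0.0}

def mean_gpa(letter_grades):
    """Average letter grades with the same formula used for a GPA."""
    return sum(GPA_POINTS[g] for g in letter_grades) / len(letter_grades)

def mean_points(course_points):
    """Average raw course points, so a 95 and a 99 are distinguished."""
    return sum(course_points) / len(course_points)

letters = ["A", "B", "C-"]
points = [99, 84, 71]
print(mean_gpa(letters))
print(mean_points(points))
```

The design choice matters: letter grades collapse a 95 and a 99 into the same "A", while course points keep that distinction but inherit every quirk of the grading rubric.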
  • Not Synced
    22
    00:02:10,930 --> 00:02:17,350
But we can look at the movement in the average grades. Now, to evaluate, we have to figure out
  • Not Synced
    23
    00:02:17,350 --> 00:02:29,260
    Which of these do we want to do. And a key tool that I want you to become familiar with for evaluating and understanding different measurements,
  • Not Synced
    24
    00:02:29,260 --> 00:02:35,140
    metrics, statistics, etc., is to ask how do I improve this measurement?
  • Not Synced
    25
    00:02:35,140 --> 00:02:38,680
So we're computing a statistic: we're computing a pass rate, or we're computing an average grade.
  • Not Synced
    26
    00:02:38,680 --> 00:02:49,000
    How do I make this measurement better or worse? And this is going to give us a lot of insight into how the measurement works, what it prioritizes.
  • Not Synced
    27
    00:02:49,000 --> 00:02:53,020
This way of thinking about a measurement or a problem,
  • Not Synced
    28
    00:02:53,020 --> 00:02:58,030
    I think is going to serve you well throughout this class and throughout the rest of your education and work.
  • Not Synced
    29
    00:02:58,030 --> 00:03:02,710
So let's talk about these measurements and how to improve them.
  • Not Synced
    30
    00:03:02,710 --> 00:03:08,710
So if we want to improve the pass rate and we are changing CS 221 itself,
  • Not Synced
    31
    00:03:08,710 --> 00:03:12,100
we can improve the pass rate by just passing everybody in CS 221.
  • Not Synced
    32
    00:03:12,100 --> 00:03:18,390
But if we're leaving CS 221 alone, and we're trying to improve the pass rate
  • Not Synced
    33
    00:03:18,390 --> 00:03:23,010
by making a change to the class that prepares students for it,
  • Not Synced
    34
    00:03:23,010 --> 00:03:28,340
the only way to improve this is by helping those students who are going to have the most difficulty with
  • Not Synced
    35
    00:03:28,340 --> 00:03:35,040
CS 221. If a student goes through CS 121 and they're going to get an A or a B in
  • Not Synced
    36
    00:03:35,040 --> 00:03:40,110
CS 221,
  • Not Synced
    37
    00:03:40,110 --> 00:03:45,630
no change that we make to their CS 121 experience is going to improve our metric.
  • Not Synced
    38
    00:03:45,630 --> 00:03:51,710
The only way to improve it is by helping more students move
  • Not Synced
    39
    00:03:51,710 --> 00:04:03,550
from a D to a C minus. Option two, where we're measuring the average grade: we can improve this by making students do better.
  • Not Synced
    40
    00:04:03,550 --> 00:04:06,670
We make it better by enabling students to do better.
  • Not Synced
    41
    00:04:06,670 --> 00:04:14,110
If a student who would have gotten an A minus under the previous CS 121 is now prepared to the point where they'll get an A,
  • Not Synced
    42
    00:04:14,110 --> 00:04:19,270
We improve it, and we improve it just as much as if we moved a student from a C minus to a C.
  • Not Synced
    43
    00:04:19,270 --> 00:04:23,650
And so there's more opportunity to make the grade better.
  • Not Synced
    44
    00:04:23,650 --> 00:04:31,690
But we can also improve this metric by helping the students who were already well-prepared for CS 221.
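This contrast can be made concrete with a toy example (all grades and the GPA weights below are made up): moving a struggling student from a D to a C minus raises both metrics, while moving an already-strong student from an A minus to an A raises only the average grade.

```python
# Toy illustration of "you get what you measure": the two metrics
# reward helping different subgroups. Data and weights are made up.
GPA = {"A": 4.0, "A-": 3.7, "B": 3.0, "C": 2.0, "C-": 1.7, "D": 1.0}

def pass_rate(grades):
    """Fraction with a C- (1.7 grade points) or better."""
    return sum(GPA[g] >= 1.7 for g in grades) / len(grades)

def mean_gpa(grades):
    return sum(GPA[g] for g in grades) / len(grades)

baseline    = ["A-", "B", "C", "D"]
helped_edge = ["A-", "B", "C", "C-"]  # struggling student moved D -> C-
helped_top  = ["A", "B", "C", "D"]    # strong student moved A- -> A

# D -> C- raises the pass rate (0.75 -> 1.0) and also the mean GPA.
# A- -> A raises only the mean GPA; the pass rate stays at 0.75.
print(pass_rate(baseline), pass_rate(helped_edge), pass_rate(helped_top))
print(mean_gpa(baseline), mean_gpa(helped_edge), mean_gpa(helped_top))
```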
  • Not Synced
    45
    00:04:31,690 --> 00:04:36,730
So this brings us to a key point: you get what you measure.
  • Not Synced
    46
    00:04:36,730 --> 00:04:43,390
If you set something up as the evaluation criterion (we're going to see this really clearly once we start optimizing machine learning models),
  • Not Synced
    47
    00:04:43,390 --> 00:04:48,600
    When you set something up as your optimization criteria, that's what you get.
  • Not Synced
    48
    00:04:48,600 --> 00:04:58,470
If you evaluate changes to the introductory class by whether people pass the next class, then
  • Not Synced
    49
    00:04:58,470 --> 00:05:08,970
that structures the pedagogy, the course design, and the teaching evaluations to favor preparing students to pass the next class.
  • Not Synced
    50
    00:05:08,970 --> 00:05:18,330
If you measure the increase in average grade, that structures the class to improve average grades.
  • Not Synced
    51
    00:05:18,330 --> 00:05:24,870
But it might focus the attention more on helping the students who were going to get a pretty good grade
  • Not Synced
    52
    00:05:24,870 --> 00:05:26,740
    Anyway.
  • Not Synced
    53
    00:05:26,740 --> 00:05:34,930
So we're looking here at two subgroups. One metric, the pass rate, clearly favors the students who are on the edge.
  • Not Synced
    54
    00:05:34,930 --> 00:05:42,240
You can only improve it by helping the students who are on the edge. The average grade you can improve by helping across the board.
  • Not Synced
    55
    00:05:42,240 --> 00:05:50,260
There are a variety of different stakeholders in the design of an introductory class.
  • Not Synced
    56
    00:05:50,260 --> 00:05:54,130
There are the students themselves, who are going to be learning. There are the faculty, who have to teach it.
  • Not Synced
    57
    00:05:54,130 --> 00:06:02,770
Either they teach it directly, or they depend on students being prepared by that class as a prerequisite for the class
  • Not Synced
    58
    00:06:02,770 --> 00:06:09,990
they do teach. The department, obviously, is a stakeholder, because it has an interest in producing a good education
  • Not Synced
    59
    00:06:09,990 --> 00:06:15,820
that students want to come for and that employers want to hire from.
  • Not Synced
    60
    00:06:15,820 --> 00:06:18,970
    Employers want students to come out of the program well-prepared.
  • Not Synced
    61
    00:06:18,970 --> 00:06:28,360
The university has an interest in having programs that produce well-prepared students
  • Not Synced
    62
    00:06:28,360 --> 00:06:33,970
and that are also attractive to students, to increase enrollment numbers.
  • Not Synced
    63
    00:06:33,970 --> 00:06:40,180
But then even within the broad stakeholder categories, we have different subgroups.
  • Not Synced
    64
    00:06:40,180 --> 00:06:43,870
    So just within students, we can talk about high performing students.
  • Not Synced
    65
    00:06:43,870 --> 00:06:48,520
    We can talk about students who are underprepared for one or another class.
  • Not Synced
    66
    00:06:48,520 --> 00:06:51,700
    We can talk about students who are in some way or another marginalized.
  • Not Synced
    67
    00:06:51,700 --> 00:07:02,110
And they may experience changes in the structure of the class, the delivery of the class, or the assessment differently.
  • Not Synced
    68
    00:07:02,110 --> 00:07:09,950
So even once we've identified a stakeholder group, not every subset of that stakeholder group is going to experience
  • Not Synced
    69
    00:07:09,950 --> 00:07:14,780
what we're trying to study, or be reflected in the data, in the same way;
  • Not Synced
    70
    00:07:14,780 --> 00:07:20,360
    we need to be able to identify these different groups to understand what it is that we're actually measuring.
  • Not Synced
    71
    00:07:20,360 --> 00:07:26,600
So I want to return, though, to this key question: ask, how do I improve this metric?
  • Not Synced
    72
    00:07:26,600 --> 00:07:31,620
When we're faced with a metric, that question really helps us clarify how a metric or a measurement behaves.
  • Not Synced
    73
    00:07:31,620 --> 00:07:35,390
What changes to improve it? And also, what can remain the same
  • Not Synced
    74
    00:07:35,390 --> 00:07:43,150
while it is improved? And then another important question is: how can it be gamed or manipulated?
  • Not Synced
    75
    00:07:43,150 --> 00:07:46,960
Metrics are always what we call lossy and reductive.
  • Not Synced
    76
    00:07:46,960 --> 00:07:54,600
Lossy means that no measurement captures everything about a phenomenon we care about.
  • Not Synced
    77
    00:07:54,600 --> 00:07:59,850
And reductive means that we're taking a complex phenomenon.
  • Not Synced
    78
    00:07:59,850 --> 00:08:04,980
    We're reducing it down to one or a handful of measurements. We always lose something there.
  • Not Synced
    79
    00:08:04,980 --> 00:08:14,670
But this question helps us evaluate what we're losing, and iterate and improve our metrics, and also
  • Not Synced
    80
    00:08:14,670 --> 00:08:21,390
assess whether those weaknesses are actual challenges to validity or just something we need to keep in mind.
  • Not Synced
    81
    00:08:21,390 --> 00:08:28,980
So to wrap up: it's crucial to appropriately define our measurements, and then to question our definitions.
  • Not Synced
    82
    00:08:28,980 --> 00:08:34,650
    A useful way to do that is to ask how do we improve a metric that we're thinking about using?
  • Not Synced
    83
    00:08:34,650 --> 00:08:39,480
    And then also we want to look at who or what does a metric prioritize?
  • Not Synced
    84
    00:08:39,480 --> 00:08:54,933
And is that prioritization consistent with what we want to accomplish: our organizational, business, or scientific goals?
  • Not Synced
Video Language:
English
Duration:
08:55
