1 00:00:04,750 --> 00:00:13,540 Welcome back. In this video, I want to talk with you about what I'm going to call questioning questions. I want us to dig deeper into this: 2 00:00:13,540 --> 00:00:21,340 We have a question. We have a candidate operationalization. How do we understand what it's actually going to be doing, and how do we evaluate it? 3 00:00:21,340 --> 00:00:25,570 So our learning outcomes here are to be able to identify stakeholders and subgroups for a 4 00:00:25,570 --> 00:00:31,930 problem and to evaluate the operationalization of a question to see who and what it prioritizes. 5 00:00:31,930 --> 00:00:37,030 A key concept I want you to take away from this lecture is that we can understand 6 00:00:37,030 --> 00:00:42,640 a metric or a measurement by asking what it takes to make it improve. 7 00:00:42,640 --> 00:00:49,000 So to return to the example from the previous video, we want to assess whether a new change has improved 8 00:00:49,000 --> 00:00:51,910 our introductory computer science class. 9 00:00:51,910 --> 00:01:00,820 One way we can try to do that is to look at students who have taken it and their grades in the next class, 10 00:01:00,820 --> 00:01:05,200 CS 221, and see if they're more likely to pass the next class. 11 00:01:05,200 --> 00:01:11,500 We can measure this by looking at the grades. But this is not the only way 12 00:01:11,500 --> 00:01:16,030 we can try to measure whether we've improved CS 121.
13 00:01:16,030 --> 00:01:21,570 So this option, I'm going to call it option one: we're looking at the pass rate. What fraction of students pass 14 00:01:21,570 --> 00:01:26,290 CS 221, for whatever definition of pass? Say they get a C minus or better. 15 00:01:26,290 --> 00:01:31,270 We could ask what fraction get a B minus or better. 16 00:01:31,270 --> 00:01:36,460 But what fraction of students pass, and does the new CS 121 method improve it? 17 00:01:36,460 --> 00:01:42,400 Another way we could try to look at it would be to look at the grades students receive in CS 221. 18 00:01:42,400 --> 00:01:47,920 So what is the average grade in CS 221 for students with the new intro 19 00:01:47,920 --> 00:01:55,930 compared with the previous intro? Does the new technique in intro improve it? 20 00:01:55,930 --> 00:02:04,330 We then have a question. Do we want to look at letter grades and maybe compute the average with the same formula used for GPA? 21 00:02:04,330 --> 00:02:10,930 Do we want to look at course points, so actually look at whether they got a 95 or a 99? 22 00:02:10,930 --> 00:02:17,350 But we can look at the movement in the average grades. Now, to evaluate, we have to figure out 23 00:02:17,350 --> 00:02:29,260 which of these we want to do.
And a key tool that I want you to become familiar with for evaluating and understanding different measurements, 24 00:02:29,260 --> 00:02:35,140 metrics, statistics, etc., is to ask: how do I improve this measurement? 25 00:02:35,140 --> 00:02:38,680 So we're computing a stat. We're computing a pass rate, we're computing an average grade. 26 00:02:38,680 --> 00:02:49,000 How do I make this measurement better or worse? And this is going to give us a lot of insight into how the measurement works and what it prioritizes. 27 00:02:49,000 --> 00:02:53,020 This way of thinking about a measurement or a problem, 28 00:02:53,020 --> 00:02:58,030 I think, is going to serve you well throughout this class and throughout the rest of your education and work. 29 00:02:58,030 --> 00:03:02,710 So let's talk about how to improve these measurements. 30 00:03:02,710 --> 00:03:08,710 Say we want to improve the pass rate and we're not changing CS 221 itself. 31 00:03:08,710 --> 00:03:12,100 We could improve the pass rate by just passing everybody in 221. 32 00:03:12,100 --> 00:03:18,390 But if we're leaving 221 alone and we're trying to improve the pass rate 33 00:03:18,390 --> 00:03:23,010 by making a change to the class that prepares students for it, 34 00:03:23,010 --> 00:03:28,340 the only way to improve this is by helping those students who are going to have the most difficulty with 35 00:03:28,340 --> 00:03:35,040 CS 221.
If a student goes through 121 and they're going to get a B or an A in 36 00:03:35,040 --> 00:03:40,110 CS 221, 37 00:03:40,110 --> 00:03:45,630 no change that we make to their 121 experience is going to improve our metric. 38 00:03:45,630 --> 00:03:51,710 The only way to improve it is by helping more students move 39 00:03:51,710 --> 00:04:03,550 from a D to a C minus. With option two, measuring the grade, we can improve this by students doing better, 40 00:04:03,550 --> 00:04:06,670 or rather, by enabling students to do better. 41 00:04:06,670 --> 00:04:14,110 If a student that would have gotten an A minus under the previous 121 is now prepared to the point where they'll get an A, 42 00:04:14,110 --> 00:04:19,270 we improve it, and we improve it just as much as if we move a student from a C minus to a C. 43 00:04:19,270 --> 00:04:23,650 And so there's more opportunity to make the grade better. 44 00:04:23,650 --> 00:04:31,690 But we can also improve this metric by helping only the students who were already well-prepared for CS 221. 45 00:04:31,690 --> 00:04:36,730 So this brings us to a key point: you get what you measure. 46 00:04:36,730 --> 00:04:43,390 If you set something up as the evaluation criterion (we're going to see this really clearly once we start optimizing machine learning models), 47 00:04:43,390 --> 00:04:48,600 if you set something up as your optimization criterion, that's what you get.
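The contrast between the two options can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the cohort of letter grades is made up, and the `GRADE_POINTS` mapping is the common 4.0-scale convention rather than anything given in the lecture. The names `pass_rate` and `average_grade` are hypothetical helpers.

```python
# Hypothetical grade-point mapping on a 4.0 scale (an assumption, not from the lecture).
GRADE_POINTS = {
    "A": 4.0, "A-": 3.7, "B+": 3.3, "B": 3.0, "B-": 2.7,
    "C+": 2.3, "C": 2.0, "C-": 1.7, "D": 1.0, "F": 0.0,
}


def pass_rate(grades, passing="C-"):
    """Option one: fraction of students at or above the passing grade."""
    cutoff = GRADE_POINTS[passing]
    return sum(GRADE_POINTS[g] >= cutoff for g in grades) / len(grades)


def average_grade(grades):
    """Option two: mean grade points, computed the way a GPA is."""
    return sum(GRADE_POINTS[g] for g in grades) / len(grades)


# Made-up CS 221 grades for five students who took the previous intro class.
baseline = ["A-", "B", "C", "D", "F"]

# Scenario 1: the new intro helps only an already strong student (A- -> A).
helps_top = ["A", "B", "C", "D", "F"]

# Scenario 2: the new intro helps only a student on the edge (D -> C-).
helps_edge = ["A-", "B", "C", "C-", "F"]

for name, cohort in [("baseline", baseline),
                     ("helps_top", helps_top),
                     ("helps_edge", helps_edge)]:
    print(f"{name:10s} pass rate = {pass_rate(cohort):.2f}, "
          f"average = {average_grade(cohort):.2f}")
```

In this toy cohort, moving the A-minus student to an A raises the average but leaves the pass rate where it was, while moving the D student to a C minus raises both. Only the second kind of change registers under option one.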
48 00:04:48,600 --> 00:04:58,470 If you evaluate changes to the introductory class by people passing the next class, then 49 00:04:58,470 --> 00:05:08,970 that structures the pedagogy, the course design, and the teaching evaluations to favor preparing students to pass the next class. 50 00:05:08,970 --> 00:05:18,330 If you measure the increase in average grade, that structures it to improve average grades. 51 00:05:18,330 --> 00:05:24,870 But it might focus the attention more on helping the students who were going to get a pretty good grade 52 00:05:24,870 --> 00:05:26,740 anyway. 53 00:05:26,740 --> 00:05:34,930 So far we're looking here at just two subgroups. One metric, pass rate, clearly favors the students who are on the edge. 54 00:05:34,930 --> 00:05:42,240 You can only improve it by helping the students who are on the edge. With the average grade, you can help across the board. 55 00:05:42,240 --> 00:05:50,260 Going beyond that, there are a variety of different stakeholders in the design of an introductory class. 56 00:05:50,260 --> 00:05:54,130 There are the students themselves, who are going to be learning. There are the faculty who have to teach it. 57 00:05:54,130 --> 00:06:02,770 Either they teach it directly, or they depend on students being prepared by that class as a prerequisite for the class
58 00:06:02,770 --> 00:06:09,990 they do teach. The department, obviously, is a stakeholder because it has an interest in producing a good education 59 00:06:09,990 --> 00:06:15,820 that students want to come for and that employers want to hire for. 60 00:06:15,820 --> 00:06:18,970 Employers want students to come out of the program well-prepared. 61 00:06:18,970 --> 00:06:28,360 The university has an interest in having programs that produce well-prepared students 62 00:06:28,360 --> 00:06:33,970 and that are also attractive to students, to increase enrollment numbers. 63 00:06:33,970 --> 00:06:40,180 But then even within the broad stakeholder categories, we have different subgroups. 64 00:06:40,180 --> 00:06:43,870 So just within students, we can talk about high-performing students. 65 00:06:43,870 --> 00:06:48,520 We can talk about students who are underprepared for one or another class. 66 00:06:48,520 --> 00:06:51,700 We can talk about students who are in some way or another marginalized. 67 00:06:51,700 --> 00:07:02,110 And they may experience changes in the structure of the class, the delivery of the class, or the assessment differently. 68 00:07:02,110 --> 00:07:09,950 So even once we've identified a stakeholder group, not every subset of that stakeholder group is going to experience
69 00:07:09,950 --> 00:07:14,780 what we're trying to study, or is going to be reflected in the data, in the same way. 70 00:07:14,780 --> 00:07:20,360 We need to be able to identify these different groups to understand what it is that we're actually measuring. 71 00:07:20,360 --> 00:07:26,600 So I want to return, though, to this key question. Ask "how do I improve this?" 72 00:07:26,600 --> 00:07:31,620 when faced with a metric. That really helps us clarify how a metric or a measurement behaves. 73 00:07:31,620 --> 00:07:35,390 What changes to improve it? Also, what can remain the same 74 00:07:35,390 --> 00:07:43,150 while it is improved? And then another important question is, how can it be gamed or manipulated? 75 00:07:43,150 --> 00:07:46,960 Metrics are always what we call lossy and reductive. 76 00:07:46,960 --> 00:07:54,600 Lossy means that no measurement captures everything about a phenomenon we care about. 77 00:07:54,600 --> 00:07:59,850 And reductive means that we're taking a complex phenomenon and 78 00:07:59,850 --> 00:08:04,980 reducing it down to one or a handful of measurements. We always lose something there. 79 00:08:04,980 --> 00:08:14,670 But this question helps us evaluate what we're losing, iterate and improve our metrics, and also 80 00:08:14,670 --> 00:08:21,390 assess whether those weaknesses are actual challenges to validity or just something we need to keep in mind.
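The gaming question can be made concrete with a small Python sketch. It assumes made-up course scores out of 100 and two hypothetical passing cutoffs (none of these numbers come from the lecture), and it mirrors the earlier remark that the pass rate could be raised by simply passing more people in 221.

```python
def pass_rate(scores, cutoff):
    """Fraction of students whose score is at or above the cutoff."""
    return sum(s >= cutoff for s in scores) / len(scores)


# Made-up final course scores (out of 100) for six students.
scores = [95, 88, 74, 66, 58, 41]

original_cutoff = 70  # hypothetical passing line
lowered_cutoff = 55   # same students, same scores, easier passing line

before = pass_rate(scores, original_cutoff)
after = pass_rate(scores, lowered_cutoff)

print(f"pass rate at cutoff {original_cutoff}: {before:.2f}")
print(f"pass rate at cutoff {lowered_cutoff}: {after:.2f}")
```

Nothing about what the students learned changes between the two lines of output; only the definition of "pass" moves. That is one answer to "what can remain the same while the metric improves?"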
81 00:08:21,390 --> 00:08:28,980 So to wrap up, it's crucial to appropriately define our measurements and then to question our definitions. 82 00:08:28,980 --> 00:08:34,650 A useful way to do that is to ask how we would improve a metric that we're thinking about using. 83 00:08:34,650 --> 00:08:39,480 And then we also want to look at who or what a metric prioritizes, 84 00:08:39,480 --> 00:08:54,933 and whether that prioritization is consistent with what we want to accomplish through organizational, business, or scientific goals.