9:59:59.000,9:59:59.000 1[br]00:00:04,470 --> 00:00:10,000[br]Hello. In this video, I'm going to introduce our week three module on presenting data. 9:59:59.000,9:59:59.000 2[br]00:00:10,000 --> 00:00:14,110[br]So our learning outcomes for this week are for you to be able to create plots from data, 9:59:59.000,9:59:59.000 3[br]00:00:14,110 --> 00:00:18,910[br]identify the appropriate type of plot for the data in question. 9:59:59.000,9:59:59.000 4[br]00:00:18,910 --> 00:00:24,130[br]I want you to be able to read and interpret a plot, refine a plot to more clearly show its data, 9:59:59.000,9:59:59.000 5[br]00:00:24,130 --> 00:00:34,030[br]but then also put these plots and our other presentations and discussions of data into a well-organized notebook to present your data analysis. 9:59:59.000,9:59:59.000 6[br]00:00:34,030 --> 00:00:41,260[br]Before we dive into how to actually present data, I want to start by talking about some of the purposes of data presentation, 9:59:59.000,9:59:59.000 7[br]00:00:41,260 --> 00:00:45,190[br]because these purposes should guide your presentation design decisions. 9:59:59.000,9:59:59.000 8[br]00:00:45,190 --> 00:00:49,750[br]They should guide your evaluation of your own presentations and those of others. 9:59:59.000,9:59:59.000 9[br]00:00:49,750 --> 00:00:56,050[br]They'll also guide my evaluation of your presentation when you are submitting assignments. 9:59:59.000,9:59:59.000 10[br]00:00:56,050 --> 00:01:02,320[br]And one of the first things we need to do is guide the reader's attention to important results. 9:59:59.000,9:59:59.000 11[br]00:01:02,320 --> 00:01:07,900[br]An effective presentation is not going to have every single thing in it. 9:59:59.000,9:59:59.000 12[br]00:01:07,900 --> 00:01:12,820[br]It's going to draw focus and attention to the important results and make it easy for 9:59:59.000,9:59:59.000 13[br]00:01:12,820 --> 00:01:17,710[br]your reader to ask the key questions around the data analysis you're presenting.
9:59:59.000,9:59:59.000 14[br]00:01:17,710 --> 00:01:22,420[br]But then it also needs to substantiate the results and conclusions. 9:59:59.000,9:59:59.000 15[br]00:01:22,420 --> 00:01:30,010[br]So when we're presenting the data, we want to guide the reader to focus on what it is that we want them to learn, 9:59:59.000,9:59:59.000 16[br]00:01:30,010 --> 00:01:39,430[br]but also in a context that gives them the information needed to assess the validity of the conclusions that we're presenting. 9:59:59.000,9:59:59.000 17[br]00:01:39,430 --> 00:01:44,710[br]And we want to do so with integrity. It's easy to make charts that highlight the result 9:59:59.000,9:59:59.000 18[br]00:01:44,710 --> 00:01:50,440[br]you want the user to see, whether or not that result is rigorously defensible from the data. 9:59:59.000,9:59:59.000 19[br]00:01:50,440 --> 00:01:57,730[br]And we want to avoid making those kinds of misleading data visualizations and presentations. 9:59:59.000,9:59:59.000 20[br]00:01:57,730 --> 00:02:03,910[br]So in doing this, we want to be able to think about the audience, and you're gonna be presenting data to several different audiences. 9:59:59.000,9:59:59.000 21[br]00:02:03,910 --> 00:02:05,710[br]The first audience is yourself. 9:59:59.000,9:59:59.000 22[br]00:02:05,710 --> 00:02:12,770[br]When you're working with a data set, when you're trying to understand what you're learning from it, the results of your inferences, 9:59:59.000,9:59:59.000 23[br]00:02:12,770 --> 00:02:21,070[br]you're presenting data to yourself, and you need the presentation to be clear so that you understand what it is that you're learning from the data, 9:59:59.000,9:59:59.000 24[br]00:02:21,070 --> 00:02:26,980[br]so you see the next question to ask and you're not misleading yourself in the data analysis process. 9:59:59.000,9:59:59.000 25[br]00:02:26,980 --> 00:02:34,000[br]But those kinds of charts don't necessarily need the same level of polish that a chart for an external audience
9:59:59.000,9:59:59.000 26[br]00:02:34,000 --> 00:02:39,880[br]would. Your collaborators, supervisors, et cetera, need to be able to see the data, see what you're learning. 9:59:59.000,9:59:59.000 27[br]00:02:39,880 --> 00:02:43,690[br]Maybe it's in the weekly meeting you have with your research advisor. 9:59:59.000,9:59:59.000 28[br]00:02:43,690 --> 00:02:47,130[br]You're presenting them with the results that you found. 9:59:59.000,9:59:59.000 29[br]00:02:47,130 --> 00:02:53,340[br]They are people who have a lot of knowledge of the project you're working on and the problem space you're working on. 9:59:59.000,9:59:59.000 30[br]00:02:53,340 --> 00:02:58,230[br]They're gonna help you guide and refine your questions. 9:59:59.000,9:59:59.000 31[br]00:02:58,230 --> 00:03:06,660[br]Again, they perhaps don't need as much polish as a final published result, but they're not just the ones that you're creating internally for yourself. 9:59:59.000,9:59:59.000 32[br]00:03:06,660 --> 00:03:08,550[br]You're going to be presenting to expert readers. 9:59:59.000,9:59:59.000 33[br]00:03:08,550 --> 00:03:15,030[br]If you write a scientific paper, the readers are probably going to have some level of expertise in the topic that you're talking about. 9:59:59.000,9:59:59.000 34[br]00:03:15,030 --> 00:03:20,550[br]They'll probably know the subject in general, but they may not know your specific work. 9:59:59.000,9:59:59.000 35[br]00:03:20,550 --> 00:03:28,650[br]You may be presenting this to decision makers, especially if you're doing a data science project in an industrial or corporate environment. 9:59:59.000,9:59:59.000 36[br]00:03:28,650 --> 00:03:29,820[br]You're providing data.
9:59:59.000,9:59:59.000 37[br]00:03:29,820 --> 00:03:39,240[br]That's going to inform the decisions of your boss, who may not have significant statistical expertise or data expertise, or they may, 9:59:59.000,9:59:59.000 38[br]00:03:39,240 --> 00:03:45,840[br]but they're going to be using the data that you present in order to make decisions. 9:59:59.000,9:59:59.000 39[br]00:03:45,840 --> 00:03:52,170[br]And then finally, you may occasionally be producing or presenting data for the general public at large. 9:59:59.000,9:59:59.000 40[br]00:03:52,170 --> 00:03:57,120[br]Each of these audiences is going to require different things from your data presentation. 9:59:59.000,9:59:59.000 41[br]00:03:57,120 --> 00:04:04,830[br]So you need to understand who it is that you're presenting the data to in order to make appropriate data presentation decisions. 9:59:59.000,9:59:59.000 42[br]00:04:04,830 --> 00:04:14,940[br]When we're presenting the data, here are some questions that are going to help us understand what it is that we need to guide the reader towards. 9:59:59.000,9:59:59.000 43[br]00:04:14,940 --> 00:04:22,110[br]So we need to be clear: the reader needs to come away knowing what we sought to find out. 9:59:59.000,9:59:59.000 44[br]00:04:22,110 --> 00:04:30,360[br]This might be just explicitly stating our research questions, but they need to know the purpose that the data we're presenting is supposed to serve. 9:59:59.000,9:59:59.000 45[br]00:04:30,360 --> 00:04:36,630[br]What are they supposed to learn from it? They then need to see what we did learn. 9:59:59.000,9:59:59.000 46[br]00:04:36,630 --> 00:04:43,110[br]And then they need to see the supporting evidence, the context to trust the conclusions.
9:59:59.000,9:59:59.000 47[br]00:04:43,110 --> 00:04:48,330[br]It's not just enough to say here are the results, but in many cases you need to provide enough data, 9:59:59.000,9:59:59.000 48[br]00:04:48,330 --> 00:04:56,190[br]enough context that they not only see what you learned, but they see why you believe it is true. 9:59:59.000,9:59:59.000 49[br]00:04:56,190 --> 00:05:04,350[br]A presentation with integrity shows the reader, and really makes it clear to the reader, what we learned, 9:59:59.000,9:59:59.000 50[br]00:05:04,350 --> 00:05:08,370[br]the evidentiary support behind it, why it flows from the data. 9:59:59.000,9:59:59.000 51[br]00:05:08,370 --> 00:05:17,280[br]Whereas a dishonest presentation manipulates them into the conclusion without having the rigorous foundation underneath it. 9:59:59.000,9:59:59.000 52[br]00:05:17,280 --> 00:05:26,790[br]So I want to show you an example of a very bad graphic that came out of the state of Georgia earlier this year. 9:59:59.000,9:59:59.000 53[br]00:05:26,790 --> 00:05:40,980[br]They presented, on network television, a graph that's purporting to show COVID cases in various high-population counties over time. 9:59:59.000,9:59:59.000 54[br]00:05:40,980 --> 00:05:49,170[br]But if you look closely at the X axis of this graph, and you'll see this more clearly when you go and look at the slides, 9:59:59.000,9:59:59.000 55[br]00:05:49,170 --> 00:05:57,660[br]the axis is not sorted. It starts with the twenty-eighth of April and then it goes to the twenty-seventh, followed by the twenty-ninth. 9:59:59.000,9:59:59.000 56[br]00:05:57,660 --> 00:06:01,470[br]Then May 1st. Then April 30th. At the end we have May 2nd. 9:59:59.000,9:59:59.000 57[br]00:06:01,470 --> 00:06:06,420[br]May 7th. April 26th. May 3rd. It violates the expected convention, 9:59:59.000,9:59:59.000 58[br]00:06:06,420 --> 00:06:12,030[br]and what you'd need to do to show a trend over time: that time goes from left to right.
9:59:59.000,9:59:59.000 59[br]00:06:12,030 --> 00:06:18,960[br]They're putting things out of order to show the trend 9:59:59.000,9:59:59.000 60[br]00:06:18,960 --> 00:06:26,880[br]they want, in a way that's not substantiated by the data. And it takes a lot of work to make a chart 9:59:59.000,9:59:59.000 61[br]00:06:26,880 --> 00:06:35,480[br]this bad. I'm not sure how to do it in any of the statistical software I actually use. This is an egregious example, 9:59:59.000,9:59:59.000 62[br]00:06:35,480 --> 00:06:42,220[br]but it's an example of one of the things that can happen when we focus on 9:59:59.000,9:59:59.000 63[br]00:06:42,220 --> 00:06:47,220[br]the effect we want to demonstrate over the evidentiary support for it. 9:59:59.000,9:59:59.000 64[br]00:06:47,220 --> 00:07:04,960[br]I want to contrast that with one of W. E. B. Du Bois's charts, created for the 1900 Paris Exposition. 9:59:59.000,9:59:59.000 65[br]00:07:04,960 --> 00:07:11,250[br]And these were a series of charts for an exhibition to show the economic, educational, 9:59:59.000,9:59:59.000 66[br]00:07:11,250 --> 00:07:19,660[br]etc. progress of black Americans from emancipation to the turn of the century. 9:59:59.000,9:59:59.000 67[br]00:07:19,660 --> 00:07:23,500[br]And he made a series of charts showing economic situations and things. 9:59:59.000,9:59:59.000 68[br]00:07:23,500 --> 00:07:28,090[br]And here's a bar chart that clearly shows the result: 9:59:59.000,9:59:59.000 69[br]00:07:28,090 --> 00:07:35,350[br]the distribution of economic statuses for farmers after a year of farm labor. 9:59:59.000,9:59:59.000 70[br]00:07:35,350 --> 00:07:39,250[br]And it shows we have the first categories of bankrupt and in debt. 9:59:59.000,9:59:59.000 71[br]00:07:39,250 --> 00:07:46,570[br]And then we have four or five different levels of non-negative return, up to clearing
9:59:59.000,9:59:59.000 72[br]00:07:46,570 --> 00:07:53,160[br]fifty dollars or more. And the lengths of the bars are proportional to the data. 9:59:59.000,9:59:59.000 73[br]00:07:53,160 --> 00:07:55,290[br]It very clearly highlights these things. 9:59:59.000,9:59:59.000 74[br]00:07:55,290 --> 00:08:03,720[br]It then also does a creative thing: it pulls out a separate bar that it indicates is the composite of all of the non-negative bars. 9:59:59.000,9:59:59.000 75[br]00:08:03,720 --> 00:08:08,760[br]And we can see that even if you add all of the non-negative bars together, 9:59:59.000,9:59:59.000 76[br]00:08:08,760 --> 00:08:13,860[br]it's not as many farmers as the in-debt category. 9:59:59.000,9:59:59.000 77[br]00:08:13,860 --> 00:08:17,490[br]That's not a very standard thing. These charts were hand drawn. 9:59:59.000,9:59:59.000 78[br]00:08:17,490 --> 00:08:25,260[br]But it's a creative use of the visualization to highlight, in a way that's supported by the data, 9:59:59.000,9:59:59.000 79[br]00:08:25,260 --> 00:08:35,910[br]the relative distributions of the different return levels for black American farmers at the time. 9:59:59.000,9:59:59.000 80[br]00:08:35,910 --> 00:08:42,990[br]Another one that's creative here is this spiral bar chart. 9:59:59.000,9:59:59.000 81[br]00:08:42,990 --> 00:08:49,080[br]Again, it's an unusual thing, but the lines, we can see them getting progressively longer. 9:59:59.000,9:59:59.000 82[br]00:08:49,080 --> 00:08:55,680[br]And if this were just a normal horizontal bar chart without the spiral, the smallest line, 9:59:59.000,9:59:59.000 83[br]00:08:55,680 --> 00:09:00,570[br]the first line, 1875, would be so small you couldn't even see it. 9:59:59.000,9:59:59.000 84[br]00:09:00,570 --> 00:09:07,050[br]This gives more space. And it shows good visualization doesn't just mean following a checklist of rules.
9:59:59.000,9:59:59.000 85[br]00:09:07,050 --> 00:09:16,920[br]It means presenting the data in a way that the conclusions and the takeaways are clear and they're rigorously supported by the underlying data. 9:59:59.000,9:59:59.000 86[br]00:09:16,920 --> 00:09:20,850[br]There are no visual tricks to make things look larger or smaller than they are. 9:59:59.000,9:59:59.000 87[br]00:09:20,850 --> 00:09:27,990[br]It transparently shows the connection between the conclusion and the underlying data that support it. 9:59:59.000,9:59:59.000 88[br]00:09:27,990 --> 00:09:33,850[br]So to wrap up, the goal of good presentation is to guide the reader to what we learned and how we know it. 9:59:59.000,9:59:59.000 89[br]00:09:33,850 --> 00:09:41,460[br]Effective presentation is going to highlight the important things for the reader to understand without distraction or deception. 9:59:59.000,9:59:59.000 90[br]00:09:41,460 --> 00:09:55,533[br]And we're going to see throughout more of this week and throughout more of the semester how practically to go about doing that. 9:59:59.000,9:59:59.000
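As a small illustration of the axis-ordering problem in the Georgia chart discussed in this video, here is a minimal Python sketch. It is not taken from the lecture's slides. It assumes the date labels are written like "April 28". The helper name `sort_chronologically` and the `year` default are hypothetical; the lecture does not state the year, and one is only needed so the labels can be parsed into real dates. Only the dates read out in the video are used, and no case counts are included.

```python
from datetime import datetime


def sort_chronologically(labels, year=2020):
    """Return date labels such as 'April 28' in calendar order.

    `year` is an assumption made for parsing only; it does not come
    from the lecture.
    """
    def as_date(label):
        # Parse "Month day" plus the assumed year into a real date.
        return datetime.strptime(f"{label} {year}", "%B %d %Y")

    return sorted(labels, key=as_date)


# X-axis order as read out in the lecture (only the dates named there).
axis_as_drawn = [
    "April 28", "April 27", "April 29", "May 1", "April 30",
    "May 2", "May 7", "April 26", "May 3",
]

# The order a reader expects for a trend over time: left to right.
axis_sorted = sort_chronologically(axis_as_drawn)
print(axis_sorted)
```

Sorting on parsed dates rather than on the label text matters here, because sorting the strings alphabetically would put every April label before every May label only by coincidence and would still place "May 1" after "April 30" for the wrong reason.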