1 00:00:08,280 --> 00:00:13,300 Hello again. In this video, I want to talk about organizing notebooks, as I promised. 2 00:00:13,300 --> 00:00:18,210 We've talked about how to make charts; that's been a lot of what we've been talking about here. 3 00:00:18,210 --> 00:00:25,260 But I want to talk about how we actually put together a notebook that presents these charts and our conclusions from them. 4 00:00:25,260 --> 00:00:33,040 The learning outcomes for this video are for you to be able to use Markdown document structure to organize a notebook, to use the Jupyter 5 00:00:33,040 --> 00:00:40,350 Markdown features to format text in a notebook, and to create a notebook that clearly tells the story of a data analysis. 6 00:00:40,350 --> 00:00:47,610 The first thing to understand is that a notebook is a document. It is a convenient way to run Python code and see its results, 7 00:00:47,610 --> 00:00:52,950 but the notebook structure is first and foremost a document. It's meant to be read. 8 00:00:52,950 --> 00:01:00,300 There's some structure imposed on the document because it has to read in the same order as the code is going to execute, 9 00:01:00,300 --> 00:01:08,420 but we want to be able to actually read it and understand what's going on as we walk through the notebook. 10 00:01:08,420 --> 00:01:12,830 We also want to factor particularly complex computations out of the notebook. 11 00:01:12,830 --> 00:01:15,410 So far, nothing we've been doing is super complex.
12 00:01:15,410 --> 00:01:22,220 But if I have a large, complicated data processing operation, such as training an extensive set of machine learning models, 13 00:01:22,220 --> 00:01:33,200 I'll pull that out of the notebook into other scripts and modules, and leave the notebook for communicating the results of my data analysis. 14 00:01:33,200 --> 00:01:38,570 A notebook has two primary types of cells. We have code cells, which you've seen a lot: 15 00:01:38,570 --> 00:01:44,720 the Python code and its output. And we have Markdown cells that contain formatted text. 16 00:01:44,720 --> 00:01:48,260 I recommend keeping your code cells relatively short: 17 00:01:48,260 --> 00:01:57,050 one to a few lines, or one function definition. If you're defining an entire class and it's taking 100 lines within a code cell, 18 00:01:57,050 --> 00:02:01,580 that's a good sign you should pull it out into a Python module of some kind. 19 00:02:01,580 --> 00:02:06,440 If helpful, show results after the cell. I do this a lot, particularly in development. 20 00:02:06,440 --> 00:02:11,450 But if you have too much of it, it can make the final notebook hard to read because you have all of these outputs, 21 00:02:11,450 --> 00:02:15,440 and the notebook winds up being a sea of charts and tables. 22 00:02:15,440 --> 00:02:20,600 It's difficult to find your way through the notebook and find the pieces that you need to look at. 23 00:02:20,600 --> 00:02:27,020 So go ahead and do a lot of them, especially while you're debugging and prototyping. Before you submit,
24 00:02:27,020 --> 00:02:32,750 go through and clean up: remove things that were just there for you to test how something worked, and leave the cells in 25 00:02:32,750 --> 00:02:37,940 your notebook that help the reader understand the results of what you're doing. 26 00:02:37,940 --> 00:02:45,790 Remember, the purpose of the presentation is to show the reader what you learned and how you know it's true. 27 00:02:45,790 --> 00:02:54,040 Cells that don't help you do that, you can consider removing. 28 00:02:54,040 --> 00:02:58,720 At the end of the day, they might have helped you figure out how to do it. 29 00:02:58,720 --> 00:03:02,650 You can save a copy of your notebook before doing the cleanup so you don't lose them. 30 00:03:02,650 --> 00:03:07,210 You can have a supplementary notebook that has, say, paths you went down 31 00:03:07,210 --> 00:03:11,170 that didn't work out. Another thing you can consider is having an appendix. 32 00:03:11,170 --> 00:03:17,020 You've got all of the main content of the notebook, and then down at the end you have a big heading, "Appendix", 33 00:03:17,020 --> 00:03:20,770 and there you have extra things. You want to make sure the notebook can still run from top to bottom, 34 00:03:20,770 --> 00:03:27,490 but there you have some of the other things that dive into more detail about the building blocks of some of your computations.
35 00:03:27,490 --> 00:03:32,650 But it is good to show the results after loading data and after doing a complex manipulation, 36 00:03:32,650 --> 00:03:38,910 especially one that significantly changes the shape of the data that you're working with. 37 00:03:38,910 --> 00:03:41,790 I'm going to talk mostly in this video, though, about Markdown cells, 38 00:03:41,790 --> 00:03:48,180 because Markdown cells are what you use to build up the structure of your document and make it tell a story, 39 00:03:48,180 --> 00:03:52,830 not just be a kind of strange way to present Python code. 40 00:03:52,830 --> 00:04:01,710 Markdown is a text syntax for simple markup. I'm going to provide a link to the Markdown documentation in the class notes that go with this video. 41 00:04:01,710 --> 00:04:06,570 There are several inline formatting features. If you put two stars around some text, that'll make it bold. 42 00:04:06,570 --> 00:04:16,110 One star will make it italic. You can indicate code, something that shows up in the fixed-width code layout, using backticks. 43 00:04:16,110 --> 00:04:24,570 This is one that I see ignored very frequently in write-ups, but it's really, 44 00:04:24,570 --> 00:04:27,960 really useful for function names, variable names, things like that, 45 00:04:27,960 --> 00:04:32,670 to be able to set them apart: this is a special thing, this is a function name. 46 00:04:32,670 --> 00:04:39,180 You can also use TeX math syntax by putting it between dollar signs in a notebook's Markdown cells.
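As a small illustration of the inline features just listed, the sketch below writes the text of one Markdown cell as a Python string and wraps it in the plain-dictionary layout that notebook files use on disk. The sentence, the `ratings.groupby('year')` call, and the `markdown_cell` helper are all made up for this example; in practice you would simply type the text into a Markdown cell.

```python
import json

# Text for a hypothetical Markdown cell. The wording and the names in it
# are invented for illustration; only the markup is the point.
cell_text = (
    "The **mean** rating is *slightly* higher for recent movies. "   # bold, italic
    "We compute it with `ratings.groupby('year')`, where the mean is "  # inline code
    r"$\bar{x} = \frac{1}{n} \sum_{i} x_i$."                          # inline TeX math
)


def markdown_cell(text):
    """Return a Markdown cell as a plain dict in the notebook (v4) JSON layout."""
    return {"cell_type": "markdown", "metadata": {}, "source": text}


cell = markdown_cell(cell_text)
print(json.dumps(cell, indent=2))
```

Rendered in a notebook, the two-star span appears bold, the one-star span italic, the backtick span in fixed-width type, and the dollar-sign span as typeset math.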
47 00:04:39,180 --> 00:04:48,450 Pay attention to the details of what your formatted text looks like after you render it in the notebook. 48 00:04:48,450 --> 00:04:55,500 Make sure it reads well. Make sure it's clear. Ask yourself: if I weren't the one who wrote this, 49 00:04:55,500 --> 00:05:03,290 would I like reading this? Clean it up and pay attention to those details 50 00:05:03,290 --> 00:05:08,670 to make it look good and be effective at communicating, so that the reader 51 00:05:08,670 --> 00:05:13,170 can clearly understand what the different pieces are, what needs to be emphasized, 52 00:05:13,170 --> 00:05:17,260 etc. Markdown also has a number of block elements. 53 00:05:17,260 --> 00:05:21,520 The basic one is a paragraph; paragraphs are just text separated by blank lines. 54 00:05:21,520 --> 00:05:26,890 You can also have bulleted and numbered lists. You can have code blocks. 55 00:05:26,890 --> 00:05:31,720 These aren't super common in a notebook because a lot of your code is in the code cells that you execute. 56 00:05:31,720 --> 00:05:34,900 But if you need to have a little code that you don't execute for some reason, 57 00:05:34,900 --> 00:05:42,970 you can put it in a code block in Markdown. You can also have block mathematics: a line on its own that begins and ends with two dollar signs.
58 00:05:42,970 --> 00:05:49,170 It can actually span multiple lines; so long as there aren't any blank lines, it's going to be treated as a piece of block mathematics. 59 00:05:49,170 --> 00:05:54,510 It's not inline in a sentence; it becomes its own block in the rendered cell. 60 00:05:54,510 --> 00:06:00,450 Headings are an important one to pay attention to. Markdown headings are lines that start with one, two, 61 00:06:00,450 --> 00:06:06,510 three, up to six hash marks, then a space, then the heading text: heading one, heading two, heading three. 62 00:06:06,510 --> 00:06:11,370 Something that's important to know is that the hashes do not mean "big and bold". 63 00:06:11,370 --> 00:06:16,890 That's what they look like, but that's not what they mean. What they mean is "heading". 64 00:06:16,890 --> 00:06:23,300 So you need to have an outline structure to your notebook using the headings, 65 00:06:23,300 --> 00:06:29,030 and you need to nest them properly: within each H1, you have your H2s, 66 00:06:29,030 --> 00:06:32,760 and then you have your H3s. You don't go straight from H1 to H4; 67 00:06:32,760 --> 00:06:41,540 you have an H3 in the middle. Start the notebook with an H1 that has the notebook title; in a lot of rendering contexts, 68 00:06:41,540 --> 00:06:46,850 that becomes the title at the top of your notebook. Then all your other headings are H2 or lower.
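The nesting rule can also be checked mechanically. The following is a minimal sketch, not a tool used in the course: it pulls heading lines out of the text of Markdown cells and reports any heading that skips a level, along with a missing H1 title at the start. The function names and the sample cell texts are hypothetical.

```python
import re

# One to six hash marks, at least one space, then the heading text.
HEADING = re.compile(r"^(#{1,6}) +(.*\S)\s*$")


def heading_outline(markdown_sources):
    """Return (level, title) pairs for every heading line in the given cell texts."""
    outline = []
    for source in markdown_sources:
        in_fence = False
        for line in source.splitlines():
            # Skip fenced code blocks so a '# comment' is not read as a heading.
            if line.lstrip().startswith("```"):
                in_fence = not in_fence
                continue
            if in_fence:
                continue
            match = HEADING.match(line)
            if match:
                outline.append((len(match.group(1)), match.group(2)))
    return outline


def nesting_problems(outline):
    """Report a missing H1 title and any heading that skips a level."""
    problems = []
    if not outline or outline[0][0] != 1:
        problems.append("notebook does not start with an H1 title")
    for (prev_level, _), (level, title) in zip(outline, outline[1:]):
        if level > prev_level + 1:
            problems.append(f"'{title}' jumps from H{prev_level} to H{level}")
    return problems


# Hypothetical Markdown cell texts, in notebook order.
cells = [
    "# Movie Ratings Analysis",
    "## Setup",
    "## Data\n\nSome text about the data.",
    "#### Missing values",   # skips H3
    "### Distribution",
]
outline = heading_outline(cells)
problems = nesting_problems(outline)
print(problems)
```

Moving back up the outline, for example from an H4 to an H3 or H2, is not flagged; only jumps that skip a level on the way down are.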
69 00:06:46,850 --> 00:06:55,670 If you have an appendix, you might have the appendix be another H1. Also, the section headers should be short, not sentences. 70 00:06:55,670 --> 00:07:00,630 If you're writing an entire sentence in your section header, you're putting too much there. 71 00:07:00,630 --> 00:07:06,840 The section header should be a short title, and then the section content comes after it. 72 00:07:06,840 --> 00:07:13,650 Now, there are a few reasons why it's important to use the heading levels properly. 73 00:07:13,650 --> 00:07:19,710 One is just visual: it helps break up your notebook so we can easily see which component we're at. 74 00:07:19,710 --> 00:07:26,130 Second, there are extensions that will do things like number your headings or give you a browsable table of contents 75 00:07:26,130 --> 00:07:32,130 you can use to navigate the notebook by heading. When I'm rendering notebooks as a part of the course website, 76 00:07:32,130 --> 00:07:36,960 you'll see this over on the right-hand side: you can jump directly to notebook headings. 77 00:07:36,960 --> 00:07:45,720 That only works because I'm consistently using the heading levels to build the outline-based structure of my notebook. 78 00:07:45,720 --> 00:07:53,160 A third reason is accessibility. If someone's reading your notebook with an assistive technology such as a screen reader, 79 00:07:53,160 --> 00:08:01,500 the section headings are very important to help them navigate to the parts of the notebook that are relevant to them at a given time.
80 00:08:01,500 --> 00:08:09,180 On the section headers, one additional little rule: if your section header has to wrap onto a second line, really rethink it. 81 00:08:09,180 --> 00:08:13,230 It's almost certainly too long. In particular, 82 00:08:13,230 --> 00:08:16,860 don't put an entire question in the section header, usually. 83 00:08:16,860 --> 00:08:22,890 Occasionally it's OK to put a whole question, but maybe put a brief, three-to-five-word summary of the question's topic and 84 00:08:22,890 --> 00:08:30,750 then write the question itself as the first paragraph of the section. 85 00:08:30,750 --> 00:08:33,240 If you pay attention to these different formatting features, 86 00:08:33,240 --> 00:08:41,460 you can build a well-structured notebook that communicates clearly and draws the reader's attention to the places where it needs to go. 87 00:08:41,460 --> 00:08:46,650 Now, writing the text itself. Use the document to tell a story. What's the goal of what you're doing, 88 00:08:46,650 --> 00:08:51,630 either of the whole notebook or of individual pieces of analysis? What's the data that we're using? 89 00:08:51,630 --> 00:08:58,140 What do we know about it going in? Up at the top, either at the very top of your notebook or where you're loading the data, 90 00:08:58,140 --> 00:09:04,050 it's useful to write some of this, especially if the notebook is a report you submit to somebody. 91 00:09:04,050 --> 00:09:07,800 It's useful to write there: What do you know? Where did you get the data? How was it collected?
92 00:09:07,800 --> 00:09:16,290 Not a full datasheet, but at least some summary information to help the reader understand what it is that we're going to be looking at. 93 00:09:16,290 --> 00:09:19,800 Why are we doing each piece of the analysis? What's the purpose here? 94 00:09:19,800 --> 00:09:25,260 How does it fit into our broader picture, into our broader goals? What approach are we using? 95 00:09:25,260 --> 00:09:28,200 We don't want to just repeat the code. Writing a numbered list of 96 00:09:28,200 --> 00:09:33,240 steps, where those steps are just a literal translation of the code, doesn't help understanding. 97 00:09:33,240 --> 00:09:37,570 It creates an opportunity for code and documentation to become mismatched. 98 00:09:37,570 --> 00:09:41,670 But if there's anything tricky in the code, 99 00:09:41,670 --> 00:09:48,600 explain why it does the job. Explain the conceptual idea behind why you're approaching things the way you are. 100 00:09:48,600 --> 00:09:56,700 If you're doing a data cleanup, explain what that cleanup is doing and why that's the right cleanup for your data. 101 00:09:56,700 --> 00:10:03,600 And then, what do we learn from it? Oftentimes what I do with an individual piece, like a chart, is: 102 00:10:03,600 --> 00:10:07,260 write what question the chart is supposed to be answering, 103 00:10:07,260 --> 00:10:11,010 or at least the purpose of the chart. Then we have the code to generate the chart itself.
104 00:10:11,010 --> 00:10:18,450 And then we have a text cell that has observations about what we learn from the chart. 105 00:10:18,450 --> 00:10:23,160 So: what are we doing? How are we going to do it, if that's not immediately clear? Code, results. 106 00:10:23,160 --> 00:10:25,590 And then what do we observe from these results? 107 00:10:25,590 --> 00:10:33,350 So the high-level document structure that I recommend is to start with the title and intro. 108 00:10:33,350 --> 00:10:37,140 You've got your title, your heading one. Then: what's the notebook for? 109 00:10:37,140 --> 00:10:41,160 Why does this notebook exist? Are there links to include? 110 00:10:41,160 --> 00:10:46,650 There's hyperlink syntax in Markdown as well; read the Markdown documentation to see how to use that. 111 00:10:46,650 --> 00:10:50,790 But where does this go? Are there things we need to know, 112 00:10:50,790 --> 00:10:54,990 background about why this document's being created? 113 00:10:54,990 --> 00:11:00,360 Where did the data come from? If we have defined research questions, what are those research questions? 114 00:11:00,360 --> 00:11:03,750 You can write those right in the intro of the notebook. Then I have a setup. 115 00:11:03,750 --> 00:11:09,060 I almost always have a setup section that comes next, where I import my Python libraries.
116 00:11:09,060 --> 00:11:11,970 I maybe define some helper functions that I'm going to be using throughout the 117 00:11:11,970 --> 00:11:16,860 notebook; helper functions specific to one section I might define in that section. 118 00:11:16,860 --> 00:11:20,730 And then I load the data. Sometimes I load the data as a part of the setup: 119 00:11:20,730 --> 00:11:27,990 import modules and then load my data. Sometimes, especially if I have more to say about the data, it's its own section. 120 00:11:27,990 --> 00:11:34,500 But as I load each table, I often just show the first few rows of it so that I can see, OK, I've loaded this data, and then it's right there. 121 00:11:34,500 --> 00:11:39,390 We can see, as we're going through the rest of the notebook, what the data we just loaded looks like. 122 00:11:39,390 --> 00:11:45,030 Then we perform our analysis, and this might be two sections. It might be five, six, seven, eight sections. 123 00:11:45,030 --> 00:11:49,920 And then finally, at the end, we can summarize and conclude. I don't always do this in 124 00:11:49,920 --> 00:11:53,340 my research notebooks, because often that's the material that goes in the paper. 125 00:11:53,340 --> 00:11:57,150 But particularly in our assignments, where we're submitting notebooks, 126 00:11:57,150 --> 00:12:00,810 put that at the end of the notebook: what do we learn from this? 127 00:12:00,810 --> 00:12:03,150 Sometimes I'm going to have specific directions for things
128 00:12:03,150 --> 00:12:08,730 I want you to reflect on there. When, like in Assignment 1, I've broken down the different requirements, 129 00:12:08,730 --> 00:12:15,120 those become good candidates for your H2s, your level-two headings, one for each of those. 130 00:12:15,120 --> 00:12:21,730 So we've got, I think, six different requirements in Assignment 1. 131 00:12:21,730 --> 00:12:28,120 An H2 heading, a primary section of your document, for each of those is a good starting point for your layout. 132 00:12:28,120 --> 00:12:34,460 In addition, you're probably going to have another one up at the top for the setup and maybe another for the data load. 133 00:12:34,460 --> 00:12:38,590 But think about the flow of your document. It should be able to communicate: 134 00:12:38,590 --> 00:12:42,760 What are we doing? What are the prerequisites in terms of setup and data? 135 00:12:42,760 --> 00:12:47,790 How are we actually doing it? And then at the end, what do we learn? 136 00:12:47,790 --> 00:12:53,580 You're going to write a lot of cells and produce a lot of outputs in your notebook while you're debugging. 137 00:12:53,580 --> 00:12:59,220 Before you submit, or before you share in other contexts, spend some time cleaning up your notebook: 138 00:12:59,220 --> 00:13:06,780 remove dead ends and extraneous outputs that you included for debugging but that don't fit in the flow of the story. 139 00:13:06,780 --> 00:13:13,350 Consider putting them in a supplementary notebook
if you want to keep them around, and then make sure you can rerun your notebook from top to bottom. 140 00:13:13,350 --> 00:13:20,300 In the Jupyter interface there's the Kernel menu; click that and choose "Restart & Run All". 141 00:13:20,300 --> 00:13:26,570 It will restart the Python kernel that's actually running your code, so all your variables disappear. 142 00:13:26,570 --> 00:13:31,220 Your data is no longer loaded. And then it starts running the notebook from top to bottom. 143 00:13:31,220 --> 00:13:37,220 You want that to succeed so that someone else working with the notebook can actually rerun it and reproduce your results. 144 00:13:37,220 --> 00:13:43,790 If that doesn't succeed, it means either you deleted something that's important, or the order of 145 00:13:43,790 --> 00:13:48,740 your cells in the notebook does not match the order in which they actually have to be executed. 146 00:13:48,740 --> 00:13:54,980 So make sure it succeeds, and also read back through the notebook to make sure that the charts all still look right, 147 00:13:54,980 --> 00:14:01,100 the conclusions are all still correct, etc., before you submit the final notebook. 148 00:14:01,100 --> 00:14:05,330 When you're writing a notebook, you also need to know your audience and your purpose. 149 00:14:05,330 --> 00:14:10,550 For example, take the notebooks I'm writing for you for teaching purposes here:
150 00:14:10,550 --> 00:14:17,060 the things I write in them differ from what I'm going to write in a research notebook that I share with my collaborators 151 00:14:17,060 --> 00:14:22,700 or use for my own purposes, because my purpose in these notebooks is partially to explain how they're working. 152 00:14:22,700 --> 00:14:28,850 So I'm going to say more in these notebooks about what exactly the code is 153 00:14:28,850 --> 00:14:35,040 doing, so that you can learn how the code works, than I would in a research notebook. 154 00:14:35,040 --> 00:14:41,730 But also, your own personal use, sharing with your adviser or your supervisor, 155 00:14:41,730 --> 00:14:47,970 sharing with the public, either the professional public working on your topic or the general public: 156 00:14:47,970 --> 00:14:54,180 these are all different audiences, and they're going to need different levels of explanation and different things highlighted in your notebook. 157 00:14:54,180 --> 00:14:56,820 Also, not all audiences are well served by notebooks. 158 00:14:56,820 --> 00:15:04,500 Notebooks are fantastic for internal reports, collaboration, et cetera: sharing the results of a data analysis with colleagues or with yourself. 159 00:15:04,500 --> 00:15:09,090 But for final publication, you're often going to need a separate final report. 160 00:15:09,090 --> 00:15:15,930 I don't know that it's possible to write a research paper in Jupyter notebooks. Somebody might have tried.
161 00:15:15,930 --> 00:15:21,180 But I'll still have the notebook where I explain the analysis. I often make that notebook available. 162 00:15:21,180 --> 00:15:31,050 So for a lot of my published research papers, you can download a zip file or a Git repository that contains the notebooks, and you 163 00:15:31,050 --> 00:15:36,240 can rerun the experiment and rerun my analysis with the notebooks. In the notebook, 164 00:15:36,240 --> 00:15:40,370 I then also write the files out to disk. We're not going to see this quite yet; 165 00:15:40,370 --> 00:15:45,630 we're going to see it later when we start talking about workflow, because right now I'm just having you submit notebooks. 166 00:15:45,630 --> 00:15:49,720 But the figures as they show up in the notebook aren't very high resolution, 167 00:15:49,720 --> 00:15:55,200 so we're going to want to render a higher-resolution version of them to a PNG file or a PDF file or a 168 00:15:55,200 --> 00:16:01,110 PostScript file that we can then include in our document in Word or LaTeX or whatever we're writing in. 169 00:16:01,110 --> 00:16:08,910 So to wrap up: your notebook is first and foremost a document that contains code to generate the results that you're trying to discuss. 170 00:16:08,910 --> 00:16:14,490 Take advantage of the document structure and use it to tell the story of your analysis: 171 00:16:14,490 --> 00:16:19,910 the conclusions you come to and why we should believe them. Pay attention to the examples I'm giving you in class.
172 00:16:19,910 --> 00:16:36,043 I'm also going to be trying to give you some examples of research-oriented notebooks that you can look at to see examples of good notebook practice.
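The restart-and-rerun check described earlier leaves a trace in the saved notebook file, which is plain JSON: each code cell stores an `execution_count`. As a rough sketch, assuming that a clean top-to-bottom run numbers the non-empty code cells 1, 2, 3, and so on, the function below tests a saved notebook for that pattern. The helper names and the sample notebook are hypothetical, and passing this check does not replace actually rerunning the notebook.

```python
import json


def load_notebook(path):
    """Read a saved .ipynb file (which is JSON) into a plain dict."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def execution_counts(notebook):
    """Collect the execution counts of non-empty code cells, in document order."""
    counts = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):   # source may be stored as a list of lines
            source = "".join(source)
        if source.strip():             # ignore empty code cells
            counts.append(cell.get("execution_count"))
    return counts


def looks_freshly_run(notebook):
    """True if the code cells are numbered 1..n in the order they appear."""
    counts = execution_counts(notebook)
    return counts == list(range(1, len(counts) + 1))


# A tiny hypothetical notebook, written out as the dict json.load would return.
example = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": "# Demo"},
        {"cell_type": "code", "metadata": {}, "outputs": [],
         "execution_count": 1, "source": "import math"},
        {"cell_type": "code", "metadata": {}, "outputs": [],
         "execution_count": 3, "source": "math.sqrt(2)"},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}
print(execution_counts(example), looks_freshly_run(example))
```

Here the counts jump from 1 to 3, which suggests a cell was run out of order or deleted after it ran, so this notebook would be worth rerunning from the top before it is shared.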