WEBVTT 99:59:59.999 --> 99:59:59.999 1 00:00:08,280 --> 00:00:13,300 Hello again. In this video, I want to talk about organizing notebooks, as I promised. 99:59:59.999 --> 99:59:59.999 2 00:00:13,300 --> 00:00:18,210 So we've talked about how we make charts. That's been a lot of what we've been talking about here. 99:59:59.999 --> 99:59:59.999 3 00:00:18,210 --> 00:00:25,260 But I want to talk about how we actually put together a notebook that presents these charts and our conclusions from them. 99:59:59.999 --> 99:59:59.999 4 00:00:25,260 --> 00:00:33,040 So the learning outcomes for this video are for you to be able to use Markdown document structure to organize a notebook, to use Jupyter's 99:59:59.999 --> 99:59:59.999 5 00:00:33,040 --> 00:00:40,350 Markdown features to format text in a notebook, and to create a notebook that clearly tells the story of a data analysis. 99:59:59.999 --> 99:59:59.999 6 00:00:40,350 --> 00:00:47,610 The first thing to understand is that a notebook is a document. It is a convenient way to run Python code and see its results. 99:59:59.999 --> 99:59:59.999 7 00:00:47,610 --> 00:00:52,950 But the notebook structure is first and foremost a document. It's meant to be read. 99:59:59.999 --> 99:59:59.999 8 00:00:52,950 --> 00:01:00,300 And there's some structure imposed on the document, because it has to read in the same order as the code is going to execute. 99:59:59.999 --> 99:59:59.999 9 00:01:00,300 --> 00:01:08,420 But we want to be able to actually read it and understand what's going on as we walk through the notebook. 99:59:59.999 --> 99:59:59.999 10 00:01:08,420 --> 00:01:12,830 So we also want to factor particularly complex computations out of the notebook. 99:59:59.999 --> 99:59:59.999 11 00:01:12,830 --> 00:01:15,410 So far, nothing we've been doing is super complex. 
99:59:59.999 --> 99:59:59.999 12 00:01:15,410 --> 00:01:22,220 But if I have a large, complicated data processing operation, training an extensive set of machine learning models or something, 99:59:59.999 --> 99:59:59.999 13 00:01:22,220 --> 00:01:33,200 I'll pull those out of the notebook into other scripts and modules, and leave the notebook for communicating the results of my data analysis. 99:59:59.999 --> 99:59:59.999 14 00:01:33,200 --> 00:01:38,570 So a notebook has two primary types of cells. We have code cells, which you've seen a lot. 99:59:59.999 --> 99:59:59.999 15 00:01:38,570 --> 00:01:44,720 That's the Python code and its output. And we have Markdown cells that contain formatted text. 99:59:59.999 --> 99:59:59.999 16 00:01:44,720 --> 00:01:48,260 I recommend keeping your code cells relatively short. 99:59:59.999 --> 99:59:59.999 17 00:01:48,260 --> 00:01:57,050 One to a few lines, or one function definition. If you're defining an entire class and it's taking 100 lines within a code cell, 99:59:59.999 --> 99:59:59.999 18 00:01:57,050 --> 00:02:01,580 that's a good sign you should pull that out into a Python module of some kind. 99:59:59.999 --> 99:59:59.999 19 00:02:01,580 --> 00:02:06,440 If helpful, show results after the cell. I do this a lot, particularly in development. 99:59:59.999 --> 99:59:59.999 20 00:02:06,440 --> 00:02:11,450 But if you have too much of it, it can make it hard to read the final notebook because you have all of these outputs. 99:59:59.999 --> 99:59:59.999 21 00:02:11,450 --> 00:02:15,440 And the notebook winds up being a sea of charts and tables. 99:59:59.999 --> 99:59:59.999 22 00:02:15,440 --> 00:02:20,600 And it's difficult to find your way through the notebook and find the pieces that you need to look at. 99:59:59.999 --> 99:59:59.999 23 00:02:20,600 --> 00:02:27,020 So go ahead, do a lot of them, especially while you're debugging and prototyping, before you submit. 
99:59:59.999 --> 99:59:59.999 24 00:02:27,020 --> 00:02:32,750 Maybe go through and clean up: remove things that were just there for you to test how something worked, and leave the cells in 99:59:59.999 --> 99:59:59.999 25 00:02:32,750 --> 00:02:37,940 your notebook that help the reader understand the results of what you're doing. 99:59:59.999 --> 99:59:59.999 26 00:02:37,940 --> 00:02:45,790 Remember, the purpose of the presentation is to show the reader what you learned and how you know it's true. 99:59:59.999 --> 99:59:59.999 27 00:02:45,790 --> 00:02:54,040 Cells that don't help you do that at the end of the day, you can consider removing, 99:59:59.999 --> 99:59:59.999 28 00:02:54,040 --> 00:02:58,720 even though they might have helped you figure out how to do it. 99:59:59.999 --> 99:59:59.999 29 00:02:58,720 --> 00:03:02,650 You can save a copy of your notebook before doing the cleanup so you don't lose them. 99:59:59.999 --> 99:59:59.999 30 00:03:02,650 --> 00:03:07,210 You can have a supplementary notebook that has, maybe, paths you went down 99:59:59.999 --> 99:59:59.999 31 00:03:07,210 --> 00:03:11,170 that didn't work out. Another thing you can consider doing is having an appendix. 99:59:59.999 --> 99:59:59.999 32 00:03:11,170 --> 00:03:17,020 So you've got all of the main content of the notebook, and then down at the end you have a big heading, Appendix. 99:59:59.999 --> 99:59:59.999 33 00:03:17,020 --> 00:03:20,770 And there you have extra things. You want to make sure it can still run from top to bottom. 99:59:59.999 --> 99:59:59.999 34 00:03:20,770 --> 00:03:27,490 But there you have some of the other things that maybe dove into more detail about the building blocks of some of your computations. 
99:59:59.999 --> 99:59:59.999 35 00:03:27,490 --> 00:03:32,650 But it is good to show the results after loading data and after doing a complex manipulation, 99:59:59.999 --> 99:59:59.999 36 00:03:32,650 --> 00:03:38,910 especially one that significantly changes the shape of the data that you're working with. 99:59:59.999 --> 99:59:59.999 37 00:03:38,910 --> 00:03:41,790 I'm going to talk mostly in this video, though, about Markdown cells, 99:59:59.999 --> 99:59:59.999 38 00:03:41,790 --> 00:03:48,180 because Markdown cells are what you use to build up the structure of your document and make it tell a story, 99:59:59.999 --> 99:59:59.999 39 00:03:48,180 --> 00:03:52,830 not just be a kind of strange way to present Python code. 99:59:59.999 --> 99:59:59.999 40 00:03:52,830 --> 00:04:01,710 So Markdown is a text syntax for simple markup. I'm going to provide a link to the Markdown documentation in the class notes that go with this video. 99:59:59.999 --> 99:59:59.999 41 00:04:01,710 --> 00:04:06,570 But there are several inline formatting features. If you put two stars on each side of some text, that'll make it bold. 99:59:59.999 --> 99:59:59.999 42 00:04:06,570 --> 00:04:16,110 One star will make it italic. You can indicate code, something that's going to show up in the fixed-width code layout, using backticks. 99:59:59.999 --> 99:59:59.999 43 00:04:16,110 --> 00:04:24,570 This is one that I see ignored very frequently in write-ups, which is a shame because it's really, 99:59:59.999 --> 99:59:59.999 44 00:04:24,570 --> 00:04:27,960 really useful for function names, variable names, things like that: 99:59:59.999 --> 99:59:59.999 45 00:04:27,960 --> 00:04:32,670 to be able to set something apart, like, this is a special thing, this is a function name. Also, 99:59:59.999 --> 99:59:59.999 46 00:04:32,670 --> 00:04:39,180 you can use TeX math syntax by putting it between dollar signs in notebook Markdown. 
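As a rough illustration of the inline formatting just described, here is a small Python sketch that holds the source text of a hypothetical Markdown cell. The wording, the column name `rating`, and the formula are placeholders chosen for this example; they do not come from the video.

```python
# Source text for a hypothetical Markdown cell (all names are illustrative).
cell_source = (
    "We call `groupby` on the `rating` column to get the **mean** rating. "
    "The result is *not* normalized. "
    r"The sample mean is $\bar{x} = \frac{1}{n} \sum_i x_i$."
)

# Two stars on each side make bold, one star makes italic, backticks mark
# code, and single dollar signs wrap inline TeX math.
print(cell_source)
```

Pasting the printed text into a Markdown cell and rendering it is a quick way to see each inline feature side by side.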
99:59:59.999 --> 99:59:59.999 47 00:04:39,180 --> 00:04:48,450 Pay attention to the details of what your Markdown, your formatted text, looks like after you render it in the notebook. 99:59:59.999 --> 99:59:59.999 48 00:04:48,450 --> 00:04:55,500 Make sure it reads well. Make sure it's clear. Ask yourself: if I weren't the one who wrote this, 99:59:59.999 --> 99:59:59.999 49 00:04:55,500 --> 00:05:03,290 would I like reading this? And clean it up and pay attention to those details, to make it look 99:59:59.999 --> 99:59:59.999 50 00:05:03,290 --> 00:05:08,670 good and to make it effective at communicating, so that the reader 99:59:59.999 --> 99:59:59.999 51 00:05:08,670 --> 00:05:13,170 can clearly understand what the different pieces are and what needs to be emphasized, 99:59:59.999 --> 99:59:59.999 52 00:05:13,170 --> 00:05:17,260 etc. Markdown also has a number of block elements. 99:59:59.999 --> 99:59:59.999 53 00:05:17,260 --> 00:05:21,520 The basic one is a paragraph; paragraphs are just text separated by blank lines. 99:59:59.999 --> 99:59:59.999 54 00:05:21,520 --> 00:05:26,890 You can also have bulleted and numbered lists. You can have code blocks. 99:59:59.999 --> 99:59:59.999 55 00:05:26,890 --> 00:05:31,720 These aren't super common in a notebook because a lot of your code is in the code cells that you execute. 99:59:59.999 --> 99:59:59.999 56 00:05:31,720 --> 00:05:34,900 But if you need to have a little code that you don't execute for some reason, 99:59:59.999 --> 99:59:59.999 57 00:05:34,900 --> 00:05:42,970 you can put it in a code block in Markdown. And then you can also have block mathematics: a line on its own that begins and ends with two dollar signs. 99:59:59.999 --> 99:59:59.999 58 00:05:42,970 --> 00:05:49,170 And it can actually span multiple lines; so long as there aren't any blank lines, it's going to be treated as a piece of block mathematics. 
99:59:59.999 --> 99:59:59.999 59 00:05:49,170 --> 00:05:54,510 It's not inline in a sentence; it becomes its own block in the rendered cell. 99:59:59.999 --> 99:59:59.999 60 00:05:54,510 --> 00:06:00,450 Headings are an important one to pay attention to. So Markdown headings are lines that start with one, two, 99:59:59.999 --> 99:59:59.999 61 00:06:00,450 --> 00:06:06,510 three, up to six hash marks, and then a space and the heading text: heading one, heading two, heading three. 99:59:59.999 --> 99:59:59.999 62 00:06:06,510 --> 00:06:11,370 Something that's important to know is the hashes do not mean big and bold. 99:59:59.999 --> 99:59:59.999 63 00:06:11,370 --> 00:06:16,890 That's what they look like. But that's not what they mean. What they mean is heading. 99:59:59.999 --> 99:59:59.999 64 00:06:16,890 --> 00:06:23,300 And so you need to have an outline structure to your notebook using the headings. 99:59:59.999 --> 99:59:59.999 65 00:06:23,300 --> 00:06:29,030 And you need to nest them properly, so within an H1, you have your H2s, 99:59:59.999 --> 99:59:59.999 66 00:06:29,030 --> 00:06:32,760 and then you have your H3s. You don't go straight from H1 to H4; 99:59:59.999 --> 99:59:59.999 67 00:06:32,760 --> 00:06:41,540 you have H3 in the middle. Start the notebook with an H1 that has the notebook title; in a lot of rendering contexts, 99:59:59.999 --> 99:59:59.999 68 00:06:41,540 --> 00:06:46,850 that becomes the title at the top of your notebook. And then all your other headings are level two or lower. 99:59:59.999 --> 99:59:59.999 69 00:06:46,850 --> 00:06:55,670 If you have an appendix, you might have the appendix be another H1. Also, the section headers should be short, not sentences. 99:59:59.999 --> 99:59:59.999 70 00:06:55,670 --> 00:07:00,630 If you're writing an entire sentence in your section header, you're putting too much there. 
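The nesting rule can be checked mechanically. The following is a minimal sketch, not a tool from the course: `heading_outline` and `skipped_levels` are hypothetical helper names, the sample headings are made up, and the regular expression only handles hash-style headings in Markdown cell text (it does not try to skip `#` lines inside Markdown code blocks).

```python
import re

# One to six hash marks, at least one space, then the heading text.
HEADING = re.compile(r"^(#{1,6}) +(.+?)\s*$")

def heading_outline(markdown_text):
    """Return (level, title) pairs for hash-style headings in Markdown text."""
    outline = []
    for line in markdown_text.splitlines():
        match = HEADING.match(line)
        if match:
            outline.append((len(match.group(1)), match.group(2)))
    return outline

def skipped_levels(outline):
    """Return headings that sit more than one level below the heading before them."""
    problems = []
    previous = 0  # so a document that does not open with an H1 is flagged too
    for level, title in outline:
        if level > previous + 1:
            problems.append((level, title))
        previous = level
    return problems

# Hypothetical notebook headings, joined from the Markdown cells.
notes = "\n".join([
    "# Movie Ratings",
    "## Setup",
    "## Data",
    "#### Column types",   # jumps from level 2 straight to level 4
    "## Analysis",
    "### Rating distribution",
])

outline = heading_outline(notes)
print(skipped_levels(outline))  # [(4, 'Column types')]
```

The same `outline` list is also enough to print an indented outline of the notebook, which is a quick way to see whether the headings read as short titles.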
99:59:59.999 --> 99:59:59.999 71 00:07:00,630 --> 00:07:06,840 The section header should be a short title and then the section content comes after it. 99:59:59.999 --> 99:59:59.999 72 00:07:06,840 --> 00:07:13,650 Now, there are a few reasons why it's important to use the heading levels properly. 99:59:59.999 --> 99:59:59.999 73 00:07:13,650 --> 00:07:19,710 One is just visually, it helps break up your notebook so we can easily see which component we're in. 99:59:59.999 --> 99:59:59.999 74 00:07:19,710 --> 00:07:26,130 Second, there are extensions that will do things like number your headings or give you a browsable table of contents 99:59:59.999 --> 99:59:59.999 75 00:07:26,130 --> 00:07:32,130 you can use to navigate the notebook by heading. When I'm rendering notebooks as a part of the course website, 99:59:59.999 --> 99:59:59.999 76 00:07:32,130 --> 00:07:36,960 you'll see this over on the right-hand side: you can jump directly to notebook headings. 99:59:59.999 --> 99:59:59.999 77 00:07:36,960 --> 00:07:45,720 That only works because I'm consistently using the heading levels to build the outline-based structure of my notebook. 99:59:59.999 --> 99:59:59.999 78 00:07:45,720 --> 00:07:53,160 A third reason is accessibility. If someone's reading your notebook with an assistive technology such as a screen reader, 99:59:59.999 --> 99:59:59.999 79 00:07:53,160 --> 00:08:01,500 the section headings are very important to help them navigate to the parts of the notebook that are relevant to them at a given time. 99:59:59.999 --> 99:59:59.999 80 00:08:01,500 --> 00:08:09,180 So on the section headers, one additional little rule is if your section header has to wrap onto a second line, really rethink it. 99:59:59.999 --> 99:59:59.999 81 00:08:09,180 --> 00:08:13,230 It's almost certainly too long. In particular, 99:59:59.999 --> 99:59:59.999 82 00:08:13,230 --> 00:08:16,860 don't put an entire question in the section header, usually. 
99:59:59.999 --> 99:59:59.999 83 00:08:16,860 --> 00:08:22,890 Occasionally it's OK to put a whole question, but maybe put a brief, like three-to-five-word, summary of the question's topic and 99:59:59.999 --> 99:59:59.999 84 00:08:22,890 --> 00:08:30,750 then write the question itself as the first paragraph of the section. 99:59:59.999 --> 99:59:59.999 85 00:08:30,750 --> 00:08:33,240 But if you pay attention to these different formatting features, 99:59:59.999 --> 99:59:59.999 86 00:08:33,240 --> 00:08:41,460 you can build a well-structured notebook that communicates clearly and draws the reader's attention to the places where it needs to go. 99:59:59.999 --> 99:59:59.999 87 00:08:41,460 --> 00:08:46,650 Now, writing the text itself: use the document to tell a story. What's the goal of what you're doing, 99:59:59.999 --> 99:59:59.999 88 00:08:46,650 --> 00:08:51,630 either of the whole notebook or of individual pieces of analysis? What's the data that we're using? 99:59:59.999 --> 99:59:59.999 89 00:08:51,630 --> 00:08:58,140 What do we know about it going in? Up at the top, either at the very top of your notebook or where you're loading the data, 99:59:59.999 --> 99:59:59.999 90 00:08:58,140 --> 00:09:04,050 it's useful to write some of that, especially if the notebook is a report you submit to somebody. 99:59:59.999 --> 99:59:59.999 91 00:09:04,050 --> 00:09:07,800 It's useful to write there: what do you know? Where did you get the data? How was it collected? 99:59:59.999 --> 99:59:59.999 92 00:09:07,800 --> 00:09:16,290 Not a full data sheet, but at least some summary information to help the reader understand what it is that we're going to be looking at. 99:59:59.999 --> 99:59:59.999 93 00:09:16,290 --> 00:09:19,800 Why are we doing each piece of the analysis? What's the purpose here? 99:59:59.999 --> 99:59:59.999 94 00:09:19,800 --> 00:09:25,260 How does it fit into our broader picture, into our broader goals? What approach are we using? 
99:59:59.999 --> 99:59:59.999 95 00:09:25,260 --> 00:09:28,200 We don't want to just repeat the code. Writing a numbered list of 99:59:59.999 --> 99:59:59.999 96 00:09:28,200 --> 00:09:33,240 the steps, where those steps are just a literal translation of the code, doesn't help understanding. 99:59:59.999 --> 99:59:59.999 97 00:09:33,240 --> 00:09:37,570 It creates an opportunity for code and documentation to become mismatched. 99:59:59.999 --> 99:59:59.999 98 00:09:37,570 --> 00:09:41,670 But if there's anything tricky in the code, 99:59:59.999 --> 99:59:59.999 99 00:09:41,670 --> 00:09:48,600 explain why that does the job. Explain the conceptual idea behind why you're approaching things the way you are. 99:59:59.999 --> 99:59:59.999 100 00:09:48,600 --> 00:09:56,700 If you're doing a data cleanup, explain what that cleanup is doing and why that's the right cleanup for your data. 99:59:59.999 --> 99:59:59.999 101 00:09:56,700 --> 00:10:03,600 And then, what do we learn from it? So oftentimes what I do with an individual piece of it, like a chart, 99:59:59.999 --> 99:59:59.999 102 00:10:03,600 --> 00:10:07,260 is write what question the chart is supposed to be answering, 99:59:59.999 --> 99:59:59.999 103 00:10:07,260 --> 00:10:11,010 or at least the purpose of the chart. Then we have the code to generate the chart itself. 99:59:59.999 --> 99:59:59.999 104 00:10:11,010 --> 00:10:18,450 And then we have a text cell that has observations about what we learn from the chart. 99:59:59.999 --> 99:59:59.999 105 00:10:18,450 --> 00:10:23,160 So: what are we doing; how are we going to do it, if that's not immediately clear; code; results. 99:59:59.999 --> 99:59:59.999 106 00:10:23,160 --> 00:10:25,590 And then what do we observe from these results? 99:59:59.999 --> 99:59:59.999 107 00:10:25,590 --> 00:10:33,350 So then the high-level document structure that I recommend is to start with the title and intro. 
99:59:59.999 --> 99:59:59.999 108 00:10:33,350 --> 00:10:37,140 You've got your title, your heading one. Then, what's the notebook for? 99:59:59.999 --> 99:59:59.999 109 00:10:37,140 --> 00:10:41,160 Why does this notebook exist? Are there links to include? 99:59:59.999 --> 99:59:59.999 110 00:10:41,160 --> 00:10:46,650 There's hyperlink syntax in Markdown as well. Read the Markdown documentation to see how to use that. 99:59:59.999 --> 99:59:59.999 111 00:10:46,650 --> 00:10:50,790 But where does this go? Are there things we need to know? 99:59:59.999 --> 99:59:59.999 112 00:10:50,790 --> 00:10:54,990 Background about why this document is being created? 99:59:59.999 --> 99:59:59.999 113 00:10:54,990 --> 00:11:00,360 Where did the data come from? If we have defined research questions, what are those research questions? 99:59:59.999 --> 99:59:59.999 114 00:11:00,360 --> 00:11:03,750 You can write those right in the intro of the notebook. Then I have a setup. 99:59:59.999 --> 99:59:59.999 115 00:11:03,750 --> 00:11:09,060 I almost always have a setup section that comes next. That has imports: I import my Python libraries. 99:59:59.999 --> 99:59:59.999 116 00:11:09,060 --> 00:11:11,970 I maybe define some helper functions that I'm going to be using throughout the 99:59:59.999 --> 99:59:59.999 117 00:11:11,970 --> 00:11:16,860 notebook. Helper functions specific to one section I might define in that section. 99:59:59.999 --> 99:59:59.999 118 00:11:16,860 --> 00:11:20,730 And then I load the data. Sometimes I load the data as a part of the setup. 99:59:59.999 --> 99:59:59.999 119 00:11:20,730 --> 00:11:27,990 So it's: import modules and then load my data. Sometimes, especially if I have more to say about the data, it's its own section. 
99:59:59.999 --> 99:59:59.999 120 00:11:27,990 --> 00:11:34,500 But then as I load each table, I often just show the first few rows of it so that I can see, OK, I've loaded this data, and then it's right there. 99:59:59.999 --> 99:59:59.999 121 00:11:34,500 --> 00:11:39,390 We can see, as we're going through the rest of the notebook, what the data we just loaded looks like. 99:59:59.999 --> 99:59:59.999 122 00:11:39,390 --> 00:11:45,030 Then we perform our analysis and this might be two sections. It might be five, six, seven, eight sections. 99:59:59.999 --> 99:59:59.999 123 00:11:45,030 --> 00:11:49,920 And then finally, at the end, we can summarize and conclude. I don't always do this in 99:59:59.999 --> 99:59:59.999 124 00:11:49,920 --> 00:11:53,340 my research notebooks, because often that's the material that goes in the paper. 99:59:59.999 --> 99:59:59.999 125 00:11:53,340 --> 00:11:57,150 But this is going to be something to do particularly in our assignments, where you're submitting notebooks. 99:59:59.999 --> 99:59:59.999 126 00:11:57,150 --> 00:12:00,810 Put that at the end of the notebook. What do we learn from this? 99:59:59.999 --> 99:59:59.999 127 00:12:00,810 --> 00:12:03,150 Sometimes I'm going to have specific directions for things 99:59:59.999 --> 99:59:59.999 128 00:12:03,150 --> 00:12:08,730 I want you to reflect on there. When, like in Assignment 1, I've broken down the different requirements, 99:59:59.999 --> 99:59:59.999 129 00:12:08,730 --> 00:12:15,120 those become good candidates for your H2s, your level-two headings, one for each of those. 99:59:59.999 --> 99:59:59.999 130 00:12:15,120 --> 00:12:21,730 So we've got, I think, six different requirements in Assignment 1. 99:59:59.999 --> 99:59:59.999 131 00:12:21,730 --> 00:12:28,120 An H2 heading, a primary section of your document, for each of those is a good starting point for your layout. 
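One way to picture this layout is as a skeleton notebook. The sketch below builds one using only the standard library, assuming the nbformat 4 JSON layout for `.ipynb` files (minor version 4, which does not require cell ids). The title, section names, file name, and the `markdown_cell` and `code_cell` helpers are hypothetical placeholders, not part of the course materials.

```python
import json

def markdown_cell(text):
    """Build a Markdown cell in the nbformat 4 JSON layout."""
    return {"cell_type": "markdown", "metadata": {}, "source": text}

def code_cell(code):
    """Build an unexecuted code cell in the nbformat 4 JSON layout."""
    return {
        "cell_type": "code",
        "metadata": {},
        "execution_count": None,
        "outputs": [],
        "source": code,
    }

cells = [
    # Title (the H1) plus a short intro: purpose, background, questions.
    markdown_cell("# Movie Ratings Analysis\n\nWhy this notebook exists and what it asks."),
    markdown_cell("## Setup"),
    code_cell("import pandas as pd"),
    markdown_cell("## Data\n\nWhere the data came from and how it was collected."),
    code_cell("ratings = pd.read_csv('ratings.csv')  # hypothetical file\nratings.head()"),
    # One H2 per analysis question or assignment requirement.
    markdown_cell("## Rating Distribution"),
    markdown_cell("## Ratings by Genre"),
    markdown_cell("## Conclusion\n\nWhat we learned and why we believe it."),
]

notebook = {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}

# Writing this text to a file ending in .ipynb gives a notebook skeleton.
notebook_json = json.dumps(notebook, indent=1)

# Print the first line of each Markdown cell to see the outline.
for cell in cells:
    if cell["cell_type"] == "markdown":
        print(cell["source"].splitlines()[0])
```

The code cells here are only stored as text, so nothing in them runs until the notebook is opened and executed.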
99:59:59.999 --> 99:59:59.999 132 00:12:28,120 --> 00:12:34,460 In addition, you're probably going to have another one up at the top for the setup and maybe another for the data load. 99:59:59.999 --> 99:59:59.999 133 00:12:34,460 --> 00:12:38,590 But think about the flow of your document, so it's able to communicate: 99:59:59.999 --> 99:59:59.999 134 00:12:38,590 --> 00:12:42,760 What are we doing? What are the prerequisites, in terms of setup and data? 99:59:59.999 --> 99:59:59.999 135 00:12:42,760 --> 00:12:47,790 How are we actually doing it? And then at the end, what do we learn? 99:59:59.999 --> 99:59:59.999 136 00:12:47,790 --> 00:12:53,580 So you're going to write a lot of cells and produce a lot of outputs in your notebook while you're debugging. 99:59:59.999 --> 99:59:59.999 137 00:12:53,580 --> 00:12:59,220 Before you submit, or before you share in other contexts, spend some time cleaning up your notebook: 99:59:59.999 --> 99:59:59.999 138 00:12:59,220 --> 00:13:06,780 remove dead ends and extraneous outputs that you included for debugging, but that don't fit in the flow of the story. 99:59:59.999 --> 99:59:59.999 139 00:13:06,780 --> 00:13:13,350 Consider putting them in a supplementary notebook if you want to keep them around. And then make sure you can rerun your notebook from top to bottom. 99:59:59.999 --> 99:59:59.999 140 00:13:13,350 --> 00:13:20,300 So in the Jupyter interface there's the Kernel menu; you click that and choose Restart and Run All. 99:59:59.999 --> 99:59:59.999 141 00:13:20,300 --> 00:13:26,570 That will restart the Python kernel that's actually running your code, so all your variables disappear. 99:59:59.999 --> 99:59:59.999 142 00:13:26,570 --> 00:13:31,220 Your data is no longer loaded. And then it starts running the notebook from top to bottom. 99:59:59.999 --> 99:59:59.999 143 00:13:31,220 --> 00:13:37,220 You want that to succeed so that someone else working with the notebook can actually rerun and reproduce your results. 
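A saved notebook keeps an execution count on each code cell, and after a clean restart and full run those counts normally go 1, 2, 3 in document order. The sketch below uses that as a rough consistency check. It is an assumption-laden illustration, not a Jupyter feature: `ran_top_to_bottom` is a hypothetical helper, the sample notebooks are made up, and a passing check does not prove the results reproduce.

```python
def ran_top_to_bottom(notebook):
    """Return True if non-empty code cells are numbered 1, 2, 3, ... in order."""
    counts = []
    for cell in notebook["cells"]:
        # The source field may be one string or a list of lines.
        source = "".join(cell["source"])
        if cell["cell_type"] == "code" and source.strip():
            counts.append(cell.get("execution_count"))
    return counts == list(range(1, len(counts) + 1))

# Hypothetical notebooks, reduced to the fields the check needs.
clean = {"cells": [
    {"cell_type": "markdown", "source": "# Title"},
    {"cell_type": "code", "source": "x = 1", "execution_count": 1},
    {"cell_type": "code", "source": ["y = x + 1\n", "y"], "execution_count": 2},
]}
stale = {"cells": [
    {"cell_type": "code", "source": "x = 1", "execution_count": 7},
    {"cell_type": "code", "source": "y = x + 1", "execution_count": 3},
]}

print(ran_top_to_bottom(clean))  # True
print(ran_top_to_bottom(stale))  # False
```

For a real file, the dictionary would come from reading the `.ipynb` file with `json.load`. Counts that are out of order or missing suggest the cells were run piecemeal and the notebook should be restarted and rerun before sharing.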
99:59:59.999 --> 99:59:59.999 144 00:13:37,220 --> 00:13:43,790 If that doesn't succeed, then that means either you deleted something that's important or the order of 99:59:59.999 --> 99:59:59.999 145 00:13:43,790 --> 00:13:48,740 your cells in the notebook does not match the order in which it actually has to be executed. 99:59:59.999 --> 99:59:59.999 146 00:13:48,740 --> 00:13:54,980 But make sure it succeeds, and also read back through the notebook to make sure that the charts all still look right, 99:59:59.999 --> 99:59:59.999 147 00:13:54,980 --> 00:14:01,100 the conclusions are all still correct, etc., before you submit the final notebook. 99:59:59.999 --> 99:59:59.999 148 00:14:01,100 --> 00:14:05,330 So when you're writing a notebook, too, you also need to know your audience and your purpose. 99:59:59.999 --> 99:59:59.999 149 00:14:05,330 --> 00:14:10,550 For example, take the notebooks I'm writing for you for teaching purposes here. 99:59:59.999 --> 99:59:59.999 150 00:14:10,550 --> 00:14:17,060 The things I write in them differ from what I'm going to write in a research notebook that I share with my collaborators, 99:59:59.999 --> 99:59:59.999 151 00:14:17,060 --> 00:14:22,700 or use for my own purposes, because my purpose in these notebooks, partially, is to explain how they're working. 99:59:59.999 --> 99:59:59.999 152 00:14:22,700 --> 00:14:28,850 So I'm going to say more in these notebooks about what exactly the code is 99:59:59.999 --> 99:59:59.999 153 00:14:28,850 --> 00:14:35,040 doing, so that you can learn how the code works, than I would expect in a research notebook. 99:59:59.999 --> 99:59:59.999 154 00:14:35,040 --> 00:14:41,730 But also: your own personal use, sharing with your adviser or your supervisor, 99:59:59.999 --> 99:59:59.999 155 00:14:41,730 --> 00:14:47,970 sharing with the public, either the professional public working on your topic or the general public. 
99:59:59.999 --> 99:59:59.999 156 00:14:47,970 --> 00:14:54,180 These are all different audiences and they're going to need different levels of explanation and different things highlighted in your notebook. 99:59:59.999 --> 99:59:59.999 157 00:14:54,180 --> 00:14:56,820 Also, not all audiences are well served by notebooks. 99:59:59.999 --> 99:59:59.999 158 00:14:56,820 --> 00:15:04,500 Notebooks are fantastic for internal reports, collaboration, et cetera, sharing the results of a data analysis with colleagues or with yourself. 99:59:59.999 --> 99:59:59.999 159 00:15:04,500 --> 00:15:09,090 But for final publication, you're often going to need a separate final report. 99:59:59.999 --> 99:59:59.999 160 00:15:09,090 --> 00:15:15,930 I don't know that it's possible to write a research paper in Jupyter notebooks. Somebody might have tried. 99:59:59.999 --> 99:59:59.999 161 00:15:15,930 --> 00:15:21,180 But I'll still have the notebook where I explain the analysis. I often make that notebook available. 99:59:59.999 --> 99:59:59.999 162 00:15:21,180 --> 00:15:31,050 So for a lot of my published research papers, you can download a zip file or a Git repository that contains the notebooks, and you 99:59:59.999 --> 99:59:59.999 163 00:15:31,050 --> 00:15:36,240 can rerun the experiment and rerun my analysis with the notebooks. In the notebook, 99:59:59.999 --> 99:59:59.999 164 00:15:36,240 --> 00:15:40,370 then, I also write the files out to disk. We're not going to see this quite yet. 99:59:59.999 --> 99:59:59.999 165 00:15:40,370 --> 00:15:45,630 We're going to see it later when we start talking about workflow. Because right now I'm just having you submit notebooks. 99:59:59.999 --> 99:59:59.999 166 00:15:45,630 --> 00:15:49,720 But the figures, as they show up in the notebook, aren't very high resolution. 
99:59:59.999 --> 99:59:59.999 167 00:15:49,720 --> 00:15:55,200 So we're going to want to render a higher-resolution version of them to a PNG file or a PDF file or a 99:59:59.999 --> 99:59:59.999 168 00:15:55,200 --> 00:16:01,110 PostScript file that we can then include in our document in Word or LaTeX or whatever we're writing. 99:59:59.999 --> 99:59:59.999 169 00:16:01,110 --> 00:16:08,910 So to wrap up, your notebook is first and foremost a document that contains code to generate the results that you're trying to discuss. 99:59:59.999 --> 99:59:59.999 170 00:16:08,910 --> 00:16:14,490 Take advantage of the document structure and use it to tell the story of your analysis: 99:59:59.999 --> 99:59:59.999 171 00:16:14,490 --> 00:16:19,910 the conclusions you come to and why we should believe them. Pay attention to the examples I'm giving you in class. 99:59:59.999 --> 99:59:59.999 172 00:16:19,910 --> 00:16:36,043 I'm also going to be trying to give you some examples of research-oriented notebooks that you can look at to see examples of good notebook practice. 99:59:59.999 --> 99:59:59.999