WEBVTT 00:00:00.967 --> 00:00:01.870 So, 00:00:02.130 --> 00:00:05.140 It's official. We're starting. NOTE Paragraph 00:00:05.140 --> 00:00:09.987 Hello everyone, I'm Stefanie Butland, rOpenSci's Community Manager 00:00:09.987 --> 00:00:13.258 and I so warmly welcome you to our community call 00:00:13.258 --> 00:00:15.723 on maintaining an R package. 00:00:16.183 --> 00:00:24.187 I wanted to first acknowledge that everyone here is under extraordinary stresses right now 00:00:24.187 --> 00:00:28.534 in light of the COVID-19 pandemic. 00:00:29.454 --> 00:00:34.769 None of us knows what weights other people are carrying at this time. 00:00:35.299 --> 00:00:40.723 It's a remarkable thing that all of us have decided to take this hour 00:00:40.723 --> 00:00:43.496 to come together and be together as a community. 00:00:43.496 --> 00:00:48.946 And so many of you I know as sort of warm, generous, accepting people. 00:00:48.946 --> 00:00:51.810 And so I thank you all for taking the time to do this, 00:00:51.810 --> 00:00:55.510 and for the next one hour of your lives, I've got your back! 00:00:55.954 --> 00:00:58.753 And now for something completely different: 00:00:59.233 --> 00:01:02.212 rOpenSci is a non-profit initiative, founded in 2011 00:01:02.212 --> 00:01:05.950 by Karthik Ram, Scott Chamberlain, and Carl Boettiger. 00:01:05.950 --> 00:01:09.620 We enable open and reproducible research by building technical infrastructure 00:01:09.620 --> 00:01:13.290 in the form of staff- and community-contributed R software tools 00:01:13.290 --> 00:01:18.787 and we build social infrastructure in the form of a welcoming and diverse community. 00:01:19.497 --> 00:01:25.166 You can find our bi-weekly newsletter at news.ropensci.org. 00:01:25.726 --> 00:01:29.499 We have a code of conduct that applies to both in-person 00:01:29.499 --> 00:01:32.455 and online interactions, like this call today. 
00:01:33.372 --> 00:01:35.815 You can find it linked from the footer of our website 00:01:35.815 --> 00:01:41.618 and it includes reporting and enforcement guidelines. 00:01:41.618 --> 00:01:47.491 The session is being recorded and the video and any other resources 00:01:47.491 --> 00:01:53.526 are going to be posted on our website at ropensci.org/commcalls 00:01:53.526 --> 00:01:56.210 within about three business days. 00:01:58.050 --> 00:02:00.369 I'm going to tweet from rOpenSci when those things are up. 00:02:00.369 --> 00:02:05.690 We use a shared Google Doc, which if anyone's in there 00:02:05.690 --> 00:02:07.652 could you please paste the link back into the zoom 00:02:07.652 --> 00:02:09.615 so that new joiners can see this. 00:02:09.870 --> 00:02:14.062 We typically use a Google Doc in our community calls 00:02:14.062 --> 00:02:15.731 for collaborative note taking. 00:02:17.193 --> 00:02:23.028 You can find that at bit.ly/ropensci-commcall-maintaining 00:02:24.610 --> 00:02:30.064 I'd like you to add your name and your affiliation into the attendees list. 00:02:30.324 --> 00:02:35.090 And there's a format there that you can follow because it helps me grab that information. 00:02:36.687 --> 00:02:42.200 In this call, it's a different flavor for us, we've decided to do this as a panel discussion 00:02:42.200 --> 00:02:46.320 as opposed to a series of talks with attendees asking questions. 00:02:47.489 --> 00:02:50.418 And so we're going to have one short presentation, 00:02:50.418 --> 00:02:53.348 followed by a panel discussion, full of pre-selected questions. 
00:02:53.950 --> 00:02:57.970 I've already typed each of these questions that are planning to be asked in the doc 00:02:57.970 --> 00:03:03.412 and so invite anyone to add any comments you're interested in capturing from the panelists, 00:03:03.412 --> 00:03:07.736 as well as any expertise, any thoughts you might have about this 00:03:07.736 --> 00:03:11.460 because we really are a bunch of rich resource people here. 00:03:11.460 --> 00:03:16.140 And so audience answers are just as valuable as the panelist answers at this point. 00:03:16.660 --> 00:03:20.600 And this is something that we end up sharing as a long-term resource. 00:03:20.600 --> 00:03:22.560 It will be accessible forever. 00:03:23.336 --> 00:03:27.708 This time, unfortunately, we won't have time for taking impromptu questions 00:03:27.708 --> 00:03:30.151 from attendees on the community call 00:03:30.151 --> 00:03:34.418 but I've created a separate section in the doc called questions - Part B, 00:03:34.418 --> 00:03:37.544 where you can ask questions that come up for you 00:03:37.544 --> 00:03:41.970 and I invite anyone to answer each other's questions. 00:03:43.150 --> 00:03:47.324 And together, as I say, we'll make this a rich resource for everyone to use. 00:03:48.160 --> 00:03:51.327 So finally, it's my pleasure to introduce our panel. NOTE Paragraph 00:03:51.960 --> 00:03:54.708 Julia Silge recently joined RStudio 00:03:54.708 --> 00:03:57.076 as a data scientist and software engineer. 00:03:57.596 --> 00:04:00.880 When we put out a call for a new maintainer for the qualtRics package, 00:04:00.880 --> 00:04:03.600 Julia took it on because she was using it in her day job 00:04:03.600 --> 00:04:06.450 as a data scientist at StackOverflow at the time 00:04:06.450 --> 00:04:10.270 And she worked on the annual developer survey using qualtRics. 
00:04:10.270 --> 00:04:13.360 She also maintains other R packages, including tidytext, NOTE Paragraph 00:04:13.360 --> 00:04:16.010 which has been downloaded almost 900,000 times. 00:04:17.080 --> 00:04:17.890 Elin Waring 00:04:17.890 --> 00:04:20.713 is a professor of sociology and interim dean 00:04:20.713 --> 00:04:25.487 of the School of Health Sciences, Human Services and Nursing at Lehman College, CUNY. 00:04:25.487 --> 00:04:28.003 She teaches research methods and statistics. 00:04:28.388 --> 00:04:31.245 Elin was part of the rOpenSci Unconf17 group 00:04:31.245 --> 00:04:33.110 that developed the skimr package 00:04:33.110 --> 00:04:37.199 which has become very popular at over 250,000 downloads. 00:04:37.906 --> 00:04:40.612 Elin works with Michael Quinn to maintain skimr 00:04:40.612 --> 00:04:44.130 as they've shepherded it through two major releases already. 00:04:44.290 --> 00:04:46.210 She formerly was a contributor and maintainer 00:04:46.210 --> 00:04:47.669 for the Joomla CMS project 00:04:47.669 --> 00:04:48.963 and her approach to maintaining 00:04:48.963 --> 00:04:50.793 is influenced by that experience. 00:04:51.189 --> 00:04:53.346 This includes understanding the importance of having a clear concept 00:04:53.346 --> 00:04:55.503 of what you're trying to achieve, 00:04:55.503 --> 00:04:58.350 being able to politely but firmly say no, 00:04:59.010 --> 00:05:01.320 and knowing that having users changes everything. 00:05:01.890 --> 00:05:04.341 Erin Grand is a data scientist 00:05:04.341 --> 00:05:05.720 at Uncommon Schools 00:05:05.720 --> 00:05:08.638 and a Board member of R-Ladies New York City. 00:05:08.638 --> 00:05:10.366 Erin created and maintains a package 00:05:10.366 --> 00:05:12.224 for NASA's Astronomy Picture of the Day. 00:05:12.224 --> 00:05:16.540 It's called astropic and it was inspired by her early love of astronomy.
00:05:16.620 --> 00:05:19.496 And one of her own images was featured 00:05:19.496 --> 00:05:21.320 as Astronomy Picture of the Day. 00:05:21.320 --> 00:05:22.491 Life goal achieved! 00:05:22.881 --> 00:05:25.740 She also maintains a set of internal packages at her work. 00:05:26.785 --> 00:05:29.575 Leonardo Collado-Torres is a research scientist 00:05:29.575 --> 00:05:31.860 at the Lieber Institute for Brain Development. 00:05:32.050 --> 00:05:34.440 He maintains several Bioconductor packages, 00:05:34.440 --> 00:05:39.281 including the recently submitted spatialLIBD for spatial transcriptomics data. 00:05:39.281 --> 00:05:41.842 He's a co-founder of the LIBD rstats club, 00:05:41.842 --> 00:05:44.674 the CDSB Mexico community of R 00:05:44.674 --> 00:05:46.859 and Bioconductor in Latin America 00:05:46.859 --> 00:05:49.809 and those members just submitted their first package to Bioconductor. 00:05:51.000 --> 00:05:56.604 This represents a dramatic percent increase in Latin American Bioconductor developers. 00:05:56.604 --> 00:05:58.467 So, congratulations! 00:05:58.487 --> 00:06:01.036 Also, congratulations: Leo, just in the last couple of days, 00:06:01.036 --> 00:06:04.060 was promoted to the position of Research Scientist 00:06:04.060 --> 00:06:06.045 and he's written a post about that. 00:06:06.715 --> 00:06:11.798 Scott Chamberlain, our final panelist, is a co-founder and technical lead of rOpenSci. 00:06:11.798 --> 00:06:15.561 He maintains, in his words, probably too many packages. 00:06:16.051 --> 00:06:18.585 Part of Scott's work involves finding new maintainers 00:06:18.585 --> 00:06:21.029 for rOpenSci peer-reviewed packages. 00:06:22.320 --> 00:06:26.210 And he tries to find those when a current maintainer needs to move on, 00:06:26.210 --> 00:06:27.982 like the qualtRics example. 00:06:28.455 --> 00:06:31.530 His bio is really the shortest because he's actually far too humble.
00:06:33.565 --> 00:06:36.115 Julia is going to speak for about 10 minutes 00:06:36.398 --> 00:06:40.444 and then for the rest of the hour, she's going to be moderating a panel discussion 00:06:40.444 --> 00:06:41.893 with pre-selected questions. 00:06:41.893 --> 00:06:44.733 For people who have just joined, thank you for sharing the link 00:06:44.733 --> 00:06:47.260 to the shared Google Doc again in the zoom. 00:06:48.255 --> 00:06:52.623 Please add your notes there, add your own questions in the bottom. 00:06:53.220 --> 00:06:55.497 I am now going to share my screen 00:06:55.497 --> 00:07:02.614 because I will note for people who just joined that Julia is not sharing her lovely face and home backdrop, 00:07:02.614 --> 00:07:05.006 because there was an earthquake where she is, 00:07:05.006 --> 00:07:06.362 she lost internet. 00:07:06.362 --> 00:07:08.296 And so I'm going to be showing her slides. 00:07:08.296 --> 00:07:10.210 So if you'll give me a moment. 00:07:10.650 --> 00:07:14.287 Sure... Well... [inaudible] 00:07:16.790 --> 00:07:18.130 [Julia Silge] Fingers crossed. 00:07:18.130 --> 00:07:22.794 Yeah, there was a 5.7 earthquake 00:07:22.794 --> 00:07:27.117 [inaudible] and it was not super big. 00:07:27.117 --> 00:07:33.418 But enough that one of the aftershocks knocked out internet at my house 00:07:33.418 --> 00:07:36.705 and I'm trying to be able to still speak. 00:07:36.705 --> 00:07:42.758 If we lose me, then I know that everything will keep going 00:07:42.758 --> 00:07:45.630 and in a great way. 00:07:45.630 --> 00:07:47.918 So, 00:07:49.130 --> 00:07:54.936 So we're going to talk through these slides, just briefly, about particular perspectives 00:07:54.936 --> 00:07:57.002 on maintaining an R package. 00:07:57.002 --> 00:08:01.163 So if we go to that slide about being... 00:08:02.220 --> 00:08:05.417 Let's see that, next slide... 00:08:05.417 --> 00:08:09.283 Maintaining an R package we often think... we cannot...
00:08:09.463 --> 00:08:11.870 There's a [inaudible] about 00:08:12.283 --> 00:08:15.648 when you're building an R package, about what we focus on 00:08:15.648 --> 00:08:17.853 in the technical aspects. 00:08:17.954 --> 00:08:19.647 But once you're in the piece, 00:08:19.647 --> 00:08:21.923 the part where you've actually built your package 00:08:21.923 --> 00:08:23.200 and people are using it, 00:08:23.230 --> 00:08:26.278 there's quite a balance in the amount of time 00:08:26.278 --> 00:08:29.326 that we spend managing technical work, 00:08:29.326 --> 00:08:33.727 which is, of course, extremely important, with social aspects 00:08:33.727 --> 00:08:36.988 of who is using the package, and that, if you will, 00:08:36.988 --> 00:08:41.317 often involves asking a lot of the right questions 00:08:41.317 --> 00:08:43.020 (go to the next slide) 00:08:43.020 --> 00:08:45.891 Some of the right kinds of questions that we ask 00:08:45.891 --> 00:08:49.125 when we're thinking about what it takes to keep a package 00:08:49.125 --> 00:08:50.944 up to date are some of these, like -- 00:08:51.284 --> 00:08:56.841 Is this a package that's used really broadly by a lot of different kinds of people 00:08:56.841 --> 00:08:58.095 including beginners? 00:08:58.095 --> 00:09:01.855 Is this a package that has a specialized use case 00:09:01.985 --> 00:09:06.022 or that's used by people who know each other, internally at a company? 00:09:06.624 --> 00:09:10.548 Is the person maintaining the package the person who put it together originally, 00:09:10.548 --> 00:09:14.790 or has it been passed along a couple of times? 00:09:15.090 --> 00:09:17.254 When you think about maintaining your package... 00:09:17.254 --> 00:09:22.870 I'm interested here, as we have our discussion, in what people's perspectives are on, like, 00:09:22.870 --> 00:09:24.010 how do we change?
00:09:24.010 --> 00:09:28.464 And so you can either think about packages changing over time 00:09:28.464 --> 00:09:31.120 or packages being superseded. 00:09:31.120 --> 00:09:39.251 And it's been interesting in our discussions preparing for this community call. 00:09:39.811 --> 00:09:42.605 Software doesn't live forever. 00:09:42.620 --> 00:09:48.548 And when we build software, do we put thoughtfulness into 00:09:48.548 --> 00:09:51.090 what do we expect to happen next? 00:09:51.090 --> 00:09:52.645 So we can go to the next slide. 00:09:52.645 --> 00:09:57.373 One of the motivating ideas of setting up this community call 00:09:57.373 --> 00:10:04.500 is that most software out there, whether you're talking about R packages or not, 00:10:04.520 --> 00:10:09.069 has one main person who keeps it running 00:10:09.069 --> 00:10:15.705 and a goal of rOpenSci, and a lot of us in this community, 00:10:15.705 --> 00:10:24.306 is to build up the sustainability of our software ecosystem. 00:10:24.310 --> 00:10:29.356 And it's somewhat brittle and also contributes to burnout 00:10:29.356 --> 00:10:33.848 and uncertainty about what is going to happen 00:10:33.848 --> 00:10:36.639 when we just have one maintainer. 00:10:36.639 --> 00:10:39.380 Some other things you can struggle with are: 00:10:39.380 --> 00:10:44.603 You know, literally no one else knows what to do with the internals of this package, too. 00:10:44.603 --> 00:10:52.955 What do you do with one person having to manage sometimes what can feel like 00:10:52.955 --> 00:10:56.647 an overwhelming amount of feedback from users?
00:10:57.557 --> 00:11:00.034 If you go to the next slide 00:11:00.034 --> 00:11:03.227 There's interesting research out there about 00:11:03.227 --> 00:11:06.367 both like what is the situation with software contributors 00:11:06.367 --> 00:11:11.525 and how can we either navigate that situation, 00:11:11.525 --> 00:11:16.652 encourage more or figure out what the right path could be for a community. 00:11:18.550 --> 00:11:27.420 The references at the bottom here of analysis of open source contributors 00:11:27.420 --> 00:11:29.735 In this analysis, they did -- 00:11:29.735 --> 00:11:31.928 It's not uncommon to find 00:11:31.928 --> 00:11:35.136 casual contributors, like people who are not the main maintainer 00:11:35.136 --> 00:11:39.726 and you have a situation where there's a lot of, a long tail of small contributions, 00:11:39.726 --> 00:11:45.078 so like half the contributors are responsible for 2% of the commits. 00:11:45.078 --> 00:11:47.841 But they are lots of different kinds of commits. 00:11:48.408 --> 00:11:51.602 These 2% of commits from lots of different kinds of people 00:11:51.602 --> 00:11:53.245 are lots of kinds of things. 00:11:54.064 --> 00:12:00.118 You know there are things like typos, but they're also things like fixing bugs 00:12:00.118 --> 00:12:02.731 and building new features and refactoring. 00:12:04.310 --> 00:12:08.110 This ... contributes -- 00:12:08.110 --> 00:12:11.600 What this is, is evidence that these are people who could be scaled up 00:12:11.600 --> 00:12:15.110 to being more contributor, more contributing, 00:12:15.610 --> 00:12:20.951 more significant maintainer or contributors, if that is appropriate for your program. 00:12:20.951 --> 00:12:23.030 If you go to the next slide. 
00:12:23.030 --> 00:12:25.800 We often have this model of software contributions, 00:12:25.800 --> 00:12:28.755 where we have to think of it as like an onion model 00:12:28.755 --> 00:12:33.460 where you've got the users, and the contributors are inside of there, 00:12:33.460 --> 00:12:35.544 and the committers are inside of there. 00:12:35.544 --> 00:12:37.765 And that's often how we have this mental model 00:12:37.765 --> 00:12:40.840 of, like, how the software works, you know. 00:12:40.840 --> 00:12:45.915 But, we might want to consider whether that is the best model 00:12:45.915 --> 00:12:52.340 and instead move to what's on the next slide, which is a hub and spoke model 00:12:52.340 --> 00:12:55.070 where the code is central, right? 00:12:55.070 --> 00:12:58.105 Like, that's the thing that we're all using. So that is in the middle. 00:12:58.105 --> 00:13:01.482 And there are maintainers who work mostly on code, 00:13:01.482 --> 00:13:05.982 but there are also other kinds of maintenance activities happening. 00:13:05.982 --> 00:13:11.802 So there are maintainers of the software who focus mainly on education and docs. 00:13:11.802 --> 00:13:16.005 There are maintainers who focus mainly on issue triage. 00:13:16.005 --> 00:13:18.995 There are maintainers who focus mainly on evangelism. 00:13:18.995 --> 00:13:23.270 And users kind of swim around in this -- 00:13:24.159 --> 00:13:28.794 swim around in this like in a soup around this hub and spoke model, 00:13:28.794 --> 00:13:32.950 and depending on their particular need at any one time, 00:13:32.950 --> 00:13:36.679 they engage with these different maintainers.
Like, maybe the user-support one, 00:13:36.679 --> 00:13:41.240 or maybe the people writing the code, or maybe the people writing the docs 00:13:41.240 --> 00:13:44.635 and so like this might be a helpful mental model 00:13:44.635 --> 00:13:50.272 for thinking about package maintenance, especially for larger packages 00:13:50.272 --> 00:13:52.282 that have more users. 00:13:52.412 --> 00:13:56.160 So if you can go to the next slide. 00:13:56.540 --> 00:14:00.620 So it turns out we do actually have research and know what can contribute to, 00:14:00.620 --> 00:14:06.260 what can encourage more contributions. So the next slide outlines something 00:14:06.580 --> 00:14:12.320 For one study that was done, something that somethings that we, you know - 00:14:12.320 --> 00:14:16.550 if you're involved with rOpenSci, you've heard this kind of thing 00:14:16.550 --> 00:14:17.750 and seen this thing in action. 00:14:17.750 --> 00:14:21.720 So, include and enforce a Code of Conduct. 00:14:21.880 --> 00:14:26.840 Have cultural norms and include kindness and respect 00:14:26.840 --> 00:14:31.770 Something that was here. That is interesting. I don't see a lot in R packages, 00:14:31.770 --> 00:14:35.070 but could be interesting to consider. 00:14:35.070 --> 00:14:40.656 That's: make more public or explicit any future plans you have. 00:14:40.656 --> 00:14:45.047 That can help contributors know what to do. 00:14:45.507 --> 00:14:51.787 And then the - if you go to the next slide. This paper also has some very interesting ideas 00:14:51.787 --> 00:14:58.545 of how to help newcomers become contributors and maybe eventually maintainers. 00:14:58.602 --> 00:15:02.260 These are all here. I'll highlight a couple 00:15:02.490 --> 00:15:05.685 Let's talk about that. 
I'll just highlight that second one: 00:15:05.759 --> 00:15:10.989 Have forms of participation that are legitimate in your projects, 00:15:10.989 --> 00:15:17.213 that are valued, that are not writing code, 00:15:17.213 --> 00:15:22.530 that are on ramps and then those last two I think are very interesting to 00:15:23.490 --> 00:15:30.581 To explicitly acknowledge all contributions. To have a culture around your project 00:15:30.581 --> 00:15:36.310 that doesn't let contributions just, kind of get, you know, swept away 00:15:36.310 --> 00:15:39.355 and also to follow up both on success and failure. 00:15:39.355 --> 00:15:44.620 If someone opens an issue or submits a PR that is not a good fit, 00:15:44.620 --> 00:15:49.200 To follow up on both the things that succeed and fail. 00:15:49.280 --> 00:15:51.870 So, what we just went through in those slides 00:15:51.870 --> 00:15:58.840 are just some summary and thoughts on the current situation. 00:15:58.840 --> 00:16:00.867 So a little bit of research of what we know. 00:16:00.867 --> 00:16:01.487 You can go to the next slide. 00:16:01.487 --> 00:16:05.710 The rest of the time that we're going to have here is going to be a panel discussion 00:16:05.780 --> 00:16:13.080 If my phone tethering holds up, I am going to moderate this panel of folks 00:16:13.080 --> 00:16:16.905 who are going to talk about some of our experiences. 00:16:16.905 --> 00:16:22.810 Some of our opinions on maintaining R packages. 00:16:22.810 --> 00:16:27.011 And if you will, the next slide that will just have some of the references. 00:16:27.011 --> 00:16:29.895 Just to thank you. Where some of those images. 00:16:29.895 --> 00:16:33.595 And a thank you to Scott for some of the research that he shared. 00:16:33.595 --> 00:16:36.365 So thank you to all that. 00:16:36.365 --> 00:16:41.915 And I think with that, we can get started with our discussion. 00:16:43.530 --> 00:16:47.434 Alright. So if you're -- So, panelists: Leo and... 
00:16:47.434 --> 00:16:55.030 So our panelists are: Leo, and Elin, and Scott, and Erin. 00:16:55.670 --> 00:17:01.039 So, you all have been introduced, but can you unmute? 00:17:01.039 --> 00:17:04.356 And then, to get started, I think the first question 00:17:04.356 --> 00:17:09.983 I would love to have us discuss is: What does it mean to maintain an R package? 00:17:09.983 --> 00:17:11.043 This is what we're talking about. 00:17:11.043 --> 00:17:14.202 So, I would love to get your perspective on that. 00:17:14.202 --> 00:17:19.280 So let's go around and so, briefly, let's first have all four of you all say like 00:17:19.280 --> 00:17:23.886 What do you, like -- what does it mean to maintain R packages? So Elin, can you start? 00:17:24.487 --> 00:17:28.580 So I think it means a couple of different things, 00:17:28.684 --> 00:17:33.246 There is this very specific thing, that term you use 'committer' before, 00:17:33.246 --> 00:17:37.240 which is people who can commit to the master branch 00:17:37.240 --> 00:17:44.380 And in CRAN like the person whose name is going to be there as the email address for 00:17:44.485 --> 00:17:47.583 and make the submission and they're going to be the prime contact 00:17:47.583 --> 00:17:51.490 So that's one definition which is kind of the traditional 00:17:52.047 --> 00:17:54.158 open-source way of thinking about it. 00:17:54.233 --> 00:17:57.246 But then there is kind of, I think what you're getting to, 00:17:57.246 --> 00:18:01.987 bigger possible group of people who are invested in making sure 00:18:01.987 --> 00:18:03.909 that the package is maintained. 
00:18:03.909 --> 00:18:06.539 Meaning keeping -- just like maintenance on anything, 00:18:06.539 --> 00:18:09.170 keeping it up to date, dealing with bugs, 00:18:10.900 --> 00:18:17.516 what happens when it stops working, because some other package updated 00:18:17.516 --> 00:18:19.955 or because R, base R, changes something. 00:18:21.017 --> 00:18:23.070 So someone who participates in that. 00:18:23.070 --> 00:18:27.245 And then also, potentially, in all the other areas you were mentioning. 00:18:27.245 --> 00:18:30.102 [Julia] Yeah yeah nice! Scott? 00:18:30.102 --> 00:18:32.959 What do you think it means to maintain an R package? 00:18:34.608 --> 00:18:39.545 [Scott] Um, there's a lot of details, I guess. But I think that a very... 00:18:39.800 --> 00:18:42.058 Can you hear me good? [Julia and Stefanie] Yeah. 00:18:42.058 --> 00:18:47.806 At a very high level, I guess, the thing that came to mind first for me 00:18:47.806 --> 00:18:50.876 was just that it's like a constant learning process. 00:18:51.010 --> 00:18:54.555 A constant, sort of like, trying to figure out 00:18:55.983 --> 00:19:01.739 how to do any particular thing better, whether it's testing or function compos-- 00:19:01.739 --> 00:19:05.430 like how your function is composed, the parameters or whatever. 00:19:06.023 --> 00:19:10.474 And I think another point, the second point that I came up with, 00:19:10.474 --> 00:19:15.512 was sort of constantly learning how to design better function interfaces. 00:19:15.512 --> 00:19:18.480 You know, how the functions are named, and the parameters are named, 00:19:18.480 --> 00:19:23.620 and their default values -- stuff like that. 00:19:23.620 --> 00:19:29.690 So I think this is a constant learning process of how to design easy-to-use interfaces. 00:19:29.690 --> 00:19:33.672 [Julia] Yeah yeah yeah yeah, all of what you both just said 00:19:33.672 --> 00:19:39.696 really resonates with my own experience with, like, the different packages I maintain.
00:19:39.696 --> 00:19:43.240 Erin, when you think of like, maintaining an R package, 00:19:43.240 --> 00:19:45.280 what do you think that actually means? 00:19:45.610 --> 00:19:51.390 [Erin] Yeah, first of all, I agree with everything the other panelists said. 00:19:51.930 --> 00:19:53.940 Something that hadn't been mentioned, I think, is 00:19:54.180 --> 00:20:01.523 the sort of ownership around community and communication of the package. 00:20:01.523 --> 00:20:09.975 So, being the person who responds to issues or is looking at pull requests, 00:20:10.315 --> 00:20:17.962 and really, like, dealing with the communication out to contributors or to users 00:20:17.962 --> 00:20:22.688 on either changes or what's happening with the package. 00:20:22.688 --> 00:20:26.653 [Julia] Yeah. Nice! Yeah, that's absolutely, that's really great. 00:20:26.653 --> 00:20:31.041 And then Leo, what do you -- what about -- what's your response to this? 00:20:31.041 --> 00:20:34.195 What do you think it means to maintain an R package? 00:20:34.650 --> 00:20:39.212 [Leo] So I'm going to echo what some of the other panelists said 00:20:39.212 --> 00:20:42.570 but for me it's like you deal with the questions you get from users. 00:20:42.602 --> 00:20:46.294 You approve or disapprove changes that you receive from others 00:20:46.294 --> 00:20:49.320 and you end up learning about community guidelines, 00:20:49.320 --> 00:20:52.346 like in my case the Bioconductor guidelines. 00:20:52.590 --> 00:20:57.775 And then you also have to -- you end up learning about, like, R-devel 00:20:57.775 --> 00:21:01.998 and what changes are coming, how to anticipate those changes, 00:21:01.998 --> 00:21:05.290 such that you can fix them before the user sees them. 00:21:05.890 --> 00:21:07.453 [Julia] Yeah, that's great. 00:21:07.453 --> 00:21:13.690 So one thing I heard a lot of you mention was like deal -- understanding users, 00:21:13.690 --> 00:21:18.591 hearing from users...
and, the issue of like user feedback 00:21:18.591 --> 00:21:22.625 I think is a really interesting one when it comes to maintaining R packages. 00:21:22.765 --> 00:21:24.382 So some R packa -- 00:21:24.382 --> 00:21:28.799 So some, you know, pieces of software are in the situation where you're like -- 00:21:28.799 --> 00:21:35.599 [inaudible] 00:21:35.710 --> 00:21:38.069 [Erin] I think we lost you. 00:21:38.359 --> 00:21:41.230 [Stefanie] Julia? We just lost your sound. 00:21:44.060 --> 00:21:47.269 Folks, we have a backup plan. And for those of you in here: 00:21:47.560 --> 00:21:50.780 Julia lost internet due to an earthquake today. 00:21:51.510 --> 00:21:53.469 [Julia] But, you know, on the other -- 00:21:53.469 --> 00:21:58.290 [Stefanie] Oh! Julia! Hold on. We lost you for like 40 seconds. 00:21:58.290 --> 00:21:59.321 [Julia] Oh, okay. 00:21:59.321 --> 00:22:03.141 [Stefanie] Could you please restart by asking the question that you were just about to ask? 00:22:03.141 --> 00:22:04.735 [Julia] Sure. Sure... 00:22:04.735 --> 00:22:12.720 So user feedback is an issue that R packages need to deal with 00:22:12.720 --> 00:22:19.330 and many packages need more contributors, not fewer 00:22:19.330 --> 00:22:21.975 and so we often want to encourage user feedback. 00:22:21.975 --> 00:22:26.377 At the same time, some packages are in the situation 00:22:26.377 --> 00:22:29.280 where they have a fire hose of user feedback 00:22:29.780 --> 00:22:35.631 And that fire hose can sometimes feel like -- 00:22:37.727 --> 00:22:39.660 You need to manage that. 00:22:40.807 --> 00:22:44.588 How do you manage that? What kind of situations have you been in? 00:22:44.588 --> 00:22:49.349 What strategies do you use to deal with user feedback? 
00:22:49.559 --> 00:22:54.344 Elin, can you start with this first because I think I've heard 00:22:54.344 --> 00:22:59.690 you have some interesting perspectives on this, especially as someone who -- 00:22:59.690 --> 00:23:03.695 with skimr as a very popular package. 00:23:03.695 --> 00:23:08.877 [Elin] Sure. So skimr is pretty popular 00:23:08.877 --> 00:23:12.471 and we do get a lot of different kinds of user feedback. 00:23:12.471 --> 00:23:15.957 We get people who want to know how to do things in skimr, 00:23:15.957 --> 00:23:17.332 they have questions about it. 00:23:17.332 --> 00:23:24.441 We have people who want to make, you know, suggestions for future development. 00:23:24.441 --> 00:23:27.772 And then we get people with issue reports and -- 00:23:27.965 --> 00:23:30.122 I will say... it -- 00:23:30.122 --> 00:23:34.081 when I said having users changes everything, it really does 00:23:34.081 --> 00:23:36.825 because you do have kind of a relationship with them 00:23:36.825 --> 00:23:39.830 and they're using -- you've kind of -- 00:23:39.830 --> 00:23:43.924 It's complex, right? Because you've kind of given them something 00:23:43.924 --> 00:23:48.785 And you want them to be grateful that you gave them this thing 00:23:48.785 --> 00:23:52.038 but you also, you know, in terms of -- 00:23:52.038 --> 00:23:55.814 if you're enjoying your package and you're developing it, 00:23:55.814 --> 00:23:57.621 you want to find out what's wrong. 00:23:57.621 --> 00:24:01.359 So it's kind of like you feel good when people are asking you 00:24:01.359 --> 00:24:03.585 and tweeting to you and stuff like that. 00:24:03.585 --> 00:24:08.330 But it can also get a little bit overwhelming. I will say -- 00:24:08.330 --> 00:24:14.073 And skimr is kind of a strange case because it was first developed at the unconf.
00:24:14.073 --> 00:24:19.512 And so people were tweeting about it like before it was finished, 00:24:19.660 --> 00:24:23.445 before even like the first prototype was finished. 00:24:23.445 --> 00:24:29.289 And so we had a lot of feedback right away about ideas of things to do 00:24:29.289 --> 00:24:35.419 and people started using it. 00:24:35.419 --> 00:24:39.676 Um, and so I'll just tell you how I kind of think about dividing it up 00:24:39.676 --> 00:24:41.213 like one thing I did was: 00:24:41.213 --> 00:24:48.128 within two weeks we had questions on StackOverflow about skimr. 00:24:48.128 --> 00:24:51.888 And so in the end, once it got to like five questions, 00:24:51.888 --> 00:24:55.334 I just created a tag. And so I have a tag that I follow. 00:24:55.334 --> 00:24:57.606 And I find that's helpful. 00:24:57.606 --> 00:25:01.170 We don't get that many questions anymore over there, but -- 00:25:01.170 --> 00:25:05.223 And then we have our issue tracker. 00:25:05.223 --> 00:25:10.476 It's the main place where people show up and it's really helpful in a way 00:25:10.476 --> 00:25:15.858 because we have some kind of heavy users who come in and say: 00:25:15.858 --> 00:25:21.559 "hey, if you're on the development version of tibble, it doesn't work anymore because this happened." 00:25:21.559 --> 00:25:26.230 And so that's helping us keep a little bit ahead of the game 00:25:26.230 --> 00:25:32.838 because you don't want to find out that it breaks with the development version of tibble, 00:25:32.838 --> 00:25:39.801 the day that there's a release and they can be really helpful with that, 00:25:40.561 --> 00:25:43.710 On the other hand, the whole issue of -- 00:25:44.640 --> 00:25:50.390 You know, if you have a package that you're keeping for multiple years now. 
00:25:50.390 --> 00:25:54.439 You have things like your code style that you want to enforce 00:25:54.439 --> 00:25:56.940 like we use spaces some places and 00:25:56.940 --> 00:26:03.206 and we want to use the assignment operator and not the equal sign and things like that... 00:26:03.206 --> 00:26:07.539 And so sometimes it's hard when users want to send a pull request. 00:26:07.539 --> 00:26:10.917 And then you want them -- you want to encourage them to contribute, 00:26:10.917 --> 00:26:12.331 but you don't want them -- 00:26:12.331 --> 00:26:18.227 no, it feels kind of like you're being so OCD on like saying: 00:26:18.260 --> 00:26:20.696 "Hey, would you mind adding a space here?" 00:26:20.696 --> 00:26:25.433 And so I find it challenging to find the balance with that in terms of saying 00:26:25.433 --> 00:26:29.421 I'll just fix it for you, versus asking them to fix. 00:26:29.765 --> 00:26:33.234 [Julia] Yeah, yeah, there was -- some of the things you said in there in terms of like, 00:26:33.234 --> 00:26:37.145 you know, following a tag on StackOverflow, or 00:26:38.362 --> 00:26:44.006 You know, getting that note of should I edit a PR afterwards versus interacting with somebody? 00:26:44.006 --> 00:26:46.590 Are things that I also have, kind of had to figure out, 00:26:46.590 --> 00:26:50.608 like, what am I, what am I going to do. And so that's interesting. 00:26:50.608 --> 00:26:58.143 Um, you addressed some of the issues around also managing feature requests as well, 00:26:58.143 --> 00:27:00.149 which was another interesting question. 00:27:00.149 --> 00:27:03.450 So Leo. I think you um -- 00:27:04.342 --> 00:27:08.524 I wanted to ask you about that issue of hearing from users 00:27:08.524 --> 00:27:11.177 as someone who works on more specialized packages. 
00:27:12.105 --> 00:27:14.815 [Leo] Yes, so the packages I work with on Bioconductor, 00:27:14.815 --> 00:27:16.719 they don't have as many users 00:27:16.719 --> 00:27:18.392 [inaudible] 00:27:18.392 --> 00:27:20.805 you need some very expensive data sometimes 00:27:22.372 --> 00:27:23.586 in order to use them. 00:27:23.586 --> 00:27:28.881 And so the issue I deal with is that 00:27:28.881 --> 00:27:33.092 from one side we have open source tools and we want to provide them 00:27:33.092 --> 00:27:36.293 and, you know, for free and build a community around them. 00:27:36.293 --> 00:27:40.111 But on the other side, sometimes we have people that have this expensive private data. 00:27:40.111 --> 00:27:45.011 Some of it is unpublished and they're scared about sharing it, even when they have questions. 00:27:45.011 --> 00:27:47.734 So you end up getting a lot of emails. 00:27:48.370 --> 00:27:52.737 And I try to convince them, saying that this doesn't really benefit anyone 00:27:52.737 --> 00:27:57.901 because, I mean, I learn from the experience. They learn from the experience, 00:27:57.901 --> 00:27:59.503 but no one else really does. 00:27:59.503 --> 00:28:05.569 So I try to convince them to put their questions on the Bioconductor support website 00:28:05.820 --> 00:28:10.268 and through it share small reproducible examples. 00:28:10.853 --> 00:28:13.711 Sometimes I can write a blog post about the question, 00:28:13.711 --> 00:28:15.449 but that's a lot more work for me. 00:28:15.740 --> 00:28:18.283 [Julia] Yeah, yeah, no, the same. 00:28:18.283 --> 00:28:20.786 I bet this happens to a lot of folks who maintain packages.
00:28:20.786 --> 00:28:22.678 You get the email and then that's what I do too actually is like, 00:28:24.530 --> 00:28:29.025 And sometimes, I will like help the person write the reprex 00:28:29.025 --> 00:28:31.477 and then be like now post it because then it's like, 00:28:31.477 --> 00:28:34.613 well now at least this person knows how to post a reprex 00:28:34.613 --> 00:28:40.201 and can do it next time, because helping someone over email is not -- 00:28:40.201 --> 00:28:43.552 doesn't multiply in the way that like public stuff does. Exactly. 00:28:43.822 --> 00:28:44.620 We've touched a little bit -- 00:28:44.620 --> 00:28:47.593 [Leo] Sorry, just for that, like -- what I try to reward them with 00:28:47.593 --> 00:28:51.220 is answering as fast as I can, but 00:28:51.250 --> 00:28:51.740 [Julia] Yeah. 00:28:51.790 --> 00:28:53.017 [Leo] questions they make. 00:28:53.017 --> 00:28:56.059 [Julia] Yes. Yes, being really responsive on those channels. 00:28:56.059 --> 00:29:00.368 We have already touched a little bit on, like managing issues and feature requests, 00:29:00.368 --> 00:29:03.904 but I wanted to get maybe one other person's perspective on that, 00:29:03.904 --> 00:29:05.661 like about workflows, or whatever. 00:29:05.661 --> 00:29:09.857 Scott, you have a ton of packages, 00:29:09.857 --> 00:29:17.115 and I wonder if you have any perspective on user, on issues, feature requests 00:29:17.115 --> 00:29:19.784 and any thoughts on like workflows around that. 00:29:22.490 --> 00:29:25.637 [Scott] Um, yeah, I guess I have a lot of packages 00:29:25.637 --> 00:29:27.671 but none of them are very popular. 00:29:27.671 --> 00:29:32.565 So I think it's -- I don't really have that sort of tidyverse problem. 00:29:33.665 --> 00:29:40.810 So, but, you know, things I try and do. Or I think Leo said this, you know, responding. 00:29:40.810 --> 00:29:44.222 I try and respond to all issues quickly, even if I just say: 00:29:44.222 --> 00:29:46.288 'Hey, I got it.
And I'm going to have a look at it.' 00:29:46.540 --> 00:29:51.667 I think it's important to sort of give people that feedback so they don't walk away 00:29:51.667 --> 00:29:53.408 from your package. 00:29:53.408 --> 00:29:57.364 And I think that's likely going to happen if they don't have a response. 00:29:57.444 --> 00:29:58.776 [Julia] Yeah. Absolutely. 00:29:58.776 --> 00:30:01.788 [Scott] Um, and then feature requests... 00:30:02.814 --> 00:30:07.520 I think it's always good advice to think about scope creep. 00:30:08.040 --> 00:30:10.967 And if you're, you know -- if something's out of scope, 00:30:10.967 --> 00:30:15.288 then make sure to say that and just, yeah. And instead of -- 00:30:15.288 --> 00:30:22.487 try not to get your package to be too disjointed for users. 00:30:24.086 --> 00:30:25.883 [Julia] Yeah. Nice. Nice. 00:30:25.883 --> 00:30:28.840 Alright, so one of the goals that - like one of the motivating goals 00:30:28.840 --> 00:30:33.880 for this discussion is that, hey, most packages only have one maintainer. 00:30:33.900 --> 00:30:36.869 And it'd be better if there was a 00:30:36.869 --> 00:30:39.459 broader group of people who can maintain. 00:30:39.459 --> 00:30:46.370 So what is a path for someone, for new contributors to R packages? 00:30:46.660 --> 00:30:48.956 So, for example, what would be a first step? 00:30:48.956 --> 00:30:53.710 What should someone do if they want to help maintain one of your packages? 00:30:53.710 --> 00:30:57.629 So let's um... so Erin: can you say that, 00:30:57.629 --> 00:31:00.870 so you've got like a public package on GitHub. 00:31:00.870 --> 00:31:03.254 You maintain packages internally. 00:31:03.254 --> 00:31:05.818 What should someone do if they want to help with one of your packages? 00:31:07.424 --> 00:31:10.010 [Erin] Yeah, I'll take this from the internal side. 00:31:10.300 --> 00:31:14.931 Because I think that's a perspective that I come from a lot more often.
00:31:15.310 --> 00:31:21.409 Because I have like four packages in that case and one package external 00:31:21.409 --> 00:31:28.280 but in terms of how do I look for new contributors and maintainers, 00:31:28.400 --> 00:31:33.502 a lot of my communication and issues and features or feature requests 00:31:33.502 --> 00:31:37.115 for an internal package or like a work specific package 00:31:37.229 --> 00:31:42.627 come via Slack, even though the package is hosted on GitHub or GitLab. 00:31:42.627 --> 00:31:47.760 The questions and comments and issues come in via like a different tool. 00:31:48.263 --> 00:31:54.726 So if someone is constantly asking questions or constantly asking for features, 00:31:54.726 --> 00:31:57.858 it's pretty easy to be like, all right, we'll onboard you to this package 00:31:57.858 --> 00:32:01.400 and then voila, you may update it yourself! 00:32:03.768 --> 00:32:10.519 Exactly. So I think like, for me, if you're interested in the package, 00:32:10.519 --> 00:32:13.823 if you have questions on the package 00:32:13.823 --> 00:32:21.031 and if you've like shown an ability to contribute to any package at all before 00:32:21.031 --> 00:32:25.740 I think one like first initial step is to have your own package 00:32:25.740 --> 00:32:27.703 or have something that you've contributed somewhere 00:32:27.703 --> 00:32:31.039 just to show that you know what an R package is in the first place. 00:32:31.740 --> 00:32:36.833 But really motivation is the important thing. 00:32:36.921 --> 00:32:39.996 [Julia] Yeah. Nice. Nice. What about you, Scott? 00:32:39.996 --> 00:32:43.416 What, like what do you see as like a path for someone to get on? 00:32:43.416 --> 00:32:47.784 And what would be like a first step for someone who is interested?
00:32:49.577 --> 00:32:52.103 [Scott] Yeah, I guess my first, my main point was 00:32:52.103 --> 00:32:56.369 what Erin already said, essentially, you know, 00:32:56.369 --> 00:33:01.198 from my experience, like the most successful sort of new contributors, 00:33:01.198 --> 00:33:05.218 or people that end up taking over packages or contribute a lot, 00:33:05.218 --> 00:33:08.868 are people that use the package and their package is a dependency 00:33:08.868 --> 00:33:10.693 or in a project or whatever. 00:33:10.693 --> 00:33:15.664 And so they sort of have this at least short term, you know, vested interest in the package 00:33:15.664 --> 00:33:20.006 because you know you have drive-by contributors that will fix a bug 00:33:20.006 --> 00:33:22.398 or do this or that. 00:33:22.398 --> 00:33:27.244 But it's often when your package is a dependency 00:33:27.244 --> 00:33:30.327 or sort of major part of somebody's project or something. 00:33:30.327 --> 00:33:35.310 And so that's always a good, a good place to find contributors. Um, yeah. 00:33:36.871 --> 00:33:41.356 [Julia] Yes. Well, speaking of dependencies, speaking of dependencies... 00:33:41.356 --> 00:33:43.362 That's a big part. 00:33:43.362 --> 00:33:48.691 I mean, that's a big bit of like the decisions around maintaining an R package 00:33:48.691 --> 00:33:53.001 like deciding what do I want to take on as a dependency. 00:33:53.786 --> 00:33:57.801 Like do I want to, do I want to... like what do I want to depend on? 00:33:58.811 --> 00:34:03.802 Do I want to like rewrite something internally or take on a dependency, 00:34:03.802 --> 00:34:10.493 like everything from like something like, you know, off to some little algorithm or whatever. 00:34:10.743 --> 00:34:15.157 So, so I would love to hear something about that. 00:34:15.157 --> 00:34:19.891 So Leo, what are some of the thoughts you have had 00:34:19.891 --> 00:34:22.222 as you have made those decisions in your packages?
00:34:22.718 --> 00:34:27.447 [Leo] Yes. So, in the Bioconductor realm, 00:34:28.025 --> 00:34:33.255 there's the Bioconductor core team that have their own grants and funding 00:34:33.255 --> 00:34:37.580 and they maintain the core packages, the core infrastructure packages. 00:34:37.580 --> 00:34:39.237 So I try to depend on those 00:34:39.237 --> 00:34:41.855 because I know they're going to be professionally maintained. 00:34:42.929 --> 00:34:45.687 Also they have access to your package. 00:34:45.687 --> 00:34:50.705 So if you depend on them and they make a change that breaks your package, 00:34:50.705 --> 00:34:55.004 they can actually go and fix yours without you actually doing anything. 00:34:56.746 --> 00:34:59.817 And similarly, we try to rely on like the tidyverse packages, 00:34:59.817 --> 00:35:07.142 because I know that they're well funded to keep working on the packages and fix them. 00:35:07.142 --> 00:35:12.531 But I also like to depend on packages from authors that I have interacted with in the past. 00:35:12.531 --> 00:35:16.480 That's also sometimes how I find out about these packages from like-- 00:35:16.570 --> 00:35:18.030 [Julia] Yeah yeah 00:35:18.490 --> 00:35:22.788 Yeah yeah yeah yeah that -- yeah, those that -- So yeah, 00:35:22.788 --> 00:35:25.240 so things that you know are stable projects, 00:35:25.240 --> 00:35:27.602 things that you know you have relationships with people, 00:35:27.602 --> 00:35:30.015 that you'll be able to communicate with. 00:35:30.015 --> 00:35:32.382 Yeah, that all, that all makes sense. 00:35:32.382 --> 00:35:36.841 Another, um, some -- so there's dependencies, 00:35:36.841 --> 00:35:40.640 and then there's -- then, there are also 00:35:41.380 --> 00:35:45.625 like other things that can change, right, and you have to manage that. 00:35:45.625 --> 00:35:48.557 Then there's also APIs that can change. 00:35:48.557 --> 00:35:54.800 So like Erin, your package that's on GitHub is an API package, right?
00:35:56.160 --> 00:35:58.066 [Erin] Yeah, exactly! 00:35:58.106 --> 00:36:02.931 [Julia] And like, so you have to, you have to like pay attention to 00:36:02.931 --> 00:36:04.777 when the API itself changes. 00:36:06.874 --> 00:36:11.273 [Erin] Yeah, precisely so as an example: 00:36:12.563 --> 00:36:18.131 Astronomy Picture of the Day, or APOD as it's more commonly called 00:36:18.131 --> 00:36:23.361 can either post a, like a picture or an image or a gif. 00:36:23.361 --> 00:36:27.346 And then all of that information is supposed to get transferred back into the API. 00:36:27.346 --> 00:36:30.871 But for a really long time, there was an error. 00:36:30.871 --> 00:36:35.959 Anytime there was the -- with the API, anytime there was a video 00:36:35.959 --> 00:36:37.521 that was accessed. 00:36:37.521 --> 00:36:41.768 So it was able to basically download any information about any pictures 00:36:41.768 --> 00:36:44.764 but any time there was a video, there was a problem. 00:36:45.060 --> 00:36:48.033 So I wrote in this like whole test. 00:36:48.033 --> 00:36:54.337 Like: if video, do not pull this day of information. 00:36:54.690 --> 00:37:00.245 So that the user doesn't see the error, they just don't get an image back 00:37:00.245 --> 00:37:02.304 and then they fixed the -- 00:37:02.304 --> 00:37:03.850 [Julia] They fixed it! 00:37:03.890 --> 00:37:09.757 [Erin] Making my whole little workaround unnecessary. 00:37:10.131 --> 00:37:12.298 So like keeping on track of like 00:37:12.298 --> 00:37:15.651 (a) what's like issues are happening in the API 00:37:15.651 --> 00:37:18.972 to either like find workarounds or solve them 00:37:18.972 --> 00:37:26.806 or even like do a pull request to this like API source to fix it for them. 00:37:27.466 --> 00:37:31.149 I think it's like an important part of maintaining 00:37:31.149 --> 00:37:36.219 a package solely based on an API structure. 
00:37:36.219 --> 00:37:37.990 [Julia] Yeah and you know that's, um -- 00:37:38.120 --> 00:37:41.950 There are a lot of parallels just with, like, if you're dependent 00:37:41.950 --> 00:37:45.725 on another R package in general, you know, like everything you just said about the API, 00:37:45.725 --> 00:37:49.748 that happens with just other software that you're dependent on. 00:37:49.748 --> 00:37:54.955 Either other R packages or non-R software, you know, and that is for sure 00:37:54.955 --> 00:38:01.830 part of this whole deal and like, choosing carefully what software 00:38:02.120 --> 00:38:05.348 you are going to decide to use or not. 00:38:05.348 --> 00:38:09.777 And, you know, people do -- you know Leonardo shared his perspective. 00:38:09.777 --> 00:38:13.216 But people have different sets of priorities they bring to that 00:38:13.216 --> 00:38:18.081 and make different decisions, depending on their own perspective 00:38:18.081 --> 00:38:20.770 which I think, you know, makes sense and it's fine. 00:38:20.770 --> 00:38:23.460 Um, one thing that happens in packages is that 00:38:25.790 --> 00:38:26.500 they -- that 00:38:27.620 --> 00:38:30.345 packages don't keep the maintainer forever. 00:38:30.345 --> 00:38:32.700 I mean, you know, we are talking about the fact that, 00:38:32.700 --> 00:38:34.880 like the qualtRics package had a different maintainer. 00:38:34.880 --> 00:38:38.010 He wasn't using it anymore. And then I started maintaining it. 00:38:38.090 --> 00:38:42.300 And actually, like, since I moved -- I switched jobs from Stack Overflow to RStudio 00:38:43.130 --> 00:38:46.915 RStudio doesn't use Qualtrics for surveys and so I'm actually -- 00:38:46.915 --> 00:38:50.667 Like I kind of have a stopgap thing in place for now, 00:38:50.667 --> 00:38:54.780 but I'm gonna, I'm actually looking for someone else to take over qualtRics in the long term.
00:38:55.337 --> 00:38:56.580 Because it like -- 00:38:56.580 --> 00:39:00.730 it is better, if it's someone who uses it, who is actually actively using it so that 00:39:01.140 --> 00:39:05.610 So this is something that has to happen in the real world, is that 00:39:05.610 --> 00:39:10.080 packages, pieces of software have to change maintainers. 00:39:10.080 --> 00:39:13.516 And so I'm wondering what sets -- 00:39:13.516 --> 00:39:18.898 if you have experienced this, what sets that up for success? 00:39:18.898 --> 00:39:24.781 And this is something that probably looks different in open source software 00:39:24.781 --> 00:39:29.690 versus internal packages versus, you know, and really big packages 00:39:29.690 --> 00:39:31.255 versus small packages. 00:39:31.255 --> 00:39:39.267 So maybe, Scott, can you talk about what this has looked like for you? 00:39:39.267 --> 00:39:44.274 You know, how you would manage that, say in rOpenSci? 00:39:45.730 --> 00:39:46.790 [Scott] Yeah. 00:39:47.618 --> 00:39:51.780 So for most of the ones I've been involved with 00:39:52.400 --> 00:39:59.310 they've mostly been sort of wholesale letting somebody else 00:39:59.310 --> 00:40:01.968 manage the package without sort of me being involved. 00:40:01.968 --> 00:40:07.985 And so that's mostly what's happened and I think that's worked okay 00:40:07.985 --> 00:40:13.930 and I think like one of the things that you have to be okay with is sort of giving up, 00:40:13.930 --> 00:40:18.080 being okay with giving up control of your baby. 00:40:18.232 --> 00:40:19.530 [Julia] Yeah, it's not yours anymore so... 00:40:19.530 --> 00:40:24.040 [Scott] Yeah, that can be hard, but you know, you just have to sort of say, 00:40:24.040 --> 00:40:29.095 you know, the new person is the maintainer and if they want to change the functions 00:40:29.095 --> 00:40:32.620 and whatever, like it's you know it's their package, they're the maintainer.
00:40:33.190 --> 00:40:37.630 I think it's worked pretty well, but I think an important thing is being there. 00:40:37.700 --> 00:40:41.360 Being, you have to sort of be available at least for a little while 00:40:41.360 --> 00:40:46.319 for people to get oriented and that can take some time. 00:40:46.319 --> 00:40:49.897 And I think an important thing when looking for a new maintainer 00:40:49.897 --> 00:40:52.628 is trying to find somebody that knows the topic area. 00:40:52.858 --> 00:40:53.784 [Julia] Yes, absolutely. 00:40:53.784 --> 00:40:56.524 [Scott] That's like if it's a genomics package, 00:40:56.524 --> 00:41:00.155 then it should be somebody in genomics probably because they're going to maybe use it 00:41:00.155 --> 00:41:05.127 and maybe know the area, know the sort of ins and outs of that type of data. 00:41:05.127 --> 00:41:07.180 So. Yeah. 00:41:07.770 --> 00:41:08.914 [Julia] Yeah, all right. 00:41:08.914 --> 00:41:13.709 Erin, can you reflect on that maybe in the internal package domain? 00:41:13.709 --> 00:41:17.250 Like what, like, what does it take to pass things off well 00:41:17.250 --> 00:41:22.124 because that, actually in my experience, it happens a lot in internal 00:41:22.124 --> 00:41:23.750 because people change jobs. 00:41:24.410 --> 00:41:25.629 [Erin] Yeah, exactly. 00:41:25.629 --> 00:41:29.704 I think one of the major differences that I've seen with passing an internal package 00:41:29.704 --> 00:41:33.780 is the like time of notice. 00:41:33.780 --> 00:41:35.220 [Julia] Yeah. 00:41:35.220 --> 00:41:38.585 [Erin] Someone switching jobs, they may not tell the other people 00:41:38.585 --> 00:41:41.554 that they're heading out until like two weeks beforehand, 00:41:41.554 --> 00:41:46.082 in which case they have a lot of other things to offboard. 00:41:46.082 --> 00:41:49.890 That might not be top priority.
00:41:49.890 --> 00:41:56.541 So I think it comes to having clear like guidelines 00:41:56.541 --> 00:42:00.080 around what the package does, the style of the code, where it's located, 00:42:00.150 --> 00:42:06.350 where questions and answers happen, like if that's in Slack 00:42:06.350 --> 00:42:11.810 or if that's in GitHub, and a way to sort of pass off everything 00:42:11.810 --> 00:42:14.530 through written documentation 00:42:14.530 --> 00:42:18.830 if like in person or over Zoom communication 00:42:18.830 --> 00:42:24.940 like can't happen due to other time commitments at work. 00:42:25.350 --> 00:42:32.250 But if like possible, then having like a real like onboarding experience 00:42:32.250 --> 00:42:35.995 of walking someone through the ins and outs of a package, 00:42:35.995 --> 00:42:38.057 I've found to be very useful. 00:42:38.057 --> 00:42:40.720 But there's not always a lot of time for it. 00:42:41.270 --> 00:42:43.138 [Julia] Absolutely. Absolutely. 00:42:43.138 --> 00:42:47.670 All right, one question I'd like to ask is about the decision 00:42:47.670 --> 00:42:56.231 to submit a package to some kind of like centralized repository like CRAN or Bioconductor 00:42:56.231 --> 00:43:00.760 or to do something like peer review, like rOpenSci, 00:43:00.760 --> 00:43:06.619 or just the Journal of Open Source Software versus maybe to stay only on GitHub. 00:43:06.619 --> 00:43:10.579 And Elin, I was wondering, so you know, maybe in the context of 00:43:10.579 --> 00:43:12.499 you've worked in a lot of different kinds of software, 00:43:12.499 --> 00:43:15.969 but then you had skimr, you all started it at the unconf 00:43:15.969 --> 00:43:20.520 and then you know, so it was an rOpenSci package 00:43:20.520 --> 00:43:22.239 and then you did decide to submit it to CRAN 00:43:22.239 --> 00:43:27.707 like what do you think, how do you, what do you think are the right things 00:43:27.707 --> 00:43:31.560 to consider when making those decisions?
00:43:32.140 --> 00:43:36.540 [Elin] So it's good because it's a really good question. 00:43:36.640 --> 00:43:42.688 We took a while to decide to submit it to CRAN 00:43:42.688 --> 00:43:47.175 like at first we were just working on getting the functionality and thinking about it. 00:43:47.175 --> 00:43:51.031 And we reverted -- you know, version numbers are really important. 00:43:51.031 --> 00:43:55.625 And at the conference, at the unconf, we kind of started, we said it's version one 00:43:55.625 --> 00:44:01.430 but then afterwards, a few weeks later, we went back and said it was like version 0.5 instead 00:44:01.430 --> 00:44:04.713 because once you say it's version one, 00:44:04.713 --> 00:44:08.729 you're really kind of making a promise to people that it's going to work. 00:44:08.729 --> 00:44:13.779 And you can always, if it's less than one, kind of say it doesn't. 00:44:13.779 --> 00:44:15.600 'Yeah, we're not promising anything.' 00:44:15.600 --> 00:44:19.270 And you can put that in your README. And definitely when you're going to CRAN, 00:44:19.340 --> 00:44:23.090 all of a sudden, it really, you know, they're going to do what they do. 00:44:23.090 --> 00:44:25.999 Everybody complains, but they're maintainers too, right? 00:44:25.999 --> 00:44:30.793 And so they're going to do what they do to make sure that everything works 00:44:30.793 --> 00:44:36.930 and they're going to find a million little things that you didn't really follow the rules on. 00:44:38.580 --> 00:44:41.865 And then all of a sudden, you have this world of users 00:44:41.865 --> 00:44:45.214 and you've kind of made this published manual on the web 00:44:45.214 --> 00:44:48.937 that anybody can find and it's just a different feeling 00:44:48.937 --> 00:44:52.567 once you're in one of those repos, I think, in one of those repositories. 00:44:52.567 --> 00:44:56.708 With just in GitHub, I actually sometimes don't even put a license.
00:44:56.708 --> 00:45:00.299 I mean, I know they get mad but I just don't put a license sometimes 00:45:00.299 --> 00:45:05.324 because I'm like, I'm not even sure I want people to have that much confidence 00:45:05.324 --> 00:45:09.820 in this package, that they should be using it. 00:45:09.820 --> 00:45:13.409 And you know, I do have another one from the following year's unconf, 00:45:13.409 --> 00:45:15.100 which is called qcoder. 00:45:15.100 --> 00:45:18.065 And we actually have quite a few users of qcoder, 00:45:18.065 --> 00:45:21.277 but not at the same volume, because it's not, you know, 00:45:21.380 --> 00:45:25.495 it could go on CRAN, you know, probably, I could get it ready in a couple weeks. 00:45:25.495 --> 00:45:30.553 But I just, I don't feel like ready to have a lot of users there. 00:45:30.553 --> 00:45:33.777 So I just think you're making that big decision. 00:45:33.777 --> 00:45:38.272 The other thing is, once you're on CRAN, that's actually when -- 00:45:38.272 --> 00:45:41.918 and I'm sure with Bioconductor as well, then all of a sudden you're going to have 00:45:41.918 --> 00:45:47.453 other packages using you as a dependency, and especially because they changed, you know, 00:45:47.453 --> 00:45:54.998 nobody can use a GitHub package anymore if you're, you know, in CRAN and so it -- 00:45:54.998 --> 00:45:56.114 but it has, you know -- 00:45:56.114 --> 00:45:59.120 once you have those other people out there, then depending on you, 00:45:59.120 --> 00:46:05.363 that also creates a level of kind of social obligation, social contract 00:46:05.363 --> 00:46:07.420 where, you know, you could say: 00:46:07.420 --> 00:46:10.110 'Okay, I'm just gonna let my package get archived.' 00:46:10.110 --> 00:46:14.106 But then all this other stuff breaks and you know you feel bad about that. 00:46:14.106 --> 00:46:16.130 Well, if you're me anyway. 00:46:17.380 --> 00:46:22.150 So, you're kind of -- once you're in, it's there.
00:46:22.150 --> 00:46:23.579 There's just a snowballing of it. 00:46:23.579 --> 00:46:27.600 And I feel like in, you know, your GitHub, you can just say: 00:46:27.600 --> 00:46:31.080 'Hey, I put it out there. Feel free to fork it.' 00:46:31.080 --> 00:46:33.789 Right, that's another thing no one mentioned, right? 00:46:33.789 --> 00:46:38.675 I mean, again in open source, there is kind of the social contract 00:46:38.675 --> 00:46:41.347 that a fork is the last resort. 00:46:41.347 --> 00:46:49.506 But if a maintainer totally ghosts the project, then someone else can always fork the project 00:46:49.506 --> 00:46:53.018 and make the fixes and you know, I certainly have done that. 00:46:53.410 --> 00:46:57.270 Not for public consumption but just for me. 00:46:57.270 --> 00:47:02.717 Yeah, where there's like -- for teaching I use RStudio Server a lot. 00:47:02.717 --> 00:47:06.340 And there's some packages that don't work well on RStudio Server. 00:47:06.340 --> 00:47:08.967 And so, you know, I have my little fixes. 00:47:08.967 --> 00:47:13.430 You know, it's like: when you're ready for my bug and interested in supporting it, 00:47:13.430 --> 00:47:17.152 I'll send you my pull request again. 00:47:17.152 --> 00:47:20.481 But I'm not going to like get into an argument with a maintainer about that. 00:47:20.681 --> 00:47:29.011 So it's -- there's just a -- but I do, I feel it is this big -- it's kind of like going public. 00:47:29.011 --> 00:47:35.398 And now you're out there and you have people depending on you 00:47:35.398 --> 00:47:38.729 and you said you're ready so... 00:47:38.729 --> 00:47:39.504 [Julia] Yeah, yeah. 00:47:39.504 --> 00:47:45.240 No, those are really good thoughts on those decisions to submit to those central repos. 00:47:45.310 --> 00:47:49.683 Okay, so now it's time for our last question. So for our last question,
00:47:49.683 --> 00:47:52.816 I'm gonna -- I want everybody to say what their response is, 00:47:52.816 --> 00:47:59.830 maybe just kind of in like one sentence, if at all possible, and just like one sentence. 00:47:59.960 --> 00:48:09.160 So, for this last question, let's say, let's all say, what does someone need to know 00:48:09.600 --> 00:48:17.563 like in terms of like need to know or skills to start maintaining a package? 00:48:17.563 --> 00:48:20.765 So, Leonardo, can you go first? 00:48:20.765 --> 00:48:23.967 What does someone need to know to start maintaining a package? 00:48:25.098 --> 00:48:28.839 [Leo] Okay, so for me it's: you have to be willing to communicate regularly. 00:48:28.839 --> 00:48:32.820 So that means responding to emails or Slack messages in a timely fashion. 00:48:32.820 --> 00:48:36.670 You have to also learn how to ask questions in such a way that others can help you fast 00:48:36.670 --> 00:48:41.522 and ultimately you need to practice patience and be patient with yourself, 00:48:41.522 --> 00:48:44.076 be patient with others and practice empathy with others 00:48:44.076 --> 00:48:47.629 because they're helping you with their time. 00:48:47.710 --> 00:48:49.785 [Julia] I love it, I love it. Fantastic. 00:48:49.785 --> 00:48:55.720 Erin, what do you think people need to know to start maintaining a package? 00:48:56.056 --> 00:48:59.730 [Erin] Leo stole my answer. But I will reiterate it. 00:48:59.730 --> 00:49:03.690 Which is like really good communication skills. 00:49:04.630 --> 00:49:09.890 Both to answer questions and to write up really great documentation 00:49:09.918 --> 00:49:15.080 that helps to mitigate the types of questions and issues. 00:49:15.080 --> 00:49:16.460 [Julia] That's awesome! 00:49:16.460 --> 00:49:21.632 Elin, what do you think somebody needs to know to start maintaining an R package? 00:49:22.290 --> 00:49:25.632 [Elin] I think you need to know that you are really willing to do it.
00:49:25.632 --> 00:49:28.420 I think you need to know you really like your package actually. 00:49:28.420 --> 00:49:32.337 Like you don't put a package out in the world 00:49:32.337 --> 00:49:36.729 because you want other people to maintain it, right? Or give you bug fixes. 00:49:36.729 --> 00:49:38.394 It's because you want it to work. 00:49:38.394 --> 00:49:40.529 [Julia] Nice. I love that. I love that. 00:49:40.529 --> 00:49:46.362 Scott, what do you think someone needs to know to start maintaining an R package? 00:49:48.247 --> 00:49:53.673 [Scott] So if you're somebody that only writes scripts 00:49:53.673 --> 00:49:59.566 -- which I did, you know, the first probably four years of using R -- 00:49:59.566 --> 00:50:00.619 learn functions. 00:50:00.619 --> 00:50:05.460 So you can't really make an R package if you just have scripts. 00:50:05.460 --> 00:50:11.820 So I would say if there's one thing to learn, it is to learn how to write functions and use them. 00:50:12.220 --> 00:50:17.075 [Julia] Nice. I love that too. Awesome. Awesome! I love this whole discussion that we have had. 00:50:17.075 --> 00:50:22.497 And it really aligns so strongly with the experiences I've had 00:50:22.497 --> 00:50:26.219 maintaining a couple different packages. And when I think about -- 00:50:26.219 --> 00:50:32.290 So, I took on the qualtRics package, which is an rOpenSci package 00:50:32.291 --> 00:50:39.594 for accessing survey data from Qualtrics through their API. 00:50:39.594 --> 00:50:43.475 So I took it on from one maintainer from before, 00:50:43.475 --> 00:50:48.814 and now I'm thinking about now, like, what will, like what happens if I, you know like now 00:50:48.814 --> 00:50:50.994 I need to find the new maintainer, as I pass it on too. 00:50:50.994 --> 00:50:52.997 And as I think about all those things you all said, 00:50:52.997 --> 00:50:55.825 like what someone needs to know, I agree entirely.
00:50:55.825 --> 00:50:58.510 And I think about like in that particular -- 00:50:58.510 --> 00:51:04.130 One thing I'm going to add, as I think through this, 00:51:04.130 --> 00:51:09.507 is that, like, really, in an ideal world, like the person is someone 00:51:11.451 --> 00:51:23.870 Someone who is like a user of that, like someone who is kind of the audience. 00:51:23.870 --> 00:51:25.395 Like you can't -- 00:51:25.395 --> 00:51:30.187 And it really aligned with what Elin was saying about you care about that domain. 00:51:30.187 --> 00:51:37.260 And if you're someone who is the audience for that, then you're like: 00:51:37.400 --> 00:51:44.940 'Yep, I'm ready to maintain this because I'm actively using it and know how to fix it!' 00:51:45.240 --> 00:51:50.080 And so that is another -- Like for example when I'm -- 00:51:50.080 --> 00:51:52.826 When we're going to be talking about, like, who's going to take over qualtRics? 00:51:52.826 --> 00:51:57.084 Like that's going to be -- that's a big part of it, right? 00:51:57.084 --> 00:52:03.362 Like someone who is a person who uses Qualtrics and understands how packages are put together, 00:52:03.362 --> 00:52:07.804 and has these responsive communication skills. 00:52:07.804 --> 00:52:12.862 So thank you so much panelists for that wonderful discussion. 00:52:12.862 --> 00:52:16.250 I think that Stefanie is going to wrap us up with a few announcements. 00:52:19.120 --> 00:52:24.179 [Stefanie] I am, thank you so much. My heart is full today. 00:52:24.179 --> 00:52:30.481 This was really such a wonderful discussion. I love it, particularly because we thought: 00:52:30.481 --> 00:52:33.574 'Oh, sure. Let's do a community call as a panel discussion.' 00:52:33.574 --> 00:52:38.781 But of course, that could just be so disorganized and people chattering. 00:52:38.781 --> 00:52:41.371 This was very well planned.
And I thank the panel so much 00:52:41.371 --> 00:52:45.247 because we all met a week ago to talk about this. 00:52:45.247 --> 00:52:47.930 So this is not what an impromptu panel discussion looks like. 00:52:47.930 --> 00:52:51.730 A lot of work went into this on the part of the panelists. 00:52:51.730 --> 00:52:57.116 And so I thank all of you sincerely. This could not have been more successful, I think. 00:52:57.116 --> 00:53:00.326 We can even function without Julia's house having internet. 00:53:00.326 --> 00:53:01.710 So this is wild. 00:53:01.800 --> 00:53:06.960 At the peak, we actually had 90 participants attending this call. 00:53:06.960 --> 00:53:12.120 So congratulations to everybody for joining. We shared a kind of cool thing today. 00:53:12.760 --> 00:53:15.460 I wanted especially to thank, 00:53:15.460 --> 00:53:18.800 I noticed Janani Ravi was taking a bunch of notes in responses 00:53:18.800 --> 00:53:22.510 as the panelists were talking. So thank you very much for capturing that. 00:53:23.100 --> 00:53:27.905 I also noticed quite a number of people have been adding their questions 00:53:27.905 --> 00:53:31.087 and answering a bit in questions Part B. 00:53:31.087 --> 00:53:35.035 So that's really cool because I didn't notice as the discussion was happening. 00:53:35.520 --> 00:53:39.730 In this shared Google Doc, I'm going to leave this open for editing, 00:53:39.730 --> 00:53:42.200 at least for another 24 hours. 00:53:42.200 --> 00:53:46.799 So, if you have to go off to other meetings, I'll leave this open for editing for a while 00:53:46.799 --> 00:53:50.520 so that you can come in, add additional questions you have, 00:53:50.520 --> 00:53:54.757 answer each other's questions. Participants here can add their comments.
00:53:54.757 --> 00:53:57.492 Ideally, if you're willing to put your name beside that, 00:53:57.492 --> 00:54:00.688 add your comments to some of the questions that the panelists were asked, 00:54:00.688 --> 00:54:05.520 because we really do have such a rich amount of expertise here in the audience. 00:54:05.780 --> 00:54:09.775 After about 24 hours, I'll lock the document to view only. 00:54:09.775 --> 00:54:14.417 It, along with the video of this call, is going to be posted on the archive page. 00:54:14.417 --> 00:54:19.400 So it'll be at ropensci.org/commcalls. This will live there forever. 00:54:19.730 --> 00:54:21.290 What else do I want to tell you? 00:54:21.290 --> 00:54:27.460 Please, before you go, please, add your name to the attendees list in the doc. 00:54:27.560 --> 00:54:28.965 I don't share that much. 00:54:28.965 --> 00:54:31.357 Just for us to know what countries you came from 00:54:31.357 --> 00:54:33.750 and what organizations... That kind of thing. 00:54:34.460 --> 00:54:39.128 We have a new discussion category in our public forum. 00:54:39.128 --> 00:54:45.994 So our public forum is discuss.ropensci.org, and just in the last couple of days, 00:54:45.994 --> 00:54:48.770 we created a package maintenance category. 00:54:49.130 --> 00:54:53.185 I encourage anyone, especially people who have said they're feeling a bit overwhelmed, 00:54:53.185 --> 00:54:55.997 they're just getting involved in maintaining a package. 00:54:55.997 --> 00:54:57.990 Please ask your questions there. 00:54:59.010 --> 00:55:05.366 Some of our, sort of like, internal maintainers will also get a flag when something's posted there. 00:55:05.366 --> 00:55:07.573 So they may be able to come and answer your questions. 00:55:07.573 --> 00:55:09.996 You can answer each other's questions. So right now it's empty. 00:55:09.996 --> 00:55:12.960 It's just a category that exists, and I encourage you to use it. 00:55:14.339 --> 00:55:16.770 Do I have anything else I need to tell you?
00:55:17.250 --> 00:55:19.070 I think that's it. 00:55:19.510 --> 00:55:24.471 You really, it's only 10 o'clock in the morning for me here in Kamloops, British Columbia, 00:55:24.471 --> 00:55:27.590 you set me off to start a wonderful day. 00:55:27.590 --> 00:55:31.605 I thank you all for joining us wherever you are in the world, 00:55:31.605 --> 00:55:38.668 and I wish you both a physically and mentally healthy and happy rest of the day. 00:55:38.668 --> 00:55:40.660 Thanks so much, everyone. 00:55:40.660 --> 00:55:41.900 Bye bye.