So,
It's official. We're starting.
Hello everyone, I'm Stefanie Butland,
rOpenSci's Community Manager
and I so warmly welcome you to
our community call
on maintaining an R package.
I wanted to first acknowledge that everyone here
is under extraordinary stresses right now
in light of the COVID-19 pandemic.
None of us knows what weights
other people are carrying at this time.
It's a remarkable thing that all of us
have decided to take this hour
to come together and be together
as a community.
And so many of you I know as sort of warm,
generous, accepting people.
And so I thank you all
for taking the time to do this,
and for the next one hour of your lives,
I've got your back!
And now for something completely different:
rOpenSci is a non-profit initiative,
founded in 2011
by Karthik Ram, Scott Chamberlain,
and Carl Boettiger.
We enable open and reproducible research
by building technical infrastructure
in the form of staff- and community-contributed
R software tools
and we build social infrastructure
in the form of a welcoming and diverse community.
You can find our bi-weekly newsletter
at news.ropensci.org.
We have a code of conduct
that applies to both in-person
and online interactions,
like this call today.
You can find it linked
from the footer of our website
and it includes reporting
and enforcement guidelines.
The session is being recorded
and the video and any other resources
are going to be posted on our website
at ropensci.org/commcalls
within about three business days.
I'm going to tweet from rOpenSci
when those things are up.
We use a shared Google Doc,
which, if anyone's in there,
could you please paste the link back
into the Zoom
so that new joiners can see it.
We typically use a Google Doc
in our community calls
for collaborative note taking.
You can find that at
bit.ly/ropensci-commcall-maintaining
I'd like you to add your name
and your affiliation into the attendees list.
And there's a format there that you can follow
because it helps me grab that information.
In this call, it's a different flavor for us,
we've decided to do this as a panel discussion
as opposed to a series of talks
with attendees asking questions.
And so we're going to have
one short presentation,
followed by a panel discussion,
full of pre-selected questions.
I've already typed each of the questions
that we're planning to ask into the doc,
and so I invite anyone to add any comments
you're interested in capturing from the panelists,
as well as any expertise,
any thoughts you might have about this
because we really are
a bunch of rich resource people here.
And so audience answers are just as valuable
as the panelist answers at this point.
And this is something that we end up sharing
as a long-term resource.
It will be accessible forever.
This time, unfortunately,
we won't have time for taking impromptu questions
from attendees on the community call
but I've created a separate section in the doc
called questions - Part B,
where you can ask questions
that come up for you
and I invite anyone
to answer each other's questions.
And together, as I say, we'll make this
a rich resource for everyone to use.
So finally, it's my pleasure
to introduce our panel.
Julia Silge recently joined RStudio
as a data scientist and software engineer.
When we put out a call for a new
maintainer for the qualtRics package,
Julia took it on because
she was using it in her day job
as a data scientist at StackOverflow
at the time,
and she worked on the annual developer survey
using qualtRics.
She also maintains other R packages,
including tidytext,
which has been downloaded almost
900,000 times.
Elin Waring
is a professor of sociology and interim dean
of the School of Health Sciences, Human Services
and Nursing at Lehman College, CUNY.
She teaches research methods
and statistics.
Elin was part of the rOpenSci Unconf17 group
that developed the skimr package
which has become very popular
at over 250,000 downloads.
Elin works with Michael Quinn
to maintain skimr
as they've shepherded it
through two major releases already.
She formerly was a contributor
and maintainer
for the Joomla CMS project
and her approach to maintaining
is influenced by that experience.
This includes understanding
the importance of having a clear concept
of what you're trying to achieve,
being able to politely but firmly say no,
and knowing that having users changes everything.
Erin Grand is a data scientist
at Uncommon Schools
and a Board member of
R-Ladies New York City.
Erin created and maintains a package
for NASA's Astronomy Picture of the Day.
It's called astropic and it was inspired
by her early love of astronomy.
And one of her own images was featured
as Astronomy Picture of the Day.
Life goal achieved!
She also maintains a set of
internal packages at her work.
Leonardo Collado-Torres is a
research scientist
at the Lieber Institute
for Brain Development.
He maintains several
Bioconductor packages,
including the recently submitted spatialLIBD
for spatial transcriptomics data.
He's a co-founder of the LIBD rstats club
and of the CDSB Mexico community for R
and Bioconductor in Latin America,
and those members just submitted
their first package to Bioconductor.
This represents a dramatic percent increase
in Latin American Bioconductor developers.
So, congratulations!
Also, congratulations to Leo, who,
just in the last couple of days,
was promoted
to the position of Research Scientist,
and he's written a post about that.
Scott Chamberlain, our final panelist,
is a co-founder and technical lead of rOpenSci.
He maintains, in his words,
probably too many packages.
Part of Scott's work involves
finding new maintainers
for rOpenSci peer reviewed packages.
And he tries to find those when
a current maintainer needs to move on,
like the qualtRics example.
His bio is really the shortest
because he's actually far too humble.
Julia is going to speak
for about 10 minutes
and then for the rest of the hour,
she's going to be moderating a panel discussion
with pre-selected questions.
For people who have just joined,
thank you for sharing the link
to the shared Google Doc again in the Zoom.
Please add your notes there,
add your own questions in the bottom.
I am now going to share my screen
because, I will note for people who just joined,
Julia is not sharing her lovely face and home backdrop:
there was an earthquake
where she is,
and she lost internet.
And so I'm going to be showing her slides.
So if you'll give me a moment.
Sure... Well... [inaudible]
[Julia Silge] Fingers crossed.
Yeah, there was a 5.7 earthquake
[inaudible] and it was not super big,
but enough that one of the aftershocks
knocked out internet at my house,
and I'm trying to still be able to speak.
If we lose me,
then I know that everything will keep going
and in a great way.
So we're going to talk through these slides,
just briefly, about particular perspectives
on maintaining an R package.
So if we go to the next slide...
When we think about an R package,
there are resources about
how to build an R package,
and what we focus on
is the technical aspects.
But once you're past that,
once you've actually
built your package
and people are using it,
there's quite a balance
in the amount of time
that we spend managing technical work,
which is, of course,
extremely important, with the social aspects
of who is using the package. Maintaining, if you will,
often involves asking
a lot of the right questions.
(go to the next slide)
Some of the right kinds of questions
that we ask
when we're thinking about
what it takes to keep a package
up to date are some of these, like:
Is this a package that's used really broadly
by a lot of different kinds of people,
including beginners?
Is this a package that has a specialized use case
or that's used by people who know each other,
internally at a company?
Is the person maintaining the package,
the person who put it together originally,
or has it been passed along
a couple of times?
When you think about
maintaining your package...
I'm interested, as we have our discussion,
in people's perspectives on
how we handle change.
You can think either
about packages changing over time
or about packages being superseded.
And it's been interesting in our discussions
preparing for this community call.
Software doesn't live forever.
And when we build software,
do we put thought into
what we expect to happen next?
So we can go to the next slide.
One of the motivating ideas
of setting up this community call
is that most software out there,
whether you're talking about R packages or not,
has one main person who keeps it running,
and a goal of rOpenSci,
and of a lot of us in this community,
is to build up the sustainability
of our software ecosystem.
Having just one maintainer
is somewhat brittle,
and it also contributes to burnout
and to uncertainty about
what is going to happen next.
Some other things
you can struggle with are that,
you know, literally no one else knows
what to do with the internals of the package,
and that one person has to manage
what can sometimes feel like
an overwhelming amount of feedback from users.
If you go to the next slide:
there's interesting research out there
about
both what the situation is
with software contributors
and how we can either navigate that situation,
encourage more, or figure out
what the right path could be for a community.
The reference at the bottom here
is to an analysis of open source contributors.
In this analysis, they found that
it's not uncommon to find
casual contributors,
like people who are not the main maintainer,
and you have a situation where there's
a long tail of small contributions,
so, like, half the contributors
are responsible for 2% of the commits.
But they are lots of different kinds of commits.
These 2% of commits
from lots of different kinds of people
are lots of kinds of things.
You know, there are things like typos,
but there are also things like fixing bugs
and building new features and refactoring.
What this is, is evidence that
these are people who could be scaled up
to being more significant
contributors or maintainers,
if that is appropriate for your project.
If you go to the next slide.
We often have this model
of software contributions,
where we have to think of it as
like an onion model
where you've got the users,
and the contributors are inside of there,
and the committers are inside of there.
And that's often how we have
this mental model
of, like, how the software works,
you know.
But, we might want to consider whether
that is the best model
and instead move to what's on the next slide,
which is a hub and spoke model
where the code is central, right?
Like, that's the thing that we're all using.
So that is in the middle.
And there are maintainers
who work mostly on code,
but there are also other kinds
of maintenance activities happening.
So there are maintainers of the software
who focus mainly on education and docs.
There are maintainers
who focus mainly on issue triage.
There are maintainers
who focus mainly on evangelism.
And users kind of swim around,
like in a soup,
around this hub and spoke model,
and depending
on their particular need at any one time,
they engage with these different maintainers.
Like, maybe the user-support one,
or maybe the people writing the code,
or maybe the people writing the docs
and so like this might be
a helpful mental model
for thinking about package maintenance,
especially for larger packages
that have more users.
So if you can go to the next slide.
So it turns out we do actually have research
and know what can
encourage more contributions.
So the next slide outlines some things
from one study that was done, things
that, you know,
if you're involved with rOpenSci,
you've heard about
and seen in action.
So: include and enforce a Code of Conduct.
Have cultural norms that
include kindness and respect.
Something that was here that is interesting,
that I don't see a lot in R packages
but could be interesting to consider,
is to make more public or explicit
any future plans you have.
That can help contributors know what to do.
And then, if you go to the next slide...
This paper also has some very interesting ideas
of how to help newcomers become
contributors and maybe eventually maintainers.
These are all here.
I'll highlight a couple.
I'll just highlight that second one:
have forms of participation
that are legitimate in your projects,
that are valued,
that are not writing code,
that are on-ramps.
And then those last two I think are very interesting:
to explicitly acknowledge all contributions,
to have a culture around your project
that doesn't let contributions just
kind of get, you know, swept away,
and also to follow up on both success and failure.
If someone opens an issue or submits
a PR that is not a good fit,
follow up on that too:
on both the things
that succeed and the things that fail.
So, what we just went through in those slides
is just a summary and some thoughts
on the current situation,
a little bit of research on what we know.
You can go to the next slide.
The rest of the time that we're going to have
here is going to be a panel discussion.
If my phone tethering holds up,
I am going to moderate this panel of folks
who are going to talk about some of
our experiences
and some of our opinions on maintaining
R packages.
And if you go to the next slide, that will
just have some of the references,
just to say thank you
for where some of those images came from.
And a thank you to Scott for some
of the research that he shared.
So thank you for all that.
And I think with that,
we can get started with our discussion.
Alright. So if you're --
So, panelists: Leo and...
So our panelists are:
Leo, and Elin, and Scott, and Erin.
So, you all have been introduced,
but can you unmute?
And then, to get started,
I think the first question
I would love to have us discuss is:
What does it mean to maintain an R package?
This is what we're talking about.
So, I would love to get
your perspective on that.
So let's go around and so, briefly,
let's first have all four of you all say like
What do you, like -- what does it mean to
maintain R packages? So Elin, can you start?
[Elin] So I think it means a couple of different things.
There is this very specific thing,
that term you used before, 'committer',
which is people who can commit
to the master branch.
And on CRAN, it's the person whose name
and email address are going to be there,
who makes the submission,
and who is going to be the prime contact.
So that's one definition
which is kind of the traditional
open-source way of thinking about it.
But then there is kind of,
I think what you're getting to,
a bigger possible group of people
who are invested in making sure
that the package is maintained.
Meaning keeping --
just like maintenance on anything,
keeping it up to date, dealing with bugs,
what happens when it stops working
because some other package updated
or because R, base R, changed something.
So someone who participates in that.
And then also, potentially,
in all the other areas you were mentioning.
[Julia] Yeah yeah nice! Scott?
What do you think it means
to maintain an R package?
[Scott] Um, there's a lot of details, I guess.
But I think that a very...
Can you hear me good?
[Julia and Stefanie] Yeah.
[Scott] At a very high level, I guess,
the thing that came to mind first for me
was just that it's like
a constant learning process.
A constant, sort of like,
trying to figure out
how to do any particular thing better,
whether it's testing or function compos--
like how your function is composed,
the parameters or whatever.
And I think
the second point that I came up with
was sort of constantly learning
how to design better function interfaces:
you know, how the functions are named,
and the parameters are named,
and what their default values are,
stuff like that.
So I think this is a constant learning process
of how to design easy-to-use interfaces.
[Julia] Yeah yeah yeah yeah,
all of what you both just said
really resonates with my own experience
with, like, the different packages I've maintained.
Erin, when you think of like,
maintaining an R package,
what do you think that actually means?
[Erin] Yeah, first of all,
I agree with everything the other panelists said
Something that hadn't been mentioned, I think, is
the sort of ownership around community
and communication of the package.
So, being the person who responds to issues
or is looking at pull requests,
and really, like, dealing with the communication
out to contributors or to users
on either changes or what's happening
with the package.
[Julia] Yeah. Nice!
Yeah, that's absolutely, that's really great
And then Leo, what do you -- what about --
what's your response to this?
What do you think it means
to maintain an R package?
[Leo] So I'm going to echo
what some of the other panelists said
but for me it's like you deal
with the questions you get from users.
You approve or disapprove changes
that you receive from others
and you end up learning
about community guidelines,
like, in my case, the Bioconductor guidelines.
And then you also have to --
you end up learning about, like, R-devel
and what changes are coming,
how to anticipate those changes,
such that you can fix them
before the user sees them.
[Julia] Yeah, that's great.
So one thing I heard a lot of you mention
was like deal -- understanding users,
hearing from users...
and, the issue of like user feedback
I think is a really interesting one
when it comes to maintaining R packages.
So some R packa --
So some, you know, pieces of software
are in the situation where you're like --
[inaudible]
[Erin] I think we lost you.
[Stefanie] Julia? We just lost your sound.
Folks, we have a backup plan.
And for those of you in here:
Julia lost internet
due to an earthquake today.
[Julia] But, you know, on the other --
[Stefanie] Oh! Julia! Hold on.
We lost you for like 40 seconds.
[Julia] Oh, okay.
[Stefanie] Could you please restart by asking
the question that you were just about to ask?
[Julia] Sure. Sure...
So user feedback is an issue
that R packages need to deal with
and many packages need more contributors,
not fewer
and so we often want to encourage
user feedback.
At the same time,
some packages are in the situation
where they have a fire hose
of user feedback
And that fire hose
can sometimes feel like --
You need to manage that.
How do you manage that?
What kind of situations have you been in?
What strategies do you use
to deal with user feedback?
Elin, can you start with this first
because I think I've heard
you have some interesting perspectives on this,
especially as someone who --
with skimr as a very popular package.
[Elin] Sure. So skimr is pretty popular
and we do get
a lot of different kinds of user feedback.
We get people who want to know
how to do things in skimr,
they have questions about it.
We have people who want to make,
you know, suggestions for future development.
And then we get people with issue reports
and --
I will say... it --
when I said having users changes everything,
it really does
because you do have kind of a relationship
with them
and they're using --
you've kind of --
It's complex, right?
Because you've kind of given them something
And you want them to be grateful
that you gave them this thing,
but you also, you know, in terms of --
if you're enjoying your package
and you're developing it,
you want to find out what's wrong.
So it's kind of like you feel good
when people are asking you
and tweeting to you and stuff like that.
But it can also get a little bit overwhelming.
I will say --
And skimr is kind of a strange case
because it was first developed at the unconf.
And so people were tweeting about it like
before it was finished,
before even like the first prototype
was finished.
And so we had a lot of feedback right away
about ideas of things to do
and people started using it.
Um, and so I'll just tell you
how I kind of think about dividing it up
like one thing I did was:
within two weeks we had questions
on StackOverflow about skimr.
And so in the end,
once it got to like five questions,
I just created a tag.
And so I have a tag that I follow.
And I find that's helpful.
We don't get that many questions anymore
over there, but --
And then we have our issue tracker.
It's the main place where people show up
and it's really helpful in a way
because we have some kind of heavy users
who come in and say:
"hey, if you're on the development version of tibble,
it doesn't work anymore because this happened."
And so that's helping us
keep a little bit ahead of the game
because you don't want to find out
that it breaks with the development version of tibble,
the day that there's a release
and they can be really helpful with that.
On the other hand, the whole issue of --
You know, if you have a package
that you've been keeping for multiple years now,
you have things like your code style
that you want to enforce.
Like, we use spaces some places,
and we want to use the assignment operator
and not the equal sign, and things like that...
And so sometimes it's hard when users
want to send a pull request.
And then you want them --
you want to encourage them to contribute,
but you don't want them --
you know, it feels kind of like
you're being so OCD, like, saying:
"Hey, would you mind adding a space here?"
And so I find it challenging to find the balance
with that in terms of saying
I'll just fix it for you, versus asking them to fix.
[Julia] Yeah, yeah, some of the
things you said in there, in terms of, like,
you know, following a tag
on StackOverflow, or,
you know, deciding: should I edit a PR
afterwards versus interacting with somebody?
Those are things that I have also
kind of had to figure out,
like, what am I going to do.
And so that's interesting.
Um, you addressed some of the issues
around also managing feature requests as well,
which was another interesting question.
So Leo. I think you um --
I wanted to ask you about that issue
of hearing from users
as someone who works
on more specialized packages.
[Leo] Yes, so the packages
I work with on Bioconductor,
they don't have as many users.
[inaudible]
You need some very expensive data
sometimes
in order to use them.
And so the issue I deal with is that
on one side we have open source tools
and we want to provide them,
you know, for free and build
a community around them.
But on the other side, sometimes we have people
that have this expensive private data.
It's unpublished, and they're scared about sharing it,
even when they have questions.
So you end up getting a lot of emails.
And I try to convince them saying
that this doesn't really benefit anyone
Because I mean, I learn from the experience.
They learn from the experience,
but no one else really does.
So I try to convince them to put
their questions on the Bioconductor support website
and, through it, share small reproducible examples.
Sometimes I can write a blog post
about the question,
but that's a lot more work for me.
[Julia] Yeah yeah
Yeah, yeah, no, the same.
I bet this happens to a lot of folks
who maintain packages.
You get the email, and
that's actually what I do too.
And sometimes I will, like, help the person
write the reprex
and then be like, "Now post it," because then it's like,
well now at least this person knows
how to post a reprex
and can do it next time,
because helping someone over email is not --
doesn't multiply in the way
that like public stuff does. Exactly.
We've touched a little bit --
[Leo] Sorry, just on that, like --
what I try to reward them with
is answering, as fast as I can,
[Julia] Yeah.
[Leo] the questions they ask.
[Julia] Yes.
Yes, being really responsive on those channels.
We have already touched a little bit on,
like managing issues and feature requests,
but I wanted to get maybe
one other person's perspective on that,
like about workflows, or whatever.
Scott, you have a ton of packages,
and I wonder if you have any perspective
on user, on issues, feature requests
and any thoughts on like workflows
around that.
[Scott] Um, yeah, I guess I have
a lot of packages
but none of them are very popular.
So I think it's -- I don't really have
that sort of tidyverse problem.
But, you know, things I try and do:
I think Leo said this, you know, responding.
I try and respond to all issues quickly,
even if I just say:
'Hey, I got it.
And I'm going to have a look at it.'
I think it's important to sort of give people
that feedback so they don't walk away
from your package.
And I think that's likely going to happen
if they don't have a response.
[Julia] Yeah. Absolutely.
[Scott] Um, and then feature requests...
I think it's always good advice
to think about scope creep
And if you're, you know --
if something's out of scope,
then make sure to say that
and just, yeah. And instead of --
try not to get your package
to be too disjointed for users.
[Julia] Yeah. Nice. Nice.
Alright, so one of the goals that -
like one of the motivating goals
for this discussion is that,
hey, most packages only have one maintainer,
and it'd be better
if there were
broader groups of people
who can maintain.
So what is a path
for new contributors to R packages?
So, for example, what would be a first step?
What should someone do
if they want to help maintain one of your packages?
So let's um... so Erin,
can you speak to that?
You've got, like,
a public package up on GitHub.
You maintain packages internally.
What should someone do
if they want to help with one of your packages?
[Erin] Yeah, I'll take this
from the internal side.
Because I think that's a perspective that
I come from a lot more often.
Because I have, like, four packages
in that case and one package external.
But in terms of how I look for
new contributors and maintainers,
a lot of my communication and issues
and feature requests
for an internal package,
or like a work-specific package,
come via Slack, even though
the package is hosted on GitHub or GitLab.
The questions and comments and issues
come in via like a different tool.
So if someone is constantly asking questions
or constantly asking for features,
it's pretty easy to be like, all right,
we'll onboard you to this package
and then voila,
you may update it yourself!
Exactly. So I think like, for me,
if you're interested in the package,
if you have questions on the package
and if you've like shown an ability
to contribute to any package at all before
I think one like first initial step is
to have your own package
or have something
that you've contributed somewhere
just to show that you know
what an R package is in the first place.
But really motivation is the,
the important thing.
[Julia] Yeah. Nice. Nice.
What about you, Scott?
What, like what do you see
as like a path for someone to get on?
And what would be, like,
a first step for someone who is interested?
[Scott] Yeah, I guess
my main point was
what Erin already said, essentially.
You know,
from my experience,
the most successful sort of new contributors,
people that end up
taking over packages or contributing a lot,
are people that use the package,
where it's a dependency of their package
or used in a project or whatever.
And so they sort of have this, at least short term,
you know, vested interest in the package.
Because, you know, you have
drive-by contributors that will fix a bug
or do this or that.
But it's often
when your package is a dependency
or sort of a major part of somebody's project
or something.
And so that's always
a good place to find contributors. Um, yeah.
[Julia] Yes. Well, speaking of dependencies,
speaking of dependencies...
That's a big part.
I mean, that's a big bit of like the decisions
around maintaining an R package
like deciding what do I want to take on
as a dependency.
Like do I want to, do I want to...
like what do I want to depend on
Do I want to like rewrite something internally
or take on dependency,
like everything from like something like,
you know, off to some little algorithm or whatever.
So, so I would love to hear something
about that.
So Leo, what are some of the thoughts
you have had
as you have made those decisions
in your packages?
[Leo] Yes.
So, in the Bioconductor realm,
there's the Bioconductor core team
and they have their own grants and funding
and they maintain the core packages,
the core infrastructure packages.
So I tried to depend on those
because I know
they're going to be professionally maintained.
Also they have access to your package.
So if you depend on them
and they make a change that breaks your package,
they can actually go and fix yours
without you actually doing anything.
And similarly, we try to rely on,
like, the tidyverse packages,
because I know that they're well funded
to keep working on the packages and fix them.
But I also like to depend on packages
from authors that I have interacted with in the past.
That's also sometimes
how I find out about these packages, from like--
[Julia] Yeah yeah
Yeah yeah yeah yeah that --
yeah, those that -- So yeah,
so things that you know
are stable projects,
things that you know
you have relationships with people,
that you'll be able to communicate with.
Yeah, that all, that all makes sense.
Another, um, some --
so there's dependencies,
and then
there are also
other things that can change,
right, and you have to manage that.
Then there's also APIs that can change.
So like Erin, your package that's on GitHub
is an API package, right?
[Erin] Yeah, exactly!
[Julia] And like, so you have to,
you have to like pay attention to
when the API itself changes.
[Erin] Yeah, precisely. So as an example:
Astronomy Picture of the Day, or APOD,
as it's more commonly called,
can post, like, a picture
or an image or a GIF.
And then all of that information is supposed
to get transferred back into the API.
But for a really long time,
there was an error.
Anytime there was the --
with the API, anytime there was a video
that was accessed.
So it was able to basically download
any information about any pictures
but any time there was a video,
there was a problem.
So I wrote in this like whole test.
Like: if video,
do not pull this day of information.
So that the user doesn't see the error,
they just don't get an image back
and then they fixed the --
[Julia] They fixed it!
[Erin] Making my whole
little workaround unnecessary.
So keeping track of, like,
what issues
are happening in the API,
to either find workarounds
or solve them,
or even do a pull request to the
API source to fix it for them,
I think is an important part
of maintaining
a package solely based on
an API structure.
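
The kind of guard Erin describes ("if video, do not pull this day of information") can be sketched roughly as follows. This is a minimal illustration in Python, not the package's actual R code. It assumes each day's record is a dictionary carrying the `media_type` and `url` fields that the APOD API's JSON response includes; the function name and the sample records are hypothetical.

```python
from typing import Optional


def image_url_or_none(record: dict) -> Optional[str]:
    """Return the picture URL for one day's record, or None for non-image days.

    Hypothetical helper: instead of raising an error on video days,
    it quietly returns nothing, as in the workaround described above.
    """
    if record.get("media_type") != "image":
        return None
    return record.get("url")


# Made-up records shaped like API responses (illustrative values only).
sample_days = [
    {"date": "2020-03-17", "media_type": "image",
     "url": "https://example.org/picture.jpg"},
    {"date": "2020-03-18", "media_type": "video",
     "url": "https://example.org/clip"},
]

image_urls = [image_url_or_none(day) for day in sample_days]
```

Once the upstream API handles video days correctly, a guard like this can simply be deleted, which is the situation Erin ended up in.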
[Julia] Yeah and you know that's, um --
There are a lot of parallels
with, like, if you're dependent
on another R package in general, you know,
like everything you just said about the API,
that happens with just other software
that you're dependent on,
either other R packages or non-R software,
you know, and that is for sure
part of this whole deal, like
choosing carefully what software
you are going to decide to use or not.
And, you know, people do --
you know Leonardo shared his perspective.
But people have different sets of priorities
they bring to that
and make different decisions,
depending on their own perspective
which I think is, you know, makes sense
and it's fine.
Um, one thing that happens in packages is that
packages don't keep the same maintainer forever.
I mean, you know, we were talking
about the fact that,
like the qualtRics package
had a different maintainer.
He wasn't using it anymore.
And then I started maintaining it.
And actually, like, since I moved --
I switched jobs from StackOverflow to RStudio
RStudio doesn't use qualtRics for surveys
and so I'm actually --
Like I kind of have a stopgap
thing in place for now,
but I'm gonna, I'm actually looking for someone else
to take over qualtRics in the long term.
Because it like --
it is better, if it's someone who uses it,
who is actually actively using it so that
So this is something that has to happen
in real -- in the real world, is that
packages, pieces of software
have to change maintainers.
And so I'm wondering what sets --
if you have experienced this,
what sets that up for success?
And this is something that probably
looks different in open source software
versus internal packages
versus, you know, and really big packages
versus small packages.
So maybe, Scott, can you talk about
what this has looked like for you?
You know, how you would manage that,
say in rOpenSci?
[Scott] Yeah.
So for most of the ones
I've been involved with
they've mostly been sort of
wholesale letting somebody else
manage the package without
sort of me being involved.
And so that's mostly what's happened
and I think that's worked okay
and I think like one of the things
that you have to be okay with is sort of giving up --
being okay with giving up control
of your baby.
[Julia] Yeah, it's not yours anymore so...
[Scott] Yeah, that can be hard, but you know,
you just have to sort of say,
you know, the new person is the maintainer
and if they want to change the functions
and whatever, like it's you know
it's their package, they're the maintainer.
I think it's worked pretty well,
but I think an important thing is being there.
Being, you have to sort of be available
at least for a little while
for people to get oriented
and that can take some time.
And I think an important thing
when looking for a new maintainer
is trying to find somebody
that knows the topic area.
[Julia] Yes, absolutely.
[Scott] That's like
if it's a genomics package,
then it should be somebody in genomics probably
because they're going to maybe use it
and maybe know the area
know the sort of ins and outs of that type of data.
So. Yeah.
[Julia] Yeah, all right.
Erin, can you reflect on that maybe
in the internal package domain.
Like what, like, what does it
take to pass things off well
because that, actually in my experience,
it happens a lot in internal
because people change jobs.
[Erin] Yeah, exactly.
I think one of the major differences
that I've seen with passing an internal package
is, like, the time of notice.
[Julia] Yeah.
[Erin] Someone switching jobs,
they may not tell the other people
that they're leaving until,
like, two weeks beforehand,
in which case they have a lot
of other things to offboard.
That might not be top priority.
So I think it comes to having
clear like guidelines
around what the package does,
the style of the code, where it's located,
where questions and answers happen
like, if that's in Slack
or if that's in GitHub, in a way
to sort of pass off everything
through written documentation
if, like, in-person or
over-Zoom communication
like can't happen due to
other time commitments at work.
But if like possible, then having
like a real like onboarding experience
of walking someone through
the ins and outs of a package,
I've found to be very useful.
But there's not always
a lot of time for it.
[Julia] Absolutely. Absolutely.
All right, one question I'd like to ask
is about the decision
to submit a package to some kind of like
centralized repository like CRAN or Bioconductor
or to do something like peer review,
like rOpenSci,
or just the Journal of Open Source Software
versus maybe to stay only on GitHub.
And Elin, I was wondering,
so you know you maybe in the context of
you've worked in a lot of
different kinds of software,
but then you had skimr,
you all started it at the unconf
and then you know,
so it was an rOpenSci package
and then you did decide
to submit it to CRAN
like what do you think, how do you,
what do you think are the right decisions
to consider when deciding
when making those decisions?
[Elin] So it's good
because it's a really good question.
We took a while to decide
to submit it to CRAN
like at first we were just working on
getting the functionality and thinking about it.
And we reverted, we --
you know, version numbers are really important.
And at the conference, at the unconf,
we kind of started, we said it's version one
but then afterwards, a few weeks later,
we went back and said it was like version 0.5 instead.
Because once you say it's version one,
you're really kind of making a promise to people
that it's going to work.
And if you -- you can always, if it's less than one,
kind of say it doesn't:
'Yeah, we're not promising anything.'
And you can put that in your README.
And definitely when you're going to CRAN,
all of a sudden, it really, you know,
they're going to do what they do.
Everybody complains,
but they're maintainers too, right?
And so they're going to do what they do
to make sure that everything works
and they're going to find a million little things
that you didn't really follow the rules on.
And then all of a sudden,
you have this world of users
and you've kind of made
this published manual on the web
that anybody can find and
it's just a different feeling
when you -- once you're in one of those repos,
I think, in one of those repositories.
With just in GitHub,
I actually sometimes don't even put a license.
I mean, I know they get mad
but I just don't put a license sometimes
because I'm like, I'm not even sure
I want people to have that much confidence
in this package.
That they should be using it.
And you know, I do have another one
from the following year's unconf,
which is called qcoder.
And we actually have
quite a few users of qcoder,
but not at the same volume,
because it's not, you know --
it could go on CRAN, you know, probably,
I could get it ready in a couple weeks.
But I just, I don't feel like ready
to have a lot of users there.
So I just think you're making
that big decision.
The other thing is, once you're on CRAN,
that's actually when --
and I'm sure with Bioconductor as well,
then all of a sudden you're going to have
other packages using you as a dependency,
and especially because they changed, you know,
nobody can use a GitHub package anymore
if you're, you know, in CRAN, and so it --
but it has, you know --
once you have those other people out there,
then depending on you,
that also creates a level of
kind of social obligation, social contract
where, you know, you could say:
'Okay, I'm just gonna
let my package get archived.'
But then all this other stuff breaks
and you know you feel bad about that.
Well, if you're me anyway.
So, you're kind of -- once you're in,
it's there.
There's just a snowballing of it.
And I feel like in, you know, your GitHub,
you can just say:
'Hey, I put it out there.
Feel free to fork it.'
Right, that's another thing,
no one mentioned, right?
I mean, again in open source,
there is kind of the social contract
that a fork is the last resort.
But if a maintainer totally ghosts the project,
then someone else can always fork the project
and make the fixes and you know,
I certainly have done that.
Not for public consumption
but just for me.
Yeah, where there's, like -- I use,
for teaching I use RStudio Server a lot.
And there's some packages that don't
work well on RStudio Server.
And so, you know, I have my little fixes.
You know, it's like: 'When you're ready for my bug
and interested in supporting it,
I'll send you my pull request again.'
But I'm not going to like get into an argument
with a maintainer about that.
So it's -- there's just a -- but I do,
I feel it is this big -- you are -- it's kind of like going public.
And now you're out there
and you have people depending on you
and you said you're ready so...
[Julia] Yeah, yeah.
No, those are really good thoughts on
those decisions to submit to those central repos.
Okay, so now it's time for our last question.
So for our last question.
I'm gonna -- I want everybody to say
what their response is,
maybe just kind of in like one sentence,
if at all possible, and just like one sentence.
So, for this last question, let's say, let's all say,
what does someone need to know
like, in terms of, like, need-to-know or skills
to start maintaining a package?
So, Leonardo, can you go first?
What does someone need to know
to start maintaining a package?
[Leo] Okay, so for me it's:
you have to be willing to communicate regularly.
So that means responding to emails
or Slack messages in a timely fashion.
You have to also learn how to ask questions
in such a way that others can help you fast
and ultimately you need to practice patience
and be patient with yourself,
be patient with others
and practice empathy with others
because they're helping you
with their time.
[Julia] I love it, I love it. Fantastic.
Erin, what do you think people need
to know to start maintaining a package?
[Erin] Leo stole my answer.
But I will reiterate it.
Which is, like, really
good communication skills.
Both to answer questions
and to write up really great documentation
that helps to mitigate
the types of questions and issues.
[Julia] That's awesome!
Elin, what do you think somebody needs
to know to start maintaining an R package?
[Elin] I think you need to know
that you are really willing to do it.
I think you need to know
you really like your package actually.
Like you don't put a package out
in the world
because you want other people
to maintain it, right? Or give you bug fixes.
It's because you want it to work.
[Julia] Nice. I love that. I love that.
Scott, what do you think someone needs to
know to start maintaining an R package?
[Scott] So if you're somebody
that only writes scripts
and what -- which I did, you know,
the first probably four years of using R.
Learn functions.
So you can't really make an R package
if you just have scripts.
So I would say if there's one thing to learn,
it is to learn how to write functions and use them.
[Julia] Nice. I love that too. Awesome. Awesome!
I love this whole discussion that we have had.
And it really aligns so strongly
with the experiences I've had
maintaining a couple different packages.
And when I think about --
So, I took on the qualtRics package,
which is an rOpenSci package
for accessing survey data from Qualtrics
through their API.
So I took it on from one maintainer
from before,
and now I'm thinking about now, like, what will,
like what happens if I, you know like now
I need to find the new maintainer,
as I pass it on, too.
And as I think about
all those things you all said.
Like what someone needs to know,
I agree entirely.
And I think about like
in that particular --
One thing I'm going to add,
as I think through this,
is that, like, really, in an ideal world,
like the person is someone --
someone who is like a user of that,
like someone who is kind of the audience.
Like you can't --
And it really aligned with what
Elin was saying about you care about that domain.
And if you're someone who is the audience for that,
then you're like:
'Yep, I'm ready to maintain this because
I'm actively using it and know how to fix it!'
And so that is another --
Like for example when I'm --
When we're going to be talking about, like,
who's going to take over qualtRics?
Like that's going to be --
that's a big part of it, right?
Like someone who is a person who uses Qualtrics
and understands how packages are put together,
and has these responsive communication skills.
So thank you so much panelists
for that wonderful discussion.
I think that Stefanie is going to wrap us up
with a few announcements.
[Stefanie] I am, thank you so much.
My heart is full today.
This was really such a wonderful discussion.
I love it, particularly because this is -- we thought:
'Oh, sure. Let's do a community call
as a panel discussion.'
But of course, that could just be
so disorganized and people chattering.
This was very well planned.
And I thank the panel so much
because we all met a week ago
to talk about this.
So this is not
what an impromptu panel discussion looks like.
A lot of work went into this
on the part of the panelists.
And so I thank all of you sincerely.
This could not have been more successful, I think.
We can even function
without Julia's house having internet.
So this is wild.
At the peak, we actually had
90 participants attending this call.
So congratulations to everybody for joining.
We shared a kind of cool thing today.
I wanted especially to thank --
I noticed Janani Ravi was taking
a bunch of notes in responses
as the panelists were talking.
So thank you very much for capturing that.
I also noticed quite a number of people
have been adding their questions
and answering a bit in questions Part B.
So that's really cool because I didn't notice
as the discussion was happening.
In this shared Google Doc,
I'm going to leave this open for editing,
at least for another 24 hours,
so, if you have to go off to other meetings,
I'll leave this open for editing for a while
so that you can come in,
add additional questions you have,
answer each other's questions.
Participants here can add their comments.
Ideally, if you're willing to put
your name beside that,
add your comments to some of the questions
that the panelists were asked,
because we really do have such a rich amount
of expertise here in the audience.
After about 24 hours,
I'll lock the document to view only.
It, along with the video of this call,
is going to be posted on the archive page.
So it'll be at ropensci.org/commcalls.
This will live there forever.
What else do I want to tell you?
Please, before you go, please,
add your name to the attendees list in the doc,
I don't share that much.
Just for us to know what countries
you came from
and what organizations...
That kind of thing.
We have a new discussion category
in our public forum.
So our public forum is discuss.ropensci.org,
and just in the last couple of days,
we created a package maintenance category.
I encourage anyone, especially people
who have said they're feeling a bit overwhelmed,
they're just getting involved
in maintaining a package.
Please ask your questions there.
Some of our, sort of like internal maintainers
will also get a flag when something's posted there.
So they may be able to come
and answer your questions.
You can answer each other's questions.
So right now it's empty.
It's just a category that exists,
and I encourage you to use it.
Do I have anything else
I need to tell you?
I think that's it.
You really, it's only 10 o'clock in the morning
for me here in Kamloops, British Columbia,
you set me off to start a wonderful day.
I thank you all for joining us
wherever you are in the world,
and I wish you both a physically
and mentally healthy and happy rest of the day.
Thanks so much, everyone.
Bye bye.