-
36C3 preroll music
-
purine:bitter: Thanks a lot to WikiPakaWG
for hosting this and for keeping us all
-
awake. So probably it's not wrong to say
Good Morning everyone. Okay, what I would
-
like to do: so, all of this has been
announced as a discussion so there's
-
probably no point in me talking to you for
something like 55 minutes straight. So I
-
would just like to give you a couple of
slides on what we could discuss and then
-
see where we want to go with this one,
okay? So to start off with: Who of you
-
considers him- or herself to be a
scientist? Okay, who has the pleasure to
-
work within the European scientific
system? Okay, and within the German one?
-
Okay, so negative control: Who knows what
the capital of North Dakota is? Okay, so
-
there is no rigor mortis in your arms.
Okay, so topic today is Free Software for
-
Open Science and as I have some
association with the Free Software
-
Foundation Europe, well we should probably
start with the definitions: So number one,
-
what do we consider to be Free Software in
this one: It's pretty much every software
-
that would be released under an either
FSF- or OSI-compliant license. So this is
-
what most people know also as Open Source
and the main point here is, as the FSF and OSI
-
definitions pretty much standardize the
same things and just have different
-
ways to say it, it should be made sure
that it guarantees the Four Freedoms to
-
the user, so to use, to study, to improve
and to share the piece of software and of
-
course this does require the existence and
openness of a source code and the ability
-
to actually create derivatives. Okay so
and I think for everyone who has been
-
working in science it's pretty clear that
those four core freedoms are very well
-
aligned with what we're trying to do in
science okay we're trying to build up on
-
the work of others and to get humanity
along and increase our overall knowledge.
-
So for that reason what we're doing there
is exactly that we're exercising those
-
four freedoms just not necessarily that
we're doing it in a digital or code-based
-
manner. Okay so that's the first thing.
Then what actually is Open Science? So
-
first of all, Open Science is a Class A
buzzword. Nevertheless, the European
-
Commission took the liberty to get a
committee in there, in that case the OSPP,
-
the Open Science Policy Platform, and
those people developed a lot of bits of
-
paper, whatever. And what they defined is
eight key areas, they are sometimes
-
called "ambitions", sometimes they're
called "priorities", which are the key
-
things that need to be addressed in the
midterm to move European science to what
-
they consider to be Open Science. And this
is not only, and that's very important,
-
about the classical things that you might
know like Open Access and Open Data. Open
-
Access and Open Data are basically
incorporated in here, so scholarly
-
communication, it says "Future of
Scholarly Communication", which can be
-
everything from Open Access to just going
digital. However, we should all be aware
-
that the European Commission now has endorsed
Plan S, which is a rather far-reaching
-
push towards a rather radical
program in terms of publishing
-
requirements, so we can consider that this
part for scholarly communication is really
-
meant to be Open Access. And then the
other things, so Open Data is what is
-
called here to be FAIR Data, because the
Commission typically tries to avoid the
-
term "Open", because "Open" of course
is not FAIR and FAIR unfortunately is not
-
"Open". But this is where we lead our
discussions. So this means that we only
-
have two of the classical Open Science
points that are in here. Everything else
-
are things like "Incentives", so this is
how can we generate better citation or how
-
can we make sure that the people who do
the work get the credit, so we might need
-
some reform in how we do citations. Then
"Indicators" is -- was that me or was that
-
okay -- so "Indicators" is kind of a way
to try to overcome the simple citation
-
indices and of course especially the
impact factor. "EOSC" for those of you
-
who have not heard that term, that's a very
large project, that's the European Open
-
Science Cloud. It's still rather ill-
defined what it should be, it's getting
-
better along the way but the term has been
out there for three years. In the end what
-
this is about is to really create a large
federated European infrastructure for
-
scientific data. The main funding for that
one will come from the National States and
-
so for example the German implementation
is called NFDI, National Research Data
-
Infrastructure, and will be heavily funded
by nearly 1 billion Euros over the next 10
-
years so this is the scale that we are
talking about. "Integrity" means how to
-
assure integrity, "Skills" is how to train
the next generation of scientists and CS
-
is the abbreviation for "Citizen Science".
So with all of this you see that Open
-
Science is not just trying to do tick
marks, what they're really trying to push
-
for is a rather fundamental change in the
way how we do our work to what's really
-
becoming a more egalitarian system and a
more open and participatory system. Okay,
-
so now the question is, what is the role
that free software can play in this. And
-
so one of the things that we need to
define here is: are we talking about Free
-
Software for Open Science, which is the
thing that this talk was announced for.
-
But of course we could also, if that's the
general interest, talk about Free
-
Software in Open Science or in science in
general. So distinction would be that the
-
"for Open Science" is mainly, here we're
talking about software as a research
-
product, so the main focus here is
software that is created by the scientists
-
themselves and here we then have of course
issues like how to sustain it, how to
-
ensure quality and how to choose proper
licensing models for it. While the "in
-
science" is more generally talking about
generic software tools so this is
-
operating system, office suites and so on
that are just used by scientists more
-
generally. In both cases the main way
Free Software can contribute
-
to the scientific endeavor is of course by
promoting reproducibility, because
-
everyone can use these tools, there is
no paywall in that case. So you
-
don't need to purchase a given Microsoft
Office version to recreate an Excel table
-
or something like this and of course also
the attempt to reduce black boxing. The
-
other thing that is more specific for Free
Software for Open Science is the general
-
thing that we already said: Okay, so some
of the ideas of Free Software align well
-
with what we're trying to do in science.
But more importantly the question right
-
now is: Does it fit the policies under
which we are operating? And so of course
-
the main policy that most people know is
FAIR. So FAIR stands for Findable,
-
Accessible, Interoperable and Reusable and
it's a kind of a paradigm that was
-
defined, so published in 2016, was in the
making for a couple of years before that
-
and this is something that was primarily
geared towards data. The nice thing about
-
FAIR is that the 2016 paper also
operationalizes this so they give criteria
-
on what you need to do or what you need to
ensure that for example a data set is
-
findable, what it means how it needs to be
accessible and so on and so forth. And of
-
course reuse also says something about,
well you need to put a license on it, but
-
otherwise it's not that specific. Okay,
now importantly for this one: stuff that
-
is FAIR does not necessarily align with
Free Software because Free Software means
-
that there are
basically no restrictions in use, while
-
the reusability for FAIR simply says:
People somehow need to be able to reuse
-
it, so there needs to be a clear pathway.
That can still be a proprietary license,
-
okay and that license might still not
allow you to do everything with it, there
-
just needs to be this ability. So that's
one of the main things where FAIR does not
-
fit the usual - the Free Software
definitions. On the other hand of course,
-
Free Software doesn't say anything about
-- Oh No! I killed the alpaca! --
-
Applause
Okay, I'm probably gonna be kicked off the
-
stage any minute, okay sorry. Alright, so
on the other hand, I can write beautiful
-
code and put it under an Open Source
license and put it on a USB stick and bury
-
it somewhere in my garden. Okay, so then
it's neither findable nor accessible and
-
this is of course also something where the
classical definitions for Free Software
-
don't necessarily match these two
criteria, which nevertheless also for
-
software do make sense. Finally one last
thing is that FAIR defines a product, so
-
it says: Okay, so the outcome of your
research needs to comply with different
-
criteria and that's of course a relatively
easy thing to test. What it does not do
-
and maybe from a software development
perspective this is something that is more
-
important, it doesn't define a process how
we do things. And this is one of the
-
things that also one of the German
committees so the RfII has recently
-
started to criticize for FAIR that we say
okay, FAIR data just says this one, but
-
you can have completely rubbish data and
it can still be FAIR. But what we want to
-
have is high quality FAIR data. So FAIR
clearly is some kind of minimal consensus
-
it's a condicio sine qua non, but we
probably need to extend it at this point
-
and of course with this one we can also
discuss on how we want to continue, how we
-
want to get this into or align this with
Free Software. Okay, so that's more or
-
less the brief introduction, now there are
a couple of things that we can discuss
-
further, depending on your interest. And
that would be basically what about the
-
current European policies, before we
review what about the current German
-
policies, what about generic Free Software
tools. But maybe that's the point where
-
you could say something to
get us going a bit.
-
Question: I think it's working -- You
mentioned that the current software
-
standards might not be in line with the
policies, what were you exactly referring
-
to?
Answer: Can you repeat this?
-
Q: You mentioned before that the current
software procedures or standards might not
-
be in line with the policies in the
European Union. What exactly did you mean
-
by that?
A: So the thing is that I can
-
comply with OSI regulations for Open
Source Software, but none of our funding
-
bodies says you need to be OSI compliant.
What they say typically is you should do
-
stuff that is FAIR but right now one of
the issues, this is what basically this
-
slide then says, is the question whether
any of the policy makers really define
-
code as a primary research object. And
that's right now not the case so therefore
-
everyone assumes that code behaves like
data and to equate code with data is
-
something where some people get cold
shivers, others don't because it is an
-
operation that you can do, it's a lossy
operation, but it might help
-
us in some ways. And the main point here
is that code has some idiosyncrasies that
-
make it distinct from data and this is
where our policies break. On the other
-
hand, some of the policies that we came up
-- not for research but in general, so
-
from the Free Software
perspective -- that we made up there,
-
didn't make it into the policy documents
and so therefore are not incorporated
-
there. Okay, so FAIR criteria and the
other ones don't completely overlap. So
-
most people might write code but it still
won't align with a FAIR criterion if you
-
would take it one to one.
Q: So a question about the topic item to
-
start the licensing. So say we
have a commercial company like
-
Microsoft that develops an office package,
and when you say Free Software for Open
-
Science, it would be better to invest
the money not into license costs, which are
-
recurring, but better for a
bigger entity like a country to invest more
-
in open code or open programs.
Is this kind of tackled by what you
-
mean with FAIR or Open Source?
A: This is one of the things that is
-
not necessarily tackled, but you
could construct it in a way that it
-
actually overlaps with FAIR. Because
you're talking about reproducibility, oh
-
well so okay, FAIR doesn't say
reproducibility but it says accessibility
-
and if you're using formats that are
proprietary you could say okay well this
-
is not accessible to everyone because you
need to pay for it. Now the thing is that
-
there are a lot of things where you have
to pay for so this was one of the things
-
that was never on the agenda to try to be
eradicated. This is, so the generic
-
software part is just something
that came into this whole process later,
-
initially it was really geared towards
the question: How can scientists make sure that
-
the software produced by
scientists is both Free Software and
-
contributes to Open Science, and what do we
need to do to create potentially
-
additional funding opportunities for it,
because this is where it typically breaks, to
-
say well I can write better code if I have
more man or woman power, if I have people
-
who curate, if I have people who do
issue fixing and so on and so forth. Which
-
right now is not considered part of the
research process by the
-
policy makers, but in reality it already
has become that. Now if you're saying you
-
are using generic software or generic
office suites for that one, then yes, we
-
are investing a lot in these things in
the tertiary education and in the research
-
sector and, personal opinion, yes we
should spend this on things that don't
-
nudge people towards proprietary
solutions. But
-
that's something that has
a stronger education component, also for
-
student education, so I wanted to bring it
up here because I thought okay maybe it's
-
something that more people here are
interested in. But I agree that it doesn't
-
overlap completely, doesn't strongly
overlap with the Open Science
-
part.
Q: Right, okay. I've heard some people
-
work on the FAIR principles specific for
software. Have you heard about it, and do you
-
know what the differences are?
A: Yes, so thanks for this input. So let
-
me check. Okay I've missed that one. So
yeah, there's a recent paper that just
-
came out a couple of weeks ago by Anna-
Lena Lamprecht, she's from the Netherlands
-
eScience Center. So what they try to do
is, they use the catalog of the
-
original FAIR criteria and check for each
of those ones does it apply to software,
-
yes or no? And then change them, amend
them in a way to make sure that it then,
-
well, better fits into the process. So
they for example say well so there needs
-
to be some kind of documented quality
control, they're more talking of course
-
about software repositories, they then
include versioning, which is one of the
-
huge things that sets code apart from
data, which is once it's released
-
typically a rather static object. So
they're trying to get somewhere and I
-
think it's a good document to start
with but in my personal opinion, I think
-
it wasn't bold enough. You might have
been, I mean we had this discussion at the
-
RSE19 conference also, where Anna-Lena
also was there, and it tries to stick very
-
closely to FAIR, because they assume that
this is what people know. Which I think is
-
good. On the other hand there's a very
clear recommendation from most bodies that
-
FAIR should not be extended, so we don't
need, as they say, we don't need
-
"additional letters" for FAIR and they
really want to have those basically as one
-
concept to stick with data. So
therefore I think it would have been
-
necessary to take a bolder step, to try to
work in all the established development
-
policies that we already have than just to
stick as close as possible to FAIR and
-
then just change the nitty-gritty details,
which is what they did. But nevertheless I
-
think it's something that is clearly
worth reading.
-
Q: Thanks a lot for your talk, this
resonated a lot with me and as someone
-
working in research infrastructure I think
it's super important that we focus on
-
recognizing research infrastructure so all
kinds of services like sustainable data
-
storage for researchers, tools that help
make data discoverable and things like
-
that. That this should be considered a
public good right?
-
A: Yes
Q: And so next to what you mentioned and
-
rightly so with Microsoft, the other risk
that I currently see, is that legacy
-
publishers like Elsevier, like Springer-
Nature and so on, try to capture the whole
-
market, so this is all about trying to deliver on
all the needs that researchers have in the
-
digital area with huge platforms. And this
is like a battle that we almost have lost
-
already, as it seems. So there are many
interesting very good free and open source
-
alternatives to what they deliver but it's
really not recognized very well why this
-
is so important. This is my impression.
A: Yeah, I mean, I would second
-
that. So, I think it's
interesting to see the large publishing
-
companies now really moving away from
their traditional business because
-
apparently they have recognized that they
might be on a losing path there. But
-
really to offer wholesale data
management solutions to institutes. I mean
-
there is, this is probably just an
anecdote, but so apparently Elsevier
-
offered to, I think, the Netherlands or the
Dutch government, and they said:
-
Okay, we do all of your data management or
basically you get everything for free, but
-
each and every institution has to deliver
but we become your central data deposition
-
platform. Which well, unfortunately it
might appeal to some politicians, I think
-
it doesn't appeal to anyone else in this
room given that probably Elsevier is a
-
company that is even more hated than
Microsoft for reasons completely unknown I
-
mean they just make a revenue of thirty-
five percent every year so maybe we should
-
just buy stock options.
Q: Oh, thank you for your talk. What I don't
-
completely understand is why we use the
FAIR concept as a point of reference
-
at all. Because I feel like the
concept of Open Access in science is far
-
more applicable to code. So in the end
code is text and it's part of the
-
scientific publication system, so we have
references from and to code and such
-
things. And
the concept of Open Access has the
-
same ancestors as the scientific
publication system with the Mertonian
-
norms of science and such, so why don't we
treat code like scientific publications?
-
A: Ok, honestly, I'm relatively open to
this idea, because this is the
-
reason why we're having this discussion.
What I'm presenting to you now
-
is mainly developed out of the existing EU
policies and the EU talks about FAIR a
-
lot. Because for them it's an
operationalized thing, it's something that
-
they would like to test in the end,
it's something that they would like to
-
score and so on and so forth, so that paper
pushers have something to work with. But I
-
agree that we can simply say well in the
end the openness is more important and
-
FAIR, as we already said, isn't open, so
therefore Open Access would maybe be the
-
better point to hook this up, so yeah, I
agree on that.
-
postroll music
-
Subtitles created by c3subtitles.de
in the year 2020. Join, and help us!