34C3 preroll music
Herald: Please give a warm welcome here.
It’s Franziska, Teresa, and Judith.
Judith, you have the stage, thank you.
Judith Hartstein: Thank you, thanks!
applause
inaudible
Judith: We believe that
scientific performance indicators
are widely applied to inform
funding decisions and to
determine the availability of career
opportunities. So, those of you who are
working in science or have had a look into
the science system might agree with that.
And we want to understand evaluative
bibliometrics as algorithmic science
evaluation instruments, to highlight some
things that also occur with other
algorithmic instruments of evaluation. And
so we’re going to start with a quote from
a publication in 2015 which reads “As the
tyranny of bibliometrics tightens its
grip, it is having a disastrous effect on
the model of science presented to young
researchers.” We have heard Hanno’s talk
already, and he was basically also
talking about problems in the science
system and how reputation is shaped by
indicators. And the question is: is
bibliometrics the bad guy here? If you
speak of ‘tyranny of bibliometrics’, who
is the actor doing this? Or are perhaps
the bibliometricians the problem? We want
to situate our talk within the growing
movement of Reflexive Metrics, among
those who are doing science studies,
social studies of science, scientometrics
and bibliometrics. The basic idea is to say:
“Okay, we have to accept accountability if
we do bibliometrics and scientometrics.”
We have to understand the effects of
algorithmic evaluation on science, and we
will try not to be the bad guy. And the
main mediator of science evaluation, as
perceived by researchers, is the
algorithm. I will hand over the
microphone to… or I will not hand over the
microphone but I will hand over the talk
to Teresa. She’s going to talk about
"Datafication of Scientific Evaluation".
Teresa Isigkeit: Okay. I hope you can
hear me. No? Yes? Okay.
Judith: mumbling
When we think about the science system
what do we expect?
What can society expect
from a scientific system?
In general, we would say
reliable and truthful knowledge,
that is scrutinized by
the scientific community.
So where can we find this knowledge?
Normally in publications.
So with these publications,
can we actually say
whether science is good or bad? Or is
some science better than other science?
In the era of
digital publication databases,
there are big datasets of publications.
And these are used to
evaluate and calculate
the quality of scientific output.
So in general, with this metadata
we can tell
who the author of a publication is,
where the author’s
home institution is,
or which types of citations appear in
the bibliographic information.
This is used in the calculation
of bibliometric indicators.
For example, if you take the
journal impact factor,
which is a citation-based indicator,
you can compare different journals,
and maybe say which journals
are performing better than others,
or whether a journal’s impact factor has
increased or decreased over the years.
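To make this concrete, here is a minimal sketch of the common two-year variant of the journal impact factor: citations received in a given year to items the journal published in the two preceding years, divided by the number of citable items from those years. The record format here is invented for illustration; it is not Clarivate's actual schema.

```python
def journal_impact_factor(citations, publications, journal, year):
    """Two-year journal impact factor for `journal` in `year`.

    citations: iterable of dicts like
        {"cited_journal": "J", "cited_year": 2015, "citing_year": 2016}
    publications: iterable of dicts like
        {"journal": "J", "year": 2015, "citable": True}
    """
    window = {year - 1, year - 2}
    # Citations received in `year` to items published in the two prior years.
    cites = sum(
        1 for c in citations
        if c["cited_journal"] == journal
        and c["citing_year"] == year
        and c["cited_year"] in window
    )
    # Citable items the journal published in those two years.
    citable_items = sum(
        1 for p in publications
        if p["journal"] == journal and p["year"] in window and p["citable"]
    )
    return cites / citable_items if citable_items else 0.0
```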
Another example would be the
Hirsch index (h-index) for individual
scientists, which is also widely used when
scientists apply for jobs. They put
these numbers in their CVs, and supposedly
this tells you something about the quality
of research those scientists are
conducting. With the availability of the
data we can see an increase in its usage.
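The Hirsch index just mentioned is itself a tiny algorithm: a researcher has index h if h of their papers have at least h citations each. A minimal sketch, assuming the per-paper citation counts are already available:

```python
def h_index(citation_counts):
    """Largest h such that h papers have at least h citations each."""
    counts = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(counts, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

# Five papers with these citation counts give an h-index of 3.
assert h_index([10, 8, 5, 2, 1]) == 3
```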
And in a scientific environment in which
data-driven science is established,
decisions in science regarding
hiring or funding heavily rely on these
indicators. There is perhaps a naive belief
that these indicators, being data-driven
and relying on data collected in a
database, are a more objective metric that
we can use. So here’s a quote by Rieder
and Simon: “In this brave new world trust
no longer resides in the integrity of
individual truth-tellers or the veracity
of prestigious institutions, but is placed
in highly formalized procedures enacted
through disciplined self-restraint.
Numbers cease to be supplements.” So we
see a shift from an evaluation system that
relies on expert knowledge to a system
of algorithmic science evaluation. With this
shift comes a belief in a
depersonalization of the system and a
perception of algorithms as the rule of
law. So when looking at the interaction
between the algorithm and scientists we
can tell that this relationship is not as
easy as it seems. Algorithms are not in
fact objective. They carry social meaning
and human agency. They are used to
construct a reality, and algorithms don’t
come naturally. They don’t grow on trees,
ready to be picked by scientists and people
who evaluate the scientific system, so we
have to be reflective and think about
which social meanings the algorithm holds.
So the code that the algorithm
uses carries subjective meaning,
and there is agency in this
code, and you can’t just say: oh, this is
a perfect construction of the reality of
the scientific system. So the belief that
such a number tells you something definitive
about the quality of research is misplaced.
Think about the example of citation counts:
the algorithm reads the bibliographic
information of a publication from the
database. Scientists cite papers that relate
to their studies. But we don’t actually know
which of these citations are more meaningful
than others, so they are not as easily
comparable as the numbers suggest. Yet the
algorithm gives you the belief that they are.
Relevance is not so easily put into an
algorithm, and there are different
types of citations.
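To make that point concrete: a raw citation count simply discards whatever made each citation meaningful. A toy sketch; the "context" labels are invented for illustration, since citation databases do not record why a paper was cited:

```python
from collections import Counter

# Hypothetical references: (citing paper, cited paper, why it was cited).
references = [
    ("paperA", "paperX", "builds on its method"),
    ("paperB", "paperX", "perfunctory background mention"),
    ("paperC", "paperX", "criticizes its result"),
    ("paperC", "paperY", "builds on its method"),
]

# The indicator only ever sees the middle column; the context is lost.
citation_counts = Counter(cited for _citing, cited, _context in references)
print(citation_counts)  # Counter({'paperX': 3, 'paperY': 1})
```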
Scientists also perceive the use of these
algorithms as a powerful instrument. The
algorithm has some sway over
scientists, because they rely so much on
those indicators to further their careers,
to get a promotion, or to get funding for
their next research project. So we have a
reciprocal relationship between the
algorithm and the scientists, and this
creates a new construction of reality. So
we can conclude that governance by
algorithms leads to behavioral adaptation
in scientists, and an example of this,
involving the Science Citation Index, will
now be given by Franziska.
Franziska Sörgel: Thanks for the
handover! Yes, let me start.
I’m focusing on reputation
and authorship as you can see
on the slide, and first let me
start with a quote
by Eugene Garfield, which says: “Is it
reasonable to assume that if I cite a
paper that I would probably be interested
in those papers which subsequently cite it
as well as my own paper. Indeed, I have
observed on several occasions that people
preferred to cite the articles I had cited
rather than cite me! It would seem to me
that this is the basis for the building up
of the ‘logical network’ for the Citation
Index service.” So, actually, the Science
Citation Index described here was
mainly developed in order to solve
problems of information retrieval. Eugene
Garfield, the founder of the Science
Citation Index – short: SCI – began to
notice a huge interest in
reciprocal publication behavior. He
recognized this increasing interest as a
strategic instrument to exploit
intellectual property. And indeed, the
interest in the SCI – and its data –
successively became more relevant within
the disciplines, and its usage expanded.
Later, [Derek J.] de Solla Price, another
social scientist, called for
better research on the topic, as it
also pointed to a crisis in science,
and stated: “If a paper was cited once,
it would get cited again and
again, so the main problem was that the
rich would get richer”, which is also
known as the “Matthew Effect”.
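This "rich get richer" dynamic can be illustrated with a toy cumulative-advantage simulation (a sketch of the mechanism, not de Solla Price's actual model): each new citation goes to a paper with probability proportional to the citations it already has, plus one so that uncited papers can still be picked.

```python
import random

def cumulative_advantage(n_papers=100, n_citations=2000, seed=1):
    """Toy simulation: each citation picks a paper with probability
    proportional to (citations it already has + 1)."""
    random.seed(seed)
    counts = [0] * n_papers
    for _ in range(n_citations):
        weights = [c + 1 for c in counts]
        winner = random.choices(range(n_papers), weights=weights)[0]
        counts[winner] += 1
    return sorted(counts, reverse=True)

counts = cumulative_advantage()
print(counts[:5], sum(counts[:5]) / sum(counts))
# A handful of papers end up with a large share of all citations.
```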
Finally, the SCI and its use turned into a
system which was and still is used as a
reciprocal citation system, and became a
central and global actor. Once a paper was
cited, the probability that it would be
cited again was higher, and it would even
extend its own influence on a certain topic
within the scientific field. It became known
which articles people would read and which
topics they would do research on. So this
phenomenon gave rise to an instrument for
disciplining science and created power
structures.
Let me show you one example which is
closely connected to this phenomenon
I just told you about – and I don’t know
if here in this room there are any
astronomers or physicists?
Yeah, there are a few, okay.
That’s great, actually.
So in the next slide, here,
we have a table with a time
window from 2010 to 2016, and social
scientists from Berlin found out that
co-authorship within the field of physics
grew by 58 per year in this
time window. So this is actually already
very high, but they also found another,
very extreme case. They found one paper
which had around 7,000 words and a
listed authorship of 5,000. So, on
average, the contribution of each
scientist or researcher mentioned on this
paper was little more than one word. Sounds
strange, yeah. And of course you have
to see this in a certain context, and
maybe we can talk about this later on,
because it has to do with the ATLAS particle
detector, which requires high maintenance
and so on. But still, the number of
authors per paper, and you can see this
regardless of which scientific field we are
talking about, has generally increased in
recent years. It remains a problem,
especially for reputation, obviously,
that there is such
high pressure on today’s researchers.
Still, of course, we have ethics and
research requires standards of
responsibility. For example, there are
other ones, but there’s one
here on the slide: the “Australian Code
for the Responsible Conduct of Research”
which says: “The right to authorship is
not tied to position or profession and
does not depend on whether the
contribution was paid for or voluntary.
It is not enough to have provided
materials or routine technical support,
or to have made the measurements
on which the publication is based.
Substantial intellectual involvement
is required.”
So yeah, this could be one rule
to work by, to follow.
And still we have this problem
of reputation which remains,
which is where I hand over to Judith again.
Judith: Thank you. So we’re going to speak
about strategic citation now. If you
frame this point about reputation like that,
you may say: the researcher finds
something in his or her
research and addresses the publication
describing it to the community. And the
scientific community
rewards the researcher with reputation.
Now the algorithm, which is
perceived to be a new thing, is mediating
the visibility of the researcher’s results
to the community, and is also mediating
the rewards – the career opportunities,
the funding decisions, etc. And what
happens now, and what is plausible to
happen, is that the researcher addresses
his or her research also to the algorithm:
by citing those who are evaluated
by the algorithm and whom he or she wants
to support, and also by strategic keywording
and so on. And one perspective might be
that this is the only new thing that
happens: the algorithm is addressed as a
recipient of scientific publications. And it
is rather far-fetched to discriminate between
so-called ‘invisible colleges’ and ‘citation
cartels’.
What do I mean by that? ‘Invisible
colleges’ is a term that says: “Okay, people
are citing each other. They may not work
together in a co-working space, but
they do research on the same topic.” Then
it is only plausible that they cite each
other. And if we look at citation networks
and find people citing each other, that
does not necessarily have to be something
bad. But we also have people who are
concerned that there might be
‘citation cartels’: researchers citing
each other not because their
research topics are closely connected, but
to support each other’s career
prospects. And people do try to
discriminate invisible colleges from
citation cartels ex post, by looking at
metadata networks of publications, and they
regard this as a problem. We have a
discourse on that in the bibliometrics
community.
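As an illustration of what such an ex post analysis might look like (a deliberately simplified heuristic, not the method used by Clarivate or by the authors quoted below): compute, for each pair of journals, the share of their outgoing citations they send to each other, and flag pairs where both shares are high. Note that the code cannot by itself tell a cartel from an invisible college; that judgment stays with people.

```python
from collections import defaultdict
from itertools import combinations

def flag_mutual_citation_pairs(citations, threshold=0.3):
    """citations: iterable of (citing_journal, cited_journal) pairs.
    Flags journal pairs that send each other an unusually large share
    of their outgoing citations."""
    outgoing = defaultdict(int)   # total outgoing citations per journal
    pair = defaultdict(int)       # citations from journal a to journal b
    for citing, cited in citations:
        if citing != cited:       # ignore journal self-citations
            outgoing[citing] += 1
            pair[(citing, cited)] += 1

    flagged = []
    for a, b in combinations(sorted(outgoing), 2):
        share_ab = pair[(a, b)] / outgoing[a]
        share_ba = pair[(b, a)] / outgoing[b]
        if min(share_ab, share_ba) >= threshold:
            flagged.append((a, b, round(share_ab, 2), round(share_ba, 2)))
    return flagged
```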
I will show you some short quotes about how
people talk about those citation cartels. E.g.
Davis in 2012 said: “George Franck warned
us on the possibility of citation cartels
– groups of editors and journals working
together for mutual benefit.” So we have
heard about the journal impact factors;
it’s believed that editors talk
to each other: “Hey, you cite my journal,
I cite your journal, and we both
will boost our impact factors.”
So we have people trying
to detect those cartels,
and Mongeon et al. wrote that:
“We have little knowledge
about the phenomenon itself and
about where to draw the line between
acceptable and unacceptable behavior.” So
we are having moral discussions
about research ethics. And we also find
discussions about the fairness of the
impact factor. So Yang et al. wrote:
“Disingenuously manipulating impact factor
is the significant way to harm the
fairness of the impact factor.” And that’s
a very interesting thing, I think, because
why should an indicator be fair?
To believe that we have a fair measurement
of scientific quality, relevance, and rigor
in one single number, like the
journal impact factor, is not a small
thing to say. And we also have a call for
detection and punishment. So Davis also
wrote: “If disciplinary norms and decorum
cannot keep this kind of behavior at bay,
the threat of being delisted from the JCR
may be necessary.” So we find the moral
concerns about right and wrong. We find the
evocation of the fairness of indicators
and we find the call for detection and
punishment. When I first heard about this
phenomenon of citation cartels which is
believed to exist, it sounded
familiar to me. Because we have a similar
information-retrieval discourse, a
discourse about ranking and power, in a
different area of society: in search
engine optimization. So I found a quote by
Page et al., who developed the PageRank
algorithm – Google’s ranking algorithm –
in 1999, which has changed a lot since
then. But they also wrote in their paper
about the social implications of information
retrieval with PageRank as an indicator.
They wrote: “These types of
personalized PageRanks are virtually
immune to manipulation by commercial
interests. ... For example fast updating
of documents is a very desirable feature,
but it is abused by people who want to
manipulate the results of the search
engine.” And that was important for me to
read because here, too, we have a narrative
of abuse, of manipulation, the perception
that the indicator is fair and that people
try to cheat it.
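For comparison, here is a minimal power-iteration sketch of the basic PageRank idea from the original papers, with a damping factor; Google's production ranking has long since moved far beyond this.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links: dict mapping every page to the list of pages it links to
    (all link targets must appear as keys)."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            targets = outlinks or pages   # dangling page: spread rank evenly
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank

# Toy web: C is linked to by both A and B and ends up ranked highest.
print(pagerank({"A": ["B", "C"], "B": ["C"], "C": ["A"]}))
```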
And then, in the early 2000s,
I recall having a private website
with a public guest book and
getting link spam from people
who wanted to boost their
Google PageRanks,
and shortly afterwards Google
decided to punish link spam in their
ranking algorithm. And then I got lots of
emails from people saying: “Please delete my
post from your guestbook because Google’s
going to punish me for that.” We may say
that this search engine optimization
discussion is now somehow settled and it’s
accepted that Google's ranking is useful.
They have a secret algorithm, but it works
and that is why it’s widely used. Although
the journal impact factor seems to be
transparent, it’s basically the same thing:
it’s accepted as useful and thus
it’s widely used. This goes for the journal impact
factor, the SCI and the like. We have
another analogy: Google decides
which SEO behavior is regarded as acceptable
and punishes those who act against the
rules, and thus holds an enormous amount of
power, which has lots of implications and
led, for example, to the spread of content
management systems with search engine
optimization plugins, etc. We also have
this power concentration in the hands of
Clarivate (formerly Thomson Reuters), who
host the database for the journal impact
factor. They decide who is going to
be indexed in the Journal Citation
Reports and how the algorithm is, in
detail, implemented in their databases. So
we have this power concentration there
too, and I think if we think about this
analogy we might come to interesting
thoughts but our time is running out so we
are going to give a take-home message.
Tl;dr: we find that the scientific
community reacts with codes of conduct to
a problem which is believed to exist,
namely strategic citation. We have database
providers which react with sanctions, so
journals are delisted
from the Journal Citation Reports to
punish them for citation stacking. And we
have researchers and publishers who adapt
their publication strategies in reaction
to this perceived algorithmic power. But
if we want to understand this as a problem,
we should not only react to the
algorithm, but we have to address the power
structures. Who holds these instruments
in their hands? If we talk about
bibliometrics as an instrument, we
should not only blame the algorithm – so:
#dontblamethealgorithm.
Thank you very much!
applause
Herald: Thank you to Franziska, Teresa
and Judith, or in the reverse order.
Thank you for shining a light on
how science is actually seen
in its publications.
As I said at the start,
it’s more about
scratching each other’s backs a little bit.
I have some questions here
from the audience.
This is Microphone 2, please!
Mic2: Yes, thank you for this interesting
talk. I have a question. You may be
familiar with the term ‘measurement
dysfunction’, that if you provide a worker
with an incentive to do a good job based
on some kind of metric then the worker
will start optimizing for the metric
instead of trying to do a good job, and
this is kind of inevitable. So, don’t you
think that it might just be treating the
symptoms if we react with codes of
conduct, tweaking algorithms or addressing
power structures, and that instead we need to
remove the incentives that lead to this
measurement dysfunction?
Judith: I would refer to this phenomenon
as “perverse learning” – learning for the
grades you get but not for your intrinsic
motivation to learn something. We observe
that in the science system. But if we only
adapt the algorithm, or take away the
incentives altogether, it would be like not
wanting to evaluate research at all, which
one might arguably want. But to whom would
you address this call or this demand,
“please do not have indicators”, or… I give
the question back to you. laughs
Herald: Okay, questions from the audience
out there on the Internet, please. Your
mic is not working? Okay, then I go to
Microphone 1, please Sir.
Mic1: Yeah, I want to offer a provocative
thesis. I think the fundamental problem is
not how these things are gamed; the
fundamental problem is that we think
the impact factor is a useful measurement
of the quality of science.
Because I think it’s just not.
applause
Judith: Ahm.. I..
Mic 1: I guess that was obvious...
Judith: Yeah, I would not say
that the journal impact factor is
a measurement of scientific quality
because no one really has
a definition of scientific quality.
So what I can observe is only that
people believe this journal impact factor
to reflect some quality.
Maybe they are chasing a ghost, but…
whether it’s a valid measure
is not so important to me;
even if it were a relevant
or a valid measure,
what would concern me is
how it affects science.
Herald: Okay, question from Microphone 3
there. Please.
Mic3: Thanks for the interesting talk.
I have a question about
the 5,000 authors paper.
Was that same paper published
five thousand times, or was it one paper
with a ten-page title page?
Franziska: No, it was one paper ...
... counting more than 7,000 words.
And the authors,
so authors and co-authors,
were more than 5,000.
Mic3: Isn’t it obvious
that this is a fake?
Franziska: Well that’s
what I meant earlier
when saying, you have to see this within
its context. So physicists are working
with ATLAS, this detector
system. As there were some physicists in
the audience, they probably know how
this works. I do not. But as they say,
it’s so much work to work with it, and,
as I said, it requires such high
maintenance that... They obviously have,
yeah...
Mic3: So everybody who contributed was
listed?
Judith: Exactly, that’s it. And whether this
is ethically correct or not, well, that is
something which needs to be discussed,
right? This is why we give this talk, as
we want to make this transparent and
contribute it to an open discussion.
Herald: Okay, I’m sorry guys. I have to
cut off here because our transmission out
there into space is coming to an end.
I suggest that you guys
find each other somewhere,
maybe in the tea house or...
Judith: Sure. We are around, we are here.
Herald: You are around. I would love to
have lots of applause for these ladies,
for they really shed light on
how these algorithms
are, or are not, working. Thank you very much!
Judith: Thank you!
postroll music
subtitles created by c3subtitles.de
in the year 2018