-
Hello. Just to check.
-
Can everyone hear me?
-
Grand.
I've never understood
-
why that's such a phenomenon
when people give talks
-
because if you can't,
what are you meant to say?
-
(laughter)
-
But yes, so as said, I'm Os.
-
I'm a PhD student
at the University of Washington,
-
where, according to the slide,
-
I study "Gender, Infrastructure
and (Counter)Power."
-
I'd ask you all to do me the indulgence
of pretending that
-
that's some very explicit, nuanced,
thoughtful, academic description
-
and not just what I write as a catch-all,
-
because I kind of study
a thousand different things
-
and fitting them all
into a few words is hard.
-
But most of the things I study
-
are around how systems
of knowledge enforce particular ideas
-
of how the world works,
-
and particular relationships of power
-
with a specific focus on gender.
-
I'm also an ex-Wikipedian.
-
I spent 15 years as an editor
-
which is maybe where my interest
in the nature of knowledge started,
-
and I really can't express
how happy I was to be invited
-
and how glad I am to be here
with all of you,
-
but particularly James Forrester
-
who is probably the only person qualified
-
to countersign
my passport renewal application,
-
'cause it's running out soon
and I've been trying to work out...
-
(laughter)
-
You move to Seattle.
Everything is great.
-
Then you're like,
"Oh, the UK government
-
requires me to find an ex-priest,
civil servant, or member of parliament,
-
who's known me for at least 2 years
-
and who I can ship paperwork to."
-
That sounds plausible.
-
(laughter)
-
Anyway, but...
-
So I'm here as someone
who has spent a lot of time of...
-
a number of years--
-
which I don't like to think about
because it makes me feel incredibly old--
-
wrestling with the nature of knowledge
-
and the idea of knowledge--
-
to talk to you about
-
what Wikidata looks like
-
to someone from my background
and with my research interests.
-
And I'm not going to spend much time
on the story of Wikidata itself,
-
because if you're here,
having spent 24 hours
-
having it brain dumped into you,
-
you're familiar with it.
-
It's a big semantic data store
-
that aims to provide
machine-readable knowledge
-
in a centralized way.
-
And what this looks like
is a series of items
-
with associated properties or statements.
-
So the item for "apple"
has the property "fruit."
-
I mean, probably.
-
It's a Wiki so there's probably
a long-running edit war
-
of whether an apple is a fruit,
-
and there's 50 people
running 300 accounts between them,
-
and it's been going for years,
-
and at this point,
if you mention the word apple on Wikidata,
-
you're preemptively banned
as someone who, you know,
-
is secretly a sock puppet
-
and running an account on one
or another side of this.
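-
As a rough sketch of that item-and-statement shape, here is a minimal Python model. The names `Item` and `Statement` and the label "instance of" are assumptions made up for illustration; they are not Wikidata's actual identifiers or API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Statement:
    """One property/value pair attached to an item."""
    prop: str   # hypothetical label, not a real Wikidata property ID
    value: str


@dataclass
class Item:
    """A thing judged worth having an entry, plus its statements."""
    label: str
    statements: List[Statement] = field(default_factory=list)

    def values(self, prop: str) -> List[str]:
        """Return every value recorded for the given property."""
        return [s.value for s in self.statements if s.prop == prop]


# The talk's example: the item "apple" carries the statement "fruit".
apple = Item("apple", [Statement("instance of", "fruit")])
print(apple.values("instance of"))
```

Even in this toy form, someone has to decide that "apple" gets an entry and that "fruit" is the value worth recording.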
-
So as a consequence,
-
it's also a classification system, right?
-
A way of sorting and organizing the world.
-
So, objects or people or concepts
-
are classified as worth
having a Wikidata entry or not.
-
A fruit or not.
-
And in each case
-
a series of criteria apply
-
to determine the properties
that an object should have,
-
and the values of these properties
-
and how the objects
all relate to each other.
-
So Wikidata is really an attempt to build
-
a universal classification system.
-
And classification systems
have been studied pretty extensively.
-
One prominent work
which I'd really recommend people read
-
if they're interested in this stuff
is Sorting Things Out,
-
which is a book by Geoff Bowker
and Susan Leigh Star.
-
And they found that
in an ideal universe,
-
a classification system,
-
be it universal
or over a particular domain,
-
has three attributes.
-
The first is it operates on consistent
and unique principles.
-
So, there's a consistent pattern
-
of what should be in each category
and for what reasons.
-
The second is all the categories
are mutually exclusive.
-
And the third is
that the system is complete.
-
It contains total coverage of
what it describes.
-
And this doesn't mean
it has to have every single object
-
that fits into the system.
-
It just means that in the situation
-
where it lacks an object
-
and that object then shows up,
-
there should be
a consistent mechanism
-
to work out
whether it should be added or not,
-
and how it should be described
-
and so on, and so forth.
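-
The three attributes can be sketched as checks over a toy classification. This is only an illustration under assumed names (`classify`, `audit`) and invented rules. Bowker and Star do not give code, and the first attribute is modeled here simply as one fixed rule per category.

```python
from typing import Callable, Dict, List, Tuple

Rule = Callable[[str], bool]


def classify(obj: str, rules: Dict[str, Rule]) -> List[str]:
    """Return the names of every category whose rule accepts obj."""
    return [name for name, rule in rules.items() if rule(obj)]


def audit(objects: List[str], rules: Dict[str, Rule]) -> Tuple[List[str], List[str]]:
    """Find objects that break mutual exclusivity (overlaps) or completeness (gaps)."""
    overlaps = [o for o in objects if len(classify(o, rules)) > 1]
    gaps = [o for o in objects if len(classify(o, rules)) == 0]
    return overlaps, gaps


# Toy rules, invented for illustration only.
rules = {
    "fruit": lambda o: o in {"apple", "tomato"},
    "vegetable": lambda o: o in {"carrot", "tomato"},
}

overlaps, gaps = audit(["apple", "carrot", "tomato", "mushroom"], rules)
print(overlaps)  # ['tomato']   -> lands in two categories
print(gaps)      # ['mushroom'] -> lands in none
```

An ideal system in the sense described above would return two empty lists for every object it could ever meet.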
-
There is one small problem
with this which is that:
-
"No real-world
working classification system
-
that we have looked at
meets these simple requirements
-
and we doubt that any ever could."
-
Or to put it another way,
-
all classification systems fail.
-
All classification systems
have gaps and exceptions.
-
And obviously, the same is true
for all systems, full stop.
-
Anyone who has ever coded
-
or simply worked in an environment,
-
or studied in an environment,
or lived in the world
-
knows that we've yet
to design a single thing
-
that we've thought all the way through.
-
The problem is that when we take a system,
-
classification, or otherwise,
-
and put it out into the world
-
and give it power and authority,
and integrate it into other systems,
-
that already have power
and authority,
-
there are consequences
for what happens
-
when the system inevitably fails,
-
for how it reinforces or undermines
existing relationships of power,
-
for how it hurts people.
-
A universal classification system is,
in other words,
-
not merely doomed to failure,
it's also doomed to hurt people.
-
And the way that it is structured
-
is ultimately a series of ethical
and political choices as a result--
-
Who do you want to hurt?
How much?
-
What should be done
when people are injured?
-
And those choices have real consequences.
-
And so making these choices
often involves confronting the fact
-
that there's very rarely a single
-
simple machine-readable interpretation
-
of something that's true
for all people throughout all history.
-
Anything in the universe
-
has multiple meanings,
and symbolisms, and nuances
-
to different people in different contexts
at different times.
-
But designing a classification system
and implementing it,
-
designing a system that can make a claim
-
to having consistent principles,
-
and covering everything it discusses,
-
inevitably involves
cutting down on this complexity
-
and making decisions about what
"the" meaning of a thing is going to be,
-
or what array of possible meanings
-
should be presented
and in what sequence.
-
And as a result,
-
it involves silencing voices
or rendering voices louder.
-
Again, this has consequences.
-
And to see what I mean
about this complexity
-
and context, and reduction,
and the consequences of it,
-
I'd like to step through some examples
from Wikidata itself.
-
The ones I've chosen
are all gender-related because again,
-
gender is both professionally
and personally sort of a key interest.
-
So, the first that I'll start with
is transsexualism
-
which is described as a "condition
-
in which an individual
identifies with a gender
-
inconsistent or not culturally associated
with their biological sex."
-
Fairly unobjectionable and--
-
wait, no, it's classified as a disease,
-
and a psychiatric disease at that.
-
Now, I know what you're thinking,
which is this is appalling
-
but actually it's not as simple
-
as either of these statements
being true or false, right?
-
They're in a category of sort of,
"true, except."
-
So, take "transsexualism
is an instance of disease," right?
-
Technically, this is true,
-
in so far as transsexualism
-
is the name of an entry
-
under the International Classification
of Diseases, version 10.
-
But we should add some complexity
and nuance to that.
-
So, the ICD
is a classification of literally
-
everything in the world
that you could have
-
that was in any way involved at all
in someone's injury or death.
-
It is in fact illegal to die of something
that is not listed in the ICD.
-
(laughter)
-
So it contains kind of a lot of things,
-
and transsexualism is listed in it
-
so we classify it as a disease
-
because it's in a classification
of diseases.
-
So, here are some other things
that the ICD also lists as diseases
-
that it has specific entries for.
-
PA80: Shot by accident.
-
PA40.0: Fell off a boat, drowned.
-
(laughter)
-
PA41.1: Fell off a boat,
damaged the boat, and drowned.
-
(laughter)
-
PA40.1: Fell off the boat,
-
didn't damage the boat,
-
didn't drown,
-
still died of something.
-
(laughter)
-
And finally, QD50: Being poor.
-
(laughter)
-
So, if any of you
have ever fallen off a boat,
-
I'm very sorry but you have a disease
-
which you should really
talk to a doctor about.
-
What class of doctor,
I'm not sure.
-
It might be a psychiatrist.
-
Who knows?
-
So you know that's disease, right?
-
What about health specialty: psychiatry?
-
Well, that's also true, sort of.
-
So, psychiatrists are the people
-
who diagnose the presence
of gender dysphoria,
-
a disconnect between one's sense of gender
-
and one's sort of like,
embodied or perceived gender.
-
But again, context.
-
For example,
saying psychiatrists diagnose it
-
ignores the fact
that none of the treatments
-
are psychiatric.
-
You might as well list the specialties
-
as specialization in hormones
-
or plastic surgery,
or being a personal shopper.
-
All of these also have some role
in people's life trajectories.
-
They are not listed.
-
One other useful
potential factoid by the way,
-
is that the ICD 10 is actually
-
the old International Classification
of Diseases,
-
and the ICD 11 no longer lists
transsexualism at all,
-
much less as a disease.
-
But my point here is not that Wikidata
sometimes contains outdated information
-
or sometimes contains
false information,
-
it's that the statements
that are constructed from that information
-
as a consequence of what they leave out
-
and what the results are,
-
drop things and add risk.
-
So, one way of structuring
-
the information that
that entry contained is:
-
"transsexualism is a psychiatric disease."
-
And this leaves out
a lot of complexity,
-
some of which we've discussed.
-
But the greater issue is how it interlocks
-
and resonates with existing narratives,
and existing information.
-
For example, the idea
that transsexualism is a disease.
-
Does anyone know why
the ICD stopped listing it as a disease?
-
Well, two reasons.
-
First is because calling
being trans a disease is not accurate.
-
It does not meet the definition
of being a disease.
-
In fact, the only reason
that anything to do with being trans
-
is still in the ICD is not
out of some objective
-
like, you know, examination
of biology or psychiatry
-
but instead purely pragmatism.
-
That if you stop listing it,
-
then insurance companies
in places like the U.S.
-
would stop covering medical care
-
that is associated with being trans.
-
And the second is that
-
the stigma associated
with having something classified
-
as a disease is substantive,
-
and when you list transsexualism
as a disease
-
and a psychiatric one at that,
-
you tap into really
long-standing assumptions
-
and false beliefs about trans people.
-
Assumptions and beliefs
that have a lot of power.
-
Like, if it's a disease
there must be something wrong
-
with trans people,
something that people should fix.
-
And if it's a psychiatric condition
-
then trans people should
be therapized out of being trans.
-
In other words, whatever the raw truth
or falseness of the statement,
-
stripping out its complexity
and contextuality,
-
lets people fit it into their own notions
of what it means.
-
And that doesn't end
-
in a neutral objective
classification system,
-
it ends in things like conversion therapy,
-
and it being legal
to beat people to death for being trans
-
when you find out that they're trans
after you slept with them,
-
because, you know,
something's wrong with them.
-
Like why would you
be considered reasonable
-
to have done this?
-
So a more accurate framing of this
might be this,
-
which is hard to fit into Wikidata.
-
And because we can't fit
that into Wikidata,
-
and we strip it down,
-
and we lose all that complexity,
-
we open up the possibility to, again,
reinforce these really dangerous notions.
-
So, let's look at another example,
also from gender,
-
and that is the entry for non-binary.
-
So, as Wikidata informs us,
-
non-binary is a range of genders
-
that are neither exclusively man
nor woman.
-
And there are some critiques
I have of the "also known as" section,
-
but that's not the biggest issue here.
-
No, the biggest issue here
-
is that at no point does this entire page
make any reference to trans people.
-
So, if you go to the entry
for transgender woman,
-
it says, "opposite to transgender man."
-
And if you go to the entry
for transgender man
-
it says, "opposite to transgender woman."
-
If you go to this entry,
-
it has absolutely no reference
to trans people whatsoever.
-
There is this complete disconnect
and distinction
-
between non-binary people
and trans people.
-
And this might be, seems to be,
-
a pedantic thing to be concerned about
-
but it's actually a really useful example
for a couple of reasons.
-
The first is that how being non-binary
relates to being trans
-
is really hotly debated.
-
Individual non-binary people
may or may not identify as trans.
-
As a consequence, it's really difficult
-
to make big categorical judgements
about a class of people.
-
Other people would say that non-binary
people aren't trans,
-
for whatever reason,
or that non-binary people are trans.
-
You know, you have to
make a decision at some point.
-
How are you going
to categorize this entry?
-
What attributes are you going
to associate it with?
-
But it's hard to do that in Wikidata
-
when by necessity
the structure of the platform
-
is so categorical and so fixed,
-
that you can't really say like,
for some people these things are related
-
and for others they aren't,
and it's actually very politically charged
-
but you should think about it.
-
There's no objective fact to fall back on.
-
It's very contextual and complex,
and disputed.
-
So, how do you fit this in?
-
Anyone?
-
But, this reductiveness
isn't just a question of,
-
"Oh well, we haven't fit all
the information in
-
so I guess it's not perfect."
-
Again, it fits into preexisting discourses
and the preexisting world,
-
and has the potential
to cause very real harms.
-
There's this very long history
-
of non-binary people
not being considered trans,
-
going back to, in fact, the foundational,
-
sort of medical and academic,
and authoritative works
-
on what being trans is
and how trans people should be treated.
-
And what this has resulted in
-
is non-binary people being cut
out of access to resources--
-
medical care, community membership,
any kind of support.
-
In fact until 2013,
-
being non-binary was not a thing
you could possibly be
-
while still getting access,
to transition-related medical treatment.
-
If you were, and you wanted access
you would have to go to your doctor
-
and consistently lie,
and hopefully get away with it.
-
So, if you want that diagnosis to happen
-
so that your health insurance
will cover things
-
or that your national health service
will cover things,
-
you could either be a man
or a woman,
-
and nothing else.
-
And right now there's a ton of backlash
-
to non-binary existences
-
from people who are thinking
that we are a threat,
-
or something new and novel
-
when we've been around for just
as long as any other kind of trans person
-
and just not discussed.
-
And again, the consequence of this
-
is that this silence is reinforcing
those preexisting ideas
-
that being non-binary has nothing to do
with being trans whatsoever,
-
and it creates and reinforces discourses
that cut people off from care,
-
and cut people off from community.
-
And finally, before I stop harping
on things about gender quite so much,
-
the hijra.
-
So, according to Wikidata
-
the hijra are the third gender
of South Asian cultures
-
and a subclass of non-binary.
-
Now, here's the thing.
-
Yes, hijra people fall
outside a simple man-woman binary,
-
but pretty much zero hijra people
-
would ever define themselves
as non-binary,
-
because it just doesn't make any sense.
-
In a western context,
non-binary people are, by definition,
-
not man or woman
-
but as a consequence
not trans man or trans woman.
-
Hijra includes trans women,
-
and also includes all intersex people,
-
all sterile people,
and a large number of gay people
-
while not including trans men
-
or people who are non-binary,
and were assigned female at birth.
-
All of this is really complex
-
and there are literally books written
-
on the framework of gender
and how that fits into it.
-
But the point is
there's not a simple mapping
-
of western gender notions
-
to gender notions
in the rest of the world.
-
Categorizing hijra people
-
as a subset of non-binary people
-
ignores the fact that most hijra people
do not see themselves that way,
-
would not see themselves that way,
-
and that the definitions of hijra
and non-binary
-
are completely incompatible.
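-
To see why a "subclass" statement carries so much weight, here is a minimal sketch of how a subclass link propagates upward. The function `ancestors` and the parent class "gender identity" are hypothetical, invented for this example. Only the link from hijra to non-binary comes from the entry described above.

```python
from typing import Dict, List


def ancestors(cls: str, subclass_of: Dict[str, List[str]]) -> List[str]:
    """Follow 'subclass of' links upward and return every class reached."""
    seen: List[str] = []
    stack = list(subclass_of.get(cls, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.append(parent)
            stack.extend(subclass_of.get(parent, []))
    return seen


subclass_of = {
    "hijra": ["non-binary"],            # the link described in the talk
    "non-binary": ["gender identity"],  # hypothetical parent, for illustration
}

# Anything asked about "non-binary" now silently includes "hijra".
print(ancestors("hijra", subclass_of))
```

One short statement is enough for every query over the parent class to inherit the child, whether or not the people described would accept that grouping.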
-
But again this has the potential
-
to cause harm.
-
Because the fact of the matter
-
is that western notions of gender
are pretty regularly
-
and over a long period of time
exported to the rest of the world
-
often by violence.
-
We have these information systems.
-
We have classification systems.
-
We have standards.
-
We have, historically and currently, wars,
-
all of which are orientated
around this idea
-
that the western way of doing things
is the only good way
-
or is the best way
and the standard way,
-
and everyone should conform.
-
And so when we have these big projects
which are trying to fit the world
-
in to a very westernized idea
of knowledge, because they have to,
-
because that's how classification systems
do universally work--
-
everything has to fit
into one consistent scheme.
-
It is perpetuating that kind of violence.
-
So, you could respond
to my concerns and examples,
-
and rambles with kind of a lot.
-
One line to take would be,
"Why does this matter?"
-
Why does Wikidata participating
and validating
-
or invalidating particular discourses
have an impact on the world?
-
And the first answer is
it actually doesn't matter if it matters.
-
It matters that you acknowledge it.
-
So, right now the default framing
of Wikidata is
-
we're just collecting all of the knowledge
in a machine-readable form,
-
but you're not.
-
You're also making decisions
-
about what should be included
and what shouldn't,
-
and how knowledge should be represented.
-
What complexity is worth representing
and what isn't.
-
And those are ethical
and political choices,
-
and framing the project
as simply the result
-
of a million anonymous,
and interchangeable monkeys
-
with an equivalent number of typewriters
-
makes it impossible for us
to have conversations about it.
-
Wikidata's organizers and users
and funders must understand
-
that they're fundamentally
making charged decisions
-
that are not neutral
or objective at all,
-
and that is not bad but dangerous.
-
And so, okay, having accepted
-
that these are ethical
and political decisions,
-
you could say,
"Well, if people want their takes
-
on things included,
they should just contribute."
-
And marginalized communities
do contribute a lot, right?
-
There's a long history
of queer communities,
-
particularly, being
very early adopters of technology.
-
And so people could
just contribute to Wikidata.
-
Like Hijra people could create accounts
and start arguing
-
that actually the entry
shouldn't be a subset of non-binary
-
and so on, and so forth.
-
The problem is that
this is unlikely to help
-
because they're the minority,
-
because many of the voices
and perspectives
-
that are currently silenced,
-
in the political and ethical decisions
being made,
-
are those of minorities.
-
So, I did some number crunching on this.
-
Wikidata has 20,000 active editors
-
from a human population
of seven billion give or take,
-
unless you believe that maths is a lie
-
and the world governments,
controlled by lizards under the Arctic,
-
are making everything up.
-
And there are approximately...
Um hmm?
-
(person 1) You mean they're not?
-
(laughter)
-
Look, I'll be honest.
-
If living in the U.S.
for the last five years
-
has taught me anything,
-
it's that any government assemblage
large enough to try and control
-
a big chunk of the human population
-
would in no way be consistently competent
enough to actually cover it up.
-
(laughter)
-
Like we would have found out
in three months--
-
and it wouldn't even have been
-
because of some
plucky investigative reporter--
-
it would have been
because one of the lizards
-
forgot to put on
their human suit one day
-
and accidentally went out
to the shops for a pint of milk
-
(laughter)
-
and got caught in a TikTok video.
-
(laughter)
-
So Wikidata has 20,000 active editors--
-
of whom we will assume none are lizards
-
in human suits or otherwise--
-
from a human population of seven billion,
-
and there are approximately
one million Hijra people in the world.
-
So if we assume a rate
of equal participation--
-
setting aside the extreme poverty
a lot of Hijra people live in
-
and the corresponding impact
-
on access to things
like reliable internet coverage--
-
then the combined efforts
of 20,000 Wikidata editors
-
would have to be overwhelmed
by 2.85 people.
-
That doesn't seem particularly plausible.
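-
The arithmetic behind that figure, as a small sketch. It uses only the numbers quoted above and assumes, as the talk does, equal participation across the whole population. The variable names are made up for this example.

```python
active_editors = 20_000            # active Wikidata editors, as quoted in the talk
world_population = 7_000_000_000   # "seven billion give or take"
hijra_population = 1_000_000       # approximate figure quoted in the talk

# Under equal participation, a group's share of editors
# matches its share of the world population.
expected_hijra_editors = active_editors * hijra_population / world_population

print(round(expected_hijra_editors, 2))
```

The result is roughly 2.86, which the talk quotes as 2.85.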
-
Okay, so then you might say,
-
"Well, what if we just have
other Wikibase instances
-
isn't that the whole thing
we're building towards?
-
You can set up your own Wikibase
with your own perspectives
-
and your own decisions
about how to classify things,
-
and what to prioritize,
and what not to.
-
Make your own site with your own standard
for what constitutes knowledge
-
and what information is important."
-
And people could do precisely that.
-
But the problem is
that Wikidata has a lot of heft behind it
-
which is why the decisions
that Wikidata makes have so much import.
-
There's the fact that it already exists.
-
It has a first-mover advantage.
-
There's the Wikimedia brand.
-
There's the funding
from places like Google.
-
There's the relationships
with other institutions.
-
When the strategic plan for Wikidata
-
calls for engagement
and integration with museums,
-
that doesn't just result
-
in getting more data for Wikidata.
-
That also results in Wikidata
-
and the decisions its users make
permeating more of reality,
-
becoming more of a standard
of how data systems work,
-
and more of a place that is drawn from
to populate other spaces.
-
So I keep using this line,
"Not bad, but dangerous"
-
to describe classification systems
or to describe Wikidata,
-
and I want to reinforce
-
that I don't think that Wikidata
is inherently bad.
-
But I do think that its dangers are vast
-
and are not being properly attended to.
-
Just by looking at gender,
-
we saw three examples,
which I pulled very, very quickly,
-
of situations where even setting aside
-
the sort of objective "accuracy"
-
of the information that
a Wikidata entry might contain,
-
the information it chooses to contain
and chooses to prioritize perpetuates
-
or silences particular discourses,
and particular ideas
-
that have weight in the rest of the world,
that do harm in the rest of the world.
-
And I picked those examples
-
not because they're surprising
in any way,
-
or not because they're unique,
-
but simply to point out that
if I could find that many problems
-
with resonances in wider violent systems
-
in such a tiny sliver of content,
-
imagine how many others
are lurking out there.
-
And the goal of Wikidata,
-
the goal of universal classification
-
if these dangers are not attended to
-
could ultimately result,
or will ultimately result,
-
not in simple like neutral classification,
-
but imposition.
-
In saying this is the way
the world works
-
and if you don't like it
-
then congrats, you should try
and fit into it.
-
And I really wish that I had
a sort of simple answer for this.
-
I don't.
-
It's one of the advantages
-
of switching to academia
-
instead of working
in an engineering department.
-
You can just show up places
-
and go, "Everything
is really complicated.
-
Someone should do something about that.
-
Could I have a grant please?"
-
(laughter)
-
But all I can really do
-
is point you back to
Bowker and Star's conclusion,
-
which is that this isn't ultimately
about Wikidata,
-
this isn't a problem with Wikidata
-
this is that the class of systems
-
that Wikidata is a part of
has never been done safely
-
and there is no reason
to think it could be.
-
And so my call is ultimately
-
not for a particular change,
-
or for all of you
to just go home and give up.
-
It's for the project collectively
-
and for you all individually
-
to determine how comfortable you are
-
with participating and building a system
-
that makes a claim to universalism,
-
that makes a claim to neutrality
and truth in data,
-
when we know that that's neither possible
-
nor harmless when it fails,
-
and if you are not comfortable
with that, working to articulate
-
what other ways of doing this
there might be.
-
And these could look like, for example,
-
giving primacy
to those local Wikibase installs.
-
Saying that ultimately
-
we need to give individual communities
-
and individual contexts
and spaces primacy
-
in defining what matters to them,
-
and how they wish to be defined.
-
And the conversation about
which perspective should be included
-
in some central repository should wait
-
until we have
the full range of perspectives.
-
So, that's everything from me.
-
Thank you, everyone,
for sitting through this.
-
I think we have about 20 to 25 minutes--
-
(moderator) 25 minutes for questions,
so, please, plentiful.
-
Thank you very much.
-
(applause)
-
(person 2) Thank you so much
for this wonderful presentation
-
about the problems inherent
in classification systems.
-
One of the examples you had
is really cool
-
from a mathematical point of view,
-
when you were showing
that transgender male
-
is the opposite of transgender female--
-
or transgender female
is the opposite of transgender male
-
and the opposite of cisgendered female.
-
That makes cisgendered female
be the same as transgender male,
-
because opposite of is the same--
-
if A is opposite of B
and C is the opposite of B,
-
A and C are the same.
-
So actually that's a place
where it should be different from
-
and not opposite of,
-
and that involves
a lot of mathematical issues
-
when we go to actually ask queries
of the database,
-
so it's really important
that you've pointed out things like that.
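-
The questioner's argument can be sketched as follows, assuming "opposite of" is read as symmetric and as having exactly one value per item. The function name `forced_equalities` and the plain-string labels are hypothetical. This is not how Wikidata stores or validates the property.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def forced_equalities(opposite_pairs: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Report items with several opposites; a 'one opposite each' reading forces those to coincide."""
    opposites = defaultdict(set)
    for a, b in opposite_pairs:
        opposites[a].add(b)
        opposites[b].add(a)  # symmetric reading: if a is opposite b, b is opposite a
    return {
        item: sorted(others)
        for item, others in opposites.items()
        if len(others) > 1
    }


# The two statements described in the question.
pairs = [
    ("transgender female", "transgender male"),
    ("transgender female", "cisgender female"),
]

print(forced_equalities(pairs))
# {'transgender female': ['cisgender female', 'transgender male']}
```

Under that strict reading the two listed opposites would have to be the same thing, which is the contradiction the questioner describes. A looser relation such as "different from" does not force that conclusion.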
-
Yeah, another example
of that which I thought was fun
-
was transsexualism was defined
in part further down--
-
which I wanted to include,
-
but couldn't find a way
of fitting it into the flow--
-
as the same as sex-reassignment surgery.
-
Which is unintentionally hilarious
-
because a diagnosis of transsexualism
-
was historically a prerequisite
for sex-reassignment surgery.
-
So it's not so much a chicken
and an egg problem
-
as the chicken is carrying the egg.
-
(laughter)
-
Yeah. So yeah, these--
-
When we look at Wikidata
and how much it uses mathematical,
-
or pseudo-mathematical language of, like,
-
opposite of, distinct from,
in the set of...
-
Yeah, reality is more complex
-
than the mathematics
we have to represent it.
-
I don't have a smart answer there
except to say
-
that I used to be
a quantitative researcher
-
and I left,
and there is a reason for this.
-
(moderator) Next question.
-
Who raised hands?
-
I see a hand over there?
-
(person 3) Hello.
-
First of all.
Thank you for this presentation.
-
It was very eye-opening.
-
I want to tell you,
but first of all--
-
there's a Wikimedia--
I don't know if you know
-
about the community LGBT+ user group.
-
So it's a user group,
-
and they have this mailing list,
-
and they are actually discussing
the issue of sex and gender in Wikidata,
-
and there are some proposals made
-
by LGBT+ people to improve it.
-
So, but it's not fully done yet.
-
So, there are some plans,
people working on it.
-
It would be great
if you want to chime in there
-
and give your opinion
-
because I'm pretty sure
you're more expert than most of us.
-
But I want to give a critique
of this thing that you said
-
about hijra people that said
-
out of 20,000 editors of Wikidata,
-
assuming 2.8 of them will be hijra
-
and they need to overcome
all of these 20,000 people
-
but this is not true.
-
Lots of people, I say assume 20,000 people
-
are just unaware of an issue.
-
They are not bigots
-
or they are not going to actively
-
not let people do this.
-
And lots of them would help
if you tell them.
-
Like, as you [inaudible]
that edits Wikidata,
-
I have no idea about this issue
-
and if I knew it
I would have fixed it.
-
So, yeah.
-
Yeah. I totally get what you mean.
-
And I want to be clear that I'm not saying
-
there are 20,000 people,
-
many of whom are in this room,
-
although only a tiny percentage
-
who are vehement bigots
and cultural imperialists.
-
Instead what I'm getting at
is the fact
-
that the consensus model,
and discussion-based model
-
that the WikiProjects are based on
-
has a couple of flaws,
-
and one of the big flaws
-
is that it assumes that all of the voices
worth representing are there
-
and are represented
somewhat proportionately.
-
Consensus started off
as a model in Quaker communities
-
where literally everyone impacted
by a decision was in the room,
-
because everyone impacted
by a decision could fit in the room.
-
And so my point
with this 2.85 number is not to say
-
you have to argue
with the entire population of Wikidata
-
every time you want to make any decision,
-
but instead to say
that the consensus model
-
and the majoritarian model
of what knowledge should be represented
-
runs fundamentally into a problem
-
when the people
who are being underrepresented
-
are underrepresented.
-
For another example, and a real one,
-
Myanmar as a country.
-
The English Wikipedia claims
that it was called Burma
-
until a couple of years ago.
-
And the reasoning for this
was very simple.
-
The BBC didn't like calling it Myanmar
-
and a load of editors--
-
(person 4) [inaudible] completely wrong.
-
Sorry.
-
(laughter)
-
You run into this issue of like...
-
I know it's not the precise thing,
but it's just...
-
- (person 4) [inaudible] it's actually--
- (moderator) I give you the mic, sir.
-
- Yes?
- (person 4) I'm sorry,
-
that's just incredibly playing
being ignorant and that...
-
- Okay. Go for it.
- (person 4) That's an absolutely terrible,
-
terrible mischaracterization
of the political situation in Myanmar.
-
Okay. Go for it.
-
(person 4) Anyways, so basically
what it is is that the country--
-
in the Burmese language
-
the country can be referred to as
Myanma or Bama.
-
Yep.
-
Myanma tends to be a more
formal register
-
and Bama tends to be
a little bit more informal register
-
but both are acceptable terms
for the country.
-
The term Burma came obviously
from the term Bama,
-
but what happened was
-
there is no official...
-
The country was officially referred
to, in English, as Burma
-
up until 1988-- 1989, excuse me,
-
when the military government
of the country
-
basically decided,
the military junta of the country decided
-
that the country should be
referred to as Myanma.
-
Ostensibly, this was as an attempt
to make the country name
-
more acceptable to minorities
within the country.
-
However, this is a bit
of historical revisionism
-
because Myanma and Bama
specifically refer
-
to the majority ethnicity in the country.
-
So, it was basically the government
of Burma at the time--
-
trying to make the people
equivalent to the country,
-
therefore implicitly saying--
-
(person 4) Almost the opposite,
-
but in a really weird way.
-
They basically declared that
Bama was in reference to the ethnicity
-
and Myanma was in reference
to the country,
-
when historically they both
represent ethnicity
-
and the country.
-
That makes sense.
-
(person 4) But what happened was
because democracy advocates
-
within the country
believed that the military junta
-
did not have the power
-
to be able to change
the name of the country
-
in any language,
-
because they were not
empowered by the people of the country,
-
and were explicitly
a military junta that they...
-
therefore the country should continue
-
to be referred to as Burma in English.
-
Because of the fact that essentially
to call it Myanmar is essentially to say
-
the government of Burma and Myanmar
at the time was legitimate.
-
After the fall of the-- well not fall,
-
but after like the semi return
of civilian government in 2014,
-
this question came up,
-
"Okay, should we call this country
Burma or Myanmar in English?"
-
and essentially,
the de facto leader of the country,
-
Aung San Suu Kyi,
-
said that there's nothing
in the Burmese constitution
-
that says, you know,
what you should call it in English
-
so call it whatever you want.
-
I mean the name of the country
-
is officially the Union of Myanma
in Burmese,
-
but as far as in English
you can call it whatever you want.
-
But generally before the return
of the civilian government in Burma,
-
to refer to it as Myanmar
was essentially
-
to legitimize the military government.
-
And so therefore,
-
to call it Burma was generally considered
to be a specific political act
-
to not give that government legitimacy.
-
Yeah. So, I'm not saying that
that isn't a rationale for it.
-
I'm saying that
on the English Wikipedia specifically,
-
the page went through seven requested
move discussions
-
over four years
and a mediation cabal decision,
-
and an attempted structured mediation,
-
and a review of one of the closures
of the move discussion,
-
and that when you look
at the discussions,
-
most of the sort of argument
back and forth
-
is not about
the nuanced political situation
-
of the country
-
but it's instead about
what is the common name in media sources
-
and what do
different institutions call it.
-
And that when you look at the discussion,
-
you can see a clear point
where pretty much every news organization
-
that isn't the BBC
in the English language,
-
that's considered like a major
western news source
-
has switched their language sources,
-
and the debate
essentially becomes a debate
-
of whether we should listen
to the Wall Street Journal
-
or the BBC.
-
So the point I'm making
is not about the specific politics
-
of the situation, but instead the fact
-
that it's really easy for those decisions
-
to actually become almost a proxy dispute
of how much do we love the BBC,
-
and that when you look at the discussions
-
you see this really nice case study
-
in the issues of having
those conversations
-
and having those nuanced,
and often insider perspectives
-
when most of the discussions
are centered around
-
how much we love the BBC
-
and are coming from people
who are outside the context.
-
So, it's not--
-
My point in all of this is basically
-
that even if you're not fighting
20,000 people,
-
even if you're only arguing
with 20 people,
-
probabilistically, 19 of them
-
are going to be people
who have very strong opinions,
-
who don't necessarily bear
any negative consequences
-
of whichever change happens,
-
but have a particular world view
and have decided to stick in it,
-
and so the proposals
by the LGBTQ+ group
-
to change the Wikidata criteria
-
might be amazing, I might love them,
I might not love them,
-
I haven't read them.
-
But the base premise of this is...
-
We got the people who show up
on Wikidata right now,
-
and those are the representatives
of all queer people
-
and this is the universal rule
of what should be done
-
with the content of all queer people
-
is almost a microcosm
of the same problem.
-
- (moderator) We have another question.
- Yep.
-
(person 5) Hi.
-
I think there's another problem
-
with the consensus-based approach we have,
-
is that sometimes we have consensus
-
on really difficult issues
on how to deal with that
-
and [inaudible] that on Wikidata,
and nobody is reading the discussion.
-
Typically, the project Names,
-
which is a really, really old
WikiProject on Wikidata--
-
and names are a really,
really complicated issue in the world.
-
Not all people in the world
have a given name,
-
not all people have a family name,
and, well, you get the idea.
-
And there are so many
writing systems out there,
-
and we have, actually, a system
-
which was working
for many cases in the world
-
on how to use properties,
-
what items should look like,
-
how to link these together
and everything--
-
We have eight pages--
-
nobody is reading that,
and someone just added
-
Latin script family names
to a Chinese researcher.
-
So, we don't have the names
of these researchers
-
but we know for sure
that the value added was wrong.
-
I don't have the correct value,
-
but I know this one
is not the correct value.
-
And it's not just discussing the issue
-
because we have big discussions
-
and we actually have modeling
-
which is mostly working,
and even qualifiers on things to deal
-
with more complicated cases
-
but people are just,
"Oh, given names suggest a property,
-
I will just add that."
-
- No.
- Yeah.
-
I think it's not just
how to model things,
-
it's really how to explain
to people the model,
-
and that's a technical part--
we could have tools with suggestions
-
and I think the constraint thing
which went live last year
-
is a great thing for that.
-
But even when we know how to model things,
-
it's how to make
this model known to people.
-
That's a bit of a technical issue
on how to do that better.
-
(moderator) So, that was just a remark.
-
There's no real question for you?
-
Or that's a question to you?
-
- How to do that.
- (person 5) Yeah, it's a question.
-
(person 5): Sorry,
even if we have the discussion,
-
(moderator) Yeah, sure.
-
(person 5) My question,
if I was not clear, is that
-
even when everyone is in agreement
-
on how to model complicated cases,
-
how do we technically make
the model known for a project
-
with the scope of Wikidata,
-
so people are not adding
the wrong value in good faith?
-
Because our problem is both.
-
We have trouble
modeling complicated realities,
-
and we have trouble explaining
to users, how to follow the model
-
we actually have.
-
Yep.
-
I will say that
if I could solve that problem
-
which is to reframe it,
-
how to reliably and consistently
enculture new users
-
into having the same view
and understanding
-
of the project space,
-
then they would let me graduate
-
and also give me a job.
-
It's the second oldest problem
in internet spaces is how to do that.
-
The oldest problem is writing a system
-
that will automatically detect insults.
-
I will say that...
-
You can look back at Wikipedia,
-
or before that,
there was the phenomenon
-
of Eternal September on Usenet
-
which was, "Oh these people keep--
AOL disks have gone everywhere
-
and now there's newcomers
-
all the time who don't know
how things work around here,
-
and everything is drowning
in people hitting 'Reply All.'"
-
Generally speaking,
the place that I would look for that
-
is there is a discipline called,
"Computer-supported cooperative work,"
-
and one of their big questions
-
is this question of onboarding,
and of like...
-
making the culture known to people.
-
But it may not be something
that is directly solvable,
-
or that we want to directly solve, right?
-
So, Susan Leigh Star
who wrote Sorting Things Out,
-
one of her other contributions
-
was generally the study of infrastructures
-
of which I would argue Wikidata
is definitely one,
-
and one of the things that she argued
-
was that infrastructures
make themselves known
-
through using them.
-
So like, basically the only way
to work out how a system works
-
is to engage with it,
and trip over, and fall flat on your face,
-
and learn not to fall over that way again.
-
And I think everyone everywhere,
including new users,
-
including people
coming from other projects,
-
wants a way of approaching this
where they don't have to fall over.
-
But I'm not sure if that exists,
-
and I think that a better place
we might look is maybe to ask
-
what are the consequences
of people screwing up
-
and how do we make screwing up
an understandable
-
and a more expected component
of the user experience.
-
(moderator) Okay thanks.
Next question.
-
(person 6) Thank you.
-
So, first, thank you very much
for your presentation to us.
-
Again, someone said, eye-opening.
-
I was looking at the specific item
on transsexualism,
-
and it's actually even more interesting
-
because I was looking
at different Wikipedias,
-
how they dealt with the issue.
-
And I just looked at three.
-
So, apparently, what
we are seeing on Wikidata
-
actually reflects pretty much
what happened to some extent
-
at some level on English Wikipedia,
-
whereas if you look
at Portuguese Wikipedia,
-
the actual item connects to transgender,
-
and on French Wikipedia
it connects to trans identity
-
whereas transsexualism is a redirect
in both Portuguese and French.
-
And I was looking at the history
of editing on the Wikidata item,
-
and if you look at--
there were several sort of wars
-
but the discussion page
is actually only one line,
-
but there were several conflicts
between editors,
-
particularly with the French
-
that were opposing
the use of transsexualism.
-
If you look at the names of the items
in each language,
-
the only one on which
you don't have transsexualism
-
is French for trans identity,
-
and then someone came,
and did what you said about
-
it's the opposite [inaudible],
trans identity,
-
and then there is a different item that--
-
Oh yeah.
-
(person 6) So, it's a complete
global fight over...
-
basically it's reverberating conflicts
-
that are apparently also
-
the manifestations of conflicts
that happen on each Wikipedia.
-
Yes, that also reflect conflicts
in local cultures,
-
and in different parts
of the world, yeah...
-
And I'd argue that, I mean,
-
I'm British so I have a tendency
to say, "Wait, fighting with the French?"
-
"Yes, Please!"
-
(laughter)
-
But I'd say there's almost something
more fundamental than that,
-
and you can make an argument
in the other direction.
-
I can, as a trans person, make an argument
in the other direction and say,
-
"Actually, it's the French
and Portuguese who have it wrong."
-
Because the actual question is
-
is the entry transsexualism about
-
the medical classification,
or the state of being,
-
or the historic medical classification,
-
or the historic term
for the state of being,
-
or are these different entries,
or the same entries?
-
When are things distinct enough
to be different objects,
-
and how do we negotiate that fight
-
between people who think
that the medical status
-
and the identity are the same thing,
or different things.
-
But yeah, there is no easy answer
-
but yeah, I suspect
if you look at a lot of these examples,
-
and if you look
at a lot of controversies,
-
generally on Wikidata
-
what you're going to see is
these fights over...
-
These almost negotiations
-
are the local community norms,
-
and beyond that are the cultural norms.
-
Which is a problem because again,
-
when we're talking about marginalized
or minority groups,
-
we would expect them to also
be marginalized within Wiki communities,
-
and also within Wikidata,
-
and so Wikidata is sort of...
-
building on these
preexisting prioritizations
-
of whose knowledge matters,
and under what circumstances
-
and in what form.
-
(person 7): I wanted to touch
on something you mentioned.
-
Everything is complex
and I think modeling it right,
-
getting it right on Wikidata
-
is not the sum of the issue.
-
As you said, Wikidata is infrastructure,
-
and as [Hermione] said,
-
we have gotten it right perhaps
in some things, in some other topics,
-
and still can't
actually practice it right.
-
Yep.
-
(person 7): So I want to suggest that
-
this is a prevalent condition
of the human race.
-
And however well we model something,
even if we model gender
-
ten times more complexly
than we do today,
-
most SPARQL queries involving gender
would not bother
-
- with the qualifiers, right?
- Yeah.
-
And would still generate very,
very flattened, very simplified results.
-
Google's use of our data
in the infamous Google infoboxes
-
will also flatten the data
and ignore qualifiers.
-
That is not going to change.
-
Wikidata will continue to be used
in simplistic ways.
-
Indeed, the majority of use,
-
probably, will be that simplistic thing.
-
My point is, it's probably not fixable
-
and we shouldn't stop trying.
-
I mean we should try to get it right
-
and understand that a lot of the use is,
despite our best efforts,
-
going to be simplistic and wrong.
-
Yep. I would agree with that.
-
I guess I would say that
-
you know,
it's not about like,
-
my issue here is not about
it being you know,
-
there is one true
incredibly complex answer.
-
At some point I just gave up
-
even in my thesis which is about
transness and technology
-
of defining transness.
-
I just gave up.
-
And I instead took what is referred to
as a pragmatist view,
-
which is basically that
it is whatever the people
-
in the situation that you're studying
believe it to be,
-
and however they construct
the world as if it were,
-
and what I'm getting at with this
-
is not that there is
some universal definition
-
of anything which,
if sufficiently complicated,
-
would be enough,
-
but instead that I think
that the scale is the problem,
-
and the universalism is the problem.
-
Maybe we should keep trying,
-
or maybe we should stop.
-
Maybe we should instead say
that, again,
-
there should be a Wikibase install
in every self-defined community
-
that wants it and they can define things,
and articulate things
-
to their own satisfaction.
-
But then we end up in more political
-
and fraught debates of reformist
versus radical actions,
-
and how you open a box
with a crowbar that's already inside it,
-
and I end up quoting Foucault for an hour,
-
and everyone gets sad.
-
Including me because I hate Foucault.
-
So this might be a discussion
for elsewhere.
-
But generally agreed, I just--
-
I would raise questions about
-
whether we should keep trying
-
for a better form of universalism,
-
or whether the problem
is that universalism.
-
I'm guessing we have
time for one more? Yeah.
-
(person 8): This is a short question,
possibly complex answer.
-
One of the most popular
-
and used properties is "sex
or gender" on Wikidata.
-
Could you speak to whether you find
-
that merging useful,
productive, problematic?
-
Sure, I mean I think it's always
-
going to be reductive
cause it's a merging.
-
But I also think
that it is deeply tiresome
-
in a way that's kind of interesting
-
insofar as it reveals
the limitations of Wikidata,
-
though Wikidata claims to be building
-
towards this like big objective
set of knowledge,
-
but ultimately kind of
smushes these things together
-
because I mean they haven't asked
-
most people who have entries
what their gender is,
-
and/or what their sex is,
-
and so they just merge them
-
so that inference is easier.
-
But generally speaking, yeah,
I say that the merging
-
of the two together
is reductive and dangerous
-
but...
-
Again it's not...
-
There is no good way of doing it.
-
I think this is a particularly bad way
-
of treating them
as interchangeable things,
-
and treating them
as forever-linked things,
-
but I can't suggest a better way
that remains--
-
that continues to have Wikidata
-
even tracking this information
or the information contained
-
in that at all.
-
(moderator): Okay.
I think we have to conclude here.
-
I still saw some raised hands
-
so hopefully you'll be around.
-
Yeah. I am a grad student.
-
I have functionally no life, so...
-
(laughter)
-
(moderator): Perfect. Okay.
So please come and talk.
-
Thank you very much.
-
(applause)