Hello. Just to check.
Can everyone hear me?
Grand.
I've never understood
why that's such a phenomenon
when people give talks
because if you can't,
what are you meant to say?
(laughter)
But yes, so as said, I'm Os.
I'm a PhD student
at the University of Washington,
where, according to the slide,
I study "Gender, Infrastructure
and (Counter)Power."
I'd ask you all to do me the indulgence
of pretending that
that's some very explicit, nuanced,
thoughtful, academic description
and not just what I write as a catch-all,
because I kind of study
a thousand different things
and fitting them all
into a few words is hard.
But most of the things I study
are around how systems
of knowledge enforce particular ideas
of how the world works,
and particular relationships of power
with a specific focus on gender.
I'm also an ex-Wikipedian.
I spent 15 years as an editor
which is maybe where my interest
in the nature of knowledge started,
and I really can't express
how happy I was to be invited
and how glad I am to be here
with all of you,
but particularly James Forrester
who is probably the only person qualified
to countersign
my passport renewal application,
'cause it's running out soon
and I've been trying to work out...
(laughter)
You move to Seattle.
Everything is great.
Then you're like,
"Oh, the UK government
requires me to find an ex-priest,
civil servant, or member of parliament,
who's known me for at least 2 years
and who I can ship paperwork to."
That sounds plausible.
(laughter)
Anyway, but...
So I'm here as someone
who has spent a lot of time of...
a number of years--
which I don't like to think about
because it makes me feel incredibly old--
wrestling with the nature of knowledge
and the idea of knowledge--
to talk to you about
what Wikidata looks like
to someone from my background
and with my research interests.
And I'm not going to spend much time
on the story of Wikidata itself,
because if you're here,
having spent 24 hours
having it brain dumped into you,
you're familiar with it.
It's a big semantic data store
that aims to provide
machine-readable knowledge
in a centralized way.
And what this looks like
is a series of items
with associated properties or statements.
So the item for "apple"
has the property "fruit."
I mean, probably.
It's a Wiki so there's probably
a long-running edit war
of whether an apple is a fruit,
and there's 50 people
running 300 accounts between them,
and it's been going for years,
and at this point,
if you mention the word apple on Wikidata,
you're preemptively banned
as someone who, you know,
is secretly a sock puppet
and running an account on one
or another side of this.
So as a consequence,
it's also a classification system, right?
A way of sorting and organizing the world.
So, objects or people or concepts
are classified as worth
having a Wikidata entry or not.
A fruit or not.
And in each case
a series of criteria applies
to determine the properties
that an object should have,
and the values of these properties
and how the objects
all relate to each other.
So Wikidata is really an attempt to build
a universal classification system.
And classification systems
have been studied pretty extensively.
One prominent work
which I'd really recommend people read
if they're interested in this stuff
is Sorting Things Out,
which is a book by Geoff Bowker
and Susan Leigh Star.
And they found that
in an ideal universe,
a classification system,
be it universal
or over a particular domain,
has three attributes.
The first is it operates on consistent
and unique principles.
So, there's a consistent pattern
of what should be in each category
and for what reasons.
The second is all the categories
are mutually exclusive.
And the third is
that the system is complete.
It contains total coverage of
what it describes.
And this doesn't mean
it has to have every single object
that fits into the system.
It just means that in the situation
where it lacks an object
and that object then shows up,
there should be
a consistent mechanism
to work out
whether it should be added or not,
and how it should be described
and so on, and so forth.
There is one small problem
with this which is that:
"No real-world
working classification system
that we have looked at
meets these simple requirements
and we doubt that any ever could."
Or to put it another way,
all classification systems fail.
All classification systems
have gaps and exceptions.
And obviously, the same is true
for all systems, full stop.
Anyone who has ever coded
or simply worked in an environment,
or studied in an environment,
or lived in the world
knows that we've yet
to design a single thing
that we've thought all the way through.
The problem is that when we take a system,
classification, or otherwise,
and put it out into the world
and give it power and authority,
and integrate it into other systems,
that already have power
and authority,
there are consequences
for what happens
when the system inevitably fails,
for how it reinforces or undermines
existing relationships of power,
for how it hurts people.
A universal classification system is,
in other words,
not merely doomed to failure,
it's also doomed to hurt people.
And the way that it is structured
is ultimately a series of ethical
and political choices as a result--
Who do you want to hurt?
How much?
What should be done
when people are injured?
And those choices have real consequences.
And so making these choices
often involves confronting the fact
that there's very rarely a single
simple machine-readable interpretation
of something that's true
for all people throughout all history.
Anything in the universe
has multiple meanings,
and symbolisms, and nuances
to different people in different contexts
at different times.
But designing a classification system
and implementing it,
designing a system that can make a claim
to having consistent principles,
and covering everything it discusses,
inevitably involves
cutting down on this complexity
and making decisions about what
"the" meaning of a thing is going to be,
or what array of possible meanings
should be presented
and in what sequence.
And as a result,
it involves silencing voices
or rendering voices louder.
Again, this has consequences.
And to see what I mean
about this complexity
and context, and reduction,
and the consequences of it,
I'd like to step through some examples
from Wikidata itself.
The ones I've chosen
are all gender-related because again,
gender is both professionally
and personally sort of a key interest.
So, the first that I'll start with
is transsexualism
which is described as a "condition
in which an individual
identifies with a gender
inconsistent or not culturally associated
with their biological sex."
Fairly unobjectionable and--
wait, no, it's classified as a disease,
and a psychiatric disease at that.
Now, I know what you're thinking,
which is this is appalling
but actually it's not as simple
as either of these statements
being true or false, right?
They're in a category of sort of,
"true, except."
So, take "transsexualism
is an instance of disease," right?
Technically, this is true,
in so far as transsexualism
is the name of an entry
under the International Classification
of Diseases, version 10.
But we should add some complexity
and nuance to that.
So, the ICD
is a classification of literally
everything in the world
that you could have
that was in any way involved at all
in someone's injury or death.
It is in fact illegal to die of something
that is not listed in the ICD.
(laughter)
So it contains kind of a lot of things,
and transsexualism is listed in it
so we classify it as a disease
because it's in a classification
of diseases.
So, here are some other things
that the ICD also lists as diseases
that it has specific entries for.
PA80: Shot by accident.
PA40.0: Fell off a boat, drowned.
(laughter)
PA41.1: Fell off a boat,
damaged the boat, and drowned.
(laughter)
PA40.1: Fell off the boat,
didn't damage the boat,
didn't drown,
still died of something.
(laughter)
And finally, QD50: Being poor.
(laughter)
So, if any of you
have ever fallen off a boat,
I'm very sorry but you have a disease
which you should really
talk to a doctor about.
What class of doctor,
I'm not sure.
It might be a psychiatrist.
Who knows?
So, you know, that's disease, right?
What about health specialty: psychiatry?
Well, that's also true, sort of.
So, psychiatrists are the people
who diagnose the presence
of gender dysphoria,
a disconnect between one's sense of gender
and one's sort of like,
embodied or perceived gender.
But again, context.
For example,
saying psychiatrists diagnose it
ignores the fact
that none of the treatments
are psychiatric.
You might as well list the specialties
as specialization in hormones
or plastic surgery,
or being a personal shopper.
All of these also have some role
in people's life trajectories.
They are not listed.
One other useful
potential factoid by the way,
is that the ICD 10 is actually
the old International Classification
of Diseases,
and the ICD 11 no longer lists
transsexualism at all,
much less as a disease.
But my point here is not that Wikidata
sometimes contains outdated information
or sometimes contains
false information,
it's that the statements
that are constructed from that information
as a consequence of what they leave out
and what the results are,
drop things and add risk.
So, one way of structuring
the information that
that entry contained is:
"transsexualism is a psychiatric disease."
And this leaves out
a lot of complexity,
some of which we've discussed.
But the greater issue is how it interlocks
and resonates with existing narratives,
and existing information.
For example, the idea
that transsexualism is a disease.
Does anyone know why
the ICD stopped listing it as a disease?
Well, two reasons.
First is because calling
being trans a disease is not accurate.
It does not meet the definition
of being a disease.
In fact, the only reason
that anything to do with being trans
is still in the ICD is not
out of some objective
like, you know, examination
of biology or psychiatry
but instead purely pragmatism.
That if you stop listing it,
then insurance companies
in places like the U.S.
would stop covering medical care
that is associated with being trans.
And the second is that
the stigma associated
with having something classified
as a disease is substantive,
and when you list transsexualism
as a disease
and a psychiatric one at that,
you tap into really
long-standing assumptions
and false beliefs about trans people.
Assumptions and beliefs
that have a lot of power.
Like, if it's a disease
there must be something wrong
with trans people,
something that people should fix.
And if it's a psychiatric condition
then trans people should
be therapized out of being trans.
In other words, whatever the raw truth
or falseness of the statement,
stripping out its complexity
and contextuality,
lets people fit it into their own notions
of what it means.
And that doesn't end
in a neutral objective
classification system,
it ends in things like conversion therapy,
and it being legal
to beat people to death for being trans
when you find out that they're trans
after you slept with them,
because, you know,
something's wrong with them.
Like why would you
be considered reasonable
to have done this?
So a more accurate framing of this
might be this,
which is hard to fit into Wikidata.
And because we can't fit
that into Wikidata,
and we strip it down,
and we lose all that complexity,
we open up the possibility to, again,
reinforce these really dangerous notions.
So, let's look at another example,
also from gender,
and that is the entry for non-binary.
So, as Wikidata informs us,
non-binary is a range of genders
that are neither exclusively man
nor woman.
And there are some critiques
I have of the "also known as" section,
but that's not the biggest issue here.
No, the biggest issue here
is that at no point does this entire page
make any reference to trans people.
So, if you go to the entry
for transgender woman,
it says, "opposite to transgender man."
And if you go to the entry
for transgender man
it says, "opposite to transgender woman."
If you go to this entry,
it has absolutely no reference
to trans people whatsoever.
There is this complete disconnect
and distinction
between non-binary people
and trans people.
And this might seem to be
a pedantic thing to be concerned about
but it's actually a really useful example
for a couple of reasons.
The first is that how being non-binary
relates to being trans
is really hotly debated.
Individual non-binary people
may or may not identify as trans.
As a consequence, it's really difficult
to make big categorical judgements
about a class of people.
Other people would say that non-binary
people aren't trans,
for whatever reason,
or that non-binary people are trans.
You know, you have to
make a decision at some point.
How are you going
to categorize this entry?
What attributes are you going
to associate it with?
But it's hard to do that in Wikidata
when by necessity
the structure of the platform
is so categorical and so fixed,
that you can't really say like,
for some people these things are related
and for others they aren't,
and it's actually very politically charged
but you should think about it.
There's no objective fact to fall back on.
It's very contextual and complex,
and disputed.
So, how do you fit this in?
Anyone?
But, this reductiveness
isn't just a question of,
"Oh well, we haven't fit all
the information in
so I guess it's not perfect."
Again, it fits into preexisting discourses
and the preexisting world,
and has the potential
to cause very real harms.
There's this very long history
of non-binary people
not being considered trans,
going back to, in fact, the foundational,
sort of medical and academic,
and authoritative works
on what being trans is
and how trans people should be treated.
And what this has resulted in
is non-binary people being cut
out of access to resources--
medical care, community membership,
any kind of support.
In fact until 2013,
being non-binary was not a thing
you could possibly be
while still getting access
to transition-related medical treatment.
If you were, and you wanted access
you would have to go to your doctor
and consistently lie,
and hopefully get away with it.
So, if you want that diagnosis to happen
so that your health insurance
will cover things
or that your national health service
will cover things,
you could either be a man
or a woman,
and nothing else.
And right now there's a ton of backlash
to non-binary existences
from people who are thinking
that we are a threat,
or something new and novel
when we've been around for just
as long as any other kind of trans person
and just not discussed.
And again, the consequence of this
is that this silence is reinforcing
those preexisting ideas
that being non-binary has nothing to do
with being trans whatsoever,
and it creates and reinforces discourses
that cut people off from care,
and cut people off from community.
And finally, before I stop harping
on things about gender quite so much,
the hijra.
So, according to Wikidata
the hijra are the third gender
of South Asian cultures
and a subclass of non-binary.
Now, here's the thing.
Yes, hijra people fall
outside a simple man-woman binary,
but pretty much zero hijra people
would ever define themselves
as non-binary,
because it just doesn't make any sense.
In a western context,
non-binary people are, by definition,
not man or woman
but as a consequence
not trans man or trans woman.
Hijra includes trans women,
and also includes all intersex people,
all sterile people,
and a large number of gay people
while not including trans men
or people who are non-binary,
and were assigned female at birth.
All of this is really complex
and there are literally books written
on the framework of gender
and how that fits into it.
But the point is
there's not a simple mapping
of western gender notions
to gender notions
in the rest of the world.
Categorizing hijra people
as a subset of non-binary people
ignores the fact that most hijra people
do not see themselves that way,
would not see themselves that way,
and that the definitions of hijra
and non-binary
are completely incompatible.
But again this has the potential
to cause harm.
Because the fact of the matter
is that western notions of gender
are pretty regularly
and over a long period of time
exported to the rest of the world
often by violence.
We have these information systems.
We have classification systems.
We have standards.
We have, historically and currently, wars,
all of which are orientated
around this idea
of the western way of doing things
is the only good way
or is the best way
and the standard way,
and everyone should conform.
And so when we have these big projects
which are trying to fit the world
into a very westernized idea
of knowledge, because they have to,
because that's how classification systems
do universally work--
everything has to fit
into one consistent scheme.
It is perpetuating that kind of violence.
So, you could respond
to my concerns and examples,
and rambles with kind of a lot.
One line to take would be,
"Why does this matter?"
Why does Wikidata participating in
and validating
or invalidating particular discourses
have an impact on the world?
And the first answer is
it actually doesn't matter if it matters.
It matters that you acknowledge it.
So, right now the default framing
of Wikidata is
we're just collecting all of the knowledge
in a machine-readable form,
but you're not.
You're also making decisions
about what should be included
and what shouldn't,
and how knowledge should be represented.
What complexity is worth representing
and what isn't.
And those are ethical
and political choices,
and framing the project
as simply the result
of a million anonymous
and interchangeable monkeys
with an equivalent number of typewriters
makes it impossible for us
to have conversations about it.
Wikidata's organizers and users
and funders must understand
that they're fundamentally
making charged decisions
that are not neutral
or objective at all,
and that is not bad but dangerous.
And so, okay, having accepted
that these are ethical
and political decisions,
you could say,
"Well, if people want their takes
on things included,
they should just contribute."
And marginalized communities
do contribute a lot, right?
There's a long history
of queer communities,
particularly, being
very early adopters of technology.
And so people could
just contribute to Wikidata.
Like hijra people could create accounts
and start arguing
that actually the entry
shouldn't be a subset of non-binary
and so on, and so forth.
The problem is that
this is unlikely to help
because they're the minority,
because many of the voices
and perspectives
that are currently silenced,
in the political and ethical decisions
being made,
are those of minorities.
So, I did some number crunching on this.
Wikidata has 20,000 active editors
from a human population
of seven billion give or take,
unless you believe that maths is a lie
and the world governments,
controlled by lizards under the Arctic,
are making everything up.
And there are approximately...
Um hmm?
(person 1) You mean they're not?
(laughter)
Look, I'll be honest.
If living in the U.S.
for the last five years
has taught me anything,
it's that any government assemblage
large enough to try and control
a big chunk of the human population
would in no way be consistently competent
enough to actually cover it up.
(laughter)
Like we would have found out
in three months--
and it wouldn't even have been
because of some
plucky investigative reporter--
it would have been
because one of the lizards
forgot to put on
their human suit one day
and accidentally went out
to the shops for a pint of milk
(laughter)
and got caught in a TikTok video.
(laughter)
So Wikidata has 20,000 active editors--
of whom we will assume none are lizards
in human suits or otherwise--
from a human population of seven billion,
and there are approximately
one million hijra people in the world.
So if we assume a rate
of equal participation--
setting aside the extreme poverty
a lot of hijra people live in
and the corresponding impact
on access to things
like reliable internet coverage--
then the combined efforts
of 20,000 Wikidata editors
would have to be overwhelmed
by 2.85 people.
That doesn't seem particularly plausible.
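The 2.85 figure can be checked with a quick proportional estimate. The sketch below is only an illustration: it uses the three numbers quoted in the talk (20,000 active editors, a world population of seven billion, roughly one million hijra people) and assumes, as the talk does, an equal rate of participation. The variable names are made up for this example.

```python
# Figures as quoted in the talk; all three are rough estimates.
active_editors = 20_000
world_population = 7_000_000_000
hijra_population = 1_000_000

# Under equal participation, a group's share of editors
# matches its share of the world population.
expected_hijra_editors = active_editors * hijra_population / world_population

print(f"{expected_hijra_editors:.2f}")  # prints 2.86
```

The result is about 2.86 people, which the talk gives as 2.85.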
Okay, so then you might say,
"Well, what if we just have
other Wikibase instances?
Isn't that the whole thing
we're building towards?
You can set up your own Wikibase
with your own perspectives
and your own decisions
about how to classify things,
and what to prioritize,
and what not to.
Make your own site with your own standard
for what constitutes knowledge
and what information is important."
And people could do precisely that.
But the problem is
that Wikidata has a lot of heft behind it
which is why the decisions
that Wikidata makes have so much import.
There's the fact that it already exists.
It has a first-mover advantage.
There's the Wikimedia brand.
There's the funding
from places like Google.
There's the relationships
with other institutions.
When the strategic plan for Wikidata
calls for engagement
and integration with museums,
that doesn't just result
in getting more data for Wikidata.
That also results in Wikidata
and the decisions its users make
permeating more of reality,
becoming more of a standard
of how data systems work,
and more of a place that is drawn from
to populate other spaces.
So I keep using this line,
"Not bad, but dangerous"
to describe classification systems
or to describe Wikidata,
and I want to reinforce
that I don't think that Wikidata
is inherently bad.
But I do think that its dangers are vast
and are not being properly attended to.
Just by looking at gender,
we saw three examples,
which I pulled very, very quickly,
of situations where even setting aside
the sort of objective "accuracy"
of the information that
a Wikidata entry might contain,
the information it chooses to contain
and chooses to prioritize perpetuates
or silences particular discourses,
and particular ideas
that have weight in the rest of the world,
that do harm in the rest of the world.
And I picked those examples
not because they're surprising
in any way,
or not because they're unique,
but simply to point out that
if I could find that many problems
with resonances in wider violent systems
in such a tiny sliver of content,
imagine how many others
are lurking out there.
And the goal of Wikidata,
the goal of universal classification
if these dangers are not attended to
could ultimately result,
or will ultimately result,
not in simple like neutral classification,
but imposition.
In saying this is the way
the world works
and if you don't like it
then congrats, you should try
and fit into it.
And I really wish that I had
a sort of simple answer for this.
I don't.
It's one of the advantages
of switching to academia
instead of working
in an engineering department.
You can just show up places
and go, "Everything
is really complicated.
Someone should do something about that.
Could I have a grant, please?"
(laughter)
But all I can really do
is point you back to
Bowker and Star's conclusion,
which is that this isn't ultimately
about Wikidata,
this isn't a problem with Wikidata,
this is that the class of systems
that Wikidata is a part of
has never been done safely
and there is no reason
to think it could be.
And so my call is ultimately
not for a particular change,
or for all of you
to just go home and give up.
It's for the project collectively
and for you all individually
to determine how comfortable you are
with participating and building a system
that makes a claim to universalism,
that makes a claim to neutrality
and truth in data,
when we know that that's neither possible
nor harmless when it fails,
and if you are not comfortable
with that, working to articulate
what other ways of doing this
there might be.
And these could look like, for example,
giving primacy
to those local Wikibase installs.
Saying that ultimately
we need to give individual communities
and individual contexts
and spaces primacy
in defining what matters to them,
and how they wish to be defined.
And the conversation about
which perspective should be included
in some central repository should wait
until we have
the full range of perspectives.
So, that's everything from me.
Thank you, everyone,
for sitting through this.
I think we have about 20 to 25 minutes--
(moderator) 25 minutes for questions,
so, please, plentiful.
Thank you very much.
(applause)
(person 2) Thank you so much
for this wonderful presentation
about the problems inherent
in classification systems.
One of the examples you had
is really cool
from a mathematical point of view,
when you were showing
that transgender male
is the opposite of transgender female--
or transgender female
is the opposite of transgender male
and the opposite of cisgendered female.
That makes cisgendered female
be the same as transgender male,
because opposite of is the same--
if A is opposite of B
and C is the opposite of B,
A and C are the same.
So actually that's a place
where it should be different from
and not opposite of,
and that involves
a lot of mathematical issues
when we go to actually ask queries
of the database,
so it's really important
that you've pointed out things like that.
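The questioner's argument can be sketched in a few lines. This is a hedged illustration, not a query against Wikidata: the two statements are the ones described in the question, the labels are plain strings rather than real item identifiers, and the function name is hypothetical. It assumes "opposite of" is read as symmetric and unique, so that anything sharing an opposite is forced to be the same thing.

```python
from collections import defaultdict

# "Opposite of" statements as described in the question (not checked
# against live Wikidata).
statements = [
    ("transgender female", "transgender male"),
    ("transgender female", "cisgender female"),
]

# Treat the relation as symmetric: record each pair in both directions.
opposites = defaultdict(set)
for a, b in statements:
    opposites[a].add(b)
    opposites[b].add(a)


def forced_equal(opposites):
    """List groups of items that a unique-opposite reading would equate."""
    return [sorted(group) for group in opposites.values() if len(group) > 1]


print(forced_equal(opposites))
# prints [['cisgender female', 'transgender male']]
```

Under that strict reading the two items listed together would have to be identical, which is the contradiction the questioner points to. A weaker relation such as "different from" carries no uniqueness assumption and avoids it.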
Yeah, another example
of that which I thought was fun
was transsexualism was defined
in part further down--
which I wanted to include,
but couldn't find a way
of fitting it into the flow--
as the same as sex-reassignment surgery.
Which is unintentionally hilarious
because a diagnosis of transsexualism
was historically a prerequisite
for sex-reassignment surgery.
So it's not so much a chicken
and an egg problem
as the chicken is carrying the egg.
(laughter)
Yeah. So yeah, these--
When we look at Wikidata
and how much it uses mathematical,
or pseudo-mathematical language of, like,
opposite of, distinct from,
in the set of...
Yeah, reality is more complex
than the mathematics
we have to represent it.
I don't have a smart answer there
except to say
that I used to be
a quantitative researcher
and I left,
and there is a reason for this.
(moderator) Next question.
Who raised hands?
I see a hand over there?
(person 3) Hello.
First of all.
Thank you for this presentation.
It was very eye-opening.
I want to tell you,
but first of all--
there's a Wikimedia--
I don't know if you know
about the community LGBT+ user group.
So it's a user group,
and they have this mailing list,
and they're actually discussing
the issue of sex and gender in Wikidata,
and there are some proposals made
by LGBT+ people to improve it.
So, but it's not fully done yet.
So, there are some plans,
people working on it.
It would be great
if you want to chime in there
and give your opinion
because I'm pretty sure
you're more expert than most of us.
But I want to give a critique
of this thing that you said
about hijra people that said
out of 20,000 editors of Wikidata,
assuming 2.8 of them will be hijra
and they need to overcome
all of these 20,000 people
but this is not true.
Lots of people, I say assume 20,000 people
are just unaware of an issue.
They are not bigots
or they are not going to actively
not let people do this.
And lots of them would help
if you tell them.
Like, as you [inaudible]
that edits Wikidata,
I have no idea about this issue
and if I knew it
I would have fixed it.
So, yeah.
Yeah. I totally get what you mean.
And I want to be clear that I'm not saying
there are 20,000 people,
many of whom are in this room,
although only a tiny percentage
who are vehement bigots
and cultural imperialists.
Instead what I'm getting at
is the fact
that the consensus model,
and discussion-based model
that the WikiProjects are based on
has a couple of flaws,
and one of the big flaws
is that it assumes that all of the voices
worth representing are there
and are represented
somewhat proportionately.
Consensus started off
as a model in Quaker communities
where literally everyone impacted
by a decision was in the room,
because everyone impacted
by a decision could fit in the room.
And so my point
with this 2.85 number is not to say
you have to argue
with the entire population of Wikidata
every time you want to make any decision,
but instead to say
that the consensus model
and the majoritarian model
of what knowledge should be represented
runs fundamentally into a problem
when the people
who are being underrepresented
are underrepresented.
For another example, and a real one,
Myanmar as a country.
The English Wikipedia claims
that it was called Burma
until a couple of years ago.
And the reasoning for this
was very simple.
The BBC didn't like calling it Myanmar
and a load of editors--
(person 4) [inaudible] completely wrong.
Sorry.
(laughter)
You run into this issue of like...
I know it's not the precise thing,
but it's just...
- (person 4) [inaudible] it's actually--
- (moderator) I give you the mic, sir.
- Yes?
- (person 4) I'm sorry,
that's just incredibly playing
being ignorant and that...
- Okay. Go for it.
- (person 4) That's an absolutely terrible,
terrible mischaracterization
of the political situation in Myanmar.
Okay. Go for it.
(person 4) Anyways, so basically
what it is is that the country--
in the Burmese language
the country can be referred to as
Myanma or Bama.
Yep.
Myanma tends to be a more
formal register
and Bama tends to be
a little bit more informal register
but both are acceptable terms
for the country.
The term Burma came obviously
from the term Bama,
but what happened was
there is no official...
The country was officially referred
to, in English, as Burma
up until 1988-- 1989, excuse me,
when the military government
of the country
basically decided,
the military junta of the country decided
that the country should be
referred to as Myanma.
Ostensibly, this was as an attempt
to make the country name
more acceptable to minorities
within the country.
However, this is a bit
of historical revisionism
because Myanma and Bama
specifically refer
to the majority ethnicity in the country.
So, it was basically the government
of Burma at the time--
trying to make the people
equivalent to the country,
therefore implicitly saying--
(person 4) Almost the opposite,
but in a really weird way.
They basically declared that
Bama was in reference to the ethnicity
and Myanma was in reference
to the country,
when historically they both
represent ethnicity
and the country.
That makes sense.
(person 4) But what happened was
because democracy advocates
within the country
believed that the military junta
did not have the power
to be able to change
the name of the country
in any language,
because they were not
empowered by the people of the country,
and were explicitly
a military junta that they...
therefore the country should continue
to be referred to as Burma in English.
Because of the fact that essentially
to call it Myanmar is essentially to say
the government of Burma and Myanmar
at the time was legitimate.
After the fall of the-- well not fall,
but after like the semi return
of civilian government in 2014,
this question came up,
"Okay, should we call this country
Burma or Myanmar in English?"
and essentially,
the de facto leader of the country,
Aung San Suu Kyi,
said that there's nothing
in the Burmese constitution
that says you know,
what you should call it in English
so call it whatever you want.
I mean the name of the country
is officially the Union of Myanma
in Burmese,
but as far as in English
you can call it whatever you want.
But generally before the return
of the civilian government in Burma,
to refer to it as Myanmar
was essentially
to legitimize the military government.
And so therefore,
to call it Burma was generally considered
to be a specific political act
to not give that government legitimacy.
Yeah. So, I'm not saying that
that isn't a rationale for it.
I'm saying that
on the English Wikipedia specifically,
the page went through seven requested
move discussions
over four years
and a mediation cabal decision,
and an attempted structured mediation,
and a review of one of the closures
of the move discussion,
and that when you look
at the discussions,
most of the sort of argument
back and forth
is not about
the nuanced political situation
of the country
but it's instead about
what is the common name in media sources
and what do
different institutions call it.
And that when you look at the discussion,
you can see a clear point
where pretty much every news organization
that isn't the BBC
in the English language,
that's considered like a major
western news source
has switched their language sources,
and the debate
essentially becomes a debate
of whether we should listen
to the Wall Street Journal
or the BBC.
So the point I'm making
is not about the specific politics
of the situation, but instead the fact
that it's really easy for those decisions
to actually become almost a proxy dispute
of how much do we love the BBC,
and that when you look at the discussions
you see this really nice case study
in the issues of having
those conversations
and having those nuanced,
and often insider perspectives
when most of the discussions
are centered around
how much we love the BBC
and are coming from people
who are outside the context.
So, it's not--
My point in all of this is basically
that even if you're not fighting
20,000 people,
even if you're only arguing
with 20 people,
probabilistically, 19 of them
are going to be people
who have very strong opinions,
who don't necessarily bear
any negative consequences
of whichever change happens,
but have a particular world view
and have decided to stick in it,
and so the proposals
by the LGBTQ+ group
to change the Wikidata criteria
might be amazing, I might love them,
I might not love them,
I haven't read them.
But the base premise of this is...
We got the people who show up
on Wikidata right now,
and those are the representatives
of all queer people
and this is the universal rule
of what should be done
with the content of all queer people
is almost a microcosm
of the same problem.
- (moderator) We have another question.
- Yep.
(person 5) Hi.
I think there's another problem
with the consensus-based approach we have,
is that sometimes we have consensus
on really difficult issues
on how to deal with that
and [inaudible] that on Wikidata,
and nobody is reading the discussion.
Typically, the project Names,
which is a really, really old
WikiProject on Wikidata--
and names are a really,
really complicated issue in the world.
Not every people of the world
have a given name,
not every people have a family name,
not, well, you have an idea.
And there are so many
writing systems out there,
and we have, actually, a system
which was working
for many cases in the world
on how to use properties,
what items should look like,
how to link these together
and everything--
We have eight pages--
nobody is reading that,
and someone just added
Latin script family names
to a Chinese researcher.
So, we don't have the names
of these researchers
but we know for sure
that the value added was wrong.
I don't have the correct value,
but I know this one
is not the correct value.
And it's not just discussing the issue
because we have big discussions
and we have actually modeling
which is mostly working on
and even qualifier on things to deal
with more complicated cases
but people are just,
"Oh, given names suggest a property,
I will just add that."
- No.
- Yeah.
I think it's not just
how to model things,
it's really how to explain
to people the model,
and that's a technical part--
we could have tools with suggestions
and I think the constraint thing
which went live last year
is a great thing for that.
But even when we know how to model things,
it's how to make
this model known to people.
That's a bit technical issue
on how to do that better.
(moderator) So, that was just a remark.
There's no real question for you?
Or that's a question to you?
- How to do that.
- (person 5) Yeah, it's a question.
(person 5) Sorry,
even if we have the discussion,
(moderator) Yeah, sure.
(person 5) My question,
if I was not clear, is that
even when everyone is in agreement
on how to model complicated cases,
how do we make technically
the model known for project
with the scope of Wikidata,
so people are not adding
the wrong value in good faith?
Because our problem is both.
We have trouble
modeling complicated realities,
and we have trouble explaining
to users, how to follow the model
we actually have.
Yep.
I will say that
if I could solve that problem
which is to reframe it,
how to reliably and consistently
enculture new users
into having the same view
and understanding
of the project space,
then they would let me graduate
and also give me a job.
It's the second oldest problem
in internet spaces is how to do that.
The oldest problem is writing a system
that will automatically detect insults.
I will say that...
You can look back at Wikipedia,
or before that,
there was the phenomenon
of eternal September on Usenet
which was, "Oh these people keep--
AOL disks have gone everywhere
and now there's newcomers
all the time who don't know
how things work around here,
and everything is drowning
in people hitting 'Reply All.'"
Generally speaking,
the place that I would look for that
is there is a discipline called
"computer-supported cooperative work,"
and one of their big questions
is this question of onboarding,
and of like...
making the culture known to people.
But it may not be something
that is directly solvable,
or that we want to directly solve, right?
So, Susan Leigh Star
who wrote Sorting Things Out,
one of her other contributions
was generally the study of infrastructures
of which I would argue Wikidata
is definitely one,
and one of the things that she argued
was that infrastructures
make themselves known
through using them.
So like, basically the only way
to work out how a system works
is to engage with it,
and trip over, and fall flat on your face,
and learn not to fall over that way again.
And I think everyone everywhere,
including new users,
including people
coming from other projects,
wants a way of approaching this
where they don't have to fall over.
But I'm not sure if that exists,
and I think that a better place
we might look is maybe to ask
what are the consequences
of people screwing up
and how do we make screwing up
an understandable
and a more expected component
of the user experience.
(moderator) Okay thanks.
Next question.
(person 6) Thank you.
So, first, thank you very much
for your presentation to us.
Again, someone said, eye-opening.
I was looking at the specific item
on transsexualism,
and it's actually even more interesting
because I was looking
at different Wikipedias,
how they dealt with the issue.
And I just looked at three.
So, apparently, what
we are seeing on Wikidata
actually reflects pretty much
what happened to some extent
at some level on English Wikipedia,
whereas if you look
at Portuguese Wikipedia,
the actual item connects to transgender,
and on French Wikipedia
it connects to trans identity
whereas transsexualism is a redirect
in both Portuguese and French.
And I was looking at the history
of editing on the Wikidata item,
and if you look at--
there were several sort of wars
but the discussion page
is actually only one line,
but there were several conflicts
between editors,
particularly with the French
that were opposing
the use of transsexualism.
If you look at the names of the items
on each language,
the only one on which
you don't have transsexualism
is French for trans identity,
and then someone came,
and did what you said about
it's the opposite [inaudible],
trans identity,
and then there is a different item that--
Oh yeah.
(person 6) So, it's a complete
global fight over...
basically it's reverberating conflicts
that are apparently also
the manifestations of conflicts
that happen on each Wikipedia.
Yes, that also reflect conflicts
in local cultures,
and in different parts
of the world, yeah...
And I'd argue that, I mean,
I'm British so I have a tendency
to say, "Wait, fighting with the French?
Yes, please!"
(laughter)
But I'd say there's almost something
more fundamental than that,
and you can make an argument
in the other direction.
I can, as a trans person, make an argument
in the other direction and say,
"Actually, it's the French
and Portuguese who have it wrong."
Because the actual question is:
is the entry transsexualism about
the medical classification,
or the state of being,
or the historic medical classification,
or the historic term
for the state of being,
or are these different entries,
or the same entry?
When are things distinct enough
to be different objects,
and how do we negotiate that fight
between people who think
that the medical status
and the identity are the same thing,
or different things?
But yeah, there is no easy answer
but yeah, I suspect
if you look at a lot of these examples,
and if you look
at a lot of controversies,
generally on Wikidata
what you're going to see is
these fights over...
These almost negotiations
are the local community norms,
and beyond that are the cultural norms.
Which is a problem because again,
when we're talking about marginalized
or minority groups,
we would expect them to also
be marginalized within Wiki communities,
and also within Wikidata,
and so Wikidata is sort of...
building on these
preexisting prioritizations
of whose knowledge matters,
and under what circumstances
and in what form.
(person 7) I wanted to touch
on something you mentioned.
Everything is complex
and I think modeling it right,
getting it right on Wikidata
is not the sum of the issue.
As you said, Wikidata is infrastructure,
and as [Hermione] said,
we have gotten it right perhaps
in some things, in some other topics,
and still can't
actually practice it right.
Yep.
(person 7) So I want to suggest that
this is a prevalent condition
of the human race.
And however well we model something,
even if we model gender
ten times more complexly
than we do today,
most SPARQL queries involving gender
would not bother
- with the qualifiers, right?
- Yeah.
And would still generate very,
very flattened, very simplified results.
Google's use of our data
in the infamous Google infoboxes
will also flatten the data
and ignore qualifiers.
That is not going to change.
Wikidata will continue to be used
in simplistic ways.
Indeed, the majority of use,
probably, will be that simplistic thing.
My point is, it's probably not fixable
and we shouldn't stop trying.
I mean we should try to get it right
and understand that a lot of the use is,
despite our best efforts,
going to be simplistic and wrong.
Yep. I would agree with that.
I guess I would say that
you know,
it's not about like,
my issue here is not about
it being you know,
there is one true
incredibly complex answer.
At some point I just gave up
even in my thesis which is about
transness and technology
of defining transness.
I just gave up.
And I instead took what is referred to
as a pragmatist view,
which is basically that
it is whatever the people
in the situation that you're studying
believe it to be,
and however they construct
the world as if it were,
and what I'm getting at with this
is not that there is
some universal definition
of anything which,
if sufficiently complicated,
would be enough,
but instead that I think
that the scale is the problem,
and the universalism is the problem.
Maybe we should keep trying,
or maybe we should stop.
Maybe we should instead say
that, again,
there should be a Wikibase install
in every self-defined community
that wants it and they can define things,
and articulate things
to their own satisfaction.
But then we end up in more political
and fraught debates of reformist
versus radical actions,
and how you open a box
with a crowbar that's already inside it,
and I end up quoting Foucault for an hour,
and everyone gets sad.
Including me because I hate Foucault.
So this might be a discussion
for elsewhere.
But generally agreed, I just--
I would raise questions about
whether we should keep trying
for a better form of universalism,
or whether the problem
is that universalism.
I'm guessing we have
time for one more? Yeah.
(person 8) This is a short question,
possibly complex answer.
One of the most popular
and used properties is
"sex or gender" on Wikidata.
Could you speak to whether you find
that merging useful,
productive, problematic?
Sure, I mean I think it's always
going to be reductive
cause it's a merging.
But I also think
that it is deeply tiresome
in a way that's kind of interesting
insofar as it reveals
the limitations of Wikidata,
though Wikidata claims to be building
towards this like big objective
set of knowledge,
but ultimately kind of
smushed these things together
because I mean they haven't asked
most people who have entries
what their gender is,
and/or what their sex is,
and so they just merge them
so that inference is easier.
But generally speaking, yeah,
I say that the merging
of the two together
is reductive and dangerous
but...
Again it's not...
There is no good way of doing it.
I think this is a particularly bad way
of treating them
as interchangeable things,
and treating them
as forever-linked things,
but I can't suggest a better way
that remains--
that continues to have Wikidata
even tracking this information
or the information contained
in that at all.
(moderator) Okay.
I think we have to conclude here.
I still saw some raised hands
so hopefully you'll be around.
Yeah. I am a grad student.
I have functionally no life, so...
(laughter)
(moderator) Perfect. Okay.
So please come and talk.
Thank you very much.
(applause)