Hello. Just to check. Can everyone hear me? Grand. I've never understood why that's such a phenomenon when people give talks because if you can't, what are you meant to say? (laughter) But yes, so as said, I'm Os. I'm a PhD student at the University of Washington, where, according to the slide, I study "Gender, Infrastructure and (Counter)Power." I'd ask you all to do me the indulgence of pretending that that's some very explicit, nuanced, thoughtful, academic description and not just what I write as a catch-all, because I kind of study a thousand different things and fitting them all into a few words is hard. But most of the things I study are around how systems of knowledge enforce particular ideas of how the world works, and particular relationships of power with a specific focus on gender. I'm also an ex-Wikipedian. I spent 15 years as an editor which is maybe where my interest in the nature of knowledge started, and I really can't express how happy I was to be invited and how glad I am to be here with all of you, but particularly James Forrester who is probably the only person qualified to countersign my passport renewal application, cause it's running out soon and I've been trying to work out... (laughter) You move to Seattle. Everything is great. Then you're like, "Oh, the UK government requires me to find an ex-priest, civil servant, or member of parliament, who's known me for at least 2 years and who I can ship paperwork to." That sounds plausible. (laughter) Anyway, but... So I'm here as someone who has spent a lot of time of... a number of years-- which I don't like to think about because it makes me feel incredibly old-- wrestling with the nature of knowledge and the idea of knowledge-- to talk to you about what Wikidata looks like to someone from my background and with my research interests. 
And I'm not going to spend much time on the story of Wikidata itself, because if you're here, having spent 24 hours having it brain dumped into you, you're familiar with it. It's a big semantic data store that aims to provide machine-readable knowledge in a centralized way. And what this looks like is a series of items with associated properties or statements. So the item for "apple" has the property "fruit." I mean, probably. It's a Wiki so there's probably a long-running edit war of whether an apple is a fruit, and there's 50 people running 300 accounts between them, and it's been going for years, and at this point, if you mention the word apple on Wikidata, you're preemptively banned as someone who, you know, is secretly a sock puppet and running an account on one or another side of this. So as a consequence, it's also a classification system, right? A way of sorting and organizing the world. So, objects or people or concepts are classified as worth having a Wikidata entry or not. A fruit or not. And in each case a series of criteria apply to determine the properties that an object should have, and the values of these properties and how the objects all relate to each other. So Wikidata is really an attempt to build a universal classification system. And classification systems have been studied pretty extensively. One prominent work which I'd really recommend people read if they're interested in this stuff is Sorting Things Out, which is a book by Geoff Bowker and Susan Leigh Star. And they found that in an ideal universe, a classification system, be it universal or over a particular domain, has three attributes. The first is it operates on consistent and unique principles. So, there's a consistent pattern of what should be in each category and for what reasons. The second is all the categories are mutually exclusive. And the third is that the system is complete. It contains total coverage of what it describes.
And this doesn't mean it has to have every single object that fits into the system. It just means that in the situation where it lacks an object and that object then shows up, there should be a consistent mechanism to work out whether it should be added or not, and how it should be described and so on, and so forth. There is one small problem with this which is that: "No real-world working classification system that we have looked at meets these simple requirements and we doubt that any ever could." Or to put it another way, all classification systems fail. All classification systems have gaps and exceptions. And obviously, the same is true for all systems, full stop. Anyone who has ever coded or simply worked in an environment, or studied in an environment, or lived in the world knows that we've yet to design a single thing that we've thought all the way through. The problem is that when we take a system, classification or otherwise, and put it out into the world and give it power and authority, and integrate it into other systems that already have power and authority, there are consequences for what happens when the system inevitably fails, for how it reinforces or undermines existing relationships of power, for how it hurts people. A universal classification system is, in other words, not merely doomed to failure, it's also doomed to hurt people. And the way that it is structured is ultimately a series of ethical and political choices as a result-- Who do you want to hurt? How much? What should be done when people are injured? And those choices have real consequences. And so making these choices often involves confronting the fact that there's very rarely a single simple machine-readable interpretation of something that's true for all people throughout all history. Anything in the universe has multiple meanings, and symbolisms, and nuances to different people in different contexts at different times.
But designing a classification system and implementing it, designing a system that can make a claim to having consistent principles, and covering everything it discusses, inevitably involves cutting down on this complexity and making decisions about what "the" meaning of a thing is going to be, or what array of possible meanings should be presented and in what sequence. And as a result, it involves silencing voices or rendering voices louder. Again, this has consequences. And to see what I mean about this complexity and context, and reduction, and the consequences of it, I'd like to step through some examples from Wikidata itself. The ones I've chosen are all gender-related because again, gender is both professionally and personally sort of a key interest. So, the first that I'll start with is transsexualism which is described as a "condition in which an individual identifies with a gender inconsistent or not culturally associated with their biological sex." Fairly unobjectionable and-- wait, no, it's classified as a disease, and a psychiatric disease at that. Now, I know what you're thinking, which is this is appalling but actually it's not as simple as either of these statements being true or false, right? They're in a category of sort of, "true, except." So, take "transsexualism is an instance of disease," right? Technically, this is true, in so far as transsexualism is the name of an entry under the International Classification of Diseases, version 10. But we should add some complexity and nuance to that. So, the ICD is a classification of literally everything in the world that you could have that was in any way involved at all in someone's injury or death. It is in fact illegal to die of something that is not listed in the ICD. (laughter) So it contains kind of a lot of things, and transsexualism is listed in it so we classify it as a disease because it's in a classification of diseases.
So, here are some other things that the ICD also lists as diseases that it has specific entries for. PA80: Shot by accident. PA40.0: Fell off a boat, drowned. (laughter) PA41.1: Fell off a boat, damaged the boat, and drowned. (laughter) PA40.1: Fell off the boat, didn't damage the boat, didn't drown, still died of something. (laughter) And finally, QD50: Being poor. (laughter) So, if any of you have ever fallen off a boat, I'm very sorry but you have a disease which you should really talk to a doctor about. What class of doctor, I'm not sure. It might be a psychiatrist. Who knows? So you know that's disease, right? What about health specialty: psychiatry? Well, that's also true, sort of. So, psychiatrists are the people who diagnose the presence of gender dysphoria, a disconnect between one's sense of gender and one's sort of like, embodied or perceived gender. But again, context. For example, saying psychiatrists diagnose it ignores the fact that none of the treatments are psychiatric. You might as well list the specialties as specialization in hormones or plastic surgery, or being a personal shopper. All of these also have some role in people's life trajectories. They are not listed. One other useful potential factoid by the way, is that the ICD 10 is actually the old International Classification of Diseases, and the ICD 11 no longer lists transsexualism at all, much less as a disease. But my point here is not that Wikidata sometimes contains outdated information or sometimes contains false information, it's that the statements that are constructed from that information as a consequence of what they leave out and what the results are, drop things and add risk. So, one way of structuring the information that that entry contained is: "transsexualism is a psychiatric disease." And this leaves out a lot of complexity, some of which we've discussed. But the greater issue is how it interlocks and resonates with existing narratives, and existing information. 
For example, the idea that transsexualism is a disease. Does anyone know why the ICD stopped listing it as a disease? Well, two reasons. First is because calling being trans a disease is not accurate. It does not meet the definition of being a disease. In fact, the only reason that anything to do with being trans is still in the ICD is not out of some objective like, you know, examination of biology or psychiatry but instead pure pragmatism. That if you stop listing it, then insurance companies in places like the U.S. would stop covering medical care that is associated with being trans. And the second is that the stigma associated with having something classified as a disease is substantive, and when you list transsexualism as a disease and a psychiatric one at that, you tap into really long-standing assumptions and false beliefs about trans people. Assumptions and beliefs that have a lot of power. Like, if it's a disease there must be something wrong with trans people, something that people should fix. And if it's a psychiatric condition then trans people should be therapized out of being trans. In other words, whatever the raw truth or falseness of the statement, stripping out its complexity and contextuality lets people fit it into their own notions of what it means. And that doesn't end in a neutral objective classification system, it ends in things like conversion therapy, and it being legal to beat people to death for being trans when you find out that they're trans after you slept with them, because, you know, something's wrong with them. Like why would you be considered reasonable to have done this? So a more accurate framing of this might be this, which is hard to fit into Wikidata. And because we can't fit that into Wikidata, and we strip it down, and we lose all that complexity, we open up the possibility to, again, reinforce these really dangerous notions. So, let's look at another example, also from gender, and that is the entry for non-binary.
So, as Wikidata informs us, non-binary is a range of genders that are neither exclusively man nor woman. And there are some critiques I have of the "also known as" section, but that's not the biggest issue here. No, the biggest issue here is that at no point does this entire page make any reference to trans people. So, if you go to the entry for transgender woman, it says, "opposite to transgender man." And if you go to the entry for transgender man it says, "opposite to transgender woman." If you go to this entry, it has absolutely no reference to trans people whatsoever. There is this complete disconnect and distinction between non-binary people and trans people. And this might be, seems to be, a pedantic thing to be concerned about but it's actually a really useful example for a couple of reasons. The first is that how non-binary people relates to being trans is really hotly debated. Individual non-binary people may or may not identify as trans. As a consequence, it's really difficult to make big categorical judgements about a class of people. Other people would say that non-binary people aren't trans, for whatever reason, or that non-binary people are trans. You know, you have to make a decision at some point. How are you going to categorize this entry? What attributes are you going to associate it with? But it's hard to do that in Wikidata when by necessity the structure of the platform is so categorical and so fixed, that you can't really say like, for some people these things are related and for others they aren't, and it's actually very politically charged but you should think about it. There's no objective fact to fall back on. It's very contextual and complex, and disputed. So, how do you fit this in? Anyone? But, this reductiveness isn't just a question of, "Oh well, we haven't fit all the information in so I guess it's not perfect." Again, it fits into preexisting discourses and the preexisting world, and has the potential to cause very real harms. 
There's this very long history of non-binary people not being considered trans, going back to, in fact, the foundational, sort of medical and academic, and authoritative works on what being trans is and how trans people should be treated. And what this has resulted in is non-binary people being cut out of access to resources-- medical care, community membership, any kind of support. In fact until 2013, being non-binary was not a thing you could possibly be while still getting access to transition-related medical treatment. If you were, and you wanted access you would have to go to your doctor and consistently lie, and hopefully get away with it. So, if you want that diagnosis to happen so that your health insurance will cover things or that your national health service will cover things, you could either be a man or a woman, and nothing else. And right now there's a ton of backlash to non-binary existences from people who are thinking that we are a threat, or something new and novel when we've been around for just as long as any other kind of trans person and just not discussed. And again, the consequence of this is that this silence is reinforcing those preexisting ideas that being non-binary has nothing to do with being trans whatsoever, and it creates and reinforces discourses that cut people off from care, and cut people off from community. And finally, before I stop harping on things about gender quite so much, the hijra. So, according to Wikidata the hijra are the third gender of South Asian cultures and a subclass of non-binary. Now, here's the thing. Yes, hijra people fall outside a simple man-woman binary, but pretty much zero hijra people would ever define themselves as non-binary, because it just doesn't make any sense. In a western context, non-binary people are, by definition, not man or woman but as a consequence not trans man or trans woman.
Hijra includes trans women, and also includes all intersex people, all sterile people, and a large number of gay people while not including trans men or people who are non-binary, and were assigned female at birth. All of this is really complex and there are literally books written on the framework of gender and how that fits into it. But the point is there's not a simple mapping of western gender notions to gender notions in the rest of the world. Categorizing hijra people as a subset of non-binary people ignores the fact that most hijra people do not see themselves that way, would not see themselves that way, and that the definitions of hijra and non-binary are completely incompatible. But again this has the potential to cause harm. Because the fact of the matter is that western notions of gender are pretty regularly and over a long period of time exported to the rest of the world often by violence. We have these information systems. We have classification systems. We have standards. We have, historically and currently, wars, all of which are orientated around this idea of the western way of doing things is the only good way or is the best way and the standard way, and everyone should conform. And so when we have these big projects which are trying to fit the world into a very westernized idea of knowledge, because they have to, because that’s how classification systems do universally work-- everything has to fit into one consistent scheme-- it is perpetuating that kind of violence. So, you could respond to my concerns and examples, and rambles with kind of a lot. One line to take would be, "Why does this matter?" Why does Wikidata participating and validating or invalidating particular discourses have an impact on the world? And the first answer is it actually doesn't matter if it matters. It matters that you acknowledge it. So, right now the default framing of Wikidata is we're just collecting all of the knowledge in a machine-readable form, but you're not.
You're also making decisions about what should be included and what shouldn't, and how knowledge should be represented. What complexity is worth representing and what isn't. And those are ethical and political choices, and framing the project as simply the result of a million anonymous, and interchangeable monkeys with an equivalent number of typewriters makes it impossible for us to have conversations about it. Wikidata's organizers and users and funders must understand that they're fundamentally making charged decisions that are not neutral or objective at all, and that is not bad but dangerous. And so, okay, having accepted that these are ethical and political decisions, you could say, "Well, if people want their takes on things included, they should just contribute." And marginalized communities do contribute a lot, right? There's a long history of queer communities, particularly, being very early adopters of technology. And so people could just contribute to Wikidata. Like Hijra people could create accounts and start arguing that actually the entry shouldn't be a subset of non-binary and so, and so forth. The problem is that this is unlikely to help because they're the minority, because many of the voices and perspectives that are currently silenced, in the political and ethical decisions being made, are those of minorities. So, I did some number crunching on this. Wikidata has 20,000 active editors from a human population of seven billion give or take, unless you believe that maths is a lie and the world governments, controlled by lizards under the Arctic, is making everything up. And there are approximately... Um hmm? (person 1) You mean they're not? (laughter) Look, I'll be honest. If living in the U.S. for the last five years has taught me anything, it's that any government assemblage large enough to try and control a big chunk of the human population would in no way be consistently competent enough to actually cover it up. 
(laughter) Like we would have found out in three months-- and it wouldn't even have been because of some plucky investigative reporter-- it would have been because one of the lizards forgot to put on their human suit one day and accidentally went out to the shops for a pint of milk (laughter) and got caught in a TikTok video. (laughter) So Wikidata has 20,000 active editors-- of whom we will assume none are lizards in human suits or otherwise-- from a human population of seven billion, and there are approximately one million Hijra people in the world. So if we assume a rate of equal participation-- setting aside the extreme poverty a lot of Hijra people live in and the corresponding impact on access to things like reliable internet coverage-- then the combined efforts of 20,000 Wikidata editors would have to be overwhelmed by 2.85 people. That doesn't seem particularly plausible. Okay, so then you might say, "Well, what if we just have other Wikibase instances? Isn't that the whole thing we're building towards? You can set up your own Wikibase with your own perspectives and your own decisions about how to classify things, and what to prioritize, and what not to. Make your own site with your own standard for what constitutes knowledge and what information is important." And people could do precisely that. But the problem is that Wikidata has a lot of heft behind it which is why the decisions that Wikidata makes have so much import. There's the fact that it already exists. It has a first-mover advantage. There's the Wikimedia brand. There's the funding from places like Google. There's the relationships with other institutions. When the strategic plan for Wikidata calls for engagement and integration with museums, that doesn't just result in getting more data for Wikidata.
That also results in Wikidata and the decisions its users make permeating more of reality, becoming more of a standard of how data systems work, and more of a place that is drawn from to populate other spaces. So I keep using this line, "Not bad, but dangerous" to describe classification systems or to describe Wikidata, and I want to reinforce that I don't think that Wikidata is inherently bad. But I do think that its dangers are vast and are not being properly attended to. Just by looking at gender, we saw three examples, which I pulled very, very quickly, of situations where even setting aside the sort of objective "accuracy" of the information that a Wikidata entry might contain, the information it chooses to contain and chooses to prioritize perpetuates or silences particular discourses, and particular ideas that have weight in the rest of the world, that do harm in the rest of the world. And I picked those examples not because they're surprising in any way, or not because they're unique, but simply to point out that if I could find that many problems with resonances in wider violent systems in such a tiny sliver of content, imagine how many others are lurking out there. And the goal of Wikidata, the goal of universal classification if these dangers are not attended to could ultimately result, or will ultimately result, not in simple like neutral classification, but imposition. In saying this is the way the world works and if you don't like it then congrats, you should try and fit into it. And I really wish that I had a sort of simple answer for this. I don't. It's one of the advantages of switching to academia instead of working in an engineering department. You can just show up places and go, "Everything is really complicated." Someone should do something about that. Could I have a grant please? 
(laughter) But all I can really do is point you back to Bowker and Star's conclusion, which is that this isn't ultimately about Wikidata, this isn't a problem with Wikidata; this is that the class of systems that Wikidata is a part of has never been done safely and there is no reason to think it could be. And so my call is ultimately not for a particular change, or for all of you to just go home and give up. It's for the project collectively and for you all individually to determine how comfortable you are with participating and building a system that makes a claim to universalism, that makes a claim to neutrality and truth in data, when we know that that's neither possible nor harmless when it fails. And if you are not comfortable with that, working to articulate what other ways of doing this there might be. And these could look like, for example, giving primacy to those local Wikibase installs. Saying that ultimately we need to give individual communities and individual contexts and spaces primacy in defining what matters to them, and how they wish to be defined. And the conversation about which perspective should be included in some central repository should wait until we have the full range of perspectives. So, that's everything from me. Thank you, everyone, for sitting through this. I think we have about 20 to 25 minutes-- (moderator) 25 minutes for questions, so, please, plentiful. Thank you very much. (applause) (person 2) Thank you so much for this wonderful presentation about the problems inherent in classification systems. One of the examples you had is really cool from a mathematical point of view, when you were showing that transgender male is the opposite of transgender female-- or transgender female is the opposite of transgender male and the opposite of cisgendered female. That makes cisgendered female be the same as transgender male, because opposite of is the same-- if A is opposite of B and C is the opposite of B, A and C are the same.
So actually that's a place where it should be different from and not opposite of, and that involves a lot of mathematical issues when we go to actually ask queries of the database, so it's really important that you've pointed out things like that. Yeah, another example of that which I thought was fun was transsexualism was defined in part further down-- which I wanted to include, but couldn't find a way of fitting it into the flow-- as the same as sex-reassignment surgery. Which is unintentionally hilarious because a diagnosis of transsexualism was historically a prerequisite for sex-reassignment surgery. So it's not so much a chicken and an egg problem as the chicken is carrying the egg. (laughter) Yeah. So yeah, these-- When we look at Wikidata and how much it uses mathematical, or pseudo-mathematical language of, like, opposite of, distinct from, in the set of... Yeah, reality is more complex than the mathematics we have to represent it. I don't have a smart answer there except to say that I used to be a quantitative researcher and I left, and there is a reason for this. (moderator) Next question. Who raised hands? I see a hand over there? (person 3) Hello. First of all. Thank you for this presentation. It was very eye-opening. I want to tell you, but first of all-- there's a Wikimedia-- I don't know if you know about the community LGBT+ user group. So it's a user group, and they have this mailing list, and they're discussing actually the issue of sex and gender in Wikidata, and there are some proposals made by LGBT+ people to improve it. So, but it's not fully done yet. So, there are some plans, people working on it. It would be great if you want to chime in there and give your opinion because I'm pretty sure you're more expert than most of us.
But I want to give a critique of this thing that you said about hijra people that said out of 20,000 editors of Wikidata, assuming 2.8 of them will be hijra and they need to overcome all of these 20,000 people but this is not true. Lots of people, I say assume 20,000 people are just unaware of an issue. They are not bigots or they are not going to actively not let people do this. And lots of them would help if you tell them. Like, as you [inaudible] that edits Wikidata, I have no idea about this issue and if I knew it I would have fixed it. So, yeah. Yeah. I totally get what you mean. And I want to be clear that I'm not saying there are 20,000 people, many of whom are in this room, although only a tiny percentage who are vehement bigots and cultural imperialists. Instead what I'm getting at is the fact that the consensus model, and discussion-based model that the WikiProjects are based on has a couple of flaws, and one of the big flaws is that it assumes that all of the voices worth representing are there and are represented somewhat proportionately. Consensus started off as a model in Quaker communities where literally everyone impacted by a decision was in the room, because everyone impacted by a decision could fit in the room. And so my point with this 2.85 number is not to say you have to argue with the entire population of Wikidata every time you want to make any decision, but instead to say that the consensus model and the majoritarian model of what knowledge should be represented runs fundamentally into a problem when the people who are being underrepresented are underrepresented. For another example, and a real one, Myanmar as a country. The English Wikipedia claims that it was called Burma until a couple of years ago. And the reasoning for this was very simple. The BBC didn't like calling it Myanmar and a load of editors-- (person 4) [inaudible] completely wrong. Sorry. (laughter) You run into this issue of like... 
I know it's not the precise thing, but it's just... - (person 4) [inaudible] it's actually-- - (moderator) I give you the mic, sir. - Yes? - (person 4) I'm sorry, that's just incredibly playing being ignorant and that... - Okay. Go for it. - (person 4) That's an absolutely terrible, terrible mischaracterization of the political situation in Myanmar. Okay. Go for it. (person 4) Anyways, so basically what it is is that the country-- in the Burmese language the country can be referred to as Myanma or Bama. Yep. Myanma tends to be a more formal register and Bama tends to be a little bit more informal register but both are acceptable terms for the country. The term Burma came obviously from the term Bama, but what happened was there is no official... The country was officially referred to, in English, as Burma up until 1988-- 1989, excuse me, when the military government of the country basically decided, the military junta of the country decided that the country should be referred to as Myanma. Ostensibly, this was as an attempt to make the country name more acceptable to minorities within the country. However, this is a bit of historical revisionism because Myanma and Bama specifically refer to the majority ethnicity in the country. So, it was basically the government of Burma at the time-- trying to make the people equivalent to the country, therefore implicitly saying-- (person 4) Almost the opposite, but in a really weird way. They basically declared that Bama was in reference to the ethnicity and Myanma was in reference to the country, when historically they both represent ethnicity and the country. That makes sense. (person 4) But what happened was that democracy advocates within the country believed that the military junta did not have the power to be able to change the name of the country in any language, because they were not empowered by the people of the country and were explicitly a military junta, that they...
therefore the country should continue to be referred to as Burma in English. Because of the fact that essentially to call it Myanmar is essentially to say the government of Burma and Myanmar at the time was legitimate. After the fall of the-- well not fall, but after like the semi return of civilian government in 2014, this question came up, "Okay, should we call this country Burma or Myanmar in English?" and essentially, the de facto leader of the country, Aung San Suu Kyi, said that there's nothing in the Burmese constitution that says you know, what you should call it in English so call it whatever you want. I mean the name of the country is officially the Union of Myanma in Burmese, but as far as in English you can call it whatever you want. But generally before the return of the civilian government in Burma, to refer to it as Myanmar was essentially to legitimize the military government. And so therefore, to call it Burma was generally considered to be a specific political act to not give that government legitimacy. Yeah. So, I'm not saying that that isn't a rationale for it. I'm saying that on the English Wikipedia specifically, the page went through seven requested move discussions over four years and a mediation cabal decision, and an attempted structured mediation, and a review of one of the closures of the move discussion, and that when you look at the discussions, most of the sort of argument back and forth is not about the nuanced political situation of the country but it's instead about what is the common name in media sources and what do different institutions call it. And that when you look at the discussion, you can see a clear point where pretty much every news organization that isn't the BBC in the English language, that's considered like a major western news source has switched their language sources, and the debate essentially becomes a debate of whether we should listen to the Wall Street Journal or the BBC.
So the point I'm making is not about the specific politics of the situation, but instead the fact that it's really easy for those decisions to actually become almost a proxy dispute of how much do we love the BBC, and that when you look at the discussions you see this really nice case study in the issues of having those conversations and having those nuanced, and often insider perspectives when most of the discussions are centered around how much we love the BBC and are coming from people who are outside the context. So, it's not-- My point in all of this is basically that even if you're not fighting 20,000 people, even if you're only arguing with 20 people, probabilistically, 19 of them are going to be people who have very strong opinions, who don't necessarily bear any negative consequences of whichever change happens, but have a particular world view and have decided to stick in it, and so the proposals by the LGBTQ+ group to change the Wikidata criteria might be amazing, I might love them, I might not love them, I haven't read them. But the base premise of this is... We got the people who show up on Wikidata right now, and those are the representatives of all queer people and this is the universal rule of what should be done with the content of all queer people is almost a microcosm of the same problem. - (moderator) We have another question. - Yep. (person 5) Hi. I think there's another problem with the consensus-based approach we have, is that sometimes we have consensus on really difficult issues on how to deal with that and [inaudible] that on Wikidata, and nobody is reading the discussion. Typically, the project Names, which is a really, really old WikiProject on Wikidata-- and names are a really, really complicated issue in the world. Not every people of the world have a given name, not every people have a family name, not, well, you have an idea. 
And there are so many writing systems out there, and we have, actually, a system which was working for many cases in the world on how to use properties, what items should look like, how to link these together and everything-- We have eight pages-- nobody is reading that, and someone just added Latin script family names to a Chinese researcher. So, we don't have the names of these researchers but we know for sure that the value added was wrong. I don't have the correct value, but I know this one is not the correct value. And it's not just discussing the issue because we have big discussions and we have actually modeling which is mostly working and even qualifiers on things to deal with more complicated cases but people are just, "Oh, given names suggest a property, I will just add that." - No. - Yeah. I think it's not just how to model things, it's really how to explain the model to people, and that's a technical part-- we could have tools with suggestions and I think the constraint thing which went live last year is a great thing for that. But even when we know how to model things, it's how to make this model known to people. That's a bit of a technical issue, how to do that better. (moderator) So, that was just a remark. There's no real question for you? Or that's a question to you? - How to do that. - (person 5) Yeah, it's a question. (person 5): Sorry, even if we have the discussion, (moderator) Yeah, sure. (person 5) My question, if I was not clear, is that even when everyone is in agreement on how to model complicated cases, how do we technically make the model known for a project with the scope of Wikidata, so people are not adding the wrong value in good faith? Because our problem is both. We have trouble modeling complicated realities, and we have trouble explaining to users how to follow the model we actually have. Yep. 
I will say that if I could solve that problem which is, to reframe it, how to reliably and consistently enculture new users into having the same view and understanding of the project space, then they would let me graduate and also give me a job. It's the second oldest problem in internet spaces, how to do that. The oldest problem is writing a system that will automatically detect insults. I will say that... You can look back at Wikipedia, or before that, there was the phenomenon of Eternal September on Usenet which was, "Oh these people keep-- AOL disks have gone everywhere and now there's newcomers all the time who don't know how things work around here, and everything is drowning in people hitting 'Reply All.'" Generally speaking, the place that I would look for that is there is a discipline called "computer-supported cooperative work," and one of their big questions is this question of onboarding, and of like... making the culture known to people. But it may not be something that is directly solvable, or that we want to directly solve, right? So, Susan Leigh Star who wrote Sorting Things Out, one of her other contributions was generally the study of infrastructures of which I would argue Wikidata is definitely one, and one of the things that she argued was that infrastructures make themselves known through using them. So like, basically the only way to work out how a system works is to engage with it, and trip over, and fall flat on your face, and learn not to fall over that way again. And I think everyone everywhere, including new users, including people coming from other projects, wants a way of approaching this where they don't have to fall over. But I'm not sure if that exists, and I think that a better place we might look is maybe to ask what are the consequences of people screwing up and how do we make screwing up an understandable and a more expected component of the user experience. (moderator) Okay thanks. Next question. (person 6) Thank you. 
So, first, thank you very much for your presentation to us. Again, someone said, eye-opening. I was looking at the specific item on transsexualism, and it's actually even more interesting because I was looking at different Wikipedias, how they dealt with the issue. And I just look at three. So, apparently, what we are seeing on Wikidata actually reflects pretty much what happened to some extent at some level on English Wikipedia, whereas if you look at Portuguese Wikipedia, the actual item connects to transgender, and on French Wikipedia it connects to trans identity whereas transsexualism is a redirect in both Portuguese and French. And I was looking at the history of editing on the Wikidata item, and if you look at-- there were several sort of wars but the discussion page is actually only one line, but there were several conflicts between editors, particularly with the French that were opposing the use of transsexualism. If you look at the names of the items on each language, the only one on which you don't have transsexualism is French for trans identity, and then someone came, and did what you said about it's the opposite [inaudible], trans identity, and then there is a different item that-- Oh yeah. (person 6) So, it's a complete global fight over... basically it's reverberating conflicts that are apparently also the manifestations of conflicts that happen on each Wikipedia. Yes, that also reflect conflicts in local cultures, and in different parts of the world, yeah... And I'd argue that, I mean, I'm British so I have a tendency to say, "Wait, fighting with the French?" "Yes, Please!" (laughter) But I'd say there's almost something more fundamental than that, and you can make an argument in the other direction. I can, as a trans person, make an argument in the other direction and say, "Actually, it's the French and Portuguese who have it wrong." 
Because the actual question is: is the entry transsexualism about the medical classification, or the state of being, or the historic medical classification, or the historic term for the state of being, or are these different entries, or the same entries? When are things distinct enough to be different objects, and how do we negotiate that fight between people who think that the medical status and the identity are the same thing, or different things? But yeah, there is no easy answer but yeah, I suspect if you look at a lot of these examples, and if you look at a lot of controversies, generally on Wikidata what you're going to see is these fights over... These almost negotiations are the local community norms, and beyond that are the cultural norms. Which is a problem because again, when we're talking about marginalized or minority groups, we would expect them to also be marginalized within Wiki communities, and also within Wikidata, and so Wikidata is sort of... building on these preexisting prioritizations of whose knowledge matters, and under what circumstances and in what form. (person 7): I wanted to touch on something you mentioned. Everything is complex and I think modeling it right, getting it right on Wikidata is not the sum of the issue. As you said, Wikidata is infrastructure, and as [Hermione] said, we have gotten it right perhaps in some things, in some other topics, and still can't actually practice it right. Yep. (person 7): So I want to suggest that this is a prevalent condition of the human race. And however well we model something, even if we model gender ten times more complexly than we do today, most SPARQL queries involving gender would not bother - with the qualifiers, right? - Yeah. And would still generate very, very flattened, very simplified results. Google's use of our data in the infamous Google infoboxes will also flatten the data and ignore qualifiers. That is not going to change. Wikidata will continue to be used in simplistic ways. 
Indeed, the majority of use, probably, will be that simplistic thing. My point is, it's probably not fixable and we shouldn't stop trying. I mean we should try to get it right and understand that a lot of the use is, despite our best efforts, going to be simplistic and wrong. Yep. I would agree with that. I guess I would say that you know, it's not about like, my issue here is not about it being you know, there is one true incredibly complex answer. At some point I just gave up even in my thesis which is about transness and technology of defining transness. I just gave up. And I instead took what is referred to as a pragmatist view, which is basically that it is whatever the people in the situation that you're studying believe it to be, and however they construct the world as if it were, and what I'm getting at with this is not that there is some universal definition of anything which, if sufficiently complicated, would be enough, but instead that I think that the scale is the problem, and the universalism is the problem. Maybe we should keep trying, or maybe we should stop. Maybe we should instead say that, again, there should be a Wikibase install in every self-defined community that wants it and they can define things, and articulate things to their own satisfaction. But then we end up in more political and fraught debates of reformist versus radical actions, and how you open a box with a crowbar that's already inside it, and I end up quoting Foucault for an hour, and everyone gets sad. Including me because I hate Foucault. So this might be a discussion for elsewhere. But generally agreed, I just-- I would raise questions about whether we should keep trying for a better form of universalism, or whether the problem is that universalism. I'm guessing we have time for one more? Yeah. (person 8): This is a short question, possibly complex answer. One of the most popular and used properties is sex or gender on Wikidata. 
Could you speak to whether you find that merging useful, productive, problematic? Sure, I mean I think it's always going to be reductive cause it's a merging. But I also think that it is deeply tiresome in a way that's kind of interesting insofar as it reveals the limitations of Wikidata, though Wikidata claims to be building towards this like big objective set of knowledge, but ultimately kind of smushed these things together because I mean they haven't asked most people who have entries what their gender is, and/or what their sex is, and so they just merge them so that inference is easier. But generally speaking, yeah, I say that the merging of the two together is reductive and dangerous but... Again it's not... There is no good way of doing it. I think this is a particularly bad way of treating them as interchangeable things, and treating them as forever-linked things, but I can't suggest a better way that remains-- that continues to have Wikidata even tracking this information or the information contained in that at all. (moderator): Okay. I think we have to conclude here. I still saw some raised hands so hopefully you'll be around. Yeah. I am a grad student. I have functionally no life, so... (laughter) (moderator): Perfect. Okay. So please come and talk. Thank you very much. (applause)