Asaf Bartov: Testing, testing. Is this heard in the room? Testing. Hello, everyone. This is a gentle introduction to Wikidata for absolute beginners. If you're an absolute beginner, if you've never heard of Wikidata, or if you've heard of Wikidata but don't quite get it, don't know what it's good for, have only used it for inter-wiki links-- if you're anywhere on this range, you're in the right place. My name is Asaf Bartov. I work for the Wikimedia Foundation, and I am a Wikidata enthusiast. So the first thing I want to say is that you are lucky. You are lucky because Wikidata already is, and is quickly becoming even more of, an important research tool for anyone who's trying to ask questions about large amounts of information. It will become more and more used across the humanities, in particular, because of the things that it's able to do, some of which we will demonstrate shortly. And you are lucky because you get to find out about it now before most of the world. So by the end of this talk, you will be a Wikidata hipster because you'll be able to say, oh yeah. I knew about Wikidata before it was cool. So before we actually visit Wikidata, I want to share two key problems that Wikidata seeks to solve and which would help us understand why it exists. The first problem is that of dated data, that is data that is out of date. And this is apparent on Wikipedia across our free knowledge encyclopedias. Data on Wikipedia is not always up to date. And the more obscure it is, the more likely it is not to be up to date. So the Polish Wikipedia may have an article about a small town in Argentina, and that article will include information about that town like population size, name of the mayor. And that information, ideally, was correct at the time the article was created on the Polish Wikipedia-- maybe translated from another wiki. But then how likely is it to be kept up to date? 
How likely is it that the Polish Wikipedia would give us the correct and latest numbers or data about the population size of that town or the mayor, right? So this is the kind of data that does go out of date, right? Every few years-- five, 10 years-- there is a census, and now there are new population figures. Now the census in Argentina will be made available in Argentina in Spanish, probably, which brings us to another component of the problem of dated data, which is there are no obvious triggers for updating the data. So the Polish Wikipedian is not sent an email by the Argentinean government saying, hey, we have a new census. There are new population numbers for you to update on Wikipedia. No such email is sent. So it's kind of hard to notice when. And of course, multiply that by all the different jurisdictions around the world. There's no easy way to notice when your data goes out of date. So that's difficult to keep up to date. And even if we were to receive some kind of indication-- oh, there's a new census in Argentina, so a whole bunch of population figures have now gone out of date. Updating it on the Polish Wikipedia and the French Wikipedia and the Indonesian Wikipedia and the Arabic Wikipedia is a whole bunch of repetitive work that a lot of different volunteers will need to do just for that one updated piece of information about Argentina. So I hope this is clear and resonates with some of your experience editing Wikipedia-- data that is out of date or that needs to be updated manually, menially, on a fairly frequent schedule across the different countries and data sources. The other-- and I think maybe more interesting-- shortcoming or problem that I want to discuss is what I call the inflexible ways of lateral queries, crosscutting queries of knowledge. So if I want an answer to the question, what countries in the world export rubber-- that's a reasonable question, right? That information is on Wikipedia. Do you agree? 
If you go to Wikipedia and read up about Brazil, about Peru, about Germany, somewhere in there-- maybe a sub-article called Economics of Brazil-- you will find the main exports of that country. And you can find out whether or not that country exports rubber. But what if I don't want to go country by country looking for the word rubber? I just want an answer. What are the countries that export rubber? Even though that information is in Wikipedia, it's hard to get at. It's hard to query. Now, you may say, well, that's what we have categories for, right? Categories are a way to cut across Wikipedia. So if someone made a category called rubber exporting countries, then you can go to that category and see a list of countries that export rubber. And if nobody has made it yet, well, you can create that category and, with a kind of one-time effort, populate that category, and you're done. Well, yes. That's still not very convenient. But also, it's still very, very limited, because what if I only want countries that export rubber and have a democratic system of government, or any other kind of additional condition that I would like to add to this? Or take a completely different example. What if I want to know which Flemish town had the most painters born in it? There's a ton of Flemish painters. Most of them were born somewhere. We could theoretically, just you know, look up all the birthplaces of all the Flemish painters and tally up the numbers and figure out what is the place where the most Flemish painters come from? I don't know the answer to that. It would be nice to be able to get that answer. Again, the data is in Wikipedia. Those birthplaces are listed in the articles about those painters. But there's no easy way to get that information. What if I want to ask, who are some painters whose father was also a painter? That's a thing that exists, right? Some painters are sons of painters. You know, Bruegel comes to mind as an obvious example. 
But there's a bunch of others, right? So who are those people? What if I want to ask that question? That's the kind of question that not only Wikipedia doesn't answer today. If you walk to your friendly university library reference desk and say, hello, I would like a list of painters whose father was also a painter, how would that librarian help you? There's no easy way to get an answer to a question like that. What if you only want a list of painters who were immigrants, painters who lived somewhere else than where they were born? There's no book. I guess maybe there is, but you know, it's not obvious that there's a ready resource that says, list of painters who are immigrants. And the librarian would probably refer you to a book on the shelf called, I don't know, The Complete Dictionary of Flemish Painters and go, look up the index, you know, and if you see a similar surname, maybe they're father and son. And kind of cobble together the answer on your own. The reason I'm comparing this to a library is to show you that this is a kind of question that is not readily satisfiable today. Now, these questions may sound contrived to you. You may say to yourself, well, you know, painters who are also sons of painters, yeah. You know, that never occurred to me as a question I might care about. But I want to invite you to consider that this kind of question, questions like that question, may well be questions you do care about. And I also want to suggest that the fact it is so nearly impossible, the fact that there's no obvious way to ask that kind of question today, is partly responsible for your not coming up with those questions, right? We tend to be limited by the possible. You know, until human flight was made possible, it did not occur to anyone to say, oh yeah, by this time next week I will be in Australia, because that was just impossible. 
But when flight is possible, there's all kinds of things that suddenly become possible, and there's all kinds of needs that arise based on the availability of resources to fulfill those needs. So many of these research questions, compound lateral cross-cutting queries, are not being asked because people have internalized the fact that there is no way to get an answer to questions like, what is the most popular first name among British politicians? I just made that up, you know? Is it John? Maybe. Maybe it's William, for whatever reason. You know, these are the kinds of questions we don't routinely ask because we know that it's like, who are you going to ask? How are you going to get an answer to that? So this problem of not having very flexible ways of querying the data that we already have-- in Wikipedia, in Wikisource, elsewhere-- is a significant limitation. So these two key problems have one solution. And that is an editable, central storage for structured and linked data on a wiki, under a free license, which is a very long way of saying Wikidata. That is Wikidata. Wikidata is an editable, central storage for structured and linked data on a wiki, under a free license. So let's take this apart and unpack it. First of all, it's a central storage. This relates to the first problem, right? If we had one place containing data like population size, we would be able to update that one place and then have all of the different Wikipedias draw the data from that one place so that we wouldn't have to manually, repetitively update it across our hundreds of projects. So having central storage makes, I hope, kind of immediate, intuitive sense. But what do I mean by structured and linked data? So structured data means that each datum, each piece-- individual piece-- of data is managed on its own, is identified and defined on its own, as distinct from Wikipedia. Wikipedia has articles. 
The article about Brazil includes a ton of data, all kinds of information, and it's presented as text, as several paragraphs-- several pages-- of text, right? Now, we do have an approximation of structured data on Wikipedia. If you've browsed Wikipedia a little, you've noticed that we often have an info box, what we call an info box on Wikipedia. That's the table on the right side if it's a left to right language, the table on the right side that has information that is easy to tabulate, right? So you know, birth date, birth place, death date, death place, nationality-- or if it's about a country, area, population, anthem, type of government, whatever you are likely to find. If it's a movie, then you know, starring, genre, box office receipts, whatever pieces of data are relevant to an article about a movie. So we do already kind of group pieces of information on Wikipedia into this kind of structured format. Those of you who have ever looked at the source, at what the wiki code under that looks like, know that it's only semi-structured. It looks neat and organized in a table, but really, it's just a bunch of text that is put there. It is not centralized. Every Wikipedia has its own copy of that data. And if I go and update the population size on Spanish Wikipedia of that Argentinean town, it does not get updated automagically on the English Wikipedia or the Arabic Wikipedia, right? So the structured data that we already have on Wikipedia is not managed centrally. The other thing about structured data is, when you have a notion of an individual piece of data, that is the cornerstone of allowing the kinds of queries that I was talking about. That is what will allow me to ask questions like, what is the Flemish town where the most painters were born, or what are the world's largest cities that have a female mayor? I could come up with other examples all day long, right? 
These are all questions that you can ask, once you break down your data into individual pieces, each of which is-- you're able to refer to each of those programmatically. The computer can identify, isolate, and calculate based on each of those pieces of data. So that's why the structure is important. Now, Wikidata is also a linked data repository. What does it mean that the data is linked? Well, it means that a single piece of data can point at, can link to another whole bag of data. So if we are describing, for example, a person, and we record the single piece of data that this person was born in Salem, Massachusetts, that single piece of data links to the item about Salem, Massachusetts because, of course, we know a lot of things about that place, Salem, Massachusetts. So it's not just the text-- S-A-L-E-M. It's not just, that's where they were born. But it's a link to all the data that we have about Salem, Massachusetts. If we say someone's nationality is French, that is a link to France. That is a link to everything we know about the country France. The fact that the data is linked and structured allows not only humans, but also computers to traverse information and to bring us different pieces of relevant information programmatically, automatically, based on those links. Because it's not just text, it's an actual link to another chunk of data. If this sounds a little abstract, it will become much clearer in just a second when we see it in action. But the other components of this little definition are, of course, this central storage of structured and linked data needs to be editable, of course, because we need to keep it up to date. We need to correct mistakes. And we want it on a wiki under a free license. The free license is, of course, essential to enable reuse of that data, to enable all kinds of reuse of the data. And Wikidata, unlike Wikipedia, is released under a different free license. Wikidata is released under CC0 waiver. 
That means unlike Wikipedia, where you have to attribute Wikipedia when you reuse information from Wikipedia, you do not need to attribute Wikidata, and you do not need to share alike your work. It's an unencumbered license to reuse the data in any way you want, including commercially. You don't have to say that it comes from Wikidata. I mean, it could be nice, but you don't have to. You're under no obligation to do it. And that is important to allow certain kinds of reuse where, for example, if you're building some kind of device, you may not have a practical way to give attribution. And had we required that to use Wikidata, we would have made Wikidata less reusable. So Wikidata is unencumbered by the requirement of attribution. And of course, because it's on a wiki, we get all the benefits that we are used to expect from a wiki, right? So it's a wiki, which means, yes. It has discussion pages. It has revision histories. It remembers everything. So if you screw it up, you can always go a version back. Or if someone else vandalized the content, we can always go back, just like Wikipedia. So we get all the benefits we're used to-- user talk pages, group discussion pages, watch lists, all the features that we expect in a wiki. In short, Wikidata is love. I hope you agree with me by the end of this talk. So let's zoom in and see what this structured data looks like. So structured data on Wikidata is collected in statements. And statements have the general form of this triple, this tripartite ascription-- items, properties, and values. Now an item is the subject, is the topic that we are trying to describe. It can be any topic that Wikipedia can cover, and many others that Wikipedia wouldn't. So the topic, the item can be Germany, or it can be Salem, Massachusetts, or it can be the concept of redemption. It can be anything at all. Anything you can imagine describing in any way with data can be the item. So the item, consider it like the title of the rest of the data. 
And then what do we say about Salem, Massachusetts or about Germany? Well, that's a series of properties and values, properties and values. The property is the kind of datum, like birth date or language spoken or manner of death. These are all real properties. Or national anthem, if I'm trying to describe a country-- these are properties. And then they have values, right? So this person, this imaginary person's place of birth, the value of the property place of birth is Salem, Massachusetts. So you can think about it as like a government form-- or not government, just any form that you're filling out-- where there are field names, and then empty spaces for you to fill out. That's the value, OK? So the field names or the categories are the properties, right? So name, language, occupation, date of birth-- these are all properties. And the values are the actual piece of data, the actual information that we have. And of course, different kinds of data are relevant for describing different kinds of items. And the key in the value is it can be either a literal value-- like if we're describing the height of a mountain, we might say just the number 8,848. That's the height of which mountain? Not everyone at once. Oh, because it's meters, the metric system. Yeah, Mt. Everest is 8,848 meters. Yes. Get with it, America. The metric system. All right, so that can be a literal value like an actual number. Or it can be a link to an item, pointing at another item. But in this statement, it is the value. So if I'm talking about Germany, the item is Germany. And the property capital city has the value Berlin. But the value is not B-E-R-L-I-N. The value is a pointer to the item Berlin, right? That's the link. So a single item is described by a series of such statements, right? There's hundreds and hundreds of things I can say about Germany. There's hundreds of things I can say about a person. And these will generally take the form of a property and a value. 
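The item, property, and value model described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `ItemRef`, `Statement`, and `is_link`, and the property name "height in meters", are made up for this sketch and are not Wikidata's actual storage format or API.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class ItemRef:
    """A pointer to another item, not just its name spelled out as text."""
    name: str


# A value is either a literal (such as a number) or a link to another item.
Value = Union[int, float, str, ItemRef]


@dataclass(frozen=True)
class Statement:
    item: str     # the subject being described
    prop: str     # the kind of datum, like a field name on a form
    value: Value  # the literal value, or a link to another item


# Illustrative data taken from the examples in the talk.
statements = [
    Statement("Germany", "capital city", ItemRef("Berlin")),
    Statement("Mt. Everest", "height in meters", 8848),
]


def is_link(statement: Statement) -> bool:
    """True when the value points at another item rather than a literal."""
    return isinstance(statement.value, ItemRef)
```

Here the capital of Germany is a link to the item Berlin, while the height of Mt. Everest is a plain number, which mirrors the two kinds of values mentioned in the talk.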
By the way, some properties may have more than one value. Consider the property languages spoken. People can speak more than one language, right? So if I'm describing myself, we can say languages spoken-- English, Hebrew, Latin, whatever. So a property can have more than one value. So if the item is about a country, it would have statements about properties like population, land area, official languages, borders with, anthem, capital city. If I'm describing a person, I have a whole mostly different set of properties that are relevant, right? Date of birth, place of birth, citizenship, occupation, father, mother, religion, notable works-- now, are all of these relevant for all people? No, of course not. It depends. And different items about different people will either have or not have these fields, right? So we wouldn't record religion for absolutely every person. Some people manage to do without. And also, it's not relevant for a lot of people, like, what their religion happens to be. Date of birth is generally relevant for most people that we're documenting. So some properties kind of crop up more commonly than others. A person's height, for example, is not generally considered of encyclopedic value, right? We don't, for example, if we have an article about even a really well-documented person like Winston Churchill, does Wikipedia mention his height? I don't think it does. Even though I'm sure we could probably find a source somewhere that lists his height, it's just not a very relevant piece of information about Churchill. With everything else that's written about him and that we know about him that we want to include in the article, a person's height is not really something of great value most of the time. But if we are describing Michael Jordan, it is relevant. I'm dating myself. People still know Michael Jordan, right? You know, a basketball player, that's when height is very relevant, right? 
That's one of the first things you say when you're describing a basketball player, is list their height. So even within the class of person, some properties may be more or less relevant, depending on the context. So let's look at some examples. These are examples of statements. Each line is a statement. So here's the first one. I want to state, about the item Earth, our planet. And what I want to say about Earth is that the property highest point on Earth has the value Mt. Everest. Would you agree with that? That is the highest point on Earth. That's a statement. It says something specific, one piece of information about Earth. Now of course, there's a lot of other things we want to say about Earth-- circumference, average temperature, I don't know, all kinds of things we can describe the planet with, density, the galaxy it belongs to, all that. But here's one piece of information, one very specific field in the detailed form about Earth. The highest point is Mt. Everest. Now here's a second statement. This time Mt. Everest itself is the item that I'm describing, right? The topic has changed. Now I'm saying something about Mt. Everest, and what I'm saying about Mt. Everest is elevation above sea level. Sounds the same but it isn't, because the highest point on Earth answers the question where, like on the planet, what is the highest point? It's Mt. Everest. But how high is that highest point is a different piece of information. Do you agree? It's the actual altitude. It's not where on the planet it is. So it may sound similar, but these are actually very different pieces of information. So that highest point, how high is it? Well, it's 8,848 meters high. Now the third statement gives another piece of information about the first item. Same item-- I could have grouped them together. Another thing I know about the Earth is that the deepest point on the planet is the Challenger Deep, part of the so-called Mariana Trench in the ocean. So that is the deepest point. 
And how deep is it? I again use the elevation above sea level. That's the name of the property even though it's not above sea level. I have a negative value because the elevation of the Challenger Deep is minus 11 kilometers, more or less. All right? So these are statements. These are four individual pieces of data. And I could also look at it this way. Maybe that's closer to the government form example that I was giving, right? So I want to say something about Earth. What do I want to say? Two things-- highest point. That's the field, that's the property, and this is the value. The highest point is Mt. Everest. The deepest point is Challenger Deep. And then I have things to say about Challenger Deep-- the property of elevation above sea level, the value is minus 11 kilometers. Now here's yet another view of the same data once more, with numeric IDs. So this is the same information, the same four statements. But this time, in addition to using words, I'm also including weird numbers following either Q or P. So P stands for property. So the highest point property is P610. And the deepest point property is P1589. What do these numbers mean? They don't mean anything at all. They're just numbers. They're just sequential numbers. And if I create a new Wikidata item right now, it'll get just the next available number. So they're just numbers. So P stands for property. What does Q stand for? Does anyone know? It's a trick question because it's hard to guess. But the principal architect of Wikidata, a Wikipedian named Danny [INAUDIBLE] and data scientist, is married to a lovely lady named [INAUDIBLE] spelled with a Q. And this is a loving tribute. And she's also a Wikipedian and an admin of Uzbek Wikipedia. So Q2 is just the numeric identifier of the item Earth. And Q513 is the identifier of Mt. Everest. You notice that we use that ID across the statement, right? So from Wikidata's perspective, this is actually what the database actually contains. 
What we were saying with words-- the Earth, highest point, whatever-- never mind that. Q2 has P610 with a value Q513. That's what Wikidata cares about, OK? Now that, you'll agree, is a little inaccessible. Just these lists of numbers, that's a little hard. So Wikidata understands and allows us to continue using our words. But actually, it gets translated into numeric IDs. Now why is this a good idea? Why can't we just say Earth or Mt. Everest? Any thoughts? This is an open question. Why is this a good idea to use numbers instead of the names of things? Yes, because more than one thing can have the same name. What do you mean? There's only one Mt. Everest. Well, yeah. But there's also a movie called-- and probably more than one-- called Mt. Everest, or a TV documentary literally called Mt. Everest. And of course, if I'm describing a person named Frank Johnson, he's not the only Frank Johnson on the planet, right? But wait, you say. On Wikipedia we deal with that problem, right? How do we deal with that problem on Wikipedia? Does anyone in the audience know? The standard way to deal with the fact that there is more than one Frank Johnson in the world, on Wikipedia, is to use parentheses after the name. So there is Frank Johnson (actor) and Frank Johnson (politician), for example, if that's the distinction we need to make. So you put in parentheses kind of the minimal amount of information you need to tell apart these Frank Johnsons. What if there's two politician Frank Johnsons? Well, then you would say Frank Johnson (Delaware politician) versus Frank Johnson (California politician), right? You just put in that bit of context to tell them apart. So that's the solution that Wikipedians came up with years and years ago because they did need a unique name for the article. You can't have two articles literally called Frank Johnson on Wikipedia. So that's the solution on Wikipedia. 
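The idea of storing IDs and translating them back into words can be sketched roughly as follows. The IDs Q2, P610, and Q513 are the ones given in the talk; the ID `Q-EXAMPLE-FILM` is a made-up placeholder for a film that shares a name with the mountain, and the functions `render` and `ids_for_label` are hypothetical helpers for this sketch, not part of any Wikidata software.

```python
# Statements are stored by ID only: (item ID, property ID, value ID).
statements = [("Q2", "P610", "Q513")]

# Human-readable labels are kept separately from the statements.
labels = {
    "Q2": "Earth",
    "P610": "highest point",
    "Q513": "Mt. Everest",
    # Hypothetical placeholder ID for a film with the same name.
    "Q-EXAMPLE-FILM": "Mt. Everest",
}


def render(statement):
    """Translate an ID triple back into words for a human reader."""
    return tuple(labels[part] for part in statement)


def ids_for_label(text):
    """Return every ID carrying this label, showing that names are not unique."""
    return sorted(key for key, label in labels.items() if label == text)
```

Looking up the label "Mt. Everest" returns two IDs in this sketch, while the statement about Earth points at exactly one of them, so the ambiguity of the name never reaches the data.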
But Wikidata was designed much later, more than a decade after Wikipedia, and was able to kind of learn from the experience of Wikipedia, which has tremendous experience with multilingualism, much more than most sites and projects, as we know. And so the Wikidata team understood from the get go that this will be an issue, and it's better to use numbers that are unequivocally different from each other instead of labels, instead of the actual name, the actual text, because names are not unique. Names can change, right? Just last year, there was a big naming reform in Ukraine and a whole bunch of towns and districts were renamed. Does that mean we should change all the data that we have, like lose all the data that we have about the old name? No, we ideally just want to change the name without breaking links. So having the links actually refer to the numbers is one way to ensure the integrity of the data, of the links, when renaming happens. Another reason is well, even if the name doesn't change, not all humans call everything the same, right? So Earth is Earth in English, but it's [SPEAKING ARABIC] in Arabic. It's [SPEAKING HEBREW] in Hebrew. So obviously, Earth-- even that is not as unambiguous or unequivocal as you might think. And so that is the reason Wikidata, which is built to be multilingual from the start, talks about numbers rather than labels. OK. Ha, I had a whole slide about that and I forgot. Yes, so even London, again, is not just London, England, which is what you were thinking about. It's also a city in Canada. And it's also a family name, like Jack London. It's also a movie company. There must be some hotel named London somewhere. This is a good opportunity to remind everyone that the vast majority of humankind does not speak a word of English. That's a statistic worth remembering. The vast majority of the planet does not speak English at all. That does not contradict the datum that English is the most widely spoken language. 
And yet, in aggregate, a majority of people speak other languages, and not English at all. So moving swiftly on, this is a pause for questions about what I've covered so far. Any questions in the audience? If not, we move to IRC. If there are any questions-- Any questions? No? IRC? Any questions? OK. We will have additional pauses for questions later. But enough of my hand-waving. Let's go explore Wikidata. So Wikidata lives at wikidata.org. And Wikidata already has more than 25 million items. That is, it collects statements about more than 25 million topics. It has many, many more than 25 million statements because many of these items have dozens or hundreds of statements. So it documents 25 million things-- people, books, rivers, whatever. Just to give us a sense of how big that number is, how many articles do we have on English Wikipedia? More than-- yes, more than 5 million articles. And that's the largest Wikipedia. So Wikidata is already describing more than five times, or about five times as many items as even our largest Wikipedia. So obviously, Wikidata contains data about things that have no article on any Wikipedia. It is a much, much larger, more comprehensive project. All right, the second thing we might notice is, well, this looks kind of like Wikipedia, right? If we've never visited, it looks kind of like Wikipedia. It has this sidebar. It has these buttons at the top. It looks like it's from the '90s. Yeah. So the reason it looks like Wikipedia is that it is a wiki running on MediaWiki software. It is running on software very much like Wikipedia. But it is running on a kind of modification of the standard wiki software. It has an additional, very important component named Wikibase, which gives it all of its structured and linked data power. So let's start exploring Wikidata. Let's take something local-- Harvey Milk. Harvey Milk. What does Wikidata know about Harvey Milk? 
For those on YouTube who may not be local, he's a San Francisco politician and gay rights activist who was murdered in the '70s. It was very significant in the history of those struggles in this country. So what does Wikidata tell us about Harvey Milk? Well, the first thing is it knows that Harvey Milk is Q17141. That's the most important piece of information, is first of all, that is the identifier. That is the item number of all the data that we will collect about Harvey Milk. The second thing you see right under the title is this line, this very, very brief summary, right? "American politician who became a martyr in the gay community." This line is the description line. So the name of the item-- this is the label. We call it label on Wikidata. That's the label. And this line is the description. Now why is this description important? This is the description that helps us tell this Harvey Milk from any other Harvey Milk that may exist, all right? So again, this would be useful if I'm looking up someone with a slightly more generic name. That line will help me tell apart the item about Harvey Milk the gay activist rather than Harvey Milk the film actor, OK? And where is it coming from? Well, Wikidata has this whole table, as you can see, with descriptions and labels in other languages. So Wikidata is able to refer to Harvey Milk in Arabic which, don't panic, is written from right to left. It also knows what to call him in Bulgarian. I mean, it's the same name, but it's in a different script. In French, in Hebrew, and that's it? Does it not know a name for Harvey Milk in Italian? Of course it does. It actually has labels for this person in many, many, many languages. It doesn't have descriptions in every language, as you can see. OK? So why was Wikidata showing me these languages and not others? I mean, why this somewhat arbitrary collection-- English, Arabic, Bulgarian, German, French, and Hebrew? Because I told it to. 
So if we briefly click over to my user page-- again, like every wiki, you have user accounts. You have user pages. This is my user page. And as you can see, there's this little user information box here called a Babel box by Wikipedians, where I list the languages that I speak. And Wikidata uses this box just to kind of helpfully show me these languages. Of course, all the other languages are still available, as you saw, by clicking the more languages. But this is just a useful little way of getting the languages I care about up there first. By the way, this is a lie. I don't actually speak Bulgarian. That stayed on my user page because I was demonstrating this in Bulgaria and I wanted that label to show up there during the talk-- just in case you were going to tell me a really good Bulgarian joke. OK so for example, Hebrew is my mother tongue. And we have a Hebrew label for Harvey Milk. But we don't have a description. So let's fix that right now by clicking the edit button right here. I click edit, and this table became editable. And now I can very briefly type a description. AUDIENCE: Online in about 20 seconds. But can we hold it? ASAF BARTOV: OK. That was good timing for the screen to crash. OK? Are we back? OK. Sorry about that. So this was all about what to call him in different languages and scripts and how to tell this person apart from other people with potentially the same name. Let's scroll down and see what else does Wikidata know about this person? So as you can see, this is a list of statements, right? This is a list of statements. And the properties are on the left, the values are on the right. So the first thing Wikidata knows about Harvey Milk is a very important property called instance of. Instance of. And the property instance of answers the very basic question what kind of thing is this that I'm describing? Is it a book? Is it a poem? Is it a mountain? Is it a theological concept? No, it's a human. It's a person, OK? The item about Mt. 
Everest will say instance of mountain, OK? This is a very important property. Why is it important? Wouldn't anyone looking at this know that this is a human being? Yes. Anyone looking at this will know. But if I want a computer to be able to pull information about people, I want to be able to easily exclude all the mountains and poems and other things that are not people from my query. So this single datum, this single piece of data, is what tells computers and algorithms very clearly, this is a human. Things that aren't instance of human are other things. OK? So it may sound very trivial, but it's not. It's very important to have an instance of field for Wikidata items. All right, what else do we know? Well, Wikidata knows about an image for Harvey Milk. Again, we can find a ton of images-- or maybe not a ton, but we can find dozens of images of Harvey Milk on Commons, on our Wikimedia multimedia repository. So why should we have a single image here on Wikidata? Again, this is mostly for reusers. If I'm building some kind of tool that pulls information from Wikidata, it's nice if there's at least one representative image to kind of use as the default or immediate image for Harvey Milk in some other reused context. All right, sex or gender-- male. Country of citizenship-- United States of America. Given name is Harvey. The date of birth is so and so. The place of birth is Woodmere. The place of death is San Francisco. The manner of death is homicide. Wikidata knows that. Now again, every little datum like that is the basis for later querying and answering questions. So the fact that we record the manner of death of people-- or at least of some people-- will allow us later to go, you know, who are some people from Belgium who died by homicide? That's a question Wikidata can answer, thanks to this field. The other thing I mentioned is that things are links. So the place of birth is Woodmere. I don't know where Woodmere is, but I can click that and find out. 
Here is the Wikidata item about Woodmere, right? It was the value in the statement about Harvey Milk, but now I'm looking at the item about Woodmere. And it turns out it's in Nassau County, New York, right? And of course, Wikidata has a whole bunch of information for me about Woodmere-- what country it's in and the coordinates and the population and the area, all the things you would expect about a place, OK? Let's get back to Harvey Milk. So the manner of death, the cause of death-- now here, Wikidata gives us excellent information. The actual cause of death is ballistic trauma. That's a professional term. And this statement has qualifiers. So until now, I was talking about triples, right? The item has a property with a certain value. Actually, each statement can also have a number of qualifiers which add aspects of information, still about that one question that we're answering, right? So if this property answers cause of death, it's not discussing anything else. It's not discussing languages. It's not discussing date of birth, right? It's talking about the cause of death. But we're not just saying ballistic trauma. We're saying ballistic trauma with the quantity attribute being five. What does that mean? Five bullets, right? There are five ballistic traumas. He was he was shot five times. And he was shot by this person named Dan White. And this ballistic trauma, like this actual shooting, is itself the subject of this other thing. This is a link to a whole other Wikidata item about the Moscone-Milk assassinations. Moscone was the San Francisco mayor at the time. We'll see slightly better or easier to understand examples of qualifiers in a bit. So if this was confusing, hang on. So he was killed by Dan White. He spoke English. His occupation-- here's an example of a property with more than one value, right? So Milk was a politician. But he was also a Navy officer, at least for a while. That was another thing that he did during his life. 
And he was a human rights activist, right? So some people are writers and translators. So people can have more than one occupation. People can speak more than one language. Here's a better example of a qualifier. So the property award received has the value Presidential Medal of Freedom. And that award has an attribute called point in time, like when was this? This was in 2009. Do you see that this piece of data-- 2009-- is a sub-statement, or is subordinate to the context of this award, the Presidential Medal of Freedom? It can't just kind of free float in the item. It's not that 2009 is itself a meaningful thing, right? This medal was awarded in 2009. Now Wikidata doesn't tell us, for example, when he was a Navy officer, OK? But if we were, for example, to look that up right now and find out that Milk was a Navy officer between 1962 and 1964, we could go back here to the Navy officer bit and click edit. This is how I edit this particular little piece of information. And add a qualifier like this. I click Add Qualifier. And I could pick start time and end time, right? And then I could type 1962 to 1964, and that would be teaching Wikidata. Oh, I'm sorry, I meant to do that for Navy officer. OK. But, you know, that is the exact-- the accurate time span of that statement. So it's true to say about a person, he was a Navy officer, even if of course he wasn't a Navy officer his entire life. But it's better, and it's more accurate, to say he was a Navy officer between 1962 and 1964. Don't worry, I'm not saving this. No vandalizing of Wikidata in this session. OK. Moving on. What else does Wikidata know? He was educated at this university. He was a member of this political party. Right? That's of course a relevant property for a politician. Religion, military branch, what is the category on Commons that discusses this item, is something that Wikidata can tell us. And that's it.
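For readers following along in text, here is a minimal Python sketch of the shape just described: a statement is a property with a value, plus optional qualifiers that only make sense in the context of that one statement. It is an illustration only. The `Statement` class, the helper functions, and the plain-English property names (including the qualifier name `killed by`) are assumptions made for this sketch; they mirror what was shown on screen and are not Wikidata's actual storage format or API.

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    """One statement on an item: a property, a value, optional qualifiers."""
    prop: str
    value: object
    qualifiers: dict = field(default_factory=dict)


# A few of the statements shown for Harvey Milk, using labels instead of IDs.
harvey_milk = [
    Statement("instance of", "human"),
    # A property can have more than one value: one statement per value.
    Statement("occupation", "politician"),
    Statement("occupation", "Navy officer"),
    Statement("occupation", "human rights activist"),
    # Qualifiers refine this one statement only.
    Statement("cause of death", "ballistic trauma",
              {"quantity": 5, "killed by": "Dan White"}),
    Statement("award received", "Presidential Medal of Freedom",
              {"point in time": 2009}),
]


def values_of(statements, prop):
    """Return every value recorded for a property, in order."""
    return [s.value for s in statements if s.prop == prop]


def qualifier_of(statements, prop, value, qualifier):
    """Return a qualifier of one specific statement, or None if absent."""
    for s in statements:
        if s.prop == prop and s.value == value:
            return s.qualifiers.get(qualifier)
    return None
```

Asking `qualifier_of(harvey_milk, "award received", "Presidential Medal of Freedom", "point in time")` gives back the year, while the same question about the Navy officer statement's start time gives back nothing, because that qualifier was never saved.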
Now, is that everything that we could possibly say in a structured way about Harvey Milk? No. We could probably find at least a few more things to say. We will see how to contribute new information to Wikidata in just a minute with a different example. But this-- all this was a set of statements. Right? This was the section titled statements here. But at the bottom of the list of statements is another section called identifiers. And I want to spend a minute talking about what that is. So identifiers is a collection of keys. A collection of IDs, or codes, that are keys to other information sources. And a lot of Wikidata items have a whole series of keys to other databases, other sites, other repositories, that help you or a computer be able to access not just some database and look for information about Harvey Milk, but access the exact record relevant to Harvey Milk. And again, if you imagine someone named John Smith, that is really valuable, right? If you're just told, oh yeah, you can look at the Library of Congress for John Smith, good luck with that. But if I tell you, go to the Library of Congress to this record for this John Smith, you see the difference. So Wikidata tells us his ID on VIAF, which is the Virtual International Authority File. It's an aggregated master index built by bibliographers, by librarians, of people. Right? It tries to kind of aggregate information about people across library catalogs everywhere. So the VIAF ID for Harvey Milk is this number. And conveniently, if I click that, I'm not taken to some Wikidata item. I'm actually taken to the relevant site. So this took me right to viaf.org, the Virtual International Authority File, directly to their record about Harvey Milk. All right? And that itself leads me to national catalogs of national libraries all over the world. We won't get into the things you can do with VIAF.
The point is Wikidata contained the piece of thread that I could tug on to arrive directly at that information in other databases. Yes. And it has that for many, many kinds of databases. The BNF, for example, that's the National Library of France. And that will take me to that index card. IMDB. We all know IMDB, right? So here I have the key to Harvey Milk in IMDB. And this is what IMDB says about Harvey Milk, right? They have their own piece of information about him, of course, with filmography and everything else. And see, I did not have to search IMDB for it. I just had the key right there waiting for me. Now, again, this is very convenient for me, as I just showed you the human use case for this. But it's even more powerful in aggregate when we allow computers to traverse this network of links-- not just within Wikidata, but between data storage facilities and repositories. This is sometimes referred to as the linked open data cloud. Cloud, because it's multiple different repositories that are interlinked. And Wikidata is already, and to a growing extent, the nexus, the connection point between a lot of these different databases. So IMDB, for example-- it's a good example because it's a site almost everyone knows-- IMDB has information about Harvey Milk. But that information does not include a link to the French National Library. Right? Do you see what I'm saying? So IMDB is a data repository with IDs and allows linking. But it does not give you what Wikidata gives you, which is this kind of collection of-- it's like a junction of all these different data sources. So Wikidata is the place where you can document these interrelationships or equivalencies. Right? So ID, you know, 587548 on IMDB is discussing the same topic as French National Library ID whatever. Wikidata contains that piece of information, that this ID in this database is about the same person as that ID in that database. OK. So that's what identifiers are about.
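To make the junction idea concrete, here is a small Python sketch, offered as an assumption-laden illustration and not as how Wikidata is implemented or queried. The item number Q17141 comes from the talk; the external ID values are deliberately placeholders, not real catalog IDs, and the `same_topic` helper is a hypothetical name invented for this example.

```python
# One Wikidata item holds the keys into several external databases.
# The ID values below are placeholders, not real catalog identifiers.
identifiers = {
    "Q17141": {  # Harvey Milk
        "VIAF": "<viaf-id>",
        "BNF": "<bnf-id>",
        "IMDB": "<imdb-id>",
    },
}


def same_topic(identifiers, source_db, source_id, target_db):
    """Given an ID in one database, find the item it belongs to and the
    matching ID in another database. Returns (item, target_id) or None."""
    for item, ids in identifiers.items():
        if ids.get(source_db) == source_id:
            return item, ids.get(target_db)
    return None
```

IMDB on its own has no link to the French National Library, but a lookup like `same_topic(identifiers, "IMDB", "<imdb-id>", "BNF")` can hop through the shared item and come out the other side.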
Still scrolling down the Wikidata item about Harvey Milk, we have the site links. The site links are links to Wikimedia projects that are related to this item. So of course there are Wikipedia articles about Harvey Milk in many, many different Wikipedias. Quite a few language versions. And there are pages on Wikiquote, one of the sister projects, with some quotes from Harvey Milk. And there is even a page for Harvey Milk on Wikisource. Right? So this is a collection of those links. And those of you who have maybe only dealt with Wikidata for inter-wiki links, which we used to do in the old days manually within the article text, now we do it through Wikidata, so maybe the only thing you did know about Wikidata is how to update these inter-wiki tables on Wikidata. All right. So that concludes our little tour of the anatomy of a Wikidata page. I will just remind you that it's a wiki page, which means it has a discussion page, a talk page. This one happens to be empty. But, you know, if we have concerns or arguments about some of the data here, that is what we would use to discuss this and to arrive at consensus. It also has a history view just like every Wikipedia article. So you can see here a list of edits. Maybe some of you have never looked at a history page on Wikipedia, so this looks overwhelming. But every line here, every entry here, is a single edit, a single revision, a single change to this Wikidata item. Just Harvey Milk. And you can see at the very top this edit that I just made-- this is my volunteer account and I just made this edit, and in parentheses you can see what I did. I added an HE, Hebrew, description. And this is the text that I added in Hebrew. Right? So we can see who added what to the Wikidata item, just like we can do the same on Wikipedia. So we have the revision history. We can undo edits. We can revert, just like on Wikipedia. And what else did I want to show here?
We can add an item to my watch list using the star, just like on Wikipedia. So we have all these standard wiki features that we would come to expect. Let's pause for questions. Any questions about what we've covered so far? Yes. Are attributes of statements preset for the specific value? No, they're not preset. And generally Wikidata does not enforce logic by default. So, I mean, there's nothing to prevent you from editing the item about Brazil, and adding the property height. Now height is not a relevant property for a country. Right? I mean, maybe average elevation, maybe. But not just height, which is used for humans or for physical things. So you could add that property to Brazil and save it and the wiki would not complain. Now in the background there are kind of extra-wiki, outside-the-wiki processes for constraint validation. So there are bots and other processes that run, and occasionally, for example, identify non-living things with a date of birth field. That's nonsensical. That should not exist. If someone mistakenly added that, there are processes that would flag that to be fixed. But the wiki itself, Wikidata, will not prevent you from adding that. And that is by design, to keep things flexible. So that people don't run into, oh wait, but I can't add this because nobody thought that I would need this, maybe. I hope that answers your question. You say helpful answer, question mark. So was it a helpful answer, or? OK. Yes, Eleanor. AUDIENCE: [INAUDIBLE] ASAF BARTOV: Excellent question. I'll repeat it. You ask how do I find the Wikidata item number from Wikipedia. If I'm reading about Harvey Milk and I want to look at the data, how do I do that? That is an excellent question, and let's skip to Wikipedia. Conveniently I have the link right here on English. So this is the Wikipedia article about Harvey Milk, and every article on Wikipedia should have a Wikidata item associated with it, but it doesn't happen automatically.
So if I just created a page on Wikipedia I also need to create a Wikidata entity for it if it doesn't already exist. It could already exist because it was already covered in a different language, for example. So that was parenthetical. But every article on Wikipedia should have, here on the side, under Tools, a link called Wikidata item. Right here. OK. That Wikidata item link takes you to Wikidata, to the entity, and there you find the number. You can-- you don't even have to click it. I mean, the URL itself tells you the number. The number, you see, it's wikidata.org/wiki/Q17141. OK. So that was an excellent question. Other questions? Yes. Yeah, about the additional attributes, the qualifiers. So, yes, I answered more generically. But just like the properties themselves are not limited per item, the qualifiers per statement are also not entirely preordained. But there is some structure to it. I don't want to go into it at great length right now. If we have time in the end we can get back to that. But some qualifiers are again relevant for some things, start time, end time, and others won't be. Wikidata does try to offer you-- you may remember when I clicked add qualifier, it gave me kind of a drop down of some relevant qualifiers. So it does try to help you in that way. Other questions? Are the values for instance of already mappable to external ontologies? That is a complicated question. I'll help people understand the question first. So an ontology is a structure, some kind of hierarchy or cloud, of entities and their interrelationships. An ontology would say, for example, a person is a living thing. So is a dog. They're both living things, but they're different things. And then, you know, say things about those entities and their interrelationships. Now there are many, many competing, or coexisting, models of ontologies. Many of them were created for specific needs. Many of them want to be a universal ontology.
But of course it's impossible to quite agree on one complete and simple ontology. And so there are many ontologies. Which brings up your question, can we map across ontologies? Can we say that when Wikidata says instance of book, that is equivalent to some other ontology saying instance of bibliographic record? And the answer is yes. There are some such mappings. They are incomplete. And there's no kind of automagic thing happening in the wiki vis-a-vis those other ontologies. That's kind of left as an exercise for those dealing with those other ontologies, and for tool builders and other platform improvements beyond Wikidata itself. OK. Other questions? Yeah, we have one from the YouTube stream. Someone asked, why can't I link Howard Carter's occupation to archeologist when I use an info box that fetches info from Wikidata? Why can't I link it from the info box? So, someone on the stream answered saying, because it's an improper connection, because the target is not about the subject only. The target is not about the subject? If I understand the question correctly, what you would want to be able to do is, from within Wikipedia, be able to say occupation and link to a Wikidata entry about archeology. It doesn't quite work that way. We will get to a little discussion of that in an upcoming section of this talk. So I will defer the rest of my answer to then. OK. So we're done with questions for this phase, and my browser got tired of waiting for me. So, yes. All right. So we took a look at Wikidata, and we took questions. So now, let's teach Wikidata some new things. Some things it doesn't already know. Let's look at this item here. So this item is about one of my favorite writers, an American writer named Helen Dewitt. Wikidata, of course, fondly refers to her as Q54674, but we can call her Helen Dewitt. And what can we contribute here? So Wikidata has far less information about Helen Dewitt. Most of you probably haven't heard of her, that's OK.
What does Wikidata know about her? Well instance of human. We have a photo of her. She's female. She's an American. Her name is Helen. Date of birth. Place of birth. She's an author, a novelist, a writer. She was educated at the University of Oxford. And Wikidata knows what her official website is. That's useful, but that's it. Now we can contribute information here. For example, she's an American author writing in English. So we could add that information. We could click the Add button here. And this is a good moment to acknowledge that the user interface of Wikidata is a work in progress. It's not as intuitive as it might be. So you need to understand that click-- to add a completely new property, You need to click this Add button. If you want to add an additional value to the property official website, you need to click this Add button. It makes a kind of sense with a shaded box. But, you know, you need to kind of pay attention, and it's not as friendly as it might be. [COUGHING] Excuse me. So, let's add a property here. Click the Add button. Again, Wikidata tries to be useful by suggesting some relevant properties for humans. A bit more morbidly it suggests, how about date of death? That's not cool, Wikidata. Helen Dewitt is still alive. So I will not add date of death, but I can add languages spoken, written, or signed. OK, so I click that. And she writes in English. I just type English-- whoops. Not in Hebrew. Don't panic. I type English here. And, oh, and of course Wikidata has auto-complete, right? So it tries to help me along. But you will notice that it has all kinds of things called English. I mean, it turns out that there is a place in Indiana called English, Indiana. Did I mean that? No, of course I didn't mean that she writes her books in English, Indiana. Right? But, you know, Wikidata gives me the option of linking to that. I also don't mean the botanist Carl Schwartz English. No, no I mean the west Germanic language originating in England. 
That's what I mean. So I click that. And I click Save. And that's it. Again I have just made an edit to Wikidata. I have just taught Wikidata that this author speaks English. Now, again, this may be very obvious. She's American. Of course not all Americans write in English. It may be obvious if you look at her books. The important thing is that now Wikidata knows this as a piece of data. And, again, think ahead to queries, which we will demonstrate in a little bit. Without this piece of information that I just added, if I were to ask Wikidata five minutes ago, give me a list of novelists writing in English, OK, Wikidata would have returned thousands of results. But Helen Dewitt would not have been among them. Because up until two minutes ago Wikidata didn't know that Helen Dewitt writes in English and not in Spanish. Do you see? It is this explicit statement that will now make her be included in any future queries that asks, who are novelists writing in English? OK. By the way, she's a PhD in Classics. She speaks-- or at least reads and writes Latin and Greek, ancient Greek, and I could-- I can-- I mean, I happen to know that. But wait, wait, wait, wait, wait, you say. What about original research? I mean, you can't just add stuff like that to Wikidata. Don't you need sources? Citations? Of course I do. Yes. Let's add some sources to this. So on Wikidata, just like Wikipedia, things should generally be supported by citations, by references. And just like Wikipedia, they aren't always supported in that way. OK so, I mean, I can just add it to Wikidata. Watch me. I just did that, right? I just added English and Latin without any citation, and I will not be arrested for it. Just like I could edit a Wikipedia article and add some information without a citation. It may stick. It may stay in the article, or it may be reverted. It depends on the kind of information I'm adding. It depends how many people are paying attention to the article on Wikipedia. 
And it works the same way on Wikidata. OK, so, you can add some things without references. Ideally, when you add information, you should include references. So let's be good Wikidata citizens and add a source. Here is an article that I prepared in advance. This is Helen Dewitt. And in this article, somewhere, it actually says, right at the bottom here, see, Dewitt knows, in descending order of proficiency, Latin, ancient Greek, French, German, Spanish, and Portuguese, Dutch, Danish, Norwegian, Swedish, Arabic, Hebrew and Japanese. This may sound excessive, but it's true. I met this woman. So anyway, we don't have to include all of that. The point is this article from a reasonably reliable source, this magazine, this interview, can count as a source for the languages she speaks. So I copy the URL. I just copy it off my browser. And, whoops-- that's not-- here we go. And I can just add a reference here to the information that I just added to Wikidata, right? I can click Add Reference. And then just say the reference URL is, and I just paste. I paste this URL. Hit Enter. And that's it. And now the fact that she speaks Latin has a reference. If you look at the other things here on Wikidata, you can see that these IDs, for example, have references, too. Right? In this case, the reference just says, excuse me-- in this case it just says imported from English Wikipedia. But wait, you say, can Wikipedia be a source? Not properly, no. I mean, just like Wikipedia itself doesn't cite itself. We don't say, this person was born in this city. How do we know? We read it on Wikipedia in another language. That's not a good citation. It's not a good citation for Wikidata either, so why do we put it here? Well, you can see the qualifier here is different, right? It's not reference URL, which is what I put in for Latin here. It's not reference URL here, it's a different qualifier. It says imported from. So this is not an actual reference that supports this piece of data.
It just shows where did this data come from. It's a slightly different thing, because this data was mass imported into Wikidata. So it wasn't input by hand by some volunteer. It was imported into Wikidata en masse by a script, by a program. And we want to know, where did this number come from? Well it came from English Wikipedia. So again, that's not a proper reference for the validity of the information, but it does at least tell us it came from English Wikipedia. We can click and look on English Wikipedia and find out. Maybe there's a footnote there that says where it did come from. OK. So this was an example of teaching Wikidata something that it didn't know. Something about the languages. And of course I could add this reference for English. I could add all the other languages that she speaks. And I won't bore you with that, but that is basically how it's done. So you click this Add to add a completely new-- completely new statement. Now, by the way, the fact that these are the only two suggestions that Wikidata can think of, doesn't mean these are the only options. OK, you can just type anything that may be relevant. We could add, for example, award. Just start typing award. And here I have I have a bunch of properties that are relevant for awards. Awards received, together with, conferred by, right? There's all kinds of properties that I could rely on. And of course there is a list of all the properties of Wikidata. And that list is also sorted by type. So yes, there is a list of properties relevant to people so that you don't have to guess. But a surprising amount of the time you can just start typing and get the right properties suggested to you. OK. So we taught Wikidata something new, and now let's teach Wikidata something completely new. Right? So how do we create a new Wikidata item? 
So, like I said, if I created a Wikipedia article about something that was not previously covered on any other Wikipedia, chances are there would not be an already existing Wikidata item. Sometimes there might be, because Wikidata does have 25 million entities. But sometimes there wouldn't be. So, first of all, I could search for it, right? So I could go to Wikidata to the search box here and just start typing, and search for what I want, right? So if I'm searching for Helen Dewitt I just say Helen, and I can see whether or not it exists. And there's a detailed search results page, et cetera, where I can where I can find out if the item does exist or not. Excuse me, this reminds me of a very important thing I wanted to demonstrate, and that is the multilingualism of Wikidata. So remember all these labels in other languages. Wikidata knows what to call Helen Dewitt in Hebrew. And it will show it to Wikidata users whose language is Hebrew. Mine is set to English, for your sake. But if I change this I go to Preferences here and change my language. [INAUDIBLE] All right, and I hit Save. Wikidata will start talking to me in Hebrew. Now brace yourselves. Are you ready? Don't panic, it's right to left. Oh my god everything is topsy-turvy. So this is the same article in Hebrew. So the sidebar has switched direction, and I know most of you cannot read it. Bear with me. This is the label that we previously saw in the label box. This is how you spell Helen Dewitt in Hebrew. And here is the description in Hebrew. It's not the description in English, this description, American writer, which I was shown previously. Now I'm shown the Hebrew description, appropriately. But more interestingly, oh my god! All these statements are suddenly in Hebrew. How did that happen? Well this tiny word here is the very concise way to say in Hebrew, instance of, and this word here means human. So these are links to the same things, right? It still links to Q5. Q5 is the Wikidata entity for human. 
These are still the same things. But because Wikidata has multiple labels for everything, it has multiple labels for items. And it also has multiple labels for property names. So Wikidata knows how to say, instance of, and award received, in other languages. That is why it is able to show me all this data in Hebrew even if none of that data was actually input into Wikidata by a Hebrew speaker. That data could have been input by English speakers, but thanks to the fact that someone once translated the word photo into Hebrew, I can see this field in Hebrew. So one of the things you can do to help Wikidata, right now, without any special knowledge is to help translate those labels. Every label only needs to be translated just once. So you can see that all of these properties, date of birth, name et cetera, they all have Hebrew labels. Maybe one of these would not. No, they all have Hebrew labels. Doing pretty good. And I'm able to search in my own language. I'm able to click Add. This word is Add, so I click this, and now I have the Add screen. It all speaks my language, and it's awesome. And now for your sake I will switch back to English, but it is important to know you can edit Wikidata in any language. And it is far more multi-lingual and multi-lingual friendly than, for example commons, which is also a project we all share. But commons has some limitations on how multi-lingual it is. For example, the category names, et cetera. OK. So we were beginning to discuss creating something completely new. AUDIENCE: Quick questions, if that's OK? So there's two questions on IRC. The first one is, can you show search for something like getting the list of things? I want to learn how to search for something properly like, show me all the items with this value of this property. ASAF BARTOV: Yes. That is part of this talk, but I'll get to that in a little bit later. 
There's a whole section where I will demonstrate the very, very powerful query system of Wikidata, where I will cash that check that I gave at the beginning, of all these painters-who-are-sons-of-painters queries, et cetera. So I will demonstrate how to do that. AUDIENCE: Other question. How does Wikidata deal with link rot, and other issues stemming from their URL refs? ASAF BARTOV: URLs break. We call that link rot. Wikidata doesn't have any particular magic around link rot, just like Wikipedia. So if you do use a bare URL it may well rot. But you can add qualifiers with backup URLs, say on the Internet Archive, or another mirroring service. And potentially that could be a software feature for Wikidata, to automatically save or ensure that something is saved on the Internet Archive, but I don't know that it is doing so now. So, just like Wikipedia, if it is a bare URL it may rot. And may need to be replaced, possibly by bot. Other questions? All right, so let's talk about how you create a completely new item. It's very simple. You go to Wikidata and you click here on the side. There's a link, create new item, which gives you this screen. And let's create an item about a book that I'm reading right now by this Bulgarian writer. So we have an article about this writer, a guy named Deyan Enev. But we don't have an article or a Wikidata item about one of his famous books called Circus Bulgaria. That's the book I'm reading, his first collection of short stories in English. Circus Bulgaria came out in 2010, Portobello Books, translated by Kapka Kassabova. So that's the book I'm reading. As you can see it's not a link on Wikipedia. There's no article about it, and there's not even a Wikidata item about it. But we can totally create it, even without a Wikipedia article. So let's create this new item. Let's create it in English for the purposes of our demonstration. The name of the item is Circus Bulgaria. Circus Bulgaria, that's the name.
Not Circus Bulgaria parentheses book, or anything you may be used to from Wikipedia. It's the actual name of the book, and the description, again, remember, the description field is just to kind of help tell apart this Circus Bulgaria from any other potential Circus Bulgaria. Maybe there's a film or something. So it's enough to just say something like short story collection. I might add by Deyan Enev, just in case, again, some future other short story collection by some other author happens to have that same name. That should be disambiguating enough. OK. Short story collection by Deyan Enev. I could have aliases for this. The aliases assist findability. This particular book has just this one name, so that's fine. And I click Create. That's it. I just start with a label, and a description. I click Create. I have a brand new Q number for my new Wikidata item. And Wikidata knows what to call it, and has a description in one language at least. And that's it, and I can start populating it. As you can see, it has no site links, but it's ready to be taught. So, for example, I can start by teaching it the name of the book in another language that I happen to speak. Now it has two labels, in English and Hebrew. I could also look up the original Bulgarian label for this book. Seems relevant. Again, I do not speak Bulgarian. But I can go to the Bulgarian Wikipedia through the interwiki. This is this gentleman. And I could find-- I can read Cyrillic so I could easily find-- when I say easily-- maybe not so easy, but I can search for it. Here we go. Tsirk Bulgaria. That is the name of the book. Tsirk, as in circus. No problem. So I just copy this right here. And I go back to my new item. My new item, which is here, and I edit the Bulgarian field. And here it is. Awesome. All right. But I still haven't told Wikidata anything about this. I know I'm talking about a book. Wikidata doesn't know that yet.
So let's start by adding some statements. First of all, I click Add. Wikidata sensibly says, how about we start with instance of. Tell me what kind of animal-- no, not kind of animal. What kind of thing are you trying to describe here? Well it's an instance of a book. Not in Hebrew, please. So it's an instance of a book. I could even be a little more specific and say it's an instance of a short story collection. There we go, short story collection. I hit Save. Awesome. So now we know what kind of thing it is. It's not a human, it's not a mountain, it's not a concept. It's a short story collection. Now I can add some other things. See, Wikidata is already working for me. Because it's a short story collection it's offering me to populate these properties, and not other ones. Publication date, original language, genre, country of origin, these are all relevant, right? So let's start with original language of the work is Bulgarian. Not Bulgaria, Bulgarian. This is the item I want to link. Hit Save, and whatever. Author. Let's identify the author. So the author, the main creator of the work, is that gentleman Deyan Enev. And remember, he has a Wikipedia article. He also has a Wikidata entity. So Wikidata does know about him. So I hit Save, and I can add something about the translator. And what was that lady's name? Kapka Kassabova. Now it so happens that Wikidata already knows about this lady. See? So I can just start typing and then just link to it. Awesome. But what if it didn't? What if it was translated by someone who isn't already covered on Wikidata? Well I could just type the name as a string, but ideally I could create a Wikidata entity about this translator so that there is a possibility to link to her. Now I might actually add a qualifier here because, she's not the translator of the book, right? She's the translator of the book into English. Right. So the language that she translated into is English. Right? This book-- remember I'm describing the book. 
The item is about the book. So the book would have a different translator into Polish. So this is an example of a property or a statement that doesn't make sense without one of those qualifiers. It's just not correct. It doesn't make sense to say that translator is. The English translator, or even this English translator. In 50 years maybe there would be an additional English translation. So that's an example of needing that qualifier. And of course I could go on and populate the other fields. We don't have to do that right now. Publication date, country of origin, et cetera. So this is already beginning to look like all those items that we already saw, but just a moment ago it didn't exist. Just a moment ago Wikidata had no concept of this work. This happens to be one of his notable works. So I could actually go to the item about Deyan Enev which has all this information already, occupation, languages, and add a property. Remember, I'm not limited to these. I can add a property called notable works, and mention my new item. Circus Bulgaria. See? My new item is showing up, and thanks to this description that I wrote, short story collection, it's already appearing here in the dropdown very conveniently. So I linked to this. I hit Save. Ideally again I should find some references showing that this is a notable work by him, but we won't spend time on that right now. But the point is we created a new item. We populated it a little bit. We linked to it so that it's more discoverable by mentioning it in the author name, and of course the book item itself mentions the author and links to the author. So that's all good. One last thing we shall do is give it some useful identifier so let's add, say, the Library of Congress record for this book. OK. So I have prepared this in advance. Ooh. Just in time, with 80 seconds to go before it's giving up on me. Oh it has already given up on me. That is very unfortunate. So I go to the Library of Congress and I find this book. 
I find this entry, right? In the Library of Congress database about this book. And it has a permalink. It has a kind of guaranteed-to-be-permanent link. I can just copy that link, go back to my little book, and say the Library of Congress. Yeah, LCCN, that's what they call their IDs, the call number. And I paste it here. I actually don't need the URL. I need just a number. And there we go. I have added it, and now Wikidata knows how to find bibliographic information about this book. And any re-user of Wikidata, some program, some tool that connects books to authors or does statistical analysis or whatever, some future yet-to-be-imagined tool, could automatically find additional metadata on the Library of Congress site thanks to this connection that I just made. And of course I could add many other IDs to other catalogs around the world, and we won't do that right now. You can see that it's now showing up under identifiers. So this is how we created a brand new piece of data. Questions about this, about creating new items? Yeah, all right. So we've seen how to contribute to Wikidata on our own, directly through Wikidata. Now you may be thinking, but Asaf, this sounds like a ton of work, recording all of these little tiny bits of information about every person and every book and every town. And if you think that, you would be correct. That is a ton of work. It's a lot of work. However, it is centralized, so it is reusable on other wikis, and we will show in just a moment how we pull information from Wikidata into Wikipedia or other projects. We will show that in just a moment. But here's an awesome little game that a Wikidata volunteer, Magnus Manske, has authored, called the Wikidata game, in which he tricks people-- sorry, helps people make contributions to Wikidata in a very, very easy and pleasant way. Let's look at the Wikidata game. 
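To recap the item built in the walkthrough above, here is a minimal Python sketch of its shape: labels and descriptions per language, plus statements that may carry qualifiers. The class names `Item` and `Statement`, the method `values_of`, and the qualifier key `language` are hypothetical names chosen for illustration. This is not Wikidata's actual storage format, and it uses plain text where Wikidata would use property and item numbers.

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    """One claim about an item: a property, a value, optional qualifiers."""
    prop: str
    value: str
    qualifiers: dict = field(default_factory=dict)


@dataclass
class Item:
    """A simplified stand-in for a Wikidata item (illustration only)."""
    labels: dict
    descriptions: dict
    statements: list = field(default_factory=list)

    def values_of(self, prop):
        """Return every value recorded for the given property."""
        return [s.value for s in self.statements if s.prop == prop]


# The item starts with just a label and a description in one language.
book = Item(
    labels={"en": "Circus Bulgaria"},
    descriptions={"en": "short story collection by Deyan Enev"},
)

# Statements are then added one by one, as in the demonstration.
book.statements.append(Statement("instance of", "short story collection"))
book.statements.append(Statement("original language", "Bulgarian"))
book.statements.append(Statement("author", "Deyan Enev"))

# The translator statement only makes sense with a qualifier saying
# which language the translation is into.
book.statements.append(
    Statement("translator", "Kapka Kassabova", qualifiers={"language": "English"})
)
```

A Polish translation would simply be a second `translator` statement with a different qualifier value, which is why the qualifier matters.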
So the first thing you need to do in that Wikidata game is to log in, because the Wikidata game makes edits in your name. So we need to authorize it. It's perfectly safe. And after you do that you can go to the Wikidata game. So this is the game. Now I'm logged in. And the Wikidata game actually includes a number of different games. Let's start with a person game. So the game shows you an item, and asks you a very simple question. Person, or not a person? So the game goes through Wikidata entities that don't even have the instance of property. Which is why Wikidata doesn't know, literally doesn't know, if this is a person, or a mountain, or a city, or a country, or anything else. So it asks you, because this is the kind of question that Wikidata cannot decide on its own, but for us humans it's generally trivial to be able to say whether something that we're looking at is a person or not. It gets slightly trickier when the information is in Javanese, as it is here, rather than English. So this item happens to be described in Javanese. My Javanese, spoken in Indonesia, is very weak. However, I can tell that this is not a person. How can I tell? Without understanding a word of Javanese I see that it mentions 1000 kilometers and square kilometers, see? So this is about a place, or an area, or a region, or whatever, but not a person. So this is an example of how even without understanding the language you can sometimes make a determination. However, of course, you should be sure. This is definitely not what the Wikipedia article about a person looks like. So this is not a person. I just click it and I'm shown the next item. This item is in another language I do not speak, and I just don't know. I do not know if this is about a person or not. So I click Not Sure. This is in Swedish, and it's about Sulawesi, still Indonesia. And it is not about a person. I have enough Swedish for that. So I click not a person. 
Now, you may say, well, do I really have to deal with all these languages that I don't speak? The answer is no. You don't have to. Here at the bottom of the Wikidata game there are settings. You can click that and tell Wikidata, I cannot even read Chinese or Japanese, so please don't show me items in those languages. Because I wouldn't even be able to guess. I prefer these languages in which I can relatively easily make determinations. And I can even tell Wikidata to only show me these languages. You see? This was not selected, which is why I was shown some other languages. I could say, only use these languages, and save. And now I can try this game again. However, that can slow it down a little. So here we go. Here's a Spanish-- which is one of the languages I told Wikidata game it can use. This is a Spanish item. Now is it about a person or not? It is not about a person. Is it about a person? No. Yes, it is right? Monk Cistercian, Pedro de Ovideo Falconi. That sounds like a person. Frau Pedro Nasser. Yeah, he was born in Madrid 1577. This is a person. OK. So I click person. Again, if you're not sure, click not sure. The point is, just by clicking person and as you can see this would work very well on mobile, which is why I said you can contribute on your commute. You can just hold your phone or tablet or whatever, and just tap. Person, not a person. Person, not a person. The amazing thing is that just tapping person has actually made an edit to Wikidata on my behalf, which I can find out, like every wiki, by clicking contributions. And as you can see in addition to the stuff about circus Bulgaria, my latest edit is in fact about this Pedro de Ovideo Falconi person. And the edit was, you can-- I hope you can see this, created the claim instance of human. So I added-- I mean Wikidata game added for me the statement instance of human. Now, the awesome thing is that it was super easy to do. 
I didn't have to go into that entity, click the Add button, choose the instance of property, choose human, hit Save. Instead of all these operations I just tapped on my screen, person, not a person. And I can do hundreds of edits during my daily commute. There are other games, like the gender game. So this is about-- this is when Wikidata already knows that this item is a person, but it doesn't know the gender of this person. Which is another one of the more basic items. And this is taking a long time because of the language limitations that I set on it. I guess the less exotic languages have already been exhausted in the game. We don't have to wait all this time. We can try something else. How about occupation? The occupation game. Here we go, this is in Russian. And what is the occupation of this gentleman? Well he is an [INAUDIBLE]. He's a church person. However, so the occupation game is where Wikidata game will automatically pull likely occupations from the article text and ask for confirmation. So if he-- if this person really is a deacon, I should click that. But I'm not sure. I'm not clear on the Russian church's distinctions between-- I mean [INAUDIBLE] is pretty senior, but I don't know if that automatically also means he's a deacon or not. And [INAUDIBLE] is not listed here. So I will click not listed. Also, these guesses are not always correct. So, this guy for example, is in Russian. I can read this. He's a philologist. He's a linguist. So I can confirm it and click linguist. All right? And again, if we look at my contributions we can see the Wikidata game on my behalf created occupation linguist. OK. Just by typing linguist there. Now if it's taken from the article, why would it ever be wrong? Well Jesus was the son of a carpenter. The word carpenter appears in the text. That doesn't mean it's correct to say Jesus was a carpenter. OK? Just a trivial example, right? So many, many articles will say, you know, born to a physician. 
And so the word physician could be guessed, but it wouldn't be correct unless the son is also a physician. So I hope it gives you the gist of it. There is also a distributed Wikidata game, which is pretty awesome. Here we go, which has additional games. So, for example, the key on game gives you, maybe it gives you, some items to play with. Yes? No? OK. So it gives you this little card, and asks you to confirm is this instance of human settlement? That is, is it a village, town, city, whatever. Is it a kind of human settlement or not? Or maybe it's a book. Maybe it's a poem. Again, so, is it an English settlement? And you can click the languages here to see the information. So I can click English. And indeed the article-- I mean the actual Wikipedia article says Camigji is a town and territory in this district in the Congo. So yes, this is an instance of human settlement. So I clicked yes. And just clicking yes again went to that item, and added property of human settlement. Now the point of all these games is these are tools, written by programmers, making kind of semi educated guesses about these fairly basic properties. And they are meant to semi automate, to assist, in the accumulation of all these important pieces of data. Now every single click here helps Wikidata give better results, richer results in future queries. Again, as of right now Wikidata can include Camigji if I ask it, you know, what are some towns in Congo? Until now it could not. Because it literally didn't know. So every time we click male, female, person, not a person, make these decisions, we help improve Wikidata and enrich the results that we could receive. Any questions about this, about kind of micro contributions through the Wikidata game? If that looks appealing I encourage you to go and visit the Wikidata game and start contributing in that way. There is a question here. If I make an article about Circus Bulgaria how should I correctly connect them? That is an excellent question. 
So once-- so now there is a Wikidata item about that book, but there is no Wikipedia article anywhere. Now suppose I write one in, Bulgarian maybe, you go to Wikidata. You find the item by searching. You find the item, and then the empty site links section right at the bottom there-- where are we? We have this? Circus Bulgaria. Let's demonstrate this. So here is the item about the book. Let's say that now there is an article because I just created it. I can go here to the empty Wikipedia link section, click Edit, type the name of the wiki, let's say English, and then type the name of the page that I just created. Circus-- right? And again, it offers me auto-complete for my convenience. Now we don't actually have the article created, but I could let's just say this was the article. I can just click this, hit Save, and that would associate the new Wikipedia article with this Wikidata item. That is the beginning of the inter-wiki list for this item. I will not click Save Now, because we didn't have the article yet. So I hope that answers that question. Was there another question that I missed here? No. OK. Any questions about the Wikidata game? About this idea of micro contributions? If not then we can move on to embedding data, and after that we can discuss queries, how to get at all this data from Wikidata. So the short version of how to embed data from Wikidata is that there is this little magic incantation. Curly brace, curly brace, hash mark, property. It looks like a template, but it isn't because of that hash. And that is magic. Take a look at this little demo that I prepared. This page, which is off my user page on meta, but it could be on any wiki. OK. Says, since San Francisco is item Q62 in Wikidata, and since population is property P1082, I can tell you that according to Wikidata the population of San Francisco is this. And this bolded number here was produced with this incantation. 
Curly brace, curly brace, hash mark, property P1082, that's population, pipe, from, what item? Right? Cause I'm pulling an arbitrary number. I could put any property from any item here, and kind of include it, embedded, into my text. This isn't even about-- you notice this is my user page. This isn't even the article about San Francisco. I just want to pull that number into this thing that I'm writing. So it's fairly simple. I identify the property. I identify the item to take it from. And Wikidata will, I mean Wikipedia, or the wiki I'm on, in this case meta, will go to Wikidata and fetch it for me. Likewise, since Denny Vrandecic, the designer of Wikidata, is item Q18618629, right? I mean, he's a notable person, so he has a Wikidata entity. And since occupation is property 106, and date of birth is 569, and place of birth is 19, because of all that I can tell you that Vrandecic was born in Stuttgart, on this date, and is a researcher, programmer, and computer scientist. If you look at the source for this page, click Edit Source, you can see that the word Stuttgart does not appear here, because it came from Wikidata. I did not write this into my little demo page here. See? Place of birth is-- where is it? Here. Born in property 19 from Q number so-and-so. That is how easy it is to pull stuff into a wiki from Wikidata. OK, now there's some nuance to it. And there are some additional parameters you can give. And you can ask Wikidata to give you not just the text of the values, but actually make them links. So, for example, if I change this from property to values-- No, that did not work at all. Wasn't it values? What was it? Values and then-- Oh, statements. My bad, sorry. The magic word is statements. Statements. So going back here. If I change the word property to the word statements here then this same value-- that did not work at all. Oh, because I'm on meta. 
So because I'm on meta, meta doesn't have an article named researcher, programmer, or computer scientist. But Wikipedia does. If I included this same syntax in Wikipedia, like English Wikipedia, for example-- So let's go there right now. And go-- go to my-- Go to my sandbox. If I just brutally paste this on my sandbox here-- So, see, these became links. Because Wikipedia has an article called programmer and computer scientist. So, like I said, there's some additional nuance to the embedding. The important thing is that this is the key to delivering on that first problem that I mentioned. How to get data from a central location onto your wiki in your language. Basically using property and statements magic incantations. And of course, usually, this would be in the context of an info box. Some wikis-- English Wikipedia is not leading the way there. Some smaller wikis are more advanced actually in integrating Wikidata embeddings like this into their info boxes. So that instead of the info box just being a template on the wiki with field equals value, field equals value. That template of the info box on the wiki pulls the values, the birthdate, the languages, et cetera, pulls them from Wikidata. So basically just-- I just demonstrated single calls to this, but of course an info box template would include maybe 20 or 40 such embeds, and that is not a problem. Of course, before you go and edit the English Wikipedia's info box person and replace it all with Wikidata embeds, you should discuss it with the English Wikipedia community. These discussions have already been taking place. There are some concerns about how to patrol this, how to keep it newbie friendly, et cetera. So there are legitimate concerns with just moving everything to be embedded from Wikidata. But the communities are gradually handling this. I mean this ability to embed from Wikidata is not very old. It's been around for about a year. 
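The embedding syntax described above can be sketched as a small Python helper that produces the wikitext. The function name `embed` is hypothetical, and the `|from=` spelling is a reading of the spoken "from what item"; the property and item numbers are the ones given in the talk. The helper only builds text. The wiki itself does the fetching when the page is rendered.

```python
def embed(prop_id, item_id, linked=False):
    """Build the wikitext that pulls one property of one item from Wikidata.

    linked=False uses the 'property' form (plain text values);
    linked=True uses the 'statements' form (values rendered as links
    where the local wiki has a matching page).
    """
    keyword = "statements" if linked else "property"
    return "{{#" + keyword + ":" + prop_id + "|from=" + item_id + "}}"


# Population (P1082) of San Francisco (Q62), as in the demo page.
population_line = (
    "According to Wikidata, the population of San Francisco is "
    + embed("P1082", "Q62")
    + "."
)

# Occupation (P106) of the item Q18618629, rendered as links.
occupation_links = embed("P106", "Q18618629", linked=True)
```

An infobox template that draws on Wikidata is, in essence, a few dozen such calls, one per field.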
So communities are still working on kind of integrating that technology. But that is kind of just the basics of how to pull data, individual bits of data. That's not querying, that's not asking those sweeping questions that I was talking about yet. We'll get to that. Right now this is how to pull a specific datum, a specific piece of data, from Wikidata. OK. So here's another quick thing to demonstrate before we go to queries, and that is the article placeholder. The article placeholder is a feature that is being tested on the Esperanto Wikipedia, and maybe another wiki, I don't remember. And it is using the potential of Wikidata to offer a placeholder for an article. An automatically generated, Wikidata-powered placeholder for articles that don't yet exist in Esperanto. So let's go to the Esperanto Wikipedia. I don't speak Esperanto. But let's look for Helen Dewitt, our friend, in Esperanto Wikipedia. Now Esperanto is not one of the Wikipedias that have an article about Helen Dewitt. And so it tells me that, right? There is no Helen Dewitt. Maybe you were looking for Helena Dewitt. No, I was not. You can start an article about Helen Dewitt. You can search. You know, there's all this stuff. But there is also this little option here, hiding, which tells me that the Esperanto Wikipedia is-- what's happening here? Yes. The Esperanto Wikipedia is ready to give me this page. This page, as you can see, it's on the Esperanto Wikipedia, but it's not an article. See, it's a special page. It's machine generated. You can see the URL as well. It's not, you know, slash Helen Dewitt. It's slash Specialaĵo, AboutTopic, and then the Wikidata ID of Helen Dewitt. And what I get here-- I get an English description, by the way, because there is no Esperanto description. Wikidata can't make it up. But what it can do is offer me these pieces of data in my language, in this case Esperanto. I'm on the Esperanto Wikipedia. OK. 
So it tells me that she's American, for example, and it tells me that in Esperanto. OK, and it tells me that she speaks Latin. Remember we taught Wikidata that? It tells me that she was educated in Oxford, you know, and gives me the references to the extent that they exist. I mean this is not an article. It's not, you know, paragraphs of fluent Esperanto text. But it is information that I can understand if I speak this language. And it's better than nothing. And remember Helen Dewitt was not a very detailed article. If I were to ask about, I don't know, some politician, or popular singer that has more data in Wikidata, then this machine generated thing would have been richer. So this feature is available and is under beta testing right now, but generally if this sounds interesting to you, especially if you come from a smaller wiki that is missing a lot of articles that people may want to learn about, you can contact the Wikimedia Foundation and ask for article placeholder to be enabled on your wiki. And again, this is a placeholder. Of course, it exists only until someone actually writes a proper Esperanto article about Helen Dewitt. So I hope this is clear. This is all coming from Wikidata on the fly. In real time. As you can see it includes my latest edits to Helen Dewitt. OK. Questions about the article placeholder? If there are, try and put them on the channel. And this brings us to one of the main courses of this talk, which is querying Wikidata. So I've explained how Wikidata works. We've walked through it. We've added to it. We've created a new item. We learned how to contribute during our commutes. And all this while you kept promising us, Asaf, that this would enable these amazing queries. So time to make good on that. The URL you need to remember is query.wikidata.org. And that will take you to a query system that uses a language called SPARQL. SPARQL, spelt with a Q. This language is not a Wikimedia creation. 
It's a standardized language used for querying linked data sources. And because of that there are certain usability prices that we pay for using SPARQL, for using a standard language. It's not completely custom made for querying Wikidata, and we'll see that in just a moment. The principle to remember about Wikidata query is that Wikidata will tell you everything it knows, but no more. I have anticipated this several times already, right? Until this moment when we taught Wikidata that Helen Dewitt speaks Latin, she would not have appeared in query results asking who are American writers who speak Latin? She would not have appeared. But as of this afternoon, she will appear because I've added that piece of information. So a result of that principle is that you can never say, well I ran a Wikidata query and this is the list of Flemish painters who are sons of painters. The list. That these are all the Flemish painters who are sons of painters. That is never something you can say based on a Wikidata query, because of course, maybe not all the Flemish painters who are sons of painters have been expressed in Wikidata yet. Wikidata doesn't know about some of them, or maybe it knows about all of them but doesn't know the important fact that this person is the son of that person, because those properties have not been added. And so they cannot be included in the results. So the results of a Wikidata query are never the definitive sets. What you can say about a Wikidata query is here are some Flemish painters who are sons of painters. Here are some cities with female mayors. Whatever it is you're querying about is never guaranteed to be complete because Wikidata, like Wikipedia, is a work in progress. And of course, the more we teach Wikidata the more useful it becomes. OK so let's go and see those queries. So this is query.wikidata.org. It's not the wiki. All right? So this isn't like some page on the wiki itself. This is kind of an external system. 
So it's not a wiki. You can see I don't have a user page here. I don't have a history tab. This isn't a wiki page. This is a special kind of tool or system. And it invites me to input a SPARQL query. Now most of us do not speak SPARQL. It's a technical language. It's a query language. Some of you may be thinking about SQL, the database query language. SPARQL is named with kind of a wink, or a nod, to SQL. But, I warn you, if you are comfortable in SQL don't expect to carry over your knowledge of SQL into SPARQL. They're not the same. They are superficially similar. Right? So they both use the keyword select, and they use the word where, and they use things like limit, and order. So again, if you know this already from SQL those mean roughly the same things, but don't expect it to behave just like SQL. You do need to spend some time understanding how SPARQL works. So, by all means, I invite you to go and read one of the many fine SPARQL tutorials that are out there on the web, or to click the Help button here, which also includes help about SPARQL. But I also know that most of us, when we want to do some advanced formatting on wiki, for example, we don't go and read the help page on templates, right? We go to a page that already does what we want to do, and adopt and adapt the code from that other page, right? So we just take something that does roughly what we want, and just copy it over and change what we need to change. That is a very pragmatic and reasonable way to do things, and the Wikidata engineers know this, which is why they prepared this very handy button for us called examples. We click the examples button. And, oh my god, there is a ton of-- well there's 312 example queries for us to choose from. And we can just pick something that is roughly like what we're trying to find out, and then just change what needs changing. So let's take a very simple one. The cats query. Maybe one of the simplest you could possibly have. 
And let's run it first and then I'll kind of walk you through it. The goal here is not to teach you SPARQL, but to get you to be kind of literate in SPARQL. To kind of understand why this does what it does. So let's run this query first. We click Run and here I have results at the bottom. The item, which is just a Wikidata item, which of course is a number. Remember, Wikidata thinks of items as Q numbers. And the label, because we're humans and we prefer words to numbers. So these 114 results are all the cats that Wikidata knows about. Is this all the cats in the world? No of course not, remember? It's all the cats Wikidata knows about, which means they're somehow notable. I mean someone bothered to describe them on Wikidata. And Wikidata was told this item is an instance of cat. Right? So these are those cats. And we can click any of them. I don't know, Pixel, for example. Click the Wikidata item. And here is the Wikidata item about Pixel with the Q number. And he is a tortoiseshell cat. And as you can see, instance of cat. OK. And he is five inches high. And he is apparently documented in Indonesian, in Bahasa. Right here, this is Pixel. And he is apparently somehow related to the Guinness World Records book. I don't speak Bahasa, so I don't know exactly why this cat is so notable. But, of course, cats can become notable for all kinds of reasons. Maybe they're a YouTube sensation, you know, maybe they were involved in some historical event. I like this cat named Gladstone. This cat named Gladstone is-- he has position held Chief Mouser to Her Majesty's Treasury. This is an official cat with a job. And he has been holding this job, mind you, since the 28th of June this past year. That's the start time. And there is no end time, which means he currently holds the position of Chief Mouser to Her Majesty's Treasury. His employer is Her Majesty's Treasury. He's a male creature. 
And Wikidata knows that this cat is named after William Gladstone, the Victorian prime minister. Of course if I don't know who this person is I can click through and learn that he was a liberal politician and prime minister, right? He even has a Twitter account. And Wikidata sends me right to it. The treasury cat Twitter account. And he has articles in German, and English, and of course Japanese, because he's a cat. All right. So this was a very simple query. Let's find out why it works. OK. So what did we actually tell Wikidata to do for us? We said, please select some items for us along with their labels. OK? Along with their human-readable labels, because if I remove this label what I get is, see, just a list of item numbers. That's not as fun. So that's what this little bit did. I just said, give me the items, but also their human-readable label. And I want you to select a bunch of items, but not just any random bunch of items, I want to select items where a certain condition holds. What is the condition? The condition is that the item that I want you to select needs to have property 31 with a value of Q146. Well, that's helpful. If I hover over these numbers-- Again, I get the human-readable version. So I'm looking for items that have property instance of with the value cat. Right? Because that's literally what I want, right? I want all the items that have a property, a statement, that says instance of cat. That's the condition. I'm not interested in items that are instance of book, or instance of human. I'm interested in instance of cat. That is the only condition here in this query. This complicated line I ask you to basically ignore. This is one of those sacrifices that we make for using a standard language like SPARQL. But the role of this complicated line is to basically ensure that we get the English label for that cat. OK? So don't worry about that. Just leave it there. 
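For readers following along without the screen, here is a rough sketch of what the cats query looks like, wrapped in a small Python function so the language code can be swapped. The function name `cats_query` is hypothetical, and the query text is a reconstruction from the spoken description (select the item and its label, where the item has property 31 with value Q146, plus the label line), so the exact wording of the built-in example may differ.

```python
def cats_query(lang="en"):
    """Return a SPARQL query for every item that is an instance of cat.

    P31 is 'instance of' and Q146 is 'cat'. The SERVICE line is the
    'complicated line' that supplies human-readable labels in the
    requested language.
    """
    return f"""SELECT ?item ?itemLabel WHERE {{
  ?item wdt:P31 wd:Q146 .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{lang}" . }}
}}"""


english_cats = cats_query()        # labels in English
japanese_cats = cats_query("ja")   # same cats, labels in Japanese
```

The text of either string can be pasted into the box at query.wikidata.org and run as is.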
And we run the query and we get the list of cats with their English labels, and that is awesome. By the way, if I change EN, without really understanding this line, if I change EN to HE, for Hebrew, I get the same results with a Hebrew label. Of course, these cats, nobody bothered to give them Hebrew labels unfortunately. So I get the Q number. But if I changed it to Japanese, JA, I would still get a bunch of Q numbers where there isn't a Japanese label, but otherwise I would get the labels in Japanese. OK? So this is an example of how you don't even need to understand all the syntax of this query to adapt it to your needs. If you want this query as is, but you want the labels in Japanese, you can just change the language code here. OK so that is all this query does. Again, just give me the items that have property 31, instance of, with a value 146, which is cat. Let's take a question just about this very simple query before we advance to more complicated queries. Any questions just about this? Like, did anyone kind of really lose me talking about this simple query? Again, this query just tells Wikidata, get me all the items that somewhere among their statements have instance of cat. That's the only condition. No questions. OK, feel free to ask if you come up with one. So let's complicate things a little. Let's ask only for male cats. OK. Remember this cat Gladstone is male, and we know this because he has a property called sex or gender, and the value is male creature, right? So let's add another condition right here under the first condition. OK? This is a new line. And I'm adding a new condition to the query. I'm saying, not only do I want this item that you return to be instance of cat, I also want this same item to have another property, the property sex or gender. Right? And I need to refer to the property by number. But don't worry, Wikidata will help you. So you start with this prefix, WDT. 
Again, just ignore that prefix. It's one of the features of SPARQL that we need to respect. So, wdt, colon, and then I can just press Control-Space to do a search, to do an auto-complete. So I can just type sex, and Wikidata helpfully offers me a drop-down with relevant properties. So I click property 21, which is the sex or gender property. And then I say, so I want the sex or gender property to have the Wikidata value. Again, Control-Space. And I can just say male creature. See? There's a different item for male, as in human, and a different one for male creature, for reasons that we won't go into. Let's pick male creature, because we're talking about cats here. All right. And add a period here at the end and click Run. And instead of 114 cats, this time we get 43 results. Including our friend Gladstone, who is a male creature cat. So that means all the rest are female, right? Wrong. Wrong. That does not mean that at all. What it means is that of the 114 items that have instance of cat, only 43 explicitly have sex male creature. The rest of them do not. Maybe because they have sex female creature, but maybe because they don't have that property at all. I'm emphasizing this to kind of help you train yourself to correctly interpret the results of queries from Wikidata. Don't jump to this kind of simplistic conclusion: OK, there's 114 total, 43 male, therefore the rest are female. That is not correct. OK? But 43 of those explicitly had another statement, sex or gender, male creature. So I just added another condition, and now my query is asking two separate things about the results. They need to be a cat and a male creature. AUDIENCE: Maybe we should see how many cats have Twitter accounts. But there is a question from YouTube, which is, will you talk about the export possibilities of the result of the query? ASAF BARTOV: Absolutely. Absolutely I will, in just a little bit. 
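The two-condition query built above could be sketched as follows. P21 (sex or gender) is the property number given in the talk; the item number for male creature is never read out, so wd:Q44148 is an assumption that should be confirmed with the Control-Space auto-complete before relying on it.

```sparql
# Sketch: cats that also carry an explicit "male creature" statement.
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q146 .     # instance of cat
  ?item wdt:P21 wd:Q44148 .   # sex or gender (P21) is male creature (Q44148: assumed ID)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
```

Both conditions must hold for the same ?item, which is why cats with no sex or gender statement at all drop out of the results rather than being counted as female.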
I mean, in addition to just getting this kind of table, I can get these results in other formats. And I can also download these results. I can click the Download button and get them as a comma-separated file, a tab-separated file, or a JSON file, which is useful for programmatic uses. I can also get a link. So I can get a link to this query. I mean, I spent all this time designing this beautiful query. I can get a short URL that was generated especially for me right now with TinyURL. I can just paste this into Twitter and go, hey people, look at all the male cats that Wikidata knows about. OK, this is not a very exciting query. But once I get to a really complicated, exciting query, I can totally share that very easily through this. And we will get to more interesting queries in just a second. Any questions on this kind of basic querying so far? OK. So that was a very simple example. Let's spend a moment exploring. So this cat Gladstone was named after this dude, William Gladstone, who was an important British politician. I'm sure he's not the only thing out there in the universe that's named after Gladstone, right? I mean, there have got to be, I don't know, park benches, planets, asteroids, something other than the cat, named after this guy. So we can ask Wikidata to tell us all the things that, you know, without saying instance of something. Like, I don't know, anything named after William Gladstone. So how do I do that? Same principle. Instead of asking about the property instance of, property 31, instead of that, I will ask about the property named after-- sorry, named after-- I don't need to remember the number. I have auto-complete. Named after is property 138. And I want anything at all that is named after this person, William Gladstone. Here we go. Which is Q160852. Whatever. OK. You notice I removed instance of cat. I removed the male creature. I'm only asking, get me all the items that are somehow named after that particular politician. 
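As a sketch, the named-after query keeps the same shape as the cat query and only swaps the property and the value. P138 and Q160852 are the numbers quoted in the talk; the variable name ?item is an assumption.

```sparql
# Sketch: anything at all that is named after William Gladstone.
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P138 wd:Q160852 .   # named after (P138) is William Gladstone (Q160852)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
```

There is no instance-of condition here, so the results can mix cats, schools, and academic chairs.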
And I run the query, and it turns out that Wikidata knows about three such things. Does that mean these are the only three things named after him in the world? Of course not. But these are the only three items that are in Wikidata and explicitly have the property named after Gladstone. For all I know, there may be a village in England called Gladstone, named after this person. But if nobody added the property, named after, linking to the person, it wouldn't show up in the results of my query. So Wikidata knows about three such things. One of them is something called the Gladstone Professor of Government. I can click through and see that it's a chair at Oxford University, right? So it's a position. And another is the William Gladstone school number 18. William Gladstone school number 18. Where is that? That is in Sofia, Bulgaria. Again. All right, so that's a particular school in Bulgaria named after William Gladstone. And finally, the third result is, of course, our pal Gladstone the Chief Mouser. If I click through, that's the cat. All right, so that was an example. I mean, you saw how easy it was. I just named the property and the value that I care about, and I get the results. Again, I mean, it's kind of a silly example, but think about it. This is-- how else can you answer that question? There's no reference desk, even at the great University of Oxford, where you can walk in and say, give me a list of things named after Gladstone. There's no easy way to answer that unless you happen to have a very large structured and linked data store, like Wikidata. All right, so that was a silly example. Let's take some-- AUDIENCE: There's a bunch of stuff on there. ASAF BARTOV: Oh, OK. AUDIENCE: Can you show EasyQuery on the video? And somebody needs to know how to just do property exists without giving a specific value. And then once you show EasyQuery you reload the page and-- ASAF BARTOV: I don't know EasyQuery. So is that a gadget? I don't know what EasyQuery is. 
I don't use it. So someone can maybe send a link or something? Oh, it is a gadget. I don't have it enabled. That is nice. So now, what I just did by hand, by formulating the query named after Gladstone-- I guess this is the-- Is it? Yeah. So this-- I just clicked the three-- the ellipsis here. Right after the name. You see this? This was just added by enabling EasyQuery, which I just learned about. So you just click this and it auto-magically makes this kind of trivial query. Of course, if I want a more complicated query like, I don't know, give me all the things that are named after Lincoln but are a school, I will still need to kind of edit a custom query. But this is a super easy and very nice way of just doing a very quick query for exactly this. Right? Like, what other items have exactly this property and value, named after William Gladstone? So, thank you to whoever made this suggestion to demonstrate that, and I'm glad I learned something too today. Let's move to another sample query. Here's a fun example. Popular surnames among fictional characters. Think about that for a second. Popular surnames among fictional characters. So we're asking Wikidata to go through all the fictional characters it knows, and of those look through their surnames, group them so that you can count them, the repetitions of the surnames, and give me the most popular surnames among them. Additionally, I want you to awesomely present the results as a bubble chart. Oh, yeah. Wikidata can do that. And I run the query. And check it out. The most popular names among fictional characters that Wikidata knows about are Jones, Smith, Taylor, et cetera. I mean, for all we know, the most popular name among fictional characters actually in the world may be Wu. Or something in Chinese, for all we know. But if that has not been modeled in Wikidata, we're not going to get that. So Taylor, Smith, Jones, Williams seem to be the most popular names. And again, I could limit this. 
I could make the same query but add, only among works whose original language was Italian, for example, to get more interesting results if I only care about Italian literature. But this is an example of how I got awesome bubble charts for free, and I can just plug this into an awesome presentation that I make. Of course, I can still look at the raw table. So the query still resulted in a bunch of data, right? So Smith repeats 41 times, Jones 38 times, Taylor 34 times, et cetera, et cetera. And down that list. And I could, again, export this into a file and load it up in a spreadsheet, and do additional processing on it. I can link to it. I can do all kinds of awesome things with it. So that's another awesome query. We don't have to go into a line-by-line analysis here of why this works the way it does. I want to show you some other queries first. Let's look at-- this is just fun, overall causes of death. Again a bubble chart, just looking at people who died of things, and have a cause of death listed. And we learn that the most commonly listed cause of death is myocardial infarction, pneumonitis, cerebral vascular, lung cancer, et cetera, et cetera. And again, in a bubble chart. And so how does that work? So just very briefly, the important parts of this query are: I'm looking for something, for some person, who has property 31, instance of, with the value Q5, which is human. So a human. Again, just to kind of limit the query. I'm not interested in books or mountains. I'm looking for humans. And that same person, that same variable, ?pid, should have a 509, meaning-- Hello. Why don't I have the-- Yeah. A 509, which is cause of death. And that cause of death is another variable, that I'm calling ?cid. Now, previously we were saying, you know, I want things that are named after Gladstone specifically. Only things that have that particular value. Here I'm saying I'm looking for things that have some cause of death. Not a specific one. 
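The cause-of-death query can be sketched as below. The variable names ?pid and ?cid follow the ones mentioned in the talk, and P31, Q5, and P509 are the numbers it quotes; the ?count column, the LIMIT of 50, the inner-query layout, and the #defaultView comment that asks the query service for a bubble chart are assumptions about how the demo was written. Leaving the value as a variable rather than a fixed item is also how one asks "does this property exist at all", the question raised from the audience earlier.

```sparql
#defaultView:BubbleChart
# Sketch: the most frequently listed causes of death among humans.
SELECT ?cid ?cidLabel ?count WHERE {
  {
    # Inner query: count people per cause, keep the top ones.
    SELECT ?cid (COUNT(?pid) AS ?count) WHERE {
      ?pid wdt:P31 wd:Q5 .     # the person is an instance of human
      ?pid wdt:P509 ?cid .     # and has some cause of death, whatever it is
    }
    GROUP BY ?cid
    ORDER BY DESC(?count)
    LIMIT 50
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
ORDER BY DESC(?count)
```

Doing the grouping in an inner query keeps the label lookup down to the 50 surviving rows instead of every person in the result.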
I just wanted to get everything that has a statement with some value for property 509, cause of death. OK? And then this other bit of magic here, the group by, tells Wikidata I'm not actually interested in every individual thing. I want you to group those causes, and then count them and give me the top ones. So that's how this query works. Here's that query I promised. Painters whose fathers were also painters. I can only think of a couple. I mean, Monet and Vogel. But I'm sure Wikidata knows many more. So let's run this query. And I have 100 results. By the way, I have limited it to 100 results just to keep it kind of snappy. But actually, we could maybe try removing the limit and see if Wikidata could tell us the total number in Wikidata. Yeah, that wasn't too bad. So 1,270 results. OK. Wikidata, already at this early date in its progress, already knows about more than 1,200 painters who are children of painters. Children of male painters, like, their father is a painter. There may be additional painters who are children of female painters not included in this query. Again, always remember what exactly you are asking. In this query I was asking about the father. I'm leaving out any possible painters whose mothers were painters. OK? So how does this work? I'm asking for the painter along with the human-readable label, and the father along with the human-readable label. So Michel Monet is the son of Claude Monet. And Domenico Tintoretto is the son of the famous Tintoretto, whose label, you know, is just Tintoretto, like Michelangelo. You know, you don't always have to have the full name in the common label. Paloma Picasso is the daughter of Pablo Picasso. OK. So Wikidata knows about all these results. Of course, Holbein the Younger, son of Holbein the Elder. And how did we get there? Well, we asked Wikidata to look for something, let's call it painter, which has property 106, which is occupation, with a value painter. Right? This unwieldy number, Q1028181, that's painter. 
So I'm asking for any item that has occupation painter. And let's call that item painter. I also want that painter to have a property 22, which is father. OK. Father. And I want it to have some value. OK, I'm putting it into another variable called father. I could have called it, you know, frog. That doesn't change anything, just to be clear. What matters is that this is the property father. I could have called it anything I want. So, and then, I have a third condition. That the father, like, whatever it says here in property 22, I want that father to himself have a property 106, occupation, with a value painter. OK? These conditions combine to give me a list of people who have a father, and that father has occupation painter as well. Of course, if I suddenly, or if you suddenly, are consumed by curiosity to know who are some politicians who are sons of carpenters, you could just change that, right? Change the first value from painter to politician. Change the third line's value from painter to carpenter. Maybe that list will be very short, because carpenters don't tend to be notable, so they wouldn't be represented on Wikidata. That's why this works relatively well with painters, right? Because most of them are notable. But generally you could do that, right? That's an example of how you can take a query and just replace one of those values, or even the language. So again, I could ask for these same painters. It's limited again. These same painters, but with Arabic labels. Same query, but I have Arabic labels for these painters. And of course, where there is no Arabic label, I get the Q number. OK? So that's the query that I promised you: painters who are sons of painters can be done by Wikidata in under one second. How awesome is that? We can also get some statistics. So how about counting total articles in a given wiki by gender. This is what we call the content gender gap, as distinct from the participation gender gap. 
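The painters query walked through above, with its three conditions, might be written like this. P106 (occupation), P22 (father), and Q1028181 (painter) are the numbers given in the talk; the variable names match the ones the speaker uses, and the LIMIT reflects the 100-result cap he mentions.

```sparql
# Sketch: painters whose fathers were also painters.
SELECT ?painter ?painterLabel ?father ?fatherLabel WHERE {
  ?painter wdt:P106 wd:Q1028181 .   # 1: the item has occupation painter
  ?painter wdt:P22 ?father .        # 2: the item has some father; call him ?father
  ?father  wdt:P106 wd:Q1028181 .   # 3: that father also has occupation painter
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 100
```

Swapping the occupation values on the first and third conditions gives the politicians-and-carpenters variant, and changing "en" to "ar" gives the Arabic labels.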
This is the gender gap in what we cover on Wikipedia. So let's take one of these. So this is a query. Articles about women in some given Wikipedia. All right. So let's take-- I don't know. Let's take the Tamil Wikipedia. That's language code TA. So I just put TA here. And I click Run, and I get this count. That's all I wanted. I'm not actually interested in the items, like in the list of women on the Tamil Wikipedia. I just want the number. So I selected the count here. And this number turns out to be 2,159. So there are about 2,000 articles on the Tamil Wikipedia about people that Wikidata knows to be female. Right? I'm asking about the gender field, property 21 again. Remember, if there's some article about a woman in Tamil Wikipedia, but Wikidata doesn't have a statement about the gender, that person will not be counted here. So again, be careful about kind of stating that this is exactly the number of articles about women on Tamil Wikipedia. That's probably not true. I'm sure some of those articles are missing a sex or gender property. But for raw statistics, that's probably good, because some men are also missing the sex or gender property. So we could take the same query for men. It's essentially the exact same. It just has this unwieldy number for male, Q6581097. I can change this language code again to TA for Tamil. And how many men are covered on Tamil Wikipedia? 14,649. OK. So women, about 2,100; men, about seven times as many. Right? So that's the approximate size of the content gender gap on Tamil Wikipedia. And again, I can complicate this query as much as I want. For example, I can try and find out if this gender gap is wider or narrower among musicians, just as an example. I could just add a line here that says occupation musician, and then I'm only counting articles on Tamil Wikipedia about musicians who are female versus articles on Tamil Wikipedia about musicians who are male. 
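A counting query of this kind could be sketched as below. The talk only says that the language code TA is typed into the query, so selecting the wiki through schema:isPartOf and the Tamil Wikipedia URL is an assumption about the mechanism, as is restricting to humans with P31 and Q5. Q6581072 for female is likewise assumed; Q6581097 for male is the number read out in the talk.

```sparql
# Sketch: how many Tamil Wikipedia articles are about people recorded as female.
SELECT (COUNT(DISTINCT ?item) AS ?count) WHERE {
  ?item wdt:P31 wd:Q5 .          # humans only (assumed restriction)
  ?item wdt:P21 wd:Q6581072 .    # sex or gender is female (swap in wd:Q6581097 for male)
  # The item must have an article on the Tamil Wikipedia.
  ?article schema:about ?item ;
           schema:isPartOf <https://ta.wikipedia.org/> .
}
```

Adding one more line with an occupation condition on ?item, using P106 and the item for musician, would narrow the count in the way described above.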
And I can kind of compare the gender-- the content gender gap across occupations on Tamil Wikipedia. Do you see the important point here? It's that this is not just kind of a one-purpose query. I can, with just a single additional condition, suddenly make it a much more interesting query, because I break it down by occupation. Or I break it down by century. Do we have more of a coverage gap in 19th-century people than in 21st-century people? I mean, I sure hope so, right? The patriarchy is weakening somewhat. So I wouldn't be surprised if there are many more notable men covered from the 19th century. But if the gender gap is just as wide for 21st-century people, that would be a little disappointing. Again, that's something I can fairly easily find out on the Wikidata query service. Any questions so far, or are you just sharing links? AUDIENCE: Yep, there is one. So somebody is wondering if you can demonstrate, or at least give a short answer to, the latter part of this question. Is it possible, using Wikidata SPARQL, to find specific Wikipedia articles, e.g. featured articles, of a certain language which do not exist in another language? I know it is possible to find category-based results using the PetScan tool. But can we specify that by selecting e.g. featured articles? ASAF BARTOV: Yes. Excellent question. It is possible, indeed. And I will demonstrate one such query. Another query that I already mentioned: largest cities in the world with a female mayor. This query-- let's close some of these tabs before my browser chokes. So this query lists the major world cities run by women currently. And the answer is Mumbai, Mexico City, Tokyo, a bunch of others. And wait-- that's not it at all. I clicked the wrong one. That's the map of paintings. OK. Let's demonstrate that for a second. So this is the map of all paintings for which we know a location, with the count per location. And the results are awesomely presented on a map. OK. 
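A map query along these lines might look like the following sketch. None of the identifiers are read out in the talk, so all of them are assumptions: Q3305213 for painting, P276 for location, P625 for coordinate location, and the #defaultView comment that asks the query service to draw a map. Over all paintings this may be slow, so treat it as an outline of the idea rather than the query used in the demo.

```sparql
#defaultView:Map
# Sketch: number of paintings per location, for locations with known coordinates.
SELECT ?location ?coord (COUNT(?painting) AS ?count) WHERE {
  ?painting wdt:P31 wd:Q3305213 .   # the item is a painting (assumed ID)
  ?painting wdt:P276 ?location .    # it has a location, such as a museum or chapel
  ?location wdt:P625 ?coord .       # and that location has coordinates
}
GROUP BY ?location ?coord
```

Only paintings that have their own item with a location statement are counted, which is why a place can show far fewer paintings than it really holds.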
Again, under the hood this is a table, of course, of results. But, awesomely, I can browse it as a map. So here is a map of the world with all the paintings that Wikidata knows about. Not just knows about the paintings, but knows about their location in a museum. Not surprisingly Europe is much better covered than Russia or Africa. There is a huge gap in contribution to Wikidata from these countries. And some of it can be fixed. And of course there is much more documentation, and much more art in Europe. But if we zoom in, I don't know, Rome probably has a few paintings. Right? Hello. Sorry. It's-- Yes. Vatican City sounds like a good bet, right? I can zoom in here. And I can just click one of these dots and see in this point there are two paintings. And in this one there is one and it's the Archbasilica of St. John Lateran. Let's see, this is the actual St. Peter, right? Sistine Chapel has 23 paintings. What? The Sistine Chapel has way more than 23 paintings. Correct, but 23 of them are documented on Wikidata. Have their own item for the painting, not the Sistine Chapel, the painting has an item that lists its being in the Sistine Chapel. There are 23 of those. OK. There is definitely room to document the rest of the artworks in the Sistine Chapel. So, again, this is just not the kind of query you were able to make before Wikidata, and it's a fairly simple query, as you can see. There are examples using maps like airports within 100 kilometers of Berlin. Again using the coordinates as a useful data point. And here is a map showing me only airports within a 100 kilometer radius from Berlin. But I wanted to show you the mayors query. Let's click the-- oh I just have the wrong link here. But I can still find it here by typing mayor. Here we go, largest cities with female mayor. So this is a slightly more complicated query. But if I run it, I get the top 10, because I set limit to 10. 
I get the top 10 cities in the world, by population size, that are currently run by women. Tokyo, Mumbai, Yokohama, Caracas, et cetera. And one interesting thing that you may want to notice here is that I'm asking for cities. I mean items that are instance of city. And that have a head of government, that have some statement about who is in charge, and that person has sex that's listed up here as female. Don't worry about the syntax right now. I just want to show you some specific angle here. And I'm further filtering these results. I only want those items where that statement does not have the qualifier end time. Why is that important? Because if a city once had a female mayor, but that mayor is not the mayor anymore, because mayors change, I don't want them in this query. I want a query of cities currently having a female mayor. And of course Wikidata may have historical data with start and end time, as we've seen, that documents that this person was the mayor of Tokyo or San Francisco between these years. But if there is no end time, that means they are currently the mayor. So that's an example of asking about a qualifier of a statement, again, to get the results we actually want. If we want current mayors, it's important to put in this filter. If we don't, we will get historical female mayors as well. All right. So these are some example queries. Questions about that? Oh, the featured article example. So let's look at that. So I have prepared such a query recently. Here we go. So this is a query. I just saved it here on my user page. I mean, this is not the Wikidata query service. This is just a Meta page containing the query, usefully. And let's run this. So this query, it's actually not very complicated. It just has a long list of countries, because I'm asking about African countries. OK. I'm looking for human females from one of these countries that have an article in English. That's what this line means. But not in French. That's what this part means. OK. 
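The female-mayors query described earlier in this passage can be sketched as follows. The talk shows it on screen without reading out identifiers, so every number here is an assumption: Q515 for city, P6 for head of government, P582 for end time, P1082 for population, and Q6581072 for female. The p:, ps:, and pq: prefixes are the standard way to reach a full statement and its qualifiers instead of just its value.

```sparql
# Sketch: the ten most populous cities whose current head of government is a woman.
SELECT DISTINCT ?city ?cityLabel ?mayor ?mayorLabel ?population WHERE {
  ?city wdt:P31/wdt:P279* wd:Q515 .     # instance of city, or of a subclass of city
  ?city p:P6 ?statement .               # a head-of-government statement on the city
  ?statement ps:P6 ?mayor .             # the person that statement points to
  ?mayor wdt:P21 wd:Q6581072 .          # that person is recorded as female
  # Keep only statements with no end-time qualifier, i.e. current office holders.
  FILTER NOT EXISTS { ?statement pq:P582 ?end . }
  ?city wdt:P1082 ?population .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
ORDER BY DESC(?population)
LIMIT 10
```

Removing the FILTER NOT EXISTS line would let former mayors through as well, which is the historical result the speaker warns about.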
This part, these two lines together. But not in French. And this is what's called a badge. That's Wikidata's concept of good and featured articles. It's called a badge. So I want them to have some badge on English Wikipedia. OK? So again, this query is asking for the top 100 women from Africa who are documented on English Wikipedia, in a featured or good article status. But not on French Wikipedia. So this is a query that's a to-do query, right? That's a query for French editors to consider what they might usefully translate or create in French. And if we run this, see, we have three results. I mean, we have many women from Africa covered on English Wikipedia. But only three articles have featured or good status among those that do not have French Wikipedia coverage. Let me rephrase that. Among the English Wikipedia articles about African women that don't have a French counterpart, only three are featured or good. OK? Do you see this? The badge is good article. This little incantation here is what allows you to ask about the badge. This here. And, by the way, the slides will be uploaded to Commons. And we will-- how shall we make it available on the YouTube thing as well? No, no. But, I mean, for people who will later watch this video. Oh yeah, we can add it to the YouTube description and the Commons description. So in the-- if you're watching this video later, in the description, we will add a link to this query specifically. Because it's not in the slides right now. It will be. OK. So. Questions so far? We're almost done. We have a few minutes left. So questions about queries? I mean, I'm sure there's tons of things you don't know how to do yet. And maybe you didn't really get the sense for SPARQL. It's something you need to really do on your own, on your computer. See how it works. Fiddle with it. Change something. See that it breaks and complains. But, very importantly-- oh, I had this in the other questions slide. Remember the Wikidata project chat. 
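The to-do query described above might be sketched like this. The saved query itself is not in the transcript, so the details are assumptions: the three countries in the VALUES list (Nigeria, Ghana, Kenya) stand in for the long list of African countries, P27 (country of citizenship) is one plausible reading of "from one of these countries", and Q6581072 is assumed for female. The wikibase:badge predicate is the "incantation" that exposes an article's badge.

```sparql
# Sketch: African women with a badged English article and no French article.
SELECT ?woman ?womanLabel ?badgeLabel WHERE {
  VALUES ?country { wd:Q1033 wd:Q117 wd:Q114 }   # shortened, illustrative country list
  ?woman wdt:P31 wd:Q5 ;            # human
         wdt:P21 wd:Q6581072 ;      # female
         wdt:P27 ?country .         # from one of the listed countries (assumed property)
  # An English Wikipedia article about her that carries some badge.
  ?enArticle schema:about ?woman ;
             schema:isPartOf <https://en.wikipedia.org/> ;
             wikibase:badge ?badge .
  # ...and no French Wikipedia article about her.
  FILTER NOT EXISTS {
    ?frArticle schema:about ?woman ;
               schema:isPartOf <https://fr.wikipedia.org/> .
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 100
```

Because ?badge is left as a variable, any badge qualifies; pinning it to the items for good article or featured article would narrow the list further.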
That's kind of the Wikidata equivalent of the village pump. It's the page on Wikidata where you can just show up and ask a question. In my experience, the Wikidata community is very nice, very welcoming, and very eager to help newer people integrate and learn how to do things. There's also an IRC channel. If you know what IRC is and how to use it, by all means, go to the IRC channel #wikidata. There's people there all the time, and you can just ask a question. If you're trying to do a query, and you don't quite understand the syntax, or you're not sure how to get the result you want, there are people there who will gladly help you do that. There is also a Wikidata newsletter published by the Wikidata team, which is centered in Germany, at Wikimedia Germany. And they send out a newsletter in English with Wikidata news. You know, new properties, new items, new things in the project. But also sample queries. So once a week there is kind of an awesome query to learn from, if you want to learn that way instead of reading, like, a whole manual on SPARQL. So I'm just encouraging you to get help in one of those channels. Of course you can write to me. Just reach out to me and ask me questions as well. I hope by now you agree that Wikidata is love, and Wikidata is awesome. If there are no questions, we do have a tiny bit of time to demonstrate one more tool, but that's-- no? No questions. OK, so let's talk about-- well, the Reasonator is kind of nice, but it's a little like the article placeholder. So this is not Wikidata, this is a tool, again built by Magnus Manske-- AUDIENCE: There's also one final question to you in case-- ASAF BARTOV: Oh, there is a question. AUDIENCE: Yeah. ASAF BARTOV: What are the advantages and disadvantages of creating an item before an article is done on English Wikipedia? Well, I mean, this example that I just made, right? I'm reading this book by a notable author. OK. 
I want this to exist on Wikidata, and to be mentioned on Wikidata, so that when people look up that author in Wikidata they will know about one of his notable works. But I'm not prepared to put in the time investment to build a whole article on English Wikipedia. Either because I don't have the time, or I don't have good sources. Or maybe my English is not good enough, but it is good enough to just record these very basic facts and point to the Library of Congress records, et cetera. So it's better than nothing. So that's one reason to maybe do it. Another reason is to be able to link to it. So remember, that translator lady already had an item on Wikidata, but if she hadn't, we could have just created a very, very basic, rudimentary item about her just saying, you know, this name is human. Country, Bulgaria. Occupation, translator. Even just that would have been something, and would have enabled me to link to this person. So these are legitimate reasons to create Wikidata entities without, or at least before, creating a Wikipedia article. If you are going to create-- I mean, if you're at an edit-a-thon or something, and you have come to create Wikipedia articles, by all means, first create the Wikipedia article, then create the Wikidata item and link to it. I hope that answers the question. So the Reasonator is simply a kind of prettier view of items in Wikidata. So you can just type the name of an item or the number. Let's pick just a random number, 42. Say 42. Which happens to be, maybe you've heard of this guy, Douglas Adams. He happened to have received the Q number 42. I'm sure it's a cosmic coincidence of infinite improbability. And this is a view-- this is a tool that is not Wikidata. It's a tool built on top of Wikidata called Reasonator. And it gives us the information from Q42, that is from the-- this item in Wikidata, which looks like an item in Wikidata. But it gives it to us in a slightly more rational kind of layout. 
It even kind of generates a little bit of pseudo-article text for us. You know, Douglas Adams was a British writer, playwright, screenwriter, bla-bla-bla, an author. He was born on this date, in this place, to these people. He studied at this place between these years. That's all machine generated. Nobody wrote this text. That's all taken from those statements in Wikidata, and it generates this reasonably readable summary paragraph. And then it gives us this little table of relatives. It's all taken from Wikidata. But as you can see, this is already a little more accessible than the essentially arbitrary ordering of statements on Wikidata. And that's OK. I mean, that's kind of by design. Wikidata is the platform. There are going to be many new applications, and platforms, and tools, and visual interfaces on top of Wikidata to browse Wikidata in more friendly or more customized ways. For example, one of the things that Reasonator does for us is give us pictures and maps and a timeline. Check this out. A timeline, machine generated, just from dates and points in time mentioned in the relatively rich Wikidata item about Douglas Adams. Right? So this timeline, for example, again, is completely machine generated. But he was educated between these years, so I can put it on the timeline. And this is the year he was nominated for a Hugo Award, so I can put that on the timeline. Et cetera. So that's just a super quick demonstration of that tool, the Reasonator. Links are all here in the slides. And the final tool I wanted to mention very quickly is the Mix'n'match tool. You remember my explanation about Wikidata as a nexus, as a connection point between many databases, many data sources. Those depend on these equivalencies. On Wikidata being taught that this item is like that ID in this other database. And Mix'n'match is a tool, again, by Magnus Manske. Maybe you're detecting a pattern here. 
It's a tool by Magnus that is designed to enable us to kind of take a foreign, an external data set, put it alongside Wikidata, and kind of try and align them. So this item in this external data set, is that already covered in Wikidata? If so, by what Q number? By what item? If not, maybe we need to create a Wikidata item to represent it. Or maybe it's a duplicate, or something. So the Mix'n'match tool has a list of external data sets, as you can see. The Art and Architecture Thesaurus by the Getty Research Institute. Or the Australian Dictionary of Biography. All kinds of external data sets here. Somewhere here I had a specific link to the Royal Society. It can also give me some statistics. So there is an external data set of all the Fellows of the Royal Society. Right? The oldest academic learned society in England. And the internet is tired. Here we go. Nope. Did that work? Fellows of the Royal Society, here we go. So this one is complete. I mean, people have manually gone over every single item there and either matched it to Wikidata or declared that it was not in scope, or a duplicate, or whatever. But let's look at site stats. This is a fun kind of aspect of this tool. But that is not working. Or it's taking too long. So let's just demonstrate how this works. Maybe Britannica? Is that done already? Here we go. Encyclopedia Britannica. Yeah. So for the Encyclopedia Britannica, 40% of the items there are not yet processed. So let's process one of them. For example, there is an item in the Encyclopedia Britannica called Boston, England. As you know, all American place names are totally stolen from elsewhere. So there is a Boston in England, though it's no longer the famous one. And the Mix'n'match tool has automatically matched it, based on the label, to Q100, which is Boston, the big city in the United States. And that is incorrect, right? That's kind of a naive computer going, well, this is Boston, and this other thing is also Boston. 
And it is asking me to confirm this match or not. You see? So this is the Boston, England from Britannica. And the tool is asking me, is this the same as Boston, Q100, in America? The answer is no. I remove this. I remove this match. And now this Boston, England is unmatched. And I can match it to the correct one in England. I can do this by searching English Wikipedia, or searching Wikidata. I mean, it has these handy links. So the English town is in Lincolnshire. Boston, Lincolnshire. So I can go there and then get the Wikidata item number. See, this is not Q100, Boston in the States, this is Q311975, town in Lincolnshire. I can get this Q number, go back to the Mix'n'match tool-- Where was that? Here we are. And set Q. I can tell the tool that this is the right Boston, and click OK. And now this town in Lincolnshire, you can see this here, this item, Q311975, is linked to Britannica. What does this mean? Well, if we go there. If we actually go to the Wikidata entity, you will see that in addition to the few statements that it already had, it now has, thanks to my clicking, another identifier here. See? Encyclopedia Britannica Online ID, with this link. And if we click it, we will indeed reach this page in the Britannica online, which is indeed about this town in Lincolnshire. You see? So I've contributed one of those mappings, one of those identifiers, into Wikidata. And I didn't have to do it manually. This tool kind of prompted me to confirm the match. If it had been correct, I could have just clicked confirm. Since it wasn't correct, I corrected it manually, but the tool made this edit on my behalf. So that's another tool that encourages us to systematically teach Wikidata more things. And we're out of time. Go edit Wikidata. Now that you have the power, you know the deal. Use it for good, and not for evil. If you have questions, this is my email address. 
If you're watching this video not live the description will have links to the slides, and to a bunch of other useful pieces of information. Any last questions on IRC? If not, thank you for your attention. And if you like this, and if you feel that you now get Wikidata, and you get what it's good for, and you're inspired to contribute, I have only one request from you. I mean, in addition to using it for good not for evil, I ask that you spread the word. Show this video-- share this video with other people in your community, or around you. Teach this yourself once you're comfortable with these concepts. Feel free to use my slides. Yeah, and edit Wikidata. Thank you very much, and goodbye.