rC3 Wikipaka intro music
Léa: Hi, everyone, I'm Léa. Here's Mohammed, and we're going to introduce you to Wikidata today.
Mohammed: Yes, hi everyone. So in the course of the talk, if you do have a question, just feel free to ask it in the chat, and we are going to try and answer them at the end of the talk. Yes. So let's dive straight in. What is Wikidata? Wikidata is a free knowledge base that is based on facts and references that anyone can edit and reuse. It is part of the Wikimedia projects. And like all of its sister projects, Wikidata is multilingual and has no language barriers. Data in Wikidata is released under the CC0 license. That means Wikidata's data is in the public domain and no exclusive intellectual property rights apply to it. Wikidata is not a primary source of information. It only aggregates or collects structured data that is already available, some of which are links to other databases. So it is not meant to be a place for original research. Wikidata is made for humans and machines, and is available for everyone to use, whether on other Wikimedia projects or outside of them. Next slide.
So what is in Wikidata? Wikidata was launched some eight years ago and was originally created to solve the problem of unstructuredness in the plain text format that information in Wikipedia is rendered in, and also to provide a central storage location where all of the different language Wikipedias can connect and talk to each other. Today, Wikidata has outgrown its intended purpose and has become so big and successful that it is not only, you know, the most edited Wikimedia project; Wikidata's data is now used more outside of the Wikimedia projects than within them. There are more than 25,000 active editors. That means people who make at least one edit every month. Wikidata is used across 800+ Wikimedia projects in more than 300 languages.
And it's interesting to note that the largest proportion of Wikidata's items are in the category of scholarly items, comprising about 30% of the whole. Next slide.
So far, people and bots have made more than 1.3 billion edits to Wikidata and created more than 91 million items. This map you see here is a visual impression of geolocated items currently existing on Wikidata. So, the bright areas are items that have the coordinate location property added as a statement. Next slide.
So Wikidata has a vision, and what is this vision? Wikidata's vision is to give more people more access to more knowledge. So Wikidata gives access to information regardless of the language that people speak. Because Wikidata is multilingual, it expects translations of so-called Q numbers into different languages. And in so doing, Wikidata helps us support the smaller Wikimedia projects better, you know, by helping them to benefit from all of the work that the bigger projects are doing. And applications and projects outside of Wikimedia are also able to benefit from the rich datasets in Wikidata. So in a nutshell, Wikidata can be thought of as an online repository of structured data that anyone can edit and reuse. Next slide.
OK, now, how is Wikidata connected to Wikipedia and other Wikimedia projects? Among other things, Wikidata can assist sister projects with more easily maintainable infoboxes. So the table at the right corner of this article on Wikipedia is called an infobox, which I'm sure you've seen before, and Wikipedia is able to retrieve content from Wikidata into those infoboxes [distorted]. Smaller language Wikipedias like, you know, Catalan Wikipedia or Welsh Wikipedia readily leverage Wikidata to seed their content. And it is helpful because it helps to reduce the editing workload for volunteers. Next slide.
So what should we expect to see on a typical Wikidata item?
Wikidata expresses relationships in the form of triples that use items, whose IDs start with "Q", and properties, whose IDs start with "P", OK, and an item will typically be made up of at least one statement. So in this example you see on the screen we have two statements about an entity called Douglas Adams. The first statement: Douglas Adams (Q42) was educated at (P69) St John's College. This statement is qualified by further properties, that is, the academic major, the academic degree, the start time and the end time. Qualifiers add more meaning to statements. Wikidata records not just statements, but also their sources. And as you can see here, this helps us to reflect the notion of verifiability on the project: the statement that Douglas Adams was educated at St John's College has two open references that point to the source of that information. The second statement, Douglas Adams (Q42) was educated at (P69) Brentwood School, only has the qualifiers start time and end time, and it has no references. So a single statement consists of a property and a value, with or without references and with or without qualifiers. Next slide.
So a typical Wikidata item looks like this, and you can edit by clicking on the edit button; it has this pen symbol with "edit" next to it. As you can see, each item has a unique ID, that is, Q followed by some number. In this case, the item Douglas Adams has the QID Q42. And when you look at the top, there's what we call the termbox, which contains the label in different languages, and a description of the item, which is more of a short phrase telling us what the item represents. It says here in English that Douglas Adams is an English writer and humorist. Then there is the alias next to the description which, aside from the label, tells us what the item could also be known as. Next slide.
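The item structure described above can be sketched in a few lines of Python. This is only an illustration, not something shown in the talk: the class and field names (`Item`, `Statement`, `unreferenced`) are assumptions made up for this sketch and do not mirror Wikibase's actual JSON schema, and the qualifier values and reference strings are placeholders because the talk names the qualifiers without giving their values.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Statement:
    """One claim about an item: a property and a value, optionally refined."""
    property_id: str                                   # e.g. "P69" (educated at)
    value: str                                         # plain label here, for readability
    qualifiers: Dict[str, str] = field(default_factory=dict)
    references: List[str] = field(default_factory=list)


@dataclass
class Item:
    """An item: a QID, the termbox fields, and a list of statements."""
    qid: str
    labels: Dict[str, str]                             # language code -> label
    descriptions: Dict[str, str]                       # language code -> description
    aliases: Dict[str, List[str]] = field(default_factory=dict)
    statements: List[Statement] = field(default_factory=list)

    def unreferenced(self) -> List[Statement]:
        """Return the statements that carry no reference at all."""
        return [s for s in self.statements if not s.references]


PLACEHOLDER = "(value not given in the talk)"

douglas = Item(
    qid="Q42",
    labels={"en": "Douglas Adams"},
    descriptions={"en": "English writer and humorist"},
    statements=[
        Statement(
            property_id="P69",
            value="St John's College",
            qualifiers={
                "academic major": PLACEHOLDER,
                "academic degree": PLACEHOLDER,
                "start time": PLACEHOLDER,
                "end time": PLACEHOLDER,
            },
            references=["(reference 1)", "(reference 2)"],  # placeholders
        ),
        Statement(
            property_id="P69",
            value="Brentwood School",
            qualifiers={"start time": PLACEHOLDER, "end time": PLACEHOLDER},
        ),
    ],
)

for statement in douglas.unreferenced():
    print(douglas.qid, statement.property_id, statement.value)
```

Keeping qualifiers and references on the statement rather than on the item matches the description in the talk: the same property (P69) appears twice, and each occurrence carries its own qualifiers and its own sources.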
So, creating a new item is as simple as going to any page on Wikidata and clicking on "create a new item". And once you click on "create a new item", you get to fill in a form that asks for a label, a description and an alias, and QIDs are assigned automatically. Next slide. Next slide. Next slide, please.
Alright, so there are tools that allow us to edit Wikidata more efficiently and make bulk edits to Wikidata, such as Quick Statements and OpenRefine. Please go to the previous slide. OK, yeah, right, so, yeah, Quick Statements and OpenRefine allow us to make automated edits and changes to Wikidata. Other tools are available that allow us to visualize Wikidata's data. Some of them enhance the user interface of Wikidata, and these could include scripts that editors can install, or they could be gadgets that may be enabled in your preferences settings. Next slide.
Léa: Alright. So, um, so far, Mohammed told you about how we describe concepts in Wikidata, and that's what we've been doing for the first years of the project, but in 2018, we also started storing a new type of information in Wikidata, which is lexicographical data, which is basically information about words and phrases in all kinds of languages. And so you see on the left the data model, which is a bit complex, and that's why I'm not going to get too much into details now, but we can talk about this later. And you can see an example on the right where we basically describe the word "Luftballon" in German, and we indicate the language, the lexical category and all kinds of information that is not about the object any more, but actually about the word and how it's composed of two words, as we like to do in German, and things like this. So, again, if you want to know more about this, you can have a look at lexicographical data in Wikidata or we can talk about it together later in the questions, for example.
So Wikidata doesn't come alone, it comes with a bunch of tools. Some of them have been developed by the development team of Wikidata, some of them have been developed by the community themselves in order to do things more efficiently. That can be, for example, adding data, and some of those tools have already been mentioned by Mohammed. That can also be matching data with other databases, querying the data, reusing the data. There are also a bunch of tools that are about watching the data and watching its quality, watching what edits have been done recently and so on. And you can find the page that is called Wikidata Tools on Wikidata to discover plenty of these tools, and you can, of course, create your own.
So we mentioned that the goal of Wikidata is to be reused by everyone, but you may wonder who is actually reusing the data. Well, the first reuser of Wikidata's data is actually the Wikidata community itself, the Wikidata editors, because all of these items are connected. So one item can be linked from another, the content of one item can be reused on another and so on. Then there are the Wikimedia projects, such as Wikipedia, but not only: Wikimedia Commons, Wikisource, almost all of the Wikimedia projects at this point reuse part of the data that is coming from Wikidata. And then we have companies, from the biggest ones to the small ones; because the data is in CC0, everyone can just reuse the content that they need. We have, of course, public institutions such as museums, libraries and so on. We also have journalists and, for example, data journalists. We have scientists and researchers and probably many more. And the thing is that we don't necessarily know who's reusing the data, because it's here in the open, but there are probably many usages that we don't even imagine. So if you're using Wikidata, or if you would like to use Wikidata's data, let us know, because we are always interested to discover more. Now, the question is: How can one reuse Wikidata?
I'm going to present very quickly one of the most popular ways to query the data. I'm not going to get into details right now because there will actually be a workshop at the conference in two days, on day three, about the query service, so I'm gonna let you go there and discover more about how to use it. The query service is basically a SPARQL endpoint, SPARQL being a query language where you can basically ask questions to Wikidata and get lists or visualizations as a result. For example, here's the map of the airports of the world named after a person, and the color of the dot represents the gender of the person. Or you can make a list of country flags that include a sun, because if the data is properly modeled in Wikidata, you're able to describe the different elements that compose a country flag. Or you can have this bubble chart with the occupations of accused witches, because why not? That's the kind of data we have in Wikidata.
Now, there are other ways, of course, to query the data. I'm not going to get into details right now, but if you want to talk more about this, you can, for example, join the Wikidata meetups that are gonna happen tomorrow. We have dumps of the data where you can download part of or all of the data in a file. We have a bunch of APIs to access the data directly from your program. And on the Wikimedia projects specifically, the community developed a bunch of templates that are using Wikidata's data using Lua.
And now for something a bit different, Wikibase. You may have heard of it and you may even have wondered, OK, what's the difference between Wikibase and Wikidata? Well, Wikibase is basically the software powering Wikidata and, more precisely, the MediaWiki extension that is turning MediaWiki into a database. And so, Wikibase was started to power Wikidata but it also started developing on its own.
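Returning briefly to the query service described earlier in this part of the talk: the following is a minimal sketch, not taken from the talk, of how a SPARQL question could be turned into a request URL. It asks where Douglas Adams (Q42) was educated (P69), reusing the IDs from the earlier example. The helper name `build_query_url` is hypothetical, and the sketch only builds the URL; it makes no network request.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"

# "Where was Douglas Adams (Q42) educated (P69)?"
# The wd:, wdt:, wikibase: and bd: prefixes are predefined by the query service.
QUERY = """
SELECT ?school ?schoolLabel WHERE {
  wd:Q42 wdt:P69 ?school .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
"""


def build_query_url(query: str, endpoint: str = SPARQL_ENDPOINT) -> str:
    """Return a GET URL that asks the endpoint for JSON results (hypothetical helper)."""
    params = {"query": query.strip(), "format": "json"}
    return endpoint + "?" + urlencode(params)


url = build_query_url(QUERY)
print(url)
```

Fetching that URL, ideally with a descriptive User-Agent header, should return the standard SPARQL JSON result format, where the rows sit under `results` and then `bindings`. The same query text can also be pasted into the query service's web interface.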
Wikidata is still for now the biggest existing Wikibase instance, but people can also install Wikibase directly on their server and basically create their own little personal or public Wikidata. And the development is still ongoing, there are all kinds of super exciting features coming up soon. For example, the ability to better connect Wikidata and your own instance of Wikibase, to be able to reuse data that is already in Wikidata and to connect it to the data that you have in your own Wikibase.
So, if you're interested in Wikidata, if you want to know more, there are a bunch of pages that you can find. There is a help portal. The Project Chat is the main discussion page on the wiki where you can interact with the other editors, the community. It's super important to get in touch with them if you want to get started with Wikidata. We also have a mailing list. We have a newsletter that is called Weekly Summary that you can find on wiki, but if you subscribe to the mailing list, you will also receive it. And then we have some accounts on social media: on Twitter, there is a Facebook group, there is a Telegram, um, that is linked from the Project Chat, and there is also an IRC channel. So you can basically find people from the Wikidata community everywhere.
So we are approaching the end of the session, but it's not done, we have more Wikidata related sessions at the c3 in the Wikipaka. So, for example, tomorrow you're going to get an introduction to Wikidata, specifically for journalists and especially data journalists. Then in the afternoon, we're gonna have two Wikidata meetups. The first one is gonna be in German. The second one is gonna be in English. So depending on your preferred language, you can attend one or the other or both. And on day three, as I mentioned before, we're going to have a workshop to learn how to query Wikidata's data with SPARQL. So feel free to have a look and check them also in the main schedule of Wikipaka.
Thank you very much for attending this session. These are our contact details if you want to contact us. And of course, you can now ask questions, as we mentioned, in the chat or with the hashtag. And we will be very happy to answer all your questions right now.
Herald: Thank you for your input and the overview about Wikidata. There have been a few questions already answered by Joel in the IRC channel. One was about the big dump of scholarly data and what scholarly data is and how this came to be in Wikidata. But there is one more question from the chat right now. Till asks: Can I add new types of data that are not yet tracked in Wikidata?
Léa: So I'm wondering, what do you mean exactly by type of data? Maybe you can give a bit more details because that can mean a lot of things. The data model of Wikidata is very flexible and it's absolutely not set in stone. Every week the community comes up with some new ways to describe things. Sometimes we realize that there is an area of the world that we completely forgot to cover, and then we create new properties to describe, for example, a certain type of, I don't know, of concept, a certain type of building or object or philosophical concept that we didn't describe yet. So this is always in movement, in action. When it comes to what we actually call data types, which is, for example, a string of text or a date or a picture, we have all kinds of data types like this. This is a bit more complicated and overall, it's quite rare that we add a new data type, and it needs a strong, like, use case so that we add it to the software. I hope that answers your question, and if it didn't, feel free to ask again.
Herald: Yeah, we've got feedback. The example Till meant was, there's an organization or a project called Parliamentwatch in Germany. There was one talk earlier today where they try to track and scrape and analyze the parliamentary protocols.
And one big issue they had was with structural data about all the members of parliament and how they are organized and stuff like that. And, um, well, if I remember correctly, there actually was a project that tried to include the structural data of members of parliament in Wikidata, if I'm not mistaken.
Léa: Absolutely. It's a WikiProject that is called, um, something politicians, all politicians. I don't remember the exact name right now, but indeed, some people are already working on members of parliament and, like, political people in general. So it's very likely that there is already a way to structure the data. The best way is to contact the people directly involved in this WikiProject. WikiProjects, by the way, are pages where basically people who have a specific topic of interest gather and can discuss specific questions about the topic. Um, so have a look at this project about politics and, um, yeah, try to see if anything is missing. But generally Wikidata definitely welcomes information about politicians, about members of parliament, this kind of stuff. What we do not do, however, is store the full, like, documents, for example, in that case, the reports or the documents. That belongs elsewhere, maybe on Wikimedia Commons, for example, if it's possible, if the license allows it. But on Wikidata, we'll be happy to store the metadata about them.
Herald: Alright, Joel just posted the link to the WikiProject, Every Politician, so if anybody looks for Every Politician on Wikidata, they will find the project. So basically, the bottom line is pretty much anything is possible in Wikidata, right?
Léa: Yeah, thank you Joel, and hi. Almost everything. So on Wikidata, just like on Wikipedia, we still have some criteria to define what can get in Wikidata and what not, because we are aware that this knowledge base needs to stay quite general and it cannot contain absolutely everything.
For example, the community decided a while ago that they would not create one item for each human living or who used to live on Earth, that's just not possible. So there are some notability criteria that you can find in the help pages, and I would say that the level of, like, how fine-grained the data should be has to be discussed with the community. And the good thing about having Wikibase also available separately from Wikidata is that if some people want to work on a topic where they have some information that is very, very specific and would maybe not fit the scope of Wikidata, they can create their own Wikibase and then they can connect the content with what is already in Wikidata. So altogether, in this Wikibase ecosystem, yes, pretty much everything is possible.
Herald: Well, the future is certainly here, at least, with Wikidata. Thank you again, Léa and Mohammed, for your insightful introduction to Wikidata, and we're looking forward to more people joining you in your efforts. Thanks for your presentation.
Léa: Thank you. See you soon.
rC3 Wikipaka outro music
Subtitles created by c3subtitles.de in the year 2021. Join, and help us!