-
rC3 Wikipaka intro music
-
Léa: Hi, everyone, I'm Léa. Here's
Mohammed, and we're going to introduce you
-
to Wikidata today.
Mohammed: Yes, hi everyone. So in the
-
course of the talk, if you do have a
question, just feel free to ask them in a
-
chat. And then we are going to try and
answer them at the end of the talk. Yes.
-
So let's dive straight in. What is
Wikidata? Wikidata is a free knowledge
-
base that is based on facts and references
that anyone can edit and reuse. It is part
-
of the Wikimedia projects. And like all of
its sister projects, Wikidata is
-
multilingual and has no language barriers.
Data in Wikidata is released under the CC0
-
license. That means Wikidata's data is in
the public domain and no exclusive
-
intellectual property rights are
applied to it. Wikidata is not a primary
-
source of information. It only aggregates
or collects structured data that is
-
already available, some of which are links
to other databases. So it is not meant to
-
be a place for original research. Wikidata
is made for humans and machines, and is
-
available for everyone to use, whether on
other Wikimedia projects or outside of it.
-
Next slide. So what is in Wikidata?
Wikidata was launched some eight years ago
-
and was originally created to solve the
problem of unstructuredness in the plain
-
text format that information in Wikipedia
is rendered in, and also to provide a
-
central storage location where all of the
different language Wikipedias can connect
-
and talk to each other. Today, Wikidata
has since outgrown its intended purpose
-
and has become so big and successful that
it is not only, you know, the most edited
-
Wikimedia project; Wikidata's data is also now
used more outside of the Wikimedia projects than
-
within it. There are more than 25,000
active editors. That means people who make
-
at least one edit every month. Wikidata
is used across 800+ Wikimedia projects in
-
more than 300 languages. And it's
interesting to note that the largest
-
proportion of Wikidata's items are in the
category of scholarly items comprising
-
about 30% of the whole. Next slide. So
far, people and bots have made more than
-
1.3 billion edits to Wikidata and created
more than 91 million items. This map you
-
see here is a visual impression of
geolocated items currently existing on
-
Wikidata. So, the bright areas are items
that have the coordinate location property
-
added as a statement. Next slide. So
Wikidata has a vision, and what is this
-
vision? Wikidata's vision is to give more
people more access to more knowledge. So
-
Wikidata gives access to information,
regardless of the language that people
-
speak, because Wikidata is multilingual,
it expects translations of so-called Q
-
numbers into different languages. And in so
doing, Wikidata helps us support the
-
smaller Wikimedia projects better, you
know, by helping them to benefit from all
-
of the work that the bigger projects are
doing. And applications and projects
-
outside of Wikimedia are also able to
benefit from the rich datasets in
-
Wikidata. So in a nutshell, Wikidata can
be thought of as an online repository of
-
structured data that anyone can edit and
reuse. Next slide. OK, now, how is
-
Wikidata connected to Wikipedia and other
Wikimedia projects? Among other things,
-
Wikidata can assist sister projects
with more easily maintainable infoboxes.
-
So the table at the right corner of this
article on Wikipedia is called an infobox,
-
which I'm sure you've seen before, and
Wikipedia is able to retrieve content from
-
Wikidata into those infoboxes [distorted].
And smaller language Wikipedias like,
-
you know, Catalan Wikipedia or Welsh
Wikipedia readily leverage
-
Wikidata to seed their content. And it is
helpful because it helps to reduce
-
editing workload for volunteers. Next
slide. So what should we expect to see on
-
a typical Wikidata item? Wikidata
expresses relationships in the form of
-
triples that use items starting with "Q"
and property starting with "P", OK, and
-
the item will typically be made up of at
least one statement. So in this example
-
you see on the screen we have two
statements about an entity called Douglas
-
Adams. The first statement, Douglas Adams
was educated at P69 St John's College.
-
What this means is that this statement is
qualified by further properties. That is
-
the academic major, academic degree, the
start time and then the end time. These
-
qualifiers add more meaning to statements.
So Wikidata records not just statements,
-
but also their sources. And as you can see
here, this helps us to reflect the notion
-
of verifiability on the project, so the
statement "Douglas Adams was educated at
-
St John's College" has two open references
that point to the source of that
-
information. And the second statement,
Douglas Adams, Q42, was educated at P69,
-
Brentwood School, only has the qualifiers
start time and end time, and it has no
-
references. So a single statement consists
of a property and a value,
-
with or without references and with or
without qualifiers. Next slide. So a
-
typical Wikidata item looks like this, and
you can edit by clicking on the edit
-
button, it has this pen symbol with edit
next to it. As you can see, each item has
-
a unique ID that is Q followed by some
number. In this case, the item Douglas
-
Adams has the QID Q42. And when you look at
the top, there's what we call the termbox,
-
which contains the label in different
languages and a description of the item, which
-
is more of a short phrase telling us what
the item represents. It says here in
-
English that Douglas Adams is an English
writer and humorist. Then there is the
-
alias next to the description which, aside
from the label, tells us what else the item
-
could be known by. Next slide.
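The item structure described so far (termbox plus statements with qualifiers and references) can be sketched roughly as follows. This is a minimal illustration, not Wikidata's actual storage format: the class and field names are hypothetical, and the alias, qualifier values and reference entries are placeholders rather than the real data on the Q42 page.

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    """One claim about an item: a property, a value, plus optional extras."""
    prop: str                                   # property ID, e.g. "P69" (educated at)
    value: str                                  # value of the claim
    qualifiers: dict[str, str] = field(default_factory=dict)
    references: list[str] = field(default_factory=list)


@dataclass
class Item:
    """Hypothetical in-memory view of a Wikidata item."""
    qid: str                                    # unique ID, assigned automatically
    labels: dict[str, str]                      # language code -> label
    descriptions: dict[str, str]                # language code -> short phrase
    aliases: dict[str, list[str]]               # language code -> other names
    statements: list[Statement] = field(default_factory=list)

    def triples(self) -> list[tuple[str, str, str]]:
        """Return (item, property, value) triples, ignoring qualifiers."""
        return [(self.qid, s.prop, s.value) for s in self.statements]


PLACEHOLDER = "<see item page>"  # real values are not reproduced here

douglas = Item(
    qid="Q42",
    labels={"en": "Douglas Adams"},
    descriptions={"en": "English writer and humorist"},
    aliases={"en": [PLACEHOLDER]},
    statements=[
        Statement(
            prop="P69",
            value="St John's College",
            qualifiers={
                "academic major": PLACEHOLDER,
                "academic degree": PLACEHOLDER,
                "start time": PLACEHOLDER,
                "end time": PLACEHOLDER,
            },
            references=["<reference 1>", "<reference 2>"],
        ),
        Statement(
            prop="P69",
            value="Brentwood School",
            qualifiers={"start time": PLACEHOLDER, "end time": PLACEHOLDER},
        ),
    ],
)

if __name__ == "__main__":
    for triple in douglas.triples():
        print(triple)
```

Both statements use the same property, P69, on the same item; only the first one carries references, which matches the example on the slide.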
-
So, creating a new item is as simple as
going to any page on Wikidata and clicking
-
on create a new item. And once you click
on create a new item, you get to fill in
-
a form that asks for a label,
description and an alias. QIDs are
-
assigned automatically. Next slide. Next
slide. Next slide, please. Alright, so
-
there are tools that allow us to edit
Wikidata more efficiently and make bulk
-
edits to Wikidata, such as Quick
Statements and OpenRefine. Please go to
-
the previous slide. OK, yeah, right, so,
yeah, Quick Statements and OpenRefine
-
allow us to make automated edits and
changes to Wikidata. Other tools are
-
available that allow us to visualize
Wikidata's data. Some of them enhance the
-
user interface of Wikidata, and these
could include scripts that editors can
-
install or they could be gadgets that may
be enabled in your preferences settings.
-
Next slide.
Léa: Alright. So, um, so far, Mohammed
-
told you about how we describe concepts in
Wikidata, and that's what we've been doing
-
for the first years of the project, but in
2018, we also started storing a new type
-
of information in Wikidata, which is
lexicographical data, which is basically
-
information about words and phrases in all
kinds of languages. And so you see on the
-
left the data model that is a bit complex
and that's why I'm not going to get too
-
much into details now but we can talk
about this later. And you can see an
-
example on the right where we basically
describe the word "Luftballon" in German
-
and we indicate the language, the lexical
category and all kinds of information that
-
are not about the object any more, but
actually about the word and how it's
-
composed of two words, as we like to do in
German and things like this. So, again, if
-
you want to know more about this, you can
have a look at lexicographical data in
-
Wikidata or we can talk about it together
later in the questions, for example. So
-
Wikidata doesn't come alone, it comes with
a bunch of tools that have been, some of
-
them have been developed by the
development team of Wikidata, some of them
-
have been developed by the community
themselves in order to do things more
-
efficiently. That can be, for example,
adding data and some of the tools have
-
already been mentioned by Mohammed, that
can also be matching data with other
-
databases, querying the data, reusing the
data. There are also a bunch of tools that
-
are about watching the data and watching
its quality, watching what edits have been
-
done recently and so on. And you can find
the page that is called Wikidata Tools on
-
Wikidata to discover plenty of these tools
and you can, of course, create your own.
-
So we mentioned that the goal of Wikidata
is to be reused by everyone, but you may
-
wonder who is actually reusing the data.
Well, the first reuser of Wikidata's data
-
is actually the Wikidata community itself,
the Wikidata editors, because all of these
-
items are connected. So one item can be
linked from another, the content of one
-
item can be reused on another and so on.
Then the Wikimedia projects, such as Wikipedia,
-
but not only: Wikimedia Commons,
Wikisource, almost all of the Wikimedia
-
projects at that point reuse part of the
data that is coming from Wikidata, and
-
then we have companies, from the biggest
ones to the small ones because the data is
-
in CC0 everyone can just reuse the content
that they need. We have, of course, public
-
institutions such as museums, libraries
and so on. We also have journalists and,
-
for example, data journalists. We have
scientists and researchers and probably
-
much more. And the thing is that we don't
necessarily know who's reusing the data
-
because it's here in the open but there
are probably many usages that we don't
-
even imagine. So if you're using Wikidata,
or if you would like to use Wikidata's
-
data, let us know, because we are always
interested to discover more. Now, the
-
question is: How can one reuse Wikidata?
I'm going to present very quickly one of
-
the most popular ways to query the data.
I'm not going to get into details right
-
now because there will actually be a
workshop at the conference in two days on
-
day three about the query service so I'm
gonna let you go there and discover more
-
about how to use it. The query service is
basically a SPARQL endpoint, SPARQL being
-
a query language where you can basically
ask questions to Wikidata and get lists or
-
visualizations as a result. For example,
here's a map of the airports of the
-
world named after a person, and the color
of the dot represents the gender of the
-
person. Or you can make a list of country
flags that are including a sun, because if
-
the data is properly modeled in Wikidata,
you're able to describe what the
-
different elements that compose a country
flag are. Or you can have this bubble chart
-
with the occupation of accused witches,
because why not? That's the kind of data
-
we have in Wikidata. Now, there are other
ways, of course, to query the data, I'm
-
not going to get into details right now,
but if you want to talk more about this,
-
you can, for example, join the Wikidata
meetups that are gonna happen tomorrow. We
-
have dumps of the data where you can
download part of or all of the data in a
-
file. We have a bunch of APIs to access
the data directly from your program. And
-
on a Wikimedia project specifically, the
community developed a bunch of templates
-
that are using Wikidata's data using Lua.
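As a rough sketch of two of the access routes just mentioned, the query service and the API, the snippet below only builds the request URLs and sends nothing over the network. The helper names are hypothetical. The example query asks where Douglas Adams (Q42) was educated (P69), reusing the item from earlier in the talk rather than the airport or flag queries shown on the slides.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Public endpoints of the Wikidata Query Service and the wiki's action API.
SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"
API_ENDPOINT = "https://www.wikidata.org/w/api.php"

# Where was Douglas Adams (Q42) educated (P69)? Labels are requested in English.
QUERY = """
SELECT ?school ?schoolLabel WHERE {
  wd:Q42 wdt:P69 ?school .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
"""


def sparql_url(query: str) -> str:
    """Build a GET URL that asks the query service for JSON results."""
    params = {"query": query.strip(), "format": "json"}
    return SPARQL_ENDPOINT + "?" + urlencode(params)


def entity_url(qid: str) -> str:
    """Build a GET URL that fetches one item as JSON via wbgetentities."""
    params = {"action": "wbgetentities", "ids": qid, "format": "json"}
    return API_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(sparql_url(QUERY))
    print(entity_url("Q42"))
```

Fetching either URL with any HTTP client should return JSON; when doing so from a script, sending a descriptive User-Agent header is the polite convention.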
And now for something a bit different,
-
Wikibase. You may have heard of it and you
may even have wondered, OK, what's the
-
difference between Wikibase and Wikidata?
Well, Wikibase is basically the software
-
powering Wikidata and, more precisely, the
MediaWiki extension that is turning
-
MediaWiki into a database. And so,
Wikibase was started to power Wikidata
-
but it also started developing on its own.
Wikidata is still for now the biggest
-
existing Wikibase instance, but people can
also install Wikibase directly on their
-
server and basically create their own
little personal or public Wikidata. And
-
the development is still ongoing, there
are all kinds of super exciting features
-
coming up soon. And, for example, the
ability to connect better Wikidata and
-
your own instance of Wikibase, for
example, to be able to reuse data that is
-
already in Wikidata and to connect it to
the data that you have in your own
-
Wikibase. So, if you're interested in
Wikidata, if you want to know more, there
-
are a bunch of pages that you can find.
There is a help portal, the Project Chat
-
is the main discussion page on the wiki
where you can interact with the other
-
editors, the community. It's super
important to get in touch with them if you
-
want to get started with Wikidata. We also
have a mailing list. We have a newsletter
-
that is called Weekly Summary that you can
find on wiki but also if you subscribe to
-
the mailing list, you will also receive
it. And then we have some accounts in the
-
social media, on Twitter, there is a
Facebook group, there is a Telegram, um,
-
that is linked from the Project Chat and
there is also an IRC channel. So you can
-
basically find people from the Wikidata
community everywhere. So we are
-
approaching the end of the session, but
it's not done, we have more Wikidata
-
related sessions at the c3 in the
Wikipaka. So, for example, tomorrow you're
-
going to get an introduction to Wikidata,
specifically for journalists and
-
especially data journalists. Then in the
afternoon, we're gonna have two Wikidata
-
meetups. The first one is gonna be in
German. The second one is gonna be in
-
English. So depending on your preferred
language, you can attend one or the other
-
or both, and on day three, as I mentioned
before, we're going to have a workshop to
-
learn how to query Wikidata's data with
SPARQL. So feel free to have a look and
-
check them also in the main schedule of
Wikipaka. Thank you very much for
-
attending this session. These are our
contact details if you want to, to contact
-
us. And of course, you can now ask
questions, as we mentioned in the chat or
-
with the hashtag. And we will be very
happy to answer all your questions right
-
now.
Herald: Thank you for your input and the
-
overview about Wikidata. There have been a
few questions already answered
-
by Joel in the IRC channel. One was about
the big dump of scholarly data and what
-
scholarly data is and how this came to be
in Wikidata. But there is one more
-
question from the chat right now. Till asks:
can I add new types of data that are not
-
yet tracked in Wikidata?
Léa: So I'm wondering, what do you mean
-
exactly by type of data? Maybe you can
give a bit more details because that can
-
mean a lot of things. Wikidata, the data
model of Wikidata is very flexible and
-
it's absolutely not set in stone. Every,
every week the community comes up with
-
some new ways to describe things.
Sometimes we realize that there is an area
-
of the world that we completely forgot to
cover, and then we create new properties
-
to describe, for example, a certain type
of, I don't know, of concept, a certain type
-
of building or object, or a
philosophical concept that we didn't
-
describe yet. So this is always in
movement, in action. When it comes to what
-
we actually call data types, which is, for
example, a string of text or a date or a
-
picture, we have all kinds of data types
like this, this is a bit more complicated
-
and overall, it's quite rare that we add a
new data type and it needs a strong, like,
-
use case before we add that to the software. I
hope that it answers your question and if I
-
didn't, feel free to ask again.
Herald: Yeah, we've got feedback. The
-
example Till meant was, there's a, there's
an organization or a project called
-
Parliamentwatch in Germany. There was one
talk earlier today where they try to track
-
and scrape and analyze the parliamentary
protocols. And one big issue they had was
-
with structural data about all the members
of parliament and how they are organized
-
and stuff like that. And, um, well, if I
remember correctly, there actually was a
-
project that tried to include the
structural data of members of
-
parliament in Wikidata, if I'm not
mistaken.
-
Léa: Absolutely. It's a WikiProject
that is called, um, something politicians,
-
all politicians. I don't remember the
exact name right now, but indeed. Some
-
people are already working on members of
parliaments and, like, political people in
-
general. So it's very likely that there is
already a way to structure the data. The
-
best way is to contact the people directly
involved on this, on this WikiProject.
-
WikiProjects, by the way, are pages where
basically people who have a specific topic
-
of interest gather and can discuss about
the specific questions about the topic.
-
Um, so have a look at this, at this
project about politics and, um, yeah. Try
-
to see if, if anything is missing, but
generally Wikidata definitely welcomes
-
information about politicians, about
members of parliament, this kind of stuff.
-
What we do not do, however, is store the
full, like, documents, for example, in
-
that case, the reports or the documents,
those belong elsewhere. Maybe on Wikimedia
-
Commons, for example, if it's possible, if
the license allows it. But on Wikidata,
-
we'll be happy to store the metadata about
them.
-
Herald: Alright, Joel just posted the link
to the WikiProject, Every Politician, so
-
if anybody looks for Every Politician on
Wikidata, they will find the project. So
-
basically, the bottom line is pretty much
anything is possible in Wikidata, right?
-
Léa: Yeah, thank you Joel, and hi. Almost
everything. So on Wikidata, just like on
-
Wikipedia, we still have some criteria to
define what can get in Wikidata and what
-
not, because we are aware that this
knowledge base, it needs to stay quite
-
general and it cannot contain absolutely
everything. For example, the community
-
decided a while ago that they would not
create one item for each human living or
-
who used to live on Earth, that's just not
possible, so there are some notability
-
criteria that you can find in the help
pages and I would say that the level of,
-
like, how fine-grained the data should be has
to be discussed with the community and the
-
good thing about having Wikibase also
available as a separate instance of
-
Wikidata is that if some people want to
work on a topic where they have some
-
information that is very, very specific
and would maybe not fit the scope of
-
Wikidata, they can create their own
Wikibase and then they can connect the
-
content with what is already in Wikidata.
So altogether, in this Wikibase ecosystem,
-
yes, pretty much everything is possible.
Herald: Well, the future is certainly
-
here, at least, with Wikidata. Thank you
again, Léa and Mohammed, for your
-
insightful introduction to Wikidata and
we're looking forward to more people
-
joining you in your efforts. Thanks for
your presentation.
-
Léa: Thank you. See you soon.
-
rC3 Wikipaka outro music
-
Subtitles created by c3subtitles.de
in the year 2021. Join, and help us!