intro music
Herald: Wikidata for (Data) Journalists
by Elisabeth Giesemann.
Elisabeth Giesemann: So our agenda for
today is that we will have a look at key
points of data journalism. We will quickly
explain what Wikidata is, what tools you
can use inside of Wikidata for data
visualization, and what other third-party
tools there are for your research. Then we
have a look at critical research done with
Wikidata. And finally, we take a critical
look at the data of Wikidata itself.
Key points of data journalism are that you
want to interview a dataset, so you want
to find connections, correlations and
causalities behind the data. Also, you
want to visualize the data in a compelling
way and you want to write your own story.
You want to find a new spin
and a new look at the facts
and all of these things
you can do with Wikidata.
At Wikimedia Deutschland, we want
to support evidence-based reporting;
that's why we want to support you
in using Wikidata.
Also data journalism helps you to tailor
your story to the users or your readers.
Data journalism helps you to create visual
storytelling instead of walls of text.
And this, again, helps you to convey facts
faster and much more easily,
and that makes your story
way more inclusive.
So how do you get to a story
with Wikidata?
You want to find and recognize patterns
in a dataset, you can search for geographical
data, you can search for similarities and
differences in the data, and you can also
search for missing data, because that also
exists in Wikidata. You can visualize your
findings with the tools that you find in
the Wikidata Query Service. And what's
most important is you can connect to the
Wikidata community and find people who are
working on a similar subject or have a
similar research question to the
one that you have. So I included this
visualization to show you that data is
only the beginning of your story and the
path that you will take. We want you to
use the data in Wikidata to create a
compelling story and thereby contribute
value and your idea about what's in the
data. Data is a lot, but it's not
everything: as we've seen in recent
months, many people aren't convinced by
facts. Also, there is a lack of time and
there is a lack of data literacy in
our society. It's not always easy to
understand the complexity of historical
events and developments, to understand the
complexity of medical data or demographic
changes. So it is important to have a
storytelling aspect to your data, to have
good visualizations and an
easy-to-understand approach to convey the
significance of your data and your story.
And finally, it is important to remain
transparent and clear about the use and
analysis of the data. So what is Wikidata?
Wikidata is a free linked database that
can be read and edited by both humans and
machines, so it is a database of linked
open data. That means that the data
doesn't just sit there in tables. It can
be connected and combined with other data
found on Wikidata. As such, it is a
realization of the semantic web as dreamt
of by Tim Berners-Lee, and Wikidata also
won a prize for its realization of the
semantic web. We just celebrated
Wikidata's 8th birthday. It currently holds 90
million items and has 44,000 active users
and contributors, which makes it the most
edited Wikimedia project. It was initially
meant to support the other projects of the
Wikimedia ecosystem and seen as a central
storage for the structured data of
sister projects like Wikivoyage,
Wikisource and the most famous Wikimedia
project, Wikipedia. But it also has
another function, which is to
provide free and open data to the
Internet, and that became really huge. As
already said, we now have more than 90
million data items on Wikidata. A
colleague of mine created this map and you
can see here the geolocation data that is
in Wikidata and we are very proud that
it's distributed all over the world, but
we also take it with a grain of
salt, because as you can see, it's very
bright in Europe and on the east and west
coasts of the US, but there are very dark
spots where we can't record the knowledge
in the same way as we do in our Western
societies and that brings us to the
question of what is knowledge equity and
how can we actually best serve everybody
in our global society? So how does it
work? Wikidata is made up of items, which
are real things or concepts in the real
world, like Berlin, Barack Obama or helium.
These items are identified with an ID, the
QID. So Q76 or Q... I can't read the
number now. These items have labels,
descriptions, aliases and sitelinks.
Labels means that the item is named in all
of the languages that Wikidata currently
holds; those are around 300.
Descriptions say what the item is, and
aliases are there because sometimes one
item has several names, and so on. An item
also has properties; those are used to
label data, like where a person was born,
their date of birth or death, or
the location of a specific building.
Statements hold information in
properties, so P47, "shares border with",
points to another country, or there is the population.
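
The item structure described here can be sketched in a few lines of Python. This is only an illustration of the idea, not Wikidata's actual storage format: the class and field names are made up for this sketch, P1082 is assumed to be the property ID for population, and the Berlin figure is the rounded number from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    """One claim about an item: a property ID plus a value."""
    property_id: str                                  # e.g. "P47" (shares border with)
    value: object
    qualifiers: dict = field(default_factory=dict)    # extra context for the value
    references: list = field(default_factory=list)    # sources backing the value


@dataclass
class Item:
    """A thing or concept, identified by its QID."""
    qid: str
    labels: dict = field(default_factory=dict)        # language code -> name
    descriptions: dict = field(default_factory=dict)  # language code -> short description
    aliases: dict = field(default_factory=dict)       # language code -> other names
    statements: list = field(default_factory=list)

    def values_of(self, property_id):
        """Return every value stated for one property."""
        return [s.value for s in self.statements if s.property_id == property_id]


# Illustrative data only; the population is the rounded figure from the talk.
berlin = Item(
    qid="Q64",
    labels={"en": "Berlin", "de": "Berlin"},
    descriptions={"en": "capital of Germany"},
    statements=[Statement("P1082", 3_700_000)],  # P1082: population (assumed ID)
)

print(berlin.values_of("P1082"))
```

The point of the sketch is that an item is little more than an ID plus multilingual names and a list of statements, which is what makes it easy to link and combine.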
Statements also have qualifiers to expand
the information, and they also have
references, which is very important because
for scientific research you want to have
those references. So here we see again our
item, Berlin, Q64. The property is
population, with a value of 3.7 million. So what's new
about research with Wikidata is that you
can ask your own questions. Before, you
would go to a library, and the
librarians (librarians are awesome)
would give you books with specific
facts in them and you would consume them
and try to use them for your research. With
Wikidata you can ask very specific
questions that nobody else has come up with
before. So for your research, you want to
do your own Wikidata queries; that's what
we have the Wikidata Query Service for.
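
As a rough sketch of what such a query can look like, here is the "population of the countries bordering Germany" question written in SPARQL and wrapped in a little Python. Some assumptions: Q183 is taken to be Germany and P1082 the population property, the helper names `build_url` and `run_query` and the User-Agent string are invented for this sketch, and the query returns current population values, not their development over time.

```python
import json
import urllib.parse
import urllib.request

# Public SPARQL endpoint of the Wikidata Query Service.
ENDPOINT = "https://query.wikidata.org/sparql"

# Everything that shares a border (P47) with Germany (Q183, assumed ID),
# together with its population (P1082, assumed ID).
QUERY = """
SELECT ?country ?countryLabel ?population WHERE {
  wd:Q183 wdt:P47 ?country .
  ?country wdt:P1082 ?population .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY DESC(?population)
"""


def build_url(query, endpoint=ENDPOINT):
    """Encode the query as a GET URL that asks for JSON results."""
    return endpoint + "?" + urllib.parse.urlencode({"query": query, "format": "json"})


def run_query(query):
    """Send the query over the network and return the result rows."""
    request = urllib.request.Request(
        build_url(query),
        # Hypothetical identifier; use your own tool name and contact details.
        headers={"User-Agent": "wikidata-journalism-sketch/0.1"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["results"]["bindings"]


if __name__ == "__main__":
    # Only prints the request URL; call run_query(QUERY) to fetch the data.
    print(build_url(QUERY))
```

The text inside `QUERY` is the part that matters. It can be pasted into the query service at query.wikidata.org as it is; the Python around it is optional and only useful if you want to fetch results from a script.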
The good news is that you don't have to
learn Python or R or become a data
scientist, but you want to learn a bit of
SPARQL. We included a few resources here
in this presentation and there's also
going to be a talk given by my colleague
Lucas on the 29th on how to query Wikidata
with SPARQL. We also have a guided tour on
Wikidata on our website which I can
recommend. OK, so, as said, once you
have queried your data, you can visualize
your results for more compelling
storytelling. There are several ways of
doing this, and I'm going to show you some
of them
just to give you an idea. You could, for
instance, ask the query service to show
you airports that are named after a person
and color code them according to their
gender. Gender of the person, not the
airport, obviously. You can ask the query
service to show you everything connected to
the item Berlin. You can ask it to show
you the population of the countries that
are bordering Germany and how it
developed. You can also ask the query
service to show you the most common cause
of death among noble people. Or here it
shows you a historical overview of
space probes. Or all of the children and
grandchildren of Genghis Khan. So we had a
look at the visualizations inside of
Wikidata's Query Service, but there are
also tools that use Wikidata's data for
their own visualizations. And I'm going to
show you some of them now. So here is
Histropedia, which makes timelines of
historical events using data from
Wikidata. This is Inventaire. Basically,
it lets you create your own private
library and then uses the data from
Wikidata to describe the publications.
Here is "Ask me anything". That's done by
different researchers in Europe, and it
lets you pose questions in natural
language to Wikidata so you don't have to
use the query service. That's a way
to use Wikidata that's also used by a lot
of voice assistants like Siri and Alexa.
And here you have Scholia, which is
basically a platform for scientific
publications that are published under open
access and collected, and it can answer
questions like who published what
paper, with whom and when, or who
wrote the first paper on COVID and when
it was published. And here we have "Sum
of All Paintings". Basically, it's a
database that collects all of the paintings
in the world and lists their metadata so
you can combine it in your own specific
way. So I showed you a couple of examples
of what you could do, and I want to point to
other researchers who did great stuff with
Wikidata and used it for very cool
storytelling. If my slides work, OK, here
we go. So, "Women's representation and
voice in media coverage of the coronavirus
crisis", that's a study done
by a researcher called Laura Jones
regarding the representation of female
experts within the coverage of
coronavirus. It uses evaluations of
Wikipedia and Wikidata to show
how much representation of female experts
there was. And, as we see, it's not a
lot. Finally, there is another great
example I want to tell you about, it's a
project called Enslaved.org. It's a linked
open data platform based on Wikibase,
which is the software behind Wikidata, and
it collects and
connects data related to the transatlantic
slave trade: data on people who suffered
under the slave trade, and the records that
were made by the people active in this slave
trade. That data had
been collected in several databases, and
Enslaved built one large database to
connect them and rebuild the stories,
which I think is a really great
way to humanize, with data, people who
have been dehumanized. As you
can see here, they collect
data from newspapers and from the
slaveholders to recount the stories of
individuals. So finally, I also want to
talk to you about one thing in Wikidata
that is always on our minds, which is that
Wikidata is not perfect. I highly
recommend the talk by Os Keyes,
"Questioning Wikidata", in which it is
explained that all classification systems
are inherently dangerous and Wikidata is a
large encyclopedic wiki classification
system which makes choices, ethical and
political choices, about what is notable,
about how to categorize information. And
these choices reduce complexity and
also reduce specific forms of history,
like oral history. This reduction has
consequences. As you know, Wikidata is
used by many programs, apps and voice
assistants, so what and how we store
information in Wikidata really matters. So
we ask ourselves, what is encyclopedic
knowledge? And how can we organize it in a
more inclusive way? Encyclopedic knowledge
is a Western concept, and we can and must
do better than just use our own Western
view to organize the world. But then also
the wiki principle applies, we have a huge
community behind Wikidata that helps us to
make these decisions, and you can also
become a part of this by researching
Wikidata, using it for your work and also
contributing your research. So once again,
I want to tell you, you can use Wikidata
as a tool for your storytelling. Wikidata
can help you find connections between
data. Wikidata can help
you build visualizations in its query
service. You can ask questions about
historical data correlations more
critically than you could
before. But there are also downsides
to Wikidata, because it is an
encyclopedic way of organizing Western
knowledge. So this was only a start. I'm
looking forward to our Q&A session now and
if you have further questions, concerns or
ideas, you can contact me and my
colleagues and you can also contact me
individually. Thank you.
Herald: Hello and welcome to Elisabeth.
Thank you very much for your interesting
talk. That was a very great introduction.
Elisabeth: Hi. Yeah, thanks for having me.
I'm happy that I was able to talk a bit
about Wikidata and how you could do
storytelling with it. I wanted to add
that, obviously, you can ask me questions
now, but I also want to point to the great
introduction to Wikidata that two of my
colleagues gave yesterday,
which is already online, and
tomorrow there will be a query service
workshop where you can learn a bit more
in-depth how to query Wikidata.
Herald: Yeah, that's a very good hint.
There are actually two questions in
the chat right now. The first one is, are
your slides going to be published, because
people are interested in your links to the
tutorials, obviously.
Elisabeth: Yes, I asked
before. I think the talk will be published
and the slides. Is there a Wikipaka board
where I can put it? Otherwise, I can also
put a link on our Twitter account,
Wikimedia Deutschland. And yeah...
Herald: I think Twitter for now would
probably be the best idea, I actually have
to check on the Wikipaka board, but we
will let you know where you can find
everything.
Elisabeth: I'll put it on the Wikimedia
Deutschland Twitter. It's @wmde, I think.
Herald: We will also retweet it,
obviously. You will find it, I promise.
Elisabeth: OK.
Herald: There's another question. What
resources would you recommend for self-
studying the writing of queries for
query.wikidata.org?
Elisabeth: Mhm. I put some links in
the slides. We
have a few tutorials on Wikidata.
There was also, a couple of months ago, a
very nice and very easy tutorial published
by Wikimedia Israel. So we didn't
do it, but I can recommend it; it's a very
low-key introduction to your first
queries.
Herald: OK. We will also publish that
somehow. I have a question for you as
well. You mentioned that Wikidata is
a great way of meeting other people who
are working on similar topics. So is there
some kind of larger community of
journalists using Wikidata?
Elisabeth: So far, the community is mostly
research based. That's also why we wanted
to reach out here. So I would recommend
getting in touch with the community
there regarding the research topics that
you have. And you can also get in touch
with us and we will connect you. I have a noise
in my ear, but I hope it's only me.
Herald: Well, I don't have it, so it might
just be you, but I feel like there might
also be an echo on the stream; that's what
people in the chat are saying.
Elisabeth: Oh, OK.
Herald: So I don't have any other questions
in the chat and since there seems to be an
echo on the stream, I don't want to annoy
people any further. So I would suggest that
everyone who has further questions for you
can meet you in our BigBlueButton
meetup room, which I will be posting in the
chat right now, and we will continue our
program here at 2:20 with another talk
about Flutter by "The one with the braid",
so I'm saying bye for now.
Elisabeth: Thanks, bye.
Herald: Bye.
outro music
Subtitles created by c3subtitles.de
in the year 2021. Join, and help us!