
#rC3 - Wikidata for (Data) Journalists

  • 0:00 - 0:09
    intro music
  • 0:15 - 0:18
    Herald: Wikidata for (Data) Journalists
    by Elisabeth Giesemann.
  • 0:20 - 0:26
    Elisabeth Giesemann: So our agenda for
    today is that we will have a look on key
  • 0:26 - 0:33
    points of data journalism. We will quickly
    explain what Wikidata is, what tools you
  • 0:33 - 0:39
    can use inside of Wikidata for data
    visualization, what other third party
  • 0:39 - 0:46
    tools are there for your research? Then we
    have a look at critical research done with
  • 0:46 - 0:53
    Wikidata. And finally, we have a critical
    look on the data of Wikidata itself.
  • 0:57 - 1:03
    Key points of data journalism are that you
    want to interview a dataset, so you want
  • 1:03 - 1:09
    to find connections, correlations and
    causalities behind the data. Also, you
  • 1:09 - 1:17
    want to visualize the data in a compelling
    way and you want to write your own story.
  • 1:17 - 1:24
    You want to find a new spin
    and a new look on- at the facts
  • 1:24 - 1:26
    and all of these things
    you can do with Wikidata.
  • 1:32 - 1:35
    At Wikimedia Deutschland, we want
    to support evidence-based reporting
  • 1:35 - 1:40
    that's why we want to support you
    in using Wikidata.
  • 1:40 - 1:50
    Also data journalism helps you to tailor
    your story to the users or your readers.
  • 1:50 - 1:56
    Data journalism helps you to create visual
    storytelling instead of walls of text.
  • 1:56 - 2:04
    And this, again, helps you to convey facts
    faster and way more easy
  • 2:04 - 2:06
    and that makes your story
    way more inclusive.
  • 2:11 - 2:14
    So how do you get to a story
    with Wikidata?
  • 2:14 - 2:19
    You want to find and recognize patterns
    in a dataset, you can search for geographical
  • 2:19 - 2:26
    data, you can search for similarities and
    differences in the data, and you can also
  • 2:26 - 2:32
    search for missing data, because that also
    exists in Wikidata. You can visualize your
  • 2:32 - 2:38
    findings with the tools that you find in
    the Wikidata Query Service. And what's
  • 2:38 - 2:43
    most important is you can connect to the
    Wikidata community and find people who are
  • 2:43 - 2:49
    working on a similar subject or have a
    similar research- research question to the
  • 2:49 - 3:00
    one that you have. So I included this
    visualization to show you that data is
  • 3:00 - 3:09
    only the beginning of your story and the
    path that you will take. We want you to
  • 3:09 - 3:17
    use the data in Wikidata for- to create a
    compelling story and therefore contribute
  • 3:18 - 3:30
    value and your idea about what's in the
    data. Because data is a lot, but it's not
  • 3:30 - 3:35
    everything, as we've seen in the last
    month, many people aren't convinced by
  • 3:35 - 3:43
    facts. Also, there is a lack of time and
    there is a lack of data- data literacy in
  • 3:43 - 3:49
    our society. It's not always easy to
    understand the complexity of historical
  • 3:49 - 3:55
    events and developments, to understand the
    complexity of medical data or demographic
  • 3:55 - 4:03
    changes. So it is important to have a
    storytelling aspect to your data, have
  • 4:03 - 4:08
    good visualizations and an easy to
    understand approach to convey the
  • 4:08 - 4:14
    significance of your data and your story.
    And finally, it is important to remain
  • 4:14 - 4:28
    transparent and clear about the use and
    analysis of the data. So what is Wikidata?
  • 4:28 - 4:34
    Wikidata is a free linked database that
    can be read and edited by both humans and
  • 4:34 - 4:40
    machines, so it is a database of linked
    open data. It- that means that the data
  • 4:40 - 4:46
    doesn't just sit there in tables. It can
    be connected and combined with other data,
  • 4:46 - 4:56
    found on Wikidata. As such, it is a
    realization of the semantic web as dreamt
  • 4:56 - 5:05
    by Tim Berners-Lee and also Wikidata won a
    prize for its realization of the semantic
  • 5:05 - 5:13
    web. We just celebrated Wikidata- data's
    8th birthday. It currently holds 90
  • 5:13 - 5:21
    million items and has 44,000 active users
    and contributors, which makes it the most
  • 5:21 - 5:32
    edited Wikimedia project. It was initially
    used to or thought of to support the
  • 5:32 - 5:39
    projects of the other projects of the
    Wikimedia ecosystem and seen as a central
  • 5:39 - 5:46
    storage for the structured data of the
    sister projects like Wikivoyage,
  • 5:46 - 5:58
    Wikisource and the most famous Wikimedia
    project, Wikipedia. But it also has
  • 5:58 - 6:05
    another function, which means- which is to
    provide free and open data to the
  • 6:05 - 6:13
    Internet, and that became really huge. As
    already said, we now have more than 80- 90
  • 6:13 - 6:19
    million data items on Wikidata. A
    colleague of mine created this map and you
  • 6:19 - 6:28
    can see here the geolocation data that is
    in Wikidata and we are very proud that
  • 6:28 - 6:34
    it's distributed all over the world but
    it's also- we also take it with a grain of
  • 6:34 - 6:41
    salt, because as you can see, it's very
    bright in Europe and on the east and west
  • 6:41 - 6:51
    coasts of the US, but there are very dark
    spots where we can't record the knowledge
  • 6:51 - 6:56
    in the same way as we do in our Western
    societies and that brings us to the
  • 6:56 - 7:02
    question of what is knowledge equity and
    how can we actually best serve everybody
  • 7:02 - 7:16
    in our global society? So how does it
    work? Wikidata items, which are real
  • 7:16 - 7:22
    things or concepts in the real world, like
    Berlin, Barack Obama, helium, and these
  • 7:22 - 7:36
    items are identified with an ID, the QID.
    So Q76 or Q... I don't, I can't read the
  • 7:36 - 7:43
    number now, so these items have labels,
    descriptions, aliases and sitelinks.
  • 7:43 - 7:50
    Labels, that means it's described in all
    of the languages that Wikidata holds
  • 7:50 - 7:59
    currently, those are around 300.
    Descriptions are forms to describe what
  • 7:59 - 8:10
    the item holds and aliases, sometimes one
    item has several names, etc, etc. An item
  • 8:10 - 8:17
    also has properties, those are used to
    label the data, like a person is born
  • 8:17 - 8:23
    somewhere, its date of birth or death or
    the location of a specific building.
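The parts of an item described above (label, description, aliases, sitelinks, properties) can be seen in the JSON record that Wikidata publishes for every item. The following is a minimal sketch, assuming the public `Special:EntityData` JSON export; the function names and the User-Agent string are made up for illustration and are not from the talk.

```python
import json
import urllib.request

# Public JSON export for a single item, e.g. Q64 (Berlin).
ENTITY_URL = "https://www.wikidata.org/wiki/Special:EntityData/{qid}.json"


def fetch_entity(qid):
    """Download the JSON record for one item (needs network access)."""
    request = urllib.request.Request(
        ENTITY_URL.format(qid=qid),
        # Hypothetical User-Agent; replace with your own contact details.
        headers={"User-Agent": "wikidata-talk-example/0.1"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["entities"][qid]


def summarize_entity(entity, lang="en"):
    """Pick out the parts of an item that the talk names."""
    return {
        "id": entity.get("id"),
        "label": entity.get("labels", {}).get(lang, {}).get("value"),
        "description": entity.get("descriptions", {}).get(lang, {}).get("value"),
        "aliases": [a["value"] for a in entity.get("aliases", {}).get(lang, [])],
        # Property IDs (P...) that have at least one statement on this item.
        "properties": sorted(entity.get("claims", {})),
        "sitelink_count": len(entity.get("sitelinks", {})),
    }
```

With network access, `summarize_entity(fetch_entity("Q64"))` would return the English label, description and aliases of the Berlin item, plus the list of property IDs used on it.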
  • 8:25 - 8:32
    Statements hold information in
    properties, so P47 shares the border with
  • 8:32 - 8:42
    another, like, country or the population.
    Statements also have qualifiers to expand
  • 8:42 - 8:48
    the information and then also they have
    references which is very important because
  • 8:50 - 9:00
    for scientific research, you want to have
    those references. So here we see again our
  • 9:00 - 9:22
    item, Berlin, Q64. The property is the
    population of 3.7 million. So what's new
  • 9:22 - 9:29
    about research with Wikidata is that you
    can ask your own questions. Before, you
  • 9:29 - 9:34
    would go to a library and some- the
    librarians - librarians are awesome, but
  • 9:34 - 9:41
    they would give you books with specific
    facts in them and you would consume them
  • 9:41 - 9:48
    and try to use them for your research. At
    Wikidata you can ask very specific
  • 9:48 - 9:56
    questions that nobody else came up with
    before. So for your research, you want to
  • 9:56 - 10:01
    do your own Wikidata queries, that's what
    we have the Wikidata Query Service for.
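As a rough illustration of such a query, the sketch below asks the query service for the countries that share a border with Germany and their population. The query text is SPARQL; Python is only used to send it and read the answer. The IDs Q183 (Germany) and P1082 (population) are not named in the talk, and the function names and User-Agent string are assumptions made for this example.

```python
import json
import urllib.parse
import urllib.request
from decimal import Decimal

ENDPOINT = "https://query.wikidata.org/sparql"

NEIGHBOURS_QUERY = """
SELECT ?country ?countryLabel ?population WHERE {
  wd:Q183 wdt:P47 ?country .        # Germany (Q183) shares border with (P47)
  ?country wdt:P1082 ?population .  # population (P1082)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY DESC(?population)
"""


def build_url(query):
    """Encode the SPARQL text into a GET URL for the query service."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    return ENDPOINT + "?" + params


def run_query(query):
    """Send the query and return the parsed JSON answer (needs network access)."""
    request = urllib.request.Request(
        build_url(query),
        # Hypothetical User-Agent; replace with your own contact details.
        headers={"User-Agent": "wikidata-talk-example/0.1"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def to_rows(results):
    """Turn the JSON answer into (QID, label, population) tuples."""
    rows = []
    for binding in results["results"]["bindings"]:
        qid = binding["country"]["value"].rsplit("/", 1)[-1]
        label = binding["countryLabel"]["value"]
        population = int(Decimal(binding["population"]["value"]))
        rows.append((qid, label, population))
    return rows
```

With network access, `to_rows(run_query(NEIGHBOURS_QUERY))` would give one row per neighbouring country. The same query text can also be pasted directly into the editor at query.wikidata.org, without any Python.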
  • 10:03 - 10:08
    The good news is that you don't have to
    learn Python or R or become a data
  • 10:08 - 10:17
    scientist, but you want to learn a bit of
    SPARQL. We included a few resources here
  • 10:17 - 10:23
    in this presentation and there's also
    going to be a talk given by my colleague
  • 10:23 - 10:33
    Lucas on the 29th on how to query Wikidata
    with SPARQL. We also have a guided tour on
  • 10:33 - 10:47
    Wikidata on our website which I can
    recommend. OK, so, um, as said, once you
  • 10:47 - 10:56
    queried your data, you can visualize your
    results for more compelling storytelling
  • 10:56 - 11:00
    and there are several ways of doing this
    and I'm going to show you some of this
  • 11:00 - 11:10
    just to give you an idea. You could, for
    instance, ask the query service to show
  • 11:10 - 11:18
    you airports that are named after a person
    and color code them according to their
  • 11:18 - 11:32
    gender. Gender of the person, not the
    airport, obviously. You can ask the query
  • 11:32 - 11:46
    service, show me everything connected to
    the item Berlin. You can ask it to show
  • 11:46 - 11:52
    you the population of the countries that
    are bordering Germany and how it
  • 11:52 - 12:03
    developed. You can also ask the query
    service to show you the most common cause
  • 12:03 - 12:17
    of death among noble people. Or here it
    shows you an- an historical overview of
  • 12:17 - 12:43
    space probes. Or all of the children and
    grandchildren of Genghis Khan. So we had a
  • 12:43 - 12:48
    look on the visualizations inside of
    Wikidata's Query Service, but there are
  • 12:48 - 12:55
    also tools that use Wikidata's data for
    their own visualizations. And I'm going to
  • 12:55 - 13:05
    show you some of them now. So here is
    Histropedia, which makes time beams of
  • 13:05 - 13:16
    historical events using data from
    Wikidata. This is Inventaire. Basically,
  • 13:16 - 13:24
    it lets you create your own private
    library and then uses the data from
  • 13:24 - 13:35
    Wikidata to describe the publications.
    Here is "Ask me anything". That's done by
  • 13:35 - 13:43
    different researchers in Europe, and it
    lets you pose questions in natural
  • 13:43 - 13:53
    language to Wikidata so you don't have to
    use the query service. That's a way
  • 13:53 - 14:02
    to use Wikidata that's also used by a lot
    of voice assistants like Siri and Alexa.
  • 14:05 - 14:11
    And here you have Scholia, which is
    basically a platform for scientific
  • 14:11 - 14:19
    publications that are published under open
    access and collected, and it can answer
  • 14:19 - 14:28
    your questions like who published what
    paper, with whom, who and when or who
  • 14:28 - 14:37
    wrote the first paper on COVID, when was
    it published, etc. And here we have "Sum
  • 14:37 - 14:45
    of All Paintings". Basically, it's a
    database that creates all of the paintings
  • 14:45 - 14:51
    in the world and lists their metadata so
    you can combine it in your own specific
  • 14:51 - 15:06
    way. So I showed you a couple of examples,
    what you could do, and I want to hint at
  • 15:06 - 15:15
    other researchers who did great stuff with
    Wikidata and used it for very cool
  • 15:15 - 15:32
    storytelling. If my slides work, OK, here
    we go. So, um, "Women's representation and
  • 15:32 - 15:37
    voice in media coverage of the coronavirus
    crisis", that's the- that's a study done
  • 15:37 - 15:46
    by a researcher called Laura Jones
    regarding the representation of female
  • 15:46 - 15:54
    experts within the coverage of
    coronavirus. It uses evaluations of
  • 15:54 - 16:04
    Wikipedia and Wikidata to show- to show
    how much representation was there, of
  • 16:04 - 16:22
    female experts. And, as we see, it's not a
    lot. Finally, there is another great
  • 16:22 - 16:30
    example I want to tell you about, it's a
    project called Enslaved.org. It's a linked
  • 16:30 - 16:38
    open data platform based on Wikibase,
    which is the software behind Wikidata and
  • 16:38 - 16:46
    it basically shows or it collects and
    connects data related to the transatlantic
  • 16:46 - 16:53
    slave trade. So, people who suffered under
    the slave trade and the records that were
  • 16:53 - 17:03
    done by the people active in this slave
    trade, those data is collected. It has
  • 17:03 - 17:13
    been collected in several databases and
    Enslaved built one large database to
  • 17:13 - 17:22
    connect them and rebuild the stories,
    which I think is a really great idea to or
  • 17:22 - 17:30
    really great way to humanize people who
    have been dehumanized with data. Like you
  • 17:30 - 17:41
    can see here, they collect- they collect
    data from newspapers and from the
  • 17:41 - 17:56
    slaveholders to recount a story of
    individuals. So finally, I also want to
  • 17:56 - 18:03
    talk to you about one thing in Wikidata
    that is always on our minds, which is that
  • 18:04 - 18:10
    Wikidata is not perfect. I highly
    recommend the talk by Os Keyes
  • 18:10 - 18:16
    "Questioning Wikidata" in which it is
    explained that all classification systems
  • 18:16 - 18:23
    are inherently dangerous and Wikidata is a
    large encyclopedic wiki classification
  • 18:23 - 18:31
    system which makes choices, ethical and
    political choices, about what is notable,
  • 18:31 - 18:43
    about how to categorize information. And
    these choices, they reduce complexity and
  • 18:43 - 18:54
    reduce also specific forms of- of history,
    like oral history. This reduction has
  • 18:54 - 19:03
    consequences. As you know, Wikidata is
    used by many programs, apps, voice
  • 19:03 - 19:17
    assistants and what- what and how we store
    information in Wikidata really matters. So
  • 19:17 - 19:27
    we ask ourselves, what is encyclopedic
    knowledge? And how can we organize it in a
  • 19:27 - 19:34
    more inclusive way? Encyclopedic knowledge
    is a Western concept, and we can and must
  • 19:34 - 19:46
    do better than just use our own Western
    view to organize the world. But then also
  • 19:46 - 19:52
    the wiki principle applies, we have a huge
    community behind Wikidata that helps us to
  • 19:52 - 20:00
    make these decisions, and you can also
    become a part of this by researching
  • 20:00 - 20:12
    Wikidata, using it for your work and also
    contributing your research. So once again,
  • 20:12 - 20:18
    I want to tell you, you can use Wikidata
    as a tool for your storytelling. Wikidata
  • 20:18 - 20:24
    can help you find connections between
    data. Wikidata can help you find- can help
  • 20:24 - 20:30
    you build visualization in its query
    service. You can ask questions about
  • 20:30 - 20:38
    historical data correlations more
    critically than you could- than you could
  • 20:38 - 20:45
    before. And- but there are also downsides
    to- downsides to Wikidata because it is an
  • 20:45 - 20:55
    encyclopedic way of organizing Western
    knowledge. So this was only a start. I'm
  • 20:55 - 21:03
    looking forward to our Q&A session now and
    if you have further questions, concerns or
  • 21:03 - 21:08
    have ideas, you can contact me and my
    colleagues and you can also contact me
  • 21:08 - 21:19
    individually. Thank you.
  • 21:19 - 21:24
    Herald: Hello and welcome to Elisabeth.
    Thank you very much for your interesting
  • 21:24 - 21:30
    talk. That was a very great introduction.
    Elisabeth: Hi. Yeah, thanks for having me.
  • 21:30 - 21:36
    I'm happy that I was able to talk a bit
    about Wikidata and how you could do
  • 21:36 - 21:43
    storytelling with it. I wanted to add
    that, obviously, you can ask me questions
  • 21:43 - 21:51
    now, but also I want to hint at the great
    introduction of Wikidata that one of my
  • 21:51 - 21:57
    colleagues gave yesterday, two of my
    colleagues, which is already online, and
  • 21:57 - 22:03
    tomorrow there will be a query service
    workshop where you can learn a bit more
  • 22:03 - 22:09
    in-depth how to query Wikidata.
    Herald: Yeah, that's a very good hint.
  • 22:09 - 22:13
    There's actually there's two questions in
    the chat right now. The first one is, are
  • 22:13 - 22:18
    your slides going to be published because
    people are interested in your links to the
  • 22:18 - 22:22
    tutorials, obviously.
    Elisabeth: Yes, that was, uh, I asked
  • 22:22 - 22:30
    before, I think the talk will be published
    and the slides. Is there a Wikipaka board
  • 22:30 - 22:36
    where I can put it? Otherwise, I can also
    put a link on our Twitter account,
  • 22:36 - 22:44
    Wikimedia Deutschland. And yeah...
    Herald: I think Twitter for now would
  • 22:44 - 22:48
    probably be the best idea, I actually have
    to check on the Wikipaka board, but we
  • 22:48 - 22:50
    will let you know where you can find
    everything.
  • 22:50 - 23:02
    Elisabeth: I put it on the Wikimedia
    Deutschland Twitter. It's @wmde I think
  • 23:02 - 23:05
    Herald: We will also retweet it
    obviously. You will find it, I promise.
  • 23:05 - 23:09
    Elisabeth: OK.
    Herald: There's another question. What
  • 23:09 - 23:13
    resources would you recommend for self-
    studying the writing of queries for
  • 23:13 - 23:19
    query.wikidata.org?
    Elisabeth: Mhm. Um, I put some links in
  • 23:19 - 23:28
    the- in the slides. There is... yeah, we
    have, like, a few tutorials on Wikidata.
  • 23:28 - 23:35
    There was also a couple of months ago, a
    very nice and very easy tutorial published
  • 23:35 - 23:42
    by Wikimedia Israel. And I- so we didn't
    do it, but I can recommend it, it's a very
  • 23:43 - 23:48
    low key introduction to your first
    queries.
  • 23:48 - 23:54
    Herald: OK. We will also publish that
    somehow. I have a question for you as
  • 23:54 - 23:59
    well. You mentioned that Wikidata is like
    a great way for meeting other people that
  • 23:59 - 24:05
    are working on similar topics. So is there
    some kind of like greater community of
  • 24:05 - 24:13
    journalists using Wikidata?
    Elisabeth: So far, the community is mostly
  • 24:13 - 24:19
    research based. That's also why we wanted
    to reach out here. So I would recommend
  • 24:19 - 24:26
    getting in touch with the community on
    there regarding the research topics that
  • 24:26 - 24:35
    you have. And you can also get in touch
    with us and we connect you. I have a noise
  • 24:35 - 24:41
    in my ear, but I hope it's only me.
    Herald: Well, I don't have it, so it might
  • 24:42 - 24:47
    just be you, but I feel like there might
    be also an echo on the stream, that's what
  • 24:47 - 24:51
    people on the chat are saying.
    Elisabeth: Oh, OK.
  • 24:51 - 24:56
    Herald: So I don't have any other questions
    in the chat and since there seems to be an
  • 24:56 - 25:02
    echo on the stream, I don't want to annoy
    people any further. So I would suggest for
  • 25:02 - 25:08
    everyone who has further questions to you
    that you can meet in our Big Blue Button
  • 25:08 - 25:16
    meetup room that I will be posting in the
    chat right now and we will continue our
  • 25:16 - 25:23
    program here at 2:20 with another talk
    about Flutter by "The one with the braid",
  • 25:23 - 25:29
    so I'm saying bye for now.
    Elisabeth: Thanks, bye.
  • 25:29 - 25:30
    Herald: Bye.
  • 25:30 - 25:34
    outro music
  • 25:34 - 25:40
    Subtitles created by c3subtitles.de
    in the year 2021. Join, and help us!
Video Language:
English
Duration:
25:40