cdn.media.ccc.de/.../wikidatacon2019-1068-eng-Reinstating_Female_Artists_to_the_Cultural_Record_via_Wikidata_hd.mp4

  • 0:06 - 0:08
    Right! Hello, everyone!
  • 0:09 - 0:10
    The clock is running.
  • 0:11 - 0:15
    Thank you to Wikidata
    for the opportunity to speak here today.
  • 0:15 - 0:18
    I will be talking about reinstating
    female artists to the cultural record.
  • 0:19 - 0:21
    First off, what am I doing here?
  • 0:21 - 0:24
    I work for an organization called digiS,
  • 0:24 - 0:27
    which is the Forschungs- und
    Kompetenzzentrum Digitalisierung Berlin.
  • 0:27 - 0:29
    We are a state-funded body.
  • 0:29 - 0:31
    We're based in
    the Zuse Institute in Berlin,
  • 0:31 - 0:36
    which is part of the high-performance
    computing network in Germany.
  • 0:36 - 0:39
    We are involved with GLAMs,
    with cultural institutions in Berlin,
  • 0:40 - 0:45
    helping them get
    their analog treasures online,
  • 0:45 - 0:49
    digitized online,
    making high-quality metadata.
  • 0:49 - 0:53
    And one of our major focuses
    is also long-term digital preservation.
  • 0:53 - 0:55
    So, we assist them with all of that.
  • 0:55 - 0:57
    And as I said, it's state-funded,
  • 0:57 - 0:58
    and with a focus on Berlin,
  • 0:58 - 1:00
    so we work with Berlin institutions.
  • 1:02 - 1:04
    Our standard operating procedure
  • 1:04 - 1:06
    is that we take the metadata
  • 1:06 - 1:10
    that the cultural institutions
    have produced
  • 1:10 - 1:13
    and that is ingested into
    the Deutsche Digitale Bibliothek.
  • 1:13 - 1:15
    That's our main focus.
  • 1:15 - 1:20
    And then on into Europeana
    if the metadata is appropriately licensed.
  • 1:20 - 1:23
    Not all the institutions
    put out their metadata with CC0.
  • 1:24 - 1:26
    But that's our standard route.
  • 1:26 - 1:29
    But we're always looking
    at new ways to present data,
  • 1:29 - 1:32
    because a lot of the museums
    and cultural institutions we work with,
  • 1:32 - 1:34
    archives as well,
  • 1:34 - 1:37
    sometimes focus a bit too much
    on putting the data on a website.
  • 1:37 - 1:39
    You've got to realize
    that we're dealing with institutions,
  • 1:39 - 1:41
    sometimes very small institutions
    in Berlin,
  • 1:41 - 1:44
    where there's just one
    or two people working.
  • 1:44 - 1:46
    They're really happy
    if they've just got a website.
  • 1:46 - 1:49
    If we start coming to them and saying,
    "Hey! Start producing your metadata
  • 1:49 - 1:53
    in LIDO or METS/MODS or whatever,"
    they tend to have a bit of a crisis,
  • 1:53 - 1:55
    but that's why we've got
    this funding program
  • 1:55 - 1:56
    to try and help them through it,
  • 1:56 - 1:58
    because we also would like
    the data that they produce
  • 1:58 - 2:00
    to be usable for machines,
  • 2:00 - 2:04
    to really leverage the power
    of the semantic web and linked data.
  • 2:05 - 2:08
    Which is why we became interested
    in looking at Wikidata,
  • 2:08 - 2:10
    because some of our partners
    were actually using Wikidata already,
  • 2:10 - 2:12
    pretty much as a resource.
  • 2:12 - 2:16
    They weren't inputting any data,
    but to make their whole...
  • 2:18 - 2:21
    collection management easier,
    they would then connect to Wikidata
  • 2:21 - 2:24
    to save them the trouble
    of inputting everything
  • 2:24 - 2:28
    and that's how we ended up
    with the Berliner Malweiber,
  • 2:28 - 2:29
    as they are called.
  • 2:30 - 2:32
    So, Malweiber is a pejorative term.
  • 2:32 - 2:35
    It just means "Painting Women,"
  • 2:35 - 2:39
    and it was used to refer to women artists
    who were not taken seriously
  • 2:39 - 2:42
    by the male-dominated academe
    at the turn of the last century.
  • 2:43 - 2:44
    These were women who were painting,
  • 2:44 - 2:47
    but they weren't allowed
    to study at any universities
  • 2:47 - 2:50
    or any higher education institutions.
  • 2:50 - 2:52
    They were allowed to maybe
    go to painting schools,
  • 2:52 - 2:55
    but they were producing a lot of work.
  • 2:55 - 2:57
    Some of the artists
    that we've got in the database
  • 2:57 - 3:02
    are from the early 1850s as well, but most of it
    was around the turn of the century.
  • 3:02 - 3:04
    So, this is Anna Bernhardi's
    self-portrait
  • 3:04 - 3:06
    from 1891, I think.
  • 3:08 - 3:11
    In Berlin, the first time
    women were admitted
  • 3:11 - 3:13
    to the Schule der Kunst was 1919.
  • 3:13 - 3:16
    So, it took a while
    before they got any recognition
  • 3:16 - 3:18
    for the work that they were doing.
  • 3:18 - 3:21
    So, the Stadtmuseum in Berlin, in 2016,
    had a temporary exhibition
  • 3:21 - 3:23
    called "Berlin -- Stadt der Frauen,"
  • 3:23 - 3:26
    where they tried to redress
    this imbalance a bit
  • 3:26 - 3:28
    and to showcase
    a lot of the work that they have--
  • 3:28 - 3:30
    the Stadtmuseum has their artworks--
  • 3:30 - 3:33
    and to actually put on
    an exhibition for them.
  • 3:33 - 3:38
    Of the 20 women who were
    the focus of the exhibition,
  • 3:38 - 3:39
    most of them are already quite famous,
  • 3:39 - 3:42
    but there were still quite a few
    artworks in the museum
  • 3:42 - 3:45
    that are from lesser-known artists.
  • 3:48 - 3:51
    And that's why, as part of this program--
  • 3:51 - 3:53
    this is the website from digiS--
  • 3:53 - 3:57
    there was a program in 2016
    or a project by the Stadtmuseum
  • 3:57 - 4:00
    to then digitize a further 215
    of their paintings
  • 4:00 - 4:02
    and produce metadata to that.
  • 4:02 - 4:04
    So, that was the basis of our work.
  • 4:05 - 4:07
    So, standard operating procedure:
  • 4:07 - 4:12
    we announced the ingest
    on the Wikidata Import website.
  • 4:12 - 4:14
    It was actually the old site.
    They've changed the way now.
  • 4:14 - 4:16
    And happily...
  • 4:19 - 4:21
    ...cleaned up the data
    as much as I could in OpenRefine
  • 4:21 - 4:24
    because I was dealing with a dataset
    that had been edited
  • 4:24 - 4:27
    by about three different employees
    of the museum.
  • 4:27 - 4:30
    So, you notice immediately
    how great it would be
  • 4:30 - 4:32
    if people would use controlled
    vocabularies and stuff like that
  • 4:32 - 4:37
    because for materials and techniques
    there were very often different words used
  • 4:37 - 4:39
    to describe exactly the same concept.
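
[Editor's sketch, not part of the talk: the kind of cleanup described here, mapping several spellings of a material or technique onto one controlled term, might look roughly like the Python below. The mapping table and its terms are hypothetical examples, not the museum's actual data, and the talk itself used OpenRefine rather than a script.]

```python
import unicodedata
from typing import Iterable, Optional

# Hypothetical variant -> controlled-term table; these entries are
# illustrative only and are not taken from the Stadtmuseum dataset.
CONTROLLED_TERMS = {
    "oil on canvas": "oil on canvas",
    "oil/canvas": "oil on canvas",
    "öl auf leinwand": "oil on canvas",
    "oel auf leinwand": "oil on canvas",
    "watercolour": "watercolor",
    "aquarell": "watercolor",
}


def normalize(term: str) -> str:
    """Lowercase, trim, and collapse whitespace so spelling variants compare equal."""
    cleaned = unicodedata.normalize("NFC", term).strip().lower()
    return " ".join(cleaned.split())


def to_controlled(term: str) -> Optional[str]:
    """Return the controlled term for a variant, or None if it is not mapped yet."""
    return CONTROLLED_TERMS.get(normalize(term))


def unmapped(terms: Iterable[str]) -> list:
    """List the normalized terms that still need a decision by a human editor."""
    return sorted({normalize(t) for t in terms if to_controlled(t) is None})
```

[Calling `unmapped(...)` on a column of raw values gives the list of terms someone still has to decide on, which is where the manual work in a cleanup like this tends to sit.]
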
  • 4:39 - 4:41
    I did a whole lot of cleaning up,
  • 4:41 - 4:44
    checked for which artists
    were already existing in Wikidata,
  • 4:44 - 4:46
    which had to be added new.
  • 4:46 - 4:50
    All of this was uploaded,
    and then tragedy struck,
  • 4:51 - 4:57
    because for a lot of these artists
    I had no external references.
  • 4:57 - 5:00
    I do not have any reference in the GND,
    the German Authority Control.
  • 5:03 - 5:07
    And so, the only place I could enter
    their data was in Wikidata.
  • 5:07 - 5:09
    They didn't exist anywhere else.
  • 5:09 - 5:13
    And Wikidata admin, quite rightly,
    deleted 23 of the artists
  • 5:13 - 5:15
    that I had added, by saying,
  • 5:15 - 5:19
    "No source, no sitelink, no backlink,
    no data to understand who they are."
  • 5:19 - 5:22
    I waited a while, then I canceled.
  • 5:22 - 5:25
    So, I engaged in a discussion
    with this admin
  • 5:25 - 5:31
    and he or she very kindly restored
    the items that had been deleted,
  • 5:31 - 5:34
    which gave me the opportunity to say,
  • 5:34 - 5:36
    "From my point of view,
    I did a bit of calculating back then,
  • 5:36 - 5:39
    I may be wrong; this is a year ago now.
  • 5:39 - 5:40
    There were 50 million items.
  • 5:40 - 5:42
    Of those, a million had been deleted.
  • 5:42 - 5:44
    That's about 2% of Wikidata."
  • 5:44 - 5:47
    [Arguing], I realized today
    that in the earlier session
  • 5:47 - 5:50
    we'd say that Wikidata
    was having a resource problem. Good.
  • 5:50 - 5:52
    We need bigger machines.
  • 5:53 - 5:54
    But it would be interesting to say,
  • 5:54 - 5:58
    maybe the notability criteria
    should be made even more inclusive
  • 5:59 - 6:02
    if it's clearly spam or vandalism--
    yes, that's not an issue--
  • 6:02 - 6:05
    but there may be some items
    that are a little bit on the borderline
  • 6:05 - 6:08
    and maybe Wikidata would be
    an interesting place to store that data,
  • 6:08 - 6:12
    or Wikibase, if we're going to see
    some other options
  • 6:12 - 6:14
    of what we can do with it.
  • 6:14 - 6:17
    So, a suggestion would be to set up
    some sort of arbitration process,
  • 6:17 - 6:20
    because what annoyed me a little bit
    about having my items deleted
  • 6:20 - 6:23
    was the fact that I wasn't really
    given a chance to argue my case
  • 6:23 - 6:25
    in any way, shape, or form,
  • 6:25 - 6:28
    and it would have been nice if someone
    gave me the opportunity just to say,
  • 6:28 - 6:32
    "Hang on! This is actually real data,"
    and explain the situation.
  • 6:32 - 6:35
    On Wikipedia, we have pages very often
    with a banner on the top:
  • 6:35 - 6:36
    "This page has some issues."
  • 6:36 - 6:38
    Maybe we could think
    of something similar for Wikidata,
  • 6:38 - 6:41
    but I do not know
    how realistic that suggestion is.
  • 6:41 - 6:46
    It was just something I thought
    that might make people be more open
  • 6:46 - 6:49
    to putting their work into Wikidata
    and not be too terrified
  • 6:49 - 6:51
    that their work is going to be
    somehow deleted at some point.
  • 6:51 - 6:54
    I'll just remind you
    what the Wikidata criteria are.
  • 6:56 - 6:58
    Let me scroll.
  • 6:58 - 7:00
    This is not my computer,
    and I am left-handed,
  • 7:00 - 7:02
    so I'm struggling here a bit.
  • 7:04 - 7:06
    I might just...
  • 7:07 - 7:08
    ...use my left hand!
  • 7:12 - 7:13
    Let me scroll. Yes!
  • 7:13 - 7:15
    Ah! Of course! Windows.
  • 7:17 - 7:18
    There we go!
  • 7:18 - 7:21
    So, if it refers to an instance
    of a clearly identifiable concept
  • 7:22 - 7:25
    or material entity,
    the entity must be notable in the sense
  • 7:25 - 7:28
    that it can be described using serious
    and publicly available references.
  • 7:28 - 7:29
    So, this was part of our battle
  • 7:29 - 7:34
    in how can we get
    a reference for an artist
  • 7:34 - 7:39
    who was excluded by design
    from any formal record.
  • 7:39 - 7:41
    It's quite a complicated situation.
  • 7:42 - 7:45
    Now, how do I get out of that page?
  • 7:46 - 7:47
    Going back to the left hand.
  • 7:49 - 7:52
    Finding the mouse. Hello, mouse?
    Anyone seen a mouse?
  • 7:53 - 7:55
    (chuckles)
  • 7:55 - 7:57
    Ah! There we go!
  • 7:57 - 7:58
    Let's go back.
  • 7:58 - 8:02
    So, at this stage, there are
    still 19 artists without a GND,
  • 8:02 - 8:05
    so this is the German Authority Control.
  • 8:05 - 8:09
    So, these are all German artists,
    or Berlin artists as well.
  • 8:09 - 8:13
    So, one would hope
    that they would qualify for a GND ID,
  • 8:13 - 8:15
    but at this stage they do not have one.
  • 8:16 - 8:20
    I have artists, for example,
    like Ida Maurer-Hahn...
  • 8:21 - 8:22
    (humming)
  • 8:23 - 8:24
    There we go!
  • 8:25 - 8:27
    ...where I can really only just say
    this much about her.
  • 8:27 - 8:31
    I know she's human, I know she's female,
    and I know she's a painter.
  • 8:31 - 8:33
    That's all I know.
    I'm not the domain expert.
  • 8:33 - 8:35
    This is data I got from the museum.
  • 8:35 - 8:37
    The museum is busy setting up
    all sorts of other things,
  • 8:37 - 8:39
    so it's a little bit
    of a battle for them too
  • 8:39 - 8:41
    to allocate resources to this,
  • 8:41 - 8:42
    and we at digiS are stuck in the middle,
  • 8:42 - 8:46
    trying to make sure that good quality
    metadata is being created,
  • 8:46 - 8:51
    and doing our best
    to ensure that it's done.
  • 8:54 - 8:56
    There we go!
  • 8:56 - 8:57
    And this is, for example, in some cases,
  • 8:57 - 9:03
    even though I may not have
    the metadata or an external reference,
  • 9:03 - 9:04
    I have a painting,
  • 9:04 - 9:06
    and I can at least link to that painting.
  • 9:06 - 9:08
    And for some of the paintings
    that are online at Stadtmuseum--
  • 9:08 - 9:11
    but I won't-- well, maybe
    we can explain this here.
  • 9:11 - 9:14
    This is what the metadata looks like
    for one of the paintings, instance of.
  • 9:14 - 9:17
    I have all the data that the museum
    has about this artwork;
  • 9:17 - 9:20
    however, I have no online link
    to any of this stuff.
  • 9:20 - 9:21
    I can't reference it
  • 9:21 - 9:24
    because it's in their collection
    management system.
  • 9:24 - 9:27
    It's not available online yet,
    so how am I going to do this?
  • 9:31 - 9:34
    There is hope. Do not despair!
  • 9:38 - 9:41
    The German Authority Control
    is run by the German National Library,
  • 9:41 - 9:43
    the Deutsche Nationalbibliothek,
  • 9:43 - 9:45
    and they have a project running
    called GND4C,
  • 9:45 - 9:48
    which is the GND for culture data.
  • 9:48 - 9:51
    And, in fact, I presented
    on this topic a few days ago
  • 9:51 - 9:55
    at the annual meeting
    of the expert group on documentation
  • 9:55 - 9:57
    of the Museums Federation of Germany,
  • 9:57 - 10:00
    and there were people
    in the audience from the GND4C
  • 10:00 - 10:05
    who have now said they will help me get
    at least these 19 artists who have no GND
  • 10:05 - 10:08
    to at least get their IDs,
    so that we can start the process of--
  • 10:08 - 10:11
    as it has been said
    in another presentation today,
  • 10:12 - 10:15
    the moment this data is online in Wikidata
    and someone does a query
  • 10:15 - 10:19
    and says, "Hey, here's an artist,
    but there's not much data about them.
  • 10:19 - 10:21
    Maybe there's a story to tell here,
    and maybe we'd be interested
  • 10:21 - 10:23
    in doing some research
    and to expand that."
  • 10:23 - 10:27
    That's been the amazing effect
    of uploading a lot of this data
  • 10:27 - 10:30
    when there have been IDs
    or there has been the ability to leverage
  • 10:30 - 10:32
    all the linked data power,
  • 10:32 - 10:34
    that all of a sudden,
    so much more information
  • 10:34 - 10:37
    has been collected about these artists
    who in the past had really been ignored.
  • 10:41 - 10:44
    What are the lessons
    we can learn from all of this?
  • 10:44 - 10:46
    How can we improve on past performance?
  • 10:49 - 10:53
    I think it's difficult to find
    the right name to describe this.
  • 10:53 - 10:55
    There is such a thing
    as the Chief Data Officer.
  • 10:55 - 10:58
    It's very much a commercial
    and business enterprise sort of thing,
  • 10:58 - 11:05
    but what we noticed is that
    in the cultural and heritage institutions
  • 11:05 - 11:08
    which have a personnel
    and resource problem anyway,
  • 11:08 - 11:10
    we would like to have
    a Chief Data Officer,
  • 11:10 - 11:14
    at least someone who is responsible there
    for making sure that the metadata
  • 11:14 - 11:16
    is of a high standard
    and that it's not just--
  • 11:16 - 11:19
    There's a role in Germany
    called the data redakteur,
  • 11:19 - 11:21
    I don't know what the English
    translation is. I could not find it.
  • 11:21 - 11:23
    But someone who is mainly responsible
  • 11:23 - 11:26
    to make sure that there are
    no spelling mistakes in the data
  • 11:26 - 11:28
    or that it's of a relatively good quality.
  • 11:28 - 11:30
    But we actually need a bit more.
  • 11:30 - 11:32
    And a rose by any name
    would smell as sweet,
  • 11:32 - 11:35
    so I don't really care what we call it.
    I care more about what they do.
  • 11:35 - 11:38
    And what they need to do
    is to make strategic decisions
  • 11:38 - 11:39
    regarding this data usage.
  • 11:39 - 11:42
    Someone actually in the institution
    needs to sit down and say,
  • 11:42 - 11:44
    "It's important that
    we get online because..."
  • 11:44 - 11:47
    I unfortunately missed
    the presentation by The Met today
  • 11:47 - 11:50
    to see what they've done with their data
    once they've uploaded it into Wikidata.
  • 11:51 - 11:56
    But if we can explain
    to the institutions why it's important,
  • 11:56 - 11:58
    and they can see the reason for this,
  • 11:58 - 12:00
    then maybe they will be able
    to make funds available
  • 12:00 - 12:02
    or resources available
    to have someone to take this task
  • 12:02 - 12:05
    of actually realizing a museum
    or a cultural heritage institution
  • 12:05 - 12:06
    is really a data provider.
  • 12:06 - 12:08
    They need the architecture
    to provide data.
  • 12:08 - 12:11
    They need to know
    who wants to use the data.
  • 12:11 - 12:14
    Is it digital humanities researchers?
    Is it just the general public?
  • 12:14 - 12:16
    They could have one central system
  • 12:16 - 12:19
    that can maybe feed into their website
    and also have an API.
  • 12:19 - 12:23
    And one of the issues with the DDB
    is they have an API,
  • 12:23 - 12:28
    so the metadata that we uploaded
    into the DDB can be queried by an API.
  • 12:28 - 12:30
    However, you need an API key
    before you do that.
  • 12:31 - 12:33
    That makes Wikidata attractive again
  • 12:33 - 12:35
    because I can just go to
    any Wikidata SPARQL endpoint
  • 12:35 - 12:37
    and basically do my query.
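
[Editor's sketch, not part of the talk: a query of the kind alluded to here could be sent to the public Wikidata Query Service without an API key. The Python below is a minimal illustration under stated assumptions: the query itself (female painters with no GND ID) is an example chosen by the editor, not the speaker's query, and the User-Agent string is a hypothetical placeholder. The identifiers used are Wikidata's P31 (instance of), Q5 (human), P21 (sex or gender), Q6581072 (female), P106 (occupation), Q1028181 (painter), and P227 (GND ID).]

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Example query (editor's assumption): female painters without a GND ID.
QUERY = """
SELECT ?artist ?artistLabel WHERE {
  ?artist wdt:P31 wd:Q5 ;          # instance of: human
          wdt:P21 wd:Q6581072 ;    # sex or gender: female
          wdt:P106 wd:Q1028181 .   # occupation: painter
  FILTER NOT EXISTS { ?artist wdt:P227 ?gnd . }   # no GND ID recorded
  SERVICE wikibase:label { bd:serviceParam wikibase:language "de,en" . }
}
LIMIT 50
"""


def build_request_url(query: str) -> str:
    """Build a GET URL for the query service; no API key is involved."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


def run_query(query: str) -> list:
    """Send the query and return the result bindings (performs a network call)."""
    request = Request(
        build_request_url(query),
        # Hypothetical User-Agent; replace with real contact details.
        headers={"User-Agent": "malweiber-example/0.1 (hypothetical contact)"},
    )
    with urlopen(request, timeout=60) as response:
        return json.load(response)["results"]["bindings"]
```

[`run_query(QUERY)` would return a list of result rows; it is left uncalled here because it needs network access.]
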
  • 12:37 - 12:40
    And we need to ensure high-quality data.
  • 12:40 - 12:42
    That’s the beginning
    of this whole process.
  • 12:42 - 12:47
    Just for the 250 artworks that I uploaded,
    the data took so much work
  • 12:47 - 12:51
    to try and massage it into a useful form
    that could be really great,
  • 12:51 - 12:54
    to ensure data quality
    and that would be the basic things,
  • 12:54 - 12:58
    the uniform measurement system
    and controlled vocabularies
  • 12:58 - 12:59
    as much as possible.
  • 12:59 - 13:02
    And all of this compliant
    with relevant metadata standards.
  • 13:02 - 13:05
    We always bang the drum for standards,
    because the moment you use standards,
  • 13:05 - 13:09
    you increase the ability
    to connect your data
  • 13:09 - 13:11
    to other forms of data out there.
  • 13:12 - 13:14
    And then to avoid some of the issues,
  • 13:14 - 13:16
    because I can understand
    that Wikidata also says,
  • 13:16 - 13:18
    "Look, we're not your data repository.
  • 13:18 - 13:20
    You can't just come here
    and dump your data."
  • 13:20 - 13:24
    And the institutions, on the other hand,
    say, "Look, this is our curated data.
  • 13:24 - 13:26
    This is stuff we spent years researching
  • 13:26 - 13:29
    and spent many employee hours working on.
  • 13:29 - 13:31
    We don't just want
    to upload that to Wikidata
  • 13:31 - 13:33
    and have it overwritten
    or deleted at some point."
  • 13:33 - 13:36
    So, I really like the idea of Wikibase.
  • 13:36 - 13:40
    The idea that an institution could run
    its own Wikidata instance,
  • 13:40 - 13:43
    where we really need to have a look
    and see if we can assist...
  • 13:43 - 13:44
    Where's the pointy bit?
  • 13:47 - 13:48
    ...is in this bit here.
  • 13:48 - 13:49
    So, if you have an institution
  • 13:49 - 13:51
    that has its own collection
    management system,
  • 13:52 - 13:55
    we need to find a way, write scripts,
    and support it as much as possible,
  • 13:55 - 14:00
    to as automatically as possible get data
    into their local Wikibase instance
  • 14:00 - 14:03
    or an instance run
    by another government institution,
  • 14:03 - 14:07
    perhaps like digiS-- but I can't
    make that prediction at this stage--
  • 14:07 - 14:10
    that could then be used by Wikidata
    in terms of a synchronized
  • 14:10 - 14:12
    or a federated query.
  • 14:12 - 14:16
    And all of these things can, of course,
    then use authority controls and so on.
  • 14:16 - 14:20
    But if the institutions can be encouraged
    to set up their own Wikibases.
  • 14:20 - 14:22
    10-15 years ago, very few museums
    had their own websites.
  • 14:22 - 14:24
    I would hope that in five years
  • 14:24 - 14:26
    more institutions
    would have their own Wikibase
  • 14:26 - 14:28
    or be prepared to use a Wikibase.
  • 14:28 - 14:29
    And, of course, it gets interesting
  • 14:29 - 14:32
    when the moment you input it
    into a system like that,
  • 14:33 - 14:35
    issues with the data will arise
  • 14:35 - 14:38
    that you can hopefully feed back
    into the collection management system
  • 14:38 - 14:40
    to improve their data in general.
  • 14:41 - 14:44
    So, in other words, something
    like a DIY Authority Control
  • 14:44 - 14:47
    for the institution
    or even just a controlled vocabulary.
  • 14:48 - 14:51
    If they just had a resource
    within their institution
  • 14:51 - 14:53
    where they could say,
    "We're going to choose this term,
  • 14:53 - 14:55
    we're not going to go through
    a long standardization process.
  • 14:55 - 14:57
    Even if five years down the track
  • 14:57 - 15:00
    they decide it's the wrong term
    and they want to change the term,
  • 15:00 - 15:03
    at least I have a clear idea
    what I'm talking about."
  • 15:03 - 15:04
    It’s a start.
  • 15:05 - 15:07
    But I have to say
    that this suggestion that I made
  • 15:07 - 15:12
    was also met with quite
    a lot of skepticism at this conference,
  • 15:12 - 15:17
    because I think the people in the audience
    have also had bad experiences
  • 15:17 - 15:18
    trying to set up these systems.
  • 15:18 - 15:20
    And there's the saying:
  • 15:20 - 15:23
    "To err is human, but to really screw up,
    you need a computer."
  • 15:23 - 15:25
    So I can imagine if you've got people
  • 15:25 - 15:28
    trying to set up their own Wikibases,
    their own control vocabularies,
  • 15:28 - 15:31
    a lot of things are going to go wrong,
    but it's a learning experience.
  • 15:31 - 15:34
    And we hope that something good
    will come out of it in the end.
  • 15:34 - 15:36
    This is something that was referenced
  • 15:36 - 15:41
    in the Wikibase inspirational
    presentation this morning.
  • 15:42 - 15:45
    The GND is now cooperating with Wikimedia
  • 15:45 - 15:48
    to actually see how Wikibase
    can be used for it,
  • 15:48 - 15:52
    because it's part of the whole GND4C,
    opening the very strictly controlled
  • 15:52 - 15:55
    for libraries resource
    of the Authority Control to now say,
  • 15:55 - 15:58
    "Hey, other GLAMs, that is, galleries,
    libraries, archives, and museums,
  • 15:58 - 16:01
    also want to use this resource."
  • 16:01 - 16:04
    GND4C is trying to see
    how they can actually make that happen.
  • 16:04 - 16:07
    And one of the things
    they're exploring is Wikibase.
  • 16:11 - 16:14
    And just the last point to say,
  • 16:14 - 16:15
    if you're in Berlin
    on the 5th of December,
  • 16:15 - 16:18
    and you'd like to come
    to the digiS Annual Conference,
  • 16:18 - 16:21
    where every year our partners present
    their projects, it's fascinating.
  • 16:21 - 16:24
    We've got a very broad spectrum.
    You're welcome to come along.
  • 16:24 - 16:28
    Details there in German on the URL.
  • 16:28 - 16:29
    Thank you for your attention.
  • 16:30 - 16:32
    (applause)
  • 16:34 - 16:38
    And I'll just point out, if you want,
    there are some interesting readings.
  • 16:38 - 16:40
    This never loads for some reason,
  • 16:40 - 16:43
    but it’s actually an article
    by Linda Nochlin
  • 16:43 - 16:46
    about why there have been no great
    women artists, from 1971.
  • 16:46 - 16:49
    It was republished
    a few years ago, online.
  • 16:49 - 16:51
    The link there will work.
    It's a fascinating article.
  • 16:51 - 16:54
    Invisible Women: Exposing Data Bias
    in a World Designed for Men,
  • 16:54 - 16:55
    also fascinating.
  • 16:55 - 16:57
    And this is a book about Die Malweiber.
  • 16:57 - 16:59
    Right!
  • 16:59 - 17:01
    (person 1) Can you show that...
  • 17:01 - 17:03
    - Mm-hmm. Yup.
    - (person 1) ...[in writing]?
  • 17:04 - 17:05
    Questions?
  • 17:09 - 17:12
    (person 2) So, how did you initially
    find those women?
  • 17:12 - 17:15
    Because they are in a museum
    database, I guess.
  • 17:15 - 17:17
    How did they end up
    in the museum database?
  • 17:17 - 17:19
    [inaudible]
  • 17:19 - 17:22
    So, the museum has artworks
    by these artists,
  • 17:22 - 17:26
    so they basically got them
    in a storage facility somewhere,
  • 17:26 - 17:27
    and they have it
    in their collection system
  • 17:27 - 17:29
    saying this is this painting
    by this and this artist.
  • 17:29 - 17:30
    So they've got that data.
  • 17:30 - 17:32
    (person 2) Then you solve
    a notability problem, right?
  • 17:33 - 17:35
    It would if it was online,
    but it's not online.
  • 17:35 - 17:37
    This collection resource--
  • 17:37 - 17:39
    (person 3) [inaudible]
    been made public [inaudible].
  • 17:39 - 17:40
    Yeah, I can't--
  • 17:40 - 17:42
    (person 2) That's not needed,
  • 17:42 - 17:44
    like if you referenced
    a book, that's also good enough.
  • 17:44 - 17:46
    (person 3) You could reference
    an offline catalog.
  • 17:46 - 17:51
    Yeah, but again it's a question of, well,
    I don't even have an offline catalog.
  • 17:51 - 17:55
    I could maybe say it's the museum
    or whichever collection...
  • 17:55 - 17:57
    (person 2) I was going to say,
    I understand it's a small museum,
  • 17:57 - 17:59
    and maybe they don't have
    a very sophisticated--
  • 17:59 - 18:01
    Well, Stadtmuseum
    is not small, but, yeah. (chuckles)
  • 18:01 - 18:03
    (person 2) Maybe
    they don't have the resources
  • 18:03 - 18:06
    to really put up something
    sophisticated metadata-wise,
  • 18:06 - 18:11
    but you could literally have them
    publish even a press release.
  • 18:12 - 18:16
    Whatever mechanism they have
    of pushing data online that is by them,
  • 18:16 - 18:18
    that is by the state museum,
  • 18:18 - 18:22
    that can say, "Here is a list of women
    who were systematically excluded,
  • 18:22 - 18:25
    and yet, here is what we know about them.
  • 18:25 - 18:27
    And we say this as the state museum."
  • 18:27 - 18:28
    That is a source.
  • 18:28 - 18:30
    This was exactly the problem.
  • 18:30 - 18:33
    So, they did have
    this exhibition that's online,
  • 18:33 - 18:36
    but that's only 20 of, I think,
    50 women artists.
  • 18:36 - 18:38
    (person 2) But they can say that
    about all of them.
  • 18:38 - 18:42
    That's why I think Wikidata
    also relaxed the rules a bit,
  • 18:42 - 18:43
    and let me keep that data in,
  • 18:43 - 18:46
    because I had it all up
    in the Wikidata import page.
  • 18:46 - 18:48
    But as I'm trying to say
    with Ida Maurer-Hahn,
  • 18:48 - 18:50
    I don't have any external reference,
  • 18:50 - 18:52
    and I hope that, through
    just getting the GND IDs,
  • 18:52 - 18:54
    we'll have an opportunity to...
  • 18:54 - 18:57
    (person 2) I just wanted
    to mention, this is Andrew,
  • 18:57 - 19:01
    and his talk was about The Met,
  • 19:01 - 19:03
    the one I mentioned
    to you about this morning.
  • 19:03 - 19:05
    - So you two should talk.
    - Fantastic!
  • 19:05 - 19:08
    (person 2) Because he's also doing
    interesting things with museums.
  • 19:08 - 19:11
    So, I just wanted
    to make sure you connect.
  • 19:12 - 19:15
    And we should publish
    some of the artworks on Wikidata
  • 19:15 - 19:17
    to make the data more complete.
  • 19:17 - 19:19
    Well, I should also point out
    that some of the things we uploaded--
  • 19:19 - 19:22
    So, the Stadtmuseum
    is very focused at the moment
  • 19:22 - 19:25
    on the Humboldt Forum,
    which is being set up in Berlin.
  • 19:25 - 19:26
    Of these artworks that we digitized,
  • 19:26 - 19:29
    some of them have now
    run into rights issues,
  • 19:29 - 19:31
    some of them were taken offline again.
  • 19:32 - 19:33
    The moment we're digitizing this stuff,
  • 19:33 - 19:36
    half our work is really dealing
    with legal issues
  • 19:36 - 19:38
    and trying to explain to the institutions.
  • 19:38 - 19:42
    Because even if an artwork
    is in the public domain,
  • 19:42 - 19:44
    when it's being digitized or photographed,
  • 19:44 - 19:46
    certain rights are then
    ascribed to the person.
  • 19:46 - 19:47
    (person 2) But to be on Wikidata,
  • 19:47 - 19:49
    it doesn't have to be
    in the public domain.
  • 19:49 - 19:53
    No, but it has to have
    a fairly unrestricted license.
  • 19:53 - 19:55
    Not on Wikidata, sorry,
    but if I'm talking about the actual image.
  • 19:55 - 19:57
    - Sorry, I thought you meant the image.
    - (person 2) I was talking about Wikidata.
  • 19:57 - 20:01
    The metadata itself, that's all
    been put online. CC0, no problem.
  • 20:08 - 20:11
    (person 4) Hi! Your sources
    don't have to be online.
  • 20:11 - 20:12
    So, if they have a printed catalog,
  • 20:12 - 20:16
    whether it's an exhibition catalog,
    or a catalog of a whole collection,
  • 20:16 - 20:19
    you can create an item
    about that publication
  • 20:19 - 20:21
    and then use the "stated in" property
  • 20:21 - 20:22
    to cite that.
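
[Editor's sketch, not part of the talk: the suggestion above, citing an offline catalog through the "stated in" property (P248), could be expressed as a statement with a reference in roughly the following shape. This is an assumption-laden illustration that follows the general layout of the Wikibase JSON data model; the helper names are the editor's own, and `CATALOG_QID` is a hypothetical placeholder for an item that would first have to be created for the printed catalog.]

```python
import json

CATALOG_QID = "Q00000000"  # hypothetical placeholder, not a real catalog item


def item_snak(prop: str, qid: str) -> dict:
    """A value snak whose value is another item (for example an occupation)."""
    return {
        "snaktype": "value",
        "property": prop,
        "datavalue": {
            "value": {"entity-type": "item", "id": qid},
            "type": "wikibase-entityid",
        },
    }


def statement_with_source(prop: str, value_qid: str, catalog_qid: str) -> dict:
    """A statement whose single reference says 'stated in' (P248) the catalog."""
    return {
        "mainsnak": item_snak(prop, value_qid),
        "type": "statement",
        "rank": "normal",
        "references": [{"snaks": {"P248": [item_snak("P248", catalog_qid)]}}],
    }


# Example: occupation (P106) = painter (Q1028181), sourced to the catalog item.
claim = statement_with_source("P106", "Q1028181", CATALOG_QID)
print(json.dumps(claim, indent=2))
```

[The sketch only builds the data structure; submitting it to Wikidata would need an authenticated edit, which is outside its scope.]
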
  • 20:22 - 20:24
    That's the problem.
    Not all the artists are in the catalog.
  • 20:24 - 20:25
    They do have a catalog,
  • 20:25 - 20:28
    but it's only 20 of 50,
    I think, women artists.
  • 20:28 - 20:29
    That's the whole issue.
  • 20:29 - 20:34
    They selected out of their whole
    collection of artists,
  • 20:35 - 20:37
    they only selected 20 of those artists.
  • 20:37 - 20:41
    So, there's still no written
    record of them anyway.
  • 20:44 - 20:47
    Like I said, I mean, hopefully,
    if within the next few weeks
  • 20:47 - 20:50
    I can get the GND IDs
    of the 19 missing artists, then...
  • 20:50 - 20:51
    (person 4) And further than that,
  • 20:51 - 20:54
    if you can state
    that a painting is in a collection,
  • 20:54 - 20:56
    and an artist is the creator
    of that painting,
  • 20:56 - 20:59
    or indeed the subject
    of the self-portrait painting,
  • 20:59 - 21:00
    then I think that satisfies
    the requirements
  • 21:00 - 21:02
    for inclusion in Wikidata.
  • 21:02 - 21:04
    It seems to so far.
    No one has deleted them yet.
  • 21:04 - 21:05
    Because that was that problem too.
  • 21:05 - 21:07
    The reason why those things got deleted
  • 21:07 - 21:10
    is there was a gap of a few months
    between when the artists were uploaded
  • 21:10 - 21:12
    and when I could complete
    uploading the paintings.
  • 21:12 - 21:14
    But, yeah, since all the paintings
    have been linked now too.
  • 21:14 - 21:17
    But I don't have any of those
    like Ida Maurer-Hahn.
  • 21:17 - 21:20
    They're just statements at the moment
    but without any references.
  • 21:22 - 21:24
    Hopefully, that will change soon.
  • 21:24 - 21:27
    (person 5) Hi. Thank you for your talk.
  • 21:27 - 21:31
    I have a question concerning
    the GLAM institution you mentioned.
  • 21:31 - 21:35
    What do you think
    are their learnings on this project,
  • 21:35 - 21:39
    because [inaudible]
    also have a lot of GLAM work,
  • 21:39 - 21:43
    and then for me it's interesting to see
    what the institution is thinking
  • 21:43 - 21:47
    about a collaboration
    with Wikidata for future work.
  • 21:47 - 21:50
    I mean, you sound a bit disappointed.
  • 21:50 - 21:52
    This is exactly the part where I need...
  • 21:52 - 21:55
    I need to be able to show them
    what they can do with the data online,
  • 21:55 - 21:58
    because at the moment
    it's a battle for us
  • 21:58 - 22:00
    even just to get them to deliver
    structured metadata.
  • 22:00 - 22:01
    So we work a lot with LIDO.
  • 22:01 - 22:04
    That will be passed on
    to the Deutsche Digitale Bibliothek.
  • 22:05 - 22:08
    It's a huge battle to get any data
    into LIDO, for a start,
  • 22:08 - 22:11
    and then we upload it to the DDB,
    which has its own issues.
  • 22:11 - 22:14
    Then with Wikidata,
    I've sort of tried to show--
  • 22:14 - 22:17
    We actually had Jason Evans two years ago
    give a presentation to say,
  • 22:17 - 22:20
    "This is what you can do.
    This is what we've done in Wales."
  • 22:20 - 22:23
    But every city has got
    its own speed and tempo.
  • 22:23 - 22:26
    Hopefully, out of this Stadtmuseum
    with their data,
  • 22:26 - 22:28
    we can start doing
    some interesting things,
  • 22:28 - 22:29
    find some applications
    and other institutions
  • 22:29 - 22:31
    that can maybe upload their data.
  • 22:31 - 22:33
    But at the moment,
    the museums, the GLAMs--
  • 22:33 - 22:34
    it's just not on their radar,
  • 22:34 - 22:38
    because they're so occupied
    with fundamental existential issues
  • 22:38 - 22:40
    that, you know, linked data
    is for them like,
  • 22:40 - 22:43
    "What the hell are you
    wasting my time with that for?
  • 22:43 - 22:45
    I'm going to make sure
    I can pay my employees,
  • 22:45 - 22:48
    of which there's only one, for next year."
  • 22:48 - 22:49
    But we try.
  • 22:49 - 22:51
    That's our role as we see it too.
  • 22:51 - 22:53
    This whole funding program
    is supposed to kind of...
  • 22:53 - 22:56
    I mean, the other major thing
    is long-term digital preservation.
  • 22:56 - 22:57
    We said, "Look,
    you can digitize this stuff,
  • 22:57 - 22:59
    but having it
    on a hard drive is not good.
  • 22:59 - 23:01
    You've got to get it
    into a proper system,
  • 23:01 - 23:04
    where, in 100 years, someone
    can actually access this resource."
  • 23:15 - 23:18
    (person 5) I feel your pain
    about the relevance
  • 23:18 - 23:21
    because I ran into a lot
    of these same issues on Wikipedia,
  • 23:21 - 23:25
    especially the German language Wikipedia
    has these really strong relevance criteria
  • 23:25 - 23:26
    that are absolutely observed
  • 23:26 - 23:30
    and tend to exclude
    a lot of important information.
  • 23:33 - 23:36
    Would it help to change
    the relevance criteria?
  • 23:36 - 23:38
    Would that at all be possible?
  • 23:38 - 23:41
    Because I'm afraid, of course,
    that people will be afraid
  • 23:41 - 23:44
    that then "unnotable"--
    whatever that means-- things
  • 23:44 - 23:46
    would manage to get in.
  • 23:47 - 23:48
    Well, this is it, I mean...
  • 23:50 - 23:51
    (sighs)
  • 23:51 - 23:53
    I think the moment someone makes
    an effort to upload something,
  • 23:53 - 23:55
    it becomes notable to a degree,
  • 23:55 - 23:59
    and I think the criteria at this stage,
    though, are actually quite flexible,
  • 23:59 - 24:04
    but I do think you run into that issue
    of there are notability criteria...
  • 24:04 - 24:06
    There is a process
    that has been put in place,
  • 24:06 - 24:08
    but it still depends on
    what an admin does.
  • 24:08 - 24:11
    At that time, when I got in touch
    with that admin,
  • 24:11 - 24:14
    he or she had deleted
    150,000 different items.
  • 24:14 - 24:18
    So, there is a certain degree
    of arbitrariness in the whole process,
  • 24:18 - 24:20
    because at least I could
    convince him to put it back on,
  • 24:20 - 24:22
    but I know from the Wikidata mailing list
  • 24:22 - 24:24
    there's quite a few people
    who ran into the same issue
  • 24:24 - 24:25
    where stuff has been deleted.
  • 24:25 - 24:29
    So, I think the solution is to go
    the route, as I understand,
  • 24:29 - 24:32
    that Wikidata is proposing,
    of Wikibase installations.
  • 24:32 - 24:33
    They're making it a lot easier.
  • 24:33 - 24:35
    The Wikibase Docker
    is a lot easier to install.
  • 24:35 - 24:38
    I tried to install it about
    a year ago and struggled,
  • 24:38 - 24:40
    and I managed to get
    an instance up and running
  • 24:40 - 24:44
    and actually usable very easily
    with the Docker container.
  • 24:44 - 24:45
    And I think that's the way to go.
  • 24:45 - 24:48
    If you can get the institutions to see
    how easy it is to set it up.
  • 24:48 - 24:51
    15 years ago, they didn't have websites.
    Now they have websites.
  • 24:51 - 24:52
    And I would hope for Berlin, at least,
  • 24:52 - 24:54
    that we can maybe get
    a Wikibase instance going
  • 24:54 - 24:56
    and get a few museums
    to upload their curated data,
  • 24:56 - 24:58
    and then see how
    we can feed that into Wikidata.
  • 24:58 - 25:02
    And then I think everyone's happy,
    I would hope, but yeah...
  • 25:03 - 25:07
    (person 6) I'm going to ask a question
    you're unlikely to be able to answer.
  • 25:07 - 25:09
    But I'm going to ask it anyway.
  • 25:09 - 25:13
    I'm curious about the women
    whose artworks haven't been collected
  • 25:13 - 25:15
    by the institutions
  • 25:15 - 25:16
    because it sounds to me
  • 25:16 - 25:20
    like there's a large group
    of women artists,
  • 25:20 - 25:24
    some of whom won't have
    had their paintings collected
  • 25:24 - 25:27
    and don't appear to be
    in the record anywhere.
  • 25:27 - 25:32
    As a result, is there any movement
    or any research being undertaken,
  • 25:32 - 25:35
    or anything you're aware of
    to try and find these women
  • 25:35 - 25:37
    and get them into the record?
  • 25:37 - 25:39
    It's like an extra step.
  • 25:39 - 25:41
    No, I think, this is the thing.
  • 25:41 - 25:44
    We can only start with the museums,
    with what they've got, basically.
  • 25:44 - 25:48
    The Stadtmuseum gets a lot of donations
    as well, from people around,
  • 25:48 - 25:50
    and then if maybe
    something turns up, they go,
  • 25:50 - 25:52
    "Well, we don't know who this artist is,
    and we can try to [track it back]."
  • 25:52 - 25:55
    That's the only serendipitous way
    it's going to happen.
  • 25:55 - 25:58
    But I think Berlin and Germany
    with its history
  • 25:58 - 26:01
    has lost so much artwork through bombing
    and destruction, whatever, too,
  • 26:01 - 26:06
    that I think there's a lot
    that has been lost permanently.
  • 26:06 - 26:08
    (person 5) Okay. Thank you.
  • 26:17 - 26:18
    (person 6) There's a group--
  • 26:18 - 26:20
    I can't remember the name
    for the life of me,
  • 26:20 - 26:26
    but they actually do a lot of work
    with Italian artists.
  • 26:26 - 26:29
    There's also work done in the UK
  • 26:29 - 26:34
    to kind of bring women artists
    back to the front.
  • 26:36 - 26:41
    And so, I think all these
    local initiatives
  • 26:41 - 26:44
    might need to start working together
  • 26:44 - 26:49
    to do something global,
    to push more for it.
  • 26:49 - 26:51
    Maybe that would do something more.
  • 27:05 - 27:12
    (person 7) I wanted to inquire
    if there's a right direction to go
  • 27:12 - 27:15
    to animate institutions
  • 27:15 - 27:19
    to take part in making their data linked,
  • 27:19 - 27:22
    or open, in some kind of way available.
  • 27:22 - 27:25
    Should there be
    other institutions approaching--
  • 27:25 - 27:27
    Jealousy, it has to be jealousy.
  • 27:27 - 27:29
    That's the only way you're going to do it.
  • 27:29 - 27:30
    (person 7) I see. (chuckles)
  • 27:30 - 27:33
    You need to get one institution
    leading the way and people saying,
  • 27:33 - 27:36
    "Wow! Look what they can do
    with their data! We want that too!"
  • 27:36 - 27:39
    I think that's the way web servers
    got going really,
  • 27:39 - 27:41
    and I think that's the only way
    linked data is going to work
  • 27:41 - 27:43
    is to make them jealous.
  • 27:43 - 27:46
    (person 7) So, you think
    that the institutions
  • 27:46 - 27:51
    that are not having this great
    linked data Wikibase instance stuff
  • 27:51 - 27:53
    are going to say, "Yeah, we need that!"
  • 27:53 - 27:56
    and then are trying to find some people
    that can set it up for them.
  • 27:58 - 28:01
    Or should they approach other institutions
    that can help them with it?
  • 28:01 - 28:04
    Or which is the right direction
    for them to go?
  • 28:05 - 28:09
    For a start, you've got to realize
    that data is a process.
  • 28:09 - 28:10
    It's never finished.
  • 28:10 - 28:12
    It's always something you're working on,
  • 28:12 - 28:15
    and we would encourage any institution
    to take the first step,
  • 28:15 - 28:17
    but also to get cooperation.
  • 28:17 - 28:21
    So the Stadtmuseum Berlin
    has helped a lot of smaller museums
  • 28:21 - 28:24
    because then they say,
    "Look, we've got this stuff,
  • 28:24 - 28:25
    we need to do something with it."
  • 28:25 - 28:28
    And then they'll share it with
    the Stadtmuseum, so to speak,
  • 28:28 - 28:29
    and help it get online.
  • 28:29 - 28:30
    But we really have institutions
  • 28:30 - 28:33
    where there's just two working on it
    in there, part-time.
  • 28:33 - 28:36
    It's going to take a while
    before they get on that trajectory,
  • 28:36 - 28:39
    but I would hope that
    as the processes become better,
  • 28:39 - 28:41
    as Wikibase gets easier to use,
  • 28:41 - 28:44
    and as people get to see
    what the advantages are of it.
  • 28:44 - 28:46
    But I really think it's a process
  • 28:46 - 28:47
    that you're always going
    to have to support,
  • 28:47 - 28:49
    and it's something you're always going
    to have to make clear
  • 28:49 - 28:50
    to the funding institutions,
  • 28:50 - 28:53
    that it's not something they can just
    throw some money at, and then it's done.
  • 28:53 - 28:56
    It's going to need permanent funding.
    Libraries need permanent funding.
  • 28:56 - 28:59
    Anything to do with digital resources
    also needs permanent funding, I think.
  • 29:05 - 29:08
    (moderator) Sorry, I think we have
    just run out of time for questions.
  • 29:08 - 29:11
    I encourage you to keep talking
    after the session,
  • 29:11 - 29:14
    and thank you so much to our speaker
    and all the attendees.
  • 29:14 - 29:16
    Thank you.
  • 29:16 - 29:18
    (applause)
Title:
cdn.media.ccc.de/.../wikidatacon2019-1068-eng-Reinstating_Female_Artists_to_the_Cultural_Record_via_Wikidata_hd.mp4
Video Language:
English
Duration:
29:27
