
cdn.media.ccc.de/.../wikidatacon2019-8-eng-Libraries_panel_hd.mp4

  • 0:07 - 0:12
    I work as a teacher
    at the University of Alicante,
  • 0:12 - 0:17
    where I recently obtained my PhD
    on data libraries and linked open data.
  • 0:17 - 0:19
    And I'm also a software developer
  • 0:19 - 0:22
    at the Biblioteca Virtual
    Miguel de Cervantes.
  • 0:22 - 0:24
    And today, I'm going to talk
    about data quality.
  • 0:28 - 0:32
    Well, those are my colleagues
    at the university.
  • 0:32 - 0:37
    And as you may know, many organizations
    are publishing their data
  • 0:37 - 0:38
    as linked open data--
  • 0:38 - 0:41
    for example,
    the National Library of France,
  • 0:41 - 0:46
    the National Library of Spain,
    us, which is Cervantes Virtual,
  • 0:46 - 0:49
    the British National Bibliography,
  • 0:49 - 0:52
    the Library of Congress and Europeana.
  • 0:52 - 0:56
    All of them provide a SPARQL endpoint,
  • 0:56 - 0:59
    which is useful in order
    to retrieve the data.
  • 0:59 - 1:01
    And if I'm not wrong,
  • 1:01 - 1:06
    the Library of Congress only provides
    the data as a dump that you can't use.
  • 1:08 - 1:14
    When we published our repository
    as linked open data,
  • 1:14 - 1:17
    my idea was that it would be reused
    by other institutions.
  • 1:18 - 1:24
    But what if I'm an institution
    that wants to enrich its data
  • 1:24 - 1:27
    with data from other data libraries?
  • 1:28 - 1:31
    Which data set should I use?
  • 1:31 - 1:34
    Which data set is better
    in terms of quality?
  • 1:37 - 1:41
    The benefits of the evaluation
    of data quality in libraries are many.
  • 1:41 - 1:47
    For example, methodologies can be improved
    to include new criteria
  • 1:47 - 1:49
    for assessing the quality.
  • 1:49 - 1:55
    And also, organizations can benefit
    from best practices and guidelines
  • 1:55 - 1:58
    in order to publish their data
    as linked open data.
  • 2:00 - 2:03
    What do we need
    in order to assess the quality?
  • 2:03 - 2:07
    Well, obviously, a set of candidates
    and a set of features.
  • 2:07 - 2:10
    For example, do they have
    a SPARQL endpoint,
  • 2:10 - 2:13
    do they have a web interface,
    how many publications do they have,
  • 2:13 - 2:18
    how many vocabularies do they use,
    how many Wikidata properties do they have,
  • 2:18 - 2:21
    and where can I get those candidates?
  • 2:21 - 2:22
    I used the LOD Cloud--
  • 2:22 - 2:27
    but when I was doing this slide,
    I thought about using Wikidata
  • 2:28 - 2:30
    in order to retrieve those candidates.
  • 2:30 - 2:34
    For example, getting entities
    of type data library,
  • 2:34 - 2:36
    which have a SPARQL endpoint.
  • 2:36 - 2:39
    You have here the link.
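    A minimal SPARQL sketch of such a candidate query (not the exact query from
    the talk; the endpoint property P5305 and the "library" class filter are
    assumptions to adjust):

        SELECT ?library ?libraryLabel ?endpoint WHERE {
          ?library wdt:P5305 ?endpoint ;   # assumed: P5305 = SPARQL endpoint URL
                   wdt:P31 ?class .        # instance of some class
          ?class rdfs:label ?classLabel .
          # keep only candidates whose class label mentions "library"
          FILTER(LANG(?classLabel) = "en" && CONTAINS(LCASE(?classLabel), "library"))
          SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
        }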
  • 2:41 - 2:45
    And I came up with these data libraries.
  • 2:45 - 2:50
    The first one uses the Bibliographic Ontology
    as its main vocabulary,
  • 2:50 - 2:54
    and the others are based,
    more or less, on FRBR,
  • 2:54 - 2:57
    which is a vocabulary published by IFLA.
  • 2:57 - 3:00
    And this is just an example
    of how we could compare
  • 3:00 - 3:04
    data libraries using
    bubble charts on Wikidata.
  • 3:04 - 3:09
    And this is just an example comparing
    how many Wikidata properties
  • 3:09 - 3:11
    are per data library.
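    A hedged sketch of that kind of comparison; the identifier properties below
    (BnF ID P268, Library of Congress authority ID P244, BNE ID P950) are only
    examples standing in for the ones compared on the slide:

        #defaultView:BubbleChart
        SELECT ?property ?propertyLabel (COUNT(?item) AS ?count) WHERE {
          VALUES ?directProp { wdt:P268 wdt:P244 wdt:P950 }
          ?item ?directProp ?id .                       # items carrying the identifier
          ?property wikibase:directClaim ?directProp .  # map the wdt: predicate back to the property entity
          SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
        }
        GROUP BY ?property ?propertyLabel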
  • 3:13 - 3:16
    Well, how can we measure quality?
  • 3:16 - 3:18
    There are different methodologies,
  • 3:18 - 3:20
    for example, FRBR 1,
  • 3:20 - 3:24
    which provides a set of criteria
    grouped by dimensions,
  • 3:24 - 3:28
    and those in green
    are the ones that I found--
  • 3:28 - 3:31
    that I could assess by means of Wikidata.
  • 3:34 - 3:39
    And we also found that we
    could define new criteria,
  • 3:39 - 3:45
    for example, a new one to evaluate
    the number of duplications in Wikidata.
  • 3:45 - 3:47
    We use those properties.
  • 3:47 - 3:50
    And this is an example of SPARQL,
  • 3:50 - 3:54
    in order to count the number
    of duplicates for a property.
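    A hedged sketch of such a duplicate check, using the BnF ID (P268) as an
    example identifier property; the properties actually checked in the study
    may differ:

        SELECT ?id (COUNT(?item) AS ?items) WHERE {
          ?item wdt:P268 ?id .          # P268 = Bibliothèque nationale de France ID
        }
        GROUP BY ?id
        HAVING (COUNT(?item) > 1)       # identifier values attached to more than one item
        ORDER BY DESC(?items)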
  • 3:57 - 4:00
    And about the results--
    well, at the moment of doing this study,
  • 4:00 - 4:05
    not the slides, there was no property
    for the British National Bibliography.
  • 4:06 - 4:08
    They don't provide provenance information,
  • 4:08 - 4:12
    which could be useful
    for metadata enrichment.
  • 4:12 - 4:15
    And they don't allow
    editing the information.
  • 4:15 - 4:17
    So, we've been talking
    about Wikibase the whole weekend,
  • 4:17 - 4:21
    and maybe we should try to adopt
    Wikibase as an interface.
  • 4:23 - 4:25
    And they are focused on their own content,
  • 4:25 - 4:29
    and this is just the SPARQL query
    based on Wikidata
  • 4:29 - 4:31
    in order to assess the population.
  • 4:32 - 4:36
    And the BnF provides labels
    in multiple languages,
  • 4:36 - 4:39
    and they all use self-describing URIs,
  • 4:39 - 4:43
    which means that in the URI,
    they have the type of entity,
  • 4:43 - 4:48
    which allows the human reader
    to understand what they are using.
  • 4:51 - 4:55
    And more results: they provide
    different output formats,
  • 4:55 - 4:59
    they use external vocabularies.
  • 4:59 - 5:01
    Only the British National Bibliography
  • 5:01 - 5:04
    provides machine-readable
    licensing information.
  • 5:04 - 5:09
    And up to one-third of the instances
    are connected to external repositories,
  • 5:09 - 5:11
    which is really nice.
  • 5:13 - 5:18
    And this study, this work,
    has been done in our Labs team--
  • 5:18 - 5:22
    a lab in a GLAM is a group of people
  • 5:22 - 5:28
    who want to explore new ways
  • 5:28 - 5:30
    of reusing data collections.
  • 5:31 - 5:35
    And there's a community
    led by the British Library,
  • 5:35 - 5:37
    and in particular, Mahendra Mahey,
  • 5:37 - 5:41
    and we had a first event in London,
  • 5:41 - 5:43
    and another one in Copenhagen,
  • 5:43 - 5:45
    and we're going to have a new one in May
  • 5:45 - 5:48
    at the Library of Congress in Washington.
  • 5:49 - 5:52
    And we are now 250 people.
  • 5:52 - 5:56
    And I'm so glad that I found
    somebody here at the WikidataCon
  • 5:56 - 5:59
    who has just joined us--
  • 5:59 - 6:01
    Sylvia from [inaudible], Mexico.
  • 6:01 - 6:05
    And I'd like to invite you
    to our community,
  • 6:05 - 6:10
    since you may be part
    of a GLAM institution.
  • 6:11 - 6:13
    So, we can talk later
    if you want to know about this.
  • 6:15 - 6:17
    And this--it's all about people.
  • 6:17 - 6:20
    This is me, people
    from the British Library,
  • 6:20 - 6:25
    Library of Congress, Universities,
    and National Libraries in Europe.
  • 6:25 - 6:28
    And there's a link here
    in case you want to know more.
  • 6:28 - 6:33
    And, well, last month,
    we decided to meet in Doha
  • 6:33 - 6:37
    in order to write a book
    about how to create a lab in a GLAM.
  • 6:39 - 6:43
    And they chose 15 people,
    and I was so lucky to be there.
  • 6:45 - 6:49
    And the book follows
    the Booksprint methodology,
  • 6:49 - 6:52
    which means that nothing
    is prepared beforehand.
  • 6:52 - 6:53
    All is done there in a week.
  • 6:53 - 6:56
    And believe me, it was really hard work
  • 6:56 - 6:59
    to have the whole book
    done in that week.
  • 7:00 - 7:04
    And I'd like to introduce you to the book,
    which will be published--
  • 7:04 - 7:06
    it was supposed to be published this week,
  • 7:06 - 7:08
    but it will be next week.
  • 7:09 - 7:13
    And it will be published openly,
    so you can have it,
  • 7:13 - 7:16
    and I can show you
    a little bit later if you want.
  • 7:16 - 7:18
    And those are the authors.
  • 7:18 - 7:20
    I'm here-- I'm so happy, too.
  • 7:20 - 7:22
    And those are the institutions--
  • 7:22 - 7:27
    Library of Congress, British Library--
    and this is the title.
  • 7:27 - 7:30
    And now, I'd like to show you--
  • 7:31 - 7:34
    a map that I'm doing.
  • 7:34 - 7:37
    We are launching a website
    for our community,
  • 7:37 - 7:43
    and I'm in charge of creating a map
    with our institutions there.
  • 7:43 - 7:45
    This is not finished.
  • 7:45 - 7:50
    But this is just SPARQL, and below,
  • 7:52 - 7:53
    we see the map.
  • 7:53 - 7:58
    And we see here
    the new people that I found, here,
  • 7:58 - 8:00
    at the WikidataCon--
    I'm so happy for this.
  • 8:01 - 8:06
    And here we have the data library
    of my university,
  • 8:06 - 8:08
    and many other institutions.
  • 8:09 - 8:11
    Also, from Australia--
  • 8:12 - 8:13
    if I can do it.
  • 8:14 - 8:16
    Well, here, we have some links.
  • 8:20 - 8:21
    There you go.
  • 8:21 - 8:23
    Okay, this is not finished.
  • 8:24 - 8:26
    We are still working on this,
    and that's all.
  • 8:26 - 8:28
    Thank you very much for your attention.
  • 8:29 - 8:34
    (applause)
  • 8:42 - 8:48
    [inaudible]
  • 8:59 - 9:01
    Good morning, everybody.
  • 9:01 - 9:02
    I'm Olaf Janssen.
  • 9:02 - 9:04
    I'm the Wikimedia coordinator
  • 9:04 - 9:06
    at the National Library
    of the Netherlands.
  • 9:06 - 9:08
    And I would like to share my work,
  • 9:08 - 9:12
    which I'm doing about creating
    Linked Open Data
  • 9:12 - 9:15
    for Dutch Public Libraries using Wikidata.
  • 9:18 - 9:21
    And my story starts roughly a year ago
  • 9:21 - 9:25
    when I was at the GLAM Wiki conference
    in Tel Aviv, in Israel.
  • 9:25 - 9:28
    And there are two men
    with very similar shirts,
  • 9:28 - 9:31
    and equally similar hairdos, [Matt]...
  • 9:31 - 9:33
    (laughter)
  • 9:33 - 9:35
    And on the left, that's me.
  • 9:35 - 9:39
    And a year ago, I didn't have
    any practical knowledge and skills
  • 9:39 - 9:40
    about Wikidata.
  • 9:40 - 9:43
    I looked at Wikidata,
    and I looked at the items,
  • 9:43 - 9:45
    and I played with it.
  • 9:45 - 9:47
    But I wasn't able to make a SPARQL query
  • 9:47 - 9:50
    or to do data modeling
    with the right shape expression.
  • 9:51 - 9:53
    That's a year ago.
  • 9:53 - 9:57
    And on the lefthand side,
    that's Simon Cobb, user: Sic19.
  • 9:57 - 10:00
    And I was talking to him,
    because, just before,
  • 10:01 - 10:02
    he had given a presentation
  • 10:02 - 10:06
    about improving the coverage
    of public libraries in Wikidata.
  • 10:07 - 10:09
    And I was very inspired by his talk.
  • 10:10 - 10:13
    And basically, he was talking
    about adding basic data
  • 10:13 - 10:15
    about public libraries.
  • 10:15 - 10:19
    So, the name of the library, if available,
    the photo of the building,
  • 10:19 - 10:21
    the address data of the library,
  • 10:21 - 10:25
    the geo-coordinates
    latitude and longitude,
  • 10:25 - 10:26
    and some other things,
  • 10:26 - 10:29
    all with source references.
  • 10:31 - 10:35
    And what I was very impressed
    about a year ago was this map.
  • 10:35 - 10:37
    This is a map about
    public libraries in the U.K.
  • 10:37 - 10:39
    with all the colors.
  • 10:39 - 10:43
    And you can see that all the libraries
    are layered by library organizations.
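    A sketch of that kind of layered map query; the class Q28564 ("public
    library") and the operator property P137 are assumptions about how the
    organizations are attached:

        #defaultView:Map
        SELECT ?library ?libraryLabel ?coords ?layer ?layerLabel WHERE {
          ?library wdt:P31/wdt:P279* wd:Q28564 ;      # instance of (a subclass of) public library
                   wdt:P17 wd:Q145 ;                  # country: United Kingdom
                   wdt:P625 ?coords .                 # coordinate location for the map
          OPTIONAL { ?library wdt:P137 ?operator . }  # operator = library organization
          BIND(COALESCE(?operator, ?library) AS ?layer)  # one colour layer per organization
          SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
        }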
  • 10:43 - 10:46
    And when he showed this,
    I was really, "Wow, that's cool."
  • 10:47 - 10:49
    So, then, one minute later, I thought,
  • 10:49 - 10:53
    "Well, let's do it
    for my own country, then."
  • 10:53 - 10:55
    (laughter)
  • 10:57 - 10:59
    And something about public libraries
    in the Netherlands--
  • 10:59 - 11:03
    there are about 1,300 library
    branches in our country,
  • 11:03 - 11:07
    grouped into 160 library organizations.
  • 11:08 - 11:11
    And you might wonder why
    do I want to do this project?
  • 11:11 - 11:14
    Well, first of all,
    for the common good, for society,
  • 11:14 - 11:17
    because I think using Wikidata,
  • 11:17 - 11:21
    and from there,
    creating Wikipedia articles,
  • 11:21 - 11:23
    and opening it up
    via the linked open data cloud--
  • 11:23 - 11:29
    it's improving visibility and reusability
    of public libraries in the Netherlands.
  • 11:30 - 11:32
    And my second goal was actually
    a more personal one,
  • 11:32 - 11:37
    because a year ago, I had this
    yearly evaluation with my manager,
  • 11:37 - 11:42
    and we decided it was a good idea
    that I got more practical skills
  • 11:42 - 11:46
    on linked open data, data modeling,
    and also on Wikidata.
  • 11:46 - 11:50
    And of course, I wanted to be able to make
    these kinds of maps myself.
  • 11:50 - 11:51
    (laughter)
  • 11:54 - 11:57
    Then you might wonder
    why do I want to do this?
  • 11:57 - 12:02
    Isn't there already enough basic
    library data out there in the Netherlands
  • 12:02 - 12:04
    to have a good coverage?
  • 12:06 - 12:08
    So, let me show you some of the websites
  • 12:08 - 12:13
    that are available to discover
    address and location information
  • 12:13 - 12:15
    about Dutch public libraries.
  • 12:15 - 12:18
    And the first one is this one--
    Gidsvoornederland.nl--
  • 12:18 - 12:21
    and that's the official
    public library inventory
  • 12:21 - 12:23
    maintained by my library,
    the National Library.
  • 12:24 - 12:29
    And you can look up addresses
    and geo-coordinates on that website.
  • 12:30 - 12:33
    Then there is this site,
    Bibliotheekinzicht--
  • 12:33 - 12:37
    this is also an official website
    maintained by my National Library.
  • 12:37 - 12:39
    And this is about
    public library statistics.
  • 12:41 - 12:44
    Then there is another one,
    debibliotheken.nl--
  • 12:44 - 12:46
    as you can see there is also
    address information
  • 12:46 - 12:50
    about library organizations,
    not about individual branches.
  • 12:52 - 12:55
    And there's even this one,
    which also has address information.
  • 12:57 - 12:59
    And of course, there's something
    like Google Maps,
  • 12:59 - 13:02
    which also has all the names
    and the locations and the addresses.
  • 13:03 - 13:06
    And this one, the International
    Library of Technology,
  • 13:06 - 13:10
    which has a worldwide
    inventory of libraries,
  • 13:10 - 13:11
    including the Netherlands.
  • 13:13 - 13:15
    And I even discovered there is a data set
  • 13:15 - 13:18
    you can buy for 50 euros or so
    to download it.
  • 13:18 - 13:21
    And there is also--I didn't download it,
  • 13:21 - 13:24
    but there seems to be address
    information available.
  • 13:24 - 13:30
    You might wonder: is this kind of data
    good enough for the purposes I had?
  • 13:32 - 13:37
    So, this is my birthday list
    for my ideal public library data list.
  • 13:37 - 13:39
    And what's on my list?
  • 13:39 - 13:44
    First of all, the data I want to have
    must be up-to-date-ish--
  • 13:44 - 13:46
    it must be fairly up-to-date.
  • 13:46 - 13:49
    So, it doesn't have to be real time,
  • 13:49 - 13:51
    but let's say, a couple
    of months, or half a year,
  • 13:53 - 13:57
    behind the official publication--
    that's okay for my purposes.
  • 13:58 - 14:01
    And I want to have both
    the library branches
  • 14:01 - 14:03
    and the library organizations.
  • 14:04 - 14:08
    Then I want my data to be structured,
    because it has to be machine-readable.
  • 14:08 - 14:12
    It has to be in an open file format,
    such as CSV or JSON or RDF.
  • 14:13 - 14:15
    It has to be linked
    to other resources preferably.
  • 14:16 - 14:22
    And the license on the data
    needs to be explicitly public domain or CC0.
  • 14:24 - 14:26
    Then, I would like my data to have an API,
  • 14:27 - 14:31
    which must be public, free,
    and preferably also anonymous
  • 14:31 - 14:35
    so you don't have to use an API key
    or register an account.
  • 14:36 - 14:39
    And I also want to have
    a SPARQL interface.
  • 14:41 - 14:44
    So, now, these are all the sites
    I just showed you.
  • 14:44 - 14:46
    And I'm going to make a big grid.
  • 14:47 - 14:50
    And then, this is about
    the evaluation I did.
  • 14:51 - 14:54
    I'm not going into it,
    but there is no single column
  • 14:54 - 14:56
    which has all green check marks.
  • 14:56 - 14:58
    That's the important thing to take away.
  • 14:59 - 15:04
    And so, in summary, there was no
    public, free linked open data
  • 15:04 - 15:09
    for Dutch public libraries available
    before I started my project.
  • 15:09 - 15:13
    So, this was the ideal motivation
    to actually work on it.
  • 15:15 - 15:17
    So, that's what I've been doing
    for a year now.
  • 15:18 - 15:23
    And I've been adding libraries bit by bit,
    organization by organization to Wikidata.
  • 15:23 - 15:26
    I created also a project website on it.
  • 15:27 - 15:30
    It's still rather messy,
    but it has all the information,
  • 15:30 - 15:33
    and I try to keep it
    as up-to-date as possible.
  • 15:33 - 15:36
    And also all the SPARQL queries
    you can see are linked from here.
  • 15:38 - 15:40
    And I'm just adding
    really basic information.
  • 15:40 - 15:44
    You see the instances,
    images if available,
  • 15:44 - 15:47
    addresses, locations, et cetera,
    municipalities.
  • 15:49 - 15:53
    And where possible, I also try to link
    the libraries to external identifiers.
  • 15:56 - 15:58
    And then, you can really easily--
    as we all know--
  • 15:58 - 16:03
    generate some Listeria lists
    with public libraries grouped
  • 16:03 - 16:05
    by organizations, for instance.
  • 16:05 - 16:08
    Or using SPARQL queries,
    you can also do aggregation on data--
  • 16:08 - 16:11
    let's say, give me all
    the municipalities in the Netherlands
  • 16:11 - 16:15
    and the number of library branches
    in all the municipalities.
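    A sketch of that aggregation; the branch class Q28564 and the use of P131
    for the municipality are assumptions about how the branches are modeled:

        SELECT ?municipality ?municipalityLabel (COUNT(DISTINCT ?branch) AS ?branches) WHERE {
          ?branch wdt:P31/wdt:P279* wd:Q28564 ;   # public library branch
                  wdt:P17 wd:Q55 ;                # country: Netherlands
                  wdt:P131 ?municipality .        # located in the administrative entity
          SERVICE wikibase:label { bd:serviceParam wikibase:language "nl,en". }
        }
        GROUP BY ?municipality ?municipalityLabel
        ORDER BY DESC(?branches)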
  • 16:17 - 16:20
    With one click, you can make
    these kinds of photo galleries.
  • 16:22 - 16:24
    And what I set out to do first,
  • 16:24 - 16:26
    you can really create these kinds of maps.
  • 16:27 - 16:30
    And you might wonder,
    "Are there any libraries here or there?"
  • 16:31 - 16:33
    There are--they are not yet in Wikidata.
  • 16:33 - 16:35
    We're still working on that.
  • 16:35 - 16:38
    And actually, last week,
    I spoke with a volunteer,
  • 16:38 - 16:41
    who's helping now
    with entering the libraries.
  • 16:42 - 16:45
    You can really make cool maps in Wikidata,
  • 16:45 - 16:48
    and also, using
    the Kartographer extension,
  • 16:48 - 16:50
    you can use these kinds of maps.
  • 16:52 - 16:54
    And I even took it one step further.
  • 16:54 - 16:57
    I also have some Python skills,
    and some Leaflet skills--
  • 16:57 - 17:00
    so, I created, and I'm quite
    proud of it, actually.
  • 17:00 - 17:03
    I created this library heat map,
    which is fully interactive.
  • 17:03 - 17:06
    You can zoom in to it,
    and you can see all the libraries,
  • 17:07 - 17:09
    and you can also run it off Wiki.
  • 17:09 - 17:11
    So, you can just embed it
    in your own website,
  • 17:11 - 17:13
    and it fully runs interactively.
  • 17:15 - 17:18
    So, now going back to my big scary table.
  • 17:20 - 17:23
    There is one column
    on the right, which is blank.
  • 17:23 - 17:25
    And no surprise, it will be Wikidata.
  • 17:25 - 17:26
    Let's see how it scores there.
  • 17:26 - 17:30
    (cheering)
  • 17:33 - 17:35
    So, I actually think
    of printing this on a T-shirt.
  • 17:35 - 17:37
    (laughter)
  • 17:38 - 17:40
    So, just to summarize this in words,
  • 17:40 - 17:41
    thanks to my project, now,
  • 17:41 - 17:46
    there is public free linked open data
    available for Dutch public libraries.
  • 17:47 - 17:50
    And who can benefit from my effort?
  • 17:50 - 17:52
    Well, all kinds of parties--
  • 17:52 - 17:54
    you see Wikipedia,
    because you can generate lists
  • 17:54 - 17:56
    and overviews and articles,
  • 17:56 - 18:00
    for instance, using this data
    from Wikidata--
  • 18:00 - 18:02
    for our National Library--
  • 18:03 - 18:05
    IFLA also has an inventory
    of worldwide libraries,
  • 18:05 - 18:07
    they can also reuse the data.
  • 18:08 - 18:09
    And especially for Sandra,
  • 18:10 - 18:13
    it's also important for the Ministry--
    Dutch Ministry of Culture--
  • 18:13 - 18:16
    because Sandra is going
    to have a talk about Wikidata
  • 18:16 - 18:18
    with the Ministry this Monday,
    next Monday.
  • 18:20 - 18:22
    And also, on the righthand side,
    for instance,
  • 18:24 - 18:27
    Amazon with Alexa, the assistant,
  • 18:27 - 18:29
    they're also using Wikidata,
  • 18:29 - 18:31
    so you can imagine that,
  • 18:31 - 18:33
    if you're looking for public
    library information,
  • 18:33 - 18:37
    they can also use Wikidata for that.
  • 18:39 - 18:42
    Because one year ago,
    Simon Cobb inspired me
  • 18:42 - 18:44
    to do this project,
    I would like to call upon you,
  • 18:44 - 18:46
    if you have time available,
  • 18:46 - 18:50
    and if you have data from your own country
    about public libraries,
  • 18:52 - 18:54
    make the coverage better,
    add more red dots,
  • 18:55 - 18:57
    and of course, I'm willing
    to help you with that.
  • 18:57 - 18:59
    And Simon is also willing
    to help with this.
  • 19:00 - 19:01
    And so, I hope next year, somebody else
  • 19:01 - 19:04
    will be at this conference
    or another conference
  • 19:04 - 19:06
    and there will be more
    red dots on the map.
  • 19:08 - 19:09
    Thank you very much.
  • 19:09 - 19:13
    (applause)
  • 19:18 - 19:20
    Thank you, Olaf.
  • 19:20 - 19:24
    Next we have Ursula Oberst
    and Heleen Smits
  • 19:24 - 19:28
    presenting how can a small
    research library benefit from Wikidata:
  • 19:28 - 19:31
    enhancing library products using Wikidata.
  • 19:54 - 19:58
    Okay. Good morning.
    My name is Heleen Smits.
  • 19:59 - 20:02
    And my colleague,
    Ursula Oberst--where are you?
  • 20:02 - 20:04
    (laughter)
  • 20:04 - 20:09
    And I work at the Library
    of the African Studies Center
  • 20:09 - 20:11
    in Leiden, in the Netherlands.
  • 20:11 - 20:15
    And the African Studies Center
    is a center devoted--
  • 20:15 - 20:21
    is an academic institution
    devoted entirely to the study of Africa,
  • 20:21 - 20:24
    focusing on Humanities and Social Studies.
  • 20:25 - 20:28
    We used to be an independent
    research organization,
  • 20:28 - 20:33
    but in 2016, we became part
    of Leiden University,
  • 20:33 - 20:38
    and our catalog was integrated
    into the larger university catalog.
  • 20:39 - 20:44
    Though it remained possible
    to do a search in the African Studies part
  • 20:44 - 20:46
    of the Leiden catalog alone,
  • 20:48 - 20:51
    we remained independent in some respects.
  • 20:51 - 20:53
    For example, with respect
    to our thesaurus.
  • 20:55 - 21:00
    And also with respect
    to the products we make for our users,
  • 21:01 - 21:04
    such as acquisition lists
    and web dossiers.
  • 21:05 - 21:12
    And it is in the field of the web dossiers
  • 21:12 - 21:15
    that we have been looking
  • 21:15 - 21:20
    for possible ways to apply Wikidata,
  • 21:20 - 21:23
    and that's the part where Ursula
    will in the second part of this talk
  • 21:24 - 21:27
    show you a bit
    of what we've been doing there.
  • 21:31 - 21:35
    The web dossiers are our collections
  • 21:35 - 21:39
    of titles from our catalog
    that we compile
  • 21:39 - 21:46
    around a theme usually connected
    to, for example, a conference,
  • 21:46 - 21:51
    or to a special event, and actually,
    the most recent web dossier we made
  • 21:51 - 21:56
    was connected to the Year
    of Indigenous Languages,
  • 21:56 - 22:00
    and that was around proverbs
    in African languages.
  • 22:01 - 22:02
    Our first steps--
  • 22:04 - 22:09
    next slide--our first steps
    on the Wiki path as a library,
  • 22:10 - 22:15
    were in 2013, when we were one
    of 12 GLAM institutions
  • 22:15 - 22:16
    in the Netherlands,
  • 22:16 - 22:21
    part of the project
    of Wikipedians in Residence,
  • 22:21 - 22:26
    and we had for two months,
    a Wikipedian in the house,
  • 22:27 - 22:33
    and he gave us training
    in adding articles to Wikipedia,
  • 22:33 - 22:38
    and also, we made a start with uploading
    photo collections to Commons,
  • 22:39 - 22:43
    which always remained a little bit
    dependent on funding, as well,
  • 22:43 - 22:46
    whether we would be able to digitize them,
  • 22:46 - 22:50
    and mostly on having
    a student assistant to do this.
  • 22:51 - 22:55
    But it was actually a great addition
    to what we could offer
  • 22:55 - 22:58
    as an academic library.
  • 22:59 - 23:05
    In May 2018--so, that is my Ursula,
    my colleague Ursula--
  • 23:05 - 23:09
    she started to really explore--
    dive into Wikidata
  • 23:09 - 23:15
    and see what we, as a small
    and not very experienced library
  • 23:15 - 23:18
    in these fields could do with that.
  • 23:25 - 23:27
    So, I mentioned, we have
    our own thesaurus.
  • 23:28 - 23:31
    And this is where we started.
  • 23:31 - 23:35
    This is a thesaurus of 13,000 terms,
  • 23:35 - 23:38
    all in the field of African studies.
  • 23:38 - 23:41
    It contains a lot of African languages,
  • 23:43 - 23:46
    names of ethnic groups in Africa,
  • 23:48 - 23:49
    and other proper names,
  • 23:49 - 23:56
    which are perhaps especially
    interesting for Wikidata.
  • 23:59 - 24:05
    So, it is a real authority-controlled
  • 24:05 - 24:08
    vocabulary
    with 5,000 preferred terms.
  • 24:09 - 24:11
    So, we submitted the request to Wikidata,
  • 24:11 - 24:17
    and that was actually very quickly
    met with a positive response,
  • 24:17 - 24:19
    which was very encouraging for us.
  • 24:23 - 24:26
    Our thesaurus was loaded into Mix-n-Match,
  • 24:26 - 24:32
    and by now, 75% of the terms
  • 24:32 - 24:36
    have been manually matched with Wikidata.
  • 24:38 - 24:42
    So, it means, well, that we are now--
  • 24:43 - 24:48
    we are added as an identifier--
  • 24:48 - 24:52
    for example, if you click
    on Swahili language,
  • 24:52 - 24:57
    you can see what happens in Wikidata;
    and the number that
  • 24:59 - 25:02
    connects to our term--
    the Wikidata item--
  • 25:03 - 25:06
    we enter into our thesaurus,
  • 25:06 - 25:10
    and from there, you can do a search
    directly in the catalog
  • 25:10 - 25:13
    by clicking the button again.
  • 25:13 - 25:18
    It means, also, that Wikidata
    is not really integrated
  • 25:18 - 25:20
    into our catalog.
  • 25:20 - 25:22
    But that's also more difficult.
  • 25:22 - 25:26
    Okay, we have to give the floor
  • 25:26 - 25:31
    to Ursula for the next part.
  • 25:31 - 25:33
    (Ursula) Thank you very much, Heleen.
  • 25:33 - 25:37
    So, I will talk about our experiences
  • 25:37 - 25:40
    with incorporating Wikidata elements
  • 25:40 - 25:41
    into our web dossiers.
  • 25:41 - 25:45
    A web dossier is--oh, sorry, yeah, sorry.
  • 25:45 - 25:50
    A web dossier, or a classical web dossier,
    consists of three parts:
  • 25:50 - 25:53
    an introduction to the subject,
  • 25:53 - 25:56
    mostly written by one of our researchers;
  • 25:56 - 26:01
    a selection of titles, both books
    and articles from our collection;
  • 26:01 - 26:06
    and the third part, an annotated list
  • 26:06 - 26:09
    with links to electronic resources.
  • 26:09 - 26:16
    And this year, we added a fourth part
    to our web dossiers,
  • 26:16 - 26:18
    which is the Wikidata elements.
  • 26:19 - 26:22
    And it all started last year,
  • 26:22 - 26:25
    and my story is similar
    to the story of Olaf, actually.
  • 26:25 - 26:30
    Last year, when I had no clue
    about Wikidata,
  • 26:30 - 26:33
    I discovered this wonderful
    article by Alex Stinson
  • 26:33 - 26:37
    on how to write a query in Wikidata.
  • 26:37 - 26:42
    And he chose a subject--
    a very appealing subject to me.
  • 26:42 - 26:46
    Namely, "Discovering Women Writers
    from North Africa."
  • 26:46 - 26:51
    I can really recommend this article,
  • 26:51 - 26:53
    because it's very instructive.
  • 26:53 - 26:57
    And I thought I will be--
    I'm going to work on this query,
  • 26:57 - 27:03
    and try to change it to:
    "Southern African Women Writers,"
  • 27:03 - 27:07
    and try to add a link
    to their work in our catalog.
  • 27:07 - 27:11
    And on the right-hand side,
    you see the SPARQL query
  • 27:12 - 27:15
    which searches for
    "Southern African Women Writers."
  • 27:15 - 27:21
    If you click on the button,
    on the blue button on the lefthand side,
  • 27:22 - 27:24
    the search result will appear beneath.
  • 27:24 - 27:26
    The search result can have
    different formats.
  • 27:26 - 27:30
    In my case, the search result is a map.
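    A hedged sketch along the lines of the query described, restricted here to
    South Africa (Q258) for brevity rather than the whole Southern African
    region covered in the web dossier:

        #defaultView:Map
        SELECT ?writer ?writerLabel ?birthplaceLabel ?coords ?image WHERE {
          ?writer wdt:P21 wd:Q6581072 ;    # sex or gender: female
                  wdt:P106 wd:Q36180 ;     # occupation: writer
                  wdt:P19 ?birthplace .    # place of birth
          ?birthplace wdt:P17 wd:Q258 ;    # birthplace in South Africa
                      wdt:P625 ?coords .   # coordinates drive the map view
          OPTIONAL { ?writer wdt:P18 ?image . }
          SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
        }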
  • 27:30 - 27:33
    And the nice thing about Wikidata
  • 27:33 - 27:37
    is that you can embed
    this search result
  • 27:37 - 27:39
    into your own webpage,
  • 27:39 - 27:42
    and that's what we are now doing
    with our web dossiers.
  • 27:42 - 27:47
    So, this was the very first one
    on Southern African women writers,
  • 27:47 - 27:50
    with the classical three elements listed,
  • 27:50 - 27:53
    plus this map on the lefthand side,
  • 27:53 - 27:56
    which gives extra information--
  • 27:56 - 27:58
    a link to the Southern African
    women writer--
  • 27:58 - 28:01
    a link to her works in our catalog,
  • 28:01 - 28:07
    and a link to the Wikidata record
    of her birth place, and her name,
  • 28:08 - 28:13
    her personal record, plus a photo,
    if it's available on Wikidata.
  • 28:16 - 28:20
    And to retrieve a nice map
  • 28:20 - 28:24
    with a lot of red dots
    on the African continent,
  • 28:24 - 28:29
    You need nice data in Wikidata,
    complete, sufficient data.
  • 28:29 - 28:33
    So, with our second web dossier
    on public art in Africa,
  • 28:33 - 28:38
    we also started to enhance
    the data in Wikidata.
  • 28:38 - 28:43
    In this case, for public art,
    we added geo-locations--
  • 28:43 - 28:47
    geo-locations to Wikidata.
  • 28:47 - 28:51
    And we also searched for works
    of public art in Commons,
  • 28:51 - 28:55
    and if they don't have
    a record on Wikidata yet,
  • 28:55 - 29:01
    we added the record to Wikidata.
  • 29:01 - 29:05
    And the third thing we do,
  • 29:05 - 29:10
    because when we prepare a web dossier,
  • 29:10 - 29:16
    we download the titles from our catalog,
  • 29:16 - 29:18
    and the titles are in MARC 21,
  • 29:18 - 29:23
    so we have to convert them to a format
    that is presentable on the website,
  • 29:23 - 29:28
    and it doesn't take much time and effort
    to convert the same set of titles
  • 29:28 - 29:30
    to Wikidata QuickStatements,
  • 29:30 - 29:37
    and then, we also upload
    a title set to Wikidata,
  • 29:37 - 29:41
    and you can see the titles we uploaded
  • 29:41 - 29:44
    from our latest web dossier
  • 29:44 - 29:48
    on African proverbs in Scholia.
  • 29:49 - 29:52
    Scholia is a really nice tool
    that visualizes publications
  • 29:52 - 29:55
    present in Wikidata.
  • 29:55 - 30:00
    And, one second--when it is possible,
    we add a Scholia template
  • 30:00 - 30:02
    to our web dossier's topic.
  • 30:02 - 30:03
    Thank you very much.
  • 30:03 - 30:08
    (applause)
  • 30:09 - 30:12
    Thank you, Heleen and Ursula.
  • 30:12 - 30:17
    Next we have Adrian Pohl
    presenting using Wikidata
  • 30:17 - 30:22
    to improve spatial subject indexing
    and regional bibliography.
  • 30:45 - 30:47
    Okay, hello everybody.
  • 30:47 - 30:50
    I'm going right into the topic.
  • 30:50 - 30:54
    I only have ten minutes to present
    a three-year project.
  • 30:55 - 30:57
    It wasn't full time. (laughs)
  • 30:57 - 31:00
    Okay, what's the NWBib?
  • 31:00 - 31:04
    It's an acronym for the North Rhine-
    Westphalian Bibliography.
  • 31:04 - 31:08
    It's a regional bibliography
    that records literature
  • 31:08 - 31:11
    about people and places
    in North Rhine-Westphalia.
  • 31:13 - 31:14
    And there are monographs in it--
  • 31:15 - 31:19
    there are a lot of articles in it,
    and most of them are quite unique,
  • 31:19 - 31:22
    so, that's the interesting thing
    about this bibliography--
  • 31:22 - 31:25
    because it's often,
    let's say, quite obscure stuff--
  • 31:25 - 31:28
    local people writing
    about that tradition,
  • 31:28 - 31:29
    and something like this.
  • 31:30 - 31:33
    And there's over 400,000 entries in there.
  • 31:33 - 31:38
    And the bibliography started in 1983,
  • 31:38 - 31:43
    and so we only have titles
    from this publication year onwards.
  • 31:45 - 31:49
    If you want to take a look at it,
    it's at nwbib.de,
  • 31:49 - 31:51
    that's the web application.
  • 31:51 - 31:55
    It's based on our service,
    lobid.org, the API.
  • 31:57 - 32:01
    Because it's cataloged as part
    of the hbz union catalog,
  • 32:01 - 32:05
    which comprises around 20 million records,
  • 32:05 - 32:09
    it's an [inaudible] Aleph system
    we get the data out of there,
  • 32:09 - 32:11
    and make RDF out of it,
  • 32:11 - 32:16
    and provide it as JSON
    via the HTTP API.
  • 32:17 - 32:21
    So, the initial status in 2017
  • 32:21 - 32:25
    was we had nearly 9,000 distinct strings
  • 32:25 - 32:29
    about places--referring to places,
    in North Rhine-Westphalia.
  • 32:29 - 32:34
    Mostly, those were administrative areas,
    like towns and districts,
  • 32:34 - 32:38
    but also monasteries, principalities,
    or natural regions.
  • 32:39 - 32:44
    And we already used Wikidata in 2017,
  • 32:44 - 32:48
    and matched those strings
    to Wikidata entries with the Wikidata API
  • 32:48 - 32:52
    quite naively to get
    the geo-coordinates from there,
  • 32:52 - 32:57
    and do some geo-based
    discovery stuff with it.
  • 32:57 - 33:00
    But this had some drawbacks.
  • 33:00 - 33:03
    And so, the matching was really poor,
  • 33:03 - 33:05
    and there were a lot of false positives,
  • 33:05 - 33:09
    and we still had no hierarchy
    in those places,
  • 33:09 - 33:13
    and we still had a lot
    of non-unique names.
  • 33:14 - 33:15
    So, this is an example here.
  • 33:17 - 33:18
    Does this work?
  • 33:18 - 33:22
    Yeah, as you can see,
    for one place, Brauweiler,
  • 33:22 - 33:25
    there are four different strings in there.
  • 33:25 - 33:28
    So, we all know how this happens.
  • 33:28 - 33:32
    If there's no authority file,
    you end up with this data.
  • 33:32 - 33:34
    But we want to improve on that.
  • 33:35 - 33:38
    And you can also see
    why the matching didn't work--
  • 33:38 - 33:40
    so you have this name of the place
  • 33:40 - 33:45
    and there's often the name
    of the superior administrative area,
  • 33:45 - 33:51
    and even on the second level,
    a superior administrative area
  • 33:51 - 33:52
    often in the name
  • 33:52 - 33:59
    to identify the place successfully.
  • 33:59 - 34:05
    So, the goal was to build a full-fledged
    spatial classification based on this data,
  • 34:05 - 34:07
    with a hierarchical view of places,
  • 34:09 - 34:11
    with one entry or ID for each place.
  • 34:12 - 34:17
    And we got this mock-up
    from the NWBib editors in 2016, made in Excel,
  • 34:18 - 34:23
    to get a feeling of what
    they would like to have.
  • 34:25 - 34:28
    There you have the--
    Regierungsbezirk--
  • 34:28 - 34:31
    that's the most superior
    administrative area--
  • 34:31 - 34:35
    we have in there some towns
    or districts--rural districts--
  • 34:35 - 34:40
    and then, it's going down
    to the parts of towns,
  • 34:40 - 34:42
    even to this level.
  • 34:43 - 34:46
    And we chose Wikidata for this task.
  • 34:46 - 34:50
    We also looked at the GND,
    the Integrated Authority File,
  • 34:50 - 34:55
    and GeoNames--but Wikidata
    had the best coverage,
  • 34:55 - 34:57
    and the best infrastructure.
  • 34:58 - 35:02
    The coverage for the places
    and the geo-coordinates we need,
  • 35:02 - 35:05
    and the hierarchical
    information, for example.
  • 35:05 - 35:07
    There were a lot of places,
    also, in the GND,
  • 35:07 - 35:10
    but there was no hierarchical
    information in there.
  • 35:11 - 35:14
    And also, Wikidata provides
    the infrastructure
  • 35:14 - 35:15
    for editing and versioning.
  • 35:15 - 35:20
    And there's also a community
    that helps maintaining the data,
  • 35:20 - 35:22
    which was quite good.
  • 35:23 - 35:27
    Okay, but there was a requirement
    by the NWBib editors.
  • 35:28 - 35:31
    They did not want to directly
    rely on Wikidata,
  • 35:31 - 35:33
    which was understandable.
  • 35:33 - 35:35
    We don't have those servers
    under our control,
  • 35:35 - 35:38
    and we won't know what's going on there.
  • 35:38 - 35:42
    There might be some unwelcome edits
    that destroy the classification,
  • 35:42 - 35:44
    or parts of it, or vandalism.
  • 35:44 - 35:51
    So, we decided to put
    an intermediate SKOS file in between,
  • 35:51 - 35:56
    on which the application would rely--
    and which would be generated from Wikidata.
  • 35:57 - 35:59
    And SKOS is the Simple Knowledge
    Organization System--
  • 35:59 - 36:04
    it's the standard way to model
  • 36:04 - 36:08
    a classification in the linked data world.
  • 36:08 - 36:09
    So, how did we do it? Five steps.
  • 36:09 - 36:14
    I will come to each
    of the steps in more detail.
  • 36:14 - 36:18
    We matched the strings to Wikidata
    with a better approach than before.
  • 36:19 - 36:23
    Created the classification based
    on Wikidata, then
  • 36:23 - 36:26
    added the backlinks
    from Wikidata to NWBib
  • 36:26 - 36:28
    with a custom property.
  • 36:28 - 36:33
    And now, we are in the process
    of establishing a good process
  • 36:33 - 36:37
    for updating the classification
    in Wikidata.
  • 36:37 - 36:39
    Seeing--having a diff
    of the changes,
  • 36:39 - 36:41
    and then publishing it to the SKOS file.
  • 36:43 - 36:45
    I will come to the details.
  • 36:45 - 36:46
    So, the matching approach--
  • 36:46 - 36:48
    as the API wasn't sufficient,
  • 36:48 - 36:54
    and because we have those
    different levels in the strings,
  • 36:54 - 36:59
    we built a custom Elasticsearch
    index for our task.
  • 37:00 - 37:04
    I think by now, you could probably,
    as well, use OpenRefine for doing this,
  • 37:04 - 37:09
    but at that point in time,
    it wasn't available for Wikidata.
  • 37:10 - 37:14
    And we built this index based
    on a SPARQL query,
  • 37:14 - 37:20
    and for entities in NRW,
    and with a specific type.
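    A hedged sketch of the kind of query that could feed such an index: places
    located (transitively) in North Rhine-Westphalia (Q1198), with label,
    aliases, type, coordinates, and the direct superior area; the production
    query was considerably more elaborate:

        SELECT ?place ?placeLabel ?alias ?type ?coords ?broader WHERE {
          ?place wdt:P131+ wd:Q1198 ;       # transitively located in NRW
                 wdt:P31 ?type .            # type information, needed for matching
          OPTIONAL { ?place wdt:P625 ?coords . }      # geo-coordinates
          OPTIONAL { ?place wdt:P131 ?broader . }     # direct superior administrative area
          OPTIONAL { ?place skos:altLabel ?alias . FILTER(LANG(?alias) = "de") }  # aliases
          SERVICE wikibase:label { bd:serviceParam wikibase:language "de". }
        }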
  • 37:20 - 37:25
    And the query evolved over time a lot.
  • 37:25 - 37:29
    And we have a few entries--
    you can see the history on GitHub.
  • 37:30 - 37:32
    So, what we put in the matching index,
  • 37:32 - 37:36
    in the spatial object,
    is what we need in our data.
  • 37:36 - 37:40
    It's the label and the ID
    or the link to Wikidata,
  • 37:40 - 37:44
    the geo-coordinates, and the type
    from Wikidata [inaudible], as well.
  • 37:44 - 37:50
    But also very important for the matching
    are the aliases and the broader entity--
  • 37:50 - 37:54
    and this is also an example where the name
    of the broader entity
  • 37:54 - 37:58
    and the district itself are very similar.
  • 37:58 - 38:03
    So, it's important to have
    some type information, as well,
  • 38:03 - 38:05
    for the matching.
  • 38:05 - 38:08
    So, the nationwide results
    were very good.
  • 38:08 - 38:11
    We could automatically match
    more than 99% of records
  • 38:11 - 38:12
    with this approach.
  • 38:14 - 38:16
    These were only 92% of the strings.
  • 38:17 - 38:18
    So, obviously, the results--
  • 38:18 - 38:21
    those strings that only occurred
    one or two times
  • 38:21 - 38:22
    often didn't appear in Wikidata.
  • 38:22 - 38:26
    And so, we had to do a lot of work
    with those with the [long tail].
  • 38:28 - 38:32
    And for around 1,000 strings,
    the matching was incorrect.
  • 38:32 - 38:35
    But the catalogers did a lot of work
    in the Aleph catalog,
  • 38:35 - 38:40
    but also in Wikidata, they made
    more than 6,000 manual edits to Wikidata
  • 38:40 - 38:45
    to reach 100% coverage by adding
    aliases-type information,
  • 38:45 - 38:47
    creating new entries.
  • 38:47 - 38:49
    Okay, so, I have to speed up.
  • 38:50 - 38:54
    We created the classification based on this,
    on the hierarchical statements.
  • 38:54 - 38:59
    P131 is the main property there.
  • 39:00 - 39:02
    We added the information to our data.
  • 39:03 - 39:07
    So, we now have this
    in our data's spatial object--
  • 39:07 - 39:12
    and the focus--the link to Wikidata--
    and the types are there,
  • 39:13 - 39:18
    and here's the ID
    from the SKOS classification
  • 39:18 - 39:19
    we built based on Wikidata.
  • 39:20 - 39:24
    And you can see there
    are Q identifiers in there.
  • 39:27 - 39:29
    Now, you can basically query our API
  • 39:29 - 39:34
    with such a query using Wikidata URIs,
  • 39:34 - 39:39
    and get literature, in this example,
    about Cologne back.
  • 39:40 - 39:46
    Then we created a Wikidata property
    for NWBib and added those links
  • 39:46 - 39:51
    from Wikidata to the classification--
    batch-loaded them with QuickStatements.
  • 39:52 - 39:54
    And there's also a nice--
  • 39:54 - 39:59
    also a move to using a qualifier
    on this property
  • 39:59 - 40:03
    to add the broader information there.
  • 40:03 - 40:06
    So, I think people won't mess around
    with this as much
  • 40:06 - 40:09
    as with the P131 statements.
  • 40:10 - 40:12
    So, this is what it looks like.
  • 40:13 - 40:16
    This will go to the classification
    where you can then start a query.
  • 40:19 - 40:23
    Now, we have to build this
    update and review process,
  • 40:23 - 40:29
    and we will add those data like this,
  • 40:29 - 40:32
    with a $0 subfield, to Aleph,
  • 40:32 - 40:37
    and the catalogers will start
    using those Wikidata-based IDs,
  • 40:37 - 40:41
    or URIs, for cataloging, for spatial indexing.
  • 40:45 - 40:50
    So, by now, there are more than 400,000
    NWBib entries with links to Wikidata,
  • 40:50 - 40:56
    and more than 4,400 Wikidata entries
    with links to NWBib.
  • 40:57 - 40:58
    Thank you.
  • 40:58 - 41:03
    (applause)
  • 41:08 - 41:10
    Thank you, Adrian.
  • 41:13 - 41:15
    I got it. Thank you.
  • 41:31 - 41:34
    So, as you've seen me before,
    I'm Hilary Thorsen.
  • 41:34 - 41:36
    I'm Wikimedian in residence
  • 41:36 - 41:38
    with the Linked Data
    for Production Project.
  • 41:38 - 41:40
    I am based at Stanford,
  • 41:40 - 41:43
    and I'm here today
    with my colleague, Lena Denis,
  • 41:43 - 41:46
    who is Cartographic Assistant
    at Harvard Library.
  • 41:46 - 41:50
    And Christine Fernsebner Eslao
    is here in spirit.
  • 41:50 - 41:54
    She is currently back in Boston,
    but supporting us from afar.
  • 41:54 - 41:56
    So, we'll be talking
    about Wikidata and Libraries
  • 41:56 - 42:00
    as partners in data production,
    organization, and project inspiration.
  • 42:01 - 42:04
    And our work is part of the Linked Data
    for Production Project.
  • 42:05 - 42:08
    So, Linked Data for Production
    is in its second phase,
  • 42:08 - 42:10
    called Pathway for Implementation.
  • 42:10 - 42:13
    And it's an Andrew W. Mellon
    Foundation grant,
  • 42:13 - 42:16
    involving the partnership
    of several universities,
  • 42:16 - 42:20
    with the goal of constructing a pathway
    for shifting the catalog community
  • 42:20 - 42:25
    to begin describing library
    resources with linked data.
  • 42:25 - 42:27
    And it builds upon a previous grant,
  • 42:27 - 42:30
    but this iteration is focused
    on the practical aspects
  • 42:30 - 42:32
    of the transition.
  • 42:34 - 42:36
    One of these pathways of investigation
  • 42:36 - 42:39
    has been integrating
    library metadata with Wikidata.
  • 42:39 - 42:41
    We have a lot of questions,
  • 42:41 - 42:43
    but some of the ones
    we're most interested in
  • 42:43 - 42:46
    are how we can integrate
    library metadata with Wikidata,
  • 42:46 - 42:50
    and make contribution
    a part of our cataloging workflows,
  • 42:50 - 42:54
    how Wikidata can help us improve
    our library discovery environment,
  • 42:54 - 42:56
    how it can help us reveal
    more relationships
  • 42:56 - 43:00
    and connections within our data
    and with external data sets,
  • 43:00 - 43:04
    and if we have connections in our own data
    that can be added to Wikidata,
  • 43:04 - 43:07
    how libraries can help
    fill in gaps in Wikidata,
  • 43:07 - 43:10
    and how libraries can work
    with local communities
  • 43:10 - 43:13
    to describe library
    and archival resources.
  • 43:14 - 43:17
    Finding answers to these questions
    has focused on the mutual benefit
  • 43:17 - 43:20
    for the library and Wikidata communities.
  • 43:20 - 43:23
    We've learned through starting to work
    on our different Wikidata projects,
  • 43:23 - 43:25
    that many of the issues
    libraries grapple with,
  • 43:25 - 43:29
    like data modeling, identity management,
    data maintenance, documentation,
  • 43:29 - 43:31
    and instruction on linked data,
  • 43:31 - 43:34
    are ones the Wikidata
    community works on too.
  • 43:34 - 43:36
    I'm going to turn things over to Lena
  • 43:36 - 43:40
    to talk about what
    she's been working on now.
  • 43:47 - 43:51
    Hi, so, as Hilary briefly mentioned,
    I work as a map librarian at Harvard,
  • 43:51 - 43:54
    where I process maps, atlases,
    and archives for our online catalog.
  • 43:54 - 43:57
    And while processing two-dimensional
    cartographic works
  • 43:57 - 44:00
    is relatively straightforward,
    cataloging archival collections
  • 44:00 - 44:02
    so that their cartographic resources
    can be made discoverable,
  • 44:02 - 44:04
    has always been more difficult.
  • 44:04 - 44:07
    So, my use case for Wikidata
    is visually modeling relationships
  • 44:07 - 44:10
    between archival collections
    and the individual items within them,
  • 44:10 - 44:13
    as well as between archival drafts
    in published works.
  • 44:13 - 44:17
    So, I used Wikidata to highlight the work
    of our cartographer named Erwin Raisz,
  • 44:17 - 44:20
    who worked at Harvard
    in the early 20th century.
  • 44:20 - 44:23
    He was known for his vividly detailed
    and artistic landforms,
  • 44:23 - 44:24
    like this one on the screen--
  • 44:24 - 44:26
    but also for inventing
    the armadillo projection,
  • 44:26 - 44:29
    writing the first cartography
    textbook in English
  • 44:29 - 44:31
    and various other
    important contributions
  • 44:31 - 44:33
    to the field of geography.
  • 44:33 - 44:35
    And at the Harvard Map Collection,
  • 44:35 - 44:39
    we have a 66-item collection
    of Raisz's field notebooks,
  • 44:39 - 44:41
    which begin when he was a student
    and end just before his death.
  • 44:44 - 44:46
    So, this is the collection-level record
    that I made for them,
  • 44:46 - 44:48
    which merely gives an overview,
  • 44:48 - 44:51
    but his notebooks are full of information
  • 44:51 - 44:53
    that he used in later atlases,
    maps, and textbooks.
  • 44:53 - 44:56
    But researchers don't know how to find
    that trajectory information,
  • 44:56 - 44:59
    and the system
    is not designed to show them.
  • 45:01 - 45:04
    So, I felt that with Wikidata,
    and other Wikimedia platforms,
  • 45:04 - 45:05
    I'd be able to take advantage
  • 45:05 - 45:08
    of information that already exists
    about him on the open web,
  • 45:08 - 45:11
    along with library records
    and a notebook inventory
  • 45:11 - 45:13
    that I had made in an Excel spreadsheet
  • 45:13 - 45:15
    to show relationships and influences
    between his works.
  • 45:16 - 45:19
    So here, you can see how I edited
    and reconciled library data
  • 45:19 - 45:20
    in OpenRefine.
  • 45:20 - 45:23
    And then, I used QuickStatements
    to batch import my results.
  • 45:23 - 45:25
    So, now, I was ready
    to create knowledge graphs
  • 45:25 - 45:28
    with SPARQL queries
    to show patterns of influence.
  • 45:30 - 45:33
    The examples here show
    how I leveraged Wikimedia Commons images
  • 45:33 - 45:35
    that I connected to him,
  • 45:35 - 45:36
    and the hierarchy of some of his works
  • 45:36 - 45:39
    that were contributing
    factors to other works.
  • 45:39 - 45:42
    So, modeling Raisz's works on Wikidata
    allowed me to encompass in a single image,
  • 45:42 - 45:46
    or in this case, in two images,
    the connections that require many pages
  • 45:46 - 45:48
    of bibliographic data to reveal.
  • 45:52 - 45:56
    So, this video is going to load.
  • 45:56 - 45:57
    Yes! Alright.
  • 45:57 - 46:00
    This video is a minute and a half long
    screencast I made,
  • 46:00 - 46:02
    that I'm going to narrate as you watch.
  • 46:02 - 46:05
    It shows the process of inputting
    and then running a SPARQL query,
  • 46:05 - 46:09
    showing hierarchical relationships
    between notebooks, an atlas, and a map
  • 46:09 - 46:11
    that Raisz created about Cuba.
  • 46:11 - 46:13
    He worked there before the revolution,
  • 46:13 - 46:15
    so he had the unique position
    of having support
  • 46:15 - 46:17
    from both the American
    and the Cuban governments.
  • 46:17 - 46:21
    So, I made this query as an example
    to show people who work on Raisz,
  • 46:21 - 46:24
    and who are interested in narrowing down
    what materials they'd like to request
  • 46:24 - 46:26
    when they come to us for research.
  • 46:26 - 46:30
    To make the approach replicable
    for other archival collections,
  • 46:30 - 46:33
    I hope that Harvard and other institutions
    will prioritize Wikidata look-ups
  • 46:33 - 46:35
    as they move to linked data
    cataloging production,
  • 46:35 - 46:38
    which my co-presenters
    can speak to the progress on
  • 46:38 - 46:39
    better than I can.
  • 46:39 - 46:42
    But my work has brought me--
    has brought to mind a particular issue
  • 46:42 - 46:47
    that I see as a future opportunity,
    which is that of archival modeling.
  • 46:47 - 46:52
    So, to an archivist, an item
    is a discrete archival material
  • 46:52 - 46:55
    within a larger collection
    of archival materials
  • 46:55 - 46:57
    that is not a physical location.
  • 46:57 - 47:01
    So an archivist from the American National
    Archives and Records Administration,
  • 47:01 - 47:03
    who is also a Wikidata enthusiast,
  • 47:03 - 47:06
    advised me when I was trying
    to determine how to express this
  • 47:06 - 47:08
    using an example item,
  • 47:08 - 47:10
    that I'm going to show
    as soon as this video is finally over.
  • 47:11 - 47:14
    Alright. Great.
  • 47:20 - 47:22
    Nope, that's not what I wanted.
  • 47:22 - 47:24
    Here we go.
  • 47:31 - 47:32
    It's doing that.
  • 47:32 - 47:34
    (humming)
  • 47:34 - 47:37
    Nope. Sorry. Sorry.
  • 47:40 - 47:43
    Alright, I don't know why
    it's not going full screen again.
  • 47:43 - 47:44
    I can't get it to do anything.
  • 47:44 - 47:47
    But this is the-- oh, my gosh.
  • 47:47 - 47:48
    Stop that. Alright.
  • 47:48 - 47:51
    So, this is the item that I mentioned.
  • 47:52 - 47:54
    So, this was what the archivist
  • 47:54 - 47:56
    from the National Archives
    and Records Administration
  • 47:56 - 47:57
    showed me as an example.
  • 47:57 - 48:02
    And he recommended this compromise,
    which is to use the part of property
  • 48:02 - 48:06
    to connect a lower level description
    to a higher level of description,
  • 48:06 - 48:09
    which allows the relationships
    between different hierarchical levels
  • 48:09 - 48:11
    to be asserted as statements
    and qualifiers.
  • 48:11 - 48:13
    So, in this example that's on screen,
  • 48:13 - 48:16
    the relationship between an item,
    a series, a collection, and a record group
  • 48:16 - 48:20
    are thus contained and described
    within a Wikidata item entity.
  • 48:20 - 48:22
    So, I followed this model
    in my work on Raisz.
  • 48:23 - 48:26
    And one of my images is missing.
  • 48:26 - 48:28
    No, it's not. It's right there. I'm sorry.
  • 48:28 - 48:31
    And so, I followed this model
    on my work on Raisz,
  • 48:31 - 48:33
    but I look forward
    to further standardization.
  • 48:39 - 48:41
    So, another archival project
    Harvard is working on
  • 48:41 - 48:45
    is the Arthur Freedman collection
    of more than 2,000 hours
  • 48:45 - 48:49
    of punk rock performances
    from the 1970s to early 2000s
  • 48:49 - 48:52
    in the Boston and Cambridge,
    Massachusetts areas.
  • 48:52 - 48:55
    It includes many bands and venues
    that no longer exist.
  • 48:56 - 49:00
    So far, work has been done in OpenRefine
    on reconciliation of the bands and venues
  • 49:00 - 49:02
    to see which need an item
    created in Wikidata.
  • 49:03 - 49:06
    A basic item will be created
    via batch process next spring,
  • 49:06 - 49:09
    and then, an edit-a-thon will be
    held in conjunction
  • 49:09 - 49:12
    with the New England Music Library
    Association's meeting in Boston
  • 49:12 - 49:16
    to focus on adding more statements
    to the batch-created items,
  • 49:16 - 49:19
    by drawing on local music
    community knowledge.
  • 49:19 - 49:22
    We're interested in learning more
    about models for pairing librarians
  • 49:22 - 49:26
    and Wiki enthusiasts with new contributors
    who have domain knowledge.
  • 49:26 - 49:29
    Items will eventually be linked
    to digitized video
  • 49:29 - 49:31
    in Harvard's digital collection platform
  • 49:31 - 49:33
    once rights have
    been cleared with artists,
  • 49:33 - 49:35
    which will likely be a slow process.
  • 49:36 - 49:38
    There's also a great amount of interest
  • 49:38 - 49:42
    in moving away from manual cataloging
    and creation of authority data
  • 49:42 - 49:43
    towards identity management,
  • 49:43 - 49:46
    where descriptions
    can be created in batches.
  • 49:46 - 49:48
    An additional project that focused on
  • 49:48 - 49:51
    creating international standard
    name identifiers, or ISNIs,
  • 49:51 - 49:53
    for avant-garde and women filmmakers
  • 49:53 - 49:58
    can be adapted for creating Wikidata items
    for these filmmakers, as well.
  • 49:58 - 50:01
    Spreadsheets with the ISNIs,
    filmmaker names, and other details
  • 50:01 - 50:05
    can be reconciled in OpenRefine,
    and uploaded with QuickStatements.
  • 50:05 - 50:07
    Once people in organizations
    have been described,
  • 50:07 - 50:09
    we'll move toward describing
    the films in Wikidata,
  • 50:09 - 50:13
    which will likely present
    some additional modeling challenges.
  • 50:13 - 50:15
    A library presentation
    wouldn't be complete
  • 50:15 - 50:17
    without a MARC record.
  • 50:17 - 50:20
    Here, you can see the record
    for Karen Aqua's taxonomy film,
  • 50:20 - 50:22
    where her ISNI and Wikidata Q number
  • 50:22 - 50:24
    have been added to the 100 field.
  • 50:24 - 50:27
    The ISNIs and Wikidata Q numbers
    that have been created
  • 50:27 - 50:30
    can then be batch added
    back into MARC records via MarcEdit.
  • 50:30 - 50:33
    You might be asking why I'm showing you
    this ugly MARC record,
  • 50:33 - 50:36
    instead of some beautiful
    linked data statements.
  • 50:36 - 50:39
    And that's because our libraries
    will be working in a hybrid environment
  • 50:39 - 50:40
    for some time.
  • 50:40 - 50:42
    Our library catalogs still rely
    on MARC records,
  • 50:42 - 50:44
    so by adding in these URIs,
  • 50:44 - 50:46
    we can try to take advantage
    of linked data,
  • 50:46 - 50:48
    while our systems still use MARC.
  • 50:49 - 50:53
    Adding URIs into MARC records
    makes an additional aspect
  • 50:53 - 50:54
    of our project possible.
  • 50:54 - 50:57
    Work has been done at Stanford
    and Cornell to bring data
  • 50:57 - 51:02
    from Wikidata into our library catalog
    using URIs already in our MARC records.
  • 51:02 - 51:05
    You can see an example
    of a knowledge panel,
  • 51:05 - 51:07
    where all the data is sourced
    from Wikidata,
  • 51:07 - 51:11
    and links back to the item itself,
    along with an invitation to contribute.
  • 51:11 - 51:15
    This is currently in a test environment,
    not in production in our catalog.
  • 51:15 - 51:17
    Ideally, eventually,
    these will be generated
  • 51:17 - 51:20
    from linked data descriptions
    of library resources
  • 51:20 - 51:23
    created using Sinopia,
    our linked data editor
  • 51:23 - 51:25
    developed for cataloging.
  • 51:25 - 51:28
    We found that adding a look-up
    to Wikidata in Sinopia is difficult.
  • 51:28 - 51:32
    The scale and modeling of Wikidata
    makes it hard to partition the data
  • 51:32 - 51:34
    to be able to look up typed entities,
  • 51:34 - 51:35
    and we've run into the problem
  • 51:35 - 51:37
    of SPARQL not being good
    for keyword search,
  • 51:37 - 51:42
    but wanting our keyword APIs
    to return SPARQL-like RDF descriptions.
  • 51:42 - 51:45
    So, as you can see, we still have
    quite a bit of work to do.
  • 51:45 - 51:48
    This round of the grant
    runs until June 2020,
  • 51:48 - 51:50
    so, we'll be continuing our exploration.
  • 51:50 - 51:53
    And I just wanted to invite anyone
  • 51:53 - 51:58
    who has a continued interest in talking
    about Wikidata and libraries,
  • 51:58 - 52:01
    I lead a Wikidata Affinity Group
    that's open to anyone to join.
  • 52:01 - 52:03
    We meet every two weeks,
  • 52:03 - 52:06
    and our next call is Tuesday,
    November the 5th,
  • 52:06 - 52:08
    so if you're interested
    in continuing discussions,
  • 52:08 - 52:10
    I would love to talk with you further.
  • 52:10 - 52:12
    Thank you, everyone.
  • 52:12 - 52:14
    And thank you to the other presenters
  • 52:14 - 52:17
    for talking about all
    of their wonderful projects.
  • 52:17 - 52:21
    (applause)
Video Language:
English
Duration:
52:29
