1
00:00:00,000 --> 00:00:08,249
Good afternoon, everybody.
2
00:00:08,929 --> 00:00:12,068
Welcome to our GLAM panel.
3
00:00:13,124 --> 00:00:17,009
Before we start, I just have
two announcements to make.
4
00:00:17,329 --> 00:00:23,049
First of all, please make extensive use
of our Etherpad to take notes.
5
00:00:23,781 --> 00:00:27,998
And the second one is directed
at our audience at home,
6
00:00:27,998 --> 00:00:29,819
or wherever you are.
7
00:00:29,819 --> 00:00:30,958
If you have any questions,
8
00:00:30,958 --> 00:00:34,028
you can also write them into the Etherpad,
9
00:00:34,028 --> 00:00:37,828
and our room angels
will keep track of them.
10
00:00:39,328 --> 00:00:44,348
So, we decided that for this year's panel,
11
00:00:45,388 --> 00:00:48,868
after seeing all the contributions
that were made,
12
00:00:49,128 --> 00:00:53,538
we would focus on the role of Wikidata
within data ecosystems
13
00:00:53,551 --> 00:00:57,199
that go beyond the actual
Wikimedia projects,
14
00:00:57,199 --> 00:00:59,747
which is also absolutely in line
15
00:00:59,747 --> 00:01:03,677
with the new Wikimedia
Foundation strategy.
16
00:01:04,652 --> 00:01:07,947
And we have, today, four panelists.
17
00:01:08,387 --> 00:01:09,876
Three plus one.
18
00:01:09,876 --> 00:01:13,636
So, I would like to ask you on stage,
19
00:01:13,636 --> 00:01:15,875
so we can introduce you.
20
00:01:22,205 --> 00:01:24,706
So, we have Susanna Ånäs.
21
00:01:25,385 --> 00:01:29,296
She's a long-time free-knowledge activist
22
00:01:29,296 --> 00:01:31,276
involved in many WikiProjects.
23
00:01:31,916 --> 00:01:35,526
And she will be reporting today
on the project in cooperation
24
00:01:35,526 --> 00:01:38,396
with the Finnish National Library.
25
00:01:38,856 --> 00:01:43,435
Then we have, next to me, Mike Dickison,
26
00:01:43,435 --> 00:01:46,325
who will be second in this order.
27
00:01:46,995 --> 00:01:50,283
He is a museum curator from New Zealand.
28
00:01:50,283 --> 00:01:53,815
He's a zoologist and a Wikipedia editor.
29
00:01:53,815 --> 00:01:58,788
And he was New Zealand's
first Wikipedian at Large
30
00:01:58,788 --> 00:02:02,565
in 2018 and 2019.
31
00:02:02,565 --> 00:02:06,634
And he will tell us
about his experience in that role,
32
00:02:06,634 --> 00:02:13,105
and what kind of role Wikidata
is starting to play in that context.
33
00:02:15,784 --> 00:02:18,135
Then we have Joachim Neubert
34
00:02:18,135 --> 00:02:23,461
from the Leibniz Information Center
for Economics in Kiel and Hamburg.
35
00:02:24,011 --> 00:02:29,131
He has been working on making the largest
public press archives worldwide
36
00:02:29,131 --> 00:02:34,655
more accessible to the public,
and he's using Wikidata to do that.
37
00:02:35,890 --> 00:02:39,091
And then I will go last.
My name is Beat Estermann.
38
00:02:39,091 --> 00:02:43,080
I work for Bern University
of Applied Sciences, in Switzerland.
39
00:02:43,640 --> 00:02:49,950
And I've been a long-time promoter
for OpenGLAM in Switzerland and Austria.
40
00:02:50,335 --> 00:02:54,840
And today I will report
on my activities in connection
41
00:02:54,840 --> 00:02:59,460
with the mandate from the Canadian Arts
Presenting Association,
42
00:02:59,460 --> 00:03:01,270
focusing on performing arts.
43
00:03:02,121 --> 00:03:04,440
Not primarily on Wikidata,
44
00:03:04,440 --> 00:03:08,421
but you will see Wikidata
is starting to play a role there, as well.
45
00:03:08,970 --> 00:03:13,250
So now, most of us
will take our seat here,
46
00:03:13,250 --> 00:03:16,980
and I will give the floor to Susanna.
47
00:03:18,300 --> 00:03:22,769
Okay. So, hello. My name is Susanna Ånäs,
48
00:03:22,769 --> 00:03:25,769
and I work part-time for Wikimedia Finland
49
00:03:25,769 --> 00:03:27,079
as a GLAM coordinator,
50
00:03:27,079 --> 00:03:32,655
and I also do consulting
in the open knowledge sphere.
51
00:03:32,655 --> 00:03:36,049
And this is a discourse,
maybe, of [inaudible].
52
00:03:36,049 --> 00:03:38,719
So, I have been involved in the workings
53
00:03:38,719 --> 00:03:45,642
of the geographic data group of the--
54
00:03:48,439 --> 00:03:51,147
well, I looked it up,
but it isn't in English,
55
00:03:51,147 --> 00:03:54,497
but, cultural heritage initiative
of the Finnish government.
56
00:03:54,917 --> 00:03:59,775
So, this is about place names
57
00:03:59,775 --> 00:04:03,300
and how they are represented
58
00:04:03,300 --> 00:04:07,466
in different repositories
in the GLAM sector in Finland,
59
00:04:07,466 --> 00:04:11,755
and how they are trying to pull together
these different sources,
60
00:04:11,755 --> 00:04:17,906
and how they are informed
by modeling in Wikidata and elsewhere.
61
00:04:17,906 --> 00:04:23,315
So, here we see the three main sources
for these YSO places,
62
00:04:23,315 --> 00:04:27,944
which is part of the national ontology--
general ontology.
63
00:04:27,944 --> 00:04:29,665
AHAA is for Finnish archives,
64
00:04:29,665 --> 00:04:31,645
Melinda is for Finnish libraries,
65
00:04:31,645 --> 00:04:33,750
and KOOKOS is for Finnish museums.
66
00:04:33,750 --> 00:04:37,585
So, there are three, also,
content management systems
67
00:04:37,585 --> 00:04:40,290
that come together in these YSO places.
68
00:04:40,745 --> 00:04:47,365
And there are exchanges with Wikidata
already taking place,
69
00:04:47,965 --> 00:04:53,065
as well as the names project
for the National Land Survey.
70
00:04:53,065 --> 00:04:56,285
And then, there's a third project,
the Finnish Names Archive,
71
00:04:56,285 --> 00:05:00,391
which doesn't yet contribute to this,
72
00:05:00,391 --> 00:05:02,715
but there are plans for that.
73
00:05:02,715 --> 00:05:09,175
So, one of the key modeling issues
in this whole problem area
74
00:05:09,175 --> 00:05:15,226
is that there are three types
of elements in place names
75
00:05:16,116 --> 00:05:18,195
represented in this project.
76
00:05:18,195 --> 00:05:21,236
One of them is the place,
the one that has location.
77
00:05:21,236 --> 00:05:24,766
And one of them is the place name,
the toponym, for example.
78
00:05:25,006 --> 00:05:27,696
And then, there are sources,
which are documents
79
00:05:27,696 --> 00:05:30,756
from which both of these can be derived,
80
00:05:30,756 --> 00:05:32,565
or like, backed up with.
81
00:05:32,565 --> 00:05:35,845
The YSO places--
here, on the top right,
82
00:05:35,845 --> 00:05:38,799
you will see the same diagram again.
83
00:05:38,799 --> 00:05:41,189
It focuses mainly on the places.
84
00:05:42,619 --> 00:05:46,279
The maintainer of this
is the Finnish National Library,
85
00:05:46,279 --> 00:05:49,159
and the Finto project.
86
00:05:50,199 --> 00:05:55,608
There are now more than 7,000 places
in Finnish and Swedish
87
00:05:55,608 --> 00:05:59,438
and over 3,000 in English,
88
00:05:59,438 --> 00:06:03,042
and they are CC0 licensed.
89
00:06:03,042 --> 00:06:06,008
So, here you can see the service of Finto.
90
00:06:06,008 --> 00:06:09,883
And a place-- I chose Sevettijärvi.
91
00:06:09,883 --> 00:06:13,908
It is now also related
to our language project
92
00:06:13,908 --> 00:06:15,268
with the Skolt Sámi--
93
00:06:15,268 --> 00:06:18,877
this is a place
in the very north of Finland
94
00:06:18,877 --> 00:06:21,765
inhabited by Skolt Sámi.
95
00:06:21,765 --> 00:06:27,264
So, here you can see the place
which belongs to the--
96
00:06:27,264 --> 00:06:32,724
well, you will see the data
about this place.
97
00:06:32,724 --> 00:06:37,952
You can see that it is connected
to Wikidata,
98
00:06:37,952 --> 00:06:42,344
as well as this National Land Survey data.
99
00:06:43,192 --> 00:06:47,406
Here we go. And you will see
this in more detail, here.
100
00:06:48,582 --> 00:06:52,360
It is also hierarchically arranged
101
00:06:52,360 --> 00:06:56,310
inside this repository.
102
00:06:57,670 --> 00:07:00,460
Well, actually,
the actual place is not seen,
103
00:07:00,460 --> 00:07:05,880
but it is underneath this municipality,
104
00:07:05,880 --> 00:07:08,010
as well as the region,
105
00:07:08,010 --> 00:07:10,154
and Finland as a country,
and Nordic countries,
106
00:07:10,154 --> 00:07:12,650
the broader region.
107
00:07:12,650 --> 00:07:14,400
Here you can see that many of these
108
00:07:14,400 --> 00:07:17,891
have been matched
with Wikidata previously
109
00:07:18,730 --> 00:07:22,230
through Mix'n'Match,
and there are still remaining ones.
110
00:07:22,230 --> 00:07:27,900
But then, the number of names
is not that high.
111
00:07:28,411 --> 00:07:30,844
It's only less than 5,000.
112
00:07:31,570 --> 00:07:33,860
So, then there is this other repository
113
00:07:33,860 --> 00:07:38,040
by the Finnish Geospatial
Platform Project--
114
00:07:38,040 --> 00:07:39,199
Place Names Cards.
115
00:07:39,199 --> 00:07:41,729
These are all the place names
that are on Finnish maps.
116
00:07:42,130 --> 00:07:48,308
And they have the linked data,
which is licensed CC BY 4.0.
117
00:07:48,518 --> 00:07:54,478
800,000 map labels in Finnish, Swedish,
and all those three Saami languages
118
00:07:54,478 --> 00:07:55,778
that are in Finland.
119
00:07:55,997 --> 00:07:58,877
And they have
two different types of entities.
120
00:07:58,877 --> 00:08:00,680
Some of them are places,
and the other ones
121
00:08:00,680 --> 00:08:02,651
are place names, toponyms.
122
00:08:02,651 --> 00:08:05,271
And they both have persistent URIs.
123
00:08:06,001 --> 00:08:09,721
Here's, for example,
the same Sevettijärvi, first in Finnish,
124
00:08:09,721 --> 00:08:14,001
and then all those three Saami languages,
as well as the geographic data,
125
00:08:14,001 --> 00:08:18,821
and then there is more information
about that, like the place type,
126
00:08:19,630 --> 00:08:20,841
et cetera.
127
00:08:21,640 --> 00:08:28,411
Here is the card for the place name,
the toponym, having its own URI.
128
00:08:29,943 --> 00:08:33,738
Sorry, it seems that it's not translated
into the English list.
129
00:08:34,432 --> 00:08:39,151
So, multilinguality
is not covering the whole project.
130
00:08:40,167 --> 00:08:42,523
Okay, we come
to the Finnish Names Archive.
131
00:08:42,523 --> 00:08:46,234
This is a project by the Institute
for the Languages of Finland,
132
00:08:46,234 --> 00:08:50,456
and these represent not the places,
not the place names,
133
00:08:50,456 --> 00:08:52,603
but they are actually sources for those.
134
00:08:52,603 --> 00:08:57,123
So, these are three million
field notes of place names,
135
00:08:57,723 --> 00:08:59,529
and it is a Wikibase project.
136
00:08:59,529 --> 00:09:03,325
They are in a Wikibase,
mainly in Finnish, some in Swedish.
137
00:09:03,325 --> 00:09:08,111
An outstanding collection of Saami names,
which we are very interested in.
138
00:09:08,111 --> 00:09:10,141
And they are licensed CC BY.
139
00:09:10,380 --> 00:09:14,850
And that is also a challenge
from the Wikidata point of view.
140
00:09:14,850 --> 00:09:17,640
But if there was a Finnish local Wikibase,
141
00:09:17,640 --> 00:09:22,632
we might be able to first work
on them in that project.
142
00:09:23,034 --> 00:09:25,343
So, here's a screenshot of that,
143
00:09:26,443 --> 00:09:31,323
showing that there's information
about the place, the maps--
144
00:09:31,323 --> 00:09:35,227
the maps that the collectors
initially use,
145
00:09:35,227 --> 00:09:40,713
and the card that they produce
of the information they collected.
146
00:09:41,455 --> 00:09:46,416
So, here's one of those cards
147
00:09:46,416 --> 00:09:48,736
broken down into data
148
00:09:48,736 --> 00:09:50,676
that is included in them.
149
00:09:51,166 --> 00:09:53,751
So, then there's
this linked data project
150
00:09:53,751 --> 00:09:56,336
by the Helsinki Digital Humanities Lab
151
00:09:56,336 --> 00:09:58,256
and the Semantic
152
00:09:58,256 --> 00:10:01,446
Computing Group of Aalto University--
153
00:10:01,446 --> 00:10:06,525
and together with this Institute
for the Languages of Finland--
154
00:10:06,525 --> 00:10:07,994
the Names Sampo.
155
00:10:07,994 --> 00:10:11,024
And this is an aggregated
research interface
156
00:10:11,024 --> 00:10:13,503
to several place name sources.
157
00:10:13,503 --> 00:10:17,704
Here you can see that many
of the sources are out there on the left,
158
00:10:17,704 --> 00:10:20,763
and then, you can make
different kinds of visualizations
159
00:10:20,763 --> 00:10:22,653
based on this data.
160
00:10:22,653 --> 00:10:24,438
And, yeah.
161
00:10:25,289 --> 00:10:30,603
So, I've been bringing up this idea
of modeling for a local Wikibase
162
00:10:30,603 --> 00:10:32,693
that we could do with this data.
163
00:10:32,693 --> 00:10:36,580
But when we enter
these modeling questions,
164
00:10:36,580 --> 00:10:37,770
how do we model?
165
00:10:37,770 --> 00:10:41,589
There are different ways,
different traditions in each of these.
166
00:10:45,682 --> 00:10:50,360
And the good thing about it
is it could also serve minority languages
167
00:10:50,360 --> 00:10:52,475
with very little effort.
168
00:10:53,243 --> 00:10:57,179
Okay. So, here we have
the two basic options:
169
00:10:57,179 --> 00:11:01,660
the SAPO model, which is
the Finnish Space-Time Ontology,
170
00:11:02,841 --> 00:11:04,421
and the Wikidata model.
171
00:11:04,421 --> 00:11:07,909
Here you can see
that Wikidata items tend to persist.
172
00:11:07,909 --> 00:11:12,871
Ideally, they remain the same
with the changing properties.
173
00:11:12,871 --> 00:11:16,909
Whereas, in the SAPO model,
these items become new
174
00:11:16,909 --> 00:11:20,399
when there is a change,
such as area change and name change.
175
00:11:21,179 --> 00:11:26,219
So here, we come back to this division
176
00:11:26,219 --> 00:11:31,719
between these three different dimensions
of places, place names.
177
00:11:32,099 --> 00:11:37,659
So, should we make these place names
into entities or properties?
178
00:11:37,659 --> 00:11:39,248
Wikidata uses properties,
179
00:11:39,248 --> 00:11:43,098
whereas this land survey
project has entities.
180
00:11:43,838 --> 00:11:46,177
Or should we make them into lexemes?
181
00:11:46,177 --> 00:11:51,426
Wikidata has chosen to work
with properties,
182
00:11:51,426 --> 00:11:54,956
textual properties
for place names over lexemes.
183
00:11:55,567 --> 00:11:57,818
I'm sorry, the other way around.
184
00:11:57,818 --> 00:11:59,631
So, the names are...
185
00:12:03,056 --> 00:12:04,941
properties, not lexemes.
186
00:12:05,874 --> 00:12:06,877
Right.
187
00:12:07,165 --> 00:12:11,132
And maybe the shortcoming of the Wikibase
188
00:12:11,132 --> 00:12:16,340
is the lack of geographical
shapes inside that--
189
00:12:16,340 --> 00:12:20,958
like in the basic setup of it,
190
00:12:20,958 --> 00:12:24,748
so one would have to add
more technology into the stack
191
00:12:24,748 --> 00:12:29,688
to be able to use local geographic shapes.
192
00:12:29,688 --> 00:12:31,823
And a federation is really needed
193
00:12:31,823 --> 00:12:38,168
to be able to take advantage
of the Wikidata corpus.
194
00:12:38,648 --> 00:12:43,052
So, I'm done already. Thank you.
195
00:12:43,616 --> 00:12:45,827
(applause)
196
00:13:01,255 --> 00:13:02,514
Okay.
197
00:13:03,274 --> 00:13:05,011
(speaking in Maori)
198
00:13:05,011 --> 00:13:07,655
Welcome, everyone.
My name is Mike Dickison.
199
00:13:08,375 --> 00:13:10,149
And for a year,
200
00:13:10,149 --> 00:13:13,075
I was New Zealand Wikipedian at Large.
201
00:13:13,935 --> 00:13:16,935
You might wonder
what a Wikipedian at Large is.
202
00:13:17,856 --> 00:13:21,875
Because if you actually look out for it,
there is no such thing, as we can see.
203
00:13:22,735 --> 00:13:25,855
It's a term that I made up
in the grant proposal,
204
00:13:26,153 --> 00:13:29,003
which the foundation
seemed to like very much.
205
00:13:29,983 --> 00:13:31,533
And so, we ran with it.
206
00:13:32,303 --> 00:13:36,633
So, for a year, I went through
35 different institutions,
207
00:13:37,053 --> 00:13:41,053
resident in most of them,
running training sessions,
208
00:13:41,493 --> 00:13:44,363
organizing public events,
and trying to develop
209
00:13:44,363 --> 00:13:47,230
a Wikimedia strategy for each one.
210
00:13:47,998 --> 00:13:49,498
It was a very interesting experience,
211
00:13:49,498 --> 00:13:53,267
and you encounter a wide range
of different projects and people.
212
00:13:53,267 --> 00:13:58,211
And I wanted to try and talk through
some of the different projects
213
00:13:58,211 --> 00:14:00,345
that dealt with Wikidata
214
00:14:00,872 --> 00:14:05,171
in interesting or, perhaps,
illuminating ways,
215
00:14:05,171 --> 00:14:07,591
that might be useful for folks to discuss.
216
00:14:08,561 --> 00:14:11,961
The project was initially
a Wikipedia project by the name,
217
00:14:11,961 --> 00:14:14,651
simply because that was what people
were familiar with,
218
00:14:15,281 --> 00:14:18,360
and so we organized
multiple different events
219
00:14:18,360 --> 00:14:23,135
such as very traditional edit-a-thons,
gender gap work, and so forth.
220
00:14:24,607 --> 00:14:26,752
[And a bunch you can see] [inaudible],
221
00:14:27,105 --> 00:14:30,812
and a bunch of very successful
new editors recruited, and so forth.
222
00:14:31,754 --> 00:14:34,454
We did bulk uploads into Commons.
223
00:14:35,454 --> 00:14:41,246
In this case, there was a collection
of over 1,000 original artworks
224
00:14:41,246 --> 00:14:46,047
by an entomological
illustrator, Des Helmore,
225
00:14:46,047 --> 00:14:47,927
which had been sitting on a hard drive,
226
00:14:47,927 --> 00:14:50,357
at Landcare Research for ten years,
227
00:14:50,357 --> 00:14:52,322
and we were able
to get clearance to release those
228
00:14:52,322 --> 00:14:54,245
all under a CC BY license.
229
00:14:54,245 --> 00:14:57,963
So, easy wins to show to people there.
230
00:14:57,963 --> 00:15:01,095
Everyone can understand
lots of pictures of beetles.
231
00:15:01,095 --> 00:15:06,681
Everyone can understand workshops
devoted to fixing the gender gap.
232
00:15:07,250 --> 00:15:10,251
But Wikidata
is much more difficult to sell
233
00:15:10,251 --> 00:15:12,280
to people in the GLAM sector,
234
00:15:12,280 --> 00:15:15,095
or anyone outside
of our particular movement.
235
00:15:16,107 --> 00:15:19,717
So, I began to realize that Wikidata
236
00:15:19,717 --> 00:15:22,634
was going to be a more
and more important part
237
00:15:22,634 --> 00:15:25,883
of the Wikipedian at Large projects.
238
00:15:25,883 --> 00:15:30,472
So, as we went through, it became
a larger and larger component
239
00:15:30,472 --> 00:15:31,849
of what I was doing.
240
00:15:31,849 --> 00:15:36,350
And I began to try and teach myself
more about Wikidata as well,
241
00:15:36,800 --> 00:15:39,515
because I was beginning to see
how important it was.
242
00:15:40,287 --> 00:15:41,989
So, this one project--
243
00:15:41,989 --> 00:15:46,325
the kakapo is a native
New Zealand flightless parrot.
244
00:15:48,096 --> 00:15:51,335
We worked with
the Department of Conservation,
245
00:15:51,335 --> 00:15:54,299
whose job is to save
this species from extinction,
246
00:15:54,299 --> 00:15:55,643
and pitched the idea,
247
00:15:55,643 --> 00:15:59,253
"What if we put every
single kakapo into Wikidata?"
248
00:16:01,221 --> 00:16:02,701
And that may seem ridiculous,
249
00:16:02,701 --> 00:16:05,580
but it's actually
a perfectly doable project.
250
00:16:06,621 --> 00:16:08,427
A few of them are in there already.
251
00:16:09,100 --> 00:16:11,601
A key thing to notice here
is there are not many kakapos.
252
00:16:11,615 --> 00:16:13,245
So, it's a manageable task.
253
00:16:13,245 --> 00:16:16,656
There were 148 when I started,
and then one died.
254
00:16:16,935 --> 00:16:20,995
And they've just had
a great breeding season up to 213.
255
00:16:21,765 --> 00:16:25,045
This is great. This is the most kakapo
there have been for over 50 years.
256
00:16:25,505 --> 00:16:28,260
So, this was also a big deal.
257
00:16:28,260 --> 00:16:30,725
This was on the news
every day in New Zealand.
258
00:16:31,285 --> 00:16:33,224
Each new one that was born--
259
00:16:33,224 --> 00:16:34,414
(man) In the New York Times.
260
00:16:34,414 --> 00:16:35,673
(Mike) Did it? Oh, lovely.
261
00:16:35,673 --> 00:16:38,522
Yeah, this was national news.
Everyone likes these birds.
262
00:16:39,002 --> 00:16:40,663
But something interesting about them
263
00:16:40,663 --> 00:16:43,932
is that, unlike species
that are more populous,
264
00:16:43,932 --> 00:16:47,822
every single kakapo is named,
has a unique name
265
00:16:47,822 --> 00:16:49,817
and a unique ID number.
266
00:16:49,817 --> 00:16:52,442
And often has good biographical data
267
00:16:52,442 --> 00:16:54,672
about where and when they were born,
268
00:16:54,672 --> 00:16:56,972
were hatched, who their father
and mother were,
269
00:16:56,972 --> 00:16:58,713
when they died, if they died.
270
00:16:58,713 --> 00:17:01,352
So, there is, in fact,
a Department of Conservation database
271
00:17:01,352 --> 00:17:02,882
of all this information.
272
00:17:02,882 --> 00:17:06,723
And one of the most famous kakapos,
of course, is Sirocco,
273
00:17:06,723 --> 00:17:09,726
who you can see is named
after a wind, was born there.
274
00:17:09,726 --> 00:17:13,225
Sirocco has a Twitter account,
275
00:17:13,705 --> 00:17:15,927
which Wikidata had some problems with,
276
00:17:15,927 --> 00:17:18,562
because, apparently,
they just can't have Twitter accounts.
277
00:17:18,562 --> 00:17:20,342
I don't know about that.
278
00:17:21,121 --> 00:17:23,456
He's even featured
on an album cover, and so forth.
279
00:17:23,456 --> 00:17:25,716
So there are multiple properties of this,
280
00:17:25,716 --> 00:17:28,258
probably one of the most famous
individual kakapo.
281
00:17:28,258 --> 00:17:30,337
So, I pitched to the Department
of Conservation,
282
00:17:30,337 --> 00:17:33,245
"Why don't we try and do this
with every single one?"
283
00:17:33,245 --> 00:17:37,665
And so, they had to think about
how much of the biographical data
284
00:17:37,665 --> 00:17:39,365
could be made public.
285
00:17:39,365 --> 00:17:41,225
And they came up with a short list.
286
00:17:41,225 --> 00:17:46,644
And now we've got, I think, 212,
210--I think a couple died--
287
00:17:46,644 --> 00:17:50,703
living kakapo that are all candidates now.
288
00:17:50,703 --> 00:17:52,933
And they only get a name when they fledge.
289
00:17:52,933 --> 00:17:56,172
They have a code number until then,
while they're still babies.
290
00:17:56,186 --> 00:17:58,227
So, when we've got the full-fledged crop,
291
00:17:58,227 --> 00:18:01,806
we're going to create
a complete Wikidata--
292
00:18:01,806 --> 00:18:04,225
the entire species will be in Wikidata.
293
00:18:04,586 --> 00:18:06,605
But we need to come up
with a property for DOC ID--
294
00:18:06,605 --> 00:18:08,875
I actually would like to talk
with folks about that.
295
00:18:08,875 --> 00:18:11,266
Should we be using a very specific ID,
296
00:18:11,266 --> 00:18:13,136
or should we be coming up with an ID
297
00:18:13,136 --> 00:18:17,665
that would work for all individual birds
or plants or animals
298
00:18:17,665 --> 00:18:21,965
that have been tagged
in any scientific research project?
299
00:18:21,965 --> 00:18:23,795
It's a good question.
300
00:18:25,105 --> 00:18:27,465
Second project was
Christchurch Art Gallery.
301
00:18:28,225 --> 00:18:31,523
There are very few paintings
of Colin McCahon,
302
00:18:31,523 --> 00:18:33,963
New Zealand's most famous
artist in existence.
303
00:18:33,963 --> 00:18:36,704
This is a drawing he did
for the New Zealand School Journal,
304
00:18:36,704 --> 00:18:38,424
which was government-funded at the time.
305
00:18:38,424 --> 00:18:40,704
So, it's actually in Archives New Zealand
306
00:18:40,704 --> 00:18:42,294
who own the copyright for that.
307
00:18:42,294 --> 00:18:44,333
This is a very unusual situation.
308
00:18:45,014 --> 00:18:47,073
So, I worked with
Christchurch Art Gallery
309
00:18:47,073 --> 00:18:48,993
who, along with Auckland Art Gallery,
310
00:18:48,993 --> 00:18:52,954
maintain a site called
Find New Zealand Artists.
311
00:18:52,954 --> 00:18:55,654
The job of which is to keep track
of the holdings--
312
00:18:55,654 --> 00:18:58,403
every institution that has holdings
of the New Zealand artist.
313
00:18:58,403 --> 00:19:03,163
So, about 18,000 different artists
in their database,
314
00:19:03,163 --> 00:19:05,517
and most with very little
information at all.
315
00:19:06,233 --> 00:19:08,992
So, we did a standard sort of Mix'n'Match.
316
00:19:08,992 --> 00:19:13,673
We did an export of the ones
that had at least a birth date,
317
00:19:13,673 --> 00:19:17,545
or a death date, or a place of birth,
or a place of death.
318
00:19:17,545 --> 00:19:20,614
So, that's not restricting it very much.
319
00:19:20,614 --> 00:19:23,484
And even then, we were not able
to match quite a few,
320
00:19:23,484 --> 00:19:25,954
but we've got about 1,500 now
321
00:19:25,954 --> 00:19:28,603
that are matched
to known artists in Wikidata,
322
00:19:28,603 --> 00:19:30,123
which is nice.
323
00:19:30,123 --> 00:19:31,783
But what was appealing to them--
324
00:19:31,783 --> 00:19:33,523
this is their website,
325
00:19:33,523 --> 00:19:39,213
which really just maintains
the holdings links there.
326
00:19:39,213 --> 00:19:44,523
But this biographical data,
which they create by hand, currently,
327
00:19:44,523 --> 00:19:46,063
for every single artist.
328
00:19:46,063 --> 00:19:48,803
And the act of exporting
and putting into Mix'n'Match
329
00:19:48,803 --> 00:19:52,363
exposed numerous typos
and mistakes and such
330
00:19:52,363 --> 00:19:53,723
that they haven't noticed.
331
00:19:53,723 --> 00:19:56,123
And it's only when you start
running things through [Excel],
332
00:19:56,123 --> 00:19:57,272
these things show up.
333
00:19:57,272 --> 00:20:01,720
And the value of Wikidata
was suddenly conveyed to them
334
00:20:01,720 --> 00:20:05,527
when I said, "You can just suck in
that information from Wikidata."
335
00:20:06,548 --> 00:20:09,507
And that made them sit up straight.
336
00:20:09,507 --> 00:20:11,748
So this, I think, is one
of the selling points.
337
00:20:11,748 --> 00:20:14,907
When you have this carefully
hand-curated website
338
00:20:14,907 --> 00:20:19,344
with 18,000 entries, full of mistakes,
and tell them there's another way,
339
00:20:19,344 --> 00:20:20,558
that they can get other people
340
00:20:20,558 --> 00:20:23,192
to do some of this fact-checking
and correction for them--
341
00:20:23,192 --> 00:20:24,813
that's when it sinks home.
342
00:20:25,143 --> 00:20:27,293
And then I announced I was pitching the idea
343
00:20:27,293 --> 00:20:30,313
that they "Wikidatafy"
this entire history book
344
00:20:30,313 --> 00:20:33,333
of the New Zealand artists
in Christchurch in the '30s,
345
00:20:33,333 --> 00:20:36,833
and run through--just published--
and run through every single person,
346
00:20:36,833 --> 00:20:39,453
connection, place, exhibition, and such.
347
00:20:39,453 --> 00:20:43,103
But it's a manageable sized project,
and they're very excited by this.
348
00:20:44,303 --> 00:20:46,843
And thirdly, I wanted to show you
Maori Subject Headings.
349
00:20:46,843 --> 00:20:50,811
A waka is a Maori name
for a particular kind of canoe,
350
00:20:50,811 --> 00:20:52,732
a war canoe.
351
00:20:52,732 --> 00:20:55,952
So, in the National Library
of New Zealand,
352
00:20:55,952 --> 00:20:58,530
there's a listing for waka,
because the National Library
353
00:20:58,530 --> 00:21:02,805
actually has its own dictionary
of Maori Subject Headings,
354
00:21:03,299 --> 00:21:04,474
in the Maori language.
355
00:21:04,474 --> 00:21:06,475
So, there it defines a waka,
356
00:21:07,175 --> 00:21:09,512
in Maori and English.
357
00:21:10,182 --> 00:21:12,372
But it also has a whole lot
of narrower terms,
358
00:21:12,372 --> 00:21:14,222
you can see there on the side there.
359
00:21:14,222 --> 00:21:16,062
A typical one would be taurapa.
360
00:21:16,237 --> 00:21:19,774
And a definition first in Maori,
and then in English.
361
00:21:19,774 --> 00:21:22,249
It's the carved sternpost
that you can see there.
362
00:21:22,695 --> 00:21:24,482
And in English, you would say "sternpost,"
363
00:21:24,482 --> 00:21:26,959
but you can't use
the word "sternpost" for taurapa,
364
00:21:26,959 --> 00:21:31,054
because taurapa only works
for particular kinds of war canoes.
365
00:21:31,420 --> 00:21:34,460
So, there's no English word
equivalent for that.
366
00:21:35,108 --> 00:21:37,909
And I suddenly realized
that here is an entire ontology
367
00:21:37,909 --> 00:21:42,177
of cultural-specific terms that have been
very carefully worked out
368
00:21:42,177 --> 00:21:45,043
and verified by the National
Library with Maori,
369
00:21:45,043 --> 00:21:49,733
constantly being added to and improved
with definitions, with descriptions,
370
00:21:49,733 --> 00:21:51,803
in both English and Maori.
371
00:21:51,803 --> 00:21:52,956
Really exciting.
372
00:21:52,956 --> 00:21:56,228
I suddenly thought we could put
this whole lot into Wikidata--
373
00:21:56,228 --> 00:22:00,596
Maori first, and then translated
into English, as required.
374
00:22:00,596 --> 00:22:02,291
Be a nice change, wouldn't it!
375
00:22:03,081 --> 00:22:05,046
And here's the copyright licensing.
376
00:22:05,046 --> 00:22:08,726
Unfortunately, NonCommercial-NoDerivs.
377
00:22:10,346 --> 00:22:12,346
So now I have to start
the conversation with them
378
00:22:12,346 --> 00:22:14,524
about why did they pick that license.
379
00:22:15,675 --> 00:22:19,970
And possibly because they only got
[buy in] from Maori,
380
00:22:19,970 --> 00:22:22,679
who agreed to sit down
and [inaudible] this stuff
381
00:22:22,679 --> 00:22:24,039
if there was a guarantee
382
00:22:24,039 --> 00:22:27,339
that none of this information
could be used for commercial purposes.
383
00:22:27,920 --> 00:22:31,999
So, that's one of the frustrating
aspects of the task
384
00:22:31,999 --> 00:22:34,238
is coming up against
these sorts of restrictions.
385
00:22:34,238 --> 00:22:37,019
So, those are the three things
I wanted to put out in front
386
00:22:37,019 --> 00:22:38,379
to spark discussion.
387
00:22:38,379 --> 00:22:40,878
Putting an entire species into Wikidata,
388
00:22:40,878 --> 00:22:44,107
what it takes to actually change
an art gallery curator's mind
389
00:22:44,107 --> 00:22:46,078
about the value of Wikidata,
390
00:22:46,078 --> 00:22:49,838
and what do we do when we see
a complete ontology
391
00:22:49,838 --> 00:22:52,477
in another language that,
unfortunately, has been slapped
392
00:22:52,477 --> 00:22:55,697
with a restrictive
Creative Commons license.
393
00:22:55,697 --> 00:22:56,997
Thank you.
394
00:22:56,997 --> 00:22:58,737
(applause)
395
00:23:11,412 --> 00:23:14,077
Hello. My name is Joachim Neubert.
396
00:23:14,077 --> 00:23:16,472
I'm working for the ZBW,
397
00:23:17,522 --> 00:23:20,947
that is, Information Center
for Economics in Hamburg,
398
00:23:21,407 --> 00:23:23,796
as a scientific software developer.
399
00:23:24,726 --> 00:23:31,108
And one of my tasks last year
was preparing a data donation to Wikidata.
400
00:23:31,878 --> 00:23:37,193
And I want to give a report
on our first experiences
401
00:23:37,613 --> 00:23:43,259
from donating metadata
from the 20th-Century Press Archives.
402
00:23:46,463 --> 00:23:48,299
To our best knowledge,
403
00:23:48,299 --> 00:23:52,678
this is the largest public
press archive in the world.
404
00:23:54,018 --> 00:23:59,158
It has been collected
between 1908 and 2005,
405
00:24:01,008 --> 00:24:04,244
and has been drawn from
406
00:24:05,174 --> 00:24:09,272
more than 1,500 newspapers
and periodicals
407
00:24:09,272 --> 00:24:13,333
from Germany, and also internationally.
408
00:24:14,651 --> 00:24:18,841
And it has covered everything
which could be of interest
409
00:24:18,841 --> 00:24:22,820
for the Hamburg,
410
00:24:25,870 --> 00:24:28,030
the Hamburg businesspeople
411
00:24:28,030 --> 00:24:32,410
who wanted to expand over the world.
412
00:24:34,611 --> 00:24:39,350
As you can see, this material
has been clipped from newspapers
413
00:24:39,350 --> 00:24:41,790
and put onto paper,
414
00:24:41,790 --> 00:24:44,731
and then collected in folders.
415
00:24:46,121 --> 00:24:50,451
Here you see a small corner
of the Persons Archive,
416
00:24:51,255 --> 00:24:56,182
and, similarly, information
has been collected on companies,
417
00:24:56,182 --> 00:24:59,762
on general topics, on wares,
on everybody,
418
00:25:01,533 --> 00:25:05,557
on everything which could be interesting.
419
00:25:06,978 --> 00:25:11,074
These folders have been scanned
420
00:25:12,652 --> 00:25:15,868
up to roughly 1949
421
00:25:17,076 --> 00:25:23,123
by a DFG-funded project from 2004 to 2007.
422
00:25:24,268 --> 00:25:30,591
As a result, up to now,
there are 25,000 thematic dossiers
423
00:25:31,727 --> 00:25:33,759
of this time.
424
00:25:33,771 --> 00:25:37,913
These contain about 2 million,
or more than 2 million pages.
425
00:25:38,845 --> 00:25:41,522
And these are online.
426
00:25:43,633 --> 00:25:48,461
This is the application developed
at that time by ZBW,
427
00:25:50,006 --> 00:25:54,341
which now looks a bit outdated,
428
00:25:55,031 --> 00:25:58,153
not so fancy,
and what's more of a problem:
429
00:25:58,597 --> 00:26:04,350
It's an application which was built
architecturally on Oracle,
430
00:26:04,350 --> 00:26:08,662
it was built on ColdFusion,
it runs on Windows servers,
431
00:26:09,227 --> 00:26:14,992
so it's not very sustainable
in the long term.
432
00:26:16,008 --> 00:26:19,274
And we have discussed
should we migrate this
433
00:26:19,274 --> 00:26:22,755
to a more fancy linked data application,
434
00:26:23,931 --> 00:26:27,964
or should we take a radical step
435
00:26:27,964 --> 00:26:31,749
and put all this data in the open.
436
00:26:32,843 --> 00:26:37,416
We have assigned a CC0 license to that data
437
00:26:37,416 --> 00:26:40,938
and are currently moving some main--
438
00:26:42,036 --> 00:26:46,463
access layer, some main discovery layer--
so it's a primary access layer
439
00:26:47,835 --> 00:26:50,587
to the open linked data web,
440
00:26:51,315 --> 00:26:56,881
where it actually makes most sense
441
00:26:56,881 --> 00:27:00,698
to put some metadata into Wikidata,
442
00:27:02,367 --> 00:27:06,781
and to make sure that all folders
443
00:27:07,594 --> 00:27:10,633
of the collections are linked to Wikidata,
444
00:27:11,485 --> 00:27:13,308
so they are findable,
445
00:27:14,240 --> 00:27:17,795
and that all metadata about these folders
446
00:27:18,444 --> 00:27:22,977
is also transferred to Wikidata.
447
00:27:23,344 --> 00:27:27,886
So it can be used there,
and it can be enriched there, possibly.
448
00:27:28,780 --> 00:27:32,237
Corrections can be made to that data.
449
00:27:32,645 --> 00:27:38,894
What is still maintained by ZBW is,
of course, the storage of the images,
450
00:27:39,947 --> 00:27:43,882
which we can't put in any way,
451
00:27:45,548 --> 00:27:47,326
or we can't give a license on that
452
00:27:47,326 --> 00:27:51,179
because these are owned
by the original creators.
453
00:27:52,271 --> 00:27:54,954
But we make sure that they are accessible
454
00:27:56,500 --> 00:28:02,203
by some, again, metadata files
via DFG Viewer
455
00:28:03,108 --> 00:28:06,108
and, in the future, by IIIF manifests.
456
00:28:06,849 --> 00:28:11,050
And we will prepare
some static landing pages
457
00:28:11,707 --> 00:28:18,333
which will serve as a point
of reference for Wikidata,
458
00:28:18,333 --> 00:28:22,596
as well as still making available data
459
00:28:22,600 --> 00:28:26,174
which doesn't fit well into Wikidata.
460
00:28:31,253 --> 00:28:36,815
[For us] is migration
and data donation to Wikidata
461
00:28:37,165 --> 00:28:40,633
with our custom infrastructure
462
00:28:40,633 --> 00:28:44,837
of a SPARQL endpoint with that data,
463
00:28:45,887 --> 00:28:48,980
and we basically used federated queries
464
00:28:49,990 --> 00:28:53,834
between that endpoint
and the Wikidata Query Service
465
00:28:53,834 --> 00:28:57,633
to create the corresponding statements
466
00:28:59,207 --> 00:29:02,107
which were either concatenated
467
00:29:02,107 --> 00:29:06,937
in SPARQL queries themselves,
or transformed via a script,
468
00:29:07,907 --> 00:29:12,254
which also generated references
for the statements.
469
00:29:12,742 --> 00:29:19,446
And then put that into QuickStatements
of the code to use this online.
470
00:29:22,544 --> 00:29:24,088
So, this is what we get.
471
00:29:24,493 --> 00:29:28,669
It's not only simple things
like birth dates, but, sorry--
472
00:29:29,835 --> 00:29:34,998
but also complex statements
473
00:29:34,998 --> 00:29:39,787
about already existing items,
474
00:29:39,787 --> 00:29:44,790
like this person was a supervisory
board member of said company
475
00:29:46,682 --> 00:29:48,905
during this period of time,
476
00:29:49,663 --> 00:29:56,696
and referenced for use in...
477
00:29:58,463 --> 00:30:01,864
in the scientific context.
478
00:30:07,763 --> 00:30:10,939
The first part of this data donation
has been finished.
479
00:30:12,736 --> 00:30:17,201
The Persons Archive
is completely linked to Wikidata.
480
00:30:18,333 --> 00:30:23,652
And this is also an information tool.
481
00:30:23,652 --> 00:30:27,360
A lot of items have, before this,
482
00:30:27,360 --> 00:30:30,422
not had any external references.
483
00:30:31,278 --> 00:30:35,674
And we have more
than 6,000 statements,
484
00:30:36,201 --> 00:30:41,924
which are now sourced
in this archive's metadata.
485
00:30:45,288 --> 00:30:49,951
Well, this was the easiest part,
486
00:30:50,880 --> 00:30:54,785
because persons are easily
identifiable in Wikidata.
487
00:30:56,494 --> 00:31:00,443
More than 90% already existed here,
488
00:31:00,443 --> 00:31:02,412
so we could link to that.
489
00:31:02,412 --> 00:31:06,486
We created some 100 items for these,
490
00:31:06,486 --> 00:31:08,807
for the ones which were missing.
491
00:31:09,296 --> 00:31:13,626
But now, we are working
492
00:31:13,626 --> 00:31:18,165
on the rest of the archive,
493
00:31:18,165 --> 00:31:20,432
particularly on the topics archive.
494
00:31:21,243 --> 00:31:26,677
Which means mapping a historic system
for the organization of knowledge
495
00:31:26,677 --> 00:31:29,884
about the whole world,
496
00:31:29,884 --> 00:31:34,147
materialized as newspaper
clippings to Wikidata.
497
00:31:36,305 --> 00:31:41,898
To give you a basic idea,
the Countries and Topics archive
498
00:31:42,668 --> 00:31:48,773
is organized by a hierarchy of countries
499
00:31:48,773 --> 00:31:50,882
and other geographic entities,
500
00:31:52,499 --> 00:31:56,443
which is translated to English,
which makes this easier.
501
00:31:56,443 --> 00:32:01,861
And a German deeply nested...
502
00:32:03,881 --> 00:32:08,064
deeply nested classification of topics.
503
00:32:08,064 --> 00:32:11,593
And this combination defines one...
504
00:32:13,032 --> 00:32:16,020
one folder.
505
00:32:16,020 --> 00:32:21,128
So, what we now want to do
is to match this
506
00:32:21,128 --> 00:32:24,575
as a structure to Wikidata,
and to bring the data in.
507
00:32:24,575 --> 00:32:29,338
And I want to invite you
508
00:32:29,338 --> 00:32:33,801
to join this really nice challenge
509
00:32:33,801 --> 00:32:36,272
in terms of knowledge organization.
510
00:32:37,739 --> 00:32:40,713
So, there's a WikiProject
where this work is tracked,
511
00:32:40,713 --> 00:32:46,288
and you can follow this
or participate in this.
512
00:32:46,591 --> 00:32:48,908
And, yes, thank you very much.
513
00:32:49,639 --> 00:32:51,723
(applause)
514
00:33:03,999 --> 00:33:07,284
So, we're taking
performing arts to Wikidata.
515
00:33:07,735 --> 00:33:11,930
And we're taking performing arts
to the linked open data cloud,
516
00:33:11,930 --> 00:33:15,595
by building a linked open data
ecosystem for the performing arts.
517
00:33:16,164 --> 00:33:21,068
And the question I'm trying to answer,
518
00:33:21,068 --> 00:33:24,463
and I hope you'll help me
in answering it, is
519
00:33:24,463 --> 00:33:27,012
which place for Wikidata in all that.
520
00:33:27,012 --> 00:33:31,316
But let me first start with my experiences
521
00:33:31,316 --> 00:33:33,963
which I made this year,
522
00:33:34,723 --> 00:33:37,564
the first half of the year,
when I had the pleasure
523
00:33:37,564 --> 00:33:39,350
to work with CAPACOA,
524
00:33:39,350 --> 00:33:42,074
which is the Canadian Arts
Presenting Association,
525
00:33:42,074 --> 00:33:47,408
which actually launched a project
called Linked Digital Future Initiative,
526
00:33:47,831 --> 00:33:53,261
to actually get the entire art sector
in Canada to embrace linked open data.
527
00:33:53,441 --> 00:33:56,887
And they did that based on the observation
528
00:33:56,887 --> 00:33:59,042
that over the past five years,
529
00:33:59,731 --> 00:34:03,924
the [inaudible]-- the important topic
within performing arts
530
00:34:03,924 --> 00:34:08,855
was the fact that metadata
was not around in sufficient quality
531
00:34:08,855 --> 00:34:11,780
and not interlinked, not interoperable.
532
00:34:12,106 --> 00:34:16,498
And that was why some of the performances,
533
00:34:16,498 --> 00:34:19,542
some of the events
are not so well findable
534
00:34:19,542 --> 00:34:24,777
by Google and by personal
computer-based assistants, and so on.
535
00:34:25,989 --> 00:34:29,757
So, the vision we kind
of developed together
536
00:34:29,757 --> 00:34:32,997
is that we want to have a knowledge base
537
00:34:34,013 --> 00:34:35,646
for many stakeholders at once.
538
00:34:35,646 --> 00:34:39,636
So we looked at the entire
performing arts value network,
539
00:34:39,636 --> 00:34:42,073
we identified key stakeholders in there,
540
00:34:42,073 --> 00:34:46,545
we looked at the usage scenarios
that we'd like to pursue,
541
00:34:47,719 --> 00:34:52,074
and we kind of mapped it
to the whole architecture
542
00:34:52,074 --> 00:34:57,097
of such a knowledge base,
or of the different platforms in there,
543
00:34:57,097 --> 00:34:59,535
which, obviously,
is a distributed architecture,
544
00:34:59,535 --> 00:35:01,361
and not one big monolith.
545
00:35:02,499 --> 00:35:05,664
I'm just going to run
through that quite quickly
546
00:35:05,664 --> 00:35:07,980
because we have ten minutes each.
547
00:35:09,035 --> 00:35:13,796
But I think we'll have plenty of time
tonight or tomorrow to deepen that
548
00:35:13,796 --> 00:35:16,318
if anybody's interested in the details.
549
00:35:16,318 --> 00:35:19,116
So, we started from
that Performing Arts Value Network,
550
00:35:19,116 --> 00:35:23,263
which, interestingly,
was just published last year.
551
00:35:23,263 --> 00:35:27,691
So, we're lucky to be able
to build on previous work,
552
00:35:27,691 --> 00:35:31,098
like you have the primary value chain
of the performing arts in the middle,
553
00:35:31,098 --> 00:35:34,177
and various stakeholders around that.
554
00:35:34,177 --> 00:35:37,387
All in all, we identified
20 stakeholder groups,
555
00:35:37,387 --> 00:35:43,384
which then we kind of boiled down
into seven larger categories
556
00:35:43,395 --> 00:35:45,464
for each of the stakeholder groups.
557
00:35:45,464 --> 00:35:51,558
We kind of formulated what kind of needs
558
00:35:51,558 --> 00:35:54,718
they would have in terms
of such an infrastructure,
559
00:35:54,718 --> 00:35:58,572
and what would they be able to achieve
if the whole thing was interlinked
560
00:35:58,572 --> 00:36:02,062
and the data was publicly accessible.
561
00:36:02,637 --> 00:36:04,990
And so, you can see the types here,
562
00:36:04,990 --> 00:36:09,177
the different types are Production,
then Presentation & Promotion,
563
00:36:09,177 --> 00:36:12,064
Coverage & Reuse, Live Audiences,
564
00:36:12,064 --> 00:36:13,852
Online Consumption, Heritage,
565
00:36:13,852 --> 00:36:15,959
Research & Education.
566
00:36:15,959 --> 00:36:18,917
And after kind of setting up a big table,
567
00:36:18,917 --> 00:36:21,275
of which you can see
just the first part here,
568
00:36:21,275 --> 00:36:25,128
we kind of compared [over there],
had a look at which type of data
569
00:36:25,128 --> 00:36:26,954
were actually used across the board
570
00:36:26,954 --> 00:36:31,248
by all different groups of stakeholders.
571
00:36:31,248 --> 00:36:36,586
And there's quite a large basis of data
that is common to all of them,
572
00:36:36,586 --> 00:36:38,414
and that really is the area
573
00:36:38,414 --> 00:36:43,063
where it makes a lot of sense, actually,
to cooperate and to keep that--
574
00:36:43,063 --> 00:36:45,988
to maintain the data together.
575
00:36:47,602 --> 00:36:50,651
So, when talking about
platform architecture,
576
00:36:50,651 --> 00:36:53,648
you can see that we have four layers here.
577
00:36:54,096 --> 00:36:56,448
At the bottom is the data layer.
578
00:36:56,448 --> 00:36:58,717
Of course, Wikidata plays a part in it,
579
00:36:58,717 --> 00:37:02,733
but also a lot of other databases,
distributed databases
580
00:37:02,733 --> 00:37:07,769
that can expose data
through SPARQL endpoints.
581
00:37:09,204 --> 00:37:13,106
The yellow part in the middle,
that's the semantic layer.
582
00:37:13,106 --> 00:37:16,080
It's our common language
to describe our things,
583
00:37:16,080 --> 00:37:21,834
to make statements about things
around the performing arts, the ontology.
584
00:37:22,400 --> 00:37:25,243
Then we have an application layer
585
00:37:25,243 --> 00:37:30,551
that consists of various modules,
for example, data analysis,
586
00:37:30,551 --> 00:37:34,613
data extraction-- so, how do you
actually get unstructured data
587
00:37:34,613 --> 00:37:36,029
into structured data--
588
00:37:36,029 --> 00:37:38,749
how can we support that by tools.
589
00:37:39,436 --> 00:37:42,478
Then, obviously, there's
a visualization of data--
590
00:37:42,478 --> 00:37:47,115
so if there are large quantities of data,
you want to visualize it in some way.
591
00:37:47,801 --> 00:37:50,155
And on the top, you have
the presentation layer,
592
00:37:50,155 --> 00:37:54,814
that's what the ordinary people
are actually interacting with
593
00:37:54,814 --> 00:37:56,199
on a daily basis--
594
00:37:56,199 --> 00:37:59,615
search engines, encyclopedias,
cultural agendas,
595
00:37:59,615 --> 00:38:02,097
and a variety of other services.
596
00:38:03,395 --> 00:38:05,386
We're not starting from scratch.
597
00:38:05,386 --> 00:38:08,535
Some work has already
been done in this area.
598
00:38:09,107 --> 00:38:13,043
I'll just cite a few examples
from a project
599
00:38:13,043 --> 00:38:15,245
which I have been involved in.
600
00:38:15,245 --> 00:38:18,149
Some other stuff going on as well.
601
00:38:18,149 --> 00:38:21,195
And so, I started in this area
602
00:38:21,195 --> 00:38:24,476
with the Swiss Archive
of the Performing Arts.
603
00:38:25,001 --> 00:38:27,795
When building a Swiss
Performing Arts database,
604
00:38:27,795 --> 00:38:31,046
we created the performing arts ontology,
605
00:38:31,046 --> 00:38:33,931
that's currently being
implemented into RDF.
606
00:38:34,701 --> 00:38:39,771
And there we have the database
of like 60, 70 years
607
00:38:39,771 --> 00:38:43,313
of performance history in Switzerland.
608
00:38:43,313 --> 00:38:45,145
So, that's something that we can build on,
609
00:38:45,145 --> 00:38:48,999
and that's something
that's been transformed into RDF.
610
00:38:49,968 --> 00:38:54,621
And there was a builder platform
where this data can be accessed.
611
00:38:56,073 --> 00:39:01,658
Then we have done
several ingests into Wikidata,
612
00:39:01,658 --> 00:39:02,877
partly from Switzerland,
613
00:39:02,877 --> 00:39:08,990
partly also from
the performing arts institutes,
614
00:39:09,680 --> 00:39:12,357
for example, Bart Magnus
was involved in that.
615
00:39:12,883 --> 00:39:15,078
He was the driving force behind that.
616
00:39:15,078 --> 00:39:17,223
There's also stuff from Wikimedia Commons,
617
00:39:17,223 --> 00:39:21,361
but not very well interlinked
with all the rest of our metadata.
618
00:39:21,361 --> 00:39:25,097
And obviously, by doing this ingest,
619
00:39:25,097 --> 00:39:29,274
we also kind of started to implement
parts of this Swiss data model
620
00:39:29,274 --> 00:39:31,345
into Wikidata.
621
00:39:32,767 --> 00:39:37,556
Then one of the Canadian
implementation partners
622
00:39:37,556 --> 00:39:39,013
is Culture Creates.
623
00:39:39,013 --> 00:39:43,872
They're running a platform that actually
scrapes information from theater websites,
624
00:39:43,872 --> 00:39:46,873
and inputs it into a knowledge graph,
625
00:39:48,293 --> 00:39:54,428
to then expose it to search engines
and other search devices.
626
00:39:56,415 --> 00:40:03,027
And there again, we kind of had
to implement and extend this ontology.
627
00:40:03,261 --> 00:40:08,163
And what you can see from the slide
is that there's so many empty spaces,
628
00:40:08,163 --> 00:40:09,599
but there's also some overlap,
629
00:40:09,599 --> 00:40:13,456
and an important overlap, obviously,
is the common shared language,
630
00:40:13,456 --> 00:40:18,693
which will help us actually interlink
the various data sets.
631
00:40:20,759 --> 00:40:22,587
What is also important, obviously,
632
00:40:22,587 --> 00:40:26,404
is that we're using the same
base registers and authority files.
633
00:40:26,406 --> 00:40:31,368
And this is a place where Wikidata
plays an important role
634
00:40:31,368 --> 00:40:33,967
by kind of interlinking these.
635
00:40:34,619 --> 00:40:37,799
Now, I'd like to share the recommendations
636
00:40:37,799 --> 00:40:41,882
by the Linked Digital Future Initiative's
Advisory Committee.
637
00:40:42,769 --> 00:40:45,169
At least the first two recommendations.
638
00:40:45,169 --> 00:40:47,930
So, for the Canadians,
now it's absolutely crucial
639
00:40:47,930 --> 00:40:53,173
to kind of fill in their own Canadian
performing arts knowledge graph,
640
00:40:53,173 --> 00:40:55,851
because unlike the Swiss Archive
of the Performing Arts,
641
00:40:55,851 --> 00:40:59,389
they're not starting
with an already existing database,
642
00:40:59,389 --> 00:41:01,906
but they're kind of
creating it from scratch.
643
00:41:01,906 --> 00:41:04,468
And it's absolutely crucial
to have data in there.
644
00:41:04,468 --> 00:41:09,024
And second, as you can see,
Wikidata already comes in.
645
00:41:09,024 --> 00:41:12,342
Wikidata, by the Advisory Committee,
646
00:41:12,342 --> 00:41:17,859
has been seen as complementary
to Artsdata.ca, this knowledge graph,
647
00:41:18,347 --> 00:41:21,474
and, therefore, efforts should
be undertaken to contribute
648
00:41:21,474 --> 00:41:24,878
to its population
with performing arts-related data.
649
00:41:25,813 --> 00:41:30,775
And that's what we're going to work on
over the coming months and years,
650
00:41:30,775 --> 00:41:34,748
and that's also why
I'm kind of on the lookout here
651
00:41:34,748 --> 00:41:38,644
to see who else will join that effort.
652
00:41:40,556 --> 00:41:44,942
So, right now, obviously,
we're saying they're complementary.
653
00:41:44,942 --> 00:41:48,341
So, we have to think about
the pluses and the minuses
654
00:41:48,341 --> 00:41:49,844
of each of the approaches.
655
00:41:49,844 --> 00:41:52,073
And you can see here a comparison
656
00:41:52,073 --> 00:41:56,120
between Wikidata and the Classical
Linked Open Data approach.
657
00:41:56,887 --> 00:41:59,947
I would be happy to discuss
that further with you guys,
658
00:41:59,947 --> 00:42:02,549
what your experiences are there.
659
00:42:02,814 --> 00:42:07,727
But, as I see it, Wikidata is a huge plus
because it's a crowdsourcing platform,
660
00:42:07,727 --> 00:42:11,671
and it's easy to invite further parties
to actually contribute.
661
00:42:11,683 --> 00:42:17,482
On the negative side, obviously,
you get this problem of loss of control.
662
00:42:17,658 --> 00:42:22,764
Data owners have to give up control
over their graphs, data quality,
663
00:42:22,764 --> 00:42:24,382
and completeness.
664
00:42:26,554 --> 00:42:31,096
It's harder to track on Wikidata
than if you have it under your control.
665
00:42:31,493 --> 00:42:34,376
And the other strength of Wikidata
666
00:42:34,376 --> 00:42:39,617
is that it requires immediate integration
into that worldwide graph.
667
00:42:39,617 --> 00:42:41,734
And you can't just do it--
668
00:42:42,544 --> 00:42:46,768
kind of reconcile step by step
against other databases,
669
00:42:46,768 --> 00:42:49,528
which may also be seen by some
as an advantage,
670
00:42:49,528 --> 00:42:53,914
but of course, if you're looking
for integration and interoperability,
671
00:42:53,914 --> 00:42:56,792
Wikidata forces you to go for that
from the beginning.
672
00:42:59,184 --> 00:43:03,157
And then, obviously, harmonizing
data modeling practices
673
00:43:03,157 --> 00:43:05,552
is an issue in both cases.
674
00:43:06,039 --> 00:43:10,671
But it may seem, at the beginning,
easier to do just within your own silo,
675
00:43:10,671 --> 00:43:13,356
because at some point,
you're done with the task,
676
00:43:13,356 --> 00:43:16,693
and it would be
an ongoing task on Wikidata.
677
00:43:18,280 --> 00:43:22,883
So, when it now comes to prioritizing
the data to be ingested,
678
00:43:23,535 --> 00:43:28,395
these are the rules
I kind of go by at the moment.
679
00:43:30,055 --> 00:43:32,325
First of all, we'd like to ingest it
680
00:43:32,325 --> 00:43:36,191
where it's unclear who would be
the natural authority in the given area.
681
00:43:36,191 --> 00:43:40,433
So that's definitely data
that will be managed in a shared manner.
682
00:43:40,902 --> 00:43:44,391
And we'd like to ingest it where we see
683
00:43:44,391 --> 00:43:47,149
a high potential
for crowdsourcing approaches.
684
00:43:47,149 --> 00:43:51,693
We'd like to ingest data where the data
is likely to be reused
685
00:43:51,693 --> 00:43:53,965
in the context of Wikipedia.
686
00:43:54,813 --> 00:44:00,262
And there's also hope that some part
of the international coordination
687
00:44:00,262 --> 00:44:04,364
around the whole data modeling,
about the standardization,
688
00:44:04,364 --> 00:44:07,531
could actually take place
directly on Wikidata,
689
00:44:07,531 --> 00:44:09,484
if it's not taking place elsewhere,
690
00:44:09,484 --> 00:44:12,305
because it kind of forces people
to start interacting
691
00:44:12,305 --> 00:44:14,816
if they ingest data in the same part.
692
00:44:15,963 --> 00:44:22,168
And we'd like to focus now next
on base registers and authority files
693
00:44:22,181 --> 00:44:26,085
because they kind of help us
create the linkages
694
00:44:26,085 --> 00:44:29,010
between different data,
and on controlled vocabularies
695
00:44:29,010 --> 00:44:32,833
as an extension of the existing ontology.
696
00:44:33,965 --> 00:44:35,994
So, just two more slides.
697
00:44:36,480 --> 00:44:40,978
The next steps will be that we're taking
the Sum of All GLAMs approach
698
00:44:40,978 --> 00:44:42,888
to Wiki Loves Performing Arts.
699
00:44:42,888 --> 00:44:47,524
That means we're describing
venues and organizations,
700
00:44:47,524 --> 00:44:51,106
and trying to push the data to Wikipedia
701
00:44:51,106 --> 00:44:54,414
in the form of infoboxes
and [bubble] templates.
702
00:44:54,414 --> 00:44:59,769
And the other one, the other project
I'm going to pursue, is a COST Action
703
00:45:00,336 --> 00:45:02,001
that we'll submit next year
704
00:45:03,140 --> 00:45:06,037
around that Linked Open Data Ecosystem
for the Performing Arts.
705
00:45:06,037 --> 00:45:10,347
COST is a European program
that supports networking activities,
706
00:45:10,347 --> 00:45:13,929
and the topics to be covered
are listed here.
707
00:45:13,929 --> 00:45:16,404
Two of them, I have highlighted--
708
00:45:16,404 --> 00:45:20,702
one of them is like the question
of federation between Wikidata
709
00:45:20,702 --> 00:45:23,717
and the classical linked
open data approaches.
710
00:45:24,368 --> 00:45:27,744
And the other one, I think,
is very important also,
711
00:45:27,744 --> 00:45:30,528
where we have a huge potential still,
712
00:45:30,528 --> 00:45:35,683
is implementing international campaigns
to supplement data on Wikidata.
713
00:45:37,627 --> 00:45:41,365
So, that's it. Thank you
for your attention.
714
00:45:41,365 --> 00:45:45,762
Now, I would like to ask
my colleagues up here.
715
00:45:47,086 --> 00:45:50,529
To the panel, maybe you'll get them
microphones as well.
716
00:45:53,903 --> 00:45:55,682
And then I would like to...
717
00:45:57,473 --> 00:45:59,940
give you the chance to ask questions.
718
00:46:01,042 --> 00:46:05,185
And obviously, also ask my colleagues
719
00:46:05,753 --> 00:46:08,071
whether they have questions for each other.
720
00:46:12,049 --> 00:46:15,327
So, do we have maybe a question
from the audience?
721
00:46:20,502 --> 00:46:22,758
(man) [inaudible]
722
00:46:23,587 --> 00:46:27,033
I would like to ask each of you
723
00:46:27,033 --> 00:46:30,842
where would you draw the line,
724
00:46:30,842 --> 00:46:33,076
basically, how you define--
725
00:46:33,076 --> 00:46:35,956
when do you need to run your own Wikibase,
726
00:46:35,956 --> 00:46:39,328
and what do you want to put on Wikidata?
727
00:46:39,328 --> 00:46:43,677
Like, is this a clear delineation
of what is seen
728
00:46:43,677 --> 00:46:45,981
behind of putting it [into order.]
729
00:46:48,211 --> 00:46:51,484
I can answer first because I have the mic.
730
00:46:51,484 --> 00:46:56,955
So, I've been thinking
that one of the issues is notability.
731
00:46:59,212 --> 00:47:02,084
I'm addressing that
in a different project.
732
00:47:02,084 --> 00:47:05,898
And I think licensing could be one,
733
00:47:05,898 --> 00:47:10,466
because you can apply your own terms
in your own database,
734
00:47:10,466 --> 00:47:13,758
and then I think wherever it's possible.
735
00:47:14,284 --> 00:47:19,882
And then, the third one
is just to have it as a sandbox,
736
00:47:19,882 --> 00:47:23,078
prepare it for ingestion into Wikidata.
737
00:47:23,078 --> 00:47:26,085
These are the three main things
that I come up with now,
738
00:47:26,085 --> 00:47:28,554
but I can come up with more.
739
00:47:29,976 --> 00:47:32,369
For me, rights are always
going to be an issue.
740
00:47:32,369 --> 00:47:36,686
So, if the National Library
wanted to move towards Wikibase,
741
00:47:36,686 --> 00:47:39,740
that would enable them to continue
to control the licensing
742
00:47:39,740 --> 00:47:42,539
for the work they've done
with Maori language terms.
743
00:47:43,438 --> 00:47:46,483
The kakapo database only contains data
744
00:47:46,483 --> 00:47:49,977
that the Department of Conservation
felt could be made public,
745
00:47:49,977 --> 00:47:52,739
but I suspect if they see it
up and running,
746
00:47:52,739 --> 00:47:55,980
they might be tempted
to use a private Wikibase
747
00:47:55,980 --> 00:47:58,128
to maintain their own database,
748
00:47:58,128 --> 00:48:01,214
simply because of some
of the visualization tools
749
00:48:01,214 --> 00:48:03,567
that could be applied might be better
750
00:48:03,567 --> 00:48:07,417
than the sort of Excel spreadsheet system
that they currently run.
751
00:48:12,337 --> 00:48:16,556
Well, I think this very much depends
on the kind of data.
752
00:48:17,609 --> 00:48:22,359
We are, with the Press Archive, of course,
in a quite lucky position,
753
00:48:22,359 --> 00:48:26,984
in that this was material
which was published,
754
00:48:26,984 --> 00:48:29,829
it was published at the time,
755
00:48:30,153 --> 00:48:31,780
but it was expensive to publish.
756
00:48:33,082 --> 00:48:36,234
So, this is quite easy.
757
00:48:36,234 --> 00:48:39,449
I think, also, projects--
758
00:48:40,101 --> 00:48:42,027
and this is a typical project,
759
00:48:42,027 --> 00:48:45,726
so it was funded for some time,
and then funding ended,
760
00:48:46,466 --> 00:48:51,516
and what happens with the data
which is enclosed in some silo,
761
00:48:52,136 --> 00:48:55,106
and some software
which will not run forever.
762
00:48:55,846 --> 00:48:59,436
And so, it makes
absolute sense in my eyes.
763
00:48:59,896 --> 00:49:02,776
At the time, Wikidata
wasn't around, but now it is,
764
00:49:03,376 --> 00:49:07,336
and it makes absolute sense
for our project to early on
765
00:49:07,336 --> 00:49:12,732
discuss sustainability in the context
of how could we put this
766
00:49:12,732 --> 00:49:16,617
into a larger ecosystem like Wikidata,
767
00:49:18,717 --> 00:49:21,408
and discuss this with the data community
768
00:49:21,408 --> 00:49:26,864
what is notable and what makes sense
to add this to Wikidata,
769
00:49:26,864 --> 00:49:32,093
and what makes sense to keep this
in a proprietary form.
770
00:49:32,093 --> 00:49:37,753
Maybe in a more simple form
than a sophisticated application,
771
00:49:37,753 --> 00:49:43,055
but make it discoverable
and make it linked to the large data cloud
772
00:49:43,055 --> 00:49:46,032
instead of investing lots of money
773
00:49:46,032 --> 00:49:52,692
into some silo which will not sustain.
774
00:49:55,201 --> 00:50:00,121
Yeah, as I said before
in the project I was presenting here,
775
00:50:00,121 --> 00:50:04,926
there are dualities between Wikidata
and classical linked open data approaches.
776
00:50:04,926 --> 00:50:07,928
So, it's not so much about
setting up a private Wikibase.
777
00:50:11,147 --> 00:50:14,504
Like one challenge we have had,
and, of course, in Wikidata,
778
00:50:14,504 --> 00:50:17,710
is that when you ingest
your own data there,
779
00:50:17,710 --> 00:50:20,341
you also have to do some housekeeping
780
00:50:20,744 --> 00:50:23,509
of people, of other people, actually.
781
00:50:24,043 --> 00:50:28,258
And that can put off people,
[or it also means] that we will address it
782
00:50:28,258 --> 00:50:29,888
just step by step.
783
00:50:30,375 --> 00:50:33,466
So, there will be, at the moment,
a database living--
784
00:50:33,873 --> 00:50:35,581
in classical linked open data
785
00:50:35,581 --> 00:50:38,395
and we're starting to link it
with Wikidata,
786
00:50:38,395 --> 00:50:40,993
and it's a continuous process to find out
787
00:50:41,805 --> 00:50:47,643
for which areas the most data
will be eventually on Wikidata,
788
00:50:48,168 --> 00:50:51,946
and for which areas it will actually
live on other databases.
789
00:50:52,620 --> 00:50:56,645
Obviously, we'll have challenges
regarding synchronization,
790
00:50:57,135 --> 00:50:58,589
as we probably all have,
791
00:50:58,589 --> 00:51:01,507
because that's a linked data field,
792
00:51:01,507 --> 00:51:04,826
where we still have
to negotiate who we trust,
793
00:51:05,160 --> 00:51:08,720
who has authority about what.
794
00:51:13,830 --> 00:51:15,820
(assistant) Other questions?
795
00:51:23,981 --> 00:51:25,550
(woman) Thank you.
796
00:51:26,090 --> 00:51:31,030
So, fully agree with that issue of--
797
00:51:34,425 --> 00:51:41,410
where to put the boundary
between why do we put data on Wikidata,
798
00:51:43,044 --> 00:51:49,144
or why do we keep them,
and create, manage, and maintain them
799
00:51:49,144 --> 00:51:53,104
in local databases and for what purposes.
800
00:51:53,778 --> 00:51:57,213
And I think that
this is a large discussion
801
00:51:57,213 --> 00:52:02,383
that goes beyond just the excitement
802
00:52:02,383 --> 00:52:07,423
of putting data on Wikidata
because it is public,
803
00:52:07,432 --> 00:52:10,762
because it serves humanity, because--
804
00:52:11,031 --> 00:52:13,362
well, there are cool tools,
805
00:52:13,362 --> 00:52:18,132
and things are more complicated
in real life, I think.
806
00:52:19,162 --> 00:52:24,102
Well, despite this,
it's quite an interesting discussion.
807
00:52:24,435 --> 00:52:29,744
And then this is another issue, also,
or another problem that is being discussed
808
00:52:29,744 --> 00:52:35,034
in this event in different panels.
809
00:52:35,775 --> 00:52:41,129
It is on one side, have your own database,
810
00:52:41,129 --> 00:52:43,194
whatever the technology is
811
00:52:43,194 --> 00:52:46,763
and publish things on Wikidata,
812
00:52:47,233 --> 00:52:51,166
or build your own system
813
00:52:51,166 --> 00:52:55,246
of creating and managing information
814
00:52:55,246 --> 00:52:58,131
on the Wikibase technology.
815
00:52:58,591 --> 00:53:04,281
And then, synchronize or whatever--
do federation or things,
816
00:53:04,281 --> 00:53:08,314
so it's a matter
of technology that is used,
817
00:53:09,182 --> 00:53:14,796
and the fact that you use Wikidata
just for publishing,
818
00:53:14,978 --> 00:53:18,637
or the infrastructure
that is underneath Wikidata
819
00:53:18,637 --> 00:53:23,002
to create and manage your data.
820
00:53:27,116 --> 00:53:30,914
I mean, we had a discussion
821
00:53:30,914 --> 00:53:34,254
about the Wikibase panel,
822
00:53:34,254 --> 00:53:36,912
and there will be other discussions here,
823
00:53:36,912 --> 00:53:40,815
but things are
on different levels, I think.
824
00:53:41,626 --> 00:53:47,756
Maybe [you sort of get] to that discussion
about Wikibase or Wikidata--
825
00:53:48,930 --> 00:53:52,427
I think it's problematic
that we are focusing so much
826
00:53:52,427 --> 00:53:56,158
on this Wikibase infrastructure,
because there are other infrastructures,
827
00:53:56,158 --> 00:53:58,690
like in the area of performing arts.
828
00:53:59,810 --> 00:54:04,054
We have another complementary community,
which is MusicBrainz
829
00:54:04,054 --> 00:54:08,954
that runs on their own platform
that provides linked open data,
830
00:54:09,614 --> 00:54:12,692
and as I understand it,
831
00:54:14,160 --> 00:54:17,232
there's agreement
within the Wikidata community
832
00:54:17,232 --> 00:54:19,731
that we're not going
to double all their data--
833
00:54:19,731 --> 00:54:24,237
we're not going to copy all their data,
but we accept that they're complementary.
834
00:54:24,848 --> 00:54:29,678
So, what will happen when you start
integrating this data in Wikipedia?
835
00:54:30,246 --> 00:54:31,907
Infoboxes, for example.
836
00:54:31,907 --> 00:54:35,952
Would we be able to pull that data
directly from their SPARQL endpoint?
837
00:54:36,764 --> 00:54:39,603
Or would we be obliged
to kind of copy all the data,
838
00:54:39,603 --> 00:54:42,225
and what kind of processes
are involved in that?
839
00:54:42,225 --> 00:54:44,915
(woman) Discussions are open, I think,
840
00:54:44,915 --> 00:54:49,615
because within this event,
you have both interested communities--
841
00:54:49,615 --> 00:54:51,975
those that are interested in Wikibase,
842
00:54:51,975 --> 00:54:54,002
and those that are interested in Wikidata,
843
00:54:54,002 --> 00:54:56,282
and those who are interested in both.
844
00:54:56,282 --> 00:54:59,562
Yeah, but we're not going
to oblige them to move to Wikibase.
845
00:55:00,162 --> 00:55:03,138
- (woman) Not necessarily.
- MusicBrainz is not running on Wikibase.
846
00:55:03,138 --> 00:55:06,802
(woman) No, I just wanted to say
that you have separate problems,
847
00:55:06,802 --> 00:55:10,964
sometimes interrelated,
sometimes not completely separated.
848
00:55:12,479 --> 00:55:16,573
And I had another question or remark
849
00:55:16,573 --> 00:55:22,013
regarding the management of hierarchies
in controlled vocabularies,
850
00:55:22,013 --> 00:55:26,473
like thesauri, like you have in Finto.
851
00:55:27,703 --> 00:55:30,563
You do have the places
852
00:55:31,503 --> 00:55:34,956
in the Maori
853
00:55:36,418 --> 00:55:40,554
Subject Headings,
854
00:55:42,262 --> 00:55:48,068
Well, they have to deal with
the management of concepts in hierarchy.
855
00:55:48,360 --> 00:55:52,320
What is your take, your opinion
856
00:55:52,320 --> 00:55:57,042
about the possibility
of managing these controlled
857
00:55:58,850 --> 00:56:02,364
knowledge organization
systems in Wikidata?
858
00:56:07,166 --> 00:56:10,169
I think in the case
of Finto and YSO places,
859
00:56:11,499 --> 00:56:14,391
the repository will be a collection
860
00:56:14,391 --> 00:56:18,936
of several sources, eventually.
861
00:56:18,936 --> 00:56:21,613
So, it is in flux, anyway.
862
00:56:21,613 --> 00:56:24,528
So, we don't have to necessarily--
863
00:56:24,528 --> 00:56:28,383
well, I don't represent
the National Library,
864
00:56:28,383 --> 00:56:31,512
but in that possible project,
865
00:56:31,512 --> 00:56:35,711
we would not have
to maintain an existing--
866
00:56:35,711 --> 00:56:38,540
or fight with an existing structure.
867
00:56:38,540 --> 00:56:45,164
So, in that sense, it is an area
open for exploration.
868
00:56:48,912 --> 00:56:52,272
The Maori Subject Headings
seem to lend themselves ideally
869
00:56:52,272 --> 00:56:54,392
to Wikidata structure,
870
00:56:54,392 --> 00:56:56,961
but the licensing,
of course, forbids that.
871
00:56:56,961 --> 00:56:59,491
I suspect that if the licensing
were different
872
00:56:59,491 --> 00:57:01,511
and they were put into Wikidata,
873
00:57:01,511 --> 00:57:04,562
as soon as somebody decided
they didn't like the hierarchy
874
00:57:04,562 --> 00:57:06,162
and started to change things,
875
00:57:06,162 --> 00:57:10,001
there would be an immediate outcry
from people who worked very hard
876
00:57:10,001 --> 00:57:12,301
to create that structure
877
00:57:12,301 --> 00:57:15,641
and get the sign-off
from various different Maori
878
00:57:15,641 --> 00:57:17,942
that was the current hierarchy.
879
00:57:18,382 --> 00:57:20,841
So, that's an issue to try and resolve.
880
00:57:23,812 --> 00:57:26,502
I think in terms of knowledge
organization systems,
881
00:57:26,502 --> 00:57:28,116
they are all different.
882
00:57:28,116 --> 00:57:31,752
And I'm not sure
if it would be a good idea
883
00:57:31,752 --> 00:57:36,855
to represent different hierarchies
in Wikidata as such,
884
00:57:37,650 --> 00:57:42,101
but it maybe makes sense
to think about overlays
885
00:57:42,941 --> 00:57:45,022
of the data.
886
00:57:45,431 --> 00:57:48,371
So, to do mappings on the content level.
887
00:57:49,091 --> 00:57:54,021
For example, ZBW publishes
the Thesaurus for Economics.
888
00:57:55,420 --> 00:57:59,150
And this thesaurus has its own hierarchy,
889
00:57:59,680 --> 00:58:04,020
and, of course, it would be possible
to project the hierarchy
890
00:58:04,461 --> 00:58:08,452
of this thesaurus into Wikidata concepts
891
00:58:08,452 --> 00:58:11,541
without actually storing
this kind of structure
892
00:58:12,180 --> 00:58:14,840
as an alternative structure
within Wikidata
893
00:58:14,840 --> 00:58:18,640
which would cause a lot of confusion.
894
00:58:18,640 --> 00:58:24,789
But I think we should think
of Wikidata, also, as a pool of concepts
895
00:58:24,789 --> 00:58:29,651
which can be connected on layers
which are outside,
896
00:58:30,264 --> 00:58:33,489
and which give another view of the world
897
00:58:33,489 --> 00:58:39,080
which does not necessarily have to be
within Wikidata.
898
00:58:45,775 --> 00:58:48,203
(assistant) Alright. Some other questions?
899
00:58:49,096 --> 00:58:51,527
Otherwise-- okay.
900
00:58:54,769 --> 00:58:57,781
(man 2) Joachim, I just wanted
to follow up on that last point.
901
00:58:57,781 --> 00:59:01,064
So, these layers, as you picture it,
902
00:59:02,196 --> 00:59:04,143
they would be maintained externally
903
00:59:04,143 --> 00:59:07,404
and somehow integrated
904
00:59:08,964 --> 00:59:11,764
with Wikidata from the Wikidata side,
905
00:59:11,764 --> 00:59:17,143
or have you thought a bit further
906
00:59:17,143 --> 00:59:19,463
about how that might be managed?
907
00:59:22,351 --> 00:59:24,931
Actually, no, I have no--
908
00:59:25,271 --> 00:59:30,361
I have done experiments
with ZBW and Wikidata.
909
00:59:30,771 --> 00:59:33,132
I was [inaudible] here at Wikidata.
910
00:59:33,132 --> 00:59:38,837
But I think this is
a whole new complex thing,
911
00:59:39,261 --> 00:59:46,210
and so, it's up to [discuss],
[to give up a lot of control]
912
00:59:46,409 --> 00:59:47,908
to do such things.
913
00:59:47,908 --> 00:59:50,178
But it has to be figured out.
914
00:59:56,638 --> 00:59:57,959
Should we take one more?
915
00:59:57,959 --> 00:59:59,686
(man 3) Ah, great.
916
00:59:59,686 --> 01:00:02,628
I was just wondering
about the kakapo project.
917
01:00:03,875 --> 01:00:05,000
Uh-hmm.
918
01:00:05,000 --> 01:00:10,805
(man 3) Okay. So, did you get
any pushback from the Wikidata community
919
01:00:10,805 --> 01:00:14,636
about having individual animals
as items?
920
01:00:15,576 --> 01:00:16,836
Not so far.
921
01:00:16,836 --> 01:00:19,045
(man 3) Has anyone heard
about this before?
922
01:00:19,045 --> 01:00:22,445
Is it "not so far" because
no one has heard about it yet?
923
01:00:23,085 --> 01:00:26,095
There's been a small discussion
for quite some time now--
924
01:00:26,095 --> 01:00:29,235
those people interested
in this sort of thing in Wikidata,
925
01:00:29,235 --> 01:00:32,215
and we all seem to think
that it's a natural extension
926
01:00:32,215 --> 01:00:35,855
of giving individual Wikidata items
to a famous racehorse
927
01:00:35,855 --> 01:00:39,755
or someone's cat, which--
that's modeled pretty well.
928
01:00:39,764 --> 01:00:44,444
I guess just the audacious thing
is putting the entire species in there.
929
01:00:44,444 --> 01:00:48,113
But I think it's perfectly manageable.
930
01:00:48,113 --> 01:00:50,173
(man 3) Don't try it with cats and dogs.
931
01:00:50,173 --> 01:00:52,457
(laughter)
932
01:00:52,457 --> 01:00:54,337
(assistant) Okay. I think
our time is up.
933
01:00:54,337 --> 01:00:55,767
Thank you very much for attending.
934
01:00:55,767 --> 01:00:59,267
I think the speakers will still be open
for questions during the break.
935
01:00:59,267 --> 01:01:00,797
And have fun.
936
01:01:00,797 --> 01:01:02,292
Thank you very much.
937
01:01:02,292 --> 01:01:04,047
(applause)