0:00:00.000,0:00:08.249
Good afternoon, everybody.
0:00:08.929,0:00:12.068
Welcome to our GLAM panel.
0:00:13.124,0:00:17.009
Before we start, I just have[br]two announcements to make.
0:00:17.329,0:00:23.049
First of all, please make extensive use[br]of our Etherpad to take notes.
0:00:23.781,0:00:27.998
And the second one is directed[br]at our audience at home,
0:00:27.998,0:00:29.819
or wherever you are.
0:00:29.819,0:00:30.958
If you have any questions,
0:00:30.958,0:00:34.028
you can also write them into the Etherpad,
0:00:34.028,0:00:37.828
and our room angels[br]will keep track of them.
0:00:39.328,0:00:44.348
So, we decided that for this year's panel,
0:00:45.388,0:00:48.868
after seeing all the contributions[br]that were made,
0:00:49.128,0:00:53.538
we would focus on the role of Wikidata[br]within data ecosystems
0:00:53.551,0:00:57.199
that go beyond the actual[br]Wikimedia projects,
0:00:57.199,0:00:59.747
which is also absolutely in line
0:00:59.747,0:01:03.677
with the new Wikimedia[br]Foundation strategy.
0:01:04.652,0:01:07.947
And we have, today, four panelists.
0:01:08.387,0:01:09.876
Three plus one.
0:01:09.876,0:01:13.636
So, I would like to ask you on stage,
0:01:13.636,0:01:15.875
so we can introduce you.
0:01:22.205,0:01:24.706
So, we have Susanna Ånäs.
0:01:25.385,0:01:29.296
She's a long-time free-knowledge activist
0:01:29.296,0:01:31.276
involved in many WikiProjects.
0:01:31.916,0:01:35.526
And she will be reporting today[br]on a project in cooperation
0:01:35.526,0:01:38.396
with the Finnish National Library.
0:01:38.856,0:01:43.435
Then we have, next to me, Mike Dickison,
0:01:43.435,0:01:46.325
who will be second in this order.
0:01:46.995,0:01:50.283
He is a museum curator from New Zealand.
0:01:50.283,0:01:53.815
He's a zoologist and a Wikipedia editor.
0:01:53.815,0:01:58.788
And he was New Zealand's[br]first Wikipedian at Large
0:01:58.788,0:02:02.565
in 2018 and 2019.
0:02:02.565,0:02:06.634
And he will tell us[br]about his experience in that role,
0:02:06.634,0:02:13.105
and what kind of role Wikidata[br]is starting to play in that context.
0:02:15.784,0:02:18.135
Then we have Joachim Neubert
0:02:18.135,0:02:23.461
from the Leibniz Information Center[br]for Economics in Kiel and Hamburg.
0:02:24.011,0:02:29.131
He has been working on making the largest[br]public press archives worldwide
0:02:29.131,0:02:34.655
more accessible to the public,[br]and he's using Wikidata to do that.
0:02:35.890,0:02:39.091
And then I will go last.[br]My name is Beat Estermann.
0:02:39.091,0:02:43.080
I work for Bern University[br]of Applied Sciences, in Switzerland.
0:02:43.640,0:02:49.950
And I've been a long-time promoter[br]for OpenGLAM in Switzerland and Austria.
0:02:50.335,0:02:54.840
And I will today report[br]about my activities in connection
0:02:54.840,0:02:59.460
with the mandate from the Canadian Arts[br]Presenting Association,
0:02:59.460,0:03:01.270
focusing on performing arts.
0:03:02.121,0:03:04.440
Not primarily on Wikidata,
0:03:04.440,0:03:08.421
but you will see Wikidata[br]is starting to play a role there, as well.
0:03:08.970,0:03:13.250
So now, most of us[br]will take our seats here,
0:03:13.250,0:03:16.980
and I will give the floor to Susanna.
0:03:18.300,0:03:22.769
Okay. So, hello. My name is Susanna Ånäs,
0:03:22.769,0:03:25.769
and I work part-time for Wikimedia Finland
0:03:25.769,0:03:27.079
as a GLAM coordinator,
0:03:27.079,0:03:32.655
and I also do consulting[br]in the open knowledge sphere.
0:03:32.655,0:03:36.049
And this is a discourse,[br]maybe, of [inaudible].
0:03:36.049,0:03:38.719
So, I have been involved in the workings
0:03:38.719,0:03:45.642
of the geographic data group of the--
0:03:48.439,0:03:51.147
well, I looked it up,[br]but it isn't in English,
0:03:51.147,0:03:54.497
but, the cultural heritage initiative[br]of the Finnish government.
0:03:54.917,0:03:59.775
So, this is about place names
0:03:59.775,0:04:03.300
and how they are represented
0:04:03.300,0:04:07.466
in different repositories[br]in the GLAM sector in Finland,
0:04:07.466,0:04:11.755
and how they are trying to pull together[br]these different sources,
0:04:11.755,0:04:17.906
and how they are informed[br]by modeling in Wikidata and elsewhere.
0:04:17.906,0:04:23.315
So, here we see the three main sources[br]for these YSO places,
0:04:23.315,0:04:27.944
which is part of the national ontology--[br]general ontology.
0:04:27.944,0:04:29.665
AHAA is for Finnish archives,
0:04:29.665,0:04:31.645
Melinda is for Finnish libraries,
0:04:31.645,0:04:33.750
and KOOKOS is for Finnish museums.
0:04:33.750,0:04:37.585
So, there are three, also,[br]content management systems
0:04:37.585,0:04:40.290
that come together in these YSO places.
0:04:40.745,0:04:47.365
And there are exchanges with Wikidata[br]already taking place,
0:04:47.965,0:04:53.065
as well as the names project[br]for the National Land Survey.
0:04:53.065,0:04:56.285
And then, there's a third project,[br]the Finnish Names Archive,
0:04:56.285,0:05:00.391
which doesn't yet contribute to this,
0:05:00.391,0:05:02.715
but there are plans for that.
0:05:02.715,0:05:09.175
So, one of the key modeling issues[br]in this whole problem area
0:05:09.175,0:05:15.226
is that there are three types[br]of elements in place names
0:05:16.116,0:05:18.195
represented in this project.
0:05:18.195,0:05:21.236
One of them is the place,[br]the one that has location.
0:05:21.236,0:05:24.766
And one of them is the place name,[br]the toponym, for example.
0:05:25.006,0:05:27.696
And then, there are sources,[br]which are documents
0:05:27.696,0:05:30.756
from which both of these can be derived,
0:05:30.756,0:05:32.565
or like, backed up with.
0:05:32.565,0:05:35.845
The YSO places--[br]here, on the top right,
0:05:35.845,0:05:38.799
you will see the same diagram again.
0:05:38.799,0:05:41.189
It focuses mainly on the places.
0:05:42.619,0:05:46.279
The maintainer of this[br]is the Finnish National Library,
0:05:46.279,0:05:49.159
and the Finto project.
0:05:50.199,0:05:55.608
There are now more than 7,000 places[br]in Finnish and Swedish
0:05:55.608,0:05:59.438
and over 3,000 in English,
0:05:59.438,0:06:03.042
and they are licensed CC0.
0:06:03.042,0:06:06.008
So, here you can see the service of Finto.
0:06:06.008,0:06:09.883
And a place-- I chose Sevettijärvi.
0:06:09.883,0:06:13.908
It is now also related[br]to our language project
0:06:13.908,0:06:15.268
with the Skolt Sámi--
0:06:15.268,0:06:18.877
this is a place[br]in the very north of Finland
0:06:18.877,0:06:21.765
inhabited by Skolt Sámi.
0:06:21.765,0:06:27.264
So, here you can see the place[br]which belongs to the--
0:06:27.264,0:06:32.724
well, you will see the data[br]about this place.
0:06:32.724,0:06:37.952
You can see that it is connected[br]to Wikidata,
0:06:37.952,0:06:42.344
as well as this National Land Survey data.
0:06:43.192,0:06:47.406
Here we go. And you will see[br]this in more detail, here.
0:06:48.582,0:06:52.360
It is also hierarchically arranged
0:06:52.360,0:06:56.310
inside this repository.
0:06:57.670,0:07:00.460
Well, actually,[br]the actual place is not seen,
0:07:00.460,0:07:05.880
but it is underneath this municipality,
0:07:05.880,0:07:08.010
as well as the region,
0:07:08.010,0:07:10.154
and Finland as a country,[br]and Nordic countries,
0:07:10.154,0:07:12.650
the broader region.
0:07:12.650,0:07:14.400
Here you can see that many of these
0:07:14.400,0:07:17.891
have been matched[br]with Wikidata previously
0:07:18.730,0:07:22.230
through Mix'n'Match,[br]and there are still remaining ones.
0:07:22.230,0:07:27.900
But then, the amount of names[br]is not that high.
0:07:28.411,0:07:30.844
It's only less than 5,000.
0:07:31.570,0:07:33.860
So, then there is this other repository
0:07:33.860,0:07:38.040
by the Finnish Geospatial[br]Platform Project--
0:07:38.040,0:07:39.199
Place Names Cards.
0:07:39.199,0:07:41.729
These are all the place names[br]that are on Finnish maps.
0:07:42.130,0:07:48.308
And they have the linked data,[br]which is licensed CC BY 4.0.
0:07:48.518,0:07:54.478
800,000 map labels in Finnish, Swedish,[br]and all those three Saami languages
0:07:54.478,0:07:55.778
that are in Finland.
0:07:55.997,0:07:58.877
And they have[br]two different types of entities.
0:07:58.877,0:08:00.680
Some are places,[br]and the others
0:08:00.680,0:08:02.651
are place names, toponyms.
0:08:02.651,0:08:05.271
And they both have persistent URIs.
0:08:06.001,0:08:09.721
Here's, for example,[br]the same Sevettijärvi, first in Finnish,
0:08:09.721,0:08:14.001
and then all those three Saami languages,[br]as well as the geographic data,
0:08:14.001,0:08:18.821
and then there is more information[br]about that, like the place type,
0:08:19.630,0:08:20.841
et cetera.
0:08:21.640,0:08:28.411
Here is the card for the place name,[br]the toponym, having its own URI.
0:08:29.943,0:08:33.738
Sorry, it seems that it's not translated[br]into the English list.
0:08:34.432,0:08:39.151
So, multilinguality[br]is not covering the whole project.
0:08:40.167,0:08:42.523
Okay, we come[br]to the Finnish Names Archive.
0:08:42.523,0:08:46.234
This is a project by the Institute[br]for the Languages of Finland,
0:08:46.234,0:08:50.456
and these represent not the places,[br]not the place names,
0:08:50.456,0:08:52.603
but they are actually sources for those.
0:08:52.603,0:08:57.123
So, these are three million[br]field notes of place names,
0:08:57.723,0:08:59.529
and it is a Wikibase project.
0:08:59.529,0:09:03.325
They are in a Wikibase,[br]mainly in Finnish, some in Swedish.
0:09:03.325,0:09:08.111
An outstanding collection of Saami names,[br]which we are very interested in.
0:09:08.111,0:09:10.141
And they are licensed CC BY.
0:09:10.380,0:09:14.850
And that is also a challenge[br]from the Wikidata point of view.
0:09:14.850,0:09:17.640
But if there was a Finnish local Wikibase,
0:09:17.640,0:09:22.632
we might be able to first work[br]on them in that project.
0:09:23.034,0:09:25.343
So, here's a screenshot of that,
0:09:26.443,0:09:31.323
showing that there's information[br]about the place, the maps--
0:09:31.323,0:09:35.227
the maps that the collectors[br]initially used,
0:09:35.227,0:09:40.713
and the card that they produced[br]from the information they collected.
0:09:41.455,0:09:46.416
So, here's one of those cards
0:09:46.416,0:09:48.736
broken down into data
0:09:48.736,0:09:50.676
that is included in them.
0:09:51.166,0:09:53.751
So, then there is[br]this linked data project
0:09:53.751,0:09:56.336
by the Helsinki Digital Humanities Lab
0:09:56.336,0:09:58.256
and the Semantic Computing
0:09:58.256,0:10:01.446
group of Aalto University--
0:10:01.446,0:10:06.525
and together with this Institute[br]for the Languages of Finland--
0:10:06.525,0:10:07.994
the Names Sampo.
0:10:07.994,0:10:11.024
And this is an aggregated[br]research interface
0:10:11.024,0:10:13.503
to several place name sources.
0:10:13.503,0:10:17.704
Here you can see that many[br]of the sources are out there on the left,
0:10:17.704,0:10:20.763
and then, you can make[br]different kinds of visualizations
0:10:20.763,0:10:22.653
based on this data.
0:10:22.653,0:10:24.438
And, yeah.
0:10:25.289,0:10:30.603
So, I've been bringing up this idea[br]of modeling for a local Wikibase
0:10:30.603,0:10:32.693
that we could do with this data.
0:10:32.693,0:10:36.580
But when we enter[br]these modeling questions,
0:10:36.580,0:10:37.770
how do we model?
0:10:37.770,0:10:41.589
There are different ways,[br]different traditions in each of these.
0:10:45.682,0:10:50.360
And the good thing about it[br]is it could also serve minority languages
0:10:50.360,0:10:52.475
with very little effort.
0:10:53.243,0:10:57.179
Okay. So, here we have[br]the two basic options:
0:10:57.179,0:11:01.660
the SAPO model, which is[br]the Finnish Space-Time Ontology,
0:11:02.841,0:11:04.421
and the Wikidata model.
0:11:04.421,0:11:07.909
Here you can see[br]that Wikidata items tend to persist.
0:11:07.909,0:11:12.871
Ideally, they remain the same[br]with the changing properties.
0:11:12.871,0:11:16.909
Whereas, in the SAPO model,[br]these items become new
0:11:16.909,0:11:20.399
when there is a change,[br]such as area change and name change.
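
[Editor's illustration, not part of the talk: the contrast between the two models can be sketched roughly as follows. This is a minimal sketch under stated assumptions. The class names, field names and identifiers are hypothetical and do not come from SAPO or Wikidata. In the first style one item persists and its statements change; in the second style a change closes the old item and creates a successor.]

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Statement:
    """A property value with an optional validity period (years)."""
    prop: str
    value: str
    start: Optional[int] = None
    end: Optional[int] = None


@dataclass
class PersistentPlace:
    """Wikidata-style sketch: one item, statements change over time."""
    item_id: str
    statements: List[Statement] = field(default_factory=list)

    def rename(self, new_name: str, year: int) -> None:
        # Close the currently valid name and add the new one to the SAME item.
        for s in self.statements:
            if s.prop == "name" and s.end is None:
                s.end = year
        self.statements.append(Statement("name", new_name, start=year))


@dataclass(frozen=True)
class TemporalPlace:
    """SAPO-style sketch: an immutable snapshot of a place for one period."""
    item_id: str
    name: str
    start: int
    end: Optional[int] = None
    predecessor: Optional[str] = None


def rename_temporal(place: TemporalPlace, new_id: str, new_name: str,
                    year: int) -> Tuple[TemporalPlace, TemporalPlace]:
    """A change yields a NEW item that points back at the closed old one."""
    closed = TemporalPlace(place.item_id, place.name, place.start, year,
                           place.predecessor)
    successor = TemporalPlace(new_id, new_name, year, None, place.item_id)
    return closed, successor
```

[With the first class, links to the item stay valid after a rename. With the second, a consumer has to follow the predecessor chain to see the history of a place.]
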
0:11:21.179,0:11:26.219
So here, we come back to this division
0:11:26.219,0:11:31.719
between these three different dimensions[br]of places, place names.
0:11:32.099,0:11:37.659
So, should we make these place names[br]into entities or properties?
0:11:37.659,0:11:39.248
Wikidata uses properties,
0:11:39.248,0:11:43.098
whereas this land survey[br]project has entities.
0:11:43.838,0:11:46.177
Or should we make them into lexemes?
0:11:46.177,0:11:51.426
Wikidata has chosen to work[br]with properties,
0:11:51.426,0:11:54.956
textual properties[br]for place names over lexemes.
0:11:55.567,0:11:57.818
I'm sorry, the other way around.
0:11:57.818,0:11:59.631
So, the names are...
0:12:03.056,0:12:04.941
properties, not lexemes.
0:12:05.874,0:12:06.877
Right.
0:12:07.165,0:12:11.132
And maybe the shortcoming of the Wikibase
0:12:11.132,0:12:16.340
is the lack of geographical[br]shapes inside that--
0:12:16.340,0:12:20.958
like in the basic setup of it,
0:12:20.958,0:12:24.748
so one would have to add[br]more technology into the stack
0:12:24.748,0:12:29.688
to be able to use local geographic shapes.
0:12:29.688,0:12:31.823
And a federation is really needed
0:12:31.823,0:12:38.168
to be able to take advantage[br]of the Wikidata corpus.
0:12:38.648,0:12:43.052
So, I'm done already. Thank you.
0:12:43.616,0:12:45.827
(applause)
0:13:01.255,0:13:02.514
Okay.
0:13:03.274,0:13:05.011
(speaking in Maori)
0:13:05.011,0:13:07.655
Welcome, everyone.[br]My name is Mike Dickison.
0:13:08.375,0:13:10.149
And for a year,
0:13:10.149,0:13:13.075
I was New Zealand Wikipedian at Large.
0:13:13.935,0:13:16.935
You might wonder[br]what a Wikipedian at Large is.
0:13:17.856,0:13:21.875
Because if you actually look for it,[br]there is no such thing, as we can see.
0:13:22.735,0:13:25.855
It's a term that I made up[br]in the grant proposal,
0:13:26.153,0:13:29.003
which the foundation[br]seemed to like very much.
0:13:29.983,0:13:31.533
And so, we ran with it.
0:13:32.303,0:13:36.633
So, for a year, I went through[br]35 different institutions,
0:13:37.053,0:13:41.053
resident in most of them,[br]running training sessions,
0:13:41.493,0:13:44.363
organizing public events,[br]and trying to develop
0:13:44.363,0:13:47.230
a Wikimedia strategy for each one.
0:13:47.998,0:13:49.498
It was a very interesting experience,
0:13:49.498,0:13:53.267
and you encounter a wide range[br]of different projects and people.
0:13:53.267,0:13:58.211
And I wanted to try and talk through[br]some of the different projects
0:13:58.211,0:14:00.345
that dealt with Wikidata
0:14:00.872,0:14:05.171
in interesting or, perhaps,[br]illuminating ways,
0:14:05.171,0:14:07.591
that might be useful for folks to discuss.
0:14:08.561,0:14:11.961
The project was initially[br]a Wikipedia project by name,
0:14:11.961,0:14:14.651
simply because that was what people[br]were familiar with,
0:14:15.281,0:14:18.360
and so we organized[br]multiple different events
0:14:18.360,0:14:23.135
such as very traditional edit-a-thons,[br]gender gap work, and so forth.
0:14:24.607,0:14:26.752
[And a bunch you can see] [inaudible],
0:14:27.105,0:14:30.812
and a bunch of very successful[br]new editors recruited, and so forth.
0:14:31.754,0:14:34.454
We did bulk uploads into Commons.
0:14:35.454,0:14:41.246
In this case, there was a collection[br]of over 1,000 original artworks
0:14:41.246,0:14:46.047
by an entomological[br]illustrator, Des Helmore,
0:14:46.047,0:14:47.927
which had been sitting on a hard drive,
0:14:47.927,0:14:50.357
at Landcare Research for ten years,
0:14:50.357,0:14:52.322
and we were able[br]to get clearance to release those
0:14:52.322,0:14:54.245
all under a CC BY license.
0:14:54.245,0:14:57.963
So, easy wins to show to people there.
0:14:57.963,0:15:01.095
Everyone can understand[br]lots of pictures of beetles.
0:15:01.095,0:15:06.681
Everyone can understand workshops[br]devoted to fixing the gender gap.
0:15:07.250,0:15:10.251
But Wikidata[br]is much more difficult to sell
0:15:10.251,0:15:12.280
to people in the GLAM sector,
0:15:12.280,0:15:15.095
or anyone outside[br]of our particular movement.
0:15:16.107,0:15:19.717
So, I began to realize that Wikidata
0:15:19.717,0:15:22.634
was going to be a more[br]and more important part
0:15:22.634,0:15:25.883
of the Wikipedian at Large project.
0:15:25.883,0:15:30.472
So, as we went through, it became[br]a larger and larger component
0:15:30.472,0:15:31.849
of what I was doing.
0:15:31.849,0:15:36.350
And I began to try and teach myself[br]more about Wikidata as well,
0:15:36.800,0:15:39.515
because I was beginning to see[br]how important it was.
0:15:40.287,0:15:41.989
So, this one project--
0:15:41.989,0:15:46.325
the kakapo is a native[br]New Zealand flightless parrot.
0:15:48.096,0:15:51.335
We worked with[br]the Department of Conservation,
0:15:51.335,0:15:54.299
whose job is to save[br]this species from extinction,
0:15:54.299,0:15:55.643
and pitched the idea,
0:15:55.643,0:15:59.253
"What if we put every[br]single kakapo into Wikidata?"
0:16:01.221,0:16:02.701
And that may seem ridiculous,
0:16:02.701,0:16:05.580
but it's actually[br]a perfectly doable project.
0:16:06.621,0:16:08.427
A few of them are in there already.
0:16:09.100,0:16:11.601
A key thing to notice here[br]is there are not many kakapos.
0:16:11.615,0:16:13.245
So, it's a manageable task.
0:16:13.245,0:16:16.656
There were 148 when I started,[br]and then one died.
0:16:16.935,0:16:20.995
And they've just had[br]a great breeding season up to 213.
0:16:21.765,0:16:25.045
This is great. This is the most kakapo[br]there have been for over 50 years.
0:16:25.505,0:16:28.260
So, this was also a big deal.
0:16:28.260,0:16:30.725
This was on the news[br]every day in New Zealand.
0:16:31.285,0:16:33.224
Each new one that was born--
0:16:33.224,0:16:34.414
(man) In the New York Times.
0:16:34.414,0:16:35.673
(Mike) Did it? Oh, lovely.
0:16:35.673,0:16:38.522
Yeah, this was national news.[br]Everyone likes these birds.
0:16:39.002,0:16:40.663
But something interesting about them
0:16:40.663,0:16:43.932
is that, unlike species[br]that are more populous,
0:16:43.932,0:16:47.822
every single kakapo is named,[br]has a unique name
0:16:47.822,0:16:49.817
and a unique ID number.
0:16:49.817,0:16:52.442
And often has good biographical data
0:16:52.442,0:16:54.672
about where and when they were born,
0:16:54.672,0:16:56.972
were hatched, who their father[br]and mother were,
0:16:56.972,0:16:58.713
when they died, if they died.
0:16:58.713,0:17:01.352
So, there is, in fact,[br]a Department of Conservation database
0:17:01.352,0:17:02.882
of all this information.
0:17:02.882,0:17:06.723
And one of the most famous kakapos,[br]of course, is Sirocco,
0:17:06.723,0:17:09.726
who you can see is named[br]after a wind, was born there.
0:17:09.726,0:17:13.225
Sirocco has a Twitter account,
0:17:13.705,0:17:15.927
which Wikidata had some problems with,
0:17:15.927,0:17:18.562
because, apparently,[br]they just can't have Twitter accounts.
0:17:18.562,0:17:20.342
I don't know about that.
0:17:21.121,0:17:23.456
He's even featured[br]on an album cover, and so forth.
0:17:23.456,0:17:25.716
So there are multiple properties of this,
0:17:25.716,0:17:28.258
probably one of the most famous[br]individual kakapo.
0:17:28.258,0:17:30.337
So, I pitched to the Department[br]of Conservation,
0:17:30.337,0:17:33.245
"Why don't we try and do this[br]with every single one?"
0:17:33.245,0:17:37.665
And so, they had to think about[br]how much of the biographical data
0:17:37.665,0:17:39.365
could be made public.
0:17:39.365,0:17:41.225
And they came up with a short list.
0:17:41.225,0:17:46.644
And now we've got, I think, 212,[br]210--I think a couple died--
0:17:46.644,0:17:50.703
living kakapo that are all candidates now.
0:17:50.703,0:17:52.933
And they only get a name when they fledge.
0:17:52.933,0:17:56.172
They have a code number until then,[br]while they're still babies.
0:17:56.186,0:17:58.227
So, when we've got the full-fledged crop,
0:17:58.227,0:18:01.806
we're going to create[br]a complete Wikidata--
0:18:01.806,0:18:04.225
the entire species will be in Wikidata.
0:18:04.586,0:18:06.605
But we need to come up[br]with a property for DOC ID--
0:18:06.605,0:18:08.875
I actually would like to talk[br]with folks about that.
0:18:08.875,0:18:11.266
Should we be using a very specific ID,
0:18:11.266,0:18:13.136
or should we be coming up with an ID
0:18:13.136,0:18:17.665
that would work for all individual birds[br]or plants or animals
0:18:17.665,0:18:21.965
that have been tagged[br]in any scientific research project?
0:18:21.965,0:18:23.795
It's a good question.
0:18:25.105,0:18:27.465
Second project was[br]Christchurch Art Gallery.
0:18:28.225,0:18:31.523
There are very few paintings[br]by Colin McCahon,
0:18:31.523,0:18:33.963
New Zealand's most famous[br]artist, in existence.
0:18:33.963,0:18:36.704
This is a drawing he did[br]for the New Zealand School Journal,
0:18:36.704,0:18:38.424
which was government-funded at the time.
0:18:38.424,0:18:40.704
So, it's actually in Archives New Zealand
0:18:40.704,0:18:42.294
who own the copyright for that.
0:18:42.294,0:18:44.333
This is a very unusual situation.
0:18:45.014,0:18:47.073
So, I worked with[br]Christchurch Art Gallery
0:18:47.073,0:18:48.993
who, along with Auckland Art Gallery,
0:18:48.993,0:18:52.954
maintain a site called[br]Find New Zealand Artists.
0:18:52.954,0:18:55.654
The job of which is to keep track[br]of the holdings--
0:18:55.654,0:18:58.403
every institution that has holdings[br]of a New Zealand artist.
0:18:58.403,0:19:03.163
So, about 18,000 different artists[br]in their database,
0:19:03.163,0:19:05.517
and most with very little[br]information at all.
0:19:06.233,0:19:08.992
So, we did a standard sort of Mix'n'Match.
0:19:08.992,0:19:13.673
We did an export of the ones[br]that had at least a birth date,
0:19:13.673,0:19:17.545
or a death date, or a place of birth,[br]or a place of death.
0:19:17.545,0:19:20.614
So, that's not restricting it very much.
0:19:20.614,0:19:23.484
And even then, we were not able[br]to match quite a few,
0:19:23.484,0:19:25.954
but we've got about 1,500 now
0:19:25.954,0:19:28.603
that are matched[br]to known artists in Wikidata,
0:19:28.603,0:19:30.123
which is nice.
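
[Editor's illustration, not part of the talk: the export step described above could look roughly like this. It is a minimal sketch. The field names are hypothetical, and the three-column tab-separated layout (entry ID, name, description) is an assumption about what a Mix'n'Match catalog import expects. It does not show the gallery's actual export.]

```python
import csv
import io
from typing import Dict, Iterable

# Hypothetical field names for the artist records.
BIO_FIELDS = ("birth_date", "death_date", "birth_place", "death_place")


def has_biographical_data(artist: Dict[str, str]) -> bool:
    """Keep a record if at least one biographical field is filled in."""
    return any((artist.get(f) or "").strip() for f in BIO_FIELDS)


def describe(artist: Dict[str, str]) -> str:
    """Build a short description that helps a human matcher disambiguate."""
    born = ", ".join(x for x in (artist.get("birth_date"),
                                 artist.get("birth_place")) if x)
    died = ", ".join(x for x in (artist.get("death_date"),
                                 artist.get("death_place")) if x)
    parts = []
    if born:
        parts.append(f"born {born}")
    if died:
        parts.append(f"died {died}")
    return "; ".join(parts)


def to_catalog_tsv(artists: Iterable[Dict[str, str]]) -> str:
    """Write the filtered records as tab-separated id, name, description."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for artist in artists:
        if has_biographical_data(artist):
            writer.writerow([artist["id"], artist["name"], describe(artist)])
    return out.getvalue()
```

[Records with no dates or places at all are dropped, because there is nothing to match them on beyond the name.]
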
0:19:30.123,0:19:31.783
But what was appealing to them--
0:19:31.783,0:19:33.523
this is their website,
0:19:33.523,0:19:39.213
which really just maintains[br]the holdings links there.
0:19:39.213,0:19:44.523
But there is this biographical data,[br]which they create by hand, currently,
0:19:44.523,0:19:46.063
for every single artist.
0:19:46.063,0:19:48.803
And the act of exporting[br]and putting into Mix'n'Match
0:19:48.803,0:19:52.363
exposed numerous typos[br]and mistakes and such
0:19:52.363,0:19:53.723
that they hadn't noticed.
0:19:53.723,0:19:56.123
And it's only when you start[br]running things through [Excel],
0:19:56.123,0:19:57.272
these things show up.
0:19:57.272,0:20:01.720
And the value of Wikidata[br]was suddenly conveyed to them
0:20:01.720,0:20:05.527
when I said, "You can just suck in[br]that information from Wikidata."
0:20:06.548,0:20:09.507
And that made them sit up straight.
0:20:09.507,0:20:11.748
So this, I think, is one[br]of the selling points.
0:20:11.748,0:20:14.907
When you have this carefully[br]hand-curated website
0:20:14.907,0:20:19.344
with 18,000 entries, full of mistakes,[br]and tell them there's another way,
0:20:19.344,0:20:20.558
that they can get other people
0:20:20.558,0:20:23.192
to do some of this fact-checking[br]and correction for them--
0:20:23.192,0:20:24.813
that's when it sinks home.
0:20:25.143,0:20:27.293
And then I was pitching the idea
0:20:27.293,0:20:30.313
that they "Wikidatafy"[br]this entire history book
0:20:30.313,0:20:33.333
of the New Zealand artists[br]in Christchurch in the '30s,
0:20:33.333,0:20:36.833
and run through--just published--[br]and run through every single person,
0:20:36.833,0:20:39.453
connection, place, exhibition, and such.
0:20:39.453,0:20:43.103
But it's a manageable sized project,[br]and they're very excited by this.
0:20:44.303,0:20:46.843
And thirdly, I wanted to show you[br]Maori Subject Headings.
0:20:46.843,0:20:50.811
A waka is a Maori name[br]for a particular kind of canoe,
0:20:50.811,0:20:52.732
a war canoe.
0:20:52.732,0:20:55.952
So, in the National Library[br]of New Zealand,
0:20:55.952,0:20:58.530
there's a listing for waka,[br]because the National Library
0:20:58.530,0:21:02.805
actually has its own dictionary[br]of Maori Subject Headings,
0:21:03.299,0:21:04.474
in the Maori language.
0:21:04.474,0:21:06.475
So, there it defines a waka,
0:21:07.175,0:21:09.512
in Maori and English.
0:21:10.182,0:21:12.372
But it also has a whole lot[br]of narrower terms,
0:21:12.372,0:21:14.222
you can see there on the side there.
0:21:14.222,0:21:16.062
A typical one would be taurapa.
0:21:16.237,0:21:19.774
And a definition first in Maori,[br]and then in English.
0:21:19.774,0:21:22.249
It's the carved sternpost[br]that you can see there.
0:21:22.695,0:21:24.482
And in English, you would say "sternpost,"
0:21:24.482,0:21:26.959
but you can't use[br]the word "sternpost" for taurapa,
0:21:26.959,0:21:31.054
because taurapa only works[br]for particular kinds of war canoes.
0:21:31.420,0:21:34.460
So, there's no English word[br]equivalent for that.
0:21:35.108,0:21:37.909
And I suddenly realized[br]that here is an entire ontology
0:21:37.909,0:21:42.177
of cultural-specific terms that have been[br]very carefully worked out
0:21:42.177,0:21:45.043
and verified by the National[br]Library with Maori,
0:21:45.043,0:21:49.733
constantly being added to and improved[br]with definitions, with descriptions,
0:21:49.733,0:21:51.803
in both English and Maori.
0:21:51.803,0:21:52.956
Really exciting.
0:21:52.956,0:21:56.228
I suddenly thought we could put[br]this whole lot into Wikidata--
0:21:56.228,0:22:00.596
Maori first, and then translated[br]into English, as required.
0:22:00.596,0:22:02.291
Be a nice change, wouldn't it!
0:22:03.081,0:22:05.046
And here's the copyright licensing.
0:22:05.046,0:22:08.726
Unfortunately, NonCommercial-NoDerivs.
0:22:10.346,0:22:12.346
So now I have to start[br]the conversation with them
0:22:12.346,0:22:14.524
about why did they pick that license.
0:22:15.675,0:22:19.970
And possibly because they only got[br][buy in] from Maori,
0:22:19.970,0:22:22.679
who agreed to sit down[br]and [inaudible] this stuff
0:22:22.679,0:22:24.039
if there was a guarantee
0:22:24.039,0:22:27.339
that none of this information[br]could be used for commercial purposes.
0:22:27.920,0:22:31.999
So, that's one of the frustrating[br]aspects of the task
0:22:31.999,0:22:34.238
is coming up against[br]these sorts of restrictions.
0:22:34.238,0:22:37.019
So, those are the three things[br]I wanted to put out in front
0:22:37.019,0:22:38.379
and spark discussion.
0:22:38.379,0:22:40.878
Putting an entire species into Wikidata,
0:22:40.878,0:22:44.107
what it takes to actually change[br]an art gallery curator's mind
0:22:44.107,0:22:46.078
about the value of Wikidata,
0:22:46.078,0:22:49.838
and what do we do when we see[br]a complete ontology
0:22:49.838,0:22:52.477
in another language that,[br]unfortunately, has been slapped
0:22:52.477,0:22:55.697
with a restrictive[br]Creative Commons license.
0:22:55.697,0:22:56.997
Thank you.
0:22:56.997,0:22:58.737
(applause)
0:23:11.412,0:23:14.077
Hello. My name is Joachim Neubert.
0:23:14.077,0:23:16.472
I'm working for the ZBW,
0:23:17.522,0:23:20.947
that is, Information Center[br]for Economics in Hamburg,
0:23:21.407,0:23:23.796
as a scientific software developer.
0:23:24.726,0:23:31.108
And one of my tasks last year[br]was preparing a data donation to Wikidata.
0:23:31.878,0:23:37.193
And I want to report[br]on our first experiences
0:23:37.613,0:23:43.259
from donating metadata[br]from the 20th-Century Press Archives.
0:23:46.463,0:23:48.299
To our best knowledge,
0:23:48.299,0:23:52.678
this is the largest public[br]press archive in the world.
0:23:54.018,0:23:59.158
It has been collected[br]between 1908 and 2005,
0:24:01.008,0:24:04.244
and has been drawn from
0:24:05.174,0:24:09.272
more than 1,500 newspapers[br]and periodicals
0:24:09.272,0:24:13.333
from Germany, and also internationally.
0:24:14.651,0:24:18.841
And it has covered everything[br]which could be of interest
0:24:18.841,0:24:22.820
for the Hamburg,
0:24:25.870,0:24:28.030
the Hamburg businesspeople
0:24:28.030,0:24:32.410
who wanted to expand over the world.
0:24:34.611,0:24:39.350
As you can see, this material[br]has been clipped from newspapers
0:24:39.350,0:24:41.790
and put onto paper,
0:24:41.790,0:24:44.731
and then collected in folders.
0:24:46.121,0:24:50.451
Here you see a small corner[br]of the Person's Archive,
0:24:51.255,0:24:56.182
and, similarly, information[br]has been collected on companies,
0:24:56.182,0:24:59.762
on general topics, on wares,[br]on everybody,
0:25:01.533,0:25:05.557
on everything which could be interesting.
0:25:06.978,0:25:11.074
These folders have been scanned
0:25:12.652,0:25:15.868
up to roughly 1949
0:25:17.076,0:25:23.123
by a DFG-funded project from 2004 to 2007.
0:25:24.268,0:25:30.591
As a result, up to now,[br]there are 25,000 thematic dossiers
0:25:31.727,0:25:33.759
from this period.
0:25:33.771,0:25:37.913
These contain about 2 million,[br]or more than 2 million pages.
0:25:38.845,0:25:41.522
And these are online.
0:25:43.633,0:25:48.461
The application was developed[br]at that time by ZBW,
0:25:50.006,0:25:54.341
which now looks a bit outdated,
0:25:55.031,0:25:58.153
not so fancy,[br]and what's more of a problem:
0:25:58.597,0:26:04.350
It's an application which was built[br]architecturally on Oracle,
0:26:04.350,0:26:08.662
it was built on ColdFusion,[br]it runs on Windows servers,
0:26:09.227,0:26:14.992
so it's not very sustainable[br]in the long term.
0:26:16.008,0:26:19.274
And we have discussed[br]should we migrate this
0:26:19.274,0:26:22.755
to a more fancy linked data application,
0:26:23.931,0:26:27.964
or should we take a radical step
0:26:27.964,0:26:31.749
and put all this data in the open.
0:26:32.843,0:26:37.416
We have assigned CC0 license to that data
0:26:37.416,0:26:40.938
and are currently moving the main--
0:26:42.036,0:26:46.463
access layer, some main discovery layer--[br]so it's a primary access layer
0:26:47.835,0:26:50.587
to the open linked data web,
0:26:51.315,0:26:56.881
where it actually makes most sense
0:26:56.881,0:27:00.698
to put some metadata into Wikidata,
0:27:02.367,0:27:06.781
and to make sure that all folders
0:27:07.594,0:27:10.633
of the collections are linked to Wikidata,
0:27:11.485,0:27:13.308
so they are findable,
0:27:14.240,0:27:17.795
and that all metadata about these folders
0:27:18.444,0:27:22.977
is also transferred to Wikidata.
0:27:23.344,0:27:27.886
So it can be used there,[br]and it can be enriched there, possibly.
0:27:28.780,0:27:32.237
Corrections can be made to that data.
0:27:32.645,0:27:38.894
What is still maintained by ZBW is,[br]of course, the storage of the images,
0:27:39.947,0:27:43.882
which we can't put in any way,
0:27:45.548,0:27:47.326
or we can't give a license on that
0:27:47.326,0:27:51.179
because this was owned[br]by the original creators.
0:27:52.271,0:27:54.954
But we make sure that they are accessible
0:27:56.500,0:28:02.203
by some, again, metadata files[br]via DFG Viewer and,
0:28:03.108,0:28:06.108
in the future, by IIIF manifests.
0:28:06.849,0:28:11.050
And we will prepare[br]some static landing pages
0:28:11.707,0:28:18.333
which will serve as a data point[br]of reference for Wikidata,
0:28:18.333,0:28:22.596
as well as still making available data
0:28:22.600,0:28:26.174
which doesn't fit well into Wikidata.
0:28:31.253,0:28:36.815
[For us] is migration[br]and data donation to Wikidata
0:28:37.165,0:28:40.633
with our custom infrastructure
0:28:40.633,0:28:44.837
of a SPARQL endpoint with that data,
0:28:45.887,0:28:48.980
and we basically used federated queries
0:28:49.990,0:28:53.834
between that endpoint[br]and the Wikidata Query Service
0:28:53.834,0:28:57.633
to create the corresponding statements,
0:28:59.207,0:29:02.107
either concatenated
0:29:02.107,0:29:06.937
in the SPARQL queries themselves,[br]or transformed via a script,
0:29:07.907,0:29:12.254
which also generated references[br]for the statements.
0:29:12.742,0:29:19.446
And then put that into QuickStatements[br]of the code to use this online.
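(Editorial sketch, not from the talk.) The pipeline described here, query result rows turned into referenced statements for QuickStatements, might look roughly like the following. This is a minimal sketch under stated assumptions: the row layout, the function names `year_value` and `to_quickstatements`, and the placeholder item IDs (`Q_PERSON`, `Q_COMPANY`, `Q_ARCHIVE`) are hypothetical, and the choice of P463 ("member of") as the main property is only an illustration, not the modeling the speaker used. P580, P582, and S248 are the usual "start time", "end time", and "stated in" (reference form) codes in the tab-separated QuickStatements V1 syntax.

```python
from typing import Iterable, List, Mapping


def year_value(year: int) -> str:
    """Format a year as a QuickStatements time value with year precision (/9)."""
    return f"+{year:04d}-01-01T00:00:00Z/9"


def to_quickstatements(rows: Iterable[Mapping[str, object]],
                       source_item: str) -> List[str]:
    """Turn result rows into tab-separated QuickStatements (V1) command lines.

    Each row is assumed to carry: item, property, value, start_year, end_year.
    P580/P582 are the qualifiers "start time"/"end time"; S248 is the
    reference form of P248 ("stated in") and points at the source item.
    """
    commands = []
    for row in rows:
        parts = [
            str(row["item"]), str(row["property"]), str(row["value"]),
            "P580", year_value(int(row["start_year"])),
            "P582", year_value(int(row["end_year"])),
            "S248", source_item,
        ]
        commands.append("\t".join(parts))
    return commands


if __name__ == "__main__":
    # Hypothetical row, shaped like one binding of a federated query result.
    example = [{"item": "Q_PERSON", "property": "P463", "value": "Q_COMPANY",
                "start_year": 1920, "end_year": 1925}]
    print("\n".join(to_quickstatements(example, "Q_ARCHIVE")))
```

Each printed line is one command: the statement, its two qualifiers, and one reference. Real item IDs would have to replace the placeholders before pasting anything into QuickStatements.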
0:29:22.544,0:29:24.088
So, this is what we get.
0:29:24.493,0:29:28.669
It's not only simple things[br]like birth dates, but, sorry--
0:29:29.835,0:29:34.998
but also complex statements
0:29:34.998,0:29:39.787
about already existing items,
0:29:39.787,0:29:44.790
like this person was a supervisory[br]board member of said company
0:29:46.682,0:29:48.905
during this period of time,
0:29:49.663,0:29:56.696
and referenced for use in...
0:29:58.463,0:30:01.864
in the scientific context.
0:30:07.763,0:30:10.939
The first part of this data donation[br]has been finished.
0:30:12.736,0:30:17.201
The Persons Archive[br]is completely linked to Wikidata.
0:30:18.333,0:30:23.652
And this is also an information tool.
0:30:23.652,0:30:27.360
A lot of items which have been before
0:30:27.360,0:30:30.422
not had any external references.
0:30:31.278,0:30:35.674
And we had more[br]than 6,000 statements,
0:30:36.201,0:30:41.924
which are now sourced[br]in this archive's metadata.
0:30:45.288,0:30:49.951
Well, this was the easiest part,
0:30:50.880,0:30:54.785
because persons are easily[br]identifiable in Wikidata.
0:30:56.494,0:31:00.443
More than 90% already existed here,
0:31:00.443,0:31:02.412
so we could link to that.
0:31:02.412,0:31:06.486
We created some 100 items for these,
0:31:06.486,0:31:08.807
for the ones which were missing.
0:31:09.296,0:31:13.626
But now, we are working
0:31:13.626,0:31:18.165
on the rest of the archive,
0:31:18.165,0:31:20.432
particularly on the topics archive.
0:31:21.243,0:31:26.677
Which means mapping a historic system[br]for the organization of knowledge
0:31:26.677,0:31:29.884
about the whole world,
0:31:29.884,0:31:34.147
materialized as newspaper[br]clippings to Wikidata.
0:31:36.305,0:31:41.898
To give you a basic idea,[br]the Countries and Topics archive
0:31:42.668,0:31:48.773
is organized by a hierarchy of countries
0:31:48.773,0:31:50.882
and other geographic entities,
0:31:52.499,0:31:56.443
which is translated to English,[br]which makes this easier.
0:31:56.443,0:32:01.861
And a German, deeply nested...
0:32:03.881,0:32:08.064
deeply nested classification of topics.
0:32:08.064,0:32:11.593
And this combination defines one...
0:32:13.032,0:32:16.020
one folder.
0:32:16.020,0:32:21.128
So, what we now want to do[br]is to match this
0:32:21.128,0:32:24.575
as a structure to Wikidata,[br]and to bring the data in.
0:32:24.575,0:32:29.338
And I want to invite you
0:32:29.338,0:32:33.801
to join this really nice challenge
0:32:33.801,0:32:36.272
in terms of knowledge organization.
0:32:37.739,0:32:40.713
So, it's a WikiProject[br]where this work is tracked,
0:32:40.713,0:32:46.288
and you can follow this[br]or participate in this.
0:32:46.591,0:32:48.908
And, yes, thank you very much.
0:32:49.639,0:32:51.723
(applause)
0:33:03.999,0:33:07.284
So, we're taking[br]performing arts to Wikidata.
0:33:07.735,0:33:11.930
And we're taking performing arts[br]to the linked open data cloud,
0:33:11.930,0:33:15.595
by building a linked open data[br]ecosystem for the performing arts.
0:33:16.164,0:33:21.068
And the question I'm trying to answer,
0:33:21.068,0:33:24.463
and I hope you'll help me[br]in answering the question,
0:33:24.463,0:33:27.012
is which place for Wikidata in all that.
0:33:27.012,0:33:31.316
But let me first start with my experiences
0:33:31.316,0:33:33.963
which I made this year,
0:33:34.723,0:33:37.564
the first half of the year,[br]when I had the pleasure
0:33:37.564,0:33:39.350
to work with CAPACOA,
0:33:39.350,0:33:42.074
which is the Canadian Arts[br]Presenting Association,
0:33:42.074,0:33:47.408
which actually launched a project[br]called Linked Digital Future Initiative,
0:33:47.831,0:33:53.261
to actually get the entire art sector[br]in Canada to embrace linked open data.
0:33:53.441,0:33:56.887
And they did that based on the observation
0:33:56.887,0:33:59.042
that over the past five years,
0:33:59.731,0:34:03.924
the [inaudible]-- the important topic[br]within performing arts
0:34:03.924,0:34:08.855
was the fact that metadata[br]was not around in sufficient quality
0:34:08.855,0:34:11.780
and not interlinked, not interoperable.
0:34:12.106,0:34:16.498
And that was why some of the performances,
0:34:16.498,0:34:19.542
some of the events[br]are not so well findable
0:34:19.542,0:34:24.777
by Google and by personal[br]computer-based assistants, and so on.
0:34:25.989,0:34:29.757
So, the vision we kind[br]of developed together
0:34:29.757,0:34:32.997
is that we want to have a knowledge base
0:34:34.013,0:34:35.646
for many stakeholders at once.
0:34:35.646,0:34:39.636
So we looked at the entire[br]performing arts value network,
0:34:39.636,0:34:42.073
we identified key stakeholders in there,
0:34:42.073,0:34:46.545
we looked at the usage scenarios[br]that we like to pursue,
0:34:47.719,0:34:52.074
and we kind of mapped it[br]to the whole architecture
0:34:52.074,0:34:57.097
of such a knowledge base,[br]or of the different platforms in there,
0:34:57.097,0:34:59.535
which, obviously,[br]is a distributed architecture,
0:34:59.535,0:35:01.361
and not one big monolith.
0:35:02.499,0:35:05.664
I'm just going to run[br]through that quite quickly
0:35:05.664,0:35:07.980
because we have ten minutes each.
0:35:09.035,0:35:13.796
But I think we'll have plenty of time[br]tonight or tomorrow to deepen that
0:35:13.796,0:35:16.318
if anybody's interested in the details.
0:35:16.318,0:35:19.116
So, we started from[br]that Performing Arts Value Network,
0:35:19.116,0:35:23.263
which, interestingly,[br]was just published last year.
0:35:23.263,0:35:27.691
So, we're lucky to be able[br]to build on previous work,
0:35:27.691,0:35:31.098
like you have the primary value chain[br]of the performing arts in the middle,
0:35:31.098,0:35:34.177
and various stakeholders around that.
0:35:34.177,0:35:37.387
All in all, we identified[br]20 stakeholder groups,
0:35:37.387,0:35:43.384
which then we kind of boiled down[br]into seven larger categories.
0:35:43.395,0:35:45.464
For each of the stakeholder groups,
0:35:45.464,0:35:51.558
we kind of formulated what kind of needs
0:35:51.558,0:35:54.718
they would have in terms[br]of such an infrastructure,
0:35:54.718,0:35:58.572
and what would they be able to achieve[br]if the whole thing was interlinked
0:35:58.572,0:36:02.062
and the data was publicly accessible.
0:36:02.637,0:36:04.990
And so, you can see the types here,
0:36:04.990,0:36:09.177
the different types are Production,[br]then Presentation & Promotion,
0:36:09.177,0:36:12.064
Coverage & Reuse, Live Audiences,
0:36:12.064,0:36:13.852
Online Consumption, Heritage,
0:36:13.852,0:36:15.959
Research & Education.
0:36:15.959,0:36:18.917
And after kind of setting up a big table,
0:36:18.917,0:36:21.275
of which you can see[br]just the first part here,
0:36:21.275,0:36:25.128
we kind of compared [over there],[br]had a look at which type of data
0:36:25.128,0:36:26.954
were actually used across the board
0:36:26.954,0:36:31.248
by all different groups of stakeholders.
0:36:31.248,0:36:36.586
And there's quite a large basis of data[br]that is common to all of them,
0:36:36.586,0:36:38.414
and that really is the area
0:36:38.414,0:36:43.063
where it makes a lot of sense, actually,[br]to cooperate and to keep that--
0:36:43.063,0:36:45.988
to maintain the data together.
0:36:47.602,0:36:50.651
So, when talking about[br]platform architecture,
0:36:50.651,0:36:53.648
you can see that we have four layers here.
0:36:54.096,0:36:56.448
At the bottom is the data layer.
0:36:56.448,0:36:58.717
Of course, Wikidata plays a part in it,
0:36:58.717,0:37:02.733
but also a lot of other databases,[br]distributed databases
0:37:02.733,0:37:07.769
that can expose data[br]through SPARQL endpoints.
0:37:09.204,0:37:13.106
The yellow part in the middle,[br]that's the semantic layer.
0:37:13.106,0:37:16.080
It's our common language[br]to describe our things,
0:37:16.080,0:37:21.834
to make statements about things[br]around the performing arts, the ontology.
0:37:22.400,0:37:25.243
Then we have an application layer
0:37:25.243,0:37:30.551
that consists of various modules,[br]for example, data analysis,
0:37:30.551,0:37:34.613
data extraction-- so, how do you[br]actually get unstructured data
0:37:34.613,0:37:36.029
into structured data--
0:37:36.029,0:37:38.749
how can we support that by tools.
0:37:39.436,0:37:42.478
Then, obviously, there's[br]a visualization of data--
0:37:42.478,0:37:47.115
so if there are large quantities of data,[br]you want to visualize it in some way.
0:37:47.801,0:37:50.155
And on the top, you have[br]the presentation layer,
0:37:50.155,0:37:54.814
that's what the ordinary people[br]are actually interacting with
0:37:54.814,0:37:56.199
on a daily basis--
0:37:56.199,0:37:59.615
search engines, encyclopedias,[br]cultural agendas,
0:37:59.615,0:38:02.097
and a variety of other services.
0:38:03.395,0:38:05.386
We're not starting from scratch.
0:38:05.386,0:38:08.535
Some work has already[br]been done in this area.
0:38:09.107,0:38:13.043
I'll just cite a few examples[br]from a project
0:38:13.043,0:38:15.245
which I have been involved in.
0:38:15.245,0:38:18.149
Some other stuff going on as well.
0:38:18.149,0:38:21.195
And so, I started in this area
0:38:21.195,0:38:24.476
with the Swiss Archive[br]of the Performing Arts.
0:38:25.001,0:38:27.795
[Until] building a Swiss[br]Performing Arts database,
0:38:27.795,0:38:31.046
we created the performing arts ontology,
0:38:31.046,0:38:33.931
that's currently being[br]implemented into RDF.
0:38:34.701,0:38:39.771
And there we have the database[br]of like 60, 70 years
0:38:39.771,0:38:43.313
of performance history in Switzerland.
0:38:43.313,0:38:45.145
So, that's something that we can build on,
0:38:45.145,0:38:48.999
and that's something[br]that's been transformed into RDF.
0:38:49.968,0:38:54.621
And there was a builder platform[br]where this data can be accessed.
0:38:56.073,0:39:01.658
Then we have done[br]several ingests into Wikidata,
0:39:01.658,0:39:02.877
partly from Switzerland,
0:39:02.877,0:39:08.990
partly also from[br]the performance arts institutes,
0:39:09.680,0:39:12.357
for example, Bart Magnus[br]was involved in that.
0:39:12.883,0:39:15.078
He was the driving force behind that.
0:39:15.078,0:39:17.223
There's also stuff from Wikimedia Commons,
0:39:17.223,0:39:21.361
but not very well interlinked[br]with all the rest of our metadata.
0:39:21.361,0:39:25.097
And obviously, by doing this ingest,
0:39:25.097,0:39:29.274
we also kind of started to implement[br]parts of this Swiss data model
0:39:29.274,0:39:31.345
into Wikidata.
0:39:32.767,0:39:37.556
Then one of the Canadian[br]implementation partners
0:39:37.556,0:39:39.013
is Culture Creates.
0:39:39.013,0:39:43.872
They're running a platform that actually[br]scrapes information from theater websites,
0:39:43.872,0:39:46.873
and inputs it into a knowledge graph,
0:39:48.293,0:39:54.428
to then expose it to search engines[br]and other search devices.
0:39:56.415,0:40:03.027
And there again, we kind of had[br]to implement and extend this in ontology.
0:40:03.261,0:40:08.163
And as you can see from the slide,[br]there are so many empty spaces,
0:40:08.163,0:40:09.599
but there's also some overlap,
0:40:09.599,0:40:13.456
and an important overlap, obviously,[br]is the common shared language,
0:40:13.456,0:40:18.693
which will help us actually interlink[br]the various data sets.
0:40:20.759,0:40:22.587
What is also important, obviously,
0:40:22.587,0:40:26.404
is that we're using the same[br]base registers and authority files.
0:40:26.406,0:40:31.368
And this is a place where Wikidata[br]plays an important role
0:40:31.368,0:40:33.967
by kind of interlinking these.
0:40:34.619,0:40:37.799
Now, I'd like to share the recommendations
0:40:37.799,0:40:41.882
by the Linked Digital Future Initiative's[br]Advisory Committee.
0:40:42.769,0:40:45.169
At least the two first recommendations.
0:40:45.169,0:40:47.930
So, for the Canadians,[br]now it's absolutely crucial
0:40:47.930,0:40:53.173
to kind of fill in their own Canadian[br]performing arts knowledge graph,
0:40:53.173,0:40:55.851
because unlike the Swiss Archive[br]of the Performing Arts,
0:40:55.851,0:40:59.389
they're not starting[br]with an already existing database,
0:40:59.389,0:41:01.906
but they're kind of[br]creating it from scratch.
0:41:01.906,0:41:04.468
And it's absolutely crucial[br]to have data in there.
0:41:04.468,0:41:09.024
And second, as you can see,[br]comes in already Wikidata.
0:41:09.024,0:41:12.342
Wikidata, by the Advisory Committee,
0:41:12.342,0:41:17.859
has been seen as complementary[br]to Artsdata.ca, this knowledge graph,
0:41:18.347,0:41:21.474
and, therefore, efforts should[br]be undertaken to contribute
0:41:21.474,0:41:24.878
to its population[br]with performing arts-related data.
0:41:25.813,0:41:30.775
And that's what we're going to work on[br]over the coming months and years,
0:41:30.775,0:41:34.748
and that's also why[br]I'm kind of on the lookout here
0:41:34.748,0:41:38.644
to see who else will join that effort.
0:41:40.556,0:41:44.942
So, right now, obviously,[br]we're saying they're complementary.
0:41:44.942,0:41:48.341
So, we have to think about[br]the pluses and the minuses
0:41:48.341,0:41:49.844
of each of the approaches.
0:41:49.844,0:41:52.073
And you can see here a comparison
0:41:52.073,0:41:56.120
between Wikidata and the Classical[br]Linked Open Data approach.
0:41:56.887,0:41:59.947
I would be happy to discuss[br]that further with you guys,
0:41:59.947,0:42:02.549
how your experiences are in there.
0:42:02.814,0:42:07.727
But, as I see it, Wikidata is a huge plus[br]because it's a crowdsourcing platform,
0:42:07.727,0:42:11.671
and it's easy to invite further parties[br]to actually contribute.
0:42:11.683,0:42:17.482
On the negative side, obviously,[br]you get this problem of loss of control.
0:42:17.658,0:42:22.764
Data owners have to give up control[br]over their graphs, data quality,
0:42:22.764,0:42:24.382
and completeness.
0:42:26.554,0:42:31.096
It's harder to track on Wikidata[br]than if you have it under your control.
0:42:31.493,0:42:34.376
And the other strength of Wikidata
0:42:34.376,0:42:39.617
is that it requires immediate integration[br]into that worldwide graph.
0:42:39.617,0:42:41.734
And you kind of just do it--
0:42:42.544,0:42:46.768
kind of reconcile step by step[br]against other databases,
0:42:46.768,0:42:49.528
which may also be seen by some[br]as an advantage,
0:42:49.528,0:42:53.914
but of course, if you're looking[br]for integration and interoperability,
0:42:53.914,0:42:56.792
Wikidata forces you to go for that[br]from the beginning.
0:42:59.184,0:43:03.157
And then, obviously, harmonizing[br]data modeling practices
0:43:03.157,0:43:05.552
is an issue in both cases.
0:43:06.039,0:43:10.671
But it may seem, at the beginning,[br]easier to do just in your own silo,
0:43:10.671,0:43:13.356
because at some point,[br]you're done with the task,
0:43:13.356,0:43:16.693
and it would be[br]an ongoing task on Wikidata.
0:43:18.280,0:43:22.883
So, when it now comes to prioritizing[br]the data to be ingested,
0:43:23.535,0:43:28.395
that's like the rules[br]I kind of go by at the moment.
0:43:30.055,0:43:32.325
First of all, we'd like to ingest it
0:43:32.325,0:43:36.191
where it's unclear who would be[br]the natural authority in the given area.
0:43:36.191,0:43:40.433
So that's definitely data[br]that will be managed in a shared manner.
0:43:40.902,0:43:44.391
And we'd like to ingest it where we see
0:43:44.391,0:43:47.149
a high potential[br]for crowdsourcing approaches.
0:43:47.149,0:43:51.693
We'd like to ingest data where the data[br]is likely to be reused
0:43:51.693,0:43:53.965
in the context of Wikipedia.
0:43:54.813,0:44:00.262
And there's also hope that some part[br]of the international coordination
0:44:00.262,0:44:04.364
around the whole data modeling,[br]about the standardization,
0:44:04.364,0:44:07.531
could actually take place[br]directly on Wikidata,
0:44:07.531,0:44:09.484
if it's not taking place elsewhere,
0:44:09.484,0:44:12.305
because it kind of forces people[br]to start interacting
0:44:12.305,0:44:14.816
if they ingest data in the same part.
0:44:15.963,0:44:22.168
And we'd like to focus now next[br]on base registers and authority files
0:44:22.181,0:44:26.085
because they kind of help us[br]create the linkages
0:44:26.085,0:44:29.010
between different data,[br]and on controlled vocabularies
0:44:29.010,0:44:32.833
as an extension of the existing ontology.
0:44:33.965,0:44:35.994
So, just two more slides.
0:44:36.480,0:44:40.978
The next steps will be that we're taking[br]the sum of all GLAMs approach
0:44:40.978,0:44:42.888
to Wiki Loves Performing Arts.
0:44:42.888,0:44:47.524
That means we're describing[br]venues and organizations,
0:44:47.524,0:44:51.106
and try to push the data to Wikipedia
0:44:51.106,0:44:54.414
in forms of infoboxes[br]and [bubble] templates.
0:44:54.414,0:44:59.769
And the other one, the other project[br]I'm going to pursue, is a COST Action
0:45:00.336,0:45:02.001
that we'll submit next year
0:45:03.140,0:45:06.037
around that Linked Open Data Ecosystem[br]for the Performing Arts.
0:45:06.037,0:45:10.347
COST is a European program[br]that supports networking activities,
0:45:10.347,0:45:13.929
and the topics to be covered[br]are listed here.
0:45:13.929,0:45:16.404
Two of them, I have highlighted--
0:45:16.404,0:45:20.702
one of them is like the question[br]of federation between Wikidata
0:45:20.702,0:45:23.717
and the classical linked[br]open data approaches.
0:45:24.368,0:45:27.744
And the other one, I think,[br]is very important also,
0:45:27.744,0:45:30.528
where we have a huge potential still,
0:45:30.528,0:45:35.683
is implementing international campaigns[br]to supplement data on Wikidata.
0:45:37.627,0:45:41.365
So, that's it. Thank you[br]for your attention.
0:45:41.365,0:45:45.762
Now, I would like to ask[br]my colleagues up here.
0:45:47.086,0:45:50.529
To the panel, maybe you'll get them[br]microphones as well.
0:45:53.903,0:45:55.682
And then I would like to...
0:45:57.473,0:45:59.940
give you the chance to ask questions.
0:46:01.042,0:46:05.185
And obviously, also ask my colleagues
0:46:05.753,0:46:08.071
whether they have questions to each other.
0:46:12.049,0:46:15.327
So, do we have maybe a question[br]from the audience?
0:46:20.502,0:46:22.758
(man) [inaudible]
0:46:23.587,0:46:27.033
I would like to ask from each of you
0:46:27.033,0:46:30.842
where would you draw the line,
0:46:30.842,0:46:33.076
basically, how you define--
0:46:33.076,0:46:35.956
when do you need to run your own Wikibase,
0:46:35.956,0:46:39.328
and what do you want to put on Wikidata?
0:46:39.328,0:46:43.677
Like, is this a clear delineation[br]of what is seen
0:46:43.677,0:46:45.981
behind of putting it [into order.]
0:46:48.211,0:46:51.484
I can answer first because I have the mic.
0:46:51.484,0:46:56.955
So, I've been thinking[br]that one of the issues is notability.
0:46:59.212,0:47:02.084
I'm addressing that[br]in a different project.
0:47:02.084,0:47:05.898
And I think licensing could be one,
0:47:05.898,0:47:10.466
because you can apply your own terms[br]in your own database,
0:47:10.466,0:47:13.758
and then I think wherever it's possible.
0:47:14.284,0:47:19.882
And then, the third one[br]is just to have it as a sandbox,
0:47:19.882,0:47:23.078
prepare it for ingestion into Wikidata.
0:47:23.078,0:47:26.085
These are the three main things[br]that I come up with now,
0:47:26.085,0:47:28.554
but I can come up with more.
0:47:29.976,0:47:32.369
For me, rights are always[br]going to be an issue.
0:47:32.369,0:47:36.686
So, if the National Library[br]wanted to move towards Wikibase,
0:47:36.686,0:47:39.740
that would enable them to continue[br]to control the licensing
0:47:39.740,0:47:42.539
for the work they've done[br]with Maori language terms.
0:47:43.438,0:47:46.483
The kakapo database only contains data
0:47:46.483,0:47:49.977
that the Department of Conservation[br]felt could be made public,
0:47:49.977,0:47:52.739
but I suspect if they see it[br]up and running,
0:47:52.739,0:47:55.980
they might be tempted[br]to use a private Wikibase
0:47:55.980,0:47:58.128
to maintain their own database,
0:47:58.128,0:48:01.214
simply because of some[br]of the visualization tools
0:48:01.214,0:48:03.567
that could be applied might be better
0:48:03.567,0:48:07.417
than the sort of Excel spreadsheet system[br]that they currently run.
0:48:12.337,0:48:16.556
Well, I think this very much depends[br]on the kind of data.
0:48:17.609,0:48:22.359
We are, with the Press Archive, of course,[br]in a quite lucky position,
0:48:22.359,0:48:26.984
in that this was material[br]which was published,
0:48:26.984,0:48:29.829
it was published at the time,
0:48:30.153,0:48:31.780
but it was expensive to publish.
0:48:33.082,0:48:36.234
So, this is quite easy.
0:48:36.234,0:48:39.449
I think, also, projects--
0:48:40.101,0:48:42.027
and this is a typical project,
0:48:42.027,0:48:45.726
so it was funded for some time,[br]and then funding ended,
0:48:46.466,0:48:51.516
and what happens with the data[br]which is enclosed in some silo,
0:48:52.136,0:48:55.106
and some software[br]which will not run forever.
0:48:55.846,0:48:59.436
And so, it makes[br]absolute sense in my eyes.
0:48:59.896,0:49:02.776
At the time, Wikidata[br]wasn't around, but now it is,
0:49:03.376,0:49:07.336
and it makes absolute sense[br]for our project to early on
0:49:07.336,0:49:12.732
discuss sustainability in the context[br]of how could we put this
0:49:12.732,0:49:16.617
into a larger ecosystem like Wikidata,
0:49:18.717,0:49:21.408
and discuss this with the data community
0:49:21.408,0:49:26.864
what is notable and what makes sense[br]to add to Wikidata,
0:49:26.864,0:49:32.093
and what makes sense to keep[br]in a proprietary form.
0:49:32.093,0:49:37.753
Maybe in a simpler form[br]than a sophisticated application,
0:49:37.753,0:49:43.055
but make it discoverable[br]and make it linked to the large data cloud
0:49:43.055,0:49:46.032
instead of investing lots of money
0:49:46.032,0:49:52.692
into some silo which will not sustain.
0:49:55.201,0:50:00.121
Yeah, as I said before[br]in the project I was presenting here,
0:50:00.121,0:50:04.926
there are dualities between Wikidata[br]and classical linked open data approaches.
0:50:04.926,0:50:07.928
So, it's not so much about[br]setting up a private Wikibase.
0:50:11.147,0:50:14.504
Like one challenge we have had,[br]and, of course, in Wikidata,
0:50:14.504,0:50:17.710
is that when you ingest[br]your own data there,
0:50:17.710,0:50:20.341
you also have to do some housekeeping
0:50:20.744,0:50:23.509
of people, of other people, actually.
0:50:24.043,0:50:28.258
And that can put off people,[br][or it also means] that we will address it
0:50:28.258,0:50:29.888
just step by step.
0:50:30.375,0:50:33.466
So, there will be, at the moment,[br]a database living--
0:50:33.873,0:50:35.581
in classical linked open data
0:50:35.581,0:50:38.395
and we're starting to link it[br]with Wikidata,
0:50:38.395,0:50:40.993
and it's a continuous process to find out
0:50:41.805,0:50:47.643
for which areas the most data[br]will be eventually on Wikidata,
0:50:48.168,0:50:51.946
and for which areas it will actually[br]live on other databases.
0:50:52.620,0:50:56.645
Obviously, we'll have challenges[br]regarding synchronization,
0:50:57.135,0:50:58.589
as we probably all have,
0:50:58.589,0:51:01.507
because that linked data field,
0:51:01.507,0:51:04.826
where we still have[br]to negotiate who we trust,
0:51:05.160,0:51:08.720
who has authority about what.
0:51:13.830,0:51:15.820
(assistant) Other questions?
0:51:23.981,0:51:25.550
(woman) Thank you.
0:51:26.090,0:51:31.030
So, fully agree with that issue of--
0:51:34.425,0:51:41.410
where to put the boundary[br]between why do we put data on Wikidata,
0:51:43.044,0:51:49.144
or why do we keep them,[br]and create, manage, and maintain them
0:51:49.144,0:51:53.104
in local databases and for what purposes.
0:51:53.778,0:51:57.213
And I think that[br]this is a large discussion
0:51:57.213,0:52:02.383
that goes beyond just the excitement
0:52:02.383,0:52:07.423
of putting data on Wikidata[br]because it is public,
0:52:07.432,0:52:10.762
because it serves humanity, because--
0:52:11.031,0:52:13.362
while there are two cool tools,
0:52:13.362,0:52:18.132
and things are more complicated[br]in real life, I think.
0:52:19.162,0:52:24.102
Well, despite this,[br]it's quite an interesting discussion.
0:52:24.435,0:52:29.744
And then this is another issue, also,[br]or another problem that is being discussed
0:52:29.744,0:52:35.034
in this event in different panels.
0:52:35.775,0:52:41.129
It is on one side, have your own database,
0:52:41.129,0:52:43.194
whatever the technology is
0:52:43.194,0:52:46.763
and publish things on Wikidata,
0:52:47.233,0:52:51.166
or build your own system
0:52:51.166,0:52:55.246
of creating and managing information
0:52:55.246,0:52:58.131
on the Wikibase technology.
0:52:58.591,0:53:04.281
And then, synchronize or whatever--[br]do federation or things,
0:53:04.281,0:53:08.314
so it's a matter[br]of technology that is used,
0:53:09.182,0:53:14.796
and the fact that you use Wikidata[br]just for publishing,
0:53:14.978,0:53:18.637
or the infrastructure[br]that is underneath Wikidata
0:53:18.637,0:53:23.002
to create and manage your data.
0:53:27.116,0:53:30.914
I mean, we had a discussion
0:53:30.914,0:53:34.254
about the Wikibase panel,
0:53:34.254,0:53:36.912
and there will be other discussions here,
0:53:36.912,0:53:40.815
but things are[br]on different levels, I think.
0:53:41.626,0:53:47.756
Maybe [you sort of get] to that discussion[br]about Wikibase or Wikidata--
0:53:48.930,0:53:52.427
I think it's problematic[br]that we are focusing so much
0:53:52.427,0:53:56.158
on this Wikibase infrastructure,[br]because there are other infrastructures,
0:53:56.158,0:53:58.690
like in the area of performing arts.
0:53:59.810,0:54:04.054
We have another complementary community,[br]which is MusicBrainz
0:54:04.054,0:54:08.954
that runs on their own platform[br]that provides linked open data,
0:54:09.614,0:54:12.692
and as I understand it,
0:54:14.160,0:54:17.232
there's agreement[br]within the Wikidata community
0:54:17.232,0:54:19.731
that we're not going[br]to double all their data--
0:54:19.731,0:54:24.237
we're not going to copy all their data,[br]but we accept that they're complementary.
0:54:24.848,0:54:29.678
So, what will happen when you start[br]integrating this data in Wikipedia?
0:54:30.246,0:54:31.907
Infoboxes, for example.
0:54:31.907,0:54:35.952
Would we be able to pull that data[br]directly from their SPARQL endpoint?
0:54:36.764,0:54:39.603
Or would we be obliged[br]to kind of copy all the data,
0:54:39.603,0:54:42.225
and what kind of processes[br]are involved in that?
0:54:42.225,0:54:44.915
(woman) Discussions are open, I think,
0:54:44.915,0:54:49.615
because within this event,[br]you have both interested communities--
0:54:49.615,0:54:51.975
those that are interested in Wikibase,
0:54:51.975,0:54:54.002
and those that are interested in Wikidata,
0:54:54.002,0:54:56.282
and those who are interested in both.
0:54:56.282,0:54:59.562
Yeah, but we're not going[br]to oblige them to move to Wikibase.
0:55:00.162,0:55:03.138
- (woman) Not necessarily.[br]- MusicBrainz is not running on Wikibase.
0:55:03.138,0:55:06.802
(woman) No, I just wanted to say[br]that you have separate problems,
0:55:06.802,0:55:10.964
sometimes interrelated,[br]sometimes not completely separated.
0:55:12.479,0:55:16.573
And I had another question or remark
0:55:16.573,0:55:22.013
regarding the management of hierarchies[br]in controlled vocabularies,
0:55:22.013,0:55:26.473
like thesauri, like you have in Finto.
0:55:27.703,0:55:30.563
You do have the places
0:55:31.503,0:55:34.956
in the Maori
0:55:36.418,0:55:40.554
Subject Headings,
0:55:42.262,0:55:48.068
Well, they have to deal with[br]the management of concepts in hierarchy.
0:55:48.360,0:55:52.320
What is your take, your opinion
0:55:52.320,0:55:57.042
about the possibility[br]of managing these controlled
0:55:58.850,0:56:02.364
knowledge organization[br]systems in Wikidata?
0:56:07.166,0:56:10.169
I think in the case[br]of Finto and YSO places,
0:56:11.499,0:56:14.391
the repository will be a collection
0:56:14.391,0:56:18.936
of several sources, eventually.
0:56:18.936,0:56:21.613
So, it is in flux, anyway.
0:56:21.613,0:56:24.528
So, we don't have to necessarily--
0:56:24.528,0:56:28.383
well, I don't represent[br]the National Library,
0:56:28.383,0:56:31.512
but in that possible project,
0:56:31.512,0:56:35.711
we would not have[br]to maintain an existing--
0:56:35.711,0:56:38.540
or fight with an existing structure.
0:56:38.540,0:56:45.164
So, in that sense, it is an area[br]open for exploration.
0:56:48.912,0:56:52.272
The Maori Subject Headings[br]seem to lend themselves ideally
0:56:52.272,0:56:54.392
to Wikidata structure,
0:56:54.392,0:56:56.961
but the licensing,[br]of course, forbids that.
0:56:56.961,0:56:59.491
I suspect that if the licensing[br]were different
0:56:59.491,0:57:01.511
and they were put into Wikidata,
0:57:01.511,0:57:04.562
as soon as somebody decided[br]they didn't like the hierarchy
0:57:04.562,0:57:06.162
and started to change things,
0:57:06.162,0:57:10.001
there would be an immediate outcry[br]from people who worked very hard
0:57:10.001,0:57:12.301
to create that structure
0:57:12.301,0:57:15.641
and get the sign-off[br]from various different Maori
0:57:15.641,0:57:17.942
that was the current hierarchy.
0:57:18.382,0:57:20.841
So, that's an issue to try and resolve.
0:57:23.812,0:57:26.502
I think in terms of knowledge[br]organization systems,
0:57:26.502,0:57:28.116
they are all different.
0:57:28.116,0:57:31.752
And I'm not sure[br]if it would be a good idea
0:57:31.752,0:57:36.855
to represent different hierarchies[br]in Wikidata as such,
0:57:37.650,0:57:42.101
but it maybe makes sense[br]to think about overlays
0:57:42.941,0:57:45.022
of the data.
0:57:45.431,0:57:48.371
So, to do mappings on the content level.
0:57:49.091,0:57:54.021
For example, ZBW publishes[br]the Thesaurus for Economics.
0:57:55.420,0:57:59.150
And this thesaurus has its own hierarchy,
0:57:59.680,0:58:04.020
and, of course, it would be possible[br]to project the hierarchy
0:58:04.461,0:58:08.452
of this thesaurus into Wikidata concepts
0:58:08.452,0:58:11.541
without actually storing[br]this kind of structure
0:58:12.180,0:58:14.840
as an alternative structure[br]within Wikidata
0:58:14.840,0:58:18.640
which would cause a lot of confusion.
0:58:18.640,0:58:24.789
But I think we should think[br]of Wikidata, also, as a pool of concepts
0:58:24.789,0:58:29.651
which can be connected on layers[br]which are outside,
0:58:30.264,0:58:33.489
and which give another view of the world
0:58:33.489,0:58:39.080
which does not necessarily have[br]to be within Wikidata.
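(Editorial sketch, not from the talk.) The overlay idea, keeping a thesaurus hierarchy outside Wikidata and projecting it onto Wikidata concepts through a mapping, could be sketched as follows. Everything here is an assumption for illustration: the function name `broader_qids`, the concept IDs (`c1`, `c2`, ...), and the placeholder item IDs are hypothetical, and a real setup would more likely hold the hierarchy as SKOS data behind a SPARQL endpoint than as Python dictionaries.

```python
from typing import Dict, List, Set


def broader_qids(qid: str,
                 mapping: Dict[str, str],
                 broader: Dict[str, List[str]]) -> Set[str]:
    """Project an external hierarchy onto Wikidata items.

    mapping: external concept ID -> Wikidata item ID (the content-level mapping).
    broader: external concept ID -> its broader concept IDs (kept outside Wikidata).
    Returns every mapped item that lies above `qid` in the external hierarchy.
    """
    concept_of = {item: concept for concept, item in mapping.items()}
    start = concept_of.get(qid)
    if start is None:
        return set()  # the item is not mapped to the external system at all

    result: Set[str] = set()
    seen = {start}
    stack = [start]
    while stack:
        concept = stack.pop()
        for parent in broader.get(concept, []):
            if parent in seen:
                continue  # guard against cycles and repeated visits
            seen.add(parent)
            stack.append(parent)
            # Unmapped concepts are still traversed, just not reported.
            if parent in mapping:
                result.add(mapping[parent])
    return result


if __name__ == "__main__":
    mapping = {"c1": "Q_A", "c2": "Q_B", "c3": "Q_C"}
    broader = {"c1": ["c2"], "c2": ["c4"], "c4": ["c3"]}
    print(sorted(broader_qids("Q_A", mapping, broader)))
```

The hierarchy never has to be stored as statements in Wikidata; only the concept-to-item mapping touches it, so changes to Wikidata's own class structure leave the external view intact.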
0:58:45.775,0:58:48.203
(assistant) Alright. Some other questions?
0:58:49.096,0:58:51.527
Otherwise-- okay.
0:58:54.769,0:58:57.781
(man 2) Joachim, I just wanted[br]to follow up on that last point.
0:58:57.781,0:59:01.064
So, these layers, as you picture it,
0:59:02.196,0:59:04.143
they would be maintained externally
0:59:04.143,0:59:07.404
and somehow integrated
0:59:08.964,0:59:11.764
with Wikidata from the Wikidata side,
0:59:11.764,0:59:17.143
or have you thought a bit further
0:59:17.143,0:59:19.463
about how that might be managed?
0:59:22.351,0:59:24.931
Actually, no, I have no--
0:59:25.271,0:59:30.361
I have done experiments[br]with ZBW and Wikidata.
0:59:30.771,0:59:33.132
I was [inaudible] here at Wikidata.
0:59:33.132,0:59:38.837
But I think this is[br]a whole new complex thing,
0:59:39.261,0:59:46.210
and so, it's up to [discuss],[br][to give up a lot of control]
0:59:46.409,0:59:47.908
to do such things.
0:59:47.908,0:59:50.178
But it has to be figured out.
0:59:56.638,0:59:57.959
Should we take one more?
0:59:57.959,0:59:59.686
(man 3) Ah, great.
0:59:59.686,1:00:02.628
I was just wondering[br]about the kakapo project.
1:00:03.875,1:00:05.000
Uh-hmm.
1:00:05.000,1:00:10.805
(man 3) Okay. So, did you get[br]any pushback from the Wikidata community
1:00:10.805,1:00:14.636
about individual animals[br]having their own items?
1:00:15.576,1:00:16.836
Not so far.
1:00:16.836,1:00:19.045
(man 3) Has anyone heard[br]about this before?
1:00:19.045,1:00:22.445
Is it "not so far" because[br]no one has heard about it yet?
1:00:23.085,1:00:26.095
There's been a small discussion[br]for quite some time now--
1:00:26.095,1:00:29.235
those people interested[br]in this sort of thing in Wikidata,
1:00:29.235,1:00:32.215
and we all seem to think[br]that it's a natural extension
1:00:32.215,1:00:35.855
of giving individual Wikidata items[br]to a famous racehorse
1:00:35.855,1:00:39.755
or someone's cat, which--[br]that's modeled pretty well.
1:00:39.764,1:00:44.444
I guess just the audacious thing[br]is putting the entire species in there.
1:00:44.444,1:00:48.113
But I think it's perfectly manageable.
1:00:48.113,1:00:50.173
(man 3) Don't try it with cats and dogs.
1:00:50.173,1:00:52.457
(laughter)
1:00:52.457,1:00:54.337
(assistant) Okay. I think[br]the time is finished.
1:00:54.337,1:00:55.767
Thank you very much for attending.
1:00:55.767,1:00:59.267
I think the speakers will still be open[br]for questions during the break.
1:00:59.267,1:01:00.797
And have fun.
1:01:00.797,1:01:02.292
Thank you very much.
1:01:02.292,1:01:04.047
(applause)