1 00:00:00,000 --> 00:00:08,249 Good afternoon, everybody. 2 00:00:08,929 --> 00:00:12,068 Welcome to our GLAM panel. 3 00:00:13,124 --> 00:00:17,009 Before we start, I just have two announcements to make. 4 00:00:17,329 --> 00:00:23,049 First of all, please extensively make use of our Etherpad to take notes. 5 00:00:23,781 --> 00:00:27,998 And the second one is directed at our audience at home, 6 00:00:27,998 --> 00:00:29,819 or wherever you are. 7 00:00:29,819 --> 00:00:30,958 If you have any questions, 8 00:00:30,958 --> 00:00:34,028 you can also write that into the Etherpad, 9 00:00:34,028 --> 00:00:37,828 and our room angels will keep track of them. 10 00:00:39,328 --> 00:00:44,348 So, we decided that for this year's panel, 11 00:00:45,388 --> 00:00:48,868 after seeing all the contributions that were made, 12 00:00:49,128 --> 00:00:53,538 we would focus on the role of Wikidata within data ecosystems 13 00:00:53,551 --> 00:00:57,199 that go beyond the actual Wikimedia projects, 14 00:00:57,199 --> 00:00:59,747 which is also absolutely in line 15 00:00:59,747 --> 00:01:03,677 with the new Wikimedia Foundation strategy. 16 00:01:04,652 --> 00:01:07,947 And we have, today, four panelists. 17 00:01:08,387 --> 00:01:09,876 Three plus one. 18 00:01:09,876 --> 00:01:13,636 So, I would like to ask you on stage, 19 00:01:13,636 --> 00:01:15,875 so we can introduce you. 20 00:01:22,205 --> 00:01:24,706 So, we have Susanna Ånäs. 21 00:01:25,385 --> 00:01:29,296 She's a long time free-knowledge activist 22 00:01:29,296 --> 00:01:31,276 involved in many WikiProjects. 23 00:01:31,916 --> 00:01:35,526 And she will be reporting today on the project in cooperation 24 00:01:35,526 --> 00:01:38,396 with the Finnish National Library. 25 00:01:38,856 --> 00:01:43,435 Then we have, next to me, Mike Dickison, 26 00:01:43,435 --> 00:01:46,325 who will be second in this order. 27 00:01:46,995 --> 00:01:50,283 He is a museum curator from New Zealand. 
28 00:01:50,283 --> 00:01:53,815 He's a zoologist and a Wikipedia editor. 29 00:01:53,815 --> 00:01:58,788 And he was New Zealand's first Wikipedian at Large 30 00:01:58,788 --> 00:02:02,565 in 2018 and 2019. 31 00:02:02,565 --> 00:02:06,634 And he will tell us about his experience in that role, 32 00:02:06,634 --> 00:02:13,105 and what kind of role Wikidata is starting to play in that context. 33 00:02:15,784 --> 00:02:18,135 Then we have Joachim Neubert 34 00:02:18,135 --> 00:02:23,461 from the Leibniz Information Center for Economics in Kiel and Hamburg. 35 00:02:24,011 --> 00:02:29,131 He has been working on making the largest public press archives worldwide 36 00:02:29,131 --> 00:02:34,655 more accessible to the public, and he's using Wikidata to do that. 37 00:02:35,890 --> 00:02:39,091 And then I will go last. My name is Beat Estermann. 38 00:02:39,091 --> 00:02:43,080 I work for Bern University of Applied Sciences, in Switzerland. 39 00:02:43,640 --> 00:02:49,950 And I've been a long-time promoter of OpenGLAM in Switzerland and Austria. 40 00:02:50,335 --> 00:02:54,840 And I will today report on my activities in connection 41 00:02:54,840 --> 00:02:59,460 with the mandate from the Canadian Arts Presenting Association, 42 00:02:59,460 --> 00:03:01,270 focusing on performing arts. 43 00:03:02,121 --> 00:03:04,440 Not primarily on Wikidata, 44 00:03:04,440 --> 00:03:08,421 but you will see Wikidata is starting to play a role there, as well. 45 00:03:08,970 --> 00:03:13,250 So now, most of us will take our seats here, 46 00:03:13,250 --> 00:03:16,980 and I will give the floor to Susanna. 47 00:03:18,300 --> 00:03:22,769 Okay. So, hello. My name is Susanna Ånäs, 48 00:03:22,769 --> 00:03:25,769 and I work part-time for Wikimedia Finland 49 00:03:25,769 --> 00:03:27,079 as a GLAM coordinator, 50 00:03:27,079 --> 00:03:32,655 and I also do consulting in the open knowledge sphere. 51 00:03:32,655 --> 00:03:36,049 And this is a discourse, maybe, of [inaudible]. 
52 00:03:36,049 --> 00:03:38,719 So, I have been involved in the workings 53 00:03:38,719 --> 00:03:45,642 of the geographic data group of the-- 54 00:03:48,439 --> 00:03:51,147 well, I looked it up, but it isn't in English, 55 00:03:51,147 --> 00:03:54,497 but, cultural heritage initiative of the Finnish government. 56 00:03:54,917 --> 00:03:59,775 So, this is about place names 57 00:03:59,775 --> 00:04:03,300 and how they are represented 58 00:04:03,300 --> 00:04:07,466 in different repositories in the GLAM sector in Finland, 59 00:04:07,466 --> 00:04:11,755 and how they are trying to pull together these different sources, 60 00:04:11,755 --> 00:04:17,906 and how they are informed by modeling in Wikidata and elsewhere. 61 00:04:17,906 --> 00:04:23,315 So, here we see the three main sources for these YSO places, 62 00:04:23,315 --> 00:04:27,944 which is part of the national ontology-- general ontology. 63 00:04:27,944 --> 00:04:29,665 AHAA is for Finnish archives, 64 00:04:29,665 --> 00:04:31,645 Melinda is for Finnish libraries, 65 00:04:31,645 --> 00:04:33,750 and KOOKOS is for Finnish museums. 66 00:04:33,750 --> 00:04:37,585 So, there are three, also, content management systems 67 00:04:37,585 --> 00:04:40,290 that come together in these YSO places. 68 00:04:40,745 --> 00:04:47,365 And there are exchanges with Wikidata already taking place, 69 00:04:47,965 --> 00:04:53,065 as well as the names project for the National Land Survey. 70 00:04:53,065 --> 00:04:56,285 And then, there's a third project, the Finnish Names Archive, 71 00:04:56,285 --> 00:05:00,391 which doesn't yet contribute to this, 72 00:05:00,391 --> 00:05:02,715 but there are plans for that. 73 00:05:02,715 --> 00:05:09,175 So, one of the key modeling issues in this whole problem area 74 00:05:09,175 --> 00:05:15,226 is that there are three types of elements in place names 75 00:05:16,116 --> 00:05:18,195 represented in this project. 
76 00:05:18,195 --> 00:05:21,236 One of them is the place, the one that has a location. 77 00:05:21,236 --> 00:05:24,766 And one of them is the place name, the toponym, for example. 78 00:05:25,006 --> 00:05:27,696 And then, there are sources, which are documents 79 00:05:27,696 --> 00:05:30,756 from which both of these can be derived, 80 00:05:30,756 --> 00:05:32,565 or like, backed up with. 81 00:05:32,565 --> 00:05:35,845 The YSO places-- here, on the top right, 82 00:05:35,845 --> 00:05:38,799 you will see the same diagram again. 83 00:05:38,799 --> 00:05:41,189 It focuses mainly on the places. 84 00:05:42,619 --> 00:05:46,279 The maintainer of this is the Finnish National Library, 85 00:05:46,279 --> 00:05:49,159 and the Finto project. 86 00:05:50,199 --> 00:05:55,608 There are now more than 7,000 places in Finnish and Swedish 87 00:05:55,608 --> 00:05:59,438 and over 3,000 in English, 88 00:05:59,438 --> 00:06:03,042 and they are CC0-licensed. 89 00:06:03,042 --> 00:06:06,008 So, here you can see the service of Finto. 90 00:06:06,008 --> 00:06:09,883 And a place-- I chose Sevettijärvi. 91 00:06:09,883 --> 00:06:13,908 It is now also related to our language project 92 00:06:13,908 --> 00:06:15,268 with the Skolt Sámi-- 93 00:06:15,268 --> 00:06:18,877 this is a place in the very north of Finland 94 00:06:18,877 --> 00:06:21,765 inhabited by Skolt Sámi. 95 00:06:21,765 --> 00:06:27,264 So, here you can see the place which belongs to the-- 96 00:06:27,264 --> 00:06:32,724 well, you will see the data about this place. 97 00:06:32,724 --> 00:06:37,952 You can see that it is connected to Wikidata, 98 00:06:37,952 --> 00:06:42,344 as well as this National Land Survey data. 99 00:06:43,192 --> 00:06:47,406 Here we go. And you will see this in more detail, here. 100 00:06:48,582 --> 00:06:52,360 It is also hierarchically arranged 101 00:06:52,360 --> 00:06:56,310 inside this repository. 
102 00:06:57,670 --> 00:07:00,460 Well, actually, the actual place is not seen, 103 00:07:00,460 --> 00:07:05,880 but it is underneath this municipality, 104 00:07:05,880 --> 00:07:08,010 as well as the region, 105 00:07:08,010 --> 00:07:10,154 and Finland as a country, and Nordic countries, 106 00:07:10,154 --> 00:07:12,650 the broader region. 107 00:07:12,650 --> 00:07:14,400 Here you can see that many of these 108 00:07:14,400 --> 00:07:17,891 have been matched with Wikidata previously 109 00:07:18,730 --> 00:07:22,230 through Mix'n'Match, and there are still remaining ones. 110 00:07:22,230 --> 00:07:27,900 But then, the amount of names is not that high. 111 00:07:28,411 --> 00:07:30,844 It's only less than 5,000. 112 00:07:31,570 --> 00:07:33,860 So, then there is this other repository 113 00:07:33,860 --> 00:07:38,040 by the Finnish Geospatial Platform Project-- 114 00:07:38,040 --> 00:07:39,199 Place Names Cards. 115 00:07:39,199 --> 00:07:41,729 These are all the place names that are on Finnish maps. 116 00:07:42,130 --> 00:07:48,308 And they have the linked data, which is licensed CC BY 4.0. 117 00:07:48,518 --> 00:07:54,478 800,000 map labels in Finnish, Swedish, and all those three Saami languages 118 00:07:54,478 --> 00:07:55,778 that are in Finland. 119 00:07:55,997 --> 00:07:58,877 And they have two different types of entities. 120 00:07:58,877 --> 00:08:00,680 The other ones are places, and the other ones 121 00:08:00,680 --> 00:08:02,651 are place names, toponyms. 122 00:08:02,651 --> 00:08:05,271 And they both have persistent URIs. 123 00:08:06,001 --> 00:08:09,721 Here's, for example, the same Sevettijärvi, in first Finnish, 124 00:08:09,721 --> 00:08:14,001 and then all those three Saami languages, as well as the geographic data, 125 00:08:14,001 --> 00:08:18,821 and then there is more information about that, like the place type, 126 00:08:19,630 --> 00:08:20,841 et cetera. 
127 00:08:21,640 --> 00:08:28,411 Here is the card for the place name, the toponym, having its own URI. 128 00:08:29,943 --> 00:08:33,738 Sorry, it seems that it's not translated into the English list. 129 00:08:34,432 --> 00:08:39,151 So, multilinguality is not covering the whole project. 130 00:08:40,167 --> 00:08:42,523 Okay, we come to the Finnish Names Archive. 131 00:08:42,523 --> 00:08:46,234 This is a project by the Institute for the Languages of Finland, 132 00:08:46,234 --> 00:08:50,456 and these represent not the places, not the place names, 133 00:08:50,456 --> 00:08:52,603 but they are actually sources for those. 134 00:08:52,603 --> 00:08:57,123 So, these are three million field notes of place names, 135 00:08:57,723 --> 00:08:59,529 and it is a Wikibase project. 136 00:08:59,529 --> 00:09:03,325 They are in a Wikibase, mainly in Finnish, some in Swedish. 137 00:09:03,325 --> 00:09:08,111 An outstanding collection of Saami names, which we are very interested in. 138 00:09:08,111 --> 00:09:10,141 And they are licensed CC BY. 139 00:09:10,380 --> 00:09:14,850 And that is also a challenge from the Wikidata point of view. 140 00:09:14,850 --> 00:09:17,640 But if there was a Finnish local Wikibase, 141 00:09:17,640 --> 00:09:22,632 we might be able to first work on them in that project. 142 00:09:23,034 --> 00:09:25,343 So, here's a screenshot of that, 143 00:09:26,443 --> 00:09:31,323 showing that there's information about the place, the maps-- 144 00:09:31,323 --> 00:09:35,227 the maps that the collectors initially use, 145 00:09:35,227 --> 00:09:40,713 and the card that they produce of the information they collected. 146 00:09:41,455 --> 00:09:46,416 So, here's one of those cards 147 00:09:46,416 --> 00:09:48,736 broken down into data 148 00:09:48,736 --> 00:09:50,676 that is included in them. 
149 00:09:51,166 --> 00:09:53,751 So, then there's this linked data project 150 00:09:53,751 --> 00:09:56,336 by the Helsinki Digital Humanities Lab 151 00:09:56,336 --> 00:09:58,256 and the Semantic Computing-- 152 00:09:58,256 --> 00:10:01,446 computing group of Aalto University-- 153 00:10:01,446 --> 00:10:06,525 and together with this Institute for the Languages of Finland-- 154 00:10:06,525 --> 00:10:07,994 the Names Sampo. 155 00:10:07,994 --> 00:10:11,024 And this is an aggregated research interface 156 00:10:11,024 --> 00:10:13,503 to several place name sources. 157 00:10:13,503 --> 00:10:17,704 Here you can see that many of the sources are out there on the left, 158 00:10:17,704 --> 00:10:20,763 and then, you can make different kinds of visualizations 159 00:10:20,763 --> 00:10:22,653 based on this data. 160 00:10:22,653 --> 00:10:24,438 And, yeah. 161 00:10:25,289 --> 00:10:30,603 So, I've been bringing up this idea of modeling for a local Wikibase 162 00:10:30,603 --> 00:10:32,693 that we could do with this data. 163 00:10:32,693 --> 00:10:36,580 But when we enter these modeling questions, 164 00:10:36,580 --> 00:10:37,770 how do we model? 165 00:10:37,770 --> 00:10:41,589 There are different ways, different traditions in each of these. 166 00:10:45,682 --> 00:10:50,360 And the good thing about it is it could also serve minority languages 167 00:10:50,360 --> 00:10:52,475 with very little effort. 168 00:10:53,243 --> 00:10:57,179 Okay. So, here we have the two basic options: 169 00:10:57,179 --> 00:11:01,660 the SAPO model, which is the Finnish Space-Time Ontology, 170 00:11:02,841 --> 00:11:04,421 and the Wikidata model. 171 00:11:04,421 --> 00:11:07,909 Here you can see that Wikidata items tend to zero. 172 00:11:07,909 --> 00:11:12,871 Ideally, they remain the same with the changing properties. 
173 00:11:12,871 --> 00:11:16,909 Whereas, in the SAPO model, these items become new 174 00:11:16,909 --> 00:11:20,399 when there is a change, such as area change and name change. 175 00:11:21,179 --> 00:11:26,219 So here, come back to this division 176 00:11:26,219 --> 00:11:31,719 between these three different dimensions of places, place names. 177 00:11:32,099 --> 00:11:37,659 So, should we make these place names into entities or properties? 178 00:11:37,659 --> 00:11:39,248 Wikidata uses properties, 179 00:11:39,248 --> 00:11:43,098 whereas this land survey project has entities. 180 00:11:43,838 --> 00:11:46,177 Or should we make them into lexemes? 181 00:11:46,177 --> 00:11:51,426 Wikidata has chosen to work with properties, 182 00:11:51,426 --> 00:11:54,956 textual properties for place names over lexemes. 183 00:11:55,567 --> 00:11:57,818 I'm sorry, the other way around. 184 00:11:57,818 --> 00:11:59,631 So, the names are... 185 00:12:03,056 --> 00:12:04,941 properties, not lexemes. 186 00:12:05,874 --> 00:12:06,877 Right. 187 00:12:07,165 --> 00:12:11,132 And maybe the shortcoming of the Wikibase 188 00:12:11,132 --> 00:12:16,340 is the lack of geographical shapes inside that-- 189 00:12:16,340 --> 00:12:20,958 like in the basic setup of it, 190 00:12:20,958 --> 00:12:24,748 so one would have to add more technology into the stack 191 00:12:24,748 --> 00:12:29,688 to be able to use local geographic shapes. 192 00:12:29,688 --> 00:12:31,823 And a federation is really needed 193 00:12:31,823 --> 00:12:38,168 to be able to take advantage of the Wikidata corpus. 194 00:12:38,648 --> 00:12:43,052 So, I'm done already. Thank you. 195 00:12:43,616 --> 00:12:45,827 (applause) 196 00:13:01,255 --> 00:13:02,514 Okay. 197 00:13:03,274 --> 00:13:05,011 (speaking in Maori) 198 00:13:05,011 --> 00:13:07,655 Welcome, everyone. My name is Mike Dickison. 199 00:13:08,375 --> 00:13:10,149 And for a year, 200 00:13:10,149 --> 00:13:13,075 I was New Zealand Wikipedian at Large. 
201 00:13:13,935 --> 00:13:16,935 You might wonder what a Wikipedian at Large is. 202 00:13:17,856 --> 00:13:21,875 Because if you actually look out for it, there is no such thing, as we can see. 203 00:13:22,735 --> 00:13:25,855 It's a term that I made up in the grant proposal, 204 00:13:26,153 --> 00:13:29,003 which the Foundation seemed to like very much. 205 00:13:29,983 --> 00:13:31,533 And so, we ran with it. 206 00:13:32,303 --> 00:13:36,633 So, for a year, I went through 35 different institutions, 207 00:13:37,053 --> 00:13:41,053 residencies in most of them, running training sessions, 208 00:13:41,493 --> 00:13:44,363 organizing public events, and trying to develop 209 00:13:44,363 --> 00:13:47,230 a Wikimedia strategy for each one. 210 00:13:47,998 --> 00:13:49,498 It was a very interesting experience, 211 00:13:49,498 --> 00:13:53,267 and you encounter a wide range of different projects and people. 212 00:13:53,267 --> 00:13:58,211 And I wanted to try and talk through some of the different projects 213 00:13:58,211 --> 00:14:00,345 that dealt with Wikidata 214 00:14:00,872 --> 00:14:05,171 in interesting or, perhaps, illuminating ways, 215 00:14:05,171 --> 00:14:07,591 that might be useful for folks to discuss. 216 00:14:08,561 --> 00:14:11,961 The project was initially a Wikipedia project by the name, 217 00:14:11,961 --> 00:14:14,651 simply because that was what people were familiar with, 218 00:14:15,281 --> 00:14:18,360 and so we organized multiple different events 219 00:14:18,360 --> 00:14:23,135 at very traditional edit-a-thons, gender gap work, and so forth. 220 00:14:24,607 --> 00:14:26,752 [And a bunch you can see] [inaudible], 221 00:14:27,105 --> 00:14:30,812 and a bunch of very successful new editors recruited, and so forth. 222 00:14:31,754 --> 00:14:34,454 We did bulk uploads into Commons. 
223 00:14:35,454 --> 00:14:41,246 In this case, there was a collection of over 1,000 original artworks 224 00:14:41,246 --> 00:14:46,047 by an entomological illustrator, Des Helmore, 225 00:14:46,047 --> 00:14:47,927 which had been sitting on a hard drive, 226 00:14:47,927 --> 00:14:50,357 [lacking] research for ten years, 227 00:14:50,357 --> 00:14:52,322 and we were able to get clearance to release those 228 00:14:52,322 --> 00:14:54,245 all under CC BY license. 229 00:14:54,245 --> 00:14:57,963 So, easy wins to show to people there. 230 00:14:57,963 --> 00:15:01,095 Everyone can understand lots of pictures of beetles. 231 00:15:01,095 --> 00:15:06,681 Everyone can understand workshops devoted to fixing the gender gap. 232 00:15:07,250 --> 00:15:10,251 But Wikidata is much more difficult to sell 233 00:15:10,251 --> 00:15:12,280 to people in the GLAM sector, 234 00:15:12,280 --> 00:15:15,095 or anyone outside of our particular movement. 235 00:15:16,107 --> 00:15:19,717 So, I began to realize that Wikidata 236 00:15:19,717 --> 00:15:22,634 was going to be a more and more important part 237 00:15:22,634 --> 00:15:25,883 of the Wikipedian at Large projects. 238 00:15:25,883 --> 00:15:30,472 So, as we went through, it became a larger and larger component 239 00:15:30,472 --> 00:15:31,849 of what I was doing. 240 00:15:31,849 --> 00:15:36,350 And I began to try and teach myself more about Wikidata as well, 241 00:15:36,800 --> 00:15:39,515 because I was beginning to see how important it was. 242 00:15:40,287 --> 00:15:41,989 So, this one project-- 243 00:15:41,989 --> 00:15:46,325 the kakapo is a native New Zealand flightless parrot. 244 00:15:48,096 --> 00:15:51,335 We worked with the Department of Conservation, 245 00:15:51,335 --> 00:15:54,299 whose job is to save this species from extinction, 246 00:15:54,299 --> 00:15:55,643 and pitched the idea, 247 00:15:55,643 --> 00:15:59,253 "What if we put every single kakapo into Wikidata?" 
248 00:16:01,221 --> 00:16:02,701 And that may seem ridiculous, 249 00:16:02,701 --> 00:16:05,580 but it's actually a perfectly doable project. 250 00:16:06,621 --> 00:16:08,427 A few of them are in there already. 251 00:16:09,100 --> 00:16:11,601 A key thing to notice here is there are not many kakapos. 252 00:16:11,615 --> 00:16:13,245 So, it's a manageable task. 253 00:16:13,245 --> 00:16:16,656 There were 148 when I started, and then one died. 254 00:16:16,935 --> 00:16:20,995 And they've just had a great breeding season up to 213. 255 00:16:21,765 --> 00:16:25,045 This is great. This is the most kakapo there have been for over 50 years. 256 00:16:25,505 --> 00:16:28,260 So, this was also a big deal. 257 00:16:28,260 --> 00:16:30,725 This was on the news every day in New Zealand. 258 00:16:31,285 --> 00:16:33,224 Each new one that was born-- 259 00:16:33,224 --> 00:16:34,414 (man) In the New York Times. 260 00:16:34,414 --> 00:16:35,673 (Mike) Did it? Oh, lovely. 261 00:16:35,673 --> 00:16:38,522 Yeah, this was national news. Everyone likes these birds. 262 00:16:39,002 --> 00:16:40,663 But something interesting about them 263 00:16:40,663 --> 00:16:43,932 is because unlike species that are more populous, 264 00:16:43,932 --> 00:16:47,822 every single kakapo is named, has a unique name 265 00:16:47,822 --> 00:16:49,817 and a unique ID number. 266 00:16:49,817 --> 00:16:52,442 And often has good biographical data 267 00:16:52,442 --> 00:16:54,672 about where and when they were born, 268 00:16:54,672 --> 00:16:56,972 were hatched, who their father and mother was, 269 00:16:56,972 --> 00:16:58,713 when they died, if they died. 270 00:16:58,713 --> 00:17:01,352 So, there is, in fact, a Department of Conservation database 271 00:17:01,352 --> 00:17:02,882 of all this information. 272 00:17:02,882 --> 00:17:06,723 And one of the most famous kakapos, of course, is Sirocco, 273 00:17:06,723 --> 00:17:09,726 who you can see is named after a wind, was born there. 
274 00:17:09,726 --> 00:17:13,225 Sirocco has a Twitter account, 275 00:17:13,705 --> 00:17:15,927 which Wikidata had some problems with, 276 00:17:15,927 --> 00:17:18,562 because, apparently, they just can't have Twitter accounts. 277 00:17:18,562 --> 00:17:20,342 I don't know about that. 278 00:17:21,121 --> 00:17:23,456 He's even featured on an album cover, and so forth. 279 00:17:23,456 --> 00:17:25,716 So there are multiple properties of this, 280 00:17:25,716 --> 00:17:28,258 probably one of the most famous individual kakapo. 281 00:17:28,258 --> 00:17:30,337 So, I pitched to the Department of Conservation, 282 00:17:30,337 --> 00:17:33,245 "Why don't we try and do this with every single one?" 283 00:17:33,245 --> 00:17:37,665 And so, they had to think about how much of the biographical data 284 00:17:37,665 --> 00:17:39,365 could be made public. 285 00:17:39,365 --> 00:17:41,225 And they come up with a short list. 286 00:17:41,225 --> 00:17:46,644 And now we've got, I think, 212, 210--I think a couple died-- 287 00:17:46,644 --> 00:17:50,703 living kakapo that are all candidates now. 288 00:17:50,703 --> 00:17:52,933 And they only get a name when they fledge. 289 00:17:52,933 --> 00:17:56,172 They have a code number until that while they're still babies. 290 00:17:56,186 --> 00:17:58,227 So, when we've got the full-fledged crop, 291 00:17:58,227 --> 00:18:01,806 we're going to create a complete Wikidata-- 292 00:18:01,806 --> 00:18:04,225 the entire species will be in Wikidata. 293 00:18:04,586 --> 00:18:06,605 But we need to come up with a property for DOC ID-- 294 00:18:06,605 --> 00:18:08,875 I actually would like to talk with folks about that. 
295 00:18:08,875 --> 00:18:11,266 Should we be using a very specific ID, 296 00:18:11,266 --> 00:18:13,136 or should we be coming up with an ID 297 00:18:13,136 --> 00:18:17,665 that would work for all individual birds or plants or animals 298 00:18:17,665 --> 00:18:21,965 that have been tagged in any scientific research project? 299 00:18:21,965 --> 00:18:23,795 It's a good question. 300 00:18:25,105 --> 00:18:27,465 Second project was Christchurch Art Gallery. 301 00:18:28,225 --> 00:18:31,523 There are very few paintings of Colin McCahon, 302 00:18:31,523 --> 00:18:33,963 New Zealand's most famous artist in existence. 303 00:18:33,963 --> 00:18:36,704 This is a drawing he did for the New Zealand School Journal, 304 00:18:36,704 --> 00:18:38,424 which was government-funded at the time. 305 00:18:38,424 --> 00:18:40,704 So, it's actually in Archives New Zealand 306 00:18:40,704 --> 00:18:42,294 who own the copyright for that. 307 00:18:42,294 --> 00:18:44,333 This is a very unusual situation. 308 00:18:45,014 --> 00:18:47,073 So, I worked with Christchurch Art Gallery 309 00:18:47,073 --> 00:18:48,993 who, along with Auckland Art Gallery, 310 00:18:48,993 --> 00:18:52,954 maintain a site called Find New Zealand Artists. 311 00:18:52,954 --> 00:18:55,654 The job of which is to keep track of the holdings-- 312 00:18:55,654 --> 00:18:58,403 every institution that has holdings of the New Zealand artist. 313 00:18:58,403 --> 00:19:03,163 So, about 18,000 different artists in their database, 314 00:19:03,163 --> 00:19:05,517 and most with very little information at all. 315 00:19:06,233 --> 00:19:08,992 So, we did a standard sort of Mix'n'Match. 316 00:19:08,992 --> 00:19:13,673 We did an export of the ones that had at least a birth date, 317 00:19:13,673 --> 00:19:17,545 or a death date, or a place of birth, or a place of death. 318 00:19:17,545 --> 00:19:20,614 So, that's not restricting it very much. 
319 00:19:20,614 --> 00:19:23,484 And even then, we were not able to match quite a few, 320 00:19:23,484 --> 00:19:25,954 but we've got about 1,500 now 321 00:19:25,954 --> 00:19:28,603 that are matched to known artists in Wikidata, 322 00:19:28,603 --> 00:19:30,123 which is nice. 323 00:19:30,123 --> 00:19:31,783 But what was appealing to them-- 324 00:19:31,783 --> 00:19:33,523 this is their website, 325 00:19:33,523 --> 00:19:39,213 which really just maintains the holdings links there. 326 00:19:39,213 --> 00:19:44,523 But this biographical data, which they create by hand, currently, 327 00:19:44,523 --> 00:19:46,063 for every single artist. 328 00:19:46,063 --> 00:19:48,803 And the act of exporting and putting into Mix'n'Match 329 00:19:48,803 --> 00:19:52,363 exposed numerous typos and mistakes and such 330 00:19:52,363 --> 00:19:53,723 that they haven't noticed. 331 00:19:53,723 --> 00:19:56,123 And it's only when you start running things through [Excel], 332 00:19:56,123 --> 00:19:57,272 these things show up. 333 00:19:57,272 --> 00:20:01,720 And the value of Wikidata was suddenly conveyed to them 334 00:20:01,720 --> 00:20:05,527 when I said, "You can just suck in that information from Wikidata." 335 00:20:06,548 --> 00:20:09,507 And that made them sit up straight. 336 00:20:09,507 --> 00:20:11,748 So this, I think, is one of the selling points. 337 00:20:11,748 --> 00:20:14,907 When you have this carefully hand-curated website 338 00:20:14,907 --> 00:20:19,344 with 18,000 entries, full of mistakes, and tell them there's another way, 339 00:20:19,344 --> 00:20:20,558 that they can get other people 340 00:20:20,558 --> 00:20:23,192 to do some of this fact-checking and correction for them-- 341 00:20:23,192 --> 00:20:24,813 that's when it sinks home. 
342 00:20:25,143 --> 00:20:27,293 And then announced I was pitching the idea 343 00:20:27,293 --> 00:20:30,313 that they "Wikidatafy" this entire history book 344 00:20:30,313 --> 00:20:33,333 of the New Zealand artists in Christchurch in the '30s, 345 00:20:33,333 --> 00:20:36,833 and run through--just published-- and run through every single person, 346 00:20:36,833 --> 00:20:39,453 connection, place, exhibition, and such. 347 00:20:39,453 --> 00:20:43,103 But it's a manageable sized project, and they're very excited by this. 348 00:20:44,303 --> 00:20:46,843 And thirdly, I wanted to show you Maori Subject Headings. 349 00:20:46,843 --> 00:20:50,811 A waka is a Maori name for a particular kind of canoe, 350 00:20:50,811 --> 00:20:52,732 a war canoe. 351 00:20:52,732 --> 00:20:55,952 So, in the National Library of New Zealand, 352 00:20:55,952 --> 00:20:58,530 there's a listing for waka, because the National Library 353 00:20:58,530 --> 00:21:02,805 actually has its own dictionary of Maori Subject Headings, 354 00:21:03,299 --> 00:21:04,474 in the Maori language. 355 00:21:04,474 --> 00:21:06,475 So, there it defines a waka, 356 00:21:07,175 --> 00:21:09,512 in Maori and English. 357 00:21:10,182 --> 00:21:12,372 But it also has a whole lot of narrower terms, 358 00:21:12,372 --> 00:21:14,222 you can see there on the side there. 359 00:21:14,222 --> 00:21:16,062 A typical one would be taurapa. 360 00:21:16,237 --> 00:21:19,774 And a definition first in Maori, and then in English. 361 00:21:19,774 --> 00:21:22,249 It's the carved sternpost that you can see there. 362 00:21:22,695 --> 00:21:24,482 And in English, you would say "sternpost," 363 00:21:24,482 --> 00:21:26,959 but you can't use the word "sternpost" for taurapa, 364 00:21:26,959 --> 00:21:31,054 because taurapa only works for particular kinds of war canoes. 365 00:21:31,420 --> 00:21:34,460 So, there's no English word equivalent for that. 
366 00:21:35,108 --> 00:21:37,909 And I suddenly realized that here is an entire ontology 367 00:21:37,909 --> 00:21:42,177 of cultural-specific terms that have been very carefully worked out 368 00:21:42,177 --> 00:21:45,043 and verified by the National Library with Maori, 369 00:21:45,043 --> 00:21:49,733 constantly being added to and improved with definitions, with descriptions, 370 00:21:49,733 --> 00:21:51,803 in both English and Maori. 371 00:21:51,803 --> 00:21:52,956 Really exciting. 372 00:21:52,956 --> 00:21:56,228 I suddenly thought we could put this whole lot into Wikidata-- 373 00:21:56,228 --> 00:22:00,596 Maori first, and then translated into English, as required. 374 00:22:00,596 --> 00:22:02,291 Be a nice change, wouldn't it! 375 00:22:03,081 --> 00:22:05,046 And here's the copyright licensing. 376 00:22:05,046 --> 00:22:08,726 Unfortunately, NonCommercial-NoDerivs. 377 00:22:10,346 --> 00:22:12,346 So now I have to start the conversation with them 378 00:22:12,346 --> 00:22:14,524 about why did they pick that license. 379 00:22:15,675 --> 00:22:19,970 And possibly because they only got [buy in] from Maori, 380 00:22:19,970 --> 00:22:22,679 who agreed to sit down and [inaudible] this stuff 381 00:22:22,679 --> 00:22:24,039 if there was a guarantee 382 00:22:24,039 --> 00:22:27,339 that none of this information could be used for commercial purposes. 383 00:22:27,920 --> 00:22:31,999 So, that's one of the frustrating aspects of the task 384 00:22:31,999 --> 00:22:34,238 is coming up against these sorts of restrictions. 385 00:22:34,238 --> 00:22:37,019 So, those are the three things I wanted to put out in front 386 00:22:37,019 --> 00:22:38,379 and sparking discussion. 
387 00:22:38,379 --> 00:22:40,878 Putting an entire species into Wikidata, 388 00:22:40,878 --> 00:22:44,107 what it takes to actually change an art gallery curator's mind 389 00:22:44,107 --> 00:22:46,078 about the value of Wikidata, 390 00:22:46,078 --> 00:22:49,838 and what do we do when we would see a complete ontology 391 00:22:49,838 --> 00:22:52,477 in another language that, unfortunately, has been slapped 392 00:22:52,477 --> 00:22:55,697 with a restrictive Creative Commons license. 393 00:22:55,697 --> 00:22:56,997 Thank you. 394 00:22:56,997 --> 00:22:58,737 (applause) 395 00:23:11,412 --> 00:23:14,077 Hello. My name is Joachim Neubert. 396 00:23:14,077 --> 00:23:16,472 I'm working for the ZBW, 397 00:23:17,522 --> 00:23:20,947 that is, Information Center for Economics in Hamburg, 398 00:23:21,407 --> 00:23:23,796 as a scientific software developer. 399 00:23:24,726 --> 00:23:31,108 And one of my tasks last year was preparing a data donation to Wikidata. 400 00:23:31,878 --> 00:23:37,193 And I want to give a report on our first experiences 401 00:23:37,613 --> 00:23:43,259 from donating metadata from the 20th-Century Press Archives. 402 00:23:46,463 --> 00:23:48,299 To our best knowledge, 403 00:23:48,299 --> 00:23:52,678 this is the largest public press archive in the world. 404 00:23:54,018 --> 00:23:59,158 It has been collected between 1908 and 2005, 405 00:24:01,008 --> 00:24:04,244 and has been drawn from 406 00:24:05,174 --> 00:24:09,272 more than 1,500 newspapers and periodicals 407 00:24:09,272 --> 00:24:13,333 from Germany, and also internationally. 408 00:24:14,651 --> 00:24:18,841 And it has covered everything which could be of interest 409 00:24:18,841 --> 00:24:22,820 for the Hamburg, 410 00:24:25,870 --> 00:24:28,030 the Hamburg businesspeople 411 00:24:28,030 --> 00:24:32,410 who wanted to expand over the world. 
412 00:24:34,611 --> 00:24:39,350 As you can see, this material has been clipped from newspapers 413 00:24:39,350 --> 00:24:41,790 and put onto paper, 414 00:24:41,790 --> 00:24:44,731 and then collected in folders. 415 00:24:46,121 --> 00:24:50,451 Here you see a small corner of the Person's Archive, 416 00:24:51,255 --> 00:24:56,182 and, similarly, information has been collected on companies, 417 00:24:56,182 --> 00:24:59,762 on general topics, on wares, on everybody, 418 00:25:01,533 --> 00:25:05,557 on everything which could be interesting. 419 00:25:06,978 --> 00:25:11,074 These folders have been scanned 420 00:25:12,652 --> 00:25:15,868 up to roughly 1949 421 00:25:17,076 --> 00:25:23,123 by a DFG-funded project in 2004 to 2007. 422 00:25:24,268 --> 00:25:30,591 As a result, up to now, it was 25,000 thematic dossiers 423 00:25:31,727 --> 00:25:33,759 of this time. 424 00:25:33,771 --> 00:25:37,913 This contained about 2 million, or more than 2 million pages. 425 00:25:38,845 --> 00:25:41,522 And these are online. 426 00:25:43,633 --> 00:25:48,461 This is the application developed at that time by ZBW, 427 00:25:50,006 --> 00:25:54,341 which now looks a bit outdated, 428 00:25:55,031 --> 00:25:58,153 not so fancy, and what's more of a problem: 429 00:25:58,597 --> 00:26:04,350 it's an application which was built architecturally on Oracle, 430 00:26:04,350 --> 00:26:08,662 it was built on ColdFusion, it runs on Windows servers, 431 00:26:09,227 --> 00:26:14,992 so it's not very sustainable in the long term. 432 00:26:16,008 --> 00:26:19,274 And we have discussed should we migrate this 433 00:26:19,274 --> 00:26:22,755 to a more fancy linked data application, 434 00:26:23,931 --> 00:26:27,964 or should we take a radical step 435 00:26:27,964 --> 00:26:31,749 and put all this data in the open.
436 00:26:32,843 --> 00:26:37,416 We have assigned CC0 license to that data 437 00:26:37,416 --> 00:26:40,938 and, currently, moving some main-- 438 00:26:42,036 --> 00:26:46,463 access layer, some main discovery layer-- so it's a primary access layer 439 00:26:47,835 --> 00:26:50,587 to the open linked data web, 440 00:26:51,315 --> 00:26:56,881 where it actually makes most sense 441 00:26:56,881 --> 00:27:00,698 to put some metadata into Wikidata, 442 00:27:02,367 --> 00:27:06,781 and to make sure that all folders 443 00:27:07,594 --> 00:27:10,633 of the collections are linked to Wikidata, 444 00:27:11,485 --> 00:27:13,308 so they are findable, 445 00:27:14,240 --> 00:27:17,795 and that all metadata about these folders 446 00:27:18,444 --> 00:27:22,977 is also transferred to Wikidata. 447 00:27:23,344 --> 00:27:27,886 So it can be used there, and it can be enriched there, possibly. 448 00:27:28,780 --> 00:27:32,237 Corrections can be made to that data. 449 00:27:32,645 --> 00:27:38,894 What is still maintained by ZBW is, of course, the storage of the images, 450 00:27:39,947 --> 00:27:43,882 which we can't put in any way, 451 00:27:45,548 --> 00:27:47,326 or we can't give a license on that 452 00:27:47,326 --> 00:27:51,179 because this was owned by the original creators. 453 00:27:52,271 --> 00:27:54,954 But we make sure that they are accessible 454 00:27:56,500 --> 00:28:02,203 by some, again, metadata files via DFG Viewer 455 00:28:03,108 --> 00:28:06,108 in the future by IIIF manifests. 456 00:28:06,849 --> 00:28:11,050 And we will prepare some static landing pages 457 00:28:11,707 --> 00:28:18,333 which will serve as a data point of reference for Wikidata, 458 00:28:18,333 --> 00:28:22,596 as well as still making available data 459 00:28:22,600 --> 00:28:26,174 which doesn't fit well into Wikidata. 
460 00:28:31,253 --> 00:28:36,815 [For us] is migration and data donation to Wikidata 461 00:28:37,165 --> 00:28:40,633 with our custom infrastructure 462 00:28:40,633 --> 00:28:44,837 of SPARQL endpoint with that data, 463 00:28:45,887 --> 00:28:48,980 and we basically used federated queries 464 00:28:49,990 --> 00:28:53,834 between that endpoint and the Wikidata Query Service 465 00:28:53,834 --> 00:28:57,633 to create according statements 466 00:28:59,207 --> 00:29:02,107 through [eyes of] concatenated 467 00:29:02,107 --> 00:29:06,937 in SPARQL queries themselves, or transformed via a script, 468 00:29:07,907 --> 00:29:12,254 which also generated references for the statements. 469 00:29:12,742 --> 00:29:19,446 And then put that into QuickStatements of the code to use this online. 470 00:29:22,544 --> 00:29:24,088 So, this is what we get. 471 00:29:24,493 --> 00:29:28,669 It's not only simple things like birth dates, but, sorry-- 472 00:29:29,835 --> 00:29:34,998 but also complex statements 473 00:29:34,998 --> 00:29:39,787 about already existing items, 474 00:29:39,787 --> 00:29:44,790 like this person was a supervisory board member of said company 475 00:29:46,682 --> 00:29:48,905 during this period of time, 476 00:29:49,663 --> 00:29:56,696 and referenced for use in... 477 00:29:58,463 --> 00:30:01,864 in the scientific context. 478 00:30:07,763 --> 00:30:10,939 The first part of this data donation has been finished. 479 00:30:12,736 --> 00:30:17,201 The Person's Archive is completely linked to Wikidata. 480 00:30:18,333 --> 00:30:23,652 And this is also an information tool. 481 00:30:23,652 --> 00:30:27,360 A lot of items which have been before 482 00:30:27,360 --> 00:30:30,422 not had any external references. 483 00:30:31,278 --> 00:30:35,674 And we had about more than 6,000 statements, 484 00:30:36,201 --> 00:30:41,924 which are now sourced in this archive's metadata. 
485 00:30:45,288 --> 00:30:49,951 Well, this was the most easy part, 486 00:30:50,880 --> 00:30:54,785 because persons are easily identifiable in Wikidata. 487 00:30:56,494 --> 00:31:00,443 More than 90% already existed here, 488 00:31:00,443 --> 00:31:02,412 so we could link to that. 489 00:31:02,412 --> 00:31:06,486 We created some 100 items for these, 490 00:31:06,486 --> 00:31:08,807 for the ones which were missing. 491 00:31:09,296 --> 00:31:13,626 But now, we are working 492 00:31:13,626 --> 00:31:18,165 on the rest of the archive, 493 00:31:18,165 --> 00:31:20,432 particularly on the topics archive. 494 00:31:21,243 --> 00:31:26,677 Which means mapping a historic system for the organization of knowledge 495 00:31:26,677 --> 00:31:29,884 about the whole world, 496 00:31:29,884 --> 00:31:34,147 materialized as newspaper clippings to Wikidata. 497 00:31:36,305 --> 00:31:41,898 To give you a basic idea, the Countries and Topics archive 498 00:31:42,668 --> 00:31:48,773 is organized by a hierarchy of countries 499 00:31:48,773 --> 00:31:50,882 and other geographic entities, 500 00:31:52,499 --> 00:31:56,443 which is translated to English, which makes this more easy. 501 00:31:56,443 --> 00:32:01,861 And German deeply nested... 502 00:32:03,881 --> 00:32:08,064 deeply nested classification of topics. 503 00:32:08,064 --> 00:32:11,593 And this combination defines one... 504 00:32:13,032 --> 00:32:16,020 one folder. 505 00:32:16,020 --> 00:32:21,128 So, what we now want to do is to match this 506 00:32:21,128 --> 00:32:24,575 as a structure to Wikidata, and to bring the data in. 507 00:32:24,575 --> 00:32:29,338 And I want to invite you 508 00:32:29,338 --> 00:32:33,801 to join this really nice challenge 509 00:32:33,801 --> 00:32:36,272 in terms of knowledge organization. 510 00:32:37,739 --> 00:32:40,713 So, it's a WikiProject where this work is tracked, 511 00:32:40,713 --> 00:32:46,288 and you can follow this or participate in this. 
512 00:32:46,591 --> 00:32:48,908 And, yes, thank you very much. 513 00:32:49,639 --> 00:32:51,723 (applause) 514 00:33:03,999 --> 00:33:07,284 So, we're taking performing arts to Wikidata. 515 00:33:07,735 --> 00:33:11,930 And we're taking performing arts to the linked open data cloud, 516 00:33:11,930 --> 00:33:15,595 by building a linked open data ecosystem for the performing arts. 517 00:33:16,164 --> 00:33:21,068 And the question I'm trying to answer, 518 00:33:21,068 --> 00:33:24,463 and I hope you'll help me in answering the question, 519 00:33:24,463 --> 00:33:27,012 is which place for Wikidata in all that. 520 00:33:27,012 --> 00:33:31,316 But let me first start with my experiences 521 00:33:31,316 --> 00:33:33,963 which I made this year, 522 00:33:34,723 --> 00:33:37,564 the first half of the year, when I had the pleasure 523 00:33:37,564 --> 00:33:39,350 to work with CAPACOA, 524 00:33:39,350 --> 00:33:42,074 which is the Canadian Arts Presenting Association, 525 00:33:42,074 --> 00:33:47,408 which actually launched a project called Linked Digital Future Initiative, 526 00:33:47,831 --> 00:33:53,261 to actually get the entire arts sector in Canada to embrace linked open data. 527 00:33:53,441 --> 00:33:56,887 And they did that based on the observation 528 00:33:56,887 --> 00:33:59,042 that over the past five years, 529 00:33:59,731 --> 00:34:03,924 the [inaudible]-- the important topic within performing arts 530 00:34:03,924 --> 00:34:08,855 was the fact that metadata was not around in sufficient quality 531 00:34:08,855 --> 00:34:11,780 and not interlinked, not interoperable. 532 00:34:12,106 --> 00:34:16,498 And that was why some of the performances, 533 00:34:16,498 --> 00:34:19,542 some of the events are not so well findable 534 00:34:19,542 --> 00:34:24,777 by Google and by personal computer-based assistants, and so on.
535 00:34:25,989 --> 00:34:29,757 So, the vision we kind of developed together 536 00:34:29,757 --> 00:34:32,997 is that we want to have a knowledge base 537 00:34:34,013 --> 00:34:35,646 for many stakeholders at once. 538 00:34:35,646 --> 00:34:39,636 So we looked at the entire performing arts value network, 539 00:34:39,636 --> 00:34:42,073 we identified key stakeholders in there, 540 00:34:42,073 --> 00:34:46,545 we looked at the usage scenarios that we'd like to pursue, 541 00:34:47,719 --> 00:34:52,074 and we kind of mapped it to the whole architecture 542 00:34:52,074 --> 00:34:57,097 of such a knowledge base, or of the different platforms in there, 543 00:34:57,097 --> 00:34:59,535 which, obviously, is a distributed architecture, 544 00:34:59,535 --> 00:35:01,361 and not one big monolith. 545 00:35:02,499 --> 00:35:05,664 I'm just going to run through that quite quickly 546 00:35:05,664 --> 00:35:07,980 because we have ten minutes each. 547 00:35:09,035 --> 00:35:13,796 But I think we'll have plenty of time tonight or tomorrow to deepen that 548 00:35:13,796 --> 00:35:16,318 if anybody's interested in the details. 549 00:35:16,318 --> 00:35:19,116 So, we started from that Performing Arts Value Network, 550 00:35:19,116 --> 00:35:23,263 which, interestingly, was just published last year. 551 00:35:23,263 --> 00:35:27,691 So, we're lucky to be able to build on previous work, 552 00:35:27,691 --> 00:35:31,098 like you have the primary value chain of the performing arts in the middle, 553 00:35:31,098 --> 00:35:34,177 and various stakeholders around that. 554 00:35:34,177 --> 00:35:37,387 All in all, we identified 20 stakeholder groups, 555 00:35:37,387 --> 00:35:43,384 which then we kind of boiled down into seven larger categories. 556 00:35:43,395 --> 00:35:45,464 For each of the stakeholder groups,
557 00:35:45,464 --> 00:35:51,558 we kind of formulated what kind of needs 558 00:35:51,558 --> 00:35:54,718 they would have in terms of such an infrastructure, 559 00:35:54,718 --> 00:35:58,572 and what would they be able to achieve if the whole thing was interlinked 560 00:35:58,572 --> 00:36:02,062 and the data was publicly accessible. 561 00:36:02,637 --> 00:36:04,990 And so, you can see the types here, 562 00:36:04,990 --> 00:36:09,177 the different types are Production, then Presentation & Promotion, 563 00:36:09,177 --> 00:36:12,064 Coverage & Reuse, Live Audiences, 564 00:36:12,064 --> 00:36:13,852 Online Consumption, Heritage, 565 00:36:13,852 --> 00:36:15,959 Research & Education. 566 00:36:15,959 --> 00:36:18,917 And after kind of setting up a big table, 567 00:36:18,917 --> 00:36:21,275 of which you can see just the first part here, 568 00:36:21,275 --> 00:36:25,128 we kind of compared [over there], had a look at which type of data 569 00:36:25,128 --> 00:36:26,954 were actually used across the board 570 00:36:26,954 --> 00:36:31,248 by all different groups of stakeholders. 571 00:36:31,248 --> 00:36:36,586 And there's quite a large basis of data that is common to all of them, 572 00:36:36,586 --> 00:36:38,414 and that really is the area 573 00:36:38,414 --> 00:36:43,063 where it makes a lot of sense, actually, to cooperate and to keep that-- 574 00:36:43,063 --> 00:36:45,988 to maintain the data together. 575 00:36:47,602 --> 00:36:50,651 So, when talking about platform architecture, 576 00:36:50,651 --> 00:36:53,648 you can see that we have four layers here. 577 00:36:54,096 --> 00:36:56,448 At the bottom is the data layer. 578 00:36:56,448 --> 00:36:58,717 Of course, Wikidata plays a part in it, 579 00:36:58,717 --> 00:37:02,733 but also a lot of other databases, distributed databases 580 00:37:02,733 --> 00:37:07,769 that can expose data through SPARQL endpoints.
581 00:37:09,204 --> 00:37:13,106 The yellow part in the middle, that's the semantic layer. 582 00:37:13,106 --> 00:37:16,080 It's our common language to describe our things, 583 00:37:16,080 --> 00:37:21,834 to make statements about things around the performing arts, the ontology. 584 00:37:22,400 --> 00:37:25,243 Then we have an application layer 585 00:37:25,243 --> 00:37:30,551 that consists of various modules, for example, data analysis, 586 00:37:30,551 --> 00:37:34,613 data extraction-- so, how do you actually get unstructured data 587 00:37:34,613 --> 00:37:36,029 into structured data-- 588 00:37:36,029 --> 00:37:38,749 how can we support that by tools. 589 00:37:39,436 --> 00:37:42,478 Then, obviously, there's a visualization of data-- 590 00:37:42,478 --> 00:37:47,115 so if there are large quantities of data, you want to visualize it in some way. 591 00:37:47,801 --> 00:37:50,155 And on the top, you have the presentation layer, 592 00:37:50,155 --> 00:37:54,814 that's what the ordinary people are actually interacting with 593 00:37:54,814 --> 00:37:56,199 on a daily basis-- 594 00:37:56,199 --> 00:37:59,615 search engines, encyclopedias, cultural agendas, 595 00:37:59,615 --> 00:38:02,097 and a variety of other services. 596 00:38:03,395 --> 00:38:05,386 We're not starting from scratch. 597 00:38:05,386 --> 00:38:08,535 Some work has already been done in this area. 598 00:38:09,107 --> 00:38:13,043 I'll just cite a few examples from a project 599 00:38:13,043 --> 00:38:15,245 which I have been involved in. 600 00:38:15,245 --> 00:38:18,149 Some other stuff going on as well. 601 00:38:18,149 --> 00:38:21,195 And so, I started in this area 602 00:38:21,195 --> 00:38:24,476 with the Swiss Archive of the Performing Arts. 603 00:38:25,001 --> 00:38:27,795 [Until] building a Swiss Performing Arts database, 604 00:38:27,795 --> 00:38:31,046 we created the performing arts ontology, 605 00:38:31,046 --> 00:38:33,931 that's currently being implemented into RDF. 
606 00:38:34,701 --> 00:38:39,771 And there we have the database of like 60, 70 years 607 00:38:39,771 --> 00:38:43,313 of performance history in Switzerland. 608 00:38:43,313 --> 00:38:45,145 So, that's something that we can build on, 609 00:38:45,145 --> 00:38:48,999 and that's something that's been transformed into RDF. 610 00:38:49,968 --> 00:38:54,621 And there was a builder platform where this data can be accessed. 611 00:38:56,073 --> 00:39:01,658 Then we have done several ingests into Wikidata, 612 00:39:01,658 --> 00:39:02,877 partly from Switzerland, 613 00:39:02,877 --> 00:39:08,990 partly also from the performing arts institutes, 614 00:39:09,680 --> 00:39:12,357 for example, Bart Magnus was involved in that. 615 00:39:12,883 --> 00:39:15,078 He was the driving force behind that. 616 00:39:15,078 --> 00:39:17,223 There's also stuff from Wikimedia Commons, 617 00:39:17,223 --> 00:39:21,361 but not very well interlinked with all the rest of our metadata. 618 00:39:21,361 --> 00:39:25,097 And obviously, by doing this ingest, 619 00:39:25,097 --> 00:39:29,274 we also kind of started to implement parts of this Swiss data model 620 00:39:29,274 --> 00:39:31,345 into Wikidata. 621 00:39:32,767 --> 00:39:37,556 Then one of the Canadian implementation partners 622 00:39:37,556 --> 00:39:39,013 is Culture Creates. 623 00:39:39,013 --> 00:39:43,872 They're running a platform that actually scrapes information from theater websites, 624 00:39:43,872 --> 00:39:46,873 and inputs it into a knowledge graph, 625 00:39:48,293 --> 00:39:54,428 to then expose it to search engines and other search devices. 626 00:39:56,415 --> 00:40:03,027 And there again, we kind of had to implement and extend this ontology.
627 00:40:03,261 --> 00:40:08,163 And what you can see from the slide is that there are so many empty spaces, 628 00:40:08,163 --> 00:40:09,599 but there's also some overlap, 629 00:40:09,599 --> 00:40:13,456 and an important overlap, obviously, is the common shared language, 630 00:40:13,456 --> 00:40:18,693 which will help us actually interlink the various data sets. 631 00:40:20,759 --> 00:40:22,587 What is also important, obviously, 632 00:40:22,587 --> 00:40:26,404 is that we're using the same base registers and authority files. 633 00:40:26,406 --> 00:40:31,368 And this is a place where Wikidata plays an important role 634 00:40:31,368 --> 00:40:33,967 by kind of interlinking these. 635 00:40:34,619 --> 00:40:37,799 Now, I'd like to share the recommendations 636 00:40:37,799 --> 00:40:41,882 by the Linked Digital Future Initiative's Advisory Committee. 637 00:40:42,769 --> 00:40:45,169 At least the first two recommendations. 638 00:40:45,169 --> 00:40:47,930 So, for the Canadians, now it's absolutely crucial 639 00:40:47,930 --> 00:40:53,173 to kind of fill in their own Canadian performing arts knowledge graph, 640 00:40:53,173 --> 00:40:55,851 because unlike the Swiss Archive of the Performing Arts, 641 00:40:55,851 --> 00:40:59,389 they're not starting with an already existing database, 642 00:40:59,389 --> 00:41:01,906 but they're kind of creating it from scratch. 643 00:41:01,906 --> 00:41:04,468 And it's absolutely crucial to have data in there. 644 00:41:04,468 --> 00:41:09,024 And second, as you can see, comes in already Wikidata. 645 00:41:09,024 --> 00:41:12,342 Wikidata, by the Advisory Committee, 646 00:41:12,342 --> 00:41:17,859 has been seen as complementary to Artsdata.ca, this knowledge graph, 647 00:41:18,347 --> 00:41:21,474 and, therefore, efforts should be undertaken to contribute 648 00:41:21,474 --> 00:41:24,878 to its population with performing arts-related data.
649 00:41:25,813 --> 00:41:30,775 And that's what we're going to work on over the coming months and years, 650 00:41:30,775 --> 00:41:34,748 and that's also why I'm kind of on the lookout here 651 00:41:34,748 --> 00:41:38,644 to see who else will join that effort. 652 00:41:40,556 --> 00:41:44,942 So, right now, obviously, we're saying they're complementary. 653 00:41:44,942 --> 00:41:48,341 So, we have to think about the pluses and the minuses 654 00:41:48,341 --> 00:41:49,844 of each of the approaches. 655 00:41:49,844 --> 00:41:52,073 And you can see here a comparison 656 00:41:52,073 --> 00:41:56,120 between Wikidata and the Classical Linked Open Data approach. 657 00:41:56,887 --> 00:41:59,947 I would be happy to discuss that further with you guys, 658 00:41:59,947 --> 00:42:02,549 what your experiences are there. 659 00:42:02,814 --> 00:42:07,727 But, as I see it, Wikidata is a huge plus because it's a crowdsourcing platform, 660 00:42:07,727 --> 00:42:11,671 and it's easy to invite further parties to actually contribute. 661 00:42:11,683 --> 00:42:17,482 On the negative side, obviously, you get this problem of loss of control. 662 00:42:17,658 --> 00:42:22,764 Data owners have to give up control over their graphs, data quality, 663 00:42:22,764 --> 00:42:24,382 and completeness. 664 00:42:26,554 --> 00:42:31,096 It's harder to track on Wikidata than if you have it under your control. 665 00:42:31,493 --> 00:42:34,376 And the other strength of Wikidata 666 00:42:34,376 --> 00:42:39,617 is that it requires immediate integration into that worldwide graph.
667 00:42:39,617 --> 00:42:41,734 And you kind of just do it-- 668 00:42:42,544 --> 00:42:46,768 kind of reconcile step by step against other databases, 669 00:42:46,768 --> 00:42:49,528 which may also be seen by some as an advantage, 670 00:42:49,528 --> 00:42:53,914 but of course, if you're looking for integration and interoperability, 671 00:42:53,914 --> 00:42:56,792 Wikidata forces you to go for that from the beginning. 672 00:42:59,184 --> 00:43:03,157 And then, obviously, harmonizing data modeling practices 673 00:43:03,157 --> 00:43:05,552 is an issue in both cases. 674 00:43:06,039 --> 00:43:10,671 But it may seem, at the beginning, easier to do just in your own silo, 675 00:43:10,671 --> 00:43:13,356 because at some point, you're done with the task, 676 00:43:13,356 --> 00:43:16,693 and it would be an ongoing task on Wikidata. 677 00:43:18,280 --> 00:43:22,883 So, when it now comes to prioritizing the data to be ingested, 678 00:43:23,535 --> 00:43:28,395 these are like the rules I kind of go by at the moment. 679 00:43:30,055 --> 00:43:32,325 First of all, we'd like to ingest it 680 00:43:32,325 --> 00:43:36,191 where it's unclear who would be the natural authority in the given area. 681 00:43:36,191 --> 00:43:40,433 So that's definitely data that will be managed in a shared manner. 682 00:43:40,902 --> 00:43:44,391 And we'd like to ingest it where we see 683 00:43:44,391 --> 00:43:47,149 a high potential for crowdsourcing approaches. 684 00:43:47,149 --> 00:43:51,693 We'd like to ingest data where the data is likely to be reused 685 00:43:51,693 --> 00:43:53,965 in the context of Wikipedia.
686 00:43:54,813 --> 00:44:00,262 And there's also hope that some part of the international coordination 687 00:44:00,262 --> 00:44:04,364 around the whole data modeling, about the standardization, 688 00:44:04,364 --> 00:44:07,531 could actually take place directly on Wikidata, 689 00:44:07,531 --> 00:44:09,484 if it's not taking place elsewhere, 690 00:44:09,484 --> 00:44:12,305 because it kind of forces people to start interacting 691 00:44:12,305 --> 00:44:14,816 if they ingest data in the same part. 692 00:44:15,963 --> 00:44:22,168 And we'd like to focus now next on base registers and authority files 693 00:44:22,181 --> 00:44:26,085 because they kind of help us create the linkages 694 00:44:26,085 --> 00:44:29,010 between different data, and on controlled vocabularies 695 00:44:29,010 --> 00:44:32,833 as an extension of the existing ontology. 696 00:44:33,965 --> 00:44:35,994 So, just two more slides. 697 00:44:36,480 --> 00:44:40,978 The next steps will be that we're taking the Sum of All GLAMs approach 698 00:44:40,978 --> 00:44:42,888 to Wiki Loves Performing Arts. 699 00:44:42,888 --> 00:44:47,524 That means we're describing venues and organizations, 700 00:44:47,524 --> 00:44:51,106 and try to push the data to Wikipedia 701 00:44:51,106 --> 00:44:54,414 in the form of infoboxes and [bubble] templates. 702 00:44:54,414 --> 00:44:59,769 And the other one, the other project I'm going to pursue is a COST Action 703 00:45:00,336 --> 00:45:02,001 that we'll submit next year 704 00:45:03,140 --> 00:45:06,037 around that Linked Open Data Ecosystem for the Performing Arts. 705 00:45:06,037 --> 00:45:10,347 COST is a European program that supports networking activities, 706 00:45:10,347 --> 00:45:13,929 and the topics to be covered are listed here.
707 00:45:13,929 --> 00:45:16,404 Two of them, I have highlighted-- 708 00:45:16,404 --> 00:45:20,702 one of them is like the question of federation between Wikidata 709 00:45:20,702 --> 00:45:23,717 and the classical linked open data approaches. 710 00:45:24,368 --> 00:45:27,744 And the other one, I think, is very important also, 711 00:45:27,744 --> 00:45:30,528 where we have a huge potential still, 712 00:45:30,528 --> 00:45:35,683 is implementing international campaigns to supplement data on Wikidata. 713 00:45:37,627 --> 00:45:41,365 So, that's it. Thank you for your attention. 714 00:45:41,365 --> 00:45:45,762 Now, I would like to ask my colleagues up here. 715 00:45:47,086 --> 00:45:50,529 To the panel, maybe you'll get them microphones as well. 716 00:45:53,903 --> 00:45:55,682 And then I would like to... 717 00:45:57,473 --> 00:45:59,940 give you the chance to ask questions. 718 00:46:01,042 --> 00:46:05,185 And obviously, also ask my colleagues 719 00:46:05,753 --> 00:46:08,071 whether they have questions to each other. 720 00:46:12,049 --> 00:46:15,327 So, do we have maybe a question from the audience? 721 00:46:20,502 --> 00:46:22,758 (man) [inaudible] 722 00:46:23,587 --> 00:46:27,033 I would like to ask from each of you 723 00:46:27,033 --> 00:46:30,842 where would you draw the line, 724 00:46:30,842 --> 00:46:33,076 basically, how you define-- 725 00:46:33,076 --> 00:46:35,956 when do you need to run your own Wikibase, 726 00:46:35,956 --> 00:46:39,328 and what do you want to put on Wikidata? 727 00:46:39,328 --> 00:46:43,677 Like, is this a clear delineation of what is seen 728 00:46:43,677 --> 00:46:45,981 behind of putting it [into order.] 729 00:46:48,211 --> 00:46:51,484 I can answer first because I have the mic. 730 00:46:51,484 --> 00:46:56,955 So, I've been thinking that one of the issues is notability. 731 00:46:59,212 --> 00:47:02,084 I'm addressing that in a different project. 
732 00:47:02,084 --> 00:47:05,898 And I think licensing could be one, 733 00:47:05,898 --> 00:47:10,466 because you can apply your own terms in your own database, 734 00:47:10,466 --> 00:47:13,758 and then I think wherever it's possible. 735 00:47:14,284 --> 00:47:19,882 And then, the third one is just to have it as a sandbox, 736 00:47:19,882 --> 00:47:23,078 prepare it for ingestion into Wikidata. 737 00:47:23,078 --> 00:47:26,085 These are the three main things that I come up with now, 738 00:47:26,085 --> 00:47:28,554 but I can come up with more. 739 00:47:29,976 --> 00:47:32,369 For me, rights are always going to be an issue. 740 00:47:32,369 --> 00:47:36,686 So, if the National Library wanted to move towards Wikibase, 741 00:47:36,686 --> 00:47:39,740 that would enable them to continue to control the licensing 742 00:47:39,740 --> 00:47:42,539 for the work they've done with Maori language terms. 743 00:47:43,438 --> 00:47:46,483 The kakapo database only contains data 744 00:47:46,483 --> 00:47:49,977 that the Department of Conservation felt could be made public, 745 00:47:49,977 --> 00:47:52,739 but I suspect if they see it up and running, 746 00:47:52,739 --> 00:47:55,980 they might be tempted to use a private Wikibase 747 00:47:55,980 --> 00:47:58,128 to maintain their own database, 748 00:47:58,128 --> 00:48:01,214 simply because of some of the visualization tools 749 00:48:01,214 --> 00:48:03,567 that could be applied might be better 750 00:48:03,567 --> 00:48:07,417 than the sort of Excel spreadsheet system that they currently run. 751 00:48:12,337 --> 00:48:16,556 Well, I think this very much depends on the kind of data. 752 00:48:17,609 --> 00:48:22,359 We are, with the Press Archive, of course, in a quite lucky position, 753 00:48:22,359 --> 00:48:26,984 in that this was material which was published, 754 00:48:26,984 --> 00:48:29,829 it was published at the time, 755 00:48:30,153 --> 00:48:31,780 but it was expensive to publish. 
756 00:48:33,082 --> 00:48:36,234 So, this is quite easy. 757 00:48:36,234 --> 00:48:39,449 I think, also, projects-- 758 00:48:40,101 --> 00:48:42,027 and this is a typical project, 759 00:48:42,027 --> 00:48:45,726 so it was funded for some time, and then funding ended, 760 00:48:46,466 --> 00:48:51,516 and what happens with the data which is enclosed in some silo, 761 00:48:52,136 --> 00:48:55,106 and some software which will not run forever. 762 00:48:55,846 --> 00:48:59,436 And so, it makes absolute sense in my eyes. 763 00:48:59,896 --> 00:49:02,776 At the time, Wikidata wasn't around, but now it is, 764 00:49:03,376 --> 00:49:07,336 and it makes absolute sense for our project to early on 765 00:49:07,336 --> 00:49:12,732 discuss sustainability in the context of how could we put this 766 00:49:12,732 --> 00:49:16,617 into a larger ecosystem like Wikidata, 767 00:49:18,717 --> 00:49:21,408 and discuss this with the data community 768 00:49:21,408 --> 00:49:26,864 what is notable and what makes sense to add this to Wikidata, 769 00:49:26,864 --> 00:49:32,093 and what makes sense to keep this as a proprietary form. 770 00:49:32,093 --> 00:49:37,753 Maybe in a more simple form than sophisticated application, 771 00:49:37,753 --> 00:49:43,055 but make it discoverable and make it linked to the large data cloud 772 00:49:43,055 --> 00:49:46,032 instead of investing lots of money 773 00:49:46,032 --> 00:49:52,692 into some silo which will not sustain. 774 00:49:55,201 --> 00:50:00,121 Yeah, as I said before in the project I was presenting here, 775 00:50:00,121 --> 00:50:04,926 are dualities between Wikidata and classical linked open data approaches. 776 00:50:04,926 --> 00:50:07,928 So, it's not so much about setting up a private Wikibase. 
777 00:50:11,147 --> 00:50:14,504 Like one challenge we have had, and, of course, in Wikidata, 778 00:50:14,504 --> 00:50:17,710 is that when you ingest your own data there, 779 00:50:17,710 --> 00:50:20,341 you also have to do some housekeeping 780 00:50:20,744 --> 00:50:23,509 of people, of other people, actually. 781 00:50:24,043 --> 00:50:28,258 And they can put off people, [or it also means] that we will address it 782 00:50:28,258 --> 00:50:29,888 just step by step. 783 00:50:30,375 --> 00:50:33,466 So, there will be, at the moment, a database living-- 784 00:50:33,873 --> 00:50:35,581 in classical linked open data 785 00:50:35,581 --> 00:50:38,395 and we're starting to link it with Wikidata, 786 00:50:38,395 --> 00:50:40,993 and it's a continuous process to find out 787 00:50:41,805 --> 00:50:47,643 for which areas most of the data will be eventually on Wikidata, 788 00:50:48,168 --> 00:50:51,946 and for which areas it will actually live on other databases. 789 00:50:52,620 --> 00:50:56,645 Obviously, we'll have challenges regarding synchronization, 790 00:50:57,135 --> 00:50:58,589 as we probably all have, 791 00:50:58,589 --> 00:51:01,507 because that's the linked data field, 792 00:51:01,507 --> 00:51:04,826 where we still have to negotiate who we trust, 793 00:51:05,160 --> 00:51:08,720 who has authority about what. 794 00:51:13,830 --> 00:51:15,820 (assistant) Other questions? 795 00:51:23,981 --> 00:51:25,550 (woman) Thank you. 796 00:51:26,090 --> 00:51:31,030 So, fully agree with that issue of-- 797 00:51:34,425 --> 00:51:41,410 where to put the boundary between why do we put data on Wikidata, 798 00:51:43,044 --> 00:51:49,144 or why do we keep them, and create, manage, and maintain them 799 00:51:49,144 --> 00:51:53,104 in local databases and for what purposes.
800 00:51:53,778 --> 00:51:57,213 And I think that this is a large discussion 801 00:51:57,213 --> 00:52:02,383 that goes beyond just the excitement 802 00:52:02,383 --> 00:52:07,423 of putting data on Wikidata because it is public, 803 00:52:07,432 --> 00:52:10,762 because it serves humanity, because-- 804 00:52:11,031 --> 00:52:13,362 while there are two cool tools, 805 00:52:13,362 --> 00:52:18,132 and things are more complicated in real life, I think. 806 00:52:19,162 --> 00:52:24,102 Well, despite this, it's quite an interesting discussion. 807 00:52:24,435 --> 00:52:29,744 And then this is another issue, also, or another problem that is being discussed 808 00:52:29,744 --> 00:52:35,034 in this event in different panels. 809 00:52:35,775 --> 00:52:41,129 It is on one side, have your own database, 810 00:52:41,129 --> 00:52:43,194 whatever the technology is 811 00:52:43,194 --> 00:52:46,763 and publish things on Wikidata, 812 00:52:47,233 --> 00:52:51,166 or build your own system 813 00:52:51,166 --> 00:52:55,246 of creating and managing information 814 00:52:55,246 --> 00:52:58,131 on the Wikibase technology. 815 00:52:58,591 --> 00:53:04,281 And then, synchronize or whatever-- do federation or things, 816 00:53:04,281 --> 00:53:08,314 so it's a matter of technology that is used, 817 00:53:09,182 --> 00:53:14,796 and the fact that you use Wikidata just for publishing, 818 00:53:14,978 --> 00:53:18,637 or the infrastructure that is underneath Wikidata 819 00:53:18,637 --> 00:53:23,002 to create and manage your data. 820 00:53:27,116 --> 00:53:30,914 I mean, we had a discussion 821 00:53:30,914 --> 00:53:34,254 about the Wikibase panel, 822 00:53:34,254 --> 00:53:36,912 and there will be other discussions here, 823 00:53:36,912 --> 00:53:40,815 but things are on different levels, I think. 
824 00:53:41,626 --> 00:53:47,756 Maybe [you sort of get] to that discussion about Wikibase or Wikidata-- 825 00:53:48,930 --> 00:53:52,427 I think it's problematic that we are focusing so much 826 00:53:52,427 --> 00:53:56,158 on this Wikibase infrastructure, because there are other infrastructures, 827 00:53:56,158 --> 00:53:58,690 like in the area of performing arts. 828 00:53:59,810 --> 00:54:04,054 We have another complementary community, which is MusicBrainz 829 00:54:04,054 --> 00:54:08,954 that runs on their own platform that provides linked open data, 830 00:54:09,614 --> 00:54:12,692 and as I understand it, 831 00:54:14,160 --> 00:54:17,232 there's agreement within the Wikidata community 832 00:54:17,232 --> 00:54:19,731 that we're not going to double all their data-- 833 00:54:19,731 --> 00:54:24,237 we're not going to copy all their data, but we accept that they're complementary. 834 00:54:24,848 --> 00:54:29,678 So, what will happen when you start integrating this data in Wikipedia? 835 00:54:30,246 --> 00:54:31,907 Infoboxes, for example. 836 00:54:31,907 --> 00:54:35,952 Would we be able to pull that data directly from their SPARQL endpoint? 837 00:54:36,764 --> 00:54:39,603 Or would we be obliged to kind of copy all the data, 838 00:54:39,603 --> 00:54:42,225 and what kind of processes are involved in that? 839 00:54:42,225 --> 00:54:44,915 (woman) Discussions are open, I think, 840 00:54:44,915 --> 00:54:49,615 because within this event, you have both interested communities-- 841 00:54:49,615 --> 00:54:51,975 those that are interested in Wikibase, 842 00:54:51,975 --> 00:54:54,002 and those that are interested in Wikidata, 843 00:54:54,002 --> 00:54:56,282 and those who are interested in both. 844 00:54:56,282 --> 00:54:59,562 Yeah, but we're not going to oblige them to move to Wikibase. 845 00:55:00,162 --> 00:55:03,138 - (woman) Not necessarily. - MusicBrainz is not running on Wikibase. 
846 00:55:03,138 --> 00:55:06,802 (woman) No, I just wanted to say that you have separate problems, 847 00:55:06,802 --> 00:55:10,964 sometimes interrelated, sometimes not completely separated. 848 00:55:12,479 --> 00:55:16,573 And I had another question or remark 849 00:55:16,573 --> 00:55:22,013 regarding the management of hierarchies in controlled vocabularies, 850 00:55:22,013 --> 00:55:26,473 like thesaurus, like you in Finto. 851 00:55:27,703 --> 00:55:30,563 You do have the places 852 00:55:31,503 --> 00:55:34,956 in the Maori 853 00:55:36,418 --> 00:55:40,554 Subject Headings. 854 00:55:42,262 --> 00:55:48,068 Well, they have to deal with the management of concepts in hierarchy. 855 00:55:48,360 --> 00:55:52,320 What is your take, your opinion 856 00:55:52,320 --> 00:55:57,042 about the possibility of managing these controlled 857 00:55:58,850 --> 00:56:02,364 knowledge organization systems in Wikidata? 858 00:56:07,166 --> 00:56:10,169 I think in the case of Finto and YSO places, 859 00:56:11,499 --> 00:56:14,391 the repository will be a collection 860 00:56:14,391 --> 00:56:18,936 of several sources, eventually. 861 00:56:18,936 --> 00:56:21,613 So, it is in flux, anyway. 862 00:56:21,613 --> 00:56:24,528 So, we don't have to necessarily-- 863 00:56:24,528 --> 00:56:28,383 well, I don't represent the National Library, 864 00:56:28,383 --> 00:56:31,512 but in that possible project, 865 00:56:31,512 --> 00:56:35,711 we would not have to maintain an existing-- 866 00:56:35,711 --> 00:56:38,540 or fight with an existing structure. 867 00:56:38,540 --> 00:56:45,164 So, in that sense, it is an area open for exploration. 868 00:56:48,912 --> 00:56:52,272 The Maori Subject Headings seems to lend themselves ideally 869 00:56:52,272 --> 00:56:54,392 to Wikidata structure, 870 00:56:54,392 --> 00:56:56,961 but the licensing, of course, forbids that.
871 00:56:56,961 --> 00:56:59,491 I suspect that if the licensing were different 872 00:56:59,491 --> 00:57:01,511 and they were put into Wikidata, 873 00:57:01,511 --> 00:57:04,562 as soon as somebody decided they didn't like the hierarchy 874 00:57:04,562 --> 00:57:06,162 and started to change things, 875 00:57:06,162 --> 00:57:10,001 there would be an immediate outcry from people who worked very hard 876 00:57:10,001 --> 00:57:12,301 to create that structure 877 00:57:12,301 --> 00:57:15,641 and get the sign-off from various different Maori 878 00:57:15,641 --> 00:57:17,942 that was the current hierarchy. 879 00:57:18,382 --> 00:57:20,841 So, that's an issue to try and resolve. 880 00:57:23,812 --> 00:57:26,502 I think in terms of knowledge organization systems, 881 00:57:26,502 --> 00:57:28,116 they are all different. 882 00:57:28,116 --> 00:57:31,752 And I'm not sure if it would be a good idea 883 00:57:31,752 --> 00:57:36,855 to represent different hierarchies in Wikidata as such, 884 00:57:37,650 --> 00:57:42,101 but it maybe makes sense to think about overlays 885 00:57:42,941 --> 00:57:45,022 of the data. 886 00:57:45,431 --> 00:57:48,371 So, to do mappings on the content level. 887 00:57:49,091 --> 00:57:54,021 For example, as ZBW partnership Thesaurus for Economics. 888 00:57:55,420 --> 00:57:59,150 And this thesaurus has its own hierarchy, 889 00:57:59,680 --> 00:58:04,020 and, of course, it would be possible to project the hierarchy 890 00:58:04,461 --> 00:58:08,452 of this thesaurus into Wikidata concepts 891 00:58:08,452 --> 00:58:11,541 without actually storing this kind of structure 892 00:58:12,180 --> 00:58:14,840 as an alternative structure within Wikidata 893 00:58:14,840 --> 00:58:18,640 which would make a lot of confusion. 
894 00:58:18,640 --> 00:58:24,789 But I think we should think of Wikidata, also, as a pool of concepts 895 00:58:24,789 --> 00:58:29,651 which can be connected on layers which are outside, 896 00:58:30,264 --> 00:58:33,489 and which give another view of the world 897 00:58:33,489 --> 00:58:39,080 which is not necessarily to be within Wikidata. 898 00:58:45,775 --> 00:58:48,203 (assistant) Alright. Some other questions? 899 00:58:49,096 --> 00:58:51,527 Otherwise-- okay. 900 00:58:54,769 --> 00:58:57,781 (man 2) Joachim, I just wanted to follow up on that last point. 901 00:58:57,781 --> 00:59:01,064 So, these layers, as you picture it, 902 00:59:02,196 --> 00:59:04,143 they would be maintained externally 903 00:59:04,143 --> 00:59:07,404 and somehow integrated 904 00:59:08,964 --> 00:59:11,764 with Wikidata from the Wikidata side, 905 00:59:11,764 --> 00:59:17,143 or have you thought a bit further 906 00:59:17,143 --> 00:59:19,463 about how that might be managed? 907 00:59:22,351 --> 00:59:24,931 Actually, no, I have no-- 908 00:59:25,271 --> 00:59:30,361 I have done experiments with ZBW and Wikidata. 909 00:59:30,771 --> 00:59:33,132 I was [inaudible] here at Wikidata. 910 00:59:33,132 --> 00:59:38,837 But I think this is a whole new complex thing, 911 00:59:39,261 --> 00:59:46,210 and so, it's up to [discuss], [to give up a lot of control] 912 00:59:46,409 --> 00:59:47,908 to do such things. 913 00:59:47,908 --> 00:59:50,178 But it has to be figured out. 914 00:59:56,638 --> 00:59:57,959 Should we take one more? 915 00:59:57,959 --> 00:59:59,686 (man 3) Ah, great. 916 00:59:59,686 --> 01:00:02,628 I was just wondering about the kakapo project. 917 01:00:03,875 --> 01:00:05,000 Uh-hmm. 918 01:00:05,000 --> 01:00:10,805 (man 3) Okay. So, did you get any pushback from the Wikidata community 919 01:00:10,805 --> 01:00:14,636 about having individual animals out of those items? 920 01:00:15,576 --> 01:00:16,836 Not so far. 
921 01:00:16,836 --> 01:00:19,045 (man 3) Has anyone heard about this before? 922 01:00:19,045 --> 01:00:22,445 Is it "not so far" because no one has heard about it yet? 923 01:00:23,085 --> 01:00:26,095 There's been a small discussion for quite some time now-- 924 01:00:26,095 --> 01:00:29,235 those people interested in this sort of thing in Wikidata, 925 01:00:29,235 --> 01:00:32,215 and we all seem to think that it's a natural extension 926 01:00:32,215 --> 01:00:35,855 of getting individual Wikidata items to a famous racehorse 927 01:00:35,855 --> 01:00:39,755 or someone's cat, which-- that's modeled pretty well. 928 01:00:39,764 --> 01:00:44,444 I guess just the audacious thing is putting the entire species in there. 929 01:00:44,444 --> 01:00:48,113 But I think it's perfectly manageable. 930 01:00:48,113 --> 01:00:50,173 (man 3) Don't try it with cats and dogs. 931 01:00:50,173 --> 01:00:52,457 (laughter) 932 01:00:52,457 --> 01:00:54,337 (assistant) Okay. I think the time is finished. 933 01:00:54,337 --> 01:00:55,767 Thank you very much for attending. 934 01:00:55,767 --> 01:00:59,267 I think the speakers will be still open for questions in the break. 935 01:00:59,267 --> 01:01:00,797 And have fun. 936 01:01:00,797 --> 01:01:02,292 Thank you very much. 937 01:01:02,292 --> 01:01:04,047 (applause)