1 00:00:05,840 --> 00:00:07,310 Hello. 2 00:00:12,080 --> 00:00:15,130 The two of us are starting 3 00:00:15,130 --> 00:00:19,620 a level on a side-effect or side-project or whatever, 4 00:00:19,620 --> 00:00:22,680 something which is loosely connected to Wikidata, 5 00:00:22,680 --> 00:00:25,200 which is open data 6 00:00:25,200 --> 00:00:28,170 and we're glad to see you're here. 7 00:00:28,170 --> 00:00:29,260 I'm Alice Wiegand. 8 00:00:29,260 --> 00:00:34,350 I'm the project lead for open data in the municipality of Düsseldorf, 9 00:00:34,910 --> 00:00:38,840 and this is Knut Huhne, who is a student. 10 00:00:38,840 --> 00:00:40,400 You may introduce yourself. 11 00:00:42,170 --> 00:00:44,002 Yeah, I'm a software developer by day, 12 00:00:44,002 --> 00:00:47,260 and in my spare time I do a lot of work at Code for Germany, 13 00:00:47,260 --> 00:00:50,350 which is a community organization that I'll talk a bit about, 14 00:00:50,350 --> 00:00:54,270 and we try to build civic tech tools based on open data. 15 00:00:54,270 --> 00:00:57,170 Yeah, that's exactly what we need. 16 00:00:57,170 --> 00:01:01,660 And so let's see where we are [on this]. 17 00:01:03,330 --> 00:01:06,820 [inaudible] 18 00:01:06,820 --> 00:01:09,150 So if we talk about open government data, 19 00:01:09,150 --> 00:01:11,280 this is something where I think 20 00:01:11,910 --> 00:01:15,480 the entire world is much more forward 21 00:01:15,480 --> 00:01:18,970 than Europe and especially Germany is. 22 00:01:18,970 --> 00:01:22,950 But in Germany, where we both come from and live, 23 00:01:22,970 --> 00:01:27,730 this is getting some dynamics because laws are changing.
24 00:01:27,730 --> 00:01:32,970 And overall, we have just data which is used, produced, 25 00:01:32,970 --> 00:01:35,510 and cared 26 00:01:35,510 --> 00:01:40,180 and maintained by government, 27 00:01:41,060 --> 00:01:44,170 which is just a reliable data source, 28 00:01:44,170 --> 00:01:48,350 and it's official data with a high value, 29 00:01:48,350 --> 00:01:51,460 and it is sometimes really surprising to see 30 00:01:51,460 --> 00:01:55,970 what kind of data there is, openly kind of published. 31 00:01:55,970 --> 00:01:59,770 So this is, for example... 32 00:02:00,830 --> 00:02:02,820 I hope it opens soon. 33 00:02:02,820 --> 00:02:04,970 This, for example, is... 34 00:02:06,510 --> 00:02:10,480 it's the measure of radioactivity in kale. 35 00:02:10,480 --> 00:02:12,820 And I think it's surprising, 36 00:02:12,820 --> 00:02:16,840 I wonder why is it kale and not red cabbage? 37 00:02:16,840 --> 00:02:19,770 And I wonder why is this a fixed date? 38 00:02:19,770 --> 00:02:23,930 You know, 20th of November in 2013. 39 00:02:23,930 --> 00:02:26,730 And I wonder why is it that far away? 40 00:02:26,730 --> 00:02:30,840 What are we doing with radioactivity in kale today? 41 00:02:30,840 --> 00:02:31,840 I don't know. 42 00:02:31,840 --> 00:02:35,080 So you find a lot of these surprising things 43 00:02:35,080 --> 00:02:37,420 when you start to... 44 00:02:38,460 --> 00:02:41,040 What have I to do, do you know? 45 00:02:41,040 --> 00:02:44,880 ...when you start to look at open data in Germany. 46 00:02:49,210 --> 00:02:51,750 I'm confused with this computer. 47 00:02:51,750 --> 00:02:53,200 Oh, yes. Thanks. 48 00:02:53,820 --> 00:02:57,880 Yeah, and this data usually is up to date. 49 00:02:57,880 --> 00:03:00,530 Well, it should be, of course. 50 00:03:00,530 --> 00:03:05,220 As in all data, we have our gaps there. 
51 00:03:05,220 --> 00:03:11,980 And overall if I just look on the region I know best, 52 00:03:11,980 --> 00:03:16,810 we have 86 53 00:03:17,170 --> 00:03:22,020 of singular portals with open data within Germany, 54 00:03:22,020 --> 00:03:25,330 which is on municipality level, on the country level, 55 00:03:25,330 --> 00:03:28,570 on the federal country level, and on state level. 56 00:03:28,570 --> 00:03:33,570 And in Austria, it's 19; and in Switzerland, it's 6, 57 00:03:33,570 --> 00:03:35,770 and numbers are growing. 58 00:03:35,770 --> 00:03:38,770 So, of course, also, question is why are we all doing 59 00:03:38,770 --> 00:03:41,530 the same thing on different places? 60 00:03:42,220 --> 00:03:45,660 It doesn't seem to be that efficient, 61 00:03:45,660 --> 00:03:49,620 I'm not sure, but this is how our world today works. 62 00:03:49,620 --> 00:03:53,150 So now I find the right key, thanks. 63 00:03:54,020 --> 00:03:59,080 And there are a lot of challenges which we have to face 64 00:03:59,080 --> 00:04:04,420 and kind of a huge gap between wish and reality. 65 00:04:04,420 --> 00:04:07,860 So, after all, I do think there is a huge, 66 00:04:07,860 --> 00:04:12,930 you know, kind of [friendliness] 67 00:04:12,930 --> 00:04:15,530 between open data and Wikidata. 68 00:04:15,530 --> 00:04:18,110 It's all about essential data. 69 00:04:18,110 --> 00:04:24,600 It is about being as actual or being as up to date as possible. 70 00:04:24,600 --> 00:04:29,900 But in the end, when we look at the open data platforms 71 00:04:29,900 --> 00:04:32,330 in mostly Europe, 72 00:04:32,330 --> 00:04:35,530 we find incompatible licenses. 73 00:04:35,530 --> 00:04:41,420 So usually mainly municipalities 74 00:04:41,420 --> 00:04:43,663 choose a BY license, 75 00:04:43,663 --> 00:04:49,240 because they think it would be good to know where this data came from 76 00:04:49,240 --> 00:04:51,510 and to be named there. 
77 00:04:51,510 --> 00:04:54,480 And this is really a crazy thing. 78 00:04:54,480 --> 00:04:56,530 I looked at open data portals, 79 00:04:56,530 --> 00:05:01,550 and we have a portal in Düsseldorf for two years now 80 00:05:01,550 --> 00:05:06,730 and by design, we chose the CC0 license. 81 00:05:06,730 --> 00:05:09,770 And I found that open data in Zurich-- 82 00:05:09,770 --> 00:05:12,310 Okay, it's not Germany, but it's Zurich-- 83 00:05:12,310 --> 00:05:15,660 and they are doing a lot of cool stuff there as well. 84 00:05:15,660 --> 00:05:18,521 And they also use the CC0 license. 85 00:05:18,521 --> 00:05:25,950 But usually municipalities like CC BY licenses, sadly. 86 00:05:25,950 --> 00:05:29,310 And another thing we have to face 87 00:05:29,310 --> 00:05:34,800 is that, especially in municipalities, this kind of task to publish 88 00:05:34,800 --> 00:05:38,600 this internal data on a free and open license, 89 00:05:38,600 --> 00:05:40,470 on a platform, wherever, 90 00:05:40,470 --> 00:05:45,240 is just given to a person who usually does something else. 91 00:05:45,240 --> 00:05:48,330 So it's not, you know, a 100 percent task 92 00:05:48,330 --> 00:05:51,006 for this person to do, 93 00:05:51,006 --> 00:05:55,280 but something to do, you know, with all the other things. 94 00:05:57,620 --> 00:06:00,114 Overall, I think we can say 95 00:06:00,114 --> 00:06:06,310 that of course there are people who are really doing a great job. 96 00:06:06,310 --> 00:06:11,080 Usually, we don't find that level of expertise 97 00:06:11,080 --> 00:06:15,060 on data analysis and data management 98 00:06:15,060 --> 00:06:20,750 that we would need to really find high-quality data 99 00:06:20,750 --> 00:06:24,820 within the open data which comes from governments. 100 00:06:24,820 --> 00:06:27,734 And I think this is a problem, 101 00:06:27,734 --> 00:06:31,220 and I realized also that there's a language issue.
102 00:06:31,220 --> 00:06:35,420 So if I just think about putting my colleagues into this room, 103 00:06:35,420 --> 00:06:39,440 into the session we had just before, about data quality, 104 00:06:40,710 --> 00:06:43,910 it would be problematic to find a common language, 105 00:06:43,910 --> 00:06:48,730 to figure out how we can start to improve our data quality 106 00:06:48,730 --> 00:06:54,420 so that Wikidata's data quality is also improved. 107 00:06:54,420 --> 00:06:58,420 Another thing is that we have no standards 108 00:06:58,420 --> 00:07:02,040 in the name of ontologies, 109 00:07:02,040 --> 00:07:06,330 in the name of how we prepare data. 110 00:07:06,330 --> 00:07:09,060 There is a metadata standard, which is great, 111 00:07:09,060 --> 00:07:13,150 but this, after all, does not mean that we all do the same thing 112 00:07:13,150 --> 00:07:15,710 and that we find the same kind of data, 113 00:07:15,710 --> 00:07:18,500 just because it is named in the same way. 114 00:07:19,130 --> 00:07:22,110 But, overall, it's a lot of official data 115 00:07:22,110 --> 00:07:24,530 you can get from open data. 116 00:07:25,770 --> 00:07:31,400 I made an example here which is about street names, 117 00:07:31,400 --> 00:07:34,014 and usually you find a lot 118 00:07:34,014 --> 00:07:39,680 of different forms of street names. 119 00:07:39,680 --> 00:07:42,570 Sometimes something like the Karlsplatz 120 00:07:42,570 --> 00:07:47,310 is written with a C, or with a K, or separated, 121 00:07:47,310 --> 00:07:51,335 and sometimes this is also developing 122 00:07:51,335 --> 00:07:52,860 over the time. 123 00:07:52,860 --> 00:07:56,700 And in the end, there's just only one official name 124 00:07:56,700 --> 00:07:58,950 of a place or of a street, 125 00:07:58,950 --> 00:08:04,040 and it's the municipality which can give you that name.
126 00:08:04,040 --> 00:08:07,480 And this part, like a list of official street names 127 00:08:07,480 --> 00:08:10,370 is something which is regularly published 128 00:08:10,370 --> 00:08:14,660 by a lot of municipalities in their open data portals. 129 00:08:14,660 --> 00:08:19,892 And I think that at all is a good start to figure out 130 00:08:19,892 --> 00:08:24,460 what we can do with this in Wikidata as well. 131 00:08:24,460 --> 00:08:27,450 So this is my short introduction, 132 00:08:27,450 --> 00:08:32,370 and I'm happy to hear about community work with open data. 133 00:08:34,170 --> 00:08:37,910 Yeah, I thought I would just kind of give a quick introduction from the other side, 134 00:08:37,910 --> 00:08:41,080 of movement from the community side. 135 00:08:41,080 --> 00:08:44,240 So, as I said, I work in my spare time 136 00:08:44,240 --> 00:08:46,590 for an organization called Code for Germany. 137 00:08:47,110 --> 00:08:49,770 We've been running since about five years 138 00:08:49,770 --> 00:08:52,440 where we have labs, that is groups of people 139 00:08:52,440 --> 00:08:57,250 that meet once a week, some once a month in Germany 140 00:08:57,850 --> 00:09:00,150 in local, what we call labs. 141 00:09:00,150 --> 00:09:03,860 And we try to build tools that somehow make it easier 142 00:09:03,860 --> 00:09:05,930 for people to participate in politics, 143 00:09:05,930 --> 00:09:08,330 to get an understanding of the environment around them, 144 00:09:08,330 --> 00:09:11,420 to collect data about air pollution. 145 00:09:11,420 --> 00:09:14,200 And, of course, we'd like to use 146 00:09:14,200 --> 00:09:17,050 governmentally provided open data for that, 147 00:09:17,050 --> 00:09:20,570 but we've also realized that there's difficulties with that, 148 00:09:20,570 --> 00:09:24,230 that sometimes the data isn't there, it's under a difficult license, 149 00:09:24,870 --> 00:09:29,680 which is kind of how we found our way to Wikidata also, I think. 
150 00:09:29,680 --> 00:09:32,020 We also happened to meet in Berlin 151 00:09:32,020 --> 00:09:34,170 in the offices of Wikimedia Deutschland, 152 00:09:34,380 --> 00:09:38,240 so this kind of brought us very close to Wikidata. 153 00:09:38,980 --> 00:09:40,230 And I think it's cool to see 154 00:09:40,230 --> 00:09:44,420 that we're kind of strengthening the relationship 155 00:09:44,420 --> 00:09:49,110 between the Wikidata community in Germany and the Code for Germany community. 156 00:09:49,740 --> 00:09:52,820 We also would like to work even closer with the government, 157 00:09:52,820 --> 00:09:54,790 but talking about bridging gaps. 158 00:09:54,790 --> 00:09:59,880 I mean, there's very basic problems such as us meeting after we work 159 00:09:59,880 --> 00:10:03,730 and the people for the government wanting to meet when they work. 160 00:10:04,080 --> 00:10:08,020 So I think when we think about how these communities can work together, 161 00:10:08,020 --> 00:10:11,220 there's very mundane things, such as working times, 162 00:10:11,220 --> 00:10:13,370 that we need to keep in mind. 163 00:10:15,060 --> 00:10:20,220 So just a quick introduction to what we do at Code for Germany 164 00:10:20,220 --> 00:10:22,460 especially with regards to Wikidata. 165 00:10:22,920 --> 00:10:26,210 We've had a couple of hackathons now within the last years 166 00:10:26,210 --> 00:10:28,020 where people from the Wikidata community 167 00:10:28,020 --> 00:10:31,480 and the Code for Germany community 168 00:10:31,480 --> 00:10:33,984 kind of came together to meet 169 00:10:33,984 --> 00:10:38,200 and just spend a weekend to work on Wikidata. 170 00:10:38,200 --> 00:10:41,040 And we've done all kinds of different things. 171 00:10:42,020 --> 00:10:45,040 We've usually been very interested in political data, 172 00:10:45,040 --> 00:10:47,370 so we've been importing a lot of data 173 00:10:47,370 --> 00:10:50,570 regarding politicians and regarding elections. 
174 00:10:50,570 --> 00:10:54,260 We've thought about how to model election data in Wikidata a lot 175 00:10:54,260 --> 00:10:58,880 and we've also had a lot of people that built games with Wikidata. 176 00:10:59,400 --> 00:11:01,340 One of the nice examples for this 177 00:11:01,340 --> 00:11:04,640 would be the Wikidata card game, where you can put in any Q number 178 00:11:04,640 --> 00:11:07,260 and you get a nice trading card game. 179 00:11:07,625 --> 00:11:08,650 You might have seen that. 180 00:11:08,650 --> 00:11:11,457 If not, I encourage you to look for that. 181 00:11:11,457 --> 00:11:16,090 I think that's a really cool way to sell Wikidata to other people. 182 00:11:19,110 --> 00:11:21,840 Selling-- this is also something that we've realized 183 00:11:21,840 --> 00:11:23,460 when we talk to data providers, 184 00:11:23,460 --> 00:11:26,660 that often they're quite scared to give data to you 185 00:11:26,660 --> 00:11:28,523 with the traditional argument 186 00:11:28,523 --> 00:11:32,530 of "Our data is so complicated, you won't understand it, 187 00:11:32,530 --> 00:11:36,240 and you'll build bad applications that will make us look bad." 188 00:11:37,080 --> 00:11:42,420 And our strategy usually is to just take the data anyway, 189 00:11:42,420 --> 00:11:46,040 build an application share it with them, and then their response is usually, 190 00:11:46,550 --> 00:11:49,600 "Oh, this is pretty cool. Can we link to that from our website?" 191 00:11:50,840 --> 00:11:52,110 And then, at some point, 192 00:11:52,110 --> 00:11:54,860 maybe you can start having a discussion with them. 193 00:11:55,880 --> 00:11:58,480 But, yeah, I think this is kind of what we can do as a community. 194 00:11:58,480 --> 00:12:02,880 We can build little small games and tools to showcase. 
195 00:12:02,880 --> 00:12:04,750 Okay, there is Wikidata, and it's pretty cool, 196 00:12:04,750 --> 00:12:07,570 and you have open data, and we can build cool things with it, 197 00:12:07,570 --> 00:12:09,280 but you'll need to give it to us, 198 00:12:09,280 --> 00:12:12,440 you'll need to publish it under a license that we can work with. 199 00:12:13,310 --> 00:12:17,510 And this is one of the things that we try to do at Code for Germany. 200 00:12:18,800 --> 00:12:20,910 [inaudible], thanks. 201 00:12:23,310 --> 00:12:25,360 (applause) 202 00:12:27,260 --> 00:12:28,655 Yeah, thank you. 203 00:12:28,655 --> 00:12:31,550 Before we open the room for questions from you, 204 00:12:31,550 --> 00:12:37,570 we would like to just open or ask some questions to you. 205 00:12:38,440 --> 00:12:40,757 I think that Knut has really described 206 00:12:40,757 --> 00:12:43,627 the challenges we face quite well. 207 00:12:43,627 --> 00:12:49,160 But, still, I do think there's a lot of opportunities in these data, 208 00:12:49,160 --> 00:12:54,750 and we just need to kind of harvest it better than we do it right now. 209 00:12:54,750 --> 00:12:58,840 And so my questions-- and maybe it helps you a bit 210 00:12:58,840 --> 00:13:03,170 to think about that-- is how could we integrate 211 00:13:03,170 --> 00:13:07,330 more open government data into Wikidata in a more structured way. 212 00:13:07,330 --> 00:13:12,830 Just keeping in mind that the people who are kind of providing these data 213 00:13:12,830 --> 00:13:16,820 are not the experts you may expect. 214 00:13:16,820 --> 00:13:18,460 And at the same time, 215 00:13:19,100 --> 00:13:22,800 there already is a WikiProject, open government data, 216 00:13:22,800 --> 00:13:27,440 and I'm not sure if you, Christina had opened it quite a while ago. 
217 00:13:27,440 --> 00:13:31,060 And I wonder in which way we can 218 00:13:31,060 --> 00:13:37,370 kind of reanimate it and make the best out of it 219 00:13:37,370 --> 00:13:41,860 because we still have this place, and we have people 220 00:13:41,860 --> 00:13:45,400 who are engaged in the municipalities, in governments, 221 00:13:45,400 --> 00:13:47,605 to open up data. 222 00:13:47,605 --> 00:13:50,530 And maybe it's an opportunity 223 00:13:50,530 --> 00:13:55,060 to just match these different 224 00:13:55,060 --> 00:13:57,820 languages and expectations. 225 00:13:57,820 --> 00:14:01,460 So, yeah, I'm open for any ideas to do that, 226 00:14:01,460 --> 00:14:04,640 and I'm happy to engage a bit in that as well. 227 00:14:04,640 --> 00:14:06,730 So, questions? 228 00:14:10,680 --> 00:14:12,510 (person 1) Hi, thank you, guys. 229 00:14:12,510 --> 00:14:14,888 Maybe an idea is one 230 00:14:14,888 --> 00:14:17,550 we could be taking from the Wikipedia beginnings, 231 00:14:17,550 --> 00:14:19,620 where I think it was Matthias Schindler, 232 00:14:19,620 --> 00:14:22,020 who started with his Content Liberation Army. 233 00:14:22,540 --> 00:14:25,910 And the idea that, you know, you have to really go in, 234 00:14:25,910 --> 00:14:28,950 and the data is there. 235 00:14:28,950 --> 00:14:31,609 But for example, I had a project with a student 236 00:14:31,609 --> 00:14:32,694 where we were looking 237 00:14:32,694 --> 00:14:35,330 at where the trees are geolocated in Berlin, 238 00:14:35,330 --> 00:14:39,310 and this is sometimes on paper, it's sometimes on a stupid database. 239 00:14:39,310 --> 00:14:41,750 We were accused of being terrorists 240 00:14:41,750 --> 00:14:44,460 by the people who didn't want to give us the data. 241 00:14:44,460 --> 00:14:48,800 We had to get really, really picky about this and point to the laws 242 00:14:48,800 --> 00:14:52,040 saying, "This is open data, and you have to give it to us." 
243 00:14:52,040 --> 00:14:55,840 but we have to sort of go in friendly, as you were saying 244 00:14:55,840 --> 00:14:58,550 and try and explain to them what they will have from it. 245 00:14:58,550 --> 00:15:01,170 Many of them don't see that they have a use of it 246 00:15:01,170 --> 00:15:03,820 because it's more work for them having to deal with us. 247 00:15:03,820 --> 00:15:07,510 I think that's one of the main kind of fears 248 00:15:07,510 --> 00:15:12,645 which is there are coming people who are just putting more work onto us. 249 00:15:12,645 --> 00:15:17,080 And at the same time, there's so little understanding 250 00:15:17,080 --> 00:15:20,910 that this is just part of what they are doing already. 251 00:15:20,910 --> 00:15:25,260 And that they can really also 252 00:15:25,260 --> 00:15:28,040 learn and get a lot of input 253 00:15:28,040 --> 00:15:30,660 from the people who are asking about that data. 254 00:15:30,660 --> 00:15:32,550 But this is really culture change, 255 00:15:32,550 --> 00:15:35,570 a cultural change especially here in Germany. 256 00:15:35,570 --> 00:15:38,840 So we are working on it. 257 00:15:38,840 --> 00:15:43,330 We are working hard, but it's really kind of a tough thing. 258 00:15:43,330 --> 00:15:44,820 - Maybe I can add? - Yes. 259 00:15:44,820 --> 00:15:46,642 I think what's also really interesting to see 260 00:15:46,642 --> 00:15:47,962 from the community's perspective 261 00:15:47,962 --> 00:15:49,700 is that when we talk to different cities, 262 00:15:49,700 --> 00:15:52,060 it so depends on who happens to work in the cities. 263 00:15:52,060 --> 00:15:54,440 Like we have this very small city of Moers 264 00:15:54,440 --> 00:15:55,913 that is very unknown, 265 00:15:55,913 --> 00:15:58,411 but if you talk to people in the open data community, 266 00:15:58,411 --> 00:15:59,870 everyone will know it 267 00:15:59,870 --> 00:16:03,820 because they happen to pay someone to do work on open data. 
268 00:16:05,080 --> 00:16:08,150 And when I talk to people from the government in Berlin, 269 00:16:08,150 --> 00:16:12,442 they tell me, "Okay, I now know I have to publish open data, 270 00:16:12,442 --> 00:16:17,040 but I don't know how, for whom, or why." 271 00:16:18,560 --> 00:16:20,770 And I think this is actually 272 00:16:24,450 --> 00:16:27,750 a chance for the smaller cities to kind of champion this idea 273 00:16:27,750 --> 00:16:29,930 because it's so much easier for them 274 00:16:29,930 --> 00:16:33,170 to kind of get a movement and to liberate some data 275 00:16:33,170 --> 00:16:36,550 where if we talk in Berlin, we always need to talk to 12 districts, 276 00:16:36,550 --> 00:16:39,150 and they'll never align on what data they want to publish. 277 00:16:47,005 --> 00:16:48,617 (person 2) And we have a remote comment 278 00:16:48,617 --> 00:16:50,150 from Beat Estermann 279 00:16:50,150 --> 00:16:54,120 who wants to point out he has some links in Etherpad 280 00:16:54,120 --> 00:16:57,620 about "Interest in open government data helps Swiss authorities 281 00:16:57,620 --> 00:17:00,950 prioritize base registers and controlled vocabularies." 282 00:17:00,950 --> 00:17:02,950 And I'm told he just came in 283 00:17:02,950 --> 00:17:05,280 while I'm reading his Etherpad entry. 284 00:17:05,890 --> 00:17:09,170 So if you could just take the mic from me. 285 00:17:11,700 --> 00:17:13,170 (person 2) Go on. 286 00:17:13,470 --> 00:17:15,260 (Beat) Okay, thank you. 287 00:17:15,510 --> 00:17:17,680 I missed the first introduction. 288 00:17:19,150 --> 00:17:20,460 What did you start on? 289 00:17:20,460 --> 00:17:23,220 - (person 2) I was just reading-- - (Beat) Oh, you were reading. Okay. 290 00:17:23,220 --> 00:17:26,150 So we're currently running-- 291 00:17:26,150 --> 00:17:28,820 In Switzerland, we're running a survey 292 00:17:30,640 --> 00:17:35,640 to kind of prioritize data from within the government.
293 00:17:35,640 --> 00:17:39,710 There are like base registers or controlled vocabularies. 294 00:17:39,710 --> 00:17:42,544 Because we think that they would be crucial 295 00:17:42,544 --> 00:17:46,480 to actually promote and boost the publication of linked open data 296 00:17:46,480 --> 00:17:48,280 across the public authorities, 297 00:17:48,280 --> 00:17:51,240 so we're running a survey to prioritize them. 298 00:17:51,610 --> 00:17:56,730 And for some authorities to know which ones to publish now 299 00:17:56,730 --> 00:17:58,220 and for others-- 300 00:17:59,000 --> 00:18:01,850 for the community to know where to put pressure on 301 00:18:01,860 --> 00:18:03,390 and how to actually, 302 00:18:04,550 --> 00:18:07,220 yeah, argue why they should publish it. 303 00:18:07,220 --> 00:18:09,620 We're also collecting use cases. 304 00:18:09,620 --> 00:18:13,480 I posted the link to the Etherpad. 305 00:18:13,480 --> 00:18:17,060 It's in German and French only, the questionnaires. 306 00:18:17,060 --> 00:18:20,130 I'm sorry we're still not like up 307 00:18:20,130 --> 00:18:23,620 five language count here, but you said four languages- 308 00:18:24,085 --> 00:18:25,550 (person 3) Just switch to English. 309 00:18:25,550 --> 00:18:28,050 (Beat) Yeah, we could switch to English, right. 310 00:18:29,710 --> 00:18:31,800 Yeah, so that's one point. 311 00:18:31,800 --> 00:18:34,400 The other point I think is we could... 312 00:18:35,510 --> 00:18:37,860 and I'll put a little bit more love 313 00:18:37,860 --> 00:18:41,910 into kind of documenting the whole WikiProject, 314 00:18:42,810 --> 00:18:44,080 open government data, 315 00:18:44,080 --> 00:18:47,130 and that's something we're not really doing 316 00:18:47,130 --> 00:18:50,680 if you compare it to what is going on in GLAM.
317 00:18:53,310 --> 00:18:55,820 I think that is definitely something 318 00:18:55,820 --> 00:18:59,680 which I probably will try to figure out 319 00:18:59,680 --> 00:19:03,660 after my vacation time, 320 00:19:03,660 --> 00:19:07,350 which is starting on Monday. 321 00:19:08,110 --> 00:19:11,147 There is this WikiProject, 322 00:19:11,147 --> 00:19:15,010 and we need to figure out who is interested in it 323 00:19:15,430 --> 00:19:17,750 what can we do there, 324 00:19:17,750 --> 00:19:20,660 and how can we motivate people 325 00:19:20,660 --> 00:19:25,310 from kind of [out] the Wikidata community 326 00:19:25,310 --> 00:19:28,970 to add this important information to that. 327 00:19:28,970 --> 00:19:32,040 So I do think there is a huge opportunity 328 00:19:32,470 --> 00:19:36,423 to figure out how we can include 329 00:19:36,423 --> 00:19:42,710 more of this really, really valuable and reliable data into Wikidata. 330 00:19:42,710 --> 00:19:46,330 But overall, there's a lot of challenges as well, 331 00:19:46,330 --> 00:19:52,060 and still it's kind of a different crowd of people, 332 00:19:52,060 --> 00:19:55,310 and we need to figure out how to bring them together. 333 00:19:58,070 --> 00:19:59,600 Any idea is welcome. 334 00:19:59,600 --> 00:20:01,080 (Beat) Yeah, there is another point 335 00:20:01,080 --> 00:20:02,805 which we're currently not focusing on 336 00:20:02,805 --> 00:20:05,760 with this base register and vocabulary thing. 337 00:20:05,760 --> 00:20:08,930 But what I have had as a request 338 00:20:08,930 --> 00:20:11,510 is to be able to actually store tabular data 339 00:20:11,510 --> 00:20:14,480 and to be able to pull it. 340 00:20:14,910 --> 00:20:16,930 Because it does not make sense 341 00:20:16,930 --> 00:20:21,950 to put like 200 years of population statistics from Zurich 342 00:20:21,950 --> 00:20:25,640 into that Wikidata item for Zurich. 343 00:20:27,880 --> 00:20:33,770 Maybe I just pick it up and just an anecdote from my day work. 
344 00:20:33,770 --> 00:20:38,400 So I started to introduce Wikidata to my colleagues. 345 00:20:38,400 --> 00:20:41,080 We are a small team doing open data, 346 00:20:41,080 --> 00:20:47,060 and it was fine, and they were really, really interested, 347 00:20:47,060 --> 00:20:53,110 but in the end we started to add some of the population data, 348 00:20:53,110 --> 00:20:56,460 and then, you know, there isn't any order. 349 00:20:56,460 --> 00:21:01,600 So it's so hard to figure out if you find a population figure 350 00:21:01,600 --> 00:21:06,820 for year Y or X or something, and if it is still missing. 351 00:21:06,820 --> 00:21:11,018 So, of course, there are still a lot of things 352 00:21:11,018 --> 00:21:13,060 to improve in Wikidata as well, 353 00:21:13,060 --> 00:21:17,400 and tabular data could be one of them also. 354 00:21:18,110 --> 00:21:20,840 (person 4) [inaudible] Is it working? 355 00:21:20,840 --> 00:21:23,080 I have a comment on the tabular data. 356 00:21:23,470 --> 00:21:26,040 I remember we had also discussions 357 00:21:26,040 --> 00:21:29,310 with a canton and the city of Zurich about this, 358 00:21:29,310 --> 00:21:33,050 and that it might make sense to start 359 00:21:33,050 --> 00:21:37,150 discussions on whether we should maybe consider 360 00:21:37,150 --> 00:21:42,220 setting up a Wikibase for open governmental data 361 00:21:42,220 --> 00:21:46,170 and having such kind of datasets 362 00:21:46,170 --> 00:21:51,080 and then link them to Wikidata or link them from Wikidata to them, 363 00:21:51,080 --> 00:21:55,420 because mostly the linked open data technology 364 00:21:55,420 --> 00:21:57,050 is actually enabling that 365 00:21:57,050 --> 00:22:00,190 and is one of the key advantages of this technology. 366 00:22:00,190 --> 00:22:04,710 It is, of course, something that doesn't relate only to OGD data, 367 00:22:04,710 --> 00:22:09,080 it's a global divide in the whole Wikidata community.
368 00:22:09,080 --> 00:22:14,510 Because the larger we make the central endpoint or the graph 369 00:22:15,280 --> 00:22:19,040 the more difficult it is to handle it-- I think we all agree on that. 370 00:22:19,040 --> 00:22:24,220 So I think there should be a deeper conversation and discussion 371 00:22:24,220 --> 00:22:27,550 on whether we should start building this network. 372 00:22:27,550 --> 00:22:30,440 Well, actually, there is already a network of Wikibases. 373 00:22:30,440 --> 00:22:35,550 We also work in the university with publications and research data 374 00:22:35,550 --> 00:22:38,020 with our own Wikibase. 375 00:22:39,170 --> 00:22:43,150 Yeah, and then another comment about the Wiki projects. 376 00:22:43,150 --> 00:22:48,405 So we continued working and documenting the materials 377 00:22:48,405 --> 00:22:49,475 of the events, 378 00:22:49,475 --> 00:22:52,760 so we actually now have two upcoming events in November. 379 00:22:52,760 --> 00:22:56,710 We have a full weekend technical training on Wikidata 380 00:22:56,710 --> 00:22:59,060 in collaboration with the open data Zurich people 381 00:22:59,060 --> 00:23:00,460 and the canton of Zurich, 382 00:23:00,460 --> 00:23:04,020 and also Wikimedia Switzerland, and we have a hackathon. 
383 00:23:04,020 --> 00:23:07,000 But I totally agree that it would be great 384 00:23:07,000 --> 00:23:09,660 to start having conversations with all the participants 385 00:23:09,670 --> 00:23:12,060 that have been listed already in the project, 386 00:23:12,060 --> 00:23:13,660 and start more discussions, 387 00:23:13,660 --> 00:23:17,660 especially with all the countries that have many good initiatives, 388 00:23:17,660 --> 00:23:20,210 like Germany, like what you described 389 00:23:20,210 --> 00:23:22,440 and start documenting 390 00:23:22,440 --> 00:23:25,170 what are the specific needs of these institutions, 391 00:23:25,170 --> 00:23:26,310 what are the problems, 392 00:23:26,310 --> 00:23:29,907 and what specific tools we need to develop, or procedures, 393 00:23:29,907 --> 00:23:33,730 that we can help them import or link data in Wikidata. 394 00:23:36,460 --> 00:23:39,510 I think we're out of time. One last question. 395 00:23:41,280 --> 00:23:44,310 (person 5) So a proposal to use Wikibase for that? 396 00:23:44,310 --> 00:23:46,371 I'm not sure whether that actually would solve 397 00:23:46,371 --> 00:23:49,150 this tabular data problem. 398 00:23:49,150 --> 00:23:53,400 And when thinking of statistical data, like population data, 399 00:23:53,400 --> 00:23:56,150 that is not data that we want to really edit, 400 00:23:56,670 --> 00:23:58,970 that's data we just want to consume. 401 00:23:59,840 --> 00:24:02,950 So it means we have to ask ourselves 402 00:24:02,950 --> 00:24:06,860 whether we want to build in the capability to actually pull data 403 00:24:06,860 --> 00:24:10,150 directly from external third-party SPARQL endpoints, 404 00:24:10,150 --> 00:24:14,640 and not just from within this Wikibase ecosystem 405 00:24:14,640 --> 00:24:16,710 that we're planning to build up as well. 
406 00:24:16,715 --> 00:24:19,445 (person 4) So I agree that it doesn't solve the tabular data, 407 00:24:19,445 --> 00:24:20,875 but what I was trying to say 408 00:24:20,875 --> 00:24:23,510 is that the information that is more specific, 409 00:24:23,510 --> 00:24:26,620 it might be the case that we want to export it to something else 410 00:24:26,620 --> 00:24:32,970 and I see Wikibase also as a very good data modeling example. 411 00:24:32,970 --> 00:24:37,770 So not only because you want to have humans editing, 412 00:24:37,770 --> 00:24:42,040 but also because the whole data modeling happening in Wikidata 413 00:24:42,040 --> 00:24:44,130 with all the qualifiers and references 414 00:24:44,130 --> 00:24:47,420 adds a lot to all the datasets. 415 00:24:47,420 --> 00:24:50,460 So if we would do it from scratch in RDF 416 00:24:51,420 --> 00:24:53,420 we would be missing these features 417 00:24:53,420 --> 00:24:56,480 that Wikidata has, and I see it as an advantage. 418 00:24:56,480 --> 00:24:59,300 So that was a reason why I mentioned 419 00:24:59,300 --> 00:25:01,400 that it would be very helpful to maybe think of 420 00:25:01,400 --> 00:25:03,950 Wikibases around the OGD data. 421 00:25:04,970 --> 00:25:08,370 (moderator) So, I'm sorry, but I think we just ran out of time, 422 00:25:08,370 --> 00:25:11,150 and I encourage you to keep talking with our speakers, 423 00:25:11,150 --> 00:25:13,020 [inaudible] during all the conference 424 00:25:13,020 --> 00:25:15,370 and please, a round of applause for them. 425 00:25:15,370 --> 00:25:17,150 (applause) 426 00:25:19,627 --> 00:25:20,860 Thank you.