WEBVTT 00:00:05.840 --> 00:00:07.310 Hello. 00:00:12.080 --> 00:00:15.130 The two of us are starting 00:00:15.130 --> 00:00:19.620 a level on a side-effect or side-project or whatever, 00:00:19.620 --> 00:00:22.680 something which is loosely connected to Wikidata, 00:00:22.680 --> 00:00:25.200 which is open data 00:00:25.200 --> 00:00:28.170 and we're glad to see you're here. 00:00:28.170 --> 00:00:29.260 I'm Alice Wiegand. 00:00:29.260 --> 00:00:34.350 I'm the project lead for open data in the municipality of Düsseldorf, 00:00:34.910 --> 00:00:38.840 and this is Knut Huhne, who is a student. 00:00:38.840 --> 00:00:40.400 You may introduce yourself. 00:00:42.170 --> 00:00:44.002 Yeah, I'm a software developer by day, 00:00:44.002 --> 00:00:47.260 and in my spare time I do a lot of work at Code for Germany, 00:00:47.260 --> 00:00:50.350 which is a community organization that I'll talk a bit about, 00:00:50.350 --> 00:00:54.270 and we try to build civic tech tools based on open data. 00:00:54.270 --> 00:00:57.170 Yeah, that's exactly what we need. 00:00:57.170 --> 00:01:01.660 And so let's see where we are [on this]. 00:01:03.330 --> 00:01:06.820 [inaudible] 00:01:06.820 --> 00:01:09.150 So if we talk about open government data, 00:01:09.150 --> 00:01:11.280 this is something where I think 00:01:11.910 --> 00:01:15.480 the entire world is much more forward 00:01:15.480 --> 00:01:18.970 than Europe and especially Germany is. 00:01:18.970 --> 00:01:22.950 But in Germany, where we both come from and live, 00:01:22.970 --> 00:01:27.730 this is getting some dynamics because laws are changing. 
00:01:27.730 --> 00:01:32.970 And overall, we have just data which is used, produced, 00:01:32.970 --> 00:01:35.510 and cared 00:01:35.510 --> 00:01:40.180 and maintained by government, 00:01:41.060 --> 00:01:44.170 which is just a reliable data source, 00:01:44.170 --> 00:01:48.350 and it's official data with a high value, 00:01:48.350 --> 00:01:51.460 and it is sometimes really surprising to see 00:01:51.460 --> 00:01:55.970 what kind of data there is, openly kind of published. 00:01:55.970 --> 00:01:59.770 So this is, for example... 00:02:00.830 --> 00:02:02.820 I hope it opens soon. 00:02:02.820 --> 00:02:04.970 This, for example, is... 00:02:06.510 --> 00:02:10.480 it's the measure of radioactivity in kale. 00:02:10.480 --> 00:02:12.820 And I think it's surprising, 00:02:12.820 --> 00:02:16.840 I wonder why is it kale and not red cabbage? 00:02:16.840 --> 00:02:19.770 And I wonder why is this a fixed date? 00:02:19.770 --> 00:02:23.930 You know, 20th of November in 2013. 00:02:23.930 --> 00:02:26.730 And I wonder why is it that far away? 00:02:26.730 --> 00:02:30.840 What are we doing with radioactivity in kale today? 00:02:30.840 --> 00:02:31.840 I don't know. 00:02:31.840 --> 00:02:35.080 So you find a lot of these surprising things 00:02:35.080 --> 00:02:37.420 when you start to... 00:02:38.460 --> 00:02:41.040 What have I to do, do you know? 00:02:41.040 --> 00:02:44.880 ...when you start to look at open data in Germany. 00:02:49.210 --> 00:02:51.750 I'm confused with this computer. 00:02:51.750 --> 00:02:53.200 Oh, yes. Thanks. 00:02:53.820 --> 00:02:57.880 Yeah, and this data usually is up to date. 00:02:57.880 --> 00:03:00.530 Well, it should be, of course. 00:03:00.530 --> 00:03:05.220 As in all data, we have our gaps there. 
00:03:05.220 --> 00:03:11.980 And overall if I just look on the region I know best, 00:03:11.980 --> 00:03:16.810 we have 86 00:03:17.170 --> 00:03:22.020 of singular portals with open data within Germany, 00:03:22.020 --> 00:03:25.330 which is on municipality level, on the country level, 00:03:25.330 --> 00:03:28.570 on the federal country level, and on state level. 00:03:28.570 --> 00:03:33.570 And in Austria, it's 19; and in Switzerland, it's 6, 00:03:33.570 --> 00:03:35.770 and numbers are growing. 00:03:35.770 --> 00:03:38.770 So, of course, also, question is why are we all doing 00:03:38.770 --> 00:03:41.530 the same thing on different places? 00:03:42.220 --> 00:03:45.660 It doesn't seem to be that efficient, 00:03:45.660 --> 00:03:49.620 I'm not sure, but this is how our world today works. 00:03:49.620 --> 00:03:53.150 So now I find the right key, thanks. 00:03:54.020 --> 00:03:59.080 And there are a lot of challenges which we have to face 00:03:59.080 --> 00:04:04.420 and kind of a huge gap between wish and reality. 00:04:04.420 --> 00:04:07.860 So, after all, I do think there is a huge, 00:04:07.860 --> 00:04:12.930 you know, kind of [friendliness] 00:04:12.930 --> 00:04:15.530 between open data and Wikidata. 00:04:15.530 --> 00:04:18.110 It's all about essential data. 00:04:18.110 --> 00:04:24.600 It is about being as actual or being as up to date as possible. 00:04:24.600 --> 00:04:29.900 But in the end, when we look at the open data platforms 00:04:29.900 --> 00:04:32.330 in mostly Europe, 00:04:32.330 --> 00:04:35.530 we find incompatible licenses. 00:04:35.530 --> 00:04:41.420 So usually mainly municipalities 00:04:41.420 --> 00:04:43.663 choose a BY license, 00:04:43.663 --> 00:04:49.240 because they think it would be good to know where this data came from 00:04:49.240 --> 00:04:51.510 and to be named there. 00:04:51.510 --> 00:04:54.480 And this is really a crazy thing. 
00:04:54.480 --> 00:04:56.530 I looked at open data portals, 00:04:56.530 --> 00:05:01.550 and we have a portal in Düsseldorf for two years now 00:05:01.550 --> 00:05:06.730 and by design, we choose the 0 license. 00:05:06.730 --> 00:05:09.770 And I found that open data in Zurich-- 00:05:09.770 --> 00:05:12.310 Okay, it's not Germany, but it's Zurich-- 00:05:12.310 --> 00:05:15.660 and they are doing a lot of cool stuff there as well. 00:05:15.660 --> 00:05:18.521 And they also use the 0 license. 00:05:18.521 --> 00:05:25.950 But usually municipalities like CC BY licenses, sadly. 00:05:25.950 --> 00:05:29.310 And another thing we have to face 00:05:29.310 --> 00:05:34.800 is that, especially in municipalities, this kind of task to publish 00:05:34.800 --> 00:05:38.600 this internal data on a free and open license, 00:05:38.600 --> 00:05:40.470 on a platform, wherever, 00:05:40.470 --> 00:05:45.240 is just given to a person who usually does something else. 00:05:45.240 --> 00:05:48.330 So it's not, you know, a 100 percent task 00:05:48.330 --> 00:05:51.006 for this person to do, 00:05:51.006 --> 00:05:55.280 but something to do, you know, with all the other things. 00:05:57.620 --> 00:06:00.114 Overall, I think we can say 00:06:00.114 --> 00:06:06.310 that of course there are people who are really doing a great job. 00:06:06.310 --> 00:06:11.080 Usually, we don't find that level of expertise 00:06:11.080 --> 00:06:15.060 on data analysis and data management 00:06:15.060 --> 00:06:20.750 that we would need to really find high-quality data 00:06:20.750 --> 00:06:24.820 within the open data which comes from governments. 00:06:24.820 --> 00:06:27.734 And I think this is a problem, 00:06:27.734 --> 00:06:31.220 and I realized also that there's a language issue. 
00:06:31.220 --> 00:06:35.420 So if I just think about putting my colleagues into this room, 00:06:35.420 --> 00:06:39.440 into the session we had just before, about data quality, 00:06:40.710 --> 00:06:43.910 it would be problematic to find a common language, 00:06:43.910 --> 00:06:48.730 to figure out how we can start to improve our data quality 00:06:48.730 --> 00:06:54.420 so that Wikidata's data quality is also improved. 00:06:54.420 --> 00:06:58.420 Another thing is that we have no standards 00:06:58.420 --> 00:07:02.040 in the name of ontologies, 00:07:02.040 --> 00:07:06.330 in the name of how we prepare data. 00:07:06.330 --> 00:07:09.060 There is a metadata standard, which is great, 00:07:09.060 --> 00:07:13.150 but this, after all, does not mean that we all do the same thing 00:07:13.150 --> 00:07:15.710 and that we find the same kind of data, 00:07:15.710 --> 00:07:18.500 just because it is named in the same way. 00:07:19.130 --> 00:07:22.110 But, overall, it's a lot of official data 00:07:22.110 --> 00:07:24.530 you can get from open data. 00:07:25.770 --> 00:07:31.400 I made an example here which is about street names, 00:07:31.400 --> 00:07:34.014 and usually you find a lot 00:07:34.014 --> 00:07:39.680 of different forms and street names. 00:07:39.680 --> 00:07:42.570 Sometimes something like the Karlsplatz 00:07:42.570 --> 00:07:47.310 it's written with a C, or with a K, or separated, 00:07:47.310 --> 00:07:51.335 and sometimes this is also developing 00:07:51.335 --> 00:07:52.860 over the time. 00:07:52.860 --> 00:07:56.700 And in the end, there's just only one official name 00:07:56.700 --> 00:07:58.950 of a place or of a street, 00:07:58.950 --> 00:08:04.040 and it's the municipality which can give you that name. 
00:08:04.040 --> 00:08:07.480 And this part, like a list of official street names 00:08:07.480 --> 00:08:10.370 is something which is regularly published 00:08:10.370 --> 00:08:14.660 by a lot of municipalities in their open data portals. 00:08:14.660 --> 00:08:19.892 And I think that at all is a good start to figure out 00:08:19.892 --> 00:08:24.460 what we can do with this in Wikidata as well. 00:08:24.460 --> 00:08:27.450 So this is my short introduction, 00:08:27.450 --> 00:08:32.370 and I'm happy to hear about community work with open data. 00:08:34.170 --> 00:08:37.910 Yeah, I thought I would just kind of give a quick introduction from the other side, 00:08:37.910 --> 00:08:41.080 of movement from the community side. 00:08:41.080 --> 00:08:44.240 So, as I said, I work in my spare time 00:08:44.240 --> 00:08:46.590 for an organization called Code for Germany. 00:08:47.110 --> 00:08:49.770 We've been running since about five years 00:08:49.770 --> 00:08:52.440 where we have labs, that is groups of people 00:08:52.440 --> 00:08:57.250 that meet once a week, some once a month in Germany 00:08:57.850 --> 00:09:00.150 in local, what we call labs. 00:09:00.150 --> 00:09:03.860 And we try to build tools that somehow make it easier 00:09:03.860 --> 00:09:05.930 for people to participate in politics, 00:09:05.930 --> 00:09:08.330 to get an understanding of the environment around them, 00:09:08.330 --> 00:09:11.420 to collect data about air pollution. 00:09:11.420 --> 00:09:14.200 And, of course, we'd like to use 00:09:14.200 --> 00:09:17.050 governmentally provided open data for that, 00:09:17.050 --> 00:09:20.570 but we've also realized that there's difficulties with that, 00:09:20.570 --> 00:09:24.230 that sometimes the data isn't there, it's under a difficult license, 00:09:24.870 --> 00:09:29.680 which is kind of how we found our way to Wikidata also, I think. 
00:09:29.680 --> 00:09:32.020 We also happened to meet in Berlin 00:09:32.020 --> 00:09:34.170 in the offices of Wikimedia Deutschland, 00:09:34.380 --> 00:09:38.240 so this kind of brought us very close to Wikidata. 00:09:38.980 --> 00:09:40.230 And I think it's cool to see 00:09:40.230 --> 00:09:44.420 that we're kind of strengthening the relationship 00:09:44.420 --> 00:09:49.110 between the Wikidata community in Germany and the Code for Germany community. 00:09:49.740 --> 00:09:52.820 We also would like to work even closer with the government, 00:09:52.820 --> 00:09:54.790 but talking about bridging gaps. 00:09:54.790 --> 00:09:59.880 I mean, there's very basic problems such as us meeting after we work 00:09:59.880 --> 00:10:03.730 and the people for the government wanting to meet when they work. 00:10:04.080 --> 00:10:08.020 So I think when we think about how these communities can work together, 00:10:08.020 --> 00:10:11.220 there's very mundane things, such as working times, 00:10:11.220 --> 00:10:13.370 that we need to keep in mind. 00:10:15.060 --> 00:10:20.220 So just a quick introduction to what we do at Code for Germany 00:10:20.220 --> 00:10:22.460 especially with regards to Wikidata. 00:10:22.920 --> 00:10:26.210 We've had a couple of hackathons now within the last years 00:10:26.210 --> 00:10:28.020 where people from the Wikidata community 00:10:28.020 --> 00:10:31.480 and the Code for Germany community 00:10:31.480 --> 00:10:33.984 kind of came together to meet 00:10:33.984 --> 00:10:38.200 and just spend a weekend to work on Wikidata. 00:10:38.200 --> 00:10:41.040 And we've done all kinds of different things. 00:10:42.020 --> 00:10:45.040 We've usually been very interested in political data, 00:10:45.040 --> 00:10:47.370 so we've been importing a lot of data 00:10:47.370 --> 00:10:50.570 regarding politicians and regarding elections. 
00:10:50.570 --> 00:10:54.260 We've thought about how to model election data in Wikidata a lot 00:10:54.260 --> 00:10:58.880 and we've also had a lot of people that built games with Wikidata. 00:10:59.400 --> 00:11:01.340 One of the nice examples for this 00:11:01.340 --> 00:11:04.640 would be the Wikidata card game, where you can put in any Q number 00:11:04.640 --> 00:11:07.260 and you get a nice trading card game. 00:11:07.625 --> 00:11:08.650 You might have seen that. 00:11:08.650 --> 00:11:11.457 If not, I encourage you to look for that. 00:11:11.457 --> 00:11:16.090 I think that's a really cool way to sell Wikidata to other people. 00:11:19.110 --> 00:11:21.840 Selling-- this is also something that we've realized 00:11:21.840 --> 00:11:23.460 when we talk to data providers, 00:11:23.460 --> 00:11:26.660 that often they're quite scared to give data to you 00:11:26.660 --> 00:11:28.523 with the traditional argument 00:11:28.523 --> 00:11:32.530 of "Our data is so complicated, you won't understand it, 00:11:32.530 --> 00:11:36.240 and you'll build bad applications that will make us look bad." 00:11:37.080 --> 00:11:42.420 And our strategy usually is to just take the data anyway, 00:11:42.420 --> 00:11:46.040 build an application, share it with them, and then their response is usually, 00:11:46.550 --> 00:11:49.600 "Oh, this is pretty cool. Can we link to that from our website?" 00:11:50.840 --> 00:11:52.110 And then, at some point, 00:11:52.110 --> 00:11:54.860 maybe you can start having a discussion with them. 00:11:55.880 --> 00:11:58.480 But, yeah, I think this is kind of what we can do as a community. 00:11:58.480 --> 00:12:02.880 We can build little small games and tools to showcase. 
00:12:02.880 --> 00:12:04.750 Okay, there is Wikidata, and it's pretty cool, 00:12:04.750 --> 00:12:07.570 and you have open data, and we can build cool things with it, 00:12:07.570 --> 00:12:09.280 but you'll need to give it to us, 00:12:09.280 --> 00:12:12.440 you'll need to publish it under a license that we can work with. 00:12:13.310 --> 00:12:17.510 And this is one of the things that we try to do at Code for Germany. 00:12:18.800 --> 00:12:20.910 [inaudible], thanks. 00:12:23.310 --> 00:12:25.360 (applause) 00:12:27.260 --> 00:12:28.655 Yeah, thank you. 00:12:28.655 --> 00:12:31.550 Before we open the room for questions from you, 00:12:31.550 --> 00:12:37.570 we would like to just open or ask some questions to you. 00:12:38.440 --> 00:12:40.757 I think that Knut has really described 00:12:40.757 --> 00:12:43.627 the challenges we face quite well. 00:12:43.627 --> 00:12:49.160 But, still, I do think there's a lot of opportunities in these data, 00:12:49.160 --> 00:12:54.750 and we just need to kind of harvest it better than we do it right now. 00:12:54.750 --> 00:12:58.840 And so my questions-- and maybe it helps you a bit 00:12:58.840 --> 00:13:03.170 to think about that-- is how could we integrate 00:13:03.170 --> 00:13:07.330 more open government data into Wikidata in a more structured way. 00:13:07.330 --> 00:13:12.830 Just keeping in mind that the people who are kind of providing these data 00:13:12.830 --> 00:13:16.820 are not the experts you may expect. 00:13:16.820 --> 00:13:18.460 And at the same time, 00:13:19.100 --> 00:13:22.800 there already is a WikiProject, open government data, 00:13:22.800 --> 00:13:27.440 and I'm not sure if you, Christina had opened it quite a while ago. 
00:13:27.440 --> 00:13:31.060 And I wonder in which way we can 00:13:31.060 --> 00:13:37.370 kind of reanimate it and make the best out of it 00:13:37.370 --> 00:13:41.860 because we still have this place, and we have people 00:13:41.860 --> 00:13:45.400 who are engaged in the municipalities, in governments, 00:13:45.400 --> 00:13:47.605 to open up data. 00:13:47.605 --> 00:13:50.530 And maybe it's an opportunity 00:13:50.530 --> 00:13:55.060 to just match these different 00:13:55.060 --> 00:13:57.820 languages and expectations. 00:13:57.820 --> 00:14:01.460 So, yeah, I'm open for any ideas to do that, 00:14:01.460 --> 00:14:04.640 and I'm happy to engage a bit in that as well. 00:14:04.640 --> 00:14:06.730 So, questions? 00:14:10.680 --> 00:14:12.510 (person 1) Hi, thank you, guys. 00:14:12.510 --> 00:14:14.888 Maybe an idea is one 00:14:14.888 --> 00:14:17.550 we could be taking from the Wikipedia beginnings, 00:14:17.550 --> 00:14:19.620 where I think it was Matthias Schindler, 00:14:19.620 --> 00:14:22.020 who started with his Content Liberation Army. 00:14:22.540 --> 00:14:25.910 And the idea that, you know, you have to really go in, 00:14:25.910 --> 00:14:28.950 and the data is there. 00:14:28.950 --> 00:14:31.609 But for example, I had a project with a student 00:14:31.609 --> 00:14:32.694 where we were looking 00:14:32.694 --> 00:14:35.330 at where the trees are geolocated in Berlin, 00:14:35.330 --> 00:14:39.310 and this is sometimes on paper, it's sometimes on a stupid database. 00:14:39.310 --> 00:14:41.750 We were accused of being terrorists 00:14:41.750 --> 00:14:44.460 by the people who didn't want to give us the data. 00:14:44.460 --> 00:14:48.800 We had to get really, really picky about this and point to the laws 00:14:48.800 --> 00:14:52.040 saying, "This is open data, and you have to give it to us." 
00:14:52.040 --> 00:14:55.840 but we have to sort of go in friendly, as you were saying 00:14:55.840 --> 00:14:58.550 and try and explain to them what they will have from it. 00:14:58.550 --> 00:15:01.170 Many of them don't see that they have a use of it 00:15:01.170 --> 00:15:03.820 because it's more work for them having to deal with us. 00:15:03.820 --> 00:15:07.510 I think that's one of the main kind of fears 00:15:07.510 --> 00:15:12.645 which is there are coming people who are just putting more work onto us. 00:15:12.645 --> 00:15:17.080 And at the same time, there's so little understanding 00:15:17.080 --> 00:15:20.910 that this is just part of what they are doing already. 00:15:20.910 --> 00:15:25.260 And that they can really also 00:15:25.260 --> 00:15:28.040 learn and get a lot of input 00:15:28.040 --> 00:15:30.660 from the people who are asking about that data. 00:15:30.660 --> 00:15:32.550 But this is really culture change, 00:15:32.550 --> 00:15:35.570 a cultural change especially here in Germany. 00:15:35.570 --> 00:15:38.840 So we are working on it. 00:15:38.840 --> 00:15:43.330 We are working hard, but it's really kind of a tough thing. 00:15:43.330 --> 00:15:44.820 - Maybe I can add? - Yes. 00:15:44.820 --> 00:15:46.642 I think what's also really interesting to see 00:15:46.642 --> 00:15:47.962 from the community's perspective 00:15:47.962 --> 00:15:49.700 is that when we talk to different cities, 00:15:49.700 --> 00:15:52.060 it so depends on who happens to work in the cities. 00:15:52.060 --> 00:15:54.440 Like we have this very small city of Moers 00:15:54.440 --> 00:15:55.913 that is very unknown, 00:15:55.913 --> 00:15:58.411 but if you talk to people in the open data community, 00:15:58.411 --> 00:15:59.870 everyone will know it 00:15:59.870 --> 00:16:03.820 because they happen to pay someone to do work on open data. 
00:16:05.080 --> 00:16:08.150 And when I talk to people from the government in Berlin, 00:16:08.150 --> 00:16:12.442 they tell me, "Okay, I now know I have to publish open data, 00:16:12.442 --> 00:16:17.040 but I don't know how, for whom, or why." 00:16:18.560 --> 00:16:20.770 And I think this is actually 00:16:24.450 --> 00:16:27.750 a chance for the smaller cities to kind of champion this idea 00:16:27.750 --> 00:16:29.930 because it's so much easier for them 00:16:29.930 --> 00:16:33.170 to kind of get a movement and to liberate some data 00:16:33.170 --> 00:16:36.550 where if we talk in Berlin, we always need to talk to 12 districts, 00:16:36.550 --> 00:16:39.150 and they'll never align on what data they want to publish. 00:16:47.005 --> 00:16:48.617 (person 2) And we have a remote comment 00:16:48.617 --> 00:16:50.150 from Beat Estermann 00:16:50.150 --> 00:16:54.120 who wants to point out he has some links in Etherpad 00:16:54.120 --> 00:16:57.620 about "Interest in open government data helps Swiss authorities 00:16:57.620 --> 00:17:00.950 prioritize base registers and controlled vocabularies." 00:17:00.950 --> 00:17:02.950 And I'm told he just came in 00:17:02.950 --> 00:17:05.280 while I'm reading his Etherpad entry. 00:17:05.890 --> 00:17:09.170 So if you could just take the mic from me. 00:17:11.700 --> 00:17:13.170 (person 2) Go on. 00:17:13.470 --> 00:17:15.260 (Beat) Okay, thank you. 00:17:15.510 --> 00:17:17.680 I missed the first introduction. 00:17:19.150 --> 00:17:20.460 What did you start on? 00:17:20.460 --> 00:17:23.220 - (person 2) I was just reading-- - (Beat) Oh, you were reading. Okay. 00:17:23.220 --> 00:17:26.150 So we're currently running-- 00:17:26.150 --> 00:17:28.820 In Switzerland, we're running a survey 00:17:30.640 --> 00:17:35.640 to kind of prioritize data from within the government. 00:17:35.640 --> 00:17:39.710 There are like base registers or controlled vocabularies. 
00:17:39.710 --> 00:17:42.544 Because we think that they would be crucial 00:17:42.544 --> 00:17:46.480 to actually promote and boost the publication of linked open data 00:17:46.480 --> 00:17:48.280 across the public authorities, 00:17:48.280 --> 00:17:51.240 so we're running a survey to prioritize them. 00:17:51.610 --> 00:17:56.730 And for some authorities to know which ones to publish now 00:17:56.730 --> 00:17:58.220 and for others-- 00:17:59.000 --> 00:18:01.850 for the community to know where to put pressure on 00:18:01.860 --> 00:18:03.390 and how to actually, 00:18:04.550 --> 00:18:07.220 yeah, argue why they should publish it. 00:18:07.220 --> 00:18:09.620 We're also collecting use cases. 00:18:09.620 --> 00:18:13.480 I posted the link to the Etherpad. 00:18:13.480 --> 00:18:17.060 It's in German and French only, the questionnaires. 00:18:17.060 --> 00:18:20.130 I'm sorry we're still not like up 00:18:20.130 --> 00:18:23.620 five language count here, but you said four languages- 00:18:24.085 --> 00:18:25.550 (person 3) Just switch to English. 00:18:25.550 --> 00:18:28.050 (Beat) Yeah, we could switch to English, right. 00:18:29.710 --> 00:18:31.800 Yeah, so that's one point. 00:18:31.800 --> 00:18:34.400 The other point I think is we could... 00:18:35.510 --> 00:18:37.860 and I'll put a little bit more love 00:18:37.860 --> 00:18:41.910 into kind of documenting the whole WikiProject, 00:18:42.810 --> 00:18:44.080 open government data, 00:18:44.080 --> 00:18:47.130 and that's something we're not really doing 00:18:47.130 --> 00:18:50.680 if you compare it to what is going on in GLAM. 00:18:53.310 --> 00:18:55.820 I think that is definitely something 00:18:55.820 --> 00:18:59.680 which I probably will try to figure out 00:18:59.680 --> 00:19:03.660 after my vacation time, 00:19:03.660 --> 00:19:07.350 which is starting on Monday. 
00:19:08.110 --> 00:19:11.147 There is this WikiProject, 00:19:11.147 --> 00:19:15.010 and we need to figure out who is interested in it 00:19:15.430 --> 00:19:17.750 what can we do there, 00:19:17.750 --> 00:19:20.660 and how can we motivate people 00:19:20.660 --> 00:19:25.310 from kind of [out] the Wikidata community 00:19:25.310 --> 00:19:28.970 to add this important information to that. 00:19:28.970 --> 00:19:32.040 So I do think there is a huge opportunity 00:19:32.470 --> 00:19:36.423 to figure out how we can include 00:19:36.423 --> 00:19:42.710 more of this really, really valuable and reliable data into Wikidata. 00:19:42.710 --> 00:19:46.330 But overall, there's a lot of challenges as well, 00:19:46.330 --> 00:19:52.060 and still it's kind of a different crowd of people, 00:19:52.060 --> 00:19:55.310 and we need to figure out how to bring them together. 00:19:58.070 --> 00:19:59.600 Any idea is welcome. 00:19:59.600 --> 00:20:01.080 (Beat) Yeah, there is another point 00:20:01.080 --> 00:20:02.805 which we're currently not focusing on 00:20:02.805 --> 00:20:05.760 with this base register and vocabulary thing. 00:20:05.760 --> 00:20:08.930 But what I have had as a request 00:20:08.930 --> 00:20:11.510 is to be able to actually store tabular data 00:20:11.510 --> 00:20:14.480 and to be able to pull it. 00:20:14.910 --> 00:20:16.930 Because it does not make sense 00:20:16.930 --> 00:20:21.950 to put like 200 years of population statistics from Zurich 00:20:21.950 --> 00:20:25.640 into that Wikidata item for Zurich. 00:20:27.880 --> 00:20:33.770 Maybe I just pick it up and just an anecdote from my day work. 00:20:33.770 --> 00:20:38.400 So I started to introduce Wikidata to my colleagues. 
00:20:38.400 --> 00:20:41.080 We are a small team doing open data, 00:20:41.080 --> 00:20:47.060 and it was fine, and they were really, really interested, 00:20:47.060 --> 00:20:53.110 but in the end we started to add some of the population dates, 00:20:53.110 --> 00:20:56.460 and then, you know, there isn't any order. 00:20:56.460 --> 00:21:01.600 So it's so hard to figure out if you find a population date 00:21:01.600 --> 00:21:06.820 for year Y or X or something, and if it is still missing. 00:21:06.820 --> 00:21:11.018 So, of course, there are still a lot of things 00:21:11.018 --> 00:21:13.060 to improve in Wikidata as well, 00:21:13.060 --> 00:21:17.400 and tabular data could be one of it also. 00:21:18.110 --> 00:21:20.840 (person 4) [inaudible] Is it working? 00:21:20.840 --> 00:21:23.080 I have a comment on the tabular data. 00:21:23.470 --> 00:21:26.040 I remember we had also discussions 00:21:26.040 --> 00:21:29.310 with a canton and the city of Zurich about this, 00:21:29.310 --> 00:21:33.050 and that it might make sense to start 00:21:33.050 --> 00:21:37.150 discussions on whether we should maybe consider 00:21:37.150 --> 00:21:42.220 setting up a Wikibase for open governmental data 00:21:42.220 --> 00:21:46.170 and having such kind of datasets 00:21:46.170 --> 00:21:51.080 and then link them to Wikidata or link them from Wikidata to them, 00:21:51.080 --> 00:21:55.420 because mostly the linked open data technology 00:21:55.420 --> 00:21:57.050 is actually enabling that 00:21:57.050 --> 00:22:00.190 and is one of the key advantages of this technology. 00:22:00.190 --> 00:22:04.710 It is, of course, something that doesn't relate only to OGD data, 00:22:04.710 --> 00:22:09.080 it's a global divide in the whole Wikidata community. 00:22:09.080 --> 00:22:14.510 Because the larger we make the central endpoint or the graph 00:22:15.280 --> 00:22:19.040 the more difficult it is to handle it-- I think we all agree on that. 
00:22:19.040 --> 00:22:24.220 So I think there should be a deeper conversation and discussion 00:22:24.220 --> 00:22:27.550 on whether we should start building this network. 00:22:27.550 --> 00:22:30.440 Well, actually, there is already a network of Wikibases. 00:22:30.440 --> 00:22:35.550 We also work in the university with publications and research data 00:22:35.550 --> 00:22:38.020 with our own Wikibase. 00:22:39.170 --> 00:22:43.150 Yeah, and then another comment about the Wiki projects. 00:22:43.150 --> 00:22:48.405 So we continued working and documenting the materials 00:22:48.405 --> 00:22:49.475 of the events, 00:22:49.475 --> 00:22:52.760 so we actually now have two upcoming events in November. 00:22:52.760 --> 00:22:56.710 We have a full weekend technical training on Wikidata 00:22:56.710 --> 00:22:59.060 in collaboration with the open data Zurich people 00:22:59.060 --> 00:23:00.460 and the canton of Zurich, 00:23:00.460 --> 00:23:04.020 and also Wikimedia Switzerland, and we have a hackathon. 00:23:04.020 --> 00:23:07.000 But I totally agree that it would be great 00:23:07.000 --> 00:23:09.660 to start having conversations with all the participants 00:23:09.670 --> 00:23:12.060 that have been listed already in the project, 00:23:12.060 --> 00:23:13.660 and start more discussions, 00:23:13.660 --> 00:23:17.660 especially with all the countries that have many good initiatives, 00:23:17.660 --> 00:23:20.210 like Germany, like what you described 00:23:20.210 --> 00:23:22.440 and start documenting 00:23:22.440 --> 00:23:25.170 what are the specific needs of these institutions, 00:23:25.170 --> 00:23:26.310 what are the problems, 00:23:26.310 --> 00:23:29.907 and what specific tools we need to develop, or procedures, 00:23:29.907 --> 00:23:33.730 that we can help them import or link data in Wikidata. 00:23:36.460 --> 00:23:39.510 I think we're out of time. One last question. 00:23:41.280 --> 00:23:44.310 (person 5) So a proposal to use Wikibase for that? 
00:23:44.310 --> 00:23:46.371 I'm not sure whether that actually would solve 00:23:46.371 --> 00:23:49.150 this tabular data problem. 00:23:49.150 --> 00:23:53.400 And when thinking of statistical data, like population data, 00:23:53.400 --> 00:23:56.150 that is not data that we want to really edit, 00:23:56.670 --> 00:23:58.970 that's data we just want to consume. 00:23:59.840 --> 00:24:02.950 So it means we have to ask ourselves 00:24:02.950 --> 00:24:06.860 whether we want to build in the capability to actually pull data 00:24:06.860 --> 00:24:10.150 directly from external third-party SPARQL endpoints, 00:24:10.150 --> 00:24:14.640 and not just from within this Wikibase ecosystem 00:24:14.640 --> 00:24:16.710 that we're planning to build up as well. 00:24:16.715 --> 00:24:19.445 (person 4) So I agree that it doesn't solve the tabular data, 00:24:19.445 --> 00:24:20.875 but what I was trying to say 00:24:20.875 --> 00:24:23.510 is that the information that is more specific, 00:24:23.510 --> 00:24:26.620 it might be the case that we want to export it to something else 00:24:26.620 --> 00:24:32.970 and I see Wikibase also as a very good data modeling example. 00:24:32.970 --> 00:24:37.770 So not only because you want to have humans editing, 00:24:37.770 --> 00:24:42.040 but also because the whole data modeling happening in Wikidata 00:24:42.040 --> 00:24:44.130 with all the qualifiers and references 00:24:44.130 --> 00:24:47.420 adds a lot to all the datasets. 00:24:47.420 --> 00:24:50.460 So if we would do it from scratch in RDF 00:24:51.420 --> 00:24:53.420 we would be missing these features 00:24:53.420 --> 00:24:56.480 that Wikidata has, and I see it as an advantage. 00:24:56.480 --> 00:24:59.300 So that was a reason why I mentioned 00:24:59.300 --> 00:25:01.400 that it would be very helpful to maybe think of 00:25:01.400 --> 00:25:03.950 for the Wikibases around the OGD data. 
00:25:04.970 --> 00:25:08.370 (moderator) So, I'm sorry, but I think we just ran out of time, 00:25:08.370 --> 00:25:11.150 and I encourage you to keep talking with our speakers, 00:25:11.150 --> 00:25:13.020 [inaudible] during all the conference 00:25:13.020 --> 00:25:15.370 and please, a round of applause for them. 00:25:15.370 --> 00:25:17.150 (applause) 00:25:19.627 --> 00:25:20.860 Thank you.