Hello. The two of us are starting a level on a side-effect or side-project or whatever, something which is loosely connected to Wikidata, which is open data, and we're glad to see you're here. I'm Alice Wiegand. I'm the project lead for open data in the municipality of Düsseldorf, and this is Knut Huhne, who is a student. You may introduce yourself.
Yeah, I'm a software developer by day, and in my spare time I do a lot of work at Code for Germany, which is a community organization that I'll talk a bit about, and we try to build civic tech tools based on open data.
Yeah, that's exactly what we need. And so let's see where we are [on this]. [inaudible] So if we talk about open government data, this is something where I think the entire world is much further ahead than Europe, and especially Germany, is. But in Germany, where we both come from and live, this is gaining some momentum because laws are changing. And overall, we just have data which is used, produced, cared for, and maintained by government, which is just a reliable data source, and it's official data with a high value, and it is sometimes really surprising to see what kind of data there is, openly published. So this is, for example... I hope it opens soon. This, for example, is... it's the measure of radioactivity in kale. And I think it's surprising. I wonder, why is it kale and not red cabbage? And I wonder, why is this a fixed date? You know, the 20th of November 2013. And I wonder why it is that far back. What are we doing with radioactivity in kale today? I don't know. So you find a lot of these surprising things when you start to... What do I have to do, do you know? ...when you start to look at open data in Germany. I'm confused by this computer. Oh, yes. Thanks. Yeah, and this data usually is up to date. Well, it should be, of course. As with all data, we have our gaps there.
And overall, if I just look at the region I know best, we have 86 individual portals with open data within Germany, which are on municipality level, on the country level, on the federal country level, and on state level. And in Austria, it's 19; and in Switzerland, it's 6, and the numbers are growing. So, of course, the question is also why we are all doing the same thing in different places. It doesn't seem to be that efficient, I'm not sure, but this is how our world works today. So now I find the right key, thanks. And there are a lot of challenges which we have to face, and kind of a huge gap between wish and reality. So, after all, I do think there is a huge, you know, kind of [friendliness] between open data and Wikidata. It's all about essential data. It is about being as actual, or being as up to date, as possible. But in the end, when we look at the open data platforms, mostly in Europe, we find incompatible licenses. So usually, mainly, municipalities choose a CC BY license, because they think it would be good to know where this data came from and to be named there. And this is really a crazy thing. I looked at open data portals, and we have had a portal in Düsseldorf for two years now, and by design we chose the CC0 license. And I found that open data in Zurich-- Okay, it's not Germany, but it's Zurich-- and they are doing a lot of cool stuff there as well. And they also use the CC0 license. But usually municipalities like CC BY licenses, sadly. And another thing we have to face is that, especially in municipalities, this kind of task, to publish this internal data under a free and open license, on a platform, wherever, is just given to a person who usually does something else. So it's not, you know, a 100 percent task for this person to do, but something to do, you know, alongside all the other things. Overall, I think we can say that of course there are people who are really doing a great job.
Usually, we don't find the level of expertise in data analysis and data management that we would need to really find high-quality data within the open data which comes from governments. And I think this is a problem, and I realized also that there's a language issue. So if I just think about putting my colleagues into this room, into the session we had just before, about data quality, it would be problematic to find a common language, to figure out how we can start to improve our data quality so that Wikidata's data quality is also improved. Another thing is that we have no standards in terms of ontologies, in terms of how we prepare data. There is a metadata standard, which is great, but this, after all, does not mean that we all do the same thing and that we find the same kind of data, just because it is named in the same way. But, overall, it's a lot of official data you can get from open data. I made an example here which is about street names, and usually you find a lot of different forms of street names. Sometimes something like the Karlsplatz is written with a C, or with a K, or separated, and sometimes this also develops over time. And in the end, there's only one official name of a place or of a street, and it's the municipality which can give you that name. And this part, like a list of official street names, is something which is regularly published by a lot of municipalities in their open data portals. And I think that, all in all, is a good start to figure out what we can do with this in Wikidata as well. So this is my short introduction, and I'm happy to hear about community work with open data.
Yeah, I thought I would just kind of give a quick introduction from the other side of the movement, from the community side. So, as I said, I work in my spare time for an organization called Code for Germany.
We've been running for about five years, and we have what we call labs, that is, local groups of people in Germany that meet once a week, some once a month. And we try to build tools that somehow make it easier for people to participate in politics, to get an understanding of the environment around them, to collect data about air pollution. And, of course, we'd like to use governmentally provided open data for that, but we've also realized that there are difficulties with that, that sometimes the data isn't there, or it's under a difficult license, which is kind of how we found our way to Wikidata also, I think. We also happened to meet in Berlin in the offices of Wikimedia Deutschland, so this kind of brought us very close to Wikidata. And I think it's cool to see that we're kind of strengthening the relationship between the Wikidata community in Germany and the Code for Germany community. We also would like to work even closer with the government, but, talking about bridging gaps, I mean, there are very basic problems, such as us meeting after we work and the people from the government wanting to meet when they work. So I think when we think about how these communities can work together, there are very mundane things, such as working times, that we need to keep in mind. So just a quick introduction to what we do at Code for Germany, especially with regard to Wikidata. We've had a couple of hackathons now within the last years where people from the Wikidata community and the Code for Germany community kind of came together to meet and just spend a weekend working on Wikidata. And we've done all kinds of different things. We've usually been very interested in political data, so we've been importing a lot of data regarding politicians and regarding elections. We've thought a lot about how to model election data in Wikidata, and we've also had a lot of people who built games with Wikidata.
One of the nice examples for this would be the Wikidata card game, where you can put in any Q number and you get a nice trading card game. You might have seen that. If not, I encourage you to look for that. I think that's a really cool way to sell Wikidata to other people. Selling-- this is also something that we've realized when we talk to data providers, that often they're quite scared to give data to you, with the traditional argument of "Our data is so complicated, you won't understand it, and you'll build bad applications that will make us look bad." And our strategy usually is to just take the data anyway, build an application, share it with them, and then their response is usually, "Oh, this is pretty cool. Can we link to that from our website?" And then, at some point, maybe you can start having a discussion with them. But, yeah, I think this is kind of what we can do as a community. We can build little games and tools to showcase: okay, there is Wikidata, and it's pretty cool, and you have open data, and we can build cool things with it, but you'll need to give it to us, you'll need to publish it under a license that we can work with. And this is one of the things that we try to do at Code for Germany. [inaudible], thanks. (applause)
Yeah, thank you. Before we open the room for questions from you, we would like to ask you some questions. I think that Knut has really described the challenges we face quite well. But, still, I do think there are a lot of opportunities in these data, and we just need to kind of harvest them better than we do right now. And so my question-- and maybe it helps you a bit to think about that-- is how we could integrate more open government data into Wikidata in a more structured way. Just keeping in mind that the people who are kind of providing these data are not the experts you may expect.
And at the same time, there already is a WikiProject, open government data, and I'm not sure if you, Christina, had opened it quite a while ago. And I wonder in which way we can kind of reanimate it and make the best out of it, because we still have this place, and we have people who are engaged in the municipalities, in governments, to open up data. And maybe it's an opportunity to just match these different languages and expectations. So, yeah, I'm open to any ideas to do that, and I'm happy to engage a bit in that as well. So, questions?
(person 1) Hi, thank you, guys. Maybe an idea is one we could be taking from the Wikipedia beginnings, where I think it was Matthias Schindler who started with his Content Liberation Army. And the idea that, you know, you have to really go in, and the data is there. But for example, I had a project with a student where we were looking at where the trees are geolocated in Berlin, and this is sometimes on paper, it's sometimes in a stupid database. We were accused of being terrorists by the people who didn't want to give us the data. We had to get really, really picky about this and point to the laws, saying, "This is open data, and you have to give it to us." But we have to sort of go in friendly, as you were saying, and try to explain to them what they will get from it. Many of them don't see that they have any use for it, because it's more work for them having to deal with us.
I think that's one of the main fears, which is that there are people coming who are just putting more work onto us. And at the same time, there's so little understanding that this is just part of what they are doing already. And that they can really also learn and get a lot of input from the people who are asking about that data. But this is really a culture change, a cultural change, especially here in Germany. So we are working on it. We are working hard, but it's really kind of a tough thing.
- Maybe I can add? - Yes.
I think what's also really interesting to see from the community's perspective is that when we talk to different cities, it so depends on who happens to work in the cities. Like, we have this very small city of Moers that is very unknown, but if you talk to people in the open data community, everyone will know it, because they happen to pay someone to do work on open data. And when I talk to people from the government in Berlin, they tell me, "Okay, I now know I have to publish open data, but I don't know how, for whom, or why." And I think this is actually a chance for the smaller cities to kind of champion this idea, because it's so much easier for them to kind of get a movement going and to liberate some data, whereas if we talk in Berlin, we always need to talk to 12 districts, and they'll never align on what data they want to publish.
(person 2) And we have a remote comment from Beat Estermann, who wants to point out he has some links in Etherpad about "Interest in open government data helps Swiss authorities prioritize base registers and controlled vocabularies." And I'm told he just came in while I'm reading his Etherpad entry. So if you could just take the mic from me.
(person 2) Go on.
(Beat) Okay, thank you. I missed the first introduction. What did you start on?
- (person 2) I was just reading-- - (Beat) Oh, you were reading. Okay. So we're currently running-- In Switzerland, we're running a survey to kind of prioritize data from within the government. These are, like, base registers or controlled vocabularies. Because we think that they would be crucial to actually promote and boost the publication of linked open data across the public authorities, we're running a survey to prioritize them. And for some authorities to know which ones to publish now, and for others-- for the community to know where to put pressure and how to actually, yeah, argue why they should publish it. We're also collecting use cases. I posted the link to the Etherpad.
It's in German and French only, the questionnaires. I'm sorry we're still not like up five language count here, but you said four languages-
(person 3) Just switch to English.
(Beat) Yeah, we could switch to English, right. Yeah, so that's one point. The other point I think is we could... and I'll put a little bit more love into kind of documenting the whole WikiProject, open government data, and that's something we're not really doing if you compare it to what is going on in GLAM.
I think that is definitely something which I probably will try to figure out after my vacation time, which is starting on Monday. There is this WikiProject, and we need to figure out who is interested in it, what we can do there, and how we can motivate people from kind of [out] the Wikidata community to add this important information to that. So I do think there is a huge opportunity to figure out how we can include more of this really, really valuable and reliable data in Wikidata. But overall, there are a lot of challenges as well, and still it's kind of a different crowd of people, and we need to figure out how to bring them together. Any idea is welcome.
(Beat) Yeah, there is another point which we're currently not focusing on with this base register and vocabulary thing. But what I have had as a request is to be able to actually store tabular data and to be able to pull it. Because it does not make sense to put like 200 years of population statistics from Zurich into that Wikidata item for Zurich.
Maybe I'll just pick that up with an anecdote from my day job. So I started to introduce Wikidata to my colleagues. We are a small team doing open data, and it was fine, and they were really, really interested, but in the end we started to add some of the population data, and then, you know, there isn't any order. So it's so hard to figure out whether you can find a population figure for year Y or X or something, or whether it is still missing.
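The ordering problem in this anecdote can be illustrated with a small sketch. It assumes the population statements have already been fetched and reduced to (year, value) pairs, roughly what population statements (P1082) with a point in time qualifier (P585) boil down to. The function name `missing_years` and the sample figures are hypothetical and made up for illustration; they are not real data for any city.

```python
from typing import Iterable


def missing_years(statements: Iterable[tuple[int, int]],
                  start: int, end: int) -> tuple[list[tuple[int, int]], list[int]]:
    """Sort (year, population) pairs and list the years in [start, end] without a value."""
    # Keep one value per year; a later duplicate overwrites an earlier one.
    by_year = dict(statements)
    # Sorting the (year, value) pairs orders them chronologically.
    ordered = sorted(by_year.items())
    # Every year in the requested range that has no statement is a gap.
    gaps = [year for year in range(start, end + 1) if year not in by_year]
    return ordered, gaps


# Hypothetical, unordered input, as statements might appear on an item.
sample = [(2015, 1200), (2012, 1100), (2014, 1180)]
ordered, gaps = missing_years(sample, 2012, 2016)
print(ordered)
print(gaps)
```

With the statements in chronological order and the gaps listed explicitly, a small team could see at a glance which years still need a figure before adding more.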
So, of course, there are still a lot of things to improve in Wikidata as well, and tabular data could be one of them also.
(person 4) [inaudible] Is it working? I have a comment on the tabular data. I remember we also had discussions with the canton and the city of Zurich about this, and that it might make sense to start discussions on whether we should maybe consider setting up a Wikibase for open governmental data and having such kinds of datasets there, and then link them to Wikidata, or link from Wikidata to them, because the linked open data technology is actually enabling that, and it is one of the key advantages of this technology. It is, of course, something that doesn't relate only to OGD data; it's a global divide in the whole Wikidata community. Because the larger we make the central endpoint or the graph, the more difficult it is to handle it-- I think we all agree on that. So I think there should be a deeper conversation and discussion on whether we should start building this network. Well, actually, there is already a network of Wikibases. We also work in the university with publications and research data with our own Wikibase. Yeah, and then another comment about the WikiProject. So we continued working on and documenting the materials of the events, so we actually now have two upcoming events in November. We have a full weekend of technical training on Wikidata in collaboration with the open data Zurich people and the canton of Zurich, and also Wikimedia Switzerland, and we have a hackathon.
But I totally agree that it would be great to start having conversations with all the participants that have been listed already in the project, and to start more discussions, especially with all the countries that have many good initiatives, like Germany, like what you described, and to start documenting what the specific needs of these institutions are, what the problems are, and what specific tools or procedures we need to develop so that we can help them import or link data in Wikidata.
I think we're out of time. One last question.
(person 5) So a proposal to use Wikibase for that? I'm not sure whether that actually would solve this tabular data problem. And when thinking of statistical data, like population data, that is not data that we really want to edit, that's data we just want to consume. So it means we have to ask ourselves whether we want to build in the capability to actually pull data directly from external third-party SPARQL endpoints, and not just from within this Wikibase ecosystem that we're planning to build up as well.
(person 4) So I agree that it doesn't solve the tabular data problem, but what I was trying to say is that for the information that is more specific, it might be the case that we want to export it to something else, and I see Wikibase also as a very good data modeling example. So not only because you want to have humans editing, but also because the whole data modeling happening in Wikidata, with all the qualifiers and references, adds a lot to all the datasets. So if we did it from scratch in RDF, we would be missing these features that Wikidata has, and I see that as an advantage. So that was the reason why I mentioned that it would be very helpful to maybe think of Wikibases around the OGD data.
(moderator) So, I'm sorry, but I think we just ran out of time, and I encourage you to keep talking with our speakers, [inaudible] throughout the conference, and please, a round of applause for them. (applause) Thank you.