-
Hello.
-
The two of us are starting
-
a level on a side-effect
or side-project or whatever,
-
something which
is loosely connected to Wikidata,
-
which is open data
-
and we're glad to see you're here.
-
I'm Alice Wiegand.
-
I'm the project lead for open data
in the municipality of Düsseldorf,
-
and this is Knut Huhne, who is a student.
-
You may introduce yourself.
-
Yeah, I'm a software developer by day,
-
and in my spare time
I do a lot of work at Code for Germany,
-
which is a community organization
that I'll talk a bit about,
-
and we try to build civic tech tools
based on open data.
-
Yeah, that's exactly what we need.
-
And so let's see where we are [on this].
-
[inaudible]
-
So if we talk about open government data,
-
this is something where I think
-
the entire world is much further ahead
-
than Europe and especially Germany is.
-
But in Germany,
where we both come from and live,
-
this is getting some dynamics
because laws are changing.
-
And overall, we have just data
which is used, produced,
-
and cared for
-
and maintained by government,
-
which is just a reliable data source,
-
and it's official data with a high value,
-
and it is sometimes
really surprising to see
-
what kind of data there is,
openly kind of published.
-
So this is, for example...
-
I hope it opens soon.
-
This, for example, is...
-
it's the measure of radioactivity in kale.
-
And I think it's surprising,
-
I wonder why is it kale
and not red cabbage?
-
And I wonder why is this a fixed date?
-
You know, 20th of November in 2013.
-
And I wonder why is it that far away?
-
What are we doing
with radioactivity in kale today?
-
I don't know.
-
So you find a lot
of these surprising things
-
when you start to...
-
What have I to do, do you know?
-
...when you start to
look at open data in Germany.
-
I'm confused with this computer.
-
Oh, yes. Thanks.
-
Yeah, and this data usually is up to date.
-
Well, it should be, of course.
-
As in all data, we have our gaps there.
-
And overall if I just look
at the region I know best,
-
we have 86
-
individual portals
with open data within Germany,
-
which is on municipality level,
on the county level,
-
on the federal country level,
and on state level.
-
And in Austria, it's 19;
and in Switzerland, it's 6,
-
and numbers are growing.
-
So, of course, also,
the question is why are we all doing
-
the same thing in different places?
-
It doesn't seem to be that efficient,
-
I'm not sure, but this is how
our world today works.
-
So now I found the right key, thanks.
-
And there are a lot of challenges
which we have to face
-
and kind of a huge gap
between wish and reality.
-
So, after all, I do think there is a huge,
-
you know, kind of [friendliness]
-
between open data and Wikidata.
-
It's all about essential data.
-
It is about being as actual
or being as up to date as possible.
-
But in the end, when we look
at the open data platforms
-
in mostly Europe,
-
we find incompatible licenses.
-
So usually mainly municipalities
-
choose a BY license,
-
because they think it would be good
to know where this data came from
-
and to be named there.
-
And this is really a crazy thing.
-
I looked at open data portals,
-
and we have a portal in Düsseldorf
for two years now
-
and by design, we chose the 0 license.
-
And I found that open data in Zurich--
-
Okay, it's not Germany, but it's Zurich--
-
and they are doing
a lot of cool stuff there as well.
-
And they also use the 0 license.
-
But usually municipalities
like CC BY licenses, sadly.
-
And another thing we have to face
-
is that, especially in municipalities,
this kind of task to publish
-
this internal data
under a free and open license,
-
on a platform, wherever,
-
is just given to a person
who usually does something else.
-
So it's not, you know, a 100 percent task
-
for this person to do,
-
but something to do, you know,
with all the other things.
-
Overall, I think we can say
-
that of course there are people
who are really doing a great job.
-
Usually, we don't find
that level of expertise
-
on data analysis and data management
-
that we would need
to really find high-quality data
-
within the open data
which comes from governments.
-
And I think this is a problem,
-
and I realized also
that there's a language issue.
-
So if I just think about
putting my colleagues into this room,
-
into the session we had just before,
about data quality,
-
it would be problematic
to find a common language,
-
to figure out how we can start
to improve our data quality
-
so that Wikidata's data quality
is also improved.
-
Another thing
is that we have no standards
-
in terms of ontologies,
-
in terms of how we prepare data.
-
There is a metadata standard,
which is great,
-
but this, after all, does not mean
that we all do the same thing
-
and that we find the same kind of data,
-
just because it is named in the same way.
-
But, overall, it's a lot of official data
-
you can get from open data.
-
I made an example here
which is about street names,
-
and usually you find a lot
-
of different forms of street names.
-
Sometimes something like Karlsplatz
-
is written with a C,
or with a K, or separated,
-
and sometimes this is also developing
-
over time.
-
And in the end, there's
only one official name
-
of a place or of a street,
-
and it's the municipality
which can give you that name.
-
And this part, like a list
of official street names
-
is something which is regularly published
-
by a lot of municipalities
in their open data portals.
-
And I think that, overall,
is a good start to figure out
-
what we can do with this
in Wikidata as well.
-
So this is my short introduction,
-
and I'm happy to hear about
community work with open data.
-
Yeah, I thought I would just kind of give
a quick introduction from the other side
-
of the movement, from the community side.
-
So, as I said, I work in my spare time
-
for an organization
called Code for Germany.
-
We've been running for about five years
-
where we have labs,
that is groups of people
-
that meet once a week,
some once a month in Germany
-
in local, what we call labs.
-
And we try to build tools
that somehow make it easier
-
for people to participate in politics,
-
to get an understanding
of the environment around them,
-
to collect data about air pollution.
-
And, of course, we'd like to use
-
governmentally provided
open data for that,
-
but we've also realized
that there's difficulties with that,
-
that sometimes the data isn't there,
it's under a difficult license,
-
which is kind of how we found our way
to Wikidata also, I think.
-
We also happened to meet in Berlin
-
in the offices of Wikimedia Deutschland,
-
so this kind of brought us
very close to Wikidata.
-
And I think it's cool to see
-
that we're kind of strengthening
the relationship
-
between the Wikidata community in Germany
and the Code for Germany community.
-
We also would like to work
even closer with the government,
-
but talking about bridging gaps.
-
I mean, there's very basic problems
such as us meeting after we work
-
and the people from the government
wanting to meet when they work.
-
So I think when we think about
how these communities can work together,
-
there's very mundane things,
such as working times,
-
that we need to keep in mind.
-
So just a quick introduction
to what we do at Code for Germany
-
especially with regards to Wikidata.
-
We've had a couple of hackathons now
within the last years
-
where people from the Wikidata community
-
and the Code for Germany community
-
kind of came together to meet
-
and just spend a weekend
to work on Wikidata.
-
And we've done
all kinds of different things.
-
We've usually been very interested
in political data,
-
so we've been importing a lot of data
-
regarding politicians
and regarding elections.
-
We've thought about how to model
election data in Wikidata a lot
-
and we've also had a lot of people
that built games with Wikidata.
-
One of the nice examples for this
-
would be the Wikidata card game,
where you can put in any Q number
-
and you get a nice trading card game.
-
You might have seen that.
-
If not, I encourage you to look for that.
-
I think that's a really cool way
to sell Wikidata to other people.
-
Selling-- this is also
something that we've realized
-
when we talk to data providers,
-
that often they're quite scared
to give data to you
-
with the traditional argument
-
of "Our data is so complicated,
you won't understand it,
-
and you'll build bad applications
that will make us look bad."
-
And our strategy usually
is to just take the data anyway,
-
build an application, share it with them,
and then their response is usually,
-
"Oh, this is pretty cool.
Can we link to that from our website?"
-
And then, at some point,
-
maybe you can start having
a discussion with them.
-
But, yeah, I think this is kind
of what we can do as a community.
-
We can build little small games
and tools to showcase.
-
Okay, there is Wikidata,
and it's pretty cool,
-
and you have open data,
and we can build cool things with it,
-
but you'll need to give it to us,
-
you'll need to publish it
under a license that we can work with.
-
And this is one of the things
that we try to do at Code for Germany.
-
[inaudible], thanks.
-
(applause)
-
Yeah, thank you.
-
Before we open
the room for questions from you,
-
we would like to just open
or ask some questions to you.
-
I think that Knut has really described
-
the challenges we face quite well.
-
But, still, I do think there's a lot
of opportunities in these data,
-
and we just need to kind of harvest it
better than we do it right now.
-
And so my questions--
and maybe it helps you a bit
-
to think about that--
is how could we integrate
-
more open government data
into Wikidata in a more structured way.
-
Just keeping in mind that the people
who are kind of providing these data
-
are not the experts you may expect.
-
And at the same time,
-
there already is a WikiProject,
Open Government Data,
-
and I'm not sure if you, Christina,
had opened it quite a while ago.
-
And I wonder in which way we can
-
kind of reanimate it
and make the best out of it
-
because we still have this place,
and we have people
-
who are engaged
in the municipalities, in governments,
-
to open up data.
-
And maybe it's an opportunity
-
to just match these different
-
languages and expectations.
-
So, yeah, I'm open
for any ideas to do that,
-
and I'm happy to engage
a bit in that as well.
-
So, questions?
-
(person 1) Hi, thank you, guys.
-
Maybe an idea is one
-
we could be taking
from the Wikipedia beginnings,
-
where I think it was Mathias Schindler,
-
who started
with his Content Liberation Army.
-
And the idea that,
you know, you have to really go in,
-
and the data is there.
-
But for example,
I had a project with a student
-
where we were looking
-
at where the trees
are geolocated in Berlin,
-
and this is sometimes on paper,
it's sometimes on a stupid database.
-
We were accused of being terrorists
-
by the people who didn't want
to give us the data.
-
We had to get really, really
picky about this and point to the laws
-
saying, "This is open data,
and you have to give it to us."
-
But we have to sort of go in friendly,
as you were saying,
-
and try and explain to them
what they will get from it.
-
Many of them don't see
that it is of any use to them
-
because it's more work for them
having to deal with us.
-
I think that's one
of the main kind of fears
-
which is that there are people coming
who are just putting more work onto us.
-
And at the same time,
there's so little understanding
-
that this is just part
of what they are doing already.
-
And that they can really also
-
learn and get a lot of input
-
from the people
who are asking about that data.
-
But this is really culture change,
-
a cultural change
especially here in Germany.
-
So we are working on it.
-
We are working hard,
but it's really kind of a tough thing.
-
- Maybe I can add?
- Yes.
-
I think what's also
really interesting to see
-
from the community's perspective
-
is that when we talk to different cities,
-
it so depends on who happens
to work in the cities.
-
Like we have this very small city of Moers
-
that is very unknown,
-
but if you talk to people
in the open data community,
-
everyone will know it
-
because they happen to pay someone
to do work on open data.
-
And when I talk to people
from the government in Berlin,
-
they tell me, "Okay, I now know
I have to publish open data,
-
but I don't know how, for whom, or why."
-
And I think this is actually
-
a chance for the smaller cities
to kind of champion this idea
-
because it's so much easier for them
-
to kind of get a movement
and to liberate some data
-
whereas if we talk in Berlin,
we always need to talk to 12 districts,
-
and they'll never align
on what data they want to publish.
-
(person 2) And we have a remote comment
-
from Beat Estermann
-
who wants to point out
he has some links in Etherpad
-
about "Interest in open government data
helps Swiss authorities
-
prioritize base registers
and controlled vocabularies."
-
And I'm told he just came in
-
while I'm reading his Etherpad entry.
-
So if you could just take the mic from me.
-
(person 2) Go on.
-
(Beat) Okay, thank you.
-
I missed the first introduction.
-
What did you start on?
-
- (person 2) I was just reading--
- (Beat) Oh, you were reading. Okay.
-
So we're currently running--
-
In Switzerland, we're running a survey
-
to kind of prioritize data
from within the government.
-
There are like base registers
or controlled vocabularies.
-
Because we think
that they would be crucial
-
to actually promote and boost
the publication of linked open data
-
across the public authorities,
-
so we're running a survey
to prioritize them.
-
And for some authorities
to know which ones to publish now
-
and for others--
-
for the community to know
where to put pressure on
-
and how to actually,
-
yeah, argue why they should publish it.
-
We're also collecting use cases.
-
I posted the link to the Etherpad.
-
It's in German and French only,
the questionnaires.
-
I'm sorry we're still not like up
-
five language count here,
but you said four languages-
-
(person 3) Just switch to English.
-
(Beat) Yeah, we could switch
to English, right.
-
Yeah, so that's one point.
-
The other point I think is we could...
-
all put a little bit more love
-
into kind of documenting
the whole WikiProject
-
Open Government Data,
-
and that's something
we're not really doing
-
if you compare it
to what is going on in GLAM.
-
I think that is definitely something
-
which I probably will try to figure out
-
after my vacation time,
-
which is starting on Monday.
-
There is this WikiProject,
-
and we need to figure out
who is interested in it,
-
what can we do there,
-
and how can we motivate people
-
from kind of [out] the Wikidata community
-
to add this important information to that.
-
So I do think there is a huge opportunity
-
to figure out how we can include
-
more of this really, really valuable
and reliable data into Wikidata.
-
But overall, there's a lot
of challenges as well,
-
and still it's kind of
a different crowd of people,
-
and we need to figure out
how to bring them together.
-
Any idea is welcome.
-
(Beat) Yeah, there is another point
-
which we're currently not focusing on
-
with this base register
and vocabulary thing.
-
But what I have had as a request
-
is to be able
to actually store tabular data
-
and to be able to pull it.
-
Because it does not make sense
-
to put like 200 years
of population statistics from Zurich
-
into that Wikidata item for Zurich.
-
Maybe I'll just pick that up
with an anecdote from my day job.
-
So I started to introduce Wikidata
to my colleagues.
-
We are a small team doing open data,
-
and it was fine,
and they were really, really interested,
-
but in the end we started
to add some of the population data,
-
and then, you know, there isn't any order.
-
So it's so hard to figure out
if you find a population figure
-
for year Y or X or something,
and if it is still missing.
-
So, of course,
there are still a lot of things
-
to improve in Wikidata as well,
-
and tabular data could be one of them also.
-
(person 4) [inaudible] Is it working?
-
I have a comment on the tabular data.
-
I remember we had also discussions
-
with the canton and the city
of Zurich about this,
-
and that it might make sense to start
-
discussions on whether
we should maybe consider
-
setting up a Wikibase
for open governmental data
-
and having such kind of datasets
-
and then link them to Wikidata
or link them from Wikidata to them,
-
because mostly
the linked open data technology
-
is actually enabling that
-
and is one of the key advantages
of this technology.
-
It is, of course, something
that doesn't relate only to OGD data,
-
it's a global divide
in the whole Wikidata community.
-
Because the larger we make
the central endpoint or the graph
-
the more difficult it is to handle it--
I think we all agree on that.
-
So I think there should be
a deeper conversation and discussion
-
on whether we should
start building this network.
-
Well, actually, there is already
a network of Wikibases.
-
We also work in the university
with publications and research data
-
with our own Wikibase.
-
Yeah, and then another comment
about the WikiProject.
-
So we continued working
and documenting the materials
-
of the events,
-
so we actually now have
two upcoming events in November.
-
We have a full weekend
technical training on Wikidata
-
in collaboration
with the open data Zurich people
-
and the canton of Zurich,
-
and also Wikimedia Switzerland,
and we have a hackathon.
-
But I totally agree that it would be great
-
to start having conversations
with all the participants
-
that have been listed already
in the project,
-
and start more discussions,
-
especially with all the countries
that have many good initiatives,
-
like Germany, like what you described
-
and start documenting
-
what are the specific needs
of these institutions,
-
what are the problems,
-
and what specific tools
we need to develop, or procedures,
-
so that we can help them import
or link data in Wikidata.
-
I think we're out of time.
One last question.
-
(person 5) So a proposal
to use Wikibase for that?
-
I'm not sure whether
that actually would solve
-
this tabular data problem.
-
And when thinking of statistical data,
like population data,
-
that is not data
that we want to really edit,
-
that's data we just want to consume.
-
So it means we have to ask ourselves
-
whether we want to build in
the capability to actually pull data
-
directly from external third-party
SPARQL endpoints,
-
and not just from
within this Wikibase ecosystem
-
that we're planning to build up as well.
-
(person 4) So I agree
that it doesn't solve the tabular data problem,
-
but what I was trying to say
-
is that the information
that is more specific,
-
it might be the case that we want
to export it to something else
-
and I see Wikibase also
as a very good data modeling example.
-
So not only because you want
to have humans editing,
-
but also because the whole data modeling
happening in Wikidata
-
with all the qualifiers and references
-
adds a lot to all the datasets.
-
So if we would do it from scratch in RDF
-
we would be missing these features
-
that Wikidata has,
and I see it as an advantage.
-
So that was a reason why I mentioned
-
that it would be very helpful
to maybe think of
-
Wikibases around the OGD data.
-
(moderator) So, I'm sorry,
but I think we just ran out of time,
-
and I encourage you
to keep talking with our speakers,
-
[inaudible] during the whole conference
-
and please, a round of applause for them.
-
(applause)
-
Thank you.