-
Hello. So, good afternoon.
-
Welcome to the OpenStreetMap
and Wikidata workshop.
-
My name is Eugene.
-
And together with Edward,
we'll be talking about OpenStreetMap
-
and how it can work together
with Wikidata.
-
So, just a show of hands.
-
Who here has an OpenStreetMap account?
-
Okay, some. So, probably
this is not new to you.
-
But for those who are not familiar
with OpenStreetMap,
-
I'll give an introduction
to OpenStreetMap and its data model.
-
So, basically, what is OpenStreetMap?
-
It is basically a crowdsourced project
to map the whole world.
-
And the usual way we introduce
OpenStreetMap to people
-
is to say that OpenStreetMap
is like Wikipedia for maps.
-
But actually, a more accurate way
to introduce OpenStreetMap
-
is that it is like Wikidata
for geographical data.
-
But that presupposes that the audience
already knows or is familiar
-
with what Wikidata is.
-
And why do we say
that OpenStreetMap is like Wikidata?
-
And that's because both
have quite a lot of things in common,
-
both being crowdsourced
and open data projects.
-
So, you know Wikidata--
-
it has items, statements,
properties, et cetera.
-
In the same way, OpenStreetMap
has things like nodes, ways, relations,
-
that have members and roles,
and these have tags
-
that are composed of keys and values.
-
So, as more detail, nodes, ways,
and relations model the geometry
-
and topology of objects.
-
And then, we have tags,
which are actually key-value strings
-
that describe the actual things
that those objects represent.
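-
As a rough sketch of the data model just described, the three element types and their tags could be written down like this in Python. The class names, field names, IDs, and coordinates are illustrative assumptions, not OpenStreetMap's actual schema or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """A single point; carries geometry (lat/lon) plus free-form tags."""
    id: int
    lat: float
    lon: float
    tags: Dict[str, str] = field(default_factory=dict)


@dataclass
class Way:
    """An ordered list of node IDs (a line, or an area if it is closed)."""
    id: int
    node_ids: List[int]
    tags: Dict[str, str] = field(default_factory=dict)


@dataclass
class Member:
    """One member of a relation, with the role it plays there."""
    type: str  # "node", "way" or "relation"
    ref: int
    role: str


@dataclass
class Relation:
    """A group of members with roles, for example a boundary."""
    id: int
    members: List[Member]
    tags: Dict[str, str] = field(default_factory=dict)


# Made-up IDs and coordinates, purely for illustration.
lighthouse = Node(id=1, lat=54.0, lon=8.0, tags={"man_made": "lighthouse"})
boundary = Relation(
    id=2,
    members=[Member(type="way", ref=10, role="outer")],
    tags={"boundary": "administrative"},
)
```

The geometry lives in the element types, while the meaning lives entirely in the string-to-string tags.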
-
So, to give an example,
here's the Wikidata item for Berlin.
-
So, we can have a property
like population, 3 million something,
-
with a qualifier, point in time,
and references.
-
And they have counterparts
in OpenStreetMap.
-
So, for example, the Berlin relation
in OpenStreetMap
-
has the tag population is equal
to 3.4 million something,
-
and it has another tag, source:population
equals this URL and that date.
-
So, unlike in Wikidata,
-
wherein you can have qualifiers
and references for your statements,
-
in OpenStreetMap, the tag is quite flat.
-
There are no secondary levels of tags.
-
Everything is flat.
-
And that's why we have to put
what you call secondary tags.
-
So, for example here, source:population
-
to indicate that the population tag
has this source.
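-
To make the "flat" point concrete, here is a minimal Python sketch of how a value plus its source ends up as two sibling tags rather than a statement with a reference. The helper name and the placeholder source string are assumptions for illustration; the figure is only the rough one mentioned above.

```python
from typing import Dict, Optional


def statement_to_flat_tags(key: str, value: str,
                           source: Optional[str] = None) -> Dict[str, str]:
    """Flatten a value and its source into sibling OSM-style tags.

    There is no nesting: the source becomes a second tag whose key
    is prefixed with "source:".
    """
    tags = {key: value}
    if source is not None:
        tags["source:" + key] = source
    return tags


# Placeholder values, not the real tags on the Berlin relation.
berlin_tags = statement_to_flat_tags(
    "population", "3400000", source="https://example.org/census 2010-01-01"
)
```

Both entries sit at the same level of the dictionary, which is exactly the limitation being described.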
-
Another thing is that OpenStreetMap's tags
are not strictly controlled.
-
Unlike with Wikidata, wherein you have
to have an approval process
-
before properties are created,
-
here, OpenStreetMap mappers
-
can invent and add
any tags that they like.
-
However, there is a tagging
proposal process
-
in order to propose common tags
-
that will be used by mappers
all over the world.
-
Okay, data modeling discussions
on the Wikidata:Project chat page
-
are actually quite similar
to the discussions
-
in OpenStreetMap's tagging mailing list.
-
For example, here's a discussion
on the Project chat,
-
how do we model a building
that has changed its use?
-
In OpenStreetMap,
we have similar discussions.
-
How do we tag these sorts of buildings?
-
So, I've given an introduction
to what OpenStreetMap is.
-
I'd love to discuss it more,
but we don't have enough time.
-
So, we'll go into how we link
-
OpenStreetMap
and Wikidata together.
-
I don't have to explain
why linking is a good thing.
-
We're all Wikidatans, and we know
that linking data is a good thing.
-
So, how do we actually link
Wikidata with OpenStreetMap?
-
So, from Wikidata to OpenStreetMap,
-
Wikidata items on places can link
to OpenStreetMap relations
-
using the OSM relation ID,
or the P402 property.
-
So, the question is: why only relations?
-
That's because OSM IDs are not stable.
-
For example, you can change nodes
to represent a different object.
-
Ways can be split
-
to add new information about those ways.
-
However, relations
in OpenStreetMap are relatively stable,
-
at least for major relations,
such as administrative boundaries,
-
or highway routes,
or public transportation routes.
-
That way, for example,
-
the Wikidata item for Berlin
-
can link to the relation
representing the boundary
-
in OpenStreetMap via its ID.
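-
As a small illustration of following that link, a P402 value can be turned into a browsable OpenStreetMap URL. This is a sketch: the function name is made up, and the ID used below is a placeholder, not Berlin's actual relation ID.

```python
def osm_relation_url(relation_id: str) -> str:
    """Build the openstreetmap.org URL for a relation ID (a P402 value)."""
    if not relation_id.isdigit():
        raise ValueError("relation IDs are plain positive integers")
    return "https://www.openstreetmap.org/relation/" + relation_id


# "12345" is a placeholder ID used only for illustration.
example_url = osm_relation_url("12345")
```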
-
In terms of the ontology,
Wikidata items and properties
-
for geographical features
can link to "equivalent," in quotes,
-
OpenStreetMap classes
using the OSM tag or key property.
-
For example, the lighthouse item
in Wikidata has the value
-
for OpenStreetMap tag or key
Tag:man_made=lighthouse.
-
That means that lighthouses are equivalent
to objects that are tagged
-
in OpenStreetMap
with man_made=lighthouse.
-
Going in the other direction,
OpenStreetMap objects can link
-
to corresponding Wikipedia articles
and Wikidata items
-
using the Wikipedia
and Wikidata tags, respectively.
-
So, here's an example.
-
The OpenStreetMap relation for Berlin.
-
We have the Wikidata tag, Q64,
-
and the Wikipedia article
linking to the German article for Berlin.
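-
Going the other way, the two tag values can be resolved to URLs with a few lines of Python. This is a sketch under the assumption that the wikipedia tag holds a "language:Title" value such as de:Berlin; the function names are hypothetical.

```python
from urllib.parse import quote


def wikidata_tag_to_url(qid: str) -> str:
    """Turn a wikidata=* tag value such as "Q64" into an item URL."""
    if not (qid.startswith("Q") and qid[1:].isdigit()):
        raise ValueError("expected a Q ID like 'Q64'")
    return "https://www.wikidata.org/wiki/" + qid


def wikipedia_tag_to_url(value: str) -> str:
    """Turn a wikipedia=* tag value such as "de:Berlin" into an article URL."""
    lang, sep, title = value.partition(":")
    if not sep or not lang or not title:
        raise ValueError("expected 'language:Title'")
    return "https://{}.wikipedia.org/wiki/{}".format(
        lang, quote(title.replace(" ", "_"))
    )
```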
-
There are also several Wikidata
secondary tags,
-
such as for example, brand:wikidata,
architect:wikidata, artist:wikidata,
-
or name:etymology:wikidata.
-
We use this in order to exactly specify
-
what we are referring to.
-
For example, on the top part here,
we have the example.
-
There's an artwork in OpenStreetMap
-
that was created
by the artist named Herakut,
-
but who is that?
-
So, in order to specify exactly,
we use artist:wikidata,
-
and that Q ID number.
-
So that you can be exactly sure
which Herakut artist it really is.
-
This is also useful, for example,
if you're tagging, for example,
-
objects in OpenStreetMap
that are in a different language.
-
For example, in Japan, you might have
a fast-food restaurant
-
called Makudonarudo,
which is actually McDonald's.
-
So, you can tag that using
the brand:wikidata tag
-
pointing to the McDonald's
item in Wikidata.
-
So, in terms of ontology,
we define and describe the tags
-
in OpenStreetMap
on the OpenStreetMap Wiki,
-
and we can add links to corresponding
Wikipedia articles and Wikidata items
-
so that we can sort of explain
the correspondences and relations
-
between these tags and items in Wikidata.
-
Okay, so how do OpenStreetMap
and Wikimedia use each other's data?
-
So, first, we have the interactive maps.
-
So, OpenStreetMap data powers
-
the Wikimedia Foundation's
Kartotherian map tile service,
-
which is used by the Kartographer
MediaWiki extension.
-
So, basically, any time you see
an interactive map
-
or almost any interactive map
on any Wikimedia project,
-
that is usually powered
by the Kartotherian map tile service.
-
For example, here's the interactive map
for Berlin in the English Wikivoyage.
-
So, the base map there is all coming
from OpenStreetMap.
-
So, another thing that
the Kartographer extension can do
-
is it can pull and overlay geometry
from OpenStreetMap.
-
So, here's the infobox on Commons
for the Berlin category.
-
And the map there, you can see
an outline for Berlin, there.
-
That outline comes from OpenStreetMap.
-
In 2018, the foundation released
localized map tiles for Kartotherian,
-
and this leveraged the multilingual
name tags in OpenStreetMap,
-
so that you can view those maps
that you see on Wikimedia projects
-
in the user's language.
-
Then, how do we use Wikidata
in OpenStreetMap?
-
For example, when tagging brands,
-
for example, in shops
and restaurants or banks,
-
OpenStreetMap's Name Suggestion Index
uses Wikidata to provide brand identity
-
and improved tagging.
-
So, for example, if you tag
an object in OpenStreetMap
-
with brand:wikidata pointing
to the McDonald's item in Wikidata,
-
the name field is now automatically locked
-
so that users cannot just change that
to, for example, Burger King.
-
And then, the editor can also pull icons,
-
the McDonald's icon there
-
that is taken from the Facebook
ID property
-
in Wikidata.
-
So, yeah.
-
So that, at least, when users are tagging
these shops in OpenStreetMap,
-
they can be sure
that they're doing it correctly.
-
Okay, so Sophox is a SPARQL endpoint
for OpenStreetMap data.
-
So, this service can use RDF federation
to also query linked Wikidata items.
-
So, actually, in OpenStreetMap,
we usually use other query services,
-
such as Overpass.
-
But if you want to also query
using Wikidata items,
-
we have the Sophox endpoint
that you can use.
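-
For the Overpass side, a query for objects carrying a given wikidata tag can be assembled as a plain string. The sketch below only builds Overpass QL text and sends nothing over the network; the function name is an assumption, and Q64 is the Berlin example from earlier.

```python
def overpass_query_for_wikidata(qid: str) -> str:
    """Build an Overpass QL query for relations tagged wikidata=<qid>."""
    if not (qid.startswith("Q") and qid[1:].isdigit()):
        raise ValueError("expected a Q ID like 'Q64'")
    # [out:json] asks for JSON output; "out tags" returns only the tags.
    return '[out:json];relation["wikidata"="{}"];out tags;'.format(qid)


berlin_query = overpass_query_for_wikidata("Q64")
```

The resulting string is what you would paste into an Overpass front end or post to an Overpass API instance.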
-
And for geocoding,
if you're not familiar with geocoding,
-
basically, that's the technology
wherein given an address,
-
you are returned geocoordinates.
-
So, we have what we call Nominatim,
which is the usual service
-
in OpenStreetMap for doing geocoding.
-
And previously, it already used
Wikipedia tags in OpenStreetMap.
-
But this year, a Google Summer of Code
project added code
-
to integrate using Wikidata tags
in Nominatim,
-
so that search results
can become more relevant for users
-
who are doing the searches.
-
And for localization, Mapbox and MapTiler,
which are third-party companies
-
that extensively use OpenStreetMap,
-
they use Wikidata to power
their localized map products.
-
So, basically, if there
are missing name tags in OpenStreetMap,
-
and if that object is linked to Wikidata,
they can pull the labels from Wikidata,
-
and use that to show multilingual labels,
if that is missing in OpenStreetMap.
-
The reason for that is because
we have a philosophy in OpenStreetMap
-
that we do not try to add too many tags,
especially if that can be automated.
-
For example,
for automatic transliterations,
-
if that can be automated, we don't need
to add that to OpenStreetMap.
-
But in Wikidata, that's no problem.
-
So, you can do that by doing this linking
between OpenStreetMap and Wikidata.
-
You don't have to do
that transliteration on your own.
-
You can just pull it from Wikidata.
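-
The fallback idea can be sketched in a few lines of Python. How Mapbox or MapTiler actually implement this is not covered here, so the order of preference and the function name below are assumptions.

```python
from typing import Dict, Optional


def pick_label(osm_tags: Dict[str, str],
               wikidata_labels: Dict[str, str],
               lang: str) -> Optional[str]:
    """Choose a display label for one map object.

    Prefer an explicit name:<lang> tag, then the linked Wikidata
    item's label in that language, then the default local name.
    """
    localized = osm_tags.get("name:" + lang)
    if localized:
        return localized
    if lang in wikidata_labels:
        return wikidata_labels[lang]
    return osm_tags.get("name")


# A restaurant tagged only with its local name, linked to Wikidata.
tags = {"name": "マクドナルド"}
labels = {"en": "McDonald's"}
english_label = pick_label(tags, labels, "en")
```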
-
And also, the OpenStreetMap Wiki
has the Wikibase extension installed.
-
So, the idea here is that we want
the tag information,
-
or the description of the tags
that we use in OpenStreetMap,
-
to be machine-readable.
-
Hopefully, this will be used
by software and editors
-
that use OpenStreetMap data to better see
-
how objects are described
in OpenStreetMap.
-
Currently, this is not used as much,
-
but hopefully, as the tagging information
becomes more complete and better,
-
this can be used
by OpenStreetMap software,
-
thanks to the Wikibase installation.
-
Okay, some copyright and IP issues.
-
Wikidata can't import coordinates
from OpenStreetMap.
-
The reason for that is because
OpenStreetMap is licensed
-
under the Open Database License.
-
And also, we have conflicting doctrines.
-
Here in the European Union
and the United Kingdom,
-
we have database rights.
-
Whereas, in the US, we have the idea
that facts are not copyrightable.
-
So, we cannot just--
even though you cannot say--
-
you cannot copyright the fact
that this restaurant or this bank
-
or this place is at this location,
-
doing that as an import
or as a batch job
-
is not allowed
-
because OpenStreetMap
is protected by database rights
-
being hosted in the United Kingdom.
-
Conversely, OpenStreetMap
will not import geodata from Wikidata,
-
despite the CC0 license,
because of data provenance issues.
-
If you're not familiar
with how geocoordinates are added
-
into Wikipedia articles,
usually users just go to Google Maps,
-
search, and then copy the coordinates
that show up in the results,
-
and place that
into the Wikipedia articles.
-
In OpenStreetMap, we, as much as possible,
-
avoid copying data
from third-party sources
-
that are proprietary, such as,
for example, Google Maps.
-
And because of that, we will never,
in OpenStreetMap, never import data
-
from Wikipedia and also Wikidata,
-
because most coordinates in Wikidata
have been imported from Wikipedia.
-
So, it's an established principle
on OpenStreetMap
-
that we don't import from Wikipedia.
-
Okay, I'll just then
turn it over to Edward.
-
(Edward) I'm going to talk
about the process for adding links
-
from OpenStreetMap to Wikidata.
-
So, I've written a tool
for automating this process.
-
Like, it's user-assisted editing.
-
So, it's not a fully automated tool.
-
It's available. Anyone can use it.
-
There's the address.
-
So, when I run the tool on Berlin,
it finds 2,800 matches.
-
So, these are Wikidata items
-
where it thinks it has found
the same OpenStreetMap objects.
-
So, the matcher is using these criteria.
-
It looks for things
that are the same entity type.
-
They've got the same coordinates,
-
and then either they've got the same name,
street address, or identifier.
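-
Those criteria could be combined roughly as follows. This is a sketch, not the tool's actual code: "same coordinates" is interpreted here as "within some distance", and the 100 m default is a made-up threshold.

```python
import math


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two WGS84 points."""
    earth_radius_m = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


def is_candidate(same_type: bool, distance_m: float,
                 name_match: bool, address_match: bool, id_match: bool,
                 max_distance_m: float = 100.0) -> bool:
    """Same type AND close by AND (name OR address OR identifier matches)."""
    close_enough = distance_m <= max_distance_m
    return same_type and close_enough and (name_match or address_match or id_match)
```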
-
So, with the name, it's doing lots
of kind of normalization,
-
like lower casing, removing spaces,
all kinds of bits and pieces
-
to try and match up
slightly different ways
-
that you could write a name.
-
And I'm looking at names
from different sources,
-
like the labels, and the aliases,
-
but also the site links,
the article titles,
-
and I pull anything in bold
from the Wikipedia article,
-
so lots of sources for names.
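-
A minimal sketch of that kind of name normalization in Python. Lowercasing and removing spaces come from the description above; stripping accents and punctuation is an extra assumption standing in for the "bits and pieces".

```python
import unicodedata
from typing import Iterable


def normalize_name(name: str) -> str:
    """Reduce a name to a comparison key: no accents, case, spaces or punctuation."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return "".join(ch for ch in without_accents.casefold() if ch.isalnum())


def names_match(osm_names: Iterable[str], wikidata_names: Iterable[str]) -> bool:
    """True if any OSM name and any Wikidata-side name share a key.

    wikidata_names can mix labels, aliases, sitelink titles, and so on.
    """
    osm_keys = {normalize_name(n) for n in osm_names} - {""}
    wikidata_keys = {normalize_name(n) for n in wikidata_names} - {""}
    return bool(osm_keys & wikidata_keys)
```

Pooling every name source into one set before comparing is what makes the many sources useful: one hit from any of them is enough.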
-
These are the identifiers
that I'm matching on.
-
So, we've got lots
of identifiers in Wikidata.
-
OpenStreetMap has identifiers, as well.
-
So, I've got a mapping
between the name of the tag
-
for the identifier in OpenStreetMap,
and the property in Wikidata,
-
and I look for things
that have the same identifier.
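-
The identifier mapping can be pictured as a small lookup table. The two entries below, the iata and icao tags paired with Wikidata's P238 and P239 airport-code properties, are only an illustrative subset chosen for this sketch; the tool's real mapping is not shown in the talk.

```python
from typing import Dict, List, Tuple

# Illustrative subset: OSM tag key -> Wikidata property ID.
IDENTIFIER_TAG_TO_PROPERTY = {
    "iata": "P238",  # IATA airport code
    "icao": "P239",  # ICAO airport code
}


def shared_identifiers(osm_tags: Dict[str, str],
                       wikidata_claims: Dict[str, List[str]]
                       ) -> List[Tuple[str, str, str]]:
    """Return (tag key, property, value) for every identifier both sides share."""
    matches = []
    for tag_key, prop in IDENTIFIER_TAG_TO_PROPERTY.items():
        value = osm_tags.get(tag_key)
        if value and value in wikidata_claims.get(prop, []):
            matches.append((tag_key, prop, value))
    return matches


airport_tags = {"aeroway": "aerodrome", "iata": "TXL"}
airport_claims = {"P238": ["TXL"]}
found = shared_identifiers(airport_tags, airport_claims)
```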
-
So, my first version of this,
I tried to completely automate it,
-
and the OpenStreetMap community
was not impressed.
-
So, better to have
a semi-automated process,
-
so people put in a place name,
and then they see a list of matches,
-
and they go through
and they check the matches,
-
and when they're happy, they hit save.
-
And the OpenStreetMap community
is much happier with that.
-
It does make mistakes, the software,
it tries very carefully,
-
but there are errors in there.
-
So you have to have
someone checking them.
-
I've got a question of like,
when I designed this,
-
I felt like there should be
a one-to-one mapping
-
between things
in OpenStreetMap and Wikidata,
-
and it doesn't really work.
-
Like for my example, tunnels
often get represented as two objects
-
in OpenStreetMap--
one for each tunnel bore,
-
or each road lane within the tunnel--
-
whereas in Wikidata they tend
to be represented as a single item,
-
so I need to change my software
to take account of this.
-
And I have difficulties with tram stops.
-
So one item in Wikidata for a tram stop,
-
but in OpenStreetMap, it's represented
-
as a relation with nodes
-
where the tram stops
on either side of the road.
-
But I'm using a piece of software
called osm2pgsql
-
to do the OpenStreetMap side of things.
-
And it doesn't really support
these relations.
-
So, I'm struggling with tram stops.
-
And so, people are using this tool.
-
There are almost 10,000 changesets
uploaded to OpenStreetMap.
-
Edits on OpenStreetMap
are grouped into changesets;
-
they're not individual edits
like on Wikidata.
-
And I've got over 200 users.
-
And using this tool, there's been
a quarter of a million links added
-
to OpenStreetMap.
-
But overall, people are also adding
Wikidata links by hand,
-
or with other tools, and there are now
1.4 million OpenStreetMap objects
-
with a Wikidata tag.
-
Yeah, so that is our presentation.
-
Any questions?
-
And just while we're taking questions,
-
I'll see if I can do
a live demo of the tool.
-
Any questions?
-
(woman) I'm very interested in sort of
surpassing this license problem.
-
And can you tell me about strategies,
that can be already used, such as--
-
I understand that there
are some contributions that aren't CC0--
-
or like the users, or whatever they are--
-
that can facilitate the exchange
of information between the systems.
-
(Edward) It's true that when you sign up
to OpenStreetMap,
-
you can tick a box saying,
"My edits are CC0."
-
But the difficulty is that you
are often editing something
-
that somebody else has edited.
-
And so, it's difficult
to isolate the CC0 edits.
-
(woman) Maybe like-- further question,
like what can we do about that?
-
Like, can we discuss-- I mean,
I think it has been tried,
-
but I don't think it's necessarily
doomed to fail.
-
(Eugene) Well, the best thing we can do
is try to link items together
-
using Edward's tool and other tools.
-
But for the moment, we just try
to map these things separately.
-
Maybe we can coordinate--
-
for example, if the third-party database
that we want to import is compatible
-
with both Wikidata and OpenStreetMap,
you can do a coordinated import to both.
-
But otherwise, we really have
to respect the license,
-
because in order
for OpenStreetMap to work,
-
it really respects intellectual
property and copyright.
-
(man) Thank you. Is it possible
to change the language of the background
-
when you go to the map?
-
Because it appears the language
of the local place
-
that you are looking for.
-
(Eugene) So, in OpenStreetMap,
basically, we tag the default name
-
according to the local language
of that place.
-
So, for example, if you go to Japan
in OpenStreetMap,
-
you will see Japanese names.
-
You cannot do that using
the OpenStreetMap website,
-
but there are third-party services
or tile services
-
that provide multilingual maps.
-
As I mentioned, there's Mapbox,
there's MapTiler.
-
They provide multilingual maps
so that you can use that
-
instead of the default layer
in OpenStreetMap.
-
(man) [inaudible]
-
or from the OpenStreetMap [inaudible]?
-
Yeah, for example--
-
(man) [inaudible]
-
not actually to this tool,
but also [inaudible].
-
Well, currently, OpenStreetMap,
as a project, does not--
-
has no project to provide this service,
-
because the idea
is that we provide the data,
-
and other people can build on that
to provide the services
-
that users will be able to use.
-
(man 2) Yeah, this is a great project
for all to [inaudible] on Wikidata.
-
Say, in Wikidata,
we have a lot of locations,
-
which is already coded and it is CC0.
-
So, there are a lot of images,
a lot of other things are in Wikidata.
-
So, if we integrate
this Wikidata Q items to OSM,
-
can we pull this,
all the other information
-
from Wikidata directly to OpenStreetMap,
-
any kind of tool, or something like that?
-
Or can we add an image
which is in Commons?
-
Can we add the link of the image
in Commons to OpenStreetMap,
-
like this Wikidata ID?
-
I feel like you don't need to.
-
Just leave the data in Wikidata,
and access it through the link.
-
Like just add the link
from OpenStreetMap to Wikidata,
-
and then, if you want the images--
don't duplicate the data,
-
don't have the same thing in both places.
-
And like Eugene was saying,
it's a bit tricky copying the data.
-
It's true that it's CC0,
-
but if we just start
importing lots of data,
-
then OpenStreetMap's going to ask
what's the provenance of the data,
-
where has all this come from.
-
I mean, I don't know if Eugene--
if you've got anything to add to that.
-
(Eugene) Well, OpenStreetMap
does have an image tag.
-
So, you can add that image tag
pointing to a Commons file,
-
if you really want to.
-
But if you link it to the Wikidata item,
you don't need that,
-
because the Wikidata item
-
would probably have
a Commons category statement,
-
and that provides you links
to several images, as well.
-
You don't need to add that
in OpenStreetMap.
-
Can I just show this quick demo.
-
This is the software that I built.
-
So, I've looked up
Orange County in Indiana.
-
You can see, I've already run the software
in 2017, and I added 43 elements.
-
It guesses the language is English,
-
by looking at the number of languages
that the Wikidata labels are in.
-
And so the software
has found five matches,
-
and it's got a list of matches
with tick boxes.
-
There's a map.
-
It shows you the first paragraph
from the Wikipedia article
-
in the currently selected language.
-
If I say, show tags, these are the tags
from OpenStreetMap,
-
so it's matched--the name is the same.
-
It says it's found eight name matches,
and it's a hotel which matches.
-
I can say, show on map.
-
And the pin is the location
of the Wikidata coordinates,
-
and it's matched
this hotel building polygon.
-
So, I can go through,
and you can see some others.
-
There's a school.
-
It's failed with the airport.
-
The airport is represented
both as a node and as a way,
-
and it can't figure out which one to use,
-
so it isn't going to do the airport.
-
Here's a historic bank building
that it's managed to match.
-
There's an old_name tag in OpenStreetMap,
-
and it's matched the old name
with the name that is in Wikidata.
-
And then it's also matched up
a public library.
-
So, if I click on add wikidata tags
to OpenStreetMap,
-
it gives me a change comment field
where I could edit it--
-
change comment if I wanted.
-
And it shows me the same matches again.
-
And I hit save, and it's edited the map,
-
and it's added
the Wikidata tags to the map.
-
(applause)
-
([Muhammad]) It's actually not a question.
-
But first, thank you for this project.
-
My name is [Muhammad Hidjal]
from Palestine.
-
I am a civil engineer,
and I do some spatial statistics.
-
A few months ago, a magazine in my country
asked me to do some statistics
-
on Nobel Prize winners.
-
So, for that, I used
Wikidata Query Service,
-
and the ArcGIS program for geographic
information system analysis.
-
I extracted the place of birth
for all Nobel Prize winners,
-
and projected them on the map
using the ArcGIS program,
-
and then they asked me,
"How many of them--
-
how many of the winners were born
in the north part of the world,
-
how many of them were born
in the south part of the world?"
-
The problem here is
that the ArcGIS program is not free
-
and I don't want to use it anymore.
-
Can I do these statistics
using OpenStreetMap
-
after projecting the spatial
information of these?
-
Okay, so the problem is
that what you're doing--
-
what you're trying to do is
what we call a geospatial analysis.
-
However, OpenStreetMap is a data project.
We provide data.
-
And what you can do is, for example,
take the data from OpenStreetMap,
-
take the data from
your Nobel Prize places of birth,
-
and use a tool, like ArcGIS,
which is not free,
-
but there's an open source tool,
called QGIS,
-
which you can use to do
that spatial analysis that you want.
-
So, you can combine, for example,
-
the boundaries for northern countries
versus southern countries,
-
put that in QGIS, then put your data
with the Nobel Prize places of birth,
-
and then do an intersection
tool or function.
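-
If "north part" and "south part" simply mean the two hemispheres, the count needs no GIS software at all, only the latitude of each birthplace. The Python sketch below makes that assumption, which is a shortcut and not the boundary intersection described above; the sample latitudes are made up.

```python
from collections import Counter
from typing import Iterable


def count_by_hemisphere(latitudes: Iterable[float]) -> Counter:
    """Count points north and south of the equator.

    A latitude of exactly 0 is counted as "north" here, an arbitrary choice.
    """
    return Counter("north" if lat >= 0 else "south" for lat in latitudes)


# Made-up sample latitudes standing in for birthplace coordinates.
sample_counts = count_by_hemisphere([59.3, -33.9, 48.9])
```

For a split along country boundaries rather than the equator, a polygon intersection in a tool such as QGIS is still the way to go.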
-
So, yeah.
-
So, I think we're out of time,
and it's lunch now.
-
Thanks, everyone.
-
(applause)