Hello. So, good afternoon.
Welcome to the OpenStreetMap
and Wikidata workshop.
My name is Eugene.
And together with Edward,
we'll be talking about OpenStreetMap
and how it can work together
with Wikidata.
So, just a show of hands.
Who here has an OpenStreetMap account?
Okay, some. So, probably
this is not new to you.
But for those who are not familiar
with OpenStreetMap,
I'll give an introduction
to OpenStreetMap and its data model.
So, basically, what is OpenStreetMap?
It is basically a crowdsourced project
to map the whole world.
And the usual way we introduce
OpenStreetMap to people
is to say that OpenStreetMap
is like Wikipedia for maps.
But actually, a more accurate way
to introduce OpenStreetMap
is that it is like Wikidata
for geographical data.
But that presupposes that the audience
already knows or is familiar
with what Wikidata is.
And why do we say
that OpenStreetMap is like Wikidata?
And that's because both
have quite a lot of things in common,
both being crowdsourced
and open data projects.
So, you know Wikidata--
it has items, statements,
properties, et cetera.
In the same way, OpenStreetMap
has things like nodes, ways, relations,
that have members and roles,
and these have tags
that are composed of keys and values.
So, in more detail, nodes, ways,
and relations model the geometry
and topology of objects.
And then, we have tags,
which are actually key-value strings
that describe the actual things
that those objects represent.
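As a rough sketch of the data model just described, the three element
types and their tags could be written down like this in Python.
The class and field names are illustrative assumptions, not
OpenStreetMap's actual schema or API, and any IDs used with them are made up.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A single point with coordinates and free-form tags."""
    id: int
    lat: float
    lon: float
    tags: dict[str, str] = field(default_factory=dict)


@dataclass
class Way:
    """An ordered list of node IDs: a line, or an area when closed."""
    id: int
    node_ids: list[int]
    tags: dict[str, str] = field(default_factory=dict)


@dataclass
class Member:
    """One entry of a relation: which element, and in what role."""
    type: str  # "node", "way", or "relation"
    ref: int
    role: str


@dataclass
class Relation:
    """A group of members with roles, e.g. a boundary or a route."""
    id: int
    members: list[Member]
    tags: dict[str, str] = field(default_factory=dict)


def is_closed(way: Way) -> bool:
    """A way is closed when it ends on the node it started from."""
    return len(way.node_ids) > 2 and way.node_ids[0] == way.node_ids[-1]
```

The geometry and topology live in the node, way, and member references,
while the meaning of each object lives entirely in its tags dictionary.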
So, to give an example,
here's the Wikidata item for Berlin.
So, we can have a property
like population, 3 million something,
with a qualifier, point in time,
and references.
And they have counterparts
in OpenStreetMap.
So, for example, the Berlin relation
in OpenStreetMap
has the tag population is equal
to 3.4 million something,
and it has another tag, source:population
equals this URL and that date.
So, unlike in Wikidata,
wherein you can have qualifiers
and references for your statements,
in OpenStreetMap, the tag is quite flat.
There are no secondary levels of tags.
Everything is flat.
And that's why we have to put
what you call secondary tags.
So, for example here, source:population
to indicate that the population tag
has this source.
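To make the flatness concrete, here is a minimal sketch that regroups
a flat tag set so that each source:<key> tag sits next to the tag
it annotates. The function name is hypothetical, and the population
value and URL in the usage note are placeholders, not the real Berlin data.

```python
def group_sources(tags):
    """Pair every tag with its 'source:<key>' secondary tag, if present.

    OpenStreetMap stores both as ordinary sibling tags; the nesting
    produced here is only a convenience for the reader.
    """
    grouped = {}
    for key, value in tags.items():
        # Skip secondary tags; they are attached to their primary tag below.
        if key.startswith("source:") and key[len("source:"):] in tags:
            continue
        entry = {"value": value}
        source = tags.get("source:" + key)
        if source is not None:
            entry["source"] = source
        grouped[key] = entry
    return grouped
```

Calling it on {"population": "3400000", "source:population": "https://example.org/stats"}
gives one population entry carrying both the value and its source,
which is roughly what a Wikidata statement with a reference expresses natively.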
Another thing is that OpenStreetMap's tags
are not strictly controlled.
Unlike with Wikidata, where there is
an approval process
before properties are created,
here, OpenStreetMap mappers
can invent and add
any tags that they like.
However, there is a tagging
proposal process
in order to propose common tags
that will be used by mappers
all over the world.
Okay, data modeling discussions
on the Wikidata:Project chat page
are actually quite similar
to the discussions
in OpenStreetMap's tagging mailing list.
For example, here's a discussion
on Project chat:
how do we model a building
that has changed its use?
In OpenStreetMap,
we have similar discussions.
How do we tag these sorts of buildings?
So, I've given an introduction
to what OpenStreetMap is.
I'd love to discuss it more,
but we don't have enough time.
So, we'll go into how we link
OpenStreetMap
and Wikidata together.
I don't have to explain
why linking is a good thing.
We're all Wikidatans, and we know
that linking data is a good thing.
So, how do we actually link
Wikidata with OpenStreetMap?
So, from Wikidata to OpenStreetMap,
Wikidata items on places can link
to OpenStreetMap relations
using the OSM relation ID,
or the P402 property.
So, the question is: why only relations?
That's because OSM IDs are not stable.
For example, you can change nodes
to represent a different object.
Ways can be split
to add new information about those ways.
However, relations
in OpenStreetMap are relatively stable.
At least for major relations,
such as administrative boundaries,
or highway routes,
or public transportation routes.
That way, for example, here,
the Berlin Wikidata item
can link to the relation
representing the boundary
in OpenStreetMap via its ID.
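A minimal sketch of following that link in code: one helper builds
a SPARQL query for an item's P402 value, and another turns a relation ID
into an openstreetmap.org URL. The function names are hypothetical,
and the sketch only builds strings; it does not call any service.

```python
import re


def p402_query(qid):
    """Build a SPARQL query asking for an item's OSM relation ID (P402)."""
    if not re.fullmatch(r"Q[1-9]\d*", qid):
        raise ValueError(f"not a Wikidata item ID: {qid!r}")
    return f"SELECT ?relation WHERE {{ wd:{qid} wdt:P402 ?relation . }}"


def osm_relation_url(relation_id):
    """Turn a relation ID (as stored in P402) into a browsable URL."""
    return f"https://www.openstreetmap.org/relation/{int(relation_id)}"
```

The query string could be sent to the Wikidata Query Service, and each
returned ID passed to osm_relation_url.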
In terms of the ontology,
Wikidata items and properties
for geographical features
can link to "equivalent," in quotes,
OpenStreetMap classes
using the OSM tag or key property.
For example, the lighthouse item
in Wikidata has the value
for OpenStreetMap tag or key
Tag:man_made=lighthouse.
That means that lighthouses are equivalent
to objects that are tagged
in OpenStreetMap
with man_made=lighthouse.
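As a sketch of how such a value could be used, the snippet below parses
a string like Tag:man_made=lighthouse and checks whether an object's tags
fit it. The function names are hypothetical, and support for a
Key:<key> form (matching any value) is an assumption about how the
property is filled in.

```python
def parse_tag_or_key(value):
    """Split 'Tag:key=value' or 'Key:key' into a (key, value) pair.

    For the 'Key:' form the returned value is None, meaning any value.
    """
    kind, _, rest = value.partition(":")
    if kind == "Tag" and "=" in rest:
        key, _, val = rest.partition("=")
        return key, val
    if kind == "Key" and rest:
        return rest, None
    raise ValueError(f"unrecognised tag-or-key value: {value!r}")


def object_matches(tags, tag_or_key):
    """Check whether an object's tags satisfy the given class description."""
    key, val = parse_tag_or_key(tag_or_key)
    if val is None:
        return key in tags
    return tags.get(key) == val
```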
Going in the other direction,
OpenStreetMap objects can link
to corresponding Wikipedia articles
and Wikidata items
using the Wikipedia
and Wikidata tags, respectively.
So, here's an example.
The OpenStreetMap relation for Berlin.
We have the Wikidata tag, Q64,
and the Wikipedia article
linking to the German article for Berlin.
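For illustration, these two tag values can be turned into web links
with a few lines of Python. This is a sketch under the assumption that
the wikipedia tag has the form <language>:<title>, as in the German
Berlin example; the function names are made up.

```python
from urllib.parse import quote


def wikidata_url(qid):
    """URL of the Wikidata item named by a 'wikidata' tag value."""
    if not (qid.startswith("Q") and qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata item ID: {qid!r}")
    return "https://www.wikidata.org/wiki/" + qid


def wikipedia_url(tag_value):
    """URL of the article named by a 'wikipedia' tag value like 'de:Berlin'."""
    lang, sep, title = tag_value.partition(":")
    if not (sep and lang and title):
        raise ValueError(f"expected '<language>:<title>', got {tag_value!r}")
    # Wikipedia URLs use underscores instead of spaces in titles.
    return f"https://{lang}.wikipedia.org/wiki/{quote(title.replace(' ', '_'))}"
```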
There are also several Wikidata
secondary tags,
such as, for example, brand:wikidata,
architect:wikidata, artist:wikidata,
or name:etymology:wikidata.
We use this in order to exactly specify
what we are referring to.
For example, on the top part here,
we have the example.
There's an artwork in OpenStreetMap
that was created
by the artist named Herakut,
but who is that?
So, in order to specify exactly,
we use artist:wikidata,
and that Q ID number.
So that you can be exactly sure
which Herakut artist it really is.
This is also useful, for example,
if you're tagging, for example,
objects in OpenStreetMap
that are in a different language.
For example, in Japan, you might have
a fast-food restaurant
called Makudonarudo,
which is actually McDonald's.
So, you can tag that using
the brand:wikidata tag
pointing to the McDonald's
item in Wikidata.
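A small sketch of reading these secondary tags: it collects every
*:wikidata tag on an object and reports which aspect (brand, artist,
and so on) each one describes. The function name and the "self" label
are illustrative choices; Q38076 is used here as the McDonald's item.

```python
def wikidata_links(tags):
    """Map each aspect of an object to the Wikidata item it points at.

    The plain 'wikidata' tag describes the object itself; a tag such as
    'brand:wikidata' describes the brand, 'artist:wikidata' the artist.
    """
    links = {}
    for key, value in tags.items():
        if key == "wikidata":
            links["self"] = value
        elif key.endswith(":wikidata"):
            links[key[: -len(":wikidata")]] = value
    return links


restaurant = {
    "amenity": "fast_food",
    "name": "マクドナルド",
    "brand:wikidata": "Q38076",
}
print(wikidata_links(restaurant))  # {'brand': 'Q38076'}
```

Whatever script the name tag is written in, the brand stays unambiguous.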
So, in terms of ontology,
we define and describe the tags
in OpenStreetMap
on the OpenStreetMap Wiki,
and we can add links to corresponding
Wikipedia articles and Wikidata items
so that we can sort of explain
the correspondences and relations
between these tags and items in Wikidata.
Okay, so how do OpenStreetMap
and Wikimedia use each other's data?
So, first, we have the interactive maps.
So, OpenStreetMap data powers
the Wikimedia Foundation's
Kartotherian map tile service,
which is used by the Kartographer
MediaWiki extension.
So, basically, any time you see
an interactive map
or almost any interactive map
on any Wikimedia project,
that is usually powered
by the Kartotherian map tile service.
For example, here's the interactive map
for Berlin in the English Wikivoyage.
So, the base map there is all coming
from OpenStreetMap.
So, another thing that
the Kartographer extension can do
is it can pull and overlay geometry
from OpenStreetMap.
So, here's the infobox on Commons
for the Berlin category.
And the map there, you can see
an outline for Berlin, there.
That outline comes from OpenStreetMap.
In 2018, the foundation released
localized map tiles for Kartotherian,
and this leveraged the multilingual
name tags in OpenStreetMap,
so that you can view those maps
that you see on Wikimedia projects
in the user's language.
Then, how do we use Wikidata
in OpenStreetMap?
When tagging brands,
for example, in shops
and restaurants or banks,
OpenStreetMap's Name Suggestion Index
uses Wikidata to provide brand identity
and improved tagging.
So, for example, if you tag
an object in OpenStreetMap
with brand:wikidata pointing
to the McDonald's item in Wikidata,
the name field is now automatically locked
so that users cannot just change that
to, for example, Burger King.
And then, the editor can also pull icons,
like the McDonald's icon there
that is taken via the Facebook ID
property
in Wikidata.
So, yeah.
That way, at least, when users are tagging
these shops in OpenStreetMap,
they can be sure
that they're doing it correctly.
Okay, so Sophox is a SPARQL endpoint
for OpenStreetMap data.
So, this service can use RDF federation
to also query linked Wikidata items.
So, actually, in OpenStreetMap,
we usually use other query services,
such as Overpass.
But if you want to also query
using Wikidata items,
we have the Sophox endpoint
that you can use.
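For comparison, here is a sketch of the Overpass side: a helper that
builds an Overpass QL query for every node, way, or relation carrying
a given wikidata tag. The function name and default timeout are
assumptions, and the sketch only builds the query text; it would still
need to be sent to an Overpass API server.

```python
import re


def overpass_query_for_wikidata(qid, timeout=25):
    """Build an Overpass QL query for elements tagged wikidata=<qid>."""
    if not re.fullmatch(r"Q[1-9]\d*", qid):
        raise ValueError(f"not a Wikidata item ID: {qid!r}")
    return (
        f"[out:json][timeout:{int(timeout)}];\n"
        f'nwr["wikidata"="{qid}"];\n'  # nwr = nodes, ways, and relations
        "out tags center;"
    )
```

Overpass can find objects by their wikidata tag, but it knows nothing
about the linked item's statements, which is where a federated
SPARQL endpoint comes in.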
And for geocoding,
if you're not familiar with geocoding,
basically, that's the technology
wherein given an address,
you are returned geocoordinates.
So, we have what we call Nominatim,
which is the usual service
in OpenStreetMap for doing geocoding.
And previously, it already used
Wikipedia tags in OpenStreetMap.
But this year, we added code
from a Google Summer of Code project
to integrate Wikidata tags
in Nominatim,
so that search results
can become more relevant for users
who are doing the searches.
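As a sketch of what a geocoding request looks like, the helper below
builds a search URL for the public Nominatim instance. The function
name and default limit are assumptions; the extratags parameter asks
Nominatim to include extra tags in the results where available.
Nothing is sent here, and a real client should follow the
service's usage policy, including an identifying User-Agent.

```python
from urllib.parse import urlencode

NOMINATIM_SEARCH = "https://nominatim.openstreetmap.org/search"


def nominatim_search_url(query, limit=5):
    """Build a Nominatim search URL for a free-form address or place name."""
    params = {
        "q": query,
        "format": "jsonv2",
        "limit": limit,
        "extratags": 1,
    }
    return NOMINATIM_SEARCH + "?" + urlencode(params)


print(nominatim_search_url("Berlin"))
# https://nominatim.openstreetmap.org/search?q=Berlin&format=jsonv2&limit=5&extratags=1
```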
And for localization, Mapbox and MapTiler,
which are third-party companies
that extensively use OpenStreetMap,
use Wikidata to power
their localized map products.
So, basically, if there
are missing name tags in OpenStreetMap,
and if that object is linked to Wikidata,
they can pull the labels from Wikidata,
and use that to show multilingual labels,
if those are missing in OpenStreetMap.
The reason for that is because
we have a philosophy in OpenStreetMap
that we do not try to add too many tags,
especially if that can be automated.
For example,
for automatic transliterations,
if that can be automated, we don't need
to add that to OpenStreetMap.
But in Wikidata, that's no problem.
So, you can do that by doing this linking
between OpenStreetMap and Wikidata.
You don't have to do
that transliteration on your own.
You can just pull it from Wikidata.
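The fallback just described can be sketched in a few lines. This is
not Mapbox's or MapTiler's actual code, only an assumed order of
preference: the localized OpenStreetMap name first, then the Wikidata
label, then the default local name. The function name is made up.

```python
def display_name(osm_tags, wikidata_labels, lang):
    """Choose a label for a map feature in the requested language.

    osm_tags: the object's OpenStreetMap tags.
    wikidata_labels: language code -> label of the linked Wikidata item.
    """
    localized = osm_tags.get(f"name:{lang}")
    if localized:
        return localized
    label = wikidata_labels.get(lang)
    if label:
        return label
    # Fall back to the default name in the local language of the place.
    return osm_tags.get("name")


tokyo = {"name": "東京", "name:en": "Tokyo"}
print(display_name(tokyo, {"de": "Tokio"}, "de"))  # Tokio
```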
And also, the OpenStreetMap Wiki
has the Wikibase extension installed.
So, the idea here is that we want
the tag information,
or the description of the tags
that we use in OpenStreetMap,
to be machine-readable.
Hopefully, this will be used
by software and editors
that use OpenStreetMap data to better see
how objects are described
in OpenStreetMap.
Currently, this is not used as much,
but hopefully, as the tagging information
becomes more complete and better,
this can be used
by OpenStreetMap software,
thanks to the Wikibase installation.
Okay, some copyright and IP issues.
Wikidata can't import coordinates
from OpenStreetMap.
The reason for that is because
OpenStreetMap is licensed
under the Open Database License.
And also, we have conflicting doctrines.
Here in the European Union
and the United Kingdom,
we have database rights.
Whereas, in the US, we have the idea
that facts are not copyrightable.
So, we cannot just--
even though you cannot say--
you cannot copyright the fact
that this restaurant or this bank
or this place is at this location,
doing that as an import
or as a batch job
is not allowed
because OpenStreetMap
is protected by database rights
being hosted in the United Kingdom.
Conversely, OpenStreetMap
will not import geodata from Wikidata,
despite the CC0 license,
because of data provenance issues.
If you're not familiar
with how geocoordinates are added
into Wikipedia articles,
usually users just go to Google Maps,
search, and then copy the coordinates
that show up in the results,
and place that
into the Wikipedia articles.
In OpenStreetMap, we, as much as possible,
avoid copying data
from third-party sources
that are proprietary, such as,
for example, Google Maps.
And because of that, we will never,
in OpenStreetMap, import data
from Wikipedia and also Wikidata,
because most coordinates in Wikidata
have been imported from Wikipedia.
So, it's an established principle
on OpenStreetMap
that we don't import from Wikipedia.
Okay, I'll just then
turn it over to Edward.
(Edward) I'm going to talk
about the process for adding links
from OpenStreetMap to Wikidata.
So, I've written a tool
for automating this process.
Like, it's user-assisted editing.
So, it's not a fully automated tool.
It's available. Anyone can use it.
There's the address.
So, when I run the tool on Berlin,
it finds 2,800 matches.
So, these are Wikidata items
where it thinks it has found
the same OpenStreetMap objects.
So, the matcher is using these criteria.
It looks for things
that are the same entity type.
They've got the same coordinates,
and then either they've got the same name,
street address, or identifier.
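One way to read "the same coordinates" in code is "within some
distance of each other". The sketch below uses the haversine formula
for that check. The 100-metre threshold and the function names are
assumptions for illustration, not values taken from the tool.

```python
import math

EARTH_RADIUS_M = 6_371_000.0


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def is_nearby(a, b, max_m=100.0):
    """True when two (lat, lon) pairs are within max_m metres."""
    return distance_m(a[0], a[1], b[0], b[1]) <= max_m
```

A proximity check like this only narrows the candidates; the name,
address, or identifier comparison then decides whether it is a match.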
So, with the name, it's doing lots
of kind of normalization,
like lower casing, removing spaces,
all kinds of bits and pieces
to try and match up
slightly different ways
that you could write a name.
And I'm looking at names
from different sources,
like the labels, and the aliases,
but also the site links,
the article titles,
and I pull anything in bold
from the Wikipedia article,
so lots of sources for names.
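A minimal sketch of that kind of name normalization, assuming the
steps mentioned (lower-casing, removing spaces) plus stripping accents
and punctuation. The exact rules in the tool are not given in the
talk, so the function names and rules here are illustrative.

```python
import re
import unicodedata


def normalize(name):
    """Reduce a name to a bare form that ignores case, accents, and spacing."""
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = text.casefold()
    # Drop spaces, punctuation, and underscores; keep letters and digits.
    return re.sub(r"[\W_]+", "", text)


def names_match(osm_names, wikidata_names):
    """True when any OSM name equals any Wikidata-side name after normalizing."""
    left = {normalize(n) for n in osm_names if n}
    right = {normalize(n) for n in wikidata_names if n}
    left.discard("")
    return bool(left & right)


print(normalize("St. Mary's Church"))  # stmaryschurch
```

The Wikidata-side list would hold labels, aliases, sitelink titles,
and the bold text pulled from the Wikipedia article.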
These are the identifiers
that I'm matching on.
So, we've got lots
of identifiers in Wikidata.
OpenStreetMap has identifiers, as well.
So, I've got a mapping
between the name of the tag
for the identifier in OpenStreetMap,
and the property in Wikidata,
and I look for things
that have the same identifier.
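The identifier mapping could look something like the sketch below.
The two entries shown (IATA and ICAO airport codes) are picked for
illustration and are not the tool's actual list; the function name
and the shape of the claims dictionary are assumptions too.

```python
# OpenStreetMap tag key -> Wikidata property holding the same identifier.
# Illustrative subset only.
ID_TAG_TO_PROPERTY = {
    "iata": "P238",  # IATA airport code
    "icao": "P239",  # ICAO airport code
}


def shared_identifiers(osm_tags, wikidata_claims):
    """List the identifiers that an OSM object and a Wikidata item share.

    wikidata_claims maps a property ID to the list of its string values.
    """
    matches = []
    for tag_key, prop in ID_TAG_TO_PROPERTY.items():
        osm_value = osm_tags.get(tag_key)
        if osm_value and osm_value in wikidata_claims.get(prop, []):
            matches.append((tag_key, prop, osm_value))
    return matches
```

An identifier match is strong evidence, because it does not depend
on how the name happens to be written.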
So, my first version of this,
I tried to completely automate it,
and the OpenStreetMap community
was not impressed.
So, better to have
a semi-automated process,
so people put in a place name,
and then they see a list of matches,
and they go through
and they check the matches,
and when they're happy, they hit save.
And the OpenStreetMap community
is much happier with that.
It does make mistakes, the software,
it tries very carefully,
but there are errors in there.
So you have to have
someone checking them.
I've got a question of like,
when I designed this,
I felt like there should be
a one-to-one mapping
between things
in OpenStreetMap and Wikidata,
and it doesn't really work.
For example, tunnels
often get represented as two objects
in OpenStreetMap--
one for each tunnel bore,
or each road lane within the tunnel--
whereas in Wikidata they tend
to be represented as a single item,
so I need to change my software
to take account of this.
And I have difficulties with tram stops.
So one item in Wikidata for a tram stop,
but in OpenStreetMap, it's represented
as a relation with nodes
where the tram stops
on either side of the road.
But I'm using a piece of software
called osm2pgsql
to do the OpenStreetMap side of things.
And it doesn't really support
these relations.
So, I'm struggling with tram stops.
And so, people are using this tool.
There are almost 10,000 changesets
uploaded to OpenStreetMap.
Edits on OpenStreetMap
are grouped into changesets;
they're not individual edits
like on Wikidata.
And I've got over 200 users.
And using this tool, there have been
a quarter of a million links added
to OpenStreetMap.
But overall, with other people adding
Wikidata links by hand,
or with other tools, there are now
1.4 million OpenStreetMap objects
with a Wikidata tag.
Yeah, so that is our presentation.
Any questions?
And just while we're taking questions,
I'll see if I can do
a live demo of the tool.
Any questions?
(woman) I'm very interested in sort of
getting past this license problem.
And can you tell me about strategies
that can already be used, such as--
I understand that there
are some contributions that are CC0--
or like the users, or whatever they are--
that can facilitate the exchange
of information between the systems.
(Edward) It's true that when you sign up
to OpenStreetMap,
you can tick a box saying,
"My edits are CC0."
But the difficulty is that you
are often editing something
that somebody else has edited.
And so, it's difficult
to isolate the CC0 edits.
(woman) Maybe like-- further question,
like what can we do about that?
Like, can we discuss-- I mean,
I think it has been tried,
but I don't think it's necessarily
doomed to fail.
(Eugene) Well, the best thing we can do
is try to link items together
using Edward's tool and other tools.
But for the moment, we just try
to map these things separately.
Maybe we can coordinate--
for example, if the third-party database
that we want to import is compatible
with both Wikidata and OpenStreetMap,
you can do a coordinated import to both.
But otherwise, we really have
to respect the license,
because in order
for OpenStreetMap to work,
it really respects intellectual
property and copyright.
(man) Thank you. Is it possible
to change the language of the background
when you go to the map?
Because it appears in the language
of the local place
that you are looking for.
(Eugene) So, in OpenStreetMap,
basically, we tag the default name
according to the local language
of that place.
So, for example, if you go to Japan
in OpenStreetMap,
you will see Japanese names.
You cannot change that using
the OpenStreetMap website,
but there are third-party services
or tile services
that provide multilingual maps.
As I mentioned, there's Mapbox,
there's MapTiler.
They provide multilingual maps
so that you can use that
instead of the default layer
in OpenStreetMap.
(man) [inaudible]
or from the OpenStreetMap [inaudible]?
(Eugene) Yeah, for example--
(man) [inaudible]
not actually to this tool,
but also [inaudible].
(Eugene) Well, currently, OpenStreetMap,
as a project, has
no project to provide this service,
because the idea
is that we provide the data,
and other people can build on that
to provide the services
that users will be able to use.
(man 2) Yeah, this is a great project
for all to [inaudible] on Wikidata.
Say, in Wikidata,
we have a lot of locations,
which is already coded and it is CC0.
So, there are a lot of images,
a lot of other things are in Wikidata.
So, if we integrate
these Wikidata Q items to OSM,
can we pull this,
all the other information
from Wikidata directly to OpenStreetMap,
any kind of tool, or something like that?
Or can we add an image
which is in Commons?
Can we add the link of the image
in Commons to OpenStreetMap,
like this Wikidata ID?
(Edward) I feel like you don't need to.
Just leave the data in Wikidata,
and access it through the link.
Like just add the link
from OpenStreetMap to Wikidata,
and then, if you want the images--
don't duplicate the data,
don't have the same thing in both places.
And like Eugene was saying,
it's a bit tricky copying the data.
It's true that it's CC0,
but if we just start
importing lots of data,
then OpenStreetMap's going to ask
what's the provenance of the data,
where has all this come from.
I mean, I don't know if Eugene--
if you've got anything to add to that.
(Eugene) Well, OpenStreetMap
does have an image tag.
So, you can add that image tag
pointing to a Commons file,
if you really want to.
But if you link it to the Wikidata item,
you don't need that,
because the Wikidata item
would probably have
a Commons category statement,
and that provides you links
to several images, as well.
You don't need to add that
in OpenStreetMap.
(Edward) Can I just show this quick demo?
This is the software that I built.
So, I've looked up
Orange County in Indiana.
You can see, I've already run the software
in 2017, and I added 43 elements.
It guesses the language is English,
by looking at the number of languages
that the Wikidata labels are in.
And so the software
has found five matches,
and it's got a list of matches
with tick boxes.
There's a map.
It shows you the first paragraph
from the Wikipedia article
in the currently selected language.
If I say, show tags, these are the tags
from OpenStreetMap,
so it's matched--the name is the same.
It says it's found eight name matches,
and it's a hotel which matches.
I can say, show on map.
And the pin is the location
of the Wikidata coordinates,
and it's matched
this hotel building polygon.
So, I can go through,
and you can see some others.
There's a school.
It's failed with the airport.
The airport is represented
both as a node and as a way,
and it can't figure out which one to use,
so it isn't going to do the airport.
Here's a historic bank building
that it's managed to match.
There's an old_name tag in OpenStreetMap,
and it's matched that old name
with the name that is in Wikidata.
And then it's also matched up
the public library.
So, if I click on add wikidata tags
to OpenStreetMap,
it gives me a change comment field
where I could edit
the change comment if I wanted.
And it shows me the same matches again.
And I hit save, and it's edited the map,
and it's added
the Wikidata tags to the map.
(applause)
([Muhammad]) It's actually not a question.
But first, thank you for this project.
My name is [Muhammad Hidjal]
from Palestine.
I am a civil engineer,
and I do some spatial statistics.
A few months ago, a magazine in my country
asked me to do some statistics
on Nobel Prize winners.
So, for that, I used
Wikidata Query Service,
and the ArcGIS program for geographic
information system analysis.
I extracted the place of birth
for all Nobel Prize winners,
and projected them on the map
using the ArcGIS program,
and then they asked me,
"How many of them--
how many of the winners were born
in the north part of the world,
how many of them were born
in the south part of the world?"
The problem here is
that the ArcGIS program is not free
and I don't want to use it anymore.
Can I do these statistics
using OpenStreetMap
after projecting the spatial
information of these?
(Eugene) Okay, so the problem is
that what you're doing--
what you're trying to do is
what we call a geospatial analysis.
However, OpenStreetMap is a data project.
We provide data.
And what you can do is, for example,
take the data from OpenStreetMap,
take the data from
your Nobel Prize places of birth,
and use a tool, like ArcGIS,
which is not free,
but there's an open source tool,
called QGIS,
which you can use to do
that spatial analysis that you want.
So, you can combine, for example,
the boundaries for northern countries
versus southern countries,
put that in QGIS, then put your data
with the Nobel Prize places of birth,
and then use an intersection
tool or function.
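For the specific north-versus-south question, a much simpler stand-in
for a polygon intersection is to look at the sign of each latitude.
The sketch below assumes the coordinates come as strings of the form
Point(longitude latitude), with longitude first. The function names
and sample points are made up.

```python
import re


def parse_point(wkt):
    """Parse 'Point(lon lat)' and return (lat, lon) as floats."""
    m = re.fullmatch(r"Point\(\s*(\S+)\s+([^\s)]+)\s*\)", wkt.strip(), re.IGNORECASE)
    if not m:
        raise ValueError(f"not a WKT point: {wkt!r}")
    lon, lat = float(m.group(1)), float(m.group(2))
    return lat, lon


def count_hemispheres(points):
    """Count how many points lie north or south of the equator."""
    counts = {"north": 0, "south": 0}
    for wkt in points:
        lat, _lon = parse_point(wkt)
        counts["north" if lat >= 0 else "south"] += 1
    return counts


birthplaces = ["Point(13.38 52.52)", "Point(18.42 -33.92)", "Point(-74.0 40.7)"]
print(count_hemispheres(birthplaces))  # {'north': 2, 'south': 1}
```

Anything finer than hemispheres, such as grouping by country
boundaries, does need real polygons and a GIS tool like QGIS.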
So, yeah.
So, I think we're out of time,
and it's lunch now.
Thanks, everyone.
(applause)