Hello. So, good afternoon. Welcome to the OpenStreetMap and Wikidata workshop. My name is Eugene, and together with Edward, we'll be talking about OpenStreetMap and how it can work together with Wikidata. So, just a show of hands: who here has an OpenStreetMap account? Okay, some. So, this is probably not new to you. But for those who are not familiar with OpenStreetMap, I'll give an introduction to OpenStreetMap and its data model. So, what is OpenStreetMap? It is basically a crowdsourced project to map the whole world. The usual way we introduce OpenStreetMap to people is to say that OpenStreetMap is like Wikipedia for maps. But actually, a more accurate way to introduce it is that it is like Wikidata for geographical data. That presupposes, though, that the audience is already familiar with what Wikidata is. And why do we say that OpenStreetMap is like Wikidata? That's because both have quite a lot of things in common, both being crowdsourced and open data projects. So, you know Wikidata-- it has items, statements, properties, et cetera. In the same way, OpenStreetMap has nodes, ways, and relations; relations have members with roles; and all of these have tags that are composed of keys and values. In more detail, nodes, ways, and relations model the geometry and topology of objects. And then, we have tags, which are key-value strings that describe the actual things that those objects represent. So, to give an example, here's the Wikidata item for Berlin. We can have a property like population, 3 million something, with a qualifier, point in time, and references. And these have counterparts in OpenStreetMap. So, for example, the Berlin relation in OpenStreetMap has the tag population equal to 3.4 million something, and it has another tag, source:population, equal to this URL and that date. So, unlike in Wikidata, wherein you can have qualifiers and references for your statements, in OpenStreetMap, the tag is quite flat. 
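The contrast between a Wikidata statement and a flat OpenStreetMap tag set can be sketched in a few lines of Python. This is only an illustration: the class names `WikidataStatement` and `OsmElement`, the helper `secondary_tags`, and all literal values are hypothetical placeholders, not either project's actual API or data. Only the `population` and `source:population` keys come from the talk.

```python
from dataclasses import dataclass, field


@dataclass
class WikidataStatement:
    """A statement can carry its own qualifiers and references."""
    prop: str
    value: str
    qualifiers: dict[str, str] = field(default_factory=dict)
    references: list[str] = field(default_factory=list)


@dataclass
class OsmElement:
    """An OSM node, way, or relation: every tag is a flat key/value string."""
    kind: str  # "node", "way", or "relation"
    tags: dict[str, str] = field(default_factory=dict)


def secondary_tags(element: OsmElement, key: str) -> dict[str, str]:
    """Return the tags that annotate `key` through the 'prefix:key' naming convention."""
    suffix = ":" + key
    return {k: v for k, v in element.tags.items() if k.endswith(suffix)}


# Placeholder values, for illustration only.
statement = WikidataStatement(
    prop="population",
    value="3500000",
    qualifiers={"point in time": "2015"},
    references=["https://example.org/statistics"],
)
berlin = OsmElement(
    kind="relation",
    tags={
        "name": "Berlin",
        "population": "3500000",
        "source:population": "https://example.org/statistics 2015",
    },
)

print(secondary_tags(berlin, "population"))
```

In the Wikidata-style object the qualifier and reference hang off the statement itself, while on the OpenStreetMap side the same information can only live in a sibling tag whose key happens to mention `population`.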
There are no secondary levels of tags. Everything is flat. And that's why we have to use what you call secondary tags. So, for example here, source:population indicates that the population tag has this source. Another thing is that OpenStreetMap's tags are not strictly controlled. Unlike with Wikidata, wherein you have to go through an approval process before properties are created, here, OpenStreetMap mappers can invent and add any tags that they like. However, there is a tagging proposal process in order to propose common tags that will be used by mappers all over the world. Okay, data modeling discussions on the Wikidata:Project chat page are actually quite similar to the discussions in OpenStreetMap's tagging mailing list. For example, here's a discussion on Project chat: how do we model a building that has changed its use? In OpenStreetMap, we have similar discussions. How do we tag these sorts of buildings? So, I've given an introduction to what OpenStreetMap is. I'd love to discuss it more, but we don't have enough time. So, we'll go into how we link OpenStreetMap and Wikidata together. I don't have to explain why linking is a good thing. We're all Wikidatans, and we know that linking data is a good thing. So, how do we actually link Wikidata with OpenStreetMap? From Wikidata to OpenStreetMap, Wikidata items on places can link to OpenStreetMap relations using the OSM relation ID, or P402, property. So, the question is: why only relations? That's because OSM IDs are not stable. For example, you can change a node to represent a different object. Ways can be split to add new information about parts of those ways. However, relations in OpenStreetMap are relatively stable, at least for major relations, such as administrative boundaries, or highway routes, or public transportation routes. 
That way, for example, here, the Berlin Wikidata item can link to the relation representing the boundary in OpenStreetMap via its ID. In terms of the ontology, Wikidata items and properties for geographical features can link to "equivalent," in quotes, OpenStreetMap classes using the OSM tag or key property. For example, the lighthouse item in Wikidata has the value Tag:man_made=lighthouse for OpenStreetMap tag or key. That means that lighthouses are equivalent to objects that are tagged in OpenStreetMap with man_made=lighthouse. Going in the other direction, OpenStreetMap objects can link to corresponding Wikipedia articles and Wikidata items using the wikipedia and wikidata tags, respectively. So, here's an example: the OpenStreetMap relation for Berlin. We have the wikidata tag, Q64, and the wikipedia tag linking to the German article for Berlin. There are also several Wikidata secondary tags, such as, for example, brand:wikidata, architect:wikidata, artist:wikidata, or name:etymology:wikidata. We use these in order to specify exactly what we are referring to. For example, on the top part here, there's an artwork in OpenStreetMap that was created by the artist named Herakut, but who is that? So, in order to specify exactly, we use artist:wikidata and that Q ID number, so that you can be sure exactly which Herakut artist it really is. This is also useful if you're tagging objects in OpenStreetMap that are in a different language. For example, in Japan, you might have a fast-food restaurant called Makudonarudo, which is actually McDonald's. So, you can tag that using the brand:wikidata tag pointing to the McDonald's item in Wikidata. 
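As a rough sketch of how software might read these tags, the following Python function collects the main `wikidata` tag and any `*:wikidata` secondary tags from an object's tag dictionary. The function name `wikidata_links` and the returned shape are assumptions made for this illustration, and the semicolon splitting follows the general OpenStreetMap convention for multiple values rather than anything stated in the talk. Q64 is the Berlin item mentioned above; Q38076 is used here as the McDonald's item.

```python
import re

# A Wikidata item ID: the letter Q followed by a positive integer.
QID = re.compile(r"^Q[1-9][0-9]*$")
SUFFIX = ":wikidata"


def wikidata_links(tags: dict[str, str]) -> dict[str, list[str]]:
    """Map each role to its Q IDs.

    The role is "" for the object itself (plain `wikidata` tag), otherwise the
    key prefix, e.g. "brand" for `brand:wikidata`.
    """
    links: dict[str, list[str]] = {}
    for key, value in tags.items():
        if key == "wikidata":
            role = ""
        elif key.endswith(SUFFIX):
            role = key[: -len(SUFFIX)]
        else:
            continue
        # Keep only well-formed Q IDs; multiple values are split on semicolons.
        qids = [v.strip() for v in value.split(";") if QID.match(v.strip())]
        if qids:
            links[role] = qids
    return links


berlin_tags = {"name": "Berlin", "wikidata": "Q64", "wikipedia": "de:Berlin"}
restaurant_tags = {"name": "マクドナルド", "amenity": "fast_food", "brand:wikidata": "Q38076"}

print(wikidata_links(berlin_tags))
print(wikidata_links(restaurant_tags))
```

Keeping the role separate from the Q ID matters because `brand:wikidata` says what the restaurant's brand is, not what the restaurant itself is.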
So, in terms of ontology, we define and describe the tags in OpenStreetMap on the OpenStreetMap Wiki, and we can add links to corresponding Wikipedia articles and Wikidata items so that we can explain the correspondences and relations between these tags and items in Wikidata. Okay, so how do OpenStreetMap and Wikimedia use each other's data? First, we have the interactive maps. OpenStreetMap data powers the Wikimedia Foundation's Kartotherian map tile service, which is used by the Kartographer MediaWiki extension. So, basically, any time you see an interactive map, or almost any interactive map, on any Wikimedia project, that is usually powered by the Kartotherian map tile service. For example, here's the interactive map for Berlin in the English Wikivoyage. The base map there is all coming from OpenStreetMap. Another thing that the Kartographer extension can do is pull and overlay geometry from OpenStreetMap. So, here's the infobox on Commons for the Berlin category. On the map there, you can see an outline for Berlin. That outline comes from OpenStreetMap. In 2018, the foundation released localized map tiles for Kartotherian, and this leveraged the multilingual name tags in OpenStreetMap, so that you can view the maps that you see on Wikimedia projects in the user's language. Then, how do we use Wikidata in OpenStreetMap? For example, when tagging brands, for example, on shops and restaurants or banks, OpenStreetMap's Name Suggestion Index uses Wikidata to provide brand identity and improved tagging. So, for example, if you tag an object in OpenStreetMap with brand:wikidata pointing to the McDonald's item in Wikidata, the name field is now automatically locked so that users cannot just change that to, for example, Burger King. And then, the editor can also pull icons: the McDonald's icon there is taken via the Facebook ID property in Wikidata. So, yeah. 
So that way, at least, when users are tagging these shops in OpenStreetMap, they can be sure that they're doing it correctly. Okay, so Sophox is a SPARQL endpoint for OpenStreetMap data. This service can use RDF federation to also query linked Wikidata items. Actually, in OpenStreetMap, we usually use other query services, such as Overpass. But if you also want to query using Wikidata items, we have the Sophox endpoint that you can use. And for geocoding: if you're not familiar with geocoding, basically, that's the technology wherein, given an address, you are returned geocoordinates. We have what we call Nominatim, which is the usual service in OpenStreetMap for doing geocoding. Previously, it already used Wikipedia tags in OpenStreetMap. But this year, we had a Google Summer of Code project to integrate the use of Wikidata tags in Nominatim, so that search results can become more relevant for users who are doing the searches. And for localization, Mapbox and MapTiler, which are third-party companies that extensively use OpenStreetMap, use Wikidata to power their localized map products. So, basically, if there are missing name tags in OpenStreetMap, and if that object is linked to Wikidata, they can pull the labels from Wikidata and use them to show multilingual labels that are missing in OpenStreetMap. The reason for that is that we have a philosophy in OpenStreetMap that we do not try to add too many tags, especially if that can be automated. For example, for automatic transliterations, if that can be automated, we don't need to add that to OpenStreetMap. But in Wikidata, that's no problem. So, you can do that by doing this linking between OpenStreetMap and Wikidata. You don't have to do that transliteration on your own. You can just pull it from Wikidata. And also, the OpenStreetMap Wiki has the Wikibase extension installed. 
So, the idea here is that we want the tag information, or the descriptions of the tags that we use in OpenStreetMap, to be machine-readable. Hopefully, this will be used by software and editors that use OpenStreetMap data to better see how objects are described in OpenStreetMap. Currently, this is not used as much, but hopefully, as the tagging information becomes more complete and better, this can be used by OpenStreetMap software, thanks to the Wikibase installation. Okay, some copyright and IP issues. Wikidata can't import coordinates from OpenStreetMap. The reason for that is that OpenStreetMap is licensed under the Open Database License. And also, we have conflicting doctrines. Here in the European Union and the United Kingdom, we have database rights, whereas in the US, we have the idea that facts are not copyrightable. So, even though you cannot copyright the fact that this restaurant or this bank or this place is at this location, doing that as an import or as a batch job is not allowed, because OpenStreetMap is protected by database rights, being hosted in the United Kingdom. Conversely, OpenStreetMap will not import geodata from Wikidata, despite the CC0 license, because of data provenance issues. If you're not familiar with how geocoordinates are added into Wikipedia articles, usually users just go to Google Maps, search, and then copy the coordinates that show up in the results and place them into the Wikipedia articles. In OpenStreetMap, we, as much as possible, avoid copying data from third-party sources that are proprietary, such as, for example, Google Maps. And because of that, we will never, in OpenStreetMap, import data from Wikipedia, or from Wikidata either, because most coordinates in Wikidata have been imported from Wikipedia. So, it's an established principle in OpenStreetMap that we don't import from Wikipedia. Okay, I'll then just turn it over to Edward. 
(Edward) I'm going to talk about the process for adding links from OpenStreetMap to Wikidata. So, I've written a tool for automating this process. It's user-assisted editing, so it's not a fully automated tool. It's available. Anyone can use it. There's the address. So, when I run the tool on Berlin, it finds 2,800 matches. These are Wikidata items where it thinks it has found the same object in OpenStreetMap. The matcher is using these criteria. It looks for things that are the same entity type and that have the same coordinates, and then either they've got the same name, street address, or identifier. With the name, it's doing lots of normalization, like lower-casing, removing spaces, all kinds of bits and pieces to try and match up slightly different ways that you could write a name. And I'm looking at names from different sources, like the labels and the aliases, but also the site links, the article titles, and I pull anything in bold from the Wikipedia article, so lots of sources for names. These are the identifiers that I'm matching on. We've got lots of identifiers in Wikidata. OpenStreetMap has identifiers, as well. So, I've got a mapping between the name of the tag for the identifier in OpenStreetMap and the property in Wikidata, and I look for things that have the same identifier. In my first version of this, I tried to completely automate it, and the OpenStreetMap community was not impressed. So, it's better to have a semi-automated process: people put in a place name, then they see a list of matches, they go through and check the matches, and when they're happy, they hit save. And the OpenStreetMap community is much happier with that. The software does make mistakes. It tries very carefully, but there are errors in there. So you have to have someone checking them. 
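The matching criteria described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the tool's actual code: the record layout (`type`, `coord`, `names`, `ids`), the function names, and the 100-metre threshold are all hypothetical, street-address matching is left out, and identifiers are assumed to be already mapped to a shared key on both sides.

```python
import math
import re
import unicodedata


def normalize(name: str) -> str:
    """Case-fold, strip accents, and drop everything except letters and digits."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return re.sub(r"[\W_]+", "", no_accents.casefold())


def distance_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle (haversine) distance in metres."""
    radius = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


def is_candidate(item: dict, osm: dict, max_m: float = 100.0) -> bool:
    """Same type, nearby, and at least one shared normalized name or identifier."""
    if item["type"] != osm["type"]:
        return False
    if distance_m(*item["coord"], *osm["coord"]) > max_m:
        return False
    item_names = {normalize(n) for n in item["names"]} - {""}
    osm_names = {normalize(n) for n in osm["names"]} - {""}
    shared_ids = set(item["ids"].items()) & set(osm["ids"].items())
    return bool((item_names & osm_names) or shared_ids)


# Hypothetical records; names on the Wikidata side would come from labels,
# aliases, site links, and so on.
item = {"type": "hotel", "coord": (38.5, -86.6),
        "names": ["Grand Hôtel", "The Grand"], "ids": {}}
osm = {"type": "hotel", "coord": (38.5001, -86.6001),
       "names": ["grand hotel"], "ids": {}}

print(is_candidate(item, osm))
```

A function like this only proposes candidates. As the talk stresses, a person still reviews each match before anything is saved.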
I've got an open question here: when I designed this, I felt like there should be a one-to-one mapping between things in OpenStreetMap and Wikidata, and it doesn't really work. For example, tunnels often get represented as two objects in OpenStreetMap-- one for each tunnel bore, or each road or lane within the tunnel-- whereas in Wikidata they tend to be represented as a single item, so I need to change my software to take account of this. And I have difficulties with tram stops. There's one item in Wikidata for a tram stop, but in OpenStreetMap, it's represented as a relation with nodes where the tram stops on either side of the road. But I'm using a piece of software called osm2pgsql to do the OpenStreetMap side of things, and it doesn't really support these relations. So, I'm struggling with tram stops. And so, people are using this tool. There's almost 10,000 changesets uploaded to OpenStreetMap. Edits on OpenStreetMap are grouped into changesets; they're not individual edits like on Wikidata. And I've got over 200 users. Using this tool, there's been a quarter of a million links added to OpenStreetMap. But overall, with other people adding Wikidata links by hand, or with other tools, there's now 1.4 million OpenStreetMap objects with a Wikidata tag. Yeah, so that is our presentation. Any questions? And just while we're taking questions, I'll see if I can do a live demo of the tool. Any questions?
(woman) I'm very interested in getting past this license problem. Can you tell me about strategies that can already be used, such as-- I understand that there are some contributions that are CC0-- or like the users, or whatever they are-- that can facilitate the exchange of information between the systems?
(Edward) It's true that when you sign up to OpenStreetMap, you can tick a box saying, "My edits are CC0." But the difficulty is that you are often editing something that somebody else has edited. And so, it's difficult to isolate the CC0 edits. 
(woman) Maybe like-- a further question: what can we do about that? Like, can we discuss-- I mean, I think it has been tried, but I don't think it's necessarily doomed to fail.
(Eugene) Well, the best thing we can do is try to link items together using Edward's tool and other tools. But for the moment, we just try to map these things separately. Maybe we can coordinate-- for example, if the third-party database that we want to import is compatible with both Wikidata and OpenStreetMap, you can do a coordinated import to both. But otherwise, we really have to respect the license, because in order for OpenStreetMap to work, it really has to respect intellectual property and copyright.
(man) Thank you. Is it possible to change the language of the background when you go to the map? Because it appears in the local language of the place that you are looking at.
(Eugene) So, in OpenStreetMap, basically, we tag the default name according to the local language of that place. So, for example, if you go to Japan in OpenStreetMap, you will see Japanese names. You cannot change that using the OpenStreetMap website, but there are third-party services or tile services that provide multilingual maps. As I mentioned, there's Mapbox, there's MapTiler. They provide multilingual maps so that you can use those instead of the default layer in OpenStreetMap.
(man) [inaudible] or from the OpenStreetMap [inaudible]?
(Eugene) Yeah, for example--
(man) [inaudible] not actually to this tool, but also [inaudible].
(Eugene) Well, currently, OpenStreetMap, as a project, does not-- there's no project to provide this service, because the idea is that we provide the data, and other people can build on that to provide the services that users will be able to use.
(man 2) Yeah, this is a great project for all to [inaudible] on Wikidata. Say, in Wikidata, we have a lot of locations, which are already coded, and it is CC0. So, there are a lot of images, a lot of other things in Wikidata. 
So, if we integrate these Wikidata Q items into OSM, can we pull all the other information from Wikidata directly into OpenStreetMap, with any kind of tool, or something like that? Or can we add an image which is in Commons? Can we add the link of the image in Commons to OpenStreetMap, like this Wikidata ID?
(Edward) I feel like you don't need to. Just leave the data in Wikidata, and access it through the link. Just add the link from OpenStreetMap to Wikidata, and then, if you want the images-- don't duplicate the data, don't have the same thing in both places. And like Eugene was saying, it's a bit tricky copying the data. It's true that it's CC0, but if we just start importing lots of data, then OpenStreetMap's going to ask what's the provenance of the data, where has all this come from. I mean, I don't know, Eugene, if you've got anything to add to that.
(Eugene) Well, OpenStreetMap does have an image tag. So, you can add that image tag pointing to a Commons file, if you really want to. But if you link it to the Wikidata item, you don't need that, because the Wikidata item would probably have a Commons category statement, and that provides you links to several images, as well. You don't need to add that in OpenStreetMap.
(Edward) Can I just show this quick demo? This is the software that I built. So, I've looked up Orange County in Indiana. You can see I've already run the software in 2017, and I added 43 elements. It guesses the language is English, by looking at the number of languages that the Wikidata labels are in. And so the software has found five matches, and it's got a list of matches with tick boxes. There's a map. It shows you the first paragraph from the Wikipedia article in the currently selected language. If I say, show tags, these are the tags from OpenStreetMap, so it's matched-- the name is the same. It says it's found eight name matches, and it's a hotel, which matches. I can say, show on map. 
And the pin is the location of the Wikidata coordinates, and it's matched this hotel building polygon. So, I can go through, and you can see some others. There's a school. It's failed with the airport. The airport is represented both as a node and as a way, and it can't figure out which one to use, so it isn't going to do the airport. Here's a historic bank building that it's managed to match. There's an old_name tag in OpenStreetMap, and it's matched the old name with the name that is in Wikidata. And then it's also matched up the public library. So, if I click on add wikidata tags to OpenStreetMap, it gives me a change comment field where I could edit the change comment if I wanted. And it shows me the same matches again. And I hit save, and it's edited the map, and it's added the Wikidata tags to the map. (applause)
([Muhammad]) It's actually not a question. But first, thank you for this project. My name is [Muhammad Hidjal] from Palestine. I am a civil engineer, and I do some spatial statistics. A few months ago, a magazine in my country asked me to do some statistics on Nobel Prize winners. So, for that, I used the Wikidata Query Service and the ArcGIS program for geographic information system analysis. I extracted the place of birth for all Nobel Prize winners, and projected them on the map using the ArcGIS program, and then they asked me, "How many of the winners were born in the north part of the world, and how many of them were born in the south part of the world?" The problem here is that the ArcGIS program is not free, and I don't want to use it anymore. Can I do these statistics using OpenStreetMap after projecting the spatial information of these?
(Eugene) Okay, so the problem is that what you're trying to do is what we call a geospatial analysis. However, OpenStreetMap is a data project. We provide data. 
And what you can do is, for example, take the data from OpenStreetMap, take your data on the Nobel Prize winners' places of birth, and use a tool like ArcGIS, which is not free; but there's an open source tool, called QGIS, which you can use to do the spatial analysis that you want. So, you can take, for example, the boundaries for northern countries versus southern countries, put that in QGIS, then put in your data with the Nobel Prize places of birth, and then use an intersection tool or function. So, yeah. So, I think we're out of time, and it's lunch now. Thanks, everyone. (applause)