1 00:00:06,259 --> 00:00:08,379 Hello. So, good afternoon. 2 00:00:08,379 --> 00:00:11,919 Welcome to the OpenStreetMap and Wikidata workshop. 3 00:00:11,919 --> 00:00:13,479 My name is Eugene. 4 00:00:13,479 --> 00:00:16,849 And together with Edward, we'll be talking about OpenStreetMap 5 00:00:16,849 --> 00:00:20,069 and how it can work together with Wikidata. 6 00:00:20,118 --> 00:00:21,718 So, just a show of hands. 7 00:00:21,718 --> 00:00:24,638 Who here has an OpenStreetMap account? 8 00:00:26,538 --> 00:00:30,638 Okay, some. So, probably this is not new to you. 9 00:00:30,638 --> 00:00:33,838 But for those who are not familiar with OpenStreetMap, 10 00:00:33,838 --> 00:00:37,348 I'll give an introduction to OpenStreetMap and its data model. 11 00:00:37,348 --> 00:00:39,478 So, basically, what is OpenStreetMap? 12 00:00:39,478 --> 00:00:42,688 It is basically a crowdsourced project to map the whole world. 13 00:00:43,579 --> 00:00:46,608 And the usual way we introduce OpenStreetMap to people 14 00:00:46,608 --> 00:00:50,636 is like OpenStreetMap is like Wikipedia for maps. 15 00:00:51,692 --> 00:00:55,097 But actually, a more accurate way to introduce OpenStreetMap 16 00:00:55,097 --> 00:00:57,897 is that it is like Wikidata for geographical data. 17 00:00:58,260 --> 00:01:01,711 But that presupposes that the audience already knows or is familiar 18 00:01:01,711 --> 00:01:03,695 with what Wikidata is. 19 00:01:04,347 --> 00:01:07,596 And why do we say that OpenStreetMap is like Wikidata? 20 00:01:08,083 --> 00:01:11,690 And that's because both have quite a lot of things in common, 21 00:01:11,690 --> 00:01:14,731 both being crowdsourced and open data projects. 22 00:01:16,071 --> 00:01:18,136 So, you know Wikidata-- 23 00:01:18,136 --> 00:01:20,938 it has items, statements, properties, et cetera. 
24 00:01:21,250 --> 00:01:26,690 In the same way, OpenStreetMap has things like nodes, ways, relations, 25 00:01:26,690 --> 00:01:30,570 that have members and roles, and these have tags 26 00:01:30,570 --> 00:01:33,450 that are composed of keys and values. 27 00:01:34,230 --> 00:01:40,236 So, as more detail, nodes, ways, and relations model the geometry 28 00:01:40,236 --> 00:01:42,380 and topology of objects. 29 00:01:42,971 --> 00:01:46,950 And then, we have tags, which are actually key value strings 30 00:01:46,950 --> 00:01:51,040 that describe the actual things that those objects represent. 31 00:01:51,920 --> 00:01:57,760 So, to give an example, here's the Wikidata item for Berlin. 32 00:01:57,790 --> 00:02:02,530 So, we can have property like population, 3 million something, 33 00:02:03,069 --> 00:02:05,695 with a qualifier, point in time, and references. 34 00:02:06,243 --> 00:02:09,782 And they have counterparts in OpenStreetMap. 35 00:02:09,782 --> 00:02:13,136 So, for example, the Berlin relation in OpenStreetMap 36 00:02:13,136 --> 00:02:17,162 has the tag population is equal to 3.4 million something, 37 00:02:17,662 --> 00:02:23,662 and it has another tag, source:population equals this URL and that date. 38 00:02:24,382 --> 00:02:26,522 So, unlike in Wikidata, 39 00:02:26,522 --> 00:02:29,762 wherein you can have qualifiers and references for your statements, 40 00:02:29,762 --> 00:02:32,412 in OpenStreetMap, the tag is quite flat. 41 00:02:32,912 --> 00:02:36,452 There's no secondary levels of tags. 42 00:02:36,521 --> 00:02:37,821 Everything is flat. 43 00:02:37,821 --> 00:02:40,881 And that's why we have to put what you call secondary tags. 44 00:02:40,912 --> 00:02:43,732 So, for example here, source:population 45 00:02:43,732 --> 00:02:46,963 to indicate that the population tag has this source. 46 00:02:48,583 --> 00:02:52,108 Another thing is that OpenStreetMap's tags are not strictly controlled. 
47 00:02:52,108 --> 00:02:55,532 Unlike with Wikidata, wherein you have to have an approval process 48 00:02:55,532 --> 00:02:57,583 before properties are created, 49 00:02:57,583 --> 00:02:59,431 here, OpenStreetMap mappers 50 00:02:59,431 --> 00:03:02,792 can invent and add any tags that they like. 51 00:03:02,792 --> 00:03:05,587 However, there is a tagging proposal process 52 00:03:05,587 --> 00:03:07,858 in order to propose common tags 53 00:03:07,858 --> 00:03:11,391 that will be used by mappers all over the world. 54 00:03:14,107 --> 00:03:17,770 Okay, data modeling discussions on the Wikidata:Project chat page 55 00:03:17,770 --> 00:03:19,899 are actually quite similar to the discussions 56 00:03:19,899 --> 00:03:22,039 in OpenStreetMap's tagging mailing list. 57 00:03:22,659 --> 00:03:26,709 For example, here's a discussion on the Project chat page: 58 00:03:27,759 --> 00:03:30,599 how do we model a building that has changed its use? 59 00:03:30,599 --> 00:03:33,393 In OpenStreetMap, we have similar discussions. 60 00:03:33,393 --> 00:03:35,819 How do we tag these sorts of buildings? 61 00:03:37,734 --> 00:03:40,833 So, I've given an introduction of what OpenStreetMap is. 62 00:03:40,833 --> 00:03:45,293 I'd love to discuss it more, but we don't have enough time. 63 00:03:45,429 --> 00:03:48,772 So, we'll go into how we link 64 00:03:48,772 --> 00:03:51,392 OpenStreetMap and Wikidata together. 65 00:03:51,392 --> 00:03:54,612 I don't have to explain why linking is a good thing. 66 00:03:54,671 --> 00:03:58,332 We're all Wikidatans, and we know that linking data is a good thing. 67 00:03:59,052 --> 00:04:03,012 So, how do we actually link Wikidata with OpenStreetMap? 68 00:04:03,601 --> 00:04:06,202 So, from Wikidata to OpenStreetMap, 69 00:04:06,681 --> 00:04:10,721 Wikidata items on places can link to OpenStreetMap relations 70 00:04:10,721 --> 00:04:14,842 using the OSM relation ID, or the P402 property. 
71 00:04:15,442 --> 00:04:17,612 So, the question is: why only relations? 72 00:04:18,133 --> 00:04:20,861 That's because OSM IDs are not stable. 73 00:04:21,521 --> 00:04:26,272 For example, you can change nodes to represent a different object. 74 00:04:26,942 --> 00:04:28,700 Ways can be split 75 00:04:29,760 --> 00:04:34,550 to add new information about those ways. 76 00:04:35,410 --> 00:04:42,326 However, relations in OpenStreetMap are relatively stable. 77 00:04:42,860 --> 00:04:47,349 At least for major relations, such as administrative boundaries, 78 00:04:47,349 --> 00:04:50,410 or highway routes, or public transportation routes. 79 00:04:51,171 --> 00:04:54,839 That way, for example here, the 80 00:04:55,219 --> 00:04:57,639 Berlin Wikidata item 81 00:04:57,639 --> 00:05:00,312 can link to the relation representing the boundary 82 00:05:00,312 --> 00:05:04,269 in OpenStreetMap via its ID. 83 00:05:05,260 --> 00:05:09,310 In terms of the ontology, Wikidata items and properties 84 00:05:09,310 --> 00:05:13,489 for geographical features can link to "equivalent," in quotes, 85 00:05:13,489 --> 00:05:16,959 OpenStreetMap classes using the OSM tag or key property. 86 00:05:17,399 --> 00:05:22,759 For example, the lighthouse item in Wikidata has the value 87 00:05:22,838 --> 00:05:27,029 for OpenStreetMap tag or key Tag:man_made=lighthouse. 88 00:05:27,279 --> 00:05:32,369 That means that lighthouses are equivalent to objects that are tagged 89 00:05:32,369 --> 00:05:35,459 in OpenStreetMap with man_made=lighthouse. 90 00:05:37,379 --> 00:05:41,489 Going in the other direction, OpenStreetMap objects can link 91 00:05:41,489 --> 00:05:44,859 to corresponding Wikipedia articles and Wikidata items 92 00:05:44,859 --> 00:05:48,340 using the Wikipedia and Wikidata tags, respectively. 93 00:05:48,818 --> 00:05:50,039 So, here's an example. 94 00:05:50,039 --> 00:05:52,498 The OpenStreetMap relation for Berlin. 
95 00:05:52,498 --> 00:05:54,838 We have the Wikidata tag, Q64, 96 00:05:55,379 --> 00:06:00,128 and the Wikipedia article linking to the German article for Berlin. 97 00:06:02,158 --> 00:06:04,678 There are also several Wikidata secondary tags, 98 00:06:04,678 --> 00:06:09,001 such as for example, brand:wikidata, architect:wikidata, artist:wikidata, 99 00:06:09,001 --> 00:06:11,196 or name:etymology:wikidata. 100 00:06:11,196 --> 00:06:14,418 We use this in order to exactly specify 101 00:06:15,468 --> 00:06:17,840 what we are referring to. 102 00:06:18,334 --> 00:06:21,135 For example, on the top part here, we have the example. 103 00:06:21,746 --> 00:06:24,059 There's an artwork in OpenStreetMap 104 00:06:24,059 --> 00:06:27,427 that was created by the artist named Herakut, 105 00:06:27,876 --> 00:06:29,215 but who is that? 106 00:06:29,215 --> 00:06:32,669 So, in order to specify exactly, we use artist:wikidata, 107 00:06:32,669 --> 00:06:34,380 and that Q ID number. 108 00:06:34,383 --> 00:06:39,530 So that you can be exactly sure which Herakut artist it really is. 109 00:06:40,090 --> 00:06:43,850 This is also useful, for example, if you're tagging, for example, 110 00:06:46,090 --> 00:06:48,700 objects in OpenStreetMap that are in a different language. 111 00:06:48,700 --> 00:06:52,856 For example, in Japan, you might have a fast-food restaurant 112 00:06:52,856 --> 00:06:56,437 called Makudonarudo, which is actually McDonald's. 113 00:06:56,437 --> 00:06:59,483 So, you can tag that using the brand:wikidata tag 114 00:07:00,183 --> 00:07:03,343 pointing to the McDonald's item in Wikidata. 
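
As a rough illustration of the flat tag model and the `wikidata` and `*:wikidata` tags described above, here is a minimal Python sketch. It is not taken from any OpenStreetMap tool: `wikidata_links` is a hypothetical helper, the role name "subject" is an assumption made for this example, and the sample tag sets are illustrative (Q64 for Berlin is from the talk; Q38076 is used here as the McDonald's item).

```python
import re

# Wikidata item IDs are "Q" followed by a positive integer.
QID = re.compile(r"^Q[1-9][0-9]*$")


def wikidata_links(tags):
    """Return {role: [QIDs]} for the primary and secondary wikidata tags.

    The plain "wikidata" key is reported under the role "subject";
    a secondary key such as "brand:wikidata" is reported under "brand".
    """
    links = {}
    for key, value in tags.items():
        if key == "wikidata":
            role = "subject"
        elif key.endswith(":wikidata"):
            role = key[: -len(":wikidata")]
        else:
            continue
        # OSM values are flat strings; several IDs are separated by ";".
        ids = [part.strip() for part in value.split(";")]
        valid = [qid for qid in ids if QID.match(qid)]
        if valid:
            links[role] = valid
    return links


# Illustrative tag sets (flat key/value strings, as in OpenStreetMap).
berlin = {"name": "Berlin", "wikidata": "Q64", "wikipedia": "de:Berlin"}
restaurant = {"amenity": "fast_food", "name": "マクドナルド", "brand:wikidata": "Q38076"}

print(wikidata_links(berlin))
print(wikidata_links(restaurant))
```

Because every tag is a flat string, the "what does this ID refer to" information has to live in the key itself, which is why the role is recovered from the key prefix here.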
115 00:07:06,059 --> 00:07:11,399 So, in terms of ontology, we define and describe the tags 116 00:07:11,399 --> 00:07:14,809 in OpenStreetMap on the OpenStreetMap Wiki, 117 00:07:14,809 --> 00:07:20,309 and we can add links to corresponding Wikipedia articles and Wikidata items 118 00:07:20,729 --> 00:07:25,688 so that we can sort of explain the correspondences and relations 119 00:07:25,688 --> 00:07:29,780 between these tags and items in Wikidata. 120 00:07:32,692 --> 00:07:38,532 Okay, so how do OpenStreetMap and Wikimedia use each other's data? 121 00:07:39,827 --> 00:07:42,181 So, first, we have the interactive maps. 122 00:07:42,181 --> 00:07:44,744 So, OpenStreetMap data powers 123 00:07:44,744 --> 00:07:48,475 the Wikimedia Foundation's Kartotherian map tile service, 124 00:07:48,475 --> 00:07:51,242 which is used by the Kartographer MediaWiki extension. 125 00:07:51,607 --> 00:07:54,241 So, basically, any time you see an interactive map 126 00:07:54,241 --> 00:07:58,213 or almost any interactive map on any Wikimedia project, 127 00:07:58,213 --> 00:08:01,816 that is usually powered by the Kartotherian map tile service. 128 00:08:02,263 --> 00:08:07,004 For example, here's the interactive map for Berlin in the English Wikivoyage. 129 00:08:07,695 --> 00:08:11,583 So, the base map there is all coming from OpenStreetMap. 130 00:08:13,446 --> 00:08:16,811 So, another thing that the Kartographer extension can do 131 00:08:16,811 --> 00:08:20,332 is it can pull and overlay geometry from OpenStreetMap. 132 00:08:20,778 --> 00:08:25,799 So, here's the infobox on Commons for the Berlin category. 133 00:08:26,324 --> 00:08:30,953 And the map there, you can see an outline for Berlin, there. 134 00:08:31,397 --> 00:08:33,873 That outline comes from OpenStreetMap. 
135 00:08:35,659 --> 00:08:40,528 In 2018, the foundation released localized map tiles for Kartotherian, 136 00:08:40,528 --> 00:08:45,191 and this leveraged the multilingual name tags in OpenStreetMap, 137 00:08:45,191 --> 00:08:49,370 so that you can view those maps that you see on Wikimedia projects 138 00:08:49,370 --> 00:08:51,279 in the user's language. 139 00:08:53,067 --> 00:08:56,229 Then, how do we use Wikidata in OpenStreetMap? 140 00:08:56,594 --> 00:08:59,270 For example, when tagging brands, 141 00:08:59,336 --> 00:09:02,216 for example, in shops and restaurants or banks, 142 00:09:02,986 --> 00:09:07,726 OpenStreetMap's Name Suggestion Index uses Wikidata to provide brand identity 143 00:09:07,726 --> 00:09:09,326 and improved tagging. 144 00:09:09,326 --> 00:09:13,306 So, for example, if you tag an object in OpenStreetMap 145 00:09:13,306 --> 00:09:18,456 with brand:wikidata pointing to the McDonald's item in Wikidata, 146 00:09:19,886 --> 00:09:22,937 the name field is now automatically locked 147 00:09:22,937 --> 00:09:26,598 so that users cannot just change that to, for example, Burger King. 148 00:09:27,567 --> 00:09:32,523 And then, the editor can also pull icons, 149 00:09:34,063 --> 00:09:35,763 the McDonald's icon there 150 00:09:35,763 --> 00:09:41,962 that is taken from the Facebook ID property 151 00:09:41,962 --> 00:09:43,782 in Wikidata. 152 00:09:44,361 --> 00:09:45,811 So, yeah. 153 00:09:45,811 --> 00:09:50,111 So that, at least, when users are tagging these shops in OpenStreetMap, 154 00:09:50,111 --> 00:09:52,751 they can be sure that they're doing it correctly. 155 00:09:54,822 --> 00:09:59,311 Okay, so Sophox is a SPARQL endpoint for OpenStreetMap data. 156 00:09:59,771 --> 00:10:04,081 So, this service can use RDF federation to also query linked Wikidata items. 157 00:10:04,721 --> 00:10:09,251 So, actually, in OpenStreetMap, we usually use other query services, 158 00:10:09,251 --> 00:10:11,041 such as Overpass. 
159 00:10:11,421 --> 00:10:16,771 But if you want to also query using Wikidata items, 160 00:10:17,311 --> 00:10:21,511 we have the Sophox endpoint that you can use. 161 00:10:23,711 --> 00:10:27,101 And for geocoding, if you're not familiar with geocoding, 162 00:10:27,101 --> 00:10:32,291 basically, that's the technology wherein given an address, 163 00:10:32,831 --> 00:10:37,097 you are returned geocoordinates. 164 00:10:37,825 --> 00:10:42,900 So, we have what we call Nominatim, which is the usual service 165 00:10:42,900 --> 00:10:45,417 in OpenStreetMap for doing geocoding. 166 00:10:45,417 --> 00:10:50,008 And previously, it already used Wikipedia tags in OpenStreetMap. 167 00:10:50,534 --> 00:10:55,316 But this year, we had a Google Summer of Code project 168 00:10:55,316 --> 00:10:59,289 to integrate using Wikidata tags in Nominatim, 169 00:10:59,289 --> 00:11:02,855 so that search results can become more relevant for users 170 00:11:02,855 --> 00:11:05,084 who are doing the searches. 171 00:11:06,583 --> 00:11:12,306 And for localization, Mapbox and MapTiler, which are third-party companies 172 00:11:12,306 --> 00:11:15,622 that extensively use OpenStreetMap, 173 00:11:15,622 --> 00:11:18,787 use Wikidata to power their localized map products. 174 00:11:19,111 --> 00:11:23,854 So, basically, if there are missing name tags in OpenStreetMap, 175 00:11:23,854 --> 00:11:29,937 and if that object is linked to Wikidata, they can pull the labels from Wikidata, 176 00:11:29,937 --> 00:11:35,217 and use those to show multilingual labels, if they are missing in OpenStreetMap. 177 00:11:35,705 --> 00:11:39,675 The reason for that is because we have a philosophy in OpenStreetMap 178 00:11:39,675 --> 00:11:45,603 that we do not try to add too many tags, especially if that can be automated. 
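
The label fallback just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: how Mapbox and MapTiler actually implement it is not covered in the talk, `display_name` is a hypothetical function, and the Wikidata labels are assumed to have been fetched already into a plain dictionary keyed by language code.

```python
def display_name(osm_tags, wikidata_labels, lang):
    """Pick a label for a map feature in the requested language.

    Order of preference (an assumption for this sketch):
    1. OSM's own name:<lang> tag,
    2. the label of the linked Wikidata item in that language,
    3. OSM's default name tag (usually in the local language).
    """
    return (
        osm_tags.get(f"name:{lang}")
        or wikidata_labels.get(lang)
        or osm_tags.get("name")
    )


# Illustrative inputs: an object with only a local-language name in OSM,
# and labels assumed to come from its linked Wikidata item.
tags = {"name": "東京都", "wikidata": "Q1490"}
labels = {"en": "Tokyo", "de": "Tokio"}

print(display_name(tags, labels, "en"))
print(display_name(tags, labels, "fr"))
```

The point of the design is that OpenStreetMap only needs to carry the link; translated or transliterated names can stay in Wikidata.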
179 00:11:45,656 --> 00:11:48,296 For example, for automatic transliterations, 180 00:11:48,857 --> 00:11:52,973 if that can be automated, we don't need to add that to OpenStreetMap. 181 00:11:52,973 --> 00:11:56,156 But in Wikidata, that's no problem. 182 00:11:56,156 --> 00:12:01,184 So, you can do that by doing this linking between OpenStreetMap and Wikidata. 183 00:12:01,184 --> 00:12:06,616 You don't have to do that transliteration on your own. 184 00:12:06,625 --> 00:12:08,855 You can just pull it from Wikidata. 185 00:12:10,315 --> 00:12:15,805 And also, the OpenStreetMap Wiki has the Wikibase extension installed. 186 00:12:16,565 --> 00:12:22,085 So, the idea here is that we want the tag information, 187 00:12:22,085 --> 00:12:27,025 or the description of the tags that we use in OpenStreetMap, 188 00:12:27,025 --> 00:12:28,905 to be machine-readable. 189 00:12:28,905 --> 00:12:32,705 Hopefully, this will be used by software and editors 190 00:12:32,705 --> 00:12:38,096 that use OpenStreetMap data to better see 191 00:12:38,096 --> 00:12:41,475 how objects are described in OpenStreetMap. 192 00:12:41,995 --> 00:12:43,995 Currently, this is not used as much, 193 00:12:43,995 --> 00:12:48,935 but hopefully, as the tagging information becomes more complete and better, 194 00:12:48,935 --> 00:12:52,215 this can be used by OpenStreetMap software, 195 00:12:52,215 --> 00:12:54,755 thanks to the Wikibase installation. 196 00:12:56,195 --> 00:12:58,635 Okay, some copyright and IP issues. 197 00:13:00,395 --> 00:13:03,805 Wikidata can't import coordinates from OpenStreetMap. 198 00:13:03,805 --> 00:13:06,975 The reason for that is because OpenStreetMap is licensed 199 00:13:06,975 --> 00:13:09,675 under the Open Database License. 200 00:13:12,195 --> 00:13:14,875 And also, we have conflicting doctrines. 201 00:13:14,875 --> 00:13:19,118 Here in the European Union and the United Kingdom, 202 00:13:19,215 --> 00:13:20,974 we have database rights. 
203 00:13:22,214 --> 00:13:26,944 Whereas, in the US, we have the idea that facts are not copyrightable. 204 00:13:27,364 --> 00:13:32,034 So, we cannot just-- even though you cannot say-- 205 00:13:32,034 --> 00:13:35,954 you cannot copyright the fact that this restaurant or this bank 206 00:13:35,954 --> 00:13:39,804 or this place is at this location, 207 00:13:39,804 --> 00:13:45,264 doing that as an import or as a batch job 208 00:13:45,403 --> 00:13:47,953 is not allowed 209 00:13:47,953 --> 00:13:51,883 because OpenStreetMap is protected by database rights 210 00:13:52,843 --> 00:13:56,134 being hosted in the United Kingdom. 211 00:13:57,883 --> 00:14:03,723 Conversely, OpenStreetMap will not import geodata from Wikidata, 212 00:14:03,723 --> 00:14:07,863 despite the CC0 license, because of data provenance issues. 213 00:14:09,563 --> 00:14:15,033 If you're not familiar with how geocoordinates are added 214 00:14:15,033 --> 00:14:19,723 into Wikipedia articles, usually users just go to Google Maps, 215 00:14:20,353 --> 00:14:24,173 search, and then copy the coordinates that show up in the results, 216 00:14:24,173 --> 00:14:26,983 and place that into the Wikipedia articles. 217 00:14:27,883 --> 00:14:31,353 In OpenStreetMap, we, as much as possible, 218 00:14:31,353 --> 00:14:34,895 avoid copying data from third-party sources 219 00:14:34,895 --> 00:14:37,984 that are proprietary, such as, for example, Google Maps. 220 00:14:38,613 --> 00:14:43,493 And because of that, we will never, in OpenStreetMap, never import data 221 00:14:43,493 --> 00:14:46,823 from Wikipedia and also Wikidata, 222 00:14:46,823 --> 00:14:51,333 because most coordinates in Wikidata have been imported from Wikipedia. 223 00:14:52,173 --> 00:14:54,603 So, it's an established principle on OpenStreetMap 224 00:14:54,603 --> 00:14:57,033 that we don't import from Wikipedia. 225 00:14:58,443 --> 00:15:01,667 Okay, I'll just then turn it over to Edward. 
226 00:15:02,803 --> 00:15:06,153 (Edward) I'm going to talk about the process for adding links 227 00:15:06,153 --> 00:15:08,293 from OpenStreetMap to Wikidata. 228 00:15:08,412 --> 00:15:12,942 So, I've written a tool for automating this process. 229 00:15:13,772 --> 00:15:15,752 Like, it's user-assisted editing. 230 00:15:15,752 --> 00:15:18,072 So, it's not a fully automated tool. 231 00:15:18,072 --> 00:15:20,072 It's available. Anyone can use it. 232 00:15:20,142 --> 00:15:21,992 There's the address. 233 00:15:22,452 --> 00:15:29,200 So, when I run the tool on Berlin, it finds 2,800 matches. 234 00:15:30,080 --> 00:15:31,685 So, these are Wikidata items 235 00:15:31,685 --> 00:15:36,604 where it thinks it has found the same OpenStreetMap objects. 236 00:15:37,868 --> 00:15:40,904 So, the matcher is using this criteria. 237 00:15:40,904 --> 00:15:43,194 It looks for things that are the same entity type. 238 00:15:43,194 --> 00:15:44,556 They've got the same coordinates, 239 00:15:44,556 --> 00:15:48,209 and then either they've got the same name, street address, or identifier. 240 00:15:48,981 --> 00:15:52,750 So, with the name, it's doing lots of kind of normalization, 241 00:15:52,750 --> 00:15:57,537 like lower casing, removing spaces, all kinds of bits and pieces 242 00:15:57,537 --> 00:15:59,847 to try and match up slightly different ways 243 00:15:59,847 --> 00:16:01,386 that you could write a name. 244 00:16:01,386 --> 00:16:03,695 And I'm looking at names from different sources, 245 00:16:03,695 --> 00:16:05,361 like the labels, and the aliases, 246 00:16:05,361 --> 00:16:08,563 but also the site links, the article titles, 247 00:16:08,563 --> 00:16:13,710 and I pull anything in bold from the Wikipedia article, 248 00:16:13,710 --> 00:16:15,934 so lots of sources for names. 249 00:16:16,504 --> 00:16:18,589 These are the identifiers that I'm matching on. 250 00:16:18,589 --> 00:16:21,671 So, we've got lots of identifiers in Wikidata. 
251 00:16:21,671 --> 00:16:24,629 OpenStreetMap has identifiers, as well. 252 00:16:24,629 --> 00:16:27,468 So, I've got a mapping between the name of the tag 253 00:16:27,468 --> 00:16:31,833 for the identifier in OpenStreetMap, and the property in Wikidata, 254 00:16:31,833 --> 00:16:34,472 and I look for things that have the same identifier. 255 00:16:34,706 --> 00:16:38,850 So, my first version of this, I tried to completely automate it, 256 00:16:38,850 --> 00:16:41,760 and the OpenStreetMap community was not impressed. 257 00:16:41,760 --> 00:16:44,860 So, better to have a semi-automated process, 258 00:16:44,860 --> 00:16:49,110 so people put in a place name, and then they see a list of matches, 259 00:16:49,110 --> 00:16:51,500 and they go through and they check the matches, 260 00:16:51,500 --> 00:16:53,731 and when they're happy, they hit save. 261 00:16:54,371 --> 00:16:57,080 And the OpenStreetMap community is much happier with that. 262 00:16:57,080 --> 00:17:00,770 It does make mistakes, the software, it tries very carefully, 263 00:17:00,770 --> 00:17:03,470 but there are errors in there. 264 00:17:03,470 --> 00:17:05,510 So you have to have someone checking them. 265 00:17:06,725 --> 00:17:09,150 I've got a question of like, when I designed this, 266 00:17:09,150 --> 00:17:11,150 I felt like there should be a one-to-one mapping 267 00:17:11,150 --> 00:17:13,880 between things in OpenStreetMap and Wikidata, 268 00:17:13,880 --> 00:17:15,370 and it doesn't really work. 269 00:17:15,370 --> 00:17:20,651 Like for my example, tunnels often get represented as two objects 270 00:17:20,651 --> 00:17:22,750 in OpenStreetMap-- one for each tunnel bore, 271 00:17:22,750 --> 00:17:25,951 or each road, lane within the tunnel-- 272 00:17:25,951 --> 00:17:29,990 whereas in Wikidata they tend to be represented as a single item, 273 00:17:30,550 --> 00:17:34,150 so I need to change my software to take account of this. 
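
The name normalization Edward describes (lower casing, removing spaces, and comparing names from several sources) might look roughly like the Python sketch below. This is not the code of his tool: `normalize` and `names_match` are hypothetical names, and stripping accents and punctuation is an assumption about what "bits and pieces" could include.

```python
import re
import unicodedata


def normalize(name):
    """Reduce a name to a comparison key.

    Accents are stripped, the text is case-folded, and everything
    except letters and digits (spaces, punctuation, underscores)
    is removed.
    """
    decomposed = unicodedata.normalize("NFKD", name)
    no_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    return re.sub(r"[\W_]+", "", no_marks.casefold())


def names_match(osm_names, wikidata_names):
    """True if any OSM name and any Wikidata name share a comparison key.

    osm_names could hold values of tags such as name or old_name;
    wikidata_names could hold labels, aliases, and article titles.
    Names that normalize to an empty key are ignored.
    """
    osm_keys = {normalize(n) for n in osm_names} - {""}
    wikidata_keys = {normalize(n) for n in wikidata_names} - {""}
    return bool(osm_keys & wikidata_keys)


print(normalize("St. Mary's  Church"))
print(names_match(["Café Müller"], ["cafe muller", "Some Alias"]))
```

A name match on its own is a weak signal, which is consistent with the matcher in the talk also requiring the same entity type and nearby coordinates, and with a person reviewing each match before saving.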
274 00:17:36,030 --> 00:17:38,820 And I have difficulties with tram stops. 275 00:17:38,974 --> 00:17:43,174 So one item in Wikidata for a tram stop, 276 00:17:43,174 --> 00:17:46,506 but in OpenStreetMap, it's represented 277 00:17:46,573 --> 00:17:50,093 as a relation with nodes 278 00:17:50,093 --> 00:17:53,023 where the tram stops on either side of the road. 279 00:17:53,023 --> 00:17:57,852 But I'm using a piece of software called osm2pgsql 280 00:17:57,852 --> 00:18:00,602 to do the OpenStreetMap side of things. 281 00:18:00,602 --> 00:18:03,112 And it doesn't really support these relations. 282 00:18:03,112 --> 00:18:05,772 So, I'm struggling with tram stops. 283 00:18:06,653 --> 00:18:08,133 And so, people are using this tool. 284 00:18:08,133 --> 00:18:11,612 There's almost 10,000 changesets uploaded to OpenStreetMap. 285 00:18:11,612 --> 00:18:14,783 Edits on OpenStreetMap are grouped into changesets; 286 00:18:14,783 --> 00:18:17,602 they're not individual edits like on Wikidata. 287 00:18:17,602 --> 00:18:20,282 And I've got over 200 users. 288 00:18:21,062 --> 00:18:25,802 And using this tool, there's been a quarter of a million links added 289 00:18:25,802 --> 00:18:27,402 to OpenStreetMap. 290 00:18:27,712 --> 00:18:32,032 But overall, those people are adding Wikidata links by hand, 291 00:18:32,032 --> 00:18:36,372 or with other tools, and there's now 1.4 million OpenStreetMap objects 292 00:18:36,372 --> 00:18:38,192 with a Wikidata tag. 293 00:18:40,112 --> 00:18:42,652 Yeah, so that is our presentation. 294 00:18:43,592 --> 00:18:45,112 Any questions? 295 00:18:45,112 --> 00:18:46,771 And just while we're taking questions, 296 00:18:46,771 --> 00:18:50,073 I'll see if I can do a live demo of the tool. 297 00:18:52,461 --> 00:18:54,272 Any questions? 298 00:18:59,922 --> 00:19:05,022 (woman) I'm very interested in sort of surpassing this license problem. 
299 00:19:06,127 --> 00:19:12,392 And can you tell me about strategies, that can be already used, such as-- 300 00:19:13,262 --> 00:19:17,035 I understand that there are some contributions that aren't CC0-- 301 00:19:17,035 --> 00:19:21,695 or like the users, or whatever they are-- 302 00:19:21,865 --> 00:19:27,105 that can facilitate the exchange of information between the systems. 303 00:19:27,718 --> 00:19:30,515 (Edward) It's true that when you sign up to OpenStreetMap, 304 00:19:30,515 --> 00:19:33,667 you can tick a box saying, "My edits are CC0." 305 00:19:34,047 --> 00:19:37,277 But the difficulty is that you are often editing something 306 00:19:37,277 --> 00:19:39,437 that somebody else has edited. 307 00:19:39,437 --> 00:19:45,797 And so, it's difficult to isolate the CC0 edits. 308 00:19:47,607 --> 00:19:52,388 (woman) Maybe like-- further question, like what can we do about that? 309 00:19:52,417 --> 00:19:56,460 Like, can we discuss-- I mean, I think it has been tried, 310 00:19:56,460 --> 00:19:59,717 but I don't think it's necessarily doomed to fail. 311 00:20:03,036 --> 00:20:08,232 (Eugene) Well, the best thing we can do is try to link items together 312 00:20:08,232 --> 00:20:11,027 using Edward's tool and other tools. 313 00:20:12,217 --> 00:20:18,727 But for the moment, we just try to map these things separately. 314 00:20:20,065 --> 00:20:21,523 Maybe we can coordinate-- 315 00:20:21,614 --> 00:20:27,504 for example, if the third-party database that we want to import is compatible 316 00:20:27,536 --> 00:20:32,543 with both Wikidata and OpenStreetMap, you can do a coordinated import to both. 317 00:20:33,193 --> 00:20:37,633 But otherwise, we really have to respect the license, 318 00:20:37,633 --> 00:20:42,039 because in order for OpenStreetMap to work, 319 00:20:42,567 --> 00:20:46,335 it really respects intellectual property and copyright. 320 00:20:54,909 --> 00:20:58,638 (man) Thank you. 
Is it possible to change the language of the background 321 00:20:58,638 --> 00:21:02,085 when you go to the map? 322 00:21:03,221 --> 00:21:06,182 Because it appears the language of the local place 323 00:21:06,182 --> 00:21:08,336 that you are looking for. 324 00:21:09,794 --> 00:21:16,316 (Eugene) So, in OpenStreetMap, basically, we tag the default name 325 00:21:16,316 --> 00:21:21,219 according to the local language of that place. 326 00:21:21,625 --> 00:21:24,872 So, for example, if you go to Japan in OpenStreetMap, 327 00:21:24,872 --> 00:21:26,649 you will see Japanese names. 328 00:21:28,189 --> 00:21:30,833 You cannot do that using the OpenStreetMap website, 329 00:21:30,833 --> 00:21:35,465 but there are third-party services or tile services 330 00:21:35,465 --> 00:21:39,555 that provide multilingual maps. 331 00:21:40,364 --> 00:21:43,074 As I mentioned, there's Mapbox, there's MapTiler. 332 00:21:43,074 --> 00:21:46,071 They provide multilingual maps so that you can use that 333 00:21:46,071 --> 00:21:50,288 instead of the default layer in OpenStreetMap. 334 00:21:50,896 --> 00:21:56,358 (man) [inaudible] 335 00:21:56,358 --> 00:22:01,274 or from the OpenStreetMap [inaudible]? 336 00:22:02,167 --> 00:22:03,748 Yeah, for example-- 337 00:22:03,748 --> 00:22:05,980 (man) [inaudible] 338 00:22:05,980 --> 00:22:10,605 not actually to this tool, but also [inaudible]. 339 00:22:11,026 --> 00:22:15,304 Well, currently, OpenStreetMap, as a project does not-- 340 00:22:16,564 --> 00:22:19,102 no project to provide this service, 341 00:22:19,102 --> 00:22:22,699 because the idea is that we provide the data, 342 00:22:22,699 --> 00:22:26,841 and other people can build on that to provide the services 343 00:22:26,841 --> 00:22:29,802 that users will be able to use. 344 00:22:33,410 --> 00:22:38,890 (man 2) Yeah, this is a great project for all to [inaudible] on Wikidata. 
345 00:22:39,458 --> 00:22:45,011 Say, in Wikidata, we have a lot of locations, 346 00:22:45,501 --> 00:22:48,617 which is already coded and it is CC0. 347 00:22:49,396 --> 00:22:55,919 So, there are a lot of images, a lot of other things are in Wikidata. 348 00:22:56,295 --> 00:23:01,216 So, if we integrate this Wikidata Q items to OSM, 349 00:23:03,698 --> 00:23:06,576 can we pull this, all the other information 350 00:23:06,576 --> 00:23:10,064 from Wikidata directly to OpenStreetMap, 351 00:23:10,064 --> 00:23:12,104 any kind of tool, or something like that? 352 00:23:12,104 --> 00:23:16,133 Or can we add an image which is in Commons? 353 00:23:16,133 --> 00:23:20,475 Can we add the link of the image in Commons to OpenStreetMap, 354 00:23:20,475 --> 00:23:23,239 like this Wikidata ID? 355 00:23:24,138 --> 00:23:25,520 I feel like you don't need to. 356 00:23:25,520 --> 00:23:29,069 Just leave the data in Wikidata, and access it through the link. 357 00:23:29,069 --> 00:23:32,428 Like just add the link from OpenStreetMap to Wikidata, 358 00:23:32,428 --> 00:23:35,428 and then, if you want the images-- don't duplicate the data, 359 00:23:35,428 --> 00:23:38,158 don't have the same thing in both places. 360 00:23:38,818 --> 00:23:42,198 And like Eugene was saying, it's a bit tricky copying the data. 361 00:23:42,198 --> 00:23:43,608 It's true that it's CC0, 362 00:23:43,608 --> 00:23:45,728 but if we just start importing lots of data, 363 00:23:45,728 --> 00:23:48,459 then OpenStreetMap's going to ask what's the provenance of the data, 364 00:23:48,459 --> 00:23:50,538 where has all this come from. 365 00:23:52,589 --> 00:23:55,338 I mean, I don't know if Eugene-- if you got anything to add to that. 366 00:23:55,338 --> 00:23:58,518 (Eugene) Well, OpenStreetMap does have an image tag. 367 00:23:58,518 --> 00:24:02,217 So, you can add that image tag pointing to a Commons file, 368 00:24:02,217 --> 00:24:03,551 if you really want to. 
369 00:24:03,551 --> 00:24:07,937 But if you link it to the Wikidata item, you don't need that, 370 00:24:07,937 --> 00:24:10,577 because the Wikidata item 371 00:24:10,577 --> 00:24:14,767 would probably have a Commons category statement, 372 00:24:14,767 --> 00:24:19,182 and that provides you links to several images, as well. 373 00:24:19,556 --> 00:24:22,843 You don't need to add that in OpenStreetMap. 374 00:24:23,979 --> 00:24:26,248 Can I just show this quick demo. 375 00:24:26,248 --> 00:24:28,276 This is the software that I built. 376 00:24:28,276 --> 00:24:32,173 So, I've looked up Orange County in Indiana. 377 00:24:32,733 --> 00:24:37,728 You can see, I've already run the software in 2017, and I added 43 elements. 378 00:24:38,520 --> 00:24:41,074 It guesses the language is English, 379 00:24:41,074 --> 00:24:45,779 by looking at the number of languages that the Wikidata labels are in. 380 00:24:46,588 --> 00:24:49,101 And so the software has found five matches, 381 00:24:49,101 --> 00:24:52,384 and it's got a list of matches with tick boxes. 382 00:24:52,384 --> 00:24:53,681 There's a map. 383 00:24:54,004 --> 00:24:58,623 It shows you the first paragraph from the Wikipedia article 384 00:24:58,623 --> 00:25:00,857 in the currently selected language. 385 00:25:00,965 --> 00:25:05,105 If I say, show tags, these are the tags from OpenStreetMap, 386 00:25:05,105 --> 00:25:07,792 so it's matched--the name is the same. 387 00:25:07,792 --> 00:25:12,058 It says it's found eight name matches, and it's a hotel which matches. 388 00:25:12,705 --> 00:25:14,245 I can say, show on map. 389 00:25:14,245 --> 00:25:19,256 And the pin is the location of the Wikidata coordinates, 390 00:25:19,256 --> 00:25:22,055 and it's matched this hotel building polygon. 391 00:25:22,625 --> 00:25:24,595 So, I can go through, and you can see some others. 392 00:25:24,595 --> 00:25:26,035 There's a school. 393 00:25:26,035 --> 00:25:27,695 It's failed with the airport. 
394 00:25:27,695 --> 00:25:31,095 The airport is represented both as a node and as a way, 395 00:25:31,095 --> 00:25:33,245 and it can't figure out which one to use, 396 00:25:33,635 --> 00:25:35,585 so it isn't going to do the airport. 397 00:25:36,365 --> 00:25:42,085 Here's a historic bank building that it's managed to match. 398 00:25:42,885 --> 00:25:46,095 There's an old_name tag in OpenStreetMap, 399 00:25:46,095 --> 00:25:49,455 and it's matched the old name with the name that is in Wikidata. 400 00:25:50,274 --> 00:25:52,635 And then it's also matched up a public library. 401 00:25:52,635 --> 00:25:55,780 So, if I click on add wikidata tags to OpenStreetMap, 402 00:25:55,780 --> 00:25:58,058 it gives me a change comment field where I could edit the-- 403 00:25:58,058 --> 00:25:59,628 change comment if I wanted. 404 00:25:59,628 --> 00:26:01,621 And it shows me the same matches again. 405 00:26:01,621 --> 00:26:04,389 And I hit save, and it's edited the map, 406 00:26:04,389 --> 00:26:07,794 and it's added the Wikidata tags to the map. 407 00:26:10,437 --> 00:26:16,034 (applause) 408 00:26:19,763 --> 00:26:21,535 ([Muhammad]) It's actually not a question. 409 00:26:21,535 --> 00:26:24,007 But first, thank you for this project. 410 00:26:25,057 --> 00:26:27,690 My name is [Muhammad Hidjal] from Palestine. 411 00:26:28,083 --> 00:26:31,713 I am a civil engineer, and I do some spatial statistics. 412 00:26:32,579 --> 00:26:37,353 A few months ago, a magazine in my country asked me to do some statistics 413 00:26:37,353 --> 00:26:39,024 on Nobel Prize winners. 414 00:26:39,524 --> 00:26:42,923 So, for that, I used Wikidata Query Service, 415 00:26:42,923 --> 00:26:47,113 and ArcGIS program for geographic information system analysis.
416 00:26:47,754 --> 00:26:51,653 I extracted the place of birth for all Nobel Prize winners, 417 00:26:51,653 --> 00:26:55,874 and projected them on the map using ArcGIS program, 418 00:26:55,874 --> 00:26:58,792 and then they asked me, "How many of them-- 419 00:26:59,124 --> 00:27:03,303 how many of the winners were born in the north part of the world, 420 00:27:03,303 --> 00:27:06,309 how many of them were born in the south part of the world?" 421 00:27:06,309 --> 00:27:09,391 The problem here is that ArcGIS program is not free 422 00:27:09,391 --> 00:27:11,505 and I don't want to use it anymore. 423 00:27:11,505 --> 00:27:15,317 Can I do these statistics using OpenStreetMap 424 00:27:15,317 --> 00:27:19,493 after projecting the spatial information of these? 425 00:27:21,485 --> 00:27:24,118 Okay, so the problem is that what you're doing-- 426 00:27:24,118 --> 00:27:28,699 what you're trying to do is what we call a geospatial analysis. 427 00:27:29,309 --> 00:27:32,394 However, OpenStreetMap is a data project. We provide data. 428 00:27:33,003 --> 00:27:37,018 And what you can do is, for example, take the data from OpenStreetMap, 429 00:27:37,018 --> 00:27:40,554 take the data from your Nobel Prize places of birth, 430 00:27:40,554 --> 00:27:44,612 and use a tool, like ArcGIS, which is not free, 431 00:27:44,612 --> 00:27:47,376 but there's an open source tool, called QGIS, 432 00:27:47,376 --> 00:27:51,067 which you can use to do that spatial analysis that you want. 433 00:27:51,472 --> 00:27:53,505 So, you can combine, for example, 434 00:27:53,505 --> 00:27:57,729 the boundaries for northern countries versus southern countries, 435 00:27:57,729 --> 00:28:03,454 put that in QGIS, then put your data with the Nobel Prize places of birth, 436 00:28:03,859 --> 00:28:07,430 and then use an intersection tool or function. 437 00:28:08,037 --> 00:28:09,414 So, yeah. 438 00:28:09,626 --> 00:28:13,086 So, I think we're out of time, and it's lunch now.
439 00:28:13,175 --> 00:28:14,617 Thanks, everyone. 440 00:28:14,617 --> 00:28:19,857 (applause)
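(Editor's note) The north/south question from the last exchange can be roughly sketched without any GIS program, under one loud assumption: that "north part of the world" simply means north of the equator. This is a simpler stand-in for the boundary intersection in QGIS that was suggested in the answer, not a replacement for it. The sketch relies on the Wikidata Query Service returning coordinates as text of the form Point(longitude latitude). The function names and the three sample points are made up for illustration and are not real query results.

```python
import re

# Matches the "Point(longitude latitude)" text that the Wikidata Query
# Service uses for coordinate values. Longitude comes first.
NUMBER = r"-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?"
WKT_POINT = re.compile(rf"^Point\(({NUMBER}) ({NUMBER})\)$")


def latitude_from_wkt(wkt: str) -> float:
    """Return the latitude (second number) of a 'Point(lon lat)' string."""
    match = WKT_POINT.match(wkt.strip())
    if match is None:
        raise ValueError(f"not a Point(lon lat) string: {wkt!r}")
    return float(match.group(2))


def count_by_hemisphere(points):
    """Count points north of, south of, and exactly on the equator.

    Assumption: 'north' and 'south' are decided by the sign of the latitude
    only. Splitting by country groups would need boundary data instead.
    """
    counts = {"north": 0, "south": 0, "equator": 0}
    for wkt in points:
        latitude = latitude_from_wkt(wkt)
        if latitude > 0:
            counts["north"] += 1
        elif latitude < 0:
            counts["south"] += 1
        else:
            counts["equator"] += 1
    return counts


# Made-up sample values, standing in for exported places of birth.
sample_points = ["Point(13.4 52.5)", "Point(18.4 -33.9)", "Point(-0.1 51.5)"]
print(count_by_hemisphere(sample_points))
```

If "north" and "south" are meant as groups of countries rather than hemispheres, the latitude test is not enough, and the intersection with boundary data described in the answer is the way to go.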