1 00:00:06,415 --> 00:00:08,619 Hi, so before we start, quickly, 2 00:00:08,619 --> 00:00:11,515 so I'm Jean-Fred, I'm a Wikidata volunteer. 3 00:00:11,855 --> 00:00:13,519 Hi, I am Envel, 4 00:00:13,519 --> 00:00:15,795 and I'm also a Wikidata volunteer. 5 00:00:16,365 --> 00:00:21,523 And I'm Tracy, and I get paid (chuckles) to volunteer for Wikidata, 6 00:00:21,523 --> 00:00:24,628 but I'm also enthusiastic to be here today, 7 00:00:24,628 --> 00:00:26,598 and I work for a research board. 8 00:00:29,553 --> 00:00:32,040 Alright, thanks for coming to our presentation: 9 00:00:32,040 --> 00:00:33,677 Sum of all video games: 10 00:00:33,677 --> 00:00:37,900 our road to make Wikidata the hub of all video game metadata. 11 00:00:38,570 --> 00:00:41,055 So, first off, why should we even care about video games, 12 00:00:41,055 --> 00:00:43,010 like aren't they just like kids playing Fortnite 13 00:00:43,010 --> 00:00:44,549 or something at night? 14 00:00:44,549 --> 00:00:46,381 So video games have been here for a long time, 15 00:00:46,381 --> 00:00:48,951 since the '70s or '60s or '40s. 16 00:00:48,951 --> 00:00:50,020 It depends what you ask. 17 00:00:50,020 --> 00:00:52,589 You can check Wikipedia's extensive coverage 18 00:00:52,589 --> 00:00:54,561 of what is even a game. 19 00:00:54,970 --> 00:00:56,531 It's a major cultural industry. 20 00:00:56,531 --> 00:00:59,131 More than 2.5 billion people play in the world, 21 00:00:59,131 --> 00:01:01,489 and we estimate that, at the very least, 22 00:01:01,489 --> 00:01:05,629 100,000-200,000 video games have been published since that time 23 00:01:05,629 --> 00:01:08,976 and that's not counting games published on the Play Store-- 24 00:01:08,976 --> 00:01:11,155 then you go through the millions, 25 00:01:11,155 --> 00:01:13,736 which is not that much when you're on Wikidata. 26 00:01:15,256 --> 00:01:18,933 So a little overview of the current state of video games on Wikidata. 
27 00:01:18,933 --> 00:01:21,771 These numbers are also on our poster on the ground floor, 28 00:01:21,771 --> 00:01:23,574 so we can also have it there. 29 00:01:23,574 --> 00:01:28,082 So we have video games or the Q7889, 30 00:01:28,082 --> 00:01:31,017 and we have 38,000 of them, 31 00:01:31,537 --> 00:01:32,628 which is not that much 32 00:01:32,628 --> 00:01:35,110 considering that there are at least 200,000, as I mentioned. 33 00:01:35,110 --> 00:01:39,667 We also have expansion packs, DLCs, and compilations 34 00:01:39,667 --> 00:01:41,838 but we also have, for example, game controllers. 35 00:01:41,838 --> 00:01:45,610 We have a lot of game consoles, about 700-- that's a lot. 36 00:01:46,520 --> 00:01:48,665 We have an extensive ontology of video game genres, 37 00:01:48,665 --> 00:01:50,201 that's pretty cool, 200 of them, 38 00:01:50,201 --> 00:01:53,462 and [inaudible] a bit on magazines also. 39 00:01:53,462 --> 00:01:56,394 Maybe video games could be a satellite even for WikiCite 40 00:01:56,394 --> 00:01:58,189 I don't know. (chuckles) 41 00:01:59,049 --> 00:02:00,901 But what about outside of Wikidata? 42 00:02:01,681 --> 00:02:04,878 There are a lot of databases out there about video games. 43 00:02:04,878 --> 00:02:06,721 You may have heard about some very big ones, 44 00:02:06,721 --> 00:02:09,032 like Mobygames or IGDB. 45 00:02:09,032 --> 00:02:12,337 There are also a lot of very special-interest databases-- 46 00:02:12,337 --> 00:02:15,742 databases that only cover certain types. 47 00:02:16,602 --> 00:02:20,900 Visual Novel Database only has about this niche genre 48 00:02:20,900 --> 00:02:22,290 that is a visual novel. 49 00:02:22,290 --> 00:02:26,708 You have databases that are only about games published on the Commodore 64, 50 00:02:26,708 --> 00:02:28,160 and so on. 
51 00:02:28,550 --> 00:02:32,831 But you also have government agencies and commercial players, 52 00:02:32,831 --> 00:02:35,301 government agencies [inaudible], called the rating agencies, 53 00:02:35,301 --> 00:02:38,830 the ones that put a little label: "It's not good for kids under 16." 54 00:02:38,830 --> 00:02:41,601 The problem is that there is no common identifier 55 00:02:41,601 --> 00:02:43,487 around all of these databases 56 00:02:43,487 --> 00:02:44,510 that binds them together. 57 00:02:44,510 --> 00:02:47,911 There is no cross-linking, or very little. 58 00:02:47,911 --> 00:02:52,686 Some databases might be linked to a neighbor's or friend's database, 59 00:02:52,686 --> 00:02:55,040 like the Amiga databases, which talk to each other a little bit. 60 00:02:55,040 --> 00:02:58,629 But you won't have one easy way of saying all that. 61 00:03:00,279 --> 00:03:02,565 So there are differences in data coverage and specialization, 62 00:03:02,565 --> 00:03:05,590 and that often comes also with conceptual differences. 63 00:03:06,200 --> 00:03:10,841 A database might consider that a game is a work, 64 00:03:10,841 --> 00:03:12,952 if you're into the FRBR model, 65 00:03:12,952 --> 00:03:14,226 or that might be an edition 66 00:03:14,226 --> 00:03:16,575 or that might be a particular console version. 67 00:03:17,055 --> 00:03:19,160 So there is a lot of granularity in there. 68 00:03:19,460 --> 00:03:22,117 And that's important in terms of coverage 69 00:03:22,117 --> 00:03:25,008 because some databases-- 70 00:03:25,008 --> 00:03:27,623 for example, Mobygames has a lot of information about a lot of things, 71 00:03:27,623 --> 00:03:29,039 but it doesn't have a lot of information 72 00:03:29,039 --> 00:03:32,417 about the games that were published on the early French computers, 73 00:03:32,417 --> 00:03:35,790 like the Oric or the Thomson TO and MO series. 74 00:03:36,589 --> 00:03:40,063 You will find that in more French databases.
75 00:03:40,063 --> 00:03:44,905 And if you go into Eastern video games, like China or Japan, 76 00:03:44,905 --> 00:03:47,460 it's not very well covered in Western databases. 77 00:03:48,260 --> 00:03:51,218 Enter WikiProject video games. 78 00:03:51,218 --> 00:03:53,515 (cheers and applause) 79 00:03:53,515 --> 00:03:54,572 (woman) Whoo-hoo! 80 00:03:54,572 --> 00:03:55,948 We didn't make that one, actually. 81 00:03:57,608 --> 00:03:59,278 So it lives at that address 82 00:03:59,278 --> 00:04:01,548 and there are a lot of subpages, 83 00:04:01,548 --> 00:04:06,005 and we're going to go through a little bit of what this project is made of. 84 00:04:06,005 --> 00:04:08,660 As often, there is-- 85 00:04:08,660 --> 00:04:11,815 we'll separate that in what's old and what's new 86 00:04:11,815 --> 00:04:13,958 and what's borrowed and what's blue. 87 00:04:14,458 --> 00:04:15,660 So, as old we have-- 88 00:04:15,660 --> 00:04:17,500 Like a lot of WikiProjects we have, 89 00:04:17,500 --> 00:04:19,480 an ontology description with all the properties. 90 00:04:19,480 --> 00:04:22,031 There are currently 64 properties, mostly for games, 91 00:04:22,031 --> 00:04:25,385 but also about series or hardware. 92 00:04:25,385 --> 00:04:28,208 And we have a fairly extensive, I think-- 93 00:04:28,208 --> 00:04:30,128 how to put it-- separations. 94 00:04:30,128 --> 00:04:31,380 We have things about the staff, 95 00:04:31,380 --> 00:04:32,966 but also about the narrative universe 96 00:04:32,966 --> 00:04:36,689 or about the gameplay, like how many players there are. 97 00:04:36,689 --> 00:04:39,387 So you can explore this; it's kind of very exciting. 98 00:04:39,877 --> 00:04:41,797 We also have example queries. 99 00:04:41,797 --> 00:04:44,033 If we have time at the end, we might show off some, 100 00:04:44,033 --> 00:04:46,029 but you can just explore them yourself. 101 00:04:51,455 --> 00:04:53,850 We also have something new. 
102 00:04:53,850 --> 00:04:59,951 Because those things don't exist in other WikiProjects and Wikidata. 103 00:05:00,545 --> 00:05:02,915 For example, we have an Activity Log. 104 00:05:02,915 --> 00:05:06,437 You can see it here. 105 00:05:06,437 --> 00:05:10,577 On this Activity Log, we track the activity of the project. 106 00:05:10,577 --> 00:05:16,725 So when we publish a blog post or an article somewhere, 107 00:05:16,725 --> 00:05:19,677 we add it here. 108 00:05:20,327 --> 00:05:23,281 When we create a new identifier property 109 00:05:23,281 --> 00:05:25,751 or any property related to video games, 110 00:05:25,751 --> 00:05:27,222 we also add it here. 111 00:05:28,172 --> 00:05:29,960 We also have achievements, 112 00:05:29,960 --> 00:05:32,611 like in January, we added a condition 113 00:05:32,611 --> 00:05:37,896 of an external identifier. 114 00:05:39,446 --> 00:05:42,046 Another thing that we do is we have a Tasks List. 115 00:05:42,046 --> 00:05:47,414 The Tasks List can be used by newcomers to the project 116 00:05:47,414 --> 00:05:51,711 to do things in the project. 117 00:05:52,331 --> 00:05:53,765 It can be [inaudible], 118 00:05:53,765 --> 00:05:59,574 so we give them an insight to [inaudible] 119 00:05:59,574 --> 00:06:01,057 and how to do that. 120 00:06:01,577 --> 00:06:05,116 It's also where we like [inaudible] 121 00:06:05,116 --> 00:06:08,751 [inaudible] 122 00:06:10,326 --> 00:06:12,959 We also have something borrowed. 123 00:06:13,897 --> 00:06:17,293 We have a lot of pages of statistics reports. 124 00:06:21,701 --> 00:06:24,631 We also have external identifiers that [inaudible]-- 125 00:06:24,631 --> 00:06:26,098 you can see it here-- 126 00:06:27,535 --> 00:06:29,923 where we track-- 127 00:06:29,923 --> 00:06:32,110 I don't know if you can see it-- 128 00:06:32,110 --> 00:06:36,124 but we have more than 100 external identifiers 129 00:06:36,124 --> 00:06:37,418 for video games, 130 00:06:37,418 --> 00:06:39,212 so this is big, huge. 
131 00:06:39,212 --> 00:06:43,127 And here we can see for each item here-- 132 00:06:43,557 --> 00:06:45,147 just a little peek. 133 00:06:45,147 --> 00:06:50,623 And also the completion of the identifier. 134 00:06:54,012 --> 00:06:56,725 So, some of these things we borrowed from the Sum of all Paintings 135 00:06:56,725 --> 00:06:59,705 and other things, that begins more blue. 136 00:06:59,705 --> 00:07:04,019 So the InteGraality tool that was made initially for Sum of all Paintings 137 00:07:04,679 --> 00:07:06,235 I extended it for video games, 138 00:07:06,235 --> 00:07:08,819 and then I might as well have done it for everybody. 139 00:07:09,610 --> 00:07:12,174 So, yeah, one day we'll get all of these. 140 00:07:12,174 --> 00:07:15,628 So this is the core properties, the genre/developer/publisher 141 00:07:16,348 --> 00:07:17,951 along video game systems, 142 00:07:17,951 --> 00:07:21,007 so Windows, PlayStation console and so on. 143 00:07:21,007 --> 00:07:23,456 So, as you can see, we have a lot of work to do 144 00:07:23,456 --> 00:07:26,439 for even like the very basic core properties. 145 00:07:26,949 --> 00:07:30,030 So, yeah, one day, all of that will be blue. 146 00:07:31,340 --> 00:07:32,920 What have we been doing? 147 00:07:34,340 --> 00:07:35,737 Things that we've been doing a lot 148 00:07:35,737 --> 00:07:38,886 has been creating identifiers with all these external databases 149 00:07:38,886 --> 00:07:40,071 and aligning them. 150 00:07:40,071 --> 00:07:44,644 So Envel mentioned we have created over 100 external identifier properties-- 151 00:07:45,134 --> 00:07:49,204 that covers very big databases and very tiny ones. 152 00:07:49,754 --> 00:07:54,732 We've been using the Mix'n'match tool extensively for matching. 153 00:07:54,732 --> 00:07:57,222 And sometimes we've been using things a bit more advanced 154 00:07:57,222 --> 00:07:59,886 that Envel will detail in a moment. 
155 00:08:01,046 --> 00:08:03,308 Yeah, so 100 external identifier properties created 156 00:08:03,308 --> 00:08:06,129 in roughly a year to two years 157 00:08:06,129 --> 00:08:08,250 and over 16 Mix'n'match catalogs. 158 00:08:08,250 --> 00:08:09,785 And I started tracking 159 00:08:09,785 --> 00:08:15,078 how many Q7889 items didn't have any identifiers, 160 00:08:15,078 --> 00:08:17,215 and five months ago it was 15,000 161 00:08:17,215 --> 00:08:20,386 and today we're down to 9,600, 162 00:08:20,386 --> 00:08:25,177 which is very much thanks to the teaching assistant of Tracy. 163 00:08:25,578 --> 00:08:28,817 So there's still 9,000 to go, but we're getting there. 164 00:08:32,556 --> 00:08:37,146 So we needed to import a lot of data 165 00:08:37,146 --> 00:08:40,826 to complete those identifiers. 166 00:08:42,996 --> 00:08:46,530 The first tool to do that is the Wikidata website. 167 00:08:47,280 --> 00:08:48,984 I think it's important to say it 168 00:08:48,984 --> 00:08:55,193 because it's where we can fix the small problems, and so on. 169 00:08:56,193 --> 00:09:02,020 But we also have dedicated tools to do that on Wikidata. 170 00:09:02,020 --> 00:09:04,925 There is Mix'n'match, and its gadget. 171 00:09:06,345 --> 00:09:10,179 The Mix'n'match Wiki gadget is a gadget that you can add 172 00:09:10,179 --> 00:09:12,438 to your account in Wikidata, 173 00:09:12,438 --> 00:09:17,363 and it adds all identifiers 174 00:09:17,363 --> 00:09:20,788 from [inaudible] Mix'n'match to an item. 175 00:09:22,549 --> 00:09:27,273 You can easily add serial IDs [inaudible]. 176 00:09:29,695 --> 00:09:33,146 Other tools... There is QuickStatements, of course. 177 00:09:33,526 --> 00:09:38,751 But you also can use more general tools, like OpenRefine, 178 00:09:38,751 --> 00:09:42,039 Dataiku Data Science Studio, et cetera. 
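Once a catalog has been matched, the matches still have to be written back as identifier statements, and QuickStatements is one of the tools named above for that. What follows is only a minimal sketch, not the speakers' actual workflow: it turns matched pairs into tab-separated QuickStatements (V1) commands. The function name, the sample identifier value, and the property placeholder "P0000" are assumptions made for illustration; Q4115189 is the Wikidata Sandbox item, used so that nothing here points at a real game.

```python
from typing import Iterable, Tuple


def quickstatements_lines(matches: Iterable[Tuple[str, str]], property_id: str) -> str:
    """Build tab-separated QuickStatements (V1) commands, one per matched pair."""
    lines = []
    for qid, external_id in matches:
        # Skip malformed rows instead of emitting a broken command.
        if not qid.startswith("Q") or not external_id:
            continue
        # String values, including external identifiers, go in double quotes.
        lines.append(f'{qid}\t{property_id}\t"{external_id}"')
    return "\n".join(lines)


# Hypothetical input: one valid match and one row with a missing item ID.
# "P0000" is a placeholder, not a real property; substitute the identifier
# property that belongs to the database being matched.
batch = quickstatements_lines(
    [("Q4115189", "example-game-id"), ("", "orphan-id")],
    "P0000",
)
print(batch)
```

The printed text could be pasted into a QuickStatements batch, ideally after a manual spot check, since automated matches can be wrong.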
179 00:09:43,369 --> 00:09:46,079 The point is it's very important for this project, 180 00:09:46,079 --> 00:09:48,750 and I think for all projects in Wikidata, 181 00:09:48,750 --> 00:09:53,183 to have a healthy ecosystem of tools that works. 182 00:09:59,413 --> 00:10:01,642 There are two examples of imports. 183 00:10:01,642 --> 00:10:06,279 The first one is connecting PCGamingWiki and Wikidata. 184 00:10:06,279 --> 00:10:09,383 It was made by a volunteer. 185 00:10:09,383 --> 00:10:12,157 He made his own program in Ruby, 186 00:10:12,157 --> 00:10:13,529 so that's an example. 187 00:10:14,289 --> 00:10:15,347 The second one 188 00:10:15,347 --> 00:10:19,200 is linking the OLAC video game vocabulary with Wikidata. 189 00:10:19,200 --> 00:10:22,473 It was made using OpenRefine and Mix'n'match, 190 00:10:22,473 --> 00:10:27,210 and I think Tracy can talk more about this one. 191 00:10:28,549 --> 00:10:32,805 And I have a third example, which is one I made. 192 00:10:33,665 --> 00:10:38,080 I matched the catalog of BnF, 193 00:10:38,080 --> 00:10:41,984 so it's Bibliothèque... the French National Library 194 00:10:42,824 --> 00:10:45,548 with Wikidata. 195 00:10:45,548 --> 00:10:50,494 So they have about 4,000 entries 196 00:10:50,494 --> 00:10:52,571 about video games in their catalog, 197 00:10:52,571 --> 00:10:58,046 and I matched half of them to Wikidata. 198 00:10:59,398 --> 00:11:01,956 So, for that, I made a project 199 00:11:03,626 --> 00:11:05,864 in Dataiku Data Science Studio. 200 00:11:06,414 --> 00:11:10,465 You can see the work [inaudible]. 201 00:11:11,185 --> 00:11:12,219 I will not detail it, 202 00:11:12,219 --> 00:11:14,528 but if you have questions, feel free to ask. 203 00:11:15,508 --> 00:11:19,143 I also developed a Dataiku plugin to do it, 204 00:11:19,143 --> 00:11:21,535 to facilitate SPARQL querying 205 00:11:21,535 --> 00:11:25,639 because it's not included in the tool. 
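For readers who have not written SPARQL against Wikidata, here is a rough sketch of the kind of query involved. It is not the plugin mentioned above, and it is not necessarily the query behind the "items without any identifiers" figure quoted earlier. It counts direct instances of a class (Q7889 by default) that carry no statement whose property has the external-identifier datatype. The function names and the User-Agent string are assumptions; the endpoint is the public Wikidata Query Service. Subclasses are ignored, and a count over this many items may time out on the public endpoint.

```python
import json
import re
import urllib.parse
import urllib.request

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_missing_identifier_query(class_qid: str = "Q7889") -> str:
    """Return a SPARQL query counting instances of class_qid without any external ID."""
    if not re.fullmatch(r"Q[1-9][0-9]*", class_qid):
        raise ValueError(f"not a valid item ID: {class_qid!r}")
    return f"""
SELECT (COUNT(DISTINCT ?item) AS ?count) WHERE {{
  ?item wdt:P31 wd:{class_qid} .
  FILTER NOT EXISTS {{
    ?item ?direct ?value .
    ?property wikibase:directClaim ?direct ;
              wikibase:propertyType wikibase:ExternalId .
  }}
}}
""".strip()


def run_query(query: str) -> dict:
    """Send the query to the Wikidata Query Service and return the parsed JSON."""
    url = WDQS_ENDPOINT + "?" + urllib.parse.urlencode({"query": query, "format": "json"})
    # The User-Agent value is a made-up example; use one that identifies your tool.
    request = urllib.request.Request(
        url, headers={"User-Agent": "video-game-count-sketch/0.1 (example)"}
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)


# Only the query text is built here; run_query is defined but not called,
# so the example works without network access.
query = build_missing_identifier_query()
print(query)
```

Building the query is kept separate from sending it so that the text can also be pasted into the Query Service web interface directly.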
206 00:11:27,309 --> 00:11:31,644 One cool thing that happened after this one 207 00:11:31,644 --> 00:11:34,676 is that the BnF contacted me about this project. 208 00:11:34,676 --> 00:11:36,744 So it was very cool to have feedback, 209 00:11:36,744 --> 00:11:40,178 and that contact was established. 210 00:11:44,472 --> 00:11:48,029 So, another topic, the link-- 211 00:11:48,029 --> 00:11:51,604 So we want Wikidata to be the linking hub for video games. 212 00:11:52,804 --> 00:11:54,321 As you can see here, 213 00:11:54,321 --> 00:11:57,510 a video game is, as Jean-Fred said, 214 00:11:57,510 --> 00:12:00,037 a video game is about a lot of things. 215 00:12:01,417 --> 00:12:05,737 We have Reviews and Scores, Speedruns, 216 00:12:05,737 --> 00:12:08,284 News, Library ID, 217 00:12:09,584 --> 00:12:11,255 Soundtrack, etc. 218 00:12:11,919 --> 00:12:15,840 We don't want all this data to be in Wikidata; 219 00:12:15,840 --> 00:12:18,444 we want this data to be linked to Wikidata. 220 00:12:18,444 --> 00:12:20,681 So we want Wikidata to be, 221 00:12:22,041 --> 00:12:24,499 like [Lidia] said yesterday, a place-- 222 00:12:25,342 --> 00:12:29,161 We want to see Wikidata as a place you go, 223 00:12:29,161 --> 00:12:33,235 and then you go to another place. 224 00:12:33,590 --> 00:12:35,542 So I think that's it. 225 00:12:38,410 --> 00:12:41,230 And as you can see by the links, 226 00:12:42,550 --> 00:12:48,584 video games have really a lot of aspects to research, 227 00:12:49,194 --> 00:12:53,318 and video games are really complex cultural artifacts. 228 00:12:53,318 --> 00:12:55,667 There are [inaudible], there are [ed ones], 229 00:12:55,667 --> 00:12:59,511 remasters, re-releases, mods, updates, 230 00:12:59,511 --> 00:13:02,482 downloadable content, and so on and so forth.
231 00:13:02,482 --> 00:13:05,721 Plenty of remakes or remastered editions 232 00:13:05,721 --> 00:13:09,076 are separate items at this stage in Wikidata, 233 00:13:09,076 --> 00:13:11,009 but not necessarily. 234 00:13:11,009 --> 00:13:14,798 Additionally, remakes are not often linked to the original work 235 00:13:14,798 --> 00:13:17,486 using the property based on. 236 00:13:17,486 --> 00:13:21,859 And perhaps we should create an entity schema for the video games, 237 00:13:21,859 --> 00:13:23,997 but we are still in the process 238 00:13:23,997 --> 00:13:28,859 to get a discussion started for the data model of video games. 239 00:13:29,909 --> 00:13:32,193 Mostly, we have one item, 240 00:13:32,193 --> 00:13:36,434 what we typically recognize as "the game," 241 00:13:36,434 --> 00:13:38,620 when we say we played the same game, 242 00:13:38,620 --> 00:13:41,960 so it's like a Mario Kart 6. 243 00:13:41,960 --> 00:13:45,364 Even if we played it on different platforms, 244 00:13:45,364 --> 00:13:50,522 so, for example, on Switch, on Wii U, or something else. 245 00:13:50,522 --> 00:13:55,980 So Wikidata items for a game aggregate characteristics 246 00:13:55,980 --> 00:14:01,357 which are shared among different versions or editions. 247 00:14:01,357 --> 00:14:02,827 This makes linking not easy 248 00:14:02,827 --> 00:14:07,201 because many databases describe games on different levels, 249 00:14:07,201 --> 00:14:09,107 as Jean-Frédéric mentioned. 250 00:14:09,687 --> 00:14:13,913 For instance, some have one database entry for each edition, 251 00:14:13,913 --> 00:14:16,599 and this results in more than one identifier 252 00:14:16,599 --> 00:14:19,156 for each video game item. 253 00:14:19,156 --> 00:14:22,807 And so the use of specific qualifiers is needed. 254 00:14:23,527 --> 00:14:28,518 We have some discussions thinking about the creation of different editions items, 255 00:14:29,228 --> 00:14:31,071 for editions or releases. 
256 00:14:31,071 --> 00:14:33,676 as this is good practice for literature, 257 00:14:33,676 --> 00:14:39,932 but the FRBR model which is used for books seems not useful for everyone. 258 00:14:41,072 --> 00:14:45,763 This is also an ongoing discussion with the video game research community 259 00:14:45,763 --> 00:14:48,730 about the best data model for video games. 260 00:14:49,720 --> 00:14:54,396 And speaking about video game research and the research community, 261 00:14:54,396 --> 00:14:56,832 there is an active video game research community 262 00:14:56,832 --> 00:15:00,720 with a growing interest in data about games. 263 00:15:00,720 --> 00:15:05,245 Sadly, there are no national libraries for video games 264 00:15:05,245 --> 00:15:07,816 which have a comprehensive dataset 265 00:15:07,816 --> 00:15:09,752 with authority data about video games-- 266 00:15:09,752 --> 00:15:12,674 yes, the BnF with 4,000 video games, 267 00:15:12,674 --> 00:15:16,344 but there's still more outside. 268 00:15:17,114 --> 00:15:19,381 That means researchers rely on data 269 00:15:19,591 --> 00:15:23,626 on video game fan databases, 270 00:15:23,626 --> 00:15:25,678 but as we know, there are so many, 271 00:15:25,678 --> 00:15:28,874 and there's so different [inaudible]. 272 00:15:29,254 --> 00:15:30,753 And what makes it even harder, 273 00:15:30,753 --> 00:15:33,394 the data is not open. 274 00:15:33,394 --> 00:15:36,879 So could Wikidata be a source for video game research? 275 00:15:36,879 --> 00:15:38,061 Yes. 276 00:15:38,772 --> 00:15:40,485 I work for the research project diggr, 277 00:15:40,485 --> 00:15:44,275 and we have decided to work with Wikidata for our video game research, 278 00:15:44,275 --> 00:15:46,846 and we not only use the data which is already there, 279 00:15:46,846 --> 00:15:50,669 we create data about video games and companies by hand 280 00:15:50,669 --> 00:15:54,565 or automatically, in Wikidata. 
281 00:15:54,565 --> 00:15:59,144 Additionally, we have created about 20,000 links to Mobygames, 282 00:15:59,144 --> 00:16:02,648 GameFAQs and the Japanese Media Arts Database. 283 00:16:03,698 --> 00:16:10,210 And we also initiated an alignment with the OLAC video game genre vocabulary. 284 00:16:11,270 --> 00:16:13,815 So video game research colleagues in Japan 285 00:16:13,815 --> 00:16:17,670 are also experimenting with Wikidata 286 00:16:17,670 --> 00:16:20,729 to use it as a work authority for video games. 287 00:16:21,569 --> 00:16:24,982 So, our research will cause a lot of spatial data 288 00:16:24,982 --> 00:16:26,806 about video game companies 289 00:16:26,806 --> 00:16:31,310 and where video games have been released all over the world. 290 00:16:31,310 --> 00:16:37,352 So we use data from video game databases, like Mobygames, and Wikidata, 291 00:16:37,352 --> 00:16:41,026 to create some analyses like this. 292 00:16:41,026 --> 00:16:43,250 We call it Lemongrab, the tool, 293 00:16:43,250 --> 00:16:46,034 and the researcher can select one or more platforms 294 00:16:46,034 --> 00:16:48,921 and one or more release countries 295 00:16:48,921 --> 00:16:52,610 and they will get an overview about which companies are big players. 296 00:16:52,610 --> 00:16:56,684 In this case, the number of published or developed video games 297 00:16:57,284 --> 00:16:58,855 for this combination. 298 00:16:59,305 --> 00:17:01,359 Additionally, they can see which country 299 00:17:01,359 --> 00:17:05,313 is strongly represented by these companies. 300 00:17:06,153 --> 00:17:08,419 Or we use Wikidata Query Service directly 301 00:17:08,419 --> 00:17:13,589 to create maps of companies within the video game industry. 302 00:17:14,399 --> 00:17:20,990 So, at this stage, I think there are 5,000 video game companies 303 00:17:20,990 --> 00:17:23,327 already in Wikidata 304 00:17:23,327 --> 00:17:28,686 of which we have created half, I think.
(chuckles) 305 00:17:29,204 --> 00:17:34,362 So, in conclusion, after two years of working with Wikidata for our research, 306 00:17:34,362 --> 00:17:35,481 we are very pleased, 307 00:17:35,481 --> 00:17:37,127 especially with the cooperation 308 00:17:37,127 --> 00:17:40,189 with the volunteers of the video game taskers. 309 00:17:40,189 --> 00:17:41,612 Thank you for that. 310 00:17:41,612 --> 00:17:47,116 And we think Wikidata can be the one-stop shop for video game research 311 00:17:47,116 --> 00:17:52,541 because it already aggregates so many links to very specialized sites 312 00:17:52,541 --> 00:17:57,417 and it is not realistic that we put all the data into Wikidata. 313 00:18:00,422 --> 00:18:01,522 Thank you. 314 00:18:01,522 --> 00:18:04,361 At the same time, we want to be useful for the researchers. 315 00:18:04,361 --> 00:18:07,682 We also want to stay or to be or to become, 316 00:18:07,682 --> 00:18:10,470 however you want it, useful to the Wikipedias. 317 00:18:10,470 --> 00:18:12,271 Right now, some Wikipedias are using the data 318 00:18:12,271 --> 00:18:15,649 from Wikidata for their infoboxes. 319 00:18:15,649 --> 00:18:18,719 So if tomorrow we just revamp the entire data model 320 00:18:18,719 --> 00:18:20,554 in a way they can't use it anymore, 321 00:18:20,554 --> 00:18:22,357 it doesn't sound like a great idea. 322 00:18:22,357 --> 00:18:24,175 So we'll try not to do that. 323 00:18:26,163 --> 00:18:30,520 I think we want to be enhancing all the databases, 324 00:18:30,520 --> 00:18:32,590 and that's something that's already started.
325 00:18:32,590 --> 00:18:36,891 So if you go to Visual Novel Database right now at vndb.org, 326 00:18:36,891 --> 00:18:39,783 following a research workshop that we did 327 00:18:39,783 --> 00:18:41,145 with the nice diggr folks 328 00:18:41,145 --> 00:18:42,499 who could meet with the database, 329 00:18:42,499 --> 00:18:45,545 and they were interested enough with all the linkage that we made 330 00:18:45,545 --> 00:18:51,204 that they could harvest more links about the entity that they talk about. 331 00:18:51,204 --> 00:18:57,916 Like, "Well, okay, thanks to Wikidata, we also retrieved reviews or speedruns 332 00:18:57,916 --> 00:18:59,768 or a store where you can buy these games." 333 00:18:59,768 --> 00:19:02,523 So we're already being useful. 334 00:19:02,523 --> 00:19:04,059 So that was a fine example. 335 00:19:04,059 --> 00:19:07,864 But also this German researcher 336 00:19:07,864 --> 00:19:11,971 just started the Internationale Computerspielesammlung, 337 00:19:11,971 --> 00:19:13,958 (chuckles) 338 00:19:13,958 --> 00:19:17,532 which is online, which has all the data about the German video games, 339 00:19:17,532 --> 00:19:19,802 what they have in their collections, 340 00:19:19,802 --> 00:19:23,923 and they've been using Wikidata to enrich the data IDs for labels, 341 00:19:23,923 --> 00:19:25,969 so they have alternate titles. 342 00:19:26,779 --> 00:19:28,297 So that was also pretty cool. 343 00:19:30,067 --> 00:19:33,391 I think Wikidata can be the backend for powering applications. 344 00:19:33,391 --> 00:19:36,194 So, an example that already exists is vglist.co, 345 00:19:36,194 --> 00:19:38,378 and in some ways a little bit similar 346 00:19:38,378 --> 00:19:40,751 to what inventaire.io does for books, 347 00:19:40,751 --> 00:19:43,882 vglist.co does it for video games. 348 00:19:44,942 --> 00:19:47,413 It's an app where you can record the games you've played, 349 00:19:47,413 --> 00:19:49,515 how long you spent, and your favorites.
350 00:19:49,515 --> 00:19:52,670 And I just really like the fact that it's built on top of Wikidata. 351 00:19:52,670 --> 00:19:54,234 It's pretty cool. 352 00:19:54,724 --> 00:19:59,482 So maybe one day we can just connect all these things together 353 00:19:59,482 --> 00:20:02,820 and harvest SPARQL to query data, 354 00:20:02,820 --> 00:20:05,074 and it really doesn't matter where it is, 355 00:20:05,074 --> 00:20:07,780 and say, "Yeah, data is not a database," 356 00:20:07,780 --> 00:20:09,215 and that will be fine. 357 00:20:09,765 --> 00:20:12,604 Thank you very much, and we'll take questions. 358 00:20:12,604 --> 00:20:14,812 (moderator) We just have five minutes for questions. 359 00:20:14,812 --> 00:20:16,478 (applause) 360 00:20:22,870 --> 00:20:25,674 (man) Hello, I really love your project, 361 00:20:25,674 --> 00:20:28,713 and when I want to contribute, where should I go? 362 00:20:29,080 --> 00:20:31,350 So there was short URL in there, 363 00:20:31,350 --> 00:20:32,437 and as Envel mentioned, 364 00:20:32,437 --> 00:20:35,874 there are tabs at the top with the links to the SPARQL queries and so on. 365 00:20:35,874 --> 00:20:37,885 And there is a Tasks, 366 00:20:37,885 --> 00:20:40,565 which is like a couple of suggestions on where to get started. 367 00:20:40,565 --> 00:20:43,630 But it's not mandatory, you can work on whatever you want, obviously. 368 00:20:43,630 --> 00:20:45,005 But, yeah, that's a nice place. 369 00:20:45,005 --> 00:20:48,442 And if you have a project, you can also bring it to the Talk page. 370 00:20:48,442 --> 00:20:49,803 It's not a very lively Talk page, 371 00:20:49,803 --> 00:20:53,437 like a lot of Wikidata Project Talk pages, in many ways, 372 00:20:53,437 --> 00:20:58,071 but I will read and answer, so that's a start. 373 00:20:58,071 --> 00:20:59,593 Do you already have something in mind? 374 00:20:59,593 --> 00:21:01,723 We can talk after this if you have something in mind. 
375 00:21:02,518 --> 00:21:04,070 - Allons-y. - (woman) Hi there. 376 00:21:04,070 --> 00:21:07,983 So I work with a group from University of Copenhagen 377 00:21:07,983 --> 00:21:10,247 and University of Washington 378 00:21:10,247 --> 00:21:14,390 who are working on an initiative called Atari Women, 379 00:21:15,131 --> 00:21:17,009 recognizing all the women 380 00:21:17,009 --> 00:21:20,918 who've been involved through the years with the Atari game system. 381 00:21:20,918 --> 00:21:22,967 And so I'm wondering if-- 382 00:21:22,967 --> 00:21:26,218 I believe that your WikiProject 383 00:21:26,218 --> 00:21:30,308 covers the developers, the designers and such, 384 00:21:30,998 --> 00:21:37,235 but obviously, it crosses into the biography part of our world. 385 00:21:37,725 --> 00:21:40,245 And so how does that work? 386 00:21:42,175 --> 00:21:45,604 Is there someone who's more specialized in that area 387 00:21:45,604 --> 00:21:52,128 who these folks at these two universities could connect with, or... 388 00:21:53,046 --> 00:21:54,399 Thoughts? 389 00:21:56,409 --> 00:21:58,808 I don't think there will be somebody in particular. 390 00:21:59,998 --> 00:22:02,976 My impression of the [inaudible] project is that they are fairly eclectic. 391 00:22:02,976 --> 00:22:05,754 Sometimes people specialize on very specific niche topics. 392 00:22:05,754 --> 00:22:07,039 In that case, I don't think so. 393 00:22:07,039 --> 00:22:10,166 So I'll be happy to take the call. 394 00:22:10,166 --> 00:22:11,291 So, to answer your question, 395 00:22:11,291 --> 00:22:14,450 yes, that will definitely be in the scope of our project. 
396 00:22:16,237 --> 00:22:19,476 And in that period, particularly, I don't think we want to turn back 397 00:22:19,476 --> 00:22:22,466 because these days video games are made by like 1,000 people 398 00:22:22,466 --> 00:22:24,784 and do we want to create an item about every single person, 399 00:22:24,784 --> 00:22:27,489 like the credit rolls of a movie, right? 400 00:22:27,489 --> 00:22:30,163 So in modern times, I don't know if we want to be that database, 401 00:22:30,163 --> 00:22:32,790 the ultimate database of game credits. 402 00:22:33,750 --> 00:22:36,643 But for the Atari early days-- oh, definitely, 403 00:22:36,643 --> 00:22:38,523 I would actually love to see the dataset 404 00:22:38,523 --> 00:22:42,480 because it's a lot of dudes in common knowledge of... 405 00:22:42,480 --> 00:22:44,487 - (woman) I'll connect you to that. - Yes, please. 406 00:22:44,487 --> 00:22:46,031 (laughter) 407 00:22:48,406 --> 00:22:50,210 (moderator) Any other questions? 408 00:22:53,490 --> 00:22:55,192 Sir, just in front of you. 409 00:22:56,230 --> 00:22:58,372 (man 2) Do you collaborate with the Internet Archive? 410 00:22:58,372 --> 00:23:02,906 Because there's not a month going by that Jason Scott doesn't post. 411 00:23:02,906 --> 00:23:07,111 He's rescued 170,000 DOS games or stuff like that. 412 00:23:11,100 --> 00:23:15,701 There are Internet Archives identifiers on some game items, 413 00:23:15,701 --> 00:23:17,939 which is a bit weird because usually on the Internet Archive 414 00:23:17,939 --> 00:23:19,926 there's going to be a particular release of the game, 415 00:23:19,926 --> 00:23:21,666 again on the difference... 416 00:23:21,666 --> 00:23:24,064 Last time I checked there were four or five Prince of Persia 417 00:23:24,064 --> 00:23:25,067 on the Internet Archive 418 00:23:25,067 --> 00:23:28,130 because they have the Apple II version and the DOS version and so on. 419 00:23:28,130 --> 00:23:29,708 So not explicitly. 
420 00:23:29,708 --> 00:23:36,519 In general, I think we probably want to make some connections more general 421 00:23:36,519 --> 00:23:39,239 with the video game preservation scene. 422 00:23:39,239 --> 00:23:45,032 There is a quite lively organization that work hard on video game preservation. 423 00:23:45,032 --> 00:23:49,690 And I think Wikidata can be a useful resource for them 424 00:23:49,690 --> 00:23:51,609 because they don't have to manage the metadata, 425 00:23:51,609 --> 00:23:54,573 and they can focus on managing other things. 426 00:23:54,573 --> 00:23:56,084 Do you have something to add to that? 427 00:23:56,084 --> 00:23:57,136 No. 428 00:23:59,422 --> 00:24:00,792 [inaudible], perhaps? 429 00:24:01,042 --> 00:24:02,862 (man 3) I had the same question. 430 00:24:02,862 --> 00:24:04,460 (laughter) 431 00:24:04,460 --> 00:24:05,599 Perfect. 432 00:24:05,599 --> 00:24:09,194 (moderator) There was one more question back here. 433 00:24:11,843 --> 00:24:14,587 No, probably I hallucinated. Sorry. 434 00:24:16,103 --> 00:24:17,880 For one minute, we can show a query. 435 00:24:18,470 --> 00:24:19,506 Or not. 436 00:24:19,506 --> 00:24:21,275 (moderator) You have 30 seconds. 437 00:24:22,709 --> 00:24:24,599 Will the Query Service [inaudible]? 438 00:24:30,239 --> 00:24:32,202 We have links in the PDF, [inaudible]? 439 00:24:43,007 --> 00:24:45,400 (man 4) If there's still time, I have a question. 440 00:24:45,400 --> 00:24:46,532 Yes, please. 441 00:24:46,532 --> 00:24:48,201 During your presentation, did you notice 442 00:24:48,201 --> 00:24:53,711 that some of the identifiers have more than 100% [inaudible]? 443 00:24:53,711 --> 00:24:56,380 Yeah, it's because the examples-- 444 00:24:56,380 --> 00:24:59,552 so that reason, one of the users, for example, itself, 445 00:24:59,552 --> 00:25:01,239 because they use [inaudible] as examples. 446 00:25:01,239 --> 00:25:03,426 And also sometimes because there are broad matches. 
447 00:25:03,426 --> 00:25:05,991 So if it says something that's a bit-- 448 00:25:06,481 --> 00:25:09,381 So, yeah, that's one of my favorite-- if I can scroll it-- 449 00:25:09,381 --> 00:25:13,137 it's the characters of the Mario franchise linked to their games. 450 00:25:13,137 --> 00:25:15,706 (chuckles) 451 00:25:15,706 --> 00:25:19,285 So you can find like Wario and Princess Peach, and so on. 452 00:25:19,285 --> 00:25:20,371 And my favorite is-- 453 00:25:20,371 --> 00:25:23,896 if you look somewhere, yes, because there is Mario somewhere here, 454 00:25:23,896 --> 00:25:25,712 and there is Dr. Mario. 455 00:25:25,712 --> 00:25:29,105 And if you look at the item, it's said to be the same as-- 456 00:25:29,835 --> 00:25:33,842 because Mario plumber and Mario physician might be two different people, 457 00:25:33,842 --> 00:25:35,318 we don't really know. 458 00:25:35,318 --> 00:25:36,840 (laughter) 459 00:25:39,568 --> 00:25:42,845 (moderator) Thank you very much for this presentation. 460 00:25:43,743 --> 00:25:45,755 (applause)