WEBVTT 00:00:06.415 --> 00:00:08.619 Hi, so before we start, quickly, 00:00:08.619 --> 00:00:11.515 so I'm Jean-Fred, I'm a Wikidata volunteer. 00:00:11.855 --> 00:00:13.519 Hi, I am Envel, 00:00:13.519 --> 00:00:15.795 and I'm also a Wikidata volunteer. 00:00:16.365 --> 00:00:21.523 And I'm Tracy, and I get paid (chuckles) to volunteer for Wikidata, 00:00:21.523 --> 00:00:24.628 but I'm also enthusiastic to be here today, 00:00:24.628 --> 00:00:26.598 and I work for a research board. 00:00:29.553 --> 00:00:32.040 Alright, thanks for coming to our presentation: 00:00:32.040 --> 00:00:33.677 Sum of all video games: 00:00:33.677 --> 00:00:37.900 our road to make Wikidata the hub of all video game metadata. 00:00:38.570 --> 00:00:41.055 So, first off, why should we even care about video games, 00:00:41.055 --> 00:00:43.010 like aren't they just like kids playing Fortnite 00:00:43.010 --> 00:00:44.549 or something at night? 00:00:44.549 --> 00:00:46.381 So video games have been here for a long time, 00:00:46.381 --> 00:00:48.951 since the '70s or '60s or '40s. 00:00:48.951 --> 00:00:50.020 It depends what you ask. 00:00:50.020 --> 00:00:52.589 You can check Wikipedia's extensive coverage 00:00:52.589 --> 00:00:54.561 of what is even a game. 00:00:54.970 --> 00:00:56.531 It's a major cultural industry. 00:00:56.531 --> 00:00:59.131 More than 2.5 billion people play in the world, 00:00:59.131 --> 00:01:01.489 and we estimate that, at the very least, 00:01:01.489 --> 00:01:05.629 100,000-200,000 video games have been published since that time 00:01:05.629 --> 00:01:08.976 and that's not counting games published on the Play Store-- 00:01:08.976 --> 00:01:11.155 then you go through the millions, 00:01:11.155 --> 00:01:13.736 which is not that much when you're on Wikidata. 00:01:15.256 --> 00:01:18.933 So a little overview of the current state of video games on Wikidata. 
00:01:18.933 --> 00:01:21.771 These numbers are also on our poster on the ground floor, 00:01:21.771 --> 00:01:23.574 so we can also have it there. 00:01:23.574 --> 00:01:28.082 So we have video games or the Q7889, 00:01:28.082 --> 00:01:31.017 and we have 38,000 of them, 00:01:31.537 --> 00:01:32.628 which is not that much 00:01:32.628 --> 00:01:35.110 considering that there are at least 200,000, as I mentioned. 00:01:35.110 --> 00:01:39.667 We also have expansion packs, DLCs, and compilations 00:01:39.667 --> 00:01:41.838 but we also have, for example, game controllers. 00:01:41.838 --> 00:01:45.610 We have a lot of game consoles, about 700-- that's a lot. 00:01:46.520 --> 00:01:48.665 We have an extensive ontology of video game genres, 00:01:48.665 --> 00:01:50.201 that's pretty cool, 200 of them, 00:01:50.201 --> 00:01:53.462 and [inaudible] a bit on magazines also. 00:01:53.462 --> 00:01:56.394 Maybe video games could be a satellite even for WikiCite 00:01:56.394 --> 00:01:58.189 I don't know. (chuckles) 00:01:59.049 --> 00:02:00.901 But what about outside of Wikidata? 00:02:01.681 --> 00:02:04.878 There are a lot of databases out there about video games. 00:02:04.878 --> 00:02:06.721 You may have heard about some very big ones, 00:02:06.721 --> 00:02:09.032 like Mobygames or IGDB. 00:02:09.032 --> 00:02:12.337 There are also a lot of very special-interest databases-- 00:02:12.337 --> 00:02:15.742 databases that only cover certain types. 00:02:16.602 --> 00:02:20.900 Visual Novel Database only has about this niche genre 00:02:20.900 --> 00:02:22.290 that is a visual novel. 00:02:22.290 --> 00:02:26.708 You have databases that are only about games published on the Commodore 64, 00:02:26.708 --> 00:02:28.160 and so on. 
00:02:28.550 --> 00:02:32.831 But you also have government agencies and commercial players, 00:02:32.831 --> 00:02:35.301 government agencies [inaudible], called the rating agencies, 00:02:35.301 --> 00:02:38.830 the ones that put a little label: it's not good for your kids under 16. 00:02:38.830 --> 00:02:41.601 The problem is that there is no common identifier 00:02:41.601 --> 00:02:43.487 around all of these databases 00:02:43.487 --> 00:02:44.510 that binds them together. 00:02:44.510 --> 00:02:47.911 There is no cross-linking, or it is very little. 00:02:47.911 --> 00:02:52.686 Some database might be linked to their neighbor/friend's database, 00:02:52.686 --> 00:02:55.040 like the Amiga database talk to each other a little bit. 00:02:55.040 --> 00:02:58.629 But you won't have one easy way of saying all that. 00:03:00.279 --> 00:03:02.565 So there are different data coverage and specialization, 00:03:02.565 --> 00:03:05.590 and that often comes also with conceptual differences. 00:03:06.200 --> 00:03:10.841 A database might consider a game is a work, 00:03:10.841 --> 00:03:12.952 if you're into the FRBR model, 00:03:12.952 --> 00:03:14.226 or that might be an edition 00:03:14.226 --> 00:03:16.575 or that might be a particular console version. 00:03:17.055 --> 00:03:19.160 So there is a lot of granularity in there. 00:03:19.460 --> 00:03:22.117 And that's important in terms of coverage 00:03:22.117 --> 00:03:25.008 because some databases-- 00:03:25.008 --> 00:03:27.623 for example, Mobygames has a lot of information about a lot of things, 00:03:27.623 --> 00:03:29.039 but it doesn't have a lot of information 00:03:29.039 --> 00:03:32.417 about the games that were published on the early French computers, 00:03:32.417 --> 00:03:35.790 like the Oric or the Thomson TO MO series. 00:03:36.589 --> 00:03:40.063 You will find that into more French databases. 
00:03:40.063 --> 00:03:44.905 And if you go into Eastern video games, like China or Japan, 00:03:44.905 --> 00:03:47.460 it's not very well covered in Western databases. 00:03:48.260 --> 00:03:51.218 Enter WikiProject video games. 00:03:51.218 --> 00:03:53.515 (cheers and applause) 00:03:53.515 --> 00:03:54.572 (woman) Whoo-hoo! 00:03:54.572 --> 00:03:55.948 We didn't make that one, actually. 00:03:57.608 --> 00:03:59.278 So it lives at that address 00:03:59.278 --> 00:04:01.548 and there are a lot of subpages, 00:04:01.548 --> 00:04:06.005 and we're going to go through a little bit of what this project is made of. 00:04:06.005 --> 00:04:08.660 As often, there is-- 00:04:08.660 --> 00:04:11.815 we'll separate that in what's old and what's new 00:04:11.815 --> 00:04:13.958 and what's borrowed and what's blue. 00:04:14.458 --> 00:04:15.660 So, as old we have-- 00:04:15.660 --> 00:04:17.500 Like a lot of WikiProjects we have, 00:04:17.500 --> 00:04:19.480 an ontology description with all the properties. 00:04:19.480 --> 00:04:22.031 There are currently 64 properties, mostly for games, 00:04:22.031 --> 00:04:25.385 but also about series or hardware. 00:04:25.385 --> 00:04:28.208 And we have a fairly extensive, I think-- 00:04:28.208 --> 00:04:30.128 how to put it-- separations. 00:04:30.128 --> 00:04:31.380 We have things about the staff, 00:04:31.380 --> 00:04:32.966 but also about the narrative universe 00:04:32.966 --> 00:04:36.689 or about the gameplay, like how many players there are. 00:04:36.689 --> 00:04:39.387 So you can explore this; it's kind of very exciting. 00:04:39.877 --> 00:04:41.797 We also have example queries. 00:04:41.797 --> 00:04:44.033 If we have time at the end, we might show off some, 00:04:44.033 --> 00:04:46.029 but you can just explore them yourself. 00:04:51.455 --> 00:04:53.850 We also have something new. 00:04:53.850 --> 00:04:59.951 Because those things don't exist in other WikiProjects and Wikidata. 
00:05:00.545 --> 00:05:02.915 For example, we have an Activity Log. 00:05:02.915 --> 00:05:06.437 You can see it here. 00:05:06.437 --> 00:05:10.577 On this Activity Log, we track the activity of the project. 00:05:10.577 --> 00:05:16.725 So when we publish a blog post or an article somewhere, 00:05:16.725 --> 00:05:19.677 we add it here. 00:05:20.327 --> 00:05:23.281 When we create a new identifier property 00:05:23.281 --> 00:05:25.751 or any property related to video games, 00:05:25.751 --> 00:05:27.222 we also add it here. 00:05:28.172 --> 00:05:29.960 We also have achievements, 00:05:29.960 --> 00:05:32.611 like in January, we added a condition 00:05:32.611 --> 00:05:37.896 of an external identifier. 00:05:39.446 --> 00:05:42.046 Another thing that we do is we have a Tasks List. 00:05:42.046 --> 00:05:47.414 The Tasks List can be used by newcomers to the project 00:05:47.414 --> 00:05:51.711 to do things in the project. 00:05:52.331 --> 00:05:53.765 It can be [inaudible], 00:05:53.765 --> 00:05:59.574 so we give them an insight to [inaudible] 00:05:59.574 --> 00:06:01.057 and how to do that. 00:06:01.577 --> 00:06:05.116 It's also where we like [inaudible] 00:06:05.116 --> 00:06:08.751 [inaudible] 00:06:10.326 --> 00:06:12.959 We also have something borrowed. 00:06:13.897 --> 00:06:17.293 We have a lot of pages of statistics reports. 00:06:21.701 --> 00:06:24.631 We also have external identifiers that [inaudible]-- 00:06:24.631 --> 00:06:26.098 you can see it here-- 00:06:27.535 --> 00:06:29.923 where we track-- 00:06:29.923 --> 00:06:32.110 I don't know if you can see it-- 00:06:32.110 --> 00:06:36.124 but we have more than 100 external identifiers 00:06:36.124 --> 00:06:37.418 for video games, 00:06:37.418 --> 00:06:39.212 so this is big, huge. 00:06:39.212 --> 00:06:43.127 And here we can see for each item here-- 00:06:43.557 --> 00:06:45.147 just a little peek. 00:06:45.147 --> 00:06:50.623 And also the completion of the identifier. 
00:06:54.012 --> 00:06:56.725 So, some of these things we borrowed from the Sum of all Paintings 00:06:56.725 --> 00:06:59.705 and other things, that begins more blue. 00:06:59.705 --> 00:07:04.019 So the InteGraality tool that was made initially for Sum of all Paintings 00:07:04.679 --> 00:07:06.235 I extended it for video games, 00:07:06.235 --> 00:07:08.819 and then I might as well have done it for everybody. 00:07:09.610 --> 00:07:12.174 So, yeah, one day we'll get all of these. 00:07:12.174 --> 00:07:15.628 So this is the core properties, the genre/developer/publisher 00:07:16.348 --> 00:07:17.951 along video game systems, 00:07:17.951 --> 00:07:21.007 so Windows, PlayStation console and so on. 00:07:21.007 --> 00:07:23.456 So, as you can see, we have a lot of work to do 00:07:23.456 --> 00:07:26.439 for even like the very basic core properties. 00:07:26.949 --> 00:07:30.030 So, yeah, one day, all of that will be blue. 00:07:31.340 --> 00:07:32.920 What have we been doing? 00:07:34.340 --> 00:07:35.737 Things that we've been doing a lot 00:07:35.737 --> 00:07:38.886 has been creating identifiers with all these external databases 00:07:38.886 --> 00:07:40.071 and aligning them. 00:07:40.071 --> 00:07:44.644 So Envel mentioned we have created over 100 external identifier properties-- 00:07:45.134 --> 00:07:49.204 that covers very big databases and very tiny ones. 00:07:49.754 --> 00:07:54.732 We've been using the Mix'n'match tool extensively for matching. 00:07:54.732 --> 00:07:57.222 And sometimes we've been using things a bit more advanced 00:07:57.222 --> 00:07:59.886 that Envel will detail in a moment. 00:08:01.046 --> 00:08:03.308 Yeah, so 100 external identifier properties created 00:08:03.308 --> 00:08:06.129 in roughly a year to two years 00:08:06.129 --> 00:08:08.250 and over 16 Mix'n'match catalogs. 
00:08:08.250 --> 00:08:09.785 And I started tracking 00:08:09.785 --> 00:08:15.078 how many Q7889 items didn't have any identifiers, 00:08:15.078 --> 00:08:17.215 and five months ago it was 15,000 00:08:17.215 --> 00:08:20.386 and today we're down to 9,600, 00:08:20.386 --> 00:08:25.177 which is very much thanks to the teaching assistant of Tracy. 00:08:25.578 --> 00:08:28.817 So there's still 9,000 to go, but we're getting there. 00:08:32.556 --> 00:08:37.146 So we needed to import a lot of data 00:08:37.146 --> 00:08:40.826 to complete those identifiers. 00:08:42.996 --> 00:08:46.530 The first tool to do that is the Wikidata website. 00:08:47.280 --> 00:08:48.984 I think it's important to say it 00:08:48.984 --> 00:08:55.193 because it's where we can fix the small problems, and so on. 00:08:56.193 --> 00:09:02.020 But we also have dedicated tools to do that on Wikidata. 00:09:02.020 --> 00:09:04.925 There is Mix'n'match, and its gadget. 00:09:06.345 --> 00:09:10.179 The Mix'n'match Wiki gadget is a gadget that you can add 00:09:10.179 --> 00:09:12.438 to your account in Wikidata, 00:09:12.438 --> 00:09:17.363 and it adds all identifiers 00:09:17.363 --> 00:09:20.788 from [inaudible] Mix'n'match to an item. 00:09:22.549 --> 00:09:27.273 You can easily add serial IDs [inaudible]. 00:09:29.695 --> 00:09:33.146 Other tools... There is QuickStatements, of course. 00:09:33.526 --> 00:09:38.751 But you also can use more general tools, like OpenRefine, 00:09:38.751 --> 00:09:42.039 Dataiku Data Science Studio, et cetera. 00:09:43.369 --> 00:09:46.079 The point is it's very important for this project, 00:09:46.079 --> 00:09:48.750 and I think for all projects in Wikidata, 00:09:48.750 --> 00:09:53.183 to have a healthy ecosystem of tools that works. 00:09:59.413 --> 00:10:01.642 There are two examples of imports. 00:10:01.642 --> 00:10:06.279 The first one is connecting PCGamingWiki and Wikidata. 00:10:06.279 --> 00:10:09.383 It was made by a volunteer. 
00:10:09.383 --> 00:10:12.157 He made his own program in Ruby, 00:10:12.157 --> 00:10:13.529 so that's an example. 00:10:14.289 --> 00:10:15.347 The second one 00:10:15.347 --> 00:10:19.200 is linking the OLAC video game vocabulary with Wikidata. 00:10:19.200 --> 00:10:22.473 It was made using OpenRefine and Mix'n'match, 00:10:22.473 --> 00:10:27.210 and I think Tracy can talk more about this one. 00:10:28.549 --> 00:10:32.805 And I have a third example, which is one I made. 00:10:33.665 --> 00:10:38.080 I matched the catalog of BnF, 00:10:38.080 --> 00:10:41.984 so it's Bibliothèque... the French National Library 00:10:42.824 --> 00:10:45.548 with Wikidata. 00:10:45.548 --> 00:10:50.494 So they have about 4,000 entries 00:10:50.494 --> 00:10:52.571 about video games in their catalog, 00:10:52.571 --> 00:10:58.046 and I matched half of them to Wikidata. 00:10:59.398 --> 00:11:01.956 So, for that, I made a project 00:11:03.626 --> 00:11:05.864 in Dataiku Data Science Studio. 00:11:06.414 --> 00:11:10.465 You can see the work [inaudible]. 00:11:11.185 --> 00:11:12.219 I will not detail it, 00:11:12.219 --> 00:11:14.528 but if you have questions, feel free to ask. 00:11:15.508 --> 00:11:19.143 I also developed a Dataiku plugin to do it, 00:11:19.143 --> 00:11:21.535 to facilitate SPARQL querying 00:11:21.535 --> 00:11:25.639 because it's not included in the tool. 00:11:27.309 --> 00:11:31.644 One cool thing that happened after this one 00:11:31.644 --> 00:11:34.676 is that BnF contacted me about this project. 00:11:34.676 --> 00:11:36.744 So it was very cool to have feedback, 00:11:36.744 --> 00:11:40.178 and that contact was established. 00:11:44.472 --> 00:11:48.029 So, another topic, the link-- 00:11:48.029 --> 00:11:51.604 So we want Wikidata to be the linking hub for video games. 00:11:52.804 --> 00:11:54.321 As you can see here, 00:11:54.321 --> 00:11:57.510 a video game is, as Jean-Fred said, 00:11:57.510 --> 00:12:00.037 a video game is about a lot of things. 
00:12:01.417 --> 00:12:05.737 We have Reviews and Scores, Speedruns, 00:12:05.737 --> 00:12:08.284 News, Library ID, 00:12:09.584 --> 00:12:11.255 Soundtrack, etc. 00:12:11.919 --> 00:12:15.840 We don't want all this data to be in Wikidata, 00:12:15.840 --> 00:12:18.444 we want this data to be linked to Wikidata. 00:12:18.444 --> 00:12:20.681 So we want Wikidata to be, 00:12:22.041 --> 00:12:24.499 like [Lidia] said yesterday, a place-- 00:12:25.342 --> 00:12:29.161 We want to see Wikidata as a place you go, 00:12:29.161 --> 00:12:33.235 and then you go to another place. 00:12:33.590 --> 00:12:35.542 So I think that's it. 00:12:38.410 --> 00:12:41.230 And as you can see by the links, 00:12:42.550 --> 00:12:48.584 video games have really a lot of aspects to research, 00:12:49.194 --> 00:12:53.318 and video games are really complex cultural artifacts. 00:12:53.318 --> 00:12:55.667 There are [inaudible], there are [ed ones], 00:12:55.667 --> 00:12:59.511 remasters, re-releases, mods, updates, 00:12:59.511 --> 00:13:02.482 downloadable content, and so on and so forth. 00:13:02.482 --> 00:13:05.721 Plenty of remakes or remastered editions 00:13:05.721 --> 00:13:09.076 are separate items at this stage in Wikidata, 00:13:09.076 --> 00:13:11.009 but not necessarily. 00:13:11.009 --> 00:13:14.798 Additionally, remakes are not often linked to the original work 00:13:14.798 --> 00:13:17.486 using the property "based on." 00:13:17.486 --> 00:13:21.859 And perhaps we should create an entity schema for the video games, 00:13:21.859 --> 00:13:23.997 but we are still in the process 00:13:23.997 --> 00:13:28.859 of getting a discussion started for the data model of video games. 00:13:29.909 --> 00:13:32.193 Mostly, we have one item, 00:13:32.193 --> 00:13:36.434 what we typically recognize as "the game," 00:13:36.434 --> 00:13:38.620 when we say we played the same game, 00:13:38.620 --> 00:13:41.960 so it's like a Mario Kart 8.
00:13:41.960 --> 00:13:45.364 Even if we played it on different platforms, 00:13:45.364 --> 00:13:50.522 so, for example, on Switch, on Wii U, or something else. 00:13:50.522 --> 00:13:55.980 So Wikidata items for a game aggregate characteristics 00:13:55.980 --> 00:14:01.357 which are shared among different versions or editions. 00:14:01.357 --> 00:14:02.827 This makes linking not easy 00:14:02.827 --> 00:14:07.201 because many databases describe games on different levels, 00:14:07.201 --> 00:14:09.107 as Jean-Frédéric mentioned. 00:14:09.687 --> 00:14:13.913 For instance, some have one database entry for each edition, 00:14:13.913 --> 00:14:16.599 and this results in more than one identifier 00:14:16.599 --> 00:14:19.156 for each video game item. 00:14:19.156 --> 00:14:22.807 And so the use of specific qualifiers is needed. 00:14:23.527 --> 00:14:28.518 We have some discussions about the creation of separate items 00:14:29.228 --> 00:14:31.071 for editions or releases, 00:14:31.071 --> 00:14:33.676 as this is good practice for literature, 00:14:33.676 --> 00:14:39.932 but the FRBR model which is used for books seems not useful for everyone. 00:14:41.072 --> 00:14:45.763 This is also an ongoing discussion with the video game research community 00:14:45.763 --> 00:14:48.730 about the best data model for video games. 00:14:49.720 --> 00:14:54.396 And speaking about video game research and the research community, 00:14:54.396 --> 00:14:56.832 there is an active video game research community 00:14:56.832 --> 00:15:00.720 with a growing interest in data about games. 00:15:00.720 --> 00:15:05.245 Sadly, there are no national libraries for video games 00:15:05.245 --> 00:15:07.816 which have a comprehensive dataset 00:15:07.816 --> 00:15:09.752 with authority data about video games-- 00:15:09.752 --> 00:15:12.674 yes, the BnF with 4,000 video games, 00:15:12.674 --> 00:15:16.344 but there's still more outside.
00:15:17.114 --> 00:15:19.381 That means researchers rely on data 00:15:19.591 --> 00:15:23.626 from video game fan databases, 00:15:23.626 --> 00:15:25.678 but as we know, there are so many, 00:15:25.678 --> 00:15:28.874 and they're so different [inaudible]. 00:15:29.254 --> 00:15:30.753 And what makes it even harder, 00:15:30.753 --> 00:15:33.394 the data is not open. 00:15:33.394 --> 00:15:36.879 So could Wikidata be a source for video game research? 00:15:36.879 --> 00:15:38.061 Yes. 00:15:38.772 --> 00:15:40.485 I work for the research project diggr, 00:15:40.485 --> 00:15:44.275 and we have decided to work with Wikidata for our video game research, 00:15:44.275 --> 00:15:46.846 and we not only use the data which is already there, 00:15:46.846 --> 00:15:50.669 we also create data about video games and companies by hand 00:15:50.669 --> 00:15:54.565 or automatically, in Wikidata. 00:15:54.565 --> 00:15:59.144 Additionally, we have created about 20,000 links to Mobygames, 00:15:59.144 --> 00:16:02.648 GameFAQs and the Japanese Media Arts Database. 00:16:03.698 --> 00:16:10.210 And we also initiated an alignment with the OLAC video game genre vocabulary. 00:16:11.270 --> 00:16:13.815 So video game research colleagues in Japan 00:16:13.815 --> 00:16:17.670 are also experimenting with Wikidata 00:16:17.670 --> 00:16:20.729 to use it as a work authority for video games. 00:16:21.569 --> 00:16:24.982 So, our research will cause a lot of spatial data 00:16:24.982 --> 00:16:26.806 about video game companies 00:16:26.806 --> 00:16:31.310 and where video games have been released all over the world. 00:16:31.310 --> 00:16:37.352 So we use data from video game databases, like Mobygames, in Wikidata, 00:16:37.352 --> 00:16:41.026 to create some analyses like this.
00:16:41.026 --> 00:16:43.250 We call it Lemongrab, the tool, 00:16:43.250 --> 00:16:46.034 and the researcher can select one or more platforms 00:16:46.034 --> 00:16:48.921 and one or more release countries 00:16:48.921 --> 00:16:52.610 and they will get an overview of which companies are big players. 00:16:52.610 --> 00:16:56.684 In this case, the number of published or developed video games 00:16:57.284 --> 00:16:58.855 for this combination. 00:16:59.305 --> 00:17:01.359 Additionally, they can see which country 00:17:01.359 --> 00:17:05.313 is strongly represented by these companies. 00:17:06.153 --> 00:17:08.419 Or we use Wikidata Query Service directly 00:17:08.419 --> 00:17:13.589 to create maps of companies within the video game industry. 00:17:14.399 --> 00:17:20.990 So, at this stage, I think there are 5,000 video game companies 00:17:20.990 --> 00:17:23.327 already in Wikidata 00:17:23.327 --> 00:17:28.686 of which we have created half, I think. (chuckles) 00:17:29.204 --> 00:17:34.362 So, in conclusion, after two years of working with Wikidata for our research, 00:17:34.362 --> 00:17:35.481 we are very pleased, 00:17:35.481 --> 00:17:37.127 especially with the cooperation 00:17:37.127 --> 00:17:40.189 with the volunteers of the video game task force. 00:17:40.189 --> 00:17:41.612 Thank you for that. 00:17:41.612 --> 00:17:47.116 And we think Wikidata can be the one-stop shop for video game research 00:17:47.116 --> 00:17:52.541 because it already aggregates so many links to very specialized sites 00:17:52.541 --> 00:17:57.417 and it is not realistic that we put all the data into Wikidata. 00:18:00.422 --> 00:18:01.522 Thank you. 00:18:01.522 --> 00:18:04.361 At the same time, we want to be useful for the researchers. 00:18:04.361 --> 00:18:07.682 We also want to stay or to be or to become, 00:18:07.682 --> 00:18:10.470 however you want it, useful to the Wikipedias.
00:18:10.470 --> 00:18:12.271 Right now, some Wikipedias are using the data 00:18:12.271 --> 00:18:15.649 from Wikidata for their infoboxes. 00:18:15.649 --> 00:18:18.719 So if tomorrow we just revamp the entire data model 00:18:18.719 --> 00:18:20.554 in a way they can't use it anymore, 00:18:20.554 --> 00:18:22.357 it doesn't sound like a great idea. 00:18:22.357 --> 00:18:24.175 So we'll try not to do that. 00:18:26.163 --> 00:18:30.520 I think we want to be enhancing all the databases, 00:18:30.520 --> 00:18:32.590 and that's something that's already started. 00:18:32.590 --> 00:18:36.891 So if you go to Visual Novel Database right now at vndb.org, 00:18:36.891 --> 00:18:39.783 following a research workshop that we did 00:18:39.783 --> 00:18:41.145 with the nice diggr folks, 00:18:41.145 --> 00:18:42.499 we could meet with the database, 00:18:42.499 --> 00:18:45.545 and they were interested enough in all the linkage that we made 00:18:45.545 --> 00:18:51.204 that they could harvest more links about the entity that they talk about. 00:18:51.204 --> 00:18:57.916 Like, "Well, okay, thanks to Wikidata, we also retrieved reviews or speedruns 00:18:57.916 --> 00:18:59.768 or a store where you can buy these games." 00:18:59.768 --> 00:19:02.523 So we're already being useful. 00:19:02.523 --> 00:19:04.059 So that was a fine example. 00:19:04.059 --> 00:19:07.864 But also this German researcher 00:19:07.864 --> 00:19:11.971 just started the Internationale Computerspielesammlung, 00:19:11.971 --> 00:19:13.958 (chuckles) 00:19:13.958 --> 00:19:17.532 which is online, which has all the data about the German video games, 00:19:17.532 --> 00:19:19.802 what they have in their collections, 00:19:19.802 --> 00:19:23.923 and they've been using Wikidata to enrich the data IDs for labels, 00:19:23.923 --> 00:19:25.969 so they have alternate titles. 00:19:26.779 --> 00:19:28.297 So that was also pretty cool.
00:19:30.067 --> 00:19:33.391 I think Wikidata can be the backend for powering applications. 00:19:33.391 --> 00:19:36.194 So, an example that already exists is vglist.co, 00:19:36.194 --> 00:19:38.378 and in some ways a little bit similar 00:19:38.378 --> 00:19:40.751 to what inventaire.io does for books, 00:19:40.751 --> 00:19:43.882 vglist.co does it for video games. 00:19:44.942 --> 00:19:47.413 It's an app where you can record the games you've played, 00:19:47.413 --> 00:19:49.515 how long you spent, and your favorites. 00:19:49.515 --> 00:19:52.670 And I just really like the fact that it's built on top of Wikidata. 00:19:52.670 --> 00:19:54.234 It's pretty cool. 00:19:54.724 --> 00:19:59.482 So maybe one day we can just connect all these things together 00:19:59.482 --> 00:20:02.820 and harvest SPARQL to query data, 00:20:02.820 --> 00:20:05.074 and it really doesn't matter where it is, 00:20:05.074 --> 00:20:07.780 and say, "Yeah, data is not a database," 00:20:07.780 --> 00:20:09.215 and that will be fine. 00:20:09.765 --> 00:20:12.604 Thank you very much, and we'll take questions. 00:20:12.604 --> 00:20:14.812 (moderator) We just have five minutes for questions. 00:20:14.812 --> 00:20:16.478 (applause) 00:20:22.870 --> 00:20:25.674 (man) Hello, I really love your project, 00:20:25.674 --> 00:20:28.713 and when I want to contribute, where should I go? 00:20:29.080 --> 00:20:31.350 So there was a short URL in there, 00:20:31.350 --> 00:20:32.437 and as Envel mentioned, 00:20:32.437 --> 00:20:35.874 there are tabs at the top with the links to the SPARQL queries and so on. 00:20:35.874 --> 00:20:37.885 And there is a Tasks tab, 00:20:37.885 --> 00:20:40.565 which is like a couple of suggestions on where to get started. 00:20:40.565 --> 00:20:43.630 But it's not mandatory, you can work on whatever you want, obviously. 00:20:43.630 --> 00:20:45.005 But, yeah, that's a nice place.
00:20:45.005 --> 00:20:48.442 And if you have a project, you can also bring it to the Talk page. 00:20:48.442 --> 00:20:49.803 It's not a very lively Talk page, 00:20:49.803 --> 00:20:53.437 like a lot of Wikidata Project Talk pages, in many ways, 00:20:53.437 --> 00:20:58.071 but I will read and answer, so that's a start. 00:20:58.071 --> 00:20:59.593 Do you already have something in mind? 00:20:59.593 --> 00:21:01.723 We can talk after this if you have something in mind. 00:21:02.518 --> 00:21:04.070 - Allons-y. - (woman) Hi there. 00:21:04.070 --> 00:21:07.983 So I work with a group from University of Copenhagen 00:21:07.983 --> 00:21:10.247 and University of Washington 00:21:10.247 --> 00:21:14.390 who are working on an initiative called Atari Women, 00:21:15.131 --> 00:21:17.009 recognizing all the women 00:21:17.009 --> 00:21:20.918 who've been involved through the years with the Atari game system. 00:21:20.918 --> 00:21:22.967 And so I'm wondering if-- 00:21:22.967 --> 00:21:26.218 I believe that your WikiProject 00:21:26.218 --> 00:21:30.308 covers the developers, the designers and such, 00:21:30.998 --> 00:21:37.235 but obviously, it crosses into the biography part of our world. 00:21:37.725 --> 00:21:40.245 And so how does that work? 00:21:42.175 --> 00:21:45.604 Is there someone who's more specialized in that area 00:21:45.604 --> 00:21:52.128 who these folks at these two universities could connect with, or... 00:21:53.046 --> 00:21:54.399 Thoughts? 00:21:56.409 --> 00:21:58.808 I don't think there will be somebody in particular. 00:21:59.998 --> 00:22:02.976 My impression of the [inaudible] project is that they are fairly eclectic. 00:22:02.976 --> 00:22:05.754 Sometimes people specialize on very specific niche topics. 00:22:05.754 --> 00:22:07.039 In that case, I don't think so. 00:22:07.039 --> 00:22:10.166 So I'll be happy to take the call. 
00:22:10.166 --> 00:22:11.291 So, to answer your question, 00:22:11.291 --> 00:22:14.450 yes, that will definitely be in the scope of our project. 00:22:16.237 --> 00:22:19.476 And in that period, particularly, I don't think we want to turn back 00:22:19.476 --> 00:22:22.466 because these days video games are made by like 1,000 people 00:22:22.466 --> 00:22:24.784 and do we want to create an item about every single person, 00:22:24.784 --> 00:22:27.489 like the credit rolls of a movie, right? 00:22:27.489 --> 00:22:30.163 So in modern times, I don't know if we want to be that database, 00:22:30.163 --> 00:22:32.790 the ultimate database of game credits. 00:22:33.750 --> 00:22:36.643 But for the Atari early days-- oh, definitely, 00:22:36.643 --> 00:22:38.523 I would actually love to see the dataset 00:22:38.523 --> 00:22:42.480 because it's a lot of dudes in common knowledge of... 00:22:42.480 --> 00:22:44.487 - (woman) I'll connect you to that. - Yes, please. 00:22:44.487 --> 00:22:46.031 (laughter) 00:22:48.406 --> 00:22:50.210 (moderator) Any other questions? 00:22:53.490 --> 00:22:55.192 Sir, just in front of you. 00:22:56.230 --> 00:22:58.372 (man 2) Do you collaborate with the Internet Archive? 00:22:58.372 --> 00:23:02.906 Because there's not a month going by that Jason Scott doesn't post. 00:23:02.906 --> 00:23:07.111 He's rescued 170,000 DOS games or stuff like that. 00:23:11.100 --> 00:23:15.701 There are Internet Archives identifiers on some game items, 00:23:15.701 --> 00:23:17.939 which is a bit weird because usually on the Internet Archive 00:23:17.939 --> 00:23:19.926 there's going to be a particular release of the game, 00:23:19.926 --> 00:23:21.666 again on the difference... 00:23:21.666 --> 00:23:24.064 Last time I checked there were four or five Prince of Persia 00:23:24.064 --> 00:23:25.067 on the Internet Archive 00:23:25.067 --> 00:23:28.130 because they have the Apple II version and the DOS version and so on. 
00:23:28.130 --> 00:23:29.708 So not explicitly. 00:23:29.708 --> 00:23:36.519 In general, I think we probably want to make some connections more general 00:23:36.519 --> 00:23:39.239 with the video game preservation scene. 00:23:39.239 --> 00:23:45.032 There is a quite lively organization that work hard on video game preservation. 00:23:45.032 --> 00:23:49.690 And I think Wikidata can be a useful resource for them 00:23:49.690 --> 00:23:51.609 because they don't have to manage the metadata, 00:23:51.609 --> 00:23:54.573 and they can focus on managing other things. 00:23:54.573 --> 00:23:56.084 Do you have something to add to that? 00:23:56.084 --> 00:23:57.136 No. 00:23:59.422 --> 00:24:00.792 [inaudible], perhaps? 00:24:01.042 --> 00:24:02.862 (man 3) I had the same question. 00:24:02.862 --> 00:24:04.460 (laughter) 00:24:04.460 --> 00:24:05.599 Perfect. 00:24:05.599 --> 00:24:09.194 (moderator) There was one more question back here. 00:24:11.843 --> 00:24:14.587 No, probably I hallucinated. Sorry. 00:24:16.103 --> 00:24:17.880 For one minute, we can show a query. 00:24:18.470 --> 00:24:19.506 Or not. 00:24:19.506 --> 00:24:21.275 (moderator) You have 30 seconds. 00:24:22.709 --> 00:24:24.599 Will the Query Service [inaudible]? 00:24:30.239 --> 00:24:32.202 We have links in the PDF, [inaudible]? 00:24:43.007 --> 00:24:45.400 (man 4) If there's still time, I have a question. 00:24:45.400 --> 00:24:46.532 Yes, please. 00:24:46.532 --> 00:24:48.201 During your presentation, did you notice 00:24:48.201 --> 00:24:53.711 that some of the identifiers have more than 100% [inaudible]? 00:24:53.711 --> 00:24:56.380 Yeah, it's because the examples-- 00:24:56.380 --> 00:24:59.552 so that reason, one of the users, for example, itself, 00:24:59.552 --> 00:25:01.239 because they use [inaudible] as examples. 00:25:01.239 --> 00:25:03.426 And also sometimes because there are broad matches. 
00:25:03.426 --> 00:25:05.991 So if it says something that's a bit-- 00:25:06.481 --> 00:25:09.381 So, yeah, that's one of my favorite-- if I can scroll it-- 00:25:09.381 --> 00:25:13.137 it's the characters of the Mario franchise linked to their games. 00:25:13.137 --> 00:25:15.706 (chuckles) 00:25:15.706 --> 00:25:19.285 So you can find like Wario and Princess Peach, and so on. 00:25:19.285 --> 00:25:20.371 And my favorite is-- 00:25:20.371 --> 00:25:23.896 if you look somewhere, yes, because there is Mario somewhere here, 00:25:23.896 --> 00:25:25.712 and there is Dr. Mario. 00:25:25.712 --> 00:25:29.105 And if you look at the item, it's said to be the same as-- 00:25:29.835 --> 00:25:33.842 because Mario plumber and Mario physician might be two different people, 00:25:33.842 --> 00:25:35.318 we don't really know. 00:25:35.318 --> 00:25:36.840 (laughter) 00:25:39.568 --> 00:25:42.845 (moderator) Thank you very much for this presentation. 00:25:43.743 --> 00:25:45.755 (applause)
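The talk mentions tracking how many Q7889 (video game) items have no external identifier at all, and using the Wikidata Query Service directly. A count like that could be requested roughly as sketched below. This is only a sketch under assumptions: the talk does not show the query actually used, so the SPARQL pattern, the helper names (`build_query`, `build_url`, `parse_count`) and the sample payload are illustrative, and a query of this shape may well time out on the live endpoint.

```python
from urllib.parse import urlencode

# Public Wikidata Query Service endpoint.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Assumed query shape (not taken from the talk): count items that are an
# instance of the given class and have no statement whose property is of
# the external-identifier type.
QUERY_TEMPLATE = """
SELECT (COUNT(DISTINCT ?game) AS ?count) WHERE {
  ?game wdt:P31 wd:%s .
  FILTER NOT EXISTS {
    ?prop wikibase:propertyType wikibase:ExternalId ;
          wikibase:directClaim ?claim .
    ?game ?claim ?value .
  }
}
"""


def build_query(class_qid="Q7889"):
    """Return the SPARQL text for the given class (Q7889 is 'video game')."""
    return QUERY_TEMPLATE % class_qid


def build_url(query):
    """Return a GET URL asking the endpoint for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


def parse_count(payload):
    """Extract the integer count from a SPARQL JSON results document."""
    return int(payload["results"]["bindings"][0]["count"]["value"])


# Hypothetical response payload, shaped like standard SPARQL JSON results.
sample_payload = {"results": {"bindings": [{"count": {"value": "9600"}}]}}
print(parse_count(sample_payload))
```

The network call itself is deliberately left out; fetching `build_url(build_query())` with any HTTP client and passing the decoded JSON to `parse_count` would complete the loop.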