1
00:00:06,415 --> 00:00:08,619
Hi, so before we start, quickly,
2
00:00:08,619 --> 00:00:11,515
so I'm Jean-Fred,
I'm a Wikidata volunteer.
3
00:00:11,855 --> 00:00:13,519
Hi, I am Envel,
4
00:00:13,519 --> 00:00:15,795
and I'm also a Wikidata volunteer.
5
00:00:16,365 --> 00:00:21,523
And I'm Tracy, and I get paid (chuckles)
to volunteer for Wikidata,
6
00:00:21,523 --> 00:00:24,628
but I'm also enthusiastic
to be here today,
7
00:00:24,628 --> 00:00:26,598
and I work for a research project.
8
00:00:29,553 --> 00:00:32,040
Alright, thanks for coming
to our presentation:
9
00:00:32,040 --> 00:00:33,677
Sum of all video games:
10
00:00:33,677 --> 00:00:37,900
our road to make Wikidata
the hub of all video game metadata.
11
00:00:38,570 --> 00:00:41,055
So, first off, why should
we even care about video games,
12
00:00:41,055 --> 00:00:43,010
like aren't they just
like kids playing Fortnite
13
00:00:43,010 --> 00:00:44,549
or something at night?
14
00:00:44,549 --> 00:00:46,381
So video games
have been here for a long time,
15
00:00:46,381 --> 00:00:48,951
since the '70s or '60s or '40s.
16
00:00:48,951 --> 00:00:50,020
It depends who you ask.
17
00:00:50,020 --> 00:00:52,589
You can check Wikipedia's
extensive coverage
18
00:00:52,589 --> 00:00:54,561
of what is even a game.
19
00:00:54,970 --> 00:00:56,531
It's a major cultural industry.
20
00:00:56,531 --> 00:00:59,131
More than 2.5 billion people
play in the world,
21
00:00:59,131 --> 00:01:01,489
and we estimate that, at the very least,
22
00:01:01,489 --> 00:01:05,629
100,000-200,000 video games
have been published since that time
23
00:01:05,629 --> 00:01:08,976
and that's not counting games
published on the Play Store--
24
00:01:08,976 --> 00:01:11,155
then you go into the millions,
25
00:01:11,155 --> 00:01:13,736
which is not that much
when you're on Wikidata.
26
00:01:15,256 --> 00:01:18,933
So a little overview of the current state
of video games on Wikidata.
27
00:01:18,933 --> 00:01:21,771
These numbers are also
on our poster on the ground floor,
28
00:01:21,771 --> 00:01:23,574
so you can also have a look there.
29
00:01:23,574 --> 00:01:28,082
So we have video games, or Q7889,
30
00:01:28,082 --> 00:01:31,017
and we have 38,000 of them,
31
00:01:31,537 --> 00:01:32,628
which is not that much
32
00:01:32,628 --> 00:01:35,110
considering that there are
at least 200,000, as I mentioned.
33
00:01:35,110 --> 00:01:39,667
We also have expansion packs,
DLCs, and compilations
34
00:01:39,667 --> 00:01:41,838
but we also have, for example,
game controllers.
35
00:01:41,838 --> 00:01:45,610
We have a lot of game consoles,
about 700-- that's a lot.
36
00:01:46,520 --> 00:01:48,665
We have an extensive ontology
of video game genres,
37
00:01:48,665 --> 00:01:50,201
that's pretty cool, 200 of them,
38
00:01:50,201 --> 00:01:53,462
and [inaudible] a bit on magazines also.
39
00:01:53,462 --> 00:01:56,394
Maybe video games
could be a satellite even for WikiCite
40
00:01:56,394 --> 00:01:58,189
I don't know. (chuckles)
41
00:01:59,049 --> 00:02:00,901
But what about outside of Wikidata?
42
00:02:01,681 --> 00:02:04,878
There are a lot of databases
out there about video games.
43
00:02:04,878 --> 00:02:06,721
You may have heard about
some very big ones,
44
00:02:06,721 --> 00:02:09,032
like MobyGames or IGDB.
45
00:02:09,032 --> 00:02:12,337
There are also a lot
of very special-interest databases--
46
00:02:12,337 --> 00:02:15,742
databases that only cover certain types.
47
00:02:16,602 --> 00:02:20,900
Visual Novel Database
is only about this niche genre
48
00:02:20,900 --> 00:02:22,290
that is the visual novel.
49
00:02:22,290 --> 00:02:26,708
You have databases that are only about
games published on the Commodore 64,
50
00:02:26,708 --> 00:02:28,160
and so on.
51
00:02:28,550 --> 00:02:32,831
But you also have government agencies
and commercial players,
52
00:02:32,831 --> 00:02:35,301
government agencies [inaudible],
called the rating agencies,
53
00:02:35,301 --> 00:02:38,830
the ones that put a little label:
it's not good for your kids under 16.
54
00:02:38,830 --> 00:02:41,601
The problem is that
there is no common identifier
55
00:02:41,601 --> 00:02:43,487
around all of these databases
56
00:02:43,487 --> 00:02:44,510
that binds them together.
57
00:02:44,510 --> 00:02:47,911
There is no cross-linking,
or it is very little.
58
00:02:47,911 --> 00:02:52,686
Some databases might be linked
to their neighbor's or friend's database,
59
00:02:52,686 --> 00:02:55,040
like the Amiga databases
talk to each other a little bit.
60
00:02:55,040 --> 00:02:58,629
But you won't have
one easy way of saying all that.
61
00:03:00,279 --> 00:03:02,565
So there is different
data coverage and specialization,
62
00:03:02,565 --> 00:03:05,590
and that often comes
also with conceptual differences.
63
00:03:06,200 --> 00:03:10,841
A database might consider
that a game is a work,
64
00:03:10,841 --> 00:03:12,952
if you're into the FRBR model,
65
00:03:12,952 --> 00:03:14,226
or that might be an edition
66
00:03:14,226 --> 00:03:16,575
or that might be
a particular console version.
67
00:03:17,055 --> 00:03:19,160
So there is a lot of granularity in there.
68
00:03:19,460 --> 00:03:22,117
And that's important in terms of coverage
69
00:03:22,117 --> 00:03:25,008
because some databases--
70
00:03:25,008 --> 00:03:27,623
for example, MobyGames has a lot
of information about a lot of things,
71
00:03:27,623 --> 00:03:29,039
but it doesn't have a lot of information
72
00:03:29,039 --> 00:03:32,417
about the games that were published
on the early French computers,
73
00:03:32,417 --> 00:03:35,790
like the Oric
or the Thomson TO and MO series.
74
00:03:36,589 --> 00:03:40,063
You will find that
in more French databases.
75
00:03:40,063 --> 00:03:44,905
And if you go into Eastern video games,
like China or Japan,
76
00:03:44,905 --> 00:03:47,460
it's not very well covered
in Western databases.
77
00:03:48,260 --> 00:03:51,218
Enter WikiProject video games.
78
00:03:51,218 --> 00:03:53,515
(cheers and applause)
79
00:03:53,515 --> 00:03:54,572
(woman) Whoo-hoo!
80
00:03:54,572 --> 00:03:55,948
We didn't make that one, actually.
81
00:03:57,608 --> 00:03:59,278
So it lives at that address
82
00:03:59,278 --> 00:04:01,548
and there are a lot of subpages,
83
00:04:01,548 --> 00:04:06,005
and we're going to go through a little bit
of what this project is made of.
84
00:04:06,005 --> 00:04:08,660
As often, there is--
85
00:04:08,660 --> 00:04:11,815
we'll separate that
into what's old and what's new
86
00:04:11,815 --> 00:04:13,958
and what's borrowed and what's blue.
87
00:04:14,458 --> 00:04:15,660
So, as old we have--
88
00:04:15,660 --> 00:04:17,500
Like a lot of WikiProjects, we have
89
00:04:17,500 --> 00:04:19,480
an ontology description
with all the properties.
90
00:04:19,480 --> 00:04:22,031
There are currently 64 properties,
mostly for games,
91
00:04:22,031 --> 00:04:25,385
but also about series or hardware.
92
00:04:25,385 --> 00:04:28,208
And we have a fairly extensive, I think--
93
00:04:28,208 --> 00:04:30,128
how to put it-- separations.
94
00:04:30,128 --> 00:04:31,380
We have things about the staff,
95
00:04:31,380 --> 00:04:32,966
but also about the narrative universe
96
00:04:32,966 --> 00:04:36,689
or about the gameplay,
like how many players there are.
97
00:04:36,689 --> 00:04:39,387
So you can explore this;
it's kind of very exciting.
98
00:04:39,877 --> 00:04:41,797
We also have example queries.
99
00:04:41,797 --> 00:04:44,033
If we have time at the end,
we might show off some,
100
00:04:44,033 --> 00:04:46,029
but you can just explore them yourself.
101
00:04:51,455 --> 00:04:53,850
We also have something new.
102
00:04:53,850 --> 00:04:59,951
Because those things don't exist
in other WikiProjects on Wikidata.
103
00:05:00,545 --> 00:05:02,915
For example, we have an Activity Log.
104
00:05:02,915 --> 00:05:06,437
You can see it here.
105
00:05:06,437 --> 00:05:10,577
On this Activity Log, we track
the activity of the project.
106
00:05:10,577 --> 00:05:16,725
So when we publish a blog post
or an article somewhere,
107
00:05:16,725 --> 00:05:19,677
we add it here.
108
00:05:20,327 --> 00:05:23,281
When we create a new identifier property
109
00:05:23,281 --> 00:05:25,751
or any property related to video games,
110
00:05:25,751 --> 00:05:27,222
we also add it here.
111
00:05:28,172 --> 00:05:29,960
We also have achievements,
112
00:05:29,960 --> 00:05:32,611
like in January, we added a condition
113
00:05:32,611 --> 00:05:37,896
of an external identifier.
114
00:05:39,446 --> 00:05:42,046
Another thing that we do
is we have a Tasks List.
115
00:05:42,046 --> 00:05:47,414
The Tasks List can be used
by newcomers to the project
116
00:05:47,414 --> 00:05:51,711
to do things in the project.
117
00:05:52,331 --> 00:05:53,765
It can be [inaudible],
118
00:05:53,765 --> 00:05:59,574
so we give them an insight to [inaudible]
119
00:05:59,574 --> 00:06:01,057
and how to do that.
120
00:06:01,577 --> 00:06:05,116
It's also where we like [inaudible]
121
00:06:05,116 --> 00:06:08,751
[inaudible]
122
00:06:10,326 --> 00:06:12,959
We also have something borrowed.
123
00:06:13,897 --> 00:06:17,293
We have a lot of pages
of statistics reports.
124
00:06:21,701 --> 00:06:24,631
We also have external identifiers
that [inaudible]--
125
00:06:24,631 --> 00:06:26,098
you can see it here--
126
00:06:27,535 --> 00:06:29,923
where we track--
127
00:06:29,923 --> 00:06:32,110
I don't know if you can see it--
128
00:06:32,110 --> 00:06:36,124
but we have more than
100 external identifiers
129
00:06:36,124 --> 00:06:37,418
for video games,
130
00:06:37,418 --> 00:06:39,212
so this is big, huge.
131
00:06:39,212 --> 00:06:43,127
And here we can see for each item here--
132
00:06:43,557 --> 00:06:45,147
just a little peek.
133
00:06:45,147 --> 00:06:50,623
And also the completion of the identifier.
134
00:06:54,012 --> 00:06:56,725
So, some of these things we borrowed
from the Sum of all Paintings
135
00:06:56,725 --> 00:06:59,705
and other things. That brings us to blue.
136
00:06:59,705 --> 00:07:04,019
So the InteGraality tool that was made
initially for Sum of all Paintings
137
00:07:04,679 --> 00:07:06,235
I extended it for video games,
138
00:07:06,235 --> 00:07:08,819
and then I might as well
have done it for everybody.
139
00:07:09,610 --> 00:07:12,174
So, yeah, one day we'll get all of these.
140
00:07:12,174 --> 00:07:15,628
So this is the core properties,
the genre/developer/publisher
141
00:07:16,348 --> 00:07:17,951
along video game systems,
142
00:07:17,951 --> 00:07:21,007
so Windows,
PlayStation console and so on.
143
00:07:21,007 --> 00:07:23,456
So, as you can see,
we have a lot of work to do
144
00:07:23,456 --> 00:07:26,439
for even like
the very basic core properties.
145
00:07:26,949 --> 00:07:30,030
So, yeah, one day,
all of that will be blue.
146
00:07:31,340 --> 00:07:32,920
What have we been doing?
147
00:07:34,340 --> 00:07:35,737
One thing that we've been doing a lot
148
00:07:35,737 --> 00:07:38,886
has been creating identifiers
for all these external databases
149
00:07:38,886 --> 00:07:40,071
and aligning them.
150
00:07:40,071 --> 00:07:44,644
So Envel mentioned we have created
over 100 external identifier properties--
151
00:07:45,134 --> 00:07:49,204
that covers very big databases
and very tiny ones.
152
00:07:49,754 --> 00:07:54,732
We've been using the Mix'n'match tool
extensively for matching.
153
00:07:54,732 --> 00:07:57,222
And sometimes we've been using things
a bit more advanced
154
00:07:57,222 --> 00:07:59,886
that Envel will detail in a moment.
155
00:08:01,046 --> 00:08:03,308
Yeah, so 100 external
identifier properties created
156
00:08:03,308 --> 00:08:06,129
in roughly a year to two years
157
00:08:06,129 --> 00:08:08,250
and over 16 Mix'n'match catalogs.
158
00:08:08,250 --> 00:08:09,785
And I started tracking
159
00:08:09,785 --> 00:08:15,078
how many Q7889 items
didn't have any identifiers,
160
00:08:15,078 --> 00:08:17,215
and five months ago it was 15,000
161
00:08:17,215 --> 00:08:20,386
and today we're down to 9,600,
162
00:08:20,386 --> 00:08:25,177
which is very much thanks
to the teaching assistant of Tracy.
163
00:08:25,578 --> 00:08:28,817
So there's still 9,000 to go,
but we're getting there.
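A count like the one just described (Q7889 items that have no identifier at all) could be tracked with a query along these lines. This is a minimal sketch, not the query the speakers actually used: the function name is hypothetical, "video game" is assumed to mean a direct instance of (P31) Q7889, and "identifier" is assumed to mean any statement whose property has the external-identifier datatype. The code only builds the SPARQL text; sending it to the Wikidata Query Service is left out, and a query over a class this large may time out there.

```python
def build_missing_id_query(class_qid: str = "Q7889") -> str:
    """Build SPARQL that counts instances of class_qid with no external-ID statement.

    Hypothetical helper for illustration; Q7889 is the "video game" item
    named in the talk, everything else is an assumption.
    """
    if not (class_qid.startswith("Q") and class_qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata item ID: {class_qid!r}")
    return f"""
SELECT (COUNT(DISTINCT ?game) AS ?missing) WHERE {{
  ?game wdt:P31 wd:{class_qid} .
  # Exclude any item that has at least one statement whose property
  # is declared with the external-identifier datatype.
  FILTER NOT EXISTS {{
    ?game ?claim ?value .
    ?property wikibase:directClaim ?claim ;
              wikibase:propertyType wikibase:ExternalId .
  }}
}}
""".strip()


if __name__ == "__main__":
    print(build_missing_id_query())
```

Running the printed query at regular intervals and recording the result would give the kind of trend mentioned here (15,000 five months ago, 9,600 today).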
164
00:08:32,556 --> 00:08:37,146
So we needed to import a lot of data
165
00:08:37,146 --> 00:08:40,826
to complete those identifiers.
166
00:08:42,996 --> 00:08:46,530
The first tool to do that
is the Wikidata website.
167
00:08:47,280 --> 00:08:48,984
I think it's important to say it
168
00:08:48,984 --> 00:08:55,193
because it's where we can fix
the small problems, and so on.
169
00:08:56,193 --> 00:09:02,020
But we also have dedicated tools
to do that on Wikidata.
170
00:09:02,020 --> 00:09:04,925
There is Mix'n'match, and its gadget.
171
00:09:06,345 --> 00:09:10,179
The Mix'n'match Wiki gadget
is a gadget that you can add
172
00:09:10,179 --> 00:09:12,438
to your account in Wikidata,
173
00:09:12,438 --> 00:09:17,363
and it adds all identifiers
174
00:09:17,363 --> 00:09:20,788
from [inaudible] Mix'n'match to an item.
175
00:09:22,549 --> 00:09:27,273
You can easily add several IDs [inaudible].
176
00:09:29,695 --> 00:09:33,146
Other tools...
There is QuickStatements, of course.
177
00:09:33,526 --> 00:09:38,751
But you also can use
more general tools, like OpenRefine,
178
00:09:38,751 --> 00:09:42,039
Dataiku Data Science Studio, et cetera.
179
00:09:43,369 --> 00:09:46,079
The point is it's very important
for this project,
180
00:09:46,079 --> 00:09:48,750
and I think for all projects in Wikidata,
181
00:09:48,750 --> 00:09:53,183
to have a healthy ecosystem
of tools that work.
182
00:09:59,413 --> 00:10:01,642
Here are two examples of imports.
183
00:10:01,642 --> 00:10:06,279
The first one is connecting
PCGamingWiki and Wikidata.
184
00:10:06,279 --> 00:10:09,383
It was made by a volunteer.
185
00:10:09,383 --> 00:10:12,157
He made his own program in Ruby,
186
00:10:12,157 --> 00:10:13,529
so that's an example.
187
00:10:14,289 --> 00:10:15,347
The second one
188
00:10:15,347 --> 00:10:19,200
is linking the OLAC video game
vocabulary with Wikidata.
189
00:10:19,200 --> 00:10:22,473
It was made using OpenRefine
and Mix'n'match,
190
00:10:22,473 --> 00:10:27,210
and I think Tracy
can talk more about this one.
191
00:10:28,549 --> 00:10:32,805
And I have a third example,
which is one I made.
192
00:10:33,665 --> 00:10:38,080
I matched the catalog of BnF,
193
00:10:38,080 --> 00:10:41,984
so it's Bibliothèque...
the French National Library
194
00:10:42,824 --> 00:10:45,548
with Wikidata.
195
00:10:45,548 --> 00:10:50,494
So they have about 4,000 entries
196
00:10:50,494 --> 00:10:52,571
about video games in their catalog,
197
00:10:52,571 --> 00:10:58,046
and I matched half of them to Wikidata.
198
00:10:59,398 --> 00:11:01,956
So, for that, I made a project
199
00:11:03,626 --> 00:11:05,864
in Dataiku Data Science Studio.
200
00:11:06,414 --> 00:11:10,465
You can see the work [inaudible].
201
00:11:11,185 --> 00:11:12,219
I will not detail it,
202
00:11:12,219 --> 00:11:14,528
but if you have questions,
feel free to ask.
203
00:11:15,508 --> 00:11:19,143
I also developed
a Dataiku plugin to do it,
204
00:11:19,143 --> 00:11:21,535
to facilitate SPARQL querying
205
00:11:21,535 --> 00:11:25,639
because it's not included in the tool.
206
00:11:27,309 --> 00:11:31,644
One cool thing that happened
after this one
207
00:11:31,644 --> 00:11:34,676
is that BnF contacted me
about this project.
208
00:11:34,676 --> 00:11:36,744
So it was very cool to have feedback,
209
00:11:36,744 --> 00:11:40,178
and that contact was established.
210
00:11:44,472 --> 00:11:48,029
So, another topic, the link--
211
00:11:48,029 --> 00:11:51,604
So we want Wikidata to be
the linking hub for video games.
212
00:11:52,804 --> 00:11:54,321
As you can see here,
213
00:11:54,321 --> 00:11:57,510
a video game is, as Jean-Fred said,
214
00:11:57,510 --> 00:12:00,037
a video game is about a lot of things.
215
00:12:01,417 --> 00:12:05,737
We have Reviews and Scores, Speedruns,
216
00:12:05,737 --> 00:12:08,284
News, Library ID,
217
00:12:09,584 --> 00:12:11,255
Soundtrack, etc.
218
00:12:11,919 --> 00:12:15,840
We don't want all this data
to be in Wikidata,
219
00:12:15,840 --> 00:12:18,444
we want this data
to be linked to Wikidata.
220
00:12:18,444 --> 00:12:20,681
So we want Wikidata to be,
221
00:12:22,041 --> 00:12:24,499
like [Lydia] said yesterday, a place--
222
00:12:25,342 --> 00:12:29,161
We want to see Wikidata as a place you go,
223
00:12:29,161 --> 00:12:33,235
and then you go to another place.
224
00:12:33,590 --> 00:12:35,542
So I think that's it.
225
00:12:38,410 --> 00:12:41,230
And as you can see by the links,
226
00:12:42,550 --> 00:12:48,584
video games have really a lot
of aspects to research,
227
00:12:49,194 --> 00:12:53,318
and video games are really
complex cultural artifacts.
228
00:12:53,318 --> 00:12:55,667
There are [inaudible],
there are [ed ones],
229
00:12:55,667 --> 00:12:59,511
remasters, re-releases, mods, updates,
230
00:12:59,511 --> 00:13:02,482
downloadable content,
and so on and so forth.
231
00:13:02,482 --> 00:13:05,721
Plenty of remakes or remastered editions
232
00:13:05,721 --> 00:13:09,076
are separate items
at this stage in Wikidata,
233
00:13:09,076 --> 00:13:11,009
but not necessarily.
234
00:13:11,009 --> 00:13:14,798
Additionally, remakes are not often linked
to the original work
235
00:13:14,798 --> 00:13:17,486
using the property "based on."
236
00:13:17,486 --> 00:13:21,859
And perhaps we should create
an entity schema for the video games,
237
00:13:21,859 --> 00:13:23,997
but we are still in the process
238
00:13:23,997 --> 00:13:28,859
to get a discussion started
for the data model of video games.
239
00:13:29,909 --> 00:13:32,193
Mostly, we have one item,
240
00:13:32,193 --> 00:13:36,434
what we typically recognize as "the game,"
241
00:13:36,434 --> 00:13:38,620
when we say we played the same game,
242
00:13:38,620 --> 00:13:41,960
so it's like Mario Kart 8.
243
00:13:41,960 --> 00:13:45,364
Even if we played it
on different platforms,
244
00:13:45,364 --> 00:13:50,522
so, for example, on Switch,
on Wii U, or something else.
245
00:13:50,522 --> 00:13:55,980
So Wikidata items
for a game aggregate characteristics
246
00:13:55,980 --> 00:14:01,357
which are shared among
different versions or editions.
247
00:14:01,357 --> 00:14:02,827
This makes linking not easy
248
00:14:02,827 --> 00:14:07,201
because many databases
describe games on different levels,
249
00:14:07,201 --> 00:14:09,107
as Jean-Frédéric mentioned.
250
00:14:09,687 --> 00:14:13,913
For instance, some have
one database entry for each edition,
251
00:14:13,913 --> 00:14:16,599
and this results
in more than one identifier
252
00:14:16,599 --> 00:14:19,156
for each video game item.
253
00:14:19,156 --> 00:14:22,807
And so the use
of specific qualifiers is needed.
254
00:14:23,527 --> 00:14:28,518
We have had some discussions about
the creation of separate items
255
00:14:29,228 --> 00:14:31,071
for editions or releases,
256
00:14:31,071 --> 00:14:33,676
as this is good practice for literature,
257
00:14:33,676 --> 00:14:39,932
but the FRBR model which is used for books
seems not useful for everyone.
258
00:14:41,072 --> 00:14:45,763
There is also an ongoing discussion
with the video game research community
259
00:14:45,763 --> 00:14:48,730
about the best data model for video games.
260
00:14:49,720 --> 00:14:54,396
And speaking about video game research
and the research community,
261
00:14:54,396 --> 00:14:56,832
there is an active video game
research community
262
00:14:56,832 --> 00:15:00,720
with a growing interest
in data about games.
263
00:15:00,720 --> 00:15:05,245
Sadly, there are no national libraries
for video games
264
00:15:05,245 --> 00:15:07,816
which have a comprehensive dataset
265
00:15:07,816 --> 00:15:09,752
with authority data about video games--
266
00:15:09,752 --> 00:15:12,674
yes, the BnF with 4,000 video games,
267
00:15:12,674 --> 00:15:16,344
but there's still more outside.
268
00:15:17,114 --> 00:15:19,381
That means researchers rely on data
269
00:15:19,591 --> 00:15:23,626
from video game fan databases,
270
00:15:23,626 --> 00:15:25,678
but as we know, there are so many,
271
00:15:25,678 --> 00:15:28,874
and they're so different [inaudible].
272
00:15:29,254 --> 00:15:30,753
And what makes it even harder,
273
00:15:30,753 --> 00:15:33,394
the data is not open.
274
00:15:33,394 --> 00:15:36,879
So could Wikidata be a source
for video game research?
275
00:15:36,879 --> 00:15:38,061
Yes.
276
00:15:38,772 --> 00:15:40,485
I work for the research project diggr,
277
00:15:40,485 --> 00:15:44,275
and we have decided to work with Wikidata
for our video game research,
278
00:15:44,275 --> 00:15:46,846
and we not only use the data
which is already there,
279
00:15:46,846 --> 00:15:50,669
we create data about video games
and companies by hand
280
00:15:50,669 --> 00:15:54,565
or automatically, in Wikidata.
281
00:15:54,565 --> 00:15:59,144
Additionally, we have created
about 20,000 links to MobyGames,
282
00:15:59,144 --> 00:16:02,648
GameFAQs and the Japanese
Media Arts Database.
283
00:16:03,698 --> 00:16:10,210
And we also initiated an alignment
with the OLAC video game genre vocabulary.
284
00:16:11,270 --> 00:16:13,815
So video game
research colleagues in Japan
285
00:16:13,815 --> 00:16:17,670
are also experimenting with Wikidata
286
00:16:17,670 --> 00:16:20,729
to use it as a work authority
for video games.
287
00:16:21,569 --> 00:16:24,982
So, our research calls for
a lot of spatial data
288
00:16:24,982 --> 00:16:26,806
about video game companies
289
00:16:26,806 --> 00:16:31,310
and where video games
have been released all over the world.
290
00:16:31,310 --> 00:16:37,352
So we use data from video game databases,
like MobyGames, in Wikidata,
291
00:16:37,352 --> 00:16:41,026
to create some analyses like this.
292
00:16:41,026 --> 00:16:43,250
We call it Lemongrab, the tool,
293
00:16:43,250 --> 00:16:46,034
and the researcher can select
one or more platforms
294
00:16:46,034 --> 00:16:48,921
and one or more release countries
295
00:16:48,921 --> 00:16:52,610
and they will get an overview
about which companies are big players.
296
00:16:52,610 --> 00:16:56,684
In this case, the number of published
or developed video games
297
00:16:57,284 --> 00:16:58,855
for this combination.
298
00:16:59,305 --> 00:17:01,359
Additionally, they can see which country
299
00:17:01,359 --> 00:17:05,313
is strongly represented
by these companies.
300
00:17:06,153 --> 00:17:08,419
Or we use Wikidata Query Service directly
301
00:17:08,419 --> 00:17:13,589
to create maps of companies
within the video game industry.
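A map of companies like the one mentioned here could come from a Query Service query of roughly this shape. It is a hedged sketch, not the diggr team's actual query: the function name is hypothetical, and it assumes companies are typed as instances of Q210167 (believed to be "video game developer"; worth verifying before use), located through P159 (headquarters location) and P625 (coordinate location). The `#defaultView:Map` comment asks the Query Service interface to render the results as a map. The code only builds the query text.

```python
def build_company_map_query(company_class: str = "Q210167", limit: int = 500) -> str:
    """Build SPARQL listing companies of one class with headquarters coordinates.

    Hypothetical helper; the default class ID is an assumption to verify.
    """
    if not (company_class.startswith("Q") and company_class[1:].isdigit()):
        raise ValueError(f"not a Wikidata item ID: {company_class!r}")
    if limit <= 0:
        raise ValueError("limit must be positive")
    return f"""#defaultView:Map
SELECT ?company ?companyLabel ?coords WHERE {{
  ?company wdt:P31 wd:{company_class} ;   # assumed company class
           wdt:P159 ?hq .                 # headquarters location
  ?hq wdt:P625 ?coords .                  # coordinates of that place
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
LIMIT {limit}"""


if __name__ == "__main__":
    print(build_company_map_query())
```

Companies without a headquarters statement, or whose headquarters item lacks coordinates, would simply not appear, so such a map also shows where data is still missing.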
302
00:17:14,399 --> 00:17:20,990
So, at this stage, I think
there are 5,000 video game companies
303
00:17:20,990 --> 00:17:23,327
already in Wikidata
304
00:17:23,327 --> 00:17:28,686
of which we have created
half, I think. (chuckles)
305
00:17:29,204 --> 00:17:34,362
So, in conclusion, after two years
of working with Wikidata for our research,
306
00:17:34,362 --> 00:17:35,481
we are very pleased,
307
00:17:35,481 --> 00:17:37,127
especially with the cooperation
308
00:17:37,127 --> 00:17:40,189
with the volunteers
of the video game task force.
309
00:17:40,189 --> 00:17:41,612
Thank you for that.
310
00:17:41,612 --> 00:17:47,116
And we think Wikidata can be
the one-stop shop for video game research
311
00:17:47,116 --> 00:17:52,541
because it already aggregates
so many links to very specialized sites
312
00:17:52,541 --> 00:17:57,417
and it is not realistic
that we put all the data into Wikidata.
313
00:18:00,422 --> 00:18:01,522
Thank you.
314
00:18:01,522 --> 00:18:04,361
At the same time, we want
to be useful for the researchers.
315
00:18:04,361 --> 00:18:07,682
We also want to stay
or to be or to become,
316
00:18:07,682 --> 00:18:10,470
however you want it,
useful to the Wikipedias.
317
00:18:10,470 --> 00:18:12,271
Right now, some Wikipedias
are using the data
318
00:18:12,271 --> 00:18:15,649
from Wikidata for their infoboxes.
319
00:18:15,649 --> 00:18:18,719
So if tomorrow we just revamp
the entire data model
320
00:18:18,719 --> 00:18:20,554
in a way they can't use it anymore,
321
00:18:20,554 --> 00:18:22,357
it doesn't sound like a great idea.
322
00:18:22,357 --> 00:18:24,175
So we'll try not to do that.
323
00:18:26,163 --> 00:18:30,520
I think we want to be
enhancing other databases,
324
00:18:30,520 --> 00:18:32,590
and that's something
that's already started.
325
00:18:32,590 --> 00:18:36,891
So if you go to Visual Novel Database
right now at vndb.org,
326
00:18:36,891 --> 00:18:39,783
following a research
workshop that we did
327
00:18:39,783 --> 00:18:41,145
with the nice diggr folks,
328
00:18:41,145 --> 00:18:42,499
we could meet with the database,
329
00:18:42,499 --> 00:18:45,545
and they were interested enough
with all the linkage that we made
330
00:18:45,545 --> 00:18:51,204
that they could harvest more links
about the entity that they talk about.
331
00:18:51,204 --> 00:18:57,916
Like, "Well, okay, thanks to Wikidata,
we also retrieved reviews or speedruns
332
00:18:57,916 --> 00:18:59,768
or a store where you can buy these games."
333
00:18:59,768 --> 00:19:02,523
So we're already being useful.
334
00:19:02,523 --> 00:19:04,059
So that was a fine example.
335
00:19:04,059 --> 00:19:07,864
But also these German researchers
336
00:19:07,864 --> 00:19:11,971
just started the Internationale
Computerspielesammlung,
337
00:19:11,971 --> 00:19:13,958
(chuckles)
338
00:19:13,958 --> 00:19:17,532
which is online, which has all the data
about the German video games,
339
00:19:17,532 --> 00:19:19,802
what they have in their collections,
340
00:19:19,802 --> 00:19:23,923
and they've been using Wikidata
to enrich their data with labels,
341
00:19:23,923 --> 00:19:25,969
so they have alternate titles.
342
00:19:26,779 --> 00:19:28,297
So that was also pretty cool.
343
00:19:30,067 --> 00:19:33,391
I think Wikidata can be the backend
for powering applications.
344
00:19:33,391 --> 00:19:36,194
So, an example
that already exists is vglist.co,
345
00:19:36,194 --> 00:19:38,378
and in some ways a little bit similar
346
00:19:38,378 --> 00:19:40,751
to what inventaire.io does for books,
347
00:19:40,751 --> 00:19:43,882
vglist.co does it for video games.
348
00:19:44,942 --> 00:19:47,413
It's an app where you can record
the games you've played,
349
00:19:47,413 --> 00:19:49,515
how long you spend, and your favorites.
350
00:19:49,515 --> 00:19:52,670
And I just really like the fact
that it's built on top of Wikidata.
351
00:19:52,670 --> 00:19:54,234
It's pretty cool.
352
00:19:54,724 --> 00:19:59,482
So maybe one day we can just connect
all these things together
353
00:19:59,482 --> 00:20:02,820
and harvest SPARQL to query data,
354
00:20:02,820 --> 00:20:05,074
and it really doesn't matter where it is,
355
00:20:05,074 --> 00:20:07,780
and say, "Yeah, data is not a database,"
356
00:20:07,780 --> 00:20:09,215
and that will be fine.
357
00:20:09,765 --> 00:20:12,604
Thank you very much,
and we'll take questions.
358
00:20:12,604 --> 00:20:14,812
(moderator) We just have
five minutes for questions.
359
00:20:14,812 --> 00:20:16,478
(applause)
360
00:20:22,870 --> 00:20:25,674
(man) Hello, I really love your project,
361
00:20:25,674 --> 00:20:28,713
and if I want to contribute,
where should I go?
362
00:20:29,080 --> 00:20:31,350
So there was a short URL in there,
363
00:20:31,350 --> 00:20:32,437
and as Envel mentioned,
364
00:20:32,437 --> 00:20:35,874
there are tabs at the top with the links
to the SPARQL queries and so on.
365
00:20:35,874 --> 00:20:37,885
And there is a Tasks tab,
366
00:20:37,885 --> 00:20:40,565
which is like a couple of suggestions
on where to get started.
367
00:20:40,565 --> 00:20:43,630
But it's not mandatory, you can work
on whatever you want, obviously.
368
00:20:43,630 --> 00:20:45,005
But, yeah, that's a nice place.
369
00:20:45,005 --> 00:20:48,442
And if you have a project,
you can also bring it to the Talk page.
370
00:20:48,442 --> 00:20:49,803
It's not a very lively Talk page,
371
00:20:49,803 --> 00:20:53,437
like a lot of WikiProject
Talk pages, in many ways,
372
00:20:53,437 --> 00:20:58,071
but I will read and answer,
so that's a start.
373
00:20:58,071 --> 00:20:59,593
Do you already have something in mind?
374
00:20:59,593 --> 00:21:01,723
We can talk after this
if you have something in mind.
375
00:21:02,518 --> 00:21:04,070
- Allons-y.
- (woman) Hi there.
376
00:21:04,070 --> 00:21:07,983
So I work with a group
from University of Copenhagen
377
00:21:07,983 --> 00:21:10,247
and University of Washington
378
00:21:10,247 --> 00:21:14,390
who are working on an initiative
called Atari Women,
379
00:21:15,131 --> 00:21:17,009
recognizing all the women
380
00:21:17,009 --> 00:21:20,918
who've been involved through the years
with the Atari game system.
381
00:21:20,918 --> 00:21:22,967
And so I'm wondering if--
382
00:21:22,967 --> 00:21:26,218
I believe that your WikiProject
383
00:21:26,218 --> 00:21:30,308
covers the developers,
the designers and such,
384
00:21:30,998 --> 00:21:37,235
but obviously, it crosses
into the biography part of our world.
385
00:21:37,725 --> 00:21:40,245
And so how does that work?
386
00:21:42,175 --> 00:21:45,604
Is there someone
who's more specialized in that area
387
00:21:45,604 --> 00:21:52,128
who these folks at these two universities
could connect with, or...
388
00:21:53,046 --> 00:21:54,399
Thoughts?
389
00:21:56,409 --> 00:21:58,808
I don't think there will be
somebody in particular.
390
00:21:59,998 --> 00:22:02,976
My impression of the [inaudible] project
is that they are fairly eclectic.
391
00:22:02,976 --> 00:22:05,754
Sometimes people specialize
on very specific niche topics.
392
00:22:05,754 --> 00:22:07,039
In that case, I don't think so.
393
00:22:07,039 --> 00:22:10,166
So I'll be happy to take the call.
394
00:22:10,166 --> 00:22:11,291
So, to answer your question,
395
00:22:11,291 --> 00:22:14,450
yes, that will definitely be
in the scope of our project.
396
00:22:16,237 --> 00:22:19,476
And in that period, particularly,
I don't think we want to turn back
397
00:22:19,476 --> 00:22:22,466
because these days video games
are made by like 1,000 people
398
00:22:22,466 --> 00:22:24,784
and do we want to create an item
about every single person,
399
00:22:24,784 --> 00:22:27,489
like the credit rolls of a movie, right?
400
00:22:27,489 --> 00:22:30,163
So in modern times, I don't know
if we want to be that database,
401
00:22:30,163 --> 00:22:32,790
the ultimate database of game credits.
402
00:22:33,750 --> 00:22:36,643
But for the Atari early days--
oh, definitely,
403
00:22:36,643 --> 00:22:38,523
I would actually love to see the dataset
404
00:22:38,523 --> 00:22:42,480
because it's a lot of dudes
in common knowledge of...
405
00:22:42,480 --> 00:22:44,487
- (woman) I'll connect you to that.
- Yes, please.
406
00:22:44,487 --> 00:22:46,031
(laughter)
407
00:22:48,406 --> 00:22:50,210
(moderator) Any other questions?
408
00:22:53,490 --> 00:22:55,192
Sir, just in front of you.
409
00:22:56,230 --> 00:22:58,372
(man 2) Do you collaborate
with the Internet Archive?
410
00:22:58,372 --> 00:23:02,906
Because there's not a month going by
that Jason Scott doesn't post.
411
00:23:02,906 --> 00:23:07,111
He's rescued 170,000
DOS games or stuff like that.
412
00:23:11,100 --> 00:23:15,701
There are Internet Archive identifiers
on some game items,
413
00:23:15,701 --> 00:23:17,939
which is a bit weird
because usually on the Internet Archive
414
00:23:17,939 --> 00:23:19,926
there's going to be
a particular release of the game,
415
00:23:19,926 --> 00:23:21,666
again on the difference...
416
00:23:21,666 --> 00:23:24,064
Last time I checked there were four
or five Prince of Persia
417
00:23:24,064 --> 00:23:25,067
on the Internet Archive
418
00:23:25,067 --> 00:23:28,130
because they have the Apple II version
and the DOS version and so on.
419
00:23:28,130 --> 00:23:29,708
So not explicitly.
420
00:23:29,708 --> 00:23:36,519
In general, I think we probably want
to make some connections more generally
421
00:23:36,519 --> 00:23:39,239
with the video game preservation scene.
422
00:23:39,239 --> 00:23:45,032
There are quite lively organizations
that work hard on video game preservation.
423
00:23:45,032 --> 00:23:49,690
And I think Wikidata
can be a useful resource for them
424
00:23:49,690 --> 00:23:51,609
because they don't have
to manage the metadata,
425
00:23:51,609 --> 00:23:54,573
and they can focus
on managing other things.
426
00:23:54,573 --> 00:23:56,084
Do you have something to add to that?
427
00:23:56,084 --> 00:23:57,136
No.
428
00:23:59,422 --> 00:24:00,792
[inaudible], perhaps?
429
00:24:01,042 --> 00:24:02,862
(man 3) I had the same question.
430
00:24:02,862 --> 00:24:04,460
(laughter)
431
00:24:04,460 --> 00:24:05,599
Perfect.
432
00:24:05,599 --> 00:24:09,194
(moderator) There was
one more question back here.
433
00:24:11,843 --> 00:24:14,587
No, probably I hallucinated. Sorry.
434
00:24:16,103 --> 00:24:17,880
For one minute, we can show a query.
435
00:24:18,470 --> 00:24:19,506
Or not.
436
00:24:19,506 --> 00:24:21,275
(moderator) You have 30 seconds.
437
00:24:22,709 --> 00:24:24,599
Will the Query Service [inaudible]?
438
00:24:30,239 --> 00:24:32,202
We have links in the PDF, [inaudible]?
439
00:24:43,007 --> 00:24:45,400
(man 4) If there's still time,
I have a question.
440
00:24:45,400 --> 00:24:46,532
Yes, please.
441
00:24:46,532 --> 00:24:48,201
During your presentation, did you notice
442
00:24:48,201 --> 00:24:53,711
that some of the identifiers
have more than 100% [inaudible]?
443
00:24:53,711 --> 00:24:56,380
Yeah, it's because the examples--
444
00:24:56,380 --> 00:24:59,552
so that reason, one of the users,
for example, itself,
445
00:24:59,552 --> 00:25:01,239
because they use [inaudible] as examples.
446
00:25:01,239 --> 00:25:03,426
And also sometimes
because there are broad matches.
447
00:25:03,426 --> 00:25:05,991
So if it says something that's a bit--
448
00:25:06,481 --> 00:25:09,381
So, yeah, that's one
of my favorite-- if I can scroll it--
449
00:25:09,381 --> 00:25:13,137
it's the characters of the Mario franchise
linked to their games.
450
00:25:13,137 --> 00:25:15,706
(chuckles)
451
00:25:15,706 --> 00:25:19,285
So you can find like Wario
and Princess Peach, and so on.
452
00:25:19,285 --> 00:25:20,371
And my favorite is--
453
00:25:20,371 --> 00:25:23,896
if you look somewhere, yes,
because there is Mario somewhere here,
454
00:25:23,896 --> 00:25:25,712
and there is Dr. Mario.
455
00:25:25,712 --> 00:25:29,105
And if you look at the item,
it's said to be the same as--
456
00:25:29,835 --> 00:25:33,842
because Mario plumber and Mario physician
might be two different people,
457
00:25:33,842 --> 00:25:35,318
we don't really know.
458
00:25:35,318 --> 00:25:36,840
(laughter)
459
00:25:39,568 --> 00:25:42,845
(moderator) Thank you very much
for this presentation.
460
00:25:43,743 --> 00:25:45,755
(applause)