WEBVTT
00:00:06.239 --> 00:00:08.628
Willkommen, Bienvenue-- Welcome.
00:00:08.628 --> 00:00:10.782
I always wanted to say that on a stage.
00:00:10.817 --> 00:00:12.804
(laughter)
00:00:12.856 --> 00:00:14.928
This is going to be inspirational,
00:00:14.928 --> 00:00:19.057
because this is the official
Wikibase inspiration panel
00:00:19.057 --> 00:00:22.543
of WikidataCon 2019.
00:00:23.839 --> 00:00:27.519
The point of this panel
is to be inspired by all the things
00:00:27.519 --> 00:00:33.714
that people, in various countries,
in various fields, do with Wikibase,
00:00:33.766 --> 00:00:36.034
the software behind Wikidata.
00:00:36.084 --> 00:00:39.375
I was really surprised to learn today
that someone came to me and said,
00:00:39.375 --> 00:00:42.451
"I learned about Wikibase
the first time today."
00:00:42.817 --> 00:00:47.073
So, it is the software that runs Wikidata.
00:00:47.073 --> 00:00:50.963
And if you want
to order things in the world
00:00:50.963 --> 00:00:54.121
the way Wikidata orders things
in the world,
00:00:55.101 --> 00:00:58.627
but you don't agree with the items
that we have in there,
00:00:58.627 --> 00:01:02.372
because you might need
a finer level of granularity,
00:01:02.372 --> 00:01:05.828
or maybe you don't want to start
with Q1, which is the universe,
00:01:05.828 --> 00:01:10.197
because in your little world,
Q1 could be a book, if you are a library,
00:01:10.197 --> 00:01:14.362
or it could be some kind of animal,
if you work in biology,
00:01:14.362 --> 00:01:19.073
or it could be a historic person,
if you do digital humanities,
00:01:19.073 --> 00:01:21.771
but you still want
the same system of ordering,
00:01:21.771 --> 00:01:24.565
then Wikibase is the thing for you.
00:01:25.395 --> 00:01:30.070
Over the last one or two years,
we have made contact
00:01:30.080 --> 00:01:34.163
with extraordinary people,
who are pioneers, who are trailblazing,
00:01:34.163 --> 00:01:36.641
who are evaluating Wikibase,
00:01:36.641 --> 00:01:39.920
and who are doing
extremely great stuff with that.
00:01:41.216 --> 00:01:43.886
This panel is going to be very rushed.
00:01:44.372 --> 00:01:48.310
Every one of the participants
of this panel would have deserved
00:01:48.310 --> 00:01:51.314
a one-hour slot to present their thing.
00:01:51.406 --> 00:01:54.007
But our program is packed.
00:01:54.414 --> 00:02:00.108
So, yeah, keep your seat belt fastened
for a fast-paced ride
00:02:00.108 --> 00:02:03.829
through the inspirational
world of Wikibases.
00:02:04.155 --> 00:02:09.870
And the first one is a project
from two organizations,
00:02:09.870 --> 00:02:12.223
which is a little sensation in itself.
00:02:12.833 --> 00:02:16.495
The Bibliothèque nationale de France,
the French National Library,
00:02:16.495 --> 00:02:22.343
and Abes, which is an authority
for higher education.
00:02:22.870 --> 00:02:26.440
But I think you will talk about that
more in your presentation,
00:02:26.440 --> 00:02:31.406
and yeah, we'd like to welcome
Anila Angjeli and Benjamin Bober
00:02:31.406 --> 00:02:34.741
on stage for the first
ten minutes of inspiration.
00:02:35.509 --> 00:02:40.768
(applause)
00:02:46.204 --> 00:02:47.339
Hi, everybody.
00:02:47.339 --> 00:02:49.372
So, yeah, my name is Benjamin Bober.
00:02:49.372 --> 00:02:51.734
So, I work for Abes,
00:02:51.734 --> 00:02:54.406
which stands for Higher Education Agency,
00:02:54.406 --> 00:02:56.437
Bibliographic Higher Education Agency.
00:02:56.437 --> 00:03:00.642
Basically, we work with all
the university libraries in France,
00:03:00.642 --> 00:03:03.070
and manage the union catalog.
00:03:03.120 --> 00:03:06.362
And also their authority files.
00:03:06.920 --> 00:03:10.353
And I'm here with Anila Angjeli,
from the BnF,
00:03:10.353 --> 00:03:11.971
French National Library.
00:03:11.971 --> 00:03:16.027
And we're going to talk to you
about our joint project,
00:03:17.077 --> 00:03:21.239
which is about creating
a new production tool
00:03:21.239 --> 00:03:24.088
for authority data--
00:03:24.938 --> 00:03:28.785
persons, corporate bodies,
concepts, and so on.
00:03:28.785 --> 00:03:33.496
And we spent the last months
00:03:33.496 --> 00:03:37.064
asking Wikibase to do this stuff.
00:03:37.551 --> 00:03:43.931
So, I will give you some context
really quickly,
00:03:45.833 --> 00:03:49.030
because it's important for us,
as libraries--
00:03:49.079 --> 00:03:54.475
There's been this technological
shift recently
00:03:56.016 --> 00:03:58.051
with the linked open data movement,
00:03:58.051 --> 00:04:01.951
and we wanted, as a bibliographical agency,
00:04:01.951 --> 00:04:05.551
to follow this new trend.
00:04:06.111 --> 00:04:08.474
And, well, it's been years since we've--
00:04:10.131 --> 00:04:12.611
been experimenting with linked open data,
00:04:12.611 --> 00:04:16.215
with RDF, SPARQL and so on.
00:04:16.215 --> 00:04:21.765
But we think that now
is a good time to move forward.
00:04:23.311 --> 00:04:28.313
It's also a good time
because there's been a-- not a shift,
00:04:29.534 --> 00:04:31.009
there's a fundamental change
00:04:31.009 --> 00:04:36.780
in the way we consider
bibliographical data.
00:04:37.712 --> 00:04:41.255
We used to, and we still have data
00:04:41.747 --> 00:04:44.803
stored in records, we call them MARC records
00:04:44.803 --> 00:04:47.801
in the library landscape.
00:04:48.444 --> 00:04:51.239
We used a specific format called MARC.
00:04:53.108 --> 00:04:56.956
But recently, there has been some way
00:04:59.431 --> 00:05:01.697
to think about it
from another point of view.
00:05:01.697 --> 00:05:06.621
And to go from a record-based world,
to an entity-based world
00:05:06.621 --> 00:05:11.572
where we try to interconnect
people, works,
00:05:14.129 --> 00:05:16.724
and other entities.
00:05:17.777 --> 00:05:23.844
So, in this context, we decided
to launch this joint initiative.
00:05:25.639 --> 00:05:28.516
But our goal is far beyond libraries.
00:05:28.516 --> 00:05:32.461
We would like to have with us
00:05:35.519 --> 00:05:38.060
other French GLAMs, for instance,
00:05:38.060 --> 00:05:42.386
because we think our project
can help them also.
00:05:44.134 --> 00:05:49.368
So basically, our project is called
Fichier National d'Entités,
00:05:49.411 --> 00:05:51.232
so National Entity Files.
00:05:51.917 --> 00:05:55.961
And it will be a shared platform
to collaboratively create
00:05:55.961 --> 00:05:58.652
and maintain reference
data about entities.
00:05:58.652 --> 00:06:01.544
Like I said, persons,
corporate bodies, places, concepts,
00:06:01.544 --> 00:06:03.206
and creative works.
00:06:03.339 --> 00:06:06.221
So, we embrace a lot of things.
00:06:06.909 --> 00:06:09.632
And it's a challenge
because it's the first time
00:06:09.632 --> 00:06:15.826
BnF and Abes collaborate
at such a level.
00:06:19.031 --> 00:06:22.488
Giving you a quick view
about where we are--
00:06:22.618 --> 00:06:25.129
where we've come from
and where we are now.
00:06:25.129 --> 00:06:27.834
We have been working
on this project since 2017.
00:06:29.178 --> 00:06:31.967
We've benchmarked
other similar initiatives,
00:06:31.967 --> 00:06:33.923
and came to the conclusion last year
00:06:33.923 --> 00:06:40.687
that there was a strong interest
in Wikibase as the FNE's backbone.
00:06:41.632 --> 00:06:44.886
We were considering it a good solution
00:06:44.886 --> 00:06:49.257
to build upon, but we still
had doubts at this time,
00:06:50.016 --> 00:06:54.033
because we have specific needs to fulfill.
00:06:54.683 --> 00:06:58.871
So we decided to launch,
to spend this year
00:06:58.871 --> 00:07:01.910
to build a proof of concept with real data
00:07:01.910 --> 00:07:06.039
both from BnF catalog,
authority catalog, and our catalogs.
00:07:06.718 --> 00:07:10.990
And well, try to merge this data
into a Wikibase,
00:07:10.990 --> 00:07:13.471
and to try to see how they behave
00:07:13.471 --> 00:07:17.964
and how the tool can fulfill our needs.
00:07:18.103 --> 00:07:22.370
And we were helped
in this proof of concept
00:07:22.370 --> 00:07:27.282
by Maxime and Vincent
from Inventaire.io,
00:07:28.497 --> 00:07:33.145
who helped us have a better idea
00:07:33.145 --> 00:07:37.133
about what Wikibase can bring us.
00:07:37.188 --> 00:07:40.272
And Anila will talk
about the first findings.
00:07:42.255 --> 00:07:46.913
So, while this decision to go
00:07:46.995 --> 00:07:49.713
with experiments with the Wikibase
00:07:49.713 --> 00:07:52.793
as the technical infrastructure backbone
00:07:52.793 --> 00:07:57.360
or the basic layer for our FNE
00:07:57.360 --> 00:08:04.100
was because it's not trivial
to move from one system to another,
00:08:04.657 --> 00:08:10.170
and because the initiative
of using the Wikibase
00:08:10.940 --> 00:08:15.976
as the technical infrastructure
for our data--
00:08:17.760 --> 00:08:19.262
it was both--
00:08:20.396 --> 00:08:25.771
means that we move from our classical
00:08:26.545 --> 00:08:28.083
system information
00:08:28.083 --> 00:08:33.131
or library information system
to quite another thing.
00:08:33.643 --> 00:08:36.469
And so, we needed to experiment first,
00:08:36.469 --> 00:08:41.751
and just to see whether a set
of functionalities that are--
00:08:42.439 --> 00:08:48.189
that we usually need to perform
and fulfill in our environment--
00:08:48.239 --> 00:08:49.739
professional environment.
00:08:49.739 --> 00:08:52.946
I'm talking here about creating
and maintaining,
00:08:52.946 --> 00:08:56.562
and not publishing,
which is a big difference.
00:08:56.562 --> 00:08:59.685
You were at the session,
the previous session,
00:08:59.685 --> 00:09:04.393
with just Wikidata Commons,
00:09:04.393 --> 00:09:06.765
contribution strategies for GLAM--
00:09:06.765 --> 00:09:12.741
it was about publication
and ways about creation in itself.
00:09:12.787 --> 00:09:16.146
So, we need to go step by step,
00:09:16.146 --> 00:09:19.955
and that's why we conducted
this experiment, this proof of concept.
00:09:20.970 --> 00:09:26.726
And, good surprise, no major obstacle
to ingest library data
00:09:26.726 --> 00:09:30.754
according to a specific ontology,
which is, while we--
00:09:31.159 --> 00:09:37.781
I briefly mentioned that we put their data
in two different flavors of MARC,
00:09:38.552 --> 00:09:42.689
then we defined
some [inaudible] properties
00:09:42.689 --> 00:09:47.233
in order to be able to experiment
with merging the data,
00:09:47.233 --> 00:09:52.511
and there was no major obstacle
from the technical point of view.
00:09:53.406 --> 00:09:56.569
Of course, we came up with a confirmation
00:09:56.569 --> 00:10:00.425
that Wikibase does offer built-in features
00:10:00.425 --> 00:10:05.140
that could be used as the basis
for the technical infrastructure for FNE.
00:10:06.319 --> 00:10:09.000
But again, the decision is not yet made,
00:10:09.000 --> 00:10:11.637
because the experiment is still--
00:10:12.487 --> 00:10:16.243
let's say, the developments
have been completed.
00:10:16.650 --> 00:10:22.313
Now, we're in the phase of writing
the final conclusions,
00:10:22.313 --> 00:10:28.774
and the decision is not yet made
from the strategic point of view,
00:10:29.391 --> 00:10:34.468
but these are really the first findings
we can talk about.
00:10:34.512 --> 00:10:37.954
And Wikibase-- it appears to us
00:10:37.954 --> 00:10:43.033
that a Wikibase might be
a good operational solution
00:10:43.033 --> 00:10:48.571
for managing this initiative--
that is jointly, collaboratively,
00:10:48.571 --> 00:10:51.980
create these entities, these things,
00:10:53.281 --> 00:10:56.828
to remind you of the opposition,
which is things and strings.
00:10:57.834 --> 00:11:01.113
However, we noticed there are gaps.
00:11:01.118 --> 00:11:05.418
Within the specific needs
of our specific institutions,
00:11:06.146 --> 00:11:12.361
there are defined communities
with their own culture, practices and,
00:11:14.711 --> 00:11:20.462
well, it is certain processes
that are inherent to the libraries,
00:11:21.111 --> 00:11:25.650
and the solution offered by Wikibase,
for example, the search.
00:11:26.542 --> 00:11:28.929
I mean, from the professional standpoint,
00:11:28.929 --> 00:11:31.648
not only from this end-user standpoint,
00:11:31.648 --> 00:11:34.575
but professional, we need some indexes
00:11:34.575 --> 00:11:38.925
in order to ensure
data quality, data curation,
00:11:38.925 --> 00:11:41.197
and it is very important
for the professional,
00:11:41.197 --> 00:11:46.406
and Wikibase with its Elasticsearch
00:11:46.406 --> 00:11:48.857
and CirrusSearch doesn't offer that.
00:11:48.857 --> 00:11:51.702
But there are still areas of investigation there.
00:11:52.229 --> 00:11:54.454
The roles-- how are the roles managed?
00:11:54.454 --> 00:11:57.248
The bureaucrat, the patrolling of--
00:11:57.248 --> 00:12:00.861
it's not exactly what happens
in our world.
00:12:01.268 --> 00:12:04.712
Although there is a layer
that can be used,
00:12:04.712 --> 00:12:11.132
upon which we can build
other roles that are more in compliance
00:12:11.132 --> 00:12:14.876
with our way of managing the data.
00:12:15.649 --> 00:12:20.437
Or different constraints,
constraints related to data publication,
00:12:20.842 --> 00:12:26.005
or data-- there's an error there
we need to correct.
00:12:26.655 --> 00:12:29.096
Data policy-- okay, thank you.
00:12:29.702 --> 00:12:32.710
So, there are things that need to be--
00:12:33.360 --> 00:12:38.574
other layers, bricks,
need to be built upon Wikibase.
00:12:39.141 --> 00:12:42.873
And of course, one of the reasons,
the major reasons,
00:12:42.873 --> 00:12:45.222
the reason why we are here with you,
00:12:45.222 --> 00:12:50.450
is that we-- we are willing,
and we feel the necessity
00:12:50.450 --> 00:12:54.349
to be part of a community
sharing the same concerns.
00:12:54.358 --> 00:12:59.267
And we all know, given the program,
00:12:59.320 --> 00:13:01.554
that libraries and GLAMs
00:13:01.554 --> 00:13:05.084
are heavily represented in this event.
00:13:05.896 --> 00:13:11.772
So, I think-- we think that maybe
00:13:11.772 --> 00:13:14.206
in a couple of weeks,
00:13:14.206 --> 00:13:19.082
or next year, we will be able
to communicate more openly
00:13:19.082 --> 00:13:23.717
on our decision to go forward
with this solution.
00:13:24.404 --> 00:13:26.163
Thank you.
00:13:26.163 --> 00:13:27.748
Thank you so much.
00:13:27.748 --> 00:13:31.155
(applause)
00:13:31.155 --> 00:13:33.547
So, we will have short
presentations first,
00:13:33.547 --> 00:13:35.092
and we will all return on stage
00:13:35.092 --> 00:13:37.646
for questions, if we have
the time for that.
00:13:38.296 --> 00:13:41.251
But yeah, we heard something from France.
00:13:42.757 --> 00:13:44.301
There's another project.
00:13:45.086 --> 00:13:47.980
It's not Fichier National d'Ent--
00:13:47.980 --> 00:13:50.031
(jokingly struggles with name)
00:13:50.031 --> 00:13:51.545
But it's Gemeinsame Normdatei,
00:13:52.937 --> 00:13:56.767
the universal authority file
00:13:56.767 --> 00:13:58.224
for the German-speaking world.
00:13:58.224 --> 00:14:03.747
And I'm so happy to have good friends
of the Wikimedia movement here.
00:14:04.559 --> 00:14:09.436
Barbara Fischer and Sarah Hartmann.
00:14:11.831 --> 00:14:15.208
Thanks a lot for the invitation
to talk about our project,
00:14:15.212 --> 00:14:18.006
which is called GND meets Wikibase.
00:14:18.694 --> 00:14:21.645
And it's a joint project
of Wikimedia Deutschland,
00:14:21.645 --> 00:14:23.468
and the GND.
00:14:23.745 --> 00:14:25.707
And we'd like to give you
a quick overview,
00:14:25.707 --> 00:14:28.781
as Jens said before,
there are just 10 minutes.
00:14:29.971 --> 00:14:33.138
Why we go for that approach
to evaluate Wikibase,
00:14:33.138 --> 00:14:37.153
if it fulfills the requirements
for managing authority data
00:14:37.153 --> 00:14:40.434
on a collaborative level, I would say.
00:14:42.258 --> 00:14:45.660
So, where do we come from,
and what's the idea of authority control?
00:14:45.660 --> 00:14:49.927
And GND, which stands for
Gemeinsame Normdatei,
00:14:50.837 --> 00:14:51.838
what's the idea of it?
00:14:51.838 --> 00:14:55.623
And yeah, where do we come from,
as I said before.
00:14:55.623 --> 00:14:59.307
It's not that different
from what Anila and Ben said,
00:15:00.007 --> 00:15:01.649
just a few seconds ago.
00:15:02.765 --> 00:15:06.003
The GND is used
for the description of resources,
00:15:06.003 --> 00:15:09.726
such as publications,
and objects, for example,
00:15:09.726 --> 00:15:14.168
and in order to enable
accurate data retrieval,
00:15:14.168 --> 00:15:19.080
I would say, the GND provides
unambiguous and distinct entities
00:15:19.080 --> 00:15:21.390
for that retrieval.
00:15:21.837 --> 00:15:25.328
And so, there are persistent identifiers,
as well, as you all know,
00:15:25.328 --> 00:15:28.654
for identification and reference
for these entities.
00:15:30.968 --> 00:15:33.972
The authority file is used
by mainly libraries,
00:15:35.075 --> 00:15:37.955
we would say,
in the German-speaking countries,
00:15:37.955 --> 00:15:41.477
but a few other institutions
from the cultural heritage domain,
00:15:41.477 --> 00:15:45.497
are using the authority file already.
00:15:46.228 --> 00:15:52.567
And all in all there are
around about 60 million records,
00:15:52.774 --> 00:15:55.242
and in Wikibase, we would say "items,"
00:15:55.242 --> 00:15:58.037
which refer to persons, names of persons,
00:15:58.037 --> 00:16:01.475
corporate bodies, for example,
geographic names, and works.
00:16:01.768 --> 00:16:06.522
And the GND is run cooperatively
by so-called GND agencies,
00:16:06.583 --> 00:16:11.212
and at the moment, there are
around about 1,000 institutions
00:16:11.212 --> 00:16:15.443
who are active users of the GND--
that means they establish new records
00:16:15.443 --> 00:16:19.999
and edit records or items
on a regular basis.
00:16:20.745 --> 00:16:24.204
And the most important thing, I would say,
00:16:24.204 --> 00:16:27.848
is that the GND data
is provided free of charge
00:16:27.848 --> 00:16:29.520
under CC0 conditions,
00:16:29.520 --> 00:16:33.313
and that all the APIs
and documentation are open as well.
00:16:34.532 --> 00:16:37.077
Yeah, talking about open--
00:16:38.129 --> 00:16:41.613
that's the point,
and the crucial one here--
00:16:41.613 --> 00:16:45.235
at the moment, our challenge
is to open up the GND
00:16:45.235 --> 00:16:51.400
for other GLAM institutions
and institutions from the science domain.
00:16:52.212 --> 00:16:55.972
At the moment, it's really focused
on the library sector.
00:16:56.715 --> 00:17:00.243
That means that the handy tool
of librarians has to evolve
00:17:01.223 --> 00:17:06.241
into a tool that is used
and accepted across domains.
00:17:06.300 --> 00:17:10.144
And that means a lot of work
on organizational stuff,
00:17:10.144 --> 00:17:15.011
community building, discussions
about the current data model,
00:17:15.011 --> 00:17:17.930
and infrastructural and technical issues.
00:17:17.945 --> 00:17:19.527
And, yeah.
00:17:20.581 --> 00:17:22.966
Talking about the infrastructural issues,
00:17:23.806 --> 00:17:29.165
we came up with the idea
to become partners in crime
00:17:29.596 --> 00:17:34.704
with Wikibase, I would say,
so we have slightly the same aims,
00:17:34.704 --> 00:17:40.092
namely make cultural data
more accessible and interoperable.
00:17:40.661 --> 00:17:44.964
And therefore we now
evaluate the software,
00:17:44.964 --> 00:17:49.581
which was originally conceived
for a sole application, Wikidata,
00:17:49.581 --> 00:17:53.311
if it's sufficient for managing
authority data.
00:17:58.084 --> 00:18:00.917
Right-- hi from my side as well.
00:18:00.917 --> 00:18:05.701
We're focusing in our evaluation
[inaudible] we do commonly
00:18:05.701 --> 00:18:07.450
with Wikimedia Deutschland.
00:18:08.220 --> 00:18:11.269
First of all, if Wikibase meets
the requirements
00:18:11.269 --> 00:18:15.224
of GLAM institutions, galleries,
libraries, archives, and museums,
00:18:15.224 --> 00:18:18.467
to drive collaboratively
an authority file,
00:18:18.467 --> 00:18:20.698
which is like our basic question.
00:18:21.748 --> 00:18:25.981
We also would like to see
Wikibase increase usability
00:18:25.981 --> 00:18:29.312
as the software system
we're using right now
00:18:29.312 --> 00:18:32.885
is, let's say, quite a complex software
00:18:32.885 --> 00:18:37.361
that is not as handy
as you might like it to be.
00:18:39.074 --> 00:18:41.828
Well, and then, we would like to know
00:18:41.828 --> 00:18:45.914
if Wikibase would also ease
both data linking
00:18:45.914 --> 00:18:48.710
and growing a diverse community.
00:18:48.710 --> 00:18:52.429
As Sarah said before, we are right now
in a process of opening up
00:18:52.429 --> 00:18:58.356
towards a broader scope
of GLAM institutions,
00:18:58.356 --> 00:19:00.425
and science institutions.
00:19:00.425 --> 00:19:06.152
And of course, they are working
within their own software structures,
00:19:06.152 --> 00:19:09.231
and we would like to know
if Wikibase would ease
00:19:09.231 --> 00:19:12.190
the cooperation-- collaboration with us.
00:19:12.678 --> 00:19:15.390
So, why do we do that?
00:19:15.634 --> 00:19:19.200
This is because we consider that Wikibase
00:19:19.200 --> 00:19:22.239
might be the attractive community zone,
00:19:22.239 --> 00:19:25.596
which means--I had to write that down--
00:19:26.807 --> 00:19:30.607
first of all, as it is open source,
it will be more accessible
00:19:30.607 --> 00:19:35.285
than any proprietary source
software system that is used
00:19:35.285 --> 00:19:39.421
in the cataloging fields
of the GLAM institutions.
00:19:40.002 --> 00:19:43.114
Then, we feel that the Wikibase community
00:19:43.114 --> 00:19:46.354
already by now
is a very dedicated community,
00:19:46.354 --> 00:19:50.163
and we would like to participate
in that dedicated community,
00:19:50.446 --> 00:19:53.447
because we believe that sharing is caring.
00:19:53.771 --> 00:19:59.102
What we want to share
is our knowledge is your knowledge,
00:19:59.144 --> 00:20:02.557
and together, in order to avoid redundancy,
00:20:02.557 --> 00:20:07.393
not by editing the same information
over and over again,
00:20:07.393 --> 00:20:09.373
but reusing data, linking it,
00:20:09.373 --> 00:20:11.559
quoting it, and enriching it.
00:20:12.609 --> 00:20:17.474
And I placed here on the picture
one of the tools
00:20:17.474 --> 00:20:22.802
that is widely used within Wikidata,
this is Histropedia,
00:20:23.332 --> 00:20:29.061
because we also feel that if we are able
to introduce our data into Wikibase,
00:20:29.061 --> 00:20:34.159
we might be able to share tools,
improving the code,
00:20:34.159 --> 00:20:38.181
and thus being an active,
contributing part of the community.
00:20:38.232 --> 00:20:40.030
Thank you.
00:20:40.030 --> 00:20:42.671
I'd like to debate that with you later on.
00:20:43.319 --> 00:20:44.775
Thank you so much.
00:20:44.775 --> 00:20:46.354
(applause)
00:20:46.354 --> 00:20:47.938
Thank you so much.
00:20:49.885 --> 00:20:53.874
So, at some point,
we ask ourselves, did we--
00:20:56.996 --> 00:20:59.868
by accident, write library software?
00:20:59.913 --> 00:21:05.216
Because the adoption of Wikibase
in the library field is so overwhelming.
00:21:06.434 --> 00:21:08.012
But there's more to it.
00:21:09.023 --> 00:21:13.903
And of course, we didn't
accidentally write a library system.
00:21:14.353 --> 00:21:17.764
It can be used for other fields as well.
00:21:18.296 --> 00:21:19.878
For instance, for biology.
00:21:19.878 --> 00:21:23.363
And David Fichtmueller will tell us
about using Wikibase
00:21:23.363 --> 00:21:25.835
as a platform for biodiversity.
00:21:26.770 --> 00:21:29.449
- I think that was grayed.
- Yeah.
00:21:29.449 --> 00:21:31.835
Full screen? Oh, okay.
00:21:37.603 --> 00:21:39.758
Yes. Hello, everybody.
00:21:40.819 --> 00:21:43.383
I'm David, and I work
at the Botanic Garden,
00:21:43.383 --> 00:21:45.214
Botanical Museum here in Berlin.
00:21:45.988 --> 00:21:48.065
And I work there as a computer scientist.
00:21:48.065 --> 00:21:51.194
We have an entire department
called Biodiversity Informatics.
00:21:51.884 --> 00:21:53.633
Generally speaking, we write the software
00:21:53.633 --> 00:21:55.858
that biologists use in their daily work.
00:21:56.430 --> 00:21:58.932
And on my private side,
00:21:58.932 --> 00:22:02.639
I've been a Wikipedia contributor
for almost 15 years now,
00:22:02.639 --> 00:22:06.045
and Wikidata contributor
for almost five years now.
00:22:06.981 --> 00:22:09.425
And also, as part of my job,
00:22:09.425 --> 00:22:12.068
I'm a co-administrator of a MediaWiki farm
00:22:12.068 --> 00:22:16.684
with more than 80 wikis
regarding the biology community.
00:22:18.855 --> 00:22:22.116
And a couple of years ago,
I was assigned to a project
00:22:22.556 --> 00:22:26.670
that was, yeah, about working
on a standard.
00:22:26.735 --> 00:22:29.524
In particular, it's a standard
called ABCD,
00:22:30.827 --> 00:22:33.135
that we needed to do some work on.
00:22:33.405 --> 00:22:37.295
And I assume most of you
haven't heard about ABCD,
00:22:37.295 --> 00:22:39.728
that's not really a bad thing.
00:22:39.728 --> 00:22:41.279
It's really specific.
00:22:41.279 --> 00:22:44.128
It stands for Access to Biological
Collection Data.
00:22:44.863 --> 00:22:47.292
And it's an XML schema.
00:22:47.298 --> 00:22:49.772
So, it can express
biological information,
00:22:49.772 --> 00:22:54.190
in particular, things like information
about herbarium sheets,
00:22:54.190 --> 00:22:59.920
about collections, like fish in
alcohol jars, or--
00:23:01.111 --> 00:23:02.449
but also observations--
00:23:02.449 --> 00:23:05.165
scientists being out in the field,
seeing certain plants,
00:23:05.165 --> 00:23:06.543
seeing certain animals.
00:23:06.543 --> 00:23:08.970
A lot of variety in here,
and because of this,
00:23:08.970 --> 00:23:10.426
it's quite a huge standard.
00:23:10.426 --> 00:23:13.940
So, we have 1,800
different concepts in there.
00:23:14.748 --> 00:23:18.322
That's counting the different XPaths
there are within the file.
00:23:20.055 --> 00:23:22.302
And so the challenge was to convert this
00:23:22.302 --> 00:23:25.234
into a new modern semantic standard.
00:23:25.280 --> 00:23:27.271
We wanted to use an OWL ontology
00:23:27.271 --> 00:23:31.200
that is able to express
the same kind of information
00:23:31.200 --> 00:23:33.951
that has previously been expressed
with the XML files,
00:23:35.245 --> 00:23:38.361
and also keep all the existing
documentation,
00:23:38.361 --> 00:23:41.122
and restrictions,
and all of the connections
00:23:41.122 --> 00:23:42.989
between the items
00:23:42.989 --> 00:23:46.357
and have a collaborative platform
00:23:46.357 --> 00:23:50.284
where other scientists can come in
and give us advice
00:23:50.284 --> 00:23:52.914
on their specific fields of focus.
00:23:52.914 --> 00:23:54.780
Did we model this correctly?
00:23:55.266 --> 00:23:56.596
Is there anything missing?
00:23:56.596 --> 00:24:00.528
So, yeah, with all of this in mind,
we went looking around,
00:24:00.528 --> 00:24:03.675
and found a solution, and I guess
it wouldn't surprise anybody here,
00:24:03.675 --> 00:24:06.752
it's Wikibase, otherwise
I wouldn't have been talking here.
00:24:08.171 --> 00:24:10.779
So, we decided on using Wikibase.
00:24:11.266 --> 00:24:14.356
And we started to install it
without the Docker Image.
00:24:15.165 --> 00:24:17.171
Big mistake. Don't do this.
00:24:17.171 --> 00:24:18.171
(laughter)
00:24:18.171 --> 00:24:21.335
In our defense, we started this
two and a half years ago.
00:24:21.616 --> 00:24:24.167
And it was two years ago
at the WikidataCon
00:24:24.167 --> 00:24:26.088
that the Docker Image was first released.
00:24:26.898 --> 00:24:29.828
So, we had to figure out our own way.
00:24:29.828 --> 00:24:32.265
And once we had things up and running,
00:24:32.265 --> 00:24:35.259
we didn't really want to break
things by changing them.
00:24:35.259 --> 00:24:39.801
We do have the Docker installed
for the Query Service,
00:24:40.275 --> 00:24:43.322
and we have a weird hybrid
of custom installation
00:24:43.322 --> 00:24:46.004
and Docker installation
and modified scripts
00:24:46.004 --> 00:24:48.542
connecting those two instances.
00:24:48.542 --> 00:24:51.605
We then installed
QuickStatements, again, manually,
00:24:51.605 --> 00:24:57.201
because by that time, it wasn't part
of the Query Service,
00:24:57.201 --> 00:25:00.361
did some slight modifications,
and adjustments to get it to work.
00:25:00.888 --> 00:25:05.443
I know it's now part
of the Docker Image.
00:25:05.928 --> 00:25:10.724
But yeah, we had it running,
so, we didn't bother changing it.
00:25:11.574 --> 00:25:13.437
Keep this in mind for later on.
00:25:14.164 --> 00:25:15.867
But before I go into what we did,
00:25:15.867 --> 00:25:18.465
I'm going to avoid
a possible confusion here,
00:25:18.465 --> 00:25:22.280
because we're talking
about data standards,
00:25:22.345 --> 00:25:25.273
and when we express things
in a semantic way,
00:25:25.273 --> 00:25:30.097
we will convert the concepts
from the XML into Classes and Properties.
00:25:30.580 --> 00:25:33.659
So, this being Object Properties
connecting the different classes,
00:25:33.659 --> 00:25:36.663
and Datatype Properties
that actually contain the content,
00:25:36.663 --> 00:25:40.370
that is to store text, numbers,
things like that.
00:25:41.195 --> 00:25:44.038
And we express all of this
within Wikibase,
00:25:44.082 --> 00:25:46.910
but all of those are items in Wikibase.
00:25:47.597 --> 00:25:51.446
And they are then described
using Wikibase Properties.
00:25:51.455 --> 00:25:54.950
So, we have ABCD properties
being items being described
00:25:54.950 --> 00:25:56.657
using Wikibase Properties.
00:25:56.657 --> 00:26:00.531
I try to make sure to use
the prefixes accordingly,
00:26:00.531 --> 00:26:03.581
so you know what I'm talking about
when I talk about properties
00:26:03.581 --> 00:26:04.820
in this talk.
00:26:05.746 --> 00:26:08.060
So, let's look at the properties,
00:26:08.060 --> 00:26:10.203
in particular, with Wikibase Properties.
00:26:10.215 --> 00:26:13.013
We sat down and thought,
"Okay, what do we need
00:26:13.013 --> 00:26:16.296
to describe the concepts
we want to model?"
00:26:16.701 --> 00:26:19.323
And we ended up using around 25 properties
00:26:19.833 --> 00:26:22.532
in addition to, of course, label,
description, alias.
00:26:22.670 --> 00:26:24.452
I'm not going to mention all of them,
00:26:24.452 --> 00:26:26.314
just so you see the variety.
00:26:27.243 --> 00:26:29.846
Those fulfill our requirements.
00:26:29.846 --> 00:26:36.496
And yeah, some things
express some restrictions,
00:26:36.496 --> 00:26:38.544
and others--
00:26:38.544 --> 00:26:40.062
Most of them are optional.
00:26:40.697 --> 00:26:42.628
Only very few are mandatory.
00:26:42.921 --> 00:26:46.489
So then, we set about importing
all of this information.
00:26:46.581 --> 00:26:51.082
We wrote a Schema Parser that extracts
all of the different concepts.
00:26:51.082 --> 00:26:53.959
So everything that has an XPath
within the XML Schema,
00:26:53.959 --> 00:26:57.121
and all of the documentation
that is part of the XML schema,
00:26:57.121 --> 00:27:00.284
and so we got this into a nice CSV file,
00:27:00.284 --> 00:27:04.862
and then we could work on this
and import it using QuickStatements.
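
NOTE Editor's sketch, not part of the talk. The speaker does not show the Schema Parser,
so this is only a minimal illustration of the idea: walk an XML Schema and write one CSV
row per element path with its documentation. It assumes inline, nested xs:element
declarations. The function names, the sample schema, and the CSV columns are hypothetical,
and the QuickStatements import step is not covered.
```python
import csv
import io
import xml.etree.ElementTree as ET
# XML Schema namespace in ElementTree's {uri}tag notation.
XS = "{http://www.w3.org/2001/XMLSchema}"
def extract_concepts(xsd_text):
    """Return (path, documentation) rows for every named xs:element, in document order."""
    rows = []
    def walk(node, path):
        for child in node:
            if child.tag == XS + "element" and child.get("name"):
                child_path = path + "/" + child.get("name")
                doc = child.find(XS + "annotation/" + XS + "documentation")
                text = (doc.text or "").strip() if doc is not None else ""
                rows.append((child_path, text))
                walk(child, child_path)
            else:
                # Wrappers such as xs:complexType or xs:sequence do not extend the path.
                walk(child, path)
    walk(ET.fromstring(xsd_text), "")
    return rows
def to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["path", "documentation"])
    writer.writerows(rows)
    return buf.getvalue()
# Hypothetical two-element schema, used only to exercise the functions above.
SAMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Unit">
    <xs:annotation><xs:documentation>A specimen or observation.</xs:documentation></xs:annotation>
    <xs:complexType><xs:sequence>
      <xs:element name="UnitID">
        <xs:annotation><xs:documentation>Identifier of the unit.</xs:documentation></xs:annotation>
      </xs:element>
    </xs:sequence></xs:complexType>
  </xs:element>
</xs:schema>"""
```
Calling to_csv(extract_concepts(SAMPLE_XSD)) gives CSV text that can be reviewed by hand before any import.
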
00:27:05.918 --> 00:27:07.176
Worked quite well.
00:27:07.176 --> 00:27:11.157
But then, we had, as I said,
1,800-plus concepts
00:27:11.157 --> 00:27:13.272
in our Wikibase instance.
00:27:13.760 --> 00:27:17.252
But then, when we had things like person--
00:27:17.821 --> 00:27:20.366
person name, and contact email--
00:27:20.366 --> 00:27:23.485
those appear a couple of times
within the schema--
00:27:23.485 --> 00:27:27.157
for the data set owner, for the person
who took an image, things like that.
00:27:27.157 --> 00:27:29.180
So, of course, we needed to reduce those,
00:27:29.180 --> 00:27:32.013
and combine those to reusable classes.
00:27:32.064 --> 00:27:34.858
So, there was a lot of manual editing
00:27:34.858 --> 00:27:36.319
to reduce the number of concepts,
00:27:36.319 --> 00:27:39.558
and in the end, we ended up
with a little more than 500.
00:27:39.965 --> 00:27:43.540
So, we have Classes, Object Properties,
Datatype Properties,
00:27:43.540 --> 00:27:45.362
a couple of other ones I'm skipping
00:27:45.362 --> 00:27:47.392
to avoid additional complexity here.
00:27:48.362 --> 00:27:52.856
And for certain large-scale edits,
we also used QuickStatements again.
00:27:54.686 --> 00:27:57.312
So now, we did all of the editing,
00:27:57.312 --> 00:27:59.476
now we wanted to make sure
that the data we have
00:27:59.476 --> 00:28:00.775
is actually consistent.
00:28:01.101 --> 00:28:04.922
So, that's where we used what we call
Maintenance Queries,
00:28:06.252 --> 00:28:09.570
used the query interface
with some SPARQL queries,
00:28:09.570 --> 00:28:12.114
basically to check for missing properties,
00:28:13.250 --> 00:28:15.324
wrong links between concepts,
00:28:16.338 --> 00:28:18.761
basically, things that didn't match
00:28:18.761 --> 00:28:21.112
with our concept, with our structure.
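A sketch of what one such maintenance query could look like: it lists every item of a given class that lacks a required property. The base URI and the IDs (`Q2`, `P1`, `P5`) are hypothetical, and the prefix layout assumes the default Wikibase RDF mapping (`/entity/` and `/prop/direct/`), which may differ on a given installation. The function only builds the query text; sending it to the query service is left out.

```python
def missing_property_query(base_uri, class_item, instance_prop, required_prop):
    """Build a SPARQL query for items of a class that lack a property."""
    return f"""PREFIX wd: <{base_uri}/entity/>
PREFIX wdt: <{base_uri}/prop/direct/>
SELECT ?item WHERE {{
  ?item wdt:{instance_prop} wd:{class_item} .
  FILTER NOT EXISTS {{ ?item wdt:{required_prop} ?value . }}
}}"""


# All identifiers below are placeholders, not the speakers' real IDs.
query = missing_property_query("http://wikibase.example", "Q2", "P1", "P5")
print(query)
```

An empty result from a query like this means every item of the class carries the required property; any rows returned are candidates for manual repair.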
00:28:21.840 --> 00:28:24.356
And in the end, we also had to do
00:28:24.356 --> 00:28:26.007
a manual review of all of the concepts
00:28:26.007 --> 00:28:27.875
just to make sure we didn't miss anything.
00:28:27.875 --> 00:28:29.986
This was kind of a lot of work,
00:28:29.986 --> 00:28:33.882
because if you only take
like five minutes per item,
00:28:33.992 --> 00:28:35.771
multiply it by 550,
00:28:36.781 --> 00:28:39.855
it's over one week of full
and concentrated work.
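The estimate can be checked with a few lines of arithmetic. The eight-hour working day is an assumption; the talk only gives the five minutes and the 550 items.

```python
MINUTES_PER_ITEM = 5
ITEMS = 550
HOURS_PER_WORKDAY = 8  # assumption, not stated in the talk

total_hours = MINUTES_PER_ITEM * ITEMS / 60
workdays = total_hours / HOURS_PER_WORKDAY
print(f"{total_hours:.1f} hours, about {workdays:.1f} working days")
```

That is roughly 46 hours, so more than five full working days of review.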
00:28:40.667 --> 00:28:42.732
But of course, five minutes is often not enough,
00:28:42.732 --> 00:28:45.977
because you sometimes spend
like half an hour to fix a certain item
00:28:45.977 --> 00:28:48.294
when there's problems with the modeling.
00:28:48.985 --> 00:28:50.895
So, we now had all of the data.
00:28:50.895 --> 00:28:53.058
Now, it was time to get the data
out of Wikibase.
00:28:54.175 --> 00:28:58.236
We wrote an export script in Python
that uses the Query Service
00:28:58.236 --> 00:29:01.088
to get the information about the concepts,
00:29:01.088 --> 00:29:04.706
and fill them in templates--
prepared templates.
00:29:05.234 --> 00:29:07.916
So, in the end, we get
a nice valid OWL file
00:29:07.916 --> 00:29:09.787
that contains everything we need.
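A rough sketch of the "fill prepared templates" idea, assuming the query results have already been fetched into a list of dictionaries. The template text, the field names (`iri`, `label`, `comment`), and the example concept are made up for illustration; the speakers' actual templates are not shown in the talk.

```python
from string import Template
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

# Hypothetical templates for an RDF/XML OWL file.
OWL_DOCUMENT = Template("""<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">
$classes
</rdf:RDF>""")

OWL_CLASS = Template("""  <owl:Class rdf:about="$iri">
    <rdfs:label>$label</rdfs:label>
    <rdfs:comment>$comment</rdfs:comment>
  </owl:Class>""")


def render_owl(concepts):
    """Fill one class template per concept and wrap them in the document."""
    classes = "\n".join(
        OWL_CLASS.substitute(
            iri=escape(c["iri"], {'"': "&quot;"}),  # attribute value
            label=escape(c["label"]),
            comment=escape(c["comment"]),
        )
        for c in concepts
    )
    return OWL_DOCUMENT.substitute(classes=classes)


# Stand-in for rows that would come back from the query service.
concepts = [{
    "iri": "http://example.org/schema#Person",
    "label": "Person",
    "comment": "A person & their contact details",
}]
owl_xml = render_owl(concepts)
root = ET.fromstring(owl_xml)  # raises ParseError if not well-formed
print(owl_xml)
```

Escaping the values before substitution keeps the output well-formed even when a label or comment contains characters such as `&`. Parsing the result only checks XML well-formedness, not OWL validity.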
00:29:09.833 --> 00:29:12.788
And this is the actual basis
of the standard.
00:29:12.916 --> 00:29:17.380
For future versions,
when we're going to make revisions,
00:29:17.380 --> 00:29:19.651
the Wikibase is our working platform.
00:29:19.651 --> 00:29:22.697
And once we do an export,
this is the new version of the standard.
00:29:22.750 --> 00:29:25.102
Keeping those separate,
this would also allow us
00:29:25.102 --> 00:29:29.116
to move the server
to a different instance,
00:29:29.116 --> 00:29:32.796
or as I said, change the installation.
00:29:32.887 --> 00:29:35.963
We export JSON
for the documentation of the website.
00:29:36.771 --> 00:29:40.962
And we also export the data
to a second Wikibase instance.
00:29:41.409 --> 00:29:43.196
This is like really
experimental, right now.
00:29:43.196 --> 00:29:46.682
We haven't really used this
in production where it can--
00:29:46.682 --> 00:29:49.483
where the concepts can then be used
to describe actual data.
00:29:49.483 --> 00:29:51.422
So we're breaking down those--
00:29:52.189 --> 00:29:56.402
we're taking them a step down
from properties being Wikibase items,
00:29:56.407 --> 00:29:59.318
and converting them into actual
Wikibase properties.
00:29:59.761 --> 00:30:02.522
This is quite a lot of requests--
quite a lot of steps
00:30:02.522 --> 00:30:05.203
to keep all of the data
and all of the linking consistent,
00:30:05.203 --> 00:30:06.669
but it works.
00:30:06.669 --> 00:30:08.865
And in the end, well,
it was quite successful.
00:30:09.705 --> 00:30:11.703
There is a huge community--
00:30:11.949 --> 00:30:14.909
there is a community about
Biodiversity Information Standards,
00:30:14.909 --> 00:30:18.449
who also had their annual meeting
just in the past days.
00:30:18.729 --> 00:30:21.589
So, there's a huge interest
in reusing this approach
00:30:21.604 --> 00:30:23.385
for other standards, as well.
00:30:23.524 --> 00:30:25.183
And so, in the future,
00:30:25.183 --> 00:30:28.257
we want to try a bit
about Shape Expressions--
00:30:28.257 --> 00:30:31.110
as I said, we have some restrictions
in there to export them--
00:30:31.754 --> 00:30:35.160
and build some better workflows
for the versioning.
00:30:35.160 --> 00:30:36.873
We haven't done this yet.
00:30:36.919 --> 00:30:38.908
And switch up the Docker instance.
00:30:39.398 --> 00:30:41.676
So, at the end, I'm going to have
a small wish list--
00:30:41.676 --> 00:30:43.335
what things could be improved.
00:30:43.335 --> 00:30:47.096
Well, there are a lot more tools
out there that are really written
00:30:47.096 --> 00:30:50.320
for Wikidata, but could be more agnostic,
00:30:51.839 --> 00:30:53.218
in particular, QuickStatements.
00:30:53.218 --> 00:30:56.658
As I said, I did
some adjustments manually.
00:30:56.707 --> 00:30:59.899
Many of the issues I had
are probably solved by now,
00:30:59.899 --> 00:31:01.679
but I don't think all of them.
00:31:01.840 --> 00:31:06.581
Then we want to import existing templates,
00:31:06.581 --> 00:31:09.288
like the SPARQL template,
the Q and the P template.
00:31:09.288 --> 00:31:12.203
They are really useful
when working with Wikibase.
00:31:12.203 --> 00:31:14.599
So, this would be done automatically.
00:31:14.599 --> 00:31:17.111
And as I said, we did a lot
of manual editing.
00:31:17.111 --> 00:31:20.769
So, it would be useful,
just ideal to have a tool where you can--
00:31:20.769 --> 00:31:22.393
Like in an Excel table--
00:31:22.393 --> 00:31:25.551
you load a couple of items,
and you load a couple of properties,
00:31:25.551 --> 00:31:27.619
and then just jump from cell to cell,
00:31:27.619 --> 00:31:31.662
really quickly edit a lot of things
00:31:31.662 --> 00:31:33.423
in a semi-automated way.
00:31:34.985 --> 00:31:36.390
Thanks. That's the end.
00:31:37.093 --> 00:31:38.481
Thank you so much.
00:31:38.481 --> 00:31:40.795
(applause)
00:31:40.795 --> 00:31:42.659
So much to talk about on this.
00:31:43.273 --> 00:31:48.254
So, there is not only--
well, how do I get back from here.
00:31:50.917 --> 00:31:54.004
It's not only about science.
It's not only about libraries.
00:31:54.004 --> 00:31:57.181
You can also create
art and beauty with Wikibase.
00:31:57.181 --> 00:32:01.611
And who would be better to tell us
about this than Stuart Prior?
00:32:12.056 --> 00:32:15.268
Now, slightly embarrassingly,
we talk about art and beauty,
00:32:15.268 --> 00:32:17.296
but this is a really ugly presentation.
00:32:17.296 --> 00:32:18.554
(laughter)
00:32:19.604 --> 00:32:22.552
Starting off with a room
full of Wikimedians,
00:32:22.552 --> 00:32:24.261
trains--people like trains.
00:32:24.956 --> 00:32:26.465
But it has a purpose.
00:32:26.465 --> 00:32:30.538
So, this is Hackney Downs Station
in Northeast London.
00:32:31.429 --> 00:32:34.104
And this is about
Banner Repeater and Wikibase,
00:32:34.104 --> 00:32:35.968
which I'll explain further.
00:32:36.014 --> 00:32:37.829
So, this is a terrible photo.
00:32:37.829 --> 00:32:43.405
But it is actually where
an artists' publishing archive is held,
00:32:43.512 --> 00:32:46.140
which is on the platform
of a train station.
00:32:46.950 --> 00:32:50.688
Within there, they've got
several hundred copies
00:32:50.688 --> 00:32:52.886
of various types of artists' publishing.
00:32:52.886 --> 00:32:54.389
They get a lot of public footfall.
00:32:54.389 --> 00:32:57.132
It does a lot of outreach
to the actual general public.
00:32:57.132 --> 00:32:58.386
Like you get on the train,
00:32:58.386 --> 00:33:01.758
you'll find bits of
sort of obscure art on the train.
00:33:02.856 --> 00:33:04.888
So, it's a really interesting project,
00:33:04.934 --> 00:33:06.924
but part of a much wider community.
00:33:07.452 --> 00:33:10.374
So, what is Artists' Publishing?
What are Artists' Books?
00:33:10.430 --> 00:33:12.087
Like, I didn't know either.
00:33:13.545 --> 00:33:15.329
So, the definition,
according to Wikipedia,
00:33:15.329 --> 00:33:19.377
is "Artists' books are works of art
that utilize the form of the book."
00:33:19.377 --> 00:33:20.956
Well, you can read it.
00:33:21.569 --> 00:33:24.130
But it's individual pieces of art,
00:33:24.130 --> 00:33:28.257
or sometimes collections of art,
using publishing as a medium.
00:33:28.583 --> 00:33:31.141
This varies quite a lot.
It's very interesting.
00:33:31.141 --> 00:33:32.560
It was kind of--
00:33:32.570 --> 00:33:35.043
There was a lot of it
in the early '20s and '30s,
00:33:35.043 --> 00:33:37.793
and it had a bit of a renaissance,
'60s and '70s,
00:33:37.793 --> 00:33:39.411
and continues to expand.
00:33:39.491 --> 00:33:42.237
Has a large global community,
multilingual,
00:33:43.089 --> 00:33:47.748
somewhat separate from
large art institutions.
00:33:47.805 --> 00:33:50.448
So, you'll find collections,
00:33:50.448 --> 00:33:53.616
such as the V&A
has a collection, obviously.
00:33:54.680 --> 00:33:58.483
So, they've got various kinds
of items such as these.
00:33:59.294 --> 00:34:02.045
This is just an article,
so it's just not the best display.
00:34:03.098 --> 00:34:08.009
But it's a really kind of interesting,
yet slightly niche field of work.
00:34:08.661 --> 00:34:11.674
But it's not very good on Wikidata.
00:34:14.023 --> 00:34:18.245
This is, again, a really terrible photo--
it's not my photo--
00:34:18.245 --> 00:34:21.488
of some of the stuff held
in Banner Repeater's archive.
00:34:21.488 --> 00:34:24.086
If you see in the middle,
the pink one, Blast,
00:34:24.086 --> 00:34:27.802
that's actually a fairly notable
piece of artists' publishing
00:34:27.802 --> 00:34:29.548
from the '20s.
00:34:31.168 --> 00:34:32.838
What does it look like on Wikidata?
00:34:32.838 --> 00:34:34.341
It's not good on Wikidata.
00:34:34.869 --> 00:34:37.782
It's often just confused with books
00:34:37.782 --> 00:34:39.803
or other forms of publishing.
00:34:40.292 --> 00:34:42.724
The average kind of Wikidata item for
00:34:42.728 --> 00:34:46.374
a notable piece of artists' publishing
00:34:47.145 --> 00:34:50.512
doesn't really have much to say about it.
00:34:50.568 --> 00:34:53.738
You know, it's just--
there you go, that's it.
00:34:54.832 --> 00:34:57.429
There aren't a huge number
of identifiers, either.
00:34:57.781 --> 00:35:00.782
So, there's clearly a lot missing
00:35:00.782 --> 00:35:03.710
when it comes to artists' publishing,
00:35:03.710 --> 00:35:06.840
certainly compared
to more traditional forms of art--
00:35:06.840 --> 00:35:09.073
paintings and sculpture and so forth.
00:35:09.722 --> 00:35:12.681
And there's a huge desire
within the community
00:35:12.681 --> 00:35:15.631
to start codifying this,
and making it a real thing.
00:35:16.566 --> 00:35:19.283
So, I'll give you an example
of what is actually available.
00:35:19.283 --> 00:35:22.202
You can point out what's wrong
with this query.
00:35:23.542 --> 00:35:28.173
So, this is basically all there is.
00:35:28.702 --> 00:35:31.507
That's every artists' book on Wikidata.
00:35:31.552 --> 00:35:33.049
So, there's really not a lot.
00:35:33.049 --> 00:35:36.322
Some of them don't even
have labels for a start.
00:35:36.322 --> 00:35:38.632
And it's something
that really needs expanding.
00:35:38.632 --> 00:35:41.099
And something that has capacity
to be expanded.
00:35:41.148 --> 00:35:43.416
Has anyone seen what's wrong
with this query yet?
00:35:45.164 --> 00:35:47.317
The labels-- the labels say "sausage",
00:35:48.172 --> 00:35:50.814
because I just stole
someone else's query,
00:35:50.814 --> 00:35:52.212
and changed the key number.
00:35:52.212 --> 00:35:53.342
(laughter)
00:35:53.342 --> 00:35:55.264
It's actually a query about sausages.
00:35:55.877 --> 00:35:57.541
Anyway, moving on.
00:35:57.827 --> 00:36:00.475
But yeah, you see it doesn't really have
much of a presence.
00:36:01.122 --> 00:36:04.163
We were approached by Banner Repeater.
00:36:05.378 --> 00:36:07.281
So, I work with Wikimedia UK.
00:36:07.281 --> 00:36:10.275
We were approached by Banner Repeater
to help them with this--
00:36:10.719 --> 00:36:12.416
with setting up a Wikibase--
00:36:13.182 --> 00:36:15.812
in terms of funding,
in getting extra funding,
00:36:15.812 --> 00:36:18.293
but also in terms of bringing in
a wider community,
00:36:18.293 --> 00:36:20.152
and being part of the process.
00:36:20.561 --> 00:36:23.886
So, the process is basically
to gather this community
00:36:23.886 --> 00:36:27.364
of artists, archivists,
and linked data experts,
00:36:28.554 --> 00:36:31.607
and work out what the schema,
the data model,
00:36:31.607 --> 00:36:33.872
for artists' publishing should be.
00:36:33.929 --> 00:36:35.588
It's a very specialized field.
00:36:35.953 --> 00:36:38.147
Doesn't really map
onto Wikidata perfectly.
00:36:38.392 --> 00:36:40.793
It's probably too granular for it.
00:36:41.684 --> 00:36:44.485
And the other thing
is the kind of flexibility of it.
00:36:44.577 --> 00:36:46.639
Maybe it doesn't really fit in Wikidata.
00:36:46.639 --> 00:36:50.090
Maybe it's too rigid at the moment.
00:36:50.090 --> 00:36:52.796
The Wikibase is being built,
00:36:52.796 --> 00:36:55.639
so I haven't got much to show you,
because it's not been built yet,
00:36:55.639 --> 00:36:57.149
but this is more about the process.
00:36:57.343 --> 00:37:00.591
And the process is extensive
community consultation,
00:37:00.678 --> 00:37:02.136
a few kind of layers of it.
00:37:02.136 --> 00:37:04.563
So, we're not just going
to do this in one session.
00:37:04.563 --> 00:37:06.146
It's not a few individuals deciding.
00:37:06.146 --> 00:37:08.296
It's kind of ongoing,
and ongoing, and ongoing.
00:37:09.352 --> 00:37:13.244
The impact of this
could be fairly substantial,
00:37:13.244 --> 00:37:15.145
because no one else is doing this work.
00:37:15.145 --> 00:37:18.593
A lot of the larger institutions
have artists' publishing
00:37:18.593 --> 00:37:20.270
sitting in their kind of back room.
00:37:20.270 --> 00:37:22.163
They don't really know
how to categorize it.
00:37:22.163 --> 00:37:23.744
They haven't categorized it very well.
00:37:23.793 --> 00:37:25.899
They're not very interested in it.
00:37:25.899 --> 00:37:29.104
But there is a huge community
that is interested in doing this.
00:37:30.527 --> 00:37:34.012
So, this is basically
the process at the moment.
00:37:34.502 --> 00:37:36.936
So, the initial workshop has happened.
00:37:36.936 --> 00:37:40.228
So, it was an expert workshop
with some people
00:37:40.228 --> 00:37:43.644
deep in the field of artists' publishing--
00:37:43.644 --> 00:37:45.959
archivists, people
who own collections, and so forth--
00:37:46.002 --> 00:37:48.962
to establish a kind of
basic set of priors,
00:37:49.407 --> 00:37:52.080
to look at what already existed,
00:37:52.080 --> 00:37:54.677
what the existing status was on Wikidata,
00:37:54.677 --> 00:37:57.134
and look at how that
could be expanded or improved.
00:37:57.665 --> 00:38:00.503
And then they documented that,
00:38:00.503 --> 00:38:03.605
and established this basic structure.
00:38:04.135 --> 00:38:05.759
And now, we move into the next process
00:38:05.759 --> 00:38:07.630
where it's bringing in
a much wider community.
00:38:07.721 --> 00:38:11.087
So that's-- it's not just data people,
it's creators, as well.
00:38:11.656 --> 00:38:13.237
There'll be a lot of narrative in this,
00:38:13.237 --> 00:38:15.140
and a lot of qualitative things.
00:38:15.140 --> 00:38:18.093
Again, stuff that just
doesn't really belong on Wikidata.
00:38:18.933 --> 00:38:20.966
But also working with archivists,
00:38:20.966 --> 00:38:24.045
and working with linked
data experts, and so forth,
00:38:24.045 --> 00:38:26.322
to hopefully bring this all together,
00:38:26.322 --> 00:38:29.979
to create a resource that will have
a nice accessible front end,
00:38:29.979 --> 00:38:33.241
and also build this community--
people who can contribute to it,
00:38:33.241 --> 00:38:35.631
and kind of own this data set.
00:38:36.318 --> 00:38:38.505
I'll show you what we've got ready.
00:38:40.687 --> 00:38:44.205
This is subject to change.
00:38:44.815 --> 00:38:47.494
But this is basically kind of
where we've got so far
00:38:47.494 --> 00:38:48.612
with the expert ones.
00:38:48.612 --> 00:38:51.968
So, you see different P numbers
being developed,
00:38:51.968 --> 00:38:54.891
and look at what
their equivalent on Wikidata is.
00:38:55.428 --> 00:38:58.472
And obviously, it's a lot more granular
00:38:58.472 --> 00:39:01.394
than probably the information
on Wikidata is at the moment, so--
00:39:02.900 --> 00:39:06.599
There's a lot of detailed stuff,
so there's qualities
00:39:06.599 --> 00:39:09.063
such as height, width,
thickness, and so forth,
00:39:09.763 --> 00:39:12.135
which aren't necessarily that present
00:39:12.135 --> 00:39:14.981
on other groups
of artists' publishing on Wikidata.
00:39:15.453 --> 00:39:19.946
But there's also other things like
"commissioned by", and "contributors to",
00:39:19.946 --> 00:39:22.573
and a lot of these works
will have multiple contributors.
00:39:23.262 --> 00:39:25.526
And multiple editions
and things like that.
00:39:25.526 --> 00:39:27.432
There's really a lot
of granular information
00:39:27.432 --> 00:39:29.049
that can come about these things.
00:39:29.049 --> 00:39:30.844
And a lot of narrative as well, you know,
00:39:31.571 --> 00:39:32.953
as things have changed over time,
00:39:32.953 --> 00:39:34.741
as people have reinterpreted things.
00:39:35.566 --> 00:39:38.288
And this was what was created.
00:39:39.605 --> 00:39:42.633
Again, most of it has
Wikidata equivalents,
00:39:42.633 --> 00:39:44.063
but some of it doesn't yet.
00:39:44.063 --> 00:39:46.748
So, what do we have here.
00:39:48.203 --> 00:39:50.395
Other editions, and things like that.
00:39:50.395 --> 00:39:51.813
So, it's fairly specialized.
00:39:51.813 --> 00:39:52.929
This is the first stage.
00:39:52.929 --> 00:39:54.643
And this will go through another process,
00:39:54.643 --> 00:39:57.237
as people take things away from it
or contribute to it.
00:39:58.180 --> 00:40:00.727
The flexibility is really
important in this.
00:40:01.577 --> 00:40:04.785
It's kind of getting away
from older kind of standards,
00:40:04.785 --> 00:40:07.101
and moving to something
which is a bit more up-to-date,
00:40:07.101 --> 00:40:09.423
and something where the community
can really change things,
00:40:09.423 --> 00:40:11.869
and not be dictated to--
and I'll start speaking quicker.
00:40:13.778 --> 00:40:18.258
So, power dynamics, at the moment,
and why Wikibase.
00:40:18.258 --> 00:40:20.432
So at the moment, this is the art world.
00:40:20.432 --> 00:40:21.950
This is what the art world looks like.
00:40:21.950 --> 00:40:23.269
It's a big orange thing.
00:40:23.570 --> 00:40:25.360
But you've got these large institutions,
00:40:25.360 --> 00:40:27.993
and then you've got sort of
groups of artists' publishing.
00:40:28.117 --> 00:40:31.575
That could be Delhi, Mexico City,
London, and so forth.
00:40:32.197 --> 00:40:36.219
And what we don't want
is this kind of thing
00:40:36.219 --> 00:40:38.881
where large institutions and experts
get to dictate
00:40:38.881 --> 00:40:41.672
the kind of ontology,
and how these things are going to work.
00:40:43.115 --> 00:40:47.389
So, working to establish a Wikibase
among an artist community
00:40:47.874 --> 00:40:51.523
can help them work out
what they're going to do,
00:40:51.523 --> 00:40:54.582
and then they start pushing back
into the larger institutions,
00:40:54.995 --> 00:40:57.140
with a more kind of flexible data model,
00:40:57.140 --> 00:40:59.344
with something that's more up-to-date
00:40:59.344 --> 00:41:01.814
and coming from grassroots organizations,
00:41:01.814 --> 00:41:05.115
as opposed to coming
from institutions, so to speak.
00:41:05.564 --> 00:41:08.325
So, I think there's huge value
in this approach
00:41:08.398 --> 00:41:12.058
in terms of creating
a sort of parallel infrastructure
00:41:12.058 --> 00:41:16.284
for communities of people
who own content, and so forth,
00:41:16.284 --> 00:41:19.360
much like Wikimedia is,
00:41:19.846 --> 00:41:21.996
and kind of pushing out to institutions,
00:41:21.996 --> 00:41:24.223
rather than doing it the other way around.
00:41:24.662 --> 00:41:26.847
Do I have another slide?
What next?
00:41:26.847 --> 00:41:29.368
I always put this slide in,
because it's always the worst slide,
00:41:29.368 --> 00:41:30.484
and it's such a stereotype.
00:41:31.347 --> 00:41:35.068
What next? We're moving on
to the community consultation stage,
00:41:35.068 --> 00:41:38.953
so we'll get a bit more kind of
expansive and interesting.
00:41:39.288 --> 00:41:43.724
Obviously, this database
will be talking to Wikidata,
00:41:43.724 --> 00:41:46.632
but on what terms,
we're not 100% sure.
00:41:46.632 --> 00:41:48.614
But it could be that this becomes very--
00:41:48.614 --> 00:41:51.086
just a very specific instance
for artists' publishing
00:41:51.086 --> 00:41:53.922
that Wikidata can draw from,
and vice versa.
00:41:54.501 --> 00:41:56.918
And I'll just finish off
with that picture again,
00:41:56.918 --> 00:41:58.419
because I just quite like it.
00:41:58.587 --> 00:42:00.936
And that's all I have to say.
Thank you.
00:42:00.936 --> 00:42:04.881
- Thank you so much.
- (applause)
00:42:05.810 --> 00:42:08.853
We're almost at the end
of our fast-paced ride,
00:42:08.853 --> 00:42:12.710
and we'll-- what to say?
We saved the best for last?
00:42:12.710 --> 00:42:16.433
No, but we give the last presentation
00:42:16.433 --> 00:42:20.118
to someone who's a true pioneer
of using Wikibase
00:42:20.118 --> 00:42:23.439
in the field of digital humanities.
00:42:23.439 --> 00:42:25.510
And, yeah-- Olaf Simons.
00:42:25.510 --> 00:42:28.499
You have not prepared any slides,
but you will do some live action.
00:42:28.600 --> 00:42:29.978
Exactly.
00:42:30.793 --> 00:42:34.165
And I have been on Wikipedia
since 2004, actually.
00:42:34.230 --> 00:42:35.570
I have the 15 years.
00:42:37.772 --> 00:42:39.555
What am I going to show?
00:42:41.665 --> 00:42:43.564
I've been congratulated for this.
00:42:43.564 --> 00:42:47.635
I'm going to show you
the Wikibase instance we created.
00:42:47.635 --> 00:42:49.056
It's not a Docker Image.
00:42:49.056 --> 00:42:52.093
And I could agree, it's not the best
to have a Docker--
00:42:52.093 --> 00:42:56.707
it's not the best to have
an independent installation.
00:42:56.707 --> 00:42:57.808
It's difficult,
00:42:57.808 --> 00:42:59.646
and it has been extremely
difficult for us,
00:42:59.646 --> 00:43:03.638
and we're grateful
to Wikimedia Germany
00:43:04.828 --> 00:43:08.741
for helping us get it done
under a mutual agreement we had.
00:43:09.413 --> 00:43:15.696
So, basically, we have here
several projects on this.
00:43:16.060 --> 00:43:18.243
It's more project-oriented than Wikidata.
00:43:18.847 --> 00:43:21.453
And my thing should be in here.
00:43:21.506 --> 00:43:27.025
I open that and go--
just should have done that before.
00:43:27.336 --> 00:43:28.595
Here we are.
00:43:29.723 --> 00:43:33.542
The history of the Illuminati--
I start with this one.
00:43:33.868 --> 00:43:36.216
This has been a little film
00:43:36.216 --> 00:43:40.272
which has been created
by Paul-Olivier Dehaye,
00:43:41.602 --> 00:43:43.755
whom I only know from Twitter,
00:43:43.755 --> 00:43:45.709
as he asked us what kind of experience
00:43:45.709 --> 00:43:49.933
we had when we got our Wikibase,
00:43:49.933 --> 00:43:52.242
and he was experimenting with his own.
00:43:52.242 --> 00:43:55.606
And I talked to him
about things we could do,
00:43:55.606 --> 00:43:57.271
and things we could not do.
00:43:57.271 --> 00:44:00.432
This was a film I would love
to be able to do.
00:44:00.432 --> 00:44:02.339
And he said, "It's easy for me.
00:44:02.339 --> 00:44:04.724
I can run a SPARQL search,
get the information,
00:44:04.724 --> 00:44:08.147
and put it into a program,
in which you can then see this thing."
00:44:08.835 --> 00:44:12.328
It's actually 20 years of research
on the Illuminati,
00:44:12.328 --> 00:44:15.897
and gives you a short history
of the entire organization
00:44:15.897 --> 00:44:17.921
and all its correspondences.
00:44:17.921 --> 00:44:20.147
That's not a Wikimedia tool.
00:44:20.147 --> 00:44:23.024
It's not a tool of Wikibase.
00:44:23.024 --> 00:44:25.010
But it's something you can do.
00:44:25.010 --> 00:44:29.545
And actually, I like it
that it is not a tool already.
00:44:29.545 --> 00:44:31.006
It should become a tool.
00:44:31.006 --> 00:44:33.932
I like it because it shows
our data is really free.
00:44:33.932 --> 00:44:37.343
Someone can download our data,
someone can do something with it,
00:44:37.343 --> 00:44:42.308
which we hadn't expected,
and it can be done within two hours,
00:44:42.308 --> 00:44:44.482
if you're bright--
and he is bright, of course.
00:44:45.255 --> 00:44:46.735
So, he created this for us.
00:44:46.827 --> 00:44:48.929
I go back to my presentation.
00:44:50.141 --> 00:44:52.825
Why on Wikibase?
00:44:52.825 --> 00:44:56.203
This was the immediate question
when we approached Wikimedia.
00:44:56.203 --> 00:44:58.910
I knew of Wikidata since 2010,
00:44:59.480 --> 00:45:04.643
and in 2017, it was ready
to be used by us.
00:45:05.560 --> 00:45:10.942
And there was actually an interest
from Wikimedia people to say,
00:45:10.942 --> 00:45:13.215
"Do it, and we support you."
00:45:13.705 --> 00:45:15.493
Why our own base?
00:45:15.777 --> 00:45:19.590
Basically, it's original research
that we have to do.
00:45:20.159 --> 00:45:24.951
And the entire installation
is a research tool.
00:45:24.951 --> 00:45:27.663
It's not only there to take a look
at what we did
00:45:27.663 --> 00:45:29.331
and for presentation purposes,
00:45:29.331 --> 00:45:31.968
but actually, I use it every day
for my research.
00:45:31.968 --> 00:45:35.341
I change dates of documents,
00:45:35.341 --> 00:45:38.782
and take a look at how things look
when I have changed that.
00:45:38.782 --> 00:45:41.410
I do a lot with working hypotheses.
00:45:41.410 --> 00:45:48.083
And we ask projects that have data
to give us their data,
00:45:48.083 --> 00:45:50.073
and to feed them in,
00:45:50.073 --> 00:45:54.269
and they can, again, put a label,
00:45:54.269 --> 00:45:58.208
put an item to their data sets,
00:45:58.264 --> 00:46:02.397
that says this has been produced
by the following project.
00:46:02.397 --> 00:46:04.777
Next projects can continue with it.
00:46:04.777 --> 00:46:06.962
But it's already there as a marker
00:46:06.962 --> 00:46:11.260
that this is a data set
with work from a certain project.
00:46:11.437 --> 00:46:14.149
And if you have a project, DFG--
00:46:14.779 --> 00:46:17.568
DFG funded, the German
Research Foundation--
00:46:17.568 --> 00:46:19.404
if you have a project, you want to show
00:46:19.404 --> 00:46:20.983
what kind of work you have done.
00:46:20.983 --> 00:46:22.633
And you can now do a SPARQL search
00:46:22.633 --> 00:46:25.880
and present your entire group of data sets
00:46:25.880 --> 00:46:30.100
in the final résumé of your work.
00:46:30.751 --> 00:46:36.002
So we get original research,
we identify research,
00:46:36.002 --> 00:46:38.513
we encourage working hypotheses.
00:46:38.588 --> 00:46:40.045
This is a working tool,
00:46:40.045 --> 00:46:42.807
and it's actually quite useful
to start from the beginning,
00:46:42.807 --> 00:46:44.267
not to present something in the end.
00:46:44.267 --> 00:46:46.741
But from day one, you work with it,
00:46:46.741 --> 00:46:50.170
and what you think is
the proper answer to that question,
00:46:50.170 --> 00:46:53.120
you can put it into Wikibase, and then
00:46:53.120 --> 00:46:55.021
you can substantiate information
00:46:55.021 --> 00:46:57.253
until you see this
is the right identification
00:46:57.253 --> 00:46:59.532
of a person or the right date for a thing
00:46:59.532 --> 00:47:02.249
which we haven't been able to date so far.
00:47:02.309 --> 00:47:05.103
So, actually, accumulate work
while you are doing it,
00:47:05.103 --> 00:47:07.536
use the Wikibase as a kind of tool
00:47:07.536 --> 00:47:09.763
that is getting you closer
to the final result.
00:47:11.098 --> 00:47:14.782
Our first meeting took place
on December 1, 2017.
00:47:15.268 --> 00:47:18.757
And I remember I had
a little challenge for you,
00:47:18.960 --> 00:47:25.067
and that was a death date--
a date of death for a person--
00:47:25.245 --> 00:47:30.055
where I wanted to have someone
to show a source for that,
00:47:30.055 --> 00:47:31.429
and that was extremely difficult,
00:47:31.429 --> 00:47:32.975
because he had to create the source
00:47:32.975 --> 00:47:34.758
before he could connect it to that.
00:47:34.758 --> 00:47:36.499
And in the room, we were--
00:47:36.499 --> 00:47:39.815
we had the clear idea,
if we do this, we'd do it
00:47:39.815 --> 00:47:44.608
with the sources already part
of the Wikibase installation we have.
00:47:44.608 --> 00:47:46.433
And if we have the sources in there--
00:47:46.433 --> 00:47:49.515
that is, all the early modern books
that have been printed
00:47:49.515 --> 00:47:50.771
would be the ideal.
00:47:50.771 --> 00:47:53.382
If we have that in there,
we need the GND in there.
00:47:53.382 --> 00:47:59.538
And when we heard that the GND people
were on track to test the software,
00:47:59.538 --> 00:48:01.879
I approached them and asked,
"Wouldn't you like to do this
00:48:01.879 --> 00:48:05.499
in a cooperation with us,
so that we can have your data,
00:48:05.499 --> 00:48:07.208
which we want to have, anyway,
00:48:07.208 --> 00:48:09.976
and so that you can see
how it works on a Wikibase?"
00:48:09.976 --> 00:48:11.684
And this is where we are at the moment.
00:48:11.684 --> 00:48:14.849
And presently, I would say,
a lot of things,
00:48:14.849 --> 00:48:16.399
we're not sure how they are done,
00:48:16.399 --> 00:48:18.339
or at least I am not sure
how they are done.
00:48:18.339 --> 00:48:21.292
How's the input done, how do you get
from a resource of strings
00:48:21.292 --> 00:48:24.500
to an item-based resource--
lots of things.
00:48:25.111 --> 00:48:28.065
And basically, my talk here
is an invitation.
00:48:28.471 --> 00:48:30.012
Join us.
00:48:30.502 --> 00:48:32.987
We are still not really part
of the Wikibase community.
00:48:32.987 --> 00:48:33.999
That doesn't exist.
00:48:33.999 --> 00:48:35.789
We have a Wikidata community.
00:48:35.789 --> 00:48:38.057
And lots of things
are taking place in Wikidata,
00:48:38.057 --> 00:48:42.751
but if I ask for help for a Wikibase
that is not Wikidata,
00:48:43.118 --> 00:48:44.696
that's a difficult thing.
00:48:46.030 --> 00:48:49.432
First thing I would say is,
actually, to work with us is cool,
00:48:49.432 --> 00:48:53.665
because you can grab the data
for Wikidata anytime, any moment, under CC0.
00:48:54.398 --> 00:48:57.886
So, actually, you can use it
as an incubator of your work,
00:48:57.886 --> 00:49:00.486
and drag it to Wikidata.
00:49:01.013 --> 00:49:06.106
And also, we will work with big data,
when we have the GND
00:49:06.106 --> 00:49:07.967
in there, that will be quite something.
00:49:07.967 --> 00:49:09.628
So, if you really want the challenge,
00:49:09.628 --> 00:49:11.810
you can get it also on our platform.
00:49:12.339 --> 00:49:15.499
And we offer interesting communities.
00:49:16.394 --> 00:49:18.341
Basically, one of the things
that is different
00:49:18.341 --> 00:49:21.489
is that we have all clear-name accounts
and institutions.
00:49:21.489 --> 00:49:24.459
So, but that also means you can do things
00:49:24.459 --> 00:49:25.949
which you couldn't do on Wikidata.
00:49:25.949 --> 00:49:27.976
You can do your genealogy at our site.
00:49:27.976 --> 00:49:28.993
We don't mind.
00:49:28.993 --> 00:49:32.075
It's interesting to have people
getting such data.
00:49:32.075 --> 00:49:36.049
You can do your city's search--
research, historical research
00:49:36.049 --> 00:49:37.948
on our platform-- we don't mind.
00:49:37.948 --> 00:49:42.456
You can be with research on our platform.
00:49:43.052 --> 00:49:45.812
So, lots of things need to be done.
00:49:46.137 --> 00:49:48.565
We have immense problems
running the database.
00:49:48.565 --> 00:49:50.676
It was implemented by Wikimedia,
00:49:50.676 --> 00:49:52.981
but now, we see lots of things
don't really work.
00:49:52.981 --> 00:49:54.478
We can't really fix that.
00:49:54.478 --> 00:49:57.543
It's extremely difficult to get help
00:49:57.543 --> 00:50:00.489
to run the database,
to update the database,
00:50:00.489 --> 00:50:03.034
to solve little technical problems,
00:50:03.034 --> 00:50:08.632
which we face as soon as we run
an instance outside Wikidata.
00:50:09.318 --> 00:50:13.002
Like getting the direct
GND link is difficult.
00:50:13.055 --> 00:50:15.644
It works on Wikidata,
it doesn't work on our instance.
00:50:15.644 --> 00:50:19.620
Getting images from Wikimedia Commons
00:50:19.620 --> 00:50:23.260
on our Wikibase is not that easy.
00:50:23.260 --> 00:50:25.370
Lots of little things still remain.
00:50:25.370 --> 00:50:27.525
So, actually, this is an invitation.
00:50:27.525 --> 00:50:32.153
If you want to join us
on the mass input, do that.
00:50:33.852 --> 00:50:34.861
Approach us.
00:50:34.912 --> 00:50:37.191
If you want to help us
with technical things,
00:50:37.191 --> 00:50:38.591
this is highly welcome.
00:50:38.591 --> 00:50:40.129
And then, we need tools.
00:50:40.129 --> 00:50:42.120
You saw the tool we had in the beginning.
00:50:42.120 --> 00:50:44.921
Actually, it's not that difficult
to get such tools.
00:50:45.934 --> 00:50:50.963
I saw what kind of query you do
to get such a visualization,
00:50:50.963 --> 00:50:55.140
and once you have it,
you should be able to modify it easily.
00:50:56.601 --> 00:50:59.358
These tools are extremely precious
00:50:59.358 --> 00:51:02.754
in our community
of digital humanities projects.
00:51:02.774 --> 00:51:06.099
And there are little companies
that create these tools,
00:51:06.099 --> 00:51:08.727
again, and again, and again,
and get money for that.
00:51:08.727 --> 00:51:12.202
I would love to have these tools
just once and for all free
00:51:12.202 --> 00:51:15.493
and on the market and working
with a Wikibase instance.
00:51:15.493 --> 00:51:19.662
So, anyone who is interested
in developing tools,
00:51:19.662 --> 00:51:21.901
approach us, and we have plenty of ideas
00:51:21.901 --> 00:51:24.624
of what visualizations
historians would love to see,
00:51:25.071 --> 00:51:26.815
and that should be done.
00:51:28.198 --> 00:51:31.493
So, basically, lots of things,
like, still remain.
00:51:31.549 --> 00:51:33.774
I've got one minute.
I don't need that one minute.
00:51:33.821 --> 00:51:35.640
And you're putting pressure on me.
00:51:37.260 --> 00:51:38.637
(person) Give it to the audience.
00:51:38.637 --> 00:51:40.380
I give the minute to the audience.
00:51:40.380 --> 00:51:42.122
Yeah. Thank you so much.
00:51:42.172 --> 00:51:44.324
And maybe you want to sit down,
00:51:44.324 --> 00:51:49.363
because I would like everyone
to join me back on stage.
00:51:50.053 --> 00:51:51.793
And we can have a round of questions.
00:51:51.793 --> 00:51:54.628
I really like that we ended
with an invitation,
00:51:54.628 --> 00:51:56.850
because this is what this is now.
00:51:57.254 --> 00:51:58.836
You are invited to ask questions.
00:51:58.836 --> 00:52:03.165
You are also invited to join us tomorrow
at the Wikibase meetup.
00:52:03.489 --> 00:52:06.332
If you are-- if you have some idea
00:52:06.332 --> 00:52:08.567
for an awesome Wikibase installation,
00:52:08.567 --> 00:52:12.262
for your institution, for your hobby,
for changing the world--
00:52:12.990 --> 00:52:16.267
please come and join us,
we will meet up, and--
00:52:18.083 --> 00:52:20.228
There's some complication
with the chairs.
00:52:20.357 --> 00:52:22.340
Well, let's stand up. Okay.
00:52:22.390 --> 00:52:24.496
I think we have another microphone, here.
00:52:24.496 --> 00:52:26.528
(person) I have the microphone
for the questions.
00:52:26.971 --> 00:52:29.246
Okay. So--
00:52:31.157 --> 00:52:32.662
Thank you to the presenters.
00:52:32.662 --> 00:52:35.799
And meet us at the Wikibase meetup,
00:52:35.799 --> 00:52:38.911
and now, I can't wait to hear
your questions to the panel.
00:52:40.731 --> 00:52:42.391
(person) Who's the first?
00:52:43.805 --> 00:52:47.088
(person) Hi. I will be talking
in the lightning session, too,
00:52:47.088 --> 00:52:50.872
about geosciences, and how in geosciences,
00:52:50.872 --> 00:52:54.312
there are many data repositories
that have collected
00:52:54.312 --> 00:52:56.895
and shared data with the community
00:52:56.895 --> 00:52:59.331
for years, for decades in some cases.
00:52:59.820 --> 00:53:04.808
And they curate the data set,
their schemas evolve continuously,
00:53:04.808 --> 00:53:07.243
they get a lot of feedback
from the community.
00:53:07.243 --> 00:53:10.042
All they desire is to organize
the community,
00:53:10.042 --> 00:53:12.557
to enable the growth
of these repositories.
00:53:13.046 --> 00:53:17.371
So, they don't necessarily desire
to put all their content in Wikidata
00:53:17.371 --> 00:53:18.837
and lose control over it.
00:53:18.837 --> 00:53:22.201
They offer a tremendous service
curating this content.
00:53:22.566 --> 00:53:27.743
So, I just wanted to point out
that some of the requirements
00:53:27.743 --> 00:53:30.895
and needs that have been voiced
by the panelists
00:53:30.895 --> 00:53:32.841
appear in my communities.
00:53:32.931 --> 00:53:39.764
And my question is, how do you mix
or maintain control
00:53:40.291 --> 00:53:42.971
over those schemas, over the standards,
00:53:42.971 --> 00:53:47.827
while allowing the community
to continue to introduce feedback
00:53:47.827 --> 00:53:52.194
and have more of this crowdsourcing
spirit that Wikidata has?
00:53:52.882 --> 00:53:56.209
I think everyone could answer that,
but maybe David, you want to start?
00:53:57.313 --> 00:53:59.470
I'm not sure whether I'm the right
person to answer this,
00:53:59.470 --> 00:54:00.845
because in our use case--
00:54:02.175 --> 00:54:04.100
in terms of data modeling,
00:54:04.100 --> 00:54:09.297
it's really a narrow set of people
who actually do the work.
00:54:09.472 --> 00:54:13.415
We contact experts
for the relevant segments,
00:54:14.145 --> 00:54:17.309
and some of them could contribute,
but for the current iteration,
00:54:17.309 --> 00:54:21.035
it was only me and two colleagues
who actually worked on it.
00:54:21.082 --> 00:54:25.903
So, we want to have this option,
that we get experts in,
00:54:25.903 --> 00:54:29.356
but it's always in close
collaboration with us,
00:54:29.356 --> 00:54:32.076
so that we don't really have to worry
00:54:32.076 --> 00:54:34.349
about the problem of crowdsourcing.
00:54:36.053 --> 00:54:38.232
Being part of the Wikimedia community,
00:54:38.232 --> 00:54:40.620
I would say, I would not be that worried.
00:54:40.702 --> 00:54:45.797
95% of the edits are good edits,
and improving things--more than that.
00:54:47.097 --> 00:54:50.409
As soon as we have an instance
that is actually closed--
00:54:50.409 --> 00:54:53.350
where I offer the accounts on real name,
00:54:53.350 --> 00:54:59.469
that's an additional hurdle
that no fool is going to go over.
00:54:59.520 --> 00:55:05.335
People are required on our instance
to offer an address, on page--
00:55:05.442 --> 00:55:06.938
not to me, but on page--
00:55:06.938 --> 00:55:10.312
and this is something only
institutions usually do,
00:55:10.312 --> 00:55:11.576
or private people that say,
00:55:11.576 --> 00:55:13.564
"Okay, I'm a private person.
I love this research.
00:55:13.564 --> 00:55:15.882
This is my personal field.
I give you my address."
00:55:15.882 --> 00:55:19.692
And this is a thing that puts off every--
00:55:20.384 --> 00:55:23.718
any vandal who wants to destroy Wikidata.
00:55:24.084 --> 00:55:27.545
So, you can close the system, but then,
00:55:27.545 --> 00:55:30.216
you are not really part
of the same flowing community.
00:55:30.305 --> 00:55:33.264
But again, I would say, if you go to CC0,
00:55:33.264 --> 00:55:35.848
then you can open up,
you can be the incubator
00:55:35.848 --> 00:55:40.552
where people do the research,
and then it goes out to the community.
00:55:40.552 --> 00:55:44.935
But it's an invitation--
use maybe closed works,
00:55:44.935 --> 00:55:48.743
and use an instance where
you work together with people you like.
00:55:54.123 --> 00:55:56.475
Well, I think that--
00:55:59.752 --> 00:56:03.798
I don't think that it's only my opinion--
00:56:04.499 --> 00:56:07.250
it is that there are different perspectives,
00:56:07.250 --> 00:56:12.911
and it will be hard to reconcile
all perspectives and say,
00:56:13.359 --> 00:56:19.333
"Wikidata is the solution
for the entire world to go into."
00:56:20.065 --> 00:56:24.364
I don't say by this that Wikidata
is not a solution,
00:56:24.972 --> 00:56:27.925
but there are different perspectives,
there are different needs.
00:56:27.925 --> 00:56:34.844
The world is-- really, there is
a large variety of needs,
00:56:34.844 --> 00:56:40.271
of professional perspectives,
that you cannot reconcile
00:56:40.271 --> 00:56:44.639
in a unique worldwide database.
00:56:44.639 --> 00:56:48.587
So, I think that both are--
00:56:48.587 --> 00:56:51.756
The trickiest thing is how to reconcile
00:56:51.756 --> 00:56:58.528
and find angles of dialogue
between these two large families
00:56:58.528 --> 00:57:00.800
of needs and perspectives.
00:57:03.349 --> 00:57:05.379
If there are more questions,
00:57:05.379 --> 00:57:07.860
I would rather like to go
to more questions.
00:57:08.960 --> 00:57:10.382
Anybody else?
00:57:12.482 --> 00:57:15.159
If not, while you're thinking
about your questions--
00:57:15.159 --> 00:57:17.726
I would just like to say
that's one of the reasons
00:57:17.726 --> 00:57:19.632
why we consider Wikibase,
00:57:19.647 --> 00:57:23.820
because we believe that adding,
editing information
00:57:23.820 --> 00:57:27.992
within the Wikibase instance,
where you have rights and roles,
00:57:27.992 --> 00:57:31.443
as you have in Wikidata,
gives us the opportunity
00:57:31.443 --> 00:57:36.360
to share that information
with the information in Wikidata
00:57:36.360 --> 00:57:39.109
in an easier way,
a more convenient way
00:57:39.109 --> 00:57:44.170
than if we try to build these bridges
in between our authority file
00:57:44.170 --> 00:57:46.520
and Wikidata at the moment.
00:57:46.641 --> 00:57:48.421
(person) So, I find it quite exciting
00:57:48.421 --> 00:57:51.870
hearing about how
you're energizing communities
00:57:51.870 --> 00:57:55.149
to find their own ways for data modeling,
00:57:55.149 --> 00:57:58.636
and that you can put into Wikibase.
00:57:59.336 --> 00:58:02.556
Will you-- I'm just saying
of Stuart Prior's community,
00:58:02.556 --> 00:58:04.174
but also some of the others--
00:58:04.174 --> 00:58:06.155
be trying to feed the approaches
00:58:06.155 --> 00:58:10.157
that as a community
that you decide work back to Wikidata,
00:58:10.157 --> 00:58:12.876
to say, "We've done artists' books,
00:58:12.876 --> 00:58:15.316
we've thrashed through several iterations,
00:58:15.316 --> 00:58:17.753
this is what we found really worked,
00:58:17.753 --> 00:58:19.904
and the properties that you should have
00:58:19.904 --> 00:58:23.193
or revisions you should make
to the Wikidata data model"?
00:58:24.018 --> 00:58:26.006
Good question. Very short answer.
00:58:27.388 --> 00:58:28.922
It's an interesting question.
00:58:30.112 --> 00:58:31.847
I don't know whether this is a model
00:58:31.847 --> 00:58:33.551
that's going to work for other types.
00:58:33.638 --> 00:58:35.009
I hope it is.
00:58:36.063 --> 00:58:39.093
But it's a difficult one, the question
00:58:39.093 --> 00:58:42.774
of whether the Wikidata community
accepts the kind of authority
00:58:42.774 --> 00:58:45.700
of a separate community that goes off
and does the work on its own.
00:58:46.556 --> 00:58:47.776
But I would certainly hope
00:58:47.776 --> 00:58:50.335
that it's a way of people
feeding back into this process,
00:58:50.335 --> 00:58:53.702
without necessarily needing to go
onto Wikidata and do it.
00:58:56.904 --> 00:58:58.525
Well, I would say, grab it.
00:58:58.525 --> 00:59:01.721
Grab it if it's convenient, take it,
and take a look at how it works
00:59:01.721 --> 00:59:02.896
in the other instance.
00:59:02.896 --> 00:59:06.424
And if you feel like
this is a cool property
00:59:06.424 --> 00:59:09.457
to do certain searches,
then that will be adopted,
00:59:09.457 --> 00:59:10.721
that will be flowing.
00:59:10.721 --> 00:59:12.839
I wouldn't think
of authorities doing this.
00:59:12.839 --> 00:59:14.807
(person) Coming from
a Wikidata user perspective,
00:59:14.807 --> 00:59:17.543
the great thing you're doing
is showing you've established code
00:59:17.543 --> 00:59:18.802
that works and runs.
00:59:18.802 --> 00:59:21.390
You've established a data model
that people can see
00:59:21.390 --> 00:59:23.290
is implementable and works.
00:59:23.348 --> 00:59:25.867
And so, in the open source community,
00:59:25.867 --> 00:59:27.693
you know, show us the code.
00:59:27.705 --> 00:59:29.124
You can do that.
00:59:29.124 --> 00:59:32.726
And that's why I think it's very exciting
to have these branches
00:59:32.726 --> 00:59:35.306
that can then be folded back
for data modeling.
00:59:35.306 --> 00:59:36.381
Yeah, thank you.
00:59:36.381 --> 00:59:38.373
I think that is exactly the point.
00:59:38.902 --> 00:59:41.833
I also like the verb
that you used-- energize.
00:59:41.923 --> 00:59:43.869
This is exactly what we want to do.
00:59:43.869 --> 00:59:46.584
Energize, as in Star Trek.
00:59:47.890 --> 00:59:50.193
Yeah, this panel comes to an end.
00:59:51.120 --> 00:59:53.750
And if you have any more questions
00:59:53.750 --> 00:59:57.431
on all these Wikibase projects, talk.
00:59:57.442 --> 00:59:59.633
- Please come tomorrow.
- Have conversations.
00:59:59.633 --> 01:00:01.504
This is what this conference is about.
01:00:01.504 --> 01:00:02.926
Thank you very much.
01:00:02.926 --> 01:00:08.073
(applause)