-
Willkommen, Bienvenue-- Welcome.
-
I always wanted to say that on a stage.
-
(laughter)
-
This is going to be inspirational,
-
because this is the official
Wikibase inspiration panel
-
of WikidataCon 2019.
-
The point of this panel
is to be inspired by all the things
-
that people, in various countries,
in various fields, do with Wikibase,
-
the software behind Wikidata.
-
I was really surprised to learn today
that someone came to me and said,
-
"I learned about Wikibase
the first time today."
-
So, it is the software that runs Wikidata.
-
And if you want
to order things in the world
-
the way Wikidata orders things
in the world,
-
but you don't agree with the items
that we have in there,
-
because you might need
a finer level of granularity,
-
or maybe you don't want to start
with Q1, which is the universe,
-
because in your little world,
Q1 could be a book, if you are a library,
-
or it could be some kind of animal,
if you work in biology,
-
or it could be a historic person,
if you do digital humanities,
-
but you still want
the same system of ordering,
-
then Wikibase is the thing for you.
-
Over the last one or two years,
we have made contact
-
with extraordinary people,
who are pioneers, who are trailblazing,
-
who are evaluating Wikibase,
-
and who are doing
extremely great stuff with that.
-
This panel is going to be very rushed.
-
Every one of the participants
of this panel would have deserved
-
a one-hour slot to present their thing.
-
But our program is packed.
-
So, yeah, keep your seat belt fastened
for a fast-paced ride
-
through the inspirational
world of Wikibases.
-
And the first one is a project
from two organizations,
-
which is a little sensation in itself.
-
The Bibliothèque nationale de France,
the French National Library,
-
and Abes, which is an authority
for higher education.
-
But I think you will talk about that
more in your presentation,
-
and yeah, we'd like to welcome
Anila Angjeli and Benjamin Bober
-
on stage for the first
ten minutes of inspiration.
-
(applause)
-
Hi, everybody.
-
So, yeah, my name is Benjamin Bober.
-
So, I work for Abes,
-
which stands for Higher Education Agency,
-
Bibliographic Higher Education Agency.
-
Basically, we work with all
the university libraries in France,
-
and manage the union catalog.
-
And also their authority files.
-
And I'm here with Anila Angjeli,
from the BnF,
-
French National Library.
-
And we're going to talk to you
about our joint project,
-
which is about creating
a new production tool
-
for authorities data--
-
persons, corporate bodies,
concepts, and so on.
-
And we spent the last months
-
asking Wikibase to do this stuff.
-
So, I will give you some context
really quickly,
-
because it's important for us,
as libraries--
-
There's been this technological
shift recently
-
with the linked open data movement,
-
and we wanted, as a bibliographical agency,
-
to follow this new trend.
-
And, well, it's been years since we've--
-
experimenting with linked open data,
-
with RDF, SPARQL and so on.
-
But we think that now
is a good time to move forward.
-
It's also a good time
because there's been a-- not a shift,
-
there's a fundamental change
-
in the way we consider
bibliographical data.
-
We used to, and we still have data
-
stored in records, we call it MARC records
-
in the library landscape.
-
We used a specific format called MARC.
-
But recently, there has been some way
-
to think about it
from another point of view.
-
And to go from a record-based world,
to an entity-based world
-
when we try to interconnect
people, works,
-
and other entities.
-
So, in this context, we decided
to launch this joint initiative.
-
But our goal is far beyond libraries.
-
We would like to have with us
-
other French GLAMs, for instance,
-
because we think our project
can help them also.
-
So basically, our project is called
Fichier National d'Entités,
-
so, National Entities File.
-
And it will be a shared platform
to collaboratively create
-
and maintain reference
data about entities.
-
Like I said, persons,
corporate bodies, places, concepts,
-
and creative works.
-
So, we embrace a lot of things.
-
And it's a challenge
because it's the first time
-
BnF and Abes collaborate
at such a level.
-
Giving you a quick view
about where we are--
-
where we've come from
and where we are now.
-
We have been working
on this project since 2017.
-
We've benchmarked
other similar initiatives,
-
and came to the conclusion last year
-
that there was a strong interest
in Wikibase as the FNE's backbone.
-
We were considering it a good solution
-
to build upon, but we still
had doubts at this time,
-
because we have specific needs to fulfill.
-
So we decided to launch,
to spend this year
-
to build a proof of concept with real data
-
both from BnF catalog,
authority catalog, and our catalogs.
-
And well, try to merge this data
into a Wikibase,
-
and to try to see how they behave
-
and how the tool can fulfill our needs.
-
And we were helped
in this proof of concept
-
by Maxime and Vincent
from Inventaire.io,
-
who helped us have a better idea
-
about what Wikibase can bring us.
-
And Anila will talk
about the first findings.
-
So, while this decision to go
-
with experiments with the Wikibase
-
as the technical infrastructure backbone
-
or the basic layer for our FNE
-
was because it's not trivial
to move from one system to another,
-
and because the initiative
of using the Wikibase
-
as the technical infrastructure
for our data--
-
it was both--
-
means that we move from our classical
-
system information
-
or library information system
to quite another thing.
-
And so, we needed to experiment first,
-
and just to see whether a set
of functionalities that are--
-
that we usually need to perform
and fulfill in our environment--
-
professional environment.
-
I'm talking here about creating
and maintaining,
-
and not publishing,
which is a big difference.
-
You were at the session,
the previous session,
-
with just Wikidata Commons,
-
contribution strategies for GLAM--
-
it was about publication
and ways about creation in itself.
-
So, we need to go step by step,
-
and that's why we conducted
this experiment, this proof of concept.
-
And, good surprise, no major obstacle
to ingest library data
-
according to a specific ontology,
which is, while we--
-
I briefly mentioned that we put their data
in two different flavors of MARC,
-
then we defined
some [inaudible] properties
-
in order to be able to experiment
with merging the data,
-
and there was no major obstacle
from the technical point of view.
-
Of course, we came up with a confirmation
-
that Wikibase does offer built-in features
-
that could be used as the basis
for the technical infrastructure for FNE.
-
But again, the decision is not yet made,
-
because the experiment is still--
-
let's say, the developments
have been completed.
-
Now, we're in the phase of writing
the final conclusions,
-
and the decision is not yet made
from the strategic point of view,
-
but these are really the first findings
we can talk about.
-
And Wikibase-- it appears to us
-
that a Wikibase might be
a good operational solution
-
for managing this initiative--
that is jointly, collaboratively,
-
create these entities, these things,
-
to remind you of the opposition,
which is things and strings.
-
However, we noticed there are gaps.
-
Within the specific needs
of our specific institutions,
-
there are defined communities
with their own culture, practices and,
-
well, it is certain processes
that are inherent to the libraries,
-
and the solution offered by Wikibase,
for example, the search.
-
I mean, from the professional standpoint,
-
not only from this end-user standpoint,
-
but professional, we need some indexes
-
in order to ensure
data quality, data curation,
-
and it is very important
for the professional,
-
and Wikibase with its Elasticsearch
-
and CirrusSearch doesn't offer that.
-
But still areas of investigation there.
-
The roles-- how are the roles managed?
-
The bureaucrat, the patrolling of--
-
it's not exactly what happened
in our world.
-
Although there is a layer
that can be used,
-
upon which we can build
other roles that are more in compliance
-
with our way of managing the data.
-
Or different constraints,
constraints related to data publication,
-
or data-- there's an error there
we need to correct.
-
Data policy-- okay, thank you.
-
So, there are things that need to be--
-
other layers, bricks,
need to be built upon Wikibase.
-
And of course, one of the reasons,
the major reasons,
-
the reason why we are here with you,
-
is that we-- we are willing,
and we feel the necessity
-
to be part of a community
sharing the same concerns.
-
And we all know, given the program,
-
that libraries and GLAMs
-
are heavily represented in this event.
-
So, I think-- we think that maybe
-
in a couple of weeks,
-
or next year, we will be able
to communicate more openly
-
on our decision to go forward
with this solution.
-
Thank you.
-
Thank you so much.
-
(applause)
-
So, we will have short
presentations first,
-
and we will all return on stage
-
for questions, if we have
the time for that.
-
But yeah, we heard something from France.
-
There's another project.
-
It's not Fichier National d'Ent--
-
(jokingly struggles with name)
-
But it's Gemeinsame Normdatei,
-
the universal authority file
-
for the German-speaking world.
-
And I'm so happy to have good friends
of the Wikimedia movement here.
-
Barbara Fischer and Sarah Hartmann.
-
Thanks a lot for the invitation
to talk about our project,
-
which is called GND meets Wikibase.
-
And it's a joint project
of Wikimedia Deutschland,
-
and the GND.
-
And we'd like to give you
a quick overview,
-
as Jens said before,
there are just 10 minutes.
-
Why we go for that approach
to evaluate Wikibase,
-
if it fulfills the requirements
for managing authority data
-
on a collaborative level, I would say.
-
So, where do we come from,
and what's the idea of authority control?
-
And GND, which stands for
Gemeinsame Normdatei,
-
what's the idea of it?
-
And yeah, where do we come from,
as I said before.
-
It's not that different
from what Anila and Ben said,
-
just a few seconds ago.
-
The GND is used
for the description of resources,
-
such as publications,
and objects, for example,
-
and in order to enable
accurate data retrieval,
-
I would say, the GND provides
unambiguous and distinct entities
-
for that retrieval.
-
And so, there are persistent identifiers,
as well, as you all know,
-
for identification and reference
for these entities.
-
The authority file is used
by mainly libraries,
-
we would say,
in the German-speaking countries,
-
but a few other institutions
from the cultural heritage domain,
-
are using the authority file already.
-
And all in all there are
around about 60 million records,
-
and in Wikibase, we would say "items,"
-
which refer to persons, names of persons,
-
corporate bodies, for example,
geographic names, and works.
-
And the GND is run cooperatively
by so-called GND agencies,
-
and at the moment, there are
around about 1,000 institutions
-
who are active users of the GND--
that means they establish new records
-
and edit records or items
on a regular basis.
-
And the most important thing, I would say,
-
is that the GND data
is provided free of charge
-
under CC0 conditions,
-
and that all the APIs
and documentation are open as well.
-
Yeah, talking about open--
-
that's the point,
and the crucial one here--
-
at the moment, our challenge is
to open up the GND
-
for other GLAM institutions
and institutions from the science domain.
-
At the moment, it's really focused
on the library sector.
-
That means that the handy tool
of librarians has to evolve
-
into a tool that is used
and accepted across domains.
-
And that means a lot of work
on organizational stuff,
-
community building, discussions
about the current data model,
-
and infrastructural and technical issues.
-
And, yeah.
-
Talking about the infrastructural issues,
-
we came up with the idea
to become partners in crime
-
with Wikibase, I would say,
as we have much the same aims,
-
namely to make cultural data
more accessible and interoperable.
-
And therefore we now
evaluate the software,
-
which was originally conceived
for a sole application, Wikidata,
-
if it's sufficient for managing
authority data.
-
Right-- hi from my side as well.
-
We're focusing in our evaluation
[inaudible] we do commonly
-
with Wikimedia Deutschland.
-
First of all, if Wikibase meets
the requirements
-
of GLAM institutions, galleries,
libraries, archives, and museums,
-
to drive collaboratively
an authority file,
-
which is like our basic question.
-
We also would like to see
Wikibase to increase usability
-
as the software system
we're using right now
-
is, let's say, quite a complex software
-
that is not as handy
as you might like it to be.
-
Well, and then, we would like to know
-
if Wikibase would also ease
both data linking
-
and growing a diverse community.
-
As Sarah said before, we are right now
in a process of opening up
-
towards a broader scope
of GLAM institutions,
-
and science institutions.
-
And of course, they are working
within their own software structures,
-
and we would like to know
if Wikibase would ease
-
the cooperation-- collaboration with us.
-
So, why do we do that?
-
This is because we consider that Wikibase
-
might be the attractive community zone,
-
which means--I had to write that down--
-
first of all, as it is open source,
it will be more accessible
-
than any proprietary source
software system that is used
-
in the cataloging fields
of the GLAM institutions.
-
Then, we feel that the Wikibase community
-
already by now
is a very dedicated community,
-
and we would like to participate
in that dedicated community,
-
because we believe that sharing is caring.
-
What we want to share
is our knowledge is your knowledge,
-
and together, in order to avoid redundancy,
-
not by editing the same information
over and over again,
-
but reuse data, link it,
-
quoting it, and enriching it.
-
And I placed here on the picture
one of the tools
-
that is broadly spread within Wikidata,
this Histropedia,
-
because we also feel that if we are able
to introduce our data into Wikibase,
-
we might be able to share tools,
improving the code,
-
and thus being an active,
contributing part of the community.
-
Thank you.
-
I'd like to debate that with you later on.
-
Thank you so much.
-
(applause)
-
Thank you so much.
-
So, at some point,
we ask ourselves, did we--
-
by accident, write a library software?
-
Because the adoption of Wikibase
in the library fields is so overwhelming.
-
But there's more to it.
-
And of course, we didn't
accidentally write a library system.
-
It can be used for other fields as well.
-
For instance, for biology.
-
And David Fichtmueller will tell us
about using Wikibase
-
as a platform for biodiversity.
-
- I think that was grayed.
- Yeah.
-
Full screen? Oh, okay.
-
Yes. Hello, everybody.
-
I'm David, and I work
at the Botanic Garden,
-
Botanical Museum here in Berlin.
-
And I work there as a computer scientist.
-
We have an entire department
called Biodiversity Informatics.
-
Generally speaking, we write the software
-
that biologists use in their daily work.
-
And on my private side,
-
I've been a Wikipedia contributor
for almost 15 years now,
-
and Wikidata contributor
for almost five years now.
-
And also, as part of my job,
-
I'm a co-administrator of a MediaWiki farm
-
with more than 80 wikis
regarding the biology community.
-
And a couple of years ago,
I was assigned to a project
-
that was, yeah, about working
on a standard.
-
In particular, it's a standard
called ABCD,
-
that we needed to do some work on.
-
And I assume most of you
haven't heard about ABCD,
-
that's not really a bad thing.
-
It's really specific.
-
It stands for Access to Biological
Collection Data.
-
And it's an XML schema.
-
So, it can express
biological information,
-
particular things like information
about herbarium sheets,
-
about collections, like fish in
alcohol jars, or--
-
but also observations--
-
scientists being out in the field,
seeing certain plants,
-
seeing certain animals.
-
A lot of variety in here,
and because of this,
-
it's quite a huge standard.
-
So, we have 1,800
different concepts in there.
-
That's counting the different XPaths
there are within the file.
-
And so the challenge was to convert this
-
into a new modern semantic standard.
-
We wanted to use an OWL ontology
-
that is able to express
the same kind of information
-
that has previously been expressed
with the XML files,
-
and also keep all the existing
documentation,
-
and restrictions,
and all of the connections
-
between the items
-
and have a collaborative platform
-
where other scientists can come in
and give us advice
-
on their specific fields of focus.
-
Did we model this correctly?
-
Is there anything missing?
-
So, yeah, with all of this in mind,
we went looking around,
-
and found a solution, and I guess
it wouldn't surprise anybody here,
-
it's Wikibase, otherwise
I wouldn't have been talking here.
-
So, we decided on using Wikibase.
-
And we started to install it
without the Docker Image.
-
Big mistake. Don't do this.
-
(laughter)
-
In our defense, we started this
two and a half years ago.
-
And it was two years ago
at the WikidataCon
-
that the Docker Image was first released.
-
So, we had to figure out our own way.
-
And once we had things up and running,
-
we didn't really want to break
things by changing them.
-
We do have the Docker installed
for the Query Service,
-
and we have a weird, hybrid
of custom installation
-
and Docker installation
and modified scripts
-
connecting those two instances.
-
We then installed
QuickStatements, again, manually,
-
because by that time, it wasn't part
of the Docker Image,
-
did some slight modifications,
and adjustments to get it to work.
-
I know it's now part
of the Docker Image.
-
But yeah, we had it running,
so, we didn't bother changing it.
-
Keep this in mind for later on.
-
But before I go into what we did,
-
I'm going to avoid
a possible confusion here,
-
because we're talking
about data standards,
-
and when we express things
in a semantic way,
-
we will convert the concepts
from the XML into Classes and Properties.
-
So, this being Object Properties
connecting the different classes,
-
and Datatype Properties
that actually contain the content,
-
that is to store text, numbers,
things like that.
-
And we express all of this
within Wikibase,
-
but all of those are items in Wikibase.
-
And they are then described
using Wikibase Properties.
-
So, we have ABCD properties
being items being described
-
as Wikibase Properties.
-
I try to make sure to use
the prefixes accordingly,
-
so you know what I'm talking about
when I talk about properties
-
in this talk.
-
So, let's look at the properties,
-
in particular, with Wikibase Properties.
-
We sat down and thought,
"Okay, what do we need
-
to describe the concepts
we want to model?"
-
And we ended up using around 25 properties
-
in addition to, of course, label,
description, alias.
-
I'm not going to mention all of them,
-
just so you see the variety.
-
Those fulfill our requirements.
-
And yeah, some things
express some restrictions,
-
and others--
-
Most of them are optional.
-
Only very few are mandatory.
-
So then, we set about importing
all of this information.
-
We wrote a Schema Parser that extracts
all of the different concepts.
-
So everything that has an XPath
within the XML Schema,
-
and all of the documentation
that is part of the XML schema,
-
and so we got this into a nice CSV file,
-
and then we could work on this
and import it using QuickStatements.
-
Worked quite well.
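-
(Editor's sketch, not spoken in the talk: the CSV-to-QuickStatements step described above might look roughly like the following. The column names `xpath`, `label`, and `description`, and the property `P1` holding the XPath, are assumptions made for illustration; the project's actual CSV layout and property numbers are not given. The output uses the tab-separated QuickStatements V1 command format.)

```python
import csv
import io


def rows_to_quickstatements(csv_text, xpath_property="P1"):
    """Turn CSV rows into QuickStatements V1 commands, one new item per row.

    Assumed (hypothetical) columns: xpath, label, description.
    xpath_property is a placeholder for a string property on the target Wikibase.
    Values containing double quotes are not escaped in this sketch.
    """
    commands = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        commands.append("CREATE")  # start a new item
        commands.append(f'LAST\tLen\t"{row["label"]}"')  # English label
        commands.append(f'LAST\tDen\t"{row["description"]}"')  # English description
        commands.append(f'LAST\t{xpath_property}\t"{row["xpath"]}"')  # string statement
    return "\n".join(commands)


sample = "xpath,label,description\n/DataSets/DataSet,DataSet,A data set\n"
print(rows_to_quickstatements(sample))
```

The resulting text can be pasted into a QuickStatements batch; each `CREATE` line starts a new item and the `LAST` lines attach label, description, and statement to it.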
-
But then, we had, as I said,
1,800-plus concepts
-
in our Wikibase instance.
-
But then, when we had things like person--
-
person name, and contact email--
-
those appear a couple of times
within the schema--
-
for the data set owner, for the person
who took an image, things like that.
-
So, of course, we needed to reduce those,
-
and combine those to reusable classes.
-
So, there was a lot of manual editing
-
to reduce the number of concepts,
-
and in the end, we ended up
with a little more than 500.
-
So, we have Classes, Object Properties,
Datatype Properties,
-
a couple of other ones I'm skipping
-
to avoid additional complexity here.
-
And for certain large-scale edits,
we also used QuickStatements again.
-
So now, we did all of the editing,
-
now we wanted to make sure
that the data we have
-
is actually consistent.
-
So, that's where we used what we call
Maintenance Queries,
-
used the query interface
with some SPARQL queries,
-
basically to check for missing properties,
-
wrong links between concepts,
-
basically, things that didn't match
-
with our concept, with our structure.
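-
(Editor's sketch, not spoken in the talk: a maintenance query that checks for a missing property could be generated along these lines. The base URL `http://example.org`, the "instance of" property `P2`, and the example IDs are hypothetical; a self-hosted Wikibase uses its own entity prefixes and its own property numbers.)

```python
def missing_property_query(class_qid, required_pid,
                           instance_of_pid="P2",
                           base="http://example.org"):
    """Build a SPARQL query listing items of a class that lack a required property.

    All IDs and the base URL are placeholders for illustration only.
    """
    return (
        f"PREFIX wd: <{base}/entity/>\n"
        f"PREFIX wdt: <{base}/prop/direct/>\n"
        "SELECT ?item WHERE {\n"
        f"  ?item wdt:{instance_of_pid} wd:{class_qid} .\n"
        f"  FILTER NOT EXISTS {{ ?item wdt:{required_pid} ?value . }}\n"
        "}"
    )


print(missing_property_query("Q10", "P7"))
```

Running the generated query in the query interface would return every item of the given class that has no value for the required property, which is one way to catch the "missing properties" case mentioned above.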
-
And in the end, we also had to do
-
a manual review of all of the concepts
-
just to make sure we didn't miss anything.
-
This was kind of a lot of work,
-
because if you only take
like five minutes per item,
-
multiply it by 550,
-
it's over one week of full
and concentrated work.
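-
(Editor's sketch, not spoken in the talk: the estimate above can be checked with a few lines of arithmetic. The 40-hour working week is an assumption.)

```python
items = 550
minutes_per_item = 5
hours_per_week = 40  # assumed length of a full working week

total_hours = items * minutes_per_item / 60
work_weeks = total_hours / hours_per_week

print(f"{total_hours:.1f} hours, about {work_weeks:.2f} working weeks")
```

At five minutes per item this comes to a little under 46 hours, which is indeed more than one 40-hour week.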
-
But of course, we don't need five minutes,
-
because you sometimes spend
like half an hour to fix a certain item
-
when there's problems with the modeling.
-
So, we now had all of the data.
-
Now, it was time to get the data
out of Wikibase.
-
We wrote an export script in Python
that uses the Query Service
-
to get the information about the concepts,
-
and fill them in templates--
prepared templates.
-
So, in the end, we get
a nice valid OWL file
-
that contains everything we need.
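-
(Editor's sketch, not spoken in the talk: filling a prepared template with one query result might look like this. The template text, the field names `iri`, `label`, and `comment`, and the example IRI are assumptions; the project's real templates and export script are not shown in the talk. Fetching the results from the Query Service is left out so that the example runs offline.)

```python
from string import Template
from xml.sax.saxutils import escape

# Hypothetical RDF/XML fragment for one OWL class.
CLASS_TEMPLATE = Template(
    '<owl:Class rdf:about="$iri">\n'
    '  <rdfs:label xml:lang="en">$label</rdfs:label>\n'
    '  <rdfs:comment xml:lang="en">$comment</rdfs:comment>\n'
    '</owl:Class>'
)


def render_class(binding):
    """Fill the class template from one result row, escaping XML special characters."""
    return CLASS_TEMPLATE.substitute(
        iri=escape(binding["iri"], {'"': "&quot;"}),
        label=escape(binding["label"]),
        comment=escape(binding["comment"]),
    )


row = {
    "iri": "http://example.org/abcd/Unit",
    "label": "Unit",
    "comment": "A specimen or observation record",
}
print(render_class(row))
```

Concatenating one such fragment per concept inside an `rdf:RDF` wrapper with the matching namespace declarations would give a complete file; that wrapper is omitted here.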
-
And this is the actual basis
of the standard.
-
For future versions,
when we're going to make revisions,
-
the Wikibase is our working platform.
-
And once we do an export,
this is the new version of the standard.
-
Keeping those separate,
this would also allow us
-
to move the server
to a different instance,
-
or as I said, change the installation.
-
We export JSON
for the documentation of the website.
-
And we also export the data
to a second Wikibase instance.
-
This is like really
experimental, right now.
-
We haven't really used this
in production where it can--
-
where the concepts can then be used
to describe actual data.
-
So we're breaking down those--
-
we're taking them a step down
from properties being Wikibase items,
-
and converting them into actual
Wikibase properties.
-
This is quite a lot of requests--
quite a lot of steps
-
to keep all of the data
and all of the linking consistent,
-
but it works.
-
And in the end, well,
it was quite successful.
-
There is a huge community--
-
there is a community about
Biodiversity Information Standards,
-
who also had their annual meeting
just in the past days.
-
So, there's a huge interest
in reusing this approach
-
for other standards, as well.
-
And so, in the future,
-
we want to try a bit
about Shape Expressions--
-
as I said, we have some restrictions
in there to export them--
-
and build some better workflows
for the versioning.
-
We haven't done this yet.
-
And switch to the Docker instance.
-
So, at the end, I'm going to have
a small wish list--
-
what things could be improved.
-
Well, there are a lot more tools
out there that are really written
-
for Wikidata, but could be more agnostic,
-
in particular, QuickStatements.
-
As I said, I did
some adjustments manually.
-
Many of the issues I had
are probably solved by now,
-
but I don't think all of them.
-
Then we want to import existing templates,
-
or the SPARQL template,
the Q and the P template.
-
They are really useful
when working with Wikibase.
-
So, this would be done automatically.
-
And as I said, we did a lot
of manual editing.
-
So, it would be useful,
just ideal to have a tool where you can--
-
Like in an Excel table--
-
you load a couple of items,
and you load a couple of properties,
-
and then just jump from cell to cell,
-
really quickly edit a lot of things
-
in a semi-automated way.
-
Thanks. That's the end.
-
Thank you so much.
-
(applause)
-
So much to talk about on this.
-
So, there is not only--
well, how do I get back from here.
-
It's not only about science.
It's not only about libraries.
-
You can also create
art and beauty with Wikibase.
-
And who would be better to tell us
about this than Stuart Prior.
-
Now, slightly embarrassingly,
we talk about art and beauty,
-
but this is a really ugly presentation.
-
(laughter)
-
Starting off with a room
full of Wikimedians,
-
trains--people like trains.
-
But it has a purpose.
-
So, this is Hackney Downs Station
in Northeast London.
-
And this is about
Banner Repeater and Wikibase,
-
which I'll explain further.
-
So, this is a terrible photo.
-
But it is actually where
an artists' publishing archive is held,
-
which is on the platform
of a train station.
-
Within there, they've got
several hundred copies
-
of various types of artists' publishing.
-
They get a lot of public footfall.
-
It does a lot of outreach
to actual general public.
-
Like you get on the train,
-
you'll find bits of
sort of obscure art on the train.
-
So, it's a really interesting project,
-
but part of a much wider community.
-
So, what is Artists' Publishing?
What are Artists' Books?
-
Like, I didn't know either.
-
So, the definition,
according to Wikipedia,
-
is "Artists' books are works of art
that utilize the form of the book."
-
Well, you can read it.
-
But it's individual pieces of art,
-
or sometimes collections of art,
using publishing as a medium.
-
This varies quite a lot.
It's very interesting.
-
It was kind of--
-
There was a lot of it
in the early '20s and '30s,
-
and it had a bit of a renaissance,
'60s and '70s,
-
and continues to expand.
-
Has a large global community,
multilingual,
-
somewhat separate from large
institutional art institutions.
-
So, you'll find collections,
-
such as the V&A
has a collection, obviously.
-
So, they've got various kinds
of items such as these.
-
This is just an article,
so it's just not the best display.
-
But it's a really kind of interesting,
yet slightly niche field of work.
-
But it's not very good on Wikidata.
-
This is, again, a really terrible photo--
it's not my photo--
-
of some of the stuff held
in Banner Repeater's archive.
-
If you see in the middle,
the pink one, Blast,
-
that's actually a fairly notable
piece of artists' publishing
-
from the '20s.
-
What does it look like on Wikidata?
-
It's not good on Wikidata.
-
It's often just confused with books
-
or other forms of publishing.
-
The average kind of Wikidata item for
-
a notable piece of artists' publishing
-
doesn't really have much to say about it.
-
You know, it's just--
there you go, that's it.
-
There's not a huge amount
of identifier numbers as well.
-
So, there's clearly a lot missing
-
when it comes to artists' publishing,
-
certainly compared
to more traditional forms of art--
-
paintings and sculpture and so forth.
-
And there's a huge desire
within the community
-
to start codifying this,
and making it a real thing.
-
So, I'll give you an example
of what is actually available.
-
You can point out what's wrong
with this query.
-
So, this is basically all there is.
-
That's every artists' book on Wikidata.
-
So, there's really not a lot.
-
Some of them don't even
have labels for a start.
-
And it's something
that really needs expanding.
-
And something that has capacity
to be expanded.
-
Has anyone seen what's wrong
with this query yet?
-
The labels-- the labels say "sausage",
-
because I just stole
someone else's query,
-
and changed the Q number.
-
(laughter)
-
It's actually a query about sausages.
-
Anyway, moving on.
-
But yeah, you see it doesn't really have
much of a presence.
-
We were approached by Banner Repeater.
-
So, I work with Wikimedia UK.
-
We were approached by Banner Repeater
to help them with this--
-
with setting up a Wikibase--
-
in terms of funding,
in getting extra funding,
-
but also in terms of bringing in
a wider community,
-
and being part of the process.
-
So, the process is basically
to gather this community
-
of artists, archivists,
and linked data experts,
-
and work out what the schema,
the data model,
-
for artists' publishing should be.
-
It's a very specialized field.
-
Doesn't really map
onto Wikidata perfectly.
-
It's probably too granular for it.
-
And the other thing
is the kind of flexibility of it.
-
Maybe it doesn't really fit in Wikidata.
-
Maybe it's too rigid at the moment.
-
The Wikibase is being built,
-
so I haven't got much to show you,
because it's not been built yet,
-
but this is more about the process.
-
And the process is extensive
community consultation,
-
a few kind of layers of it.
-
So, we're not just going
to do this in one session.
-
It's not a few individuals deciding.
-
It's kind of ongoing,
and ongoing, and ongoing.
-
The impact of this
could be fairly substantial,
-
because no one else is doing this work.
-
A lot of the larger institutions
have artists' publishing
-
sitting in their kind of back room.
-
They don't really know
how to categorize it.
-
They haven't categorized it very well.
-
They're not very interested in it.
-
But there is a huge community
that is interested in doing this.
-
So, this is basically
the process at the moment.
-
So, the initial workshop has happened.
-
So, it was an expert workshop
with some people
-
deep in the field of artists' publishing--
-
archivists, people
who own collections, and so forth--
-
to establish a kind of
basic set of priors,
-
to look at what things were existing,
-
what the existing status was on Wikidata,
-
and look at how that
could be expanded or improved.
-
And then they documented that,
-
and established this basic structure.
-
And now, we move into the next process
-
where it's bringing in
a much wider community.
-
So that's-- it's not just data people,
it's creators, as well.
-
There'll be a lot of narrative in this,
-
and a lot of qualitative things.
-
Again, stuff that just
doesn't really belong on Wikidata.
-
But also working with archivists,
-
and working with linked
data experts, and so forth,
-
to hopefully bring this all together,
-
to create a resource that will have
a nice accessible front end,
-
and also build this community--
people who can contribute to it,
-
and kind of own this data set.
-
I'll show you what we've got ready.
-
This is subject to change.
-
But this is basically kind of
where we've got so far
-
with the expert ones.
-
So, you see different P numbers
being developed,
-
and look at what
their equivalent on Wikidata is.
-
And obviously, it's a lot more granular
-
than probably the information
on Wikidata is at the moment, so--
-
There's a lot of detailed stuff,
so there's qualities
-
such as height, width,
thickness, and so forth,
-
which aren't necessarily that present
-
on other groups
of artists' publishing on Wikidata.
-
But there's also other things like
"commissioned by", and "contributors to",
-
and a lot of these works
will have multiple contributors.
-
And multiple editions
and things like that.
-
There's really a lot
of granular information
-
that can come about these things.
-
And a lot of narrative as well, you know,
-
as things have changed over time,
-
as people have reinterpreted things.
-
And this was what was created.
-
Again, most of it has
Wikidata equivalents,
-
but some of it doesn't yet.
-
So, what do we have here.
-
Other editions, and things like that.
-
So, it's fairly specialized.
-
This is the first stage.
-
And this will go through another process,
-
as people take things away from it
or contribute to it.
-
The flexibility is really
important in this.
-
It's kind of getting away
from older kind of standards,
-
and moving to something
which is a bit more up-to-date,
-
and something where the community
can really change things,
-
and not be dictated to--
and I'll start speaking quicker.
-
So, power dynamics, at the moment,
and why Wikibase.
-
So at the moment, this is the art world.
-
This is what the art world looks like.
-
It's a big orange thing.
-
But you've got these large institutions,
-
and then you've got sort of
groups of artists' publishing.
-
That could be Delhi, Mexico City,
London, and so forth.
-
And what we don't want
is this kind of thing
-
where large institutions and experts
get to dictate
-
the kind of ontology,
and how these things are going to work.
-
So, working to establish a Wikibase
among an artist community
-
can help them work out
what they're going to do,
-
and then they start pushing back
into the larger institutions,
-
with a more kind of flexible data model,
-
with something that's more up-to-date
-
and coming from grassroots organizations,
-
as opposed to coming
from institutions, so to speak.
-
So, I think there's huge value
in this approach
-
in terms of creating
a sort of parallel infrastructure
-
for communities of people
who own content, and so forth,
-
much like Wikimedia is,
-
and kind of pushing out to institutions,
-
rather than doing it the other way around.
-
Do I have another slide?
What next?
-
I always put this slide in,
because it's always the worst slide,
-
and it's such a stereotype.
-
What next? We're moving on
to the community consultation stage,
-
so we'll get a bit more kind of
expansive and interesting.
-
Obviously, this database
will be talking to Wikidata,
-
but on what terms,
we're not 100% sure.
-
But it could be that this becomes very--
-
just a very specific instance
for artists' publishing
-
that Wikidata can draw from,
and vice versa.
-
And I'll just finish off
with that picture again,
-
because I just quite like it.
-
And that's all I have to say.
Thank you.
-
- Thank you so much.
- (applause)
-
We're almost at the end
of our fast-paced ride,
-
and we'll-- what to say?
We saved the best for last?
-
No, but we give the last presentation
-
to someone who's a true pioneer
of using Wikibase
-
in the field of digital humanities.
-
And, yeah-- Olaf Simons.
-
You have not prepared any slides,
but you will do some live action.
-
Exactly.
-
And I have been on Wikipedia
since 2004, actually.
-
That makes 15 years.
-
What am I going to show?
-
I've been congratulated for this.
-
I'm going to show you
the Wikibase instance we created.
-
It's not a Docker image.
-
And I could agree, it's not the best
to have a Docker--
-
it's not the best to have
an independent installation.
-
It's difficult,
-
and it has been extremely
difficult for us,
-
and we're grateful
to Wikimedia Germany
-
for helping us get it done
under a mutual agreement we had.
-
So, basically, we have here
several projects on this.
-
It's more project-oriented than Wikidata.
-
And my thing should be in here.
-
I open that and go--
just should have done that before.
-
Here we are.
-
The history of the Illuminati--
I start with this one.
-
This is a little film
-
which was created
by Paul-Olivier Dehaye,
-
whom I only know from Twitter,
-
as he asked us what kind of experience
-
we had when we got our Wikibase,
-
and he was experimenting with his own.
-
And I talked to him
about things we could do,
-
and things we could not do.
-
This was a film I would love
to be able to do.
-
And he said, "It's easy for me.
-
I can run a SPARQL search,
get the information,
-
and put it into a program,
in which you can then see this thing."
-
It's actually 20 years of research
on the Illuminati,
-
and gives you a short history
of the entire organization
-
and all its correspondences.
-
That's not a Wikimedia tool.
-
It's not a tool of Wikibase.
-
But it's something you can do.
-
And actually, I like it
that it is not a tool already.
-
It should become a tool.
-
I like it because it shows
our data is really free.
-
Someone can download our data,
someone can do something with it,
-
which we hadn't expected,
and it can be done within two hours,
-
if you're bright--
and he is bright, of course.
-
So, he created this for us.
-
I go back to my presentation.
-
Why on Wikibase?
-
This was the immediate question
when we approached Wikimedia.
-
I knew of Wikidata since 2010,
-
and in 2017, it was ready
to be used by us.
-
And there was actually an interest
from Wikimedia people to say,
-
"Do it, and we support you."
-
Why our own base?
-
Basically, it's original research
that we have to do.
-
And the entire installation
is a research tool.
-
It's not only there to take a look
at what we did
-
and for presentation purposes,
-
but actually, I use it every day
for my research.
-
I change dates of documents,
-
and take a look at how things look
when I have changed that.
-
I do a lot with working hypotheses.
-
And we ask projects that have data
to give us their data,
-
and to feed them in,
-
and they can, again, put a label,
-
put an item to their data sets,
-
that says this has been produced
by the following project.
-
Next projects can continue with it.
-
But it's already there as a marker
-
that this is a data set
with work from a certain project.
-
And if you have a project, DFG--
-
DFG-funded, by the German
Research Foundation--
-
if you have a project, you want to show
-
what kind of work you have done.
-
And you can now do a SPARQL search
-
and present your entire group of data sets
-
in the final résumé of your work.
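
A minimal sketch of what such a SPARQL search could look like, assuming each data set carries a "produced by project" statement. The instance URL, the endpoint URL, and the IDs `P100` and `Q200` are placeholders, not the real values of this Wikibase. The `format=json` parameter is also an assumption about the query service.

```python
from urllib.parse import urlencode

# Hypothetical values: none of these come from the talk.
CONCEPT_BASE = "https://wikibase.example.org"          # placeholder instance
SPARQL_ENDPOINT = "https://query.example.org/sparql"   # placeholder endpoint
PRODUCED_BY_PROJECT = "P100"   # assumed "produced by project" property
MY_PROJECT = "Q200"            # assumed item for the research project


def build_project_query(project_qid: str, property_pid: str, limit: int = 100) -> str:
    """Return a SPARQL query listing items marked as produced by one project."""
    return (
        f"PREFIX wd: <{CONCEPT_BASE}/entity/>\n"
        f"PREFIX wdt: <{CONCEPT_BASE}/prop/direct/>\n"
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?item ?label WHERE {\n"
        f"  ?item wdt:{property_pid} wd:{project_qid} .\n"
        '  OPTIONAL { ?item rdfs:label ?label . FILTER(LANG(?label) = "en") }\n'
        "}\n"
        f"LIMIT {limit}"
    )


def build_request_url(query: str) -> str:
    """Encode the query as a GET URL asking for JSON results."""
    return SPARQL_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


query = build_project_query(MY_PROJECT, PRODUCED_BY_PROJECT)
url = build_request_url(query)
print(query)
```

The sketch only builds the query and the request URL. Sending it, and turning the result into a list for a project report, is left out because it depends on the actual instance.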
-
So we get original research,
we identify research,
-
we encourage working hypotheses.
-
This is a working tool,
-
and it's actually quite useful
to start from the beginning,
-
not to present something in the end.
-
But from day one, you work with it,
-
and what you think is
the proper answer to that question,
-
you can put it into Wikibase, and then
-
you can substantiate information
-
until you see this
is the right identification
-
of a person or the right date for a thing
-
which we haven't been able to date so far.
-
So, actually, accumulate work
while you are doing it,
-
use the Wikibase as a kind of tool
-
that is getting you closer
to the final result.
-
Our first meeting took place
on December 1, 2017.
-
And I remember I had
a little challenge for you,
-
and that was a death date--
a date of death for a person--
-
where I wanted to have someone
to show a source for that,
-
and that was extremely difficult,
-
because he had to create the source
-
before he could connect it to that.
-
And in the room, we were--
-
we had the clear idea,
if we do this, we'd do it
-
with the sources already part
of the Wikibase installation we have.
-
And if we have the sources in there--
-
that is, all the early modern books
that have been printed
-
would be the ideal.
-
If we have that in there,
we need the GND in there.
-
And when we heard that the GND people
were on track to test the software,
-
I approached them and asked,
"Wouldn't you like to do this
-
in cooperation with us,
so that we can have your data,
-
which we want to have, anyway,
-
and so that you can see
how it works on a Wikibase?"
-
And this is where we are at the moment.
-
And presently, I would say,
a lot of things,
-
we're not sure how they are done,
-
or at least I am not sure
how they are done.
-
How's the input done, how do you get
from a resource of strings
-
to an item-based resource--
lots of things.
-
And basically, my talk here
is an invitation.
-
Join us.
-
We are still not really part
of the Wikibase community.
-
That doesn't exist.
-
We have a Wikidata community.
-
And lots of things
are taking place in Wikidata,
-
but if I ask for help for a Wikibase
that is not Wikidata,
-
that's a difficult thing.
-
First thing I would say is,
actually, to work with us is cool,
-
because you can grab the data
for Wikidata anytime, any moment, under CC0.
-
So, actually, you can use it
as an incubator of your work,
-
and drag it to Wikidata.
-
And also, we will work with big data,
when we have the GND
-
in there, that will be quite something.
-
So, if you really want the challenge,
-
you can get it also on our platform.
-
And we offer interesting communities.
-
Basically, one of the things
that is different
-
is that all our accounts are real-name
accounts and institutions.
-
So, but that also means you can do things
-
which you couldn't do on Wikidata.
-
You can do your genealogy at our site.
-
We don't mind.
-
It's interesting to have people
getting such data.
-
You can do your city's search--
research, historical research
-
on our platform-- we don't mind.
-
You can do your research on our platform.
-
So, lots of things need to be done.
-
We have immense problems
running the database.
-
It was implemented by Wikimedia,
-
but now, we see lots of things
don't really work.
-
We can't really fix that.
-
It's extremely difficult to get help
-
to run the database,
to update the database,
-
to solve little technical problems,
-
which we face as soon as we run
an instance outside Wikidata.
-
Like getting the direct
GND link is difficult.
-
It works on Wikidata,
it doesn't work on our instance.
-
Getting images from Wikimedia Commons
-
on our Wikibase is not that easy.
-
Lots of little things still remain.
-
So, actually, this is an invitation.
-
If you want to join us
on the mass input, do that.
-
Approach us.
-
If you want to help us
with technical things,
-
this is highly welcome.
-
And then, we need tools.
-
You saw the tool we had in the beginning.
-
Actually, it's not that difficult
to get such tools.
-
I saw what kind of query you do
to get such a visualization,
-
and once you have it,
you should be able to modify it easily.
-
These tools are extremely precious
-
in our community
of digital humanities projects.
-
And there are little companies
that create these tools,
-
again, and again, and again,
and get money for that.
-
I would love to have these tools
just once and for all free
-
and on the market and working
with a Wikibase instance.
-
So, anyone who is interested
in developing tools,
-
approach us, and we have plenty of ideas
-
of what visualizations
historians would love to see,
-
and that should be done.
-
So, basically, lots of things,
like, still remain.
-
I've got one minute.
I don't need that one minute.
-
And you're putting pressure on me.
-
(person) Give it to the audience.
-
I give the minute to the audience.
-
Yeah. Thank you so much.
-
And maybe you want to sit down,
-
because I would like everyone
to join me back on stage.
-
And we can have a round of questions.
-
I really like that we ended
with an invitation,
-
because this is what this is now.
-
You are invited to ask questions.
-
You are also invited to join us tomorrow
at the Wikibase meetup.
-
If you are-- if you have some idea
-
for an awesome Wikibase installation,
-
for your institution, for your hobby,
for changing the world--
-
please come and join us,
we will meet up, and--
-
There's some complication
with the chairs.
-
Well, let's stand up. Okay.
-
I think we have another microphone, here.
-
(person) I have the microphone
for the questions.
-
Okay. So--
-
Thank you to the presenters.
-
And meet us at the Wikibase meetup,
-
and now, I can't wait to hear
your questions to the panel.
-
(person) Who's the first?
-
(person) Hi. I will be talking
in the lightning session, too,
-
about geosciences, and how in geosciences,
-
there are many data repositories
that have collected
-
and shared data with the community
-
for years, for decades in some cases.
-
And they curate the data set,
their schemas evolve continuously,
-
they get a lot of feedback
from the community.
-
All they desire is to organize
the community,
-
to enable the growth
of these repositories.
-
So, they don't necessarily desire
to put all their content in Wikidata
-
and lose control over it.
-
They offer a tremendous service
curating this content.
-
So, I just wanted to point out
that some of the requirements
-
and needs that have been voiced
by the panelists
-
appear in my communities.
-
And my question is, how do you mix
or maintain control
-
over those schemas, over the standards,
-
while allowing the community
to continue to introduce feedback
-
and have more of this crowdsourcing
spirit that Wikidata has?
-
I think everyone could answer that,
but maybe David, you want to start?
-
I'm not sure whether I'm the right
person to answer this,
-
because in our use case--
-
in terms of data modeling,
-
it's really a narrow set of people
who actually do the work.
-
We contact experts
for the relevant segments,
-
and some of them could contribute,
but for the current iteration,
-
it was only me and two colleagues
who actually worked on it.
-
So, we want to have this option,
that we get experts in,
-
but it's always in close
collaboration with us,
-
so that we don't really have to worry
-
about the problem of crowdsourcing.
-
Being part of the Wikimedia community,
-
I would say, I would not be that worried.
-
95% of the edits are good edits,
and improving things--more than that.
-
As soon as we have an instance
that is actually closed--
-
where I offer accounts under real names,
-
that's an additional hurdle
that no fool is going to go over.
-
People are required on our instance
to offer an address, on page--
-
not to me, but on page--
-
and this is something only
institutions usually do,
-
or private people that say,
-
"Okay, I'm a private person.
I love this research.
-
This is my personal field.
I give you my address."
-
And this is a thing that puts off every--
-
any vandal who wants to destroy Wikidata.
-
So, you can close the system, but then,
-
you are not really part
of the same flowing community.
-
But again, I would say, if you go to CC0,
-
then you can open up,
you can be the incubator
-
where people do the research,
and then it goes out to the community.
-
But it's an invitation--
use maybe closed works,
-
and use an instance where
you work together with people you like.
-
Well, I think that--
-
I don't think that it's only my opinion--
-
it is that there are different perspectives,
-
and it will be hard to reconcile
all perspectives and say,
-
"Wikidata is the solution
for the entire world to go into."
-
I don't say by this that Wikidata
is not a solution,
-
but there are different perspectives,
there are different needs.
-
The world is-- really, there is
a large variety of needs,
-
of professional perspectives,
that you cannot reconcile
-
in a unique worldwide database.
-
So, I think that both are--
-
The trickiest thing is how to reconcile
-
and find angles of dialogue
between these two large families
-
of needs and perspectives.
-
If there are more questions,
-
I would rather like to go
to more questions.
-
Anybody else?
-
If not, while you're thinking
about your questions--
-
I would just like to say
that's one of the reasons
-
why we consider Wikibase,
-
because we believe that adding,
editing information
-
within the Wikibase instance,
where you have rights and roles,
-
as you have in Wikidata,
gives us the opportunity
-
to share that information
with the information in Wikidata
-
in an easier way,
a more convenient way
-
than if we try to build these bridges
between our authority file
-
and Wikidata at the moment.
-
(person) So, I find it quite exciting
-
hearing about how
you're energizing communities
-
to find their own ways for data modeling,
-
and that you can put into Wikibase.
-
Will you-- I'm just thinking
of Stuart Prior's community,
-
but also some of the others--
-
be trying to feed the approaches
-
that you, as a community,
decide work back to Wikidata,
-
to say, "We've done artists' books,
-
we've thrashed through several iterations,
-
this is what we found really worked,
-
and the properties that you should have
-
or revisions you should make
to the Wikidata data model"?
-
Good question. Very short answer.
-
It's an interesting question.
-
I don't know whether this is a model
-
that's going to work for other types.
-
I hope it is.
-
But it's a difficult one, the question
-
of whether the Wikidata community
accepts the kind of authority
-
of a separate community that goes off
and does the work on its own.
-
But I would certainly hope
-
that it's a way of people
feeding back into this process,
-
without necessarily needing to go
onto Wikidata and do it.
-
Well, I would say, grab it.
-
Grab it if it's convenient, take it,
and take a look at how it works
-
in the other instance.
-
And if you feel like
this is a cool property
-
to do certain searches,
then that will be adopted,
-
that will be flowing.
-
I wouldn't think
of authorities doing this.
-
(person) Coming from
a Wikidata user perspective,
-
the great thing you're doing
is showing you've established code
-
that works and runs.
-
You've established a data model
that people can see,
-
is implementable, and works.
-
And so, in the open source community,
-
you know, show us the code.
-
You can do that.
-
And that's why I think it's very exciting
to have these branches
-
that can then fold it back
for data modeling.
-
Yeah, thank you.
-
I think that is exactly the point.
-
I also like the verb
that you used-- energize.
-
This is exactly what we want to do.
-
Energize, as in Star Trek.
-
Yeah, this panel comes to an end.
-
And if you have any more questions
-
on all these Wikibase projects, talk.
-
- Please come tomorrow.
- Have conversations.
-
This is what this conference is about.
-
Thank you very much.
-
(applause)