Willkommen, Bienvenue-- Welcome.
I always wanted to say that on a stage.
(laughter)
This is going to be inspirational,
because this is the official
Wikibase inspiration panel
of WikidataCon 2019.
The point of this panel
is to be inspired by all the things
that people, in various countries,
in various fields, do with Wikibase,
the software behind Wikidata.
I was really surprised today
when someone came to me and said,
"I learned about Wikibase
for the first time today."
So, it is the software that runs Wikidata.
And if you want
to order things in the world
the way Wikidata orders things
in the world,
but you don't agree with the items
that we have in there,
because you might need
a finer level of granularity,
or maybe you don't want to start
with Q1, which is the universe,
because in your little world,
Q1 could be a book, if you are a library,
or it could be some kind of animal,
if you work in biology,
or it could be a historic person,
if you do digital humanities,
but you still want
the same system of ordering,
then Wikibase is the thing for you.
Over the last one or two years,
we have made contact
with extraordinary people,
who are pioneers, who are trailblazing,
who are evaluating Wikibase,
and who are doing
extremely great stuff with that.
This panel is going to be very rushed.
Every one of the participants
of this panel would have deserved
a one-hour slot to present their thing.
But our program is packed.
So, yeah, keep your seat belt fastened
for a fast-paced ride
through the inspirational
world of Wikibases.
And the first one is a project
from two organizations,
which is a little sensation in itself.
The Bibliothèque nationale de France,
the French National Library,
and Abes, the bibliographic agency
for higher education.
But I think you will talk about that
more in your presentation,
and yeah, we'd like to welcome
Anila Angjeli and Benjamin Bober
on stage for the first
ten minutes of inspiration.
(applause)
Hi, everybody.
So, yeah, my name is Benjamin Bober.
So, I work for Abes,
which stands for the Bibliographic Agency
for Higher Education.
Basically, we work with all
the university libraries in France,
and manage the union catalog.
And also their authority files.
And I'm here with Anila Angjeli,
from the BnF,
French National Library.
And we're going to talk to you
about our joint project,
which is about creating
a new production tool
for authorities data--
person, corporate bodies,
concepts, and so on.
And we spent the last months
asking whether Wikibase can do this stuff.
So, I will give you some context
really quickly,
because it's important for us,
as libraries--
There's been this technological
shift recently
with the linked open data movement,
and we wanted, as a bibliographical agency,
to follow this new trend.
And, well, we've spent years
experimenting with linked open data,
with RDF, SPARQL and so on.
But we think that now
is the right time to move forward.
It's also a good time
because there's been a-- not a shift,
there's a fundamental change
in the way we consider
bibliographical data.
We used to, and we still have data
stored in records, we call it MARC records
in the library landscape.
We used a specific format called MARC.
But recently, there has been some way
to think about it
from another point of view.
And to go from a record-based world,
to an entity-based world
when we try to interconnect
people, works,
and other entities.
So, in this context, we decided
to launch this joint initiative.
But our goal is far beyond libraries.
We would like to have with us
other French GLAMS, for instance,
because we think our project
can help them also.
So basically, our project is called
Fichier National d'Entités,
so National Entity Files.
And it will be a shared platform
for collaboratively creating
and maintaining reference
data about entities.
Like I said, persons,
corporate bodies, places, concepts,
and creative works.
So, we embrace a lot of things.
And it's a challenge
because it's the first time
BnF and Abes collaborate
at such a level.
Giving you a quick view
about where we are--
where we've come from
and where we are now.
We have been working
on this project since 2017.
We benchmarked
other similar initiatives,
and came to the conclusion last year
that there was a strong interest
in Wikibase as the FNE's backbone.
We were considering it a good solution
to build upon, but we still
had doubts at this time,
because we have specific needs to fulfill.
So we decided
to spend this year
building a proof of concept with real data
from both the BnF's authority catalog
and our catalogs,
to merge this data
into a Wikibase,
and to see how the data behaves
and how the tool can fulfill our needs.
And we were helped
in this proof of concept
by Maxime and Vincent
from Inventaire.io,
who helped us have a better idea
about what Wikibase can bring us.
And Anila will talk
about the first findings.
So, the decision to experiment
with Wikibase
as the technical infrastructure backbone,
the basic layer for our FNE,
was not trivial,
because it means that we move
from our classical
library information system
to quite another thing.
And so, we needed to experiment first,
to see whether the set
of functionalities
that we usually need to perform
and fulfill in our
professional environment
can be met.
I'm talking here about creating
and maintaining,
and not publishing,
which is a big difference.
You were at the previous session
on Wikidata, Commons,
and contribution strategies for GLAMs--
that was about publication,
not about creation itself.
So, we need to go step by step,
and that's why we conducted
this experiment, this proof of concept.
And, good surprise, there was
no major obstacle
to ingesting library data
according to a specific ontology.
As I briefly mentioned, our data
comes in two different flavors of MARC,
so we defined
some [inaudible] properties
in order to be able to experiment
with merging the data,
and there was no major obstacle
from the technical point of view.
Of course, we came up with a confirmation
that Wikibase does offer built-in features
that could be used as the basis
for the technical infrastructure for FNE.
But again, the decision is not yet made,
because the experiment is still--
let's say, the developments
have been completed.
Now, we're in the phase of writing
the final conclusions,
and the decision is not yet made
from the strategic point of view,
but these are really the first findings
we can talk about.
And Wikibase-- it appears to us
that a Wikibase might be
a good operational solution
for managing this initiative--
that is, jointly and collaboratively
creating these entities, these things--
to remind you of the opposition
between things and strings.
However, we noticed there are gaps
between the specific needs
of our institutions--
defined communities
with their own culture and practices,
and certain processes
that are inherent to libraries--
and the solution offered by Wikibase.
For example, the search.
I mean, from the professional standpoint,
not only from the end-user standpoint,
we need certain indexes
in order to ensure
data quality and data curation--
it is very important
for the professional--
and Wikibase, with its Elasticsearch
and CirrusSearch, doesn't offer them.
But there are still areas
of investigation there.
The roles-- how are the roles managed?
The bureaucrat, the patrolling--
it's not exactly how things work
in our world.
Although there is a layer
that can be used,
upon which we can build
other roles that are more in compliance
with our way of managing the data.
Or different constraints,
constraints related to data publication,
or data-- there's an error there
we need to correct.
Data policy-- okay, thank you.
So, there are things that need to be--
other layers, bricks,
need to be built upon Wikibase.
And of course, one of the reasons,
the major reasons,
the reason why we are here with you,
is that we-- we are willing,
and we feel the necessity
to be part of a community
sharing the same concerns.
And we all know, given the program,
that libraries and GLAMs
are heavily represented in this event.
So, I think-- we think that maybe
in a couple of weeks,
or next year, we will be able
to communicate more openly
on our decision to go forward
with this solution.
Thank you.
Thank you so much.
(applause)
So, we will have short
presentations first,
and we will all return on stage
for questions, if we have
the time for that.
But yeah, we heard something from France.
There's another project.
It's not Fichier National d'Ent--
(jokingly struggles with name)
But it's Gemeinsame Normdatei,
the universal authority file
for the German-speaking world.
And I'm so happy to have good friends
of the Wikimedia movement here.
Barbara Fischer and Sarah Hartmann.
Thanks a lot for the invitation
to talk about our project,
which is called GND meets Wikibase.
And it's a joint project
of Wikimedia Deutschland,
and the GND.
And we'd like to give you
a quick overview--
as Jens said before,
there are just ten minutes--
of why we chose this approach
of evaluating whether Wikibase
fulfills the requirements
for managing authority data
on a collaborative level, I would say.
So, where do we come from,
and what's the idea of authority control?
And GND, which stands for
Gemeinsame Normdatei,
what's the idea of it?
And yeah, where do we come from,
as I said before.
It's not that different
from what Anila and Ben said,
just a few seconds ago.
The GND is used
for the description of resources,
such as publications,
and objects, for example,
and in order to enable
accurate data retrieval,
I would say, the GND provides
unambiguous and distinct entities
for that retrieval.
And so, there are persistent identifiers,
as well, as you all know,
for identification and reference
for these entities.
The authority file is used
mainly by libraries,
we would say,
in the German-speaking countries,
but a few other institutions
from the cultural heritage domain
are using the authority file already.
And all in all there are
around about 60 million records,
and in Wikibase, we would say "items,"
which refer to persons, names of persons,
corporate bodies, for example,
geographic names, and works.
And the GND is run cooperatively
by so-called GND agencies,
and at the moment, there are
around about 1,000 institutions
who are active users of the GND--
that means they establish new records
and edit records or items
on a regular basis.
And the most important thing, I would say,
is that the GND data
is provided free of charge
under CC0 conditions,
and that all the APIs
and documentation is open as well.
Yeah, talking about open--
that's the point,
and the crucial one here--
at the moment, our challenge is
to open up the GND
for other GLAM institutions
and institutions from the science domain.
At the moment, it's really focused
on the library sector.
That means that the handy tool
of librarians has to evolve
into a tool that is used
and accepted across domains.
And that means a lot of work
on organizational stuff,
community building, discussions
about the current data model,
and infrastructural and technical issues.
And, yeah.
Talking about the infrastructural issues,
we came up with the idea
to become partners in crime
with Wikibase, I would say,
as we have roughly the same aims,
namely making cultural data
more accessible and interoperable.
And therefore, we are now
evaluating the software,
which was originally conceived
for a sole application, Wikidata,
to see if it's sufficient for managing
authority data.
Right-- hi from my side as well.
In our evaluation,
[inaudible] we do jointly
with Wikimedia Deutschland,
we're focusing, first of all,
on whether Wikibase meets
the requirements
of GLAM institutions-- galleries,
libraries, archives, and museums--
to run an authority file collaboratively,
which is our basic question.
We also would like to see
whether Wikibase increases usability,
as the software system
we're using right now
is, let's say, quite a complex
piece of software that is not as handy
as you might like it to be.
Well, and then, we would like to know
if Wikibase would also ease
both data linking
and growing a diverse community.
As Sarah said before, we are right now
in a process of opening up
towards a broader scope
of GLAM institutions,
and science institutions.
And of course, they are working
within their own software structures,
and we would like to know
if Wikibase would ease
the cooperation-- collaboration with us.
So, why do we do that?
This is because we consider that Wikibase
might be an attractive community zone,
which means-- I had to write that down--
first of all, as it is open source,
it will be more accessible
than any proprietary
software system that is used
in the cataloging field
of the GLAM institutions.
Then, we feel that the Wikibase community
already by now
is a very dedicated community,
and we would like to participate
in that dedicated community,
because we believe that sharing is caring.
What we want to share
is our knowledge and your knowledge,
and together avoid redundancy--
not editing the same information
over and over again,
but reusing data, linking it,
quoting it, and enriching it.
And I placed here on the slide
one of the tools
that is widely used with Wikidata,
Histropedia,
because we also feel that if we are able
to bring our data into Wikibase,
we might be able to share tools,
improve the code,
and thus be an active,
contributing part of the community.
Thank you.
I'd like to debate that with you later on.
Thank you so much.
(applause)
Thank you so much.
So, at some point,
we asked ourselves: did we,
by accident, write library software?
Because the adoption of Wikibase
in the library field is so overwhelming.
But there's more to it.
And of course, we didn't
accidentally write a library system.
It can be used for other fields as well.
For instance, for biology.
And David Fichtmueller will tell us
about using Wikibase
as a platform for biodiversity.
- I think that was grayed.
- Yeah.
Full screen? Oh, okay.
Yes. Hello, everybody.
I'm David, and I work
at the Botanic Garden,
Botanical Museum here in Berlin.
And I work there as a computer scientist.
We have an entire department
called Biodiversity Informatics.
Generally speaking, we write the software
that biologists use in their daily work.
And on my private side,
I've been a Wikipedia contributor
for almost 15 years now,
and Wikidata contributor
for almost five years now.
And also, as part of my job,
I'm a co-administrator of a MediaWiki farm
with more than 80 wikis
regarding the biology community.
And a couple of years ago,
I was assigned to a project
that was, yeah, about working
on a standard.
In particular, it's a standard
called ABCD,
that we needed to do some work on.
And I assume most of you
haven't heard about ABCD,
that's not really a bad thing.
It's really specific.
It stands for Access to Biological
Collection Data.
And it's an XML schema.
So, it can express
biological information,
particular things like information
about herbarium sheets,
about collections, like fish in
alcohol jars, or--
but also observations--
scientists being out in the field,
seeing certain plants,
seeing certain animals.
A lot of variety in here,
and because of this,
it's quite a huge standard.
So, we have 1,800
different concepts in there.
That's counting the different XPaths
there are within the file.
And so the challenge was to convert this
into a new modern semantic standard.
We wanted to use an OWL ontology
that is able to express
the same kind of information
that has previously been expressed
with the XML files,
and also keep all the existing
documentation,
and restrictions,
and all of the connections
between the items
and have a collaborative platform
where other scientists can come in
and give us advice
on their specific fields of focus.
Did we model this correctly?
Is there anything missing?
So, yeah, with all of this in mind,
we went looking around,
and found a solution, and I guess
it wouldn't surprise anybody here,
it's Wikibase, otherwise
I wouldn't have been talking here.
So, we decided on using Wikibase.
And we started to install it
without the Docker Image.
Big mistake. Don't do this.
(laughter)
In our defense, we started this
two and a half years ago.
And it was two years ago
at the WikidataCon
that the Docker Image was first released.
So, we had to figure out our own way.
And once we had things up and running,
we didn't really want to break
changing things.
We do have the Docker installed
for the Query Service,
and we have a weird, hybrid
of custom installation
and Docker installation
and modified scripts
connecting those two instances.
We then installed
QuickStatements, again manually,
because by that time, it wasn't part
of the Docker setup,
and did some slight modifications
and adjustments to get it to work.
I know it's now part
of the Docker Image.
But yeah, we had it running,
so, we didn't bother changing it.
Keep this in mind for later on.
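For anyone starting today, the Docker images make this much simpler. A minimal sketch of a docker-compose setup, assuming the service and image names from the wikibase-docker project (the tags, ports, and environment variables here are illustrative; check the current documentation for the real required values):

```yaml
# Hypothetical minimal docker-compose.yml for a Wikibase stack.
# Image names follow the wikibase-docker project; tags are examples only.
version: '3'
services:
  wikibase:
    image: wikibase/wikibase:latest-bundle   # MediaWiki + Wikibase extension
    ports:
      - "8181:80"
    environment:
      - DB_SERVER=mysql:3306
      - DB_USER=wikiuser
      - DB_PASS=change_me
      - DB_NAME=my_wiki
    depends_on:
      - mysql
  mysql:
    image: mariadb:10.3
    environment:
      - MYSQL_DATABASE=my_wiki
      - MYSQL_USER=wikiuser
      - MYSQL_PASSWORD=change_me
      - MYSQL_RANDOM_ROOT_PASSWORD=yes
  wdqs:
    image: wikibase/wdqs:latest              # Blazegraph query service
  wdqs-frontend:
    image: wikibase/wdqs-frontend:latest     # SPARQL query UI
    ports:
      - "8282:80"
```

With a stack like this, the query service and QuickStatements live alongside the wiki instead of in the hybrid custom arrangement described above.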
But before I go into what we did,
I'm going to avoid
a possible confusion here,
because we're talking
about data standards,
and when we express things
in a semantic way,
we will convert the concepts
from the XML into Classes and Properties.
So, there are Object Properties
connecting the different classes,
and Datatype Properties
that actually contain the content--
that is, they store text, numbers,
things like that.
And we express all of this
within Wikibase,
but all of those are items in Wikibase,
and they are then described
using Wikibase Properties.
So, we have ABCD properties
that are items, described
by Wikibase Properties.
I try to make sure to use
the prefixes accordingly,
so you know what I'm talking about
when I talk about properties
in this talk.
So, let's look at the properties,
in particular, with Wikibase Properties.
We sat down and thought,
"Okay, what do we need
to describe the concepts
we want to model?"
And we ended up using around 25 properties
in addition to, of course, label,
description, alias.
I'm not going to mention all of them--
this is just so you see the variety.
Those fulfill our requirements.
And yeah, some things
express some restrictions,
and others--
Most of them are optional.
Only very few are mandatory.
So then, we set on importing
all of this information.
We wrote a Schema Parser that extracts
all of the different concepts.
So everything that has an XPath
within the XML Schema,
and all of the documentation
that is part of the XML schema,
and so we got this into a nice CSV file,
and then we could work on this
and import it using QuickStatements.
Worked quite well.
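The parser itself isn't shown in the talk, but the idea can be sketched: walk the schema's element declarations, record each element's XPath and its xs:documentation, and write the rows to CSV. This is a simplified sketch under stated assumptions (it only handles inline complex types, not the named type references a real XSD parser would need; the sample schema is invented):

```python
import csv
import io
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

def walk(element, prefix, rows):
    """Record this xs:element's path and documentation, then descend into
    child elements of an inline complex type (named types not handled)."""
    name = element.get("name")
    if name is None:
        return
    path = f"{prefix}/{name}"
    doc = element.findtext(f"{XS}annotation/{XS}documentation", default="").strip()
    rows.append((path, doc))
    for child in element.findall(f"{XS}complexType/{XS}sequence/{XS}element"):
        walk(child, path, rows)

def extract_concepts(xsd_text):
    """Return (xpath, documentation) rows for all element declarations."""
    rows = []
    for top in ET.fromstring(xsd_text).findall(XS + "element"):
        walk(top, "", rows)
    return rows

# Invented two-element sample schema, standing in for the real ABCD XSD.
SAMPLE = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Unit">
    <xs:annotation><xs:documentation>A collection unit.</xs:documentation></xs:annotation>
    <xs:complexType><xs:sequence>
      <xs:element name="UnitID">
        <xs:annotation><xs:documentation>Identifier.</xs:documentation></xs:annotation>
      </xs:element>
    </xs:sequence></xs:complexType>
  </xs:element>
</xs:schema>"""

buf = io.StringIO()
csv.writer(buf).writerows([("xpath", "documentation"), *extract_concepts(SAMPLE)])
print(buf.getvalue())
```

Each CSV row then corresponds to one future item in the Wikibase instance.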
But then, we had, as I said,
1,800-plus concepts
in our Wikibase instance.
But then, when we had things like person--
person name, and contact email--
those appear a couple of times
within the schema--
for the data set owner, for the person
who took an image, things like that.
So, of course, we needed to reduce those,
and combine them into reusable classes.
So, there was a lot of manual editing
to reduce the number of concepts,
and in the end, we ended up
with a little more than 500.
So, we have Classes, Object Properties,
Datatype Properties,
a couple of other ones I'm skipping
to avoid additional complexity here.
And for certain large-scale edits,
we also used QuickStatements again.
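For reference, a QuickStatements batch is just tab-separated text; a sketch of generating V1-style commands for one new item, where Len/Den set the English label and description (the property and item IDs below are hypothetical local ones, not Wikidata's):

```python
def quickstatements_for_concept(label, description, statements):
    """Build a QuickStatements V1 batch that creates one item.

    `statements` maps property IDs to values: bare Q-IDs are treated
    as item values, anything else is quoted as a string.
    """
    lines = ["CREATE",
             f'LAST\tLen\t"{label}"',
             f'LAST\tDen\t"{description}"']
    for prop, value in statements.items():
        if not value.startswith("Q"):
            value = f'"{value}"'
        lines.append(f"LAST\t{prop}\t{value}")
    return "\n".join(lines)

# Hypothetical concept with hypothetical local property/item IDs.
batch = quickstatements_for_concept(
    "UnitID",
    "identifier of a collection unit",
    {"P2": "Q7", "P5": "xs:string"},
)
print(batch)
```

Generating batches like this from the CSV is what makes 1,800 concepts tractable.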
So now we had done all of the editing,
and we wanted to make sure
that the data we have
is actually consistent.
So, that's where we used what we call
Maintenance Queries--
we used the query interface
with some SPARQL queries,
basically to check for missing properties,
wrong links between concepts--
basically, things that didn't match
our concept, our structure.
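One such maintenance query might look like the sketch below: find items that are instances of a class but lack a mandatory property. The P/Q IDs are hypothetical local ones, and the wdt:/wd: prefixes assume the default Wikibase query-service configuration:

```python
# SPARQL template for a "maintenance query" against a Wikibase query service.
TEMPLATE = """SELECT ?item ?itemLabel WHERE {{
  ?item wdt:{instance_of} wd:{cls} .                      # every member of the class...
  FILTER NOT EXISTS {{ ?item wdt:{required} ?value . }}   # ...that lacks the property
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}"""

def missing_property_query(instance_of="P1", cls="Q4", required="P7"):
    """Fill in hypothetical local property/class IDs."""
    return TEMPLATE.format(instance_of=instance_of, cls=cls, required=required)

print(missing_property_query())
```

An empty result set means that particular consistency rule holds across the instance.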
And in the end, we also had to do
a manual review of all of the concepts
just to make sure we didn't miss anything.
This was kind of a lot of work,
because if you only take
like five minutes per item,
multiply it by 550,
it's over one week of full
and concentrated work.
But of course, it's not always
just five minutes,
because you sometimes spend
half an hour fixing a certain item
when there are problems with the modeling.
So, we now had all of the data.
Now, it was time to get the data
out of Wikibase.
We wrote an export script in Python
that uses the Query Service
to get the information about the concepts,
and fill them in templates--
prepared templates.
So, in the end, we get
a nice valid OWL file
that contains everything we need.
And this is the actual basis
of the standard.
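The template-fill step of that export can be sketched as follows. In the real script, the concept data comes from SPARQL queries against the Query Service; here a hard-coded dict stands in for those results, and the element layout is illustrative rather than the real ABCD OWL:

```python
# Prepared template for one OWL class; the real export has templates
# for object properties, datatype properties, and so on.
CLASS_TEMPLATE = """<owl:Class rdf:about="{uri}">
  <rdfs:label xml:lang="en">{label}</rdfs:label>
  <rdfs:comment xml:lang="en">{comment}</rdfs:comment>
</owl:Class>"""

def export_class(concept):
    """Fill the class template from one concept record."""
    return CLASS_TEMPLATE.format(**concept)

concept = {
    "uri": "http://example.org/abcd#Unit",  # hypothetical URI
    "label": "Unit",
    "comment": "A collection unit.",
}
print(export_class(concept))
```

Concatenating the filled templates, plus a fixed header and footer, yields the final OWL file.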
For future versions,
when we're going to make revisions,
the Wikibase is our working platform.
And once we do an export,
this is the new version of the standard.
Keeping those separate,
this would also allow us
to move the server
to a different instance,
or as I said, change the installation.
We export JSON
for the documentation of the website.
And we also export the data
to a second Wikibase instance.
This is like really
experimental, right now.
We haven't really used this
in production where it can--
where the concepts can then be used
to describe actual data.
So we're breaking down those--
we're taking them a step down
from properties being Wikibase items,
and converting them into actual
Wikibase properties.
This is quite a lot of requests--
quite a lot of steps
to keep all of the data
and all of the linking consistent,
but it works.
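The item-to-property conversion described here could, in outline, go through the MediaWiki API's wbeditentity module (called with new=property). A sketch of building just the JSON payload, where the field names follow the Wikibase API but the input dict is a simplified stand-in for what would be read from the first instance:

```python
import json

def property_payload(item, datatype="string"):
    """Build the JSON `data` argument for a wbeditentity call
    (action=wbeditentity&new=property) that turns an 'ABCD property'
    item into a real Wikibase property on the second instance."""
    return json.dumps({
        "labels": {"en": {"language": "en", "value": item["label"]}},
        "descriptions": {"en": {"language": "en", "value": item["description"]}},
        "datatype": datatype,
    })

# Hypothetical concept record pulled from the first Wikibase.
payload = property_payload({"label": "unit ID",
                            "description": "identifier of a collection unit"})
print(payload)
```

Keeping the cross-links consistent then means recording, for each created property, which source item it came from.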
And in the end, well,
it was quite successful.
There is a huge community--
a community around
Biodiversity Information Standards--
which also had its annual meeting
just in the past days.
So, there's a huge interest
in reusing this approach
for other standards, as well.
And so, in the future,
we want to experiment a bit
with Shape Expressions--
as I said, we have some restrictions
in there that we could export--
and build some better workflows
for the versioning.
We haven't done this yet.
And switch over to the Docker instance.
So, at the end, I'm going to show
a small wish list
of things that could be improved.
Well, there are a lot more tools
out there that are really written
for Wikidata, but could be more agnostic,
in particular, QuickStatements.
As I said, I did
some adjustments manually.
Many of the issues I had
are probably solved by now,
but I don't think all of them.
Then, we'd want to import
existing templates--
the SPARQL template,
the Q and the P templates.
They are really useful
when working with Wikibase,
so this could be done automatically.
And as I said, we did a lot
of manual editing.
So, it would be useful--
just ideal-- to have a tool where,
like in an Excel table,
you load a couple of items
and a couple of properties,
and then just jump from cell to cell
and really quickly edit a lot of things
in a semi-automated way.
Thanks. That's the end.
Thank you so much.
(applause)
So much to talk about on this.
So, there is not only--
well, how do I get back from here.
It's not only about science.
It's not only about libraries.
You can also create
art and beauty with Wikibase.
And who would be better to tell us
about this than Stuart Prior.
Now, slightly embarrassingly,
we talk about art and beauty,
but this is a really ugly presentation.
(laughter)
Starting off with a room
full of Wikimedians,
trains--people like trains.
But it has a purpose.
So, this is Hackney Downs Station
in Northeast London.
And this is about
Banner Repeater and Wikibase,
which I'll explain further.
So, this is a terrible photo.
But it is actually where
an artists' publishing archive is held,
which is on the platform
of a train station.
Within there, they've got
several hundred copies
of various types of artists' publishing.
They get a lot of public footfall.
It does a lot of outreach
to actual general public.
Like you get on the train,
you'll find bits of
sort of obscure art on the train.
So, it's a really interesting project,
but part of a much wider community.
So, what is Artists' Publishing?
What are Artists' Books?
Like, I didn't know either.
So, the definition,
according to Wikipedia,
is "Artists' books are works of art
that utilize the form of the book."
Well, you can read it.
But it's individual pieces of art,
or sometimes collections of art,
using publishing as a medium.
This varies quite a lot.
It's very interesting.
It was kind of--
There was a lot of it
in the early '20s and '30s,
and it had a bit of a renaissance
in the '60s and '70s,
and it continues to expand.
It has a large, multilingual,
global community,
somewhat separate from the large
art institutions.
So, you'll find collections,
such as the V&A
has a collection, obviously.
So, they've got various kind
of items such as these.
This is just an article,
so it's just not the best display.
But it's a really kind of interesting,
yet slightly niche field of work.
But it's not very good on Wikidata.
This is, again, a really terrible photo--
it's not my photo--
of some of the stuff held
in Banner Repeater's archive.
If you see in the middle,
the pink one, Blast,
that's actually a fairly notable
piece of artists' publishing
from the '20s.
What does it look like on Wikidata?
It's not good on Wikidata.
It's often just confused with books
or other forms of publishing.
The average kind of Wikidata item for
a notable piece of artists' publishing
doesn't really have much to say about it.
You know, it's just--
there you go, that's it.
There's not a huge amount
of identifier numbers as well.
So, there's clearly a lot missing
when it comes to artists' publishing,
certainly compared
to more traditional forms of art--
paintings and sculpture and so forth.
And there's a huge desire
within the community
to start codifying this,
and making it a real thing.
So, I'll give you an example
of what is actually available.
You can point out what's wrong
with this query.
So, this is basically all there is.
That's every artists' book on Wikidata.
So, there's really not a lot.
Some of them don't even
have labels for a start.
And it's something
that really needs expanding.
And something that has capacity
to be expanded.
Has anyone seen what's wrong
with this query yet?
The labels-- the labels say "sausage",
because I just stole
someone else's query,
and changed the key number.
(laughter)
It's actually a query about sausages.
Anyway, moving on.
But yeah, you see it doesn't really have
much of a presence.
So, I work with Wikimedia UK.
We were approached by Banner Repeater
to help them with this--
with setting up a Wikibase--
in terms of funding,
in getting extra funding,
but also in terms of bringing in
a wider community,
and being part of the process.
So, the process is basically
to gather this community
of artists, archivists,
and linked data experts,
and work out what the schema,
the data model,
for artists' publishing should be.
It's a very specialized field.
Doesn't really map
onto Wikidata perfectly.
It's probably too granular for it.
And the other thing
is the kind of flexibility of it.
Maybe it doesn't really fit in Wikidata.
Maybe it's too rigid at the moment.
The Wikibase is being built,
so I haven't got much to show you,
because it's not been built yet,
but this is more about the process.
And the process is extensive
community consultation,
a few kind of layers of it.
So, we're not just going
to do this in one session.
It's not a few individuals deciding.
It's kind of ongoing,
and ongoing, and ongoing.
The impact of this
could be fairly substantial,
because no one else is doing this work.
A lot of the larger institutions
have artists' publishing
sitting in their kind of back room.
They don't really know
how to categorize it.
They haven't categorized it very well.
They're not very interested in it.
But there is a huge community
that is interested in doing this.
So, this is basically
the process at the moment.
So, the initial workshop has happened.
So, it was an expert workshop
with some people
deep in the field of artists' publishing--
archivists, people
who own collections, and so forth--
to establish a kind of
basic set of priors,
to look at what already existed--
what the existing status was on Wikidata--
and look at how that
could be expanded or improved.
And then they documented that,
and established this basic structure.
And now, we move into the next process
where it's bringing in
a much wider community.
So that's-- it's not just data people,
it's creators, as well.
There'll be a lot of narrative in this,
and a lot of qualitative things.
Again, stuff that just
doesn't really belong on Wikidata.
But also working with archivists,
and working with linked
data experts, and so forth,
to hopefully bring this all together,
to create a resource that will have
a nice accessible front end,
and also build this community--
people who can contribute to it,
and kind of own this data set.
I'll show you what we've got ready.
This is subject to change.
But this is basically kind of
where we've got so far
with the expert ones.
So, you see different P numbers
being developed,
and you can see what
their equivalents on Wikidata are.
And obviously, it's a lot more granular
than probably the information
on Wikidata is at the moment, so--
There's a lot of detailed stuff,
so there's qualities
such as height, width,
thickness, and so forth,
which aren't necessarily that present
on other groups
of artists' publishing on Wikidata.
But there's also other things like
"commissioned by", and "contributors to",
and a lot of these works
will have multiple contributors.
And multiple editions
and things like that.
There's really a lot
of granular information
that can come about these things.
And a lot of narrative as well, you know,
as things have changed over time,
as people have reinterpreted things.
And this was what was created.
Again, most of it has
Wikidata equivalents,
but some of it doesn't yet.
So, what do we have here?
Other editions, and things like that.
So, it's fairly specialized.
This is the first stage.
And this will go through another process,
as people take things away from it
or contribute to it.
The flexibility is really
important in this.
It's kind of getting away
from older kind of standards,
and moving to something
which is a bit more up-to-date,
and something where the community
can really change things,
and not be dictated to--
and I'll start speaking quicker.
So, power dynamics, at the moment,
and why Wikibase.
So at the moment, this is the art world.
This is what the art world looks like.
It's a big orange thing.
But you've got these large institutions,
and then you've got sort of
groups of artists' publishing.
That could be Delhi, Mexico City,
London, and so forth.
And what we don't want
is this kind of thing
where large institutions and experts
get to dictate
the kind of ontology,
and how these things are going to work.
So, working to establish a Wikibase
among an artist community
can help them work out
what they're going to do,
and then they start pushing back
into the larger institutions,
with a more kind of flexible data model,
with something that's more up-to-date
and coming from grassroots organizations,
as opposed to coming
from institutions, so to speak.
So, I think there's huge value
in this approach
in terms of creating
a sort of parallel infrastructure
for communities of people
who own content, and so forth,
much like Wikimedia is,
and kind of pushing out to institutions,
rather than doing it the other way around.
Do I have another slide?
What next?
I always put this slide in,
because it's always the worst slide,
and it's such a stereotype.
What next? We're moving on
to the community consultation stage,
so we'll get a bit more kind of
expansive and interesting.
This database, obviously,
will be talking to Wikidata,
but on what terms,
we're not 100% sure.
But it could be that this becomes very--
just a very specific instance
for artists' publishing
that Wikidata can draw from,
and vice versa.
And I'll just finish off
with that picture again,
because I just quite like it.
And that's all I have to say.
Thank you.
- Thank you so much.
- (applause)
We're almost at the end
of our fast-paced ride,
and we'll-- what to say?
We saved the best for last?
No, but we give the last presentation
to someone who's a true pioneer
of using Wikibase
in the field of digital humanities.
And, yeah-- Olaf Simons.
You have not prepared any slides,
but you will do some live action.
Exactly.
And I have been on Wikipedia
since 2004, actually.
I have my 15 years.
What am I going to show?
I've been congratulated for this.
I'm going to show you
the Wikibase instance we created.
It's not a Docker image.
And I would agree, it's not the best
to have a Docker--
it's not the best to have
an independent installation.
It's difficult,
and it has been extremely
difficult for us,
and we're grateful
to Wikimedia Germany
for helping us get it done
under a mutual agreement we had.
So, basically, we have here
several projects on this.
It's more project-oriented than Wikidata.
And my thing should be in here.
I open that and go--
just should have done that before.
Here we are.
The history of the Illuminati--
I start with this one.
This has been a little film
which has been created
by Paul-Olivier Dehaye,
whom I only know from Twitter,
as he asked us what kind of experience
we had when we got our Wikibase,
and he was experimenting with his own.
And I talked to him
about things we could do,
and things we could not do.
This was a film I would love
to be able to do.
And he said, "It's easy for me.
I can run a SPARQL search,
get the information,
and put it into a program,
in which you can then see this thing."
It's actually 20 years of research
on the Illuminati,
and gives you a short history
of the entire organization
and all its correspondences.
That's not a Wikimedia tool.
It's not a tool of Wikibase.
But it's something you can do.
And actually, I like it
that it is not a tool already.
It should become a tool.
I like it because it shows
our data is really free.
Someone can download our data,
someone can do something with it,
which we haven't expected,
and it can be done within two hours,
if you're bright--
and he is bright, of course.
So, he created this for us.
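The workflow described here-- run a SPARQL query against the Wikibase endpoint, then feed the results into a visualization program-- can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the property IDs (P12, P13, P14) and the sample data are hypothetical placeholders.

```python
# Sketch of the "SPARQL search -> visualization input" step Olaf describes.
# The property IDs below are hypothetical placeholders, not the real ones.
CORRESPONDENCE_QUERY = """
SELECT ?letter ?sender ?recipient ?date WHERE {
  ?letter wdt:P12 ?sender ;      # sender (hypothetical property)
          wdt:P13 ?recipient ;   # recipient (hypothetical property)
          wdt:P14 ?date .        # date of writing (hypothetical property)
}
ORDER BY ?date
"""

def edges_from_results(results_json):
    """Turn a W3C SPARQL JSON result set into (sender, recipient, date) edges."""
    edges = []
    for row in results_json["results"]["bindings"]:
        edges.append((row["sender"]["value"],
                      row["recipient"]["value"],
                      row["date"]["value"]))
    return edges

# A tiny hand-made sample in the standard SPARQL JSON results format:
sample = {
    "results": {"bindings": [
        {"sender": {"value": "Q101"}, "recipient": {"value": "Q102"},
         "date": {"value": "1781-05-01"}},
    ]}
}
print(edges_from_results(sample))  # [('Q101', 'Q102', '1781-05-01')]
```

The resulting edge list is the kind of input a network or timeline visualization tool can consume directly, which is why "two hours, if you're bright" is plausible.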
I go back to my presentation.
Why on Wikibase?
This was the immediate question
when we approached Wikimedia.
I had known of Wikidata since 2010,
and in 2017, it was ready
to be used by us.
And there was actually an interest
from Wikimedia people to say,
"Do it, and we support you."
Why our own base?
Basically, because of the original research
that we have to do.
And the entire installation
is a research tool.
It's not only there to take a look
at what we did
and for presentation purposes,
but actually, I use it every day
for my research.
I change dates of documents,
and take a look at how things look
when I have changed that.
I do a lot with working hypotheses.
And we ask projects that have data
to give us their data,
and to feed it in,
and they can, again, put a label,
attach an item to their data sets,
that says this has been produced
by the following project.
Next projects can continue with it.
But it's already there as a marker
that this is a data set
with work from a certain project.
And if you have a project, DFG--
DFG-funded, by the German
research foundation--
if you have a project, you want to show
what kind of work you have done.
And you can now do a SPARQL search
and present your entire group of data sets
in the final résumé of your work.
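The kind of SPARQL search meant here-- all items carrying a "produced by project" marker-- might look roughly like this. A minimal sketch: the property ID P20 and the project item Q500 are hypothetical placeholders, not the installation's actual identifiers.

```python
# Sketch: list every item a given project has marked as its work, assuming
# a hypothetical "produced by project" property (P20 is a placeholder).
def project_items_query(project_qid):
    """Build a SPARQL query for all items tagged as produced by one project."""
    return (
        "SELECT ?item ?itemLabel WHERE {\n"
        f"  ?item wdt:P20 wd:{project_qid} .\n"  # produced by project (placeholder)
        "  SERVICE wikibase:label { bd:serviceParam wikibase:language \"en\". }\n"
        "}"
    )

query = project_items_query("Q500")
print(query)
```

Run against the instance's query service, a search like this would return the full group of data sets for a project's final report in one step.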
So we get original research,
we identify research,
we encourage the working hypothesis.
This is a working tool,
and it's actually quite useful
to start from the beginning,
not to present something in the end.
But from day one, you work with it,
and what you think is
the proper answer to that question,
you can put it into Wikibase, and then
you can substantiate information
until you see this
is the right identification
of a person or the right date for a thing
which we haven't been able to date so far.
So, actually, you accumulate work
while you are doing it,
use the Wikibase as a kind of tool
that is getting you closer
to the final result.
Our first meeting took place
on December 1, 2017.
And I remember I had
a little challenge for you,
and that was a death date--
a date of death for a person--
where I wanted to have someone
to show a source for that,
and that was extremely difficult,
because he had to create the source
before he could connect it to that.
And in the room, we were--
we had the clear idea,
if we do this, we'd do it
with the sources already part
of the Wikibase installation we have.
And if we have the sources in there--
that is, all the early modern books
that have been printed
would be the ideal.
If we have that in there,
we need the GND in there.
And when we heard that the GND people
were on track to test the software,
I approached them and asked,
"Wouldn't you like to do this
in a cooperation with us,
so that we can have your data,
which we want to have, anyway,
and that you can see
how it works on a Wikibase."
And this is where we are at the moment.
And presently, I would say,
a lot of things,
we're not sure how they are done,
or at least I am not sure
how they are done.
How's the input done, how do you get
from a resource of strings
to an item-based resource--
lots of things.
And basically, my talk here
is an invitation.
Join us.
We are still not really part
of the Wikibase community.
That doesn't exist.
We have a Wikidata community.
And lots of things
are taking place in Wikidata,
but if I ask for help for a Wikibase
that is not Wikidata,
that's a difficult thing.
First thing I would say is,
actually, to work with us is cool,
because you can grab the data
for Wikidata anytime, at any moment, under CC0.
So, actually, you can use it
as an incubator of your work,
and drag it to Wikidata.
And also, we will work with big data--
when we have the GND
in there, that will be quite something.
So, if you really want the challenge,
you can get it also on our platform.
And we offer interesting communities.
Basically, one of the things
that is different
is that all our accounts are clear-name accounts
or institutions.
But that also means you can do things
which you couldn't do on Wikidata.
You can do your genealogy at our site.
We don't mind.
It's interesting to have people
getting such data.
You can do your city's search--
your city's historical research--
on our platform; we don't mind.
You can be there with your research on our platform.
So, lots of things need to be done.
We have immense problems
running the database.
It was implemented by Wikimedia,
but now, we see lots of things
don't really work.
We can't really fix that.
It's extremely difficult to get help
to run the database,
to update the database,
to solve little technical problems,
which we face as soon as we run
an instance outside Wikidata.
Like getting the direct
GND link is difficult.
It works on Wikidata,
it doesn't work on our instance.
Getting images from Wikimedia Commons
on our Wikibase is not that easy.
Lots of little things still remain.
So, actually, this is an invitation.
If you want to join us
on the mass input, do that.
Approach us.
If you want to help us
with technical things,
this is highly welcome.
And then, we need tools.
You saw the tool we had in the beginning.
Actually, it's not that difficult
to get such tools.
I saw what kind of query you do
to get such a visualization,
and once you have it,
you should be able to modify it easily.
These tools are extremely precious
in our community
of digital humanities projects.
And there are little companies
that create these tools,
again, and again, and again,
and get money for that.
I would love to have these tools
just once and for all free
and on the market and working
with a Wikibase instance.
So, anyone who is interested
in developing tools,
approach us, and we have plenty of ideas
of what visualizations
historians would love to see,
and that should be done.
So, basically, lots of things
still remain.
I've got one minute.
I don't need that one minute.
And you're putting pressure on me.
(person) Give it to the audience.
I give the minute to the audience.
Yeah. Thank you so much.
And maybe you want to sit down,
because I would like everyone
to join me back on stage.
And we can have a round of questions.
I really like that we ended
with an invitation,
because this is what this is now.
You are invited to ask questions.
You are also invited to join us tomorrow
at the Wikibase meetup.
If you are-- if you have some idea
for an awesome Wikibase installation,
for your institution, for your hobby,
for changing the world--
please come and join us,
we will meet up, and--
There's some complication
with the chairs.
Well, let's stand up. Okay.
I think we have another microphone, here.
(person) I have the microphone
for the questions.
Okay. So--
Thank you for the presenters.
And meet us at the Wikibase meetup,
and now, I can't wait to hear
your questions to the panel.
(person) Who's the first?
(person) Hi. I will be talking
in the lightning session, too,
about geosciences, and how in geosciences,
there are many data repositories
that have collected
and shared data with the community
for years, for decades in some cases.
And they curate the data set,
their schemas evolve continuously,
they get a lot of feedback
from the community.
All they desire is to organize
the community,
to enable the growth
of these repositories.
So, they don't necessarily desire
to put all their content in Wikidata
and lose control over it.
They offer a tremendous service
curating this content.
So, I just wanted to point out
that some of the requirements
and needs that have been voiced
by the panelists
appear in my communities.
And my question is, how do you mix
or maintain control
over those schemas, over the standards,
while allowing the community
to continue to introduce feedback
and have more of this crowdsourcing
spirit that Wikidata has?
I think everyone could answer that,
but maybe David, you want to start?
I'm not sure whether I'm the right
person to answer this,
because in our use case--
in terms of data modeling,
it's really a narrow set of people
who actually do the work.
We contact experts
for the relevant segments,
and some of them could contribute,
but for the current iteration,
it was only me and two colleagues
who actually worked on it.
So, we want to have this option,
that we get experts in,
but it's always in close
collaboration with us,
so that we don't really have to worry
about the problem of crowdsourcing.
Being part of the Wikimedia community,
I would say, I would not be that worried.
95% of the edits are good edits,
and improving things-- more than that.
As soon as we have an instance
that is actually closed--
where we offer accounts under real names,
that's an additional hurdle
that no fool is going to go over.
People are required on our instance
to provide an address, on their page--
not to me, but on the page--
and this is something only
institutions usually do,
or private people that say,
"Okay, I'm a private person.
I love this research.
This is my personal field.
I give you my address."
And this is a thing that puts off
any vandal who wants to destroy Wikidata.
So, you can close the system, but then,
you are not really part
of the same flowing community.
But again, I would say, if you go to CC0,
then you can open up,
you can be the incubator
where people do the research,
and then it goes out to the community.
But it's an invitation--
use maybe closed works,
and use an instance where
you work together with people you like.
Well, I think that--
and I don't think it's only my opinion--
there are different perspectives,
and it will be hard to reconcile
all perspectives and say,
"Wikidata is the solution
for the entire world to go into."
I'm not saying by this that Wikidata
is not a solution,
but there are different perspectives,
there are different needs.
The world is-- really, there is
a large variety of needs,
of professional perspectives,
that you cannot reconcile
in a unique worldwide database.
So, I think that both are--
The trickiest thing is how to reconcile
and find angles of dialogue
between these two large families
of needs and perspectives.
If there are more questions,
I would rather like to move on
to those.
Anybody else?
If not, meanwhile you're thinking
about your questions--
I would just like to say
that's one of the reasons
why we consider Wikibase,
because we believe that adding,
editing information
within the Wikibase instance,
where you have rights and roles,
as you have in Wikidata,
gives us the opportunity
to share that information
with the information in Wikidata
in an easier way,
a more convenient way,
than if we try to build these bridges
in between our authority file
and Wikidata at the moment.
(person) So, I find it quite exciting
hearing about how
you're energizing communities
to find their own ways for data modeling,
that you can put into Wikibase.
Will you-- I'm just thinking
of Stuart Prior's community,
but also some of the others--
be trying to feed the approaches
that you as a community
decide work back to Wikidata,
to say, "We've done artists' books,
we've thrashed through several iterations,
this is what we found really worked,
and here are the properties you should have
or revisions you should make
to the Wikidata data model"?
Good question. Very short answer.
It's an interesting question.
I don't know whether this is a model
that's going to work for other types.
I hope it is.
But it's a difficult one, the question
of whether the Wikidata community
accepts the kind of authority
of a separate community that goes off
and does the work on its own.
But I would certainly hope
that it's a way of people
feeding back into this process,
without necessarily needing to go
onto Wikidata and do it.
Well, I would say, grab it.
Grab it if it's convenient, take it,
and take a look at how it works
in the other instance.
And if you feel like
this is a cool property
to do certain searches,
then that will be adopted,
that will be flowing.
I wouldn't think
of authorities doing this.
(person) Coming from
a Wikidata user perspective,
the great thing you're doing
is showing you've established code
that works and runs.
You've established a data model
that people can see
is implementable and works.
And so, in the open source community,
you know, show us the code.
You can do that.
And that's why I think it's very exciting
to have these branches
that can then fold back
into the data modeling.
Yeah, thank you.
I think that is exactly the point.
I also like the verb
that you used-- energize.
This is exactly what we want to do.
Energize, as in Star Trek.
Yeah, this panel comes to an end.
And if you have any more questions
on all these Wikibase projects, talk.
- Please come tomorrow.
- Have conversations.
This is what this conference is about.
Thank you very much.
(applause)