-
Good afternoon, everybody.
-
Welcome to our GLAM panel.
-
Before we start, I just have
two announcements to make.
-
First of all, please make extensive use
of our Etherpad to take notes.
-
And the second one is directed
at our audience at home,
-
or wherever you are.
-
If you have any questions,
-
you can also write them into the Etherpad,
-
and our room angels
will keep track of them.
-
So, we decided that for this year's panel,
-
after seeing all the contributions
that were made,
-
we would focus on the role of Wikidata
within data ecosystems
-
that go beyond the actual
Wikimedia projects,
-
which is also absolutely in line
-
with the new Wikimedia
Foundation strategy.
-
And we have, today, four panelists.
-
Three plus one.
-
So, I would like to ask you on stage,
-
so we can introduce you.
-
So, we have Susanna Ånäs.
-
She's a long-time free-knowledge activist
-
involved in many WikiProjects.
-
And she will be reporting today
on the project in cooperation
-
with the Finnish National Library.
-
Then we have, next to me, Mike Dickison,
-
who will be second in this order.
-
He is a museum curator from New Zealand.
-
He's a zoologist and a Wikipedia editor.
-
And he was New Zealand's
first Wikipedian at Large
-
in 2018 and 2019.
-
And he will tell us
about his experience in that role,
-
and what kind of role Wikidata
is starting to play in that context.
-
Then we have Joachim Neubert
-
from the Leibniz Information Center
for Economics in Kiel and Hamburg.
-
He has been working on making the largest
public press archives worldwide
-
more accessible to the public,
and he's using Wikidata to do that.
-
And then I will go last.
My name is Beat Estermann.
-
I work for Bern University
of Applied Sciences, in Switzerland.
-
And I've been a long-time promoter
for OpenGLAM in Switzerland and Austria.
-
And today I will report
on my activities in connection
-
with the mandate from the Canadian Arts
Presenting Association,
-
focusing on performing arts.
-
Not primarily on Wikidata,
-
but you will see Wikidata
is starting to play a role there, as well.
-
So now, most of us
will take our seats here,
-
and I will give the floor to Susanna.
-
Okay. So, hello. My name is Susanna Ånäs,
-
and I work part-time for Wikimedia Finland
-
as a GLAM coordinator,
-
and I also do consulting
in the open knowledge sphere.
-
And this is a discourse,
maybe, of [inaudible].
-
So, I have been involved in the workings
-
of geographic data group of the--
-
well, I looked it up,
but it isn't in English,
-
but, cultural heritage initiative
of the Finnish royal government.
-
So, this is about place names
-
and how they are represented
-
in different repositories
in the GLAM sector in Finland,
-
and how they are trying to pull together
these different sources,
-
and how they are informed
by modeling in Wikidata and elsewhere.
-
So, here we see the three main sources
for these YSO places,
-
which is part of the national ontology--
general ontology.
-
AHAA is for Finnish archives,
-
Melinda is for Finnish libraries,
-
and KOOKOS is for Finnish museums.
-
So, there are three, also,
content management systems
-
that come together in these YSO places.
-
And there are exchanges between Wikidata
already taking place,
-
as well as the names project
for the National Land Survey.
-
And then, there's a third project,
the Finnish Names Archive,
-
which doesn't yet contribute to this,
-
but there are plans for that.
-
So, one of the key modeling issues
in this whole problem area
-
is that there are three types
of elements in place names
-
represented in this project.
-
One of them is the place,
the one that has location.
-
And one of them is the place name,
the toponym, for example.
-
And then, there are sources,
which are documents
-
from which both of these can be derived,
-
or like, backed up with.
-
The YSO places--
here, on the top right,
-
you will see the same diagram again.
-
It focuses mainly on the places.
-
The main thing of this
is the Finnish National Library,
-
and the Finto project.
-
There are now more than 7,000 places
in Finnish and Swedish
-
and over 3,000 in English,
-
and they are licensed CC0.
-
So, here you can see the service of Finto.
-
And a place-- I chose Sevettijärvi.
-
It is now also related
to our language project
-
with the Skolt Sámi--
-
this is a place
in the very north of Finland
-
inhabited by Skolt Sámi.
-
So, here you can see the place
which belongs to the--
-
well, you will see the data
about this place.
-
You can see that it is connected
to Wikidata,
-
as well as this National Land Survey data.
-
Here we go. And you will see
this in more detail, here.
-
It is also hierarchically arranged
-
inside this repository.
-
Well, actually,
the actual place is not seen,
-
but it is underneath this municipality,
-
as well as the region,
-
and Finland as a country,
and Nordic countries,
-
the broader region.
-
Here you can see that many of these
-
have been matched
with Wikidata previously
-
through Mix'n'Match,
and there are still remaining ones.
-
But then, the amount of names
is not that high.
-
It's only less than 5,000.
-
So, then there is this other repository
-
by the Finnish Geospatial
Platform Project--
-
Place Names Cards.
-
These are all the place names
that are on Finnish maps.
-
And they have the linked data,
which is licensed CC BY 4.0.
-
800,000 map labels in Finnish, Swedish,
and all those three Saami languages
-
that are in Finland.
-
And they have
two different types of entities.
-
Some of them are places,
and the others
-
are place names, toponyms.
-
And they both have persistent URIs.
-
Here's, for example,
the same Sevettijärvi, first in Finnish,
-
and then all those three Saami languages,
as well as the geographic data,
-
and then there is more information
about that, like the place type,
-
et cetera.
-
Here is the card for the place name,
the toponym, having its own URI.
-
Sorry, it seems that it's not translated
into the English list.
-
So, multilinguality
is not covering the whole project.
-
Okay, we come
to the Finnish Names Archive.
-
This is a project by the Institute
for the Languages of Finland,
-
and these represent not the places,
not the place names,
-
but they are actually sources for those.
-
So, these are three million
field notes of place names,
-
and it is a Wikibase project.
-
They are in a Wikibase,
mainly in Finnish, some in Swedish.
-
An outstanding collection of Saami names,
which we are very interested in.
-
And they are licensed CC BY.
-
And that is also a challenge
from the Wikidata point of view.
-
But if there was a Finnish local Wikibase,
-
we might be able to first work
on them in that project.
-
So, here's a screenshot of that,
-
showing that there's information
about the place, the maps--
-
the maps that the collectors
initially use,
-
and the card that they produce
of the information they collected.
-
So, here's one of those cards
-
broken down into data
-
that is included in them.
-
So, then there is
this linked data project
-
by the Helsinki Digital Humanities Lab
-
and the Semantic Computing
-
group of Aalto University--
-
and together with this Institute
for the Languages of Finland--
-
the Names Sampo.
-
And this is an aggregated
research interface
-
to several place name sources.
-
Here you can see that many
of the sources are out there on the left,
-
and then, you can make
different kinds of visualizations
-
based on this data.
-
And, yeah.
-
So, I've been bringing up this idea
of modeling for a local Wikibase
-
that we could do with this data.
-
But when we enter
these modeling questions,
-
how do we model?
-
There are different ways,
different traditions in each of these.
-
And the good thing about it
is it could also serve minority languages
-
with very little effort.
-
Okay. So, here we have
the two basic options:
-
the SAPO model, which is
the Finnish Space-Time Ontology,
-
and the Wikidata model.
-
Here you can see
that Wikidata items tend to persist.
-
Ideally, they remain the same
with the changing properties.
-
Whereas, in the SAPO model,
these items become new
-
when there is a change,
such as area change and name change.
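-
The difference between the two approaches can be sketched in code. This is only a minimal illustration under stated assumptions, not the actual Wikidata or SAPO data model: the class names, the `rename` helpers, and all identifiers and place names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Statement:
    """A property value with an optional validity period (hypothetical shape)."""
    prop: str
    value: str
    start: Optional[int] = None
    end: Optional[int] = None


@dataclass
class PersistentPlace:
    """Wikidata-style: one item persists, changes become dated statements."""
    item_id: str
    statements: List[Statement] = field(default_factory=list)

    def rename(self, new_name: str, year: int) -> None:
        # Close the currently valid name, then add the new one to the same item.
        for s in self.statements:
            if s.prop == "name" and s.end is None:
                s.end = year
        self.statements.append(Statement("name", new_name, start=year))


@dataclass(frozen=True)
class PlaceVersion:
    """SAPO-style: a change produces a new entity that points to its predecessor."""
    entity_id: str
    name: str
    valid_from: int
    predecessor: Optional[str] = None


def rename_versioned(old: PlaceVersion, new_id: str, new_name: str, year: int) -> PlaceVersion:
    # The old entity is left untouched; the change is a new entity.
    return PlaceVersion(new_id, new_name, year, predecessor=old.entity_id)


# Usage with made-up data
place = PersistentPlace("item-1", [Statement("name", "Oldname", start=1900)])
place.rename("Newname", 1950)

v1 = PlaceVersion("place-v1", "Oldname", 1900)
v2 = rename_versioned(v1, "place-v2", "Newname", 1950)
```

In the first variant the identifier stays stable and the history lives in the statements; in the second, every version has its own identifier, so links need to say which version they mean.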
-
So here, we come back to this division
-
between these three different dimensions
of places, place names.
-
So, should we make these place names
into entities or properties?
-
Wikidata uses properties,
-
whereas this land survey
project has entities.
-
Or should we make them into lexemes?
-
Wikidata has chosen to work
with properties,
-
textual properties
for place names over lexemes.
-
I'm sorry, the other way around.
-
So, the names are...
-
properties, not lexemes.
-
Right.
-
And maybe the shortcoming of the Wikibase
-
is the lack of geographical
shapes inside that--
-
like in the basic setup of it,
-
so one would have to add
more technology into the stack
-
to be able to use local geographic shapes.
-
And a federation is really needed
-
to be able to take advantage
of the Wikidata corpus.
-
So, I'm done already. Thank you.
-
(applause)
-
Okay.
-
(speaking in Maori)
-
Welcome, everyone.
My name is Mike Dickison.
-
And for a year,
-
I was New Zealand Wikipedian at Large.
-
You might wonder
what a Wikipedian at Large is.
-
Because if you actually look it up,
there is no such thing, as we can see.
-
It's a term that I made up
in the grant proposal,
-
which the foundation
seemed to like very much.
-
And so, we ran with it.
-
So, for a year, I went through
35 different institutions,
-
in residence at most of them,
running training sessions,
-
organizing public events,
and trying to develop
-
a Wikimedia strategy for each one.
-
It was a very interesting experience,
-
and you encounter a wide range
of different projects and people.
-
And I wanted to try and talk through
some of the different projects
-
that dealt with Wikidata
-
in interesting or, perhaps,
illuminating ways,
-
that might be useful for folks to discuss.
-
The project was initially
a Wikipedia project by name,
-
simply because that was what people
were familiar with,
-
and so we organized
multiple different events
-
like very traditional edit-a-thons,
gender gap work, and so forth.
-
[And a bunch you can see] [inaudible],
-
and a bunch of very successful
new editors recruited, and so forth.
-
We did bulk uploads into Commons.
-
In this case, there was a collection
of over 1,000 original artworks
-
by an entomological
illustrator, Des Helmore,
-
which had been sitting on a hard drive,
-
at Landcare Research for ten years,
-
and we were able
to get clearance to release those
-
all under a CC BY license.
-
So, easy wins to show to people there.
-
Everyone can understand
lots of pictures of beetles.
-
Everyone can understand workshops
devoted to fixing the gender gap.
-
But Wikidata
is much more difficult to sell
-
to people in the GLAM sector,
-
or anyone outside
of our particular movement.
-
So, I began to realize that Wikidata
-
was going to be a more
and more important part
-
of the Wikipedian at Large projects.
-
So, as we went through, it became
a larger and larger component
-
of what I was doing.
-
And I began to try and teach myself
more about Wikidata as well,
-
because I was beginning to see
how important it was.
-
So, this one project--
-
the kakapo is a native
New Zealand flightless parrot.
-
We worked with
the Department of Conservation,
-
whose job is to save
this species from extinction,
-
and pitched the idea,
-
"What if we put every
single kakapo into Wikidata?"
-
And that may seem ridiculous,
-
but it's actually
a perfectly doable project.
-
A few of them are in there already.
-
A key thing to notice here
is there are not many kakapos.
-
So, it's a manageable task.
-
There were 148 when I started,
and then one died.
-
And they've just had
a great breeding season up to 213.
-
This is great. This is the most kakapo
there have been for over 50 years.
-
So, this was also a big deal.
-
This was on the news
every day in New Zealand.
-
Each new one that was born--
-
(man) In the New York Times.
-
(Mike) Did it? Oh, lovely.
-
Yeah, this was national news.
Everyone likes these birds.
-
But something interesting about them
-
is that, unlike species
that are more populous,
-
every single kakapo is named,
has a unique name
-
and a unique ID number.
-
And often has good biographical data
-
about where and when they were born,
-
were hatched, who their father
and mother were,
-
when they died, if they died.
-
So, there is, in fact,
a Department of Conservation database
-
of all this information.
-
And one of the most famous kakapos,
of course, is Sirocco,
-
who you can see is named
after a wind, was born there.
-
Sirocco has a Twitter account,
-
which Wikidata had some problems with,
-
because, apparently,
they just can't have Twitter accounts.
-
I don't know about that.
-
He's even featured
on an album cover, and so forth.
-
So there are multiple properties of this,
-
probably one of the most famous
individual kakapo.
-
So, I pitched to the Department
of Conservation,
-
"Why don't we try and do this
with every single one?"
-
And so, they had to think about
how much of the biographical data
-
could be made public.
-
And they came up with a short list.
-
And now we've got, I think, 212,
210--I think a couple died--
-
living kakapo that are all candidates now.
-
And they only get a name when they fledge.
-
They have a code number until then,
while they're still babies.
-
So, when we've got the full-fledged crop,
-
we're going to create
a complete Wikidata--
-
the entire species will be in Wikidata.
-
But we need to come up
with a property for DOC ID--
-
I actually would like to talk
with folks about that.
-
Should we be using a very specific ID,
-
or should we be coming up with an ID
-
that would work for all individual birds
or plants or animals
-
that have been tagged
in any scientific research project?
-
It's a good question.
-
Second project was
Christchurch Art Gallery.
-
There are very few paintings
of Colin McCahon,
-
New Zealand's most famous
artist in existence.
-
This is a drawing he did
for the New Zealand School Journal,
-
which was government-funded at the time.
-
So, it's actually Archives New Zealand
-
who own the copyright for that.
-
This is a very unusual situation.
-
So, I worked with
Christchurch Art Gallery
-
who, along with Auckland Art Gallery,
-
maintain a site called
Find New Zealand Artists.
-
The job of which is to keep track
of the holdings--
-
every institution that has holdings
of the New Zealand artist.
-
So, about 18,000 different artists
in their database,
-
and most with very little
information at all.
-
So, we did a standard sort of Mix'n'Match.
-
We did an export of the ones
that had at least a birth date,
-
or a death date, or a place of birth,
or a place of death.
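-
The export step described here can be sketched roughly as follows. This is a hedged illustration only: the field names, the record shape, and the three-column tab-separated layout (ID, name, description) are assumptions for the example, not the gallery's actual schema or a guaranteed import format.

```python
import csv
import io

# Hypothetical field names for the biographical data mentioned in the talk.
BIO_FIELDS = ("birth_date", "death_date", "birth_place", "death_place")


def has_biographical_hook(record):
    """True if the record has at least one of the four biographical fields."""
    return any(record.get(f) for f in BIO_FIELDS)


def export_catalog(records):
    """Return tab-separated lines (id, name, description) for matchable records."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    for r in records:
        if has_biographical_hook(r):
            # The description simply lists whatever biographical fields exist.
            desc = "; ".join(f"{f}: {r[f]}" for f in BIO_FIELDS if r.get(f))
            writer.writerow([r["id"], r["name"], desc])
    return buf.getvalue()


# Usage with made-up records
records = [
    {"id": "a1", "name": "Jane Example", "birth_date": "1900"},
    {"id": "a2", "name": "No Data"},
]
tsv = export_catalog(records)
```

Records with no dates or places at all are dropped, because a bare name gives a matcher almost nothing to work with.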
-
So, that's not restricting it very much.
-
And even then, we were not able
to match quite a few,
-
but we've got about 1,500 now
-
that are matched
to known artists in Wikidata,
-
which is nice.
-
But what was appealing to them--
-
this is their website,
-
which really just maintains
the holdings links there.
-
But this biographical data,
which they create by hand, currently,
-
for every single artist.
-
And the act of exporting
and putting into Mix'n'Match
-
exposed numerous typos
and mistakes and such
-
that they hadn't noticed.
-
And it's only when you start
running things through [Excel],
-
these things show up.
-
And the value of Wikidata
was suddenly conveyed to them
-
when I said, "You can just suck in
that information from Wikidata."
-
And that made them sit up straight.
-
So this, I think, is one
of the selling points.
-
When you have this carefully
hand-curated website
-
with 18,000 entries, full of mistakes,
and tell them there's another way,
-
that they can get other people
-
to do some of this fact-checking
and correction for them--
-
that's when it sinks home.
-
And then I pitched the idea
-
that they "Wikidatafy"
this entire history book
-
of the New Zealand artists
in Christchurch in the '30s,
-
and run through--just published--
and run through every single person,
-
connection, place, exhibition, and such.
-
But it's a manageable sized project,
and they're very excited by this.
-
And thirdly, I wanted to show you
Maori Subject Headings.
-
A waka is a Maori name
for a particular kind of canoe,
-
a war canoe.
-
So, in the National Library
of New Zealand,
-
there's a listing for waka,
because the National Library
-
actually has its own dictionary
of Maori Subject Headings,
-
in the Maori language.
-
So, there it defines a waka,
-
in Maori and English.
-
But it also has a whole lot
of narrower terms,
-
you can see there on the side there.
-
A typical one would be taurapa.
-
And a definition first in Maori,
and then in English.
-
It's the carved sternpost
that you can see there.
-
And in English, you would say "sternpost,"
-
but you can't use
the word "sternpost" for taurapa,
-
because taurapa only works
for particular kinds of war canoes.
-
So, there's no English word
equivalent for that.
-
And I suddenly realized
that here is an entire ontology
-
of cultural-specific terms that have been
very carefully worked out
-
and verified by the National
Library with Maori,
-
constantly being added to and improved
with definitions, with descriptions,
-
in both English and Maori.
-
Really exciting.
-
I suddenly thought we could put
this whole lot into Wikidata--
-
Maori first, and then translated
into English, as required.
-
Be a nice change, wouldn't it!
-
And here's the copyright licensing.
-
Unfortunately, NonCommercial-NoDerivs.
-
So now I have to start
the conversation with them
-
about why did they pick that license.
-
And possibly because they only got
[buy in] from Maori,
-
who agreed to sit down
and [inaudible] this stuff
-
if there was a guarantee
-
that none of this information
could be used for commercial purposes.
-
So, that's one of the frustrating
aspects of the task
-
is coming up against
these sorts of restrictions.
-
So, those are the three things
I wanted to put out in front
-
to spark discussion.
-
Putting an entire species into Wikidata,
-
what it takes to actually change
an art gallery curator's mind
-
about the value of Wikidata,
-
and what do we do when we would see
a complete ontology
-
in another language that,
unfortunately, has been slapped
-
with a restrictive
Creative Commons license.
-
Thank you.
-
(applause)
-
Hello. My name is Joachim Neubert.
-
I'm working for the ZBW,
-
that is, the Information Center
for Economics in Hamburg,
-
as a scientific software developer.
-
And one of my tasks last year
was preparing a data donation to Wikidata.
-
And I want to report
on our first experiences
-
from donating metadata
from the 20th-Century Press Archives.
-
To our best knowledge,
-
this is the largest public
press archive in the world.
-
It has been collected
between 1908 and 2005,
-
and has been drawn from
-
more than 1,500 newspapers
and periodicals
-
from Germany, and also internationally.
-
And it has covered everything
which could be of interest
-
for the Hamburg,
-
the Hamburg businesspeople
-
who wanted to expand over the world.
-
As you can see, this material
has been clipped from newspapers
-
and put onto paper,
-
and then collected in folders.
-
Here you see a small corner
of the Persons Archive,
-
and, similarly, information
has been collected on companies,
-
on general topics, on wares,
on everybody,
-
on everything which could be interesting.
-
These folders have been scanned
-
up to roughly 1949
-
by a DFG-funded project from 2004 to 2007.
-
As a result, there are now
25,000 thematic dossiers
-
from that time.
-
These contain about 2 million,
or more than 2 million pages.
-
And these are online.
-
The application was developed
at that time by ZBW,
-
and it now looks a bit outdated,
-
not so fancy,
and what's more of a problem:
-
It's an application which was built
architecturally on Oracle,
-
it was built on ColdFusion,
it runs on Windows servers,
-
so it's not very sustainable
in the long term.
-
And we have discussed
should we migrate this
-
to a more fancy linked data application,
-
or should we take a radical step
-
and put all this data in the open.
-
We have assigned CC0 license to that data
-
and are currently moving the main
-
access layer, the main discovery layer--
so, the primary access layer--
-
to the open linked data web,
-
where it actually makes most sense
-
to put some metadata into Wikidata,
-
and to make sure that all folders
-
of the collections are linked to Wikidata,
-
so they are findable,
-
and that all metadata about these folders
-
is also transferred to Wikidata.
-
So it can be used there,
and it can be enriched there, possibly.
-
Corrections can be made to that data.
-
What is still maintained by ZBW is,
of course, the storage of the images,
-
which we can't put in any way,
-
or we can't give a license on that
-
because this was owned
by the original creators.
-
But we make sure that they are accessible
-
by some, again, metadata files
via DFG Viewer
-
and, in the future, by IIIF manifests.
-
And we will prepare
some static landing pages
-
which will serve as a data point
of reference for Wikidata,
-
as well as still making available data
-
which doesn't fit well into Wikidata.
-
For this migration
and data donation to Wikidata,
-
we set up our custom infrastructure,
-
a SPARQL endpoint with that data,
-
and we basically used federated queries
-
between that endpoint
and the Wikidata Query Service
-
to create the corresponding statements,
-
either concatenated
-
in the SPARQL queries themselves,
or transformed via a script,
or transformed via a script,
-
which also generated references
for the statements.
-
And then we put that into QuickStatements.
The code used for this is online.
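-
The pipeline described here, a federated query followed by a script that emits QuickStatements, might look roughly like the sketch below. It is not ZBW's actual code: the `ex:` vocabulary, the `PXXXX` property placeholder, the row keys, and the example URL are all hypothetical. Only the Wikidata Query Service address, P569 (date of birth), and S854 (reference URL as a source) are real identifiers.

```python
# Illustrative federated query, meant to run on the archive's own endpoint.
# The ex: vocabulary and the property placeholder PXXXX are hypothetical.
FEDERATED_QUERY = """
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX ex:  <http://example.org/archive/>
SELECT ?item ?birthDate ?folderUrl WHERE {
  ?folder ex:folderId ?folderId ;
          ex:birthDate ?birthDate ;
          ex:landingPage ?folderUrl .
  SERVICE <https://query.wikidata.org/sparql> {
    ?item wdt:PXXXX ?folderId .
  }
}
"""


def to_quickstatements(rows):
    """Turn result rows into tab-separated QuickStatements (V1) lines.

    Each row is a dict with the hypothetical keys 'item' (a QID),
    'birth_date' (YYYY-MM-DD) and 'ref_url' (the folder's landing page).
    """
    lines = []
    for row in rows:
        # QuickStatements time values: +ISO timestamp, /11 means day precision.
        date = f"+{row['birth_date']}T00:00:00Z/11"
        url = row["ref_url"]
        # S854 attaches a reference URL as the source of the statement.
        lines.append("\t".join([row["item"], "P569", date, "S854", f'"{url}"']))
    return lines


# Usage with a made-up result row
rows = [{"item": "Q42", "birth_date": "1952-03-11",
         "ref_url": "https://example.org/folder/1"}]
commands = to_quickstatements(rows)
```

Generating the reference together with the statement keeps every imported claim traceable to the archive's landing page.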
-
So, this is what we get.
-
It's not only simple things
like birth dates, but, sorry--
-
but also complex statements
-
about already existing items,
-
like this person was a supervisory
board member of said company
-
during this period of time,
-
and referenced for use in...
-
in the scientific context.
-
The first part of this data donation
has been finished.
-
The Persons Archive
is completely linked to Wikidata.
-
And this has also added information to
-
a lot of items which before
-
had not had any external references.
-
And we added more
than 6,000 statements,
-
which are now sourced
in this archive's metadata.
-
Well, this was the easiest part,
-
because persons are easily
identifiable in Wikidata.
-
More than 90% already existed here,
-
so we could link to that.
-
We created some 100 items for these,
-
for the ones which were missing.
-
But now, we are working
-
on the rest of the archive,
-
particularly on the topics archive.
-
Which means mapping a historic system
for the organization of knowledge
-
about the whole world,
-
materialized as newspaper
clippings to Wikidata.
-
To give you a basic idea,
the Countries and Topics archive
-
is organized by a hierarchy of countries
-
and other geographic entities,
-
which is translated to English,
which makes this easier.
-
And a German, deeply nested...
-
deeply nested classification of topics.
-
And this combination defines one...
-
one folder.
-
So, what we now want to do
is to match this
-
as a structure to Wikidata,
and to bring the data in.
-
And I want to invite you
-
to join this really nice challenge
-
in terms of knowledge organization.
-
So, there's a WikiProject
where this work is tracked,
-
and you can follow this
or participate in this.
-
And, yes, thank you very much.
-
(applause)
-
So, we're taking
performing arts to Wikidata.
-
And we're taking performing arts
to the linked open data cloud,
-
by building a linked open data
ecosystem for the performing arts.
-
And the question I'm trying to answer,
-
and I hope you'll help me
in answering it,
-
is which place there is for Wikidata in all that.
-
But let me first start with my experiences
-
which I made this year,
-
the first half of the year,
when I had the pleasure
-
to work with CAPACOA,
-
which is the Canadian Arts
Presenting Association,
-
which actually launched a project
called Linked Digital Future Initiative,
-
to actually get the entire arts sector
in Canada to embrace linked open data.
-
And they did that based on the observation
-
that over the past five years,
-
the [inaudible]-- the important topic
within performing arts
-
was the fact that metadata
was not around in sufficient quality
-
and not interlinked, not interoperable.
-
And that was why some of the performances,
-
some of the events
are not so well findable
-
by Google and by personal
computer-based assistants, and so on.
-
So, the vision we kind
of developed together
-
is that we want to have a knowledge base
-
for many stakeholders at once.
-
So we looked at the entire
performing arts value network,
-
we identified key stakeholders in there,
-
we looked at the usage scenarios
that we like to pursue,
-
and we kind of mapped it
to the whole architecture
-
of such a knowledge base,
or of the different platforms in there,
-
which, obviously,
is a distributed architecture,
-
and not one big monolith.
-
I'm just going to run
through that quite quickly
-
because we have ten minutes each.
-
But I think we'll have plenty of time
tonight or tomorrow to deepen that
-
if anybody's interested in the details.
-
So, we started from
that Performing Arts Value Network,
-
which, interestingly,
was just published last year.
-
So, we're lucky to be able
to build on previous work,
-
like you have the primary value chain
of the performing arts in the middle,
-
and various stakeholders around that.
-
All in all, we identified
20 stakeholder groups,
-
which then we kind of boiled down
into seven larger categories.
-
For each of the stakeholder groups,
-
we kind of formulated what kind of needs
-
they would have in terms
of such an infrastructure,
-
and what would they be able to achieve
if the whole thing was interlinked
-
and the data was publicly accessible.
-
And so, you can see the types here,
-
the different types are Production,
then Presentation & Promotion,
-
Coverage & Reuse, Live Audiences,
-
Online Consumption, Heritage,
-
Research & Education.
-
And after kind of setting up a big table,
-
of which you can see
just the first part here,
-
we kind of compared [over there],
had a look at which type of data
-
were actually used across the board
-
by all different groups of stakeholders.
-
And there's quite a large basis of data
that is common to all of them,
-
and that really is the area
-
where it makes a lot of sense, actually,
to cooperate and to keep that--
-
to maintain the data together.
-
So, when talking about
platform architecture,
-
you can see that we have four layers here.
-
At the bottom, there's the data layer.
-
Of course, Wikidata plays a part in it,
-
but also a lot of other databases,
distributed databases
-
that can expose data
through SPARQL endpoints.
-
The yellow part in the middle,
that's the semantic layer.
-
It's our common language
to describe our things,
-
to make statements about things
around the performing arts, the ontology.
-
Then we have an application layer
-
that consists of various modules,
for example, data analysis,
-
data extraction-- so, how do you
actually get unstructured data
-
into structured data--
-
how can we support that by tools.
-
Then, obviously, there's
a visualization of data--
-
so if there are large quantities of data,
you want to visualize it in some way.
-
And on the top, you have
the presentation layer,
-
that's what the ordinary people
are actually interacting with
-
on a daily basis--
-
search engines, encyclopedias,
cultural agendas,
-
and a variety of other services.
-
We're not starting from scratch.
-
Some work has already
been done in this area.
-
I'll just cite a few examples
from a project
-
which I have been involved in.
-
Some other stuff going on as well.
-
And so, I started in this area
-
with the Swiss Archive
of the Performing Arts.
-
[Until] building a Swiss
Performing Arts database,
-
we created the performing arts ontology,
-
that's currently being
implemented into RDF.
-
And there we have the database
of like 60, 70 years
-
of performance history in Switzerland.
-
So, that's something we can build on,
-
and that's something
that's been transformed into RDF.
-
And there was a builder platform
where this data can be accessed.
-
Then we have done
several ingests into Wikidata,
-
partly from Switzerland,
-
partly also from
the performance arts institutes,
-
for example, Bart Magnus
was involved in that.
-
He was the driving force behind that.
-
There's also stuff from Wikimedia Commons,
-
but not very well interlinked
with all the rest of our metadata.
-
And obviously, by doing this ingest,
-
we also kind of started to implement
parts of this Swiss data model
-
into Wikidata.
-
Then one of the Canadian
implementation partners
-
is Culture Creates.
-
They're running a platform that actually
scrapes information from theater websites,
-
and inputs it into a knowledge graph,
-
to then expose it to search engines
and other search devices.
-
And there again, we kind of had
to implement and extend this ontology.
-
And what you can see from the slide
is that there are so many empty spaces,
-
but there's also some overlap,
-
and an important overlap, obviously,
is the common shared language,
-
which will help us actually interlink
the various data sets.
-
What is also important, obviously,
-
is that we're using the same
base registers and authority files.
-
And this is a place where Wikidata
plays an important role
-
by kind of interlinking these.
-
Now, I'd like to share the recommendations
-
by the Linked Digital Future Initiative's
Advisory Committee.
-
At least the two first recommendations.
-
So, for the Canadians,
now it's absolutely crucial
-
to kind of fill in their own Canadian
performing arts knowledge graph,
-
because unlike the Swiss Archive
of the Performing Arts,
-
they're not starting
with an already existing database,
-
but they're kind of
creating it from scratch.
-
And it's absolutely crucial
to have data in there.
-
And second, as you can see,
Wikidata already comes in.
-
Wikidata, by the Advisory Committee,
-
has been seen as complementary
to Artsdata.ca, this knowledge graph,
-
and, therefore, efforts should
be undertaken to contribute
-
to its population
with performing arts-related data.
-
And that's what we're going to work on
over the coming months and years,
-
and that's also why
I'm kind of on the lookout here
-
to see who else will join that effort.
-
So, right now, obviously,
we're saying they're complementary.
-
So, we have to think about
the pluses and the minuses
-
of each of the approaches.
-
And you can see here a comparison
-
between Wikidata and the Classical
Linked Open Data approach.
-
I would be happy to discuss
that further with you guys,
-
how your experiences are in there.
-
But, as I see it, a huge plus of Wikidata
is that it's a crowdsourcing platform,
-
and it's easy to invite further parties
to actually contribute.
-
On the negative side, obviously,
you get this problem of loss of control.
-
Data owners have to give up control
over their graphs, data quality,
-
and completeness.
-
It's harder to track on Wikidata
than if you have it under your control.
-
And the other strength of Wikidata
-
is that it requires immediate integration
into that worldwide graph.
-
And you kind of just do it--
-
kind of reconcile step by step
against other databases,
-
which may also be seen by some
as an advantage,
-
but of course, if you're looking
for integration and interoperability,
-
Wikidata forces you to go for that
from the beginning.
-
And then, obviously, harmonizing
data modeling practices
-
is an issue in both cases.
-
But it may seem, at the beginning,
easier to do just within your own silo,
-
because at some point,
you're done with the task,
-
and it would be
an ongoing task on Wikidata.
-
So, when it now comes to prioritizing
the data to be ingested,
-
that's like the rules
I kind of go by at the moment.
-
First of all, we'd like to ingest it
-
where it's unclear who would be
the natural authority in the given area.
-
So that's definitely data
that will be managed in a shared manner.
-
And we'd like to ingest it where we see
-
a high potential
for crowdsourcing approaches.
-
We'd like to ingest data where the data
is likely to be reused
-
in the context of Wikipedia.
-
And there's also hope that some part
of the international coordination
-
around the whole data modeling,
about the standardization,
-
could actually take place
directly on Wikidata,
-
if it's not taking place elsewhere,
-
because it kind of forces people
to start interacting
-
if they ingest data in the same part.
-
And we'd like to focus now next
on base registers and authority files
-
because they kind of help us
create the linkages
-
between different data
and on controlled vocabularies
-
as an extension of the existing ontology.
-
So, just two more slides.
-
The next steps will be that we're taking
the sum of all GLAMs approach
-
to Wiki Loves Performing Arts.
-
That means we're describing
venues and organizations,
-
and trying to push the data to Wikipedia
-
in the form of infoboxes
and [bubble] templates.
-
And the other project
I'm going to pursue is a COST Action
-
that we'll submit next year
-
around that Linked Open Data Ecosystem
for the Performing Arts.
-
COST is a European program
that supports networking activities,
-
and the topics to be covered
are listed here.
-
Two of them, I have highlighted--
-
one of them is like the question
of federation between Wikidata
-
and the classical linked
open data approaches.
-
And the other one, I think,
is very important also,
-
where we have a huge potential still,
-
is implementing international campaigns
to supplement data on Wikidata.
-
So, that's it. Thank you
for your attention.
-
Now, I would like to ask
my colleagues up here.
-
To the panel, maybe you'll get them
microphones as well.
-
And then I would like to...
-
give you the chance to ask questions.
-
And obviously, also ask my colleagues
-
whether they have questions to each other.
-
So, do we have maybe a question
from the audience?
-
(man) [inaudible]
-
I would like to ask from each of you
-
where would you draw the line,
-
basically, how you define--
-
when do you need to run your own Wikibase,
-
and what do you want to put on Wikidata?
-
Like, is this a clear delineation
of what is seen
-
behind of putting it [into order.]
-
I can answer first because I have the mic.
-
So, I've been thinking
that one of the issues is notability.
-
I'm addressing that
in a different project.
-
And I think licensing could be one,
-
because you can apply your own terms
in your own database,
-
and then I think wherever it's possible.
-
And then, the third one
is just to have it as a sandbox,
-
prepare it for ingestion into Wikidata.
-
These are the three main things
that I can come up with now,
-
but I can come up with more.
-
For me, rights are always
going to be an issue.
-
So, if the National Library
wanted to move towards Wikibase,
-
that would enable them to continue
to control the licensing
-
for the work they've done
with Maori language terms.
-
The kakapo database only contains data
-
that the Department of Conservation
felt could be made public,
-
but I suspect if they see it
up and running,
-
they might be tempted
to use a private Wikibase
-
to maintain their own database,
-
simply because of some
of the visualization tools
-
that could be applied might be better
-
than the sort of Excel spreadsheet system
that they currently run.
-
Well, I think this very much depends
on the kind of data.
-
We are, with the Press Archive, of course,
in a quite lucky position,
-
in that this was material
which was published,
-
it was published at the time,
-
but it was expensive to publish.
-
So, this is quite easy.
-
I think, also, projects--
-
and this is a typical project,
-
so it was funded for some time,
and then funding ended,
-
and what happens with the data
which is enclosed in some silo,
-
and some software
which will not run forever.
-
And so, it makes
absolute sense in my eyes.
-
At the time, Wikidata
wasn't around, but now it is,
-
and it makes absolute sense
for our project to early on
-
discuss sustainability in the context
of how could we put this
-
into a larger ecosystem like Wikidata,
-
and discuss with the data community
-
what is notable and what makes sense
to add to Wikidata,
-
and what makes sense to keep
in a proprietary form.
-
Maybe in a simpler form
than a sophisticated application,
-
but make it discoverable
and make it linked to the large data cloud
-
instead of investing lots of money
-
into some silo which will not sustain.
-
Yeah, as I said before
in the project I was presenting here,
-
there are dualities between Wikidata
and classical linked open data approaches.
-
So, it's not so much about
setting up a private Wikibase.
-
Like one challenge we have had,
and, of course, in Wikidata,
-
is that when you ingest
your own data there,
-
you also have to do some housekeeping
-
of people, of other people, actually.
-
And that can put off people,
[or it also means] that we will address it
-
just step by step.
-
So, there will be, at the moment,
a database living--
-
in classical linked open data
-
and we're starting to link it
with Wikidata,
-
and it's a continuous process to find out
-
for which areas the most data
will be eventually on Wikidata,
-
and for which areas it will actually
live on other databases.
-
Obviously, we'll have challenges
regarding synchronization,
-
as we probably all have,
-
because that's the linked data field,
-
where we still have
to negotiate who we trust,
-
who has authority about what.
-
(assistant) Other questions?
-
(woman) Thank you.
-
So, fully agree with that issue of--
-
where to put the boundary
between why do we put data on Wikidata,
-
or why do we keep them,
and create, manage, and maintain them
-
in local databases and for what purposes.
-
And I think that
this is a large discussion
-
that goes beyond just the excitement
-
of putting data on Wikidata
because it is public,
-
because it serves humanity, because--
-
well, there are cool tools,
-
and things are more complicated
in real life, I think.
-
Well, despite this,
it's quite an interesting discussion.
-
And then this is another issue, also,
or another problem that is being discussed
-
in this event in different panels.
-
It is on one side, have your own database,
-
whatever the technology is
-
and publish things on Wikidata,
-
or build your own system
-
of creating and managing information
-
on the Wikibase technology.
-
And then, synchronize or whatever--
do federation or things,
-
so it's a matter
of technology that is used,
-
and the fact that you use Wikidata
just for publishing,
-
or the infrastructure
that is underneath Wikidata
-
to create and manage your data.
-
I mean, we had a discussion
-
about the Wikibase panel,
-
and there will be other discussions here,
-
but things are
on different levels, I think.
-
Maybe [you sort of get] to that discussion
about Wikibase or Wikidata--
-
I think it's problematic
that we are focusing so much
-
on this Wikibase infrastructure,
because there are other infrastructures,
-
like in the area of performing arts.
-
We have another complementary community,
which is MusicBrainz
-
that runs on their own platform
that provides linked open data,
-
and as I understand it,
-
there's agreement
within the Wikidata community
-
that we're not going
to double all their data--
-
we're not going to copy all their data,
but we accept that they're complementary.
-
So, what will happen when you start
integrating this data in Wikipedia?
-
Infoboxes, for example.
-
Would we be able to pull that data
directly from their SPARQL endpoint?
-
Or would we be obliged
to kind of copy all the data,
-
and what kind of processes
are involved in that?
-
(woman) Discussions are open, I think,
-
because within this event,
you have both interested communities--
-
those that are interested in Wikibase,
-
and those that are interested in Wikidata,
-
and those who are interested in both.
-
Yeah, but we're not going
to oblige them to move to Wikibase.
-
- (woman) Not necessarily.
- MusicBrainz is not running on Wikibase.
-
(woman) No, I just wanted to say
that you have separate problems,
-
sometimes interrelated,
sometimes not completely separated.
-
And I had another question or remark
-
regarding the management of hierarchies
in controlled vocabularies,
-
like thesauri, like you have in Finto.
-
You do have the places
-
in the Maori
-
Subject Headings,
-
Well, they have to deal with
the management of concepts in hierarchy.
-
What is your take, your opinion
-
about the possibility
of managing these controlled
-
knowledge organization
systems in Wikidata?
-
I think in the case
of Finto and YSO places,
-
the repository will be a collection
-
of several sources, eventually.
-
So, it is in flux, anyway.
-
So, we don't have to necessarily--
-
well, I don't represent
the National Library,
-
but in that possible project,
-
we would not have
to maintain an existing--
-
or fight with an existing structure.
-
So, in that sense, it is an area
open for exploration.
-
The Maori Subject Headings
seem to lend themselves ideally
-
to Wikidata structure,
-
but the licensing,
of course, forbids that.
-
I suspect that if the licensing
were different
-
and they were put into Wikidata,
-
as soon as somebody decided
they didn't like the hierarchy
-
and started to change things,
-
there would be an immediate outcry
from people who worked very hard
-
to create that structure
-
and get the sign-off
from various different Maori
-
that was the current hierarchy.
-
So, that's an issue to try and resolve.
-
I think in terms of knowledge
organization systems,
-
they are all different.
-
And I'm not sure
if it would be a good idea
-
to represent different hierarchies
in Wikidata as such,
-
but it maybe makes sense
to think about overlays
-
of the data.
-
So, to do mappings on the content level.
-
For example, ZBW publishes the
Thesaurus for Economics.
-
And this thesaurus has its own hierarchy,
-
and, of course, it would be possible
to project the hierarchy
-
of this thesaurus into Wikidata concepts
-
without actually storing
this kind of structure
-
as an alternative structure
within Wikidata
-
which would create a lot of confusion.
-
But I think we should think
of Wikidata, also, as a pool of concepts
-
which can be connected on layers
which are outside,
-
and which give another view of the world
-
which does not necessarily have to be
within Wikidata.
-
(assistant) Alright. Some other questions?
-
Otherwise-- okay.
-
(man 2) Joachim, I just wanted
to follow up on that last point.
-
So, these layers, as you picture it,
-
they would be maintained externally
-
and somehow integrated
-
with Wikidata from the Wikidata side,
-
or have you thought a bit further
-
about how that might be managed?
-
Actually, no, I have no--
-
I have done experiments
with ZBW and Wikidata.
-
I was [inaudible] here at Wikidata.
-
But I think this is
a whole new complex thing,
-
and so, it's up to [discuss],
[to give up a lot of control]
-
to do such things.
-
But it has to be figured out.
-
Should we take one more?
-
(man 3) Ah, great.
-
I was just wondering
about the kakapo project.
-
Uh-hmm.
-
(man 3) Okay. So, did you get
any pushback from the Wikidata community
-
about having individual animals
out of those items?
-
Not so far.
-
(man 3) Has anyone heard
about this before?
-
Is it "not so far" because
no one has heard about it yet?
-
There's been a small discussion
for quite some time now--
-
those people interested
in this sort of thing in Wikidata,
-
and we all seem to think
that it's a natural extension
-
of giving individual Wikidata items
to a famous racehorse
-
or someone's cat, which--
that's modeled pretty well.
-
I guess just the audacious thing
is putting the entire species in there.
-
But I think it's perfectly manageable.
-
(man 3) Don't try it with cats and dogs.
-
(laughter)
-
(assistant) Okay. I think
the time is finished.
-
Thank you very much for attending.
-
I think the speakers will still be open
for questions during the break.
-
And have fun.
-
Thank you very much.
-
(applause)