-
Hello, everyone.
-
It's awesome that you're all here,
so many of you.
-
It's really, really great.
-
So Lea already talked a lot
about this event,
-
and I'm going to talk a bit
about Wikidata itself
-
and what has been happening
around it over the last year
-
and where we are going.
-
So... what is this? Sorry.
-
So... where are we?
Where are we going?
-
Over the last year there has been
so much to celebrate
-
and I want to highlight some of that
-
because sometimes it goes unnoticed.
-
And first I want to take you through
some statistics around editors
-
and our content and how our data is used.
-
Over the last year,
we have grown our community
-
which is amazing.
-
We have around 3,000 new people
-
who edit once or more in 30 days.
-
So that's 3,000 new Wikidatans, yay!
-
Now if you look at people who do more,
like five edits in 30 days,
-
we've got an additional 1,200 roughly.
-
And if you look
at the people who do 100 edits or more--
-
I hope many of you in this room--
-
we have 300 more.
-
Raise your hand
if you're in this last group.
-
Woot! You're awesome!
-
And while the number of edits
is usually not something
-
we pay a lot of attention to,
-
we did cross
the 1 billion edits mark this year.
-
(applause)
-
Alright, let's look at content.
-
So, we're now at 65 million items,
-
so entities to describe the world,
-
and we're doing this
with around 6,700 properties.
-
Of those, around 4,300
are external identifiers,
-
which gives us a lot of linking
to other catalogues, databases,
-
websites and more
-
and really makes Wikidata
the central place
-
in a linked open data web.
-
So using those properties and items,
-
we have around 800 million statements now,
-
and compared to last year,
we know about half a statement more
-
about every single item.
-
(laughter)
-
So, yeah, Wikidata got smarter.
-
But we don't just have items
and properties,
-
we also have new stuff
like lexemes
-
and we are now at 204,000 lexemes
that describe words
-
in many different languages.
-
It's very cool.
-
I will talk more about this
in a session later today.
-
Last, the latest addition
are entity schemas
-
that help us figure out
how to consistently model data
-
across a certain area.
-
And of those, we have around 140 now.
-
Now numbers aren't everything
around content, right,
-
amount of content--we also care
about quality of the content.
-
And what we've done now is
we've trained a machine learning system
-
to judge the quality of an item.
-
Now this is far from perfect,
but it gives you an idea.
-
So every item in Wikidata gets a score
between 1 and 5.
-
One is pretty terrible; five is amazing.
-
And it looks at things
like how many statements does it have,
-
how many external identifiers
does it have,
-
how many references are there,
-
how many different labels are there
in different languages,
-
and so on.
-
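The feature list above can be sketched as a toy heuristic. This is an illustrative guess at how such features might combine into a 1-to-5 score, NOT the actual trained model Wikidata uses:

```python
def toy_item_score(n_statements, n_external_ids, n_references, n_labels):
    """Map the features mentioned above to a 1-5 quality score.

    Hand-rolled illustration only -- the real system is a trained
    machine learning model whose features and weights differ.
    """
    # Cap each feature so one dimension can't dominate the score.
    raw = (min(n_statements, 20) / 20
           + min(n_external_ids, 10) / 10
           + min(n_references, 10) / 10
           + min(n_labels, 20) / 20) / 4  # normalized to 0..1
    return 1 + round(raw * 4)  # 1 = pretty terrible, 5 = amazing

print(toy_item_score(0, 0, 0, 0))      # a bare item scores 1
print(toy_item_score(25, 12, 10, 30))  # a rich item scores 5
```

The capping is one plausible design choice: it rewards breadth across dimensions rather than, say, a hundred statements with no references.
-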
And then we looked at Wikidata over time,
-
and as you can see,
based on these measures,
-
we went from pretty terrible
to much better.
-
(laughter)
-
So that's good.
-
But what you can also see,
there's still a lot of room to 5.
-
Now I don't think
this is where we will get to, right?
-
Not every item will be absolutely perfect
-
according to these measures
that we have taken.
-
But I'm really happy to see
that consistently the quality of our data
-
is getting better and better.
-
Okay, but creating that data isn't enough.
-
We want this--we do this for a reason.
-
We want it to be used.
-
And now we looked at how many articles
-
on each of the other Wikimedia projects
use data from Wikidata,
-
and we looked at the percentage
of all articles on those projects.
-
Now if you look across all of Wikimedia
-
and all of the articles there,
-
then 56.35% of them today
make use of some data from Wikidata.
-
Which I think is pretty good,
-
but of course,
there's still a lot of room to 100.
-
And then I looked at which projects
are actually making most use
-
of Wikidata's data,
-
and I split this
by language versions and so on.
-
And now what do you think
the top five projects--
-
which ones are all of them?
-
Which project family do they belong to?
-
(several in audience) Commons.
-
Okay, that's pretty uniformly Commons.
-
You would actually be wrong.
-
All of the top five are Wikivoyage.
-
(audience) Oh!
-
(laughter)
-
So yeah, applause to Wikivoyage.
-
(applause)
-
If you would like to check
where Commons actually is
-
and where all of your other projects are,
-
there is a dashboard.
-
Come to me and we can check it out.
-
Of course, inside Wikimedia is
not the only place where our data is used.
-
It's also used outside,
and so much has happened.
-
I can't begin to mention it all,
but to highlight some
-
there are great uses of our data
at the Met, at the Wellcome Trust,
-
at the Library of Congress,
-
in GeneWiki and so many more.
-
And if you go through some of the sessions
later in the program,
-
you will hear about some of them.
-
Alright, enough statistics.
-
Let's look at some other highlights.
-
So we already talked
about data quality improving,
-
and when you look at data quality,
there are a lot of dimensions
-
that you can look at,
and we've improved on some of those,
-
like how accurate is the data,
-
how trustworthy is the data,
-
how referenced is it,
-
how consistently it is modeled,
-
how complete is it and so on.
-
Just to pick out one--
for consistency for example,
-
we have created the ability to store
entity schemas now in Wikidata
-
so that you can describe
how certain domains should be modeled.
-
So you can find--
-
you can create an entity schema,
say, for Dutch painters,
-
and then you can look at
which items for Dutch painters
-
do not, for example,
have a date of birth but should,
-
and similar things like that.
-
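That kind of gap check can also be phrased as a query against the Wikidata Query Service. Here is a minimal sketch in Python that just builds the SPARQL string; P106 (occupation), P27 (country of citizenship), P569 (date of birth), Q1028181 (painter) and Q55 (Netherlands) are real Wikidata identifiers:

```python
def painters_missing_birthdate_query(occupation_qid="Q1028181",
                                     country_qid="Q55",
                                     limit=50):
    """Build a SPARQL query for items with a given occupation and
    country of citizenship that have no date of birth (P569)."""
    return f"""
SELECT ?item WHERE {{
  ?item wdt:P106 wd:{occupation_qid} .  # occupation: painter
  ?item wdt:P27 wd:{country_qid} .      # country of citizenship
  MINUS {{ ?item wdt:P569 [] . }}       # no date of birth recorded
}}
LIMIT {limit}
""".strip()

print(painters_missing_birthdate_query())
```

The generated query can be pasted into query.wikidata.org; an entity schema expresses the same kind of constraint declaratively, in ShEx, so it can be checked for a whole domain at once.
-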
And I hope that a lot more
wiki projects and so on
-
will be able to make use
of entity schemas to take good care
-
of their data, and if you want
to learn how to do that,
-
there's a session later
in the program as well
-
by people who know all about this
and will make this less
-
of a black box for you.
-
Alright.
-
Another thing that really got traction
-
over the last year
is the Wikibase ecosystem, right?
-
This idea that not all open data
should and has to happen
-
in Wikidata, but instead, we want
a thriving ecosystem
-
of different places, of different actors,
-
like institutions, companies,
-
volunteer projects opening up their data
in a similar way
-
that Wikidata does it
and then connecting all of it,
-
exchanging data between those,
linking that data.
-
And over the last year,
the interest in that
-
and the interest in institutions
and people running
-
their own Wikibase instance
has really exploded,
-
and especially in the sector
of libraries.
-
There's a lot of testing, evaluating,
-
and to be honest, trailblazing,
-
going on there at the moment
where adventurous institutions
-
work with us to really figure out
how Wikibase can work
-
for their collections,
for their catalogues and so on.
-
Among them, the German National Library,
-
the French National Library,
-
OCLC, and it's really exciting to see.
-
One of the reasons
why I think this is so exciting
-
is that we are helping these institutions
open up data in a way that is
-
not just putting it on a website
and someone can access it
-
but really thinking about this--
the next step after that, right?
-
Letting people help you maintain
that data, augment that data,
-
enrich it, and that's really a shift
-
that I hope will bring good things.
-
And the other thing it helps us with
-
is that it lets experts curate the data
-
in their space, keep it in good shape
so that we can then set up
-
synchronizing processes
to Wikidata, for example,
-
instead of having to take care of it
ourselves all the time.
-
And at the end of the day,
I hope it will take some pressure
-
off of Wikidata to be that place
where everything has to go.
-
Lexicographical data--
-
Over the last year,
people started describing words
-
in their language in Wikidata
so that we can build things
-
like automated translation tools,
-
and we are at the point
where in some languages
-
we are starting to get nearer
to reaching that critical mass
-
that is needed to actually
build a serious application.
-
In a lot of languages,
we still have a long way to go,
-
but in some,
we're really starting to get there,
-
and that's really great to see.
-
If you want to know more about this,
come to my session later today.
-
And, of course, not to forget,
-
structured data on Commons.
-
(audience member whistles)
-
Yes! (laughs)
-
(applause)
-
The Structured Data on Commons
team at the Foundation
-
has really gotten
everything together and made it possible
-
to add statements to files
on Commons over the last year,
-
and people are starting to add
those statements to images
-
to then make it easier to find
to build better applications on top of it,
-
and so much more.
-
It's really exciting to see how
that is growing,
-
and I think what's really important
-
for the Wikidata community
to understand here
-
is that when you see "depicts"
-
or "house cat" or "sitting," "lizard"
and "wall" here,
-
those are links to Wikidata items
and properties.
-
That means when we create items
and properties,
-
those are no longer just providing
the vocabulary for Wikidata itself.
-
They are providing the vocabulary
for Commons as well.
-
And this will only get more and more so,
-
so we have to pay a lot more attention
-
to how our ontology, our vocabulary
-
is actually used in other places
than we had before.
-
And the last one I have is that
we've started building stronger bridges
-
to the other Wikimedia projects.
-
My team and I are working
on a project called the Wikidata Bridge,
-
and you should totally come
to the UX booth
-
and do some testing of the current state
-
that will have
for example Wikipedia editors
-
edit Wikidata directly
from their projects
-
without having to go to Wikidata
-
and having to understand
everything around it.
-
I hope that this will take away
one more hurdle that makes it difficult
-
for Wikimedia projects
to adopt more data from Wikidata.
-
Alright, now to strategies
and where are we going?
-
Since December, the Wikidata team
at Wikimedia Deutschland
-
and people from the Wikimedia Foundation
have been working
-
on strategy papers around Wikidata.
-
It's basically writing down
-
what a lot of us have been
talking about already
-
over the last four or five years.
-
And I don't know if all of you
have read those papers.
-
They're published on Meta,
open for comments until the end of the month.
-
If you haven't read them,
it would be great
-
if you go read them,
leave your comments and so on.
-
Now the very quick overview
of what is in there
-
is that we think about Wikidata
and Wikibase in three pieces.
-
The first one is Wikidata as a platform.
-
You can see it in the lower corner,
-
and that is really around
Wikidata enables every person
-
to access and share information
-
regardless of their language
and technology,
-
and we do that by providing
general purpose data about the world.
-
So basically what you do every day.
-
The second thing is
the Wikibase ecosystem part
-
where Wikibase, the software
running Wikidata, powers
-
not just Wikidata, but a thriving
open data web that is the backbone
-
of free and open knowledge.
-
And the third and last thing
is Wikidata for the Wikimedia projects
-
at the top where Wikidata is there
-
to help the Wikimedia projects--
-
help make them ready for the future.
-
Concretely, what does that mean
for the near or midterm future?
-
Wikidata as a platform--
-
We want to have better data quality,
so we will continue working
-
on better tools,
improving the tools we have and so on.
-
We need to make our data
more accessible
-
through better APIs,
a more robust SPARQL endpoint
-
but also things like more consistently
modeling our data
-
so it actually is easy to reuse
in applications.
-
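As a small aside on what that access looks like today, the main programmatic entry point is the wbgetentities action of the MediaWiki API. A minimal sketch of pulling a label out of its JSON response -- the sample payload here is hand-written, mirroring the real response shape:

```python
import json

# Real endpoint shape; `ids` takes one or more Q-ids separated by "|".
WBGETENTITIES_URL = ("https://www.wikidata.org/w/api.php"
                     "?action=wbgetentities&ids={ids}&format=json")

def label_of(payload, qid, lang="en"):
    """Pull one label out of a wbgetentities JSON response."""
    return payload["entities"][qid]["labels"][lang]["value"]

# Hand-written sample mirroring the API's response structure:
sample = json.loads("""
{"entities": {"Q42": {"labels": {"en":
  {"language": "en", "value": "Douglas Adams"}}}}}
""")
print(label_of(sample, "Q42"))  # Douglas Adams
```

The same response shape carries claims and sitelinks, which is why consistent modeling matters: reusers navigate this structure blindly, by property ID.
-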
And the last thing I had was
setting up feedback processes
-
with our partners.
-
Unlike Wikipedia, Wikidata is not
-
what I call a destination project, right?
-
Someone goes to Wikipedia and reads it
-
whereas Wikidata is usually not
-
someone goes to Wikidata and reads it.
-
It would be awesome,
-
but realistically
it's not what it is, right?
-
A lot of the people who are exposed
-
to our data are not on Wikidata itself,
-
but they are seeing it through Wikipedia
and many other places.
-
Now these other places do get feedback
on that data, right?
-
Their users tell them,
"Hey, here's something that's wrong,"
-
and I would like to have that
so that we can make it available
-
to the people who actually edit
on Wikidata, meaning you.
-
And figuring out how to do that
in a meaningful way
-
without overwhelming everyone
will be one of the things to do
-
over the next year.
-
Alright, Wikibase ecosystem.
-
There, we will continue to work
with the libraries,
-
but also look into science,
for example, and more.
-
There is a Wikibase showcase later today
that you should totally go to
-
and see what's already there
-
and what people are already doing
with Wikibase.
-
It's really worth it.
-
And what's needed there is
-
also setting up
good processes around that.
-
Helping people figure out
who to talk to about what,
-
where they can find help,
-
all these kinds of things.
-
And, of course, making it easier
to install and maintain
-
a Wikibase because that's still
a bit of a pain.
-
And the last thing is federation
which is basically
-
what we've been talking about
for Commons earlier
-
where Commons uses
Wikidata's items and properties
-
but for other Wikibase instances out there
-
so they can also use
Wikidata's vocabulary.
-
And that, as I was saying earlier,
increases yet again
-
the need to be mindful
of how our vocabulary is used out there
-
more than we have had to so far.
-
And Wikidata for the Wikimedia projects--
-
of course, tighter integration
through the Wikidata Bridge
-
and helping people edit directly
from their projects
-
and the other thing that we all need
to think about together, I think,
-
is figuring out how to reduce
the language barriers.
-
The more Wikidata is integrated
in the Wikimedia projects,
-
the more people will have
a need to talk to each other
-
about that data without
speaking the same language,
-
and we have to figure out
how to deal with that.
-
If people have smart ideas,
I would love to talk to you.
-
And with that,
I come to the end of my talk.
-
Thank you, everyone, for giving
more people more access
-
to more knowledge every day.
-
(applause)
-
We have some time for questions
-
so if there are any questions
in the audience
-
or if you are remotely watching
the livestream--Hi, Mom--
-
you can ask the question
on the EtherPad
-
or on the Telegram Channel
and we'll do our best.
-
So anything?
-
Ah.
-
(person 1) Hi, everyone, this is more
of a meme than a question,
-
so when will the time extension
be able to also get
-
hours and minutes and seconds,
-
because up till now
the precision is just the date.
-
- I know... it's not my question--
- (laughing)
-
That's why I said it's a meme.
-
Every time it's like that,
-
but it always comes from remote so...
-
I do not have a very good answer to that.
-
I'm sorry.
-
But maybe as some background,
people need it even more
-
to describe images on Commons
so it might bubble up the long list
-
of things that need to be done
a bit faster through that.
-
Any more questions?
-
(person 2) [Linda] from Wikimedia
Foundation's research team--
-
I have a question about your thoughts
-
on patrolling, and that may be related
to quality of content on Wikidata,
-
but if you can speak to that
-
like how do you see the near to medium term
patrolling efforts changing,
-
especially with the Bridge project,
-
which I'm looking forward to
trying when it goes out.
-
Yeah, thank you.
-
So as you say, with things
like the Wikidata Bridge,
-
a lot more effort will have to be spent
on patrolling, I think.
-
But we are at a size where this
is probably not feasible
-
to do it by hand, by a human,
-
so we need to spend a lot more effort
on improving, for example,
-
ORES, the machine learning system
to help us with that,
-
to help us figure out which edits
a human really needs to look at
-
and which is probably just like yeah,
-
the regular stuff
I don't need to look at this.
-
Currently, ORES is not super good
at judging
-
whether an edit on Wikidata is good or bad.
-
There's currently a campaign going on
-
that is training
the machine learning system,
-
with your help,
-
to teach it basically what a good edit is
-
and what a bad edit is,
-
and we haven't reached the threshold
of enough humans teaching it yet
-
to really improve it,
but if you have a few minutes,
-
it would be great if you helped teach ORES
-
to make better judgements
about Wikidata edits.
-
And it's really simple--
it shows you an edit,
-
and you say this is a good edit,
-
this is a bad edit, and that's it.
-
You can do this in front of the TV
in the evening on the couch.
-
(person 3) Share a link.
-
We will share a link
in the Telegram Group, yes.
-
And once we've reached
the threshold we need--
-
I think it's around 7,000,
but I might be wrong--
-
then we can rerun the training
for ORES and then it will be
-
hopefully considerably better
at judging the edits on Wikidata.
-
And then I hope more of you can use that
-
to filter recent changes, for example,
or your watch list
-
for edits that really need your attention.
-
Yeah.
-
Hi.
-
(person 4) I'm just curious to know,
and this is a question not from me,
-
but from partners
that I've been working with,
-
the more partners we have joining Wikidata
-
and starting to experiment with queries,
-
the more issues we are having
with timeout of queries
-
so what's happening with that?
-
So, some people
at the Wikimedia Foundation
-
are looking into that,
and--small spoiler--
-
be there for the birthday present session.
-
(laughter)
-
(person 5) Hello, I'm Bart Magnus
from Belgium (PACKED).
-
I would like to know
what the current state of affairs is
-
regarding federation,
so reusing Wikidata's properties
-
in your own Wikibase instance--
-
is there anything to mention about that?
-
So over the last year,
a lot of people have told us
-
that they want federation, right?
-
But the problem was
that a lot of people understood
-
very different things
when they said federation.
-
Some of those things
were very easily doable.
-
Some of those things were
really, really hard.
-
And my team and I have been talking
to a lot of people, for example,
-
the partners we work with at libraries
to figure out what is it actually
-
precisely that they need.
-
And we finished that now,
though, of course, I'm happy
-
to take more feedback
if you want to talk to me about that,
-
and now I'm at a stage where
I'm comfortable to say,
-
"Okay, we're going to start with that."
-
And that will happen over the next
I would say two or three months
-
that we actually write
the first lines of code
-
and then hopefully have people able
-
to test it early next year, I would say.
-
(presenter) Okay, last questions.
-
(person 6) Finn Årup Nielsen
from Copenhagen, Denmark.
-
In relation to the other language,
there's been a sort of discussion
-
in the WikiCite community
about whether we should continue
-
to put more scientific papers in there--
-
this relates to how much data
we can put into Wikidata.
-
Timeout in the Wikidata Query Service
is one issue
-
but also the maintenance,
-
so what are your thoughts about...
-
Is the size of Wikidata
beginning to be a problem
-
in general?
-
Should we stop putting in lexeme data?
-
Should we stop putting
scientific data
-
into Wikidata, or do we have
any research on this
-
or technical problems inflating?
-
Yeah...
-
Wikidata is definitely coming
to some...
-
scalability boundaries, let's say,
-
both technically and socially.
-
And for both we need solutions, right?
-
Socially, we have things like more editors
-
and recent changes to the point
where it's completely unfeasible
-
for a human to patrol that
because it's simply too much.
-
But also technically,
and we've been addressing some of that.
-
For example, some database
re-architecting
-
around the wb_terms table,
if that means anything to anyone.
-
But those only get us so far,
-
and one of the things we want
to look at next year
-
is where the other pain points are
and what to do about them
-
on the technical side.
-
So that's a general picture.
-
At the same time, I am very hesitant
-
to tell anyone, "No, no, no,
stop putting data into Wikidata."
-
That would kind of defeat the purpose.
-
But, for example, the Wikibase ecosystem
-
is one way to address that, right,
-
to not require everything
in Wikidata.
-
That's the whole beauty
of linked open data.
-
You don't have
to have it all in the same place.
-
You can connect different places.
-
It's amazing.
-
So around WikiCite specifically, yes--
-
okay, WikiCite specifically,
I think we need
-
to look at it in proportion.
-
I don't have an exact percentage
of what percentage
-
of the items in Wikidata
are around WikiCite topics,
-
but it's a big percentage.
-
And maybe that's the thing
we need to talk about...
-
in the break.
-
Well, thank you very much!
-
(applause)