(Andrew) Welcome to the Infoboxes panel.
How many people here know
what an infobox is in our community?
Good, but even if you don't,
the 30-second introduction is
infoboxes are what you normally see
when you look at an article in Wikipedia,
in the upper right hand corner
and it probably tries to give you
most of the facts about a topic.
You're starting to see infoboxes more
than in just Wikipedia articles.
If you saw one of our presenters here,
Mike Peel, got an award yesterday
for his work on infoboxes and Commons,
and he'll talk about that.
We're also going to have other folks
talk about their experiences
in implementing infoboxes
that are driven by Wikidata.
So, even without Wikidata
as part of the equation,
there have been some pretty famous,
I don't want to say battles,
but let's say disputes and conflicts
between Wikipedia editors
about the appropriateness of infoboxes
and their role in different projects.
So, we've been having a session like this
for the last few years
talking about what the interaction
should be between our communities
in Wikidata and Wikipedia
and Commons and other places.
So, hopefully this will give you
a pretty good set of views
on where things are right now
and where they're going,
from Wikipedia editions
that are heavily using infoboxes
and ones that are a little bit
more reluctant to do that.
So we have our presenters today,
Harmonia, Tpt, Amador and Mike Peel.
So, we're going to start with Harmonia,
is that right?
- Or Tpt and Harmonia?
- Yes, both of us.
- We’re presenting together.
- Okay very good, thank you.
And they'll introduce themselves
and talk about their infobox story.
Hi, everyone.
So I am Harmonia Amanda,
I don't have my name tag.
I think we have very different
expectations and goals
depending on whether we are
really small communities,
bigger communities
or the really, really big communities
like Commons and everything.
What Tpt and I are working on is
automatic Wikidata infoboxes
for really, really small Wikipedias.
So Wikipedias without anyone
knowing Lua in their community
or really--or not having regular workshops
or nobody meeting up in real life
like the really, really small communities.
And Tpt will start
with the technical side of it.
Yes, so, what we wanted to have
is templates without--
It's for very small Wikis, so without
having to do any work, saying,
"Hey, we want to have an infobox
about a person.
Where should we display the birthdate
and the birthdays?"
It's a piece on [inaudible]
on Wikidata and so on.
So maybe something that is fully automated
and just works, you just have to copy
a Lua module on your wiki
and then the templates
calling the Lua module
and then it should just work
without any configuration.
And so it's what we did,
so it's basically a single Lua module
with something like 200 lines of Lua.
The only configuration we have in this
is just a list of properties
we don't want to display.
So it's mostly properties that
are internal to Wikimedia projects
and then it creates an infobox from this
based on the Wikidata content.
So it's already deployed
on 13 Wikipedias right now,
and we call it Databox.
So there is [inaudible] infobox
in one very small Wikipedia.
Sorry, I don't remember the language.
- (Harmonia) It's in Hausa.
- (Tpt) Thank you.
So, here are some examples. You see,
basically the idea of the infobox is,
it displays the label
of the Wikidata item and then you have--
if there is an image on Wikidata,
it is displayed,
then you have a subtitle
with the Wikidata type.
And then you have the list of properties
sorted just like Wikidata.
And at the end, if there are geo coordinates,
you have a map and that's it.
So it's very simple.
And it works for any kind of entities.
So it's not just
for a person, place or such.
You don't have to configure it.
It just works.
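As a rough illustration of the logic just described, the Python sketch below builds an infobox structure from one item: the label as title, an image if there is one, the type as subtitle, the remaining properties in order, and a map when coordinates exist. The real Databox is a Lua module; the simplified entity dictionary, the name build_databox and the contents of HIDDEN_PROPERTIES are assumptions made only for this example.

```python
# Illustrative sketch only: the actual Databox is a Lua module, and this
# simplified entity layout is an assumption, not the real Wikidata JSON.

# Properties we do not want to display (the only configuration).
# The two IDs are merely examples of Wikimedia-internal properties.
HIDDEN_PROPERTIES = {"P910", "P373"}  # topic's main category, Commons category

IMAGE, INSTANCE_OF, COORDINATES = "P18", "P31", "P625"


def first_value(claims, pid):
    """Return the first value of a property, or None if it is absent."""
    values = claims.get(pid)
    return values[0] if values else None


def build_databox(entity, hidden=HIDDEN_PROPERTIES):
    """Return a dict describing the infobox for one item."""
    claims = entity.get("claims", {})
    special = {IMAGE, INSTANCE_OF, COORDINATES}
    return {
        "title": entity.get("label", ""),
        "image": first_value(claims, IMAGE),
        "subtitle": first_value(claims, INSTANCE_OF),
        # Keep the remaining properties in the order the item provides them.
        "rows": [
            (pid, values)
            for pid, values in claims.items()
            if pid not in hidden and pid not in special
        ],
        "map": first_value(claims, COORDINATES),
    }


# Hypothetical sample item, for demonstration only.
example = {
    "label": "Kano",
    "claims": {
        "P31": ["city"],
        "P17": ["Nigeria"],
        "P373": ["Kano"],
        "P625": [(12.0, 8.52)],
    },
}
box = build_databox(example)
print(box)
```

Nothing here depends on the kind of item, which is the point made in the talk: the same function serves a person, a place or anything else.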
So it's not the nicest infobox.
But it's very simple and you see it
in four different languages.
I believe it works quite well,
but there is still a lot of work
for languages actually.
It's to translate the labels
and Harmonia is going to talk about it.
(Harmonia) Yeah.
The problem with the infoboxes in general,
it's what the Indian community talked about
yesterday in their talk
so if you want that specific thing
you should [inaudible] the stream
for that presentation they did yesterday.
Or to fill the knowledge gap in Wikidata.
And the problem is not
the technical part of it.
It's that if you don't add
any labels in your language,
we will follow the MediaWiki
fall back languages
and at some point,
it will end up in English.
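The fallback behaviour mentioned here can be sketched in a few lines of Python. This is only an illustration: the FALLBACKS table and the pick_label name are assumptions, and MediaWiki defines the real fallback chains per language.

```python
# Hypothetical fallback table; MediaWiki defines the real chains.
FALLBACKS = {"ha": ["en"]}


def pick_label(labels, lang, fallbacks=FALLBACKS):
    """Return (label, language actually used), walking the fallback chain.

    English is tried last in every case, mirroring the remark that the
    chain ends up in English at some point.
    """
    chain = [lang, *fallbacks.get(lang, []), "en"]
    for code in chain:
        if code in labels:
            return labels[code], code
    return None, None


# Placeholder strings, not real translations.
print(pick_label({"en": "country", "ha": "<Hausa label>"}, "ha"))
print(pick_label({"en": "language used"}, "ha"))
```

Returning the language that was actually used lets a wiki decide what to do with untranslated rows, for example hiding the infobox outside of drafts.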
So the problem with these
really small communities
is actually how to create a dynamic
to translate things so we can use data
from Wikidata.
And that's how to create
your own community rules.
So on the 13 Wikipedias,
we have very different usages.
So we have Wikipedias which make the choice
to only have the infoboxes in drafts.
So you have the data,
you can write the article
but you don't see it on the main space.
You have Wikipedias which make the choice,
"Hey, if two thirds of the infobox
is in my language,
then I want it on the main space
and so I know
what I need to translate,"
and there are Wikipedias which would trust
only infoboxes entirely
in their language.
So the idea is to start
with labelathons
so you can use--you can,
oh, this next slide.
Oh, no it's--
Yeah, sorry.
You can see on the side we made
a SPARQL query for the Hausa Wikipedia
for the most used properties on Wikidata
which don't have labels in Hausa.
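The worklist behind such a labelathon can be pictured with a small Python sketch. This is not the SPARQL query shown on the slide: the function name, the data layout and the usage counts are all invented for illustration.

```python
def missing_labels(properties, lang):
    """List properties lacking a label in `lang`, most used first.

    `properties` is a list of dicts with the keys 'id', 'uses' and
    'labels' (a hypothetical layout chosen for this example).
    """
    todo = [p for p in properties if lang not in p["labels"]]
    return sorted(todo, key=lambda p: p["uses"], reverse=True)


# Invented sample data; the counts are not real Wikidata statistics.
sample = [
    {"id": "P31", "uses": 900, "labels": {"en": "instance of", "ha": "<Hausa label>"}},
    {"id": "P2936", "uses": 40, "labels": {"en": "language used"}},
    {"id": "P17", "uses": 500, "labels": {"en": "country"}},
]
worklist = [p["id"] for p in missing_labels(sample, "ha")]
print(worklist)
```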
And you can see, that the first property
is language used
which is not that much used on Wikidata
because they translated everything else.
And that was not that much
an amount of work.
So you can start small, build a community,
start using data on Wikidata.
And if you have a bigger community,
you may want infoboxes doing
cooler things than that
and that's the Catalan project.
- (Harmonia) And that's yours.
- (Amador) Thank you, thank you.
First of all--
(applause)
First of all, I apologize for my English.
If you are expert in rare languages,
this is your opportunity.
While I will talk about two things
that are infoboxes,
but...
several infoboxes and our [inaudible]
and you can play with them.
And I prepared an analysis
that you can see in the presentation.
In order to play, I show some
but no standard in the process--
in the look and feel but in the process.
We started three years ago,
where we implemented
Wikidata in--Wikidata--
inside the infoboxes.
Our objectives were these:
take advantage of Wikidata, which
at that moment was emerging.
Use this to harmonize
the skins and the layout...
Reduce the obsolescence of the information
because everything was manual parameters
and avoid particular solutions
because there are millions more templates,
it's easier,
and everybody has its own
template for its own article,
and well, we have thousands of articles
with a--Sorry.
Those itself infoboxes
with just 10, 20 articles.
The problems are these,
people don't trust Wikidata,
don't want to change,
and they want to maintain
typical or local information
that is not in Wikidata.
Now three years, three years after,
the solutions that we applied
and the state of the art that we have
is, we say, "Okay, we will keep
the manual parameters,
if you wish."
But as Wikidata gets the information,
I will erase the manual parameter
because you don't need it.
Okay.
Now 80%--82%
of all Wikipedia articles
have a Wikidata infobox.
Sixty percent of the articles have
just the call to the template
without any other manual parameters.
We agree, 90% of all the articles
have a harmonized and
similar skin and look and feel.
When the [inaudible] language changes
the priority we say,
manual is before the Wikidata
but in some moment we say,
this manual probably is obsolete.
So we will change the priority.
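The priority rule described here can be written down as a minimal Python sketch. The function names and the prefer_manual switch are assumptions for this example; the Catalan infoboxes themselves are implemented in Lua and wikicode.

```python
def is_empty(value):
    """Treat None and blank strings as 'no value given'."""
    return value is None or str(value).strip() == ""


def resolve(manual, wikidata, prefer_manual=True):
    """Pick the value to display for one infobox field.

    prefer_manual=True models the initial rule (manual before Wikidata);
    setting it to False models the later change of priority.
    """
    first, second = (manual, wikidata) if prefer_manual else (wikidata, manual)
    return second if is_empty(first) else first


def is_redundant(manual, wikidata):
    """A manual parameter can be erased once Wikidata says the same thing."""
    return not is_empty(manual) and manual == wikidata


print(resolve("1900", "1901"))                       # manual first
print(resolve("1900", "1901", prefer_manual=False))  # Wikidata first
```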
This is a question about
there's an exclusion I need, I want to--
whether it's important
to have many infoboxes
or to have one infobox.
Well, we believe that it's important
to have one infobox for each,
great concept.
A person, a building
or something like this
but not one for each kind
or one for all kinds of solutions.
These are our figures,
the important thing is this--
this is with four infoboxes
we cover 50% of our articles.
With four infoboxes,
we cover [inaudible] % more.
And with all, we cover 80%.
Look at that long tail that we can--
we have no solution
because they are very specific
is 83 infoboxes
only four cover 9%.
So the most important, if you
[inaudible] for these options,
you have a great cover.
The look and feel of the infoboxes
is like this.
You have here and another is...
to arrive
and these infoboxes are multilingual
and respond to--
you can explain the--
can tell them the parameter--
the language in two ways.
One is when the preferences
of the Wikipedia,
or if you don't want
to follow the preferences,
and will force another language,
you can call the template
with the parameter of line, equal
and the language you wish.
All the translations
are made via Wikidata labels.
So, when--
Let me show one example of--
if some--
if your label doesn't exist
in your language
in Wikidata--
it appears not in this case, sorry.
It appears here, a pencil,
and you push the pencil and it goes
to Wikidata in order to enter the label,
and then it's running.
Yes.
Ten seconds.
What are our current goals?
We have a solution implemented
in our Wikipedia.
The community, right, is happy.
And no one thinks again
in the old solutions.
So as the Wikidata team told me
[inaudible] before.
It's live at the talk in Catalan.
So, I translate, I change it
to be multilingual
and now this solution is near
to be a plug and play
in any other Wikipedia.
Until now,
other languages copied our wiki module
and...
infoboxes, but they were not multilingual,
they needed to translate them
and then they had their own copy
and lost the synchronization
with our evolution.
But now, this is not necessary
or if you made this kind of migration
and don't touch
or every [inaudible] with it,
we--sorry--
every update with it.
You are good.
Okay.
And Mr. Mike Peel, to explain enwp.
(applause)
(Mike) Okay, I've just got
a couple of slides
and one on Commons,
one on English Wikipedia.
So the first is on Commons.
So this is a single infobox.
It's actually two infoboxes,
it's separate in what's coded for people
and for everything else.
But it generally works
for every single topic.
And it's deployed on Commons
where it's a very different community
from most Wikipedias
in that it needs to be multilingual
out-of-the-box.
So you need to be able to change
the language of your interface
and everything changes in the infobox,
which Wikidata lets you do.
So before this, before Wikidata,
Commons did not really have infoboxes
in categories, now it can.
It's currently deployed
in about two and a half million,
not quite two and a half million
but hopefully in a few weeks time
we'll get there.
And that's out of about seven million
Commons categories, so we're less
than halfway there, actually,
and so please keep adding it
if you spot a category without it.
It tries to add everything it can,
which is useful, but tries to do it
in as compact a form as possible
because you don't want to take up
space because the main thing of Commons
is all the media files.
So you want to highlight those,
and then this infobox gives you
additional context on the topic
down the side.
And so it shows the image, it shows
the main properties
and a map of where it is.
Importantly it links to tools,
which you might find useful at the bottom.
So things like, Wiki Shoot Me!
to find nearby pictures
and other things like that.
It's something
that can work on other Wikis.
It's not very portable.
It relies on about half a dozen
different templates
and Lua code at the moment.
I'm hoping to compact that,
so it is more portable
so you can use it elsewhere, if you want.
It's now set up so that
the main template is actually
a configuration template.
So you can say we want
the width of this to be 200 pixels
or 300 pixels,
you can define that in here.
You can say, "We want a map,
we want this coordinate system."
So it's got some flexibility
to cope with different cases.
It is actually installed
on English Wikipedia
and you can sort of use it there
but someone will come along
within a few hours and change it
to a different template, so
you can see how it looks
on English Wikipedia, at least.
It is also on a few other ones.
English Wikipedia is a difficult one.
So it has a lot of existing content.
The advantage of Commons
is there wasn't that much content
in the categories to start with,
so you could go in
and add a lot more very easily.
English Wikipedia already has that.
So if you are using an infobox from Wikidata
you're normally replacing existing content
and editors don't like that.
There also seems to be a feeling
that in the English Wikipedia
it can act as a check against Wikidata.
So if you keep things independent,
then you can cross check
and catch vandalism,
which sort of works as long as you don't mind
all the extra burden of having to handle
two lots of the same data.
One in a structured form,
which is lovely
and one in an unstructured form
which is a pain.
There's also a lot more
different templates.
So there's about a hundred
infobox templates that use
Wikidata at the moment.
That's out of thousands
on English Wikipedia.
Converting all those
will take a long time.
There's about 2,000 infoboxes
that are entirely drawn from Wikidata
which is a small fraction
of the number of articles there but...
It's been a long way to get there.
And it's a lot more demanding
on, again, exactly, like, formatting,
so you can't have anything
which doesn't look quite right
or shows Q-numbers
and things like that.
You've got to have local override.
You've got to make sure
you're only showing referenced information.
All of that is built into the code
which underlies these infoboxes,
which is WikidataIB, which is
what Doug Taylor's been working on,
user RexxS.
And you can use that, it's very modular.
So each bit of the infobox
is constructed separately,
so you don't even need to know
Lua to create these infoboxes.
I don't really know Lua at all, so.
You can just use the existing
template structure to do that.
Which makes it quite nice to deploy
and to fiddle around with.
Yeah, that's basically all I've got.
So most of the session, hopefully
will be questions from you
and hopefully, we can give some answers.
Thanks for listening.
(applause)
(Andrew) Let's make sure we have
both microphones working here.
You want to test that, make sure it works.
Does that microphone work?
Second one?
- Does it work?
- Testing.
- Yeah.
- Great.
Alright, great.
Well, we'd love to hear some questions
and also,
if you've had experiences
with infoboxes
in other communities,
we'd love to hear that, so.
(audience 1) Yeah, my case is a small wiki.
And we have 10 to 25 editors.
So, do I need to know Lua
to implement a Wikidata template?
(Amador) To whom? To me?
(audience 1) To all the people.
Anybody can answer, that's okay.
(Harmonia) Okay.
That's actually why we made
three presentations in one in that,
if nobody in your community
can do any Lua work at all
then the solution is the one
Tpt and I made.
But with this one
you can't easily set
your own preferences.
So we deal with all the technical parts.
But there is a drawback to that.
If you have someone locally,
who can say,
"Eh, this looks good but we have actually
this kind of problem in our language."
Like for example, we have problems
with gendered languages
where female labels are not shown,
which is actually not
a complicated thing to fix in Lua
but you need to know
Lua to fix that.
So someone who doesn't know your language
won't think of that
if they are coming from a language
which doesn't have this problem.
So, if you have someone who knows Lua,
you can make more personalization
to have something
which looks better on your Wikipedia,
which is what
the Catalan Wikipedia is doing.
They have skin layouts which are really...
integrated with the rest
of the layout of Wikipedia
which you can't do
when you don't know any Lua at all.
But you have different kinds of solution,
depending on what
your community is right now.
(audience 1) Can I copy the code,
which is already there
in Catalan Wikipedia to my Wikipedia,
the code, to make a new template
or something like that?
Actually our solution has two levels.
The Wikidata module,
which handles the access
to Wikidata and retrieves
[inaudible] values,
[inaudible] values, qualifiers and so on.
And the presentation level,
which is made by wiki templates,
is in wikicode, okay.
So, our module is now running in seven
or eight different Wikipedias
that have their own solutions:
they call this module
but the presentation is their own.
Okay.
If you have some solution like this,
you can take only the module.
However, if you don't have
a good solution in the template,
in the presentation,
you can take both, the module
and our template, okay.
Because if you get and change
the language, change your language,
it runs immediately.
Okay.
(Andrew) So, to be clear your solution
is LUA based, you have to take it all,
you handle all the technical parts.
But since you have a split in two layers,
you can still tweak it
at the Wikicode level to customize it.
Does that make sense?
We split in two because...
the number of...
editors able to modify Lua is decreasing.
(Andrew) Right.
And we concentrated in a Lua module
everything that
can be considered a black box.
And the changes people want
are related to the presentation.
"I don't like this color,
I want this up and this other down,"
all these kinds of things
are made in templates.
So I made--I modified the template
but there are several editors
that can modify the templates too.
(Andrew) Great and then Mike,
how does yours--
Yeah, but, but...
(laughter)
I'm sorry.
That's the part.
I think most of the people
who are at the WikidataCon
are actually coming from communities
which know wikicode.
We made Databox for--
especially for African languages
but I think it should be in on many
minority languages.
But we made that for the [inaudible]
Wikipedia, for the Hausa Wikipedia
who are actually on the Hausa Wikipedia.
Hausa is actually a language
spoken by millions of people.
But the Wikipedia has fewer
than 4,000 articles.
And they don't have
people knowing even Wikicode
because they are editing
by phone and things, so.
It's like Wikidata describe the projects
we presented on the first day
and everything.
So the really, really small communities
have actually really
different expectations
from communities who are actually
really Wikipedian already
and who have their own templates
and their own personalization.
And they want the code
but they don't know any code at all.
(Andrew) Right, right,
that's great. Mike.
The ones on Commons and English Wikipedia
tend to be built up using WikidataIB
so that there's the individual parameters
that you fetch.
And that's all in Lua but then
you're calling that from the wikitext.
So I don't know Lua.
So all those infoboxes--these all
are constructed in wikitext
and that's possible.
The good thing with all these
different combinations is you can pick
which one you like the best
and just use that.
Or you use multiple ones on the same Wiki.
Or--yeah.
It's all drawing information
from the same place
that's the important thing.
So we all share the same data set.
If [inaudible] a little.
So if you have a lot of people
that know Lua or wikicode,
it doesn't matter.
If you have few people able to do that,
you must choose
the best solution for you.
(Andrew) It's amazing how far
we've come in the last two years.
That we actually have a choice
of really good solutions, that's great.
(Danny) Hi, thank you.
I'm Danny,
I'm from the Wikimedia Foundation.
I think that there are actually
some problems that need to get solved
in regards to the distrust
that Wikipedians have
that are on the Wikidata side
and not on the Wikipedia side.
So it's not just about
resistance to change.
I will tell you a story.
I was talking to Lydia
at Wikimania about this
and sort of walked
through a little scenario.
Because we were in Stockholm,
the example that we used
was the population of Stockholm.
So let's say I'm from English Wikipedia,
and I say that it's x million.
And then I come back later and I see
that that has been changed to y million
because it's--
that was done by somebody
on another Wikipedia in another language.
How do I know where that comes from
and who did it and what the source was?
So I clicked through
to get to the Wikidata item
and then looked at that property
and it actually turned out
that somebody had very recently--
this was a coincidence,
but somebody had very recently
changed the population of Stockholm.
And the source that they used
led to a 404 error.
So in other words I can't--
And that person spoke German,
and so it would be difficult for me
to talk to the person, ask them like,
"What kind of source was that?
Why did that change?
Is that actually an update
or is it just a mistake?"
So I said like,
I know this is a coincidence,
like I just happened to pick this example
but 100% of the things I tried
have this problem.
And that's a really difficult thing
to figure out.
And there's other problems with references
on Wikidata as well.
There's tainted references
like Lydia spoke about earlier,
but there's also circular references,
where I believe a lot
of the initial import came from Wikipedias
and so the source just says
Italian Wikipedia.
And that's a thing, though.
Like I know that everybody knows,
but that's an actual real thing.
We can't have it
on the big Wikipedias
that the infobox for every page
doesn't have sources anymore.
It's just like from Italian--
like Italian Wikipedia is sourcing itself.
- Can I answer some of that?
- (Danny) Yes, please do.
And so looking at
the English Wikipedia infoboxes,
they aren't referenced.
(laughter)
That's kind of a problem.
So you need to look into the--
(Andrew) You mean traditional infoboxes.
Traditional infoboxes,
not the Wikidata ones.
Forget the Wikidata ones here.
If you want to find out
where the information is,
in the infobox you need to look
through the whole article
and find it and pull it out
and that's difficult.
Wikidata does support references.
We need to use those more.
And in particular, when you're importing
on the English Wikipedia,
you only import referenced information
and that's excluding sources
stated as "English Wikipedia".
That's ignored.
So it's only if it's a good reference,
then it's used.
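The filtering rule described here, keeping only statements with a real reference and ignoring those whose only source is a Wikipedia import, might look like this in Python. It is a sketch under assumptions: the statement layout is simplified, the function names are invented, and it is not claimed to reproduce what WikidataIB does internally. P143 (imported from Wikimedia project), P4656 (Wikimedia import URL), P248 (stated in) and P854 (reference URL) are existing Wikidata properties.

```python
# Reference properties that only say "this came from a Wikimedia project".
IMPORT_ONLY = {"P143", "P4656"}


def is_well_sourced(statement):
    """True if at least one reference cites something other than an import."""
    for reference in statement.get("references", []):
        # Each reference is modelled as a dict: property ID -> value.
        if any(pid not in IMPORT_ONLY for pid in reference):
            return True
    return False


def usable_statements(statements):
    """Keep only the statements an infobox would be allowed to show."""
    return [s for s in statements if is_well_sourced(s)]


# Hypothetical statements for illustration; the values are placeholders.
statements = [
    {"value": "A", "references": [{"P143": "Italian Wikipedia"}]},
    {"value": "B", "references": [{"P248": "some census", "P854": "https://example.org"}]},
    {"value": "C", "references": []},
]
kept = [s["value"] for s in usable_statements(statements)]
print(kept)
```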
So there's some ways of doing that,
but Wikidata does have a long way to go
before everything is referenced.
And I think there is a bias here
on the English Wikipedia
in that it's the biggest Wikipedia
which is...
with more information.
But we actually ran into
the same problem
on the French Wikipedia
when we started with automated infoboxes
and people started following.
When there are discrepancies
between Wikidata and Wikipedia,
which one is right?
And it was 90% Wikidata.
So Wikidata was way better than Wikipedia
and we spotted wrong information,
obsolete information and everything
that nobody on Wikipedia
had spotted for years.
So I think the numbers are less
for the English Wikipedia
than for other Wikipedias.
But the French Wikipedia
is not a small one,
but Wikidata was way better.
So I think it's not 90%
for the English Wikipedia,
but I think when these discrepancies
and we have tools to do that.
Like, hey, this manual value is
conflicting with the value from Wikidata.
When these semantic templates
on the code like for tables of...
population which you have in tables,
which are sometimes depending
on the article
as with semantic marker.
You can use the semantic marker
and say,
the value are from infobox is not--
So we have technical means
to track some of that
and to put this
where it's originating from.
We did that, welcome to French Wikipedia
and Wikidata statistically is right.
But on the English Wikipedia,
I think it's a little less.
But I think it's actually a good thing
to have a way to spot
that there is a discrepancy
and a reference problem.
And yeah, as Mike Peel said,
on the biggest Wikipedia
we only use referenced statements,
and statements whose source is just
"imported from Italian Wikipedia"
are not considered as referenced.
(Andrew) It's kind of a relic
of the initial import that we did,
but you're right it's not acceptable
as a reference statement
and most people are trying
to get rid of those.
(Danny) Yes, it's just
that it just needs...
(Andrew) You need, you need--
(Danny) That's a problem that needs
a strategy on Wikidata.
Like how are we going to clean that up
in different references?
I agree.
But this is out of this presentation, no?
What's your proposal?
What do you propose?
- (Danny) I...
- Clean Wikidata?
- Close Wikidata? I don't know.
- (Danny) I just wanted to point out
that the distrust by people
on some of the big Wikis
is actually based on real concerns
about references that--
Yeah, that's-- the references problem
started in 2013 because references
on Wikidata were new and we couldn't deal
with complicated references
at the time like technically
on the Wikidata side.
So we had like very bare references,
and I think some Wikipedians
are still stuck on that
but Wikidata is in 2019 now.
And we have good references
on many things.
(Andrew) We're going to go
to the next question
but just to make sure people know,
I think a lot of people
in English Wikipedia
and other large languages
that are resistant
to some of this,
they'd make the argument that
an error in Wikidata
is magnified by these infoboxes.
But the number of fact checkers
is also magnified, right.
So you see it as both ways
like if you have one place
to fact check and reference,
you solve the problem
for hundreds of editions
at the same time, right, so.
(audience 2) Okay, so
with the previous topic
about how to implement data,
we in the Basque--
we in the Basque Wikipedia,
we adopted the Catalan system.
I did it from scratch.
When you do something incrementally
as the Catalans did, it's a lot of work,
but as I did this from scratch,
I noted how long it took
for me to do that.
So if you go to Wikidata,
you have the acronym of O,
[inaudible], as one hour
Wikidata template system
or a fully automatic system
and is actually one hour.
it's telling everything you need
for a template to work.
One, I mean, or biography
or city or one of them.
So implementing the five, six
most used templates
will take you like ten hours of work.
I mean it's like something
you can handle easily
in one week of volunteering
and you have it done.
So sometimes it's like,
"Oh, but we need a lot of templates,
we need a lot of things."
It's quite straightforward.
If you are not a Wikipedia
with only ten articles
and no templates because then you need
the coordinates, the module for coordinates,
these kind of things.
But if you have some development,
it will take like one hour,
two hours work to do that,
it's quite easy.
And I think the others
will be also like one hour, two hours,
it's not-- maybe yours is 30 minutes.
It takes about two minutes.
- (audience 2) That's it, it's quite easy.
- (laughter)
Yeah, what I said with Databox
is that the problem with Databox
is not installing it.
It's translating
and adding data to Wikidata.
To have your property in Hausa,
that's the work.
That's not making the infobox.
It's making the labelathons
and the translation and everything.
That's the work.
So it's a very different kind of problem
than on the English Wikipedia,
where most of the time you have
the labels or they are translated easily.
(audience 3) Thank you.
I wanted to make you aware of something
that's going to be beneficial to this,
it's not directly related to infoboxes
but it's a proposal that I and Amir
and a couple of other people
have been working on
for quite some time now
around a central repository
for templates and modules.
So at the moment, if you have a module
that you've written in, for example,
in the English Wikipedia
and you want to use it
on the Catalan Wikipedia
or some minority language Wikipedia,
some of those, you have to copy
across the code.
And then when the code is improved
on the English Wikipedia,
the improvement typically
doesn't get copied across
or it gets copied across sometime later.
So the idea is to have
a central repository
where Lua modules can be held
and used by every Wiki
in the same way that you hold an image
on Commons or a piece of data
on Wikidata.
So there is a very long draft proposal
in MediaWiki
and I will put the address of that
on the Etherpads
when I hand the microphone back
- and can get a web connection.
- (chuckles)
We would like your comments
and questions on that proposal.
It is going to take a very long time
before this can be implemented
because it'll be quite
a considerable change to the way
that the underlying
MediaWiki software works.
But when it comes in,
it will make this reuse of code
by minor Wikipedias much easier.
Great, thank you, Andy.
Go ahead.
(Tpt) Yes, so actually the link
is here for the Global Template proposal.
I just did not have time to talk about it.
But yes, I think it would be
tremendously useful
so if you could talk
to Wikimedia Foundation people
and say, "Hey, we need to make
this proposal happen."
- Yes, it's a link to the proposal.
- [inaudible]
(Tpt) Yes.
(audience 4) Excuse me,
can you say why you think it's--
No, no, no, no, no, it's not the same.
- (audience 3) Not the same?
- Maybe it's similar.
(Mike) This is off topic,
so let's not spend so much time on it.
(audience) Yes.
This will be incredibly useful
but not yet.
Take a look at the Etherpad.
Andy will leave a note there.
Andy, to answer to you,
maybe you talk about a project
called Multilingual Templates--
it's a model that is a project
and initiative to make a repository
that you can subscribe to there.
Anytime the owner or the creator
of this module or this template updates it,
all the subscribers receive their own copy
if they haven't changed the previous one.
So I don't know if finally
it will be this solution or another
- but this initiative or this idea,
- Idea.
I think that all of us agree.
(audience 3) I believe that Andy
was talking about Amir's proposal
so it's this one that is really having
just like Wikidata
but for templates.
- (audience 5) [inaudible]
- (audience 3) Yes.
(audience 5) Because you have things
like the Lua code to verify
the little check that
you got a nice balance.
[inaudible]
(Andrew) You got a question.
(audience 6) First of all,
I wanted to thank you all
for your great projects
and all of the work you're doing.
And also the great idea
of the central repository,
it's a good thing.
I'm from the German Wikipedia
and we have a thing there with--
where we're trying to modify
most of the infoboxes
to actually support data from Wikidata
and we ran into a problem,
more of a cultural problem, actually,
where we are importing data
from Wikidata by default
if nothing is entered in the infobox,
in the key-value pair
where it should be.
So if you leave the position empty,
for example, the coordinates of something
then the data comes from Wikidata
and if you put something in,
it's a local override.
And we ran into a problem
that sometimes people actually
want nothing in there.
They want--they don't want
to change the data in Wikidata,
but they don't want the data
from Wikidata.
And they just want to override it
so it says nothing.
And we seem to have no real solution
for a use case scenario like that.
We thought about using like magic words
to suppress the actual information
of Wikidata but I don't know if you have
similar problems in your projects
and how you tackle them.
Yeah, so we have exactly the same problem
on French Wikipedia
and what the infoboxes are currently doing
is that if you put a hyphen,
you just remove the Wikidata value.
It's kind of a hack but it works.
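The three cases discussed in this exchange (empty means fetch from Wikidata, a value means local override, a hyphen means show nothing) fit in one small function. The Python below is only a sketch of that convention; the name field_value and the SUPPRESS constant are assumptions, and the real infoboxes implement this in templates and Lua.

```python
SUPPRESS = "-"  # the hyphen convention mentioned for French Wikipedia


def field_value(local, wikidata):
    """Decide what one infobox field displays.

    - empty local parameter: fall back to the Wikidata value
    - a lone hyphen: deliberately show nothing
    - anything else: the local value overrides Wikidata
    """
    text = (local or "").strip()
    if text == "":
        return wikidata
    if text == SUPPRESS:
        return None
    return text


print(field_value("", "52.52, 13.40"))    # falls back to Wikidata
print(field_value("-", "52.52, 13.40"))   # suppressed
print(field_value("local value", "52.52, 13.40"))
```

A dedicated marker keeps the Wikidata statement untouched while still letting editors blank a field in a single article, which is the use case raised in the question.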
(audience 6) Interesting.
And the one I use
has a suppressfields parameter
and you pass the name of a field
you don't want to show and it just hides it.
There are ways of doing this.
It's something that happens
on English Wikipedia as well.
Things like religion, people don't want
to show that in articles,
they can just turn it off.
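
The fallback rules described in this exchange can be sketched in Python. This is only an illustration under stated assumptions, not the actual Lua code of any Wikipedia: the function name `resolve_field`, the `suppressed` argument and the sample data are hypothetical, and the hyphen marker follows the French Wikipedia convention mentioned above.

```python
from typing import Optional

# Assumed marker, following the hyphen convention described above.
SUPPRESS_MARKER = "-"


def resolve_field(name, local_params, wikidata_values, suppressed=frozenset()) -> Optional[str]:
    """Return the text to display for one infobox field, or None to hide it."""
    # A field listed as suppressed is hidden regardless of any data.
    if name in suppressed:
        return None
    local = (local_params.get(name) or "").strip()
    # The marker means "show nothing here, even though Wikidata has a value".
    if local == SUPPRESS_MARKER:
        return None
    # Any other local value is a local override.
    if local:
        return local
    # An empty parameter falls back to Wikidata (None if Wikidata has nothing).
    return wikidata_values.get(name)


wikidata = {"coordinates": "52.5, 13.4", "religion": "example value"}
print(resolve_field("coordinates", {}, wikidata))                      # Wikidata value
print(resolve_field("coordinates", {"coordinates": "50.0, 8.0"}, wikidata))  # local override
print(resolve_field("coordinates", {"coordinates": "-"}, wikidata))    # hidden
print(resolve_field("religion", {}, wikidata, suppressed={"religion"}))  # hidden
```

Keeping "empty" and "explicitly nothing" as two distinct states is what lets an editor blank a field locally without touching the data on Wikidata.
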
(audience 6) Really cool, so.
So as we were saying,
you have multiple solutions
depending on the problem
and you can--on the French Wikipedia,
we have some fields which
are handled in the infobox itself,
like if the infobox can be used
for several kinds of things,
we'll think, "Well, if it's this
specific thing, then this field
shouldn't show but it should show
if it's this other thing.
Or manually, in this article specifically,
I don't want this field."
So a technical solution exists
and we can probably implement that
on the German Wikipedia, no problem.
(audience 6) Really great,
and I'm going to look into that.
To answer you, we can...
in order to blend it,
you can hide any parameter.
But in each use, article by article.
Now, we are preparing a new release
in case this release is--
used on another wiki's pages,
in order to be able to make
some kind of customization.
For instance, what is the color
of the headers?
We have our color code
but maybe you'll want another,
I don't need to change the code to do that
or what units
do you want for the results
of the measurements?
I use centimeters and meters
but maybe you want feet and--
So all these kinds of things,
we want to make a list
of logical things that another Wikipedia
wants to customize
and define these as parameters
that you can change
without changing the version
of the infobox
because if you change the infobox,
you lose the connection
with the synchronization.
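
A minimal Python sketch of this idea: per-wiki settings are layered over shared defaults, so the shared infobox code itself is never edited. The setting names, the default values and both function names are hypothetical and chosen only for illustration.

```python
# Shared defaults shipped with the infobox (hypothetical values).
DEFAULTS = {"header_color": "#eaecf0", "length_unit": "m"}

# Conversion factors from each supported unit to metres.
METERS_PER_UNIT = {"m": 1.0, "cm": 0.01, "ft": 0.3048}


def effective_settings(overrides):
    """Merge one wiki's overrides over the shared defaults."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unsupported settings: {sorted(unknown)}")
    return {**DEFAULTS, **overrides}


def format_length(meters, settings):
    """Render a length stored in metres using the wiki's preferred unit."""
    unit = settings["length_unit"]
    value = meters / METERS_PER_UNIT[unit]
    return f"{value:.1f} {unit}"


local = effective_settings({"length_unit": "ft"})
print(local["header_color"], format_length(3.048, local))
```

Because only the overrides live on the local wiki, an update to the shared code can be synchronized without losing the local look and feel.
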
By the way, don't tell too many people
but Wikidata infobox
is on the German Wikipedia.
(laughter)
(audience 7) Once more.
(audience 8) Thank you, Mike.
(audience 9) That is a working mechanism
that, okay, we have
a central repository of infoboxes
it's a great idea.
But as a working--if we have
a central repository
of documentation, of these infoboxes
in meta or MediaWiki
or in a central place,
so everybody can benefit.
And so if I can put that page
into my watch list.
If something changes,
I will get a notification
so I can update my template.
That is easy, there is no need
for a proposal for that thing.
We can create a central repository
of documentation.
I don't think there's much
documentation at the moment,
so, yeah, that's important to do.
Yeah, well, actually
it's eight pages for databox,
way longer than the code, itself
because it's for a very small community,
with no technical background.
But I think we have a tragic lack
of documentation in templates,
in general.
Really, a tragic lack.
And I do think that
Wikidata infoboxes are not worse.
Not by a lot.
Because we are working on
so many common Lua modules.
We are always using the same Lua bricks.
So some of this documentation
actually probably exists
in at least the French Wikipedia
but we have a translation problem.
We have--yeah, so a central repository
would be great,
but we will run
into a translation problem.
(audience 10) Dare I just throw out here
in the room that we probably
should have Wikidata items
for all of these infoboxes
and then we can have
the documentation on Wikidata,
where you have multilingual translations
that are a lot easier to do.
Yeah, we actually have the templates
for the translation of the documentation
of databox but people are not translating.
So yeah, you could translate easily,
you can help me translate that
in your languages so people can use.
But I can't translate that in Hausa,
I don't speak Hausa, though.
In any case, when we talk about
the documentation of the template,
we are thinking of a large documentation,
the display in each parameter
and each value that you can put,
et cetera, et cetera, et cetera.
Our experience is that 60% of all articles
have just info [inaudible] person
info [inaudible] building,
info [inaudible] no parameters.
So the unique documentation
that you need and we don't have,
very well, I confess,
is what the model is
of the data that
you have to fill in on Wikidata.
Not how the template runs
because you do not need it anymore.
So we have stories--
oh, sorry, just rather quick--
stories of Catalan and Basque
who are like 80% infoboxes in Wikidata
which is incredible.
English at the other end of the spectrum.
Other languages,
I'd love to hear from other folks
after we hear from Jane,
about your experiences.
French is somewhere
maybe in between... yeah.
(Jane) Okay, the wonderful
Sandra Fauconnier is not
in the room, I don't think,
did this amazing page on Wikidata
for the visual arts
where she actually put in
all of the things
that are actually considered artworks.
So you could have a page on Wikidata
that has your infobox
with all of the fields.
And then the fields can--
those are actual things in Wikidata.
So that's why I talk about
multilingual translation
that is automatically done for you
and you just put it in a huge table.
Yeah, that actually is the same problem
we have on other projects,
we talked about it at the WikidataCon,
and I think it's a running thing,
like Wiki projects, I think
are making data modeling,
saying, hey,
we should do that on Wikidata
and we have an outreach problem
in that Wikidatians who are not working
on this specific subject
don't know how the data is modeled.
And Wikipedians know that even less
and everything else.
So most of it already exists in some form.
But people who need it
don't know they need it
so they don't search
for the help pages which exist.
Yeah, it's not only a Wiki
and infoboxes problem.
It's a more general problem
we ran into in several sessions
today and yesterday, so.
Very good.
Any other reports or folks, or Shani?
(Shani) Well, not report but a question
to the people in the room.
A show of hands if we can,
for a second, how many think that
a central repository of infoboxes
is needed?
Okay, let's do the opposite.
Are there any people who oppose?
Okay, so, Andy, this is for you,
why did you say it's going to take
- a long time to--
- (laughter)
It's going to take a long time
because the people working
on infoboxes want it
and the people using the infoboxes
don't want it.
I have a--
Okay, yeah, so on a technical level,
making a--if you want to do it properly
so having one Wiki,
where you put infobox codes
and you're having other Wikis
taking the code
and doing some proper rendering,
it's kind of hard on the technical level.
So it's going to--so it's basically
like implementing Wikidata--
just like, for example,
getting Wikidata content
into Wikipedia was hard
on a technical level,
it's going to be the same
and so it's something that
a volunteer, for example,
couldn't do at all.
So that's why it takes a long time.
(audience 11) So it's a WMF thing?
It's definitely a WMF thing, yes.
- [inaudible]
- Yeah.
(Andrew) You can help with that.
But it's software release engineering,
basically, if you think about it, right.
So it gets pretty complex.
I understand and I agree with you
that it's not easy.
It's not easy.
Maybe it's a dream,
a central repository,
et cetera, et cetera.
However, when I try to do
an installation pack,
the problems that I found
is not only the translation problem
that the language is different
but even that the--
the modus operandi is different.
For instance, the [inaudible] model
has existed for several years.
All of us copy it from the English version
but after that, we have to change it
in order to adapt to our measurements.
So for these kinds of things,
either the model or the elements
put in the repository
are prepared to do that
or the repository is not the solution.
It's not only a question
of translating the language
but also that the way it runs
must be different for each
of the necessities of each user.
If you don't take care of all of them,
the model will not be universal.
And this is the difficulty,
not the question of how
a repository with automatic replication--
No, this is a technical solution.
Okay.
Actually on the French Wikipedia,
one of the biggest, biggest
and longest edit wars
before Wikidata was a project
was that we have
three different infoboxes for our cities.
And these were infoboxes
with data about cities,
filled in totally manually,
so no Wikidata question here.
And we had edit wars
about these templates for years.
It was a very big thing
because people were like,
"No, I prefer information this way"
or, "I prefer information this way."
And if we have a big repository,
we are multiplying that for every template
across every wiki and for communities
who have had edit wars for years
and years and years.
So we have a technical problem to do that.
And we will fight
with a very, very big push
against it by people who are not
creating the infoboxes
or using the infoboxes
but will just be really unhappy
that the specific field
they are used to having at the top
of the infobox will end up in the middle.
That's what we will fight against.
Because everyone wants a repository.
But everyone wants a repository
of their template as they want them.
(laughter)
(Andrew) Yeah, just--
(audience 3) Thank you,
I think that's a valid point.
But the idea is to provide
a repository of modules.
And then people can put
their own front end on them
if they want to, in a local template.
If they don't want to,
there should be a shared template
which they can use
out of the box as it were.
But it's certainly
meant to be configurable
and that is taken into account
in the proposal that Amir has drafted,
if you read that and indeed,
if you read the discussion page,
that issue has already been addressed.
(Shani) Can I tack on something?
Take the mic.
(Shani) Sorry, just to note
that on Hebrew Wikipedia, for example,
the templates that we--
that the infoboxes that we use,
they are automatic and come from Wikidata.
But there's always an option
for the community to edit that template
and adjust it to our specific needs.
- (Shani) So--
- No, the repository problem
is not about taking information
from Wikidata.
It's like on the French Wikipedia,
for dates, we will have
space between, I don't know,
the day, then the month, then the year.
You don't want that order in English.
So when you pull data for dates,
you want it formatted in English
as you want it.
So the idea of the repository
would be a generic Lua template
with dates where you can just put,
"Hey, this is French,
so I want this formatting.
This is English, I want this."
So whatever infoboxes or other templates
you are using with dates,
you can use the exact same infobox
and have it correct in your language.
And that's a technical problem.
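
As a rough Python sketch of such a shared date helper: one function, with the per-language rules kept as data. The name `format_date`, the two patterns and the choice of "month day, year" for English are assumptions made for illustration; a real shared module would be written in Lua and cover many more languages and conventions.

```python
from datetime import date

# Month names per language; only two languages are filled in here.
MONTHS = {
    "en": ["January", "February", "March", "April", "May", "June", "July",
           "August", "September", "October", "November", "December"],
    "fr": ["janvier", "février", "mars", "avril", "mai", "juin", "juillet",
           "août", "septembre", "octobre", "novembre", "décembre"],
}

# Field order per language (assumed conventions).
PATTERNS = {
    "en": "{month} {day}, {year}",
    "fr": "{day} {month} {year}",
}


def format_date(d: date, lang: str) -> str:
    """Format the same stored date according to the given language's rules."""
    month = MONTHS[lang][d.month - 1]
    return PATTERNS[lang].format(day=d.day, month=month, year=d.year)


bastille = date(1789, 7, 14)
print(format_date(bastille, "fr"))  # 14 juillet 1789
print(format_date(bastille, "en"))  # July 14, 1789
```

The calling infobox stays identical across wikis; only the language code passed in changes.
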
Can I suggest you go to the talk page
and talk about it there?
Amir would love feedback on this,
so please, do edit there.
So, yeah, a repository
is a really great thing
for everyone working on Wikidata
but it's really not Wikidata-centric,
the problem we are running into with that idea.
Everyone here will love it.
But, yeah.
(Andrew) Any other questions.
João, do you have a question?
(audience 12) Just a general question.
Well, sorry, more aimed at Mike,
I have come across the problem
of wanting to use the infoboxes
on species pages in WikiCommons,
species categories in WikiCommons.
And there are taxonomic
minded people who really
do not like Wikidata going anywhere near
all their beautifully curated data
which has no references
- Yeah.
- in their category.
(audience 12) And I find it quite--
I personally can handle the fact
that the two pieces
of data disagree, that's fine
because taxonomists disagree
all the time about whether something
is even a species or not.
But the people who I have edited
their category pages for
go off the deep end at me for doing it,
so I've learned very quickly
to back off and not do it
and only do it for my categories
that I'm creating and I'm very quick
about creating my categories
before they get anywhere near them,
so that I can stick
a Wikidata-sourced infobox on it.
And then they don't take it off
because it's there first.
But they do still put
their own data in there
- and references.
- Yeah.
So taxons are the one exception
on Commons at the moment
- to the Wikidata infobox deployment?
- (audience 12) Yes.
It's probably because it's a Commons thing
but anyway, if you go
to the Village pump/proposals,
there is currently a proposal
that I submitted
to add them to taxons.
So go comment on there, please.
- (audience 12) Okay.
- (laughter)
It's currently being discussed.
(Andrew) We only have
about five minutes left.
- So one--oh, two minutes left.
- Yeah, I have an answer to Andy.
Andy?
What you proposed about...
a common base,
and a personalized presentation,
is the Basque solution.
They copied all the modules
and all the templates.
And after that, they modified
their templates.
So now they are similar
but they have their own copy.
Obviously, also it's time
we made an upgrade.
I say we have a--
with relationship I send a message.
But I can send a copy...
to you?
But this is a [inaudible]
of the implementer.
Make modifications or not.
(Andrew) One last question
for the folks there.
How many people here know about
Shape Expressions in Wikidata?
They're E numbers in Wikidata?
You got a quick slide up there.
But what do you folks predict
as the relationship between
what you're doing with infoboxes
and the rise of Shape Expressions
or ShEx in Wikidata?
There's a mic right there.
I would love to be able to define
infoboxes with Shape Expression.
Because here you would write,
basically my idea is that
you would write a Shape Expression,
on how the data should look
and then you annotate--
maybe annotate it
with some labels if you want to customize
let's say a field name.
And then it would be able to do
multiple things.
First, generating an infobox.
For example, the shape is--
here's an example of a Shape Expression
for a person infobox,
you would just first say that
you have a sex and gender property
with some added values,
some birth dates
and some, let's say
it's a nationality and so on.
And then you could see it as a section
of the infobox to which you could give
a label in multiple languages.
So this way you are able
to first generate an infobox.
Then you are able to validate the data.
If you want to do--
you could even do some fun stuff
like generating having some--
you know the project of Wikimedia Germany
of being able to edit Wikidata
from Wikipedias.
(audience 14) Wikidata Bridge.
Wikidata Bridge, yes, I thank you.
So currently it's working on each field
but if you have these kinds of things
for infoboxes, you can do edit forms
that work not on the field level,
but on the infobox level.
And you already know
what the possible values are.
For example, for a property
you could be able to say
that you might have multiple values
for some but not for others
and so on.
And so you would have both displays,
validations and editing
all in the same place.
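
To make the idea concrete, here is a small Python sketch in which one declarative description drives both display and validation. The `SHAPE` list is a simplified stand-in and is not real ShEx syntax; the field names, labels, allowed values and both function names are hypothetical.

```python
# One description of a "person" infobox: which properties appear, how they are
# labelled per language, which values are allowed, and whether several values
# may coexist. (Simplified stand-in for a Shape Expression, not ShEx itself.)
SHAPE = [
    {"property": "sex or gender",
     "labels": {"en": "Gender", "fr": "Genre"},
     "allowed": {"male", "female", "non-binary"},
     "multiple": False},
    {"property": "date of birth",
     "labels": {"en": "Born", "fr": "Naissance"},
     "allowed": None,
     "multiple": False},
    {"property": "country of citizenship",
     "labels": {"en": "Nationality", "fr": "Nationalité"},
     "allowed": None,
     "multiple": True},
]


def validate(item, shape):
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for field in shape:
        values = item.get(field["property"], [])
        if len(values) > 1 and not field["multiple"]:
            problems.append(f"{field['property']}: more than one value")
        if field["allowed"] is not None:
            for value in values:
                if value not in field["allowed"]:
                    problems.append(f"{field['property']}: unexpected value {value!r}")
    return problems


def render_rows(item, shape, lang):
    """Return (label, text) rows for the fields that have data."""
    rows = []
    for field in shape:
        values = item.get(field["property"], [])
        if values:
            rows.append((field["labels"][lang], ", ".join(values)))
    return rows


person = {"sex or gender": ["female"],
          "date of birth": ["1815-12-10"],
          "country of citizenship": ["United Kingdom"]}
print(validate(person, SHAPE))
print(render_rows(person, SHAPE, "fr"))
```

An edit form could be generated from the same description, since it already says which values are possible and which fields accept several of them.
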
And for the display part,
it would be like for the Catalan state.
With four infoboxes,
they are like ask the data.
So if we have a Shape Expression,
we could say,
"Hey, this is a human,
I want this layout,
this is a location, I want this layout,"
without ever needing
several infoboxes.
So it would be a way
to make databox even prettier,
for a very small Wikipedia and everything.
So yeah, a big thing
I'm enthusiastic about.
With four changes, I solve 50%.
Yes.
(laughter)
I can't see how it integrates Shape
Expressions into the Wikidata infobox
because it just uses--
it's one thing for everything.
Whereas this is breaking it down again
into individual bits and pieces
and changing the amount of formatting
for each individual area
which maybe it isn't so useful.
Very useful for getting data into Wikidata
and saying this is what we want
in these entries.
But then the infoboxes
can just display the whole lot
without having to go
into this complex thing, I think.
I agree with Mike.
One of our goals, initial goals,
was the harmonization.
You don't like my look and feel,
you can make a proposal to change.
But everybody will have the same,
a [inaudible] Wikipedia.
Because if not,
finally you can have
four infoboxes for 50%,
you need 40 infoboxes for 50%.
So it's very dangerous.
(laughter)
No, no, I agree that
maybe the [sign] doesn't like,
I change it.
But all the articles would be the same,
the same that we agree.
(Andrew) Great, well, thank you so much.
Let's have a hand for the [inaudible]
(applause)
Alright, and continue
the conversation online, on the talk page
or find them later on.
Thanks.