Hi, guys! Can everybody hear me?
So, hi! Nice to meet you all.
I'm Erica Azzellini.
I'm one of Wiki Movement Brazil's liaisons,
and this is my first international
Wikimedia event,
so I'm super excited to be here
and hopefully I will share
something interesting with you all
in this lightning talk.
So this work starts with research
that I was developing in Brazil,
Computational Journalism
and Structured Narratives with Wikidata.
So in journalism,
they're using natural language
generation software
to automate news stories
that have quite similar
narrative structures.
And we developed this concept
of structured narratives,
thinking about this practice
in computational journalism:
the development of verbal text,
understandable by humans,
automated from predetermined
arrangements that process information
from structured databases,
which looks a lot like
the Wikimedia universe
and the tool that we developed.
So, when I'm talking about verbal text
understandable by humans,
I'm talking about Wikipedia entries.
When I'm talking about
structured databases,
of course, I'm talking about
Wikidata here.
And when I'm talking about
predetermined arrangements,
I'm talking about Mbabel,
which is this tool.
The Mbabel tool was inspired by a template
by user Pharos, right here in front of me,
thank you very much,
and it was developed with Ederporto,
who is right here too,
the brilliant Ederporto.
We developed this tool
that automatically generates
Wikipedia entries
based on information from Wikidata.
We create thematic templates
built on the Wikidata module,
the WikidataIB module,
and these are predetermined,
generic, and editable templates
for various article themes.
We realized that many Wikipedia entries
had a quite similar structured narrative
so we could create a tool
that automatically generates that
for many Wikidata items.
Until now we have templates for museums,
works of art, books, films,
journals, earthquakes, libraries,
archives,
and Brazilian municipal
and state elections, and growing.
So, everybody here is able to contribute
and create new templates.
Each narrative template includes
an introduction, Wikidata infobox,
section suggestions for the users,
content tables or lists with Listeria,
depending on the case,
references and categories,
and of course the sentences,
that are created
with the Wikidata information.
I'm gonna show you in a sec
an example of that.
It's an integration between Wikipedia
and Wikidata,
so the more properties properly filled in
on Wikidata,
the more text you'll get
in your article stub.
That's very important to highlight here.
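To make that principle concrete, here is a minimal Python sketch of the idea, not the actual Mbabel code (which is built as wikitext templates and Lua on top of the WikidataIB module): it fetches an item from Wikidata and only emits the sentences whose properties are actually filled in, so better-described items yield longer stubs. The museum wording, property choices, and example item are illustrative assumptions.

```python
# Minimal sketch of the structured-narrative idea (not the actual Mbabel code):
# fill sentence fragments only for the properties that exist on the item,
# so a better-described item yields a longer article stub.
import requests

def get_item(qid):
    """Fetch one item's labels and claims from Wikidata."""
    url = "https://www.wikidata.org/wiki/Special:EntityData/{}.json".format(qid)
    return requests.get(url, timeout=30).json()["entities"][qid]

def label_of(qid, lang="en"):
    return get_item(qid)["labels"].get(lang, {}).get("value", qid)

def first_value(item, prop):
    """Return the first value of a property claim, or None if absent."""
    claims = item.get("claims", {}).get(prop, [])
    if not claims:
        return None
    snak = claims[0]["mainsnak"]
    if snak.get("snaktype") != "value":
        return None
    value = snak["datavalue"]["value"]
    return value["id"] if isinstance(value, dict) and "id" in value else value

def museum_stub(qid):
    """Compose an introduction from whatever properties are filled in."""
    item = get_item(qid)
    name = item["labels"].get("en", {}).get("value", qid)
    sentences = ["{} is a museum.".format(name)]
    country = first_value(item, "P17")        # P17 = country
    if country:
        sentences[0] = "{} is a museum in {}.".format(name, label_of(country))
    inception = first_value(item, "P571")     # P571 = inception
    if inception:
        sentences.append("It was established in {}.".format(inception["time"][1:5]))
    return " ".join(sentences)

print(museum_stub("Q160236"))  # Q160236 = Metropolitan Museum of Art
```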
Structuring the Wikidata data
can get more complex,
as I'm going to show you
with the election projects that we've made.
So I'm going to leave this
Wikidata Lab XIV link here for you
for after this lightning talk,
which is very brief,
so you'll be able to look
at the work that we've been doing
on structuring Wikidata
for this purpose too.
We have this challenge of building
a narrative template
that is generic enough
to cover different Wikidata items
and to overcome the gender
and number difficulties
of languages,
while still sounding natural for the user,
because if it doesn't sound natural,
it doesn't click for the user
to edit after that.
This is what the Mbabel input form
looks like.
You just have to insert the item number,
call the desired template,
and then you have an article to edit
and expand, and everything.
So, more importantly, why did we do it?
Not just because it's cool to develop
things here with Wikidata;
we all know about that.
But we are experimenting
with this integration
from Wikidata to Wikipedia
and we want to focus
on meaningful individual contributions.
So we've been working
on education programs
and we want the students to feel the value
of their entries too, but not only--
Oh, five minutes only,
Geez, I'm gonna rush here.
(laughing)
And we want it to ease tasks
for users in general,
especially tables
and this kind of content
that is a bit of a chore to do.
And we're working on this concept
of abstract Wikipedia.
Denny Vrandečić wrote a super interesting
article about it,
so I linked it here too.
And we also want to support
small language communities,
to fill the lack of content there.
This is an example of how we've been using
this Mbabel tool for GLAM
and education programs,
and I showed you earlier
the input form of the Mbabel tool,
but we can also make red links
that aren't exactly empty.
So you click on this red link
and you automatically have
this article draft
on your user page to edit.
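As a rough illustration of how a red link can carry a draft with it, here is a hedged Python sketch that builds a MediaWiki edit URL using the standard preload parameter, which pre-fills the edit box with the content of another page. Whether Mbabel uses exactly this mechanism is an assumption on my part, and the wiki, user, title, and preload page names below are purely hypothetical.

```python
# Hedged sketch: a "red link that isn't empty" can be an edit link into the
# user's space with MediaWiki's standard &preload= parameter, which pre-fills
# the editor with the text of another page. Whether Mbabel works exactly this
# way is an assumption; all names below are hypothetical examples.
from urllib.parse import urlencode

def prefilled_draft_link(wiki="https://pt.wikipedia.org",
                         user="ExampleUser",
                         draft_title="Eleições municipais de Exemplo em 2016",
                         preload_page="Usuário(a):ExampleUser/MbabelDraft"):
    params = {
        "title": "Usuário(a):{}/{}".format(user, draft_title),  # draft in user space
        "action": "edit",
        "preload": preload_page,  # page whose wikitext pre-fills the edit box
    }
    return "{}/w/index.php?{}".format(wiki, urlencode(params))

print(prefilled_draft_link())
```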
And I'm going to briefly talk about it
because I only have a few minutes left.
On educational projects,
we've been doing this with elections
in Brazil for journalism students.
We have the experience
with the [inaudible] students
with user Joalpe--
he's not here right now,
but we all know him, I think.
And we realized that we have the data
about Brazilian elections
but we don't have media coverage of it.
So we were also lacking
Wikipedia entries on it.
How do we insert this meaningful
information on Wikipedia
that people really access?
Next year we're going
to have elections,
people are going to look for
this kind of information on Wikipedia
and they simply won't find it.
So this tool looks quite useful
for this purpose
and the students were introduced,
not only to Wikipedia,
but also to Wikidata.
Actually, they were introduced
to Wikipedia with Wikidata,
which is a super interesting experience
and we had a lot of fun,
and it was quite challenging
to organize all that.
We can talk about it later too.
And they also added the background
and the analysis sections
on these election articles,
because we don't want them
to just simply automate the content there.
We can do better.
So this is the example
I'm going to show you.
This is from a municipal election
in Brazil.
Two minutes... oh my!
This example here was entirely created
with the Mbabel tool.
You have here this introduction text.
It really sounds natural for the reader.
The Wikidata infobox here--
it's a masterpiece
of Ederporto right there.
(laughter)
And we have here the tables with the
election results for each position.
And we also have these results here
in textual form too,
so it really looks like an article
that was handcrafted.
The references here were also made
with the Mbabel tool
and we used identifiers
to build these references here
and the categories too.
So, to wrap things up here,
it is still a work in progress,
and we have some outreach
and technical challenges
in bringing Mbabel
to other language communities,
especially the smaller ones,
and in supporting these tools
in lower-resource
language communities too.
And finally, is it possible
to create an Mbabel
that overcomes language barriers?
I think that's a very interesting
question for the conference
and hopefully we can figure
that out together.
So, thank you very much,
and look for the Mbabel poster downstairs
if you'd like to have all this information
wrapped up, okay?
Thank you.
(audience clapping)
(moderator) I'm afraid
we're a little too short for questions
but yes, Erica, as she said,
has a poster and is very friendly.
So I'm sure you can talk to her
afterwards,
and if there's time at the end,
I'll allow it.
But in the meantime,
I'd like to bring up our next speaker...
Thank you.
(audience chattering)
Next we've got Yolanda Gil,
talking about Wikidata and Geosciences.
Thank you.
I come from the University
of Southern California
and I've been working with
Semantic Technologies for a long time.
I want to talk about geosciences
in particular,
where this idea of crowd-sourcing
from the community is very important.
So, to give you a sense:
individual scientists,
most of them in colleges,
collect their own data
for their particular projects.
They describe it in their own way.
They use their own properties,
their own metadata characteristics.
This is an example
of some collaborators of mine
that collect data from a river.
They have their own sensors,
their own robots,
and they study the water quality.
I'm going to talk today about an effort
that we did to crowdsource metadata
for a community that works
in paleoclimate.
The article just came out
so it's in the slides if you're curious,
but it's a pretty large community
that works together
to integrate data more efficiently
through crowdsourcing.
So, if you've heard of the
hockey stick graph for climate,
this is the community that produces it.
This is a study of climate
in the last 200 years,
and it takes them literally many years
to look at data
from different parts of the globe.
Each dataset is collected by
a different investigator.
The data is very, very different,
so it takes them a long time
to put together
these global studies of climate,
and our goal is to make that
more efficient.
So, I've done a lot of work
over the years.
Going back to 2005, we used to call it,
"Knowledge Collection from Web Volunteers"
or from netizens at that time.
We had a system called "Learner."
It collected 700,000 common sense,
common knowledge statements
about the world.
We used a lot of different techniques.
The forms that we built
to extract knowledge from volunteers
really fit the knowledge models,
the data models that we used
and the properties that we wanted to use.
I worked with Denny
on a system called "Shortipedia"
when he was a postdoc at ISI,
looking at keeping track
of the provenance of the assertions,
and we started to build
on Semantic MediaWiki software.
So everything that
I'm going to describe today
builds on that software,
but I think that now we have Wikibase,
we'll be starting to work more
on Wikibase.
So LinkedEarth is the project
where we work with paleoclimate scientists
to crowdsource the metadata,
and you can see in the title
that we call it
"controlled crowdsourcing."
where we could let them create
new properties
but we had an editorial process for it.
So I'll describe to you how it works.
For them, if you're looking at a sample
from lake sediments from 200 years ago,
you use different properties
to describe it
than if you're looking at coral samples
that you extract from the ocean.
Palmyra is a coral atoll in the Pacific.
So if you have coral, you care
about the species and the genus,
but if you're just looking at lake sand,
you don't have that.
So each type of sample
has very different properties.
In LinkedEarth,
they're able to see on a map
where the datasets are.
They actually annotate their own datasets
or the datasets of other researchers
when they're using it.
So they have a reason
why they want certain properties
to describe those datasets.
Whenever there are disagreements,
or whenever there are agreements,
there are community discussions
about them,
and there are also polls to decide
which properties to settle on.
So it's a nice ecosystem.
I'll give you examples.
You look at a particular dataset,
in this case it's a lake in Africa.
So you have the category of the page;
it can be a dataset,
it can be other things.
You can download the dataset itself
and you have kind of canonical properties
that they have all agreed to have
for datasets,
and then under Extra Information,
those are properties
that the person describing this dataset
added of their own accord.
So these can be new properties.
We call them "crowd properties,"
rather than "core properties."
And then when you're describing
your dataset,
in this case
an ice core dataset that you got
from a glacier,
and you're adding a dataset
and want to talk about measurements,
you're offered
all the existing properties
that match what you're saying.
So we do this search completion
so that you can adopt those.
That promotes normalization.
The core of the properties
has been agreed on by the community
so we're really extending that core.
And that core is very important
because it gives structure
to all the extensions.
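Just to illustrate that normalization idea (this is not the LinkedEarth code, and the property names are made up): a tiny sketch of an autocomplete that offers the community-agreed core properties before the ad-hoc crowd ones, so annotators reuse existing terms instead of inventing near-duplicates.

```python
# Minimal sketch of the normalization idea (an illustration, not LinkedEarth's
# code): as an annotator types a property name, suggest matching existing
# properties, with community-agreed "core" ones listed before ad-hoc "crowd"
# ones. The property names are invented for this example.
CORE_PROPERTIES = ["archiveType", "collectedBy", "measuredVariable", "sensorSpecies"]
CROWD_PROPERTIES = ["coralGenus", "lakeDepth", "waterTemperatureUnits"]

def suggest(prefix):
    prefix = prefix.lower()
    core = [p for p in CORE_PROPERTIES if p.lower().startswith(prefix)]
    crowd = [p for p in CROWD_PROPERTIES if p.lower().startswith(prefix)]
    return core + crowd  # core suggestions first, then crowd extensions

print(suggest("c"))  # ['collectedBy', 'coralGenus']
```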
We engage the community
in many different ways.
We had one face-to-face meeting
at the beginning
and after about a year and a half,
we do have a new standard,
and a new way for them
to continue to evolve that standard.
They have editors, very much
in the Wikipedia style
of editorial boards.
They have working groups
for different types of data.
They do polls with the community,
and they have pretty nice engagement
of the community at large,
even if they've never visited our Wiki.
The metadata evolves:
people annotate
their datasets,
then the schema evolves,
the properties evolve,
and we have an entire infrastructure
and mechanisms
to re-annotate the datasets
with the new structure of the ontology
and the new properties.
This is described in the paper.
I won't go into the details.
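Just to give a flavor of what such a re-annotation step can look like, here is a rough sketch under my own assumptions, not their actual mechanism (which is in the paper): once the community promotes or renames properties, existing annotations are rewritten against the new schema using a migration mapping. The property names in the mapping are invented.

```python
# Very rough sketch of the re-annotation idea: rewrite stored annotations
# against the evolved schema. The mapping below is invented for illustration;
# the real mechanism is described in the LinkedEarth paper.
SCHEMA_MIGRATION = {
    "coralGenus": "sensorGenus",  # crowd property promoted under a new core name
    "lakeDepth": "waterDepth",
}

def reannotate(annotations):
    """Rewrite one dataset's property/value pairs using the migration mapping."""
    return {SCHEMA_MIGRATION.get(prop, prop): value
            for prop, value in annotations.items()}

print(reannotate({"coralGenus": "Porites", "archiveType": "coral"}))
# {'sensorGenus': 'Porites', 'archiveType': 'coral'}
```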
But I think that
having that kind of capability
in Wikibase would be really interesting.
We basically extended
Semantic MediaWiki and MediaWiki
to create our own infrastructure.
I think a lot of this is now something
that we find in Wikibase,
but this is older than that.
And in general, we have many projects
where we look at crowdsourcing
not just descriptions of datasets
but also descriptions of hydrology models,
descriptions of multi-step
data analytic workflows
and many other things in the sciences.
So we are also interested in including
in Wikidata additional things
that are not just datasets or entities
but also other things
that have to do with science.
I think Geosciences are more complex
in this sense than Biology, for example.
That's it.
Thank you.
(audience clapping)
- Do I have time for questions?
- Yes.
(moderator) We have time
for just a couple of short questions.
When answering,
can you go back to the microphone?
- Yes.
- Hopefully, yeah.
(audience 1) Does the structure allow
tabular datasets to be described
and can you talk a bit about that?
Yes. So the properties of the datasets
talk more about who collected them,
what kind of data was collected,
what kind of sample it was,
and then there's a separate standard,
which is called LiPD,
that's complementary and mapped
to the properties
that describes the format
of the actual files
and the actual structure of the data.
So, you're right that there's both,
"how do I find data about x"
but also, "Now, how do I use it?
How do I know where
the temperature that I'm looking for
is actually in the file?"
(moderator) This will be the last.
(audience 2) I'll have
to make it relevant.
So, you have shown this process
of how users can suggest
or like actually already put in
properties,
and I didn't fully understand
how this thing works,
or what's the process behind it.
Is there some kind of
folksonomy approach--obviously--
but how is it promoted
into the core vocabulary
if something is promoted?
Yes, yes. It is.
So what we do is we have a core ontology
and the initial one was actually
very thoughtfully put together
through a lot of discussion
by very few people.
And then the idea was
the whole community can extend that
or propose changes to that.
So, as they are describing datasets,
they can add new properties
and those become "crowd properties."
And every now and then,
the Editorial Committee
looks at all of those properties,
the working groups look at all of those
crowd properties,
and decide whether to incorporate them
into the main ontology.
So it could be because they're used
for a lot of dataset descriptions.
It could be because
they are proposed by somebody
and they're found to be really interesting
or key, or uncontroversial.
So there's an entire editorial process
to incorporate those new crowd properties
or the folksonomy part of it,
but they are really built around the core
of the ontology.
The core ontology then grows
with more crowd properties
and then people propose
additional crowd properties again.
So we've gone through a couple
of these iterations
of rolling out a new core,
and then extending it,
and then rolling out a new core
and then extending it.
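A hedged sketch of one signal in that editorial process, heavy usage of a crowd property, which the committee can then review for promotion into the core; the data layout and threshold are illustrative assumptions, not how LinkedEarth actually implements it.

```python
# Hedged illustration of one promotion signal described above: crowd properties
# used in many dataset descriptions get flagged for editorial review.
# The data layout and threshold are assumptions made up for this example.
from collections import Counter

def promotion_candidates(dataset_annotations, core_properties, min_uses=10):
    """dataset_annotations: one dict of property -> value per annotated dataset."""
    usage = Counter(prop
                    for annotations in dataset_annotations
                    for prop in annotations
                    if prop not in core_properties)
    return sorted(prop for prop, count in usage.items() if count >= min_uses)

# Example: 'coralGenus' is used often enough to be worth a look by the editors.
datasets = [{"coralGenus": "Porites"}] * 12 + [{"lakeDepth": "4 m"}] * 3
print(promotion_candidates(datasets, core_properties={"archiveType"}))
# ['coralGenus']
```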
- (audience 2) Great. Thank you.
- Thanks.
(moderator) Thank you.
(audience applauding)
(moderator) Thank you, Yolanda.
And now we have Adam Shorn
with "Something About Wikibase,"
according to the title.
Uh... where's the internet? There it is.
So, I'm going to do a live demo,
which is probably a bad idea
but I'm going to try and do it
as the birthday present later
so I figure I might as well try it here.
And I also have some notes on my phone
because I have no slides.
So, two years ago,
I made these Wikibase Docker images
that quite a few people have tried out,
and even before then,
I was working on another project,
which is kind of ready now,
and here it is.
It's a website that allows you
to instantly create a Wikibase
with a query service and quick statements,
without needing to know about
any of the technical details,
without needing to manage
any of them either.
There are still lots of features to go
and there's still some bugs,
but here goes the demo.
Let me get my emails up ready...
because I need them too...
Da da da... Stopwatch.
Okay.
So it's as simple as...
at the moment it's locked down behind...
Oh no! German keyboard!
(audience laughing)
Foiled... okay.
Okay.
(audience continues to laugh)
Aha! Okay.
I'll remember that for later.
(laughs)
Yes.
♪ (humming) ♪
Oh my god... now it's American.
All you have to do is create an account...
da da da...
Click this button up here...
Come up with a name for Wiki--
"Demo1"
"Demo1"
"Demo user"
Agree to the terms
which don't really exist yet.
(audience laughing)
Click on this thing which isn't a link.
And then you have your Wikibase.
(audience cheers and claps)
"Anmelden"... that's "log in" in German.
Demo... oh god! I'm learning lots about
my demo later.
1-6-1-4-S-G...
- (audience 3) Y...
- (Adam) It's random.
(audience laughing)
Oh, come on....
(audience laughing)
Oh no. It's because this is a capital U...
(audience chattering)
6-1-4....
S-G-ENJ...
Is J... oh no. That's... oh yeah. Okay.
I'm really... I'm gonna have to look
at the laptop
that I'm doing this on later.
Cool...
Da da da da da...
Maybe I should have some things
in my clipboard ready.
Okay, so now I'm logged in.
Oh... keyboards.
So you can go and create an item...
Yeah, maybe I should make a video.
It might be easier.
So, yeah. You can make items,
you have quick statements here
that have... oh... it is all in German.
(audience laughing)
(sighs)
Oh, log in? Log in?
It has... Oh, set up ready.
Da da da...
It's as easy as...
I learned how to use
Quick Statements yesterday...
that's what I know how to do.
I can then go back to the Wiki...
We can go and see in Recent Changes
that there are now two items,
the one that I made
and the one from Quick Statements...
and then you go to Quick...
♪ (hums a tune) ♪
Stop...no...
No...
(audience laughing)
Oh god...
I'm glad I tried this out in advance.
There you go.
And the query service is updated.
(audience clapping)
And the idea of this is it'll allow
people to try out Wikibases.
Hopefully, it'll even be able
to allow people to...
have their real Wikibases here.
At the moment you can create
as many as you want
and they all just appear
in this lovely list.
As I said, there's lots of bugs
but it's all super quick.
Exactly how this is going to continue
in the future, we don't know yet
because I only finished writing this
in the last few days.
It's currently behind an invitation code
so that if you want to come try it out,
come and talk to me.
And if you have any other comments
or thoughts, let me know.
Oh, three minutes...40. That's...
That's not that bad.
Thanks.
(audience clapping)
Any questions?
(audience 5) Are Quick Statements
and the Query Service
automatically updated?
Yes. So the idea is that
there will be somebody,
at the moment, me,
maintaining, behind the scenes,
all of the horrible stuff
so that you don't have to.
So kind of think of it like GitHub.com,
but you don't have to know anything
about Git to use it. It's just all there.
- [inaudible]
- Yeah, we'll get that.
But any of those
big hosted solution things.
- (audience 6) A feature request.
- Yes.
Is there any-- in scope,
do you have plans to make it
so you can easily import existing...
- Wikidata...
- I have loads of plans.
Like I want there to be a button
where you can just import
another whole Wikibase and all of--yeah.
It will be on the future list,
which is really long. Yeah.
(audience 7) I understand that it's...
you want to make it user-friendly
but if I want access
to the machine itself, can I do that?
Nope.
(audience laughing)
So again, like, in the longer term future,
there are possib...
Everything's possible,
but at the moment, no.
(audience 8) Two questions.
Is there a plan to have export tools
so that you can export it
to your own Wikibase maybe at some point?
- Yes.
- Great.
And is this a business?
I have no idea.
(audience laughing)
Not currently.
(audience 9) What if I stop
using it tomorrow,
how long will the data be there?
So my plan was at the end of WikidataCon
I was going to delete all of the data
and there's a Wikibase Workshop
on a Sunday,
and we will maybe be using this
for the Wikibase workshop
so that everyone can have
their own Wikibase.
And then, from that point,
I probably won't be deleting the data
so it will all just stay there.
(moderator) Question.
(audience 10) It's two minutes...
Alright, fine. I'll allow two more
questions if you talk quickly.
(audience laughing)
- Alright, good people.
- Thank you, Adam.
Thank you for letting me test
my demo... I mean...
I'm going to do it differently.
(audience clapping)
(moderator) Thank you.
Now we have Dennis Diefenbach
presenting Q Answer.
Hello, I'm Dennis Diefenbach,
I would like to present Q-Answer
which is a question-answering system
on top of Wikidata.
So, what we need are some questions
and this is the interface of QAnswer.
For example, where is WikidataCon?
Alright, I think it's written like this.
2019... And we get this response
which is Berlin.
So, other questions. For example,
"When did Wikidata start?"
It started on 30 October 2012,
so its birthday is approaching.
It is 6 years old,
so it will be its 7th birthday.
Who is developing Wikidata?
The Wikimedia Foundation
and Wikimedia Deutschland,
so thank you very much to them.
Something like museums in Berlin...
I don't know why this is not so...
Only one museum... no, yeah, a few more.
So, when you ask something like this,
we allow the user
to explore the information
with different aggregations.
For example,
if there are many geo coordinates
attached to the entities,
we will display a map.
If there are many images attached to them,
we will display the images,
and otherwise there is a list
where you can explore
the different entities.
You can ask something like
"Who is the mayor of Berlin,"
"Give me politicians born in Berlin,"
and things like this.
So you can both ask keyword questions
and full natural language questions.
The whole data is coming from Wikidata
so all entities which are in Wikidata
are queryable by this service.
And the data is really all from Wikidata,
in the sense that,
apart from some Wikipedia snippets
and some images from Wikimedia Commons,
the rest is all Wikidata data.
We can do this in several languages.
This is now in Chinese.
I don't know what is written there
so do not ask me.
We are currently supporting these languages
with more or less good quality
because... yeah.
So, how can this be useful
for the Wikidata community?
I think there are different reasons.
First of all, this thing helps you
to generate SPARQL queries
and I know there are even some workshops
about how to use SPARQL.
It's not a language that everyone speaks.
So, if you ask something like
"a philosopher born before 1908,"
figuring out how to construct
a SPARQL query like this could be tricky.
In fact when you ask a question,
we generate many SPARQL queries
and the first one is always
the SPARQL query that we think
is the good one.
So, if you ask your question
and then you go to the SPARQL list,
then there is this button
for the Wikidata query service
and you have the SPARQL query right there
and you will get the same result
as you would get in the interface.
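For readers who haven't written SPARQL, this is roughly the kind of query behind a question like "philosophers born before 1908", run here against the public Wikidata Query Service from Python. The exact query QAnswer generates may differ; this is a hand-written illustration.

```python
# Roughly the kind of SPARQL that answers "philosophers born before 1908",
# run against the public Wikidata Query Service. The exact query QAnswer
# generates may differ; this is a hand-written illustration.
import requests

QUERY = """
SELECT ?person ?personLabel WHERE {
  ?person wdt:P106 wd:Q4964182 .   # occupation: philosopher
  ?person wdt:P569 ?born .         # date of birth
  FILTER(YEAR(?born) < 1908)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 10
"""

response = requests.get(
    "https://query.wikidata.org/sparql",
    params={"query": QUERY, "format": "json"},
    headers={"User-Agent": "sparql-example/0.1"},
)
for row in response.json()["results"]["bindings"]:
    print(row["personLabel"]["value"])
```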
Another thing it could be useful for
is finding missing
contextual information.
For example, if you ask for actors
in "The Lord of the Rings,"
most of these entities
will have an associated image,
but not all of them.
So here there is some missing metadata
that could be added.
You could see first that an image
is missing, then go to this entity
and add one, and so on.
Another thing is that you could find
schema issues.
For example, if you ask
"books by Andrea Camilleri,"
who is a famous Italian writer,
you would currently get
these three books.
But he wrote many more.
He wrote more than 50.
And so the question is,
are they not in Wikidata,
or is the knowledge maybe not modeled
correctly as it currently is?
And in this case, I know
there is another book from him,
which is "Un mese con Montalbano."
It has only an Italian label
so you can only search it in Italian.
And if you go to this entity,
you will see that he has written it.
It's a short story by Andrea Camilleri
and it's an instance of literary work,
but it's not an instance of book,
so that's the reason why
it doesn't appear.
This is a way to track
where things are missing
or where the Wikidata model
is not as you would expect.
Another reason is just to have fun.
I imagine that many of you added
many Wikidata entities
so just search for the ones
that you care about most
or have edited yourself.
So in this case: who developed
QAnswer? And that's it.
For any other questions,
go to www.QAnswer.eu/qa
and hopefully we'll find
an answer for you.
(audience clapping)
- Sorry.
- I'm just the dumbest person here.
(audience 11) So I want to know
how is this kind of agnostic
to Wikibase instance,
or has it been tied to the exact
like property numbers
and things in Wikidata?
Has it learned in some way
or how was it set up?
There is training data
and we rely on training data,
and in most cases this is also
why you will not get good results.
But we're training the system
with simple yes and no answers.
When you ask a question,
we always ask for feedback, yes or no,
and this feedback is used by
the machine learning algorithm.
This is where machine learning
comes into play.
But basically, we put up separate
Wikibase instances
and we can plug this in.
In fact, the system is agnostic
in the sense that it only wants RDF.
And RDF you have in each Wikibase;
there are a few configurations,
but you can have this on top
of any Wikibase.
(audience 11) Awesome.
(audience 12) You mentioned that
it's being trained by yes/no answers.
So I guess this is assuming that
the Wikidata instance is free of errors
or is it also...?
You assume that the Wikidata instances...
(audience 12) I guess I'm asking, like,
are you distinguishing
between source level errors
or misunderstanding the question
versus a bad mapping, etc.?
Generally, we assume that the data
in Wikidata is true.
So if you click "no"
because the data in Wikidata is actually false,
then yeah... we would not catch
that difference.
But honestly, Wikidata quality
is very good,
so I have rarely had this problem.
(audience 12) Is this data available
as a dataset by any chance, sir?
- What is... direct service?
- The... dataset of...
"is this answer correct
versus the query versus the answer?"
Is that something you're publishing
as part of this?
- The training data that you've...
- We published the training data.
We published some old training data
but no, just a--
There is a question there.
I don't know if we have still time.
(audience 13) Maybe I just missed this
but is it running on a live,
like the Live Query Service,
or is it running on
some static dump you loaded
or where is the data source
for Wikidata?
Yes. The problem is
to apply this technology,
you need a local dump.
Because we do not rely only
on the SPARQL endpoint,
we rely on special indexes.
So, we are currently loading
the Wikidata dump.
We are updating this every two weeks.
We would like to do it more often,
in fact we would like to get the diffs
for each day, for example,
to put them in our index.
But unfortunately, right now,
the Wikidata dumps are released
only once every week.
So, we cannot be faster than that
and we also need some time
to re-index the data,
so it takes one or two days.
So we are always behind. Yeah.
(moderator) Any more?
- Okay, thank you very much.
- Thank you all very much.
(audience clapping)
(moderator) And now last, we have
Eugene Alvin Villar,
talking about Panandâ.
Good afternoon,
my name is Eugene Alvin Villar
and I'm from the Philippines,
and I'll be talking about Panandâ:
a mobile app powered by Wikidata.
This is a follow-up to my lightning talk
that I presented two years ago
at WikidataCon 2017
together with Carlo Moskito.
You can download the slides
and there's a link
to that presentation there.
I'll give you a bit of a background.
Wiki Society of the Philippines,
formerly, Wikimedia Philippines,
had a series of projects related
to Philippine heritage and history.
So we have the usual photo contests,
Wikipedia Takes Manila,
Wiki Loves Monuments,
and then our major project
was the Cultural Heritage Mapping Project
back in 2014-2015.
In that project, we trained volunteers
to edit articles
related to cultural heritage.
This is our biggest
and most successful project that we've had.
794 articles were created or improved,
including 37 "Did You Knows"
and 4 "Good Articles,"
and more than 5,000 images were uploaded
to Commons.
As a result of that, we then launched
the Encyclopedia
of Philippine Heritage program
in order to expand the scope
and also include Wikidata in the scope.
Here's the Core Team: myself,
Carlo and Roel.
Our first pilot project was to document
the country's historical markers
in Wikidata and Commons,
starting with those created by
our national historical agency, the NHCP.
For example, they installed a marker
for our national hero, here in Berlin,
so there's now a Wikidata page
for that marker
and a collection of photos of that marker
in Commons.
Unfortunately, the government agency
does not keep a good, up-to-date,
or complete database of their markers,
so we have to painstakingly input these
into Wikidata manually.
After careful research and confirmation,
here's a graph of the number of markers
that we've added to Wikidata over time,
over the past three years.
And we've developed
this Historical Markers Map web app
that lets users view
these markers on a map,
so we can browse it as a list,
view a good visualization of the markers
with information and inscriptions.
All of this is powered by live queries
to the Wikidata Query Service.
There's the link
if you want to play around with it.
And so we developed
a mobile app for this one.
To better publicize our project,
I developed Panandâ,
which is Tagalog for "marker,"
as an Android app
that was published back in 2018,
and I'll publish the iOS version
sometime in the future, hopefully.
I'd like to demo the app
but we have no time,
so here are some
of the features of the app.
There's a Map and a List view,
with text search,
so you can drill down as needed.
You can filter by region or by distance,
and by whether you have marked
these markers
as visited
or bookmarked them
for future visits.
Then you can use the GPS
on your mobile phone
for distance filtering.
For example, if I want markers
that are near me, you can do that.
And when you click on the Details page,
you can see the same thing,
photos from Commons,
inscription about the marker,
how to find the marker,
its location and address, etc.
And one thing that's unique to this app
is that you can, again, mark these
as visited or bookmark them,
so on the map or on the list,
or on the Details page,
you can just tap on those buttons
and say that you've visited them,
or you'd like to bookmark them
for future visits.
And my app has been covered by the press
and given recognition,
so plenty of local press articles.
Recently, it was selected
as one of the Top 5 finalists
for the Android Masters competition
in the App for Social Good category.
The final event will be next month.
Hopefully, we'll win.
Okay, so some behind the scenes.
How did I develop this app?
Panandâ is actually a hybrid app,
it's not native.
Basically it's just a web app
packaged as a mobile app
using Apache Cordova.
That reduces development time
because I don't have to learn
a different language.
I know JavaScript, HTML.
It's cross-platform, allows code reuse
from the Historical Markers Map.
And the app is also free and open source,
under the MIT license.
So there's the GitHub repository
over there.
The challenge is
that the app's data is not live,
because if you query the data live,
it means pulling around half
a megabyte of compressed JSON every time,
which is not friendly
for those on mobile data,
incurs too much delay when starting
the app,
and if there are any errors in Wikidata,
that may result in poor user experience.
So instead, what I did was
the app is updated every few months
with fresh data, compiled using
a Perl script
that queries Wikidata Query Service,
and this script also does
some data validation
to highlight consistency or schema errors,
so that allows fixes before updates
in order to provide a good experience
for the mobile user.
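The talk says the compile step is a Perl script; just to illustrate the idea, here is a hedged Python sketch of the same compile-and-validate approach: pull the markers from the Wikidata Query Service once, flag records with missing fields, and only then bundle the JSON into the app. The class QID, required properties, and file name are illustrative assumptions, not the project's actual schema rules.

```python
# Hedged Python sketch of the compile-and-validate step described above
# (the real project uses a Perl script). The class QID and required
# properties are illustrative assumptions only.
import json
import requests

WDQS = "https://query.wikidata.org/sparql"

def fetch_markers(marker_class_qid):
    """Fetch all items of the given class with label, coordinates, and inscription."""
    query = """
    SELECT ?marker ?markerLabel ?coord ?inscription WHERE {
      ?marker wdt:P31 wd:%s .                        # instance of the marker class
      OPTIONAL { ?marker wdt:P625 ?coord . }         # P625 = coordinate location
      OPTIONAL { ?marker wdt:P1684 ?inscription . }  # P1684 = inscription
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }
    """ % marker_class_qid
    rows = requests.get(WDQS, params={"query": query, "format": "json"},
                        headers={"User-Agent": "marker-compiler/0.1"}).json()
    return rows["results"]["bindings"]

def validate(rows):
    """Flag records that would give mobile users a poor experience."""
    problems = []
    for row in rows:
        qid = row["marker"]["value"].rsplit("/", 1)[-1]
        if "coord" not in row:
            problems.append("{}: missing coordinates (P625)".format(qid))
        if "inscription" not in row:
            problems.append("{}: missing inscription (P1684)".format(qid))
    return problems

if __name__ == "__main__":
    markers = fetch_markers("Q721747")  # illustrative class QID only; use the real one
    for issue in validate(markers):
        print(issue)
    with open("markers.json", "w", encoding="utf-8") as out:
        json.dump(markers, out, ensure_ascii=False)
```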
And if you're tech-oriented,
here are, more or less,
the technologies that I'm using:
a bunch of JavaScript libraries,
the Perl script
that queries Wikidata,
some Cordova plug-ins,
and building it using Cordova
and then publishing this app.
And that's it.
(audience clapping)
(moderator) I hope you win.
Alright, questions.
(audience 14) Sorry if I missed this.
Are you opening your code
so the people can adapt your app
and do it for other cities?
Yes, as I've mentioned,
the app is free and open source,
- (audience 14) But where is it?
- There's the GitHub repository.
You can download the slides,
and there's a link
in one of the previous slides
to the repository.
(audience 14) Okay. Can you put it?
Yeah, at the bottom.
(audience 15) Hi. Sorry, maybe
I also missed this,
but how do you check for schema errors?
Basically, we have a Wikiproject
on Wikidata,
so we try to put guidelines there
on how to model these markers correctly.
Although it's not updated right now.
As far as I know, we're the only country
that's currently modeling these
in Wikidata.
There's also an effort
to add [inaudible]
in Wikidata,
but I think that's
a different thing altogether.
(audience 16) So I guess this may be part
of this Wikiproject you just described,
but for the consistency checks,
have you considered moving those
into like complex schema constraints
that then can be flagged
on the Wikidata side for
what there is to fix on there?
I'm actually interested in seeing
if I can do, for example,
shape expressions, so that, yeah,
we can do those things.
(moderator) At this point,
we have quite a few minutes left.
The speakers did very well,
so if Erica is okay with it,
I'm also going to allow
some time for questions,
still about this presentation,
but also about Mbabel,
if anyone wants to jump in
with something there,
either presentation is fair game.
Unless like me, you're all so dazzled
that you just want to go to snacks
and think about it.
(audience giggles)
- (moderator) You know...
- Yeah.
(audience 17) I will always have
questions about everything.
So, I came in late for the Mbabel tool.
But I was looking through
and I saw there's a number of templates,
and I was wondering
if there's a place to contribute
to adding more templates
for different types
or different languages and the like?
(Erica) So for now, we're developing
those narrative templates
on Portuguese Wikipedia.
I can show you if you like.
We're inserting those templates
on English Wikipedia too.
It's not complicated to do
but we have to expand for other languages.
- French?
- French.
- Yes.
- French and German already have.
(laughing)
Yeah.
(inaudible chatter)
(audience 18) I also have a question
about Mbabel,
which is, is this really just templates?
Is this based on Lua scripting?
Is that all? Wow. Okay.
Yeah, so it's very deployable. Okay. Cool.
(moderator) Just to catch that
for the live stream,
the answer was an emphatic nod
of the head, and a yes.
(audience laughing)
- (Erica) Super simple.
- (moderator) Super simple.
(audience 19) Yeah.
I would also like to ask.
Sorry I haven't delved
into Mbabel earlier.
I'm wondering, you're working also
with the links, the red links.
Are you adding some code there?
- (Erica) For the lists?
- Wherever the link comes from...
(audience 19) The architecture.
Maybe I will have to look into it.
(Erica) I'll show you later.
(moderator) Alright. You're all ready
for snack break, I can tell.
So let's wrap it up.
But our kind speakers,
I'm sure will stick around
if you have questions for them.
Please join me in giving... first of all
we didn't give a round of applause yet.
I can tell you're interested in doing so.
(audience clapping)