(Susanna) ...Wikimedia Finland,
and we have during this year
started working
with the Saami communities,
the culture and language,
starting to experiment and
doing the groundwork for future projects.
(Kimberli) Well, actually she started
working this year.
I've been working since 2006 so...
(laughter)
(Susanna) Well, it's at
the end of chapter...
Yep here we go. Let's see what we have.
I don't know which one it is.
[inaudible]
So usually when we give presentations,
we realize nobody knows
what we're talking about,
the Saami languages.
So this is Norway, Sweden,
Finland and Russia.
And the yellow part--
and it starts quite far down here--
is the Saami dialect continuum
or language continuum.
And the languages
that have Wikipedias are five--
or there's actually only one,
Northern Saami Wikipedia.
And then the other languages
that we work with are six and seven,
and Jon Harald is from Wikimedia Norway,
and they work with the other ones
in Norway and Sweden
and the Northern Saami one.
Sää'mjânnam is the name
for this area in Skolt Saami.
This is somehow...
Yeah, so.
(Susanna) Oh yes, while thinking
about how to serve
these language communities,
as Kimberli was showing there--
maybe we'll go back to the map,
the biggest language community
in the Saami area is the Northern Saami.
And when we think of Saami,
we think of Northern Saami,
but there are at least
eight other Saami communities
and language groups.
So we are working with two,
which is here--it's Inari Saami
as well as Skolt Saami,
they both have around 300 speakers.
So we cannot expect--
now going to the next slide--
there are two different types
of language communities,
those that have Wikipedias
and therefore are served
within the Wikimedia ecosystem
and those that don't have a Wikipedia,
and therefore it's
much more difficult for them.
And we find that working
with structured data,
we can serve
these language communities as well.
So Kimberli may tell you
about this sticker that you have got.
So the sticker says--
in Skolt Saami
which is spoken by about 300 people--
it says Wikimedia Finland wishes
everyone a happy United Nations
International Year
of Indigenous Languages 2019.
And the sticker was created
for an event that we went to
at the end of August in Northern Finland.
(Susanna) So, it wasn't that easy.
So we started setting up language codes
for Skolt Saami and Inari Saami
and found out that it's not
a straightforward process.
It's not really documented.
It was really, really hard
to find out how to do it.
So we made this elephant metaphor
here as a reindeer.
So there are different parts
of this Wikimedia environment
that look at some specific area
of these language
definitions, and there doesn't seem
to be an overall way
and process of how to deal
with adding your languages.
So what we did was we made
a lot of noise
and tried to ask everyone
to help us, and in the end,
we managed to first have
Skolt Saami and Inari Saami
for monolingual properties;
then to labels in Wikidata;
and then only to find out
that they wouldn't work
in structured data on Commons.
Then again after another process
for that, maybe six months after,
we found out that they wouldn't work
in Wikipedias
so I think that's still unsolved.
(Kimberli) When we first started,
you could only use Northern Saami
and Southern Saami
in Wikimedia projects.
And as a bonus part of this,
we have now the ability to use
the Finnish Romani language also
within the Wikimedia projects.
This trying to get your language--
the ability to be able to use
your language in a Wikimedia project
is not straightforward.
It's really difficult,
and when you talk to people,
they're like, "Oh yeah, I'll fix it.
It'll take me five minutes."
And then, yeah, it takes them
five minutes to fix one thing,
but then the next thing is not working,
the next thing, something else breaks,
things like that.
And if we, people who have been
in the Wikimedia projects forever,
can't figure out how this thing works
and how to get things
straightforwardly working,
then we can't expect communities--
language communities that aren't
familiar with the Wikimedia projects
to be able to figure out where to start
and how to navigate this process.
It's not possible.
And there are actual pages
that people are like, "Oh yeah,
there's a page for this."
And you're going, "But it doesn't come up
in Google Search for instance,
so it's not findable."
- Do you want to say something about that?
- (Susanna) No, that's fine.
So well we tried to come up
with some things
that should be looked into.
This is not an exhaustive list,
but well, obviously, the process
needs to be streamlined.
(Kimberli) The one that I really hate
are the language codes.
Because for instance I did research
with [inaudible]
which is a specific language of its own.
And there is no ISO code for it.
There is an ISO code for [inaudible].
And they've lumped together
two different languages
that are completely
unintelligible to each other.
And so Wikimedia projects use ISO codes
for these types of things.
And we really think
that there should be
a more fine-grained level to this.
For Skolt Saami, even though
there's only 300 people that speak it,
we have a lot of data for it.
And there's four main dialects,
and the words aren't the same
in the four dialects.
So I would really like to be able to put
this is from the Paaččjokk dialect,
this is from the Suõ´nn’jel dialect,
and that type of stuff.
But we can't do that.
We can't do that for Spanish.
We can't do it for English even.
And so something has to be done
about the language codes
in the Wikimedia projects.
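A minimal sketch of the kind of finer-grained tagging described here, assuming IETF BCP 47 private-use subtags (the part after `x`) were used to mark a dialect. The function name `with_dialect` and the ASCII dialect spellings `paccjokk` and `suonjel` are made up for illustration, and the check on the language code is simplified to two or three letters. Nothing here is something the Wikimedia projects are known to accept.

```python
import re

# Primary language subtag, simplified here to two or three ASCII letters.
_LANGUAGE = re.compile(r"[A-Za-z]{2,3}")
# Private-use subtag: one to eight ASCII letters or digits (BCP 47).
_PRIVATE = re.compile(r"[A-Za-z0-9]{1,8}")


def with_dialect(language: str, dialect: str) -> str:
    """Build a tag such as 'sms-x-suonjel' from a language code and a dialect subtag."""
    if not _LANGUAGE.fullmatch(language):
        raise ValueError(f"not a 2-3 letter language code: {language!r}")
    if not _PRIVATE.fullmatch(dialect):
        raise ValueError(
            f"private-use subtag must be 1-8 ASCII letters or digits: {dialect!r}"
        )
    return f"{language.lower()}-x-{dialect.lower()}"


# 'sms' is the ISO 639-3 code for Skolt Saami; the dialect subtags are hypothetical.
tags = [with_dialect("sms", d) for d in ("paccjokk", "suonjel")]
print(tags)
```

A private-use subtag keeps the standard language code intact, so software that only understands `sms` could still fall back to it.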
Yeah, and something that started to happen
I think is to engage maybe
the broader language,
linguist language communities
into the decision-making process,
and maybe there are, like, the decisions
that need to be made.
The bureaucracy maybe has
to be somehow assessed.
What are the decisions that are needed
in this sphere?
Like what are the application processes?
What are the... yeah, so.
Thanks to Benjamin's presentation today,
I think PanLex needs
to be added to this too.
(laughing)
(man) We have individual ISO codes
for all the languages you mentioned.
Are you using IETF or... ?
(man) We start with [inaudible] codes
and [inaudible] codes
and then they can just get
a variety ID [inaudible].
[inaudible]
(Kimberli) Good. We'll talk
about it more in the Q&A then.
(moderator) If we can repeat
that for the stream
because it was...
(Susanna) Okay, I can't. (chuckles)
- (moderator) We can do it after.
- (Susanna) Right.
(Kimberli) So some of the ways
that we work together...
We work with the communities themselves,
and we were invited
to this 70-year anniversary
of the Skolts living in Finland.
They were relocated to Finland
when the border was closed off.
And so they've been living in this area
for 70 years,
and there was a big party going on,
and we were there.
She was working with little kids
putting in Moomin characters
in the different Saami languages
and different words like that.
Do you want to say
something else about that?
(Susanna) Yeah, just
to also pinpoint that.
We can find new ways of working
with data or language
so we can go to this--
We can go together with the communities.
We want to create participatory methods
in which we can add more information.
I think we have come up with this idea
of the term of "depictathons"
now that we can work with images
or translateathons which have been
done earlier as well,
but these are the kinds of events
together with the communities
that we can work with the language.
(Kimberli) So some
of the solutions that we have.
(Susanna) Here are two ideas
for next year that we have.
We are developing and seeing
what can be done with them.
One of them comes
as a collaborative project
together with the Saami archives
and the Saami museum in Inari
in the North of Finland,
and we could collect
cultural heritage concepts
across these Nordic countries
in different Saami languages,
but not only Saami languages
but also in the Nordic languages
because we share
a similar cultural heritage/history
that we have similar monuments.
This, of course, came up
with a Wiki Loves Monuments competition
and archeological finds
across the area are similar.
And the other one is place names,
that is a fortunate new project
starting at Wikimedia Norway
that we could expand
to be Pan Nordic,
to include place names in all these.
- Pan Saami.
- Pan Saami, ooh.
(Kimberli) So these are depictathons.
The Skolt Saami--
there are thousands of pictures
of the Skolt Saami in Commons.
They come from different archives,
and they have data,
the structured data on them
is basically from 100 years ago
so it's describing things
in the way that they would have been
described 100 years ago.
We don't want those,
those ways of description there anymore
because a lot of them are racist,
quite racist.
We don't want them.
The community doesn't want them.
The community wants to be able
to write what they want to say
about the pictures in their own language,
or in Finnish or Norwegian or Swedish.
And so we've been having depictathons
as an idea that--
well, we've done it.
So people can change the captions,
change the descriptions
of these pictures in Commons,
and you work with structured data
so I'll let you talk about that.
(Susanna) Yeah, and well,
let's see our next slide
because this is just as--
you all know structured data on Commons
so for you this is no news.
And I think, well from these,
we also enter delicate questions
of what are the descriptions,
but we'll come back to that.
(Kimberli) In the Northern Saami,
we've been creating
autogenerated Wikidata info boxes.
They've been pulling in data
from Wikidata
because I'm the one person
that's correcting everything
in the Northern Saami Wikipedia,
and I don't have time
to change every mayor,
the population of every country,
things like that.
So I've been really blessed
with the people
that have come up and started helping
create these info boxes.
And it's expanded the amount of knowledge
we have in the Northern Saami
Wikipedia greatly.
So this is Nils-Aslak Valkeapää,
who is one of the most famous Saami
multi-talent--he's a polymath.
I mean, he was a singer, a writer,
artist, and we now have
this info box there for him,
all of the data which is pulled
from Wikidata.
Before we had maybe three lines
and no picture.
(Susanna) And this applies specifically
of course to the languages
that have a Wikipedia.
(Kimberli) Yeah, but it doesn't work
in an incubator.
(Susanna) Yep.
This is quite exciting now.
Once we have the--
well, we are not working
with lexicographical data,
like specifically.
We will extend to it,
but we are concerned mainly
about labels and items so far.
So what this makes possible
is tagging content,
museums, libraries
as well as broadcasters.
Yle, the Finnish Broadcasting Company,
is already using
Wikidata for tagging,
so this might be an opportunity
for the small Saami languages
in the Nordic area.
And this is my opportunity to show
my project Wikidocumentaries as well
because it is a project that reads--
well, it's difficult to make the change...
Let me have [inaudible] help.
Yeah, there.
So here we have a page
in Wikidocumentaries,
which is now in English.
This is a project that consumes
information from the Wikimedia sphere.
Every item in Wikidata has a page,
or can be made into a page
or is automatically created into a page.
Then it gathers all this information
across Wikimedia projects,
and the interface exists already
in 40 plus languages,
and I would be able
to change the interface
and then see all the same data
in another language.
I could also, as you can see,
or you were able to see
in the English one,
that there is no article on this
in the English Wikipedia.
Therefore you could go to see
which languages it exists in,
and this one is in Northern Saami.
So you would be able to switch
only the article language.
But then it can also display
any language
that is encoded in Wikidata.
So we also get it
in the same page in Skolt Saami.
Although there is no Wikipedia,
you get all the same content
in these languages.
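The behaviour described here, showing the same item in any language that has labels in Wikidata, can be pictured as a lookup with fallback. This is only a sketch and not how Wikidocumentaries is implemented: the `entity` record is hand-written in the shape of Wikidata's entity JSON (`labels` keyed by language code), the label values are placeholders, and `pick_label` is a hypothetical helper. The codes `sms`, `smn` and `se` stand for Skolt, Inari and Northern Saami.

```python
from typing import List, Optional, Tuple

# Hand-written stand-in for the "labels" part of a Wikidata entity.
# The values are placeholders, not real labels.
entity = {
    "labels": {
        "se": {"language": "se", "value": "<label in Northern Saami>"},
        "sms": {"language": "sms", "value": "<label in Skolt Saami>"},
        "fi": {"language": "fi", "value": "<label in Finnish>"},
    }
}


def pick_label(entity: dict, preferred: List[str]) -> Optional[Tuple[str, str]]:
    """Return (language code, label) for the first preferred language that has a label."""
    labels = entity.get("labels", {})
    for code in preferred:
        if code in labels:
            return code, labels[code]["value"]
    return None  # no label in any of the requested languages


# A Skolt Saami reader gets the Skolt label even though no Skolt Wikipedia exists.
print(pick_label(entity, ["sms", "se", "fi"]))
# An Inari Saami reader falls back to Northern Saami, since this record has no 'smn' label.
print(pick_label(entity, ["smn", "se", "fi"]))
```

The order of the `preferred` list is the design choice: each community can decide which language to fall back to when its own label is missing.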
(Kimberli) There is actually
an article about her
in Skolt Saami on the incubator,
but it doesn't work with Wikidocumentaries
because of the way
the incubator is encoded.
(Susanna) Oh yeah.
And just briefly, I'm very excited
in thinking about an app
that will gamify this
or like collecting these terms
into Wikidata.
But I haven't landed on one,
and I'm sure there are experiences
of that across this community,
and it would be interesting
to put together our thoughts on that.
(Kimberli) So there's
quite a few challenges
that we have in these projects.
This picture, if you come across it
on any Wikipedia please delete it.
It's two Finns dressed as Saami people.
It's labeled fake Saami clothing,
and people still use it
on Wikipedia projects.
I don't know why.
So we have false data.
We have racist--and with the Saami,
we have a lot of eugenics-based data.
So when they were trying to prove
that the Saami were a lower race
so they could sterilize them
and things like that,
we have a lot of that data
because that's the stuff
that comes out of archives.
Data usage--data has been used
without the consent
of the communities,
and for instance, the Skolt community
was kind of shocked to see
that their relatives are in Commons,
and they weren't very appreciative of it.
Sensitive data,
which Stacy can talk more about.
Yeah, this is used
on the Hungarian Wikipedia.
Here's that lovely picture
describing that these people
are Saami people.
Please delete it.
Yeah, this is more
what Stacy will talk about.
(Susanna) Leave it to you?
(Kimberli) Sensitive data.
TK labels--you want to talk about before.
(Susanna) You're not addressing them.
I think we could also look
into identifying content
already on Commons
or just about to enter Commons,
how to tag and identify, tag
and perhaps delete
or else find ways of restricting
the usage of this media.
Well, it's very short,
but let's see if we have
more opportunities to discuss that.
(Kimberli) We can skip this part.
Sorry.
I want to say that this week
is Saami Language Week,
so please feel free to use hashtags
for Saami languages.
Gæjhtoe!
(Susanna) Spä'sseb!
(Kimberli) Spä'sseb!
Takkâ.
(applause)