(speaking in Maori)
As has been explained,
I'm Siobhan Leachman.
I'm a Wikimedian from New Zealand.
I contribute to Wikidata,
as well as English Wikipedia
and the Wikimedia Commons.
I'd like to thank
the Wikimedia Foundation,
Wikimedia Deutschland,
and, in particular,
the organizing committee
of the WikidataCon
for enabling me to attend
this conference and present today.
Now, in this presentation,
I want to tell you about the vital role
I think Wikidata and Wikidata editors
can play in surfacing notable women.
I want to take you through my workflows,
ensuring that these underacknowledged
women and their work
can be added to Wikidata.
I want to show how the curation
of data on these women
can assist with the creation
of citable secondary sources.
This, in turn, can encourage and enable
the creation of Wikipedia articles
about these women
in a variety of languages.
Now, I'm sure you're aware
that Wikipedia editors are working hard
to write more articles on women.
Examples of projects
focusing on this type of work
are the Women in Red project
or the WikiProject Women Scientists.
But one of the main hurdles
I've experienced
when attempting to write
about women in Wikipedia
is the notability criteria.
When writing articles on women,
I've found this criteria
can be a challenge to achieve.
I've discovered women
are less likely to be written about
in citable secondary sources,
and this has particularly
been brought home to me
when I've attempted to write articles
about women in the sciences pre-1950.
However, just like in our Wiki projects,
there are plenty of researchers
and creators of secondary sources
out in the wider world
attempting to change this.
They just need to be pointed
in the direction of these women,
and I believe Wikidata can be their arrow.
Now, yes, like Wikipedia,
Wikidata has a notability criteria
that must be met.
But this criteria is a much lower bar.
I'm advocating using Wikidata
to get a foot in the Wiki door
for underrepresented groups.
By adding these women to Wikidata,
editors can then make it easier
for the data about them
to be collated, curated, and linked.
In doing so, it would make it easier
for researchers and writers,
the generators of these vital
secondary sources,
to find these women
and then to use the data
to guide their research.
Once coverage reaches
the Wikipedia notability threshold,
Wikipedia editors can then create articles
on these underrepresented people.
Now, I want to show you
how I put this into practice,
to take you through how I started
on this data journey,
and to give you examples
of the collaborations
I and others like me
have managed to forge,
enabling this type of work to be done.
Now, I tend to focus on data about women
in the field of natural history--
these are women scientific illustrators,
collectors of specimens
as well as women scientists,
such as botanists and zoologists.
I became interested in these women
when I started volunteering
for the Smithsonian Transcription Center.
I helped transcribe
natural history specimens
or scientific handwritten field notebooks,
and, in doing so, I frequently
came across women,
many of whom had contributed
specimens to the Smithsonian
or had undertaken scientific research.
At the same time, I was volunteering
for the Biodiversity Heritage Library,
or BHL.
Now, BHL is the world's
largest open-access digital library
of biodiversity literature and archives.
Much of the biodiversity literature
they host is historic
and therefore in the public domain.
They've got an extensive collection
of scientific illustrations in Flickr.
So I would tag those images
not just with taxonomic names
but with illustrator tags as well.
That metadata is in turn ingested
and stored into BHL.
The hope is for those tags
to become searchable
at some point in the future
on BHL's website.
But as an added bonus, many of these tags
have been incorporated into Wikicommons
as a result of those Flickr files
being bulk uploaded
by other Wikicommons editors.
It was while transcribing
with the Smithsonian
I met and started collaborating
with another volunteer,
Michelle Marshall.
Both of us were avid taggers
of BHL images,
and while doing this work,
both Michelle and I
were enthusiastically encouraged
by Grace Constantino, the BHL Outreach
and Communications Manager.
And while tagging, we would again
come across women,
so many women, amazing women,
about whom there appeared
to be little known or written.
Some of these women
would be illustrating multiple articles,
books, and scientific publications.
Others would be writing
the books or articles,
amassing collections of specimens,
or having species named after them.
Both Michelle and I were really keen
on making known more about these women,
but there was very little
about them on the internet.
Every once in a while,
there would be a woman
who had significant coverage,
enough so there was a Wikipedia article
created about them,
but this was an exception
rather than the rule.
This lack of coverage
was frustrating to both of us,
and, as a result, I became keen
on learning how to edit Wikipedia.
Both the folk in the Smithsonian
and BHL were extremely encouraging.
They too were keen
on addressing this issue
of underrepresented women
and wanted to highlight
notable women in their collections
via various WikiProjects.
So both Michelle and I
started researching,
me with the aim of writing
Wikipedia articles,
her with the aim of writing blog posts
and enriching the BHL Instagram account.
Now, on the rare occasion we managed
to find enough sources and references
to get these women over
the English Wikipedia notability criteria,
I'd actually write an article.
But as I've explained, this tended to be
the exception rather than the rule.
Historically, much
of these women's illustration work
was not regarded
at the time of their creation
as being worthy of comment.
At most, they received a passing remark
in the reviews of the publication
or perhaps an acknowledgment
by the author of the work.
This lack resulted in them
being overlooked by library catalogs,
and they and their contributions
were simply not recorded.
They created scientific illustrations
so didn't tend
to exhibit in art galleries.
The art was created to enhance
the scientific publication
and wasn't treated as a stand-alone work,
worthy of critique and public display.
It was, therefore, very rare
to find enough sources
to get these women artists
over the notability hurdle.
But we tried.
Working together, Michelle and I
began researching these women
and gathering our information
into a Google spreadsheet.
Often, we'd track down enough data
to work out who they were,
the works they contributed to,
and who they worked for.
BHL recently enabled a full text search,
which has significantly improved
our ability to find information on them.
We'd search for and, if we were lucky,
find external identifiers,
such as the BHL creator ID
or the Stuttgart Scientific
Illustrators Database ID,
or if we were really lucky, a VIAF ID.
However, there was no guarantee
an external database
identifier would exist.
So we'd tag their plates in Flickr,
collate our research on these women
in our spreadsheets,
and then wait for more books and articles
and institution blogs
and research to be generated.
For me, getting them into Wikipedia
was the gold standard,
but I could stretch
the notability criteria only so far.
My first Wikipedia article
on a woman botanist
was nominated for deletion,
and ever since that experience,
I've been extremely careful
about ensuring I did everything possible
to meet the notability criteria.
But I was actively looking for ways
to make our work
more impactful and effective.
Now, at this point,
I know what you're thinking,
what about Wikidata?
And I completely agree.
As soon as I discovered Wikidata,
I took the leap and started editing.
But, again, unfortunately,
I came up against
the Wikidata notability criteria.
Early on, I had an item deleted
due to my failure to meet
even the Wikidata notability criteria.
I was having to meet even that low bar.
But this was all part
of my learning by mistakes,
and I soon adapted
my workflow to allow for this.
I realized I could ensure these women
met the Wikidata notability criteria
by creating at least
one valid sitelink.
So my workflow started
with me creating a Wikicommons
category page for these women
and then manually adding
this category to her illustrations,
the illustrations that had been
previously uploaded
from the BHL Flickr feed
into Wikicommons by other editors.
Once the category page was created,
I would then create
a Wikidata item for that woman,
including that category in the item.
I'd then begin to collate
all the information
and research we'd found out
about that particular woman.
I would autogenerate
a creator page in Wikicommons
via that Wikidata item.
I'd improve the structured data
of the scientific art in Wikicommons
by adding the creator markup
to each of her images.
And I believe this assists
with the structured data on Commons
as it links the Wikidata item
to the artist
and to the work in Commons.
I'd like to emphasize
this was a manual process.
I wasn't working from an established dataset.
There is no established dataset
for these women that I can find.
I would also use the reference section
of the Wikidata statements,
not just to reference
the statements themselves,
but also with an eye to help collate
all the links we discovered
during our research.
I wanted to leave a research trail,
making it easier for me and others like me
to find these links
and then write either secondary sources
or, if appropriate,
a Wikipedia article on these women.
Obviously, if external
identifiers existed,
I wanted to include them.
Again, to my disappointment,
despite the prestige
of the works they were illustrating,
many of these women
were not listed in external databases.
I would always check VIAF,
the Virtual International
Authority File database.
But, from my personal experience,
there appears to be
a bias against illustrators,
no matter what their gender.
I admit this is anecdotal
because I'm unable to find
any research to support this.
But VIAF would often list
the author of the [inaudible] publication,
but not the illustrator.
And this would even be the case
even if the illustrations made up
a large proportion of the work,
or the woman was thanked profusely
on the dedication page.
I would also check the Stuttgart
Scientific Illustrators database.
This is one of the most
comprehensive databases
for scientific artists.
Sometimes the woman would be in there,
but sometimes not.
Although a fabulous starting point,
this database wasn't
as comprehensive as I needed.
But the wonderful thing about it
was how responsive its creator,
the History Department
of the University of Stuttgart,
was to emails.
Both Michelle and I would write to them,
including our research
on particular women illustrators,
asking for these women to be included.
Again, there is a threshold to this.
I certainly wouldn't write to them
unless I had reasonable
supporting evidence
to justify their inclusion.
But the information they needed
to generate an external identifier
was definitely less than what was needed
to do a Wikipedia article.
Folk in charge of this database
were very grateful for our input,
and once our research
was confirmed by them,
they would add these women
to their database
and then would generate
an external identifier.
They were also able to access resources
that neither Michelle nor I had access to.
Often, more data was added
on these women in the DSI database
as a result of their further research.
A Wikidata property had already
been created for this database,
and so once awarded,
it was an identifier I could then add
to the woman's Wikidata item.
Now, Michelle and I also contacted
the BHL about these women.
This is where our collaborative
relationship with Grace
came to the fore.
Grace would encourage us
to submit a request
that the woman's name be added
to the BHL catalog record.
This is a more convoluted process
than it might appear.
BHL metadata is sourced
from numerous contributing institutions.
Since it was a cataloging change,
the BHL protocol required that the change
be submitted as a change request
to the BHL cataloging group
for review and final approval.
So, again, to obtain
the change to the catalog
and the subsequent external identifier,
it wasn't an easy rubber stamp process.
We had to back up our request
with sources and proof
in order for the catalog to be changed.
However, because we were doing
this relatively frequently,
the catalog group
became used to our requests
and were very appreciative of our efforts.
If the necessary criteria were satisfied,
the institutions were prepared
to edit their metadata,
and in doing so,
create another external identifier,
the BHL creator ID.
At around the same time
we were undertaking this work,
BHL, in its intern program,
was collaborating
with other Wikidata editors.
The BHL resident Katie Mika
was working with Andy Mabbett
trialing adding
BHL creator IDs to Wikidata.
The original test case was 1,000 names
into the Mix-n-Match tool,
but, subsequently,
the whole creator dataset
was uploaded into Mix-n-Match,
allowing the matching
of the BHL dataset to Wikidata items.
This dataset is huge and continues
to be worked on by editors today.
Due to the lack of resources,
unfortunately, BHL can't continue
Katie's work in Wikidata,
but they are very encouraging
of folk reusing their data
and their collections in WikiProjects.
Now, editors have also approved
several BHL Wikidata properties,
not just for the creator ID,
but also the bibliographic ID,
page ID, and item ID.
And, as a result, it's now possible
to link these women illustrators
to their works via Wikidata.
Obtaining a creator ID
and therefore a Wikidata item
can ensure a cascade
of linked open data on them
that can raise the visibility
of these women to researchers.
Slowly, I began to feel
we were making a real difference
in surfacing these women.
At least now when folk googled them
the Wikidata item would appear
and images they had created
would show up in the image feed.
Our research, tags, blogs,
Wikidata items, and external identifiers
brought about by our requests
were all coming together,
making these women
much easier to discover.
Grace had already been using
our tagging work
in the BHL social media feeds
to highlight the illustrations
in the collections.
Member institution librarians
were writing blogs on these women
and raising their visibility
to a variety of audiences.
These edited, well researched
and referenced blogs
were a definite step in the ladder
towards obtaining citable sources
for Wikipedia articles.
But our work really came to the fore
when BHL held their "Her Natural History:
A Celebration of Women
in Natural History" campaign.
This was a multi-institutional,
multi-platform campaign
to raise awareness
and to celebrate the contributions
of women to natural history.
This campaign resulted
in numerous outcomes,
many of which had a direct impact
on the richness of the metadata
available on these women.
So the BHL cataloging group
added more female contributors
to the BHL catalog,
generating more external identifiers.
More images by the women
were added to the Flickr feed,
and these were either
in the public domain or openly licensed
so were able to be uploaded
into Wikicommons.
Numerous blog posts were written
by the employees
of the member institutions.
Some of these blogs used the research
Michelle and I had undertaken
as a starting point,
picking it up and running with it.
These blogs often resulted
in the discovery of new resources
and sources of information
that assisted in pushing
some of the women
over the notability threshold
for a Wikipedia article.
During the campaign, there were also
three Wikimedia workshops:
the Wikimedia District of Columbia
ran a workshop concentrating
on generating and improving
Wikipedia articles on these women;
two additional workshops
were organized by Esther Jackson
and jointly hosted
by the New York Botanical Garden
and the Wikimedia New York City.
The first workshop focused
on adding tags to the BHL Flickr feed
and the second workshop focused
on editing Wikidata and Wikicommons.
These events made use
of research [inaudible]
that Michelle and I had undertaken
in the preceding years.
Worklists were generated
by both the spreadsheets
Michelle and I had created,
as well as from Wikidata items
that I, along with other editors,
had helped create.
And this campaign, I think,
shows how effective Wikidata can be
in assisting with
the interlinking of knowledge.
The Wikidata items became
a leaping-off point,
providing a framework enabling research
to be collated and writing to commence.
Now, this is just one example
of a collaboration
that can improve linked
open data on these women.
Once these women
have a presence on Wikidata,
the item itself can be put to use.
An example of this
is women natural history
specimen collectors.
Many underacknowledged women
contributed to scientific knowledge,
collecting specimens,
and these are held
in museums and herbaria.
As more and more
of these collections are digitized,
more of the collectors
are coming out of the woodwork.
There are now sites being developed
to assist scientists in getting
the recognition they deserve
from their fieldwork and collecting.
The site I've recently been utilizing
is Bloodhound Tracker.
It uses the ORCID ID or the Wikidata item
to link the collector
to their collected specimen
via the Global Biodiversity
Information Facility, or GBIF.
Collection information is a rich vein
of data on early women scientists,
particularly as at that time,
they'd been unable to publish works
or join scientific societies
due to the social norms of the day.
Wikidata can be used
to collect information on these women,
linking the information held on them
from archives, libraries, and museums,
or to the scientific literature,
based on the specimens they've collected,
or the species
that have been named after them.
Once a Wikidata item is created
and sufficient metadata
has been added to it,
the Bloodhound Tracker site
will then automatically ingest details
about those women into its site.
Contributors can help those women
claim their collections,
enriching not just the linked open data,
but ensuring these women
get the credit for their vital work.
But, again, Wikidata notability criteria
can be a challenge.
If the women collected significantly
but didn't contribute
either to the published record
or as an illustrator,
it can be difficult to hurdle
the notability criteria for Wikidata.
However, as more and more libraries,
archives, and museums,
and genealogical databases are gaining
Wikidata external identifiers,
it's becoming easier for these women
to become notable
for the purposes of Wikidata
and then use Wikidata
to link them to their works.
I believe similar workflows
to what I've outlined
can be used for other
underrepresented groups.
By actively working to achieve
the notability criteria for Wikidata,
and then expanding the Wikidata items
to highlight the contributions
of underrepresented people,
it's possible to improve their visibility.
This, in turn, assists with the generation
of secondary sources
and creates a virtuous cycle
of information creation,
sharing, and linking.
By being proactive and collaborative,
it's possible to work towards
eliminating underrepresentation.
Thank you.
(applause)
(woman) Have you found any publication
in which all of the illustrations
actually need their own item?
I think there will be;
there definitely is.
But if I went down that rabbit hole...
I've got to stop somewhere,
and I'm just trying
to concentrate on the women.
But, yes, there are classics
of biodiversity literature
that not only should have
an item for the book itself
but also for each illustration.
I mean, Elizabeth Gould
immediately springs to mind.
Every piece of art that she ever did--
(woman) I would just say Maria Sibylla...
Yep, she's a classic too.
(man) James [Heald].
While you've been working on this,
do you think that the way
the notability criteria
have been being applied has changed?
- Is there a drift in a good direction?
- Yes, I do think it has.
Other than that first item being...
I admit it was partially my mistake.
I did the item, and I didn't have
any external identifiers,
and it seemed, because of the lack
of the information I provided,
I am not surprised it got deleted.
Now I'm more experienced.
But, saying that, I'm pretty sure
I could put the same thing in nowadays
and it wouldn't get deleted.
I actually do think it has improved.
(James [Heald]) Different question.
I've seen on your Twitter sometimes,
you've found women's work
credited to their husbands.
- Oh God, yes!
- Would you say a bit more about that?
Okay, there's a whole problem...
Specifically, what gets me
having to be peeling myself
off the ceiling with rage
is when the women botanists
go out and collect
and they're known
under their married name,
and then they put
their specimens into the herbaria
and the herbaria have a database,
they transcribe the names,
but they don't have
a space in their database
for the vital, important missus.
And so what happens is that always,
if it's pre-1950
and the guy's known for being prolific,
check his wife,
because most of the time
either she's typing
and helping him produce
the scientific papers
or she's out there collecting with him.
Yes, that's a definite problem
that I have been raising
with a lot of the herbaria.
They just keep saying,
"Our database doesn't have
a place for the missus,"
and I say, "Find a place
because it's important."
Yeah.
(man 2) What other domains
will you copy this to?
Because you're now doing it
for a very specific subject.
What comes to mind?
It's a good question.
I think anything
where people get disappeared,
where they're not credited for their work,
it tends to be where they get lost.
So something historic
and the data just isn't linked.
For me, women are the classic example.
But I also think if there's, for example--
one that does spring to mind
is artists in New Zealand,
Maori artists, for example,
who get acknowledged through oral history,
but there are no written works,
and so the scholarship could possibly be
a problem later on down the track.
I think that is a group that's ripe
for using this type of work,
to try and get identifiers for them,
to make them more notable,
to get them into Wikidata,
so that then researchers
are pointed towards them
and can start doing the research
needed to rediscover them.
(woman 2) Okay, so I do
a lot with women artists,
and what I've found,
apart from the married name thing,
is they also tend to stay local,
so they don't move and cross borders.
It turns out notability
is very highly correlated
with the number of borders
you cross in your lifetime.
Right, yeah.
To tell you the truth,
I actually find that a benefit.
It's much easier to disambiguate
someone if they don't shift.
If they've been in one place,
you can then find the database,
like the births or deaths
or marriages database,
and you can work out
on the basis of their address
or you can find them
a lot easier if they don't shift.
It's when they shift, and they change
from maiden name to married name
that it can get really difficult.
(woman 2) Yeah.
(woman 3) Just adding to the question
that was asked earlier
in what field you could use this.
If it's a case where people
are disappearing or are not visible,
meaning that for women, in my opinion,
that would mean like everywhere.
Yeah.
(woman 3) One of the things I work on
is Delftware pottery workshops,
and that was an official job
in the 17th century.
And when the potter died,
there needed to be a new potter
that was inscribed
in the official guild book,
- unless his wife could take over.
- Ah!
(woman 3) And then she could take over
without that diploma,
or whatever you want to call it,
sometimes for years.
And it would be attributed to her husband?
(woman 3) Yes, because the pottery
is always attributed to the owner.
And they're like one line
in the official encyclopedias...
This doesn't surprise me.
...where the women are like taking care
of the business for 10 years
[and say for a job]
of their husband for two years,
but all the pottery items
would be marked--
I think this is a really good example
of how Wikidata can actually be used
to surface these women
and have something
to hang the scholarship off,
so that then, eventually,
the more people who don't struggle
to try and find the base information
can then start the research,
and the in-depth research
that's required to surface these women.
Wikidata, I think, is the easy way
to have a framework,
a skeleton to hang the bare data
that you've got on
to enable that research to happen.
Yeah.
(man 3) I'm sorry we are out of time.
We have the lunch break now, so thank you.
Well, come talk to me
if anyone else has any questions.
(applause)