WEBVTT 00:00:06.209 --> 00:00:07.765 (speaking in Maori) 00:00:08.525 --> 00:00:10.533 As has been explained, I'm Siobhan Leachman. 00:00:10.533 --> 00:00:12.604 I'm a Wikimedian from New Zealand. 00:00:12.604 --> 00:00:14.440 I contribute to Wikidata, 00:00:14.440 --> 00:00:18.113 as well as English Wikipedia and the Wikimedia Commons. 00:00:18.113 --> 00:00:20.400 I'd like to thank the Wikimedia Foundation, 00:00:20.400 --> 00:00:21.669 Wikimedia Deutschland, 00:00:21.669 --> 00:00:22.680 and, in particular, 00:00:22.680 --> 00:00:25.571 the organizing committee of the WikidataCon 00:00:25.571 --> 00:00:29.002 for enabling me to attend this conference and present today. 00:00:31.100 --> 00:00:32.336 Now, in this presentation, 00:00:32.336 --> 00:00:34.422 I want to tell you about the vital role 00:00:34.422 --> 00:00:39.565 I think Wikidata and Wikidata editors can play in surfacing notable women. 00:00:39.565 --> 00:00:41.481 I want to take you through my workflows, 00:00:41.481 --> 00:00:45.981 ensuring that these underacknowledged women and their work 00:00:45.981 --> 00:00:47.895 can be added to Wikidata. 00:00:47.895 --> 00:00:50.916 I want to show how the curation of data on these women 00:00:50.916 --> 00:00:54.981 can assist with the creation of citable secondary sources. 00:00:55.917 --> 00:00:58.067 This, in turn, can encourage and enable 00:00:58.067 --> 00:01:00.381 the creation of Wikipedia articles about these women 00:01:00.381 --> 00:01:02.480 in a variety of languages. 00:01:03.620 --> 00:01:04.804 Now, I'm sure you're aware 00:01:04.804 --> 00:01:09.180 that Wikipedia editors are working hard to write more articles on women. 00:01:09.750 --> 00:01:13.152 Examples of projects focusing on this type of work 00:01:13.152 --> 00:01:19.110 are the Women in Red project or the WikiProject Women Scientists. 
00:01:21.032 --> 00:01:23.103 But one of the main hurdles I've experienced 00:01:23.103 --> 00:01:25.589 when attempting to write about women in Wikipedia 00:01:25.589 --> 00:01:27.843 is the notability criteria. 00:01:27.843 --> 00:01:29.266 When writing articles on women, 00:01:29.266 --> 00:01:32.199 I've found this criteria can be a challenge to achieve. 00:01:32.199 --> 00:01:34.637 I've discovered women are less likely to be written about 00:01:34.637 --> 00:01:36.847 in citable secondary sources, 00:01:36.847 --> 00:01:38.773 and this has particularly been brought home to me 00:01:38.773 --> 00:01:44.836 when I've attempted to write articles about women in the sciences pre-1950. 00:01:44.836 --> 00:01:46.744 However, just like in our Wiki projects, 00:01:46.744 --> 00:01:50.831 there are plenty of researchers and creators of secondary sources 00:01:50.831 --> 00:01:53.551 out in the wider world attempting to change this. 00:01:53.551 --> 00:01:56.306 They just need to be pointed in the direction of these women, 00:01:56.306 --> 00:01:59.159 and I believe Wikidata can be their arrow. 00:02:01.313 --> 00:02:05.481 Now, yes, like Wikipedia, Wikidata has a notability criteria 00:02:05.481 --> 00:02:06.657 that must be met. 00:02:06.657 --> 00:02:09.128 But this criteria is a much lower bar. 00:02:09.128 --> 00:02:13.033 I'm advocating using Wikidata to get a foot in the Wiki door 00:02:13.033 --> 00:02:15.024 for unrepresented groups. 00:02:16.154 --> 00:02:17.658 By adding these women to Wikidata, 00:02:17.658 --> 00:02:19.143 editors can then make it easier 00:02:19.143 --> 00:02:22.815 for the data about them to be collated, curated, and linked. 00:02:22.815 --> 00:02:26.990 In doing so, it would make it easier for researchers and writers, 00:02:26.990 --> 00:02:29.827 the generators of these vital secondary sources, 00:02:29.827 --> 00:02:31.004 to find these women 00:02:31.004 --> 00:02:33.716 and then to use the data to guide their research. 
00:02:33.716 --> 00:02:36.991 Once coverage reaches the Wikipedia notability threshold, 00:02:36.991 --> 00:02:41.863 Wikipedia editors can then create articles on these underrepresented people. 00:02:41.863 --> 00:02:44.184 Now, I want to show you how I put this into practice, 00:02:44.184 --> 00:02:47.372 to take you through how I started on this data journey, 00:02:47.372 --> 00:02:49.426 and to give you examples of the collaborations 00:02:49.426 --> 00:02:52.321 I and others like me have managed to forge, 00:02:52.321 --> 00:02:54.579 enabling this type of work to be done. 00:02:55.719 --> 00:03:00.127 Now, I tend to focus on data about women in the field of natural history-- 00:03:00.127 --> 00:03:03.332 these women scientific illustrators, collectors of specimens 00:03:03.332 --> 00:03:07.454 as well as women scientists, such as botanists and zoologists. 00:03:07.454 --> 00:03:08.907 I became interested in these women 00:03:08.907 --> 00:03:12.380 when I started volunteering for the Smithsonian Transcription Center. 00:03:12.380 --> 00:03:15.468 I helped transcribe natural history specimens 00:03:15.468 --> 00:03:18.656 or scientific handwritten field notebooks, 00:03:18.656 --> 00:03:21.792 and, in doing so, I frequently came across women, 00:03:21.792 --> 00:03:24.651 many of whom had contributed specimens to the Smithsonian 00:03:24.651 --> 00:03:27.118 or had undertaken scientific research. 00:03:27.838 --> 00:03:31.695 At the same time, I was volunteering for the Biodiversity Heritage Library, 00:03:31.695 --> 00:03:33.030 or BHL. 00:03:33.030 --> 00:03:37.854 Now, BHL is the world's largest open-access digital library 00:03:37.854 --> 00:03:42.111 of biodiversity literature and archives. 00:03:42.111 --> 00:03:45.468 Much of the biodiversity literature they host is historic 00:03:45.468 --> 00:03:47.521 and therefore in the public domain. 00:03:47.521 --> 00:03:51.811 They've got an extensive collection of scientific illustrations in Flickr. 
00:03:51.811 --> 00:03:55.834 So I would tag those images with not just taxonomic names 00:03:55.834 --> 00:03:58.389 but as well as illustrated tags. 00:03:58.389 --> 00:04:02.695 That metadata is in turn ingested and stored into BHL. 00:04:02.695 --> 00:04:06.300 The hope is to use those tags to become searchable 00:04:06.300 --> 00:04:09.389 at some point in the future on BHL's website. 00:04:09.389 --> 00:04:14.831 But as an added bonus, many of these tags have been incorporated into Wikicommons 00:04:15.338 --> 00:04:18.874 as a result of those Flickr files being bulk uploaded 00:04:18.874 --> 00:04:21.410 by other Wikicommons editors. 00:04:21.410 --> 00:04:24.146 It was while transcribing with the Smithsonian 00:04:24.146 --> 00:04:27.185 I met and started collaborating with another volunteer, 00:04:27.185 --> 00:04:28.536 Michelle Marshall. 00:04:28.536 --> 00:04:30.324 Both of us were avid taggers 00:04:30.324 --> 00:04:31.692 of BHL images, 00:04:31.692 --> 00:04:32.961 and while doing this work, 00:04:32.961 --> 00:04:37.043 both Michelle and I were enthusiastically kept encouraged 00:04:37.043 --> 00:04:41.786 by Grace Constantino, the BHL Outreach and Communications Manager. 00:04:42.876 --> 00:04:45.664 And while tagging, we would again come across women, 00:04:45.914 --> 00:04:48.119 so many women, amazing women, 00:04:48.119 --> 00:04:51.507 about whom there appeared little known or written. 00:04:51.507 --> 00:04:54.696 Some of these women would be illustrating multiple articles, 00:04:54.696 --> 00:04:56.991 books, and scientific publications. 00:04:56.991 --> 00:04:59.508 Others would be writing the books or articles, 00:04:59.508 --> 00:05:04.000 amassing collections of specimens, or having species named after them. 00:05:04.000 --> 00:05:09.029 Both Michelle and I were really keen on making known more about these women, 00:05:09.029 --> 00:05:11.674 but there was very little about them on the internet. 
00:05:11.674 --> 00:05:12.808 Every once in a while, 00:05:12.808 --> 00:05:15.919 there would be a woman who had significant coverage, 00:05:15.919 --> 00:05:19.242 enough so there was a Wikipedia article created about them, 00:05:19.242 --> 00:05:21.994 but this was an exception rather than the rule. 00:05:21.994 --> 00:05:25.000 This lack of coverage was frustrating to both of us, 00:05:25.000 --> 00:05:28.795 and, as a result, I became keen on learning how to edit Wikipedia. 00:05:29.925 --> 00:05:34.711 Both the folk in the Smithsonian and BHL were extremely encouraging. 00:05:34.711 --> 00:05:36.797 They too were keen on addressing this issue 00:05:36.797 --> 00:05:38.384 of underrepresented women 00:05:38.384 --> 00:05:41.231 and wanted to highlight notable women in their collections 00:05:41.231 --> 00:05:43.759 via various WikiProjects. 00:05:43.759 --> 00:05:45.863 So both Michelle and I started researching, 00:05:45.863 --> 00:05:48.267 me with the aim of writing Wikipedia articles, 00:05:48.267 --> 00:05:52.785 her with the aim of writing blog posts and enriching the BHL Instagram account. 00:05:54.044 --> 00:05:57.665 Now, on the rare occasion we managed to find enough sources and references 00:05:57.665 --> 00:06:02.874 to get these women over the English Wikipedia notability criteria, 00:06:02.874 --> 00:06:04.576 I'd actually write an article. 00:06:04.576 --> 00:06:07.949 But as I've explained, this tended to be the exception rather than the rule. 00:06:07.949 --> 00:06:10.903 Historically, much of these women's illustration work 00:06:10.903 --> 00:06:13.191 was not regarded at the time of their creation 00:06:13.191 --> 00:06:15.077 as being worthy of comment. 00:06:15.077 --> 00:06:19.184 At most, they received a passing remark in the reviews of the publication 00:06:19.184 --> 00:06:22.640 or perhaps an acknowledgment by the author of the work. 
00:06:22.640 --> 00:06:26.679 This lack results in them being overlooked by library catalogs, 00:06:26.679 --> 00:06:30.231 and they and their contributions were simply not recorded. 00:06:30.231 --> 00:06:32.741 They created scientific illustrations 00:06:32.741 --> 00:06:35.695 so didn't tend to exhibit in art galleries. 00:06:35.695 --> 00:06:38.950 The art was created to enhance the scientific publication 00:06:38.950 --> 00:06:41.303 and wasn't treated as a stand-alone work, 00:06:41.303 --> 00:06:44.024 worthy of critique and public display. 00:06:44.024 --> 00:06:46.558 It was, therefore, very rare to find enough sources 00:06:46.558 --> 00:06:50.738 to get these women artists over the notability hurdle. 00:06:50.738 --> 00:06:52.252 But we tried. 00:06:52.822 --> 00:06:55.476 Working together, Michelle and I began researching these women 00:06:55.476 --> 00:06:58.548 and gathering our information into a Google spreadsheet. 00:06:58.548 --> 00:07:02.140 Often, we'd track down enough data to work out who they were, 00:07:02.140 --> 00:07:05.142 the works they contributed to, and who they worked for. 00:07:05.142 --> 00:07:07.730 BHL recently enabled a full text search, 00:07:07.730 --> 00:07:11.620 which has significantly improved our ability to find information on them. 00:07:11.620 --> 00:07:15.358 We'd search for and, if we were lucky, find external identifiers, 00:07:15.358 --> 00:07:17.413 such as the BHL creator ID 00:07:17.413 --> 00:07:20.300 or the Stuttgart Scientific Illustrators Database ID, 00:07:20.300 --> 00:07:23.304 or if we were really lucky, a VIAF ID. 00:07:23.304 --> 00:07:24.874 However, there was no guarantee 00:07:24.874 --> 00:07:27.889 an external database identifier would exist. 
00:07:28.879 --> 00:07:30.468 So we'd tag their plates in Flickr, 00:07:30.468 --> 00:07:33.087 collate our research on these women in our spreadsheets, 00:07:33.087 --> 00:07:35.074 and then wait for more books and articles 00:07:35.074 --> 00:07:38.112 and institution blogs and research to be generated. 00:07:38.112 --> 00:07:41.167 For me, getting them into Wikipedia was the gold standard, 00:07:41.897 --> 00:07:44.940 but I could stretch the notability criteria only so far. 00:07:44.940 --> 00:07:47.377 My first Wikipedia article on a woman botanist 00:07:47.377 --> 00:07:49.180 was nominated for deletion, 00:07:49.180 --> 00:07:52.553 and ever since that experience, I've been extremely careful 00:07:52.553 --> 00:07:57.193 about ensuring I did everything possible to meet the notability criteria. 00:07:57.193 --> 00:07:58.929 But I was actively looking for ways 00:07:58.929 --> 00:08:02.401 to make our work more impactful and effective. 00:08:02.401 --> 00:08:04.872 Now, at this point, I know what you're thinking, 00:08:04.872 --> 00:08:06.659 what about Wikidata? 00:08:06.659 --> 00:08:08.578 And I completely agree. 00:08:08.578 --> 00:08:12.818 As soon as I discovered Wikidata, I took the leap and started editing. 00:08:12.818 --> 00:08:14.605 But, again, unfortunately, 00:08:14.605 --> 00:08:18.078 I came up against the Wikidata notability criteria. 00:08:18.078 --> 00:08:20.365 Early on, I had an item deleted 00:08:20.365 --> 00:08:24.508 due to my failure to meet even the Wikidata notability criteria. 00:08:24.938 --> 00:08:28.060 I was having to meet even that low bar. 00:08:28.570 --> 00:08:31.299 But this was all part of my learning by mistakes, 00:08:31.299 --> 00:08:34.476 and I soon adapted my workflow to allow for this. 00:08:36.156 --> 00:08:40.113 I realized I could ensure these women met the Wikidata notability criteria 00:08:40.113 --> 00:08:43.586 by creating at least one valid sitelink. 
00:08:43.586 --> 00:08:45.377 So my workflow started 00:08:45.377 --> 00:08:48.761 with me creating a Wikicommons category page for these women 00:08:48.761 --> 00:08:52.666 and then manually adding this category to her illustrations, 00:08:52.666 --> 00:08:55.388 the illustrations that had been previously uploaded 00:08:55.388 --> 00:08:57.224 from the BHL Flickr feed 00:08:57.224 --> 00:09:00.239 into Wikicommons by other editors. 00:09:00.239 --> 00:09:02.269 Once the category page was created, 00:09:02.269 --> 00:09:04.619 I would then create a Wikidata item for that woman, 00:09:04.619 --> 00:09:07.107 including that category in the item. 00:09:07.946 --> 00:09:09.093 I'd then begin to collate 00:09:09.093 --> 00:09:11.673 all the information and research we'd found out 00:09:11.673 --> 00:09:13.467 about that particular woman. 00:09:13.917 --> 00:09:17.782 I would autogenerate a creator page in Wikicommons 00:09:17.782 --> 00:09:19.727 via that Wikidata item. 00:09:19.727 --> 00:09:24.238 I'd improve the structured data of the scientific art in Wikicommons 00:09:24.238 --> 00:09:27.656 by adding the creator markup to each of her images. 00:09:27.656 --> 00:09:30.828 And I believe this assists with the structured data on Commons 00:09:30.828 --> 00:09:34.241 as it links the Wikidata item to the artist 00:09:34.241 --> 00:09:37.006 and to the work in Commons. 00:09:37.006 --> 00:09:39.443 I'd like to emphasize this was a manual process. 00:09:39.443 --> 00:09:42.070 I wasn't working from an established dataset. 00:09:42.070 --> 00:09:45.822 There is no established dataset for these women that I can find. 00:09:48.509 --> 00:09:51.830 I would also use the reference section of the Wikidata statements, 00:09:51.830 --> 00:09:54.266 not just to reference the statements themselves, 00:09:54.266 --> 00:09:56.153 but also with an eye to help collate 00:09:56.153 --> 00:09:58.909 all the links we discovered during our research. 
00:09:58.909 --> 00:10:00.877 I wanted to leave a research trail, 00:10:00.877 --> 00:10:03.382 making it easier for me and others like me 00:10:03.382 --> 00:10:04.730 to find these links 00:10:04.730 --> 00:10:06.871 and then write either secondary sources 00:10:06.871 --> 00:10:10.054 or, if appropriate, a Wikipedia article on these women. 00:10:10.944 --> 00:10:13.114 Obviously, if external identifiers existed, 00:10:13.114 --> 00:10:14.634 I wanted to include them. 00:10:14.634 --> 00:10:16.069 Again, to my disappointment, 00:10:16.069 --> 00:10:19.415 despite the prestige of the works they were illustrating, 00:10:19.415 --> 00:10:23.731 many of these women were not listed in external databases. 00:10:23.731 --> 00:10:25.480 I would always check VIAF, 00:10:25.480 --> 00:10:28.244 the Virtual International Authority File database. 00:10:28.244 --> 00:10:29.775 But, from my personal experience, 00:10:29.775 --> 00:10:32.647 there appears to be a bias against illustrators, 00:10:32.647 --> 00:10:34.482 no matter what their gender. 00:10:34.482 --> 00:10:36.068 I admit this is anecdotal 00:10:36.068 --> 00:10:39.875 because I'm unable to find any research to support this. 00:10:39.875 --> 00:10:44.131 But VIAF would often list the author of the [inaudible] publication, 00:10:44.131 --> 00:10:46.068 but not the illustrator. 00:10:46.068 --> 00:10:47.554 And this would even be the case 00:10:47.554 --> 00:10:51.394 even if the illustrations made up a large proportion of the work, 00:10:51.394 --> 00:10:55.409 or the woman was thanked profusely on the dedication page. 00:10:57.553 --> 00:11:00.908 I would also check the Stuttgart Scientific Illustrators database. 00:11:00.908 --> 00:11:02.862 This is one of the most comprehensive databases 00:11:02.862 --> 00:11:04.732 for scientific artists. 00:11:04.732 --> 00:11:06.835 Sometimes the woman would be in there, 00:11:06.835 --> 00:11:08.554 but sometimes not. 
00:11:08.554 --> 00:11:10.357 Although a fabulous starting point, 00:11:10.357 --> 00:11:13.863 this database wasn't as comprehensive as I needed. 00:11:13.863 --> 00:11:17.903 But the wonderful thing about it was how responsive its creator, 00:11:17.903 --> 00:11:20.482 the History Department of the University of Stuttgart, 00:11:20.482 --> 00:11:22.427 was to emails. 00:11:22.427 --> 00:11:24.463 Both Michelle and I would write to them, 00:11:24.463 --> 00:11:28.980 including our research on particular women illustrators, 00:11:28.980 --> 00:11:31.441 asking for these women to be included. 00:11:31.441 --> 00:11:33.596 Again, there is a threshold to this. 00:11:33.596 --> 00:11:34.964 I certainly wouldn't write to them 00:11:34.964 --> 00:11:37.302 unless I had reasonable supporting evidence 00:11:37.302 --> 00:11:39.481 to justify their inclusion. 00:11:39.481 --> 00:11:42.962 But the information they needed to generate an external identifier 00:11:42.962 --> 00:11:47.535 was definitely less than what was needed to do a Wikipedia article. 00:11:47.535 --> 00:11:50.907 Folk in charge of this database were very grateful for our input, 00:11:50.907 --> 00:11:53.894 and once our research was confirmed by them, 00:11:53.894 --> 00:11:56.380 they would add these women to their database 00:11:56.950 --> 00:11:59.481 and then would generate an external identifier. 00:11:59.481 --> 00:12:04.796 They were also able to access resources that neither Michelle nor I had access to. 00:12:04.796 --> 00:12:09.170 Often, more data was added on these women in the DSI database 00:12:09.170 --> 00:12:11.743 as a result of their further research. 00:12:11.743 --> 00:12:15.047 A Wikidata property had already been created for this database, 00:12:15.047 --> 00:12:16.448 and so once awarded, 00:12:16.448 --> 00:12:20.534 it was an identifier I could then add to the woman's Wikidata item. 00:12:22.676 --> 00:12:27.406 Now, Michelle and I also contacted the BHL about these women. 
00:12:27.406 --> 00:12:30.221 This is where our collaborative relationship with Grace 00:12:30.221 --> 00:12:31.790 came to the fore. 00:12:31.790 --> 00:12:34.872 Grace would encourage us to submit a request 00:12:34.872 --> 00:12:38.480 that the woman's name be added to the BHL catalog record. 00:12:38.480 --> 00:12:42.040 This is a more convoluted process than it might appear. 00:12:42.040 --> 00:12:46.831 BHL metadata is sourced from numerous contributing institutions. 00:12:47.431 --> 00:12:49.602 Since it was a cataloging change, 00:12:49.602 --> 00:12:54.027 the BHL protocol required that the change be submitted as a change request 00:12:54.027 --> 00:12:58.144 to the BHL cataloging group for review and final approval. 00:12:58.634 --> 00:13:01.740 So, again, to obtain the change to the catalog 00:13:01.740 --> 00:13:04.059 and the subsequent external identifier, 00:13:04.059 --> 00:13:06.914 it wasn't an easy rubber stamp process. 00:13:06.914 --> 00:13:10.371 We had to back up our request with sources and proof 00:13:10.371 --> 00:13:12.506 in order for the catalog to be changed. 00:13:12.506 --> 00:13:15.394 However, because we were doing this relatively frequently, 00:13:15.394 --> 00:13:18.432 the catalog group became used to our requests 00:13:18.432 --> 00:13:20.896 and were very appreciative of our efforts. 00:13:21.506 --> 00:13:24.105 If the necessary criteria was satisfied, 00:13:24.105 --> 00:13:26.564 the institutions were prepared to edit their metadata, 00:13:26.564 --> 00:13:29.634 and in doing so, create another external identifier, 00:13:29.634 --> 00:13:31.905 the BHL creator ID. 00:13:31.905 --> 00:13:34.710 At around the same time we were undertaking this work, 00:13:34.710 --> 00:13:37.517 BHL, in its intern program, 00:13:37.517 --> 00:13:40.319 was collaborating with other Wikidata editors. 
00:13:40.319 --> 00:13:45.060 The BHL resident [Katie Mika] was working with Andy [Mabbett] 00:13:45.060 --> 00:13:48.431 trialing adding BHL creator IDs to Wikidata. 00:13:48.431 --> 00:13:54.023 The original test case was 1,000 names into the Mix-n-Match tool. 00:13:54.023 --> 00:13:55.219 But, subsequently, 00:13:55.219 --> 00:13:58.731 the whole creator dataset was uploaded into Mix-n-Match, 00:13:58.731 --> 00:14:02.556 allowing the matching of the BHL dataset to Wikidata items. 00:14:02.556 --> 00:14:07.017 This dataset is huge and continues to be worked on by editors today. 00:14:07.697 --> 00:14:09.098 Due to the lack of resources, 00:14:09.098 --> 00:14:14.157 unfortunately, BHL can't continue Katie's work in Wikidata, 00:14:14.157 --> 00:14:17.713 but they are very encouraging of folk reusing their data 00:14:17.713 --> 00:14:20.293 and their collections in WikiProjects. 00:14:20.953 --> 00:14:26.143 Now, editors have also approved several BHL Wikidata properties, 00:14:26.143 --> 00:14:28.096 not just for the creator ID, 00:14:28.096 --> 00:14:32.453 but also the bibliographic ID, page ID, and item ID. 00:14:32.453 --> 00:14:35.541 And, as a result, it's now possible to link these women illustrators 00:14:35.541 --> 00:14:37.962 to their works via Wikidata. 00:14:37.962 --> 00:14:41.668 Obtaining a creator ID and therefore a Wikidata item 00:14:41.668 --> 00:14:45.234 can ensure a cascade of linked open data on them 00:14:45.234 --> 00:14:48.913 that can raise the visibility of these women to researchers. 00:14:48.913 --> 00:14:51.417 Slowly, I began to feel we were making a real difference 00:14:51.417 --> 00:14:52.870 in surfacing these women. 00:14:52.870 --> 00:14:54.980 At least now when folk googled them 00:14:54.980 --> 00:14:56.910 the Wikidata item would appear 00:14:56.910 --> 00:15:00.632 and images they had created would show up in the image feed. 
00:15:00.632 --> 00:15:05.189 Our research, tags, blogs, Wikidata items, and external identifiers 00:15:05.189 --> 00:15:06.659 brought about by our requests 00:15:06.659 --> 00:15:07.983 were all coming together, 00:15:07.983 --> 00:15:10.947 making these women much easier to discover. 00:15:11.667 --> 00:15:14.365 Grace had already been using our tagging work 00:15:14.365 --> 00:15:16.409 in the BHL social media feeds 00:15:16.409 --> 00:15:20.114 to highlight the illustrations in the collections. 00:15:20.114 --> 00:15:23.536 Member institution librarians were writing blogs on these women 00:15:23.536 --> 00:15:27.393 and raising their visibility to a variety of audiences. 00:15:27.393 --> 00:15:30.948 These edited, well-researched and referenced blogs 00:15:30.948 --> 00:15:32.635 were a definite step in the ladder 00:15:32.635 --> 00:15:36.853 towards obtaining citable sources for Wikipedia articles. 00:15:37.693 --> 00:15:39.395 But our work really came to the fore 00:15:39.395 --> 00:15:42.230 when BHL held their "Her Natural History: 00:15:42.230 --> 00:15:45.906 A Celebration of Women in Natural History" campaign. 00:15:45.906 --> 00:15:49.062 This was a multi-institutional, multi-platform campaign 00:15:49.062 --> 00:15:50.252 to raise awareness 00:15:50.252 --> 00:15:53.969 and to celebrate the contributions of women to natural history. 00:15:53.969 --> 00:15:56.245 This campaign resulted in numerous outcomes, 00:15:56.245 --> 00:15:58.226 many of which had a direct impact 00:15:58.226 --> 00:16:01.378 on the richness of the metadata available on these women. 00:16:02.048 --> 00:16:03.618 So the BHL cataloging group 00:16:03.618 --> 00:16:06.510 added more female contributors to the BHL catalog, 00:16:06.510 --> 00:16:09.254 generating more external identifiers. 
00:16:09.254 --> 00:16:11.981 More images by the women were added to the Flickr feed, 00:16:11.981 --> 00:16:14.954 and these were either in the public domain or openly licensed 00:16:14.954 --> 00:16:17.507 so were able to be uploaded into Wikicommons. 00:16:17.507 --> 00:16:18.995 Numerous blog posts were written 00:16:18.995 --> 00:16:21.314 by the employees of the member institutions. 00:16:21.314 --> 00:16:25.070 Some of these blogs used the research Michelle and I had undertaken 00:16:25.070 --> 00:16:26.071 as a starting point, 00:16:26.071 --> 00:16:28.262 picking it up and running with it. 00:16:28.262 --> 00:16:29.497 These blogs often resulted 00:16:29.497 --> 00:16:32.666 in the discovery of new resources and sources of information 00:16:32.666 --> 00:16:34.352 that assisted in pushing some of the women 00:16:34.352 --> 00:16:37.589 over the notability threshold for a Wikipedia article. 00:16:37.589 --> 00:16:40.862 During the campaign, there were also three Wikimedia workshops: 00:16:40.862 --> 00:16:42.598 the Wikimedia District of Columbia 00:16:42.598 --> 00:16:44.953 ran a workshop concentrating on generating and improving 00:16:44.953 --> 00:16:47.106 Wikipedia articles on these women; 00:16:47.106 --> 00:16:50.579 two additional workshops were organized by Esther Jackson 00:16:50.579 --> 00:16:53.422 and jointly hosted by the New York Botanical Garden 00:16:53.422 --> 00:16:56.005 and the Wikimedia New York City. 00:16:56.005 --> 00:16:59.860 The first workshop focused on editing tags to the BHL Flickr feed 00:16:59.860 --> 00:17:04.217 and the second workshop focused on editing Wikidata and Wikicommons. 00:17:04.217 --> 00:17:06.555 These events made use of research [inaudible] 00:17:06.555 --> 00:17:09.559 that Michelle and I had undertaken in the preceding years. 
00:17:09.559 --> 00:17:10.747 Worklists were generated 00:17:10.747 --> 00:17:13.415 by both the spreadsheets Michelle and I had created, 00:17:13.415 --> 00:17:16.991 as well as from Wikidata items that I, along with other editors, 00:17:16.991 --> 00:17:18.557 had helped create. 00:17:18.557 --> 00:17:22.136 And this campaign, I think, shows how effective Wikidata can be 00:17:22.136 --> 00:17:25.084 in assisting with the interlinking of knowledge. 00:17:25.084 --> 00:17:27.605 The Wikidata items became a leaping-off point, 00:17:27.605 --> 00:17:30.136 providing a framework enabling research 00:17:30.136 --> 00:17:33.698 to be collated and writing to commence. 00:17:35.918 --> 00:17:37.921 Now, this is just one example of a collaboration 00:17:37.921 --> 00:17:40.852 that can improve linked open data on these women. 00:17:40.852 --> 00:17:43.197 Once these women have a presence on Wikidata, 00:17:43.197 --> 00:17:45.504 the item itself can be put to use. 00:17:45.504 --> 00:17:46.613 An example of this 00:17:46.613 --> 00:17:49.107 is women natural history specimen collectors. 00:17:49.107 --> 00:17:52.228 Many underacknowledged women contributed to scientific knowledge, 00:17:52.228 --> 00:17:54.361 collecting specimens, 00:17:54.361 --> 00:17:57.188 and these are held in museums and herbaria. 00:17:57.188 --> 00:17:59.509 As more and more of these collections are digitized, 00:17:59.509 --> 00:18:02.363 more of the collectors are coming out of the woodwork. 00:18:02.363 --> 00:18:03.781 There are now sites being developed 00:18:03.781 --> 00:18:07.035 to assist scientists in getting the recognition they deserve 00:18:07.035 --> 00:18:09.441 from their fieldwork and collecting. 00:18:09.441 --> 00:18:12.811 The site I've recently been utilizing is Bloodhound Tracker. 
00:18:12.811 --> 00:18:16.033 It uses the ORCID ID or the Wikidata item 00:18:16.033 --> 00:18:19.062 to link the collector to their collected specimen 00:18:19.062 --> 00:18:23.663 via the Global Biodiversity Information Facility, or GBIF. 00:18:23.663 --> 00:18:29.071 Collection information is a rich vein of data on early woman scientists, 00:18:29.071 --> 00:18:32.878 particularly as at that time, they'd been unable to publish works 00:18:32.878 --> 00:18:34.698 or join scientific societies 00:18:34.698 --> 00:18:36.992 due to the social norms of the day. 00:18:36.992 --> 00:18:39.790 Wikidata can be used to collect information on these women, 00:18:39.790 --> 00:18:46.350 linking the information held on them from archives, libraries, and museums, 00:18:46.350 --> 00:18:49.672 or to the scientific literature, based on the specimens they've collected, 00:18:49.672 --> 00:18:52.226 or the species that have been named after them. 00:18:52.226 --> 00:18:54.166 Once a Wikidata item is created 00:18:54.166 --> 00:18:56.466 and sufficient metadata has been added to it, 00:18:56.466 --> 00:18:57.741 the Bloodhound Tracker site 00:18:57.741 --> 00:19:01.774 will then automatically ingest details about those women into its site. 00:19:01.774 --> 00:19:04.830 Contributors can help those women claim their collections, 00:19:04.830 --> 00:19:07.217 enriching not just the linked open data, 00:19:07.217 --> 00:19:10.510 but ensuring these women get the credit for their vital work. 00:19:10.510 --> 00:19:14.229 But, again, Wikidata notability criteria can be a challenge. 00:19:14.229 --> 00:19:16.165 If the women collected significantly 00:19:16.165 --> 00:19:17.189 but didn't contribute 00:19:17.189 --> 00:19:19.730 either to the published record or as an illustrator, 00:19:19.730 --> 00:19:23.677 it can be difficult to hurdle the notability criteria for Wikidata. 
00:19:23.677 --> 00:19:26.465 However, as more and more libraries, archives, and museums, 00:19:26.465 --> 00:19:33.026 and genealogical databases are gaining Wikidata external identifiers, 00:19:33.026 --> 00:19:34.822 it's becoming easier for these women 00:19:34.822 --> 00:19:37.233 to become notable for the purposes of Wikidata 00:19:37.233 --> 00:19:40.671 and then use Wikidata to link them to their works. 00:19:40.671 --> 00:19:43.210 I believe similar workflows to what I've outlined 00:19:43.210 --> 00:19:46.081 can be used for other underrepresented groups. 00:19:46.081 --> 00:19:49.992 By actively working to achieve the notability criteria for Wikidata, 00:19:49.992 --> 00:19:52.141 and then expanding the Wikidata items 00:19:52.141 --> 00:19:55.225 to highlight the contributions of underrepresented people, 00:19:55.905 --> 00:19:58.231 it's possible to improve their visibility. 00:19:58.231 --> 00:20:01.422 This, in turn, assists with the generation of secondary sources 00:20:01.422 --> 00:20:03.675 and creates a virtuous cycle 00:20:03.675 --> 00:20:06.548 of information creation, sharing, and linking. 00:20:06.548 --> 00:20:08.568 By being proactive and collaborative, 00:20:08.568 --> 00:20:12.307 it's possible to work towards eliminating underrepresentation. 00:20:12.857 --> 00:20:14.076 Thank you. 00:20:14.076 --> 00:20:16.197 (applause) 00:20:28.797 --> 00:20:31.304 (woman) Have you found any publication 00:20:31.304 --> 00:20:38.065 in which all of the illustrations actually need their own item? 00:20:39.242 --> 00:20:41.837 I think there will be; there definitely is. 00:20:41.837 --> 00:20:44.926 But if I went down that rabbit hole... 00:20:46.986 --> 00:20:48.432 I've got to stop somewhere, 00:20:48.432 --> 00:20:50.419 and I'm just trying to concentrate on the women. 
00:20:50.419 --> 00:20:56.613 But, yes, there are classics of biodiversity literature 00:20:56.613 --> 00:21:01.904 that not only should have an item for the book itself 00:21:01.904 --> 00:21:03.823 but also for each illustration. 00:21:03.823 --> 00:21:06.461 I mean, Elizabeth Gould immediately springs to mind. 00:21:06.461 --> 00:21:08.113 Every piece of art that she ever did-- 00:21:08.113 --> 00:21:10.033 (woman) I would just say Maria Sibylla... 00:21:10.033 --> 00:21:12.482 Yep, she's a classic too. 00:21:18.030 --> 00:21:20.200 (man) James [Heald]. While you've been working on this, 00:21:20.200 --> 00:21:22.949 do you think that the way the notability criteria 00:21:22.949 --> 00:21:25.229 have been being applied has changed? 00:21:25.229 --> 00:21:27.947 - Is there a drift in a good direction? - Yes, I do think it has. 00:21:29.477 --> 00:21:32.212 Other than that first item being... 00:21:32.982 --> 00:21:35.004 I admit it was partially my mistake. 00:21:35.004 --> 00:21:38.313 I did the item, and I didn't have any external identifiers, 00:21:38.313 --> 00:21:43.270 and it seemed, because of the lack of the information I provided, 00:21:43.270 --> 00:21:45.223 I am not surprised it got deleted. 00:21:45.223 --> 00:21:46.993 Now I'm more experienced. 00:21:47.333 --> 00:21:50.331 But, saying that, I'm pretty sure I could put the same thing in nowadays 00:21:50.331 --> 00:21:51.774 and it wouldn't get deleted. 00:21:51.774 --> 00:21:53.453 I actually do think it has improved. 00:22:01.708 --> 00:22:03.187 (James [Heald]) Different question. 00:22:03.187 --> 00:22:05.223 I've seen on your Twitter sometimes, 00:22:05.223 --> 00:22:08.897 you've found women's work credited to their husbands. 00:22:08.897 --> 00:22:10.992 - Oh God, yes! - Would you say a bit more about that? 00:22:11.952 --> 00:22:14.258 Okay, there's a whole problem... 
00:22:16.328 --> 00:22:18.323 Specifically, what gets me 00:22:18.323 --> 00:22:20.683 having to be peeling myself off the ceiling with rage 00:22:20.683 --> 00:22:23.904 is when the women botanists go out and collect 00:22:23.904 --> 00:22:28.195 and they're known under their marriage name, 00:22:28.195 --> 00:22:32.741 and then they put their specimens into the herbaria 00:22:32.741 --> 00:22:34.505 and the herbaria have a database, 00:22:34.505 --> 00:22:35.989 they transcribe the names, 00:22:35.989 --> 00:22:38.987 but they don't have a space in their database 00:22:38.987 --> 00:22:41.503 for the vital, important missus. 00:22:41.503 --> 00:22:45.940 And so what happens is that always, 00:22:45.940 --> 00:22:47.342 if it's pre-1950 00:22:47.342 --> 00:22:49.078 and the guy's known for being prolific, 00:22:49.078 --> 00:22:50.347 check his wife, 00:22:50.347 --> 00:22:53.669 because most of the time either she's typing 00:22:53.669 --> 00:22:56.440 and helping him produce the scientific papers 00:22:56.440 --> 00:22:58.827 or she's out there collecting with him. 00:22:58.827 --> 00:23:02.183 Yes, that's a definite problem that I have been raising 00:23:02.183 --> 00:23:03.384 with a lot of the herbaria. 00:23:03.384 --> 00:23:04.386 They just keep saying, 00:23:04.386 --> 00:23:07.274 "Our database doesn't have a place for the missus," 00:23:07.274 --> 00:23:10.250 and I say, "Find a place because it's important." 00:23:10.793 --> 00:23:11.957 Yeah. 00:23:18.347 --> 00:23:20.679 (man 2) What other domains will you copy this to? 00:23:20.679 --> 00:23:23.902 Because you're now doing it for a very specific subject. 00:23:23.902 --> 00:23:25.254 What comes to mind? 00:23:29.108 --> 00:23:30.321 It's a good question. 00:23:33.771 --> 00:23:36.856 I think anything where people get disappeared, 00:23:36.856 --> 00:23:39.761 where they're not credited for their work, 00:23:41.261 --> 00:23:42.882 it tends to be where they get lost. 
00:23:42.882 --> 00:23:46.872 So something historic and the data just isn't linked. 00:23:46.872 --> 00:23:50.111 For me, women are the classic example. 00:23:50.111 --> 00:23:53.098 But I also think if there's, for example-- 00:23:54.598 --> 00:23:59.542 one that does spring to mind is artists in New Zealand, 00:23:59.542 --> 00:24:05.231 Maori artists, for example, who get acknowledged through oral history, 00:24:05.231 --> 00:24:07.623 but there are no written works, 00:24:07.623 --> 00:24:13.181 and so the scholarship could possibly be a problem later on down the track. 00:24:13.971 --> 00:24:18.624 I think that was a group that's ripe for using this type of work, 00:24:18.624 --> 00:24:20.393 to try and get identifiers for them, 00:24:20.393 --> 00:24:23.064 to make them more notable, to get them into Wikidata, 00:24:23.064 --> 00:24:26.270 so that then researchers are pointed towards them 00:24:26.270 --> 00:24:29.932 and can start doing the research needed to rediscover them. 00:24:36.853 --> 00:24:39.744 (woman 2) Okay, so I do a lot with women artists, 00:24:39.744 --> 00:24:44.489 and what I've found, apart from the married name thing, 00:24:44.489 --> 00:24:48.259 is they also tend to stay local, 00:24:48.259 --> 00:24:51.294 so they don't move and cross borders. 00:24:51.294 --> 00:24:54.266 It turns out notability is very highly correlated 00:24:54.266 --> 00:24:56.670 with the number of borders you cross in your lifetime. 00:24:56.670 --> 00:24:58.377 Right, yeah. 00:24:59.717 --> 00:25:02.395 To tell you the truth, I actually find that a benefit. 00:25:02.395 --> 00:25:05.668 It's much easier to disambiguate someone if they don't shift. 
00:25:05.668 --> 00:25:07.870 If they've been in one place, 00:25:07.870 --> 00:25:10.234 you can then find the database, 00:25:10.234 --> 00:25:13.399 like the births or deaths or marriages database, 00:25:13.399 --> 00:25:18.188 and you can work out on the basis of their address 00:25:18.188 --> 00:25:22.194 or you can find them a lot easier if they don't shift. 00:25:22.194 --> 00:25:26.571 It's when they shift, and they change from maiden name to married name 00:25:26.571 --> 00:25:29.040 that it can get really difficult. 00:25:29.040 --> 00:25:30.249 (woman 2) Yeah. 00:25:35.090 --> 00:25:36.868 (woman 3) Just adding to the question 00:25:36.868 --> 00:25:40.331 that was asked earlier in what field you could use this. 00:25:40.981 --> 00:25:46.850 If it's a case where people are disappearing or are not visible, 00:25:46.850 --> 00:25:49.231 meaning that for women, in my opinion, 00:25:49.231 --> 00:25:51.107 that would mean like everywhere. 00:25:51.107 --> 00:25:52.184 Yeah. 00:25:54.194 --> 00:25:57.902 (woman 3) One of the things I work on is Delftware pottery workshops, 00:25:57.902 --> 00:26:02.193 and that was an official job in the 17th century. 00:26:02.193 --> 00:26:08.152 And when the potter died, there needed to be a new potter 00:26:08.152 --> 00:26:15.080 that was inscribed in the official guild book, 00:26:15.080 --> 00:26:17.497 - unless his wife could take over. - Ah! 00:26:17.497 --> 00:26:20.394 (woman 3) And then she could take over without that diploma, 00:26:20.394 --> 00:26:21.808 or whatever you want to call it, 00:26:21.808 --> 00:26:23.264 sometimes for years. 00:26:24.094 --> 00:26:26.582 And it would be attributed to her husband? 00:26:26.582 --> 00:26:31.925 (woman 3) Yes, because the pottery is always attributed to the owner. 00:26:34.045 --> 00:26:37.300 And they're like one line in the official encyclopedias... 00:26:37.300 --> 00:26:38.451 This doesn't surprise me. 
00:26:38.451 --> 00:26:41.557 ...where the women are like taking care of the business for 10 years 00:26:41.557 --> 00:26:44.394 [and say for a job] of their husband for two years, 00:26:44.394 --> 00:26:46.865 but all the pottery items would be marked-- 00:26:46.865 --> 00:26:51.105 I think this is a really good example of how Wikidata can actually be used 00:26:51.105 --> 00:26:52.775 to surface these women 00:26:52.775 --> 00:26:55.981 and have something to hang the scholarship off, 00:26:55.981 --> 00:26:58.073 so that then, eventually, 00:26:58.073 --> 00:27:03.659 the more people who don't struggle to try and find the base information 00:27:03.659 --> 00:27:06.752 can then start the research, and the in-depth research 00:27:06.752 --> 00:27:08.901 that's required to surface these women. 00:27:08.901 --> 00:27:12.290 Wikidata, I think, is the easy way to have a framework, 00:27:13.050 --> 00:27:16.816 a skeleton to hang the bare data that you've got on 00:27:16.816 --> 00:27:19.818 to enable that research to happen. 00:27:19.818 --> 00:27:21.087 Yeah. 00:27:22.173 --> 00:27:24.665 (man 3) I'm sorry we are out of time. 00:27:24.665 --> 00:27:27.522 We have the lunch break now, so thank you. 00:27:27.522 --> 00:27:29.695 Well, come talk to me if anyone else has any questions. 00:27:29.695 --> 00:27:31.480 (applause)