WEBVTT 00:00:07.138 --> 00:00:08.288 Thanks folks. 00:00:09.627 --> 00:00:11.991 As I mentioned before, you can load up the slides here 00:00:11.991 --> 00:00:16.661 by either the QR code or the short URL, which is wikidatacon..., this is bit.ly, 00:00:16.661 --> 00:00:19.920 wikidatacon19glamstrategies. 00:00:19.980 --> 00:00:22.040 And the slides are also on the program page 00:00:22.040 --> 00:00:24.520 on the WikidataCon site. 00:00:24.549 --> 00:00:27.269 And then, there's also an Etherpad here that you can click on. 00:00:27.269 --> 00:00:28.959 So, I'll be talking about a lot of things 00:00:28.959 --> 00:00:31.629 that you might have heard about at Wikimania, if you were there, 00:00:31.629 --> 00:00:34.089 but we are going to go into a lot more implementation details. 00:00:34.089 --> 00:00:36.209 Because we're at WikidataCon, we can dive deeper 00:00:36.209 --> 00:00:38.430 into the Wikidata and technical aspects. 00:00:38.430 --> 00:00:41.821 But Richard and myself, we are working at the Met Museum right now 00:00:41.821 --> 00:00:43.200 and their Open Access. 00:00:43.200 --> 00:00:45.320 If you didn't know, about two plus years ago, 00:00:45.320 --> 00:00:46.920 entering the third year, 00:00:46.920 --> 00:00:49.320 there's been an Open Access strategy at the Met, 00:00:49.320 --> 00:00:52.763 where they're releasing their images under a CC0 license and their metadata. 00:00:52.763 --> 00:00:54.639 And one of the things they brought us on to do 00:00:54.639 --> 00:00:58.409 is what things could we imagine doing with this Open Access content. 00:00:58.409 --> 00:01:00.469 So, we're going to talk a little bit about that 00:01:00.469 --> 00:01:02.598 in terms of the experiments that we've been running, 00:01:02.598 --> 00:01:04.044 and we'd love to hear your feedback. 
00:01:04.044 --> 00:01:07.028 So, I hope to talk about 20 minutes, and then hope to get some conversation 00:01:07.028 --> 00:01:09.853 with you folks, since we have a lot of knowledge in this room. 00:01:09.923 --> 00:01:12.472 This is the announcement, and actually the one-year anniversary, 00:01:12.472 --> 00:01:16.452 where Katherine Maher was actually there, at the Met to talk about that anniversary. 00:01:16.452 --> 00:01:19.172 So, one of the things that's challenging I think for a lot of folks 00:01:19.172 --> 00:01:21.097 is how do you explain Wikidata, 00:01:21.097 --> 00:01:23.911 and this GLAM contribution strategy to Wikidata 00:01:23.911 --> 00:01:27.102 to C-level folks at an organization. 00:01:27.102 --> 00:01:31.392 We can talk about it with data scientists, Wikimedians, librarians, maybe curators, 00:01:31.392 --> 00:01:34.452 but when it comes to talking about this with a director of a museum, 00:01:34.452 --> 00:01:36.862 or a director of a library, what does it actually-- 00:01:36.862 --> 00:01:38.482 how does it resonate with them? 00:01:38.482 --> 00:01:41.352 So, one way that we actually talked about that I think makes sense, 00:01:41.352 --> 00:01:43.978 is everyone knows about Wikipedia, 00:01:43.978 --> 00:01:47.799 and for the English language edition, 00:01:47.799 --> 00:01:49.733 at least, we're talking about 6 million articles. 00:01:49.733 --> 00:01:51.792 And it sounds like a lot, but if you think about it, 00:01:51.792 --> 00:01:54.361 Wikipedia is not really the sum of all human knowledge, 00:01:54.361 --> 00:01:59.512 it's the sum of all reliably sourced, mostly western knowledge. 00:02:00.281 --> 00:02:02.211 And there's a lot of stuff out there. 00:02:02.211 --> 00:02:04.141 We have a lot of stuff in Commons already-- 00:02:04.141 --> 00:02:07.382 56 million media files going up every single day-- 00:02:07.382 --> 00:02:11.484 but these are very... 
a different type of standard 00:02:11.484 --> 00:02:13.011 to what goes into Wikimedia Commons. 00:02:13.011 --> 00:02:16.431 And the way that we have described Wikidata to GLAM professionals, 00:02:16.431 --> 00:02:18.231 and especially the C levels, 00:02:18.231 --> 00:02:22.061 is that what if we could have a repository that has a notability bar 00:02:22.061 --> 00:02:24.381 that is not as high as Wikipedia. 00:02:24.381 --> 00:02:26.001 So, we want all these paintings, 00:02:26.001 --> 00:02:28.161 but not every painting necessarily needs an article. 00:02:28.581 --> 00:02:30.241 Wikipedia is held back by the fact 00:02:30.241 --> 00:02:33.082 that you need to have language editions of Wikipedia. 00:02:33.171 --> 00:02:36.681 So, can we store the famous thing-- things, not strings. 00:02:36.681 --> 00:02:40.570 Can we be object oriented and not really lexical oriented? 00:02:40.570 --> 00:02:42.181 And can we store this in a database 00:02:42.181 --> 00:02:44.540 that stores facts, figures, and relationships? 00:02:44.540 --> 00:02:46.291 And that's pretty much what Wikidata does. 00:02:46.711 --> 00:02:50.736 And Wikidata is also a universal kind of crosswalk database to links 00:02:50.736 --> 00:02:52.321 to other collections out there. 00:02:52.321 --> 00:02:55.119 So, we think this really resonates with folks when you're talking about 00:02:55.119 --> 00:02:58.596 what is the value of Wikidata compared to what they're normally familiar with, 00:02:58.596 --> 00:03:00.326 which is just Wikipedia. 00:03:01.346 --> 00:03:02.876 Alright, so what are the benefits? 00:03:02.876 --> 00:03:05.086 You're interlinking your collections with others. 00:03:05.086 --> 00:03:07.676 So, unfortunately, I apologize to librarians here, 00:03:07.676 --> 00:03:09.337 I'll be talking mostly about museums, 00:03:09.337 --> 00:03:11.816 but a lot of this also is valid also for libraries. 
00:03:11.816 --> 00:03:15.867 But you're basically connecting your collection with the global collection 00:03:15.867 --> 00:03:18.166 of linked open data collections. 00:03:18.846 --> 00:03:22.276 You can also receive enriched and improved metadata back 00:03:22.276 --> 00:03:25.656 after contributing and linking your collections to the world. 00:03:25.656 --> 00:03:28.436 And there are some pretty neat interactive multimedia applications 00:03:28.436 --> 00:03:30.596 that you get-- I don't want to say for free, 00:03:30.596 --> 00:03:33.596 but your collection in Wikidata allows you to visualize things 00:03:33.596 --> 00:03:35.276 that you've never seen before. 00:03:35.276 --> 00:03:36.776 We'll show you some examples. 00:03:36.776 --> 00:03:39.737 And so, how do you convey this to GLAM professionals effectively? 00:03:39.737 --> 00:03:41.746 Well, I usually like to start with storytelling, 00:03:41.746 --> 00:03:43.536 and not technical explanations. 00:03:43.536 --> 00:03:46.368 Okay, so if everyone here has a cell phone, 00:03:46.368 --> 00:03:49.574 especially if you have an iPhone, I want you to scan this QR code 00:03:49.574 --> 00:03:51.645 and bring up the URL that it comes up with. 00:03:51.645 --> 00:03:53.393 Or if you don't have a QR scanner, 00:03:53.393 --> 00:03:58.963 just type in w.wiki/Aij in a web browser. 00:04:00.036 --> 00:04:01.942 So go ahead and scan that. 00:04:03.280 --> 00:04:04.864 And what comes up? 00:04:06.778 --> 00:04:09.458 Does anyone see a knowledge graph pop up on your screen? 00:04:09.516 --> 00:04:11.156 So, for folks here in WikidataCon, 00:04:11.156 --> 00:04:13.266 this is probably not revolutionary for you. 00:04:13.266 --> 00:04:16.386 But what it does, it does a SPARQL query with these objects, 00:04:16.386 --> 00:04:18.836 and it shows the linkages between them. 00:04:18.836 --> 00:04:20.897 And you can actually drag them around the screen. 00:04:20.897 --> 00:04:22.204 You can actually click on nodes. 
00:04:22.204 --> 00:04:24.458 If you're [inaudible] in a mobile, it will expand that-- 00:04:24.458 --> 00:04:27.554 you can actually start to surf through Wikidata this way. 00:04:27.554 --> 00:04:29.741 So, for Wikidata veterans this is pretty cool. 00:04:29.741 --> 00:04:31.206 One shot, you get this. 00:04:31.206 --> 00:04:33.313 For a lot of folks who have never seen Wikidata before, 00:04:33.313 --> 00:04:35.574 this is a revolutionary moment for them. 00:04:36.176 --> 00:04:39.236 To actually hand-manipulate a knowledge graph, 00:04:39.236 --> 00:04:42.186 and to start surfing through Wikidata without having to know SPARQL, 00:04:42.186 --> 00:04:43.823 without having to know what a Q item is, 00:04:43.823 --> 00:04:45.860 without having to know what a property proposal is, 00:04:45.860 --> 00:04:48.623 they can suddenly start seeing connections in a way that is magical. 00:04:48.623 --> 00:04:50.264 Hey, I see [Jacob's] here. 00:04:50.264 --> 00:04:52.143 Jacob's been using some of this code, as well. 00:04:52.143 --> 00:04:54.443 So, this is some code that we'll talk about later on 00:04:54.443 --> 00:04:57.254 that allows you to create these visualizations in Wikidata. 00:04:57.254 --> 00:04:59.283 And we've really seen this turn a lot of heads 00:04:59.283 --> 00:05:01.408 who have really never gotten Wikidata before. 00:05:01.408 --> 00:05:04.653 But after seeing these interactive knowledge graphs, they get it. 00:05:04.653 --> 00:05:06.233 They understand the power of this. 00:05:06.233 --> 00:05:08.293 And especially this example here, 00:05:08.293 --> 00:05:11.304 this was a really big eye-opener for the folks at the Met, 00:05:11.304 --> 00:05:14.545 because this is the artifact that is the center of this graph, 00:05:14.545 --> 00:05:17.823 right there, the Portrait of Madame X, a very famous portrait. 
00:05:17.823 --> 00:05:20.982 And they did not even know that this was the inspiration 00:05:20.982 --> 00:05:24.693 for the black dress that Rita Hayworth wore in the movie Gilda. 00:05:24.693 --> 00:05:26.783 So, just by seeing this graph, they said, 00:05:26.783 --> 00:05:29.353 "Wait a minute. This is one of our most visited portraits. 00:05:29.353 --> 00:05:31.683 I didn't know that this was true." 00:05:31.683 --> 00:05:35.214 And there's actually two other books published about that painting. 00:05:35.214 --> 00:05:38.983 You can see all these things, not just within the realm of GLAM, 00:05:38.983 --> 00:05:41.441 but it extends to fashion, it extends to literature. 00:05:41.441 --> 00:05:43.381 You're starting to see the global connections 00:05:43.381 --> 00:05:47.481 that your artworks have, or your collections have via Wikidata. 00:05:48.722 --> 00:05:50.342 So, how do we do this? 00:05:50.842 --> 00:05:53.098 If you can remember nothing else from this presentation, 00:05:53.098 --> 00:05:56.432 this one page is your one-stop shopping. 00:05:56.432 --> 00:05:58.592 Now, fortunately, you don't have to memorize all this. 00:05:58.592 --> 00:06:03.292 It's actually right here at Wikidata:Linked_open_data_workflow. 00:06:03.560 --> 00:06:06.170 So, we'll be talking about some of these different phases 00:06:06.170 --> 00:06:10.670 of how you first prepare, reconcile, and examine 00:06:11.160 --> 00:06:14.190 what the GLAM organization might have and what does Wikidata have. 00:06:14.190 --> 00:06:15.374 And then, what are the tools 00:06:15.374 --> 00:06:18.664 to actually ingest and correct or enrich that 00:06:18.664 --> 00:06:20.241 once it's in Wikidata. 00:06:20.241 --> 00:06:22.691 And then, what are some of the ways to reuse that content, 00:06:22.691 --> 00:06:25.161 or to report and create new things out of it. 
00:06:25.161 --> 00:06:31.191 So, this is the simpler version of a chart that Sandra and the GLAM folks 00:06:31.191 --> 00:06:33.111 at the foundation have created. 00:06:33.111 --> 00:06:35.534 But this is trying to sum up, in one shot-- 00:06:35.534 --> 00:06:38.133 because we know how hard things are to find in Wikidata-- 00:06:38.133 --> 00:06:41.733 to find in one shot all the different tools you should pay attention to 00:06:41.733 --> 00:06:43.475 as a GLAM organization. 00:06:44.969 --> 00:06:50.606 So, just using the Met as an example, we started with what is the ideal object 00:06:50.606 --> 00:06:53.398 that we have in Wikidata that comes from the Met? 00:06:53.398 --> 00:06:55.882 This is a typical shot of a Wikidata item, 00:06:55.882 --> 00:06:57.385 in the mobile mode there. 00:06:57.385 --> 00:06:59.244 And this is one of the more famous paintings 00:06:59.244 --> 00:07:00.729 we used as a model, here. 00:07:00.729 --> 00:07:03.315 We have the label, description, and aliases. 00:07:03.915 --> 00:07:05.225 And then, we found out, 00:07:05.225 --> 00:07:07.035 "What are the core statements that we wanted?" 00:07:07.035 --> 00:07:10.035 We wanted instance of, image, inception, collection. 00:07:10.035 --> 00:07:13.239 And what are some other properties we would like if we had it? 00:07:13.239 --> 00:07:15.960 Depiction information, material used, things like that. 00:07:16.879 --> 00:07:19.369 We actually do have an identifier. 00:07:19.369 --> 00:07:22.199 The Met object ID is P3634. 00:07:22.199 --> 00:07:24.629 So, for some organizations, you might want to propose 00:07:24.629 --> 00:07:28.529 a property just to track your items using an object ID. 00:07:29.369 --> 00:07:31.899 And then, for the Met, just trying to circumscribe 00:07:31.899 --> 00:07:35.519 what objects do we want to upload and keep in Wikidata-- 00:07:35.519 --> 00:07:38.927 the thing that we first identified were collection highlights. 
00:07:38.927 --> 00:07:43.649 These are like a hand-selected set of 1,500 to 2,000 items 00:07:43.678 --> 00:07:48.878 that were going to be given priority to upload to Wikidata. 00:07:48.939 --> 00:07:51.709 So, Richard and the crew out of Wikimedia in New York 00:07:51.709 --> 00:07:53.105 did a lot of this early work. 00:07:53.105 --> 00:07:55.571 And then, now, we're systematically going through to make sure 00:07:55.571 --> 00:07:56.689 they're all complete. 00:07:56.689 --> 00:07:58.221 And there's a secondary set 00:07:58.221 --> 00:08:01.390 called the Heilbrunn Timeline of Art History-- about 8,000 items 00:08:01.390 --> 00:08:07.149 that are seminal pieces of work, artists' works throughout history. 00:08:07.149 --> 00:08:09.499 And there are about 8,000 that the Met has identified, 00:08:09.499 --> 00:08:11.812 and we're also putting that on Wikidata, as well, 00:08:11.812 --> 00:08:13.143 using a different designation. 00:08:13.143 --> 00:08:16.271 Here, described by source-- Heilbrunn Timeline of Art History. 00:08:16.271 --> 00:08:19.841 So, the collection highlight is denoted here as collection-- 00:08:19.841 --> 00:08:21.265 Metropolitan Museum of Art, 00:08:21.265 --> 00:08:22.976 subject has role collection highlight. 00:08:22.976 --> 00:08:26.872 And then, these 8,000 or so are like that in Wikidata. 00:08:29.741 --> 00:08:33.816 I couldn't show this chart at Wikimania, because it's too complicated. 00:08:33.816 --> 00:08:35.389 But WikidataCon, we can. 00:08:35.389 --> 00:08:38.845 So, this is something that is really hard to answer sometimes. 00:08:39.490 --> 00:08:42.169 What makes something in Wikidata from the Met, 00:08:42.169 --> 00:08:44.658 or from the New York Public Library, or from your organization? 00:08:44.658 --> 00:08:47.609 And the answer is not easy. It depends. 00:08:47.644 --> 00:08:49.684 It's complicated, it can be multi-factor. 
00:08:49.684 --> 00:08:53.254 So, you could say, "Well, if I had an object ID in Wikidata, 00:08:53.254 --> 00:08:54.804 that is a Met object." 00:08:54.804 --> 00:08:56.674 But maybe someone didn't enter that. 00:08:56.674 --> 00:08:59.924 Maybe they only put in Collection: Met which is P195, 00:08:59.924 --> 00:09:02.684 or they put in the accession number, 00:09:02.684 --> 00:09:06.984 and they put collection as the qualifier to that accession number. 00:09:06.984 --> 00:09:11.454 So, there's actually, one, two, three different ways to try to find Met objects. 00:09:11.454 --> 00:09:14.214 And probably the best way to do it is through a union like this. 00:09:14.214 --> 00:09:16.173 So, you combine all three, and you come back, 00:09:16.173 --> 00:09:18.064 and you make a list out of it. 00:09:18.064 --> 00:09:20.813 So unfortunately, there is no one clean query 00:09:20.813 --> 00:09:23.684 that'll guarantee you all the Met objects. 00:09:23.684 --> 00:09:27.873 This is probably the best approach for this. 00:09:27.873 --> 00:09:29.384 And for some institutions, 00:09:29.384 --> 00:09:32.505 they're probably doing something similar to that right now. 00:09:32.505 --> 00:09:35.824 Alright, so example here, is that what you see here 00:09:35.824 --> 00:09:39.684 manifests itself differently-- not differently, but as this in a query, 00:09:39.684 --> 00:09:40.904 which can get pretty complex. 00:09:40.904 --> 00:09:43.063 So, if we're looking for all the collection highlights, 00:09:43.063 --> 00:09:47.713 we'd break this out into the statement and then the qualifier as this: 00:09:47.782 --> 00:09:49.712 subject has role collection highlight. 00:09:49.712 --> 00:09:51.450 So, that's one way that we sort out 00:09:51.450 --> 00:09:54.124 some of these special designations in Wikidata. 00:09:55.166 --> 00:09:58.716 So, the summary is, representing "The Met" is multifaceted, 00:09:58.716 --> 00:10:01.536 and needs to balance simplicity and findability. 
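The three-way union and the qualifier lookup described above can be sketched roughly as follows. This is an illustration, not the query shown on the slide: it assumes the accession number is stored as inventory number (P217), that "subject has role" is P2868, and that the Met's item is Q160236; the function names are made up here, and the QID of the "collection highlight" item is deliberately left as a parameter because the talk does not state it.

```python
# Assumed QID for the Metropolitan Museum of Art; verify on Wikidata before use.
MET_QID = "Q160236"


def build_met_union_query(museum_qid: str = MET_QID) -> str:
    """Build a SPARQL query that unions three ways an item can be tied to the Met."""
    return f"""
SELECT DISTINCT ?item WHERE {{
  {{ ?item wdt:P3634 ?metId . }}            # 1. item has a Met object ID
  UNION
  {{ ?item wdt:P195 wd:{museum_qid} . }}    # 2. collection is the Met
  UNION
  {{ ?item p:P217 ?inv .                    # 3. inventory (accession) number ...
     ?inv pq:P195 wd:{museum_qid} . }}      #    ... qualified with collection = the Met
}}
""".strip()


def build_highlight_query(museum_qid: str, highlight_qid: str) -> str:
    """Build a query for collection statements qualified with 'subject has role'."""
    return f"""
SELECT ?item WHERE {{
  ?item p:P195 ?stmt .                      # the full collection statement
  ?stmt ps:P195 wd:{museum_qid} ;           # its value is the museum
        pq:P2868 wd:{highlight_qid} .       # its qualifier marks the highlight role
}}
""".strip()


union_query = build_met_union_query()
print(union_query)
```

The resulting strings are meant to be pasted into, or sent to, the Wikidata Query Service, which predefines the `wd:`, `wdt:`, `p:`, `ps:` and `pq:` prefixes.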
00:10:01.536 --> 00:10:04.896 How many people here have heard of Sum of All Paintings as a project? 00:10:04.995 --> 00:10:07.088 Ooh, God, good, a lot of you! 00:10:07.088 --> 00:10:09.105 So, it's probably one of the most active ones 00:10:09.105 --> 00:10:10.525 that deals with these issues. 00:10:10.525 --> 00:10:17.057 So, we always debate whether we should model things super-accurately, 00:10:17.057 --> 00:10:19.815 or should you model things so that they're findable. 00:10:19.815 --> 00:10:21.997 These are kind of at odds with each other. 00:10:21.997 --> 00:10:24.232 So, we usually prefer findability. 00:10:24.232 --> 00:10:27.001 It's no good if it's perfectly modeled, but no one can ever find it, 00:10:27.001 --> 00:10:30.013 because it's so strict in terms of how it's defined at Wikidata. 00:10:30.013 --> 00:10:31.882 And then, we have some challenges. 00:10:31.882 --> 00:10:35.367 Multiple artifacts might be tied to one object ID, 00:10:35.367 --> 00:10:37.396 which might be different in Wikidata. 00:10:37.396 --> 00:10:42.097 And then, mapping the Met classification to instances has some complex cases. 00:10:42.097 --> 00:10:44.282 So, the way that the Met classifies things 00:10:44.282 --> 00:10:46.775 doesn't always fit with how Wikidata classifies things. 00:10:46.775 --> 00:10:49.982 So, we show you some examples here of how this works. 00:10:49.982 --> 00:10:53.602 So, this is a great example of using a Python library 00:10:53.602 --> 00:10:56.487 to actually ingest what we know from the Met, 00:10:56.487 --> 00:10:58.313 and then try to sort out what they have. 00:10:58.313 --> 00:10:59.887 So, this is just for textiles. 00:10:59.887 --> 00:11:02.076 You can see that they got a lot of detail here 00:11:02.076 --> 00:11:05.399 in terms of woven textiles, laces, printed, trimmings, velvets. 00:11:05.399 --> 00:11:07.907 We first looked into this in Wikidata. 00:11:07.907 --> 00:11:10.175 We did not have this level of detail in Wikidata. 
00:11:10.175 --> 00:11:12.207 We still don't have all this resolved. 00:11:12.207 --> 00:11:14.764 You can see that this is really complex here. 00:11:14.764 --> 00:11:18.012 Anonymous is just not anonymous for a lot of databases. 00:11:18.012 --> 00:11:20.126 There's a lot of qualifications-- 00:11:20.126 --> 00:11:23.045 whether the nationality, or the century. 00:11:23.045 --> 00:11:26.282 So, trying to map all this to Wikidata can be complex, as well. 00:11:26.282 --> 00:11:30.450 And then, this shows you that of all the works in the Met, 00:11:30.450 --> 00:11:33.976 about 46% are open access right now. 00:11:33.976 --> 00:11:38.694 So, we still have about just over 50% that are not CC0 yet. 00:11:40.134 --> 00:11:43.444 (man) All the objects in the Met, or all objects on display? 00:11:43.444 --> 00:11:45.957 (Andrew) It's weird. It's not on display. 00:11:45.957 --> 00:11:47.866 But it's not all objects either. 00:11:47.866 --> 00:11:52.176 It's about 400 to 500 thousand objects in their database at this point. 00:11:52.176 --> 00:11:53.840 So, somewhere in between. 00:11:55.380 --> 00:11:57.609 So, starting points. This is always a hard one. 00:11:57.609 --> 00:12:03.514 We just had this discussion on the Facebook group recently 00:12:03.514 --> 00:12:04.923 about where do people go 00:12:04.923 --> 00:12:07.887 to find out where the modeling should look like for a certain thing. 00:12:07.887 --> 00:12:09.271 It's not easy. 00:12:09.271 --> 00:12:12.115 So, normally, what we have to do is just point people to, 00:12:12.115 --> 00:12:15.281 I don't know, some project that does it well now? 00:12:15.281 --> 00:12:17.230 So, it's not a satisfying answer, 00:12:17.230 --> 00:12:19.910 but we usually tell folks to start at things like visual arts, 00:12:19.910 --> 00:12:22.308 or Sum of All Paintings does it pretty well, 00:12:22.308 --> 00:12:25.569 or just go to the project chat to find out where some of these things are. 
00:12:25.569 --> 00:12:27.444 We need better solutions for this. 00:12:27.444 --> 00:12:30.939 This is just a basic flow of what we're doing with the Met here. 00:12:30.939 --> 00:12:33.119 We're basically taking their CSV, and their API, 00:12:33.119 --> 00:12:35.979 and we're consuming it into a Python data frame. 00:12:35.979 --> 00:12:38.159 We're taking the SPARQL code-- 00:12:38.159 --> 00:12:40.499 the one that you saw before, this super union-- 00:12:40.499 --> 00:12:43.779 bring that in, and we're doing a bi-directional diff, 00:12:43.779 --> 00:12:45.999 and then seeing what new things have been added here, 00:12:45.999 --> 00:12:47.729 what things have been subtracted there, 00:12:47.729 --> 00:12:51.529 and we're actually making those changes either through QuickStatements, 00:12:51.529 --> 00:12:53.439 or we're doing it through Pywikibot. 00:12:53.439 --> 00:12:55.512 So, directly editing Wikidata. 00:12:56.204 --> 00:12:59.405 So, this is the big slide I also couldn't show at Wikimania, 00:12:59.405 --> 00:13:01.485 because it would have flummoxed everyone. 00:13:01.485 --> 00:13:04.924 So, this is a great example of how we start with the Met database, 00:13:04.924 --> 00:13:06.824 we have this crosswalk database, 00:13:06.824 --> 00:13:09.209 and then we generate the changes in Wikidata. 00:13:09.209 --> 00:13:12.644 The way this works is this is an example of one record from the Met. 00:13:12.644 --> 00:13:15.744 This is an evening dress-- we're working with the Costume Institute recently, 00:13:15.744 --> 00:13:17.518 the one that puts on the Met Gala. 00:13:17.518 --> 00:13:20.442 So, we have one evening dress here, by Valentina. 00:13:20.442 --> 00:13:22.100 Here's a date, accession number. 00:13:22.100 --> 00:13:25.105 So, these things can be put into Wikidata directly. 00:13:25.105 --> 00:13:27.744 A field equals the date, accession number. 00:13:27.744 --> 00:13:29.404 But what do we do with things like this? 
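The bi-directional diff mentioned a moment ago can be sketched with plain sets. This is only an illustration under stated assumptions: the talk's pipeline works on a data frame, whereas this sketch compares two lists of object IDs, and the function name and the sample IDs are invented for the example.

```python
def bidirectional_diff(met_ids, wikidata_ids):
    """Compare object IDs from the museum export with those found via SPARQL."""
    met, wd = set(met_ids), set(wikidata_ids)
    return {
        # In the museum data but not yet in Wikidata: candidates to add.
        "missing_in_wikidata": sorted(met - wd),
        # In Wikidata but no longer in the museum data: candidates to review.
        "missing_in_met": sorted(wd - met),
        # Present on both sides: candidates for a field-by-field comparison.
        "in_both": sorted(met & wd),
    }


# Made-up object IDs, purely for illustration.
diff = bidirectional_diff(["1001", "1002", "1003"], ["1002", "1003", "1004"])
print(diff)
```

Each bucket would then feed a different action: new items or statements for the first, a manual check for the second, and updates for the third.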
00:13:29.404 --> 00:13:33.868 This is an object name, which is basically like a classification of what it is, 00:13:33.868 --> 00:13:35.648 like an instance of for the Met. 00:13:35.648 --> 00:13:37.396 And the designer's Valentina. 00:13:37.396 --> 00:13:41.571 So, what we do is we take these and we run all the unique object names 00:13:41.571 --> 00:13:43.801 and all the unique designers through OpenRefine. 00:13:43.801 --> 00:13:46.720 So, we get maybe 60% matches if we're lucky. 00:13:46.720 --> 00:13:48.418 We put that into a spreadsheet. 00:13:48.418 --> 00:13:53.178 Then we ask volunteers or the curators at the Met 00:13:53.178 --> 00:13:55.333 to help fill in this crosswalk database. 00:13:55.333 --> 00:13:57.312 This is just simply Google Sheets. 00:13:57.312 --> 00:13:59.911 So, we say, here are all the object names, the unique object names 00:13:59.911 --> 00:14:02.731 that match lexically exactly with what's in the Met database, 00:14:02.731 --> 00:14:05.912 and then you say this maps to this Q ID. 00:14:05.912 --> 00:14:08.556 So, we first started this maybe like only about-- 00:14:08.556 --> 00:14:11.233 well, 60% were failed, some of these were blank. 00:14:11.233 --> 00:14:13.751 So, we tap folks in specific groups. 00:14:13.751 --> 00:14:17.316 So there's like a Wiki Loves Fashion little chat group that we have. 00:14:17.316 --> 00:14:20.304 And folks like user PKM were super useful in this area. 00:14:20.304 --> 00:14:22.794 So she spent a lot of time looking through this, and saying, 00:14:22.794 --> 00:14:24.764 "Okay, Evening suit is this, Ewer is that." 00:14:24.764 --> 00:14:27.759 So, we looked through and made all these mappings here. 00:14:27.759 --> 00:14:30.719 And then, what happens is now, when we see this in the Met database, 00:14:30.719 --> 00:14:33.201 we look it up in the crosswalk database, and we say, "Oh, yeah. 00:14:33.201 --> 00:14:36.169 These are the two Q numbers we need to put into Wikidata." 
00:14:36.169 --> 00:14:39.089 And then, it generates the QuickStatement right there. 00:14:39.089 --> 00:14:41.328 Same thing here with Designer: Valentina. 00:14:41.328 --> 00:14:44.138 If Valentina matches here, then it gets generated 00:14:44.138 --> 00:14:45.838 with that QuickStatement right there. 00:14:45.838 --> 00:14:48.069 If Valentina does not exist, then we'll create it. 00:14:48.069 --> 00:14:51.288 You can see here, Weeks-- look at that high Q ID right there. 00:14:51.288 --> 00:14:53.918 We just created that recently, because there was no entry before. 00:14:53.918 --> 00:14:55.358 Does that make sense to everyone? 00:14:55.358 --> 00:14:57.727 - (man 2) What's the extra statement? - (Andrew) I'm sorry? 00:14:57.727 --> 00:15:00.610 - (man 2) What's the extra statement? - (Andrew) Oh, the extra statement. 00:15:00.610 --> 00:15:03.131 So, believe it or not, we have an Evening blouse, Evening dress, 00:15:03.131 --> 00:15:05.010 Evening pants, Evening ensemble, Evening hat-- 00:15:05.010 --> 00:15:08.650 do we want to make a new Wikidata item for Evening pants, Evening everything? 00:15:08.650 --> 00:15:10.444 So, we said, "No." We probably don't want to. 00:15:10.444 --> 00:15:13.859 We'll just say, "It's a dress, but it's also evening wear", 00:15:13.859 --> 00:15:15.117 which is what that is. 00:15:15.117 --> 00:15:17.301 So, we're saying an instance of both things. 00:15:17.931 --> 00:15:21.398 I'm not sure it's the perfect solution, but it's a solution at this point. 00:15:21.744 --> 00:15:22.944 So, does everyone get that? 00:15:22.944 --> 00:15:25.564 So, this is kind of a crosswalk database that we maintain here. 00:15:25.564 --> 00:15:28.025 And the nice thing about it, it's just Google Sheets. 00:15:28.025 --> 00:15:29.264 So, we can get people to help 00:15:29.264 --> 00:15:31.375 that don't need to know anything about this database, 00:15:31.375 --> 00:15:34.384 don't need to know about QuickStatements, don't need to know about queries. 
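The crosswalk lookup and the two "instance of" statements described above can be sketched like this. It is a hedged illustration rather than the Met program itself: the crosswalk is modeled as a plain dictionary, the function name is invented, and `Q_ITEM`, `Q_DRESS`, `Q_EVENING_WEAR` and `Q_EWER` are placeholders to be replaced with real Q IDs. The output uses the tab-separated QuickStatements form (item, property, value) with P31 for "instance of".

```python
# Hypothetical crosswalk: museum object name -> one or more Wikidata classes.
# The Q IDs are placeholders, not real Wikidata identifiers.
CROSSWALK = {
    "Evening dress": ["Q_DRESS", "Q_EVENING_WEAR"],  # a dress that is also evening wear
    "Ewer": ["Q_EWER"],
}


def instance_of_statements(item_qid, object_name, crosswalk):
    """Return QuickStatements lines (item, P31, class) for one museum record."""
    class_qids = crosswalk.get(object_name)
    if not class_qids:
        # No mapping yet: emit nothing and leave the row for a volunteer to fill in.
        return []
    return [f"{item_qid}\tP31\t{class_qid}" for class_qid in class_qids]


lines = instance_of_statements("Q_ITEM", "Evening dress", CROSSWALK)
print("\n".join(lines))
```

Returning an empty list for unmapped names keeps the generator from guessing, so gaps in the sheet surface as missing statements rather than wrong ones.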
00:15:34.384 --> 00:15:36.226 They just go in and fill in the Q number. 00:15:36.226 --> 00:15:37.244 Yeah. 00:15:37.244 --> 00:15:40.902 (woman) So, when you copy object name and you find the Q ID, 00:15:40.902 --> 00:15:43.145 the initial 60% that you mentioned as an example, 00:15:43.145 --> 00:15:45.223 is that by exact match? 00:15:46.483 --> 00:15:48.103 (Andrew) Well, it's through OpenRefine. 00:15:48.103 --> 00:15:52.014 So, it does its best guess, and then we verify to make sure 00:15:52.014 --> 00:15:54.444 that the OpenRefine match makes sense. 00:15:54.444 --> 00:15:56.114 Yeah. 00:15:56.203 --> 00:15:57.794 Does that make sense to everyone? 00:15:57.794 --> 00:16:00.304 So, some folks might be doing some variation on this, 00:16:00.304 --> 00:16:03.403 but I think the nice thing about this is that, by using Google Sheets, 00:16:03.403 --> 00:16:08.234 we remove a lot of the complexities of these two areas from this. 00:16:08.234 --> 00:16:11.193 And we'll show you some code that does this later on. 00:16:11.813 --> 00:16:15.273 - (man 3) How do you generate [inaudible]? - (Andrew) How do you generate this? 00:16:15.273 --> 00:16:17.272 - (man 3) Yes. - (Andrew) Python code. 00:16:17.272 --> 00:16:19.134 I'll show you a line that does this. 00:16:19.134 --> 00:16:21.136 But you can also go up here. 00:16:21.136 --> 00:16:25.096 This is the whole Python program that does this, this, and that, 00:16:25.096 --> 00:16:27.296 if you want to take a look at that. 00:16:28.026 --> 00:16:29.026 Yes. 00:16:29.026 --> 00:16:31.207 (man 4) Did you really use your own vocabulary, 00:16:31.207 --> 00:16:35.426 or is there something [inaudible]. 00:16:35.426 --> 00:16:37.246 - (Andrew) This right here? - (man 4) Yeah. 00:16:37.246 --> 00:16:39.721 (Andrew) Yeah. So, this is the Met's own vocabulary. 00:16:39.721 --> 00:16:43.031 So, most museums use a system called TMS. 00:16:43.031 --> 00:16:44.891 It's like their own management system. 
00:16:44.891 --> 00:16:47.654 So, they'll usually-- this is the museum world-- 00:16:47.654 --> 00:16:50.771 they'll usually roll their own vocabulary for their own needs. 00:16:50.771 --> 00:16:54.022 Museums are very late to interoperable metadata. 00:16:54.022 --> 00:16:57.282 Librarians and archivists have this kind of as baked into them. 00:16:57.282 --> 00:16:58.664 Museums are like, "Meh..." 00:16:58.664 --> 00:17:01.471 Our primary goal is to put objects on display, 00:17:01.471 --> 00:17:04.141 and if it plays well with other people, that's a side benefit. 00:17:04.141 --> 00:17:05.931 But it's not a primary thing that they do. 00:17:05.931 --> 00:17:08.031 So, that's why it's complicated to work with museums. 00:17:08.031 --> 00:17:11.161 You need to map their vocabulary, which might be a mish-mash 00:17:11.161 --> 00:17:14.576 of famous vocabularies, like Getty AAT, and other things. 00:17:14.576 --> 00:17:17.911 But usually, it's to serve their exact needs at their museum. 00:17:17.911 --> 00:17:19.591 And that's what's challenging. 00:17:19.591 --> 00:17:21.091 And I see a lot of heads nodding, 00:17:21.091 --> 00:17:23.161 so you've probably seen this a lot at these museums. 00:17:23.161 --> 00:17:25.429 So, I'll move on to show you how this actually is done. 00:17:25.429 --> 00:17:26.749 Oh, go ahead. 00:17:26.749 --> 00:17:28.711 (man 5) How do you bring people, to collaborate, 00:17:28.711 --> 00:17:31.595 and put some Q codes into your database? 00:17:31.595 --> 00:17:32.971 (Andrew) How do you-- I'm sorry? 00:17:32.971 --> 00:17:35.038 (man 5) How do you bring... collaborate people? 00:17:35.038 --> 00:17:38.290 (Andrew) Ah, so for this, these are projects we just go to, 00:17:38.780 --> 00:17:41.750 for better or for worse, like Facebook chat groups that we know, 00:17:41.750 --> 00:17:43.007 are active in these areas. 
00:17:43.007 --> 00:17:45.685 Like Sum of All Paintings, Wiki Loves Fashion-- 00:17:45.685 --> 00:17:47.918 which is a group of maybe five or seven folks. 00:17:48.548 --> 00:17:50.759 But we need a better way to get this out to folks 00:17:50.759 --> 00:17:52.339 so we get more collaborators on this. 00:17:52.339 --> 00:17:53.879 This doesn't scale well, right now. 00:17:53.879 --> 00:17:56.089 But for small groups, it works pretty well. 00:17:56.108 --> 00:17:57.568 I'm open to ideas. 00:17:57.568 --> 00:17:59.619 (man 5) [inaudible] 00:17:59.619 --> 00:18:01.669 (Andrew) Oh yeah. Please come on up. 00:18:01.669 --> 00:18:02.948 If folks want to come up here, 00:18:02.948 --> 00:18:05.357 there's a little more room in the aisle right here. 00:18:06.057 --> 00:18:09.629 So, we are utilizing Python for this mostly. 00:18:09.774 --> 00:18:13.354 If you don't know, there is a Python notebook system 00:18:13.354 --> 00:18:14.884 that WMFLabs has. 00:18:14.884 --> 00:18:17.345 So, you can actually go on and start playing with this. 00:18:17.345 --> 00:18:19.624 So, it's pretty easy to generate a lot of stuff 00:18:19.624 --> 00:18:21.401 if you know some of the code that's there. 00:18:21.401 --> 00:18:22.455 [inaudible], yeah. 00:18:22.485 --> 00:18:23.922 (woman 2) Why do you put everything 00:18:23.922 --> 00:18:27.821 into Wikidata, and not into your own Wikibase? 00:18:29.401 --> 00:18:31.127 (Andrew) If you're using your own Wikibase? 00:18:31.127 --> 00:18:33.741 (woman 2) Yeah. Why don't you use your own Wikibase? 00:18:33.741 --> 00:18:35.990 and then go to [inaudible] 00:18:35.990 --> 00:18:38.390 (Andrew) That's its own ball of-- 00:18:38.390 --> 00:18:41.630 I don't want to maintain my own Wikibase at this point. (laughs) 00:18:42.190 --> 00:18:44.400 If I can avoid doing the Wikibase maintenance, 00:18:44.400 --> 00:18:45.760 I would not do it. 00:18:46.530 --> 00:18:48.080 (man 6) Would you like a Wikibase? 00:18:48.080 --> 00:18:50.050 (Andrew) We could. 
It's possible. 00:18:50.050 --> 00:18:54.154 (man 7) But again, what they use [inaudible] 00:18:54.154 --> 00:18:59.868 about 2,000, 8,000, 10,000, of 400,000 digital [inaudible]. 00:18:59.868 --> 00:19:04.300 So that's only 2.5%, 00:19:04.300 --> 00:19:08.782 [inaudible] 00:19:08.782 --> 00:19:12.601 (Andrew) So, I'd say, solve it for 1,500, then scale up to 150 thousand. 00:19:12.601 --> 00:19:14.428 So, we're trying to solve it 00:19:14.428 --> 00:19:16.876 for the best well-known objects, and then-- 00:19:16.876 --> 00:19:19.875 (man 7) When do you think that will happen? 00:19:20.855 --> 00:19:25.788 I understand that those are people that shouldn't go onto Wikidata. 00:19:25.788 --> 00:19:29.856 So you go to Commons or your own Wikibase solution, 00:19:29.856 --> 00:19:31.695 not to be a [inaudible]-- 00:19:31.695 --> 00:19:34.588 (Andrew) Right. That's why we're going with the 2,000 and 8,000. 00:19:34.588 --> 00:19:37.460 We're pretty confident these are highly notable objects 00:19:37.460 --> 00:19:39.085 that deserve to be in Wikidata. 00:19:39.085 --> 00:19:40.465 Beyond that, it's debatable. 00:19:40.465 --> 00:19:44.265 So, that's why we're not vacuuming 400-thousand things at one shot. 00:19:44.265 --> 00:19:48.936 We're starting with notable 2,000, notable 8,000, then we'll talk after that. 00:19:49.515 --> 00:19:52.775 So, these are the two lines of code that do the most stuff here. 00:19:52.775 --> 00:19:54.217 So, even if you don't know Python, 00:19:54.217 --> 00:19:56.146 it's actually not that bad if you look at this. 00:19:56.146 --> 00:19:58.105 There's a read_csv function. 00:19:58.105 --> 00:20:00.015 You're taking the crosswalk URL, 00:20:00.015 --> 00:20:02.336 basically, the URL of that Google Spreadsheet. 00:20:02.336 --> 00:20:04.875 You're grabbing the spreadsheet that's called "Object Name", 00:20:04.875 --> 00:20:06.685 and you're basically creating a data structure 00:20:06.685 --> 00:20:08.165 that has the Object Name and the QID. 
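A minimal sketch of the crosswalk step described here, under stated assumptions: the talk's code calls a read_csv function on the Google Spreadsheet URL, while this version uses Python's standard csv module on an inline sample so it runs without a network connection or extra packages. The sample rows, the function names, and the choice of P31 as the property are illustrative and are not taken from the speaker's notebook; verify any QID before using it.

```python
import csv
import io

# Hypothetical crosswalk rows standing in for the Google Spreadsheet export.
# The QIDs are illustrative; check them on Wikidata before real use.
CROSSWALK_CSV = """Object Name,QID
Painting,Q3305213
Sculpture,Q860861
"""


def load_crosswalk(csv_text):
    """Return a dict that maps each Object Name to its QID."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["Object Name"].strip(): row["QID"].strip() for row in reader}


def instance_of_statement(item_qid, object_name, crosswalk):
    """Build one tab-separated QuickStatements-style line, or None if unmapped."""
    target_qid = crosswalk.get(object_name)
    if target_qid is None:
        return None  # no crosswalk entry, so emit nothing rather than guess
    return f"{item_qid}\tP31\t{target_qid}"


crosswalk = load_crosswalk(CROSSWALK_CSV)
# Q4115189 is the Wikidata Sandbox item, used here as a harmless stand-in.
print(instance_of_statement("Q4115189", "Painting", crosswalk))
```

Returning None for an unmapped object name keeps unknown values out of the batch, so they can be added to the crosswalk by hand instead of being guessed.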
00:20:08.165 --> 00:20:09.645 That's it. That's all you're doing. 00:20:09.645 --> 00:20:11.655 Just pulling that into the Python code. 00:20:11.655 --> 00:20:15.914 Then, you're actually matching whatever the entity's name is, 00:20:15.914 --> 00:20:17.754 and then looking up the QID. 00:20:17.754 --> 00:20:21.689 Okay, so, this is just to tell you that's not super hard. 00:20:21.689 --> 00:20:24.234 The code is available right there, if you want to look at it. 00:20:24.234 --> 00:20:26.474 But these two lines of code, which takes a little while 00:20:26.474 --> 00:20:29.524 when you're writing it from scratch to create these two lines of code, 00:20:29.524 --> 00:20:30.904 but once you have an example, 00:20:30.904 --> 00:20:34.484 it's pretty darn easy to plug in your own data set, your own crosswalk, 00:20:34.484 --> 00:20:36.844 to generate the QuickStatements. 00:20:36.844 --> 00:20:38.525 So, I've done a lot of the work already, 00:20:38.525 --> 00:20:41.385 and I invite you to steal the code and try it. 00:20:42.365 --> 00:20:44.936 So, when it comes to images, it's a little more challenging. 00:20:44.936 --> 00:20:48.215 So, at this point, Pattypan is probably your best bet. 00:20:48.215 --> 00:20:51.385 Pattypan is a tool that is a spreadsheet-oriented tool. 00:20:51.385 --> 00:20:54.855 You fill in the metadata, you point to the local file on your computer, 00:20:54.855 --> 00:20:57.435 and it uploads it to Commons with all that information, 00:20:57.435 --> 00:21:02.125 or another alternative is if you set P4765 to a URL-- 00:21:03.105 --> 00:21:06.195 because this is the Commons-compatible image available at URL, 00:21:06.195 --> 00:21:08.544 Maarten Dammers has a bot, at least for paintings, 00:21:08.544 --> 00:21:12.020 that will just swoop through and say, "Oh, we don't have this image. 00:21:12.020 --> 00:21:15.113 Here's a Commons compatible one. 00:21:15.113 --> 00:21:17.709 Why don't I slip it from that site and put it into Commons?"
00:21:17.709 --> 00:21:18.995 And that's what his bot does. 00:21:18.995 --> 00:21:20.733 So, you can actually take a look at his bot 00:21:20.733 --> 00:21:24.102 and modify it for your own purposes, but that is also another alternative 00:21:24.102 --> 00:21:28.061 that doesn't require you to do some spreadsheet work there. 00:21:28.061 --> 00:21:30.452 If you might have heard of GLAM Wiki Toolset, 00:21:30.452 --> 00:21:32.552 it's effectively end of life at this point. 00:21:33.322 --> 00:21:37.362 It hasn't been updated, and even the folks who have been working with it in the past 00:21:37.362 --> 00:21:39.332 have said Pattypan is probably your best bet. 00:21:39.332 --> 00:21:41.722 Has anyone used GWT these days? 00:21:41.741 --> 00:21:43.591 A few of you, a little bit. 00:21:43.591 --> 00:21:45.161 It's just not being further developed, 00:21:45.161 --> 00:21:47.852 and it's not compatible with a lot of our authentication protocols 00:21:47.852 --> 00:21:49.280 that we have now. 00:21:49.280 --> 00:21:52.928 Okay. So, right now, we have basic metadata added to Wikidata, 00:21:52.928 --> 00:21:54.997 with pretty good results from the Met, 00:21:54.997 --> 00:21:58.117 and we have a Python script here to also analyze that. 00:21:58.117 --> 00:22:00.307 You're welcome to steal some of that code, as well. 00:22:00.307 --> 00:22:02.817 So, this is what we are showing to the Met folks, now. 00:22:02.817 --> 00:22:06.087 We actually have Listeria lists that are running 00:22:06.087 --> 00:22:07.627 to show all the inventory 00:22:07.627 --> 00:22:10.967 and all the information that we have in Wikidata. 00:22:10.967 --> 00:22:15.612 And I'll show you very quickly about a project that we ran to show folks. 00:22:15.612 --> 00:22:18.547 So, what are the benefits of adding your collections to Wikidata? 
00:22:18.547 --> 00:22:21.917 One is to use AI in the image classifier 00:22:21.917 --> 00:22:24.787 to actually help train a machine learning model 00:22:24.787 --> 00:22:29.447 with all the Met's images and keywords, and let that be an engine for other folks 00:22:29.447 --> 00:22:32.047 to recognize content. 00:22:32.047 --> 00:22:36.408 So, this is a hack-a-thon that we had with MIT and Microsoft last year. 00:22:36.408 --> 00:22:39.238 The way this works, is we have the paintings from the Met, 00:22:39.238 --> 00:22:40.277 and we have the keywords 00:22:40.277 --> 00:22:43.157 that they actually paid a crew for six months to work on 00:22:43.157 --> 00:22:46.937 to add hand keyword tags to all the artworks. 00:22:47.567 --> 00:22:50.077 We ingested that into an AI system right here, 00:22:50.077 --> 00:22:51.367 and then, what we did was say, 00:22:51.367 --> 00:22:55.428 "Let's feed in new images that this AI ML system had never seen before, 00:22:55.428 --> 00:22:56.747 and see what comes out." 00:22:56.747 --> 00:23:00.037 And the problem is that it comes out with pretty good results, 00:23:00.037 --> 00:23:02.267 but it's maybe only 60% accurate. 00:23:02.267 --> 00:23:04.797 And for most folks, 60% accurate is garbage. 00:23:04.797 --> 00:23:08.627 How do I get the 60% good out of this pile of stuff? 00:23:08.627 --> 00:23:11.127 The good news is that our community knows how to do that. 00:23:11.127 --> 00:23:13.157 We can actually feed this into a Wikidata game 00:23:13.157 --> 00:23:14.997 and get the good stuff out of that. 00:23:14.997 --> 00:23:16.228 That's basically what we did. 00:23:16.228 --> 00:23:17.647 So, this is the Wikidata game-- 00:23:17.647 --> 00:23:19.757 you'll notice this is Magnus' interface right there-- 00:23:19.757 --> 00:23:21.182 being played at the Met Museum, 00:23:21.182 --> 00:23:22.207 in the lobby. 
00:23:22.207 --> 00:23:25.437 We actually had folks at a cocktail party drinking champagne 00:23:25.437 --> 00:23:27.427 and hitting buttons on the screen. 00:23:27.427 --> 00:23:31.048 Hopefully, accurately. (chuckles) 00:23:31.048 --> 00:23:33.444 (applause) 00:23:33.444 --> 00:23:35.116 We had journalists, curators, 00:23:35.116 --> 00:23:37.506 we had some board members from the Met there as well. 00:23:37.506 --> 00:23:38.810 And this was great. 00:23:38.810 --> 00:23:40.061 No log in, whatever. 00:23:40.061 --> 00:23:42.106 (lowers voice) We created an account just for this. 00:23:42.106 --> 00:23:44.117 So, they just hit yes-no-yes-no. 00:23:44.117 --> 00:23:45.256 This is great. 00:23:45.256 --> 00:23:47.526 You saw this, it said, "Is there a tree in this picture?" 00:23:47.526 --> 00:23:49.148 You don't have to train anyone on this. 00:23:49.148 --> 00:23:52.213 You just hit yes-- depicts a tree, not depicted. 00:23:52.213 --> 00:23:55.910 I even had my eight-year-old boys play this game with a finger tap. 00:23:56.540 --> 00:24:00.047 And we also created a little tool that showed all the depictions going by 00:24:00.047 --> 00:24:01.505 so people could see them. 00:24:03.189 --> 00:24:06.453 It basically is like-- how do you sift good from bad? 00:24:06.453 --> 00:24:08.350 This is where the Wikimedia community comes in, 00:24:08.350 --> 00:24:11.034 that no other entity could ever do. 00:24:12.084 --> 00:24:15.052 So, in that first few months that we had this, 00:24:15.052 --> 00:24:19.017 over 7,000 judgments, resulting in about 5,000 edits. 00:24:19.912 --> 00:24:22.227 We did really well on tree, boat, flower, horse, 00:24:22.227 --> 00:24:24.907 things that are in landscape paintings. 00:24:25.146 --> 00:24:27.466 But when you go to things like gender discrimination, 00:24:27.466 --> 00:24:29.901 and cats and dogs, not so good, I know. 
00:24:29.901 --> 00:24:32.159 Because there's so many different types of cats and dogs 00:24:32.159 --> 00:24:33.456 in different positions. 00:24:33.456 --> 00:24:36.105 But horses, a lot easier than cats and dogs. 00:24:36.735 --> 00:24:38.742 But also, I should note that Wikimedia Foundation 00:24:38.742 --> 00:24:42.697 is now looking into doing image recognition on Commons uploads 00:24:42.697 --> 00:24:46.368 to do these suggestions as well, which is an awesome development. 00:24:46.667 --> 00:24:49.627 Okay, so, dashboards. 00:24:50.750 --> 00:24:53.358 Let's just show you some of these dashboards. 00:24:53.418 --> 00:24:55.097 Folks you work with love dashboards. 00:24:55.097 --> 00:24:56.817 They just want to see stats. 00:24:56.817 --> 00:24:58.797 So, we have them, like BaGLAMa. 00:24:58.797 --> 00:25:00.787 We have InteGraality. 00:25:00.787 --> 00:25:02.767 Is JeanFred here? 00:25:03.447 --> 00:25:06.247 I think this is a very new thing relative to last WikidataCon. 00:25:06.247 --> 00:25:08.327 We actually have a tool which will create 00:25:08.327 --> 00:25:10.967 this property completeness chart right here. 00:25:10.967 --> 00:25:12.987 So, it's called InteGraality, with two A's. 00:25:13.206 --> 00:25:15.526 It's on that big chart that I showed you before. 00:25:15.526 --> 00:25:19.086 And it can just autogenerate how complete your items are 00:25:19.086 --> 00:25:21.036 in any set, which is really cool. 00:25:21.566 --> 00:25:23.771 So, we can see that paintings are by far the highest, 00:25:23.771 --> 00:25:26.057 we have sculptures, drawings, photographs. 00:25:26.121 --> 00:25:29.322 And then, they also like to see what are the most popular artworks 00:25:29.322 --> 00:25:31.148 in the Wikisphere? 00:25:31.148 --> 00:25:33.417 So, just looking at the site links in Wikidata-- 00:25:33.417 --> 00:25:37.781 you can see and rank all these different artworks there. 
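The property completeness chart mentioned above boils down to a simple count: for each property, what share of the items in a set carries at least one statement. The sketch below shows that arithmetic only. It is not InteGraality's code, and the item keys and property sets are made-up sample data.

```python
from collections import Counter


def property_completeness(items, properties):
    """Return {property: percent of items that have at least one statement}."""
    counts = Counter()
    for claims in items.values():
        for prop in properties:
            if prop in claims:
                counts[prop] += 1
    total = len(items)
    return {
        prop: (100.0 * counts[prop] / total if total else 0.0)
        for prop in properties
    }


# Hypothetical sample: item key -> set of property IDs present on that item.
sample_items = {
    "item_a": {"P31", "P170", "P18"},
    "item_b": {"P31", "P170"},
    "item_c": {"P31"},
    "item_d": {"P31", "P18"},
}

report = property_completeness(sample_items, ["P31", "P170", "P18"])
for prop, percent in report.items():
    print(f"{prop}: {percent:.0f}% complete")
```

In practice the item-to-properties mapping would come from a query over the collection rather than a hand-written dictionary.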
00:25:39.568 --> 00:25:41.926 Also another thing they'd like to see 00:25:41.926 --> 00:25:46.879 is what are the most frequent creators of content or Met artworks-- 00:25:46.879 --> 00:25:49.193 what are the most commonly depicted things. 00:25:49.193 --> 00:25:51.982 So, these are very easy to generate in SPARQL, 00:25:51.982 --> 00:25:54.622 you could look at it right there, using bubble graphs. 00:25:54.673 --> 00:25:56.991 Then place of birth of the most prominent artists, 00:25:56.991 --> 00:25:58.814 we have a chart there, as well. 00:25:58.814 --> 00:26:01.142 So, structured data on Commons. 00:26:01.142 --> 00:26:04.301 I just want to show you very briefly in case you can't get to Sandra's session, 00:26:04.301 --> 00:26:06.226 but you definitely should go to Sandra's session. 00:26:06.226 --> 00:26:10.693 You actually can search in Commons for a specific Wikibase statement. 00:26:11.353 --> 00:26:15.333 I don't always remember the syntax, but you have to burn it into your brain 00:26:15.333 --> 00:26:19.893 and say, it's haswbstatement:P1343= 00:26:19.893 --> 00:26:22.695 whatever-- basically, your last two parts of the triple. 00:26:22.695 --> 00:26:26.162 I always get haswb and wbhas mixed up. 00:26:26.162 --> 00:26:28.183 I always get the colon and the equals mixed up. 00:26:28.183 --> 00:26:32.022 So just do it once, remember it, and you'll get the hang of it. 00:26:32.022 --> 00:26:34.772 But simple searches are much faster than SPARQL queries. 00:26:34.772 --> 00:26:36.478 So, if you can just look for one statement, 00:26:36.478 --> 00:26:38.392 boom, you'll get the results. 00:26:39.181 --> 00:26:43.711 So, things like this, you can look for symbolically or semantically, 00:26:43.711 --> 00:26:47.511 things that depict the Met museum, for example. 00:26:48.051 --> 00:26:50.051 So, finally, community campaigns. 00:26:50.051 --> 00:26:51.681 Richard has been a pioneer in this area.
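Since the haswbstatement syntax is easy to mix up, here is a small sketch that assembles the search term and a Commons search URL. The helper names are made up for illustration, and the example pairs P180 (depicts) with Q160236, which is assumed here to be the Met's item. Nothing is sent over the network; the code only builds strings.

```python
from urllib.parse import urlencode

COMMONS_SEARCH = "https://commons.wikimedia.org/w/index.php"


def haswbstatement(prop, value=None):
    """Build the keyword: colon after the keyword, equals before the value."""
    if value is None:
        return f"haswbstatement:{prop}"  # matches any value for the property
    return f"haswbstatement:{prop}={value}"


def commons_search_url(query):
    """Return a Commons search URL with the query percent-encoded."""
    return f"{COMMONS_SEARCH}?{urlencode({'search': query})}"


# Files with a "depicts" (P180) statement pointing at the Met's item.
term = haswbstatement("P180", "Q160236")
url = commons_search_url(term)
print(term)
print(url)
```

The same term can be typed straight into the Commons search box; the URL form is only convenient for linking or scripting.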
00:26:51.681 --> 00:26:54.071 So, once you have the Wikidata items, 00:26:54.071 --> 00:26:57.050 they can actually assist in creating Wikipedia articles. 00:26:57.050 --> 00:26:59.785 So, Richard, why don't you tell us a little bit about the Mbabel tool 00:26:59.785 --> 00:27:01.009 that you created for this. 00:27:01.009 --> 00:27:03.192 (Richard) Hi, can I get this on? 00:27:04.649 --> 00:27:06.109 (Andrew) Oh, use [Joisey's]. 00:27:06.109 --> 00:27:08.319 (Richard) It's on, now. I'm good. 00:27:08.949 --> 00:27:10.769 So, we had all this information on Wikidata. 00:27:10.769 --> 00:27:13.729 [inaudible] browsing data on our evenings and weekends 00:27:13.729 --> 00:27:15.649 to learn about art-- not everyone does. 00:27:15.649 --> 00:27:19.319 We have quite a bit more people [inaudible] Wikipedia, 00:27:19.319 --> 00:27:22.260 so how do we get this information from Wikidata to Wikipedia? 00:27:22.260 --> 00:27:25.289 One of the ways of doing this is this so-called Mbabel, 00:27:25.289 --> 00:27:28.069 which developed with the help of a lot of people in [inaudible]. 00:27:28.069 --> 00:27:30.639 People like Martin and others. 00:27:31.689 --> 00:27:34.659 So, basically to take some basic art information, 00:27:34.659 --> 00:27:37.688 and use it to populate a Wikipedia article. 00:27:37.688 --> 00:27:40.241 So, by who created this work, who was the artist, 00:27:40.241 --> 00:27:42.313 when it was created, et cetera. 00:27:42.313 --> 00:27:44.626 The nice thing about this is it can generate works. 00:27:44.626 --> 00:27:46.210 We started with English Wikipedia, 00:27:46.210 --> 00:27:48.608 but it's been developed in other languages. 00:27:48.608 --> 00:27:50.938 So, Portuguese Wikipedia, our Brazilian friends 00:27:50.938 --> 00:27:53.508 who've done a lot of work and taking it to realms beyond art, 00:27:53.508 --> 00:27:57.283 to stuff like elections and political work as well. 
00:27:57.283 --> 00:28:01.128 And the nice thing about this is we can query on Wikidata-- 00:28:01.758 --> 00:28:06.928 so different artists-- so for example, we've done projects with Women in Red, 00:28:06.928 --> 00:28:08.472 looking at women artists. 00:28:08.472 --> 00:28:12.753 Projects related to Wiki Loves Pride, looking at LGBT-identified artists, 00:28:12.753 --> 00:28:14.073 African Diaspora Artists, 00:28:14.073 --> 00:28:16.493 and a lot of different groups and things of time periods, 00:28:16.493 --> 00:28:19.293 different collections, and also looking at articles 00:28:19.293 --> 00:28:22.213 that have been and haven't been translated to different languages. 00:28:22.213 --> 00:28:24.923 So all of the articles that haven't been translated to Arabic yet. 00:28:24.923 --> 00:28:28.329 You need to find some interesting articles maybe that are relevant to a culture 00:28:28.329 --> 00:28:30.459 that haven't been translated into that language yet. 00:28:30.459 --> 00:28:32.659 We actually have a number of works in the Met collection 00:28:32.659 --> 00:28:35.199 that are in Wikipedias that aren't in English yet, 00:28:35.199 --> 00:28:37.259 because it's a global collection. 00:28:37.769 --> 00:28:40.449 So, there are a lot of ways, and hopefully, we can spread it around 00:28:40.449 --> 00:28:44.709 of creating Wikipedia content, as well, that is driven by these Wikidata items, 00:28:44.709 --> 00:28:47.549 and that also maybe can help spread the improvement 00:28:47.549 --> 00:28:49.529 to Wikidata items, as well, in the future. 00:28:49.529 --> 00:28:52.403 (Andrew) And there's a number of folks here using Mbabel already, right? 00:28:52.403 --> 00:28:54.124 Who's using Mbabel in the room? Brazilians? 00:28:54.124 --> 00:28:58.690 And also, if [Armin] is here, we have our winner 00:28:59.165 --> 00:29:03.146 of the Wikipedia Asia Month, and Wiki Loves Pride contest. 00:29:03.146 --> 00:29:05.720 So, thank you for joining, and congratulations.
00:29:06.493 --> 00:29:09.993 We'll have another Wiki Asia Month campaign in November. 00:29:10.173 --> 00:29:13.383 The way I like to describe it [inaudible] 00:29:13.383 --> 00:29:15.443 It doesn't give you a blank page. 00:29:15.443 --> 00:29:16.863 It gives you the skeleton, 00:29:16.863 --> 00:29:18.962 which is really a much better user experience 00:29:18.962 --> 00:29:21.472 for edit-a-thons and beginners. 00:29:21.472 --> 00:29:23.526 So, it's a lot of great work that Richard has done, 00:29:23.526 --> 00:29:25.841 and people are building on it, which is awesome. 00:29:25.906 --> 00:29:29.066 (woman 3) [inaudible] for some of them, which is really nice. 00:29:29.066 --> 00:29:30.376 Yeah, exactly. 00:29:30.376 --> 00:29:32.956 (woman 3) [inaudible] 00:29:32.956 --> 00:29:35.815 Right. We should have put a URL here. 00:29:35.815 --> 00:29:38.196 (man 8) [inaudible] 00:29:38.196 --> 00:29:40.055 Oh, that's right. We have the link right here. 00:29:40.055 --> 00:29:43.725 So if you click-- this is a Listeria list, it's autogenerating all that for you. 00:29:43.725 --> 00:29:46.205 And then, you click on the red link, it'll create the skeleton, 00:29:46.205 --> 00:29:47.491 which is pretty cool. 00:29:47.491 --> 00:29:49.172 Alright, we're on the final stretch here. 00:29:49.172 --> 00:29:51.990 The tool that we're going to be announcing-- 00:29:51.990 --> 00:29:55.047 well, we announced a few weeks ago, but only to a small set of folks, 00:29:55.047 --> 00:29:57.038 but we're making a big splash here, 00:29:57.038 --> 00:29:59.345 is the depiction tool that we just created. 00:29:59.345 --> 00:30:05.298 Wikipedia has shown that volunteer contributors can add a lot of these things 00:30:05.298 --> 00:30:06.681 that museums can't. 00:30:06.681 --> 00:30:10.263 So, what if we created a tool that could let you enrich 00:30:10.263 --> 00:30:15.907 the metadata about artworks in terms of the depiction information? 
00:30:15.907 --> 00:30:19.477 And what we did was we applied for a grant from the Knight Foundation, 00:30:19.477 --> 00:30:22.684 and we created this tool-- and is Edward here? 00:30:22.760 --> 00:30:26.590 Edward is our wonderful developer who in like a month, said, 00:30:26.590 --> 00:30:28.050 "Okay, here's a prototype." 00:30:28.050 --> 00:30:33.103 After we gave him a specification, and it's pretty cool. 00:30:33.900 --> 00:30:35.849 - So what we can do-- - (applause) 00:30:35.849 --> 00:30:37.169 Thanks, Edward. 00:30:37.569 --> 00:30:39.269 We're working within collections of items. 00:30:39.269 --> 00:30:41.629 So, what we do, is we can bring up a page like this. 00:30:41.629 --> 00:30:44.789 It's no longer looking at a Wikidata item with a tiny picture. 00:30:44.789 --> 00:30:48.484 If we're working with what's depicted in the image, we want the picture big. 00:30:48.484 --> 00:30:51.201 And we don't really have tools that work with big images. 00:30:51.201 --> 00:30:53.348 We have tools that deal with lexical and typing. 00:30:53.348 --> 00:30:56.715 So one of the big things that Edward did was made a big version of the picture, 00:30:56.715 --> 00:30:58.739 scrape whatever you can from the object page 00:30:58.739 --> 00:31:00.633 from a GLAM organization, give you context. 00:31:00.633 --> 00:31:02.773 I can see dogs, children, wigwam. 00:31:02.773 --> 00:31:05.782 These are things that direct the user to add meaningful information. 00:31:05.782 --> 00:31:09.024 You have some metadata that's scraped from the site, too. 00:31:09.024 --> 00:31:11.868 Teepee, Comanche-- oh, it's Comanche, not Navajo, 00:31:11.868 --> 00:31:13.556 because I know the object page said that. 00:31:13.556 --> 00:31:15.702 And you can actually start typing in the field, there. 
00:31:15.702 --> 00:31:17.628 And the cool thing is that it gives you context. 00:31:17.628 --> 00:31:19.566 It doesn't just match anything to Wikidata, 00:31:19.566 --> 00:31:23.107 it first matches things that have already been used in other depiction statements. 00:31:23.107 --> 00:31:25.456 Very simple thing, but what a godsend it is 00:31:25.456 --> 00:31:27.166 for folks who have tried this in the past. 00:31:27.166 --> 00:31:29.116 Don't give me everything that matches teepee. 00:31:29.116 --> 00:31:33.321 Show me what other paintings have used teepee in the past. 00:31:33.355 --> 00:31:36.175 So, it's interactive, context-driven, statistics-driven, 00:31:36.175 --> 00:31:37.936 by showing you what has matched before. 00:31:37.936 --> 00:31:40.336 And the cool thing is once you're done with that painting, 00:31:40.336 --> 00:31:42.196 you can start to work in other areas. 00:31:42.196 --> 00:31:44.936 You want to work within the same artist, the collection, location, 00:31:45.876 --> 00:31:47.295 other criteria here. 00:31:47.295 --> 00:31:49.146 And you can even browse through the collections 00:31:49.146 --> 00:31:51.582 of different organizations, just work on their paintings. 00:31:51.582 --> 00:31:53.670 So, we wanted people to not live in Wikidata-- 00:31:53.670 --> 00:31:56.307 kind of onesy-twosies with items, but live in a space 00:31:56.307 --> 00:31:59.232 where you're looking at artworks in collections that make sense. 00:31:59.683 --> 00:32:01.792 And then, you can actually look through it visually. 00:32:01.792 --> 00:32:04.237 It kind of looks like Crotos or these other tools, 00:32:04.237 --> 00:32:07.726 but you can actually live edit on Wikidata at the same time. 00:32:07.726 --> 00:32:09.104 So, go ahead and try it out. 00:32:09.104 --> 00:32:10.609 We've only had 14 users, 00:32:10.609 --> 00:32:14.667 but we've had 2,100 paintings worked on, with 5,000 plus depict statements. 00:32:14.667 --> 00:32:16.126 That's pretty good for 14.
00:32:16.126 --> 00:32:18.119 So, multiply that by 10-- 00:32:18.119 --> 00:32:20.515 imagine how many more things we could do with that. 00:32:20.515 --> 00:32:23.797 So, you can go ahead and go to art.wikidata.link and try out the tool. 00:32:23.797 --> 00:32:26.594 It uses OAuth authentication, and you're off to the races. 00:32:26.594 --> 00:32:29.187 And it should be very natural without any kind of training 00:32:29.187 --> 00:32:31.782 to add depiction statements to artworks. 00:32:31.837 --> 00:32:35.170 But you can put any object. We don't restrict the object right now. 00:32:35.170 --> 00:32:37.278 So, you could put any Q number 00:32:38.468 --> 00:32:41.208 to edit this content if you want. 00:32:41.275 --> 00:32:44.645 But we primarily stick with paintings and 2D artworks, right now. 00:32:46.184 --> 00:32:49.405 Okay. You can actually look at the recent changes 00:32:49.405 --> 00:32:52.175 and see who's made edits recently to that. 00:32:52.815 --> 00:32:54.855 Okay? Okay, so we're going to wind it down. 00:32:54.855 --> 00:32:58.386 Ooh, one minute, then we'll do some Q&A. 00:32:58.915 --> 00:33:03.081 So, the final thing that I think is useful for museum types especially, 00:33:03.081 --> 00:33:07.307 is there's a very famous author named Nina Simon in the museum world, 00:33:07.307 --> 00:33:11.204 where she likes to talk about how do we go from users, 00:33:11.204 --> 00:33:14.968 or I guess your audience, contributing stuff to your collections 00:33:14.968 --> 00:33:18.004 to collaborating around content, to actually being co-creative 00:33:18.004 --> 00:33:19.714 and creating new things. 00:33:19.714 --> 00:33:20.984 And that's always been tough. 00:33:20.984 --> 00:33:24.154 And I'd like to argue that Wikidata is this co-creative level. 00:33:24.154 --> 00:33:26.914 So, it's not just uploading a file to Commons, 00:33:26.914 --> 00:33:28.234 which is contributing something.
00:33:28.234 --> 00:33:31.194 It's not just editing an article with someone else, which is collaborative. 00:33:31.194 --> 00:33:34.833 But we are now seeing these tools that let you make timelines, 00:33:34.833 --> 00:33:36.133 and graphs, and bubble charts. 00:33:36.133 --> 00:33:38.833 And this is actually the co-creative part that's really interesting. 00:33:38.833 --> 00:33:40.353 And that's what Wikidata provides you. 00:33:40.353 --> 00:33:42.235 Because suddenly, it's not language dependent-- 00:33:42.235 --> 00:33:45.146 we've got this database that's got this rich information in it. 00:33:45.946 --> 00:33:48.606 So, it's not just pictures, not just text, 00:33:48.606 --> 00:33:50.522 but it's all this rich multimedia 00:33:50.522 --> 00:33:52.607 that we have the opportunity to work on. 00:33:52.607 --> 00:33:55.851 So, this is just another example of this connected graph 00:33:55.851 --> 00:33:57.389 that you can take a look at later on 00:33:57.389 --> 00:33:59.860 to show another example of The Death of Socrates, 00:33:59.860 --> 00:34:02.312 and the different themes around that painting. 00:34:03.252 --> 00:34:05.653 And it's really easy to make this graph yourself. 00:34:05.653 --> 00:34:08.172 So again, another scary graphic that only makes sense 00:34:08.172 --> 00:34:09.822 for Wikidata folks, like you. 00:34:09.822 --> 00:34:13.682 You just give it a list of Wikidata items, and it'll do the rest, that's it. 00:34:14.102 --> 00:34:15.662 You'll give the list. 00:34:15.705 --> 00:34:17.664 Keep all this code the same. 00:34:17.664 --> 00:34:21.364 So, fortunately, Martin and Lucas helped do all this code here. 00:34:21.364 --> 00:34:23.864 Just give it a list of items and the magic will happen. 00:34:23.864 --> 00:34:25.624 Hopefully, it won't blow up your computer, 00:34:25.624 --> 00:34:28.755 because you're putting in a reasonable number of items there. 
00:34:28.755 --> 00:34:31.593 But as long as you have the screen space, it'll draw the graph, 00:34:31.593 --> 00:34:33.283 which is pretty darn cool. 00:34:33.283 --> 00:34:37.223 And then, finally, two tools-- I realized at 2 a.m. last night 00:34:37.223 --> 00:34:39.744 a few people said, "I didn't know about these tools." 00:34:39.744 --> 00:34:41.343 And you should know about these tools. 00:34:41.343 --> 00:34:44.613 So, one is Recoin, which shows you the relative completeness of an item 00:34:44.613 --> 00:34:46.773 compared to other items of the same instance. 00:34:46.773 --> 00:34:49.473 And then, Cradle, which is a way to have a forms-based way 00:34:49.473 --> 00:34:50.693 to create content. 00:34:50.693 --> 00:34:52.453 So, these are very useful for edit-a-thons 00:34:52.453 --> 00:34:54.753 where if you know that you're working with just artworks, 00:34:54.753 --> 00:34:57.553 don't just let people create items with a blank screen. 00:34:57.553 --> 00:35:00.275 Give them a form to fill out to start entering in information 00:35:00.275 --> 00:35:01.818 that's structured. 00:35:01.818 --> 00:35:04.588 And then, finally, we've gone through some of this, already. 00:35:06.268 --> 00:35:09.539 This is my big chart that I love to get people's feedback on. 00:35:09.539 --> 00:35:14.296 How do we get people across the chasm to be in this space? 00:35:14.328 --> 00:35:16.839 We have a lot of folks who, now, can do template coding, 00:35:16.839 --> 00:35:20.040 spreadsheets, QuickStatements, SPARQL queries, and then we got-- 00:35:20.935 --> 00:35:24.259 how do we get people to this side where we have Python 00:35:24.259 --> 00:35:26.694 and the things that can do more sophisticated editing. 00:35:26.694 --> 00:35:28.625 It's really hard to get people across this. 00:35:28.625 --> 00:35:30.785 But I would like to say it's hard to get people across, 00:35:30.785 --> 00:35:32.847 but the content and the technology is not that hard. 
00:35:32.847 --> 00:35:35.380 We actually need more people to learn about regular expressions. 00:35:35.380 --> 00:35:38.307 And once you get some kind of experience here, 00:35:38.307 --> 00:35:41.830 you'll find that this is a wonderful world that you can learn a lot in, 00:35:41.830 --> 00:35:44.700 but it does take some time to get across this chasm. 00:35:44.829 --> 00:35:46.289 Yes, James. 00:35:46.289 --> 00:35:52.148 (James) [inaudible] 00:35:53.127 --> 00:35:57.192 No, what it means is that the graph is not necessarily accurate 00:35:57.192 --> 00:35:59.178 in terms of its data points. 00:35:59.308 --> 00:36:03.427 But what it means-- I guess it's more like this is a valley. 00:36:03.786 --> 00:36:06.716 It's like we need to get people across this valley here. 00:36:06.716 --> 00:36:10.146 (woman 4) [inaudible] 00:36:10.146 --> 00:36:11.546 I would say this is the key. 00:36:11.546 --> 00:36:16.296 If we can get people who know this stuff, but can grok this stuff, 00:36:16.296 --> 00:36:17.918 it gets them to this stuff. 00:36:17.918 --> 00:36:19.668 Does that make sense? Yeah. 00:36:19.668 --> 00:36:24.155 So, my vision for the next few years, we can get better training 00:36:24.155 --> 00:36:27.516 in our community to get people from batch processing, 00:36:27.516 --> 00:36:29.847 which is pretty much what this is, to kind of intelligent-- 00:36:29.847 --> 00:36:32.726 I wouldn't say intelligent, but more sophisticated programming, 00:36:32.726 --> 00:36:35.486 that would be a great thing, because we're seeing this is a bottleneck 00:36:35.486 --> 00:36:37.846 to a lot of the stuff that I just showed you up there. 00:36:37.846 --> 00:36:39.086 Yes. 00:36:39.135 --> 00:36:42.105 (man 9) [inaudible] 00:36:42.105 --> 00:36:45.984 Okay, wait, you want to show me something, show me after the session, does that work? 00:36:45.984 --> 00:36:47.584 Okay. Yes, Megan. 00:36:47.584 --> 00:36:50.804 - (Megan) Can I have a microphone? - Microphone, yes. 
00:36:50.834 --> 00:36:54.528 - (Megan) [inaudible] - Yeah. 00:36:55.316 --> 00:36:56.636 And we have lunch after this, 00:36:56.636 --> 00:36:59.006 so if you want to stay a little bit later, that's fine, too. 00:36:59.006 --> 00:37:01.009 - [inaudible] - We're already at lunch break? Okay. 00:37:01.009 --> 00:37:03.094 (Megan) So, thank you so much to both you and Richard 00:37:03.094 --> 00:37:04.799 for all the work you're doing at the Met. 00:37:04.799 --> 00:37:07.027 And I know that you're very well supported in that. 00:37:07.027 --> 00:37:09.100 (mic feedback) I don't know what happened there. 00:37:09.100 --> 00:37:15.071 For the average volunteer community, how do you balance doing the work 00:37:15.071 --> 00:37:19.124 for the cultural heritage organization versus training the professionals 00:37:19.124 --> 00:37:21.792 that are there to do that work? 00:37:21.792 --> 00:37:24.412 Where do you find the balance in terms of labor? 00:37:25.672 --> 00:37:26.962 It's a good question. 00:37:27.397 --> 00:37:30.467 (Megan) One that really comes up, I think, with this as well. 00:37:30.467 --> 00:37:33.158 - With this? - (Megan) Yeah, and with building out... 00:37:33.187 --> 00:37:36.277 where we put efforts in terms of building out competencies. 00:37:36.333 --> 00:37:39.398 Yeah. I don't have a great answer for you, but it's a great question. 00:37:39.398 --> 00:37:40.658 (Megan) Cool. 00:37:40.658 --> 00:37:43.580 (Richard) There are a lot of tech people at [inaudible] 00:37:43.580 --> 00:37:46.158 who understand this side of the graph, and don't understand it-- 00:37:46.158 --> 00:37:48.878 the people in [inaudible] who understand this part of the graph, 00:37:48.878 --> 00:37:50.658 and don't understand this part of the graph. 
00:37:50.658 --> 00:37:53.928 So, the more we can get Wikimedians who understand some of this, 00:37:53.928 --> 00:37:57.748 with some tech professionals at museums who understand this, 00:37:57.748 --> 00:37:59.408 then that makes it a little bit easier-- 00:37:59.408 --> 00:38:01.968 and hopefully, as well as training up Wikimedians, 00:38:01.968 --> 00:38:05.587 we can also provide some guidance and let the museums [inaudible] 00:38:05.587 --> 00:38:07.438 to take care of themselves in the [inaudible]. 00:38:07.496 --> 00:38:09.285 Yeah, that's a good point. 00:38:09.285 --> 00:38:11.961 How many people here know what regular expressions are? 00:38:11.961 --> 00:38:13.216 Raise your hand. 00:38:13.216 --> 00:38:17.397 Okay, so how many people are comfortable specifying a regular expression? 00:38:17.397 --> 00:38:19.267 So, yeah, we need more work here. 00:38:19.267 --> 00:38:20.771 (laughter) 00:38:20.771 --> 00:38:23.199 (man 10) I want to suggest that-- 00:38:24.648 --> 00:38:28.575 maybe not getting every Wikidata practitioner, 00:38:28.575 --> 00:38:33.607 or institution practitioner to embrace Python programming is the way. 00:38:33.717 --> 00:38:39.657 But as Richard just said, finding more bridging people-- people like you-- 00:38:39.657 --> 00:38:41.137 who speak both-- 00:38:41.137 --> 00:38:44.042 who speak Python, but also speak GLAM institution-- 00:38:44.812 --> 00:38:48.392 to help the GLAM's own technical department, which may not-- 00:38:49.233 --> 00:38:51.951 they know Python, they don't know this stuff. 00:38:52.640 --> 00:38:54.186 That's, I think, what's needed. 00:38:54.235 --> 00:38:59.034 People like you, people like me, people who speak both of these jargons 00:38:59.034 --> 00:39:01.835 to help make the connections, to document the connections. 00:39:01.835 --> 00:39:03.344 You're already doing this, of course. 00:39:03.344 --> 00:39:05.534 You share your code, et cetera, you're doing tutorials. 
00:39:05.534 --> 00:39:07.044 But we need more of this. 00:39:07.044 --> 00:39:09.223 I'm not sure we need to make everyone programmers. 00:39:09.223 --> 00:39:10.612 We already have programmers. 00:39:10.612 --> 00:39:12.332 We need to make them understand 00:39:12.332 --> 00:39:14.612 the non-programming material they need to-- 00:39:14.612 --> 00:39:15.782 I think that's a great point. 00:39:15.782 --> 00:39:18.062 We don't need to make everyone highly proficient in this, 00:39:18.062 --> 00:39:20.312 but we do need people knowledgeable to say that, 00:39:20.312 --> 00:39:23.004 "Yeah, we can ingest 400 thousand rows and do something with it." 00:39:23.004 --> 00:39:25.284 Whereas, if you're stuck on this side, you're like, 00:39:25.284 --> 00:39:27.444 "400 thousand rows sounds really big and scary." 00:39:27.444 --> 00:39:30.364 But if you know that it's possible, you're like, "No problem." 00:39:30.364 --> 00:39:32.284 400 thousand is not a problem. 00:39:32.284 --> 00:39:35.414 (woman 5) I would just like to chime in a little bit in that 00:39:35.414 --> 00:39:39.674 there may be countries and areas where you will not find a GLAM 00:39:39.674 --> 00:39:44.404 with any skilled technologists. 00:39:44.434 --> 00:39:47.834 So, you will have to invent something there in the middle. 00:39:48.502 --> 00:39:49.634 That's a good point. 00:39:49.778 --> 00:39:51.378 Any questions? Sandra. 00:39:55.648 --> 00:39:57.807 (Sandra) Yeah, I just wanted to add to this discussion. 00:39:57.807 --> 00:40:01.656 Actually, I've seen some very good cases where it indeed has been successful 00:40:01.656 --> 00:40:05.476 to train GLAM professionals to work with this entire environment, 00:40:05.476 --> 00:40:09.276 and where they've done fantastic jobs, also at small institutions. 00:40:10.046 --> 00:40:14.986 It also requires that you have chapters or volunteers that can train the staff. 00:40:15.163 --> 00:40:17.513 So, it's really like a bigger environment.
00:40:18.192 --> 00:40:22.044 But I think that's a model that if we can manage to make that grow, 00:40:22.044 --> 00:40:24.263 it can scale very well, I think. 00:40:24.673 --> 00:40:25.693 Good point. 00:40:25.693 --> 00:40:30.896 (woman 5) [inaudible] 00:40:32.029 --> 00:40:34.217 Sorry, just noting that we don't have 00:40:34.217 --> 00:40:37.820 any structured trainings right now for that. 00:40:38.209 --> 00:40:42.498 We might want to develop those, and that would be helpful. 00:40:42.608 --> 00:40:44.408 We have been doing that for education 00:40:44.408 --> 00:40:47.488 in terms of teaching people Wikipedia and Wikidata. 00:40:47.488 --> 00:40:50.008 It's just a matter of taking it one step further. 00:40:50.528 --> 00:40:52.168 Right. Stacy. 00:40:54.518 --> 00:40:56.988 (Stacy) Well, I'd just like to say that a lot of professionals 00:40:56.988 --> 00:41:02.006 who work in this area of metadata have all these skills already. 00:41:02.006 --> 00:41:08.966 So, I think part of it is just proving the value to these organizations, 00:41:08.966 --> 00:41:13.126 but then it's also tapping into professional associations who can-- 00:41:13.195 --> 00:41:16.745 or ways of collaborating within those professional communities 00:41:16.745 --> 00:41:21.374 to build this work, and the documentation on how to do things 00:41:21.374 --> 00:41:23.234 is really, really important, 00:41:23.234 --> 00:41:27.454 because I'm not sure about the role of depending on volunteers, 00:41:27.454 --> 00:41:32.294 when some of this work is actually work GLAM organizations do anyway. 00:41:32.395 --> 00:41:35.355 We manage our collections in a variety of ways through metadata, 00:41:35.355 --> 00:41:37.126 and this is actually one more way. 00:41:37.126 --> 00:41:40.495 So, should we also not be thinking about ways to integrate this work 00:41:40.495 --> 00:41:43.946 into a GLAM professional's regular job?
00:41:43.985 --> 00:41:46.125 And then that way you're generating-- 00:41:46.125 --> 00:41:48.885 and when you think about sustainability and scalability, 00:41:48.885 --> 00:41:53.426 that's the real trick to making this both sustainable and scalable, 00:41:53.745 --> 00:41:58.695 is that once this is the regular work of GLAM folks, 00:41:58.695 --> 00:42:00.885 we're not worried as much about this part, 00:42:00.885 --> 00:42:03.503 because it's just turning that little switch to get this 00:42:03.503 --> 00:42:05.763 to be a part of that work. 00:42:05.863 --> 00:42:08.063 Right. Good point. [Shani]? 00:42:11.603 --> 00:42:13.229 (Shani) You're absolutely right. 00:42:13.229 --> 00:42:16.122 But I want to echo what you said before. 00:42:16.152 --> 00:42:21.566 And yes, Susana-- this might work for more privileged countries 00:42:22.082 --> 00:42:25.042 where they have money, they have people doing it. 00:42:25.682 --> 00:42:29.042 It doesn't work for places that are still developing, 00:42:29.042 --> 00:42:32.282 that don't have resources-- they don't have all of that. 00:42:32.592 --> 00:42:36.832 And they can barely do what they need to do. 00:42:36.886 --> 00:42:41.066 So, it's difficult for them, and then, the community is really helpful. 00:42:41.906 --> 00:42:45.495 These are the cases where the community can have a huge impact actually, 00:42:45.985 --> 00:42:50.349 working with the GLAMs, because they can't do it all 00:42:50.979 --> 00:42:52.296 as part of their jobs. 00:42:52.834 --> 00:42:55.034 So, we need to think about that as well. 00:42:55.053 --> 00:42:58.223 And having these examples, actually, is hugely important, 00:42:58.223 --> 00:43:00.763 because it's helping to still convince them, 00:43:00.763 --> 00:43:05.842 that it's critical to invest in it and to work with volunteers, 00:43:05.842 --> 00:43:09.082 so, with non-professionals of sorts, to get there.
00:43:10.003 --> 00:43:12.650 I can imagine a future where you don't have to know all this code. 00:43:12.650 --> 00:43:14.379 These would just be kind of like Lego bricks 00:43:14.379 --> 00:43:15.801 you can slap together, 00:43:15.801 --> 00:43:18.761 saying, "Here's my database. Here's the crosswalk. Here's Wikidata," 00:43:18.761 --> 00:43:21.311 and just put it together, and you don't have to even code, 00:43:21.311 --> 00:43:23.835 you just have to make sure the databases are in the right place. 00:43:23.835 --> 00:43:25.375 Yep. Okay. 00:43:26.747 --> 00:43:28.705 (man 11) Sorry. [inaudible] 00:43:28.705 --> 00:43:34.025 I think if I would have done this project, I'd probably have done it the same way. 00:43:34.025 --> 00:43:36.146 So, I think that's maybe a good sign. 00:43:36.146 --> 00:43:39.725 I was wondering how did the whole financing work of this project? 00:43:39.725 --> 00:43:40.840 How did the-- I'm sorry? 00:43:40.840 --> 00:43:43.255 The financing of this project work. 00:43:43.795 --> 00:43:45.755 - The financing? - Yeah, the money. 00:43:46.425 --> 00:43:47.505 That's a good question. 00:43:47.505 --> 00:43:49.185 Well, so, there are different parts of it. 00:43:49.185 --> 00:43:53.073 So, the Knight grant funded the Wiki Art Depiction Explorer. 00:43:53.198 --> 00:43:56.928 But I, for the last, maybe what-- nine months-- 00:43:56.928 --> 00:43:58.768 I've been their Wikimedia strategist. 00:43:58.768 --> 00:44:01.618 So, I've been on since February of this year. 00:44:01.618 --> 00:44:04.818 So, that's pretty much they're paying for my time to help with their-- 00:44:04.818 --> 00:44:07.968 not only the upload of their collections, but developing these tools, as well. 00:44:07.968 --> 00:44:11.659 - (Richard) So the Met's paying you? - Yeah, that's right. 00:44:11.762 --> 00:44:14.894 (Richard) The grant, at least part of it has come from-- 00:44:14.894 --> 00:44:16.959 There was a grant for Open Access. 
00:44:16.959 --> 00:44:20.176 And this is under that campaign and with the digital department. 00:44:20.176 --> 00:44:24.297 So, working as contractors throughout the Open Access campaign for the Met. 00:44:27.948 --> 00:44:30.116 (man 12) I'm sorry. I guess before you were hired, 00:44:30.116 --> 00:44:31.313 and before there was a grant, 00:44:31.313 --> 00:44:33.780 there was probably a lot of volunteer work done to make sure-- 00:44:33.780 --> 00:44:35.303 Richard did a lot of work before that. 00:44:35.303 --> 00:44:37.219 And then, Wikimedia New York did a lot of work, 00:44:37.219 --> 00:44:38.927 but it was kind of in bursts. 00:44:38.927 --> 00:44:41.045 It wasn't as comprehensive as we're talking about now 00:44:41.045 --> 00:44:45.915 in terms of having-- making sure those two layers are complete 00:44:45.915 --> 00:44:47.310 in Wikidata. 00:44:48.640 --> 00:44:50.543 Alright, yeah. I think that's it. 00:44:50.543 --> 00:44:53.843 So, I'm happy to talk after lunch, or after the break, if you want. 00:44:54.683 --> 00:44:56.223 Okay. Thank you. 00:44:56.223 --> 00:44:59.197 (applause)