Thanks, folks. As I mentioned before, you can load up the slides here by either the QR code or the short URL, which is the bit.ly link, wikidatacon19glamstrategies. The slides are also on the program page on the WikidataCon site. And then, there's also an Etherpad here that you can click on.

So, I'll be talking about a lot of things that you might have heard about at Wikimania, if you were there, but we are going to go into a lot more implementation detail. Because we're at WikidataCon, we can dive deeper into the Wikidata and technical aspects. Richard and I are working at the Met Museum right now on their Open Access program. If you didn't know, a little over two years ago-- we're now entering the third year-- the Met started an Open Access strategy, releasing their images under a CC0 license along with their metadata. And one of the things they brought us on to do is to imagine what could be done with this Open Access content. So, we're going to talk a little bit about the experiments we've been running, and we'd love to hear your feedback. I hope to talk for about 20 minutes, and then get some conversation going with you folks, since we have a lot of knowledge in this room. This is the announcement-- and actually the one-year anniversary, where Katherine Maher was there at the Met to talk about that anniversary.

So, one of the things that's challenging, I think, for a lot of folks is how you explain Wikidata, and this GLAM contribution strategy to Wikidata, to C-level folks at an organization. We can talk about it with data scientists, Wikimedians, librarians, maybe curators, but when it comes to talking about this with the director of a museum or a library, how does it actually resonate with them? One way that we talk about it that I think makes sense is this: everyone knows about Wikipedia, and for the English language edition, at least, we're talking about 6 million articles. That sounds like a lot, but if you think about it, Wikipedia is not really the sum of all human knowledge; it's the sum of all reliably sourced, mostly western knowledge. And there's a lot of stuff out there. We have a lot of stuff in Commons already-- 56 million media files, growing every single day-- but Commons has a very different standard for what goes in. The way that we have described Wikidata to GLAM professionals, and especially the C-levels, is: what if we could have a repository with a notability bar that is not as high as Wikipedia's? We want all these paintings, but not every painting necessarily needs an article. Wikipedia is also held back by the fact that it's split into language editions. So, can we follow the famous idea of "things, not strings"-- be object oriented rather than lexically oriented-- and store all this in a database of facts, figures, and relationships? That's pretty much what Wikidata does. And Wikidata is also a universal crosswalk database that links to other collections out there. So, we think this really resonates with folks when you're talking about the value of Wikidata compared to what they're normally familiar with, which is just Wikipedia.

Alright, so what are the benefits? You're interlinking your collections with others. Unfortunately-- I apologize to the librarians here-- I'll be talking mostly about museums, but a lot of this is valid for libraries as well.
You're basically connecting your collection with the global collection of linked open data. You can also receive enriched and improved metadata back after contributing and linking your collections to the world. And there are some pretty neat interactive multimedia applications that you get-- I don't want to say for free, but having your collection in Wikidata allows you to visualize things that you've never seen before. We'll show you some examples.

So, how do you convey this to GLAM professionals effectively? Well, I usually like to start with storytelling, not technical explanations. Okay, so if everyone here has a cell phone, especially if you have an iPhone, I want you to scan this QR code and bring up the URL that comes up. Or if you don't have a QR scanner, just type w.wiki/Aij into a web browser. So go ahead and scan that. And what comes up? Does anyone see a knowledge graph pop up on your screen? For folks here at WikidataCon, this is probably not revolutionary. But what it does is run a SPARQL query over these objects and show the linkages between them-- we'll show a sketch of this kind of query in a moment. And you can drag them around the screen. You can click on nodes. If you're [inaudible] on a mobile, it will expand that-- you can actually start to surf through Wikidata this way.

So, for Wikidata veterans this is pretty cool. One shot, you get this. For a lot of folks who have never seen Wikidata before, this is a revolutionary moment. To hand-manipulate a knowledge graph, and to start surfing through Wikidata without having to know SPARQL, without having to know what a Q item is, without having to know what a property proposal is-- they can suddenly start seeing connections in a way that is magical. Hey, I see [Jacob's] here. Jacob's been using some of this code as well. So, this is some code that we'll talk about later on that allows you to create these visualizations from Wikidata. And we've really seen this turn a lot of heads-- people who had never gotten Wikidata before. But after seeing these interactive knowledge graphs, they get it. They understand the power of this.

And especially this example here-- this was a really big eye-opener for the folks at the Met, because the artifact at the center of this graph, right there, is the Portrait of Madame X, a very famous portrait. And they did not even know that it was the inspiration for the black dress that Rita Hayworth wore in the movie Gilda. So, just by seeing this graph, they said, "Wait a minute. This is one of our most visited portraits. I didn't know that this was true." And there are actually two other books published about that painting. You can see all these things, not just within the realm of GLAM-- it extends to fashion, it extends to literature. You're starting to see the global connections that your artworks, or your collections, have via Wikidata.

So, how do we do this? If you remember nothing else from this presentation, this one page is your one-stop shop. Fortunately, you don't have to memorize all of it. It's right there at Wikidata:Linked_open_data_workflow. We'll be talking about some of these phases: how you first prepare, reconcile, and examine what the GLAM organization has and what Wikidata has; then the tools to ingest, and to correct or enrich, that data once it's in Wikidata; and then some of the ways to reuse that content, or to report on it and create new things out of it.
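As promised, here is a minimal sketch of the kind of query behind those graph views. The seed QID and the filters are illustrative, not the actual query from the demo; pasting just the query string (with its #defaultView:Graph line) into query.wikidata.org renders the same kind of interactive graph.

```python
# Minimal sketch: pull everything a seed item links to, in the shape the
# Wikidata Query Service graph view can draw. The seed QID is illustrative.
import requests

QUERY = """
#defaultView:Graph
SELECT ?item ?itemLabel ?other ?otherLabel WHERE {
  VALUES ?item { wd:Q12345 }    # illustrative seed item, e.g. a painting
  ?item ?prop ?other .
  # keep only direct claims whose values are other Wikidata items
  FILTER(STRSTARTS(STR(?prop), "http://www.wikidata.org/prop/direct/"))
  FILTER(STRSTARTS(STR(?other), "http://www.wikidata.org/entity/Q"))
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
"""

r = requests.get(
    "https://query.wikidata.org/sparql",
    params={"query": QUERY, "format": "json"},
    headers={"User-Agent": "glam-demo/0.1 (example)"},
)
for row in r.json()["results"]["bindings"]:
    print(row["itemLabel"]["value"], "->", row["otherLabel"]["value"])
```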
So, this is the simpler version of a chart that Sandra and the GLAM folks at the Foundation have created. It tries to sum up in one shot-- because we know how hard things are to find in Wikidata-- all the different tools you should pay attention to as a GLAM organization.

So, just using the Met as an example, we started with: what is the ideal object in Wikidata that comes from the Met? This is a typical shot of a Wikidata item, in the mobile mode there, and this is one of the more famous paintings we used as a model. We have the label, description, and aliases. Then we worked out the core statements we wanted: instance of, image, inception, collection. And what are some other properties we would like if we had them? Depiction information, material used, things like that. We actually do have an identifier: the Met object ID is P3634. So, for some organizations, you might want to propose a property just to track your items using an object ID.

And then, for the Met, we tried to circumscribe which objects we want to upload and keep in Wikidata. The first things we identified were the collection highlights: a hand-selected set of roughly 1,500 to 2,000 items that were given priority for upload to Wikidata. Richard and the crew out of Wikimedia New York City did a lot of this early work, and now we're systematically going through to make sure they're all complete. And there's a secondary set, the Heilbrunn Timeline of Art History-- about 8,000 items that the Met has identified as seminal works of art throughout history-- and we're putting those on Wikidata as well, using a different designation: described by source-- Heilbrunn Timeline of Art History. A collection highlight is denoted as collection: Metropolitan Museum of Art, with the qualifier subject has role: collection highlight. And the 8,000 or so are marked like that in Wikidata.

I couldn't show this chart at Wikimania, because it's too complicated. But at WikidataCon, we can. This is something that is really hard to answer sometimes: what makes something in Wikidata a Met object, or a New York Public Library object, or an object from your organization? The answer is not easy. It depends. It's complicated; it can be multi-factor. You could say, "Well, if it has a Met object ID in Wikidata, that's a Met object." But maybe someone didn't enter that. Maybe they only put in collection: Met, which is P195. Or they put in the accession number, with collection as a qualifier on that accession number. So, there are actually one, two, three different ways to try to find Met objects. And probably the best way to do it is through a union: you combine all three, and you come back and make a list out of it. Unfortunately, there is no one clean query that will guarantee you all the Met objects; this is probably the best approach. And some other institutions are probably doing something similar right now.

Alright, so in this example, what you see here manifests itself as this in a query, which can get pretty complex. If we're looking for all the collection highlights, we break this out into the statement and then the qualifier: subject has role-- collection highlight. So, that's one way that we sort out some of these special designations in Wikidata.
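Concretely, that union can be sketched like this-- assuming P3634 (Met object ID) and P195 (collection) as named above, P217 for the accession (inventory) number, and Q160236 for the Metropolitan Museum of Art. The collection-highlight pattern appears in a comment with a placeholder QID, since the exact item isn't given here.

```python
# A sketch of the three-way union for finding Met objects described above.
# Q160236 = Metropolitan Museum of Art; P217 = inventory (accession) number.
import requests

QUERY = """
SELECT DISTINCT ?item WHERE {
  { ?item wdt:P3634 ?metObjectId . }       # way 1: has a Met object ID
  UNION
  { ?item wdt:P195 wd:Q160236 . }          # way 2: collection = the Met
  UNION
  { ?item p:P217 ?st .                     # way 3: an accession number...
    ?st pq:P195 wd:Q160236 . }             # ...with collection as its qualifier
}
"""
# For just the collection highlights, the qualifier pattern looks like:
#   ?item p:P195 ?cs . ?cs ps:P195 wd:Q160236 ; pq:P2868 wd:QXXXXXX .
# where P2868 is "subject has role" and QXXXXXX is a placeholder standing in
# for the "collection highlight" item (not the real QID).

r = requests.get("https://query.wikidata.org/sparql",
                 params={"query": QUERY, "format": "json"},
                 headers={"User-Agent": "glam-demo/0.1 (example)"})
items = [b["item"]["value"] for b in r.json()["results"]["bindings"]]
print(len(items), "candidate Met objects")
```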
So, the summary is: representing "The Met" is multifaceted, and needs to balance simplicity and findability. How many people here have heard of Sum of All Paintings as a project? Ooh, good, a lot of you! It's probably one of the most active projects that deals with these issues. We always debate whether we should model things super-accurately, or model things so that they're findable. These are somewhat at odds with each other, and we usually prefer findability. It's no good if something is perfectly modeled but no one can ever find it, because it's so strict in terms of how it's defined in Wikidata.

And then, we have some challenges. Multiple artifacts might be tied to one object ID, which might be modeled differently in Wikidata. And mapping the Met classification to instances has some complex cases: the way the Met classifies things doesn't always fit how Wikidata classifies things. So, we'll show you some examples of how this works.

This is a great example of using a Python library to ingest what we know from the Met, and then try to sort out what they have. This is just for textiles. You can see that they have a lot of detail here in terms of woven textiles, laces, printed, trimmings, velvets. We first looked into this in Wikidata, and we did not have this level of detail; we still don't have all of this resolved. And you can see that this is really complex here: "anonymous" is just not anonymous for a lot of databases. There are a lot of qualifications-- the nationality, or the century. So, trying to map all this to Wikidata can be complex as well. And then, this shows you that of all the works in the Met, about 46% are open access right now. So, we still have just over 50% that are not CC0 yet.

(man) All the objects in the Met, or all objects on display?

(Andrew) It's weird. It's not just what's on display, but it's not all objects either. It's the roughly 400 to 500 thousand objects in their database at this point. So, somewhere in between.

So, starting points. This is always a hard one. We just had this discussion on the Facebook group recently: where do people go to find out what the modeling should look like for a certain thing? It's not easy. Normally, what we have to do is just point people to, I don't know, some project that does it well now. It's not a satisfying answer, but we usually tell folks to start at things like the visual arts project, or Sum of All Paintings, which does it pretty well, or just go to the project chat to find out where some of these things are. We need better solutions for this.

This is the basic flow of what we're doing with the Met. We're taking their CSV and their API, and we're consuming them into a Python data frame. We're taking the SPARQL code-- the one that you saw before, this super union-- bringing that in, and we're doing a bi-directional diff: seeing what new things have been added here and what things have been subtracted there. And we're actually making those changes either through QuickStatements or through Pywikibot-- so, directly editing Wikidata.
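In outline, that ingest-and-diff step might look like this-- a minimal sketch assuming pandas, with illustrative file and column names rather than the real script's.

```python
# Sketch of the bi-directional diff between the Met's data and Wikidata,
# assuming pandas DataFrames keyed on the Met object ID. File and column
# names here are illustrative.
import pandas as pd

met = pd.read_csv("met_objects.csv", dtype=str)          # from the Met CSV/API
wd = pd.read_csv("wikidata_met_items.csv", dtype=str)    # from the SPARQL union above

merged = met.merge(wd, on="object_id", how="outer", indicator=True)

only_in_met = merged[merged["_merge"] == "left_only"]    # new items to create
only_in_wd = merged[merged["_merge"] == "right_only"]    # items to review

# Emit QuickStatements for the missing items: CREATE an item, then attach
# the Met object ID (P3634) to it.
for _, row in only_in_met.iterrows():
    print("CREATE")
    print(f'LAST|P3634|"{row["object_id"]}"')
```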
So, this is the big slide I also couldn't show at Wikimania, because it would have flummoxed everyone. This is a great example of how we start with the Met database, we have this crosswalk database, and then we generate the changes in Wikidata. The way it works: this is an example of one record from the Met. It's an evening dress-- we've been working with the Costume Institute recently, the one that puts on the Met Gala.

So, we have one evening dress here, by Valentina. Here's a date, an accession number. These things can be put into Wikidata directly: a field equals the date, the accession number. But what do we do with things like this? This is an object name, which is basically a classification of what the thing is-- like an instance of, for the Met. And the designer is Valentina. So, what we do is take all the unique object names and all the unique designers and run them through OpenRefine. We get maybe 60% matches if we're lucky. We put that into a spreadsheet, and then we ask volunteers, or the curators at the Met, to help fill in this crosswalk database. It's just simply Google Sheets. So, we say: here are all the unique object names, exactly as they appear in the Met database, and you fill in which Q ID each one maps to. When we first started, a lot of these failed to match and were left blank. So, we tap folks in specific groups-- there's a little Wiki Loves Fashion chat group that we have, and folks like user PKM were super useful in this area. She spent a lot of time looking through this, saying, "Okay, Evening suit is this, Ewer is that." So, we looked through and made all these mappings here.

And then, what happens now is, when we see this in the Met database, we look it up in the crosswalk database, and we say, "Oh, yeah. These are the two Q numbers we need to put into Wikidata." And then it generates the QuickStatement right there. Same thing here with Designer: Valentina. If Valentina matches here, then it gets generated with that QuickStatement right there. If Valentina does not exist, then we'll create it. You can see here, Weeks-- look at that high Q ID right there. We just created that recently, because there was no entry before. Does that make sense to everyone?

- (man 2) What's the extra statement?
- (Andrew) I'm sorry?
- (man 2) What's the extra statement?
- (Andrew) Oh, the extra statement. So, believe it or not, we have an Evening blouse, Evening dress, Evening pants, Evening ensemble, Evening hat-- do we want to make a new Wikidata item for Evening pants, Evening everything? We said no, we probably don't want to. We'll just say it's a dress, but it's also evening wear-- which is what that is. So, we're saying it's an instance of both things. I'm not sure it's the perfect solution, but it's a solution at this point. So, does everyone get that?

So, this is the crosswalk database that we maintain. And the nice thing about it is that it's just Google Sheets. So, we can get help from people who don't need to know anything about this database, don't need to know about QuickStatements, don't need to know about queries. They just go in and fill in the Q number. Yeah.

(woman) So, when you copy the object name and you find the Q ID-- the initial 60% that you mentioned as an example-- is that by exact match?

(Andrew) Well, it's through OpenRefine. So, it does its best guess, and then we verify to make sure that the OpenRefine match makes sense. Yeah. Does that make sense to everyone? Some folks might be doing some variation on this, but I think the nice thing is that, by using Google Sheets, we remove a lot of the complexity of these two areas. And we'll show you some code that does this later on.

- (man 3) How do you generate [inaudible]?
- (Andrew) How do you generate this?
- (man 3) Yes.
- (Andrew) Python code. I'll show you a line that does this. But you can also go up here.
This is the whole Python program that does this, this, and that, if you want to take a look at it. Yes?

(man 4) Did you really use your own vocabulary, or is there something [inaudible].

- (Andrew) This right here?
- (man 4) Yeah.

(Andrew) Yeah. So, this is the Met's own vocabulary. Most museums use a system called TMS-- their own collection management system. And they'll usually-- this is the museum world-- they'll usually roll their own vocabulary for their own needs. Museums are very late to interoperable metadata. Librarians and archivists have this kind of baked into them. Museums are like, "Meh..." Their primary goal is to put objects on display, and if the data plays well with other people, that's a side benefit-- it's not a primary thing that they do. That's why it's complicated to work with museums. You need to map their vocabulary, which might be a mish-mash of famous vocabularies, like the Getty AAT, and other things, but usually it serves the exact needs of their museum. And that's what's challenging. I see a lot of heads nodding, so you've probably seen this a lot at these museums. So, I'll move on to show you how this is actually done. Oh, go ahead.

(man 5) How do you bring people in to collaborate, and put the Q codes into your database?

(Andrew) How do you-- I'm sorry?

(man 5) How do you bring... collaborate people?

(Andrew) Ah. For this, we just go, for better or for worse, to projects-- like Facebook chat groups-- that we know are active in these areas. Like Sum of All Paintings, or Wiki Loves Fashion, which is a group of maybe five or seven folks. But we need a better way to get this out to people so we get more collaborators. This doesn't scale well right now, but for small groups, it works pretty well. I'm open to ideas.

(man 5) [inaudible]

(Andrew) Oh yeah. Please come on up. If folks want to come up here, there's a little more room in the aisle right here. So, we are mostly using Python for this. If you don't know, there is a Python notebook system that WMF Labs has. So, you can actually go on and start playing with this; it's pretty easy to generate a lot of stuff if you know some of the code that's there. [inaudible], yeah.

(woman 2) Why do you put everything into Wikidata, and not into your own Wikibase?

(Andrew) If you're using your own Wikibase?

(woman 2) Yeah. Why don't you use your own Wikibase, and then go to [inaudible]?

(Andrew) That's its own ball of-- I don't want to maintain my own Wikibase at this point. (laughs) If I can avoid doing the Wikibase maintenance, I would not do it.

(man 6) Would you like a Wikibase?

(Andrew) We could. It's possible.

(man 7) But again, what they use [inaudible] about 2,000, 8,000, 10,000, of 400,000 digital [inaudible]. So that's only 2.5%, [inaudible]

(Andrew) So, I'd say: solve it for 1,500, then scale up to 150 thousand. We're trying to solve it for the best-known objects first, and then--

(man 7) When do you think that will happen? I understand that those are objects that shouldn't go onto Wikidata. So you go to Commons or your own Wikibase solution, not to be a [inaudible]--

(Andrew) Right. That's why we're going with the 2,000 and the 8,000. We're pretty confident these are highly notable objects that deserve to be in Wikidata. Beyond that, it's debatable. That's why we're not vacuuming up 400 thousand things in one shot. We're starting with the notable 2,000 and the notable 8,000, and then we'll talk after that.

So, these are the two lines of code that do the most work here.
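Here is roughly what those two lines look like-- a sketch assuming pandas, with an illustrative spreadsheet URL and column headers rather than the actual ones.

```python
# Sketch of the two crosswalk lines: pull the "Object Name" sheet of the
# Google Sheets crosswalk as CSV, build a name -> QID lookup, then use it.
# The sheet ID and the exact column headers here are illustrative.
import pandas as pd

crosswalk_url = ("https://docs.google.com/spreadsheets/d/<SHEET_ID>"
                 "/gviz/tq?tqx=out:csv&sheet=Object%20Name")

# Line 1: read the crosswalk into a mapping of object name -> QID.
crosswalk = pd.read_csv(crosswalk_url).set_index("Object Name")["QID"]

# Line 2: look up the QID for the entity name found in a Met record.
qid = crosswalk.get("Evening dress")   # a QID string, or None if unmapped
print(qid)
```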
So, even if you don't know Python, it's actually not that bad if you look at this. There's a read_csv function. You take the crosswalk URL-- basically, the URL of that Google Spreadsheet-- you grab the sheet that's called "Object Name", and you create a data structure that maps the object name to the Q ID. That's it. That's all you're doing: just pulling that into the Python code. Then, you match whatever the entity's name is and look up the Q ID. Okay, so, this is just to tell you it's not super hard. The code is available right there, if you want to look at it. These two lines take a little while to write from scratch, but once you have an example, it's pretty darn easy to plug in your own data set and your own crosswalk to generate the QuickStatements. So, I've done a lot of the work already, and I invite you to steal the code and try it.

When it comes to images, it's a little more challenging. At this point, Pattypan is probably your best bet. Pattypan is a spreadsheet-oriented tool: you fill in the metadata, you point to the local file on your computer, and it uploads it to Commons with all that information. Another alternative is to set P4765-- "Commons compatible image available at URL"-- to a URL. Maarten Dammers has a bot, at least for paintings, that will just swoop through and say, "Oh, we don't have this image. Here's a Commons-compatible one. Why don't I pull it from that site and put it into Commons?" That's what his bot does. So, you can take a look at his bot and modify it for your own purposes; that's another alternative that doesn't require the spreadsheet work. You might have heard of the GLAM-Wiki Toolset-- it's effectively end-of-life at this point. It hasn't been updated, and even the folks who worked with it in the past have said Pattypan is probably your best bet. Has anyone used GWT these days? A few of you, a little bit. It's just not being further developed, and it's not compatible with a lot of the authentication protocols that we have now.

Okay. So, right now, we have basic metadata added to Wikidata, with pretty good results from the Met, and we have a Python script here to analyze that. You're welcome to steal some of that code as well. And this is what we are showing to the Met folks now: we have Listeria lists running that show all the inventory and all the information we have in Wikidata.

I'll also show you very quickly a project we ran to demonstrate one of the benefits of adding your collections to Wikidata: using an image classifier to help train a machine learning model with all the Met's images and keywords, and letting that be an engine for other folks to recognize content. This is a hack-a-thon that we had with MIT and Microsoft last year. The way it works is: we have the paintings from the Met, and we have the keywords-- the Met actually paid a crew for six months to add keyword tags by hand to all the artworks. We ingested that into an AI system right here, and then we said, "Let's feed in new images that this machine learning system has never seen before, and see what comes out." And the problem is that it comes out with pretty good results, but it's maybe only 60% accurate. And for most folks, 60% accurate is garbage.
How do I get the 60% good out of this pile of stuff? The good news is that our community knows how to do that. We can feed this into a Wikidata game and get the good stuff out of it. That's basically what we did. So, this is the Wikidata game-- you'll notice this is Magnus' interface right there-- being played at the Met Museum, in the lobby. We actually had folks at a cocktail party drinking champagne and hitting buttons on the screen. Hopefully, accurately. (chuckles) (applause) We had journalists, curators, and some board members from the Met there as well. And this was great. No log-in or anything-- (lowers voice) we created an account just for this. So, they just hit yes-no-yes-no. You saw this; it said, "Is there a tree in this picture?" You don't have to train anyone on this. You just hit yes-- depicts a tree, or not depicted. I even had my eight-year-old boys play this game with a finger tap. And we also created a little tool that showed all the depictions going by, so people could see them. It's basically: how do you sift good from bad? This is where the Wikimedia community comes in, in a way no other entity could match. In the first few months that we had this, there were over 7,000 judgments, resulting in about 5,000 edits. We did really well on tree, boat, flower, horse-- things that appear in landscape paintings. But when you go to things like distinguishing gender, or cats and dogs, not so good, I know. There are so many different types of cats and dogs in different positions. Horses are a lot easier than cats and dogs. But I should also note that the Wikimedia Foundation is now looking into doing image recognition on Commons uploads to make these suggestions as well, which is an awesome development.

Okay, so, dashboards. Let me show you some of these dashboards. The folks you work with love dashboards; they just want to see stats. So, we have them, like BaGLAMa. We have InteGraality-- is Jean-Fred here? I think this is very new relative to the last WikidataCon. We actually have a tool which will create this property completeness chart right here. It's called InteGraality, with two A's. It's on that big chart I showed you before. And it can just autogenerate how complete your items are in any set, which is really cool. We can see that paintings are by far the most complete, and then we have sculptures, drawings, photographs. And then, they also like to see: what are the most popular artworks in the Wikisphere? Just looking at the sitelinks in Wikidata, you can see and rank all these different artworks. Another thing they like to see is the most frequent creators of Met artworks, and the most commonly depicted things. These are very easy to generate in SPARQL-- you can look at it right there, using bubble graphs. Then, place of birth of the most prominent artists-- we have a chart for that as well.

So, structured data on Commons. I just want to show you this very briefly in case you can't get to Sandra's session-- but you definitely should go to Sandra's session. You can actually search in Commons for a specific Wikibase statement. I don't always remember the syntax, but you have to burn it into your brain: it's haswbstatement:P1343= whatever-- basically, the last two parts of the triple. I always get haswb and wbhas mixed up. I always get the colon and the equals mixed up. So just do it once, remember it, and you'll get the hang of it. But simple searches are much faster than SPARQL queries.
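For instance, a search for files whose structured data says "depicts the Met" can be sketched like this-- assuming P180 (depicts) and Q160236 (the Met's item); the same haswbstatement string works typed straight into the Commons search box.

```python
# Sketch: search Commons for files with the structured-data statement
# "depicts (P180) = Metropolitan Museum of Art (Q160236)" via the API.
import requests

r = requests.get(
    "https://commons.wikimedia.org/w/api.php",
    params={
        "action": "query",
        "list": "search",
        "srsearch": "haswbstatement:P180=Q160236",
        "srnamespace": "6",          # the File: namespace
        "format": "json",
    },
    headers={"User-Agent": "glam-demo/0.1 (example)"},
)
for hit in r.json()["query"]["search"][:10]:
    print(hit["title"])
```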
So, if you can just look for one statement, boom, you'll get the results. Things like this you can look for symbolically, or semantically-- things that depict the Met museum, for example.

So, finally, community campaigns. Richard has been a pioneer in this area. Once you have the Wikidata items, they can actually assist in creating Wikipedia articles. So, Richard, why don't you tell us a little bit about the Mbabel tool that you created.

(Richard) Hi, can I get this on?

(Andrew) Oh, use [Joisey's].

(Richard) It's on now. I'm good. So, we had all this information on Wikidata. [inaudible] browsing data on our evenings and weekends to learn about art-- not everyone does. We have quite a few more people [inaudible] Wikipedia, so how do we get this information from Wikidata to Wikipedia? One of the ways of doing this is the so-called Mbabel tool, which was developed with the help of a lot of people in [inaudible]-- people like Martin and others. The idea is basically to take some basic information about an artwork and use it to populate a Wikipedia article: who created the work, when it was created, et cetera. The nice thing is that this generalizes. We started with English Wikipedia, but it's been developed in other languages-- on Portuguese Wikipedia, our Brazilian friends have done a lot of work and taken it to realms beyond art, to things like elections and politics as well. And the nice thing is that we can query on Wikidata for different sets of artists. For example, we've done projects with Women in Red, looking at women artists; projects related to Wiki Loves Pride, looking at LGBT-identified artists; African Diaspora artists; and a lot of different groups, time periods, and collections. We can also look at articles that have and haven't been translated into different languages-- say, all of the articles that haven't been translated into Arabic yet. You can find interesting articles that are relevant to a culture but haven't been translated into that language yet. We actually have a number of works in the Met collection that are in Wikipedias other than English, because it's a global collection. So, there are a lot of ways-- and hopefully we can spread this around-- of creating Wikipedia content driven by these Wikidata items, which maybe can also help spread improvements back to the Wikidata items in the future.

(Andrew) And there are a number of folks here using Mbabel already, right? Who's using Mbabel in the room? Brazilians? And also, if [Armin] is here, we have our winner of the Wikipedia Asian Month and Wiki Loves Pride contests. So, thank you for joining, and congratulations. We'll have another Wikipedia Asian Month campaign in November. The way I like to describe it, [inaudible] it doesn't give you a blank page. It gives you a skeleton, which is a much better user experience for edit-a-thons and beginners. So, it's a lot of great work that Richard has done, and people are building on it, which is awesome.

(woman 3) [inaudible] for some of them, which is really nice.

(Andrew) Yeah, exactly.

(woman 3) [inaudible]

(Andrew) Right. We should have put a URL here.

(man 8) [inaudible]

(Andrew) Oh, that's right. We have the link right here. So if you click-- this is a Listeria list; it's autogenerating all of that for you. And then you click on the red link, and it'll create the skeleton, which is pretty cool. Alright, we're on the final stretch here.
The tool that we're going to be announcing-- well, we announced it a few weeks ago, but only to a small set of folks, and we're making a big splash here-- is the depiction tool that we just created. Wikipedia has shown that volunteer contributors can add a lot of things that museums can't. So, what if we created a tool that lets you enrich the metadata about artworks with depiction information? We applied for a grant from the Knight Foundation, and we created this tool. And-- is Edward here? Edward is our wonderful developer who, about a month after we gave him a specification, said, "Okay, here's a prototype." And it's pretty cool.

- So what we can do--
- (applause)

Thanks, Edward. We're working within collections of items. So, what we do is bring up a page like this. You're no longer looking at a Wikidata item with a tiny picture: if we're working with what's depicted in an image, we want the picture big. And we don't really have tools that work with big images; we have tools that deal with lexical content and typing. So one of the big things Edward did was make a big version of the picture, scrape whatever he could from the GLAM organization's object page, and give you context. I can see dogs, children, a wigwam-- these are things that direct the user to add meaningful information. You have some metadata that's scraped from the site, too. Teepee, Comanche-- oh, it's Comanche, not Navajo, because the object page said that. And you can start typing in the field there. The cool thing is that it gives you context. It doesn't just match anything in Wikidata; it first matches things that have already been used in other depiction statements. A very simple thing, but what a godsend for folks who have tried this in the past. Don't give me everything that matches "teepee"; show me what other paintings have used for "teepee" in the past. So, it's interactive, context-driven, and statistics-driven, by showing you what has been matched before.

And once you're done with that painting, you can start to work in other areas-- within the same artist, the same collection, a location, or other criteria. You can even browse through the collections of different organizations and just work on their paintings. We wanted people not to live in Wikidata, working on items onesy-twosy, but to live in a space where you're looking at artworks in collections that make sense. And you can look through it visually. It kind of looks like Crotos or these other tools, but you're live-editing Wikidata at the same time.

So, go ahead and try it out. We only have 14 users so far, but we've had 2,100 paintings worked on, with 5,000-plus depicts statements. That's pretty good for 14. Multiply that by 10 and imagine how many more things we could do. You can go to art.wikidata.link and try out the tool. It uses OAuth authentication, and you're off to the races. It should feel very natural, without any kind of training, to add depiction statements to artworks. And you can work on any object-- we don't restrict it right now, so you could put in any Q number to edit this content if you want. But we primarily stick with paintings and 2D artworks right now.
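That "show me what's been used before" ranking could be approximated with a query along these lines-- a sketch, not the tool's actual code, and the label text is illustrative: it ranks candidate items by how often each already appears as a depicts value.

```python
# Sketch: rank candidate items for a depicts suggestion by how often each
# is already used as a P180 (depicts) value elsewhere in Wikidata.
import requests

QUERY = """
SELECT ?item (COUNT(?work) AS ?uses) WHERE {
  ?item rdfs:label "teepee"@en .   # candidate items whose label matches the term
  ?work wdt:P180 ?item .           # works that already depict that item
}
GROUP BY ?item
ORDER BY DESC(?uses)
"""

r = requests.get("https://query.wikidata.org/sparql",
                 params={"query": QUERY, "format": "json"},
                 headers={"User-Agent": "glam-demo/0.1 (example)"})
for b in r.json()["results"]["bindings"]:
    print(b["item"]["value"], b["uses"]["value"])
```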
Okay. You can also look at the recent changes and see who's made edits recently. Okay? So, we're going to wind it down. Ooh, one minute, then we'll do some Q&A.

The final thing that I think is useful, for museum types especially: there's a very famous author in the museum world named Nina Simon, who likes to talk about how we go from users-- your audience-- contributing stuff to your collections, to collaborating around content, to actually being co-creative and creating new things. And that's always been tough. I'd like to argue that Wikidata is this co-creative level. It's not just uploading a file to Commons, which is contributing something. It's not just editing an article with someone else, which is collaborative. We are now seeing tools that let you make timelines, and graphs, and bubble charts, and this is the co-creative part that's really interesting. That's what Wikidata provides you. Because suddenly, it's not language-dependent-- we've got this database with all this rich information in it. So, it's not just pictures, not just text; it's all this rich multimedia that we have the opportunity to work on. This is just another example of a connected graph that you can take a look at later on, showing The Death of Socrates and the different themes around that painting.

And it's really easy to make this graph yourself. So again, another scary graphic that only makes sense for Wikidata folks like you. You just give it a list of Wikidata items, keep all the rest of the code the same, and it'll do the rest-- that's it. Fortunately, Martin and Lucas helped do all this code here. Just give it a list of items and the magic will happen. Hopefully it won't blow up your computer, because you're putting in a reasonable number of items. As long as you have the screen space, it'll draw the graph, which is pretty darn cool.

And then, finally, two tools-- I realized at 2 a.m. last night that a few people said, "I didn't know about these tools," and you should know about them. One is Recoin, which shows you the relative completeness of an item compared to other items of the same instance. The other is Cradle, which gives you a forms-based way to create items. These are very useful for edit-a-thons: if you know that you're working with just artworks, don't let people create items from a blank screen. Give them a form to fill out, so they start entering structured information.

And then, finally-- we've gone through some of this already-- this is my big chart that I love to get people's feedback on. How do we get people across the chasm to be in this space? We have a lot of folks who can do template coding, spreadsheets, QuickStatements, and SPARQL queries, and then-- how do we get people to this side, where we have Python and the things that can do more sophisticated editing? It's really hard to get people across this. But I would say that while it's hard to get people across, the content and the technology are not that hard. We actually need more people to learn about regular expressions. Once you get some experience here, you'll find that this is a wonderful world you can learn a lot in, but it does take some time to get across this chasm. Yes, James.

(James) [inaudible]

No, what it means is that the graph is not necessarily accurate in terms of its data points. It's more like: this is a valley, and we need to get people across this valley here.

(woman 4) [inaudible]

I would say this is the key.
If we can get people who know this stuff to also grok this stuff, it gets them to this stuff. Does that make sense? Yeah. So, my vision for the next few years is that we get better training in our community to take people from batch processing, which is pretty much what this is, to kind of intelligent-- I wouldn't say intelligent, but more sophisticated-- programming. That would be a great thing, because we're seeing that this is a bottleneck for a lot of the stuff I just showed you up there. Yes.

(man 9) [inaudible]

Okay, wait-- you want to show me something? Show me after the session, does that work? Okay. Yes, Megan.

- (Megan) Can I have a microphone?
- Microphone, yes.
- (Megan) [inaudible]
- Yeah. And we have lunch after this, so if you want to stay a little bit later, that's fine, too.
- [inaudible]
- We're already at lunch break? Okay.

(Megan) So, thank you so much to both you and Richard for all the work you're doing at the Met. And I know that you're very well supported in that. (mic feedback) I don't know what happened there. For the average volunteer community, how do you balance doing the work for the cultural heritage organization versus training the professionals that are there to do that work? Where do you find the balance in terms of labor?

It's a good question.

(Megan) One that really comes up, I think, with this as well.

- With this?
- (Megan) Yeah, and with building out... where we put efforts in terms of building out competencies.

Yeah. I don't have a great answer for you, but it's a great question.

(Megan) Cool.

(Richard) There are a lot of tech people at [inaudible] who understand this side of the graph and don't understand that side, and people in [inaudible] who understand this part of the graph and don't understand this part. So, the more we can get Wikimedians who understand some of this together with tech professionals at museums who understand that, the easier it gets-- and hopefully, as well as training up Wikimedians, we can also provide some guidance and let the museums [inaudible] to take care of themselves in the [inaudible].

Yeah, that's a good point. How many people here know what regular expressions are? Raise your hand. Okay, now how many people are comfortable specifying a regular expression? So, yeah, we need more work here. (laughter)

(man 10) I want to suggest that maybe getting every Wikidata practitioner, or institution practitioner, to embrace Python programming is not the way. As Richard just said, it's about finding more bridging people-- people like you-- who speak both: who speak Python, but also speak GLAM institution-- to help the GLAM's own technical department, which knows Python but doesn't know this stuff. That, I think, is what's needed: people like you, people like me, people who speak both of these jargons, to help make the connections and to document the connections. You're already doing this, of course-- you share your code, you do tutorials-- but we need more of it. I'm not sure we need to make everyone programmers. We already have programmers. We need to make them understand the non-programming material they need to--

I think that's a great point. We don't need to make everyone highly proficient in this, but we do need people knowledgeable enough to say, "Yeah, we can ingest 400 thousand rows and do something with it." Whereas, if you're stuck on this side, you think, "400 thousand rows sounds really big and scary." But if you know that it's possible, you say, "No problem."
400 thousand is not a problem.

(woman 5) I would just like to chime in a little bit: there may be countries and areas where you will not find a GLAM with any skilled technologists. So, you will have to invent something there in the middle.

That's a good point. Any questions? Sandra.

(Sandra) Yeah, I just wanted to add to this discussion. Actually, I've seen some very good cases where it has indeed been successful to train GLAM professionals to work with this entire environment, and where they've done fantastic jobs, also at small institutions. It does require that you have chapters or volunteers that can train the staff-- so it's really a bigger environment. But I think that's a model that, if we can manage to make it grow, can scale very well.

Good point.

(woman 5) [inaudible]

Sorry-- just noting that we don't have any structured trainings for that right now. We might want to develop those, and that would be helpful. We have been doing that for education, in terms of teaching people Wikipedia and Wikidata. It's just a matter of taking it one step further. Right. Stacy.

(Stacy) Well, I'd just like to say that a lot of professionals who work in this area of metadata have all these skills already. So, part of it is just proving the value to these organizations, but it's also tapping into professional associations-- finding ways of collaborating within those professional communities to build this work. And the documentation on how to do things is really, really important, because I'm not sure about depending on volunteers when some of this is actually work GLAM organizations do anyway. We manage our collections in a variety of ways through metadata, and this is one more way. So, should we not also be thinking about ways to integrate this work into a GLAM professional's regular job? When you think about sustainability and scalability, that's the real trick to making this both sustainable and scalable: once this is the regular work of GLAM folks, we're not as worried about this part, because it's just flipping that little switch to make it part of the work.

Right. Good point. [Shani]?

(Shani) You're absolutely right. But I want to echo what you said before. And yes, Susana-- this might work for more privileged countries, where they have money and people doing it. It doesn't work for places that are still developing, that don't have the resources-- they can barely do what they need to do. So, it's difficult for them, and that's where the community is really helpful. These are the cases where the community can have a huge impact, working with the GLAMs, because they can't do it all as part of their jobs. So, we need to think about that as well. And having these examples is hugely important, because it helps convince them that it's critical to invest in this and to work with volunteers-- with non-professionals of sorts-- to get there.

I can imagine a future where you don't have to know all this code. These would just be Lego bricks you can snap together: "Here's my database. Here's the crosswalk. Here's Wikidata." You just put it together, and you don't even have to code; you just have to make sure the databases are in the right place. Yep. Okay.

(man 11) Sorry. [inaudible] I think if I had done this project, I'd probably have done it the same way.
So, I think that's maybe a good sign. I was wondering: how did the financing of this project work?

- I'm sorry?
- The financing of this project.
- The financing?
- Yeah, the money.

That's a good question. Well, there are different parts to it. The Knight grant funded the Wiki Art Depiction Explorer. But for the last-- maybe what, nine months?-- I've been the Met's Wikimedia strategist. I've been on since February of this year. So, they're paying for my time to help with not only the upload of their collections, but developing these tools as well.

- (Richard) So the Met's paying you?
- Yeah, that's right.

(Richard) The grant-- at least part of it has come from-- there was a grant for Open Access, and this is under that campaign, with the digital department. So, we're working as contractors throughout the Open Access campaign for the Met.

(man 12) I'm sorry. I guess before you were hired, and before there was a grant, there was probably a lot of volunteer work done to make sure--

Richard did a lot of work before that. And Wikimedia New York City did a lot of work, but it was in bursts. It wasn't as comprehensive as what we're talking about now, in terms of making sure those two layers are complete in Wikidata. Alright, yeah. I think that's it. I'm happy to talk after lunch, or after the break, if you want. Okay. Thank you. (applause)