Well, it's almost time to begin the presentation. We will begin this last session with a presentation on WikiCite, led by Elizabeth Seiver, Simon Cobb, and Liam Wyatt. And I'll just let you introduce yourselves. Please don't hesitate to take notes on Etherpad. Thank you for everything.
Alright, let's get started. So, I'm Elizabeth Seiver. I'm the outgoing program manager for WikiCite. And I wanted to tell you all a little bit about it. Just as a show of hands, how many people are already familiar with WikiCite? That's great. I'm just glad that so many of you are. I was wondering how many people here, when they think about it, are just like, "Who are all these people putting all the citations in Wikidata and filling it up?" And WikiCite is so much more. So, we're all excited to tell you about it today.
So, what is WikiCite? The goal of WikiCite is to collect all citations for the sum of all human knowledge. You know, just a little something. And we're doing this in a number of ways. One of them is via conferences and workshops, getting together the community of people who are interested in working on citations. And it's a very diverse group of people. So, of course, we have people who are working in Wikidata, and other Wikimedians. We have librarians, people into linked open data, software engineers, data scientists, open knowledge advocates-- all coming together around linked open bibliographic data.
So, in terms of the history of WikiCite, it was founded as an initiative in 2016. And in 2018, we secured dedicated funding for events for three years. And as I mentioned, you're probably familiar with the big-- the millions of citations that we already have that are hosted on Wikidata. So, what are we doing in WikiCite and with all these citations? It's not just about collecting them. It's about using them. And it creates so many opportunities for new projects.
So, one of the things you can do with this data is build data models for bibliographic item types, which should be exciting for people who are into schemas. You can also do open cataloging and disambiguation-- sorry, my notes are not in sync with this. And people are also building tools on top of this: visualization tools, such as Scholia. If you're interested at all in open cataloging, or author disambiguation, or even just figuring out how sources link together, WikiCite is a good way to do that.
So, in terms of the direction that WikiCite is heading in, one of the things we're trying to do is expand the types of things that are cited. Right now, in Wikidata, it's mostly journal articles. We'd like to keep growing our community, especially outside of the Global North and outside of English-language publications. And I realize this is actually something that Liam will be talking about. So, what we wanted to do now is sort of a deep dive into one of the uses of Wikidata. So, for that, I would like to introduce Simon Cobb.
Hi, everyone. So, what I want to talk about is an example of something we could potentially focus on within the scope of WikiCite. And that's the data quality issues that I've been encountering over the last year, as I've been editing scholarly papers. The three issues I'm going to briefly touch on are the quality of the author items that are getting attached to scholarly articles, issues around DOI formats, and just general curation of the data that we're creating. Firstly, we look at some authors. Oh, sorry, firstly, I'll provide some context. We've got 26 million scholarly article items now. The data quality issues I'm going to talk about affect a very small proportion of these; we are generally creating quite good quality data. We have a lot of external identifiers-- 21.65 million PubMed IDs, 19 million DOIs-- and we've added 8.3 million author statements, although we still have 105.5 million author name strings to replace.
In terms of the authors, we've been creating a lot of items from ORCID IDs. We've got over half a million items with an ORCID ID now. But over 50% of those do not have any affiliation data yet. And that's not in employer or in educated at. I found 25,000 where we only have two statements. That's an ORCID ID, and an instance of human. This isn't particularly useful in terms of reuse for anyone else beyond Wikidata. If we're serious about approaching a bibliographic database and providing open data for people, we really need to be focusing on quality, I believe. So, there's a lot of work to be done. We've done really well with automatic input, but I think we need to, in the future, step back and think about how we can really make this data useful. And one of the ways to do that is by making our author items better quality by adding affiliation information, adding first names, surnames, and just moving beyond occupation researcher, trying to get what field people are working in, for example.
Moving on to DOIs. When I was looking at how many scholarly papers we have now, I immediately noticed that we have DOIs that are just four characters. And that is not a correct DOI. We've got about 110 items with this DOI format. In the grand scheme of things, not that big a problem. But that's never been a correct DOI; it's been created by an automatic process. No one's checked that and realized we had this error and corrected it. So, it's kind of an appeal I want to make to people-- if you're doing batch imports, check what you're doing, look for these obvious data quality problems.
And a final issue that I've noticed is errata. We have over 13 thousand items that are instance of errata, but they're not linked to the paper they're correcting. So, I've also produced a table of the top ten titles of the-- these are errata items. You will notice they're not particularly informative.
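As an illustration of the kind of check a batch import could run before writing the four-character DOIs mentioned above, here is a minimal sketch in Python. It is not the process any WikiCite importer actually uses. The pattern is an assumption: it only tests the general shape of a DOI (the "10." prefix, a registrant code of 4 to 9 digits with optional dotted subdivisions, a slash, and a non-empty suffix), and the function names `looks_like_doi` and `suspicious_dois` are hypothetical.

```python
import re

# Assumed, deliberately loose shape check: "10." + registrant code
# (4 to 9 digits, optionally followed by ".digits" subdivisions)
# + "/" + a non-empty suffix without whitespace.
# It does not prove the DOI is registered or resolves.
DOI_SHAPE = re.compile(r"10\.\d{4,9}(?:\.\d+)*/\S+")


def looks_like_doi(value: str) -> bool:
    """Return True if the string has the basic shape of a DOI."""
    return DOI_SHAPE.fullmatch(value.strip()) is not None


def suspicious_dois(candidates):
    """Return the values an import should hold back for manual review."""
    return [c for c in candidates if not looks_like_doi(c)]


if __name__ == "__main__":
    batch = ["10.1371/journal.pone.0000001", "1234", "10.1"]
    # Values that fail the shape check are listed instead of imported.
    print(suspicious_dois(batch))
```

A check like this only catches obviously malformed values such as a bare four-character string; confirming that a well-formed DOI points at the right paper would still need a lookup against the registration agency.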
So, at some point, we're going to have to go back and look at how we can actually get the information about what these errata are correcting, because they're not really of much use to anyone at the moment. So, in the future, I hope this is one area that we can work on as a community, and we can coordinate a bit better with what data imports we're doing, and how we can curate all our data, bring it all together, and combine our expertise. I'm going to pass over to Liam now to talk a bit about how we might be able to coordinate our efforts in the future.
Thank you. So, as mentioned in the final slide from Elizabeth, WikiCite is trying to be more and more diverse, and high quality, and more widely spread. The idea is over the next year or so, with the dedicated funding that's been provided and is available over a three-year period, which we've now entered, to change WikiCite-- the conference, of which there have been a few-- into a series of proposals from you, into what we're calling "satellite events" around the world. This will be focusing-- there'll be a call for proposals system-- like a reviewing procedure that is currently not yet invented-- for deciding on how to-- what's the word I'm after-- prioritize these requests. And see if we can't get a wider diversity of content, contributor, and topic supported under the WikiCite umbrella, through this series of satellite events. To that end, the WikiCite grant was successfully applied for and received through the work of WikiCite's father, Dario, who many of you might know from the Wikimedia Foundation. Dario no longer works with the Wikimedia Foundation, and so this grant has a-- needed a home.
What has happened is that the WikiCite steering committee, primarily made up of the organizing team from last year's WikiCite conference, will continue to oversee this work, and the Wikimedia Foundation has hired a temporary or a part-time coordinator to oversee and support that work, and to promote and receive those applications for the satellite events. And that will be me. (laughter and cheers) So, I got the call yesterday so that I could confirm that in-- among an audience which is highly relevant to that topic. Which is helpful, so I can talk to you here and now about that.
So, this is listed as a panel in the program. Even though it's a bit of a-- I think panel is a generous way of describing the three of us in this context. But the idea is we would like to hear from you your immediate thoughts-- or questions to Simon, as well, if you have questions for Simon specifically-- about what you think are good directions that should be addressed or should be attempted in this forthcoming year, either individually, online-- and things that not necessarily you can do, but that you think should be done. And specifically, to start thinking about what a satellite event would mean with relation to open citations and how the community at large would best be served by that kind of support. Beyond merely financial, what does support mean for satellite events in open citations, according to you? If you want to come back up, and we can-- Did you have a question?
(woman) Ah, yes. I do research on predatory publishing and on retractions. You only mentioned errata. So, how are you dealing with expressions of concern and retractions? And what is your policy on trying to identify predatory publishers?
Okay, so, within the scope of preparing for this, I wasn't looking at retractions, but people have been doing work on that and trying to-- we have a property-- notice of retractions-- so we can be creating those links.
I don't know to what extent that's happened. In the same way, not all the errata are linked to the paper that's being corrected. I suspect that's a similar case with-- - (woman) It's exactly the same. - Yeah. As I said, I wasn't looking at that, but we can potentially link the retraction to the retracted article, the retraction notice to the retracted article. In terms of predatory publishers, I'm not aware of anyone having done any work in this area, but I wouldn't like to say that hasn't happened. We have Charles, whose hand is going up there. Do you want to comment on predatory publishers, Charles?
(Charles) Well, I encountered this problem in the ScienceSource project. And first of all, I did what I could to put Beall's list in Wikidata format. Beall's list isn't sort of what everybody wants to be dealing with, but it was a starting point. So, that has been done, as far as I was able to. But the thing I rely on more, perhaps, is DOAJ IDs. That is, if we put all the DOAJ IDs into Wikidata, we'd have made a really good attempt to isolate the predatory publishers. And that is not the whole story, but these days, it's the bulk of the story.
(woman) [Is the directory of open access there?] - (Charles) Directory of open access, yes. - (woman) Alright, good.
(man) To start with, I just spent a year traveling around New Zealand trying to explain Wikidata to the library community, and as soon as I mentioned WikiCite, their eyes rolled, because they've just been told they have to be [up] with Wikipedia, Wiki Commons, Wikidata. Here's another Wiki project that they need to know about. "Why can't we just do it all with Wikidata?" they were saying. So, there's a public perception problem straightaway, and that's the very community that we need to have on board for this to work. I'm interested in thinking about how we are going to reach the library community, educate them, and get them integrally involved in this process. I have thoughts, but I'd like to hear your thoughts first.
- Sure, I think-- - (assistant) [This one is on.] This better? Alright. Feel like I'm in a concert. So, one of the things we've tried to do is incorporate librarians and libraries into WikiCite in everything that we do. So, on the steering committee, we have at least two librarians, if not more. And at our actual WikiCite events, one of the things that's actually pretty great about WikiCite is that we end up getting both speakers and participants who maybe are not actually involved in any Wiki projects. So, we don't have Wiki fatigue. And a lot of times, they're coming from the perspective of... "Well, I'm interested in linked open data, I love to use citations at my university, can you tell me a little bit more about how Wikidata works, and how I might use the citations that are in Wikidata?" So, I think it's very much about bringing these communities together, which might seem disparate, around these common goals for people who are really concerned about curating data, and then, people who might already know about how to do that on Wikidata.
I would say, in terms of the confusion, the complexity implied by the question of well, there's WikiCite, and there's Wikidata, and there's this... WikiCite is a brand name, it's a project-- GLAM-Wiki-- GLAM-Wiki also uses the word Wiki, but it's not pretending to be a Wiki or competing with Wikipedia and Wikidata. It's the particular focus area of reference information, "referenceable" information. Now, particularly in the context of a series of conferences that have happened over the last few years, and the conference is called WikiCite-- particularly within this community, the Wikidata core group, WikiCite is seen, known, understood as a large number of items uploaded to Wikidata about scholarly publications. That is what is understood as WikiCite by this community, mostly. I would like to-- there is a question about, could WikiCite be made into its own Wikibase of just citation stuff?
Not Wikidata, and then there's federation, and funky things like that, and you could put a lot more very specific information about individual, citable things there, which is a perfectly valid way of dealing with questions of notability and properties. But the technology for doing that is not yet relevant in any way. We need a lot more work, particularly on federation in Wikibase, to make sure everything syncs neatly. So, until such time as that would be a viable outcome, in the meantime, all of the things that would serve that kind of outcome also serve just improving the quality on Wikidata and improving the links with Wikipedia and Wikisource. The brand name is, as far as I'm concerned, irrelevant. It's just the project to make better footnotes.
(woman 2) Just a comment in relation to your query about proposals for satellite conferences-- I don't think you realize the level of ignorance about Wiki-anything in our country, in New Zealand. I mean, seriously.
As an Australian, I recognize the ignorance of New Zealanders-- (laughter)
(woman 2) Oh, [inaudible], come on! What I'm trying to say is that if we have a satellite or somehow organize a joint satellite conference, from my perspective, what I'm looking for is strategies on how to engage the community. They aren't even at the level of being-- they don't know enough to even be enthusiastic about Wikidata and WikiCite yet. They look at it with a lot of skepticism, if they're even aware of it. So I, in particular, want to be able to have a meeting in order to be able to learn from those who've already engaged more successfully with the community, to get a skill base in order to build some collaborations in New Zealand. You're talking about extra people to actually engage with. I just want the core library community to get on board, and then go the extra step. It's like I'm looking at you saying that we want to reach out to other communities, and I'm saying, I just want to reach out to a community.
You know, we're a lot further behind than where you are. So, yeah.
I would not wish to pretend that WikiCite and open bibliographic information is the be-all and end-all of Wikidata or Wikimedia outreach. It's a specific subset. And I would not wish to try and make WikiCite, a brand, appear to be overriding or replacing or somehow getting in the way of just general, good quality outreach about Wikimedia, and working with libraries, in general, and Wikidata, even more specifically. This is a subset of Wikidata. So, particularly, for WikiCite satellite events, I don't want to make it appear like there's a competition for Wiki-- so, everything about Wikidata now has to be called WikiCite-- no. This is a really quite niche-- in the scheme of things-- topic area. Supporting general awareness-raising about Wikidata, open access information, and Wikimedia is far beyond the scope of this kind of particular specialist outreach. And that's not to say that it's not a good thing, too.
(woman 2) I just perceived-- sorry, one more comment-- WikiCite as the possible inroad to the wider community, for the people we want to get on board. So, to me, WikiCite is-- yes, it's a subset, and really a much smaller set of beliefs and information, et cetera-- but I see it as an easy stepping stone to get them addicted, and then you can open it up. So, yeah.
(assistant) We have just time for one short question. So, does one of you have another question for the WikiCite team? Thank you for sharing this feedback with us. Oh, somebody has a question. (assistant) Which one of you wants to...
(woman 3) Hi, thank you so much for this. I was just wondering, is there ever going to be a pairing of the bibliography used in Wikipedia articles and WikiCite?
Are you planning to move all those references and parse them so that we can do some analyses of which references we're using in the Wikipedia articles-- and when you create an article in another language, just to get suggestions of, these are the references that have been used, kind of like that.
I know one of the short-term goals of WikiCite is to have all citations in Wikimedia projects represented in Wikidata. Currently, there's not an automatic pipeline that keeps that updated, but that's definitely one of our primary goals.
And ultimately, there is not specific support in the developer community for that kind of activity in particular. That's down to the interests of individual community members to do exports-- like all this work that's been demonstrated, that's not from the Foundation-- people doing individual work on their interests. So, that could be a good satellite event to try and explore that kind of work. Getting a good pipeline so that you can make references in Wikipedias easily hook into Wikidata items, multilingual, et cetera-- that does not yet exist technologically, and certain languages have concerns about that. The larger the Wikipedia language, the more defensive they are about using Wikidata directly. But that'll come.
Yeah, I was just going to say when Liam's finished with that-- that it's strictly citations or something that are very much within scope, and what we would like to work for, but that needs the community to build this, to take on that challenge, I think. And also, we need to be doing the outreach to the Wikipedians to show them that we can provide good quality data consistently.
(assistant) We are running out of time. So, if someone has another question, I think you can ask these nice people privately afterwards. So, it's time for us, for the last edition, and we are welcoming on stage Jean-Fred, Envel, and... (applause)