WEBVTT 00:00:06.239 --> 00:00:08.628 Willkommen, Bienvenue-- Welcome. 00:00:08.628 --> 00:00:10.782 I always wanted to say that on a stage. 00:00:10.817 --> 00:00:12.804 (laughter) 00:00:12.856 --> 00:00:14.928 This is going to be inspirational, 00:00:14.928 --> 00:00:19.057 because this is the official Wikibase inspiration panel 00:00:19.057 --> 00:00:22.543 of WikidataCon 2019. 00:00:23.839 --> 00:00:27.519 The point of this panel is to be inspired by all the things 00:00:27.519 --> 00:00:33.714 that people, in various countries, in various fields, do with Wikibase, 00:00:33.766 --> 00:00:36.034 the software behind Wikidata. 00:00:36.084 --> 00:00:39.375 I was really surprised to learn today that someone came to me and said, 00:00:39.375 --> 00:00:42.451 "I learned about Wikibase the first time today." 00:00:42.817 --> 00:00:47.073 So, it is the software that runs Wikidata. 00:00:47.073 --> 00:00:50.963 And if you want to order things in the world 00:00:50.963 --> 00:00:54.121 the way Wikidata orders things in the world, 00:00:55.101 --> 00:00:58.627 but you don't agree with the items that we have in there, 00:00:58.627 --> 00:01:02.372 because you might need a finer level of granularity, 00:01:02.372 --> 00:01:05.828 or maybe you don't want to start with Q1, which is the universe, 00:01:05.828 --> 00:01:10.197 because in your little world, Q1 could be a book, if you are a library, 00:01:10.197 --> 00:01:14.362 or it could be some kind of animal, if you work in biology, 00:01:14.362 --> 00:01:19.073 or it could be a historic person, if you do digital humanities, 00:01:19.073 --> 00:01:21.771 but you still want the same system of ordering, 00:01:21.771 --> 00:01:24.565 then Wikibase is the thing for you. 
00:01:25.395 --> 00:01:30.070 Over the last one or two years, we have made contact 00:01:30.080 --> 00:01:34.163 with extraordinary people, who are pioneers, who are trailblazing, 00:01:34.163 --> 00:01:36.641 who are evaluating Wikibase, 00:01:36.641 --> 00:01:39.920 and who are doing extremely great stuff with that. 00:01:41.216 --> 00:01:43.886 This panel is going to be very rushed. 00:01:44.372 --> 00:01:48.310 Every one of the participants of this panel would have deserved 00:01:48.310 --> 00:01:51.314 a one-hour slot to present their thing. 00:01:51.406 --> 00:01:54.007 But our program is packed. 00:01:54.414 --> 00:02:00.108 So, yeah, keep your seat belt fastened for a fast-paced ride 00:02:00.108 --> 00:02:03.829 through the inspirational world of Wikibases. 00:02:04.155 --> 00:02:09.870 And the first one is a project from two organizations, 00:02:09.870 --> 00:02:12.223 which is a little sensation in itself. 00:02:12.833 --> 00:02:16.495 The Bibliothèque nationale de France, the French National Library, 00:02:16.495 --> 00:02:22.343 and Abes, which is an authority for higher education. 00:02:22.870 --> 00:02:26.440 But I think you will talk about that more in your presentation, 00:02:26.440 --> 00:02:31.406 and yeah, we'd like to welcome Anila Angjeli and Benjamin Bober 00:02:31.406 --> 00:02:34.741 on stage for the first ten minutes of inspiration. 00:02:35.509 --> 00:02:40.768 (applause) 00:02:46.204 --> 00:02:47.339 Hi, everybody. 00:02:47.339 --> 00:02:49.372 So, yeah, my name is Benjamin Bober. 00:02:49.372 --> 00:02:51.734 So, I work for Abes, 00:02:51.734 --> 00:02:54.406 which stands for Higher Education Agency, 00:02:54.406 --> 00:02:56.437 Bibliographic Higher Education Agency. 00:02:56.437 --> 00:03:00.642 Basically, we work with all the university libraries in France, 00:03:00.642 --> 00:03:03.070 and manage the union catalog. 00:03:03.120 --> 00:03:06.362 And also their authority files. 
00:03:06.920 --> 00:03:10.353 And I'm here with Anila Angjeli, from the BnF, 00:03:10.353 --> 00:03:11.971 French National Library. 00:03:11.971 --> 00:03:16.027 And we're going to talk to you about our joint project, 00:03:17.077 --> 00:03:21.239 which is about creating a new production tool 00:03:21.239 --> 00:03:24.088 for authorities data-- 00:03:24.938 --> 00:03:28.785 person, corporate bodies, concepts, and so on. 00:03:28.785 --> 00:03:33.496 And we spent the last months 00:03:33.496 --> 00:03:37.064 asking Wikibase to do this stuff. 00:03:37.551 --> 00:03:43.931 So, I will give you some context really quickly, 00:03:45.833 --> 00:03:49.030 because it's important for us, as libraries-- 00:03:49.079 --> 00:03:54.475 There's been this technological shift recently 00:03:56.016 --> 00:03:58.051 with the linked open data movement, 00:03:58.051 --> 00:04:01.951 and we wanted, as a bibliographical agency, 00:04:01.951 --> 00:04:05.551 to follow this new trend. 00:04:06.111 --> 00:04:08.474 And, well, it's been years since we've-- 00:04:10.131 --> 00:04:12.611 experimenting with linked open data, 00:04:12.611 --> 00:04:16.215 with RDF, SPARQL and so on. 00:04:16.215 --> 00:04:21.765 But we think that now is the good time to move forward. 00:04:23.311 --> 00:04:28.313 It's also a good time because there's been a-- not a shift, 00:04:29.534 --> 00:04:31.009 there's a fundamental change 00:04:31.009 --> 00:04:36.780 in the way we consider bibliographical data. 00:04:37.712 --> 00:04:41.255 We used to, and we still have data 00:04:41.747 --> 00:04:44.803 stored in records, we call it MARC records 00:04:44.803 --> 00:04:47.801 in the library landscape. 00:04:48.444 --> 00:04:51.239 We used a specific format called MARC. 00:04:53.108 --> 00:04:56.956 But recently, there has been some way 00:04:59.431 --> 00:05:01.697 to think about it from another point of view. 
00:05:01.697 --> 00:05:06.621 And to go from a record-based world, to an entity-based world 00:05:06.621 --> 00:05:11.572 when we try to interconnect people, works, 00:05:14.129 --> 00:05:16.724 and other entities. 00:05:17.777 --> 00:05:23.844 So, in this context, we decided to launch this joint initiative. 00:05:25.639 --> 00:05:28.516 But our goal is far beyond libraries. 00:05:28.516 --> 00:05:32.461 We would like to have with us 00:05:35.519 --> 00:05:38.060 other French GLAMs, for instance, 00:05:38.060 --> 00:05:42.386 because we think our project can help them also. 00:05:44.134 --> 00:05:49.368 So basically, our project is called Fichier National d'Entités, 00:05:49.411 --> 00:05:51.232 so National Entity Files. 00:05:51.917 --> 00:05:55.961 And it will be a shared platform to collaboratively create 00:05:55.961 --> 00:05:58.652 and maintain reference data about entities. 00:05:58.652 --> 00:06:01.544 Like I said, persons, corporate bodies, places, concepts, 00:06:01.544 --> 00:06:03.206 and creative works. 00:06:03.339 --> 00:06:06.221 So, we embrace a lot of things. 00:06:06.909 --> 00:06:09.632 And it's a challenge because it's the first time 00:06:09.632 --> 00:06:15.826 BnF and Abes collaborate at such a level. 00:06:19.031 --> 00:06:22.488 Giving you a quick view about where we are-- 00:06:22.618 --> 00:06:25.129 where we've come from and where we are now. 00:06:25.129 --> 00:06:27.834 We have been working on this project since 2017. 00:06:29.178 --> 00:06:31.967 We've benchmarked other similar initiatives, 00:06:31.967 --> 00:06:33.923 and came to the conclusion last year 00:06:33.923 --> 00:06:40.687 that there was a strong interest in Wikibase as the FNE's backbone. 00:06:41.632 --> 00:06:44.886 We were considering it a good solution 00:06:44.886 --> 00:06:49.257 to build upon, but we still had doubts at this time, 00:06:50.016 --> 00:06:54.033 because we have specific needs to fulfill. 
00:06:54.683 --> 00:06:58.871 So we decided to launch, to spend this year 00:06:58.871 --> 00:07:01.910 to build a proof of concept with real data 00:07:01.910 --> 00:07:06.039 both from BnF catalog, authority catalog, and our catalogs. 00:07:06.718 --> 00:07:10.990 And well, try to merge this data into a Wikibase, 00:07:10.990 --> 00:07:13.471 and to try to see how they behave 00:07:13.471 --> 00:07:17.964 and how the tool can fulfill our needs. 00:07:18.103 --> 00:07:22.370 And we were helped in this proof of concept 00:07:22.370 --> 00:07:27.282 by Maxime and Vincent from Inventaire.io, 00:07:28.497 --> 00:07:33.145 who helped us have a better idea 00:07:33.145 --> 00:07:37.133 about what Wikibase can bring us. 00:07:37.188 --> 00:07:40.272 And Anila will talk about the first findings. 00:07:42.255 --> 00:07:46.913 So, while this decision to go 00:07:46.995 --> 00:07:49.713 with experiments with the Wikibase 00:07:49.713 --> 00:07:52.793 as the technical infrastructure backbone 00:07:52.793 --> 00:07:57.360 or the basic layer for our FNE 00:07:57.360 --> 00:08:04.100 was because it's not trivial to move from one system to another, 00:08:04.657 --> 00:08:10.170 and because the initiative of using the Wikibase 00:08:10.940 --> 00:08:15.976 as the technical infrastructure for our data-- 00:08:17.760 --> 00:08:19.262 it was both-- 00:08:20.396 --> 00:08:25.771 means that we move from our classical 00:08:26.545 --> 00:08:28.083 system information 00:08:28.083 --> 00:08:33.131 or library information system to quite another thing. 00:08:33.643 --> 00:08:36.469 And so, we needed to experiment first, 00:08:36.469 --> 00:08:41.751 and just to see whether a set of functionalities that are-- 00:08:42.439 --> 00:08:48.189 that we usually need to perform and fulfill in our environment-- 00:08:48.239 --> 00:08:49.739 professional environment. 
00:08:49.739 --> 00:08:52.946 I'm talking here about creating and maintaining, 00:08:52.946 --> 00:08:56.562 and not publishing, which is a big difference. 00:08:56.562 --> 00:08:59.685 You were at the session, the previous session, 00:08:59.685 --> 00:09:04.393 with just Wikidata Commons, 00:09:04.393 --> 00:09:06.765 contribution strategies for GLAM-- 00:09:06.765 --> 00:09:12.741 it was about publication and ways about creation in itself. 00:09:12.787 --> 00:09:16.146 So, we need to go step by step, 00:09:16.146 --> 00:09:19.955 and that's why we conducted this experiment, this proof of concept. 00:09:20.970 --> 00:09:26.726 And, good surprise, no major obstacle to ingest library data 00:09:26.726 --> 00:09:30.754 according to a specific ontology, which is, while we-- 00:09:31.159 --> 00:09:37.781 I briefly mentioned that we put their data in two different flavors of MARC, 00:09:38.552 --> 00:09:42.689 then we defined some [inaudible] properties 00:09:42.689 --> 00:09:47.233 in order to be able to experiment with merging the data, 00:09:47.233 --> 00:09:52.511 and there was no major obstacle from the technical point of view. 00:09:53.406 --> 00:09:56.569 Of course, we came up with a confirmation 00:09:56.569 --> 00:10:00.425 that Wikibase does offer built-in features 00:10:00.425 --> 00:10:05.140 that could be used as the basis for the technical infrastructure for FNE. 00:10:06.319 --> 00:10:09.000 But again, the decision is not yet made, 00:10:09.000 --> 00:10:11.637 because the experiment is still-- 00:10:12.487 --> 00:10:16.243 let's say, the developments have been completed. 00:10:16.650 --> 00:10:22.313 Now, we're in the phase of writing the final conclusions, 00:10:22.313 --> 00:10:28.774 and the decision is not yet made from the strategic point of view, 00:10:29.391 --> 00:10:34.468 but these are really the first findings we can talk about. 
00:10:34.512 --> 00:10:37.954 And Wikibase-- it appears to us 00:10:37.954 --> 00:10:43.033 that a Wikibase might be a good operational solution 00:10:43.033 --> 00:10:48.571 for managing this initiative-- that is jointly, collaboratively, 00:10:48.571 --> 00:10:51.980 create these entity, these things, 00:10:53.281 --> 00:10:56.828 to remind you of the opposition, which is things and strings. 00:10:57.834 --> 00:11:01.113 However, we noticed there are gaps. 00:11:01.118 --> 00:11:05.418 Within the specific needs of our specific institutions, 00:11:06.146 --> 00:11:12.361 there are defined communities with their own culture, practices and, 00:11:14.711 --> 00:11:20.462 well, it is certain processes that are inherent to the libraries, 00:11:21.111 --> 00:11:25.650 and the solution offered by Wikibase, for example, the search. 00:11:26.542 --> 00:11:28.929 I mean, from the professional standpoint, 00:11:28.929 --> 00:11:31.648 not only from this end-user standpoint, 00:11:31.648 --> 00:11:34.575 but professional, we need some indexes 00:11:34.575 --> 00:11:38.925 in order to ensure data quality, data curation, 00:11:38.925 --> 00:11:41.197 and it is very important for the professional, 00:11:41.197 --> 00:11:46.406 and Wikibase with its Elasticsearch 00:11:46.406 --> 00:11:48.857 and CirrusSearch doesn't offer. 00:11:48.857 --> 00:11:51.702 But still areas of investigation there. 00:11:52.229 --> 00:11:54.454 The roles-- how are the roles managed? 00:11:54.454 --> 00:11:57.248 The bureaucrat, the patrolling of-- 00:11:57.248 --> 00:12:00.861 it's not exactly what happened in our world. 00:12:01.268 --> 00:12:04.712 Although there is a layer that can be used, 00:12:04.712 --> 00:12:11.132 upon which we can build other roles that are more in compliance 00:12:11.132 --> 00:12:14.876 with our way of managing the data. 
00:12:15.649 --> 00:12:20.437 Or different constraints, constraints related to data publication, 00:12:20.842 --> 00:12:26.005 or data-- there's an error there we need to correct. 00:12:26.655 --> 00:12:29.096 Data policy-- okay, thank you. 00:12:29.702 --> 00:12:32.710 So, there are things that need to be-- 00:12:33.360 --> 00:12:38.574 other layers, bricks, need to be built upon Wikibase. 00:12:39.141 --> 00:12:42.873 And of course, one of the reasons, the major reasons, 00:12:42.873 --> 00:12:45.222 the reason why we are here with you, 00:12:45.222 --> 00:12:50.450 is that we-- we are willing, and we feel the necessity 00:12:50.450 --> 00:12:54.349 to be part of a community sharing the same concerns. 00:12:54.358 --> 00:12:59.267 And we all know, given the program, 00:12:59.320 --> 00:13:01.554 that libraries and GLAMs 00:13:01.554 --> 00:13:05.084 are heavily represented in this event. 00:13:05.896 --> 00:13:11.772 So, I think-- we think that maybe 00:13:11.772 --> 00:13:14.206 in a couple of weeks, 00:13:14.206 --> 00:13:19.082 or next year, we will be able to communicate more openly 00:13:19.082 --> 00:13:23.717 on our decision to go forward with this solution. 00:13:24.404 --> 00:13:26.163 Thank you. 00:13:26.163 --> 00:13:27.748 Thank you so much. 00:13:27.748 --> 00:13:31.155 (applause) 00:13:31.155 --> 00:13:33.547 So, we will have short presentations first, 00:13:33.547 --> 00:13:35.092 and we will all return on stage 00:13:35.092 --> 00:13:37.646 for questions, if we have the time for that. 00:13:38.296 --> 00:13:41.251 But yeah, we heard something from France. 00:13:42.757 --> 00:13:44.301 There's another project. 00:13:45.086 --> 00:13:47.980 It's not Fichier National d'Ent-- 00:13:47.980 --> 00:13:50.031 (jokingly struggles with name) 00:13:50.031 --> 00:13:51.545 But it's Gemeinsame Normdatei, 00:13:52.937 --> 00:13:56.767 the universal authority file 00:13:56.767 --> 00:13:58.224 for the German-speaking world. 
00:13:58.224 --> 00:14:03.747 And I'm so happy to have good friends of the Wikimedia movement here. 00:14:04.559 --> 00:14:09.436 Barbara Fischer and Sarah Hartmann. 00:14:11.831 --> 00:14:15.208 Thanks a lot for the invitation to talk about our project, 00:14:15.212 --> 00:14:18.006 which is called GND meets Wikibase. 00:14:18.694 --> 00:14:21.645 And it's a joint project of Wikimedia Deutschland, 00:14:21.645 --> 00:14:23.468 and the GND. 00:14:23.745 --> 00:14:25.707 And we'd like to give you a quick overview, 00:14:25.707 --> 00:14:28.781 as Jens said before, there are just 10 minutes. 00:14:29.971 --> 00:14:33.138 Why we go for that approach to evaluate Wikibase, 00:14:33.138 --> 00:14:37.153 if it fulfills the requirements for managing authority data 00:14:37.153 --> 00:14:40.434 on a collaborative level, I would say. 00:14:42.258 --> 00:14:45.660 So, where do we come from, and what's the idea of authority control? 00:14:45.660 --> 00:14:49.927 And GND, which stands for Gemeinsame Normdatei, 00:14:50.837 --> 00:14:51.838 what's the idea of it? 00:14:51.838 --> 00:14:55.623 And yeah, where do we come from, as I said before. 00:14:55.623 --> 00:14:59.307 It's not that different from what Anila and Ben said, 00:15:00.007 --> 00:15:01.649 just a few seconds ago. 00:15:02.765 --> 00:15:06.003 The GND is used for the description of resources, 00:15:06.003 --> 00:15:09.726 such as publications, and objects, for example, 00:15:09.726 --> 00:15:14.168 and in order to enable accurate data retrieval, 00:15:14.168 --> 00:15:19.080 I would say, the GND provides unambiguous and distinct entities 00:15:19.080 --> 00:15:21.390 for that retrieval. 00:15:21.837 --> 00:15:25.328 And so, there are persistent identifiers, as well, as you all know, 00:15:25.328 --> 00:15:28.654 for identification and reference for these entities. 
00:15:30.968 --> 00:15:33.972 The authority file is used by mainly libraries, 00:15:35.075 --> 00:15:37.955 we would say, in the German-speaking countries, 00:15:37.955 --> 00:15:41.477 but a few other institutions from the cultural heritage domain, 00:15:41.477 --> 00:15:45.497 are using the authority file already. 00:15:46.228 --> 00:15:52.567 And all in all there are around about 60 million records, 00:15:52.774 --> 00:15:55.242 and in Wikibase, we would say "items," 00:15:55.242 --> 00:15:58.037 which refer to persons, names of persons, 00:15:58.037 --> 00:16:01.475 corporate bodies, for example, geographic names, and works. 00:16:01.768 --> 00:16:06.522 And the GND is run cooperatively by so-called GND agencies, 00:16:06.583 --> 00:16:11.212 and at the moment, there are around about 1,000 institutions 00:16:11.212 --> 00:16:15.443 who are active users of the GND-- that means they establish new records 00:16:15.443 --> 00:16:19.999 and added records or items on a regular basis. 00:16:20.745 --> 00:16:24.204 And the most important thing, I would say, 00:16:24.204 --> 00:16:27.848 is that the GND data is provided free of charge 00:16:27.848 --> 00:16:29.520 under CC0 conditions, 00:16:29.520 --> 00:16:33.313 and that all the APIs and documentation is open as well. 00:16:34.532 --> 00:16:37.077 Yeah, talking about open-- 00:16:38.129 --> 00:16:41.613 that's the point, and the crucial one here-- 00:16:41.613 --> 00:16:45.235 at the moment, we challenge to open up the GND 00:16:45.235 --> 00:16:51.400 for other GLAM institutions and institutions from the science domain. 00:16:52.212 --> 00:16:55.972 At the moment, it's really focused on the library sector. 00:16:56.715 --> 00:17:00.243 That means that the handy tool of librarians has to evolve 00:17:01.223 --> 00:17:06.241 into a tool that is used and accepted across domains. 
00:17:06.300 --> 00:17:10.144 And that means a lot of work on organizational stuff, 00:17:10.144 --> 00:17:15.011 community building, discussions about the current data model, 00:17:15.011 --> 00:17:17.930 and infrastructural and technical issues. 00:17:17.945 --> 00:17:19.527 And, yeah. 00:17:20.581 --> 00:17:22.966 Talking about the infrastructural issues, 00:17:23.806 --> 00:17:29.165 we came up with the idea to become partners in crime 00:17:29.596 --> 00:17:34.704 with Wikibase, I would say, so have slightly the same aims, 00:17:34.704 --> 00:17:40.092 namely make cultural data more accessible and interoperable. 00:17:40.661 --> 00:17:44.964 And therefore we now evaluate the software, 00:17:44.964 --> 00:17:49.581 which was originally conceived for a sole application, Wikidata, 00:17:49.581 --> 00:17:53.311 if it's sufficient for managing authority data. 00:17:58.084 --> 00:18:00.917 Right-- hi from my side as well. 00:18:00.917 --> 00:18:05.701 We're focusing in our evaluation [inaudible] we do commonly 00:18:05.701 --> 00:18:07.450 with Wikimedia Deutschland. 00:18:08.220 --> 00:18:11.269 First of all, if Wikibase meets the requirements 00:18:11.269 --> 00:18:15.224 of GLAM institutions, galleries, libraries, archives, and museums, 00:18:15.224 --> 00:18:18.467 to drive collaboratively an authority file, 00:18:18.467 --> 00:18:20.698 which is like our basic question. 00:18:21.748 --> 00:18:25.981 We also would like to see Wikibase to increase usability 00:18:25.981 --> 00:18:29.312 as the software system we're using right now 00:18:29.312 --> 00:18:32.885 is, let's say, quite a complex software 00:18:32.885 --> 00:18:37.361 that is not as handy as you might like it to be. 00:18:39.074 --> 00:18:41.828 Well, and then, we would like to know 00:18:41.828 --> 00:18:45.914 if Wikibase would also ease both data linking 00:18:45.914 --> 00:18:48.710 and growing a diverse community. 
00:18:48.710 --> 00:18:52.429 As Sarah said before, we are right now in a process of opening up 00:18:52.429 --> 00:18:58.356 towards a broader scope of GLAM institutions, 00:18:58.356 --> 00:19:00.425 and science institutions. 00:19:00.425 --> 00:19:06.152 And of course, they are working within their own software structures, 00:19:06.152 --> 00:19:09.231 and we would like to know if Wikibase would ease 00:19:09.231 --> 00:19:12.190 the cooperation-- collaboration with us. 00:19:12.678 --> 00:19:15.390 So, why do we do that? 00:19:15.634 --> 00:19:19.200 This is because we consider that Wikibase 00:19:19.200 --> 00:19:22.239 might be the attractive community zone, 00:19:22.239 --> 00:19:25.596 which means--I had to write that down-- 00:19:26.807 --> 00:19:30.607 first of all, as it is open source, it will be more accessible 00:19:30.607 --> 00:19:35.285 than any proprietary source software system that is used 00:19:35.285 --> 00:19:39.421 in the cataloging fields of the GLAM institutions. 00:19:40.002 --> 00:19:43.114 Then, we feel that the Wikibase community 00:19:43.114 --> 00:19:46.354 already by now is a very dedicated community, 00:19:46.354 --> 00:19:50.163 and we would like to participate in that dedicated community, 00:19:50.446 --> 00:19:53.447 because we believe that sharing is caring. 00:19:53.771 --> 00:19:59.102 What we want to share is our knowledge is your knowledge, 00:19:59.144 --> 00:20:02.557 and together, in order to omit redundance, 00:20:02.557 --> 00:20:07.393 not by editing the same information over and over again, 00:20:07.393 --> 00:20:09.373 but reuse data, link it, 00:20:09.373 --> 00:20:11.559 quoting it, and enriching it. 
00:20:12.609 --> 00:20:17.474 And I placed here on the picture one of the tools 00:20:17.474 --> 00:20:22.802 that is broadly spread within Wikidata, this Histropedia, 00:20:23.332 --> 00:20:29.061 because we also feel that if we are able to introduce our data into Wikibase, 00:20:29.061 --> 00:20:34.159 we might be able to share tools, improving the code, 00:20:34.159 --> 00:20:38.181 and thus being an active, contributing part of the community. 00:20:38.232 --> 00:20:40.030 Thank you. 00:20:40.030 --> 00:20:42.671 I'd like to debate that with you later on. 00:20:43.319 --> 00:20:44.775 Thank you so much. 00:20:44.775 --> 00:20:46.354 (applause) 00:20:46.354 --> 00:20:47.938 Thank you so much. 00:20:49.885 --> 00:20:53.874 So, at some point, we ask ourselves, did we-- 00:20:56.996 --> 00:20:59.868 by accident, write a library software? 00:20:59.913 --> 00:21:05.216 Because the adoption of Wikibase in the library fields is so overwhelming. 00:21:06.434 --> 00:21:08.012 But there's more to it. 00:21:09.023 --> 00:21:13.903 And of course, we didn't accidentally write a library system. 00:21:14.353 --> 00:21:17.764 It can be used for other fields as well. 00:21:18.296 --> 00:21:19.878 For instance, for biology. 00:21:19.878 --> 00:21:23.363 And David Fichtmueller will tell us about using Wikibase 00:21:23.363 --> 00:21:25.835 as a platform for biodiversity. 00:21:26.770 --> 00:21:29.449 - I think that was grayed. - Yeah. 00:21:29.449 --> 00:21:31.835 Full screen? Oh, okay. 00:21:37.603 --> 00:21:39.758 Yes. Hello, everybody. 00:21:40.819 --> 00:21:43.383 I'm David, and I work at the Botanic Garden, 00:21:43.383 --> 00:21:45.214 Botanical Museum here in Berlin. 00:21:45.988 --> 00:21:48.065 And I work there as a computer scientist. 00:21:48.065 --> 00:21:51.194 We have an entire department called Biodiversity Informatics. 00:21:51.884 --> 00:21:53.633 Generally speaking, we write the software 00:21:53.633 --> 00:21:55.858 that biologists use in their daily work. 
00:21:56.430 --> 00:21:58.932 And on my private side, 00:21:58.932 --> 00:22:02.639 I've been a Wikipedia contributor for almost 15 years now, 00:22:02.639 --> 00:22:06.045 and Wikidata contributor for almost five years now. 00:22:06.981 --> 00:22:09.425 And also, as part of my job, 00:22:09.425 --> 00:22:12.068 I'm a co-administrator of a MediaWiki farm 00:22:12.068 --> 00:22:16.684 with more than 80 wikis regarding the biology community. 00:22:18.855 --> 00:22:22.116 And a couple of years ago, I was assigned to a project 00:22:22.556 --> 00:22:26.670 that was, yeah, about working on a standard. 00:22:26.735 --> 00:22:29.524 In particular, it's a standard called ABCD, 00:22:30.827 --> 00:22:33.135 that we needed to do some work on. 00:22:33.405 --> 00:22:37.295 And I assume most of you haven't heard about ABCD, 00:22:37.295 --> 00:22:39.728 that's not really a bad thing. 00:22:39.728 --> 00:22:41.279 It's really specific. 00:22:41.279 --> 00:22:44.128 It stands for Access to Biological Collection Data. 00:22:44.863 --> 00:22:47.292 And it's an XML schema. 00:22:47.298 --> 00:22:49.772 So, it can express biological information, 00:22:49.772 --> 00:22:54.190 particular things like information about herbarium sheets, 00:22:54.190 --> 00:22:59.920 about collections, like fish in alcohol jars, or-- 00:23:01.111 --> 00:23:02.449 but also observations-- 00:23:02.449 --> 00:23:05.165 scientists being out in the field, seeing certain plants, 00:23:05.165 --> 00:23:06.543 seeing certain animals. 00:23:06.543 --> 00:23:08.970 A lot of variety in here, and because of this, 00:23:08.970 --> 00:23:10.426 it's quite a huge standard. 00:23:10.426 --> 00:23:13.940 So, we have 1,800 different concepts in there. 00:23:14.748 --> 00:23:18.322 That's counting the different XPaths there are within the file. 00:23:20.055 --> 00:23:22.302 And so the challenge was to convert this 00:23:22.302 --> 00:23:25.234 into a new modern semantic standard. 
00:23:25.280 --> 00:23:27.271 We wanted to use an OWL ontology 00:23:27.271 --> 00:23:31.200 that is able to express the same kind of information 00:23:31.200 --> 00:23:33.951 that has previously been expressed with the XML files, 00:23:35.245 --> 00:23:38.361 and also keep all the existing documentation, 00:23:38.361 --> 00:23:41.122 and restrictions, and all of the connections 00:23:41.122 --> 00:23:42.989 between the items 00:23:42.989 --> 00:23:46.357 and have a collaborative platform 00:23:46.357 --> 00:23:50.284 where other scientists can come in and give us advice 00:23:50.284 --> 00:23:52.914 on their specific fields of focus. 00:23:52.914 --> 00:23:54.780 Did we model this correctly? 00:23:55.266 --> 00:23:56.596 Is there anything missing? 00:23:56.596 --> 00:24:00.528 So, yeah, with all of this in mind, we went looking around, 00:24:00.528 --> 00:24:03.675 and found a solution, and I guess it wouldn't surprise anybody here, 00:24:03.675 --> 00:24:06.752 it's Wikibase, otherwise I wouldn't have been talking here. 00:24:08.171 --> 00:24:10.779 So, we decided on using Wikibase. 00:24:11.266 --> 00:24:14.356 And we started to install it without the Docker Image. 00:24:15.165 --> 00:24:17.171 Big mistake. Don't do this. 00:24:17.171 --> 00:24:18.171 (laughter) 00:24:18.171 --> 00:24:21.335 In our defense, we started this two and a half years ago. 00:24:21.616 --> 00:24:24.167 And it was two years ago at the WikidataCon 00:24:24.167 --> 00:24:26.088 that the Docker Image was first released. 00:24:26.898 --> 00:24:29.828 So, we had to figure out our own way. 00:24:29.828 --> 00:24:32.265 And once we had things up and running, 00:24:32.265 --> 00:24:35.259 we didn't really want to break changing things. 
00:24:35.259 --> 00:24:39.801 We do have the Docker installed for the Query Service, 00:24:40.275 --> 00:24:43.322 and we have a weird, hybrid of custom installation 00:24:43.322 --> 00:24:46.004 and Docker installation and modified scripts 00:24:46.004 --> 00:24:48.542 connecting those two instances. 00:24:48.542 --> 00:24:51.605 We then installed QuickStatements, again, manually, 00:24:51.605 --> 00:24:57.201 because by that time, it wasn't part of the Query Service, 00:24:57.201 --> 00:25:00.361 did some slight modifications, and adjustments to get it to work. 00:25:00.888 --> 00:25:05.443 I know it's now part of the Docker Image. 00:25:05.928 --> 00:25:10.724 But yeah, we had it running, so, we didn't bother changing it. 00:25:11.574 --> 00:25:13.437 Keep this in mind for later on. 00:25:14.164 --> 00:25:15.867 But before I go into what we did, 00:25:15.867 --> 00:25:18.465 I'm going to avoid a possible confusion here, 00:25:18.465 --> 00:25:22.280 because we're talking about data standards, 00:25:22.345 --> 00:25:25.273 and when we express things in a semantic way, 00:25:25.273 --> 00:25:30.097 we will convert the concepts from the XML into Classes and Properties. 00:25:30.580 --> 00:25:33.659 So, this being Object Properties connecting the different classes, 00:25:33.659 --> 00:25:36.663 and Datatype Properties that actually contain the content, 00:25:36.663 --> 00:25:40.370 that is to store text, numbers, things like that. 00:25:41.195 --> 00:25:44.038 And we express all of this within Wikibase, 00:25:44.082 --> 00:25:46.910 but all of those are items in Wikibase. 00:25:47.597 --> 00:25:51.446 And they are then described using Wikibase Properties. 00:25:51.455 --> 00:25:54.950 So, we have ABCD properties being items being described 00:25:54.950 --> 00:25:56.657 as Wikibase Properties. 
00:25:56.657 --> 00:26:00.531 I try to make sure to use the prefixes accordingly, 00:26:00.531 --> 00:26:03.581 so you know what I'm talking about when I talk about properties 00:26:03.581 --> 00:26:04.820 in this talk. 00:26:05.746 --> 00:26:08.060 So, let's look at the properties, 00:26:08.060 --> 00:26:10.203 in particular, with Wikibase Properties. 00:26:10.215 --> 00:26:13.013 We sat down and thought, "Okay, what do we need 00:26:13.013 --> 00:26:16.296 to describe the concepts we want to model?" 00:26:16.701 --> 00:26:19.323 And we ended up using around 25 properties 00:26:19.833 --> 00:26:22.532 in addition to, of course, label, description, alias. 00:26:22.670 --> 00:26:24.452 I'm not going to mention all of them, 00:26:24.452 --> 00:26:26.314 just so you see the variety. 00:26:27.243 --> 00:26:29.846 Those fulfill our requirements. 00:26:29.846 --> 00:26:36.496 And yeah, some things express some restrictions, 00:26:36.496 --> 00:26:38.544 and others-- 00:26:38.544 --> 00:26:40.062 Most of them are optional. 00:26:40.697 --> 00:26:42.628 Only very few are mandatory. 00:26:42.921 --> 00:26:46.489 So then, we set on importing all of this information. 00:26:46.581 --> 00:26:51.082 We wrote a Schema Parser that extracts all of the different concepts. 00:26:51.082 --> 00:26:53.959 So everything that has an XPath within the XML Schema, 00:26:53.959 --> 00:26:57.121 and all of the documentation that is part of the XML schema, 00:26:57.121 --> 00:27:00.284 and so we got this into a nice CSV file, 00:27:00.284 --> 00:27:04.862 and then we could work on this and import it using QuickStatements. 00:27:05.918 --> 00:27:07.176 Worked quite well. 00:27:07.176 --> 00:27:11.157 But then, we had, as I said, 1,800-plus concepts 00:27:11.157 --> 00:27:13.272 in our Wikibase instance. 
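The import step described above (a schema parser that collects every XPath and its documentation into a CSV file, which is then loaded with QuickStatements) can be sketched roughly as follows. This is a minimal illustration under assumptions, not the speaker's actual parser: the sample schema is invented and much simpler than ABCD, the function names are hypothetical, and `P1` stands in for whatever Wikibase property would hold the XPath on a given instance.

```python
import csv
import io
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Invented miniature schema for illustration; NOT the real ABCD schema.
SAMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="DataSet">
    <xs:annotation><xs:documentation>A set of units.</xs:documentation></xs:annotation>
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Owner">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="PersonName" type="xs:string">
                <xs:annotation><xs:documentation>Name of the owner.</xs:documentation></xs:annotation>
              </xs:element>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def collect_concepts(node, path=""):
    """Yield (xpath, documentation) for every named xs:element below node."""
    for child in node:
        if child.tag == XS + "element" and "name" in child.attrib:
            xpath = f"{path}/{child.attrib['name']}"
            doc = child.find(f"{XS}annotation/{XS}documentation")
            text = (doc.text or "").strip() if doc is not None else ""
            yield xpath, text
            yield from collect_concepts(child, xpath)
        else:
            # Pass through complexType, sequence, choice, ... without
            # extending the path.
            yield from collect_concepts(child, path)


def concepts_to_csv(xsd_text):
    """Return the extracted concepts as CSV text with a header row."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["xpath", "documentation"])
    writer.writerows(collect_concepts(ET.fromstring(xsd_text)))
    return out.getvalue()


def to_quickstatements(rows, xpath_property="P1"):
    """Render rows as QuickStatements V1 commands (tab-separated).

    xpath_property is a placeholder ID; a real instance defines its own.
    """
    lines = []
    for xpath, doc in rows:
        lines.append("CREATE")
        lines.append(f'LAST\tLen\t"{xpath.rsplit("/", 1)[-1]}"')
        if doc:
            lines.append(f'LAST\tDen\t"{doc}"')
        lines.append(f'LAST\t{xpath_property}\t"{xpath}"')
    return "\n".join(lines)


ROWS = list(collect_concepts(ET.fromstring(SAMPLE_XSD)))
```

Keeping the CSV as an intermediate file, as the talk describes, leaves room for manual cleanup before anything is written to the Wikibase instance.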
00:27:13.760 --> 00:27:17.252 But then, when we had things like person-- 00:27:17.821 --> 00:27:20.366 person name, and contact email-- 00:27:20.366 --> 00:27:23.485 those appear a couple of times within the schema-- 00:27:23.485 --> 00:27:27.157 for the data set owner, for the person who took an image, things like that. 00:27:27.157 --> 00:27:29.180 So, of course, we needed to reduce those, 00:27:29.180 --> 00:27:32.013 and combine those to reusable classes. 00:27:32.064 --> 00:27:34.858 So, there was a lot of manual editing 00:27:34.858 --> 00:27:36.319 to reduce the number of concepts, 00:27:36.319 --> 00:27:39.558 and in the end, we ended up with a little more than 500. 00:27:39.965 --> 00:27:43.540 So, we have Classes, Object Properties, Datatype Properties, 00:27:43.540 --> 00:27:45.362 a couple of other ones I'm skipping 00:27:45.362 --> 00:27:47.392 to avoid additional complexity here. 00:27:48.362 --> 00:27:52.856 And for certain large-scale edits, we also used QuickStatements again. 00:27:54.686 --> 00:27:57.312 So now, we did all of the editing, 00:27:57.312 --> 00:27:59.476 now we wanted to make sure that the data we have 00:27:59.476 --> 00:28:00.775 is actually consistent. 00:28:01.101 --> 00:28:04.922 So, that's where we used what we call Maintenance Queries, 00:28:06.252 --> 00:28:09.570 used the query interface with some SPARQL queries, 00:28:09.570 --> 00:28:12.114 basically to check for missing properties, 00:28:13.250 --> 00:28:15.324 wrong links between concepts, 00:28:16.338 --> 00:28:18.761 basically, things that didn't match 00:28:18.761 --> 00:28:21.112 with our concept, with our structure. 00:28:21.840 --> 00:28:24.356 And in the end, we also had to do 00:28:24.356 --> 00:28:26.007 a manual review of all of the concepts 00:28:26.007 --> 00:28:27.875 just to make sure we didn't miss anything. 
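A maintenance query of the kind mentioned above, checking for concepts that lack an expected property, might look roughly like this. It is only a sketch: the namespaces, the property and item IDs (`P1`, `Q2`, `P3`), and the function names are all assumptions made for illustration, and the real queries used in the project are not shown in the talk.

```python
import json
import urllib.parse
import urllib.request


def missing_property_query(entity_ns, direct_ns, type_prop, type_item, required_prop):
    """Build a SPARQL query for typed items that lack a required property."""
    return f"""PREFIX wd: <{entity_ns}>
PREFIX wdt: <{direct_ns}>
SELECT ?item WHERE {{
  ?item wdt:{type_prop} wd:{type_item} .
  FILTER NOT EXISTS {{ ?item wdt:{required_prop} ?value . }}
}}
ORDER BY ?item"""


def run_query(endpoint, query):
    """Send the query to a SPARQL endpoint and return the matching item IRIs."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    request = urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )
    with urllib.request.urlopen(request) as response:
        results = json.load(response)
    return [row["item"]["value"] for row in results["results"]["bindings"]]


# Hypothetical instance: every item typed (P1) as Q2 should carry P3.
QUERY = missing_property_query(
    entity_ns="http://wikibase.example/entity/",
    direct_ns="http://wikibase.example/prop/direct/",
    type_prop="P1",
    type_item="Q2",
    required_prop="P3",
)
```

Calling `run_query` with the instance's query service URL and `QUERY` would return the IRIs of items that still need fixing; an empty list means the check passes.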
00:28:27.875 --> 00:28:29.986 This was kind of a lot of work, 00:28:29.986 --> 00:28:33.882 because if you only take like five minutes per item, 00:28:33.992 --> 00:28:35.771 multiply it by 550, 00:28:36.781 --> 00:28:39.855 it's over one week of full and concentrated work. 00:28:40.667 --> 00:28:42.732 But of course, we don't need five minutes, 00:28:42.732 --> 00:28:45.977 because you sometimes spend like half an hour to fix a certain item 00:28:45.977 --> 00:28:48.294 when there's problems with the modeling. 00:28:48.985 --> 00:28:50.895 So, we now had all of the data. 00:28:50.895 --> 00:28:53.058 Now, it was time to get the data out of Wikibase. 00:28:54.175 --> 00:28:58.236 We wrote an export script in Python that uses the Query Service 00:28:58.236 --> 00:29:01.088 to get the information about the concepts, 00:29:01.088 --> 00:29:04.706 and fill them in templates-- prepared templates. 00:29:05.234 --> 00:29:07.916 So, in the end, we get a nice valid OWL file 00:29:07.916 --> 00:29:09.787 that contains everything we need. 00:29:09.833 --> 00:29:12.788 And this is the actual basis of the standard. 00:29:12.916 --> 00:29:17.380 For future versions, when we're going to make revisions, 00:29:17.380 --> 00:29:19.651 the Wikibase is our working platform. 00:29:19.651 --> 00:29:22.697 And once we do an export, this is the new version of the standard. 00:29:22.750 --> 00:29:25.102 Keeping those separate, this would also allow us 00:29:25.102 --> 00:29:29.116 to move the server to a different instance, 00:29:29.116 --> 00:29:32.796 or as I said, change the installation. 00:29:32.887 --> 00:29:35.963 We export JSON for the documentation of the website. 00:29:36.771 --> 00:29:40.962 And we also export the data to a second Wikibase instance. 00:29:41.409 --> 00:29:43.196 This is like really experimental, right now. 
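Editor's sketch: the export step described above, taking rows returned by the Query Service and filling them into prepared templates, might be approximated as follows. This is an illustration under stated assumptions, not the speakers' script. The template text, the row keys (`iri`, `label`, `comment`), and the sample row are invented here, and the rows are hard-coded instead of being fetched from a live endpoint.

```python
from string import Template
from xml.sax.saxutils import escape

# Prepared template for one OWL class (hypothetical layout).
CLASS_TEMPLATE = Template(
    '<owl:Class rdf:about="$iri">\n'
    '  <rdfs:label xml:lang="en">$label</rdfs:label>\n'
    '  <rdfs:comment xml:lang="en">$comment</rdfs:comment>\n'
    '</owl:Class>'
)


def render_class(row):
    """Fill the template with one query result row, escaping XML characters."""
    return CLASS_TEMPLATE.substitute(
        iri=escape(row["iri"], {'"': "&quot;"}),
        label=escape(row["label"]),
        comment=escape(row["comment"]),
    )


# Stand-in for rows that would come back from the Query Service (assumption).
SAMPLE_ROWS = [
    {
        "iri": "https://example.org/terms/Person",
        "label": "Person",
        "comment": "Name & contact details of a person.",
    },
]

if __name__ == "__main__":
    for row in SAMPLE_ROWS:
        print(render_class(row))
```

Keeping the templates separate from the data matches the point made in the talk: the Wikibase stays the working platform, and each export produces a new version of the standard.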
00:29:43.196 --> 00:29:46.682 We haven't really used this in production where it can-- 00:29:46.682 --> 00:29:49.483 where the concepts can then be used to describe actual data. 00:29:49.483 --> 00:29:51.422 So we're breaking down those-- 00:29:52.189 --> 00:29:56.402 we're taking them a step down from properties being Wikibase items, 00:29:56.407 --> 00:29:59.318 and converting them into actual Wikibase properties. 00:29:59.761 --> 00:30:02.522 This is quite a lot of requests-- quite a lot of steps 00:30:02.522 --> 00:30:05.203 to keep all of the data and all of the linking consistent, 00:30:05.203 --> 00:30:06.669 but it works. 00:30:06.669 --> 00:30:08.865 And in the end, well, it was quite successful. 00:30:09.705 --> 00:30:11.703 There is a huge community-- 00:30:11.949 --> 00:30:14.909 there is a community about Biodiversity Information Standards, 00:30:14.909 --> 00:30:18.449 who also had their annual meeting just in the past days. 00:30:18.729 --> 00:30:21.589 So, there's a huge interest in reusing this approach 00:30:21.604 --> 00:30:23.385 for other standards, as well. 00:30:23.524 --> 00:30:25.183 And so, in the future, 00:30:25.183 --> 00:30:28.257 we want to try a bit about Shape Expressions-- 00:30:28.257 --> 00:30:31.110 as I said, we have some restrictions in there to export them-- 00:30:31.754 --> 00:30:35.160 and build some better workflows for the versioning. 00:30:35.160 --> 00:30:36.873 We haven't done this yet. 00:30:36.919 --> 00:30:38.908 And switch up the Docker instance. 00:30:39.398 --> 00:30:41.676 So, at the end, I'm going to have a small wish list-- 00:30:41.676 --> 00:30:43.335 what things could be improved. 00:30:43.335 --> 00:30:47.096 Well, there are a lot more tools out there that are really written 00:30:47.096 --> 00:30:50.320 for Wikidata, but could be more agnostic, 00:30:51.839 --> 00:30:53.218 in particular, QuickStatements. 00:30:53.218 --> 00:30:56.658 As I said, I did some adjustments manually.
00:30:56.707 --> 00:30:59.899 Many of the issues I had are probably solved by now, 00:30:59.899 --> 00:31:01.679 but I don't think all of them. 00:31:01.840 --> 00:31:06.581 Then we want to import existing templates, 00:31:06.581 --> 00:31:09.288 or the SPARQL template, the Q and the P template. 00:31:09.288 --> 00:31:12.203 They are really useful when working with Wikibase. 00:31:12.203 --> 00:31:14.599 So, this would be done automatically. 00:31:14.599 --> 00:31:17.111 And as I said, we did a lot of manual editing. 00:31:17.111 --> 00:31:20.769 So, it would be useful, just ideal to have a tool where you can-- 00:31:20.769 --> 00:31:22.393 Like in an Excel table-- 00:31:22.393 --> 00:31:25.551 you load a couple of items, and you load a couple of properties, 00:31:25.551 --> 00:31:27.619 and then just jump from cell to cell, 00:31:27.619 --> 00:31:31.662 really quickly edit a lot of things 00:31:31.662 --> 00:31:33.423 in a semi-automated way. 00:31:34.985 --> 00:31:36.390 Thanks. That's the end. 00:31:37.093 --> 00:31:38.481 Thank you so much. 00:31:38.481 --> 00:31:40.795 (applause) 00:31:40.795 --> 00:31:42.659 So much to talk about on this. 00:31:43.273 --> 00:31:48.254 So, there is not only-- well, how do I get back from here. 00:31:50.917 --> 00:31:54.004 It's not only about science. It's not only about libraries. 00:31:54.004 --> 00:31:57.181 You can also create art and beauty with Wikibase. 00:31:57.181 --> 00:32:01.611 And who would be better to tell us about this than Stuart Prior. 00:32:12.056 --> 00:32:15.268 Now, slightly embarrassingly, we talk about art and beauty, 00:32:15.268 --> 00:32:17.296 but this is a really ugly presentation. 00:32:17.296 --> 00:32:18.554 (laughter) 00:32:19.604 --> 00:32:22.552 Starting off with a room full of Wikimedians, 00:32:22.552 --> 00:32:24.261 trains--people like trains. 00:32:24.956 --> 00:32:26.465 But it has a purpose. 00:32:26.465 --> 00:32:30.538 So, this is Hackney Downs Station in Northeast London. 
00:32:31.429 --> 00:32:34.104 And this is about Banner Repeater and Wikibase, 00:32:34.104 --> 00:32:35.968 which I'll explain further. 00:32:36.014 --> 00:32:37.829 So, this is a terrible photo. 00:32:37.829 --> 00:32:43.405 But it is actually where an artists' publishing archive is held, 00:32:43.512 --> 00:32:46.140 which is on the platform of a train station. 00:32:46.950 --> 00:32:50.688 Within there, they've got several hundred copies 00:32:50.688 --> 00:32:52.886 of various types of artists' publishing. 00:32:52.886 --> 00:32:54.389 They get a lot of public footfall. 00:32:54.389 --> 00:32:57.132 It does a lot of outreach to the actual general public. 00:32:57.132 --> 00:32:58.386 Like you get on the train, 00:32:58.386 --> 00:33:01.758 you'll find bits of sort of obscure art on the train. 00:33:02.856 --> 00:33:04.888 So, it's a really interesting project, 00:33:04.934 --> 00:33:06.924 but part of a much wider community. 00:33:07.452 --> 00:33:10.374 So, what is Artists' Publishing? What are Artists' Books? 00:33:10.430 --> 00:33:12.087 Like, I didn't know either. 00:33:13.545 --> 00:33:15.329 So, the definition, according to Wikipedia, 00:33:15.329 --> 00:33:19.377 is "Artists' books are works of art that utilize the form of the book." 00:33:19.377 --> 00:33:20.956 Well, you can read it. 00:33:21.569 --> 00:33:24.130 But it's individual pieces of art, 00:33:24.130 --> 00:33:28.257 or sometimes collections of art, using publishing as a medium. 00:33:28.583 --> 00:33:31.141 This varies quite a lot. It's very interesting. 00:33:31.141 --> 00:33:32.560 It was kind of-- 00:33:32.570 --> 00:33:35.043 There was a lot of it in the early '20s and '30s, 00:33:35.043 --> 00:33:37.793 and it had a bit of a renaissance, '60s and '70s, 00:33:37.793 --> 00:33:39.411 and continues to expand. 00:33:39.491 --> 00:33:42.237 Has a large global community, multilingual, 00:33:43.089 --> 00:33:47.748 somewhat separate from large institutional art institutions.
00:33:47.805 --> 00:33:50.448 So, you'll find collections, 00:33:50.448 --> 00:33:53.616 such as the V&A has a collection, obviously. 00:33:54.680 --> 00:33:58.483 So, they've got various kind of items such as these. 00:33:59.294 --> 00:34:02.045 This is just an article, so it's just not the best display. 00:34:03.098 --> 00:34:08.009 But it's a really kind of interesting, yet slightly niche field of work. 00:34:08.661 --> 00:34:11.674 But it's not very good on Wikidata. 00:34:14.023 --> 00:34:18.245 This is, again, a really terrible photo-- it's not my photo-- 00:34:18.245 --> 00:34:21.488 of some of the stuff held in Banner Repeater's archive. 00:34:21.488 --> 00:34:24.086 If you see in the middle, the pink one, Blast, 00:34:24.086 --> 00:34:27.802 that's actually a fairly notable piece of artists' publishing 00:34:27.802 --> 00:34:29.548 from the '20s. 00:34:31.168 --> 00:34:32.838 What does it look like on Wikidata? 00:34:32.838 --> 00:34:34.341 It's not good on Wikidata. 00:34:34.869 --> 00:34:37.782 It's often just confused with books 00:34:37.782 --> 00:34:39.803 or other forms of publishing. 00:34:40.292 --> 00:34:42.724 The average kind of Wikidata item for 00:34:42.728 --> 00:34:46.374 a notable piece of artists' publishing 00:34:47.145 --> 00:34:50.512 doesn't really have much to say about it. 00:34:50.568 --> 00:34:53.738 You know, it's just-- there you go, that's it. 00:34:54.832 --> 00:34:57.429 There's not a huge amount of identifier numbers as well. 00:34:57.781 --> 00:35:00.782 So, there's clearly a lot missing 00:35:00.782 --> 00:35:03.710 when it comes to artists' publishing, 00:35:03.710 --> 00:35:06.840 certainly compared to more traditional forms of art-- 00:35:06.840 --> 00:35:09.073 paintings and sculpture and so forth. 00:35:09.722 --> 00:35:12.681 And there's a huge desire within the community 00:35:12.681 --> 00:35:15.631 to start codifying this, and making it a real thing.
00:35:16.566 --> 00:35:19.283 So, I'll give you an example of what is actually available. 00:35:19.283 --> 00:35:22.202 You can point out what's wrong with this query. 00:35:23.542 --> 00:35:28.173 So, this is basically all there is. 00:35:28.702 --> 00:35:31.507 That's every artists' book on Wikidata. 00:35:31.552 --> 00:35:33.049 So, there's really not a lot. 00:35:33.049 --> 00:35:36.322 Some of them don't even have labels for a start. 00:35:36.322 --> 00:35:38.632 And it's something that really needs expanding. 00:35:38.632 --> 00:35:41.099 And something that has capacity to be expanded. 00:35:41.148 --> 00:35:43.416 Has anyone seen what's wrong with this query yet? 00:35:45.164 --> 00:35:47.317 The labels-- the labels say "sausage", 00:35:48.172 --> 00:35:50.814 because I just stole someone else's query, 00:35:50.814 --> 00:35:52.212 and changed the key number. 00:35:52.212 --> 00:35:53.342 (laughter) 00:35:53.342 --> 00:35:55.264 It's actually a query about sausages. 00:35:55.877 --> 00:35:57.541 Anyway, moving on. 00:35:57.827 --> 00:36:00.475 But yeah, you see it doesn't really have much of a presence. 00:36:01.122 --> 00:36:04.163 We were approached by Banner Repeater. 00:36:05.378 --> 00:36:07.281 So, I work with Wikimedia UK. 00:36:07.281 --> 00:36:10.275 We were approached by Banner Repeater to help them with this-- 00:36:10.719 --> 00:36:12.416 with setting up a Wikibase-- 00:36:13.182 --> 00:36:15.812 in terms of funding, in getting extra funding, 00:36:15.812 --> 00:36:18.293 but also in terms of bringing in a wider community, 00:36:18.293 --> 00:36:20.152 and being part of the process. 00:36:20.561 --> 00:36:23.886 So, the process is basically to gather this community 00:36:23.886 --> 00:36:27.364 of artists, archivists, and linked data experts, 00:36:28.554 --> 00:36:31.607 and work out what the schema, the data model, 00:36:31.607 --> 00:36:33.872 for artists' publishing should be. 00:36:33.929 --> 00:36:35.588 It's a very specialized field. 
00:36:35.953 --> 00:36:38.147 Doesn't really map onto Wikidata perfectly. 00:36:38.392 --> 00:36:40.793 It's probably too granular for it. 00:36:41.684 --> 00:36:44.485 And the other thing is the kind of flexibility of it. 00:36:44.577 --> 00:36:46.639 Maybe it doesn't really fit in Wikidata. 00:36:46.639 --> 00:36:50.090 Maybe it's too rigid at the moment. 00:36:50.090 --> 00:36:52.796 The Wikibase is being built, 00:36:52.796 --> 00:36:55.639 so I haven't got much to show you, because it's not been built yet, 00:36:55.639 --> 00:36:57.149 but this is more about the process. 00:36:57.343 --> 00:37:00.591 And the process is extensive community consultation, 00:37:00.678 --> 00:37:02.136 a few kind of layers of it. 00:37:02.136 --> 00:37:04.563 So, we're not just going to do this in one session. 00:37:04.563 --> 00:37:06.146 It's not a few individuals deciding. 00:37:06.146 --> 00:37:08.296 It's kind of ongoing, and ongoing, and ongoing. 00:37:09.352 --> 00:37:13.244 The impact of this could be fairly substantial, 00:37:13.244 --> 00:37:15.145 because no one else is doing this work. 00:37:15.145 --> 00:37:18.593 A lot of the larger institutions have artists' publishing 00:37:18.593 --> 00:37:20.270 sitting in their kind of back room. 00:37:20.270 --> 00:37:22.163 They don't really know how to categorize it. 00:37:22.163 --> 00:37:23.744 They haven't categorized it very well. 00:37:23.793 --> 00:37:25.899 They're not very interested in it. 00:37:25.899 --> 00:37:29.104 But there is a huge community that is interested in doing this. 00:37:30.527 --> 00:37:34.012 So, this is basically the process at the moment. 00:37:34.502 --> 00:37:36.936 So, the initial workshop has happened.
00:37:36.936 --> 00:37:40.228 So, it was an expert workshop with some people 00:37:40.228 --> 00:37:43.644 deep in the field of artists' publishing-- 00:37:43.644 --> 00:37:45.959 archivists, people who own collections, and so forth-- 00:37:46.002 --> 00:37:48.962 to establish a kind of basic set of priors, 00:37:49.407 --> 00:37:52.080 to look at what things were existing. 00:37:52.080 --> 00:37:54.677 The existing status was on Wikidata, 00:37:54.677 --> 00:37:57.134 and look at how that could be expanded or improved. 00:37:57.665 --> 00:38:00.503 And then they documented that, 00:38:00.503 --> 00:38:03.605 and established this basic structure. 00:38:04.135 --> 00:38:05.759 And now, we move into the next process 00:38:05.759 --> 00:38:07.630 where it's bringing in a much wider community. 00:38:07.721 --> 00:38:11.087 So that's-- it's not just data people, it's creators, as well. 00:38:11.656 --> 00:38:13.237 There'll be a lot of narrative in this, 00:38:13.237 --> 00:38:15.140 and a lot of qualitative things. 00:38:15.140 --> 00:38:18.093 Again, stuff that just doesn't really belong on Wikidata. 00:38:18.933 --> 00:38:20.966 But also working with archivists, 00:38:20.966 --> 00:38:24.045 and working with linked data experts, and so forth, 00:38:24.045 --> 00:38:26.322 to hopefully bring this all together, 00:38:26.322 --> 00:38:29.979 to create a resource that will have a nice accessible front end, 00:38:29.979 --> 00:38:33.241 and also build this community-- people who can contribute to it, 00:38:33.241 --> 00:38:35.631 and kind of own this data set. 00:38:36.318 --> 00:38:38.505 I'll show you what we've got ready. 00:38:40.687 --> 00:38:44.205 This is subject to change. 00:38:44.815 --> 00:38:47.494 But this is basically kind of where we've got so far 00:38:47.494 --> 00:38:48.612 with the expert ones. 00:38:48.612 --> 00:38:51.968 So, you see different P numbers being developed, 00:38:51.968 --> 00:38:54.891 and look at what their equivalent on Wikidata is. 
00:38:55.428 --> 00:38:58.472 And obviously, it's a lot more granular 00:38:58.472 --> 00:39:01.394 than probably the information on Wikidata is at the moment, so-- 00:39:02.900 --> 00:39:06.599 There's a lot of detailed stuff, so there's qualities 00:39:06.599 --> 00:39:09.063 such as height, width, thickness, and so forth, 00:39:09.763 --> 00:39:12.135 which aren't necessarily that present 00:39:12.135 --> 00:39:14.981 on other groups of artists' publishing on Wikidata. 00:39:15.453 --> 00:39:19.946 But there's also other things like "commissioned by", and "contributors to", 00:39:19.946 --> 00:39:22.573 and a lot of these works will have multiple contributors. 00:39:23.262 --> 00:39:25.526 And multiple editions and things like that. 00:39:25.526 --> 00:39:27.432 There's really a lot of granular information 00:39:27.432 --> 00:39:29.049 that can come about these things. 00:39:29.049 --> 00:39:30.844 And a lot of narrative as well, you know, 00:39:31.571 --> 00:39:32.953 as things have changed over time, 00:39:32.953 --> 00:39:34.741 as people have reinterpreted things. 00:39:35.566 --> 00:39:38.288 And this was what was created. 00:39:39.605 --> 00:39:42.633 Again, most of it has Wikidata equivalents, 00:39:42.633 --> 00:39:44.063 but some of it doesn't yet. 00:39:44.063 --> 00:39:46.748 So, what do we have here. 00:39:48.203 --> 00:39:50.395 Other editions, and things like that. 00:39:50.395 --> 00:39:51.813 So, it's fairly specialized. 00:39:51.813 --> 00:39:52.929 This is the first stage. 00:39:52.929 --> 00:39:54.643 And this will go through another process, 00:39:54.643 --> 00:39:57.237 as people take things away from it or contribute, too. 00:39:58.180 --> 00:40:00.727 The flexibility is really important in this. 
00:40:01.577 --> 00:40:04.785 It's kind of getting away from older kind of standards, 00:40:04.785 --> 00:40:07.101 and moving to something which is a bit more up-to-date, 00:40:07.101 --> 00:40:09.423 and something where the community can really change things, 00:40:09.423 --> 00:40:11.869 and not be dictated to-- and I'll start speaking quicker. 00:40:13.778 --> 00:40:18.258 So, power dynamics, at the moment, and why Wikibase. 00:40:18.258 --> 00:40:20.432 So at the moment, this is the art world. 00:40:20.432 --> 00:40:21.950 This is what the art world looks like. 00:40:21.950 --> 00:40:23.269 It's a big orange thing. 00:40:23.570 --> 00:40:25.360 But you've got these large institutions, 00:40:25.360 --> 00:40:27.993 and then you've got sort of groups of artists' publishing. 00:40:28.117 --> 00:40:31.575 That could be Delhi, Mexico City, London, and so forth. 00:40:32.197 --> 00:40:36.219 And what we don't want is this kind of thing 00:40:36.219 --> 00:40:38.881 where large institutions and experts get to dictate 00:40:38.881 --> 00:40:41.672 the kind of ontology, and how these things are going to work. 00:40:43.115 --> 00:40:47.389 So, working to establish a Wikibase among an artist community 00:40:47.874 --> 00:40:51.523 can help them work out what they're going to do, 00:40:51.523 --> 00:40:54.582 and then they start pushing back into the larger institutions, 00:40:54.995 --> 00:40:57.140 with a more kind of flexible data model, 00:40:57.140 --> 00:40:59.344 with something that's more up-to-date 00:40:59.344 --> 00:41:01.814 and coming from grassroots organizations, 00:41:01.814 --> 00:41:05.115 as opposed to coming from institutions, so to speak.
00:41:05.564 --> 00:41:08.325 So, I think there's huge value in this approach 00:41:08.398 --> 00:41:12.058 in terms of creating a sort of parallel infrastructure 00:41:12.058 --> 00:41:16.284 for communities of people who own content, and so forth, 00:41:16.284 --> 00:41:19.360 much like Wikimedia is, 00:41:19.846 --> 00:41:21.996 and kind of pushing out to institutions, 00:41:21.996 --> 00:41:24.223 rather than doing it the other way around. 00:41:24.662 --> 00:41:26.847 Do I have another slide? What next? 00:41:26.847 --> 00:41:29.368 I always put this slide in, because it's always the worst slide, 00:41:29.368 --> 00:41:30.484 and it's such a stereotype. 00:41:31.347 --> 00:41:35.068 What next? We're moving on to the community consultation stage, 00:41:35.068 --> 00:41:38.953 so we'll get a bit more kind of expansive and interesting. 00:41:39.288 --> 00:41:43.724 This obviously, this database will be talking to Wikidata, 00:41:43.724 --> 00:41:46.632 but on what term, we're not 100% sure. 00:41:46.632 --> 00:41:48.614 But it could be that this becomes very-- 00:41:48.614 --> 00:41:51.086 just a very specific instance for artists' publishing 00:41:51.086 --> 00:41:53.922 that Wikidata can draw from, and vice versa. 00:41:54.501 --> 00:41:56.918 And I'll just finish off with that picture again, 00:41:56.918 --> 00:41:58.419 because I just quite like it. 00:41:58.587 --> 00:42:00.936 And that's all I have to say. Thank you. 00:42:00.936 --> 00:42:04.881 - Thank you so much. - (applause) 00:42:05.810 --> 00:42:08.853 We're almost at the end of our fast-paced ride, 00:42:08.853 --> 00:42:12.710 and we'll-- what to say? we saved the best for last? 00:42:12.710 --> 00:42:16.433 No, but we give the last presentation 00:42:16.433 --> 00:42:20.118 to someone who's a true pioneer of using Wikibase 00:42:20.118 --> 00:42:23.439 in the field of digital humanities. 00:42:23.439 --> 00:42:25.510 And, yeah-- Olaf Simons. 
00:42:25.510 --> 00:42:28.499 You have not prepared any slides, but you will do some live action. 00:42:28.600 --> 00:42:29.978 Exactly. 00:42:30.793 --> 00:42:34.165 And I have been on Wikipedia since 2004, actually. 00:42:34.230 --> 00:42:35.570 I have the 15 years. 00:42:37.772 --> 00:42:39.555 What am I going to show? 00:42:41.665 --> 00:42:43.564 I've been congratulated for this. 00:42:43.564 --> 00:42:47.635 I'm going to show you the Wikibase instance we created. 00:42:47.635 --> 00:42:49.056 It's not a Docker Image. 00:42:49.056 --> 00:42:52.093 And I could agree, it's not the best to have a Docker-- 00:42:52.093 --> 00:42:56.707 it's not the best to have an independent installation. 00:42:56.707 --> 00:42:57.808 It's difficult, 00:42:57.808 --> 00:42:59.646 and it has been extremely difficult for us, 00:42:59.646 --> 00:43:03.638 and we're grateful for the Wikimedia Germany 00:43:04.828 --> 00:43:08.741 to help us get it done on a mutual agreement we had. 00:43:09.413 --> 00:43:15.696 So, basically, we have here several projects on this. 00:43:16.060 --> 00:43:18.243 It's more project-oriented than Wikidata. 00:43:18.847 --> 00:43:21.453 And my thing should be in here. 00:43:21.506 --> 00:43:27.025 I open that and go-- just should have done that before. 00:43:27.336 --> 00:43:28.595 Here we are. 00:43:29.723 --> 00:43:33.542 The history of the Illuminati-- I start with this one. 00:43:33.868 --> 00:43:36.216 This has been a little film 00:43:36.216 --> 00:43:40.272 which has been created by Paul-Olivier Dehaye, 00:43:41.602 --> 00:43:43.755 whom I only know from Twitter, 00:43:43.755 --> 00:43:45.709 as he asked us what kind of experience 00:43:45.709 --> 00:43:49.933 did we make when we got our Wikibase, 00:43:49.933 --> 00:43:52.242 and he was experimenting with his own. 00:43:52.242 --> 00:43:55.606 And I talked to him about things we could do, 00:43:55.606 --> 00:43:57.271 and things we could not do. 
00:43:57.271 --> 00:44:00.432 This was a film I would love to be able to do. 00:44:00.432 --> 00:44:02.339 And he said, "It's easy for me. 00:44:02.339 --> 00:44:04.724 I can run a SPARQL search, get the information, 00:44:04.724 --> 00:44:08.147 and put it into a program, in which you can then see this thing." 00:44:08.835 --> 00:44:12.328 It's actually 20 years of research on the Illuminati, 00:44:12.328 --> 00:44:15.897 and gives you a short history of the entire organization 00:44:15.897 --> 00:44:17.921 and all its correspondences. 00:44:17.921 --> 00:44:20.147 That's not a Wikimedia tool. 00:44:20.147 --> 00:44:23.024 It's not a tool of Wikibase. 00:44:23.024 --> 00:44:25.010 But it's something you can do. 00:44:25.010 --> 00:44:29.545 And actually, I like it that it is not a tool already. 00:44:29.545 --> 00:44:31.006 It should become a tool. 00:44:31.006 --> 00:44:33.932 I like it because it shows our data is really free. 00:44:33.932 --> 00:44:37.343 Someone can download our data, someone can do something with it, 00:44:37.343 --> 00:44:42.308 which we haven't expected, and it can be done within two hours, 00:44:42.308 --> 00:44:44.482 if you're bright-- and he is bright, of course. 00:44:45.255 --> 00:44:46.735 So, he created this for us. 00:44:46.827 --> 00:44:48.929 I go back to my presentation. 00:44:50.141 --> 00:44:52.825 Why on Wikibase? 00:44:52.825 --> 00:44:56.203 This was the immediate question when we approached Wikimedia. 00:44:56.203 --> 00:44:58.910 I knew of Wikidata since 2010, 00:44:59.480 --> 00:45:04.643 and in 2017, it was ready to be used by us. 00:45:05.560 --> 00:45:10.942 And there was actually an interest from Wikimedia people to say, 00:45:10.942 --> 00:45:13.215 "Do it, and we support you." 00:45:13.705 --> 00:45:15.493 Why our own base? 00:45:15.777 --> 00:45:19.590 Basically, as original research that we have to do. 00:45:20.159 --> 00:45:24.951 And the entire installation is a research tool. 
00:45:24.951 --> 00:45:27.663 It's not only there to take a look at what we did 00:45:27.663 --> 00:45:29.331 and for presentation purposes, 00:45:29.331 --> 00:45:31.968 but actually, I use it every day for my research. 00:45:31.968 --> 00:45:35.341 I change dates of documents, 00:45:35.341 --> 00:45:38.782 and take a look at how things look when I have changed that. 00:45:38.782 --> 00:45:41.410 I do a lot with working hypothesis. 00:45:41.410 --> 00:45:48.083 And we ask projects that have data to give us their data, 00:45:48.083 --> 00:45:50.073 and to feed them in, 00:45:50.073 --> 00:45:54.269 and they can, again, put a label, 00:45:54.269 --> 00:45:58.208 put an item to their data sets, 00:45:58.264 --> 00:46:02.397 that says this has been produced by the following project. 00:46:02.397 --> 00:46:04.777 Next projects can continue with it. 00:46:04.777 --> 00:46:06.962 But it's already there as a marker 00:46:06.962 --> 00:46:11.260 that this is a data set with work from a certain project. 00:46:11.437 --> 00:46:14.149 And if you have a project, DFG-- 00:46:14.779 --> 00:46:17.568 DFG funded, the German research institution-- 00:46:17.568 --> 00:46:19.404 if you have a project, you want to show 00:46:19.404 --> 00:46:20.983 what kind of work you have done. 00:46:20.983 --> 00:46:22.633 And you can now do a SPARQL search 00:46:22.633 --> 00:46:25.880 and present your entire group of data sets 00:46:25.880 --> 00:46:30.100 in the final résumé of your work. 00:46:30.751 --> 00:46:36.002 So we get original research, we identify research, 00:46:36.002 --> 00:46:38.513 we encourage the working hypothesis. 00:46:38.588 --> 00:46:40.045 This is a working tool, 00:46:40.045 --> 00:46:42.807 and it's actually quite useful to start from the beginning, 00:46:42.807 --> 00:46:44.267 not to present something in the end. 
00:46:44.267 --> 00:46:46.741 But from day one, you work with it, 00:46:46.741 --> 00:46:50.170 and what you think is the proper answer to that question, 00:46:50.170 --> 00:46:53.120 you can put it into Wikibase, and then 00:46:53.120 --> 00:46:55.021 you can substantiate information 00:46:55.021 --> 00:46:57.253 until you see this is the right identification 00:46:57.253 --> 00:46:59.532 of a person or the right date for a thing 00:46:59.532 --> 00:47:02.249 which we haven't been able to date so far. 00:47:02.309 --> 00:47:05.103 So, actually, accumulate work while you are doing it, 00:47:05.103 --> 00:47:07.536 use the Wikibase as a kind of tool 00:47:07.536 --> 00:47:09.763 that is getting you closer to the final result. 00:47:11.098 --> 00:47:14.782 Our first meeting took place on December 1, 2017. 00:47:15.268 --> 00:47:18.757 And I remember I had a little challenge for you, 00:47:18.960 --> 00:47:25.067 and that was a death date-- a date of death for a person-- 00:47:25.245 --> 00:47:30.055 where I wanted to have someone to show a source for that, 00:47:30.055 --> 00:47:31.429 and that was extremely difficult, 00:47:31.429 --> 00:47:32.975 because he had to create the source 00:47:32.975 --> 00:47:34.758 before he could connect it to that. 00:47:34.758 --> 00:47:36.499 And in the room, we were-- 00:47:36.499 --> 00:47:39.815 we had the clear idea, if we do this, we'd do it 00:47:39.815 --> 00:47:44.608 with the sources already part of the Wikibase installation we have. 00:47:44.608 --> 00:47:46.433 And if we have the sources in there-- 00:47:46.433 --> 00:47:49.515 that is, all the early modern books that have been printed 00:47:49.515 --> 00:47:50.771 would be the ideal. 00:47:50.771 --> 00:47:53.382 If we have that in there, we need the GND in there. 
00:47:53.382 --> 00:47:59.538 And when we heard that the GND people are on their track to test the software, 00:47:59.538 --> 00:48:01.879 I approached them and asked, "Wouldn't you like to do this 00:48:01.879 --> 00:48:05.499 in a cooperation with us, so that we can have your data, 00:48:05.499 --> 00:48:07.208 which we want to have, anyway, 00:48:07.208 --> 00:48:09.976 and that you can see how it works on a Wikibase." 00:48:09.976 --> 00:48:11.684 And this is where we are at the moment. 00:48:11.684 --> 00:48:14.849 And presently, I would say, a lot of things, 00:48:14.849 --> 00:48:16.399 we're not sure how they are done, 00:48:16.399 --> 00:48:18.339 or at least I am not sure how they are done. 00:48:18.339 --> 00:48:21.292 How's the input done, how do you get from a resource of strings 00:48:21.292 --> 00:48:24.500 to an item-based resource-- lots of things. 00:48:25.111 --> 00:48:28.065 And basically, my talk here is an invitation. 00:48:28.471 --> 00:48:30.012 Join us. 00:48:30.502 --> 00:48:32.987 We are still not really part of the Wikibase community. 00:48:32.987 --> 00:48:33.999 That doesn't exist. 00:48:33.999 --> 00:48:35.789 We have a Wikidata community. 00:48:35.789 --> 00:48:38.057 And lots of things are taking place in Wikidata, 00:48:38.057 --> 00:48:42.751 but if I ask for help for a Wikibase that is not Wikidata, 00:48:43.118 --> 00:48:44.696 that's a difficult thing. 00:48:46.030 --> 00:48:49.432 First thing I would say is, actually, to work with us is cool, 00:48:49.432 --> 00:48:53.665 because you can grab the data for Wikidata anytime, any moment, at CC0. 00:48:54.398 --> 00:48:57.886 So, actually, you can use it as an incubator of your work, 00:48:57.886 --> 00:49:00.486 and drag it to Wikidata. 00:49:01.013 --> 00:49:06.106 And also, we will work with big data, when we have the GND 00:49:06.106 --> 00:49:07.967 in there, that will be quite something. 
00:49:07.967 --> 00:49:09.628 So, if you really want the challenge, 00:49:09.628 --> 00:49:11.810 you can get it also on our platform. 00:49:12.339 --> 00:49:15.499 And we offer interesting communities. 00:49:16.394 --> 00:49:18.341 Basically, one of the things that is different 00:49:18.341 --> 00:49:21.489 is that we have all clear-name accounts and institutions. 00:49:21.489 --> 00:49:24.459 So, but that also means you can do things 00:49:24.459 --> 00:49:25.949 which you couldn't do on Wikidata. 00:49:25.949 --> 00:49:27.976 You can do your genealogy at our site. 00:49:27.976 --> 00:49:28.993 We don't mind. 00:49:28.993 --> 00:49:32.075 It's interesting to have people getting such data. 00:49:32.075 --> 00:49:36.049 You can do your city's search-- research, historical research 00:49:36.049 --> 00:49:37.948 on our platform-- we don't mind. 00:49:37.948 --> 00:49:42.456 You can be with research on our platform. 00:49:43.052 --> 00:49:45.812 So, lots of things need to be done. 00:49:46.137 --> 00:49:48.565 We have immense problems running the database. 00:49:48.565 --> 00:49:50.676 It was implemented by Wikimedia, 00:49:50.676 --> 00:49:52.981 but now, we see lots of things don't really work. 00:49:52.981 --> 00:49:54.478 We can't really fix that. 00:49:54.478 --> 00:49:57.543 It's extremely difficult to get help 00:49:57.543 --> 00:50:00.489 to run the database, to update the database, 00:50:00.489 --> 00:50:03.034 to solve little technical problems, 00:50:03.034 --> 00:50:08.632 which we face as soon as we run an instance outside Wikidata. 00:50:09.318 --> 00:50:13.002 Like getting the direct GND link is difficult. 00:50:13.055 --> 00:50:15.644 It works on Wikidata, it doesn't work on our instance. 00:50:15.644 --> 00:50:19.620 Getting images from Wikimedia Commons 00:50:19.620 --> 00:50:23.260 on our Wikibase is not that easy. 00:50:23.260 --> 00:50:25.370 Lots of little things still remain. 00:50:25.370 --> 00:50:27.525 So, actually, this is an invitation. 
00:50:27.525 --> 00:50:32.153 If you want to join us on the mass input, do that. 00:50:33.852 --> 00:50:34.861 Approach us. 00:50:34.912 --> 00:50:37.191 If you want to help us with technical things, 00:50:37.191 --> 00:50:38.591 this is highly welcome. 00:50:38.591 --> 00:50:40.129 And then, we need tools. 00:50:40.129 --> 00:50:42.120 You saw the tool we had in the beginning. 00:50:42.120 --> 00:50:44.921 Actually, it's not that difficult to get such tools. 00:50:45.934 --> 00:50:50.963 I saw what kind of query you do to get such a visualization, 00:50:50.963 --> 00:50:55.140 and once you have it, you should be able to modify it easily. 00:50:56.601 --> 00:50:59.358 These tools are extremely precious 00:50:59.358 --> 00:51:02.754 in our community of digital humanities projects. 00:51:02.774 --> 00:51:06.099 And there are little companies that create these tools, 00:51:06.099 --> 00:51:08.727 again, and again, and again, and get money for that. 00:51:08.727 --> 00:51:12.202 I would love to have these tools just once and for all free 00:51:12.202 --> 00:51:15.493 and on the market and working with a Wikibase instance. 00:51:15.493 --> 00:51:19.662 So, anyone who is interested in developing tools, 00:51:19.662 --> 00:51:21.901 approach us, and we have plenty of ideas 00:51:21.901 --> 00:51:24.624 of what visualizations historians would love to see, 00:51:25.071 --> 00:51:26.815 and that should be done. 00:51:28.198 --> 00:51:31.493 So, basically, lots of things, like, still remain. 00:51:31.549 --> 00:51:33.774 I've got one minute. I don't need that one minute. 00:51:33.821 --> 00:51:35.640 And you're putting pressure on me. 00:51:37.260 --> 00:51:38.637 (person) Give it to the audience. 00:51:38.637 --> 00:51:40.380 I give the minute to the audience. 00:51:40.380 --> 00:51:42.122 Yeah. Thank you so much. 00:51:42.172 --> 00:51:44.324 And maybe you want to sit down, 00:51:44.324 --> 00:51:49.363 because I would like everyone to join me back on stage. 
00:51:50.053 --> 00:51:51.793 And we can have a round of questions. 00:51:51.793 --> 00:51:54.628 I really like that we ended with an invitation, 00:51:54.628 --> 00:51:56.850 because this is what this is now. 00:51:57.254 --> 00:51:58.836 You are invited to ask questions. 00:51:58.836 --> 00:52:03.165 You are also invited to join us tomorrow at the Wikibase meetup. 00:52:03.489 --> 00:52:06.332 If you are-- if you have some idea 00:52:06.332 --> 00:52:08.567 for an awesome Wikibase installation, 00:52:08.567 --> 00:52:12.262 for your institution, for your hobby, for changing the world-- 00:52:12.990 --> 00:52:16.267 please come and join us, we will meet up, and-- 00:52:18.083 --> 00:52:20.228 There's some complication with the chairs. 00:52:20.357 --> 00:52:22.340 Well, let's stand up. Okay. 00:52:22.390 --> 00:52:24.496 I think we have another microphone, here. 00:52:24.496 --> 00:52:26.528 (person) I have the microphone for the questions. 00:52:26.971 --> 00:52:29.246 Okay. So-- 00:52:31.157 --> 00:52:32.662 Thank you for the presenters. 00:52:32.662 --> 00:52:35.799 And meet us at the Wikibase meetup, 00:52:35.799 --> 00:52:38.911 and now, I can't wait to hear your questions to the panel. 00:52:40.731 --> 00:52:42.391 (person) Who's the first? 00:52:43.805 --> 00:52:47.088 (person) Hi. I will be talking in the lightning session, too, 00:52:47.088 --> 00:52:50.872 about geosciences, and how in geosciences, 00:52:50.872 --> 00:52:54.312 there's many data repositories that have collected 00:52:54.312 --> 00:52:56.895 and shared data with the community 00:52:56.895 --> 00:52:59.331 for years, for decades in some cases. 00:52:59.820 --> 00:53:04.808 And they curate the data set, their schemas evolve continuously, 00:53:04.808 --> 00:53:07.243 they get a lot of feedback from the community. 00:53:07.243 --> 00:53:10.042 All they desire is to organize the community, 00:53:10.042 --> 00:53:12.557 to enable the growth of these repositories. 
00:53:13.046 --> 00:53:17.371 So, they don't necessarily desire to put all their content in Wikidata 00:53:17.371 --> 00:53:18.837 and lose control over it. 00:53:18.837 --> 00:53:22.201 They offer a tremendous service curating this content. 00:53:22.566 --> 00:53:27.743 So, I just wanted to point out that some of the requirements 00:53:27.743 --> 00:53:30.895 and needs that have been voiced by the panelists 00:53:30.895 --> 00:53:32.841 appear in my communities. 00:53:32.931 --> 00:53:39.764 And my question is, how do you mix or maintain control 00:53:40.291 --> 00:53:42.971 over those schemas, over the standards, 00:53:42.971 --> 00:53:47.827 while allowing the community to continue to introduce feedback 00:53:47.827 --> 00:53:52.194 and have more of this crowdsourcing spirit that Wikidata has? 00:53:52.882 --> 00:53:56.209 I think everyone could answer that, but maybe David, you want to start? 00:53:57.313 --> 00:53:59.470 I'm not sure whether I'm the right person to answer this, 00:53:59.470 --> 00:54:00.845 because in our use case-- 00:54:02.175 --> 00:54:04.100 in terms of data modeling, 00:54:04.100 --> 00:54:09.297 it's really a narrow set of people who actually do the work. 00:54:09.472 --> 00:54:13.415 We contact experts for the relevant segments, 00:54:14.145 --> 00:54:17.309 and some of them could contribute, but for the current iteration, 00:54:17.309 --> 00:54:21.035 it was only me and two colleagues who actually worked on it. 00:54:21.082 --> 00:54:25.903 So, we want to have this option, that we get experts in, 00:54:25.903 --> 00:54:29.356 but it's always in close collaboration with us, 00:54:29.356 --> 00:54:32.076 so that we don't really have to worry 00:54:32.076 --> 00:54:34.349 about the problem of crowdsourcing. 00:54:36.053 --> 00:54:38.232 Being part of the Wikimedia community, 00:54:38.232 --> 00:54:40.620 I would say, I would not be that worried. 
00:54:40.702 --> 00:54:45.797 95% of the edits are good edits, and improving things--more than that. 00:54:47.097 --> 00:54:50.409 As soon as we have an instance that is actually closed-- 00:54:50.409 --> 00:54:53.350 where I offer the accounts on real name, 00:54:53.350 --> 00:54:59.469 that's an additional hurdle that no fool is going to go over. 00:54:59.520 --> 00:55:05.335 People are required on our instance to offer an address, on page-- 00:55:05.442 --> 00:55:06.938 not to me, but on page-- 00:55:06.938 --> 00:55:10.312 and this is something only institutions usually do, 00:55:10.312 --> 00:55:11.576 or private people that say, 00:55:11.576 --> 00:55:13.564 "Okay, I'm a private person. I love this research. 00:55:13.564 --> 00:55:15.882 This is my personal field. I give you my address." 00:55:15.882 --> 00:55:19.692 And this is a thing that puts off every-- 00:55:20.384 --> 00:55:23.718 any vandal who wants to destroy Wikidata. 00:55:24.084 --> 00:55:27.545 So, you can close the system, but then, 00:55:27.545 --> 00:55:30.216 you are not really part of the same flowing community. 00:55:30.305 --> 00:55:33.264 But again, I would say, if you go to CC0, 00:55:33.264 --> 00:55:35.848 then you can open up, you can be the incubator 00:55:35.848 --> 00:55:40.552 where people do the research, and then it goes out to the community. 00:55:40.552 --> 00:55:44.935 But it's an invitation-- use maybe closed works, 00:55:44.935 --> 00:55:48.743 and use an instance where you work together with people you like. 00:55:54.123 --> 00:55:56.475 Well, I think that-- 00:55:59.752 --> 00:56:03.798 I don't think that it's only my opinion-- 00:56:04.499 --> 00:56:07.250 it is there are different perspectives, 00:56:07.250 --> 00:56:12.911 and it will be hard to reconcile all perspectives and say, 00:56:13.359 --> 00:56:19.333 "Wikidata is the solution for the entire world to go into." 
00:56:20.065 --> 00:56:24.364 I don't say by this that Wikidata is not a solution, 00:56:24.972 --> 00:56:27.925 but there are different perspectives, there are different needs. 00:56:27.925 --> 00:56:34.844 The world is-- really, there is a large variety of needs, 00:56:34.844 --> 00:56:40.271 of professional perspectives, that you cannot reconcile 00:56:40.271 --> 00:56:44.639 in a unique worldwide database. 00:56:44.639 --> 00:56:48.587 So, I think that both are-- 00:56:48.587 --> 00:56:51.756 The trickiest thing is how to reconcile 00:56:51.756 --> 00:56:58.528 and find angles of dialogue between these two large families 00:56:58.528 --> 00:57:00.800 of needs and perspectives. 00:57:03.349 --> 00:57:05.379 If there are more questions, 00:57:05.379 --> 00:57:07.860 I would rather like to go to more questions. 00:57:08.960 --> 00:57:10.382 Anybody else? 00:57:12.482 --> 00:57:15.159 If not, meanwhile you're thinking about your questions-- 00:57:15.159 --> 00:57:17.726 I would just like to say that's one of the reasons 00:57:17.726 --> 00:57:19.632 why we consider Wikibase, 00:57:19.647 --> 00:57:23.820 because we believe that adding, editing information 00:57:23.820 --> 00:57:27.992 within the Wikibase instance, where you have rights and roles, 00:57:27.992 --> 00:57:31.443 as you have in Wikidata, gives us the opportunity 00:57:31.443 --> 00:57:36.360 to share that information with the information in Wikidata 00:57:36.360 --> 00:57:39.109 in a more easy way, a more convenient way 00:57:39.109 --> 00:57:44.170 than if we try to build these bridges in between our authority file 00:57:44.170 --> 00:57:46.520 and Wikidata at the moment. 00:57:46.641 --> 00:57:48.421 (person) So, I find it quite exciting 00:57:48.421 --> 00:57:51.870 hearing about how you're energizing communities 00:57:51.870 --> 00:57:55.149 to find their own ways for data modeling, 00:57:55.149 --> 00:57:58.636 and that you can put into Wikibase. 
00:57:59.336 --> 00:58:02.556 Will you-- I'm just saying of Stuart Prior's community, 00:58:02.556 --> 00:58:04.174 but also some of the others-- 00:58:04.174 --> 00:58:06.155 be trying to feed the approaches 00:58:06.155 --> 00:58:10.157 that as a community that you decide work back to Wikidata, 00:58:10.157 --> 00:58:12.876 to say, "We've done artists' books, 00:58:12.876 --> 00:58:15.316 we've thrashed through several iterations, 00:58:15.316 --> 00:58:17.753 this is what we found really worked, 00:58:17.753 --> 00:58:19.904 and the properties that you should have 00:58:19.904 --> 00:58:23.193 or revisions you should make to the Wikidata data model"? 00:58:24.018 --> 00:58:26.006 Good question. Very short answer. 00:58:27.388 --> 00:58:28.922 It's an interesting question. 00:58:30.112 --> 00:58:31.847 I don't know whether this is a model 00:58:31.847 --> 00:58:33.551 that's going to work for other types. 00:58:33.638 --> 00:58:35.009 I hope it is. 00:58:36.063 --> 00:58:39.093 But it's a difficult one if you question 00:58:39.093 --> 00:58:42.774 of whether the Wikidata community accepts the kind of authority 00:58:42.774 --> 00:58:45.700 of a separate community that goes off and does the work on its own. 00:58:46.556 --> 00:58:47.776 But I would certainly hope 00:58:47.776 --> 00:58:50.335 that it's a way of people feeding back into this process, 00:58:50.335 --> 00:58:53.702 without necessarily needing to go onto Wikidata and do it. 00:58:56.904 --> 00:58:58.525 Well, I would say, grab it. 00:58:58.525 --> 00:59:01.721 Grab it if it's convenient, take it, and take a look at how it works 00:59:01.721 --> 00:59:02.896 in the other instance. 00:59:02.896 --> 00:59:06.424 And if you feel like this is a cool property 00:59:06.424 --> 00:59:09.457 to do certain searches, then that will be adopted, 00:59:09.457 --> 00:59:10.721 that will be flowing. 00:59:10.721 --> 00:59:12.839 I wouldn't think of authorities doing this.
00:59:12.839 --> 00:59:14.807 (person) Coming from a Wikidata user perspective, 00:59:14.807 --> 00:59:17.543 the great thing you're doing is showing you've established code 00:59:17.543 --> 00:59:18.802 that works and runs. 00:59:18.802 --> 00:59:21.390 You've established a data model that people can see, 00:59:21.390 --> 00:59:23.290 is implementable, and works. 00:59:23.348 --> 00:59:25.867 And so, in the open source community, 00:59:25.867 --> 00:59:27.693 you know, show us the code. 00:59:27.705 --> 00:59:29.124 You can do that. 00:59:29.124 --> 00:59:32.726 And that's why I think it's very exciting to have these branches 00:59:32.726 --> 00:59:35.306 that can then fold it back for data modeling. 00:59:35.306 --> 00:59:36.381 Yeah, thank you. 00:59:36.381 --> 00:59:38.373 I think that is exactly the point. 00:59:38.902 --> 00:59:41.833 I also like the verb that you used-- energize. 00:59:41.923 --> 00:59:43.869 This is exactly what we want to do. 00:59:43.869 --> 00:59:46.584 Energize, as in Star Trek. 00:59:47.890 --> 00:59:50.193 Yeah, this panel comes to an end. 00:59:51.120 --> 00:59:53.750 And if you have any more questions 00:59:53.750 --> 00:59:57.431 on all these Wikibase projects, talk. 00:59:57.442 --> 00:59:59.633 - Please come tomorrow. - Have conversations. 00:59:59.633 --> 01:00:01.504 This is what this conference is about. 01:00:01.504 --> 01:00:02.926 Thank you very much. 01:00:02.926 --> 01:00:08.073 (applause)