0:00:06.009,0:00:09.069 (host) Hello, everyone. Thank you[br]for coming to these lightning talks. 0:00:09.069,0:00:11.529 Our first speaker, I'm going[br]to run straight into it, 0:00:11.529,0:00:13.781 is going to be Rosie[br]Stephenson-Goodknight. 0:00:13.781,0:00:15.319 Did I get that right? 0:00:15.319,0:00:19.609 Yes. And so she's going to be talking[br]about the Women Writers Project. 0:00:19.609,0:00:22.569 And we're going to--[br]yeah, is that right? Great. 0:00:22.569,0:00:24.299 And so, we're going[br]to just launch right in, 0:00:24.299,0:00:26.699 and I want to remind you,[br]if there's time for questions, 0:00:26.699,0:00:28.802 to please not speak[br]until you have the microphone. 0:00:28.802,0:00:30.329 Thank you. 0:00:31.589,0:00:34.125 (Rosie) Hi, everyone, and thanks[br]for coming to this session, 0:00:34.125,0:00:36.829 where we're going to talk[br]about Women Writers in Review, 0:00:36.829,0:00:40.329 cultures of reception associated[br]with trans-Atlantic, 0:00:40.329,0:00:43.977 English language women writers,[br]broadly construed. 0:00:44.523,0:00:48.387 Women Writers in Review is an initiative[br]of the Women Writers Project 0:00:48.387,0:00:50.535 of Northeastern University. 0:00:50.535,0:00:55.253 It moved there from Brown University,[br]approximately 15 years ago. 0:00:55.993,0:01:00.287 Women Writers in Review is a collection[br]of 18th- and 19th-century reviews, 0:01:00.287,0:01:04.281 publication notices,[br]literary histories, and other texts 0:01:04.281,0:01:09.511 corresponding to trans-Atlantic--[br]so, UK and US mostly, 0:01:09.511,0:01:12.953 though a few Canadian--[br]written works by women. 0:01:13.255,0:01:15.683 It's a project where the two universities, 0:01:15.683,0:01:18.133 Brown University[br]and Northeastern University, 0:01:18.133,0:01:22.645 started collecting the manuscripts[br]of women from this period. 
0:01:23.337,0:01:27.520 And then they started collecting[br]the reviews of these works, 0:01:27.520,0:01:31.593 and then they started scoring[br]these reviews by giving them a rating. 0:01:32.321,0:01:36.144 It's designed to investigate[br]the discourse of reception in connection 0:01:36.144,0:01:39.333 with the changing trans-Atlantic[br]literary landscape 0:01:39.333,0:01:42.664 for the period 1770 to 1830. 0:01:46.143,0:01:49.103 You're going to pardon me if I speak fast,[br]because I've got five minutes 0:01:49.103,0:01:50.646 to go over this. 0:01:50.646,0:01:55.443 It includes 690 English language texts[br]responding to works 0:01:55.443,0:01:59.565 written or translated[br]by 18th- and 19th-century women writers. 0:01:59.593,0:02:04.813 There are 74 authors in the corpus,[br]using 112 different sources, 0:02:04.813,0:02:07.782 or periodicals, or magazines. 0:02:07.782,0:02:10.773 And there are 628 critical reviews. 0:02:11.867,0:02:14.671 Here's a picture that shows you[br]what we're talking about 0:02:14.671,0:02:16.573 in terms of a review. 0:02:16.573,0:02:18.819 And you can also see what kind of scores 0:02:18.819,0:02:25.403 were given by the academics[br]at Northeastern University. 0:02:25.833,0:02:28.922 Most of these are women[br]who were giving scores 0:02:28.922,0:02:34.031 based on the reviews that were done[br]mostly, probably all men, 0:02:34.031,0:02:39.799 back in this time period 1770 to 1830[br]of works written by women. 0:02:39.799,0:02:43.469 By works, we're talking about plays,[br]and novels, and poems, 0:02:43.469,0:02:46.955 essays, and other kinds of articles. 0:02:48.615,0:02:50.275 So, what are we talking about? 0:02:50.275,0:02:54.676 This required creating[br]items for authors for their works, 0:02:54.676,0:02:57.946 like I said, novels and plays and poems. 0:02:57.946,0:03:04.938 It required creating new items[br]for this period of time 0:03:05.038,0:03:08.391 where there are defunct periodicals. 
0:03:08.391,0:03:12.499 It required creating items[br]for the scholarly articles. 0:03:12.578,0:03:16.900 And then the review scores of each,[br]and the review score by, 0:03:16.943,0:03:19.998 which in this case would be[br]Women Writers in Review, 0:03:19.998,0:03:23.336 and what we still need to add[br]is the described by source. 0:03:25.226,0:03:28.970 This gives you a picture[br]of the kind of spreadsheets, 0:03:28.970,0:03:31.397 Google Spreadsheets,[br]that I have been working on. 0:03:31.397,0:03:34.296 I shouldn't just say I,[br]because I've had a lot of help. 0:03:34.296,0:03:37.546 I've had a lot of people[br]who were working on this project with me. 0:03:37.546,0:03:40.413 And you can see at the top,[br]something about the authors, 0:03:40.413,0:03:41.736 about the works. 0:03:41.736,0:03:45.496 The third group is going to be[br]the periodical, 0:03:45.496,0:03:48.006 and then, how the scores started showing. 0:03:49.203,0:03:52.122 And of course, this is how they look-- 0:03:52.122,0:03:57.396 the beauty of being able to present[br]the preliminary findings. 0:03:57.856,0:04:01.767 Once we have uploaded all of the data, 0:04:02.989,0:04:05.906 and I hope that that's going to be done[br]by the end of this year, 0:04:06.956,0:04:08.496 this will obviously look different. 0:04:09.916,0:04:10.931 Appendix. 0:04:10.931,0:04:15.267 So, here's what the depiction looks like 0:04:15.267,0:04:18.505 at the Northeastern University website. 0:04:19.024,0:04:22.474 I don't think it's quite as clear[br]as what we can do with Wikidata. 0:04:22.531,0:04:27.351 And so, this was probably the reason why,[br]when I started as a visiting scholar 0:04:27.351,0:04:31.751 in 2017, they asked if this is one[br]of the projects that I could work on. 0:04:31.751,0:04:36.093 They stopped their work[br]the year before, in 2016. 0:04:36.093,0:04:39.073 And I think they just don't have[br]the resources to continue. 
0:04:40.251,0:04:43.415 Some parts of this presentation[br]came from another 0:04:43.415,0:04:45.812 that was published in 2016. 0:04:45.812,0:04:49.401 And last but not least, here are links 0:04:49.401,0:04:53.361 to the different parts[br]of the work that I'm doing. 0:04:54.257,0:04:55.561 Thank you very much. 0:04:55.561,0:04:56.845 Questions. 0:04:56.845,0:04:58.754 (applause) 0:05:10.397,0:05:14.665 (woman) So, when you have a work,[br]and you have the review of the work, 0:05:14.665,0:05:17.703 are you looking[br]at a particular edition of the work, 0:05:17.703,0:05:20.665 or are these all reviews[br]of first editions? 0:05:21.271,0:05:22.861 It's a good question. No. 0:05:22.861,0:05:25.601 They are not just reviews[br]of the first edition. 0:05:25.601,0:05:28.601 Some are reviews of the second[br]or third edition. 0:05:30.062,0:05:32.262 I'm going to add something[br]that maybe I should have said 0:05:32.262,0:05:34.951 before I closed[br]and went to question and answers-- 0:05:34.966,0:05:36.800 what's so special about this? 0:05:37.220,0:05:40.461 What's special is nobody else[br]has done this on Wikidata. 0:05:41.454,0:05:45.580 Surely, there are other universities[br]that have their own collections, 0:05:45.580,0:05:51.447 where their scholars have reviewed[br]the reviews of someone's work 0:05:51.800,0:05:53.394 in some language. 0:05:54.491,0:05:57.389 So, hopefully,[br]once this methodology gets-- 0:05:58.000,0:06:02.390 once I write this up and the project[br]is over and presented again, 0:06:02.390,0:06:05.310 that there will be other[br]universities, other libraries 0:06:05.310,0:06:07.923 that will speak up and say,[br]"We've got data sets, too, 0:06:08.248,0:06:13.020 and we're going to go ahead[br]and upload them into Wikidata ourselves," 0:06:13.020,0:06:15.910 and then it'd be lovely [br]to start doing some comparisons. 0:06:19.572,0:06:22.060 Anyone? Jane. 0:06:22.093,0:06:23.767 (Jane) Do you actually have books? 
0:06:24.293,0:06:26.889 Do you actually have the books--[br]are the books in existence, 0:06:26.889,0:06:28.860 or are you actually[br]doing metadata about books 0:06:28.860,0:06:31.400 where we don't even know[br]where the books are? 0:06:31.780,0:06:34.829 Northeastern University[br]actually has the book, 0:06:34.829,0:06:37.209 or the essay, or the poem. 0:06:39.759,0:06:45.392 And they have the critical review[br]of the book, or the essay, or the poem. 0:06:45.755,0:06:48.820 And they're working[br]on the transcription of these, 0:06:48.820,0:06:51.452 and they're not at 100% yet. 0:06:52.432,0:06:56.256 They're not at 100%, but it's like,[br]all things working on it. 0:07:00.218,0:07:02.043 Any other questions? 0:07:05.697,0:07:07.399 (host) We're going to wrap it up there. 0:07:07.399,0:07:09.063 Thanks for being such a nice audience. 0:07:09.063,0:07:11.677 (applause) 0:07:14.012,0:07:18.581 Lady bug for [inaudible]. 0:08:58.271,0:08:59.372 (man) Finally got that. 0:08:59.372,0:09:02.565 What I'm going to do is I'm just going[br]to click on these to load. 0:09:02.565,0:09:06.091 Just while-- is that new tab there? 0:09:06.946,0:09:08.053 [inaudible] 0:09:08.053,0:09:10.524 The first one? Yeah, perfect. 0:09:11.024,0:09:13.503 Sorry, my German is not even rusty, 0:09:13.503,0:09:15.251 it's simply non-existent. 0:09:15.663,0:09:19.561 So, I'll just let them load,[br]because then these queries can run 0:09:19.561,0:09:22.728 while I'm sort of introducing[br]what I was talking about and doing. 0:09:22.728,0:09:24.795 So, hi, I'm Nav from Histropedia. 
0:09:24.795,0:09:28.169 And basically, for the last[br]quite a few years, 0:09:28.169,0:09:29.710 we've been relatively quiet, 0:09:29.710,0:09:32.423 while we've been sort of working[br]on technology and tools 0:09:32.423,0:09:36.837 that we need to sort of develop,[br]ultimately, Histropedia version 2, 0:09:36.837,0:09:39.433 which is going to be, you know,[br]this huge enhancement 0:09:39.433,0:09:40.771 on the first version. 0:09:40.771,0:09:43.270 Well, it's kind of in progress,[br]but as we do it, 0:09:43.270,0:09:45.236 we've been experimenting[br]with these other tools, 0:09:45.236,0:09:47.387 and building the technology[br]that we're going to need. 0:09:48.132,0:09:51.781 One really crucial part for this[br]is the ability to sort of see 0:09:51.781,0:09:55.085 the whole of history[br]from the billions of years time scale, 0:09:55.085,0:09:58.602 to up to the current day, 0:09:58.602,0:10:00.638 and zooming all the way into single days. 0:10:00.638,0:10:03.433 And ultimately, in the end,[br]down to hours and minutes. 0:10:03.433,0:10:06.517 We've managed to create[br]a [inaudible] of update to our engine. 0:10:06.517,0:10:08.327 Other engines can already do this, 0:10:08.327,0:10:11.122 but unfortunately, they also can't handle[br]the large data sets. 0:10:11.122,0:10:13.269 So, we finally got this update[br]to our engine. 0:10:13.269,0:10:15.392 It allows us to zoom to billions of years. 0:10:15.392,0:10:19.533 So, recently-- the recently[br]finished update, 0:10:19.533,0:10:22.333 and it's basically, it's an update[br]to our query viewer tool, 0:10:22.333,0:10:24.482 which is like a live version[br]of Histropedia 0:10:24.482,0:10:26.832 just linked straight to Wikidata. 0:10:26.832,0:10:29.092 So, it's literally based on a query, 0:10:29.092,0:10:31.372 a live query, and we see[br]the results of it. 0:10:31.372,0:10:33.883 So, it's sort of separate[br]to our main tool. 
0:10:33.883,0:10:37.502 So, I'm going to flick to the first one,[br]which is my first experiment. 0:10:37.502,0:10:39.716 And you'll forgive me, the queries-- 0:10:39.716,0:10:42.181 the code was kind of finished[br]not so long ago, 0:10:42.181,0:10:44.736 and the queries, I've been trying[br]to find out what can I find 0:10:44.736,0:10:47.692 and what's interesting[br]to look at, what's missing. 0:10:47.692,0:10:52.154 So, I started off[br]with a kind of, sort of, well-- 0:10:52.154,0:10:54.241 So, that's not the right--[br]that's not Life on Earth. 0:10:54.241,0:10:55.699 Is this Life on Earth? 0:10:56.123,0:10:57.467 That will do, anyway. 0:10:57.467,0:11:01.985 So, I started off just trying to look[br]at what sort of things 0:11:01.985,0:11:04.657 are actually in Wikidata. 0:11:04.657,0:11:07.407 And this particular one--[br]sorry, it's in reverse. 0:11:07.407,0:11:09.829 So, this is the first one[br]I wanted to show you. 0:11:09.829,0:11:12.485 So, this is a kind of[br]a life on Earth query 0:11:12.485,0:11:14.457 that I wanted to develop. 0:11:14.457,0:11:18.410 And basically, what it is[br]is all the taxons in Wikidata 0:11:18.410,0:11:20.157 that have a date. 0:11:20.157,0:11:23.726 And as you can probably see[br]from the panel, there is not many of them. 0:11:23.726,0:11:25.784 But we do have the different taxon ranks. 0:11:25.784,0:11:27.596 So, you know, is it a species, a class-- 0:11:27.596,0:11:29.725 for a biologist,[br]this makes a lot of sense. 0:11:29.725,0:11:32.446 But if I was just to close that a bit, 0:11:32.596,0:11:35.453 we can see, we are going back[br]to the earliest forms of life here. 0:11:35.453,0:11:37.236 3.5 billion years ago. 0:11:37.236,0:11:42.707 And as we zoom in here, we start to see[br]the more modern forms of life, 0:11:42.746,0:11:47.232 and we see some really[br]interesting things developing, 0:11:47.232,0:11:50.829 but we're still lacking a lot of data[br]in terms of this kind of time range. 
0:11:52.250,0:11:55.286 So, my next thought was,[br]"Okay, well, why aren't--" 0:11:55.592,0:11:57.088 "I want to see a Tyrannosaurus Rex." 0:11:57.088,0:11:59.838 That's what I really wanted to see[br]on my query, and it wasn't there. 0:11:59.838,0:12:02.138 So, had a little dig in,[br]and I found out why. 0:12:02.234,0:12:05.284 It's because they're much more[br]being stored 0:12:05.284,0:12:08.696 in terms of the temporal range[br]or time period that they relate to. 0:12:09.065,0:12:11.412 So, on comes the next query, 0:12:11.412,0:12:13.144 where I actually sort of-- 0:12:13.664,0:12:17.641 basically, this query[br]is looking for any item 0:12:17.641,0:12:22.284 that has a temporal range start,[br]and/or a temporal range end. 0:12:22.665,0:12:25.965 Which is basically in the form--[br]in life forms, it kind of relates 0:12:25.965,0:12:28.644 to when they emerged[br]and when they became extinct. 0:12:28.644,0:12:31.044 So, these are the periods[br]on the side here. 0:12:31.585,0:12:33.190 If I just close that a bit-- 0:12:33.190,0:12:37.364 you can see that we have[br]quite a lot of interesting stuff. 0:12:37.364,0:12:39.834 And there's the Tyrannosaurus[br]that I was looking for. 0:12:39.834,0:12:43.394 So, I finally got that,[br]and I was like, "Yes! I've done it!" 0:12:43.394,0:12:46.084 I've got that Triceratops[br]in there for bonus. 0:12:46.084,0:12:48.984 But of course, still loads missing. 0:12:48.984,0:12:50.665 And I'd love to see lots more here. 0:12:50.665,0:12:52.590 But at least, it gives you the idea. 0:12:52.590,0:12:55.794 The nice thing is, here as well,[br]if I star some of these, 0:12:55.794,0:12:58.374 you can see that[br]the time range is shown. 0:12:58.374,0:13:01.027 So, you can start to do[br]what I really wanted to do, is say, 0:13:01.027,0:13:04.004 "Okay, when did this one end,[br]and when did the next one begin? 0:13:04.004,0:13:06.085 When did things start going extinct?" 
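[Editor's note] The query described above, "any item that has a temporal range start, and/or a temporal range end", can be sketched roughly as follows. This is not the speaker's actual query, which is not shown in the transcript. It assumes the Wikidata properties P523 (temporal range start), P524 (temporal range end) and P105 (taxon rank), plus the label service of the Wikidata Query Service. The function name, its parameters and the result limit are hypothetical and chosen only for illustration. The code builds the query text and does not contact any server.

```python
def temporal_range_query(language: str = "en", limit: int = 500) -> str:
    """Build a SPARQL query for items with a temporal range start and/or end.

    Assumed Wikidata properties (not stated in the talk):
      P523 = temporal range start, P524 = temporal range end, P105 = taxon rank.
    """
    return f"""SELECT DISTINCT ?item ?itemLabel ?startLabel ?endLabel ?rankLabel WHERE {{
  # Keep an item if it has a start, an end, or both.
  {{ ?item wdt:P523 [] . }} UNION {{ ?item wdt:P524 [] . }}
  # Fetch whichever of the values exist; each one is optional.
  OPTIONAL {{ ?item wdt:P523 ?start . }}
  OPTIONAL {{ ?item wdt:P524 ?end . }}
  OPTIONAL {{ ?item wdt:P105 ?rank . }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{language}" . }}
}}
LIMIT {limit}"""


if __name__ == "__main__":
    # Print the query so it can be pasted into a SPARQL editor.
    print(temporal_range_query())
```

The UNION keeps items that have only one of the two properties, which matches the "and/or" in the description. The OPTIONAL clauses then add the period labels that the timeline could use for color coding.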
0:13:06.085,0:13:09.832 So, I was pretty excited, but, still,[br]really hoping for a lot more. 0:13:09.832,0:13:11.619 So, there's a lot of editing to be done 0:13:11.619,0:13:15.098 in terms of these large geological[br]and cosmic time scales. 0:13:15.909,0:13:19.273 You can see on the color code,[br]I can also do extinction period. 0:13:19.273,0:13:23.489 So, I say, I want to find out stuff[br]that went extinct in the late Cretaceous. 0:13:23.489,0:13:25.768 And I now know that two things did that. 0:13:25.768,0:13:27.717 There's obviously quite a few more. 0:13:27.717,0:13:30.483 And I put the taxon rank[br]in there, as well, 0:13:30.483,0:13:31.986 just so that we can also see, 0:13:31.986,0:13:34.588 "Okay, which, what[br]is its species, genus, et cetera." 0:13:35.479,0:13:37.143 So, pretty exciting. 0:13:37.143,0:13:41.192 I was quite happy, but it's unfolding,[br]what needs to be done a lot. 0:13:42.126,0:13:45.447 So I went to the next one, which was-- 0:13:45.447,0:13:48.045 I was thinking, "Well, I can't find[br]all the data I'm looking for. 0:13:48.045,0:13:49.347 Let's go a bit more general, 0:13:49.347,0:13:53.833 and just look for all of a certain kind[br]of dates in Wikidata that I can find 0:13:53.833,0:13:57.240 that are over 10,000 years old, basically. 0:13:58.219,0:14:00.703 And what type of thing are they?" 0:14:00.762,0:14:04.298 So, this color code is relatively okay,[br]but it might be a bit misleading, 0:14:04.298,0:14:06.264 because some things are multiple types. 0:14:06.264,0:14:08.318 So, therefore,[br]it's a bit random, at times. 0:14:08.318,0:14:11.468 But, you get some really[br]fascinating stuff in here. 0:14:11.468,0:14:14.255 I've got for a start--[br]I've got all of the millennia 0:14:14.255,0:14:18.238 that we have in Wikidata,[br]which is, you know, there you go. 0:14:18.238,0:14:21.558 Read about everything that happened[br]in all these different millennia. 
0:14:21.558,0:14:23.629 No pictures for any[br]of these, unfortunately. 0:14:23.629,0:14:26.670 So, there's nothing to really say[br]what happened in them. 0:14:26.670,0:14:29.203 Taxon, which we were just looking at,[br]which kind of led me on 0:14:29.203,0:14:31.124 to the other queries. 0:14:31.124,0:14:34.079 And of course, that sort of[br]like all of them in one group. 0:14:34.079,0:14:36.875 Interesting stuff.[br]Archaeological cultures. 0:14:36.875,0:14:40.121 And this is like, okay,[br]this is more like up my street. 0:14:40.121,0:14:42.670 This is the sort of things[br]I want to learn about. 0:14:42.670,0:14:45.234 Again, pictures would be nice. 0:14:45.493,0:14:48.781 But it's really showing you[br]something interesting. 0:14:48.781,0:14:50.361 And it's just worth exploring here. 0:14:50.361,0:14:52.534 And of course, there's some[br]that really make me excited 0:14:52.534,0:14:54.048 for what we could be doing. 0:14:54.048,0:14:57.288 For example, there was[br]something here which was-- 0:14:58.028,0:15:00.888 I mean, system, actually,[br]was quite an interesting one. 0:15:01.794,0:15:04.237 And sorry, that's not actually[br]the one I was thinking about. 0:15:04.237,0:15:05.958 In fact, that means nothing to me at all. 0:15:05.958,0:15:07.613 Someone might know what that means. 0:15:08.057,0:15:10.813 Art movements,[br]archaeological sites, activities. 0:15:10.813,0:15:12.478 There was only two of these, 0:15:12.478,0:15:15.788 but I really like the idea, because--[br]and they're both the same. 0:15:15.788,0:15:17.658 They're both hunting. 0:15:17.730,0:15:19.390 And of course, there's two of them. 0:15:19.390,0:15:22.360 And the reason is, is because[br]there's a little qualifier on there. 0:15:22.360,0:15:25.143 If we were to just[br]look through, we can see-- 0:15:25.143,0:15:27.735 we can see somewhere down here,[br]will be the start time. 
0:15:27.735,0:15:30.690 And the qualifier is talking about[br]when Homo erectus did it, 0:15:30.690,0:15:32.735 and when Homo sapiens did it. 0:15:32.735,0:15:35.513 So that should be[br]in brackets on the query, 0:15:35.513,0:15:39.002 a little extension to do to show you[br]what the two different versions mean. 0:15:39.002,0:15:42.390 But I would love to see[br]all of human skills in here. 0:15:42.390,0:15:44.708 When did we first do farming,[br]when did we first this-- 0:15:44.708,0:15:46.010 when did fire come about? 0:15:46.010,0:15:48.270 All of these things,[br]when did we first extract iron? 0:15:48.270,0:15:50.355 When did we first--[br]all of these wonderful things 0:15:50.355,0:15:53.607 that developed[br]to modern world that we live in. 0:15:53.607,0:15:56.873 So, really exciting signs[br]of what could be there, 0:15:56.873,0:15:58.112 if it all got populated. 0:15:58.112,0:16:00.210 So, you know, this is what[br]we really need to work on, 0:16:00.210,0:16:02.333 is some of this historical info. 0:16:03.243,0:16:05.060 Last one, I just wanted to just show you, 0:16:05.060,0:16:07.283 which was just an extra[br]bonus one I threw in, 0:16:07.283,0:16:10.875 just to look at the time periods[br]that we actually have, 0:16:10.875,0:16:13.921 the historical ages[br]that we have in Wikidata. 0:16:13.921,0:16:17.524 And so, this is actually just all[br]sub-classes of unit of time. 0:16:17.524,0:16:22.396 And then, this is the actual[br]instance that it was. 0:16:22.396,0:16:23.775 And it's just really interesting. 0:16:23.775,0:16:25.849 This is more the kind of thing-- 0:16:26.979,0:16:29.541 In Histropedia Mark II,[br]these are the kind of things 0:16:29.541,0:16:31.944 that will actually will be displayed[br]more under the timeline 0:16:31.944,0:16:33.984 as a sort of a range or period. 
0:16:33.993,0:16:36.436 And so, we are particularly interested[br]in these periods 0:16:36.436,0:16:37.976 being really tight and nice, 0:16:37.976,0:16:40.718 because it helps you to, then,[br]say what happened when, 0:16:40.718,0:16:43.983 and you can sound really clever[br]when you talk about when things happened, 0:16:43.983,0:16:47.263 in the Neolithic or the upper[br]Paleolithic, or whatever. 0:16:47.263,0:16:49.121 I'm still pretty clueless on most of it, 0:16:49.121,0:16:51.918 because I'm just kind of just waiting[br]for the data to be up to scratch. 0:16:51.918,0:16:55.163 Great. I think I can actually[br]round it up there. 0:16:55.163,0:16:57.145 Loads more exciting queries to come. 0:16:57.145,0:17:00.420 A lot more features and cool stuff,[br]actually, just around the corner for us, 0:17:00.420,0:17:02.758 because we've just finished[br]a lot of cool things. 0:17:02.758,0:17:05.471 But there's a little bit of time[br]to pull it all together. 0:17:05.471,0:17:07.373 So, look out for more. 0:17:07.373,0:17:09.760 If there's any questions,[br]I think I've got one minute. 0:17:09.760,0:17:11.458 So, it would have to be one. 0:17:11.510,0:17:13.253 (host) Yes, Nav.[br]I forgot to introduce you. 0:17:13.253,0:17:16.933 I'm sorry. That's Nav, as he said,[br]Histropedia, Evans. Thank you very much. 0:17:16.933,0:17:17.986 Thank you. Cheers. Yeah. 0:17:17.986,0:17:19.450 (host) Very fast questions. 0:17:19.450,0:17:21.815 Anyone with a very fast question[br][inaudible]. 0:17:24.654,0:17:29.230 (woman 2) Very quickly, how can[br]I do my own, if I want languages, 0:17:29.230,0:17:30.818 when do we start, for instance. 0:17:30.818,0:17:32.031 Absolutely. Good question. 0:17:32.031,0:17:34.320 So just click on the--[br]oh, I've shared this. 0:17:34.320,0:17:36.853 It's called cosmic timelines on the URL. 0:17:36.853,0:17:40.911 Should be cosmic and geological,[br]but then it's not a short URL anymore. 
0:17:40.911,0:17:43.711 So, you click on this icon[br]in the top corner there, 0:17:43.711,0:17:47.431 and then, you get to the query page,[br]which is like the home page of this tool. 0:17:47.431,0:17:49.311 This is where the query is pasted in. 0:17:49.311,0:17:51.491 So, at the moment,[br]I've got the language there. 0:17:51.491,0:17:53.483 If I want to change it to something else, 0:17:53.483,0:17:56.062 Arabic, or French, or whatever-- 0:17:56.062,0:17:58.271 and here are the-- this is the area 0:17:58.271,0:18:03.092 where you sort of enter in exactly[br]which variables in your query 0:18:03.092,0:18:04.600 you would like to do each thing. 0:18:04.600,0:18:06.781 If you put nothing in,[br]it will try and figure it out. 0:18:06.781,0:18:09.971 But if you want advanced stuff--[br]and really important, is the precision, 0:18:09.971,0:18:13.033 because that's not available[br]on the query service timeline. 0:18:13.033,0:18:14.123 So, you get everything-- 0:18:14.123,0:18:16.303 is the first of January[br]10 billion years ago, 0:18:16.303,0:18:18.363 you know, which is not[br]what we want to see. 0:18:18.363,0:18:20.603 And the rank, which is quite interesting. 0:18:20.603,0:18:24.173 My timelines are all based[br]on a very simple rank of site link count, 0:18:24.173,0:18:27.058 how many different articles there are,[br]or something else. 0:18:27.058,0:18:29.432 But that's how you go[br]and mess around with it with yourself, 0:18:29.432,0:18:32.034 and you put your color codes[br]and your filters in down here. 0:18:32.034,0:18:34.098 Comma separate them,[br]if you would like more, 0:18:34.098,0:18:36.007 and they come up as options[br]in the final tool. 0:18:36.007,0:18:37.836 And I think that[br]pretty much is it, isn't it. 0:18:37.836,0:18:39.863 So, any other questions,[br]do find me afterwards. 0:18:39.863,0:18:41.655 Always happy to get cornered[br]for this stuff. 0:18:41.655,0:18:42.954 I love talking about it. 0:18:42.954,0:18:44.989 Okay. 
So, thank you very much. Cheers. 0:18:44.989,0:18:46.948 (applause) 0:19:28.344,0:19:30.220 (mumbles) 0:19:30.265,0:19:32.115 So, where is the first one? 0:19:33.854,0:19:35.397 This one, no. 0:19:45.636,0:19:47.132 This? Sorry. 0:19:48.270,0:19:50.090 Is it full screen? 0:19:50.217,0:19:52.129 Yep. Full screen. 0:19:54.747,0:19:56.289 Well, good work. 0:19:58.388,0:19:59.434 [Strike.] 0:19:59.497,0:20:02.312 Yeah, so, okay. Thank you. 0:20:04.752,0:20:07.062 So, hi, I'm Thibaud Senalada. 0:20:07.062,0:20:08.952 As [inaudible] introduced me. 0:20:09.552,0:20:14.212 I'm a software engineer[br]at the French National Library. 0:20:14.992,0:20:18.349 And I'm here today[br]to talk to you about NOEMI, 0:20:18.979,0:20:23.682 which is a software, a proof of concept, 0:20:23.682,0:20:26.501 and a [inaudible] software 0:20:26.635,0:20:29.961 to the French Library to cataloging. 0:20:30.787,0:20:32.870 Sorry. [inaudible]. 0:20:32.870,0:20:35.359 Sorry for my English. It's a bit of fuzzy. 0:20:36.971,0:20:39.321 And so, what's NOEMI? 0:20:39.321,0:20:41.589 So, NOEMI stands for: 0:20:41.589,0:20:44.591 Nouer les oeuvres, expressions,[br]Manifestations et Items. 0:20:44.591,0:20:46.533 Which, in English, is: 0:20:46.533,0:20:49.891 to link work, expression,[br]manifestation, and items. 0:20:51.086,0:20:58.057 It's based on the FRBR, 0:20:58.057,0:21:00.633 and [inaudible]. 0:21:00.881,0:21:03.105 Yeah. Anyway. 0:21:03.631,0:21:04.839 So, yeah. 0:21:05.244,0:21:09.540 So, this software,[br]we use to produce metadata. 0:21:10.841,0:21:12.201 It will be used 0:21:12.201,0:21:17.831 by 600 people on a daily basis. 0:21:18.911,0:21:24.271 And as I say in the title,[br]it will be based on Wikibase. 0:21:25.415,0:21:31.871 So, there is also a format manager. 0:21:32.388,0:21:39.138 So, people using this software[br]will use like a code editor, 0:21:39.254,0:21:41.817 but for MARC format. 0:21:41.968,0:21:45.178 So, it's [inaudible], things like that. 
0:21:46.814,0:21:49.868 A data processing tool, like I said. 0:21:49.959,0:21:53.040 And also, authorization management, 0:21:54.327,0:21:56.378 because they will need a-- 0:21:57.337,0:22:01.417 if there is some data,[br]where it can be modified. 0:22:05.877,0:22:07.840 So, the PoC context. 0:22:08.728,0:22:12.738 So, this software will be replacing[br]an old software, 0:22:12.855,0:22:15.688 called ADCAT02. 0:22:17.111,0:22:20.964 It is part of the bibliographic[br]transition. 0:22:20.984,0:22:24.554 So, I say the [inaudible]. 0:22:25.359,0:22:29.390 [inaudible]. [inaudible] in English? 0:22:30.254,0:22:31.662 Format. 0:22:32.717,0:22:35.734 And it will be the [inaudible] of the-- 0:22:39.979,0:22:41.090 Sorry. 0:22:42.349,0:22:46.560 It will be [inaudible][br]all the [inaudible] 0:22:46.560,0:22:49.689 of the BnF with data. 0:22:51.731,0:22:54.124 And so, doing this work, 0:22:54.124,0:22:59.693 we assessed Wikibase to see[br]if it fits our needs. 0:23:01.244,0:23:03.383 And [inaudible] pretty good. 0:23:04.485,0:23:06.930 So, why Wikibase? 0:23:06.930,0:23:08.821 Because of the flexibility of the format. 0:23:08.835,0:23:11.646 We arrive-- 0:23:11.850,0:23:16.388 to inject MARC, INTERMARC for BnF-- 0:23:16.960,0:23:18.350 in the database. 0:23:18.399,0:23:22.803 And use it to-- use this link management 0:23:22.803,0:23:25.529 between entities using Blazegraph, 0:23:25.529,0:23:27.776 so, as Wikibase does. 0:23:29.155,0:23:32.700 We also choose Wikibase,[br]because it was already-- 0:23:35.183,0:23:38.900 it handles history and user account. 0:23:39.941,0:23:42.414 So, it's easiest for us. 0:23:43.106,0:23:48.270 And it also has a good--[br]it's pretty easy to create bots 0:23:48.270,0:23:51.090 to watch and curate data 0:23:51.840,0:23:53.430 and also to make statistics. 0:23:54.820,0:23:57.170 It's free and open, and sustainable. 0:23:57.908,0:23:59.084 Yeah, so. 
0:23:59.610,0:24:02.519 I'm sorry if you don't[br]understand what I say, 0:24:02.519,0:24:04.839 because I know my English[br]is not that good. 0:24:07.720,0:24:12.139 But during this PoC,[br]we encountered some trouble. 0:24:12.802,0:24:13.938 Okay. 0:24:14.790,0:24:21.117 First of all, as a search engine,[br]I think we have to create 0:24:21.117,0:24:24.150 another-- 0:24:24.185,0:24:28.988 not another, a supplementary[br]search engine to use it with, 0:24:29.433,0:24:31.120 to fit our needs. 0:24:31.688,0:24:37.155 Because we need some search 0:24:37.155,0:24:42.366 like faceted search and filters. 0:24:43.755,0:24:47.525 Also we have the [inaudible], 0:24:47.525,0:24:50.407 of using PostgreSQL database. 0:24:50.407,0:24:54.885 And for the moment,[br]I think Wikibase [inaudible]. 0:24:56.436,0:25:01.266 And when we try to use PostgreSQL,[br]it was a bit difficult, 0:25:01.266,0:25:04.394 and will cause some issues. 0:25:05.662,0:25:08.825 And we have also some fear[br]about performance, 0:25:08.825,0:25:15.238 because the catalog is about[br]20 million entities, 0:25:16.366,0:25:19.146 20 million bibliographic entities. 0:25:19.146,0:25:22.851 That can be more[br]than 20 million entities, actually. 0:25:23.276,0:25:27.771 And we don't know the time[br]that we'll have to inject them 0:25:27.809,0:25:30.765 in the Wikibase, and how to do it. 0:25:32.198,0:25:34.267 So, [inaudible], 0:25:34.324,0:25:39.616 but the real software development[br]has already started. 0:25:43.242,0:25:46.175 We start by creating[br]an interface with Wikibase. 0:25:46.261,0:25:47.711 We're using Java. 0:25:48.091,0:25:50.093 Like PyWikibase. 0:25:51.691,0:25:54.888 - (man) Pywikibot.[br]- Pywikibot. Yeah, thank you. 0:25:56.027,0:25:57.723 The same way, but in Java. 0:25:59.309,0:26:02.831 We also inject already the format[br]into the Wikibase. 0:26:03.540,0:26:09.093 And we do something[br]like the INTERMARC editor, 0:26:09.458,0:26:12.134 [inaudible], et cetera. 
0:26:13.672,0:26:14.926 Thank you. 0:26:15.333,0:26:17.135 (applause) 0:26:23.527,0:26:24.749 Yeah. 0:26:27.748,0:26:29.813 (man 2) Faceted search[br]will be a nice feature 0:26:29.813,0:26:31.885 in the Wikidata UI itself. 0:26:31.924,0:26:34.062 So, have you talked[br]to any of the developers, 0:26:34.062,0:26:35.675 or is that something[br]that could be done? 0:26:35.711,0:26:37.108 Sorry, I don't understand. 0:26:37.108,0:26:39.041 (man 2) The faceted search idea. 0:26:39.911,0:26:41.982 It would be nice to be able[br]to search only humans, 0:26:41.982,0:26:44.221 or search only works, or something, right? 0:26:44.321,0:26:47.991 Yeah. I'm sorry, I don't-- I don't-- 0:26:48.131,0:26:50.436 (man 2) Yeah, I mean, so,[br]it would be nice if we had that 0:26:50.436,0:26:52.265 in Wikidata itself in the UI. 0:26:52.822,0:26:53.954 Yeah, yeah, yeah. 0:26:54.088,0:26:56.077 [inaudible] 0:26:56.077,0:26:57.911 Yeah, okay, thank you. 0:26:57.911,0:27:00.026 I'm sorry. (laughs) 0:27:01.186,0:27:03.902 Yeah, yeah. But I think we will-- 0:27:04.506,0:27:07.266 I don't know if we want[br]to do it inside Wikibase, 0:27:07.266,0:27:10.746 or in our next systems. 0:27:10.785,0:27:15.186 For the moment,[br]we don't really solve that. 0:27:15.965,0:27:17.885 For the moment, I think. 0:27:17.885,0:27:19.285 Sorry. 0:27:27.645,0:27:30.644 (man 3) I suppose on the topic[br]of the faceted search, 0:27:32.535,0:27:35.068 Wikidata, SPARQL Query, Wikibase-- 0:27:35.068,0:27:38.965 SPARQL Query is I think,[br]functionally equivalent 0:27:38.965,0:27:41.405 to a facetable search. 0:27:42.105,0:27:44.234 So, it's mostly an interface issue, right? 0:27:44.284,0:27:47.791 I mean, you could build an interface[br]that starts with a query, 0:27:47.791,0:27:51.111 and then, gives you[br]possible facets to filter by. 0:27:51.370,0:27:52.660 And when you click one of them, 0:27:52.660,0:27:55.217 it adds a condition[br]to the SPARQL Query, right? 
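[Editor's note] The idea in this question, an interface that starts with a query and adds one condition per selected facet, might look roughly like the sketch below. It is an illustration only, not code from NOEMI, Wikibase or Wikidata. The function name `build_faceted_query` and its parameters are hypothetical. The example values assume the Wikidata identifiers P31 (instance of), Q5 (human), P106 (occupation) and Q36180 (writer). The code only assembles the query text; running it against an endpoint is left out.

```python
def build_faceted_query(base_patterns, facets, limit=50):
    """Return a SPARQL query: the base patterns plus one triple per facet.

    base_patterns: list of SPARQL triple patterns that define the starting set.
    facets: dict mapping a property ID to the item ID the user clicked.
    """
    patterns = list(base_patterns)
    for prop, value in facets.items():
        # Each selected facet narrows the result set with one more condition.
        patterns.append(f"?item wdt:{prop} wd:{value} .")
    body = "\n  ".join(patterns)
    return f"SELECT ?item WHERE {{\n  {body}\n}}\nLIMIT {limit}"


if __name__ == "__main__":
    # Start with "only humans", then apply a hypothetical occupation facet.
    base = ["?item wdt:P31 wd:Q5 ."]
    print(build_faceted_query(base, {}))
    print(build_faceted_query(base, {"P106": "Q36180"}))
```

This covers exact-match facets only. It does not address the free-text search and the qualifier-level filtering raised in the answer that follows.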
0:27:55.664,0:27:58.183 Yeah, but I think the SPARQL-- 0:27:59.157,0:28:04.310 they don't go as detailed[br]as we want, as we have-- 0:28:05.632,0:28:09.631 When we inject the format,[br]we use a statement for-- 0:28:10.525,0:28:13.124 the format is like XML. 0:28:13.223,0:28:15.842 So, it's a zone, subzone, and value. 0:28:16.413,0:28:20.292 And in the [inaudible] statement,[br]we add the subzone, 0:28:20.892,0:28:22.902 because the zone was already there. 0:28:23.002,0:28:28.565 And we want to query[br]some qualifier on this. 0:28:28.659,0:28:35.206 And I don't know if the SPARQL[br]goes through that-- I'm sorry-- 0:28:36.145,0:28:38.277 in a fast way. 0:28:40.025,0:28:46.285 I think we need some index[br]for us to [inaudible]. 0:28:46.925,0:28:48.145 Yeah. 0:28:48.145,0:28:50.250 (man 3) SPARQL doesn't do a query-- 0:28:52.321,0:28:55.703 To do proper string searches[br]in SPARQL is very hard. 0:28:55.703,0:28:57.610 You have to have filters, which are slow, 0:28:57.610,0:28:59.815 and it really doesn't work that well. 0:28:59.815,0:29:02.845 So, it's a different[br]search problem, really. 0:29:06.871,0:29:09.270 More question? If anyone has one? 0:29:12.215,0:29:13.999 - Great. Thank you.[br]- Thank you. 0:29:14.044,0:29:15.895 (applause) 0:29:37.766,0:29:41.960 (host) Nielsen speaking about[br]the tool Ordia. Thank you. 0:30:05.084,0:30:06.460 So, I'm Finn Årup Nielsen, 0:30:06.460,0:30:09.006 and a couple of years ago,[br]I started Scholia 0:30:09.006,0:30:14.611 that displays data from Wikidata[br]via a SPARQL Query 0:30:14.611,0:30:16.359 to the Wikidata Query Service 0:30:16.359,0:30:18.959 so we can generate, for example,[br]a list of publications 0:30:18.959,0:30:20.380 for a specific author. 0:30:20.866,0:30:26.941 Now, last year, Wikidata[br]introduced lexicographic data. 
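As an illustration of the kind of query described here, a list of publications for a specific author, the following is a minimal sketch. It assumes that authorship is recorded with the Wikidata property P50 ("author"); the helper name `publications_query` is hypothetical, and this is not Scholia's actual query, which is not shown in the talk.

```python
import re

# Item identifiers look like "Q" followed by digits.
QID_PATTERN = re.compile(r"^Q[1-9][0-9]*$")


def publications_query(author_qid: str, limit: int = 100) -> str:
    """Return a SPARQL query listing works whose author (P50) is the given item."""
    if not QID_PATTERN.match(author_qid):
        raise ValueError(f"not a Wikidata item ID: {author_qid!r}")
    return (
        "SELECT ?work ?workLabel WHERE {\n"
        f"  ?work wdt:P50 wd:{author_qid} .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }\n'
        "}\n"
        f"LIMIT {limit}"
    )


if __name__ == "__main__":
    # Q42 is used purely as a placeholder author item.
    print(publications_query("Q42"))
```

The resulting text could be sent to the Wikidata Query Service; the sketch stops at building the query so that it stays runnable without network access.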
0:30:29.332,0:30:32.655 And I [inaudible] the idea of Scholia 0:30:32.655,0:30:39.279 that is using Wikidata[br]and the Wikidata Query Service 0:30:39.445,0:30:42.036 to generate overviews[br]of lexicographic data. 0:30:42.585,0:30:46.125 So, Ordia is the example of this one here. 0:30:46.197,0:30:51.998 So, it generates-- it's a web application[br]run from the Toolforge service, 0:30:51.998,0:30:57.198 and for example, it will dynamically[br]generate a page such as-- 0:30:57.234,0:31:01.768 This one here is statistics over[br]what there is of lexicographic data 0:31:01.768,0:31:03.841 in Wikidata. 0:31:03.992,0:31:07.404 For example, the number of lexemes[br]is currently over 200,000. 0:31:08.664,0:31:10.483 So, there's a range of things[br]you can do here. 0:31:10.483,0:31:12.916 You can, for example,[br]look in the aspects of that. 0:31:12.916,0:31:15.560 The menu, there's quite a lot[br]of things here. 0:31:15.560,0:31:18.485 And so, I will search[br]on a specific Danish lexeme. 0:31:19.503,0:31:22.835 "Rød"-- which is "red" in Danish. 0:31:23.376,0:31:27.466 So, you basically get,[br]for the specific lexeme, 0:31:28.286,0:31:30.618 the same type of information[br]that you could see 0:31:30.618,0:31:33.751 in the ordinary part of Wikidata, here. 0:31:34.451,0:31:38.256 Annotations about the lexeme,[br]annotations about the forms, 0:31:39.359,0:31:40.872 singular or plural forms. 0:31:41.548,0:31:43.501 Annotations about the senses. 0:31:44.683,0:31:47.678 But what you can't see[br]in ordinary Wikidata 0:31:47.678,0:31:52.150 is sort of aggregating across lexemes. 0:31:52.246,0:31:54.207 And this is, for example, down here-- 0:31:54.207,0:31:55.902 down here with the compound. 0:31:55.902,0:31:57.764 So, in Danish, like in German, 0:31:57.764,0:31:59.950 words can be compounded. 0:31:59.950,0:32:03.478 For example, for "red",[br]we have rødkælk 0:32:03.478,0:32:05.830 which is compounded from two words.
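The aggregation across lexemes described here, finding every compound that contains a given word, could be expressed as a query along the following lines. This is a sketch under assumptions: it supposes that compounds are linked to their parts with the Wikidata property P5238 ("combines lexemes"), the function name `compounds_query` is hypothetical, and the lexeme ID in the example is a placeholder, since the ID of "rød" is not given in the talk.

```python
import re

# Lexeme identifiers look like "L" followed by digits.
LEXEME_ID = re.compile(r"^L[1-9][0-9]*$")


def compounds_query(part_lexeme_id: str) -> str:
    """Return a SPARQL query for lexemes that combine the given lexeme (P5238)."""
    if not LEXEME_ID.match(part_lexeme_id):
        raise ValueError(f"not a lexeme ID: {part_lexeme_id!r}")
    return (
        "SELECT ?compound ?lemma WHERE {\n"
        f"  ?compound wdt:P5238 wd:{part_lexeme_id} ;\n"
        "            wikibase:lemma ?lemma .\n"
        "}\n"
        "ORDER BY ?lemma"
    )


if __name__ == "__main__":
    # "L123" is a placeholder, not the real ID of the lexeme "rød".
    print(compounds_query("L123"))
```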
0:32:06.721,0:32:10.085 And we've got, on the second one here,[br]rødvin-- red wine. 0:32:11.060,0:32:15.691 This list here is constructed[br]by a SPARQL Query to the Wikidata Service. 0:32:16.751,0:32:20.406 And also, further down here,[br]we've got a lot of Danish words here. 0:32:20.970,0:32:26.122 Further down here, we should have[br]a graph of the words 0:32:27.426,0:32:29.164 which are compounded from rød. 0:32:29.658,0:32:31.980 We have [rød]-- red here in the middle. 0:32:31.980,0:32:34.372 And for example, around--[br]somewhere around here, 0:32:34.372,0:32:36.895 which should have,[br]for example, "red cabbage," 0:32:36.936,0:32:40.343 "red cabbage salad,"[br]"red cabbage soup," and so on. 0:32:40.434,0:32:43.055 So you can browse around,[br]in this one here, and see it. 0:32:44.204,0:32:51.188 We can go a bit back here,[br]and then look on the main sense 0:32:51.388,0:32:55.030 of the word rød-- red in Danish. 0:32:55.550,0:33:01.610 So, Ordia automatically generates[br]information about hyponyms. 0:33:02.570,0:33:04.400 Subconcepts, for example, 0:33:04.400,0:33:07.400 light red, dark red,[br]pink, purple, and so on, 0:33:07.525,0:33:14.272 are in the-- when we make[br]a Wikidata Query service, SPARQL Query. 0:33:14.576,0:33:20.570 Then we go around in the Wikidata graph, 0:33:20.626,0:33:22.266 and get this information here. 0:33:22.266,0:33:24.786 And we can also get translation[br]automatically, 0:33:24.786,0:33:28.316 even though it's not necessarily stated[br]within the Wikidata lexemes items. 0:33:28.316,0:33:32.679 For example, here, we have translated[br]rød to "red" in English, 0:33:32.679,0:33:36.089 and röd in Swedish, and so on. 0:33:36.107,0:33:38.191 There's not that very many there. 0:33:38.747,0:33:40.262 There's a range of other things here. 0:33:40.262,0:33:43.487 Let me show you,[br]for example, this one here-- 0:33:44.387,0:33:51.308 this is veninde- now I go[br]over to this one here. 
0:33:54.308,0:33:57.328 -inde, which is a feminine suffix. 0:33:58.058,0:34:00.498 So, this is auto-generated there, 0:34:00.498,0:34:02.641 it's a combination of "instance of"-- 0:34:03.268,0:34:07.171 lexemes that are "instance of"[br]feminine suffixes. 0:34:08.142,0:34:11.519 And for example, for German,[br]we have [inaudible]. 0:34:11.519,0:34:15.373 So, -in would be[br]a feminine suffix in German. 0:34:15.704,0:34:21.291 And I put in sort of the five Danish[br]feminine suffixes 0:34:22.571,0:34:24.206 of Danish. 0:34:25.480,0:34:29.106 Another facility is, for example,[br]if you have a text, 0:34:29.106,0:34:34.021 you can copy and paste it[br]into this Text to lexemes here. 0:34:34.571,0:34:35.911 Let me-- 0:34:37.482,0:34:41.218 "a car crashed into... 0:34:41.864,0:34:44.141 a green house." 0:34:46.485,0:34:48.701 Let me change that to "English". 0:34:49.006,0:34:50.029 Press Submit. 0:34:50.029,0:34:53.355 Now, Ordia will then extract[br]each of the word here, 0:34:53.355,0:34:54.733 in this sentence here, 0:34:54.733,0:34:58.217 and try to see whether they[br]are entered in the specific form, 0:34:58.217,0:35:00.778 a lexeme, are entered in Wikidata. 0:35:00.778,0:35:04.228 And these simple words here[br]are entered in Wikidata. 0:35:04.228,0:35:09.190 But if we, for example, change it to--[br]there's nothing called "vancar" 0:35:09.190,0:35:13.998 but just let us do that here. 0:35:14.535,0:35:19.532 And you got down here--[br]it's as a blue link 0:35:20.335,0:35:23.295 that you can create a new[br]Wikidata lexeme item. 0:35:24.556,0:35:29.097 But the range of other things to explore 0:35:29.716,0:35:31.496 in this web application. 0:35:31.496,0:35:35.596 And if there's any suggestions,[br]or comments, or notes, or something, 0:35:35.596,0:35:39.337 you can contact me, or put in[br]an issue on GitHub. 
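The "Text to lexemes" step shown here, splitting a sentence into words and checking which of them exist as forms in Wikidata, might look roughly like the sketch below. It is not Ordia's code: the function names are hypothetical, the tokenizer is deliberately naive, and the query assumes the `ontolex:lexicalForm` and `ontolex:representation` predicates used for lexeme forms in the Wikidata Query Service.

```python
import re
from typing import Iterable, List, Set


def tokenize(text: str) -> List[str]:
    """Lowercase the text and return its distinct words in order of first appearance."""
    words = re.findall(r"\w+", text.lower())
    return list(dict.fromkeys(words))


def forms_query(words: Iterable[str], language_code: str) -> str:
    """Return a SPARQL query matching the words against lexeme form representations."""
    values = " ".join(f'"{word}"@{language_code}' for word in words)
    return (
        "SELECT ?word ?lexeme WHERE {\n"
        f"  VALUES ?word {{ {values} }}\n"
        "  ?lexeme ontolex:lexicalForm ?form .\n"
        "  ?form ontolex:representation ?word .\n"
        "}"
    )


def missing_words(words: Iterable[str], found: Set[str]) -> List[str]:
    """Words with no matching form; these would get a 'create lexeme' link."""
    return [word for word in words if word not in found]


if __name__ == "__main__":
    tokens = tokenize("A vancar crashed into a green house.")
    print(forms_query(tokens, "en"))
    # Pretend the query found every word except the made-up "vancar".
    print(missing_words(tokens, {"a", "crashed", "into", "green", "house"}))
```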
0:35:39.337,0:35:44.856 So, this particular application[br]is developed on GitHub, 0:35:44.856,0:35:50.526 and I'm open for new ideas[br]and ways to represent information there. 0:35:51.306,0:35:52.701 Okay, thank you. 0:35:52.701,0:35:54.661 (applause) 0:35:59.328,0:36:00.906 Questions? 0:36:03.262,0:36:04.524 (woman 3) I love your tool. 0:36:04.524,0:36:09.752 Can you show the languages,[br]which is awesome for me, I think, 0:36:09.752,0:36:11.731 to show other languages. 0:36:12.183,0:36:14.537 So, this is a bit of statistics[br]over the languages, 0:36:14.537,0:36:17.046 and the Russians[br]have been scraping Wiktionary, 0:36:17.046,0:36:20.327 and that's why they have now[br]100,000 lexemes. 0:36:24.387,0:36:28.088 There's also a lot of work on Basque here. 0:36:29.566,0:36:32.241 I think there's an organization[br]putting that information in here. 0:36:32.241,0:36:34.932 And you can also see a graph of these-- 0:36:34.932,0:36:37.662 this is Number of forms as a function[br]of number of lexemes. 0:36:38.798,0:36:41.279 And all the way up here-- 0:36:41.279,0:36:45.255 here, this is Russian,[br]down here, Basque, I think. 0:36:45.476,0:36:47.997 And English, perhaps, down here. 0:36:48.953,0:36:50.692 And also in the Number of senses, 0:36:52.473,0:36:58.360 I think Basque, English, and Russian, 0:37:00.184,0:37:02.048 Hebrew, and so on. 0:37:02.048,0:37:03.343 Yeah. 0:37:11.045,0:37:12.950 (man 4) That looks[br]like an incredible tool. 0:37:12.950,0:37:15.097 But I was just wondering,[br]is it all fully live? 0:37:15.097,0:37:18.344 Is it all based on SPARQL Queries[br]and live or are there some things-- 0:37:18.344,0:37:20.458 - Yes. I believe, yes.[br]- Fantastic. 0:37:20.511,0:37:24.961 But as they get more data into Wikidata, 0:37:24.961,0:37:26.100 there's a bit of an issue. 0:37:26.100,0:37:27.328 For example, for Russian here.
0:37:27.328,0:37:31.966 I started this out a year ago[br]when there were not that many lexemes, 0:37:32.061,0:37:35.503 and so there were no problems[br]with the time-outs. 0:37:35.503,0:37:38.367 But representing it here-- 0:37:38.367,0:37:42.268 but if I press Russian,[br]I think there might be some issues. 0:37:42.268,0:37:44.284 There's a count that works here, 0:37:44.284,0:37:46.101 for example, longest words or phrases. 0:37:46.101,0:37:49.252 But I think the lexemes[br]are sort of loading in. 0:37:49.252,0:37:52.727 I think I'll need to fix that[br]as Wikidata grows here. 0:37:53.258,0:37:55.927 As you see, there's a lot[br]of Russian nouns, apparently. 0:37:56.699,0:37:58.451 And I don't know whether the-- 0:37:59.351,0:38:01.519 apparently, that's what[br]they're working on. 0:38:01.573,0:38:03.960 There seems also to be[br]a bit of time-out there. 0:38:06.705,0:38:08.033 [inaudible], oh, yes. 0:38:08.115,0:38:09.984 The first one there. 0:38:10.832,0:38:16.110 But apparently, the longest words[br]and phrases is a bit too expensive. 0:38:17.931,0:38:20.334 But apparently, it can be loaded there,[br]and it's probably-- 0:38:21.318,0:38:23.167 it's loaded all the 100,000 there, 0:38:23.167,0:38:27.938 so you can click all 10,000 pages. 0:38:36.748,0:38:38.678 (host) If there aren't[br]any other questions-- 0:38:39.564,0:38:40.950 The longest word came now. 0:38:40.950,0:38:43.146 So, it's, yeah. 0:38:44.972,0:38:46.390 Probably-- 0:38:47.855,0:38:49.975 [inaudible] 0:38:50.321,0:38:51.540 What is that? 0:38:51.540,0:38:53.518 - (audience) It's a chemical.[br]- A chemical, yes. 0:38:56.317,0:38:58.303 (host) More questions? Or shall we? 0:38:59.792,0:39:02.332 Alright, alright. Thank you very much. 0:39:02.332,0:39:04.392 (applause) 0:39:23.642,0:39:25.121 (Nicolas) Is it good? 0:39:31.008,0:39:32.346 (host) Awesome.
0:39:34.920,0:39:38.137 Alright, now, to wrap it up,[br]we have Nicolas Vigneron, 0:39:38.137,0:39:40.778 talking about Wikisource and Wikidata. 0:39:41.469,0:39:42.804 (Nicolas) This is good? 0:39:44.542,0:39:46.126 Who knows Wikisource? 0:39:47.582,0:39:48.959 Yay! 0:39:50.740,0:39:53.582 More and more people[br]raising hands every year. 0:39:53.582,0:39:54.957 That's good. 0:39:55.282,0:40:01.462 So, this morning, [Lydia] said that[br]Wikivoyage was the first real user of-- 0:40:03.306,0:40:05.987 [inaudible] 0:40:06.572,0:40:08.347 Wikisource is not that far behind. 0:40:09.230,0:40:13.280 There's a lot to do,[br]and I want to do some basic numbers, 0:40:13.280,0:40:16.964 statistics, about where we are,[br]and where I want to go. 0:40:17.613,0:40:23.409 So first, there will be a lot of questions[br]of what is a book, 0:40:23.409,0:40:25.389 what is bibliographical data. 0:40:25.389,0:40:27.229 People from the BnF can agree with me. 0:40:27.229,0:40:29.969 That can be a nightmare[br]if you go into details. 0:40:30.164,0:40:35.803 But some big numbers that--[br]Google Books tried to do an estimation 0:40:35.803,0:40:39.676 on how many "books," air quote books,[br]there is in the world, 0:40:39.676,0:40:43.005 and there's 130 million books[br]in the world. 0:40:43.705,0:40:47.279 And, yeah, let's put them all on Wikidata. 0:40:47.650,0:40:49.300 Or not. I don't know. 0:40:49.392,0:40:51.049 But where are we now? 0:40:51.413,0:40:52.468 And why is it books? 0:40:52.468,0:40:55.668 Because for Google Books,[br]everything is scanned, basically. 0:40:55.795,0:40:58.670 They don't have exactly[br]a very clear distinction. 0:40:59.400,0:41:04.350 There's sometimes, two-page books,[br]which [inaudible], Google Books is a book. 0:41:04.714,0:41:10.131 But for many people, you have to have[br]at least 50 pages to be a book. 0:41:10.536,0:41:12.321 So, that's always hard to count. 0:41:12.885,0:41:15.603 But here's what we know on Wikidata. 
0:41:15.603,0:41:18.704 This is the graph of what[br]is a book for Wikidata. 0:41:18.704,0:41:21.524 You have-- that's totally [inaudible]-- 0:41:21.524,0:41:23.979 but that's Wikidata,[br]literary work as well. 0:41:23.979,0:41:27.194 And this is all the subclasses,[br]or subclasses of subclasses-- 0:41:27.194,0:41:30.334 or subclasses of subclasses[br]of what is a book. 0:41:30.804,0:41:32.705 So, that's very hard to do. 0:41:32.737,0:41:34.253 I can do a graph like that, 0:41:34.253,0:41:36.833 but the SPARQL Query engine doesn't work 0:41:36.833,0:41:41.523 if I want to count everything[br]that is an instance of these subclasses, 0:41:41.523,0:41:45.143 and basically, SPARQL says no, time-out. 0:41:45.633,0:41:47.020 So, what's the problem? 0:41:47.020,0:41:50.713 But I know already that there's[br]a lot of subclasses, 0:41:50.713,0:41:52.153 but we need to look into it. 0:41:52.153,0:41:57.943 And probably, if you know Wikidata,[br]on the page, Wikidata:Statistics, 0:41:58.643,0:42:02.647 you have all the numbers by big classes, 0:42:02.647,0:42:07.047 and you all probably know[br]that the big chunk here 0:42:07.047,0:42:08.642 is scholarly articles, 0:42:08.707,0:42:12.749 which is, thanks to[br]the WikiCite project, in particular, 0:42:14.113,0:42:17.125 which can be books or not,[br]depending on definition. 0:42:19.062,0:42:22.508 You see that there's no subclass books, 0:42:23.032,0:42:26.034 because there's not enough to show. 0:42:26.049,0:42:28.472 It's probably somewhere in the others, 0:42:28.472,0:42:30.127 the purple area is others. 0:42:30.163,0:42:34.115 And there's a lot of things[br]that's under one percent. 0:42:34.162,0:42:38.821 So, basically, we can say[br]that we have less than one percent 0:42:38.821,0:42:42.131 of things identified as books in Wikidata. 0:42:42.551,0:42:46.091 Maybe there are more books,[br]but not identified as such.
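The count that times out here, everything that is an instance of "book" or of any of its subclasses, corresponds to a property-path query like the first one sketched below. The second and third functions illustrate one possible workaround, listing the subclasses first and then counting one class at a time; that split is an assumption added for illustration, not something proposed in the talk, and it is not guaranteed to avoid time-outs either. Q571 ("book"), P31 ("instance of") and P279 ("subclass of") are standard Wikidata identifiers; the function names are hypothetical.

```python
BOOK = "Q571"  # Wikidata item for "book"


def count_all_query(root: str = BOOK) -> str:
    """Single query counting instances of root or of any subclass (may time out)."""
    return (
        "SELECT (COUNT(?item) AS ?count) WHERE {\n"
        f"  ?item wdt:P31/wdt:P279* wd:{root} .\n"
        "}"
    )


def subclasses_query(root: str = BOOK) -> str:
    """Query listing root and all of its direct and indirect subclasses."""
    return (
        "SELECT DISTINCT ?class WHERE {\n"
        f"  ?class wdt:P279* wd:{root} .\n"
        "}"
    )


def count_for_class_query(class_id: str) -> str:
    """Smaller query counting only the direct instances of one class."""
    return (
        "SELECT (COUNT(?item) AS ?count) WHERE {\n"
        f"  ?item wdt:P31 wd:{class_id} .\n"
        "}"
    )


if __name__ == "__main__":
    print(count_all_query())
    print(subclasses_query())
    print(count_for_class_query(BOOK))
```

Summing the per-class counts would count an item twice if it is an instance of two of the classes, so the result would only be an upper bound on the number of distinct items.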
0:42:47.842,0:42:49.284 I'm talking about books, 0:42:49.383,0:42:51.768 but when we are talking[br]about bibliographical data, 0:42:51.768,0:42:53.920 there's also the author, person, 0:42:53.920,0:42:58.472 so maybe some of the humans here[br]are also authors, surely. 0:43:00.068,0:43:03.221 And we need to do another count,[br]which is another big query to do. 0:43:03.602,0:43:05.301 That times out, so-- 0:43:05.396,0:43:08.015 I have a lot of missing numbers[br]for this, sorry. 0:43:10.619,0:43:14.332 So, yeah, basically, this first slide[br]is about how it's complicated 0:43:14.332,0:43:19.122 to know how much we have of what,[br]and how to count them. 0:43:19.445,0:43:21.091 So, yeah, hard to count. 0:43:21.618,0:43:23.280 What we know-- 0:43:24.133,0:43:26.618 that is we have a lot of properties-- 0:43:27.185,0:43:29.684 7,000, I guess, 0:43:30.208,0:43:31.680 now on Wikidata. 0:43:32.593,0:43:35.952 We know that we have a lot of identifiers[br]among these properties. 0:43:36.721,0:43:42.538 And we know that almost 4,000[br]are properties for identifiers 0:43:43.146,0:43:45.623 relative to bibliographical, 0:43:45.737,0:43:49.862 like ID at the National Library of France, 0:43:49.862,0:43:52.251 National Library of yada, yada, yada, 0:43:52.251,0:43:56.681 because we love identifiers[br]of National Libraries on Wikidata. 0:43:56.681,0:44:00.271 So, we have almost all libraries,[br]national libraries and more. 0:44:01.101,0:44:03.796 So, we have a lot of properties.[br]I know that. 0:44:05.071,0:44:06.727 And they are widely used. 0:44:06.834,0:44:10.053 I know that, for instance,[br]BnF properties use-- 0:44:10.579,0:44:12.772 BnF is National Library of France-- 0:44:12.772,0:44:18.989 is used 1 million times--[br]OCLC, VIAF, or the big ones like that. 0:44:21.001,0:44:24.202 A lot of uses in Wikidata. 0:44:25.426,0:44:28.980 But it's not because we have[br]a lot of uses of various properties 0:44:28.980,0:44:30.666 in Wikidata that it's complete.
0:44:31.266,0:44:33.758 As Thibaud said, there's more[br]than 20 million books, 0:44:33.758,0:44:37.099 [inaudible], which is more as entities. 0:44:37.837,0:44:39.569 And we have only 1 million, 0:44:39.569,0:44:43.538 so we have 19 million still to do. 0:44:45.177,0:44:47.276 Also, what we know from the Wikidata side, 0:44:47.276,0:44:51.918 is that we have a good--[br]very quite active Wikidata project, 0:44:51.918,0:44:53.840 called WikiProject Books, 0:44:54.332,0:44:58.127 where we have a model we kind of agree on, 0:44:58.181,0:45:00.916 which is not always followed,[br]which is, again, a problem. 0:45:00.956,0:45:02.710 What is a book? You know it. 0:45:03.414,0:45:05.385 I only have five minutes,[br]so, I'll keep going. 0:45:06.090,0:45:08.880 And then, I'm a Wikisourcean,[br]so, Wikisourcer. 0:45:09.426,0:45:11.930 So, I wanted to know[br]the other way around 0:45:11.930,0:45:13.496 what is from Wikisource already, 0:45:13.496,0:45:16.406 because Wikisource is already[br]inside the Wikimedia project. 0:45:16.406,0:45:19.883 A lot of bibliographical records[br]and information. 0:45:19.883,0:45:23.161 So, in the 66 million items on Wikidata, 0:45:23.161,0:45:28.850 already 1 million are linked[br]to Wikisource. 0:45:29.330,0:45:31.890 [inaudible]. 0:45:32.350,0:45:36.080 So, that's very few,[br]but that's quite a lot. 0:45:37.496,0:45:40.174 There's a lot of author. 0:45:40.174,0:45:44.670 There's some books, texts,[br]work, edition, whatever. 0:45:45.271,0:45:48.425 Not always well-arranged. 0:45:48.869,0:45:50.600 And there's a lot of internal pages, 0:45:50.600,0:45:53.150 like categories and templates,[br]and things like that. 0:45:53.194,0:45:54.984 But still, 1 million in total. 0:45:58.329,0:46:01.767 The Wikisource community[br]are often small communities, 0:46:01.767,0:46:05.010 like on the French community Wikisource, 0:46:05.010,0:46:07.537 which is one of the biggest,[br]there's 50 people. 0:46:07.537,0:46:08.787 That's the biggest we have. 
0:46:09.047,0:46:12.937 So, we love Wikidata, because,[br]hey, they did a lot of work for us. 0:46:12.942,0:46:15.131 So, just take it from Wikisource. 0:46:15.131,0:46:19.885 So, in this small community,[br]we love to reuse Wikidata data. 0:46:20.935,0:46:24.076 Right now, we use a lot of a tool[br]which is called WEF-- 0:46:24.358,0:46:27.978 Wikidata Edit Framework-- thank you. 0:46:29.318,0:46:33.098 And we are eager to see[br]how Wikidata Bridge will work. 0:46:33.438,0:46:36.798 And we are trying to do things[br]with a team in Wikidata, 0:46:37.638,0:46:40.678 the Wikimedia Deutschland team,[br][inaudible]. 0:46:41.007,0:46:43.934 And there's a lot[br]of collaboration in the future 0:46:43.934,0:46:46.586 that we want to do: better integrate, 0:46:47.636,0:46:51.068 do everything in one click when you import[br]a first book in Wikisource, 0:46:51.068,0:46:52.465 things like that. 0:46:53.364,0:46:57.664 Better-- do links between[br]editions in Wikidata. 0:46:57.852,0:46:59.492 That needs to be done. 0:47:00.041,0:47:02.282 The Foundation is doing the wish list now, 0:47:02.282,0:47:04.853 and we have a lot of requests about that. 0:47:05.938,0:47:07.342 And yeah, that's it. 0:47:07.342,0:47:09.116 That was just a short overview. 0:47:09.116,0:47:15.272 So, if you have some questions,[br]I'll take them and be available later, 0:47:15.712,0:47:17.112 if you want to. 0:47:17.723,0:47:19.722 (applause) 0:47:25.639,0:47:28.281 Come on, you love Wikisource,[br]you have questions! 0:47:33.989,0:47:35.775 (woman 4) I asked you[br]already this in August, 0:47:35.775,0:47:38.411 and I wonder if this has already changed. 0:47:38.411,0:47:42.337 What is the biggest problem you have[br]in Wikisource right now, 0:47:42.337,0:47:43.761 from your perspective? 0:47:44.167,0:47:45.670 The first one, only?
(chuckles) 0:47:48.314,0:47:54.152 I think because it's a small community,[br]we need efficient tools that work easily, 0:47:54.152,0:47:57.148 because we have very few people, 0:47:57.148,0:47:59.464 so we need tool that are easy to use 0:47:59.464,0:48:04.247 and a one-click solution[br]to [inaudible] a bit, 0:48:04.371,0:48:05.607 that's a big dream. 0:48:05.607,0:48:07.179 I think that's what's most important, 0:48:07.179,0:48:10.485 because that's the threshold[br]in Wikisource, it's a small community. 0:48:11.204,0:48:13.241 I think this is the most important. 0:48:14.615,0:48:15.975 [inaudible] 0:48:16.867,0:48:19.600 (man 5) I'm curious if you can speak[br]to your opinion, 0:48:19.600,0:48:23.154 or the French Wikisource opinion,[br]or maybe you spoke to other communities 0:48:23.154,0:48:29.834 about the notion of not including[br]metadata about all the world's books. 0:48:30.234,0:48:31.635 That was mentioned in the morning. 0:48:31.635,0:48:34.965 Maybe other Wikibases,[br]and other federated databases 0:48:34.965,0:48:38.026 will have that information,[br]and Wikidata won't. 0:48:39.159,0:48:41.494 How does that feel for Wikisource? 0:48:43.981,0:48:45.502 This is my very personal opinion. 0:48:45.502,0:48:47.386 I know that people[br]in the Wikisource community 0:48:47.386,0:48:48.723 disagree with that. 0:48:48.723,0:48:50.537 But I think we need to stay-- 0:48:50.537,0:48:53.194 an external Wikibase[br]is not a good solution, 0:48:53.194,0:48:55.353 because we have Shakespeare on Wikisource, 0:48:55.353,0:48:58.323 and we have Shakespeare on Wikipedia. 0:48:58.564,0:49:01.295 So, we need to interlink,[br]and interlink is there. 0:49:01.295,0:49:04.007 Or like, Romeo and Juliet,[br]we have them both. 0:49:04.007,0:49:07.229 So, we are still pretty close[br]to Wikipedia. 0:49:07.433,0:49:09.431 And the difference with WikiCites-- 0:49:09.431,0:49:12.515 with WikiCite, we have a lot of items[br]which are small. 
0:49:14.372,0:49:16.051 Wikisource is the other way around. 0:49:16.150,0:49:18.281 We have few items, who are big. 0:49:18.281,0:49:20.515 Which can be a scaling problem[br]and everything, 0:49:20.515,0:49:23.615 but it's quite a small subset of data. 0:49:23.683,0:49:27.539 So, my personal opinion[br]is we should stay in the Wikidata. 0:49:28.391,0:49:32.117 Again, because we are not[br]very much a lot of people, 0:49:32.117,0:49:34.287 so we need to stay,[br]with the tool we know, 0:49:34.287,0:49:35.846 don't change too much the tools 0:49:35.846,0:49:37.736 for the small community, please. 0:49:37.769,0:49:39.282 So, that's it. 0:49:39.282,0:49:40.910 But I know that other people disagree. 0:49:40.910,0:49:44.579 You can talk to [Sadeep] if you want.[br]He will have another point of view. 0:49:46.119,0:49:49.319 Thank you. I think, last question, maybe. 0:49:51.234,0:49:54.446 (man 6) Sometimes, I find it difficult[br]to link the Wikidata item 0:49:54.446,0:50:00.976 with a Wikisource article,[br]because there's a Wikisource novel-- 0:50:01.079,0:50:06.128 might be split over several pages,[br]and there's an index page, 0:50:06.128,0:50:08.853 and there's perhaps a front page,[br]or something like that. 0:50:08.853,0:50:12.053 Do you have that problem,[br]or is that a general problem, or-- 0:50:12.092,0:50:16.892 Yeah, that's one of the first ideas[br]on the wish list 0:50:16.892,0:50:19.092 for the Foundation, actually. 0:50:19.092,0:50:20.790 Yeah, because Wikipedia is on the-- 0:50:20.790,0:50:22.772 if you know the [inaudible] organization, 0:50:22.772,0:50:26.598 Wikipedia is on the work level,[br]and Wikisource on the edition level. 0:50:26.598,0:50:28.572 So, already, you have a problem there. 0:50:28.572,0:50:30.931 And then, we have several editions[br]of the same work, 0:50:30.931,0:50:34.014 and we have sub-chapters[br]and things inside the edition. 0:50:34.014,0:50:41.001 So, yeah, that's one too many problems[br]which is hard to solve by nature. 
0:50:41.555,0:50:44.839 But there's maybe a tool[br]that can help to solve that. 0:50:45.893,0:50:47.469 Hopefully. 0:50:49.172,0:50:51.395 And that's time, ladies and gentlemen. 0:50:51.398,0:50:53.283 So, thank you very much, Nicolas. 0:50:53.335,0:50:55.137 (applause) 0:50:59.010,0:51:01.127 And please join me giving[br]one more round of applause 0:51:01.127,0:51:03.147 to all of our wonderful speakers. 0:51:03.147,0:51:04.901 (applause)