0:00:06.370,0:00:08.540 Hello, everyone. 0:00:08.540,0:00:11.636 It's awesome that you're all here,[br]so many of you. 0:00:11.647,0:00:13.298 It's really, really great. 0:00:14.659,0:00:19.541 So Lea already talked a lot [br]about this event, 0:00:19.541,0:00:22.875 and I'm going to talk a bit [br]about Wikidata itself 0:00:22.875,0:00:26.255 and what has been happening [br]around it over the last year 0:00:26.255,0:00:28.151 and where we are going. 0:00:28.663,0:00:32.974 So... what is this? Sorry. 0:00:40.118,0:00:44.329 So... where are we?[br]Where are we going? 0:00:44.950,0:00:49.680 Over the last year there has been [br]so much to celebrate 0:00:49.680,0:00:52.329 and I want to highlight some of that 0:00:52.329,0:00:55.125 because sometimes it goes unnoticed. 0:00:56.855,0:01:03.864 And first I want to take you through[br]some statistics around editors 0:01:03.985,0:01:07.119 and our content and how our data is used. 0:01:10.376,0:01:14.976 Over the last year, [br]we have grown our community 0:01:14.976,0:01:16.720 which is amazing. 0:01:16.724,0:01:21.248 We have around 3,000 new people 0:01:21.248,0:01:25.963 who edit once or more in 30 days. 0:01:26.133,0:01:30.276 So that's 3,000 new Wikidatans, yay! 0:01:31.617,0:01:36.544 Now if you look at people who do more,[br]like five edits in 30 days, 0:01:36.544,0:01:40.727 we've got an additional 1,200 roughly. 0:01:40.995,0:01:44.202 And if you look [br]at the people who do 100 edits or more-- 0:01:44.202,0:01:47.366 I hope many of you in this room-- 0:01:47.366,0:01:48.996 we have 300 more. 0:01:49.277,0:01:51.450 Raise your hand [br]if you're in this last group. 0:01:52.733,0:01:56.049 Woot! You're awesome! 0:01:58.059,0:02:04.436 And while the number of edits [br]is usually not something 0:02:04.436,0:02:08.592 we pay a lot of attention to, 0:02:08.592,0:02:12.683 we did cross [br]the 1 billion edits mark this year. 0:02:12.967,0:02:14.597 (applause) 0:02:21.347,0:02:23.224 Alright, let's look at content. 
0:02:27.610,0:02:31.222 So, we're now at 65 million items, 0:02:31.462,0:02:34.093 so entities to describe the world, 0:02:34.093,0:02:40.541 and we're doing this [br]with around 6,700 properties. 0:02:43.667,0:02:48.079 Of those, around 4,300 [br]are external identifiers, 0:02:48.079,0:02:53.328 which gives us a lot of linking[br]to other catalogues, databases, 0:02:53.328,0:02:55.607 websites and more 0:02:55.927,0:02:59.024 and really makes Wikidata [br]the central place 0:02:59.024,0:03:01.594 in a linked open data web. 0:03:02.453,0:03:07.241 So using those properties and items, 0:03:07.241,0:03:11.990 we have around 800 million statements now, 0:03:11.990,0:03:15.892 and compared to last year,[br]we know about half a statement more 0:03:15.892,0:03:18.365 about every single item. 0:03:18.550,0:03:20.480 (laughter) 0:03:22.595,0:03:25.144 So, yeah, Wikidata got smarter. 0:03:26.914,0:03:29.694 But we don't just have items[br]and properties, 0:03:29.724,0:03:33.704 we also have new stuff [br]like lexemes 0:03:33.866,0:03:39.825 and we are now at 204,000 lexemes[br]that describe words 0:03:39.825,0:03:41.860 in many different languages. 0:03:41.939,0:03:43.241 It's very cool. 0:03:43.668,0:03:47.661 I will talk more about this[br]in a session later today. 0:03:48.860,0:03:52.690 Last, the latest addition [br]are entity schemas 0:03:52.690,0:03:58.503 that help us figure out [br]how to consistently model data 0:03:58.503,0:04:00.971 across a certain area. 0:04:02.171,0:04:04.462 And of those, we have around 140 now. 0:04:07.571,0:04:11.432 Now numbers aren't everything [br]around content, right, 0:04:11.432,0:04:14.697 amount of content--we also care[br]about quality of the content. 0:04:15.613,0:04:21.976 And what we've done now is [br]we've trained a machine learning system 0:04:21.976,0:04:25.287 to judge the quality of an item. 0:04:25.822,0:04:29.531 Now this is far from perfect,[br]but it gives you an idea. 
0:04:29.916,0:04:35.011 So every item in Wikidata gets a score[br]between 1 and 5. 0:04:35.011,0:04:37.895 One is pretty terrible; five is amazing. 0:04:38.446,0:04:41.901 And it looks at things[br]like how many statements does it have, 0:04:41.901,0:04:44.031 how many external identifiers[br]does it have, 0:04:44.031,0:04:45.922 how many references are there, 0:04:45.922,0:04:49.414 how many different labels are there[br]in different languages, 0:04:49.414,0:04:50.604 and so on. 0:04:50.727,0:04:55.118 And then we looked at Wikidata over time, 0:04:55.118,0:04:59.751 and as you can see, [br]based on these measures, 0:04:59.751,0:05:03.918 we went from pretty terrible[br]to much better. 0:05:03.918,0:05:05.238 (laughter) 0:05:05.649,0:05:07.068 So that's good. 0:05:07.303,0:05:11.961 But what you can also see,[br]there's still a lot of room to 5. 0:05:13.664,0:05:20.171 Now I don't think [br]this is where we will get to, right? 0:05:20.380,0:05:23.263 Not every item will be absolutely perfect 0:05:23.266,0:05:26.087 according to these measures [br]that we have taken. 0:05:26.354,0:05:30.569 But I'm really happy to see [br]that consistently the quality of our data 0:05:30.569,0:05:32.387 is getting better and better. 0:05:36.709,0:05:43.111 Okay, but creating that data isn't enough. 0:05:44.428,0:05:46.734 We want this--we do this for a reason. 0:05:46.734,0:05:48.749 We want it to be used. 0:05:48.749,0:05:55.450 And now we looked at how many articles 0:05:55.450,0:06:00.770 on each of the other Wikimedia projects[br]use data from Wikidata, 0:06:02.040,0:06:06.762 and we looked at the percentage [br]of all articles on those projects. 0:06:07.395,0:06:09.554 Now if you look across all of Wikimedia 0:06:09.554,0:06:11.989 and all of the articles there, 0:06:11.989,0:06:18.768 then 56.35% of them today[br]make use of some data from Wikidata. 
0:06:20.054,0:06:21.815 Which I think is pretty good, 0:06:21.815,0:06:27.378 but of course, [br]there's still a lot of room to 100. 0:06:29.085,0:06:33.811 And then I looked at which projects[br]are actually making most use 0:06:33.811,0:06:36.188 of Wikidata's data, 0:06:36.188,0:06:39.401 and I split this [br]by language versions and so on. 0:06:39.606,0:06:44.997 And now what do you think[br]the top five projects-- 0:06:45.577,0:06:48.254 which ones are all of them? 0:06:48.254,0:06:50.834 Which project family do they belong to? 0:06:51.036,0:06:53.177 (several in audience) Commons. 0:06:53.278,0:06:56.607 Okay, that's pretty uniformly Commons. 0:06:57.216,0:06:58.903 You would actually be wrong. 0:06:59.112,0:07:01.684 All of the top five are Wikivoyage. 0:07:02.084,0:07:03.650 (audience) Oh! 0:07:03.692,0:07:05.044 (laughter) 0:07:05.439,0:07:08.345 So yeah, applause to Wikivoyage. 0:07:08.937,0:07:10.741 (applause) 0:07:17.070,0:07:20.383 If you would like to check [br]where Commons actually is 0:07:20.383,0:07:22.053 and where all of your other projects are, 0:07:22.053,0:07:23.521 there is a dashboard. 0:07:23.521,0:07:25.443 Come to me and we can check it out. 0:07:28.049,0:07:32.016 Of course, inside Wikimedia is [br]not the only place where our data is used. 0:07:32.016,0:07:34.606 It's also used outside,[br]and so much has happened. 0:07:34.966,0:07:39.256 I can't begin to mention it all,[br]but to highlight some 0:07:39.518,0:07:44.028 there are great uses of our data [br]at the Met, at the Wellcome Trust, 0:07:44.030,0:07:45.687 at the Library of Congress, 0:07:45.687,0:07:47.848 in GeneWiki and so many more. 0:07:47.951,0:07:51.296 And if you go through some of the sessions[br]later in the program, 0:07:51.296,0:07:53.292 you will hear about some of them. 0:07:56.635,0:07:59.608 Alright, enough statistics. 0:07:59.977,0:08:02.171 Let's look at some other highlights. 
0:08:02.644,0:08:06.897 So we already talked [br]about data quality improving, 0:08:06.897,0:08:10.646 and when you look at data quality,[br]there are a lot of dimensions 0:08:10.646,0:08:16.426 that you can look at,[br]and we've improved on some of those, 0:08:16.482,0:08:18.980 like how accurate is the data, 0:08:18.980,0:08:20.751 how trustworthy is the data, 0:08:20.751,0:08:22.515 how referenced is it, 0:08:22.515,0:08:24.865 how consistently is it modeled, 0:08:26.351,0:08:28.992 how complete is it and so on. 0:08:31.263,0:08:35.746 Just to pick out one-- [br]for consistency for example, 0:08:35.746,0:08:42.355 we have created the ability to store [br]entity schemas now in Wikidata 0:08:42.355,0:08:46.553 so that you can describe[br]how certain domains should be modeled. 0:08:46.806,0:08:49.172 So you can find-- 0:08:49.557,0:08:53.902 you can create an entity schema,[br]say, for Dutch painters, 0:08:53.902,0:08:56.492 and then you can look how-- 0:08:56.492,0:08:59.359 which items that are for Dutch painters 0:08:59.359,0:09:02.470 do not, for example, [br]have a date of birth but should 0:09:02.470,0:09:05.235 and similar things like that. 0:09:05.557,0:09:10.011 And I hope that a lot more [br]WikiProjects and so on 0:09:10.011,0:09:13.291 will be able to make use [br]of entity schemas to take good care 0:09:13.291,0:09:15.925 of their data, and if you want [br]to learn how to do that, 0:09:15.925,0:09:18.055 there's a session later[br]in the program as well 0:09:18.055,0:09:23.072 by people who know all about this[br]and will make this less 0:09:23.072,0:09:24.858 of a black box for you. 0:09:27.575,0:09:28.745 Alright. 0:09:30.899,0:09:34.701 Another thing that really got traction 0:09:34.774,0:09:37.819 over the last year [br]is the Wikibase ecosystem, right?
0:09:38.087,0:09:44.015 This idea that not all open data [br]should and has to happen 0:09:44.015,0:09:47.490 in Wikidata, but instead, we want[br]a thriving ecosystem 0:09:47.490,0:09:51.151 of different places, of different actors, 0:09:51.151,0:09:53.513 like institutions, companies, 0:09:53.513,0:09:56.929 volunteer projects opening up their data[br]in a similar way 0:09:56.929,0:10:00.372 that Wikidata does it [br]and then connecting all of it, 0:10:00.372,0:10:03.317 exchanging data between those,[br]linking that data. 0:10:04.282,0:10:08.808 And over the last year,[br]the interest in that 0:10:08.808,0:10:11.760 and the interest in institutions[br]and people running 0:10:11.760,0:10:14.977 their own Wikibase instance [br]has really exploded, 0:10:14.977,0:10:20.466 and especially in the sector[br]of libraries. 0:10:23.009,0:10:26.210 There's a lot of testing, evaluating, 0:10:26.226,0:10:28.787 and to be honest, trailblazing, 0:10:28.787,0:10:33.536 going on there at the moment[br]where adventurous institutions 0:10:33.536,0:10:38.872 work with us to really figure out[br]how Wikibase can work 0:10:38.872,0:10:42.243 for their collections, [br]for their catalogues and so on. 0:10:42.539,0:10:45.024 Among them, the German National Library, 0:10:45.024,0:10:46.419 the French National Library, 0:10:46.419,0:10:49.194 OCLC and it's really exciting to see. 0:10:55.278,0:10:57.360 One of the reasons[br]why I think this is so exciting 0:10:57.360,0:11:02.868 is that we are helping these institutions [br]open up data in a way that is 0:11:02.868,0:11:07.914 not just putting it on a website [br]and someone can access it 0:11:07.926,0:11:11.947 but really thinking about this--[br]the next step after that, right? 0:11:11.947,0:11:15.229 Letting people help you maintain[br]that data, augment that data, 0:11:15.229,0:11:20.449 enrich it, and that's really a shift 0:11:20.450,0:11:24.526 that I hope will bring good things. 
0:11:26.041,0:11:27.859 And the other thing it helps us with 0:11:27.859,0:11:31.203 is that it lets experts curate the data 0:11:31.203,0:11:37.474 in their space, keep it in good shape [br]so that we can then set up 0:11:37.474,0:11:42.317 synchronizing processes [br]to Wikidata, for example, 0:11:42.317,0:11:45.604 instead of having to take care of it [br]ourselves all the time. 0:11:46.519,0:11:50.223 And at the end of the day,[br]I hope it will take some pressure 0:11:50.223,0:11:53.776 off of Wikidata to be that place [br]where everything has to go. 0:11:58.040,0:12:00.450 Lexicographical data-- 0:12:01.962,0:12:06.997 Over the last year, [br]people started describing words 0:12:07.060,0:12:12.264 in their language in Wikidata[br]so that we can build things 0:12:12.264,0:12:14.713 like automated translation tools, 0:12:16.413,0:12:21.019 and we are at the point [br]where in some languages 0:12:21.019,0:12:25.500 we are starting to get nearer [br]to reaching that critical mass 0:12:25.500,0:12:29.143 that is needed to actually [br]build a serious application. 0:12:29.527,0:12:32.614 In a lot of languages, [br]we still have a long way to go, 0:12:32.614,0:12:35.411 but in some, [br]we're really starting to get there, 0:12:35.411,0:12:37.265 and that's really great to see. 0:12:38.621,0:12:41.430 If you want to know more about this,[br]come to my session later today. 0:12:46.064,0:12:48.954 And, of course, not to forget, 0:12:48.954,0:12:50.955 structured data on Commons. 0:12:51.150,0:12:52.384 (audience member whistles) 0:12:52.440,0:12:54.052 Yes! (laughs) 0:12:54.216,0:12:55.941 (applause) 0:12:59.324,0:13:01.515 The structured data on Commons [br]team at the foundation 0:13:01.515,0:13:05.571 has really gotten...
0:13:07.121,0:13:11.459 everything together and made it possible 0:13:11.459,0:13:15.479 to add statements to files [br]on Commons over the last year, 0:13:15.526,0:13:18.586 and people are starting to add[br]those statements to images 0:13:18.586,0:13:22.770 to then make them easier to find, [br]to build better applications on top of it, 0:13:22.770,0:13:24.292 and so much more. 0:13:24.292,0:13:26.852 It's really exciting to see how [br]that is growing, 0:13:26.852,0:13:29.988 and I think what's really important 0:13:29.988,0:13:32.959 for the Wikidata community [br]to understand here 0:13:32.959,0:13:36.555 is that when you see "depicts" 0:13:36.555,0:13:41.577 or "house cat" or "sitting," "lizard"[br]and "wall" here, 0:13:41.577,0:13:44.867 those are links to Wikidata items[br]and properties. 0:13:45.425,0:13:49.620 That means when we create items [br]and properties, 0:13:49.620,0:13:54.031 those are no longer just providing [br]the vocabulary for Wikidata itself. 0:13:54.031,0:13:57.749 They are providing the vocabulary[br]for Commons as well. 0:13:57.904,0:14:00.695 And this will only get more and more so, 0:14:00.695,0:14:02.929 so we have to pay a lot more attention 0:14:02.929,0:14:06.550 to how our ontology, our vocabulary 0:14:06.550,0:14:09.777 is actually used in other places[br]than we had before. 0:14:13.589,0:14:19.905 And the last one I have is that[br]we've started building stronger bridges 0:14:19.905,0:14:21.902 to the other Wikimedia projects. 0:14:23.281,0:14:26.159 My team and I are working [br]on a project called the Wikidata Bridge, 0:14:26.159,0:14:28.849 and you should totally come [br]to the UX booth 0:14:28.849,0:14:32.904 and do some testing of the current state 0:14:32.904,0:14:36.240 that will let, [br]for example, Wikipedia editors 0:14:36.240,0:14:38.970 edit Wikidata directly [br]from their projects 0:14:38.976,0:14:40.988 without having to go to Wikidata 0:14:40.988,0:14:43.958 and having to understand [br]everything around it.
0:14:43.958,0:14:50.571 I hope that this will take away [br]one more hurdle that makes it difficult 0:14:50.571,0:14:54.498 for Wikimedia projects [br]to adopt more data from Wikidata. 0:14:57.165,0:15:01.012 Alright, now to strategy [br]and where are we going? 0:15:03.005,0:15:07.179 Since December, the Wikidata team[br]at Wikimedia Deutschland 0:15:07.179,0:15:12.262 and people from the Wikimedia Foundation[br]have been working on strategy 0:15:12.262,0:15:14.675 papers around Wikidata. 0:15:14.675,0:15:16.101 It's basically writing down 0:15:16.101,0:15:19.526 what a lot of us have been [br]talking about already 0:15:19.526,0:15:22.958 over the last four or five years. 0:15:23.995,0:15:29.492 And I don't know if all of you[br]have read those papers. 0:15:29.492,0:15:33.887 They're published on Meta, for comments[br]until the end of the month. 0:15:33.887,0:15:35.806 It would be great, [br]if you haven't read them, 0:15:35.806,0:15:39.019 go read them, [br]leave your comments and so on. 0:15:40.062,0:15:44.338 Now the very quick overview[br]of what is in there 0:15:44.338,0:15:50.991 is that we think about Wikidata [br]and Wikibase in three pieces. 0:15:51.506,0:15:55.442 The first one is Wikidata as a platform. 0:15:55.442,0:15:57.023 You can see it in the lower corner, 0:15:57.301,0:16:03.876 and that is really around [br]Wikidata enables every person 0:16:03.876,0:16:06.273 to access and share information 0:16:06.273,0:16:09.038 regardless of their language [br]and technology, 0:16:09.038,0:16:14.479 and we do that by providing [br]general purpose data about the world. 0:16:14.479,0:16:18.161 So basically what you do every day. 0:16:21.282,0:16:25.497 The second thing is [br]the Wikibase ecosystem part 0:16:25.497,0:16:30.047 where Wikibase, the software [br]running Wikidata, powers 0:16:30.047,0:16:34.993 not just Wikidata, but a thriving [br]open data web that is the backbone 0:16:35.008,0:16:36.817 of free and open knowledge.
0:16:38.126,0:16:43.005 And the third and last thing[br]is Wikidata for the Wikimedia projects 0:16:43.005,0:16:47.011 at the top where Wikidata is there 0:16:47.011,0:16:49.594 to help the Wikimedia projects-- 0:16:50.750,0:16:53.559 help make them ready for the future. 0:16:57.597,0:17:02.973 Concretely, what does that mean[br]for the near or midterm future? 0:17:03.898,0:17:06.235 Wikidata as a platform-- 0:17:06.669,0:17:10.700 We want to have better data quality,[br]so we will continue working 0:17:10.700,0:17:14.195 on better tools, [br]improving the tools we have and so on. 0:17:14.633,0:17:18.899 We need to make our data [br]more accessible 0:17:18.899,0:17:23.864 through better APIs, [br]a more robust SPARQL endpoint 0:17:23.864,0:17:27.315 but also things like more consistently[br]modeling our data 0:17:27.315,0:17:31.235 so it actually is easy to reuse[br]in applications. 0:17:31.867,0:17:37.203 And the last thing I had was [br]setting up feedback processes 0:17:37.203,0:17:38.769 with our partners. 0:17:40.399,0:17:43.905 Unlike Wikipedia, Wikidata is not 0:17:43.905,0:17:46.142 what I call a destination project, right? 0:17:46.142,0:17:49.166 Someone goes to Wikipedia and reads it 0:17:49.166,0:17:50.742 whereas Wikidata is usually not 0:17:50.742,0:17:53.295 someone goes to Wikidata and reads it. 0:17:53.295,0:17:54.309 It would be awesome, 0:17:54.309,0:17:57.834 but realistically [br]it's not what it is, right? 0:17:57.882,0:18:00.520 A lot of the people who are exposed 0:18:00.520,0:18:02.719 to our data are not on Wikidata itself, 0:18:02.770,0:18:06.838 but they are seeing it through Wikipedia[br]and many other places. 0:18:07.847,0:18:12.238 Now these other places do get feedback [br]on that data, right? 
0:18:12.238,0:18:14.635 Their users tell them, [br]"Hey, here's something that's wrong," 0:18:16.775,0:18:20.952 and I would like to have that[br]so that we can make it available 0:18:20.958,0:18:24.179 to the people who actually edit [br]on Wikidata, meaning you. 0:18:24.374,0:18:27.212 And figuring out how to do that [br]in a meaningful way 0:18:27.212,0:18:31.679 without overwhelming everyone[br]will be one of the things to do 0:18:31.679,0:18:33.143 over the next year. 0:18:34.623,0:18:37.127 Alright, Wikibase ecosystem. 0:18:37.127,0:18:40.925 There, we will continue to work [br]with the libraries, 0:18:41.055,0:18:46.192 but also look into science, [br]for example, and more. 0:18:46.278,0:18:51.641 There is a Wikibase showcase later today[br]that you should totally go to 0:18:51.641,0:18:52.951 and see what's already there 0:18:52.951,0:18:55.852 and what people are already doing[br]with Wikibase. 0:18:55.875,0:18:57.281 It's really worth it. 0:18:57.682,0:19:00.832 And what's needed there is 0:19:00.832,0:19:03.181 also setting up [br]good processes around that. 0:19:04.384,0:19:08.138 Helping people figure out[br]who to talk to about what, 0:19:08.138,0:19:10.467 where they can find help, 0:19:10.467,0:19:11.831 all these kinds of things. 0:19:13.474,0:19:17.395 And, of course, making it easier[br]to install and maintain 0:19:17.395,0:19:20.309 a Wikibase because that's still [br]a bit of a pain. 0:19:21.144,0:19:24.617 And the last thing is federation[br]which is basically 0:19:24.617,0:19:27.245 what we've been talking about[br]for Commons earlier 0:19:27.245,0:19:30.704 where Commons uses [br]Wikidata's items and properties 0:19:30.704,0:19:33.514 but for other Wikibase instances out there 0:19:33.514,0:19:36.488 so they can also use [br]Wikidata's vocabulary. 
0:19:37.742,0:19:42.107 And that, as I was saying earlier,[br]increases yet again 0:19:42.107,0:19:48.228 the need to be mindful[br]of how our vocabulary is used out there 0:19:48.228,0:19:51.055 more than we have had to so far. 0:19:53.792,0:19:56.556 And Wikidata for the Wikimedia projects-- 0:19:57.132,0:20:00.580 of course, tighter integration[br]through the Wikidata Bridge 0:20:00.580,0:20:04.154 and helping people edit directly [br]from their projects 0:20:04.154,0:20:08.999 and the other thing that we all need[br]to think about together, I think, 0:20:08.999,0:20:14.684 is figuring out how to reduce [br]the language barriers. 0:20:15.484,0:20:19.096 The more Wikidata is integrated [br]in the Wikimedia projects, 0:20:19.096,0:20:22.472 the more people will have[br]a need to talk to each other 0:20:22.472,0:20:25.705 about that data without [br]speaking the same language, 0:20:25.705,0:20:31.680 and we have to figure out [br]how to deal with that. 0:20:33.276,0:20:36.634 If people have smart ideas,[br]I would love to talk to you. 0:20:38.790,0:20:41.365 And with that, [br]I come to the end of my talk. 0:20:41.618,0:20:44.248 Thank you, everyone, for giving[br]more people more access 0:20:44.248,0:20:46.305 to more knowledge every day. 0:20:46.688,0:20:48.914 (applause) 0:20:58.015,0:20:59.902 We have some time for questions 0:20:59.902,0:21:01.774 so if there are any questions [br]in the audience 0:21:01.774,0:21:04.975 or if you are remotely watching [br]the livestream--Hi, Mom-- 0:21:04.992,0:21:08.072 you can ask the question[br]on the EtherPad 0:21:08.072,0:21:11.387 or on the Telegram Channel[br]and we'll do our best. 0:21:11.387,0:21:13.233 So anything? 0:21:15.516,0:21:16.655 Ah. 
0:21:21.133,0:21:25.208 (person 1) Hi, everyone, this is more[br]of a meme than a question, 0:21:25.243,0:21:32.341 so when will the time extension[br]be able to also get 0:21:32.341,0:21:35.509 hours and minutes and seconds 0:21:35.509,0:21:38.376 because up till now [br]the precision is just to the date. 0:21:38.376,0:21:41.610 - I know... it's not my question--[br]- (laughing) 0:21:41.610,0:21:44.230 That's why I said it's a meme. 0:21:44.230,0:21:46.093 Every time is always like that, 0:21:46.093,0:21:48.738 but it comes always from remote so... 0:21:50.001,0:21:53.188 I do not have a very good answer to that. 0:21:53.260,0:21:54.443 I'm sorry. 0:21:55.678,0:22:01.636 But maybe as some background,[br]people need it even more 0:22:01.636,0:22:07.531 to describe images on Commons[br]so it might bubble up the long list 0:22:07.531,0:22:11.071 of things that need to be done[br]a bit faster through that. 0:22:14.713,0:22:16.236 Any more questions? 0:22:24.686,0:22:27.655 (person 2) [Linda] from Wikimedia [br]Foundation's research team-- 0:22:27.655,0:22:31.080 I have a question about your thoughts 0:22:31.080,0:22:37.763 on patrolling, and that may be related[br]to quality of content on Wikidata, 0:22:37.803,0:22:39.756 but if you can speak to that 0:22:39.756,0:22:43.542 like how do you see the near-to-medium-term[br]patrolling efforts changing, 0:22:43.542,0:22:45.557 especially with the Bridge project 0:22:45.559,0:22:48.147 which I'm looking forward to[br]going out and trying. 0:22:48.147,0:22:49.433 Yeah, thank you. 0:22:52.298,0:22:56.812 So as you say, with things [br]like Wikidata Bridge, 0:22:58.812,0:23:03.287 a lot more effort will have to be spent[br]on patrolling, I think.
0:23:04.482,0:23:08.554 But we are at a size where this [br]is probably not feasible 0:23:08.554,0:23:10.922 to do it by hand, by a human, 0:23:10.922,0:23:15.090 so we need to spend a lot more effort [br]on improving, for example, 0:23:15.090,0:23:18.387 ORES, the machine learning system[br]to help us with that, 0:23:18.407,0:23:24.588 to help us figure out which edits[br]a human really needs to look at 0:23:24.588,0:23:26.493 and which is probably just like yeah, 0:23:26.493,0:23:29.792 the regular stuff [br]I don't need to look at this. 0:23:33.777,0:23:38.878 Currently, ORES is not super good [br]at judging what-- 0:23:38.878,0:23:41.459 if an edit on Wikidata is good or bad. 0:23:41.459,0:23:44.549 There's currently a campaign going on 0:23:44.549,0:23:50.280 that is training [br]the machine learning system, 0:23:51.062,0:23:52.474 with your help, 0:23:53.141,0:23:55.550 to teach it basically what a good edit is 0:23:55.550,0:23:57.078 and what a bad edit is, 0:23:57.109,0:24:02.774 and we haven't reached the threshold [br]of enough humans teaching it yet 0:24:02.774,0:24:08.025 to really improve it,[br]but if you have a few minutes, 0:24:08.025,0:24:11.098 it would be great if you help teach ORES 0:24:11.098,0:24:13.586 make better judgements[br]about Wikidata edits. 0:24:13.768,0:24:15.837 And it's really simple--[br]it shows you an edit, 0:24:15.842,0:24:17.584 and you say this is a good edit, 0:24:17.584,0:24:19.658 this is a bad edit, and that's it. 0:24:20.041,0:24:23.193 You can do this in front of the TV[br]in the evening on the couch. 0:24:25.588,0:24:27.021 (person 3) Share a link. 0:24:28.000,0:24:31.059 We will share a link [br]in the Telegram Group, yes. 
0:24:32.239,0:24:36.239 And once we've reached [br]the threshold we need-- 0:24:36.239,0:24:39.269 I think it's around 7,000,[br]but I might be wrong-- 0:24:40.223,0:24:44.359 then we can rerun the training[br]for ORES and then it will be 0:24:44.374,0:24:48.484 hopefully considerably better [br]at judging the edits on Wikidata. 0:24:49.909,0:24:52.063 And then I hope more of you can use that 0:24:52.063,0:24:56.029 to filter recent changes, for example,[br]or your watchlist 0:24:56.029,0:24:58.333 for edits that really need your attention. 0:24:59.093,0:25:00.227 Yeah. 0:25:02.899,0:25:04.004 Hi. 0:25:07.116,0:25:09.964 (person 4) I'm just curious to know,[br]and this is a question not from me, 0:25:09.964,0:25:12.729 but from partners [br]that I've been working with, 0:25:12.729,0:25:16.190 the more partners we have joining Wikidata 0:25:16.190,0:25:19.916 and starting to experiment with queries, 0:25:19.916,0:25:23.079 the more issues we are having [br]with timeout of queries 0:25:23.147,0:25:25.766 so what's happening with that? 0:25:27.732,0:25:30.170 So, some people [br]at the Wikimedia Foundation 0:25:30.170,0:25:34.355 are looking into that,[br]and--small spoiler-- 0:25:34.355,0:25:36.988 be there for the birthday present session. 0:25:37.142,0:25:38.181 (laughter) 0:25:43.384,0:25:46.201 (person 5) Hello, I'm Bart Magnus[br]from Belgium (PACKED). 0:25:46.201,0:25:48.620 I would like to know [br]what the current state of affairs is 0:25:48.620,0:25:52.115 regarding federation, [br]so reusing properties 0:25:52.115,0:25:53.752 in your own Wikibase instance-- 0:25:53.752,0:25:56.887 is there anything to mention about that? 0:25:56.898,0:26:01.425 So over the last year, [br]a lot of people have told us 0:26:01.425,0:26:03.996 that they want federation, right? 0:26:03.996,0:26:06.866 But the problem was[br]that a lot of people understood 0:26:06.866,0:26:09.318 very different things [br]when they said federation.
0:26:10.566,0:26:13.533 Some of those things [br]were very easily doable. 0:26:13.533,0:26:15.664 Some of those things were [br]really, really hard. 0:26:16.934,0:26:22.148 And my team and I have been talking[br]to a lot of people, for example, 0:26:22.148,0:26:27.193 the partners we work with at libraries[br]to figure out what is it actually 0:26:27.193,0:26:28.836 precisely that they need. 0:26:30.111,0:26:33.893 And we finished that now,[br]though, of course, I'm happy 0:26:33.893,0:26:37.850 to take more feedback [br]if you want to talk to me about that, 0:26:37.850,0:26:41.397 and now I'm at a stage where [br]I'm comfortable to say, 0:26:41.397,0:26:43.480 "Okay, we're going to start with that." 0:26:44.606,0:26:48.197 And that will happen over the next[br]I would say two or three months 0:26:48.197,0:26:51.243 that we actually write [br]the first lines of code 0:26:51.243,0:26:53.793 and then hopefully have people able 0:26:53.793,0:26:56.533 to test it early next year, I would say. 0:26:59.661,0:27:01.023 (presenter) Okay, last questions. 0:27:02.457,0:27:05.603 (person 6) Finn Årup Nielsen [br]from Copenhagen, Denmark. 0:27:05.973,0:27:09.833 In relation to the other language,[br]there's been a sort of discussion 0:27:09.833,0:27:13.617 in the WikiCite community[br]about whether we should continue 0:27:13.617,0:27:15.765 to put more scientific papers in there-- 0:27:15.768,0:27:19.913 this relates to how much data[br]we can put into Wikidata. 0:27:19.913,0:27:23.032 Timeout in the Wikidata Query Service[br]is one issue 0:27:23.032,0:27:24.468 but also the maintaining 0:27:24.468,0:27:30.300 so what are your thoughts about... 0:27:31.060,0:27:35.173 Is the size of Wikidata[br]beginning to be a problem 0:27:35.173,0:27:36.237 in general? 0:27:36.237,0:27:38.666 Should we stop putting in lexeme data? 
0:27:38.666,0:27:41.222 Should we stop putting[br]scientific data 0:27:41.222,0:27:45.717 into Wikidata or do we have [br]any research on this 0:27:45.717,0:27:50.053 or technical problems inflating? 0:27:50.292,0:27:51.445 Yeah... 0:27:53.266,0:27:57.419 Wikidata is definitely coming[br]to some... 0:27:58.906,0:28:02.732 scalability boundaries, let's say, 0:28:03.740,0:28:05.975 both technically and socially. 0:28:05.975,0:28:09.197 And for both we need solutions, right? 0:28:09.197,0:28:12.518 Socially, we have things like more editors 0:28:12.518,0:28:15.689 and recent changes to the point [br]where it's completely unfeasible 0:28:15.689,0:28:19.623 for a human to patrol that[br]because it's simply too much. 0:28:21.246,0:28:26.205 But also technically, [br]and we've been addressing some of that. 0:28:26.205,0:28:29.958 For example, some database [br]re-architecturing 0:28:29.958,0:28:33.718 around the wb_terms table,[br]if that says anything to anyone. 0:28:35.900,0:28:38.366 But those only get us so far, 0:28:38.516,0:28:41.343 and one of the things we want[br]to look at next year 0:28:41.343,0:28:45.968 is where the other pain points are[br]and what to do about them 0:28:45.968,0:28:47.585 on the technical side. 0:28:49.085,0:28:50.728 So that's a general picture. 0:28:50.728,0:28:54.455 At the same time, I am very hesitant 0:28:54.455,0:28:58.387 to tell anyone, "No, no, no,[br]stop putting data into Wikidata." 0:28:58.400,0:29:02.408 That would kind of defeat the purpose. 0:29:04.311,0:29:07.061 But, for example, the Wikibase ecosystem 0:29:07.061,0:29:09.220 is one way to address that, right, 0:29:09.220,0:29:13.952 to not require everything [br]in Wikidata. 0:29:13.952,0:29:16.267 That's the whole beauty [br]of linked open data. 0:29:16.267,0:29:18.298 You don't have [br]to have it all in the same place. 0:29:18.298,0:29:19.642 You can connect different places. 0:29:19.642,0:29:20.859 It's amazing.
0:29:21.957,0:29:28.309 So around WikiCite specifically, yes-- 0:29:29.644,0:29:34.718 okay, WikiCite specifically,[br]I think we need 0:29:34.718,0:29:36.256 to look at it in proportion. 0:29:36.256,0:29:40.548 I don't have an exact number[br]for what percentage 0:29:40.548,0:29:44.511 of the items in Wikidata [br]are around WikiCite topics, 0:29:44.511,0:29:46.696 but it's a big percentage. 0:29:46.696,0:29:49.869 And maybe that's the thing[br]we need to talk about... 0:29:50.356,0:29:52.442 in the break. 0:29:53.191,0:29:54.766 Well, thank you very much! 0:29:54.845,0:29:56.281 (applause)