[Script Info] Title: [Events] Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text Dialogue: 0,0:00:05.96,0:00:08.13,Default,,0000,0000,0000,,(moderator) The next talk is\Nby Anders Sandholm Dialogue: 0,0:00:08.13,0:00:12.32,Default,,0000,0000,0000,,on Wikidata fact annotation\Nfor Wikipedia across languages. Dialogue: 0,0:00:12.32,0:00:13.92,Default,,0000,0000,0000,,- Thank you.\N- Thanks. Dialogue: 0,0:00:21.90,0:00:24.16,Default,,0000,0000,0000,,I wanted to start with a small confession. Dialogue: 0,0:00:26.43,0:00:31.69,Default,,0000,0000,0000,,Wow! I'm blown away\Nby the momentum of Wikidata Dialogue: 0,0:00:33.80,0:00:35.91,Default,,0000,0000,0000,,and the engagement of the community. Dialogue: 0,0:00:37.23,0:00:38.67,Default,,0000,0000,0000,,I am really excited about being here Dialogue: 0,0:00:38.67,0:00:42.30,Default,,0000,0000,0000,,and getting a chance to talk\Nabout work that we've been doing. Dialogue: 0,0:00:42.91,0:00:47.40,Default,,0000,0000,0000,,This is joint work with Michael,\Nwho's also here in the third row. Dialogue: 0,0:00:49.55,0:00:51.92,Default,,0000,0000,0000,,But before I dive more into this, Dialogue: 0,0:00:51.92,0:00:55.52,Default,,0000,0000,0000,,this wouldn't be\Na Google presentation without an ad, Dialogue: 0,0:00:56.10,0:00:58.20,Default,,0000,0000,0000,,so you get that up front. Dialogue: 0,0:00:58.20,0:01:01.24,Default,,0000,0000,0000,,This is what I'll be talking about,\Nour project, the SLING project. Dialogue: 0,0:01:02.26,0:01:06.64,Default,,0000,0000,0000,,It is an open source project\Nand it's using Wikidata a lot. Dialogue: 0,0:01:08.02,0:01:11.72,Default,,0000,0000,0000,,You can go check it out on GitHub\Nwhen you get a chance Dialogue: 0,0:01:11.72,0:01:15.96,Default,,0000,0000,0000,,if you feel excited about it\Nafter the presentation.
Dialogue: 0,0:01:18.22,0:01:23.49,Default,,0000,0000,0000,,And really, what I wanted to talk about--\Nthe title is admittedly a little bit long, Dialogue: 0,0:01:23.49,0:01:25.80,Default,,0000,0000,0000,,it's even shorter than it was\Nin the original program. Dialogue: 0,0:01:25.80,0:01:29.70,Default,,0000,0000,0000,,But what it comes down to,\Nwhat the project comes down to Dialogue: 0,0:01:29.70,0:01:33.62,Default,,0000,0000,0000,,is trying to answer\Nthis one very exciting question. Dialogue: 0,0:01:34.81,0:01:38.22,Default,,0000,0000,0000,,If you want, in the beginning,\Nthere were just two files, Dialogue: 0,0:01:39.91,0:01:41.40,Default,,0000,0000,0000,,some of you may recognize them, Dialogue: 0,0:01:42.42,0:01:45.95,Default,,0000,0000,0000,,they're essentially the dump files\Nfrom Wikidata and Wikipedia, Dialogue: 0,0:01:47.23,0:01:50.28,Default,,0000,0000,0000,,and the question we're trying\Nto figure out or answer is really, Dialogue: 0,0:01:51.57,0:01:54.42,Default,,0000,0000,0000,,can we dramatically improve\Nhow good machines are Dialogue: 0,0:01:54.42,0:01:58.06,Default,,0000,0000,0000,,at understanding human language\Njust by using these files? Dialogue: 0,0:02:00.90,0:02:04.16,Default,,0000,0000,0000,,And of course, you're entitled to ask Dialogue: 0,0:02:04.16,0:02:06.19,Default,,0000,0000,0000,,whether that's an interesting\Nquestion to answer. Dialogue: 0,0:02:07.45,0:02:14.34,Default,,0000,0000,0000,,If you're a company that [inaudible]\Nis to be able to take search queries Dialogue: 0,0:02:14.34,0:02:17.66,Default,,0000,0000,0000,,and try to answer them\Nin the best possible way, Dialogue: 0,0:02:18.46,0:02:23.99,Default,,0000,0000,0000,,obviously, understanding natural language\Ncomes in as a very handy thing. 
Dialogue: 0,0:02:25.32,0:02:27.91,Default,,0000,0000,0000,,But even if you look at Wikidata, Dialogue: 0,0:02:29.11,0:02:33.84,Default,,0000,0000,0000,,in the previous data quality panel\Nearlier today, Dialogue: 0,0:02:33.84,0:02:39.07,Default,,0000,0000,0000,,there was a question that came up about\Nverification, or verifiability of facts. Dialogue: 0,0:02:39.07,0:02:42.62,Default,,0000,0000,0000,,So let's say you actually do\Nunderstand natural language. Dialogue: 0,0:02:42.62,0:02:47.30,Default,,0000,0000,0000,,If you have a fact and there's a source,\Nyou could go to the source and analyze it, Dialogue: 0,0:02:47.30,0:02:49.72,Default,,0000,0000,0000,,and you can figure out whether\Nit actually confirms the fact Dialogue: 0,0:02:49.72,0:02:52.28,Default,,0000,0000,0000,,that is claiming\Nto have this as a source. Dialogue: 0,0:02:53.46,0:02:55.54,Default,,0000,0000,0000,,And if you could do it,\Nyou could even go beyond that Dialogue: 0,0:02:55.54,0:02:59.72,Default,,0000,0000,0000,,and you could read articles\Nand annotate them, come up with facts, Dialogue: 0,0:02:59.72,0:03:03.48,Default,,0000,0000,0000,,and actually look for existing facts\Nthat may need sources Dialogue: 0,0:03:03.48,0:03:06.11,Default,,0000,0000,0000,,and add these articles as sources. Dialogue: 0,0:03:07.11,0:03:11.37,Default,,0000,0000,0000,,Or, you know, in the wildest,\Ncraziest possible of all worlds, Dialogue: 0,0:03:11.37,0:03:13.76,Default,,0000,0000,0000,,if you get really, really good at it\Nyou could read articles Dialogue: 0,0:03:13.76,0:03:18.24,Default,,0000,0000,0000,,and maybe even annotate with new facts\Nthat you could then suggest as facts Dialogue: 0,0:03:18.24,0:03:19.96,Default,,0000,0000,0000,,that you could potentially\Nadd to Wikidata. Dialogue: 0,0:03:20.60,0:03:27.02,Default,,0000,0000,0000,,But there's a whole world of applications\Nof natural language understanding. 
Dialogue: 0,0:03:28.90,0:03:32.48,Default,,0000,0000,0000,,One of the things that's really hard when\Nyou do natural language understanding-- Dialogue: 0,0:03:32.48,0:03:35.60,Default,,0000,0000,0000,,these days, that also means\Ndeep learning or machine learning, Dialogue: 0,0:03:35.60,0:03:39.54,Default,,0000,0000,0000,,and one of the things that's really hard\Nis getting enough training data. Dialogue: 0,0:03:39.54,0:03:42.81,Default,,0000,0000,0000,,And historically,\Nthat's meant having a lot of text Dialogue: 0,0:03:42.81,0:03:45.44,Default,,0000,0000,0000,,that you need human annotators\Nto then first process Dialogue: 0,0:03:45.44,0:03:46.80,Default,,0000,0000,0000,,and then you can do training. Dialogue: 0,0:03:46.80,0:03:51.18,Default,,0000,0000,0000,,And part of the question here\Nis also really to say: Dialogue: 0,0:03:51.18,0:03:55.93,Default,,0000,0000,0000,,Can we use Wikidata and the way\Nin which it's interlinked with Wikipedia Dialogue: 0,0:03:57.01,0:03:58.01,Default,,0000,0000,0000,,for training data, Dialogue: 0,0:03:58.01,0:04:00.60,Default,,0000,0000,0000,,and will that be enough\Nto train that model? Dialogue: 0,0:04:03.43,0:04:06.52,Default,,0000,0000,0000,,So hopefully, we'll get closer\Nto answering this question Dialogue: 0,0:04:06.52,0:04:09.29,Default,,0000,0000,0000,,in the next 15 to 20 minutes. Dialogue: 0,0:04:10.27,0:04:14.07,Default,,0000,0000,0000,,We don't quite know the answer yet\Nbut we have some exciting results Dialogue: 0,0:04:14.07,0:04:16.99,Default,,0000,0000,0000,,that are pointing\Nin the right direction, if you want. 
Dialogue: 0,0:04:19.39,0:04:23.80,Default,,0000,0000,0000,,Just take a step back in terms of\Nthe development we've seen, Dialogue: 0,0:04:24.44,0:04:28.45,Default,,0000,0000,0000,,machine learning and deep learning\Nhas revolutionized a lot of areas Dialogue: 0,0:04:28.45,0:04:32.43,Default,,0000,0000,0000,,and this is just one example\Nof a particular image recognition task Dialogue: 0,0:04:32.43,0:04:37.34,Default,,0000,0000,0000,,that if you look at what happened\Nbetween 2010 and 2015, Dialogue: 0,0:04:37.34,0:04:40.88,Default,,0000,0000,0000,,in that five-year period,\Nwe went from machines doing pretty poorly Dialogue: 0,0:04:40.88,0:04:44.92,Default,,0000,0000,0000,,to, in the end, actually performing\Nat the same level as humans Dialogue: 0,0:04:44.92,0:04:48.80,Default,,0000,0000,0000,,or in some cases even better\Nalbeit for a very specific task. Dialogue: 0,0:04:50.22,0:04:55.52,Default,,0000,0000,0000,,So we've seen really a lot of things\Nimproving dramatically. Dialogue: 0,0:04:56.22,0:04:57.88,Default,,0000,0000,0000,,And so you can ask Dialogue: 0,0:04:57.88,0:05:02.44,Default,,0000,0000,0000,,why don't we just throw deep learning\Nat natural language processing Dialogue: 0,0:05:02.44,0:05:04.60,Default,,0000,0000,0000,,and natural language understanding\Nand be done with it? Dialogue: 0,0:05:05.50,0:05:11.53,Default,,0000,0000,0000,,And the answer is kind of\Nwe've sort of done that to a certain extent, Dialogue: 0,0:05:11.53,0:05:14.37,Default,,0000,0000,0000,,but what it turns out is that Dialogue: 0,0:05:15.00,0:05:17.72,Default,,0000,0000,0000,,natural language understanding\Nis actually still a bit of a challenge Dialogue: 0,0:05:17.73,0:05:23.28,Default,,0000,0000,0000,,and one of the situations where\Na lot of us interact with machines Dialogue: 0,0:05:23.28,0:05:25.80,Default,,0000,0000,0000,,that are trying to behave like\Nthey understand what we're saying Dialogue: 0,0:05:25.80,0:05:26.80,Default,,0000,0000,0000,,is in these chat bots.
Dialogue: 0,0:05:26.80,0:05:28.60,Default,,0000,0000,0000,,So this is not to pick\Non anyone in particular Dialogue: 0,0:05:28.61,0:05:31.99,Default,,0000,0000,0000,,but just, I think, an experience\Nthat a lot of us have had. Dialogue: 0,0:05:31.99,0:05:36.84,Default,,0000,0000,0000,,In this case, it's a user saying\NI want to stay in this place. Dialogue: 0,0:05:36.84,0:05:41.77,Default,,0000,0000,0000,,The chat bot says: "OK, got it,\Nwhen will you be checking in and out? Dialogue: 0,0:05:41.77,0:05:44.49,Default,,0000,0000,0000,,For example, November 17th to 23rd." Dialogue: 0,0:05:44.49,0:05:46.62,Default,,0000,0000,0000,,And the user says:\N"Well, I don't have any dates yet." Dialogue: 0,0:05:46.62,0:05:47.68,Default,,0000,0000,0000,,And then the response is: Dialogue: 0,0:05:47.68,0:05:51.05,Default,,0000,0000,0000,,"Sorry, there are no hotels available\Nfor the dates you've requested. Dialogue: 0,0:05:51.05,0:05:52.57,Default,,0000,0000,0000,,Would you like to start a new search?" Dialogue: 0,0:05:53.21,0:05:55.04,Default,,0000,0000,0000,,So there's still some way to go Dialogue: 0,0:05:55.86,0:05:58.76,Default,,0000,0000,0000,,to get machines to really\Nunderstand human language. Dialogue: 0,0:05:59.82,0:06:03.76,Default,,0000,0000,0000,,But machine learning or deep learning Dialogue: 0,0:06:03.76,0:06:06.79,Default,,0000,0000,0000,,has been applied\Nalready to this discipline. Dialogue: 0,0:06:06.79,0:06:09.72,Default,,0000,0000,0000,,Like, one of the examples is a recent... Dialogue: 0,0:06:09.72,0:06:11.23,Default,,0000,0000,0000,,a more successful example is BERT Dialogue: 0,0:06:11.23,0:06:17.32,Default,,0000,0000,0000,,where they're using transformers\Nto solve NLP or NLU tasks. Dialogue: 0,0:06:18.80,0:06:22.16,Default,,0000,0000,0000,,And it's dramatically improved\Nthe performance but, as we've seen, Dialogue: 0,0:06:22.16,0:06:23.56,Default,,0000,0000,0000,,there is still some way to go. 
Dialogue: 0,0:06:25.15,0:06:27.86,Default,,0000,0000,0000,,One thing that's shared among\Nmost of these approaches Dialogue: 0,0:06:27.86,0:06:31.78,Default,,0000,0000,0000,,is that you look at the text itself Dialogue: 0,0:06:31.78,0:06:36.63,Default,,0000,0000,0000,,and you depend on having a lot of it\Nso you can train your model on the text, Dialogue: 0,0:06:36.63,0:06:39.76,Default,,0000,0000,0000,,but everything is based\Non just looking at the text Dialogue: 0,0:06:39.76,0:06:41.68,Default,,0000,0000,0000,,and understanding the text. Dialogue: 0,0:06:41.68,0:06:45.73,Default,,0000,0000,0000,,So the learning is really\Njust representation learning. Dialogue: 0,0:06:45.73,0:06:50.65,Default,,0000,0000,0000,,What we wanted to do is actually\Nunderstand and annotate the text Dialogue: 0,0:06:50.65,0:06:54.01,Default,,0000,0000,0000,,in terms of items\Nor entities in the real world. Dialogue: 0,0:06:56.38,0:06:59.54,Default,,0000,0000,0000,,And in general, if we take a step back, Dialogue: 0,0:07:00.08,0:07:03.44,Default,,0000,0000,0000,,why is natural language processing\Nor understanding so hard? Dialogue: 0,0:07:03.44,0:07:07.66,Default,,0000,0000,0000,,There are a number of reasons\Nwhy it's really hard, but at the core, Dialogue: 0,0:07:07.66,0:07:11.04,Default,,0000,0000,0000,,one of the important reasons\Nis that somehow, Dialogue: 0,0:07:11.04,0:07:13.22,Default,,0000,0000,0000,,the machine needs to have\Nknowledge of the world Dialogue: 0,0:07:13.23,0:07:16.87,Default,,0000,0000,0000,,in order to understand human language. Dialogue: 0,0:07:19.57,0:07:22.46,Default,,0000,0000,0000,,And you think about that\Nfor a little while. Dialogue: 0,0:07:23.07,0:07:26.65,Default,,0000,0000,0000,,What better place to look for knowledge\Nabout the world than Wikidata? Dialogue: 0,0:07:27.32,0:07:29.62,Default,,0000,0000,0000,,So in essence, that's the approach. 
Dialogue: 0,0:07:29.62,0:07:31.98,Default,,0000,0000,0000,,And the question is can you leverage it, Dialogue: 0,0:07:31.98,0:07:38.88,Default,,0000,0000,0000,,can you use this wonderful knowledge Dialogue: 0,0:07:38.88,0:07:40.60,Default,,0000,0000,0000,,of the world that we already have Dialogue: 0,0:07:40.60,0:07:45.62,Default,,0000,0000,0000,,in a way that you can help\Nto train and bootstrap your model. Dialogue: 0,0:07:47.39,0:07:51.12,Default,,0000,0000,0000,,So the alternative here is really\Nunderstanding the text Dialogue: 0,0:07:51.12,0:07:55.44,Default,,0000,0000,0000,,not just in terms of other texts\Nor how this text is similar to other texts Dialogue: 0,0:07:55.44,0:07:59.10,Default,,0000,0000,0000,,but in terms of the existing knowledge\Nthat we have about the world. Dialogue: 0,0:08:01.16,0:08:02.70,Default,,0000,0000,0000,,And what makes me really excited Dialogue: 0,0:08:02.70,0:08:05.90,Default,,0000,0000,0000,,or at least makes me\Nhave a good gut feeling about this Dialogue: 0,0:08:05.91,0:08:07.37,Default,,0000,0000,0000,,is that in some ways Dialogue: 0,0:08:07.37,0:08:10.78,Default,,0000,0000,0000,,it seems closer\Nto how we interact as humans. Dialogue: 0,0:08:10.78,0:08:13.80,Default,,0000,0000,0000,,So if we were having a conversation Dialogue: 0,0:08:13.80,0:08:17.85,Default,,0000,0000,0000,,and you were bringing up\Nthe Bundeskanzler and Angela Merkel, Dialogue: 0,0:08:18.66,0:08:23.17,Default,,0000,0000,0000,,I would have an internal representation\Nof Q567 and it would light up. Dialogue: 0,0:08:23.17,0:08:25.52,Default,,0000,0000,0000,,And in our continued conversation, Dialogue: 0,0:08:25.52,0:08:29.62,Default,,0000,0000,0000,,mentioning other things\Nrelated to Angela Merkel, Dialogue: 0,0:08:29.62,0:08:31.76,Default,,0000,0000,0000,,I would have an easier time\Nassociating with that Dialogue: 0,0:08:31.76,0:08:33.92,Default,,0000,0000,0000,,or figuring out\Nwhat you were actually talking about. 
Dialogue: 0,0:08:35.03,0:08:38.92,Default,,0000,0000,0000,,And so, in essence,\Nthat's at the heart of this approach, Dialogue: 0,0:08:38.92,0:08:42.10,Default,,0000,0000,0000,,that we really believe\NWikidata is a key component Dialogue: 0,0:08:42.10,0:08:45.81,Default,,0000,0000,0000,,in unlocking this better understanding\Nof natural language. Dialogue: 0,0:08:49.73,0:08:51.45,Default,,0000,0000,0000,,And so how are we planning to do it? Dialogue: 0,0:08:52.56,0:08:56.80,Default,,0000,0000,0000,,Essentially, there are five steps\Nwe're going through, Dialogue: 0,0:08:56.80,0:08:58.08,Default,,0000,0000,0000,,or have been going through. Dialogue: 0,0:08:58.79,0:09:02.84,Default,,0000,0000,0000,,I'll go over each\Nof the steps briefly in turn Dialogue: 0,0:09:02.84,0:09:04.41,Default,,0000,0000,0000,,but essentially, there are five steps. Dialogue: 0,0:09:04.41,0:09:07.12,Default,,0000,0000,0000,,First, we need to start\Nwith the dump files that I showed you Dialogue: 0,0:09:07.12,0:09:08.12,Default,,0000,0000,0000,,to begin with-- Dialogue: 0,0:09:08.71,0:09:11.15,Default,,0000,0000,0000,,understanding what's in them,\Nparsing them, Dialogue: 0,0:09:11.15,0:09:13.40,Default,,0000,0000,0000,,having an efficient\Ninternal representation in memory Dialogue: 0,0:09:13.40,0:09:15.72,Default,,0000,0000,0000,,that allows us to do\Nquick processing on this. Dialogue: 0,0:09:16.22,0:09:18.50,Default,,0000,0000,0000,,And then, we're leveraging\Nsome of the annotations Dialogue: 0,0:09:18.50,0:09:22.60,Default,,0000,0000,0000,,that are already in Wikipedia,\Nlinking it to items in Wikidata. Dialogue: 0,0:09:22.60,0:09:25.46,Default,,0000,0000,0000,,I'll briefly show you what I mean by that. Dialogue: 0,0:09:25.46,0:09:31.00,Default,,0000,0000,0000,,We can use that to then\Ngenerate more advanced annotations Dialogue: 0,0:09:31.97,0:09:34.55,Default,,0000,0000,0000,,where we have much more text annotated. 
Dialogue: 0,0:09:34.55,0:09:40.33,Default,,0000,0000,0000,,But still, with annotations\Nbeing items or facts in Wikidata, Dialogue: 0,0:09:40.33,0:09:43.72,Default,,0000,0000,0000,,we can then train a model\Nbased on the silver data Dialogue: 0,0:09:43.72,0:09:46.21,Default,,0000,0000,0000,,and get a reasonably good model Dialogue: 0,0:09:46.21,0:09:49.05,Default,,0000,0000,0000,,that will allow us to read\Na Wikipedia document Dialogue: 0,0:09:49.05,0:09:53.31,Default,,0000,0000,0000,,and understand what the actual content is\Nin terms of Wikidata, Dialogue: 0,0:09:54.61,0:09:57.58,Default,,0000,0000,0000,,but only for facts that are\Nalready in Wikidata. Dialogue: 0,0:09:58.52,0:10:02.37,Default,,0000,0000,0000,,And so that's where kind of\Nthe hard part of this begins. Dialogue: 0,0:10:02.37,0:10:06.10,Default,,0000,0000,0000,,In order to go beyond that\Nwe need to have a plausibility model, Dialogue: 0,0:10:06.10,0:10:07.64,Default,,0000,0000,0000,,so a model that can tell us, Dialogue: 0,0:10:07.64,0:10:10.88,Default,,0000,0000,0000,,given a lot of facts about an item\Nand an additional fact, Dialogue: 0,0:10:10.88,0:10:12.63,Default,,0000,0000,0000,,whether the additional fact is plausible. Dialogue: 0,0:10:13.19,0:10:14.30,Default,,0000,0000,0000,,If we can build that, Dialogue: 0,0:10:14.89,0:10:21.83,Default,,0000,0000,0000,,we can then use a more "hyper modern"\Nreinforcement learning aspect Dialogue: 0,0:10:21.83,0:10:26.03,Default,,0000,0000,0000,,of deep learning and machine learning\Nto fine-tune the model Dialogue: 0,0:10:26.03,0:10:30.30,Default,,0000,0000,0000,,and hopefully go beyond\Nwhat we've been able to so far. 
Dialogue: 0,0:10:31.93,0:10:32.93,Default,,0000,0000,0000,,So real quick, Dialogue: 0,0:10:32.93,0:10:36.63,Default,,0000,0000,0000,,the first step is essentially\Ngetting the dump files parsed, Dialogue: 0,0:10:36.63,0:10:41.02,Default,,0000,0000,0000,,understanding the contents, and linking up\NWikidata and Wikipedia information, Dialogue: 0,0:10:41.02,0:10:44.42,Default,,0000,0000,0000,,and then utilizing some of the annotations\Nthat are already there. Dialogue: 0,0:10:45.55,0:10:49.30,Default,,0000,0000,0000,,And so this is essentially\Nwhat's happening. Dialogue: 0,0:10:49.30,0:10:51.96,Default,,0000,0000,0000,,Trust me, Michael built all of this,\Nit's working great. Dialogue: 0,0:10:52.70,0:10:55.62,Default,,0000,0000,0000,,But essentially, we're starting\Nwith the two files you can see on the top, Dialogue: 0,0:10:55.62,0:10:58.24,Default,,0000,0000,0000,,the Wikidata dump and the Wikipedia dump. Dialogue: 0,0:10:58.24,0:11:02.41,Default,,0000,0000,0000,,The Wikidata dump gets processed\Nand we end up with a knowledge base, Dialogue: 0,0:11:02.41,0:11:04.38,Default,,0000,0000,0000,,a KB at the bottom. Dialogue: 0,0:11:04.38,0:11:07.34,Default,,0000,0000,0000,,That's essentially a store\Nwe can hold in memory Dialogue: 0,0:11:07.34,0:11:10.44,Default,,0000,0000,0000,,that has essentially all of Wikidata in it Dialogue: 0,0:11:10.44,0:11:13.84,Default,,0000,0000,0000,,and we can quickly access\Nall the properties and facts and so on Dialogue: 0,0:11:13.84,0:11:15.16,Default,,0000,0000,0000,,and do analysis there. Dialogue: 0,0:11:15.16,0:11:16.41,Default,,0000,0000,0000,,Similarly, for the documents, Dialogue: 0,0:11:16.42,0:11:18.49,Default,,0000,0000,0000,,they get processed\Nand we end up with documents Dialogue: 0,0:11:19.27,0:11:21.91,Default,,0000,0000,0000,,that have been processed. 
Dialogue: 0,0:11:21.91,0:11:23.54,Default,,0000,0000,0000,,We know all the mentions Dialogue: 0,0:11:23.54,0:11:26.84,Default,,0000,0000,0000,,and some of the things\Nthat are already in the documents. Dialogue: 0,0:11:26.84,0:11:27.84,Default,,0000,0000,0000,,And then in the middle, Dialogue: 0,0:11:27.84,0:11:30.09,Default,,0000,0000,0000,,we have an important part\Nwhich is a phrase table Dialogue: 0,0:11:30.09,0:11:33.08,Default,,0000,0000,0000,,that allows us to basically\Nsee for any phrase Dialogue: 0,0:11:34.10,0:11:35.75,Default,,0000,0000,0000,,what is the frequency distribution, Dialogue: 0,0:11:35.75,0:11:39.48,Default,,0000,0000,0000,,what's the most likely item\Nthat we're referring to Dialogue: 0,0:11:39.48,0:11:41.16,Default,,0000,0000,0000,,when we're using this phrase. Dialogue: 0,0:11:41.16,0:11:44.44,Default,,0000,0000,0000,,So we're using that later on\Nto build the silver annotations. Dialogue: 0,0:11:44.45,0:11:48.00,Default,,0000,0000,0000,,So let's say we've run this\Nand then we also want to make sure Dialogue: 0,0:11:48.00,0:11:51.69,Default,,0000,0000,0000,,we utilize annotations\Nthat are already there. Dialogue: 0,0:11:51.69,0:11:54.11,Default,,0000,0000,0000,,So an important part\Nof a Wikipedia article Dialogue: 0,0:11:54.11,0:11:57.84,Default,,0000,0000,0000,,is that it's not just plain text, Dialogue: 0,0:11:57.84,0:12:01.01,Default,,0000,0000,0000,,it's actually already\Npre-annotated with a few things. Dialogue: 0,0:12:01.01,0:12:04.05,Default,,0000,0000,0000,,So a template is one example,\Nlinks is another example. Dialogue: 0,0:12:04.05,0:12:08.02,Default,,0000,0000,0000,,So if we take here the English article\Nfor Angela Merkel, Dialogue: 0,0:12:09.39,0:12:12.30,Default,,0000,0000,0000,,there is one example of a link here\Nwhich is to her party. 
Dialogue: 0,0:12:12.30,0:12:13.77,Default,,0000,0000,0000,,If you look at the bottom, Dialogue: 0,0:12:13.77,0:12:16.43,Default,,0000,0000,0000,,that's a link to a specific\NWikipedia article, Dialogue: 0,0:12:16.43,0:12:20.16,Default,,0000,0000,0000,,and I guess for people here,\Nit's no surprise that, in essence, Dialogue: 0,0:12:20.16,0:12:23.36,Default,,0000,0000,0000,,that is then, if you look\Nat the associated Wikidata item, Dialogue: 0,0:12:23.36,0:12:25.80,Default,,0000,0000,0000,,that's essentially an annotation saying Dialogue: 0,0:12:25.80,0:12:31.45,Default,,0000,0000,0000,,this is the QID I am talking about\Nwhen I'm talking about this party, Dialogue: 0,0:12:31.45,0:12:32.82,Default,,0000,0000,0000,,the Christian Democratic Union. Dialogue: 0,0:12:33.95,0:12:37.28,Default,,0000,0000,0000,,So we're using this\Nto already have a good start Dialogue: 0,0:12:37.28,0:12:39.33,Default,,0000,0000,0000,,in terms of understanding what text means. Dialogue: 0,0:12:39.33,0:12:40.33,Default,,0000,0000,0000,,All of these links, Dialogue: 0,0:12:40.33,0:12:43.98,Default,,0000,0000,0000,,we know exactly what the author\Nmeans with the phrase Dialogue: 0,0:12:44.50,0:12:47.04,Default,,0000,0000,0000,,in the cases where\Nthere are links to QIDs. Dialogue: 0,0:12:48.23,0:12:53.30,Default,,0000,0000,0000,,We can use this and the phrase table\Nto then try and take a Wikipedia document Dialogue: 0,0:12:53.30,0:12:58.76,Default,,0000,0000,0000,,and fully annotate it with everything\Nwe know about already from Wikidata. Dialogue: 0,0:12:59.66,0:13:02.75,Default,,0000,0000,0000,,And we can use this to train\Nthe first iteration of our model. Dialogue: 0,0:13:03.93,0:13:04.93,Default,,0000,0000,0000,,(coughs) Excuse me. 
Dialogue: 0,0:13:04.93,0:13:07.88,Default,,0000,0000,0000,,So this is exactly the same article, Dialogue: 0,0:13:08.40,0:13:13.57,Default,,0000,0000,0000,,but now, after we've annotated it\Nwith silver annotations, Dialogue: 0,0:13:14.67,0:13:18.44,Default,,0000,0000,0000,,and essentially,\Nyou can see all of the squares Dialogue: 0,0:13:18.44,0:13:24.53,Default,,0000,0000,0000,,are places where we've been able\Nto annotate with QIDs or with facts. Dialogue: 0,0:13:26.36,0:13:30.68,Default,,0000,0000,0000,,This is just a screenshot\Nof the viewer on the data, Dialogue: 0,0:13:30.68,0:13:34.28,Default,,0000,0000,0000,,so you can have access\Nto all of this information Dialogue: 0,0:13:34.28,0:13:37.58,Default,,0000,0000,0000,,and see what's come out\Nof the silver annotation. Dialogue: 0,0:13:37.58,0:13:41.36,Default,,0000,0000,0000,,And it's important to say that\Nthere's no machine learning Dialogue: 0,0:13:41.36,0:13:42.68,Default,,0000,0000,0000,,or anything involved here. Dialogue: 0,0:13:42.68,0:13:46.01,Default,,0000,0000,0000,,All we've done, is sort of\Nmechanically, with a few tricks, Dialogue: 0,0:13:46.52,0:13:49.71,Default,,0000,0000,0000,,basically pushed information\Nwe already have from Wikidata Dialogue: 0,0:13:49.71,0:13:52.76,Default,,0000,0000,0000,,onto the Wikipedia article. Dialogue: 0,0:13:53.33,0:13:56.20,Default,,0000,0000,0000,,And so here, if you hover over\N"Chancellor of Germany" Dialogue: 0,0:13:56.20,0:14:01.97,Default,,0000,0000,0000,,that is itself a Wikidata,\Nthat's referring to a Wikidata item, Dialogue: 0,0:14:01.97,0:14:04.97,Default,,0000,0000,0000,,has a number of properties\Nlike "subclass of: Chancellor", Dialogue: 0,0:14:04.97,0:14:08.66,Default,,0000,0000,0000,,"country: Germany",\Nthat again referring to subtext. 
Dialogue: 0,0:14:08.66,0:14:11.73,Default,,0000,0000,0000,,And here, it also has\Nthe property "officeholder" Dialogue: 0,0:14:12.47,0:14:15.50,Default,,0000,0000,0000,,which happens to be\NAngela Dorothea Merkel, Dialogue: 0,0:14:15.50,0:14:17.05,Default,,0000,0000,0000,,which is also mentioned in the text. Dialogue: 0,0:14:17.05,0:14:22.14,Default,,0000,0000,0000,,So there's really a full annotation\Nlinking up the contents here. Dialogue: 0,0:14:24.64,0:14:27.43,Default,,0000,0000,0000,,But again, there is an important\Nand unfortunate point Dialogue: 0,0:14:27.43,0:14:31.56,Default,,0000,0000,0000,,about what we are able to\Nand not able to do here. Dialogue: 0,0:14:31.56,0:14:35.34,Default,,0000,0000,0000,,So what we are doing is pushing\Ninformation we already have in Wikidata, Dialogue: 0,0:14:35.34,0:14:40.17,Default,,0000,0000,0000,,so what we can't annotate here\Nare things that are not in Wikidata. Dialogue: 0,0:14:40.17,0:14:41.68,Default,,0000,0000,0000,,So for instance, here, Dialogue: 0,0:14:41.68,0:14:44.91,Default,,0000,0000,0000,,she was at some point appointed\NFederal Minister for Women and Youth Dialogue: 0,0:14:44.91,0:14:48.71,Default,,0000,0000,0000,,and that alias or that phrase\Nis not in Wikidata, Dialogue: 0,0:14:48.71,0:14:54.00,Default,,0000,0000,0000,,so we're not able to make that annotation\Nhere in our silver annotations. Dialogue: 0,0:14:56.23,0:14:59.94,Default,,0000,0000,0000,,That said, it's still... at least for me, Dialogue: 0,0:14:59.94,0:15:02.62,Default,,0000,0000,0000,,it was pretty surprising to see\Nhow much you can actually annotate Dialogue: 0,0:15:02.63,0:15:04.27,Default,,0000,0000,0000,,and how much information is already there Dialogue: 0,0:15:04.27,0:15:08.88,Default,,0000,0000,0000,,when you combine Wikidata\Nwith a Wikipedia article.
Dialogue: 0,0:15:08.88,0:15:15.32,Default,,0000,0000,0000,,So what you can do is, once you have this,\Nyou know, millions of documents, Dialogue: 0,0:15:16.28,0:15:20.24,Default,,0000,0000,0000,,you can train your parser\Nbased on the annotations that are there. Dialogue: 0,0:15:21.13,0:15:26.97,Default,,0000,0000,0000,,And that's essentially a parser\Nthat has a number of components. Dialogue: 0,0:15:26.97,0:15:30.48,Default,,0000,0000,0000,,Essentially, the text is coming in\Nat the bottom and at the top, Dialogue: 0,0:15:30.48,0:15:33.72,Default,,0000,0000,0000,,we have a transition-based\Nframe semantic parser Dialogue: 0,0:15:33.72,0:15:39.15,Default,,0000,0000,0000,,that then generates the annotations\Nor these facts or references to the items. Dialogue: 0,0:15:40.62,0:15:44.99,Default,,0000,0000,0000,,We built this and run\Non more classical corpora Dialogue: 0,0:15:44.99,0:15:49.61,Default,,0000,0000,0000,,like [inaudible],\Nwhich are more classical NLP corpora, Dialogue: 0,0:15:49.61,0:15:53.80,Default,,0000,0000,0000,,but we want to be able to run this\Non the full Wikipedia corpora. Dialogue: 0,0:15:53.80,0:15:57.20,Default,,0000,0000,0000,,So Michael has been rewriting this in C++ Dialogue: 0,0:15:57.20,0:15:59.93,Default,,0000,0000,0000,,and we're able to really\Nscale up performance Dialogue: 0,0:15:59.93,0:16:01.10,Default,,0000,0000,0000,,of the parser trainer here. Dialogue: 0,0:16:01.10,0:16:03.59,Default,,0000,0000,0000,,So it will be exciting to see exactly Dialogue: 0,0:16:03.60,0:16:05.83,Default,,0000,0000,0000,,the results that are going\Nto come out of that. 
Dialogue: 0,0:16:08.64,0:16:10.26,Default,,0000,0000,0000,,So once that's in place, Dialogue: 0,0:16:10.26,0:16:13.46,Default,,0000,0000,0000,,we have a pretty good model\Nthat's able to at least Dialogue: 0,0:16:13.46,0:16:16.05,Default,,0000,0000,0000,,predict facts that are\Nalready known in Wikidata, Dialogue: 0,0:16:16.05,0:16:18.79,Default,,0000,0000,0000,,but ideally, we want to move beyond that, Dialogue: 0,0:16:18.79,0:16:20.70,Default,,0000,0000,0000,,and for that\Nwe need this plausibility model Dialogue: 0,0:16:20.70,0:16:23.93,Default,,0000,0000,0000,,which in essence,\Nyou can think of it as a black box Dialogue: 0,0:16:23.93,0:16:27.12,Default,,0000,0000,0000,,where you supply it with\Nall of the known facts you have Dialogue: 0,0:16:27.12,0:16:30.57,Default,,0000,0000,0000,,about a particular item\Nand then you provide an additional item. Dialogue: 0,0:16:31.41,0:16:32.41,Default,,0000,0000,0000,,And by magic, Dialogue: 0,0:16:32.41,0:16:36.95,Default,,0000,0000,0000,,the black box tells you how plausible is\Nthe additional fact that you're providing Dialogue: 0,0:16:36.95,0:16:40.40,Default,,0000,0000,0000,,and how plausible is it\Nthat this particular item is fact. Dialogue: 0,0:16:42.79,0:16:43.79,Default,,0000,0000,0000,,And... Dialogue: 0,0:16:45.73,0:16:48.58,Default,,0000,0000,0000,,I don't know if it's fair to say\Nthat it was much to our surprise, Dialogue: 0,0:16:48.58,0:16:50.78,Default,,0000,0000,0000,,but at least, you can actually-- Dialogue: 0,0:16:50.78,0:16:52.90,Default,,0000,0000,0000,,In order to train a model, you need, Dialogue: 0,0:16:52.90,0:16:55.26,Default,,0000,0000,0000,,like we've seen earlier,\Nyou need a lot of training data Dialogue: 0,0:16:55.26,0:16:57.88,Default,,0000,0000,0000,,and essentially, you can\Nuse Wikidata as training data. 
Dialogue: 0,0:16:57.88,0:17:02.21,Default,,0000,0000,0000,,You serve it basically\Nall the facts for a given item Dialogue: 0,0:17:02.21,0:17:04.61,Default,,0000,0000,0000,,and then you mask or hold off one fact Dialogue: 0,0:17:04.62,0:17:08.57,Default,,0000,0000,0000,,and then you provide that as a fact\Nthat it's supposed to predict. Dialogue: 0,0:17:09.24,0:17:10.72,Default,,0000,0000,0000,,And just using this as training data, Dialogue: 0,0:17:10.72,0:17:15.88,Default,,0000,0000,0000,,you can get a really really good\Nplausibility model, actually, Dialogue: 0,0:17:18.57,0:17:21.68,Default,,0000,0000,0000,,to the extent that I was hoping one day\Nto maybe be able to even use it Dialogue: 0,0:17:21.68,0:17:27.53,Default,,0000,0000,0000,,for discovering what you could call\Naccidental vandalism in Wikidata Dialogue: 0,0:17:27.53,0:17:33.01,Default,,0000,0000,0000,,like a fact that's been added by accident\Nand really doesn't look like it's... Dialogue: 0,0:17:33.01,0:17:35.03,Default,,0000,0000,0000,,It doesn't fit with the normal topology Dialogue: 0,0:17:35.03,0:17:38.62,Default,,0000,0000,0000,,of facts or knowledge\Nin Wikidata, if you want. Dialogue: 0,0:17:41.06,0:17:43.76,Default,,0000,0000,0000,,But in this particular setup,\Nwe need it for something else, Dialogue: 0,0:17:43.76,0:17:46.74,Default,,0000,0000,0000,,namely for doing reinforcement learning Dialogue: 0,0:17:47.95,0:17:50.80,Default,,0000,0000,0000,,so we can fine-tune the Wiki parser, Dialogue: 0,0:17:50.80,0:17:54.03,Default,,0000,0000,0000,,and basically using the plausibility model\Nas a reward function. 
Dialogue: 0,0:17:54.04,0:17:59.58,Default,,0000,0000,0000,,So when you do the training,\Nyou try to parse a Wikipedia document Dialogue: 0,0:17:59.58,0:18:01.87,Default,,0000,0000,0000,,[inaudible] in Wikipedia\Ncomes up with a fact Dialogue: 0,0:18:01.87,0:18:04.28,Default,,0000,0000,0000,,and we check the fact\Non the plausibility model Dialogue: 0,0:18:04.28,0:18:07.53,Default,,0000,0000,0000,,and use that as feedback\Nor as a reward function Dialogue: 0,0:18:08.20,0:18:09.60,Default,,0000,0000,0000,,in training the model. Dialogue: 0,0:18:09.60,0:18:12.71,Default,,0000,0000,0000,,And the big question here is then\Ncan we learn to predict facts Dialogue: 0,0:18:12.71,0:18:15.00,Default,,0000,0000,0000,,that are not already in Wikidata. Dialogue: 0,0:18:15.80,0:18:22.30,Default,,0000,0000,0000,,And we hope and believe we can\Nbut it's still not clear. Dialogue: 0,0:18:22.88,0:18:27.79,Default,,0000,0000,0000,,So this is essentially what we have been\Nand are planning to do. Dialogue: 0,0:18:27.79,0:18:31.22,Default,,0000,0000,0000,,There's been some\Nsurprisingly good results Dialogue: 0,0:18:31.22,0:18:33.99,Default,,0000,0000,0000,,in terms of how far\Nyou can get with silver annotations Dialogue: 0,0:18:33.99,0:18:35.72,Default,,0000,0000,0000,,and a plausibility model. Dialogue: 0,0:18:36.27,0:18:40.08,Default,,0000,0000,0000,,But in terms of\Nhow far we are, if you want, Dialogue: 0,0:18:40.08,0:18:41.96,Default,,0000,0000,0000,,we sort of have\Nthe infrastructure in place Dialogue: 0,0:18:41.96,0:18:44.48,Default,,0000,0000,0000,,to do the processing\Nand have everything efficiently in memory. Dialogue: 0,0:18:45.12,0:18:49.14,Default,,0000,0000,0000,,We have first instances\Nof silver annotations Dialogue: 0,0:18:49.14,0:18:53.04,Default,,0000,0000,0000,,and have a parser trainer in place\Nfor the supervised learning Dialogue: 0,0:18:53.04,0:18:55.76,Default,,0000,0000,0000,,and an initial plausibility model. 
Dialogue: 0,0:18:55.76,0:19:00.40,Default,,0000,0000,0000,,But we're still pushing on those fronts\Nand very much looking forward Dialogue: 0,0:19:00.40,0:19:03.32,Default,,0000,0000,0000,,to see what comes out\Nof the very last bit. Dialogue: 0,0:19:07.79,0:19:10.31,Default,,0000,0000,0000,,And those were my words. Dialogue: 0,0:19:10.31,0:19:14.68,Default,,0000,0000,0000,,I'm very excited to see\Nwhat comes out of it Dialogue: 0,0:19:14.68,0:19:17.66,Default,,0000,0000,0000,,and it's been pure joy\Nto work with Wikidata. Dialogue: 0,0:19:17.66,0:19:19.51,Default,,0000,0000,0000,,It's been fun to see Dialogue: 0,0:19:19.51,0:19:23.92,Default,,0000,0000,0000,,how some of the things you come across\Nseemed wrong and then the next day, Dialogue: 0,0:19:23.92,0:19:24.96,Default,,0000,0000,0000,,you look, things are fixed Dialogue: 0,0:19:24.96,0:19:30.55,Default,,0000,0000,0000,,and it's really been amazing\Nto see the momentum there. Dialogue: 0,0:19:31.16,0:19:35.30,Default,,0000,0000,0000,,Like I said, the URL,\Nall the source code is on GitHub. Dialogue: 0,0:19:35.89,0:19:38.91,Default,,0000,0000,0000,,Our email addresses\Nwere on the first slide, Dialogue: 0,0:19:38.91,0:19:42.58,Default,,0000,0000,0000,,so please do reach out\Nif you have questions or are interested Dialogue: 0,0:19:42.58,0:19:47.15,Default,,0000,0000,0000,,and I think we have time\Nfor a couple questions now in case... Dialogue: 0,0:19:49.45,0:19:51.45,Default,,0000,0000,0000,,(applause) Dialogue: 0,0:19:51.45,0:19:52.45,Default,,0000,0000,0000,,Thanks. Dialogue: 0,0:19:55.58,0:19:59.40,Default,,0000,0000,0000,,(woman 1) Thank you for your presentation.\NI do have a concern however. Dialogue: 0,0:19:59.40,0:20:05.44,Default,,0000,0000,0000,,The Wikipedia corpus\Nis known to be with bias. 
Dialogue: 0,0:20:05.44,0:20:09.84,Default,,0000,0000,0000,,There's a very strong bias--\Nfor example, fewer women, more men, Dialogue: 0,0:20:09.84,0:20:11.79,Default,,0000,0000,0000,,all sorts of other aspects in there. Dialogue: 0,0:20:11.79,0:20:15.20,Default,,0000,0000,0000,,So isn't this actually\Nalso tainting the knowledge Dialogue: 0,0:20:15.20,0:20:19.47,Default,,0000,0000,0000,,that you are taking out of the Wikipedia? Dialogue: 0,0:20:22.32,0:20:25.42,Default,,0000,0000,0000,,Well, there are two aspects\Nof the question. Dialogue: 0,0:20:25.42,0:20:28.59,Default,,0000,0000,0000,,There's both in the model\Nthat we are then training, Dialogue: 0,0:20:28.59,0:20:32.50,Default,,0000,0000,0000,,you could ask how... let's just... Dialogue: 0,0:20:33.17,0:20:35.84,Default,,0000,0000,0000,,If you make it really simple\Nand say like: Dialogue: 0,0:20:35.84,0:20:41.20,Default,,0000,0000,0000,,Does it mean that the model\Nwill then be worse Dialogue: 0,0:20:41.20,0:20:46.03,Default,,0000,0000,0000,,at predicting facts\Nabout women than men, say, Dialogue: 0,0:20:46.03,0:20:50.42,Default,,0000,0000,0000,,or some other set of groups? Dialogue: 0,0:20:53.10,0:20:55.42,Default,,0000,0000,0000,,To begin with,\Nif you just look at the raw data, Dialogue: 0,0:20:55.42,0:21:00.53,Default,,0000,0000,0000,,it will reflect whatever is the bias\Nin the training data, so that's... Dialogue: 0,0:21:02.81,0:21:06.00,Default,,0000,0000,0000,,People work on this to try\Nand address that in the best possible way. Dialogue: 0,0:21:06.00,0:21:10.07,Default,,0000,0000,0000,,But normally,\Nwhen you're training a model, Dialogue: 0,0:21:10.07,0:21:14.24,Default,,0000,0000,0000,,it will reflect\Nwhatever data you're training it on. Dialogue: 0,0:21:14.87,0:21:18.98,Default,,0000,0000,0000,,So that's something to account for\Nwhen doing the work, yeah. Dialogue: 0,0:21:21.50,0:21:23.19,Default,,0000,0000,0000,,(man 2) Hi, this is [Marco]. 
Dialogue: 0,0:21:23.20,0:21:25.96,Default,,0000,0000,0000,,I am a natural language\Nprocessing practitioner. Dialogue: 0,0:21:26.85,0:21:31.58,Default,,0000,0000,0000,,I was curious about\Nhow you model your facts. Dialogue: 0,0:21:31.58,0:21:34.54,Default,,0000,0000,0000,,So I heard you said frame semantics, Dialogue: 0,0:21:34.54,0:21:35.56,Default,,0000,0000,0000,,Right. Dialogue: 0,0:21:35.56,0:21:38.88,Default,,0000,0000,0000,,(Marco) could you maybe\Ngive some more details on that, please. Dialogue: 0,0:21:40.05,0:21:46.51,Default,,0000,0000,0000,,Yes, so it's frame semantics,\Nwe're using frame semantics, Dialogue: 0,0:21:46.51,0:21:49.64,Default,,0000,0000,0000,,and basically, Dialogue: 0,0:21:49.64,0:21:55.78,Default,,0000,0000,0000,,all of the facts in Wikidata,\Nthey're modeled as frames. Dialogue: 0,0:21:56.29,0:21:58.80,Default,,0000,0000,0000,,And so that's an essential part\Nof the setup Dialogue: 0,0:21:58.81,0:22:00.03,Default,,0000,0000,0000,,and how we make this work. Dialogue: 0,0:22:00.03,0:22:03.77,Default,,0000,0000,0000,,That's essentially\Nhow we try to address the... Dialogue: 0,0:22:03.77,0:22:06.68,Default,,0000,0000,0000,,How can I make all the knowledge\Nthat I have in Wikidata Dialogue: 0,0:22:06.68,0:22:11.01,Default,,0000,0000,0000,,available in a context where\NI can annotate and train my model Dialogue: 0,0:22:12.48,0:22:14.44,Default,,0000,0000,0000,,when I am annotating or parsing text. Dialogue: 0,0:22:14.44,0:22:19.81,Default,,0000,0000,0000,,Is that existing data\Nin Wikidata is modeled as frames. Dialogue: 0,0:22:19.81,0:22:21.01,Default,,0000,0000,0000,,So the store that we have, Dialogue: 0,0:22:21.01,0:22:24.04,Default,,0000,0000,0000,,the knowledge base with\Nall of the knowledge is a frame store, Dialogue: 0,0:22:24.04,0:22:27.25,Default,,0000,0000,0000,,and this is the same frame store\Nthat we are building on top of Dialogue: 0,0:22:27.25,0:22:29.52,Default,,0000,0000,0000,,when we're then parsing the text. 
Dialogue: 0,0:22:29.52,0:22:34.02,Default,,0000,0000,0000,,(Marco) So you're converting\Nthe Wikidata data model into some frame. Dialogue: 0,0:22:34.55,0:22:36.70,Default,,0000,0000,0000,,Yes, we are converting the Wikidata model Dialogue: 0,0:22:36.70,0:22:39.87,Default,,0000,0000,0000,,into one large frame store\Nif you want, yeah. Dialogue: 0,0:22:40.56,0:22:43.60,Default,,0000,0000,0000,,(man 3) Thanks. Is Pluto a planet? Dialogue: 0,0:22:44.39,0:22:47.23,Default,,0000,0000,0000,,(audience laughing) Dialogue: 0,0:22:47.23,0:22:48.23,Default,,0000,0000,0000,,Can I get the question... Dialogue: 0,0:22:48.23,0:22:51.56,Default,,0000,0000,0000,,(man 3) I like the bootstrapping thing\Nthat you are doing, Dialogue: 0,0:22:51.56,0:22:53.40,Default,,0000,0000,0000,,I mean the way\Nthat you're training your model Dialogue: 0,0:22:53.40,0:22:57.73,Default,,0000,0000,0000,,by picking out the known facts\Nabout things that are verified, Dialogue: 0,0:22:57.73,0:23:00.67,Default,,0000,0000,0000,,and then training\Nthe plausibility prediction Dialogue: 0,0:23:00.67,0:23:03.68,Default,,0000,0000,0000,,by trying to teach\Nthe architecture of the system Dialogue: 0,0:23:03.68,0:23:06.48,Default,,0000,0000,0000,,to recognize that actually,\Nthat fact fits. Dialogue: 0,0:23:06.48,0:23:13.46,Default,,0000,0000,0000,,So that will work for large classes,\Nbut it will really... Dialogue: 0,0:23:13.46,0:23:15.74,Default,,0000,0000,0000,,It doesn't sound like it will learn\Nabout surprises Dialogue: 0,0:23:15.74,0:23:18.68,Default,,0000,0000,0000,,and especially not\Nin small classes of items, right. Dialogue: 0,0:23:18.68,0:23:20.84,Default,,0000,0000,0000,,So if you train your model in... Dialogue: 0,0:23:20.84,0:23:23.48,Default,,0000,0000,0000,,When did Pluto disappear, I forgot... Dialogue: 0,0:23:23.48,0:23:24.48,Default,,0000,0000,0000,,As a planet, you mean. 
Dialogue: 0,0:23:24.48,0:23:26.90,Default,,0000,0000,0000,,(man 3) Yeah, it used to be\Na member of the solar system Dialogue: 0,0:23:26.90,0:23:29.44,Default,,0000,0000,0000,,and we had how many,\Nnine observations there. Dialogue: 0,0:23:29.44,0:23:31.17,Default,,0000,0000,0000,,- Yeah.\N- (man 3) It's slightly problematic. Dialogue: 0,0:23:31.17,0:23:33.51,Default,,0000,0000,0000,,So everyone, the kids think\Nthat Pluto is not a planet, Dialogue: 0,0:23:33.52,0:23:36.04,Default,,0000,0000,0000,,I still think it's a planet,\Nbut never mind. Dialogue: 0,0:23:36.04,0:23:42.32,Default,,0000,0000,0000,,So the fact that it suddenly\Nstopped being a planet, Dialogue: 0,0:23:42.32,0:23:45.52,Default,,0000,0000,0000,,which was supported in the period before,\NI don't know, hundreds of years, right? Dialogue: 0,0:23:47.15,0:23:50.16,Default,,0000,0000,0000,,That's crazy, how would you go\Nfor figuring out that thing? Dialogue: 0,0:23:50.16,0:23:53.60,Default,,0000,0000,0000,,For example, the new claim\Nis not plausible for that thing. Dialogue: 0,0:23:53.60,0:23:55.89,Default,,0000,0000,0000,,Sure. So there are two things. Dialogue: 0,0:23:55.89,0:23:59.43,Default,,0000,0000,0000,,So there's both like how precise\Nis a plausibility model. Dialogue: 0,0:23:59.43,0:24:02.09,Default,,0000,0000,0000,,So what it distinguishes between\Nis random facts Dialogue: 0,0:24:02.09,0:24:03.60,Default,,0000,0000,0000,,and facts that are plausible. Dialogue: 0,0:24:04.10,0:24:06.60,Default,,0000,0000,0000,,And there's also the question\Nof whether Pluto is a planet Dialogue: 0,0:24:06.60,0:24:09.24,Default,,0000,0000,0000,,and that's back to whether... Dialogue: 0,0:24:09.24,0:24:10.34,Default,,0000,0000,0000,,I was in another session Dialogue: 0,0:24:10.34,0:24:14.06,Default,,0000,0000,0000,,where someone brought up the example\Nof the earth being flat, Dialogue: 0,0:24:14.06,0:24:16.55,Default,,0000,0000,0000,,- whether that is a fact or not.\N- (man 3) That makes sense. 
Dialogue: 0,0:24:16.55,0:24:18.51,Default,,0000,0000,0000,,So it is a fact in a sense\Nthat you can put it in, Dialogue: 0,0:24:18.51,0:24:19.95,Default,,0000,0000,0000,,I guess you could put it in Wikidata Dialogue: 0,0:24:19.95,0:24:22.03,Default,,0000,0000,0000,,with sources that are claiming\Nthat that's the thing. Dialogue: 0,0:24:22.03,0:24:26.56,Default,,0000,0000,0000,,So again, you would not necessarily\Nwant to train the model in a way Dialogue: 0,0:24:26.56,0:24:30.72,Default,,0000,0000,0000,,where if you read someone saying\Nthe planet Pluto, bla, bla, bla, Dialogue: 0,0:24:30.72,0:24:33.56,Default,,0000,0000,0000,,then it should be fine for it Dialogue: 0,0:24:33.56,0:24:36.56,Default,,0000,0000,0000,,to then say that\Nan annotation for this text Dialogue: 0,0:24:36.56,0:24:38.20,Default,,0000,0000,0000,,is that Pluto is a planet. Dialogue: 0,0:24:39.51,0:24:41.43,Default,,0000,0000,0000,,That doesn't mean, you know... Dialogue: 0,0:24:42.12,0:24:46.92,Default,,0000,0000,0000,,The model won't be able to tell\Nwhat "in the end" is the truth, Dialogue: 0,0:24:46.92,0:24:49.21,Default,,0000,0000,0000,,I don't think any of us here\Nwill be able to either, so... Dialogue: 0,0:24:49.21,0:24:50.28,Default,,0000,0000,0000,,(man 3) I just want to say Dialogue: 0,0:24:50.28,0:24:52.78,Default,,0000,0000,0000,,it's not a hard accusation\Nagainst the approach Dialogue: 0,0:24:52.78,0:24:56.03,Default,,0000,0000,0000,,because even people\Ncannot be sure whether that's a fact, Dialogue: 0,0:24:56.03,0:24:58.21,Default,,0000,0000,0000,,a new fact is plausible at that moment. Dialogue: 0,0:24:58.73,0:24:59.73,Default,,0000,0000,0000,,But that's always... Dialogue: 0,0:24:59.73,0:25:03.39,Default,,0000,0000,0000,,I just maybe reiterated a question\Nthat I am posing all the time Dialogue: 0,0:25:03.39,0:25:05.75,Default,,0000,0000,0000,,to myself and my work; I always ask. 
Dialogue: 0,0:25:06.31,0:25:09.27,Default,,0000,0000,0000,,We do the statistical learning thing,\Nit's amazing nowadays Dialogue: 0,0:25:09.27,0:25:13.58,Default,,0000,0000,0000,,we can do billions of things,\Nbut we cannot learn about surprises, Dialogue: 0,0:25:13.59,0:25:16.84,Default,,0000,0000,0000,,and they are\Nvery, very important in fact, right? Dialogue: 0,0:25:17.60,0:25:20.71,Default,,0000,0000,0000,,- (man 4) But, just to refute...\N- (man 3) Thank you. Dialogue: 0,0:25:22.57,0:25:26.55,Default,,0000,0000,0000,,(man 4) The plausibility model\Nis combined with kind of two extra rules. Dialogue: 0,0:25:26.55,0:25:30.36,Default,,0000,0000,0000,,First of all,\Nif it's in Wikidata, it's true. Dialogue: 0,0:25:30.36,0:25:34.64,Default,,0000,0000,0000,,We just give you the benefit of the doubt,\Nso please make it good. Dialogue: 0,0:25:34.64,0:25:39.26,Default,,0000,0000,0000,,The second thing is if it's not\Nallowed by the schema it's false; Dialogue: 0,0:25:39.77,0:25:42.50,Default,,0000,0000,0000,,it's all the things in between\Nwe're looking at. Dialogue: 0,0:25:43.44,0:25:50.37,Default,,0000,0000,0000,,So if it's a planet according to Wikidata,\Nit will be a true fact. Dialogue: 0,0:25:53.13,0:25:57.41,Default,,0000,0000,0000,,But it won't predict surprises\Nbut what is important here Dialogue: 0,0:25:57.41,0:26:01.81,Default,,0000,0000,0000,,is that there's kind of\Nno manual human work involved, Dialogue: 0,0:26:01.81,0:26:03.63,Default,,0000,0000,0000,,so there's nothing\Nthat prevents you from... Dialogue: 0,0:26:03.63,0:26:05.94,Default,,0000,0000,0000,,Well, now, if we're successful\Nwith the approach, Dialogue: 0,0:26:05.94,0:26:09.02,Default,,0000,0000,0000,,there's nothing that prevents you\Nfrom continuously updating the model Dialogue: 0,0:26:09.02,0:26:12.48,Default,,0000,0000,0000,,with changes happening\Nin Wikidata and Wikipedia and so on. 
Dialogue: 0,0:26:12.48,0:26:18.13,Default,,0000,0000,0000,,So in theory, you should be able\Nto quickly learn new surprises. Dialogue: 0,0:26:18.13,0:26:19.66,Default,,0000,0000,0000,,(moderator) One last question. Dialogue: 0,0:26:20.22,0:26:23.16,Default,,0000,0000,0000,,- (man 4) Maybe we're biased by Wikidata.\N- Yeah. Dialogue: 0,0:26:23.68,0:26:27.56,Default,,0000,0000,0000,,(man 4) You are our bias.\NWhatever you annotate is what we believe. Dialogue: 0,0:26:27.56,0:26:31.70,Default,,0000,0000,0000,,So if you make it good,\Nif you make it balanced, Dialogue: 0,0:26:31.70,0:26:33.95,Default,,0000,0000,0000,,we can hopefully be balanced. Dialogue: 0,0:26:33.95,0:26:39.36,Default,,0000,0000,0000,,With the gender thing,\Nthere's actually an interesting thing. Dialogue: 0,0:26:39.95,0:26:42.30,Default,,0000,0000,0000,,We are actually getting\Nmore training facts Dialogue: 0,0:26:42.30,0:26:43.65,Default,,0000,0000,0000,,about women than men Dialogue: 0,0:26:43.65,0:26:48.95,Default,,0000,0000,0000,,because "she" is a much less\Nambiguous pronoun in the text, Dialogue: 0,0:26:48.95,0:26:51.60,Default,,0000,0000,0000,,so we actually get a lot more\Ntrue facts about women. Dialogue: 0,0:26:51.60,0:26:55.19,Default,,0000,0000,0000,,So we are biased, but on the women's side. Dialogue: 0,0:26:56.24,0:26:58.92,Default,,0000,0000,0000,,(woman 2) No, I want to see\Nthe data on that. Dialogue: 0,0:26:58.92,0:27:00.47,Default,,0000,0000,0000,,(audience laughing) Dialogue: 0,0:27:00.47,0:27:02.38,Default,,0000,0000,0000,,We should bring that along next time. Dialogue: 0,0:27:02.38,0:27:04.94,Default,,0000,0000,0000,,(man 4) You get had decision [inaudible]. Dialogue: 0,0:27:04.94,0:27:06.28,Default,,0000,0000,0000,,(man 3) Yes, hard decision. Dialogue: 0,0:27:07.88,0:27:13.00,Default,,0000,0000,0000,,(man 5) It says SLING is...\Nparser across many languages Dialogue: 0,0:27:13.00,0:27:15.16,Default,,0000,0000,0000,,- and you showed us English.\N- Yes! 
Dialogue: 0,0:27:15.16,0:27:17.93,Default,,0000,0000,0000,,(man 5) Can you say something about\Nthe number of languages that you are-- Dialogue: 0,0:27:17.93,0:27:19.16,Default,,0000,0000,0000,,Yes! Thank you for asking. Dialogue: 0,0:27:19.16,0:27:21.60,Default,,0000,0000,0000,,I had told myself to say that\Nup front on the first page Dialogue: 0,0:27:21.60,0:27:23.36,Default,,0000,0000,0000,,because otherwise,\NI would forget, and I did. Dialogue: 0,0:27:24.74,0:27:25.74,Default,,0000,0000,0000,,So right now, Dialogue: 0,0:27:25.74,0:27:29.88,Default,,0000,0000,0000,,we're not actually looking at two files,\Nwe're looking at 13 files. Dialogue: 0,0:27:29.88,0:27:32.77,Default,,0000,0000,0000,,So Wikipedia dumps\Nfrom 12 different languages Dialogue: 0,0:27:32.77,0:27:35.80,Default,,0000,0000,0000,,that we're processing, Dialogue: 0,0:27:35.80,0:27:41.48,Default,,0000,0000,0000,,and none of this is dependent\Non the language being English. Dialogue: 0,0:27:41.48,0:27:44.28,Default,,0000,0000,0000,,So we're processing this\Nfor all of the 12 languages. Dialogue: 0,0:27:48.24,0:27:49.24,Default,,0000,0000,0000,,Yeah. Dialogue: 0,0:27:49.24,0:27:50.24,Default,,0000,0000,0000,,For now, Dialogue: 0,0:27:50.24,0:27:56.62,Default,,0000,0000,0000,,they share the property of, I think,\Nbeing the Latin alphabet, and so on. Dialogue: 0,0:27:56.62,0:27:58.60,Default,,0000,0000,0000,,Mostly for us to be able to make sure Dialogue: 0,0:27:58.60,0:28:02.12,Default,,0000,0000,0000,,that what we are doing\Nstill makes sense and works. Dialogue: 0,0:28:02.12,0:28:04.96,Default,,0000,0000,0000,,But there's nothing\Nfundamental about the approach Dialogue: 0,0:28:04.96,0:28:09.87,Default,,0000,0000,0000,,that prevents it from being used\Nin very different languages Dialogue: 0,0:28:09.87,0:28:14.66,Default,,0000,0000,0000,,from those being spoken around this area. Dialogue: 0,0:28:17.28,0:28:19.32,Default,,0000,0000,0000,,(woman 3) Leila from Wikimedia Foundation. 
Dialogue: 0,0:28:19.32,0:28:21.85,Default,,0000,0000,0000,,I may have missed this\Nwhen you presented this. Dialogue: 0,0:28:22.90,0:28:28.38,Default,,0000,0000,0000,,Do you make an attempt to bring\Nany references from Wikipedia articles Dialogue: 0,0:28:28.39,0:28:32.43,Default,,0000,0000,0000,,back to the property and statements\Nyou're making in Wikidata? Dialogue: 0,0:28:33.36,0:28:37.22,Default,,0000,0000,0000,,So I briefly mentioned this\Nas a potential application. Dialogue: 0,0:28:37.22,0:28:40.35,Default,,0000,0000,0000,,So for now, what we're trying to do\Nis just to get this to work, Dialogue: 0,0:28:41.16,0:28:46.00,Default,,0000,0000,0000,,but let's say we did get it to work\Nwith a high level of quality, Dialogue: 0,0:28:46.62,0:28:51.24,Default,,0000,0000,0000,,that would be an obvious thing\Nto try to do, so when you... Dialogue: 0,0:28:52.81,0:28:55.19,Default,,0000,0000,0000,,Let's say you were willing to... Dialogue: 0,0:28:55.19,0:28:59.59,Default,,0000,0000,0000,,I know there's some controversy around\Nusing Wikipedia as a source for Wikidata, Dialogue: 0,0:28:59.59,0:29:01.96,Default,,0000,0000,0000,,that you can't have\Ncircular references and so on, Dialogue: 0,0:29:01.96,0:29:04.85,Default,,0000,0000,0000,,so you need to have\Nproperly sourced facts. Dialogue: 0,0:29:04.85,0:29:07.42,Default,,0000,0000,0000,,So let's say you were\Ncoming up with new facts, Dialogue: 0,0:29:07.42,0:29:14.31,Default,,0000,0000,0000,,and obviously, you could look\Nat the cover of news media and so on Dialogue: 0,0:29:14.31,0:29:16.22,Default,,0000,0000,0000,,and process these\Nand try to annotate these. Dialogue: 0,0:29:16.22,0:29:19.52,Default,,0000,0000,0000,,And then, that way,\Nfind sources for facts, Dialogue: 0,0:29:19.52,0:29:20.96,Default,,0000,0000,0000,,new facts that you come up with. Dialogue: 0,0:29:20.96,0:29:22.33,Default,,0000,0000,0000,,Or you could even take existing... 
Dialogue: 0,0:29:22.33,0:29:25.90,Default,,0000,0000,0000,,There are a lot of facts in Wikidata\Nthat either have no sources Dialogue: 0,0:29:25.90,0:29:29.64,Default,,0000,0000,0000,,or only have Wikipedia as a source,\Nso you can start processing these Dialogue: 0,0:29:29.64,0:29:32.80,Default,,0000,0000,0000,,and try to find sources\Nfor those automatically. Dialogue: 0,0:29:33.54,0:29:38.20,Default,,0000,0000,0000,,(Leila) Or even within the articles\Nthat you're taking this information from Dialogue: 0,0:29:38.20,0:29:41.88,Default,,0000,0000,0000,,just using the sources from there\Nbecause they may contain... Dialogue: 0,0:29:42.38,0:29:44.33,Default,,0000,0000,0000,,- Yeah. Yeah.\N- Yeah. Thanks. Dialogue: 0,0:29:47.43,0:29:49.32,Default,,0000,0000,0000,,- (moderator) Thanks Anders.\N- Cool. Thanks. Dialogue: 0,0:29:49.92,0:29:55.34,Default,,0000,0000,0000,,(applause)