[Music]

Herald: We'll do some live querying with him, so you were told to think of some ideas that we could search for in Wikidata. When we get to that point, I would ask you to raise your hand and wait till I get to you with the microphone, so that the people in the stream can also hear what we're talking about. So that's the thing. I'll go back to Lucas, and we still have translations. If you want to hear it in German, we still have translators who are doing their best to tell you everything again in German. So have a listen.

Lucas: Does anyone have a query? Yes, in the front there. We have a question already.

Audience member: Is it possible to find all circular family trees?

L: All circular what?

AM: Family trees.

L: It's certainly possible to find some. Finding all is probably going to be a timeout, but there would be something like select, probably child would be the simplest, so item, child plus, item again. So if we put the star like earlier, then every tree would match that, but with the plus it means it has to be at least one child link or more. And let's just add a "limit 1", because I'm not that optimistic that this is even going to find one, but I'm pretty sure we cannot find all of them. But let's see if we can find one, and this might just take a while. I don't think there is a good way to do this otherwise, unless you download one of the dumps, either the JSON data dumps or the RDF dumps, which is the same data format used here, and then you can do it locally without a timeout. I don't think there's much I can optimize about this, the query is pretty short, unless, like, I had an idea that people named John are more likely to have these kinds of cycles, then we could filter it down first, but, I mean... I'm afraid that is not going to work, it looks like. Yes, timeout. And you can see the thing is written in Java, the server [unintelligible].
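The query dictated above, and the "do it locally" fallback, can be sketched roughly as follows. This is a minimal sketch and not the exact text typed during the talk: `CYCLE_QUERY` assumes the "child" property is P40 (the ID mentioned right after this exchange), and `find_cycle` is a hypothetical helper that assumes the child links have already been extracted from a dump into a plain dictionary. The item IDs in the usage note are made up.

```python
# The query as described in the talk: an item that reaches itself again
# through one or more "child" (P40) links. On the public endpoint this
# is expected to time out, as it did live.
CYCLE_QUERY = """
SELECT ?item WHERE {
  ?item wdt:P40+ ?item.
}
LIMIT 1
"""


def find_cycle(children):
    """Return one cycle as a list of IDs (first == last), or None.

    `children` maps an item ID to a list of child IDs, for example
    built from a local dump. Hypothetical helper, not part of any
    Wikidata tooling.
    """
    UNSEEN, ON_PATH, DONE = 0, 1, 2
    state = {}
    for root in children:
        if state.get(root, UNSEEN) != UNSEEN:
            continue
        # Iterative depth-first search; `path` mirrors the stack.
        state[root] = ON_PATH
        path = [root]
        stack = [iter(children.get(root, ()))]
        while stack:
            for child in stack[-1]:
                seen = state.get(child, UNSEEN)
                if seen == ON_PATH:
                    # Back edge: the child is its own ancestor.
                    return path[path.index(child):] + [child]
                if seen == UNSEEN:
                    state[child] = ON_PATH
                    path.append(child)
                    stack.append(iter(children.get(child, ())))
                    break
            else:
                # All children explored without finding a cycle.
                state[path.pop()] = DONE
                stack.pop()
    return None
```

For example, `find_cycle({"A": ["B"], "B": ["C"], "C": ["A"]})` walks A, B, C and then meets A again, so it returns the loop starting and ending at `"A"`; an ordinary tree returns `None`. Running locally trades the endpoint's timeout for the cost of downloading and parsing a dump.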
One thing we can do with this "P40+" is something like, start with a certain mythical creature such as King Arthur. I hope I can find him like this. [unintelligible] There we go, that's the legendary British or Welsh king, and then we are searching for an item who is definitely a real human and who has a date of birth, and we say the date of birth should be greater than, let's say, 1950, and this is a date-time value, and let's even say 1980. I think that might be more efficient. There we go. No results, okay. I thought King Arthur had some real descendants. Though then it was some other mythical creature. Let's just start with any ancestor who has the item as child, and the ancestor is also an instance of mythical creature, mythical character. Let's see if we have any mythical characters with children who are born after 1950. Oh, I still have the "limit 1" here, could make that a "limit 10" probably or something. But I'm optimistic, I think there are some people here, especially, I think, even British MPs. That's already on the list of example queries, British MPs with mythical ancestors, and they have traced their lineage back to some 6th or 5th century person, and you have all the parent links in there, and then it's kind of tricky to figure out where it starts being wrong. Oh, that's not working out so well. Does anyone else have ideas in the meantime? There, way in the back.

Someone: Thank you.

Audience member: We all know that stupid game in Wikipedia where you try to find the Adolf Hitler page by only clicking links, so can you find the number of pages that are directly connected to the Adolf Hitler page in Wikipedia?

L: You can. Oh, that was a timeout, dammit. So that would be kind of... One funny story about that, for example: the first example we have here is cats, and why do we have cats and not dogs?
Because if you search for dogs, the second result, no, it's the fourth result by now, but that's the dog of Hitler, and we don't really want that, so we usually use cats as the example instead. But let's just search for anything where the item has any connection, and we don't care which property it is, to Adolf Hitler, like that, and we are going to find 920 results. Okay, some of these are sitelinks, so we also want the item to have some label, which uses this new namespace, and we want only the English label, so the language of the label should be English, and then we just select the item and the label, and hopefully that's still pretty efficient. Here we go. NSDAP membership number, that's actually a property, but I assume it has him as the example, yep, there's a property example here, NSDAP number 1. World War II has, probably... Cause of death, do we have him as an example on cause of death, really? And we have nitric acid poisoning, stroke, cholera, shot to the head, cyanide poisoning, hanging. That's a very pleasant list. Do we need to have that many property examples for cause of death? Yeah. Then we have Nazi Party, Klara Hitler, I don't know who that is, 1936 Summer Olympics, all kinds of things. So that's how you can find all the things with a direct connection to Hitler. Any other examples? Yes, over there on the right, or was there already someone back there that I missed?

AM: Can you find the cheapest public infrastructure projects in Germany?

L: The cheapest public infrastructure what?

AM: Projects.

L: Projects.

AM: Like a bridge building.

L: I don't think we're going to have a full dataset about that, but you can try. Let's start with a more expensive one and

[crackling noise]

L: See, perhaps move away from the box, that might help.
Let's start with a very expensive project and see just what the data model looks like. So what does an infrastructure project look like, what was the cost? So the cost is probably going to be in euro, and I don't know how to write... here, over there, okay, it's a property called cost, in euro. And does it have something like instance of? International airport, building under construction, greenfield airport, proposed airport being built. So we could check first: Berlin Brandenburg Airport, is that an instance of some subclass of public infrastructure? Is that a thing? That looks like the wrong item. What is this? This is nothing. Okay. Is there anything linked to this item? No, nothing [unintelligible]. Okay. So it could be: an international airport is a subclass of airport, which is a subclass of aerodrome, which is an architectural structure, and we can search for architectural structures. So the structure would be an instance of a subclass of architectural structure, and it would have a cost, and order by descending cost, I think, limit 10. And we're probably going to get things in, like, yen or some other currency where this number is just going to be very high, because we're not taking any conversions into account right now, but let's see if we find something there. What is it doing? Okay... not sure why this is taking so long. Let's try a second version in the meantime, where the quantity amount is the cost and the quantity's unit should be the euro. That's still running, and, yeah, let's try whether that works any better or not. Okay, this was a timeout. This looks like it's going to be a timeout as well. I don't know, we can just search for the most expensive things at all. Remove this part, there we go. This costs 55 billion euros. What is this thing? Power of Siberia, natural gas pipeline. That's, that's in euro, the cost? Apparently. And then this is 15 billion euros, and then 8.77... find something... that's the Channel, oh, the Channel Tunnel is expensive.
The Brenner Tunnel was also expensive.

[laughter]

And Stuttgart 21 took about 21 whatever, was also, or is projected to be expensive. Do we have one cost or several? Okay, in 2018 we have a cost of 7 billion. Yeah, so let's sort by the ascending cost instead, because that was what we actually wanted, and then we get... okay, now we're going to get a lot of things that aren't really infrastructure projects. We have "The Hot and Energetic Universe". Does that mean it's a no-budget film or what? Okay. So we would need some kind of... Let's say, let's do duck typing: instead of saying it is an infrastructure project, let's say it has, I don't know, a coordinate location. And if it has a coordinate location, we're going to call it some kind of infrastructure project, or at least it's not going to be a documentary film. Perhaps that works better. Yeah, so 21,000 euros cost this thing, which was in France. Oh, okay, right, it should also be country Germany. Here we go. That's 400,000 euros for a fountain in Stuttgart. Does that count? I guess. And that's an instance of something, it doesn't even have a German la... an English label, just a German one. Wait... Oh, so this is the class of all the fountains with exactly this name, which are a subclass of well and are all named after this goddess, okay, cool. Yeah, so then we have some of these cheap projects, which is… this public square… a bridge, oh yeah, there's this tiny bridge, a footbridge, has even an image, that's what it looks like, and it costs, what was it, 1.6 million euros already. Wow. And then we have another public square. Yeah. So, "cheap public infrastructure projects". And also probably "infrastructure" in quotes, because we're really just saying it has a location and "Country: Germany". And, yeah, I can send this query around afterwards. And this didn't work, this didn't work. Okay, any other ideas? That's bad news. We could try to continue with some of these. Was there something? Oh, from the Camera Angel!
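The "duck typing" query that the infrastructure question ended up with can be sketched roughly like this. It is a reconstruction under assumptions, not the query from the talk: it uses P2130 for cost, P625 for coordinate location and P17 for country, keeps the euro unit filter (Q4916) even though the transcript does not say whether the final version still had it, and wraps the text in a hypothetical helper `cheapest_located_things` so the country can be swapped.

```python
import re

EURO = "Q4916"    # Wikidata item for the euro
GERMANY = "Q183"  # Wikidata item for Germany


def cheapest_located_things(country=GERMANY, unit=EURO, limit=10):
    """Build a SPARQL query for the cheapest items that have a cost,
    a coordinate location and the given country.

    Hypothetical helper for illustration; only builds the query text.
    """
    for qid in (country, unit):
        if not re.fullmatch(r"Q[1-9][0-9]*", qid):
            raise ValueError(f"not an item ID: {qid!r}")
    return f"""
SELECT ?item ?itemLabel ?cost WHERE {{
  # Full cost value, so amount and unit can be read separately.
  ?item p:P2130/psv:P2130 ?costValue.
  ?costValue wikibase:quantityAmount ?cost;
             wikibase:quantityUnit wd:{unit}.
  # "Duck typing": anything with coordinates counts as infrastructure.
  ?item wdt:P625 ?location;
        wdt:P17 wd:{country}.
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en,de". }}
}}
ORDER BY ASC(?cost)
LIMIT {int(limit)}
"""
```

Calling `print(cheapest_located_things())` gives text that can be pasted into the query service. Filtering on the unit avoids comparing euro amounts with, say, yen amounts, which was the concern raised when sorting by cost without any conversion.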
AM: I have a question! I saw that with the Wikidata Query Service we can draw these nice trees and have images in them, and one example that came to my mind was all the programming patterns, programming design patterns, but grouped by their kind, like they're structural patterns, convenience patterns, and so on, and, like, can we draw a graph and maybe put an image in them?

L: We can try that. So let's see how that's modeled, I don't know, with the visitor pattern for example. That's a design pattern, what kind of statements does it have? It's a subclass of behavioral pattern, is this a programming thing or already…? Oh yeah, it's a soft… okay, it's a software design pattern. So we should say... We're going to have a pattern with its label and a pattern kind with its label, and the pattern is going to be a subclass of the pattern kind, which is going to be some subclass of, what was it? Of software design pattern, and I'm just going to copy this ID so it's the right one. Label service, and say, I would like to see this by default in the graph view. Here we go. Well, that looks not as bad as I thought. We have a lot of structural patterns, behavioral patterns, one architectural pattern, a few creational patterns, and one fundamental pattern. Yeah. And… yeah, what we could also do is, if we do this, then we should also see connections of all of these. Now we have the tree rooted at software design patterns, we have monads, and fundamental pattern is a kind of software design pattern. Structural pattern… and it's all linked there, and this is working… very well, I… That's much better than I expected. I expected a huge mess, because it sometimes gets difficult to determine when you should use "instance of" and when you should use "subclass of". Like, if it's software or patterns like this, I expected we would have to account for both of these, but this looks very good to me. I think we don't need to do anything with this query.
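The pattern tree described above can be sketched roughly as follows. This is an assumption-laden reconstruction: it uses P279 ("subclass of") for both links, the `#defaultView:Graph` comment to request the graph view, and a hypothetical helper `pattern_tree_query` that takes the root class ID as a parameter, because the ID that was copied during the talk is not in the transcript.

```python
import re


def pattern_tree_query(root_qid):
    """Build a graph-view query: each pattern, linked to its kind,
    where the kind is the root class or one of its subclasses.

    `root_qid` is the item ID of the root class (in the talk, the
    item for "software design pattern"). Hypothetical helper.
    """
    if not re.fullmatch(r"Q[1-9][0-9]*", root_qid):
        raise ValueError(f"not an item ID: {root_qid!r}")
    return f"""#defaultView:Graph
SELECT ?pattern ?patternLabel ?kind ?kindLabel WHERE {{
  ?pattern wdt:P279 ?kind.          # pattern is a subclass of its kind
  ?kind wdt:P279* wd:{root_qid}.    # kind is the root or below it
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
"""
```

Because the star in `wdt:P279*` also matches zero steps, the root class itself can appear as a kind, which is one way to get a single tree rooted at the top class rather than several disconnected clusters.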
Yeah, so that is, uhm, software design patterns by a pattern tree. Okay. Any other ideas? Or I can try to keep optimizing this one.

AM: Which cities have applied to be host city of the Eurovision Song Contest the most times but were never successful?

L: Oh!

[laughter from audience]

L: That's a very good question. I don't know if we have... do you know who applied for this year or for some year? But I could check if that's modeled anywhere. Uhm, I need some example cities, so… let's check ESC 2018, if it has information on where it took place, which one won the bid, but also who was nominated or something, or who applied… We have "presenters", we have "followed by", "start time", "end time", "participants", we have the winner, do we have a location at all? Oh yeah, there it is. Okay, we have a country, and a location, but I'm not seeing any other countries here, and I assume that information is not going to be on the country item. It's possible that we have some separate item for "Eurovision 2018 Bid" or… Well, wait, it would have to be "which city", because the country is determined by the winner, isn't it? So the city, but I suspect we don't have that information. We have a list of host cities, but that's just… a Wikipedia list article.

[interference noise]

Do we have to switch to the other mic? Oh no, that sounds great! Okay. Yeah, so we don't have any of the structured information here. It's just linking all of these Wikipedia articles together, and then here is the actual list with the different venues. But I don't think we have that information in Wikidata at the moment. We could add it, you'd have to figure out the data model, but it would probably be relevant enough, I think. I wonder if we have that for the Olympic Games. So, Olympics 2020, do we have the process of who applied to host those? Uhm. We have a location. We have parts. Let's check. Perhaps English Wikipedia has a separate article about the selection process for the 2020 Summer Olympics.
Doesn't look like it. "Host city selection". No, I don't see a main… oh no, there! "Bids for the 2020 Summer Olympics", that's the Wikipedia article. Does that have any useful information on Wikidata? Bids for Olympic Games, no. Damnit. So you can see when these bids all happened, but we don't have the bidding countries and cities, apparently, on Wikidata, at least not as far as I can see. Bids for the… 2012 for example… No, sadly, we don't have that information yet. Did this one run, by the way? No. Any other questions?

Herald: Our translation angels had a question.

H: They want to know if you can give them the countries with the most colorful flags.

L: Yes! That [interference noise] should be possible. So "select country", and the "count of the colors as count" [interference noise] the country has, oops, has a flag, not the "flag image", a flag, and the flag has color. And it should be "color" and not "colors", and then we group by country, so this is a bit like a [noise] grouping and aggregate functions

[interference noise]

Do we need to use the other microphone? [noise] Okay. [noise] But then you can't really walk around anymore.

H: Hello hello? Hello hello? Do we still need to do something there?

L: Okay, so now… This could be really fun! Yeah, so we are searching for countries with flags, and hope that the flags have colors, and we're counting them, and what I didn't do is… what's this? Do I want to know? Okay, okay, at least it's not the straight pride flag, I guess. Does this have 14 colors? No, what was it? No, eight, I guess, one, two, three, four, five, six, seven, eight, yeah. That's accurate. Yeah, I didn't filter for countries here. The thing is, country is really a stupidly complicated term, so what I did was… queries… I have a pre-prepared query for the UN member states somewhere, which I just copy all the time. And this is now going to be called a state, and then we only get state flags, uhm, and there's exactly... oh, right.
I need to group by "state" and "state label" and copy these up here as well, and then it will hopefully work, and we will find out that… the United Kingdom has… 12? I suspect that's because it has four flags, which all have the same rank, or a no... no, it should be five, right? United Kingdom and Northern Ireland, Scotland, Wales and England. Let's search for "flag". Flag is the flag of the United Kingdom, no? Why does it have 12 colors? It has blue, red, white… wat. I see. But that still doesn't explain the 12. Let's count only the distinct colours, "distinct", there's auto-completion, thank God, perhaps that helps. Though I don't know why it would have… oh, it would have had the state multiple times, because it's a sovereign state multiple times, probably. Let's check. Yeah, the United Kingdom is, it's a Commonwealth realm, and an island nation, and a sovereign state, and that's probably why we got it multiple times, and, yeah, that looks more reasonable. South Africa, Ecuador, South Sudan. And what we can also do is add, of the flag, the image, and call that "i", because I can't be bothered to type the whole thing, and add that here, and also add it to the "group by", because otherwise it's not the right aggregate and I can't be bothered to write "sample" with one hand, and then we can hopefully also see it. Oh, we get two images of the flag of South Africa. That also looks like one of them should be "preferred rank", but anyways, we can switch to image grid, and then we can see all these colorful flags. One, two, three, four, five, six, yeah. That's six. And this is more than six, so I guess, I would say that should actually be two separate items, for this old flag and... no, this old flag and the new flag, but… This is six… is that only six colors? I'll believe it. This is six colors, six, and then we have five colors, yeah. So here are the, let's just add a comment there, and I will tweet this out later as well, "colorful state flags". Yeah.
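The flag query and the reason "distinct" was needed can be sketched roughly as follows. Both parts are illustrative assumptions rather than the talk's own text: `FLAG_QUERY` uses P163 (flag), P462 (color) and P18 (image), and stands in for the prepared UN member states snippet with "instance of a subclass of sovereign state" (Q3624078), which is only a guess at what that snippet looks like. `count_colors` is a hypothetical toy function showing how a join that matches the same state several times inflates a plain count.

```python
FLAG_QUERY = """#defaultView:ImageGrid
# colorful state flags
SELECT ?state ?stateLabel ?i (COUNT(DISTINCT ?color) AS ?count) WHERE {
  # Assumed stand-in for the prepared "UN member states" snippet.
  ?state wdt:P31/wdt:P279* wd:Q3624078;
         wdt:P163 ?flag.
  ?flag wdt:P462 ?color;
        wdt:P18 ?i.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?state ?stateLabel ?i
ORDER BY DESC(?count)
"""


def count_colors(rows, distinct):
    """Count colors per state from (state, color) result rows.

    With distinct=False every row counts, so a state that the join
    matched several times gets its colors counted several times.
    """
    per_state = {}
    for state, color in rows:
        per_state.setdefault(state, []).append(color)
    return {
        state: len(set(colors)) if distinct else len(colors)
        for state, colors in per_state.items()
    }


# Toy rows: one state matched through four class paths, three colors.
TOY_ROWS = [("UK", c) for _ in range(4) for c in ("blue", "red", "white")]
```

With the toy rows, `count_colors(TOY_ROWS, distinct=False)` counts twelve rows for the one state, while `distinct=True` counts three colors. Adding `?i` to the GROUP BY, as in the talk, means a flag with two images produces two result rows; an aggregate such as `SAMPLE(?i)` would be the alternative mentioned there.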
And also we can use the image grid as the default view. We probably have time for one more question, if it's a short one. Though I won't be able to type very fast. Yes, let's hope this works. Otherwise I can repeat it for the stream if I hear you.

AM: So does it work? Yep, seems so. I don't know if it's possible, but the smallest images that are on Wikipedia? So, by image size?

L: That would not be possible with the Query Service, I think. But I think on Commons you can search… can you search? Whoops, I don't have that search shortcut set up here. Can you search by image size? I think that might be possible. Advanced search, file type, sorting order… No. You could probably sort by file size in an SQL query. Which is not a thing from the Wikidata Query Service, but it's possible with something else, and as it happens I am going to have another talk later, where I talk about, among other things, how you can write SQL queries against the Wikipedia databases, and then we might be able to find a solution for that. And that's, I think, at 6 p.m. today over in the Esszimmer, or you come over to me after the talk and then I can try to figure it out there.

H: A last emergency idea that we have to try out?

H: I'm muted. Do you have ano... one more idea? A small idea maybe we could do, but other than that I think we are, so... filled the time quite well.

L: Yeah, I think we're done. But if you have any other ideas, you can always contact me on Twitter @wikidatafacts, or on Mastodon as well, and then I will see what I can do for you. Yeah. Thanks.

H: Thank you very much, Lucas, that was a great introduction to Wikidata querying!

[Music]

Subtitles created by c3subtitles.de in the year 2021. Join, and help us!