-
Music
-
Herald: We'll do some live querying with him
so you were told to think of some ideas that
-
we could search for in Wikidata and when
we get to that point I would ask you to
-
raise your hand and wait till I get to you
with the microphone so that the people in
-
the stream can also hear what we're
talking about so that's the thing I'll go
-
back to Lucas and we still have
translations. If you want to hear it in German,
-
we still have translators who are doing their best
to tell it to you again in German. So have a listen.
-
Lucas: Does anyone have a query?
Yes, in the front there. We have a question already.
-
Audience member: Is it possible to find all circular family trees?
-
L: All circular what?
-
AM: Family trees
L: It's certainly possible to find
-
some. Finding all is probably going to
be a timeout but there would be something
-
like select, probably child would be
the simplest, so item child plus item
-
again. So if we put the star like earlier
then every tree would
-
match that, but with the plus it means it
has to be at least one child link or more,
-
and let's just add a "limit 1" because I'm
not that optimistic that this is even
-
going to find one, but I'm pretty sure we
cannot find all of them, but let's see if
-
we can find one and this might just take a
while, but I don't think there is a good
-
way to do this otherwise unless you
download one of the dumps either the JSON
-
data dumps or the RDF dumps which is the
same data format used here and then you
-
can do it locally without a timeout.
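
A minimal sketch of the cycle query described here, written as a Python helper that only builds the SPARQL text (it does not contact the Query Service). The function name `cycle_query` is hypothetical. `P40` is the "child" property named later in the talk, and the `wdt:` prefix is assumed to be predefined, as it is on the Wikidata Query Service.

```python
def cycle_query(limit: int = 1) -> str:
    """Return SPARQL that looks for an item which is its own descendant.

    `wdt:P40+` is a property path meaning "one or more child links".
    With `*` instead of `+`, zero links would also count, so every item
    would trivially match itself.
    """
    return (
        "SELECT ?item WHERE {\n"
        "  ?item wdt:P40+ ?item .\n"
        "}\n"
        f"LIMIT {limit}"
    )


if __name__ == "__main__":
    print(cycle_query())
```

As in the talk, this is expected to time out on the public endpoint. The same text could be run against a local copy of the RDF dumps instead.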
I don't think there's much I can optimize
-
about this, the query is pretty short, unless,
like I had an idea that people named John
-
are more likely to have these kinds of
cycles, then we could filter it down first
-
but, I mean... I'm afraid that is not going
to work it looks like. Yes, timeout. And
-
you can see the thing is written in Java
from the server stack trace. One thing we
-
can do with this "P40+" is something like
start the search with a certain mythical
-
creature such as King Arthur. I hope I can
find him like this. [unintelligible]
-
There we go, that's the
legendary British or Welsh king and then
-
we are searching for an item who is
definitely a real human and who has a date
-
of birth and we say the date of birth
should be greater than, let's say, 1950, and
-
this is a date-time value, and let's
even say 1980. I think that might be
-
more efficient. There we go. No
results, okay. I thought King Arthur had
-
some real descendants. Then it was
some other mythical creature. Let's just
-
start with any ancestor who has the item
as child and the ancestor is also instance
-
of mythical creature, mythical character.
Let's see if we have any mythical
-
characters with children who are born
after 1950. Oh I still have the "limit 1"
-
here could make that a "limit 10" probably
or something, but I'm optimistic I think
-
there are some people here, especially, I
think, even British MPs, there's some
-
that's already on the list of example
queries British MPs with mythical
-
ancestors, and some of them have traced
their lineage back to some 6th or 5th
-
century person, and you have all the
parent links in there, and then it's kind
-
of tricky to figure out where it starts
being wrong. Oh that's not working out so
-
well. Does anyone else have ideas in the
meantime? There, way in the back.
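
A sketch of the "mythical ancestor" query from this part of the session, again only building the SPARQL text. The helper name and its parameters are hypothetical. `P31` (instance of), `P40` (child), `P569` (date of birth) and `Q5` (human) are standard Wikidata identifiers. The class ID passed in the usage line is assumed to be "mythical character" and should be checked on Wikidata before use.

```python
def mythical_descendants_query(ancestor_class: str,
                               born_after_year: int,
                               limit: int = 10) -> str:
    """Return SPARQL for humans born after a year who descend, via one
    or more child (P40) links, from an instance of `ancestor_class`."""
    return (
        "SELECT ?ancestor ?item ?dob WHERE {\n"
        f"  ?ancestor wdt:P31 wd:{ancestor_class} ;\n"
        "            wdt:P40+ ?item .\n"
        "  ?item wdt:P31 wd:Q5 ;\n"
        "        wdt:P569 ?dob .\n"
        # Dates of birth are xsd:dateTime values, so compare against one.
        f'  FILTER(?dob > "{born_after_year}-01-01T00:00:00Z"^^xsd:dateTime)\n'
        "}\n"
        f"LIMIT {limit}"
    )


if __name__ == "__main__":
    # "Q4271324" is an assumed ID for "mythical character".
    print(mythical_descendants_query("Q4271324", 1950))
```

The `P40+` path is what makes this expensive. Starting from one fixed ancestor item instead of a whole class, as tried first in the talk, narrows the search.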
-
Someone: Thank you
Audience member: We all know that stupid
-
game in Wikipedia where you try to find
the Adolf Hitler page by only clicking
-
links, so can you find the number of pages
that are directly connected to the Adolf
-
Hitler page in Wikipedia?
L: You can. Oh that was a timeout, dammit.
-
So that would be kind of... One funny
story about that, for example, is:
-
the first example we
have here is cats, and why do we have cats
-
and not dogs? Because if you search for
dogs, the second result, no, it's the
-
fourth result by now, but that's the
dog of Hitler, and we don't really want
-
that, so we usually use cats
as the example instead. But let's just
-
search for anything where the item has any
connection and we don't care which
-
property it is to Adolf Hitler, like that,
and we are going to find 920 results. ok
-
some of these are site links so we also
want the item to have some label which
-
uses this new namespace and we want only
the English label so the language of the
-
label should be English, and then we
just select the item and the label and
-
hopefully that's still pretty efficient.
Here we go: NSDAP membership number,
-
that's actually a property, but I assume it
has him as the example. Yep, there's a property
-
example here, as NSDAP number 1. World War
II, and the property "cause of death": do we
-
have him as an example on cause of death,
really? And we have nitric acid poisoning,
-
stroke, cholera, shot to the head, cyanide
poisoning, hanging. That's a very pleasant
-
list. Do we need to have that many
property examples for cause of death? Yeah, then we
-
have Nazi Party, Klara Hitler, I don't
know who that is, 1936 Summer Olympics,
-
all kinds of things, so that's how
you can find all the things with a direct
-
connection to Hitler. Any other
examples? Yes, over there on the right, or
-
was there already someone back
there that I missed?
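
A sketch of the "anything directly connected to this item" query from the previous answer. The helper name is hypothetical. `Q352` is the Wikidata item for Adolf Hitler, and the `rdfs:label` filter is the step described above for dropping results that have no English label, such as sitelinks.

```python
def linked_items_query(target: str = "Q352") -> str:
    """Return SPARQL for every item that points at `target` through any
    property, keeping only items that have an English label."""
    return (
        "SELECT ?item ?label WHERE {\n"
        # ?property is left unbound: we do not care which property links them.
        f"  ?item ?property wd:{target} .\n"
        "  ?item rdfs:label ?label .\n"
        '  FILTER(LANG(?label) = "en")\n'
        "}"
    )


if __name__ == "__main__":
    print(linked_items_query())
```

An item linked through several properties will appear once per property. Using `SELECT DISTINCT` would collapse those rows.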
-
AM: Can you find the
cheapest public infrastructure projects in
-
Germany?
L: The cheapest public infrastructure
-
what?
AM: Projects
-
L: Projects
AM: Like a bridge building
-
L: I don't think we're going to have a
full dataset about that but you can try.
-
Let's start with a more expensive one and
[crackling noise]
-
L: see - perhaps move away from the box,
that might help. Let's start with a very
-
expensive project and see just what the
data model looks like. So what does
-
an infrastructure project look like,
what was the cost? So the cost is probably
-
going to be in euro, and I don't know how
to write... here, over there, okay, it's a
-
property called cost, in euro, and does it
have something like instance of?
-
International airport, building under
construction, greenfield airport, proposed
-
airport being built. So we could check
first: is Berlin Brandenburg Airport, is
-
that an instance of some subclass of
public infrastructure? Is that a thing?
-
That looks like the wrong item. What is
this? This is nothing. Okay. Is there anything
-
linked to this item? No, nothing,
[unintelligible]. Okay. So it could be... an
-
international airport is a subclass of
airport, which is a subclass of
-
aerodrome, which is an architectural
structure, and we can search for
-
architectural structures, so the structure
would be an instance of a subclass of
-
architectural structure, and it would have
a cost, and order by descending cost,
-
limit 10, I think, and we're probably going to
get things in like yen or some other
-
currency where this number is just going
to be very high because we're not taking
-
any conversions into account right now but
let's see if we find something there. What
-
is it doing? Okay... not sure why this is
taking so long. Let's try a second version
-
in the meantime, where the quantity amount
is the cost and the quantity's unit
-
should be the euro. That's still running,
and, yeah, let's see if this works any
-
better or not? Okay, this was a timeout. This
looks like it's going to be a timeout as
-
well. I don't know, we can just search for
the most expensive things at all. Remove
-
this part, there we go. This costs 55
billion euros. What is this thing? Power
-
of Siberia, natural gas pipeline. That's,
that's in euro, the costs? Apparently. And
-
then this is 15 billion euros and then
8.77... find something... that's the Channel, oh,
-
the Channel Tunnel is expensive. The
Brenner Tunnel was also expensive.
-
laughter
And Stuttgart 21 took about 21 whatever
-
was also- or is projected to be expensive.
Do we have one cost or several? Okay in
-
2018 we have a cost of 7 billion. Yeah, so
let's sort by ascending cost instead,
-
because that was what we actually wanted
and then we get... okay now we're going to
-
get a lot of things that aren't really
infrastructure projects. We have...
-
"The Hot and Energetic Universe". Does
that mean it's a no budget film or what?
-
Okay. So we would need some kind of ...
Let's say, let's do duck typing instead of
-
saying it is an infrastructure project,
let's say it has, I don't know, a
-
coordinate location. And if it has a
coordinate location, we're going to call
-
it some kind of infrastructure project, or
at least it's not going to be a
-
documentary film. Perhaps that works
better. Yeah, so 21,000 euros cost this
-
thing which was in France. Oh, okay,
right, it should also be country Germany.
-
Here we go. That's 400,000 euros for
a fountain in Stuttgart. Does that count? I
-
guess. And that's an instance of something,
it doesn't even have a German la- an
-
English label, just a German one. Wait...
Oh, so this is the class of all the
-
fountains with exactly this name which are
a subclass of well and are all named after
-
this goddess, okay, cool. Yeah so then we
have some of these cheap projects, which
-
is… this public square… a bridge – oh
yeah, there's this tiny bridge, a
-
footbridge, has even an image, that's what
it looks like, and it costs, what was it,
-
1.6 million euros already. Wow. And then
we have another public square. Yeah. So,
-
"cheap public infrastructure projects".
And also probably "infrastructure" in
-
quotes, because we're really just saying
it has a location and "Country: Germany".
-
And, yeah, I can send this query around
afterwards. And this didn't work, this
-
didn't work. Okay, any other ideas? That's
bad news. We could try to continue with
-
some of these. Was there something? Oh,
from the Camera Angel!
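
A sketch of where the "cheap infrastructure" query ended up: anything with a cost in euro, a coordinate location, and country Germany, sorted by ascending cost. The helper name is hypothetical. `P625` (coordinate location), `P17` (country), `Q183` (Germany) and `Q4916` (euro) are standard identifiers, and `P2130` is, to my knowledge, the "cost" property. Whether the final query in the session still restricted the unit to euro is an assumption.

```python
def cheap_located_things_query(country: str = "Q183",
                               unit: str = "Q4916",
                               limit: int = 10) -> str:
    """Return SPARQL for the cheapest things in `country` that have a
    coordinate location (the "duck typing" stand-in for infrastructure)."""
    return (
        "SELECT ?thing ?thingLabel ?cost WHERE {\n"
        # Go through the full value node to read the amount and its unit.
        "  ?thing p:P2130/psv:P2130 ?costValue .\n"
        "  ?costValue wikibase:quantityAmount ?cost ;\n"
        f"             wikibase:quantityUnit wd:{unit} .\n"
        "  ?thing wdt:P625 ?coordinates ;\n"
        f"         wdt:P17 wd:{country} .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}\n"
        "ORDER BY ASC(?cost)\n"
        f"LIMIT {limit}"
    )


if __name__ == "__main__":
    print(cheap_located_things_query())
```

Filtering on the unit avoids comparing raw amounts across currencies, which is the yen problem mentioned earlier in the answer.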
-
AM: I have a question! I saw that with
Wikidata Query Service we can draw these
-
nice trees and have images in them, and
one example that came to my mind was all
-
the programming patterns – programming
design patterns, but grouped by their
-
kind, like, there are structural patterns,
convenience patterns, and so on, and like,
-
can we draw a graph and maybe put an image
in them.
-
L: We can try that. So let's see how
that's modeled, I don't know, with the
-
visitor pattern for example. That's a
design pattern. What kind of statements
-
does it have? It's a subclass of behavioral
pattern, is this a programming thing or
-
already…? Oh yeah it's a soft… okay it's a
software design pattern. So we should say ...
-
We're going to have a pattern with
its label and a pattern kind with its
-
label and the pattern is going to be a
subclass of the pattern kind, which is
-
going to be some subclass of – what was
it? Of software design pattern – and I'm
-
just going to copy this ID so it's the
right one – label service, and say, I
-
would like to see this by default in the
graph view. Here we go. Well that looks
-
not as bad as I thought. We have a lot of
structural patterns, behavioral patterns,
-
one architectural pattern, a few
creational patterns, and one fundamental
-
pattern. Yeah. And… yeah what we could
also do is, if we do this, then we should
-
also see connections of all of these.
Now we have the tree rooted at
-
software design patterns, we have monads,
and fundamental pattern is a kind of
-
software design pattern. Structural
pattern… and it's all linked there and
-
this is working… very well, I… That's much
better than I expected. I expected a huge
-
mess of… because it sometimes gets
difficult to determine when you should use
-
"instance of" and when you should use
"subclass of", like if it's software or
-
patterns like this, I expected we would
have to account for both of these, but
-
this looks very good to me. I think we
don't need to do anything with this query.
-
Yeah, so that is, uhm, software design
patterns by a pattern tree.
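
A sketch of the design-pattern tree query, built as text only. The helper name is hypothetical, and the root ID `Q181156` is assumed to be "software design pattern" (in the talk the ID was copied from the item page). `P279` is "subclass of", and `#defaultView:Graph` is the Query Service comment that selects the graph view by default.

```python
def pattern_tree_query(root: str = "Q181156") -> str:
    """Return SPARQL linking each pattern to its pattern kind, where the
    kind is the root class or any (transitive) subclass of it."""
    return (
        "#defaultView:Graph\n"
        "SELECT ?pattern ?patternLabel ?kind ?kindLabel WHERE {\n"
        "  ?pattern wdt:P279 ?kind .\n"
        # The * also matches the root itself, so the tree is rooted there.
        f"  ?kind wdt:P279* wd:{root} .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}"
    )


if __name__ == "__main__":
    print(pattern_tree_query())
```

If some patterns were modeled with "instance of" rather than "subclass of", the first triple could use the path `wdt:P31|wdt:P279` instead. The session found that this was not needed.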
-
Okay. Any other ideas? Or I can try to
keep optimizing this one
-
AM: Which cities have applied to be host
city of the Eurovision Song Contest the
-
most times but were never successful?
L: Oh!
-
Laughter from Audience
L: That's a very good question. I don't
-
know if we have– do you know who applied
for this year or for some year? But I
-
could check if that's modeled
anywhere. Uhm, I need some example cities
-
so… let's check ESC 2018 if it has
information on where it took place, which
-
one won the bid, but also who was
nominated or something, or who applied… We
-
have "presenters", we have "followed by",
"start time", "end time", "participants",
-
we have the winner, do we have a location
at all? Oh yeah, there it is. Okay, we
-
have a country, and a location, but I'm
not seeing any other countries here, and I
-
assume that information is not going to be
on the country item. It's possible that we
-
have some separate item for "Eurovision
2018 Bid" or… Well wait, it would have to
-
be "which city", because the country is
determined by the winner isn't it? So the
-
city, but I suspect we don't have that
information. We have a list of host
-
cities, but that's just… a Wikipedia list
article.
-
Interference noise
Do we have to switch to the other mic? Oh
-
no, that sounds great! Okay. Yeah, so we
don't have any of the structured
-
information here. It's just linking all of
these Wikipedia articles together, and
-
then here is the actual list with the
different venues. But I don't think we
-
have that information in Wikidata at the
moment. We could add it, you'd have to
-
figure out the data model, but it would
probably be relevant enough, I think.
-
I wonder if we have that for the Olympic
Games. So, Olympics 2020, do we have the
-
process of who applied to host those? Uhm.
We have a location. We have parts. Let's
-
check. Perhaps English Wikipedia has a
separate article about the selection
-
process for the 2020 Summer Olympics.
Doesn't look like it. "Host city
-
selection". No I don't see a main… oh no,
there! "Bids for the 2020 Summer
-
Olympics", that's the Wikipedia article.
Does that have any useful information on
-
Wikidata? Bids for Olympic Games... no.
Damn it. So you can see when these bids
-
all happened, but we don't have the
bidding countries and cities apparently on
-
Wikidata, at least not as far as I can
see. Bids for the… 2012 for example…
-
No, sadly, we don't have that information
yet. Did this one run, by the way? No.
-
Any other questions?
Herald: Our translation angels had a question.
-
H: They want to know if you can give them the
countries with the most colorful flags.
-
L: Yes! That [interference noise] should
be possible. So "select country", and the
-
"count of the colors as counts"
[interference noise] the country has,
-
oops, has a flag, not the "flag image", a
flag, and the flag has color. And it
-
should be "color" and not "colors", and
then we group by country so this is a bit
-
like a [noise] grouping and aggregate
functions
-
[Interference noise]
-
Do we need to use the other microphone?
[Noise] Okay [Noise] But then you can't
-
really walk around anymore.
H: Hello hello? Hello hello? Does something
-
else need to be done there?
L: Okay so now… This could be really fun!
-
Yeah, so we are searching for countries
with flags, and hope that the flags have
-
colors, and we're counting them, and what I
didn't do is… what's this? Do I want to
-
know? Okay, okay, at least it's not
the straight pride flag, I guess. Does
-
this have 14 colors? No, what was it? No,
eight, I guess, one, two, three, four,
-
five, six, seven, eight, yeah. That's
accurate. Yeah, I didn't filter for
-
countries here, the thing is, country is
really a stupidly complicated term, so
-
what I did was… queries… I have a pre-
prepared query for the UN member states
-
somewhere, which I just copy all the time.
And this is now going to be called a
-
state, and then we only get state flags,
uhm, and there's exactly– oh, right. I
-
need to group by "state" and "state label"
and copy these up here as well, and then
-
it will hopefully work, and we will find
out that… the United Kingdom has… 12?
-
I suspect that's because it has four flags,
which all have the same rank, or a no– no
-
it should be five, right? United Kingdom
and Northern Ireland, Scotland, Wales and
-
England. Let's search for "flag". Flag is
the flag of the United Kingdom, no? Why
-
does it have 12 colors? It has blue, red,
white… wat. I see. But that still doesn't
-
explain the 12. Let's count only the
distinct colours, "distinct", there's
-
auto-completion, thank God, perhaps that helps.
Though I don't know why it would have…
-
oh it would have had the state multiple
times because it's a sovereign state
-
multiple times probably. Let's check. Yeah
the United Kingdom is, it's a Commonwealth
-
realm, and an island nation, and a
sovereign state, and that's probably why
-
we got it multiple times, and, yeah that
looks more reasonable. South Africa,
-
Ecuador, South Sudan, and what we can also
do is, add the, of the flag, the image and
-
call that I, because I can't be bothered
to type the whole thing, and add that
-
here, and also add it to the "group by",
because otherwise it's not the right
-
aggregate and I can't be bothered to write
"sample" with one hand, and then we can
-
hopefully also see it. Oh, we get two
images of the flag of South Africa. That
-
also looks like one of them should be
"preferred rank", but anyways, we can
-
switch to image grid, and then we can see
all these colorful flags. One, two, three,
-
four, five, six, yeah. That's six. And
this is more than six, so I guess, I would
-
say that should actually be two separate
items, for this old flag and– no, this old
-
flag and the new flag, but… This is six…
is that only six colors? I'll believe it.
-
This is six colors, six, and then we have
five colors, yeah. So here are the, let's
-
just add a comment there, and I will tweet
this out later as well, "colorful state
-
flags". Yeah. And also we can use
the image grid as the default view.
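
A sketch of the "colorful state flags" query as it stood at the end. The helper name is hypothetical. `P163` (flag), `P462` (color), `P18` (image), `P463` (member of) and `Q1065` (United Nations) are standard identifiers. The single-triple UN membership test is a simplification of the pre-prepared UN member states query mentioned above, which is not shown in the talk.

```python
def colorful_flags_query(limit: int = 20) -> str:
    """Return SPARQL counting the distinct colors of each state's flag."""
    return (
        "#defaultView:ImageGrid\n"
        # DISTINCT keeps a color from being counted once per matching row.
        "SELECT ?state ?stateLabel ?image"
        " (COUNT(DISTINCT ?color) AS ?colors) WHERE {\n"
        "  ?state wdt:P463 wd:Q1065 ;\n"
        "         wdt:P163 ?flag .\n"
        "  ?flag wdt:P462 ?color ;\n"
        "        wdt:P18 ?image .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}\n"
        # Every selected non-aggregate variable has to be grouped.
        "GROUP BY ?state ?stateLabel ?image\n"
        "ORDER BY DESC(?colors)\n"
        f"LIMIT {limit}"
    )


if __name__ == "__main__":
    print(colorful_flags_query())
```

Grouping by `?image` is the shortcut taken in the talk. A flag with two images then yields two rows, which `SAMPLE(?image)` would avoid.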
-
We probably have time for one more question,
if it's a short one. Though I won't be
-
able to type very fast. Yes, let's
hope this works. Otherwise I can repeat it
-
for the stream if I hear you.
AM: So does it work? Yep seems so. I don't
-
know if it's possible, but the smallest
images that are on Wikipedia? So, by image
-
size?
L: That would not be possible with the
-
Query Service I think. But I think on
Commons you can search… can you search?
-
Whoops, I don't have that search shortcut
set up here. Can you search by image size?
-
I think that might be possible. Advanced
search, file type, sorting order… No.
-
You could probably sort by file size in an
SQL query. Which is not a thing from the
-
Wikidata Query Service, but it's possible
with something else and as it happens I am
-
going to have another talk later, where I
talk about, among other things, how you
-
can write SQL queries against the
Wikipedia databases, and then we might be
-
able to find a solution for that, and
that's I think at 6 p.m. today over in the
-
Esszimmer, or you can come over to me after
the talk and then I can try to figure it out there.
-
H: A last emergency idea that we have to
try out?
-
H: I'm muted. Do you have ano– one more
idea? A small idea maybe we could do but
-
other than that I think we are, so– filled
the time quite well.
-
L: Yeah I think we're done. But if you
have any other ideas, you can always
-
contact me on Twitter @wikidatafacts, or
on Mastodon as well, and then I will see
-
what I can do for you. Yeah. Thanks.
H: Thank you very much, Lucas, that was a
-
great introduction to Wikidata querying!
-
Music
-
Subtitles created by c3subtitles.de
in the year 2021. Join, and help us!