0:00:05.882,0:00:07.218
(Dan) Hello everyone.
0:00:07.218,0:00:09.911
So this session is about teaching SPARQL.
0:00:09.911,0:00:12.423
The presenter is Martin Poulter,[br]so I leave you the stage.
0:00:12.423,0:00:13.668
Have fun.
0:00:13.668,0:00:14.943
(Martin) Thank you very much.
0:00:16.501,0:00:18.717
Hi, everybody.
0:00:18.717,0:00:23.355
I trust you'll agree[br]that Wikidata is great,
0:00:23.355,0:00:27.171
it has lots of interesting data[br]on different topics,
0:00:27.171,0:00:31.225
the tools people make with it[br]are fun to use and fun to explore,
0:00:31.225,0:00:33.412
and easy to use.
0:00:33.412,0:00:38.578
And maybe you'll agree with the suggestion[br]that to get the best out of Wikidata
0:00:38.578,0:00:40.142
you need to know SPARQL,
0:00:40.142,0:00:42.040
you need to be able to phrase[br]your own queries.
0:00:42.040,0:00:45.141
So you might see that[br]as a barrier, an obstacle,
0:00:45.141,0:00:50.183
that we ideally need a big program[br]of training for developers,
0:00:50.183,0:00:54.008
for librarians, for curators,[br]for ordinary people
0:00:54.008,0:00:58.236
to get them literate in this language,[br]and that's a big effort,
0:01:01.036,0:01:04.031
an aspect of Wikidata outreach.
0:01:04.031,0:01:06.238
My suggestion is to kind of[br]turn that around,
0:01:06.238,0:01:09.037
that Wikidata,[br]especially the Query Service,
0:01:09.037,0:01:11.673
because it's so helpful,[br]because it's so full of good stuff,
0:01:11.673,0:01:13.857
because it's so colorful,
0:01:13.857,0:01:16.200
because it has so many[br]visualization abilities,
0:01:16.200,0:01:20.173
is the ideal platform[br]for people to learn SPARQL,
0:01:20.173,0:01:21.890
also to learn about databases,
0:01:21.890,0:01:23.724
learn about knowledge representation,
0:01:23.724,0:01:25.305
learn about data and computers.
0:01:25.305,0:01:28.671
There's no necessity[br]that someone's first encounter
0:01:28.671,0:01:32.106
with data and computers[br]has to be a relational database system.
0:01:32.106,0:01:33.947
So I'm going to put forward,
0:01:33.947,0:01:36.539
I'm going to report on[br]a training workshop
0:01:36.539,0:01:40.330
I've delivered to library staff[br]at the University of Oxford,
0:01:40.330,0:01:42.550
and I've also done as a public event,
0:01:42.550,0:01:46.710
so just with members of the public[br]coming to an open data week
0:01:46.710,0:01:47.875
that the university hosted.
0:01:47.875,0:01:51.979
And also done some of this[br]with researchers as well.
0:01:51.979,0:01:57.441
So I teach in a way[br]that is very particular to me,
0:01:57.441,0:01:59.847
so it's not like[br]I hand over materials to you.
0:01:59.847,0:02:03.164
I'll show you my approach[br]and then you'll take it up
0:02:03.164,0:02:05.902
and improve on it,[br]and make it personal to you
0:02:05.902,0:02:08.469
and the audiences you're dealing with.
0:02:08.469,0:02:10.253
And I want to avoid this.
0:02:10.253,0:02:16.256
So in my career, I had to learn[br]data technologies, and SQL, and XML,
0:02:16.256,0:02:19.610
and the content of tutorials,
0:02:19.610,0:02:23.400
or examples, is very much like this.
0:02:23.400,0:02:26.330
I'm not objecting to the language--[br]because that's what you got to learn--
0:02:26.330,0:02:28.969
but employees, invoices.
0:02:28.969,0:02:32.708
So your task might be[br]you have a sales force
0:02:32.708,0:02:36.913
and you've got to identify[br]the person who sold the most items,
0:02:36.913,0:02:38.369
and calculate their bonus
0:02:38.369,0:02:41.541
and then issue the invoices[br]to the customers,
0:02:41.541,0:02:44.707
and it's the most boring--[br]I can't get excited about that,
0:02:44.707,0:02:48.195
or I don't feel like I'm learning a topic.
0:02:48.195,0:02:51.662
With Wikidata, we have so many topics[br]we can engage people in,
0:02:51.665,0:02:54.613
and it might be things[br]in the solar system,
0:02:54.613,0:02:56.591
or characters in Shakespeare,
0:02:56.591,0:02:59.765
or things in the solar system[br]named after characters in Shakespeare,
0:02:59.765,0:03:01.897
which is what most of this is.
0:03:03.497,0:03:05.739
So when you have a teaching approach,
0:03:05.739,0:03:08.395
one question is[br]what things do you leave out.
0:03:09.295,0:03:15.271
So in the workshop I run,[br]I don't explain what SPARQL stands for,
0:03:15.271,0:03:18.193
that doesn't help you write SPARQL at all.
0:03:18.193,0:03:20.591
It doesn't help to explain what RDF is.
0:03:20.591,0:03:22.763
Obviously, it's historically[br]really important,
0:03:22.763,0:03:25.713
but telling people there's a format[br]for describing resources
0:03:25.713,0:03:27.630
that's called Resource Description Framework,
0:03:27.630,0:03:30.966
and resource is whatever's described,[br]it's not really a format.
0:03:30.966,0:03:32.226
That doesn't help people,
0:03:32.226,0:03:36.650
that gets people no closer to actually,[br]practically, using this.
0:03:36.650,0:03:40.639
Linked open data, LOD, I may mention.
0:03:40.639,0:03:44.317
So the library and museum professionals[br]that come to my training
0:03:44.317,0:03:46.830
have definitely heard about[br]linked open data,
0:03:46.830,0:03:50.697
and know that it's the future[br]of their discipline,
0:03:50.697,0:03:52.564
and it's going to[br]revolutionize their work.
0:03:52.564,0:03:54.879
But at the moment,[br]they're not using that kind of system.
0:03:54.879,0:03:58.404
So they've not seen a real[br]practical example of that technology.
0:03:58.404,0:04:00.206
So that's what[br]they're going to get from this.
0:04:00.206,0:04:01.895
So I might mention linked open data,
0:04:01.895,0:04:03.971
but I don't get into the definition.
0:04:03.971,0:04:06.404
I basically say, this is a service[br]you can use for free.
0:04:06.404,0:04:08.113
It's been given to you to use for free,
0:04:08.113,0:04:10.675
and that gets the point across.
0:04:10.675,0:04:14.925
Semantic identifiers and namespaces,
0:04:14.925,0:04:16.518
I want to get across implicitly,
0:04:16.518,0:04:18.294
I don't want to teach people[br]these concepts,
0:04:18.294,0:04:21.271
I want them to pick up the concepts[br]even if I don't use the terms.
0:04:21.271,0:04:26.536
Reification, so people already[br]using an RDF database want to know
0:04:26.536,0:04:31.432
does Wikidata have statement IDs,[br]and I try to avoid that.
0:04:31.432,0:04:33.855
I hardly even mention Wikidata.
0:04:33.855,0:04:39.048
So these workshops are advertised[br]as like Introduction to SPARQL,
0:04:39.048,0:04:41.027
or for the public event one, it was
0:04:41.027,0:04:45.097
Asking and Answering Questions[br]with Open Data.
0:04:45.097,0:04:47.826
And then in the blurb, I'd say[br]we're going to be using this platform,
0:04:47.826,0:04:50.268
and I'll introduce it and say,[br]well, this is the best platform
0:04:50.268,0:04:52.815
on which to learn[br]this language, this skill.
0:04:52.815,0:04:55.138
It's the most helpful,[br]it's got the most interesting stuff.
0:04:55.138,0:04:57.265
And then in the course of the workshop,
0:04:57.265,0:04:58.969
maybe we'll get into more about Wikidata,
0:04:58.969,0:05:02.351
why this exists, who put this data here.
0:05:02.351,0:05:04.501
So there's a whole lot of background
0:05:04.501,0:05:08.347
that kind of professional RDF[br]or linked data people will have,
0:05:08.347,0:05:09.942
but you don't need.
0:05:09.942,0:05:13.737
I just want to get people thinking[br]about nodes and arcs,
0:05:13.737,0:05:15.699
and thinking in triples,
0:05:15.699,0:05:19.690
and imagining how a triple representation[br]can be created and queried.
0:05:19.690,0:05:22.897
I want them to phrase questions[br]in their own language,
0:05:22.897,0:05:27.252
and translate into SPARQL,[br]via a kind of a baby talk intermediary.
0:05:27.252,0:05:28.984
But I want them to think in triples
0:05:28.984,0:05:34.740
and get used to asking questions[br]in that way, and just to get to the point
0:05:34.740,0:05:38.887
where they ask interesting questions[br]relevant to their work, or their hobbies,
0:05:38.887,0:05:42.395
or whatever, and they come away[br]with something.
0:05:42.395,0:05:44.107
So it's not the theoretical understanding
0:05:44.107,0:05:46.835
that I'm getting[br]in these quite short sessions.
0:05:46.835,0:05:50.285
And the first thing I present them with[br]is this, they've got to look at this.
0:05:50.285,0:05:53.650
And there's a "what the hell?" reaction
0:05:53.650,0:05:55.496
in the workshop[br]and probably in the room now,
0:05:55.496,0:05:59.361
because, "I thought this was[br]about technology skills!
0:05:59.361,0:06:01.512
Why have we got to look at a cute dog?"
0:06:01.512,0:06:05.289
But this is to introduce my toy world.
0:06:05.289,0:06:10.525
So there are three human beings.[br]Two of them are a married couple.
0:06:10.525,0:06:13.054
One is the child from that couple.
0:06:13.054,0:06:16.678
There are two beings[br]that are pets of this couple,
0:06:16.678,0:06:19.119
and we've got the types of the pets.
0:06:19.119,0:06:20.839
Clearly, this is not official data.
0:06:20.839,0:06:23.922
This knowledge representation,[br]which it is,
0:06:23.922,0:06:26.854
only exists in this slide,[br]it's not a database.
0:06:26.854,0:06:28.780
So I'm getting people thinking[br]of a toy world.
0:06:28.780,0:06:30.512
And there's loads that can be learnt
0:06:30.512,0:06:33.491
with just discussing this,[br]and kind of role-playing about this.
0:06:33.491,0:06:38.121
And you're going to[br]make your own toy world.
0:06:40.721,0:06:43.701
So a point to come from this[br]is this isn't a representation
0:06:43.701,0:06:47.102
of all of my family[br]or of all my parents' pets.
0:06:47.102,0:06:49.311
It's a tiny fragment.
0:06:49.311,0:06:50.787
When we query things,
0:06:50.787,0:06:53.261
we're querying a representation[br]of the world, not the world.
0:06:53.261,0:06:55.150
There's so much that's missed out.
0:06:56.150,0:07:01.104
That's a really important first lesson[br]to get about any database, any querying.
0:07:01.104,0:07:06.281
So everything's expressed[br]in triples, and nodes, and arcs.
0:07:06.281,0:07:08.427
Arcs have a direction.
0:07:08.427,0:07:09.529
How do the names work?
0:07:09.529,0:07:12.507
So one of these nodes is marked Bob.
0:07:12.507,0:07:17.207
Is that the name Bob,[br]does that stand for the name Bob?
0:07:17.207,0:07:20.624
Well, not quite, because other people[br]use the name Bob.
0:07:20.624,0:07:22.535
And Dan, you probably know a Bob.
0:07:22.535,0:07:23.649
(Dan) Like Bob [inaudible].
0:07:23.649,0:07:25.247
Yeah, you know a Bob.
0:07:25.247,0:07:28.617
And that's the Bob I think--[br]no, that isn't this Bob.
0:07:28.617,0:07:29.642
So we talk about that.
0:07:29.642,0:07:32.359
So names are relative[br]to the system that they're in,
0:07:32.359,0:07:36.327
and we could talk about Martin's Bob[br]and Dan's Bob not being the same person.
0:07:36.327,0:07:37.696
So it's not the names.
0:07:37.696,0:07:39.878
So we could think of them[br]as relative to a system.
0:07:39.878,0:07:43.828
So we can even say Martin:Bob[br]is the name for one thing,
0:07:43.828,0:07:47.775
and Dan:Bob identifies another thing[br]in another system.
0:07:49.375,0:07:52.121
And I emphasize triples, so three things.
0:07:52.121,0:07:57.754
You might be tempted to say,[br]"Cindy and Bob, together, have a pet dog,"
0:07:58.511,0:08:03.995
but you can't do that in this system[br]unless you have a node for the couple.
0:08:03.995,0:08:07.350
Things have to have a direction.[br]That may not make much sense.
0:08:07.350,0:08:09.673
There's a married couple--[br]that doesn't have a direction,
0:08:09.673,0:08:11.196
that's a relation between two people,
0:08:11.196,0:08:14.014
but we are modeling it[br]with things that have a direction
0:08:14.014,0:08:17.464
so we have to have the two directions.
0:08:17.464,0:08:18.962
There are arbitrary choices.
0:08:18.962,0:08:24.206
So why have "Cindy has child, Martin,"[br]and not "Martin has parent, Cindy"?
0:08:24.206,0:08:25.598
It's an arbitrary choice.
0:08:25.598,0:08:28.605
Arbitrary choices like that--[br]choices of name, choices of direction--
0:08:28.605,0:08:31.140
are built into this system and intrinsic.
0:08:31.140,0:08:32.871
So there are arbitrary choices to be made,
0:08:32.871,0:08:34.656
how to represent this,
0:08:34.656,0:08:37.794
even the same facts[br]could be represented in different ways.
0:08:37.794,0:08:39.233
Who makes that decision?
0:08:39.233,0:08:40.731
Well, whoever creates the system,
0:08:40.731,0:08:45.069
whoever sets up[br]the knowledge base system.
0:08:45.069,0:08:49.330
So people can see that this--[br]called serializable--
0:08:49.330,0:08:52.459
this could be expressed[br]as triple statements.
0:08:52.459,0:08:58.468
So, "Cindy has pet, Tilly,"[br]"Martin is a human,"
0:08:58.468,0:09:02.393
and getting to the core insight
0:09:02.393,0:09:06.970
is comparing how do we make[br]a question in English?
0:09:06.970,0:09:10.953
Well, we have a statement[br]and it's incomplete,
0:09:10.953,0:09:16.762
like, "Who has pet, Tilly?"
0:09:16.762,0:09:21.585
So we go from "Cindy has pet Tilly,"[br]to "Who has pet Tilly?"
0:09:21.585,0:09:23.316
We've taken something out,
0:09:23.316,0:09:27.522
we've put in a placeholder,[br]and we've introduced a question mark.
0:09:27.522,0:09:30.080
I say that's just like[br]what we do with SPARQL.
0:09:30.080,0:09:33.053
We take something out,[br]we have an incomplete statement,
0:09:33.053,0:09:35.930
or incomplete statements,
0:09:35.930,0:09:40.213
we put a placeholder in the missing place,[br]and we have a question mark
0:09:40.213,0:09:42.645
to mark that that's a placeholder.
0:09:42.645,0:09:47.164
So it can be a role play[br]where I'm the query service
0:09:47.164,0:09:49.383
for this knowledge base.
0:09:49.383,0:09:53.906
And so people can learn[br]what a query service does
0:09:53.906,0:09:56.969
by seeing a query service and role-playing
0:09:56.969,0:09:59.709
and being a query service,[br]which we'll get to.
0:10:00.909,0:10:05.414
So people can see that[br]working on the level of triples.
0:10:07.214,0:10:09.371
"Who has pet, Tilly?"
0:10:09.371,0:10:14.480
If you say that to me, and I can say,[br]"results Cindy, Bob."
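A minimal Python sketch of that role play, assuming the toy world holds the triples listed below. They are reconstructed from what is said in the talk, since the slide itself is not reproduced here, and the `match` helper is a hypothetical name used only for illustration:

```python
# Toy world as (subject, property, object) triples.
# Assumption: reconstructed from the talk, not copied from the slide.
TRIPLES = [
    ("Cindy", "spouse", "Bob"),
    ("Bob", "spouse", "Cindy"),
    ("Cindy", "has child", "Martin"),
    ("Cindy", "has pet", "Tilly"),
    ("Bob", "has pet", "Tilly"),
    ("Cindy", "is a", "human"),
    ("Bob", "is a", "human"),
    ("Martin", "is a", "human"),
    ("Tilly", "is a", "dog"),
]


def match(pattern, triples):
    """Play the query service for one incomplete statement.

    Terms starting with '?' are placeholders; every other term must
    equal the value in the same position of a stored triple.
    Returns one dict of placeholder values per matching triple.
    """
    results = []
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A placeholder used twice must stand for the same value.
                if binding.get(term, value) != value:
                    break
                binding[term] = value
            elif term != value:
                break
        else:
            results.append(binding)
    return results


# "Who has pet, Tilly?" becomes a triple with a placeholder in first position.
owners = [b["?who"] for b in match(("?who", "has pet", "Tilly"), TRIPLES)]
print(owners)
```

The design mirrors the spoken version: nothing is computed, the service only looks for stored statements that fit the incomplete one.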
0:10:14.480,0:10:17.774
Then I put it to the trainees,
0:10:17.774,0:10:19.534
how do you ask more complicated questions?
0:10:19.534,0:10:22.436
So, "Who has a dog as a pet?"
0:10:23.646,0:10:28.701
And some will get it straightaway,[br]some will say, "Oh, it's a triple--
0:10:28.701,0:10:33.075
Who? has pet dog?"
0:10:33.075,0:10:38.103
So my role as the query service[br]is to look at this and match your triple,
0:10:38.103,0:10:39.385
"Who? has pet dog,"
0:10:39.385,0:10:41.522
so I got to find things that have pet dog,
0:10:41.522,0:10:43.024
and results None.
0:10:43.024,0:10:48.082
So this is the discussion--[br]what is this node I've called dog?
0:10:48.082,0:10:49.231
It's not a dog.
0:10:49.231,0:10:53.250
Although it's called dog,[br]it's not a dog, it stands for a class.
0:10:53.250,0:10:56.130
Obvious when you're a SPARQL user,[br]but this is getting people
0:10:56.130,0:10:59.054
over the threshold[br]of thinking in this way.
0:10:59.054,0:11:02.319
And you got to do[br]what kinds of things have pets.
0:11:02.319,0:11:05.258
People see that they can't do that[br]in one triple,
0:11:05.258,0:11:06.572
you got to do multiple triples,
0:11:06.572,0:11:10.126
and those multiple triples[br]ask for multiple things.
0:11:12.726,0:11:16.588
So if you've got,[br]"What kinds of things have pets?"
0:11:16.588,0:11:18.861
then you're going to identify people,
0:11:18.861,0:11:21.070
and then you've got to[br]identify those types,
0:11:21.070,0:11:24.362
and it naturally comes up,[br]"How do I specify the columns I want?
0:11:24.362,0:11:27.365
How do I specify that I want the types?"[br]That's the question.
0:11:27.365,0:11:29.838
And then you say,[br]"You have these partial statements,
0:11:29.838,0:11:34.643
and you enclose them[br]in curly brackets and put Select."
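As a rough sketch of why the dog question needs more than one triple, the Python below joins several incomplete statements on their shared placeholders and then picks out the requested columns. The triples are an assumption reconstructed from the talk, and `select` is a hypothetical helper, not part of any SPARQL library:

```python
# Toy world as (subject, property, object) triples.
# Assumption: reconstructed from the talk, not copied from the slide.
TRIPLES = [
    ("Cindy", "has child", "Martin"),
    ("Cindy", "has pet", "Tilly"),
    ("Bob", "has pet", "Tilly"),
    ("Cindy", "is a", "human"),
    ("Bob", "is a", "human"),
    ("Martin", "is a", "human"),
    ("Tilly", "is a", "dog"),
]


def select(variables, patterns, triples):
    """Baby-talk version of: SELECT variables WHERE { patterns }.

    Each pattern is matched in turn, and a placeholder keeps the value
    it was given by an earlier pattern, which is what joins them.
    """
    solutions = [{}]
    for pattern in patterns:
        extended = []
        for solution in solutions:
            for triple in triples:
                candidate = dict(solution)
                for term, value in zip(pattern, triple):
                    if term.startswith("?"):
                        # Bind the placeholder, or check its earlier value.
                        if candidate.setdefault(term, value) != value:
                            break
                    elif term != value:
                        break
                else:
                    extended.append(candidate)
        solutions = extended
    return [tuple(s[v] for v in variables) for s in solutions]


# One triple is not enough: no stored statement says "has pet dog".
one_triple = select(["?who"], [("?who", "has pet", "dog")], TRIPLES)

# Two triples: someone has a pet, and that pet is a dog.
dog_owners = select(
    ["?who"],
    [("?who", "has pet", "?pet"), ("?pet", "is a", "dog")],
    TRIPLES,
)

# "What kinds of things have pets?" asks for the type column instead.
owner_kinds = select(
    ["?kind"],
    [("?owner", "has pet", "?pet"), ("?owner", "is a", "?kind")],
    TRIPLES,
)
print(one_triple, dog_owners, owner_kinds)
```

The last query returns "human" once per owner, which is a natural opening to talk about duplicate rows in results.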
0:11:37.943,0:11:41.137
So this is kind of the first half hour[br]of the workshop,
0:11:41.137,0:11:44.162
and it's not on computers,[br]it's all with role play
0:11:44.162,0:11:45.743
and thinking about this.
0:11:45.743,0:11:51.776
And I invite people in the workshop[br]to make their own toy world,
0:11:51.776,0:11:54.506
and you'll be doing a toy world,[br]I hope, after this.
0:11:54.506,0:11:59.702
So five minutes, eight to ten nodes[br]to represent your family, your workplace,
0:11:59.702,0:12:02.351
the thing you're working on,[br]the TV you were watching last night,
0:12:02.351,0:12:05.166
and to have some[br]meaningful links between them.
0:12:05.166,0:12:08.688
And the lesson that--[br]you make arbitrary decisions,
0:12:08.688,0:12:10.516
you name things, you create properties,
0:12:10.516,0:12:17.228
but they're the creation of the person[br]who sets up the knowledge system.
0:12:17.558,0:12:24.394
And then, in pairs, they explain[br]their graphs to each other, and query.
0:12:24.394,0:12:28.166
So, "What's a query you could ask[br]about this little world,
0:12:28.166,0:12:29.570
and then what would be the answer?"
0:12:29.570,0:12:33.730
So, like I say, people mostly get it,
0:12:33.730,0:12:36.451
but people want a four-[br]or five-part relation,
0:12:36.451,0:12:38.088
so they might want to say,
0:12:38.088,0:12:39.958
"This couple, together, have a pet."
0:12:39.958,0:12:43.204
Or they might want to say,[br]"Tilly is a pet, is a dog."
0:12:43.204,0:12:47.207
And you can enforce nodes, triples,[br]and triples have a direction.
0:12:48.307,0:12:51.258
So I'll explain what a triple is[br]and say also, not in this example,
0:12:51.258,0:12:54.639
but, "Triples, generally,[br]they have an item, they have a property,
0:12:54.639,0:12:57.307
and then they have[br]a number of other things
0:12:57.307,0:12:59.516
which could be values,[br]could be time periods,
0:12:59.516,0:13:03.104
could be locations on a globe."
0:13:07.288,0:13:11.235
So with that role-play exercise,[br]we're 40 minutes into a 2-hour workshop,
0:13:11.235,0:13:14.270
and in a computer room,[br]and we haven't touched computers yet.
0:13:14.270,0:13:17.387
But I think it's useful[br]to get people thinking in that way,
0:13:17.387,0:13:19.535
and to think about[br]how they would make the model
0:13:19.535,0:13:23.793
and what the query is,[br]and to actually translate,
0:13:23.793,0:13:25.149
so your translation exercise.
0:13:26.339,0:13:32.597
And then I'd direct people to[br]query.wikidata.org.
0:13:34.197,0:13:36.240
So there's a bunch of things[br]they've got to take on.
0:13:36.240,0:13:40.086
We've been doing--[br]I will have a flip chart, and we will--
0:13:40.086,0:13:41.539
Is that six?
0:13:41.539,0:13:43.290
Six minutes elapsed?
0:13:43.290,0:13:45.278
(man) [inaudible]
0:13:45.278,0:13:46.318
Right.
0:13:50.548,0:13:52.485
So I'll give them a task.
0:13:52.485,0:13:55.679
I don't want them to learn[br]Q numbers and P numbers.
0:13:55.679,0:14:00.646
So I'll tell them what the names are[br]and show them the Ctrl+Space trick.
0:14:00.646,0:14:01.894
But there's a lot to take on,
0:14:01.894,0:14:04.210
so they're taking on[br]Q numbers and P numbers,
0:14:04.210,0:14:08.240
they've seen the triple format,[br]and they've seen Select,
0:14:08.240,0:14:11.338
but they've got to apply this[br]all in one go.
0:14:11.338,0:14:14.538
So I'll give people a task.
0:14:14.538,0:14:17.299
Some will get it immediately,[br]some will struggle
0:14:17.299,0:14:18.896
because they missed a bit of discussion,
0:14:18.896,0:14:22.866
or more often, because they're familiar[br]with another kind of database system,
0:14:22.866,0:14:25.490
and they have[br]particular expectations from that.
0:14:26.890,0:14:30.656
So I set bonus things[br]or more complicated things
0:14:30.656,0:14:31.874
if people are getting bored.
0:14:31.874,0:14:37.828
Or I say, "If you get bored and you work[br]on an entirely different question,
0:14:37.828,0:14:40.058
that's fine, but show me."
0:14:40.058,0:14:42.254
So I'll run through this in front of them,
0:14:42.254,0:14:45.617
tell them to do it, just show the hints[br]of what properties they'll be using,
0:14:45.617,0:14:46.979
and then run through it again.
0:14:46.979,0:14:50.277
And then, go through the cycle[br]of adding on extra things
0:14:50.277,0:14:51.280
to enhance the query.
0:14:51.280,0:14:53.084
So we might have done a query[br]and I'll say,
0:14:53.084,0:14:55.522
"Here's how you add on[br]an optional property."
0:14:57.822,0:15:01.046
And then give them a task[br]involving optional property.
0:15:01.046,0:15:04.518
In the Bodleian, I say,[br]"Find manuscripts in Latin."
0:15:04.518,0:15:06.326
For a public event[br]at University of Bristol,
0:15:06.326,0:15:09.255
where there's lots of celebrities[br]who studied at the University of Bristol,
0:15:09.255,0:15:14.113
so get that as an example.
0:15:14.113,0:15:15.933
So going to the interface,
0:15:15.933,0:15:20.949
there's still a hump in the learning curve
0:15:20.949,0:15:24.199
because they've got[br]to put the query into action,
0:15:24.199,0:15:25.752
they've got to think in this language,
0:15:25.752,0:15:29.879
and they've got to look up[br]Q numbers and P numbers,
0:15:29.879,0:15:32.246
and then there's all the things[br]they can do with the query,
0:15:32.246,0:15:33.283
once they've done it.
0:15:33.283,0:15:37.627
And the visualization options,[br]the bookmarking, getting the data.
0:15:43.881,0:15:45.635
So I'll suggest refinements.
0:15:45.635,0:15:50.264
So we can take a succession of steps[br]of getting people doing a query,
0:15:50.264,0:15:53.215
and taking it up to the next level.
0:15:53.215,0:15:56.069
Like, "Find landscape paintings[br]taller than they are wide."
0:15:56.069,0:16:02.658
So within the two-hour thing,[br]we get people doing basic queries,
0:16:02.658,0:16:07.803
adding refinements onto them,
0:16:07.803,0:16:11.164
not doing much filtering,
0:16:11.164,0:16:13.893
but starting to introduce measurements,
0:16:13.893,0:16:14.982
and so on.
0:16:14.982,0:16:17.782
Not getting into qualifiers[br]or another level.
0:16:17.782,0:16:20.816
If it's a whole day thing,[br]you probably could.
0:16:20.816,0:16:25.526
It comes up, inevitably, "Where else[br]can I use the SPARQL language?"
0:16:25.526,0:16:29.581
And I observe that that is a question,[br]and questions can be framed in SPARQL,
0:16:29.581,0:16:31.671
and put to Wikidata,[br]and you'll get answers,
0:16:31.671,0:16:34.444
and there is a Wikidata property[br]called SPARQL endpoint.
0:16:34.444,0:16:36.888
So when they ask that,[br]that becomes their task.
0:16:36.888,0:16:38.809
And then they get[br]that list of institutions
0:16:38.809,0:16:40.369
that have SPARQL endpoints.
0:16:42.499,0:16:43.877
And it's worth pointing out,
0:16:43.877,0:16:48.647
so in an introductory session[br]on other computer languages,
0:16:48.647,0:16:52.065
people will typically[br]learn how to do loops,
0:16:52.065,0:16:55.477
how to do functions,[br]how to do conditionals.
0:16:55.477,0:16:56.803
They'll learn the basic grammar
0:16:56.803,0:16:59.735
but they won't make something[br]fantastic and useful,
0:16:59.735,0:17:01.663
they'll just learn the basic grammar.
0:17:01.663,0:17:06.458
But in an introductory session[br]on Wikidata SPARQL you can make--
0:17:06.458,0:17:08.142
if you're interested[br]in German literature--
0:17:08.142,0:17:10.333
a map of the birthplace[br]of German poets, and so on.
0:17:10.333,0:17:12.097
And so we get feedback like this.
0:17:12.097,0:17:14.196
This is how great[br]the Wikidata Query Service is
0:17:14.196,0:17:16.266
as an educational tool.
0:17:16.266,0:17:19.298
"What is this sorcery?"[br]isn't even from someone in the room.
0:17:19.298,0:17:21.226
A trainee in the room made a map,
0:17:21.226,0:17:24.702
emailed it to her colleagues[br]and got back, "What is this sorcery!?
0:17:24.702,0:17:25.703
How have you made this?"
0:17:25.703,0:17:29.428
And was just not expecting this to happen.
0:17:29.428,0:17:32.271
People are not expecting to look at[br]the picture of the cute dog,
0:17:32.271,0:17:36.243
they're not expecting to do the role play[br]where they represent their family
0:17:36.243,0:17:37.865
and query each other.
0:17:37.865,0:17:40.210
They're not expecting[br]to actually make something concrete
0:17:40.210,0:17:42.587
which they take away as a link[br]and show to their colleagues.
0:17:42.587,0:17:45.010
And all of this, being unexpected,
0:17:45.010,0:17:47.092
makes it memorable[br]and makes them want to go away
0:17:47.092,0:17:48.527
and talk to other people about it.
0:17:48.527,0:17:51.399
It's not like your run-of-the-mill[br]IT training.
0:17:52.699,0:17:58.020
The lower quote is from a researcher[br]who saw how he could make a map
0:17:58.020,0:18:00.761
of famous people with his first name
0:18:00.761,0:18:04.421
and another one of famous people[br]with his wife's first name.
0:18:04.421,0:18:07.819
And then he just had more and more ideas[br]of things and charts, and so on,
0:18:07.819,0:18:09.469
he's going to create with Wikidata,
0:18:09.469,0:18:10.967
and so he's glad to say,
0:18:10.967,0:18:13.297
"You've destroyed my productivity[br]for the next month."
0:18:15.805,0:18:17.601
So that's my recommendation.
0:18:17.601,0:18:19.702
I think we can take it as a positive,
0:18:19.702,0:18:22.985
and we take it beyond[br]training people about Wikidata,
0:18:22.985,0:18:24.671
training people about data.
0:18:24.671,0:18:26.716
The stuff that came up[br]in the keynote this morning,
0:18:26.716,0:18:32.468
making people literate[br]about ideas of representation
0:18:32.468,0:18:36.568
and starting people off[br]and being involved in that discussion,
0:18:36.568,0:18:37.722
involves this [inaudible].
0:18:37.722,0:18:38.816
So this could be done--
0:18:38.816,0:18:40.822
doesn't have to be like[br]a workplace training thing,
0:18:40.822,0:18:42.134
it could be a public event,
0:18:42.134,0:18:45.250
to get people familiar[br]with these technologies.
0:18:46.150,0:18:48.302
But I will stop there for discussion.
0:18:48.302,0:18:51.150
And like I say, it's respectfully[br]submitted to people in the room
0:18:51.150,0:18:55.280
who do SPARQL training a different way,[br]but I hope this is useful to you.
0:18:57.180,0:19:00.184
(audience applause)
0:19:12.915,0:19:15.721
(Dan) Okay, are there any questions?
0:19:23.511,0:19:26.605
(man) Hi, it's [Mohammed Hijah][br]from Palestine.
0:19:26.605,0:19:28.420
Thank you for the session.
0:19:28.420,0:19:30.921
I was wondering if there are resources
0:19:30.921,0:19:35.131
that we can get to learn[br]SPARQL language professionally?
0:19:37.899,0:19:40.213
I've got the SPARQL book,[br]the O'Reilly book.
0:19:40.213,0:19:43.413
I find the Wikibook on SPARQL
0:19:43.413,0:19:44.987
is really, really useful.
0:19:44.987,0:19:48.387
That's like the most useful[br]and accessible reference.
0:19:49.287,0:19:54.570
The tutorials on Wikidata itself[br]are going to vary in quality.
0:19:55.170,0:19:57.694
(Mohammed) I think[br]that they are for beginners.
0:19:57.694,0:20:01.240
I can handle with SPARQL[br]but in the beginner level,
0:20:01.240,0:20:04.343
but I want to deal with it professionally.
0:20:10.864,0:20:13.609
So my concern is to get[br]as many people as possible
0:20:13.609,0:20:16.292
across the threshold[br]into being aware of how this works,
0:20:16.292,0:20:17.925
and dabbling.
0:20:19.225,0:20:24.920
I'd like it to be a deeper course[br]by going into more of the...
0:20:26.220,0:20:29.120
how it works--[br]qualifiers and references, and so on.
0:20:29.120,0:20:31.809
Where in a professional context,[br]you're probably aiming towards
0:20:31.809,0:20:35.923
people using a particular SPARQL endpoint,
0:20:35.923,0:20:39.123
and Wikidata has some customizations.
0:20:39.123,0:20:41.636
We've discussed on Twitter[br]that there's some things we use
0:20:41.636,0:20:43.548
that actually aren't a SPARQL standard.
0:20:43.548,0:20:46.130
They're like an optimization.
0:20:46.130,0:20:48.816
So in the professional context,
0:20:50.516,0:20:56.190
I'd hope it would be tailored[br]to that particular data set and endpoint,
0:20:56.190,0:20:59.575
but there's not a demand for that yet,
0:20:59.575,0:21:03.459
because like I said, I deal with people[br]who are aware of linked open data,
0:21:03.459,0:21:07.558
and the word's out that it's a good thing,[br]but haven't seen an example yet,
0:21:07.558,0:21:09.446
haven't an example[br]they can apply to their work,
0:21:09.446,0:21:11.693
they're not enthusiastic about it yet.
0:21:11.693,0:21:13.843
So I think we want to[br]get my whole workplace
0:21:13.843,0:21:17.726
and other workplaces and developers[br]across that threshold
0:21:17.726,0:21:21.998
to where they're demanding[br]that kind of really in-depth,
0:21:21.998,0:21:25.333
like using endpoint in a library[br]kind of training.
0:21:26.082,0:21:27.376
(Mohammed) Thank you.
0:21:31.883,0:21:34.892
(woman) It's just a question.[br]I really liked that, thank you so much.
0:21:34.892,0:21:37.819
Is it documented step-by-step anywhere?
0:21:39.194,0:21:43.043
I can share my succession of tasks.
0:21:43.843,0:21:47.100
That's very much tailored[br]to where I'm presenting it.
0:21:47.100,0:21:50.697
Like I said, with librarians,[br]I start with manuscripts and go on.
0:21:53.697,0:21:56.393
You want to end up[br]with people asking a question
0:21:56.393,0:22:00.764
which is the question they came,[br]in their heads, to the event with.
0:22:04.764,0:22:10.283
So there's an order[br]of querying with a triple,
0:22:10.283,0:22:13.006
and then with multiple triples,[br]and then with an optional triple,
0:22:13.006,0:22:17.147
and then with a measurement[br]in a filter, and so on.
0:22:17.147,0:22:20.618
And, yeah, I can share...
0:22:22.438,0:22:24.338
Yeah, I'll share a separate set of slides
0:22:24.338,0:22:25.421
for those exercises.
0:22:25.421,0:22:27.379
(woman) Thank you so much[br]because I will take that
0:22:27.379,0:22:29.783
and customize it for my own needs.[br]Thank you.
0:22:31.010,0:22:33.095
(Dan) Okay. No questions?
0:22:34.953,0:22:38.994
(man) What would you recommend[br]if you also want to teach editing,
0:22:38.994,0:22:41.595
apart from just querying?
0:22:46.968,0:22:53.476
I'm pleased to report[br]that people find Wikidata editing,
0:22:53.476,0:22:56.632
when I demonstrate it, to be so simple,
0:22:56.632,0:22:58.943
that it just takes them by surprise.
0:22:58.943,0:23:01.568
It's Wikidata editing,[br]and I've got to add knowledge
0:23:01.568,0:23:03.018
to this huge knowledge base.
0:23:03.018,0:23:05.435
Sounds like something[br]that really technical people can do.
0:23:05.435,0:23:08.524
And then you show it,[br]and they go, "Oh, right.
0:23:08.524,0:23:11.096
Martin is instance of human."
0:23:13.296,0:23:18.851
So I haven't done that systematically yet.
0:23:21.498,0:23:26.007
I think a precondition would be[br]getting people thinking in triples,
0:23:26.007,0:23:29.675
and maybe underline that[br]triples need references,
0:23:29.675,0:23:34.237
and triples need qualifiers,[br]and that triples
0:23:34.237,0:23:37.442
can have multiple conflicting values.
0:23:37.442,0:23:39.949
So I'd still do the toy world,
0:23:39.949,0:23:45.149
maybe a more professionally relevant[br]toy world, and translation exercise,
0:23:45.149,0:23:48.222
but then go to, "So now the exercise[br]we're going to do with triples
0:23:48.222,0:23:49.661
is adding them."
0:23:51.561,0:23:54.522
There's a lot of work done,[br]and maybe Jason's done,
0:23:54.522,0:23:58.402
with guessing a table of identifiers.
0:23:58.402,0:23:59.581
So something I'd like to do,
0:23:59.581,0:24:03.710
there's an online database
0:24:03.710,0:24:06.710
of people who've won a Rhodes Scholarship.
0:24:06.710,0:24:10.616
There's a scholarship to Oxford University[br]from other countries.
0:24:10.616,0:24:12.221
But it's not in Wikidata yet.
0:24:12.221,0:24:14.381
So you can kind of divide up[br]the room and say,
0:24:14.381,0:24:16.595
"You're going to find[br]these people in Wikidata
0:24:16.595,0:24:18.874
and your task is to add
0:24:18.874,0:24:21.106
with the reference[br]to this online database."
0:24:21.106,0:24:23.449
And then you can do a query[br]to see how many have been added
0:24:23.449,0:24:25.545
in that session.
0:24:25.545,0:24:28.246
So I think, with all the training I do,
0:24:28.246,0:24:31.582
I think the comprehension[br]is more important
0:24:31.582,0:24:33.554
than the taking action immediately.
0:24:33.554,0:24:35.543
So when I'm training people on Wikipedia,
0:24:35.543,0:24:39.514
I first show them article histories,[br]contribution records, talk page,
0:24:39.514,0:24:44.800
quality scale, so they're comprehending[br]the process before they edit,
0:24:44.800,0:24:47.439
and actually change something.
0:24:49.939,0:24:52.636
(man) Not really a question but a comment.
0:24:52.636,0:24:58.570
There is, for beginners,[br]a good tutorial on YouTube,
0:24:58.570,0:25:01.423
How to Query and Start with SPARQL,
0:25:01.423,0:25:04.421
and if you want to go deeper, also,
0:25:04.421,0:25:08.521
How to Add Data with OpenRefine.
0:25:08.521,0:25:12.621
And I've also made some videos
0:25:12.621,0:25:15.121
and uploaded them in German language.
0:25:15.121,0:25:16.916
(Martin) Oh, great! Thanks.
0:25:17.894,0:25:21.823
I should also mention Hilary Thorsen,[br]who's from Stanford Library,
0:25:21.823,0:25:25.076
did, last week,[br]a really good video capture
0:25:25.076,0:25:28.857
of adding a data set to Wikidata[br]with OpenRefine.
0:25:28.857,0:25:33.529
This is for the LD4P, the Linked Data[br]for Production project,
0:25:33.529,0:25:35.932
and that was a really good video tutorial
0:25:35.932,0:25:38.392
I'd recommend to anybody for--
0:25:38.392,0:25:42.426
That's the next couple of levels up[br]from what I'm doing.
0:25:43.189,0:25:45.029
(Dan) Is there a last question?
0:25:49.486,0:25:52.203
(man) So SPARQL's sort of SQL-ish.
0:25:52.203,0:25:54.856
If someone walked into your tutorial[br]with an SQL background,
0:25:54.856,0:25:57.291
is that a blessing or a curse?
0:25:57.291,0:26:00.164
(Martin) It's a bit of a curse[br]because I had to learn SQL,
0:26:00.164,0:26:03.398
so I did the...
0:26:03.398,0:26:09.498
generate the invoices[br]using SQL for your fictitious company,
0:26:09.498,0:26:14.369
and definitely had to unlearn[br]an SQL way of thinking about things
0:26:14.369,0:26:15.712
to get to SPARQL.
0:26:15.712,0:26:17.638
But it was freeing, it was freeing.
0:26:17.638,0:26:21.302
Databases without built-in schemas[br]are liberating.
0:26:22.102,0:26:24.042
When you think about[br]how many columns there are,
0:26:24.042,0:26:25.727
and it's this number[br]of columns for a book,
0:26:25.727,0:26:27.638
and it's this number of columns[br]for the address,
0:26:27.638,0:26:28.984
and it's just three columns.
0:26:28.984,0:26:31.406
Well, three and a bit more.
0:26:31.406,0:26:34.443
That's really liberating.
0:26:34.443,0:26:36.814
So that's a point I kind of glanced at,
0:26:36.814,0:26:41.810
that people make different progress[br]in these workshops as in all training,
0:26:41.810,0:26:43.869
but it's not like intelligent versus dumb,
0:26:43.869,0:26:46.588
it's like the preconceptions[br]you're coming with,
0:26:46.588,0:26:47.823
are more the obstacle.
0:26:47.823,0:26:50.242
So it's actually more--
0:26:50.242,0:26:55.655
I'm more optimistic about training people[br]who have never encountered databases,
0:26:55.655,0:26:58.805
coding, or any of that before, than...
0:26:58.805,0:27:02.232
The worst people to try and train[br]are linked data experts
0:27:02.232,0:27:04.631
because they've used DBpedia a lot.
0:27:04.631,0:27:07.180
They used a particular approach[br]of querying
0:27:07.180,0:27:08.834
and expecting to get certain things,
0:27:08.834,0:27:12.429
and it looks odd when Wikidata[br]does things differently.
0:27:12.429,0:27:14.540
And they need to get with the program.
0:27:15.205,0:27:17.867
(Dan) Okay, let's thank Martin[br]for his insights.
0:27:17.867,0:27:18.884
Thanks very much.
0:27:18.884,0:27:21.888
(audience applause)