cdn.media.ccc.de/.../wikidatacon2019-1109-eng-Teaching_SPARQL_as_a_Foreign_Language_hd.mp4

  • 0:06 - 0:07
    (Dan) Hello everyone.
  • 0:07 - 0:10
    So this session is about teaching SPARQL.
  • 0:10 - 0:12
    The presenter is Martin Poulter,
    so I leave you the stage.
  • 0:12 - 0:14
    Have fun.
  • 0:14 - 0:15
    (Martin) Thank you very much.
  • 0:17 - 0:19
    Hi, everybody.
  • 0:19 - 0:23
    I trust you'll agree
    that Wikidata is great,
  • 0:23 - 0:27
    it has lots of interesting data
    on different topics,
  • 0:27 - 0:31
    the tools people make with it
    are fun to use and fun to explore,
  • 0:31 - 0:33
    and easy to use.
  • 0:33 - 0:39
    And maybe you'll agree with the suggestion
    that to get the best out of Wikidata
  • 0:39 - 0:40
    you need to know SPARQL,
  • 0:40 - 0:42
    you need to be able to phrase
    your own queries.
  • 0:42 - 0:45
    So you might see that
    as a barrier, an obstacle,
  • 0:45 - 0:50
    that we ideally need a big program
    of training for developers,
  • 0:50 - 0:54
    for librarians, for curators,
    for ordinary people
  • 0:54 - 0:58
    to get them literate in this language,
    and that's a big effort,
  • 1:01 - 1:04
    an aspect of Wikidata outreach.
  • 1:04 - 1:06
    My suggestion is to kind of
    turn that around,
  • 1:06 - 1:09
    that Wikidata,
    especially the Query Service,
  • 1:09 - 1:12
    because it's so helpful,
    because it's so full of good stuff,
  • 1:12 - 1:14
    because it's so colorful,
  • 1:14 - 1:16
    because it has so many
    visualization abilities,
  • 1:16 - 1:20
    is the ideal platform
    for people to learn SPARQL,
  • 1:20 - 1:22
    also to learn about databases,
  • 1:22 - 1:24
    learn about knowledge representation,
  • 1:24 - 1:25
    learn about data and computers.
  • 1:25 - 1:29
    There's no necessity
    that someone's first encounter
  • 1:29 - 1:32
    with data and computers,
    has to be a relational database system.
  • 1:32 - 1:34
    So I'm going to put forward,
  • 1:34 - 1:37
    I'm going to report on
    a training workshop
  • 1:37 - 1:40
    I've delivered to library staff
    at the University of Oxford,
  • 1:40 - 1:43
    and I've also done as a public event,
  • 1:43 - 1:47
    so just with members of the public
    coming to an open data week
  • 1:47 - 1:48
    that the university hosted.
  • 1:48 - 1:52
    And also done some of this
    with researchers as well.
  • 1:52 - 1:57
    So I teach in a way
    that is very particular to me,
  • 1:57 - 2:00
    so it's not like
    I hand over materials to you.
  • 2:00 - 2:03
    I'll show you my approach
    and then you'll take it up
  • 2:03 - 2:06
    and improve on it,
    and make it personal to you
  • 2:06 - 2:08
    and the audiences you're dealing with.
  • 2:08 - 2:10
    And I want to avoid this.
  • 2:10 - 2:16
    So in my career, I had to learn
    data technologies, and SQL, and XML,
  • 2:16 - 2:20
    and the content of tutorials,
  • 2:20 - 2:23
    or examples, is very much like this.
  • 2:23 - 2:26
    I'm not objecting to the language--
    because that's what you got to learn--
  • 2:26 - 2:29
    but employees, invoices.
  • 2:29 - 2:33
    So your task might be
    you have a sales force
  • 2:33 - 2:37
    and you've got to identify
    the person who sold the most items,
  • 2:37 - 2:38
    and calculate their bonus
  • 2:38 - 2:42
    and then issue the invoices
    to the customers,
  • 2:42 - 2:45
    and it's the most boring--
    I can't get excited about that,
  • 2:45 - 2:48
    or I don't feel like I'm learning a topic.
  • 2:48 - 2:52
    With Wikidata, we have so many topics
    we can engage people in,
  • 2:52 - 2:55
    and it might be things
    in the solar system,
  • 2:55 - 2:57
    or characters in Shakespeare,
  • 2:57 - 3:00
    or things in the solar system
    named after characters in Shakespeare,
  • 3:00 - 3:02
    which is what most of this is.
  • 3:03 - 3:06
    So when you have a teaching approach,
  • 3:06 - 3:08
    one question is
    what things do you leave out.
  • 3:09 - 3:15
    So in the workshop I run,
    I don't explain what SPARQL stands for,
  • 3:15 - 3:18
    that doesn't help you write SPARQL at all.
  • 3:18 - 3:21
    It doesn't help to explain what RDF is.
  • 3:21 - 3:23
    Obviously, it's historically
    really important,
  • 3:23 - 3:26
    but telling people there's a format
    for describing resources
  • 3:26 - 3:28
    that's called resource description format,
  • 3:28 - 3:31
    and resource is whatever's described,
    it's not really a format.
  • 3:31 - 3:32
    That doesn't help people,
  • 3:32 - 3:37
    that gets people no closer to actually,
    practically, using this.
  • 3:37 - 3:41
    Linked open data, LOD, I may mention.
  • 3:41 - 3:44
    So the library and museum professionals
    that come to my training
  • 3:44 - 3:47
    have definitely heard about
    linked open data,
  • 3:47 - 3:51
    and know that it's the future
    of their discipline,
  • 3:51 - 3:53
    and it's going to
    revolutionize their work.
  • 3:53 - 3:55
    But at the moment,
    they're not using that kind of system.
  • 3:55 - 3:58
    So they've not seen a real
    practical example of that technology.
  • 3:58 - 4:00
    So that's what
    they're going to get from this.
  • 4:00 - 4:02
    So I might mention linked open data,
  • 4:02 - 4:04
    but I don't get into the definition.
  • 4:04 - 4:06
    I basically say, this is a service
    you can use for free.
  • 4:06 - 4:08
    It's been given to you to use for free,
  • 4:08 - 4:11
    and that gets the point across.
  • 4:11 - 4:15
    Semantic identifiers and namespaces,
  • 4:15 - 4:17
    I want to get across implicitly,
  • 4:17 - 4:18
    I don't want to teach people
    these concepts,
  • 4:18 - 4:21
    I want them to pick up the concepts
    even if I don't use the terms.
  • 4:21 - 4:27
    Reification, so people already
    using an RDF database want to know
  • 4:27 - 4:31
    does Wikidata have statement IDs,
    and I try to avoid that.
  • 4:31 - 4:34
    I hardly even mention Wikidata.
  • 4:34 - 4:39
    So these workshops are advertised
    as like Introduction to SPARQL,
  • 4:39 - 4:41
    or for the public event one, it was
  • 4:41 - 4:45
    Asking and Answering Questions
    with Open Data.
  • 4:45 - 4:48
    And then in the blurb, I'd say
    we're going to be using this platform,
  • 4:48 - 4:50
    and I'll introduce it and say,
    well, this is the best platform
  • 4:50 - 4:53
    on which to learn
    this language, this skill.
  • 4:53 - 4:55
    It's the most helpful,
    it's got the most interesting stuff.
  • 4:55 - 4:57
    And then in the course of the workshop,
  • 4:57 - 4:59
    maybe we'll get into more about Wikidata,
  • 4:59 - 5:02
    why this exists, who put this data here.
  • 5:02 - 5:05
    So there's a whole lot of background
  • 5:05 - 5:08
    that kind of professional RDF
    or linked data people will have,
  • 5:08 - 5:10
    but you don't need.
  • 5:10 - 5:14
    I just want to get people thinking
    about nodes and arcs,
  • 5:14 - 5:16
    and thinking in triples,
  • 5:16 - 5:20
    and imagining how a triple representation
    can be created and queried.
  • 5:20 - 5:23
    I want them to phrase questions
    in their own language,
  • 5:23 - 5:27
    and translate into SPARQL,
    via a kind of a baby talk intermediary.
  • 5:27 - 5:29
    But I want them to think in triples
  • 5:29 - 5:35
    and get used to asking questions
    in that way, and just to get to the point
  • 5:35 - 5:39
    where they ask interesting questions
    relevant to their work, or their hobbies,
  • 5:39 - 5:42
    or whatever, and they come away
    with something.
  • 5:42 - 5:44
    So it's not the theoretical understanding
  • 5:44 - 5:47
    that I'm getting
    in these quite short sessions.
  • 5:47 - 5:50
    And the first thing I present them with
    is this, they've got to look at this.
  • 5:50 - 5:54
    And there's a "what the hell?" reaction
  • 5:54 - 5:55
    in the workshop
    and probably in the room now,
  • 5:55 - 5:59
    because, "I thought this was
    about technology skills!
  • 5:59 - 6:02
    Why have we got to look at a cute dog?"
  • 6:02 - 6:05
    But this is to introduce my toy world.
  • 6:05 - 6:11
    So there are three human beings.
    Two of them are a married couple.
  • 6:11 - 6:13
    One is the child from that couple.
  • 6:13 - 6:17
    There are two beings
    that are pets of this couple,
  • 6:17 - 6:19
    and we've got the types of the pets.
  • 6:19 - 6:21
    Clearly, this is not official data.
  • 6:21 - 6:24
    This knowledge representation,
    which it is,
  • 6:24 - 6:27
    only exists in this slide,
    it's not a database.
  • 6:27 - 6:29
    So I'm getting people thinking
    of a toy world.
  • 6:29 - 6:31
    And there's loads that can be learnt
  • 6:31 - 6:33
    with just discussing this,
    and kind of role-playing about this.
  • 6:33 - 6:38
    And you're going to
    make your own toy world.
  • 6:41 - 6:44
    So a point to come from this
    is this isn't a representation
  • 6:44 - 6:47
    of all of my family
    or of all my parents' pets.
  • 6:47 - 6:49
    It's a tiny fragment.
  • 6:49 - 6:51
    When we query things,
  • 6:51 - 6:53
    we're querying a representation
    of the world, not the world.
  • 6:53 - 6:55
    There's so much that's missed out.
  • 6:56 - 7:01
    That's a really important first lesson
    to get about any database, any querying.
  • 7:01 - 7:06
    So everything's expressed
    in triples, and nodes, and arcs.
  • 7:06 - 7:08
    Arcs have a direction.
  • 7:08 - 7:10
    How do the names work?
  • 7:10 - 7:13
    So one of these nodes is marked Bob.
  • 7:13 - 7:17
    Is that the name Bob,
    does that stand for the name Bob?
  • 7:17 - 7:21
    Well, not quite, because other people
    use the name Bob.
  • 7:21 - 7:23
    And Dan, you probably know a Bob.
  • 7:23 - 7:24
    (Dan) Like Bob [inaudible].
  • 7:24 - 7:25
    Yeah, you know a Bob.
  • 7:25 - 7:29
    And that's the Bob I think--
    no, that isn't this Bob.
  • 7:29 - 7:30
    So we talk about that.
  • 7:30 - 7:32
    So names are relative
    to the system that they're in,
  • 7:32 - 7:36
    and we could talk about Martin's Bob
    and Dan's Bob not being the same person.
  • 7:36 - 7:38
    So it's not the names.
  • 7:38 - 7:40
    So we could think of them
    as relative to a system.
  • 7:40 - 7:44
    So we can even say Martin:Bob
    is the name for one thing,
  • 7:44 - 7:48
    and Dan:Bob identifies another thing
    in another system.
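
The idea that a name only identifies something relative to a system can be sketched in a few lines of Python. This is an illustration only: the `Name` type and its fields are hypothetical, chosen to mirror the Martin:Bob and Dan:Bob notation used in the talk.

```python
from typing import NamedTuple


class Name(NamedTuple):
    """A label qualified by the system it belongs to (hypothetical type)."""
    system: str  # whose knowledge base the name lives in
    local: str   # the label used inside that system

    def __str__(self):
        return f"{self.system}:{self.local}"


martins_bob = Name("Martin", "Bob")
dans_bob = Name("Dan", "Bob")

print(martins_bob, dans_bob)
# Same label, but not the same identifier once the system is included.
print(martins_bob.local == dans_bob.local)
print(martins_bob == dans_bob)
```

The label alone is ambiguous; the pair of system and label is what picks out one thing.
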
  • 7:49 - 7:52
    And I emphasize triples, so three things.
  • 7:52 - 7:58
    You might be tempted to say,
    "Cindy and Bob, together, have a pet dog,"
  • 7:59 - 8:04
    but you can't do that in this system
    unless you have a node for the couple.
  • 8:04 - 8:07
    Things have to have a direction.
    That may not make much sense.
  • 8:07 - 8:10
    There's a married couple--
    that doesn't have a direction,
  • 8:10 - 8:11
    that's a relation between two people,
  • 8:11 - 8:14
    but we are modeling it
    with things that have a direction
  • 8:14 - 8:17
    so we have to have the two directions.
  • 8:17 - 8:19
    There are arbitrary choices.
  • 8:19 - 8:24
    So why have "Cindy has child, Martin,"
    and not "Martin has parent, Cindy"?
  • 8:24 - 8:26
    It's an arbitrary choice.
  • 8:26 - 8:29
    Arbitrary choices like that--
    choices of name, choices of direction--
  • 8:29 - 8:31
    are built into this system and intrinsic.
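
As a rough sketch of the two points just made, assume the toy world is held as a Python set of (subject, property, value) tuples. The property names `has_spouse` and `has_child` and both helper functions are assumptions made for illustration, not something shown in the talk.

```python
# Each triple is (subject, property, value); the arc points from subject to value.
triples = set()


def add_symmetric(a, prop, b):
    """Model an undirected relation as two directed triples."""
    triples.add((a, prop, b))
    triples.add((b, prop, a))


# Marriage has no natural direction, so both directions are stored.
add_symmetric("Cindy", "has_spouse", "Bob")

# Parenthood is stored in one direction only. Using "has_child" rather than
# "has_parent" is an arbitrary choice made by whoever builds the system.
triples.add(("Cindy", "has_child", "Martin"))


def parents_of(child):
    """Answer a 'has parent' question by following has_child arcs backwards."""
    return sorted(s for (s, p, v) in triples if p == "has_child" and v == child)


print(parents_of("Martin"))
```

The same fact could have been stored the other way round; the query just has to follow whichever direction was chosen.
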
  • 8:31 - 8:33
    So there are arbitrary choices to be made,
  • 8:33 - 8:35
    how to represent this,
  • 8:35 - 8:38
    even the same facts
    could be represented in different ways.
  • 8:38 - 8:39
    Who makes that decision?
  • 8:39 - 8:41
    Well, whoever creates the system,
  • 8:41 - 8:45
    whoever sets up
    the knowledge-based system.
  • 8:45 - 8:49
    So people can see that this--
    called serializable--
  • 8:49 - 8:52
    this could be expressed
    as triple statements.
  • 8:52 - 8:58
    So, "Cindy has pet, Tilly,
    Martin is a human,"
  • 8:58 - 9:02
    and getting to the core insight
  • 9:02 - 9:07
    is comparing how do we make
    a question in English?
  • 9:07 - 9:11
    Well, we have a statement
    and it's incomplete,
  • 9:11 - 9:17
    like, "Who has pet, Tilly?"
  • 9:17 - 9:22
    So we go from "Cindy has pet Tilly,"
    to "Who has pet Tilly?"
  • 9:22 - 9:23
    We've taken something out,
  • 9:23 - 9:28
    we've put in a placeholder,
    and we've introduced a question mark.
  • 9:28 - 9:30
    I say that's just like
    what we do with SPARQL.
  • 9:30 - 9:33
    We take something out,
    we have an incomplete statement,
  • 9:33 - 9:36
    or incomplete statements,
  • 9:36 - 9:40
    we put a placeholder in the missing place,
    and we have a question mark
  • 9:40 - 9:43
    to mark that that's a placeholder.
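
The placeholder idea can be tried out with a minimal sketch, assuming the toy world is stored as a Python list of three-part tuples. The names `TRIPLES` and `match` and the property labels are hypothetical; only the people, Tilly, and "dog" come from the talk.

```python
# Toy world as (subject, property, value) triples.
TRIPLES = [
    ("Cindy", "has_spouse", "Bob"),
    ("Bob", "has_spouse", "Cindy"),
    ("Cindy", "has_child", "Martin"),
    ("Cindy", "has_pet", "Tilly"),
    ("Bob", "has_pet", "Tilly"),
    ("Tilly", "is_a", "dog"),
    ("Martin", "is_a", "human"),
]


def match(pattern, triples=TRIPLES):
    """Return one dict of bindings per triple that fits the pattern.

    Terms starting with "?" are placeholders; all other terms must match exactly.
    """
    results = []
    for triple in triples:
        bindings = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A placeholder used twice must stand for the same thing.
                if bindings.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            results.append(bindings)
    return results


# "Who has pet Tilly?" becomes an incomplete statement with a placeholder.
who = [b["?who"] for b in match(("?who", "has_pet", "Tilly"))]
print(who)
```

Here the function plays the part of the query service: it compares the incomplete statement against every stored triple and reports what the placeholder could stand for.
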
  • 9:43 - 9:47
    So it can be a role play
    where I'm the query service
  • 9:47 - 9:49
    for this knowledge base.
  • 9:49 - 9:54
    And so people can learn
    what a query service does
  • 9:54 - 9:57
    by seeing a query service and role-playing
  • 9:57 - 10:00
    and being a query service,
    which we'll get to.
  • 10:01 - 10:05
    So people can see that
    working on the level of triples.
  • 10:07 - 10:09
    "Who has pet, Tilly?"
  • 10:09 - 10:14
    If you say that to me, I can say,
    "results Cindy, Bob."
  • 10:14 - 10:18
    Then I put it to the trainees,
  • 10:18 - 10:20
    how do you ask more complicated questions?
  • 10:20 - 10:22
    So, "Who has a dog as a pet?"
  • 10:24 - 10:29
    And some will get it straightaway,
    some will say, "Oh, it's a triple--
  • 10:29 - 10:33
    Who? has pet dog?"
  • 10:33 - 10:38
    So my role as the query service
    is to look at this and match your triple,
  • 10:38 - 10:39
    "Who? has pet dog,"
  • 10:39 - 10:42
    so I got to find things that have pet dog,
  • 10:42 - 10:43
    and results None.
  • 10:43 - 10:48
    So this is the discussion--
    what is this node I've called dog?
  • 10:48 - 10:49
    It's not a dog.
  • 10:49 - 10:53
    Although it's called dog,
    it's not a dog, it stands for a class.
  • 10:53 - 10:56
    Obvious when you're a SPARQL user,
    but this is getting people
  • 10:56 - 10:59
    over the threshold
    of thinking in this way.
  • 10:59 - 11:02
    And you got to do
    what kinds of things have pets.
  • 11:02 - 11:05
    People see that they can't do that
    in one triple,
  • 11:05 - 11:07
    you got to do multiple triples,
  • 11:07 - 11:10
    and those multiple triples
    ask for multiple things.
  • 11:13 - 11:17
    So if you've got,
    "What kinds of things have pets?"
  • 11:17 - 11:19
    then you're going to identify people,
  • 11:19 - 11:21
    and then you've got to
    identify those types,
  • 11:21 - 11:24
    and it naturally comes up,
    "How do I specify the columns I want?
  • 11:24 - 11:27
    How do I specify that I want the types?"
    That's the question.
  • 11:27 - 11:30
    And then you say,
    "You have these partial statements,
  • 11:30 - 11:35
    and you enclose them
    in curly brackets and put Select."
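
The step from one triple to several, with a choice of columns, can be sketched as below. This is a toy stand-in for what a query service does, not how any real SPARQL engine is implemented; `extend`, `select`, and the property labels are hypothetical names.

```python
TRIPLES = [
    ("Cindy", "has_pet", "Tilly"),
    ("Bob", "has_pet", "Tilly"),
    ("Tilly", "is_a", "dog"),
    ("Cindy", "is_a", "human"),
    ("Bob", "is_a", "human"),
    ("Martin", "is_a", "human"),
]


def extend(pattern, triple, bindings):
    """Return bindings extended so that pattern fits triple, or None."""
    new = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def select(variables, patterns, triples=TRIPLES):
    """Toy version of: SELECT variables { patterns }."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for bindings in solutions:
            for triple in triples:
                extended = extend(pattern, triple, bindings)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return [tuple(s[v] for v in variables) for s in solutions]


# One triple is not enough: the node "dog" is a class, not anyone's pet.
no_one = select(["?who"], [("?who", "has_pet", "dog")])

# Two triples sharing the placeholder ?pet answer "Who has a dog as a pet?"
dog_owners = select(["?who"], [("?who", "has_pet", "?pet"), ("?pet", "is_a", "dog")])

# "What kinds of things have pets?" asks for a second column, the type.
owner_types = select(
    ["?who", "?type"],
    [("?who", "has_pet", "?pet"), ("?who", "is_a", "?type")],
)
print(no_one, dog_owners, owner_types)
```

The list passed as `variables` plays the role of the columns named after Select, and the list of patterns plays the role of the statements inside the curly brackets.
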
  • 11:38 - 11:41
    So this is kind of the first half hour
    of the workshop,
  • 11:41 - 11:44
    and it's not on computers,
    it's all with role play
  • 11:44 - 11:46
    and thinking about this.
  • 11:46 - 11:52
    And I invite people in the workshop
    to make their own toy world,
  • 11:52 - 11:55
    and you'll be doing a toy world,
    I hope, after this.
  • 11:55 - 12:00
    So five minutes, eight to ten nodes
    to represent your family, your work place,
  • 12:00 - 12:02
    the thing you're working on,
    the TV you were watching last night,
  • 12:02 - 12:05
    and to have some
    meaningful links between them.
  • 12:05 - 12:09
    And the lesson that--
    you make arbitrary decisions,
  • 12:09 - 12:11
    you name things, you create properties,
  • 12:11 - 12:17
    but they're the creation of the person
    who sets up the knowledge system.
  • 12:18 - 12:24
    And then, in pairs, they explain
    their graphs to each other, and query.
  • 12:24 - 12:28
    So, "What's a query you could ask
    about this little world,
  • 12:28 - 12:30
    and then what would be the answer?"
  • 12:30 - 12:34
    So, like I say, people mostly get it,
  • 12:34 - 12:36
    but people want a four-
    or five-part relation,
  • 12:36 - 12:38
    so they might want to say,
  • 12:38 - 12:40
    "This couple, together, have a pet."
  • 12:40 - 12:43
    Or they might want to say,
    "Tilly is a pet, is a dog."
  • 12:43 - 12:47
    And you can enforce nodes, triples,
    and triples have a direction.
  • 12:48 - 12:51
    So I'll explain what a triple is
    and say also, not in this example,
  • 12:51 - 12:55
    but, "Triples, generally,
    they have an item, they have a property,
  • 12:55 - 12:57
    and then they have
    a number of other things
  • 12:57 - 13:00
    which could be values,
    could be time periods,
  • 13:00 - 13:03
    could be locations on a globe."
  • 13:07 - 13:11
    So with that role-play exercise,
    we're 40 minutes into a 2-hour workshop,
  • 13:11 - 13:14
    and in a computer room,
    and we haven't touched computers yet.
  • 13:14 - 13:17
    But I think it's useful
    to get people thinking in that way,
  • 13:17 - 13:20
    and to think about
    how they would make the model
  • 13:20 - 13:24
    and what the query is,
    and to actually translate,
  • 13:24 - 13:25
    so your translation exercise.
  • 13:26 - 13:33
    And then I'd direct people to
    query.wikidata.org.
  • 13:34 - 13:36
    So there's a bunch of things
    they've got to take on.
  • 13:36 - 13:40
    We've been doing--
    I will have a flip chart, and we will--
  • 13:40 - 13:42
    Is that six?
  • 13:42 - 13:43
    Six minutes elapsed?
  • 13:43 - 13:45
    (man) [inaudible]
  • 13:45 - 13:46
    Right.
  • 13:51 - 13:52
    So I'll give them a task.
  • 13:52 - 13:56
    I don't want them to learn
    Q numbers and P numbers.
  • 13:56 - 14:01
    So I'll tell them what the names are
    and show them the Ctrl+Space trick.
  • 14:01 - 14:02
    But there's a lot to take on,
  • 14:02 - 14:04
    so they're taking on
    Q numbers and P numbers,
  • 14:04 - 14:08
    they've seen the triple format,
    and they've seen Select,
  • 14:08 - 14:11
    but they've got to apply this
    all in one go.
  • 14:11 - 14:15
    So I'll give people a task.
  • 14:15 - 14:17
    Some will get it immediately,
    some will struggle
  • 14:17 - 14:19
    because they missed a bit of discussion,
  • 14:19 - 14:23
    or more often, because they're familiar
    with another kind of database system,
  • 14:23 - 14:25
    and they have
    particular expectations from that.
  • 14:27 - 14:31
    So I set bonus things
    or more complicated things
  • 14:31 - 14:32
    if people are getting bored.
  • 14:32 - 14:38
    Or I say, "If you get bored and you work
    on an entirely different question,
  • 14:38 - 14:40
    that's fine, but show me."
  • 14:40 - 14:42
    So I'll run through this in front of them,
  • 14:42 - 14:46
    tell them to do it, just show the hints
    of what properties they'll be using,
  • 14:46 - 14:47
    and then run through it again.
  • 14:47 - 14:50
    And then, go through the cycle
    of adding on extra things
  • 14:50 - 14:51
    to enhance the query.
  • 14:51 - 14:53
    So we might have done a query
    and I'll say,
  • 14:53 - 14:56
    "Here's how you add on
    an optional property."
  • 14:58 - 15:01
    And then give them a task
    involving optional property.
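
What an optional property does to the results can be sketched on the toy world. This is an assumed, simplified model: `matches`, `required`, and `optional` are hypothetical helpers, and the data assumes Martin has no pet of his own.

```python
TRIPLES = [
    ("Cindy", "is_a", "human"),
    ("Bob", "is_a", "human"),
    ("Martin", "is_a", "human"),
    ("Cindy", "has_pet", "Tilly"),
    ("Bob", "has_pet", "Tilly"),
]


def matches(pattern, bindings, triples=TRIPLES):
    """Return every way of extending bindings so that pattern fits a triple."""
    found = []
    for triple in triples:
        new = dict(bindings)
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                if new.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            found.append(new)
    return found


def required(solutions, pattern):
    """Keep only the solutions that the pattern can extend."""
    return [ext for b in solutions for ext in matches(pattern, b)]


def optional(solutions, pattern):
    """Extend solutions where possible, but keep the ones that do not match."""
    out = []
    for b in solutions:
        extended = matches(pattern, b)
        out.extend(extended if extended else [b])
    return out


people = required([{}], ("?person", "is_a", "human"))
strict = required(people, ("?person", "has_pet", "?pet"))
lenient = optional(people, ("?person", "has_pet", "?pet"))

table = [(row["?person"], row.get("?pet")) for row in lenient]
print(table)
```

With the pet pattern required, Martin drops out of the results; with it optional, he stays in the table with an empty pet column.
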
  • 15:01 - 15:05
    In the Bodleian, I say,
    "Find manuscripts in Latin."
  • 15:05 - 15:06
    For a public event
    at the University of Bristol,
  • 15:06 - 15:09
    there are lots of celebrities
    who studied at the University of Bristol,
  • 15:09 - 15:14
    so we use that as an example.
  • 15:14 - 15:16
    So going to the interface,
  • 15:16 - 15:21
    there's still a hump in the learning curve
  • 15:21 - 15:24
    because they've got
    to put the query into action,
  • 15:24 - 15:26
    they've got to think in this language,
  • 15:26 - 15:30
    and they've got to look up
    Q numbers and P numbers,
  • 15:30 - 15:32
    and then there's all the things
    they can do with the query,
  • 15:32 - 15:33
    once they've done it.
  • 15:33 - 15:38
    And the visualization options,
    the bookmarking, getting the data.
  • 15:44 - 15:46
    So I'll suggest refinements.
  • 15:46 - 15:50
    So we can take a succession of steps
    of getting people doing a query,
  • 15:50 - 15:53
    and taking it up to the next level.
  • 15:53 - 15:56
    Like, "Find landscape paintings
    taller than they are wide."
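
A refinement like this one compares two measurements of the same item, which in SPARQL is written with `FILTER`. The Python sketch below shows only the logic; the records, field names, and the `tall_landscapes` function are invented placeholders, not real Wikidata items or properties.

```python
# Invented records standing in for query results with measurements.
PAINTINGS = [
    {"title": "Painting A", "genre": "landscape", "height_cm": 120.0, "width_cm": 80.0},
    {"title": "Painting B", "genre": "landscape", "height_cm": 60.0, "width_cm": 90.0},
    {"title": "Painting C", "genre": "portrait", "height_cm": 100.0, "width_cm": 70.0},
]


def tall_landscapes(paintings):
    """Keep landscapes whose height exceeds their width."""
    return [
        p["title"]
        for p in paintings
        if p["genre"] == "landscape" and p["height_cm"] > p["width_cm"]
    ]


print(tall_landscapes(PAINTINGS))
```

The genre test corresponds to an ordinary triple in the query, while the height and width comparison is the filter added on top.
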
  • 15:56 - 16:03
    So within the two-hour thing,
    we get people doing basic queries,
  • 16:03 - 16:08
    adding refinements onto them,
  • 16:08 - 16:11
    not doing much filtering,
  • 16:11 - 16:14
    but starting to introduce measurements,
  • 16:14 - 16:15
    and so on.
  • 16:15 - 16:18
    Not getting into qualifiers
    or another level.
  • 16:18 - 16:21
    If it's a whole day thing,
    you probably could.
  • 16:21 - 16:26
    It comes up, inevitably, "Where else
    can I use the SPARQL language?"
  • 16:26 - 16:30
    And I observe that that is a question,
    and questions can be framed in SPARQL,
  • 16:30 - 16:32
    and put to Wikidata,
    and you'll get answers,
  • 16:32 - 16:34
    and there is a Wikidata property
    called SPARQL endpoint.
  • 16:34 - 16:37
    So when they ask that,
    that becomes their task.
  • 16:37 - 16:39
    And then they get
    that list of institutions
  • 16:39 - 16:40
    that have SPARQL endpoints.
  • 16:42 - 16:44
    And it's worth pointing out,
  • 16:44 - 16:49
    so in an introductory session
    on other computer languages,
  • 16:49 - 16:52
    people will typically
    learn how to do loops,
  • 16:52 - 16:55
    how to do functions,
    how to do conditionals.
  • 16:55 - 16:57
    They'll learn the basic grammar
  • 16:57 - 17:00
    but they won't make something
    fantastic and useful,
  • 17:00 - 17:02
    they'll just learn the basic grammar.
  • 17:02 - 17:06
    But in an introductory session
    on Wikidata SPARQL you can make--
  • 17:06 - 17:08
    if you're interested
    in German literature--
  • 17:08 - 17:10
    a map of the birthplace
    of German poets, and so on.
  • 17:10 - 17:12
    And so we get feedback like this.
  • 17:12 - 17:14
    This is how great
    the Wikidata Query Service is
  • 17:14 - 17:16
    as an educational tool.
  • 17:16 - 17:19
    "What is this sorcery?"
    Isn't even from someone in the room.
  • 17:19 - 17:21
    A trainee in the room made a map,
  • 17:21 - 17:25
    emailed it to her colleagues
    and got back, "What is this sorcery!?
  • 17:25 - 17:26
    How have you made this?"
  • 17:26 - 17:29
    And was just not expecting this to happen.
  • 17:29 - 17:32
    People are not expecting to look at
    the picture of the cute dog,
  • 17:32 - 17:36
    they're not expecting to do the role play
    where they represent their family
  • 17:36 - 17:38
    and query each other.
  • 17:38 - 17:40
    They're not expecting
    to actually make something concrete
  • 17:40 - 17:43
    which they take away as a link
    and show to their colleagues.
  • 17:43 - 17:45
    And all of this, being unexpected,
  • 17:45 - 17:47
    makes it memorable
    and makes them want to go away
  • 17:47 - 17:49
    and talk to other people about it.
  • 17:49 - 17:51
    It's not like your run-of-the-mill
    IT training.
  • 17:53 - 17:58
    The lower quote is from a researcher
    who saw how he could make a map
  • 17:58 - 18:01
    of famous people with his first name
  • 18:01 - 18:04
    and another one of famous people
    with his wife's first name.
  • 18:04 - 18:08
    And then he just had more and more ideas
    of things and charts, and so on,
  • 18:08 - 18:09
    he's going to create with Wikidata,
  • 18:09 - 18:11
    and so he's glad to say,
  • 18:11 - 18:13
    "You've destroyed my productivity
    for the next month."
  • 18:16 - 18:18
    So that's my recommendation.
  • 18:18 - 18:20
    I think we can take it as a positive,
  • 18:20 - 18:23
    and we take beyond
    training people about Wikidata,
  • 18:23 - 18:25
    training people about data.
  • 18:25 - 18:27
    The stuff that came up
    in the keynote this morning,
  • 18:27 - 18:32
    making people literate
    about ideas of representation
  • 18:32 - 18:37
    and starting people off
    and being involved in that discussion,
  • 18:37 - 18:38
    involves this [inaudible].
  • 18:38 - 18:39
    So this could be done--
  • 18:39 - 18:41
    doesn't have to be like
    a workplace training thing,
  • 18:41 - 18:42
    it could be a public event,
  • 18:42 - 18:45
    to get people familiar
    with these technologies.
  • 18:46 - 18:48
    But I will stop there for discussion.
  • 18:48 - 18:51
    And like I say, it's respectfully
    submitted to people in the room
  • 18:51 - 18:55
    who do SPARQL training a different way,
    but I hope this is useful to you.
  • 18:57 - 19:00
    (audience applause)
  • 19:13 - 19:16
    (Dan) Okay, are there any questions?
  • 19:24 - 19:27
    (man) Hi, it's [Mohammed Hijah]
    from Palestine.
  • 19:27 - 19:28
    Thank you for the session.
  • 19:28 - 19:31
    I was wondering if there are resources
  • 19:31 - 19:35
    that we can get to learn
    SPARQL language professionally?
  • 19:38 - 19:40
    I've got the SPARQL book,
    the O'Reilly book.
  • 19:40 - 19:43
    I find the Wikibook on SPARQL
  • 19:43 - 19:45
    is really, really useful.
  • 19:45 - 19:48
    That's like the most useful
    and accessible reference.
  • 19:49 - 19:55
    The tutorials on Wikidata itself
    are going to vary in quality.
  • 19:55 - 19:58
    (Mohammed) I think
    that they are for beginners.
  • 19:58 - 20:01
    I can handle SPARQL,
    but at the beginner level,
  • 20:01 - 20:04
    but I want to deal with it professionally.
  • 20:11 - 20:14
    So my concern is to get
    as many people as possible
  • 20:14 - 20:16
    across the threshold
    into being aware of how this works,
  • 20:16 - 20:18
    and dabbling.
  • 20:19 - 20:25
    I'd like it to be a deeper course
    by going into more of the...
  • 20:26 - 20:29
    how it works--
    qualifiers and references, and so on.
  • 20:29 - 20:32
    Where in a professional context,
    you're probably aiming towards
  • 20:32 - 20:36
    people using a particular SPARQL endpoint,
  • 20:36 - 20:39
    and Wikidata has some customizations.
  • 20:39 - 20:42
    We've discussed on Twitter
    that there's some things we use
  • 20:42 - 20:44
    that actually aren't a SPARQL standard.
  • 20:44 - 20:46
    They're like an optimization.
  • 20:46 - 20:49
    So in the professional context,
  • 20:51 - 20:56
    I'd hope it would be tailored
    to that particular data set and endpoint,
  • 20:56 - 21:00
    but there's not a demand for that yet,
  • 21:00 - 21:03
    because like I said, I deal with people
    who are aware of linked open data,
  • 21:03 - 21:08
    and the word's out that it's a good thing,
    but haven't seen an example yet,
  • 21:08 - 21:09
    haven't an example
    they can apply to their work,
  • 21:09 - 21:12
    they're not enthusiastic about it yet.
  • 21:12 - 21:14
    So I think we want to
    get my whole workplace
  • 21:14 - 21:18
    and other workplaces and developers
    across that threshold
  • 21:18 - 21:22
    to where they're demanding
    that kind of really in-depth,
  • 21:22 - 21:25
    like using endpoint in a library
    kind of training.
  • 21:26 - 21:27
    (Mohammed) Thank you.
  • 21:32 - 21:35
    (woman) It's just a question.
    I really liked that, thank you so much.
  • 21:35 - 21:38
    Is it documented step-by-step anywhere?
  • 21:39 - 21:43
    I can share my succession of tasks.
  • 21:44 - 21:47
    That's very much tailored
    to where I'm presenting it.
  • 21:47 - 21:51
    Like I said, with librarians,
    I start with manuscripts and go on.
  • 21:54 - 21:56
    You want to end up
    with people asking a question
  • 21:56 - 22:01
    which is the question they came,
    in their heads, to the event with.
  • 22:05 - 22:10
    So there's an order
    of querying with a triple,
  • 22:10 - 22:13
    and then with multiple triples,
    and then with an optional triple,
  • 22:13 - 22:17
    and then with a measurement
    in a filter, and so on.
  • 22:17 - 22:21
    And, yeah, I can share...
  • 22:22 - 22:24
    Yeah, I'll share a separate set of slides
  • 22:24 - 22:25
    for those exercises.
  • 22:25 - 22:27
    (woman) Thank you so much
    because I will take that
  • 22:27 - 22:30
    and customize it for my own needs.
    Thank you.
  • 22:31 - 22:33
    (Dan) Okay. No questions?
  • 22:35 - 22:39
    (man) What would you recommend
    if you also want to teach editing,
  • 22:39 - 22:42
    apart from just querying?
  • 22:47 - 22:53
    I'm pleased to report
    that people find Wikidata editing,
  • 22:53 - 22:57
    when I demonstrate it, to be so simple,
  • 22:57 - 22:59
    that it just takes them by surprise.
  • 22:59 - 23:02
    It's Wikidata editing,
    and I've got to add knowledge
  • 23:02 - 23:03
    to this huge knowledge base.
  • 23:03 - 23:05
    Sounds like something
    that really technical people can do.
  • 23:05 - 23:09
    And then you show it,
    and they go, "Oh, right.
  • 23:09 - 23:11
    Martin is instance of human."
  • 23:13 - 23:19
    So I haven't done that systematically yet.
  • 23:21 - 23:26
    I think a precondition would be
    getting people thinking in triples,
  • 23:26 - 23:30
    and maybe underline that
    triples need references,
  • 23:30 - 23:34
    and triples need qualifiers
    and that multiple triples,
  • 23:34 - 23:37
    triples have multiple conflicting values.
  • 23:37 - 23:40
    So I'd still do the toy world,
  • 23:40 - 23:45
    maybe a more professionally relevant
    toy world, and translation exercise,
  • 23:45 - 23:48
    but then go to, "So now the exercise
    we're going to do with triples
  • 23:48 - 23:50
    is adding them."
  • 23:52 - 23:55
    There's a lot of work done,
    and maybe Jason's done,
  • 23:55 - 23:58
    with guessing a table of identifiers.
  • 23:58 - 24:00
    So something I'd like to do,
  • 24:00 - 24:04
    there's an online database
  • 24:04 - 24:07
    of people who've won a Rhodes Scholarship.
  • 24:07 - 24:11
    There's a scholarship to Oxford University
    from other countries.
  • 24:11 - 24:12
    But it's not in Wikidata yet.
  • 24:12 - 24:14
    So you can kind of divide up
    the room and say,
  • 24:14 - 24:17
    "You're going to find
    these people in Wikidata
  • 24:17 - 24:19
    and your task is to add
  • 24:19 - 24:21
    with the reference
    to this online database."
  • 24:21 - 24:23
    And then you can do a query
    to see how many have been added
  • 24:23 - 24:26
    in that session.
  • 24:26 - 24:28
    So I think, with all the training I do,
  • 24:28 - 24:32
    I think the comprehension
    is more important
  • 24:32 - 24:34
    than the taking action immediately.
  • 24:34 - 24:36
    So when I'm training people on Wikipedia,
  • 24:36 - 24:40
    I first show them article histories,
    contribution records, talk page,
  • 24:40 - 24:45
    quality scale, so they're comprehending
    the process before they edit,
  • 24:45 - 24:47
    and actually change something.
  • 24:50 - 24:53
    (man) Not really a question but a comment.
  • 24:53 - 24:59
    There is, for beginners,
    a good tutorial on YouTube,
  • 24:59 - 25:01
    How to Query and Start with SPARQL,
  • 25:01 - 25:04
    and if you want to go deeper, also,
  • 25:04 - 25:09
    How to Add Data with OpenRefine.
  • 25:09 - 25:13
    And I've also made some videos
  • 25:13 - 25:15
    and uploaded them in German language.
  • 25:15 - 25:17
    Oh, great! Thanks.
  • 25:18 - 25:22
    I should also mention Hilary Thorsen,
    who's from Stanford Library,
  • 25:22 - 25:25
    did, last week,
    a really good video capture
  • 25:25 - 25:29
    of adding a data set to Wikidata
    with OpenRefine.
  • 25:29 - 25:34
    This is for the LD4P, the Linked Data
    for Production project,
  • 25:34 - 25:36
    and that was a really good video tutorial
  • 25:36 - 25:38
    I'd recommend to anybody for--
  • 25:38 - 25:42
    That's the next couple of levels up
    from what I'm doing.
  • 25:43 - 25:45
    (Dan) Is there a last question?
  • 25:49 - 25:52
    (man) So SPARQL's sort of SQL-ish.
  • 25:52 - 25:55
    If someone walked into your tutorial
    with an SQL background,
  • 25:55 - 25:57
    is that a blessing or a curse?
  • 25:57 - 26:00
    It's a bit of a curse
    because I had to learn SQL,
  • 26:00 - 26:03
    so I did the...
  • 26:03 - 26:09
    generate the invoices
    using SQL for your fictitious company,
  • 26:09 - 26:14
    and definitely had to unlearn
    an SQL way of thinking about things
  • 26:14 - 26:16
    to get to SPARQL.
  • 26:16 - 26:18
    But it was freeing, it was freeing.
  • 26:18 - 26:21
    Databases without built-in schemas
    are liberating.
  • 26:22 - 26:24
    When you think about
    how many columns there are,
  • 26:24 - 26:26
    and it's this number
    of columns for a book,
  • 26:26 - 26:28
    and it's this number of columns
    for the address,
  • 26:28 - 26:29
    and it's just three columns.
  • 26:29 - 26:31
    Well, three and a bit more.
  • 26:31 - 26:34
    That's really liberating.
  • 26:34 - 26:37
    So that's my point, I kind of glanced at,
  • 26:37 - 26:42
    that people make different progress
    in these workshops as in all training,
  • 26:42 - 26:44
    but it's not like intelligent versus dumb,
  • 26:44 - 26:47
    it's like the preconceptions
    you're coming with,
  • 26:47 - 26:48
    are more the obstacle.
  • 26:48 - 26:50
    So it's actually more--
  • 26:50 - 26:56
    I'm more optimistic about training people
    who have never encountered databases,
  • 26:56 - 26:59
    coding, or any of that before, than...
  • 26:59 - 27:02
    The worst people to try and train
    are linked data experts
  • 27:02 - 27:05
    because they've used DBpedia a lot.
  • 27:05 - 27:07
    They used a particular approach
    of querying
  • 27:07 - 27:09
    and expecting to get certain things,
  • 27:09 - 27:12
    and it looks odd when Wikidata
    does things differently.
  • 27:12 - 27:15
    And they need to get with the program.
  • 27:15 - 27:18
    (Dan) Okay, let's thank Martin
    for his insights.
  • 27:18 - 27:19
    Thanks very much.
  • 27:19 - 27:22
    (audience applause)
Video Language:
English
Duration:
27:29
