RailsConf 2014 - Mutation Testing with Mutant by Erik Michaels-Ober

  • 0:17 - 0:20
    ERIK MICHAELS-OBER: OK. Is the mic live? Yeah?
    We're good.
  • 0:20 - 0:27
    OK. Hi everybody. Welcome. Thank you for coming.
    So,
  • 0:29 - 0:34
    this is gonna be a talk about tools. And
  • 0:34 - 0:38
    there's this common expression that says that
    a carpenter
  • 0:38 - 0:42
    is only as good as his or her tools.
  • 0:42 - 0:44
    I'm not a carpenter, but that makes a lot
  • 0:44 - 0:46
    of sense to me. If your hammer is made
  • 0:46 - 0:49
    out of feathers, you're not gonna be able
    to
  • 0:49 - 0:51
    build very much.
  • 0:51 - 0:55
    And I really think the same thing is true
  • 0:55 - 0:59
    for programmers. I know that that is true.
    The
  • 0:59 - 1:03
    tools that we use really enable us to do
  • 1:03 - 1:06
    our job. And we use so many tools, it's
  • 1:06 - 1:09
    easy to sort of take for granted the tools
  • 1:09 - 1:10
    that we have and the tools that we do
  • 1:10 - 1:13
    use. And so I think it's worth sort of
  • 1:13 - 1:15
    thinking about the tools that we have and
    how
  • 1:15 - 1:19
    they help us improve as a programmer. And
    thinking
  • 1:19 - 1:22
    about what new tools we can use. In this
  • 1:22 - 1:26
    case, I'll be talking specifically about mutation
    testing and
  • 1:26 - 1:28
    how that, as a tool, can really help us
  • 1:28 - 1:32
    all improve as programmers. Help us write
    better tests.
  • 1:32 - 1:35
    But, I think, I just want to sort of
  • 1:35 - 1:37
    take some time to reflect and, and set a
  • 1:37 - 1:40
    little bit of a context for the tools that
  • 1:40 - 1:42
    we use every day and sort of, I think
  • 1:42 - 1:44
    take for granted a bit.
  • 1:44 - 1:50
    So, the first one is an editor. And it
  • 1:50 - 1:52
    seems like a very simple tool, right. You
    just
  • 1:52 - 1:54
    type in text and it just shows up on
  • 1:54 - 1:57
    the screen. But it's incredibly sophisticated.
    If you've ever
  • 1:57 - 1:59
    tried to write a text editor, if you've ever
  • 1:59 - 2:01
    read the source code of a text editor, most
  • 2:01 - 2:04
    text editors are like millions of lines of
    code
  • 2:04 - 2:08
    to implement what seems like a relatively
    simple thing.
  • 2:08 - 2:10
    And they help us. They provide us with things
  • 2:10 - 2:15
    like syntax highlighting, auto completion.
    And this directly helps
  • 2:15 - 2:18
    us write better programs, right. We avoid
    bugs. We'll
  • 2:18 - 2:21
    realize a bug in our editor before we, before
  • 2:21 - 2:23
    we deploy it to production. Before we even
    run
  • 2:23 - 2:26
    tests, we'll find a bug in our editor. Because
  • 2:26 - 2:30
    our editor tells us about it.
  • 2:30 - 2:37
    This is an early version of Vim. So it
  • 2:37 - 2:39
    can, it can be really easy to forget sort
  • 2:39 - 2:41
    of what these tools used to look like, right.
  • 2:41 - 2:44
    This is how people used to write code. And
  • 2:44 - 2:46
    these look more like the sort of tools from
  • 2:46 - 2:47
    the wood shop than the tools that we're used
  • 2:47 - 2:51
    to using. So this is an early punch card
  • 2:51 - 2:55
    machine. The photo was taken in the, in the
  • 2:55 - 2:59
    Computer History Museum in Mountain View, California.
    And I can
  • 2:59 - 3:01
    tell you for a fact that I would not
  • 3:01 - 3:03
    be a programmer today if this is how we
  • 3:03 - 3:06
    still had to write programs. And I suspect
    that
  • 3:06 - 3:09
    many of you would not be programmers if this
  • 3:09 - 3:12
    was sort of the state-of-the-art in how it
    was
  • 3:12 - 3:12
    done.
  • 3:12 - 3:15
    And so I think, like, I want to make
  • 3:15 - 3:17
    the case that sort of both the quality and
  • 3:17 - 3:21
    the quantity of software would be much worse
    than
  • 3:21 - 3:23
    it is today, if not for sort of the
  • 3:23 - 3:28
    continued evolution of, of our tools.
  • 3:28 - 3:30
    Another tool I use every day is an interactive
  • 3:30 - 3:35
    debugger. So, sort of allows you to step through
  • 3:35 - 3:37
    your code, line by line, and better understand
    how
  • 3:37 - 3:39
    it works. You can kind of get inside the
  • 3:39 - 3:43
    code, right. I'm not gonna spend too much
    time
  • 3:43 - 3:48
    talking about debuggers. Sort of a public
    service announcement,
  • 3:48 - 3:52
    next week, not next week. This week. Next
    Thursday.
  • 3:52 - 3:55
    This Thursday. In this same room, I believe,
    is
  • 3:55 - 4:00
    a great talk on debugger-driven development
    with Pry. So,
  • 4:00 - 4:02
    if you're interested in hearing more about
    that, you
  • 4:02 - 4:03
    should go to that.
  • 4:03 - 4:07
    So, what do we do when our code is
  • 4:07 - 4:11
    slow? What's the tool for that, right? We
    have
  • 4:11 - 4:14
    profilers that tell us where time is being
    spent
  • 4:14 - 4:17
    when we execute our code. And I wouldn't even
  • 4:17 - 4:21
    know how to start optimizing the program if
    I
  • 4:21 - 4:23
    didn't have a profile, right, profiler. I
    would be
  • 4:23 - 4:27
    a terrible optimizer without a profiler. I
    guess I
  • 4:27 - 4:30
    would like start putting in, you know, t =
  • 4:30 - 4:33
    Time.now, and then, like, at the end
  • 4:33 - 4:36
    of whatever I wanted to measure, I would subtract
  • 4:36 - 4:40
    the current time from the start time. But,
    that's
  • 4:40 - 4:43
    crazy. Like, instrumenting your entire code
    that way is,
  • 4:43 - 4:47
    yeah. Like, I wouldn't really know how to
    optimize
  • 4:47 - 4:49
    code without a profiler. I wouldn't be as
    good
  • 4:49 - 4:52
    at it. None of us would be.
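The manual timing approach described here, recording a start time and subtracting it at the end, might look like the sketch below. The `slow_sum` method is a hypothetical workload invented for illustration; only the `Time.now` idea comes from the talk.

```ruby
# Hypothetical workload, invented here so there is something to time.
def slow_sum(n)
  (1..n).reduce(0) { |sum, i| sum + i * i }
end

t = Time.now                  # record the start time
result = slow_sum(100_000)
elapsed = Time.now - t        # seconds since the start, as a Float

puts "slow_sum(100_000) = #{result}, took #{elapsed} seconds"
```

Wrapping every method of interest this way is the bookkeeping that a profiler automates.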
  • 4:52 - 4:57
    And another sort of tool that is very prevalent
  • 4:57 - 5:01
    in the, in the Ruby community is testing.
    This
  • 5:01 - 5:03
    is an example of someone who should have done
  • 5:03 - 5:10
    more testing. So that, again, just. Yeah.
    All right.
  • 5:12 - 5:15
    So I think this is a good illustration of
  • 5:15 - 5:20
    how testing can save you, right. Test so that
  • 5:20 - 5:22
    you find out before you sort of run it
  • 5:22 - 5:24
    in production. OK.
  • 5:24 - 5:26
    Enough of that.
  • 5:26 - 5:28
    So, I would, I'm actually gonna make the case
  • 5:28 - 5:31
    that, in the Ruby, Ruby toolbox, or maybe
    in
  • 5:31 - 5:34
    the Rubyist's toolbox, tests are sort of like
    the
  • 5:34 - 5:35
    hammer, right. Like, this is the thing you
    turn
  • 5:35 - 5:39
    to all the time for all sorts of things.
  • 5:39 - 5:42
    We use them to prevent regressions. We use
    them
  • 5:42 - 5:45
    to specify behavior. And we actually use them
    to
  • 5:45 - 5:50
    drive development. DHH doesn't do this, but
    many others
  • 5:50 - 5:53
    do. And find it useful.
  • 5:53 - 5:57
    So if we write tests, then we have perfect
  • 5:57 - 6:00
    code, right. If we have tests that verify
    that
  • 6:00 - 6:03
    our code does what it's supposed to do, then
  • 6:03 - 6:05
    at the end of the day, we have perfect
  • 6:05 - 6:10
    code. Correct? Not correct.
  • 6:10 - 6:13
    This is the fundamental logical flaw with
    testing, right.
  • 6:13 - 6:17
    You have some code. And you know that code
  • 6:17 - 6:19
    can have bugs. So you say I have an
  • 6:19 - 6:23
    idea, let's write some tests. But tests are
    just
  • 6:23 - 6:27
    more code. And we know that code has bugs.
  • 6:27 - 6:29
    So we're screwed.
  • 6:29 - 6:31
    What's that?
  • 6:31 - 6:36
    Test your tests. That's right. So. I'm getting
    there.
  • 6:36 - 6:36
    Patience.
  • 6:36 - 6:40
    So, like, one tool that people use to sort
  • 6:40 - 6:43
    of measure the effectiveness of their tests
    is code
  • 6:43 - 6:50
    coverage. And it's sort of a metric that's
    designed
  • 6:50 - 6:52
    to tell you whether your tests do what they're
  • 6:52 - 6:56
    supposed to do. But I'll show you, in a
  • 6:56 - 6:58
    moment, why I think it's a really flawed metric
  • 6:58 - 7:00
    and why it sort of can give you a
  • 7:00 - 7:03
    false sense of security, right. A lot of people
  • 7:03 - 7:07
    think that they have 100% code coverage, and
    that
  • 7:07 - 7:09
    means, like, their code is perfect and bug-free.
    Or
  • 7:09 - 7:11
    if they reach that level, then their code
    will
  • 7:11 - 7:14
    be perfect and bug-free. But this is not true.
  • 7:14 - 7:17
    Right, like, this guy thinks he's covered
    and he's
  • 7:17 - 7:18
    not.
  • 7:18 - 7:21
    And code coverage is actually, like, it's
    something that
  • 7:21 - 7:24
    was built into Ruby, right. Like, in Ruby
    1.9.3,
  • 7:24 - 7:26
    this is something that, like, we as a programmer
  • 7:26 - 7:28
    community said, like, we want to have. And
    I'm
  • 7:28 - 7:31
    not against it. Like, I think it's good. But
  • 7:31 - 7:33
    I do think it can give you a false
  • 7:33 - 7:34
    sense of security, right.
  • 7:34 - 7:38
    I thought this was a funny Tweet.
  • 7:38 - 7:43
    So, you can have 100% code coverage and still
  • 7:43 - 7:50
    have completely bug-ridden code. So, so is
    there hope
  • 7:51 - 7:54
    for us? Right? Like, how do we, how do
  • 7:54 - 7:57
    we test our tests? It's sort of this problem
  • 7:57 - 8:00
    of, like, who will watch the watchers, right?
    Who
  • 8:00 - 8:02
    do we, who can we trust? If we can't
  • 8:02 - 8:04
    trust our tests, how, why, why are we even
  • 8:04 - 8:06
    writing them?
  • 8:06 - 8:08
    And I'm gonna try to make the case that
  • 8:08 - 8:12
    mutation testing is the sort of solution to
    this
  • 8:12 - 8:18
    problem. So just like everything else, like
    an editor,
  • 8:18 - 8:22
    like an interactive debugger, like a profiler,
    like tests,
  • 8:22 - 8:26
    mutation testing is a tool. The basic idea
    behind
  • 8:26 - 8:28
    it is that it takes your tests and it
  • 8:28 - 8:32
    runs them against your code, and they should
    pass.
  • 8:32 - 8:33
    And if they do pass, then what it does,
  • 8:33 - 8:35
    is it takes your code and it makes a
  • 8:35 - 8:39
    modification to your code. It actually changes
    your code
  • 8:39 - 8:43
    at runtime. And then it runs your tests again,
  • 8:43 - 8:46
    against the modified version of your code.
    And the
  • 8:46 - 8:48
    idea is that when that code is modified, the
  • 8:48 - 8:52
    tests that previously passed should now fail,
    right.
  • 8:52 - 8:55
    So the thing, your modified code is called
    a
  • 8:55 - 8:57
    mutant, and the idea is that if that test
  • 8:57 - 9:01
    fails, you kill the mutant. Right. The mutant
    dies.
  • 9:01 - 9:03
    But if that mutant survives, then that means
    there's
  • 9:03 - 9:05
    something wrong with your tests. There might
    not be
  • 9:05 - 9:07
    something wrong with your code. But there
    is certainly
  • 9:07 - 9:10
    something wrong with your tests. Either you
    have a
  • 9:10 - 9:12
    bug in your tests. You have missing tests.
    Your
  • 9:12 - 9:17
    tests are either over-specified or under-specified.
  • 9:17 - 9:19
    So this is a technique, it's very helpful
    for
  • 9:19 - 9:22
    sort of answering the question, what tests
    should I
  • 9:22 - 9:25
    write? Which I think is a question that many
  • 9:25 - 9:28
    of us struggle with. It's certainly something
    that beginners
  • 9:28 - 9:30
    struggle with when they're starting to program.
    Like, how
  • 9:30 - 9:33
    do I, how do I write tests? What, what
  • 9:33 - 9:35
    do I test? Right.
  • 9:35 - 9:37
    And then there's also this question of like,
    how
  • 9:37 - 9:39
    do I know when I'm done? How do I
  • 9:39 - 9:42
    know when the code is sufficiently tested?
    And I
  • 9:42 - 9:46
    think these are actually hard questions to
    ask, or
  • 9:46 - 9:51
    hard questions to answer, and mutation, mutation
    testing provides
  • 9:51 - 9:54
    a, a quantitative answer to those questions.
    You can
  • 9:54 - 9:59
    say, with confidence, that this code has 100%
    mutation
  • 9:59 - 10:02
    coverage.
  • 10:02 - 10:09
    So, just to sort of give an example, here
  • 10:09 - 10:14
    is some code. And an assertion about the code.
  • 10:14 - 10:17
    So, I have a method, foo. It takes an
  • 10:17 - 10:21
    argument whose default is true. And the actual
    method
  • 10:21 - 10:26
    body for foo is either return that argument
    or
  • 10:26 - 10:33
    fail. And my assertion says assert_nothing_raised
    if I call
  • 10:33 - 10:38
    the method foo without passing in any parameters.
  • 10:38 - 10:40
    And so, or without passing in any arguments
    to
  • 10:40 - 10:47
    the, our parameter, rather. And so what, you
    know,
  • 10:48 - 10:52
    this test will pass, right. Arg. You call
    foo.
  • 10:52 - 10:55
    Arg is true. And it sort of short-circuits,
    right.
  • 10:55 - 11:00
    It sees arg. It sees the or. And this
  • 11:00 - 11:03
    test passes. So maybe you think this is a
  • 11:03 - 11:06
    good test. Maybe you think you're done writing
    your
  • 11:06 - 11:08
    tests. But you are not.
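The slide itself is not part of this transcript, so the following is a reconstruction from the spoken description, not the speaker's exact code. The `nothing_raised?` helper is a hypothetical stand-in for `assert_nothing_raised`, used so the sketch runs without a test framework.

```ruby
# Reconstructed from the description: return the argument, or fail if it is falsy.
def foo(arg = true)
  arg or fail
end

# Hypothetical stand-in for assert_nothing_raised:
# true when the block completes without raising.
def nothing_raised?
  yield
  true
rescue StandardError
  false
end

# The only test: call foo with no arguments. arg defaults to true,
# so `or` short-circuits and `fail` is never reached.
puts nothing_raised? { foo }
```

The test passes, but it never exercises the `fail` branch at all.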
  • 11:08 - 11:11
    And a mutant of that code, a small modification,
  • 11:11 - 11:14
    a sort of unit modification of that code might
  • 11:14 - 11:17
    look like this. And basically what it did
    was
  • 11:17 - 11:19
    it just sort of took that or fail and
  • 11:19 - 11:22
    removed it. And the idea is like, if you
  • 11:22 - 11:25
    do that, at least one of your tests should
  • 11:25 - 11:28
    now, that was passing before, should now fail.
    One
  • 11:28 - 11:30
    of your tests over that code, for that foo
  • 11:30 - 11:34
    method, should now fail. And if it does not,
  • 11:34 - 11:38
    then you are not testing your code sufficiently.
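A hand-written version of that mutant, next to the original, shows why the single test is not enough. This is a sketch reconstructed from the spoken description; `foo_mutant` and `raises_runtime_error?` are hypothetical names, and a real mutation testing tool would generate the mutant for you.

```ruby
# Original, as described in the talk.
def foo(arg = true)
  arg or fail
end

# Mutant: the `or fail` has been deleted.
def foo_mutant(arg = true)
  arg
end

# Hypothetical helper: true when the block raises RuntimeError (what `fail` raises).
def raises_runtime_error?
  yield
  false
rescue RuntimeError
  true
end

# The weak test (call with no arguments, expect no error) passes for both,
# so the mutant survives.
weak_passes_on_original = !raises_runtime_error? { foo }
weak_passes_on_mutant   = !raises_runtime_error? { foo_mutant }

# A stronger test (a falsy argument must raise) passes on the original
# and fails on the mutant, which kills it.
strong_passes_on_original = raises_runtime_error? { foo(false) }
strong_passes_on_mutant   = raises_runtime_error? { foo_mutant(false) }

puts "weak test:   original=#{weak_passes_on_original} mutant=#{weak_passes_on_mutant}"
puts "strong test: original=#{strong_passes_on_original} mutant=#{strong_passes_on_mutant}"
```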
  • 11:38 - 11:42
    So, this is called a statement deletion mutation.
    There
  • 11:42 - 11:47
    are various other types of mutation. So, for
    example,
  • 11:47 - 11:50
    there are mutations that would take that default
    parameter
  • 11:50 - 11:52
    and change it from true to false, or from
  • 11:52 - 11:56
    true to nil, right. Which would also cause
    failure,
  • 11:56 - 11:59
    in this case.
  • 11:59 - 12:01
    There's another mutation that will take the
    or and
  • 12:01 - 12:05
    change it to an and, right. So any time
  • 12:05 - 12:08
    there is sort of a unit in your code,
  • 12:08 - 12:10
    it takes greater than signs and changes them
    to
  • 12:10 - 12:13
    less than or equal to signs, et cetera. Right.
  • 12:13 - 12:16
    It takes ifs and changes them to unless. It
  • 12:16 - 12:18
    will take whole expressions and negate them
    and make
  • 12:18 - 12:21
    sure that your tests fail when the negation
    of
  • 12:21 - 12:25
    a statement is, when, when the method returns
    the
  • 12:25 - 12:28
    negation of the statement instead of the statement,
    right.
  • 12:28 - 12:31
    So that's, that's sort of the core idea behind
  • 12:31 - 12:34
    mutation testing. And so you end up sort of
  • 12:34 - 12:38
    writing these tests to cover all these cases
    that,
  • 12:38 - 12:40
    and then you sort of know when you're done,
  • 12:40 - 12:43
    right. Like, you know when all of your tests,
  • 12:43 - 12:47
    when, when your code is fully mutation-covered.
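The whole loop can be sketched by hand in a few lines. Everything here is hypothetical and for illustration only: the predicate, the list of mutants, and the test lambdas are invented, and a real tool derives its mutants from the source code instead of a hand-written table.

```ruby
# Code under test: a hypothetical predicate, not taken from the talk.
ORIGINAL = ->(a, b) { a > b }

# Hand-written mutants, each one small change to ORIGINAL.
MUTANTS = {
  '> replaced by <=' => ->(a, b) { a <= b },
  '> replaced by >=' => ->(a, b) { a >= b },
  'expression negated' => ->(a, b) { !(a > b) }
}.freeze

# A test is a lambda that takes an implementation and returns true on pass.
WEAK_TESTS = [
  ->(impl) { impl.call(2, 1) == true },
  ->(impl) { impl.call(1, 2) == false }
].freeze

# Same tests plus the boundary case where both arguments are equal.
STRONG_TESTS = (WEAK_TESTS + [->(impl) { impl.call(1, 1) == false }]).freeze

def passes?(impl, tests)
  tests.all? { |test| test.call(impl) }
end

# Names of the mutants that the given tests fail to kill.
def survivors(tests)
  raise 'tests must pass on the original code first' unless passes?(ORIGINAL, tests)

  MUTANTS.select { |_name, impl| passes?(impl, tests) }.keys
end

puts survivors(WEAK_TESTS).inspect   # prints ["> replaced by >="]
puts survivors(STRONG_TESTS).inspect # prints []
```

The surviving mutant points directly at the missing test: nothing checked what happens when the two values are equal.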
  • 12:47 - 12:52
    This is another Tweet. It's one from Katrina
    Owen.
  • 12:52 - 12:54
    And it's sort of this idea, it's kind of
  • 12:54 - 12:57
    like both horrifying and satisfying at the
    same time.
  • 12:57 - 12:59
    But if you sort of add more granular tests,
  • 12:59 - 13:03
    you'll find more bugs. And in many cases,
    mutant,
  • 13:03 - 13:05
    which is a mutation testing framework, will
    find those
  • 13:05 - 13:08
    bugs for you. Right. That's cool.
  • 13:08 - 13:10
    OK. So I promised there would be live-coding.
    This
  • 13:10 - 13:14
    is sort of. The introduction is over and now
  • 13:14 - 13:17
    we will write some code. Hopefully.
  • 13:17 - 13:24
    I'm just gonna switch to mirror display. Command
    F1.
  • 13:33 - 13:37
    That is a protip. That's great. You're a pro.
  • 13:37 - 13:41
    I clearly am not. OK. Cool.
  • 13:41 - 13:44
    Cool. And a new version of mutant was, like,
  • 13:44 - 13:49
    just released a few minutes ago, in advance
    of
  • 13:49 - 13:53
    this presentation. I am not the author of
    mutant.
  • 13:53 - 13:57
    It's a great library by Markus Schirp, and
    I
  • 13:57 - 14:00
    encourage you all to check it out. Version
    0.5.11,
  • 14:00 - 14:03
    hot off the presses.
  • 14:03 - 14:08
    So this is some code. So, like, the, this
  • 14:08 - 14:11
    sort of thrust behind this live-coding demo
    is I
  • 14:11 - 14:13
    will not be live-coding code, I will be live-coding
  • 14:13 - 14:17
    tests. Because the idea is not to, like, mutant
  • 14:17 - 14:19
    doesn't verify that your code is correct.
    It verifies
  • 14:19 - 14:21
    that your tests are correct. So you still
    need
  • 14:21 - 14:23
    to write tests, right. Tests verify that your
    code
  • 14:23 - 14:26
    is correct. Mutant verifies that your tests
    are correct.
  • 14:26 - 14:29
    So this is the code. And it's pretty, pretty
  • 14:29 - 14:33
    simple. But we'll sort of walk through it
    line-by-line.
  • 14:33 - 14:34
    Just to make sure everyone has a good understanding
  • 14:34 - 14:39
    of it. And so there's this module that represents
  • 14:39 - 14:43
    the universe, the entire universe, and inside
    of the
  • 14:43 - 14:45
    universe we have planets. And that's what
    this class
  • 14:45 - 14:49
    is all about. It's a pretty simple planet.
    It
  • 14:49 - 14:55
    takes a radius and an area as parameters when
  • 14:55 - 14:59
    it's constructed and stores those in instance
    variables. The
  • 14:59 - 15:03
    radius is the mean radius of the planet and,
  • 15:03 - 15:06
    in kilometers, and the area is sort of surface
  • 15:06 - 15:10
    area of the planet in square kilometers.
  • 15:10 - 15:14
    And then there's one sort of interesting method,
    one
  • 15:14 - 15:21
    public method, spherical. And spherical will
    return true if,
  • 15:23 - 15:26
    if the planet is a perfect sphere, or within
  • 15:26 - 15:29
    a particular tolerance of that. So the idea
    is
  • 15:29 - 15:33
    we calculate the approximate area using four
    pi r
  • 15:33 - 15:37
    squared, which is the formula to calculate
    the area
  • 15:37 - 15:44
    of a sphere, and if the area sort of
  • 15:44 - 15:47
    matches that, then we know it's a sphere.
    We
  • 15:47 - 15:50
    know it's spherical. This method returns true.
  • 15:50 - 15:53
    And if, if that's not true, then the planet
  • 15:53 - 15:57
    is not spherical. It's either oblate, like
    the earth,
  • 15:57 - 16:02
    or prolate, and then this method will return
    false.
  • 16:02 - 16:04
    So, yeah. We just sort of calculate the approximate
  • 16:04 - 16:07
    area and then we have this range private
    method
  • 16:07 - 16:10
    that just generates a range. We need sort
    of a
  • 16:10 - 16:12
    tolerance. The idea is you don't want it to
  • 16:12 - 16:17
    be too precise, because we're dealing with
    pi, so
  • 16:17 - 16:23
    pi is, I mean, in actuality, it's a non-terminating
  • 16:23 - 16:27
    number. In Ruby, it has, like, ten digits
    of
  • 16:27 - 16:29
    precision or something like that, right. Like
    the constant
  • 16:29 - 16:30
    Math::PI.
  • 16:30 - 16:33
    But the idea is that, like, if it's close
  • 16:33 - 16:36
    enough to a sphere, within a particular tolerance,
    then
  • 16:36 - 16:40
    we'll just call it round, basically. And so
    we
  • 16:40 - 16:44
    generate this range, which is sort of the
    approximate
  • 16:44 - 16:47
    area that we've calculated, based on the radius
    plus
  • 16:47 - 16:49
    or minus the tolerance, and we see if the
  • 16:49 - 16:53
    area falls within those bounds. Does everyone
    understand this
  • 16:53 - 16:56
    code? I think it is pretty simple. I tried
  • 16:56 - 16:58
    to make it fit on one screen. On one
  • 16:58 - 16:59
    slide.
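The slide is not reproduced in this transcript. Based only on the spoken walkthrough, the class might look roughly like the sketch below. The constant name `TOLERANCE`, its value, the local variable names, and the use of `cover?` are all assumptions; the talk confirms only the module and class, the two constructor arguments, the public `spherical?` predicate, the private range helper, and the 4πr² formula.

```ruby
module Universe
  class Planet
    # Assumed name and value: allowed difference in square kilometers.
    TOLERANCE = 1.0

    # radius: mean radius in kilometers; area: surface area in square kilometers.
    def initialize(radius, area)
      @radius = radius
      @area = area
    end

    # True when the stored area is within TOLERANCE of a perfect sphere's area.
    def spherical?
      approximate_area = 4 * Math::PI * @radius**2
      range(approximate_area).cover?(@area)
    end

    private

    # Inclusive range of areas that count as "close enough" to value.
    def range(value)
      (value - TOLERANCE)..(value + TOLERANCE)
    end
  end
end

sphere_area = 4 * Math::PI * 1000**2
puts Universe::Planet.new(1000, sphere_area).spherical?
puts Universe::Planet.new(1000, sphere_area + 100).spherical?
```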
  • 16:59 - 17:01
    Yeah?
  • 17:01 - 17:05
    OK. So if everyone understands it, I want
    to
  • 17:05 - 17:07
    take a little bit of a poll. This is
  • 17:07 - 17:09
    kind of like the interactive part of the talk.
  • 17:09 - 17:11
    And you have to, like, everyone has to participate.
  • 17:11 - 17:15
    That's the, that's the goal. Everyone, people
    like to
  • 17:15 - 17:17
    sort of sit by the sidelines and not commit,
  • 17:17 - 17:19
    but you have to commit. I'll be really angry
  • 17:19 - 17:23
    if you don't.
  • 17:23 - 17:26
    You don't want to see me angry.
  • 17:26 - 17:29
    So how many tests do you think you need
  • 17:29 - 17:35
    to fully cover this code? To cover the public
  • 17:35 - 17:39
    method, the, the spherical method, right,
    so that it's
  • 17:39 - 17:42
    sort of fully exercised. Who thinks you need
    zero
  • 17:42 - 17:49
    tests? Show of hands? Anybody? No. Good. I
    agree.
  • 17:49 - 17:52
    You can't cover code without tests. So, that's
    good.
  • 17:52 - 17:54
    You've been paying some attention.
  • 17:54 - 17:57
    Who thinks you can do it with one test?
  • 17:57 - 18:00
    Maybe, sort of, the happy path? Right. You
    write
  • 18:00 - 18:04
    a test that says, you know, you expect some
  • 18:04 - 18:09
    planet to be spherical given radius and an
    area,
  • 18:09 - 18:16
    and it is. All good. Who thinks that's sufficient?
  • 18:18 - 18:20
    Nobody. So.
  • 18:20 - 18:24
    You can actually get C-zero, 100% C-zero code
    coverage
  • 18:24 - 18:27
    of this entire class with one test. With one
  • 18:27 - 18:31
    spec. Right. You won't have 100% mutation
    coverage, but
  • 18:31 - 18:33
    I will show you, in a minute, you can
  • 18:33 - 18:36
    have 100% C-zero code coverage, despite the
    fact that
  • 18:36 - 18:39
    nobody in this room thinks that that is sufficient
  • 18:39 - 18:42
    to cover this code. So. I will prove it
  • 18:42 - 18:44
    to you. But you all intuitively know this
    to
  • 18:44 - 18:47
    be the case. And yet we all idolize this
  • 18:47 - 18:51
    C-zero code coverage metric as if it means
    something,
  • 18:51 - 18:54
    when really it, it's a false sense of security,
  • 18:54 - 18:56
    right. You're the guy with the umbrella in
    the
  • 18:56 - 18:59
    hurricane, and the umbrella is like destroyed
    and inside
  • 18:59 - 19:01
    out.
  • 19:01 - 19:05
    OK. So how many people think you can do
  • 19:05 - 19:10
    it with two tests? OK. Somebody who's raising
    your
  • 19:10 - 19:12
    hand. This gentleman in the front. What are
    the
  • 19:12 - 19:14
    two tests that you would write? Just sort
    of
  • 19:14 - 19:18
    roughly? Maybe the happy path and what other?
  • 19:18 - 19:20
    AUDIENCE: One that's spherical and one not.
  • 19:20 - 19:22
    E.M.: One that's spherical and one that's
    not. OK.
  • 19:22 - 19:24
    I think that's good. How many people think
    you
  • 19:24 - 19:28
    would need three to do it? K, maybe gentleman
  • 19:28 - 19:30
    there who thinks we need three. What's the
    third
  • 19:30 - 19:33
    you would write?
  • 19:33 - 19:37
    AUDIENCE: [indecipherable - 00:19:41]
  • 19:37 - 19:43
    E.M.: Say it again? A value for tolerance?
  • 19:43 - 19:46
    AUDIENCE: A value that will blow up the computation.
  • 19:46 - 19:47
    E.M.: That will blow up the computation. How
    would
  • 19:47 - 19:49
    you blow up the computation?
  • 19:49 - 19:54
    AUDIENCE: [indecipherable - 00:19:55]
  • 19:54 - 19:56
    E.M.: Passing in a string.
  • 19:56 - 19:58
    AUDIENCE: Yes.
  • 19:58 - 20:02
    E.M.: OK. Great. And what would you expect
    the
  • 20:02 - 20:04
    result to be, like, what would your expectation,
    what
  • 20:04 - 20:06
    would you assert? Like, I pass in a string
  • 20:06 - 20:09
    and I expect.
  • 20:09 - 20:12
    AUDIENCE: An exception to be raised.
  • 20:12 - 20:13
    E.M.: An exception. OK. And if you didn't
    get
  • 20:13 - 20:16
    an exception then that would be a problem.
  • 20:16 - 20:18
    AUDIENCE: Yes.
  • 20:18 - 20:24
    E.M.: OK. OK. Who thinks four will do it?
  • 20:24 - 20:26
    Nobody thinks four will do it. A few people
  • 20:26 - 20:34
    do. Yeah. What additional tests would you
    add?
  • 20:34 - 20:40
    AUDIENCE: Well, you're testing a range, so
    you have-
  • 20:40 - 20:40
    E.M.: Hmm. Great.
  • 20:40 - 20:40
    AUDIENCE: -so there's two sides.
  • 20:40 - 20:41
    E.M.: I really like this. So, the comment
    was
  • 20:41 - 20:43
    that you're testing a range, and there's sort
    of
  • 20:43 - 20:45
    two sides. There's the, I'm on the low-end
    of
  • 20:45 - 20:47
    the range and I am included, and I am
  • 20:47 - 20:50
    on the high-end of the range. So it would
  • 20:50 - 20:53
    be, there's two of those. Right. One for the
  • 20:53 - 20:54
    low-end and one for the high-end. Exactly.
  • 20:54 - 20:57
    So, it's sort of the happy path. The thing
  • 20:57 - 21:00
    is spherical. The sad path, the thing is not
  • 21:00 - 21:05
    spherical. And both sides of the range. I
    like
  • 21:05 - 21:08
    that. Good. How many people think five? Five
    or
  • 21:08 - 21:11
    more? How's that? Five or more. OK. Lots of
  • 21:11 - 21:13
    hands for five or more.
  • 21:13 - 21:20
    So, according to mutant, which is also software,
    therefore
  • 21:20 - 21:23
    imperfect, you can, you can test this with
    four,
  • 21:23 - 21:26
    and it will not handle things like you should,
  • 21:26 - 21:29
    like, it sort of assumes that the radius and
  • 21:29 - 21:34
    area are valid, right. Like, you can, although,
    actually,
  • 21:34 - 21:36
    maybe that's. Well, we can try it. It's a
  • 21:36 - 21:38
    live coding thing. So let's just do it and
  • 21:38 - 21:41
    see what happens. But thank you for participating
    in
  • 21:41 - 21:44
    that. I think it was an interesting exercise.
  • 21:44 - 21:47
    But, yeah. Basically, like, mutant says the
    answer to
  • 21:47 - 21:50
    this question is four, right. It's basically
    the happy
  • 21:50 - 21:52
    path, the sad path, and both sides of the
  • 21:52 - 21:58
    range. So yeah. Let's, let's sort of show
    how
  • 21:58 - 22:01
    that works.
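Those four cases can be written out as plain Ruby checks. This is a sketch against a hypothetical reconstruction of the planet class described in the talk: the `TOLERANCE` constant, its value, and the synthetic radius are assumptions, chosen so that the boundaries can be computed exactly.

```ruby
module Universe
  # Hypothetical reconstruction; TOLERANCE and its value are assumed.
  class Planet
    TOLERANCE = 1.0

    def initialize(radius, area)
      @radius = radius
      @area = area
    end

    def spherical?
      approximate = 4 * Math::PI * @radius**2
      ((approximate - TOLERANCE)..(approximate + TOLERANCE)).cover?(@area)
    end
  end
end

radius = 1000
exact = 4 * Math::PI * radius**2
tolerance = Universe::Planet::TOLERANCE

# One entry per test: happy path, sad path, and both ends of the range.
checks = {
  'happy path: exact sphere' =>
    Universe::Planet.new(radius, exact).spherical? == true,
  'sad path: far from a sphere' =>
    Universe::Planet.new(radius, exact * 2).spherical? == false,
  'low end of the range is included' =>
    Universe::Planet.new(radius, exact - tolerance).spherical? == true,
  'high end of the range is included' =>
    Universe::Planet.new(radius, exact + tolerance).spherical? == true
}

checks.each { |name, ok| puts "#{ok ? 'pass' : 'FAIL'}: #{name}" }
```

The two boundary checks are the ones a line-coverage tool would never ask for, since the happy path alone already executes every line.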
  • 22:01 - 22:08
    OK. So I'm gonna start by just making a
  • 22:08 - 22:10
    Gemfile, as you do. So let me, I can
  • 22:10 - 22:16
    just sort of show. It's a very simple layout
  • 22:16 - 22:19
    so far. I have a lib directory, which contains
  • 22:19 - 22:23
    universe.rb, which you've all seen. And
    a
  • 22:23 - 22:26
    spec directory which is empty. So, very little
    up
  • 22:26 - 22:28
    my sleeve at this point.
  • 22:28 - 22:35
    I'm just gonna make a Gemfile, as you do.
  • 22:41 - 22:43
    And at this point I'm just gonna add rspec,
  • 22:43 - 22:46
    cause I'm starting to write some tests, and
    I'm
  • 22:46 - 22:48
    gonna add mutant.
  • 22:48 - 22:55
    OK. So, and we'll bundle install. Ah. Cool.
    It
  • 22:59 - 23:03
    just installed that new version of mutant
    that was
  • 23:03 - 23:06
    just released moments ago. Good. Let me just
    see
  • 23:06 - 23:12
    what Ruby version I'm on. OK. That should
    be
  • 23:12 - 23:13
    fine.
  • 23:13 - 23:15
    So. Let's write some specs. So we have the
  • 23:15 - 23:21
    spec directory. Let's write
    planet_spec.rb. And we'll
  • 23:21 - 23:26
    require rspec and we'll require our planet
    file. I'll
  • 23:26 - 23:29
    just use require_relative for that, rather
    than messing with
  • 23:29 - 23:31
    the load path or anything. So that's up a
  • 23:31 - 23:36
    directory in lib and I think it's called universe.
  • 23:36 - 23:41
    And now let's start writing our specs, right.
    So
  • 23:41 - 23:48
    we're just gonna describe our planet in our
    universe
  • 23:48 - 23:55
    module. And. So let's create a subject, which
    is
  • 23:59 - 24:02
    just gonna be our planet. That's like the
    main
  • 24:02 - 24:05
    thing that we're gonna be testing here. And
    it's
  • 24:05 - 24:08
    initialized with a radius and an area. I believe
  • 24:08 - 24:12
    in that order. Yup.
  • 24:12 - 24:19
    Cool. So let's create a context. And let's
    do
  • 24:20 - 24:22
    the happy path first, because that was kind
    of,
  • 24:22 - 24:24
    like, we all agreed that the first path we
  • 24:24 - 24:28
    should write was the happy path. So in this
  • 24:28 - 24:32
    case, Venus is actually the happy path. Venus
    is
  • 24:32 - 24:37
    pretty darn close to spherical. So in this
    case
  • 24:37 - 24:44
    we'll define the radius to be. Oops. Cool.
  • 25:01 - 25:05
    And I think I said it's in kilometers, yeah?
  • 25:05 - 25:11
    So it'll be that. And then the area will
  • 25:11 - 25:18
    be. Eh, let's see. Wikipedia. OK. So the surface
  • 25:27 - 25:34
    area is, what is that? Four-hundred sixty
    million? Which
  • 25:34 - 25:37
    is OK. But actually, like, I would like a
  • 25:37 - 25:40
    more precise number, because, like, I don't
    want to
  • 25:40 - 25:42
    crank up our tolerance to some ridiculous
    value to
  • 25:42 - 25:45
    make this true. So I actually found a more
  • 25:45 - 25:48
    precise number than the one that's on Wikipedia,
    which
  • 25:48 - 25:52
    is this. So it's four-hundred sixty million,
    two hundred
  • 25:52 - 25:55
    sixty-four thousand, seven-hundred forty.
    Which is, you know, pretty
  • 25:55 - 25:58
    round number still, but it's more precise
    than the
  • 25:58 - 26:01
    one on Wikipedia.
  • 26:01 - 26:03
    And now we'll have our assertion. So we'll
    just
  • 26:03 - 26:10
    say it's spherical. Venus is spherical. We
    expect our
  • 26:10 - 26:17
    subject to be spherical. Good? Is everyone
    satisfied? Do
  • 26:19 - 26:21
    I, like, if people see bugs, call them out.
  • 26:21 - 26:23
    Like, does this look like a good happy path
  • 26:23 - 26:28
    test? Yes? This will pass?
  • 26:28 - 26:35
    Good. Let's run it. Yup. That should work.
    Cool.
  • 26:40 - 26:43
    It passed. Hooray.
  • 26:43 - 26:45
    Let's do something else. Let's open up our
    Gemfile
  • 26:45 - 26:49
    again and add simplecov to measure the C-0
    code
  • 26:49 - 26:56
    coverage. And I guess here we can just say
  • 26:57 - 27:03
    require simplecov. SimpleCov.start. And so
    now, if we run
  • 27:03 - 27:10
    our specs again, we'll get a little coverage
    report.
  • 27:11 - 27:17
    Tada!
  • 27:17 - 27:19
    So for those who aren't that familiar with
    simplecov,
  • 27:19 - 27:24
    basically it looks to make sure that your,
    that
  • 27:24 - 27:26
    every line of code is executed, and if you
  • 27:26 - 27:30
    test the happy path, it totally is, right?
    The
  • 27:30 - 27:34
    class, the module is loaded, the class is
    loaded,
  • 27:34 - 27:41
    this constant is set. We initialize. We initialize
    a
  • 27:41 - 27:46
    planet. I can turn on lines. We initialize
    a
  • 27:46 - 27:51
    planet on line nine. We invoke this spherical
    method
  • 27:51 - 27:56
    on line fifteen, in the assertion. And that
    invokes
  • 27:56 - 27:59
    the range method. So we have, you can actually
  • 27:59 - 28:02
    see every line of code is executed precisely
    one
  • 28:02 - 28:04
    time.
  • 28:04 - 28:07
    So we have, we're not over-testing. We're
    not under-testing.
  • 28:07 - 28:11
    We have perfect, a hundred percent C-zero
    code coverage.
  • 28:11 - 28:14
    But we all agreed that this was completely
    insufficient.
  • 28:14 - 28:15
    So-
  • 28:15 - 28:16
    AUDIENCE: Ship it.
  • 28:16 - 28:19
    E.M.: Ship it. K. Right.
  • 28:19 - 28:21
    AUDIENCE: Force push.
  • 28:21 - 28:27
    E.M.: I'm gonna delete this simplecov stuff
    cause it's
  • 28:27 - 28:29
    garbage.
  • 28:29 - 28:34
    OK. So let's write some more tests. So a
  • 28:34 - 28:41
    planet that's not spherical is. No. That's
    my name.
  • 28:41 - 28:48
    Thank you. Is our home. The earth. Radius
    of
  • 28:53 - 29:00
    the earth. Cool. I guess we could say point
  • 29:06 - 29:13
    one. Doesn't really matter. And. Oops. What's
    the area?
  • 29:22 - 29:29
    Cool. So in square kilometers, it's five-hundred
    ten. Five-hundred
  • 29:32 - 29:37
    ten million, rather.
  • 29:37 - 29:38
    So we, again, we could, like, try to find
  • 29:38 - 29:40
    a number that's more precise, but we actually,
    like,
  • 29:40 - 29:43
    the whole point of this test is to test
  • 29:43 - 29:45
    a planet that is an oblate spheroid, not an
  • 29:45 - 29:48
    actual sphere. And so in this case, we want
  • 29:48 - 29:51
    to, so like, it's fine that the numbers are
  • 29:51 - 29:56
    not within the default tolerance. And so,
    yeah. Basically
  • 29:56 - 30:00
    we want to say, like, it is oblate. Not
  • 30:00 - 30:06
    spherical.
  • 30:06 - 30:11
    So in this case, we expect our subject not
  • 30:11 - 30:18
    to be spherical. Cool. Look good? Let's run
    it.
  • 30:22 - 30:28
    Cool. Our tests pass.
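The arithmetic behind the Earth example can be sketched as follows. The radius of 6378.1 km is an assumption (the talk only mentions "point one"), and the 0.1% tolerance is the same assumed default as in the hypothetical class above; the rounded area of 510 million square kilometers is from the talk.

```ruby
# Assumed inputs: equatorial radius in km, rounded area in square km.
radius = 6_378.1
measured_area = 510_000_000

# Surface area of a perfect sphere with that radius.
sphere_area = 4 * Math::PI * radius**2

# How far the measured area is from the sphere area, as a fraction.
relative_gap = (sphere_area - measured_area).abs / sphere_area

puts format('sphere area:  %.0f km^2', sphere_area)
puts format('relative gap: %.4f', relative_gap)
puts "outside an assumed 0.1% tolerance: #{relative_gap > 0.001}"
```

Under these assumptions the gap is larger than the tolerance, which is why the rounded area is good enough for a "not spherical" example.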
  • 30:28 - 30:32
    So this is, like, maybe your normal workflow.
    You
  • 30:32 - 30:33
    would do this. A few of you would stop
  • 30:33 - 30:35
    at this point. I think there were probably
    as
  • 30:35 - 30:37
    many hands for, like, I would stop at two,
  • 30:37 - 30:39
    or probably more tests, for like, I would
    stop
  • 30:39 - 30:42
    at two, than I would stop at four or
  • 30:42 - 30:45
    three. But let me show, let me show what
  • 30:45 - 30:46
    mutant does.
  • 30:46 - 30:48
    Let me show sort of how this mutation testing
  • 30:48 - 30:54
    stuff works. So you're gonna say bundle exec.
    Or,
  • 30:54 - 30:58
    I have it aliased to b-e. I can spell
  • 30:58 - 31:02
    that out. So this is the mutant command line,
  • 31:02 - 31:05
    and it takes a bunch of arguments. So you
  • 31:05 - 31:07
    have to give it a lib for the sort
  • 31:07 - 31:09
    of lib directory that you're testing so that
    it
  • 31:09 - 31:12
    knows to add that to the load path. And
  • 31:12 - 31:16
    then you give it a require. So it's gonna
  • 31:16 - 31:20
    require some specific library, in this case
    the universe
  • 31:20 - 31:24
    library that you wrote. And then you can say,
  • 31:24 - 31:26
    like, I want to test everything in universe,
    or
  • 31:26 - 31:30
    you can say, like, with wild cards like colon
  • 31:30 - 31:32
    colon universe star. I can make that a little
  • 31:32 - 31:34
    smaller so it fits on one line.
  • 31:34 - 31:37
    Or you can say, like, I want to test
  • 31:37 - 31:41
    specifically the planet class, or you say,
    like, I
  • 31:41 - 31:42
    want to test a particular method. So you can
  • 31:42 - 31:45
    say, like, I want to test spherical. Something
    like
  • 31:45 - 31:47
    that. Right. But we want to test the whole
  • 31:47 - 31:48
    planet class.
  • 31:48 - 31:51
    Oh, and you also, there's an option to say
  • 31:51 - 31:54
    use rspec, so it knows what test framework
    to
  • 31:54 - 31:59
    run. This is important, because it's testing
    your tests.
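The invocations described above can be sketched by assembling the command in Ruby. The flag names (`--include`, `--require`, `--use`) are what the 2014-era mutant command line accepted as far as the talk describes it; newer releases may differ, so treat them as assumptions and check `mutant --help` for the installed version.

```ruby
# Base command: add lib to the load path, require the library, use RSpec.
command = %w[bundle exec mutant --include lib --require universe --use rspec]

# The three kinds of subject expression mentioned in the talk.
subjects = {
  everything: '::Universe*',
  one_class: '::Universe::Planet',
  one_method: '::Universe::Planet#spherical?'
}

subjects.each_value do |subject|
  puts((command + [subject]).join(' '))
end
```

When typing these in a shell, quoting the subject keeps characters such as `*` and `?` from being expanded as file globs.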
  • 31:59 - 32:01
    And I am getting some sort of an error.
  • 32:01 - 32:05
    Ah. I am missing mutant-rspec in my gemfile.
    That
  • 32:05 - 32:10
    is easy to fix. Right. So.
  • 32:10 - 32:12
    Rspec used to be built in. This has changed
  • 32:12 - 32:16
    recently. So basically there are other libraries.
    There's like
  • 32:16 - 32:18
    plugin library. So if you want to write, if
  • 32:18 - 32:21
    you use some crazy test-framework, you can
    just write
  • 32:21 - 32:23
    a gem that adds mutant support for that test
  • 32:23 - 32:26
    framework. So this happens to be the one for
  • 32:26 - 32:29
    rspec. But you can use one for test-unit or
  • 32:29 - 32:30
    anything else.
  • 32:30 - 32:33
    So. BI is just a short-cut for bundle install.
  • 32:33 - 32:38
    And we'll do this. Cool.
  • 32:38 - 32:40
    So what it is doing, you're like, what, this
  • 32:40 - 32:43
    is crazy. We only wrote two tests. Why are
  • 32:43 - 32:45
    there all those little green dots and Fs flying
  • 32:45 - 32:52
    by? So basically what's happening is we, it's
    taking
  • 32:54 - 32:59
    our two tests and it's running through these
    various
  • 32:59 - 33:01
    mutations. In this case, it made eighty-three
    mutations to
  • 33:01 - 33:05
    our code, based on what we used, right. Like,
  • 33:05 - 33:08
    so, depending on, like, if you use an and,
  • 33:08 - 33:09
    it will convert it to an or. But if
  • 33:09 - 33:12
    you don't use that, it can't do that
  • 33:12 - 33:12
    mutation.
  • 33:12 - 33:17
    So, in this case, there was eighty-three mutations.
    Eighty-three
  • 33:17 - 33:19
    sort of mutants. And eighty-two of those mutants
    were
  • 33:19 - 33:24
    killed. So there, in this case, was one that
  • 33:24 - 33:27
    was not. And you get this really cool output,
  • 33:27 - 33:31
    diff output. So it basically says, this is
    the
  • 33:31 - 33:34
    mutation we did that was not killed. We took,
  • 33:34 - 33:40
    what is it, line twenty-four? Was that? Is
    there
  • 33:40 - 33:45
    a comment? We took line twenty-five, right,
    this range
  • 33:45 - 33:50
    method, and we deleted the code that you wrote
  • 33:50 - 33:52
    and we mutated it in this way. We got
  • 33:52 - 33:54
    rid of that minus T. And it turned out
  • 33:54 - 33:57
    that even after we made that mutation, all
    of
  • 33:57 - 34:00
    your tests still passed.
  • 34:00 - 34:03
    Actually, maybe it would be helpful, like,
    I can
  • 34:03 - 34:10
    show with earth. So before we do earth, this
  • 34:10 - 34:12
    is what the mutation output would look like.
    Right.
  • 34:12 - 34:13
    So. I just want to give you a sense
  • 34:13 - 34:16
    of, like, all the different mutations and
    kind of
  • 34:16 - 34:18
    how they work and what the output looks like.
  • 34:18 - 34:20
    So if we don't have the sort of unhappy
  • 34:20 - 34:23
    path where it returns false, these are the
    various
  • 34:23 - 34:26
    mutations it runs. So there was this one,
    which
  • 34:26 - 34:27
    we saw earlier, where it removes the minus
    T
  • 34:27 - 34:30
    from the range and it still passes because
    we're
  • 34:30 - 34:33
    sort of in the top half of that range.
  • 34:33 - 34:36
    There's this other one where it gets rid of
  • 34:36 - 34:38
    the n, so the beginning part of the range,
  • 34:38 - 34:40
    and it just puts in t there.
  • 34:40 - 34:44
    Here, it actually gets rid of that call to
  • 34:44 - 34:48
    dot cover, and it turns out that, because
    the
  • 34:48 - 34:50
    range returns true and you haven't put in
    a
  • 34:50 - 34:53
    thing that says it should return false, that
    this
  • 34:53 - 34:56
    also passes, right. So, in this case, you're
    just
  • 34:56 - 34:59
    returning the range. But that is truthy. And
    so
  • 34:59 - 35:03
    this, this test still passes.
  • 35:03 - 35:05
    If you wanted to write a more precise test,
  • 35:05 - 35:08
    instead of saying. No, I guess that's right.
    So,
  • 35:08 - 35:12
    in this case it's just gonna check whether
    that
  • 35:12 - 35:14
    method is truthy or falsey, and in this case
  • 35:14 - 35:16
    it's truthy if it just returns the range.
    Right?
  • 35:16 - 35:18
    And you're not testing that it would ever
    be
  • 35:18 - 35:20
    falsey.
  • 35:20 - 35:24
    Also, if you just return the instance variable
    area,
  • 35:24 - 35:27
    so if you basically throw away everything
    except that
  • 35:27 - 35:31
    last argument to the cover method, this turns
    out
  • 35:31 - 35:35
    to also, like, you have no tests that covers
  • 35:35 - 35:39
    this. And actually you can delete that whole
    line,
  • 35:39 - 35:42
    and the previous line, approximate area, like
    you get
  • 35:42 - 35:44
    the same result. Like, the fact that you have
  • 35:44 - 35:46
    an approximate area and that is truthy and
    you
  • 35:46 - 35:48
    are only testing that this method returns
    a truthy
  • 35:48 - 35:52
    value means that this test will pass.
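The idea that a mutant "survives" when every test still passes against it can be sketched with a tiny hand-rolled harness. This is not how mutant itself works internally; the lambdas stand in for three of the mutations described above, and the method body, radii, and 0.1% tolerance are the same assumptions as in the earlier sketches.

```ruby
# Sphere area n and absolute half-width t of the tolerance band.
def band(radius, tolerance)
  n = 4 * Math::PI * radius**2
  [n, n * tolerance]
end

# Variants of the assumed spherical? body: the original plus three mutants.
variants = {
  'original' => ->(r, a, tol) { n, t = band(r, tol); ((n - t)..(n + t)).cover?(a) },
  'drop - t' => ->(r, a, tol) { n, t = band(r, tol); (n..(n + t)).cover?(a) },
  'drop .cover?' => ->(r, a, tol) { n, t = band(r, tol); (n - t)..(n + t) },
  'return @area' => ->(_r, a, _tol) { a }
}

# Each test returns true when it passes against an implementation.
# Like be_spherical, they only look at truthiness.
venus = ->(impl) { impl.call(6_051.8, 460_264_740, 0.001) ? true : false }
earth = ->(impl) { impl.call(6_378.1, 510_000_000, 0.001) ? false : true }

# A mutant survives when every test still passes against it.
def survivors(variants, tests)
  variants.reject { |name, _| name == 'original' }
          .select { |_, impl| tests.all? { |test| test.call(impl) } }
          .keys
end

puts "Venus only:      #{survivors(variants, [venus]).inspect}"
puts "Venus and Earth: #{survivors(variants, [venus, earth]).inspect}"
```

With only the happy-path test, all three mutants survive; adding the Earth test kills the two that always return something truthy and leaves the lower-bound mutant alive.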
  • 35:52 - 35:59
    So I just wanted to show that. I can
  • 35:59 - 36:02
    bring this back. Cool.
  • 36:02 - 36:05
    So now we're in a place where, oops. OK.
  • 36:05 - 36:12
    So our tests will pass. And we have one
  • 36:12 - 36:16
    mutant that we need to kill. So does anyone
  • 36:16 - 36:19
    have an idea for how to kill this mutant?
  • 36:19 - 36:26
    AUDIENCE: Pass in a tolerance. [indecipherable
    - 00:36:27] Pass
  • 36:27 - 36:29
    in zero tolerance.
  • 36:29 - 36:33
    E.M.: So the suggestion was to pass in a
  • 36:33 - 36:36
    zero tolerance. So let's try that. So should
    I
  • 36:36 - 36:40
    just, should we make up a planet or, how
  • 36:40 - 36:41
    do you want to do that? We could do
  • 36:41 - 36:42
    Mars, maybe?
  • 36:42 - 36:44
    AUDIENCE: Venus shouldn't be spherical with
    a tolerance of
  • 36:44 - 36:46
    E.M.: Ah. Venus shouldn't be spherical with
    a tolerance
  • 36:46 - 36:51
    of zero. So that's true. So we can sort
  • 36:51 - 36:54
    of change this one to be, it is spherical,
  • 36:54 - 36:55
    given the default tolerance.
  • 36:55 - 36:57
    AUDIENCE: Yes.
  • 36:57 - 37:01
    E.M.: That's what that tests. Right. It's
    spherical-ish. I
  • 37:01 - 37:04
    like that. Ish.
  • 37:04 - 37:11
    But is not perfectly spherical. And so here
    we
  • 37:16 - 37:18
    would expect this not to be spherical, given
    a
  • 37:18 - 37:22
    tolerance of zero. Yeah? So let's first run
    that
  • 37:22 - 37:29
    test. Cool. So that passes. It is not perfectly
  • 37:30 - 37:35
    spherical, and it is spherical-ish. We didn't
    break that
  • 37:35 - 37:38
    test. OK, so now let's do the same thing
  • 37:38 - 37:45
    with our mutant command.
  • 37:45 - 37:51
    So the mutant still lives. Why?
  • 37:51 - 37:53
    So to make this fail, what we need to
  • 37:53 - 37:56
    do is we need to pass in a tolerance
  • 37:56 - 37:59
    that falls in the bottom half of the range.
  • 37:59 - 38:03
    So, in this case, Venus is slightly, the area
  • 38:03 - 38:09
    of Venus is slightly above the perfect sphericism
    or
  • 38:09 - 38:14
    whatever, right. It's not, it's on the high-end
    of
  • 38:14 - 38:16
    the range. So what we need to do is
  • 38:16 - 38:20
    we need to find a planet that is actually
  • 38:20 - 38:22
    on the low-end of the range, right, where
    it's
  • 38:22 - 38:29
    less. It's spherical, but within the tolerance,
    but it's,
  • 38:29 - 38:34
    yeah. On the low-end of the range. Make sense?
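Why the zero-tolerance test cannot kill this mutant can be sketched by comparing the original range with the mutated one for the Venus numbers. The radius and the 0.1% default are assumptions carried over from the earlier sketches; `t` here is the absolute half-width of the band.

```ruby
n = 4 * Math::PI * 6_051.8**2 # sphere area for Venus (radius assumed)
area = 460_264_740            # measured area from the talk, slightly above n

original = ->(t) { ((n - t)..(n + t)).cover?(area) }
mutant = ->(t) { (n..(n + t)).cover?(area) } # lower bound lost its "- t"

[0, n * 0.001].each do |t|
  puts "t=#{t.round}: original=#{original.call(t)} mutant=#{mutant.call(t)}"
end
```

Because the area sits above `n`, both versions agree at every tolerance, so no test built on Venus alone can tell them apart. Only an area below `n` but above `n - t` exposes the difference.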
  • 38:34 - 38:40
    So yeah. I don't know. Like, what we could
  • 38:40 - 38:43
    do to test, like, we could, I, I don't
  • 38:43 - 38:45
    want to necessarily like look up more planets
    and
  • 38:45 - 38:49
    their radiuses. But we could do something
    like this.
  • 38:49 - 38:53
    So this is, sorry, that's not earth. This
    is,
  • 38:53 - 38:54
    like.
  • 38:54 - 38:57
    AUDIENCE: Rubinius 5.
  • 38:57 - 39:00
    E.M.: Rubinius 5. I like that. Thank you for
  • 39:00 - 39:05
    the suggestion from the audience. And Rubinius
    5. Let's
  • 39:05 - 39:07
    sort of make it easy for ourselves. So we'll
  • 39:07 - 39:14
    say the radius is zero point five, right.
    So
  • 39:14 - 39:17
    if we put that in our formula, zero point
  • 39:17 - 39:21
    five squared is a quarter, and then a quarter,
  • 39:21 - 39:26
    when it sort of cancels out the multiply by
  • 39:26 - 39:29
    four. You div, you're dividing by four basically.
    So
  • 39:29 - 39:33
    the, we know that the actual area should be
  • 39:33 - 39:36
    pi. So then we can just say something like,
  • 39:36 - 39:44
    let the area be Math::PI. And we want it
  • 39:45 - 39:48
    to fall, we want the area to be at the low end of
  • 39:48 - 39:49
    the range. Right, so we want it to be
  • 39:49 - 39:52
    like, Math::PI minus, like, some amount that
    falls within
  • 39:52 - 39:57
    the tolerance or whatever. Right? Make sense?
  • 39:57 - 40:02
    And then we expect that this is gonna be
  • 40:02 - 40:09
    spherical. Ish. Within the default tolerance.
    Cool. OK. So
  • 40:20 - 40:26
    let's run that. Specs pass. And have we killed
  • 40:26 - 40:33
    the last mutant? Nice. Yeah.
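The Rubinius 5 example can be sketched as below. With a radius of 0.5 the sphere area is 4 * pi * 0.25, which is pi, so an area a little under pi lands in the lower half of the band. The 0.1% tolerance and the choice of half the band width as the offset are assumptions; the talk only says "some amount that falls within the tolerance".

```ruby
radius = 0.5
n = 4 * Math::PI * radius**2 # 4 * pi * 0.25, which is pi
t = n * 0.001                # assumed default tolerance (0.1%)
area = Math::PI - t / 2      # just below the sphere area, inside the band

original = ((n - t)..(n + t)).cover?(area)
mutant = (n..(n + t)).cover?(area) # lower bound lost its "- t"

puts "original=#{original} mutant=#{mutant}"
```

The original still reports the planet as spherical while the mutated range does not, so a test expecting "spherical" now fails against the mutant, which is what kills it.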
  • 40:33 - 40:37
    Yeah! So.
Duration:
41:02
