5.8 - 5.11 - Coverage, Unit vs. Integration Tests, Other Testing Concepts, and Perspectives
So we've spent a bunch of time in the last couple of lectures talking about different kinds of testing: unit testing versus integration testing. We talked about how you use RSpec to really isolate the parts of your code you want to test. And because of Homework 3 and other work, you've also been doing BDD, using Cucumber to turn user stories into, essentially, integration and acceptance tests. So you've seen testing at a couple of different levels, and the goal here is to make a few remarks, back up a little bit, see the big picture, and tie those things together. This spans material that covers three or four sections in the book, and I want to just hit the high points in lecture.

A question that comes up, and I'm sure it's come up for all of you as you've been doing the homework, is: "How much testing is enough?"
Sadly, for a long time, if you asked this question in industry, the answer was basically: "Well, we have a shipping deadline, so however much testing we can do before that deadline, that's how much." That's what you have time for. That's a little flip, and obviously not very good. You can do a bit better. There are some static measures, like how many lines of code your app has versus how many lines of tests. It's not unusual in industry, in a well-tested piece of software, for the number of lines of tests to go far beyond the number of lines of code; integer multiples are not unusual. And I think even for research code or classwork, a ratio of maybe 1.5 is not unreasonable: one and a half times as much test code as application code. In a lot of production systems where they really care about testing, it's much higher than that.

So maybe a better question to ask, rather than "How much testing is enough?", is "How good is the testing I am doing now? How thorough is it?"
Later in the semester, Professor Sen will talk a little bit about formal methods and what's at the frontiers of testing and debugging. We've been saying all along that formal methods don't really work on big systems, but in my personal opinion that statement is actually a lot less true than it used to be. I think there are a number of specific places, especially in testing and debugging, where formal methods are making fast progress, and Koushik Sen is one of the leaders in that, so you'll have the opportunity to hear more about it later. But a couple of things we can talk about based on what you already know are some basic concepts of test coverage. For the moment, the bread and butter is coverage measurement, because this is where the rubber meets the road in terms of how you'd be evaluated if you were doing this for real.
So what are the basics? Here's a really simple class we can use to talk about different ways to measure how our tests cover code. There are a few different levels, with terminology that isn't really universal across software houses, but one common set of terms, which the book uses, goes like this. S0 means you've called every method at least once: if you call foo and you call bar, you're done. That's S0 coverage, and it's not terribly thorough. A little more stringent is S1: calling every method from every place it could be called. What does that mean? It means, for example, that it's not enough to call bar; you have to call it at least once from inside this class, as well as at least once from any exterior function that might call it. C0, which is what SimpleCov measures (for those of you who've gotten SimpleCov up and running), basically says you've executed, touched, every statement in your code at least once. The caveat is that a conditional counts as a single statement: no matter which branch of an "if" you took, as long as you touched one of the branches, you've executed the "if" statement. So even C0 is still fairly superficial coverage.
But, as we'll see, the way you want to read this information is: if you're getting bad coverage at the C0 level, then you have really, really bad coverage. If you're not meeting even this superficial level, your testing is probably deficient. C1 is the next step up from that: you have to take every branch in both directions. So for the "if" statement in the example, we have to do the "if x" part at least once and the "if not x" part at least once to meet C1. You can augment that with decision coverage: if we have "if" statements whose condition is made up of multiple terms, we have to make sure every subexpression has been evaluated in both directions. In other words, if we're going to fail this "if" statement, we have to fail it at least once because y was false and at least once because z was false. Any subexpression that could independently change the outcome of the condition has to be exercised in both directions. And then there's the one a lot of people aspire to, C2, although there's disagreement on how much more valuable it is: taking every path through the code. Obviously this is difficult, because the number of paths tends to be exponential in the number of conditions, and in general it's difficult to evaluate whether you've taken every path through the code. There are formal techniques you can use to tell you where the holes are, but the bottom line is that in most commercial software houses there is, I would say, not complete consensus on how much more valuable C2 is compared to C0 or C1.
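As a concrete illustration of these levels, here is a small hypothetical method (the slide's actual example isn't in the transcript) whose condition has multiple terms; the comments sketch which calls each coverage level demands:

```ruby
# Hypothetical example class for illustrating coverage levels.
class Gate
  # A condition with multiple subexpressions, to show decision coverage.
  def admit?(x, y, z)
    if x && (y || z)
      :admitted
    else
      :denied
    end
  end
end

g = Gate.new
# C0 (statement coverage): touching the "if" at all counts, e.g. one call:
g.admit?(true, true, false)
# C1 (branch coverage): take the branch both ways:
g.admit?(true, true, false)   # then-branch
g.admit?(false, true, true)   # else-branch
# Decision coverage: each subexpression must independently flip the outcome:
g.admit?(true, false, false)  # fails because y and z are both false...
g.admit?(false, true, false)  # ...and fails because x is false
```

Note that the two C1 calls already satisfy C0, and the decision-coverage calls subsume C1: each stronger level implies the weaker ones.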
So, for the purposes of our class, you get exposed to the idea of how to use coverage information. SimpleCov takes advantage of some built-in Ruby features to give you C0 coverage, and it produces really nice reports: you can see your coverage at the level of individual lines in each file, and I think that's a good start for where we are.
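For reference, a minimal SimpleCov setup looks roughly like this (a sketch, assuming the simplecov gem is in your Gemfile); the `SimpleCov.start` call has to run before your application code is loaded, typically at the top of `spec/spec_helper.rb`:

```ruby
# spec/spec_helper.rb (sketch)
require 'simplecov'
SimpleCov.start 'rails' do
  add_filter '/spec/'   # don't count the tests themselves toward coverage
end
```

After a test run, the per-line C0 report lands in `coverage/index.html`.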
So, having seen different flavours of tests, let's step back and look at the big picture: what are the different kinds of tests we've seen concretely, and what are the tradeoffs between using them? At the level of individual classes or methods, we used RSpec, with extensive use of mocking and stubbing. Testing methods in the model, for example, is unit testing. We also did something pretty similar to functional or module testing, where more than one module participates. For example, in controller specs we simulate a POST action, but remember that the POST has to go through the routing subsystem before it gets to the controller, and once the controller is done it will try to render a view. So there are in fact other pieces that collaborate with the controller and that have to be working in order for controller specs to pass. That's somewhere in between: we're exercising more than a single method and touching more than a single class, but we're still concentrating our attention on a fairly narrow slice of the system at a time, and we're still using mocking and stubbing extensively to isolate the behaviour we want to test. And then, at the level of Cucumber scenarios, these are more like integration or system tests. They exercise complete paths through the application, they probably touch a lot of different modules, and they make minimal use of mocks and stubs, because part of the goal of an integration test is exactly to test the interaction between pieces. You don't want to stub or control those interactions; you want to let the system do what it would really do if this scenario were happening in production.
So how do these kinds of tests compare? There are a few different axes we can look at. One is how long they take to run. Now, both RSpec and Cucumber have fairly high startup times, but as you'll see, as you add more and more RSpec tests and use autotest to run them in the background, RSpec by and large runs specs really fast once it gets off the launching pad, whereas running Cucumber features just takes a long time, since Cucumber essentially fires up your entire application. Later in the semester we'll see a way to make Cucumber even slower: having it fire up an entire browser and basically act like a puppeteer, remote-controlling Firefox, so you can test JavaScript code. I think we'll be able to work with our friends at Sauce Labs so you can do that in the cloud; that will be exciting. So: "runs fast" versus "runs slow".

The next axis is resolution. If an error happens in your unit tests, it's usually pretty easy to track down the source of that error, because the tests are so isolated: you've stubbed out everything that doesn't matter and you're focusing only on the behaviour of interest. If you've done a good job of that, when something goes wrong in one of your tests, there aren't a lot of places it could have gone wrong. In contrast, if you're running a Cucumber scenario with ten steps, each touching a whole bunch of pieces of the app, it can take a long time to actually get to the bottom of a bug. So there's a tradeoff in how well you can localize errors.
Another axis is coverage. If you write a good suite of unit and functional tests, you can get really high coverage: you can run your SimpleCov report, identify specific lines in your files that have not been exercised by any test, and then go write tests that cover them. So figuring out how to improve your coverage, for example at the C0 level, is much more easily done with unit tests. With a Cucumber scenario, you're touching a lot of parts of the code, but you're touching them very sparsely. If your goal is to get your coverage up, use the tools at the unit level, so that you can focus on understanding which parts of your code are undertested, and then write very targeted tests that focus on exactly those parts.
Putting those pieces together: unit tests, because of their isolation and fine resolution, tend to use a lot of mocks to isolate the behaviours you don't care about. But that means, by definition, you're not testing the interfaces, and it's received wisdom in software that a lot of the interesting bugs occur at the interfaces between pieces, not within a class or within a method; those are the easy bugs to track down. At the other extreme, the closer you get to integration testing, the less you're supposed to rely on mocks, for exactly that reason. Now, as we saw, if you're testing something like a service-oriented architecture, where you have to interact with a remote site, you still end up having to do a fair amount of mocking and stubbing so that you don't rely on the Internet in order for your tests to pass.
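As a sketch of that kind of stubbing (the class and method names here are hypothetical; in the course you'd typically use RSpec doubles or a tool like WebMock), you can inject a fake HTTP client so the test never touches the network:

```ruby
require 'json'

# Service wrapper: real code would use Net::HTTP against a remote site.
class MovieInfo
  def initialize(http_client)
    @http = http_client   # seam: the client is injected, so tests can fake it
  end

  def rating(title)
    body = @http.get("/movies?title=#{title}")
    JSON.parse(body)['rating']
  end
end

# A stub standing in for the network, returning a canned response.
class FakeHttpClient
  def get(_path)
    '{"rating": "PG"}'
  end
end

info = MovieInfo.new(FakeHttpClient.new)
info.rating('Up')   # => "PG", with no Internet required
```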
But generally speaking, you're trying to remove as many of the mocks as you can and let the system run the way it would run in real life. So the good news is that you're testing the interfaces; the bad news is that when something goes wrong in one of the interfaces, because your resolution is not as good, it may take longer to figure out what it is. The high-order bit of this tradeoff is that you don't really want to rely too heavily on any one kind of test. They serve different purposes, and whether you're trying to exercise your interfaces more or to improve your fine-grained coverage affects how you develop your test suite and how you evolve it along with your software.
So, we've used a certain set of terminology in testing. It's the terminology that, by and large, is most commonly used in the Rails community, but there's some variation, and there are other terms you might hear if you go get a job somewhere. One is mutation testing, which we haven't done. This is an interesting idea that was, I think, popularized by Ammann and Offutt, who have what is arguably the definitive book on software testing. The idea is: suppose I introduce a deliberate bug into my code; does that force some test to fail? Because if I change "if x" to "if not x" and no test fails, then either I'm missing some coverage, or my app is very strange and somehow nondeterministic.
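A tiny hand-made illustration of the idea (hypothetical method names; real mutation tools generate the mutants automatically): a method, a mutant of it with the condition negated, and a check precise enough to "kill" the mutant:

```ruby
# Original method under test.
def discount_rate(member)
  member ? 0.10 : 0.0
end

# A mutant: the same method with the condition deliberately negated.
def mutant_discount_rate(member)
  !member ? 0.10 : 0.0
end

# A test strong enough to kill the mutant: it passes on the original
# implementation and fails on the mutated one.
def kills_mutant?(impl)
  send(impl, true) == 0.10 && send(impl, false) == 0.0
end

kills_mutant?(:discount_rate)        # => true  (original passes the test)
kills_mutant?(:mutant_discount_rate) # => false (mutant is detected)
```

If `kills_mutant?` returned true for the mutant as well, that would be the signal the speaker describes: the test suite has a coverage hole around that condition.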
Fuzz testing, which Koushik Sen may talk more about, is basically the "10,000 monkeys at typewriters" approach: throwing random input at your code. What's interesting about it is that the tests we've been writing are essentially crafted to test the app the way it was designed to be used, whereas fuzz testing is about testing the app in ways it wasn't meant to be used. What happens if you throw enormous form submissions at it? What happens if you put control characters in your forms? What happens if you submit the same thing over and over? Koushik has a statistic that Microsoft finds up to 20% of their bugs using some variation of fuzz testing, and that about 25% of the common Unix command-line programs can be made to crash when put through aggressive fuzz testing.
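A minimal sketch of the technique (the target function here is hypothetical): hammer a routine with random byte strings, including huge inputs and control characters, and watch for unexpected exceptions rather than checking specific answers:

```ruby
# Hypothetical input-handling routine we want to fuzz. Its contract:
# malformed input maps to 0 rather than crashing.
def parse_quantity(raw)
  s = raw.to_s.strip
  Integer(s, 10)
rescue ArgumentError
  0
end

# Dumb fuzzer: random binary strings of random length (0 to 5000 bytes),
# deliberately including control characters and non-ASCII bytes.
rng = Random.new(42)   # fixed seed so a crash is reproducible
1000.times do
  len = rng.rand(0..5000)
  junk = Array.new(len) { rng.rand(0..255).chr }.join
  parse_quantity(junk)   # any uncaught exception here is a fuzzing find
end
```

Note the fixed seed: when the fuzzer does surface a crash, you want to be able to replay the exact input that caused it.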
Defining-use coverage, sometimes called DU-coverage, is something we haven't done, but it's another interesting concept. The idea is that at any point in my program there's a place where I define, that is, assign, a value to some variable, and then there's a place downstream where presumably someone is going to consume that value. Have I covered every pair? In other words, do I have tests in which every pair of defining a variable and using it somewhere is executed at some point in my test suite?
Other terms that I think are not as widely used anymore are blackbox versus whitebox, or blackbox versus glassbox. Roughly, a blackbox test is one written from the point of view of the external specification of the thing. For example: "This is a hash table. When I put in a key, I should get back a value. If I delete the key, the value shouldn't be there." That's a blackbox test because it doesn't say anything about how the hash table is implemented and doesn't try to stress the implementation. A corresponding whitebox test might be: "I know something about the hash function, and I'm going to deliberately create hash keys in my test cases that cause a lot of hash collisions, to make sure I'm testing that part of the functionality." Now, a C0 coverage tool like SimpleCov would reveal the difference: if all you had were blackbox tests, you might find that the collision-handling code wasn't being hit very often. That might tip you off: if I really want to strengthen that, if I want to boost coverage of that code, I have to write a whitebox or glassbox test. I have to look inside, see what the implementation does, and find specific ways to try to break the implementation in evil ways.
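In Ruby you can sketch both styles against the built-in Hash. Ruby's Hash looks up keys via their `hash` and `eql?` methods, so a key class that deliberately returns a constant `hash` forces every key into the same bucket, exercising the collision-handling path (the class name here is made up for illustration):

```ruby
# Blackbox test: only the external contract of a hash table.
table = {}
table['a'] = 1
got = table['a']        # put/get behave as specified, implementation unseen
table.delete('a')
gone = table.key?('a')  # deleted key is really gone

# Whitebox-style test: keys whose hash function collides on purpose,
# stressing the collision-handling machinery inside the table.
class CollidingKey
  attr_reader :name

  def initialize(name)
    @name = name
  end

  def hash
    42   # every key collides
  end

  def eql?(other)
    other.is_a?(CollidingKey) && other.name == name
  end
end

stress = {}
100.times { |i| stress[CollidingKey.new("k#{i}")] = i }
stress[CollidingKey.new('k7')]   # found despite all keys colliding
```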
So I think testing is a kind of way of life, right? We've gotten away from the phase of "we build the whole thing and then we test it" and into the phase of "we test as we go." Testing is really more like a development tool, and like so many development tools, its effectiveness depends on whether you're using it in a tasteful manner. You could say: "Well, let's see, I kicked the tires. I fired up the browser, I tried a couple of things. (claps hands) Looks like it works! Deploy it!" That's obviously a little more cavalier than you'd want to be. And by the way, one of the things we discovered with this online course just starting up: when 60,000 people are enrolled in the course and 0.1% of them have a problem, you get 60 emails. The corollary is that when your site is used by a lot of people, some stupid bug that you didn't find, but that testing could have found, can very quickly generate *a lot* of pain. On the other hand, you don't want to be dogmatic and say, "Until we have 100% coverage and every test is green, we absolutely will not ship." That's not healthy either. And test quality doesn't necessarily correlate with statement coverage: unless you can say something about the quality of your tests, the fact that you've executed every line doesn't mean you've tested the interesting cases.
So, somewhere in between, you could say: "We'll use coverage tools to identify undertested or poorly tested parts of the code, and we'll use them as a guideline to help improve our overall confidence level." But remember, Agile is about embracing change and dealing with it. Part of change is that things will change in ways that cause bugs you didn't foresee, and the right reaction is to be comfortable enough with the testing tools that you can quickly find those bugs, write a test that reproduces each bug, and then make that test green. Then you've really fixed it. That is, the way you really fix a bug is to create a test that correctly fails by reproducing the bug, and then go back and fix the code to make that test pass.
Similarly, you don't want to say, "Well, unit tests give you better coverage; they're more thorough and detailed, so let's focus all our energy on them," any more than you want to say, "Focus on integration tests, because they're more realistic: they reflect what the customer said they want, so if the integration tests pass, by definition we're meeting a customer need." Again, both extremes are unhealthy, because each kind of test can find problems that would be missed by the other. Having a good combination of them is what it's all about.
The last thing I want to leave you with, in terms of testing, is TDD versus what I call conventional debugging, i.e., the way we all actually do it even though we say we don't. We're all trying to get better, right? We're all in the gutter; some of us are looking up at the stars, trying to improve our practices. Having now lived with this for three or four years myself, and, I'll be honest, three years ago I didn't do TDD, I do it now, because I find that it's better, and here's my distillation of why I think it works for me. (Sorry, the colours are a little weird, but the left column of the table says "Conventional debugging" and the right column says "TDD".)

So what's the way I used to write code? Maybe some of you still do this. I write a whole bunch of lines, maybe a few tens of lines of code. I'm sure they're right; I mean, I'm a good programmer, right? This is not that hard. I run it. It doesn't work. OK, fire up the debugger, start putting in printf's. If I'd been using TDD, what would I do instead? I'd write a few lines of code, having written a test first. So as soon as the test goes from red to green, I know I wrote code that works, or at least that the parts of the behaviour I had in mind work, because I had a test for them.
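The red-to-green loop can be sketched with plain assertions (in the course RSpec would play this role; the `Movie` class and `assert_equal` helper here are made up for illustration):

```ruby
# Minimal assertion helper standing in for a test framework.
def assert_equal(expected, actual)
  raise "expected #{expected.inspect}, got #{actual.inspect}" unless expected == actual
end

# Step 1 (red): write the test first. Running it now would raise
# NameError, because Movie doesn't exist yet. That failure is the "red".
tests = lambda do
  assert_equal 'PG-13', Movie.new('Inception', 'PG-13').rating
  assert_equal true,    Movie.new('Up', 'G').family_friendly?
end

# Step 2 (green): write just enough code to make the test pass.
class Movie
  attr_reader :title, :rating

  def initialize(title, rating)
    @title = title
    @rating = rating
  end

  def family_friendly?
    %w[G PG].include?(rating)
  end
end

tests.call   # no exception raised: green
```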
OK, back to conventional debugging: I'm running my program, trying to find the bugs, and I start putting in printf's everywhere to print out the values of things. Which, by the way, is a lot of fun when you're trying to read them out of the 500 lines of log output you get in a Rails app, trying to find your printf's: "I know what I'll do. I'll put in 75 asterisks before and after. That will make it readable!" (laughter) OK, raise your hands if you don't do this! Thank you for your honesty. (laughter) Or I could do the other thing: instead of printing the value of a variable, why don't I write a test that inspects it with an expectation? Then I'll know immediately, in bright red letters, if that expectation wasn't met. OK, back on the conventional debugging side, I break out the big guns: I pull out the Ruby debugger, set a breakpoint, and start tweaking: "Let's see, I have to get past that 'if' statement, so I have to set that thing. Oh, I have to call that method, and so I need to..." No! If I'm going to do that anyway, I could instead do it in a file: set up some mocks and stubs to control the code path and make it go the way I want. And then: "OK, for sure I've fixed it! I'll get out of the debugger and run it all again!"
And of course, nine times out of ten, you didn't fix it, or you partly fixed it but didn't completely fix it, and now I have to do all those manual steps all over again. Or: I already have a bunch of tests, I can just rerun them automatically, and if some of them fail, "Oh, I didn't fix the whole thing. No problem, I'll just go back!" So the bottom line is that you could do it on the left side, but you're using the same techniques in both cases. The only difference is that in one case you're doing it manually, which is boring and error-prone. In the other case you're doing a little more work up front, but you make it automatic and repeatable, and you get high confidence that as you change things in your code, you're not breaking stuff that used to work. Basically, it's more productive: you're doing all the same things, but with a small "delta" of extra work you're applying your effort at much higher leverage. So that's my view of why TDD is a good thing. It doesn't require new skills; it just requires you to refactor your existing skills.
I also tried, and again, honest confessions here: when I started doing this, it was, "OK, I'm going to be teaching a course on Rails; I should really focus on testing." So I went back to some code I had written that was working, decent code, and started trying to write tests for it, and it was *so painful*, because the code wasn't written in a way that was testable. There were all kinds of interactions; there were nested conditionals. If you wanted to isolate a particular statement and write a test that triggered just that statement, the amount of stuff you'd have to set up in your test to make that happen (remember when we talked about mock train wrecks?) meant building all this infrastructure just to reach one line of code. You do that and you go, "Gawd, testing is really not worth it! I wrote 20 lines of setup so that I could test two lines in my function!" What that's really telling you, as I now realize, is that your function is bad. It's a badly written function; it's not a testable function. It's got too many moving parts whose dependencies can't be broken apart. There are no seams in the function that allow me to individually test its different behaviours. And once you start doing test-first development, because you have to write your tests in small chunks, this problem mostly goes away.
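A sketch of what a "seam" buys you (all class names here are hypothetical): the untestable version hard-wires its collaborator, while the testable version exposes a seam where a test can substitute a cheap fake:

```ruby
# No seam: the dependency is hard-wired, so any test of greet must
# drag in a real Mailer (a hypothetical collaborator).
class WelcomeNoSeam
  def greet(user)
    Mailer.new.send_mail(user, 'Welcome!')
  end
end

# Seam: the collaborator is injected, so a test can pass in a fake.
class Welcome
  def initialize(mailer)
    @mailer = mailer
  end

  def greet(user)
    @mailer.send_mail(user, 'Welcome!')
  end
end

# A tiny fake is all the "setup" the testable version needs.
class FakeMailer
  attr_reader :sent

  def send_mail(user, msg)
    @sent = [user, msg]
  end
end

fake = FakeMailer.new
Welcome.new(fake).greet('alice')
fake.sent   # => ["alice", "Welcome!"]
```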
So that's been my epiphany.