1 00:00:01,426 --> 00:00:06,083 In many ways, the most creative, challenging, and under-appreciated aspect of interaction design 2 00:00:06,083 --> 00:00:08,464 is evaluating designs with people. 3 00:00:08,464 --> 00:00:11,566 The insights that you’ll get from testing designs with people 4 00:00:11,566 --> 00:00:16,017 can help you get new ideas, make changes, decide wisely, and fix bugs. 5 00:00:16,017 --> 00:00:20,811 One reason I think design is such an interesting field is its relationship to truth and objectivity. 6 00:00:20,811 --> 00:00:26,407 I find design so incredibly fascinating because we can say more in response to a question like: 7 00:00:26,407 --> 00:00:32,961 “How can we measure success?” than “It’s just personal preference” or “Whatever feels right.” 8 00:00:32,961 --> 00:00:37,474 At the same time, the answers are more complex and more open-ended, more subjective, 9 00:00:37,474 --> 00:00:41,545 and require more wisdom than just a number like 7 or 3. 10 00:00:41,545 --> 00:00:43,850 One of the things that we’re going to learn in this class 11 00:00:43,850 --> 00:00:48,372 is the different kinds of knowledge that you can get out of different kinds of methods. 12 00:00:48,372 --> 00:00:53,319 Why evaluate designs with people? Why learn about how people use interactive systems? 13 00:00:53,319 --> 00:00:58,444 I think one major reason for this is that it can be difficult to tell how good a user interface is 14 00:00:58,444 --> 00:01:03,974 until you’ve tried it out with actual users, and that’s because clients and designers and developers, 15 00:01:03,974 --> 00:01:07,107 they may know too much about the domain and the user interface, 16 00:01:07,107 --> 00:01:11,376 or have acquired blinders through designing and building the user interface. 17 00:01:11,376 --> 00:01:15,415 At the same time they may not know enough about the user’s actual tasks. 
18 00:01:15,415 --> 00:01:20,965 And while experience and theory can help, it can still be hard to predict what real users will actually do. 19 00:01:21,888 --> 00:01:25,002 You might want to know, “Can people figure out how to use it?” 20 00:01:25,002 --> 00:01:28,859 or “Do they swear or giggle when using this interface?” 21 00:01:28,859 --> 00:01:31,224 “How does this design compare to that design?” 22 00:01:31,224 --> 00:01:35,337 and, “If we changed the interface, how does that change people’s behaviour?” 23 00:01:35,337 --> 00:01:39,499 “What new practices might emerge?” “How do things change over time?” 24 00:01:39,499 --> 00:01:44,714 These are all great questions to ask about an interface, and each will come from different methods. 25 00:01:44,714 --> 00:01:49,932 Having a broad toolbox of different methods can be especially valuable in emerging areas 26 00:01:49,932 --> 00:01:56,178 like mobile and social software where people’s use practices can be particularly context-dependent 27 00:01:56,178 --> 00:02:00,681 and also evolve significantly over time in response to how other people use software 28 00:02:00,681 --> 00:02:03,197 through network effects and things like that. 29 00:02:03,197 --> 00:02:08,534 To give you a flavour of this, I’d like to quickly run through some common types of empirical research in HCI. 30 00:02:08,534 --> 00:02:11,741 The examples I’ll show are mostly published work of one sort or another, 31 00:02:11,741 --> 00:02:14,024 because that’s the easiest stuff to share. 32 00:02:14,024 --> 00:02:18,654 If you have good examples from current systems out in the world, post them to the forum! 33 00:02:18,654 --> 00:02:21,130 I keep an archive of user interface examples, 34 00:02:21,130 --> 00:02:24,434 and I and the other students would love to see what you can come up with. 
35 00:02:24,434 --> 00:02:27,176 One way to learn about the user experience of a design 36 00:02:27,176 --> 00:02:30,811 is to bring people into your lab or office and have them try it out. 37 00:02:30,811 --> 00:02:32,978 We often call these usability studies. 38 00:02:32,978 --> 00:02:37,458 This “watch someone use my interface” approach is a common one in HCI. 39 00:02:37,458 --> 00:02:43,622 The basic strategy for traditional user-centred design is to iteratively bring people 40 00:02:43,622 --> 00:02:48,221 into your lab or office until you run out of time. And then release. 41 00:02:48,221 --> 00:02:52,312 And, if you had deep pockets, these rooms had a one-way glass mirror, 42 00:02:52,312 --> 00:02:54,684 and the development team was on the other side. 43 00:02:54,684 --> 00:02:59,245 In a leaner environment, this may just be bringing people into your dorm room or office. 44 00:02:59,245 --> 00:03:01,672 You’ll learn a huge amount by doing this. 45 00:03:01,672 --> 00:03:04,702 Every single time that I or a student, friend, or colleague 46 00:03:04,702 --> 00:03:07,731 has watched somebody use a new interactive system, 47 00:03:07,731 --> 00:03:14,185 we learn something, because as designers we get blinders to systems’ quirks, bugs, and false assumptions. 48 00:03:15,308 --> 00:03:19,562 However, there are some major shortcomings to this approach. 49 00:03:19,562 --> 00:03:24,122 In particular, the setting probably isn’t very ecologically valid. 50 00:03:24,122 --> 00:03:29,463 In the real world, people may have different tasks, goals, motivations, and physical settings 51 00:03:29,463 --> 00:03:32,288 than in your office or lab. 52 00:03:32,288 --> 00:03:35,354 This can be especially true for user interfaces that you think people might use on the go, 53 00:03:35,354 --> 00:03:38,405 like at a bus stop or while waiting in line. 
54 00:03:38,405 --> 00:03:40,827 Second, there can be a “please me” experimental bias, 55 00:03:40,827 --> 00:03:44,122 where when you bring somebody in to try out a user interface, 56 00:03:44,122 --> 00:03:47,339 they know that they’re trying out the technology that you developed 57 00:03:47,339 --> 00:03:50,966 and so they may work harder or be nicer 58 00:03:50,966 --> 00:03:54,593 than they would if they had to use it without the constraints of a lab setup 59 00:03:54,593 --> 00:03:58,497 with the person who developed it watching right over them. 60 00:03:58,497 --> 00:04:03,338 Third, in its most basic form where you’re trying out just one user interface, there is no comparison point. 61 00:04:03,338 --> 00:04:09,177 So while you can track when people laugh, or swear, or smile with joy, 62 00:04:09,177 --> 00:04:12,456 you won’t know whether they would’ve laughed more, or sworn less, or smiled more 63 00:04:12,456 --> 00:04:14,974 if you’d had a different user interface. 64 00:04:14,974 --> 00:04:18,176 And finally it requires bringing people to your physical location. 65 00:04:18,176 --> 00:04:20,596 This is often a whole lot easier than a lot of people think. 66 00:04:20,596 --> 00:04:23,845 But it can be a psychological burden, if nothing else. 67 00:04:24,307 --> 00:04:28,172 A very different way of getting feedback from people is to use a survey. 68 00:04:28,172 --> 00:04:31,150 Here is an example of a survey that I got recently from San Francisco 69 00:04:31,150 --> 00:04:34,127 asking about different street light designs. 70 00:04:34,127 --> 00:04:38,151 Surveys are great because you can quickly get feedback from a large number of respondents. 71 00:04:38,151 --> 00:04:41,353 And it’s relatively easy to compare multiple alternatives. 72 00:04:41,353 --> 00:04:44,385 You can also automatically tally the results. 73 00:04:44,385 --> 00:04:48,390 You don’t even need to build anything; you can just show screen shots or mock-ups. 
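To make "automatically tally the results" concrete, here is a minimal sketch in Python. The responses and design names are made up for illustration, and `tally` is a hypothetical helper, not part of any survey tool.

```python
from collections import Counter

# Hypothetical survey responses: each entry is the design one respondent preferred.
responses = ["Design A", "Design C", "Design A", "Design B", "Design A", "Design C"]

def tally(responses):
    """Return (option, count, share) tuples, most popular option first."""
    counts = Counter(responses)
    total = len(responses)
    return [(option, n, n / total) for option, n in counts.most_common()]

for option, n, share in tally(responses):
    print(f"{option}: {n} votes ({share:.0%})")
```

A tally like this only counts what people said they prefer, so it inherits every limitation of self-report.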
74 00:04:48,390 --> 00:04:50,532 One of the things that I’ve learned the hard way, though, 75 00:04:50,532 --> 00:04:55,144 is the difference between what people say they’re going to do and what they actually do. 76 00:04:55,144 --> 00:04:59,026 Ask people how often they exercise and you’ll probably get a much more optimistic answer 77 00:04:59,026 --> 00:05:02,060 than how often they really do exercise. 78 00:05:02,060 --> 00:05:05,173 The same holds for the street light example here. 79 00:05:05,173 --> 00:05:08,999 Trying to imagine what a number of different street light designs might be like 80 00:05:08,999 --> 00:05:12,191 is really different from actually observing them on the street 81 00:05:12,191 --> 00:05:15,384 and having them become part of normal everyday life. 82 00:05:15,384 --> 00:05:18,085 Still, it can be valuable to get feedback. 83 00:05:18,085 --> 00:05:20,439 Another type of respondent strategy is focus groups. 84 00:05:20,439 --> 00:05:26,046 In a focus group, you’ll gather together a small group of people to discuss a design or idea. 85 00:05:26,046 --> 00:05:31,372 The fact that focus groups involve a group of people is a double-edged sword. 86 00:05:31,372 --> 00:05:37,541 On one hand, you can get people to tease out of their colleagues things that they might not have thought 87 00:05:37,541 --> 00:05:44,579 to say on their own; on the other hand, for a variety of psychological reasons, people may be inclined 88 00:05:44,579 --> 00:05:48,774 to say polite things or generate answers completely on the spot 89 00:05:48,774 --> 00:05:53,785 that are totally uncorrelated with what they believe or what they would actually do. 90 00:05:54,662 --> 00:05:59,982 Focus groups can be a particularly problematic method when you are trying to gather data 91 00:05:59,982 --> 00:06:04,135 about taboo topics or about cultural biases. 
92 00:06:04,135 --> 00:06:06,723 With those caveats (right now we’re just making a laundry list), 93 00:06:06,723 --> 00:06:12,312 I think that focus groups, like almost any other method, can play an important role in your toolbelt. 94 00:06:13,420 --> 00:06:16,574 Our third category of techniques is to get feedback from experts. 95 00:06:16,574 --> 00:06:22,905 For example, in this class we’re going to do a bunch of peer critique for your weekly project assignments. 96 00:06:22,905 --> 00:06:25,370 In addition to having users try your interface, 97 00:06:25,370 --> 00:06:29,775 it can be important to eat your own dog food and use the tools that you built yourself. 98 00:06:29,775 --> 00:06:35,069 When you are getting feedback from experts, it can often be helpful to have some kind of structured format, 99 00:06:35,069 --> 00:06:38,558 much like the rubrics you’ll see in your project assignments. 100 00:06:38,558 --> 00:06:44,881 And, for getting feedback on user interfaces, one common approach to this structured feedback 101 00:06:44,881 --> 00:06:48,390 is called heuristic evaluation, and you’ll learn how to do that in this class; 102 00:06:48,390 --> 00:06:51,051 it was pioneered by Jakob Nielsen. 103 00:06:51,051 --> 00:06:53,496 Our next genre is comparative experiments: 104 00:06:53,496 --> 00:06:57,565 taking two or more distinct options and comparing their performance to each other. 105 00:06:57,565 --> 00:07:00,183 These comparisons can take place in lots of different ways: 106 00:07:00,183 --> 00:07:04,061 They can be in the lab; they can be in the field; they can be online. 107 00:07:04,061 --> 00:07:06,543 These experiments can be more-or-less controlled, 108 00:07:06,543 --> 00:07:10,125 and they can take place over shorter or longer durations. 
109 00:07:10,125 --> 00:07:14,235 What you’re trying to learn here is which option is the more effective, 110 00:07:14,235 --> 00:07:16,998 and, more often, what are the active ingredients, 111 00:07:16,998 --> 00:07:21,422 what are the variables that matter in creating the user experience that you seek. 112 00:07:22,006 --> 00:07:26,714 Here’s an example: My former PhD student Joel Brandt, and his colleague at Adobe, 113 00:07:26,714 --> 00:07:30,847 ran a number of studies comparing help interfaces for programmers. 114 00:07:32,139 --> 00:07:38,319 In particular they compared a more traditional search-style user interface for finding programming help 115 00:07:38,319 --> 00:07:43,443 with a search interface that integrated programming help directly into your environment. 116 00:07:43,443 --> 00:07:46,979 By running these comparisons they were able to see how programmers’ behaviour differed 117 00:07:46,979 --> 00:07:50,588 based on the changing help user interface. 118 00:07:50,588 --> 00:07:53,698 Comparative experiments have an advantage over surveys 119 00:07:53,698 --> 00:07:57,230 in that you get to see the actual behaviour as opposed to self report, 120 00:07:57,230 --> 00:08:02,329 and they can be better than usability studies because you’re comparing multiple alternatives. 121 00:08:02,329 --> 00:08:06,780 This enables you to see what works better or worse, or at least what works differently. 122 00:08:06,780 --> 00:08:10,366 I find that comparative feedback is also often much more actionable. 123 00:08:11,166 --> 00:08:13,938 However, if you are running controlled experiments online, 124 00:08:13,938 --> 00:08:18,079 you don’t get to see much about the person on the other side of the screen. 125 00:08:18,079 --> 00:08:20,774 And if you are inviting people into your office or lab, 126 00:08:20,774 --> 00:08:24,111 the behaviour you’re measuring might not be very realistic. 
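As a rough illustration of how a comparative experiment might be analysed, here is a minimal sketch of a permutation test on task-completion times for two interfaces. The numbers are invented, the names `interface_a` and `interface_b` are hypothetical, and this is not the analysis used in the studies mentioned above; it is just one simple way to ask whether a measured difference could plausibly be due to chance.

```python
import random

def mean(values):
    return sum(values) / len(values)

def permutation_test(a, b, n_resamples=10_000, seed=0):
    """Two-sided permutation test on the difference in means.

    Returns (observed_difference, p_value).
    """
    rng = random.Random(seed)
    observed = mean(a) - mean(b)
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_resamples):
        # Randomly reassign measurements to the two groups.
        rng.shuffle(pooled)
        diff = mean(pooled[:len(a)]) - mean(pooled[len(a):])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (extreme + 1) / (n_resamples + 1)

# Hypothetical task-completion times in seconds (made-up numbers).
interface_a = [48, 55, 61, 52, 70, 66, 58, 63]
interface_b = [41, 39, 50, 44, 47, 53, 38, 45]

diff, p = permutation_test(interface_a, interface_b)
print(f"mean difference: {diff:.1f} s, estimated p-value: {p:.4f}")
```

A small p-value here says only that the two groups differ; it says nothing about why, which is where watching people and asking about the active ingredients comes back in.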
127 00:08:24,111 --> 00:08:30,283 If realistic longitudinal behaviour is what you’re after, participant observation may be the approach for you. 128 00:08:30,283 --> 00:08:36,419 This approach is just what it sounds like: observing what people actually do in their actual work environment. 129 00:08:36,419 --> 00:08:40,226 And this more long-term evaluation can be important for uncovering things 130 00:08:40,226 --> 00:08:44,131 that you might not see in shorter term, more controlled scenarios. 131 00:08:44,131 --> 00:08:48,015 For example, my colleagues Bob Sutton and Andrew Hargadon studied brainstorming. 132 00:08:48,015 --> 00:08:51,655 The prior literature on brainstorming had focused mostly on questions like 133 00:08:51,655 --> 00:08:54,402 “Do people come up with more ideas?” 134 00:08:54,402 --> 00:08:56,829 What Bob and Andrew realized by going into the field 135 00:08:56,829 --> 00:09:00,517 was that brainstorming served a number of other functions also, 136 00:09:00,517 --> 00:09:05,365 like, for example, brainstorming provides a way for members of the design team 137 00:09:05,365 --> 00:09:08,081 to demonstrate their creativity to their peers; 138 00:09:08,081 --> 00:09:13,210 it allows them to pass along knowledge that then can be reused in other projects; 139 00:09:13,210 --> 00:09:19,057 and it creates a fun, exciting environment that people like to work in and that clients like to participate in. 140 00:09:19,057 --> 00:09:22,206 In a real ecosystem, all of these things are important, 141 00:09:22,206 --> 00:09:25,514 in addition to just having the ideas that people come up with. 142 00:09:26,191 --> 00:09:32,908 Nearly all experiments seek to build a theory on some level — I don’t mean anything fancy by this, 143 00:09:32,908 --> 00:09:37,309 just that we take some things to be more relevant, and other things less relevant. 
144 00:09:37,309 --> 00:09:39,250 We might, for example, assume 145 00:09:39,250 --> 00:09:43,068 that the ordering of search results may play an important role in what people click on, 146 00:09:43,068 --> 00:09:46,415 but that the batting average of the Detroit Tigers doesn’t, 147 00:09:46,415 --> 00:09:49,763 unless, of course, somebody’s searching for baseball. 148 00:09:49,763 --> 00:09:55,093 If you have a theory that is sufficiently formal mathematically that you can make predictions, 149 00:09:55,093 --> 00:10:00,037 then you can compare alternative interfaces using that model, without having to bring people in. 150 00:10:00,037 --> 00:10:05,576 And we’ll go over that in this class a little bit, with respect to input models. 151 00:10:05,576 --> 00:10:10,072 This makes it possible to try out a number of alternatives really fast. 152 00:10:10,072 --> 00:10:12,286 Consequently, when people use simulations, 153 00:10:12,286 --> 00:10:16,378 it’s often in conjunction with something like Monte Carlo optimization. 154 00:10:16,378 --> 00:10:19,934 One example of this can be found in the ShapeWriter system, 155 00:10:19,934 --> 00:10:22,735 where Shumin Zhai and colleagues figured out how to build a keyboard 156 00:10:22,735 --> 00:10:26,122 where people could enter an entire word in a single stroke. 157 00:10:26,122 --> 00:10:31,247 They were able to do this with the benefit of formal models and optimization-based approaches. 158 00:10:31,247 --> 00:10:34,402 Simulation has mostly been used for input techniques 159 00:10:34,402 --> 00:10:39,795 because people’s motor performance is probably the most well-quantified area of HCI. 
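To give a feel for what comparing interfaces with a model, rather than with people, can look like, here is a toy sketch. It assumes Fitts's law as the input model (the lecture does not name a specific model), uses made-up coefficients `A` and `B`, a tiny invented word list, and a simple random-sampling (Monte Carlo) search over key layouts on a 3x3 grid. It is not how ShapeWriter was built; it only illustrates the general idea of scoring many candidate designs quickly with a formula.

```python
import math
import random

# Assumed Fitts's law coefficients in seconds; real values come from measurement.
A, B = 0.1, 0.15
KEY_WIDTH = 1.0  # keys sit on a unit grid

def fitts_time(distance, width=KEY_WIDTH):
    """Predicted time to move `distance` to a target of size `width`."""
    return A + B * math.log2(distance / width + 1)

def layout_cost(layout, words, columns=3):
    """Total predicted key-to-key movement time for tapping every word."""
    pos = {ch: divmod(i, columns) for i, ch in enumerate(layout)}
    total = 0.0
    for word in words:
        for a, b in zip(word, word[1:]):
            total += fitts_time(math.dist(pos[a], pos[b]))
    return total

def random_search(letters, words, trials=2000, seed=0):
    """Monte Carlo search: sample random layouts and keep the cheapest one."""
    rng = random.Random(seed)
    best = list(letters)
    best_cost = layout_cost(best, words)
    for _ in range(trials):
        candidate = list(letters)
        rng.shuffle(candidate)
        cost = layout_cost(candidate, words)
        if cost < best_cost:
            best, best_cost = candidate, cost
    return "".join(best), best_cost

# Toy corpus and letter set, both invented for illustration.
words = ["the", "then", "heat", "into", "rose", "this", "shot", "near"]
letters = "aehinorst"

baseline = layout_cost(letters, words)
best_layout, best_cost = random_search(letters, words)
print(f"alphabetical layout: {baseline:.2f} s predicted")
print(f"best sampled layout {best_layout!r}: {best_cost:.2f} s predicted")
```

Two thousand layouts get scored in well under a second, which is the whole appeal: no participants, no scheduling, but the answer is only as good as the model behind it.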
160 00:10:39,795 --> 00:10:42,701 And, while we won’t get to it much in this intro course, 161 00:10:42,701 --> 00:10:46,266 simulation can also be used for higher-level cognitive tasks; 162 00:10:46,266 --> 00:10:48,497 for example, Pete Pirolli and colleagues at PARC 163 00:10:48,497 --> 00:10:51,528 have built impressive models of people’s web-searching behaviour. 164 00:10:52,467 --> 00:10:57,253 These models enable them to estimate, for example, which links somebody is most likely to click on 165 00:10:57,253 --> 00:11:00,238 by looking at the relevant link texts. 166 00:11:00,238 --> 00:11:05,072 That’s our whirlwind tour of a number of empirical methods that this class will introduce. 167 00:11:05,072 --> 00:11:09,481 You’ll want to pick the right method for the right task, and here are some issues to consider: 168 00:11:09,481 --> 00:11:13,187 One is reliability: if you did it again, would you get the same thing? 169 00:11:13,187 --> 00:11:18,544 Another is generalizability and realism: does this hold for people other than 18-year-old 170 00:11:18,544 --> 00:11:23,135 upper-middle-class students who are doing this for course credit or a gift certificate? 171 00:11:23,135 --> 00:11:28,546 Is this behaviour also what you’d see in the real world, or only in a more stilted lab environment? 172 00:11:28,546 --> 00:11:30,864 Comparisons are important, because they can tell you 173 00:11:30,879 --> 00:11:34,351 how the user experience would change with different interface choices, 174 00:11:34,351 --> 00:11:38,553 as opposed to just a “people liked it” study. 175 00:11:38,553 --> 00:11:42,784 It’s also important to think about how to achieve these insights efficiently, 176 00:11:42,784 --> 00:11:48,747 and not chew up a lot of resources, especially when your goal is practical. 
177 00:11:48,747 --> 00:11:54,252 My experience as a designer, researcher, teacher, consultant, advisor and mentor has taught me 178 00:11:54,252 --> 00:12:01,340 that evaluating designs with people is both easier and more valuable than many people expect, 179 00:12:01,340 --> 00:12:04,704 and there’s an incredible lightbulb moment that happens 180 00:12:04,704 --> 00:12:08,831 when you actually get designs in front of people and see how they use them. 181 00:12:08,831 --> 00:12:12,945 So, to sum up this video, I’d like to ask what could be the most important question: 182 00:12:12,945 --> 99:59:59,999 “What do you want to learn?”