0:00:01.426,0:00:06.083 In many ways, the most creative, challenging, and under-appreciated aspect of interaction design 0:00:06.083,0:00:08.464 is evaluating designs with people. 0:00:08.464,0:00:11.566 The insights that you’ll get from testing designs with people 0:00:11.566,0:00:16.017 can help you get new ideas, make changes, decide wisely, and fix bugs. 0:00:16.017,0:00:20.811 One reason I think design is such an interesting field is its relationship to truth and objectivity. 0:00:20.811,0:00:26.407 I find design so incredibly fascinating because we can say more in response to a question like: 0:00:26.407,0:00:32.961 “How can we measure success?” than “It’s just personal preference” or “Whatever feels right.” 0:00:32.961,0:00:37.474 At the same time, the answers are more complex and more open-ended, more subjective, 0:00:37.474,0:00:41.545 and require more wisdom than just a number like 7 or 3. 0:00:41.545,0:00:43.850 One of the things that we’re going to learn in this class 0:00:43.850,0:00:48.372 is the different kinds of knowledge that you can get out of different kinds of methods. 0:00:48.372,0:00:53.319 Why evaluate designs with people? Why learn about how people use interactive systems? 0:00:53.319,0:00:58.444 I think one major reason for this is that it can be difficult to tell how good a user interface is 0:00:58.444,0:01:03.974 until you’ve tried it out with actual users, and that’s because clients and designers and developers, 0:01:03.974,0:01:07.107 they may know too much about the domain and the user interface, 0:01:07.107,0:01:11.376 or have acquired blinders through designing and building the user interface. 0:01:11.376,0:01:15.415 At the same time they may not know enough about the user’s actual tasks. 0:01:15.415,0:01:20.965 And while experience and theory can help, it can still be hard to predict what real users will actually do. 
0:01:21.888,0:01:25.002 You might want to know, “Can people figure out how to use it?” 0:01:25.002,0:01:28.859 or “Do they swear or giggle when using this interface?” 0:01:28.859,0:01:31.224 “How does this design compare to that design?” 0:01:31.224,0:01:35.337 and, “If we changed the interface, how does that change people’s behaviour?” 0:01:35.337,0:01:39.499 “What new practices might emerge?” “How do things change over time?” 0:01:39.499,0:01:44.714 These are all great questions to ask about an interface, and each will come from different methods. 0:01:44.714,0:01:49.932 A broad toolbox of different methods can be especially valuable in emerging areas 0:01:49.932,0:01:56.178 like mobile and social software where people’s use practices can be particularly context-dependent 0:01:56.178,0:02:00.681 and also evolve significantly over time in response to how other people use software 0:02:00.681,0:02:03.197 through network effects and things like that. 0:02:03.197,0:02:08.534 To give you a flavour of this, I’d like to quickly run through some common types of empirical research in HCI. 0:02:08.534,0:02:11.741 The examples I’ll show are mostly published work of one sort or another, 0:02:11.741,0:02:14.024 because that’s the easiest stuff to share. 0:02:14.024,0:02:18.654 If you have good examples from current systems out in the world, post them to the forum! 0:02:18.654,0:02:21.130 I keep an archive of user interface examples, 0:02:21.130,0:02:24.434 and I and the other students would love to see what you can come up with. 0:02:24.434,0:02:27.176 One way to learn about the user experience of a design 0:02:27.176,0:02:30.811 is to bring people into your lab or office and have them try it out. 0:02:30.811,0:02:32.978 We often call these usability studies. 0:02:32.978,0:02:37.458 This “watch someone use my interface” approach is a common one in HCI. 
0:02:37.458,0:02:43.622 The basic strategy for traditional user-centred design is to iteratively bring people 0:02:43.622,0:02:48.221 into your lab or office until you run out of time. And then release. 0:02:48.221,0:02:52.312 And, if you had deep pockets, these rooms had a one-way glass mirror, 0:02:52.312,0:02:54.684 and the development team was on the other side. 0:02:54.684,0:02:59.245 In a leaner environment, this may be just bringing people into your dorm room office. 0:02:59.245,0:03:01.672 You’ll learn a huge amount by doing this. 0:03:01.672,0:03:04.702 Every single time that I or a student, friend, or colleague 0:03:04.702,0:03:07.731 has watched somebody use a new interactive system, 0:03:07.731,0:03:14.185 we learn something, because as designers we get blinders to systems’ quirks, bugs, and false assumptions. 0:03:15.308,0:03:19.562 However, there are some major shortcomings to this approach. 0:03:19.562,0:03:24.122 In particular, the setting probably isn’t very ecologically valid. 0:03:24.122,0:03:29.463 In the real world, people may have different tasks, goals, motivations, and physical settings 0:03:29.463,0:03:32.288 than your office or lab. 0:03:32.288,0:03:35.354 This can be especially true for user interfaces that you think people might use on the go, 0:03:35.354,0:03:38.405 like at a bus stop or while waiting in line. 0:03:38.405,0:03:40.827 Second, there can be a “please me” experimental bias, 0:03:40.827,0:03:44.122 where when you bring somebody in to try out a user interface, 0:03:44.122,0:03:47.339 they know that they’re trying out the technology that you developed 0:03:47.339,0:03:50.966 and so they may work harder or be nicer 0:03:50.966,0:03:54.593 than they would if they had to use it without the constraints of a lab setup 0:03:54.593,0:03:58.497 with the person who developed it watching right over them. 
0:03:58.497,0:04:03.338 Third, in its most basic form where you’re trying out just one user interface, there is no comparison point. 0:04:03.338,0:04:09.177 So while you can track when people laugh, or swear, or smile with joy, 0:04:09.177,0:04:12.456 you won’t know whether they would’ve laughed more, or sworn less, or smiled more 0:04:12.456,0:04:14.974 if you’d had a different user interface. 0:04:14.974,0:04:18.176 And finally it requires bringing people to your physical location. 0:04:18.176,0:04:20.596 This is often a whole lot easier than a lot of people think. 0:04:20.596,0:04:23.845 It can be a psychological burden, even if nothing else. 0:04:24.307,0:04:28.172 A very different way of getting feedback from people is to use a survey. 0:04:28.172,0:04:31.150 Here is an example of a survey that I got recently from San Francisco 0:04:31.150,0:04:34.127 asking about different street light designs. 0:04:34.127,0:04:38.151 Surveys are great because you can quickly get feedback from a large number of responses. 0:04:38.151,0:04:41.353 And it’s relatively easy to compare multiple alternatives. 0:04:41.353,0:04:44.385 You can also automatically tally the results. 0:04:44.385,0:04:48.390 You don’t even need to build anything; you can just show screen shots or mock-ups. 0:04:48.390,0:04:50.532 One of the things that I’ve learned the hard way, though, 0:04:50.532,0:04:55.144 is the difference between what people say they’re going to do and what they actually do. 0:04:55.144,0:04:59.026 Ask people how often they exercise and you’ll probably get a much more optimistic answer 0:04:59.026,0:05:02.060 than how often they really do exercise. 0:05:02.060,0:05:05.173 The same holds for the street light example here. 0:05:05.173,0:05:08.999 Trying to imagine what a number of different street light designs might be 0:05:08.999,0:05:12.191 is really different than actually observing them on the street 0:05:12.191,0:05:15.384 and having them become part of normal everyday life. 
0:05:15.384,0:05:18.085 Still, it can be valuable to get feedback. 0:05:18.085,0:05:20.439 Another type of responder strategy is focus groups. 0:05:20.439,0:05:26.046 In a focus group, you’ll gather together a small group of people to discuss a design or idea. 0:05:26.046,0:05:31.372 The fact that focus groups involve a group of people is a double-edged sword. 0:05:31.372,0:05:37.541 On one hand, you can get people to tease out of their colleagues things that they might not have thought 0:05:37.541,0:05:44.579 to say on their own; on the other hand, for a variety of psychological reasons, people may be inclined 0:05:44.579,0:05:48.774 to say polite things or generate answers completely on the spot 0:05:48.774,0:05:53.785 that are totally uncorrelated with what they believe or what they would actually do. 0:05:54.662,0:05:59.982 Focus groups can be a particularly problematic method when you are looking at trying to gather data 0:05:59.982,0:06:04.135 about taboo topics or about cultural biases. 0:06:04.135,0:06:06.723 With those caveats — right now we’re just making a laundry list, and — 0:06:06.723,0:06:12.312 I think that focus groups, like almost any other method, can play an important role in your toolbelt. 0:06:13.420,0:06:16.574 Our third category of techniques is to get feedback from experts. 0:06:16.574,0:06:22.905 For example, in this class we’re going to do a bunch of peer critique for your weekly project assignments. 0:06:22.905,0:06:25.370 In addition to having users try your interface, 0:06:25.370,0:06:29.775 it can be important to eat your own dog food and use the tools that you built yourself. 0:06:29.775,0:06:35.069 When you are getting feedback from experts, it can often be helpful to have some kind of structured format, 0:06:35.069,0:06:38.558 much like the rubrics you’ll see in your project assignments. 
0:06:38.558,0:06:44.881 And, for getting feedback on user interfaces, one common approach to this structured feedback 0:06:44.881,0:06:48.390 is called heuristic evaluation, and you’ll learn how to do that in this class; 0:06:48.390,0:06:51.051 it was pioneered by Jakob Nielsen. 0:06:51.051,0:06:53.496 Our next genre is comparative experiments: 0:06:53.496,0:06:57.565 taking two or more distinct options and comparing their performance to each other. 0:06:57.565,0:07:00.183 These comparisons can take place in lots of different ways: 0:07:00.183,0:07:04.061 They can be in the lab; they can be in the field; they can be online. 0:07:04.061,0:07:06.543 These experiments can be more-or-less controlled, 0:07:06.543,0:07:10.125 and they can take place over shorter or longer durations. 0:07:10.125,0:07:14.235 What you’re trying to learn here is which option is the more effective, 0:07:14.235,0:07:16.998 and, more often, what are the active ingredients, 0:07:16.998,0:07:21.422 what are the variables that matter in creating the user experience that you seek. 0:07:22.006,0:07:26.714 Here’s an example: My former PhD student Joel Brandt, and his colleague at Adobe, 0:07:26.714,0:07:30.847 ran a number of studies comparing help interfaces for programmers. 0:07:32.139,0:07:38.319 In particular they compared a more traditional search-style user interface for finding programming help 0:07:38.319,0:07:43.443 with a search interface that integrated programming help directly into your environment. 0:07:43.443,0:07:46.979 By running these comparisons they were able to see how programmers’ behaviour differed 0:07:46.979,0:07:50.588 based on the changing help user interface. 0:07:50.588,0:07:53.698 Comparative experiments have an advantage over surveys 0:07:53.698,0:07:57.230 in that you get to see the actual behaviour as opposed to self report, 0:07:57.230,0:08:02.329 and they can be better than usability studies because you’re comparing multiple alternatives. 
0:08:02.329,0:08:06.780 This enables you to see what works better or worse, or at least what works different. 0:08:06.780,0:08:10.366 I find that comparative feedback is also often much more actionable. 0:08:11.166,0:08:13.938 However, if you are running controlled experiments online, 0:08:13.938,0:08:18.079 you don’t get to see much about the person on the other side of the screen. 0:08:18.079,0:08:20.774 And if you are inviting people into your office or lab, 0:08:20.774,0:08:24.111 the behaviour you’re measuring might not be very realistic. 0:08:24.111,0:08:30.283 If realistic longitudinal behaviour is what you’re after, participant observation may be the approach for you. 0:08:30.283,0:08:36.419 This approach is just what it sounds like: observing what people actually do in their actual work environment. 0:08:36.419,0:08:40.226 And this more long-term evaluation can be important for uncovering things 0:08:40.226,0:08:44.131 that you might not see in shorter term, more controlled scenarios. 0:08:44.131,0:08:48.015 For example, my colleagues Bob Sutton and Andrew Hargadon studied brainstorming. 0:08:48.015,0:08:51.655 The prior literature on brainstorming had focused mostly on questions like 0:08:51.655,0:08:54.402 “Do people come up with more ideas?” 0:08:54.402,0:08:56.829 What Bob and Andrew realized by going into the field 0:08:56.829,0:09:00.517 was that brainstorming served a number of other functions also, 0:09:00.517,0:09:05.365 like, for example, brainstorming provides a way for members of the design team 0:09:05.365,0:09:08.081 to demonstrate their creativity to their peers; 0:09:08.081,0:09:13.210 it allows them to pass along knowledge that then can be reused in other projects; 0:09:13.210,0:09:19.057 and it creates a fun, exciting environment that people like to work in and that clients like to participate in. 
0:09:19.057,0:09:22.206 In a real ecosystem, all of these things are important, 0:09:22.206,0:09:25.514 in addition to just having the ideas that people come up with. 0:09:26.191,0:09:32.908 Nearly all experiments seek to build a theory on some level — I don’t mean anything fancy by this, 0:09:32.908,0:09:37.309 just that we take some things to be more relevant, and other things less relevant. 0:09:37.309,0:09:39.250 We might, for example, assume 0:09:39.250,0:09:43.068 that the ordering of search results may play an important role in what people click on, 0:09:43.068,0:09:46.415 but that the batting average of the Detroit Tigers doesn’t, 0:09:46.415,0:09:49.763 unless, of course, somebody’s searching for baseball. 0:09:49.763,0:09:55.093 If you have a theory that is sufficiently formal mathematically that you can make predictions, 0:09:55.093,0:10:00.037 then you can compare alternative interfaces using that model, without having to bring people in. 0:10:00.037,0:10:05.576 And we’ll go over that in this class a little bit, with respect to input models. 0:10:05.576,0:10:10.072 This makes it possible to try out a number of alternatives really fast. 0:10:10.072,0:10:12.286 Consequently, when people use simulations, 0:10:12.286,0:10:16.378 it’s often in conjunction with something like Monte Carlo optimization. 0:10:16.378,0:10:19.934 One example of this can be found in the ShapeWriter system, 0:10:19.934,0:10:22.735 where Shumin Zhai and colleagues figured out how to build a keyboard 0:10:22.735,0:10:26.122 where people could enter an entire word in a single stroke. 0:10:26.122,0:10:31.247 They were able to do this with the benefit of formal models and optimization-based approaches. 0:10:31.247,0:10:34.402 Simulation has mostly been used for input techniques 0:10:34.402,0:10:39.795 because people’s motor performance is probably the most well-quantified area of HCI. 
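To make the idea of comparing alternatives with a formal input model a little more concrete, here is a minimal sketch in Python. It is not the ShapeWriter method and not anything shown in the lecture. It assumes Fitts's law, MT = a + b * log2(D / W + 1), as the motor model for tapping keys, uses made-up constants `A` and `B`, a hypothetical grid keyboard, and a simple random search that swaps two keys and keeps the swap only when the predicted typing time drops. All function names here are illustrative.

```python
import math
import random

# Hypothetical Fitts's law constants in seconds; real values must be fit to user data.
A, B = 0.1, 0.15
KEY_WIDTH = 1.0  # assumed key width, in the same units as the grid spacing


def fitts_time(distance, width=KEY_WIDTH):
    """Predicted movement time from Fitts's law: MT = a + b * log2(D / W + 1)."""
    return A + B * math.log2(distance / width + 1)


def predicted_time(layout, words, columns=3):
    """Sum the predicted key-to-key movement times for tapping out each word.

    `layout` is a sequence of characters placed row by row on a grid that is
    `columns` keys wide. Every letter used in `words` must appear in `layout`.
    """
    pos = {ch: divmod(i, columns) for i, ch in enumerate(layout)}
    total = 0.0
    for word in words:
        for first, second in zip(word, word[1:]):
            total += fitts_time(math.dist(pos[first], pos[second]))
    return total


def random_search(layout, words, iterations=2000, seed=0):
    """Randomly swap two keys at a time, keeping a swap only if it lowers the cost."""
    rng = random.Random(seed)
    best = list(layout)
    best_cost = predicted_time(best, words)
    for _ in range(iterations):
        candidate = best[:]
        i, j = rng.sample(range(len(candidate)), 2)
        candidate[i], candidate[j] = candidate[j], candidate[i]
        cost = predicted_time(candidate, words)
        if cost < best_cost:
            best, best_cost = candidate, cost
    return "".join(best), best_cost


if __name__ == "__main__":
    sample_words = ["tea", "eat", "ate", "net", "ten"]  # toy corpus, not real data
    start_layout = "tqwxzkena"                           # arbitrary 3x3 starting layout
    layout, cost = random_search(start_layout, sample_words)
    print(layout, round(cost, 3))
```

The point of the sketch is only that, once a model predicts performance, thousands of candidate designs can be scored without recruiting a single participant. How good the result is depends entirely on how well the model and its constants match real users.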
0:10:39.795,0:10:42.701 And, while we won’t get much to it in this intro course, 0:10:42.701,0:10:46.266 simulation can also be used for higher-level cognitive tasks; 0:10:46.266,0:10:48.497 for example, Pete Pirolli and colleagues at PARC 0:10:48.497,0:10:51.528 have built impressive models of people’s web-searching behaviour. 0:10:52.467,0:10:57.253 These models enable them to estimate, for example, which links somebody is most likely to click on 0:10:57.253,0:11:00.238 by looking at the relevant link texts. 0:11:00.238,0:11:05.072 That’s our whirlwind tour of a number of empirical methods that this class will introduce. 0:11:05.072,0:11:09.481 You’ll want to pick the right method for the right task, and here are some issues to consider: 0:11:09.481,0:11:13.187 If you did it again, would you get the same thing? 0:11:13.187,0:11:18.544 Another is generalizability and realism — Does this hold for people other than 18-year-old 0:11:18.544,0:11:23.135 upper-middle-class students who are doing this for course credit or a gift certificate? 0:11:23.135,0:11:28.546 Is this behaviour also what you’d see in the real world, or only in a more stilted lab environment? 0:11:28.546,0:11:30.864 Comparisons are important, because they can tell you 0:11:30.879,0:11:34.351 how the user experience would change with different interface choices, 0:11:34.351,0:11:38.553 as opposed to just a “people liked it” study. 0:11:38.553,0:11:42.784 It’s also important to think about how to achieve these insights efficiently, 0:11:42.784,0:11:48.747 and not chew up a lot of resources, especially when your goal is practical. 
0:11:48.747,0:11:54.252 My experience as a designer, researcher, teacher, consultant, advisor and mentor has taught me 0:11:54.252,0:12:01.340 that evaluating designs with people is both easier and more valuable than many people expect, 0:12:01.340,0:12:04.704 and there’s an incredible lightbulb moment that happens 0:12:04.704,0:12:08.831 when you actually get designs in front of people and see how they use them. 0:12:08.831,0:12:12.945 So, to sum up this video, I’d like to ask what could be the most important question: 0:12:12.945,9:59:59.000 “What do you want to learn?”