0:00:01.426,0:00:06.083
In many ways, the most creative, challenging, and under-appreciated aspect of interaction design

0:00:06.083,0:00:08.464
is evaluating designs with people.

0:00:08.464,0:00:11.566
The insights that you’ll get from testing designs with people

0:00:11.566,0:00:16.017
can help you get new ideas, make changes, decide wisely, and fix bugs.

0:00:16.017,0:00:20.811
One reason I think design is such an interesting field is its relationship to truth and objectivity.

0:00:20.811,0:00:26.407
I find design so incredibly fascinating because we can say more in response to a question like:

0:00:26.407,0:00:32.961
“How can we measure success?” than “It’s just personal preference” or “Whatever feels right.”

0:00:32.961,0:00:37.474
At the same time, the answers are more complex and more open-ended, more subjective,

0:00:37.474,0:00:41.545
and require more wisdom than just a number like 7 or 3.

0:00:41.545,0:00:43.850
One of the things that we’re going to learn in this class

0:00:43.850,0:00:48.372
is the different kinds of knowledge that you can get out of different kinds of methods.

0:00:48.372,0:00:53.319
Why evaluate designs with people? Why learn about how people use interactive systems?

0:00:53.319,0:00:58.444
I think one major reason for this is that it can be difficult to tell how good a user interface is

0:00:58.444,0:01:03.974
until you’ve tried it out with actual users, and that’s because clients and designers and developers,

0:01:03.974,0:01:07.107
they may know too much about the domain and the user interface,

0:01:07.107,0:01:11.376
or have acquired blinders through designing and building the user interface.

0:01:11.376,0:01:15.415
At the same time they may not know enough about the user’s actual tasks.

0:01:15.415,0:01:20.965
And while experience and theory can help, it can still be hard to predict what real users will actually do.

0:01:21.888,0:01:25.002
You might want to know, “Can people figure out how to use it?”

0:01:25.002,0:01:28.859
or “Do they swear or giggle when using this interface?”

0:01:28.859,0:01:31.224
“How does this design compare to that design?”

0:01:31.224,0:01:35.337
and, “If we changed the interface, how does that change people’s behaviour?”

0:01:35.337,0:01:39.499
“What new practices might emerge?” “How do things change over time?”

0:01:39.499,0:01:44.714
These are all great questions to ask about an interface, and answering each calls for different methods.

0:01:44.714,0:01:49.932
Having a broad toolbox of different methods can be especially valuable in emerging areas

0:01:49.932,0:01:56.178
like mobile and social software where people’s use practices can be particularly context-dependent

0:01:56.178,0:02:00.681
and also evolve significantly over time in response to how other people use software

0:02:00.681,0:02:03.197
through network effects and things like that.

0:02:03.197,0:02:08.534
To give you a flavour of this, I’d like to quickly run through some common types of empirical research in HCI.

0:02:08.534,0:02:11.741
The examples I’ll show are mostly published work of one sort or another,

0:02:11.741,0:02:14.024
because that’s the easiest stuff to share.

0:02:14.024,0:02:18.654
If you have good examples from current systems out in the world, post them to the forum!

0:02:18.654,0:02:21.130
I keep an archive of user interface examples,

0:02:21.130,0:02:24.434
and I and the other students would love to see what you can come up with.

0:02:24.434,0:02:27.176
One way to learn about the user experience of a design

0:02:27.176,0:02:30.811
is to bring people into your lab or office and have them try it out.

0:02:30.811,0:02:32.978
We often call these usability studies.

0:02:32.978,0:02:37.458
This “watch someone use my interface” approach is a common one in HCI.

0:02:37.458,0:02:43.622
The basic strategy for traditional user-centred design is to iteratively bring people

0:02:43.622,0:02:48.221
into your lab or office until you run out of time. And then release.

0:02:48.221,0:02:52.312
And, if you had deep pockets, these rooms had a one-way glass mirror,

0:02:52.312,0:02:54.684
and the development team was on the other side.

0:02:54.684,0:02:59.245
In a leaner environment, this may just mean bringing people into your dorm room or office.

0:02:59.245,0:03:01.672
You’ll learn a huge amount by doing this.

0:03:01.672,0:03:04.702
Every single time that I or a student, friend, or colleague

0:03:04.702,0:03:07.731
has watched somebody use a new interactive system,

0:03:07.731,0:03:14.185
we learn something, because as designers we get blinders to systems’ quirks, bugs, and false assumptions.

0:03:15.308,0:03:19.562
However, there are some major shortcomings to this approach.

0:03:19.562,0:03:24.122
In particular, the setting probably isn’t very ecologically valid.

0:03:24.122,0:03:29.463
In the real world, people may have different tasks, goals, motivations, and physical settings

0:03:29.463,0:03:32.288
than your office or lab.

0:03:32.288,0:03:35.354
This can be especially true for user interfaces that you think people might use on the go,

0:03:35.354,0:03:38.405
like at a bus stop or while waiting in line.

0:03:38.405,0:03:40.827
Second, there can be a “please me” experimental bias,

0:03:40.827,0:03:44.122
where when you bring somebody in to try out a user interface,

0:03:44.122,0:03:47.339
they know that they’re trying out the technology that you developed

0:03:47.339,0:03:50.966
and so they may work harder or be nicer

0:03:50.966,0:03:54.593
than they would if they had to use it without the constraints of a lab setup

0:03:54.593,0:03:58.497
with the person who developed it watching right over them.

0:03:58.497,0:04:03.338
Third, in its most basic form, where you’re trying out just one user interface, there is no comparison point.

0:04:03.338,0:04:09.177
So while you can track when people laugh, or swear, or smile with joy,

0:04:09.177,0:04:12.456
you won’t know whether they would’ve laughed more, or sworn less, or smiled more

0:04:12.456,0:04:14.974
if you’d had a different user interface.

0:04:14.974,0:04:18.176
And finally it requires bringing people to your physical location.

0:04:18.176,0:04:20.596
This is often a whole lot easier than a lot of people think.

0:04:20.596,0:04:23.845
Still, it can be a psychological burden, if nothing else.

0:04:24.307,0:04:28.172
A very different way of getting feedback from people is to use a survey.

0:04:28.172,0:04:31.150
Here is an example of a survey that I got recently from San Francisco

0:04:31.150,0:04:34.127
asking about different street light designs.

0:04:34.127,0:04:38.151
Surveys are great because you can quickly get feedback from a large number of respondents.

0:04:38.151,0:04:41.353
And it’s relatively easy to compare multiple alternatives.

0:04:41.353,0:04:44.385
You can also automatically tally the results.

0:04:44.385,0:04:48.390
You don’t even need to build anything; you can just show screen shots or mock-ups.

0:04:48.390,0:04:50.532
One of the things that I’ve learned the hard way, though,

0:04:50.532,0:04:55.144
is the difference between what people say they’re going to do and what they actually do.

0:04:55.144,0:04:59.026
Ask people how often they exercise and you’ll probably get a much more optimistic answer

0:04:59.026,0:05:02.060
than how often they really do exercise.

0:05:02.060,0:05:05.173
The same holds for the street light example here.

0:05:05.173,0:05:08.999
Trying to imagine what a number of different street light designs might be like

0:05:08.999,0:05:12.191
is really different than actually observing them on the street

0:05:12.191,0:05:15.384
and having them become part of normal everyday life.

0:05:15.384,0:05:18.085
Still, it can be valuable to get feedback.

0:05:18.085,0:05:20.439
Another type of respondent strategy is focus groups.

0:05:20.439,0:05:26.046
In a focus group, you’ll gather together a small group of people to discuss a design or idea.

0:05:26.046,0:05:31.372
The fact that focus groups involve a group of people is a double-edged sword.

0:05:31.372,0:05:37.541
On one hand, you can get people to tease out of their colleagues things that they might not have thought

0:05:37.541,0:05:44.579
to say on their own; on the other hand, for a variety of psychological reasons, people may be inclined

0:05:44.579,0:05:48.774
to say polite things or generate answers completely on the spot

0:05:48.774,0:05:53.785
that are totally uncorrelated with what they believe or what they would actually do.

0:05:54.662,0:05:59.982
Focus groups can be a particularly problematic method when you are trying to gather data

0:05:59.982,0:06:04.135
about taboo topics or about cultural biases.

0:06:04.135,0:06:06.723
With those caveats (right now we’re just making a laundry list),

0:06:06.723,0:06:12.312
I think that focus groups, like almost any other method, can play an important role in your toolbelt.

0:06:13.420,0:06:16.574
Our third category of techniques is to get feedback from experts.

0:06:16.574,0:06:22.905
For example, in this class we’re going to do a bunch of peer critique for your weekly project assignments.

0:06:22.905,0:06:25.370
In addition to having users try your interface,

0:06:25.370,0:06:29.775
it can be important to eat your own dog food and use the tools that you built yourself.

0:06:29.775,0:06:35.069
When you are getting feedback from experts, it can often be helpful to have some kind of structured format,

0:06:35.069,0:06:38.558
much like the rubrics you’ll see in your project assignments.

0:06:38.558,0:06:44.881
And, for getting feedback on user interfaces, one common approach to this structured feedback

0:06:44.881,0:06:48.390
is called heuristic evaluation, and you’ll learn how to do that in this class;

0:06:48.390,0:06:51.051
it was pioneered by Jakob Nielsen.

0:06:51.051,0:06:53.496
Our next genre is comparative experiments:

0:06:53.496,0:06:57.565
taking two or more distinct options and comparing their performance to each other.

0:06:57.565,0:07:00.183
These comparisons can take place in lots of different ways:

0:07:00.183,0:07:04.061
They can be in the lab; they can be in the field; they can be online.

0:07:04.061,0:07:06.543
These experiments can be more-or-less controlled,

0:07:06.543,0:07:10.125
and they can take place over shorter or longer durations.

0:07:10.125,0:07:14.235
What you’re trying to learn here is which option is the more effective,

0:07:14.235,0:07:16.998
and, more often, what are the active ingredients,

0:07:16.998,0:07:21.422
what are the variables that matter in creating the user experience that you seek.

0:07:22.006,0:07:26.714
Here’s an example: My former PhD student Joel Brandt, and his colleague at Adobe,

0:07:26.714,0:07:30.847
ran a number of studies comparing help interfaces for programmers.

0:07:32.139,0:07:38.319
In particular they compared a more traditional search-style user interface for finding programming help

0:07:38.319,0:07:43.443
with a search interface that integrated programming help directly into your environment.

0:07:43.443,0:07:46.979
By running these comparisons they were able to see how programmers’ behaviour differed

0:07:46.979,0:07:50.588
based on the changing help user interface.

0:07:50.588,0:07:53.698
Comparative experiments have an advantage over surveys

0:07:53.698,0:07:57.230
in that you get to see the actual behaviour as opposed to self report,

0:07:57.230,0:08:02.329
and they can be better than usability studies because you’re comparing multiple alternatives.

0:08:02.329,0:08:06.780
This enables you to see what works better or worse, or at least what works differently.

0:08:06.780,0:08:10.366
I find that comparative feedback is also often much more actionable.
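
As a rough illustration of a comparative experiment, here is a minimal sketch of how you might check whether a difference in task-completion time between two designs is larger than chance alone would produce. It uses a permutation test, which is one reasonable choice among several and is not a method prescribed in this lecture. The function name, the two design labels, and all of the timing data are invented for this example.

```python
import random
from statistics import mean


def permutation_test(times_a, times_b, n_resamples=10_000, seed=0):
    """Two-sided permutation test on the difference in mean task time.

    Returns the observed difference (A minus B) and an estimated p-value:
    the share of random relabelings whose difference is at least as large.
    """
    rng = random.Random(seed)
    observed = mean(times_a) - mean(times_b)
    pooled = list(times_a) + list(times_b)
    n_a = len(times_a)
    extreme = 0
    for _ in range(n_resamples):
        # Shuffle the pooled times, then treat the first n_a as "design A".
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    return observed, extreme / n_resamples


# Invented task-completion times in seconds, for illustration only.
design_a = [41.0, 38.5, 44.2, 39.9, 47.1, 40.3]
design_b = [52.4, 49.8, 55.0, 47.6, 58.3, 50.9]

diff, p_value = permutation_test(design_a, design_b)
print(f"mean difference (A - B): {diff:.1f}s, estimated p = {p_value:.4f}")
```

A small p-value here only says the two groups of times differ by more than random relabeling would explain. It says nothing about why, which is where watching the participants comes in.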

0:08:11.166,0:08:13.938
However, if you are running controlled experiments online,

0:08:13.938,0:08:18.079
you don’t get to see much about the person on the other side of the screen.

0:08:18.079,0:08:20.774
And if you are inviting people into your office or lab,

0:08:20.774,0:08:24.111
the behaviour you’re measuring might not be very realistic.

0:08:24.111,0:08:30.283
If realistic longitudinal behaviour is what you’re after, participant observation may be the approach for you.

0:08:30.283,0:08:36.419
This approach is just what it sounds like: observing what people actually do in their actual work environment.

0:08:36.419,0:08:40.226
And this more long-term evaluation can be important for uncovering things

0:08:40.226,0:08:44.131
that you might not see in shorter term, more controlled scenarios.

0:08:44.131,0:08:48.015
For example, my colleagues Bob Sutton and Andrew Hargadon studied brainstorming.

0:08:48.015,0:08:51.655
The prior literature on brainstorming had focused mostly on questions like

0:08:51.655,0:08:54.402
“Do people come up with more ideas?”

0:08:54.402,0:08:56.829
What Bob and Andrew realized by going into the field

0:08:56.829,0:09:00.517
was that brainstorming served a number of other functions also,

0:09:00.517,0:09:05.365
like, for example, brainstorming provides a way for members of the design team

0:09:05.365,0:09:08.081
to demonstrate their creativity to their peers;

0:09:08.081,0:09:13.210
it allows them to pass along knowledge that then can be reused in other projects;

0:09:13.210,0:09:19.057
and it creates a fun, exciting environment that people like to work in and that clients like to participate in.

0:09:19.057,0:09:22.206
In a real ecosystem, all of these things are important,

0:09:22.206,0:09:25.514
in addition to just having the ideas that people come up with.

0:09:26.191,0:09:32.908
Nearly all experiments seek to build a theory on some level — I don’t mean anything fancy by this,

0:09:32.908,0:09:37.309
just that we take some things to be more relevant, and other things less relevant.

0:09:37.309,0:09:39.250
We might, for example, assume

0:09:39.250,0:09:43.068
that the ordering of search results may play an important role in what people click on,

0:09:43.068,0:09:46.415
but that the batting average of the Detroit Tigers doesn’t,

0:09:46.415,0:09:49.763
unless, of course, somebody’s searching for baseball.

0:09:49.763,0:09:55.093
If you have a theory that is sufficiently formal mathematically that you can make predictions,

0:09:55.093,0:10:00.037
then you can compare alternative interfaces using that model, without having to bring people in.

0:10:00.037,0:10:05.576
And we’ll go over that in this class a little bit, with respect to input models.

0:10:05.576,0:10:10.072
This makes it possible to try out a number of alternatives really fast.

0:10:10.072,0:10:12.286
Consequently, when people use simulations,

0:10:12.286,0:10:16.378
it’s often in conjunction with something like Monte Carlo optimization.

0:10:16.378,0:10:19.934
One example of this can be found in the ShapeWriter system,

0:10:19.934,0:10:22.735
where Shumin Zhai and colleagues figured out how to build a keyboard

0:10:22.735,0:10:26.122
where people could enter an entire word in a single stroke.

0:10:26.122,0:10:31.247
They were able to do this with the benefit of formal models and optimization-based approaches.
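
To make the idea of comparing interfaces with a formal model more concrete, here is a minimal sketch that uses Fitts's law (in its Shannon formulation) to predict movement time between keys, and then runs a simple random-swap search over key positions. This is an illustration only and is not the method used to build ShapeWriter. The coefficients, the four-key layout, and the letter-pair counts are all made up, and real coefficients would have to be fit to measured pointing data.

```python
import math
import random

# Hypothetical Fitts's law coefficients in seconds; real values must be measured.
A, B = 0.1, 0.15
KEY_WIDTH = 1.0  # key width, in the same arbitrary units as the key positions


def movement_time(distance, width=KEY_WIDTH):
    """Fitts's law, Shannon formulation: MT = a + b * log2(D / W + 1)."""
    return A + B * math.log2(distance / width + 1)


def predicted_cost(layout, digraph_freq):
    """Frequency-weighted mean movement time between consecutive keys.

    layout maps each letter to the (x, y) centre of its key.
    digraph_freq maps (first_letter, second_letter) to a count.
    """
    total = sum(digraph_freq.values())
    cost = 0.0
    for (first, second), freq in digraph_freq.items():
        distance = math.dist(layout[first], layout[second])
        cost += freq * movement_time(distance)
    return cost / total


def random_swap_search(layout, digraph_freq, steps=2000, seed=0):
    """Monte Carlo-style search: try random key swaps, keep only improvements."""
    rng = random.Random(seed)
    best = dict(layout)
    best_cost = predicted_cost(best, digraph_freq)
    keys = sorted(best)
    for _ in range(steps):
        k1, k2 = rng.sample(keys, 2)
        best[k1], best[k2] = best[k2], best[k1]
        cost = predicted_cost(best, digraph_freq)
        if cost < best_cost:
            best_cost = cost
        else:
            best[k1], best[k2] = best[k2], best[k1]  # undo the swap
    return best, best_cost


# Toy data, invented for illustration: four keys in a row and three letter pairs.
layout = {"t": (0, 0), "q": (1, 0), "z": (2, 0), "h": (3, 0)}
digraph_freq = {("t", "h"): 50, ("h", "t"): 10, ("q", "z"): 1}

start_cost = predicted_cost(layout, digraph_freq)
better_layout, better_cost = random_swap_search(layout, digraph_freq)
print(f"predicted mean movement time: {start_cost:.3f}s -> {better_cost:.3f}s")
```

No participants are involved at any point: the model scores each candidate layout, so thousands of alternatives can be tried in well under a second. The predictions are only as trustworthy as the model and its coefficients.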

0:10:31.247,0:10:34.402
Simulation has mostly been used for input techniques

0:10:34.402,0:10:39.795
because people’s motor performance is probably the most well-quantified area of HCI.

0:10:39.795,0:10:42.701
And, while we won’t get to it much in this intro course,

0:10:42.701,0:10:46.266
simulation can also be used for higher-level cognitive tasks;

0:10:46.266,0:10:48.497
for example, Pete Pirolli and colleagues at PARC

0:10:48.497,0:10:51.528
have built impressive models of people’s web-searching behaviour.

0:10:52.467,0:10:57.253
These models enable them to estimate, for example, which links somebody is most likely to click on

0:10:57.253,0:11:00.238
by looking at the relevant link texts.

0:11:00.238,0:11:05.072
That’s our whirlwind tour of a number of empirical methods that this class will introduce.

0:11:05.072,0:11:09.481
You’ll want to pick the right method for the right task, and here are some issues to consider:

0:11:09.481,0:11:13.187
One is reliability: if you did it again, would you get the same thing?

0:11:13.187,0:11:18.544
Another is generalizability and realism — Does this hold for people other than 18-year-old

0:11:18.544,0:11:23.135
upper-middle-class students who are doing this for course credit or a gift certificate?

0:11:23.135,0:11:28.546
Is this behaviour also what you’d see in the real world, or only in a more stilted lab environment?

0:11:28.546,0:11:30.864
Comparisons are important, because they can tell you

0:11:30.879,0:11:34.351
how the user experience would change with different interface choices,

0:11:34.351,0:11:38.553
as opposed to just a “people liked it” study.

0:11:38.553,0:11:42.784
It’s also important to think about how to achieve these insights efficiently,

0:11:42.784,0:11:48.747
and not chew up a lot of resources, especially when your goal is practical.

0:11:48.747,0:11:54.252
My experience as a designer, researcher, teacher, consultant, advisor and mentor has taught me

0:11:54.252,0:12:01.340
that evaluating designs with people is both easier and more valuable than many people expect,

0:12:01.340,0:12:04.704
and there’s an incredible lightbulb moment that happens

0:12:04.704,0:12:08.831
when you actually get designs in front of people and see how they use them.

0:12:08.831,0:12:12.945
So, to sum up this video, I’d like to ask what could be the most important question:

0:12:12.945,9:59:59.000
“What do you want to learn?”