
Lecture 1.3: Evaluating Designs (12:15)

  • 0:01 - 0:06
    In many ways, the most creative, challenging, and under-appreciated aspect of interaction design
  • 0:06 - 0:08
    is evaluating designs with people.
  • 0:08 - 0:12
    The insights that you’ll get from testing designs with people
  • 0:12 - 0:16
    can help you get new ideas, make changes, decide wisely, and fix bugs.
  • 0:16 - 0:21
    One reason I think design is such an interesting field is its relationship to truth and objectivity.
  • 0:21 - 0:26
    I find design so incredibly fascinating because we can say more in response to a question like:
  • 0:26 - 0:33
    “How can we measure success?” than “It’s just personal preference” or “Whatever feels right.”
  • 0:33 - 0:37
    At the same time, the answers are more complex and more open-ended, more subjective,
  • 0:37 - 0:42
    and require more wisdom than just a number like 7 or 3.
  • 0:42 - 0:44
    One of the things that we’re going to learn in this class
  • 0:44 - 0:48
    is the different kinds of knowledge that you can get out of different kinds of methods.
  • 0:48 - 0:53
    Why evaluate designs with people? Why learn about how people use interactive systems?
  • 0:53 - 0:58
    I think one major reason for this is that it can be difficult to tell how good a user interface is
  • 0:58 - 1:04
    until you’ve tried it out with actual users, and that’s because clients and designers and developers,
  • 1:04 - 1:07
    they may know too much about the domain and the user interface,
  • 1:07 - 1:11
    or have acquired blinders through designing and building the user interface.
  • 1:11 - 1:15
    At the same time they may not know enough about the user’s actual tasks.
  • 1:15 - 1:21
    And while experience and theory can help, it can still be hard to predict what real users will actually do.
  • 1:22 - 1:25
    You might want to know, “Can people figure out how to use it?”
  • 1:25 - 1:29
    or “Do they swear or giggle when using this interface?”
  • 1:29 - 1:31
    “How does this design compare to that design?”
  • 1:31 - 1:35
    and, “If we changed the interface, how does that change people’s behaviour?”
  • 1:35 - 1:39
    “What new practices might emerge?” “How do things change over time?”
  • 1:39 - 1:45
    These are all great questions to ask about an interface, and the answers to each will come from different methods.
  • 1:45 - 1:50
    Having a broad toolbox of different methods can be especially valuable in emerging areas
  • 1:50 - 1:56
    like mobile and social software where people’s use practices can be particularly context-dependent
  • 1:56 - 2:01
    and also evolve significantly over time in response to how other people use the software
  • 2:01 - 2:03
    through network effects and things like that.
  • 2:03 - 2:09
    To give you a flavour of this, I’d like to quickly run through some common types of empirical research in HCI.
  • 2:09 - 2:12
    The examples I’ll show are mostly published work of one sort or another,
  • 2:12 - 2:14
    because that’s the easiest stuff to share.
  • 2:14 - 2:19
    If you have good examples from current systems out in the world, post them to the forum!
  • 2:19 - 2:21
    I keep an archive of user interface examples,
  • 2:21 - 2:24
    and I and the other students would love to see what you can come up with.
  • 2:24 - 2:27
    One way to learn about the user experience of a design
  • 2:27 - 2:31
    is to bring people into your lab or office and have them try it out.
  • 2:31 - 2:33
    We often call these usability studies.
  • 2:33 - 2:37
    This “watch someone use my interface” approach is a common one in HCI.
  • 2:37 - 2:44
    The basic strategy of traditional user-centred design is to iteratively bring people
  • 2:44 - 2:48
    into your lab or office until you run out of time. And then release.
  • 2:48 - 2:52
    And, if you had deep pockets, these rooms had a one-way glass mirror,
  • 2:52 - 2:55
    and the development team was on the other side.
  • 2:55 - 2:59
    In a leaner environment, this may just mean bringing people into your dorm room or office.
  • 2:59 - 3:02
    You’ll learn a huge amount by doing this.
  • 3:02 - 3:05
    Every single time that I or a student, friend, or colleague
  • 3:05 - 3:08
    has watched somebody use a new interactive system,
  • 3:08 - 3:14
    we learn something, because as designers we get blinders to a system’s quirks, bugs, and false assumptions.
  • 3:15 - 3:20
    However, there are some major shortcomings to this approach.
  • 3:20 - 3:24
    In particular, the setting probably isn’t very ecologically valid.
  • 3:24 - 3:29
    In the real world, people may have different tasks, goals, motivations, and physical settings
  • 3:29 - 3:32
    than your office or lab.
  • 3:32 - 3:35
    This can be especially true for user interfaces that you think people might use on the go,
  • 3:35 - 3:38
    like at a bus stop or while waiting in line.
  • 3:38 - 3:41
    Second, there can be a “please me” experimental bias,
  • 3:41 - 3:44
    where when you bring somebody in to try out a user interface,
  • 3:44 - 3:47
    they know that they’re trying out the technology that you developed
  • 3:47 - 3:51
    and so they may work harder or be nicer
  • 3:51 - 3:55
    than they would if they were using it outside the constraints of a lab setup,
  • 3:55 - 3:58
    without the person who developed it watching right over them.
  • 3:58 - 4:03
    Third, in its most basic form, where you’re trying out just one user interface, there is no comparison point.
  • 4:03 - 4:09
    So while you can track when people laugh, or swear, or smile with joy,
  • 4:09 - 4:12
    you won’t know whether they would’ve laughed more, or sworn less, or smiled more
  • 4:12 - 4:15
    if you’d had a different user interface.
  • 4:15 - 4:18
    And finally it requires bringing people to your physical location.
  • 4:18 - 4:21
    This is often a whole lot harder than a lot of people think.
  • 4:21 - 4:24
    It can be a psychological burden, if nothing else.
  • 4:24 - 4:28
    A very different way of getting feedback from people is to use a survey.
  • 4:28 - 4:31
    Here is an example of a survey that I got recently from San Francisco
  • 4:31 - 4:34
    asking about different street light designs.
  • 4:34 - 4:38
    Surveys are great because you can quickly get feedback from a large number of people.
  • 4:38 - 4:41
    And it’s relatively easy to compare multiple alternatives.
  • 4:41 - 4:44
    You can also automatically tally the results.
  • 4:44 - 4:48
    You don’t even need to build anything; you can just show screen shots or mock-ups.
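As a concrete illustration of automatic tallying — this is a minimal sketch with made-up responses, not data from the actual streetlight survey — counting and ranking multiple-choice survey results takes only a few lines of Python:

```python
# Minimal sketch of automatically tallying survey responses.
# The responses below are hypothetical; they stand in for the kind of
# multiple-choice data a streetlight-design survey might collect.
from collections import Counter

responses = [
    "Design A", "Design C", "Design A", "Design B",
    "Design A", "Design C", "Design C", "Design A",
]

tally = Counter(responses)
total = len(responses)

# Report counts and percentages for each alternative, most popular first.
for design, count in tally.most_common():
    print(f"{design}: {count} votes ({100 * count / total:.0f}%)")
```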
  • 4:48 - 4:51
    One of the things that I’ve learned the hard way, though,
  • 4:51 - 4:55
    is the difference between what people say they’re going to do and what they actually do.
  • 4:55 - 4:59
    Ask people how often they exercise and you’ll probably get a much more optimistic answer
  • 4:59 - 5:02
    than how often they really do exercise.
  • 5:02 - 5:05
    The same holds for the street light example here.
  • 5:05 - 5:09
    Trying to imagine what a number of different street light designs might be like
  • 5:09 - 5:12
    is really different from actually observing them on the street
  • 5:12 - 5:15
    and having them become part of normal everyday life.
  • 5:15 - 5:18
    Still, surveys can be a valuable way to get feedback.
  • 5:18 - 5:20
    Another type of self-report strategy is the focus group.
  • 5:20 - 5:26
    In a focus group, you’ll gather together a small group of people to discuss a design or idea.
  • 5:26 - 5:31
    The fact that focus groups involve a group of people is a double-edged sword.
  • 5:31 - 5:38
    On one hand, people in a group can tease out of one another things that they might not have thought
  • 5:38 - 5:45
    to say on their own; on the other hand, for a variety of psychological reasons, people may be inclined
  • 5:45 - 5:49
    to say polite things or generate answers completely on the spot
  • 5:49 - 5:54
    that are totally uncorrelated with what they believe or what they would actually do.
  • 5:55 - 6:00
    Focus groups can be a particularly problematic method when you are trying to gather data
  • 6:00 - 6:04
    about taboo topics or about cultural biases.
  • 6:04 - 6:07
    With those caveats in mind — right now we’re just making a laundry list —
  • 6:07 - 6:12
    I think that focus groups, like almost any other method, can play an important role in your toolbelt.
  • 6:13 - 6:17
    Our third category of techniques is to get feedback from experts.
  • 6:17 - 6:23
    For example, in this class we’re going to do a bunch of peer critique for your weekly project assignments.
  • 6:23 - 6:25
    In addition to having users try your interface,
  • 6:25 - 6:30
    it can be important to eat your own dog food and use the tools that you built yourself.
  • 6:30 - 6:35
    When you are getting feedback from experts, it can often be helpful to have some kind of structured format,
  • 6:35 - 6:39
    much like the rubrics you’ll see in your project assignments.
  • 6:39 - 6:45
    And, for getting feedback on user interfaces, one common approach to this structured feedback
  • 6:45 - 6:48
    is called heuristic evaluation, and you’ll learn how to do that in this class;
  • 6:48 - 6:51
    it was pioneered by Jakob Nielsen.
  • 6:51 - 6:53
    Our next genre is comparative experiments:
  • 6:53 - 6:58
    taking two or more distinct options and comparing their performance to each other.
  • 6:58 - 7:00
    These comparisons can take place in lots of different ways:
  • 7:00 - 7:04
    They can be in the lab; they can be in the field; they can be online.
  • 7:04 - 7:07
    These experiments can be more-or-less controlled,
  • 7:07 - 7:10
    and they can take place over shorter or longer durations.
  • 7:10 - 7:14
    What you’re trying to learn here is which option is more effective,
  • 7:14 - 7:17
    and, more often, what are the active ingredients,
  • 7:17 - 7:21
    what are the variables that matter in creating the user experience that you seek.
  • 7:22 - 7:27
    Here’s an example: My former PhD student Joel Brandt and his colleagues at Adobe
  • 7:27 - 7:31
    ran a number of studies comparing help interfaces for programmers.
  • 7:32 - 7:38
    In particular they compared a more traditional search-style user interface for finding programming help
  • 7:38 - 7:43
    with a search interface that integrated programming help directly into your environment.
  • 7:43 - 7:47
    By running these comparisons they were able to see how programmers’ behaviour differed
  • 7:47 - 7:51
    based on the changing help user interface.
  • 7:51 - 7:54
    Comparative experiments have an advantage over surveys
  • 7:54 - 7:57
    in that you get to see actual behaviour as opposed to self-report,
  • 7:57 - 8:02
    and they can be better than usability studies because you’re comparing multiple alternatives.
  • 8:02 - 8:07
    This enables you to see what works better or worse, or at least what works differently.
  • 8:07 - 8:10
    I find that comparative feedback is also often much more actionable.
  • 8:11 - 8:14
    However, if you are running controlled experiments online,
  • 8:14 - 8:18
    you don’t get to see much about the person on the other side of the screen.
  • 8:18 - 8:21
    And if you are inviting people into your office or lab,
  • 8:21 - 8:24
    the behaviour you’re measuring might not be very realistic.
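To make the contrast with self-report methods concrete, here is a minimal sketch of analyzing such a comparison. The task-completion times are hypothetical, standing in for logged behaviour from two interface variants, and the analysis is a simple permutation test — not a method from the studies mentioned above:

```python
# Minimal sketch of analyzing a two-condition comparative experiment.
# The task-completion times (in seconds) are hypothetical; the analysis
# is a permutation test on the difference in mean completion time.
import random
from statistics import mean

interface_a = [42.1, 39.5, 47.3, 44.0, 40.8, 43.6, 45.2, 41.9]
interface_b = [36.4, 38.0, 34.7, 39.1, 35.5, 37.8, 33.9, 36.6]

observed_diff = mean(interface_a) - mean(interface_b)
pooled = interface_a + interface_b

# Shuffle the condition labels many times to see how often a difference
# at least this large would arise by chance alone.
n_iterations = 10_000
extreme = 0
for _ in range(n_iterations):
    random.shuffle(pooled)
    diff = mean(pooled[:len(interface_a)]) - mean(pooled[len(interface_a):])
    if abs(diff) >= abs(observed_diff):
        extreme += 1

p_value = extreme / n_iterations
print(f"Mean difference: {observed_diff:.1f} s, p ≈ {p_value:.4f}")
```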
  • 8:24 - 8:30
    If realistic longitudinal behaviour is what you’re after, participant observation may be the approach for you.
  • 8:30 - 8:36
    This approach is just what it sounds like: observing what people actually do in their actual work environment.
  • 8:36 - 8:40
    And this more long-term evaluation can be important for uncovering things
  • 8:40 - 8:44
    that you might not see in shorter term, more controlled scenarios.
  • 8:44 - 8:48
    For example, my colleagues Bob Sutton and Andrew Hargadon studied brainstorming.
  • 8:48 - 8:52
    The prior literature on brainstorming had focused mostly on questions like
  • 8:52 - 8:54
    “Do people come up with more ideas?”
  • 8:54 - 8:57
    What Bob and Andrew realized by going into the field
  • 8:57 - 9:01
    was that brainstorming served a number of other functions also,
  • 9:01 - 9:05
    like, for example, brainstorming provides a way for members of the design team
  • 9:05 - 9:08
    to demonstrate their creativity to their peers;
  • 9:08 - 9:13
    it allows them to pass along knowledge that then can be reused in other projects;
  • 9:13 - 9:19
    and it creates a fun, exciting environment that people like to work in and that clients like to participate in.
  • 9:19 - 9:22
    In a real ecosystem, all of these things are important,
  • 9:22 - 9:26
    in addition to just having the ideas that people come up with.
  • 9:26 - 9:33
    Nearly all experiments seek to build a theory on some level — I don’t mean anything fancy by this,
  • 9:33 - 9:37
    just that we take some things to be more relevant, and other things less relevant.
  • 9:37 - 9:39
    We might, for example, assume
  • 9:39 - 9:43
    that the ordering of search results may play an important role in what people click on,
  • 9:43 - 9:46
    but that the batting average of the Detroit Tigers doesn’t,
  • 9:46 - 9:50
    unless, of course, somebody’s searching for baseball.
  • 9:50 - 9:55
    If you have a theory that is sufficiently formal and mathematical that you can make predictions,
  • 9:55 - 10:00
    then you can compare alternative interfaces using that model, without having to bring people in.
  • 10:00 - 10:06
    And we’ll go over that in this class a little bit, with respect to input models.
  • 10:06 - 10:10
    This makes it possible to try out a number of alternatives really fast.
  • 10:10 - 10:12
    Consequently, when people use simulations,
  • 10:12 - 10:16
    it’s often in conjunction with something like Monte Carlo optimization.
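The lecture doesn’t spell out a model at this point, but the classic example of a well-quantified input model is Fitts’s law, which predicts pointing time from target distance and size. Here is a minimal sketch of pairing such a model with Monte Carlo sampling to compare two design alternatives; the coefficients and pixel ranges are made-up illustrative values, and in practice the coefficients are fit from measured pointing data:

```python
# Minimal sketch of model-based evaluation: predicting pointing time with
# Fitts's law and using Monte Carlo sampling over tasks to compare two
# hypothetical button sizes. A and B are illustrative constants.
import math
import random

A, B = 0.1, 0.15  # hypothetical Fitts's-law coefficients (seconds)

def fitts_time(distance: float, width: float) -> float:
    """Predicted movement time, Shannon formulation: a + b*log2(d/w + 1)."""
    return A + B * math.log2(distance / width + 1)

def simulate(width: float, trials: int = 10_000) -> float:
    """Average predicted time over randomly sampled target distances."""
    total = 0.0
    for _ in range(trials):
        distance = random.uniform(50, 500)  # pixels, a sampled pointing task
        total += fitts_time(distance, width)
    return total / trials

for width in (20, 40):  # two candidate button widths in pixels
    print(f"width {width}px: predicted mean time {simulate(width):.3f} s")
```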
  • 10:16 - 10:20
    One example of this can be found in the ShapeWriter system,
  • 10:20 - 10:23
    where Shumin Zhai and colleagues figured out how to build a keyboard
  • 10:23 - 10:26
    where people could enter an entire word in a single stroke.
  • 10:26 - 10:31
    They were able to do this with the benefit of formal models and optimization-based approaches.
  • 10:31 - 10:34
    Simulation has mostly been used for input techniques
  • 10:34 - 10:40
    because people’s motor performance is probably the most well-quantified area of HCI.
  • 10:40 - 10:43
    And, while we won’t get to it much in this intro course,
  • 10:43 - 10:46
    simulation can also be used for higher-level cognitive tasks;
  • 10:46 - 10:48
    for example, Pete Pirolli and colleagues at PARC
  • 10:48 - 10:52
    have built impressive models of people’s web-searching behaviour.
  • 10:52 - 10:57
    These models enable them to estimate, for example, which links somebody is most likely to click on
  • 10:57 - 11:00
    by looking at the relevant link texts.
  • 11:00 - 11:05
    That’s our whirlwind tour of a number of empirical methods that this class will introduce.
  • 11:05 - 11:09
    You’ll want to pick the right method for the right task, and here are some issues to consider:
  • 11:09 - 11:13
    One is reliability: if you did it again, would you get the same thing?
  • 11:13 - 11:19
    Another is generalizability and realism — Does this hold for people other than 18-year-old
  • 11:19 - 11:23
    upper-middle-class students who are doing this for course credit or a gift certificate?
  • 11:23 - 11:29
    Is this behaviour also what you’d see in the real world, or only in a more stilted lab environment?
  • 11:29 - 11:31
    Comparisons are important, because they can tell you
  • 11:31 - 11:34
    how the user experience would change with different interface choices,
  • 11:34 - 11:39
    as opposed to just a “people liked it” study.
  • 11:39 - 11:43
    It’s also important to think about how to achieve these insights efficiently,
  • 11:43 - 11:49
    and not chew up a lot of resources, especially when your goal is practical.
  • 11:49 - 11:54
    My experience as a designer, researcher, teacher, consultant, advisor and mentor has taught me
  • 11:54 - 12:01
    that evaluating designs with people is both easier and more valuable than many people expect,
  • 12:01 - 12:05
    and there’s an incredible lightbulb moment that happens
  • 12:05 - 12:09
    when you actually get designs in front of people and see how they use them.
  • 12:09 - 12:13
    So, to sum up this video, I’d like to ask what could be the most important question:
  • 12:13 -
    “What do you want to learn?”