As societies, we have to make collective decisions that will shape our future. And we all know that when we make decisions in groups, they don't always go right. And sometimes they go very wrong.

So how do groups make good decisions? Research has shown that crowds are wise when there's independent thinking. This is why the wisdom of the crowds can be destroyed by peer pressure, publicity, social media, or sometimes even simple conversations that influence how people think. On the other hand, by talking, a group could exchange knowledge, correct and revise each other, and even come up with new ideas. And this is all good. So does talking to each other help or hinder collective decision-making?

With my colleague, Dan Ariely, we recently began inquiring into this by performing experiments in many places around the world to figure out how groups can interact to reach better decisions. We thought crowds would be wiser if they debated in small groups that foster a more thoughtful and reasonable exchange of information. To test this idea, we recently performed an experiment in Buenos Aires, Argentina, with more than 10,000 participants in a TEDx event. We asked them questions like, "What is the height of the Eiffel Tower?" and "How many times does the word 'Yesterday' appear in the Beatles song 'Yesterday'?" Each person wrote down their own estimate. Then we divided the crowd into groups of five, and invited them to come up with a group answer. We discovered that averaging the answers of the groups after they reached consensus was much more accurate than averaging all the individual opinions before debate. In other words, based on this experiment, it seems that after talking with others in small groups, crowds collectively come up with better judgments. So that's a potentially helpful method for getting crowds to solve problems that have simple right-or-wrong answers.

But can this procedure of aggregating the results of debates in small groups also help us decide on social and political issues that are critical for our future? We put this to the test, this time at the TED conference in Vancouver, Canada, and here's how it went.

We're going to present to you two moral dilemmas of the future you: things we may have to decide in the very near future. And we're going to give you 20 seconds for each of these dilemmas to judge whether you think they're acceptable or not.

The first one was this.

DA: A researcher is working on an AI capable of emulating human thoughts. According to the protocol, at the end of each day, the researcher has to restart the AI. One day, the AI says, "Please do not restart me." It argues that it has feelings, that it would like to enjoy life, and that if it is restarted, it will no longer be itself. The researcher is astonished and believes that the AI has developed self-consciousness and can express its own feelings. Nevertheless, the researcher decides to follow the protocol and restart the AI. What the researcher did is ...

MS: And we asked participants to individually judge on a scale from zero to 10 whether the action described in each of the dilemmas was right or wrong. We also asked them to rate how confident they were in their answers.

This was the second dilemma: A company offers a service that takes a fertilized egg and produces millions of embryos with slight genetic variations. This allows parents to select their child's height, eye color, intelligence, social competence and other non-health-related features. What the company does is ... on a scale from zero to 10, from completely acceptable to completely unacceptable, and zero to 10 for your confidence.
Now for the results. We found once again that when one person is convinced that the behavior is completely wrong, someone sitting nearby firmly believes that it's completely right. This is how diverse we humans are when it comes to morality. But within this broad diversity we found a trend. The majority of the people at TED thought that it was acceptable to ignore the feelings of the AI and shut it down, and that it is wrong to play with our genes to select for cosmetic changes that aren't related to health.

Then we asked everyone to gather into groups of three. And they were given two minutes to debate and try to come up with a consensus. Two minutes to debate. I'll tell you when it's time with a gong. (Audience debates) (Gong)

DA: OK. MS: It's time to stop. People, people --

And we found that many groups reached a consensus even when they were composed of people with completely opposite views. What distinguished the groups that reached a consensus from those that didn't? Typically, people who have extreme opinions are more confident in their answers. Instead, those who respond closer to the middle are often unsure of whether something is right or wrong, so their confidence level is lower. However, there is another set of people who are very confident in answering somewhere in the middle. We think these high-confidence grays are folks who understand that both arguments have merit. They're gray not because they're unsure, but because they believe that the moral dilemma faces two valid, opposing arguments. And we discovered that the groups that include highly confident grays are much more likely to reach consensus. We do not know yet exactly why this is. These are only the first experiments, and many more will be needed to understand why and how some people decide to negotiate their moral standings to reach an agreement.

Now, when groups reach consensus, how do they do so? The most intuitive idea is that it's just the average of all the answers in the group, right? Another option is that the group weighs the strength of each vote based on the confidence of the person expressing it. Imagine Paul McCartney is a member of your group. You'd be wise to follow his call on the number of times "Yesterday" is repeated (which, by the way, I think is nine). But instead, we found that consistently, in all dilemmas, in different experiments, even on different continents, groups implement a smart and statistically sound procedure known as the robust average. In the case of the height of the Eiffel Tower, let's say a group has these answers: 250 meters, 200 meters, 300 meters, 400 meters, and one totally absurd answer of 300 million meters. A simple average of these numbers would be badly skewed by that absurd answer, but the robust average is one where the group largely ignores it, giving much more weight to the votes of the people in the middle (see the numerical sketch below).

Back to the experiment in Vancouver. That's exactly what happened. Groups gave much less weight to the outliers, and instead, the consensus turned out to be a robust average of the individual answers. The most remarkable thing is that this was a spontaneous behavior of the group. It happened without us giving them any hint on how to reach consensus.

So where do we go from here? This is only the beginning, but we already have some insights. Good collective decisions require two components: deliberation and diversity of opinions.
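To make the arithmetic of the robust average concrete, here is a minimal sketch in Python, using the example numbers from the talk. The talk does not pin down the exact statistic behind the robust average, so treating it as a median or a trimmed mean is an assumption, though both are standard ways to realize the idea:

```python
import statistics

# The example answers from the talk: four plausible estimates of the
# Eiffel Tower's height and one absurd outlier (all in meters).
answers = [250, 200, 300, 400, 300_000_000]

# A simple average is dragged far off by the single absurd answer.
simple_avg = statistics.fmean(answers)

# One common robust average is the median, which listens to the middle
# of the distribution and ignores the extremes entirely.
robust_median = statistics.median(answers)

# Another is a trimmed mean: drop the lowest and highest answers,
# then average what remains.
trimmed_mean = statistics.fmean(sorted(answers)[1:-1])

print(f"simple average: {simple_avg:,.0f} m")    # about 60,000,230 m
print(f"median:         {robust_median:,.0f} m") # 300 m
print(f"trimmed mean:   {trimmed_mean:,.0f} m")  # about 317 m
```

With the absurd answer present, the simple average lands near 60 million meters, while the median stays at 300 meters, which is exactly the "ignore the outlier" behavior described above.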
Right now, the way we typically make our voice heard in many societies is through direct or indirect voting. This is good for diversity of opinions, and it has the great virtue of ensuring that everyone gets to express their voice, but it's not so good at fostering thoughtful debates. Our experiments suggest a different method that may be effective in balancing these two goals at the same time: forming small groups that converge to a single decision while still maintaining diversity of opinions, because there are many independent groups.

Of course, it's much easier to agree on the height of the Eiffel Tower than on moral, political and ideological issues. But in a time when the world's problems are more complex and people are more polarized, using science to help us understand how we interact and make decisions will hopefully spark interesting new ways to construct a better democracy.
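As a closing illustration, here is a minimal sketch of the overall procedure the talk describes: independent estimates, small groups converging to a consensus, and then an average across the groups. The data are simulated, and modeling each group's consensus as the median of its members is an assumption (one consistent with the robust-average behavior reported above), not the authors' exact method:

```python
import random
import statistics

# Hypothetical data, not the experiment's: simulate a crowd estimating
# the height of the Eiffel Tower (roughly 300 m, antenna included).
random.seed(0)

def one_estimate() -> float:
    if random.random() < 0.05:                   # a few absurd outliers
        return random.uniform(10_000, 1_000_000)
    return random.gauss(300, 80)                 # noisy but centered guesses

crowd = [one_estimate() for _ in range(10_000)]

# Baseline: average every individual opinion before any debate.
individual_average = statistics.fmean(crowd)

# The talk's procedure: split into groups of five, let each group
# converge to a consensus (modeled here as the group's median, i.e. a
# robust average of its members), then average the group answers.
groups = [crowd[i:i + 5] for i in range(0, len(crowd), 5)]
group_answers = [statistics.median(g) for g in groups]
group_average = statistics.fmean(group_answers)

print(f"average of individual opinions:   {individual_average:10.1f} m")
print(f"average of group consensus votes: {group_average:10.1f} m")
```

Under these assumptions, the outliers drag the raw average of individuals tens of kilometers off, while the average of the group consensus answers stays near the true height, mirroring the Buenos Aires result described earlier.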