AI: Training Data & Bias

0:07 - 0:12

Machine learning is only as good as the
training data you put into it.
0:12 - 0:16

So, it's super important to use high-quality data, and lots of it.
0:17 - 0:22

But if data is important, it's worth asking: where does training data come from?
0:22 - 0:26

Often, computers are collecting training data from people like you and me,
0:26 - 0:28

without any effort on our part.
0:28 - 0:31

A video streaming service might keep track of what you watch, then it can recognize patterns
0:32 - 0:36

in that data to recommend what you might want to watch next.
0:37 - 0:43

Other times, you're directly asked to help, like when a website asks you to spot street signs in photos.
0:44 - 0:49

You're providing training data to help a
machine learn to see, and maybe even one day drive.
0:52 - 0:56

Medical researchers can use
medical images as training data to teach
0:57 - 1:00

computers how to recognize and diagnose diseases.
1:00 - 1:06

Machine learning needs hundreds and thousands of images, and training direction from a doctor
1:06 - 1:10

who knows what to look for, before it can correctly identify disease.
1:11 - 1:16

Even with thousands of examples, there can be problems with the computer's predictions.
1:16 - 1:21

If X-ray data is only collected from men, then the computer's predictions may only work for men.
1:22 - 1:26

It may not recognize diseases when
asked to diagnose the X-ray of a woman.
1:27 - 1:31

This blind spot in the training data
creates something called bias.
1:31 - 1:36

Biased data favors some things, and de-prioritizes or excludes others.
1:37 - 1:42

Depending on how training data is collected, who is doing the collecting, and how the data is fed,
1:42 - 1:45

there is a chance that
human bias is included in the data.
1:46 - 1:51

By learning from biased data, the computer may make biased predictions,
1:51 - 1:54

whether the people training the computer
are aware of it or not.
1:55 - 1:58

When you are looking at training data, ask yourself two questions:
1:59 - 2:02

Is this enough data to accurately train a computer?
2:02 - 2:07

And, does this data represent all possible scenarios and users without bias?
2:07 - 2:11

This is where you, as the human trainer, play a crucial role.
2:11 - 2:14

It's up to you to give your machine unbiased data.
2:14 - 2:18

That means collecting tons of examples, from lots of sources.
2:19 - 2:23

Remember, when you pick and choose data for machine learning,
2:23 - 2:27

you're actually programming the algorithm, using training data instead of code.
2:27 - 2:30

The data IS the code.
2:30 - 2:35

The better the data you provide, the better the computer will learn.
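
The two questions from the video (is there enough data, and does it represent everyone?) can be roughly checked before any training happens. The sketch below is only an illustration, not something shown in the video: the function name `audit_groups`, the thresholds `min_count` and `min_share`, and the X-ray counts are all made-up assumptions. It counts how many examples each expected group has and flags groups that are missing or underrepresented.

```python
from collections import Counter


def audit_groups(labels, expected_groups, min_count=100, min_share=0.2):
    """Count examples per group and flag groups that look underrepresented.

    labels          -- one group label per training example (hypothetical metadata)
    expected_groups -- every group the model is supposed to work for
    min_count       -- assumed minimum number of examples per group
    min_share       -- assumed minimum fraction of the dataset per group
    """
    counts = Counter(labels)
    total = len(labels)
    flagged = []
    for group in expected_groups:
        n = counts.get(group, 0)  # a group with no examples counts as 0
        share = n / total if total else 0.0
        if n < min_count or share < min_share:
            flagged.append(group)
    return counts, flagged


# Made-up example: 1,000 X-rays, almost all collected from men.
xray_labels = ["male"] * 950 + ["female"] * 50
counts, flagged = audit_groups(xray_labels, ["male", "female"])
print(counts["male"], counts["female"])
print("Underrepresented groups:", flagged)
```

A check like this only catches gaps in groups you thought to list. Passing it does not prove the data is unbiased, and the right thresholds depend on the problem.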

Title:: AI: Training Data & Bias
Video Language:: English
Team:: Code.org
Project:: How AI Works
Duration:: 02:41
