## New Enron Feature Solution - Intro to Machine Learning

• 0:00 - 0:03
Here's a visualization of the new feature.
• 0:03 - 0:06
Along the x axis here, I have the number of emails from a person of
• 0:06 - 0:09
interest to a given person in the data set.
• 0:09 - 0:10
Along the y axis,
• 0:10 - 0:14
I have something else that I think might give me some discrimination as well,
• 0:14 - 0:18
which is the number of emails that this person sends to persons of interest.
• 0:18 - 0:23
What I've also done is colored my persons of interest red in the scatter plot,
• 0:23 - 0:26
so I can easily identify if there's some sort of pattern in this feature, if I
• 0:26 - 0:30
start to see clumps of red dots all together, for example.
• 0:30 - 0:32
That would be an indication of something that a supervised learning
• 0:32 - 0:37
algorithm could exploit in trying to predict persons of interest.
• 0:37 - 0:41
And what I see is that there doesn't seem to be a very strong trend here.
• 0:41 - 0:45
The red points seem to be mixed in rather equally with the blue points.
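The plot described above could be sketched roughly as follows. This is only an illustration, not the code used in the video: the field names `from_poi_to_this_person`, `from_this_person_to_poi`, and `poi`, the sample records, and both helper functions are assumptions made for the example. matplotlib is imported lazily so the grouping step can be run on its own.

```python
# Hypothetical records; the names and counts are made up for illustration.
sample_people = [
    {"name": "A", "from_poi_to_this_person": 40, "from_this_person_to_poi": 12, "poi": True},
    {"name": "B", "from_poi_to_this_person": 5, "from_this_person_to_poi": 0, "poi": False},
    {"name": "C", "from_poi_to_this_person": 250, "from_this_person_to_poi": 90, "poi": False},
]


def split_by_poi(people):
    """Split (x, y) points into a POI group and a non-POI group."""
    poi_points, other_points = [], []
    for person in people:
        point = (person["from_poi_to_this_person"], person["from_this_person_to_poi"])
        if person["poi"]:
            poi_points.append(point)
        else:
            other_points.append(point)
    return poi_points, other_points


def plot_poi_scatter(people):
    """Draw POIs in red and everyone else in blue on one scatter plot."""
    import matplotlib.pyplot as plt  # imported here so split_by_poi works without it

    poi_points, other_points = split_by_poi(people)
    for points, color, label in ((other_points, "blue", "non-POI"), (poi_points, "red", "POI")):
        if points:
            xs, ys = zip(*points)
            plt.scatter(xs, ys, color=color, label=label)
    plt.xlabel("emails from a POI to this person")
    plt.ylabel("emails from this person to a POI")
    plt.legend()
    plt.show()


poi_points, other_points = split_by_poi(sample_people)
```

Calling `plot_poi_scatter(sample_people)` would open the figure. Clumps of red points that sit apart from the blue ones would suggest a feature a classifier could use.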
• 0:45 - 0:48
Another thing that I notice is that there are a few outliers.
• 0:48 - 0:52
For most people, we only have maybe less than 100 emails to or
• 0:52 - 0:55
from them, but for some people we have many, many more than that.
• 0:55 - 1:00
So this visualization leads me into the next step of repeating this process.
• 1:00 - 1:05
Using my human intuition to think about what features might be valuable here.
• 1:05 - 1:08
The thing that I'm thinking of at this point is maybe the feature that I
• 1:08 - 1:13
need is not the absolute number of emails from a person of interest to
• 1:13 - 1:14
a given person,
• 1:14 - 1:18
but the fraction of emails that a person receives that come
• 1:18 - 1:19
from a person of interest.
• 1:19 - 1:24
In other words, if you get 80% of your emails from persons of interest,
• 1:24 - 1:27
my intuition might be that you yourself are one.
• 1:27 - 1:30
But of course, I have to actually code up the feature to test this hypothesis.
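A minimal sketch of what coding up that fraction might look like. The function name `compute_fraction`, the field names `from_poi_to_this_person` and `to_messages`, and the convention that missing counts appear as `None` or the string `"NaN"` are all assumptions for this example, not something stated in the video.

```python
def compute_fraction(poi_messages, all_messages):
    """Return poi_messages / all_messages, or 0.0 when a count is missing or zero."""
    missing = (None, "NaN")  # assumed placeholders for missing values
    if poi_messages in missing or all_messages in missing:
        return 0.0
    if all_messages == 0:
        return 0.0  # avoid dividing by zero for people with no received email
    return poi_messages / all_messages


# Hypothetical record: 80 of the 100 emails this person received came from POIs.
person = {"from_poi_to_this_person": 80, "to_messages": 100}
person["fraction_from_poi"] = compute_fraction(
    person["from_poi_to_this_person"], person["to_messages"]
)
```

Returning 0.0 for missing data is a design choice made here for simplicity. It keeps every person in the data set, but it also makes "no data" look the same as "no email from persons of interest", which may or may not be what you want.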
Video Language:
English
Team:
Udacity
Project:
ud120 - Intro to Machine Learning
Duration:
01:31