[Script Info]
Title:
[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
Dialogue: 0,0:00:00.42,0:00:03.33,Default,,0000,0000,0000,,Welcome to the mini project for feature selection.
Dialogue: 0,0:00:03.33,0:00:06.84,Default,,0000,0000,0000,,In one of the earlier videos in this lesson I told you about when I was working
Dialogue: 0,0:00:06.84,0:00:10.23,Default,,0000,0000,0000,,with the e-mail data, that there was a word that was effectively serving as
Dialogue: 0,0:00:10.23,0:00:14.25,Default,,0000,0000,0000,,a signature on the e-mails and I didn't initially realize it.
Dialogue: 0,0:00:14.25,0:00:17.45,Default,,0000,0000,0000,,Now, the mark of a good machine learner doesn't mean that they never make any
Dialogue: 0,0:00:17.45,0:00:19.69,Default,,0000,0000,0000,,mistakes or that their features are always perfect.
Dialogue: 0,0:00:19.69,0:00:21.82,Default,,0000,0000,0000,,It means that they're on the lookout for ways to check this and
Dialogue: 0,0:00:21.82,0:00:25.16,Default,,0000,0000,0000,,to figure out if there is a bug in there that they need to go in and fix.
Dialogue: 0,0:00:25.16,0:00:28.49,Default,,0000,0000,0000,,So in this case it would mean that there's a type of signature word,
Dialogue: 0,0:00:28.49,0:00:30.49,Default,,0000,0000,0000,,that we would need to go in and remove in order for
Dialogue: 0,0:00:30.49,0:00:34.59,Default,,0000,0000,0000,,us to feel like we were being fair in our supervised classification.
Dialogue: 0,0:00:34.59,0:00:36.60,Default,,0000,0000,0000,,So this was a really big learning experience for me.
Dialogue: 0,0:00:36.60,0:00:39.27,Default,,0000,0000,0000,,So I want to share it with you in this mini project.
Dialogue: 0,0:00:39.27,0:00:42.08,Default,,0000,0000,0000,,I'm going to sort of take you into my head as I was trying to
Dialogue: 0,0:00:42.08,0:00:46.08,Default,,0000,0000,0000,,figure out what was going on that I couldn't overfit this decision tree.
Dialogue: 0,0:00:46.08,0:00:48.76,Default,,0000,0000,0000,,And how I figured out that there was one feature or
Dialogue: 0,0:00:48.76,0:00:51.07,Default,,0000,0000,0000,,a couple features that were responsible for that.
Dialogue: 0,0:00:51.07,0:00:52.08,Default,,0000,0000,0000,,And then, specifically,
Dialogue: 0,0:00:52.08,0:00:55.97,Default,,0000,0000,0000,,how I figured out what words they were and how I removed them.
Dialogue: 0,0:00:55.97,0:00:56.66,Default,,0000,0000,0000,,So that's what you'll be doing in this mini project.