[Script Info]
Title:

[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
Dialogue: 0,0:00:00.28,0:00:03.95,Default,,0000,0000,0000,,Now I want to show you a visualization tool that I found online that I
Dialogue: 0,0:00:03.95,0:00:09.35,Default,,0000,0000,0000,,think does a really great job of helping you see what k-means clustering does.
Dialogue: 0,0:00:09.35,0:00:12.16,Default,,0000,0000,0000,,And that should give you a good intuition for how it works.
Dialogue: 0,0:00:12.16,0:00:15.80,Default,,0000,0000,0000,,So I'd like to give a special shout out to Naftali Harris,
Dialogue: 0,0:00:15.80,0:00:19.89,Default,,0000,0000,0000,,who wrote this visualization and very kindly agreed to let us use it.
Dialogue: 0,0:00:19.89,0:00:22.99,Default,,0000,0000,0000,,I'll put a link to this website in the instructor notes that you can go and
Dialogue: 0,0:00:22.99,0:00:24.64,Default,,0000,0000,0000,,play around with it on your own.
Dialogue: 0,0:00:24.64,0:00:29.72,Default,,0000,0000,0000,,So it starts out by asking me how to pick the initial centroids of my clusters.
Dialogue: 0,0:00:29.72,0:00:32.53,Default,,0000,0000,0000,,I'll start out with Randomly right now.
Dialogue: 0,0:00:32.53,0:00:34.49,Default,,0000,0000,0000,,What kind of data would I like to use?
Dialogue: 0,0:00:34.49,0:00:35.80,Default,,0000,0000,0000,,There are a number of different things here, and
Dialogue: 0,0:00:35.80,0:00:37.50,Default,,0000,0000,0000,,I encourage you to play around with them.
Dialogue: 0,0:00:37.50,0:00:40.63,Default,,0000,0000,0000,,A Gaussian Mixture has been really similar to one of the simple examples we've
Dialogue: 0,0:00:40.63,0:00:41.67,Default,,0000,0000,0000,,done so far.
Dialogue: 0,0:00:41.67,0:00:43.49,Default,,0000,0000,0000,,So Gaussian mixture data looks like this.
Dialogue: 0,0:00:43.49,0:00:45.46,Default,,0000,0000,0000,,These are all the points that we have to classify.
Dialogue: 0,0:00:45.46,0:00:46.54,Default,,0000,0000,0000,,The first question for you is,
Dialogue: 0,0:00:46.54,0:00:49.47,Default,,0000,0000,0000,,how many centroids do you think is the correct number of centroids on this data?