1 00:00:00,000 --> 00:00:02,416 Now there are really 2 components to the analyze stage-- 2 00:00:02,416 --> 00:00:06,081 understanding where your application spends its time 3 00:00:06,081 --> 00:00:09,715 and understanding what you want to do with the additional parallelism. 4 00:00:09,715 --> 00:00:12,534 So understanding hotspots is pretty straightforward. 5 00:00:12,534 --> 00:00:15,299 Often the programmer will have some intuition for this. 6 00:00:15,299 --> 00:00:17,897 They'll have some idea where the time is being spent in their code. 7 00:00:17,897 --> 00:00:21,669 But intuition can be wrong and there's no substitute for data, so don't rely on intuition. 8 00:00:21,669 --> 00:00:25,593 So run your favorite profiler, whether that's GProf or VTune 9 00:00:25,593 --> 00:00:28,278 or the brilliantly named Very Sleepy profiler, 10 00:00:28,278 --> 00:00:32,412 and look at how much time the various functions or lines of code take. 11 00:00:32,412 --> 00:00:36,379 You're going to get some output back that tells you how much time each function takes. 12 00:00:36,379 --> 00:00:39,250 So this function might take 10% of the time, 13 00:00:39,250 --> 00:00:42,604 this function might take 20% of the total time of the program, 14 00:00:42,604 --> 00:00:46,590 and in this case, the program is spending 50% of its time in this 1 function, 15 00:00:46,590 --> 00:00:49,190 so there's a clear hotspot here. 16 00:00:49,190 --> 00:00:51,924 Maybe your intuition would have told you that this is where the time is going. 17 00:00:51,924 --> 00:00:53,837 Often you do know that first hotspot, 18 00:00:53,837 --> 00:00:56,419 but sometimes you're surprised surprised by that second and third hotspot. 19 00:00:56,419 --> 00:00:57,983 Sometimes they're not what you think. 20 00:00:57,983 --> 00:01:00,705 And this gives you a starting point of where to begin parallelizing the code 21 00:01:00,705 --> 00:01:03,998 and an indication as well of how much speedup you can expect to do so. 22 00:01:03,998 --> 00:01:05,576 Let's try that at home with a quiz. 23 00:01:05,576 --> 00:01:08,522 So in this example that we've shown here, the question is 24 00:01:08,522 --> 00:01:14,226 what is the maximum speedup possible that you could hope for if you parallelized this top function, 25 00:01:14,226 --> 00:01:16,158 the function which takes the most time? 26 00:01:16,158 --> 00:01:17,725 Okay, that's this one. 27 00:01:17,725 --> 00:01:22,296 And then I'll say what if you parallelized the top 3 functions? So what if you parallelized all of these?