Tools to Measure Bandwidth Utilization - Intro to Parallel Programming

Showing Revision 4 created 05/24/2016 by Udacity Robot.

  1. It's important to be able to reason about this the way that I just described to you, right?
  2. So we sort of walked our way through.
  3. We figured out what kind of bandwidth we were getting and what percentage of theoretical peak that was.
  4. We saw that it was really quite low and we said, why would we be getting low bandwidth to global memory?
  5. Well, the first thing you always look at there is coalescing.
  6. And then we inspected the code and convinced ourselves that, yes,
  7. there's bad coalescing happening when we write to the output matrix.
  8. But I also want to make the point that you don't have to do this from scratch every time.
  9. Right? Doing all these calculations is a little bit like rubbing two sticks together to start a fire;
  10. it's good to know how, but there are tools to help you do this.
  11. The tool that we're going to be using is called Nsight.
  12. This is an NVIDIA product; there are also third-party products.
  13. Maybe I'll give some links to those in supplementary material.
  14. And if you're using Linux or a Mac like I'm using, then you'll be using the Nsight Eclipse Edition.
  15. If you were using Windows, you'd be using Nsight Visual Studio Edition.
  16. These are integrated debuggers and profilers; they're full-blown development environments.
  17. The part that we're going to use is called the NVIDIA Visual Profiler, or NVVP.
  18. Let's fire that up now.
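
The hand calculation described at the start of this segment (achieved bandwidth, then percentage of theoretical peak) can be sketched roughly as follows. This is only an illustration, not the instructor's code: the function names are made up, it assumes a kernel that reads and writes each element of an n x n matrix exactly once, and the clock, bus width, and timing values in the example are hypothetical placeholders rather than measurements from the lecture.

```python
def matrix_bytes_moved(n, bytes_per_element=4):
    """Bytes read plus bytes written, assuming each element of an
    n x n matrix is read once and written once (an assumption)."""
    return 2 * n * n * bytes_per_element


def achieved_bandwidth_gbs(bytes_moved, elapsed_seconds):
    """Achieved bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return bytes_moved / elapsed_seconds / 1e9


def theoretical_peak_gbs(effective_memory_clock_hz, bus_width_bits):
    """Peak bandwidth in GB/s: transfers per second times bytes per transfer.
    Assumes the clock given is the effective data rate."""
    return effective_memory_clock_hz * (bus_width_bits / 8) / 1e9


def percent_of_peak(achieved_gbs, peak_gbs):
    """How much of the theoretical peak the kernel actually reaches."""
    return 100.0 * achieved_gbs / peak_gbs


if __name__ == "__main__":
    # Hypothetical values, chosen only to show the arithmetic.
    n = 1024
    elapsed_seconds = 0.67e-3   # placeholder kernel time
    memory_clock_hz = 2.5e9     # placeholder effective memory clock
    bus_width_bits = 128        # placeholder memory bus width

    achieved = achieved_bandwidth_gbs(matrix_bytes_moved(n), elapsed_seconds)
    peak = theoretical_peak_gbs(memory_clock_hz, bus_width_bits)
    print(f"achieved: {achieved:.2f} GB/s")
    print(f"peak:     {peak:.2f} GB/s")
    print(f"percent of peak: {percent_of_peak(achieved, peak):.1f}%")
```

A low percentage from a calculation like this is the cue to check coalescing first. A profiler reports comparable figures without the manual arithmetic.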