11-21 Why Dynamic Parallel Quicksort Is More Efficient

  • 0:01 - 0:04
    Okay, so, more efficient partitioning: that is actually not true. We have
  • 0:04 - 0:07
    not been touching the partition function; the partition function does not have
  • 0:07 - 0:09
    anything to do with the dynamic launches that let me do the recursive
  • 0:09 - 0:15
    parallelism. Launching on-the-fly, however, yes. That does substantially
  • 0:15 - 0:18
    contribute, because I don't have to keep returning to the CPU to do my
  • 0:18 - 0:24
    launches for me. That means I'm communicating less data, and it means that my
  • 0:24 - 0:27
    launches happen immediately when I need them, instead of waiting around until that
  • 0:27 - 0:32
    particular wave of launches is finished. Simpler code, while convenient and
  • 0:32 - 0:37
    probably faster for me to maintain, is not the reason why it actually runs any faster.
  • 0:37 - 0:42
    And finally, greater GPU utilization is probably the cause of the greatest
  • 0:42 - 0:46
    speedup. By launching on the fly, I'm making sure my GPU is always busy. So
  • 0:46 - 0:51
    when one partial sort finishes, it creates two more immediately, keeping my GPU
  • 0:51 - 0:56
    fully stocked up and busy with work. That means more work for my GPU at one
  • 0:56 - 1:00
    time, and my sort ends up faster end to end. In fact, when I've written this
  • 1:00 - 1:04
    program in dynamic parallel form and then in host-launched form, I see pretty
  • 1:04 - 1:09
    much exactly a factor-of-two speedup between the two.
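
To make the "launching on the fly" idea concrete, here is a minimal sketch of a quicksort that launches its own child kernels from the device. It is not the code from the lecture. The names `quicksort_dp`, `insertion_sort`, and `gpu_quicksort`, the cutoffs `MAX_DEPTH` and `SMALL_SEGMENT`, and the single-thread partition are all assumptions made for illustration. The point it tries to show is that each partial sort creates its two sub-sorts immediately, with no return to the CPU between levels.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical cutoffs, chosen only for this sketch.
#define MAX_DEPTH 16      // cap on how deep child launches may nest
#define SMALL_SEGMENT 32  // segments shorter than this are sorted serially

// Serial fallback for short or deeply nested segments.
__device__ void insertion_sort(int *data, int left, int right) {
    for (int i = left + 1; i <= right; ++i) {
        int key = data[i];
        int j = i - 1;
        while (j >= left && data[j] > key) {
            data[j + 1] = data[j];
            --j;
        }
        data[j + 1] = key;
    }
}

// One thread partitions data[left..right] around a middle pivot, then
// launches one child grid per side without going back to the host.
__global__ void quicksort_dp(int *data, int left, int right, int depth) {
    if (depth >= MAX_DEPTH || right - left < SMALL_SEGMENT) {
        insertion_sort(data, left, right);
        return;
    }

    // Simple serial partition (an assumption; the lecture's partition
    // function is not shown in the transcript).
    int pivot = data[left + (right - left) / 2];
    int i = left, j = right;
    while (i <= j) {
        while (data[i] < pivot) ++i;
        while (data[j] > pivot) --j;
        if (i <= j) {
            int tmp = data[i];
            data[i] = data[j];
            data[j] = tmp;
            ++i;
            --j;
        }
    }

    // Launch each side in its own non-blocking stream so the two
    // child sorts can run concurrently.
    if (left < j) {
        cudaStream_t s;
        cudaStreamCreateWithFlags(&s, cudaStreamNonBlocking);
        quicksort_dp<<<1, 1, 0, s>>>(data, left, j, depth + 1);
        cudaStreamDestroy(s);
    }
    if (i < right) {
        cudaStream_t s;
        cudaStreamCreateWithFlags(&s, cudaStreamNonBlocking);
        quicksort_dp<<<1, 1, 0, s>>>(data, i, right, depth + 1);
        cudaStreamDestroy(s);
    }
}

// Host wrapper: a single launch from the CPU; everything else is
// launched on the device. Error checking is omitted for brevity.
void gpu_quicksort(std::vector<int> &v) {
    if (v.size() < 2) return;
    int *d = nullptr;
    size_t bytes = v.size() * sizeof(int);
    cudaMalloc(&d, bytes);
    cudaMemcpy(d, v.data(), bytes, cudaMemcpyHostToDevice);
    quicksort_dp<<<1, 1>>>(d, 0, static_cast<int>(v.size()) - 1, 0);
    cudaDeviceSynchronize();  // waits for the parent and all child grids
    cudaMemcpy(v.data(), d, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d);
}
```

A sketch like this needs a GPU that supports dynamic parallelism (compute capability 3.5 or higher) and relocatable device code, for example `nvcc -rdc=true sort.cu -lcudadevrt`, where `sort.cu` is a placeholder file name. A host-launched version would instead return to the CPU after each level of the recursion and launch the next wave from there, which is the round trip the dynamic version avoids. For large inputs the number of pending child launches can exceed the device runtime's default limit, so treat this as an illustration for modest array sizes.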
Video Language:
English
Team:
Udacity
Project:
CS344 - Intro to Parallel Programming
Duration:
01:09
