Determining the Size of the Tiles - Intro to Parallel Programming

  • 0:00 - 0:04
    In this problem you need to divide your work up into chunks; in this case, tiles.
  • 0:04 - 0:08
    We have a continuum between tiny tiles--lots of them,
  • 0:08 - 0:14
    and fewer tiles where each tile is sized to the maximum that can fit in a single thread block.
  • 0:14 - 0:15
    And in this particular problem,
  • 0:15 - 0:18
    bigger tiles mean less memory bandwidth is consumed; this is good.
  • 0:18 - 0:22
    Generally then you want to make your tiles as big as can fit into a single thread block,
  • 0:22 - 0:25
    because that minimizes overall memory bandwidth.
  • 0:25 - 0:27
    But note the following 2 caveats.
  • 0:27 - 0:29
    One, you need to have at least as many thread blocks
  • 0:29 - 0:31
    as you have SMs in your GPU,
  • 0:31 - 0:33
    because otherwise you'll have SMs sitting idle.
  • 0:33 - 0:36
    You definitely want to make sure you fill the machine
  • 0:36 - 0:38
    with enough work to keep all the SMs busy,
  • 0:38 - 0:41
    even if you have to move a little bit this way on the continuum
  • 0:41 - 0:44
    and size your tiles just a little bit smaller.
  • 0:44 - 0:47
    Two, if you're sitting at the right end of this continuum,
  • 0:47 - 0:49
    it's best for overall memory bandwidth,
  • 0:49 - 0:51
    but often it turns out that you would actually prefer
  • 0:51 - 0:53
    to be just maybe 1 tick to the left.
  • 0:53 - 0:55
    This allows a small number,
  • 0:55 - 0:58
    say, 2 blocks, to both be resident at a time,
  • 0:58 - 1:01
    and that potentially gives better latency-hiding characteristics,
  • 1:01 - 1:03
    because you have more warps that may be in flight at the same time
  • 1:03 - 1:06
    from slightly different pieces of the program.
  • 1:06 - 1:09
    It's certainly something that you would want to tune carefully
  • 1:09 - 1:12
    if you needed the fastest possible implementation.
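
The trade-off described above can be sketched with a toy cost model. Everything in it is an assumption for illustration and does not come from the video: it supposes an n x n grid of work split into p x p tiles, one thread block of p threads per tile, and 2p values loaded per tile, so total loads shrink as p grows. The names `tile_stats`, `pick_tile`, `sm_thread_capacity`, and `min_resident`, and the limit of 1024 threads, are hypothetical. Real residency also depends on registers and shared memory, so measured timings should decide the final size.

```python
def tile_stats(n, p, sm_thread_capacity):
    """Rough numbers for one tile width p (toy model, not measured data)."""
    tiles_per_side = -(-n // p)            # ceiling division
    num_blocks = tiles_per_side ** 2       # one thread block per tile
    return {
        "tile": p,
        "blocks": num_blocks,
        # Assumed cost: each p x p tile loads 2*p input values.
        "loads": num_blocks * 2 * p,
        # Assumed residency limit: only the thread count is considered.
        "resident_per_sm": sm_thread_capacity // p,
    }


def pick_tile(n, num_sms, sm_thread_capacity=1024, min_resident=2):
    """Largest power-of-two tile that still satisfies both caveats."""
    p = sm_thread_capacity                 # start at the biggest tile that fits
    while p >= 32:                         # stop at one warp's worth of threads
        s = tile_stats(n, p, sm_thread_capacity)
        enough_blocks = s["blocks"] >= num_sms                 # caveat one
        enough_resident = s["resident_per_sm"] >= min_resident  # caveat two
        if enough_blocks and enough_resident:
            return s
        p //= 2                            # move one tick toward smaller tiles
    return None


if __name__ == "__main__":
    for p in (256, 512, 1024):
        print(tile_stats(4096, p, 1024))
    print("chosen:", pick_tile(4096, num_sms=16))
```

In this model, asking for 2 resident blocks per SM moves the choice one step below the largest tile, at the price of doubling the estimated loads. That is the kind of trade-off the video suggests tuning by experiment.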
Description:

08-18 Determining the Size of the Tiles

Video Language:
English
Team:
Udacity
Project:
CS344 - Intro to Parallel Programming
Duration:
01:12
