English subtitles

  1. The answer: too big. If P has more threads than a thread block is allowed to
  2. have, then we can't use shared memory to share data among all P threads, because
  3. we have to distribute that tile across multiple thread blocks. Another
  4. consideration is making sure that we have at least as many thread blocks as SMs
  5. or else SMs will sit idle.
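
The two considerations above can be written as a quick sanity check. This is only a rough sketch and not something from the lecture: the function name `tile_size_problems` and its parameters are hypothetical, it assumes each tile is handled by one thread block of P threads, and the defaults (1024 threads per block, 16 SMs) are illustrative values that should be replaced with the limits of the actual GPU.

```python
def tile_size_problems(p, num_blocks, max_threads_per_block=1024, num_sms=16):
    """Return the reasons a tile size p is too big (empty list if none apply).

    p                     -- threads needed per tile (assumed: one tile per block)
    num_blocks            -- thread blocks launched for this choice of p
    max_threads_per_block -- hardware limit; 1024 is an assumed default
    num_sms               -- number of SMs on the GPU; 16 is an assumed default
    """
    problems = []
    # A tile that needs more threads than one block allows must be split across
    # blocks, and threads in different blocks cannot share shared memory.
    if p > max_threads_per_block:
        problems.append("P exceeds the per-block thread limit")
    # With fewer blocks than SMs, some SMs have no work at all.
    if num_blocks < num_sms:
        problems.append("fewer thread blocks than SMs")
    return problems


if __name__ == "__main__":
    print(tile_size_problems(256, 64))
    print(tile_size_problems(2048, 64))
    print(tile_size_problems(256, 8))
```

An empty list means neither of the two "too big" conditions applies. It does not say the tile size is the best choice, only that it avoids these two pitfalls.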