If you recall from earlier in this unit, the way that we compacted an array of
input elements was to compute the address for each of the input elements to be written into the output array,
and then we would scatter these input elements using these scatter addresses into the output array.
We've computed, for instance, that element 21 would be scattered by address 5 into output location 5.
We're going to have the same goal here with merge, but we're going to use a different algorithm,
and our algorithm is going to rely on both of the input arrays being sorted.
So let's take a little bit of a closer look.
So, mentally, we're going to think about assigning 1 thread per input element in each of the 2 sorted lists.
So in this example, we're going to merge 2 sorted sequences of 4 elements each,
so we would launch 8 threads, each of which would be responsible for 1 input element.
So the goal of each thread is, just like the compact example,
to calculate the position of its element in the final list and then scatter that element there.
So, to start with, let's figure out what we are trying to do.
What we're going to ask you to do is fill in the scatter addresses for these 2 sorts.
This is a very useful and common thing that you need to be able to do as you're building a parallel algorithm,
understanding the scatter pattern that will let you get to the final answer.
Since you're going to generate a dense sorted list of 8 elements,
the scatter addresses in these 8 boxes are going to be the numbers from 0 to 7.
你是否还记得这个单元早些时候的内容,
我们压缩一个输入元素数组的方式
是计算要写入到输出数组中的每个输入元素的地址,
然后我们使用这些分散地址将输入元素分散到输出数组。
例如,我们已经计算出,元素21将按地址5分散到输出位置5。
在这里我们有和归并一样的目标,但我们将使用不同的算法,
而且我们的算法要依赖于两个排序后的输入数组。
所以让我们近距离地看看。
所以,在心里,我们想给这2份已排序列表中的每个输入元素分配1个线程。
所以在此示例中,我们将归并2个排序序列中的每四个元素,
因此,我们将启动8个线程,每个线程将负责1个输入元素。
所以每个线程的目标是,就像压缩的例子一样,
计算其元素在最终列表中的位置,
然后把那个元素分散到那里。
所以,开始前,让我们想想我们想做什么。
我们要让你做的是为这 2个排序列表填写分散地址。
当你在开发一种并行算法时,
这是你需要会做的一种非常有用和普通的事情,
理解分散模式会让你得到最后的答案。
既然你要生成一个8元素的密集排序的列表,
这 8个方框中的分散地址将是从0到7的数字。