In Problem Set #2 you will be implementing a parallel algorithm for blurring images.
Here is an example of the effect we're talking about.
Here's your original image and here's the image after we apply a blur effect to that original image.
Blurring an image involves averaging a local neighborhood of pixels,
and it is expressed naturally using a parallel stencil operation.
Stencil operations come up all the time in all types of application domains.
This is why we are going to focus on stencils in this homework.
Let's take a closer look at a simple example demonstrating the kind of local averaging that we are talking about here.
Suppose we have the following pixel representation of an image,
and we want to calculate the average intensity value for this pixel right here.
What do we do?
First we take the value of this pixel, and then we add this value to the values of all its neighbors.
So 10, 4, 6, 2, 1, 2, 3, and 6, and once we add up all these values, we take the average.
Since we have 9 pixels here, we multiply the sum by 1/9,
and that is how you would calculate the average intensity value for a pixel in an image.
If we do this operation for every pixel in the image,
we will arrive at a blurred version of the input image.
However, it turns out that an unweighted average of pixels can sometimes look really ugly,
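Here is a minimal sketch of that unweighted average for a single pixel. The function name box_blur_pixel and the 3x3 intensities are made up for illustration (they are not the values from the slide), and the sketch assumes the pixel is not on the image border.

```python
def box_blur_pixel(image, row, col):
    """Unweighted average of the 3x3 neighborhood centered on (row, col).

    Assumes (row, col) is an interior pixel, so all 9 neighbors exist.
    """
    total = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            total += image[row + dr][col + dc]
    # 9 pixels in the neighborhood, so the average is the sum times 1/9.
    return total / 9


# Hypothetical intensities, chosen only to make the arithmetic easy to follow.
image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
center_average = box_blur_pixel(image, 1, 1)
```

Running the same function for every interior pixel gives the blurred image; how to treat border pixels is a separate choice that this sketch leaves out.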
and we can achieve a better looking blur by computing a weighted average of these pixels.
What I mean by weighted average is the following.
Rather than multiplying each pixel value by 1/9, we will multiply each pixel value by its own weight.
So w1 may be different from w2, w2 may be different from w3,
and w3 may be different from w4.
And that is the approach that we will take in Problem Set #2.
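The weighted average can be sketched the same way. The name weighted_blur_pixel and the 3x3 weights below are assumptions for illustration only, not the weights used in the problem set. The weights are chosen to sum to 1, and the pixel is again assumed to be away from the border.

```python
def weighted_blur_pixel(image, weights, row, col):
    """Weighted average of the neighborhood centered on (row, col).

    weights is a square 2D list with odd width; (row, col) is assumed
    to be far enough from the border that every neighbor exists.
    """
    half = len(weights) // 2
    total = 0.0
    for dr in range(-half, half + 1):
        for dc in range(-half, half + 1):
            weight = weights[dr + half][dc + half]
            total += weight * image[row + dr][col + dc]
    return total


# Hypothetical weights: the center counts most, the corners count least.
weights = [
    [0.0625, 0.125, 0.0625],
    [0.125,  0.25,  0.125],
    [0.0625, 0.125, 0.0625],
]
# Hypothetical image: one bright pixel surrounded by dark pixels.
image = [
    [0, 0, 0],
    [0, 16, 0],
    [0, 0, 0],
]
center_value = weighted_blur_pixel(image, weights, 1, 1)
```

With these weights the bright center pixel keeps a quarter of its intensity, whereas the unweighted average would keep only a ninth.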
Here is an image produced by weighted blur,
and here is an image produced by unweighted blur,
and as you can see, the weighted blur
is much smoother than its unweighted counterpart.
In this problem set we will give you a small 2D array that contains weight values between 0 and 1 as follows.
But this is just an example.
The actual weight values that we will use will look like this:
the smooth shape of the weights, as you can see here,
will produce the nice looking blur effect that we saw earlier.
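To give a feel for what smoothly shaped weights between 0 and 1 might look like, here is a sketch that builds them from a Gaussian. That is an assumption: the lecture only says the weights have a smooth shape. The function name gaussian_weights, the width of 5, the sigma of 1.0, and the normalization so the weights sum to 1 are all illustrative choices.

```python
import math


def gaussian_weights(width, sigma):
    """Build a width x width array of weights that fall off smoothly
    from the center. Assumes width is odd. The Gaussian shape is an
    assumption, not something stated in the problem set."""
    half = width // 2
    raw = [
        [math.exp(-(dr * dr + dc * dc) / (2.0 * sigma * sigma))
         for dc in range(-half, half + 1)]
        for dr in range(-half, half + 1)
    ]
    # Normalize so the weights sum to 1, which keeps overall brightness.
    total = sum(sum(row) for row in raw)
    return [[value / total for value in row] for row in raw]


weights = gaussian_weights(5, 1.0)
```

The largest weight sits at the center and the values shrink toward the edges, which is the kind of smooth falloff that tends to avoid the blocky look of an unweighted average.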
And also, here's a note.
We will blur color images by blurring each color channel independently,
and we will include a more detailed mathematical formula on blurring computations in the instructor comments.
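Blurring each channel independently can be sketched as follows. The function names, the list-of-tuples pixel layout, and the step that recombines the channels afterwards are assumptions for illustration; blur_channel stands in for whatever single-channel blur you write.

```python
def separate_channels(pixels):
    """Split a list of (r, g, b) tuples into three per-channel lists."""
    red = [p[0] for p in pixels]
    green = [p[1] for p in pixels]
    blue = [p[2] for p in pixels]
    return red, green, blue


def recombine_channels(red, green, blue):
    """Zip three per-channel lists back into a list of (r, g, b) tuples."""
    return list(zip(red, green, blue))


def blur_color(pixels, blur_channel):
    """Apply the same single-channel blur to R, G, and B separately."""
    red, green, blue = separate_channels(pixels)
    return recombine_channels(
        blur_channel(red), blur_channel(green), blur_channel(blue)
    )


# Hypothetical two-pixel image.
pixels = [(255, 0, 0), (10, 20, 30)]
red, green, blue = separate_channels(pixels)
```

Because no channel reads another channel's values, the three blurs do not depend on each other.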
This is what you need to do for Problem Set #2.
First, you will need to write the actual blur kernel.
Second, you will need to write the kernel that separates the color image into its R, G, and B channels.
Third, you will allocate memory on the device for the filter,
which gives you an opportunity to code CUDA memory copies.
Fourth, you will have to set a correct, and ideally optimal, grid and block size for this problem set.
As you remember from Problem Set #1,
the grid and block sizes have a huge impact on your program's execution time.
So set the sizes correctly and be careful.
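As a reminder of the arithmetic behind a correct grid size, here is a small sketch. It assumes one thread per pixel and a 2D block; the function name grid_size, the 1000 by 700 image, and the 16 by 16 block are illustrative only and are not a claim about the optimal choice.

```python
def grid_size(num_cols, num_rows, block_x, block_y):
    """Number of blocks needed in x and y so that every pixel gets a
    thread, assuming one thread per pixel. Rounds up with integer math."""
    grid_x = (num_cols + block_x - 1) // block_x
    grid_y = (num_rows + block_y - 1) // block_y
    return grid_x, grid_y


# Hypothetical image and block dimensions.
grid = grid_size(1000, 700, 16, 16)
```

Rounding up means a few threads in the last row and column of blocks fall outside the image, so under this scheme the kernel needs a bounds check before it reads or writes a pixel.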
Your submission will be evaluated based on correctness and speed,
but we recommend that you focus on correctness first.
Once your blurring kernel runs correctly, we recommend that you try to make it run faster.
Lastly, we have supplied serial code that you can reference and compare your solution against.
Good luck on writing Problem Set #2.
If you have any questions, feel free to ask in the class forums.