fconv2d <- function(image, kernel, bias) {
  ...
}
CNN Exercises
Question 1: Manual Convolution
In this exercise, your task is to implement a function that manually performs a 2D convolution on a 3D input image using a given 3D kernel. The input is a tensor with dimensions [channels, height, width] and the kernel is a tensor with dimensions [channels, kH, kW]. Your goal is to produce an output tensor of shape [height - kH + 1, width - kW + 1] by applying the convolution operation. Recall that each output element is computed as the sum of the element-wise product of the kernel and the corresponding patch of the input image, plus the bias.
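To make the arithmetic concrete, here is a minimal base R sketch (no torch required) that computes the output shape and a single output element. The toy values and the names `out_dim` and `top_left` are illustrative assumptions, not part of the exercise.

```r
# Toy input with shape [channels, height, width] = [2, 3, 3]; values are arbitrary.
image <- array(1:18, dim = c(2, 3, 3))
# Kernel of ones with shape [channels, kH, kW] = [2, 2, 2],
# so each output element reduces to a plain patch sum.
kernel <- array(1, dim = c(2, 2, 2))
bias <- 0.5

# Output shape: [height - kH + 1, width - kW + 1].
out_dim <- dim(image)[2:3] - dim(kernel)[2:3] + 1
out_dim   # 2 2

# Top-left output element: all channels, rows 1:2, columns 1:2.
patch <- image[, 1:2, 1:2]
top_left <- sum(patch * kernel) + bias
top_left  # 44.5
```

The remaining output elements follow the same pattern with the patch shifted by one row or column at a time.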
You can use the code below to check your implementation against the nn_conv2d module from the torch package.
library(torch)
conv <- nn_conv2d(3, 1, kernel_size = c(3, 3))
kernel <- conv$parameters$weight
bias <- conv$parameters$bias
input <- torch_randn(1, 3, 28, 28)
torch_allclose(
  fconv2d(input$squeeze(), kernel$squeeze(), bias$squeeze()),
  conv(input),
  atol = 1e-5
)
[1] TRUE
Hint
- Allocate a new empty tensor of the correct size to store the output.
- Using a nested loop, iterate over each valid spatial location in the input, multiply the corresponding patch of the input with the kernel and sum the result (torch_sum(patch * kernel)), then add the bias.
Solution
fconv2d <- function(image, kernel, bias) {
  # image is [channels, height, width]; kernel is [channels, kH, kW]
  channels <- image$size(1)
  height <- image$size(2)
  width <- image$size(3)
  kH <- kernel$size(2)
  kW <- kernel$size(3)

  # One output value per valid kernel position
  new_image <- torch_zeros(1, height - kH + 1, width - kW + 1)

  for (i in seq_len(height - kH + 1)) {
    for (j in seq_len(width - kW + 1)) {
      # Patch across all channels with its top-left corner at (i, j)
      patch <- image[.., i:(i + kH - 1), j:(j + kW - 1)]
      new_image[.., i, j] <- torch_sum(patch * kernel) + bias
    }
  }
  new_image
}
Question 2: Be edgey
Construct a 2x2 convolution kernel that extracts the edges of an image. Apply it using the fconv2d function from the previous exercise.
As an input, we use an image from MNIST. You can use the plot_2d_image function from the helper script to plot the image.
library(torchvision)
source(here::here("scripts/helper.R"))
mnist <- mnist_dataset(root = "data", download = TRUE)
Processing...
Done!
image <- mnist$.getitem(13)$x
plot_2d_image(image)
To get started, use the code below and modify the values of the kernel.
kernel <- matrix(c(0.53, 0.34, 0.22, 0.1), byrow = TRUE, nrow = 2)
kernel
     [,1] [,2]
[1,] 0.53 0.34
[2,] 0.22 0.10
kernel <- torch_tensor(kernel)$unsqueeze(1)
imageout <- fconv2d(torch_tensor(image)$unsqueeze(1), kernel, 0)
plot_2d_image(imageout$squeeze())
Solution
# nrow = 2 gives the 2x2 kernel the exercise asks for: bottom row minus top row
edge_kernel <- torch_tensor(matrix(c(-1, -1, 1, 1), byrow = TRUE, nrow = 2))
plot_2d_image(edge_kernel$squeeze())
# Apply the kernel to the MNIST image, as in the starter code
imageout <- fconv2d(torch_tensor(image)$unsqueeze(1), edge_kernel$unsqueeze(1), 0)
plot_2d_image(imageout$squeeze())
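To see why a kernel with values -1, -1, 1, 1 (filled row by row into a 2x2 matrix) responds to edges, the following base R sketch applies it to a small synthetic image. The 4x4 image and the names `img` and `out` are assumptions made for illustration; the loop mirrors the logic of fconv2d for a single channel with no bias.

```r
# Synthetic 4x4 image: top two rows dark (0), bottom two rows bright (1).
img <- matrix(c(0, 0, 0, 0,
                0, 0, 0, 0,
                1, 1, 1, 1,
                1, 1, 1, 1), byrow = TRUE, nrow = 4)

# 2x2 kernel: subtracts the top row of each patch from its bottom row.
edge_kernel <- matrix(c(-1, -1, 1, 1), byrow = TRUE, nrow = 2)

# Slide the kernel over every valid 2x2 patch.
out <- matrix(0, nrow = nrow(img) - 1, ncol = ncol(img) - 1)
for (i in seq_len(nrow(out))) {
  for (j in seq_len(ncol(out))) {
    out[i, j] <- sum(img[i:(i + 1), j:(j + 1)] * edge_kernel)
  }
}
out
```

The output is nonzero only in the middle row, where the patch straddles the dark-to-bright boundary; in uniform regions the positive and negative weights cancel. This particular kernel detects horizontal edges only; a kernel with -1 and 1 arranged in columns would pick up vertical edges instead.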