Calculate CNN backprop with padding and stride set to 2

by Hide Inada

In this article, I'll discuss how you can calculate backprop for one layer of a convolutional neural network (CNN) when stride is set to two. I would like to thank Mr. Sujit Rai for explaining backpropagation calculation in his article Forward And Backpropagation in Convolutional Neural Network in which backprop calculation for 3x3 input, 2x2 kernel, 2x2 output is discussed. I got an idea to calculate backprop from this article and applied to calculating equations for 4x4 zero-padded input, 3x3 kernel, 2x2 output when stride is set to 2 is used during front prop. I was trying to be careful in coming up with equations and calculations, but if you see any error, please open an issue in my ML repository. Please note that any error in this article is mine.

0. Prerequisite

This article is focused on a specific area of CNN backprop and knowledge in the following areas will be required as a prerequisite:

1. Objectives

The objectives of this article are to calculate the gradient of overall loss with respect to the previous layer's activation as well as the gradient of overall loss with respect to the weights stored in the convolutional kernel of the current layer when front prop of the layer was calculated by the following equation: \begin{equation} Z = zero\_pad(a_{prev}) *_{stride2} W \end{equation} where The matrix form of each variable is given as the following: \begin{equation} Z = \begin{bmatrix} z_{11} & z_{12} \\ z_{21} & z_{22} \end{bmatrix} \end{equation} \begin{equation} a_{prev} = \begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \end{bmatrix} \end{equation} \begin{equation} zero\_pad(a_{prev}) = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & a_{11} & a_{12} & a_{13} & a_{14} & 0 \\ 0 & a_{21} & a_{22} & a_{23} & a_{24} & 0 \\ 0 & a_{31} & a_{32} & a_{33} & a_{34} & 0 \\ 0 & a_{41} & a_{42} & a_{43} & a_{44} & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} \end{equation} \begin{equation} W = \begin{bmatrix} w_{11} & w_{12} & w_{13} \\ w_{21} & w_{22} & w_{23} \\ w_{31} & w_{32} & w_{33} \end{bmatrix} \end{equation} Each element of Z is expressed by the following equations. Note that the convolution kernel W is shifted by two slots after each operation since stride is set to 2: \begin{equation} Z_{11} = 0 \cdot w_{11} + 0 \cdot w_{12} + 0 \cdot w_{13} + 0 \cdot w_{21} + a_{11} \cdot w_{22} + a_{12} \cdot w_{23} + 0 \cdot w_{31} + a_{21} \cdot w_{32} + a_{22} \cdot w_{33} \end{equation} \begin{equation} Z_{12} = 0 \cdot w_{11} + 0 \cdot w_{12} + 0 \cdot w_{13} + a_{12} \cdot w_{21} + a_{13} \cdot w_{22} + a_{14} \cdot w_{23} + a_{22} \cdot w_{31} + a_{23} \cdot w_{32} + a_{24} \cdot w_{33} \end{equation} \begin{equation} Z_{21} = 0 \cdot w_{11} + a_{21} \cdot w_{12} + a_{22} \cdot w_{13} + 0 \cdot w_{21} + a_{31} \cdot w_{22} + a_{32} \cdot w_{23} + 0 \cdot w_{31} + a_{41} \cdot w_{32} + a_{42} \cdot w_{33} \end{equation} \begin{equation} Z_{22} = a_{22} \cdot w_{11} + a_{23} \cdot w_{12} + a_{24} \cdot w_{13} + a_{32} \cdot w_{21} + a_{33} \cdot w_{22} + a_{34} \cdot w_{23} + a_{42} \cdot w_{31} + a_{43} \cdot w_{32} + a_{44} \cdot w_{33} \end{equation} During a backprop, gradients of the loss are propagated from the final layer over each layer of the network. Specifically propagation will take in the form of \( \frac{\partial L}{\partial a} \) for this layer where Once you know \( \frac{\partial L}{\partial a}\), you can apply the following chain rule to calculate the gradient with respect to Z. This spefic calculation is an element-wise operation, so it is straightforward. \begin{equation} \frac{\partial L}{\partial Z} = \frac{\partial L}{\partial a} \cdot \frac{\partial a}{\partial Z} \end{equation} \begin{equation} \frac{\partial L}{\partial Z} = \begin{bmatrix} \frac{\partial L}{\partial z_{11}} & \frac{\partial L}{\partial z_{12}} \\ \frac{\partial L}{\partial z_{21}} & \frac{\partial L}{\partial z_{22}} \\ \end{bmatrix} \end{equation} The objectives of backprop for this layer are to find the values for the following two matrices: \begin{equation} \frac{\partial L}{\partial a_{prev}} = \begin{bmatrix} \frac{\partial L}{\partial a_{11}} & \frac{\partial L}{\partial a_{12}} & \frac{\partial L}{\partial a_{13}} & \frac{\partial L}{\partial a_{14}} \\ \frac{\partial L}{\partial a_{21}} & \frac{\partial L}{\partial a_{22}} & \frac{\partial L}{\partial a_{23}} & \frac{\partial L}{\partial a_{24}} \\ \frac{\partial L}{\partial a_{31}} & \frac{\partial L}{\partial a_{32}} & \frac{\partial L}{\partial a_{33}} & \frac{\partial L}{\partial a_{34}} \\ \frac{\partial L}{\partial a_{41}} & \frac{\partial L}{\partial a_{42}} & \frac{\partial L}{\partial a_{43}} & \frac{\partial L}{\partial a_{44}} \end{bmatrix} \end{equation} \begin{equation} \frac{\partial L}{\partial W} = \begin{bmatrix} \frac{\partial L}{\partial w_{11}} & \frac{\partial L}{\partial w_{12}} & \frac{\partial L}{\partial w_{13}} \\ \frac{\partial L}{\partial w_{21}} & \frac{\partial L}{\partial w_{22}} & \frac{\partial L}{\partial w_{23}} \\ \frac{\partial L}{\partial w_{31}} & \frac{\partial L}{\partial w_{32}} & \frac{\partial L}{\partial w_{33}} \end{bmatrix} \end{equation}

2. Solution

To find \(\frac{\partial L}{\partial a_{prev}}\), we just have to calculate each of 16 elements of the matrix \( (12) \). Note that for all the 16 equations below, the second equation is derived by substituting w for the partial derivative of z with respect to a. This is done by differentiating both sides of equations \( (6) \) through \( (9) \) with respect to \( a \). Let's take an example of \( (6) \). Differentiating both sides of the equation with respect to \( a_{11} \) yields \begin{equation} \frac{\partial z_{11}}{\partial a_{11}} = w_{22} \end{equation} This is because all other terms on the right side will disappear during differentiation.

2.1. Calculate \(\frac{\partial L}{\partial a_{prev}}\)

2.1.1. Calculate 16 elements

\( \frac{\partial L}{\partial a_{11}} \)
\begin{equation} \frac{\partial L}{\partial a_{11}} = \frac{\partial L}{\partial z_{11}} \cdot \frac{\partial z_{11}}{\partial a_{11}} + \frac{\partial L}{\partial z_{12}} \cdot \frac{\partial z_{12}}{\partial a_{11}} + \frac{\partial L}{\partial z_{21}} \cdot \frac{\partial z_{21}}{\partial a_{11}} + \frac{\partial L}{\partial z_{22}} \cdot \frac{\partial z_{22}}{\partial a_{11}} \end{equation} \begin{equation} = \frac{\partial L}{\partial z_{11}} \cdot w_{22} + \frac{\partial L}{\partial z_{12}} \cdot 0 + \frac{\partial L}{\partial z_{21}} \cdot 0 + \frac{\partial L}{\partial z_{22}} \cdot 0 \end{equation} \begin{equation} \frac{\partial L}{\partial a_{11}} = \frac{\partial L}{\partial z_{11}} \cdot w_{22} \end{equation}
\( \frac{\partial L}{\partial a_{12}} \)
\begin{equation} \frac{\partial L}{\partial a_{12}} = \frac{\partial L}{\partial z_{11}} \cdot \frac{\partial z_{11}}{\partial a_{12}} + \frac{\partial L}{\partial z_{12}} \cdot \frac{\partial z_{12}}{\partial a_{12}} + \frac{\partial L}{\partial z_{21}} \cdot \frac{\partial z_{21}}{\partial a_{12}} + \frac{\partial L}{\partial z_{22}} \cdot \frac{\partial z_{22}}{\partial a_{12}} \end{equation} \begin{equation} = \frac{\partial L}{\partial z_{11}} \cdot w_{23} + \frac{\partial L}{\partial z_{12}} \cdot w_{21} + \frac{\partial L}{\partial z_{21}} \cdot 0 + \frac{\partial L}{\partial z_{22}} \cdot 0 \end{equation} \begin{equation} \frac{\partial L}{\partial a_{12}} = \frac{\partial L}{\partial z_{11}} \cdot w_{23} + \frac{\partial L}{\partial z_{12}} \cdot w_{21} \end{equation}
\( \frac{\partial L}{\partial a_{13}} \)
\begin{equation} \frac{\partial L}{\partial a_{13}} = \frac{\partial L}{\partial z_{11}} \cdot \frac{\partial z_{11}}{\partial a_{13}} + \frac{\partial L}{\partial z_{12}} \cdot \frac{\partial z_{12}}{\partial a_{13}} + \frac{\partial L}{\partial z_{21}} \cdot \frac{\partial z_{21}}{\partial a_{13}} + \frac{\partial L}{\partial z_{22}} \cdot \frac{\partial z_{22}}{\partial a_{13}} \end{equation} \begin{equation} = \frac{\partial L}{\partial z_{11}} 0 + \frac{\partial L}{\partial z_{12}} \cdot w_{22} + \frac{\partial L}{\partial z_{21}} \cdot 0 + \frac{\partial L}{\partial z_{22}} \cdot 0 \end{equation} \begin{equation} \frac{\partial L}{\partial a_{13}} = \frac{\partial L}{\partial z_{12}} \cdot w_{22} \end{equation}
\( \frac{\partial L}{\partial a_{14}} \)
\begin{equation} \frac{\partial L}{\partial a_{14}} = \frac{\partial L}{\partial z_{11}} \cdot \frac{\partial z_{11}}{\partial a_{14}} + \frac{\partial L}{\partial z_{12}} \cdot \frac{\partial z_{12}}{\partial a_{14}} + \frac{\partial L}{\partial z_{21}} \cdot \frac{\partial z_{21}}{\partial a_{14}} + \frac{\partial L}{\partial z_{22}} \cdot \frac{\partial z_{22}}{\partial a_{14}} \end{equation} \begin{equation} = \frac{\partial L}{\partial z_{11}} \cdot 0 + \frac{\partial L}{\partial z_{12}} \cdot w_{23} + \frac{\partial L}{\partial z_{21}} \cdot 0 + \frac{\partial L}{\partial z_{22}} \cdot 0 \end{equation} \begin{equation} \frac{\partial L}{\partial a_{14}} = \frac{\partial L}{\partial z_{12}} \cdot w_{23} \end{equation}
\( \frac{\partial L}{\partial a_{21}} \)
\begin{equation} \frac{\partial L}{\partial a_{21}} = \frac{\partial L}{\partial z_{11}} \cdot \frac{\partial z_{11}}{\partial a_{21}} + \frac{\partial L}{\partial z_{12}} \cdot \frac{\partial z_{12}}{\partial a_{21}} + \frac{\partial L}{\partial z_{21}} \cdot \frac{\partial z_{21}}{\partial a_{21}} + \frac{\partial L}{\partial z_{22}} \cdot \frac{\partial z_{22}}{\partial a_{21}} \end{equation} \begin{equation} = \frac{\partial L}{\partial z_{11}} \cdot w_{32} + \frac{\partial L}{\partial z_{12}} \cdot 0 + \frac{\partial L}{\partial z_{21}} \cdot w_{12} + \frac{\partial L}{\partial z_{22}} \cdot 0 \end{equation} \begin{equation} \frac{\partial L}{\partial a_{21}} = \frac{\partial L}{\partial z_{11}} \cdot w_{32} + \frac{\partial L}{\partial z_{21}} \cdot w_{12} \end{equation}
\( \frac{\partial L}{\partial a_{22}} \)
\begin{equation} \frac{\partial L}{\partial a_{22}} = \frac{\partial L}{\partial z_{11}} \cdot \frac{\partial z_{11}}{\partial a_{22}} + \frac{\partial L}{\partial z_{12}} \cdot \frac{\partial z_{12}}{\partial a_{22}} + \frac{\partial L}{\partial z_{21}} \cdot \frac{\partial z_{21}}{\partial a_{22}} + \frac{\partial L}{\partial z_{22}} \cdot \frac{\partial z_{22}}{\partial a_{22}} \end{equation} \begin{equation} \frac{\partial L}{\partial a_{22}} = \frac{\partial L}{\partial z_{11}} \cdot w_{33} + \frac{\partial L}{\partial z_{12}} \cdot w_{31} + \frac{\partial L}{\partial z_{21}} \cdot w_{13} + \frac{\partial L}{\partial z_{22}} \cdot w_{11} \end{equation}
\( \frac{\partial L}{\partial a_{23}} \)
\begin{equation} \frac{\partial L}{\partial a_{23}} = \frac{\partial L}{\partial z_{11}} \cdot \frac{\partial z_{11}}{\partial a_{23}} + \frac{\partial L}{\partial z_{12}} \cdot \frac{\partial z_{12}}{\partial a_{23}} + \frac{\partial L}{\partial z_{21}} \cdot \frac{\partial z_{21}}{\partial a_{23}} + \frac{\partial L}{\partial z_{22}} \cdot \frac{\partial z_{22}}{\partial a_{23}} \end{equation} \begin{equation} = \frac{\partial L}{\partial z_{11}} \cdot 0 + \frac{\partial L}{\partial z_{12}} \cdot w_{32} + \frac{\partial L}{\partial z_{21}} \cdot 0 + \frac{\partial L}{\partial z_{22}} \cdot w_{12} \end{equation} \begin{equation} \frac{\partial L}{\partial a_{23}} = \frac{\partial L}{\partial z_{12}} \cdot w_{32} + \frac{\partial L}{\partial z_{22}} \cdot w_{12} \end{equation}
\( \frac{\partial L}{\partial a_{24}} \)
\begin{equation} \frac{\partial L}{\partial a_{24}} = \frac{\partial L}{\partial z_{11}} \cdot \frac{\partial z_{11}}{\partial a_{24}} + \frac{\partial L}{\partial z_{12}} \cdot \frac{\partial z_{12}}{\partial a_{24}} + \frac{\partial L}{\partial z_{21}} \cdot \frac{\partial z_{21}}{\partial a_{24}} + \frac{\partial L}{\partial z_{22}} \cdot \frac{\partial z_{22}}{\partial a_{24}} \end{equation} \begin{equation} = \frac{\partial L}{\partial z_{11}} \cdot 0 + \frac{\partial L}{\partial z_{12}} \cdot w_{33} + \frac{\partial L}{\partial z_{21}} \cdot 0 + \frac{\partial L}{\partial z_{22}} \cdot w_{13} \end{equation} \begin{equation} \frac{\partial L}{\partial a_{24}} = \frac{\partial L}{\partial z_{12}} \cdot w_{33} + \frac{\partial L}{\partial z_{22}} \cdot w_{13} \end{equation}
\( \frac{\partial L}{\partial a_{31}} \)
\begin{equation} \frac{\partial L}{\partial a_{31}} = \frac{\partial L}{\partial z_{11}} \cdot \frac{\partial z_{11}}{\partial a_{31}} + \frac{\partial L}{\partial z_{12}} \cdot \frac{\partial z_{12}}{\partial a_{31}} + \frac{\partial L}{\partial z_{21}} \cdot \frac{\partial z_{21}}{\partial a_{31}} + \frac{\partial L}{\partial z_{22}} \cdot \frac{\partial z_{22}}{\partial a_{31}} \end{equation} \begin{equation} = \frac{\partial L}{\partial z_{11}} \cdot 0 + \frac{\partial L}{\partial z_{12}} \cdot 0 + \frac{\partial L}{\partial z_{21}} \cdot w_{22} + \frac{\partial L}{\partial z_{22}} 0 \end{equation} \begin{equation} \frac{\partial L}{\partial a_{31}} = \frac{\partial L}{\partial z_{21}} \cdot w_{22} \end{equation}
\( \frac{\partial L}{\partial a_{32}} \)
\begin{equation} \frac{\partial L}{\partial a_{32}} = \frac{\partial L}{\partial z_{11}} \cdot \frac{\partial z_{11}}{\partial a_{32}} + \frac{\partial L}{\partial z_{12}} \cdot \frac{\partial z_{12}}{\partial a_{32}} + \frac{\partial L}{\partial z_{21}} \cdot \frac{\partial z_{21}}{\partial a_{32}} + \frac{\partial L}{\partial z_{22}} \cdot \frac{\partial z_{22}}{\partial a_{32}} \end{equation} \begin{equation} = \frac{\partial L}{\partial z_{11}} \cdot 0 + \frac{\partial L}{\partial z_{12}} \cdot 0 + \frac{\partial L}{\partial z_{21}} \cdot w_{23} + \frac{\partial L}{\partial z_{22}} \cdot w_{21} \end{equation} \begin{equation} \frac{\partial L}{\partial a_{32}} = \frac{\partial L}{\partial z_{21}} \cdot w_{23} + \frac{\partial L}{\partial z_{22}} \cdot w_{21} \end{equation}
\( \frac{\partial L}{\partial a_{33}} \)
\begin{equation} \frac{\partial L}{\partial a_{33}} = \frac{\partial L}{\partial z_{11}} \cdot \frac{\partial z_{11}}{\partial a_{33}} + \frac{\partial L}{\partial z_{12}} \cdot \frac{\partial z_{12}}{\partial a_{33}} + \frac{\partial L}{\partial z_{21}} \cdot \frac{\partial z_{21}}{\partial a_{33}} + \frac{\partial L}{\partial z_{22}} \cdot \frac{\partial z_{22}}{\partial a_{33}} \end{equation} \begin{equation} = \frac{\partial L}{\partial z_{11}} \cdot 0 + \frac{\partial L}{\partial z_{12}} \cdot 0 + \frac{\partial L}{\partial z_{21}} \cdot 0 + \frac{\partial L}{\partial z_{22}} \cdot w_{22} \end{equation} \begin{equation} \frac{\partial L}{\partial a_{33}} = \frac{\partial L}{\partial z_{22}} \cdot w_{22} \end{equation}
\( \frac{\partial L}{\partial a_{34}} \)
\begin{equation} \frac{\partial L}{\partial a_{34}} = \frac{\partial L}{\partial z_{11}} \cdot \frac{\partial z_{11}}{\partial a_{34}} + \frac{\partial L}{\partial z_{12}} \cdot \frac{\partial z_{12}}{\partial a_{34}} + \frac{\partial L}{\partial z_{21}} \cdot \frac{\partial z_{21}}{\partial a_{34}} + \frac{\partial L}{\partial z_{22}} \cdot \frac{\partial z_{22}}{\partial a_{34}} \end{equation} \begin{equation} = \frac{\partial L}{\partial z_{11}} \cdot 0 + \frac{\partial L}{\partial z_{12}} \cdot 0 + \frac{\partial L}{\partial z_{21}} \cdot 0 + \frac{\partial L}{\partial z_{22}} \cdot w_{23} \end{equation} \begin{equation} \frac{\partial L}{\partial a_{34}} = \frac{\partial L}{\partial z_{22}} \cdot w_{23} \end{equation}
\( \frac{\partial L}{\partial a_{41}} \)
\begin{equation} \frac{\partial L}{\partial a_{41}} = \frac{\partial L}{\partial z_{11}} \cdot \frac{\partial z_{11}}{\partial a_{41}} + \frac{\partial L}{\partial z_{12}} \cdot \frac{\partial z_{12}}{\partial a_{41}} + \frac{\partial L}{\partial z_{21}} \cdot \frac{\partial z_{21}}{\partial a_{41}} + \frac{\partial L}{\partial z_{22}} \cdot \frac{\partial z_{22}}{\partial a_{41}} \end{equation} \begin{equation} = \frac{\partial L}{\partial z_{11}} \cdot 0 + \frac{\partial L}{\partial z_{12}} \cdot 0 + \frac{\partial L}{\partial z_{21}} \cdot w_{32} + \frac{\partial L}{\partial z_{22}} 0 \end{equation} \begin{equation} \frac{\partial L}{\partial a_{41}} = \frac{\partial L}{\partial z_{21}} \cdot w_{32} \end{equation}
\( \frac{\partial L}{\partial a_{42}} \)
\begin{equation} \frac{\partial L}{\partial a_{42}} = \frac{\partial L}{\partial z_{11}} \cdot \frac{\partial z_{11}}{\partial a_{42}} + \frac{\partial L}{\partial z_{12}} \cdot \frac{\partial z_{12}}{\partial a_{42}} + \frac{\partial L}{\partial z_{21}} \cdot \frac{\partial z_{21}}{\partial a_{42}} + \frac{\partial L}{\partial z_{22}} \cdot \frac{\partial z_{22}}{\partial a_{42}} \end{equation} \begin{equation} = \frac{\partial L}{\partial z_{11}} \cdot 0 + \frac{\partial L}{\partial z_{12}} \cdot 0 + \frac{\partial L}{\partial z_{21}} \cdot w_{33} + \frac{\partial L}{\partial z_{22}} \cdot w_{31} \end{equation} \begin{equation} \frac{\partial L}{\partial a_{42}} = \frac{\partial L}{\partial z_{21}} \cdot w_{33} + \frac{\partial L}{\partial z_{22}} \cdot w_{31} \end{equation}
\( \frac{\partial L}{\partial a_{43}} \)
\begin{equation} \frac{\partial L}{\partial a_{43}} = \frac{\partial L}{\partial z_{11}} \cdot \frac{\partial z_{11}}{\partial a_{43}} + \frac{\partial L}{\partial z_{12}} \cdot \frac{\partial z_{12}}{\partial a_{43}} + \frac{\partial L}{\partial z_{21}} \cdot \frac{\partial z_{21}}{\partial a_{43}} + \frac{\partial L}{\partial z_{22}} \cdot \frac{\partial z_{22}}{\partial a_{43}} \end{equation} \begin{equation} = \frac{\partial L}{\partial z_{11}} \cdot 0 + \frac{\partial L}{\partial z_{12}} \cdot 0 + \frac{\partial L}{\partial z_{21}} \cdot 0 + \frac{\partial L}{\partial z_{22}} \cdot w_{32} \end{equation} \begin{equation} \frac{\partial L}{\partial a_{43}} = \frac{\partial L}{\partial z_{22}} \cdot w_{32} \end{equation}
\( \frac{\partial L}{\partial a_{44}} \)
\begin{equation} \frac{\partial L}{\partial a_{43}} = \frac{\partial L}{\partial z_{11}} \cdot \frac{\partial z_{11}}{\partial a_{44}} + \frac{\partial L}{\partial z_{12}} \cdot \frac{\partial z_{12}}{\partial a_{44}} + \frac{\partial L}{\partial z_{21}} \cdot \frac{\partial z_{21}}{\partial a_{44}} + \frac{\partial L}{\partial z_{22}} \cdot \frac{\partial z_{22}}{\partial a_{44}} \end{equation} \begin{equation} = \frac{\partial L}{\partial z_{11}} \cdot 0 + \frac{\partial L}{\partial z_{12}} \cdot 0 + \frac{\partial L}{\partial z_{21}} \cdot 0 + \frac{\partial L}{\partial z_{22}} \cdot w_{33} \end{equation} \begin{equation} \frac{\partial L}{\partial a_{44}} = \frac{\partial L}{\partial z_{22}} \cdot w_{33} \end{equation} Now if we put all the elements into a matrix, we will get: \begin{equation} \frac{\partial L}{\partial a_{prev}} = \begin{bmatrix} \frac{\partial L}{\partial z_{11}} \cdot w_{22} & \frac{\partial L}{\partial z_{11}} \cdot w_{23} + \frac{\partial L}{\partial z_{12}} \cdot w_{21} & \frac{\partial L}{\partial z_{12}} \cdot w_{22} & \frac{\partial L}{\partial z_{12}} \cdot w_{23} \\ \frac{\partial L}{\partial z_{11}} \cdot w_{32} + \frac{\partial L}{\partial z_{21}} \cdot w_{12} & \frac{\partial L}{\partial z_{11}} \cdot w_{33} + \frac{\partial L}{\partial z_{12}} \cdot w_{31} + \frac{\partial L}{\partial z_{21}} \cdot w_{13} + \frac{\partial L}{\partial z_{22}} \cdot w_{11} & \frac{\partial L}{\partial z_{12}} \cdot w_{32} + \frac{\partial L}{\partial z_{22}} \cdot w_{12} & \frac{\partial L}{\partial z_{12}} \cdot w_{33} + \frac{\partial L}{\partial z_{22}} \cdot w_{13} \\ \frac{\partial L}{\partial z_{21}} \cdot w_{22} & \frac{\partial L}{\partial z_{21}} \cdot w_{23} + \frac{\partial L}{\partial z_{22}} \cdot w_{21} & \frac{\partial L}{\partial z_{22}} \cdot w_{22} & \frac{\partial L}{\partial z_{22}} \cdot w_{23} \\ \frac{\partial L}{\partial z_{21}} \cdot w_{32} & \frac{\partial L}{\partial z_{21}} \cdot w_{33} + \frac{\partial L}{\partial z_{22}} \cdot w_{31} & \frac{\partial L}{\partial z_{22}} \cdot w_{32} & \frac{\partial L}{\partial z_{22}} \cdot w_{33} \end{bmatrix} \end{equation}

2.1.2. Simplifying the equations

This looks hairy and seems hard to calculate. However, you'll soon see that that's not the case.
Step 1. Interweave \( \frac{\partial L}{\partial z} \) with zeros
First step is to put a row and a column after every element of \( \frac{\partial L}{\partial z} \): \begin{equation} \frac{\partial L}{\partial z} = \begin{bmatrix} \frac{\partial L}{\partial z_{11}} & \frac{\partial L}{\partial z_{12}} \\ \frac{\partial L}{\partial z_{21}} & \frac{\partial L}{\partial z_{22}} \end{bmatrix} \end{equation} \begin{equation} zero \_ interweave(\frac{\partial L}{\partial z}) = \begin{bmatrix} \frac{\partial L}{\partial z_{11}} & 0 & \frac{\partial L}{\partial z_{12}} & 0 \\ 0 & 0 & 0 & 0 \\ \frac{\partial L}{\partial z_{21}} & 0 & \frac{\partial L}{\partial z_{22}} & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \end{equation}
Step 2. zero-pad along the edges
Next step is to put zero along the edges. \begin{equation} zero\_pad(zero \_ interweave(\frac{\partial L}{\partial z})) = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & \frac{\partial L}{\partial z_{11}} & 0 & \frac{\partial L}{\partial z_{12}} & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & \frac{\partial L}{\partial z_{21}} & 0 & \frac{\partial L}{\partial z_{22}} & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ \end{bmatrix} \end{equation}
Step 3. Flip W vertically and horizontally.
\begin{equation} W_{2} = flip_{verthorz}(W) = \begin{bmatrix} w_{33} & w_{32} & w_{31} \\ w_{23} & w_{22} & w_{21} \\ w_{13} & w_{12} & w_{11} \end{bmatrix} \end{equation} where \(flip_{verthorz}(W) \) is an operator to flip the matrix both vertically and horizontally.
Step 4. Convolve two matrices
\begin{equation} \frac{\partial L}{\partial a_{prev}} = zero\_pad(zero \_ interweave(\frac{\partial L}{\partial z})) * W_{2} \end{equation} This yields the same matrix as the equation \( (62) \).

2.2. Calculate \(\frac{\partial L}{\partial W}\)

Calculating \(\frac{\partial L}{\partial W}\) is similar to the steps above. We just have to calculate each of the 9 elements of the matrix \( (13) \) this time.

2.2.1. Calculate 9 elements

\( \frac{\partial L}{\partial w_{11}} \)
\begin{equation} \frac{\partial L}{\partial w_{11}} = \frac{\partial L}{\partial z_{11}} \cdot \frac{\partial z_{11}}{\partial w_{11}} + \frac{\partial L}{\partial z_{12}} \cdot \frac{\partial z_{12}}{\partial w_{11}} + \frac{\partial L}{\partial z_{21}} \cdot \frac{\partial z_{21}}{\partial w_{11}} + \frac{\partial L}{\partial z_{22}} \cdot \frac{\partial z_{22}}{\partial w_{11}} \end{equation} \begin{equation} \frac{\partial L}{\partial w_{11}} = \frac{\partial L}{\partial z_{11}} \cdot 0 + \frac{\partial L}{\partial z_{12}} \cdot 0 + \frac{\partial L}{\partial z_{21}} \cdot 0 + \frac{\partial L}{\partial z_{22}} \cdot a_{22} \end{equation} \begin{equation} \frac{\partial L}{\partial w_{11}} = \frac{\partial L}{\partial z_{22}} \cdot a_{22} \end{equation}
\( \frac{\partial L}{\partial w_{12}} \)
\begin{equation} \frac{\partial L}{\partial w_{12}} = \frac{\partial L}{\partial z_{11}} \cdot \frac{\partial z_{11}}{\partial w_{12}} + \frac{\partial L}{\partial z_{12}} \cdot \frac{\partial z_{12}}{\partial w_{12}} + \frac{\partial L}{\partial z_{21}} \cdot \frac{\partial z_{21}}{\partial w_{12}} + \frac{\partial L}{\partial z_{22}} \cdot \frac{\partial z_{22}}{\partial w_{12}} \end{equation} \begin{equation} \frac{\partial L}{\partial w_{12}} = \frac{\partial L}{\partial z_{11}} \cdot 0 + \frac{\partial L}{\partial z_{12}} \cdot 0 + \frac{\partial L}{\partial z_{21}} \cdot a_{21} + \frac{\partial L}{\partial z_{22}} \cdot a_{23} \end{equation} \begin{equation} \frac{\partial L}{\partial w_{12}} = \frac{\partial L}{\partial z_{21}} \cdot a_{21} + \frac{\partial L}{\partial z_{22}} \cdot a_{23} \end{equation}
\( \frac{\partial L}{\partial w_{13}} \)
\begin{equation} \frac{\partial L}{\partial w_{13}} = \frac{\partial L}{\partial z_{11}} \cdot \frac{\partial z_{11}}{\partial w_{13}} + \frac{\partial L}{\partial z_{12}} \cdot \frac{\partial z_{12}}{\partial w_{13}} + \frac{\partial L}{\partial z_{21}} \cdot \frac{\partial z_{21}}{\partial w_{13}} + \frac{\partial L}{\partial z_{22}} \cdot \frac{\partial z_{22}}{\partial w_{13}} \end{equation} \begin{equation} \frac{\partial L}{\partial w_{13}} = \frac{\partial L}{\partial z_{11}} \cdot 0 + \frac{\partial L}{\partial z_{12}} \cdot 0 + \frac{\partial L}{\partial z_{21}} \cdot a_{22} + \frac{\partial L}{\partial z_{22}} \cdot a_{24} \end{equation} \begin{equation} \frac{\partial L}{\partial w_{13}} = \frac{\partial L}{\partial z_{21}} \cdot a_{22} + \frac{\partial L}{\partial z_{22}} \cdot a_{24} \end{equation}
\( \frac{\partial L}{\partial w_{21}} \)
\begin{equation} \frac{\partial L}{\partial w_{21}} = \frac{\partial L}{\partial z_{11}} \cdot \frac{\partial z_{11}}{\partial w_{21}} + \frac{\partial L}{\partial z_{12}} \cdot \frac{\partial z_{12}}{\partial w_{21}} + \frac{\partial L}{\partial z_{21}} \cdot \frac{\partial z_{21}}{\partial w_{21}} + \frac{\partial L}{\partial z_{22}} \cdot \frac{\partial z_{22}}{\partial w_{21}} \end{equation} \begin{equation} \frac{\partial L}{\partial w_{21}} = \frac{\partial L}{\partial z_{11}} \cdot 0 + \frac{\partial L}{\partial z_{12}} \cdot a_{12} + \frac{\partial L}{\partial z_{21}} \cdot 0 + \frac{\partial L}{\partial z_{22}} \cdot a_{32} \end{equation} \begin{equation} \frac{\partial L}{\partial w_{21}} = \frac{\partial L}{\partial z_{12}} \cdot a_{12} + \frac{\partial L}{\partial z_{22}} \cdot a_{32} \end{equation}
\( \frac{\partial L}{\partial w_{22}} \)
\begin{equation} \frac{\partial L}{\partial w_{22}} = \frac{\partial L}{\partial z_{11}} \cdot \frac{\partial z_{11}}{\partial w_{22}} + \frac{\partial L}{\partial z_{12}} \cdot \frac{\partial z_{12}}{\partial w_{22}} + \frac{\partial L}{\partial z_{21}} \cdot \frac{\partial z_{21}}{\partial w_{22}} + \frac{\partial L}{\partial z_{22}} \cdot \frac{\partial z_{22}}{\partial w_{22}} \end{equation} \begin{equation} \frac{\partial L}{\partial w_{22}} = \frac{\partial L}{\partial z_{11}} \cdot a_{11} + \frac{\partial L}{\partial z_{12}} \cdot a_{13} + \frac{\partial L}{\partial z_{21}} \cdot a_{31} + \frac{\partial L}{\partial z_{22}} \cdot a_{33} \end{equation}
\( \frac{\partial L}{\partial w_{23}} \)
\begin{equation} \frac{\partial L}{\partial w_{23}} = \frac{\partial L}{\partial z_{11}} \cdot \frac{\partial z_{11}}{\partial w_{23}} + \frac{\partial L}{\partial z_{12}} \cdot \frac{\partial z_{12}}{\partial w_{23}} + \frac{\partial L}{\partial z_{21}} \cdot \frac{\partial z_{21}}{\partial w_{23}} + \frac{\partial L}{\partial z_{22}} \cdot \frac{\partial z_{22}}{\partial w_{23}} \end{equation} \begin{equation} \frac{\partial L}{\partial w_{23}} = \frac{\partial L}{\partial z_{11}} \cdot a_{12} + \frac{\partial L}{\partial z_{12}} \cdot a_{14} + \frac{\partial L}{\partial z_{21}} \cdot a_{32} + \frac{\partial L}{\partial z_{22}} \cdot a_{34} \end{equation}
\( \frac{\partial L}{\partial w_{31}} \)
\begin{equation} \frac{\partial L}{\partial w_{21}} = \frac{\partial L}{\partial z_{11}} \cdot \frac{\partial z_{11}}{\partial w_{31}} + \frac{\partial L}{\partial z_{12}} \cdot \frac{\partial z_{12}}{\partial w_{31}} + \frac{\partial L}{\partial z_{21}} \cdot \frac{\partial z_{21}}{\partial w_{31}} + \frac{\partial L}{\partial z_{22}} \cdot \frac{\partial z_{22}}{\partial w_{31}} \end{equation} \begin{equation} \frac{\partial L}{\partial w_{31}} = \frac{\partial L}{\partial z_{11}} \cdot 0 + \frac{\partial L}{\partial z_{12}} \cdot a_{22} + \frac{\partial L}{\partial z_{21}} \cdot 0 + \frac{\partial L}{\partial z_{22}} \cdot a_{42} \end{equation} \begin{equation} \frac{\partial L}{\partial w_{31}} = \frac{\partial L}{\partial z_{12}} \cdot a_{22} + \frac{\partial L}{\partial z_{22}} \cdot a_{42} \end{equation}
\( \frac{\partial L}{\partial w_{32}} \)
\begin{equation} \frac{\partial L}{\partial w_{32}} = \frac{\partial L}{\partial z_{11}} \cdot \frac{\partial z_{11}}{\partial w_{32}} + \frac{\partial L}{\partial z_{12}} \cdot \frac{\partial z_{12}}{\partial w_{32}} + \frac{\partial L}{\partial z_{21}} \cdot \frac{\partial z_{21}}{\partial w_{32}} + \frac{\partial L}{\partial z_{22}} \cdot \frac{\partial z_{22}}{\partial w_{32}} \end{equation} \begin{equation} \frac{\partial L}{\partial w_{32}} = \frac{\partial L}{\partial z_{11}} \cdot a_{21} + \frac{\partial L}{\partial z_{12}} \cdot a_{23} + \frac{\partial L}{\partial z_{21}} \cdot a_{41} + \frac{\partial L}{\partial z_{22}} \cdot a_{43} \end{equation}
\( \frac{\partial L}{\partial w_{33}} \)
\begin{equation} \frac{\partial L}{\partial w_{23}} = \frac{\partial L}{\partial z_{11}} \cdot \frac{\partial z_{11}}{\partial w_{33}} + \frac{\partial L}{\partial z_{12}} \cdot \frac{\partial z_{12}}{\partial w_{33}} + \frac{\partial L}{\partial z_{21}} \cdot \frac{\partial z_{21}}{\partial w_{33}} + \frac{\partial L}{\partial z_{22}} \cdot \frac{\partial z_{22}}{\partial w_{33}} \end{equation} \begin{equation} \frac{\partial L}{\partial w_{33}} = \frac{\partial L}{\partial z_{11}} \cdot a_{22} + \frac{\partial L}{\partial z_{12}} \cdot a_{24} + \frac{\partial L}{\partial z_{21}} \cdot a_{42} + \frac{\partial L}{\partial z_{22}} \cdot a_{44} \end{equation}
\begin{equation} \frac{\partial L}{\partial W} = \begin{bmatrix} \frac{\partial L}{\partial z_{22}} \cdot a_{22} & \frac{\partial L}{\partial z_{21}} \cdot a_{21} + \frac{\partial L}{\partial z_{22}} \cdot a_{23} & \frac{\partial L}{\partial z_{21}} \cdot a_{22} + \frac{\partial L}{\partial z_{22}} \cdot a_{24} \\ \frac{\partial L}{\partial z_{12}} \cdot a_{12} + \frac{\partial L}{\partial z_{22}} \cdot a_{32} & \frac{\partial L}{\partial z_{11}} \cdot a_{11} + \frac{\partial L}{\partial z_{12}} \cdot a_{13} + \frac{\partial L}{\partial z_{21}} \cdot a_{31} + \frac{\partial L}{\partial z_{22}} \cdot a_{33} & \frac{\partial L}{\partial z_{11}} \cdot a_{12} + \frac{\partial L}{\partial z_{12}} \cdot a_{14} + \frac{\partial L}{\partial z_{21}} \cdot a_{32} + \frac{\partial L}{\partial z_{22}} \cdot a_{34} \\ \frac{\partial L}{\partial z_{12}} \cdot a_{22} + \frac{\partial L}{\partial z_{22}} \cdot a_{42} & \frac{\partial L}{\partial z_{11}} \cdot a_{21} + \frac{\partial L}{\partial z_{12}} \cdot a_{23} + \frac{\partial L}{\partial z_{21}} \cdot a_{41} + \frac{\partial L}{\partial z_{22}} \cdot a_{43} & \frac{\partial L}{\partial z_{11}} \cdot a_{22} + \frac{\partial L}{\partial z_{12}} \cdot a_{24} + \frac{\partial L}{\partial z_{21}} \cdot a_{42} + \frac{\partial L}{\partial z_{22}} \cdot a_{44} \end{bmatrix} \end{equation}

2.2.2. Simplifying the equations

This matrix also looks complicated, but we can simplify using the below steps.
Step 1. Interweave \( \frac{\partial L}{\partial z} \) with zeros
First step is to put a row and a column after every element of \( Z \). This operation is the same as the one discussed in 2.1.2. \begin{equation} zero \_ interweave(\frac{\partial L}{\partial z}) = \begin{bmatrix} \frac{\partial L}{\partial z_{11}} & 0 & \frac{\partial L}{\partial z_{12}} & 0 \\ 0 & 0 & 0 & 0 \\ \frac{\partial L}{\partial z_{21}} & 0 & \frac{\partial L}{\partial z_{22}} & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \end{equation}
Step 2. Zero-pad \(a_{prev} \)
Put zeros along the edges of \( a_{prev} \). This is the same operation as in \( (4) \). \begin{equation} zero\_pad(a_{prev}) = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & a_{11} & a_{12} & a_{13} & a_{14} & 0 \\ 0 & a_{21} & a_{22} & a_{23} & a_{24} & 0 \\ 0 & a_{31} & a_{32} & a_{33} & a_{34} & 0 \\ 0 & a_{41} & a_{42} & a_{43} & a_{44} & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} \end{equation}
Step 3. Convolve two matrices
\begin{equation} \frac{\partial L}{\partial W} = zero\_pad(a_{prev}) * zero \_ interweave(\frac{\partial L}{\partial Z}) \end{equation} This yields the same matrix as the equation \( (91) \), and we are done.