07 August 2014

Today someone asked on Google+

Hello, when computing the gradients CNN, the weights need to be rotated, Why ?

question

I had the same question when I was pouring through code back in the day, so I wanted to clear it up for people once and for all.

Simple answer:

This is just a efficient and clean way of writing things for:

Computing the gradient of a valid 2D convolution w.r.t. the inputs.

There is no magic here

Here’s a detailed explanation with visualization!

Input question Kernel kernel Output output

Section 1: valid convolution (input, kernel)

section1

Section 2: gradient w.r.t. input of valid convolution (input, kernel) = weighted contribution of each input location w.r.t. gradient of Output

section2

Section 3: full convolution(180 degree rotated filter, output)

section3

As you can see, the calculation for the first three elements in section 2 is the same as the first three figures in section 3.

Hope this helps!



blog comments powered by Disqus