We can use PCA as an analogy.
In conv, the forward pass extracts the coefficients of the principal components from the input image, and the backward pass (the one that updates the input) uses (the gradient of) those coefficients to reconstruct a new input image, so that the new image's PC coefficients better match the desired ones.
In deconv, the two passes are reversed: the forward pass reconstructs an image from PC coefficients, and the backward pass updates the PC coefficients given (the gradient of) the image.
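To make the analogy concrete, here is a minimal numpy sketch (shapes and names are made up for illustration, not taken from Caffe): if the rows of a matrix `C` hold orthonormal principal components, then the "conv" direction is multiplication by `C` and the "deconv" direction is multiplication by `C.T`:

```python
import numpy as np

# Hypothetical shapes for illustration; C plays the role of the filter bank.
rng = np.random.default_rng(0)
d, k = 8, 3                           # input size, number of components kept
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
C = Q[:, :k].T                        # (k, d): rows are orthonormal PCs

x = rng.standard_normal(d)            # the "input image"

# "conv" forward: extract PC coefficients from the input.
y = C @ x                             # (k,)

# "deconv" forward (= conv backward w.r.t. the input):
# reconstruct an input from the coefficients.
x_hat = C.T @ y                       # (d,)

# x_hat is the projection of x onto the kept components, so running the
# "conv" direction on it recovers the same coefficients.
print(np.allclose(C @ x_hat, y))      # True
```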
The deconv forward pass performs exactly the conv gradient computation described in this post: http://andrew.gibiansky.com/blog/machine-learning/convolutional-neural-networks/
That's why in the Caffe implementation of deconv (refer to Andrei Pokrovsky's answer), the deconv forward pass calls `backward_cpu_gemm()`, and the backward pass calls `forward_cpu_gemm()`.
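To see why those two routines are interchangeable, here is a small sketch that unrolls a 1-D "valid" convolution into a matrix `C` (the helper `make_conv_matrix` is hypothetical, not Caffe's API); the conv backward pass w.r.t. the input and the deconv forward pass are then the same multiplication by `C.T`:

```python
import numpy as np

# Unroll a 1-D "valid" convolution with kernel w into a matrix C,
# so that conv(x) = C @ x. Illustrative helper, not Caffe code.
def make_conv_matrix(w, n):
    k = len(w)
    m = n - k + 1                       # output length of a valid conv
    C = np.zeros((m, n))
    for i in range(m):
        C[i, i:i + k] = w
    return C

rng = np.random.default_rng(0)
w = rng.standard_normal(3)              # conv kernel
x = rng.standard_normal(10)             # input
C = make_conv_matrix(w, len(x))

y = C @ x                               # conv forward pass
grad_y = rng.standard_normal(len(y))    # upstream gradient on the conv output

# conv backward pass w.r.t. the input: multiply by C.T.
grad_x = C.T @ grad_y

# deconv (transposed conv) forward pass on the same tensor: also C.T.
deconv_out = C.T @ grad_y

print(np.allclose(grad_x, deconv_out))  # True: the two routines coincide
```

This is the symmetry Caffe exploits: rather than writing a second GEMM routine for deconv, it reuses the conv layer's routines with the forward and backward roles swapped.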