$\begingroup$

Let's say that we have a CNN with two convolutional layers (https://www.tensorflow.org/tutorials/layers). My question concerns the dimensions of the tensor that pooling layer 1 outputs.

In the first convolutional layer, we apply $32$ filters to the input image (let's say that the output will be $28\times 28 \times 32$), so as far as I can understand we will get $32$ separate feature maps, one per filter.

In the next step, we can apply an activation function, which does not change the dimensionality.

The $\max$ pooling layer takes as input a tensor of $28\times 28\times 32$ and the output is going to be a $14 \times 14 \times \color{red}{1}$ tensor (according to the link above).


I cannot understand why the depth is $1$, since we apply $32$ filters and the $\max$ pooling layer is applied to every feature map separately. So, why is the output tensor $14 \times 14 \times 1$?

According to my understanding, the input to the second convolutional layer should be a $14 \times 14 \times 32$ tensor. I am probably missing something here.

$\endgroup$
  • $\begingroup$Thanks for the edit, it makes sense now. You should have said that the top part of your question comes from the website.$\endgroup$ Commented Jun 11, 2017 at 17:20

1 Answer

$\begingroup$

I kept turning this question over in my head because I come to the same conclusion as you do. It appears to be a mistake in the documentation: the output of pooling layer 1 should be $14 \times 14 \times 32$, not $14 \times 14 \times 1$.

https://stackoverflow.com/questions/43453712/what-is-output-tensor-of-max-pooling-2d-layer-in-tensorflow
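
To check the shape by hand, here is a minimal pure-Python sketch of $2 \times 2$ max pooling with stride $2$. It is not the TensorFlow implementation; the helper names `max_pool_2x2` and `shape` and the dummy activation values are made up for illustration, and the tensor is stored as a list of channels. Pooling acts on each feature map independently, so only the height and width shrink.

```python
def max_pool_2x2(feature_maps):
    """Apply 2x2 max pooling with stride 2 to each feature map independently.

    feature_maps is a list of channels; each channel is a list of rows.
    (Hypothetical helper for illustration, not a TensorFlow API.)
    """
    pooled = []
    for channel in feature_maps:
        height, width = len(channel), len(channel[0])
        pooled.append([
            [
                max(channel[r][c], channel[r][c + 1],
                    channel[r + 1][c], channel[r + 1][c + 1])
                for c in range(0, width - 1, 2)
            ]
            for r in range(0, height - 1, 2)
        ])
    return pooled


def shape(feature_maps):
    """Return (height, width, depth) of a channel-first nested list."""
    return (len(feature_maps[0]), len(feature_maps[0][0]), len(feature_maps))


# Dummy activations standing in for the output of the first convolution:
# 32 feature maps of size 28 x 28 (the values themselves are arbitrary).
conv1_output = [
    [[(r * 28 + c + k) % 7 for c in range(28)] for r in range(28)]
    for k in range(32)
]
pool1_output = max_pool_2x2(conv1_output)

print(shape(conv1_output))  # (28, 28, 32)
print(shape(pool1_output))  # (14, 14, 32)
```

The depth stays at $32$ because the maximum is taken over spatial windows within one feature map, never across feature maps.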

$\endgroup$
  • $\begingroup$Probably, yes. Otherwise, it doesn't make much sense.$\endgroup$ Commented Jun 11, 2017 at 17:30
