Note: I've read How do subsequent convolution layers work? a few times, but it's still difficult to understand because of the parameters $k_1$ and $k_2$ and the many proposals (1, 2.1, 2.2) in the question. This seems to be hard for many other people too, so I don't think I'm the only one (see comments like "I have just struggled with this same question for a few hours"). So here it is formulated with a specific example and no parameters, to make the idea easier to grasp.
Let's say we have a CNN with:
- input: 28x28x1 grayscale images (28x28 pixels, 1 channel)
- 1st convolutional layer with kernel size 3x3 and 32 output feature maps (filters)
- 2nd convolutional layer with kernel size 3x3 and 64 output feature maps (filters)
Keras implementation:
```python
from keras.models import Sequential
from keras.layers import Conv2D

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(Conv2D(64, (3, 3), activation='relu'))
```
Question: how does the 2nd layer work?
More precisely:
for the 1st layer:
- input size: (1, 28, 28, 1)
- weights size: (3, 3, 1, 32) (good to know: the number of weights doesn't depend on the size of the input image)
- output size: (1, 26, 26, 32)
for the 2nd layer:
- input size: (1, 26, 26, 32)
- weights size: (3, 3, 32, 64)
- output size: (1, 24, 24, 64)
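To double-check these numbers, here is a minimal sketch (assuming the standalone keras package; with TensorFlow 2 the same code works via tensorflow.keras) that rebuilds the model above and prints the kernel shapes Keras reports:

```python
from keras.models import Sequential
from keras.layers import Conv2D

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(Conv2D(64, (3, 3), activation='relu'))

# Output shapes: (None, 26, 26, 32) and (None, 24, 24, 64);
# parameter counts: 3*3*1*32 + 32 = 320 and 3*3*32*64 + 64 = 18496
model.summary()

for layer in model.layers:
    # kernel shape is (kernel_h, kernel_w, in_channels, out_channels)
    print(layer.name, layer.get_weights()[0].shape)
# prints (3, 3, 1, 32) for the 1st layer and (3, 3, 32, 64) for the 2nd
```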
How is the latter possible? It seemed to me that, in the 2nd layer, each 26x26 input channel would be convolved with the 3x3 kernel of each of the 64 feature maps, and that this would be done for all 32 input channels (input size for the 2nd layer: (1, 26, 26, 32)).
Thus I had the feeling the output of the 2nd layer should be (1, 24, 24, 32*64).
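To make my expectation concrete, here is a rough NumPy sketch of what I imagined the 2nd layer doing (conv2d_valid is a hypothetical helper written only for this illustration; 'valid' padding, no bias, no activation): convolving each of the 32 input channels separately with each of the 64 kernels, which would give 32*64 feature maps of size 24x24:

```python
import numpy as np

def conv2d_valid(image, kernel):
    # Naive single-channel 'valid' convolution (strictly, cross-correlation,
    # as in most deep learning libraries); purely illustrative.
    h, w = image.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

layer2_input = np.random.rand(26, 26, 32)  # output of the 1st layer, for one sample
kernels = np.random.rand(3, 3, 64)         # what I imagined: one 3x3 kernel per output feature map

# My expectation: convolve every input channel with every kernel separately
maps = [conv2d_valid(layer2_input[:, :, c], kernels[:, :, k])
        for k in range(64) for c in range(32)]
expected_output = np.stack(maps, axis=-1)
print(expected_output.shape)  # (24, 24, 2048), i.e. 32*64 feature maps
```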
How does it work here?