
I am trying to create a 1D variational autoencoder that takes a 931x1 vector as input, but I have been having trouble with two things:

  1. Getting an output size of 931, since max pooling and upsampling give even sizes
  2. Getting the layer sizes right

This is what I have so far. I added zero padding on both sides of my input array before training (which is why you'll see h+2 for the input: 931 + 2 = 933), and then cropped the output to get a 933 output size as well. Using the 931 input directly gives a 928 output (931 floors down through the three pooling layers to 465 → 232 → 116, and 116 upsampled three times gives 928), and I am not sure of the best way to get back to 931 from there without cropping.
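For reference, the padding step I describe looks roughly like this (a minimal sketch, assuming the raw signal is a NumPy array of shape (931, 1)):

    import numpy as np

    sig = np.random.rand(931, 1).astype("float32")  # placeholder for one 931x1 input

    # Pad one zero at each end of the time axis: 931 + 2 = 933
    padded = np.pad(sig, ((1, 1), (0, 0)), mode="constant")

    # Add the batch dimension the encoder expects: (1, 933, 1)
    padded = padded[np.newaxis, ...]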

    from tensorflow import keras
    from tensorflow.keras import Input, Model
    from tensorflow.keras.layers import (AveragePooling1D, Conv1D, Cropping1D,
                                         Dense, Flatten, MaxPooling1D, Reshape,
                                         UpSampling1D)

    w, h, latent_dim = 1, 931, 2  # values inferred from the summaries below
    # Sampling is the reparameterization layer from the Keras VAE example

    # Encoder
    input_sig = Input(batch_shape=(w, h + 2, 1))
    x = Conv1D(8, 3, activation='relu', padding='same', dilation_rate=2)(input_sig)
    # x = ZeroPadding1D((2, 1))(x)
    x1 = MaxPooling1D(2)(x)
    x2 = Conv1D(4, 3, activation='relu', padding='same', dilation_rate=2)(x1)
    x3 = MaxPooling1D(2)(x2)
    x4 = AveragePooling1D()(x3)
    flat = Flatten()(x4)
    encoder = Dense(2)(flat)  # note: this tensor is shadowed by the Model below
    x = encoder
    z_mean = Dense(latent_dim, name="z_mean")(x)
    z_log_var = Dense(latent_dim, name="z_log_var")(x)
    z = Sampling()([z_mean, z_log_var])
    encoder = Model(input_sig, [z_mean, z_log_var, z], name="encoder")
    encoder.summary()

    # Decoder
    latent_inputs = keras.Input(shape=(latent_dim,))
    # d1 = Dense(464)(latent_inputs)
    d1 = Dense(468)(latent_inputs)
    d2 = Reshape((117, 4))(d1)
    d3 = Conv1D(4, 1, strides=1, activation='relu', padding='same')(d2)
    d4 = UpSampling1D(2)(d3)
    d5 = Conv1D(8, 1, strides=1, activation='relu', padding='same')(d4)
    d6 = UpSampling1D(2)(d5)
    d7 = UpSampling1D(2)(d6)
    d8 = Conv1D(1, 1, strides=1, activation='sigmoid', padding='same')(d7)
    decoded = Cropping1D(cropping=(1, 2))(d8)  # this is the added step
    decoder = Model(latent_inputs, decoded, name="decoder")
    decoder.summary()

This is the summary printed:

Model: "encoder" __________________________________________________________________________________________________ Layer (type) Output Shape Param # Connected to ================================================================================================== input_99 (InputLayer) [(1, 933, 1)] 0 __________________________________________________________________________________________________ conv1d_209 (Conv1D) (1, 933, 8) 32 input_99[0][0] __________________________________________________________________________________________________ max_pooling1d_90 (MaxPooling1D) (1, 466, 8) 0 conv1d_209[0][0] __________________________________________________________________________________________________ conv1d_210 (Conv1D) (1, 466, 4) 100 max_pooling1d_90[0][0] __________________________________________________________________________________________________ max_pooling1d_91 (MaxPooling1D) (1, 233, 4) 0 conv1d_210[0][0] __________________________________________________________________________________________________ average_pooling1d_45 (AveragePo (1, 116, 4) 0 max_pooling1d_91[0][0] __________________________________________________________________________________________________ flatten_45 (Flatten) (1, 464) 0 average_pooling1d_45[0][0] __________________________________________________________________________________________________ dense_89 (Dense) (1, 2) 930 flatten_45[0][0] __________________________________________________________________________________________________ z_mean (Dense) (1, 2) 6 dense_89[0][0] __________________________________________________________________________________________________ z_log_var (Dense) (1, 2) 6 dense_89[0][0] __________________________________________________________________________________________________ sampling_45 (Sampling) (1, 2) 0 z_mean[0][0] z_log_var[0][0] ================================================================================================== Total params: 1,074 Trainable params: 1,074 Non-trainable params: 0 __________________________________________________________________________________________________ Model: "decoder" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= input_100 (InputLayer) [(None, 2)] 0 _________________________________________________________________ dense_90 (Dense) (None, 468) 1404 _________________________________________________________________ reshape_44 (Reshape) (None, 117, 4) 0 _________________________________________________________________ conv1d_211 (Conv1D) (None, 117, 4) 20 _________________________________________________________________ up_sampling1d_117 (UpSamplin (None, 234, 4) 0 _________________________________________________________________ conv1d_212 (Conv1D) (None, 234, 8) 40 _________________________________________________________________ up_sampling1d_118 (UpSamplin (None, 468, 8) 0 _________________________________________________________________ up_sampling1d_119 (UpSamplin (None, 936, 8) 0 _________________________________________________________________ conv1d_213 (Conv1D) (None, 936, 1) 9 _________________________________________________________________ cropping1d_18 (Cropping1D) (None, 933, 1) 0 ================================================================= Total params: 1,473 Trainable params: 1,473 Non-trainable params: 0 ______________________________ 

However, when I try to fit my model, I get the following exception:

    ValueError: Invalid reduction dimension 2 for input with 2 dimensions. for '{{node Sum}} = Sum[T=DT_FLOAT, Tidx=DT_INT32, keep_dims=false](Mean, Sum/reduction_indices)' with input shapes: [1,933], [2] and with computed input tensors: input[1] = <1 2>.

Has anyone experienced this error, or can you see what I am doing wrong in my model construction? I am new at this and not sure where the mistake is.

Note that I have modified this from a working 28x28 MNIST VAE from the Keras documentation.

Thanks in advance


1 Answer


I think the input shape of your autoencoder and its output shape are different: the encoder takes (1, 933, 1) with a fixed batch size of 1, while the decoder produces (None, 933, 1). These should actually be the same.
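One way to make them agree (a minimal sketch, assuming `h = 931` as in the question) is to declare the encoder input with `shape=` instead of `batch_shape=`, so both models carry a `None` batch dimension:

    from tensorflow.keras import Input

    h = 931

    # Give the encoder a None batch dimension, like the decoder already has
    input_sig = Input(shape=(h + 2, 1))  # -> (None, 933, 1)

Separately, the traceback (a [1, 933] tensor being summed over axes (1, 2)) suggests the reconstruction loss copied from the 2D MNIST example also needs adjusting: binary_crossentropy on (batch, 933, 1) tensors already reduces the channel axis, leaving a 2-D result, so the sum over axis=(1, 2) should become axis=1. Roughly:

    import tensorflow as tf
    from tensorflow import keras

    # In the custom train_step from the Keras VAE example (if you kept it):
    reconstruction_loss = tf.reduce_mean(
        tf.reduce_sum(
            keras.losses.binary_crossentropy(data, reconstruction),
            axis=1,  # was axis=(1, 2) for the 28x28 images
        )
    )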

