
I am trying to create a 1D variational autoencoder that takes a 931x1 vector as input, but I have been having trouble with two things:

  1. Getting an output size of 931, since max pooling and upsampling give even sizes
  2. Getting the layer sizes right

This is what I have so far. I added zero padding on both sides of my input array before training (which is why you'll see h+2 for the input: 931 + 2 = 933), and then cropped the output to get a 933 output size as well. Using the 931 input directly gives a 928 output (931 floors down through the three pooling layers to 465 → 232 → 116, and 116 upsampled three times gives 928), and I am not sure of the best way to get back to 931 from there without cropping.
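For reference, the padding step I describe looks roughly like this (a minimal sketch, assuming the raw signal is a NumPy array of shape (931, 1)):

    import numpy as np

    sig = np.random.rand(931, 1).astype("float32")  # placeholder for one 931x1 input

    # Pad one zero at each end of the time axis: 931 + 2 = 933
    padded = np.pad(sig, ((1, 1), (0, 0)), mode="constant")

    # Add the batch dimension the encoder expects: (1, 933, 1)
    padded = padded[np.newaxis, ...]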

    from tensorflow import keras
    from tensorflow.keras import Input, Model
    from tensorflow.keras.layers import (AveragePooling1D, Conv1D, Cropping1D,
                                         Dense, Flatten, MaxPooling1D, Reshape,
                                         UpSampling1D)

    w, h, latent_dim = 1, 931, 2  # values inferred from the summaries below
    # Sampling is the reparameterization layer from the Keras VAE example

    # Encoder
    input_sig = Input(batch_shape=(w, h + 2, 1))
    x = Conv1D(8, 3, activation='relu', padding='same', dilation_rate=2)(input_sig)
    # x = ZeroPadding1D((2, 1))(x)
    x1 = MaxPooling1D(2)(x)
    x2 = Conv1D(4, 3, activation='relu', padding='same', dilation_rate=2)(x1)
    x3 = MaxPooling1D(2)(x2)
    x4 = AveragePooling1D()(x3)
    flat = Flatten()(x4)
    encoder = Dense(2)(flat)  # note: this tensor is shadowed by the Model below
    x = encoder
    z_mean = Dense(latent_dim, name="z_mean")(x)
    z_log_var = Dense(latent_dim, name="z_log_var")(x)
    z = Sampling()([z_mean, z_log_var])
    encoder = Model(input_sig, [z_mean, z_log_var, z], name="encoder")
    encoder.summary()

    # Decoder
    latent_inputs = keras.Input(shape=(latent_dim,))
    # d1 = Dense(464)(latent_inputs)
    d1 = Dense(468)(latent_inputs)
    d2 = Reshape((117, 4))(d1)
    d3 = Conv1D(4, 1, strides=1, activation='relu', padding='same')(d2)
    d4 = UpSampling1D(2)(d3)
    d5 = Conv1D(8, 1, strides=1, activation='relu', padding='same')(d4)
    d6 = UpSampling1D(2)(d5)
    d7 = UpSampling1D(2)(d6)
    d8 = Conv1D(1, 1, strides=1, activation='sigmoid', padding='same')(d7)
    decoded = Cropping1D(cropping=(1, 2))(d8)  # this is the added step
    decoder = Model(latent_inputs, decoded, name="decoder")
    decoder.summary()

This is the summary printed:

Model: "encoder" __________________________________________________________________________________________________ Layer (type) Output Shape Param # Connected to ================================================================================================== input_99 (InputLayer) [(1, 933, 1)] 0 __________________________________________________________________________________________________ conv1d_209 (Conv1D) (1, 933, 8) 32 input_99[0][0] __________________________________________________________________________________________________ max_pooling1d_90 (MaxPooling1D) (1, 466, 8) 0 conv1d_209[0][0] __________________________________________________________________________________________________ conv1d_210 (Conv1D) (1, 466, 4) 100 max_pooling1d_90[0][0] __________________________________________________________________________________________________ max_pooling1d_91 (MaxPooling1D) (1, 233, 4) 0 conv1d_210[0][0] __________________________________________________________________________________________________ average_pooling1d_45 (AveragePo (1, 116, 4) 0 max_pooling1d_91[0][0] __________________________________________________________________________________________________ flatten_45 (Flatten) (1, 464) 0 average_pooling1d_45[0][0] __________________________________________________________________________________________________ dense_89 (Dense) (1, 2) 930 flatten_45[0][0] __________________________________________________________________________________________________ z_mean (Dense) (1, 2) 6 dense_89[0][0] __________________________________________________________________________________________________ z_log_var (Dense) (1, 2) 6 dense_89[0][0] __________________________________________________________________________________________________ sampling_45 (Sampling) (1, 2) 0 z_mean[0][0] z_log_var[0][0] ================================================================================================== Total params: 1,074 Trainable params: 1,074 Non-trainable params: 0 __________________________________________________________________________________________________ Model: "decoder" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= input_100 (InputLayer) [(None, 2)] 0 _________________________________________________________________ dense_90 (Dense) (None, 468) 1404 _________________________________________________________________ reshape_44 (Reshape) (None, 117, 4) 0 _________________________________________________________________ conv1d_211 (Conv1D) (None, 117, 4) 20 _________________________________________________________________ up_sampling1d_117 (UpSamplin (None, 234, 4) 0 _________________________________________________________________ conv1d_212 (Conv1D) (None, 234, 8) 40 _________________________________________________________________ up_sampling1d_118 (UpSamplin (None, 468, 8) 0 _________________________________________________________________ up_sampling1d_119 (UpSamplin (None, 936, 8) 0 _________________________________________________________________ conv1d_213 (Conv1D) (None, 936, 1) 9 _________________________________________________________________ cropping1d_18 (Cropping1D) (None, 933, 1) 0 ================================================================= Total params: 1,473 Trainable params: 1,473 Non-trainable params: 0 ______________________________ 

However, when I try to fit my model, I get the following exception:

    ValueError: Invalid reduction dimension 2 for input with 2 dimensions. for '{{node Sum}} = Sum[T=DT_FLOAT, Tidx=DT_INT32, keep_dims=false](Mean, Sum/reduction_indices)' with input shapes: [1,933], [2] and with computed input tensors: input[1] = <1 2>.

Has anyone experienced this error, or can you see what I am doing wrong in my model construction? I am new at this and not sure where the mistake is.

Note that I have modified this from a working 28x28 MNIST VAE from the Keras documentation.

Thanks in advance


1 Answer


I think the input shape of your autoencoder and its output shape are different: the encoder takes (1, 933, 1) with a fixed batch size of 1, while the decoder produces (None, 933, 1). These should actually be the same.
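One way to make them agree (a minimal sketch, assuming `h = 931` as in the question) is to declare the encoder input with `shape=` instead of `batch_shape=`, so both models carry a `None` batch dimension:

    from tensorflow.keras import Input

    h = 931

    # Give the encoder a None batch dimension, like the decoder already has
    input_sig = Input(shape=(h + 2, 1))  # -> (None, 933, 1)

Separately, the traceback (a [1, 933] tensor being summed over axes (1, 2)) suggests the reconstruction loss copied from the 2D MNIST example also needs adjusting: binary_crossentropy on (batch, 933, 1) tensors already reduces the channel axis, leaving a 2-D result, so the sum over axis=(1, 2) should become axis=1. Roughly:

    import tensorflow as tf
    from tensorflow import keras

    # In the custom train_step from the Keras VAE example (if you kept it):
    reconstruction_loss = tf.reduce_mean(
        tf.reduce_sum(
            keras.losses.binary_crossentropy(data, reconstruction),
            axis=1,  # was axis=(1, 2) for the 28x28 images
        )
    )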

