I want to use a complex-valued variational autoencoder for unsupervised blind source separation. As input to the network, I am giving the complex time-frequency (STFT) matrix instead of the spectrogram image. The network learns to separate the sources when I give it the spectrogram image, but not when I give it the time-frequency matrix. How can I make it work on the matrix itself rather than on the image? I have attached a link to the code on Drive for anyone who wants to have a look. Alternatively, how do I debug the model to find out why it works on the image but not on the matrix?
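As a first debugging step (a minimal sketch, not taken from the linked code), it can help to verify that the complex matrix you feed in carries the same information as the image that works, and that it is normalized to a comparable range; unnormalized complex STFT values often have a much larger dynamic range than a scaled image, which can stall training. The array shapes and the real/imaginary channel stacking here are illustrative assumptions:

```python
import numpy as np

# Stand-in for a complex STFT matrix (freq x time); in practice this would
# come from an STFT of the mixture signal (e.g. librosa.stft).
rng = np.random.default_rng(0)
stft = rng.standard_normal((257, 100)) + 1j * rng.standard_normal((257, 100))

# 1) Sanity check: the magnitude of the complex matrix should match the
#    spectrogram image that the working model sees (up to display scaling).
spectrogram_image = np.abs(stft)
assert np.allclose(np.abs(stft), spectrogram_image)

# 2) Normalize the complex input the same way the image is normalized:
#    dividing by the peak magnitude preserves phase but bounds the range.
scale = np.max(np.abs(stft))
stft_normalized = stft / scale
assert np.max(np.abs(stft_normalized)) <= 1.0

# 3) Represent the complex matrix as two real channels (real, imag), so the
#    same real-valued architecture that works on images can be tested on the
#    full complex information as a control experiment.
two_channel = np.stack([stft_normalized.real, stft_normalized.imag], axis=0)
print(two_channel.shape)  # (2, 257, 100)
```

If the real-valued network also fails on the two-channel representation, the problem is likely the input scaling or the loss, not the complex-valued layers themselves.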
Standard neural-network VAE architectures are primarily designed for real-valued inputs. Have you used a ComplexConv2D layer, or treated the real and imaginary parts as separate normalized channels, or concatenated them along a new dimension? Ensure the loss function accounts for both magnitude and phase information, as phase plays a critical role in signal reconstruction. See this paper for details about source separation with a multichannel VAE. – cinch, Apr 8 at 21:31
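The loss the comment describes could be sketched as follows; this is an assumed formulation (the function name `complex_recon_loss` and the weighting parameter `alpha` are hypothetical), combining a magnitude term with a term on the raw complex error, which implicitly penalizes phase mismatch as well:

```python
import numpy as np

def complex_recon_loss(x, x_hat, alpha=0.5):
    """Reconstruction loss for complex spectrograms.

    Mixes a magnitude-only error with the squared error of the full
    complex difference, so both magnitude and phase are penalized.
    """
    mag_term = np.mean((np.abs(x) - np.abs(x_hat)) ** 2)  # magnitude error
    complex_term = np.mean(np.abs(x - x_hat) ** 2)        # includes phase error
    return alpha * mag_term + (1 - alpha) * complex_term
```

With `alpha=0` this reduces to a plain complex mean-squared error; with `alpha=1` it ignores phase entirely, which is usually why a magnitude-only loss fails to drive a complex-valued network.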
@cinch Yes, I have used complex-valued layers; you can find the code in the link I shared. I followed the model from this paper: ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9616154. It works perfectly when I use a real-valued network on the spectrogram images, but when I use a complex-valued network with the spectrogram matrix as input, it doesn't separate the signals. If you can recommend any changes to the code, that would be very helpful. – ananya, Apr 9 at 7:43