
My question is simple and yet quite hard to find an answer to. In an unsupervised method, for example when you have to reconstruct an input, how can you tell whether your loss is good enough? Typically, at the beginning of training the loss goes down and then stabilizes around a certain level. How can you tell whether that level is good enough, i.e. whether your model actually learned something, or whether, even though the loss decreased, the reconstruction is still bad? Since you don't have labels, you can't compute metrics like accuracy or precision, which are straightforward to interpret and compare against an acceptable level of learning. You just have a loss that decreased, but no metric to judge the learning against. Is there a way to say, for example, "my error is at value X and I can accept any loss below level Y"?

I am trying to implement an unsupervised method for anomaly detection, but I don't know how to define an acceptable level for the reconstruction error. Since I have labels only on the test set, and the training and test losses settle at different levels, I would like to understand whether I just have an overfitting problem or whether my model is actually underfitting as well.
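For context, this is roughly how I compare the two loss levels. A minimal sketch, assuming an autoencoder-style model with a `predict` method that returns reconstructions; the names `autoencoder`, `X_train`, and `X_holdout` are placeholders for my actual setup:

```python
import numpy as np

def reconstruction_errors(model, X):
    """Mean squared reconstruction error per sample."""
    X_hat = model.predict(X)                 # model maps inputs to reconstructions
    return np.mean((X - X_hat) ** 2, axis=1)

# Compare the error distributions, not just the means:
# - held-out errors much higher than training errors -> likely overfitting
# - both high and similar -> the model may simply be underfitting
# train_errors = reconstruction_errors(autoencoder, X_train)
# holdout_errors = reconstruction_errors(autoencoder, X_holdout)
# print(train_errors.mean(), holdout_errors.mean())
```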


    1 Answer


    There is no guarantee that the reconstruction error is related at all to the ability to discriminate between what you have defined as anomalous and what you have not.

    You should split your labeled data into a test set and a validation set, and use the validation set to decide when to stop training, which hyperparameters to use, which model alternatives to prefer, etc.
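    A minimal sketch of that workflow, assuming you already have per-sample reconstruction errors for the labeled data and anomaly labels (1 = anomaly); the `errors` and `y` arrays below are placeholders for your real values, and the threshold selection via F1 is just one illustrative choice:

    ```python
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import roc_auc_score, precision_recall_curve

    errors = np.random.rand(1000)                    # placeholder reconstruction errors
    y = (np.random.rand(1000) > 0.95).astype(int)    # placeholder anomaly labels

    # Split the labeled data: tune on the validation half, report on the test half.
    err_val, err_test, y_val, y_test = train_test_split(
        errors, y, test_size=0.5, stratify=y, random_state=0
    )

    # Threshold-free check: does the reconstruction error separate the classes at all?
    print("validation ROC AUC:", roc_auc_score(y_val, err_val))

    # Pick the threshold that maximizes F1 on the validation split.
    prec, rec, thr = precision_recall_curve(y_val, err_val)
    f1 = 2 * prec * rec / np.maximum(prec + rec, 1e-12)
    best_threshold = thr[np.argmax(f1[:-1])]

    # Final numbers come from the untouched test split only.
    test_pred = (err_test >= best_threshold).astype(int)
    print("test ROC AUC:", roc_auc_score(y_test, err_test))
    ```

    The same validation split can be reused to compare training checkpoints or hyperparameter settings: keep whichever gives the best validation score, and only look at the test split once at the end.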
