I am trying to use an autoencoder (as described here: https://blog.keras.io/building-autoencoders-in-keras.html#) for anomaly detection. Instead of the images used in that example, I am working with ~1700-dimensional feature vectors, each describing a different protein interaction. I have a "normal" category of interactions on which I train the AE; I then feed it new vectors and use the reconstruction error to flag anomalous interactions.
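To make the setup concrete, here is a minimal sketch of the reconstruction-error scoring idea. This is not my actual Keras model; it uses a linear "autoencoder" (projection onto the top principal components of the normal training set) in plain numpy, and all dimensions and data are synthetic stand-ins:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the ~1700-dim interaction vectors (smaller here):
# "normal" vectors lie near a low-dimensional subspace, anomalies do not.
d, k = 50, 5
basis = rng.normal(size=(k, d))
normal_train = rng.normal(size=(2000, k)) @ basis + 0.1 * rng.normal(size=(2000, d))
normal_test  = rng.normal(size=(500, k)) @ basis + 0.1 * rng.normal(size=(500, d))
anomalies    = rng.normal(size=(500, d))  # off-subspace vectors

# Linear "autoencoder": encode/decode = projection onto the top-k
# principal components of the normal training data only.
mean = normal_train.mean(axis=0)
_, _, vt = np.linalg.svd(normal_train - mean, full_matrices=False)
components = vt[:k]  # k x d decoder weights

def reconstruction_error(x):
    centered = x - mean
    recon = centered @ components.T @ components  # encode, then decode
    return ((centered - recon) ** 2).mean(axis=1)  # per-vector MSE

err_normal = reconstruction_error(normal_test)
err_anom = reconstruction_error(anomalies)
# Anomalies reconstruct poorly, so their error is higher on average.
```

A nonlinear AE plays the same role as the projection here; the anomaly score in both cases is the per-vector reconstruction error.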
Adjusting my threshold so that I get a true positive rate of 0.95, I get a false positive rate of 0.15, which is rather high. When I trained xgboost on the normal and anomalous vectors (using both types of interactions in training and testing), I was able to get a precision of 0.98.**
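By "adjusting my threshold" I mean choosing the cutoff on the reconstruction error from labeled evaluation data, roughly like this (the error distributions below are made-up stand-ins, not my real scores):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical reconstruction errors for a labeled evaluation set:
# anomalies tend to score higher, but the distributions overlap.
err_normal = rng.gamma(2.0, 1.0, size=1000)
err_anom = rng.gamma(2.0, 1.0, size=200) + 2.0

# Pick the threshold that catches 95% of the anomalies,
# then read off the false positive rate it implies.
threshold = np.quantile(err_anom, 0.05)
tpr = (err_anom > threshold).mean()   # ~0.95 by construction
fpr = (err_normal > threshold).mean() # the quantity I want to drive down
```

The overlap between the two error distributions is what fixes the TPR/FPR trade-off; no choice of threshold can beat it.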
Does this mean that my model (or indeed my approach of using an AE) is ineffective, or is this the best I can hope for when training a one-class anomaly detector rather than a two-category classifier (that is, xgboost in my case)? How should I proceed?
** Of course, this is merely a sanity check and cannot be used as the solution. I need the model to detect anomalies that may be very different from those I currently have, so I must train it on the normal interaction set only and reserve the anomalies for testing.