$\begingroup$

I have sequences of 10 days of sensor events, and a true/false label specifying whether the sensor triggered an alert within that 10-day window:

| sensor_id | timestamp | feature_1 | feature_2 | 10_days_alert_label |
|---|---|---|---|---|
| 1 | 2020-12-20 01:00:34.565 | 0.23 | 0.1 | 1 |
| 1 | 2020-12-20 01:03:13.897 | 0.3 | 0.12 | 1 |
| 2 | 2020-12-20 01:00:34.565 | 0.13 | 0.4 | 0 |
| 2 | 2020-12-20 01:03:13.897 | 0.2 | 0.9 | 0 |

95% of the sensors do not trigger an alert, so the data is imbalanced. I was thinking of an autoencoder model to detect the anomalies. Since I'm not interested in decoding the entire sequence, just the LSTM's learned context vector, I was thinking of something like the figure below, where the decoder reconstructs the encoder output:

[figure: proposed architecture, with the decoder reconstructing the LSTM encoder's context vector]

Before diving in, I would like to validate some points:

  • Does the architecture make sense?
  • How do I copy the LSTM output vector to be used as the decoder target?
  • Should I train only on data from sensors that didn't trigger an alert, then measure the reconstruction error to find the anomalies (sensors that triggered an alert)?
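On the last point, here is a minimal numpy sketch of that workflow (all names and data are made up: a mean-pooled context vector stands in for the LSTM encoder output, and a tiny linear autoencoder plays the decoder): train only on "normal" sequences, use the context vectors themselves as the reconstruction targets, then flag sequences whose reconstruction error exceeds a percentile threshold.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(seqs):
    # Stand-in for the LSTM encoder: mean-pool each (timesteps, features)
    # sequence into one fixed-size context vector.
    return seqs.mean(axis=1)

# Synthetic "no-alert" sensors: feature_2 closely tracks feature_1.
base = rng.normal(0.0, 1.0, size=(500, 10, 1))
normal = np.concatenate([base, base + rng.normal(0, 0.1, size=base.shape)], axis=2)
# Synthetic "alert" sensors: the two features diverge.
alerts = np.concatenate(
    [rng.normal(1.5, 0.3, size=(50, 10, 1)),
     rng.normal(-1.5, 0.3, size=(50, 10, 1))], axis=2)

ctx_train = encode(normal)  # the decoder's targets are these vectors themselves

# Tiny linear autoencoder on the context vectors, trained by gradient descent.
d, h = ctx_train.shape[1], 1
W_enc = rng.normal(0, 0.1, size=(d, h))
W_dec = rng.normal(0, 0.1, size=(h, d))
lr = 0.05
for _ in range(500):
    z = ctx_train @ W_enc
    err = z @ W_dec - ctx_train          # reconstruction error drives the loss
    W_dec -= lr * z.T @ err / len(ctx_train)
    W_enc -= lr * ctx_train.T @ (err @ W_dec.T) / len(ctx_train)

def recon_error(seqs):
    ctx = encode(seqs)
    return (((ctx @ W_enc) @ W_dec - ctx) ** 2).mean(axis=1)

# Train only on no-alert data; threshold at e.g. its 95th percentile.
threshold = np.quantile(recon_error(normal), 0.95)
flags = recon_error(alerts) > threshold
print(f"{flags.mean():.0%} of alert sensors flagged")
```

In a real model the `encode` step would be the trained LSTM; the key idea is that the decoder's target is simply a copy of the encoder's output vector, so no sequence reconstruction is needed.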
$\endgroup$

    1 Answer
    $\begingroup$

    Your question has two parts: 1) how to use an LSTM to find anomalies in time-series data, and 2) how to deal with imbalanced data.

    Regarding 1), the closest thing that comes to mind is this post from the sister site: https://stats.stackexchange.com/questions/127484/cluster-sequences-of-data-with-different-length/440432#440432. The only difference is that you have labeled data, so you have to adjust the architecture to accept a binary label vector and optimize the weights with respect to it.

    You could also frame your project as unsupervised, where you use an autoencoder to find distances between pairs of data points. But you are lucky to have labels, so the autoencoder need not be the entire architecture: you can add an output layer of size 1 to take care of the labels.
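A minimal sketch of that size-1 output layer (numpy, with made-up dimensions and untrained random weights): the encoder's context vector feeds a dense layer with a sigmoid, producing a per-sensor alert probability that the binary labels can supervise.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(1)
context = rng.normal(size=(4, 8))   # 4 sensors, 8-dim context vectors (toy)
w = rng.normal(size=(8, 1))         # classification head: dense layer, size 1
b = 0.0

p_alert = sigmoid(context @ w + b)  # per-sensor alert probability in (0, 1)
print(p_alert.shape)
```

In Keras this head would just be a `Dense(1, activation="sigmoid")` layer on top of the encoder output, trained with binary cross-entropy against the alert labels.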

    Regarding 2), the imbalanced data: there are various strategies. One is what you suggest. I would instead resample the data (up-sample the minority class or down-sample the majority class) and train the model on balanced data.
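For instance, down-sampling the majority class can be done in a few lines of numpy (the labels here are made up to mirror the 95%/5% split in the question):

```python
import numpy as np

rng = np.random.default_rng(42)
labels = np.array([0] * 95 + [1] * 5)   # 95% no-alert, 5% alert

neg_idx = np.flatnonzero(labels == 0)
pos_idx = np.flatnonzero(labels == 1)

# Down-sample: keep only as many no-alert sensors as there are alert sensors.
keep_neg = rng.choice(neg_idx, size=len(pos_idx), replace=False)
balanced = np.concatenate([keep_neg, pos_idx])
rng.shuffle(balanced)

print(labels[balanced].mean())   # class ratio after down-sampling
```

The resulting index array selects a 50/50 training set; up-sampling would instead repeat (or augment) the minority-class indices with `replace=True`.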

    $\endgroup$
    • $\begingroup$Thank you for your reply; allow me to elaborate. I've found many articles on LSTM autoencoders for anomaly detection, but all of them reconstruct the entire sequence. So my question is: if I'm only interested in the LSTM output and not the sequence, can I reconstruct only that? If so, how can I use the output of the first LSTM as the target for the decoder?$\endgroup$ – Commented Jun 15, 2021 at 10:02
    • $\begingroup$You can; I think you can design that even with a Keras Sequential model. Start small, and perhaps add some simple data and code so others can come and help you further.$\endgroup$ – Commented Jun 15, 2021 at 11:13
