I have sequences of 10 days of sensor events, plus a true/false label specifying whether the sensor triggered an alert within that 10-day window:
| sensor_id | timestamp | feature_1 | feature_2 | 10_days_alert_label |
|---|---|---|---|---|
| 1 | 2020-12-20 01:00:34.565 | 0.23 | 0.1 | 1 |
| 1 | 2020-12-20 01:03:13.897 | 0.3 | 0.12 | 1 |
| 2 | 2020-12-20 01:00:34.565 | 0.13 | 0.4 | 0 |
| 2 | 2020-12-20 01:03:13.897 | 0.2 | 0.9 | 0 |
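For context, this is roughly how I plan to turn the raw rows into one fixed-length sequence per sensor for the LSTM (the value of `TIMESTEPS` and the zero-padding are my own placeholder choices, nothing fixed by the data):

```python
import numpy as np
import pandas as pd

TIMESTEPS = 100  # assumption: pad/truncate each sensor's 10-day event list to a fixed length

def build_sequences(df: pd.DataFrame):
    """Group rows per sensor into (n_sensors, TIMESTEPS, 2) plus one label per sensor."""
    xs, ys = [], []
    for _, g in df.sort_values("timestamp").groupby("sensor_id"):
        seq = g[["feature_1", "feature_2"]].to_numpy()[:TIMESTEPS]
        pad = np.zeros((TIMESTEPS - len(seq), seq.shape[1]))  # zero-pad short sequences
        xs.append(np.vstack([seq, pad]))
        ys.append(g["10_days_alert_label"].iloc[0])
    return np.stack(xs), np.array(ys)

# X, y = build_sequences(df)   # df holds the table above
```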
95% of the sensors never trigger an alert, so the data is imbalanced. I was thinking of an autoencoder model to detect the anomalies. Since I'm not interested in decoding the entire sequence, just the LSTM's learned context vector, I had something like the figure below in mind, where the decoder reconstructs the encoder output:
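In code, the rough idea is something like this (a tf.keras sketch of what the figure shows; `LATENT_DIM`, the dense layer sizes, and the `ContextReconstructionLoss` helper are just my placeholders, not a fixed design):

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

TIMESTEPS = 100   # same fixed sequence length as in the sequence-building step
N_FEATURES = 2    # feature_1, feature_2
LATENT_DIM = 16   # assumption: size of the LSTM context vector

class ContextReconstructionLoss(layers.Layer):
    """Adds MSE between the encoder context and its reconstruction as the model loss."""
    def call(self, context, reconstructed):
        self.add_loss(tf.reduce_mean(tf.square(context - reconstructed)))
        return reconstructed

inputs = layers.Input(shape=(TIMESTEPS, N_FEATURES))
# Encoder: the LSTM's final hidden state is the learned context vector
context = layers.LSTM(LATENT_DIM)(inputs)
# "Decoder": dense layers trying to reproduce that context vector, not the input sequence
x = layers.Dense(8, activation="relu")(context)
reconstructed = layers.Dense(LATENT_DIM)(x)
reconstructed = ContextReconstructionLoss()(context, reconstructed)

model = Model(inputs, reconstructed)
model.compile(optimizer="adam")  # loss is added inside ContextReconstructionLoss
```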
Before diving in, I would like to validate a few points:
- Does the architecture make sense?
- How do I copy the LSTM output vector so it can be used as the decoder's target?
- Train only on data from sensors that didn't trigger an alert, then measure the reconstruction error to find the anomalies (sensors that did trigger an alert), right? (Roughly as in the sketch right after this list.)
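To make that last point concrete, this is what I have in mind (`X` and `y` from the sequence-building snippet, `model`, `inputs` and `context` from the architecture sketch; the 95th-percentile threshold is an arbitrary placeholder):

```python
import numpy as np
from tensorflow.keras import Model

X_normal = X[y == 0]                      # train only on sensors that never alerted
model.fit(X_normal, epochs=50, batch_size=32, validation_split=0.1)

# Score every sensor by how badly its context vector is reconstructed
encoder = Model(inputs, context)
contexts = encoder.predict(X)
recons = model.predict(X)
errors = np.mean(np.square(contexts - recons), axis=1)

# Threshold taken from the error distribution of the normal sensors (placeholder choice)
threshold = np.percentile(errors[y == 0], 95)
flagged = errors > threshold              # hoping these line up with label == 1
```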