Questions tagged [anomaly-detection]
Anomaly detection refers to the problem of finding patterns in data that do not conform to expected behaviour. This is also known as outlier detection.
366 questions
1 vote
0 answers
19 views
Rolling z-score and normalizing
I am using a rolling window z-score method to flag if a record is an outlier. Is it necessary to first normalize the values of the desired feature before computing the rolling z-score?
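One way to probe the question above is to compute the rolling z-score on the raw values and on a min-max-normalized copy, then compare the two. The sketch below is illustrative only: the `rolling_zscore` helper, the trailing window that excludes the current point, and the sample values are all assumptions, not details from the question. Because a z-score subtracts the window mean and divides by the window standard deviation, a positive linear rescaling of the feature should cancel out.

```python
from statistics import mean, stdev


def rolling_zscore(values, window):
    """Z-score of each point against the `window` points that precede it.

    Returns None where there is not enough history or the window is constant.
    (Hypothetical helper; the trailing-window convention is an assumption.)
    """
    scores = []
    for i, x in enumerate(values):
        if i < window:
            scores.append(None)
            continue
        history = values[i - window:i]
        mu, sigma = mean(history), stdev(history)
        scores.append((x - mu) / sigma if sigma > 0 else None)
    return scores


# Made-up series with one obvious spike at index 5.
raw = [10.0, 12.0, 11.0, 13.0, 12.0, 30.0, 11.0, 12.0]

# Min-max normalization of the same series to the range [0, 1].
lo, hi = min(raw), max(raw)
scaled = [(v - lo) / (hi - lo) for v in raw]

z_raw = rolling_zscore(raw, window=4)
z_scaled = rolling_zscore(scaled, window=4)

# Compare the two score series position by position.
for a, b in zip(z_raw, z_scaled):
    print(a, b)
```

Under these assumptions the two columns agree up to floating-point error, which suggests that a linear normalization beforehand does not change which records get flagged. Nonlinear transforms (a log transform, for example) are a different matter and would change the scores.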
0 votes
0 answers
12 views
Anomaly detection time in time-series for drops
I am looking into different statistical methods for determining a decrease in a numeric "count" feature across a time-series dataset. The dataset is relatively small (about 50 records), and ...
2 votes
0 answers
28 views
PyGOD memory error despite batch size argument
Does anyone know why PyGOD's DOMINANT implementation produces a memory error even though the batch size argument is reasonable? To reproduce: ...
4 votes
1 answer
54 views
Unsupervised Isolation Forest sklearn hyperparameters
I am using sklearn's IsolationForest for an unsupervised anomaly detection task. According to the docs, https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.IsolationForest.html, there are ...
3 votes
1 answer
37 views
Confirm understanding of decision_function in Isolation Forest sklearn
I am looking to better understand sklearn's IsolationForest decision_function. My understanding is that if the metric is closer to -1 then the model is more confident ...
2 votes
0 answers
36 views
Determine best hyperparameters in GridSearch - Isolation Forest
I have implemented an Isolation Forest algorithm for anomaly detection (unsupervised learning), where I divided my dataset into 1000 subsets, and for each subset, there is one isolation tree. This ...
1 vote
0 answers
35 views
What are the Strategies for Anomaly Detection in Sparse Datasets?
I’m working on a large dataset (300+ columns, 500k+ rows) and have been asked to build an anomaly detection algorithm, but I’m unsure how to define or approach these anomalies in a meaningful way. ...
1 vote
0 answers
59 views
How Would You Approach This Project on Time Series and Anomaly Detection
TL;DR: I have a background in MLOps and machine learning engineering, started at a new employer (as the first AI engineer), and failed at a time series forecasting project. Approach detailed below, any ...
2 votes
1 answer
18 views
How to determine the feasible domain of a trained tree model?
As far as I know, tree models (such as those trained using xgboost/lightgbm) make reasonable predictions only if the input feature vector is similar to the training set data. If the feature vector looks ...
2 votes
1 answer
29 views
Battery Disconnect Categorization
I hope someone can help me with a work problem I am facing. My data has machineID, timestamp(UTC), and batterypotential for multiple machines, recorded every 2 minutes over 14 days. I need to look at their time ...
0 votes
0 answers
27 views
Are there any libraries for generating synthetic anomalies in timeseries data in Python?
I'm working on anomaly detection in timeseries data, and need to add synthetic anomalies to existing timeseries data (in order to test anomaly detection algorithms). I can do this by running a ...
1 vote
0 answers
19 views
How to Handle Predictions with Two High-Cardinality Categorical Variables?
Dataset overview: I have a dataset with three columns: ProjectCode: A categorical variable representing the project. (~6 unique values per category) ...
0 votes
0 answers
21 views
Using PCA reconstruction to detect outliers
I have a banking customer for whom I am implementing a pilot. It deals with outlier detection in specific accounts. Now the number of transactions in these accounts, on a daily basis, number in their ...
0 votes
0 answers
31 views
Statistical Approach for Anomaly Detection in Multivariate time series
I'm working on an anomaly detection problem for motor experiments (same type of motor) and need advice on statistical approaches (no ML, since I do not have enough data). Here's the context: Dataset:...
0 votes
0 answers
40 views
Does Increasing Dimensionality Before Compression Make Sense for Anomaly Detection with Autoencoders?
Given a dataset $X$ of shape $(n, p)$ such that $n \gg 1$ and $p \approx 10$, I would like to train an autoencoder to solve an anomaly detection problem. I did some experiments considering a classical ...