All Questions

0 votes
1 answer
75 views

detecting abnormality in a specific feature with respect to others (unsupervised?)

I have a large dataset with a feature y which is dependent in part on features x1 and x2. All features are noisy, and y is also dependent on other parameters not captured in the dataset. I would like ...
3 votes
1 answer
37 views

Confirm understanding of decision_function in Isolation Forest sklearn

I am looking to better understand sklearn IsolationForest decision_function. My understanding is that if the metric is closer to -1 then the model is more confident ...
2 votes
2 answers
1k views

How can I find anomalies in each row of data?

I have some reported data in which I want to spot anomalies. The columns are a facility name followed by the monthly reports of that facility. ...
0 votes
0 answers
111 views

Understanding Isolation Forest predictions

I'm running sklearn's IsolationForest on a dataset containing 2 classes of data, one that I know is the anomaly (~1.5% of the entire dataset), the other is the normal dataset. I'm using this (shuffled)...
5 votes
1 answer
3k views

Isolation Forest Feature Importance

As of scikit-learn version 0.19.1, there is no implementation for calculating feature importance in an Isolation Forest. I'm also having trouble finding any online resources proposing ways to get at ...
10 votes
3 answers
15k views

Isolation forest sklearn contamination param

I am working on an unsupervised anomaly detection task on time series data using an isolation forest algorithm. I am developing it in Python, more specifically using ...
2 votes
3 answers
406 views

K-Means anomaly detection not clustering anomalies

The following code takes a single column from a dataset and then adds 50 anomalies that are considerably larger than the maximum value of the dataset. ...
2 votes
1 answer
416 views

Adding anomalies to the Dataset

Recently I have been trying different scikit-learn anomaly detection and clustering methods, like DBSCAN and Isolation Forest. Based on how much training data I use, how I tweak the algorithms ...
2 votes
1 answer
80 views

Functions in scikit that detect outliers automatically?

I know one way to visualize outliers is to make a box plot, but I wanted to know if scikit-learn has any quick ways to detect outliers for each variable?
1 vote
2 answers
10k views

How can I replace outliers with maximum non-outlier value?

I am doing univariate outlier detection in Python. When I detect outliers for a variable, I know that the value should be whatever the highest non-outlier value is (i.e., the max if there were no ...
4 votes
1 answer
6k views

Multivariate outlier detection with isolation forest: how to detect the most effective features?

I am trying to detect outliers in my dataset with 5000 observations and 800 features. I have followed the simple steps described in http://scikit-learn.org/stable/auto_examples/ensemble/...