$\begingroup$

I am using sklearn's IsolationForest for an unsupervised anomaly detection task. According to the docs (https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.IsolationForest.html), there are about 8 hyperparameters to tune. My dataset is relatively small: about 40 records (12 new records a year) with 2-4 features.

Since it is an unsupervised use case, which hyperparameters would you recommend I look into and tune?

$\endgroup$

    1 Answer

    $\begingroup$

    For a dataset with only 40 records and 2–4 features, an ML model like IsolationForest may be overkill.

    Simple EDA (box plots, histograms, and scatter plots), traditional outlier detection methods (z-scores, IQR, Grubbs' test, Dixon's Q test, or Mahalanobis distance), or clustering-based techniques like k-means, visualized on a scatter plot, are likely to be more appropriate and reliable.

    These approaches are less prone to overfitting, require little or no hyperparameter tuning, and offer more interpretable results for such a small dataset.
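As a rough illustration of two of the traditional methods mentioned above, here is a minimal sketch using only NumPy. The data is synthetic (40 rows, 3 features, with one planted outlier), and the helper names `iqr_outliers` and `mahalanobis_distances` are made up for this example. The multiplier `k=1.5` is the conventional Tukey fence, not something tuned for your data.

```python
import numpy as np


def iqr_outliers(x, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's fences)."""
    x = np.asarray(x, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    return (x < q1 - k * iqr) | (x > q3 + k * iqr)


def mahalanobis_distances(X):
    """Mahalanobis distance of each row from the sample mean."""
    X = np.asarray(X, dtype=float)
    diff = X - X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    # pinv instead of inv so a near-singular covariance does not raise
    inv_cov = np.linalg.pinv(cov)
    sq = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)
    return np.sqrt(np.maximum(sq, 0.0))


# Synthetic stand-in for a small dataset (assumption: 40 records, 3 features)
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
X[5] = [6.0, -6.0, 6.0]  # planted outlier for demonstration

flags = iqr_outliers(X[:, 0])   # per-feature check on the first column
d = mahalanobis_distances(X)    # multivariate check across all columns

print("IQR-flagged rows (feature 0):", np.flatnonzero(flags))
print("Row with largest Mahalanobis distance:", int(np.argmax(d)))
```

With so few records, ranking rows by distance and inspecting the top few by hand is usually more informative than applying a hard cutoff. Note that the mean and covariance here are estimated from data that includes the outlier, which shrinks its apparent distance.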

    $\endgroup$
