Questions tagged [unsupervised-learning]
Finding hidden (statistical) structure in unlabelled data, including clustering and feature extraction for dimensionality reduction.
442 questions
1 vote
0 answers
29 views
Using a differentiable Self-Organizing Map loss in a CNN
I've been trying to aggregate a normal CNN loss with a loss that quantifies how well we can cluster the second-to-last layer embeddings (i.e. feed the embeddings to a 2D Self Organizing Map (SOM) and ...
2 votes
0 answers
36 views
Determine best hyperparameters in GridSearch - Isolation Forest
I have implemented an Isolation Forest algorithm for anomaly detection (unsupervised learning), where I divided my dataset into 1000 subsets, and for each subset, there is one isolation tree. This ...
1 vote
0 answers
35 views
What are the Strategies for Anomaly Detection in Sparse Datasets?
I’m working on a large dataset (300+ columns, 500k+ rows) and have been asked to build an anomaly detection algorithm, but I’m unsure how to define or approach these anomalies in a meaningful way. ...
0 votes
0 answers
33 views
Finding dependencies between arbitrary features automatically
Given a rank-3 tensor with dimensions $x, y, z$, where $x$ is the number of graphs (number of samples), $y$ is the number of nodes/vectors/features (let's say $5$: $a, b, c, d,$ and $e$), and $z$ is the embedding dimension (...
1 vote
1 answer
31 views
Calculating LOF for big data
I have a big dataset (hundreds of millions of records, dozens of GBs) and I would like to run LOF on it for anomaly detection (testing different methods for academic purposes) ...
1 vote
0 answers
28 views
How to Interpret Laplacian Scores for Feature Importance Ranking in Unsupervised Feature Clustering?
I am currently working on unsupervised feature importance ranking using graph clustering methods, specifically focusing on the Laplacian score as a metric. However, I am struggling to clarify the ...
0 votes
0 answers
110 views
Machine learning approach for bot detection
I am working on a project that tries to determine if users are bots or not. Currently, the labels that the dataset contains are not reliable, but I have found some trends/features that are solid for ...
0 votes
0 answers
14 views
Best methods to stratify data into 4 groups (unsupervised manner) using a set/combination of variables
I'm trying to stratify a set of patients according to possible molecular subtypes of cancer. Now, I know all these patients have a type of cancer, but the goal is to (in an unsupervised manner) cluster ...
0 votes
0 answers
25 views
How to tell whether a model or algorithm counts as machine learning?
I'm working on a thesis to detect change points in a time series made of body movements. I'm not allowed to use any machine learning models because my colleague used ML and the professor wants to have a ...
0 votes
0 answers
18 views
Unsupervised short text clustering with covariates
I'm working on a project where I have to categorise short texts. I don't know the topics ahead of time, so the work is unsupervised. Currently, I am using a Bi-Term Topic Model (BTM). I am seeing some ...
0 votes
0 answers
11 views
Troubles using unsupervised domain adaptation
Hope somebody can help me. I've been stuck on this and I can't find the origin of my problem... So I have a model that I have fine-tuned, it's a ResNet18 that looks like this (I'm just ...
0 votes
1 answer
140 views
Does Including Contamination Turn Isolation Forest into Supervised?
In unsupervised anomaly detection, does specifying the contamination percentage turn Isolation Forest into a supervised method rather than an unsupervised one when I fit the data afterwards?
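A minimal sketch of the idea behind this question, not the asker's code: a contamination fraction typically acts only as a cut-off on anomaly scores that were already computed without labels. The scorer below (distance from the median) is a deliberately simple stand-in, not an isolation forest, and `flag_anomalies` is a hypothetical helper. To my knowledge scikit-learn's `IsolationForest` uses its `contamination` parameter in a similar way, to place the decision threshold, but check the documentation for your version.

```python
import random
import statistics

random.seed(0)

# Unlabelled 1-D data: mostly typical values plus a few injected extremes.
data = [random.gauss(0.0, 1.0) for _ in range(195)] + [8.0, 9.0, -7.5, 10.0, -9.0]

# Stand-in anomaly score (an assumption for illustration, NOT an isolation
# forest): absolute distance from the median. No labels are used.
median = statistics.median(data)
scores = [abs(x - median) for x in data]


def flag_anomalies(scores, contamination):
    """Flag the highest-scoring `contamination` fraction of the points.

    Only the cut-off depends on `contamination`; the scores are untouched.
    """
    n_flag = max(1, round(contamination * len(scores)))
    threshold = sorted(scores, reverse=True)[n_flag - 1]
    return [s >= threshold for s in scores], threshold


flags_1, thr_1 = flag_anomalies(scores, 0.01)
flags_5, thr_5 = flag_anomalies(scores, 0.05)

print(sum(flags_1), "points flagged at contamination=0.01")
print(sum(flags_5), "points flagged at contamination=0.05")
```

Changing `contamination` changes how many points are flagged, while the scores themselves stay the same, and no labels enter at any point.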
0 votes
0 answers
9 views
Confusion about clustering, the techniques involved, and the scores (a conceptual question, since I am new to clustering models)
What astonishes me is that the established norms for clustering data are not actually able to recover the real results in my problem. I ran K=2 clustering with kmeans and kmeans-constrained (...
0 votes
1 answer
55 views
Doing unsupervised anomaly detection on a dataset without any labels and without variable descriptions
I am trying to do unsupervised anomaly detection on a dataset with a dozen variables. None of them has a description, and the dataset doesn't have any labels or class variable. I have tried using a ...
0 votes
0 answers
21 views
About autoencoder's latent state regularity
Suppose we are dealing with the problem of dimensionality reduction of an input $\mathbf{x}\in\mathbb{R}^N$, by employing an autoencoder, as a composition of the encoder and decoder map $\mathbf{x} \...