$\begingroup$

Random Projection (RP) is well known to be closely related to Locality Sensitive Hashing (LSH). My goal is to cluster a large number of points in a $d$-dimensional Euclidean space, where $d$ is very large.


Questions: Does it make sense to first reduce the dimensionality of the input space with RP and then cluster the points with LSH? Why or why not? Is there any redundancy in using RP for dimensionality reduction before using LSH for clustering?

$\endgroup$

    1 Answer

    $\begingroup$

    Yes, it makes sense to reduce the dimensionality with Random Projection (RP) and then cluster with Locality Sensitive Hashing (LSH). A common way to make LSH-based clustering more robust is to run it several times and keep the consensus clusters. Each hash evaluation costs time proportional to the dimensionality of the points, so those repeated runs are much faster in fewer dimensions.
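As a minimal sketch of the RP-then-LSH pipeline (not a definitive implementation), the following uses NumPy with a Gaussian projection matrix and the floor-based p-stable hash family for Euclidean distance. The function names, the synthetic data, and the values of `k`, `n_hashes`, and `width` are assumptions chosen for illustration; none of them come from the question.

```python
import numpy as np

def random_projection(X, k, rng):
    """Gaussian random projection from d to k dimensions, scaled by 1/sqrt(k)."""
    d = X.shape[1]
    R = rng.normal(size=(d, k)) / np.sqrt(k)
    return X @ R

def lsh_buckets(X, n_hashes, width, rng):
    """Euclidean (p-stable) LSH: h(x) = floor((a . x + b) / width) for each hash."""
    d = X.shape[1]
    A = rng.normal(size=(d, n_hashes))           # one random direction per hash
    b = rng.uniform(0.0, width, size=n_hashes)   # random offset per hash
    codes = np.floor((X @ A + b) / width).astype(np.int64)
    # Points that agree on every hash value share a bucket label.
    _, labels = np.unique(codes, axis=0, return_inverse=True)
    return labels.ravel()

rng = np.random.default_rng(0)

# Illustrative data only: two synthetic blobs in d = 2000 dimensions.
centers = rng.normal(size=(2, 2000)) * 10
X = np.vstack([c + rng.normal(size=(50, 2000)) for c in centers])

Z = random_projection(X, k=32, rng=rng)                      # step 1: reduce d -> k
labels = lsh_buckets(Z, n_hashes=4, width=200.0, rng=rng)    # step 2: hash into buckets
print(Z.shape, len(np.unique(labels)))
```

Points with the same label fell into the same bucket for all hashes. In practice `width` and `n_hashes` would need tuning to the distance scale of the data.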

    As for redundancy: both methods rely on randomness, so there is a small chance that stacking two random steps yields results that are not robust. If possible, run the whole process several times and keep only the results that are consistent across runs.
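One way to check consistency across runs is a co-association matrix: repeat the whole RP-plus-LSH pipeline with different seeds and record how often each pair of points shares a bucket. The sketch below is an assumption about how this could be done, not the only approach; `rp_then_lsh`, `co_association`, the random test data, and all parameter values are hypothetical.

```python
import numpy as np

def rp_then_lsh(X, k, n_hashes, width, seed):
    """One full run: Gaussian random projection, then Euclidean (p-stable) LSH."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    Z = X @ (rng.normal(size=(d, k)) / np.sqrt(k))
    A = rng.normal(size=(k, n_hashes))
    b = rng.uniform(0.0, width, size=n_hashes)
    codes = np.floor((Z @ A + b) / width).astype(np.int64)
    _, labels = np.unique(codes, axis=0, return_inverse=True)
    return labels.ravel()

def co_association(X, n_runs, **params):
    """Fraction of runs in which each pair of points lands in the same bucket."""
    n = X.shape[0]
    C = np.zeros((n, n))
    for seed in range(n_runs):
        labels = rp_then_lsh(X, seed=seed, **params)
        C += labels[:, None] == labels[None, :]
    return C / n_runs

# Illustrative data only.
rng = np.random.default_rng(42)
X = rng.normal(size=(60, 500))

C = co_association(X, n_runs=20, k=16, n_hashes=3, width=4.0)
# Pairs with C close to 1 are grouped together in almost every run.
```

Entries of `C` near 1 mark pairs that are grouped together regardless of the random draw, while entries near 0.5 mark pairs whose grouping depends on the seed. Each hash here is itself a linear random projection followed by quantization, so the RP step and the hash step compose into a single linear map from $d$ to `n_hashes` dimensions before the floor is applied.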

    $\endgroup$
