There is no single correct answer to your question. There is also a discrepancy in it, because you wrote both of the following:
"I am aware that this is a classification problem on which I am working on."
"Could you please help me with the right step by step guide that I should follow in order to achieve an efficient clustering at the end?"
However, I am assuming that you are trying to do clustering and that you want methods that give you mathematically better clusters.
Clustering is an unsupervised learning problem that does not require target variables, whereas classification is supervised and does. The steps that you mentioned are pretty standard and theoretically correct, but there are other steps that you should also take care of. I am listing a few:
- Selection of input features - The input features that go into a clustering algorithm are of great importance. A variable that contains no relevant information (say, the telephone number of each person) is worse than useless, because it adds noise to the distances between points and makes the clusters less apparent. In general, selecting "good" variables is a nontrivial task and may involve quite some trial and error.
- Selection of clustering algorithm - Choosing a clustering algorithm that suits your data is an important step. For example, K-Means works better with numerical features, K-Modes with categorical features, and K-Prototypes when your data is a mix of numerical and categorical features.
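
To make the two points above a bit more concrete, here is a minimal sketch in plain Python. It is not a clustering implementation, only the distance measures that the three algorithms are usually described as relying on: squared Euclidean distance for K-Means, simple matching (count of mismatches) for K-Modes, and a weighted sum of the two for K-Prototypes. The function names, the weight `gamma`, and the three customers are all made up for illustration.

```python
def numeric_distance(a, b):
    """Squared Euclidean distance, the measure K-Means works with."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def categorical_distance(a, b):
    """Simple matching: number of positions where the categories differ (K-Modes)."""
    return sum(x != y for x, y in zip(a, b))


def mixed_distance(num_a, cat_a, num_b, cat_b, gamma=1.0):
    """K-Prototypes style: numeric part plus gamma times the categorical part.

    gamma is an assumed weight that balances the two parts; it has to be tuned.
    """
    return numeric_distance(num_a, num_b) + gamma * categorical_distance(cat_a, cat_b)


# Hypothetical customers described by (age, income in thousands).
alice, bob, carol = (25, 30), (26, 31), (60, 90)

# On the relevant features alone, Alice is far closer to Bob than to Carol.
d_ab = numeric_distance(alice, bob)
d_ac = numeric_distance(alice, carol)

# Append an irrelevant feature (a made-up telephone number) to each customer.
alice_p, bob_p, carol_p = alice + (5551000,), bob + (9990000,), carol + (5551001,)

# The telephone number now dominates the distance, so Alice looks closer to Carol.
d_ab_phone = numeric_distance(alice_p, bob_p)
d_ac_phone = numeric_distance(alice_p, carol_p)

print(d_ab < d_ac)              # relevant features only
print(d_ac_phone < d_ab_phone)  # with the telephone number included

# Mixed data: numeric part as before, plus categories such as (occupation, area).
d_mixed = mixed_distance(alice, ("student", "urban"), bob, ("student", "rural"), gamma=2.0)
print(d_mixed)
```

The telephone number is an extreme case, but the same thing happens more quietly with any uninformative feature: it contributes to every distance without helping to separate the groups.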