I am doing a multilevel regression, and my response variable is binary (presence of females on a tech board). All the EDA methods I know are about plotting correlations, but as this response is binary I don't know where to start. Some of my predictors are continuous, others are binary, and one is categorical (industry).
1 Answer
Exploratory Data Analysis (EDA) is a way of figuring out how the response variable relates to the independent variables. Which techniques make sense depends heavily on the data under consideration. For a binary response, you could proceed as follows:
Identify any null or NaN values. Either drop the affected rows or fill in the missing values with some statistical imputation.
Some columns or variables can be dropped if they do not add value to the analysis (e.g., DateTime). For a binary response, we generally don't rely on a correlation matrix.
You can also use graphs. A binary response does not rule out plotting: if you code the two labels as 0 and 1, you can plot the response against your predictors and still get meaningful insights.
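As a rough sketch of the steps above in pandas: with the response coded 0/1, its mean within a group is the proportion of 1s, so you can summarise it per category, per bin of a continuous predictor, and per level of a binary predictor. The data here is randomly generated and the column names (`has_female`, `board_size`, `is_public`, `industry`) are made up for illustration; swap in your own.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; all column names are illustrative assumptions.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "has_female": rng.integers(0, 2, size=n),   # binary response coded 0/1
    "board_size": rng.normal(8, 2, size=n),     # continuous predictor
    "is_public": rng.integers(0, 2, size=n),    # binary predictor
    "industry": rng.choice(["software", "hardware", "biotech"], size=n),
})
df.loc[[3, 17], "board_size"] = np.nan          # inject two missing values

# Step 1: count missing values per column.
missing = df.isna().sum()

# Step 2: proportion of 1s in the response for each industry.
rate_by_industry = df.groupby("industry")["has_female"].mean()

# Step 3: proportion of 1s within quartile bins of a continuous predictor.
# Rows with a missing board_size get no bin and are left out of the groups.
bins = pd.qcut(df["board_size"], q=4)
rate_by_bin = df.groupby(bins, observed=True)["has_female"].mean()

# Step 4: row-normalised cross-tab of a binary predictor vs. the response.
table = pd.crosstab(df["is_public"], df["has_female"], normalize="index")

print(missing, rate_by_industry, rate_by_bin, table, sep="\n\n")
```

If matplotlib is installed, something like `rate_by_industry.plot.bar()` turns any of these summaries into a graph. Since the data is random here, the proportions themselves mean nothing; the point is only the shape of the summaries.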
For further reading, see: https://deepnote.com/app/charan-chandrasekaran/EDA-for-Classification-problem-df3d55a6-09d3-48ec-8aca-d039a4b5ff5a