$\begingroup$

I want to convert string data to numeric data because the decision tree only accepts numeric input. For a binary string attribute such as Ever_Married [Yes/No], I used the .replace method to convert it to numeric values. Now I have an attribute with five options [Private, Self-employed, Children, Govt_job, Never_worked]. Is it okay to use .replace to map these categories to five different numeric values? Will it affect my model, and is this good practice?

$\endgroup$
  • $\begingroup$How is “ever married” continuous? Likewise, how is your five-category employment variable continuous?$\endgroup$
    – Dave
    Commented Nov 24, 2022 at 17:01
  • $\begingroup$Sorry for the mistake: the ever_married and employment attributes were strings, and I wanted to convert them to numeric variables because the decision tree raised an error saying it cannot take string variables. I will edit the question.$\endgroup$ Commented Nov 24, 2022 at 17:50

1 Answer

$\begingroup$

Since you tagged scikit-learn, you can use its preprocessing.LabelEncoder class to convert categories to numerical values. And yes, this is good practice.

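    from sklearn import preprocessing

    label_encoder = preprocessing.LabelEncoder()
    label_encoder.fit(my_dataframe["status"])
    # fit() only learns the categories; transform() returns the integer codes
    my_dataframe["status"] = label_encoder.transform(my_dataframe["status"])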
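To make the effect on the five-category attribute concrete, here is a minimal sketch on toy data. The DataFrame contents and the column name `work_type` are assumptions for illustration, not taken from the question. Note that the integer codes follow alphabetical order of the category names, which has no real meaning, so a tree can end up splitting on an arbitrary ordering. The scikit-learn documentation describes LabelEncoder as intended for target labels, with OrdinalEncoder and OneHotEncoder as the counterparts for input features. The last line shows a one-hot encoding via pandas as one possible alternative that implies no order.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy data; the column name "work_type" is an assumption.
df = pd.DataFrame({
    "work_type": ["Private", "Self-employed", "Children",
                  "Govt_job", "Never_worked", "Private"]
})

encoder = LabelEncoder()
# fit_transform learns the categories and returns integer codes in one step
codes = encoder.fit_transform(df["work_type"])

# classes_ holds the learned categories in sorted order;
# the position of each category is its integer code
mapping = {label: int(code) for code, label in enumerate(encoder.classes_)}

# inverse_transform recovers the original strings from the codes
restored = encoder.inverse_transform(codes)

# One-hot alternative: one indicator column per category, no implied order
one_hot = pd.get_dummies(df["work_type"], prefix="work_type")
```

Inspecting `mapping` shows which integer each category received, which is the same information a hand-written .replace dictionary would contain, except that the encoder can also reverse it.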
$\endgroup$
