1
$\begingroup$

Let's say I have a dataset like this on which I want to perform classification:

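| id | feature | class | factor |
|----|---------|-------|--------|
| 1 | ... | 1 | A |
| 2 | ... | 1 | B |
| 3 | ... | 2 | A |
| 4 | ... | 2 | B |
| $\vdots$ | | | |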

How can I compare the performance of a model across the values of the factor?

For instance, let's say I'm using a handwritten digit dataset where the factor is whether the writer is left- or right-handed. How could I check whether the model does better on left-handed or on right-handed data?

$\endgroup$

    2 Answers

    3
    $\begingroup$

    Once you have the predictions for the full dataset, you can split them into two subsets (one filtering on Factor==A and the other on Factor==B) and compute your score (e.g., accuracy) on each subset separately.
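
The subset-and-score idea can be sketched in a few lines of plain Python. This is only an illustration: the function name `accuracy_by_factor` is hypothetical, accuracy is assumed as the score, and the labels, predictions, and factor values are made up to show the mechanics.

```python
from collections import defaultdict


def accuracy_by_factor(y_true, y_pred, factor):
    """Return {factor_value: accuracy}, scoring each subset separately."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, factor):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}


# Made-up toy data (not from any real model), one entry per sample.
y_true = [1, 1, 2, 2, 1, 2, 1, 2]
y_pred = [1, 2, 2, 2, 1, 1, 1, 2]
factor = ["A", "B", "A", "B", "A", "B", "A", "B"]

scores = accuracy_by_factor(y_true, y_pred, factor)
print(scores)
```

Any other metric (precision, recall, F1) can be swapped in the same way. With small subsets, a gap between the two scores may be noise, so it is worth checking the subset sizes before drawing conclusions.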

    $\endgroup$
      2
      $\begingroup$

      Multinomial logistic regression could be used, with the factor added as a dummy (0/1, i.e. "one-hot" encoded) predictor where 0 = A and 1 = B. This would first tell you whether A vs. B "matters" at all, i.e., whether the binary predictor helps explain the classification results. You could then quantify the difference in classification results between the two types of writers.
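
A simplified variant of this idea, not necessarily what the answer has in mind, is sketched below. It assumes a binary outcome (was the prediction correct or not) instead of the multinomial class, and uses the A/B dummy as the only predictor. In that special case the logistic regression fit has a closed form: the slope is the log odds ratio of the 2x2 table, and a Wald test gives an approximate p-value. The function name and the counts are made up for illustration.

```python
import math


def factor_effect_on_correctness(correct, dummy):
    """Logistic regression of correct (0/1) on a single 0/1 dummy.

    With one binary predictor the maximum-likelihood fit is available in
    closed form from the 2x2 table of counts. Every cell must be non-empty.
    """
    # counts[d][c] = number of samples with dummy value d and correctness c
    counts = [[0, 0], [0, 0]]
    for c, d in zip(correct, dummy):
        counts[d][c] += 1
    (a_wrong, a_right), (b_wrong, b_right) = counts
    if min(a_wrong, a_right, b_wrong, b_right) == 0:
        raise ValueError("every cell of the 2x2 table must be non-empty")

    # Log-odds of a correct prediction when dummy = 0 (factor A).
    intercept = math.log(a_right / a_wrong)
    # Change in log-odds when dummy = 1 (factor B): the log odds ratio.
    slope = math.log(b_right / b_wrong) - intercept
    # Wald standard error of the slope.
    std_err = math.sqrt(1 / a_wrong + 1 / a_right + 1 / b_wrong + 1 / b_right)
    z = slope / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return intercept, slope, std_err, p_value


# Made-up counts: factor A has 45 right / 5 wrong, factor B has 30 right / 20 wrong.
correct = [1] * 45 + [0] * 5 + [1] * 30 + [0] * 20
dummy = [0] * 50 + [1] * 50

intercept, slope, std_err, p_value = factor_effect_on_correctness(correct, dummy)
print(f"slope={slope:.3f}, odds ratio={math.exp(slope):.3f}, p={p_value:.4f}")
```

A negative slope means the odds of a correct prediction are lower for B than for A, and a small p-value suggests the gap is unlikely to be chance alone. For the full multinomial setup, with the real features as additional predictors, a statistics package would be the practical choice.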

      $\endgroup$
