3
$\begingroup$

The documentation for sklearn's LogisticRegression() includes the following note:

The underlying C implementation uses a random number generator to select features when fitting the model. It is thus not uncommon, to have slightly different results for the same input data. If that happens, try with a smaller tol parameter.

I have two questions regarding this note:

  1. What is the meaning of the following sentence?

The underlying C implementation uses a random number generator to select features when fitting the model

  2. What is the tol parameter?
$\endgroup$

    1 Answer

    4
    $\begingroup$

    The note you reference was added back when the only solver available in LogisticRegression was LIBLINEAR. That solver uses coordinate descent: coordinates are examined and adjusted one at a time, iteratively. The order in which they are visited is apparently determined by a random number generator.

    See also
    https://stats.stackexchange.com/q/327225/232706
    https://stackoverflow.com/q/38640109/10495893
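To make the role of the random number generator concrete, here is a minimal sketch of randomized coordinate descent in plain Python. It is not LIBLINEAR's code: the function name `coordinate_descent`, the small quadratic objective, and the `seed` and `sweeps` parameters are all assumptions chosen for illustration. The only point is that the visiting order is shuffled on every pass, so stopping early can leave slightly different answers for different seeds, while running long enough brings them together.

```python
import random


def coordinate_descent(A, b, sweeps, seed):
    """Minimise 0.5 * x^T A x - b^T x one coordinate at a time.

    A is assumed symmetric positive definite (hypothetical toy problem).
    The order in which coordinates are visited is reshuffled every sweep.
    """
    rng = random.Random(seed)
    n = len(b)
    x = [0.0] * n
    for _ in range(sweeps):
        order = list(range(n))
        rng.shuffle(order)  # this is where the random number generator comes in
        for i in order:
            # Exact minimiser along coordinate i with all other coordinates fixed.
            rest = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - rest) / A[i][i]
    return x


def max_gap(u, v):
    """Largest absolute difference between two coefficient vectors."""
    return max(abs(a - c) for a, c in zip(u, v))


A = [[4.0, 1.0, 0.5],
     [1.0, 3.0, 1.0],
     [0.5, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]

# Few sweeps: the result may still depend on the shuffling order.
x_short0 = coordinate_descent(A, b, sweeps=2, seed=0)
x_short1 = coordinate_descent(A, b, sweeps=2, seed=1)

# Many sweeps: both runs settle on the same minimiser.
x_long0 = coordinate_descent(A, b, sweeps=200, seed=0)
x_long1 = coordinate_descent(A, b, sweeps=200, seed=1)

print("gap after 2 sweeps:  ", max_gap(x_short0, x_short1))
print("gap after 200 sweeps:", max_gap(x_long0, x_long1))
```

In scikit-learn itself, the analogous knob is the `random_state` parameter of LogisticRegression, which fixes the seed so that repeated fits are reproducible.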

    That note probably doesn't apply to all of the newer solvers, and the documentation ought to be clarified.

    As for tol, it is the tolerance criterion for convergence: when the updates to be made are smaller than tol, we say that's good enough and stop iterating. Exactly which "updates" are measured may also depend on the solver. See e.g.
    https://stats.stackexchange.com/a/255380/232706
    https://github.com/scikit-learn/scikit-learn/issues/22243
    https://github.com/scikit-learn/scikit-learn/issues/11536
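As a rough illustration of what a tolerance-based stopping rule does, the sketch below runs gradient descent on a one-dimensional toy function and stops once the size of the update falls below `tol`. Everything here is an assumption made for the example (the function `minimise`, the objective `(w - 3)^2`, the step size); scikit-learn's solvers each measure convergence in their own way, as the links above discuss. The general behaviour is the same, though: a smaller `tol` means more iterations and a result closer to the true optimum.

```python
def minimise(tol, step=0.1, max_iter=10_000):
    """Gradient descent on f(w) = (w - 3)^2 with a tolerance-based stop.

    Hypothetical toy example: iteration ends as soon as the update
    applied to w is smaller in magnitude than tol.
    """
    w = 0.0
    iterations = 0
    for iterations in range(1, max_iter + 1):
        update = step * 2.0 * (w - 3.0)  # step size times the gradient of f at w
        w -= update
        if abs(update) < tol:
            break  # "good enough": stop iterating
    return w, iterations


w_loose, n_loose = minimise(tol=1e-2)
w_tight, n_tight = minimise(tol=1e-8)

print("loose tol:", n_loose, "iterations, error", abs(w_loose - 3.0))
print("tight tol:", n_tight, "iterations, error", abs(w_tight - 3.0))
```

This is why the documentation suggests a smaller tol when results vary between runs: each run is pushed closer to the same optimum before it stops, so whatever differences the random ordering introduced have less room to survive.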

    $\endgroup$
