Skip to content

ENH Raises warning when getting non-finite score in SearchCV#18266

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 21 commits into from
Aug 29, 2020

Conversation

subrat93
Copy link
Contributor

@subrat93subrat93 commented Aug 26, 2020

Reference Issues/PRs

Fixes#10529
Supersedes and closes#10546
Supersedes and closes#15469

What does this implement/fix? Explain your changes.

The fix checks for the presence of any inf/-inf values in the mean score calculated after GridSearchCV.
If yes, it raises a warning - "One or more of the test scores are infinite"

Any other comments?

Copy link
Member

@glemaitreglemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add an entry to the change log at doc/whats_new/v*.rst. Like the other entries there, please reference this pull request with :pr: and credit yourself (and other contributors if applicable) with :user:.
In addition to yourself, please add @Nirvan101@ArthurBook as contributors to this PR.

@glemaitreglemaitre self-requested a review August 28, 2020 09:10
@glemaitreglemaitre changed the title Fixed -> inf or -inf values in CV ruin the mean and stdENH raise warning when getting non-finite value during scoring in GridSearchCVAug 28, 2020
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
@glemaitre
Copy link
Member

Can you solve the linting issue: https://app.circleci.com/pipelines/github/scikit-learn/scikit-learn/7266/workflows/6e341ac6-b7dc-498a-9133-59690fcef076/jobs/118045

sklearn/model_selection/tests/test_search.py:28:1: F401 'scipy.stats.distributions.norm' imported but unused from scipy.stats.distributions import norm ^ sklearn/model_selection/tests/test_search.py:1790:1: W293 blank line contains whitespace ^ sklearn/model_selection/tests/test_search.py:1791:1: W293 blank line contains whitespace ^ 
@subrat93
Copy link
ContributorAuthor

@glemaitre, thanks for the review. I've resolved the linting issues.

@@ -26,6 +26,7 @@

from scipy.stats import bernoulli, expon, uniform


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need this extra return line

Copy link
Member

@glemaitreglemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just 2 small fixes otherwise LGTM.

subrat93and others added 2 commits August 28, 2020 19:38
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
@glemaitreglemaitre removed their request for review August 28, 2020 14:29
@glemaitre
Copy link
Member

@thomasjpfan@adrinjalali Do you want to have a look?

Copy link
Member

@thomasjpfanthomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR @subrat93 !

@@ -1750,6 +1750,43 @@ def get_n_splits(self, *args, **kw):
ridge.fit(X[:train_size], y[:train_size])


@pytest.mark.parametrize("return_train_score", [False, True])
def test_gridsearchcv_raise_warning_with_non_finite_score(return_train_score):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can also check for RandomSearchCV here as well:

@pytest.mark.parametrize("return_train_score", [False, True])@pytest.mark.parametrize("SearchCV, specialized_params", [(GridSearchCV, {"param_grid": {"max_depth": [2, 3]}}), (RandomizedSearchCV, {"param_distributions": {"max_depth": [2, 3]}, "n_iter": 2})])deftest_searchcv_raise_warning_with_non_finite_score( SearchCV, specialized_params, return_train_score): ... grid=SearchCV( DecisionTreeClassifier(), scoring=FailingScorer(), cv=3, return_train_score=return_train_score, **specialized_params )
subrat93and others added 4 commits August 29, 2020 12:09
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
Copy link
Member

@thomasjpfanthomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM Thank you @subrat93 !

@thomasjpfanthomasjpfan changed the title ENH raise warning when getting non-finite value during scoring in GridSearchCVENH Raises warning when getting non-finite score in SearchCVAug 29, 2020
@thomasjpfanthomasjpfan merged commit 192c233 into scikit-learn:masterAug 29, 2020
@glemaitre
Copy link
Member

Thanks @subrat93

@subrat93
Copy link
ContributorAuthor

Thanks a lot @glemaitre and @thomasjpfan for your support!! I look forward to contributing more.

jayzed82 pushed a commit to jayzed82/scikit-learn that referenced this pull request Oct 22, 2020
…learn#18266) Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com> Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
Sign up for freeto join this conversation on GitHub. Already have an account? Sign in to comment
3 participants
@subrat93@glemaitre@thomasjpfan
close