Skip to content

BUG fix SparseCoder to follow scikit-learn API and allow cloning#17679

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 33 commits into from
Jun 25, 2020
Merged

BUG fix SparseCoder to follow scikit-learn API and allow cloning #17679

merged 33 commits into from
Jun 25, 2020

Conversation

sdpython
Copy link
Contributor

@sdpythonsdpython commented Jun 23, 2020

Reference Issues/PRs

Fixes#8675
Fixes#16336
Supersedes and closes#16346

What does this implement/fix? Explain your changes.

SparseCoder cannot be used in a GridSearchCV because it could not be cloned. A parameter in the constructor had a different meaning when stored in an instance (dictionary).

@glemaitre
Copy link
Member

Uhm did something about this one. Let me check.

@glemaitre
Copy link
Member

Found it: #16346

@sdpython
Copy link
ContributorAuthor

Let me know which one you prefer. PR #16346 does not test cloning and that's what caused the issue. I recommend adding one more test.

@glemaitre
Copy link
Member

I think this is important to make the estimator scikit-learn compliant (it will crash soon otherwise). Cloning is part of that so we should merge both PR. Could you merge my PR inside yours like this you supersede mine? I will review it then.

@glemaitre
Copy link
Member

I just solve the conflicts in mine so you should have no so much trouble to merge them.

@cmarmo
Copy link
Contributor

@sdpython do you mind editing the PR adding that you will also close #16336? Thanks for your patience.

@sdpython
Copy link
ContributorAuthor

ok :)

@glemaitreglemaitre self-requested a review June 25, 2020 09:40
@glemaitre
Copy link
Member

@sdpython I will do a review. I think that we might need to add some of the test (usually done in common test) but that require specific shape for X and dictionary.

Copy link
Member

@ogriselogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments but otherwise LGTM. I don't think there is an easy way to re-add SparseCoder to test_common as it requires to specify the dictionary in the constructor params and the dictionary shape should be consistent with X.shape[1].

@glemaitre
Copy link
Member

We would need to have something like that (it is only the transformer).

deftest_sparse_coder_common_transformer(): fromfunctoolsimportpartialfromsklearn.utils.estimator_checksimportcheck_transformer_data_not_an_arrayfromsklearn.utils.estimator_checksimportcheck_transformer_generalfromsklearn.utils.estimator_checksimportcheck_transformers_unfittedrng=np.random.RandomState(777) n_components, n_features=40, 3init_dict=rng.rand(n_components, n_features) sc=SparseCoder(init_dict) check_transformer_data_not_an_array(sc.__class__.__name__, sc) check_transformer_general(sc.__class__.__name__, sc) check_transformer_general_memmap=partial( check_transformer_general, readonly_memmap=True ) check_transformer_general_memmap(sc.__class__.__name__, sc) check_transformers_unfitted(sc.__class__.__name__, sc)
@glemaitre
Copy link
Member

I think this is wiser to merge with the current test and see how the test_common.py could be modify such that the size of the dictionary can be modified on the fly.

@adrinjalali
Copy link
Member

A few comments but otherwise LGTM. I don't think there is an easy way to re-add SparseCoder to test_common as it requires to specify the dictionary in the constructor params and the dictionary shape should be consistent with X.shape[1].

I have encountered this while working on other estimators outside sklearn, and it's partly why I wanted to have a better way of telling common tests now to generate required data for certain tests.

@sdpython
Copy link
ContributorAuthor

deftest_sparse_coder_common_transformer(): fromfunctoolsimportpartialfromsklearn.utils.estimator_checksimportcheck_transformer_data_not_an_arrayfromsklearn.utils.estimator_checksimportcheck_transformer_generalfromsklearn.utils.estimator_checksimportcheck_transformers_unfittedrng=np.random.RandomState(777) n_components, n_features=40, 3init_dict=rng.rand(n_components, n_features) sc=SparseCoder(init_dict) check_transformer_data_not_an_array(sc.__class__.__name__, sc) check_transformer_general(sc.__class__.__name__, sc) check_transformer_general_memmap=partial( check_transformer_general, readonly_memmap=True ) check_transformer_general_memmap(sc.__class__.__name__, sc) check_transformers_unfitted(sc.__class__.__name__, sc)

Added.

Copy link
Member

@ogriselogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks @sdpython!

@glemaitre
Copy link
Member

There is a single thing missing which is an entry in what's new because we are deprecated one parameter.
Can you this acknowledging yourself @sdpython and then we are good for merging.

@glemaitreglemaitre changed the title Fixes #8675, fix cloning for SparseCoderBUG fix SparseCoder to follow scikit-learn API and allow cloningJun 25, 2020
Copy link
Member

@agramfortagramfort left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@glemaitre merge if happy

thx @sdpython

@agramfortagramfort merged commit b0c03d1 into scikit-learn:masterJun 25, 2020
@agramfort
Copy link
Member

actually @glemaitre had approved so go !

thx @sdpython

@glemaitre
Copy link
Member

I will make a PR adding the deprecation in what's new :)

ogrisel added a commit that referenced this pull request Jun 26, 2020
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
glemaitre added a commit to glemaitre/scikit-learn that referenced this pull request Jul 17, 2020
…kit-learn#17679) * BUG fix SparseCoder to follow scikit-learn API * TST check that get_params and set_params work as expected * address comments * PEP8 * iter * Fixesscikit-learn#8675, fix cloning for SparseCoder * remove spaces * Update _dict_learning.py * fix confusing arguments * remove unnecessary code * Update test_common.py * removes spaces * PEP8 * iter * fix merge * ignore a mypy warning * type: ignore * remove one deprecated verification * Update sklearn/decomposition/_dict_learning.py Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com> * Update sklearn/decomposition/_dict_learning.py Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com> * Update sklearn/decomposition/tests/test_dict_learning.py Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com> * test clone produces different id * review * add one more test * lint * Update sklearn/decomposition/tests/test_dict_learning.py Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com> Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
glemaitre added a commit to glemaitre/scikit-learn that referenced this pull request Jul 17, 2020
) Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
jayzed82 pushed a commit to jayzed82/scikit-learn that referenced this pull request Oct 22, 2020
…kit-learn#17679) * BUG fix SparseCoder to follow scikit-learn API * TST check that get_params and set_params work as expected * address comments * PEP8 * iter * Fixesscikit-learn#8675, fix cloning for SparseCoder * remove spaces * Update _dict_learning.py * fix confusing arguments * remove unnecessary code * Update test_common.py * removes spaces * PEP8 * iter * fix merge * ignore a mypy warning * type: ignore * remove one deprecated verification * Update sklearn/decomposition/_dict_learning.py Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com> * Update sklearn/decomposition/_dict_learning.py Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com> * Update sklearn/decomposition/tests/test_dict_learning.py Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com> * test clone produces different id * review * add one more test * lint * Update sklearn/decomposition/tests/test_dict_learning.py Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com> Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
jayzed82 pushed a commit to jayzed82/scikit-learn that referenced this pull request Oct 22, 2020
) Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Sign up for freeto join this conversation on GitHub. Already have an account? Sign in to comment
6 participants
@sdpython@glemaitre@cmarmo@adrinjalali@agramfort@ogrisel
close