
Relax check_X_y for DecayEstimator #580

Merged · 5 commits into koaning:main · Oct 13, 2023

Conversation

FBruzzesi (Collaborator)

Description

Relaxes the check for X in DecayEstimator by delegating the actual checks to the given estimator. This fixes #573 and allows estimators that do not need X features.

Remark: as discussed in the issue, it may be worth removing the check entirely.
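For illustration, a minimal sketch of the relaxed behaviour (the helper name `fit_with_decay` and its exact weighting are mine, not the PR's code; it only shows the idea of deriving the sample count without validating X ourselves):

```python
import numpy as np
from sklearn.base import clone
from sklearn.linear_model import LinearRegression

def fit_with_decay(model, X, y, decay=0.999):
    # Derive the sample count from y and let the wrapped estimator run its
    # own input checks inside fit(); X is passed through untouched.
    n_samples = np.asarray(y).shape[0]
    weights = decay ** np.arange(n_samples)[::-1]  # w_{t-1} = decay * w_{t}
    return clone(model).fit(X, y, sample_weight=weights)

fitted = fit_with_decay(LinearRegression(), [[1.0], [2.0], [3.0]], [1.0, 2.0, 3.0])
```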

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Checklist:

  • My code follows the style guidelines (flake8)
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation (also to the readme.md)
  • I have added tests that prove my fix is effective or that my feature works
  • I have added tests to check whether the new feature adheres to the sklearn convention
  • New and existing unit tests pass locally with my changes

koaning (Owner) commented Oct 5, 2023

I'm also OK with removing the entire check, mainly because the internal method should already be doing that, and this way we'll need to make fewer assumptions.

FBruzzesi (Collaborator, Author)

> I'm also OK with removing the entire check, mainly because the internal method should already be doing that, and this way we'll need to make fewer assumptions.

To be completely pedantic, the check_X_y function call checks and converts X and y to arrays. This then allows access to the .shape attribute to know how many samples there are and to compute the decay.

Now both approaches feel wrong, each with its own shortcomings:

  • Keeping the check may lead to errors, or to results that differ from the estimator's own check (e.g. if the estimator allows multioutput while our check doesn't).
  • Removing the check means we cannot access .shape or len() in the general case (I noticed this by running the unit tests; see the illustration below), and I would be unsure how to proceed.

Opinions?
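To make the trade-off concrete, a small illustration using scikit-learn's public check_X_y (nothing here is specific to this PR):

```python
from sklearn.utils import check_X_y

# With the check: list-likes are converted to numpy arrays, so .shape is safe.
X, y = check_X_y([[1.0], [2.0], [3.0]], [0, 1, 0])
print(X.shape)  # (3, 1)

# Without the check: X stays whatever the caller passed in. A generator,
# for instance, has neither .shape nor len(), so the sample count is unknown.
X_raw = ([x] for x in [1.0, 2.0, 3.0])
print(hasattr(X_raw, "shape"), hasattr(X_raw, "__len__"))  # False False
```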

koaning (Owner) commented Oct 10, 2023

> Removing the check means we cannot access .shape or len() in the general case (I noticed this by running the unit tests), and I would be unsure how to proceed.

Ah, I forgot about that. That's a fair point. I guess the simplest solution is to offer a flag to the end-user and to defer the responsibility of applying the check to them? That said, I think most of the time X will be .shape-able or len()-able, so the default could be True. Or am I missing something?
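A sketch of what that flag could look like (the class body and flag wiring here are assumptions at this point in the thread; the final parameter name is settled in the review below):

```python
import numpy as np
from sklearn.base import clone
from sklearn.utils import check_X_y as _check_X_y

class DecayEstimatorSketch:
    # Hypothetical flag-based variant: the end-user decides whether the
    # strict check (and array conversion) runs at all.
    def __init__(self, model, decay=0.999, check_X_y=True):
        self.model = model
        self.decay = decay
        self.check_X_y = check_X_y

    def fit(self, X, y):
        if self.check_X_y:                 # opt-in strict validation
            X, y = _check_X_y(X, y)
        n_samples = np.asarray(y).shape[0]
        weights = self.decay ** np.arange(n_samples)[::-1]
        self.estimator_ = clone(self.model).fit(X, y, sample_weight=weights)
        return self
```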

FBruzzesi (Collaborator, Author)

> I guess the simplest solution is to offer a flag to the end-user and to defer the responsibility of applying the check to them?

That's reasonable!

> That said, I think most of the time X will be .shape-able or len()-able, so the default could be True.

With such a default, the scikit-learn tests would fail. I will add a big warning in the docstring 😁

koaning (Owner) commented Oct 10, 2023

> With such a default, the scikit-learn tests would fail. I will add a big warning in the docstring 😁

There are a lot of components in this library that fail the standard checks. That's OK; it's not ideal, but given how "non-standard" some of these ideas are, it's fine to turn a test off sometimes. Adding a warning to the docstring is fair though!

Review comment on the changed `__init__` signature:

```diff
     The DecayEstimator will use exponential decay to weight the parameters.

     w_{t-1} = decay * w_{t}
     """

-    def __init__(self, model, decay: float = 0.999, decay_func="exponential"):
+    def __init__(
+        self, model, decay: float = 0.999, decay_func="exponential", check_X_y=False
```

koaning (Owner)
This is a bit of a nitpick.

Suggested change:

```diff
-        self, model, decay: float = 0.999, decay_func="exponential", check_X_y=False
+        self, model, decay: float = 0.999, decay_func="exponential", check_input=False
```

My reasoning here is that some estimators may only have a check_X; this is the case when the component is a scikit-learn transformer. But it'd be nice to have a consistent name for the internal check, so maybe this name change?

Once that change is in/discussed I think we can hit merge :)
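For reference, a usage sketch of the renamed parameter (assuming the merged signature matches the suggestion above):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklego.meta import DecayEstimator

X = np.arange(10, dtype=float).reshape(-1, 1)
y = np.arange(10, dtype=float)

# check_input=False skips the strict check_X_y call and delegates all input
# validation to the wrapped estimator, which is what this PR enables.
model = DecayEstimator(LinearRegression(), decay=0.999, check_input=False)
model.fit(X, y)
```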

FBruzzesi (Collaborator, Author)

That's very reasonable 👌 let me adjust that!

koaning (Owner) left a comment

Left one comment related to the name of a variable.

koaning (Owner) left a comment

LGTM!

koaning merged commit f377999 into koaning:main on Oct 13, 2023 · 7 checks passed
FBruzzesi deleted the patch/573-decay-estimator-check branch on Oct 13, 2023 at 09:47

Successfully merging this pull request may close these issues:

  • [BUG] Cannot use GroupedPredictor with DecayEstimator for fitting 1-feature dataframe (#573)