Issues while calling model.fit_predict #11

Open
lucamcwood opened this issue Aug 28, 2024 · 1 comment

Comments

@lucamcwood

Hello,
I am running a grid search over hyper-parameters, and Claspy is one of the components being tuned.
For some hyper-parameter combinations, `model.fit_predict` fails with one of two errors:

  1. OverflowError: Python int too large to convert to C long
  2. ValueError: negative dimensions are not allowed

I am collecting the parameter combinations that trigger these errors and will report back. The two tracebacks are below.
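
One way to collect the failing combinations is sketched below. It is only an illustration: `collect_failures` and `toy_run` are hypothetical names, and `toy_run` merely imitates the second error instead of calling Claspy. In a real search, `run` would wrap the call to `fit_predict`.

```python
import itertools


def collect_failures(run, param_grid):
    """Call run(**params) for every combination in param_grid and
    return the combinations that raised, with the exception details."""
    failures = []
    names = list(param_grid)
    for values in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        try:
            run(**params)
        except (OverflowError, ValueError) as exc:
            # Record the offending parameters instead of aborting the search.
            failures.append((params, type(exc).__name__, str(exc)))
    return failures


def toy_run(window_size, n_timepoints):
    # Hypothetical stand-in for the model call: raises when the profile
    # length n_timepoints - window_size + 1 would be negative.
    if n_timepoints - window_size + 1 < 0:
        raise ValueError("negative dimensions are not allowed")


grid = {"window_size": [10, 200], "n_timepoints": [100, 1000]}
failures = collect_failures(toy_run, grid)
for params, exc_type, message in failures:
    print(params, exc_type, message)
```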

ERROR:root:Exception occurred: Python int too large to convert to C long
Traceback (most recent call last):
  File ".../cp_algorithms.py", line 468, in predict
    cps = self.clasp_model.fit_predict(batch)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../lib64/python3.11/site-packages/claspy/segmentation.py", line 315, in fit_predict
    return self.fit(time_series).predict(sparse)
           ^^^^^^^^^^^^^^^^^^^^^
  File ".../lib64/python3.11/site-packages/claspy/segmentation.py", line 201, in fit
    self.window_size = max(1, map_window_size_methods(self.window_size)(time_series) // 2)
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../lib64/python3.11/site-packages/claspy/window_size.py", line 107, in suss
    score = 1 - (_suss_score(time_series, window_size, stats) - min_score) / (max_score - min_score)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../lib64/python3.11/site-packages/claspy/window_size.py", line 35, in _suss_score
    roll_mean = roll.mean().to_numpy()[window_size:]
                ^^^^^^^^^^^
  File ".../lib64/python3.11/site-packages/pandas/core/window/rolling.py", line 2223, in mean
    return super().mean(
           ^^^^^^^^^^^^^
  File ".../lib64/python3.11/site-packages/pandas/core/window/rolling.py", line 1551, in mean
    return self._apply(
           ^^^^^^^^^^^^
  File ".../Dude_py311/lib64/python3.11/site-packages/pandas/core/window/rolling.py", line 663, in _apply
    return self._apply_blockwise(homogeneous_func, name, numeric_only)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../lib64/python3.11/site-packages/pandas/core/window/rolling.py", line 503, in _apply_blockwise
    return self._apply_series(homogeneous_func, name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../lib64/python3.11/site-packages/pandas/core/window/rolling.py", line 487, in _apply_series
    result = homogeneous_func(values)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../lib64/python3.11/site-packages/pandas/core/window/rolling.py", line 658, in homogeneous_func
    result = calc(values)
             ^^^^^^^^^^^^
  File ".../Dude_py311/lib64/python3.11/site-packages/pandas/core/window/rolling.py", line 655, in calc
    return func(x, start, end, min_periods, *numba_args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "pandas/_libs/window/aggregations.pyx", line 255, in pandas._libs.window.aggregations.roll_mean
OverflowError: Python int too large to convert to C long
ERROR:root:Exception occurred: Python int too large to convert to C long

ERROR:root:Exception occurred: negative dimensions are not allowed
Traceback (most recent call last):
  File ".../cp_algorithms.py", line 468, in predict
    cps = self.clasp_model.fit_predict(batch)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../lib64/python3.11/site-packages/claspy/segmentation.py", line 315, in fit_predict
    return self.fit(time_series).predict(sparse)
           ^^^^^^^^^^^^^^^^^^^^^
  File ".../lib64/python3.11/site-packages/claspy/segmentation.py", line 242, in fit
    profile = np.full(shape=self.n_timepoints - self.window_size + 1, fill_value=-np.inf, dtype=np.float64)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../lib64/python3.11/site-packages/numpy/core/numeric.py", line 344, in full
    a = empty(shape, dtype, order)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: negative dimensions are not allowed

@ermshaua
Owner

It looks like the window size is causing the problems here. Make sure the window size is always much smaller than the length of the time series; otherwise the algorithm breaks down.

As an upper bound for the window size, you could use roughly 0.1 * the length of the time series. The window size should also not fall below a small constant, say 3 or 5 values, so use that constant as your lower bound.
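
A minimal sketch of these bounds in Python follows. The helper names `clamp_window_size` and `profile_length`, the default lower bound of 5, and the 0.1 fraction as a default are assumptions for illustration and are not part of Claspy. `profile_length` repeats the shape expression from the second traceback to show that a clamped window keeps it positive.

```python
def clamp_window_size(window_size, n_timepoints, lower=5, upper_frac=0.1):
    """Clamp a candidate window size to [lower, upper_frac * n_timepoints].

    Hypothetical helper: raises ValueError when the series is too short
    for the bounds to be compatible.
    """
    upper = int(upper_frac * n_timepoints)
    if upper < lower:
        raise ValueError(
            f"time series of length {n_timepoints} is too short for a "
            f"window of at least {lower} values"
        )
    return max(lower, min(int(window_size), upper))


def profile_length(n_timepoints, window_size):
    # Same expression as the np.full shape in the second traceback.
    return n_timepoints - window_size + 1


n_timepoints = 1000
for candidate in (1, 50, 500, 5000):
    w = clamp_window_size(candidate, n_timepoints)
    print(candidate, "->", w, "profile length:", profile_length(n_timepoints, w))
```

In a grid search, the clamped value would be passed as the window size in place of the raw candidate, and series that raise the `ValueError` would be skipped.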

Let me know if this helps!
