[BUG] ValueError: cannot reshape array of size 4 into shape (2,4,1) #6380
Comments
It seems there is indeed something not right with the conversion. The problem can be isolated to this:
from datetime import datetime

import pandas as pd
from sktime.datatypes import convert_to

# Define the multi-index: two instances, two time stamps each
index = pd.MultiIndex.from_tuples([
    (0, datetime.strptime('2024-04-20 18:22:14.877500', '%Y-%m-%d %H:%M:%S.%f')),
    (0, datetime.strptime('2024-04-20 18:22:14.903000', '%Y-%m-%d %H:%M:%S.%f')),
    (1, datetime.strptime('2024-04-20 18:24:42.453400', '%Y-%m-%d %H:%M:%S.%f')),
    (1, datetime.strptime('2024-04-20 18:24:42.478800', '%Y-%m-%d %H:%M:%S.%f'))
], names=['instance', 'Time'])

# Define the DataFrame
df = pd.DataFrame({
    'LeftControllerVelocity_0': [-0.01, -0.01, 0.06, 0.06]
}, index=index)

# Reproduces the reshape error from the issue title
convert_to(df, "numpy3D")
ok, I get what the cause is, although it is not entirely clear what the best way to resolve it is. The cause is that the panel has equal length series but does not have equal time stamp index. Some distances - including the default, The detection in the checker is off, possibly since "is_equal_length" is ill-specified: sometimes it detects the first condition, sometimes the second, so no clear warning message is raised. There is a workaround, and there are multiple ways we could "fix" this. The workaround is to drop the time index entirely, or convert it into an offset. For fixes, I can think of:
Do you have a preference, @helloplayer1? As said, the workaround with current version of
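The first workaround mentioned above (dropping the time index) can be sketched as follows. This is a minimal illustration using pandas only, not sktime's own code: the helper name `drop_time_index` and the level name `timepoint` are hypothetical, and the sketch assumes rows are already sorted by instance and time. Each instance's time stamps are replaced by positions 0, 1, 2, ..., so all instances share the same index set.

```python
import pandas as pd

# Same data as in the reproduction above, rebuilt so this block runs on its own.
times = pd.to_datetime([
    "2024-04-20 18:22:14.877500",
    "2024-04-20 18:22:14.903000",
    "2024-04-20 18:24:42.453400",
    "2024-04-20 18:24:42.478800",
])
index = pd.MultiIndex.from_arrays([[0, 0, 1, 1], times], names=["instance", "Time"])
df = pd.DataFrame({"LeftControllerVelocity_0": [-0.01, -0.01, 0.06, 0.06]}, index=index)


def drop_time_index(panel):
    """Replace the time level by the row position within each instance.

    Hypothetical helper; assumes rows are sorted by instance, then time.
    """
    instance = panel.index.get_level_values(0)
    position = panel.groupby(level=0).cumcount().to_numpy()
    out = panel.copy()
    out.index = pd.MultiIndex.from_arrays(
        [instance, position], names=[panel.index.names[0], "timepoint"]
    )
    return out


fixed = drop_time_index(df)

# With a shared positional index, the values fit a 3D array of shape
# (instances, variables, time points) without any padding.
n_instances = fixed.index.get_level_values(0).nunique()
values = fixed.to_numpy().reshape(n_instances, -1, fixed.shape[1]).swapaxes(1, 2)
print(values.shape)
```

With sktime installed, one would then try `convert_to(fixed, "numpy3D")` on the re-indexed frame instead of the manual reshape.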
By offset, you mean that the first time point of each instance would be set to 0 and every later time point would be the time elapsed since then? For the fix, I think it would make more sense to raise the error message and leave the decision on what to do to the user, possibly hinting at the options they have.
yes, exactly.
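The offset conversion confirmed above can be sketched like this. It is a pandas-only illustration under stated assumptions: the helper name `to_offset_index` and the level name `offset` are hypothetical, and the time level is assumed to hold timezone-naive time stamps.

```python
import pandas as pd

# Same data as in the reproduction above, rebuilt so this block runs on its own.
times = pd.to_datetime([
    "2024-04-20 18:22:14.877500",
    "2024-04-20 18:22:14.903000",
    "2024-04-20 18:24:42.453400",
    "2024-04-20 18:24:42.478800",
])
index = pd.MultiIndex.from_arrays([[0, 0, 1, 1], times], names=["instance", "Time"])
df = pd.DataFrame({"LeftControllerVelocity_0": [-0.01, -0.01, 0.06, 0.06]}, index=index)


def to_offset_index(panel):
    """Replace each time stamp by the time elapsed since its instance's first one.

    Hypothetical helper, not part of sktime.
    """
    instance = panel.index.get_level_values(0)
    time = pd.Series(panel.index.get_level_values(1).to_numpy(), index=panel.index)
    start = time.groupby(level=0).transform("min")  # first time stamp per instance
    offset = time - start
    out = panel.copy()
    out.index = pd.MultiIndex.from_arrays(
        [instance, offset.to_numpy()], names=[panel.index.names[0], "offset"]
    )
    return out


offset_df = to_offset_index(df)
print(offset_df)
```

For the example data the offsets are 0 and 25.5 ms for instance 0, but 0 and 25.4 ms for instance 1. The instances therefore still do not share an identical index after this conversion, so irregularly sampled data may additionally need rounding or resampling, or the positional index shown earlier.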
I see, in your "real" use case, I assume. The key is the "unequal length" tag which also means "unequal set of indices". Relevant material here:
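The distinction between "equal length" and "equal set of indices" discussed in this thread can be made explicit with a small check. The sketch below is pandas-only and is not sktime's checker. The function names `describe_panel_index` and `require_equal_index` are hypothetical. It illustrates the option of raising a clear error and leaving the decision to the user.

```python
import pandas as pd

# Same data as in the reproduction above, rebuilt so this block runs on its own.
times = pd.to_datetime([
    "2024-04-20 18:22:14.877500",
    "2024-04-20 18:22:14.903000",
    "2024-04-20 18:24:42.453400",
    "2024-04-20 18:24:42.478800",
])
index = pd.MultiIndex.from_arrays([[0, 0, 1, 1], times], names=["instance", "Time"])
df = pd.DataFrame({"LeftControllerVelocity_0": [-0.01, -0.01, 0.06, 0.06]}, index=index)


def describe_panel_index(panel):
    """Return (equal_length, equal_index) for a two-level panel DataFrame."""
    time_indices = [
        tuple(group.index.get_level_values(-1))
        for _, group in panel.groupby(level=0)
    ]
    equal_length = len({len(t) for t in time_indices}) <= 1
    equal_index = len(set(time_indices)) <= 1
    return equal_length, equal_index


def require_equal_index(panel):
    """Raise an informative error if instances do not share one time index."""
    equal_length, equal_index = describe_panel_index(panel)
    if not equal_index:
        detail = "equal length but different" if equal_length else "different length and"
        raise ValueError(
            f"Instances have {detail} time indices. Consider dropping the time "
            "index or converting it to an offset before converting to a 3D array."
        )


print(describe_panel_index(df))
```

For the example panel the two instances have the same number of time points but disjoint time stamps, which is the combination that triggered the unclear reshape error.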
Describe the bug
I receive the following error when I try to call fit for a KNeighborsTimeSeriesClassifier with "pd-multiindex" data:
To Reproduce
Expected behavior
The model is fitted without any error
Additional context
This is what the printed df looks like:
This is the result of print(check_is_mtype(df, mtype="pd-multiindex", return_metadata=True)):
Versions
System:
python: 3.12.3 (tags/v3.12.3:f6650f9, Apr 9 2024, 14:05:25) [MSC v.1938 64 bit (AMD64)]
executable: d:\BAAI.venv\Scripts\python.exe
machine: Windows-11-10.0.22631-SP0
Python dependencies:
pip: 24.0
sktime: 0.29.0
sklearn: 1.4.2
skbase: 0.7.7
numpy: 1.26.4
scipy: 1.13.0
pandas: 2.2.2
matplotlib: None
joblib: 1.4.0
numba: None
statsmodels: None
pmdarima: None
statsforecast: None
tsfresh: None
tslearn: None
torch: None
tensorflow: None
tensorflow_probability: None