[BUG] conversion to nested_univ is unexpected #6411

Closed
weenerplasticsgroup opened this issue May 11, 2024 · 5 comments
Labels
bug Something isn't working module:datatypes datatypes module: data containers, checkers & converters

Comments

weenerplasticsgroup commented May 11, 2024

Describe the bug
Conversion to nested_univ gives an unexpected result.

To Reproduce
I start with this table

from sktime.datatypes import get_examples
X = get_examples(mtype="pd-multiindex", as_scitype="Panel")[0]
X_trafo = (
    X.reset_index(level=0, drop=True)
    .rename(columns=dict(map(reversed, enumerate(X.columns))))
    .stack().to_frame().groupby(level=[0, 1]).first()
)
X_trafo
              0
timepoints
0          0  1
           1  4
1          0  2
           1  5
2          0  3
           1  6

Now I convert to nested_univ
from sktime.datatypes import convert_to
X_new = convert_to(X_trafo, "nested_univ")
X_new
                                        0
timepoints
0           0 1 1 4 Name: 0, dtype: int64
1           0 2 1 5 Name: 0, dtype: int64
2           0 3 1 6 Name: 0, dtype: int64

Expected behavior
This is unexpected: each cell appears to hold four values where I expect two.

I expect
Time 0: series 1, 4
Time 1: series 2, 5
etc.

Additional context

This example is very similar but does not give the unexpected result

from sktime.datatypes import convert_to
from sktime.datasets import load_arrow_head
X, _ = load_arrow_head(return_X_y=True)
X_new = convert_to(X, to_type="pd-multiindex").swaplevel()
convert_to(X_new.sort_index(), to_type="nested_univ")

Versions
sktime 2.9

@weenerplasticsgroup weenerplasticsgroup added the bug Something isn't working label May 11, 2024
@fkiraly fkiraly added the module:datatypes datatypes module: data containers, checkers & converters label May 11, 2024
Collaborator

fkiraly commented May 11, 2024

This is just how it prints. Each cell holds a pd.Series, and that series has only two values.

The printout shows the index values of that series, e.g., [0, 1] in the top left cell, next to the series values, e.g., [1, 4] in the top left cell.

PS: we no longer recommend using the nested_univ specification, because pandas 2 no longer supports the commands that would make manipulating the container easy, and it is not part of a supported pandas API.
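
To make the printing point concrete, here is a minimal sketch that uses plain pandas only. It assumes a hand-built frame (the name `nested` is hypothetical) that mimics the layout shown in the issue, with one pd.Series per cell; it does not call sktime's converter.

```python
import pandas as pd

# Hand-built stand-in for the nested layout: one pd.Series per cell.
# This only mimics the shape from the issue; it is not sktime's converter output.
cells = [
    pd.Series([1, 4], name=0),
    pd.Series([2, 5], name=0),
    pd.Series([3, 6], name=0),
]
nested = pd.DataFrame({0: pd.Series(cells)})

first = nested.iloc[0, 0]            # the pd.Series stored in the top left cell
n_values = len(first)                # number of values actually stored
index_labels = first.index.tolist()  # index of the inner series
values = first.tolist()              # values of the inner series

# Flattening the string form of the cell interleaves index labels and values,
# which is why the cell can look like it holds four numbers.
flat_text = " ".join(str(first).split())

print(n_values, index_labels, values)
print(flat_text)
```

Here the flattened text begins with the index label 0, then the value 1, then the index label 1, then the value 4, while `len(first)` still reports two stored values.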

hstarmans commented May 11, 2024

The format is different and looks strange. I tested it with X.iloc[0].values and ended up with four values, but I will check again tomorrow.

Collaborator

fkiraly commented May 12, 2024

?? ok, let me run it. I was pretty sure it is just printing. Give me a sec.

Collaborator

fkiraly commented May 12, 2024

Yes, it's just printing, e.g.,

X_new.iloc[0, 0]

gives

0    1
1    4
Name: 0, dtype: int64

and

X_new.iloc[0, 0].shape

is

(2,)

In any case, nested_univ is dodgy, and not just in printing. I recommend not using it.
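
For anyone who already has data in a nested layout and wants to move away from it, the following is a rough sketch of flattening it into a long frame with a two-level index, using plain pandas. The frame `nested` and the level names "instances" and "timepoints" are assumptions for illustration; this is not sktime's own conversion routine.

```python
import pandas as pd

# Hypothetical nested frame: one pd.Series per cell, as in the issue.
nested = pd.DataFrame(
    {0: pd.Series([pd.Series([1, 4]), pd.Series([2, 5]), pd.Series([3, 6])])}
)

# Build one ordinary DataFrame per row (instance), with one column per variable.
frames = {
    inst: pd.DataFrame({col: nested.at[inst, col] for col in nested.columns})
    for inst in nested.index
}

# Stack the per-instance frames; the dict keys become the outer index level.
long = pd.concat(frames, names=["instances", "timepoints"])

print(long)
```

The result has one row per (instance, timepoint) pair and plain numeric cells, so ordinary pandas indexing and printing apply.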

@weenerplasticsgroup
Author

ok. there is no issue
