I'm trying to build a pipeline that generates datetime features and then encodes a categorical column. When I passed the data to the fit_transform method of the pipeline, I got the error "TypeError: 'NoneType' object is not subscriptable".
from sktime.transformations.series.date import DateTimeFeatures
from sktime.transformations.series.adapt import TabularToSeriesAdaptor
from sktime.utils._testing.hierarchical import _make_hierarchical
from category_encoders.ordinal import OrdinalEncoder
y_train = _make_hierarchical()
X_train = y_train.drop("c0", axis=1)
X_train["product_family"] = X_train.index.get_level_values(1)
date_features = DateTimeFeatures(manual_selection=["day_of_week", "day_of_month", "day_of_year"], keep_original_columns=True)
encoder = OrdinalEncoder(cols=["product_family"])
adaptor = TabularToSeriesAdaptor(encoder)
pipeline = (date_features * adaptor)
pipeline.fit_transform(X_train)
Error message from running pipeline.fit_transform(X_train)
I decided to run the transformers separately, but I still got the same error message. I was able to fit and transform with
Replies: 3 comments 7 replies
There is an ongoing discussion on categorical features, and this issue is similar: #5867. FYI @yarnabrina. At a high level, we are currently working on categorical support.
Regarding the error message, this is a bug with the error: the message should be informative and explain why the input is non-compliant. Here is the fix: #5947. For now, you can run
FYI @tiloye, I opened an umbrella issue on categorical feature support here: #6109
An mtype is a specification for an input format, e.g., `pd.DataFrame` with `pd.MultiIndex` where the last index is an integer or time index, and no columns are `object` type. See the datatypes tutorial for more info.
Thanks for pointing out that this is missing in the glossary, I will add it.
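To make the format description above concrete, here is a minimal sketch using only `pandas`. The toy data, the column names, and the helper `looks_like_panel_frame` are hypothetical and only approximate the rule as stated in this thread; this is not sktime's actual validation logic.

```python
import pandas as pd

# Hypothetical toy panel: 2 instances x 3 daily time points.
index = pd.MultiIndex.from_product(
    [["store_a", "store_b"], pd.date_range("2024-01-01", periods=3, freq="D")],
    names=["instance", "time"],
)
X = pd.DataFrame({"sales": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]}, index=index)


def looks_like_panel_frame(df):
    """Rough illustrative check (an assumption, not sktime's own checker)."""
    if not isinstance(df.index, pd.MultiIndex):
        return False
    # The last index level should be a time index or an integer index.
    last_level = df.index.levels[-1]
    time_like = isinstance(
        last_level, (pd.DatetimeIndex, pd.PeriodIndex)
    ) or pd.api.types.is_integer_dtype(last_level)
    # No column may have object dtype.
    no_object_columns = not any(
        pd.api.types.is_object_dtype(dtype) for dtype in df.dtypes
    )
    return time_like and no_object_columns


# Adding a column of Python strings stored as object dtype breaks the rule.
X_bad = X.assign(
    product_family=pd.Series(["food"] * 6, index=X.index, dtype=object)
)

ok_numeric = looks_like_panel_frame(X)
ok_with_object = looks_like_panel_frame(X_bad)
```

Under these assumptions, `ok_numeric` is `True` and `ok_with_object` is `False`, which mirrors the situation in the question: the `product_family` column is what makes the input non-compliant.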
From your output, it seems that indeed the problem is that you have `object` dtypes (dtypes are column types in `pandas`), which is not permitted. We are currently working on extending support for categorical types, see here: #5886
There is also a longer design discussion and project towards ensuring categorical types can be dealt with throughout the pipeline, @yarnabrina is also heavily involved. We are looking …
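Until categorical support lands, one possible workaround is to find the `object` columns and convert them to integer codes with plain `pandas` before the data reaches the pipeline. The sketch below is an assumption rather than a recommendation from the sktime maintainers, and the toy frame and column names are hypothetical.

```python
import pandas as pd

# Hypothetical toy frame with one numeric and one object-dtype column.
X = pd.DataFrame(
    {
        "sales": [1.0, 2.0, 3.0, 4.0],
        "product_family": pd.Series(
            ["food", "toys", "food", "books"], dtype=object
        ),
    }
)

# Find the columns whose dtype is object.
object_cols = [
    col for col in X.columns if pd.api.types.is_object_dtype(X[col])
]

# Replace each object column with integer codes, in order of first appearance.
X_numeric = X.copy()
category_maps = {}
for col in object_cols:
    codes, uniques = pd.factorize(X_numeric[col])
    X_numeric[col] = codes
    category_maps[col] = list(uniques)  # keep the mapping to decode later

remaining_object_cols = [
    col for col in X_numeric.columns
    if pd.api.types.is_object_dtype(X_numeric[col])
]
```

Note that integer codes impose an arbitrary ordering on the categories, so this only makes sense when the downstream estimator can tolerate that.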