How to specify dynamic pipelines with sequential pipelines and graphical pipelines (and dunder methods) #5903
-
FYI @fkiraly. I am hoping we can use this thread to create a few different specifications of the same pipeline in sequential, dunder, and graphical form. These can serve both as documentation and as a way to show users the capability and simplification that the dunder and graphical approaches offer. (This is from the point of view of a developer of applications built on top of sktime for non-technical users, not from an sktime contributor's view.)
-
It occurs to me that we already had an issue like that: #4413. It seems I did not act on it, though.
-
That is a complicated example and I haven't tested the following code, so it probably will not run directly. However, I hope it helps to get a feeling for what it might look like.

from sktime.forecasting.arima import ARIMA
from sktime.forecasting.base import BaseForecaster, ForecastingHorizon
from sktime.forecasting.compose import (
    ForecastingPipeline,
    ForecastX,
    TransformedTargetForecaster,
)
from sktime.forecasting.exp_smoothing import ExponentialSmoothing
from sktime.transformations.base import BaseTransformer
from sktime.transformations.compose import FeatureUnion, TransformerPipeline
from sktime.transformations.series.boxcox import BoxCoxTransformer
from sktime.transformations.series.detrend import Detrender
from sktime.transformations.series.difference import Differencer
from sktime.transformations.series.subset import ColumnSelect
from sktime.pipeline import Pipeline


def create_pipeline(
    endogenous_forecaster: BaseForecaster,
    prediction_horizon: ForecastingHorizon,
    future_known_exogenous: list[str] | None = None,
    future_unknown_exogenous: list[str] | None = None,
    future_unknown_forecaster: BaseForecaster | None = None,
    endogenous_transformation: list[BaseTransformer] | None = None,
    future_known_transformation: list[BaseTransformer] | None = None,
    future_unknown_transformation: list[BaseTransformer] | None = None,
) -> Pipeline:
    # Treat every omitted list as empty so the loops below are safe
    future_known_exogenous = future_known_exogenous or []
    future_unknown_exogenous = future_unknown_exogenous or []
    endogenous_transformation = endogenous_transformation or []
    future_known_transformation = future_known_transformation or []
    future_unknown_transformation = future_unknown_transformation or []
    pipe = Pipeline()
    # Add all the endogenous transformations to the pipeline
    # (step names are generated here; the naming scheme is only illustrative)
    trafo_input = "y"
    y_trafo_steps = []
    for i, endogenous_trafo in enumerate(endogenous_transformation):
        endogenous_trafo_name = f"y_trafo_{i}"
        pipe = pipe.add_step(endogenous_trafo, endogenous_trafo_name, {"X": trafo_input})
        trafo_input = endogenous_trafo_name
        y_trafo_steps.append((endogenous_trafo_name, endogenous_trafo))
    # Add all transformations for the known x values to the pipeline
    known_xs = []
    for col in future_known_exogenous:
        ex_trafo_input_known = f"X__{col}"
        for i, future_ex_trafo in enumerate(future_known_transformation):
            future_ex_trafo_name = f"known_{col}_trafo_{i}"
            # Clone so that each column gets its own transformer instance
            pipe = pipe.add_step(future_ex_trafo.clone(), future_ex_trafo_name, {"X": ex_trafo_input_known})
            ex_trafo_input_known = future_ex_trafo_name
        # Falls back to the raw column if no transformation was given
        known_xs.append(ex_trafo_input_known)
    # For all unknown x values add the transformations and the corresponding forecaster
    for col in future_unknown_exogenous:
        ex_trafo_input = f"X__{col}"
        for i, future_ex_trafo in enumerate(future_unknown_transformation):
            future_ex_trafo_name = f"unknown_{col}_trafo_{i}"
            pipe = pipe.add_step(future_ex_trafo.clone(), future_ex_trafo_name, {"X": ex_trafo_input})
            ex_trafo_input = future_ex_trafo_name
        # Add the forecaster for this unknown x value (default: a copy of the endogenous forecaster)
        future_forecaster = (
            future_unknown_forecaster if future_unknown_forecaster is not None else endogenous_forecaster
        )
        future_forecaster_name = f"future_forecaster_{col}"
        pipe = pipe.add_step(future_forecaster.clone(), future_forecaster_name, {"y": ex_trafo_input})
        # Add the forecast to the known x values
        known_xs.append(future_forecaster_name)
    # Add the endogenous forecaster to the pipeline with x values if available
    y_forecast_input = {"y": trafo_input} if len(known_xs) == 0 else {"y": trafo_input, "X": known_xs}
    pipe = pipe.add_step(endogenous_forecaster, "endogenous_forecaster", y_forecast_input)
    # Add the inverse transformations to the pipeline, last transformation first
    trafo_input = "endogenous_forecaster"
    for name, endogenous_trafo in reversed(y_trafo_steps):
        # Only transformers that support inverse_transform can be undone
        if endogenous_trafo.get_tag("capability:inverse_transform"):
            inverse_name = f"{name}_inverse"
            pipe = pipe.add_step(endogenous_trafo, inverse_name, {"X": trafo_input}, method="inverse_transform")
            trafo_input = inverse_name
    return pipe

Admittedly, this code also looks a bit complicated. My first idea, creating a blueprint and filling it in dynamically, seems to be even more complicated.
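To make the wiring easier to inspect without running sktime, the chaining can also be sketched as plain data. The helper below is hypothetical: `plan_steps` and all step names are made up for illustration and are not sktime API. It only computes the `(name, edges, method)` triples one would pass to `add_step`, assuming each transformation feeds the next, each unknown column gets its own forecaster, and the inverse transformations run in reverse order.

```python
from __future__ import annotations


def plan_steps(
    y_trafos: list[str],
    known_x: list[str],
    unknown_x: list[str],
) -> list[tuple[str, dict, str | None]]:
    """Return (step name, edges, method) triples for a graphical pipeline.

    Hypothetical helper: it mirrors the wiring idea only, using labels
    instead of estimator objects, and does not call sktime.
    """
    steps = []

    # Chain the endogenous transformations: "y" -> first trafo -> second ...
    y_input = "y"
    for name in y_trafos:
        steps.append((name, {"X": y_input}, None))
        y_input = name

    # Known exogenous columns are passed through unchanged in this sketch.
    x_inputs = [f"X__{col}" for col in known_x]

    # Each unknown exogenous column gets its own forecaster step.
    for col in unknown_x:
        forecaster_name = f"forecaster_{col}"
        steps.append((forecaster_name, {"y": f"X__{col}"}, None))
        x_inputs.append(forecaster_name)

    # The main forecaster receives X only when exogenous inputs exist.
    edges = {"y": y_input}
    if x_inputs:
        edges["X"] = x_inputs
    steps.append(("forecaster", edges, None))

    # Undo the endogenous transformations in reverse order.
    inv_input = "forecaster"
    for name in reversed(y_trafos):
        steps.append((f"{name}_inv", {"X": inv_input}, "inverse_transform"))
        inv_input = f"{name}_inv"
    return steps


for step in plan_steps(["boxcox", "detrend"], ["holiday"], ["price"]):
    print(step)
```

Printing the plan before building the real pipeline makes it easy to check that every edge points at a step that already exists.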
-
Hi, here's an example of a dynamic pipeline creation function:
This is untested in the sense that I have not tried fit/predict/update with it. It is also incomplete, as error handling is missing, but the objective of the function is to support all of these (plus more) cases:
- `y` with no transformation - simplest case
- `y` with transformation(s)
- `y` with `X` which are known for future
- `y` with known `X` with transformations for `X`
- `y` with `X` which are unknown for future with no specification for separate forecaster
- `y` with unknown `X` with dedicated forecaster
- `y` with unknown `X` with transformations for `X`
- `y` with both known and unknown `X` with all types of transformations and specific forecaster for unknown `X`
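
The cases listed above can be captured in one small specification object, which also gives a natural place for the error handling that is still missing. This is only a minimal sketch: `PipelineSpec`, its field names, and `describe` are hypothetical, and plain strings stand in for the actual transformers and forecasters.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class PipelineSpec:
    """Hypothetical description of one pipeline configuration (labels only)."""

    y_trafos: list[str] = field(default_factory=list)
    known_x: list[str] = field(default_factory=list)
    unknown_x: list[str] = field(default_factory=list)
    known_x_trafos: list[str] = field(default_factory=list)
    unknown_x_trafos: list[str] = field(default_factory=list)
    unknown_x_forecaster: str | None = None

    def __post_init__(self) -> None:
        # Reject combinations that none of the listed cases cover.
        overlap = set(self.known_x) & set(self.unknown_x)
        if overlap:
            raise ValueError(f"columns both known and unknown: {sorted(overlap)}")
        if self.known_x_trafos and not self.known_x:
            raise ValueError("known_x_trafos given without known_x")
        has_unknown_options = self.unknown_x_trafos or self.unknown_x_forecaster
        if has_unknown_options and not self.unknown_x:
            raise ValueError("unknown-X options given without unknown_x")

    def describe(self) -> str:
        """Summarise which of the listed cases this spec corresponds to."""
        if self.y_trafos:
            parts = ["y with transformation(s)"]
        else:
            parts = ["y with no transformation"]
        if self.known_x:
            text = "known X"
            if self.known_x_trafos:
                text += " with transformations"
            parts.append(text)
        if self.unknown_x:
            text = "unknown X"
            if self.unknown_x_trafos:
                text += " with transformations"
            if self.unknown_x_forecaster:
                text += " with dedicated forecaster"
            else:
                text += " with no separate forecaster"
            parts.append(text)
        return ", ".join(parts)


print(PipelineSpec().describe())
print(
    PipelineSpec(
        known_x=["holiday"], unknown_x=["price"], unknown_x_forecaster="arima"
    ).describe()
)
```

Validating the combination up front keeps the pipeline-building code free of scattered `None` checks.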
This can be expanded further in a logical way, e.g. a graphical pipeline should be able to distinguish between "future-unknown" features that need other exogenous features and those that do not.
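
One way to make that distinction explicit is to record, for each future-unknown feature, which other exogenous features its forecaster needs, and then derive a valid construction order from that. Below is a minimal sketch using the standard-library `graphlib` (Python 3.9+). The feature names and the `depends_on` mapping are made up for illustration, and nothing here is sktime API.

```python
from graphlib import TopologicalSorter

# Hypothetical dependencies: future-unknown feature -> exogenous features
# its forecaster needs. An empty set means the feature needs no others.
depends_on = {
    "price": {"holiday"},       # needs a future-known feature
    "demand_index": {"price"},  # needs another forecasted feature
    "temperature": set(),       # forecast from its own history only
}


def forecaster_order(depends_on: dict[str, set[str]]) -> list[str]:
    """Return the future-unknown features in dependency-respecting order."""
    # static_order() yields every node after its predecessors and raises
    # graphlib.CycleError if the dependencies are circular.
    order = TopologicalSorter(depends_on).static_order()
    # Features that appear only as dependencies (here "holiday") are known
    # inputs, not forecaster steps, so they are filtered out.
    return [name for name in order if name in depends_on]


order = forecaster_order(depends_on)
print(order)
```

Adding the forecaster steps in this order guarantees that every step's inputs already exist in the graph when the step is added.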
@benHeid, if you can share how you would convert the above function to a graphical version, with the extra modifications needed to support the above case, that would be very much appreciated. Since not all components exist all the time, I had a hard time figuring out how to convert even the more or less sequential flow, and had to give up on the more generalised dynamic DAG.