Building time-series machine learning models using sktime in Python

# Introduction

If you work with sensor readings, server metrics, or any other data that arrives over time, you probably already know that the standard scikit-learn pipeline is not a good fit. Time series data has structure that tabular models ignore: seasonality, trends, temporal order, and the fact that future values depend on past values.

sktime is a Python library built specifically for this purpose. It provides a scikit-learn-style API (`fit`, `predict`, `transform`) but is designed from the ground up for time series. It lets you perform time series forecasting, classification, regression, and clustering through one consistent interface.

This article works through an example problem: forecasting temperature readings from industrial HVAC sensors. You will learn how sktime represents time series data, how to build preprocessing pipelines, how to fit forecasters, and how to evaluate them.

The code is available on GitHub.

# Prerequisites

You need Python 3.10 or higher and basic knowledge of pandas. Install everything you need with:

pip install sktime pmdarima statsmodels

If you want all optional dependencies at once, `pip install sktime[all_extras]` covers them.

# Advantages of sktime

It helps to understand the problem sktime solves. In scikit-learn, the data is a 2D table: rows are samples and columns are features. Time series data breaks this assumption because each “row” is actually a sequence of values over time, and the order of those values matters.
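To make the point about ordering concrete, here is a minimal sketch using only NumPy and pandas. The signal is synthetic and the variable names are illustrative. It compares the lag-1 autocorrelation of an ordered series with that of the same values shuffled.

```python
import numpy as np
import pandas as pd

# Synthetic hourly signal with a 24-hour cycle (illustrative values only)
t = np.arange(240)
signal = pd.Series(np.sin(2 * np.pi * t / 24))

# Shuffling is harmless for independent tabular rows, but here it destroys the structure
rng = np.random.default_rng(0)
shuffled = pd.Series(rng.permutation(signal.to_numpy()))

ordered_autocorr = signal.autocorr(lag=1)
shuffled_autocorr = shuffled.autocorr(lag=1)

print(f"Lag-1 autocorrelation, original order: {ordered_autocorr:.2f}")
print(f"Lag-1 autocorrelation, shuffled:       {shuffled_autocorr:.2f}")
```

The ordered series is strongly correlated with its own previous value, while the shuffled copy is not. A model that treats rows as interchangeable throws that dependence away.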

The main data containers are:

| Data type | Representation | Description |
|---|---|---|
| Series | `pd.Series` or `pd.DataFrame` | A single time series, used in vanilla forecasting. |
| Panel | `pd.DataFrame` with a 2-level `MultiIndex` | A collection of multiple independent time series. |
| Hierarchical | `pd.DataFrame` with a `MultiIndex` of 3 or more levels | A structured set of time series with levels of aggregation across multiple dimensions. |
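As a rough illustration of the panel layout, the sketch below builds a two-level `MultiIndex` DataFrame with plain pandas. The sensor names, level names, and readings are hypothetical and exist only to show the shape of the container.

```python
import pandas as pd

# Hypothetical example: two sensors, three hourly readings each
timestamps = pd.date_range("2026-01-01", periods=3, freq="h")
index = pd.MultiIndex.from_product(
    [["sensor_a", "sensor_b"], timestamps],
    names=["sensor_id", "timestamp"],
)
panel = pd.DataFrame(
    {"temp_celsius": [20.1, 20.4, 20.9, 18.7, 18.9, 19.3]},
    index=index,
)

print(panel)

# Selecting one instance from the outer level gives back a single series-like frame
single = panel.loc["sensor_a"]
print(type(single.index).__name__)
```

The outer index level identifies the series and the inner level carries time, so each sensor keeps its own ordered history.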

As for the time index itself, sktime supports several pandas index types: `DatetimeIndex`, `PeriodIndex`, `Int64Index`, and `RangeIndex`. The index must be monotonic. If you use a `DatetimeIndex`, its `freq` attribute must be set.
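These index requirements can be checked with pandas before handing data to sktime. The following is a small sketch with made-up values. It shows that removing a timestamp leaves the index without a `freq`, and that `asfreq` is one way to get back onto a regular grid.

```python
import pandas as pd

# A regular hourly index: monotonic, with freq set
idx = pd.date_range("2026-01-01", periods=5, freq="h")
print(idx.is_monotonic_increasing, idx.freq)

# Dropping a timestamp from the middle leaves the index monotonic but without a freq
gappy = pd.Series([1.0, 2.0, 4.0, 5.0], index=idx.delete(2))
print(gappy.index.freq)

# asfreq puts the series back on a regular hourly grid; the gap becomes NaN
regular = gappy.asfreq("h")
print(regular.index.freq, int(regular.isna().sum()))
```

The missing timestamp reappears as a `NaN` row, which can then be handled by an imputation step.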

# Dataset setup

Let’s create a realistic dataset. Imagine an HVAC sensor in a factory recording temperature every hour. The measurements show a daily seasonal pattern (higher during work hours), a slight upward trend due to summer, and some noise.

import numpy as np
import pandas as pd

np.random.seed(42)

# 90 days of hourly readings starting Jan 1, 2026
n_hours = 90 * 24
timestamps = pd.date_range(start="2026-01-01", periods=n_hours, freq="h")

# Trend: gradual 5-degree rise over 90 days
trend = np.linspace(0, 5, n_hours)

# Daily seasonality: sine cycle that peaks at 10:00 and dips at 22:00
hour_of_day = np.arange(n_hours) % 24
daily_cycle = 4 * np.sin(2 * np.pi * (hour_of_day - 4) / 24)

# Noise
noise = np.random.normal(0, 0.8, n_hours)

# Base temperature around 20°C
temperature = 20 + trend + daily_cycle + noise

# Introduce a few missing values (sensor dropout)
dropout_indices = [300, 301, 302, 1440, 1441]
temperature[dropout_indices] = np.nan

y = pd.Series(temperature, index=timestamps, name="temp_celsius")
y.index.freq = pd.tseries.frequencies.to_offset("h")

print(y.head())
print(f"\nShape: {y.shape}")
print(f"Missing values: {y.isna().sum()}")
print(f"Index type: {type(y.index)}")

output:

2026-01-01 00:00:00    16.933270
2026-01-01 01:00:00    17.063277
2026-01-01 02:00:00    18.522783
2026-01-01 03:00:00    20.190095
2026-01-01 04:00:00    19.821941
Freq: h, Name: temp_celsius, dtype: float64

Shape: (2160,)
Missing values: 5
Index type: <class 'pandas.core.indexes.datetimes.DatetimeIndex'>
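A count of missing values does not say whether they are isolated or come in runs, which matters for sensor dropouts. Below is a minimal pandas sketch on a small made-up series (not the article's data). It measures the length of each run of consecutive missing readings and previews what linear interpolation does to them.

```python
import numpy as np
import pandas as pd

# Small illustrative series: a linear ramp with two dropouts
idx = pd.date_range("2026-01-01", periods=10, freq="h")
values = np.arange(10, dtype=float)
values[[3, 4, 8]] = np.nan
s = pd.Series(values, index=idx)

missing = s.isna()

# A new run starts whenever the missing/present flag changes
run_id = (missing != missing.shift()).cumsum()
run_lengths = missing.groupby(run_id).sum()
gap_lengths = run_lengths[run_lengths > 0].tolist()
print(f"Gap lengths (in readings): {gap_lengths}")

# Linear interpolation draws a straight line across each gap
filled = s.interpolate(method="linear")
print(filled.tolist())
```

Because the underlying values here form a straight line, interpolation recovers them exactly. On real data it is only an approximation, and it becomes less trustworthy as gaps get longer.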

# Splitting time series data for training and testing

Splitting time series data is different from splitting tabular data. Rows cannot be shuffled, and the data should always be divided chronologically: train on earlier data and test on later data.

sktime provides `temporal_train_test_split` for this purpose:

from sktime.split import temporal_train_test_split

# Hold out the last 7 days (168 hours) as the test set
y_train, y_test = temporal_train_test_split(y, test_size=168)

print(f"Train: {y_train.index[0]} → {y_train.index[-1]}")
print(f"Test:  {y_test.index[0]} → {y_test.index[-1]}")
print(f"Train size: {len(y_train)}, Test size: {len(y_test)}")

output:

Train: 2026-01-01 00:00:00 → 2026-03-24 23:00:00
Test:  2026-03-25 00:00:00 → 2026-03-31 23:00:00
Train size: 1992, Test size: 168

This function ensures that the splits are clean and chronological, and no data is leaked into the training set from the future.
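For intuition, a chronological hold-out with an integer `test_size` amounts to slicing off the tail of the series. The sketch below reproduces the same boundaries with plain pandas on a stand-in series. The values are arbitrary and only the index matters.

```python
import numpy as np
import pandas as pd

# Stand-in series with the same index as the article's data (values are arbitrary)
idx = pd.date_range("2026-01-01", periods=90 * 24, freq="h")
series = pd.Series(np.arange(len(idx), dtype=float), index=idx)

test_size = 168  # last 7 days of hourly readings
train, test = series.iloc[:-test_size], series.iloc[-test_size:]

# Every training timestamp must come strictly before every test timestamp
no_leakage = train.index.max() < test.index.min()
print(train.index[-1], "->", test.index[0], "| no leakage:", no_leakage)
```

A check like `no_leakage` is cheap to keep in a data preparation script, especially when splits are produced by custom code rather than by a library helper.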

# Defining the forecasting horizon

Before fitting the model, we need to tell sktime which time steps we want to predict. This is the role of `ForecastingHorizon`.

from sktime.forecasting.base import ForecastingHorizon

# Predict 168 steps ahead (7 days of hourly data)
# is_relative=False means we're using absolute timestamps
fh = ForecastingHorizon(y_test.index, is_relative=False)

print(f"Horizon length: {len(fh)}")
print(f"First forecast point: {fh[0]}")
print(f"Last forecast point:  {fh[-1]}")

This results in:

Horizon length: 168
First forecast point: 2026-03-25 00:00:00
Last forecast point:  2026-03-31 23:00:00

You can also use a relative horizon such as `fh = [1, 2, 3, ..., 168]`, which means “one step ahead, two steps ahead, …”. If you have actual timestamps that you want to predict, an absolute horizon is clearer.
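The relationship between the two kinds of horizon is simple arithmetic on the last training timestamp (the cutoff). The pandas-only sketch below illustrates the idea rather than sktime's internal conversion. It maps the relative steps 1 to 168 onto the test timestamps from the split above.

```python
import numpy as np
import pandas as pd

# Last training timestamp from the split above
cutoff = pd.Timestamp("2026-03-24 23:00")

# Relative horizon: 1, 2, ..., 168 steps after the cutoff
relative_steps = np.arange(1, 169)

# Equivalent absolute horizon for hourly data: cutoff + k hours
absolute_index = cutoff + pd.to_timedelta(relative_steps, unit="h")

print(absolute_index[0], "->", absolute_index[-1])
```

A relative horizon stays valid if the model is refit on a longer history, whereas an absolute horizon pins the forecast to fixed timestamps.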

# Building a preprocessing and forecasting pipeline

Real-world sensor data contains missing values, seasonal patterns, and trends, all of which must be handled before or during forecasting. sktime's `TransformedTargetForecaster` lets you chain transformers with a forecaster into a single estimator. The transformations are applied to the target series `y` before fitting and are automatically inverted after prediction.

from sktime.forecasting.exp_smoothing import ExponentialSmoothing
from sktime.forecasting.compose import TransformedTargetForecaster
from sktime.transformations.series.impute import Imputer
from sktime.transformations.series.detrend import Deseasonalizer, Detrender

pipeline = TransformedTargetForecaster(
    steps=[
        # Step 1: Fill missing sensor readings using linear interpolation
        ("imputer", Imputer(method="linear")),
        # Step 2: Remove the linear trend so the forecaster sees a stationary series
        ("detrender", Detrender()),
        # Step 3: Remove the daily seasonality (sp=24 for hourly data with 24-hour cycles)
        ("deseasonalizer", Deseasonalizer(model="additive", sp=24)),
        # Step 4: Forecast the cleaned, stationary residuals
        ("forecaster", ExponentialSmoothing(trend=None, seasonal=None)),
    ]
)

pipeline.fit(y_train, fh=fh)
y_pred = pipeline.predict()

print(y_pred.head())

output:

2026-03-25 00:00:00    21.210066
2026-03-25 01:00:00    21.788986
2026-03-25 02:00:00    22.615184
2026-03-25 03:00:00    23.688449
2026-03-25 04:00:00    24.621127
Freq: h, Name: temp_celsius, dtype: float64

Here is what each step does:

`Imputer(method="linear")` fills missing values by linear interpolation between the surrounding readings, which suits sensor data.
`Detrender()` fits a linear trend to the training series and subtracts it. The trend is added back at prediction time.
`Deseasonalizer(sp=24)` removes the 24-hour cycle from the detrended series. `sp` is the seasonal period.
Finally, `ExponentialSmoothing` forecasts the detrended, deseasonalized residuals.
When `predict()` is called, all inverse transformations are applied automatically in reverse order, and the forecast is returned on the original temperature scale.
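The transform-then-invert idea can be sketched by hand with NumPy. This is a simplified stand-in under stated assumptions, not sktime's implementation: the trend is a least-squares line, the seasonal component is the per-hour mean of the detrended values, and the "forecaster" is a placeholder that predicts the mean residual. The signal is noise-free and synthetic.

```python
import numpy as np

# Noise-free stand-in for the sensor signal: 10 days of hourly data plus 1 held-out day
n_train, period = 240, 24
t = np.arange(n_train + period)
y_all = 20 + 0.01 * t + 4 * np.sin(2 * np.pi * (t % period - 4) / period)
y_train, y_future = y_all[:n_train], y_all[n_train:]
t_train, t_future = t[:n_train], t[n_train:]

# Forward transform 1: fit and subtract a linear trend
slope, intercept = np.polyfit(t_train, y_train, deg=1)
detrended = y_train - (slope * t_train + intercept)

# Forward transform 2: subtract the average value for each hour of the cycle
seasonal = np.array([detrended[h::period].mean() for h in range(period)])
residual = detrended - seasonal[t_train % period]

# Placeholder forecaster: predict the mean residual for every future step
residual_forecast = np.full(period, residual.mean())

# Inverse transforms in reverse order: add seasonality back, then the trend
y_forecast = residual_forecast + seasonal[t_future % period] + (slope * t_future + intercept)

# Applying the inverses to the training residuals recovers the original values
reconstructed = residual + seasonal[t_train % period] + (slope * t_train + intercept)
max_forecast_error = np.abs(y_forecast - y_future).max()
print(f"Round trip exact: {np.allclose(reconstructed, y_train)}")
print(f"Max forecast error on noise-free data: {max_forecast_error:.3f}")
```

The forecaster only ever sees `residual`, yet the final forecast is on the original scale because each inverse step restores what its forward step removed.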

# Evaluation of predictions

sktime ships with the standard forecasting metrics. Mean absolute error (MAE) and mean absolute percentage error (MAPE) are common choices for forecasting.

from sktime.performance_metrics.forecasting import (
    mean_absolute_error,
    mean_absolute_percentage_error,
)

mae = mean_absolute_error(y_test, y_pred)
mape = mean_absolute_percentage_error(y_test, y_pred)

print(f"MAE:  {mae:.3f} °C")
print(f"MAPE: {mape*100:.2f}%")

output:

MAE:  0.584 °C
MAPE: 2.40%
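Both metrics are easy to compute by hand, which helps when interpreting them. The sketch below uses three hypothetical readings and forecasts (not the article's data) and the plain, non-symmetric definition of MAPE.

```python
import numpy as np

# Hypothetical readings and forecasts in °C
y_true = np.array([20.0, 22.0, 25.0])
y_hat = np.array([21.0, 21.0, 24.0])

abs_err = np.abs(y_true - y_hat)
mae = abs_err.mean()                      # average miss, in °C
mape = (abs_err / np.abs(y_true)).mean()  # average miss relative to the true value

print(f"MAE:  {mae:.3f} °C")
print(f"MAPE: {mape * 100:.2f}%")
```

MAE stays in the units of the data, so it is usually the easier number to reason about for temperatures. MAPE divides by the true value, so it can blow up when readings are close to zero.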

# Swapping in a different forecaster

One of the biggest advantages of sktime's interface is that replacing the underlying algorithm takes only a one-line change. Let's swap exponential smoothing for an ARIMA model and compare.

from sktime.forecasting.arima import ARIMA

pipeline_arima = TransformedTargetForecaster(
    steps=[
        ("imputer", Imputer(method="linear")),
        ("detrender", Detrender()),
        ("deseasonalizer", Deseasonalizer(model="additive", sp=24)),
        # ARIMA(1,1,1) on the cleaned residuals
        ("forecaster", ARIMA(order=(1, 1, 1), suppress_warnings=True)),
    ]
)

pipeline_arima.fit(y_train, fh=fh)
y_pred_arima = pipeline_arima.predict()

mae_arima = mean_absolute_error(y_test, y_pred_arima)
mape_arima = mean_absolute_percentage_error(y_test, y_pred_arima)

print(f"ARIMA MAE:  {mae_arima:.3f} °C")
print(f"ARIMA MAPE: {mape_arima*100:.2f}%")

output:

ARIMA MAE:  0.586 °C
ARIMA MAPE: 2.41%

The important point is that the preprocessing steps (imputation, detrending, deseasonalization) stay the same. Only the final forecaster changed, and everything else slots in around it.

# Cross-validation across time

Evaluating on a single test period can be misleading. sktime provides time series cross-validation through splitters that respect temporal order.

`SlidingWindowSplitter` uses a rolling window: the training window slides forward in time and always keeps the same length. With `ExpandingWindowSplitter`, the training set grows cumulatively as you move forward, which suits cases where you want to use all available history.

from sktime.split import ExpandingWindowSplitter
from sktime.forecasting.model_evaluation import evaluate

# Expanding window: start with 1800-hour train set, evaluate on 168-hour windows
cv = ExpandingWindowSplitter(
    initial_window=1800,
    fh=list(range(1, 169)),
    step_length=168,
)

results = evaluate(
    forecaster=pipeline,
    y=y,
    cv=cv,
    scoring=mean_absolute_error,
    return_data=False,
)

print(results[["test__DynamicForecastingErrorMetric", "fit_time"]].round(3))
print(f"\nMean CV MAE: {results['test__DynamicForecastingErrorMetric'].mean():.3f} °C")

output:

   test__DynamicForecastingErrorMetric  fit_time
0                                0.627     0.274
1                                0.585     0.100

Mean CV MAE: 0.606 °C

`evaluate` returns a DataFrame containing per-fold metrics and timings. A cross-validated MAE that stays close across folds suggests that the model generalizes consistently across different time windows in the data.
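To see why the run above yields two folds, and how the two window strategies differ, it can help to generate the fold boundaries by hand. The function below is a simplified stand-in written for illustration. It is not sktime's splitter implementation, and its name and parameters are assumptions.

```python
def window_splits(n_samples, initial_window, horizon, step_length, expanding=True):
    """Yield (train_positions, test_positions) as ranges of integer positions.

    Simplified stand-in for illustration, not sktime's splitter implementation.
    """
    train_end = initial_window
    while train_end + horizon <= n_samples:
        train_start = 0 if expanding else train_end - initial_window
        yield range(train_start, train_end), range(train_end, train_end + horizon)
        train_end += step_length


# Same settings as the evaluate() call above: 2160 hourly readings in total
expanding_folds = list(window_splits(2160, 1800, 168, 168, expanding=True))
sliding_folds = list(window_splits(2160, 1800, 168, 168, expanding=False))

for tr, te in expanding_folds:
    print(f"expanding: train [{tr.start}, {tr.stop})  test [{te.start}, {te.stop})")
for tr, te in sliding_folds:
    print(f"sliding:   train [{tr.start}, {tr.stop})  test [{te.start}, {te.stop})")
```

With 2160 readings, an initial window of 1800, and 168-step folds, a third fold would need data beyond the end of the series, so only two fit. The expanding variant trains on 1800 and then 1968 points, while the sliding variant keeps 1800 points in both folds.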

# Next steps

Although this article covered sktime’s core forecasting workflow, the library goes far beyond basic forecasting tasks.

It also supports time series classification, probabilistic forecasting with uncertainty estimation, training shared models across multiple related time series, adapting traditional machine learning algorithms for sequential forecasting, and automating model selection and tuning workflows.

One of sktime’s greatest strengths is its consistent API and its integration with the broader Python machine learning ecosystem, which makes experimentation easy for both beginners and experienced practitioners. The sktime documentation and example notebooks are particularly well written and worth bookmarking if you regularly work on forecasting or other time series problems.

Bala Priya C is a developer and technical writer from India. She likes working at the intersection of math, programming, data science, and content creation. Her interests and areas of expertise include DevOps, data science, and natural language processing. She loves reading, writing, coding, and coffee. Currently, she is dedicated to learning and sharing her knowledge with the developer community by creating tutorials, how-to guides, opinion articles, and more. Bala also creates engaging resource summaries and coding tutorials.
