5 Time Series Foundation Models You’re Missing Out On

Image by author | Diagram from Chronos-2: From Univariate to Universal Forecasting

# Introduction

Foundation models did not start with ChatGPT. Long before large language models became popular, pretrained models were already driving advances in computer vision and natural language processing, in tasks such as image segmentation, classification, and text understanding.

The same approach is now reshaping time series forecasting. Rather than building and tuning a separate model for each dataset, a time series foundation model is pretrained on a large and diverse collection of time series data. Such models can deliver strong zero-shot forecasting performance across domains, frequencies, and horizons, often matching deep learning models that require hours of training, while using only historical data as input.

If you’re still relying primarily on classical statistical methods or single-dataset deep learning models, you may be missing a major shift in how forecasting systems are built.

In this tutorial, we review five time series foundation models selected based on performance, popularity as measured by Hugging Face downloads, and ease of use in the real world.

# 1. Chronos-2

Chronos-2 is a 120-million-parameter, encoder-only time series foundation model built for zero-shot forecasting. It supports univariate, multivariate, and covariate-informed forecasting in a single architecture, providing accurate multi-step probabilistic forecasts without task-specific training.

Main features:

  • Encoder-only architecture inspired by T5
  • Zero-shot forecasting with quantile outputs
  • Native support for past and known future covariates
  • Long context lengths up to 8,192 and forecast horizons up to 1,024
  • Efficient, high-throughput inference on CPU and GPU

Use cases:

  • Large-scale forecasting across many related time series
  • Forecasts based on covariates such as demand, energy, and prices
  • Rapid prototyping and production deployment without model training

Best suited for:

  • Production forecasting systems
  • Research and benchmarking
  • Complex multivariate forecasting with covariates
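
To make the covariate terminology in this section concrete, here is a minimal, library-free sketch of how a series and its covariates might be split into a context window and a set of known-future values. The helper `build_forecast_inputs` and its dict layout are hypothetical and are not the model's actual API; only the 8,192 and 1,024 limits come from the feature list above.

```python
from typing import Dict, List, Tuple

MAX_CONTEXT = 8192   # context limit quoted in the feature list above
MAX_HORIZON = 1024   # horizon limit quoted in the feature list above


def build_forecast_inputs(
    target: List[float],
    covariates: Dict[str, List[float]],
    horizon: int,
) -> Tuple[Dict[str, List[float]], Dict[str, List[float]]]:
    """Split series into a context window and known-future covariates.

    Hypothetical helper: a covariate with len(target) values is treated as
    past-only; one with len(target) + horizon values is also known in the future.
    """
    if not 1 <= horizon <= MAX_HORIZON:
        raise ValueError(f"horizon must be in [1, {MAX_HORIZON}]")
    n = len(target)
    start = max(0, n - MAX_CONTEXT)          # keep only the most recent context
    context = {"target": target[start:]}
    future: Dict[str, List[float]] = {}
    for name, values in covariates.items():
        if len(values) == n:                 # observed only up to "now"
            context[name] = values[start:]
        elif len(values) == n + horizon:     # also known over the horizon
            context[name] = values[start:n]
            future[name] = values[n:]
        else:
            raise ValueError(f"covariate {name!r} has unexpected length")
    return context, future


# Toy data: 5 days of demand, temperature (past only), promo flag (planned ahead)
demand = [10.0, 12.0, 11.0, 13.0, 12.5]
context, future = build_forecast_inputs(
    demand,
    {"temperature": [20.0, 21.0, 19.0, 22.0, 21.0],
     "promo": [0, 0, 1, 0, 0, 1, 1]},
    horizon=2,
)
print(sorted(context))   # ['promo', 'target', 'temperature']
print(future)            # {'promo': [1, 1]}
```

The distinction matters in practice: a past-only covariate such as measured temperature ends where the target ends, while a known-future covariate such as a planned promotion extends across the forecast horizon.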

# 2. TiRex

TiRex is a 35-million-parameter pretrained time series forecasting model based on xLSTM, designed for zero-shot forecasting over both long and short horizons. It can generate accurate forecasts without any training on task-specific data and provides both point and probabilistic forecasts out of the box.

Main features:

  • Pre-trained xLSTM-based architecture
  • Zero-shot prediction without the need for dataset-specific training
  • Point prediction and quantile-based uncertainty estimation
  • Excellent performance in both long-term and short-term benchmarks
  • Optional CUDA acceleration for high-performance GPU inference

Use cases:

  • Zero-shot forecasts for new or unseen time series datasets
  • Long-term and short-term forecasts in finance, energy, and operations
  • Rapid benchmarking and deployment without requiring model training
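
As a rough illustration of how quantile-based output can be consumed, the sketch below takes the median as the point forecast, forms an 80% interval from the 0.1 and 0.9 quantiles, and scores each quantile with the pinball loss. The forecast values are made up for the example and are not real model output; the dict layout is an assumption, not the model's return type.

```python
from typing import Dict, List


def pinball_loss(y_true: List[float], y_pred: List[float], q: float) -> float:
    """Average quantile (pinball) loss for quantile level q in (0, 1)."""
    total = 0.0
    for y, f in zip(y_true, y_pred):
        diff = y - f
        # Under-prediction is weighted by q, over-prediction by (1 - q)
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)


# Made-up quantile forecasts for a 3-step horizon (not real model output)
quantile_forecast: Dict[float, List[float]] = {
    0.1: [8.0, 8.5, 9.0],
    0.5: [10.0, 11.0, 12.0],
    0.9: [12.0, 13.5, 15.0],
}
actuals = [11.0, 10.0, 16.0]

point_forecast = quantile_forecast[0.5]  # median used as the point forecast
interval = list(zip(quantile_forecast[0.1], quantile_forecast[0.9]))  # 80% band
coverage = sum(lo <= y <= hi for y, (lo, hi) in zip(actuals, interval)) / len(actuals)

for q, preds in quantile_forecast.items():
    print(f"q={q}: pinball loss = {pinball_loss(actuals, preds, q):.4f}")
print(f"empirical coverage of the 80% interval: {coverage:.2f}")
```

On a real benchmark, coverage close to the nominal 80% and a low pinball loss across quantile levels would suggest the uncertainty estimates are well calibrated.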

# 3. TimesFM

TimesFM is a pretrained time series foundation model developed by Google Research for zero-shot forecasting. The open checkpoint timesfm-2.0-500m is a decoder-only model designed for univariate forecasting, supporting long historical contexts and flexible forecast horizons without task-specific training.

Main features:

  • Decoder-only foundation model with a 500M-parameter checkpoint
  • Zero-shot univariate time series forecasting
  • Context length up to 2,048 time points, with support beyond the training limit
  • Flexible forecast horizon with an optional frequency indicator
  • Optimized for fast point prediction at scale

Use cases:

  • Univariate predictions at scale across diverse datasets
  • Long-term forecasting of operational and infrastructure data
  • Rapid experimentation and benchmarking without model training
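
The context limit and the frequency indicator can be illustrated with a small preprocessing sketch. Both helpers below are hypothetical. The 0/1/2 mapping follows the convention described in the TimesFM README as I understand it (0 for data up to daily granularity, 1 for weekly and monthly, 2 for quarterly and yearly), so treat it as an assumption and verify it against the version you install.

```python
from typing import List

MAX_CONTEXT = 2048  # context length quoted in the feature list above

# Assumed mapping from pandas-style frequency aliases to a categorical indicator:
# 0 = high frequency (up to daily), 1 = medium (weekly, monthly),
# 2 = low (quarterly, yearly).
_FREQ_TO_INDICATOR = {
    "T": 0, "min": 0, "H": 0, "h": 0, "D": 0, "B": 0,
    "W": 1, "M": 1, "MS": 1, "ME": 1,
    "Q": 2, "QS": 2, "QE": 2, "Y": 2, "A": 2, "YS": 2, "YE": 2,
}


def freq_indicator(freq: str) -> int:
    """Map a frequency alias to a 0/1/2 indicator (hypothetical helper)."""
    try:
        return _FREQ_TO_INDICATOR[freq]
    except KeyError:
        raise ValueError(f"unsupported frequency alias: {freq!r}") from None


def prepare_context(series: List[float], max_context: int = MAX_CONTEXT) -> List[float]:
    """Keep only the most recent max_context observations."""
    return series[-max_context:]


# Toy hourly series that is longer than the context window
hourly = [float(i % 24) for i in range(3000)]
context = prepare_context(hourly)
print(len(context), freq_indicator("h"))  # 2048 0
```

Truncating from the left keeps the most recent observations, which is usually what a forecaster should see when the history exceeds the context window.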

# 4. IBM Granite TTM R2

Granite-TimeSeries-TTM-R2 is a family of compact, pretrained time series foundation models developed by IBM Research and based on the TinyTimeMixers (TTM) framework. Designed for multivariate forecasting, these models deliver strong zero-shot and few-shot performance despite model sizes starting at just 1 million parameters, making them suitable for both research and resource-constrained environments.

Main features:

  • Small pretrained models starting at 1M parameters
  • Strong zero-shot and few-shot multivariate forecasting performance
  • Focused models tailored to specific context and forecast lengths
  • Fast inference and fine-tuning on a single GPU or CPU
  • Support for exogenous variables and static categorical features

Use cases:

  • Multivariate prediction in low-resource or edge environments
  • Zero-shot baseline with optional lightweight fine-tuning
  • Rapid deployment for operational forecasting with limited data
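
Because each model in the family targets a particular context and forecast length, choosing one is essentially a lookup. The sketch below shows one possible selection rule. The `(context_length, forecast_length)` pairs in `VARIANTS` are illustrative assumptions, not an authoritative list of released checkpoints, and `pick_variant` is a hypothetical helper; consult the model card for what is actually available.

```python
from typing import List, Optional, Tuple

# Illustrative (context_length, forecast_length) pairs; assumed for this
# sketch, not an authoritative list of released checkpoints.
VARIANTS: List[Tuple[int, int]] = [(512, 96), (1024, 96), (1536, 96)]


def pick_variant(
    history_len: int,
    horizon: int,
    variants: List[Tuple[int, int]] = VARIANTS,
) -> Optional[Tuple[int, int]]:
    """Pick the variant with the longest context that the history can fill
    and whose forecast length covers the requested horizon."""
    feasible = [(c, f) for c, f in variants if c <= history_len and f >= horizon]
    if not feasible:
        return None
    # Prefer a longer context, then the tightest forecast length
    return max(feasible, key=lambda cf: (cf[0], -cf[1]))


print(pick_variant(history_len=1200, horizon=24))  # (1024, 96)
print(pick_variant(history_len=300, horizon=24))   # None
```

When no variant fits, as in the second call, the usual options are to collect more history or to fall back to a different model.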

# 5. Toto Open Base 1

Toto-Open-Base-1.0 is a decoder-only time series foundation model designed for multivariate forecasting in observability and monitoring settings. Optimized for high-dimensional, sparse, and non-stationary data, it provides strong zero-shot performance on large-scale benchmarks such as GIFT-Eval and BOOM.

Main features:

  • Decoder-only transformer for flexible context and prediction lengths
  • Zero-shot forecasting without fine-tuning
  • Efficient processing of high-dimensional multivariate data
  • Probabilistic forecasts using a Student-T mixture model
  • Pre-trained on over 2 trillion time series data points

Use cases:

  • Predicting observability and monitoring metrics
  • High-dimensional systems and infrastructure telemetry
  • Zero-shot prediction of large non-stationary time series
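
To give a feel for what a Student-T mixture output means, the following standard-library sketch evaluates the density and mean of a two-component mixture for a single forecast step. The component parameters are invented for illustration and are not real model output, and the tuple layout is an assumption rather than the model's actual output format.

```python
import math
from typing import List, Tuple

# Each component: (weight, location, scale, degrees_of_freedom)
Component = Tuple[float, float, float, float]


def student_t_pdf(x: float, loc: float, scale: float, df: float) -> float:
    """Density of a location-scale Student-t distribution."""
    z = (x - loc) / scale
    log_norm = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
        - math.log(scale)
    )
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(z * z / df))


def mixture_pdf(x: float, components: List[Component]) -> float:
    """Weighted sum of Student-t densities; weights are expected to sum to 1."""
    return sum(w * student_t_pdf(x, loc, s, df) for w, loc, s, df in components)


def mixture_mean(components: List[Component]) -> float:
    """Mean of the mixture (defined when every component has df > 1)."""
    return sum(w * loc for w, loc, _, _ in components)


# Made-up parameters for one forecast step (not real model output):
# a "normal load" mode plus a wider, heavier-tailed "spike" mode.
step = [(0.8, 50.0, 2.0, 5.0), (0.2, 80.0, 6.0, 3.0)]
print("mean forecast:", mixture_mean(step))
print("density at 50:", mixture_pdf(50.0, step))
print("density at 80:", mixture_pdf(80.0, step))
```

A mixture like this can express both a typical operating level and an occasional spike, with heavy tails that tolerate outliers, which is a natural fit for bursty monitoring metrics.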

# Summary

The table below compares the core characteristics of the time series foundation models discussed, focusing on model size, architecture, and forecasting capabilities.

| Model | Parameters | Architecture | Forecast type | Key strengths |
|---|---|---|---|---|
| Chronos-2 | 120M | Encoder-only | Univariate, multivariate, probabilistic | Strong zero-shot accuracy, long context and horizon, high inference throughput |
| TiRex | 35M | xLSTM-based | Univariate, probabilistic | Lightweight model with strong short- and long-term performance |
| TimesFM | 500M | Decoder-only | Univariate, point forecasts | Handles long contexts and flexible horizons at scale |
| Granite TimeSeries TTM-R2 | 1M and up (small) | Focused pretrained models | Multivariate, point forecasts | Extremely compact, fast inference, strong zero-shot and few-shot results |
| Toto Open Base 1 | 151M | Decoder-only | Multivariate, probabilistic | Optimized for high-dimensional, non-stationary observability data |

Abid Ali Awan (@1abidaliawan) is a certified data scientist professional who loves building machine learning models. Currently, he is focusing on content creation and writing technical blogs about machine learning and data science technology. Abid holds a master’s degree in technology management and a bachelor’s degree in communications engineering. His vision is to build AI products using graph neural networks for students suffering from mental illness.


