AutoML Forecasting Function
Automated Machine Learning
Forecasting away from training data
Contents
Introduction
This notebook demonstrates the full interface of the forecast() function.
The best-known and most frequent usage of forecast enables forecasting on test sets that immediately follow the training data.
However, in many use cases it is necessary to continue using the model for some time before retraining it. This happens especially in high-frequency forecasting, when forecasts need to be made more often than the model can be retrained. Examples include Internet of Things applications and predictive cloud resource scaling.
Here we show how to use the forecast() function when a time gap exists between training data and prediction period.
Terminology:
- forecast origin: the last period when the target value is known
- forecast period(s): the period(s) for which the value of the target is desired
- lookback: how many past periods (before the forecast origin) the model depends on; the larger of the number of lags and the length of the rolling window
- prediction context: the lookback periods immediately preceding the forecast origin
Setup
Please make sure you have followed the configuration notebook so that your ML workspace information is saved in the config file.
This notebook is compatible with Azure ML SDK version 1.35.0 or later.
Data
For demonstration purposes, we will generate the data artificially and use it for forecasting.
Let's see what the training data looks like.
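The artificial data can be generated along these lines; the column names, series count, and signal shape are illustrative assumptions, not the notebook's exact values:

```python
import numpy as np
import pandas as pd

# Assumed column names for this walkthrough.
TIME_COLUMN_NAME = "date"
TIME_SERIES_ID_COLUMN_NAME = "time_series_id"
TARGET_COLUMN_NAME = "y"

def make_artificial_series(n_periods=400, n_series=2, freq="H", seed=42):
    """Generate a small panel of hourly series with trend, daily
    seasonality, and noise, one block per time-series id."""
    rng = np.random.default_rng(seed)
    frames = []
    for i in range(n_series):
        index = pd.date_range("2000-01-01", periods=n_periods, freq=freq)
        trend = 0.02 * np.arange(n_periods)
        seasonality = 3.0 * np.sin(2 * np.pi * np.arange(n_periods) / 24)
        noise = rng.normal(scale=0.5, size=n_periods)
        frames.append(pd.DataFrame({
            TIME_COLUMN_NAME: index,
            TIME_SERIES_ID_COLUMN_NAME: f"ts{i}",
            TARGET_COLUMN_NAME: trend + seasonality + noise,
        }))
    return pd.concat(frames, ignore_index=True)

df = make_artificial_series()
print(df.head())
```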
Prepare remote compute and data.
The Machine Learning service workspace is paired with a storage account, which contains the default datastore. We will use it to upload the artificial data and create a tabular dataset for training. A tabular dataset defines a series of lazily evaluated, immutable operations to load data from the data source into a tabular representation.
You will need to create a compute target for your AutoML run. In this tutorial, you create AmlCompute as your training compute resource.
Note that if you have an AzureML Data Scientist role, you will not have permission to create compute resources. Talk to your workspace or IT admin to create the compute targets described in this section, if they do not already exist.
Create the configuration and train a forecaster
First generate the configuration, in which we:
- Set metadata columns: target, time column and time-series id column names.
- Validate our data using cross-validation with the rolling window method.
- Set normalized root mean squared error as a metric to select the best model.
- Set early termination to True, so the iterations through the models stop when no further improvements in the accuracy score are made.
- Limit the length of the experiment run to 15 minutes.
- Finally, we set the task to be forecasting.
- We apply the lag lead operator to the target value, i.e. we use its previous values as predictors for the future ones.
- [Optional] Forecast frequency parameter (freq) represents the period with which the forecast is desired, for example, daily, weekly, yearly, etc. Use this parameter for the correction of time series containing irregular data points or for padding of short time series. The frequency needs to be a pandas offset alias. Please refer to pandas documentation for more information.
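The configuration described above can be sketched as follows. This requires the azureml-train-automl SDK and assumes a train_dataset and compute_target created earlier; the horizon, lag, and column-name values are illustrative assumptions:

```python
# Configuration sketch only: not runnable outside an Azure ML workspace.
from azureml.train.automl import AutoMLConfig
from azureml.automl.core.forecasting_parameters import ForecastingParameters

forecasting_parameters = ForecastingParameters(
    time_column_name="date",                    # assumed time column
    forecast_horizon=6,                         # assumed horizon
    time_series_id_column_names=["time_series_id"],
    target_lags=6,                              # lag-lead operator on the target
    freq="H",                                   # optional: pandas offset alias
)

automl_config = AutoMLConfig(
    task="forecasting",
    primary_metric="normalized_root_mean_squared_error",
    experiment_timeout_hours=0.25,              # 15-minute limit
    enable_early_stopping=True,
    n_cross_validations=3,                      # rolling-window cross validation
    training_data=train_dataset,                # tabular dataset created above
    label_column_name="y",
    compute_target=compute_target,
    forecasting_parameters=forecasting_parameters,
)
```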
Run the model selection and training process. With show_output=True, validation errors and current status are displayed and the execution is synchronous.
In this section we will review the forecast interface for two main scenarios: forecasting right after the training data, and the more complex interface for forecasting when there is a gap (in the time sense) between training and testing data.
X_train is directly followed by X_test
Let's first consider the case when the prediction period immediately follows the training data. This is typical in scenarios where we have the time to retrain the model every time we wish to forecast. Forecasts that are made on daily and slower cadence typically fall into this category. Retraining the model every time benefits the accuracy because the most recent data is often the most informative.

We use X_test as a forecast request to generate the predictions.
Typical path: X_test is known, forecast all upcoming periods
Confidence intervals
The forecasting model may be used to predict forecasting intervals by running forecast_quantiles().
This method accepts the same parameters as forecast().
Distribution forecasts
Often the figure of interest is not just the point prediction, but the prediction at some quantile of the distribution. This arises when the forecast is used to control some kind of inventory, for example of grocery items or virtual machines for a cloud service. In such cases, the control point is usually something like "we want the item to be in stock and not run out 99% of the time". This is called a "service level". Here is how you get quantile forecasts.
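Assuming fitted_model is the forecaster returned by the completed AutoML run, a quantile request looks like this (an interface sketch, not runnable outside a workspace; the quantile values are illustrative):

```python
# Request quantile forecasts for, e.g., a 99% service level.
# `fitted_model` and `X_test` are assumed from the steps above;
# executing this requires the azureml-train-automl SDK and a trained model.
fitted_model.quantiles = [0.01, 0.5, 0.99]
quantile_df = fitted_model.forecast_quantiles(X_test)
```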
Destination-date forecast: "just do something"
In some scenarios, the X_test is not known. The forecast is likely to be weak, because it is missing contemporaneous predictors, which we will need to impute. If you still wish to predict forward under the assumption that the last known values will be carried forward, you can forecast out to "destination date". The destination date still needs to fit within the forecast horizon from training.
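The "carry the last known values forward" assumption can be sketched in plain pandas; this is an illustration of what the imputation amounts to, not the SDK's destination-date call, and the function name and hourly frequency are assumptions:

```python
import pandas as pd

def extend_to_destination(X_train, time_col, dest):
    """Illustration: build a forecast request out to a destination date
    by carrying the last known feature values forward (hourly frequency
    assumed)."""
    last = X_train.sort_values(time_col).iloc[-1]
    future_times = pd.date_range(last[time_col], dest,
                                 freq="H", inclusive="right")
    X_dest = pd.DataFrame({time_col: future_times})
    for col in X_train.columns:
        if col != time_col:
            X_dest[col] = last[col]  # impute contemporaneous predictors
    return X_dest
```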
Forecasting away from training data
Suppose we trained a model, some time passed, and now we want to apply the model without re-training. If the model "looks back" -- uses previous values of the target -- then we somehow need to provide those values to the model.

The notion of forecast origin comes into play: the forecast origin is the last period for which we have seen the target value. This applies per time-series, so each time-series can have a different forecast origin.
The part of data before the forecast origin is the prediction context. To provide the context values the model needs when it looks back, we pass definite values in y_test (aligned with corresponding times in X_test).
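The layout of such a request can be sketched as follows: context rows carry the known target, and rows to be forecast carry NaN. The column name, timestamps, and values here are illustrative assumptions:

```python
import numpy as np
import pandas as pd

# Illustrative request: 6 context periods with known actuals,
# followed by 6 periods to forecast (marked NaN).
times = pd.date_range("2000-01-02 12:00", periods=12, freq="H")
X_request = pd.DataFrame({"date": times})
y_request = np.full(len(times), np.nan)
y_request[:6] = [3.1, 2.9, 3.4, 3.0, 2.8, 3.2]  # prediction context

# The forecast origin is the last period with a known target value.
origin = X_request["date"][np.isfinite(y_request)].max()
print(origin)  # 2000-01-02 17:00:00
```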
There is a gap of 12 hours between end of training and beginning of X_away. (It looks like 13 because all timestamps point to the start of the one hour periods.) Using only X_away will fail without adding context data for the model to consume.
How should we read that error message? The forecast origin is at the last time the model saw an actual value of y (the target). That was at the end of the training data! The model is attempting to forecast from the end of the training data. But the requested forecast periods are past the forecast horizon. We need to provide a definite y value to establish the forecast origin.
We will use this helper function to take the required amount of context from the data preceding the testing data. Its definition is intentionally simplified to keep the idea clear.
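A sketch of such a helper is shown below; the name, signature, and column arguments are illustrative, and a production version would also handle multiple time-series ids:

```python
import numpy as np
import pandas as pd

def make_forecasting_query(fulldata, time_col, target_col,
                           forecast_origin, horizon, lookback):
    """Simplified sketch: split `fulldata` into (X_past, y_past) context
    covering the lookback window just before the forecast origin, and an
    (X_future, y_future) request covering the forecast horizon."""
    past = fulldata[(fulldata[time_col] > forecast_origin - lookback) &
                    (fulldata[time_col] <= forecast_origin)]
    future = fulldata[(fulldata[time_col] > forecast_origin) &
                      (fulldata[time_col] <= forecast_origin + horizon)]
    X_past = past.drop(columns=[target_col])
    y_past = past[target_col].values
    X_future = future.drop(columns=[target_col])
    y_future = np.full(len(future), np.nan)  # target unknown over the horizon
    return X_past, y_past, X_future, y_future
```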
Let's see where the context data ends - it ends, by construction, just before the testing data starts.
Note that the forecast origin is at 17:00 for both time-series, and periods from 18:00 are to be forecast.
Forecasting farther than the forecast horizon
When the forecast destination, or the latest date in the prediction data frame, is farther into the future than the specified forecast horizon, the forecaster must be iteratively applied. Here, we advance the forecast origin on each iteration over the prediction window, predicting max_horizon periods ahead on each iteration. There are two choices for the context data to use as the forecaster advances into the prediction window:
- We can use forecasted values from previous iterations (recursive forecast),
- We can use known, actual values of the target if they are available (rolling forecast).
The first method is useful in a true forecasting scenario when we do not yet know the actual target values while the second is useful in an evaluation scenario where we want to compute accuracy metrics for the max_horizon-period-ahead forecaster over a long test set. We refer to the first as a recursive forecast since we apply the forecaster recursively over the prediction window and the second as a rolling forecast since we roll forward over known actuals.
Recursive forecasting
By default, the forecast() function will make point predictions out to the latest date in the prediction data frame using a recursive operation mode. Internally, the method recursively applies the regular forecaster to generate context so that we can forecast further into the future.
To illustrate the use-case and operation of recursive forecasting, we'll consider an example with a single time-series where the forecasting period directly follows the training period and is twice as long as the forecasting horizon given at training time.

Internally, we apply the forecaster in an iterative manner and finish the forecast task in two iterations. In the first iteration, we apply the forecaster and get the prediction for the first forecast-horizon periods (y_pred1). In the second iteration, y_pred1 is used as the context to produce the prediction for the next forecast-horizon periods (y_pred2). The combination of y_pred1 and y_pred2 gives the results for the total forecast periods.
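The recursion itself can be illustrated with a toy stand-in forecaster; this is a schematic of the mechanism, not the AutoML internals, and the toy model and damping factor are assumptions:

```python
import numpy as np

def toy_forecaster(context, horizon):
    """Toy stand-in for the fitted model: damps the last context value
    toward the context mean over the horizon (lag-1 style lookback)."""
    last, mean = context[-1], np.mean(context)
    return np.array([mean + (last - mean) * 0.8 ** (h + 1)
                     for h in range(horizon)])

def recursive_forecast(context, total_periods, horizon):
    """Apply the forecaster iteratively, feeding its own predictions
    back in as context until total_periods are covered."""
    context = list(context)
    preds = []
    while len(preds) < total_periods:
        step = toy_forecaster(np.array(context), horizon)
        preds.extend(step)
        context.extend(step)  # predictions become the next context
    return np.array(preds[:total_periods])

# A prediction window twice the horizon -> two iterations internally.
y_pred = recursive_forecast([1.0, 2.0, 3.0], total_periods=6, horizon=3)
```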
A caveat: forecast accuracy will likely be worse the farther we predict into the future since errors are compounded with recursive application of the forecaster.

Rolling forecasts
A rolling forecast is a similar concept to the recursive forecasts described above except that we use known actual values of the target for our context data. We have provided a different, public method for this called rolling_forecast. In addition to test data and actuals (X_test and y_test), rolling_forecast also accepts an optional step parameter that controls how far the origin advances on each iteration. The recursive forecast mode uses a fixed step of max_horizon while rolling_forecast defaults to a step size of 1, but can be set to any integer from 1 to max_horizon, inclusive.
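The way the origin advances can be sketched schematically; this enumerates the (origin, window) pairs a rolling evaluation would visit, and is an illustration of the step semantics rather than the SDK's implementation:

```python
import pandas as pd

def rolling_origins(test_times, horizon, step=1):
    """Enumerate (origin, window) pairs as a rolling forecast advances:
    the origin moves forward by `step` periods each iteration, and each
    window covers up to `horizon` periods beyond the origin."""
    windows = []
    for start in range(0, len(test_times), step):
        # None marks the initial origin at the end of the training set.
        origin = test_times[start - 1] if start > 0 else None
        window = list(test_times[start:start + horizon])
        windows.append((origin, window))
    return windows
```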
Let's see what the rolling forecast looks like on the long test set with the step set to 1:
Notice that rolling_forecast has returned a single DataFrame containing all results and has generated some new columns: _automl_forecast_origin, _automl_forecast_y, and _automl_actual_y. These are the origin date for each forecast, the forecasted value and the actual value, respectively. Note that "y" in the forecast and actual column names will generally be replaced by the target column name supplied to AutoML.
The output above shows forecasts for two prediction windows, the first with origin at the end of the training set and the second including the first observation in the test set (2000-01-01 06:00:00). Since the forecast windows overlap, there are multiple forecasts for most dates which are associated with different origin dates.
Confidence interval and distributional forecasts
AutoML cannot currently estimate forecast errors beyond the forecast horizon set during training, so the forecast_quantiles() function will return missing values for quantiles not equal to 0.5 beyond the forecast horizon.
As with the simple scenarios illustrated above, forecasting farther than the forecast horizon in other scenarios, such as multiple time-series, destination-date forecasting, and forecasting away from the training data, is also handled automatically by the forecast() function.