AutoML Forecasting Univariate Recipe - Run Experiment


Copyright (c) Microsoft Corporation. All rights reserved.

Licensed under the MIT License.


> **Important**: This notebook is outdated and is not supported by the AutoML Team. Please use the supported version (link).

Running AutoML experiments

See the auto-ml-forecasting-univariate-recipe-experiment-settings notebook for guidance on determining settings for seasonal features and target lags, and on whether the series needs to be differenced. To make experimentation user-friendly, the user only needs to specify three parameters: DIFFERENCE_SERIES, TARGET_LAGS and STL_TYPE. Once these parameters are set, the notebook generates the appropriate transformations and settings to run experiments, generate forecasts, compute inference-set metrics and plot forecasts versus actuals. If the DIFFERENCE_SERIES parameter is set to True, it also converts the forecast from first differences back to levels (the original units of measurement) before calculating inference-set metrics.
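To make the last point concrete, here is a minimal sketch (not the notebook's actual helper; the function name and values are hypothetical) of converting a forecast expressed in first differences back to levels:

```python
import pandas as pd

def differences_to_levels(forecast_diff, last_observed_level):
    """Convert a forecast in first differences back to levels.

    forecast_diff: pd.Series of predicted first differences, in time order.
    last_observed_level: last actual value (original units) before the
        forecast period.
    """
    # Each level equals the previous level plus the predicted difference,
    # so a cumulative sum anchored at the last observed level recovers levels.
    return forecast_diff.cumsum() + last_observed_level

# Hypothetical example: last actual value is 100, differences predicted below.
preds = pd.Series([2.0, -1.0, 3.0])
levels = differences_to_levels(preds, 100.0)
print(levels.tolist())  # [102.0, 101.0, 104.0]
```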


The output generated by this notebook is saved in the experiment_output folder.

Setup

[ ]

As part of the setup you have already created a Workspace. You will also need to create a compute target for your AutoML run. In this tutorial, you create AmlCompute as your training compute resource.

Note that if you have an AzureML Data Scientist role, you will not have permission to create compute resources. Talk to your workspace or IT admin to create the compute targets described in this section, if they do not already exist.

[ ]
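A typical way to provision the training compute with the v1 Python SDK is sketched below. The compute name and VM size are hypothetical, and the snippet assumes a workspace config.json produced during setup:

```python
from azureml.core import Workspace
from azureml.core.compute import AmlCompute, ComputeTarget

ws = Workspace.from_config()  # assumes a config.json from the workspace setup

# Hypothetical compute name and VM size; adjust to your subscription's quota.
compute_name = "cpu-cluster"
if compute_name in ws.compute_targets:
    compute_target = ws.compute_targets[compute_name]
else:
    # Provisioning configuration for an autoscaling CPU cluster.
    config = AmlCompute.provisioning_configuration(
        vm_size="STANDARD_DS3_V2", max_nodes=4
    )
    compute_target = ComputeTarget.create(ws, compute_name, config)
    compute_target.wait_for_completion(show_output=True)
```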

Data

Here, we will load the data from the CSV file and drop the COVID period.

[ ]
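The load-and-filter step could look like the sketch below. The file path, column names, and COVID cutoff date are hypothetical; a small synthetic frame stands in for the CSV so the sketch is self-contained:

```python
import pandas as pd

# In the notebook the data comes from a CSV, e.g.:
#   df = pd.read_csv("data.csv", parse_dates=["date"])
# Synthetic quarterly stand-in data so this sketch runs on its own:
df = pd.DataFrame({
    "date": pd.date_range("2019-10-01", periods=6, freq="QS"),
    "value": [10.0, 11.0, 12.0, 2.0, 3.0, 13.0],
})

# Hypothetical cutoff: keep only observations before the COVID period.
covid_start = pd.Timestamp("2020-03-01")
df = df[df["date"] < covid_start].reset_index(drop=True)
print(len(df))  # 2
```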

Set parameters

The first set of parameters is based on the analysis performed in the auto-ml-forecasting-univariate-recipe-experiment-settings notebook.

[ ]
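These settings might be declared as plain constants, as in the following config fragment; the values shown are illustrative, not prescriptive, and should come from the analysis in the settings notebook:

```python
# Settings derived from the companion experiment-settings notebook.
# The values below are illustrative, not prescriptive.
DIFFERENCE_SERIES = True   # model first differences instead of levels
TARGET_LAGS = None         # e.g. [1] to add a one-period lag feature
STL_TYPE = None            # e.g. "season" or "season_trend" for STL features
```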

Next, define additional parameters to be used in the AutoML config class.

  • FORECAST_HORIZON: The forecast horizon is the number of periods into the future that the model should predict. Here, we set the horizon to 12 periods (i.e. 12 quarters). For more discussion of forecast horizons and guiding principles for setting them, please see the energy demand notebook.
  • TIME_SERIES_ID_COLNAMES: The names of columns used to group a time series; they can be used to create multiple series. If no time series identifiers are defined, the data set is assumed to be a single time series. This parameter is used with the forecasting task type. Since we are working with a single series, this list is empty.
  • BLOCKED_MODELS: An optional list of models to be blocked from consideration during the model selection stage. At this point, we want to consider all ML and time series models.
    • See the following link for a list of supported forecasting models
[ ]
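A matching config fragment for these three parameters, with the values described above:

```python
FORECAST_HORIZON = 12           # predict 12 periods (quarters) ahead
TIME_SERIES_ID_COLNAMES = []    # single series, so no grouping columns
BLOCKED_MODELS = []             # consider all ML and time series models
```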

To run AutoML, you also need to create an Experiment. An Experiment corresponds to a prediction problem you are trying to solve, while a Run corresponds to a specific approach to the problem.

[ ]
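Creating the Experiment is a short step; a sketch with a hypothetical experiment name (requires an Azure workspace config):

```python
from azureml.core import Experiment, Workspace

ws = Workspace.from_config()
# Hypothetical experiment name; all runs submitted under it are grouped
# together in the Azure ML portal.
experiment = Experiment(ws, name="univariate-recipe-run")
```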
[ ]
[ ]
[ ]

Upload files to the Datastore

The Machine Learning workspace is paired with a storage account, which contains the default datastore. We will use it to upload the bike share data and create a tabular dataset for training. A tabular dataset defines a series of lazily-evaluated, immutable operations to load data from the data source into a tabular representation.

[ ]
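The upload-and-register step could look like this sketch (the local file name and target path are hypothetical, and an Azure workspace config is assumed):

```python
from azureml.core import Dataset, Workspace

ws = Workspace.from_config()
datastore = ws.get_default_datastore()

# Upload the local CSV to the default datastore (hypothetical paths).
datastore.upload_files(
    files=["./data/train.csv"],
    target_path="univariate-recipe/",
    overwrite=True,
)

# Lazily-evaluated tabular dataset backed by the uploaded file.
train_dataset = Dataset.Tabular.from_delimited_files(
    path=[(datastore, "univariate-recipe/train.csv")]
)
```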

Configure AutoML

[ ]
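An AutoML forecasting configuration along the lines described above might look like this sketch. The column names and primary metric are hypothetical, and `train_dataset` and `compute_target` are assumed to come from the preceding setup cells:

```python
from azureml.automl.core.forecasting_parameters import ForecastingParameters
from azureml.train.automl import AutoMLConfig

# Forecasting-specific settings; column names are hypothetical.
forecasting_parameters = ForecastingParameters(
    time_column_name="date",
    forecast_horizon=12,
    target_lags=None,
    time_series_id_column_names=[],
)

automl_config = AutoMLConfig(
    task="forecasting",
    primary_metric="normalized_root_mean_squared_error",
    training_data=train_dataset,        # from the datastore section
    label_column_name="value",          # hypothetical target column
    compute_target=compute_target,      # from the setup section
    blocked_models=[],                  # consider all models
    forecasting_parameters=forecasting_parameters,
)
```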

We will now run the experiment. You can go to the Azure ML portal to view the run details.

[ ]
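Submitting the configuration creates a Run; a minimal sketch, assuming `experiment` and `automl_config` were created in the earlier cells:

```python
# Submit the AutoML configuration; a link to the run in the portal is
# printed, and wait_for_completion blocks until all child runs finish.
remote_run = experiment.submit(automl_config, show_output=False)
remote_run.wait_for_completion()
```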

Retrieve the Best Run details

Below we retrieve the best Run object from among all the runs in the experiment.

[ ]
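With the v1 SDK, the parent AutoML run exposes the best child run and its fitted pipeline directly; a sketch, assuming `remote_run` is the run submitted earlier:

```python
# get_output() returns the best-scoring child run and its fitted model
# (a scikit-learn-style pipeline ready for inference).
best_run, fitted_model = remote_run.get_output()
print(best_run.id)
```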

Inference

We now use the best fitted model from the AutoML Run to make forecasts for the test set. We will do batch scoring on the test dataset, which should have the same schema as the training dataset.

The inference will run on a remote compute. In this example, it will re-use the training compute.

[ ]

Retrieving forecasts from the model

We have created a function called run_forecast that submits the test data to the best model determined during the training run and retrieves forecasts. This function uses a helper script, forecasting_script, which is uploaded and executed on the remote compute.

[ ]

Download the prediction results for metrics calculation

The test data with predictions is saved in the artifact outputs/predictions.csv. We will use it to calculate accuracy metrics and visualize predictions versus actuals.

[ ]
[ ]
[ ]

Calculate metrics and save output

[ ]
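The kind of inference-set metrics computed here can be sketched with plain NumPy (this is not AutoML's own metric code, and the sample values below are hypothetical stand-ins for predictions.csv):

```python
import numpy as np
import pandas as pd

def regression_metrics(actuals, predictions):
    """Simple accuracy metrics for a forecast vs. actuals (sketch)."""
    actuals = np.asarray(actuals, dtype=float)
    predictions = np.asarray(predictions, dtype=float)
    errors = actuals - predictions
    return {
        "RMSE": float(np.sqrt(np.mean(errors ** 2))),
        "MAE": float(np.mean(np.abs(errors))),
        # MAPE assumes no zero actuals, as in a strictly positive series.
        "MAPE": float(np.mean(np.abs(errors / actuals)) * 100),
    }

# Hypothetical stand-in for the downloaded predictions.csv contents.
df = pd.DataFrame({"actual": [100.0, 110.0, 120.0],
                   "predicted": [102.0, 108.0, 117.0]})
metrics = regression_metrics(df["actual"], df["predicted"])
print(metrics)
```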

Generate and save visuals

[ ]
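A forecast-versus-actuals plot of the sort saved here can be sketched with matplotlib; the data frame, column names, and output file name are hypothetical:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical inference-set frame with actuals and predictions.
df = pd.DataFrame({
    "date": pd.date_range("2021-01-01", periods=4, freq="QS"),
    "actual": [100.0, 110.0, 120.0, 115.0],
    "predicted": [102.0, 108.0, 117.0, 118.0],
})

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(df["date"], df["actual"], label="actual")
ax.plot(df["date"], df["predicted"], label="forecast", linestyle="--")
ax.set_xlabel("date")
ax.set_ylabel("value")
ax.legend()
fig.savefig("forecast_vs_actuals.png")  # saved alongside the other outputs
```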