Train Remote
Copyright (c) Microsoft Corporation. All rights reserved.
Licensed under the MIT License.
Use MLflow with Azure Machine Learning for Remote Training Run
This example shows how to use the MLflow tracking APIs together with Azure Machine Learning to store your metrics and artifacts for a training run launched from a local notebook. You'll learn how to:
- Set up the MLflow tracking URI to use Azure ML
- Create an experiment
- Train a model on Machine Learning Compute while logging metrics and artifacts
- View your experiment within your Azure ML Workspace in the Azure Portal
Prerequisites
Make sure you have completed the Configuration notebook to set up your Azure Machine Learning workspace and ensure other common prerequisites are met.
Set-up
Check the version of the Azure ML SDK installed on your computer, and then connect to your Workspace.
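As a minimal sketch of this step, the version check can be done with the standard library's `importlib.metadata`, which is a swapped-in alternative to printing the SDK's own version attribute. The helper names `installed_version` and `connect_workspace` are illustrative, and the sketch assumes the `config.json` written by the Configuration notebook is available to `Workspace.from_config()`.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def connect_workspace():
    """Load the workspace described by the local config.json (assumed to exist)."""
    # Imported lazily so the version check works even when the SDK is missing.
    from azureml.core import Workspace

    return Workspace.from_config()


# Report the packages this example relies on.
for package in ("azureml-core", "azureml-mlflow", "mlflow"):
    print(package, installed_version(package) or "not installed")
```

Calling `ws = connect_workspace()` afterwards should return the workspace object used in the remaining steps, provided the SDK is installed and you are authenticated.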
Let's also create a Machine Learning Compute cluster for submitting the remote run.
Note that if you have an AzureML Data Scientist role, you will not have permission to create compute resources. Talk to your workspace or IT admin to create the compute targets described in this section, if they do not already exist.
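One possible way to create the cluster is sketched below: reuse the compute target if it already exists, and provision it otherwise. The cluster name, VM size, and node count are assumptions chosen for illustration, not values taken from this example, and `get_or_create_cluster` is a hypothetical helper.

```python
def get_or_create_cluster(ws, name="cpu-cluster", vm_size="STANDARD_D2_V2", max_nodes=2):
    """Return the named compute target, creating an AmlCompute cluster if needed."""
    # Lazy imports: the SDK is only required when the function is called.
    from azureml.core.compute import AmlCompute, ComputeTarget
    from azureml.core.compute_target import ComputeTargetException

    try:
        # Reuse the cluster if it already exists in the workspace.
        return ComputeTarget(workspace=ws, name=name)
    except ComputeTargetException:
        # Otherwise provision a new cluster (requires permission to create compute).
        config = AmlCompute.provisioning_configuration(
            vm_size=vm_size, max_nodes=max_nodes
        )
        cluster = ComputeTarget.create(ws, name, config)
        cluster.wait_for_completion(show_output=True)
        return cluster
```

Trying the lookup first means a user without permission to create compute can still run the cell when an admin has already provisioned a cluster under the same name.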
Create Azure ML Experiment
The following steps show how to submit a training Python script to a cluster as an Azure ML run, while logging happens through MLflow APIs to your Azure ML Workspace. Let's first create an experiment to hold the training runs.
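A rough sketch of this step follows. The experiment name is an assumption, and both helper names are hypothetical. The second helper points the local MLflow client at the workspace, which is only needed for MLflow calls made from the notebook itself; it assumes `azureml-mlflow` is installed, since that package provides `get_mlflow_tracking_uri()` on the workspace object.

```python
def create_experiment(ws, name="train-with-mlflow-remote"):
    """Create (or fetch) the experiment that will hold the training runs."""
    from azureml.core import Experiment  # lazy import; requires azureml-core

    return Experiment(workspace=ws, name=name)


def track_in_workspace(ws):
    """Point the local MLflow client at the workspace and return the URI used."""
    import mlflow  # lazy import; requires mlflow and azureml-mlflow

    uri = ws.get_mlflow_tracking_uri()
    mlflow.set_tracking_uri(uri)
    return uri
```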
Instrument remote training script using MLflow
Let's use train_diabetes.py, which trains a regression model on the diabetes dataset, as the example. Note that the training script uses mlflow.start_run() to start logging, and then logs metrics, saves the trained scikit-learn model, and saves a plot as an artifact.
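To make that structure concrete, here is a hypothetical stand-in for such a script; the real train_diabetes.py may differ. It is held in a string so that a notebook cell can write it to disk, in the spirit of `%%writefile`. The Ridge model, the `alpha` value, the metric name, and the file names are all assumptions made for illustration.

```python
from pathlib import Path

# Hypothetical training script: not the actual contents of train_diabetes.py.
TRAINING_SCRIPT = '''\
import matplotlib

matplotlib.use("Agg")  # headless backend, since the cluster has no display

import matplotlib.pyplot as plt
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

with mlflow.start_run():
    alpha = 0.03  # illustrative regularization strength
    model = Ridge(alpha=alpha).fit(X_train, y_train)
    predictions = model.predict(X_test)

    # Parameters and metrics
    mlflow.log_param("alpha", alpha)
    mlflow.log_metric("mse", mean_squared_error(y_test, predictions))

    # Trained scikit-learn model
    mlflow.sklearn.log_model(model, "model")

    # Plot saved to disk, then logged as an artifact
    fig, ax = plt.subplots()
    ax.scatter(y_test, predictions)
    ax.set_xlabel("actual")
    ax.set_ylabel("predicted")
    fig.savefig("actual_vs_predicted.png")
    mlflow.log_artifact("actual_vs_predicted.png")
'''


def write_training_script(path="train_sketch.py"):
    """Write the sketch to disk so it can be submitted as a run."""
    target = Path(path)
    target.write_text(TRAINING_SCRIPT)
    return target
```

The script only imports `mlflow` and the modelling libraries, so the same file could in principle log to any MLflow tracking server.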
Run the following command to view the script file. Notice the MLflow logging statements, and also notice that the script has no explicit dependency on the azureml library.
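Besides printing the file, the two observations above can be checked programmatically. The sketch below uses the standard library's `ast` module to list the top-level packages a script imports and a simple filter to pull out the lines that mention `mlflow`. Both helper names are illustrative.

```python
import ast


def imported_modules(source):
    """Return the top-level package names imported by Python source code."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 skips relative imports such as "from . import x".
            modules.add(node.module.split(".")[0])
    return modules


def mlflow_lines(source):
    """Return the source lines that mention mlflow, stripped of indentation."""
    return [line.strip() for line in source.splitlines() if "mlflow" in line]
```

Reading train_diabetes.py into a string and passing it to `imported_modules` should give a set that contains `mlflow` but not `azureml`, if the script is written as described.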
Submit Run to Cluster
Let's submit the run to the cluster. When the script runs on the remote cluster as a submitted run, Azure ML sets the MLflow tracking URI to point to your Azure ML Workspace, so that the metrics and artifacts are automatically logged there.
Note that you have to specify the packages your script depends on, including azureml-mlflow, which implicitly enables MLflow logging to Azure ML.
First, create an environment with Docker enabled and the required package dependencies specified.
Next, specify a script run configuration that includes the training script, the environment, and the CPU cluster created earlier.
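These two steps might look roughly like the sketch below. The environment name and the package list are assumptions, and the helper names are hypothetical. Docker is left at the SDK default here because the setting used to enable it explicitly has changed between SDK releases, so check the documentation for your installed version. Passing `compute_target` and `environment` directly to `ScriptRunConfig` also assumes a reasonably recent SDK v1 release.

```python
# Assumed dependency list; azureml-mlflow is the package that routes MLflow logging to Azure ML.
PIP_PACKAGES = ("azureml-mlflow", "mlflow", "scikit-learn", "matplotlib")


def build_environment(name="mlflow-env", pip_packages=PIP_PACKAGES):
    """Create an environment whose conda spec installs the given pip packages."""
    from azureml.core import Environment
    from azureml.core.conda_dependencies import CondaDependencies

    env = Environment(name=name)
    env.python.conda_dependencies = CondaDependencies.create(
        pip_packages=list(pip_packages)
    )
    return env


def build_run_config(compute_target, environment,
                     script="train_diabetes.py", source_directory="."):
    """Bundle the script, environment, and cluster into a run configuration."""
    from azureml.core import ScriptRunConfig

    return ScriptRunConfig(
        source_directory=source_directory,
        script=script,
        compute_target=compute_target,
        environment=environment,
    )
```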
Finally, submit the run. Note that the first run typically takes several minutes longer, because the Docker-based environment has to be created. Subsequent runs reuse the image and are faster.
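A small sketch of the submission step is shown here. `submit_and_wait` is a hypothetical helper; it relies only on the experiment's `submit` method and the run's `wait_for_completion` method, and it also times the run so the difference between a first and a later submission can be observed.

```python
import time


def submit_and_wait(experiment, run_config):
    """Submit the run, stream its logs, and return the run with elapsed seconds."""
    started = time.monotonic()
    run = experiment.submit(run_config)
    # Blocks until the run finishes; the first run also waits for the image build.
    run.wait_for_completion(show_output=True)
    return run, time.monotonic() - started
```

Here `experiment` is the Azure ML experiment created earlier and `run_config` is the script run configuration, so a call would look like `run, seconds = submit_and_wait(experiment, run_config)`.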
You can navigate to your Azure ML Workspace in the Azure Portal to view the run metrics and artifacts.
You can also get the metrics and bring them to your local notebook, and view the details of the run.
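As an illustration, the helpers below gather a finished run's id, status, and logged metrics into plain Python structures. Both function names are hypothetical; `run` is assumed to be the Azure ML run object returned when the run was submitted, which exposes `id`, `get_status()`, and `get_metrics()`.

```python
def summarize_run(run):
    """Collect the run's id, status, and logged metrics into one dictionary."""
    return {
        "id": run.id,
        "status": run.get_status(),
        "metrics": run.get_metrics(),
    }


def format_metrics(metrics):
    """Render a metrics dictionary as sorted 'name: value' lines."""
    return [f"{name}: {value}" for name, value in sorted(metrics.items())]
```

For a fuller picture, including log file locations and the run definition, the run object's `get_details()` method returns a larger dictionary.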