How To Use Scriptrun
Copyright (c) Microsoft Corporation. All rights reserved.
Licensed under the MIT License.
How to configure a training run with data input and output
This notebook shows how to use ScriptRunConfig with input and output. A run submitted with ScriptRunConfig represents a single trial in an experiment. Submitting the run returns a ScriptRun object, which can be used to monitor the asynchronous execution of the run, log metrics and store output of the run, and analyze results and access artifacts generated by the run.
Prerequisites:
- Understand the architecture and terms introduced by Azure Machine Learning
- If you are using an Azure Machine Learning Notebook VM, you are all set. Otherwise, go through the configuration notebook to:
  - install the AML SDK
  - create a workspace and its configuration file (config.json)
Initialize workspace
Initialize a Workspace object from the existing workspace you created in the Prerequisites step. Workspace.from_config() creates a workspace object from the details stored in config.json.
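A minimal sketch of this step, assuming the v1 `azureml-core` package is installed and `config.json` is reachable from the working directory. The helper name `load_workspace` is illustrative and not part of the SDK; the import sits inside the function only so the snippet can be loaded on a machine without the SDK.

```python
def load_workspace():
    """Load the workspace described by config.json (hypothetical helper)."""
    # Imported lazily: requires the azureml-core package (SDK v1).
    from azureml.core import Workspace

    # from_config() reads the subscription, resource group and workspace
    # name from config.json and returns a Workspace object.
    ws = Workspace.from_config()
    print(ws.name, ws.resource_group, ws.location, sep="\n")
    return ws
```

In a notebook cell this would typically be called once, as `ws = load_workspace()`, and the resulting `ws` reused by the later steps.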
Create or Attach existing AmlCompute
You will need to create a compute target for training your model. In this tutorial, you create AmlCompute as your training compute resource.
Note that if you have an AzureML Data Scientist role, you will not have permission to create compute resources. Talk to your workspace or IT admin to create the compute targets described in this section, if they do not already exist.
If no cluster with the given name exists, we create a new one here: an AmlCompute cluster of STANDARD_D2_V2 CPU VMs. This process is broken down into 3 steps:
- create the configuration (this step is local and only takes a second)
- create the cluster (this step will take about 20 seconds)
- provision the VMs to bring the cluster to the initial size (of 1 in this case). This step takes about 3-5 minutes and produces only sparse output along the way. Make sure to wait until the call returns before moving to the next cell
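The three steps above can be sketched as follows, assuming the v1 `azureml-core` SDK. The function name, `max_nodes=4`, and the use of `min_nodes=1` to get the initial size of 1 are assumptions made for illustration, not values confirmed by this notebook.

```python
def get_or_create_cpu_cluster(ws, cluster_name="cpu-cluster",
                              vm_size="STANDARD_D2_V2", max_nodes=4):
    """Return the named AmlCompute cluster, creating it if it is missing."""
    # Imported lazily: requires the azureml-core package (SDK v1).
    from azureml.core.compute import AmlCompute, ComputeTarget
    from azureml.core.compute_target import ComputeTargetException

    try:
        # Attach to the cluster if it already exists in the workspace.
        cluster = ComputeTarget(workspace=ws, name=cluster_name)
        print("Found existing cluster, using it.")
    except ComputeTargetException:
        # Step 1: build the configuration (local, takes about a second).
        config = AmlCompute.provisioning_configuration(
            vm_size=vm_size,
            min_nodes=1,          # assumed: gives the initial size of 1
            max_nodes=max_nodes,  # assumed upper bound
        )
        # Step 2: ask the service to create the cluster.
        cluster = ComputeTarget.create(ws, cluster_name, config)

    # Step 3: block until the VMs are provisioned (a few minutes).
    cluster.wait_for_completion(show_output=True)
    return cluster
```

Catching `ComputeTargetException` rather than a bare `Exception` keeps unrelated failures, such as authentication errors, visible instead of silently triggering a create call.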
Now that you have created the compute target, let's see what the workspace's compute_targets property returns. You should now see one entry named 'cpu-cluster' of type AmlCompute.
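As a rough illustration, the property can be summarized as a name-to-type mapping. `list_compute_targets` is a hypothetical helper; it relies only on `compute_targets` being a dictionary keyed by name and on each compute target exposing a `type` attribute.

```python
def list_compute_targets(ws):
    """Return {name: compute type} for every target in the workspace."""
    # ws.compute_targets maps each target name to its compute object.
    return {name: target.type for name, target in ws.compute_targets.items()}
```

With the cluster from the previous step in place, the returned mapping should contain an entry for `cpu-cluster` whose type is `AmlCompute`.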
Use a simple script
We have already created a simple "hello world" script. This is the script that we will submit through the ScriptRunConfig. It reads the Iris dataset as input and writes it out to the outputdataset folder in the default blob datastore.
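The script itself is not reproduced in this text, so the following is only a sketch of what such a script might look like, using the standard library alone. It assumes the input and output locations arrive as two positional command-line arguments; the names `copy_dataset` and `main` are hypothetical.

```python
import argparse
import os
import shutil


def copy_dataset(input_path, output_dir):
    """Copy a file, or every file under a folder, into output_dir."""
    os.makedirs(output_dir, exist_ok=True)
    if os.path.isfile(input_path):
        sources = [input_path]
    else:
        sources = [
            os.path.join(root, name)
            for root, _dirs, names in os.walk(input_path)
            for name in names
        ]
    written = []
    for src in sorted(sources):
        # Subfolders are flattened: only the file name is kept.
        dst = os.path.join(output_dir, os.path.basename(src))
        shutil.copyfile(src, dst)
        written.append(dst)
    return written


def main(argv=None):
    parser = argparse.ArgumentParser(description="Copy input data to output.")
    parser.add_argument("input_path", help="mounted input file or folder")
    parser.add_argument("output_dir", help="folder backing the output dataset")
    args = parser.parse_args(argv)
    written = copy_dataset(args.input_path, args.output_dir)
    for path in written:
        print("wrote", path)
    return written
```

A standalone script file would end by calling `main()` under the usual `if __name__ == "__main__":` guard, so the paths are taken from the command line when the run starts.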
Every workspace comes with a default datastore (and you can register more), which is backed by the Azure blob storage account associated with the workspace. We can use it to transfer data from the local machine to the cloud and to create a dataset from it. We will now upload the Iris data to the default datastore (blob) within your workspace.
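A sketch of the upload, assuming the v1 `azureml-core` SDK. The local file name `iris.csv`, the target folder `train-dataset/tabular/`, and both helper names are assumptions chosen for illustration.

```python
import os
import posixpath


def remote_blob_path(local_file, target_path):
    """Blob path a local file ends up at after upload (hypothetical helper)."""
    return posixpath.join(target_path, os.path.basename(local_file))


def upload_iris(ws, local_file="iris.csv",
                target_path="train-dataset/tabular/"):
    """Upload the file to the default datastore and wrap it in a FileDataset."""
    # Imported lazily: requires the azureml-core package (SDK v1).
    from azureml.core import Dataset

    datastore = ws.get_default_datastore()
    # Copy the local file into the blob container behind the datastore.
    datastore.upload_files(
        files=[local_file],
        target_path=target_path,
        overwrite=True,
        show_progress=True,
    )
    # Reference the uploaded blob as a dataset; no data is downloaded here.
    remote = remote_blob_path(local_file, target_path)
    return Dataset.File.from_files(path=(datastore, remote))
```

Later releases of the v1 SDK mark `upload_files` as deprecated in favor of dataset upload methods, so check which call your installed version recommends.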
Now we are ready to define the input and output of your script. They can be passed in via the arguments parameter of ScriptRunConfig, which is a list of command-line arguments to pass to the training script named by the script parameter.
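One way this wiring might look, assuming the v1 `azureml-core` SDK, a `FileDataset` as input, and a script that takes the input and output paths as its first two positional arguments. The script name `train.py`, the input name `input`, and the helper `build_run_config` are placeholders, not names taken from this notebook.

```python
def build_run_config(ws, compute_target, input_dataset,
                     source_directory=".", script="train.py"):
    """Build a ScriptRunConfig whose arguments carry the input and output."""
    # Imported lazily: requires the azureml-core package (SDK v1).
    from azureml.core import ScriptRunConfig
    from azureml.data import OutputFileDatasetConfig

    datastore = ws.get_default_datastore()
    # Output: files the script writes here are uploaded to the
    # outputdataset folder of the default blob datastore.
    output = OutputFileDatasetConfig(destination=(datastore, "outputdataset"))
    # Input: mount the dataset on the compute; the script receives the
    # mount path in place of this object.
    mounted_input = input_dataset.as_named_input("input").as_mount()

    return ScriptRunConfig(
        source_directory=source_directory,
        script=script,  # placeholder script name
        arguments=[mounted_input, output],
        compute_target=compute_target,
    )
```

The resulting configuration is submitted through an `Experiment` object, for example `Experiment(workspace=ws, name="my-experiment").submit(src)` with an experiment name of your choosing, which returns the ScriptRun used to monitor the run.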