AML Pipelines: Showcasing DataPath and PipelineParameter
Copyright (c) Microsoft Corporation. All rights reserved.
Licensed under the MIT License.
This notebook demonstrates the use of DataPath and PipelineParameter in AML Pipelines. You will learn how strings and DataPaths can be parameterized and submitted to AML Pipelines via PipelineParameters. To see more about how parameters work between steps, please refer to the aml-pipelines-with-data-dependency-steps notebook.
- How to create a Pipeline with a DataPath PipelineParameter
- How to submit a Pipeline with a DataPath PipelineParameter
- How to submit a Pipeline and change the DataPath PipelineParameter value from the sdk
- How to submit a Pipeline and change the DataPath PipelineParameter value using a REST call
- How to create a datastore trigger schedule and use the data_path_parameter_name to get the path of the changed blob in the Pipeline
Azure Machine Learning and Pipeline SDK-specific imports
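A minimal sketch of the imports this notebook relies on, assuming the `azureml-sdk` (with the pipeline extras) is installed:

```python
# Core SDK and pipeline imports used throughout this notebook.
import azureml.core
from azureml.core import Workspace, Experiment, Datastore
from azureml.core.compute import AmlCompute, ComputeTarget
from azureml.data.datapath import DataPath, DataPathComputeBinding
from azureml.pipeline.core import Pipeline, PipelineParameter
from azureml.pipeline.core.schedule import Schedule
from azureml.pipeline.steps import PythonScriptStep

# Check the SDK version you are running against.
print("SDK version:", azureml.core.VERSION)
```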
Initialize Workspace
Initialize a workspace object from persisted configuration. If you are using an Azure Machine Learning Notebook VM, you are all set. Otherwise, make sure the config file is present at .\config.json
If you don't have a config.json file, go through the configuration Notebook first.
This sets you up with a working config file that has information on your workspace, subscription id, etc.
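The workspace initialization can be sketched as follows, assuming `config.json` is present in the working directory or a parent:

```python
from azureml.core import Workspace

# Loads subscription id, resource group, and workspace name from config.json.
ws = Workspace.from_config()
print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep="\n")
```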
Create an Azure ML experiment
Let's create an experiment named "showcasing-datapath" and a folder to hold the training script. The script runs will be recorded under the experiment in Azure.
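A sketch of the experiment and script-folder setup; the folder name `datapath_script` is an assumption:

```python
import os
from azureml.core import Experiment

experiment_name = "showcasing-datapath"
source_directory = "datapath_script"  # assumed folder for the training script
os.makedirs(source_directory, exist_ok=True)

experiment = Experiment(workspace=ws, name=experiment_name)
```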
Create or Attach an AmlCompute cluster
You will need to create a compute target for your pipeline run. In this tutorial, you use the default AmlCompute as your training compute resource.
Note that if you have an AzureML Data Scientist role, you will not have permission to create compute resources. Talk to your workspace or IT admin to create the compute targets described in this section, if they do not already exist.
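A typical create-or-attach sketch; the cluster name and VM size below are assumptions:

```python
from azureml.core.compute import AmlCompute, ComputeTarget
from azureml.core.compute_target import ComputeTargetException

cluster_name = "cpu-cluster"  # assumed cluster name

try:
    # Reuse the cluster if it already exists in the workspace.
    compute_target = ComputeTarget(workspace=ws, name=cluster_name)
    print("Found existing compute target.")
except ComputeTargetException:
    # Otherwise provision a new AmlCompute cluster.
    config = AmlCompute.provisioning_configuration(vm_size="STANDARD_DS3_V2",
                                                   max_nodes=4)
    compute_target = ComputeTarget.create(ws, cluster_name, config)
    compute_target.wait_for_completion(show_output=True)
```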
Data and arguments setup
We will set up a training script and the arguments to pass to it. The sample training script below simply prints its two arguments, showing what has been passed to the pipeline.
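The sample training script can be sketched as below; the file name `train_with_datapath.py` and argument names are assumptions. At run time, the step's `arguments` list supplies the command line:

```python
import argparse

def main(argv=None):
    """Parse and print the two arguments the pipeline passes to this script."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--arg1", type=str, help="string pipeline parameter")
    parser.add_argument("--arg2", type=str, help="resolved DataPath input path")
    args = parser.parse_args(argv)
    print("Argument 1:", args.arg1)
    print("Argument 2:", args.arg2)
    return args

# Local demonstration with sample values; on the compute target the values
# come from the pipeline submission.
main(["--arg1", "sample-string", "--arg2", "sample/path"])
```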
Let's set up the string and DataPath arguments using PipelineParameter.
Note that a Pipeline accepts a tuple of the form (PipelineParameter, DataPathComputeBinding) as an input. DataPath defines the location of the input data. DataPathComputeBinding defines how the data is consumed during step execution. The DataPath can be modified at pipeline submission time with a DataPath parameter, while the compute binding does not change. For static data inputs, we use DataReference, which defines both the data location and the compute binding.
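A sketch of the parameter setup; the datastore name, parameter names, and default paths are assumptions:

```python
from azureml.core import Datastore
from azureml.data.datapath import DataPath, DataPathComputeBinding
from azureml.pipeline.core import PipelineParameter

datastore = Datastore(ws, "workspaceblobstore")  # assumed datastore name

# A plain string parameter with a default value.
string_pipeline_param = PipelineParameter(name="pipeline_arg",
                                          default_value="sample_string1")

# A DataPath parameter: the DataPath supplies the default data location,
# while the compute binding fixes how the data is consumed on the compute.
data_path = DataPath(datastore=datastore, path_on_datastore="sample_datapath1")
datapath_pipeline_param = PipelineParameter(name="input_datapath",
                                            default_value=data_path)
datapath_input = (datapath_pipeline_param, DataPathComputeBinding(mode="mount"))
```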
Create a Pipeline with a DataPath PipelineParameter
Note that datapath_input is specified in both arguments and inputs when creating the step.
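The step and pipeline creation can be sketched as follows, reusing the names defined above (the script name is an assumption):

```python
from azureml.pipeline.core import Pipeline
from azureml.pipeline.steps import PythonScriptStep

train_step = PythonScriptStep(
    name="train_step",
    script_name="train_with_datapath.py",  # assumed script name
    # datapath_input appears in both arguments and inputs.
    arguments=["--arg1", string_pipeline_param, "--arg2", datapath_input],
    inputs=[datapath_input],
    compute_target=compute_target,
    source_directory=source_directory,
)

pipeline = Pipeline(workspace=ws, steps=[train_step])
```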
Submit a Pipeline with a DataPath PipelineParameter
Pipelines can be submitted with default values of PipelineParameters by not specifying any parameters.
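A minimal sketch of a submission that keeps all parameter defaults:

```python
# No pipeline_parameters given, so the defaults defined on each
# PipelineParameter are used.
pipeline_run = experiment.submit(pipeline)
pipeline_run.wait_for_completion()
```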
Submit a Pipeline and change the DataPath PipelineParameter value from the sdk
Alternatively, Pipelines can be submitted with values other than the defaults by passing a pipeline_parameters dictionary.
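A sketch of overriding both parameters at submission time; the new string and path values are assumptions:

```python
from azureml.data.datapath import DataPath

# Point the DataPath parameter at a different location on the same datastore.
new_data_path = DataPath(datastore=datastore, path_on_datastore="sample_datapath2")

pipeline_run = experiment.submit(
    pipeline,
    pipeline_parameters={
        "pipeline_arg": "sample_string2",   # overrides the string default
        "input_datapath": new_data_path,    # overrides the DataPath default
    },
)
```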
Submit a Pipeline and change the DataPath PipelineParameter value using a REST call
Let's publish the pipeline so we can use the REST endpoint of the published pipeline.
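A sketch of publishing and invoking the pipeline over REST, assuming interactive authentication; note that DataPath values go under "DataPathAssignments" while plain string parameters go under "ParameterAssignments" (the pipeline name and path below are assumptions):

```python
import requests
from azureml.core.authentication import InteractiveLoginAuthentication

published_pipeline = pipeline.publish(
    name="DataPath_Pipeline",
    description="Pipeline showcasing a DataPath PipelineParameter")

# Obtain an AAD token header for the REST call.
auth = InteractiveLoginAuthentication()
headers = auth.get_authentication_header()

payload = {
    "ExperimentName": "showcasing-datapath",
    "ParameterAssignments": {"pipeline_arg": "sample_string3"},
    "DataPathAssignments": {
        "input_datapath": {
            "DataStoreName": datastore.name,
            "RelativePath": "sample_datapath3",
        }
    },
}
response = requests.post(published_pipeline.endpoint, headers=headers, json=payload)
run_id = response.json().get("Id")
```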
Create a Datastore trigger schedule and use data path parameter
When the Pipeline is scheduled with a DataPath parameter, it will be triggered by data being modified or added under that DataPath. path_on_datastore should be a folder, and the value of the DataPath parameter will be replaced by the path of the modified or added data.
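A sketch of such a datastore-triggered schedule; the schedule name, watched folder, and polling interval are assumptions:

```python
from azureml.pipeline.core.schedule import Schedule

schedule = Schedule.create(
    ws,
    name="datapath-trigger-schedule",        # assumed schedule name
    pipeline_id=published_pipeline.id,
    experiment_name="showcasing-datapath",
    datastore=datastore,
    path_on_datastore="trigger/folder",      # folder to watch (assumption)
    # The changed blob's path is injected into this DataPath parameter.
    data_path_parameter_name="input_datapath",
    polling_interval=5,                      # minutes between polls
)
```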