
Copyright (c) Microsoft Corporation. All rights reserved.
Licensed under the MIT License.


How to Publish a Pipeline and Invoke the REST endpoint

In this notebook, we will see how to publish a pipeline and then invoke its REST endpoint.

Prerequisites and Azure Machine Learning Basics

If you are using an Azure Machine Learning Notebook VM, you are all set. Otherwise, make sure you go through the configuration Notebook first if you haven't. This sets you up with a working config file that has information on your workspace, subscription id, etc.

Initialization Steps

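As a minimal sketch of the initialization (assuming azureml-core is installed and the config file written by the configuration notebook is present):

```python
# Load the workspace from the config.json created by the configuration notebook.
from azureml.core import Workspace

ws = Workspace.from_config()
print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep="\n")
```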

Compute Targets

Retrieve an already attached Azure Machine Learning Compute

Note that if you have an AzureML Data Scientist role, you will not have permission to create compute resources. Talk to your workspace or IT admin to create the compute targets described in this section, if they do not already exist.

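A sketch of retrieving an already attached AmlCompute target; the name "cpu-cluster" is an assumption, so substitute the name of a compute target that exists in your workspace:

```python
from azureml.core import Workspace

ws = Workspace.from_config()

# "cpu-cluster" is a placeholder name; use a compute target that your
# workspace or IT admin has already created.
aml_compute_target = "cpu-cluster"
if aml_compute_target not in ws.compute_targets:
    raise RuntimeError(
        f"Compute target '{aml_compute_target}' not found; "
        "ask your workspace admin to create it."
    )
aml_compute = ws.compute_targets[aml_compute_target]
print("Found existing compute target:", aml_compute.name)
```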

Building Pipeline Steps with Inputs and Outputs

A step in the pipeline can take a dataset as input. This dataset can be a data source that lives in one of the accessible data locations, or intermediate data produced by a previous step in the pipeline.

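As an illustrative sketch (the datastore path "sample_data/" is an assumption; point it at your own data), the input dataset and the intermediate outputs can be declared like this:

```python
from azureml.core import Dataset, Workspace
from azureml.pipeline.core import PipelineData

ws = Workspace.from_config()
def_blob_store = ws.get_default_datastore()

# Input: a FileDataset over data already uploaded to the default datastore.
input_dataset = Dataset.File.from_files(path=(def_blob_store, "sample_data/"))

# Intermediate data produced by one step and consumed by a later one.
processed_data1 = PipelineData("processed_data1", datastore=def_blob_store)
processed_data2 = PipelineData("processed_data2", datastore=def_blob_store)
```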

Define a Step that consumes a dataset and produces intermediate data

In this step, we define a step that consumes a dataset and produces intermediate data.

Open train.py on your local machine and examine the arguments, inputs, and outputs for the script. That will give you a good sense of why the script argument names used below are important.

The best practice is to use a separate folder for each step's script and its dependent files, and to specify that folder as the step's source_directory. This reduces the size of the snapshot created for the step, since only that folder is snapshotted. Because a change to any file in the source_directory triggers a re-upload of the snapshot, keeping the folders separate also preserves step reuse when nothing in a step's own source_directory has changed.

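A hedged sketch of such a step; the argument names ("--input_data", "--output"), the named input "raw_data", and the "train" folder are assumptions that must match what train.py actually parses, and input_dataset, processed_data1, and aml_compute are the objects defined in the cells above:

```python
from azureml.pipeline.steps import PythonScriptStep

trainStep = PythonScriptStep(
    name="train_step",
    script_name="train.py",
    arguments=["--input_data", input_dataset.as_named_input("raw_data").as_mount(),
               "--output", processed_data1],
    outputs=[processed_data1],
    compute_target=aml_compute,
    source_directory="train",  # a dedicated folder per step keeps the snapshot small
)
```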

Define a Step that consumes intermediate data and produces intermediate data

In this step, we define a step that consumes intermediate data and produces intermediate data.

Open extract.py on your local machine and examine the arguments, inputs, and outputs for the script. That will give you a good sense of why the script argument names used below are important.

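A sketch along the same lines; processed_data1, processed_data2, and aml_compute come from the cells above, and the argument names are assumptions that must match extract.py:

```python
from azureml.pipeline.steps import PythonScriptStep

extractStep = PythonScriptStep(
    name="extract_step",
    script_name="extract.py",
    arguments=["--input_extract", processed_data1,
               "--output_extract", processed_data2],
    inputs=[processed_data1],    # intermediate data from the previous step
    outputs=[processed_data2],
    compute_target=aml_compute,
    source_directory="extract",
)
```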

Define a Step that consumes multiple intermediate data and produces intermediate data

In this step, we define a step that consumes multiple intermediate data and produces intermediate data.

PipelineParameter

This step also takes a PipelineParameter argument, which lets callers override its value when invoking the REST endpoint of the published pipeline.

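A minimal sketch; the parameter name and default value are illustrative:

```python
from azureml.pipeline.core import PipelineParameter

# A parameter with a default value; a caller of the published pipeline's REST
# endpoint can override it per run through "ParameterAssignments".
pipeline_param = PipelineParameter(name="pipeline_arg", default_value=10)
```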

Open compare.py on your local machine and examine the arguments, inputs, and outputs for the script. That will give you a good sense of why the script argument names used below are important.

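A hedged sketch of the final step; processed_data1, processed_data2, def_blob_store, aml_compute, and pipeline_param come from the cells above, and the argument names are assumptions that must match compare.py:

```python
from azureml.pipeline.core import PipelineData
from azureml.pipeline.steps import PythonScriptStep

processed_data3 = PipelineData("processed_data3", datastore=def_blob_store)

compareStep = PythonScriptStep(
    name="compare_step",
    script_name="compare.py",
    arguments=["--compare_data1", processed_data1,
               "--compare_data2", processed_data2,
               "--output_compare", processed_data3,
               "--pipeline_param", pipeline_param],
    inputs=[processed_data1, processed_data2],  # multiple intermediate inputs
    outputs=[processed_data3],
    compute_target=aml_compute,
    source_directory="compare",
)
```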

Build the pipeline

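A sketch of assembling the pipeline from the steps above; because compareStep consumes the outputs of the earlier steps, listing it alone is enough for the SDK to resolve the whole graph:

```python
from azureml.pipeline.core import Pipeline

pipeline1 = Pipeline(workspace=ws, steps=[compareStep])
pipeline1.validate()  # surfaces wiring problems before anything is submitted
```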

Run published pipeline

Publish the pipeline

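A sketch of publishing, assuming pipeline1 from the previous cell; the name and description are illustrative:

```python
published_pipeline1 = pipeline1.publish(
    name="My_Published_Pipeline",            # illustrative name
    description="Sample published pipeline",
    continue_on_step_failure=True,
)
print(published_pipeline1.endpoint)          # the REST endpoint invoked later
```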

Note: the continue_on_step_failure parameter specifies whether the execution of steps in the Pipeline will continue if one step fails. The default value is False, meaning when one step fails, the Pipeline execution will stop, canceling any running steps.

Publish the pipeline from a submitted PipelineRun

It is also possible to publish a pipeline from a submitted PipelineRun.

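A sketch of that flow, assuming pipeline1 from the cells above; the experiment name and version are illustrative:

```python
from azureml.core import Experiment

# Submit the pipeline, wait for it, then publish from the resulting PipelineRun.
pipeline_run = Experiment(ws, "Pipeline_Experiment").submit(pipeline1)
pipeline_run.wait_for_completion()

published_pipeline2 = pipeline_run.publish_pipeline(
    name="My_Published_Pipeline_From_Run",
    description="Published from a submitted PipelineRun",
    version="0.1",
)
```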

Get published pipeline

You can get the published pipeline using its pipeline id.

To get all the published pipelines for a given workspace(ws):

	all_pub_pipelines = PublishedPipeline.get_all(ws)

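A sketch of both retrieval paths, taking the id from the pipeline published above:

```python
from azureml.pipeline.core import PublishedPipeline

# Fetch a single published pipeline by id ...
pipeline_id = published_pipeline1.id
published = PublishedPipeline.get(ws, pipeline_id)
print(published.name, published.endpoint)

# ... or enumerate everything published in the workspace.
all_pub_pipelines = PublishedPipeline.get_all(ws)
```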

Run published pipeline using its REST endpoint

Calling the REST endpoint requires an Azure Active Directory authentication header for the AML workspace.

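A hedged sketch of the invocation, using published_pipeline1 from the publish cell above; interactive login is one authentication option (service principal auth also works), and the experiment name is illustrative:

```python
import requests
from azureml.core.authentication import InteractiveLoginAuthentication

auth = InteractiveLoginAuthentication()
aad_token = auth.get_authentication_header()

rest_endpoint = published_pipeline1.endpoint

# "ExperimentName" names the experiment the run is filed under;
# "ParameterAssignments" overrides the PipelineParameter default set earlier.
response = requests.post(
    rest_endpoint,
    headers=aad_token,
    json={"ExperimentName": "Rest_Invocation_Experiment",
          "ParameterAssignments": {"pipeline_arg": 45}},
)
response.raise_for_status()
print("Submitted run id:", response.json().get("Id"))
```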

Next: Data Transfer

The next notebook will showcase data transfer steps between different types of data stores.