Azure Aml Pipelines How To Use Azurebatch To Run A Windows Executable

Azure Machine Learning Pipeline with AzureBatchStep

This notebook demonstrates how to use AzureBatchStep in an Azure Machine Learning pipeline. An AzureBatchStep submits a job to an Azure Batch compute target to run a simple Windows executable.

Azure Machine Learning and Pipeline SDK-specific Imports

[ ]

Initialize Workspace

Initialize a workspace object from persisted configuration. Make sure the config file is present at .\config.json

If you are using an Azure Machine Learning Notebook VM, you are all set. Otherwise, if you don't have a config.json file, go through the configuration notebook located here.

This sets you up with a working config file that has information on your workspace, subscription id, etc.
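
A minimal sketch of this step, assuming the azureml-core (v1) SDK. `Workspace.from_config` is the SDK call. The helper names `missing_config_keys` and `load_workspace`, and the key check itself, are illustrative additions, not part of the notebook.

```python
import json
from pathlib import Path

# Keys a workspace config.json is expected to contain.
REQUIRED_KEYS = ("subscription_id", "resource_group", "workspace_name")


def missing_config_keys(config_path="config.json"):
    """Return the required keys that are absent from the config file."""
    config = json.loads(Path(config_path).read_text())
    return [key for key in REQUIRED_KEYS if key not in config]


def load_workspace(config_path="config.json"):
    """Load the workspace described by config_path (hypothetical helper)."""
    # Imported here so the key check above works without the SDK installed.
    from azureml.core import Workspace

    missing = missing_config_keys(config_path)
    if missing:
        raise ValueError(f"{config_path} is missing keys: {missing}")
    return Workspace.from_config(path=config_path)
```

Calling `ws = load_workspace()` from the notebook's directory should then return the workspace object used in the rest of the notebook.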

[ ]

Attach Batch Compute to Workspace

To submit jobs to Azure Batch service, you must attach your Azure Batch account to the workspace.
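
Attaching the account could look roughly like the sketch below, assuming the azureml-core (v1) SDK. The function name `attach_batch_compute` and its parameter names are hypothetical. The compute name, resource group, and Batch account name are values you supply.

```python
def attach_batch_compute(ws, compute_name, resource_group, account_name):
    """Return the Batch compute target, attaching the account if needed."""
    # Imported here so the sketch can be loaded without the SDK installed.
    from azureml.core.compute import BatchCompute, ComputeTarget
    from azureml.exceptions import ComputeTargetException

    try:
        # Reuse the compute target if it is already attached to the workspace.
        return BatchCompute(ws, compute_name)
    except ComputeTargetException:
        # Not attached yet: describe the Batch account and attach it.
        config = BatchCompute.attach_configuration(
            resource_group=resource_group, account_name=account_name
        )
        batch_compute = ComputeTarget.attach(ws, compute_name, config)
        batch_compute.wait_for_completion()
        return batch_compute
```

The try/except makes the call safe to re-run: a second run finds the existing compute target instead of attaching it again.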

[ ]

Setup Datastore

This step sets up the Blob storage associated with the workspace.
The following call retrieves the Azure Blob store associated with your workspace.
Note that workspaceblobstore is the name of this store; it cannot be changed and must be used as is.

If you want to register another Datastore, please follow the instructions from here: https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-access-data#register-a-datastore
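
Retrieving the workspace blob store can be sketched as follows, assuming the azureml-core (v1) SDK and a workspace object `ws`. The constant and function names are illustrative.

```python
# Fixed name of the blob datastore that comes with every workspace.
DEFAULT_BLOB_DATASTORE = "workspaceblobstore"


def get_workspace_blob_datastore(ws):
    """Look up the workspace's blob datastore by its fixed name."""
    # Imported here so the sketch can be loaded without the SDK installed.
    from azureml.core import Datastore

    return Datastore.get(ws, DEFAULT_BLOB_DATASTORE)
```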

[ ]

Setup Input and Output

For this example, we upload a file to the provided Datastore. The following helper methods do that.
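
One way such helpers might look, using only the standard library plus the datastore's `upload` method. The name `upload_file_to_datastore` is the one the notebook refers to. `create_local_file` and the exact signatures are assumptions.

```python
import os
import tempfile


def create_local_file(content, file_name):
    """Write content to file_name inside a new temporary directory."""
    temp_dir = tempfile.mkdtemp()
    with open(os.path.join(temp_dir, file_name), "w") as f:
        f.write(content)
    return temp_dir


def upload_file_to_datastore(datastore, file_name, content):
    """Create the file locally, then upload its directory to the datastore."""
    src_dir = create_local_file(content=content, file_name=file_name)
    # upload() copies the files under src_dir to the datastore.
    datastore.upload(src_dir=src_dir, overwrite=True, show_progress=True)
    return src_dir
```

For example, `upload_file_to_datastore(datastore, "input.txt", "this is the content of the file")` would place `input.txt` at the root of the datastore.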

[ ]

Here we associate the input DataReference with an existing file in the provided Datastore. Feel free to upload the file of your choice manually or use the upload_file_to_datastore method.
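
A sketch of the input and output bindings, assuming the azureml-core and azureml-pipeline-core (v1) SDKs. The function name `make_step_io`, the file name `input.txt`, and the binding names `input` and `output` are illustrative choices.

```python
def make_step_io(datastore, file_name="input.txt"):
    """Return (input reference, output placeholder) for the Batch step."""
    # Imported here so the sketch can be loaded without the SDK installed.
    from azureml.data.data_reference import DataReference
    from azureml.pipeline.core import PipelineData

    # Points at a file that already exists in the datastore.
    input_ref = DataReference(
        datastore=datastore,
        path_on_datastore=file_name,
        data_reference_name="input",
    )
    # Placeholder for the data the job will write.
    output_data = PipelineData(name="output", datastore=datastore)
    return input_ref, output_data
```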

[ ]

Setup AzureBatch Job Binaries

Azure Batch runs a task within the job; here we provide a simple .cmd file for it to execute. Feel free to put any binaries in the folder or modify the .cmd file as needed. The folder contents are uploaded when we create the AzureBatchStep.
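
As an illustration, the binaries folder could be prepared like this. The file name `azurebatch.cmd` and the command it contains (copy the first argument to the second) are assumptions. Any Windows command or executable would do.

```python
import os


def write_job_binaries(folder, file_name="azurebatch.cmd"):
    """Create the binaries folder and a one-line .cmd script inside it."""
    os.makedirs(folder, exist_ok=True)
    cmd_path = os.path.join(folder, file_name)
    with open(cmd_path, "w") as f:
        # %1 and %2 are the first and second command-line arguments.
        f.write('copy "%1" "%2"')
    return cmd_path
```

Calling `write_job_binaries("azurebatch/job_binaries")` creates the folder if needed and returns the path of the script.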

[ ]

Create an AzureBatchStep

AzureBatchStep is used to submit a job to the attached Azure Batch compute.

name: Name of the step
pool_id: Name of the pool; it can be an existing pool or one that is created when the job is submitted
inputs: List of inputs that will be processed by the job
outputs: List of outputs the job will create
executable: The executable that will run as part of the job
arguments: Arguments for the executable. They can be plain strings, inputs, outputs, or parameters
compute_target: The compute target where the job will run.
source_directory: The local directory with binaries to be executed by the job

Optional parameters:

create_pool: Boolean flag to indicate whether to create the pool before running the jobs
delete_batch_job_after_finish: Boolean flag to indicate whether to delete the job from Batch account after it's finished
delete_batch_pool_after_finish: Boolean flag to indicate whether to delete the pool after the job finishes
is_positive_exit_code_failure: Boolean flag to indicate whether the job fails if the task exits with a positive exit code
vm_image_urn: The VM image to use if create_pool is true and the VM uses VirtualMachineConfiguration.
  Value format: 'urn:publisher:offer:sku'.
  Example: urn:MicrosoftWindowsServer:WindowsServer:2012-R2-Datacenter
  For more details, see:
  https://docs.microsoft.com/en-us/azure/virtual-machines/windows/cli-ps-findimage#table-of-commonly-used-windows-images
  https://docs.microsoft.com/en-us/azure/virtual-machines/linux/cli-ps-findimage#find-specific-images
run_task_as_admin: Boolean flag to indicate if the task should run with Admin privileges
target_compute_nodes: Number of compute nodes to add to the pool; applies only if create_pool is true
source_directory: Local folder that contains the module binaries, executable, assemblies etc.
executable: Name of the command/executable that will be executed as part of the job
arguments: Arguments for the command/executable
inputs: List of input port bindings
outputs: List of output port bindings
vm_size: Virtual machine size of the compute nodes; applies only if create_pool is true
compute_target: BatchCompute compute
allow_reuse: Whether the module should reuse previous results when run with the same settings/inputs
version: A version tag to denote a change in functionality for the module
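
Putting the parameters above together, a step could be built roughly as follows, assuming the azureml-pipeline-steps (v1) SDK. The helper names, the step name `Azure Batch Job`, the pool name `MyPoolName`, and the URN check are illustrative. The keyword arguments passed to AzureBatchStep are the ones listed above.

```python
def is_valid_vm_image_urn(urn):
    """Check the 'urn:publisher:offer:sku' shape described above."""
    parts = urn.split(":")
    return len(parts) == 4 and parts[0] == "urn" and all(parts[1:])


def build_batch_step(batch_compute, binaries_folder, input_ref, output_data,
                     pool_id="MyPoolName", create_pool=False,
                     vm_image_urn="urn:MicrosoftWindowsServer:WindowsServer:2012-R2-Datacenter"):
    """Create an AzureBatchStep that runs azurebatch.cmd (hypothetical helper)."""
    # Imported here so the URN check can be used without the SDK installed.
    from azureml.pipeline.steps import AzureBatchStep

    # The image only matters when the step creates the pool itself.
    if create_pool and not is_valid_vm_image_urn(vm_image_urn):
        raise ValueError(f"Expected 'urn:publisher:offer:sku', got {vm_image_urn!r}")

    return AzureBatchStep(
        name="Azure Batch Job",
        pool_id=pool_id,
        create_pool=create_pool,
        vm_image_urn=vm_image_urn,
        inputs=[input_ref],
        outputs=[output_data],
        executable="azurebatch.cmd",
        # The input and output are passed to the executable as arguments.
        arguments=[input_ref, output_data],
        compute_target=batch_compute,
        source_directory=binaries_folder,
    )
```

Checking the URN before constructing the step surfaces a malformed value locally rather than after the job has been submitted.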

[ ]

Build and Submit the Pipeline
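
A minimal sketch of building and submitting the pipeline, assuming the azureml-core and azureml-pipeline-core (v1) SDKs. The function name and the experiment name `azurebatch_sample` are illustrative.

```python
def submit_pipeline(ws, steps, experiment_name="azurebatch_sample"):
    """Build a pipeline from the given steps and submit it as an experiment run."""
    # Imported here so the sketch can be loaded without the SDK installed.
    from azureml.core import Experiment
    from azureml.pipeline.core import Pipeline

    pipeline = Pipeline(workspace=ws, steps=steps)
    return Experiment(ws, experiment_name).submit(pipeline)
```

With a single Batch step, the call would be `pipeline_run = submit_pipeline(ws, [step])`.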

[ ]

Visualize the Running Pipeline
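
One way to monitor the run, assuming the azureml-widgets package is available in the notebook environment. The function name `show_run` and the `use_widget` flag are hypothetical. Outside a notebook, streaming the logs with `wait_for_completion` is a plain-text alternative to the widget.

```python
def show_run(pipeline_run, use_widget=True):
    """Display the run in a notebook widget, or stream its logs as text."""
    if use_widget:
        # Imported here so the text fallback works without azureml-widgets.
        from azureml.widgets import RunDetails

        RunDetails(pipeline_run).show()
    else:
        # Blocks until the run finishes, printing log output as it arrives.
        pipeline_run.wait_for_completion(show_output=True)
```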

[ ]