Copyright (c) Microsoft Corporation. All rights reserved.

Licensed under the MIT License.

Tensorboard Integration with Run History

  1. Run a TensorFlow job locally and view its Tensorboard output live.
  2. Do the same on a Data Science Virtual Machine (DSVM).
  3. Do it once more on an AmlCompute cluster.
  4. Finally, collect all of these historical runs into a single Tensorboard graph.

Prerequisites

  • Understand the architecture and terms introduced by Azure Machine Learning
  • If you are using an Azure Machine Learning Notebook VM, you are all set. Otherwise, go through the configuration notebook to:
    • install the AML SDK
    • create a workspace and its configuration file (config.json)

Diagnostics

Opt in to diagnostics collection to help improve the experience, quality, and security of future releases.

Initialize Workspace

Initialize a workspace object from persisted configuration.
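
As a minimal sketch, assuming the v1 `azureml-core` SDK and a `config.json` containing the usual `subscription_id`, `resource_group`, and `workspace_name` keys, loading the workspace might look like the following. The `check_config` and `load_workspace` helpers are hypothetical names introduced here for illustration; only `Workspace.from_config` comes from the SDK.

```python
import json
from pathlib import Path

# Keys that a workspace config.json is assumed to contain.
REQUIRED_KEYS = ("subscription_id", "resource_group", "workspace_name")


def check_config(path="config.json"):
    """Return the parsed config, raising ValueError if a required key is missing."""
    config = json.loads(Path(path).read_text())
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError(f"{path} is missing keys: {', '.join(missing)}")
    return config


def load_workspace(path="config.json"):
    """Validate the config file, then build a Workspace object from it."""
    # Imported lazily so check_config() stays usable without the SDK installed.
    from azureml.core import Workspace

    check_config(path)
    return Workspace.from_config(path=path)
```

Calling `ws = load_workspace()` from the notebook's folder would then give the workspace object that the later steps refer to.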

Set experiment name and create project

Choose a name for your run history container in the workspace, and create a folder for the project.
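
A rough sketch of this step, assuming the v1 SDK's `Experiment` class. The experiment name and folder path used as defaults below are placeholders, not values taken from this notebook.

```python
import os


def create_project_folder(project_folder="./sample_projects/tensorboard-demo"):
    """Create the project folder if needed and return its absolute path."""
    os.makedirs(project_folder, exist_ok=True)
    return os.path.abspath(project_folder)


def get_experiment(ws, experiment_name="tensorboard-demo"):
    """Return the run history container with the given name in workspace ws."""
    # Imported lazily so the folder helper works without the SDK installed.
    from azureml.core import Experiment

    return Experiment(workspace=ws, name=experiment_name)
```

Every run submitted through the returned experiment object is recorded under that name in the workspace.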

Download Tensorflow Tensorboard demo code

TensorFlow's repository includes an MNIST demo with extensive Tensorboard instrumentation, which we'll use here.

No code changes are needed: the script from the TensorFlow repository works without modification.
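
One way to fetch the script is sketched below with the standard library only. The URL is an assumption about where `mnist_with_summaries.py` lives (an older release branch of the TensorFlow repository); adjust it to the branch you actually use. `download_demo` is a hypothetical helper name.

```python
import os
import urllib.request

# Assumed location of the demo script; not confirmed by this notebook.
DEMO_URL = (
    "https://raw.githubusercontent.com/tensorflow/tensorflow/r1.8/"
    "tensorflow/examples/tutorials/mnist/mnist_with_summaries.py"
)


def download_demo(project_folder, url=DEMO_URL):
    """Download the demo script into project_folder and return the local path."""
    os.makedirs(project_folder, exist_ok=True)
    destination = os.path.join(project_folder, os.path.basename(url))
    urllib.request.urlretrieve(url, destination)
    return destination
```

Placing the file in the project folder means it is picked up as part of the source directory when the run is submitted.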

Configure and run locally

We'll start with a local run. This may not seem useful at first, since you could simply point Tensorboard at the files generated locally, but the feature still adds value here. Your local run is registered in the run history, and your Tensorboard logs are uploaded to the artifact store associated with the run. Later, you can restore the logs from any run, regardless of where it ran.

For this run, you need to install TensorFlow on your local machine yourself. The Tensorboard module (the one included with TensorFlow) must also be accessible to this notebook's kernel, because the local machine is what runs Tensorboard. Finally, you need the azureml-tensorboard package installed.
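
The sketch below first checks that the required modules are importable from the current kernel, then shows one plausible way to configure and submit the local run with the v1 SDK. It assumes the demo script accepts `--log_dir` and that event files written under `./logs` are the ones uploaded with the run. `missing_modules` and `submit_local_run` are hypothetical helper names.

```python
import importlib.util


def missing_modules(names=("tensorflow", "tensorboard", "azureml.tensorboard")):
    """Return the module names that cannot be found by the current kernel."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package (e.g. "azureml") is itself missing.
            found = False
        if not found:
            missing.append(name)
    return missing


def submit_local_run(exp, project_folder, logs_dir="./logs"):
    """Submit the demo script as a local run that uses the existing Python env."""
    from azureml.core import ScriptRunConfig
    from azureml.core.runconfig import RunConfiguration

    run_config = RunConfiguration()
    # Use the packages already installed locally instead of building an env.
    run_config.environment.python.user_managed_dependencies = True

    src = ScriptRunConfig(
        source_directory=project_folder,
        script="mnist_with_summaries.py",
        arguments=["--log_dir", logs_dir],
        run_config=run_config,
    )
    return exp.submit(src)
```

Running `missing_modules()` before submitting gives a quick list of anything that still needs to be installed.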

Start Tensorboard

Now, while the run is in progress, we just need to start Tensorboard with the run as its target, and it will begin streaming logs.
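
A minimal sketch of starting the service, assuming the `Tensorboard` class from the `azureml-tensorboard` package, which takes a list of runs and whose `start()` method returns the URL to open. The `start_tensorboard` wrapper and its `tensorboard_cls` parameter are hypothetical, added so the logic can be exercised without the SDK.

```python
def start_tensorboard(runs, tensorboard_cls=None):
    """Start a Tensorboard instance for the given runs; return (instance, url)."""
    if tensorboard_cls is None:
        # Default to the class shipped in the azureml-tensorboard package.
        from azureml.tensorboard import Tensorboard as tensorboard_cls

    tb = tensorboard_cls(runs)
    url = tb.start()  # begins streaming logs from the runs
    return tb, url
```

For a single run in progress this would be called as `tb, url = start_tensorboard([run])`, keeping `tb` around so it can be stopped later.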

Stop Tensorboard

When you're done, make sure to call the stop() method of the Tensorboard object, or it will stay running even after your job completes.
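
Because a forgotten `stop()` leaves the process running, one option is to wrap the start/stop pair in a context manager. This is a sketch of that idea rather than something the SDK provides; `tensorboard_session` is a hypothetical helper that works with any object exposing `start()` and `stop()`.

```python
from contextlib import contextmanager


@contextmanager
def tensorboard_session(tb):
    """Start tb on entry and always stop it on exit, even if an error occurs."""
    url = tb.start()
    try:
        yield url
    finally:
        tb.stop()
```

Inside `with tensorboard_session(tb) as url:` you could, for example, wait for the run to finish; the instance is stopped when the block exits.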

Now, with a DSVM

Tensorboard uploading works with all compute targets. Here we demonstrate it from a DSVM. Note that the Tensorboard instance itself will be run by the notebook kernel. Again, this means this notebook's kernel must have access to the Tensorboard module.

If you are unfamiliar with DSVM configuration, check Train in a remote VM for a more detailed breakdown.

Note: To streamline the compute that Azure Machine Learning creates, we are making updates to support creating only single-node to multi-node AmlCompute. The DSVMCompute class will be deprecated in a later release, but a DSVM can still be created with the single-line command below and then attached (like any VM) using the sample code below. Also note that only Linux VMs are supported for remote execution from AML, so the command below creates a Linux VM.

    # Create a DSVM in your resource group.
    # Note: you need to be at least a contributor to the resource group to run this command successfully.
    (myenv) $ az vm create --resource-group <resource_group_name> --name <some_vm_name> --image microsoft-dsvm:linux-data-science-vm-ubuntu:linuxdsvmubuntu:latest --admin-username <username> --admin-password <password> --generate-ssh-keys --authentication-type password

You can also use this url to create the VM using the Azure Portal.
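
Once the VM exists, attaching it as a compute target could look roughly like this, assuming the v1 SDK's `RemoteCompute` and `ComputeTarget` classes. The `attach_dsvm` helper, the default target name, and the reuse-if-present logic are illustrative assumptions; the address and credentials are whatever you chose when creating the VM.

```python
def attach_dsvm(ws, address, username, password, name="cpu-dsvm", ssh_port=22):
    """Return the compute target called name, attaching the VM if necessary."""
    from azureml.core.compute import ComputeTarget, RemoteCompute
    from azureml.core.compute_target import ComputeTargetException

    try:
        # Reuse the target if it is already attached to the workspace.
        return RemoteCompute(workspace=ws, name=name)
    except ComputeTargetException:
        config = RemoteCompute.attach_configuration(
            address=address,
            ssh_port=ssh_port,
            username=username,
            password=password,
        )
        target = ComputeTarget.attach(ws, name, config)
        target.wait_for_completion(show_output=True)
        return target
```

Avoid hard-coding the password in the notebook; reading it with `getpass.getpass()` is one simple alternative.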

Submit run using TensorFlow estimator

Instead of manually configuring the DSVM environment, we can use the TensorFlow estimator and everything is set up automatically.
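
A sketch of the estimator-based submission, assuming the `TensorFlow` estimator class from `azureml.train.dnn` in the v1 SDK and a demo script that takes `--log_dir`. The `submit_with_estimator` wrapper and its `estimator_cls` parameter are hypothetical, introduced so the call can be checked without the SDK.

```python
def submit_with_estimator(exp, project_folder, compute_target,
                          logs_dir="./logs", estimator_cls=None):
    """Build a TensorFlow estimator for the demo script and submit it."""
    if estimator_cls is None:
        # Default to the estimator shipped with the v1 SDK.
        from azureml.train.dnn import TensorFlow as estimator_cls

    estimator = estimator_cls(
        source_directory=project_folder,
        compute_target=compute_target,
        entry_script="mnist_with_summaries.py",
        script_params={"--log_dir": logs_dir},
    )
    return exp.submit(estimator)
```

The same call works for any compute target object, so only the `compute_target` argument changes between remote targets.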

Start Tensorboard with this run

Just like before.

Stop Tensorboard

When you're done, make sure to call the stop() method of the Tensorboard object, or it will stay running even after your job completes.

Once more, with an AmlCompute cluster

Just to prove we can, let's create an AmlCompute CPU cluster, and run our demo there, as well.

Note that if you have an AzureML Data Scientist role, you will not have permission to create compute resources. Talk to your workspace or IT admin to create the compute targets described in this section, if they do not already exist.
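
The usual get-or-create pattern for a cluster is sketched below, assuming the v1 SDK's `AmlCompute` and `ComputeTarget` classes. The cluster name, VM size, and node count are placeholder defaults, and `get_or_create_cluster` is a hypothetical helper name.

```python
def get_or_create_cluster(ws, cluster_name="cpu-cluster",
                          vm_size="STANDARD_D2_V2", max_nodes=4):
    """Return the named AmlCompute cluster, creating it if it does not exist."""
    from azureml.core.compute import AmlCompute, ComputeTarget
    from azureml.core.compute_target import ComputeTargetException

    try:
        # Reuse an existing cluster with this name if there is one.
        return ComputeTarget(workspace=ws, name=cluster_name)
    except ComputeTargetException:
        config = AmlCompute.provisioning_configuration(
            vm_size=vm_size, max_nodes=max_nodes
        )
        cluster = ComputeTarget.create(ws, cluster_name, config)
        cluster.wait_for_completion(
            show_output=True, min_node_count=None, timeout_in_minutes=20
        )
        return cluster
```

If the cluster was created for you by an admin, the first branch simply returns it and no create permission is needed.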

Submit run using TensorFlow estimator

Again, we can use the TensorFlow estimator and everything is set up automatically.

Start Tensorboard with this run

Once more...

Stop Tensorboard

When you're done, make sure to call the stop() method of the Tensorboard object, or it will stay running even after your job completes.

Finale

If you've paid close attention, you'll have noticed that we've been saving the run objects in an array as we went along. We can start a Tensorboard instance that combines all of these run objects into a single process. This way, you can compare historical runs. You can even do this with live runs; if you made some of those previous runs longer via the --max_steps parameter, they might still be running, and you'll see them live in this instance as well.
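
If the saved run objects are no longer in memory (after a kernel restart, say), a possible alternative is to pull recent runs back from the experiment's history and hand them to a single Tensorboard instance. This sketch assumes `Experiment.get_runs()` yields runs newest first and that the `Tensorboard` class accepts a list of runs; `tensorboard_for_recent_runs` and its parameters are hypothetical.

```python
from itertools import islice


def tensorboard_for_recent_runs(exp, limit=3, tensorboard_cls=None):
    """Start one Tensorboard instance over the most recent runs of exp."""
    if tensorboard_cls is None:
        from azureml.tensorboard import Tensorboard as tensorboard_cls

    # get_runs() is a generator, so islice avoids fetching the full history.
    runs = list(islice(exp.get_runs(), limit))
    if not runs:
        raise ValueError("the experiment has no runs to display")

    tb = tensorboard_cls(runs)
    url = tb.start()
    return tb, url
```

With the runs collected during this notebook still available, passing that list straight to `Tensorboard(...)` achieves the same combined view.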

Stop Tensorboard

As you might already know, make sure to call the stop() method of the Tensorboard object, or it will stay running (until you kill the kernel associated with this notebook, at least).
