Set up an Azure ML workspace

This tutorial gets you started with the Azure Machine Learning service by walking through the requirements and instructions for setting up a workspace, the top-level resource for Azure ML.

You do not need to run this if you are working on an Azure Machine Learning Compute Instance, as the compute instance is already associated with an existing workspace.

What is an Azure ML workspace?

The workspace is the top-level resource for Azure ML, providing a centralized place to work with all the artifacts you create when you use Azure ML. The workspace keeps a history of all training runs, including logs, metrics, output, and a snapshot of your scripts.

When you create a new workspace, it automatically creates several Azure resources that are used by the workspace:

  • Azure Container Registry: Registers the Docker containers that you use during training and when you deploy a model. To minimize costs, ACR is lazy-loaded: it is not created until deployment images are built.
  • Azure Storage account: Used as the default datastore for the workspace.
  • Azure Application Insights: Stores monitoring information about your models.
  • Azure Key Vault: Stores secrets that are used by compute targets and other sensitive information that’s needed by the workspace.

Setup

This section describes the steps required before you can access any Azure ML service functionality.

Azure subscription

In order to create an Azure ML workspace, first you need access to an Azure subscription. An Azure subscription allows you to manage storage, compute, and other assets in the Azure cloud. You can create a new subscription or access existing subscription information from the Azure portal. Later in this tutorial you will need information such as your subscription ID in order to create and access workspaces.

Azure ML SDK installation

Follow the installation guide to install azuremlsdk on your machine.

Configure your workspace

Workspace parameters

To use an Azure ML workspace, you will need to supply the following information:

  • Your subscription ID
  • A resource group name
  • (Optional) The region that will host your workspace
  • A name for your workspace

You can get your subscription ID from the Azure portal.

You will also need access to a resource group, which organizes Azure resources and provides a default region for the resources in the group. You can see which resource groups you have access to, or create a new one, in the Azure portal. If you don’t have a resource group, the create_workspace() method will create one for you using the name you provide.

The region to host your workspace will be used if you are creating a new workspace. You do not need to specify this if you are using an existing workspace. You can find the list of supported regions here. You should pick a region that is close to your location or that contains your data.

The name for your workspace is unique within the subscription and should be descriptive enough to discern among other workspaces. The subscription may be used only by you, or it may be used by your department or your entire enterprise, so choose a name that makes sense for your situation.

The following code chunk allows you to specify your workspace parameters. It uses Sys.getenv to read values from environment variables, which is useful for automation. If no environment variable exists, the parameters will be set to the specified default values. Replace the default values in the code below with your default parameter values.

subscription_id <- Sys.getenv("SUBSCRIPTION_ID", unset = "<my-subscription-id>")
resource_group <- Sys.getenv("RESOURCE_GROUP", unset = "<my-resource-group>")
workspace_name <- Sys.getenv("WORKSPACE_NAME", unset = "<my-workspace-name>")
workspace_region <- Sys.getenv("WORKSPACE_REGION", unset = "eastus2")
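
Because the defaults above are placeholders, a quick check before calling the service can catch values that were never replaced. The following is a minimal sketch in base R. The helper name find_unset_params and the convention that a value wrapped in angle brackets is a placeholder are assumptions made for illustration; neither is part of azuremlsdk.

```r
# Hypothetical helper (not part of azuremlsdk): returns the names of
# parameters that are empty or still hold an "<...>" placeholder.
find_unset_params <- function(params) {
  is_unset <- vapply(
    params,
    function(value) !nzchar(value) || grepl("^<.*>$", value),
    logical(1)
  )
  names(params)[is_unset]
}

# Collect the workspace parameters the same way as above.
params <- c(
  subscription_id = Sys.getenv("SUBSCRIPTION_ID", unset = "<my-subscription-id>"),
  resource_group  = Sys.getenv("RESOURCE_GROUP", unset = "<my-resource-group>"),
  workspace_name  = Sys.getenv("WORKSPACE_NAME", unset = "<my-workspace-name>")
)

# Report anything that still needs a real value.
missing_params <- find_unset_params(params)
if (length(missing_params) > 0) {
  message("Still unset: ", paste(missing_params, collapse = ", "))
}
```

The region is left out of the check because it already has a usable default and is only needed when creating a new workspace.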

Create a new workspace

If you don’t have an existing workspace and are the owner of the subscription or resource group, you can create a new workspace. If you don’t have a resource group, create_workspace() will create one for you using the name you provide. If you don’t want it to do so, set create_resource_group = FALSE.

Note: As with other Azure services, there are limits on certain resources (e.g. AmlCompute quota) associated with the Azure ML service. Please read this article on the default limits and how to request more quota.

The code below creates an Azure ML workspace for you in a subscription, provided you have the correct permissions.

This will fail if:

  • You do not have permission to create a workspace in the resource group.
  • You do not have permission to create a resource group if it does not exist.
  • You are not a subscription owner or contributor and no Azure ML workspaces have ever been created in this subscription.

If workspace creation fails, please work with your IT admin to provide you with the appropriate permissions or to provision the required resources.

There are additional parameters that are not shown below that can be configured when creating a workspace. Please see create_workspace() for more details.

library(azuremlsdk)

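# `authentication` is not defined earlier in this tutorial. NULL is assumed
# here to fall back to the default login flow; see "Authenticate a workspace"
# for how to build an authenticator object instead.
authentication <- NULL

ws <- create_workspace(name = workspace_name,
                       subscription_id = subscription_id,
                       resource_group = resource_group,
                       location = workspace_region,
                       exist_ok = TRUE,
                       auth = authentication)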
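
Since workspace creation can fail for the permission reasons listed above, it can help to catch the error and report it rather than let the script stop with a raw traceback. The following is a minimal sketch, not the SDK's own mechanism: try_workspace_call is a hypothetical wrapper written for this example, and it assumes that failures from the SDK surface as ordinary R errors.

```r
# Hypothetical wrapper (not part of azuremlsdk): run a workspace call and,
# on error, print the message and return NULL instead of stopping the script.
try_workspace_call <- function(workspace_fn, ...) {
  tryCatch(
    workspace_fn(...),
    error = function(e) {
      message("Workspace call failed: ", conditionMessage(e))
      NULL
    }
  )
}

# Intended use (commented out because it needs azuremlsdk and Azure access):
# ws <- try_workspace_call(azuremlsdk::create_workspace,
#                          name = workspace_name,
#                          subscription_id = subscription_id,
#                          resource_group = resource_group,
#                          location = workspace_region,
#                          exist_ok = TRUE)
# if (is.null(ws)) stop("Ask your IT admin for the required permissions.")
```

Passing the function as an argument keeps the wrapper independent of the SDK, so the same helper can wrap get_workspace() or load_workspace_from_config() as well.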

You can write out the workspace ARM properties to a config file with write_workspace_config(). This provides a simple way of reusing the same workspace across multiple files or projects: save the workspace details with write_workspace_config(), then use load_workspace_from_config() to load the same workspace elsewhere without retyping the workspace ARM properties. By default, the config file is written to the current working directory with “config.json” as the file name. To specify a different path or file name, set the path and file_name parameters.

write_workspace_config(ws)

Access an existing workspace

You can access an existing workspace in a couple of ways. If your workspace properties were previously saved to a config file, you can load the workspace as follows:

ws <- load_workspace_from_config()

If Azure ML cannot find the config file, specify the path to the config file with the path parameter. The method defaults to starting the search in the current directory.

You can also initialize a workspace using the get_workspace() method.

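# `authentication` must be an authenticator object (see "Authenticate a
# workspace"); NULL is assumed to use the default login flow.
ws <- get_workspace(name = workspace_name,
                    subscription_id = subscription_id,
                    resource_group = resource_group,
                    auth = authentication)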
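
The two access paths above can be combined so that a script prefers the config file and falls back to explicit parameters. The following is a minimal sketch under two assumptions: first_success is a hypothetical helper written for this example, and a missing config file is assumed to surface as an R error from load_workspace_from_config().

```r
# Hypothetical helper (not part of azuremlsdk): call each function in order
# and return the first result that does not raise an error.
# Returns NULL if every attempt fails.
first_success <- function(...) {
  for (attempt in list(...)) {
    result <- tryCatch(attempt(), error = function(e) NULL)
    if (!is.null(result)) {
      return(result)
    }
  }
  NULL
}

# Intended use (commented out because it needs azuremlsdk and Azure access):
# ws <- first_success(
#   function() azuremlsdk::load_workspace_from_config(),
#   function() azuremlsdk::get_workspace(name = workspace_name,
#                                        subscription_id = subscription_id,
#                                        resource_group = resource_group)
# )
```

Each attempt is wrapped in a function so that it is only evaluated when the previous one has failed.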

Authenticate a workspace

It will sometimes be necessary to provide authentication when accessing a workspace.

If you receive the error:

AuthenticationException: You don't have access to xxxxxx-xxxx-xxx-xxx-xxxxxxxxxx
subscription. All the subscriptions that you have access to = ...
check that you used the correct login and entered the correct subscription ID.

or

All the subscriptions that you have access to = []

You may have to specify the tenant ID of the Azure Active Directory you’re using in order to gain access. An example would be accessing a subscription as a guest to a tenant that is not your default.

The Azure ML SDK for R supports two authentication methods: service_principal_authentication and interactive_login_authentication. Interactive login authentication is suitable for local experimentation, while service principal authentication is suitable for automated workflows. To use either, construct the authenticator object and pass it as the auth parameter of get_workspace() or create_workspace().

interactive_auth <- interactive_login_authentication(tenant_id="your-tenant-id")

ws <- get_workspace("<your workspace name>",
                    "<your subscription ID>",
                    "<your resource group>",
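                    auth = interactive_auth)

# Service principal authentication: read the secret from an environment
# variable rather than hard-coding it in the script.
svc_pr_password <- Sys.getenv("AZUREML_PASSWORD")
svc_pr <- service_principal_authentication(tenant_id = "my-tenant-id",
                                           service_principal_id = "my-application-id",
                                           service_principal_password = svc_pr_password)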

ws <- get_workspace("<your workspace name>",
                    "<your subscription ID>",
                    "<your resource group>",
                    auth = svc_pr)
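
In an automated workflow, Sys.getenv() returns an empty string when the variable is missing, which would only show up later as an authentication failure. The following is a minimal sketch of failing early instead. The helper require_env and the variable names AZUREML_TENANT_ID and AZUREML_CLIENT_ID are assumptions made for illustration; only AZUREML_PASSWORD appears in the example above.

```r
# Hypothetical helper (not part of azuremlsdk): stop early with a clear
# message if a required environment variable is missing or empty.
require_env <- function(name) {
  value <- Sys.getenv(name, unset = "")
  if (!nzchar(value)) {
    stop("Environment variable '", name, "' is not set.", call. = FALSE)
  }
  value
}

# Intended use (commented out because it needs azuremlsdk and real credentials).
# AZUREML_TENANT_ID and AZUREML_CLIENT_ID are hypothetical variable names.
# svc_pr <- azuremlsdk::service_principal_authentication(
#   tenant_id = require_env("AZUREML_TENANT_ID"),
#   service_principal_id = require_env("AZUREML_CLIENT_ID"),
#   service_principal_password = require_env("AZUREML_PASSWORD")
# )
```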

For more information on the two methods, read the official Azure Machine Learning documentation.