This repository provides the supporting code for my presentation entitled Beyond the Basics with Azure ML.
This data comes from the Chicago Parking Ticket database, courtesy of Daniel Hutmacher. I sampled 1,000,000 records from it and the file I used is available in CSV format.
Import this into Azure ML using the Dataset name ChicagoParkingTicketsFolder. Be sure to upload this as a uri_folder instead of an MLtable or uri_file!
Import the notebook in the Notebook folder into Azure Machine Learning. You will need to create a compute instance to run this.
In order to run the ML pipeline notebooks and jobs locally, you will need to have the following installed on your machine:
- Python (preferably the Anaconda distribution), with
pipinstalled:conda install -c anaconda pip - The Azure CLI
- The Azure ML Azure CLI extension:
az extension add -n ml - Pip packages:
pip install azure-ai-ml,pip install azure-identity - Visual Studio Code
- The Azure ML Visual Studio Code extension
Before you run the code, make sure your console has you logged into Azure via CLI:
az loginThen, create a folder called .azureml and a file named config.json. The file should look like the following structure:
{
"subscription_id": "YOUR SUBSCRIPTION ID",
"resource_group": "YOUR RESOURCE GROUP",
"workspace_name": "YOUR WORKSPACE NAME"
}Note that you must be logged into az cli with an account which has access to the subscription, resource group, and workspace.
From there, run the training code:
python deploy-train.pyYou can see the job in action by going to Azure ML Studio and viewing the "Chicago_Parking_Tickets_Code-First" experiment. There will be a new "train_pipeline" job.
For scoring, run the following code:
python deploy-score.pyThis will create a batch endpoint and deployment, upload data to a Datastore in Azure ML, create a job to generate predictions, and downloads the resulting predictions to a local file called predictions.csv.
IMPORTANT NOTE -- You must explicitly grant rights to the account running deploy-score.py against the Azure ML workspace. I granted Owner because I was running this personally, but it must be explicitly granted and not just have ownership as a side effect of subscription-level or resource group-level rights.
If you do not do this, you will likely get a strange BY_POLICY error message when running this script.