--- title: "Securing an AKS deployment with Traefik" Author: Hong Ooi output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Securing an AKS deployment with Traefik} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{utf8} --- The vignette "Deploying a prediction service with Plumber" showed you how to deploy a simple prediction service to a Kubernetes cluster in Azure. That deployment had a number of drawbacks: chiefly, it was open to the public, and used HTTP and thus was vulnerable to man-in-the-middle (MITM) attacks. Here, we'll show how to address these issues with basic authentication and HTTPS. To run the code in this vignette, you'll need the helm binary installed on your machine in addition to docker and kubectl. ## Model object and image We'll reuse the image from the Plumber vignette. The code and artifacts to build this image are reproduced below for convenience. Save the model object: ```r data(Boston, package="MASS") library(randomForest) # train a model for median house price as a function of the other variables bos_rf <- randomForest(medv ~ ., data=Boston, ntree=100) # save the model saveRDS(bos_rf, "bos_rf.rds") ``` Scoring script: ```r bos_rf <- readRDS("bos_rf.rds") library(randomForest) #* @param df data frame of variables #* @post /score function(req, df) { df <- as.data.frame(df) predict(bos_rf, df) } ``` Dockerfile: ```dockerfile FROM trestletech/plumber # install the randomForest package RUN R -e 'install.packages(c("randomForest"))' # copy model and scoring script RUN mkdir /data COPY bos_rf.rds /data COPY bos_rf_score.R /data WORKDIR /data # plumb and run server EXPOSE 8000 ENTRYPOINT ["R", "-e", \ "pr <- plumber::plumb('/data/bos_rf_score.R'); pr$run(host='0.0.0.0', port=8000)"] ``` Build and upload the image to ACR: ```r library(AzureContainers) # create a resource group for our deployments deployresgrp <- AzureRMR::get_azure_login()$ get_subscription("sub_id")$ create_resource_group("deployresgrp", location="australiaeast") # create container registry deployreg_svc <- deployresgrp$create_acr("deployreg") # build image 'bos_rf' call_docker("build -t bos_rf .") # upload the image to Azure deployreg <- deployreg_svc$get_docker_registry() deployreg$push("bos_rf") ``` ## Deploy to AKS Create a new AKS resource for this deployment: ```r # create a Kubernetes cluster deployclus_svc <- deployresgrp$create_aks("secureclus", agent_pools=agent_pool("pool1", 2)) # grant the cluster pull access to the registry deployreg_svc$add_role_assignment(deployclus_svc, "Acrpull") ``` ### Install traefik The middleware we'll use to enable HTTPS and authentication is [Traefik](https://traefik.io/traefik/). This features built-in integration with Let's Encrypt, which is a popular source for TLS certificates (necessary for HTTPS). An alternative to Traefik is [Nginx](https://www.nginx.com/). Deploying traefik and enabling HTTPS involves the following steps, on top of deploying the model service itself: - Install traefik - Assign a domain name to the cluster's public IP address - Deploy the ingress route The following yaml contains the configuration parameters for traefik. Insert your email address where indicated (it will be used to obtain a certificate from Let's Encrypt), and save this as `traefik_values.yaml`. ```yaml # traefik_values.yaml fullnameOverride: traefik replicas: 1 resources: limits: cpu: 500m memory: 500Mi requests: cpu: 100m memory: 200Mi externalTrafficPolicy: Local kubernetes: ingressClass: traefik ingressEndpoint: useDefaultPublishedService: true dashboard: enabled: false debug: enabled: false accessLogs: enabled: true fields: defaultMode: keep headers: defaultMode: keep names: Authorization: redact rbac: enabled: true acme: enabled: true email: "email.addr@example.com" # insert your email address here staging: false challengeType: tls-alpn-01 ssl: enabled: true enforced: true permanentRedirect: true tlsMinVersion: VersionTLS12 cipherSuites: - TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256 - TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 - TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 - TLS_RSA_WITH_AES_128_GCM_SHA256 - TLS_RSA_WITH_AES_256_GCM_SHA384 - TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256 ``` We can now install traefik: ```r # get the cluster endpoint deployclus <- deployclus_svc$get_cluster() deployclus$helm("repo add stable https://kubernetes-charts.storage.googleapis.com/") deployclus$helm("repo update") deployclus$helm("install traefik-ingress stable/traefik --namespace kube-system --values traefik_values.yaml") ``` ### Assign domain name Installing an ingress controller will create a public IP resource, which is the visible address for the cluster. It takes a minute or two for this resource to be created; once it appears, we assign a domain name to it. ```r for(i in 1:100) { res <- read.table(text=deployclus$get("service", "--all-namespaces")$stdout, header=TRUE, stringsAsFactors=FALSE) ip <- res$EXTERNAL.IP[res$NAME == "traefik"] if(ip != "") break Sys.sleep(10) } if(ip == "") stop("Ingress public IP not assigned") cluster_resources <- deployclus_svc$list_cluster_resources() found <- FALSE for(res in cluster_resources) { if(res$type != "Microsoft.Network/publicIPAddresses") next res$sync_fields() if(res$properties$ipAddress == ip) { found <- TRUE break } } if(!found) stop("Ingress public IP resource not found") # assign domain name to IP address res$do_operation( body=list( location=ip$location, properties=list( dnsSettings=list(domainNameLabel="secureclus"), publicIPAllocationMethod=ip$properties$publicIPAllocationMethod ) ), http_verb="PUT" ) ``` ### Generate an auth file We'll use Traefik to handle authentication details. While some packages like RestRserve can also authenticate users, a key benefit of assigning this to the ingress controller is that it frees our container to focus purely on the task of returning predicted values. For example, if we want to switch to a different kind of model, we can do so without having to copy over any authentication code. Using basic auth with Traefik requires an authentication file, in [Apache `.htpasswd` format](https://en.wikipedia.org/wiki/.htpasswd). Here is R code to create such a file. This is essentially the same code that is used in the "Deploying an ACI service with HTTPS and authentication" vignette. ```r library(bcrypt) user_list <- list( c("user1", "password1") ) user_str <- sapply(user_list, function(x) paste(x[1], hashpw(x[2]), sep=":")) writeLines(user_str, "auth") ``` ### Deploy the service The yaml for the deployment and service is the same as in the original Plumber vignette. The main difference is that we now create a separate `bos-rf` namespace for our objects, which is the recommended practice: it allows us to manage the service independently of any others that may exist on the cluster. Save the following as `secure_deploy.yaml`: ```yaml # deployment.yaml apiVersion: apps/v1 kind: Deployment metadata: name: bos-rf namespace: bos-rf spec: selector: matchLabels: app: bos-rf replicas: 1 template: metadata: labels: app: bos-rf spec: containers: - name: bos-rf image: deployreg.azurecr.io/bos_rf # insert URI for your container registry/image ports: - containerPort: 8000 resources: requests: cpu: 250m limits: cpu: 500m --- apiVersion: v1 kind: Service metadata: name: bos-rf-svc namespace: bos-rf spec: selector: app: bos-rf ports: - protocol: TCP port: 8000 ``` The yaml to create the ingress route is below. Save this as `secure_ingress.yaml`. ```yaml # ingress.yaml apiVersion: extensions/v1beta1 kind: Ingress metadata: name: bos-rf-ingress namespace: bos-rf annotations: kubernetes.io/ingress.class: traefik ingress.kubernetes.io/auth-type: basic ingress.kubernetes.io/auth-secret: bos-rf-secret spec: rules: - host: secureclus.australiaeast.cloudapp.azure.com http: paths: - path: / backend: serviceName: bos-rf-svc servicePort: 8000 ``` We can now create the deployment, service, and ingress route: ```r deployclus$kubectl("create namespace bos-rf") deployclus$kubectl("create secret generic bos-rf-secret --from-file=auth --namespace bos-rf") deployclus$create("secure_deploy.yaml") deployclus$apply("secure_ingress.yaml") ``` ## Call the service Once the deployment is complete, we can obtain predicted values from it. Notice that we use the default HTTPS port for the request (port 443). While port 8000 is nominally exposed by the service, this is visible only within the cluster; it is connected to the external port 443 by Traefik. ```r response <- httr::POST("https://secureclus.australiaeast.cloudapp.azure.com/score", httr::authenticate("user1", "password1"), body=list(df=MASS::Boston[1:10,]), encode="json") httr::content(response, simplifyVector=TRUE) #> [1] 25.9269 22.0636 34.1876 33.7737 34.8081 27.6394 21.8007 22.3577 16.7812 18.9785 ``` ## Further comments In this vignette, we've secured the predictive service with a single username and password. While we could add more users to the authentication file, a more flexible and scalable solution is to use a frontend service, such as [Azure API Management](https://azure.microsoft.com/services/api-management/), or a directory service like [Azure Active Directory](https://azure.microsoft.com/services/active-directory/). The specifics will typically be determined by your organisation's IT infrastructure, and are beyond the scope of this vignette.