How to deploy GPU node groups in Amazon EKS (version >= 1.16) with autoscaling
This post walks you through deploying GPU node groups (worker nodes) in an existing EKS cluster that was created using eksctl. Normally, you want to fire up GPU instances only on demand to save on costs.
Prerequisites
- Make sure you have the following tools installed on your system:
  - eksctl - eksctl.io/introduction/#installation
  - kubectl - kubernetes.io/docs/tasks/tools/install-kube..
- You have the Cluster Autoscaler deployed to your EKS cluster (a quick check is shown after this list).
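If you installed the Cluster Autoscaler from the standard manifests, a quick way to confirm it is running is to check its deployment. The namespace and deployment name below assume the default installation and may differ in your setup:

kubectl -n kube-system get deployment cluster-autoscaler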
Create the node groups
- Create the GPU node group configuration in the EKS cluster's configuration file. Let's call this file cluster.yaml. An example configuration file is given below:
...
nodeGroups:
  - name: ng-1-gpu-1-16
    labels:
      nvidia.com/gpu: "true"
      name: nvidia-device-plugin-ds
      k8s.amazonaws.com/accelerator: nvidia-tesla
    taints:
      nvidia.com/gpu: "true:NoSchedule"
    tags:
      k8s.io/cluster-autoscaler/node-template/label/nvidia.com/gpu: 'true'
      k8s.io/cluster-autoscaler/node-template/taint/dedicated: nvidia.com/gpu=true
      k8s.io/cluster-autoscaler/node-template/label/name: "nvidia-device-plugin-ds"
      k8s.io/cluster-autoscaler/node-template/taint/nvidia.com/gpu: "true:NoSchedule"
      k8s.io/cluster-autoscaler/enabled: "true"
    instanceType: p3.2xlarge
    desiredCapacity: 0
    minSize: 0
    maxSize: 10
    privateNetworking: true
    volumeSize: 30
    ssh:
      publicKeyName: my-keypair
    iam:
      withAddonPolicies:
        autoScaler: true
        externalDNS: true
        ebs: true
        cloudWatch: true
        albIngress: true
...
Note the labels and taints used for this node group, and the matching k8s.io/cluster-autoscaler/node-template/... tags: the Cluster Autoscaler reads these tags to learn which labels and taints the nodes will carry, even when the group is scaled down to zero.
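To make the scheduling behaviour concrete, here is a minimal, illustrative sketch of the fields a GPU workload would typically carry to land on (and only on) these nodes: a nodeSelector matching the nvidia.com/gpu label and a toleration for the matching taint. The sample pod later in this post shows the toleration in a full manifest.

# Illustrative pod spec fragment, not a complete manifest
nodeSelector:
  nvidia.com/gpu: "true"
tolerations:
  - key: "nvidia.com/gpu"
    operator: "Equal"
    value: "true"
    effect: "NoSchedule"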
- Update the cluster's node groups:
eksctl create nodegroup --config-file=cluster.yaml
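Once the command completes, you can confirm that the new node group exists (with a desired capacity of 0) before moving on; replace the placeholder with your cluster name:

eksctl get nodegroup --cluster <your-cluster-name>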
Configure the EKS cluster to use GPU resources
- To be able to use GPU resources, the NVIDIA device plugin needs to be installed as a DaemonSet.
kubectl apply -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.6.0/nvidia-device-plugin.yml
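You can verify that the plugin rolled out. The DaemonSet name below matches the v0.6.0 manifest and may differ for other versions; it will also report 0 desired pods until a GPU node actually joins the cluster. The custom-columns query is handy later for checking that GPU capacity is advertised on the nodes:

kubectl -n kube-system get daemonset nvidia-device-plugin-daemonset
kubectl get nodes "-o=custom-columns=NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu"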
Test the cluster by deploying a sample pod
- Create a file named nvidia-smi.yml with the following contents:
apiVersion: v1
kind: Pod
metadata:
  name: nvidia-smi
spec:
  restartPolicy: OnFailure
  containers:
    - name: nvidia-smi
      image: nvidia/cuda:9.2-devel
      args:
        - "nvidia-smi"
      resources:
        limits:
          nvidia.com/gpu: 1
  tolerations:
    - key: "nvidia.com/gpu"
      operator: "Equal"
      value: "true"
      effect: "NoSchedule"
- Deploy the pod using the command:
kubectl apply -f nvidia-smi.yml
- You should now see autoscaling kick in: the Auto Scaling group for the GPU nodes launches a new GPU instance. You can watch this happen with the commands below.
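Two ways to watch the scale-up, assuming the Cluster Autoscaler runs as the cluster-autoscaler deployment in kube-system: the pending pod's events should show a TriggeredScaleUp entry, and the autoscaler logs show which Auto Scaling group it chose.

kubectl describe pod nvidia-smi
kubectl -n kube-system logs -f deployment/cluster-autoscaler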
- You can check that the pod is scheduled to run on this node. Once the pod completes, you should see logs like the following:
$ kubectl logs nvidia-smi
Tue Sep 29 13:20:05 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.87.00    Driver Version: 418.87.00    CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  On   | 00000000:00:1E.0 Off |                    0 |
| N/A   45C    P0    25W / 300W |      0MiB / 16130MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
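When you are done testing, delete the pod. Because the node group has minSize: 0, the Cluster Autoscaler should terminate the now idle GPU instance after its scale-down delay (roughly 10 minutes by default, depending on your autoscaler configuration), so you are not left paying for an unused p3.2xlarge:

kubectl delete -f nvidia-smi.yml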