NVIDIA with Platform9 - /etc/docker/daemon.json keeps being overwritten

After configuring a k8s cluster consisting of 3 machines, I need to access the GPUs from the pods running on this cluster. I therefore installed the NVIDIA k8s-device-plugin on the three machines and, as per the installation instructions, had to manually modify the original file /etc/docker/daemon.json.

Original:

{
  "bridge": "none",
  "graph": "/var/lib/docker",
  "group": "pf9group",
  "exec-opts": ["native.cgroupdriver=systemd"],
  "live-restore": true,
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "10"
  },
  "storage-driver": "",
  "storage-opts": [],
  "debug": false,
  "registry-mirrors": ["https://dockermirror.platform9.io/"]
}

Modified:

{
  "bridge": "none",
  "graph": "/var/lib/docker",
  "group": "pf9group",
  "exec-opts": ["native.cgroupdriver=systemd"],
  "live-restore": true,
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "10"
  },
  "storage-driver": "",
  "storage-opts": [],
  "debug": false,
  "registry-mirrors": ["https://dockermirror.platform9.io/"],
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "/usr/bin/nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
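
For anyone who needs to reapply this change, the manual edit can be expressed as a small idempotent merge. This is only a sketch: the function names `add_nvidia_runtime` and `patch_file` are hypothetical, the runtime path is the one from the modified file, and it does not stop Platform9 from rewriting the file later.

```python
import copy
import json

# Path taken from the modified daemon.json in this post.
NVIDIA_RUNTIME_PATH = "/usr/bin/nvidia-container-runtime"


def add_nvidia_runtime(config: dict) -> dict:
    """Return a copy of a daemon.json dict with the nvidia runtime added.

    Existing keys are preserved; calling it twice gives the same result.
    """
    merged = copy.deepcopy(config)
    merged["default-runtime"] = "nvidia"
    runtimes = merged.setdefault("runtimes", {})
    runtimes["nvidia"] = {"path": NVIDIA_RUNTIME_PATH, "runtimeArgs": []}
    return merged


def patch_file(path: str = "/etc/docker/daemon.json") -> bool:
    """Rewrite the file only if the nvidia settings are missing.

    Returns True when the file was changed. Needs write access to the file.
    """
    with open(path) as f:
        current = json.load(f)
    updated = add_nvidia_runtime(current)
    if updated == current:
        return False
    with open(path, "w") as f:
        json.dump(updated, f, indent=2)
        f.write("\n")
    return True
```

Running `patch_file()` as root would restore the two NVIDIA keys if they are missing; Docker still has to be restarted or reloaded afterwards to pick up the change.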

The configuration works well and I can access the GPU, but every so often the Platform9 daemon overwrites this file and restores it to its original form, so the pods can no longer access the GPU. Is there a way to prevent this from happening? Or is there a better way to configure the GPU to work with Platform9?

Hi @Mostafa, thanks for reaching out. Let me take a look at this and get back to you.

Hi @Mostafa, thank you for your patience.

Can you shed some additional light on when this overwrite happens? Is it after a node restart, or are you seeing the node go into an unhealthy state? Ideally, the components are set up not to overwrite any configuration unless there is a full stack restart, which would happen in the case of an unhealthy node.
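
One way to pin down the timing is to log every moment the NVIDIA settings appear in or disappear from the file, then compare those timestamps with node restarts or health events. The following is a rough sketch, not a Platform9 tool: `has_nvidia_runtime` and `watch` are hypothetical names, and the 30-second polling interval is an arbitrary assumption.

```python
import json
import time
from datetime import datetime


def has_nvidia_runtime(text: str) -> bool:
    """True if the daemon.json text sets nvidia as default and registered runtime."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError:
        return False
    if not isinstance(config, dict):
        return False
    runtimes = config.get("runtimes")
    return (
        config.get("default-runtime") == "nvidia"
        and isinstance(runtimes, dict)
        and "nvidia" in runtimes
    )


def watch(path: str = "/etc/docker/daemon.json", interval_seconds: float = 30.0) -> None:
    """Poll the file forever and print a timestamp whenever the state flips."""
    previous = None
    while True:
        try:
            with open(path) as f:
                present = has_nvidia_runtime(f.read())
        except OSError:
            # Treat a missing or unreadable file as "settings not present".
            present = False
        if present != previous:
            print(f"{datetime.now().isoformat()} nvidia runtime present: {present}")
            previous = present
        time.sleep(interval_seconds)
```

Calling `watch()` in a terminal or a small service on each node prints one line per change, which should show whether the overwrite lines up with a restart.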

If you follow this procedure you can set your Docker to not be managed by Platform9.

The caveat here is that we will not touch Docker at all (apart from using it the way it's configured by the user), so the updates and changes will have to be made at your end.