kubelet-device-plugins
settings.kubelet-device-plugins.*
Settings related to the Kubernetes NVIDIA device plugin.
See the nvidia-k8s-device-plugin for more details.
Settings
settings.kubelet-device-plugins.nvidia.device-id-strategy
settings.kubelet-device-plugins.nvidia.device-list-strategy
settings.kubelet-device-plugins.nvidia.device-sharing-strategy
settings.kubelet-device-plugins.nvidia.pass-device-specs
settings.kubelet-device-plugins.nvidia.time-slicing.fail-requests-greater-than-one
settings.kubelet-device-plugins.nvidia.time-slicing.rename-by-default
settings.kubelet-device-plugins.nvidia.time-slicing.replicas
NVIDIA Time-Slicing
Bottlerocket supports NVIDIA GPU time-slicing on Kubernetes nodes through the nvidia-k8s-device-plugin. This functionality enables system administrators to allocate a set of replicas on the node's GPU(s), which can then be assigned to individual pods for executing various workloads. To learn more about time-slicing and its options, see the NVIDIA documentation, such as their Kubernetes device plugin documentation and technical blog.
Lifecycle
When time-slicing configuration is defined on a Bottlerocket Kubernetes node with NVIDIA GPU variants, the configuration is applied to all GPUs present on the node. Modifications to the time-slicing configuration will affect the advertised resources available on the node. Existing pods that were already running and consuming the GPU are not automatically removed or restarted. Therefore, it is recommended to configure time-slicing settings before deploying pods to ensure consistency across all GPU workloads.
Use Cases
The time-slicing feature is disabled by default in Bottlerocket. This feature does not provide memory or fault isolation between replicas, and has unique resource request behavior as described below. According to NVIDIA, this feature is best used for oversubscribing the GPU when you need to run multiple applications that are not latency-sensitive or can tolerate jitter.
Example Usage
In a Bottlerocket Kubernetes NVIDIA variant, if the configuration below were applied to a node with 8 GPUs, the plugin would advertise 80 nvidia.com/gpu.shared
resources to Kubernetes instead of 8 (8 GPUs x 10 replicas = 80).
The nvidia-k8s-device-plugin creates 10 references to each GPU and distributes them to any requestor. For behavior details, please refer to the NVIDIA documentation.
[settings.kubelet-device-plugins.nvidia]
device-sharing-strategy = "time-slicing"
[settings.kubelet-device-plugins.nvidia.time-slicing]
replicas = 10
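The same configuration could also be applied to a running node with apiclient, as shown below.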
apiclient set --json '{
  "settings": {
    "kubelet-device-plugins": {
      "nvidia": {
        "device-sharing-strategy": "time-slicing",
        "time-slicing": {
          "replicas": 10
        }
      }
    }
  }
}'
Full Reference
settings.kubelet-device-plugins.nvidia.device-id-strategy
Specifies the desired strategy for passing device IDs to the container.
Default: index
Accepted values:
index
uuid
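For example, to pass device UUIDs to containers instead of indexes, a user-data snippet along these lines (mirroring the TOML format of the time-slicing example above) should work:
[settings.kubelet-device-plugins.nvidia]
device-id-strategy = "uuid"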
settings.kubelet-device-plugins.nvidia.device-list-strategy
Specifies the desired strategy for passing the device list to the container. If the value is set to volume-mounts, the list of devices is passed as a set of volume mounts instead of as an environment variable to instruct the NVIDIA Container Runtime to inject the devices. If the value is set to envvar, the NVIDIA_VISIBLE_DEVICES environment variable is used to select the devices that are to be injected by the NVIDIA Container Runtime.
Default: volume-mounts
Accepted values:
volume-mounts
envvar
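For example, a sketch of a user-data snippet that selects the environment-variable strategy:
[settings.kubelet-device-plugins.nvidia]
device-list-strategy = "envvar"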
settings.kubelet-device-plugins.nvidia.device-sharing-strategy
Specifies the desired sharing strategy for the GPU resource.
Default: none
Accepted values:
none
time-slicing
settings.kubelet-device-plugins.nvidia.pass-device-specs
Specifies whether to pass the paths and desired device node permissions for any NVIDIA devices being allocated to the container.
Default: true
Accepted values:
true
false
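For example, a user-data snippet like the following could disable passing device specs:
[settings.kubelet-device-plugins.nvidia]
pass-device-specs = false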
settings.kubelet-device-plugins.nvidia.time-slicing.fail-requests-greater-than-one
Specifies the resource request handling behavior when a request has more than one GPU replica.
As described by NVIDIA, the purpose of this field is to enforce awareness that requesting more than one GPU replica does not result in receiving proportionally more access to the GPU. When set to true, a resource request for more than one GPU fails with an UnexpectedAdmissionError. In this case, you must manually delete the pod, update the resource request, and redeploy.
Default: true when settings.kubelet-device-plugins.nvidia.device-sharing-strategy is set to time-slicing.
Accepted values:
true
false
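For example, assuming time-slicing is enabled (this setting only applies then), a snippet along these lines could allow multi-replica requests to be admitted:
[settings.kubelet-device-plugins.nvidia]
device-sharing-strategy = "time-slicing"
[settings.kubelet-device-plugins.nvidia.time-slicing]
fail-requests-greater-than-one = false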
settings.kubelet-device-plugins.nvidia.time-slicing.rename-by-default
Specifies whether the advertised Kubernetes resource is named <resource-name>.shared instead of <resource-name>.
For example, if this field is set to true, nodes that are configured for time-sliced GPU access advertise the resource as nvidia.com/gpu.shared. Setting this field to true can be helpful if you want to schedule pods on GPUs with shared access by specifying <resource-name>.shared in the resource request. When this field is set to false, the advertised resource name is not modified, for example nvidia.com/gpu.
Default: true when settings.kubelet-device-plugins.nvidia.device-sharing-strategy is set to time-slicing.
Accepted values:
true
false
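For example, to keep advertising the unmodified resource name (nvidia.com/gpu) while time-slicing is enabled, a snippet like this could be used:
[settings.kubelet-device-plugins.nvidia]
device-sharing-strategy = "time-slicing"
[settings.kubelet-device-plugins.nvidia.time-slicing]
rename-by-default = false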
settings.kubelet-device-plugins.nvidia.time-slicing.replicas
Specifies the desired number of replicas to make available for each GPU on the node.
Default: 2 when settings.kubelet-device-plugins.nvidia.device-sharing-strategy is set to time-slicing.
Accepted values:
any positive integer >= 2
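For example, to change the replica count on a node where time-slicing is already enabled, an apiclient call mirroring the earlier example should work; the value 4 here is only illustrative:
apiclient set --json '{
  "settings": {
    "kubelet-device-plugins": {
      "nvidia": {
        "time-slicing": {
          "replicas": 4
        }
      }
    }
  }
}'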