diff --git a/pages/gpu/how-to/use-mig-with-kubernetes.mdx b/pages/gpu/how-to/use-mig-with-kubernetes.mdx index bafcf8e5ad..1858cfb841 100644 --- a/pages/gpu/how-to/use-mig-with-kubernetes.mdx +++ b/pages/gpu/how-to/use-mig-with-kubernetes.mdx @@ -3,7 +3,7 @@ title: How to use NVIDIA MIG technology with Kubernetes description: This section provides information about MIG with Kubernetes tags: NVIDIA H100 MIG multi-instance gpu dates: - validation: 2025-07-21 + validation: 2026-02-10 posted: 2023-09-19 --- import Requirements from '@macros/iam/requirements.mdx' @@ -27,7 +27,8 @@ In this guide, we will explore the capabilities of NVIDIA MIG within a Kubernete - A [Kubernetes cluster](/kubernetes/quickstart/#how-to-create-a-kubernetes-cluster) with a [GPU Instance](/gpu/reference-content/choosing-gpu-instance-type/) as node - MIG is fully supported on [Scaleway managed Kubernetes](/kubernetes/quickstart/) clusters (Kapsule and Kosmos). + - MIG is fully supported on [Scaleway managed Kubernetes](/kubernetes/quickstart/) clusters (Kapsule and Kosmos). + - MIG is available on Scaleway [H100, H100-SXM](/gpu/how-to/use-nvidia-mig-technology/#how-to-list-mig-profiles), and [B300](/gpu/how-to/use-nvidia-mig-technology/#mig-profiles-for-b300-gpu-instances) GPU Instances. ## Configure MIG partitions inside a Kubernetes cluster diff --git a/pages/gpu/how-to/use-nvidia-mig-technology.mdx b/pages/gpu/how-to/use-nvidia-mig-technology.mdx index 55613bc89c..69d62a5aff 100644 --- a/pages/gpu/how-to/use-nvidia-mig-technology.mdx +++ b/pages/gpu/how-to/use-nvidia-mig-technology.mdx @@ -3,7 +3,7 @@ title: How to use the NVIDIA MIG technology on GPU Instances description: This section provides information about NVIDIA's MIG technology tags: NVIDIA H100 MIG multi-instance gpu dates: - validation: 2025-07-21 + validation: 2026-02-10 posted: 2023-08-31 --- import Requirements from '@macros/iam/requirements.mdx' @@ -48,7 +48,7 @@ By default, the MIG feature of NVIDIA GPUs is disabled.
To use it with your GPU 1. [Connect to your GPU Instance](/gpu/how-to/create-manage-gpu-instance/#how-to-connect-to-a-gpu-instance) as root using SSH. 2. Check the status of the MIG mode of your Instance running `nvidia-smi`. It shows that MIG mode is disabled: - ```s + ```bash root@my-h100-instance:~# nvidia-smi -i 0 Tue Aug 22 11:58:39 2023 +-----------------------------------------------------------------------------+ @@ -64,13 +64,13 @@ By default, the MIG feature of NVIDIA GPUs is disabled. To use it with your GPU +-------------------------------+----------------------+----------------------+ ``` 2. Run the following command to enable MIG mode on the GPU: - ```s + ```bash root@my-h100-instance:~# sudo nvidia-smi -i 0 -mig 1 Enabled MIG Mode for GPU 00000000:01:00.0 All done. ``` 3. Run the following command to verify that MIG mode is enabled on the GPU: - ```s + ```bash root@my-h100-instance:~# nvidia-smi -i 0 --query-gpu=pci.bus_id,mig.mode.current --format=csv pci.bus_id, mig.mode.current 00000000:01:00.0, Enabled @@ -85,8 +85,8 @@ These profiles determine the sizes and functionalities available of the MIG part Refer to the official documentation for more information about the supported [MIG profiles on H100 GPU Instances](https://docs.nvidia.com/datacenter/tesla/mig-user-guide/#h100-profiles). -1. Run the command `nvidia-smi mig -lgip` to retrieve a list of the available MIG profiles for the Instance. An output similar to the following displays: - ```s +1. Run the command `nvidia-smi mig -lgip` to retrieve a list of the available MIG profiles for the Instance. 
An output similar to the following displays for H100 GPU Instances: + ```bash root@my-h100-instance:~# nvidia-smi mig -lgip +-----------------------------------------------------------------------------+ | GPU instance profiles: | @@ -115,8 +115,11 @@ Refer to the official documentation for more information about the supported [MI | 8 7 1 | +-----------------------------------------------------------------------------+ ``` + + + 2. Run the following command to list the possible placements available. The syntax of the placement is `{}:` and shows the placement of the instances on the GPU. - ```s + ```bash root@my-h100-instance:~# nvidia-smi mig -lgipp GPU 0 Profile ID 19 Placements: {0,1,2,3,4,5,6}:1 GPU 0 Profile ID 20 Placements: {0,1,2,3,4,5,6}:1 @@ -127,31 +130,43 @@ Refer to the official documentation for more information about the supported [MI GPU 0 Profile ID 0 Placement : {0}:8 ``` -### MIG profiles for B300 GPU Instances - - - The MIG profiles for the NVIDIA B300 GPU are currently **not officially published** by NVIDIA. The information below is based on preliminary data confirmed via direct communication with NVIDIA and is **subject to change**. Be advised to proceed with caution in production environments. - - -While NVIDIA has not yet released the official MIG profile specifications for the **B300**, NVIDIA indicates that the MIG profile configurations will [mirror those of the B200](https://docs.nvidia.com/datacenter/tesla/mig-user-guide/supported-mig-profiles.html#b200-mig-profiles), but with increased memory allocation due to the B300's larger onboard memory. 
- -The expected MIG profile memory capacities for the B300 are estimated as follows: - -| B200 Profile (Memory) | Estimated B300 Equivalent (Memory) | -|-----------------------|------------------------------------| -| 1g.10gb (180 GB) | → 288 GB (estimated) | -| 1g.5gb (90 GB) | → 144 GB (estimated) | -| 1g.3gb (45 GB) | → 72 GB (estimated) | -| 1g.2gb (23 GB) | → 36 GB (estimated) | - -These values are **proportional scaling** based on the B200-to-B300 memory ratio and internal confirmation, but are **not final**. Always verify available profiles using `nvidia-smi -L` or `nvidia-smi mig -ldev` on your Instance before deployment. - -Once NVIDIA officially releases the B300 MIG specifications, we will update the documentation accordingly. +### MIG profiles for B300 GPU Instances + +On NVIDIA B300 Instances, the following MIG profiles are available: +```bash + ++-----------------------------------------------------------------------------+ +| GPU instance profiles: | +| GPU Name ID Instances Memory P2P SM DEC ENC | +| Free/Total GiB CE JPEG OFA | +|=============================================================================| +| 0 MIG 1g.34gb 19 7/7 30.50 No 18 1 0 | +| 1 1 0 | ++-----------------------------------------------------------------------------+ +| 0 MIG 1g.34gb+me 20 1/1 30.50 No 18 1 0 | +| 2 1 1 | ++-----------------------------------------------------------------------------+ +| 0 MIG 1g.67gb 15 4/4 66.50 No 30 1 0 | +| 2 1 0 | ++-----------------------------------------------------------------------------+ +| 0 MIG 2g.67gb 14 3/3 66.50 No 70 2 0 | +| 4 2 0 | ++-----------------------------------------------------------------------------+ +| 0 MIG 3g.135gb 9 2/2 133.50 No 70 3 0 | +| 6 3 0 | ++-----------------------------------------------------------------------------+ +| 0 MIG 4g.135gb 5 1/1 133.50 No 62 4 0 | +| 8 4 0 | ++-----------------------------------------------------------------------------+ +| 0 MIG 7g.269gb 0 1/1 268.00 No 148 7 0 | +| 
16 7 1 | ++-----------------------------------------------------------------------------+ +``` ## How to partition the GPU into several MIG partitions 1. Run the following command to divide the H100 GPU Instance into four slices (MIG partitions): - ```s + ```bash root@my-h100-instance:~# sudo nvidia-smi mig -cgi 9,19,19,19 -C Successfully created GPU instance ID 2 on GPU 0 using profile MIG 3g.40gb (ID 9) Successfully created compute instance ID 0 on GPU 0 GPU instance ID 2 using profile MIG 3g.40gb (ID 2) @@ -184,7 +199,7 @@ Once NVIDIA officially releases the B300 MIG specifications, we will update the 2. Run the following command to verify the MIG configuration of the GPU: - ```s + ```bash root@my-h100-instance:~# sudo nvidia-smi mig -lgi +-------------------------------------------------------+ | GPU instances: | @@ -201,7 +216,7 @@ Once NVIDIA officially releases the B300 MIG specifications, we will update the +-------------------------------------------------------+ ``` 3. Display the UUID of each of your MIG partitions: - ```s + ```bash root@my-h100-instance:~# nvidia-smi -L GPU 0: NVIDIA H100 PCIe (UUID: GPU-7cd6d4d6-9fa8-13be-3c42-09a1b2280a02) MIG 3g.40gb Device 0: (UUID: MIG-da06d78f-7534-56a0-a062-62fef012be91) @@ -212,7 +227,7 @@ Once NVIDIA officially releases the B300 MIG specifications, we will update the 4. 
Run `nvidia-smi` on three 1g.10gb MIG partitions to display their characteristics: * MIG 1g.10gb 1 - ```s + ```bash root@my-h100-instance:~# sudo docker run --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=MIG-8aa1fc52-9818-58ec-bc64-8f0cae121bb4 nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi Tue Aug 22 12:53:40 2023 +-----------------------------------------------------------------------------+ @@ -247,7 +262,7 @@ Once NVIDIA officially releases the B300 MIG specifications, we will update the +-----------------------------------------------------------------------------+ ``` * MIG 1g.10gb 2 - ```s + ```bash root@my-h100-instance:~# sudo docker run --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=MIG-42fa9c93-1430-5ccc-b623-c02fb93b7f5a nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi Tue Aug 22 12:54:11 2023 +-----------------------------------------------------------------------------+ @@ -282,7 +297,7 @@ Once NVIDIA officially releases the B300 MIG specifications, we will update the +-----------------------------------------------------------------------------+ ``` * MIG 1g.10gb 3 - ```s + ```bash root@my-h100-instance:~# sudo docker run --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=MIG-6d96b431-44ba-5360-80b0-9359561c927d nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi Tue Aug 22 12:54:54 2023 +-----------------------------------------------------------------------------+ @@ -317,7 +332,7 @@ Once NVIDIA officially releases the B300 MIG specifications, we will update the +-----------------------------------------------------------------------------+ ``` 5. Launch a Jupyter Notebook within a Docker container on the MIG 3g.40gb MIG partition. Once initiated, you can reach it by connecting to port 8888 using the public IP of your GPU Instance. 
- ```s + ```bash root@my-h100-instance:~# sudo docker run --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=MIG-da06d78f-7534-56a0-a062-62fef012be91 -p 8888:8888 jupyter/minimal-notebook ``` @@ -328,7 +343,7 @@ Once NVIDIA officially releases the B300 MIG specifications, we will update the Once you have finished your jobs, you can delete the MIG partitions. 1. Run the following command to remove all MIG partitions and their corresponding compute instances: - ```s + ```bash root@my-h100-instance:~# sudo nvidia-smi mig -dci && sudo nvidia-smi mig -dgi Successfully destroyed compute instance ID 0 from GPU 0 GPU instance ID 7 Successfully destroyed compute instance ID 0 from GPU 0 GPU instance ID 8 @@ -340,7 +355,7 @@ Once you have finished your jobs, you can delete the MIG partitions. Successfully destroyed GPU instance ID 2 from GPU 0 ``` 2. Verify that the MIG partitions have been removed by running the `nvidia-smi` command: - ```s + ```bash +-----------------------------------------------------------------------------+ | MIG devices: | +------------------+----------------------+-----------+-----------------------+ @@ -355,7 +370,7 @@ Once you have finished your jobs, you can delete the MIG partitions. ## How to disable MIG on a GPU Instance To disable MIG on your H100 GPU Instance, run the following command: -```s +```bash root@my-h100-instance:~# nvidia-smi -mig 0 Disabled MIG Mode for GPU 00000000:01:00.0 All done.
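The guide above copies MIG partition UUIDs from `nvidia-smi -L` into `NVIDIA_VISIBLE_DEVICES` by hand. A minimal shell sketch of extracting them programmatically instead; the sample output is copied verbatim from the guide, and the `grep` pattern is an assumption based on the `UUID: MIG-…` format shown there:

```shell
# Sample `nvidia-smi -L` output, copied from the guide above.
# On a live GPU Instance you would capture it with: sample_output="$(nvidia-smi -L)"
sample_output='GPU 0: NVIDIA H100 PCIe (UUID: GPU-7cd6d4d6-9fa8-13be-3c42-09a1b2280a02)
  MIG 3g.40gb     Device  0: (UUID: MIG-da06d78f-7534-56a0-a062-62fef012be91)
  MIG 1g.10gb     Device  1: (UUID: MIG-8aa1fc52-9818-58ec-bc64-8f0cae121bb4)
  MIG 1g.10gb     Device  2: (UUID: MIG-42fa9c93-1430-5ccc-b623-c02fb93b7f5a)
  MIG 1g.10gb     Device  3: (UUID: MIG-6d96b431-44ba-5360-80b0-9359561c927d)'

# Keep only MIG partition UUIDs; the GPU's own UUID (GPU-…) does not match the pattern.
mig_uuids=$(printf '%s\n' "$sample_output" | grep -o 'MIG-[0-9a-f-]*')

# Each UUID could then be handed to a container, one per workload, e.g.:
#   sudo docker run --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES="$uuid" \
#       nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi
printf '%s\n' "$mig_uuids"
```

Looping over `$mig_uuids` is one way to start one container per partition without pasting UUIDs manually.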