PCI GPU Device Passthrough

In this document we show how to add an NVIDIA GPU to a KVM virtual machine in Ravada VDI. Any PCI device can be added in a similar way.

Note

We will not use the GPU as a display device in this guide. We will only try to run GPU calculations with CUDA.

Status

Please note that this is a very active topic and the instructions outlined here might not work in your environment.

Requirements

  • One or more NVIDIA GPUs

  • A properly configured kernel along with a recent version of qemu. We were successful with Ubuntu 20.04 and kernel 5.8.

  • Ravada version 2.0. It is currently labelled as alpha, but it is already being used in production on our own servers.

Hardware Identification

sudo lspci -Dnn | grep NVIDIA
0000:1b:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:2204] (rev a1)
0000:1b:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:1aef] (rev a1)

As you can see, two devices are listed at this PCI slot; that is because the video card also includes an audio device. From here we gather:

  • PCI data: 0000:1b:00.0 and 0000:1b:00.1

  • Devices: 10de:2204 and 10de:1aef
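
To double-check that both functions belong to the same physical card, you can query that PCI slot directly (1b:00 is the slot from this example; replace it with your own):

sudo lspci -nn -s 1b:00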

OS Configuration

This is the configuration for the host server. We must set up the kernel so that the PCI device is not used in any way and can be passed to the virtual machine.

Grub

Add this to the GRUB_CMDLINE_LINUX_DEFAULT line in /etc/default/grub. The following example is suited for Intel systems. Notice that we added the device IDs identified in the previous step to pci-stub.

Here we prevent the GPU from being used by the host so that it can be passed to the virtual machine.

GRUB_CMDLINE_LINUX_DEFAULT="snd_hda_intel.blacklist=1 nouveau.blacklist=1 intel_iommu=on rd.driver.pre=vfio-pci pci-stub.ids=10de:2204,10de:1aef"

AMD systems require a similar setup:

GRUB_CMDLINE_LINUX_DEFAULT="snd_hda_intel.blacklist=1 nouveau.blacklist=1 amd_iommu=on iommu=pt kvm_amd.npt=1 kvm_amd.avic=1 rd.driver.pre=vfio-pci pci-stub.ids=10de:2204,10de:1aef"

Modules

Blacklist modules creating the file /etc/modprobe.d/blacklist-gpu.conf

blacklist nouveau
blacklist radeon
blacklist amdgpu
blacklist snd_hda_intel
blacklist nvidiafb

GPUs with embedded USB

If your GPU also includes a USB controller, you need to check which driver that device is using and blacklist it here as well.

Warning

This procedure will render all the USB devices of the server unavailable. From now on a USB keyboard cannot be used anymore, so make sure you always have ssh access. If you ever need the keyboard again, remove the blacklist xhci_pci line from the file /etc/modprobe.d/blacklist-gpu.conf, run sudo update-initramfs -u and reboot the server.

Use lspci -Dnn -k to check what driver it is using.

$ lspci -Dnn -k
0000:b1:00.2 USB controller [0c03]: NVIDIA Corporation TU104 USB 3.1 Host Controller [10de:1ad8] (rev a1)
Subsystem: NVIDIA Corporation TU104 USB 3.1 Host Controller [10de:129f]
Kernel drivers in use: xhci_pci

Then add it to the end of the file /etc/modprobe.d/blacklist-gpu.conf:

blacklist xhci_pci

Make drivers use vfio

Also prevent the NVIDIA drivers from loading, bringing up vfio-pci first instead, in /etc/modprobe.d/nvidia.conf:

softdep nouveau pre: vfio-pci
softdep nvidia pre: vfio-pci
softdep nvidia* pre: vfio-pci
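
You can verify right away, without rebooting, that modprobe picks up these rules by dumping its effective configuration:

modprobe -c | grep softdep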

Add the modules so they are loaded on boot in /etc/modules

vfio
vfio_iommu_type1 allow_unsafe_interrupts=1
vfio_pci ids=10de:2204,10de:1aef
vfio_virqfd
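
After the reboot at the end of this section these modules should show up in lsmod:

lsmod | grep vfio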

Pass the device identifiers to vfio-pci in /etc/modprobe.d/vfio.conf. This partly duplicates the previous step and may eventually be removed from this guide if it turns out to be unnecessary; in any case it does no harm.

options vfio-pci ids=10de:2204,10de:1aef disable_vga=1

When loading KVM, make it ignore MSRs by adding this to /etc/modprobe.d/kvm.conf:

options kvm ignore_msrs=1
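
After rebooting, the setting can be checked through sysfs; it should print Y:

cat /sys/module/kvm/parameters/ignore_msrs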

Add the IDs of the devices used by the NVIDIA card to /etc/initramfs-tools/modules. This must be a single line:

vfio vfio_iommu_type1 vfio_virqfd vfio_pci ids=10de:2204,10de:1aef,10de:1ad8 allow_unsafe_interrupts=1

Update the GRUB and initramfs configuration and reboot.

sudo update-grub
sudo update-initramfs -u
sudo reboot
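
If you want to make sure the vfio modules were included in the regenerated initramfs, lsinitramfs can list its contents (the image path may differ on your system):

lsinitramfs /boot/initrd.img-$(uname -r) | grep vfio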

Checks

Modules

Neither the nvidia nor the nouveau modules should be loaded, so the following command should print nothing:

sudo lsmod | egrep -i "(nouveau|nvidia)"

The device should be using the vfio-pci driver:

lspci -k | egrep -A 5 -i nvidia
1b:00.0 VGA compatible controller: NVIDIA Corporation Device 2204 (rev a1)
  Subsystem: Gigabyte Technology Co., Ltd Device 403b
  Kernel driver in use: vfio-pci
  Kernel modules: nvidiafb, nouveau
1b:00.1 Audio device: NVIDIA Corporation Device 1aef (rev a1)
  Subsystem: Gigabyte Technology Co., Ltd Device 403b
  Kernel modules: snd_hda_intel

Note that although the preferred kernel modules for the NVIDIA VGA device are nvidiafb and nouveau, the driver actually in use is vfio-pci, which is exactly what we want.

IOMMU

Check that it is enabled:

dmesg | grep -i iommu | grep -i enabled
[    0.873154] DMAR: IOMMU enabled

Verify the IOMMU groups. Both devices should be in the same group. We use grep to search for the PCI device numbers we found in the very first step.

dmesg | grep iommu | grep 1b:00
[    2.474726] pci 0000:1b:00.0: Adding to iommu group 38
[    2.474807] pci 0000:1b:00.1: Adding to iommu group 38
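
The same information is available through sysfs without relying on dmesg; both functions should point to the same group (38 in this example):

readlink /sys/bus/pci/devices/0000:1b:00.0/iommu_group
readlink /sys/bus/pci/devices/0000:1b:00.1/iommu_group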

Ravada Setup

Now we want to use the GPU. For now we will only run CUDA on it, so it will not be used as a display device. That is also possible, but it will be addressed in future releases.

Once the host is configured, we must tell Ravada that we want to pass some PCI devices to the virtual machines.

Configure the Node Host Device

In the node configuration we add a PCI Host Device group. This is a pool of devices that will be assigned to the clones.

In this example we select PCI and then click on “Add host device”.

../_images/node_hostdev.png

After a few seconds the PCI devices available in the host are listed; we filter them to show only the NVIDIA ones.

Now the Host Device will be available in the Hardware configuration in the virtual machine.

../_images/vm_hostdev.png

Now, when the virtual machine is started, it will pick one of the free devices, which will appear inside the guest as a PCI entry.
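
A quick way to verify this from inside the guest is to list its PCI devices; the GPU shows up even before any NVIDIA driver is installed:

lspci -nn | grep -i nvidia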

Virtual Machine GPU Ubuntu setup

As an example we load the GPU in Ubuntu and verify it is being used.

Packages

Use the graphical interface to install the proprietary NVIDIA server drivers.

Search for the Additional Drivers application

Press the Windows (Super) key, type additional and click on the application called Additional Drivers.

Choose the NVIDIA driver for servers

In our scenario we only want to run CUDA on the GPU, so we just select the server drivers.

This is the list of packages for our setup:

  • nvidia-compute-utils-460-server

  • nvidia-dkms-460-server

  • nvidia-driver-460-server

  • nvidia-kernel-common-460-server

  • nvidia-kernel-source-460-server

  • nvidia-settings

  • nvidia-utils-460-server
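
If you prefer the command line over the graphical tool, the same drivers can be installed with apt. The 460-server version is the one from this example; install whichever server version Additional Drivers offers on your system:

sudo apt install nvidia-driver-460-server nvidia-utils-460-server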

Choose the Display VGA

After installing the NVIDIA drivers, the window manager may try to run on top of the GPU and fail. Choose the other video card:

First, run prime-select without arguments to see the available options:

$ sudo prime-select
Usage: /usr/bin/prime-select nvidia|intel|on-demand|query

Choose the non-NVIDIA option; in our case that is intel:

sudo prime-select intel

Add the nvidia module so it is loaded on startup. Check that this line is present in /etc/modules:

nvidia_uvm

Reboot the virtual machine now. The other VGA adapter should now be used for the display and the NVIDIA GPU can be used to run other tasks.
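
After the reboot the module should show up in lsmod:

lsmod | grep nvidia_uvm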

CUDA

In this particular installation we wanted to try CUDA. We install the toolkit and check that it works:

sudo apt install nvidia-cuda-toolkit
nvidia-smi

If it works, nvidia-smi will show the detected hardware:

Driver Version                            : 460.73.01
CUDA Version                              : 11.2
Attached GPUs                             : 1
GPU 00000000:01:01.0
Product Name                          : GeForce RTX 3090
Product Brand                         : GeForce
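
You can also check that the CUDA compiler shipped with the toolkit is available:

nvcc --version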

Common Problems

Error: enable unsafe interrupts

Add vfio_iommu_type1.allow_unsafe_interrupts=1 to the GRUB_CMDLINE_LINUX_DEFAULT line in /etc/default/grub, keeping the parameters you already have there:

GRUB_CMDLINE_LINUX_DEFAULT="vfio_iommu_type1.allow_unsafe_interrupts=1"

Error: iommu group is not viable

If you get this error trying to start the virtual machine with a GPU attached:

2021-12-17T07:35:06.533164Z qemu-system-x86_64: -device vfio-pci,host=0000:b1:00.0,id=hostdev0,bus=pci.1,addr=0x1,rombar=1: vfio 0000:b1:00.0:
group 155 is not viable
Please ensure all devices within the iommu_group are bound to their vfio bus driver.

This means the PCI device you want to pass through shares its IOMMU group with other devices. Possibly an embedded USB controller is loading its driver and preventing the GPU from being attached to the virtual machine.

# dmesg | grep iommu | grep "group 155"
[    1.893555] pci 0000:b1:00.0: Adding to iommu group 155
[    1.893653] pci 0000:b1:00.1: Adding to iommu group 155
[    1.893751] pci 0000:b1:00.2: Adding to iommu group 155
[    1.893848] pci 0000:b1:00.3: Adding to iommu group 155

If your GPU also includes a USB controller, you need to check which driver that device is using and blacklist it as well. Use lspci -Dnn -k to check which driver it is using, and search for the PCI device numbers found in the previous command.

$ lspci -Dnn -k
0000:b1:00.2 USB controller [0c03]: NVIDIA Corporation TU104 USB 3.1 Host Controller [10de:1ad8] (rev a1)
Subsystem: NVIDIA Corporation TU104 USB 3.1 Host Controller [10de:129f]
Kernel drivers in use: xhci_pci

Then add it to the end of the file /etc/modprobe.d/blacklist-gpu.conf:

blacklist xhci_pci
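
As before, regenerate the initramfs and reboot the server so the new blacklist takes effect:

sudo update-initramfs -u
sudo reboot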

References