GPU Passthrough

Author: Arda Bakici
A university student majoring in Computer Science and Mathematics at NYU Abu Dhabi.

Introduction

This is a simplified guide to GPU Passthrough on Arch Linux with KVM, libvirt and virt-manager. It is based on the Arch Wiki entry. I am using a muxless Dell G3 15 3590 gaming laptop with an Intel i7, an NVIDIA GTX 1660 Ti Max-Q and 32 GB of RAM.

Setup

Enable IOMMU

Follow the IOMMU part of the Arch Wiki entry.
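Once IOMMU is enabled and you have rebooted, it helps to see which devices ended up in which IOMMU group. Below is a minimal sketch, assuming the groups are exposed under /sys/kernel/iommu_groups (the usual sysfs location). The function name and its optional root argument are my own and only there for illustration.

```shell
# List every PCI device together with its IOMMU group.
# Assumption: groups live under /sys/kernel/iommu_groups;
# the optional first argument overrides that root.
list_iommu_groups() {
    root="${1:-/sys/kernel/iommu_groups}"
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue          # glob did not match: no groups found
        group="${dev%/devices/*}"          # strip "/devices/<address>"
        printf 'group %s: %s\n' "${group##*/}" "${dev##*/}"
    done
}
```

Calling list_iommu_groups with no arguments prints one line per device. If it prints nothing, IOMMU is most likely not active yet.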

Binding NVIDIA GPU to VFIO-PCI

This is an important step where the Arch Wiki article and this guide diverge. The Arch Wiki states that GPU drivers cannot be rebound dynamically and that the laptop has to be rebooted. In my experience this is not true: the NVIDIA drivers can be unloaded and reloaded at runtime, and doing it this way caused me fewer issues than rebooting each time. However, for a dynamic reload to work, we have to make sure that nothing (a Wayland compositor, for example) keeps the dGPU occupied while it is idle.

Preparation

Block Critical Applications From Using NVIDIA

DRM Modesetting

Disable DRM modesetting to stop the kernel from occupying the dGPU: pass nvidia_drm.modeset=0 as a kernel parameter. Use PRIME render offload for GPU-intensive applications.
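After rebooting, you may want to confirm that the parameter really reached the kernel. Here is a small sketch that scans the kernel command line for it. The function name and the optional file argument are hypothetical helpers of mine. The pattern accepts both nvidia_drm and nvidia-drm, since the kernel treats dashes and underscores in parameter names as equivalent.

```shell
# Succeeds (exit 0) when nvidia_drm.modeset=0 is on the kernel command line.
# The optional argument is a file to read instead of /proc/cmdline (for testing).
modeset_disabled() {
    tr ' ' '\n' < "${1:-/proc/cmdline}" | grep -qx 'nvidia[-_]drm\.modeset=0'
}
```

Usage: modeset_disabled && echo "modesetting is off". While the nvidia_drm module is loaded, cat /sys/module/nvidia_drm/parameters/modeset should also report N. For offloading a single program, the NVIDIA driver documents the environment variables __NV_PRIME_RENDER_OFFLOAD=1 and __GLX_VENDOR_LIBRARY_NAME=nvidia, which you can put in front of the command.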

Delete nvidia-drm-outputclass.conf (Xorg)

Delete the file at /usr/share/X11/xorg.conf.d/10-nvidia-drm-outputclass.conf to stop Xorg from using NVIDIA.

Delete EGL External Platform NVIDIA files (Wayland)

Delete the file at /usr/share/egl/egl_external_platform.d/15_nvidia_gbm.json to stop Wayland from using NVIDIA. I couldn’t find any other way to stop Wayland from occupying the dGPU, especially on compositors like Hyprland.
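If you would rather be able to undo the two deletions above, moving the files out of the way has the same effect. This is a variation on what I describe, not what I originally did. The sketch below assumes a backup directory of your choosing (/root/nvidia-disabled in the usage line is a made-up example), and stash_file is a hypothetical helper name.

```shell
# Move a file into a backup directory instead of deleting it.
# Prints the new location so the move can be undone later.
stash_file() {
    src="$1"
    backup_dir="$2"
    [ -e "$src" ] || { echo "skip: $src not found" >&2; return 1; }
    mkdir -p "$backup_dir" || return 1
    mv -- "$src" "$backup_dir/" || return 1
    printf '%s/%s\n' "$backup_dir" "${src##*/}"
}
```

Usage, in a root shell: stash_file /usr/share/X11/xorg.conf.d/10-nvidia-drm-outputclass.conf /root/nvidia-disabled. Both files are shipped by the driver packages, so a package upgrade will probably bring them back, and the step may need repeating after updates.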

Stop VFIO from Putting GPU Into Sleep

Important Note! If you have a laptop where the NVIDIA GPU manages one of the USB 3.1 ports, vfio-pci will very possibly cause your laptop to halt while putting the dGPU into the D3 state.

This step is not mandatory for GPU Passthrough to work, but it can eliminate a lot of bugs. Normally vfio-pci puts the devices under its control into the D3 sleep state after a certain amount of time (approximately 20 seconds). In certain NVIDIA GPU configurations (like the one mentioned above) this causes the system to halt because the dGPU refuses to switch states. Disabling sleep on idle in the vfio-pci settings solves this. It has no downside here, since we are dynamically rebinding the GPU: vfio-pci is only a transitional driver between the host driver and the guest driver.

Put options vfio-pci disable_idle_d3=1 in /etc/modprobe.d/vfio.conf.
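To check that the option was picked up, you can read the parameter back from sysfs once vfio-pci is loaded. A minimal sketch, assuming the module exposes it at /sys/module/vfio_pci/parameters/disable_idle_d3 (boolean module parameters read back as Y or N). The function name and the optional path argument are my own.

```shell
# Succeeds (exit 0) when vfio-pci runs with disable_idle_d3 enabled.
# Returns 2 when the parameter file is missing, e.g. the module is not loaded.
idle_d3_disabled() {
    param_file="${1:-/sys/module/vfio_pci/parameters/disable_idle_d3}"
    [ -r "$param_file" ] || return 2
    [ "$(cat "$param_file")" = "Y" ]
}
```

Usage: idle_d3_disabled && echo "vfio-pci will leave idle devices awake". If the module was already loaded before you created the file, it has to be reloaded (or the machine rebooted) for the option to apply.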

Check PCI ID of your GPU

Use lspci -nnk to find your GPU and the other devices in its IOMMU group, and note down their PCI addresses (for example 01:00.0). We are going to need them in the next step.
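The virsh commands later on do not take the address in lspci's format but as a libvirt node device name such as pci_0000_01_00_0. The conversion is mechanical, so here is a small sketch of it. pci_to_nodedev is a hypothetical helper name, and it assumes the default PCI domain 0000 whenever lspci does not print one.

```shell
# Convert a PCI address as printed by lspci (01:00.0 or 0000:01:00.0)
# into the node device name libvirt uses (pci_0000_01_00_0).
pci_to_nodedev() {
    addr="$1"
    case "$addr" in
        ????:??:??.?) ;;                     # already includes the domain
        ??:??.?)      addr="0000:$addr" ;;   # assume the default domain 0000
        *) echo "unrecognised PCI address: $1" >&2; return 1 ;;
    esac
    printf 'pci_%s\n' "$addr" | tr ':.' '__'
}
```

For example, pci_to_nodedev 01:00.0 prints pci_0000_01_00_0. To narrow the lspci output down to NVIDIA devices only, lspci -nnk -d 10de: filters by NVIDIA's vendor ID.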

Dynamic Rebind

Before executing the following commands or starting your VM, always check whether any process is using the NVIDIA GPU with sudo fuser -v /dev/nvidia0 and nvidia-smi. Trying to rebind while the GPU is occupied may cause your system to crash.
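Because a rebind on a busy GPU can take the system down, it may be worth turning this check into a guard you can put in front of the detach commands. A minimal sketch, assuming fuser from psmisc is installed. gpu_is_busy is a name I made up, and the device nodes to inspect are passed as arguments.

```shell
# Succeeds (exit 0) when at least one of the given device nodes is held open.
# Note: fuser only sees other users' processes when it runs as root.
gpu_is_busy() {
    for dev in "$@"; do
        [ -e "$dev" ] || continue            # node absent, nothing can hold it
        fuser -s "$dev" 2>/dev/null && return 0
    done
    return 1
}
```

Usage, in a root shell: gpu_is_busy /dev/nvidia* && echo "dGPU is still in use, do not rebind". It is a guard against the obvious case, not a guarantee, so I would still glance at nvidia-smi.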

Detach

# Stop Nvidia Persistenced
sudo systemctl stop nvidia-persistenced.service
# Unload Nvidia kernel drivers
sudo modprobe -r nvidia_drm
sudo modprobe -r nvidia_modeset
sudo modprobe -r nvidia_uvm
sudo modprobe -r nvidia
# Unbind the GPU from display driver
sudo virsh nodedev-detach pci_0000_01_00_0 
sudo virsh nodedev-detach pci_0000_01_00_1
sudo virsh nodedev-detach pci_0000_01_00_2
sudo virsh nodedev-detach pci_0000_01_00_3

Replace pci_0000_01_00_0 and the other device names with the ones you noted down in the Check PCI ID of your GPU section. Note that you have to run sudo virsh nodedev-detach once for each device in the dGPU's IOMMU group.
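Since the same command is repeated for every device in the group, a loop keeps the list in one place. The sketch below is one way to write it. RUN is a hypothetical knob of mine: it defaults to echo, so the commands are only printed, and has to be set to sudo before anything is actually detached.

```shell
# Detach every libvirt node device passed as an argument.
# RUN defaults to "echo" (dry run: commands are printed, not executed);
# set RUN=sudo to really run them.
detach_all() {
    for dev in "$@"; do
        ${RUN:-echo} virsh nodedev-detach "$dev" || return 1
    done
}
```

Usage: detach_all pci_0000_01_00_0 pci_0000_01_00_1 first, to review what would run, then the same line prefixed with RUN=sudo. The loop stops at the first failure so that a half-detached group is easy to notice.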

If all has gone well, your dGPU should now be detached from the NVIDIA driver. Check it with nvidia-smi or sudo lsof /dev/nvidia*.
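Another way to confirm the result is to ask sysfs which kernel driver currently owns each device. This is a sketch under the assumption that the devices appear under /sys/bus/pci/devices with a driver symlink, which is the standard layout. pci_driver and its second argument are illustrative names of mine.

```shell
# Print the kernel driver currently bound to a PCI device, or "none".
# $1 is the full address (0000:01:00.0); $2 optionally overrides the sysfs root.
pci_driver() {
    link="${2:-/sys/bus/pci/devices}/$1/driver"
    if [ -L "$link" ]; then
        basename "$(readlink "$link")"
    else
        echo none
    fi
}
```

After a successful detach, pci_driver 0000:01:00.0 should print vfio-pci rather than nvidia.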

Create the Virtual Machine

Follow the steps in the Arch Wiki article. Everything should work as intended.

Reattaching your GPU

Reattach

# Bind the GPU to display driver
sudo virsh nodedev-reattach pci_0000_01_00_0
sudo virsh nodedev-reattach pci_0000_01_00_1
sudo virsh nodedev-reattach pci_0000_01_00_2
sudo virsh nodedev-reattach pci_0000_01_00_3
# Load Nvidia kernel drivers
sudo modprobe nvidia_drm
sudo modprobe nvidia_modeset
sudo modprobe nvidia_uvm
sudo modprobe nvidia
# Start Nvidia Persistenced
sudo systemctl start nvidia-persistenced.service

The script above should make your dGPU usable on the host again. To make detaching and reattaching easier, you could set them up as libvirt hooks. Follow the instructions here.
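To give an idea of how such a hook is wired up: libvirt runs /etc/libvirt/hooks/qemu with the guest name, an operation and a sub-operation as its first three arguments. The sketch below only covers the decision logic, meaning which of the two scripts should run for a given call. win10 is a hypothetical guest name, and hook_action and GPU_GUEST are names I invented.

```shell
# Decide what a libvirt qemu hook invocation should trigger.
# Arguments as passed by libvirt: <guest> <operation> <sub-operation> ...
# GPU_GUEST names the VM that owns the dGPU ("win10" is a placeholder).
hook_action() {
    guest="$1"; op="$2"; subop="$3"
    [ "$guest" = "${GPU_GUEST:-win10}" ] || { echo none; return 0; }
    case "$op/$subop" in
        prepare/begin) echo detach ;;     # before the guest starts
        release/end)   echo reattach ;;   # after the guest has shut down
        *)             echo none ;;
    esac
}
```

A hook script would call hook_action "$@" and run the matching block of commands. One caveat: the libvirt hook documentation warns against calling back into libvirt from inside a hook, so the virsh nodedev-detach and nodedev-reattach steps may have to be replaced with an alternative there, which I have not covered in this guide.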

What’s Next

Congratulations on your GPU Passthrough. What’s left now is to optimize your VM to get as close to bare-metal performance as possible. Check the tips in the Arch Wiki and on the VFIO subreddit.