PCI Passthrough error 'group x is not viable'
Failed to build and run instance: libvirt.libvirtError: internal error: qemu unexpectedly closed the monitor: 2022-08-05T12:05:46.630755Z qemu-system-x86_64: -device vfio-pci,host=0000:c3:00.0,id=hostdev0,bus=pci.0,addr=0x5: vfio 0000:c3:00.0: group 15 is not viable
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/nova/compute/manager.py", line 2398, in _build_and_run_instance
    self.driver.spawn(context, instance, image_meta,
  File "/usr/local/lib/python3.8/dist-packages/nova/virt/libvirt/driver.py", line 4225, in spawn
    self._create_guest_with_network(
  File "/usr/local/lib/python3.8/dist-packages/nova/virt/libvirt/driver.py", line 7293, in _create_guest_with_network
    self._cleanup(
  File "/usr/local/lib/python3.8/dist-packages/oslo_utils/excutils.py", line 227, in __exit__
    self.force_reraise()
  File "/usr/local/lib/python3.8/dist-packages/oslo_utils/excutils.py", line 200, in force_reraise
    raise self.value
  File "/usr/local/lib/python3.8/dist-packages/nova/virt/libvirt/driver.py", line 7262, in _create_guest_with_network
    guest = self._create_guest(
  File "/usr/local/lib/python3.8/dist-packages/nova/virt/libvirt/driver.py", line 7202, in _create_guest
    guest.launch(pause=pause)
  File "/usr/local/lib/python3.8/dist-packages/nova/virt/libvirt/guest.py", line 168, in launch
    LOG.exception('Error launching a defined domain with XML: %s',
  File "/usr/local/lib/python3.8/dist-packages/oslo_utils/excutils.py", line 227, in __exit__
    self.force_reraise()
  File "/usr/local/lib/python3.8/dist-packages/oslo_utils/excutils.py", line 200, in force_reraise
    raise self.value
  File "/usr/local/lib/python3.8/dist-packages/nova/virt/libvirt/guest.py", line 165, in launch
    return self._domain.createWithFlags(flags)
  File "/usr/local/lib/python3.8/dist-packages/eventlet/tpool.py", line 193, in doit
    result = proxy_call(self._autowrap, f, *args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/eventlet/tpool.py", line 151, in proxy_call
    rv = execute(f, *args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/eventlet/tpool.py", line 132, in execute
    six.reraise(c, e, tb)
  File "/usr/local/lib/python3.8/dist-packages/six.py", line 719, in reraise
    raise value
  File "/usr/local/lib/python3.8/dist-packages/eventlet/tpool.py", line 86, in tworker
    rv = meth(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/libvirt.py", line 1265, in createWithFlags
    if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
libvirt.libvirtError: internal error: qemu unexpectedly closed the monitor: 2022-08-05T12:05:46.630755Z qemu-system-x86_64: -device vfio-pci,host=0000:c3:00.0,id=hostdev0,bus=pci.0,addr=0x5: vfio 0000:c3:00.0: group 15 is not viable
Please ensure all devices within the iommu_group are bound to their vfio bus driver.

Successfully unplugged vif VIFOpenVSwitch(active=False,address=fa:16:3e:7d:97:c6,bridge_name='br-int',has_traffic_filtering=True,id=3860b4d7-48af-4e84-905e-514a7ab8c14f,network=Network(955e9ddc-604d-41dd-b2c5-df54c417615b),plugin='ovs',port_profile=VIFPortProfileOpenVSwitch,preserve_on_delete=False,vif_name='tap3860b4d7-48')
default default] [instance: fd3de719-0fa5-44f6-ab75-bf35034d0726] Took 0.25 seconds to deallocate network for instance.
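default default] Deleted allocations for instance fd3de719-0fa5-44f6-ab75-bf35034d0726

List the devices in each IOMMU group to see what shares a group with the GPU:

#!/bin/bash
# List every PCI device, prefixed with its IOMMU group number.
# Change the 999 upper bound if needed.
shopt -s nullglob
for d in /sys/kernel/iommu_groups/{0..999}/devices/*; do
    # Extract the group number from the sysfs path.
    n=${d#*/iommu_groups/*}; n=${n%%/*}
    printf 'IOMMU Group %s ' "$n"
    # The last path component is the PCI address; print its name and [vendor:device] ID.
    lspci -nns "${d##*/}"
done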
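When only one device is of interest, such as 0000:c3:00.0 from the error above, its group can be read directly from sysfs instead of listing every group. The following is a minimal sketch; the function name iommu_group_of and the SYSFS_ROOT override are assumptions introduced here for illustration and testing, not part of any existing tool.

```shell
#!/bin/bash
# Print the IOMMU group number of one PCI device, e.g. 0000:c3:00.0.
# SYSFS_ROOT is a hypothetical override (defaults to /sys) so the
# function can be pointed at a fake tree when testing.
iommu_group_of() {
    local addr root link
    addr="$1"
    root="${SYSFS_ROOT:-/sys}"
    # The iommu_group entry is a symlink whose last component is the group number.
    link="$(readlink "$root/bus/pci/devices/$addr/iommu_group")" || return 1
    printf '%s\n' "${link##*/}"
}
```

For example, iommu_group_of 0000:c3:00.0 prints the group number for that address, or returns a non-zero status if the device has no IOMMU group entry.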
In /etc/default/grub, set the line to (adding 10de:228b to the list):
GRUB_CMDLINE_LINUX_DEFAULT="amd_iommu=on iommu=pt kvm.ignore_msrs=1 vfio-pci.ids=10de:1aef,10de:2230,10de:2231,10de:24B0,10de:228b"
In /etc/modprobe.d/vfio.conf, set the line to:
options vfio-pci ids=10de:1aef,10de:2230,10de:2231,10de:24B0,10de:228b
Then run update-grub2 and reboot.
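After the reboot, it can be worth confirming that the new ID actually reached the running kernel's command line. The sketch below assumes the ID is passed through vfio-pci.ids= as in the grub line above; the function name cmdline_has_vfio_id is hypothetical, and the match is case-insensitive because the list mixes 24B0 and 228b.

```shell
#!/bin/bash
# Return 0 if a vendor:device ID appears in the vfio-pci.ids= list of a
# kernel command line string, 1 otherwise.
# Usage: cmdline_has_vfio_id <vendor:device> <cmdline-string>
cmdline_has_vfio_id() {
    local want ids word
    want="$(printf '%s' "$1" | tr 'A-Z' 'a-z')"
    # Intentionally unquoted: split the command line on whitespace.
    for word in $2; do
        case "$word" in
            # Accept both spellings of the module name (assumption: the
            # kernel treats '-' and '_' alike in parameter names).
            vfio-pci.ids=*|vfio_pci.ids=*)
                ids="$(printf '%s' "${word#*=}" | tr 'A-Z' 'a-z')"
                # Wrap in commas so only whole list entries match.
                case ",$ids," in
                    *",$want,"*) return 0 ;;
                esac
                ;;
        esac
    done
    return 1
}
```

On the host this could be called as cmdline_has_vfio_id 10de:228b "$(cat /proc/cmdline)". A non-zero status suggests the grub change did not take effect.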
It looks like libvirt can't pass the whole IOMMU group through to the guest VM because the NVIDIA audio device associated with the GPU on the A4000 has a different PCI ID from the ones on the A5000 and A6000, so it hasn't been bound to the vfio driver. When KVM tries to pass through the GPU and its associated audio device, it fails because the NVIDIA driver has the device locked. I've made a change to vfio.conf and I'm rolling it out now to test whether it works.
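To see which device in the group is still held by a host driver, the driver symlink of each group member can be inspected. This is a simplified sketch, not a full reimplementation of the kernel's viability check: it treats vfio-pci and unbound devices as acceptable and flags everything else, so a PCIe bridge bound to pcieport would also be flagged. The function name check_iommu_group and the SYSFS_ROOT override are assumptions introduced here.

```shell
#!/bin/bash
# Print "<pci-address> <driver>" for every device in one IOMMU group.
# Returns 1 if any device is bound to a driver other than vfio-pci.
# SYSFS_ROOT is a hypothetical override (defaults to /sys) for testing.
check_iommu_group() {
    local group root dev addr driver rc
    group="$1"
    root="${SYSFS_ROOT:-/sys}"
    rc=0
    for dev in "$root"/kernel/iommu_groups/"$group"/devices/*; do
        # Skip the literal pattern when the group is empty or missing.
        [ -e "$dev" ] || continue
        addr="${dev##*/}"
        if [ -L "$dev/driver" ]; then
            # The driver symlink's last component is the driver name.
            driver="$(basename "$(readlink "$dev/driver")")"
        else
            driver="none"
        fi
        printf '%s %s\n' "$addr" "$driver"
        case "$driver" in
            vfio-pci|none) ;;
            *) rc=1 ;;
        esac
    done
    return "$rc"
}
```

Running check_iommu_group 15 on the affected host lists each function of the card with its current driver; any line that does not show vfio-pci or none points at the device whose PCI ID is missing from the vfio-pci list.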
After the fix was applied: