[Solved] KVM causes kernel panic

Hi,

I have a problem with a Win10 VM that freezes, and with kernel panics on the host. Sometimes the frozen VM takes the host down with it, and only a reboot resolves that.

What I do:
I am using a Windows 10 VM and PCI passthrough for gaming.

System:
Ryzen 5 3600X
Gigabyte X570 AORUS ELITE - BIOS version F31 - 31.12.2020 (d.m.Y)
4x 8GB G.Skill Ripjaws 3600 C16 - In XMP profile 1
MSI GeForce RTX 2080 SUPER GAMING X TRIO - PCI passthrough to guest
MSI GeForce GT 1030 - host
SupaGeek 5-Port-PCI USB-3 Card - passthrough to guest
Samsung SSD 970 EVO - 500GB
Samsung SSD 860 EVO - 1TB

Disk partitions
860 EVO - NTFS drive for the VM, holding all the games
970 EVO

  • 512M UEFI Boot partition
  • 390GB luks partition - Manjaro
  • 75GB ntfs - Windows

Versions:

  • Qemu - 5.2.0
  • Virsh - 6.5.0
  • Kernel - 5.10.7-3-MANJARO / 5.11.rc3-Manjaro
  • VFIO-Guest - 0.1.190-1
  • Windows 10 - 19042.746
  • Win10 Nvidia driver - 460.89

I also observed that the panics are totally random. Sometimes the VM runs for 4-5 minutes before it freezes; yesterday it ran for more than an hour without problems.

Sometimes there isn't even a panic or any output in dmesg or journalctl, but the VM still freezes completely (including those nasty looping sound-buffer noises).

I hope I am in the right place here. If not, it would be great if you could point me to where I should post this problem.

Cross post on reddit:

Kernel panic:

Jän 27 08:42:06 martin-x570aoruselite kernel: CR2: fffffffffffffff0 CR3: 00000006afb6a000 CR4: 0000000000350ee0
Jän 27 08:42:06 martin-x570aoruselite kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jän 27 08:42:06 martin-x570aoruselite kernel: FS:  0000000000000000(0053) GS:ffff89276ea80000(002b) knlGS:000000000024d000
Jän 27 08:42:06 martin-x570aoruselite kernel: R13: 00000000000ffe07 R14: 0000000000000006 R15: 0000000000000001
Jän 27 08:42:06 martin-x570aoruselite kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 00000000ffe070d6
Jän 27 08:42:06 martin-x570aoruselite kernel: RBP: ffff8925ef82cc80 R08: 0000000100000000 R09: 0000000000000000
Jän 27 08:42:06 martin-x570aoruselite kernel: RDX: 00000000ffffffff RSI: ffff8925ef82dd38 RDI: ffffa33d41393c78
Jän 27 08:42:06 martin-x570aoruselite kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: fffffffffffffff0
Jän 27 08:42:06 martin-x570aoruselite kernel: RSP: 0018:ffffa33d41393c70 EFLAGS: 00010282
Jän 27 08:42:06 martin-x570aoruselite kernel: Code: 8b 40 10 48 81 c6 60 01 00 00 48 8d 48 f0 48 89 4f 20 48 39 c6 75 13 eb>
Jän 27 08:42:06 martin-x570aoruselite kernel: RIP: 0010:__mtrr_lookup_var_next+0x3b/0x90 [kvm]
Jän 27 08:42:06 martin-x570aoruselite kernel: ---[ end trace 40760db5febdbfe2 ]---
Jän 27 08:42:06 martin-x570aoruselite kernel: CR2: fffffffffffffff0
Jän 27 08:42:06 martin-x570aoruselite kernel:  soundcore fb_sys_fops dca wmi pinctrl_amd mac_hid acpi_cpufreq nvidia(POE) s>
Jän 27 08:42:06 martin-x570aoruselite kernel: Modules linked in: vhost_net tun vhost vhost_iotlb macvtap macvlan tap rfcomm>
Jän 27 08:42:06 martin-x570aoruselite kernel: R13: 0000000000000006 R14: 00007fe058bce640 R15: 0000000000000000
Jän 27 08:42:06 martin-x570aoruselite kernel: R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
Jän 27 08:42:06 martin-x570aoruselite kernel: RBP: 00005624b2bda840 R08: 00005624b0882b68 R09: 0000000000000038
Jän 27 08:42:06 martin-x570aoruselite kernel: RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000035
Jän 27 08:42:06 martin-x570aoruselite kernel: RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007fe05b1c6f6b
Jän 27 08:42:06 martin-x570aoruselite kernel: RSP: 002b:00007fe058bcd608 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Jän 27 08:42:06 martin-x570aoruselite kernel: Code: 89 d8 49 8d 3c 1c 48 f7 d8 49 39 c4 72 b5 e8 1c ff ff ff 85 c0 78 ba 4c>
Jän 27 08:42:06 martin-x570aoruselite kernel: RIP: 0033:0x7fe05b1c6f6b
Jän 27 08:42:06 martin-x570aoruselite kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Jän 27 08:42:06 martin-x570aoruselite kernel:  do_syscall_64+0x33/0x40
Jän 27 08:42:06 martin-x570aoruselite kernel:  __x64_sys_ioctl+0x83/0xb0
Jän 27 08:42:06 martin-x570aoruselite kernel:  kvm_vcpu_ioctl+0x25f/0x610 [kvm]
Jän 27 08:42:06 martin-x570aoruselite kernel:  ? __wake_up_common+0x7a/0x140
Jän 27 08:42:06 martin-x570aoruselite kernel:  ? pollwake+0x74/0x90
Jän 27 08:42:06 martin-x570aoruselite kernel:  kvm_arch_vcpu_ioctl_run+0xca1/0x16a0 [kvm]
Jän 27 08:42:06 martin-x570aoruselite kernel:  ? x86_virt_spec_ctrl+0xb3/0xe0
Jän 27 08:42:06 martin-x570aoruselite kernel:  ? native_load_tr_desc+0x73/0x80
Jän 27 08:42:06 martin-x570aoruselite kernel:  ? load_fixmap_gdt+0x32/0x40
Jän 27 08:42:06 martin-x570aoruselite kernel:  ? __svm_vcpu_run+0x8b/0x110 [kvm_amd]
Jän 27 08:42:06 martin-x570aoruselite kernel:  ? __svm_vcpu_run+0x97/0x110 [kvm_amd]
Jän 27 08:42:06 martin-x570aoruselite kernel:  ? _raw_spin_unlock_irqrestore+0x20/0x40
Jän 27 08:42:06 martin-x570aoruselite kernel:  kvm_mmu_page_fault+0x78/0x700 [kvm]
Jän 27 08:42:06 martin-x570aoruselite kernel:  kvm_tdp_page_fault+0x33/0x90 [kvm]
Jän 27 08:42:06 martin-x570aoruselite kernel:  kvm_mtrr_check_gfn_range_consistency+0xdd/0x130 [kvm]
Jän 27 08:42:06 martin-x570aoruselite kernel: Call Trace:
Jän 27 08:42:06 martin-x570aoruselite kernel: CR2: fffffffffffffff0 CR3: 00000006afb6a000 CR4: 0000000000350ee0
Jän 27 08:42:06 martin-x570aoruselite kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jän 27 08:42:06 martin-x570aoruselite kernel: FS:  0000000000000000(0053) GS:ffff89276ea80000(002b) knlGS:000000000024d000
Jän 27 08:42:06 martin-x570aoruselite kernel: R13: 00000000000ffe07 R14: 0000000000000006 R15: 0000000000000001
Jän 27 08:42:06 martin-x570aoruselite kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 00000000ffe070d6
Jän 27 08:42:06 martin-x570aoruselite kernel: RBP: ffff8925ef82cc80 R08: 0000000100000000 R09: 0000000000000000
Jän 27 08:42:06 martin-x570aoruselite kernel: RDX: 00000000ffffffff RSI: ffff8925ef82dd38 RDI: ffffa33d41393c78
Jän 27 08:42:06 martin-x570aoruselite kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: fffffffffffffff0
Jän 27 08:42:06 martin-x570aoruselite kernel: RSP: 0018:ffffa33d41393c70 EFLAGS: 00010282
Jän 27 08:42:06 martin-x570aoruselite kernel: Code: 8b 40 10 48 81 c6 60 01 00 00 48 8d 48 f0 48 89 4f 20 48 39 c6 75 13 eb>
Jän 27 08:42:06 martin-x570aoruselite kernel: RIP: 0010:__mtrr_lookup_var_next+0x3b/0x90 [kvm]
Jän 27 08:42:06 martin-x570aoruselite kernel: Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS ELITE/X570 AORUS ELIT>
Jän 27 08:42:06 martin-x570aoruselite kernel: CPU: 2 PID: 30051 Comm: CPU 0/KVM Tainted: P        W  OE     5.11.0-1-MANJAR>
Jän 27 08:42:06 martin-x570aoruselite kernel: Oops: 0000 [#1] PREEMPT SMP NOPTI
Jän 27 08:42:06 martin-x570aoruselite kernel: PGD 736615067 P4D 736615067 PUD 736617067 PMD 0 
Jän 27 08:42:06 martin-x570aoruselite kernel: #PF: error_code(0x0000) - not-present page
Jän 27 08:42:06 martin-x570aoruselite kernel: #PF: supervisor read access in kernel mode
Jän 27 08:42:06 martin-x570aoruselite kernel: BUG: unable to handle page fault for address: fffffffffffffff0
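Note for anyone reading the traces in this thread: they are pasted newest-first, so the "BUG: ..." line at the bottom is actually the first event and each "Call Trace:" header sits below its frames. Below is a minimal Python sketch for flipping such an excerpt into chronological order and pulling out the kernel-mode faulting symbol. The function names `chronological` and `faulting_symbols` are made up for illustration, and the regex only assumes the `RIP: 0010:symbol+0xoff/0xlen [module]` shape seen in the lines above.

```python
import re

# Kernel-mode instruction pointer line, e.g.
# "RIP: 0010:__mtrr_lookup_var_next+0x3b/0x90 [kvm]"
# User-mode lines ("RIP: 0033:0x7f...") are intentionally not matched.
RIP_RE = re.compile(
    r"RIP: 0010:(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?: \[(?P<module>\w+)\])?"
)


def chronological(lines):
    """Return a newest-first journal excerpt in oldest-first order."""
    return list(reversed(lines))


def faulting_symbols(lines):
    """Collect unique (symbol, module) pairs from kernel-mode RIP lines."""
    seen = []
    for line in lines:
        match = RIP_RE.search(line)
        if match:
            pair = (match.group("symbol"), match.group("module"))
            if pair not in seen:
                seen.append(pair)
    return seen


# Three lines copied from the first trace, in the order they were pasted.
sample = [
    "kernel: RIP: 0010:__mtrr_lookup_var_next+0x3b/0x90 [kvm]",
    "kernel: Oops: 0000 [#1] PREEMPT SMP NOPTI",
    "kernel: BUG: unable to handle page fault for address: fffffffffffffff0",
]

if __name__ == "__main__":
    for line in chronological(sample):
        print(line)
    print(faulting_symbols(sample))
```

Run over the three traces in this thread, the symbols it would pick up are the ones already visible in the RIP lines (`__mtrr_lookup_var_next`, `finish_task_switch.isra.0`, `mark_page_dirty_in_slot`); it does not interpret them.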

Here is the KVM XML:

<domain type="kvm">
  <name>win10</name>
  <uuid>74c2e459-365b-49aa-8730-25cf3f7fa5bb</uuid>
  <metadata>
    <libosinfo:libosinfo xmlns:libosinfo="http://libosinfo.org/xmlns/libvirt/domain/1.0">
      <libosinfo:os id="http://microsoft.com/win/10"/>
    </libosinfo:libosinfo>
  </metadata>
  <memory unit="KiB">16777216</memory>
  <currentMemory unit="KiB">16777216</currentMemory>
  <vcpu placement="static">8</vcpu>
  <iothreads>1</iothreads>
  <cputune>
    <vcpupin vcpu="0" cpuset="2"/>
    <vcpupin vcpu="1" cpuset="8"/>
    <vcpupin vcpu="2" cpuset="3"/>
    <vcpupin vcpu="3" cpuset="9"/>
    <vcpupin vcpu="4" cpuset="4"/>
    <vcpupin vcpu="5" cpuset="10"/>
    <vcpupin vcpu="6" cpuset="5"/>
    <vcpupin vcpu="7" cpuset="11"/>
    <emulatorpin cpuset="0,6"/>
    <iothreadpin iothread="1" cpuset="0-1,6-7"/>
  </cputune>
  <os>
    <type arch="x86_64" machine="pc-q35-5.0">hvm</type>
    <loader readonly="yes" type="pflash">/usr/share/edk2-ovmf/x64/OVMF_CODE.fd</loader>
    <nvram>/var/lib/libvirt/qemu/nvram/win10_VARS.fd</nvram>
  </os>
  <features>
    <acpi/>
    <apic/>
    <hyperv>
      <relaxed state="on"/>
      <vapic state="on"/>
      <spinlocks state="on" retries="8191"/>
      <vpindex state="on"/>
      <synic state="on"/>
      <stimer state="on"/>
      <vendor_id state="on" value="0123456789ab"/>
      <frequencies state="on"/>
    </hyperv>
    <kvm>
      <hidden state="on"/>
    </kvm>
    <vmport state="off"/>
  </features>
  <cpu mode="host-model" check="none">
    <topology sockets="1" dies="1" cores="4" threads="2"/>
    <feature policy="require" name="topoext"/>
  </cpu>
  <clock offset="localtime">
    <timer name="rtc" tickpolicy="catchup"/>
    <timer name="pit" tickpolicy="delay"/>
    <timer name="hpet" present="no"/>
    <timer name="hypervclock" present="yes"/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <pm>
    <suspend-to-mem enabled="no"/>
    <suspend-to-disk enabled="no"/>
  </pm>
  <devices>
    <emulator>/usr/bin/qemu-system-x86_64</emulator>
    <disk type="block" device="disk">
      <driver name="qemu" type="raw" cache="none" io="native" discard="unmap" iothread="1" queues="8"/>
      <source dev="/dev/disk/by-id/nvme-Samsung_SSD_970_EVO_500GB_S466NX0M758355V-part3"/>
      <target dev="vda" bus="virtio"/>
      <boot order="1"/>
      <address type="pci" domain="0x0000" bus="0x07" slot="0x00" function="0x0"/>
    </disk>
    <disk type="block" device="disk">
      <driver name="qemu" type="raw" cache="none" io="native" discard="unmap" iothread="1" queues="8"/>
      <source dev="/dev/disk/by-id/ata-Samsung_SSD_860_EVO_M.2_1TB_S415NB0M506678Z"/>
      <target dev="vdb" bus="virtio"/>
      <address type="pci" domain="0x0000" bus="0x0a" slot="0x00" function="0x0"/>
    </disk>
    <controller type="usb" index="0" model="qemu-xhci" ports="15">
      <address type="pci" domain="0x0000" bus="0x02" slot="0x00" function="0x0"/>
    </controller>
    <controller type="pci" index="0" model="pcie-root"/>
    <controller type="pci" index="1" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="1" port="0x8"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x01" function="0x0" multifunction="on"/>
    </controller>
    <controller type="pci" index="2" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="2" port="0x9"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x01" function="0x1"/>
    </controller>
    <controller type="pci" index="3" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="3" port="0xa"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x01" function="0x2"/>
    </controller>
    <controller type="pci" index="4" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="4" port="0xb"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x01" function="0x3"/>
    </controller>
    <controller type="pci" index="5" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="5" port="0xc"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x01" function="0x4"/>
    </controller>
    <controller type="pci" index="6" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="6" port="0xd"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x01" function="0x5"/>
    </controller>
    <controller type="pci" index="7" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="7" port="0xe"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x01" function="0x6"/>
    </controller>
    <controller type="pci" index="8" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="8" port="0xf"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x01" function="0x7"/>
    </controller>
    <controller type="pci" index="9" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="9" port="0x10"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x02" function="0x0" multifunction="on"/>
    </controller>
    <controller type="pci" index="10" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="10" port="0x11"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x02" function="0x1"/>
    </controller>
    <controller type="pci" index="11" model="pcie-to-pci-bridge">
      <model name="pcie-pci-bridge"/>
      <address type="pci" domain="0x0000" bus="0x09" slot="0x00" function="0x0"/>
    </controller>
    <controller type="pci" index="12" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="12" port="0x12"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x02" function="0x2"/>
    </controller>
    <controller type="sata" index="0">
      <address type="pci" domain="0x0000" bus="0x00" slot="0x1f" function="0x2"/>
    </controller>
    <interface type="direct">
      <mac address="52:54:00:26:23:72"/>
      <source dev="enp5s0" mode="bridge"/>
      <model type="virtio"/>
      <driver queues="8"/>
      <address type="pci" domain="0x0000" bus="0x01" slot="0x00" function="0x0"/>
    </interface>
    <input type="mouse" bus="ps2"/>
    <input type="keyboard" bus="ps2"/>
    <hostdev mode="subsystem" type="pci" managed="yes">
      <source>
        <address domain="0x0000" bus="0x0a" slot="0x00" function="0x0"/>
      </source>
      <address type="pci" domain="0x0000" bus="0x03" slot="0x00" function="0x0"/>
    </hostdev>
    <hostdev mode="subsystem" type="pci" managed="yes">
      <source>
        <address domain="0x0000" bus="0x0a" slot="0x00" function="0x1"/>
      </source>
      <address type="pci" domain="0x0000" bus="0x04" slot="0x00" function="0x0"/>
    </hostdev>
    <hostdev mode="subsystem" type="pci" managed="yes">
      <source>
        <address domain="0x0000" bus="0x0a" slot="0x00" function="0x2"/>
      </source>
      <address type="pci" domain="0x0000" bus="0x05" slot="0x00" function="0x0"/>
    </hostdev>
    <hostdev mode="subsystem" type="pci" managed="yes">
      <source>
        <address domain="0x0000" bus="0x06" slot="0x00" function="0x0"/>
      </source>
      <address type="pci" domain="0x0000" bus="0x06" slot="0x00" function="0x0"/>
    </hostdev>
    <memballoon model="virtio">
      <address type="pci" domain="0x0000" bus="0x08" slot="0x00" function="0x0"/>
    </memballoon>
  </devices>
</domain>
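To make the pinning in the XML above easier to review, here is a small Python sketch that reads the `<vcpupin>` entries and groups the host CPUs by guest core. It is only a sketch under assumptions: the helper names `pin_map` and `guest_core_pairs` are hypothetical, each `cpuset` is assumed to be a single CPU number (true for the vcpupin lines here, not for the emulator/iothread pins), and guest vCPUs are assumed to be numbered so that threads of the same guest core are adjacent.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the pinning and topology parts of the domain XML above.
DOMAIN_XML = """
<domain type="kvm">
  <vcpu placement="static">8</vcpu>
  <cputune>
    <vcpupin vcpu="0" cpuset="2"/>
    <vcpupin vcpu="1" cpuset="8"/>
    <vcpupin vcpu="2" cpuset="3"/>
    <vcpupin vcpu="3" cpuset="9"/>
    <vcpupin vcpu="4" cpuset="4"/>
    <vcpupin vcpu="5" cpuset="10"/>
    <vcpupin vcpu="6" cpuset="5"/>
    <vcpupin vcpu="7" cpuset="11"/>
  </cputune>
  <cpu mode="host-model" check="none">
    <topology sockets="1" dies="1" cores="4" threads="2"/>
  </cpu>
</domain>
"""


def pin_map(xml_text):
    """Map guest vCPU number -> pinned host CPU number."""
    root = ET.fromstring(xml_text)
    return {
        int(pin.get("vcpu")): int(pin.get("cpuset"))
        for pin in root.iter("vcpupin")
    }


def guest_core_pairs(xml_text):
    """Group pinned host CPUs by guest core, using <topology threads=...>."""
    root = ET.fromstring(xml_text)
    threads = int(root.find("./cpu/topology").get("threads"))
    pins = pin_map(xml_text)
    vcpus = sorted(pins)
    return [
        tuple(pins[v] for v in vcpus[i:i + threads])
        for i in range(0, len(vcpus), threads)
    ]


if __name__ == "__main__":
    for core, cpus in enumerate(guest_core_pairs(DOMAIN_XML)):
        print(f"guest core {core}: host CPUs {cpus}")
```

For this config the groups are (2, 8), (3, 9), (4, 10) and (5, 11). Each group can then be compared by hand with the host's `/sys/devices/system/cpu/cpuN/topology/thread_siblings_list` to see whether the two threads of a guest core really share a physical core.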

New kernel panic

Jän 27 17:05:49 martin-x570aoruselite kernel:  secondary_startup_64_no_verify+0xc2/0xcb
Jän 27 17:05:49 martin-x570aoruselite kernel:  cpu_startup_entry+0x19/0x20
Jän 27 17:05:49 martin-x570aoruselite kernel:  do_idle+0x176/0x280
Jän 27 17:05:49 martin-x570aoruselite kernel:  schedule_idle+0x28/0x40
Jän 27 17:05:49 martin-x570aoruselite kernel:  __schedule+0x6af/0x8c0
Jän 27 17:05:49 martin-x570aoruselite kernel:  __schedule_bug.cold+0x89/0x97
Jän 27 17:05:49 martin-x570aoruselite kernel:  dump_stack+0x6b/0x83
Jän 27 17:05:49 martin-x570aoruselite kernel: Call Trace:
Jän 27 17:05:49 martin-x570aoruselite kernel: Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS ELITE/X570 AORUS ELITE, BIOS F31 12/31/2020
Jän 27 17:05:49 martin-x570aoruselite kernel: CPU: 2 PID: 0 Comm: swapper/2 Tainted: P        W  OE     5.11.0-1-MANJARO #1
Jän 27 17:05:49 martin-x570aoruselite kernel: [<ffffffffb5695fbf>] irq_enter_rcu+0xf/0x50
Jän 27 17:05:49 martin-x570aoruselite kernel: Preemption disabled at:
Jän 27 17:05:49 martin-x570aoruselite kernel:  fb_sys_fops i2c_algo_bit dca pinctrl_amd mac_hid wmi acpi_cpufreq nvidia(POE) sg fuse crypto_user ip_tables x_tables btrfs blake2b_generic libcrc32c crc32c_generic xor raid6_pq usbhid dm_crypt cbc encryp>
Jän 27 17:05:49 martin-x570aoruselite kernel: Modules linked in: vhost_net tun vhost vhost_iotlb macvtap macvlan tap rfcomm uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common snd_usb_audio videodev snd_usbmidi_lib snd_rawmidi>
Jän 27 17:05:49 martin-x570aoruselite kernel: BUG: scheduling while atomic: swapper/2/0/0x00000000
Jän 27 17:05:00 martin-x570aoruselite kernel: ---[ end trace 37f281ed6dc6d515 ]---
Jän 27 17:05:00 martin-x570aoruselite kernel:  secondary_startup_64_no_verify+0xc2/0xcb
Jän 27 17:05:00 martin-x570aoruselite kernel:  cpu_startup_entry+0x19/0x20
Jän 27 17:05:00 martin-x570aoruselite kernel:  do_idle+0x176/0x280
Jän 27 17:05:00 martin-x570aoruselite kernel:  schedule_idle+0x28/0x40
Jän 27 17:05:00 martin-x570aoruselite kernel:  __schedule+0x2e2/0x8c0
Jän 27 17:05:00 martin-x570aoruselite kernel:  ? __switch_to+0x283/0x470
Jän 27 17:05:00 martin-x570aoruselite kernel: Call Trace:
Jän 27 17:05:00 martin-x570aoruselite kernel: CR2: 0000000000000000 CR3: 00000001fea38000 CR4: 0000000000350ee0
Jän 27 17:05:00 martin-x570aoruselite kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jän 27 17:05:00 martin-x570aoruselite kernel: FS:  0000000000000000(0000) GS:ffffa0852ec00000(0000) knlGS:0000000000000000
Jän 27 17:05:00 martin-x570aoruselite kernel: R13: ffffa07e869a9f40 R14: 0000000000000000 R15: 0000000000000000
Jän 27 17:05:00 martin-x570aoruselite kernel: R10: ffffa6c3801afc90 R11: ffffa0854f328ee8 R12: ffffa0852ec2c3c0
Jän 27 17:05:00 martin-x570aoruselite kernel: RBP: ffffa6c3801afe88 R08: 0000000000000000 R09: ffffa6c3801afc98
Jän 27 17:05:00 martin-x570aoruselite kernel: RDX: 0000000000000002 RSI: ffffffffb699e967 RDI: 00000000ffffffff
Jän 27 17:05:00 martin-x570aoruselite kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
Jän 27 17:05:00 martin-x570aoruselite kernel: RSP: 0018:ffffa6c3801afe58 EFLAGS: 00010082
Jän 27 17:05:00 martin-x570aoruselite kernel: Code: ff 65 48 8b 04 25 c0 7b 01 00 8b 90 80 05 00 00 48 8d b0 50 07 00 00 48 c7 c7 f8 0e 99 b6 c6 05 38 2b 70 01 01 e8 f9 dd 9d 00 <0f> 0b eb ae eb 0b 0f 1f 00 0f 01 e8 e9 5a fe ff ff 8c d0 50 54 48
Jän 27 17:05:00 martin-x570aoruselite kernel: RIP: 0010:finish_task_switch.isra.0+0x275/0x2a0
Jän 27 17:05:00 martin-x570aoruselite kernel: Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS ELITE/X570 AORUS ELITE, BIOS F31 12/31/2020
Jän 27 17:05:00 martin-x570aoruselite kernel: CPU: 8 PID: 0 Comm: swapper/8 Tainted: P           OE     5.11.0-1-MANJARO #1
Jän 27 17:05:00 martin-x570aoruselite kernel:  fb_sys_fops i2c_algo_bit dca pinctrl_amd mac_hid wmi acpi_cpufreq nvidia(POE) sg fuse crypto_user ip_tables x_tables btrfs blake2b_generic libcrc32c crc32c_generic xor raid6_pq usbhid dm_crypt cbc encryp>
Jän 27 17:05:00 martin-x570aoruselite kernel: Modules linked in: vhost_net tun vhost vhost_iotlb macvtap macvlan tap rfcomm uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common snd_usb_audio videodev snd_usbmidi_lib snd_rawmidi>
Jän 27 17:05:00 martin-x570aoruselite kernel: WARNING: CPU: 8 PID: 0 at kernel/sched/core.c:4160 finish_task_switch.isra.0+0x275/0x2a0
Jän 27 17:05:00 martin-x570aoruselite kernel: corrupted preempt_count: swapper/8/0/0x1
Jän 27 17:05:00 martin-x570aoruselite kernel: ------------[ cut here ]------------

Even after switching to the LTS kernel, I still get a kernel panic:

Jän 27 22:22:20 martin-x570aoruselite kernel: CR2: ffffce837667a340 CR3: 00000002fd52c000 CR4: 0000000000340ee0
Jän 27 22:22:20 martin-x570aoruselite kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jän 27 22:22:20 martin-x570aoruselite kernel: FS:  00007f42dede3640(0000) GS:ffff8ac3ee880000(0000) knlGS:0000000000000000
Jän 27 22:22:20 martin-x570aoruselite kernel: R13: 0000000000000001 R14: ffffb147c388fce3 R15: ffffb147c2d30000
Jän 27 22:22:20 martin-x570aoruselite kernel: R10: 0000000000003550 R11: 0000000000002cc8 R12: 000000000000e000
Jän 27 22:22:20 martin-x570aoruselite kernel: RBP: ffff8ac2bf22b6f0 R08: 0000000000000001 R09: 0000000000000000
Jän 27 22:22:20 martin-x570aoruselite kernel: RDX: 0000000000000001 RSI: 000000000000000e RDI: ffffce837667a330
Jän 27 22:22:20 martin-x570aoruselite kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
Jän 27 22:22:20 martin-x570aoruselite kernel: RSP: 0018:ffffb147c388fc98 EFLAGS: 00010286
Jän 27 22:22:20 martin-x570aoruselite kernel: Code: 41 5e c3 45 31 e4 f7 c5 fe ff ff ff 74 df 0f 0b eb db 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 0f 1f 44 00 00 48 85 ff 74 14 <48> 8b 47 10 48 85 c0 74 0b 48 2b 37 48 63 f6 f0 48 0f ab 30 c3 90
Jän 27 22:22:20 martin-x570aoruselite kernel: RIP: 0010:mark_page_dirty_in_slot+0xa/0x20 [kvm]
Jän 27 22:22:20 martin-x570aoruselite kernel: ---[ end trace d8231b798b30899e ]---
Jän 27 22:22:20 martin-x570aoruselite kernel: CR2: ffffce837667a340
Jän 27 22:22:20 martin-x570aoruselite kernel:  btrfs libcrc32c crc32c_generic xor raid6_pq hid_generic usbhid hid dm_crypt dm_mod sd_mod usb_storage crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel ahci aesni_intel libahci crypto_simd l>
Jän 27 22:22:20 martin-x570aoruselite kernel: Modules linked in: nvidia_uvm(OE) vhost_net tun vhost macvtap macvlan tap rfcomm squashfs ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter loop cmac algif_hash algif_skcipher af_alg bnep >
Jän 27 22:22:20 martin-x570aoruselite kernel: R13: 0000000000000006 R14: 00007f42dede3640 R15: 0000000000000000
Jän 27 22:22:20 martin-x570aoruselite kernel: R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
Jän 27 22:22:20 martin-x570aoruselite kernel: RBP: 0000561428cf4430 R08: 000056142654fb68 R09: 00000000ffffffff
Jän 27 22:22:20 martin-x570aoruselite kernel: RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000032
Jän 27 22:22:20 martin-x570aoruselite kernel: RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007f42e0b99f6b
Jän 27 22:22:20 martin-x570aoruselite kernel: RSP: 002b:00007f42dede2608 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Jän 27 22:22:20 martin-x570aoruselite kernel: Code: 89 d8 49 8d 3c 1c 48 f7 d8 49 39 c4 72 b5 e8 1c ff ff ff 85 c0 78 ba 4c 89 e0 5b 5d 41 5c c3 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d d5 ae 0c 00 f7 d8 64 89 01 48
Jän 27 22:22:20 martin-x570aoruselite kernel: RIP: 0033:0x7f42e0b99f6b
Jän 27 22:22:20 martin-x570aoruselite kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Jän 27 22:22:20 martin-x570aoruselite kernel:  do_syscall_64+0x49/0x90
Jän 27 22:22:20 martin-x570aoruselite kernel:  __x64_sys_ioctl+0x16/0x20
Jän 27 22:22:20 martin-x570aoruselite kernel:  ksys_ioctl+0x5e/0x90
Jän 27 22:22:20 martin-x570aoruselite kernel:  do_vfs_ioctl+0x3eb/0x6c0
Jän 27 22:22:20 martin-x570aoruselite kernel:  kvm_vcpu_ioctl+0x263/0x620 [kvm]
Jän 27 22:22:20 martin-x570aoruselite kernel:  ? check_preempt_curr+0x7e/0x90
Jän 27 22:22:20 martin-x570aoruselite kernel:  kvm_arch_vcpu_ioctl_run+0x704/0x1900 [kvm]
Jän 27 22:22:20 martin-x570aoruselite kernel:  kvm_lapic_sync_to_vapic+0x14c/0x200 [kvm]
Jän 27 22:22:20 martin-x570aoruselite kernel:  kvm_write_guest_offset_cached+0x9a/0xe0 [kvm]
Jän 27 22:22:20 martin-x570aoruselite kernel: Call Trace:
Jän 27 22:22:20 martin-x570aoruselite kernel: CR2: ffffce837667a340 CR3: 00000002fd52c000 CR4: 0000000000340ee0
Jän 27 22:22:20 martin-x570aoruselite kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jän 27 22:22:20 martin-x570aoruselite kernel: FS:  00007f42dede3640(0000) GS:ffff8ac3ee880000(0000) knlGS:0000000000000000
Jän 27 22:22:20 martin-x570aoruselite kernel: R13: 0000000000000001 R14: ffffb147c388fce3 R15: ffffb147c2d30000
Jän 27 22:22:20 martin-x570aoruselite kernel: R10: 0000000000003550 R11: 0000000000002cc8 R12: 000000000000e000
Jän 27 22:22:20 martin-x570aoruselite kernel: RBP: ffff8ac2bf22b6f0 R08: 0000000000000001 R09: 0000000000000000
Jän 27 22:22:20 martin-x570aoruselite kernel: RDX: 0000000000000001 RSI: 000000000000000e RDI: ffffce837667a330
Jän 27 22:22:20 martin-x570aoruselite kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
Jän 27 22:22:20 martin-x570aoruselite kernel: RSP: 0018:ffffb147c388fc98 EFLAGS: 00010286
Jän 27 22:22:20 martin-x570aoruselite kernel: Code: 41 5e c3 45 31 e4 f7 c5 fe ff ff ff 74 df 0f 0b eb db 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 0f 1f 44 00 00 48 85 ff 74 14 <48> 8b 47 10 48 85 c0 74 0b 48 2b 37 48 63 f6 f0 48 0f ab 30 c3 90
Jän 27 22:22:20 martin-x570aoruselite kernel: RIP: 0010:mark_page_dirty_in_slot+0xa/0x20 [kvm]
Jän 27 22:22:20 martin-x570aoruselite kernel: Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS ELITE/X570 AORUS ELITE, BIOS F33a 01/22/2021
Jän 27 22:22:20 martin-x570aoruselite kernel: CPU: 2 PID: 13224 Comm: CPU 0/KVM Tainted: P           OE     5.4.89-1-MANJARO #1
Jän 27 22:22:20 martin-x570aoruselite kernel: Oops: 0000 [#1] PREEMPT SMP NOPTI
Jän 27 22:22:20 martin-x570aoruselite kernel: PGD 0 P4D 0 
Jän 27 22:22:20 martin-x570aoruselite kernel: #PF: error_code(0x0000) - not-present page
Jän 27 22:22:20 martin-x570aoruselite kernel: #PF: supervisor read access in kernel mode
Jän 27 22:22:20 martin-x570aoruselite kernel: BUG: unable to handle page fault for address: ffffce837667a340

Found the culprit.

It seems that the CPU pinning part is causing these issues. After removing it, the VM ran for 8 hours straight without any problems.

Performance is not a problem, so removing it is not an issue for me.
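For anyone who wants to try the same workaround, here is a rough Python sketch that drops the pinning from a dumped domain XML. It assumes the whole `<cputune>` element is removed (vcpupin, emulatorpin and iothreadpin together); the post does not say exactly which lines were deleted, so treat that as a guess. The function name `strip_cputune` is made up, and editing by hand with `virsh edit win10` is probably simpler in practice.

```python
import xml.etree.ElementTree as ET

# Keep the libosinfo prefix from the original XML instead of an
# auto-generated "ns0" prefix when the tree is serialized again.
ET.register_namespace(
    "libosinfo", "http://libosinfo.org/xmlns/libvirt/domain/1.0"
)

# Trimmed example input; a real run would use the full dumped domain XML.
SAMPLE_XML = """
<domain type="kvm">
  <name>win10</name>
  <vcpu placement="static">8</vcpu>
  <iothreads>1</iothreads>
  <cputune>
    <vcpupin vcpu="0" cpuset="2"/>
    <vcpupin vcpu="1" cpuset="8"/>
    <emulatorpin cpuset="0,6"/>
    <iothreadpin iothread="1" cpuset="0-1,6-7"/>
  </cputune>
</domain>
"""


def strip_cputune(xml_text):
    """Return the domain XML with every top-level <cputune> removed."""
    root = ET.fromstring(xml_text)
    for node in root.findall("cputune"):
        root.remove(node)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(strip_cputune(SAMPLE_XML))
```

The result still has to be loaded back into libvirt (for example with `virsh define` on the saved file) and the VM restarted before the change takes effect.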
