All kernels after 5.4 crash after resuming from suspend/sleep

Hi, I’m still stuck on kernel 5.4.

Normally I run the newest available kernel, but every newer version and point release crashes or freezes my system after waking from suspend/sleep.

This includes the new kernel 5.9.

Kernel 5.4, however, is rock solid.

I have an AMD RX 570 graphics card, use ZFS as well as an OPAL NVMe drive with onboard encryption, and have 2 monitors (I read there was a problem with multiple monitors and AMD on some kernels, but this is supposedly fixed).

Full system stats (on the stable kernel):

inxi -Fxzc0    
System:
  Kernel: 5.4.74-1-MANJARO x86_64 bits: 64 compiler: gcc v: 10.2.0 
  Console: tty 0 Distro: Manjaro Linux 
Machine:
  Type: Desktop Mobo: ASRock model: X370 Professional Gaming 
  serial: <filter> UEFI: American Megatrends v: P3.30 date: 01/15/2018 
CPU:
  Info: 8-Core model: AMD Ryzen 7 1800X bits: 64 type: MT MCP arch: Zen 
  rev: 1 L2 cache: 4096 KiB 
  flags: avx avx2 lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm 
  bogomips: 115245 
  Speed: 2725 MHz min/max: 2200/3600 MHz boost: enabled Core speeds (MHz): 
  1: 2725 2: 2674 3: 2025 4: 1909 5: 1928 6: 1914 7: 1951 8: 2063 9: 1910 
  10: 1868 11: 1850 12: 1850 13: 1853 14: 1846 15: 1886 16: 1888 
Graphics:
  Device-1: AMD Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] 
  vendor: Micro-Star MSI driver: amdgpu v: kernel bus ID: 0e:00.0 
  Display: server: X.org 1.20.9 driver: amdgpu FAILED: ati 
  unloaded: modesetting tty: 80x24 
  Message: Advanced graphics data unavailable in console for root. 
Audio:
  Device-1: AMD Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590] 
  vendor: Micro-Star MSI driver: snd_hda_intel v: kernel bus ID: 0e:00.1 
  Device-2: AMD Family 17h HD Audio vendor: ASRock driver: snd_hda_intel 
  v: kernel bus ID: 12:00.3 
  Device-3: Plantronics Plantronics GameCom 780 type: USB 
  driver: plantronics,snd-usb-audio,usbhid bus ID: 3-3:3 
  Sound Server: ALSA v: k5.4.74-1-MANJARO 
Network:
  Device-1: Aquantia AQC108 NBase-T/IEEE 802.3bz Ethernet [AQtion] 
  vendor: ASRock driver: atlantic v: 5.4.74-1-MANJARO-kern port: N/A 
  bus ID: 05:00.0 
  IF: enp5s0 state: up speed: 1000 Mbps duplex: full mac: <filter> 
  Device-2: Intel Dual Band Wireless-AC 3168NGW [Stone Peak] driver: iwlwifi 
  v: kernel port: e000 bus ID: 09:00.0 
  IF: wlp9s0 state: down mac: <filter> 
  Device-3: Intel I211 Gigabit Network vendor: ASRock driver: igb v: 5.6.0-k 
  port: d000 bus ID: 0b:00.0 
  IF: enp11s0 state: down mac: <filter> 
  IF-ID-1: wg0 state: unknown speed: N/A duplex: N/A mac: N/A 
Drives:
  Local Storage: total: 40.02 TiB used: 14.66 TiB (36.6%) 
  ID-1: /dev/nvme0n1 vendor: Samsung model: SSD 960 EVO 500GB 
  size: 465.76 GiB 
  ID-2: /dev/sda vendor: Seagate model: ST10000VN0004-1ZD101 size: 9.10 TiB 
  ID-3: /dev/sdb type: USB vendor: JMicron Tech model: Generic 
  size: 465.76 GiB 
  ID-4: /dev/sdc vendor: Seagate model: ST10000VN0004-1ZD101 size: 9.10 TiB 
  ID-5: /dev/sdd vendor: Seagate model: ST10000VN0004-1ZD101 size: 9.10 TiB 
  ID-6: /dev/sde vendor: Western Digital model: WD30EFRX-68EUZN0 
  size: 2.73 TiB 
  ID-7: /dev/sdf vendor: Seagate model: ST10000VN0004-1ZD101 size: 9.10 TiB 
RAID:
  Device-1: tank type: zfs status: ONLINE size: 36.20 TiB free: 5.32 TiB 
  Components: online: N/A 
Partition:
  ID-1: / size: 455.16 GiB used: 302.32 GiB (66.4%) fs: ext4 
  dev: /dev/nvme0n1p2 
Swap:
  Alert: No Swap data was found. 
Sensors:
  System Temperatures: cpu: 48.6 C mobo: 42.0 C gpu: amdgpu temp: 53.0 C 
  Fan Speeds (RPM): fan-1: 873 fan-2: 829 fan-3: 734 fan-4: 0 fan-5: 705 
  gpu: amdgpu fan: 1039 
  Power: 12v: N/A 5v: N/A 3.3v: 3.34 vbat: 3.28 
Info:
  Processes: 551 Uptime: 4m Memory: 62.81 GiB used: 5.26 GiB (8.4%) 
  Init: systemd Compilers: gcc: 10.2.0 Packages: 1671 Shell: Bash v: 5.0.18 
  inxi: 3.1.08

Logs:

The first noteworthy error happened during the process of going to sleep mode:

Nov 08 00:02:31 ****.****.de systemd-coredump[24739]: Process 4620 (ksystemstats) of user 1001 dumped core.
                                                               
                                                               Stack trace of thread 4620:
                                                               #0  0x00007fc8f3cfe00b _ZNK15AggregateSensor5valueEv (libksgrdbackend.so + 0x700b)
                                                               #1  0x000055fd447fa942 n/a (ksystemstats + 0xb942)
                                                               #2  0x00007fc8f3a0a036 n/a (libQt5Core.so.5 + 0x2eb036)
                                                               #3  0x00007fc8f3cfdad7 n/a (libksgrdbackend.so + 0x6ad7)
                                                               #4  0x00007fc8f3a0e143 n/a (libQt5Core.so.5 + 0x2ef143)
                                                               #5  0x00007fc8f39ff71f _ZN7QObject5eventEP6QEvent (libQt5Core.so.5 + 0x2e071f)
                                                               #6  0x00007fc8f39d2cb0 _ZN16QCoreApplication15notifyInternal2EP7QObjectP6QEvent (libQt5Core.so.5 + 0x2b3cb0)
                                                               #7  0x00007fc8f3a2acc5 _ZN14QTimerInfoList14activateTimersEv (libQt5Core.so.5 + 0x30bcc5)
                                                               #8  0x00007fc8f3a2b572 n/a (libQt5Core.so.5 + 0x30c572)
                                                               #9  0x00007fc8f298c914 g_main_context_dispatch (libglib-2.0.so.0 + 0x52914)
                                                               #10 0x00007fc8f29e07d1 n/a (libglib-2.0.so.0 + 0xa67d1)
                                                               #11 0x00007fc8f298b121 g_main_context_iteration (libglib-2.0.so.0 + 0x51121)
                                                               #12 0x00007fc8f3a2b941 _ZN20QEventDispatcherGlib13processEventsE6QFlagsIN10QEventLoop17ProcessEventsFlagEE (libQt5Core.so.5 + 0x30c941)
                                                               #13 0x00007fc8f39d165c _ZN10QEventLoop4execE6QFlagsINS_17ProcessEventsFlagEE (libQt5Core.so.5 + 0x2b265c)
                                                               #14 0x00007fc8f39d9af4 _ZN16QCoreApplication4execEv (libQt5Core.so.5 + 0x2baaf4)
                                                               #15 0x000055fd447f4073 n/a (ksystemstats + 0x5073)
                                                               #16 0x00007fc8f33a1152 __libc_start_main (libc.so.6 + 0x28152)
                                                               #17 0x000055fd447f40de _start (ksystemstats + 0x50de)
                                                               
                                                               Stack trace of thread 4621:
                                                               #0  0x00007fc8f346e46f __poll (libc.so.6 + 0xf546f)
                                                               #1  0x00007fc8f29e075f n/a (libglib-2.0.so.0 + 0xa675f)
                                                               #2  0x00007fc8f298b121 g_main_context_iteration (libglib-2.0.so.0 + 0x51121)
                                                               #3  0x00007fc8f3a2b941 _ZN20QEventDispatcherGlib13processEventsE6QFlagsIN10QEventLoop17ProcessEventsFlagEE (libQt5Core.so.5 + 0x30c941)
                                                               #4  0x00007fc8f39d165c _ZN10QEventLoop4execE6QFlagsINS_17ProcessEventsFlagEE (libQt5Core.so.5 + 0x2b265c)
                                                               #5  0x00007fc8f37ebca2 _ZN7QThread4execEv (libQt5Core.so.5 + 0xccca2)
                                                               #6  0x00007fc8f3c88098 n/a (libQt5DBus.so.5 + 0x17098)
                                                               #7  0x00007fc8f37ece8f n/a (libQt5Core.so.5 + 0xcde8f)
                                                               #8  0x00007fc8f335e3e9 start_thread (libpthread.so.0 + 0x93e9)
                                                               #9  0x00007fc8f3479293 __clone (libc.so.6 + 0x100293)

Then during wake-up I first get this:

Nov 08 16:54:48 ****.****.de kernel: iommu ivhd0: AMD-Vi: Event logged [INVALID_DEVICE_REQUEST device=00:00.0 pasid=0x00000 address=0xfffffffdf8000000 flags=0x

Followed by this (the network adapter seems to be completely down):

Nov 08 16:54:48 ****.****.de nmbd[2153]: [2020/11/08 16:54:48.160717,  0] ../../source3/libsmb/nmblib.c:922(send_udp)
Nov 08 16:54:48 ****.****.de nmbd[2153]:   Packet send failed to 192.168.0.255(138) ERRNO=Network is unreachable
Nov 08 16:54:48 ****.****.de nmbd[2153]: [2020/11/08 16:54:48.164723,  0] ../../source3/libsmb/nmblib.c:922(send_udp)
Nov 08 16:54:48 ****.****.de nmbd[2153]:   Packet send failed to 192.168.0.255(137) ERRNO=Network is unreachable
Nov 08 16:54:48 ****.****.de nmbd[2153]: [2020/11/08 16:54:48.164766,  0] ../../source3/nmbd/nmbd_packets.c:179(send_netbios_packet)
Nov 08 16:54:48 ****.****.de nmbd[2153]:   send_netbios_packet: send_packet() to IP 192.168.0.255 port 137 failed
Nov 08 16:54:48 ****.****.de nmbd[2153]: [2020/11/08 16:54:48.164788,  0] ../../source3/nmbd/nmbd_namequery.c:245(query_name)
Nov 08 16:54:48 ****.****.de nmbd[2153]:   query_name: Failed to send packet trying to query name WORKGROUP<1d>
Nov 08 16:54:48 ****.****.de nmbd[2153]: [2020/11/08 16:54:48.165136,  0] ../../source3/nmbd/nmbd.c:359(reload_interfaces)
Nov 08 16:54:48 ****.****.de nmbd[2153]:   reload_interfaces: No subnets to listen to. Waiting..

And finally this: kernel NULL pointer dereference

Nov 08 16:55:04 ****.****.de kernel: BUG: kernel NULL pointer dereference, address: 0000000000000008
Nov 08 16:55:04 ****.****.de kernel: #PF: supervisor read access in kernel mode
Nov 08 16:55:04 ****.****.de kernel: #PF: error_code(0x0000) - not-present page
Nov 08 16:55:04 ****.****.de kernel: PGD 0 P4D 0 
Nov 08 16:55:04 ****.****.de kernel: Oops: 0000 [#1] PREEMPT SMP NOPTI
Nov 08 16:55:04 ****.****.de kernel: CPU: 0 PID: 1640 Comm: NetworkManager Tainted: P           OE     5.9.3-1-MANJARO #1
Nov 08 16:55:04 ****.****.de kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X370 Professional Gaming, BIOS P3.30 01/15/2018
Nov 08 16:55:04 ****.****.de kernel: RIP: 0010:aq_ring_rx_fill+0xd1/0x200 [atlantic]
Nov 08 16:55:04 ****.****.de kernel: Code: 45 24 ba 00 00 00 00 83 c0 01 3b 45 28 48 0f 43 c2 89 45 24 41 83 ee 01 0f 84 f3 00 00 00 48 8d 1c 40 48 c1 e3 04 48 03 5d 00 <48> 8b 43 08 48 c7 43 28 00 08 00 00 48 85 c0 75 85 48 8b 45 10 31
Nov 08 16:55:04 ****.****.de kernel: RSP: 0018:ffffa7f793fbf390 EFLAGS: 00010246
Nov 08 16:55:04 ****.****.de kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
Nov 08 16:55:04 ****.****.de kernel: RDX: 0000000000000000 RSI: 0000000000006100 RDI: ffff8eb715f0d3b8
Nov 08 16:55:04 ****.****.de kernel: RBP: ffff8eb715f0d3b8 R08: 0000000000000000 R09: 0000000000008000
Nov 08 16:55:04 ****.****.de kernel: R10: 00000000ffffffff R11: ffffcd3db52933c0 R12: 0000000000001000
Nov 08 16:55:04 ****.****.de kernel: R13: 0000000000000000 R14: 00000000ffffffff R15: 0000000000000000
Nov 08 16:55:04 ****.****.de kernel: FS:  00007ff8aaed18c0(0000) GS:ffff8eb71ee00000(0000) knlGS:0000000000000000
Nov 08 16:55:04 ****.****.de kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 08 16:55:04 ****.****.de kernel: CR2: 0000000000000008 CR3: 0000000fa5f0e000 CR4: 00000000003506f0
Nov 08 16:55:04 ****.****.de kernel: Call Trace:
Nov 08 16:55:04 ****.****.de kernel:  aq_vec_init+0x8c/0xf0 [atlantic]
Nov 08 16:55:04 ****.****.de kernel:  aq_nic_init+0xc3/0x1c0 [atlantic]
Nov 08 16:55:04 ****.****.de kernel:  aq_ndev_open+0x19/0x60 [atlantic]
Nov 08 16:55:04 ****.****.de kernel:  __dev_open+0xfb/0x1b0
Nov 08 16:55:04 ****.****.de kernel:  __dev_change_flags+0x1a5/0x210
Nov 08 16:55:04 ****.****.de kernel:  dev_change_flags+0x21/0x60
Nov 08 16:55:04 ****.****.de kernel:  do_setlink+0x2bc/0x1160
Nov 08 16:55:04 ****.****.de kernel:  ? __kmalloc_node_track_caller+0x178/0x340
Nov 08 16:55:04 ****.****.de kernel:  ? __nla_validate_parse+0x5f/0x910
Nov 08 16:55:04 ****.****.de kernel:  __rtnl_newlink+0x65f/0x9e0
Nov 08 16:55:04 ****.****.de kernel:  rtnl_newlink+0x44/0x70
Nov 08 16:55:04 ****.****.de kernel:  rtnetlink_rcv_msg+0x13e/0x390
Nov 08 16:55:04 ****.****.de kernel:  ? rtnl_calcit.isra.0+0x120/0x120
Nov 08 16:55:04 ****.****.de kernel:  netlink_rcv_skb+0x75/0x140
Nov 08 16:55:04 ****.****.de kernel:  netlink_unicast+0x242/0x340
Nov 08 16:55:04 ****.****.de kernel:  netlink_sendmsg+0x243/0x480
Nov 08 16:55:04 ****.****.de kernel:  sock_sendmsg+0x5e/0x60
Nov 08 16:55:04 ****.****.de kernel:  ____sys_sendmsg+0x25a/0x2a0
Nov 08 16:55:04 ****.****.de kernel:  ? copy_msghdr_from_user+0x6e/0xa0
Nov 08 16:55:04 ****.****.de kernel:  ___sys_sendmsg+0x97/0xe0
Nov 08 16:55:04 ****.****.de kernel:  ? addrconf_sysctl_forward+0x12b/0x270
Nov 08 16:55:04 ****.****.de kernel:  __sys_sendmsg+0x81/0xd0
Nov 08 16:55:04 ****.****.de kernel:  do_syscall_64+0x33/0x40
Nov 08 16:55:04 ****.****.de kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Nov 08 16:55:04 ****.****.de kernel: RIP: 0033:0x7ff8abbddddd
Nov 08 16:55:04 ****.****.de kernel: Code: 28 89 54 24 1c 48 89 74 24 10 89 7c 24 08 e8 4a ee ff ff 8b 54 24 1c 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 33 44 89 c7 48 89 44 24 08 e8 9e ee ff ff 48
Nov 08 16:55:04 ****.****.de kernel: RSP: 002b:00007ffe0ef415b0 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
Nov 08 16:55:04 ****.****.de kernel: RAX: ffffffffffffffda RBX: 000055c5045ca030 RCX: 00007ff8abbddddd
Nov 08 16:55:04 ****.****.de kernel: RDX: 0000000000000000 RSI: 00007ffe0ef415f0 RDI: 000000000000000c
Nov 08 16:55:04 ****.****.de kernel: RBP: 0000000000000134 R08: 0000000000000000 R09: 0000000000000000
Nov 08 16:55:04 ****.****.de kernel: R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000
Nov 08 16:55:04 ****.****.de kernel: R13: 00007ffe0ef41740 R14: 00007ffe0ef4173c R15: 0000000000000000
Nov 08 16:55:04 ****.****.de kernel: Modules linked in: rfcomm snd_seq_dummy snd_hrtimer snd_seq fuse cmac algif_hash algif_skcipher af_alg bnep nct6775 hwmon_vid dm_crypt cbc encrypted_keys trusted tpm hid_plantronics wireguard curve25519_x86_64 libchacha20poly1305 chacha_x86_64 poly1305_x86_64 libblake2s blake2s_x86_64 ip6_udp_tunnel udp_tunnel libcurve25519_generic libchacha libblake2s_generic snd_usb_audio snd_usbmidi_lib snd_rawmidi joydev mousedev snd_seq_d>
Nov 08 16:55:04 ****.****.de kernel:  cfg80211 drm_kms_helper snd_hda_core snd_hwdep snd_pcm igb cec rc_core snd_timer rapl atlantic syscopyarea snd sysfillrect k10temp pcspkr sp5100_tco sysimgblt i2c_algo_bit soundcore rfkill fb_sys_fops i2c_piix4 dca macsec wmi pinctrl_amd gpio_amdpt evdev mac_hid acpi_cpufreq zcommon(POE) znvpair(POE) spl(OE) uinput vboxnetflt(OE) vboxnetadp(OE) nfsd auth_rpcgss vboxdrv(OE) nfs_acl lockd grace drm videodev sunrpc mc sg crypto_>
Nov 08 16:55:04 ****.****.de kernel: CR2: 0000000000000008
Nov 08 16:55:04 ****.****.de kernel: ---[ end trace 45c10b91ba3505ab ]---
Nov 08 16:55:04 ****.****.de kernel: RIP: 0010:aq_ring_rx_fill+0xd1/0x200 [atlantic]
Nov 08 16:55:04 ****.****.de kernel: Code: 45 24 ba 00 00 00 00 83 c0 01 3b 45 28 48 0f 43 c2 89 45 24 41 83 ee 01 0f 84 f3 00 00 00 48 8d 1c 40 48 c1 e3 04 48 03 5d 00 <48> 8b 43 08 48 c7 43 28 00 08 00 00 48 85 c0 75 85 48 8b 45 10 31
Nov 08 16:55:04 ****.****.de kernel: RSP: 0018:ffffa7f793fbf390 EFLAGS: 00010246
Nov 08 16:55:04 ****.****.de kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
Nov 08 16:55:04 ****.****.de kernel: RDX: 0000000000000000 RSI: 0000000000006100 RDI: ffff8eb715f0d3b8
Nov 08 16:55:04 ****.****.de kernel: RBP: ffff8eb715f0d3b8 R08: 0000000000000000 R09: 0000000000008000
Nov 08 16:55:04 ****.****.de kernel: R10: 00000000ffffffff R11: ffffcd3db52933c0 R12: 0000000000001000
Nov 08 16:55:04 ****.****.de kernel: R13: 0000000000000000 R14: 00000000ffffffff R15: 0000000000000000
Nov 08 16:55:04 ****.****.de kernel: FS:  00007ff8aaed18c0(0000) GS:ffff8eb71ee00000(0000) knlGS:0000000000000000
Nov 08 16:55:04 ****.****.de kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 08 16:55:04 ****.****.de kernel: CR2: 0000000000000008 CR3: 0000000fa5f0e000 CR4: 00000000003506f0

Then when trying to go to sleep again:

Nov 08 17:02:22 ****.****.de kwin_x11[2847]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 22365, resource id: 127926277, major code: 18 (ChangeProperty), minor code: 0
Nov 08 17:02:29 ****.****.de systemd-sleep[29176]: Suspending system...
Nov 08 17:02:29 ****.****.de kernel: PM: suspend entry (deep)
Nov 08 17:02:29 ****.****.de kernel: Filesystems sync: 0.016 seconds
Nov 08 17:02:50 ****.****.de kernel: Freezing user space processes ... 
Nov 08 17:02:50 ****.****.de kernel: Freezing of tasks failed after 20.008 seconds (26 tasks refusing to freeze, wq_busy=0):
Nov 08 17:02:50 ****.****.de kernel: task:nmbd            state:D stack:    0 pid: 2153 ppid:     1 flags:0x00000084
Nov 08 17:02:50 ****.****.de kernel: Call Trace:
Nov 08 17:02:50 ****.****.de kernel:  __schedule+0x292/0x830
Nov 08 17:02:50 ****.****.de kernel:  schedule+0x46/0xf0
Nov 08 17:02:50 ****.****.de kernel:  schedule_preempt_disabled+0x14/0x20
Nov 08 17:02:50 ****.****.de kernel:  __mutex_lock.constprop.0+0x180/0x530
Nov 08 17:02:50 ****.****.de kernel:  __netlink_dump_start+0xca/0x2d0
Nov 08 17:02:50 ****.****.de kernel:  ? rtnl_fill_ifinfo+0x1410/0x1410
Nov 08 17:02:50 ****.****.de kernel:  rtnetlink_rcv_msg+0x288/0x390
Nov 08 17:02:50 ****.****.de kernel:  ? rtnl_fill_ifinfo+0x1410/0x1410
Nov 08 17:02:50 ****.****.de kernel:  ? rtnl_calcit.isra.0+0x120/0x120
Nov 08 17:02:50 ****.****.de kernel:  netlink_rcv_skb+0x75/0x140
Nov 08 17:02:50 ****.****.de kernel:  netlink_unicast+0x242/0x340
Nov 08 17:02:50 ****.****.de kernel:  netlink_sendmsg+0x243/0x480
Nov 08 17:02:50 ****.****.de kernel:  sock_sendmsg+0x5e/0x60
Nov 08 17:02:50 ****.****.de kernel:  __sys_sendto+0x120/0x180
Nov 08 17:02:50 ****.****.de kernel:  __x64_sys_sendto+0x25/0x30
Nov 08 17:02:50 ****.****.de kernel:  do_syscall_64+0x33/0x40
Nov 08 17:02:50 ****.****.de kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Nov 08 17:02:50 ****.****.de kernel: RIP: 0033:0x7f0dd5b2e88a
Nov 08 17:02:50 ****.****.de kernel: Code: Unable to access opcode bytes at RIP 0x7f0dd5b2e860.
Nov 08 17:02:50 ****.****.de kernel: RSP: 002b:00007ffcde5f9168 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
Nov 08 17:02:50 ****.****.de kernel: RAX: ffffffffffffffda RBX: 00007ffcde5fa2c0 RCX: 00007f0dd5b2e88a
Nov 08 17:02:50 ****.****.de kernel: RDX: 0000000000000014 RSI: 00007ffcde5fa200 RDI: 000000000000000f
Nov 08 17:02:50 ****.****.de kernel: RBP: 00007ffcde5fa250 R08: 00007ffcde5fa1c0 R09: 000000000000000c
Nov 08 17:02:50 ****.****.de kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffcde5fa1c0
Nov 08 17:02:50 ****.****.de kernel: R13: 00007ffcde5fa200 R14: 00007ffcde5fa5a0 R15: 00007ffcde5f9170
Nov 08 17:02:50 ****.****.de kernel: task:Qt bearer threa state:D stack:    0 pid: 2880 ppid:     1 flags:0x00000084
... 23 tasks listed in similar way

And now the system is frozen. I can’t even switch to another TTY.

I should add that the system eventually freezes even when I don’t try to suspend a second time. The network often does not work in that state, and as soon as I access something system-related, like the network manager, the system freezes as well.

I tried disabling AMD Cool & Quiet and the C6 state in the BIOS, but that did not help either.

Can you post the partitioning?

lsblk -f
NAME                                          FSTYPE       FSUSE% MOUNTPOINT
loop0                                         squashfs       100% /var/lib/snapd/snap/core/10185
loop1                                         squashfs       100% /var/lib/snapd/snap/solitaire/2
loop2                                         squashfs       100% /var/lib/snapd/snap/core/10126
loop3                                         squashfs       100% /var/lib/snapd/snap/cheat/2299
loop4                                         squashfs       100% /var/lib/snapd/snap/core20/634
sda                                           crypto_LUKS         
└─sdx_crypt7                                  zfs_member          
sdb                                                               
└─sdb1                                        crypto_LUKS         
  └─luks-bec59b93-a807-4927-9088-de74010a2d55 ext4            25% /run/media/****/work
sdc                                           crypto_LUKS         
└─sdx_crypt6                                  zfs_member          
sdd                                           crypto_LUKS         
└─sdx_crypt10                                 zfs_member          
sde                                                               
└─sde1                                        crypto_LUKS         
sdf                                           crypto_LUKS         
└─sdx_crypt11                                 zfs_member          
sr0                                           udf                 
nvme0n1                                                           
├─nvme0n1p1                                   vfat             0% /boot/efi
└─nvme0n1p2                                   ext4            66% / 

sdb is an external LUKS-encrypted drive
sda, sdc, sdd, sdf are my ZFS raid
sde is unused
nvme holds my boot and root partitions. Some homes are on my ZFS raid, but not the root home

I should note that I disabled swap in favor of a dynamic swap tool that adds 500 MB swap files when needed, as I had trouble with swap in the past (it might be a relic of when I was forced to use hibernation due to OPAL encryption). I don’t remember the details anymore, but I will check. I fear it was in the old forum.

OK, I found it: I switched to systemd-swap, but it seems to be failing:

systemctl status systemd-swap

● systemd-swap.service - Manage swap spaces on zram, files and partitions.
     Loaded: loaded (/usr/lib/systemd/system/systemd-swap.service; enabled; vendor preset: disabled)
     Active: failed (Result: exit-code) since Sun 2020-11-08 18:03:34 CET; 1h 38min ago
    Process: 1619 ExecStart=/usr/bin/systemd-swap start (code=exited, status=1/FAILURE)
   Main PID: 1619 (code=exited, status=1/FAILURE)
     Status: "Swap unit activation finished"

Nov 08 18:03:34 ****.de systemd-swap[1619]:   File "/usr/bin/systemd-swap", line 156, in __init__
Nov 08 18:03:34 ****.de systemd-swap[1619]:     self.assign_config(config)
Nov 08 18:03:34 ****.de systemd-swap[1619]:   File "/usr/bin/systemd-swap", line 289, in assign_config
Nov 08 18:03:34 ****.de systemd-swap[1619]:     self.swapfc_frequency = config.get("swapfc_frequency", int)
Nov 08 18:03:34 ****.de systemd-swap[1619]:   File "/usr/bin/systemd-swap", line 103, in get
Nov 08 18:03:34 ****.de systemd-swap[1619]:     return as_type(self.config[key])
Nov 08 18:03:34 ****.de systemd-swap[1619]: ValueError: invalid literal for int() with base 10: '1s'
Nov 08 18:03:34 ****.de systemd[1]: systemd-swap.service: Main process exited, code=exited, status=1/FAILURE
Nov 08 18:03:34 ****.de systemd[1]: systemd-swap.service: Failed with result 'exit-code'.
Nov 08 18:03:34 ****.de systemd[1]: Failed to start Manage swap spaces on zram, files and partitions..

But this also fails on the stable 5.4 kernel…

Anyway, I will look into this, as it is currently my only lead.
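The traceback above shows the value of swapfc_frequency being passed to int(), which rejects '1s'. A minimal sketch for locating such a value follows. The helper name find_non_integer_frequency is made up for illustration, and the config paths in the usage comment are assumptions that may differ on your installation.

```shell
# Print every swapfc_frequency assignment whose value is not a plain integer.
# Output format is file:line:content, as produced by grep -Hn.
find_non_integer_frequency() {
    grep -HnE '^[[:space:]]*swapfc_frequency[[:space:]]*=' "$@" 2>/dev/null |
        grep -vE '=[[:space:]]*[0-9]+[[:space:]]*$'
}

# Hypothetical usage (paths are assumptions, adjust as needed):
#   find_non_integer_frequency /etc/systemd/swap.conf /etc/systemd/swap.conf.d/*.conf
```

If a line such as swapfc_frequency=1s is reported, replacing the value with a bare number is the likely fix, judging only from the error message.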

Then try enabling a swap partition and test again.

I fixed systemd-swap and rebooted into the other kernel again. I also removed all but one monitor and all unnecessary USB devices. Still the same result.

I’ll try an old-school swap device once I figure out how best to do this in my current setup, but I really doubt this is related.

As feared, this did not help at all.

I have now noticed this warning in my log:

ACPI BIOS Warning (bug): Optional FADT field Pm2ControlBlock has valid Length but zero Address

But it also appears in the working kernel log.

Do you see any kernel errors when trying to suspend for the first time? Please try running sudo systemctl isolate multi-user, then log in, and run sudo systemctl suspend. Does that change anything?

What do you mean? So suspend works? Or do you mean when suspend fails and the system tries to restore itself?

I tried again on the newest kernel by just switching to tty2, logging in as root, and running systemctl suspend,
then did the same with the current working kernel to compare the logs in a diff tool.
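A comparison like this works better when the parts of each journal line that change on every boot are stripped first. The following is a rough sketch under that assumption. The function name normalize_journal and the output file names are made up for illustration, and the pattern assumes journalctl's default short output format.

```shell
# Strip the parts of a journal line that differ on every boot
# (timestamp, hostname, PID) so two boots can be diffed line by line.
normalize_journal() {
    sed -E 's/^[A-Z][a-z]{2} [ 0-9][0-9] [0-9:]{8} [^ ]+ //; s/\[[0-9]+\]:/:/'
}

# Hypothetical usage, comparing the previous boot with the current one:
#   journalctl -b -1 --no-pager | normalize_journal > previous-boot.log
#   journalctl -b  0 --no-pager | normalize_journal > current-boot.log
#   diff -u previous-boot.log current-boot.log | less
```

Messages that appear in only one of the two logs, such as a kernel oops, then stand out without being drowned in timestamp differences.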

I think I found the issue.

sedutil-cli[1566]: One or more header fields have 0 length
sedutil-cli[1566]: EndSession Failed
sedutil-cli[1566]: Unable to authenticate with the given password
sedutil-cli[1614]: You do not have permission to access the raw disk in write mode
sedutil-cli[1614]: Perhaps you might try sudo to run as root
sedutil-cli[1614]: Invalid or unsupported disk /dev/nvme1n1
systemd[1]: Finished Permit User Sessions.
sedutil-cli[1616]: You do not have permission to access the raw disk in write mode
sedutil-cli[1616]: Perhaps you might try sudo to run as root
sedutil-cli[1616]: Invalid or unsupported disk /dev/nvme1n1

As I mentioned, I am using self-encrypting OPAL drives (Samsung EVO NVMe), and for S3 to work I need to give the kernel the password to unlock the drive after suspend. It seems something is broken there with newer kernels. I guess I will have to dig into it again and update the way this password is set :-/

Update: Good thing I wrote a tutorial on it back then; it should help me now :slight_smile: Enable S3 sleep mode for OPAL encrypted NVMe drives - Tutorials - Manjaro Linux Forum

Unfortunately this was not it. I found the same messages, just at a different place, in the old kernel’s log, and I verified that writing to the disk after suspend works on the new kernel. So back to square one :disappointed:

I tried booting the new kernel in single-user mode by adding “single” to the kernel command line in GRUB, and there I could suspend and resume multiple times without triggering this error.

Then I tried the same after booting to multi-user.target. There it seemed to work at first, but on the second resume from suspend I again got the same kernel error as shown above…

Also, in my comparison between the old and new kernel logs, only the sudden kernel error stands out to me now…

Reading the call trace, I feel this is related to networking. This is reinforced by the fact that after resume all networking is dead (but soon afterwards the whole system is dead, so…). Unfortunately I cannot disable the onboard network ports. Unplugging the cable did not help.

Maybe VirtualBox is the problem: The VirtualBox Kernel Driver Is Tainted Crap - Phoronix

What is the best way to disable it temporarily to check? I need it in the end for work but would like to know if this is the issue.

Run lsmod | grep vbox, and then unload the listed modules with sudo rmmod <name>. If they are not loaded, then they shouldn’t have any effect. Can you also try unloading the atlantic kernel module to see if that makes any difference?
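The same steps can be sketched as a small helper that picks module names out of lsmod output. The function name modules_with_prefix is made up for illustration. It only prints names; the unloading itself is shown in the usage comment and needs root.

```shell
# Print the names of loaded modules that start with a given prefix.
# Reads lsmod-style text on stdin, so it also works on saved output.
modules_with_prefix() {
    awk -v prefix="$1" 'NR > 1 && index($1, prefix) == 1 { print $1 }'
}

# Hypothetical usage (needs root):
#   lsmod | modules_with_prefix vbox | xargs -r sudo modprobe -r
#   sudo modprobe -r atlantic
```

If a module is reported as in use, unload the modules that depend on it first and then try again.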

Thank you, I’ll definitely try that in the evening.

A few questions about your suggestions, however:

  1. Is atlantic the networking module, i.e. is this to see whether the network stack is the cause?
  2. Will rmmod persist across reboots? Or should I do this once after boot, before suspending?
  3. If it persists, should I simply run sudo insmod <name> again afterwards?

Yes, atlantic is “Marvell (Aquantia) Corporation® Network Driver” (modinfo atlantic).

It will not persist. You should do it before suspending.

Unloading the vbox modules did not help.

But unloading the atlantic module did indeed fix it. I could suspend and resume multiple times without issues.

The question is, what now?
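As a possible stopgap until the driver itself is fixed, the manual unload could be automated with a systemd sleep hook. This is only a sketch, assuming the documented hook directory /usr/lib/systemd/system-sleep/, whose executables are called with "pre" or "post" as the first argument. The function name handle_sleep_event and the MODPROBE override are made up for illustration, and whether reloading the module after resume is always safe on this hardware is untested.

```shell
#!/bin/sh
# Sketch of a systemd sleep hook: unload the NIC driver before suspend
# and load a fresh copy after resume.
MODULE="${MODULE:-atlantic}"      # driver identified in this thread
MODPROBE="${MODPROBE:-modprobe}"  # overridable, e.g. MODPROBE=echo for a dry run

handle_sleep_event() {
    case "$1" in
        pre)  "$MODPROBE" -r "$MODULE" ;;  # unload before suspending
        post) "$MODPROBE" "$MODULE" ;;     # reload after resume
    esac
}

handle_sleep_event "${1:-}"
```

Saved under a name of your choice in the hook directory and made executable, it can be dry-run with MODPROBE=echo and the argument pre to see which command would be executed.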

UPDATE: Here is the kernel error in question again. Unfortunately the kernel is tainted, so I guess I can’t file a bug report yet.

When does this bug happen? Right before suspending? After wakeup?

After wakeup, a few seconds in.

Do not upgrade/change the kernel, stay on 5.9.10 for now.
Do the following:

# create a new directory, and enter it
mkdir ~/temp
cd ~/temp
# download kernel 5.9.10 source
wget https://cdn.kernel.org/pub/linux/kernel/v5.x/linux-5.9.10.tar.xz
# extract it
tar -xf linux-5.9.10.tar.xz
# create a new directory, enter it
mkdir atlantic
cd atlantic
# copy the source code of the module
cp -r ../linux-5.9.10/drivers/net/ethernet/aquantia/atlantic/* .
# now the "Makefile" needs to be modified
sed -i 's/-I$(srctree)\/$(src)/-I$(PWD)/' Makefile
# the following is a single command ˇˇˇ
cat >> Makefile <<EOF
all:
\tmake -C /lib/modules/\$(shell uname -r)/build M=\$(PWD) modules

clean:
\tmake -C /lib/modules/\$(shell uname -r)/build M=\$(PWD) clean
EOF
# ^^^ ends here; paste the whole thing into your terminal
sed -i 's/\\t/\t/' Makefile
# now build the module
make CFLAGS_MODULE="-ggdb3 -Og" -j
# unload the original module
sudo modprobe -r atlantic
# load a dependency of the newly built module
sudo modprobe macsec
# load the just compiled one
sudo insmod atlantic.ko

If all of the above succeeds, try suspending, then resume, and then post the kernel error you get after wake-up.
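Because the module is built with debug information, the function+offset on the oops's RIP line can afterwards be mapped to a source line. A minimal sketch follows. The helper names oops_location and show_oops_source are made up for illustration, and the usage comment assumes the crash happened during the previous boot and that gdb is installed.

```shell
# Extract "function+0xoffset" from the kernel-mode RIP line of an oops.
# Reads journal/dmesg text on stdin; user-space RIP lines (0033:) are ignored.
oops_location() {
    sed -n 's/.*RIP: 0010:\([A-Za-z0-9_.]*+0x[0-9a-f]*\)\/.*/\1/p' | head -n 1
}

# Ask gdb for the source lines around that location in the debug build.
# Run from the directory containing the freshly built atlantic.ko.
show_oops_source() {
    gdb -batch -ex "list *($1)" ./atlantic.ko
}

# Hypothetical usage after reproducing the crash and rebooting:
#   show_oops_source "$(journalctl -k -b -1 --no-pager | oops_location)"
```

The offsets only match the module that was actually loaded when the crash occurred, so the log has to come from a run with the self-built atlantic.ko.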

Thank you for all your effort! :star_struck:

I did what you instructed. This time the crash happened only after the second suspend.

Nov 23 23:57:05 **** kernel: BUG: kernel NULL pointer dereference, address: 0000000000000028
Nov 23 23:57:05 **** kernel: #PF: supervisor write access in kernel mode
Nov 23 23:57:05 **** kernel: #PF: error_code(0x0002) - not-present page
Nov 23 23:57:05 **** kernel: PGD 0 P4D 0 
Nov 23 23:57:05 **** kernel: audit: type=1101 audit(1606172225.633:328): pid=12586 uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:accounting grantors=pam_access,pam_unix,pam_time acct="root" exe="/usr/bin/crond" hostname=? addr=? terminal=cron res=success'
Nov 23 23:57:05 **** kernel: Oops: 0002 [#1] PREEMPT SMP NOPTI
Nov 23 23:57:05 **** kernel: CPU: 2 PID: 1598 Comm: NetworkManager Tainted: P           OE     5.9.10-1-MANJARO #1
Nov 23 23:57:05 **** kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X370 Professional Gaming, BIOS P3.30 01/15/2018
Nov 23 23:57:05 **** kernel: RIP: 0010:aq_ring_rx_fill+0x66/0xb2 [atlantic]
Nov 23 23:57:05 **** kernel: Code: 00 00 00 00 eb 0d 29 d0 83 e8 01 eb dc 89 45 24 44 89 e0 44 8d 60 ff 85 c0 74 52 8b 45 24 48 8d 1c 40 48 c1 e3 04 48 03 5d 00 <48> c7 43 28 00 00 00 00 66 c7 43 28 00 08 44 89 ea 48 89 de 48 89
Nov 23 23:57:05 **** kernel: audit: type=1103 audit(1606172225.633:329): pid=12586 uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:setcred grantors=pam_unix,pam_env acct="root" exe="/usr/bin/crond" hostname=? addr=? terminal=cron res=success'
Nov 23 23:57:05 **** kernel: RSP: 0018:ffffad32a5ea73b0 EFLAGS: 00010246
Nov 23 23:57:05 **** kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000020
Nov 23 23:57:05 **** kernel: RDX: 0000000000000000 RSI: ffffad32a4946100 RDI: ffff997923d8f3b8
Nov 23 23:57:05 **** kernel: RBP: ffff997923d8f3b8 R08: 0000000000000000 R09: ffff99794b4d0720
Nov 23 23:57:05 **** kernel: R10: ffff997b52547088 R11: ffff997b57164070 R12: 00000000fffffffe
Nov 23 23:57:05 **** kernel: audit: type=1006 audit(1606172225.633:330): pid=12586 uid=0 old-auid=4294967295 auid=0 tty=(none) old-ses=4294967295 ses=14 res=1
Nov 23 23:57:05 **** kernel: R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000000
Nov 23 23:57:05 **** kernel: FS:  00007f67819b38c0(0000) GS:ffff997b5ee80000(0000) knlGS:0000000000000000
Nov 23 23:57:05 **** kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 23 23:57:05 **** kernel: CR2: 0000000000000028 CR3: 0000000fa4a2a000 CR4: 00000000003506e0
Nov 23 23:57:05 **** kernel: audit: type=1105 audit(1606172225.637:331): pid=12586 uid=0 auid=0 ses=14 msg='op=PAM:session_open grantors=pam_loginuid,pam_limits,pam_unix acct="root" exe="/usr/bin/crond" hostname=? addr=? terminal=cron res=success'
Nov 23 23:57:05 **** kernel: Call Trace:
Nov 23 23:57:05 **** kernel:  aq_vec_init+0x9e/0xe1 [atlantic]
Nov 23 23:57:05 **** kernel:  aq_nic_init+0xf1/0x191 [atlantic]
Nov 23 23:57:05 **** kernel: audit: type=1110 audit(1606172225.637:332): pid=12586 uid=0 auid=0 ses=14 msg='op=PAM:setcred grantors=pam_unix,pam_env acct="root" exe="/usr/bin/crond" hostname=? addr=? terminal=cron res=success'
Nov 23 23:57:05 **** kernel:  aq_ndev_open+0x16/0x5a [atlantic]
Nov 23 23:57:05 **** kernel:  __dev_open+0xfb/0x1b0
Nov 23 23:57:05 **** kernel:  __dev_change_flags+0x1a5/0x210
Nov 23 23:57:05 **** audit[12586]: USER_START pid=12586 uid=0 auid=0 ses=14 msg='op=PAM:session_open grantors=pam_loginuid,pam_limits,pam_unix acct="root" exe="/usr/bin/crond" hostname=? addr=? terminal=cron res=success'
Nov 23 23:57:05 **** audit[12586]: CRED_REFR pid=12586 uid=0 auid=0 ses=14 msg='op=PAM:setcred grantors=pam_unix,pam_env acct="root" exe="/usr/bin/crond" hostname=? addr=? terminal=cron res=success'
Nov 23 23:57:05 **** kernel:  dev_change_flags+0x21/0x60
Nov 23 23:57:05 **** kernel:  do_setlink+0x2bc/0x1160
Nov 23 23:57:05 **** kernel:  ? __nla_validate_parse+0x5f/0x910
Nov 23 23:57:05 **** kernel:  __rtnl_newlink+0x65f/0x9e0
Nov 23 23:57:05 **** kernel:  rtnl_newlink+0x44/0x70
Nov 23 23:57:05 **** kernel:  rtnetlink_rcv_msg+0x13e/0x390
Nov 23 23:57:05 **** kernel:  ? rtnl_calcit.isra.0+0x120/0x120
Nov 23 23:57:05 **** kernel:  netlink_rcv_skb+0x75/0x140
Nov 23 23:57:05 **** kernel:  netlink_unicast+0x242/0x340
Nov 23 23:57:05 **** kernel:  netlink_sendmsg+0x243/0x480
Nov 23 23:57:05 **** kernel:  sock_sendmsg+0x5e/0x60
Nov 23 23:57:05 **** kernel:  ____sys_sendmsg+0x25a/0x2a0
Nov 23 23:57:05 **** kernel:  ? copy_msghdr_from_user+0x6e/0xa0
Nov 23 23:57:05 **** kernel:  ___sys_sendmsg+0x97/0xe0
Nov 23 23:57:05 **** kernel:  __sys_sendmsg+0x81/0xd0
Nov 23 23:57:05 **** kernel:  do_syscall_64+0x33/0x40
Nov 23 23:57:05 **** kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Nov 23 23:57:05 **** kernel: RIP: 0033:0x7f67826bfddd
Nov 23 23:57:05 **** kernel: Code: 28 89 54 24 1c 48 89 74 24 10 89 7c 24 08 e8 4a ee ff ff 8b 54 24 1c 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 33 44 89 c7 48 89 44 24 08 e8 9e ee ff ff 48
Nov 23 23:57:05 **** kernel: RSP: 002b:00007ffc47500d20 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
Nov 23 23:57:05 **** kernel: RAX: ffffffffffffffda RBX: 0000563cc28d6050 RCX: 00007f67826bfddd
Nov 23 23:57:05 **** kernel: RDX: 0000000000000000 RSI: 00007ffc47500d60 RDI: 000000000000000c
Nov 23 23:57:05 **** kernel: RBP: 000000000000017b R08: 0000000000000000 R09: 0000000000000000
Nov 23 23:57:05 **** kernel: R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000
Nov 23 23:57:05 **** kernel: R13: 00007ffc47500eb0 R14: 00007ffc47500eac R15: 0000000000000000
Nov 23 23:57:05 **** kernel: Modules linked in: atlantic(OE) macsec rfcomm snd_seq_dummy snd_hrtimer snd_seq fuse cmac algif_hash algif_skcipher af_alg bnep nct6775 hwmon_vid dm_crypt cbc encrypted_keys trusted tpm btusb btrtl btbcm btintel bluetooth snd_usb_audio s>
Nov 23 23:57:05 **** kernel:  pcspkr rng_core rfkill wmi pinctrl_amd gpio_amdpt evdev mac_hid acpi_cpufreq zcommon(POE) znvpair(POE) spl(OE) uinput vboxnetflt(OE) vboxnetadp(OE) nfsd auth_rpcgss vboxdrv(OE) nfs_acl lockd grace videodev drm sunrpc mc sg crypto_user a>
Nov 23 23:57:05 **** kernel: CR2: 0000000000000028
Nov 23 23:57:05 **** kernel: ---[ end trace 71753c3b496c2743 ]---
Nov 23 23:57:05 **** kernel: RIP: 0010:aq_ring_rx_fill+0x66/0xb2 [atlantic]
Nov 23 23:57:05 **** kernel: Code: 00 00 00 00 eb 0d 29 d0 83 e8 01 eb dc 89 45 24 44 89 e0 44 8d 60 ff 85 c0 74 52 8b 45 24 48 8d 1c 40 48 c1 e3 04 48 03 5d 00 <48> c7 43 28 00 00 00 00 66 c7 43 28 00 08 44 89 ea 48 89 de 48 89
Nov 23 23:57:05 **** kernel: RSP: 0018:ffffad32a5ea73b0 EFLAGS: 00010246
Nov 23 23:57:05 **** kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000020
Nov 23 23:57:05 **** kernel: RDX: 0000000000000000 RSI: ffffad32a4946100 RDI: ffff997923d8f3b8
Nov 23 23:57:05 **** kernel: RBP: ffff997923d8f3b8 R08: 0000000000000000 R09: ffff99794b4d0720
Nov 23 23:57:05 **** kernel: R10: ffff997b52547088 R11: ffff997b57164070 R12: 00000000fffffffe
Nov 23 23:57:05 **** kernel: R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000000
Nov 23 23:57:05 **** kernel: FS:  00007f67819b38c0(0000) GS:ffff997b5ee80000(0000) knlGS:0000000000000000
Nov 23 23:57:05 **** kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 23 23:57:05 **** kernel: CR2: 0000000000000028 CR3: 0000000fa4a2a000 CR4: 00000000003506e0