System damage after hibernation

Hello,

I woke my PC from hibernation yesterday. After that, if I remember correctly, I tried to start a program, but it didn’t work. I decided not to spend too much time on the problem and rebooted the PC instead, hoping that would clear the “hiccup”.
After the reboot, I saw that my Xfce Whisker Menu had a different style and configuration. Also, Lutris, which I had open before the hibernation, doesn’t work now. Some programs in PlayOnLinux don’t run either. However, another game I started did work.

Is it normal for system damage to happen after hibernation?
How can I diagnose what actually happened?
Is there a system health check?
How can I repair it?

I have no idea what happened, so I cannot fix it myself :frowning:

What I have already done:

  • Took a quick look at the SMART values of my SSD and ran a short self-test → no problems.
  • Booted from a Manjaro USB stick and ran e2fsck -f on my system partition. It changed the sizing a bit, but I didn’t see anything about bad sectors.

Output lutris:

$ lutris
2021-08-28 17:56:54,114: Magic not available. Unable to automatically find game executables. Please install python-magic
Traceback (most recent call last):
  File "/usr/bin/lutris", line 52, in <module>
    from lutris.gui.application import Application  # pylint: disable=no-name-in-module
  File "/usr/lib/python3.9/site-packages/lutris/gui/application.py", line 53, in <module>
    from .lutriswindow import LutrisWindow
  File "/usr/lib/python3.9/site-packages/lutris/gui/lutriswindow.py", line 26, in <module>
    from lutris.gui.widgets.sidebar import LutrisSidebar
  File "/usr/lib/python3.9/site-packages/lutris/gui/widgets/sidebar.py", line 6, in <module>
    from lutris import platforms, runners, services
  File "/usr/lib/python3.9/site-packages/lutris/platforms.py", line 19, in <module>
    _init_platforms()
  File "/usr/lib/python3.9/site-packages/lutris/platforms.py", line 14, in _init_platforms
    runner = runners.import_runner(runner_name)()
  File "/usr/lib/python3.9/site-packages/lutris/runners/wine.py", line 229, in __init__
    "default": dxvk.DXVKManager().version,
  File "/usr/lib/python3.9/site-packages/lutris/util/wine/dxvk.py", line 56, in version
    if self.versions:
  File "/usr/lib/python3.9/site-packages/lutris/util/wine/dxvk.py", line 45, in versions
    self._versions = self.load_dxvk_versions()
  File "/usr/lib/python3.9/site-packages/lutris/util/wine/dxvk.py", line 70, in load_dxvk_versions
    dxvk_versions = [v["tag_name"] for v in json.load(dxvk_version_file)]
  File "/usr/lib/python3.9/json/__init__.py", line 293, in load
    return loads(fp.read(),
  File "/usr/lib/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.9/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.9/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
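
The last line of the traceback, "Expecting value: line 1 column 1 (char 0)", is what Python’s json module reports when the input is empty or does not start with valid JSON, so the DXVK version file that Lutris tries to load may have been truncated. Below is a minimal sketch for finding such files. The function name find_broken_json and the directory ~/.local/share/lutris are assumptions for illustration; the traceback does not show where the file actually lives.

```python
import json
from pathlib import Path


def find_broken_json(directory):
    """Return (path, reason) pairs for JSON files that are empty or fail to parse."""
    root = Path(directory)
    if not root.is_dir():
        return []
    broken = []
    for path in sorted(root.rglob("*.json")):
        try:
            text = path.read_text(encoding="utf-8")
        except (OSError, UnicodeDecodeError) as exc:
            broken.append((path, f"unreadable: {exc}"))
            continue
        if not text.strip():
            # A zero-byte or whitespace-only file triggers the error seen above.
            broken.append((path, "empty file"))
            continue
        try:
            json.loads(text)
        except json.JSONDecodeError as exc:
            broken.append((path, f"invalid JSON: {exc.msg}"))
    return broken


if __name__ == "__main__":
    # Assumed location, not taken from the traceback; adjust for your system.
    data_dir = Path.home() / ".local" / "share" / "lutris"
    for path, reason in find_broken_json(data_dir):
        print(f"{path}: {reason}")
```

If this reports an empty file, moving it aside and restarting Lutris might be enough for it to be recreated, but that is a guess and not something confirmed here.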

Parts of journalctl output from that day:

Aug 27 16:43:28 workstation kernel: Uhhuh. NMI received for unknown reason 3c on CPU 0.
Aug 27 16:43:28 workstation kernel: Do you have a strange power saving mode enabled?
Aug 27 16:43:28 workstation kernel: Dazed and confused, but trying to continue

and (it seems VLC crashed):

Aug 27 16:49:35 workstation systemd-coredump[18811]: [🡕] Process 7369 (vlc) of user 1000 dumped core.
                                                     
                                                     Stack trace of thread 7369:
                                                     #0  0x00007ffadee46610 n/a (libnvidia-glcore.so.470.63.01 + 0xef8610)
                                                     #1  0x00007ffadee5aa63 n/a (libnvidia-glcore.so.470.63.01 + 0xf0ca63)
                                                     #2  0x00007ffadedfc256 n/a (libnvidia-glcore.so.470.63.01 + 0xeae256)
                                                     #3  0x00007ffadedfcde5 n/a (libnvidia-glcore.so.470.63.01 + 0xeaede5)
                                                     #4  0x00007ffaded13791 n/a (libnvidia-glcore.so.470.63.01 + 0xdc5791)
                                                     #5  0x00007ffaded14131 n/a (libnvidia-glcore.so.470.63.01 + 0xdc6131)
                                                     #6  0x00007ffaf1a97e45 n/a (libGLX_nvidia.so.0 + 0xa7e45)
                                                     #7  0x00007ffaf1a3bac1 n/a (libGLX_nvidia.so.0 + 0x4bac1)
                                                     #8  0x00007ffaf1a3c1af n/a (libGLX_nvidia.so.0 + 0x4c1af)
                                                     #9  0x00007ffaf1ad7009 n/a (libGLX_nvidia.so.0 + 0xe7009)
                                                     #10 0x00007ffb1d48c4a7 __run_exit_handlers (libc.so.6 + 0x3f4a7)
                                                     #11 0x00007ffb1d48c64e exit (libc.so.6 + 0x3f64e)
                                                     #12 0x00007ffb1d474b2c __libc_start_main (libc.so.6 + 0x27b2c)
                                                     #13 0x00005651d918f42e n/a (vlc + 0x142e)

My PC:

$ inxi -Fxz
System:
  Kernel: 5.10.59-1-MANJARO x86_64 bits: 64 compiler: gcc v: 11.1.0 
  Desktop: Xfce 4.16.0 Distro: Manjaro Linux base: Arch Linux 
Machine:
  Type: Desktop Mobo: ASUSTeK model: ROG STRIX X470-F GAMING v: Rev X.0x 
  serial: <filter> UEFI: American Megatrends v: 5007 date: 06/17/2019 
CPU:
  Info: 8-Core model: AMD Ryzen 7 2700X bits: 64 type: MT MCP arch: Zen+ 
  rev: 2 cache: L2: 4 MiB 
  flags: avx avx2 lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm 
  bogomips: 118226 
  Speed: 1713 MHz min/max: 2200/3700 MHz boost: enabled Core speeds (MHz): 
  1: 1713 2: 2650 3: 1752 4: 1728 5: 2090 6: 2187 7: 2085 8: 2190 9: 1750 
  10: 1712 11: 1888 12: 2107 13: 1891 14: 2200 15: 1743 16: 2455 
Graphics:
  Device-1: NVIDIA GP104 [GeForce GTX 1070 Ti] driver: nvidia v: 470.63.01 
  bus-ID: 08:00.0 
  Device-2: Logic3 / SpectraVideo plc G-720 Keyboard type: USB 
  driver: hid-generic,usbhid bus-ID: 5-1:2 
  Display: x11 server: X.Org 1.20.13 driver: loaded: nvidia 
  resolution: 1920x1080~60Hz 
  OpenGL: renderer: NVIDIA GeForce GTX 1070 Ti/PCIe/SSE2 
  v: 4.6.0 NVIDIA 470.63.01 direct render: Yes 
Audio:
  Device-1: NVIDIA GP104 High Definition Audio driver: snd_hda_intel 
  v: kernel bus-ID: 08:00.1 
  Device-2: AMD Family 17h HD Audio vendor: ASUSTeK driver: snd_hda_intel 
  v: kernel bus-ID: 0a:00.3 
  Sound Server-1: ALSA v: k5.10.59-1-MANJARO running: yes 
  Sound Server-2: JACK v: 1.9.19 running: no 
  Sound Server-3: PulseAudio v: 15.0 running: no 
  Sound Server-4: PipeWire v: 0.3.33 running: yes 
Network:
  Device-1: Intel I211 Gigabit Network vendor: ASUSTeK driver: igb v: kernel 
  port: e000 bus-ID: 06:00.0 
  IF: enp6s0 state: down mac: <filter> 
  Device-2: AVM FRITZ WLAN N v2 [RT5572/rt2870.bin] type: USB 
  driver: rt2800usb bus-ID: 3-2:2 
  IF: wlp4s0u2 state: up mac: <filter> 
Bluetooth:
  Device-1: Cambridge Silicon Radio Bluetooth Dongle (HCI mode) type: USB 
  driver: btusb v: 0.8 bus-ID: 5-4:4 
  Report: rfkill ID: hci0 rfk-id: 0 state: up address: see --recommends 
Drives:
  Local Storage: total: 4.12 TiB used: 606.22 GiB (14.4%) 
  ID-1: /dev/sda vendor: Western Digital model: WD5000BHTZ-60JCPV0 
  size: 465.76 GiB 
  ID-2: /dev/sdb vendor: Seagate model: ST31500341AS size: 1.36 TiB 
  ID-3: /dev/sdc vendor: Crucial model: CT2000MX500SSD1 size: 1.82 TiB 
  ID-4: /dev/sdd vendor: Samsung model: SSD 850 EVO 500GB size: 465.76 GiB 
  ID-5: /dev/sde type: USB vendor: SanDisk model: USB 3.2Gen1 
  size: 28.65 GiB 
Partition:
  ID-1: / size: 440.21 GiB used: 390.03 GiB (88.6%) fs: ext4 dev: /dev/sdd2 
  ID-2: /boot/efi size: 299.4 MiB used: 36 MiB (12.0%) fs: vfat 
  dev: /dev/sdd1 
Swap:
  ID-1: swap-1 type: partition size: 17.22 GiB used: 0 KiB (0.0%) 
  dev: /dev/sdd3 
Sensors:
  System Temperatures: cpu: 47.4 C mobo: 38.0 C gpu: nvidia temp: 42 C 
  Fan Speeds (RPM): cpu: 881 case-1: 531 case-2: 687 case-3: 872 gpu: nvidia 
  fan: 0% 
  Power: 12v: 10.14 5v: N/A 3.3v: N/A vbat: N/A 
Info:
  Processes: 335 Uptime: 1h 6m Memory: 15.61 GiB used: 3.12 GiB (20.0%) 
  Init: systemd Compilers: gcc: 11.1.0 Packages: 1725 Shell: Bash v: 5.1.8 
  inxi: 3.3.06 

Obviously not.
Changes may only become visible after a reboot or relogin, because running applications and services keep the configuration they started with in memory and do not need to re-read it while they run.


You can check the system journals.
https://wiki.archlinux.org/title/Systemd/Journal
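
As a rough way to narrow the journal down, here is a small sketch that pulls one boot’s messages and keeps only lines matching a few patterns. Most patterns come from the log excerpts in this thread; "I/O error" and "EXT4-fs error" are generic disk-error strings added as an assumption. The function names are made up for the example, and it assumes a persistent journal so that the previous boot is still available.

```python
import re
import subprocess

# Patterns to look for; extend or trim as needed.
SUSPICIOUS = re.compile(
    r"WARNING:|failed to resume|dumped core|NMI received|I/O error|EXT4-fs error"
)


def suspicious_lines(lines):
    """Return the journal lines that match any of the patterns above."""
    return [line.rstrip("\n") for line in lines if SUSPICIOUS.search(line)]


def read_journal(boot="-1"):
    """Fetch the journal for one boot (-1 = previous boot, 0 = current boot)."""
    try:
        result = subprocess.run(
            ["journalctl", f"--boot={boot}", "--no-pager"],
            capture_output=True, text=True, check=False,
        )
    except FileNotFoundError:
        # journalctl is not installed on this machine.
        return []
    return result.stdout.splitlines()


if __name__ == "__main__":
    for line in suspicious_lines(read_journal()):
        print(line)
```

A match is only a starting point for reading the surrounding lines; not every warning in the journal is related to the problem.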

The only strange thing I could find happened during a wake-up before the unfortunate hibernation mentioned above:

Aug 26 23:59:48 workstation kernel: serial 00:03: activated
Aug 26 23:59:48 workstation kernel: sd 0:0:0:0: [sda] Starting disk
Aug 26 23:59:48 workstation kernel: sd 2:0:0:0: [sdc] Starting disk
Aug 26 23:59:48 workstation kernel: sd 1:0:0:0: [sdb] Starting disk
Aug 26 23:59:48 workstation kernel: sd 5:0:0:0: [sdd] Starting disk
Aug 26 23:59:48 workstation kernel: ------------[ cut here ]------------
Aug 26 23:59:48 workstation kernel: WARNING: CPU: 7 PID: 17108 at net/mac80211/util.c:2380 ieee80211_reconfig+0x22f/0x1490 [mac80211]
Aug 26 23:59:48 workstation kernel: Modules linked in: usb_storage nvidia_uvm(POE) ccm rfcomm snd_seq_dummy snd_hrtimer snd_seq snd_seq_device cmac algif_hash algif_skcipher af_alg rt>
Aug 26 23:59:48 workstation kernel:  eeepc_wmi asus_wmi sparse_keymap rfkill video wmi_bmof mxm_wmi agpgart vboxnetflt(OE) vboxnetadp(OE) vboxdrv(OE) crypto_user asus_wmi_sensors(OE) >
Aug 26 23:59:48 workstation kernel: CPU: 7 PID: 17108 Comm: kworker/u64:85 Tainted: P        W  OE     5.10.59-1-MANJARO #1
Aug 26 23:59:48 workstation kernel: Hardware name: System manufacturer System Product Name/ROG STRIX X470-F GAMING, BIOS 5007 06/17/2019
Aug 26 23:59:48 workstation kernel: Workqueue: events_unbound async_run_entry_fn
Aug 26 23:59:48 workstation kernel: RIP: 0010:ieee80211_reconfig+0x22f/0x1490 [mac80211]
Aug 26 23:59:48 workstation kernel: Code: 30 0c 00 00 83 e2 fd 83 fa 04 74 e6 48 8b 93 98 04 00 00 83 e2 01 74 da 48 89 de 48 89 ef e8 58 5f fc ff 41 89 c4 85 c0 74 c8 <0f> 0b 48 8b 5>
Aug 26 23:59:48 workstation kernel: RSP: 0018:ffffbd3442cf3d70 EFLAGS: 00010286
Aug 26 23:59:48 workstation kernel: RAX: 00000000ffffffed RBX: ffff9c8adbc00940 RCX: ffff9c8cccfd7e58
Aug 26 23:59:48 workstation kernel: RDX: 0000000000000001 RSI: ffff9c8adbc01570 RDI: ffff9c8b2c670800
Aug 26 23:59:48 workstation kernel: RBP: ffff9c8b2c670800 R08: ffffbd3442cf3d48 R09: 00000000ffffffed
Aug 26 23:59:48 workstation kernel: R10: 00000cd844506134 R11: 0000000000000000 R12: 00000000ffffffed
Aug 26 23:59:48 workstation kernel: R13: ffff9c8b2c671920 R14: 0000000000000010 R15: 0000000000000000
Aug 26 23:59:48 workstation kernel: FS:  0000000000000000(0000) GS:ffff9c8cce9c0000(0000) knlGS:0000000000000000
Aug 26 23:59:48 workstation kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 26 23:59:48 workstation kernel: CR2: 0000000000000000 CR3: 000000004a210000 CR4: 00000000003506e0
Aug 26 23:59:48 workstation kernel: Call Trace:
Aug 26 23:59:48 workstation kernel:  ? __prepare_to_swait+0x4b/0x70
Aug 26 23:59:48 workstation kernel:  wiphy_resume+0x7c/0x130 [cfg80211]
Aug 26 23:59:48 workstation kernel:  ? wiphy_suspend+0x2a0/0x2a0 [cfg80211]
Aug 26 23:59:48 workstation kernel:  dpm_run_callback+0x4c/0x150
Aug 26 23:59:48 workstation kernel:  device_resume+0xa7/0x200
Aug 26 23:59:48 workstation kernel:  async_resume+0x19/0x30
Aug 26 23:59:48 workstation kernel:  async_run_entry_fn+0x39/0x160
Aug 26 23:59:48 workstation kernel:  process_one_work+0x1ad/0x370
Aug 26 23:59:48 workstation kernel:  worker_thread+0x50/0x3b0
Aug 26 23:59:48 workstation kernel:  ? rescuer_thread+0x380/0x380
Aug 26 23:59:48 workstation kernel:  kthread+0x133/0x150
Aug 26 23:59:48 workstation kernel:  ? kthread_associate_blkcg+0xc0/0xc0
Aug 26 23:59:48 workstation kernel:  ret_from_fork+0x22/0x30
Aug 26 23:59:48 workstation kernel: ---[ end trace fc8a2a7fa3eb22ea ]---
Aug 26 23:59:48 workstation kernel: PM: dpm_run_callback(): wiphy_resume+0x0/0x130 [cfg80211] returns -19
Aug 26 23:59:48 workstation kernel: PM: Device phy3 failed to resume async: error -19
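
The -19 in the last two lines is a negated errno value, which is how the kernel reports errors from driver callbacks. A small sketch for looking up such a code (the helper name describe_kernel_error is made up for the example):

```python
import errno
import os


def describe_kernel_error(code):
    """Translate a negative kernel return code such as -19 into its errno name and text."""
    number = abs(code)
    name = errno.errorcode.get(number, "unknown")
    return name, os.strerror(number)


if __name__ == "__main__":
    name, text = describe_kernel_error(-19)
    print(f"-19 -> {name}: {text}")
```

On Linux, 19 is ENODEV, so the wireless device phy3 was apparently not available at the moment of resume. Whether that is related to the broken programs is not clear from this log alone.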

Thanks for your help :+1:, but I couldn’t figure out the exact problem. Therefore, as a last resort, I decided to restore a backup.