System crashes/freezes ~5 min after gaming (related to Flatpak nvidia driver)

Of course …

I didn’t know a GPU driver could be installed as flatpak…

I removed nvidia_drm.modeset=1 from the GRUB command line, but the game/driver froze my system again… the new Flatpak driver made it even worse… I regret updating it :frowning:

I also pressed free memory again, so the OOM can be ignored.

I have no idea who is responsible for this Flatpak driver. Should I report this to Nvidia?

$ journalctl -b -1 -p3 --no-pager
Sep 01 13:31:23 koboldx-z170 kernel: x86/cpu: SGX disabled by BIOS.
Sep 01 13:31:24 koboldx-z170 kernel: 
Sep 01 13:31:32 koboldx-z170 pulseaudio[1381]: GetManagedObjects() failed: org.freedesktop.DBus.Error.NameHasNoOwner: Could not activate remote peer 'org.bluez': unit failed
Sep 01 14:59:04 koboldx-z170 kernel: irq 16: nobody cared (try booting with the "irqpoll" option)
Sep 01 14:59:04 koboldx-z170 kernel: handlers:
Sep 01 14:59:04 koboldx-z170 kernel: [<000000003bd7a28f>] i801_isr [i2c_i801]
Sep 01 14:59:04 koboldx-z170 kernel: [<00000000f56cf98d>] azx_interrupt [snd_hda_codec]
Sep 01 14:59:04 koboldx-z170 kernel: Disabling IRQ #16
Sep 01 14:59:16 koboldx-z170 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:6:0:0x0000000f
Sep 01 14:59:16 koboldx-z170 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:6:0:0x0000000f
Sep 01 14:59:16 koboldx-z170 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:6:0:0x0000000f
Sep 01 14:59:16 koboldx-z170 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:6:0:0x0000000f
Sep 01 14:59:45 koboldx-z170 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:6:0:0x0000000f
Sep 01 14:59:55 koboldx-z170 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:6:0:0x0000000f
Sep 01 14:59:55 koboldx-z170 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:6:0:0x0000000f
Sep 01 14:59:57 koboldx-z170 systemd-coredump[4338]: [🡕] Process 4261 (spring-main) of user 1000 dumped core.
                                                     
                                                     Stack trace of thread 139:
                                                     #0  0x0000000000000000 n/a (n/a + 0x0)
                                                     ELF object binary architecture: AMD x86-64
Sep 01 15:00:25 koboldx-z170 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:6:0:0x0000000f
Sep 01 15:00:38 koboldx-z170 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57d:0:0:0x0000000f
Sep 01 15:00:38 koboldx-z170 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:0:0:0x0000000f
Sep 01 15:00:38 koboldx-z170 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:1:0:0x0000000f
Sep 01 15:00:38 koboldx-z170 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:2:0:0x0000000f
Sep 01 15:00:38 koboldx-z170 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:3:0:0x0000000f
Sep 01 15:00:38 koboldx-z170 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:4:0:0x0000000f
Sep 01 15:00:38 koboldx-z170 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:5:0:0x0000000f
Sep 01 15:00:38 koboldx-z170 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:6:0:0x0000000f
Sep 01 15:00:38 koboldx-z170 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:7:0:0x0000000f
Sep 01 15:01:11 koboldx-z170 kernel: Out of memory: Killed process 1254 (plasmashell) total-vm:2646132kB, anon-rss:280696kB, file-rss:205440kB, shmem-rss:0kB, UID:1000 pgtables:1328kB oom_score_adj:200
Sep 01 15:01:52 koboldx-z170 systemd[1164]: Failed to start KDE Plasma Workspace.
Sep 01 15:02:36 koboldx-z170 pulseaudio[1381]: Error opening PCM device front:2: No such file or directory
Sep 01 15:02:36 koboldx-z170 pulseaudio[1381]: Error opening PCM device front:2: No such file or directory

With the old driver I could play for at least 6 days almost stably, but with this new one the system freezes after just an hour.

And I don’t even know whether my system shut down correctly with REISUB… is there a way to see if all the REISUB steps went through successfully?
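One way to look into the REISUB question is to search the previous boot's kernel log for SysRq messages, since the kernel normally logs a line containing "sysrq" for each key it handles. This is only a rough sketch: the helper name `sysrq_lines` is made up for illustration, and the later steps (S, U, B) may never reach the journal on disk, because journald can already be terminated by the E and I steps. Missing lines therefore don't prove a step failed.

```shell
# Print any SysRq messages found in journal text read from stdin.
# (sysrq_lines is a hypothetical helper name, not an existing tool.)
sysrq_lines() {
    grep -i 'sysrq' || echo "no SysRq messages found"
}

# Usage (previous boot, kernel messages only):
# journalctl -b -1 -k --no-pager | sysrq_lines
# The last lines of the previous boot show how far the log got:
# journalctl -b -1 -n 30 --no-pager
```

If the tail of the previous boot's log ends abruptly in the middle of normal activity, the shutdown was probably not logged cleanly.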

I have no idea - if it were up to me - I’d remove every shred of Nvidia - I am not referring to the hardware - but to the drivers - whatever you have installed.

There was once a question about Flatpak removal: “[root tip] [How To] Removing a flatpak app”.

There was a time when I used Nvidia - there was a slight difference between using the prebuilt drivers and using DKMS.

Drivers were mostly stable with DKMS (remember to install the headers for your kernel).

Does this happen only during/after playing a particular game?

Yeah, I have problems with this one native Linux Flatpak game, which uses a Flatpak GPU driver.

No other issues besides this game.

Maybe I’m running into problems now because Manjaro is holding back its distro nvidia GPU driver update? :man_shrugging:

So other flatpak games run fine on your system or you didn’t try other ones?

About a week ago I did the last stable update, updated flatpak nvidia drivers and after that all flatpak apps refused to launch. Then I restored from backup, redid the update and the problem was gone.

I don’t have other flatpak games.

My game refused to run too, but 24 hours later another Flatpak nvidia driver was available… which led to the system freezes, but only 2 freezes in 7 days.

At least I’m not the only one who has problems with these Flatpak drivers.

Which backup, full system restore with timeshift?

Clonezilla backup that I do before each update

I still experience crashes after I’m done gaming, but at least it no longer crashes in-game.

Is the nvidia Flatpak driver still working for you after the stable update 4 days ago?

I haven’t had any issues with flatpak after I redid the update. So no idea what it was caused by.

Did you try another kernel?

Not yet. I just managed to reinstall the Flatpak nvidia driver using the link that @linux-aarhus posted above; I hope this finally fixes it.

$ flatpak list
$ flatpak uninstall nvidia-550-107-02
$ flatpak update --system

So hopefully, next time you don’t need a backup :wink:

Edit: My system is still freezing… damn, the reinstall didn’t solve it :frowning:

$ flatpak list
Name                   Application ID                                  Version                    Branch       Installation
BAR Team               info.beyondallreason.bar                        1.2988.0.0                 stable       system
Freedesktop Platform   org.freedesktop.Platform                        freedesktop-sdk-23.08.21   23.08        system
Mesa                   org.freedesktop.Platform.GL.default             24.1.3                     23.08        system
Mesa (Extra)           org.freedesktop.Platform.GL.default             24.1.3                     23.08-extra  system
nvidia-550-107-02      org.freedesktop.Platform.GL.nvidia-550-107-02                              1.4          system
openh264               org.freedesktop.Platform.openh264               2.1.0                      2.2.0        system

Maybe I should also try reinstalling Mesa and openh264?

The problems you face could be flatpak-related, kernel-related or could indicate that your hardware is slowly dying.
You might need to do more tests to identify the source of the problem:

  1. There is an AUR version of “Beyond All Reason”. Check if it runs without problems. Maybe you don’t need Flatpak after all.
  2. Install kernel 6.10. Add nvidia_drm.modeset=1 nvidia_drm.fbdev=1 to the GRUB command line and MODULES=(nvidia nvidia_modeset nvidia_uvm nvidia_drm) to /etc/mkinitcpio.conf, then reboot into that kernel. Play the game using the new kernel.
  3. Monitor RAM/VRAM consumption during/after the game to see if ...kernel: Out of memory: Killed process 4181 (spring-main)... is really due to insufficient memory (memory leak?) or something else.
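
For step 3, a small logging helper could look something like the sketch below. It assumes a Linux system with /proc/meminfo and uses nvidia-smi only if it is installed; the function name `log_memory` and the log path in the usage comment are made up for illustration.

```shell
#!/bin/sh
# Print one timestamped sample of available RAM and (if possible) VRAM usage.
# (log_memory is a hypothetical helper name.)
log_memory() {
    # MemAvailable is reported in kB; convert to MiB.
    ram=$(awk '/^MemAvailable:/ {print int($2 / 1024)}' /proc/meminfo)
    if command -v nvidia-smi >/dev/null 2>&1; then
        vram=$(nvidia-smi --query-gpu=memory.used,memory.total --format=csv,noheader 2>/dev/null)
    else
        vram="n/a (nvidia-smi not found)"
    fi
    echo "$(date '+%H:%M:%S') MemAvailable=${ram}MiB VRAM=${vram}"
}

# Usage: sample every 5 seconds and append to a file, syncing so the
# last samples are more likely to survive a freeze.
# while true; do log_memory >> ~/memlog.txt; sync; sleep 5; done
```

If MemAvailable keeps shrinking while the game runs and never recovers after it exits, that would point towards a leak rather than a one-off spike.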