System freeze/crash while gaming

Hi,
I have a problem while gaming.
It probably occurs while some shaders are being processed, and my system freezes completely. The screen is frozen, there is no audio and no reaction to the keyboard (I can’t use SysRq), and I have to do a hard reset by pressing the power button.

The problem occurs in games such as Star Citizen, Cyberpunk, Witcher 1 and lastly Genshin Impact. It happens randomly, sometimes after 2-5 minutes of playing, sometimes after more than 1-2 hours.

I have read several forum topics about similar problems and tried limiting GPU power, different kernels, and different GPU drivers, but nothing worked.

I read that it can be caused by a faulty GPU, PSU or PSU cable, so I checked my GPU and PSU by replacing them with other ones, but it didn’t help.

I can’t find any errors in the logs that could help me solve the issue. I have another theory that switching from Ethernet to Wi-Fi may be a cause: sometimes my Ethernet loses connection for 5-10 seconds, the system switches to Wi-Fi and after a moment switches back to Ethernet. In some cases this coincided with the freeze.

I had a similar problem on Windows before I switched to Manjaro: the system went to a black screen while gaming and the fans went to full speed. I never fixed that either.

Do you have any ideas what I can do to find the cause and fix it?

inxi -Fazy
System:
  Kernel: 6.11.11-1-MANJARO arch: x86_64 bits: 64 compiler: gcc v: 14.2.1
    clocksource: tsc avail: hpet,acpi_pm
    parameters: BOOT_IMAGE=/boot/vmlinuz-6.11-x86_64
    root=UUID=3eef2ff6-889d-4970-9d06-12f35f216bb4 rw nvidia_drm.modeset=1
    nvidia_drm.fbdev=1 quiet splash acpi=force apm=power_off
    resume=UUID=1d96076d-a7df-49da-a069-268864b35c00 udev.log_priority=3
    sysrq_always_enabled=1
  Desktop: KDE Plasma v: 6.2.4 tk: Qt v: N/A info: frameworks v: 6.8.0
    wm: kwin_x11 vt: 2 dm: SDDM Distro: Manjaro base: Arch Linux
Machine:
  Type: Desktop System: Micro-Star product: MS-7E06 v: 3.0
    serial: <superuser required>
  Mobo: Micro-Star model: Z790 GAMING PLUS WIFI (MS-7E06) v: 3.0
    serial: <superuser required> UEFI: American Megatrends LLC. v: H.40
    date: 04/16/2024
Battery:
  Device-1: hidpp_battery_0 model: Logitech G305 Lightspeed Wireless Gaming
    Mouse serial: <filter> charge: 100% (should be ignored) rechargeable: yes
    status: discharging
CPU:
  Info: model: Intel Core i5-14600KF bits: 64 type: MST AMCP arch: Raptor Lake
    gen: core 14 level: v3 note: check built: 2022+ process: Intel 7 (10nm)
    family: 6 model-id: 0xB7 (183) stepping: 1 microcode: 0x12B
  Topology: cpus: 1x dies: 1 clusters: 8 cores: 14 threads: 20 mt: 6 tpc: 2
    st: 8 smt: enabled cache: L1: 1.2 MiB desc: d-8x32 KiB, 6x48 KiB; i-6x32
    KiB, 8x64 KiB L2: 20 MiB desc: 6x2 MiB, 2x4 MiB L3: 24 MiB desc: 1x24 MiB
  Speed (MHz): avg: 1100 min/max: 800/5300:4000 scaling: driver: intel_pstate
    governor: powersave cores: 1: 1100 2: 1100 3: 1100 4: 1100 5: 1100 6: 1100
    7: 1100 8: 1100 9: 1100 10: 1100 11: 1100 12: 1100 13: 1100 14: 1100
    15: 1100 16: 1100 17: 1100 18: 1100 19: 1100 20: 1100 bogomips: 139820
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
  Vulnerabilities:
  Type: gather_data_sampling status: Not affected
  Type: itlb_multihit status: Not affected
  Type: l1tf status: Not affected
  Type: mds status: Not affected
  Type: meltdown status: Not affected
  Type: mmio_stale_data status: Not affected
  Type: reg_file_data_sampling mitigation: Clear Register File
  Type: retbleed status: Not affected
  Type: spec_rstack_overflow status: Not affected
  Type: spec_store_bypass mitigation: Speculative Store Bypass disabled via
    prctl
  Type: spectre_v1 mitigation: usercopy/swapgs barriers and __user pointer
    sanitization
  Type: spectre_v2 mitigation: Enhanced / Automatic IBRS; IBPB: conditional;
    RSB filling; PBRSB-eIBRS: SW sequence; BHI: BHI_DIS_S
  Type: srbds status: Not affected
  Type: tsx_async_abort status: Not affected
Graphics:
  Device-1: NVIDIA AD103 [GeForce RTX 4070 Ti SUPER] vendor: PNY driver: nvidia
    v: 550.135 alternate: nouveau,nvidia_drm non-free: 550.xx+
    status: current (as of 2024-09) arch: Lovelace code: AD1xx
    process: TSMC n4 (5nm) built: 2022+ pcie: gen: 4 speed: 16 GT/s lanes: 16
    ports: active: none off: DP-2,DP-3 empty: DP-1,HDMI-A-1 bus-ID: 01:00.0
    chip-ID: 10de:2705 class-ID: 0300
  Display: x11 server: X.Org v: 21.1.14 with: Xwayland v: 24.1.4
    compositor: kwin_x11 driver: X: loaded: nvidia gpu: nvidia,nvidia-nvswitch
    display-ID: :0 screens: 1
  Screen-1: 0 s-res: 5120x1440 s-dpi: 110 s-size: 1182x333mm (46.54x13.11")
    s-diag: 1228mm (48.35")
  Monitor-1: DP-2 note: disabled pos: left model: Dell P2421D
    serial: <filter> built: 2020 res: 2560x1440 hz: 60 dpi: 123 gamma: 1.2
    size: 527x296mm (20.75x11.65") diag: 604mm (23.8") ratio: 16:9 modes:
    max: 2560x1440 min: 640x480
  Monitor-2: DP-3 mapped: DP-4 note: disabled pos: primary,right model: LG
    (GoldStar) ULTRAGEAR+ serial: <filter> built: 2024 res: 2560x1440 hz: 240
    dpi: 110 gamma: 1.2 size: 590x330mm (23.23x12.99") diag: 677mm (26.7")
    ratio: 16:9 modes: max: 2560x1440 min: 640x480
  API: EGL v: 1.5 hw: drv: nvidia platforms: device: 0 drv: nvidia device: 2
    drv: swrast gbm: drv: nvidia surfaceless: drv: nvidia x11: drv: nvidia
    inactive: wayland,device-1
  API: OpenGL v: 4.6.0 compat-v: 4.5 vendor: nvidia mesa v: 550.135
    glx-v: 1.4 direct-render: yes renderer: NVIDIA GeForce RTX 4070 Ti
    SUPER/PCIe/SSE2 memory: 15.62 GiB
  API: Vulkan v: 1.4.303 layers: 10 device: 0 type: discrete-gpu name: NVIDIA
    GeForce RTX 4070 Ti SUPER driver: N/A device-ID: 10de:2705
    surfaces: xcb,xlib
Audio:
  Device-1: Intel Raptor Lake High Definition Audio vendor: Micro-Star MSI
    driver: snd_hda_intel v: kernel alternate: snd_soc_avs,snd_sof_pci_intel_tgl
    bus-ID: 00:1f.3 chip-ID: 8086:7a50 class-ID: 0403
  Device-2: NVIDIA vendor: PNY driver: snd_hda_intel v: kernel pcie: gen: 4
    speed: 16 GT/s lanes: 16 bus-ID: 01:00.1 chip-ID: 10de:22bb class-ID: 0403
  Device-3: SteelSeries ApS Arctis Nova Pro Wireless
    driver: hid-generic,snd-usb-audio,usbhid type: USB rev: 2.0 speed: 12 Mb/s
    lanes: 1 mode: 1.1 bus-ID: 1-3.4:9 chip-ID: 1038:12e0 class-ID: 0300
  API: ALSA v: k6.11.11-1-MANJARO status: kernel-api with: aoss
    type: oss-emulator tools: alsactl,alsamixer,amixer
  Server-1: JACK v: 1.9.22 status: off tools: N/A
  Server-2: PipeWire v: 1.2.7 status: active with: 1: pipewire-pulse
    status: active 2: wireplumber status: active 3: pipewire-alsa type: plugin
    tools: pactl,pw-cat,pw-cli,wpctl
Network:
  Device-1: Intel Raptor Lake-S PCH CNVi WiFi driver: iwlwifi v: kernel
    bus-ID: 00:14.3 chip-ID: 8086:7a70 class-ID: 0280
  IF: wlo1 state: up mac: <filter>
  Device-2: Intel Ethernet I225-V vendor: Micro-Star MSI driver: igc
    v: kernel pcie: gen: 2 speed: 5 GT/s lanes: 1 port: N/A bus-ID: 04:00.0
    chip-ID: 8086:15f3 class-ID: 0200
  IF: enp4s0 state: up speed: 1000 Mbps duplex: full mac: <filter>
  IF-ID-1: docker0 state: down mac: <filter>
  Info: services: NetworkManager, systemd-timesyncd, wpa_supplicant
Bluetooth:
  Device-1: Intel AX211 Bluetooth driver: btusb v: 0.8 type: USB rev: 2.0
    speed: 12 Mb/s lanes: 1 mode: 1.1 bus-ID: 1-14:7 chip-ID: 8087:0033
    class-ID: e001
  Report: btmgmt ID: hci0 rfk-id: 1 state: up address: <filter> bt-v: 5.3
    lmp-v: 12 status: discoverable: no pairing: no class-ID: 6c0104
Drives:
  Local Storage: total: 2.79 TiB used: 651.2 GiB (22.8%)
  SMART Message: Required tool smartctl not installed. Check --recommends
  ID-1: /dev/nvme0n1 maj-min: 259:1 vendor: GOODRAM model: SSDPR-PX700-02T-80
    size: 1.86 TiB block-size: physical: 512 B logical: 512 B speed: 63.2 Gb/s
    lanes: 4 tech: SSD serial: <filter> fw-rev: SN15299 temp: 43.9 C
    scheme: GPT
  ID-2: /dev/nvme1n1 maj-min: 259:0 vendor: GOODRAM model: SSDPR-PX700-01T-80
    size: 953.87 GiB block-size: physical: 512 B logical: 512 B speed: 63.2 Gb/s
    lanes: 4 tech: SSD serial: <filter> fw-rev: SN15299 temp: 33.9 C
    scheme: GPT
Partition:
  ID-1: / raw-size: 200 GiB size: 195.8 GiB (97.90%) used: 68.71 GiB (35.1%)
    fs: ext4 dev: /dev/nvme0n1p3 maj-min: 259:4
  ID-2: /boot/efi raw-size: 512 MiB size: 511 MiB (99.80%)
    used: 300 KiB (0.1%) fs: vfat dev: /dev/nvme0n1p1 maj-min: 259:2
  ID-3: /home raw-size: 1.64 TiB size: 1.61 TiB (98.37%)
    used: 582.49 GiB (35.3%) fs: ext4 dev: /dev/nvme0n1p4 maj-min: 259:5
Swap:
  Kernel: swappiness: 60 (default) cache-pressure: 100 (default) zswap: no
  ID-1: swap-1 type: partition size: 32 GiB used: 0 KiB (0.0%) priority: -2
    dev: /dev/nvme0n1p2 maj-min: 259:3
Sensors:
  System Temperatures: cpu: 33.8 C mobo: N/A gpu: nvidia temp: 50 C
  Fan Speeds (rpm): N/A gpu: nvidia fan: 0%
Info:
  Memory: total: 32 GiB available: 31.18 GiB used: 3.55 GiB (11.4%)
  Processes: 398 Power: uptime: 59m states: freeze,mem,disk suspend: deep
    avail: s2idle wakeups: 0 hibernate: platform avail: shutdown, reboot,
    suspend, test_resume image: 12.41 GiB services: org_kde_powerdevil,
    power-profiles-daemon, upowerd Init: systemd v: 256 default: graphical
    tool: systemctl
  Packages: 1583 pm: pacman pkgs: 1570 libs: 478 tools: pamac pm: flatpak
    pkgs: 13 Compilers: clang: 18.1.8 gcc: 14.2.1 Shell: Zsh v: 5.9 default: Bash
    v: 5.2.37 running-in: konsole inxi: 3.3.36

journalctl from crash: http://0x0.st/8-rr.txt
journalctl --priority=3 --catalog --no-pager: https://pastebin.com/iiJAGHxD
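
Since a hard freeze often leaves little in the logs, it may help to look specifically at the kernel messages of the boot that ended in the freeze. The following is only a rough sketch: the function names are hypothetical, the keyword list is an assumption picked for illustration (NVIDIA driver messages, machine-check reports, network link changes), and it assumes the journal is stored persistently so that the previous boot is still available.

```shell
#!/usr/bin/env bash

# Filter journal text (read from stdin) for lines that often accompany
# hardware-level trouble. The keyword list is an assumption chosen for
# illustration; it is neither exhaustive nor authoritative.
filter_suspect_lines() {
    grep -Ei 'NVRM|Xid|mce:|machine check|hardware error|link is (up|down)'
}

# Hypothetical helper: show suspect kernel messages from the previous
# boot, i.e. the one that was ended by the hard reset.
suspects_from_previous_boot() {
    journalctl --boot=-1 --dmesg --no-pager | filter_suspect_lines
}
```

Running `suspects_from_previous_boot` right after rebooting from a freeze would show whether anything matching was written before the system locked up. An empty result does not rule anything out, because the last messages may never have reached the disk.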

If you encounter the problem on both Windows and Manjaro, then it is probably a hardware problem. Check the computer: reseat the RAM modules, verify that your CPU and video card cooling systems are not obstructed by dust and that the fans are working, and unplug and replug all connectors (SSD, HDD, video card, motherboard…). Do as much as you can to rule out basic hardware issues.
You can also monitor temperatures while running “burn” tests on the CPU and video card.
Also check whether your RAM is OK with a MEMTEST program.


I agree, quite likely.

Thanks for your system information. I’ll quickly note that kernel 6.11 has now reached EOL. You could switch to either kernel 6.12 or possibly 6.6 (LTS) and see if either of those makes any difference.
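
A quick way to check which kernel series is actually running could look like the sketch below. The function names are hypothetical; the only fact taken from this thread is that 6.11 is the series described as EOL.

```shell
#!/usr/bin/env bash

# Print the "major.minor" series of a kernel release string.
# With no argument, the running kernel (uname -r) is used.
kernel_series() {
    local release="${1:-$(uname -r)}"
    printf '%s\n' "$release" | cut -d. -f1,2
}

# Succeeds when the given (or running) kernel belongs to the 6.11
# series, which the reply above describes as EOL.
is_kernel_611() {
    [ "$(kernel_series "${1:-}")" = "6.11" ]
}
```

For example, `is_kernel_611 && echo "still on 6.11"` prints a reminder only while the old series is booted. On Manjaro, `mhwd-kernel -li` lists the installed kernels and `sudo mhwd-kernel -i linux612` installs the 6.12 series.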

Overheating and poor ventilation can also be responsible for random crashes; you might wish to look into that.

I’m sure someone will help further when they are able.

Regards.


Welcome to the Manjaro community.

As a new or infrequent forum user, please take some time to familiarise yourself with Forum requirements; in particular, the many ways to use the forum to your benefit:


Required Reading:

Resources:


Update Announcements:

The Update Announcements contain update related information and a Known Issues and Solutions section that should generally be checked before posting a request for support.


System Information:

Output of the following command (formatted according to forum requirements) may be useful for those wishing to help:

inxi --admin --verbosity=8 --filter --no-host --width

Be prepared to provide more information and outputs from other commands when asked.


Regards.

I found time to reconnect all connectors, clean all fans (and the rest of the computer) and update the kernel, but it didn’t help.

I checked the temperatures (using mangohud and KDE System Monitor) and I don’t see any problems with them: the GPU maintains a constant temperature of 66-67 degrees, and the CPU fluctuates between 55 and 65 degrees. I also ran CPU and GPU burn tests, and they ran fine for about 2 hours. Memtest doesn’t find any problems either.
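
On-screen monitors lose their readings when the machine freezes, so one option is to log samples to disk while playing and read the file after the reset. This is a minimal sketch under stated assumptions: the function names, the default log path and the 2-second interval are all hypothetical, and it relies on `nvidia-smi --query-gpu` with the `temperature.gpu` and `power.draw` fields.

```shell
#!/usr/bin/env bash

# Append one timestamped sample (time, GPU temp, power draw) per
# interval and flush the file, so the last lines survive a hard reset.
# Log path and interval defaults are assumptions for illustration.
log_gpu_samples() {
    local logfile="${1:-$HOME/gpu-samples.csv}" interval="${2:-2}"
    while true; do
        printf '%s,%s\n' "$(date +%T)" \
            "$(nvidia-smi --query-gpu=temperature.gpu,power.draw \
                          --format=csv,noheader,nounits)" >> "$logfile"
        sync "$logfile"
        sleep "$interval"
    done
}

# Read such a log on stdin and print the highest temperature seen
# (the second comma-separated field).
max_logged_temp() {
    awk -F',' '$2 + 0 > max { max = $2 + 0 } END { print max + 0 }'
}
```

Starting `log_gpu_samples &` before launching a game and running `max_logged_temp < ~/gpu-samples.csv` after a freeze shows the peak temperature, and the last line of the file shows the state closest to the freeze.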

I am thinking of testing other hardware (new RAM, CPU, motherboard), but let’s assume it’s not a hardware problem: what else can I check?

It’s hard to say what is going on, but I stand by my first impression: if you have the issue on Windows too, then to me it is hardware related.

1. Another question: do you have games which run 100% stable?

2. You mentioned that you also have this problem under Windows.
Have you had these problems since you built this PC? If not, how old is your PC and since when have you had issues?

3. Tell us your GPU hotspot temps.

  1. Yes, such as Noita, Minecraft, Brotato or Stardew Valley. These are mainly less demanding games.
  2. Since I built it.
  3. I can’t get this number, and I’m not sure why. It’s not shown in any app I tried: mangohud, nvidia-settings, nvidia-smi.
nvidia-smi -q -d TEMPERATURE
==============NVSMI LOG==============

Timestamp                                 : Tue Jan 21 15:40:27 2025
Driver Version                            : 550.144.03
CUDA Version                              : 12.4

Attached GPUs                             : 1
GPU 00000000:01:00.0
    Temperature
        GPU Current Temp                  : 40 C
        GPU T.Limit Temp                  : N/A
        GPU Shutdown Temp                 : 97 C
        GPU Slowdown Temp                 : 92 C
        GPU Max Operating Temp            : 90 C
        GPU Target Temperature            : 84 C
        Memory Current Temp               : N/A
        Memory Max Operating Temp         : N/A
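
The hotspot value is not in this output, but the fields that are present can at least be compared with each other. As a small sketch (the function name is hypothetical), the distance between the current temperature and the slowdown threshold can be computed from the same `nvidia-smi -q -d TEMPERATURE` text:

```shell
#!/usr/bin/env bash

# Read `nvidia-smi -q -d TEMPERATURE` output on stdin and print how
# many degrees the current GPU temperature is below the slowdown
# threshold. Exits non-zero if either field is missing.
thermal_headroom() {
    awk -F':' '
        /GPU Current Temp/  { cur  = $2 + 0 }
        /GPU Slowdown Temp/ { slow = $2 + 0 }
        END { if (cur && slow) print slow - cur; else exit 1 }
    '
}
```

Usage would be `nvidia-smi -q -d TEMPERATURE | thermal_headroom`. This only reflects the core sensor, so it says nothing about hotspot or memory temperatures, which this driver output reports as N/A.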

Under Windows you can use HWiNFO or GPU-Z. I saw some people mention that there are apps on Arch (Hardinfo2 and hw-probe) that can also show the GPU hotspot, but I just tried it myself and Hardinfo2 doesn’t show my NVIDIA GPU hotspot.

Default BIOS setups today overclock your CPU and sometimes put your system in dangerous voltage states.

Almost all mainboard vendors have horrible default settings that you need to adjust yourself.

Here is just an example:

I checked it, updated the BIOS and adjusted the settings, and after the last 3 hours I think that was the problem! I played Satisfactory and Star Citizen without any crashes.
I will test it more over the next few days to confirm.
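
To confirm that a BIOS update really took effect, one could record the firmware version and CPU microcode revision before and after flashing and compare the two lines. The sketch below assumes the usual Linux locations (`/sys/class/dmi/id/` and `/proc/cpuinfo`); the function names are hypothetical. For reference, the inxi output at the top of this thread reports UEFI v: H.40 (04/16/2024) and microcode 0x12B.

```shell
#!/usr/bin/env bash

# Print the microcode revision found in /proc/cpuinfo-style text
# supplied on stdin (first matching CPU only).
microcode_revision() {
    awk -F':' '/^microcode/ { gsub(/[ \t]/, "", $2); print $2; exit }'
}

# Hypothetical helper: one-line firmware snapshot that can be saved
# before and after a BIOS update and then compared.
firmware_snapshot() {
    printf 'bios=%s date=%s microcode=%s\n' \
        "$(cat /sys/class/dmi/id/bios_version)" \
        "$(cat /sys/class/dmi/id/bios_date)" \
        "$(microcode_revision < /proc/cpuinfo)"
}
```

Saving the output of `firmware_snapshot` to a file before the update and running it again afterwards makes it easy to see which values changed.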

Thank you for your help.


Please also confirm that you attended to the EOL kernel:

I currently use kernel 6.12; with kernel 6.6 (LTS) as a failsafe. This configuration is working fine (for me).

Either way, the EOL kernel 6.11 should be replaced.

Regards.

I’m glad I could help, happy gaming :slight_smile:

I switched to the same configuration as you (using 6.12 with 6.6 as a failsafe); I forgot to mention it earlier :wink:


The combination of updating your BIOS and switching to a supported kernel has probably helped greatly (but, mainly the BIOS, I suspect).

Enjoy your gaming.

Regards.
