I’m not entirely sure when this started or what could have prompted it. I monitor my GPU usage and temperature with MangoHud, and I’ve noticed that when playing any game now (emulators, Steam, Genshin, etc.), at some point during gameplay my GPU fan speed instantly jumps to 100% and stays there, regardless of the actual GPU usage or temperature. It seems to get triggered at very low temperatures: in some of these games I never see the GPU go over 55 °C (it typically sits around 40-45 °C), and the fan will still ramp up and stay there.
Even closing the game/application does not return the fan speed to normal; I have to reboot. I installed GreenWithEnvy to check the fan profile, but it doesn’t appear to be able to see the fan information (it reads 0% fan duty and 0 RPM), and attempting to change the fan profile has no effect. I also tried upgrading my kernel from 5.13 to 5.15.
The GPU is a GTX 1660 Super, and I’m using the latest proprietary Nvidia driver, 495.44. Any ideas are greatly appreciated, as this issue makes me pretty uncomfortable. For obvious reasons I don’t want to yeet a fan on my GPU and have to replace it in the current market.
Please let me know what other information would be useful.
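In case a log of the readings would help, here is a rough sketch of how temperature, fan duty and GPU usage could be polled through `nvidia-smi` to catch the moment the fan jumps. The function names, the 60 °C / 95% thresholds and the handling of `[N/A]` fields are assumptions made for illustration, not something taken from a tool mentioned above.

```python
import subprocess

# Fields requested from nvidia-smi, in this order.
QUERY = "temperature.gpu,fan.speed,utilization.gpu"


def to_int(field):
    """Convert one CSV field to int, or None if the driver reports no value."""
    field = field.strip()
    return int(field) if field.isdigit() else None


def parse_sample(line):
    """Parse a line such as '43, 100, 12' into (temp_c, fan_pct, util_pct)."""
    temp, fan, util = (to_int(f) for f in line.split(","))
    return temp, fan, util


def is_suspicious(temp_c, fan_pct, temp_limit=60, fan_limit=95):
    """True when the fan is near full duty although the GPU is cool.

    The thresholds are arbitrary illustration values.
    """
    if temp_c is None or fan_pct is None:
        return False
    return fan_pct >= fan_limit and temp_c < temp_limit


def read_sample():
    """Ask nvidia-smi for one reading of the first GPU."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_sample(out.splitlines()[0])
```

Calling `read_sample()` every few seconds and printing a timestamp whenever `is_suspicious(...)` first returns true would give a time to compare against the system log. If the fan field comes back as `None`, that would match what GreenWithEnvy shows.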
Edit: Here is the inxi output with my system info:
System:
Kernel: 5.15.7-1-MANJARO x86_64 bits: 64 compiler: gcc v: 11.1.0
parameters: BOOT_IMAGE=/boot/vmlinuz-5.15-x86_64
root=UUID=fada4d6d-7bdb-40cc-a80c-ed14fd89d9ac rw quiet apparmor=1
security=apparmor udev.log_priority=3
Desktop: KDE Plasma 5.23.4 tk: Qt 5.15.2 wm: kwin_x11 vt: 1 dm: SDDM
Distro: Manjaro Linux base: Arch Linux
Machine:
Type: Desktop System: ASUS product: N/A v: N/A serial: <superuser required>
Mobo: ASUSTeK model: PRIME B550M-A (WI-FI) v: Rev X.0x
serial: <superuser required> UEFI: American Megatrends v: 2423
date: 08/09/2021
Battery:
Message: No system battery data found. Is one present?
Memory:
RAM: total: 31.32 GiB used: 2.87 GiB (9.2%)
RAM Report:
permissions: Unable to run dmidecode. Root privileges required.
CPU:
Info: model: AMD Ryzen 7 5800X bits: 64 type: MT MCP arch: Zen 3
family: 0x19 (25) model-id: 0x21 (33) stepping: 0 microcode: 0xA201016
Topology: cpus: 1x cores: 8 tpc: 2 threads: 16 smt: enabled cache:
L1: 512 KiB desc: d-8x32 KiB; i-8x32 KiB L2: 4 MiB desc: 8x512 KiB
L3: 32 MiB desc: 1x32 MiB
Speed (MHz): avg: 3161 high: 3807 min/max: 2200/4850 boost: enabled
scaling: driver: acpi-cpufreq governor: schedutil cores: 1: 2813 2: 2874
3: 3305 4: 3593 5: 2873 6: 2871 7: 3588 8: 3591 9: 3320 10: 2874 11: 2864
12: 2871 13: 2876 14: 2872 15: 3587 16: 3807 bogomips: 121425
Flags: 3dnowprefetch abm adx aes aperfmperf apic arat avic avx avx2 bmi1
bmi2 bpext cat_l3 cdp_l3 clflush clflushopt clwb clzero cmov cmp_legacy
constant_tsc cpb cpuid cqm cqm_llc cqm_mbm_local cqm_mbm_total
cqm_occup_llc cr8_legacy cx16 cx8 de decodeassists erms extapic
extd_apicid f16c flushbyasid fma fpu fsgsbase fsrm fxsr fxsr_opt ht
hw_pstate ibpb ibrs ibs invpcid irperf lahf_lm lbrv lm mba mca mce
misalignsse mmx mmxext monitor movbe msr mtrr mwaitx nonstop_tsc nopl npt
nrip_save nx ospke osvw overflow_recov pae pat pausefilter pclmulqdq
pdpe1gb perfctr_core perfctr_llc perfctr_nb pfthreshold pge pku pni popcnt
pse pse36 rapl rdpid rdpru rdrand rdseed rdt_a rdtscp rep_good sep sha_ni
skinit smap smca smep ssbd sse sse2 sse4_1 sse4_2 sse4a ssse3 stibp succor
svm svm_lock syscall tce topoext tsc tsc_scale umip v_spec_ctrl
v_vmsave_vmload vaes vgif vmcb_clean vme vmmcall vpclmulqdq wbnoinvd wdt
xgetbv1 xsave xsavec xsaveerptr xsaveopt xsaves
Vulnerabilities:
Type: itlb_multihit status: Not affected
Type: l1tf status: Not affected
Type: mds status: Not affected
Type: meltdown status: Not affected
Type: spec_store_bypass
mitigation: Speculative Store Bypass disabled via prctl and seccomp
Type: spectre_v1
mitigation: usercopy/swapgs barriers and __user pointer sanitization
Type: spectre_v2 mitigation: Full AMD retpoline, IBPB: conditional,
IBRS_FW, STIBP: always-on, RSB filling
Type: srbds status: Not affected
Type: tsx_async_abort status: Not affected
Graphics:
Device-1: NVIDIA TU116 [GeForce GTX 1660 SUPER] vendor: Gigabyte
driver: nvidia v: 495.44 alternate: nouveau,nvidia_drm bus-ID: 0a:00.0
chip-ID: 10de:21c4 class-ID: 0300
Display: x11 server: X.Org 1.21.1.2 compositor: kwin_x11 driver:
loaded: nvidia display-ID: :0 screens: 1
Screen-1: 0 s-res: 3840x2160 s-dpi: 102 s-size: 956x543mm (37.6x21.4")
s-diag: 1099mm (43.3")
Monitor-1: HDMI-0 res: 3840x2160 hz: 60 dpi: 122
size: 800x450mm (31.5x17.7") diag: 918mm (36.1")
OpenGL: renderer: NVIDIA GeForce GTX 1660 SUPER/PCIe/SSE2
v: 4.6.0 NVIDIA 495.44 direct render: Yes
Audio:
Device-1: NVIDIA TU116 High Definition Audio vendor: Gigabyte
driver: snd_hda_intel v: kernel bus-ID: 0a:00.1 chip-ID: 10de:1aeb
class-ID: 0403
Device-2: AMD Starship/Matisse HD Audio vendor: ASUSTeK
driver: snd_hda_intel v: kernel bus-ID: 0c:00.4 chip-ID: 1022:1487
class-ID: 0403
Sound Server-1: ALSA v: k5.15.7-1-MANJARO running: yes
Sound Server-2: JACK v: 1.9.19 running: no
Sound Server-3: PulseAudio v: 15.0 running: yes
Sound Server-4: PipeWire v: 0.3.40 running: yes
Network:
Device-1: Intel Wi-Fi 6 AX200 driver: iwlwifi v: kernel bus-ID: 08:00.0
chip-ID: 8086:2723 class-ID: 0280
IF: wlp8s0 state: up mac: <filter>
IP v4: <filter> type: dynamic noprefixroute scope: global
broadcast: <filter>
Device-2: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet
vendor: ASUSTeK PRIME B450M-A driver: r8169 v: kernel port: f000
bus-ID: 09:00.0 chip-ID: 10ec:8168 class-ID: 0200
IF: enp9s0 state: down mac: <filter>
IF-ID-1: wg-mullvad state: unknown speed: N/A duplex: N/A mac: N/A
IP v4: <filter> scope: global
WAN IP: <filter>
Bluetooth:
Device-1: Intel AX200 Bluetooth type: USB driver: btusb v: 0.8 bus-ID: 1-5:2
chip-ID: 8087:0029 class-ID: e001
Report: rfkill ID: hci0 rfk-id: 1 state: up address: see --recommends
Logical:
Message: No logical block device data found.
RAID:
Message: No RAID data found.
Drives:
Local Storage: total: 10.14 TiB used: 8.24 TiB (81.3%)
SMART Message: Unable to run smartctl. Root privileges required.
ID-1: /dev/sda maj-min: 8:0 vendor: PNY model: SSD2SC120G1SA754D117-820
size: 111.79 GiB block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s
type: SSD serial: <filter> rev: 0A scheme: GPT
ID-2: /dev/sdb maj-min: 8:16 vendor: Samsung model: SSD 850 PRO 1TB
size: 953.87 GiB block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s
type: SSD serial: <filter> rev: 2B6Q scheme: GPT
ID-3: /dev/sdc maj-min: 8:32 vendor: Seagate model: ST10000NM0086-2AA101
size: 9.1 TiB block-size: physical: 4096 B logical: 512 B speed: 6.0 Gb/s
type: HDD rpm: 7200 serial: <filter> rev: SN05 scheme: GPT
Message: No optical or floppy data found.
Partition:
ID-1: / raw-size: 111.49 GiB size: 109.18 GiB (97.93%)
used: 33.5 GiB (30.7%) fs: ext4 dev: /dev/sda2 maj-min: 8:2 label: N/A
uuid: fada4d6d-7bdb-40cc-a80c-ed14fd89d9ac
ID-2: /boot/efi raw-size: 300 MiB size: 299.4 MiB (99.80%)
used: 288 KiB (0.1%) fs: vfat dev: /dev/sda1 maj-min: 8:1 label: NO_LABEL
uuid: 6B06-7918
ID-3: /media/viejo/Media raw-size: 9.1 TiB size: 9.02 TiB (99.20%)
used: 7.92 TiB (87.8%) fs: ext4 dev: /dev/sdc1 maj-min: 8:33 label: Media
uuid: ad03e1ca-ec94-40e6-b1d7-69899065cbb1
ID-4: /media/viejo/SSDGames raw-size: 953.87 GiB size: 937.82 GiB (98.32%)
used: 296.27 GiB (31.6%) fs: ext4 dev: /dev/sdb1 maj-min: 8:17
label: SSD Games uuid: f8e436ed-6dfe-4dc7-8e9d-8c33d597863a
Swap:
Alert: No swap data was found.
Unmounted:
Message: No unmounted partitions found.
USB:
Hub-1: 1-0:1 info: Hi-speed hub with single TT ports: 10 rev: 2.0
speed: 480 Mb/s chip-ID: 1d6b:0002 class-ID: 0900
Device-1: 1-5:2 info: Intel AX200 Bluetooth type: Bluetooth driver: btusb
interfaces: 2 rev: 2.0 speed: 12 Mb/s power: 100mA chip-ID: 8087:0029
class-ID: e001
Device-2: 1-6:3 info: ASUSTek AURA LED Controller type: HID
driver: hid-generic,usbhid interfaces: 2 rev: 2.0 speed: 12 Mb/s power: 16mA
chip-ID: 0b05:1939 class-ID: 0300 serial: <filter>
Device-3: 1-7:4 info: SINOWEALTH Game Mouse type: Mouse,Keyboard
driver: hid-generic,usbhid interfaces: 2 rev: 1.1 speed: 12 Mb/s
power: 480mA chip-ID: 258a:1007 class-ID: 0301
Device-4: 1-8:5 info: Logitech Keyboard K120 type: Keyboard,HID
driver: hid-generic,usbhid interfaces: 2 rev: 1.1 speed: 1.5 Mb/s
power: 90mA chip-ID: 046d:c31c class-ID: 0300
Hub-2: 2-0:1 info: Super-speed hub ports: 4 rev: 3.1 speed: 10 Gb/s
chip-ID: 1d6b:0003 class-ID: 0900
Hub-3: 3-0:1 info: Hi-speed hub with single TT ports: 2 rev: 2.0
speed: 480 Mb/s chip-ID: 1d6b:0002 class-ID: 0900
Hub-4: 4-0:1 info: Super-speed hub ports: 4 rev: 3.1 speed: 10 Gb/s
chip-ID: 1d6b:0003 class-ID: 0900
Hub-5: 5-0:1 info: Hi-speed hub with single TT ports: 4 rev: 2.0
speed: 480 Mb/s chip-ID: 1d6b:0002 class-ID: 0900
Hub-6: 6-0:1 info: Super-speed hub ports: 4 rev: 3.1 speed: 10 Gb/s
chip-ID: 1d6b:0003 class-ID: 0900
Sensors:
System Temperatures: cpu: N/A mobo: N/A gpu: nvidia temp: 33 C
Fan Speeds (RPM): N/A gpu: nvidia fan: 0%
Info:
Processes: 353 Uptime: 38m wakeups: 0 Init: systemd v: 249 tool: systemctl
Compilers: gcc: 11.1.0 Packages: 1389 pacman: 1377 lib: 466 flatpak: 12
Shell: Bash v: 5.1.12 running-in: konsole inxi: 3.3.11
And here is the error output from dmesg. It looks like there could be something here. Maybe a driver error?
[ 0.566072] #9 #10 #11 #12 #13 #14 #15
[ 1.606232] ata2.00: supports DRM functions and may not be fully accessible
[ 1.621594] ata2.00: supports DRM functions and may not be fully accessible
[ 3.474209] ata5: failed to resume link (SControl 0)
[ 4.491077] ipmi_si: Unable to find any System Interface(s)
[ 4.587946] acpi PNP0C14:01: duplicate WMI GUID 05901221-D566-11D1-B2F0-00A0C9062910 (first instance was on PNP0C14:00)
[ 4.587993] acpi PNP0C14:02: duplicate WMI GUID 05901221-D566-11D1-B2F0-00A0C9062910 (first instance was on PNP0C14:00)
[ 4.588024] acpi PNP0C14:03: duplicate WMI GUID 05901221-D566-11D1-B2F0-00A0C9062910 (first instance was on PNP0C14:00)
[ 4.588079] acpi PNP0C14:04: duplicate WMI GUID 05901221-D566-11D1-B2F0-00A0C9062910 (first instance was on PNP0C14:00)
[ 4.677095] sp5100-tco sp5100-tco: Watchdog hardware is disabled
[ 4.871070] nvidia: loading out-of-tree module taints kernel.
[ 4.871084] nvidia: module license 'NVIDIA' taints kernel.
[ 4.871086] Disabling lock debugging due to kernel taint
[ 4.878847] iwlwifi 0000:08:00.0: Direct firmware load for iwlwifi-cc-a0-66.ucode failed with error -2
[ 4.878915] iwlwifi 0000:08:00.0: Direct firmware load for iwlwifi-cc-a0-65.ucode failed with error -2
[ 4.878940] iwlwifi 0000:08:00.0: Direct firmware load for iwlwifi-cc-a0-64.ucode failed with error -2
[ 4.885490] iwlwifi 0000:08:00.0: api flags index 2 larger than supported by driver
[ 4.914832] kvm: disabled by bios
[ 4.975121] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 495.44 Fri Oct 22 06:13:12 UTC 2021
[ 5.019065] usb 1-6: config 1 has an invalid interface number: 2 but max is 1
[ 5.019067] usb 1-6: config 1 has no interface number 1
[ 5.043525] kvm: disabled by bios
[ 5.192541] thermal thermal_zone0: failed to read out thermal zone (-61)
[ 5.250622] nvidia_uvm: module uses symbols from proprietary module nvidia, inheriting taint.
[ 5.255084] kvm: disabled by bios
[ 5.416390] kvm: disabled by bios
[ 5.422039] urandom_read: 4 callbacks suppressed
[ 5.565754] kvm: disabled by bios
[ 5.590400] kauditd_printk_skb: 30 callbacks suppressed
[ 5.677655] kvm: disabled by bios
[ 5.819447] kvm: disabled by bios
[ 6.003436] kvm: disabled by bios
[ 6.039261] nvidia-gpu 0000:0a:00.3: i2c timeout error e0000000
[ 6.039264] ucsi_ccg 0-0008: i2c_transfer failed -110
[ 6.039266] ucsi_ccg 0-0008: ucsi_ccg_init failed - -110
[ 6.039268] ucsi_ccg: probe of 0-0008 failed with error -110
[ 6.137696] kvm: disabled by bios
[ 6.271360] kvm: disabled by bios
[ 10.756920] kauditd_printk_skb: 25 callbacks suppressed
[ 16.318268] kauditd_printk_skb: 37 callbacks suppressed
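To make the output above easier to scan, here is a minimal sketch of how the lines that might concern the GPU could be pulled out of a dmesg dump. The keyword list is a guess at what could be relevant (a match does not mean the line has anything to do with the fan problem), and `gpu_related` is a hypothetical helper name.

```python
import re

# Matches lines like "[ 6.039261] nvidia-gpu 0000:0a:00.3: i2c timeout ..."
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")

# Assumed keywords; adjust to taste.
KEYWORDS = ("nvidia", "nvrm", "ucsi_ccg", "thermal")


def gpu_related(lines, keywords=KEYWORDS):
    """Return (timestamp, message) pairs whose message mentions a keyword."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip anything that is not a timestamped kernel line
        timestamp, message = float(match.group(1)), match.group(2)
        if any(word in message.lower() for word in keywords):
            hits.append((timestamp, message))
    return hits
```

Feeding it the lines above keeps the `nvidia`, `NVRM`, `ucsi_ccg` and `thermal` entries and drops the repeated `kvm: disabled by bios` noise.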