After recent update, GPU fan always on

Nvidia GTX 960 with relatively low temps. Before the update, the fan was nearly always off in non-gaming usage.

nvidia-smi -q -d TEMPERATURE
==============NVSMI LOG==============

Timestamp : Fri Sep 9 17:28:25 2022
Driver Version : 515.65.01
CUDA Version : 11.7

Attached GPUs : 1
GPU 00000000:29:00.0
Temperature
GPU Current Temp : 36 C
GPU Shutdown Temp : 101 C
GPU Slowdown Temp : 96 C
GPU Max Operating Temp : N/A
GPU Target Temperature : 80 C
Memory Current Temp : N/A
Memory Max Operating Temp : N/A

Using Nvidia Settings I can control the fan speed, but when I set Target Speed to 1%, it never goes below ~1520 rpm. It seems some “always on” flag is now active.
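
To compare readings before and after an update without opening the GUI, fan speed and temperature can also be polled from a terminal. This is a minimal sketch, assuming nvidia-smi is on the PATH. format_reading is a hypothetical helper name, and the query reports fan speed as a percentage of maximum, not as rpm.

```shell
# Turn one CSV line "fan_percent, temp_c" into a readable reading.
# format_reading is a hypothetical helper, not part of any NVIDIA tool.
format_reading() {
    echo "$1" | awk -F', *' '{ printf "fan=%s%% temp=%sC\n", $1, $2 }'
}

# Query the driver only when nvidia-smi is actually installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=fan.speed,temperature.gpu --format=csv,noheader,nounits |
        while read -r line; do format_reading "$line"; done
fi
```

Running this once at idle on the desktop and once from a text console (before the desktop loads) would show whether the fan floor only appears once the graphical session starts.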

I see nothing different on my end.

In Nvidia Settings I have set the default Thermal Settings to this:

And the PowerMizer to Auto:

It is also true that in my /etc/X11/mhwd.d/nvidia.conf I use this in Section "Device":

Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BoardName      "GeForce GTX 960"
    Option         "TripleBuffer"  "On"
    Option  "ConnectToAcpid"    "Off"
    Option         "Coolbits" "31"
EndSection
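
As a side note on the Coolbits value in that section: Coolbits is a bitmask, and the bit with value 4 is the one that unlocks manual fan control in nvidia-settings. The value 31 sets all five low bits, so fan control is included. A small sketch of the check, where coolbits_allows_fan_control is a hypothetical helper name:

```shell
# Succeed if the Coolbits value has bit 2 (value 4) set, the bit that
# enables manual fan control in nvidia-settings.
# coolbits_allows_fan_control is a hypothetical helper name.
coolbits_allows_fan_control() {
    [ $(( $1 & 4 )) -ne 0 ]
}

if coolbits_allows_fan_control 31; then
    echo "Coolbits 31: manual fan control enabled"
fi
```

So a missing Coolbits bit would explain a greyed-out fan slider, but not a slider that works and simply refuses to go below a floor.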

And the Nvidia GPU fan is at 0% at idle:

Sensors:
  System Temperatures: cpu: 45.0 C mobo: N/A gpu: nvidia temp: 42 C
  Fan Speeds (RPM): N/A gpu: nvidia fan: 0%

Did you reboot the system and check without any app if the fan is off?

Everything except TripleBuffer and ConnectToAcpid in nvidia.conf is identical.
However in Nvidia Settings the fan speed is automatically set to a target of 45%.
This is the first time I’ve looked at the fan speeds for the GPU on this installation, so I can be certain that I didn’t manually alter any settings. I can alter the fan speed higher but lowering to anything below 45% has no effect.

After a restart the fan is off; as soon as the KDE desktop starts to load, it spins up.

Well, enable it and put it back to zero … apply and then disable it again. Is that helping?

Driver versions: [screenshot: Screenshot_20220909_182851]

Nope, enabled I can go 45 and higher and it responds as expected. Anything below 45 has no effect, it remains at the same RPM.
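
The same test can be repeated from a terminal to rule out the GUI itself. This is only a sketch: GPUFanControlState and GPUTargetFanSpeed are nvidia-settings attributes, but the gpu:0 and fan:0 indices are assumptions for a single-GPU card, and fan_speed_cmd is a hypothetical helper that only prints the command so it can be checked before running it.

```shell
# Print (do not run) the nvidia-settings call that requests a fan speed.
# gpu:0 and fan:0 are assumed indices for a single-GPU setup;
# fan_speed_cmd is a hypothetical helper name.
fan_speed_cmd() {
    printf 'nvidia-settings -a "[gpu:0]/GPUFanControlState=1" -a "[fan:0]/GPUTargetFanSpeed=%s"\n' "$1"
}

fan_speed_cmd 30
```

Pasting the printed line into a terminal inside the running X session applies it. If the fan still holds the same rpm at a requested 30%, the floor is probably being enforced below nvidia-settings, in the driver or the card's firmware.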

Did you at some point use the Green With Envy app, so that the conf was overwritten? Also, do the settings in your home directory contain some funky lines?
My ~/.nvidia-settings-rc is like this:

RcFileLocale = C
DisplayStatusBar = Yes
SliderTextEntries = Yes
IncludeDisplayNameInConfigFile = No
UpdateRulesOnProfileNameChange = Yes
Timer = PowerMizer_Monitor_(GPU_0),Yes,1000
Timer = Thermal_Monitor_(GPU_0),Yes,1000
Timer = Memory_Used_(GPU_0),Yes,3000

# Attributes:

0/SyncToVBlank=1
0/LogAniso=0
0/FSAA=0
0/TextureClamping=1
0/FXAA=0
0/AllowFlipping=1
0/FSAAAppControlled=1
0/LogAnisoAppControlled=1
0/OpenGLImageSettings=1
0/FSAAAppEnhanced=0
0/ShowGraphicsVisualIndicator=0

The rest is not relevant though … and it seems everything is identical. What kernel are you on, and what branch?

First time hearing about Green With Envy, never installed it. The install is also fairly fresh, about 2 months old.
My ~/.nvidia-settings-rc is identical to yours.
Kernel is 5.15.60-1.
I appreciate the help.

Did you try different kernels? The 5.10 LTS and the 5.19.1-3?

Tried both just now, no difference. Is there a reliable way to downgrade Nvidia drivers?

Only if you still have them in the cache, so check with:
ls -hl /var/cache/pacman/pkg | grep nvidia
ls -hl /var/cache/pacman/pkg | grep linux5
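
If older files do show up, a cached package can be installed again with pacman -U. A rough sketch follows; list_cached_pkgs is a hypothetical helper, and the path in the last comment is a placeholder to be filled in from the listing. The driver is typically split over several packages (the utilities plus a per-kernel module package), so it is assumed here that all of them would need to go back to the same version together.

```shell
# List cached package files whose name contains a pattern.
# list_cached_pkgs is a hypothetical helper; the second argument
# defaults to pacman's standard cache directory.
list_cached_pkgs() {
    pattern="$1"
    cache_dir="${2:-/var/cache/pacman/pkg}"
    ls "$cache_dir" 2>/dev/null | grep -- "$pattern"
}

# Usage:
#   list_cached_pkgs nvidia
# Then install the chosen older file (placeholder path, not a real file name):
#   sudo pacman -U /var/cache/pacman/pkg/<older-package-file>
```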

Before downgrading, you can try uninstalling them, removing all configs, rebooting, installing them again, and rebooting once more … do you have dual graphics?

No dual graphics, just a basic desktop setup (no integrated graphics).

OK, so let's first try reinstalling it. Post the output from:
mhwd -l && mhwd -li
find /etc/X11/ -name "*.conf"

0000:25:00.0 (0200:10ec:8168) Network controller Realtek Semiconductor Co., Ltd.:


              NAME               VERSION          FREEDRIVER           TYPE

     network-r8168            2016.04.20                true            PCI

0000:29:00.0 (0300:10de:1401) Display controller nVidia Corporation:


              NAME               VERSION          FREEDRIVER           TYPE

      video-nvidia            2021.11.04               false            PCI
video-nvidia-470xx            2021.11.04               false            PCI
video-nvidia-390xx            2021.11.26               false            PCI
       video-linux            2018.05.04                true            PCI
 video-modesetting            2020.01.13                true            PCI
        video-vesa            2017.03.12                true            PCI

Installed PCI configs:


              NAME               VERSION          FREEDRIVER           TYPE

      video-nvidia            2021.12.18               false            PCI

Warning: No installed USB configs!

/etc/X11/mhwd.d/nvidia.conf
/etc/X11/xorg.conf.d/30-touchpad.conf
/etc/X11/xorg.conf.d/00-keyboard.conf
/etc/X11/xorg.conf.d/90-mhwd.conf

OK, so remove them:
sudo mhwd -r pci video-nvidia
Then remove this leftover config, if it wasn't removed by the uninstall:
sudo rm /etc/X11/mhwd.d/nvidia.conf
Then install them again:
sudo mhwd -i pci video-nvidia
Reboot and see if it helped.

Excuse the late reply. I tried the suggestion, to no effect. It seems the change was already made at the hardware level, since even in the newest front-end the minimum fan rate is 45%.

I am not sure if this is the source of your problem, but after the last update I had to disable the compositor to keep my PC usable, on two separate machines (both on Manjaro). First I had to kill it (ps aux | grep compiz and kill PID). Then, since I am using KDE, I went to System Settings > Display and Monitor > Compositor and unchecked the box for Enable at startup, to prevent it from spawning again with xorg. Please try it and see if it helps.

Wouldn’t disabling compositor introduce other issues?

It is mostly for desktop effects such as transparency, etc. AFAIK. In my case, my computer became usable and responsive again, so it was a solution rather than trouble.

I experienced this, and the funny thing is that in my case one small cable was blocking one of the fans, which made the whole thing behave strangely: it would speed up the other 2 fans to 100%.

I realized this because it started happening after I worked and arranged some things in the case.

Maybe it helps in some cases. It was a hardware issue :)