Amdgpu no fan speed

Before anything else, this is how I got amdgpu running (instead of the proprietary driver), just in case this particular setup is what is causing the problem:

/etc/default/grub
radeon.cik_support=0 amdgpu.cik_support=1 radeon.si_support=0 amdgpu.si_support=1

/etc/modprobe.d/amdgpu.conf
options amdgpu si_support=1
options amdgpu cik_support=1

/etc/modprobe.d/radeon.conf
options radeon si_support=0
options radeon cik_support=0

So basically my issue is that the GPU fan speed is not showing up anywhere (sensors, CoreCtrl, etc.).

sensors output:
amdgpu-pci-0100
Adapter: PCI adapter
vddgfx:      850.00 mV 
fan1:             N/A  (min =    0 RPM, max = 6000 RPM)
edge:         +31.0°C  (crit = +104000.0°C, hyst = -273.1°C)
power1:        6.21 W  (cap =  47.00 W)

I tried setting a fan divisor in /etc/sensors.d/fan-speed-control.conf like so:

chip "amdgpu-pci-0100*"
set fan1_div 4

and then:

sudo sensors -s
Error: File /etc/sensors.d/fan-speed-control.conf, line 1: Parse error in chip name

or, if the first line is chip "amdgpu-*":

Error: File /etc/sensors.d/fan-speed-control.conf, line 2: Unknown feature name
amdgpu-pci-0100: No such subfeature known

also, assuming that this is the correct path:

cat /sys/class/drm/card0/device/hwmon/hwmon4/fan1_enable
1

I also installed and started amdgpu-fan.service; nothing changed.

pwmconfig output:

...
Found the following devices:
   hwmon0 is sony_controller_battery_00:06:f5:d9:70:1f
   hwmon1 is atk0110
   hwmon2 is fam15h_power
   hwmon3 is k10temp
   hwmon4 is amdgpu

Found the following PWM controls:
   hwmon4/pwm1           current value: 79

Giving the fans some time to reach full speed...
Found the following fan sensors:
   hwmon1/fan1_input     current speed: 3026 RPM
   hwmon1/fan2_input     current speed: 3183 RPM
cat: hwmon4/fan1_input: No such device
   hwmon4/fan1_input     current speed:  RPM

btw, what does this mean?

Found the following PWM controls:
   hwmon4/pwm1           current value: 79
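
As far as I understand the kernel's hwmon sysfs interface, pwm1 is the fan's duty-cycle setting on a scale of 0 to 255, not an RPM reading. So a value of 79 would be roughly 31% of full speed. A minimal Python sketch of that conversion (the function name is made up for illustration):

```python
def pwm_to_percent(pwm: int) -> float:
    """Convert a hwmon pwm value (0-255) to a percentage of full duty cycle."""
    if not 0 <= pwm <= 255:
        raise ValueError(f"pwm value out of range: {pwm}")
    return pwm / 255 * 100


# 79 is the value pwmconfig reported for hwmon4/pwm1
print(f"{pwm_to_percent(79):.0f}%")  # prints 31%
```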

Afterwards I ran fancontrol, and it complained about a bad configuration. I ran pwmconfig again and went through its configuration wizard; it did nothing at first. I ran it again and my GPU fan went to 100%, and no matter what I did it kept running at full speed. It went back to normal after rebooting.

Apparently CoreCtrl is now able to change my GPU fan speed, but it still shows no GPU fan speed reading. After configuring fancontrol with pwmconfig I got the fancontrol service running properly, and I was also able to configure it with fancontrol-gui, but still no fan speed…

btw pwmconfig is a little confusing, so i need help with these numbers:

# Configuration file generated by pwmconfig, changes will be lost
INTERVAL=10
DEVPATH=hwmon4=devices/pci0000:00/0000:00:02.0/0000:01:00.0
DEVNAME=hwmon4=amdgpu
FCTEMPS=hwmon4/pwm1=hwmon4/temp1_input
FCFANS= hwmon4/pwm1=
MINTEMP=hwmon4/pwm1=30
MAXTEMP=hwmon4/pwm1=75
MINSTART=hwmon4/pwm1=30
MINSTOP=hwmon4/pwm1=20
MAXPWM=hwmon4/pwm1=255
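
For what it's worth, this is my understanding of how fancontrol uses those numbers, written as a simplified Python sketch rather than the actual script. Below MINTEMP the fan gets the minimum PWM value (assumed to be 0 here, since MINPWM is not set in the file); at or above MAXTEMP it gets MAXPWM; in between, the value is interpolated linearly starting from MINSTOP. MINSTART, the value used to get a stopped fan spinning again, is left out. The function name and the defaults (taken from the config above) are illustrative only.

```python
def fancontrol_pwm(temp_c, mintemp=30, maxtemp=75, minstop=20, maxpwm=255, minpwm=0):
    """Approximate the PWM value fancontrol would pick for a temperature in °C."""
    if temp_c <= mintemp:
        return minpwm          # cool enough: minimum speed (possibly stopped)
    if temp_c >= maxtemp:
        return maxpwm          # too hot: full speed
    # Linear ramp from MINSTOP (at MINTEMP) up to MAXPWM (at MAXTEMP)
    fraction = (temp_c - mintemp) / (maxtemp - mintemp)
    return round(minstop + fraction * (maxpwm - minstop))


for temp in (25, 40, 60, 80):
    print(temp, fancontrol_pwm(temp))
```

Seen this way, MINSTART=30 and MINSTOP=20 are PWM values on the 0 to 255 scale, while MINTEMP and MAXTEMP are temperatures in °C.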

System Information:

CoreCtrl v1.1.1

==== Software ====
Kernel version: 5.10.34
Mesa version: 21.0.3

==== Radeon R7 360 [GPU 0] ====
BIOS version: 113-TOBAGO_PROL_D5_
Device: Tobago PRO
Device ID: 665F
Device model: Radeon R7 360
Device model ID: 7360
Driver: amdgpu
Memory: 2048MB
OpenGL version (compat): 4.6
OpenGL version (core): 4.6
PCI Slot: 0000:01:00.0
Revision: 81
Vendor: Advanced Micro Devices, Inc.
Vendor ID: 1002
Vendor model ID: 1682
Vulkan API version: 1.2.145

You could try running with the additional kernel parameter acpi_enforce_resources=lax, since it would seem to me the programs simply cannot access the necessary information.
You would be able to test my theory by checking the output of
cat /sys/class/drm/card0/device/hwmon/hwmon4/fan1_input

Now the GPU device is hwmon2 for whatever reason, and I got 'No such device' from hwmon2/fan1_input… CoreCtrl is able to change the fan speed, but it can't read it for whatever reason…
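
Since the hwmon index can change between boots (hwmon4 earlier, hwmon2 now), it may be easier to look the device up by the contents of its name file instead of hard-coding the number. A small Python sketch, assuming the usual /sys/class/hwmon layout where each hwmonN directory has a name file; find_hwmon is a hypothetical helper, not part of any tool mentioned here:

```python
from pathlib import Path


def find_hwmon(name, root="/sys/class/hwmon"):
    """Return the hwmon directories whose 'name' file matches, e.g. 'amdgpu'."""
    root_path = Path(root)
    if not root_path.is_dir():
        return []
    matches = []
    for entry in sorted(root_path.glob("hwmon*")):
        try:
            if (entry / "name").read_text().strip() == name:
                matches.append(entry)
        except OSError:
            # Entry without a readable name file: skip it
            continue
    return matches


if __name__ == "__main__":
    for path in find_hwmon("amdgpu"):
        print(path)
```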

After adding the kernel parameter acpi_enforce_resources=lax
you have to run this in a shell: sudo sensors-detect --auto
because the "count of devices" changes when using this kernel parameter…
Psensor is helpful for determining the right parameter for the GPU fan.
The configuration for displaying the sensors is in /etc/sensors.d

Thanks, but I probably won't use acpi_enforce_resources=lax anyway. I tried Psensor, same thing.

What does this show?
sudo sensors -u

amdgpu-pci-0100
Adapter: PCI adapter
vddgfx:
  in0_input: 0.850
fan1:
ERROR: Can't get value of subfeature fan1_input: Can't read
  fan1_min: 0.000
  fan1_max: 6000.000
edge:
  temp1_input: 32.000
  temp1_crit: 104000.000
  temp1_crit_hyst: -273.150
power1:
  power1_average: 5.108
  power1_cap: 47.000

This is likely a bug, though I don't know where to report it.

Well, you could try kernels 5.4 and 5.12 to get a clearer picture and find out whether the problem is specific to that kernel version.
And regarding filing a bug you can read up on it here.

I tried both kernels; no difference regarding the GPU sensors.

Thank you, will check it soon.

So… I've submitted this issue here, and according to the reply:

See https://www.kernel.org/doc/Documentation/hwmon/sysfs-interface for reference.  Try reading /sys/class/drm/card0/device/hwmon/hwmon4/pwm1
Some older boards only supported percentage based (pwm) fan controls.

which seems to be the case for this GPU (I think it's GCN 2), so this is not a bug. :partying_face:

So I think I'm going to submit a feature request to some apps like CoreCtrl to take this particular behavior into account when trying to monitor fan speed.
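
Roughly what I have in mind: try fan1_input first and fall back to pwm1 when the read fails. A rough Python sketch, assuming the hwmon directory is passed in and that pwm1 uses the usual 0 to 255 scale. The function name is hypothetical, and this is not how CoreCtrl or any other app actually does it:

```python
from pathlib import Path


def read_fan(hwmon_dir):
    """Return ("rpm", value) if fan1_input is readable, else ("percent", value) from pwm1."""
    hwmon = Path(hwmon_dir)
    try:
        return ("rpm", int((hwmon / "fan1_input").read_text()))
    except (OSError, ValueError):
        # e.g. "No such device" on boards without an RPM sensor
        pass
    pwm = int((hwmon / "pwm1").read_text())
    return ("percent", round(pwm / 255 * 100))
```

On this card that would report a percentage instead of N/A.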

Thank you guys for all your help :smiley:

Hmmm, I did not know that. Well, nice to know, and I'm happy that you now at least know what is happening and why. I hope you thanked him.
