Overheat or CPU Frequency Management Issues

Short backstory: I’ve been facing an overheating issue on my ASUS Vivobook 14 M1402IA-AM173 laptop running Manjaro (kernel 6.5.3-1-MANJARO, scaling_driver amd-pstate) with 16 GB of RAM and a Ryzen 7 4800HS processor. Every time I launch applications like PHPStorm (project indexing) or run PHP unit tests (Paratest), all CPU cores jump to their maximum frequency. According to htop, they reach 4300 MHz, and even when the processor heats up, the frequency isn’t reduced to prevent overheating. Eventually, it hits a critical temperature of 105 °C, and the laptop shuts down.

I couldn’t find another solution except for what I’ll describe below. Maybe you can help me with this! I’m tired of dealing with it; I need to unplug the laptop from the charger to get my work done.

I found the program TLP, and it does indeed control the governor and sets the maximum value (I talked to the maintainer, and they said the application is functioning correctly). However, nothing changes; it’s as if the processor itself is ignoring these settings and still raising the frequencies to 4300. Here’s a summary of tlp-stat -p:

--- TLP 1.6.0 --------------------------------------------

+++ Processor
CPU model = AMD Ryzen 7 4800HS with Radeon Graphics

/sys/devices/system/cpu/cpu0/cpufreq/scaling_driver = amd-pstate
/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor = ondemand
/sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors = conservative ondemand userspace powersave performance schedutil
/sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq = 400000 [kHz]
/sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq = 3900000 [kHz]
/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq = 400000 [kHz]
/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq = 4300000 [kHz]

/sys/devices/system/cpu/cpu1..cpu15: omitted for clarity, use -v to show all

/sys/devices/system/cpu/amd_pstate/status = guided
/sys/devices/system/cpu/amd_pstate/cppc_dynamic_boost = (not available)
/sys/devices/system/cpu/cpufreq/boost = 1
/sys/module/workqueue/parameters/power_efficient = Y
/proc/sys/kernel/nmi_watchdog = 0

+++ Platform Profile
/sys/firmware/acpi/platform_profile = (not available)
/sys/firmware/acpi/platform_profile_choices = (not available)

The most interesting part is that there is no such issue on Windows; the problem seems to be in Linux. I’ve tried it on Ubuntu 22.04, and it’s the same story.

If you know how to resolve this or can help, please do. I’m tired of this. The laptop is only a month old.

Hello @Troubleshooting359 :wink:

Your story leaves me wondering one thing: How is the CPU fan controlled? Via BIOS/UEFI or through the OS? Does the fan become louder/faster when the CPU generates heat?

If the fan speed has to be controlled by the OS here, then you have to set it accordingly. The default settings are most likely not good enough for your laptop.

In any case, it is always better to leave the fan control to the BIOS/UEFI if possible. Check the UEFI Settings.

Well, as I see it, the turbo boost is the problem. You could disable it with TLP in /etc/tlp.conf to mitigate your problem for the time being:

CPU_BOOST_ON_AC=0
CPU_BOOST_ON_BAT=0
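
To check whether the setting actually took effect, the boost flag can be read (and, as a quick experiment, written) directly through the sysfs file listed in the tlp-stat output above. This is only a sketch: the function names boost_status and set_boost are made up for illustration, and it assumes the file exists and is writable on your kernel and driver mode, which may not hold in every amd-pstate mode.

```shell
#!/bin/sh
# Path taken from the tlp-stat output in the first post; overridable for testing.
BOOST_FILE="${BOOST_FILE:-/sys/devices/system/cpu/cpufreq/boost}"

# Report the current boost state in words.
boost_status() {
    case "$(cat "$BOOST_FILE")" in
        1) echo "boost enabled" ;;
        0) echo "boost disabled" ;;
        *) echo "unknown" ;;
    esac
}

# Write 0 (off) or 1 (on) to the boost file; needs root on a real system.
set_boost() {
    case "$1" in
        0|1) printf '%s\n' "$1" > "$BOOST_FILE" ;;
        *) echo "usage: set_boost 0|1" >&2; return 1 ;;
    esac
}
```

After sourcing the file in a root shell, set_boost 0 followed by boost_status should report "boost disabled". A value written this way does not survive a reboot, so the tlp.conf lines are still the place for a permanent setting.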

Thank you for your response!
I was already considering turning off the turbo mode. Most likely, if I don’t find a solution, I will have to do that. But for me, it’s kind of like buying an RTX 4090 but playing on the integrated graphics card :).
After checking the BIOS, I couldn’t find any information about fan control. Could you please advise me on where to find these settings?
I’ve also updated the BIOS to the latest version (v 310).

In most cases there is no documentation for the UEFI, only for how to enter it. So sorry :man_shrugging: I guess you have to share pictures of it so that someone is able to help you.

But if there is none, then most likely it has to be managed by the OS.

Please share the output of:

sensors

No problems!



sensors:

$ sensors
amdgpu-pci-0400
Adapter: PCI adapter
vddgfx:      718.00 mV 
vddnb:       674.00 mV 
edge:         +46.0°C  
PPT:           4.00 W  

k10temp-pci-00c3
Adapter: PCI adapter
Tctl:         +52.5°C  

BAT0-acpi-0
Adapter: ACPI interface
in0:          11.85 V  

asus-isa-0000
Adapter: ISA adapter
cpu_fan:     1900 RPM

nvme-pci-0300
Adapter: PCI adapter
Composite:    +30.9°C  (low  =  -0.1°C, high = +76.8°C)
                       (crit = +79.8°C)

acpitz-acpi-0
Adapter: ACPI interface
temp1:        +53.0°C  (crit = +103.0°C)

Alright. In the UEFI there is only a fan monitor, nothing to control. And sensors finally reveals that the fan has to be controlled by the OS.

There is, for example, corectrl, which lets you manage it in a more advanced way.

pamac install corectrl

You could also try amdgpu-fan, which should be available in the AUR. There you can define your own fan curve that better fits the needs of your laptop.

pamac build amdgpu-fan

Anyway, I haven’t tested either, since I have Intel CPUs/GPUs.

Yes, I understood you, thank you for at least trying to help! CoreCtrl doesn’t provide me with any controls and settings for frequency scaling, etc. I can only change the governor as in TLP, and that’s it. I’ll try amdgpu-fan.

@megavolt

I tried amdgpu-fan, but unfortunately, the program doesn’t work. Here’s the log it provided me with:

sudo amdgpu-fan                                                                                                                                                               
starting amdgpu-fan
Traceback (most recent call last):
  File "/usr/bin/amdgpu-fan", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/usr/lib/python3.11/site-packages/amdgpu_fan/controller.py", line 95, in main
    FanController(config).main()
  File "/usr/lib/python3.11/site-packages/amdgpu_fan/controller.py", line 47, in main
    logger.debug(f'{name}: Temp {temp}, Setting fan speed to: {speed}, fan speed: {card.fan_speed}, min: {card.fan_min}, max: {card.fan_max}')
                                                                                                          ^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/amdgpu_fan/lib/amdgpu.py", line 64, in fan_min
    return int(self.read_endpoint('pwm1_min'))
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/amdgpu_fan/lib/amdgpu.py", line 36, in read_endpoint
    with open(self._endpoints[endpoint], 'r') as e:
              ~~~~~~~~~~~~~~~^^^^^^^^^^
KeyError: 'pwm1_min'

You have to unlock it at the driver level, see Setup · Wiki · CoreCtrl / CoreCtrl · GitLab, but I guess that is not desirable on a production machine.

Ah well… a quick look turned up this: KeyError: 'pwm1_min' and can't start service · Issue #3 · zzkW35/amdgpu-fan · GitHub. It is meant to be used with dedicated GPUs, not integrated ones. My bad.

So you need CPU fan control… since it is ASUS, take a look at the ArchWiki, there is a section for ASUS: Fan speed control - ArchWiki. Maybe you can get it working with pwmconfig and fancontrol?!

Alright, I started dealing with pwmconfig and fancontrol. I immediately encountered the issue that I don’t have pwm1 :slight_smile:

Here’s the ls output:

device  fan1_input  fan1_label  name  power  pwm1_enable  subsystem  uevent

Here’s the pwd:

/sys/devices/platform/asus-nb-wmi/hwmon/hwmon5

Here’s what happens when I write to pwm1_enable:
echo 2 > pwm1_enable - works
echo 1 > pwm1_enable - says it’s an invalid value
echo 0 > pwm1_enable - turns the fan on at full speed. Honestly, I’ve never heard it running so powerfully. However, I think you’re right; this fan has potential, and perhaps it could cool the processor effectively.

Another idea: take a look at the aur package cpupower-gui


Hello, thank you for the tip, any clue is highly appreciated.
Do you think it will conflict with TLP?

Okay, I set 3900 MHz

BUT :smiley:

I don’t think they conflict, but do your own reading; I cannot guarantee it. I have uninstalled it now in favor of CoreCtrl (both worked; I only needed the governor and not the frequency scaling), and now I see the tlp service. I guess if they conflict, it is automatically disabled/enabled on install.

I tried disabling and completely removing TLP. After that, cpupower doesn’t retain its settings after reboot, which is strange but kind of okay. Then I set a limit, but still saw impressive numbers at 4300 MHz in htop.
But, okay, thanks for your help!
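
One way to cross-check what htop shows is to compare each core’s reported current frequency with its configured limit straight from sysfs. The following is a rough sketch, not a definitive diagnostic: the function name over_limit_cores is made up, and it assumes the standard cpufreq files scaling_cur_freq and scaling_max_freq are present for every core.

```shell
#!/bin/sh
# Root of the per-CPU sysfs tree; overridable for testing.
CPU_ROOT="${CPU_ROOT:-/sys/devices/system/cpu}"

# Print one line per core whose current frequency (kHz) is above
# its configured scaling_max_freq (kHz).
over_limit_cores() {
    for d in "$CPU_ROOT"/cpu[0-9]*/cpufreq; do
        [ -r "$d/scaling_cur_freq" ] || continue
        cur=$(cat "$d/scaling_cur_freq")
        max=$(cat "$d/scaling_max_freq")
        if [ "$cur" -gt "$max" ]; then
            cpu=${d%/cpufreq}
            echo "${cpu##*/} cur=$cur max=$max"
        fi
    done
}
```

Running over_limit_cores a few times while the load is active would show whether the limit is really being exceeded according to the kernel, or whether the 4300 MHz figure only appears in htop.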

Okay… well. Now you have 2 modes: automatic and full speed. You can write a simple script that monitors the temperature and, when it reaches a threshold, switches the fan to full speed until the temperature is below the threshold again. That would take a little work, but it is better than nothing when you do intensive tasks.
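
A minimal sketch of such a script, using two thresholds so the fan does not flip back and forth around a single value. Everything beyond the pwm1_enable file and its values 0 (full speed) and 2 (automatic) from the post above is an assumption: the function names next_mode and fan_loop, the 85 °C and 70 °C thresholds, and the use of thermal_zone0 as the CPU temperature source are placeholders to adapt.

```shell
#!/bin/sh
# Fan file from the post above (the hwmonN index can change between boots).
FAN_ENABLE="${FAN_ENABLE:-/sys/devices/platform/asus-nb-wmi/hwmon/hwmon5/pwm1_enable}"
# Assumption: thermal_zone0 reports the CPU temperature in millidegrees C.
TEMP_FILE="${TEMP_FILE:-/sys/class/thermal/thermal_zone0/temp}"
HIGH_MC="${HIGH_MC:-85000}"   # assumed: go to full speed at or above 85 C
LOW_MC="${LOW_MC:-70000}"     # assumed: back to automatic at or below 70 C

# Print the mode the fan should be in: 0 = full speed, 2 = automatic.
# $1 = temperature in millidegrees C, $2 = current mode.
next_mode() {
    if [ "$1" -ge "$HIGH_MC" ]; then
        echo 0
    elif [ "$1" -le "$LOW_MC" ]; then
        echo 2
    else
        echo "$2"    # between the thresholds: keep the current mode
    fi
}

# Poll forever and write to the fan file only when the mode changes.
# Must run as root, for example from a systemd service.
fan_loop() {
    mode=2
    while :; do
        new=$(next_mode "$(cat "$TEMP_FILE")" "$mode")
        if [ "$new" != "$mode" ]; then
            echo "$new" > "$FAN_ENABLE"
            mode=$new
        fi
        sleep "${INTERVAL:-2}"
    done
}
```

The block only defines the functions; calling fan_loop at the end of the file (or from a root shell after sourcing it) starts the polling. It is worth checking with sensors which zone actually tracks the CPU before relying on it.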

Many years ago, on Xubuntu 14.04, I used this.
It is very old and probably needs heavy adapting to work on Manjaro today, but it could serve as a starting point for a project.

@megavolt
@Teo

Thank you, I’ll try to write a script.
I’m curious, what could be causing this?
Maybe it’s worth opening the laptop and checking the thermal paste? I have some experience with laptop disassembly. But I’m not sure if it’s the right solution.

I can offer a quick rundown of what I’ve done with a zen3 laptop:

  • TLP ( not power-profiles-daemon )
  • zenpower ( zenpower3-dkms )
  • amd-pstate-epp ( using boot option amd_pstate=active )

That’s about it really.
I believe yours is a Zen 2, but I believe all of the above is still compatible.
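
For the boot option, a sketch of the change assuming a stock GRUB setup: the parameter is appended to the existing kernel command line in /etc/default/grub. The word quiet below only stands in for whatever your line already contains; keep your existing options and add amd_pstate=active to them.

```shell
# /etc/default/grub (fragment; "quiet" is a placeholder for your existing options)
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_pstate=active"
```

After editing, the GRUB configuration has to be regenerated (on Manjaro, sudo update-grub should do it) and the machine rebooted. Reading scaling_driver afterwards should then report amd-pstate-epp.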

I will also point out the pstate scenario is a bit confusing … and would take a lot of space here … but this reddit post seems roughly accurate:
https://www.reddit.com/r/linux/comments/15p4bfs/amd_pstate_and_amd_pstate_epp_scaling_driver/

$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver
amd-pstate-epp

I can also report in my case the epp scaling driver automatically switches between balance_performance when plugged and balance_power when unplugged.

$ cat /sys/devices/system/cpu/cpu*/cpufreq/energy_performance_preference
balance_performance
balance_performance
balance_performance
balance_performance
balance_performance
balance_performance
balance_performance
balance_performance
balance_performance
balance_performance
balance_performance
balance_performance

# unplug #

$ cat /sys/devices/system/cpu/cpu*/cpufreq/energy_performance_preference
balance_power
balance_power
balance_power
balance_power
balance_power
balance_power
balance_power
balance_power
balance_power
balance_power
balance_power
balance_power

While I haven’t done exhaustive testing, the above configuration seems to work for me, with some fan spin-up under load, otherwise quiet, and never hitting a major heat ceiling.
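
If the automatic choice runs too hot while plugged in, the preference can probably be pinned in /etc/tlp.conf instead. This is a hedged sketch: the option names are the ones I understand TLP 1.6 to use for EPP, and picking balance_power on AC is only a suggestion for a machine that overheats, not what my own setup uses.

```shell
# /etc/tlp.conf (fragment; values are a suggestion, not a tested recommendation)
CPU_ENERGY_PERF_POLICY_ON_AC=balance_power
CPU_ENERGY_PERF_POLICY_ON_BAT=power
```

The values must be among those listed in energy_performance_available_preferences. After saving, sudo tlp start applies the settings, and the energy_performance_preference files can be read again to confirm.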


Thank you for the advice, I’ll give this option a try. A couple of questions:
Did your laptop also overheat?
What does “not power-profiles-daemon” mean in this context with TLP (not power-profiles-daemon)?

No overheating that I have observed.
Though it was louder while also being less energy efficient before I started tweaking.
tlp conflicts with power-profiles-daemon, so one cannot run both. power-profiles-daemon is a newer package that is supposed to integrate with desktop power management (being able to choose ‘performance’ in the KDE power widget, etc.).
Well … you can have both installed, but then you would need to mask power-profiles-daemon’s service.

More tlp info:
https://wiki.archlinux.org/title/TLP

Ah, I see, I’m using GNOME, and it seems there’s no power management, but just in case, I’ll check out that daemon.

@cscs

In general, yes, it seems to really help. At least my laptop shut down a bit later. Here is what I did:

I switched to amd-pstate-epp,
checked that I don’t have a power management daemon.

The only thing is, I couldn’t compile zenpower, here’s the log:

(1/2) Arming ConditionNeedsUpdate...
(2/2) Install DKMS modules
==> ERROR: Missing var kernel headers for module zenpower3/0.2.0.
==> ERROR: Missing mnt kernel headers for module zenpower3/0.2.0.
==> ERROR: Missing root kernel headers for module zenpower3/0.2.0.
==> ERROR: Missing home kernel headers for module zenpower3/0.2.0.
==> ERROR: Missing lost+found kernel headers for module zenpower3/0.2.0.
==> ERROR: Missing usr kernel headers for module zenpower3/0.2.0.
==> ERROR: Missing opt kernel headers for module zenpower3/0.2.0.
==> ERROR: Missing lib64 kernel headers for module zenpower3/0.2.0.
==> ERROR: Missing proc kernel headers for module zenpower3/0.2.0.
==> ERROR: Missing sbin kernel headers for module zenpower3/0.2.0.
==> ERROR: Missing dev kernel headers for module zenpower3/0.2.0.
==> ERROR: Missing rootfs-pkgs.txt kernel headers for module zenpower3/0.2.0.
==> ERROR: Missing srv kernel headers for module zenpower3/0.2.0.
==> ERROR: Missing etc kernel headers for module zenpower3/0.2.0.
==> ERROR: Missing sys kernel headers for module zenpower3/0.2.0.
==> ERROR: Missing lib kernel headers for module zenpower3/0.2.0.
==> ERROR: Missing run kernel headers for module zenpower3/0.2.0.
==> ERROR: Missing bin kernel headers for module zenpower3/0.2.0.
==> ERROR: Missing tmp kernel headers for module zenpower3/0.2.0.
==> ERROR: Missing boot kernel headers for module zenpower3/0.2.0.
==> ERROR: Missing desktopfs-pkgs.txt kernel headers for module zenpower3/0.2.0.
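
The log complains about missing kernel headers, so one thing worth checking is whether the headers package for the running kernel is installed. Assuming Manjaro’s usual linuxXY-headers naming, a small helper (the name headers_pkg is made up) can derive the package name from the kernel release. The odd directory names in the messages may point to a second problem that this does not address.

```shell
#!/bin/sh
# Derive the Manjaro headers package name from a kernel release string,
# e.g. "6.5.3-1-MANJARO" -> "linux65-headers".
# $1 = release string (defaults to the running kernel).
headers_pkg() {
    rel=${1:-$(uname -r)}
    major=${rel%%.*}     # text before the first dot
    rest=${rel#*.}       # text after the first dot
    minor=${rest%%.*}    # text up to the next dot
    echo "linux${major}${minor}-headers"
}
```

On the system from this thread, headers_pkg prints linux65-headers. Installing that package with pacman and then running sudo dkms autoinstall should trigger the module build again.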

Here’s the tlp-stat -p:

+++ Processor
CPU model      = AMD Ryzen 7 4800HS with Radeon Graphics

/sys/devices/system/cpu/cpu0/cpufreq/scaling_driver    = amd-pstate-epp
/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor  = powersave
/sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors = performance powersave
/sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq  =   400000 [kHz]
/sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq  =  4300000 [kHz]
/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq  =   400000 [kHz]
/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq  =  4300000 [kHz]
/sys/devices/system/cpu/cpu0/cpufreq/energy_performance_preference = balance_performance [EPP]
/sys/devices/system/cpu/cpu0/cpufreq/energy_performance_available_preferences = default performance balance_performance balance_power power 

/sys/devices/system/cpu/cpu1..cpu15: omitted for clarity, use -v to show all

/sys/devices/system/cpu/amd_pstate/status              = active
/sys/devices/system/cpu/amd_pstate/cppc_dynamic_boost  = (not available)
/sys/module/workqueue/parameters/power_efficient       = Y
/proc/sys/kernel/nmi_watchdog                          = 0

+++ Platform Profile
/sys/firmware/acpi/platform_profile                    = (not available)
/sys/firmware/acpi/platform_profile_choices            = (not available)