AMD P-State EPP Scaling Driver on AMD Ryzen 7 5800H laptop

OK, I think I have some important findings, crucial ones for me, and I'm not sure where to post them.
Also, if anyone has an idea where else I can report this, please let me know.

This is probably related to abrupt random shutdowns (power-offs) that happen without any load and without meaningful logs anywhere.

Kernel: 6.5.3-1-MANJARO
Hardware:
Model: HP Victus Laptop 16-e0xxx
CPU: AMD Ryzen 7 5800H
GPU: Radeon Vega Mobile and NVIDIA GeForce RTX 3060 6GB

The findings:

  1. The kernel automatically loads the AMD P-State EPP scaling driver (amd_pstate_epp).
  2. By default, the driver uses the “powersave” cpufreq governor. This is the right thing to do: it allows performance hints from the OS, whereas the “performance” governor is basically useless in most cases.
  3. By default, the driver uses the “performance” hint (energy_performance_preference). This leads to an extremely high voltage on the CPU (integrated GPU?) (vddgfx) at all times, as well as high power consumption: 1.46 V and 16 W in a totally idle state.
  4. Setting energy_performance_preference to “balance_performance” immediately drops the voltage to 1 V and the power draw to 5 W.
  5. Setting energy_performance_preference to “power” drops the voltage to about 0.8 V and the power draw to 3-4 W.

My assumptions are these. First, it is extremely unhealthy to keep the chip (presumably the integrated GPU?) at a constant high voltage. Second, when load appears, this causes an immediate high power draw and quick heating of the chip before the cooling system is even active, and probably a thermal shutdown of the system.

sensors output for default “performance” mode:

amdgpu-pci-0600
Adapter: PCI adapter
vddgfx:        1.46 V  
vddnb:       949.00 mV 
edge:         +43.0°C  
PPT:          15.00 W

sensors output for “balance_performance” mode:

amdgpu-pci-0600
Adapter: PCI adapter
vddgfx:      999.00 mV 
vddnb:       949.00 mV 
edge:         +43.0°C  
PPT:           4.00 W
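
To compare the two modes without eyeballing the whole dump, here is a small sketch that pulls only vddgfx and PPT out of the amdgpu block of `sensors` output. The function name `amdgpu_idle_summary` is my own, and it assumes the output layout shown above (label, value, unit); other lm-sensors versions or chips may format it differently.

```shell
# Read `sensors` output on stdin and print vddgfx (normalised to mV) and PPT
# from the amdgpu adapter block. Hypothetical helper, not part of lm-sensors.
amdgpu_idle_summary() {
    awk '
        /^amdgpu-pci-/ { in_gpu = 1; next }   # start of the amdgpu block
        /^$/           { in_gpu = 0 }         # a blank line ends a block
        in_gpu && $1 == "vddgfx:" { v = $2; if ($3 == "V") v *= 1000 }
        in_gpu && $1 == "PPT:"    { w = $2 }
        END { if (v != "") printf "vddgfx=%.0f mV, PPT=%.0f W\n", v, w }
    '
}
```

Usage: `sensors | amdgpu_idle_summary`, run once per mode. If there is no amdgpu block in the input, it prints nothing.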

Important!
At the time of writing, power-profiles-daemon cannot use the AMD P-State EPP (amd_pstate_epp) and platform_profile drivers at the same time. Because platform_profile is available on this system, it ignores amd_pstate_epp completely, leaving it at its default values (“powersave” governor and “performance” energy preference).

To see which mode power-profiles-daemon is currently in, use the powerprofilesctl command.
Related power-profiles-daemon issue.

There is also auto-epp, a Python script/systemd service that automatically manages the energy performance preference depending on the laptop's power source (AC or battery).
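
The idea of choosing a preference from the power source can be sketched in a few lines of shell. This is not how auto-epp itself is implemented, just a minimal illustration. The function names `on_ac_power` and `pick_epp` are made up, and I assume the charger shows up under /sys/class/power_supply with type “Mains” (some USB-C chargers may report a different type).

```shell
# Return success if any "Mains" power supply reports online=1.
# The optional argument overrides the sysfs directory (handy for dry runs).
on_ac_power() {
    root="${1:-/sys/class/power_supply}"
    for s in "$root"/*; do
        [ -r "$s/type" ] && [ -r "$s/online" ] || continue
        if [ "$(cat "$s/type")" = "Mains" ] && [ "$(cat "$s/online")" = "1" ]; then
            return 0
        fi
    done
    return 1
}

# Print the preference to apply: an assumed policy, adjust to taste.
pick_epp() {
    if on_ac_power "$1"; then
        echo balance_performance
    else
        echo power
    fi
}
```

Running `pick_epp` prints the name of the preference, which can then be written to energy_performance_preference as root.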

You can set energy_performance_preference by doing:

# echo "balance_performance" | tee /sys/devices/system/cpu/cpufreq/policy*/energy_performance_preference

You can query available profiles by doing:

cat /sys/devices/system/cpu/cpufreq/policy0/energy_performance_available_preferences
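
Putting the two commands together, a small sketch that only writes a preference if the policy actually lists it as available. `set_epp` is a name I made up. The second argument exists only so the function can be pointed at a copy of the directory tree, and writing to the real sysfs files needs root.

```shell
# Write preference $1 to every cpufreq policy, after checking that the policy
# lists it in energy_performance_available_preferences.
set_epp() {
    pref="$1"
    root="${2:-/sys/devices/system/cpu/cpufreq}"
    for p in "$root"/policy*; do
        avail="$p/energy_performance_available_preferences"
        [ -r "$avail" ] || continue
        # Whole-word match against the space-separated list of preferences.
        case " $(cat "$avail") " in
            *" $pref "*)
                printf '%s\n' "$pref" > "$p/energy_performance_preference"
                ;;
            *)
                echo "$pref is not offered by ${p##*/}" >&2
                return 1
                ;;
        esac
    done
}
```

Usage (as root): `set_epp balance_performance`. A misspelled preference makes the function return non-zero instead of relying on the kernel to reject the write.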

To see current cpufreq state and temps:

# sensors
# cpupower frequency-info
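
For a quick per-policy overview of the governor and the energy preference, something like the following also works. It is only a sketch: `show_epp` is my own name, and it reads nothing but the sysfs files already mentioned above plus scaling_governor.

```shell
# Print "policyN <governor> <energy preference>" for every cpufreq policy.
# The optional argument overrides the cpufreq directory.
show_epp() {
    root="${1:-/sys/devices/system/cpu/cpufreq}"
    for p in "$root"/policy*; do
        # Skip policies whose driver does not expose an EPP file.
        [ -r "$p/energy_performance_preference" ] || continue
        printf '%s %s %s\n' "${p##*/}" \
            "$(cat "$p/scaling_governor")" \
            "$(cat "$p/energy_performance_preference")"
    done
}
```

Calling `show_epp` prints one line per policy, which makes it easy to spot whether every core picked up a new preference.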

Addition:
OK. I installed zenpower3, and this is what I found.
With the default “performance” preference I get a constant 1.46 V on the CPU and the integrated GPU.
This is BAD.

$ sensors
hp-isa-0000
Adapter: ISA adapter
fan1:           0 RPM
fan2:           0 RPM

nvme-pci-0500
Adapter: PCI adapter
Composite:    +37.9°C  (low  = -273.1°C, high = +80.8°C)
                       (crit = +81.8°C)
Sensor 1:     +37.9°C  (low  = -273.1°C, high = +65261.8°C)
Sensor 2:     +34.9°C  (low  = -273.1°C, high = +65261.8°C)

BAT0-acpi-0
Adapter: ACPI interface
in0:          16.81 V  

zenpower-pci-00c3
Adapter: PCI adapter
SVI2_Core:     1.46 V  
SVI2_SoC:    950.00 mV 
Tdie:         +41.6°C  (high = +95.0°C)
Tctl:         +41.6°C  
SVI2_P_Core:   9.64 W  
SVI2_P_SoC:    3.63 W  
SVI2_C_Core:   7.25 A  
SVI2_C_SoC:    3.83 A  

amdgpu-pci-0600
Adapter: PCI adapter
vddgfx:        1.46 V  
vddnb:       949.00 mV 
edge:         +41.0°C  
PPT:           8.00 W  

acpitz-acpi-0
Adapter: ACPI interface
temp1:        +41.0°C  (crit = +255.0°C)

With “balance_performance” I get dynamic voltage scaling and around 1 V on the CPU and GPU in the idle state.

$ sensors
hp-isa-0000
Adapter: ISA adapter
fan1:           0 RPM
fan2:           0 RPM

nvme-pci-0500
Adapter: PCI adapter
Composite:    +36.9°C  (low  = -273.1°C, high = +80.8°C)
                       (crit = +81.8°C)
Sensor 1:     +36.9°C  (low  = -273.1°C, high = +65261.8°C)
Sensor 2:     +34.9°C  (low  = -273.1°C, high = +65261.8°C)

BAT0-acpi-0
Adapter: ACPI interface
in0:          16.81 V  

zenpower-pci-00c3
Adapter: PCI adapter
SVI2_Core:   1000.00 mV 
SVI2_SoC:    950.00 mV 
Tdie:         +43.1°C  (high = +95.0°C)
Tctl:         +43.1°C  
SVI2_P_Core:   3.27 W  
SVI2_P_SoC:    1.96 W  
SVI2_C_Core:   3.29 A  
SVI2_C_SoC:    2.06 A  

amdgpu-pci-0600
Adapter: PCI adapter
vddgfx:      999.00 mV 
vddnb:       949.00 mV 
edge:         +40.0°C  
PPT:           4.00 W  

acpitz-acpi-0
Adapter: ACPI interface
temp1:        +43.0°C  (crit = +255.0°C)

Useful links:

  1. Benchmarks for Ryzen mobile system using AMD P-State EPP
  2. Kernel.org documentation on new AMD P-State driver
  3. power-profiles-daemon Consider using multiple “drivers” for CPU P-State support
  4. CPU frequency scaling ArchWiki
  5. auto-epp python script
  6. AMD P-State and AMD P-State EPP Scaling Driver Configuration Guide
  7. Zenpower3

You might be interested in this thread:


Thank you for the info.
However, I do not want to use TLP for now, because it is not the default and I am using KDE, which has settings for power-profiles-daemon.
I probably need to look at zenpower3… I don't really want to install DKMS, though…

So you can test these kernel parameters:

  • amd-pstate=passive
  • amd-pstate=active
  • amd-pstate=guided

Check with:
cpupower frequency-info and
sudo turbostat
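
After rebooting with one of those parameters, the mode that is actually active can also be read back from sysfs. A minimal sketch: the function name `pstate_mode` is made up, and it assumes a recent kernel that exposes /sys/devices/system/cpu/amd_pstate/status (older kernels, or systems where the driver is not loaded, will not have the file).

```shell
# Print the current amd_pstate operation mode, or "unavailable" if the status
# file does not exist. The optional argument overrides the path.
pstate_mode() {
    f="${1:-/sys/devices/system/cpu/amd_pstate/status}"
    if [ -r "$f" ]; then
        cat "$f"
    else
        echo "unavailable"
    fi
}
```

`pstate_mode` should print the mode name, for example active, passive or guided.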

In my case (desktop + 5600X) I use
“iommu=pt amd-pstate=passive nowatchdog processor.max_cstate=5 systemd.unified_cgroup_hierarchy=true scsi_mod.use_blk_mq=1”

Which video drivers are you using?


Also, no random shutdowns so far!