Amdgpu glitch on hybrid laptop

No, when I disable the integrated GPU (so only the NVIDIA GPU is working), I have absolutely no graphics glitches or freezes.

Yeah, I read some threads about TPM but I can't find any related option in the BIOS.

Edit: I'm trying something else. I changed the memory allocated to the integrated GPU in the BIOS from 512 MB to 2 GB.

Right now it seems more stable. Could it simply be an out-of-memory issue?

Maybe… but it also reminds me of some buggy caches…

for i in Shader GPU; do find ~/.config -name "*${i}Cache*"; done

You may try removing any of the directories that command prints.
(particularly those that pertain to your browser of course)
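
The same search can be sketched in Python, which avoids shell globbing pitfalls. This is only an illustration: the function name `find_cache_dirs` and the marker list are made up for the example, and it only lists directories, it deletes nothing.

```python
from pathlib import Path

# Name fragments taken from the find command above.
CACHE_MARKERS = ("ShaderCache", "GPUCache")


def find_cache_dirs(root, markers=CACHE_MARKERS):
    """Return the sorted directories under root whose name contains a marker."""
    hits = []
    for path in Path(root).rglob("*"):
        if path.is_dir() and any(marker in path.name for marker in markers):
            hits.append(path)
    return sorted(hits)

# Example call: find_cache_dirs(Path.home() / ".config")
```

Review the returned paths before removing anything; browsers recreate these caches on the next start.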

nothing found for ShaderCache or GPUCache

It could be because of that… if you no longer experience any issues, it was because of it…

@eephyne: I have the same 16ARX8 with NVIDIA 4070. Please also update your BIOS, mine is LPCN44WW while yours is LPCN25WW.
I am about to try Manjaro on that box, because under Ubuntu/Linux Mint 21.2 with kernel 6.2 I cannot suspend (it immediately resumes) and I also have some issues with the mt7921e Wi-Fi, which sometimes causes a kernel panic.

I feel less alone :slight_smile:
I have issues with sleep and mt7921e too, but not exactly the same as you do.
Did you see a difference after upgrading your BIOS?

You can look at "Bluetooth is not AutoEnabled - #10 by koshikas" for my issue with the mt7921e card if you want.
As for sleep, mine doesn't wake up instantly like yours, but sometimes I can't wake it up at all, and it takes very long to wake from sleep (deep or s2idle): about 10 seconds.

Edit: I just updated my BIOS (using Windows). I didn't expect it to erase the BIOS settings and reset the EFI partition.

glitches still happen.

I've checked the VRAM state and it's not full (around 60-70%), so it's probably not related to an out-of-memory issue.
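
The post doesn't say how the VRAM state was checked. One possible way, sketched below, reads the `mem_info_vram_used` and `mem_info_vram_total` files that the amdgpu driver exposes in sysfs. The card name `card0` is an assumption; on a hybrid laptop the integrated GPU may be `card1`.

```python
from pathlib import Path


def vram_usage_percent(used_bytes, total_bytes):
    """Return VRAM usage as a percentage of the total."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * used_bytes / total_bytes


def read_amdgpu_vram(card="card0"):
    """Read (used, total) VRAM in bytes from amdgpu sysfs.

    The card name is an assumption; check /sys/class/drm for the right one.
    """
    dev = Path("/sys/class/drm") / card / "device"
    used = int((dev / "mem_info_vram_used").read_text())
    total = int((dev / "mem_info_vram_total").read_text())
    return used, total

# Example call: vram_usage_percent(*read_amdgpu_vram("card0"))
```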

Create a new test user, reboot, log in with it and see if it happens there too…

Still the same using another user

Unfortunately, updating to v44 didn't help (I'm not sure whether more menu items were visible in the former version, as I can't remember the differences off the top of my head; the settings are reset anyway, which makes it harder without screenshots).

Yeah, mt7921e did indeed cause trouble; that's why I updated to kernel 6.4.3 with the latest linux-firmware, to get any new firmware for the MediaTek card. At least losing Wi-Fi happens far less frequently with this setup. Sometimes I just do

modprobe -rv mt7921e && modprobe -av mt7921e

to reset the glitches.

I think my suspend/resume problem is due to not being able to switch between the power-saving modes ("Modern Sleep" and S3) in the BIOS, even though dmesg says

ACPI: PM: (supports S0 S3 S4 S5)

while a deeper check of the s2idle capabilities, using amd_s2idle.py from freedesktop.org (which I came across by chance), reports:

root@LEGION5PRO:~# ./amd_s2idle.py
Location of log file (default s2idle_report-2023-07-22.txt)?
Debugging script for s2idle on AMD systems
:computer: LENOVO 82WM (Legion Pro 5 16ARX8) running BIOS 1.44 (LPCN44WW) released 06/28/2023 and EC 1.44
:penguin: Linux Mint 21.2
:penguin: Kernel 6.4.3-060403-generic
:battery: Battery BAT0 (BYD L22B4PC0) is operating at 100.00% of design
Checking prerequisites for s2idle
:white_check_mark: Logs are provided via systemd
:white_check_mark: AMD Ryzen 9 7945HX with Radeon Graphics (family 19 model 61)
:white_check_mark: LPS0 _DSM enabled
:x: ACPI FADT doesn't support Low-power S0 idle
:white_check_mark: HSMP driver amd_hsmp not detected (blocked: False)
:x: PMC driver amd_pmc not loaded
:white_check_mark: GPU driver amdgpu available
:x: System isn't configured for s2idle in firmware setup
:x: NVME SK hynix is not configured for s2idle in BIOS
:x: NVME SK hynix is not configured for s2idle in BIOS
:white_check_mark: GPIO driver pinctrl_amd available
Your system does not meet s2idle prerequisites!
S0i3 failures reported on your system
:vertical_traffic_light: The kernel didn't emit a message that low power idle was supported
Low power idle is a bit documented in the FADT to indicate that
low power idle is supported.
Only newer kernels support emitting this message, so if you run on
an older kernel you may get a false negative.
When launched as root this script will try to directly introspect the
ACPI tables to confirm this.
:vertical_traffic_light: AMD-PMC driver is missing
The amd-pmc driver is required for the kernel to instruct the
soc to enter the hardware sleep state.
Be sure that you have enabled CONFIG_AMD_PMC in your kernel.

If CONFIG_AMD_PMC is enabled but the amd-pmc driver isn't loading
then you may have found a bug and should report it.
:vertical_traffic_light: The system hasn't been configured for Modern Standby in BIOS setup
AMD systems must be configured for Modern Standby in BIOS setup
for s2idle to function properly in Linux.
On some OEM systems this is referred to as 'Windows' sleep mode.
If the BIOS is configured for S3 and you manually select s2idle
in /sys/power/mem_sleep, the system will not enter the deepest hardware state.
:vertical_traffic_light: SK hynix missing ACPI attributes
An NVME device was found, but it doesn't specify the StorageD3Enable
attribute in the device specific data (_DSD).
This is a BIOS bug, but it may be possible to work around in the kernel.

If you added an aftermarket SSD to your system, the system vendor might not have added this
property to the BIOS for the second port which could cause this behavior.

Please re-run this script with the --acpidump argument and file a bug to investigate.

:vertical_traffic_light: SK hynix missing ACPI attributes
An NVME device was found, but it doesn't specify the StorageD3Enable
attribute in the device specific data (_DSD).
This is a BIOS bug, but it may be possible to work around in the kernel.

If you added an aftermarket SSD to your system, the system vendor might not have added this
property to the BIOS for the second port which could cause this behavior.

Please re-run this script with the --acpidump argument and file a bug to investigate.

Do you remember if there was a selectable BIOS setting for switching Power Mode to something other than "Modern Sleep"?

No, I searched around for advanced BIOS settings and saw many articles/posts describing a key sequence to enable this mode in the BIOS, but either it's not available for this Legion or I can't do it properly.
If you want to try it out try this:

Go to More settings then hold down Fn and press each key horizontally from q to p, a to l, then z to m, let go of Fn and press F10. Click save changes and reboot into BIOS. Advanced settings will now be available

Since I have an AZERTY keyboard, I didn't press those specific letters but the keys in the same positions instead (that seemed more logical to me). If it works for you, it means that it's really the letters we have to type rather than the key positions.

–
For me, sleep works (more or less) with both deep and s2idle, and it's set to deep by default. When I cat /sys/power/mem_sleep I get "s2idle [deep]", with the selected type in the [ ].
I also tried to use the sleepgraph Python tool to see why it's so slow to wake up in either mode, but I can't use it: it fails when analyzing the ftrace file :frowning:.
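
The bracket convention in /sys/power/mem_sleep described above can be read programmatically. A minimal sketch follows; the function name `parse_mem_sleep` is made up for this example.

```python
def parse_mem_sleep(text):
    """Split the content of /sys/power/mem_sleep into (modes, selected).

    The kernel lists all supported modes and wraps the active one in brackets,
    e.g. "s2idle [deep]".
    """
    modes = []
    selected = None
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
            selected = token
        modes.append(token)
    return modes, selected

# Example call:
# with open("/sys/power/mem_sleep") as f:
#     print(parse_mem_sleep(f.read()))
```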

–
I never had Wi-Fi loss, just issues getting it started at some boots. It's very weird to see such different issues on what seems to be the same or very similar hardware.

– For now I've switched to discrete mode so I don't use amdgpu. I need to get some work done, and that's just not possible with amdgpu, but it's killing me not to be able to solve this and to have a power-saving feature that I can't use.

I never had so much trouble with a computer on Linux. Probably the hardware is too new.
As for the Wi-Fi card, if it becomes too much of an issue I think I'll replace it with another one (for 20€ that's not so much).

I'll try your amd_s2idle Python script to see what it says for me. Could you tell me where you found it?

https://gitlab.freedesktop.org/drm/amd/-/blob/master/scripts/amd_s2idle.py

BTW: Did you increase the 512m value for amd_gpu in BIOS? I guess it can be set up to 2G of memory reservation for video mem.

It is interesting to see that basically the same system behaves so differently. I'll try the key sequences. UPDATE: It doesn't work. I also tried a different sequence, as in
F1 1 Q A Y
F2 2 W S X
F3 3 E D C
F4 4 R F V
but no luck.

I also tried to poke with efivar:

for i in $(efivar -l); do echo -n "$i:" && efivar --name "$i" --print; echo =============; done

as it shows a variable named H2OFormDialogConfig, which points to the Insyde H2O part of the advanced BIOS settings, and I enabled it (be careful when doing so: read up on the efivar command, as it could brick your system if you write values randomly. You'll need another file to read from in order to write via -w. Beware of the 4-byte header defining the variable's attributes). But it still refused to show the advanced BIOS.

root@LEGION5PRO:~# efivar -p -n 98ae8272-ce5a-46be-9f5d-d9f9cbbb99f2-H2OFormDialogConfig
GUID: 98ae8272-ce5a-46be-9f5d-d9f9cbbb99f2
Name: "H2OFormDialogConfig"
Attributes:
Non-Volatile
Boot Service Access
Runtime Service Access
Value:
00000000 01
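
To illustrate the 4-byte header mentioned above: assuming it refers to the layout used by efivarfs files under /sys/firmware/efi/efivars, each file starts with a little-endian 32-bit attribute mask followed by the variable's data. The read-only sketch below decodes that header; the function name `split_efivar` is made up, and only the three attribute bits shown in the output above are mapped.

```python
import struct

# UEFI variable attribute bits (only the three seen in the output above).
EFI_ATTRIBUTES = {
    0x00000001: "Non-Volatile",
    0x00000002: "Boot Service Access",
    0x00000004: "Runtime Service Access",
}


def split_efivar(raw):
    """Split raw efivarfs file content into (attribute names, payload bytes)."""
    if len(raw) < 4:
        raise ValueError("expected a 4-byte attribute header")
    (attrs,) = struct.unpack("<I", raw[:4])
    names = [name for bit, name in EFI_ATTRIBUTES.items() if attrs & bit]
    return names, raw[4:]

# Example call (read-only; the path is an assumption built from the name and
# GUID above, in efivarfs's Name-GUID form):
# path = ("/sys/firmware/efi/efivars/"
#         "H2OFormDialogConfig-98ae8272-ce5a-46be-9f5d-d9f9cbbb99f2")
# print(split_efivar(open(path, "rb").read()))
```

This only reads the variable. Anything written back must keep the same header, which is why careless writes are dangerous.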

Regarding replacing the WIFI card - Lenovo might have a static list of supported cards and could refuse to initialize BIOS-wise if the new card is not on the list. At least this is/was applicable in the past on certain Thinkpads and other models.

python amd_s2idle.py
Location of log file (default s2idle_report-2023-07-23.txt)? 
Debugging script for s2idle on AMD systems
💻 LENOVO 82WM (Legion Pro 5 16ARX8) running BIOS 1.44 (LPCN44WW) released 06/28/2023 and EC 1.44
🐧 Manjaro Linux
🐧 Kernel 6.4.3-1-MANJARO
🔋 Battery BAT0 (BYD L22B4PC0) is operating at 104.00% of design
Checking prerequisites for s2idle
✅ Logs are provided via systemd
✅ AMD Ryzen 7 7745HX with Radeon Graphics (family 19 model 61)
✅ LPS0 _DSM enabled
✅ HSMP driver `amd_hsmp` not detected (blocked: False)
❌ PMC driver `amd_pmc` not loaded
❌ GPU driver `amdgpu` not loaded
❌ System isn't configured for s2idle in firmware setup
❌ NVME Samsung Electronics Co Ltd NVMe SSD Controller PM9A1/PM9A3/980PRO is not configured for s2idle in BIOS
✅ GPIO driver `pinctrl_amd` available
GPIO dynamic debugging information unavailable
MSR checks unavailable
👀 Suspend must be initiated by root user
Your system does not meet s2idle prerequisites!
S0i3 failures reported on your system
🚦 AMD-PMC driver is missing
	The amd-pmc driver is required for the kernel to instruct the
	soc to enter the hardware sleep state.
	Be sure that you have enabled CONFIG_AMD_PMC in your kernel.

	If CONFIG_AMD_PMC is enabled but the amd-pmc driver isn't loading
	then you may have found a bug and should report it.
🚦 AMDGPU driver is missing
	The amdgpu driver is used for hardware acceleration as well
	as coordination of the power states for certain IP blocks on the SOC.
	Be sure that you have enabled CONFIG_AMDGPU in your kernel.

🚦 The system hasn't been configured for Modern Standby in BIOS setup
	AMD systems must be configured for Modern Standby in BIOS setup
	for s2idle to function properly in Linux.
	On some OEM systems this is referred to as 'Windows' sleep mode.
	If the BIOS is configured for S3 and you manually select s2idle
	in /sys/power/mem_sleep, the system will not enter the deepest hardware state.
🚦 Samsung Electronics Co Ltd NVMe SSD Controller PM9A1/PM9A3/980PRO missing ACPI attributes
	An NVME device was found, but it doesn't specify the StorageD3Enable
	attribute in the device specific data (_DSD).
	This is a BIOS bug, but it may be possible to work around in the kernel.

For more information on this failure see:
	https://bugzilla.kernel.org/show_bug.cgi?id=216440

That's weird, because the amd_pmc module is loaded.

I did increase the RAM for the AMD GPU to 2 GB, and at first it seemed to solve the graphics glitches, but they just seem to happen less often.

I also wonder about amd_pmc being reported missing: I can load it, but the message still appears on a rerun, even though it doesn't get used by anything else, as in

root@LEGION5PRO:/etc/libvirt/hooks/qemu.d# modprobe -av amd_pmc
insmod /lib/modules/6.4.3-060403-generic/kernel/drivers/platform/x86/amd/amd-pmc.ko enable_stb=1
root@LEGION5PRO:/etc/libvirt/hooks/qemu.d# lsmod|grep amd_pmc
amd_pmc 32768 0

root@LEGION5PRO:~# cat /etc/modprobe.d/amd_pmc.conf
#parm: enable_stb:Enable the STB debug mechanism (bool)
#parm: disable_workarounds:Disable workarounds for platform bugs (bool)
options amd_pmc enable_stb=1

but setting enable_stb makes no difference.

BTW: did you try the latest AMD-related GPU drivers from this PPA (sorry, not sure how this can be enabled or referred to in Manjaro):

root@LEGION5PRO:~# cat /etc/apt/sources.list.d/oibaf-graphics-drivers-jammy.list
deb [signed-by=/etc/apt/keyrings/oibaf-graphics-drivers-jammy.gpg] Index of /oibaf/graphics-drivers/ubuntu jammy main

deb-src [signed-by=/etc/apt/keyrings/oibaf-graphics-drivers-jammy.gpg] Index of /oibaf/graphics-drivers/ubuntu jammy main

I use them mainly to check if they could improve the situation. Maybe it could help yours?

My VFIO setup works pretty decently, by the way. My 7945HX has 16 cores/32 threads, and I assigned 16 (8/8) to a Win11 Pro VM, which also has the 4070 assigned. As nvme0 is in the same PCI group 0 as the NVIDIA GPU, I lose access to it under Linux while the VM is running, but I make use of that by dedicating this 512 GB NVMe to the VM, which even allows me to boot it natively to get the last bit of performance when needed (it took 2-3 boots the first time, but now it is stable). Interestingly, regardless of how I boot it (VM or natively), it stays activated, so I can use it in parallel most of the time (I spend >90% of my time under Linux as I hate Windows, but sometimes, you know…)

UPDATE: It finally works using pm-suspend with kernel setting s2idle.prefer_microsoft_guid=1 as of:

root@LEGION5PRO:~# grep s2 /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="amd_iommu=pgtbl_v1 iommu=pt vfio_iommu_type1.allow_unsafe_interrupts=1 kvm.ignore_msrs=1 vfio-pci.ids=1022:14da,1022:14db,1022:14db,10de:2860,10de:22bd default_hugepagesz=1G hugepagesz=1G hugepages=8 pcie_aspm=force acpi=copy_dsdt s2idle.prefer_microsoft_guid=1"

I found this hint at

216101 – lost acpi events after resume from suspend - AMD Ryzen 6800H

So happy!!! :upside_down_face:

2nd Update: I just realized that I have to unplug the HDMI cable to get it working - I might have been fooled… Anyway, lessons learned…

I tried the kernel flag but it didn't change anything; I can't use s2idle.
I booted into Windows and it doesn't want to use s2idle either.
For me it's not a killer feature, but I don't like having an issue with my hardware.

By the way, using nvtop I also realized that the NVIDIA GPU maxes out at 55 W (Lenovo sells it as 140 W max). Even if I don't want full power, I'd like to have at least 100 W. Do you know how to change that? (I tested the three modes using Fn+Q and that doesn't change it.)

nvidia-smi tells me that it's not supported.

sudo nvidia-smi -pl 100
Changing power management limit is not supported for GPU: 00000000:01:00.0.
Treating as warning and moving on.
All done.
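
Even when the limit cannot be changed, the current draw and limit can still be read. The sketch below uses nvidia-smi's `--query-gpu` CSV output and maps fields the driver reports as unavailable (for example `[N/A]`) to `None`. The function names and the sample values in the comments are made up for illustration.

```python
import subprocess

QUERY = ["nvidia-smi", "--query-gpu=power.draw,power.limit",
         "--format=csv,noheader,nounits"]


def parse_power_line(line):
    """Parse one CSV line such as '54.87, 80.00' into a tuple of watts.

    Non-numeric fields (e.g. '[N/A]') become None.
    """
    values = []
    for field in line.strip().split(","):
        try:
            values.append(float(field.strip()))
        except ValueError:
            values.append(None)
    return tuple(values)


def query_power():
    """Run nvidia-smi and return one (draw, limit) tuple per GPU."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return [parse_power_line(line) for line in out.splitlines() if line.strip()]
```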

Can you tell me the amdgpu version from your Debian repo?

Edit: I succeeded in increasing the TGP using the nvidia-powerd service, but it doesn't go higher than 80 W, and I can't find information on how to configure this daemon.

Edit 2: this seems to be a known issue with NVIDIA: "Has anyone been able to run an RTX 3060 laptop GPU at more than 80W on Linux? - #107 by snakesgwa - Linux - NVIDIA Developer Forums"

As I currently have NVIDIA completely disabled, I can't advise on the power limit, but I vaguely remember that it was not capped the last time. I read in a review that due to cooling restrictions it was not possible to reach 140 W, but around 100 W was seen.

My current AMD related drivers are

root@LEGION5PRO:~# dpkg -l|grep oibaf
ii libdrm-amdgpu1:amd64 2.4.115+git2307210500.cc8c22~oibaf~j amd64 Userspace interface to amdgpu-specific kernel DRM services -- runtime
ii libdrm-amdgpu1:i386 2.4.115+git2307210500.cc8c22~oibaf~j i386 Userspace interface to amdgpu-specific kernel DRM services -- runtime
ii libdrm-common 2.4.115+git2307210500.cc8c22~oibaf~j all Userspace interface to kernel DRM services -- common files
ii libdrm-intel1:amd64 2.4.115+git2307210500.cc8c22~oibaf~j amd64 Userspace interface to intel-specific kernel DRM services -- runtime
ii libdrm-intel1:i386 2.4.115+git2307210500.cc8c22~oibaf~j i386 Userspace interface to intel-specific kernel DRM services -- runtime
ii libdrm-nouveau2:amd64 2.4.115+git2307210500.cc8c22~oibaf~j amd64 Userspace interface to nouveau-specific kernel DRM services -- runtime
ii libdrm-nouveau2:i386 2.4.115+git2307210500.cc8c22~oibaf~j i386 Userspace interface to nouveau-specific kernel DRM services -- runtime
ii libdrm-radeon1:amd64 2.4.115+git2307210500.cc8c22~oibaf~j amd64 Userspace interface to radeon-specific kernel DRM services -- runtime
ii libdrm-radeon1:i386 2.4.115+git2307210500.cc8c22~oibaf~j i386 Userspace interface to radeon-specific kernel DRM services -- runtime
ii libdrm2:amd64 2.4.115+git2307210500.cc8c22~oibaf~j amd64 Userspace interface to kernel DRM services -- runtime
ii libdrm2:i386 2.4.115+git2307210500.cc8c22~oibaf~j i386 Userspace interface to kernel DRM services -- runtime
ii libegl-mesa0:amd64 23.3~git2307220600.5cca11~oibaf~j amd64 free implementation of the EGL API -- Mesa vendor library
ii libegl-mesa0:i386 23.3~git2307220600.5cca11~oibaf~j i386 free implementation of the EGL API -- Mesa vendor library
ii libgbm1:amd64 23.3~git2307220600.5cca11~oibaf~j amd64 generic buffer management API -- runtime
ii libgbm1:i386 23.3~git2307220600.5cca11~oibaf~j i386 generic buffer management API -- runtime
ii libgl1-mesa-dri:amd64 23.3~git2307220600.5cca11~oibaf~j amd64 free implementation of the OpenGL API -- DRI modules
ii libgl1-mesa-dri:i386 23.3~git2307220600.5cca11~oibaf~j i386 free implementation of the OpenGL API -- DRI modules
ii libglapi-mesa:amd64 23.3~git2307220600.5cca11~oibaf~j amd64 free implementation of the GL API -- shared library
ii libglapi-mesa:i386 23.3~git2307220600.5cca11~oibaf~j i386 free implementation of the GL API -- shared library
ii libglx-mesa0:amd64 23.3~git2307220600.5cca11~oibaf~j amd64 free implementation of the OpenGL API -- GLX vendor library
ii libglx-mesa0:i386 23.3~git2307220600.5cca11~oibaf~j i386 free implementation of the OpenGL API -- GLX vendor library
ii libvdpau1:amd64 1.5-1~oibaf~j amd64 Video Decode and Presentation API for Unix (libraries)
ii libxatracker2:amd64 23.3~git2307220600.5cca11~oibaf~j amd64 X acceleration library -- runtime
ii mesa-va-drivers:amd64 23.3~git2307220600.5cca11~oibaf~j amd64 Mesa VA-API video acceleration drivers
ii mesa-va-drivers:i386 23.3~git2307220600.5cca11~oibaf~j i386 Mesa VA-API video acceleration drivers
ii mesa-vdpau-drivers:amd64 23.3~git2307220600.5cca11~oibaf~j amd64 Mesa VDPAU video acceleration drivers
ii mesa-vulkan-drivers:amd64 23.3~git2307220600.5cca11~oibaf~j amd64 Mesa Vulkan graphics drivers
ii mesa-vulkan-drivers:i386 23.3~git2307220600.5cca11~oibaf~j i386 Mesa Vulkan graphics drivers

but I have an update pending. It is bleeding edge.

Got info that s2idle.prefer_microsoft_guid=1 is no longer supported and ignored. So I guess my suspend success was just a coincidence of having the HDMI cable unplugged in parallel :see_no_evil:

I'm currently slowly working on reverting my VFIO setup to be switchable and not static - so my NVIDIA options are currently limited.

But when I loaded the kernel without that VFIO stuff, i.e. instead of my usual command line

root@LEGION5PRO:~# cat /proc/cmdline
BOOT_IMAGE=/@/boot/vmlinuz-6.4.3-060403-generic root=UUID=a2ec4268-40ad-400d-b714-1f5cd394b39e ro rootflags=subvol=@ amd_iommu=pgtbl_v1 iommu=pt vfio_iommu_type1.allow_unsafe_interrupts=1 kvm.ignore_msrs=1 vfio-pci.ids=1022:14da,1022:14db,1022:14db,10de:2860,10de:22bd default_hugepagesz=1G hugepagesz=1G hugepages=8 pcie_aspm=force acpi=copy_dsdt

by booting the same kernel with just

…
linux /@/boot/vmlinuz-6.4.3-060403-generic root=UUID=a2ec4268-40ad-400d-b714-1f5cd394b39e ro rootflags=subvol=@ pcie_aspm=force acpi=copy_dsdt
initrd /@/boot/initrd.img-6.4.3-060403-generic

I noticed a flicker and garbage on the screen which looked like yours!
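
When comparing two boots like this, it can help to list exactly which kernel parameters differ. A minimal sketch with made-up function names follows; it splits on whitespace only, so parameters with quoted spaces are not handled.

```python
def parse_cmdline(cmdline):
    """Map each kernel parameter to its value (None for bare flags like 'ro')."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def removed_params(before, after):
    """Return parameter names present in 'before' but missing from 'after'."""
    after_params = parse_cmdline(after)
    return [key for key in parse_cmdline(before) if key not in after_params]

# Example call: removed_params(open("/proc/cmdline").read(), other_cmdline)
```

Note that /proc/cmdline also contains BOOT_IMAGE=, which a GRUB `linux` line does not, so it will show up as a difference.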

Additionally, I saw a slow boot regarding problems with usb 5.x devices:

dmesg.2.gz:[ 1.880509] kernel: usb 5-1.1: new high-speed USB device number 3 using xhci_hcd
dmesg.2.gz:[ 7.032525] kernel: usb 5-1.1: device descriptor read/64, error -110
dmesg.2.gz:[ 22.640549] kernel: usb 5-1.1: device descriptor read/64, error -110
dmesg.2.gz:[ 22.828530] kernel: usb 5-1.1: new high-speed USB device number 4 using xhci_hcd
dmesg.2.gz:[ 28.016567] kernel: usb 5-1.1: device descriptor read/64, error -110
dmesg.2.gz:[ 43.632569] kernel: usb 5-1.1: device descriptor read/64, error -110
dmesg.2.gz:[ 43.740955] kernel: usb 5-1-port1: attempt power cycle
dmesg.2.gz:[ 44.344527] kernel: usb 5-1.1: new high-speed USB device number 5 using xhci_hcd
dmesg.2.gz:[ 55.024522] kernel: usb 5-1.1: device not accepting address 5, error -62
dmesg.2.gz:[ 55.104533] kernel: usb 5-1.1: new high-speed USB device number 6 using xhci_hcd
dmesg.2.gz:[ 65.776524] kernel: usb 5-1.1: device not accepting address 6, error -62
dmesg.2.gz:[ 65.778439] kernel: usb 5-1-port1: unable to enumerate USB device
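
Log lines like the ones above can be summarized per device and error code to spot a pattern across boots. The sketch below is illustrative (the function name and regular expression are made up); on Linux, errno 110 is ETIMEDOUT and 62 is ETIME.

```python
import re
from collections import Counter

# Matches e.g. "usb 5-1.1: device descriptor read/64, error -110"
USB_ERROR = re.compile(r"usb (?P<dev>[^\s:]+): .*\berror -(?P<errno>\d+)")


def count_usb_errors(lines):
    """Count USB error lines, keyed by (device, errno)."""
    counts = Counter()
    for line in lines:
        match = USB_ERROR.search(line)
        if match:
            counts[(match.group("dev"), int(match.group("errno")))] += 1
    return counts

# Example call (assumes a saved log; reading dmesg itself may need root):
# with open("dmesg.txt") as f:
#     print(count_usb_errors(f))
```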

I then booted again my VFIO enabled kernel and again saw garbage on the screen. I powered off the system and currently have not seen the usb issues in dmesg and it works fine currently without screen garbage. The last thing I dealt with was AFAIR installing, configuring and uninstalling laptop-mode-tools, as I saw the issues starting from then, without having the system powered off (warm boot only).

  1. Do you see USB messages in your dmesg, too, when the issues occur?
  2. Did you enable overclocking in BIOS for CPU and/or GPU?
  3. Do you have a powertop --autotune set dealing with power savings per device?
  4. You seem to have "quiet splash" set in GRUB - you should probably consider removing that in order to see the boot messages early and find a pattern?!

  1. I do not have USB error messages (at least I don't see them). Last time I didn't have glitches, but my laptop simply froze (no access to a TTY, and I couldn't suspend using the power button).
  2. GPU overclocking has been on since the beginning and I didn't change it.
  3. I've never used powertop. Is it a useful tool?
  4. Yes, I'll turn them off for now.

I don't know why, but my system has seemed a lot more stable for the past two days: only one freeze today, and yesterday some black "blinking" of the screen two or three times.
I didn't change anything apart from deactivating and reactivating the AMD GPU when I need to work on a stable laptop.

For what it's worth, I found a way to increase the TGP of the NVIDIA GPU.
After starting nvidia-powerd it is stuck at 80 W, but if you cycle through the performance modes using the Fn+Q key (it cycles between white/red/blue, which stand for normal/boost/low performance), it goes up when using normal or boost and drops to 50 W when using low.
That's very weird because it's not consistent: sometimes it's 95 W, sometimes 125, 130 or 140, etc.
But even if the max is set above 100 W, the power actually used is never more than 100 W :frowning:

Well, powertop has some nice capabilities, though its UI is a bit odd. You can use TAB to switch through the panels, and it shows you power usage per device. It has a Tunables page where you can toggle power savings per device with the space key, and it shows the corresponding "echo" command and its /sys path, which you could use in your own scripts.

You can fine-tune the devices' power settings; an auto-tune enables everything at once, which can lead to lags, especially with USB devices (like an unresponsive mouse cursor, etc.), but can show you the power-saving potential. Your mileage may vary. RTFM.

GPU power consumption should basically be as low as possible and the relative low W consumption might just be due to efficiency as nothing more is required in that moment. Check out the test at https://www.notebookcheck.net/Ryzen-7-7745HX-performance-debut-Lenovo-Legion-Pro-5-16-Gen-8-laptop-review.717942.0.html

I also enabled the overclock modes in BIOS but set the CTCL value to 85, which is the max temperature the CPU must not exceed. But as I'm interested in a quiet system, I tend to use the settings there just for undervolting, by lowering the voltage in the curve settings (keep the "-" in the setting, as it decreases the voltage, while "+" overclocks, i.e. exceeds it). Effectively "-" is undervolting, which to a certain degree can even get you more performance, as the frequency can get a bit higher. As the system stays cooler when the temperature is capped, there might be enough headroom in the cooling system for the GPU to consume more power.

I got memory bit flips in memtest86 when activating RAM overclocking, so I disabled it again. It gained 1 GB/s, but stable RAM is mandatory. I ordered my system with 16 GB and bought 32 GB afterwards, as I didn't want to spend +200€ for 32 GB with Lenovo directly.



I'm still checking things out, but here you can see a comparison of my VFIO'd Win11 VM, with half the cores and just 8 GB RAM assigned, against its native counterpart, benchmarked using PassMark. Check out the coloured numbers when comparing the values. The passed-through GPU is 10% faster than natively. To be honest I doubt that; it might be misleading due to timer issues, but at least it feels sufficient given the lower results in the rest of the tests.

CPU in VM can only use 8c/16t (the rest is reserved for Linux) but it reaches 40% of the CPUmark value - not that bad as if extrapolated it equals 80% from native.

PS: Regarding your ACPI messages from above: Did you ever try acpi=copy_dsdt in GRUB_CMDLINE_LINUX_DEFAULT?

About the GPU usage: I was using PyTorch during the screenshot. AFAIK it takes all the power it can get from the GPU (I don't think there is a default 100 W limit).

I already used acpi=copy_dsdt but the last time I modified grub config I removed it to see again if there was a change in the system.

About the OC: I just left it as-is, as I'm not used to doing it. I didn't modify any particular settings, except for setting the performance mode to extreme.