Amdgpu: Unsupported power profile mode 0 on RENOIR and no video output after computer idle

Under 5.11.10-1-MANJARO I left the computer running and returned after a few hours, but there was no video signal. I connected to the computer remotely over SSH and it was still working. I did not see any issues in dmesg or journalctl, but maybe I used the wrong command and dmesg showed only the most recent log lines. After I terminated the apps and sent the reboot command, the display started showing the shutdown messages. Now I have booted back up to report this issue, and dmesg shows an interesting red line:
amdgpu: Unsupported power profile mode 0 on RENOIR
The rest of the red lines also appeared on the older kernel, where I had no serious graphics issue except for this one on 5.10.23-1-MANJARO: XFCE and all apps terminated, drm:amdgpu_job_timedout [amdgpu] VLC; GPU reset
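
Since the display only came back after the reboot, the messages from the blackout itself would be in the previous boot's kernel log, not in the current dmesg. A minimal sketch for pulling out the GPU-related warnings, assuming systemd-journald keeps a persistent journal on this system (otherwise the previous boot's log no longer exists); `gpu_log_filter` is a hypothetical helper name, and the keyword list is only a guess at what is worth looking for:

```shell
#!/usr/bin/env bash
# Keep only kernel log lines that mention amdgpu/drm AND look like a problem.
# Reads log text on stdin, so it works with both dmesg and journalctl output.
gpu_log_filter() {
    grep -Ei 'amdgpu|drm' | grep -Ei 'unsupported|timedout|timeout|reset|fail|error'
}

# Usage, previous boot (-k = kernel messages, -b -1 = boot before this one):
#   journalctl -k -b -1 --no-pager | gpu_log_filter
# Usage, current boot:
#   sudo dmesg | gpu_log_filter
```

If `journalctl --list-boots` shows only one entry, the journal is probably not persistent and only the current boot can be inspected.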

Log lines on 5.11.10-1-MANJARO:

$ sudo dmesg | grep -Ei "amd|gpu|failed|failure|error"; inxi -Gazy

[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.11-x86_64 root=UUID=… rw cryptdevice=UUID=… root=/dev/mapper/luks-… apparmor=1 security=apparmor udev.log_priority=3 amdgpu.ppfeaturemask=0xffffffff
[ 0.000000] AMD AuthenticAMD
[ 0.004787] RAMDISK: [mem 0x3606f000-0x3702efff]
[ 0.004821] ACPI: SSDT 0x0000000084BF3000 007229 (v02 AMD AmdTable 00000002 MSFT 04000000)
[ 0.004823] ACPI: IVRS 0x0000000084BF2000 0000D0 (v02 AMD AmdTable 00000001 AMD 00000000)
[ 0.004856] ACPI: VFCT 0x0000000084BC7000 00D484 (v01 HPQOEM SLIC-BPC 00000001 AMD 31504F47)
[ 0.004858] ACPI: SSDT 0x0000000084BC3000 0034A8 (v01 AMD AmdTable 00000001 INTL 20160527)
[ 0.004863] ACPI: SSDT 0x0000000084BFC000 0000BF (v01 AMD AmdTable 00001000 INTL 20160527)
[ 0.004865] ACPI: SSDT 0x0000000084BBF000 001405 (v01 AMD AmdTable 00000001 INTL 20160527)
[ 0.004867] ACPI: SSDT 0x0000000084BB9000 005354 (v02 AMD AmdTable 00000001 AMD 00000001)
[ 0.004869] ACPI: CRAT 0x0000000084BB8000 000F28 (v01 AMD AmdTable 00000001 AMD 00000001)
[ 0.004871] ACPI: CDIT 0x0000000084BB7000 000029 (v01 AMD AmdTable 00000001 AMD 00000001)
[ 0.203827] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.11-x86_64 root=UUID=… rw cryptdevice=UUID=…:luks-… root=/dev/mapper/luks-… apparmor=1 security=apparmor udev.log_priority=3 amdgpu.ppfeaturemask=0xffffffff
[ 0.354477] Spectre V2 : Mitigation: Full AMD retpoline
[ 0.462710] smpboot: CPU0: AMD Ryzen 7 PRO 4750GE with Radeon Graphics (family: 0x17, model: 0x60, stepping: 0x1)
[ 0.462843] Performance Events: Fam17h+ core perfctr, AMD PMU driver.
[ 0.515721] ACPI BIOS Error (bug): Failure creating named object [_GPE._L00], AE_ALREADY_EXISTS (20201113/dswload2-326)
[ 0.515734] ACPI Error: AE_ALREADY_EXISTS, During name lookup/catalog (20201113/psobject-220)
[ 0.516274] ACPI BIOS Error (bug): Could not resolve symbol [_SB.I2CD], AE_NOT_FOUND (20201113/dswload2-162)
[ 0.516278] ACPI Error: AE_NOT_FOUND, During name lookup/catalog (20201113/psobject-220)
[ 1.764889] pci 0000:00:00.2: AMD-Vi: Unable to read/write to IOMMU perf counter.
[ 1.767293] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[ 1.767298] pci 0000:00:00.2: AMD-Vi: Extended features (0x206d73ef22254ade):
[ 1.767305] AMD-Vi: Interrupt remapping enabled
[ 1.767305] AMD-Vi: Virtual APIC enabled
[ 1.767306] AMD-Vi: X2APIC enabled
[ 1.767549] AMD-Vi: Lazy IO/TLB flushing enabled
[ 1.768455] amd_uncore: 4 amd_df counters detected
[ 1.768480] amd_uncore: 6 amd_l3 counters detected
[ 1.770000] perf: AMD IBS detected (0x000003ff)
[ 1.788170] AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel jroedel@suse.de
[ 1.816449] RAS: Correctable Errors collector initialized.
[ 4.561582] ata6: failed to resume link (SControl 0)
[ 13.647154] spl: module verification failed: signature and/or required key missing - tainting kernel
[ 14.908716] EDAC amd64: F17h_M60h detected (node 0).
[ 14.908912] EDAC amd64: Node 0: DRAM ECC disabled.
[ 14.918774] hp_wmi: query 0x4 returned error 0x5
[ 14.932634] hp_wmi: query 0xd returned error 0x5
[ 14.939944] hp_wmi: query 0x1b returned error 0x5
[ 15.021812] EDAC amd64: F17h_M60h detected (node 0).
[ 15.021871] EDAC amd64: Node 0: DRAM ECC disabled.
[ 15.116068] EDAC amd64: F17h_M60h detected (node 0).
[ 15.116118] EDAC amd64: Node 0: DRAM ECC disabled.
[ 15.281578] EDAC amd64: F17h_M60h detected (node 0).
[ 15.281643] EDAC amd64: Node 0: DRAM ECC disabled.
[ 15.433061] EDAC amd64: F17h_M60h detected (node 0).
[ 15.433111] EDAC amd64: Node 0: DRAM ECC disabled.
[ 15.433916] [drm] amdgpu kernel modesetting enabled.
[ 15.434070] amdgpu: Topology: Add CPU node
[ 15.434188] fb0: switching to amdgpudrmfb from EFI VGA
[ 15.434317] amdgpu 0000:0a:00.0: vgaarb: deactivate vga console
[ 15.434468] amdgpu 0000:0a:00.0: amdgpu: Trusted Memory Zone (TMZ) feature disabled as experimental (default)
[ 15.435560] amdgpu 0000:0a:00.0: amdgpu: Fetched VBIOS from VFCT
[ 15.435563] amdgpu: ATOM BIOS: 113-RENOIR-026
[ 15.449612] amdgpu 0000:0a:00.0: amdgpu: VRAM: 512M 0x000000F400000000 - 0x000000F41FFFFFFF (512M used)
[ 15.449615] amdgpu 0000:0a:00.0: amdgpu: GART: 1024M 0x0000000000000000 - 0x000000003FFFFFFF
[ 15.449617] amdgpu 0000:0a:00.0: amdgpu: AGP: 267419648M 0x000000F800000000 - 0x0000FFFFFFFFFFFF
[ 15.449905] [drm] amdgpu: 512M of VRAM memory ready
[ 15.449908] [drm] amdgpu: 3072M of GTT memory ready.
[ 15.449910] [drm] GART: num cpu pages 262144, num gpu pages 262144
[ 15.573266] EDAC amd64: F17h_M60h detected (node 0).
[ 15.573314] EDAC amd64: Node 0: DRAM ECC disabled.
[ 15.692898] EDAC amd64: F17h_M60h detected (node 0).
[ 15.692959] EDAC amd64: Node 0: DRAM ECC disabled.
[ 15.893105] EDAC amd64: F17h_M60h detected (node 0).
[ 15.893163] EDAC amd64: Node 0: DRAM ECC disabled.
[ 15.999450] EDAC amd64: F17h_M60h detected (node 0).
[ 15.999486] EDAC amd64: Node 0: DRAM ECC disabled.
[ 16.049878] EDAC amd64: F17h_M60h detected (node 0).
[ 16.049918] EDAC amd64: Node 0: DRAM ECC disabled.
[ 16.106120] EDAC amd64: F17h_M60h detected (node 0).
[ 16.106168] EDAC amd64: Node 0: DRAM ECC disabled.
[ 16.172596] EDAC amd64: F17h_M60h detected (node 0).
[ 16.172635] EDAC amd64: Node 0: DRAM ECC disabled.
[ 16.239247] EDAC amd64: F17h_M60h detected (node 0).
[ 16.239305] EDAC amd64: Node 0: DRAM ECC disabled.
[ 16.293385] EDAC amd64: F17h_M60h detected (node 0).
[ 16.293446] EDAC amd64: Node 0: DRAM ECC disabled.
[ 16.338220] amdgpu 0000:0a:00.0: amdgpu: RAS: optional ras ta ucode is not available
[ 16.358227] amdgpu 0000:0a:00.0: amdgpu: RAP: optional rap ta ucode is not available
[ 16.359068] amdgpu 0000:0a:00.0: amdgpu: SMU is initialized successfully!
[ 16.359195] EDAC amd64: F17h_M60h detected (node 0).
[ 16.359236] EDAC amd64: Node 0: DRAM ECC disabled.
[ 16.392764] snd_hda_intel 0000:0a:00.1: bound 0000:0a:00.0 (ops amdgpu_dm_audio_component_bind_ops [amdgpu])
[ 16.425024] Virtual CRAT table created for GPU
[ 16.425649] EDAC amd64: F17h_M60h detected (node 0).
[ 16.425686] EDAC amd64: Node 0: DRAM ECC disabled.
[ 16.425744] amdgpu: Topology: Add dGPU node [0x1636:0x1002]
[ 16.425754] amdgpu 0000:0a:00.0: amdgpu: SE 1, SH per SE 2, CU per SH 18, active_cu_number 28
[ 16.426432] fbcon: amdgpudrmfb (fb0) is primary device
[ 16.563170] amdgpu 0000:0a:00.0: [drm] fb0: amdgpudrmfb frame buffer device
[ 16.595062] amdgpu 0000:0a:00.0: amdgpu: ring gfx uses VM inv eng 0 on hub 0
[ 16.595067] amdgpu 0000:0a:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[ 16.595069] amdgpu 0000:0a:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[ 16.595070] amdgpu 0000:0a:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
[ 16.595071] amdgpu 0000:0a:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
[ 16.595073] amdgpu 0000:0a:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
[ 16.595074] amdgpu 0000:0a:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
[ 16.595075] amdgpu 0000:0a:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
[ 16.595076] amdgpu 0000:0a:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
[ 16.595077] amdgpu 0000:0a:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
[ 16.595079] amdgpu 0000:0a:00.0: amdgpu: ring sdma0 uses VM inv eng 0 on hub 1
[ 16.595080] amdgpu 0000:0a:00.0: amdgpu: ring vcn_dec uses VM inv eng 1 on hub 1
[ 16.595082] amdgpu 0000:0a:00.0: amdgpu: ring vcn_enc0 uses VM inv eng 4 on hub 1
[ 16.595083] amdgpu 0000:0a:00.0: amdgpu: ring vcn_enc1 uses VM inv eng 5 on hub 1
[ 16.595084] amdgpu 0000:0a:00.0: amdgpu: ring jpeg_dec uses VM inv eng 6 on hub 1
[ 16.597421] [drm] Initialized amdgpu 3.40.0 20150101 for 0000:0a:00.0 on minor 0
[ 17.425730] amdgpu 0000:0a:00.0: amdgpu: Unsupported power profile mode 0 on RENOIR
[ 19.579543] audit: type=1103 audit(1618523774.272:91): pid=2188 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='op=PAM:setcred grantors=? acct="user" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'

Graphics:
Device-1: AMD Renoir vendor: Hewlett-Packard driver: amdgpu v: kernel
bus-ID: 0a:00.0 chip-ID: 1002:1636 class-ID: 0300
Display: x11 server: X.Org 1.20.10 driver: loaded: amdgpu,ati
unloaded: modesetting alternate: fbdev,vesa display-ID: :0.0 screens: 1
Screen-1: 0 s-res: 1920x1080 s-dpi: 96 s-size: 508x285mm (20.0x11.2")
s-diag: 582mm (22.9")
Monitor-1: HDMI-A-0 res: 1920x1080 hz: 60 dpi: 102
size: 477x268mm (18.8x10.6") diag: 547mm (21.5")
OpenGL: renderer: AMD RENOIR (DRM 3.40.0 5.11.10-1-MANJARO LLVM 11.1.0)
v: 4.6 Mesa 21.0.1 direct render: Yes
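
The `Unsupported power profile mode 0` message refers to the driver's power-profile setting, so it may be useful to record what the driver reports for it. A minimal sketch, assuming the GPU is `card0` and that this kernel exposes amdgpu's `pp_power_profile_mode` and `power_dpm_force_performance_level` sysfs files for it; `show_gpu_power_state` is a hypothetical helper name, and this only reads state, it does not fix anything:

```shell
#!/usr/bin/env bash
# Print amdgpu power-management state for one DRM device directory.
# $1: device directory. The default /sys/class/drm/card0/device is an
#     assumption; the card index can differ on other machines.
show_gpu_power_state() {
    local dev="${1:-/sys/class/drm/card0/device}"
    local f
    for f in power_dpm_force_performance_level pp_power_profile_mode; do
        if [ -r "$dev/$f" ]; then
            printf '== %s ==\n' "$f"
            cat "$dev/$f"
        else
            # File missing: this driver/ASIC combination does not expose it.
            printf '== %s == (not available)\n' "$f"
        fi
    done
}

# Usage:
#   show_gpu_power_state
#   show_gpu_power_state /sys/class/drm/card1/device
```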

This happened twice in the last 24 hours. Reconnecting the display's HDMI cable did not help, but either of the following two commands, run over SSH from a remote machine, brought the display back:

sudo systemctl restart display-manager
or
sudo systemctl restart dbus.service
The second one worked too, but then I had to run "systemctl restart NetworkManager", and again all apps were terminated in the process!

Restarting these services possibly causes all apps to be terminated :-/ because I did not see any running afterwards.
Any ideas what to try?

I think that is a bug in amdgpu and needs to be fixed from the kernel side.
Did you already try updating to 5.11.14 to see if it’s working again?

I’m currently staying put on 5.11.6 as that is the only 5.11-series kernel that’s working well for me - sleep/suspend is working and on waking it’s not crashing.

One thing to note: 5.11.14 does not support S3 sleep modes, so your Renoir laptop (assuming it’s one and not a desktop machine) will not have proper sleep. Another annoying kernel issue for Renoir users atm.

Yes, it happens in 5.10.30-1 and in 5.11.14-1-MANJARO too. I cannot try 5.12, as it apparently does not support the ZFS kernel modules yet. It just happened for the third time today. So where do I report this? Does no one know which service to restart (something like a screen locker, PAM, graphics) so that my apps are not terminated and I do not lose data? Because when I click the power button or restart display-manager or dbus.service, graphics start working again, but the apps are terminated. This Linux may be less reliable than Microsoft Windows.

btw: how do I downgrade to 5.11.6? I do not see it in Manjaro Settings Manager/Kernel, nor in "mhwd-kernel -l".

You could try this tutorial for shutting down the PC gracefully; however, I have not tested myself whether it works when the issue occurs:

To downgrade I used 5.11.6 from the pacman cache on my system via

`sudo pacman -U /var/cache/pacman/pkg/linux511-5.11.6-1-x86_64.pkg.tar.zst`

If you are lucky, you might still have it in the cache; otherwise you need to download it again from somewhere (I would need to check myself how to find it and where).
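
To see which kernel builds are still in the cache before trying the downgrade, something like the following could be used. This is only a sketch: `list_cached_kernels` is a hypothetical helper, and it assumes the default pacman cache directory and Manjaro's `linux511-<version>` package naming as seen in the command above.

```shell
#!/usr/bin/env bash
# List cached kernel packages of one series, oldest version first.
# $1: series prefix as used in Manjaro package names, e.g. linux511
# $2: cache directory (defaults to pacman's standard cache path)
list_cached_kernels() {
    local series="$1"
    local cache="${2:-/var/cache/pacman/pkg}"
    # Require a digit right after the dash so that only the kernel package
    # itself matches, not linux511-headers or other add-on packages.
    # Detached signature files (*.sig) are skipped.
    find "$cache" -maxdepth 1 -name "${series}-[0-9]*.pkg.tar.*" ! -name '*.sig' \
        -exec basename {} \; | sort -V
}

# Usage:
#   list_cached_kernels linux511
```

Any file it prints can then be passed to `sudo pacman -U` with its full path, as in the command above.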

About reporting it, I guess the kernel devs should have a bug tracker somewhere, but afaik it’s a known issue already anyway.

@Dino-Fossil
Somehow I was unable to find that command, I no longer have that package in my cache, and I have no experience building/compiling it.

But I am optimistic now, because the no-video-signal issue has not recurred in the last 24 hours, even though I am still on 5.11.14-1-MANJARO and have downgraded no packages (I would not know how to do it anyway). Maybe that is due to the following two changes I have made:

Click the Manjaro menu, type "lock", and open the "Light Locker Settings" app. Disable Light Locker.

sudo powertop
Press the Tab key a few times until the "Tunables" tab is selected, then hit Enter to disable power management for the highlighted (>>) entry below (this also disabled it for the other two entries marked "Bad"):

>> Bad           Runtime PM for I2C Adapter i2c-0 (AMDGPU DM i2c hw bus 0)                                              
   Bad           Runtime PM for I2C Adapter i2c-1 (AMDGPU DM i2c hw bus 1)
   Good          Autosuspend for USB device 4-Port USB 2.0 Hub [Generic]
   Good          Autosuspend for USB device 4-Port USB 3.0 Hub [Generic]
   Good          Autosuspend for USB device USB3.0 Card Reader [Generic]
   Good          Autosuspend for USB device USB 10/100/1000 LAN [Realtek]
   Bad           Runtime PM for I2C Adapter i2c-2 (AMDGPU DM i2c hw bus 2)
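
Changes made on powertop's Tunables tab are lost on reboot, so it helps to be able to check the setting afterwards. A minimal sketch for reading it back through sysfs, assuming that powertop's "Runtime PM for I2C Adapter" tunable corresponds to the `power/control` file of each adapter's parent device (that path and the helper name `show_i2c_runtime_pm` are my assumptions, not something confirmed in this thread):

```shell
#!/usr/bin/env bash
# Show the runtime-PM policy of the device behind each I2C adapter.
# "auto" lets the kernel suspend the device when idle; "on" keeps it awake
# (powertop labels "on" as "Bad" because it costs power).
# $1: I2C sysfs directory; the default path is an assumption about where
#     powertop's "Runtime PM for I2C Adapter" tunable lives.
show_i2c_runtime_pm() {
    local root="${1:-/sys/bus/i2c/devices}"
    local ctl
    for ctl in "$root"/i2c-*/device/power/control; do
        [ -r "$ctl" ] || continue   # skip adapters without this file
        printf '%s: %s\n' "$ctl" "$(cat "$ctl")"
    done
}

# Usage:
#   show_i2c_runtime_pm
```

If the three AMDGPU adapters share the same parent device, they would all point at one `power/control` file, which might be why toggling one entry changed the other two as well.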

If I knew where this issue has its bug report, I would share this with them, but I do not know where this issue/bug is tracked or how to find out when it gets fixed.