Dell Inspiron 15 5000 (amdgpu RX Vega 10) black screen or kernel warning after waking from sleep

I'm using a Dell Inspiron 15 5585 with AMD CPU and GPU, and the amdgpu driver. I use the laptop hooked up to an external monitor, with the internal display turned off.

(Sidenote: on Windows, opening the lid switches the laptop from external display to internal. I'm not sure if that's caused by hardware or Windows drivers.)

On Manjaro KDE, when I sleep and wake the laptop, I get a black screen (I'm not yet sure whether this happens sometimes or always). Sometimes I can get to tty2, but after unplugging the GPU cable, I was unable to do so. After rebooting the computer, I saved journalctl -b -1 to a file and uploaded it to https://gist.github.com/jimbo1qaz/7d0158cb9f3f030dd4052fd210414b2b.

If you search (Ctrl+F) for "tainted", you'll find one warning mentioning nvme_poll_irqdisable, followed by many warnings mentioning amdgpu_dm_atomic_commit_tail.

The warning is raised at dcn10_verify_allow_pstate_change_high().

Linux source code: https://github.com/torvalds/linux/blob/v5.3/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c#L932

(Note: I have earlyoom running, to hopefully avoid OOM errors.)

Actions taken

  • Installed Linux 5.4, which fixed failed to write reg and *ERROR* IB ring test failed (-22).
  • sudo nano /etc/mkinitcpio.conf, MODULES="amdgpu radeon" (does nothing of note)
    • mkinitcpio
    • sudo mkinitcpio -P
    • What does it do? I have no clue!
  • https://bbs.archlinux.org/viewtopic.php?pid=1873058#p1873058
    • Simply paste this into your xorg config:
Section "Device" 
    Identifier "AMDGPU" 
    Driver "amdgpu" 
    Option "DRI" "2" 
EndSection
    • DRI2 works!!! Major flicker with compositing off, works well with compositing on.
    • No, I still get IOMMU/xHCI crashes.

Before DRI2, in one sleep, I get IO_PAGE_FAULT, IB test failed (-22), xHCI host controller not responding, assume dead, and IB test failed on gfx (-110) on the same run!

After DRI2, there is a reproducible failure mode: on the second sleep, the keyboard can wake the computer while the USB interface is already asleep but the screen is not yet. The screen turns off, on, then off again (and the keyboard sleeps again). When I then wake the computer, it becomes brain-dead (12:19).

In one sleep, I get IO_PAGE_FAULT, xHCI host controller not responding, assume dead, and IB test failed (-22) on the same run!

AMD-Vi IO_PAGE_FAULT. AMD-Vi is IOMMU. I should disable iommu then.

Fixing xHCI suspend issues? If I run rmmod before sleeping, I still get a hang on wake (with no journalctl output after system goes to sleep). So disabling xHCI on sleep does not solve the issue. So the scripts are pointless. (But I'll mention them anyway.)

Wayland? I get a different issue.

Actions not taken yet

full-text journalctl dumps

Uploaded to gitlab.com/nyanpasu64/manjaro-dell-issues. They contain many failure modes and lots of error codes to analyze and understand through rg (or grep).

ACPI-induced freezes when initiating sleep/shutdown ("Sleep causes ACPI failure") (not fixed in 5.4) (doesn't seem fixed by dri2)

I think my laptop has an ACPI issue. On earlier Linux kernels and UEFI set to "connected suspend", it wouldn't boot. Now with UEFI in S3 mode (I think), sleep and wake sometimes doesn't work. Often, the USB devices would fall asleep right after I told Linux to sleep, but the display would remain frozen and on for several seconds.

Failure to sleep (random)

Sometimes, according to journalctl, systemd thought the computer had slept. In reality, the keyboard backlight and/or CPU fan remained on, and the computer could not be woken up. If I was using the internal display, the keyboard and fan would stop, then a second later the keyboard backlight and display backlight would turn on (but the display was black).

This seems to often happen as the second sleep after logging in.

This has happened on the login screen after 3-4 sleeps and me spam-clicking the Suspend button.

Testing without spam-clicking the Suspend button causes the computer to fail to wake.

Failure to halt (always happens from tty)

shutdown -P now powers off the machine.
shutdown -H now halts the system (but doesn't power off the machine).
halt does not power off my laptop, but it powers off other machines I've used.

Possible solutions

BIOS upgrade? idk

AMD-Vi and xhci_hcd IO_PAGE_FAULT (not fixed in 5.4) (still an issue now)

AMD's implementation of IOMMU is also known as AMD-Vi.

I cannot disable IOMMU as it breaks wifi.

xhci_hcd IO_PAGE_FAULT is consistently associated with *ERROR* IB test failed on gfx (-110) and with brain death.

https://askubuntu.com/questions/805008 claims it's harmless.

Linux 5.4, "login screen, I spam*" (I performed these actions, then saved the resulting journalctl log to this filename.)

I pressed a key to wake the computer, which took an increasingly long time to display a working (not stuck) GUI, and eventually froze at a black screen.

I think that if I click too many times, the extra clicks get queued: I can then bump the keyboard (without clicking the mouse), and the queued mouse clicks will make the computer sleep again.

  • Sometimes the computer seems to wake by itself. That didn't happen here; there was a final keypress which woke the computer halfway before it crashed.

Forum search

Wake causes broken monitor output, xHCI driver issues (xhci_pci_suspend) (not fixed in 5.4)

Linux 5.4, "fast vs slow suspend"

  • log in. suspend a few times. no issues, as long as I wait 5 seconds after each login before suspending again.
  • if I suspend quickly after I wake and log in, then bad things happen.
    • the computer wakes itself. I can move the cursor on the login screen background (which creates hundreds of CreateNotify events), but nothing else gets drawn.
    • suspend entry occurrence 6 is followed by xhci_pci_suspend errors, and *ERROR* IB ring test failed (-22).
      • I've seen xhci_pci_suspend errors in previous logs.
      • xhci_pci_suspend and *ERROR* IB ring test failed (-22) appear in pairs.
        • -22 can appear seconds before xhci_pci_suspend.
      • *ERROR* IB ring test failed (-110) is not preceded by xhci_pci_suspend.
  • afterwards, I get "kworker ... blocked for more than" errors related to xhci.
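
The pairing claim above (xhci_pci_suspend and the -22 ring test failure showing up together) could be checked mechanically rather than by eye. This is only a rough sketch: the function names and the 10-second window are my own assumptions, and it assumes journalctl's default short format where each line starts with "Mon DD HH:MM:SS".

```python
from datetime import datetime, timedelta


def parse_time(line):
    """Parse the leading 'Mon DD HH:MM:SS' stamp of a short-format journal line."""
    return datetime.strptime(line[:15], "%b %d %H:%M:%S")


def pair_with_xhci(lines, window_seconds=10):
    """Split '-22' ring test failures into those near an xhci_pci_suspend line
    and those with no xhci_pci_suspend line within the window (before or after).

    window_seconds is an arbitrary guess, not a value taken from the logs.
    """
    window = timedelta(seconds=window_seconds)
    xhci_times = [parse_time(l) for l in lines if "xhci_pci_suspend" in l]
    paired, unpaired = [], []
    for line in lines:
        # Case-insensitive: the logs contain both "IB ..." and "ib ring test failed".
        if "ring test failed (-22)" not in line.lower():
            continue
        t = parse_time(line)
        if any(abs(t - x) <= window for x in xhci_times):
            paired.append(line)
        else:
            unpaired.append(line)
    return paired, unpaired
```

If unpaired comes back empty across all saved logs, that would support the "appear in pairs" observation; any unpaired entries are worth reading by hand.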

Pressing the power button caused the system to shut down within 6 seconds according to the journal, but Ctrl+Alt+F2 worked initially, and Caps Lock kept responding long afterwards.

Symptom: CreateNotify

ksmserver[2795]: CreateNotify: 35651712
    (hundreds/thousands of lines like this, each with a unique number)

ksmserver is KDE's session manager. On startup the session manager launches auto-start applications and restores applications from the previous session.

The X server can report CreateNotify events to clients wanting information about creation of windows. The X server generates this event whenever a client application creates a window by calling XCreateWindow() or XCreateSimpleWindow().
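
To get a feel for how big the CreateNotify flood is in a given log, the lines can be counted per process. A minimal sketch, assuming the line shape shown above ("name[pid]: CreateNotify: number"); the function name is made up, and treating the trailing number as a window ID is my guess, not something the log states.

```python
import re
from collections import Counter

# Matches e.g. "ksmserver[2795]: CreateNotify: 35651712"
CREATE_NOTIFY = re.compile(r"(\w+)\[\d+\]: CreateNotify: (\d+)")


def summarize_create_notify(lines):
    """Return (events per process name, number of distinct trailing numbers)."""
    hits = [m for m in (CREATE_NOTIFY.search(l) for l in lines) if m]
    per_process = Counter(m.group(1) for m in hits)
    distinct_numbers = len({m.group(2) for m in hits})
    return per_process, distinct_numbers
```

If distinct_numbers equals the total event count, every event really did carry a unique number, as it looked when scrolling through the log.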

Forum search

Wake causes broken monitor output, failed to write reg (fixed in 5.4)

failed to write reg

Ever since switching to Linux 5.4, I have not encountered failed to write reg in my logs.

Sometimes, during the login process, the display froze and Ctrl+Alt+F2 did nothing, but pressing Caps Lock flashed the keyboard light. Afterwards I held the power button for 4 seconds.

Dec 06 04:53:15 dell-manjaro kernel: failed to write reg 28b4 wait reg 28c6
Dec 06 04:53:16 dell-manjaro kernel: failed to write reg 1a6f4 wait reg 1a706
...
Dec 06 04:53:24 dell-manjaro kernel: failed to write reg 28b4 wait reg 28c6
Dec 06 04:53:26 dell-manjaro kernel: failed to write reg 1a6f4 wait reg 1a706
Dec 06 04:53:36 dell-manjaro kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=3799, emitted seq=3801
Dec 06 04:53:36 dell-manjaro kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Xorg pid 1028 thread Xorg:cs0 pid 1038
Dec 06 04:53:36 dell-manjaro kernel: [drm] GPU recovery disabled.
... normal startup entries
Dec 06 04:55:24 dell-manjaro systemd-logind[772]: Power key pressed.
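
The "ring gfx timeout" line carries two sequence numbers. As far as I understand it (this is an assumption, not something confirmed in the logs), "signaled" is the last job the GPU finished and "emitted" is the last job submitted, so the difference is how many jobs were stuck. A small sketch to pull that out; pending_jobs is a hypothetical helper name.

```python
import re

RING_TIMEOUT = re.compile(
    r"ring (\S+) timeout, signaled seq=(\d+), emitted seq=(\d+)"
)


def pending_jobs(line):
    """Return (ring name, emitted - signaled) for a ring timeout line, else None."""
    m = RING_TIMEOUT.search(line)
    if m is None:
        return None
    ring, signaled, emitted = m.group(1), int(m.group(2)), int(m.group(3))
    return ring, emitted - signaled
```

For the excerpt above (signaled seq=3799, emitted seq=3801) this gives a gap of 2 on the gfx ring.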

I ran rg 'failed to write reg', removed the timestamps, then ran sort -u. This revealed exactly two types of failed to write reg errors, always occurring in the same order, about 1-2 seconds apart (one time 0 seconds apart), sometimes with unrelated log entries in between:

kernel: failed to write reg 28b4 wait reg 28c6
kernel: failed to write reg 1a6f4 wait reg 1a706

Sometimes I get 28b4, 1a6f4, and 28b4 and 1a6f4 again, with 1-2 seconds between each log message.
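
The "search, strip timestamps, sort -u" step can also be done in one go in Python, which avoids the manual timestamp removal. This is only a sketch of the same idea, assuming journalctl's default "Mon DD HH:MM:SS hostname" prefix; unique_messages is a name I made up.

```python
import re

# "Dec 06 04:53:15 dell-manjaro " -> stripped, leaving "kernel: ..."
PREFIX = re.compile(r"^[A-Z][a-z]{2} \d{2} \d{2}:\d{2}:\d{2} \S+ ")


def unique_messages(lines, needle):
    """Keep lines containing needle, drop the timestamp and host, de-duplicate, sort."""
    return sorted(
        {PREFIX.sub("", line.rstrip("\n")) for line in lines if needle in line}
    )
```

Usage would be something like unique_messages(open("some-journal.txt"), "failed to write reg"), where the file name is a placeholder for one of the saved journalctl dumps. The same function works for '*ERROR*' as the needle.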

*ERROR*

I'm not sure if "broken monitor output" is preceded by failed to write reg or *ERROR*.

rg '\*ERROR\*', remove timestamps, then sort -u to filter out duplicates

kernel: amdgpu 0000:05:00.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on gfx (-22).
kernel: [drm:process_one_work] *ERROR* ib ring test failed (-22).

Across all my logs, I extracted every unique *ERROR* message. They are as follows:

amdgpu 0000:05:00.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.0.1 (-110).
amdgpu 0000:05:00.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.1.1 (-110).
amdgpu 0000:05:00.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.2.1 (-110).
amdgpu 0000:05:00.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.3.1 (-110).
amdgpu 0000:05:00.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on gfx (-22).
amdgpu 0000:05:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
[drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out or interrupted!
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process plasmashell pid 1289 thread plasmashel:cs0 pid 1490
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process plasmashell pid 1313 thread plasmashel:cs0 pid 1535
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=26189, emitted seq=26191
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=4030, emitted seq=4031
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=4030, emitted seq=4033
[drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CONNECTOR:64:eDP-1] flip_done timed out
[drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CONNECTOR:69:HDMI-A-1] flip_done timed out
[drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:56:crtc-0] flip_done timed out
[drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:58:crtc-1] flip_done timed out
[drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [PLANE:46:plane-2] flip_done timed out
[drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]] *ERROR* [CRTC:56:crtc-0] flip_done timed out
[drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]] *ERROR* [CRTC:58:crtc-1] flip_done timed out
[drm:gfx_v9_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed
[drm:process_one_work] *ERROR* ib ring test failed (-110).
[drm:process_one_work] *ERROR* ib ring test failed (-22).

Looking through my logs:

  • IB test failed on comp_1.0.1 (-110). occurs early in boot and doesn't cause boot to fail.
  • flip_done timed out occurs far after the initial amdgpu failures.
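
The numbers in parentheses look like negated Linux errno values: 22 is EINVAL and 110 is ETIMEDOUT on Linux. Grouping the unique messages by code makes it easier to see which failures are timeouts and which are something else. A hedged sketch: the table only covers the two codes seen here, and group_by_errno is a made-up name.

```python
import re
from collections import defaultdict

# Linux errno numbers for the two codes that appear in these logs.
LINUX_ERRNO = {22: "EINVAL", 110: "ETIMEDOUT"}

CODE = re.compile(r"\(-(\d+)\)")


def group_by_errno(messages):
    """Group messages by the errno name of their '(-N)' suffix, if they have one."""
    groups = defaultdict(list)
    for message in messages:
        m = CODE.search(message)
        if m is None:
            continue  # e.g. "flip_done timed out" carries no numeric code
        code = int(m.group(1))
        groups[LINUX_ERRNO.get(code, "errno %d" % code)].append(message)
    return dict(groups)
```

Under that reading, the (-110) failures are the driver giving up on waiting, while (-22) on gfx is a different kind of failure, which may be why the two show up in different situations.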

Forum search ("ring gfx timeout")

  • Blank screen when coming back from suspend
    • journalctl -p3 -b 0 shows only messages of priority err (3) or more severe from the current boot, and produces much less text than plain journalctl.
    • I found that [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.0.1 (-110). occurs before the system finishes booting to the login screen.
    • There is discussion about a possible solution at Blank screen when coming back from suspend and below.
    • ❰nyanpasu64❙~❱✔≻ systemctl status acpid
      ● acpid.service - ACPI event daemon
         Loaded: loaded (/usr/lib/systemd/system/acpid.service; disabled; vendor preset: disabled)
         Active: inactive (dead)
           Docs: man:acpid(8)
      
    • What does amdgpu.gpu_recovery=1 do? Try to restart after the crash, and not prevent the crash?
    • sudo nano /etc/mkinitcpio.conf, MODULES="amdgpu radeon" ?
  • Troubleshooting random system freeze Same issue as me, no solution.
  • [Stable Update] 2019-10-10 - Gnome 3.34.1, Plasma 5.16.5, KDE Apps & Framework, Pamac 9.0, Mesa 19.2.1
    • Looks like the 5.4 kernel finally solved that ring gfx timeout issue (probably related to powerplay) also by me, but I didn't test for long. Good news for rx vega users.

    • I installed 5.4 and it fixes this issue, but not the others.
  • Several threads have the error [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!. I did not get this at first, but during further testing (some with my external monitor unplugged), I encountered this error as well, along with freezing with the screen on I think.

amdgpu.gpu_recovery=1

Wake causes hard freeze (dcn10_verify_allow_pstate_change_high) (not fixed in 5.4) (fixed by dri2)

Sometimes when waking from sleep, pressing Caps Lock does not flash the keyboard light (if I remember correctly). This is accompanied by dcn10_verify_allow_pstate_change_high appearing in the systemd journal.

Linux 5.4, "suspend twice freezes, pstate":

  • log in.
  • sleep. usb sleeps and screen turns into login screen. 3 seconds later, screen shuts off.
  • wake. orange screen, fixed by replugging hdmi cable.
    • journalctl reveals massive flood of dcn10_verify_allow_pstate_change_high errors.
  • sleep.
    • journalctl log mostly here. No errors after (at time of freeze).
  • wake. no video, caps lock works.
    • There are very few journalctl entries after this point.
    • One of them is "Suspending console(s) (use no_console_suspend to debug)"
  • switch to tty2. caps lock stops responding for a few seconds, after which I kill power.

Based on what I saw in a forum search, I ran watch -n3 xset s activate to repeatedly blank my screen.

Xorg bug tracker report

This is a rich thread full of relevant information. Not too active, last post was 4 weeks ago before Gitlab migration.

seeing similar with a Dell Latitude 5495 with AMD Ryzen 5 PRO 2500U:

  • My laptop is an inspiron 5585 with ryzen 7 3700u.
  • I got dcn10_prepare_bandwidth, but only on kernel 5.3 (not 5.4), and only in one boot.

For me it's a regression in the 5.2-development. Testing with 5.1-series show no errors. Resume after S3 suspend works without problem.

git bisect points me to df8368be1382b442384507a5147c89978cd60702

Thread with similar problem https://bugs.freedesktop.org/show_bug.cgi?id=111459

It seems DCC is broken on Raven Ridge. So how about disabling it here, until the problems are solved?

I had the same problem with Ryzen 2400G on kernels 5.2, 5.3 and 5.4, but it would only be reproduced when X was running. If I stop X before going to sleep, wakeup would work. I managed to fix it by reverting the following commit in X driver: https://github.com/freedesktop/xorg-xf86-video-amdgpu/commit/a2b32e72fdaff3007a79b84929997d8176c2d512

Confirming the posted X driver workaround fixes it on 2700U. Debian 5.2.* kernels and vanilla 5.3.1 work perfectly now.

Anyway, the latest X driver from git is broken as well. Should the issue be reported there, or is it better to fix it in kernel layer?

From the reports, it seems to be compositor related. For me, kwin with OpenGL 3.1 backend works fine. xfwm4 seems to trigger the bug, maybe other compositors too.

I may have spoken too soon. I'm non-deterministically experiencing the basic symptom of hang on resume with a blank screen, sometimes with the backlight on and sometimes without, but I no longer get the traceback in logs, so I can't tell if it's mostly the same bug but without tripping the failure mode that causes it to log, or if there's an unrelated suspend/resume bug. Switching to the OpenGL 3.1 compositor has definitely made [dcn10_verify_allow_pstate_change_high] stop appearing in my logs though.

The thread was migrated to the freedesktop.org gitlab, but is inactive now.

Switching to OpenGL 3.1 and sleeping led to an immediate freeze upon waking. The last entry was suspend entry (deep).

I tried again, running sudo ./keep-console first. Still no output.

https://bbs.archlinux.org/viewtopic.php?id=248278 is interesting.

Points to https://bugs.freedesktop.org/show_bug.cgi?id=111244 which is different from the above. This bug is still active.

Section "Device" 
    Identifier "AMDGPU" 
    Driver "amdgpu" 
    Option "DRI" "2" 
EndSection

My command line flags are these, but some of them might not be needed anymore:

idle=nomwait amd_iommu=fullflush amdgpu.gpu_recovery=1
  • Kernel 5.5rc1 seems to fix it for me, the laptop now wakes up as it's supposed to.
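
When comparing boots with different flags, it helps to confirm which parameters the kernel actually received. A minimal sketch that splits a command line like the one quoted above into a dict; parse_cmdline is a hypothetical helper, and it does not handle quoted values containing spaces.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params
```

On a running system the input would come from /proc/cmdline, for example parse_cmdline(open("/proc/cmdline").read()), and checking for "amd_iommu" or "amdgpu.gpu_recovery" in the result shows whether the bootloader change took effect.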

Forum search dcn10_verify_allow_pstate_change_high

  • AMDGPU doesn’t work with RX Vega 11 on Kernel version >=5.2.8-1
    • Same symptoms as me, same call trace and warning location at dcn10_verify_allow_pstate_change_high
    • This person claims the symptoms started in kernel 5.2.8-1. Which kernel were they running before?
    • For me, downgrading is likely not an option. I recall on the Manjaro live USB (which may come with an older kernel), my wifi driver would start spitting errors and stop communicating with my "Qualcomm Atheros QCA9377" wireless adapter.
    • Unfortunately the person ■■■■■■■ their OS install, and never visited the forum again. I updated my laptop BIOS on 2019-09-17 (Inspiron 5485_2n1_5485_5585_2.2.3.exe). A newer BIOS has been released at https://www.dell.com/support/... but I did not install it yet.

Web search dcn10_verify_allow_pstate_change_high

Forum search amdgpu_dm_atomic_commit_tail

Irrelevant results

Upgrading to Linux 5.4

  • Sleep causes ACPI failure: Still happens.
  • Wake causes broken monitor output: IDK
  • Wake causes hard freeze: IDK.

inxi -Fx output:

System:    Host: dell-manjaro Kernel: 5.3.12-1-MANJARO x86_64 bits: 64 compiler: gcc v: 9.2.0 Desktop: KDE Plasma 5.17.3 
           Distro: Manjaro Linux 
Machine:   Type: Laptop System: Dell product: Inspiron 5585 v: 2.2.3 serial: <root required> 
           Mobo: Dell model: 0NTKCX v: A00 serial: <root required> UEFI: Dell v: 2.2.3 date: 06/13/2019 
Battery:   ID-1: BAT0 charge: 29.3 Wh condition: 37.0/42.0 Wh (88%) model: SMP-ATL-3.61 DELL VM73283 status: Unknown 
CPU:       Topology: Quad Core model: AMD Ryzen 7 3700U with Radeon Vega Mobile Gfx bits: 64 type: MT MCP arch: Zen+ rev: 1 
           L2 cache: 2048 KiB 
           flags: avx avx2 lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm bogomips: 36750 
           Speed: 1807 MHz min/max: 1400/2300 MHz Core speeds (MHz): 1: 1807 2: 3047 3: 1304 4: 1305 5: 1545 6: 2279 7: 1320 
           8: 1323 
Graphics:  Device-1: Advanced Micro Devices [AMD/ATI] Picasso vendor: Dell driver: amdgpu v: kernel bus ID: 05:00.0 
           Display: x11 server: X.Org 1.20.5 driver: amdgpu FAILED: ati unloaded: modesetting resolution: 1920x1080~60Hz 
           OpenGL: renderer: AMD RAVEN (DRM 3.33.0 5.3.12-1-MANJARO LLVM 9.0.0) v: 4.5 Mesa 19.2.6 direct render: Yes 
Audio:     Device-1: Advanced Micro Devices [AMD/ATI] Raven/Raven2/Fenghuang HDMI/DP Audio vendor: Dell driver: snd_hda_intel 
           v: kernel bus ID: 05:00.1 
           Device-2: Advanced Micro Devices [AMD] Family 17h HD Audio vendor: Dell driver: snd_hda_intel v: kernel 
           bus ID: 05:00.6 
           Sound Server: ALSA v: k5.3.12-1-MANJARO 
Network:   Device-1: Qualcomm Atheros QCA9377 802.11ac Wireless Network Adapter vendor: Dell driver: ath10k_pci v: kernel 
           bus ID: 02:00.0 
           IF: wlp2s0 state: up mac: c0:b5:d7:46:e1:e9 
           Device-2: Realtek RTL810xE PCI Express Fast Ethernet driver: r8169 v: kernel port: 2000 bus ID: 03:00.0 
           IF: enp3s0 state: down mac: e4:54:e8:02:1c:43 
           Device-3: Qualcomm Atheros type: USB driver: btusb bus ID: 3-2.3:4 
Drives:    Local Storage: total: 1.38 TiB used: 27.27 GiB (1.9%) 
           ID-1: /dev/nvme0n1 vendor: SK Hynix model: BC501 NVMe 512GB size: 476.94 GiB 
           ID-2: /dev/sda vendor: Toshiba model: MQ04ABF100 size: 931.51 GiB 
Partition: ID-1: / size: 900.82 GiB used: 27.27 GiB (3.0%) fs: ext4 dev: /dev/sda2 
           ID-2: swap-1 size: 15.01 GiB used: 0 KiB (0.0%) fs: swap dev: /dev/sda3 
Sensors:   System Temperatures: cpu: 61.0 C mobo: 36.0 C sodimm: 41.0 C gpu: amdgpu temp: 60 C 
           Fan Speeds (RPM): cpu: 0 
Info:      Processes: 229 Uptime: 38m Memory: 13.62 GiB used: 2.18 GiB (16.0%) Init: systemd Compilers: gcc: 9.2.0 
           clang: 9.0.0 Shell: fish v: 3.0.2 inxi: 3.0.36 
❰nyanpasu64❙~/Documents/manjaro dell issues(git≠master)❱✔≻ pacman -Ss amdgpu
core/mhwd-amdgpu 1.2.1-1 [installed]
    MHWD module-ids for amdgpu
extra/xf86-video-amdgpu 19.1.0-1 (xorg-drivers) [installed]
    X.org amdgpu video driver
community/amdgpu-experimental 20180518-1
    Enables experimental features (exp_hw_support, cik_support, si_support, deep_color, dc)

What's amdgpu-experimental?

have you tried
linux-amd-raven
kernel from aur
it will solve your vega gpu problem
at least it does for me ryzen 3400g vega 11
no glitches no freezes
everything runs smooth.
manjaro kernel on other hand is almost useless.
cant even play videos
dont talk about gaming.


I'm installing Linux 5.5 from Manjaro unstable. If it doesn't work, I'll try out https://aur.archlinux.org/packages/linux-amd-raven/.

EDIT: Linux 5.5 RC was even worse than 5.4. I get corrupted display output instead of a functioning desktop manager. (Keep in mind I didn't update the rest of my system to unstable.)

I'm testing linux-amd-raven, but it seems to no longer have actual patches relative to upstream 5.4, so I'm not optimistic about it working.

Have you tried iommu=soft or iommu=pt?

I have a Git repository of various crash dumps. Reposting for added visibility: https://gitlab.com/nyanpasu64/manjaro-dell-issues

I set my BIOS back to "Force S3 Sleep", since "OS Automatic Configuration" didn't help Linux suspend, and broke Windows sleep.

On 5.5rc1 (Manjaro unstable combined with all other packages from stable), I get scrambled prior VRAM (including scrambled Windows screen contents, and fragments of intact Linux terminals). Pressing Caps Lock only sometimes toggles the light. The system does not appear to respond to Ctrl Alt F2.

Linux 5.5rc1 journalctl log on Gitlab. The issue seems to be a result of amdgpu, not iommu.

[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=20, emitted seq=23
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Xorg pid 816 thread Xorg:cs0 pid 1036
amdgpu 0000:05:00.0: GPU reset begin!
amdgpu 0000:05:00.0: GPU reset succeeded, trying to resume
[drm:gmc_v9_0_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for sem acquire in VM flush!
[drm:gmc_v9_0_hw_init [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[drm] PCIE GART of 1024M enabled (table at 0x000000F400900000).
[drm:gmc_v9_0_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for sem acquire in VM flush!
[drm:amdgpu_gart_bind [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
(last 2 lines repeat)

I get the same issues with iommu=soft/pt.

you can try with testing

Linux-raven is kind of BS.
This is where all the latest good s*** sits

https://aur.archlinux.org/packages/linux-amd-staging-drm-next-git/

you can try with testing

No difference, same issue as before. https://gitlab.com/nyanpasu64/manjaro-dell-issues/blob/master/2019-12-14/linux55%20manjaro%20testing%20login%20screen#L1391-1399

After updating to testing, I booted Linux 5.4, logged in (Xorg set to DRI3, with my force-DRI2 Xorg.conf commented out), and received failed to write reg errors again. The laptop froze after I logged in (though journalctl was running and recording power button events). I thought this was fixed on 5.4. Apparently not.

After rebooting with the same config (Linux 5.4, Xorg set to DRI3), I managed to log in fine. I have not tested sleeping yet. I have switched to DRI2 again for the time being; I hope it will have fewer issues.

AUR, linux-amd-staging-drm-next-git

I don't want to recompile the kernel from scratch. It takes a long time.

There really aren't many more ways to fix driver bugs...
And it's not like your hardware should take all that much time.

Good day - I am on manjaro testing and recently my external display stopped working. Since it looks like it is also a recent amdgpu issue, I just wanted to post it here to be worth a shot.

Have a look at early kernel modesetting for your drivers. See: https://wiki.archlinux.org/index.php/Kernel_mode_setting#Early_KMS_start
