Sporadic failure to open display (Plasma, X11)

Occasionally, with no obvious cause (in the middle of a session, not immediately after logging in), I stop being able to open X programs, either from the applications menu or from the command line. From the command line the error message is, for example:

$ xev
Authorization required, but no authorization protocol specified

xev:  unable to open display ':0'

All of the fixes that I’ve seen floating around for this error relate to remote applications, while in this case it is strictly local.
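
For a purely local session, this message generally means the client could not present a cookie the server accepts. A first check, sketched under the assumption of a bash-like shell, is to see which cookie file clients in the affected session would read and whether it still exists. The function name xauth_status is hypothetical, not a standard tool.

```shell
# Report which cookie file X clients started from this shell would use,
# and whether that file is missing, empty, or present.
# xauth_status is a hypothetical helper name.
xauth_status() {
    # Clients read $XAUTHORITY if it is set, otherwise ~/.Xauthority.
    local file="${XAUTHORITY:-$HOME/.Xauthority}"
    if [ ! -e "$file" ]; then
        echo "missing: $file"
    elif [ ! -s "$file" ]; then
        echo "empty: $file"
    else
        echo "present: $file"
    fi
}
```

Running xauth_status in a terminal of the broken session, and xauth list if the file is present, shows whether the cookie is still there. If $XAUTHORITY points somewhere other than ~/.Xauthority, changes to ~/.Xauthority would not be expected to affect that session.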

Basic system info:

Kernel:

$ uname -a
Linux red-kite 6.4.6-1-MANJARO #1 SMP PREEMPT_DYNAMIC Tue Jul 25 09:30:58 UTC 2023 x86_64 GNU/Linux

Video card:

01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Lexa PRO [Radeon 540/540X/550/550X / RX 540X/550/550X] (rev c7) (prog-if 00 [VGA controller])
	Subsystem: Sapphire Technology Limited Lexa PRO [Radeon 540/540X/550/550X / RX 540X/550/550X]

The only solution I’ve found is to log out and log in again (no need to reboot).

Any suggestions?

Added after re-logging in:
Needs Ctrl-Alt-Backspace to log out.

amdgpu module is loaded.

Looks like it’s something you did…
Create a new test user in System Settings, reboot, log in as the test user, and see if it has the same issue…

I really do not see how that could help.

This is a sporadic occurrence that happens only after I have been logged in for several days. The only immediately visible symptom is that when I try to start a program, it doesn’t start.

As it is an infrequent heisenbug (the average interval between occurrences is greater than the typical interval between kernel upgrades), the main question is whether there is a way to restore X access without logging out.

As @brahma already suggested: create a new user and log in as this new user. If the problem no longer exists with the new user, then something in the personal profile of the original user is broken. If the problem still exists with the new user too, then something in the general settings is at fault.
This is the first step in troubleshooting a mysterious problem.

I’m not clear how this would be done:

  1. Create a user, use “Switch user” to access it when the problem manifests. My guess is that this is unlikely to be possible, since logging out needed Ctrl-Alt-Backspace.

  2. Create a user and have parallel sessions, and use Ctrl-Alt-F2 to access the dummy user when the problem arises and see if the other session has it too.

  3. Work as the new user until the problem occurs (or doesn’t). Unfortunately this is not practical: this is a working machine where I need my settings to work effectively, the problem has so far happened only 2 or 3 times in as many months, and copying all of ~/.config and ~/.local would defeat the object of the exercise.

Well if you are not willing to trouble-shoot an error that happens “after several days of login” without a clean user config, how do you expect to find the cause?

It could be literally anything in your configs or in the applications you use that messes stuff up. :woman_shrugging:

Plus your problem seems to be related to the window manager but you didn’t provide any info about that…

inxi -zSG

Exactly, it could be almost anything.

So the only realistic chance of an explanation was that it was something reasonably well known. I was more hoping that someone might know of an X reset command that could circumvent the need to log out.
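
One avenue for restoring access without logging out, sketched here and not tested on this machine: the X server is normally started with an -auth option naming the cookie file it trusts, and a client regains access if the matching cookie is merged back into the file the client reads. The helper below only pulls that path out of a server command line. The name server_auth_file is hypothetical, and the Xorg process name, the need for sudo, and the :0 display in the commented usage are assumptions.

```shell
# Print the argument that follows "-auth" in an X server command line.
# Returns non-zero if no -auth option is found.
# server_auth_file is a hypothetical helper name.
server_auth_file() {
    local prev="" arg
    # $1 is deliberately unquoted so the command line splits on whitespace.
    for arg in $1; do
        if [ "$prev" = "-auth" ]; then
            printf '%s\n' "$arg"
            return 0
        fi
        prev="$arg"
    done
    return 1
}

# Possible usage from a text console (Ctrl-Alt-F2) while the session is broken:
#   auth=$(server_auth_file "$(ps -C Xorg -o args=)")
#   sudo xauth -f "$auth" list                          # inspect the server's cookie
#   sudo xauth -f "$auth" extract - :0 | xauth merge -  # copy it to the user's file
```

Whether the extract step matches an entry depends on how the display manager wrote the server's file; if it matches nothing, the cookie shown by the list step can be added by hand with xauth add.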

FWIW:

inxi -zSG
System:
  Kernel: 6.4.6-1-MANJARO arch: x86_64 bits: 64 Desktop: KDE Plasma v: 5.27.6
    Distro: Manjaro Linux
Graphics:
  Device-1: AMD Lexa PRO [Radeon 540/540X/550/550X / RX 540X/550/550X]
    driver: amdgpu v: kernel
  Display: x11 server: X.Org v: 21.1.8 with: Xwayland v: 23.1.2 driver: X:
    loaded: amdgpu unloaded: modesetting,radeon dri: radeonsi gpu: amdgpu
    resolution: 2560x1440
  API: OpenGL v: 4.6 Mesa 23.0.4 renderer: AMD Radeon RX 550 / 550 Series
    (polaris12 LLVM 15.0.7 DRM 3.52 6.4.6-1-MANJARO)

Window manager:

Version
=======
KWin version: 5.27.6
Qt Version: 5.15.10
Qt compile version: 5.15.10
XCB compile version: 1.15

Operation Mode: X11 only

Might be worth me trying cuda_memtest (which is supposed to work for AMD GPUs as well as Nvidia).

You can delete the .Xauthority file from your home folder (if there is more than one, delete them all), then immediately reboot and see if it still happens…
Also check whether you have an xorg.conf in here:
find /etc/X11/ -name "*.conf"
or better, post the output here…
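
A cautious variant of the deletion step, as a sketch only: move the files aside instead of removing them, so they can be inspected or restored later. The function name stash_xauthority and the backup directory name are hypothetical.

```shell
# Move every .Xauthority* file in a directory into a backup subdirectory
# instead of deleting it, and print each file that was moved.
# stash_xauthority and "xauthority-backup" are hypothetical names.
stash_xauthority() {
    local dir="${1:-$HOME}"
    local backup="$dir/xauthority-backup"
    local f
    for f in "$dir"/.Xauthority*; do
        [ -f "$f" ] || continue    # also skips the case where nothing matched
        mkdir -p "$backup"
        mv -- "$f" "$backup/"
        printf 'moved %s\n' "$f"
    done
}
```

Called with no argument it works on the home folder; as with deleting, a reboot or fresh login afterwards lets the display manager write a new cookie file.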

Thanks for those ideas. I’ll give removing .Xauthority a try; that rings a bell from very old problems with remote applications (late ’90s or early 2000s).

There is no xorg.conf file to be seen.

$ find /etc/X11/ -name "*.conf"
/etc/X11/xorg.conf.d/00-keyboard.conf
/etc/X11/xorg.conf.d/30-touchpad.conf

(Interestingly, I don’t have a touchpad.)

Good, there should be no xorg.conf there…
OK, so try it with the .Xauthority and see how it works…

I’ll do that when I’m next going to reboot (either because of an update or a recurrence of the problem).

Aside: The disappearance of xorg.conf is something of a reminder of how far we have come in the last quarter-century or so. I remember manually editing XFree86 modelines in Red Hat 4.2 to get the display area to map correctly to the CRT monitor.

I did remove .Xauthority before the next reboot.

However, the problem recurred tonight. There were no messages at all in dmesg or in Xorg.log for at least the 12 hours before it.
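
With nothing in either log, one way to narrow down the moment of failure is to record the state of the cookie file at intervals and look back after the next recurrence. This is a sketch under assumptions: auth_fingerprint is a hypothetical helper, and the ten-minute interval and log file name in the commented loop are arbitrary choices.

```shell
# Print a one-line fingerprint of a cookie file: "missing" if it does not
# exist, otherwise the checksum and byte count reported by cksum.
# auth_fingerprint is a hypothetical helper name.
auth_fingerprint() {
    if [ -e "$1" ]; then
        cksum < "$1"
    else
        echo "missing"
    fi
}

# Possible usage, left running in a terminal of the session:
#   while sleep 600; do
#       printf '%s %s\n' "$(date '+%F %T')" \
#           "$(auth_fingerprint "${XAUTHORITY:-$HOME/.Xauthority}")"
#   done >> "$HOME/xauth-watch.log"
```

A change from a checksum to "missing", or to a different checksum, shortly before programs stop opening would point at the cookie file itself; an unchanged line would point elsewhere.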