X randomly failing to start at boot

GT 710 with nouveau
Manjaro Xfce, fully updated
kernel 5.4 and 5.9

Basically, X fails to start in about 50% of boots; even startx from the console fails.
This is a clean install: the NVIDIA proprietary drivers have never been installed and there has been no hardware change.

Xorg.0.log doesn’t really say much when it fails

    [     5.290] (II) modesetting: Driver for Modesetting Kernel Drivers: kms
    [     5.291] (EE) [drm] Failed to open DRM device for pci:0000:01:00.0: -19
    [     5.291] (EE) open /dev/dri/card0: No such file or directory
    [     5.291] (WW) Falling back to old probe method for modesetting
    [     5.291] (EE) open /dev/dri/card0: No such file or directory
    [     5.291] (EE) Screen 0 deleted because of no matching config section.
    [     5.291] (II) UnloadModule: "modesetting"
    [     5.291] (EE) Device(s) detected, but none match those in the config file.
    [     5.291] (EE) 
    Fatal server error:
    [     5.291] (EE) no screens found(EE) 
    [     5.291] (EE) 

This is what I see when it fails.
Full log: Xorg.0.log - Pastebin.com

dmesg shows only a minor difference between a boot where X starts and one where it fails. This log is from a boot where it works: DRM loaded at 5.0 and X reached the DRM part at 5.6, so there were no issues:

    [    3.526547] fb0: switching to nouveaufb from VESA VGA
    [    3.534662] nouveau 0000:01:00.0: NVIDIA GK208B (b060b0b1)
    [    3.818711] nouveau 0000:01:00.0: bios: version 80.28.b8.00.6d
    [    3.819225] nouveau 0000:01:00.0: fb: 2048 MiB GDDR5
    [    5.009272] nouveau 0000:01:00.0: DRM: VRAM: 2048 MiB
    [    5.009273] nouveau 0000:01:00.0: DRM: GART: 1048576 MiB
    [    5.009275] nouveau 0000:01:00.0: DRM: TMDS table version 2.0
    [    5.009276] nouveau 0000:01:00.0: DRM: DCB version 4.0
    [    5.009278] nouveau 0000:01:00.0: DRM: DCB outp 00: 01000f02 00020030
    [    5.009279] nouveau 0000:01:00.0: DRM: DCB outp 01: 02011f62 00020010
    [    5.009279] nouveau 0000:01:00.0: DRM: DCB outp 02: 02000f00 00000000
    [    5.009281] nouveau 0000:01:00.0: DRM: DCB conn 00: 00001030
    [    5.009281] nouveau 0000:01:00.0: DRM: DCB conn 01: 00002161
    [    5.009575] nouveau 0000:01:00.0: DRM: MM: using COPY for buffer copies
    [    5.280352] nouveau 0000:01:00.0: DRM: allocated 1920x1080 fb: 0x80000, bo 00000000e94edf95
    [    5.280990] fbcon: nouveaudrmfb (fb0) is primary device
    [    5.280993] nouveau 0000:01:00.0: fb0: nouveaudrmfb frame buffer device
    [    5.296982] [drm] Initialized nouveau 1.3.1 20120801 for 0000:01:00.0 on minor 0

The part that you see here at 5.00 is at 5.70+ when it fails. Compared to my other machine with a GT 1030, the DRM part is considerably slower: on the GT 1030 there is no 1.5-2 second delay, it basically follows in under 0.2 s.

lspci -knn (the output is the same in both cases, when it works and when it fails):

    01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GK208B [GeForce GT 710] [10de:128b] (rev a1)
    Subsystem: Gigabyte Technology Co., Ltd Device [1458:375a]
    Kernel driver in use: nouveau
    Kernel modules: nouveau

I kind of understand that it fails because the DRM part is not fully loaded when X tries to start. What I don’t understand is why it takes so long…
I can wait as long as I want and it will still be a black screen. Dropping to a console and trying to start X also results in a failure. The only thing that changes in Xorg.0.log is:

    parse_vt_settings: Cannot open /dev/tty0 (Permission denied)

By that point the DRM part is already loaded according to dmesg.

Any suggestions on how to speed up the lagging DRM part?
Alternatively, how do I add a delay before X starts? Hopefully the DRM part will be loaded by then…

The solution is to delay the lightdm service: basically, you need to make it start after the nouveau DRM part has loaded. It’s not really normal to try to start X/lightdm/the display manager before nouveau has finished loading everything that is supposed to load, so this is a workaround for the problem, not a fix. A proper solution would be for X/lightdm/the display manager to check first whether the GPU driver has finished loading and only then try to start.

Yes, that complicates the code, but it also avoids situations where the GPU driver is slow to load and the display manager tries to start and fails badly because driver loading is not complete, as happens in my case. The average user doesn’t like ending up with a black screen and nothing else, and isn’t able to debug the problem. If you want Linux to get some proper market share, problems like the one I faced should be fixed. If they are not, ask yourself why you even bother making an OS for x86, because the share of users you are targeting is so low that it just isn’t worth the effort. In my case it’s nouveau, but I don’t see why it wouldn’t happen with any other GPU driver.

Yes, harsh words, because I was left in the dark with no help at all. Most new users wouldn’t come back to post the solution they found, because you kind of ask yourself why on earth you would.

Steps for the solution, for the case where the display manager starts (if it doesn’t start, replace mousepad with vi; I’m lucky and it starts in 50% of boots, but other users might not be that lucky and it might fail 100% of the time):

    sudo mousepad /etc/systemd/system/display-manager.service

and add the following line just below [Service]:

    ExecStartPre=/bin/sleep 7

Change that 7 to the highest first number (the timestamp) that you see in the output of

    sudo dmesg | grep nouveau

plus 1 or 2. So if the last line of that output is

    [    7.434109] [drm] Initialized nouveau 1.3.1 20120801 for 0000:01:00.0 on minor 0

change that 7 to 8 or 9.

In the end it will look like this:

    [Service]
    ExecStartPre=/bin/sleep 7
    ExecStart=/usr/bin/lightdm
    Restart=always
    IgnoreSIGPIPE=no
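
For anyone who would rather not edit the unit file itself, the same ExecStartPre line could probably go into a systemd drop-in instead. This is only a sketch and not what I did: write_delay_dropin and the file name delay.conf are made-up names, and I’m assuming the unit behind display-manager.service is lightdm.service.

```shell
#!/bin/sh
# Sketch: write a systemd drop-in that adds a pre-start delay to a unit.
# write_delay_dropin is a hypothetical helper; delay.conf is an arbitrary name.
write_delay_dropin() {
    unit="$1"                          # e.g. lightdm.service (assumed unit name)
    seconds="$2"                       # e.g. 7
    root="${3:-/etc/systemd/system}"   # base directory, overridable for testing
    dir="$root/$unit.d"
    mkdir -p "$dir" || return 1
    # An ExecStartPre= line in a drop-in is added to the unit's existing ones.
    printf '[Service]\nExecStartPre=/bin/sleep %s\n' "$seconds" > "$dir/delay.conf"
}
```

Run as root it would be called as write_delay_dropin lightdm.service 7, followed by systemctl daemon-reload so systemd picks up the new file.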

I used 7 because when X fails to start, the nouveau DRM part finishes loading before the 6 s mark, so I just wanted to make sure it had finished loading.
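
The rule for picking the number (highest dmesg timestamp plus 1 or 2) can be sketched as a small shell function. suggest_delay is a made-up name, and the sketch assumes every input line starts with the usual dmesg timestamp in square brackets.

```shell
#!/bin/sh
# Hypothetical helper: reads dmesg-style lines on stdin and prints a suggested
# sleep value: whole seconds of the highest timestamp seen, plus a margin.
suggest_delay() {
    margin="${1:-2}"    # seconds added on top; defaults to 2
    awk -v margin="$margin" '
        # Only consider lines that begin with a "[   seconds.micros]" timestamp.
        match($0, /^\[ *[0-9]+\.[0-9]+\]/) {
            ts = substr($0, RSTART + 1, RLENGTH - 2) + 0
            if (!seen || ts > max) max = ts
            seen = 1
        }
        END {
            if (!seen) exit 1          # no timestamped line found
            print int(max) + margin
        }
    '
}
```

Usage would be sudo dmesg | grep nouveau | suggest_delay, or suggest_delay 1 at the end for a margin of 1 instead of 2.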

Now Xorg.0.log starts at ~10 seconds:

    [    10.665] Build Operating System: Linux Manjaro Linux

so everything is fine, because by then the nouveau DRM part is loaded.

This is only a workaround, because that 7 needs to be adjusted for each situation, and you can’t really set it to a high value for everyone because in some cases it would just delay the start of the display manager for no reason at all. The proper solution is that X/lightdm/the display manager, when starting, should check whether the GPU driver has finished loading; if not, wait 1 s, check again, and so on until the driver has finished loading. (I’m assuming here that the driver actually says something when it finishes loading. There is no reason to say "successfully": if it fails during loading, it doesn’t finish loading, simple as that. If the driver doesn’t signal this, that part should be added as well.)
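
The check-wait-check loop described above could look roughly like this in shell. It is a sketch, not something I tested on the failing machine: wait_for_path is a made-up name, and it assumes that /dev/dri/card0 (the file X complains about in the failing Xorg.0.log) showing up means the driver has finished loading.

```shell
#!/bin/sh
# Hypothetical helper: polls once per second until the given path exists,
# giving up after a timeout. Both defaults are assumptions.
wait_for_path() {
    path="${1:-/dev/dri/card0}"   # device node X failed to open in the log
    timeout="${2:-30}"            # maximum number of seconds to wait
    waited=0
    while [ ! -e "$path" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "timed out waiting for $path" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

Put in a script that calls wait_for_path, it could be run from ExecStartPre= in place of the fixed sleep, so the wait is only as long as the driver actually needs.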

P.S.: I had to remove some links from the files because I’m not allowed to post links. Not really my fault that the config/log files include links…
