My Manjaro installation often fails to boot, I’d say about 50% of the time. I compared the output of journalctl -b (the latest successful boot) with journalctl -b -1 (the unsuccessful boot before it). The unsuccessful boot’s log doesn’t show any explicit errors and ends with the following output:
systemd-journald[297]: Time spent on flushing to /var/log/journal/5fc951759f6d4b6a8ccd4f1fb4460120 is 22.272ms for 800 entries.
The successful boot goes on like this:
systemd-journald[299]: Time spent on flushing to /var/log/journal/5fc951759f6d4b6a8ccd4f1fb4460120 is 7.713ms for 802 entries.
systemd-journald[299]: System Journal (/var/log/journal/5fc951759f6d4b6a8ccd4f1fb4460120) is 4.0G, max 4.0G, 0B free.
Does this mean the failure happens in systemd-journald? Or is it just that the log wasn’t properly flushed during the unsuccessful boot?
If the log is cut off short, then what can I do to investigate the boot failure?
After I select the kernel in the grub menu there’s no visible output on screen. Login screen never appears. Ctrl+Alt+Fn doesn’t work. As I mentioned in my question, the journal seems to be truncated, or else something dies without producing any log output. I can see the point where systemd was started but soon after the log ends.
OK, I removed quiet. I saw a quick flash of targets all with the green OK next to them. Then the screen turned black with the cursor in the top left. And that was it. I could hear the fan working loudly. journalctl -b -1 now looks different. This is how it ends.
Jan 14 17:06:31 Jaguar systemd-logind[475]: Power key pressed.
Jan 14 17:06:31 Jaguar systemd-logind[475]: Powering Off…
Jan 14 17:06:31 Jaguar systemd-logind[475]: System is powering down.
This means the system was still up when I lost patience and pressed the power button.
It looks like the system fails to switch to graphical mode. I found this in the journal:
Jan 14 17:02:00 Jaguar kernel: nvidia 0000:01:00.0: can’t change power state from D3cold to D0 (config space inaccessible)
Jan 14 17:02:00 Jaguar kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x23:0x56:574)
Jan 14 17:02:00 Jaguar kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
Jan 14 17:02:00 Jaguar kernel: kernel read not supported for file pci0000:00/0000:00:01.0/0000:01:00.0/config (pid: 516 comm: Xorg)
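To pull lines like these out of a long boot log, a small filter along the following lines may help. It is only a sketch: the function name gpu_errors is made up for illustration, and the pattern covers just the kinds of messages quoted above, so other driver errors could slip through.

```shell
#!/bin/sh
# Hypothetical helper: keep only log lines that mention the NVIDIA driver
# or a failed change out of the D3 power state. Reads log text on stdin.
gpu_errors() {
    grep -iE 'nvidia|NVRM|power state from D3'
}

# Typical use, on the kernel messages of the previous boot:
#   journalctl -b -1 -k --no-pager | gpu_errors
```

If the filter prints nothing for a failed boot, the GPU is probably not the culprit for that particular boot and the full log is worth reading.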
Give details of your system (inxi) and full boot logs. You have some NVIDIA problem there, according to the few log lines you pasted. NVIDIA is notoriously problematic with its closed, proprietary driver. You could have bought a Radeon.
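One possible way to gather both in a single file for posting is sketched below. The function name collect_report and the default file name are hypothetical, and the inxi options shown (-F full report, -z filter private data, -a extra admin detail) are just one reasonable choice, not something requested above.

```shell
#!/bin/sh
# Sketch: write system details and the previous boot's journal to one file.
# collect_report and boot-report.txt are illustrative names only.
collect_report() {
    out="${1:-boot-report.txt}"
    : > "$out"                      # start with an empty file

    if command -v inxi >/dev/null 2>&1; then
        inxi -Fza >> "$out" 2>&1
    else
        echo "inxi not installed" >> "$out"
    fi

    echo "=== previous boot ===" >> "$out"
    if command -v journalctl >/dev/null 2>&1; then
        # -b -1 selects the boot before the current one
        journalctl -b -1 --no-pager >> "$out" 2>&1 \
            || echo "no journal for previous boot" >> "$out"
    else
        echo "journalctl not available" >> "$out"
    fi
}

# Typical use:
#   collect_report ~/boot-report.txt
```

Run it from a boot that succeeded right after a failed one, so that -b -1 really points at the failed boot.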
Windows Fast Startup uses hybrid hibernation instead of completely powering down devices.
Booting into Linux afterwards can fail because a device cannot be initialised from the hybrid hibernation state.
But if Linux is then restarted, the devices get powered down correctly and work OK on the 2nd boot,
so it could appear to be an intermittent issue happening about 50% of the time.
I found the same issue on the NVIDIA forum: https://forums.developer.nvidia.com/t/bug-cant-change-power-state-from-d3cold-to-d0-config-space-inaccessible-stuck-at-boot/112912
It looks like the problem is this: an earlier step in the boot process instructs the GPU to switch to D3cold. While the GPU is switching, it cannot accept another power-state change request. If the first switch finishes before the second request arrives, everything works fine. If not, we get that error.
The response from nvidia seems to be that the distro didn’t configure udev rules correctly…
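Assuming the udev rules in question are about runtime power management (the thread does not confirm this), the kernel's current setting for the card can be read from sysfs. In the sketch below, show_pci_power is a hypothetical name, the default address is the one from the log above, and the second argument exists only so the sysfs root can be swapped out. The power/control attribute reads "auto" when the kernel is allowed to suspend the device at runtime and "on" when it is kept powered.

```shell
#!/bin/sh
# Sketch: print the runtime power-management attributes of a PCI device.
# show_pci_power is an illustrative name, not an existing tool.
show_pci_power() {
    dev="${1:-0000:01:00.0}"
    root="${2:-/sys/bus/pci/devices}"
    for attr in control runtime_status; do
        f="$root/$dev/power/$attr"
        if [ -r "$f" ]; then
            printf '%s: %s\n' "$attr" "$(cat "$f")"
        else
            printf '%s: unavailable\n' "$attr"
        fi
    done
}

# Typical use:
#   show_pci_power 0000:01:00.0
```

Comparing the output after a good boot and, if a console is reachable, after a bad one might show whether the card was left suspended.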