Boot to black (Not a metal song)

I'm going to put this in the #newbies area because I feel like one (And let's face it, I am!) with this stupid issue.
On the suggestion of @openminded I'm going to ask for assistance.
In the end this will probably be some simple thing I missed, I'm sure.
Hold on to your butts! :wink:


Frank

$ inxi -Fxxxza

System:    Host: Frank Kernel: 5.2.8-1-MANJARO x86_64 bits: 64 compiler: gcc v: 9.1.0 
           parameters: BOOT_IMAGE=/boot/vmlinuz-5.2-x86_64 root=UUID=910dd8c2-76dd-4e17-963d-e10d25c1c445 
           rw apparmor=1 security=apparmor 
           Desktop: KDE Plasma 5.16.4 tk: Qt 5.13.0 wm: kwin_x11 dm: SDDM Distro: Manjaro Linux 
Machine:   Type: Desktop Mobo: MSI model: 970A-G43 (MS-7693) v: 3.0 serial: <filter> 
           UEFI: American Megatrends v: 10.6 date: 01/08/2016 
CPU:       Topology: 6-Core model: AMD FX-6300 bits: 64 type: MCP arch: Bulldozer family: 15 (21) 
           model-id: 2 stepping: N/A microcode: 6000852 L2 cache: 2048 KiB 
           flags: avx lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm bogomips: 42019 
           Speed: 3500 MHz min/max: N/A Core speeds (MHz): 1: 3500 2: 3500 3: 3500 4: 3500 5: 3500 6: 3500 
           Vulnerabilities: Type: l1tf status: Not affected 
           Type: mds status: Not affected 
           Type: meltdown status: Not affected 
           Type: spec_store_bypass mitigation: Speculative Store Bypass disabled via prctl and seccomp 
           Type: spectre_v1 mitigation: usercopy/swapgs barriers and __user pointer sanitization 
           Type: spectre_v2 
           mitigation: Full AMD retpoline, IBPB: conditional, STIBP: disabled, RSB filling 
Graphics:  Device-1: AMD Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] vendor: XFX Pine 
           driver: amdgpu v: kernel bus ID: 01:00.0 chip ID: 1002:67df 
           Display: x11 server: X.Org 1.20.5 driver: amdgpu FAILED: ati unloaded: modesetting,radeon 
           alternate: fbdev,vesa compositor: kwin_x11 resolution: 2560x1080~60Hz 
           OpenGL: renderer: Radeon RX 570 Series (POLARIS10 DRM 3.32.0 5.2.8-1-MANJARO LLVM 8.0.1) 
           v: 4.5 Mesa 19.1.4 direct render: Yes 
Audio:     Device-1: AMD SBx00 Azalia vendor: Micro-Star MSI driver: snd_hda_intel v: kernel 
           bus ID: 00:14.2 chip ID: 1002:4383 
           Device-2: AMD Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590] vendor: XFX Pine 
           driver: snd_hda_intel v: kernel bus ID: 01:00.1 chip ID: 1002:aaf0 
           Sound Server: ALSA v: k5.2.8-1-MANJARO 
Network:   Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet vendor: Micro-Star MSI 
           driver: r8169 v: kernel port: d000 bus ID: 05:00.0 chip ID: 10ec:8168 
           IF: enp5s0 state: down mac: <filter> 
           Device-2: Ralink RT5372 Wireless Adapter type: USB driver: rt2800usb bus ID: 3-1:2 
           chip ID: 148f:5372 
           IF: wlp0s19f2u1 state: up mac: <filter> 
Drives:    Local Storage: total: 1.43 TiB used: 154.60 GiB (10.6%) 
           ID-1: /dev/sda vendor: Western Digital model: WD3200BEKT-75PVMT1 size: 298.09 GiB block size: 
           physical: 512 B logical: 512 B speed: 3.0 Gb/s rotation: 7200 rpm serial: <filter> rev: 1A01 
           scheme: GPT 
           ID-2: /dev/sdb vendor: Mushkin model: MKNSSDSR250GB size: 232.89 GiB block size: 
           physical: 512 B logical: 512 B speed: 6.0 Gb/s serial: <filter> rev: 2A0 scheme: GPT 
           ID-3: /dev/sdc vendor: Samsung model: HD102UJ size: 931.51 GiB block size: physical: 512 B 
           logical: 512 B speed: 3.0 Gb/s serial: <filter> rev: 1113 scheme: GPT 
Partition: ID-1: / raw size: 232.59 GiB size: 227.94 GiB (98.00%) used: 72.44 GiB (31.8%) fs: ext4 
           dev: /dev/sdb2 
Sensors:   System Temperatures: cpu: 16.1 C mobo: N/A gpu: amdgpu temp: 30 C 
           Fan Speeds (RPM): N/A gpu: amdgpu fan: 2825 
Info:      Processes: 233 Uptime: 2h 11m Memory: 23.49 GiB used: 2.90 GiB (12.4%) Init: systemd v: 242 
           Compilers: gcc: 9.1.0 clang: 8.0.1 Shell: zsh v: 5.7.1 running in: konsole inxi: 3.0.35 

It started with the Testing update on 03/08/19, and I posted about it here.
It has continued from that point. After that, booting to a black screen only happened when I updated the kernel. Then, with the 10/08/19 update, it began happening every time I started my computer each day. I don't use suspend because it flakes out KDE and I gave up trying to get it to work.

So what have I done (Shh, that was rhetorical.)?

  • Journalctl

I posted this for @openminded.

This is the typical Journalctl entry I get
$ journalctl --no-pager --no-hostname -xb -1 -p3
-- Logs begin at Fri 2019-07-19 21:54:48 CDT, end at Mon 2019-08-12 03:58:43 CDT. --
-- No entries --
***
$ journalctl --no-pager --no-hostname -xb -2 -p3
-- Logs begin at Fri 2019-07-19 21:54:48 CDT, end at Mon 2019-08-12 03:59:01 CDT. --
Aug 12 02:45:51 systemd-udevd[398]: could not read from '/sys/module/pcc_cpufreq/initstate': No such device  
Aug 12 02:46:26 kernel: watchdog: watchdog0: watchdog did not stop!
***
$ journalctl --no-pager --no-hostname -xb -3 -p3
-- Logs begin at Fri 2019-07-19 21:54:48 CDT, end at Mon 2019-08-12 03:59:13 CDT. --
Aug 12 02:45:31 kernel: watchdog: watchdog0: watchdog did not stop!
***
$ journalctl --no-pager --no-hostname -xb -4 -p3
-- Logs begin at Fri 2019-07-19 21:54:48 CDT, end at Mon 2019-08-12 04:00:22 CDT. --
Aug 12 02:44:03 kernel: watchdog: watchdog0: watchdog did not stop!
***
$ journalctl --no-pager --no-hostname -xb -5 -p3
-- Logs begin at Fri 2019-07-19 21:54:48 CDT, end at Mon 2019-08-12 04:00:13 CDT. --
Aug 12 00:27:11 kernel: watchdog: watchdog0: watchdog did not stop!

The most recent one is where it booted fine.
Btw, #2 has an error about cpu freq. I've never seen that before and it hasn't happened since. :woman_shrugging:
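
The five commands above differ only in the boot offset, so they can be run in a loop. This is a minimal sketch, assuming a POSIX shell; `boot_errors` is a made-up helper name, not something systemd ships, and it drops the `-x` explanatory text to keep the output short.

```shell
# List priority-3 (error) and worse messages for the N boots before the current one.
# boot_errors is a hypothetical helper; N defaults to 5.
boot_errors() {
    count="${1:-5}"
    i=1
    while [ "$i" -le "$count" ]; do
        printf '=== boot -%s ===\n' "$i"
        # -b -N selects the Nth previous boot; -p 3 limits output to err and above
        journalctl --no-pager --no-hostname -b "-$i" -p 3
        i=$((i + 1))
    done
}
```

Calling `boot_errors 5` should print the same entries as the five separate runs, each under its own header.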

To get Frank to boot I have to go to the Grub Advanced menu and re-select the kernel I'm already booting by default (linux52).
It happens with every kernel I've tried as well.


  • Grub config
GRUB_DEFAULT=saved
GRUB_TIMEOUT=5
GRUB_TIMEOUT_STYLE=menu
GRUB_DISTRIBUTOR='Manjaro'
GRUB_CMDLINE_LINUX_DEFAULT="apparmor=1 security=apparmor"
GRUB_CMDLINE_LINUX=""

# If you want to enable the save default function, uncomment the following
# line, and set GRUB_DEFAULT to saved.
GRUB_SAVEDEFAULT=true

# Preload both GPT and MBR modules so that they are not missed
GRUB_PRELOAD_MODULES="part_gpt part_msdos"

# Uncomment to enable booting from LUKS encrypted devices
#GRUB_ENABLE_CRYPTODISK=y

# Uncomment to use basic console
GRUB_TERMINAL_INPUT=console

# Uncomment to disable graphical terminal
#GRUB_TERMINAL_OUTPUT=console

# The resolution used on graphical terminal
# note that you can use only modes which your graphic card supports via VBE
# you can see them in real GRUB with the command 'videoinfo'
GRUB_GFXMODE=auto

# Uncomment to allow the kernel use the same resolution used by grub
GRUB_GFXPAYLOAD_LINUX=keep

# Uncomment if you want GRUB to pass to the Linux kernel the old parameter
# format "root=/dev/xxx" instead of "root=/dev/disk/by-uuid/xxx"
#GRUB_DISABLE_LINUX_UUID=true

# Uncomment to disable generation of recovery mode menu entries
GRUB_DISABLE_RECOVERY=true

# Uncomment and set to the desired menu colors.  Used by normal and wallpaper
# modes only.  Entries specified as foreground/background.
GRUB_COLOR_NORMAL="light-gray/black"
GRUB_COLOR_HIGHLIGHT="green/black"

# Uncomment one of them for the gfx desired, a image background or a gfxtheme
#GRUB_BACKGROUND="/usr/share/grub/background.png"
GRUB_THEME="/usr/share/grub/themes/manjaro/theme.txt"

# Uncomment to get a beep at GRUB start
#GRUB_INIT_TUNE="480 440 1"
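
Since this config sets GRUB_DEFAULT=saved together with GRUB_SAVEDEFAULT=true, GRUB should remember whichever entry was booted last, so it may be worth checking what it has actually stored. A small sketch, assuming `grub-editenv` is installed and uses its default environment file; `saved_entry_from_env` is a hypothetical helper name.

```shell
# Print the value of saved_entry from "grub-editenv list" output on stdin.
# saved_entry_from_env is a hypothetical helper, not part of GRUB.
saved_entry_from_env() {
    sed -n 's/^saved_entry=//p'
}

# Typical use (grub-editenv reads /boot/grub/grubenv by default):
#   grub-editenv list | saved_entry_from_env
```

If the printed entry is not the one you expect after re-selecting a kernel from the Advanced menu, that would at least narrow down where to look.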


  • mkinitcpio config
# vim:set ft=sh
# MODULES
# The following modules are loaded before any boot hooks are
# run.  Advanced users may wish to specify all system modules
# in this array.  For instance:
#     MODULES=(piix ide_disk reiserfs)
MODULES=""

# BINARIES
# This setting includes any additional binaries a given user may
# wish into the CPIO image.  This is run last, so it may be used to
# override the actual binaries included by a given hook
# BINARIES are dependency parsed, so you may safely ignore libraries
BINARIES=()

# FILES
# This setting is similar to BINARIES above, however, files are added
# as-is and are not parsed in any way.  This is useful for config files.
FILES=""

# HOOKS
# This is the most important setting in this file.  The HOOKS control the
# modules and scripts added to the image, and what happens at boot time.
# Order is important, and it is recommended that you do not change the
# order in which HOOKS are added.  Run 'mkinitcpio -H <hook name>' for
# help on a given hook.
# 'base' is _required_ unless you know precisely what you are doing.
# 'udev' is _required_ in order to automatically load modules
# 'filesystems' is _required_ unless you specify your fs modules in MODULES
# Examples:
##   This setup specifies all modules in the MODULES setting above.
##   No raid, lvm2, or encrypted root is needed.
#    HOOKS=(base)
#
##   This setup will autodetect all modules for your system and should
##   work as a sane default
#    HOOKS=(base udev autodetect block filesystems)
#
##   This setup will generate a 'full' image which supports most systems.
##   No autodetection is done.
#    HOOKS=(base udev block filesystems)
#
##   This setup assembles a pata mdadm array with an encrypted root FS.
##   Note: See 'mkinitcpio -H mdadm' for more information on raid devices.
#    HOOKS=(base udev block mdadm encrypt filesystems)
#
##   This setup loads an lvm2 volume group on a usb device.
#    HOOKS=(base udev block lvm2 filesystems)
#
##   NOTE: If you have /usr on a separate partition, you MUST include the
#    usr, fsck and shutdown hooks.
HOOKS="base udev autodetect modconf block keyboard keymap filesystems"

# COMPRESSION
# Use this to compress the initramfs image. By default, gzip compression
# is used. Use 'cat' to create an uncompressed image.
#COMPRESSION="gzip"
#COMPRESSION="bzip2"
#COMPRESSION="lzma"
#COMPRESSION="xz"
#COMPRESSION="lzop"
#COMPRESSION="lz4"

# COMPRESSION_OPTIONS
# Additional options for the compressor
#COMPRESSION_OPTIONS=()


  • Memtest passed (twice)
  • Changed mice (was a Razer, now a Corsair) so I could uninstall OpenRazer-dkms in case that was an issue. I hate that program anyhow. :stuck_out_tongue:
  • Checked every log I could find. Not that I always knew what I was reading, but I searched for any kind of error.
  • Removed quiet most recently to try and see if I can spot where it hangs. Desperate times and all. It booted without hanging. So I rebooted another 11 times and it never hung. However, I don't trust this and I'm willing to bet it'll happen next time there's a kernel update.
    And I need to know WTF is happening. Between this and the other twitchy issues I'm having with KDE it's starting to grate.
    I'm sure I've forgotten to add things. Please let me know anything you need.
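
For the quiet test in the last bullet, it helps to confirm the parameter is really gone from the running kernel's command line. A minimal sketch, assuming the edit was made in /etc/default/grub and the GRUB config was regenerated afterwards (on Manjaro that is normally `sudo update-grub`); `has_quiet` is a made-up helper name.

```shell
# Report whether a kernel command line contains the "quiet" parameter.
# has_quiet is a hypothetical helper; it matches whole words only.
has_quiet() {
    case " $1 " in
        *" quiet "*) return 0 ;;
        *) return 1 ;;
    esac
}

# /proc/cmdline holds the parameters the running kernel was booted with.
if [ -r /proc/cmdline ]; then
    if has_quiet "$(cat /proc/cmdline)"; then
        echo "quiet is set: boot messages are hidden"
    else
        echo "quiet is not set: boot messages are shown"
    fi
fi
```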

Thank you.

1 Like

This seems like a clue. It's as if your kernel was selecting the wrong graphics driver. :thinking:

1 Like

That's normal for AMD GPUs. It's using the amdgpu driver and not the legacy ATI driver, so the ati one fails to load.
When I first started using Manjaro, that scared the crap out of me until it was explained to me. :grin:

1 Like

Going out on a limb, could it be an intermittently failing power supply? :thinking:

1 Like

I highly doubt it. It's a 5-month-old Seasonic 850 Focus Gold.
However, just to be sure, later today I'll pull it and stick it on my tester. See what kind of ripple I get.


It's frustrating, isn't it?
Can you think of any other error logs system-wide I can check for boot messages?
I was under the impression that journal pretty much covers it. :woman_shrugging:

1 Like

Actually, no, I have no idea. :frowning: It is my impression that it would be hardware-related, but you could check what...

sudo dmesg

... tells you. :thinking:

I've had flaky hardware myself on a few occasions, and with the exception of my previous machine ─ a refurbished box that I knew was dying and whose behavior was rather predictable ─ the others all exhibited highly unpredictable behavior. A bit like Schrödinger's cat: sometimes it works, sometimes it doesn't. :man_shrugging:

1 Like

This is all I have for sudo dmesg | grep "Warning"

[  +0.000005] ACPI BIOS Warning (bug): Optional FADT field Pm2ControlBlock has valid Length but zero Address: 0x0000000000000000/0x1 (20190509/tbfadt-615)
[  +0.000001] Warning! ehci_hcd should always be loaded before uhci_hcd and ohci_hcd, not after

I think the first is some BIOS bug (as it's labeled).
The second I have no idea about, but it always shows up under Warning.

There is no error at all.
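
One caveat with the grep above: `grep "Warning"` is case-sensitive, so it would miss lines that say "warning", "error" or "failed". A slightly wider net, as a sketch only; `filter_kernel_log` is a hypothetical helper name, and the keyword list is just a guess at what is worth seeing.

```shell
# Keep lines that mention warnings, errors or failures, ignoring case.
# Reads kernel log text on stdin so it works with dmesg or a saved log file.
# filter_kernel_log is a hypothetical helper name.
filter_kernel_log() {
    grep -iE 'warn|error|fail'
}

# Typical use (needs root on systems that restrict dmesg):
#   sudo dmesg | filter_kernel_log
# util-linux dmesg can also filter by log level instead of by text:
#   sudo dmesg --level=err,warn
```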


It could very well be flaky hardware. It works until it doesn't, right?

I've used my Diagnostics USB a couple of times testing things and everything passes. Of course, if a cap is going bad on the board, there's no way to test for that except looking.
So when I test the PSU later today I'll do a visual on the mobo.

/sigh

grumble grumble

1 Like

Is it a BIOS machine or a UEFI machine, and if the latter, does it boot in BIOS compatibility mode or in UEFI mode?

UEFI, I have all the drives in GPT also.

1 Like

Then it could indeed be a UEFI bug. :thinking:

1 Like

So, you think some change in how ACPI works happened in the kernel and it's flaking out my motherboard?

But... then why would it boot at all? Why does it boot once I re-select the kernel in the Grub Advanced menu?
Not discounting it, just... wtf? if that's what's happening. :rofl:

1 Like

I'm not sure, but it is possible that the loading of a kernel overwrites part of the exposed memory of the UEFI information.

All in all, it could be something like the enumeration of HDDs/SSDs by the kernel at boot time. This too is not guaranteed to be consistent, which is why we use LABELs or UUIDs (or PARTLABELs or PARTUUIDs) in /etc/fstab.
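
To see whether any /etc/fstab entry still relies on kernel enumeration order, something like the following could be used. It is only a sketch: `unstable_fstab_entries` is a made-up helper name, and it only looks for /dev/sdX and /dev/hdX style device names.

```shell
# Print the device and mount point of fstab entries that use /dev/sdX or
# /dev/hdX names, which depend on the order the kernel detects the drives in.
# Entries using UUID=, LABEL=, PARTUUID= or PARTLABEL= are not printed.
# unstable_fstab_entries is a hypothetical helper name.
unstable_fstab_entries() {
    awk '$1 ~ /^\/dev\/[sh]d[a-z]/ { print $1, $2 }'
}

# Typical use:
#   unstable_fstab_entries < /etc/fstab
# The stable identifiers for each partition can be listed with:
#   lsblk -o NAME,UUID,LABEL,PARTUUID
```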

Likewise, systemd now gives Ethernet adapters names like enp-something, whereas they used to be eth0, eth1, and so on. That old enumeration depended on the order of detection by the kernel at boot time, and some small glitch ─ a static charge remnant or whatever ─ could upset the order of the network adapters, and your system would suddenly regard the internet as being the LAN and vice versa.

There is always a certain randomness in the initialization of hardware at boot time, and this may even differ between cold-booting and warm-booting. So it's possible that you're seeing this kind of unpredictable behavior because of that particular UEFI bug.

On the other hand, there has indeed been a change in the ACPI framework (and even some other frameworks) between the 4.x and 5.x kernel generations. Among other things, this is why certain proprietary NVIDIA drivers ─ which NVIDIA itself has in the meantime stopped supporting ─ would no longer work on 5.x kernels, while they worked perfectly well on 4.x.

Mind you, I'm just thinking out loud here. By no means do I claim any expertise in this regard, and it has been ages since I last built a kernel from sources, albeit that I do pride myself in never having built a kernel that didn't boot. But when it comes to the kernel, things have changed so much over the years that I'd be loath trying to configure a vanilla kernel today.

1 Like

Well, I did some Google-fu. Red Hat and the Arch forums seem to think it's not an issue unless you're using IOMMU and passthrough, and perhaps some specific virtualization.

There are a few posts about this specific error on the MSI forums as well, for a variety of boards, with users running various flavors of Linux. Most answers say to ignore it and that it's not an issue. meh!

Good to know. So let me throw a wrench into that. I tried linux419, linux414, and linux49. They do the same thing.

Absolutely. :grin: I'm pretty much at wits' end here with this, so I welcome any ideas or suggestions. You have experience, which is what I lack in Linux. I've built computers for a long time and pride myself on never buying a prebuilt desktop.

:purple_heart:


Uh oh. @dglt is typing. I'm about to get schooled. :stuck_out_tongue:

No, I'm not installing Optimus-switch!

1 Like

edit the MODULES line in /etc/mkinitcpio.conf so the amdgpu driver is preferred over radeon and gets loaded early, before the radeon/ati drivers are tried (and fail), then rebuild the initramfs

MODULES="amdgpu radeon"

sudo mkinitcpio -P
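
To confirm the rebuilt image actually contains the module, the file list from `lsinitcpio` (shipped with mkinitcpio) can be searched. A sketch only: `image_has_module` is a hypothetical helper name, and the image path in the usage comment is an assumption inferred from the vmlinuz-5.2-x86_64 name in the inxi output above.

```shell
# Succeed if the initramfs file list on stdin contains the named kernel module.
# Matches "<name>.ko" plus any compression suffix such as .xz or .gz.
# image_has_module is a hypothetical helper name.
image_has_module() {
    grep -q "/$1\.ko"
}

# Typical use (image path is an assumption, adjust to your kernel):
#   lsinitcpio /boot/initramfs-5.2-x86_64.img | image_has_module amdgpu \
#       && echo "amdgpu is in the image"
```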

3 Likes

Done.
Rebooting.
Be right back.
.
.
.
.
.
.
.
.
Back. Going to put quiet back into grub and try again, since it seems to hang only when I can't see what it's doing.
But it rebooted fine (without quiet). Manjaro is messing with me, I'm telling you!

1 Like

Booted fine both times.

I'm going to install linux53 and reboot and see if I can get it to do it again.

2 Likes

im curious if that gets rid of the "FAILED: ati" on inxi output

2 Likes

Holy crap! linux53 boots super fast. We like it!
But, it booted right up. Went through all 3 installed kernels and it boots right up. /sigh
I swear I'm not making this issue up! :rofl:

$ inxi -Gxxx
Graphics:  Device-1: AMD Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] vendor: XFX Pine 
           driver: amdgpu v: kernel bus ID: 01:00.0 chip ID: 1002:67df 
           Display: x11 server: X.Org 1.20.5 driver: amdgpu FAILED: ati unloaded: modesetting,radeon 
           alternate: fbdev,vesa compositor: kwin_x11 resolution: 2560x1080~60Hz 
           OpenGL: renderer: Radeon RX 570 Series (POLARIS10 DRM 3.33.0 5.3.0-1-MANJARO LLVM 8.0.1) 
           v: 4.5 Mesa 19.1.4 direct render: Yes 
2 Likes

this is common but not everyone using amdgpu shows this, i dont think it matters much anyhow.

so after a fresh install of 5.3, no more black screen?

1 Like

No black screen with any kernel.
It was only when I updated kernels at first, then after the update on the 10th of Aug. it was every time I rebooted.
Now it seems to have disappeared?
I guess I'll find out next kernel update.

I'm still going to test my PSU and check the Mobo for crap caps later. Just to be sure.

1 Like