Several times, after installing updates via the GUI update manager that included a graphics driver update, I have ended up with a system that boots to a black screen.
I have fixed this by booting into single user mode and then using mhwd to remove and then reinstall the graphics driver.
This has happened probably 4-5 times on multiple machines in the house, so it is pretty annoying, especially since one of these is a tablet attached to a wall and it is a hassle getting a keyboard attached. Even worse is when it happens on a PC hundreds of miles away at my parents' house, after they simply run an update.
What I’d like to understand is why this is happening, and how this can be fixed by the Manjaro team so that it stops happening. I am hesitant now to recommend Manjaro to non-technical people as I expect this problem will just occur to them. The update procedure should never result in a system which looks, for all intents and purposes, to a non-technical person as completely broken.
Do you guys know what I’m talking about? Anyone else get this issue periodically?
I’m just going to follow up here because I managed to log in to the remote machine (machine 4 in the above list) by getting my sister to set up a remote tunnel via a server, and I fixed it in the following manner:
There was no package for linux513-nvidia, but the machine was running linux513.
So I installed linux515 and then used mhwd -a pci nonfree 0300 to install the NVIDIA drivers.
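The two steps above can be sketched as a small helper. This is only a sketch: `print_recovery_plan` is a hypothetical name, and it prints the commands rather than running them, since both need root and depend on what the repositories currently offer. The `mhwd` call uses the `pci` bus argument from mhwd's documented `-a <usb|pci> <free|nonfree> <classid>` form.

```shell
# print_recovery_plan KERNEL_PKG
# Prints (does not run) the commands for the fix described above:
# install a supported kernel, then let mhwd pick the nonfree display driver.
print_recovery_plan() {
    kernel_pkg="$1"
    echo "sudo mhwd-kernel -i $kernel_pkg"
    echo "sudo mhwd -a pci nonfree 0300"   # 0300 = PCI class for display controllers
}

print_recovery_plan linux515
```

Reviewing the printed commands before running them by hand is deliberate: on a machine that already boots to a black screen, this has to be done from a console or over SSH anyway.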
All my sister did was follow the standard update dialog. And this is the same issue I’ve had before.
Can someone explain how this happens? Why doesn’t the update process prompt to change kernels if upgrading the system is going to break the graphics driver? Or is it because you want to decouple these two things? It’s not very intuitive for a non-technical user.
As I’ve said, this keeps happening to me, once or twice a year on at least one of the machines. I believe update should just work. My parents should just be able to click update and not expect this kind of issue. Well that’s what I think would be good for Manjaro anyway.
Regular (non-LTS) kernels have a lifespan of roughly 3 months, and thus need to be upgraded regularly. You can reduce how often you need to upgrade by using LTS kernels, which are supported for several years (up to 6).
I almost forgot. MSM should be installed by default, and AFAIK its default configuration should warn you whenever you run an EOL kernel.
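For a machine where nobody sees that warning, a rough manual check is possible from a shell. This is a minimal sketch with assumptions: the function names are made up for illustration, the mapping assumes Manjaro's `linuxXY` naming for standard kernels (for example 5.13 is `linux513`), and `pacman -Si` only reflects the locally synced repository databases.

```shell
# Map a kernel release string (as printed by `uname -r`) to the Manjaro
# package name, e.g. "5.13.19-2-MANJARO" -> "linux513".
# Assumes the standard linuxXY naming; realtime or custom kernels differ.
kernel_pkg_from_release() {
    release="$1"
    major="${release%%.*}"        # text before the first dot
    rest="${release#*.}"          # text after the first dot
    minor="${rest%%[!0-9]*}"      # leading digits of the remainder
    printf 'linux%s%s\n' "$major" "$minor"
}

# Succeeds if the package for the running kernel is still offered by the
# configured repositories. Not called here because it needs pacman.
running_kernel_in_repos() {
    pacman -Si "$(kernel_pkg_from_release "$(uname -r)")" >/dev/null 2>&1
}

kernel_pkg_from_release "5.13.19-2-MANJARO"   # prints linux513
```

On a Manjaro system, `running_kernel_in_repos || echo "running kernel is no longer in the repos"` would flag a dropped kernel series before the next update removes its driver modules.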
It's just a simple thing: you cannot run out-of-date and unsupported software and expect things to keep working.
And this issue in particular? You aren't alone … but that means that the little search button up at the top would return at least ~20 or more threads with these exact symptoms … and lo and behold … the same answers.
I’ll admit, I was annoyed before by Mr cscs and rage quit the thread. You got me bro, good trolling. But I’ll go ahead and try and think how this can be solved for the broader user base, since that was my initial motivation, not solving the problem for myself which is trivial.
I need to go back to why the system breaks in the first place. Please try and help me understand this, without stupid sarcastic comments, if you can just refrain from that for a while that would be useful.
I need to understand this in detail, and I don’t want to be confused, so I’ll start with some simple statements which I believe to be true:
Only patch updates are applied automatically to the kernel according to the documentation.
The graphics kernel module is specific to the kernel version.
The graphics kernel module is installed via a package rather than dynamically generated via DKMS.
Are these statements correct?
So I’m asking myself, what sequence of updates leads to a broken system, and the following questions present themselves:
If the kernel is automatically patched, is it possible that a graphics module is not available for that patched version?
If a user updates the kernel manually to a new version, on the next system update will the corresponding graphics module be installed?
Theoretically, as long as both the kernel version and the graphics module are maintained, yes.
But then, if the GPU manufacturer stops maintaining their old proprietary driver, it won’t receive updates while the kernel still does, which can lead to issues. Such cases are usually announced so that users can switch to another driver.
Likewise, when a kernel reaches EOL, the associated drivers are also dropped. But when trying to update the rest, including still-maintained kernels, such an update may fail due to broken dependencies.
If you install only the kernel, through the package manager, then no.
It is recommended to use msm or mhwd-kernel for installing a kernel, as those will install the drivers at the same time.
If you check linux*-nvidia’s dependencies, you can see that it needs not only the associated kernel version, but also packages that can keep on updating “independently” from the kernel. And since dependency requirements must stay valid, the package manager may need to remove the package in order to update the dependencies.
Besides this, since those dependencies can still be updated, they may move beyond what is compatible with the dropped packages, which then may not work correctly.
Looking at the depends of linux515-nvidia as an example (is there a way to look at the old repos?), the following are listed:
linux515
nvidia-utils=495.46
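For packages that are still in the repositories, the list above can be pulled out of `pacman -Si` (sync databases) or `pacman -Qi` (installed packages). A small sketch, assuming English output (`LC_ALL=C`) and a dependency list that fits on one line; `extract_depends` is a made-up helper name, and the demonstration runs on a canned sample shaped like pacman's output rather than on a real query.

```shell
# Print the dependency list from `pacman -Si` / `pacman -Qi` output on stdin.
# Only the "Depends On" line itself is handled, not wrapped continuation lines.
extract_depends() {
    sed -n 's/^Depends On *: *//p'
}

# Usage on a Manjaro system (not run here):
#   LC_ALL=C pacman -Si linux515-nvidia | extract_depends

# Demonstration on a canned sample using the two dependencies quoted above:
printf 'Name            : linux515-nvidia\nDepends On      : linux515  nvidia-utils=495.46\n' | extract_depends
```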
Presumably the package manager isn’t going to automatically remove linux515 if that’s the running kernel, so then nvidia-utils=495.46 would be the only other package.
Are you saying that nvidia-utils being updated could somehow result in the removal of linux515-nvidia?
Exactly.
Updating nvidia-utils may break the dependency for linux513-nvidia, and since the latter is no longer in the repository and so cannot be updated to match, the package manager will logically suggest removing linux513-nvidia.
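The effect of an exact version pin can be illustrated with a toy check. This is not how pacman resolves dependencies internally, only a sketch of the rule it enforces: the pin `nvidia-utils=495.46` is the one quoted earlier in the thread, the newer version `510.54-1` is a hypothetical stand-in, and epochs are ignored.

```shell
# pin_satisfied DEP INSTALLED
#   DEP        exact pin such as "nvidia-utils=495.46"
#   INSTALLED  installed version, optionally with a "-pkgrel" suffix
pin_satisfied() {
    required="${1#*=}"       # version after the "="
    installed="${2%-*}"      # drop a trailing "-pkgrel" if present
    [ "$required" = "$installed" ]
}

# Describe what the pin means for the kernel module package.
report() {
    if pin_satisfied "$1" "$2"; then
        echo "$1 satisfied by $2: module package can stay"
    else
        echo "$1 broken by $2: module package must be updated or removed"
    fi
}

report "nvidia-utils=495.46" "495.46-1"
report "nvidia-utils=495.46" "510.54-1"   # hypothetical newer nvidia-utils
```

For a maintained kernel series, the module package is rebuilt against the new nvidia-utils and the pin moves with it. For a dropped series there is no rebuilt package, so removal is the only way to keep the dependency graph consistent.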
The best you can do is to ensure the remote system is running on an LTS kernel.
For the time being this is linux515.
The issues you have faced are due to the 5.13 kernel and its kernel modules having been decommissioned entirely.
A kernel upgrade, such as this one from 5.13 to 5.15, on an NVIDIA-based system has to be done hands-on. There is no easy way. My personal recommendation is that such updates be done in a console, with no GUI, because Xorg depends on the graphics drivers that are about to be replaced, which usually presents a challenge.
I guess there is always a chance something will go wrong. If mhwd-kernel were driven through the GUI, this could work.
It only needs to be triggered in the specific circumstance that a kernel reaches EOL and then the user can be guided through it. Maybe it fails, but if it is going to fail anyway due to the graphics module getting bumped, then it’s probably worth it.
I’m thinking from the perspective of your naive user who just downloads the ISO, installs, and expects the system to keep itself up to date. Eventually that user is going to encounter the EOL kernel issue if they do nothing but use the GUI update procedure. It would be nice if this case was handled. It happens a lot according to a forum search.
Manjaro prides itself on being user-friendly, not foolproof.
It secures quite a lot of things – kernels, drivers, branches, installation… – but it still is a cutting-edge, rolling-release distribution. From a user perspective, it is rather easy to maintain – mainly, one should (almost) only need to follow the update announcements – but it is definitely not an install-and-forget OS.