How do I properly clean out a rogue kernel and/or modules?

While working through a separate issue (see the topic "DE froze with graphic glitches... lots of kernel, drm, and amdgpu entries in journal", post #3 by Daniel-I), I discovered some remnants of a kernel that I believe no longer exists (5.14.10-1), as it was replaced by 5.14.18-1 during the last stable update.

Note: I should also mention that prior to the update I had removed kernel-alive and replaced it with kernel-modules-hook… just in case that has some significance.

The reason I believe I have some rogue kernel files/folders is because I found an entire collection for kernel 5.14.10-1…

$ ls -la /lib/modules
total 248
drwxr-xr-x  11 root root   4096 Nov 29 21:38 .
drwxr-xr-x 225 root root 208896 Dec  2 16:50 ..
drwxr-xr-x   3 root root   4096 Nov 19 10:51 5.10.79-1-MANJARO
drwxr-xr-x   3 root root   4096 Nov 29 21:01 5.13.19-2-MANJARO
drwxr-xr-x   3 root root   4096 Nov 19 10:51 5.14.10-1-MANJARO
drwxr-xr-x   3 root root   4096 Nov 29 21:38 5.14.18-1-MANJARO
drwxr-xr-x   3 root root   4096 Nov 20 00:13 5.15.2-2-MANJARO
drwxr-xr-x   2 root root   4096 Nov 19 10:51 extramodules-5.10-MANJARO
drwxr-xr-x   2 root root   4096 Nov 29 21:01 extramodules-5.13-MANJARO
drwxr-xr-x   2 root root   4096 Nov 29 21:38 extramodules-5.14-MANJARO
drwxr-xr-x   2 root root   4096 Nov 20 00:13 extramodules-5.15-MANJARO

… and yet kernel 5.14.18-1 is the only 5.14 kernel listed in the kernel tool…

Can I safely remove the 5.14.10-1-MANJARO folder? If so, are there any other places I should check to make sure there are no other rogue files? If not, is there a better way to remove any/all 5.14.10 remnants?

I'm not sure my command is the most suitable one, but I searched for other 5.14.10 instances and found only this one path (after editing out all the Timeshift snapshot paths it returned)…

$ sudo find / -name "5.14.10-1-MANJARO"
/usr/lib/modules/5.14.10-1-MANJARO
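
For anyone repeating this search: here is a minimal sketch of the same `find`, with the snapshot tree pruned up front so the Timeshift copies never show up in the output. The helper name `find_kernel_remnants` and the default `/timeshift` path are assumptions for illustration; point it at wherever your snapshots actually live.

```shell
# Sketch: search a tree for leftovers of one kernel version, skipping a snapshot directory.
# find_kernel_remnants is a hypothetical helper; the default paths are assumptions.
find_kernel_remnants() {
    version="$1"               # e.g. 5.14.10-1-MANJARO
    root="${2:-/}"             # where to start searching
    skip="${3:-/timeshift}"    # directory tree to prune (assumed snapshot location)
    # -prune stops find from descending into $skip; every other path is matched by name.
    find "$root" -path "$skip" -prune -o -name "*${version}*" -print 2>/dev/null
}
```

Run from a root shell, `find_kernel_remnants 5.14.10-1-MANJARO` would then print only the paths outside the pruned tree.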

You can remove 5.14 and 5.13 without issues, as you're running the 5.15 kernel. I would suggest leaving the 5.10 kernel installed just in case something happens and you need to fall back on an LTS kernel.

Thanks for the reply @straycat

I had reinstalled the 5.14.18 and 5.13.19 kernels after I experienced the issue I linked… just in case I wanted to go backwards.

My question is, if 5.14.18 is my only 5.14 kernel listed as being installed… why does the 5.14.10 folder I found exist… and can I safely trash it?

Note: A kernel older than 5.12 is not an option for my hardware… AMD 5600X CPU and 6800XT GPU.

I think when you remove kernels in MSM (Manjaro Settings Manager), it should also remove any files that go with them.

I agree, but as you can see in my screenshot, 5.14.10 is not listed for me to uninstall via MSM. 5.14.10 was replaced by 5.14.18 during the last stable update… and I’m just finding the leftovers.

Actually, I’m recalling a dkms-related issue I had during the upgrade (2021-11-19 Stable Update - error in "Install DKMS modules" step), so maybe that tripped up the cleanup that normally would have happened?

Either way, it looks like manual intervention is required now.
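
Before any manual intervention, it may help to confirm which module directories are really orphaned. Below is a read-only sketch that asks pacman whether any installed package owns each versioned directory. It assumes `pacman -Qo` accepts directory paths on your pacman version; `list_unowned_module_dirs` and the `OWNER_CHECK` override are hypothetical names, not part of any package.

```shell
# Sketch: print module directories that no installed package claims. Nothing is modified.
# list_unowned_module_dirs and OWNER_CHECK are hypothetical names used for illustration.
list_unowned_module_dirs() {
    modules_dir="${1:-/usr/lib/modules}"
    # The [0-9]* glob only visits versioned directories, so extramodules-* and .old are skipped.
    for dir in "$modules_dir"/[0-9]*; do
        [ -d "$dir" ] || continue
        # pacman -Qo exits non-zero when no package owns the given path.
        # OWNER_CHECK lets you swap in a different check (an assumption, handy for dry runs).
        if ! ${OWNER_CHECK:-pacman -Qo} "$dir" >/dev/null 2>&1; then
            printf '%s\n' "$dir"
        fi
    done
}
```

On a system like the one above, calling `list_unowned_module_dirs` with no arguments should list only the directories whose kernel package is gone.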

I had an issue when I removed kernel 5.13 via Manjaro Settings GUI and manual intervention was required to remove files from /boot. I found a similar topic that suggested the same action. If uncertain, make a backup of the files first. I suspect that grub makes no reference to these non-listed kernels.

A side note: I use locate -i (alias loc) to find things (see updatedb & systemctl status updatedb), at least to narrow it down, and then use find.

In that case a manual start of the included cleanup service should suffice:

$ systemctl start linux-modules-cleanup.service

Thank you for the helpful command @freggel.doe , it definitely made some changes, which appear to have just moved the “rogue” 5.14.10 folder into a newly created .old folder…

$ systemctl start linux-modules-cleanup.service

$ ls -la /lib/modules
total 248
drwxr-xr-x  11 root root   4096 Dec  8 09:42 .
drwxr-xr-x 227 root root 208896 Dec  4 15:01 ..
drwxr-xr-x   3 root root   4096 Nov 19 10:51 5.10.79-1-MANJARO
drwxr-xr-x   3 root root   4096 Nov 29 21:01 5.13.19-2-MANJARO
drwxr-xr-x   3 root root   4096 Nov 29 21:38 5.14.18-1-MANJARO
drwxr-xr-x   3 root root   4096 Nov 20 00:13 5.15.2-2-MANJARO
drwxr-xr-x   2 root root   4096 Nov 19 10:51 extramodules-5.10-MANJARO
drwxr-xr-x   2 root root   4096 Nov 29 21:01 extramodules-5.13-MANJARO
drwxr-xr-x   2 root root   4096 Nov 29 21:38 extramodules-5.14-MANJARO
drwxr-xr-x   2 root root   4096 Nov 20 00:13 extramodules-5.15-MANJARO
drwxr-xr-x   3 root root   4096 Dec  8 09:42 .old

$ ls -la /lib/modules/.old
total 12
drwxr-xr-x  3 root root 4096 Dec  8 09:42 .
drwxr-xr-x 11 root root 4096 Dec  8 09:42 ..
drwxr-xr-x  3 root root 4096 Nov 19 10:51 5.14.10-1-MANJARO

Will the system eventually take care of the .old folder (and its contents), or is there another manual command that will do it?

I would keep the .old folder until you make sure nothing is needed in there and if not just delete it.

You can safely remove /lib/modules/.old and everything inside it,
as long as it does not hold the modules of the kernel you are running at the moment!

But you do not need to do this, because this will happen automatically in a few days through kernel-modules-hook
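
If you do want to delete it by hand, the "not the running kernel" condition can be checked first. This is a minimal sketch under the assumption that the quarantined directories are named after the kernel release string reported by `uname -r`; `old_dir_is_safe_to_remove` is a hypothetical helper name.

```shell
# Sketch: succeed only if the .old directory holds no modules for the running kernel.
# old_dir_is_safe_to_remove is a hypothetical helper, not part of kernel-modules-hook.
old_dir_is_safe_to_remove() {
    old_dir="${1:-/usr/lib/modules/.old}"
    running="${2:-$(uname -r)}"    # release string of the running kernel
    # Safe when no entry named after the running kernel exists inside .old.
    [ ! -e "$old_dir/$running" ]
}
```

From a root shell, something like `old_dir_is_safe_to_remove && rm -r /usr/lib/modules/.old` would then refuse to delete modules that the running kernel may still need.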

Thank you for the feedback @straycat and @andreas85 !

I see from reading your findings that it was kernel-modules-hook that created the .old folder (point #3)… so I will be patient and wait for kernel-modules-hook to complete the cleanup (point #5).

kernel-modules-hook:

  • :+1: The scripts are inserted as a single line directly in exec
  • :-1: The scripts are therefore not commented and difficult to read
  • /lib/modules/.old is used as a dir
  • :sparkling_heart: protects all installed kernels. Before deleting modules, checks with pacman whether the corresponding kernel is still installed
  • :+1: The modules are first moved and only deleted with a time delay (It is possible to restore them if problems arise)
  • :heart: elegantly integrated into the system
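
To make the "move first, delete later" idea in the list above concrete, here is a rough sketch of that pattern. It is not the actual kernel-modules-hook script, which is described above as a single line inside the unit's exec; the helper names `quarantine_module_dir` and `purge_quarantine` and the age-based purge are assumptions for illustration only.

```shell
# Sketch of "move first, delete later". NOT the actual kernel-modules-hook script.
# quarantine_module_dir and purge_quarantine are hypothetical helper names.
quarantine_module_dir() {
    modules_dir="$1"    # e.g. /usr/lib/modules
    name="$2"           # e.g. 5.14.10-1-MANJARO
    mkdir -p "$modules_dir/.old" || return 1
    mv "$modules_dir/$name" "$modules_dir/.old/" || return 1
    # Stamp the move time so the age check below counts from the move,
    # not from the original install date.
    touch "$modules_dir/.old/$name"
}

purge_quarantine() {
    modules_dir="$1"
    days="$2"           # entries last stamped more than this many days ago are deleted
    find "$modules_dir/.old" -mindepth 1 -maxdepth 1 -mtime +"$days" -exec rm -rf {} +
}
```

Until the purge runs, a quarantined directory can simply be moved back if something turns out to need it, which is the advantage the list points out.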

Although I selected @freggel.doe 's post as the solution (the required command)… your post @andreas85 definitely helped me understand what has happened (and will happen) in the background.

Also, that should run automatically. Daniel, check your scheduler for possible issues (or did you turn it off manually?):

systemctl list-timers --all

For me, systemd-tmpfiles-clean.timer appears to run daily.

Thank you for the great suggestion @alven !

I have confirmed it’s running daily for me as well… first entry in the list

$ systemctl list-timers --all
NEXT                        LEFT                LAST                        PASSED     UNIT                          ACTIVATES                      
Wed 2021-12-08 14:26:16 CST 3h 34min left       Tue 2021-12-07 14:26:16 CST 20h ago    systemd-tmpfiles-clean.timer  systemd-tmpfiles-clean.service
Thu 2021-12-09 00:00:00 CST 13h left            Wed 2021-12-08 00:00:12 CST 10h ago    logrotate.timer               logrotate.service
Thu 2021-12-09 00:00:00 CST 13h left            Wed 2021-12-08 00:00:12 CST 10h ago    man-db.timer                  man-db.service
Thu 2021-12-09 00:00:00 CST 13h left            Wed 2021-12-08 00:00:12 CST 10h ago    pkgfile-update.timer          pkgfile-update.service
Thu 2021-12-09 00:00:00 CST 13h left            Wed 2021-12-08 00:00:12 CST 10h ago    shadow.timer                  shadow.service
Thu 2021-12-09 00:53:59 CST 14h left            Wed 2021-12-08 02:48:52 CST 8h ago     updatedb.timer                updatedb.service
Thu 2021-12-09 14:03:25 CST 1 day 3h left       Thu 2021-12-02 10:31:36 CST 6 days ago pamac-mirrorlist.timer        pamac-mirrorlist.service
Mon 2021-12-13 00:02:32 CST 4 days left         Mon 2021-12-06 00:21:07 CST 2 days ago fstrim.timer                  fstrim.service
Sat 2022-01-01 15:00:00 CST 3 weeks 3 days left Sat 2021-12-04 15:00:07 CST 3 days ago pamac-cleancache.timer        pamac-cleancache.service
n/a                         n/a                 n/a                         n/a        mdadm-last-resort@md127.timer mdadm-last-resort@md127.service

I’m thinking this issue had one of two possible root causes…

  1. The switch from kernel-alive to kernel-modules-hook was done at a point where the removal/introduction interrupted the processes in some way… and/or
  2. the dkms issue I experienced during the stable update tossed a wrench in the mix

So hopefully, with @freggel.doe 's command, things are now in the appropriate places for kernel-modules-hook to continue its cleanup, and lined up for the other processes/timers to find and deal with what they are expecting.
