Second NIC no longer recognized after update

Hi, everyone!

Note to future readers: This turned out to be a flaky PCI card that decided it didn’t want to work. After sitting powered off overnight, it started working again.

I’m looking for some help troubleshooting a problem that surfaced after the 2021-01-19 update (which I applied a few days ago). My system has one NIC built into the motherboard, and a PCI card providing four more, which I use for work-related VMs. After this update, my PCI card is no longer recognized.

The igb module drives both—or at least it used to. :slight_smile:

Here’s a snip from the kernel logs in a previous boot, in which the card was properly detected.

Previous Boot
[carl@kotoko ~]$ journalctl -k -b -2 | grep igb
Jan 27 08:00:28 kotoko kernel: igb: Intel(R) Gigabit Ethernet Network Driver - version 5.6.0-k
Jan 27 08:00:28 kotoko kernel: igb: Copyright (c) 2007-2014 Intel Corporation.
Jan 27 08:00:28 kotoko kernel: igb 0000:05:00.0: enabling device (0000 -> 0002)
Jan 27 08:00:28 kotoko kernel: igb 0000:05:00.0: added PHC on eth0
Jan 27 08:00:28 kotoko kernel: igb 0000:05:00.0: Intel(R) Gigabit Ethernet Network Connection
Jan 27 08:00:28 kotoko kernel: igb 0000:05:00.0: eth0: (PCIe:5.0Gb/s:Width x1) 00:1b:21:d3:86:20
Jan 27 08:00:28 kotoko kernel: igb 0000:05:00.0: eth0: PBA No: Unknown
Jan 27 08:00:28 kotoko kernel: igb 0000:05:00.0: Using MSI-X interrupts. 8 rx queue(s), 8 tx queue(s)
[... 3 more entries like this, which are the add-in card ...]
Jan 27 08:00:29 kotoko kernel: igb 0000:06:00.0: added PHC on eth4
Jan 27 08:00:29 kotoko kernel: igb 0000:06:00.0: Intel(R) Gigabit Ethernet Network Connection
Jan 27 08:00:29 kotoko kernel: igb 0000:06:00.0: eth4: (PCIe:2.5Gb/s:Width x1) 04:d4:c4:4a:3c:57
Jan 27 08:00:29 kotoko kernel: igb 0000:06:00.0: eth4: PBA No: FFFFFF-0FF
Jan 27 08:00:29 kotoko kernel: igb 0000:06:00.0: Using MSI-X interrupts. 2 rx queue(s), 2 tx queue(s)
[The above 5 lines are for the on-board NIC.]

Currently, the log shows this instead:

Current Boot
[carl@kotoko ~]$ journalctl -k | grep igb       
Jan 31 11:08:34 kotoko kernel: igb: Intel(R) Gigabit Ethernet Network Driver - version 5.6.0-k
Jan 31 11:08:34 kotoko kernel: igb: Copyright (c) 2007-2014 Intel Corporation.
Jan 31 11:08:34 kotoko kernel: igb 0000:06:00.0: added PHC on eth0
Jan 31 11:08:34 kotoko kernel: igb 0000:06:00.0: Intel(R) Gigabit Ethernet Network Connection
Jan 31 11:08:34 kotoko kernel: igb 0000:06:00.0: eth0: (PCIe:2.5Gb/s:Width x1) 04:d4:c4:4a:3c:57
Jan 31 11:08:34 kotoko kernel: igb 0000:06:00.0: eth0: PBA No: FFFFFF-0FF
Jan 31 11:08:34 kotoko kernel: igb 0000:06:00.0: Using MSI-X interrupts. 2 rx queue(s), 2 tx queue(s)
Jan 31 11:08:35 kotoko kernel: igb 0000:06:00.0 enp6s0: renamed from eth0
Jan 31 11:08:43 kotoko kernel: igb 0000:06:00.0 enp6s0: igb: enp6s0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
Jan 31 11:08:47 kotoko kernel: igb 0000:06:00.0 enp6s0: igb: enp6s0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX

Considering the igb version is the same, I can only assume something has changed so that igb no longer recognizes the card. Searching the kernel log for the card's PCI address (if that’s the correct term) shows that it was detected with the previous kernel but is not now:

Kernel log searches
journalctl -k -b -2 | grep "0000:05:00"
	Jan 27 08:00:28 kotoko kernel: pci 0000:05:00.0: [8086:150e] type 00 class 0x020000
	Jan 27 08:00:28 kotoko kernel: pci 0000:05:00.0: reg 0x10: [mem 0xfc500000-0xfc57ffff]
	Jan 27 08:00:28 kotoko kernel: pci 0000:05:00.0: reg 0x1c: [mem 0xfc58c000-0xfc58ffff]
	Jan 27 08:00:28 kotoko kernel: pci 0000:05:00.0: reg 0x30: [mem 0xfc480000-0xfc4fffff pref]
	Jan 27 08:00:28 kotoko kernel: pci 0000:05:00.0: PME# supported from D0 D3hot D3cold
	Jan 27 08:00:28 kotoko kernel: pci 0000:05:00.0: 4.000 Gb/s available PCIe bandwidth, limited by 5 GT/s x1 link at 0000:03:01.0 (capable of 16.000 Gb/s with 5 GT/s x4 link)

journalctl -k | grep "0000:05"
	(Nothing returned.)
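
A quick way to compare the two boots is to pull out just the PCI addresses that igb claimed. This is only a sketch: the helper name `igb_addresses` is made up for illustration, and it assumes the log lines look like the ones pasted above. The `8086:150e` vendor:device ID in the usage comment is the one from the `pci 0000:05:00.0` line in the earlier boot.

```shell
# Read kernel log text on stdin; print each distinct PCI address igb claimed.
# (igb_addresses is a hypothetical helper name, not an existing tool.)
igb_addresses() {
    grep -oE 'igb [0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]' | cut -d' ' -f2 | sort -u
}

# Usage (not run here):
#   journalctl -k -b -2 | igb_addresses   # boot where the card worked
#   journalctl -k -b 0  | igb_addresses   # current boot
#   lspci -nn -d 8086:150e                # ask the PCI bus directly, no driver involved
```

If `lspci` also shows nothing for that ID, the card is not being enumerated on the bus at all, which would point away from the igb driver itself.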

Using modprobe -r igb and then reloading it does not produce any different results. At this point I have run out of ideas.
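
Since the device is missing from the PCI log entirely, reloading the driver may simply have nothing to bind to. One more thing that can be tried is asking the kernel to rescan the PCI bus through sysfs. A minimal sketch, assuming the standard `/sys/bus/pci/rescan` file is present; the function name and its optional root argument are hypothetical and exist only so it can be exercised against a scratch directory:

```shell
# Ask the kernel to rescan the PCI bus by writing 1 to the sysfs rescan file.
# $1: sysfs root (hypothetical parameter, defaults to /sys).
pci_rescan() {
    sysfs="${1:-/sys}"
    printf '1\n' > "$sysfs/bus/pci/rescan"
}

# Usage (not run here, needs root):
#   pci_rescan
#   lspci -nn -d 8086:150e    # check whether the card shows up afterwards
```

There is no guarantee a rescan brings back a card that did not answer at boot, but it is cheap to try before rebooting.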

Here are my system details, in case that’s useful. Although I’m booted into 5.10 at this moment (as a troubleshooting step) I normally run 5.4 as shown below:

System Info
System:    Host: kotoko Kernel: 5.4.89-1-MANJARO x86_64 bits: 64 Desktop: KDE Plasma 5.20.5 Distro: Manjaro Linux 
Machine:   Type: Desktop Mobo: ASUSTeK model: ROG CROSSHAIR VII HERO v: Rev 1.xx serial: <superuser required> 
           UEFI: American Megatrends v: 2501 date: 07/12/2019 
CPU:       Info: 8-Core model: AMD Ryzen 7 2700X bits: 64 type: MT MCP L2 cache: 4 MiB 
           Speed: 4098 MHz min/max: 2200/3700 MHz Core speeds (MHz): 1: 4098 2: 2063 3: 2058 4: 2060 5: 3474 6: 2088 7: 2090 
           8: 2079 9: 2065 10: 2062 11: 3913 12: 2060 13: 2068 14: 2082 15: 4121 16: 2054 
Graphics:  Device-1: Advanced Micro Devices [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] driver: amdgpu 
           v: kernel 
           Display: x11 server: X.Org 1.20.10 driver: loaded: amdgpu,ati unloaded: modesetting resolution: 2560x1440 
           OpenGL: renderer: Radeon RX 580 Series v: 4.6.13572 Core Profile Context 
Audio:     Device-1: AMD Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590] driver: snd_hda_intel 
           Device-2: Advanced Micro Devices [AMD] Family 17h HD Audio driver: snd_hda_intel 
           Device-3: Sunplus Innovation Full HD webcam type: USB driver: snd-usb-audio,uvcvideo 
           Sound Server: ALSA v: k5.4.89-1-MANJARO 
Network:   Device-1: Intel Wireless-AC 9260 driver: iwlwifi 
           IF: wlp4s0 state: down mac: 7e:84:cf:70:02:b4 
           Device-2: Intel I211 Gigabit Network driver: igb 
           IF: enp6s0 state: up speed: 1000 Mbps duplex: full mac: 04:d4:c4:4a:3c:57 
RAID:      Device-1: datastore type: zfs status: ONLINE size: 2.72 TiB free: 1.1 TiB 
           Components: Online: N/A 
Drives:    Local Storage: total: raw: 3.18 TiB usable: 5.9 TiB used: 1011.42 GiB (16.7%) 
           ID-1: /dev/nvme0n1 vendor: Western Digital model: WDS500G3X0C-00SJG0 size: 465.76 GiB 
           ID-2: /dev/sda vendor: Crucial model: CT1000MX500SSD1 size: 931.51 GiB 
           ID-3: /dev/sdb vendor: Crucial model: CT1000MX500SSD1 size: 931.51 GiB 
           ID-4: /dev/sdc vendor: Crucial model: CT1000MX500SSD1 size: 931.51 GiB 
Partition: ID-1: / size: 389.18 GiB used: 237.26 GiB (61.0%) fs: ext4 dev: /dev/nvme0n1p2 
           ID-2: /boot/efi size: 299.4 MiB used: 296 KiB (0.1%) fs: vfat dev: /dev/nvme0n1p1 
Swap:      ID-1: swap-1 type: partition size: 69.06 GiB used: 0 KiB (0.0%) dev: /dev/nvme0n1p3 
Sensors:   System Temperatures: cpu: 46.6 C mobo: N/A gpu: amdgpu temp: 61.0 C 
           Fan Speeds (RPM): N/A gpu: amdgpu fan: 785 
Info:      Processes: 477 Uptime: 10h 32m Memory: 62.78 GiB used: 13.16 GiB (21.0%) Shell: Bash inxi: 3.2.02

Any help will be greatly appreciated! On Monday, I will need to use this card again, and I’d prefer not to have to Timeshift the entire update away just for that. :wink:

You have an igb and an igb-dkms package in AUR. I’d personally prefer the dkms package, but it is flagged as out-of-date. You can try it anyway, since it is built against your installed kernels.

Sounds like it’s worth a try!

So… went to build the igb package, selected linux54-headers when prompted. GUI installed linux54-headers and linux510-headers matching my installed kernels, then bombed out with this:

==> Entering fakeroot environment...
==> Starting package()...
/var/tmp/pamac-build-carl/igb/PKGBUILD: line 25: /usr/src/linux/version: No such file or directory
==> ERROR: A failure occurred in package().
    Aborting...

Indeed, there is no such file—in fact, /usr/src/ is empty. On other distros I’ve used, /usr/src/linux is a symlink to the default kernel source, but I don’t seem to have anything, despite having just installed two headers packages.
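
As far as I can tell, on Arch-based systems the headers packages install under `/usr/lib/modules/<release>/build` rather than `/usr/src/linux`, so that is probably what the PKGBUILD should be pointed at. A rough sketch for locating it; the function name and the optional root argument are made up for illustration:

```shell
# Print the kernel build directory for a given release, or fail if it is absent.
# $1: kernel release (defaults to the running kernel)
# $2: filesystem root prefix (hypothetical parameter, defaults to empty)
kernel_build_dir() {
    release="${1:-$(uname -r)}"
    root="${2:-}"
    dir="$root/usr/lib/modules/$release/build"
    if [ -d "$dir" ]; then
        printf '%s\n' "$dir"
    else
        return 1
    fi
}

# Usage (not run here):
#   kernel_build_dir                     # headers for the running kernel
#   kernel_build_dir 5.4.89-1-MANJARO    # headers for a specific release
```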

I guess I have a new issue to troubleshoot…

You must edit the PKGBUILD and correct the error. Check the comments section at AUR, they usually contain solutions.

You can also try the binary package. The version I have on my system (stable branch) is 5.6.0, which is newer. It isn’t a separate package, though; it is built into the kernel.

EDIT: you can also try to downgrade your kernel from your cache: sudo pacman -U /var/cache/pacman/pkg/<package>
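
To see which kernel builds are still in the cache before picking one, something like the following could work. It is a sketch only: the function name is hypothetical, and it assumes the usual `<name>-<version>-<rel>-<arch>.pkg.tar.*` file naming in the pacman cache.

```shell
# List cached package files for exactly one package name.
# $1: package name, e.g. linux54
# $2: cache directory (defaults to pacman's usual cache path)
cached_versions() {
    pkg="$1"
    cache="${2:-/var/cache/pacman/pkg}"
    # "-[0-9]" keeps linux54-headers-* from matching when asking for linux54.
    for f in "$cache/$pkg"-[0-9]*.pkg.tar.*; do
        [ -e "$f" ] || continue
        case "$f" in *.sig) continue ;; esac   # skip detached signatures
        printf '%s\n' "$f"
    done
}

# Usage (not run here):
#   cached_versions linux54
#   sudo pacman -U /var/cache/pacman/pkg/<package>
```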

Out of curiosity, what’s your reasoning for suggesting a kernel downgrade? I have checked on 5.4.89 (currently running) and 5.10.7, and the behavior seems the same. With no extra package installed, I’m using the igb module that’s included with the kernel.

It’s not a downgrade: it’s actually the most recent LTS. All the other kernels are ‘experimental’; 5.10 is on its way to becoming an LTS release as well, but it isn’t there yet.

For more info, have a look here:

https://www.kernel.org/

OK, that’s good to know. I’m planning to keep 5.10 installed alongside 5.4 from now on, for troubleshooting and testing.

Do you have any idea if I’m even looking in the right direction here? Unless something that affects hardware detection changed between 5.4.78 and 5.4.89 (and that same something is also in 5.10.7), I’m inclined to think it isn’t a kernel problem.

I spent some time reading through the kernel changelogs for recent versions, but I didn’t make it all the way back to 5.4.79 yet, and I didn’t see anything that looked relevant… though I admit that most of it’s over my head. :slight_smile:

Yes, that’s what @mbb was trying to point out.

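Have you tried this yet, i.e. downgrading the kernel from your package cache with sudo pacman -U /var/cache/pacman/pkg/<package>? :question: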

Not yet, but I will try that now because I still have the package for 5.4.78 in there. Stand by…

pacman output
sudo pacman -U /var/cache/pacman/pkg/linux54-5.4.78-1-x86_64.pkg.tar.zst
		
	loading packages...
	warning: downgrading package linux54 (5.4.89-1 => 5.4.78-1)
	resolving dependencies...
	looking for conflicting packages...

	Packages (1) linux54-5.4.78-1

	Total Installed Size:  140.18 MiB
	Net Upgrade Size:        0.01 MiB

	:: Proceed with installation? [Y/n] y
	(1/1) checking keys in keyring                                                                                           [########################################################################] 100%
	(1/1) checking package integrity                                                                                         [########################################################################] 100%
	(1/1) loading package files                                                                                              [########################################################################] 100%
	(1/1) checking for file conflicts                                                                                        [########################################################################] 100%
	(1/1) checking available disk space                                                                                      [########################################################################] 100%
	:: Running pre-transaction hooks...
	(1/2) Removing linux initcpios...
	(2/2) Save Linux kernel modules
	:: Processing package changes...
	(1/1) downgrading linux54                                                                                                [########################################################################] 100%
	:: Running post-transaction hooks...
	(1/6) Arming ConditionNeedsUpdate...
	(2/6) Updating module dependencies...
	(3/6) Updating linux initcpios...
	==> Building image from preset: /etc/mkinitcpio.d/linux54.preset: 'default'
	-> -k /boot/vmlinuz-5.4-x86_64 -c /etc/mkinitcpio.conf -g /boot/initramfs-5.4-x86_64.img
	==> Starting build: 5.4.78-1-MANJARO
	-> Running build hook: [base]
	-> Running build hook: [udev]
	-> Running build hook: [autodetect]
	-> Running build hook: [modconf]
	-> Running build hook: [block]
	-> Running build hook: [filesystems]
	-> Running build hook: [keyboard]
	-> Running build hook: [keymap]
	-> Running build hook: [resume]
	-> Running build hook: [fsck]
	==> Generating module dependencies
	==> Creating gzip-compressed initcpio image: /boot/initramfs-5.4-x86_64.img
	==> Image generation successful
	==> Building image from preset: /etc/mkinitcpio.d/linux54.preset: 'fallback'
	-> -k /boot/vmlinuz-5.4-x86_64 -c /etc/mkinitcpio.conf -g /boot/initramfs-5.4-x86_64-fallback.img -S autodetect
	==> Starting build: 5.4.78-1-MANJARO
	-> Running build hook: [base]
	-> Running build hook: [udev]
	-> Running build hook: [modconf]
	-> Running build hook: [block]
	-> Running build hook: [filesystems]
	-> Running build hook: [keyboard]
	-> Running build hook: [keymap]
	-> Running build hook: [resume]
	-> Running build hook: [fsck]
	==> Generating module dependencies
	==> Creating gzip-compressed initcpio image: /boot/initramfs-5.4-x86_64-fallback.img
	==> Image generation successful
	(4/6) Updating Grub-Bootmenu
	Generating grub configuration file ...
	Found linux image: /boot/vmlinuz-5.10-x86_64
	Found initrd image: /boot/amd-ucode.img /boot/initramfs-5.10-x86_64.img
	Found initrd fallback image: /boot/initramfs-5.10-x86_64-fallback.img
	Found linux image: /boot/vmlinuz-5.4-x86_64
	Found initrd image: /boot/amd-ucode.img /boot/initramfs-5.4-x86_64.img
	Found initrd fallback image: /boot/initramfs-5.4-x86_64-fallback.img
	Found memtest86+ image: /boot/memtest86+/memtest.bin
	/usr/bin/grub-probe: warning: unknown device type nvme0n1.
	done
	(5/6) Restore Linux kernel modules

	==> Warning:
			-> Kernel has been updated. Modules of the current kernel
			-> have been backed up so you can continue to use your
			-> computer. However, the new kernel will only work 
			-> at next boot.


	(6/6) Checking which packages need to be rebuilt

Currently booted into 5.4.78. The other NIC is still not recognized, and in addition I have the lovely “Failed to start Load Kernel Modules” message that other folks have been talking about.

I’m almost ready to give up on this, revert to my snapshot from before doing the massive update, then apply a few packages at a time until I can determine what caused this problem. I’d better do that today in case the rollback still doesn’t fix it, so I won’t still be troubleshooting this when I should be working.

UPDATE: I rolled back to my snapshot from a week ago, only to find that—as I feared—the extra NIC is still not detected. I then restored my snapshot from earlier today and will look for a work-around so I can use my VM again. For further troubleshooting, I plan to find some portable distro that’s geared for diagnostics, to make sure the card didn’t just mysteriously die between one boot and the next… because right now I really can’t prove it actually works.

That looks like a hardware problem: NIC busted…

Do you have a spare?

:fearful:

You know… it is looking more and more like that may be the problem. If the PCI card did fail, then it stopped working the moment I rebooted after applying the 2021-01-19 update. I find the timing suspicious, but until I have some way (outside of my running system) to verify the card actually works I can’t rule it out.

So, first I’ve got to figure out how to set up a network bridge so I can use my work VM. After that, I will check to see if the card died.

Looks like my networking is configured using Network Manager. I have set up a bridge using nmcli, as documented on the Arch wiki: Network bridge with Network Manager
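
For anyone who lands here later, the bridge setup looks roughly like this. Treat it as a sketch, not exactly what I typed: `br0` and the `make_bridge`/`run` helpers are names I picked for illustration, `enp6s0` is the on-board NIC from the logs above, and the nmcli subcommands follow the Arch wiki page. Setting `DRY_RUN` prints the commands without running them.

```shell
# Print each command when DRY_RUN is set; otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# make_bridge BRIDGE PORT: create a bridge and attach one Ethernet interface.
# The existing connection on PORT may need to be brought down first.
make_bridge() {
    bridge="$1"
    port="$2"
    run nmcli connection add type bridge con-name "$bridge" ifname "$bridge" stp no
    run nmcli connection add type bridge-slave con-name "${bridge}-port" ifname "$port" master "$bridge"
    run nmcli connection up "$bridge"
}

# Usage (not run here):
#   DRY_RUN=1; make_bridge br0 enp6s0    # preview the commands
#   unset DRY_RUN; make_bridge br0 enp6s0
```

The VM's virtual NIC can then be attached to the bridge instead of to a dedicated physical port.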

Thanks for your help today! You are all over this forum like it’s your full-time job, and I appreciate your taking time to look at my problem. :slight_smile:

Murphy’s laws:

  • If anything can go wrong, it will
  • Corollaries:
    • It can
    • It should
    • At the most inopportune time
  • Extension:
    • it will be all your fault, and everyone will know it.

:crazy_face: :wink:

If the driver is built into the kernel, then the kernel update may be the issue.

If you use binary modules, you also need to downgrade them. That’s why I always prefer dkms modules. In principle, though, the kernel has now been ruled out as the cause, so you can return to the current version.

Grab an older live image, burn it to a usb drive and test by booting it.

Update: After doing a shutdown and leaving the PC powered off overnight, in the morning the card is magically working again. I am going to call this a hardware problem, although I haven’t verified that yet, and start shopping for a new card.

I did learn how to set up a bridged connection to work around this, though, so something good came out of this mess. :slight_smile:

If it magically started working after an extended shutdown, the hardware problem did not automagically disappear and the exact same issue will most likely randomly re-occur over and over again in the near future. I hope you still ordered a new card. :sob:

There is no need to change the title here on the forum: I’ve marked this answer as the solution to your question as it is by far the best answer you’ll get.

However, if you disagree with my choice, please feel free to take any other answer as the solution to your question or even remove the solution altogether: You are in control! (If you disagree with my choice, just send me a personal message and explain why I shouldn’t have done this or :heart: or :+1: if you agree)

:innocent:
P.S. In the future, please don’t forget to come back, click the 3 dots below the answer that helped you most, and mark it as the solution. That way the next person with the exact same problem will benefit from your post, and your question will show as “solved”.

Or a capacitor leaking.

Yes you’re absolutely correct! (And/or! :grin: )

I should have said:
the hardware problem did not automagically disappear and the exact same issue will most likely randomly re-occur over and over again in the near future
instead of:
you have an overheating problem and the issue will most likely re-occur in the near future

Edited!

:blush:

Hey, thanks for letting me know how to mark the solution! I obviously missed how this forum does it, and I agree with your choice.

As for the underlying cause… well, who knows! I do anticipate that it will start failing more frequently, so I’m researching a replacement.

Thanks to both of you for your advice!

No worries, mate!

Not only is Manjaro itself a very advanced OS, but our forum software is also top notch!

:joy: