I first posted this on the /r/manjaro subreddit looking for advice but was redirected here.
I can’t put links in my post, so just turn this into a link on your own:
hXXps://old.reddit.com/r/ManjaroLinux/comments/o92yx8/how_where_do_i_submit_a_bug_report_for_a_specific/?
I am having issues with a particular network driver, and I am not sure whether the issue is known, fixed, or yet to be merged into the kernel (it looks like 5.13 does not contain any of the very recent patches for the driver). It is also possible that the issue is not known at all.
In a nutshell, that’s what I’m trying to establish: is the issue documented below something that needs to be escalated, and if yes, how?
My system:
OS: Manjaro Linux x86_64
Kernel: 5.12.9-1-MANJARO
Uptime: 38 mins
Packages: 1024 (pacman), 5 (flatpak), 5 (snap)
Shell: zsh 5.8
DE: Plasma 5.21.5
WM: KWin
CPU: AMD Ryzen 9 5950X (32) @ 3.400GHz
GPU: AMD ATI 0a:00.0 Navi 22
Memory: 11085MiB / 64294MiB
The device / driver in question is Chelsio T520-CR:
$ ethtool -i enp11s0f4d1
driver: cxgb4
version: 5.12.9-1-MANJARO
firmware-version: 1.25.4.0, TP 0.1.4.9
expansion-rom-version: 1.0.0.68
bus-info: 0000:0b:00.4
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes
The Problem
When I take my system out of sleep, the NIC does not show any link. I can still interact with it via ethtool and ifconfig, but no packets can flow. When I try ifconfig up/down, I get an interesting error:
$ sudo ifconfig enp11s0f4d1 down
$ echo $?
0
$ sudo ifconfig enp11s0f4d1 up
SIOCSIFFLAGS: Protocol error
It’s actually the SIOCSIFFLAGS: Protocol error message that prompted me to open this thread.
When I check dmesg:
# I believe this is right around the time I issued the ifconfig up cmd that errored
[ 1700.148846] cxgb4 0000:0b:00.3: Device not initialized
[ 1700.191310] cxgb4 0000:0b:00.2: Device not initialized
[ 1700.247567] cxgb4 0000:0b:00.1: Device not initialized
[ 1700.267269] cxgb4 0000:0b:00.0: Device not initialized
<...>
# And this is around the time that I re-inserted the module, I think
[ 1714.364826] cxgb4 0000:0b:00.4: Coming up as MASTER: Initializing adapter
[ 1715.567594] cxgb4 0000:0b:00.4: Successfully configured using Firmware Configuration File "/lib/firmware/cxgb4/t5-config.txt", version 0x1425001c, computed checksum 0xd8c8fbd6
[ 1715.730935] cxgb4 0000:0b:00.4: Hash filter supported only on T6
[ 1715.781356] cxgb4 0000:0b:00.4: max_ordird_qp 21 max_ird_adapter 387072
[ 1715.821211] cxgb4 0000:0b:00.4: Current filter mode/mask 0x632b:0x21
[ 1715.883843] cxgb4 0000:0b:00.4: 128 MSI-X vectors allocated, nic 32 eoqsets 34 per uld 8 mirrorqsets 2
[ 1715.883857] cxgb4 0000:0b:00.4: 63.008 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x8 link)
[ 1715.912304] cxgb4 0000:0b:00.4 eth0: eth0: Chelsio T520-CR (0000:0b:00.4) 1G/10GBASE-SFP
[ 1715.912529] cxgb4 0000:0b:00.4 eth1: eth1: Chelsio T520-CR (0000:0b:00.4) 1G/10GBASE-SFP
[ 1715.913172] cxgb4 0000:0b:00.4 enp11s0f4: renamed from eth0
[ 1715.941722] cxgb4 0000:0b:00.4 enp11s0f4d1: renamed from eth1
[ 1715.950925] cxgb4 0000:0b:00.4: Chelsio T520-CR rev 0
[ 1715.950929] cxgb4 0000:0b:00.4: S/N: PT26140032, P/N: 110116050E0
[ 1715.950930] cxgb4 0000:0b:00.4: Firmware version: 1.25.4.0
[ 1715.950931] cxgb4 0000:0b:00.4: Bootstrap version: 1.1.0.0
[ 1715.950932] cxgb4 0000:0b:00.4: TP Microcode version: 0.1.4.9
[ 1715.950932] cxgb4 0000:0b:00.4: Expansion ROM version: 1.0.0.68
[ 1715.950933] cxgb4 0000:0b:00.4: Serial Configuration version: 0x1004000
[ 1715.950934] cxgb4 0000:0b:00.4: VPD version: 0x2
[ 1715.950935] cxgb4 0000:0b:00.4: Configuration: RNIC MSI-X, Offload capable
What I’ve tried:
- Updated my BIOS to the latest (2021-06-13)
- Pored through the BIOS looking for any/every setting that relates to power management and devices on the PCIe bus. Toggled things on/off and tested. No change.
- Googled for cxgb4 sleep issues and related things. I didn’t find much. The links that DO show up are for issues that are quite old in most cases. I did find one link that’s recent. More on that below…
- Checked for any NIC FW updates (not that I know how to apply them…). I found that there is a recent (2021-05-21) release of the Chelsio drivers for Linux, 3.14.0.3, which does contain a FW that is slightly newer than the one that appears to be running on the card right now: ChelsioUwire-3.14.0.3/src/network/firmware/t4fw-1.25.6.0.bin. I don’t know where the changelog for the FW is, but I really don’t think that the issue is caused by the delta between 1.25.6 and 1.25.4.
What I’ve figured out:
Despite the SIOCSIFFLAGS: Protocol error message, I can get my NIC back up and working again if I just remove / re-insert the kernel module:
$ sudo rmmod cxgb4
*works*
$ ethtool -i enp11s0f4d1
Cannot get driver information: No such device
(expected)
$ sudo modprobe cxgb4
$ ethtool -i enp11s0f4d1
driver: cxgb4
<...>
I can do this every time I take the system out of sleep, but I’d prefer not to. Which brings me to my questions…
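As a stopgap, the reload could probably be automated with a systemd sleep hook. Below is a minimal sketch that I have not run on this machine. It assumes systemd's hook directory /usr/lib/systemd/system-sleep/ and a hypothetical file name cxgb4-reload. The function name cxgb4_sleep_hook and the MODPROBE override are illustrative choices of mine, not anything the driver or systemd provides.

```shell
#!/bin/sh
# Hypothetical hook file: /usr/lib/systemd/system-sleep/cxgb4-reload
# systemd-sleep runs each hook with $1 = "pre" or "post" and $2 = the sleep type.

cxgb4_sleep_hook() {
    case "${1:-}" in
        post)
            # After resume: unload and reload the driver, mirroring the
            # manual rmmod / modprobe sequence shown above.
            # MODPROBE is an assumed override so the logic can be dry-run.
            "${MODPROBE:-modprobe}" -r cxgb4 && "${MODPROBE:-modprobe}" cxgb4
            ;;
        *)
            # Nothing to do before suspend or when called with no arguments.
            :
            ;;
    esac
}

cxgb4_sleep_hook "$@"
```

The file would need to be executable for systemd to run it. This only hides the symptom; it does not explain why the adapter reports "Device not initialized" after resume.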
My questions:
- Is the SIOCSIFFLAGS: Protocol error something that should be reported to the driver maintainer for the card?
- If yes, who is that / where / how do I report it?
- I did find a few commits* that seem to be updates to the driver for this NIC, but I don’t fully understand what the commits are fixing/addressing. They sound related to my issue, but I can also totally understand if the patches in the link are for something else entirely.
*: I can’t put links in my post, so: hXXps://www.spinics.net/lists/netdev/msg747745.htm
Thanks for your time / advice.