It appears I cannot post on the ARM board, and the issue does not seem closely related to the ARM architecture itself, so I’m posting it here.
Summary: odd packet loss between kernel space and userland
Longer background:
I’m using WireGuard to connect to the Internet, but I leave some applications connected directly to the network (not via WG). Quite standard routing stuff, right? Here’s how I always did this:
Table = off
PostUp = ip -4 route add default dev %i via VPN_GATEWAY_V4 table 64; ip -6 route add default dev %i via VPN_GATEWAY_V6 metric 0 table 64; ip rule add uidrange 1000-1000 table 64; ip -6 rule add uidrange 1000-1000 table 64
PreDown = ip -4 route del default dev %i via VPN_GATEWAY_V4 table 64; ip -6 route del default dev %i via VPN_GATEWAY_V6 metric 0 table 64; ip rule del uidrange 1000-1000 table 64; ip -6 rule del uidrange 1000-1000 table 64
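The PostUp and PreDown lines above mirror each other: every `add` has a matching `del`. As a rough sketch only (the `policy_cmds` helper and the `IFACE`, `GW4`, `GW6`, `RUID` and `TABLE` variables are hypothetical names, not part of wg-quick or of my actual setup), both lines could be printed from one template so they cannot drift apart:

```shell
#!/bin/sh
# Hypothetical helper: prints the four policy-routing commands for one action.
# Pass "add" for the PostUp line and "del" for the PreDown line.
# The defaults are placeholders taken from the config above; wg-quick itself
# substitutes %i for the interface name, here plain "wg0" is assumed.
policy_cmds() {
    action=$1
    iface=${IFACE:-wg0}
    gw4=${GW4:-VPN_GATEWAY_V4}
    gw6=${GW6:-VPN_GATEWAY_V6}
    ruid=${RUID:-1000}
    table=${TABLE:-64}

    # Default routes in the separate table, one per address family.
    echo "ip -4 route $action default dev $iface via $gw4 table $table"
    echo "ip -6 route $action default dev $iface via $gw6 metric 0 table $table"
    # Rules that send traffic from the chosen UID to that table.
    echo "ip rule $action uidrange $ruid-$ruid table $table"
    echo "ip -6 rule $action uidrange $ruid-$ruid table $table"
}
```

Calling `policy_cmds add` or `policy_cmds del` only prints the commands; nothing is executed, so the output can be reviewed before being pasted into the config.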
I’ve tested this on Ubuntu x86_64, Arch x86_64, Raspberry Pi OS, and Arch ARM (sorry, I do not have a spare machine for Manjaro x86_64 right now). It worked well everywhere until I did exactly the same on Manjaro ARM (dev build from the manjaro-arm/rpi4-images repository on GitHub, since I’m using a Raspberry Pi 5).
The problem is that while IPv4 works, IPv6 connections (TCP, UDP, ICMP) never receive responses. I ran some tests to find out what was going on, and tcpdump clearly showed that the response packets arrive on the interface:
wg0 Out IP6 pi > dns.google: ICMP6, echo request, id 17, seq 7, length 64
wg0 In IP6 dns.google > pi: ICMP6, echo reply, id 17, seq 7, length 64
But the received packets never make their way to userland. Timeout, timeout, timeout. strace shows that the read operation fails with EAGAIN:
sendto(4, "\200\0\0\0\377\377\0\2\323\"[e\0\0\0\0\221\260\n\0\0\0\0\0\20\21\22\23\24\25\26\27"..., 64, 0, {sa_family=AF_INET6, sin6_port=htons(58), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "2001:4860:4860::8888", &sin6_addr), sin6_scope_id=0}, 28) = 64
recvmsg(4, {msg_namelen=128}, 0) = -1 EAGAIN (Resource temporarily unavailable)
So the issue seems to lie in kernel space, somewhere between the interface and the socket, which is really outside my expertise. Has anyone seen similar issues before? Any kind of help or hint is highly appreciated. Thanks.