Khadas VIM3: Strange Wifi IP packet corruption in recent kernels

Hi, I seem to have found a weird bug in recent kernels. I also posted about this on the Khadas forums, since it happens on the VIM3 and also with the Khadas/Ubuntu kernel.

Manjaro ARM 21.07 VIM3 image
VIM3 v1.2, 4GB/32GB version
brcmfmac: brcmf_c_preinit_dcmds: Firmware: BCM4359/9 wl0: Jan 19 2018 12:14:30 version 9.87.51.11.8 (a85e25e@shgit) (r) FWID 01-cb5aa0a5

After installing the image and running a pamac upgrade, the kernel is 5.13.12. With this kernel (and also with the 5.14-rc5 kernel used in the current Khadas fenix pulls), there's peculiar, seemingly random IP packet corruption. Rolling back to the 5.12.11 kernel shipped with the image fixes it, but that version has some oddities with the voltage regulators and the shutdown procedure.

It shows up immediately as lag in DNS queries and other traffic that uses small packets, and can be reproduced quickly with ping.

Observations:
Corruption of the penultimate 3 bytes of the IP packet – last byte is intact, prior 3 are overwritten with 00.
IP checksum is incorrect when observed from the receiving end.
Small packets are affected, large ones are not.
ping -s <= 575 sees random corruption; ping -s >= 576 does not.
ping -f results in ~35% packet loss. ping -f -s 600 does not.
Kernels 5.13.12 and 5.14-rc5 are affected, 5.12.11 is not.
2.4 or 5 GHz band does not matter.
AP or STA mode does not matter.
Wired ethernet is not affected.
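
For anyone who wants to check their own captures against the pattern above, here is a rough Python sketch. It assumes the checksum that fails is the ICMP one (the damaged bytes sit at the tail of the packet, which the ICMP checksum covers), and all function names are made up for illustration. It builds an echo request, zeroes the three bytes before the last one, and shows that the RFC 1071 checksum no longer verifies.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """RFC 1071 one's-complement checksum; returns 0 for a valid message."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(ident: int, seq: int, payload: bytes) -> bytes:
    """Build an ICMP echo request (type 8, code 0) with a valid checksum."""
    header = struct.pack("!BBHHH", 8, 0, 0, ident, seq)
    csum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, csum, ident, seq) + payload


def zero_tail(packet: bytes) -> bytes:
    """Simulate the observed damage: last byte kept, prior 3 set to 00."""
    return packet[:-4] + b"\x00\x00\x00" + packet[-1:]


def matches_tail_corruption(sent: bytes, received: bytes) -> bool:
    """True if received differs from sent only by the zeroed 3-byte tail."""
    if len(sent) != len(received) or len(sent) < 4 or sent == received:
        return False
    return (received[:-4] == sent[:-4]
            and received[-4:-1] == b"\x00\x00\x00"
            and received[-1] == sent[-1])


if __name__ == "__main__":
    good = build_echo_request(0x1234, 1, bytes(range(56)))
    bad = zero_tail(good)
    print("good verifies:", internet_checksum(good) == 0)
    print("bad verifies: ", internet_checksum(bad) == 0)
    print("pattern match:", matches_tail_corruption(good, bad))
```

Feeding the sent and received ICMP bytes from a packet capture into matches_tail_corruption() should tell you whether you are seeing the same thing or something different.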

I’m digging around in the kernel trees to see what changed between 5.12.11 and 5.13.12, but also putting this out here for anyone who might have more knowledge of the innards of the wifi drivers and be able to quickly pinpoint this.


Update: 5.12.13 works, 5.12.14 doesn’t. Seem to have narrowed it down to the meson memcpy changes. Don’t know why yet.
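
In case it helps anyone repeat the narrowing-down: the search between a known-good and a known-bad kernel is just a binary search over the commits in between, which is what git bisect automates. A minimal Python sketch of the idea follows. The commit names and the is_bad test are placeholders, not real commits; in practice the test would be "build, boot, run ping, look for loss".

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit.

    Assumes commits are ordered oldest to newest, the last one is bad,
    and once a commit is bad every later one is bad too.
    """
    lo, hi = 0, len(commits) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid        # first bad commit is at mid or earlier
        else:
            lo = mid + 1    # first bad commit is after mid
    return commits[lo]


if __name__ == "__main__":
    # Placeholder history: c0..c7, with the regression introduced at c5.
    history = [f"c{i}" for i in range(8)]
    broken = set(history[5:])
    print(first_bad(history, lambda c: c in broken))
```

Each step halves the remaining range, so even a few hundred commits between two stable releases only need a handful of build-and-test cycles.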

commit d698344a97bdc295932c8e7f1876ace9d39bd928
Author: Neil Armstrong <narmstrong@baylibre.com>
Date: Wed Jun 9 17:02:30 2021 +0200

mmc: meson-gx: use memcpy_to/fromio for dram-access-quirk

commit 103a5348c22c3fca8b96c735a9e353b8a0801842 upstream.

It has been reported that usage of memcpy() to/from an iomem mapping is invalid,
and a recent arm64 memcpy update [1] triggers a memory abort when dram-access-quirk
is used on the G12A/G12B platforms.

This adds a local sg_copy_to_buffer which makes usage of io versions of memcpy
when dram-access-quirk is enabled.

[1] 285133040e6c ("arm64: Import latest memcpy()/memmove() implementation")

Fixes: acdc8e71d9bb ("mmc: meson-gx: add dram-access-quirk")
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/20210609150230.9291-1-narmstrong@baylibre.com
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>