Wifi frequently fails to set DNS from DHCP

I got a new laptop some days ago. I installed it from the Manjaro XFCE minimal ISO on a USB stick.

I frequently find that DNS does not work after connecting to a wifi network. What I’ve noticed is that no DNS servers are listed in /etc/resolv.conf.
The workaround is to disable wifi (hardware switch), wait 15+ seconds, then re-enable wifi. After that, resolv.conf is usually updated and the network works normally until the next time I connect to a wifi network (a different one, the same one after sleep, or when I go out of range and come back).
If I wait less than 15 seconds, the DNS issue is never resolved.
While resolv.conf is empty I can use the internet with IP addresses. I can access URLs if I put the IP and hostname into /etc/hosts. DNS seems to be the only part of the wifi connect / DHCP lease process that is missing.

It seems like the DHCP client that runs after the wifi connects simply forgets to update resolv.conf. It does update the IP address and gateway.
How can this be debugged? Is there a debug option for the DHCP client? Could it be that the client simply does not receive DNS servers from the DHCP server?
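
One quick way to confirm the symptom each time it occurs is to count the nameserver entries in resolv.conf. The following is a minimal sketch; the helper name count_nameservers is hypothetical and not part of any tool mentioned here.

```shell
# Hypothetical helper: count the "nameserver" lines in a resolv.conf-style file.
# Defaults to /etc/resolv.conf when no path is given.
count_nameservers() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # the count is 0, so "|| true" keeps the helper usable under "set -e".
    grep -c '^[[:space:]]*nameserver[[:space:]]' "${1:-/etc/resolv.conf}" || true
}
```

A result of 0 right after connecting matches the failure described above. Running `nmcli device show wlp2s0` at the same moment shows what NetworkManager itself holds for the interface, including any IP4.DNS entries.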

The other Manjaro laptop in the household never experiences any issues with the same wifi. The laptop experiencing issues is also identical in hardware to the one that was stolen some weeks ago, which never had any such issues either. From that I’m thinking that if this is a driver issue, it must be related to something that was updated during the past 2 months.
This laptop runs kernel 6.9.9-1. The stolen laptop ran kernel 6.6.40-1 LTS, and the other laptop in the household also runs kernel 6.6.40-1 LTS.

inxi -n returns:

Network:
  Device-1: Intel Wireless 8265 / 8275 driver: iwlwifi
  IF: wlp2s0 state: up mac: bc:a8:a6:xx:xx:xx

As a first step, I’m thinking that logging the DHCP lease dialog would give some useful information. How can I get such a log?

UPDATE: It became quite clear that this was a filesystem issue.
New topic about that started here: https://forum.manjaro.org/t/ext4-luks-filesystem-error-on-root-partition

DHCP-Client is integrated in NetworkManager. Don’t run a second one.

journalctl --boot 0 --unit NetworkManager.service --follow --no-hostname

Switch logging level on the fly:

sudo nmcli general logging level DEBUG
sudo nmcli general logging level INFO
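
With DEBUG enabled the journal gets noisy. Assuming NetworkManager prefixes its DHCP messages with something like `dhcp4 (wlp2s0)`, a small filter can isolate the lease dialog. This is only a sketch, and the function name dhcp_lines is made up for illustration.

```shell
# Hypothetical filter: keep only NetworkManager's DHCP log lines from stdin.
# Assumes the "dhcp4 (<iface>)" / "dhcp6 (<iface>)" message prefix.
dhcp_lines() {
    # "|| true" avoids a failing exit status when nothing matches.
    grep -E 'dhcp[46] \(' || true
}

# Example (not run here):
#   journalctl --boot 0 --unit NetworkManager.service --no-hostname | dhcp_lines
```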

Since it is only about 15 seconds, a workaround would be to run a DNS cache like dnsmasq, which is not the default on Manjaro. Then the 15 seconds wouldn’t matter.

pamac install dnsmasq
echo -e '[main]\ndns=dnsmasq' | sudo tee /etc/NetworkManager/conf.d/dns.conf
sudo nmcli general reload

Then NetworkManager will start dnsmasq on its own.
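
To check that the change took effect, one can look at where resolv.conf points afterwards. This sketch assumes that NetworkManager's dnsmasq mode writes the loopback address 127.0.0.1 into resolv.conf; that address and the helper name uses_local_resolver are assumptions for illustration.

```shell
# Hypothetical helper: succeed if the given resolv.conf-style file
# (default /etc/resolv.conf) lists 127.0.0.1 as a nameserver.
# Assumption: the local dnsmasq instance listens on 127.0.0.1.
uses_local_resolver() {
    grep -Eq '^[[:space:]]*nameserver[[:space:]]+127\.0\.0\.1([[:space:]]|$)' \
        "${1:-/etc/resolv.conf}"
}

# Example (not run here):
#   uses_local_resolver && echo "resolv.conf points at the local cache"
```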

Revert to Kernel 6.6 (LTS).

This may (or may not) resolve the issue, but it will also ensure all laptops are equal, for the sake of comparison.
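
When comparing machines it helps to note which kernel series each one is actually running. A minimal sketch, with kernel_series as a hypothetical helper name:

```shell
# Hypothetical helper: reduce a kernel release string to its series
# (major.minor). Uses the running kernel when no argument is given.
kernel_series() {
    printf '%s\n' "${1:-$(uname -r)}" | cut -d. -f1,2
}

# Example (not run here):
#   kernel_series            # series of the running kernel
#   kernel_series 6.6.40-1   # series of a release string noted elsewhere
```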

Cheers.

Thank you. I’ve switched the logging level to debug and am now waiting for the issue to recur.

I immediately found this in the log:

juli 26 20:06:45 NetworkManager[848]: [1722017205.7074] dhcp4 (wlp2s0): error saving lease to /var/lib/NetworkManager/internal-72d813a8-14c9-45b0-8604-95147abef2a3-wlp2s0.lease: Failed to write file “/var/lib/NetworkManager/internal-72d813a8-14c9-45b0-8604-95147abef2a3-wlp2s0.lease.UHJQR2”: fsync() failed: Structure needs cleaning

I didn’t experience any issues at that time though.
Is it the filesystem that needs cleaning or something else?

I was thinking about that, but figured the next LTS might not be so far away, so I could test 6.9 for a while. I’ll collect a bit more information before changing anything.

I would check the filesystem, yes. And if it is an HDD, check for bad blocks / bad sectors.
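
"Structure needs cleaning" is the text Linux prints for the EUCLEAN error code, which ext4 returns when it detects on-disk corruption, so the message points at the filesystem rather than at NetworkManager. A sketch for scanning a log for it (the helper name has_fs_corruption_hint is hypothetical):

```shell
# Hypothetical helper: succeed if stdin contains the EUCLEAN message text.
has_fs_corruption_hint() {
    grep -Fq 'Structure needs cleaning'
}

# Example (not run here):
#   journalctl --boot 0 --no-hostname | has_fs_corruption_hint \
#       && echo "filesystem errors were logged during this boot"
```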

Here is probably the error (from yesterday):

Failed to write file “/var/lib/NetworkManager/internal-72d813a8-14c9-45b0-8604-95147abef2a3-wlp2s0.lease.UHJQR2”: fsync() failed: Structure needs cleaning

It surely is a filesystem issue. The output of:
tune2fs -l /dev/mapper/xxxx

Mount count:              3
Maximum mount count:      -1
FS Error count:           11
First error time:         Wed Jul 24 17:45:20 2024
First error function:     ext4_validate_block_bitmap
First error line #:       423
First error err:          EFSCORRUPTED
Last error time:          Thu Jul 25 12:18:09 2024
Last error function:      ext4_free_inode
Last error line #:        362
Last error err:           EFSCORRUPTED

I’ll create a new topic about that
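
The error counter in the tune2fs output above can be pulled out with a short filter, which makes it easy to see whether the count keeps growing after a repair. This is a sketch only: fs_error_count is a hypothetical name, and it assumes tune2fs omits the "FS Error count" line when no errors are recorded, in which case it prints 0.

```shell
# Hypothetical filter: read "tune2fs -l" output on stdin and print the
# value of the "FS Error count" field, or 0 if the field is absent.
fs_error_count() {
    awk -F: '
        /^FS Error count/ { gsub(/[[:space:]]/, "", $2); print $2; found = 1 }
        END { if (!found) print 0 }
    '
}

# Example (not run here):
#   sudo tune2fs -l /dev/mapper/xxxx | fs_error_count
```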

Your issue might only reveal itself using kernel 6.9, so reverting to 6.6 is a viable consideration, especially as your other laptop(s) using 6.6 do not exhibit the issue.

Remember, you can always install kernel 6.9 (again) if reverting to 6.6 doesn’t produce results.

Cheers.

It became quite clear that this is a filesystem issue.
I’ll see whether it gets fixed as soon as the laptop is free for a reboot, and whether it comes back. I really hope the filesystem issues are not caused by kernel issues…

I see.

Then, if you’re starting a new topic for that (as I now read), please mark that post (or perhaps the post from @megavolt) as the solution, and link to this thread in the new thread for some context.

Cheers.

LTS kernels are usually the last kernel released during that year, which can be anytime from the end of October to the end of December. So you still have 3 to 5 months before the next LTS kernel will be announced:

Version  Maintainer                        Released    Projected EOL
6.6      Greg Kroah-Hartman & Sasha Levin  2023-10-29  Dec, 2026
6.1      Greg Kroah-Hartman & Sasha Levin  2022-12-11  Dec, 2026
5.15     Greg Kroah-Hartman & Sasha Levin  2021-10-31  Dec, 2026
5.10     Greg Kroah-Hartman & Sasha Levin  2020-12-13  Dec, 2026
5.4      Greg Kroah-Hartman & Sasha Levin  2019-11-24  Dec, 2025
4.19     Greg Kroah-Hartman & Sasha Levin  2018-10-22  Dec, 2024

Sources:

The Linux Kernel Archives - Releases &
