For those who use systemd services that rely on a "network" connection

…and you use the default Network Manager (whether KDE, GNOME, or Xfce).

How do you get it to work correctly?

By “work correctly” I mean that the service should not run when a network connection is unavailable.


However, there seems to be a strange definition for what constitutes “online”.

For example, when I am connected via wired or wireless in my KDE Network Manager, I get the following outputs, which are correct:

The default dependencies for network-online.target:

systemctl list-dependencies network-online.target
network-online.target
● └─NetworkManager-wait-online.service

network-online.target status:

systemctl status network-online.target
● network-online.target - Network is Online
     Loaded: loaded (/usr/lib/systemd/system/network-online.target; static)
     Active: active since Sun 2021-09-19 11:38:53 BRT; 3 days ago

NetworkManager-wait-online.service status:

systemctl status NetworkManager-wait-online.service
● NetworkManager-wait-online.service - Network Manager Wait Online
     Loaded: loaded (/usr/lib/systemd/system/NetworkManager-wait-online.service; enabled; vendor preset: disabled)
     Active: active (exited) since Sun 2021-09-19 11:38:53 BRT; 3 days ago

So far, so good, yes? There’s just one problem. Even after disconnecting from the network (confirmed by being unable to ping either a local address or a remote website), network-online.target continues to show that I am online.

This is a problem, because custom services that depend on a truthful network status will immediately attempt to run and fail (rather than wait 30 seconds for the network to be “online”), which I confirmed by checking the logs, even though I declared them with Wants=network-online.target and After=network-online.target.


The most I could gather from my limited research capability is that this is due to the lack of consensus of what it means to be “online”. :man_shrugging:

Some would argue you need to be able to ping a remote website, others argue that the interface must be up, others argue that you must have been assigned a DHCP address, others argue that you must be able to ping a local network device, etc.


Does anyone know how to instruct systemd (when using Network Manager) that “online” is only acceptable when a remote website can be reached?


To reiterate, when I am absolutely offline, I still get this result when checking the status of network-online.target:

● network-online.target - Network is Online
     Loaded: loaded (/usr/lib/systemd/system/network-online.target; static)
     Active: active since Sun 2021-09-19 11:38:53 BRT; 3 days ago

For reference, this unsolved problem similarly describes what I am facing:

How can I run a systemd service after network link has been established?

It is maybe a really simple solution…

LANG=C nm-online -h
Usage:
  nm-online [OPTION?]

Waits for NetworkManager to finish activating startup network connections.

Help Options:
  -h, --help                  Show help options

Application Options:
  -q, --quiet                 Don't print anything
  -s, --wait-for-startup      Wait for NetworkManager startup instead of a connection
  -t, --timeout=<timeout>     Time to wait for a connection, in seconds (without the option, default value is 30)
  -x, --exit                  Exit immediately if NetworkManager is not running or connecting

Remove the -s parameter:

ExecStart=/usr/bin/nm-online -q --timeout=30

and it will wait for a connection and not for NetworkManager startup.

Is this to be added to my custom service, placed before my real ExecStart command, like this?

ExecStart=/usr/bin/nm-online -q --timeout=30
ExecStart=bunch of network-dependent stuff in here

I would rather do this:

ExecCondition=/usr/bin/bash -c 'if [[ $(ping -c1 manjaro.org) =~ "100%% loss" ]]; then echo continue; exit 0; else echo skip next commands; exit 1; fi'
ExecStart=/usr/bin/nm-online -q --timeout=30
ExecStartPost=/usr/bin/bash -c 'if [[ $(ping -c1 manjaro.org) =~ "100%% loss" ]]; then echo connection failed; exit 255; else echo connection there; exit 0; fi'

But I did not test it. You control it by exit codes. :wink:

I’ll give it a shot!

Regardless I must say that I’m disappointed that it’s not working with built-in systemd dependencies, such as Wants= and After=. :frowning:

One of the main reasons for declaring certain conditions in such a standardized way was to clean up services and scripts.

(Might not be systemd’s fault, since it appears that NetworkManager has its own definition of online?)


Another “workaround” is to add a sleep 30 command, to force it to wait for 30 seconds before continuing, which can prevent it from failing immediately upon being invoked by a timer / condition, since if resuming from suspend the network has a chance to be fully established.

But I prefer your nm-online method. :wink: I’ll play around and report back.

I guess the “online service” just checks if a connection has been established, but not if you can reach the World Wide Web… That is the difference here. Not sure, but systemd-networkd seems to do it the same way.

The network-online.target does not indicate “Internet access”. It means there is a network interface available with an IP address (not localhost) added to it and there are routes for this IP address.
This does not mean Internet access, just that networking is available for the network the system is directly connected to. If the system is directly connected to the internet, like a server, it means internet access, but this depends on the network the system is directly connected to.

The purpose of this target is to delay the start of services that need a working network. It is not to monitor and “unreach” the target if the network is lost. After the boot is completed and all enabled services have started, the target is meaningless.

A little bit to read
https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/#conceptsinsystemd

I agree - the definition of “network online” is quite fluid, and the correct answer depends on what you are trying to achieve.

Over the years there have been a lot of questions asking - what is that apollo.archlinux.org I am seeing in my Wireshark logs? It was the NetworkManager connectivity check.

The page linked by @xabbu is quite informative - thank you @xabbu

Correction: I originally wrote here that this is what NetworkManager-wait-online.service is doing using the nm-online command. That was not correct. What I meant to say was that NetworkManager runs a regular connectivity check, which has surprised many users.

$ cat /usr/lib/NetworkManager/conf.d/20-connectivity.conf
[connectivity]
uri=http://ping.manjaro.org/check_network_status.txt

The successful run of the command reveals four things:

  • your nic is up and running
  • your dns is up and running
  • your router is up and running
  • the target is accessible using name resolution

But as you have noted - the network-online is a generic concept and you need to handle the actual situation in your script code.

In the simplest form - use -x and grep for [online].

nm-online -x
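
Instead of grepping the printed text, a script could rely on the exit status of nm-online, which is 0 when NetworkManager considers the system online and non-zero otherwise. A minimal sketch, where the function name network_state is made up for illustration and the check command can be swapped out:

```shell
# Hypothetical helper: print "online" or "offline" based on the exit status
# of a check command. With no arguments it falls back to `nm-online -x -q`.
network_state() {
    [ "$#" -gt 0 ] || set -- nm-online -x -q
    if "$@"; then
        echo online
    else
        echo offline
    fi
}

# Intended use (requires NetworkManager):
#   [ "$(network_state)" = "online" ] && echo "NetworkManager reports a connection"
```

Note that “online” here is whatever NetworkManager means by it, which is not necessarily Internet access.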

The unit does not run the connectivity check. Even if the connectivity check fails, network-online.target will be started. The target will also be reached if the connectivity check is disabled. The network-online.target will not be reached if no NM connection that should connect automatically succeeds in starting (for example, when no IP address is obtained from DHCP; static configurations almost always succeed).
NM will try the connectivity check before it logs “startup complete”, but a failure of the check has no influence on whether NM starts network-online.target or not.

$ systemctl cat NetworkManager-wait-online.service
# /usr/lib/systemd/system/NetworkManager-wait-online.service
[Unit]
Description=Network Manager Wait Online
Documentation=man:nm-online(1)
Requires=NetworkManager.service
After=NetworkManager.service
Before=network-online.target

[Service]
# `nm-online -s` waits until the point when NetworkManager logs
# "startup complete". That is when startup actions are settled and
# devices and profiles reached a conclusive activated or deactivated
# state. It depends on which profiles are configured to autoconnect and
# also depends on profile settings like ipv4.may-fail/ipv6.may-fail,
# which affect when a profile is considered fully activated.
# Check NetworkManager logs to find out why wait-online takes a certain
# time.

Type=oneshot
ExecStart=/usr/bin/nm-online -s -q
RemainAfterExit=yes

# Set $NM_ONLINE_TIMEOUT variable for timeout in seconds.
# Edit with `systemctl edit NetworkManager-wait-online`.
#
# Note, this timeout should commonly not be reached. If your boot
# gets delayed too long, then the solution is usually not to decrease
# the timeout, but to fix your setup so that the connected state
# gets reached earlier.
Environment=NM_ONLINE_TIMEOUT=60

[Install]
WantedBy=network-online.target

The comment is part of the unit file.

I can see that - but as far as my limited knowledge takes me - I know that NetworkManager is doing connectivity checks - and as nm-online is a tool provided by NetworkManager I assumed it is the tool used to check for connectivity.

I didn’t dissect the unit file - just took the command from the

systemctl status NetworkManager-wait-online

And I quickly forgot the arguments used -s -q since I was more interested in nm-online command.

So according to the unit file - the wait-online.service is only an indication that NetworkManager actually made it to the end of its startup - but not whether a connection has been made.

I think the naming logic can be a little confusing - I guess it has a different meaning for a machine than for a human :slight_smile:

I have been abusing the network-online.target in my mount units (and the sample mount units found elsewhere on the forum) and I recently learned that this is bad practice because it causes dependency cycles and slows down the boot of the system.

And some parts of the page you linked earlier seem to contain contradicting statements - or it may just be that I don’t understand this fully.

From the document I quote from the first section almost at the end

A robust system boots up independently of external services.

From the next section it seems that for network-online.target

Its primary purpose is to actively delay activation of services until the network is set up

From the third section it is said that setting the following in the [Unit] header

After=network-online.target
Wants=network-online.target

This will ensure that all configured network devices are up and have an IP address assigned before the service is started.

The right “wait” service must be enabled too

Until recently the $MANAGER-wait-online.service was something you had to enable manually - now it is in the manager’s install section - this has not always been so.

What the above statements do not cover is the situation described in the original post - when one deliberately removes the network connection.

The system knows the network stack is configured and up - what would one expect from network-online.target in such a case?

If a script depends on the network being up and runs after network-online.target, and the script then becomes dysfunctional, e.g. due to a dysfunctional router, the system still knows the network stack is up and running. A route somewhere is missing, but that is hardly a matter for the system. The system should work anyway; the missing route to a specific target is a matter for the script to handle.

Network up - check - ok
Route up - check - fail - recheck periodically - fail after x attempts - log it.

What if one creates an extra service as

/etc/systemd/system/NetworkManager2-wait-online.service
$ systemctl cat NetworkManager2-wait-online.service

[Unit]
Description=Using nm-online to verify network up and running
Requires=NetworkManager.service
After=NetworkManager.service
After=network-online.target

[Service]
Type=oneshot
#ExecStart=/usr/bin/nm-online -s -q
ExecStart=/usr/bin/nm-online -x -q
RemainAfterExit=yes

Environment=NM_ONLINE_TIMEOUT=60

[Install]
WantedBy=multi-user.target

Your assumption is wrong. Maybe the problem is the word “online”. “online” in this context does not mean the system has internet access. “online” in this context, for NM and systemd, means the system is part of a network.
For a computer at Home, it usually means, it is connected to a LAN. The system is online, since it has an IP address and routes, and can connect to other devices. It does not mean Internet access.

If the modem is part of the home computer (like in the 90s), it would mean internet access, but today, that is not common.

No, it waits for the successful set up of a configured network connection in NM.

By default, connections have the ipv4.may-fail and ipv6.may-fail properties set to yes; this means that NetworkManager waits for one of the two address families to complete configuration before considering the connection activated.

from the nm-online man page.

The -x is just “Exit immediately if NetworkManager is not running or connecting.” It does not wait if an NM network connection is in the process of starting.
NM notices via the connectivity check if there is no Internet access, but the system is still “online”, since the LAN usually still works.

I played with the nm-online and using the -x it returns immediately with a message indicating if the system has internet connection or not.

I tested using the network manager status icon disable/enable network.

If one adds the -f argument it will count down the default 30 seconds, or if one adds e.g. -f 5 it will count down 5 seconds and then display a message saying whether an internet connection exists.

The message is the same - it is only the countdown which is disabled using the -x argument.

The message indicates the system has network access, not internet access. There is a difference.

A better test would be to disconnect your router from the Internet, but do not power it off. You will lose internet access, but not network access. nm-online will still report online.

Enlightening - I didn’t think of that - but doing so (in a manner of speaking) - confirms what you are saying.

Hmm - this makes me speculate - which part of network manager uses the connectivity check configuration?

Never mind - I am straying off topic - nevertheless - it has been enlightening to dive into the mysteries of systemd :slight_smile:

For example the nmcli g command shows it (g for general).

If there is Internet access and the connectivity check is enabled it looks like this

$ nmcli g
STATE      CONNECTIVITY  WIFI-HW  WIFI     WWAN-HW  WWAN     
connected  full          enabled  enabled  enabled  disabled 

if the connectivity check failed it will look like this

$ nmcli g
STATE                  CONNECTIVITY  WIFI-HW  WIFI      WWAN-HW  WWAN     
connected (site only)  limited       enabled  disabled  enabled  disabled

If the connectivity check is disabled it shows full . The doc lists other values like unknown , none , portal .
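
For scripts, the same value can be read with `nmcli networking connectivity` (adding `check` asks NetworkManager to re-run the check first). A small sketch of how a script might act on it; the function name has_internet is made up for illustration, and this only tells you something if the connectivity check is enabled:

```shell
# Hypothetical helper: succeed only when the given connectivity state is "full".
# The state is passed in as an argument so the function itself needs no network.
has_internet() {
    [ "$1" = "full" ]
}

# Intended use (requires NetworkManager with the connectivity check enabled):
#   if has_internet "$(nmcli networking connectivity check)"; then
#       echo "Internet reachable"
#   fi
```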

There is a dispatcher action connectivity-change which can be used to do something if the connectivity changes. The variable CONNECTIVITY_STATE can be used inside a dispatcher script to determine if there is full connectivity or something else.
For example a dispatcher script could start or stop a unit depending on the connectivity state.

https://developer-old.gnome.org/NetworkManager/stable/NetworkManager.html#id-1.2.2.6
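
A rough sketch of such a dispatcher script follows. The file name, the unit name my-online-job.service, and the function verb_for_state are all made up for illustration. I am assuming CONNECTIVITY_STATE arrives in upper case (FULL, LIMITED, ...), so the match accepts both spellings to be safe. The script would have to live in /etc/NetworkManager/dispatcher.d/, be owned by root, and be executable.

```shell
#!/bin/sh
# Sketch of /etc/NetworkManager/dispatcher.d/50-online-job (hypothetical file name).
# NetworkManager calls dispatcher scripts with the interface as $1 and the action as $2.

UNIT="my-online-job.service"   # hypothetical unit name

# Map a connectivity state to the systemctl verb to apply to the unit:
# start it on full connectivity, stop it on anything else.
verb_for_state() {
    case "$1" in
        FULL|full) echo start ;;
        *)         echo stop ;;
    esac
}

# Only act on connectivity changes; every other dispatcher action is ignored.
if [ "${2:-}" = "connectivity-change" ]; then
    systemctl --no-block "$(verb_for_state "${CONNECTIVITY_STATE:-UNKNOWN}")" "$UNIT"
fi
```

Unlike network-online.target, this reacts every time the connectivity state changes, not just once at boot.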

Thank you to @megavolt , @linux-aarhus , and @xabbu. :heart:

I appreciate each and every one of you who replied, and I’ll share what I can gather from all of this.

There’s a lot to unpack and still play around, so I might have to cut some extraneous detail from my post.

At first I was going to directly quote your individual responses in my reply, but I think it might be better to combine everything as a “composite” response.

What follows is what I’ve read from all the responses in this thread so far, and my attempt to “put it all together” and make sure I understand correctly. Please correct me if I made a mistake!


My assumption / ideal:

There’s a universal definition of what “online” constitutes.

The reality:

There’s no such consensus among the software authors, end-users, and even system services. There’s no “standard” way of leveraging systemd targets and/or dependencies to “intelligently” check for online status before attempting to run a service. It depends if it requires local access (LAN, NAS, etc) or remote access (DropBox, Google, etc).

I like your approach, @megavolt , however I might need some further explanation about your command. Is it looking for a particular string from a failed ping attempt? Because if a target is not reachable, I get the output "Name or service not known". I don’t see anything about "100% loss".


Another assumption:

Systemd’s network-online.target can be used as a Wants and After dependency to make sure the computer is online (definition of “online” notwithstanding), before attempting to run a service. This can safeguard against early aborted services that are invoked upon resuming from suspend in which the network is not fully connected. (I.e, resuming from suspend and a persistent service immediately fails because there was no connection; yet if it had only waited for the network to be available, it would have succeeded.)

The reality:

Systemd’s network-online.target is only useful after booting up the system. It does not actively monitor the state of the network connection, regardless of local-only or internet access. Thus, using it as a dependency in a service doesn’t work as I naively misunderstood it. (This has caused confusion elsewhere in other forums and websites.)


As it stands now, one must invoke nm-online -t 30 as an ExecStart in each individual custom service that requires a network connection (for local-only), and possibly further requirements of trying to ping/reach a remote website related to the service (e.g, DropBox, Google, Amazon, etc) before continuing.


I tested out adding the line ExecStart=nm-online -q -t 30 before the service’s actual command (rsync to a NAS server), and when I wake from suspend , it works perfectly, every time! However, without it, then it fails about half the time (possibly because the network connection wasn’t ready quickly enough, and thus rsync tried to reach an unreachable target.)
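
For anyone wanting to reproduce this, here is roughly what such a unit could look like, wrapped in a shell function that writes it out so it can be inspected first. The unit name nas-backup.service, the paths, and the rsync arguments are placeholders, not my real setup. I put the wait into ExecStartPre= here, since several ExecStart= lines are only accepted for Type=oneshot services; if the wait fails, the main command is not run.

```shell
# Write a sketch of the service unit into the given directory (default: current directory).
# Unit name, description, and the rsync command are placeholders.
write_unit() {
    local dir="${1:-.}"
    cat > "$dir/nas-backup.service" <<'EOF'
[Unit]
Description=Back up to the NAS once NetworkManager reports a connection
Wants=network-online.target
After=network-online.target

[Service]
Type=oneshot
# Wait up to 30 seconds for NetworkManager to report a connection.
ExecStartPre=/usr/bin/nm-online -q --timeout=30
ExecStart=/usr/bin/rsync -a /home/user/data/ nas:/backup/data/
EOF
}

# Intended use:
#   write_unit /tmp && cat /tmp/nas-backup.service
```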

Using network-online.target makes zero difference, and does not work as one would assume by the name of the target.


I’m a bit disappointed because I was excited that systemd had some slick, intelligent checks and dependencies built-in, but alas, you still need to include “workarounds” in each service, such as a sleep timer or invoking nm-online. :unamused:


Am I missing something? Is there not a cleaner way to do this?


Addendum: The above issue is not a problem for systemd network mount/automount units (NFS, SMB, etc), since they have a TimeoutSec option that you can specify, for example 10 seconds. This means it will wait up to 10 seconds before giving up.

Is there such a thing as a TimeoutSec option for non-mount units, which one can add to their custom services? Otherwise, it appears that invoking "nm-online and/or ping" as the first ExecStart is the only feasible workaround.

:man_shrugging:


I’m going to re-read everyone’s comments when I have more free time, but I must run now!

:wave:

Another quick update. After doing some more testing, it appears that resuming from suspend has a very short window where the network interface is up, but a target is not reachable.

Reviewing the logs shows that the main script (which is intended to run when I’m “online”) will very likely fail, unless I use a successful ping with an if-then-else condition to proceed to the script itself.


Here is an example,

# Do not continue until nm-online reports "online" within 30 seconds. Otherwise, abort.
if nm-online -q --timeout 30; then

	# Do not continue until google.com is reachable within 20 seconds. Otherwise, abort.
	if timeout 20s bash -c 'until ping -c1 -q google.com > /dev/null 2>&1 ; do echo "Not reachable! Retrying in 2 seconds!" ; sleep 2s ; done'; then

		# Run the actual script with a bunch of stuff in here
		: # placeholder for the real commands

	else

		# Abort and exit because google.com was not reachable after 20 seconds.
		echo "Target not reachable. Aborting."
		exit 1

	fi

else

	# Abort and exit because nm-online timed out after 30 seconds.
	echo "No network connection available. Aborting."
	exit 1

fi

Using nm-online on its own was not sufficient. Even though nm-online would return “online”, the script would go on to immediately fail since the target was not reachable. (This didn’t always happen, since the window of failure is not consistent every time I resume from suspend.)

Reviewing the log under journalctl, I see "Not reachable! Retrying in 2 seconds!" echoed one time before the actual script runs, which means had the script tried to immediately run, it would have failed with an unreachable target, even though nm-online exits with “online”.
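
The retry part of the script above could also be pulled out into a small reusable function, so each script only states what it waits for. This is just a sketch; the name wait_for is made up. Unlike the `timeout 20s bash -c ...` version, it only checks the deadline between attempts, so a single attempt that hangs is not interrupted.

```shell
# Retry a command until it succeeds or the timeout (in seconds) has passed.
# Usage: wait_for TIMEOUT INTERVAL COMMAND [ARGS...]
wait_for() {
    local timeout_s="$1"
    local interval_s="$2"
    shift 2
    local deadline=$(( $(date +%s) + timeout_s ))
    until "$@"; do
        # Give up once the deadline has passed; otherwise pause and retry.
        if [ "$(date +%s)" -ge "$deadline" ]; then
            return 1
        fi
        sleep "$interval_s"
    done
    return 0
}

# Intended use, with the host and limits from the example above:
#   nm-online -q --timeout 30 || { echo "No network connection available. Aborting."; exit 1; }
#   wait_for 20 2 ping -c1 -q google.com > /dev/null 2>&1 || { echo "Target not reachable. Aborting."; exit 1; }
```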


Overall, I’m not as impressed with creating custom systemd services, as I was hoping to leverage more “intelligence” from it in terms of pre-run “dependencies” and “requirements” before attempting to run a service upon resuming from suspend.

Using network-online.target and other dependencies doesn’t do much after the initial bootup of the system.

I must still resort to using hack’ish methods to ensure that a service that needs a real “online” connection will not immediately fail because of an unreachable target. (Which means it will be skipped until the next time it is invoked by the timer.) :pensive:


Ideally, I was hoping for something like this:

Using "Wants" or "Requires" or "After" would create a hard condition for the service, such as,

  • network-interface-up.target
    – If the minimal requirement is that you need a network interface “up”

  • network-local-online.target
    – If the minimal requirement is that your gateway must be reachable

  • network-wan-online.target
    – If a remote address must be reachable

Furthermore, these “targets” will be dynamic and their active state would change based on connectivity, suspending, resuming, disconnects, etc. They would not be static after a successful system boot.


I can dream, can’t I?

:pleading_face: