Bug 14444 - systemd remote-fs.target is getting started before wireless networking is up
Summary: systemd remote-fs.target is getting started before wireless networking is up
Status: RESOLVED INVALID
Alias: None
Product: Mageia
Classification: Unclassified
Component: RPM Packages
Version: Cauldron
Hardware: x86_64 Linux
Priority: Normal, Severity: normal
Target Milestone: ---
Assignee: Colin Guthrie
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-11-03 16:35 CET by Frank Griffin
Modified: 2016-04-09 17:51 CEST

See Also:
Source RPM: systemd
CVE:
Status comment:


Attachments

Description Frank Griffin 2014-11-03 16:35:23 CET
As the summary says, but *literally*.  It's not failing startup, there is simply no attempt to start it at all, even though it shows as enabled.  The only mention in journalctl for it is that various NFS mounts failed because of their dependency on it:

[root@ftgme2 ~]# journalctl -a -b -u remote-fs.target 
-- Logs begin at Tue 2014-01-14 10:08:01 EST, end at Mon 2014-11-03 10:19:56 ES
Nov 03 10:10:45 ftgme2 systemd[1]: Dependency failed for Remote File Systems.
[root@ftgme2 ~]# systemctl status remote-fs.target
● remote-fs.target - Remote File Systems
   Loaded: loaded (/usr/lib/systemd/system/remote-fs.target; enabled)
   Active: inactive (dead)
     Docs: man:systemd.special(7)

Nov 03 10:10:45 ftgme2 systemd[1]: Dependency failed for Remote File Systems.
[root@ftgme2 ~]# 

However, once the system is up, it runs with no problem:

[root@ftgme2 ~]# systemctl start remote-fs.target
[root@ftgme2 ~]# df
Filesystem               Size  Used Avail Use% Mounted on
/dev/sda2                 40G   30G  8.2G  79% /
devtmpfs                 3.8G     0  3.8G   0% /dev
tmpfs                    3.8G  1.4M  3.8G   1% /dev/shm
tmpfs                    3.8G  1.2M  3.8G   1% /run
tmpfs                    3.8G     0  3.8G   0% /sys/fs/cgroup
tmpfs                    3.8G  336K  3.8G   1% /tmp
/dev/sdb10               563G  282G  252G  53% /mnt/windows
/dev/sda6                 16G  6.1G  8.8G  41% /usr/local
/dev/sda10               252G  217G   23G  91% /data2
/dev/sda8                252G  236G  4.0G  99% /mnt/VirtualBox
/dev/sda12               227G  146G   69G  68% /mga4
/dev/sda7                252G  175G   65G  74% /data
/dev/sda9                252G   91G  149G  38% /oma
tmpfs                    774M  8.0K  774M   1% /run/user/501
ftgfiles1:/mnt/cauldron  222G  112G   99G  53% /mnt/cauldron
ftgfiles1:/mnt/backups   201G  170G   21G  90% /mnt/backups
[root@ftgme2 ~]# 


I reported this a while back on the cauldron ML, and after a while I thought it had stopped happening.  However, back then I think it was starting but failing, e.g. before the network was fully up.  Now it's just MIA.
Comment 1 Colin Guthrie 2014-11-03 16:50:16 CET
I remember discussions of a bit of a rejig here upstream... will dig through that and take a look.
Comment 2 Colin Guthrie 2014-11-03 17:39:28 CET
Oh and just for reference, it starts fine here on my system.

What does "systemctl status systemd-networkd-wait-online.service" say?

If it's enabled and in any way failed, can you do "systemctl disable systemd-networkd-wait-online.service" and reboot and see if it helps?

There was a time when this would have been enabled by accident in cauldron (due to the preset file being shipped in a separate package to the script that used it), but this should no longer be the case for fresh installs or upgrades. It might be interfering...
Comment 3 Frank Griffin 2014-11-03 17:54:31 CET
Interesting you should mention that.  I've actually manually enabled it in the past, because I thought it had something to do with delaying dm until the network was up (like the old network-up kludge).

But I've checked both systems with this problem, and it shows as disabled on both.
Comment 4 Colin Guthrie 2014-11-03 18:10:02 CET
(In reply to Frank Griffin from comment #3)
> I thought it had something to do with delaying dm until
> the network was up (like the old network-up kludge).

Yeah, that's its intention and it will hopefully be the official replacement for our old network-up stuff in mga6 (when we'll hopefully move over to networkd properly and drop all the old ifup stuff in initscripts - but this is all just "it would be nice to"s so far - and depends somewhat on RH guys writing a generator for the old sysconfig files that we can steal, but AFAIK from speaking to Tom, that's the plan).

But I'm not sure we can rely on it just yet and it remains disabled in the .preset file shipped upstream, so I'm guessing it's still not quite ready yet for general usage. networkd for now is mostly passive, but we can still use the networkctl commands to get information and see it taking shape, so that's nice.

> But I've checked both systems with this problem, and it shows as disabled on
> both.

OK, back to the drawing board.

Would you be able to reboot with "systemd.log_level=7 systemd.log_target=journal" on the kernel command line and post output of "journalctl -b"? Might be nice to see what dep is missing and if there is any kind of ordering cycle thing going on.
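Once a debug-level journal is available, the lines of interest can be pulled out with a small filter. This is only a sketch: the function name find_dep_problems is made up for illustration, and it assumes the relevant messages contain the phrases "Dependency failed" or "ordering cycle", as systemd normally logs them.

```shell
# Hypothetical helper: read journal text on stdin and print only the lines
# that mention a failed dependency or an ordering cycle.
find_dep_problems() {
    grep -Ei 'dependency failed|ordering cycle'
}

# Typical use on the affected machine (current boot only):
#   journalctl -b --no-pager | find_dep_problems
```

An empty result would suggest that the target was never queued at all, rather than queued and then dropped.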
Comment 5 Colin Guthrie 2014-11-03 18:15:42 CET
Could be related to this...

http://cgit.freedesktop.org/systemd/systemd/commit/?id=919699ec301ea507edce4a619141ed22e789ac0d

Will apply it in the next build.
Comment 6 Frank Griffin 2014-11-03 18:53:58 CET
When I change the kernel parameters, the system tty1 output increases dramatically, but the system boots to the emergency shell (no idea why).

Control-D cogitates for a while, and then restarts the emergency shell.

What should I be doing here ?
Comment 7 Frank Griffin 2014-11-03 23:05:03 CET
OK, the latest systemd trashes the boot by locking up on the "Transferring journalctl to persistent storage" target.

If you hit ESC and watch the tty1 input, it gets to this stage and the timestamp at the end of the line ( xxx / nolimit ) just has xxx increasing.  I let it sit there for 6 minutes.

If you do an ALT-SYSRQ-E, everything magically proceeds and the boot actually succeeds.
Comment 8 Colin Guthrie 2014-11-04 10:48:52 CET
(In reply to Frank Griffin from comment #7)
> OK, the latest systemd trashes the boot by locking up on the "Transferring
> journalctl to persistent storage" target.

Ouch. It worked (and works) fine in my testing here, but do you happen to have a separate /var? Can you comment on bug #14452 instead on this bit?
Comment 9 Piotr Mierzwinski 2015-01-28 18:53:51 CET
I think I had a similar problem with remote-fs.target to Frank's, which is why I'm writing in this thread.
I say 'had' because I've solved it (without a workaround). Below you can find my investigation (including the solution). I hope it will be helpful to someone with a similar problem. The differences are: my system is i586 (32 bit) and I have newer packages (especially systemd) - all come from the latest Mageia 5 beta 2, including all updates up to 2015/01/28 5pm CET.

I needed to run 'mount -a -t nfs' by hand to get my NFS resources mounted, whereas on Mageia 4 this works fine.

[root@phenom2x4 system]# mount -a -t nfs
[root@phenom2x4 system]# systemctl list-dependencies --after remote-fs.target
remote-fs.target
● ├─home-piotr-Pobrane_DS.mount
● └─remote-fs-pre.target  (comment: this one is red)
●   ├─nfs-lock.service
●   └─rpcbind.service

[root@phenom2x4 system]# df -h /var
System plików  rozm. użyte dost. %uż. zamont. na
/dev/sdb2       8,8G  7,9G  451M  95% /

[root@phenom2x4 system]# systemctl status remote-fs.target
● remote-fs.target - Remote File Systems
   Loaded: loaded (/usr/lib/systemd/system/remote-fs.target; enabled)
   Active: inactive (dead)
     Docs: man:systemd.special(7)

sty 28 12:09:44 phenom2x4.home systemd[1]: Dependency failed for Remote File Systems.

[root@phenom2x4 system]# journalctl -a -b -u remote-fs.target 
-- Logs begin at sob 2014-12-13 17:54:35 CET, end at śro 2015-01-28 14:26:51 CET. --
sty 28 12:09:44 phenom2x4.home systemd[1]: Dependency failed for Remote File Systems.


Additionally, the log contained the following error:
[root@phenom2x4 system]# journalctl -a -b | grep mount.nfs
sty 28 12:09:44 phenom2x4.home mount[1370]: mount.nfs: Network is unreachable


I read on a forum that network-up.service should be enabled, but on my system I found:
[root@phenom2x4 system]# systemctl status network-up.service
● network-up.service - LSB: Wait for the hotplugged network to be up
   Loaded: loaded (/etc/rc.d/init.d/network-up)
   Active: inactive (dead)

so I enabled it:
[root@phenom2x4 system]# systemctl enable network-up.service
network-up.service is not a native service, redirecting to /sbin/chkconfig.
Executing /sbin/chkconfig --no-reload --no-redirect network-up on

[root@phenom2x4 mnt]# systemctl is-enabled network-up.service
network-up.service is not a native service, redirecting to /sbin/chkconfig.
Executing /sbin/chkconfig --no-reload --no-redirect network-up --level=5
enabled

And to be sure:
[root@phenom2x4 system]# /sbin/chkconfig --no-reload --no-redirect network-up --level=5


I wasn't sure whether this helped, because I didn't restart the system to check, so I kept looking for a solution. Next I checked which systemd-related units were disabled.
[root@phenom2x4 system]# systemctl list-unit-files | grep disabled | grep systemd
systemd-journal-upload.service          disabled
systemd-networkd-wait-online.service    disabled
systemd-nspawn@.service                 disabled
systemd-journal-gatewayd.socket         disabled
systemd-journal-remote.socket           disabled

The suspect became 'systemd-networkd-wait-online.service',
so I enabled it:
[root@phenom2x4 system]# systemctl enable systemd-networkd-wait-online.service
Created symlink from /etc/systemd/system/network-online.target.wants/systemd-networkd-wait-online.service to /usr/lib/systemd/system/systemd-networkd-wait-online.service.
[root@phenom2x4 system]# systemctl status systemd-networkd-wait-online.service
● systemd-networkd-wait-online.service - Wait for Network to be Configured
   Loaded: loaded (/usr/lib/systemd/system/systemd-networkd-wait-online.service; enabled)
   Active: inactive (dead)
     Docs: man:systemd-networkd-wait-online.service(8)
[root@phenom2x4 system]# systemctl restart systemd-networkd-wait-online.service
[root@phenom2x4 system]# systemctl status systemd-networkd-wait-online.service
● systemd-networkd-wait-online.service - Wait for Network to be Configured
   Loaded: loaded (/usr/lib/systemd/system/systemd-networkd-wait-online.service; enabled)
   Active: active (exited) since śro 2015-01-28 16:55:24 CET; 1s ago
     Docs: man:systemd-networkd-wait-online.service(8)
  Process: 18345 ExecStart=/usr/lib/systemd/systemd-networkd-wait-online (code=exited, status=0/SUCCESS)
 Main PID: 18345 (code=exited, status=0/SUCCESS)

OK. Time to restart and hope it works...

After the restart all NFS resources were mounted :-). Here is what changed in systemd:
[root@phenom2x4 system]# systemctl list-dependencies --after remote-fs.target
remote-fs.target
● ├─home-piotr-Pobrane_DS.mount
● └─remote-fs-pre.target    (comment: this one is still red.)
●   ├─nfs-lock.service
●   └─rpcbind.service

[root@phenom2x4 system]# systemctl status remote-fs-pre.target
● remote-fs-pre.target - Remote File Systems (Pre)
   Loaded: loaded (/usr/lib/systemd/system/remote-fs-pre.target; static)
   Active: inactive (dead)
     Docs: man:systemd.special(7)

[root@phenom2x4 piotr]# systemctl is-enabled remote-fs-pre.target
static

And here is the problematic 'target':
[root@phenom2x4 system]# systemctl status remote-fs.target
● remote-fs.target - Remote File Systems
   Loaded: loaded (/usr/lib/systemd/system/remote-fs.target; enabled)
   Active: active since śro 2015-01-28 17:24:22 CET; 11min ago
     Docs: man:systemd.special(7)

[root@phenom2x4 system]# journalctl -a -b -u remote-fs.target
-- Logs begin at sob 2014-12-13 17:54:35 CET, end at śro 2015-01-28 17:42:26 CET. --

[root@phenom2x4 system]# journalctl -a -b | grep mount.nfs
there is no output

Now all is fine.
I suppose the reason for "Dependency failed for Remote File Systems." was that 'systemd-networkd-wait-online.service' and/or 'network-up.service' was not active.

CC: (none) => piotr.mierzwinski

Comment 10 Samuel Verschelde 2015-05-31 23:53:23 CEST
What's the status for this bug? Still valid?
Samuel Verschelde 2015-05-31 23:53:30 CEST

Keywords: (none) => NEEDINFO

Comment 11 Frank Griffin 2015-07-13 17:38:41 CEST
Still valid:

[root@ftglap ~]# systemctl status remote-fs.target -l
● remote-fs.target - Remote File Systems
   Loaded: loaded (/usr/lib/systemd/system/remote-fs.target; enabled; vendor preset: enabled)
   Active: inactive (dead)
     Docs: man:systemd.special(7)

Jul 13 11:25:53 ftglap systemd[1]: Dependency failed for Remote File Systems.
Jul 13 11:25:53 ftglap systemd[1]: remote-fs.target: Job remote-fs.target/start 

But here's why (from journalctl):

Jul 13 11:25:53 ftglap mount[12582]: mount.nfs: Failed to resolve server ftgme2: Name or service not known
Jul 13 11:25:53 ftglap systemd[1]: mnt-ftgme2.mount: Mount process exited, code=exited status=32
Jul 13 11:25:53 ftglap systemd[1]: mnt-ftgme2.mount: Unit entered failed state.
Jul 13 11:25:53 ftglap systemd[1]: mnt-ftgme2.data.mount: Mount process exited, code=exited status=32
Jul 13 11:25:53 ftglap systemd[1]: mnt-ftgme2.data.mount: Unit entered failed state.
Jul 13 11:25:53 ftglap systemd[1]: mnt-ftgme2.data2.mount: Mount process exited, code=exited status=32
Jul 13 11:25:53 ftglap systemd[1]: mnt-ftgme2.data2.mount: Unit entered failed state.
Jul 13 11:25:53 ftglap xinetd[12635]: Reading included configuration file: /etc/xinetd.d/eklogin [file=/etc/xinetd.d/eklogin] [line=13]
Jul 13 11:25:53 ftglap systemd[1]: data-ftg-.thunderbird.mount: Mount process exited, code=exited status=32

systemd is trying to activate this target before the network is up.  Note that it would probably work if this were wired ethernet, which comes up earlier than wireless, but in my case it's wireless.

systemd should check for network mounts in fstab and if found should either delay this target until the network is up or else use a form of mount that automatically retries the mount until it works (and ignore whatever error mount returns the first time).
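For reference, mount options that approximate the "delay or retry" behaviour requested above already exist: systemd.mount(5) documents x-systemd.automount (mount on first access) and nfs(5) documents bg (keep retrying in the background after the first failure). The sketch below only lists the NFS entries in an fstab that do not use x-systemd.automount; the function name is made up for illustration, and whether these options would cure this particular report is untested.

```shell
# Hypothetical helper: print the mount point of every NFS entry in the given
# fstab file that does not use x-systemd.automount, i.e. entries that are
# mounted eagerly at boot and can fail if the network is not usable yet.
nfs_mounts_without_automount() {
    awk '$1 !~ /^#/ && $3 ~ /^nfs4?$/ && $4 !~ /x-systemd\.automount/ { print $2 }' "$1"
}

# Example of an entry that defers the mount until first access
# (server and path borrowed from the df output in the description):
#   ftgfiles1:/mnt/backups  /mnt/backups  nfs  noauto,x-systemd.automount  0 0
```

Running it as "nfs_mounts_without_automount /etc/fstab" shows which mounts depend on the network being usable at the moment remote-fs.target is processed.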
Comment 12 Frank Griffin 2015-11-03 03:28:37 CET
This is still occurring in current cauldron installs.  Ping ?
Frank Griffin 2015-11-03 03:47:22 CET

Summary: systemd remote-fs.target is not getting started => systemd remote-fs.target is getting started before wireless networking is up

Comment 13 Piotr Mierzwinski 2015-11-03 20:08:15 CET
For me all is fine. To be sure, I ran a simple test as follows.
I added 2 NFS resources to /etc/fstab and restarted the system. Once the system was up and running, I checked that both resources had been properly mounted.
The command "systemctl status remote-fs.target" reports the target as active.
System shuts down correctly.
Comment 14 Frank Griffin 2015-11-04 00:56:16 CET
I think that the critical issue here is wireless versus wired network access.  Wired is brought up far earlier in the boot sequence than wireless, and I have this problem exclusively (and consistently) with wireless under the control of NetworkManager.
Comment 15 Colin Guthrie 2015-11-04 09:19:12 CET
Do you have the NetworkManager-wait-online.service enabled? If not, does it help to enable it?
Comment 16 Frank Griffin 2015-11-04 13:08:38 CET
Now I'm confused:

*****************************************************************
[root@ftglap ~]# systemctl status NetworkManager-wait-online
● NetworkManager-wait-online.service - Network Manager Wait Online
   Loaded: loaded (/usr/lib/systemd/system/NetworkManager-wait-online.service; disabled; vendor preset: enabled)
   Active: inactive (dead) since Mon 2015-11-02 20:50:58 EST; 1 day 10h ago
 Main PID: 5675 (code=exited, status=0/SUCCESS)

Nov 02 20:50:52 localhost.localdomain systemd[1]: Starting Network Manager Wa...
Nov 02 20:50:58 ftglap systemd[1]: Started Network Manager Wait Online.
Hint: Some lines were ellipsized, use -l to show in full.
******************************************************************

"disabled; vendor preset: enabled" ?  It looks like it ran OK, but why does it show as disabled if it's shipped as enabled ?  Or am I reading this wrong ?
Comment 17 Frank Griffin 2015-11-04 13:14:29 CET
Looks like it really was disabled:

**********************************************************************
[root@ftglap ~]# systemctl enable NetworkManager-wait-online
Created symlink from /etc/systemd/system/multi-user.target.wants/NetworkManager-wait-online.service to /usr/lib/systemd/system/NetworkManager-wait-online.service.
[root@ftglap ~]# systemctl status NetworkManager-wait-online
● NetworkManager-wait-online.service - Network Manager Wait Online
   Loaded: loaded (/usr/lib/systemd/system/NetworkManager-wait-online.service; enabled; vendor preset: enabled)
   Active: inactive (dead) since Mon 2015-11-02 20:50:58 EST; 1 day 10h ago
 Main PID: 5675 (code=exited, status=0/SUCCESS)

Nov 02 20:50:52 localhost.localdomain systemd[1]: Starting Network Manager Wa...
Nov 02 20:50:58 ftglap systemd[1]: Started Network Manager Wait Online.
Hint: Some lines were ellipsized, use -l to show in full.
**********************************************************************

I'll see if that makes a difference.  Why is this not enabled to start with ?
Comment 18 Frank Griffin 2015-11-04 13:53:13 CET
Enabling it made no difference, but I suspect I see the problem.  The laptop has both wired and wireless NICs, and both are set to start at boot.  The wired NIC has nothing plugged into it.

I think that even though the wired NIC isn't cabled, it qualifies as being "up" for the purposes of network-up and NetworkManager-wait-online, and this allows the network mounts to start and fail even though no network access is actually there.

I think the criterion for these tests has to be not "is a NIC present and powered up" but "is a NIC actually working".
Comment 19 Colin Guthrie 2015-11-04 14:43:40 CET
We don't (yet) universally use the preset system, so it's not really surprising it's not enabled - it would require specific code in the package to enable it as things stand. It's something I'd certainly like to introduce and one of the (many) things I'll be looking to spend some time on while at the systemd conference in Berlin over the next few days.

Regarding the wired NIC, I don't think its mere existence qualifies as "online" in terms of the NM test. I could be wrong, but it should be easy enough to test - just look at the wait-online unit, take down your wireless and run the test and see if it passes.
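As a rough aid for that test, the filter below picks out the devices NetworkManager itself reports as connected. It is a sketch that assumes nmcli's terse output format (DEVICE:TYPE:STATE, one device per line); the function name connected_devices is made up for illustration.

```shell
# Hypothetical helper: read "DEVICE:TYPE:STATE" lines on stdin and print the
# names of the devices whose state is exactly "connected".
connected_devices() {
    awk -F: '$3 == "connected" { print $1 }'
}

# Typical use on the affected machine:
#   nmcli -t -f DEVICE,TYPE,STATE device | connected_devices
```

If an uncabled wired NIC or a virtual bridge shows up in this list while the wireless device does not, that would indicate which interface is satisfying the "online" check.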
Comment 20 Frank Griffin 2015-11-04 16:11:34 CET
>just look at the wait-online unit, take down your wireless and run the test and see if it passes.

I assume you mean remove the wireless interface and reboot ?  With NM (not using the ifcfg-rh stuff), how do you remove an interface so that NM won't find it automatically upon restart ?  I assume I can't just disable NM without having some effect on NM-wait-online ?
Comment 21 Frank Griffin 2015-11-17 17:13:08 CET
Ping ?
Comment 22 Frank Griffin 2016-03-09 23:39:07 CET
Ping again ?  This is still a problem in cauldron...

Keywords: NEEDINFO => (none)

Comment 23 Frank Griffin 2016-04-09 17:33:09 CEST
I think I finally see what's happening here.

I had a problem with SDDM which led me to switch my default runlevel to 3.  When I did that, I noticed that NM was not starting my wireless interface by default.  NM itself started, and it started virbr0 automatically, but not the wireless.  However, if I switch back to runlevel 5, the wireless *is* started when the desktop initializes.

So what I think is happening is NM starts but doesn't activate any usable interface.  Technically the network is up, so remote-fs runs but can't do any mounts because the only usable interface (wifi) hasn't been started yet.  Once the desktop is up, so is the wifi and remote-fs starts fine.

I was under the impression that NM was supposed to automatically start any interface that was marked as "system", which my wifi always was set to use.  However now when I try to configure the wireless with the plasma-applet-nm and select "all users may use this interface" (the successor to "system"), the button is selected, I click OK to exit the dialog and get no error.  But when I reopen the configuration dialog, the button is unselected again.

Looks like NM configuration is broken.
Comment 24 Frank Griffin 2016-04-09 17:51:29 CEST
That was it.  If I manually remove the "permissions" line from the /etc/NetworkManager/system-connections/{interface} file, remote-fs starts just fine during boot.

I'll enter a plasma-applet-nm bug.
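For anyone wanting to check their own connection files for the same condition, here is a minimal sketch. It assumes the keyfile format, where a user-restricted connection carries a non-empty line starting with "permissions=" (the sample value in the comment is illustrative); the function name is made up.

```shell
# Hypothetical helper: succeed (exit 0) if the given NetworkManager keyfile
# contains a non-empty "permissions=" line, i.e. the connection is restricted
# to particular users rather than available system-wide.
has_user_permissions() {
    grep -Eq '^permissions=.+' "$1"
}

# Typical use (as root), over every stored connection:
#   for f in /etc/NetworkManager/system-connections/*; do
#       has_user_permissions "$f" && echo "restricted: $f"
#   done
# A restricted file would contain a line such as:
#   permissions=user:frank:;
```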

Status: NEW => RESOLVED
Resolution: (none) => INVALID

