I'm noticing that my fstab NFS mounts aren't getting mounted, and unlike similar bugs it has nothing to do with the network not being up. I'll attach an edited syslog fragment that shows the network coming up but all of the systemd NFS mount tasks failing, possibly because NFS isn't completely up. However, after that, at a time when NFS absolutely is up, you can see mga-applet (through urpmi) trying to mount the NFS directories containing the cauldron repository tree and failing. This continued at intervals overnight. Much, much later, I noticed that the NFS mounts still weren't there, and manually ran the following:

umount -a -t nfs -l
mount -a -t nfs

which works fine.

This is not new. It's been the case for a few months. I just didn't have the time to look into it before now.
Created attachment 2485 [details] syslog fragment
Assignee: bugsquad => mageia
Can you post your fstab please?
# Entry for /dev/sda2 :
UUID=b7c08455-81fd-49af-81ad-174e541356b0 / ext4 acl,relatime 1 1
# Entry for /dev/sda7 :
UUID=7704902a-4f89-4198-b3c9-ce1b9357505d /data ext3 acl,relatime 1 2
# Entry for /dev/sda12 :
UUID=9d637c2c-4257-4ec8-a6d4-e914850b638b /data2 ext4 acl,relatime 1 2
# Entry for /dev/sda10 :
UUID=e64d0a6e-51f7-40e4-a517-03fecf2c35c9 /data3 ext4 acl,relatime 1 2
/dev/fd0 /media/floppy auto umask=0,users,iocharset=utf8,noauto,exec,flush 0 0
# Entry for /dev/sda8 :
UUID=7484643d-4b4c-4fa8-b53c-ecf52da2567a /mnt/VirtualBox ext3 acl,relatime 1 2
# Entry for /dev/sda11 :
UUID=04d64ee7-78ce-4236-b2a5-b3b49e024d4d /oma ext4 acl,relatime 1 2
none /proc proc defaults 0 0
none /tmp tmpfs defaults 0 0
# Entry for /dev/sda6 :
UUID=c7cc1a49-beee-4517-a6b5-5379dd321aab /usr/local ext3 acl,relatime 1 2
# Entry for /dev/sda5 :
UUID=384d236d-635d-41ac-b1f7-8773152d04a0 swap swap defaults 0 0
ftglap:/ /mnt/ftglap nfs rw,bg,soft 0 0
ftglap:/data /mnt/ftglap.data nfs rw,bg,soft 0 0
ftgfiles1:/mnt/cauldron /mnt/cauldron nfs rw,bg,soft 0 0
ftgfiles1:/mnt/cooker /mnt/cooker nfs rw,bg,soft 0 0
ftgfiles1:/mnt/plf /mnt/plf nfs rw,bg,soft 0 0
ftgfiles1:/usr/local /mnt/ftgfiles1.usr.local nfs rw,bg,soft 0 0
ftgfiles1:/mnt/backups /mnt/backups nfs rw,bg,soft 0 0
Ahh sorry, this is in cauldron right? Recently too? I've not really looked at the nfs stuff lately, but I can imagine this being a bit broken just now after switching to native units for starting the nfs stuff. Guillaume, what's the latest status of this?
CC: (none) => guillomovitch
Yes, cauldron, but it's not really recent. It definitely predates the MGA2 release. Of course, the cause may have changed. I couldn't swear that it hasn't gotten fixed and then resurfaced. This is just one of a few long-standing behaviors having to do with initialization that I'm just now finding time to look into.
There was a problem up until quite late in the mga2 cycle, but it was fixed in time. It was one you yourself reported and confirmed fixed: bug #5890. So fingers crossed this is a resurface rather than a long-standing issue.
(In reply to comment #6)
> There was a problem up until quite late in the mga2 cycle, but it was fixed
> in time. It was one you yourself reported and confirmed fixed: bug #5890.

Errr, and another grey cell hits the old bug light.....

Looking into the bug#5890 fix, the fix to rpcbind.service is there currently, but there is no longer a remote-fs-pre.target file, just a remote-fs.target, and it contains nothing like the fix you made to remote-fs-pre.target.
Are you sure remote-fs-pre.target no longer exists? It should be there.... It is here in my updated systemd package in updates_testing, but I'm pretty sure it's still part of the older systemd in release too.

/lib/systemd/system/remote-fs-pre.target

Most confusing if it's no longer there...
It's in the /lib subdirectory, but not in /etc/systemd:

[ftg@ftgme2 multi-user.target.wants]$ cd /lib/systemd
[ftg@ftgme2 systemd]$ find . -name 'remote-fs*'
./system/remote-fs-pre.target
./system/remote-fs.target
./system/remote-fs-login.target
[ftg@ftgme2 systemd]$ cd /etc/systemd/system
[ftg@ftgme2 system]$ ls
basic.target.wants/                           graphical.target.wants/
bluetooth.target.wants/                       httpd.service@
dbus-org.bluez.service@                       multi-user.target.wants/
dbus-org.freedesktop.NetworkManager.service@  printer.target.wants/
default.target@                               sockets.target.wants/
default.target.wants/                         syslog.service@
getty.target.wants/
[ftg@ftgme2 system]$ find . -name 'remote-fs*'
./multi-user.target.wants/remote-fs.target
[ftg@ftgme2 system]$
(In reply to comment #9)
> It's in the /lib subdirectory, but not in /etc/systemd:

That's how it should be. /etc/ is for administrator controlled/controllable changes. /lib/ is for packaging stuff. Both folders (along with their companion, /run/) are searched when loading units. This allows the system to provide units, but if the user is unhappy with them for whatever reason they can typically write their own and place it in /etc. So this is all normal.

As I said tho', I've not looked at the latest incarnation of NFS stuff to see how it's working and what's needed. I'll try and have a look at some point this week.
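To illustrate the search-path behaviour described above - a minimal sketch of the general override mechanism, using remote-fs-pre.target purely as an example:

# The packaged unit lives under /lib; a copy in /etc takes precedence:
cp /lib/systemd/system/remote-fs-pre.target /etc/systemd/system/
# ...edit the copy in /etc/systemd/system as desired, then tell systemd
# to re-read its unit files:
systemctl daemon-reload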
ping?
The fix to remote-fs-pre.target for bug#5890 appears to have gone missing in current cauldron, unless it's been reworked.
This is behaving differently now. In a fresh install, no attempt to start NFS is made at all. I'll attach a syslog fragment that illustrates this. After doing "systemctl enable rpcbind.service" and "systemctl enable nfs-server.service" and rebooting, systemd immediately produces a raft of errors about NFS (next syslog fragment).
Created attachment 2669 [details] syslog fragment before issuing systemctl enable
Created attachment 2670 [details] syslog fragment after systemctl enable

My assumptions are: (1) nfs stuff isn't being enabled when it should be, and (2) I probably screwed up my manual attempt to enable things.
*** Bug 6994 has been marked as a duplicate of this bug. ***
CC: (none) => wilcal.int
Since Bug 6994 was identified as a duplicate of this bug after I posted my comment there, I'm posting the same comment here so I can track it:

>> This bug is valid for me in M3A3. The relevant fstab line is
>> 192.168.1.2:/mnt/data /home/david/Server nfs defaults 0 0
>>
>> It's a home server running NAS4Free.
>>
>> The mount, configured manually in fstab, shows up in MCC (Network
>> Sharing/Configure NFS Shares) but I have to tell it to mount after each
>> reboot.
CC: (none) => DShelbyD
@Comment 17: I have just completed a fresh install of the Mageia 3 Beta 1 KDE 64-bit Live DVD on another partition of a hard drive which already has an up-to-date Mageia 2 and an up-to-date rolling Cauldron, both 64-bit. I can confirm exactly the same problem as you have reported on this new install of M3B1, whereas the same fstab line that I use in the Mageia 2 and Cauldron installs works perfectly.
CC: (none) => kin7500-ffrm
I have just found a reasonable workaround for my problem. I created an rc.local file in /etc/rc.d/ and made it executable for the owner (root). The contents are:

#!/bin/sh -e
#
# rc.local
#
# In order to enable or disable this script just change the execution
# bits.

mount 192.168.x.x:/YourShare /mnt/YourMount

exit 0

(Modify the mount line for your system.) Systemd will automatically find this, ref: /usr/lib/systemd/system/rc-local.service

I have also removed the mount line for the NAS in fstab. My conclusion is that there is a timing issue in the boot process in M3B1 which is not present in my rolling Cauldron installation, so it may not be the same as Frank's or David's problem. Perhaps Colin could comment on whether this conclusion is a possibility, and whether there is a more elegant solution that does not involve rc.local?
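For anyone reproducing this workaround: the "executable for the owner" step described above amounts to something like the following, after which the rc-local.service unit mentioned should pick the script up at the next boot:

chmod u+x /etc/rc.d/rc.local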
A further data point. I have the same problem on my Cauldron installation (64-bit, bare-metal install on an i3 laptop) which has been rolling forward since Alpha 2. It is not a Beta 1 issue. There is nothing obvious in any logs.
CC: (none) => mg
Colin, any progress on this? I am still having to start nfs-server by hand and do a "mount -a -t nfs" after each boot, and we're nearly at freeze...

This may or may not be related, but I also see timing issues with services that require the network. I have a JBoss server that is configured to listen on all interfaces, but systemd appears to start network stuff as soon as "lo" is up, the result being that the server comes up listening only on 127.0.0.1 rather than on that plus the actual DHCP-assigned eth0 interface. If I stop and start the service from the command line, it's back to normal.
I have the same issue: no nfs mount is attempted. Considering this is a PXE+NFS setup, this can very well be why systemd-remount-fs fails... I noticed that nfs-common and nfs-server were removed and replaced by nfs.target (I assume for nfs servers). However, nothing seems to happen at boot time. Relevant unit files:

remote-fs-login.target  static
remote-fs-pre.target    static
remote-fs.target        disabled

I'm gonna try enabling remote-fs.target and see what happens at reboot.
CC: (none) => alien
> considering this is a PXE+NFS setup

I don't think that's the case for any of the rest of the posters. It really seems to be a disconnect between the way systemd has evolved and the services that it controls.

From your post, it seems that several other services have supplanted the old nfs-client, and it's unclear whether they are being started, or started at the proper time.

The people who have modified the services and the people who maintain systemd really need to talk to each other to sort this out.
After enabling remote-fs.target I got a partial success: 3 of the 4 mount points succeeded in mounting after boot. The status of the 4th one (actually the first) is that lockd wasn't started (yet). I guess some tuning is necessary here...
(In reply to comment #23)
> I don't think that's the case for any of the rest of the posters. It really
> seems to be a disconnect between the way systemd has evolved and the
> services that it controls.
>
> The people who have modified the services and the people who maintain
> systemd really need to talk to each other to sort this out.

I don't think it has a bearing on the issue whether I use this setup or not. I'm just stating that for me, it's ESSENTIAL to have my stuff mounted at boot time, or weird stuff starts happening.
Yup, same here. I suspect that this is all fallout from the frantic race over the past few years to reduce the boot clock time until the user is presented with a GUI DM login prompt. To this end, even before systemd arrived, we were having conflicts between traditional server-type systems that wanted things started in lock-step and the GUI-noob-we-don't-use-network-services-before-the-desktop-is-up philosophy of get-the-login-prompt-up-ASAP. I had hoped that systemd would have helped the situation, but unless more attention is paid to the relationships among the more granular startup services, that's not going to be the case.
I didn't personally look at the NFS units after they were converted to systemd units so I've no idea what state they are in. My NFS mounts are fine when I boot here, but I'm not running an NFS server on this machine so I've not looked closely. It obviously has to work correctly for final - it's a very important part of the system.

Regarding the starting of services after the network is up, that's really not something systemd should do directly. How do you define whether the network is "up" anyway? Different apps have different definitions: lo existing, a real network device existing, a real network device existing with a link beat, a real network device existing with a link beat and IP, a real network device existing with a link beat, IP and can connect to the internet, etc. etc. This isn't really something we can enumerate with systemd targets for all the services that want all the various permutations.

The "correct" solution for such services is to subscribe to the kernel netlink events and simply deal with network hotplugging properly within themselves. With USB and bluetooth network devices etc. that's the only sensible way to do it anyway - there is no real "point in time" where things are valid.

In order to have a semblance of support here, you can ensure the network-up.service is enabled and running. This service *should* ensure that network.target is delayed to a sufficient point that the network is "up" (to the definition that it has an IP). We also have NetworkManager-wait-online.service which does much the same thing but only for NM setups, and it has to be ordered/enabled manually.

Perhaps the former one (network-up.service) is no longer working 100% with the latest initscripts and/or NM? If you guys could poke a bit deeper into the problem that would be great!
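To check the arrangement described above on your own machine, something like the following should do (assuming systemctl can enable the sysvinit-era network-up script directly; if not, "chkconfig network-up on" is the fallback):

# Make sure the network.target delay hack is active:
systemctl enable network-up.service
systemctl status network-up.service
# The NM-only alternative mentioned above:
systemctl enable NetworkManager-wait-online.service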
> Regarding the starting of services after the network is up, that's really
> not something systemd should do directly. How do you define whether the
> network is "up" anyway? Different apps have different definitions: lo
> existing, a real network device existing, a real network device existing
> with a link beat, a real network device existing with a link beat and IP, a
> real network device existing with a link beat, IP and can connect to the
> internet, etc. etc. This isn't really something we can enumerate with
> systemd targets for all the services that want all the various permutations.

Why not? Isn't that the bedrock of pre-systemd initscript ordering? And wasn't increased parallelism, achieved through finer granularity of event requirements, one of the goals of systemd?

> The "correct" solution for such services is to subscribe to the kernel
> netlink events and simply deal with network hotplugging properly within
> themselves. With USB and bluetooth network devices etc. that's the only
> sensible way to do it anyway - there is no real "point in time" where
> things are valid.

Well, except for the fact that there are far more legacy apps that won't work without the network than ones that won't work without USB and bluetooth, and many of them are simply not going to support Linux-only paradigms; e.g. in the JBoss case, there's no way to tell a Java Virtual Machine to subscribe to kernel events. It may be reasonable to require apps playing with newer devices to use newer techniques, but it would be better design to centralize such event detection in one place for portability.

systemd must already be doing this to support network-up to the extent that it does. I think a reasonable compromise would be to have a single new event, say network-usable-external, that is true only if some interface has an IP and can connect to the internet. That's really all most apps care about.

That said, I'll keep poking to see what I can find.
(In reply to comment #27)
> I didn't personally look at the NFS units after they were converted to
> systemd units so I've no idea what state they are in. My NFS mounts are fine
> when I boot here, but I'm not running an NFS server on this machine so I've
> not looked closely. It obviously has to work correctly for final - it's a
> very important part of the system.

It isn't really about an NFS server, but about mounting your fstab mounts at boot time. I personally had to enable remote-fs.target to get it somewhat working; I suppose that's not the best way either...
There is additional information in systemd.automount(5) and systemd.mount(5), including /etc/fstab support. Indeed, the bug report should be renamed if the problem is only with boot-time mounting of statically declared mount points.
Summary: systemd seems to have problems with NFS mounts => systemd seems to have problems with NFS mounts in /etc/fstab
Enabling remote-fs.target is exactly what you're supposed to do - it's the equivalent of the old netfs service.

I guess we probably need to do something about that - i.e. automatically transition it - I forget if I did something "clever" to enable it on upgrade last time around (I know I masked the netfs service to prevent it interfering)... will have to have a poke.

When you say it's somewhat working, what do you mean? Is it mounting some but not all definitions? I guess starting them all manually after boot works fine? If so, do a fresh boot and issue "systemctl status foo.mount" for all the mount units. That should show us at the very least the exit status from the mount operation for the failed ones.
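For anyone following along: mount unit names are derived from the mount point path with "/" translated to "-", so with the fstab posted earlier the checks would look something like this (mnt-cauldron.mount is just one of the nfs entries there):

# List the mount units systemd generated from fstab:
systemctl list-units -t mount
# Then query each failed one individually, e.g. for /mnt/cauldron:
systemctl status mnt-cauldron.mount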
[Your quoting on the reply seems messed up. I tried to fix it.]

(In reply to comment #28)
> > Regarding the starting of services after the network is up, that's really
> > not something systemd should do directly. How do you define whether the
> > network is "up" anyway? Different apps have different definitions: lo
> > existing, a real network device existing, a real network device existing
> > with a link beat, a real network device existing with a link beat and IP,
> > a real network device existing with a link beat, IP and can connect to the
> > internet, etc. etc. This isn't really something we can enumerate with
> > systemd targets for all the services that want all the various
> > permutations.
>
> Why not? Isn't that the bedrock of pre-systemd initscript ordering? And
> wasn't increased parallelism, achieved through finer granularity of event
> requirements, one of the goals of systemd?

No, not at all. The pre-systemd initscript ordering had the same level of granularity as systemd in this regard. The old "network" script started the network and the "network-up" script waited for it to be online. Both contributed towards $network, and several other services Required $network. This is 1:1 mapped to systemd's network.target.

While increased parallelism is indeed a goal of systemd, you have to draw the line somewhere, otherwise the core of systemd itself becomes totally bloated dealing with every single aspect of userspace hotplugging and local-to-device states. Dealing with network events is certainly one of these places where the line has been drawn. Dealing with flakey networks (e.g. 3G coming and going) or hotplugged networks (e.g. when I pair my phone or plug in a USB network device) should, for the most part, be handled in the apps themselves. If the app requires a fixed, working network then you could argue the app itself has an incomplete implementation. It should be able to cope with things coming and going - the mythical "network is ready" event is not something that really happens - it might be ready now, but half a microsecond later it might not be ready again. True robustness only comes through properly handling these things where they need to be handled. Of course 99% of the time the "network is ready" point in time can be defined - it's "good enough" in most cases. Doesn't mean it's the right way to do things tho'.

> > The "correct" solution for such services is to subscribe to the kernel
> > netlink events and simply deal with network hotplugging properly within
> > themselves. With USB and bluetooth network devices etc. that's the only
> > sensible way to do it anyway - there is no real "point in time" where
> > things are valid.
>
> Well, except for the fact that there are far more legacy apps that won't
> work without the network than ones that won't work without USB and
> bluetooth, and many of them are simply not going to support Linux-only
> paradigms; e.g. in the JBoss case, there's no way to tell a Java Virtual
> Machine to subscribe to kernel events. It may be reasonable to require apps
> playing with newer devices to use newer techniques, but it would be better
> design to centralize such event detection in one place for portability.

Well that's why hacks like network-up and NetworkManager-wait-online as I mentioned above exist. These still provide support for the "it's ready" point in time (even when it's not actually really ready!) exactly as before.

> systemd must already be doing this to support network-up to the extent that
> it does. I think a reasonable compromise would be to have a single new
> event, say network-usable-external, that is true only if some interface has
> an IP and can connect to the internet. That's really all most apps care
> about.

I'm not sure what you mean by "event" here. There are not really such things as "events" in systemd. The closest thing is a target starting and stopping.

We already have the network-up.service and NetworkManager-wait-online.service as mentioned. Individual units can either explicitly request these units (e.g. Requires=network-up.service) or they can just do the recommended After=network.target, which will only be available when network-up.service has been run, if it's enabled.

Typically, the login screen will not be delayed waiting for the network to become ready, so enabling network-up.service carries little impact to the user experience. Obviously there are some exceptions to that rule. If we enable remote LDAP authentication then the network-auth.service is enabled. It automatically pulls in network-up.service to ensure that login screens are delayed until after the network is "up" and thus the authentication service is available.

Similar tricks are used for NFS mounts that are not marked as nofail in fstab. The lack of a nofail option is a hint to us that the NFS filesystem is a critical part of the system and it's needed for logins (e.g. /home over NFS). In this instance again, the login screen is delayed until after network.target.

So really the granularity is there. You just need to ensure that network-up.service is enabled, and ordering things After=network.target should mean they are only started *after* the network is up and ready. "systemd-analyze plot" should give you a nice visualisation of this in the form of an svg. Looking at mine locally, this all appears to be working fine on my machine.
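To make those two ordering options concrete, a unit wanting the network would carry something like this (a minimal hypothetical sketch - "exampled" is not a real service):

[Unit]
Description=Example daemon that needs the network to be up
# Recommended: order after the generic target...
After=network.target
# ...and only if the service truly cannot cope on its own, explicitly
# pull in (and order after) the delay hack as well:
Requires=network-up.service
After=network-up.service

[Service]
ExecStart=/usr/sbin/exampled

[Install]
WantedBy=multi-user.target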
(In reply to comment #31)
> Enabling remote-fs.target is exactly what you're supposed to do - it's the
> equivalent of the old netfs service.
>
> I guess we probably need to do something about that - i.e. automatically
> transition it - I forget if I did something "clever" to enable it on upgrade
> last time around (I know I masked the netfs service to prevent it
> interfering)... will have to have a poke.

Actually, iinm, another systemd unit is looking at the local mount points; why wouldn't it just start remote-fs if it sees that there ARE remote mount points? That seems to me like a cleaner solution.

> When you say it's somewhat working, what do you mean? Is it mounting some
> but not all definitions? I guess starting them all manually after boot works
> fine? If so, do a fresh boot and issue "systemctl status foo.mount" for all
> the mount units. That should show us at the very least the exit status from
> the mount operation for the failed ones.

What I mean is that the first one fails, and the next 3 mount points succeed. My guess would be that something is required to mount them, but status didn't give any info except that it failed to mount. But then, I would guess that remote-fs isn't necessarily NFS-only, so I hope that remote-fs.target should load whatever is required? What about these other remote-fs* units? Do I need to enable those too?
Created attachment 3329 [details] systemd-analyze blame for Rolling Cauldron
Created attachment 3330 [details] systemd-analyze plot for Rolling Cauldron

Whoops, lost my comments about my attachments. As I reported in Comment #18, I have a rolling Cauldron installed and my nfs mount (mounted on /mnt/GNU) is correctly mounted via my fstab entry. I also have on the same HDD a clean M3B1 install, and the same entry in fstab does not work. I reported a workaround, using rc.local, in comment #19.

I have therefore attached a systemd-analyze blame and plot for both installs. Hopefully someone with more knowledge than I have can compare them to see if they reveal anything interesting.
Created attachment 3331 [details] systemd-analyze blame for Clean M3B1
Created attachment 3332 [details] systemd-analyze plot for Clean M3B1
@Steve: so as you can see from your plots, network.service and network-up.service appear to both be running and doing their jobs correctly. network.target is delayed until network-up.service has run (which presumably means the network is available). Various other services (ntpdate, nfs-server, smb etc.) are delayed until network.target is available (and thus network-up has completed). That all seems to be as intended.

On your clean M3B1, there is no remote-fs.target present. So can you run "systemctl enable remote-fs.target", which should start it on boot.

What is interesting about your Rolling Cauldron plot is that remote-fs.target is there, but I do not see any mount units being started as part of it. Can you post your fstab and note which partitions you expect to be mounted as network mount points?
Created attachment 3341 [details] RollingCauldron fstab

Colin, I ran your command on my Clean M3B1 and it now mounts my nfs share with the fstab entry. Do you want me to post my new systemd-analyze plot for M3B1? I can see that remote-fs.target is now there, and my mount (mnt-GNU.mount) is shown occurring just before remote-fs.target, as in the RollingCauldron plot.

I've attached the fstab for my RollingCauldron; the last line shows the share that I am mounting on /mnt/GNU.
Oh yeah, I missed the mnt-GNU.mount in your initial Rolling Cauldron plot - I thought it wasn't mounting anything, but I see now it's OK.

As an optimisation, you should be able to add the "nofail" option to your fstab entry for the mount point. This will mean that systemd-user-sessions.service (and thus gdm/kdm or whatever prefdm.service loads) should start a *lot* earlier. This will give the impression/feeling of a faster boot. Due to the mount point not having the "nofail" option, we presume it's a needed mount for normal operation and delay login screens accordingly (you'll see from the plot that prefdm.service only starts after remote-fs.target; if you add the nofail option and then reboot and compare the plots, you should see it starting a lot earlier).

So just to confirm, this is OK for you now after the enabling?

I guess the change we should make at least is to either:
a) Enable remote-fs.target by default, or
b) Make any remote mounts automatically enable it when needed.

I think I prefer to allow admin control, so a) is my personal preference (and is 1:1 mapped to how the old netfs service worked).
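For illustration, a generic fstab line with the option added would look something like this (server and export are placeholders, not Steve's actual entry):

server:/export  /mnt/GNU  nfs  rw,nofail  0 0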
Confirming this is OK for me, after you've made change a) or b) ;-) And thanks for the tip about nofail, I'll try that.
I prepared a reply to you, Colin, and I'd love to know where I posted it, but I don't see it here, so here was the gist...

I agree that it makes more sense for apps to handle interfaces that go away or aren't up to begin with, using whatever means are at their disposal. I checked my LSB services, and they specified $network rather than $network-up, so that may explain what I saw.

I verified that remote-fs was not enabled, then enabled it and rebooted. All of my fstab NFS mounts mounted successfully. So, as soon as you enable remote-fs, this bug can be closed.
Is it harmful to have remote-fs enabled by default for everyone? Maybe it'd be better to only do this by having the thing that looks at the fstab (for local mounts) trigger it if it comes across remote mounts, which it will skip anyway.
(In reply to comment #42)
> So, as soon as you enable remote-fs, this bug can be closed.

Well, not quite. Now, in a fresh laptop install, remote-fs.target, even while enabled, is showing the same hit-and-miss behavior:

[root@ftglap grub]# systemctl status remote-fs.target
remote-fs.target - Remote File Systems
          Loaded: loaded (/usr/lib/systemd/system/remote-fs.target; enabled)
          Active: inactive (dead)
            Docs: man:systemd.special(7)

[root@ftglap grub]# systemctl start remote-fs.target
[root@ftglap grub]# systemctl status remote-fs.target
remote-fs.target - Remote File Systems
          Loaded: loaded (/usr/lib/systemd/system/remote-fs.target; enabled)
          Active: active since Tue, 2013-01-22 10:35:22 EST; 5s ago
            Docs: man:systemd.special(7)

The target is enabled, but fails during boot, although it works fine if started manually.
(In reply to comment #42)
> I checked my LSB services, and they specified $network rather than
> $network-up, so that may explain what I saw.

Just to reply to this bit in particular: the $ prefix on the name here ("$network") isn't referring to the "network" initscript, but rather to an abstract concept of "networking". It can be thought of as a target in systemd speak vs. a service for a regular init script, i.e. multiple services may all combine to contribute to $network (aka network.target). It's kinda hard to explain, but if you already grok the difference between targets and services in systemd, then the difference between network and $network in LSB speak should be at least partially clearer :)
(In reply to comment #44)
> (In reply to comment #42)
> > So, as soon as you enable remote-fs, this bug can be closed.
>
> Well, not quite. Now, in a fresh laptop install, remote-fs.target, even
> while enabled, is showing the same hit-and-miss behavior:
> [...]
> The target is enabled, but fails during boot, although it works fine if
> started manually.

Is there any info on boot from the individual .mount units affected? e.g. "systemctl status foo.mount" will tell you the return value of the mount command itself. Are these shares all nfs shares, or some other filesystem type? Only a select few filesystem types are automatically recognised as network mounts. Otherwise you have to tell systemd they are network mounts using the _netdev option in fstab.

Regarding automatically enabling remote-fs.target when a network mount is found: that would ultimately remove functionality. e.g. in the past we could simply disable the netfs service under sysvinit to not mount network shares. We would lose simple disabling of remote mounts should the sysadmin want it (we could still mask the target, but that seems overkill). I have no problem with enabling remote-fs.target by default tho' - which is essentially the same as it was in the past.
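As an example of the _netdev hint mentioned above (a generic sketch - the device and filesystem are placeholders; nfs itself doesn't need this, since it's one of the automatically recognised network types):

# A network-backed block device whose fs type doesn't reveal it's remote;
# _netdev tells systemd/mount to treat it as a network mount:
/dev/drbd0  /mnt/shared  ext4  defaults,_netdev  0 0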
Hmm, actually, remote-fs.target should be enabled by default on new installs anyway. Certainly the %post script of systemd-units package suggests it is. From the above comments do I understand correctly that this is not the case you've seen in practice?
100% certain it was (at that time, of course). Perhaps that is fixed now, but it ran too soon for comment 44, thus failing...
(In reply to comment #47)
> Hmm, actually, remote-fs.target should be enabled by default on new installs
> anyway. Certainly the %post script of systemd-units package suggests it is.
> From the above comments do I understand correctly that this is not the case
> you've seen in practice?

On a fresh install of Beta 2 (64-bit from installer DVD into a VM):

[root@localhost ~]# systemctl status remote-fs.target
remote-fs.target - Remote File Systems
          Loaded: loaded (/usr/lib/systemd/system/remote-fs.target; disabled)
          Active: inactive (dead)
            Docs: man:systemd.special(7)
OK, I think I've noticed a problem in the post script that would prevent remote-fs.target getting enabled. Should be fixed in subversion so the next systemd release should have the fix. A net install will see it before the rc if you want to try that (note: dvd+updates won't work).
[root@localhost ~]# systemctl status media-fotos.mount
media-fotos.mount - /media/fotos
          Loaded: loaded (/etc/fstab)
          Active: failed (Result: exit-code) since Sun, 2013-01-27 14:57:42 CET; 11min ago
           Where: /media/fotos
            What: 10.238.9.1:/media/fotos
         Process: 800 ExecMount=/bin/mount 10.238.9.1:/media/fotos /media/fotos -t nfs -o defaults,ro,rsize=32768,wsize=32768,nfsvers=3,hard,proto=tcp (code=exited, status=32)
          CGroup: name=systemd:/system/media-fotos.mount
                  └─809 rpc.statd --no-notify

Jan 27 14:57:42 localhost rpc.statd[809]: Version 1.2.7 starting
Jan 27 14:57:42 localhost rpc.statd[809]: Flags: TI-RPC
Jan 27 14:57:42 localhost mount[800]: mount.nfs: rpc.statd is not running but is required for remote locking.
Jan 27 14:57:42 localhost mount[800]: mount.nfs: Either use '-o nolock' to keep locks local, or start statd.
Jan 27 14:57:42 localhost mount[800]: mount.nfs: an incorrect mount option was specified
Jan 27 14:57:42 localhost systemd[1]: Failed to mount /media/fotos.
Jan 27 14:57:42 localhost systemd[1]: Unit media-fotos.mount entered failed state
That was the first mount point, the one that fails (the others don't). This is one that doesn't fail:

[root@localhost ~]# systemctl status media-audio.mount
media-audio.mount - /media/audio
          Loaded: loaded (/etc/fstab)
          Active: active (mounted) since Sun, 2013-01-27 14:57:42 CET; 13min ago
           Where: /media/audio
            What: 10.238.9.2:/audio
         Process: 803 ExecMount=/bin/mount 10.238.9.2:/audio /media/audio -t nfs -o defaults,ro,rsize=32768,wsize=32768,nfsvers=3,hard,proto=tcp (code=exited, status=0/SUCCESS)
          CGroup: name=systemd:/system/media-audio.mount
                  └─814 rpc.statd --no-notify

Jan 27 14:57:42 localhost rpc.statd[814]: Version 1.2.7 starting
Jan 27 14:57:42 localhost rpc.statd[814]: Flags: TI-RPC
Jan 27 14:57:42 localhost systemd[1]: Mounted /media/audio.
Hmm, so perhaps the problem is that rpc.statd has not yet run (or fully started up and become ready) when the mount is started. If possible, can you do a status on that service too (from the same boot) so that we can compare the times at which they were started?
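A way to gather that from the same boot might be something like this (the nfs-lock.service name is only a guess at whichever unit provides rpc.statd on this setup - adjust as needed):

# Start time and exit status of the failing mount:
systemctl status media-fotos.mount
# The unit providing rpc.statd (name is an assumption):
systemctl status nfs-lock.service
# Or just pull every statd line from the current boot's journal:
journalctl -b | grep rpc.statd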
(In reply to comment #50)
> OK, I think I've noticed a problem in the post script that would prevent
> remote-fs.target getting enabled.
>
> Should be fixed in subversion so the next systemd release should have the
> fix.
>
> A net install will see it before the rc if you want to try that (note:
> dvd+updates won't work).

Net install onto a fresh VM has pulled in systemd-195-13, and now NFS mounts in /etc/fstab are being mounted at boot!

Many thanks and well done!
(In reply to comment #54)
> Net install onto a fresh VM has pulled in systemd-195-13, and now NFS mounts
> in /etc/fstab are being mounted at boot!
>
> Many thanks and well done!

I confirm on a fresh net install just finished. Mounts are all present after installing nfs-utils and adding the statement to fstab. Nice!
(In reply to comment #55)
> I confirm on a fresh net install just finished. Mounts are all present after
> installing nfs-utils and adding the statement to fstab. Nice!

I can also confirm: on a fresh install I no longer have to manually execute "mount -a" after a boot to mount the external NFS shares. Many thanks to all.
I'll test this as soon as bug#8867 gets fixed so that I can dare to reboot a cauldron system...
Looking good here now, too.
Status: NEW => RESOLVED
Resolution: (none) => FIXED
Arrggh. It's not completely fixed. Remote-fs is getting started too early. When I tested it above, it was on a desktop system with wired ethernet, and it worked fine. I just rebooted a current cauldron laptop which uses wireless, and remote-fs is getting started before the wireless network interface is up. I'll attach the output of "journalctl --no-pager -a -b".
Status: RESOLVED => REOPENED
Resolution: FIXED => (none)
Created attachment 3488 [details] journalctl -a -b

Note that NetworkManager is started at line 1232, and remote-fs is started at line 1465, but wlan0 doesn't begin activation until line 1897.

The reason is clear. network, around line 1420, is trying to bring up wlan0 with ifplugd and failing, at which point network-up terminates in FAILED status and systemd lets everything else take off.

There are a couple of problems here:

1) network-up doesn't seem to support NM.

2) Something is creating an ifcfg for wlan0 in spite of the fact that I never ran drakconnect for wlan0, and I'm not running NM with ifcfg-rh. It contains:

DEVICE=wlan0
BOOTPROTO=dhcp
ONBOOT=yes

so ifplugd assumes it has control because there's no NM_CONTROLLED.

3) network-up should NOT fail, ever. Services should not require network-up unless they really truly need the network to be up, and if the network can't come up for whatever reason, those services should be blocked from starting. Services that can run even without the network should require something else, or monitor things dynamically as you suggested above.
(In reply to comment #60)
> Note that NetworkManager is started at line 1232, and remote-fs is started
> at line 1465, but wlan0 doesn't begin activation until line 1897.
>
> The reason is clear. network, around line 1420, is trying to bring up wlan0
> with ifplugd and failing, at which point network-up terminates in FAILED
> status and systemd lets everything else take off.
>
> There are a couple of problems here:
>
> 1) network-up doesn't seem to support NM.

It should have support for NM. It may have broken recently, but there is certainly code in there to handle this.

> 2) Something is creating an ifcfg for wlan0 in spite of the fact that I
> never ran drakconnect for wlan0, and I'm not running NM with ifcfg-rh. It
> contains:
> DEVICE=wlan0
> BOOTPROTO=dhcp
> ONBOOT=yes
> so ifplugd assumes it has control because there's no NM_CONTROLLED.

The ifcfg file is written via a udev rule:
/usr/lib/udev/rules.d/76-net.rules
namely net_create_ifcfg. This has been the case for many years, so nothing particularly new here in that regard.

Also, the network scripts should assume that if there is no NM_CONTROLLED variable present, NM *is* controlling it (i.e. see the code "! is_false $NM_CONTROLLED && is_nm_running && _use_nm=true" in both network-up and network-functions in the initscripts package).

> 3) network-up should NOT fail, ever. Services should not require network-up
> unless they really truly need the network to be up, and if the network can't
> come up for whatever reason, those services should be blocked from starting.

Services shouldn't refer to network-up directly in terms of ordering themselves, ever. They should only ever refer to network.target for ordering purposes. When network-up.service is enabled, it should delay network.target from starting accordingly, thus allowing things ordered after network.target to be delayed appropriately. network-up is basically a custom hack, and referring to it in any upstream units is definitely wrong. There are also other similar systems that achieve the same result (e.g. NetworkManager-wait-online.service) should people want to use them (tho' we should cover all the bases without the need for this in network-up).

As stated previously, there are many different definitions for the "network up" state. Some apps will want the device to exist, others the link to be up, others still for the interface to have an IP, and yet more that the interface can connect to the wider network. It would be incredibly ugly to enumerate all these conditions in systemd targets, and doing so would also legitimise the concept that any of these "network up" states exists as a single point in time and thereafter into the future ad infinitum. This is simply not the case - the network is a dynamic beast - it may come and go. While there may be a fixed point in time when it's "ready" (for whatever value of "ready" you need), there's no guarantee that it'll still be "ready" 1us later. The network-up stuff is really just a hack for applications that are too dumb to get this on their own - the real solution is to fix the apps to listen to the kernel netlink events and deal gracefully with the conditions that they care about.

To make the services fail or not start would be a change in behaviour from sysvinit and would lead to confusion for some users, and it would legitimise the technique of actually waiting for a point in time when all is well. I think any efforts to allow this to happen would be far too invasive in the system for far too little gain. Any effort would be much better spent fixing the apps in question.

> Services that can run even without the network should require something
> else, or monitor things dynamically as you suggested above.

Making the apps handle things properly is the only sane way to really fix this.

I don't think an issue with network-up.service not working on NM systems (and it works fine on mine, I should add) should be tracked in this bug, as it's not specifically about this problem - it's more generic than that (and this bug is already waaaaay too long!)

Can you open a new bug with the details/attachment from your last post please? CC me if you can.
Status: REOPENED => RESOLVED
Resolution: (none) => FIXED
IIUC, this is really remote-fs.target starting too soon, because it should depend on network.target rather than network-up.service?
(In reply to comment #61)
> The ifcfg file is written via a udev rule:
> /usr/lib/udev/rules.d/76-net.rules
> namely net_create_ifcfg. This has been the case for many years, so nothing
> particularly new here in that regard.
>
> Also, the network scripts should assume that if there is no NM_CONTROLLED
> variable present, NM *is* controlling it (i.e. see the code
> "! is_false $NM_CONTROLLED && is_nm_running && _use_nm=true" in both
> network-up and network-functions in the initscripts package).

Well, apparently they're not.

> Services shouldn't refer to network-up directly in terms of ordering
> themselves, ever. They should only ever refer to network.target for ordering
> purposes. [...] network-up is basically a custom hack, and referring to it
> in any upstream units is definitely wrong.

Look, I don't know the history of network-up. I'm just going on your previous comments that it's a hack designed to stall network-dependent services until a network is actually working. If that's the case, it's not working.

> As stated previously, there are many different definitions for the "network
> up" state. [...] The network-up stuff is really just a hack for applications
> that are too dumb to get this on their own - the real solution is to fix the
> apps to listen to the kernel netlink events and deal gracefully with the
> conditions that they care about.
>
> [...]
>
> Making the apps handle things properly is the only sane way to really fix
> this.

C'mon Col, I'm with you on this theoretically, but neither you nor I are going to fix all these apps, or else network-up wouldn't exist. If it has to exist, then it damn well ought to do what it is supposed to do, which is block network-reliant apps until a network is available.

> I don't think an issue with network-up.service not working on NM systems
> (and it works fine on mine, I should add) should be tracked in this bug, as
> it's not specifically about this problem - it's more generic than that (and
> this bug is already waaaaay too long!)
>
> Can you open a new bug with the details/attachment from your last post
> please? CC me if you can.

Agreed, I'll migrate the network stuff to new bugs.
(In reply to comment #63)
> > Also, the network scripts should assume that if there is no NM_CONTROLLED
> > variable present, NM *is* controlling it (i.e. see the code
> > "! is_false $NM_CONTROLLED && is_nm_running && _use_nm=true" in both
> > network-up and network-functions in the initscripts package).
>
> Well, apparently they're not.

Well, we'll need to find out exactly why it's failing, so that might involve a little hacking to see if $_use_nm is actually set properly and, if not, which bit of the test causes it not to be, etc.

> C'mon Col, I'm with you on this theoretically, but neither you nor I are
> going to fix all these apps, or else network-up wouldn't exist. If it has to
> exist, then it damn well ought to do what it is supposed to do, which is
> block network-reliant apps until a network is available.

I agree, but only within a reasonable timeout, not indefinitely. The fact is that failing those services reliant on the network if the network never comes up means you have to do a lot more work. And I mean a lot.

As I was trying to describe, it would require a big change overall, both in terms of hacks in systemd to handle this special case and in terms of changing lots and lots of packages to split them into the "can handle it themselves" and "are too dumb to know any better" camps. I'm afraid I'm not prepared to go down that route personally, especially at this stage in the release process. It's just far too big a body of work towards a path that is, by design, broken.

If you want to wait longer, you can technically change the timeout (tho' you need to change it in two places - and as network-up is a sysvinit script, you may have to also patch systemd to read the timeout from the LSB header somehow), but this has other knock-on effects too. It delays the whole boot, and it means that legacy runlevels are not reached until later, which may have other unintended consequences on some setups.

I'm not saying network-up should be broken like in your example. There is clearly a bug or misconfiguration there somewhere which needs fixed, but to adopt your other suggestion is not something I've personally got the time or inclination to do. Sorry if that disappoints you, but I just do not agree that it's worth all the effort involved, especially as this has been the behaviour for many, many years before now. I see no need to urgently "solve" it (and again, to even attempt to solve it gives legitimacy to the approach generally, which I believe is the wrong message to be sending out).

(In reply to comment #62)
> IIUC, this is really remote-fs.target starting too soon, because it should
> depend on network.target rather than network-up.service?

remote-fs.target has to start after remote-fs-pre.target (due to each individual mount ensuring this is the case - assuming the nofail or noauto options are not present), and remote-fs-pre.target has to start after network.target. Thus, by extension, each remote mount has to start after network.target. Therefore there shouldn't be a problem here.

Even if you do specify the nofail option, remote-fs.target will start sooner (as the mounts are not considered critical for boot), but the actual mount units are still ordered after remote-fs-pre.target, which means they still wait for the network to be up before the mount is attempted. So I don't think there is anything amiss here.
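To visualise the chain being described, the dependencies amount to roughly this for one of the fstab entries from earlier in this bug (a simplified sketch of what systemd derives, not the literal generated unit):

# mnt-cauldron.mount, derived from
# "ftgfiles1:/mnt/cauldron /mnt/cauldron nfs rw,bg,soft 0 0"
[Unit]
# Network mounts are implicitly ordered after the network is considered up:
After=remote-fs-pre.target network.target
# ...and remote-fs.target in turn waits for them (absent nofail/noauto):
Before=remote-fs.target

[Mount]
What=ftgfiles1:/mnt/cauldron
Where=/mnt/cauldron
Type=nfs
Options=rw,bg,soft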
(In reply to comment #64)
> I agree, but only within a reasonable timeout, not indefinitely. The fact is
> that failing those services reliant on the network if the network never
> comes up means you have to do a lot more work. And I mean a lot.
>
> As I was trying to describe, it would require a big change overall, both in
> terms of hacks in systemd to handle this special case and in terms of
> changing lots and lots of packages to split them into the "can handle it
> themselves" and "are too dumb to know any better" camps. I'm afraid I'm not
> prepared to go down that route personally, especially at this stage in the
> release process. It's just far too big a body of work towards a path that
> is, by design, broken.

I really wasn't looking to create a lot of work for anyone. I just assumed that either increasing the timeout to effective infinity or just not setting it would do the trick. Nor am I advocating modifying or splitting services. As far as I'm concerned, anything that can run without the network, or to state it better, anything that is currently starting without the network anyway, can remove its dependence on network-up and not be any worse off than it is today. In terms of getting to runlevel 5, IIRC the DMs don't wait for network-up now unless network authentication is active, presumably because we don't want people logging in unless the network is there.

That said, I see another problem here. NM is not marking the wireless interface as a system connection by default, so it delays activating wlan0 until a user (ftg) who has initialized it logs in. I'll mark it as a system connection and see what effect that has.
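For context: in NetworkManager of this vintage, a "system connection" is just a keyfile under /etc/NetworkManager/system-connections/ whose permissions aren't restricted to a particular user - roughly like this (a from-memory sketch, not Frank's actual file):

[connection]
id=home-wifi
type=802-11-wireless
autoconnect=true
# An absent or empty 'permissions=' key means the connection is available
# to all users, so NM can activate it at boot before anyone logs in:
permissions=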
(In reply to comment #65)
> That said, I see another problem here. NM is not marking the wireless
> interface as a system connection by default, so it delays activating wlan0
> until a user (ftg) who has initialized it logs in. I'll mark it as a system
> connection and see what effect that has.

OK, that makes all the difference. Now NM activates wlan0 well before remote-fs runs. The new bug issues are still valid, though, because ifplugd is still trying to bring up wlan0, in fact before the kernel driver has initialized it.