Bug 9028

Summary: /etc/init.d/network (network.service) is starting wpa_supplicant when NetworkManager is used
Product: Mageia Reporter: Sander Lepik <mageia>
Component: RPM Packages Assignee: Colin Guthrie <mageia>
Status: RESOLVED FIXED QA Contact:
Severity: major    
Priority: release_blocker CC: lmenut, mageia, mageia
Version: Cauldron   
Target Milestone: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Source RPM: initscripts CVE:
Status comment:

Description Sander Lepik 2013-02-10 19:09:41 CET
Description of problem:

I have set all my interfaces to be controlled by NM, but for some reason /etc/sysconfig/network-scripts/ifup-eth (lines 126-128) starts wpa_supplicant, which stops NM from working.

My wireless device is named wlan0, nothing to do with eth.

This problem started to happen in December or maybe a bit later.
Sander Lepik 2013-02-10 19:10:45 CET

Priority: Normal => release_blocker

Comment 1 Damien Lallement 2013-02-26 16:33:23 CET
Thank you Sander, you saved my life! I confirm this bug since December...

# ps -ef | grep wpa
root      1280     1  0 11:10 ?        00:00:00 /usr/sbin/wpa_supplicant -B -i wlan0 -c /etc/wpa_supplicant.conf -D wext
root     11734     1  0 16:27 ?        00:00:00 /usr/sbin/wpa_supplicant -u -f /var/log/wpa_supplicant.log -c /etc/wpa_supplicant.conf -P /run/wpa_supplicant.pid

# wpa_cli terminate
Selected interface 'wlan0'
OK

# ps -ef | grep wpa
root     11734     1  0 16:27 ?        00:00:00 /usr/sbin/wpa_supplicant -u -f /var/log/wpa_supplicant.log -c /etc/wpa_supplicant.conf -P /run/wpa_supplicant.pid

# ping -I wlan0 free.fr
PING free.fr (212.27.48.10) from 172.16.1.4 wlan0: 56(84) bytes of data.
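
The two processes in the listing above differ in their options: the first was started for one interface (-B -i wlan0), while the second exposes the D-Bus control interface (-u), which is the kind of instance NM talks to. A minimal sketch for picking out the per-interface instances from `ps -ef` output follows. The function name per_interface_wpa_pids is made up for illustration, and it only recognizes -i given as a separate argument (not the attached form -iwlan0).

```shell
#!/bin/sh
# Hypothetical helper, not part of initscripts or NetworkManager.
# Reads `ps -ef`-style lines on stdin and prints the PID (column 2) of
# every wpa_supplicant whose command line contains a standalone "-i".
# In `ps -ef` output the command starts at column 8.
per_interface_wpa_pids() {
    awk '/wpa_supplicant/ {
        for (i = 8; i <= NF; i++)
            if ($i == "-i") { print $2; break }
    }'
}

# Example use on a live system:
#   ps -ef | per_interface_wpa_pids
```

Fed the two lines from the listing above, this would report only PID 1280, the instance that `wpa_cli terminate` then removed.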

CC: (none) => mageia, mageia

Comment 2 Sander Lepik 2013-02-26 16:40:52 CET
3-step workaround (w/o patching) seems to be this:

systemctl disable network.service
systemctl disable network-up.service
systemctl enable NetworkManager-wait-online.service

This way only NM is in control and init scripts can't screw up things :)
Comment 3 Colin Guthrie 2013-02-27 00:39:21 CET
OK so from what I can gather, this is due to it being started via dbus activation.... looking into it.
Comment 4 Colin Guthrie 2013-02-27 01:22:56 CET
Hmm, actually the dbus launched version for me is the one I actually want :)

The real problem is that ifup-eth is run at all it seems.

Can you guys check to see if you have a UUID= in your ifcfg-wlan0 file?

If you do not, try picking the UUID of a network you usually join (by looking in the ifcfg-Auto* files) and putting UUID= into the ifcfg-wlan0 file.
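
A minimal sketch of that check, assuming the ifcfg file uses plain KEY=value lines with an optional pair of quotes around the value. The function name ifcfg_uuid is hypothetical and is not part of initscripts.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the first UUID= line in an
# ifcfg-style file, with surrounding quotes removed. Prints nothing if
# the file has no UUID= line or cannot be read.
ifcfg_uuid() {
    sed -n 's/^UUID=//p' "$1" 2>/dev/null | head -n 1 | tr -d "\"'"
}

# Example use:
#   ifcfg_uuid /etc/sysconfig/network-scripts/ifcfg-wlan0
```

Empty output means the file carries no UUID, which is the case being asked about here. The file is parsed as text rather than sourced, so nothing in it gets executed.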


Alternatively, source the following function:


get_uuid_by_config ()
{
    dbus-send --system --print-reply --dest=com.redhat.ifcfgrh1 /com/redhat/ifcfgrh1 com.redhat.ifcfgrh1.GetIfcfgDetails string:"/etc/sysconfig/network-scripts/$1" 2>/dev/null | awk -F '"' '/string / { print $2 }'
}


And then try:

get_uuid_by_config ifcfg-wlan0


Does this print out a UUID?
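
To see what the awk stage of the function does on its own, here is a small sketch that runs it on a canned reply instead of a live D-Bus call. The reply text is illustrative only: the UUID and object path are made up, and the exact header line printed by dbus-send can differ between versions. The names sample_reply and extract_uuid are hypothetical.

```shell
#!/bin/sh
# Illustrative stand-in for `dbus-send --print-reply` output.
# The UUID and object path below are invented for the example.
sample_reply() {
    cat <<'EOF'
method return time=0.0 sender=:1.5 -> destination=:1.99 serial=10 reply_serial=2
   string "01234567-89ab-cdef-0123-456789abcdef"
   object path "/org/example/Settings/0"
EOF
}

# Same awk stage as in get_uuid_by_config: on lines containing
# "string ", split on double quotes and print the second field.
extract_uuid() {
    awk -F '"' '/string / { print $2 }'
}

sample_reply | extract_uuid
```

If the D-Bus call fails, its error goes to /dev/null and awk sees no input, so the function prints nothing at all.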


For me, I both have a UUID in the ifcfg-wlan0 file AND I get a UUID from the above function.



A result from either of these would cause the ifup script to bail out before doing anything manually.
Comment 5 Sander Lepik 2013-02-27 10:33:00 CET
# get_uuid_by_config ifcfg-wlan0
Usage: dbus-send [--help] [--system | --session | --address=ADDRESS] [--dest=NAME] [--type=TYPE] [--print-reply[=literal]] [--reply-timeout=MSEC] <destination object path> <message name> [contents ...]
bash: /com/redhat/ifcfgrh1: No such file or directory
Comment 6 Colin Guthrie 2013-02-27 11:03:05 CET
@sander Bugzilla messing up. It's meant to all be on one line.
Comment 7 Sander Lepik 2013-02-27 11:08:59 CET
Ok, the new result:

# get_uuid_by_config ifcfg-wlan0
#

:)
Comment 8 Colin Guthrie 2013-02-27 11:30:58 CET
OK, so the question is "why does it return blank?"

Can you dig into it more? i.e. drop the error redirection, see what comes out from the commands without awk etc.?

So what I suspect is happening:

1. There is no UUID in ifcfg-wlan0 (I have one for whatever reason)
2. This means that the above function (get_uuid_by_config) is called but it fails. It works for me, so not quite sure what it does internally (perhaps it also relies on the UUID in the ifcfg-wlan0 file?)

Either way, we can solve it by doing one of two things:

1. Ensuring there is a UUID in the ifcfg-wlan0 file of the network it's currently on.
or 
2. Change the code in ifup to always exit if _use_nm==true.
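
The decision being discussed can be sketched as follows. This is not the actual ifup code: the function name ifup_action and its output words are made up for illustration, and it only models the choice between handing the device to NM, bailing out, and the manual path that ends up starting wpa_supplicant.

```shell
#!/bin/sh
# Hypothetical model of the ifup decision, not the real script.
# $1: value of _use_nm ("true" when NetworkManager controls the device)
# $2: UUID of the connection (may be empty)
# Prints one of:
#   nm-activate  NM is in use and a UUID is known
#   bail         NM is in use but no UUID is known (option 2: exit here)
#   manual       NM is not in use, so the script configures the device
ifup_action() {
    if [ "$1" = "true" ]; then
        if [ -n "$2" ]; then
            echo nm-activate
        else
            echo bail
        fi
    else
        echo manual
    fi
}

# Example use:
#   ifup_action "$_use_nm" "$UUID"
```

Option 1 avoids the "bail" branch by making sure a UUID is always present; option 2 adds that branch so the script never reaches the manual path while NM is in use.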

I'll see if I can nail down Bill Nottingham to quiz him about the logic of the scripts.
Comment 9 Colin Guthrie 2013-02-27 12:00:34 CET
Actually this is kinda weird... The upstream initscripts don't do anything with wpa_supplicant. These are modifications on our side. Not sure how it was handled in the past.

Perhaps we just need to adjust our scripts to work with a single instance of wpa_supplicant and not a per-device one?

I dunno... will take a bit of digging.

Quick and dirty solution would be to just bail out of ifup if _use_nm==true (which is what the previous version of the initscripts did).

I'll likely add that as a stop-gap but really we should try and reduce the delta from the upstream initscripts and how they work as I'm not sure they're offering much value these days :s
Comment 10 Colin Guthrie 2013-02-27 12:09:19 CET
OK, latest initscripts package should work around this issue.
Comment 11 Sander Lepik 2013-02-27 12:21:26 CET
# get_uuid_by_config ifcfg-wlan0
Must use org.mydomain.Interface.Method notation, no dot in "string:/etc/sysconfig/network-scripts/ifcfg-wlan0"

(In reply to Colin Guthrie from comment #9)
> Actually this is kinda weird... The upstream initscripts don't do anything
> with wpa_supplicant. These are modifications on our side. Not sure how it
> was handled in the past.
> 
> Perhaps we just need to adjust our scripts to work with a single instance of
> wpa_supplicant and not a per-device one?

I didn't test if initscripts would work like that. I really don't care much about that old stuff :) But it's too late to kick those scripts out. Though I once replaced those 3 lines with systemctl start wpa_supplicant.service, and that was OK for NM.

> Quick and dirty solution would be to just bail out of ifup if _use_nm==true
> (which is what the previous version of the initscripts did).

Sounds like a better idea than searching for UUID as during boot there is no current connection and the UUID should be unknown anyway. network.service kicks in before NM even starts to check for connections.

> I'll likely add that as a stop-gap but really we should try and reduce the
> delta from the upstream initscripts and how they work as I'm not sure
> they're offering much value these days :s

IMHO we should try to kick those scripts out the door in mga4 :) NM isn't the best solution (and I haven't checked what's the status of SUSE's solution) but it works better than those old initscripts do.. And it's probably the most popular solution currently. We just need to integrate it :/
Comment 12 Colin Guthrie 2013-02-27 12:25:26 CET
(In reply to Sander Lepik from comment #11)
> > I'll likely add that as a stop-gap but really we should try and reduce the
> > delta from the upstream initscripts and how they work as I'm not sure
> > they're offering much value these days :s
> 
> IMHO we should try to kick those scripts out the door in mga4 :) NM isn't
> the best solution (and I haven't checked what's the status of SUSE's
> solution) but it works better than those old initscripts do.. And it's
> probably the most popular solution currently. We just need to integrate it :/

For desktops I tend to agree, but for a server (and I have a couple) NM really isn't that well suited IMO. A static configuration for networking devices in that situation works reasonably well I think.

Longer term, there will be a consolidation of the various distro networking scripts and it'll be absorbed into systemd. This will be when we can eventually kill them off completely in their current form. That might or might not happen in the mga4 timeframe, but if it does I'm sure we'll adopt it :)
Comment 13 Sander Lepik 2013-02-27 20:33:54 CET
Ok, this bug seems to be fixed. Thanks!

Status: NEW => RESOLVED
Resolution: (none) => FIXED

Comment 14 Luc Menut 2013-02-27 21:56:36 CET
(In reply to Colin Guthrie from comment #10)
> OK, latest initscripts package should work around this issue.

Hmm, I'm not sure that your patch 0102-Bail-early-if-NetworkManager-is-used.patch is fully valid.
This part of the code is used only when the network device should be started at boot (ONBOOT=yes; if ONBOOT=no, we exit earlier, at line 60) and is controlled by NetworkManager (NM_CONTROLLED=yes -> "$_use_nm" = "true"), so it uses nmcli con up.
If we can't connect the device because $UUID is not set, I think that we should exit with failure instead of success.

regards,
Luc

CC: (none) => lmenut

Comment 15 Colin Guthrie 2013-02-27 23:04:03 CET
I thought about exiting with a failure code, but this is just a direct forward port of how it used to work in the 9.34 initscripts, and rather than introduce a different exit code, I figured I'd just leave it as it was on mga2 - try not to rock the boat more than necessary was my thinking. If you think it would help more, then I can change it to a failure.
Comment 16 Luc Menut 2013-02-27 23:41:54 CET
OK, I didn't look at how it was in mga2. You are right, it's probably better to keep it simple.
Furthermore, most of the time, the ONBOOT=yes is just forgotten by users when they choose to use NetworkManager.