Bug 4206 - Wifi connection connects / disconnects constantly
Summary: Wifi connection connects / disconnects constantly
Status: RESOLVED FIXED
Alias: None
Product: Mageia
Classification: Unclassified
Component: RPM Packages
Version: Cauldron
Hardware: All Linux
Priority: Normal major
Target Milestone: ---
Assignee: Mageia Bug Squad
QA Contact:
URL:
Whiteboard:
Keywords: NEEDINFO
Depends on:
Blocks: 5015
Reported: 2012-01-20 15:31 CET by Wolfgang Bornath
Modified: 2012-03-26 15:39 CEST

See Also:
Source RPM: broadcom-wl
CVE:
Status comment:


Description Wolfgang Bornath 2012-01-20 15:31:30 CET
Description of problem:

wifi hardware: Broadcom 4312.

After a new install of alpha3 (including dkms-broadcom-wl) I was able to configure wifi access the usual way, and it worked. After the reboot the connection came up, but it disconnected a few seconds later, then connected again, then disconnected again after a few seconds.

/var/log/messages says:
-> Executing '/etc/ifplugd.action eth1 up'
-> Then follow the usual DHCP request and successful answer.
-> Then after a few seconds I see: ifplugd(eth1): Link beat lost.

syslog is filling up with 
Jan 16 20:48:51 skadi ifplugd(eth1)[5430]: Link beat lost.
Jan 16 20:48:51 skadi ifplugd(eth1)[1415]: Link beat lost.
Jan 16 20:48:52 skadi ifplugd(eth1)[5430]: Link beat detected.
Jan 16 20:48:52 skadi ifplugd(eth1)[1415]: Link beat detected.
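
To put a number on how often the link flaps, log lines like the ones above can be tallied per interface. This is only a minimal sketch that assumes the log format shown in this report; the function name count_link_flaps is made up for illustration and is not part of ifplugd.

```shell
# Count "Link beat lost" events per interface from syslog-style text on stdin.
# count_link_flaps is a hypothetical helper, not something shipped by ifplugd.
count_link_flaps() {
    grep 'Link beat lost' \
        | sed -n 's/.*ifplugd(\([^)]*\)).*/\1/p' \
        | sort | uniq -c | awk '{ print $2, $1 }'
}

# Usage (log path as mentioned in this report):
#   count_link_flaps < /var/log/messages
```

Each output line holds an interface name followed by the number of "lost" events seen for it.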

As my router/access point is working ok I see only one option: switch off wifi by hardware switch. Of course this makes the machine unusable.

With Mageia 1 and Mageia 2 Alpha 2 there were no problems with wifi on that machine.
Comment 1 Manuel Hiebel 2012-01-23 00:38:55 CET
Are you using NM or drakx-net?

José, have you also seen that?

CC: (none) => lists.jjorge

Comment 2 Wolfgang Bornath 2012-01-23 09:38:44 CET
I use drakx-net, this wifi chip never worked with NM, I even have to uninstall NM if it is installed.

CC: (none) => molch.b

Comment 3 José Jorge 2012-01-23 12:26:06 CET
I had the same problem, because the b43 driver calls this device wlan0, while the wl driver calls it eth1. If it is loaded before the ethernet driver, it ends up being called eth0, and drakx-net is puzzled.

The solution is to delete every connection in drakx-net after installing the wl driver, then recreate them.
Comment 4 Manuel Hiebel 2012-01-23 12:51:50 CET
Ok good to know

Keywords: (none) => USABILITY
Source RPM: (none) => broadcom-wl
Whiteboard: (none) => Errata

Comment 5 Wolfgang Bornath 2012-01-23 14:21:13 CET
Unfortunately this does not change anything here. I deleted all 3 connections (eth0, wlan0, eth1). Then I created eth1. After reboot nothing has changed wrt the continuous up/down of eth1.
Comment 6 Wolfgang Bornath 2012-01-23 14:51:13 CET
Installed new updates (around 40 packages) - now wifi seems to remain connected. Watched syslog for 20 minutes, no messages about linkbeat lost or related.

Seems the bug is resolved - one more item ticked off on my way to a usable system \o/
Comment 7 Wolfgang Bornath 2012-01-23 15:01:17 CET
1 message about link beat lost and detected, but connection stays up
Comment 8 Wolfgang Bornath 2012-02-09 18:13:02 CET
Unfortunately the issue is not solved!

Just did a net-install, minimal installation with X. After that I installed dkms-broadcom-wl and configured wifi. 
After next reboot the problem was back as described. Unfortunately this bug is not assigned yet and still goes as NEW, although reported 3 weeks ago.
Marja Van Waes 2012-03-18 20:08:38 CET

Blocks: (none) => 5015

Comment 9 José Jorge 2012-03-19 09:54:35 CET
I don't have this bug anymore, on a clean Beta 2 install with 4312 hardware (b43 driver). Do you still have it?
Sander Lepik 2012-03-25 19:48:26 CEST

CC: (none) => sander.lepik
Hardware: i586 => All

Comment 10 Sander Lepik 2012-03-25 21:18:46 CEST
I found something interesting that made my connection work again. I'm using networkmanager + broadcom-wl. For some reason wpa_supplicant is started twice.

So here's what i did:

# pkill -9 wpa_sup
# service network restart

After that only one wpa_supplicant is started and network stays up, no more disconnects.

It seems that the bogus file for me is /etc/sysconfig/network-scripts/ifup-eth (lines 126-128) which can't detect correctly if wpa_supplicant is already started. Or maybe systemd (with dbus.service) is starting its wpa_supplicant later so ifup-eth can't detect it.

If i add comments to these lines my system boots up with only one wpa_supplicant running:
# ps aux |grep wpa
root      1955  0.0  0.0  31060  2848 ?        S    21:12   0:00 /usr/bin/wpa_supplicant -c /etc/wpa_supplicant.conf -u -P /var/run/wpa_supplicant.pid

As (at least for me) broadcom's device is detected as eth1 ifup-eth runs and causes problems. It doesn't happen with nics that start up as wlanX.

So i think that ifup-eth should be fixed to check if we are using systemd or not. If we are then we shouldn't start another wpa_supplicant as systemd seems to do it anyway.

#####

Some debugging later:
Now looking at #3344 it seems that line 123 of ifup-eth might be broken in the first place. And i have no idea how to make wpa_cli work. Maybe it shouldn't work in dbus mode at all?

Colin, ideas?

Also, can someone else confirm two concurrent wpa_supplicant instances?
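
For anyone wanting to confirm this, the number of running instances can be counted from the process list. A minimal sketch, assuming a procps-style `ps`; count_supplicants is a hypothetical helper name introduced only for this example.

```shell
# Count processes named exactly wpa_supplicant from a list of command names
# (one per line on stdin). count_supplicants is a hypothetical helper.
count_supplicants() {
    grep -c -x 'wpa_supplicant'
}

# Usage: more than one instance matches the symptom described in this comment.
#   n=$(ps -e -o comm= | count_supplicants)
#   [ "$n" -gt 1 ] && echo "warning: $n wpa_supplicant instances running"
```

Matching the whole line avoids counting wpa_cli or the grep process itself.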

Keywords: USABILITY => (none)
CC: (none) => mageia
Whiteboard: Errata => (none)

Comment 11 Colin Guthrie 2012-03-25 21:54:05 CEST
Hmmm, if you're using network manager, why does service network restart do anything?

I suspect the problem itself is that the network service is doing stuff when it shouldn't.

Have you got the very latest initscripts (from yesterday)? I think this should help it do pretty much nothing when network manager is used.

If this still doesn't work, can you check your ifcfg-* files and make sure they all have NM_CONTROLLED=true  set properly? You can ignore the ifcfg-Auto* ones that have TYPE=Wireless in them.
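
One way to review the ifcfg-* files in a single pass is sketched below. It assumes the files live in /etc/sysconfig/network-scripts (the directory that holds ifup-eth in this report) and that the setting appears as a plain NM_CONTROLLED=... line; show_nm_controlled is a hypothetical helper name.

```shell
# Print "<file>: <value>" for every ifcfg-* file in a directory, using
# "(unset)" when NM_CONTROLLED is missing. show_nm_controlled is a
# hypothetical helper; the default directory is an assumption.
show_nm_controlled() {
    dir=${1:-/etc/sysconfig/network-scripts}
    for f in "$dir"/ifcfg-*; do
        [ -f "$f" ] || continue
        # Take the last assignment in the file and strip surrounding quotes.
        value=$(sed -n 's/^NM_CONTROLLED=//p' "$f" | tail -n 1 | tr -d "\"'")
        [ -n "$value" ] || value="(unset)"
        echo "$(basename "$f"): $value"
    done
}

# Usage:
#   show_nm_controlled
#   show_nm_controlled /some/other/directory
```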

Also I don't think systemd starts wpa_supplicant. I believe it's NetworkManager which requests wpa supplicant to start via dbus.

You can confirm this by checking which cgroup the process is in:

[root@jimmy ~]# ps -o cgroup -p `pidof wpa_supplicant`
CGROUP
cpuacct,cpu:/system/dbus.service;name=systemd:/system/dbus.service


So this shows that it was indeed started via dbus.

As for wpa_cli, I guess it only works when it's started via the initscripts, not via dbus (I presume it relies on a communication socket in /tmp or something)
Comment 12 Sander Lepik 2012-03-25 22:41:36 CEST
Restarting network was my bad (wpa_supplicant was restarted anyway). It doesn't help. But there is still some bug. It only works when NM_CONTROLLED is set to "true". Problem is: Mageia's tools set it to "yes" and with "yes" it starts another wpa_supplicant - network is again messed up & starts to disconnect.

initscripts is latest, 9.34-11.

So if "true" is the right thing to have there then Mageia's tools need to get fixed.
Comment 13 Colin Guthrie 2012-03-25 22:51:33 CEST
I may have misquoted there in terms of true vs. yes, but if NM_CONTROLLED is set to anything other than false, then the "network" initscript should not do anything to do with wpa_supplicant. So that's what needs to be tracked down.

There are code paths that do "! is_false $NM_CONTROLLED"... so I guess it depends how is_false is implemented.
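
To illustrate why the implementation matters, here is a hypothetical is_false. It is not copied from initscripts, and the real function may accept a different set of spellings; the point is only that with a test of this shape both "yes" and "true" count as "not false".

```shell
# Hypothetical is_false: succeeds only for values that clearly mean "false".
# This is NOT taken from initscripts; the real helper may differ.
is_false() {
    case "$1" in
        [nN] | [nN][oO] | [fF] | [fF][aA][lL][sS][eE] | 0) return 0 ;;
    esac
    return 1
}

# With this definition, "yes" and "true" take the same branch.
for v in yes true no false; do
    if ! is_false "$v"; then
        echo "NM_CONTROLLED=$v -> left to NetworkManager"
    else
        echo "NM_CONTROLLED=$v -> handled by the network initscript"
    fi
done
```

If the real helper only recognised one spelling, a "yes" written by one tool and a "true" expected by another could end up on different code paths.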
Comment 14 Sander Lepik 2012-03-25 23:15:23 CEST
Hmm, ok, only "true" doesn't help. I did a few more restarts and another wpa_supplicant still pops up. So something is still running ifup-eth. If i now do "service network restart" then the other wpa_supplicant gets killed and the problem is solved. But it shouldn't pop up in the first place.
Comment 15 Colin Guthrie 2012-03-25 23:39:38 CEST
Indeed. Can you give debug output from when the network is restarted? Syslog or journal. It should cycle through all the interfaces and say something like:

Mar 25 22:38:30 jimmy network[1685]: Bringing up interface eth0:  ./ifup: interface eth0 is controlled by NetworkManager; skipping.


If it actually tries to do *anything* for any of your interfaces can you check the ifcfg-* file for it?
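
A quick filter for that check is sketched below. It assumes every interface produces a "Bringing up interface NAME:" line like the sample above; what the line looks like for an interface that is not skipped is a guess, and unskipped_interfaces is a hypothetical helper name.

```shell
# List interfaces from "Bringing up interface" log lines that were NOT
# skipped as NetworkManager-controlled. Reads log text on stdin.
# unskipped_interfaces is a hypothetical helper.
unskipped_interfaces() {
    grep 'Bringing up interface' \
        | grep -v 'controlled by NetworkManager; skipping' \
        | sed -n 's/.*Bringing up interface \([^:]*\):.*/\1/p' \
        | sort -u
}

# Usage (log path as used elsewhere in this report):
#   unskipped_interfaces < /var/log/messages
```

Any interface it prints is one whose ifcfg-* file deserves a closer look.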
Comment 16 Sander Lepik 2012-03-26 10:03:09 CEST
Colin uploaded new initscripts (9.34-12). It fixed problems for me, so José or Wolfgang can you please test and report back?

Keywords: (none) => NEEDINFO

Comment 17 Wolfgang Bornath 2012-03-26 10:14:54 CEST
Ah, sorry I did not report earlier but the problem was fixed a while ago in a new install. I should have answered on comment #9. 

Unfortunately all I can do is apologize, but I had so many other things on my mind at that time I just forgot about it.
Comment 18 José Jorge 2012-03-26 10:16:11 CEST
Also fixed for me installing Beta2.

Status: NEW => RESOLVED
Resolution: (none) => FIXED

Comment 19 Wolfgang Bornath 2012-03-26 11:43:57 CEST
Did updates as of last couple of days (new kernel, new broadcom-wl, new initscripts), nothing was destroyed :) No failure messages in dmesg, no suspicious lines in 'tail -f syslog', wifi coming up and working stable. 

I may add: I just tested to use NM but I get no connection with NM. Without NM wifi connects without problems. But I think this is a different issue.
Comment 20 Sander Lepik 2012-03-26 11:53:02 CEST
How did you test NM?

* install nm (+ check it's started)
* check NM box in interface's configuration
* check that two wpa_supplicants are not running at the same time, if they are, just run service network restart

or 2 first steps + reboot and then configure NM
Comment 21 Wolfgang Bornath 2012-03-26 13:23:35 CEST
Ok, will do the test in a minute. 

Meanwhile another test I ran about the "link beat lost" messages:

When wifi was up (without NM) I let 'tail -f /var/log/messages' run in a konsole window. It looked like this:

 - No related messages for the first 20 minutes
 - then "ifplugd(eth1) Link beat lost" and a second later "Link beat detected"
 - then the "lost" message followed by the "detected" message repeated every 5 minutes.

 - now I opened an internet application (Google mail in a browser) and there were no messages about lost link beat for as long as I let the browser open. 
 - as soon as I closed the browser it took 4 minutes and the messages came back.
 - then I opened an irc session and the messages were gone as long as the session ran. Closing the session was followed by the "link beat lost/connected" messages in a 5 minute routine.

In other words: as long as a connection was requested and held by an application the messages were gone, as soon as no connection was needed the messages were there.
Comment 22 Wolfgang Bornath 2012-03-26 13:53:44 CEST
(In reply to comment #20)
> * install nm (+ check it's started)
Done, started nm with 'systemctl enable NetworkManager.service' -> ok.

> * check NM box in interface's configuration
Done. Disconnected and connected again, ok

> * check that two wpa_supplicants are not running at the same time, if they are,
> just run service network restart
Only one wpa_supplicant running.

Now reboot:
wifi does not come up.
 - started nm with 'systemctl enable NetworkManager.service' -> ok.
 - checked the connection in drakconnect -> it was not enabled.
 - configuration shows controlled by NM enabled
 - started to connect -> Connection failed.
 - checked wpa_supplicant -> only one running

Opened drakconnect and removed the mark in NM box, tried to start connection - failed.

Reboot:
wifi does not come up although NM-box is unmarked.
 - removed the wifi connection and re-created it (NM not marked) - connection failed.

 - de-installed NM

Reboot:
wifi comes up ok.
Comment 23 Sander Lepik 2012-03-26 14:15:39 CEST
Hmm, when you enabled the NM service, did you actually use NM tools to configure the network too? Or did you just hope it would do it by itself? You need plasma-applet-networkmanagement or something like that to configure your interface with NM.
Comment 24 Wolfgang Bornath 2012-03-26 14:41:42 CEST
Ah, ok, I never knew that. Does that mean that it makes no sense setting up wifi with the draktool when NM is installed? How should I know this?

If so I have questions:

 - Why then is there an option about NM in the first place? As you see with my example it only confuses the user who thinks that all he has to do is marking this option.

 - Why then is 'plasma-applet-networkmanaging' not visible in systray? Because its not even installed as dependency of NM. And without that plasma how could I know that I need it?

 - Where do I get information how it is done with NM present? When I use RightMouseClick on the applet in systray and select "Wireless connection Management" (or whatever it is called in English) I am taken to the draktool.

Given the previous points:
 - marking the "handled by NM" option should first check whether NM is installed,
 - when NM is installed the necessary tools should be installed as well,
 - when NM is installed the matching applet should replace the other applet in systray.

This would give a hint about what the user has to do, instead of merely letting him mark an option without knowing the consequent workflow.

Last question:  
Why has NM any impact on the connection as soon as it is installed, even if it is not marked to be in charge? As I wrote in my previous post wifi worked only after I de-installed NM. Unmarking had no influence at all.
Comment 25 Colin Guthrie 2012-03-26 15:13:16 CEST
(In reply to comment #24)
> Ah, ok, I never knew that. Does that mean that it makes no sense setting up
> wifi with the draktool when NM is installed? How should I know this?

Network manager is heavily integrated into e.g. GNOME UI. It appears right at the top and next to battery indicator. You don't have to do any setup to see this, it's just there if NM is running.

I agree it's not always obvious, but that's not really something we can solve easily - having two systems to configure and control networks is always going to be tricky.
Comment 26 Sander Lepik 2012-03-26 15:14:56 CEST
Well, for sure i'm not the right person to answer but i'll try my best.

(In reply to comment #24)
> Ah, ok, I never knew that. Does that mean that it makes no sense setting up
> wifi with the draktool when NM is installed? How should I know this?
You probably shouldn't. But that's why NM isn't checked by default. And yes, if you check this box then drak* tools stop managing your network.

> If so I have questions:
> 
>  - Why then is there an option about NM in the first place? As you see with my
> example it only confuses the user who thinks that all he has to do is marking
> this option.
Well, it has to be somewhere. You shouldn't check things when you don't know what they do :)

>  - Why then is 'plasma-applet-networkmanaging' not visible in systray? Because
> its not even installed as dependency of NM. And without that plasma how could I know that I need it?
You can make it visible after installing it when you configure which icons are shown in the systray. And it's not required, as there is more than one way to configure NM: you can do it with GNOME's tools, AFAIK KDE settings had some options, and there is a gtk applet as well.

>  - Where do I get information how it is done with NM present? When I use
> RightMouseClick on the applet in systray and select "Wireless connection
> Management" (or whatever it is called in English) I am taken to the draktool.
This applet is used for connections that you don't want to manage with NM. If you manage all your connections with NM then you can disable this applet from starting.

> Taken the previous points 
>  - marking the "handled by NM" should check if NM is installed in the first
> place, 
AFAIK if NM is not installed or is disabled then drak* tools take over whatever the checkbox says. We tested with Colin yesterday and if i disabled NM but had old config for drak* tools my network was started by network.service.
>  - when NM is installed the necessary tools should be installed as well
AFAIK it was GNOME that requires NM, it's not a default tool for Mageia yet. So i don't think those tools are that much needed.
>  - when NM is installed the matching applet should be replace the other applet
> in systray.
As i already said above, you might need that other applet for other connections you have.

> This would give a hint about what the user has to do instead of merely let him
> mark an option without knowing the consequentual workflow.
As also already said, you should not do such things anyway :)

> Last question:  
> Why has NM any impact on the connection as soon as it is installed, even if it
> is not marked to be in charge? As I wrote in my previous post wifi worked only
> after I de-installed NM. Unmarking had no influence at all.
I'm not 100% sure here but i think that NM does not depend on that checkbox. If it's enabled it will run and maybe even start wpa_supplicant, so network.service was unable to do that correctly.
You can try to install NM but disable it (systemctl disable NetworkManager.service) and then run "service network restart" or reboot the system.

Hope you got some answers.
Comment 27 Wolfgang Bornath 2012-03-26 15:33:57 CEST
Thx for that explanation.

 - I never wrote that I use Gnome, but I remember using Gnome I also used the applet (there it was installed but obviously not as dependency of NM) and I remember setting up wifi with NM which never worked.

 - I checked NM only because people (can't remember who it was) recommended using NM. It was not my idea in the first place to check something I did not know how to handle. So you may forgive me here :)

So, what I read from these explanations (thx to both) is that it is not recommended to enable NM in a non-Gnome environment because
 - enabling NM in the draktool does not automatically install it. But the draktool does not show a message box that it is not installed. 
 - installing NM in the non-Gnome environment does not pull the matching applet so you have to search another way to setup wifi (other tools which are not installed either) or manually by editing config files.

Ok, so as I am not using Gnome anymore and not using NM I will obviously not be of much help here. But I learned something here, though it is not Bugzilla's job to give lessons :)
Comment 28 Sander Lepik 2012-03-26 15:38:14 CEST
Well, i'm using NM with KDE myself and it works well for me. And i also hope that we can make mga3 work with NM as default, but it's always nice to have dreams :)
Comment 29 Colin Guthrie 2012-03-26 15:39:01 CEST
(In reply to comment #27)

> So, what I read from this explanations (thx to both) it is not recommended to
> enable NM in non-Gnome environment because  
>  - enabling NM in the draktool does not automatically install it. But the
> draktool does not show a message box that it is not installed. 


I think that's a pretty fair summary yes.

That said, the latest initscripts pushed in the last couple of days should effectively ignore the "Allow Network Manager manage this interface" (NM_CONTROLLED=yes) setting if networkmanager is not actually installed and running, so it should be slightly safer than it used to be, but the other caveats mentioned are still valid.
