Description of problem:
If Mageia is installed on a computer with no network connected during installation, then after adding a connection with drakconnect the network test fails even though the connection actually works.

How reproducible:
Seen when no network is available during install, e.g. on laptops. I believe it affects wireless connections too, but that is harder for me to test in VirtualBox.

Steps to Reproduce:
1. Install Mageia from DVD in VirtualBox. Ensure the VirtualBox virtual Ethernet connection has its cable pulled out (right-click on the network icon in the VirtualBox window). Save the machine state when the install is complete but before rebooting, to facilitate retesting.
2. Boot into the completed install. Start MCC > drakconnect, insert the virtual Ethernet cable and go through the "add a connection" wizard.
3. Observe that the network connection test fails although the connection is actually up and working.

This bug is very similar and related to Bug 5772, but the fix for 5772 does not resolve this bug, even after changing the patch to apply to an installed system.
Created attachment 2250
Wireshark log during connection test

Note: when simulating the failure in VirtualBox it is not actually necessary to disconnect and reconnect the cable. Deleting the existing connection in MCC is all that is required.

Attached is a Wireshark log. As soon as DHCP is finished, mDNS messages start appearing, but there is no sign of a network test taking place. I suspect the test is simply not starting.
I'm waiting for your patch :-)
CC: (none) => thierry.vignaud
Assignee: bugsquad => mageia
This bug exists both in Mageia 2 RC and in Mageia 2 beta 3. My computer connects to the Internet via PPPoE, so I need to add that connection manually. Whether I add it after I boot into the LiveCD or after I install from the LiveCD, the network test always fails while the network connection is actually up and working.
CC: (none) => piscestong
It appears there are two bugs causing this test to fail.

The first is our old friend bug 5772. If I use this patch in tools.pm, that part of the test passes:

    sub connected {
        if ($::isInstall) {
            symlink "$::prefix/etc/resolv.conf", "/etc/resolv.conf" if ! -e "/etc/resolv.conf";
        }
        return scalar grep { /1 received/ } `$::prefix/bin/ping -qc1 www.mageia.org`;
    }

The second problem is in network::connection::get_status, which is supposed to report whether a gateway is defined by calling network::tools::get_interface_status. Unfortunately, decoding get_interface_status is straining my minuscule Perl skills.
OK. After having to learn how to use the Perl debugger, I have confirmed this bug is due to a timing issue in network::netconnect around line 315:

    if (!$::isInstall) {
        services::start('network-up');
    } else {
        my $timeout = $connection->get_up_timeout;
        while ($timeout--) {
            my $status = $connection->get_status;
            last if $status;
            sleep 1;
        }
    }
    $success = $connection->get_status();

It tests for the connection status without waiting for the network to come up (except during install, when it does wait). If I comment out the "} else {" line, so that the wait loop also runs outside the installer, it works perfectly.
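The wait-then-test pattern described above can be sketched in isolation. This is only an illustration under stated assumptions: wait_until() is a hypothetical helper that is not part of drakx-net, and the status check is a stand-in closure rather than $connection->get_status.

```perl
use strict;
use warnings;

# Poll $check (a code ref) up to $timeout times, sleeping $interval whole
# seconds between attempts. Returns the first true value, or 0 on timeout.
# wait_until() is a hypothetical helper, not part of drakx-net.
sub wait_until {
    my ($check, $timeout, $interval) = @_;
    $interval = 1 if !defined $interval;
    while ($timeout-- > 0) {
        my $status = $check->();
        return $status if $status;
        sleep $interval if $interval;
    }
    return 0;
}

# Stand-in status check that only becomes true on the third poll,
# mimicking a link that needs a moment to come up.
my $polls = 0;
my $ok = wait_until(sub { ++$polls >= 3 }, 10, 0);
print "up after $polls polls\n" if $ok;
```

Testing the status once, immediately after starting the service, corresponds to calling the check a single time with no retry, which is why the result depends on how quickly the interface comes up.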
Sorry, cancel Comment 5. After refreshing my VirtualBox machine it is failing again :-(
Created attachment 2268
Patch to netconnect.pm

No, cancel my cancel. I was right the first time. I had left the virtual Ethernet cable disconnected in my VirtualBox machine. Doh!

Attached is a patch to netconnect.pm that waits for the network to come up before testing for connectivity. There is no need to do anything to tools.pm.
For the record, $::isInstall is set in the installer but not in drakconnect when it is run as a standalone tool. So you're making drakconnect run that block. Maybe we should just drop the condition, since that block is never run by drakconnect but only in drakx, where we have failures. WDYT, Blino?

BTW, use "diff -u" next time in order to get better patches.
Created attachment 2269
Revised patch for netconnect.pm

This patch is better. The last one screws up the install.
Attachment 2268 is obsolete: 0 => 1
oops. network-auth is a NoOp now. That might be the real issue.
Keywords: (none) => PATCH
CC: (none) => mageia
network-auth has always been a noop. It's just a virtual script that changes ordering. It still has effect when enabled - i.e. it will delay prefdm.service startup until network-up.service is complete.
I am pretty confident it is a timing issue. If I run with the Perl debugger and pause execution just before $connection->get_status(), then the test passes. Making it run that block, as in attachment 2269, delays it long enough for the connection to be ready before it is tested.
Could it be the fact that the code is calling services::start('network-up')? If the service is already logged as running (systemctl status network-up.service), then systemd won't start it again. E.g. compare:

    systemctl start network-up.service

vs

    systemctl try-restart network-up.service

The latter takes much longer. So maybe the fix is simply to use services::restart rather than services::start? Could you maybe test such a fix?
I have tried using services::restart: no luck, it does not work.

We have a working patch in attachment 2269. Why not use it?
(In reply to comment #14)
> We have a working patch in attachment 2269. Why not use it?

Because I do not think it solves the problem properly. It's a hacky solution that unconditionally injects a delay and doesn't actually analyse the problem itself fully.

The patch you refer to makes code meant only for the installer run unconditionally, even when the network-up script exists to do this job.

Now in this case, because we start the services in non-blocking mode (for other reasons, on another bug), the actual command executed will be non-blocking. So we actually need a fix here to use blocking mode in this particular case. (I knew this would come back to bite me, TV :p)

You can likely use:

    run_program::rooted($::prefix, '/bin/systemctl', 'restart', 'network-up.service');

(rather than the "services::start('network-up')") as a hacky test, but there may be other reasons why this might not work.

It is probably easier just to add a non-block option to _run_action() and allow restart to use it. Sadly the initscript doesn't have a "restart" action, so it would need to be modified as well to ensure it works with both sysvinit and systemd.

I'm not sure this qualifies as a release blocker, so it might have to wait until after release for an update.

However, the other option is simply to ditch using network-up at all here and only use the code that the installer normally uses (i.e. remove some more code in your patch). This would likely be fine, as I think using network-up here is actually overkill.

TV, WDYT?
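To make the blocking versus non-blocking distinction concrete, here is a minimal sketch of how the two command lines differ. systemctl_command() is a hypothetical helper written for illustration only; it is not services::_run_action() and makes no claim about how drakx actually builds its calls. The --no-block flag itself is a standard systemctl option that queues the job and returns without waiting for it to finish.

```perl
use strict;
use warnings;

# Build the argument list for a systemctl call. With block => 0 the
# --no-block flag is added, so systemctl queues the job and returns at once.
# systemctl_command() is a hypothetical helper, not services::_run_action().
sub systemctl_command {
    my ($action, $unit, %opts) = @_;
    my $block = exists $opts{block} ? $opts{block} : 1;
    my @cmd = ('/bin/systemctl');
    push @cmd, '--no-block' if !$block;
    push @cmd, $action, $unit;
    return @cmd;
}

# Blocking restart versus non-blocking start of the same unit.
print join(' ', systemctl_command('restart', 'network-up.service')), "\n";
print join(' ', systemctl_command('start', 'network-up.service', block => 0)), "\n";
```

With the non-blocking form, any status check made right after the call returns can run before the unit has finished its work.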
Blino WDYT?
Having looked at how services::start is structured, I can see you are probably concerned this may be a bug introduced by the move to systemd. So I did some experiments.

First of all I forced it to use systemd, then /etc/rc.d/init.d/: no change.

Next I measured the time between executing services::start('network-up') and a gateway address appearing in 'route'. It takes between 1 and 2 seconds for the gateway to appear (limited by the 1 second resolution of my timer loop). So, referring to my patch, the test succeeds on the third pass of the loop.

Next I removed the --no-block option from systemctl: no change. It still takes 1-2 seconds for route to be updated.
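For anyone wanting to repeat this kind of measurement, here is a rough sketch of a one-second timer loop. It is not the script used above: both function names are hypothetical, and it reads the Linux /proc/net/route table rather than parsing the output of 'route'.

```perl
use strict;
use warnings;

# Return the interface of the first default route that has a gateway, given
# the text of /proc/net/route; return undef if there is none.
# default_gateway_iface() is a hypothetical helper, not drakx-net code.
sub default_gateway_iface {
    my ($route_table) = @_;
    foreach my $line (split /\n/, $route_table) {
        my ($iface, $dest, $gateway, $flags) = split ' ', $line;
        # Skip the header line and anything malformed.
        next if !defined $flags || $flags !~ /^[0-9A-Fa-f]+$/;
        # Destination 00000000 is the default route; flag 0x2 is RTF_GATEWAY.
        return $iface if $dest eq '00000000' && (hex($flags) & 0x2);
    }
    return;
}

# Poll the live routing table (Linux only) once per second and return the
# number of whole seconds elapsed when a default gateway first shows up,
# or undef if none appears within $max_seconds.
sub seconds_until_gateway {
    my ($max_seconds) = @_;
    foreach my $elapsed (0 .. $max_seconds) {
        my $table = '';
        if (open(my $fh, '<', '/proc/net/route')) {
            local $/;
            $table = <$fh> // '';
            close $fh;
        }
        my $iface = default_gateway_iface($table);
        return $elapsed if defined $iface;
        sleep 1;
    }
    return;
}
```

Calling seconds_until_gateway(10) right after starting the service gives a result with the same one-second resolution as the measurement above.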
Hi, This bug was filed against cauldron, but we do not have cauldron at the moment. Please report whether this bug is still valid for Mageia 2. Thanks :) Cheers, marja
Keywords: (none) => NEEDINFO
This bug is still valid on Mageia 2
Version: Cauldron => 2
Keywords: NEEDINFO => (none)
Please look at the bottom of this mail to see whether you're the assignee of this bug, if you don't already know whether you are. If you're the assignee: We'd like to know for sure whether this bug was assigned correctly. Please change status to ASSIGNED if it is, or put OK on the whiteboard instead. If you don't have a clue and don't see a way to find out, then please put NEEDHELP on the whiteboard. Please assign back to Bug Squad or to the correct person to solve this bug if we were wrong to assign it to you, and explain why. Thanks :) **************************** @ the reporter and persons in the cc of this bug: If you have any new information that wasn't given before (like this bug being valid for another version of Mageia, too, or it being solved) please tell us. @ the reporter of this bug If you didn't reply yet to a request for more information, please do so within two weeks from now. Thanks all :-D
*** Bug 7335 has been marked as a duplicate of this bug. ***
CC: (none) => davidwhodgins
Attachment 2269 filename: patch-netconnect.pm => patch-netconnect.diff
Target Milestone: --- => Mageia 3
Whiteboard: (none) => 3alpha1
Fixed in git. Since blino didn't answer, and since it's good enough for drakx, it can't hurt in standalone mode.
Status: NEW => RESOLVED
Resolution: (none) => FIXED
drakx-net update pushed: https://wiki.mageia.org/en/Support/Advisories/MGAA-2012-0187
CC: (none) => tmb