Bug 5772 - Ethernet test fails when adding additional media
Summary: Ethernet test fails when adding additional media
Status: RESOLVED FIXED
Alias: None
Product: Mageia
Classification: Unclassified
Component: Installer
Version: Cauldron
Hardware: All Linux
Priority: Normal normal
Target Milestone: ---
Assignee: Olivier Blin
QA Contact:
URL:
Whiteboard:
Keywords:
Duplicates: 4958 7335
Depends on:
Blocks:
 
Reported: 2012-05-06 17:13 CEST by Derek Jennings
Modified: 2012-09-04 18:17 CEST
CC List: 6 users

See Also:
Source RPM: drakx-net
CVE:
Status comment:


Attachments
Wireshark capture log (3.99 KB, application/octet-stream)
2012-05-06 17:16 CEST, Derek Jennings
More complete wireshark log (24.21 KB, application/octet-stream)
2012-05-06 19:50 CEST, Derek Jennings
Test script (101 bytes, text/plain)
2012-05-09 12:56 CEST, Derek Jennings
Patch File (461 bytes, text/plain)
2012-05-10 02:17 CEST, Derek Jennings
patch for drakx-net (831 bytes, patch)
2012-05-10 09:43 CEST, Thierry Vignaud
can you check this patch works as well? (382 bytes, text/plain)
2012-05-10 09:43 CEST, Thierry Vignaud
write resolver conf in both chroot & installer root (662 bytes, patch)
2012-05-14 10:30 CEST, Thierry Vignaud
live patch for drakx (1.63 KB, text/plain)
2012-05-14 13:02 CEST, Thierry Vignaud
symlink chrooted /etc/resolv.conf into installer (450 bytes, patch)
2012-05-14 14:44 CEST, Thierry Vignaud
Working patch (336 bytes, text/plain)
2012-05-14 15:02 CEST, Derek Jennings

Description Derek Jennings 2012-05-06 17:13:12 CEST
Description of problem:
When additional online media is declared while installing from the Mageia2 RC Dual CD or DVD, the connectivity test that runs when Ethernet is started invariably fails, even though the connection actually works if the user lets the installer proceed.


Version-Release number of selected component (if applicable):
Mageia2 RC

How reproducible:
Every time, in VirtualBox and on all the hardware I have.


Steps to Reproduce:
1. Start an install from the Dual CD (the DVD does the same).
2. At the point where the installer asks if you have additional media, select FTP.
3. The installer then installs some packages and asks if it should bring up the Ethernet. Accept all defaults.
4. A pop-up appears saying 'Testing Connection'. After a time, a window appears saying the connection does not work and that the user should check their router. If the user selects 'Continue' at this point, installation proceeds and packages are downloaded from the network without a problem.

Analysis
I monitored the install with Wireshark and got a capture of the failure.
After going through DHCP, the installer makes a DNS request for www.mageia.org, but it uses the wrong destination address: it sends the request to 127.0.0.1 instead of the DNS server address given during DHCP. See the attached capture file.
Comment 1 Derek Jennings 2012-05-06 17:16:01 CEST
Created attachment 2196
Wireshark capture log
Comment 2 claire robinson 2012-05-06 17:33:50 CEST
This is probably what causes it to fail the connectivity test when setting up the network on the Summary page too. Nice find Derek :)
claire robinson 2012-05-06 17:35:53 CEST

CC: (none) => thierry.vignaud

claire robinson 2012-05-06 17:50:16 CEST

CC: (none) => eeeemail

Comment 3 Derek Jennings 2012-05-06 19:50:19 CEST
Created attachment 2200
More complete wireshark log

Here is a better Wireshark log, filtered on MAC address so we can see the full DHCP exchange.

At line 3 the DHCP server offers the IP address and declares the DNS server to be 192.168.1.254

The Mageia installer ignores the DNS server and tests the connection by sending a DNS request to 127.0.0.1

At 77 seconds the connectivity test times out and the user tells the installer to continue, which the installer does by making a correct DNS request to the DNS server already declared.
Comment 4 Manuel Hiebel 2012-05-07 19:01:57 CEST
*** Bug 4958 has been marked as a duplicate of this bug. ***
Manuel Hiebel 2012-05-07 19:04:04 CEST

CC: (none) => mageia
Source RPM: (none) => drakx-net

Comment 5 Derek Jennings 2012-05-08 14:59:31 CEST
A bit more information to help with debugging.
While the network test is in progress, /mnt/etc/resolv.conf contains the correct DNS server address, but /etc/resolv.conf does not exist.

If I symlink between the two and repeat the network test, it still fails: it is still trying to use 127.0.0.1 as the DNS server.
However, a 'ping' on the command line works with the symlink in place.

As far as I can tell with my appalling Perl skills, the test is initiated in /usr/lib/libDrakX/network/netconnect.pm at line 329, but I do not see what is wrong. It simply calls a subroutine which runs
gethostbyname("www.mageia.org")
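
As a rough illustration of why the two files matter, here is a minimal Python sketch (not installer code) of the lookup a stub resolver performs on its configuration file. The function name effective_nameservers is hypothetical; the 127.0.0.1 fallback follows the behaviour documented in resolv.conf(5) for a missing file, which would be consistent with the capture.

```python
from pathlib import Path

# Documented resolver fallback when no resolv.conf is present:
# only the name server on the local machine is queried.
DEFAULT_NAMESERVER = "127.0.0.1"


def effective_nameservers(resolv_conf):
    """Return the nameservers a stub resolver would read from resolv_conf."""
    path = Path(resolv_conf)
    if not path.is_file():
        return [DEFAULT_NAMESERVER]
    servers = []
    for line in path.read_text().splitlines():
        fields = line.split()
        # Only "nameserver <address>" lines matter here; comments and
        # other directives (search, options, ...) are skipped.
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers or [DEFAULT_NAMESERVER]


if __name__ == "__main__":
    # In the installer, the first path is missing and the second is correct.
    for candidate in ("/etc/resolv.conf", "/mnt/etc/resolv.conf"):
        print(candidate, effective_nameservers(candidate))
```

With /etc/resolv.conf absent, the sketch reports the loopback address for the installer root while the chroot path still yields the DHCP-supplied server.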
Comment 6 Derek Jennings 2012-05-09 12:56:24 CEST
Created attachment 2230
Test script

OK, I think I understand where the test is going wrong.
I read in a Perl forum that gethostbyname() only reads /etc/resolv.conf once per process, so if resolv.conf is not present when the test begins, writing it later is not going to help.

I verified this with the attached Perl script test1.pl. If I run it on my desktop computer with perl test1.pl and rename /etc/resolv.conf while the test is running, the results at the end of the test are all the same, regardless of the change to resolv.conf.

As far as fixing this bug is concerned, probably all that is required is to copy /mnt/etc/resolv.conf to /etc/resolv.conf so the test works on the first pass of the loop. A better fix might be to copy resolv.conf over, and then in the loop make a system call to ping instead of gethostbyname.

Unfortunately I cannot try out any fix because I cannot write to the squashfs used in the installer.
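
The copy step proposed above could look roughly like the following Python sketch. It is not the installer's code (DrakX is written in Perl); seed_resolv_conf and its parameters are hypothetical names, and the default paths are simply the ones mentioned in this report.

```python
import shutil
from pathlib import Path


def seed_resolv_conf(target_root="/mnt", installer_root="/"):
    """Copy <target_root>/etc/resolv.conf into the installer root if missing.

    Returns True when a copy was made, False otherwise.
    """
    source = Path(target_root) / "etc" / "resolv.conf"
    dest = Path(installer_root) / "etc" / "resolv.conf"
    # Leave an existing file alone, and do nothing if there is no source yet.
    if dest.exists() or not source.is_file():
        return False
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(source, dest)
    return True
```

The copy has to happen before the process makes its first lookup; otherwise the stale configuration described above would still be in effect.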
Comment 7 Thierry Vignaud 2012-05-09 13:11:59 CEST
Yes you can:
See https://wiki.mageia.org/en/Drakx-installer_tips_and_tricks#modifying_the_stage2
Comment 8 Derek Jennings 2012-05-09 17:47:45 CEST
Thanks for that. Very useful

drakx-in-chroot skips the network setup stage, but I have managed to set up a VBox with a patch-oem.pl file in the ISO image. I should be able to try out a patch now :-)
Comment 9 Derek Jennings 2012-05-10 02:17:45 CEST
Created attachment 2235
Patch File

patch-oem.pl file attached

This works for me.

If the DNS server is unavailable or ping fails then it times out as before after 60 seconds.  If ping succeeds then the user gets a success message. 

If drakx-net is being run after installation it uses gethostbyname() as previously. (Probably not necessary but I wanted to avoid any regression)
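
For readers without the attachment, the behaviour described in this comment (retry a probe, give up after 60 seconds) can be sketched in Python as below. This is an illustration under assumptions, not the attached patch: ping_once and wait_for_connectivity are hypothetical names, and the ping flags are those of the Linux iputils ping.

```python
import subprocess
import time


def ping_once(host, wait_seconds=2):
    """Return True if a single ICMP echo to host is answered (Linux ping flags)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(wait_seconds), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def wait_for_connectivity(probe, timeout=60.0, interval=1.0,
                          clock=time.monotonic, sleep=time.sleep):
    """Call probe() until it succeeds or timeout seconds have elapsed."""
    deadline = clock() + timeout
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A caller would pass something like `lambda: ping_once("www.mageia.org")` as the probe; keeping the probe injectable also makes it easy to swap in a name lookup instead of ping.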
Comment 10 Thierry Vignaud 2012-05-10 09:43:17 CEST
Created attachment 2238
patch for drakx-net
Comment 11 Thierry Vignaud 2012-05-10 09:43:50 CEST
Created attachment 2239
can you check this patch works as well?
Comment 12 Derek Jennings 2012-05-10 13:38:03 CEST
Confirming patch in attachment 2239 works OK
It also fixes bug 4958
Comment 13 Derek Jennings 2012-05-10 19:36:03 CEST
There is a very similar and almost certainly related bug for which I have just raised bug 5830.

I tried this patch without the if $::isInstall statement to see if it fixed 5830, but it did not, so I have raised a separate bug report.
Thierry Vignaud 2012-05-10 20:46:31 CEST

Attachment 2235 is obsolete: 0 => 1

Thierry Vignaud 2012-05-10 20:46:52 CEST

Attachment 2235 mime type: application/octet-stream => text/plain

Thierry Vignaud 2012-05-10 20:47:28 CEST

Attachment 2230 mime type: application/octet-stream => text/plain

Comment 14 Thierry Vignaud 2012-05-10 20:58:17 CEST
Fixed in SVN

Status: NEW => RESOLVED
Resolution: (none) => FIXED

Comment 15 Olivier Blin 2012-05-10 21:30:04 CEST
This is not a proper fix, since it won't work from behind firewalls.

Status: RESOLVED => REOPENED
Resolution: FIXED => (none)

Comment 16 Olivier Blin 2012-05-10 21:34:25 CEST
We should keep the previous check, and maybe run res_init() before.
There is a commented res_init call in drakx-net, I'll check history.
Comment 17 Olivier Blin 2012-05-10 22:07:34 CEST
Fixed in drakx 4459 and drakx-net 4460, both will need to be released though.
Comment 18 Derek Jennings 2012-05-13 01:13:37 CEST
Tried out drakx-net-1.12-1mga2
Sorry, it does not fix the standalone error (bug 5830), nor does it fix the installer.

At the end of the installer network test a window pops up saying that c::res_init() cannot be found.

In the standalone network test there is no alarm from Perl, but the test does not pass. If I apply the patch in attachment 2269, the test passes.

Back to ping?
FWIW, the subroutine check_link_beat() does precisely the same ping test (but using Net::Ping, which also does not work in the installer).
 
While many firewalls will block an incoming ping, I have never heard of one blocking outgoing pings.
Comment 19 Derek Jennings 2012-05-13 01:26:02 CEST
Olivier I just want to check I applied your patch correctly in the installer.
I can only patch the installer with a patch-oem.pl file

The patch file I used was :-

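use network::tools;

package network::tools;

# Replace the stock connectivity check with one that asks the
# resolver to re-read its configuration before the lookup.
undef *connected;
*connected = sub {
    c::res_init();
    gethostbyname("www.mageia.org") ? 1 : 0;
};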


Correct Yes?
Comment 20 Thierry Vignaud 2012-05-13 20:59:33 CEST
It's not.
It relies on an updated c module.
You need to test the latest stage2 image.
Comment 21 Derek Jennings 2012-05-14 00:03:51 CEST
I have put mdkinst.sqfs onto my iso from drakx-installer-stage2-14.21-3.mga2
On the command line I can see the res_init() command in tools.pm so I am confident I have used the correct file.  

When I run the network test in the installer I do not get the alert I saw in Comment 18 so I assume it is finding res_init() OK. However it still does not work. 
Wireshark confirms it is still using 127.0.0.1 as the DNS server.
Comment 22 Derek Jennings 2012-05-14 01:31:00 CEST
Ok. The problem is because /etc/resolv.conf still does not exist.

If I copy /mnt/etc/resolv.conf to /etc/resolv.conf  then the test passes.
Thierry Vignaud 2012-05-14 10:27:56 CEST

Assignee: bugsquad => mageia

Comment 23 Thierry Vignaud 2012-05-14 10:30:10 CEST
Created attachment 2303
write resolver conf in both chroot & installer root
Comment 24 Thierry Vignaud 2012-05-14 10:30:29 CEST
blino we need to update installer conf as well
Comment 25 Derek Jennings 2012-05-14 12:59:06 CEST
I put attachment 2303 into a patch file, but it did not make any difference.
The patch file loaded without any errors, but resolv.conf is still not written.
I then put a log message into the patch to confirm the code was being executed and the log message did not appear in ddebug.log.

So either I have implemented the patch file wrong, or write_resolv_conf() is not being called at this point.
Comment 26 Thierry Vignaud 2012-05-14 13:01:21 CEST
Attachment #2303 is a patch against SVN, to be committed there, not a patch file for drakx, which would need to undefine and then redefine the function.
Comment 27 Thierry Vignaud 2012-05-14 13:02:48 CEST
Created attachment 2308
live patch for drakx

In order to live test, you would need something like this
Comment 28 Derek Jennings 2012-05-14 13:29:43 CEST
I should have made Comment 25 clearer. I had made a patch-oem.pl file which was identical to your attachment 2308.

Just to be sure, I have tried again using attachment 2308. Same results.
/etc/resolv.conf is not written.
Comment 29 Thierry Vignaud 2012-05-14 14:44:51 CEST
Created attachment 2309
symlink chrooted /etc/resolv.conf into installer

so we're back to this, then
Comment 30 Derek Jennings 2012-05-14 15:02:44 CEST
Created attachment 2310
Working patch

Almost, but not quite.
The symlink has to be made before the res_init() call.

The attached patch file works OK for me
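
The ordering constraint can be shown with a small Python sketch. It is not the attached patch: link_then_reinit is a hypothetical name, and the reinit callback merely stands in for whatever re-initialises the resolver (res_init() in the installer's case).

```python
import os


def link_then_reinit(reinit, target_conf="/mnt/etc/resolv.conf",
                     installer_conf="/etc/resolv.conf"):
    """Symlink the installed system's resolv.conf, then re-initialise."""
    # Create the link only if nothing (file or dangling link) is there yet.
    if not os.path.lexists(installer_conf):
        os.symlink(target_conf, installer_conf)
    # Only now is it useful to ask the resolver to re-read its configuration.
    reinit()
```

Calling the re-initialisation first would have it read a configuration that does not exist yet, which matches the failure seen with the earlier patch.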

Attachment 2309 is obsolete: 0 => 1

Comment 31 Derek Jennings 2012-05-19 08:08:53 CEST
Thierry, Can this bug be closed yet?  I have not seen a commit in SVN.
Comment 32 Dave Hodgins 2012-05-19 08:54:03 CEST
(In reply to comment #31)
> Thierry, Can this bug be closed yet?  I have not seen a commit in SVN.

It's still failing in all of the latest iso images, so it should
not be closed.

CC: (none) => davidwhodgins

Comment 33 Marja Van Waes 2012-05-26 13:06:34 CEST
Hi,

This bug was filed against cauldron, but we do not have cauldron at the moment.

Please report whether this bug is still valid for Mageia 2.

Thanks :)

Cheers,
marja

Keywords: (none) => NEEDINFO

Comment 34 Derek Jennings 2012-05-26 13:51:06 CEST
This bug is in the installer. It is therefore no longer possible to fix it for Mageia 2, so I am leaving it assigned to Cauldron.
Sander Lepik 2012-05-26 14:30:55 CEST

Keywords: NEEDINFO => (none)
CC: (none) => sander.lepik

Comment 35 Marja Van Waes 2012-07-06 15:03:38 CEST
Please look at the bottom of this mail to see whether you're the assignee of this  bug, if you don't already know whether you are.


If you're the assignee:

We'd like to know for sure whether this bug was assigned correctly. Please change status to ASSIGNED if it is, or put OK on the whiteboard instead.

If you don't have a clue and don't see a way to find out, then please put NEEDHELP on the whiteboard.

Please assign back to Bug Squad or to the correct person to solve this bug if we were wrong to assign it to you, and explain why.

Thanks :)

**************************** 

@ the reporter and persons in the cc of this bug:

If you have any new information that wasn't given before (like this bug being valid for another version of Mageia, too, or it being solved) please tell us.

@ the reporter of this bug

If you didn't reply yet to a request for more information, please do so within two weeks from now.

Thanks all :-D
Comment 36 Manuel Hiebel 2012-07-06 20:37:23 CEST
*** Bug 2341 has been marked as a duplicate of this bug. ***

CC: (none) => paiiou

Comment 37 Thierry Vignaud 2012-09-04 18:06:54 CEST
*** Bug 7335 has been marked as a duplicate of this bug. ***
Comment 38 Thierry Vignaud 2012-09-04 18:17:58 CEST
OK, after reviewing, on second thought, this is fine.
Committed in git.

Status: REOPENED => RESOLVED
Resolution: (none) => FIXED

