Bug 24278

Summary: Installer hangs at setupSCSI step when using --local_install on a machine with logical volumes
Product: Mageia
Reporter: Martin Whitaker <mageia>
Component: Installer
Assignee: Mageia tools maintainers <mageiatools>
Status: RESOLVED FIXED
QA Contact:
Severity: critical
Priority: release_blocker
CC: thierry.vignaud
Version: Cauldron
Target Milestone: ---
Hardware: All
OS: Linux
Whiteboard:
Source RPM:
CVE:
Status comment:
Attachments:
  do not call 'setupSCSI' for local install
  alternative patch
  Extended alternative patch
  Extended alternative patch v2

Description Martin Whitaker 2019-02-01 10:15:34 CET
The latest version of lvm2 appears to rely on udev. The last messages in ddebug.log are:

* looking for vgs in sda1
* running: lvm2 vgscan
  Reading all physical volumes.  This may take a while...
  WARNING: Device /dev/ram0 not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/sda not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/vg-mga/lv_mga1 not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/ram1 not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/sda1 not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/vg-mga/lv_mga2 not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/ram2 not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/sda2 not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/ram3 not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/sda3 not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/ram4 not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/sda4 not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/ram5 not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/sda5 not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/ram6 not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/sda6 not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/ram7 not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/sda7 not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/ram8 not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/sda8 not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/ram9 not initialized in udev database even after waiting 10000000 microseconds.

When I run 'lvm2 vgscan' manually in a chroot, it goes through all the devices, then loops back and starts trying again.
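
For scale: each warning above reports a wait of 10000000 microseconds (10 seconds), so one pass over the 21 devices listed takes at least three and a half minutes before vgscan starts over. A minimal Python sketch that tallies this from the log text follows. The function name and the regular expression are illustrative only, derived from the quoted warning lines, and are not part of lvm2 or the installer.

```python
import re

# Pattern derived from the warning lines quoted in this report (an assumption
# about their exact shape, not an lvm2-documented format).
WARNING_RE = re.compile(
    r"WARNING: Device (\S+) not initialized in udev database "
    r"even after waiting (\d+) microseconds\."
)


def summarize_udev_waits(log_text):
    """Return (devices, total_seconds) for the udev-timeout warnings in log_text."""
    devices = []
    total_us = 0
    for match in WARNING_RE.finditer(log_text):
        devices.append(match.group(1))
        total_us += int(match.group(2))
    return devices, total_us / 1_000_000


# Three lines copied from the log above.
sample = """\
  WARNING: Device /dev/ram0 not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/sda not initialized in udev database even after waiting 10000000 microseconds.
  WARNING: Device /dev/vg-mga/lv_mga1 not initialized in udev database even after waiting 10000000 microseconds.
"""

devices, seconds = summarize_udev_waits(sample)
print(len(devices), "devices,", seconds, "seconds waited")
```

Running the same function over the full ddebug.log excerpt would give the per-pass total for all listed devices.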

@Thierry, advice please. This is stopping me building Live ISOs on the Mageia infra. Do we really need to run the setupSCSI step for a local install?
Comment 1 Thierry Vignaud 2019-02-01 14:30:31 CET
Created attachment 10707
do not call 'setupSCSI' for local install

Can you try this patch?
Comment 2 Thierry Vignaud 2019-02-01 14:43:47 CET
Created attachment 10708
alternative patch
Comment 3 Martin Whitaker 2019-02-01 14:54:32 CET
Created attachment 10709
Extended alternative patch

The alternative version cures the initial problem, but you then get an "Oops, no root partition" from check_hds_boot_and_root(). This revised patch fixes that, but maybe there's a better way?

Attachment 10708 is obsolete: 0 => 1

Comment 4 Thierry Vignaud 2019-02-01 17:31:45 CET
Use the first patch then
Comment 5 Martin Whitaker 2019-02-01 17:45:10 CET
(In reply to Thierry Vignaud from comment #4)
> Use the first patch then

No, that doesn't help. It hangs in the call to install::any::getHds($o). And not calling that leads to the "Oops, no root partition".
Comment 6 Martin Whitaker 2019-02-01 18:02:44 CET
Created attachment 10713
Extended alternative patch v2

This simplified patch works for the Live ISO build.

Attachment 10709 is obsolete: 0 => 1

Comment 7 Martin Whitaker 2019-02-03 11:07:00 CET
Unless you object, I'll push my fix.
Comment 8 Thierry Vignaud 2019-02-03 14:09:43 CET
Your patch alters the installer semantics, e.g.:
- $o->{fstab} is no longer used
- we no longer look for mounted devices in $o->{fstab}
- $o->{fstab} is overwritten and reduced to only the partition

Note that we care much in the drakx-in-chroot case, but it would be better to have a nice, explanatory commit log
Comment 9 Martin Whitaker 2019-02-03 15:19:35 CET
If we skip the setupSCSI step, $o->{fstab} is empty, so we aren't overwriting anything, and there are no other partitions to mount. As far as I can tell, we just need that one fake fstab entry.

Proposed commit message:
====
Skip setupSCSI step when run with --local_install (mha#24278)

In a local install, we don't have udev running, so the setupSCSI step will hang if it tries to probe for logical volumes (lvm2 uses udev).

A local install is used to test the installer (drakx_in_chroot) and to build the Live ISOs (draklive2), and in both cases we don't really want the install to be affected by the hardware of the host system. Skipping the setupSCSI step means $o->{fstab} contains no entries, so we add a fake entry for our chroot, to allow us to pass the subsequent check that we have a root partition.
====
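
As a rough illustration of the approach described in the commit message above, here is a minimal Python model of the control flow. It is not the actual patch (the installer is written in Perl and the patch contents are not shown in this report). The function names and the fields of the fake entry are hypothetical; only the idea of skipping setupSCSI and adding one fake root entry comes from the commit message.

```python
def prepare_fstab(o, setup_scsi):
    """Model of the fix: skip hardware probing on a local install.

    `o` stands in for the installer's $o hash; `setup_scsi` stands in for
    the setupSCSI step. Both are hypothetical stand-ins for this sketch.
    """
    if o.get("local_install"):
        # No udev is running here, so probing for logical volumes would hang.
        # Add a single fake entry for the chroot instead (field names assumed).
        o["fstab"] = [{"device": "fake", "mntpoint": "/"}]
    else:
        setup_scsi(o)
    return o


def has_root_partition(o):
    """Model of the later check that some fstab entry is mounted at '/'."""
    return any(part.get("mntpoint") == "/" for part in o.get("fstab", []))


def hanging_probe(o):
    # Stands in for the step that hangs without udev; it must never be
    # reached on a local install.
    raise RuntimeError("setupSCSI should not run on a local install")


local = prepare_fstab({"local_install": True, "fstab": []}, hanging_probe)
print(has_root_partition(local))
```

With `local_install` set, the probe is never called and the root check passes on the single fake entry, which matches the point made in reply to comment 8: `$o->{fstab}` is empty when the step is skipped, so nothing is being overwritten.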

BTW, this should also fix bug 24201.
Comment 10 Thierry Vignaud 2019-02-03 19:57:45 CET
Good enough for me.
Go ahead with godspeed
Comment 11 Martin Whitaker 2019-02-03 20:56:59 CET
Thanks. I've also added a comment in the code, as a reminder.
Comment 12 Martin Whitaker 2019-02-03 23:15:55 CET
Confirmed fixed when running Live ISO build on rabbit.

Resolution: (none) => FIXED
Status: NEW => RESOLVED