Bug 5730 - Truly minimal (otherwise default) install ends up unbootable
Summary: Truly minimal (otherwise default) install ends up unbootable
Status: RESOLVED OLD
Alias: None
Product: Mageia
Classification: Unclassified
Component: Installer
Version: 3
Hardware: x86_64 Linux
Priority: Normal major
Target Milestone: ---
Assignee: Mageia Bug Squad
 
Reported: 2012-05-03 04:43 CEST by Herbert Poetzl
Modified: 2014-11-16 22:15 CET



Attachments
boot failure: unknown filesystem type 'ext4' (22.70 KB, image/png)
2012-05-03 04:45 CEST, Herbert Poetzl
similar result with encrypted rootfs (6.83 KB, image/png)
2012-05-03 05:07 CEST, Herbert Poetzl

Description Herbert Poetzl 2012-05-03 04:43:02 CEST
Description of problem:
When installing with all default options and selecting the truly minimal install, the system fails to boot with:

mount: unknown filesystem type 'ext4'
umount: /sysroot: not mounted

dracut Warning: Can't mount root filesystem

Version-Release number of selected component (if applicable):
f9f7035901919940e9727e7057885dff  Mageia-2-rc-dual-CD.iso

How reproducible:
always

Steps to Reproduce:

# qemu-img create -f qcow2 mga2rc.qcow2 4G
# qemu-kvm -m 1024 -hda mga2rc.qcow2 -cdrom Mageia-2-rc-dual-CD.iso

Take all defaults up to the package selection, then deselect all packages and select the truly minimal install. Continue until the reboot without doing an update.
Comment 1 Herbert Poetzl 2012-05-03 04:45:28 CEST
Created attachment 2164
boot failure: unknown filesystem type 'ext4'
Comment 2 Herbert Poetzl 2012-05-03 05:07:00 CEST
Created attachment 2165
similar result with encrypted rootfs
Manuel Hiebel 2012-05-04 17:22:01 CEST

CC: (none) => mageia, pterjan, thierry.vignaud

Comment 3 Colin Guthrie 2012-05-05 11:32:09 CEST
I tried to reproduce with a net install but it worked fine. I'll try again with the dual CD iso when I get a moment.
Comment 4 Colin Guthrie 2012-05-05 15:50:10 CEST
OK, reproduced with the latest RC Image.
claire robinson 2012-05-05 16:20:05 CEST

CC: (none) => ennael1

Comment 5 Colin Guthrie 2012-05-05 16:29:38 CEST
Hmm, reproduced in qemu, but not in VirtualBox... The plot thickens....
Comment 6 Colin Guthrie 2012-05-05 17:07:07 CEST
OK, so this is really weird. It seems that under the qemu install, /mnt/dev/ is not bind mounted to the root filesystem's /dev. Thus when creating the new disk, it does not inherit the /dev/disk/by-uuid/ path/symlinks and thus it doesn't detect the rootfs and doesn't include ext4.

I'm not 100% sure if it's just a missing bind mount or something else that makes the exact same process work fine under VirtualBox.
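
One way to confirm the symptom described above is to list the generated initrd and look for the ext4 module. The following is only a sketch: `lsinitrd` ships with dracut, but the helper name `listing_has_module` and the image path are assumptions made for illustration, and the check simply greps the file listing.

```shell
#!/bin/sh
# Hypothetical helper: read an initrd file listing on stdin and succeed if it
# names a module file for the given filesystem (ext4.ko, ext4.ko.gz, ext4.ko.xz, ...).
listing_has_module() {
    grep -q "/$1\.ko"
}

# Assumed image location; adjust to the initrd actually written by the installer.
img="/boot/initrd-$(uname -r).img"

# Only run the check where lsinitrd and the image are available.
if command -v lsinitrd >/dev/null 2>&1 && [ -r "$img" ]; then
    if lsinitrd "$img" | listing_has_module ext4; then
        echo "ext4 module found in $img"
    else
        echo "ext4 module missing from $img"
    fi
fi
```

On an install that failed as in the attached screenshot, the listing would be expected to lack the ext4 module.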
Comment 7 Colin Guthrie 2012-05-05 18:20:49 CEST
Right, so the only thing I can think of here is a timing issue. It seems that rather than bind mounting /dev we actually copy the device nodes across.

As udev is responsible for creating the symlinks needed by dracut, if it is lagging behind, we might do the copying before the symlinks are ready. If this is the case we get a failure.

I was able to do a qemu install that worked so it's certainly intermittent.

As a workaround until this is fixed properly (and I'd really appreciate the extra tests), can you do this during the install:

 1. After formatting the disks, switch to tty2 (in VirtualBox rightctl+f2, in qemu ctrl-alt-2 -> "sendkey ctrl-alt-f2" -> ctrl-alt-1 : please tell me if there is a better way as I'm a qemu noob!)
 2. Type: mount -o bind /dev /mnt/dev
 3. Switch back to tty7 and continue install as before

If you could test this in your RAID setup bug too that would be awesome as I suspect they are actually the same problem.

I'll change the copy to a bind mount in drakx and we can respin some ISOs after RC but before final to give this a bit more testing.
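
Before applying the workaround, it may help to check whether the by-uuid symlinks actually reached the target root. This is a minimal sketch under the assumption that the target is mounted at /mnt, as in the steps above; the helper name `has_by_uuid_links` is hypothetical and not part of drakx.

```shell
#!/bin/sh
# Hypothetical helper: succeed if <root>/dev/disk/by-uuid holds at least one symlink.
has_by_uuid_links() {
    root=${1:-/mnt}
    for link in "$root"/dev/disk/by-uuid/*; do
        # An unmatched glob stays literal and is not a symlink, so the test fails.
        [ -L "$link" ] && return 0
    done
    return 1
}

# Usage on tty2 of the installer, after the disks have been formatted:
if has_by_uuid_links /mnt; then
    echo "by-uuid symlinks present under /mnt/dev"
else
    echo "by-uuid symlinks missing; consider: mount -o bind /dev /mnt/dev"
fi
```

If the second message appears, the copy of the device nodes most likely ran before udev had created the symlinks, which would match the timing theory above.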
Comment 8 Herbert Poetzl 2012-05-05 18:35:44 CEST
(In reply to comment #7)
> Right, so the only thing I can think of here is a timing issue. It seems that
> rather than bind mounting /dev we actually copy the device nodes across.

> As udev is responsible for creating the symlinks needed by dracut, if it is
> lagging behind, we might do the copying before the symlinks are ready. If this
> is the case we get a failure.

that certainly explains a lot ...

> I was able to do a qemu install that worked so it's certainly intermittent.

> As a workaround until this is fixed properly (and I'd really appreciate the
> extra tests), can you do this during the install:
 
>  1. After formatting the disks, switch to tty2 (in VirtualBox rightctl+f2, in
> qemu ctrl-alt-2 -> "sendkey ctrl-alt-f2" -> ctrl-alt-1 : please tell me if
> there is a better way as I'm a qemu noob!)

you can add -monitor stdio, which will give you the monitor on the terminal,
but by default, qemu/kvm doesn't do magic keyboard translations although
it can be configured to do so

>  2. Type: mount -o bind /dev /mnt/dev
>  3. Switch back to tty7 and continue install as before

> If you could test this in your RAID setup bug too that would be awesome 
> as I suspect they are actually the same problem.

will give it a try

> I'll change the copy to a bind mount in drakx and we can respin some 
> ISOs after RC but before final to give this a bit more testing.

thanks

CC: (none) => herbert

Comment 9 Pascal Terjan 2012-05-05 22:58:07 CEST
I think we should probably use devtmpfs during install and just mount it twice (but that's way too late to do such a change for mageia 2)
Comment 10 Thierry Vignaud 2012-05-06 02:46:39 CEST
hint: grep devtmpfs install/install2.pm
Comment 11 Colin Guthrie 2012-05-06 11:32:37 CEST
We appear to do that already as Titi said.

I went for a bind mount (just because it's a standard trick in chroots), but using devtmpfs would work equally well AFAIUI. As the fix I did was next to two other mounts for sysfs and proc, perhaps the devtmpfs would be more appropriate?

Some quick tests would indicate this would work fine. Should I make that change?

As a question: should we favour mounting such filesystems like this twice over bind mounts? 

For example in the systemd-nspawn source it bind mounts /proc and /sys and then makes them read only (it's a two stage thing with bind mounts and I'm not even sure it fully works as I tried doing readonly bind mounts a while back without success, but that's another story :D).

With /dev, systemd-nspawn is a bit more custom and hides most h/w nodes which doesn't apply in our case.

But should we consider using the same bind mounting as a general rule here?

So, I guess the question is: Use all bind mounts or mount the same special fs's twice?
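
The two styles under discussion can be put side by side. This is a sketch only, not the drakx code: the function names and the `DRY_RUN` switch are made up for illustration, /mnt is assumed to be the target root, and by default the commands are printed rather than executed.

```shell
#!/bin/sh
# Print commands by default; set DRY_RUN=0 (as root) to really execute them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Style 1: bind-mount the installer's existing mounts into the target root.
mount_bind_style() {
    target=$1
    for fs in dev proc sys; do
        run mount --bind "/$fs" "$target/$fs" || return 1
    done
}

# Style 2: mount a second instance of each special filesystem by type.
mount_direct_style() {
    target=$1
    run mount -t devtmpfs devtmpfs "$target/dev" &&
    run mount -t proc proc "$target/proc" &&
    run mount -t sysfs sysfs "$target/sys"
}

mount_bind_style /mnt
mount_direct_style /mnt
```

In practice only one of the two functions would be called for a given target.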
Comment 12 Herbert Poetzl 2012-05-07 03:46:31 CEST
(In reply to comment #11)

> As a question: should we favour mounting such filesystems like this 
> twice over bind mounts?

there should be no difference from VFS PoV
 
> For example in the systemd-nspawn source it bind mounts /proc and /sys 
> and then makes them read only (it's a two stage thing with bind mounts 
> and I'm not even sure it fully works as I tried doing readonly bind 
> mounts a while back without success, but that's another story :D).

yes, they didn't work for many years (changes to the ro state got
silently ignored) but the kernel eventually got there, making it a
two stage process (i.e. first the bind mount, then the remount,ro)
or if you are concerned about security, a three stage process where
you basically do the bind mount outside the security critical path
and then move mount it into place
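
The two-stage process can be sketched as follows. The helper name `bind_ro` and the `DRY_RUN` switch are hypothetical, the paths are examples, and the commands are only printed unless `DRY_RUN=0` is set by root. The remount form follows the one documented in mount(8).

```shell
#!/bin/sh
# Print commands by default; set DRY_RUN=0 (as root) to really execute them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: read-only bind mount of <src> onto <dst> in two stages.
bind_ro() {
    # Stage 1: plain bind mount (an ro flag given here is not enough on its own).
    run mount --bind "$1" "$2" &&
    # Stage 2: remount the bind mount read-only.
    run mount -o remount,bind,ro "$1" "$2"
}

bind_ro /proc /mnt/proc
```

The three-stage variant mentioned above would perform these steps on a private staging directory and then relocate the result with `mount --move`.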

> With /dev, systemd-nspawn is a bit more custom and hides most h/w 
> nodes which doesn't apply in our case.

> But should we consider using the same bind mounting as a general 
> rule here?

> So, I guess the question is: Use all bind mounts or mount the same 
> special fs's twice?

IMHO a mount of a virtual filesystem like proc or sys is more intuitive
than a bind mount and the vfs results are the same there, but there 
will be a difference for e.g. tmpfs.
Comment 13 Thierry Vignaud 2012-05-07 08:51:30 CEST
There's no reason to make them RO.
We've mounted them in the chroot for years RW
Comment 14 Colin Guthrie 2012-05-07 10:10:44 CEST
@TV that wasn't really the question I was asking, which was more a stylistic one: should we always use bind mounts, or should we always mount the VFS type directly when possible?

Personally I'm much more in favour of the bind mount approach, but that's just because it feels more appropriate rather than any technical reason.
Comment 15 Marja Van Waes 2012-05-26 13:04:29 CEST
Hi,

This bug was filed against cauldron, but we do not have cauldron at the moment.

Please report whether this bug is still valid for Mageia 2.

Thanks :)

Cheers,
marja

Keywords: (none) => NEEDINFO

Comment 16 Marja Van Waes 2012-08-04 15:31:04 CEST

Did this get fixed?

CC: (none) => marja11

Comment 17 Colin Guthrie 2012-08-05 12:29:00 CEST
It's hard to judge whether it's fully fixed, but there is definitely still a problem with the mageia-theme package: when it is installed, it generates an initrd during the install rather than suppressing initrd generation until the end, when all packages are installed. I'll push a new version of this shortly to updates_testing, but really there is little to test until it's available to a net install. See bug #6692.

If someone can still reproduce this bug, then please try the work around noted in the bug above.
Marja Van Waes 2012-08-26 15:37:30 CEST

Keywords: NEEDINFO => (none)

Comment 18 Thierry Vignaud 2012-09-03 16:26:06 CEST
Colin, I think we can close this one as FIXED now, WDYT?
Comment 19 Colin Guthrie 2012-09-03 16:34:13 CEST
Well the theme package that was updated post mga2 release did break things again as per my comment above. I'm happy to either open another bug for tracking that tho' as the original issues for this were certainly not related to that (it was only introduced after initial QA).
Manuel Hiebel 2013-07-22 20:53:51 CEST

Version: Cauldron => 3

Comment 20 Samuel Verschelde 2013-08-27 15:53:00 CEST
(In reply to Colin Guthrie from comment #19)
> Well the theme package that was updated post mga2 release did break things
> again as per my comment above. I'm happy to either open another bug for
> tracking that tho' as the original issues for this were certainly not
> related to that (it was only introduced after initial QA).

It would be better to open a new bug report if there's still a problem to be fixed. This bug report has become hard to read (and inactive).

CC: (none) => stormi

Comment 21 Dick Gevers 2014-11-16 22:15:40 CET
As per suggestion of stormi@laposte.net #c20 of 15 months ago, I am closing this bug and:

> It would be better to open a new bug report if there's still a problem to be 
> fixed. 

Anyone who feels that: please do.

Status: NEW => RESOLVED
Resolution: (none) => OLD

