Bug 15696 - dracut unable to boot from a degraded raid1 array
Summary: dracut unable to boot from a degraded raid1 array
Status: RESOLVED FIXED
Alias: None
Product: Mageia
Classification: Unclassified
Component: RPM Packages
Version: Cauldron
Hardware: i586 Linux
Priority: Normal major
Target Milestone: ---
Assignee: Colin Guthrie
 
Reported: 2015-04-14 13:49 CEST by Alejandro Vargas
Modified: 2015-06-03 23:43 CEST

Source RPM: dracut

Description Alejandro Vargas 2015-04-14 13:49:28 CEST
Before Mageia 5, I was able to install Mageia on two disks in RAID 1. With that setup, I only needed to run "grub2-install /dev/sdb" manually to be able to boot from either of the two hard disks.

Now in MGA5 cauldron (the latest release so far), dracut does not start the RAID 1 arrays in degraded mode and the system does not boot, which makes the RAID 1 installation useless.

There is also another problem, present since MGA4: dracut refuses to boot if it can't find the swap partition (i.e. reformatting it changes the UUID and the system can't boot any more, a very annoying problem in a "newbie friendly" distro).

How reproducible:

With VirtualBox, create a virtual machine with 2 hard disks of the same size.
In the installer, go to personalized partitioning, enable expert mode and in sda:

1) create a partition STARTING AT SECTOR 2048 (important), 1024 MB in size, of type Linux Raid
2) create a partition using the rest of the disk space (size=9999999999), of type Linux Raid
3) "add to raid" the first partition to md0
4) "add to raid" the second partition to new raid - md1

5) repeat the previous steps for sdb

6) In the RAID tab: select type "Linux Swap" for md0.
7) In the RAID tab: select type ext4 or btrfs for md1 and set the mount point to /.

8) Then install the system and ensure you use grub2. Starting the first partition at sector 2048 leaves room for the grub2 modules.

9) after the system is installed and started, execute grub2-install /dev/sdb to write grub to both disks.

10) stop the machine (virtual or physical), remove one of the disks (no matter which one) and boot.
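
Step 9 can be sketched as a small helper that runs grub2-install once per RAID member disk. This is only an illustration: the function names and the dry_run switch are hypothetical, and the device names /dev/sda and /dev/sdb are assumed from the two-disk setup described above. By default it only prints the commands; actually running them requires root.

```python
import subprocess


def grub_install_commands(disks):
    """Build one grub2-install command line per disk."""
    return [["grub2-install", disk] for disk in disks]


def install_on_all(disks, dry_run=True):
    """Print each command, and run it only when dry_run is False (needs root)."""
    for command in grub_install_commands(disks):
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)


# Assumed device names, taken from the two-disk setup in this report.
install_on_all(["/dev/sda", "/dev/sdb"])
```

Calling install_on_all(..., dry_run=False) would execute the commands, so that either disk carries a boot loader when the other one is removed.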

Result: the system does not boot because the RAID arrays are in the inactive state

dracut:/# cat /proc/mdstat
Personalities : 
md1 : inactive sda5[1](S)
      15717208 blocks super 1.0

md0 : inactive sda1[1](S)
      1042624 blocks super 1.0


What is expected: In previous Mageia versions, the system booted normally with a degraded RAID.

[root@server2 anv]# cat /proc/mdstat
Personalities : [raid1] 
md1 : active raid1 sda5[0]
      15717208 blocks super 1.2 [2/1] [U_]

md0 : active raid1 sda1[0]
      1042624 blocks super 1.2 [2/1] [U_]
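
The difference between the two /proc/mdstat listings above can be checked mechanically. The following is a minimal sketch, assuming the line layout shown in this report; the function name and the labels "inactive", "degraded" and "healthy" are made up for illustration and are not part of any mdadm or dracut tool.

```python
import re

# Matches array header lines such as "md1 : active raid1 sda5[0]"
# or "md1 : inactive sda5[1](S)".
ARRAY_LINE = re.compile(r"^(md\d+) : (active|inactive)\b")
# Matches the member status field such as "[2/1] [U_]".
STATUS_FIELD = re.compile(r"\[(\d+)/(\d+)\] \[([U_]+)\]")


def classify_arrays(mdstat_text):
    """Return {array_name: state}; state is 'inactive', 'degraded' or 'healthy'."""
    states = {}
    current = None
    for line in mdstat_text.splitlines():
        header = ARRAY_LINE.match(line)
        if header:
            current = header.group(1)
            # An active array is assumed healthy until a status field says otherwise.
            states[current] = "inactive" if header.group(2) == "inactive" else "healthy"
            continue
        status = STATUS_FIELD.search(line)
        if status and current and states[current] != "inactive":
            expected, working = int(status.group(1)), int(status.group(2))
            if working < expected:
                states[current] = "degraded"
    return states


# The two listings from this report, copied as sample input.
BROKEN = """\
Personalities :
md1 : inactive sda5[1](S)
      15717208 blocks super 1.0

md0 : inactive sda1[1](S)
      1042624 blocks super 1.0
"""

EXPECTED = """\
Personalities : [raid1]
md1 : active raid1 sda5[0]
      15717208 blocks super 1.2 [2/1] [U_]

md0 : active raid1 sda1[0]
      1042624 blocks super 1.2 [2/1] [U_]
"""

print(classify_arrays(BROKEN))    # {'md1': 'inactive', 'md0': 'inactive'}
print(classify_arrays(EXPECTED))  # {'md1': 'degraded', 'md0': 'degraded'}
```

In the failing case both arrays are reported as inactive, while in the expected case they are active but degraded ([2/1], one of two members present).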

You can add a new hard disk even without rebooting if you can hot-plug the disks.

A similar problem started in Mageia 4 with the swap partition. This is why I now place swap on an md device, although the best approach is to use sda1 and sdb1 as swap directly, because the kernel can use both and balance the load.

I found a similar report for RedHat: https://bugzilla.redhat.com/show_bug.cgi?id=772926

Comment 1 David Walser 2015-04-14 15:03:37 CEST
The issue of not being able to boot when swap can't be found is Bug 12305. Leave this bug for the RAID issue only.

Assignee: bugsquad => mageia
Summary: dracut unable to boot from a degraded raid1 array (or with missing swap partition) => dracut unable to boot from a degraded raid1 array

Thierry Vignaud 2015-04-14 15:54:54 CEST

Source RPM: (none) => dracut

Comment 2 Colin Guthrie 2015-04-14 16:25:29 CEST
This smells like grub's fault.

Does the same thing happen when you have a separate /boot (mirrored to each disk manually, e.g. via rsync) that is not part of a RAID set? That way grub does not have to do any magic to read the kernel/initramfs from a RAID in the first place, leaving only dracut to worry about that.

Failing that theory, can you test with the dracut version from mga4 to see if it fares better?
Comment 3 Alejandro Vargas 2015-04-14 16:33:45 CEST
I can try if you want, but I'm pretty sure it will fail the same way, because the problem is that it is not starting the degraded arrays.

Grub and Grub2 have been perfectly able to read RAID 1 devices for many years, and I have installed many servers with this scheme. In the years when the Mandriva/Mageia installer refused to install /boot on RAID devices, I converted it to RAID after installation.
Comment 4 Thomas Backlund 2015-06-03 21:37:05 CEST
I can reproduce this one, and it's a regression since mga4.

Using the mga4 dracut on mga5 allows the system to boot in degraded mode.

CC: (none) => tmb

Comment 5 Thomas Backlund 2015-06-03 21:38:38 CEST
And I use legacy grub, and a separate /boot that is not on RAID
Comment 6 Thomas Backlund 2015-06-03 23:31:27 CEST
And I found the upstream fix for this submitted by Neil Brown...

So I will push a fixed dracut to cauldron/mga5
Comment 7 Thomas Backlund 2015-06-03 23:43:02 CEST
Fix pushed in: dracut-038-18.mga5

Status: NEW => RESOLVED
Resolution: (none) => FIXED

