Bug 31011 - kernel 5.15.74-desktop-1.mga8 boot fails to dracut shell on one of four machines
Summary: kernel 5.15.74-desktop-1.mga8 boot fails to dracut shell on one of four machines
Status: RESOLVED INVALID
Alias: None
Product: Mageia
Classification: Unclassified
Component: RPM Packages
Version: 8
Hardware: x86_64 Linux
Priority: Normal
Severity: normal
Target Milestone: ---
Assignee: Mageia Bug Squad
Reported: 2022-10-24 04:37 CEST by Rolf Pedersen
Modified: 2022-10-25 03:30 CEST

Attachments
dracut report and debug report in archive (25.82 KB, application/x-xz)
2022-10-24 04:37 CEST, Rolf Pedersen

Description Rolf Pedersen 2022-10-24 04:37:10 CEST
Created attachment 13443
dracut report and debug report in archive

Howdy.  First, some context:

I have 4 machines on the LAN running up-to-date 64-bit Mageia for some years with few challenges.  Latest kernel update is normal for the following three:

Two are Wyse thin clients: one runs a webserver, the other mostly displays a surveillance camera.  Some properties:

[rolf@d90d7 ~]$ uname -r
5.15.74-desktop-1.mga8
[rolf@d90d7 ~]$ cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-5.15.74-desktop-1.mga8 root=UUID=f8166285-e50a-4d90-a6f1-9a3c29453ce8 ro verbose noiswmd noresume audit=0
[rolf@d90d7 ~]$ cat /etc/release
Mageia release 8 (Official) for x86_64

[rolf@d90d7 ~]$ uname -r
5.15.74-desktop-1.mga8
[rolf@d90d7 ~]$ cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-5.15.74-desktop-1.mga8 root=UUID=f8166285-e50a-4d90-a6f1-9a3c29453ce8 ro verbose noiswmd noresume audit=0
[rolf@d90d7 ~]$ cat /etc/release
Mageia release 8 (Official) for x86_64

The third is my primary workstation, ROG STRIX X570-I GAMING, Ryzen 7 5700G, etc:

[rolf@x570i ~]$ uname -r
5.15.74-desktop-1.mga8
[rolf@x570i ~]$ cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-5.15.74-desktop-1.mga8 root=UUID=392cbe8f-abfc-4dd9-b7d6-c8c21b6294fd ro verbose noiswmd resume=/dev/nvme0n1p2 audit=0 cx23885.debug=8
[rolf@x570i ~]$ cat /etc/release
Mageia release 8 (Official) for x86_64

The problem is on my former workstation (Z170I PRO GAMING, Intel Core i5-6500, etc.), which for now has to be booted to the previous kernel, which works:

[rolf@z170i boot]$ uname -r
5.15.65-desktop-1.mga8
[rolf@z170i boot]$ cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-5.15.65-desktop-1.mga8 root=UUID=eabf32a7-39a6-4e78-af5d-841e55d30069 ro splash quiet noiswmd audit=0 intel_idle.max_cstate=1
[rolf@z170i boot]$ cat /etc/release
Mageia release 8 (Official) for x86_64

Deeper background on this machine: I recently used gparted to copy the OS partitions from an m.2 card in a sata enclosure, plus /var from a little sata card plugged directly into a sata connector on the motherboard, and pasted them all onto a single sata m.2, which now holds the OS.  I edited the fstab entries that mounted /dev/sd[a-b] partitions to call on /dev/nvme??? instead.  When that failed, I stumbled on booting a rescue session from an MGA8 installation key and "re-installing bootloader".  That was sufficient for the previous kernel.  Installing the new kernel resulted in a dracut shell advising, for example:
"dracut could not boot, /dev/sdb4 does not exist"
and that "rdsosreport.txt" had been written under /run and that I might want to copy it somewhere.  There was also the option of booting with a kernel command line parameter, roughly "rd.debug", for more verbosity.  I collected both reports; they are in the attached xz archive.  The debug report is 2M.

While I was there, I mounted and chrooted into the filesystem to find that /boot/EFI partition refused to mount:
"wrong fs type, bad option, bad superblock, codepage not found". The fstab entry that boots with previous kernel:
UUID=3C14-DB71 /boot/EFI vfat defaults,umask=000 0 0
Booted to the older kernel, I used the MCC "setup boot" module to remove the "resume" directive, which pointed to sdb4, the previous swap partition.  It's no longer in /proc/cmdline, but it still shows up in rdsosreport.txt.
I tried regenerating initrd:
[root@z170i boot]# dracut --force /boot/initrd-5.15.74-desktop-1.mga8.img 5.15.74-desktop-1.mga8
I believe each of the several subsequent reconfigurations with "setup boot" in MCC would have re-run the bootloader installer.

In summation, the flag on sdb4 is distracting, but the failure to mount /boot/EFI under the new kernel seems the more important problem to me.  Again, the fstab entry above works with 5.15.65-desktop-1.mga8.
Thanks.
Rolf Pedersen 2022-10-24 04:46:46 CEST

Summary: kernel 5.15.74-desktop-1.mga8 boot fails to dracut shell on one of three machines => kernel 5.15.74-desktop-1.mga8 boot fails to dracut shell on one of four machines

Comment 1 Rolf Pedersen 2022-10-24 18:59:05 CEST
Time to leave for work so I am making a note while I have a record.

Strategy was to remove kernel|-devel|-latest packages and re-install them.

Watching boot messages, I can see kernel is running and recognizing devices.

However, the end result is the same: "dracut Warning:  Could not boot.  /dev/sdb4 does not exist"

Further to this, while re-installing these packages, I noticed:

Creating: target|kernel|dracut args|basicmodules 
/etc/dracut.conf.d/51-mageia-resume.conf:add_device+="/dev/sdb4"
You should restart your computer for kernel-desktop-5.15.74-1.mga8
[root@z170i boot]# rpm -qf /etc/dracut.conf.d/51-mageia-resume.conf
dracut-051-4.mga8

Noting this so I can return to it: I wonder where that 'sdb4' directive originates, and will start hammering away. ;)
Thanks.
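
One way to chase down where a stale device such as sdb4 comes from is to list every device named by add_device in the dracut config directory and flag the ones that no longer exist.  This is only a rough sketch: the function name list_missing_devices is made up for illustration, and it assumes the entries look like the add_device+="/dev/sdb4" line shown above and that grep supports -o.

```shell
#!/bin/sh
# Sketch: list device paths named by add_device in dracut config files
# and report the ones that are not present as block devices right now.
# list_missing_devices is a hypothetical helper, not part of dracut.

list_missing_devices() {
    confdir="${1:-/etc/dracut.conf.d}"
    # Collect add_device lines, then pull out every /dev/... path in them.
    grep -h '^[[:space:]]*add_device' "$confdir"/*.conf 2>/dev/null |
        grep -o '/dev/[^" ]*' |
        while read -r dev; do
            # -b is true only if the path exists and is a block device.
            [ -b "$dev" ] || echo "missing: $dev"
        done
}

# Example: list_missing_devices /etc/dracut.conf.d
```

If a path is reported as missing, rpm -qf on the file that names it (as done above) shows which package owns that file.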
Comment 2 Thomas Backlund 2022-10-24 20:58:59 CEST

if your system uses a sata m.2, it does not use /dev/nvme*
Comment 3 Thomas Backlund 2022-10-24 21:00:05 CEST
and according to 

/etc/dracut.conf.d/51-mageia-resume.conf:add_device+="/dev/sdb4"


you have (had?) swap on /dev/sdb4
Comment 4 Rolf Pedersen 2022-10-24 21:38:05 CEST
(In reply to Thomas Backlund from comment #2)
> 
> if your system uses a sata m.2, it does not use /dev/nvme*

Sorry, I confused terms.  It's nvme and partitions are labeled as such.
Comment 5 Rolf Pedersen 2022-10-24 21:43:08 CEST
(In reply to Thomas Backlund from comment #3)
> and according to 
> 
> /etc/dracut.conf.d/51-mageia-resume.conf:add_device+="/dev/sdb4"
> 
> 
> you have (had?) swap on /dev/sdb4

Yes, previous storage devices were sata-connected and that was swap.  I do have swap on the nvme card, still, but no need for resume, afaict, as I don't suspend or sleep this machine.  I could delete 51-mageia-resume.conf, I suppose, but wonder if it is going to be restored by some configuration I can't find, atm.
Thanks.
Comment 6 Thomas Backlund 2022-10-24 21:49:49 CEST
(In reply to Rolf Pedersen from comment #5)
> (In reply to Thomas Backlund from comment #3)
> > and according to 
> > 
> > /etc/dracut.conf.d/51-mageia-resume.conf:add_device+="/dev/sdb4"
> > 
> > 
> > you have (had?) swap on /dev/sdb4
> 
> Yes, previous storage devices were sata-connected and that was swap.  I do
> have swap on the nvme card, still, but no need for resume, afaict, as I
> don't suspend or sleep this machine.  I could delete 51-mageia-resume.conf,
> I suppose, but wonder if is going to be restored by some configuration I
> can't find, atm.
> Thanks.

nope, it's only added by drakx installer.

after you remove the file, recreate the initrd so it gets removed from there too...
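
The two steps above (remove the stale config, then recreate the initrd) could be sketched roughly as follows.  The helper names run and drop_stale_resume, the DRY_RUN switch, and the /root backup location are assumptions made for illustration; the dracut invocation is the same one used in the description.  With DRY_RUN left at its default of 1, the script only prints what it would do.

```shell
#!/bin/sh
# Sketch: retire a stale dracut resume config and rebuild the initrd.
# run and drop_stale_resume are hypothetical helpers.
# DRY_RUN=1 (the default here) only prints the commands.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

drop_stale_resume() {
    kver="$1"
    conf="${2:-/etc/dracut.conf.d/51-mageia-resume.conf}"
    # Move the file aside instead of deleting it (backup path is an assumption).
    run mv "$conf" "/root/$(basename "$conf").bak"
    # Regenerate the initrd so the stale device is dropped from it too.
    run dracut --force "/boot/initrd-$kver.img" "$kver"
}

# Example: drop_stale_resume 5.15.74-desktop-1.mga8
```

To actually apply the changes, run it as root with DRY_RUN=0 after checking the printed commands.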
Comment 7 Rolf Pedersen 2022-10-25 03:30:13 CEST
(In reply to Thomas Backlund from comment #6)
> (In reply to Rolf Pedersen from comment #5)
> > (In reply to Thomas Backlund from comment #3)
> > > and according to 
> > > 
> > > /etc/dracut.conf.d/51-mageia-resume.conf:add_device+="/dev/sdb4"
> > > 
> > > 
> > > you have (had?) swap on /dev/sdb4
> > 
> > Yes, previous storage devices were sata-connected and that was swap.  I do
> > have swap on the nvme card, still, but no need for resume, afaict, as I
> > don't suspend or sleep this machine.  I could delete 51-mageia-resume.conf,
> > I suppose, but wonder if is going to be restored by some configuration I
> > can't find, atm.
> > Thanks.
> 
> nope, it's only added by drakx installer.
> 
> after you remove the file, recreate the initrd so it gets removed from there
> too...

Done and done.  Ok, now.  Thanks!

Status: NEW => RESOLVED
Resolution: (none) => INVALID

