Bug 33719 - Kernel regression: unable to read some jfs volumes
Summary: Kernel regression: unable to read some jfs volumes
Status: RESOLVED FIXED
Alias: None
Product: Mageia
Classification: Unclassified
Component: RPM Packages
Version: 9
Hardware: All Linux
Priority: Normal
Severity: major
Target Milestone: ---
Assignee: Kernel and Drivers maintainers
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on: 33775
Blocks:
Reported: 2024-11-05 09:06 CET by Marc Krämer
Modified: 2024-11-22 18:21 CET (History)
2 users

See Also:
Source RPM: kernel-server-6.6.58-2.mga9.x86_64
CVE:
Status comment:


Attachments

Description Marc Krämer 2024-11-05 09:06:48 CET
Running kernel 6.6.58-2, some LVM volumes could not be mounted:

mount: /mnt/backuplog: wrong fs type, bad option, bad superblock on /dev/mapper/server-backuplog, missing codepage or helper program, or other error.
       dmesg(1) may have more information after failed mount system call.

Running dmesg does not show any hint, and neither does the journal.

Booting kernel 6.6.52-server-1.mga9 everything could be mounted.

jfs_fsck -f /dev/mapper/server-backuplog
does not show any errors.


Since this is a production machine, which now runs 6.6.52 again, I can't test much here.
Comment 1 Lewis Smith 2024-11-06 08:40:07 CET
You seem to have tried and looked at obvious things.
"Booting kernel 6.6.52-server-1.mga9 everything could be mounted" is clear enough.

To give more info about the disc setup, please post the output of:
 $ inxi -DLop
and clarify "some lvm-volumes could not be mounted": which? Some are OK?

CC: (none) => lewyssmith

Comment 2 Marc Krämer 2024-11-06 10:02:17 CET
# inxi -DLop
Logical:
  Device-1: VG: server type: LVM2 size: 1.75 TiB free: 1.65 TiB
  LV-1: HiDrive type: linear size: 10 GiB Components: c-1: md1
  LV-2: backuplog type: linear size: 16 GiB Components: c-1: md1
  LV-3: brickconfig type: linear size: 500 MiB Components: c-1: md1
  LV-4: mrtg type: linear size: 204 MiB Components: c-1: md1
  LV-5: mysql type: linear size: 15 GiB Components: c-1: md1
  LV-6: mysqllog type: linear size: 15 GiB Components: c-1: md1
  LV-7: oldbackuplog type: linear size: 16 GiB Components: c-1: md1
  LV-8: oldvarlog type: linear size: 4 GiB Components: c-1: md1
  LV-9: opt type: linear size: 10 GiB Components: c-1: md1
  LV-10: root type: linear size: 10 GiB Components: c-1: md1
  LV-11: varlog type: linear size: 4 GiB Components: c-1: md1
Drives:
  Local Storage: total: raw: 3.49 TiB usable: 1.75 TiB used: 38.75 GiB (2.2%)
  ID-1: /dev/nvme0n1 vendor: Samsung model: MZQL21T9HCJR-00A07
    size: 1.75 TiB
  ID-2: /dev/nvme1n1 vendor: Samsung model: MZQL21T9HCJR-00A07
    size: 1.75 TiB
Partition:
  ID-1: / size: 9.96 GiB used: 4.35 GiB (43.7%) fs: jfs dev: /dev/dm-0
  ID-2: /boot size: 198.7 MiB used: 81.2 MiB (40.9%) fs: jfs dev: /dev/md0
  ID-3: /mnt/backuplog size: 15.94 GiB used: 9.75 GiB (61.2%) fs: jfs
    dev: /dev/dm-9
  ID-4: /mnt/bricks/config size: 497.8 MiB used: 171.5 MiB (34.5%) fs: jfs
    dev: /dev/dm-1
  ID-5: /mnt/bricks/mrtg size: 202.8 MiB used: 134.7 MiB (66.4%) fs: jfs
    dev: /dev/dm-2
  ID-6: /mnt/config size: 497.8 MiB used: 171.5 MiB (34.5%) fs: fuse.sfs
    source: ERR-102
  ID-7: /mnt/hidrive size: 9.96 GiB used: 4.96 GiB (49.8%) fs: jfs
    dev: /dev/dm-7
  ID-8: /mnt/mrtg size: 202.8 MiB used: 134.7 MiB (66.4%) fs: fuse.sfs
    source: ERR-102
  ID-9: /opt size: 9.96 GiB used: 7.18 GiB (72.0%) fs: jfs dev: /dev/dm-4
  ID-10: /var/lib/mysql size: 14.94 GiB used: 9.31 GiB (62.3%) fs: jfs
    dev: /dev/dm-5
  ID-11: /var/lib/mysql-log size: 14.94 GiB used: 1.59 GiB (10.6%) fs: jfs
    dev: /dev/dm-6
  ID-12: /var/log size: 3.98 GiB used: 1.07 GiB (26.9%) fs: jfs
    dev: /dev/dm-10
Unmounted:
  ID-1: /dev/dm-3 size: 4 GiB fs: jfs
  ID-2: /dev/dm-8 size: 16 GiB fs: jfs


With kernel 6.6.58, volumes LV-7 and LV-8 were not mountable.
I booted kernel 6.6.52, created volumes LV-2 and LV-11, and renamed the old ones.

Booting the newer kernel again failed to mount any of them, while kernel 6.6.52 can read all of them without error.

The superblock, e.g. of LV-9, which is mountable in both kernels, does not look very different from the others:

# jfs_tune -l /dev/server/opt
jfs_tune version 1.1.15, 04-Mar-2011

JFS filesystem superblock:

JFS magic number:	'JFS1'
JFS version:		1
JFS state:		mounted
JFS flags:		JFS_LINUX  JFS_COMMIT  JFS_GROUPCOMMIT  JFS_INLINELOG  
Aggregate block size:	4096 bytes
Aggregate size:		20888552 blocks
Physical block size:	512 bytes
Allocation group size:	32768 aggregate blocks
Log device number:	0xfc04
Filesystem creation:	Mon Nov 13 15:13:00 2023
Volume label:		''


# jfs_tune -l /dev/server/oldbackuplog 
jfs_tune version 1.1.15, 04-Mar-2011

JFS filesystem superblock:

JFS magic number:	'JFS1'
JFS version:		1
JFS state:		clean
JFS flags:		JFS_LINUX  JFS_COMMIT  JFS_GROUPCOMMIT  JFS_INLINELOG  
Aggregate block size:	4096 bytes
Aggregate size:		33421928 blocks
Physical block size:	512 bytes
Allocation group size:	32768 aggregate blocks
Log device number:	0xfc08
Filesystem creation:	Mon Nov 13 15:12:40 2023
Volume label:		''
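To spot any field that actually differs between a mountable and a non-mountable volume, the two superblock listings above can be diffed directly. A sketch, using the device paths from this report; it requires root on the affected host, so it cannot be run elsewhere:

```shell
# Compare the JFS superblocks of a mountable (opt) and a non-mountable
# (oldbackuplog) volume side by side. Fields such as state, aggregate size,
# log device number, and creation time are expected to differ anyway;
# anything else that shows up would be a real clue.
diff <(jfs_tune -l /dev/server/opt) <(jfs_tune -l /dev/server/oldbackuplog)
```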
Comment 3 Morgan Leijström 2024-11-06 11:10:17 CET
Does our kernel-linus show the same problem?
I am asking because it is less patched by us.

Assignee: bugsquad => kernel
CC: (none) => fri
Severity: normal => major
Summary: Kernel: unable to read some jfs volumes => Kernel regression: unable to read some jfs volumes

Comment 4 Marc Krämer 2024-11-06 11:12:00 CET
I have to schedule that test, since it is a database server and it is in production use. I'll try that in the next days.
Comment 5 Morgan Leijström 2024-11-11 09:40:46 CET
Giuseppe, I guess you already see this as a member of the kernel team?

Anyway, do you maybe have some idea of what more Marc can test when he is able to?
- Or can you even guess what the problem may be, and have a testing kernel ready?

CC: (none) => ghibomgx

Comment 6 Giuseppe Ghibò 2024-11-11 15:28:25 CET
Apparently our kernel@ml... stopped working in mid-June 2024, at least according to the archives (which are also missing pre-2016 ones).

For JFS, I tried a local VM, creating an LV with a JFS partition, and it works, so it's something more subtle.

Upstream kernels 6.6.54, and in particular 6.6.55, had several fixes to JFS, which might be involved.

I'd guess the -linus variant won't help in this case.

There will be a 6.6.61 (not yet out upstream), which includes an extra fix for jfs (the fix landed upstream in 6.6.59). For mga there will be an intermediate 6.6.60-1.mga9 in updates_testing before 6.6.61-1.mga9.
Comment 7 Morgan Leijström 2024-11-11 16:45:40 CET
(In reply to Giuseppe Ghibò from comment #6)
> Apparently our kernel@ml... stopped working in mid-June 2024, at least
> according to the archives (which are also missing pre-2016 ones).

Thank you Giuseppe for the input on both this issue and the ml

-> I now opened
Bug 33748 - The kernel mailing list seem broken since mid June 

And I see kernel-6.6.60-2.mga9 is building already!
Please say if you think it is worth testing for this issue.
Comment 8 Marc Krämer 2024-11-11 18:35:11 CET
When 6.6.60 is ready I can give it a try.

I have some servers running 6.6.58 with jfs, but on 2 systems mounting all volumes was not possible. So I hope the newer kernel will fix the issue :)
Comment 9 Marc Krämer 2024-11-12 09:51:11 CET
Is there some option I can enable, in case it fails, to get more information about the reason? The latest try did not give any information in the logs or in dmesg.
Comment 10 Giuseppe Ghibò 2024-11-12 16:51:28 CET
Beware when you are dealing with partitions on a production system...

For logs you might get more verbosity by appending "debug loglevel=7" to the kernel command line; alternatively, you can pass more advanced parameters to /sys/kernel/debug/dynamic_debug/control.

Another option is to strace a manual mount.
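The three approaches above could look like this. A sketch, not tested on the affected machine; the `module jfs +p` match for dynamic_debug is an assumption (any file or module pattern can be used), and it requires a kernel built with CONFIG_DYNAMIC_DEBUG and debugfs mounted:

```shell
# 1. More verbose kernel logging: append to the kernel command line at boot
#    (e.g. by editing the linux line in the GRUB menu):
#      debug loglevel=7

# 2. Enable dynamic debug prints for the jfs module at runtime:
echo 'module jfs +p' > /sys/kernel/debug/dynamic_debug/control

# 3. Trace the failing mount system call directly:
strace -f -e trace=mount mount /dev/mapper/server-backuplog /mnt/backuplog
dmesg | tail    # then check for any new kernel messages
```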
katnatek 2024-11-17 18:16:10 CET

Depends on: (none) => 33775

Comment 11 katnatek 2024-11-18 19:56:55 CET
Please give your feedback in bug#33775
Lewis Smith 2024-11-18 20:20:01 CET

CC: lewyssmith => (none)

Comment 12 Marc Krämer 2024-11-20 17:51:55 CET
kernel 6.6.61 fixed this issue!
Comment 13 katnatek 2024-11-22 18:21:37 CET
(In reply to Marc Krämer from comment #12)
> kernel 6.6.61 fixed this issue!

Closing

Resolution: (none) => FIXED
Status: NEW => RESOLVED

