Bug 9832 - os-prober tries to mount extended partition
Summary: os-prober tries to mount extended partition
Status: RESOLVED FIXED
Alias: None
Product: Mageia
Classification: Unclassified
Component: RPM Packages
Version: Cauldron
Hardware: x86_64 Linux
Priority: Normal
Severity: normal
Target Milestone: ---
Assignee: Barry Jackson
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on: 15579
Blocks: 416
Reported: 2013-04-23 01:36 CEST by Richard Neill
Modified: 2016-06-22 02:57 CEST

See Also:
Source RPM: os-prober-1.65-6.mga5.src.rpm
CVE:
Status comment:


Description Richard Neill 2013-04-23 01:36:40 CEST
Running update-grub2 detects all my Linux kernels, and then hangs.
As a workaround, I edited /etc/grub.d/30_os-prober to immediately exit.

The root cause is that os-prober itself simply hangs, and never returns until I kill -9 it.

This is rather puzzling, since I have a very simple setup:

sda1 : Mageia root fs (ext4)
sda5 : swap
sda6 : empty ext4 partition for future use
sdb1 : /home (ext4)


The underlying process that is getting stuck is:

mount -o ro -t ext2 /dev/sda2 /var/lib/os-prober/mount

this is called by:

/usr/lib/os-probes/50mounted-tests.0007 /dev/sda2

and the underlying mount process cannot be terminated except by a reboot.


Marking this as critical, because it is preventing urpmi --auto-update from running to completion.


Comment 1 Thierry Vignaud 2013-04-23 01:52:11 CEST
What happens when you manually run "mount -o ro -t ext2 /dev/sda2 /var/lib/os-prober/mount"?

Keywords: (none) => NEEDINFO
CC: (none) => thierry.vignaud
Assignee: bugsquad => zen25000

Comment 2 Richard Neill 2013-04-23 02:30:51 CEST
> mount -o ro -t ext2 /dev/sda2 /var/lib/os-prober/mount

The mount process hangs. It never returns, and cannot even be kill-9'd.

The odd thing is, why would anything try to mount /dev/sda2, which is merely the container for /dev/sda5 and sda6 ?

dmesg shows this:

[ 2036.520345] INFO: task mount:22332 blocked for more than 120 seconds.
[ 2036.520345] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2036.520346] mount           D ffffffff8160c1c0     0 22332  22325 0x00000004
[ 2036.520348]  ffff88042bbd3c68 0000000000000082 ffff88042bbd3c18 ffff88042f028de0
[ 2036.520350]  ffff880416bd0000 ffff88042bbd3fd8 ffff88042bbd3fd8 ffff88042bbd3fd8
[ 2036.520352]  ffff88042d972d20 ffff880416bd0000 ffff88042bbd3cc8 ffff880428000400
[ 2036.520354] Call Trace:
[ 2036.520355]  [<ffffffff8144fe9f>] schedule+0x3f/0x60
[ 2036.520357]  [<ffffffff8144ecea>] __mutex_lock_slowpath+0xca/0x140
[ 2036.520359]  [<ffffffff8144e93a>] mutex_lock+0x2a/0x50
[ 2036.520360]  [<ffffffff811601ef>] mount_bdev+0x7f/0x200
[ 2036.520362]  [<ffffffff811d80b0>] ? ext2_error+0x130/0x130
[ 2036.520364]  [<ffffffff811526bd>] ? __kmalloc_track_caller+0x13d/0x190
[ 2036.520365]  [<ffffffff811d6a35>] ext2_mount+0x15/0x20
[ 2036.520367]  [<ffffffff81161103>] mount_fs+0x43/0x1b0
[ 2036.520369]  [<ffffffff81126710>] ? __alloc_percpu+0x10/0x20
[ 2036.520370]  [<ffffffff8117ac92>] vfs_kern_mount+0x72/0x110
[ 2036.520372]  [<ffffffff8117b4f4>] do_kern_mount+0x54/0x110
[ 2036.520373]  [<ffffffff8117cce4>] do_mount+0x1a4/0x850
[ 2036.520375]  [<ffffffff81120bab>] ? memdup_user+0x4b/0x90
[ 2036.520376]  [<ffffffff81120c4b>] ? strndup_user+0x5b/0x80
[ 2036.520378]  [<ffffffff8117d4d0>] sys_mount+0x90/0xe0
[ 2036.520379]  [<ffffffff81458e79>] system_call_fastpath+0x16/0x1b
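
The "D" after the task name in the trace above is the uninterruptible-sleep state, which is why kill -9 has no effect on the stuck mount. A minimal sketch for spotting such processes, assuming a procps-style ps; the function name filter_d_state is made up for this illustration:

```shell
# Hypothetical helper: read `ps -eo pid,stat,comm` output on stdin and
# print "PID COMMAND" for every process whose state starts with D
# (uninterruptible sleep). The first line is skipped as the header.
filter_d_state() {
    awk 'NR > 1 && $2 ~ /^D/ { print $1, $3 }'
}

# Usage on a live system:
#   ps -eo pid,stat,comm | filter_d_state
```

A process listed here is waiting inside the kernel, so signals stay pending until the wait ends.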
Comment 3 Thierry Vignaud 2013-04-23 04:13:23 CEST
Obviously os-prober should ignore (primary) extended partitions and only look at regular primary/logical partitions
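
One way such a check could look, as a sketch only and not what os-prober or the Fedora patches actually do: skip any partition whose MBR type code marks it as an extended container (0x05, 0x0f or 0x85). The function names are hypothetical, and the wrapper assumes that blkid's low-level probe reports a PART_ENTRY_TYPE value for the device.

```shell
# Return 0 if the given MBR partition type code denotes an extended
# (container) partition: 0x05 (CHS), 0x0f (LBA) or 0x85 (Linux extended).
is_extended_type() {
    case "$1" in
        0x5|0x05|0xf|0x0f|0xF|0x0F|0x85) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical wrapper: ask blkid for the type code of one partition.
# Assumption: `blkid -p` prints PART_ENTRY_TYPE for this device.
should_skip_partition() {
    ptype=$(blkid -p -o value -s PART_ENTRY_TYPE "$1" 2>/dev/null) || return 1
    is_extended_type "$ptype"
}

# Usage (needs read access to the device, typically root):
#   should_skip_partition /dev/sda2 && echo "skipping extended partition"
```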

Summary: update-grub2 hangs and never returns - caused by os-prober => update-grub2 hangs and never returns (os-prober stuck trying to mount extended partition)

Comment 4 Richard Neill 2013-04-23 04:38:39 CEST
Yes, that's true. If I run 
  /usr/lib/os-probes/50mounted-tests.0007 /dev/sda5    #or sda6
then it's fine: it returns 0 instantly.

However, I don't know if it's os-prober that is to blame here. I think the problem is that the mount call is locking up, rather than failing quickly on what should be an invalid request. It's invalid to try to mount /dev/sda2, and it's certainly invalid to mount it with "-t ext2".



Aside: there is something rather wrong with mount on this system. Experimenting a bit with my spare partition (empty /dev/sda6, ext4):

mount -t ext4 /dev/sda6 /spare   #works
mount -t ext2 /dev/sda6 /spare   #fails [1]

mount -o rw /dev/sda6 /spare     #works
mount -o ro /dev/sda6 /spare     #fails [2]

[1] is correct to fail, but the error message is wrong: "mount: /dev/sda6 is already mounted or /spare is busy"

[2] shouldn't fail, but it does. Furthermore, it gives the same error message as [1].


(it's kernel 3.3.8-desktop-2.mga2)
Comment 5 Richard Neill 2013-04-23 04:43:44 CEST
This is a kernel problem in 3.3.8.mga2. If I try all the tests above under 3.8.8.mga3, then everything works fine.

BUT... I think people are going to hit this bug when upgrading from Mga2: it will result in update-grub2 hanging, and in turn, making urpmi stall.

(Also, I can't run a later kernel than that at the moment, because the 3.8.8 kernel refuses to load the module for my graphics card, an intel i915!)
Comment 6 Thierry Vignaud 2013-04-23 05:14:23 CEST
Thomas, see comment #4 & #5

CC: (none) => tmb

Thierry Vignaud 2013-04-23 07:07:13 CEST

Blocks: (none) => 416

Comment 7 Barry Jackson 2013-05-03 14:45:59 CEST
(In reply to Richard Neill from comment #5)
> This is a kernel problem in 3.3.8.mga2. If I try all the tests above under
> 3.8.8.mga3, then everything works fine.
> 
> BUT... I think people are going to hit this bug when upgrading from Mga2: it
> will result in update-grub2 hanging, and in turn, making urpmi stall.

It is unlikely to hit upgraders as grub2 was not offered in Mga2.

> (Also, I can't run a later kernel than that at the moment, because the 3.8.8
> kernel refuses to load the module for my graphics card, an intel i915!)

Resetting assignee and qa to default as this appears not to be a grub2/os-prober fault.

Assignee: zen25000 => bugsquad

Comment 8 Thierry Vignaud 2013-05-03 14:51:57 CEST
It _is_ also an os-prober issue. It should not try to mount _extended_ partitions.
That's just pure nonsense.

CC: (none) => zen25000

Comment 9 Barry Jackson 2013-05-03 17:17:46 CEST
(In reply to Richard Neill from comment #5)
> This is a kernel problem in 3.3.8.mga2. If I try all the tests above under
> 3.8.8.mga3, then everything works fine.
> 
> BUT... I think people are going to hit this bug when upgrading from Mga2: it
> will result in update-grub2 hanging, and in turn, making urpmi stall.
> 
> (Also, I can't run a later kernel than that at the moment, because the 3.8.8
> kernel refuses to load the module for my graphics card, an intel i915!)

I just pushed an updated os-prober to cauldron updates/testing, however I'm unsure if the new Fedora patches applied actually address this problem (one is related to extended partitions and btrfs).

Could you test os-prober-1.57-6.mga3 please?

Thanks,
Barry
Comment 10 Richard Neill 2013-05-03 23:58:35 CEST
I've just tried to get the latest version from the mirrors, but the latest one available (by urpmi --auto-select) is still 1.57-5.mga3.  Am I missing something?
Comment 11 Barry Jackson 2013-05-04 00:05:55 CEST
(In reply to Richard Neill from comment #10)
> I've just tried to get the latest version from the mirrors, but the latest
> one available (by urpmi --auto-select) is still 1.57-5.mga3.  Am I missing
> something?

You need to enable 'core updates testing' repo just to install it and then disable that repo again.
Comment 12 Richard Neill 2013-05-04 00:18:12 CEST
Thanks. I managed to install 1.57-6.

os-prober still hangs, though fortunately it's now killable with "Ctrl-\".
syslog shows...

May  3 23:14:43 chocolate logger: os-prober: debug: running /usr/lib/os-probes/50mounted-tests on /dev/sda2
May  3 23:14:43 chocolate logger: os-prober: debug: running /usr/lib/os-probes/50mounted-tests.0007 on /dev/sda2

This problem has two parts: os-prober shouldn't probe an extended partition, and there is also something wrong with mount.
Comment 13 Richard Neill 2013-05-04 00:26:38 CEST
Summary:

1. In kernel 3.3.8-desktop-2.mga2, os-prober (1.57-6) shows the bug (probing the extended partition), and the kernel's mount call hangs.

2. In kernel 3.8.11-desktop-1.mga3, os-prober still tries to probe the extended partition (sda2), but does so safely, and the kernel doesn't hang when mounting. Fragment of syslog below.

3. [Annoyingly, I am still stuck with 3.3.8 because of another bug which prevents the intel graphics driver loading on newer kernels!]


------------
May  3 23:23:57 chocolate logger: os-prober: debug: running /usr/lib/os-probes/50mounted-tests on /dev/sda2
May  3 23:23:57 chocolate logger: os-prober: debug: running /usr/lib/os-probes/50mounted-tests.0007 on /dev/sda2
May  3 23:23:57 chocolate kernel: EXT2-fs (sda2): error: unable to read superblock
May  3 23:23:57 chocolate kernel: EXT4-fs (sda2): unable to read superblock
May  3 23:23:57 chocolate kernel: cramfs: wrong magic
May  3 23:23:57 chocolate kernel: EXT3-fs (sda2): error: unable to read superblock
May  3 23:23:57 chocolate kernel: REISERFS warning (device sda2): sh-2006 read_super_block: bread failed (dev sda2, block 8, size 1024)
May  3 23:23:57 chocolate kernel: REISERFS warning (device sda2): sh-2006 read_super_block: bread failed (dev sda2, block 64, size 1024)
May  3 23:23:57 chocolate kernel: REISERFS warning (device sda2): sh-2021 reiserfs_fill_super: can not find reiserfs on sda2
May  3 23:23:57 chocolate kernel: XFS (sda2): bad magic number
May  3 23:23:57 chocolate kernel: XFS (sda2): SB validate failed
May  3 23:23:57 chocolate kernel: FAT-fs (sda2): bogus number of reserved sectors
May  3 23:23:57 chocolate kernel: FAT-fs (sda2): Can't find a valid FAT filesystem
May  3 23:23:57 chocolate kernel: FAT-fs (sda2): bogus number of reserved sectors
May  3 23:23:57 chocolate kernel: FAT-fs (sda2): Can't find a valid FAT filesystem
May  3 23:23:57 chocolate kernel: MINIX-fs: unable to read superblock
May  3 23:23:57 chocolate kernel: attempt to access beyond end of device
May  3 23:23:57 chocolate kernel: sda2: rw=0, want=3, limit=2
May  3 23:23:57 chocolate kernel: hfs: unable to find HFS+ superblock
May  3 23:23:57 chocolate kernel: qnx4: wrong fsid in superblock.
May  3 23:23:57 chocolate kernel: You didn't specify the type of your ufs filesystem
May  3 23:23:57 chocolate kernel: 
May  3 23:23:57 chocolate kernel: mount -t ufs -o ufstype=sun|sunx86|44bsd|ufs2|5xbsd|old|hp|nextstep|nextstep-cd|openstep ...
May  3 23:23:57 chocolate kernel: 
May  3 23:23:57 chocolate kernel: >>>WARNING<<< Wrong ufstype may corrupt your filesystem, default is ufstype=old
May  3 23:23:57 chocolate kernel: hfs: can't find a HFS filesystem on dev sda2.
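
The "want=3, limit=2" line suggests the kernel exposes the extended partition as a tiny two-sector device, far too small for any filesystem. A sketch of a pre-mount size check built on that observation; the function names and the two-sector threshold are assumptions drawn from this log, not os-prober code:

```shell
# Return 0 if a sector count is too small to hold any filesystem.
# Assumption: an MBR extended partition shows up as a two-sector stub,
# matching "limit=2" in the log above.
is_stub_sized() {
    [ "$1" -le 2 ]
}

# Hypothetical wrapper: look up the size of e.g. "sda2" in sysfs,
# which reports it in 512-byte sectors.
is_stub_partition() {
    size_file="/sys/class/block/$1/size"
    [ -r "$size_file" ] || return 1
    is_stub_sized "$(cat "$size_file")"
}

# Usage:
#   is_stub_partition sda2 && echo "sda2 is too small to mount, skipping"
```

A check like this would avoid the whole run of failed mount attempts seen above, though it does not replace a proper test of the partition type.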
Comment 14 Barry Jackson 2013-05-04 01:00:32 CEST
Thanks for testing - I was hoping those patches may have fixed it. :(

So this is harmless with later kernels and so is not actually critical for Mga3, however it still *should* be fixed in os-prober.

I will report upstream.

Keywords: NEEDINFO => (none)
Assignee: bugsquad => zen25000

Comment 15 Barry Jackson 2013-05-04 18:35:52 CEST
(In reply to Richard Neill from comment #13)
Before I do, please try the latest unstable version, just to be sure this is not already fixed upstream:
http://mtf.no-ip.co.uk/pub/linux/barjac/distrib/cauldron/x86_64/media/extra/release/os-prober-1.58-1.mga3.x86_64.rpm
or for 32 bit:
http://mtf.no-ip.co.uk/pub/linux/barjac/distrib/cauldron/i586/media/extra/release/os-prober-1.58-1.mga3.i586.rpm

Thanks,
Barry
Comment 16 Richard Neill 2013-05-04 23:49:37 CEST
Thanks. Yes, this still affects the latest version 1.58-1, which still probes /dev/sda2.

I agree - please file this upstream, but it's not a serious critical bug any more. 

[I remain mystified by the weird behaviour of "mount" under the older kernel 3.3.8, but that's not really relevant to this bug]
Comment 17 Barry Jackson 2013-05-05 01:36:20 CEST
Bug report sent upstream - if it gets a number I will post it here.
Comment 18 Barry Jackson 2013-05-05 01:51:59 CEST
This is not a release blocker so removing from tracker

Blocks: 416 => (none)
Severity: critical => normal

Barry Jackson 2013-05-05 01:53:50 CEST

Summary: update-grub2 hangs and never returns (os-prober stuck trying to mount extended partition) => os-prober tries to mount extended partition

Barry Jackson 2013-05-05 12:26:00 CEST

Status: NEW => ASSIGNED

Comment 19 Richard Neill 2014-01-26 22:56:15 CET
This is now working, as of Mga4, so can be closed.

On the same system as previously (which has only Linux present), running "/etc/grub.d/30_os-prober" now exits successfully, and prints nothing. (This is as expected).

Status: ASSIGNED => RESOLVED
Resolution: (none) => FIXED

Comment 20 Thierry Vignaud 2015-04-08 14:25:44 CEST
It still tries to open the extended partition in mga5.
It should only look at logical partitions (the contents) not the extended one (the container).

Status: RESOLVED => REOPENED
Resolution: FIXED => (none)

Thierry Vignaud 2015-04-08 14:26:08 CEST

Source RPM: os-prober-1.57-5.mga3.src.rpm => os-prober-1.65-6.mga5.src.rpm

Comment 21 Thierry Vignaud 2015-04-08 14:33:44 CEST
I think that's b/c we're totally unsynced with RH which has fixed it:
http://pkgs.fedoraproject.org/cgit/os-prober.git/commit/?id=1cc85085b1fb5ddf7e66548f2db7808397f23c11

URL: (none) => http://pkgs.fedoraproject.org/cgit/os-prober.git/commit/?id=1cc85085b1fb5ddf7e66548f2db7808397f23c11

Comment 22 Thierry Vignaud 2015-04-08 14:36:46 CEST
Also we should exclude backup files generated by patch:
/usr/lib/linux-boot-probes/mounted/40grub2.0007
/usr/lib/linux-boot-probes/mounted/40grub2.0015
/usr/lib/os-probes/50mounted-tests.0007
/usr/lib/os-probes/50mounted-tests.0012
/usr/lib/os-probes/mounted/05efi.0013
/usr/lib/os-probes/mounted/20microsoft.0016
/usr/lib/os-probes/mounted/83haiku.0016
/usr/lib/os-probes/mounted/90linux-distro.0001
/usr/lib/os-probes/mounted/90linux-distro.0004
/usr/lib/os-probes/mounted/90linux-distro.0007
/usr/lib/os-probes/mounted/90linux-distro.0011
/usr/lib/os-probes/mounted/90linux-distro.0012
Comment 23 Thierry Vignaud 2015-04-08 14:38:31 CEST
Actually, removing those files fixes that issue.
It's a side effect of switching to %apply_patches (bug #15579).
It also makes it faster...
Comment 24 Thierry Vignaud 2015-04-08 14:40:24 CEST
Well, we still have a bogus error message but that's quite a lot better:

/dev/sdb3:Fedora release 13 (Goddard):Fedora:linux
rmdir: failed to remove ‘/var/lib/os-prober/mount’: Device or resource busy

URL: http://pkgs.fedoraproject.org/cgit/os-prober.git/commit/?id=1cc85085b1fb5ddf7e66548f2db7808397f23c11 => (none)

Comment 25 Thierry Vignaud 2015-04-08 14:41:16 CEST
This should do it:

%exclude     /usr/lib/os-probes/*.00??
%exclude     /usr/lib/os-probes/mounted/*.00??
Comment 26 Barry Jackson 2015-04-08 17:35:12 CEST
OK thanks,
Fixed the patch backup files in svn.
Committed revision 819882

Fedora are using 1.57 and we are using 1.65, so I'm not sure whether any more of their patches are needed.
Shall I ask for freeze push now or is there anything else?
I have tested in UEFI and BIOS with no apparent regressions.

Index: SPECS/os-prober.spec
===================================================================
--- SPECS/os-prober.spec	(revision 819739)
+++ SPECS/os-prober.spec	(working copy)
@@ -1,7 +1,7 @@
 %define _libexecdir %{_exec_prefix}/lib
 Name:           os-prober
 Version:        1.65
-Release:        %mkrel 6
+Release:        %mkrel 7
 Summary:        Probes disks on the system for installed operating systems
 License:        GPLv1 and GPLv2+
 Group:          System/Boot and Init
@@ -73,6 +73,7 @@
         install -m 755 -p os-probes/mounted/powerpc/20macosx \
             %{buildroot}%{_libexecdir}/os-probes/mounted
 fi
+find %{buildroot}/usr/lib/os-probes -name "*.00??" -delete
 
 %files
 %doc README TODO debian/copyright debian/changelog COPYING-note.txt
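
To double-check a build root or an installed system for leftover patch backups, something along these lines could be used. It is a sketch: the function name list_patch_backups is hypothetical, and the name pattern is the same "*.00??" glob as in the diff above.

```shell
# List leftover patch backup files (names ending in ".00" plus two
# more characters, e.g. 50mounted-tests.0007) under the given
# directories, sorted for stable output.
list_patch_backups() {
    find "$@" -type f -name '*.00??' 2>/dev/null | sort
}

# Usage on an installed system:
#   list_patch_backups /usr/lib/os-probes /usr/lib/linux-boot-probes
```

An empty result means no backups remain. The usage line includes /usr/lib/linux-boot-probes because comment 22 also lists backup files there, a directory that the find call in the diff does not appear to visit.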
Comment 27 Barry Jackson 2015-04-08 18:21:02 CEST
BTW, I see no errors when running os-prober-1.65-7 on an EFI system with multiple BIOS HDs attached (two of which have extended partitions):

[root@localhost ~]# os-prober
/dev/sda3:Mageia 4 (4):Mageia:linux
/dev/sda5:Mageia 2 (2):Mageia1:linux
/dev/sda6:Mageia 3 (3):Mageia2:linux
/dev/sda7:Mageia 5 (5):Mageia3:linux
/dev/sda8:Mageia 5 (5):Mageia4:linux
/dev/sdb16:Mageia 5:Mageia5:linux
[root@localhost ~]#

In debug mode there are these for both of them, so it does try to run 50mounted-tests on the extended:

--------------snip--------------
os-prober: debug: running /usr/lib/os-probes/50mounted-tests on /dev/sda2
--------------snip--------------
os-prober: debug: running /usr/lib/os-probes/50mounted-tests on /dev/sdb2
os-prober: debug: /dev/sdb5: is active swap
os-prober: debug: /dev/sdb6: is active swap
os-prober: debug: running /usr/lib/os-probes/50mounted-tests on /dev/sdb7
50mounted-tests: debug: mounted as ext3 filesystem
50mounted-tests: debug: running subtest /usr/lib/os-probes/mounted/05efi
05efi: debug: /dev/sdb7 is ext3 partition: exiting
50mounted-tests: debug: running subtest /usr/lib/os-probes/mounted/10freedos
10freedos: debug: /dev/sdb7 is not a FAT partition: exiting
------------snip--------------

Note that it quietly fails to mount them. See further down where it says:

os-prober: debug: running /usr/lib/os-probes/50mounted-tests on /dev/sdb7
50mounted-tests: debug: mounted as ext3 filesystem

...where it has mounted sdb7 and reports the fs.

So is this an issue?
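
For anyone wanting to check the same thing on their own system, the probed devices can be pulled out of the debug output with a small filter. This is a sketch that assumes the debug line format shown above; the function name probed_devices is made up for this illustration.

```shell
# Hypothetical helper: read os-prober debug lines on stdin and print
# each device that 50mounted-tests was run on, one per line.
probed_devices() {
    sed -n 's|.*running /usr/lib/os-probes/50mounted-tests on \(/dev/[^ ]*\).*|\1|p'
}

# Usage (the log location is an assumption and varies by system):
#   grep os-prober /var/log/syslog | probed_devices
```

Comparing that list against the partition table shows at a glance whether an extended partition such as /dev/sda2 is still being probed.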
Comment 28 Thierry Vignaud 2015-04-08 18:22:18 CEST
(In reply to Barry Jackson from comment #26)
I've a freeze push

(In reply to Barry Jackson from comment #27)
It's still better (quite a lot fewer dangerous commands run on extended partitions)
Comment 29 Barry Jackson 2015-04-08 22:32:32 CEST
(In reply to Thierry Vignaud from comment #28)
> (In reply to Barry Jackson from comment #26)
> I've a freeze push
> 
> (In reply to Barry Jackson from comment #27)
> It's still better (quite a lot fewer dangerous commands run on extended
> partitions)


I had already committed it in #26.
The testing I referred to in #27 was with the fix I committed in #26.

Never mind, it won't do any harm to have both. I'll remove one next time it's updated. ;)
Comment 30 Barry Jackson 2015-04-09 01:04:09 CEST
Closing then

Status: REOPENED => RESOLVED
Resolution: (none) => FIXED

Thierry Vignaud 2015-04-09 10:11:18 CEST

Depends on: (none) => 15579

Thierry Vignaud 2016-06-22 02:57:33 CEST

Blocks: (none) => 416

