Bug 27998 - Kernel driver for VirtualBox 6.1.16 and kernel 5.7.19-desktop not loading
Summary: Kernel driver for VirtualBox 6.1.16 and kernel 5.7.19-desktop not loading
Status: RESOLVED WORKSFORME
Alias: None
Product: Mageia
Classification: Unclassified
Component: RPM Packages
Version: 7
Hardware: All Linux
Priority: Normal, Severity: normal
Target Milestone: ---
Assignee: Mageia Bug Squad
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-01-02 02:47 CET by Mark Dawson Butterworth
Modified: 2021-01-02 19:03 CET
CC List: 2 users

See Also:
Source RPM:
CVE:
Status comment:


Attachments

Description Mark Dawson Butterworth 2021-01-02 02:47:02 CET
Description of problem:

With kernel 5.7.19-desktop and the latest update to VirtualBox, the kernel driver no longer appears to be loading.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Install latest updates
2. Attempt to run a VM
3. Reports VERR_VM_DRIVER_NOT_INSTALLED
4. Notice that dmesg says it is using the VirtualBox 6.0 driver
5. Notice that both VirtualBox 6.0 and 6.1 kernel drivers are installed for kernel 5.7.19-desktop 
6. Uninstall 6.0 driver using software installer
7. Reboot
8. Still get same error
9. Notice that dmesg now displays no messages relating to VirtualBox

Note:
Uninstalling and reinstalling VirtualBox doesn't help
Comment 1 Mark Dawson Butterworth 2021-01-02 02:48:24 CET
Sorry, failed to note this is for x86_64
Comment 2 Mark Dawson Butterworth 2021-01-02 03:24:46 CET
Using dkms-virtualbox as an attempted work-around results in the following report from journalctl:

Jan 02 02:06:47 <machine name> systemd[1]: Starting LSB: VirtualBox Linux kernel module...
Jan 02 02:06:47 <machine name> systemd[1]: Started LSB: VirtualBox Linux kernel module.

However, VirtualBox still reports the same error.

I suspect dkms-virtualbox actually made no difference - I hadn't checked using journalctl before.

What is apparent is that the following lines are missing from the journal since the update:

Dec 30 19:09:30 <machine name> kernel: VBoxNetFlt: Successfully started.
Dec 30 19:09:30 <machine name> kernel: VBoxNetAdp: Successfully started.

So, the module may well be installed but it looks like it isn't actually working. This is probably why dmesg no longer reports anything.
Comment 3 Dave Hodgins 2021-01-02 04:01:35 CET
In cauldron, kmod and dkms packages are only updated for the latest available
kernel. Is there a reason why you are still using 5.7.19 instead of
kernel-desktop-latest which currently pulls in
kernel-desktop-5.10.4-2.mga8-1-1.mga8?

On my system, that's working with virtualbox-kernel-desktop-latest-6.1.16-32.mga8
and virtualbox-6.1.16-8.mga8

Closing as works for me.

Status: NEW => RESOLVED
Resolution: (none) => WORKSFORME
CC: (none) => davidwhodgins

Comment 4 Mark Dawson Butterworth 2021-01-02 04:07:42 CET
Kernel 5.7.19 is latest for mga7 I believe?

I don't know status for mga8. This is for mga7 - mga8 is still in beta surely?

Status: RESOLVED => REOPENED
Resolution: WORKSFORME => (none)

Comment 5 Dave Hodgins 2021-01-02 05:24:48 CET
Sorry, I failed to notice the version tag. Looking into it to try and
figure out what went wrong.
Comment 6 Dave Hodgins 2021-01-02 05:53:38 CET
$ rpm -q virtualbox
virtualbox-6.1.16-4.mga7
[dave@x3 ~]$ uname -a
Linux x3.hodgins.homeip.net 5.7.19-desktop-3.mga7 #1 SMP Sun Oct 18 15:46:00 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
[dave@x3 ~]$ grep vb /proc/modules
vboxnetadp 28672 0 - Live 0x0000000000000000 (O)
vboxnetflt 32768 1 - Live 0x0000000000000000 (O)
vboxdrv 528384 3 vboxnetadp,vboxnetflt, Live 0x0000000000000000 (O)

With the above, virtualbox is working for me.
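
The /proc/modules check above can also be scripted. This is only a minimal sketch: the function name loaded_vbox_modules is made up for illustration, and the sample text is the output pasted above, used as a fallback when /proc/modules cannot be read.

```python
def loaded_vbox_modules(proc_modules_text):
    """Return the names of loaded modules whose name starts with 'vbox'."""
    names = []
    for line in proc_modules_text.splitlines():
        fields = line.split()
        # The first whitespace-separated field of each line is the module name.
        if fields and fields[0].startswith("vbox"):
            names.append(fields[0])
    return names


# Sample taken from the /proc/modules output pasted in this comment.
SAMPLE = """\
vboxnetadp 28672 0 - Live 0x0000000000000000 (O)
vboxnetflt 32768 1 - Live 0x0000000000000000 (O)
vboxdrv 528384 3 vboxnetadp,vboxnetflt, Live 0x0000000000000000 (O)
"""

if __name__ == "__main__":
    try:
        with open("/proc/modules") as handle:
            text = handle.read()
    except OSError:
        text = SAMPLE  # fall back to the pasted sample
    modules = loaded_vbox_modules(text)
    state = "loaded" if "vboxdrv" in modules else "NOT loaded"
    print("vboxdrv is", state, modules)
```

If vboxdrv is missing from the result, VirtualBox would be expected to report VERR_VM_DRIVER_NOT_INSTALLED as in the original description.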

For kernel 5.7.19 there were 3 virtualbox kernel updates released. Checking ...
https://mirror.math.princeton.edu/pub/mageia/distrib/7.1/x86_64/media/core/updates/
virtualbox-kernel-5.7.19-desktop-1.mga7-6.0.24-5.mga7.x86_64.rpm 
virtualbox-kernel-5.7.19-desktop-3.mga7-6.0.24-6.mga7.x86_64.rpm
virtualbox-kernel-5.7.19-desktop-3.mga7-6.1.16-4.mga7.x86_64.rpm

Which 5.7.19-desktop kernel are you using, and which mirror?
(use "urpmq --list-url|grep core/updates" to determine which mirror)
Comment 7 Mark Dawson Butterworth 2021-01-02 12:26:11 CET
I think I can see what's wrong now. The VirtualBox driver is virtualbox-kernel-5.7.19-desktop-3.mga7-6.1.16-4.mga7.x86_64.rpm but the kernel is 5.7.19-desktop-1.mga7.

5.7.19-desktop-3.mga7 is present but not being used for some reason. I'm about to remove 5.7.19-desktop-1.mga7 to see if that fixes things.
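
To illustrate the mismatch, here is a rough sketch that pulls the kernel release out of a virtualbox-kernel package name and compares it with the running kernel. The regular expression is only inferred from the three package names listed in comment 6, and the helper names kernel_release_of and modules_for are hypothetical; the running kernel string would normally come from "uname -r".

```python
import re

# Pattern inferred from names such as
# virtualbox-kernel-5.7.19-desktop-3.mga7-6.1.16-4.mga7 (an assumption,
# not an official naming specification).
PKG_RE = re.compile(
    r"^virtualbox-kernel-"
    r"(?P<kernel>\d+(?:\.\d+)+-[a-z0-9]+-\d+\.mga\d+)-"
    r"(?P<vbox>\d+(?:\.\d+)+)-\d+\.mga\d+"
)


def kernel_release_of(package_name):
    """Return the kernel release a package was built for, or None."""
    match = PKG_RE.match(package_name)
    return match.group("kernel") if match else None


def modules_for(running_kernel, package_names):
    """Return the packages whose kernel release equals the running kernel."""
    return [p for p in package_names if kernel_release_of(p) == running_kernel]


if __name__ == "__main__":
    installed = ["virtualbox-kernel-5.7.19-desktop-3.mga7-6.1.16-4.mga7"]
    running = "5.7.19-desktop-1.mga7"
    # An empty list means no installed driver package matches the running kernel.
    print(modules_for(running, installed))
```

With the -1 kernel running and only the -3 driver package installed, the list comes back empty, which matches what I am seeing.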
Comment 8 Mark Dawson Butterworth 2021-01-02 12:48:04 CET
I'm using the mirror http://www.mirrorservice.org/sites/mageia.org/pub/mageia/distrib/7

It turns out here that the only issue with VirtualBox is that the driver loader mechanisms are not checking compatibility of 'x' in 'desktop/server-x'.

The real problem is that the updater is not changing boot.cfg if the only thing that changes is 'x'. Removing kernel 5.7.19-desktop-1.mga7 resulted in a non-booting system with 5.7.19-desktop-3.mga7 not appearing in the list of kernels. I had to manually edit on boot to change -1 to -3. The system then booted and VirtualBox works.

I guess this behaviour may have been present for a long time and not been noticed. The user doesn't realise they are using the -1 version unless they boot from the list of kernels rather than "Mageia" - they just think they are using the latest.

The intended action I would have thought is to either replace 5.7.19-desktop-1.mga7 with 5.7.19-desktop-3.mga7 or shuffle down as normal for a new kernel.

Thanks for your help getting to this point.

Should I retitle this bug, close this one and open another noting the problem with boot.cfg and changes in 'x' for the same kernel, or do both so there is a record of the fact that the VirtualBox driver loader doesn't check 'x'?
Comment 9 Aurelien Oudelet 2021-01-02 12:53:28 CET
Hi, to check whether there is an issue with grub, please run the following as root:

# update-grub

And post the output here.

CC: (none) => ouaurelien

Comment 10 Mark Dawson Butterworth 2021-01-02 13:02:23 CET
Generating grub configuration file ...
Found theme: /boot/grub2/themes/maggy/theme.txt
Found linux image: /boot/vmlinuz-5.7.19-desktop-3.mga7
Found initrd image: /boot/initrd-5.7.19-desktop-3.mga7.img
Found linux image: /boot/vmlinuz-5.7.14-desktop-1.mga7
Found initrd image: /boot/initrd-5.7.14-desktop-1.mga7.img
Found linux image: /boot/vmlinuz-5.6.8-desktop-1.mga7
Found initrd image: /boot/initrd-5.6.8-desktop-1.mga7.img
Found linux image: /boot/vmlinuz-desktop
Found initrd image: /boot/initrd-desktop.img
Jan 02 11:53:57 | DM multipath kernel driver not loaded
Found Mageia 7 (7) on /dev/sda1
Found CentOS Linux 7 (Core) on /dev/sdb5
Found Ubuntu 16.04.2 LTS on /dev/sdb5
done

However, I don't think this is conclusive, because it is after the 5.7.19-desktop-1.mga7 kernel has been removed.

I've just checked another machine, which is using the same mirror, and this has correctly picked up -3. Perhaps this problem only occurred if someone updated between release of 5.7.19-desktop-1.mga7 and 5.7.19-desktop-3.mga7?
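
For comparing machines, the update-grub output can be reduced to the list of kernel releases it found. This is a minimal sketch that assumes the "Found linux image:" line format shown above; the function name kernels_found is hypothetical and the sample is abridged from my output.

```python
import re

# Matches versioned kernel images only. The unversioned /boot/vmlinuz-desktop
# entry is skipped because its name carries no release string.
IMAGE_RE = re.compile(r"^Found linux image: /boot/vmlinuz-(\d\S*)$")


def kernels_found(update_grub_output):
    """Return the kernel releases reported by update-grub, in order."""
    found = []
    for line in update_grub_output.splitlines():
        match = IMAGE_RE.match(line.strip())
        if match:
            found.append(match.group(1))
    return found


# Abridged from the update-grub output pasted in this comment.
OUTPUT = """\
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-5.7.19-desktop-3.mga7
Found initrd image: /boot/initrd-5.7.19-desktop-3.mga7.img
Found linux image: /boot/vmlinuz-5.7.14-desktop-1.mga7
Found linux image: /boot/vmlinuz-desktop
done
"""

if __name__ == "__main__":
    kernels = kernels_found(OUTPUT)
    print(kernels)
    print("5.7.19-desktop-1.mga7" in kernels)
```

Checking whether the release reported by "uname -r" appears in that list would show whether the running kernel is one that grub currently knows about.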
Comment 11 Dave Hodgins 2021-01-02 14:00:04 CET
The most likely cause is that previously, the 5.7.19-desktop-1.mga7 was
accidentally manually selected in grub rather than the Mageia desktop entry,
preventing that kernel from being removed when the
virtualbox-kernel-5.7.19-desktop-1.mga7 package was removed. With GRUB_DEFAULT=saved in /etc/default/grub it kept booting the old kernel.
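
A rough way to check for this is to look at GRUB_DEFAULT in /etc/default/grub and at saved_entry in the output of "grub2-editenv list". The sketch below only parses text in that KEY=VALUE form; the helper names and the sample entry title in the usage example are made up for illustration.

```python
def parse_assignments(text):
    """Parse KEY=VALUE lines, ignoring blank lines and '#' comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def boots_saved_entry(default_grub_text):
    """True when /etc/default/grub asks grub to boot the saved entry."""
    return parse_assignments(default_grub_text).get("GRUB_DEFAULT") == "saved"


def pinned_entry(default_grub_text, grubenv_text):
    """Return the saved entry grub will boot, or None if none applies."""
    if not boots_saved_entry(default_grub_text):
        return None
    return parse_assignments(grubenv_text).get("saved_entry")


if __name__ == "__main__":
    default_grub = 'GRUB_DEFAULT=saved\nGRUB_TIMEOUT=10\n'
    # Hypothetical saved entry, for illustration only.
    grubenv = "saved_entry=Mageia, with Linux 5.7.19-desktop-1.mga7\n"
    print(pinned_entry(default_grub, grubenv))
```

If the returned entry names a specific kernel version rather than the generic Mageia entry, grub would keep booting that kernel even after newer ones are installed.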
Comment 12 Mark Dawson Butterworth 2021-01-02 14:53:21 CET
It definitely wasn't due to manual selection - I haven't manually selected a kernel on this machine for well over 12 months. 

If it had been manual selection, that wouldn't explain why 5.7.19-desktop-3.mga7 was not in boot.cfg would it?

Also, when the 5.7.19-desktop-1.mga7 kernel was removed, it would have been removed from boot.cfg even if it had been manually selected before, wouldn't it?

Now, I apologise for my bad memory. I was party to the forum discussion regarding automatically removing old kernels and note https://bugs.mageia.org/show_bug.cgi?id=24403. This is the very machine of mine that made me contribute to the feature request - it has a tiny boot drive. Is it a fluke that there were only four kernels installed, or is there a kernel-cleaning script available which I have enabled and forgotten about? If I have enabled one, could that be what caused the problem?
Comment 13 Dave Hodgins 2021-01-02 14:58:48 CET
Was /boot full?
Comment 14 Mark Dawson Butterworth 2021-01-02 18:21:10 CET
There's 4.7G free now so I don't think so - only 5.7.19-desktop-1.mga7 kernel and VirtualBox 6.0.24 have been deleted since then. I'm not sure which of virtualbox-kernel-5.7.19-desktop-1.mga7-6.0.24-5.mga7.x86_64.rpm or
virtualbox-kernel-5.7.19-desktop-3.mga7-6.0.24-6.mga7.x86_64.rpm I had loaded before.

My theory about the problem occurring only if someone updated between release of 5.7.19-desktop-1.mga7 and 5.7.19-desktop-3.mga7 is disproved. Another machine has both installed and both appear in the grub menu, so it did both updates and didn't get the problem. There's something specific about the way this machine has updated that has caused the issue.

Given this appears to be somewhat unique, is it worth the effort to try and work out what happened or shall I close this until/if the problem recurs?
Comment 15 Aurelien Oudelet 2021-01-02 18:58:51 CET
We don't think so.
Too many kernels in /boot is only a disk-space issue.

The update-grub script finds all kernels in /boot and probes for other OSes on other partitions.
bootloader-config is one of our tools which calls update-grub.
In the default situation, everything works correctly.

BUT there is a caveat with the saved entry for Grub2.
If you have ever used this functionality, you can still be booting an old kernel by default rather than the new one.
This happens because you HAVE selected "Advanced Options for Mageia..." and a specific version.
This functionality lets Windows users boot the Microsoft OS by default.

So I don't see an issue here.

You can also comment on Bug 24403 to ask for a tool to handle old kernels.

See Also: (none) => https://bugs.mageia.org/show_bug.cgi?id=24403

Comment 16 Mark Dawson Butterworth 2021-01-02 19:03:31 CET
Appears to be unique to this one machine, so closing bug as problem no longer exists after editing boot.cfg and the cause is not easy to determine.

Resolution: (none) => WORKSFORME
Status: REOPENED => RESOLVED

