Bug 44 - slow OS under VirtualBox, due to HZ=1000 (was mkinitrd fails in VB)
Summary: slow OS under VirtualBox, due to HZ=1000 (was mkinitrd fails in VB)
Status: NEW
Alias: None
Product: Mageia
Classification: Unclassified
Component: Release (media or process)
Version: Cauldron
Hardware: All Linux
Priority: High
Severity: critical
Target Milestone: ---
Assignee: Erwan VELU
QA Contact:
URL:
Whiteboard: 5beta1
Keywords: NEEDINFO
Depends on:
Blocks:
 
Reported: 2011-02-16 02:45 CET by David W. Hodgins
Modified: 2017-07-10 16:28 CEST
CC List: 16 users

See Also:
Source RPM: syslinux, mageia-gfxboot-theme, drakx-installer-images, kernel
CVE:
Status comment:


Attachments
Screenshot of install under VirtualBox log (128.89 KB, image/jpeg)
2011-02-16 02:47 CET, David W. Hodgins

Description David W. Hodgins 2011-02-16 02:45:10 CET
mkinitrd fails during installations, after 6 hours installing the kde
version.  I suspect this is related to
https://qa.mandriva.com/show_bug.cgi?id=62529

I'm attaching a screenshot of the log
Comment 1 David W. Hodgins 2011-02-16 02:47:46 CET
Created attachment 2
Screenshot of install under VirtualBox log
Comment 2 David W. Hodgins 2011-02-16 23:39:43 CET
I was able to workaround the problem by booting from an iso image of Knoppix,
which uses CONFIG_HZ_300=y, mounting the mageia install, and using chroot
to enter the mageia system.

I then ran the mkinitrd, and it completed in about 3 minutes.  I was then
able to reboot the mageia install, and use the upgrade option, to complete
the install.

The system is now working, but very, very slow.  I'm going to try and
compile a custom kernel with CONFIG_HZ_100=y, to confirm that that will
fix the problem.

So the problem is not with mkinitrd itself.  The problem is excessive cpu
usage, combined with the time limit imposed by /usr/lib/libDrakX/run_program.pm
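
To illustrate the failure mode (a slow but otherwise healthy command being killed by a wall-clock limit), here is a minimal Python sketch. It is not the DrakX code, which is Perl and is not shown in this report; the function name run_with_limit and the limit value passed in are assumptions made for illustration only.

```python
import subprocess
import sys


def run_with_limit(argv, limit_seconds):
    """Return True if argv exits with status 0 within limit_seconds.

    Hypothetical helper: it only mimics the idea of a caller-imposed
    time limit, not the real logic in run_program.pm.
    """
    try:
        completed = subprocess.run(argv, timeout=limit_seconds)
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before raising, so a command
        # that is merely slow ends up reported as a failure.
        return False
    return completed.returncode == 0


if __name__ == "__main__":
    # A quick command succeeds, a slow one "fails" purely because of the limit.
    print(run_with_limit([sys.executable, "-c", "pass"], 30))
    print(run_with_limit([sys.executable, "-c", "import time; time.sleep(5)"], 0.5))
```

The point of the sketch is that the command itself never errors out; only the limit turns slowness into a failure.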

Regards, Dave Hodgins
Ahmad Samir 2011-02-17 05:20:35 CET

CC: (none) => tmb

D Morgan 2011-02-20 19:56:55 CET

CC: (none) => dmorganec
Assignee: ahmadsamir3891 => bugsquad

Comment 3 Dave Hodgins 2011-03-06 01:19:17 CET
Took 10 days (220 hours of cpu time) to compile a custom kernel.

The only change between my custom kernel, and the desktop kernel, is changing
the HZ from 1000 to 100.

Booting to run level 1 failsafe, from pressing b in grub to the bash prompt ...

custom   1m 40s
desktop  4m 10s

Booting to run level 3, from selecting the kernel, to the login prompt

custom   2  minutes
desktop  13 minutes

LXDE works fine with the custom kernel, but locks up with the desktop kernel.

So having a kernel using 100HZ available is important, and should be the default
used by the installation iso.  It's fine to install a 1000HZ kernel on systems that
are new enough, but the installer has to be able to run in older systems.

I'll now start trying to figure out why the server kernels are failing to run here.
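
For reference, the boot timings above work out to the following slowdown factors. This is only arithmetic on the numbers reported in this comment; the helper names are made up for the sketch.

```python
def to_seconds(minutes, seconds=0):
    """Convert a minutes/seconds pair to seconds."""
    return minutes * 60 + seconds


# Timings copied from this comment (custom = HZ 100, desktop = HZ 1000).
boot_times = {
    "run level 1": {"custom": to_seconds(1, 40), "desktop": to_seconds(4, 10)},
    "run level 3": {"custom": to_seconds(2), "desktop": to_seconds(13)},
}


def slowdown(times):
    """How many times longer the desktop kernel took than the custom one."""
    return times["desktop"] / times["custom"]


if __name__ == "__main__":
    for target, times in boot_times.items():
        print(f"{target}: desktop kernel is {slowdown(times):.1f}x slower")
```

So the 1000HZ desktop kernel took 2.5 times as long to reach run level 1 and 6.5 times as long to reach run level 3.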

CC: (none) => davidwhodgins

Comment 4 Marja van Waes 2011-09-28 22:35:08 CEST
@ David

Sorry for responding so very, very late!

Was the problem still there the last time you installed Cauldron (rc 1?)? And what about Mageia 1 official?

You only have the problem when installing, but never when upgrading your kernel, is that correct?

You're very different from me. It is only now that I have a new laptop that I would consider installing anything under VirtualBox.
 
No one confirmed this bug, that makes it very hard, even if the problem persists, to assign it to a maintainer.

What we can do, however, (if it is still a problem, of course) is turn it into an enhancement request. Then it doesn't need to be reproduced by someone else.

CC: (none) => m.van.waes
Source RPM: (none) => drakx-installer

Marja van Waes 2011-09-28 22:35:40 CEST

Whiteboard: (none) => NEEDINFO

Comment 5 Dave Hodgins 2011-09-29 02:23:11 CEST
I haven't tried a cauldron install under VirtualBox.  The problem is
still there with Mageia 1 Official.

I've been running xppro under VirtualBox since early 2008.  Even with
my slow cpu and 2GB of ram (I give vb 768MB), I've found very little
difference when running xppro under VirtualBox compared to running it
on native hardware, provided the vdi file is a static size, not
dynamic, and the host is mostly idle.

I've had no problems with multiple other linux systems under VB, such
as Knoppix, which do not use the CONFIG_HZ=1000 kernel option.

As to no-one confirming the bug, just take a look at the VirtualBox
Help, Section 12.4.1. Linux guests may cause a high CPU load,
which states "Some Linux distributions, for example Fedora, ship a Linux kernel configured for a timer frequency of 1000Hz. We recommend to recompile the guest kernel and to select a timer frequency of 100Hz.".

As per comment 1, using a kernel with 1000hz, it takes 6 hours to install up
to the point where mkinitrd fails, since it exceeds 10 minutes.

With the 1000hz timer, it takes 12 minutes to run mkinitrd.
With the  300hz timer (Knoppix), it takes 3 minutes.
Take a look at comment 3 for the differences between 1000hz and 100hz.
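
To put the timer frequencies in perspective, here is a rough Python sketch of the nominal tick period and the number of timer ticks the guest asks for while mkinitrd runs. It assumes one tick per 1/HZ seconds on a single virtual cpu and ignores tickless idle, so treat the counts as an upper bound rather than a measurement.

```python
def tick_period_ms(hz):
    """Nominal time between timer ticks, in milliseconds."""
    return 1000.0 / hz


def ticks_during(hz, seconds):
    """Nominal number of timer ticks in the given wall-clock time."""
    return hz * seconds


if __name__ == "__main__":
    # mkinitrd run times reported above: 12 minutes at 1000hz, 3 minutes at 300hz.
    for hz, minutes in ((1000, 12), (300, 3)):
        print(
            f"{hz}hz: tick every {tick_period_ms(hz):.2f} ms, "
            f"about {ticks_during(hz, minutes * 60)} ticks in {minutes} minutes"
        )
```

Each of those ticks has to be emulated by the host, which is why the frequency matters so much more in a guest than on native hardware.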

I'm not sure how many people still use 6 year old hardware, like mine
and would be willing to spend the 6+ hours it takes to get to the point
where the mkinitrd fails.

Mageia needs to have a kernel suitable for use in VirtualBox on older
hardware.  It should use 100hz, no PAE (since VirtualBox does not
enable it by default), and be used by the installer, as well as the
installed system.

Since it would be used by the installer, it should be based on the 586
kernel.  Since there are so few 586 systems still in use, I think that
kernel should be changed to use the 100hz timer.
Comment 6 Marja van Waes 2011-09-29 12:30:05 CEST
(In reply to comment #5)


>  I think that
> kernel should be changed to use the 100hz timer.

And then no one is going to report a bug: "Timer frequency is only 100Hz" ??
I'm sure some people will start complaining.

It would be nice though, to be able to choose between frequencies. 

The kernel.org website is down for maintenance at this moment. Maybe someone is already working on it?
Comment 7 Marja van Waes 2011-09-29 12:35:10 CEST
Well, some people do think about having variable kernel timer frequency:
http://kerneltrap.org/node/6532
Comment 8 Thomas Backlund 2011-09-29 12:52:05 CEST
I think we could maybe drop down the desktop586 kernel to 250Hz as a tradeoff between 100/1000, like we used to have before switching all desktop kernels to 1000Hz.

Would that be ok ?
Comment 9 Thomas Backlund 2011-09-29 13:40:04 CEST
Another option could be a kernel parameter, as the virtualbox faq states:

"Linux kernels shipped with Red Hat Enterprise Linux (RHEL) as of release 4.7 and 5.1 as well as kernels of related Linux distributions (for instance CentOS and Oracle Enterprise Linux) support a kernel parameter divider=N. Hence, such kernels support a lower timer frequency without recompilation. We suggest to add the kernel parameter divider=10 to select a guest kernel timer frequency of 100Hz."
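
Going by the FAQ wording, the divider simply scales the compiled-in frequency down. A small Python sketch of that arithmetic follows; the function names are invented for illustration, and whether a given kernel honours the parameter exactly this way is an assumption based only on the quote above.

```python
def effective_hz(compiled_hz, divider):
    """Timer frequency after applying divider=N, per the FAQ description."""
    if divider < 1:
        raise ValueError("divider must be at least 1")
    return compiled_hz / divider


def divider_for(compiled_hz, target_hz):
    """Divider needed to reach target_hz, if it divides evenly."""
    divider, remainder = divmod(compiled_hz, target_hz)
    if remainder:
        raise ValueError(f"{target_hz} does not divide {compiled_hz} evenly")
    return divider


def kernel_parameter(divider):
    """Text to add to the kernel command line."""
    return f"divider={divider}"


if __name__ == "__main__":
    n = divider_for(1000, 100)
    print(kernel_parameter(n), "->", effective_hz(1000, n), "Hz")
```

For a 1000Hz kernel and a 100Hz target this gives divider=10, which matches the FAQ's suggestion.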
Comment 10 Marja van Waes 2011-09-29 16:57:33 CEST
(In reply to comment #9)
> Another option could be a kernel parameter, as the virtualbox faq states:
> 
> "Linux kernels shipped with Red Hat Enterprise Linux (RHEL) as of release 4.7
> and 5.1 as well as kernels of related Linux distributions (for instance CentOS
> and Oracle Enterprise Linux) support a kernel parameter divider=N. Hence, such
> kernels support a lower timer frequency without recompilation. We suggest to
> add the kernel parameter divider=10 to select a guest kernel timer frequency of
> 100Hz."

I prefer that option, because, IIUR, it won't lead to other users complaining ;)

Do you and tv have to work on this together, to make that work in installer? If so, can I assign this bug to you?

Whiteboard: NEEDINFO => (none)

Comment 11 Dave Hodgins 2011-09-30 02:07:30 CEST
Interesting results.

Using divider=10, the install of just console and config packages took only 40
minutes to get to the point of running mkinitrd, instead of almost 6 hours.

The mkinitrd failed on first run, as it still exceeded 10 minutes.  I checked
syslog on the host, and it turned out mgaapplet had run during the mkinitrd,
slowing it down.

On second run, the mkinitrd took about 8 minutes, but the install completed.

On reboot (with divider=10), the system was still quite slow.

I installed the kernel-server-latest, during which the mkinitrd took
7 min, 50 seconds.

After rebooting, to use the server kernel (without the divider, since it
already uses 100 HZ), I reran the identical mkinitrd.  It took
4 min, 2 seconds.

So while using the divider=10 does allow the install to complete, it looks
like it's still not as good as using 100 HZ.

Also, using the server kernel, the mouse etc, are much more responsive.
Comment 12 Dave Hodgins 2011-09-30 05:04:02 CEST
Just for comparison, the same mkinitrd command on the host system takes 47
seconds.
Comment 13 Marja van Waes 2011-09-30 07:46:54 CEST
(In reply to comment #11)
> Interesting results.
> 
> Using divider=10, the install of just console and config packages took only 40
> minutes to get to the point of running mkinitrd, instead of almost 6 hours.
> 

Thx! So the option is in Mageia kernel too (When I read "as well as kernels of related distributions" in comment 10, I didn't guess we were so much related to RHEL)

Since the divider=10 option is good enough to complete the install, I'll assign this bug to tv


> 
> So while using the divider=10 does allow the install to complete, it looks
> like it's still not as good as using 100 HZ.
> 

@ tmb, of course you're still welcome to react!

Assignee: bugsquad => thierry.vignaud

Comment 14 Marja van Waes 2011-09-30 08:00:06 CEST
I'm waking up:

If this divider=10 option is available in installer, this isn't a bug. Or is the option impossible to find if you do not know it is there?
Comment 15 Marja van Waes 2011-09-30 08:38:56 CEST
I just checked with Mageia 1 DVD.

The first screen I see when booting the DVD, with a lot of colored words, disappears so quickly that I don't have the slightest idea what it says.

After that one, in the next screen, for kernel I could choose between:
Default, Safe settings, No ACPI and No local ACPI.

@ tv Is it possible to add the divider=10 option there?
Comment 16 Dave Hodgins 2011-09-30 08:52:31 CEST
It's only taken 7 months for me to find out this parameter exists.
grep divider /usr/src/linux-2.6.38.8-6.mga/Documentation/kernel-parameters.txt
doesn't show anything.

It's an undocumented kernel parameter that only partially fixes the problem.

Using a 100HZ kernel cuts the cpu usage in half over using the undocumented
divider option.

Please use a kernel that uses 100HZ for the installer.  Preferably one that
doesn't require PAE, and if installing under VB, make that the default for
the installed kernel.
Comment 17 Thierry Vignaud 2011-09-30 09:43:04 CEST
I disagree: this will only work for the couple of users who'll search for it.

Others will just discard Mga as the slow distro.

I think the best solution would be for the kernel to detect the VB environment and switch to 100Hz.
eg: looking for VBOXBIOS in DSDT
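
As a rough userspace analogue of that idea (not the in-kernel DSDT check suggested above), a VirtualBox guest can usually be recognised from the DMI strings exported under /sys/class/dmi/id. The sketch below assumes the default VirtualBox DMI values ("VirtualBox" as product name, "innotek GmbH" as vendor); the function names are hypothetical.

```python
from pathlib import Path


def matches_virtualbox(text):
    """True if a DMI string looks like a default VirtualBox value."""
    lowered = text.lower()
    return "virtualbox" in lowered or "innotek" in lowered


def looks_like_virtualbox(dmi_dir="/sys/class/dmi/id"):
    """Check a few DMI fields; missing or unreadable files count as no match."""
    for name in ("product_name", "sys_vendor", "bios_version"):
        try:
            text = (Path(dmi_dir) / name).read_text()
        except OSError:
            continue
        if matches_virtualbox(text):
            return True
    return False


if __name__ == "__main__":
    print("VirtualBox guest:", looks_like_virtualbox())
```

A user can override the DMI strings in the VM settings, so this is a heuristic at best.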

Summary: mkinitrd fails under VirtualBox => slow OS under VirtualBox due to HZ=1000 (was mkinitrd fails in VB)

Comment 18 Thierry Vignaud 2011-09-30 10:29:51 CEST
Maybe simpler would be to detect it in syslinux and automatically add the parameter

Source RPM: drakx-installer => mageia-gfxboot-theme / drakx-installer-images

Comment 19 Dave Hodgins 2011-10-01 16:18:36 CEST
Note from comment 11,
using the divider=10, mkinitrd took 7 min, 50 seconds
using the 100 HZ, mkinitrd took 4 min, 2 seconds.

While the divider does help, it is not the same as a 100 HZ kernel.

It was enough of a help to get the install to finish on second mkinitrd try,
however there is noticeable sluggishness with the system running with the
divider=10, compared to using a kernel with 100HZ.
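
The ratio behind those two timings can be checked with a few lines of Python. The host figure of 47 seconds is taken from comment 12; everything else is from comment 11, and the helper name is made up for the sketch.

```python
def to_seconds(minutes, seconds=0):
    """Convert a minutes/seconds pair to seconds."""
    return minutes * 60 + seconds


# mkinitrd run times reported in comments 11 and 12.
with_divider = to_seconds(7, 50)    # 1000 HZ kernel booted with divider=10
native_100hz = to_seconds(4, 2)     # server kernel, 100 HZ
on_host = to_seconds(0, 47)         # same command on the host

if __name__ == "__main__":
    print(f"divider=10 vs 100 HZ kernel: {with_divider / native_100hz:.2f}x")
    print(f"100 HZ guest vs host:        {native_100hz / on_host:.2f}x")
```

That is roughly a factor of 1.94 between the divider and a real 100 HZ kernel, so "about half" is a fair summary.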
Comment 20 Thierry Vignaud 2011-10-04 17:58:59 CEST
Erwan, would that be possible?

CC: (none) => erwanaliasr1, thierry.vignaud

Comment 21 Thierry Vignaud 2011-10-04 17:59:39 CEST
Erwan: see comment #18
Comment 22 Erwan VELU 2011-10-05 10:18:48 CEST
Syslinux is able to detect the cpu flag that indicates we are running under a virtualized setup.
With that approach, the parameter would also be applied under kvm or vmware, but I do think this is a good idea. Do you agree?

I can have a look at it once the migration to syslinux 4 is completed.
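
For anyone who wants to see what that flag looks like from a running Linux guest, here is a small Python sketch that looks for the "hypervisor" entry in the flags line of /proc/cpuinfo (x86 only). It is only the userspace view of the same CPUID bit, not the Syslinux code, and the function name is an assumption.

```python
def has_hypervisor_flag(cpuinfo_text):
    """True if any 'flags' line in /proc/cpuinfo-style text lists 'hypervisor'."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags" and "hypervisor" in value.split():
            return True
    return False


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            print("hypervisor flag present:", has_hypervisor_flag(handle.read()))
    except OSError:
        print("/proc/cpuinfo is not readable on this system")
```

The flag says nothing about which hypervisor is in use, which is why the kvm and vmware question above matters.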
Comment 23 Thierry Vignaud 2011-10-05 14:34:31 CEST
I agree.
And I offer you this bug :-)

Assignee: thierry.vignaud => erwanaliasr1

Comment 24 Thomas Backlund 2011-11-12 19:30:24 CET
Has anyone actually verified that this happens on kvm or vmware?
Comment 25 Thierry Vignaud 2012-01-27 11:12:44 CET
Erwan, any news?
Manuel Hiebel 2012-01-27 16:50:30 CET

Blocks: (none) => 4299

Comment 26 Guillaume Rousse 2012-05-02 21:48:27 CEST
Target set to mageia 3, as installer changes are needed.

CC: (none) => guillomovitch
Target Milestone: --- => Mageia 3

Guillaume Rousse 2012-05-02 21:49:07 CEST

Blocks: 4299 => (none)

Comment 27 Marja van Waes 2012-05-26 13:08:32 CEST
Hi,

This bug was filed against cauldron, but we do not have cauldron at the moment.

Please report whether this bug is still valid for Mageia 2.

Thanks :)

Cheers,
marja

Keywords: (none) => NEEDINFO

Comment 28 Thierry Vignaud 2012-06-08 14:15:32 CEST
Thomas, is there a way to dynamically change this parameter from userspace?
Or can kernel be patched to change divider at runtime, on bootstrapping, in order to reduce it when detecting vbox?
Comment 29 Marja van Waes 2012-06-28 21:26:15 CEST
@ tmb

Can you answer tv's question in comment 28?

(In reply to comment #28)
> Thomas, is there a way to dynamically change this parameter from userspace?
> Or can kernel be patched to change divider at runtime, on bootstrapping, in
> order to reduce it when detecting vbox?
Comment 30 Marja van Waes 2012-07-06 15:05:02 CEST
Please look at the bottom of this mail to see whether you're the assignee of this  bug, if you don't already know whether you are.


If you're the assignee:

We'd like to know for sure whether this bug was assigned correctly. Please change status to ASSIGNED if it is, or put OK on the whiteboard instead.

If you don't have a clue and don't see a way to find out, then please put NEEDHELP on the whiteboard.

Please assign back to Bug Squad or to the correct person to solve this bug if we were wrong to assign it to you, and explain why.

Thanks :)

**************************** 

@ the reporter and persons in the cc of this bug:

If you have any new information that wasn't given before (like this bug being valid for another version of Mageia, too, or it being solved) please tell us.

@ the reporter of this bug

If you didn't reply yet to a request for more information, please do so within two weeks from now.

Thanks all :-D
Comment 31 Dave Hodgins 2012-07-30 02:45:37 CEST
During testing of Bug 6694, it appears qemu is affected too.

Adding the divider=10 does help, but not as much as switching to the
server kernel.
Marja van Waes 2012-08-18 12:18:59 CEST

Keywords: NEEDINFO => (none)
Whiteboard: (none) => (MGA2)

Comment 32 claire robinson 2012-09-10 13:39:10 CEST
swecarp found this with 3alpha1 in virtualbox

Looks to be the same problem

http://i.imgur.com/WKNWt.png
claire robinson 2012-09-10 14:02:10 CEST

CC: (none) => eeeemail

Mårten Ström 2012-09-10 14:34:06 CEST

CC: (none) => marten

Comment 33 Thierry Vignaud 2012-09-10 14:54:58 CEST
Bumping priority so that we fix it for mga3.
We could either:

- use a HZ=100 kernel for installer (one more flavor :-( since
  server flavor won't work everywhere)

- patch kernel so that it dynamically adjusts HZ at boot time
  when detecting vbox

- detect in syslinux if we're under vbox and add the "divider"
  parameter (see comment #17 &  #18)

Priority: Normal => release_blocker
CC: (none) => ennael1
Severity: normal => critical

Comment 34 Thierry Vignaud 2012-09-10 15:05:43 CEST
we could also always use "divider=10" in our syslinux/isolinux config.
It won't be copied to the installed system, which is both:
- nice: it won't affect systems not running under vbox
- sad: it won't fix booting installed systems under vbox
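
A minimal Python sketch of what "always add the parameter" could look like when generating a boot entry. The sample append line in the usage part is invented for illustration and is not taken from the real Mageia isolinux.cfg; the function name is hypothetical too.

```python
def add_kernel_parameter(append_line, parameter="divider=10"):
    """Return append_line with parameter added, unless that key is already set."""
    key = parameter.split("=", 1)[0]
    for word in append_line.split():
        if word == key or word.startswith(key + "="):
            # Leave an existing setting alone, whatever its value.
            return append_line
    return append_line.rstrip() + " " + parameter


if __name__ == "__main__":
    # Hypothetical boot entry, for illustration only.
    print(add_kernel_parameter("append initrd=all.rdz vga=788"))
```

Running it a second time on its own output leaves the line unchanged, so the helper is safe to apply repeatedly.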
Thierry Vignaud 2012-09-10 15:08:09 CEST

CC: (none) => mageia

Comment 35 Anne Nicolas 2012-09-10 15:15:02 CEST
We will take some time before alpha2 to clean syslinux integration in Mageia so that we are able to use syslinux detection.
claire robinson 2012-09-10 15:18:04 CEST

Whiteboard: (MGA2) => (MGA2) 3alpha1

Comment 36 Thierry Vignaud 2012-09-10 15:27:09 CEST
Note that this only happens on CPUs w/o VT-x or AMD-V (which includes new low
end CPUs)

Summary: slow OS under VirtualBox due to HZ=1000 (was mkinitrd fails in VB) => slow OS under VirtualBox on CPU w/o HW support for virt, due to HZ=1000 (was mkinitrd fails in VB)

Joe Shmoe 2012-09-11 18:47:14 CEST

CC: (none) => callimera.42

claire robinson 2012-10-12 12:01:26 CEST

Whiteboard: (MGA2) 3alpha1 => (MGA2) 3alpha1 3alpha2

Marja van Waes 2012-12-10 16:52:30 CET

Whiteboard: (MGA2) 3alpha1 3alpha2 => 3alpha1 3alpha2 (MGA2)

Comment 37 Nicolas Lécureuil 2013-03-27 11:07:50 CET
is this still valid now that we use dracut ?

CC: (none) => nicolas.lecureuil

Comment 38 Thierry Vignaud 2013-03-27 18:19:51 CET
Yes, that's totally unrelated.
Comment 39 Dave Hodgins 2013-04-23 03:00:10 CEST
Removing the release blocker tag, as there is a partial
workaround, and there's been two releases with it set.

Priority: release_blocker => High

Comment 40 Dave Hodgins 2013-07-07 05:34:14 CEST
Removing "on CPU w/o HW support for virt" from the subject.

While it's less of a problem on cpus with hw support for virt,
it's still a problem.

Summary: slow OS under VirtualBox on CPU w/o HW support for virt, due to HZ=1000 (was mkinitrd fails in VB) => slow OS under VirtualBox, due to HZ=1000 (was mkinitrd fails in VB)

Comment 41 Dave Hodgins 2013-07-07 05:43:39 CEST
Btw, during testing of the xen update, I can confirm this bug affects
hardware with virt support, and isn't limited to virtualbox.

Using full hvm emulation under xen, it's much slower than virtualbox,
and on my new quad core, still couldn't run the mkinitrd in under 10
minutes, without the divider=10 option.
Dave Hodgins 2013-08-05 01:36:27 CEST

Whiteboard: 3alpha1 3alpha2 (MGA2) => 4alpha1

Comment 42 Shlomi Fish 2013-11-25 18:19:47 CET
Hi all,

this problem is still present in Mageia 4 Beta 2 KDE4-i586-LiveCD with VirtualBox running on top of Mageia Cauldron. The installer won't exit. After adding "divider=10" to the boot commands, it finishes the installation fine.

Regards,

-- Shlomi Fish

CC: (none) => shlomif

Comment 43 Dave Hodgins 2013-11-25 21:59:59 CET
Note, that I suggested trying the divider=10 option, due to journalctl error
messages, that indicate to me, too many interrupts were being generated.
Nov 25 18:01:10 localhost rtkit-daemon[2533]: The canary thread is apparently starving. Taking action.
Nov 25 18:01:10 localhost rtkit-daemon[2533]: Demoting known real-time threads.
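
To check a saved journal for the same symptom, something like the following Python sketch could be used. It only matches the message text quoted above; the function name is made up, and the presence of the message is a hint, not proof, that timer interrupts are the cause.

```python
STARVATION_MARKER = "The canary thread is apparently starving"


def count_starvation_events(lines):
    """Count rtkit-daemon 'canary thread starving' messages in log lines."""
    return sum(
        1
        for line in lines
        if "rtkit-daemon" in line and STARVATION_MARKER in line
    )


if __name__ == "__main__":
    sample = [
        "Nov 25 18:01:10 localhost rtkit-daemon[2533]: The canary thread is apparently starving. Taking action.",
        "Nov 25 18:01:10 localhost rtkit-daemon[2533]: Demoting known real-time threads.",
    ]
    print(count_starvation_events(sample))
```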

Shlomi, can you add what the host cpu is?
Comment 44 Shlomi Fish 2013-11-26 07:22:12 CET
(In reply to Dave Hodgins from comment #43)
> Note, that I suggested trying the divider=10 option, due to journalctl error
> messages, that indicate to me, too many interrupts were being generated.
> Nov 25 18:01:10 localhost rtkit-daemon[2533]: The canary thread is
> apparently starving. Taking action.
> Nov 25 18:01:10 localhost rtkit-daemon[2533]: Demoting known real-time
> threads.
> 
> Shlomi, can you add what the host cpu is?

Yes, it's a «Intel(R) Core(TM) i3-2100 CPU @ 3.10GHz» (from «cat /proc/cpuinfo»). The machine's SPECs are:

[QUOTE]

    An Intel Core i3 CPU (x86-64).
    8 GB of RAM.
    Intel Corporation Sandy Bridge Integrated Graphics Controller (rev 09)
    A 2 TB hard-disk.
    A 21" Wide LCD Screen by LG.
    Intel Corporation Cougar Point High Definition Audio Controller.
    Intel Corporation 82579V Gigabit Network Connection.

[/QUOTE]
Comment 45 Dave Hodgins 2014-03-26 00:47:51 CET
On my 2005 single core celeron system, now running Mageia 4, even with the
divider=10 option, dracut cannot run in a vb guest, in under 10 minutes.
The time limit has to be increased, if we still want to support running
under vb, on older systems.
Guillaume Rousse 2014-03-26 09:04:05 CET

CC: guillomovitch => (none)

Florian Hubold 2014-05-17 18:27:51 CEST

CC: (none) => doktor5000
Hardware: i586 => All
Target Milestone: Mageia 3 => Mageia 5

Thierry Vignaud 2014-07-17 15:29:23 CEST

Source RPM: mageia-gfxboot-theme / drakx-installer-images => syslinux, mageia-gfxboot-theme, drakx-installer-images

Comment 46 Dick Gevers 2014-11-15 08:48:33 CET
@David or anyone: please advise if this applies to 5beta1 (because 4 installer won't be fixed), Thanks
Comment 47 Marja van Waes 2014-11-15 09:39:39 CET
(In reply to Dick Gevers from comment #46)
> @David or anyone: please advise if this applies to 5beta1 (because 4
> installer won't be fixed), Thanks

Yes, it does, I've seen people discussing it on #mageia-qa last week.
both Dave Hodgins and tmb advised others to use or try using "divider=10", they would have known if that was no longer needed.

Whiteboard: 4alpha1 => 5beta1

Comment 48 Rémi Verschelde 2014-11-20 00:21:49 CET
@Anne, Erwan: Do you think we can get this fixed for mga5?

CC: (none) => remi

Thierry Vignaud 2015-05-01 20:00:46 CEST

CC: (none) => sysadmin-bugs
Component: Installer => Release (media or process)

Comment 49 Thierry Vignaud 2016-07-01 17:47:00 CEST
I think it's time we advise testers to use virt-manager instead of VirtualBox then...
Thierry Vignaud 2016-07-01 17:47:08 CEST

Source RPM: syslinux, mageia-gfxboot-theme, drakx-installer-images => syslinux, mageia-gfxboot-theme, drakx-installer-images, kernel

Dave Hodgins 2016-07-12 21:13:21 CEST

Blocks: (none) => 18932

Comment 50 Thomas Backlund 2016-07-20 14:21:45 CEST
Is this still an issue with VirtualBox 5.1 series ?
Comment 51 Dave Hodgins 2016-07-23 17:18:51 CEST
My old i586 system died, so the impact is minor on my current system, but
there is still a noticeable difference testing an install under vb with or
without the divider=10 option.
Comment 52 Samuel Verschelde 2016-10-10 17:26:13 CEST
Could someone summarize this issue and possible solutions?
Comment 53 Thierry Vignaud 2016-10-10 17:36:59 CEST
Basically network can be really slow with kernels built with high (aka normal) HZ values, such as in most distros.
See https://www.google.fr/search?q=divider%3D10+site:virtualbox.org

A workaround is to add the "divider=10" kernel parameter to the boot loader config.
A solution is to switch from VirtualBox to virt-manager (virtio for everything anyway :-) )

There were some improvements with VBox 5.1.x so we would like some VBox users to confirm whether it's better with VBox 5.1
Comment 54 Thierry Vignaud 2016-10-10 17:37:48 CEST
See also comment #33 about possible way to auto add the "divider=10" parameter
Comment 55 Samuel Verschelde 2016-10-10 22:52:30 CEST
(In reply to Thierry Vignaud from comment #53)
> There were some improvements with VBox 5.1.x so we would like some VBox
> users to confirm whether it's better with VBox 5.1

Adding NEEDINFO keyword for that.

Keywords: (none) => NEEDINFO

Samuel Verschelde 2016-10-10 22:52:46 CEST

Target Milestone: Mageia 5 => ---

Samuel Verschelde 2016-11-10 10:36:01 CET

Blocks: 18932 => (none)

Comment 56 Samuel Verschelde 2017-07-10 16:28:26 CEST
It would be good to add a note about this in the Errata.
