| Summary: | kernel on Mageia 4 does not do KMS properly on Dell PowerEdge R610 | ||
|---|---|---|---|
| Product: | Mageia | Reporter: | David Walser <luigiwalser> |
| Component: | RPM Packages | Assignee: | Thomas Backlund <tmb> |
| Status: | RESOLVED FIXED | QA Contact: | |
| Severity: | normal | ||
| Priority: | Normal | CC: | sysadmin-bugs |
| Version: | 4 | ||
| Target Milestone: | --- | ||
| Hardware: | i586 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Source RPM: | kernel | CVE: | |
| Status comment: | |||
| Bug Depends on: | 14301 | ||
| Bug Blocks: | |||
| Attachments: | dmesg output, lsmod output from Dell server, lsmod from 3.11.2, dmesg from 3.11.2 | ||
|
Description
David Walser
2014-03-26 23:12:14 CET
David Walser
2014-04-28 23:41:16 CEST
Component: Release (media or process) => RPM Packages

Pasting specs from bug 13264:
Dell PowerEdge R610
Intel Xeon X5670, 2 6-core CPUs with hyperthreading
Intel ICH9 chipset
Matrox G200eW WPCM450
Card:Matrox Millennium G series (single head): Matrox Electronics Systems Ltd.|MGA G200eW WPCM450 [DISPLAY_VGA] (vendor:102b device:0532 subv:1028 subd:0236) (rev: 0a)

Can you also attach a dmesg from that system?

Created attachment 5134
dmesg output
These lines look relevant:
[ 4.574462] [drm:mga_vram_init] *ERROR* can't reserve VRAM
[ 4.574467] mgag200 0000:06:03.0: Fatal error during GPU init: -6
I also see those with the 3.14.17 in updates_testing.

Is uvesafb loaded on that system? If so, does it help to blacklist it?

It is not loaded.

Can you attach the output of lsmod?

Created attachment 5422
lsmod output from Dell server
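The two checks asked for above, pulling the DRM error lines out of dmesg and confirming whether uvesafb is loaded, can be sketched as a pair of small shell helpers. The function names are made up for illustration, and both read from stdin so they work on live output as well as on a saved attachment.

```shell
# Hypothetical helpers; the names are illustrative, not part of any tool.

# Print DRM/mgag200 error lines from dmesg-style text on stdin.
gpu_errors() {
    grep -E '\*ERROR\*|mgag200.*Fatal error'
}

# Succeed if module $1 appears in lsmod-style text on stdin.
# The first line of lsmod output is a header, so it is skipped.
module_loaded() {
    awk -v m="$1" 'NR > 1 && $1 == m { found = 1 } END { exit !found }'
}

# Typical use on the affected machine:
#   dmesg | gpu_errors
#   lsmod | module_loaded uvesafb && echo "uvesafb is loaded"
```

Had uvesafb been loaded, a `blacklist uvesafb` line in a file under /etc/modprobe.d/ is the usual way to keep it from being loaded automatically.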
I just built 3.11.2 from our kernel package SVN revision 487545 and it works fine.

OK, so there are minimal changes in the GPU code between 3.11 and 3.12, so maybe it is not really the GPU code that has been broken, but some acpi/mm/mtrr code change...

Can you get a dmesg and lsmod from running the 3.11 kernel too?

Created attachment 5423
lsmod from 3.11.2
Created attachment 5424
dmesg from 3.11.2
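With a dmesg from both the working 3.11.2 kernel and the failing one attached, comparing them is easier once the boot timestamps are stripped, since those differ on nearly every line. A minimal sketch, assuming the two logs have been saved under the placeholder names dmesg-good.txt and dmesg-bad.txt; `strip_ts` is a hypothetical helper:

```shell
# Hypothetical helper: drop the leading "[ seconds.micros]" stamp from
# each dmesg line read on stdin. Lines without a stamp pass through.
strip_ts() {
    sed -E 's/^\[ *[0-9]+\.[0-9]+\] ?//'
}

# Typical use (file names are placeholders):
#   strip_ts < dmesg-good.txt > good.txt
#   strip_ts < dmesg-bad.txt  > bad.txt
#   diff -u good.txt bad.txt | less
```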
Not having luck booting the first kernel I built in the bisect process. When grub selects it, the screen immediately goes completely black, and then a minute later dracut prints a bunch of errors about devices not existing (UUIDs which correspond to my swap, /, and /usr partitions). Usually the first message is about the megasas raid adapter, so I'm guessing the one I built is failing to initialize the hardware raid adapter properly. I have no idea why, as I tried starting with the configs from both 3.11.2 and 3.12-rc5, which both at least do boot.
My process for building and installing the kernel was extracted from our kernel spec, so maybe you can find some flaw in my process here, but it looks right to me.
git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
cd linux
git bisect start
git bisect bad v3.12-rc5
git bisect good v3.11
cp /boot/config-3.12.0-desktop-0.rc5.1.mga4 .config
make oldconfig
make -j 24 -s all
KernelVer=3.11.0 # based on the contents of the Makefile currently
install -m 644 System.map /home/admin/bisect/boot/System.map-$KernelVer
install -m 644 .config /home/admin/bisect/boot/config-$KernelVer
xz -c Module.symvers > /home/admin/bisect/boot/symvers-$KernelVer.xz
cp -f arch/x86/boot/bzImage /home/admin/bisect/boot/vmlinuz-$KernelVer
make INSTALL_MOD_PATH=/home/admin/bisect KERNELRELEASE=$KernelVer modules_install
rm -rf /home/admin/bisect/lib/firmware
find /home/admin/bisect/lib/modules -name "*.ko" | xargs -P 24 xz -6e
rm -f /home/admin/bisect/lib/modules/{build,source}
pushd /home/admin/bisect/lib/modules
/sbin/depmod -ae -b /home/admin/bisect -F /home/admin/bisect/boot/System.map-$KernelVer $KernelVer
pushd $KernelVer
modules=`find . -name "*.ko.[g,x]z"`
echo $modules | xargs -P 24 /sbin/modinfo | perl -lne 'print "$name\t$1" if $name && /^description:\s*(.*)/; $name = $1 if m!^filename:\s*(.*)\.k?o!; $name =~ s!.*/!!' > modules.description
popd
popd
pushd /home/admin/bisect
tar -cvf ../${KernelVer}.tar lib/modules/$KernelVer boot/*
popd
su -
cd /
tar -xvf /home/admin/${KernelVer}.tar
/sbin/installkernel $KernelVer
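One thing that may be worth checking in the procedure above, though nothing in this report confirms it was the cause: the commands hard-code KernelVer=3.11.0, while a kernel built from a git checkout that is not exactly on a tag can report a longer release string (for example with a `+` or a git-describe suffix appended). Modules are looked up under /lib/modules/ by the release the running kernel reports, so a mismatch could leave drivers such as megasas unfindable at boot. A minimal sketch of a sanity check, where `check_kernel_release` is a hypothetical helper and `make -s kernelrelease` is run in the built tree:

```shell
# Hypothetical helper: compare the name chosen for the install paths
# with the release string the built kernel reports.
check_kernel_release() {
    chosen=$1   # hand-picked name, e.g. the KernelVer value above
    built=$2    # output of `make -s kernelrelease` in the built tree
    if [ "$chosen" = "$built" ]; then
        echo "ok: $chosen"
    else
        echo "mismatch: installed as '$chosen', kernel reports '$built'"
        return 1
    fi
}

# Typical use, from the top of the kernel tree after the build:
#   check_kernel_release "$KernelVer" "$(make -s kernelrelease)"
```

Note also that KernelVer is a plain shell variable, so it is not set in the root shell started by `su -` and has to be assigned again there before the last two commands.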
This really doesn't make sense. I thought maybe the upstream kernel just didn't work and it was some patch in the Mageia kernel that made it work, but kernel-linus 3.12 from mga4 boots. So, I don't understand why the one I built in Comment 14 won't boot.

Regardless, I found a convoluted way to create a patch from 3.11.0 to the current contents of my git bisect and integrate that into our RPM build procedure, so I managed to build a kernel in an RPM that boots and works. So, step 1 => git bisect good.

I made it all the way through the git bisect and all of the kernels came up good. So, the issue was caused by a config change in our package, namely here:
http://svnweb.mageia.org/packages/cauldron/kernel/current/PATCHES/configs/i386.config?r1=496716&r2=496715&pathrev=496716

Can you try to disable: CONFIG_FB_SIMPLE

(In reply to Thomas Backlund from comment #17)
> Can you try to disable: CONFIG_FB_SIMPLE
I've played with that option as well as CONFIG_X86_SYSFB.
X86_SYSFB = n, FB_SIMPLE = n, works
X86_SYSFB = y, FB_SIMPLE = y, doesn't work
X86_SYSFB = y, FB_SIMPLE = n, doesn't work
X86_SYSFB = n, FB_SIMPLE = y, works
So the issue is actually X86_SYSFB, a new option that was added during 3.12 development, that when set to yes, breaks our server's console.
FB_SIMPLE does have a slight noticeable impact, with it set to y, I can at least see that megasas adapter initialization message at the beginning, with FB_SIMPLE set to n, I can't even see that. So it appears FB_SIMPLE is better off staying as yes.

I rebuilt the mga4 updates_testing kernel with X86_SYSFB = n, and the display works again :D

(In reply to claire robinson from comment #20)
> See also https://forums.mageia.org/en/viewtopic.php?f=7&t=8450
I haven't tried to run X on it. I would think that would be a different issue.
@David: There is now a kernel-3.14.19-1.mga4 building with X86_SYSFB disabled and also the interesting VFS fix I mentioned on IRC.
X86_SYSFB is also disabled in Cauldron in the upcoming 3.17-rc7 kernel.

Thanks Thomas! This is now running on our production Squid server.
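To confirm how a given kernel was built with respect to the two options discussed above, the config file installed under /boot can be inspected (the build commands earlier in this report copy from /boot/config-3.12.0-desktop-0.rc5.1.mga4, for instance). A minimal sketch; `config_state` is a hypothetical helper, not an existing tool:

```shell
# Hypothetical helper: print y, m, or n for option $2 in kernel config
# file $1. An option that is commented out ("is not set") or absent
# counts as n.
config_state() {
    val=$(sed -n "s/^CONFIG_$2=//p" "$1")
    echo "${val:-n}"
}

# Typical use on a running system:
#   config_state "/boot/config-$(uname -r)" X86_SYSFB
#   config_state "/boot/config-$(uname -r)" FB_SIMPLE
```

Going by the results reported above, a kernel where X86_SYSFB comes back as n should be unaffected on this hardware, whichever way FB_SIMPLE is set.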
David Walser
2014-11-14 22:20:52 CET
Depends on: (none) => 14301
Fixed in http://advisories.mageia.org/MGASA-2014-0453.html
Status: NEW => RESOLVED |