Bug 3186 - Update request: kernel-2.6.38.8-8.mga
Summary: Update request: kernel-2.6.38.8-8.mga
Status: RESOLVED FIXED
Alias: None
Product: Mageia
Classification: Unclassified
Component: Security
Version: 1
Hardware: All Linux
Priority: Normal, Severity: normal
Target Milestone: ---
Assignee: QA Team
QA Contact:
URL:
Whiteboard:
Keywords: validated_update
Depends on:
Blocks: 2240 2264 3128 3139 3210 3235
Reported: 2011-10-25 23:41 CEST by Thomas Backlund
Modified: 2011-11-11 20:22 CET
CC List: 5 users

See Also:
Source RPM: kernel
CVE:
Status comment:


Attachments
syslog output with bad page state x2 followed by null pointer dereference (8.47 KB, text/plain)
2011-10-26 19:02 CEST, Dave Hodgins
lshw output (26.02 KB, text/plain)
2011-11-01 18:03 CET, Dave Hodgins

Description Thomas Backlund 2011-10-25 23:41:10 CEST
There is now a kernel-2.6.38.8-7.mga to validate.
( and this time BS didn't eat half the packages... :) )

Suggested advisory:
---
This update addresses the following CVE:
* A flaw was found in the way ext4_ext_convert_to_initialized()
  splits two extents. Although ext has been updated in memory, it
  is not dirtied in either ext4_ext_convert_to_initialized() or
  ext4_ext_insert_extent(), so the disk layout is corrupted. A
  later write to the start of that extent then hits a BUG_ON().
  (CVE-2011-3638)

Other fixes in this release:
* enable PM_RUNTIME and USB_SUSPEND on netbook kernel too (mga #3139)
* re-enable usblp as it is needed by both usb-pp adapters and some 
  printers (mga #2240, #2264) (Note: cups is already patched to work
  with both usblp and libusb)
* backport rtlwifi from upstream 3.1 to add support for:
  * RTL8192SE/RTL8191SE PCIe
  * RTL8192DE/RTL8188DE PCIe
  * RTL8192CU/RTL8188CU USB
---
Thomas Backlund 2011-10-25 23:42:03 CEST

Blocks: (none) => 2240, 2264, 3128, 3139

Comment 1 Dave Hodgins 2011-10-26 01:41:40 CEST
The following package has bad signature:
/var/cache/urpmi/rpms/kernel-server-2.6.38.8-7.mga-1-1.mga1.i586.rpm: Missing signature (OK ((none)))

At least all of the packages are there this time. :-)

CC: (none) => davidwhodgins

Comment 2 Dave Hodgins 2011-10-26 05:36:00 CEST
Except for the missing sig on the kernel server package, everything looks
good on i586.

All 5 kernels install cleanly, and compile the virtualbox kernel
module.

Running make xconfig works in /usr/src/linux-2.6.38.8-7.mga/

In order to fix the kernel-server package signature, does that
mean the subrel will have to be bumped, and all of the packages
reissued?
Comment 3 Luan Pham 2011-10-26 05:49:38 CEST
I saw this error while booting an i586 Mageia 1 laptop:

b43-phy0 ERROR: PHY transmission error

Everything seems to work fine apart from that error.

I don't see that error on an x86_64 Mageia 1 laptop.

CC: (none) => pham182b

Comment 4 Thomas Backlund 2011-10-26 08:23:49 CEST
(In reply to comment #2)
> Except for the missing sig on the kernel server package, everything looks
> good on i586.
> 

> In order to fix the kernel-server package signature, does that
> mean the subrel will have to be bumped, and all of the packages
> reissued?

Nope.

I have re-signed the package so it should soon be on the mirrors.
Comment 5 Thomas Backlund 2011-10-26 08:28:35 CEST
(In reply to comment #3)
> I saw this error while booting an i586 Mageia 1 laptop:
> 
> b43-phy0 ERROR: PHY transmission error
> 
> Everything seems to work fine apart from that error.

That shouldn't be any different from previous kernels as this update does not change anything regarding b43.
Comment 6 Dave Hodgins 2011-10-26 19:02:54 CEST
Created attachment 1013
syslog output with bad page state x2 followed by null pointer dereference

Found the null pointer dereference output on the screen when I started
using the system today.  Used the magic sysrq keys to reboot.

Looking at the log after, it seems the system (cron, etc.) was still running.
Comment 7 Thomas Backlund 2011-10-26 20:09:12 CEST
Ok, that's bad.
Was that the first BUG/WARNING or other stack trace in the logs?
Comment 8 Dave Hodgins 2011-10-26 21:36:04 CEST
Yes.
$ grep kernel /var/log/syslog|wc -l
13092

Most of that is from rebooting during the testing of all 5 kernels.
$ grep -e "Call Trace" -e BUG /var/log/syslog|grep -v msec
Oct 26 03:28:45 hodgins kernel: BUG: Bad page state in process mgaapplet  pfn:2fce8
Oct 26 03:28:45 hodgins kernel: Call Trace:
Oct 26 03:51:33 hodgins kernel: BUG: Bad page state in process mgaapplet  pfn:26ee5
Oct 26 03:51:33 hodgins kernel: Call Trace:
Oct 26 04:03:05 hodgins kernel: BUG: unable to handle kernel NULL pointer dereference at 00000028
Oct 26 04:03:05 hodgins kernel: Call Trace:

I'd closed everything down except kde, and had locked the screen with the clock
screen saver around 0300, about half an hour before the first BUG.

No further BUG reports since then, although I selected the desktop kernel
when I rebooted, just past noon.
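
For anyone repeating this check across several log files, the grep pipeline above can be approximated in Python. This is only a sketch: the function names are made up for illustration, and it assumes the log has already been read into an iterable of lines.

```python
import re

# Same two patterns as: grep -e "Call Trace" -e BUG
_PATTERN = re.compile(r"Call Trace|BUG")


def find_kernel_bugs(lines):
    """Keep lines matching 'Call Trace' or 'BUG', dropping any that contain 'msec'."""
    return [ln for ln in lines if _PATTERN.search(ln) and "msec" not in ln]


def count_bug_reports(lines):
    """Count only the 'BUG' headline lines among the matches (hypothetical helper)."""
    return sum(1 for ln in find_kernel_bugs(lines) if "BUG" in ln)
```

Passing an open file object for /var/log/syslog to find_kernel_bugs() should give the same lines as the shell pipeline, since both are plain substring matches applied line by line.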
Comment 9 Thomas Backlund 2011-10-26 22:00:24 CEST
Ok,
after checking some more, it might not be related to this update at all, since there have been similar reports for the 2.6.38 series on other distros too, but I'll set up a fs stress-tester here.


Do you use ext4?

Do you have any of these in your system?
  * RTL8192SE/RTL8191SE PCIe
  * RTL8192DE/RTL8188DE PCIe
  * RTL8192CU/RTL8188CU USB
Comment 10 Dave Hodgins 2011-10-26 22:17:45 CEST
Yes on ext4.  No on RTL*.  A quick google search indicates those are
wireless drivers.  I do have a wireless dongle, but it was not connected
between booting the server kernel, and the BUG.
After connecting it, lspcidrake shows it's using rt73usb.
Comment 11 Dave Hodgins 2011-10-27 00:14:46 CEST
Confirming that the kernel-server package on the mirrors is now signed.
Comment 12 Dave Hodgins 2011-10-28 04:12:59 CEST
Just fyi, I just got hit with another "unable to handle kernel NULL pointer
dereference".  This one was not preceded by Bad Page state, or any other
BUG.

This time, in the syslog, where the BUG data should be, there are just
a few hundred periods.

Also with the server kernel.
Comment 13 claire robinson 2011-10-30 21:18:45 CET
I'm not sure if this is relevant, but syslog mentions redhat on both x86_64 and i586. I just updated tonight, so I will report any errors that occur after some usage.

kernel: device-mapper: ioctl: 4.19.1-ioctl (2011-01-07) initialised: dm-devel@redhat.com

It's not a new mention of redhat; apparently it was present with the older kernel too, so it's probably nothing to worry about.

All hardware works fine on both systems, no regressions. No error messages in dmesg or syslog.
Manuel Hiebel 2011-10-31 20:01:36 CET

Blocks: (none) => 3235

Comment 14 claire robinson 2011-10-31 22:09:40 CET
i586
----

A couple of messages from /var/log/kernel/errors.log

During S3 suspend (I'm not sure whether it occurs when suspending or when coming back out):

kernel: fb: conflicting fb hw usage inteldrmfb vs VESA VGA - removing generic driver


A different message when using the button to turn the wifi on/off on an old laptop. The message occurs when turning the wifi back on, within a second or two.

kernel: ipw2200: Failed to send CARD_DISABLE: Command timed out.

also in /var/log/kernel/warnings.log

Oct 31 08:30:51 localhost kernel: btusb 2-1:1.0: no reset_resume for driver btusb?
Oct 31 08:30:51 localhost kernel: btusb 2-1:1.1: no reset_resume for driver btusb?


These appear after resuming. There is only one USB bluetooth adapter, which doesn't work after suspending. I filed a bug about this some time ago, but it appears it's still present in this kernel too.


x86_64
------

When booting, in errors.log:

Oct 31 08:24:07 mega kernel: pci 0000:05:00.0: ignoring class 01 (doesn't match header type 01)

lspci -v

05:00.0 Non-VGA unclassified device: Gammagraphx, Inc. (or missing ID) Device 0001 (rev 08)
	!!! Invalid class 0000 for header type 01
	Flags: bus master, fast devsel, latency 0
	Memory at <ignored> (32-bit, non-prefetchable) [disabled]
	Memory at <invalid-64bit-slot> (64-bit, non-prefetchable) [disabled]
	Bus: primary=18, secondary=00, subordinate=01, sec-latency=0
	!!! Unknown I/O range types 1c/0
	!!! Unknown memory range types 20/1
	!!! Unknown prefetchable memory range types 24/1
	Expansion ROM at <ignored> [disabled]


To be honest, I'm not sure what that is. I'll have to check. I do have a cctv card in there, but that is a SAA7130. Without checking, I'm not sure if the motherboard has built-in vga.
Comment 15 Thomas Backlund 2011-11-01 15:21:43 CET
(In reply to comment #12)
> Just fyi, I just got hit with another "unable to handle kernel NULL pointer
> dereference".  This one was not preceded by Bad Page state, or any other
> BUG.
> 
> This time, in the syslog, where the BUG data should be, there are just
> a few hundred periods.
> 
> Also with the server kernel.


I haven't been able to reproduce it so far. Can you describe your hardware in more detail, so we can see if anything else is going on?
I mean arch/chipset/cpu/nic/gpu.

And when it comes to the gpu, does this crash happen with proprietary drivers?

I will also build a new test kernel for you with more debugging enabled to see if we can pinpoint it.
Comment 16 Dave Hodgins 2011-11-01 18:03:48 CET
Created attachment 1035
lshw output

In /etc/X11/xorg.conf, it specifies the ati driver, which in
turn loads the radeon module.  The ati radeon 9200SE doesn't use
a firmware.  I'm curious what part of attachment 1013 indicates
it's video related?

Except for the 3 BUGs on Oct 26th, and one on the 28th, it's been
working fine since.  No idea if I can recreate it.
Comment 17 Thomas Backlund 2011-11-01 18:47:02 CET
(In reply to comment #16)
> Created attachment 1035
> lshw output
> 
> In /etc/X11/xorg.conf, it specifies the ati driver, which in
> turn loads the radeon module.  The ati radeon 9200SE doesn't use
> a firmware.  I'm curious what part of attachment 1013 indicates
> it's video related?
> 

Nothing really, I just want to be able to replicate the system as closely as I can.
And it happens from time to time that some drivers overwrite memory belonging to other apps/drivers ...

> Except for the 3 BUGs on Oct 26th, and one on the 28th, it's been
> working fine since.  No idea if I can recreate it.

But that is still 4 in ~ one week, so that's really bad, even if it might be system specific.
Comment 18 Florian Hubold 2011-11-03 13:05:06 CET
(In reply to comment #0)
> There is now a kernel-2.6.38.8-7.mga to validate.
> 
> Other fixes in this release:
> * re-enable usblp as it is needed by both usb-pp adapters and some 
>   printers (mga #2240, #2264) (Note: cups is already patched to work
>   with both usblp and libusb)


Just to report back, the following issue is solved by using this update candidate:
https://bugs.mageia.org/show_bug.cgi?id=3235

Also it enabled the same reporter to use a Canon MX340 printer including the scanner, which was not possible before with the Canon vendor drivers. Link to the German forum for reference: https://forums.mageia.org/de/viewtopic.php?f=7&t=312

CC: (none) => doktor5000

Comment 19 Thomas Backlund 2011-11-04 07:38:04 CET
There is now a 2.6.38.8-8.mga to validate.

In addition to the fixes in 2.6.38.8-7.mga, there are some more memory management fixes, isci driver fixes, drm i915 SNB fixes, ata_piix fixes for SNB, an added rts_pstor driver, and fixes closing some more CVEs.

The changes since 2.6.38.8-7.mga:
- add rts_pstor Realtek PCIe card reader (mga #3210)
- ata_piix: make DVD Drive recognisable on systems with Intel Sandybridge chipsets
- mm: fix negative commitlimit when gigantic hugepages are allocated
- mm: fix race between mremap and removing migration entry
- mm, hotplug: fix error handling in mem_online_node()
- drm/i915: Fix gen6 (SNB) missed BLT ring interrupts
- drm/i915: Apply HWSTAM workaround for BSD ring on SandyBridge
- mm/nommu.c: fix remap_pfn_range()
- mm: vmscan: do not use page_count without a page pin
- mm: oom: task->mm == NULL doesn't mean the memory was freed
- mm: compaction: ensure that the compaction free scanner does not
  move to the next zone
- mm: compaction: abort compaction if too many pages are isolated and
  caller is asynchronous V2
- ipv4: Kill spurious opt->srr check in ip_options_rcv_srr()
  * closes CVE-2011-4087: Multiple remote denial of service in Linux
    bridge networking (already partly fixed in 2.6.38.4 and 2.6.38.8)
- mm/oom: fix integer overflow of points in oom_badness (CVE-2011-4097)
- crypto/ghash: Avoid null pointer dereference if no key is set (CVE-2011-4081)
- scsi/isci: fix sata response handling
- scsi/isci: fix 32-bit operation when CONFIG_HIGHMEM64G=n
- scsi/isci: change sas phy timeouts from 54us to 59us
- scsi/isci: fix sgpio register definitions
- scsi/isci: fix support for large smp requests
- scsi/isci: fix missed unlock in apc_agent_timeout()

I will rewrite a nicer advisory later.

Blocks: (none) => 3210
Summary: Update request: kernel-2.6.38.8-7.mga => Update request: kernel-2.6.38.8-8.mga

Comment 20 claire robinson 2011-11-07 16:35:13 CET
No errors here. x86_64 & i586
Comment 21 Dave Hodgins 2011-11-08 02:00:07 CET
dkms-virtualbox compiled cleanly on all 5 i586 kernels.
Started kde4 via gdm, and all hardware working.

I've been running 2.6.38.8-8 for about 72 hours now, without
any errors encountered.  I'd like to give it at least a couple
more days before validating the update though.
Comment 22 David GEIGER 2011-11-08 06:58:20 CET
Hello,

Tested the kernel-server-2.6.38.8-8.mga on Mageia release 1 (Official) for x86_64
and for me it works perfectly too.

The pata_amd module from bug #1525 is also OK.

CC: (none) => geiger.david68210

Comment 23 Thomas Backlund 2011-11-08 07:03:33 CET
(In reply to comment #21)
> dkms-virtualbox compiled cleanly on all 5 i586 kernels.
> Started kde4 via gdm, and all hardware working.
> 
> I've been running 2.6.38.8-8 for about 72 hours now, without
> any errors encountered.  

That's good to hear. Hopefully it will stay that way.

> I'd like to give it at least a couple
> more days before validating the update though.

Agreed, 
It's good to get it properly tested.
Comment 24 Thomas Backlund 2011-11-08 09:55:49 CET
Updated advisory:
_________________
This update addresses the following CVEs:
* A flaw was found in the way ext4_ext_convert_to_initialized()
  splits two extents. Although ext has been updated in memory, it
  is not dirtied in either ext4_ext_convert_to_initialized() or
  ext4_ext_insert_extent(), so the disk layout is corrupted. A
  later write to the start of that extent then hits a BUG_ON().
  (CVE-2011-3638)
* The ghash_update function passes a pointer to gf128mul_4k_lle 
  which will be NULL if ghash_setkey is not called or if the most
  recent call to ghash_setkey failed to allocate memory. This
  causes an oops. 
  This is trivially triggered from unprivileged userspace through 
  the AF_ALG interface by simply writing to the socket without 
  setting a key. Fix this up by returning an error code in the
  null case. (CVE-2011-4081)
* An integer overflow will happen on 64-bit archs if the sum of a
  task's rss, swapents and nr_ptes exceeds (2^31)/1000. This was
  introduced by commit f755a04 "oom: use pte pages in OOM score".
  This can cause a denial of service. (CVE-2011-4097)
* Commit 462fb2af "bridge: Sanitize skb before it enters the IP
  stack", introduced in Linux kernel 2.6.37, caused several
  regressions that can be used to trigger remote denial of service
  attacks when bridging is in use. (CVE-2011-4087, already partly
  fixed in upstream 2.6.38.4 and 2.6.38.8)


Other fixes in this release:
* enable PM_RUNTIME and USB_SUSPEND on netbook kernel too (mga #3139)
* re-enable usblp as it is needed by both usb-pp adapters and some 
  printers (mga #2240, #2264, #3235) (Note: cups is already patched to
  work with both usblp and libusb)
* add rts_pstor Realtek PCIe card reader (mga #3210)
* backport rtlwifi from upstream 3.1 to add support for:
  * RTL8192SE/RTL8191SE PCIe
  * RTL8192DE/RTL8188DE PCIe
  * RTL8192CU/RTL8188CU USB
* ata_piix: make DVD Drive recognisable on systems with Intel Sandybridge
  chipsets (errata for some broken chip revisions)
* mm fixes:
  * fix negative commitlimit when gigantic hugepages are allocated
  * fix race between mremap and removing migration entry
  * hotplug: fix error handling in mem_online_node()
  * nommu: fix remap_pfn_range()
  * vmscan: do not use page_count without a page pin
  * oom: task->mm == NULL doesn't mean the memory was freed
  * compaction: 
    * ensure that the compaction free scanner does not move to the
      next zone
    * abort compaction if too many pages are isolated and caller
      is asynchronous
* drm/i915 SandyBridge:
  * Fix missed BLT ring interrupts
  * Apply HWSTAM  workaround for BSD ring
* scsi/isci fixes:
  * fix sata response handling
  * fix 32-bit operation when CONFIG_HIGHMEM64G=n
  * change sas phy timeouts from 54us to 59us
  * fix sgpio register definitions
  * fix support for large smp requests
  * fix missed unlock in apc_agent_timeout()
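
The CVE-2011-4081 entry above describes a missing NULL check. As a rough illustration only, the following Python toy models that pattern. It performs no real GHASH computation, and the class name, method names and the ERR_NO_KEY value are all invented for this sketch (the advisory only says "an error code", not which one).

```python
ERR_NO_KEY = -1  # hypothetical error code; the advisory does not name the real one


class GhashModel:
    """Toy stand-in for a ghash context; it does no real GHASH math."""

    def __init__(self):
        # Plays the role of the gf128mul_4k_lle pointer: None until a key is set.
        self.table = None
        self.absorbed = 0

    def setkey(self, key):
        """Store the key material, which makes the table non-None."""
        self.table = bytes(key)
        return 0

    def update_unpatched(self, data):
        # Uses the table unconditionally. With no key set, len(None) raises
        # TypeError, the Python analogue of the oops described above.
        self.absorbed += len(data) * len(self.table)
        return 0

    def update_patched(self, data):
        # The guard the advisory describes: return an error in the null case.
        if self.table is None:
            return ERR_NO_KEY
        self.absorbed += len(data) * len(self.table)
        return 0
```

Calling update_patched() on a fresh GhashModel returns ERR_NO_KEY instead of failing, which mirrors "writing to the socket without setting a key" being rejected rather than crashing.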
Comment 25 Dave Hodgins 2011-11-11 03:19:55 CET
I haven't seen any problems with this kernel.

If no one objects, I'll validate the kernel update tomorrow afternoon.

Roughly 18 hours from now.
Comment 26 Dave Hodgins 2011-11-11 19:59:36 CET
Validating the update.

Could someone from the sysadmin team push the srpm
kernel-2.6.38.8-8.mga1.src.rpm
from Core Updates Testing to Core Updates?

Advisory:
This kernel update addresses the following CVEs:
* A flaw was found in the way ext4_ext_convert_to_initialized()
  splits two extents. Although ext has been updated in memory, it
  is not dirtied in either ext4_ext_convert_to_initialized() or
  ext4_ext_insert_extent(), so the disk layout is corrupted. A
  later write to the start of that extent then hits a BUG_ON().
  (CVE-2011-3638)
* The ghash_update function passes a pointer to gf128mul_4k_lle 
  which will be NULL if ghash_setkey is not called or if the most
  recent call to ghash_setkey failed to allocate memory. This
  causes an oops. 
  This is trivially triggered from unprivileged userspace through 
  the AF_ALG interface by simply writing to the socket without 
  setting a key. Fix this up by returning an error code in the
  null case. (CVE-2011-4081)
* An integer overflow will happen on 64-bit archs if the sum of a
  task's rss, swapents and nr_ptes exceeds (2^31)/1000. This was
  introduced by commit f755a04 "oom: use pte pages in OOM score".
  This can cause a denial of service. (CVE-2011-4097)
* Commit 462fb2af "bridge: Sanitize skb before it enters the IP
  stack", introduced in Linux kernel 2.6.37, caused several
  regressions that can be used to trigger remote denial of service
  attacks when bridging is in use. (CVE-2011-4087, already partly
  fixed in upstream 2.6.38.4 and 2.6.38.8)


Other fixes in this release:
* enable PM_RUNTIME and USB_SUSPEND on netbook kernel too (mga #3139)
* re-enable usblp as it is needed by both usb-pp adapters and some 
  printers (mga #2240, #2264, #3235) (Note: cups is already patched to
  work with both usblp and libusb)
* add rts_pstor Realtek PCIe card reader (mga #3210)
* backport rtlwifi from upstream 3.1 to add support for:
  * RTL8192SE/RTL8191SE PCIe
  * RTL8192DE/RTL8188DE PCIe
  * RTL8192CU/RTL8188CU USB
* ata_piix: make DVD Drive recognisable on systems with Intel Sandybridge
  chipsets (errata for some broken chip revisions)
* mm fixes:
  * fix negative commitlimit when gigantic hugepages are allocated
  * fix race between mremap and removing migration entry
  * hotplug: fix error handling in mem_online_node()
  * nommu: fix remap_pfn_range()
  * vmscan: do not use page_count without a page pin
  * oom: task->mm == NULL doesn't mean the memory was freed
  * compaction: 
    * ensure that the compaction free scanner does not move to the
      next zone
    * abort compaction if too many pages are isolated and caller
      is asynchronous
* drm/i915 SandyBridge:
  * Fix missed BLT ring interrupts
  * Apply HWSTAM  workaround for BSD ring
* scsi/isci fixes:
  * fix sata response handling
  * fix 32-bit operation when CONFIG_HIGHMEM64G=n
  * change sas phy timeouts from 54us to 59us
  * fix sgpio register definitions
  * fix support for large smp requests
  * fix missed unlock in apc_agent_timeout()
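
To make the CVE-2011-4097 entry above concrete, here is a rough Python model of the arithmetic. It assumes, beyond what the advisory states, that the score is held in a signed 32-bit int and is multiplied by 1000 before being divided by the total page count. The function names and page counts are illustrative only and are not taken from the kernel source.

```python
INT32_LIMIT = 2**31  # first value a signed 32-bit int cannot hold


def wrap_int32(value):
    """Wrap a Python integer the way a signed 32-bit C int would (two's complement)."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value >= INT32_LIMIT else value


def score_with_int32(pages_used, total_pages):
    """Model of the overflowing computation: the intermediate lives in 32 bits."""
    points = wrap_int32(pages_used * 1000)
    # C integer division truncates toward zero, unlike Python's // on negatives.
    quotient = abs(points) // total_pages
    return -quotient if points < 0 else quotient


def score_with_wide_int(pages_used, total_pages):
    """Same computation with an intermediate wide enough not to wrap."""
    return pages_used * 1000 // total_pages


# Largest page count that still fits in 32 bits after multiplying by 1000.
SAFE_MAX_PAGES = (INT32_LIMIT - 1) // 1000
```

On a hypothetical machine with 4,000,000 pages, a task using 3,000,000 pages scores 750 with the wide intermediate but comes out negative in the 32-bit model, so the largest memory user no longer gets the highest score.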

https://bugs.mageia.org/show_bug.cgi?id=3186

Keywords: (none) => validated_update
CC: (none) => sysadmin-bugs

Comment 27 Thomas Backlund 2011-11-11 20:22:48 CET
Update pushed.

Status: NEW => RESOLVED
Resolution: (none) => FIXED

