Bug 3825 - Update request: kernel-2.6.38.8-9.mga1
Summary: Update request: kernel-2.6.38.8-9.mga1
Status: RESOLVED FIXED
Alias: None
Product: Mageia
Classification: Unclassified
Component: Security
Version: 1
Hardware: All Linux
Priority: Normal
Severity: normal
Target Milestone: ---
Assignee: QA Team
QA Contact:
URL:
Whiteboard:
Keywords: validated_update
Depends on:
Blocks: 3826
 
Reported: 2011-12-20 11:36 CET by Thomas Backlund
Modified: 2012-01-31 11:41 CET

See Also:
Source RPM: kernel
CVE:
Status comment:


Attachments
syslog entry from Bad page state in process mgaapplet pfn:255fe (1.88 KB, text/plain)
2011-12-24 09:05 CET, Dave Hodgins
Another bad page state (3.70 KB, text/plain)
2011-12-29 02:45 CET, Dave Hodgins

Description Thomas Backlund 2011-12-20 11:36:54 CET
There is soon a new kernel to validate (currently building)


Suggested advisory:
-------------------
This update addresses the following CVEs:

* A flaw was found in the way the Linux kernel's XFS filesystem implementation
  handled links with a pathname longer than MAXPATHLEN. If the CONFIG_XFS_DEBUG
  configuration option was not enabled when the kernel was compiled, an
  attacker able to mount a malicious XFS image could use this flaw to crash
  the system or, potentially, elevate their privileges on that system.
  (CVE-2011-4077)

* A kernel NULL pointer dereference in dev_queue_xmit can be triggered by
  setting up a bridge over a VLAN and running pktgen.
  (CVE-2011-4112)

* A flaw was found in the way the Linux kernel's Journaling Block Device (JBD)
  handled an invalid log first block value. An attacker able to mount a
  malicious ext3 or ext4 image could use this flaw to crash the system.
  (CVE-2011-4132)

* On a corrupted HFS file system, the ->len field could be wrong, leading to
  a buffer overflow.
  (CVE-2011-4330)


Other fixes in this release:
* ASPM: bring kernel power usage back down to 2.6.37 level.
  * PCI: PCIe links may not get configured for ASPM under POWERSAVE mode
  * PCI: Changing ASPM policy, via /sys, to POWERSAVE could cause NMIs
  * PCI: Enable ASPM state clearing regardless of policy
  * PCI/e1000e: Add and use pci_disable_link_state_locked()
  * PCIe ASPM: forcedly -> forcibly
  * PCI: Disable ASPM when _OSC control is not granted for PCIe services
  * PCI/ACPI: Report _OSC control mask returned on failure to get control
  * PCI: Rework ASPM disable code
  * aacraid: controller hangs if kernel uses non-default ASPM policy

* switch transparent hugepages from on by default to madvise (only enabled
  for apps that request it), as it fixes desktop freezes when accessing
  slow media such as USB (thanks to fbui/mdv mail on @cooker ml).

* xfs: don't serialise direct IO reads on page cache
  (fixes performance regression introduced in 2.6.38)

* md/raid5: fix bug that could result in reads from a failed device 

* enable ISDN in netbook config (#3367)
-------------------
Thomas Backlund 2011-12-21 12:27:08 CET

Blocks: (none) => 3826

Comment 1 Thomas Backlund 2011-12-21 12:30:17 CET
This one should spend at least a week in testing before validation because of the ASPM backport.

Usually I would mail -dev and -discuss for broader testing, but since the mailing lists are currently down, that is not an option.
Comment 2 David GEIGER 2011-12-21 12:41:16 CET
Testing complete on Mageia release 1 (Official) for x86_64.

It works fine for me. Nothing to report so far; I have been testing since yesterday.

- Installation: OK
- Reboot: OK
- The fixes for bugs #1525 and #1954 still hold: OK
Comment 3 claire robinson 2011-12-21 13:48:24 CET
I can't find any PoCs for the security fixes.
Comment 4 José Jorge 2011-12-21 18:50:07 CET
Tested on x86_64, all is OK.

CC: (none) => lists.jjorge

Comment 5 claire robinson 2011-12-22 15:19:53 CET
I'm not sure if this is related to the new kernel or maybe the headers. I don't really know how to read these.

On resume from suspend-to-RAM on i586, I had a segfault in the KDE daemon, which seems to point to glibc.

$ rpm -qif /lib/i686/librt.so.1
Name        : glibc                        Relocations: (not relocatable)
Version     : 2.12.1                            Vendor: Mageia.Org
Release     : 11.2.mga1                     Build Date: Sun 20 Nov 2011 00:11:53 GMT
Install Date: Tue 22 Nov 2011 17:27:55 GMT      Build Host: =
Group       : System/Libraries              Source RPM: glibc-2.12.1-11.2.mga1.src.rpm


Segfault info below:

Application: KDE Dæmon (kdeinit4), signal: Segmentation fault
[Current thread is 1 (Thread 0xb55f66d0 (LWP 10627))]

Thread 3 (Thread 0xaf2ffb70 (LWP 10677)):
#0  0xb5c5bf94 in clock_gettime () from /lib/i686/librt.so.1
#1  0xb6e441c5 in ?? () from /usr/lib/libQtCore.so.4
#2  0xb6f17dd6 in ?? () from /usr/lib/libQtCore.so.4
#3  0xb6f165bb in ?? () from /usr/lib/libQtCore.so.4
#4  0xb6f1665d in ?? () from /usr/lib/libQtCore.so.4
#5  0xb5ba6a40 in g_main_context_prepare () from /lib/libglib-2.0.so.0
#6  0xb5ba78b2 in ?? () from /lib/libglib-2.0.so.0
#7  0xb5ba7f9a in g_main_context_iteration () from /lib/libglib-2.0.so.0
#8  0xb6f16e37 in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib/libQtCore.so.4
#9  0xb6ee711d in QEventLoop::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib/libQtCore.so.4
#10 0xb6ee7399 in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib/libQtCore.so.4
#11 0xb6de7e99 in QThread::exec() () from /usr/lib/libQtCore.so.4
#12 0xb6ec710d in ?? () from /usr/lib/libQtCore.so.4
#13 0xb6deaa93 in ?? () from /usr/lib/libQtCore.so.4
#14 0xb6d75e89 in start_thread () from /lib/i686/libpthread.so.0
#15 0xb61352be in clone () from /lib/i686/libc.so.6

Thread 2 (Thread 0xadfd0b70 (LWP 10692)):
#0  0xb6d78020 in pthread_mutex_lock () from /lib/i686/libpthread.so.0
#1  0xb5ba7a1d in ?? () from /lib/libglib-2.0.so.0
#2  0xb5ba83ab in g_main_loop_run () from /lib/libglib-2.0.so.0
#3  0xae089d81 in ?? () from /lib/libgio-2.0.so.0
#4  0xb5bd1204 in ?? () from /lib/libglib-2.0.so.0
#5  0xb6d75e89 in start_thread () from /lib/i686/libpthread.so.0
#6  0xb61352be in clone () from /lib/i686/libc.so.6

Thread 1 (Thread 0xb55f66d0 (LWP 10627)):
[KCrash Handler]
#7  0xb4f59722 in ?? () from /usr/lib/libsolid.so.4
#8  0xb4f59873 in ?? () from /usr/lib/libsolid.so.4
#9  0xb6eee75d in QMetaObject::metacall(QObject*, QMetaObject::Call, int, void**) () from /usr/lib/libQtCore.so.4
#10 0xb6efe04c in QMetaObject::activate(QObject*, QMetaObject const*, int, void**) () from /usr/lib/libQtCore.so.4
#11 0xb4f186d5 in ?? () from /usr/lib/libsolid.so.4
#12 0xb4f583a0 in ?? () from /usr/lib/libsolid.so.4
#13 0xb4f18761 in ?? () from /usr/lib/libsolid.so.4
#14 0xb6eee75d in QMetaObject::metacall(QObject*, QMetaObject::Call, int, void**) () from /usr/lib/libQtCore.so.4
#15 0xb6efe04c in QMetaObject::activate(QObject*, QMetaObject const*, int, void**) () from /usr/lib/libQtCore.so.4
#16 0xb5d64bd0 in ?? () from /usr/lib/libQtDBus.so.4
#17 0xb5d4e843 in ?? () from /usr/lib/libQtDBus.so.4
#18 0xb5d59816 in ?? () from /usr/lib/libQtDBus.so.4
#19 0xb6efdb4f in QObject::event(QEvent*) () from /usr/lib/libQtCore.so.4
#20 0xb63e17a4 in QApplicationPrivate::notify_helper(QObject*, QEvent*) () from /usr/lib/libQtGui.so.4
#21 0xb63e6787 in QApplication::notify(QObject*, QEvent*) () from /usr/lib/libQtGui.so.4
#22 0xb7621801 in KApplication::notify(QObject*, QEvent*) () from /usr/lib/libkdeui.so.5
#23 0xb6ee7f0e in QCoreApplication::notifyInternal(QObject*, QEvent*) () from /usr/lib/libQtCore.so.4
#24 0xb6eebcac in QCoreApplicationPrivate::sendPostedEvents(QObject*, int, QThreadData*) () from /usr/lib/libQtCore.so.4
#25 0xb6eebdfc in QCoreApplication::sendPostedEvents(QObject*, int) () from /usr/lib/libQtCore.so.4
#26 0xb6f16c64 in ?? () from /usr/lib/libQtCore.so.4
#27 0xb5ba74d9 in g_main_context_dispatch () from /lib/libglib-2.0.so.0
#28 0xb5ba7ce0 in ?? () from /lib/libglib-2.0.so.0
#29 0xb5ba7f9a in g_main_context_iteration () from /lib/libglib-2.0.so.0
#30 0xb6f16deb in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib/libQtCore.so.4
#31 0xb64992fa in ?? () from /usr/lib/libQtGui.so.4
#32 0xb6ee711d in QEventLoop::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib/libQtCore.so.4
#33 0xb6ee7399 in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib/libQtCore.so.4
#34 0xb6eebed0 in QCoreApplication::exec() () from /usr/lib/libQtCore.so.4
#35 0xb63df524 in QApplication::exec() () from /usr/lib/libQtGui.so.4
#36 0xb514be2d in kdemain () from /usr/lib/libkdeinit4_kded4.so
#37 0x0804dca4 in _start ()
Comment 6 claire robinson 2011-12-22 15:25:35 CET
It may even be kdelibs, I'll try and get mikala to have a look too.
Comment 7 Thomas Backlund 2011-12-22 15:28:51 CET
Is it reproducible?

If you use the previous kernel-2.6.38.8-8.mga, does it happen then?
Comment 8 claire robinson 2011-12-22 15:46:24 CET
I had to reboot in the end as the kde desktop became unresponsive.

I've suspended to ram a couple of times since reboot to test and no recurrence.
Comment 9 Florian Hubold 2011-12-23 11:57:19 CET
Unfortunately the update candidate freezes here. After starting Firefox and/or Thunderbird, the system stops responding after some time, even to keypresses such as Ctrl+Alt+F1-F8 or the magic SysRq keys. (This was the second time I tested; it froze even earlier, directly after starting Firefox.)

This does not happen with 2.6.38.8-desktop-8.mga but is reproducible with the new kernel-2.6.38.8-9.mga.

Thomas: Is there a way to disable all those ASPM changes?

CC: (none) => doktor5000

Comment 10 Florian Hubold 2011-12-23 13:13:11 CET
FWIW, with 2.6.38.8-desktop-8.mga and the transparent hugepages -> madvise fix applied manually, it works as intended, so it seems the ASPM changes are the culprit for the freezes I've encountered.
Comment 11 John Balcaen 2011-12-23 13:36:27 CET
(In reply to comment #5)
> I'm not sure if this is related to the new kernel or maybe the headers. I don't
> really know how to read these.
> 
> On resume from suspend-to-RAM on i586, I had a segfault in the KDE daemon, which seems
> to point to glibc.
> 
[...]
I'm not able to reproduce here on my netbook.
Could you possibly install the -debug packages, so the backtrace is a little more useful (for a Google search, for example :p)?
Also, the crash could be in solid (kdelibs, as you said on IRC).

CC: (none) => balcaen.john

Comment 12 Florian Hubold 2011-12-23 13:46:14 CET
Sorry for the noise: the freezes came from the nouveau driver. After manually switching back to nvidia, kernel-2.6.38.8-9.mga is working fine here :)
Please disregard comments #9 and #10.
Comment 13 claire robinson 2011-12-23 13:48:55 CET
I've not been able to reproduce the segfault. I've tried combinations of suspending on battery and resuming on AC power but nothing has caused it to happen again.
Comment 14 Dave Hodgins 2011-12-23 21:32:10 CET
All 5 i586 kernels booted, installed dkms updates, and started KDE with sound,
etc., OK here.

CC: (none) => davidwhodgins

Comment 15 Dave Hodgins 2011-12-23 21:38:54 CET
I've noticed that bug 3730 is not fixed with this update.  Not a regression,
just a reminder for a future update.
Comment 16 Thomas Backlund 2011-12-23 21:51:12 CET
(In reply to comment #15)
> I've noticed that bug 3730 is not fixed with this update.  Not a regression,
> just a reminder for a future update.

Yep. I know.
Unfortunately it's not a simple config option change, as the option is deprecated and can confuse other tools...

I have to see if I'll fix powertop instead...
Comment 17 Dave Hodgins 2011-12-24 09:05:20 CET
Created attachment 1289 [details]
syslog entry from Bad page state in process mgaapplet  pfn:255fe

Turns out, after leaving the system running, I have now had one bad page bug.
Comment 18 Dave Hodgins 2011-12-29 02:45:57 CET
Created attachment 1296 [details]
Another bad page state

This one happened in akonadiserver, as KDE was being shut down.

/home is on an ext4 filesystem.
Comment 19 Thomas Backlund 2011-12-30 18:09:14 CET
@Dave:

Seems you are hitting the same problems as with 2.6.38.8-7

Can you try to enable hugepages again by:
echo always >/sys/kernel/mm/transparent_hugepage/enabled

and see if the problem goes away ?
Comment 20 Thomas Backlund 2011-12-30 21:41:44 CET
And another way to test the hugepage change:

boot 2.6.38.8-8

echo madvise >/sys/kernel/mm/transparent_hugepage/enabled
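
Before and after switching modes as suggested above, it can help to confirm which mode is actually active. The kernel prints all available modes in that file and puts the active one in square brackets, for example "always [madvise] never". A minimal sketch follows; the helper name thp_active_mode is made up for illustration and is not part of any package.

```shell
#!/bin/sh
# Path taken from the commands in this report.
THP_ENABLED=/sys/kernel/mm/transparent_hugepage/enabled

# Hypothetical helper: print the bracketed (active) word from a line such as
# "always [madvise] never". Prints nothing if no bracketed word is found.
thp_active_mode() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([a-z]*\)\].*/\1/p'
}

# Only read /sys when the file exists; it is absent on kernels built
# without transparent hugepage support.
if [ -r "$THP_ENABLED" ]; then
    echo "active THP mode: $(thp_active_mode "$(cat "$THP_ENABLED")")"
fi
```

Recording this output next to each test run makes it clear which setting a given result belongs to.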
Comment 21 Dave Hodgins 2011-12-30 21:56:53 CET
Given that it's been 4 days since the last "bad page state", any testing
that doesn't generate the error isn't going to provide any useful information.

Are you expecting 8-8 with madvise to cause the bad page state?
Comment 22 Thomas Backlund 2012-01-03 14:27:52 CET
I was thinking you had problems with 2.6.38.8-7, which I assumed were fixed by 2.6.38.8-8 (which worked for you), and that they have now come back; the only memory-related change was the madvise change.

But as you can't reproduce it, and no other problems have shown up so far, I think it's OK to push this update.
Comment 23 David GEIGER 2012-01-03 16:40:27 CET
(In reply to comment #2)
> Testing complete on Mageia release 1 (Official) for x86_64.
> 
> It works fine for me. Nothing to report so far; I have been testing since
> yesterday.
> 
> - Installation: OK
> - Reboot: OK
> - The fixes for bugs #1525 and #1954 still hold: OK

For my part, I still have nothing to report. Seems to be Ok.

CC: (none) => geiger.david68210

Comment 24 Dave Hodgins 2012-01-04 00:18:45 CET
Ok.  Validating the update.

Could someone from the sysadmin team push the srpm
kernel-2.6.38.8-9.mga1.src.rpm
from Core Updates Testing to Core Updates.

Advisory: This kernel security update addresses the following CVEs:

* A flaw was found in the way the Linux kernel's XFS filesystem implementation
  handled links with a pathname longer than MAXPATHLEN. If the CONFIG_XFS_DEBUG
  configuration option was not enabled when the kernel was compiled, an
  attacker able to mount a malicious XFS image could use this flaw to crash
  the system or, potentially, elevate their privileges on that system.
  (CVE-2011-4077)

* A kernel NULL pointer dereference in dev_queue_xmit can be triggered by
  setting up a bridge over a VLAN and running pktgen.
  (CVE-2011-4112)

* A flaw was found in the way the Linux kernel's Journaling Block Device (JBD)
  handled an invalid log first block value. An attacker able to mount a
  malicious ext3 or ext4 image could use this flaw to crash the system.
  (CVE-2011-4132)

* On a corrupted HFS file system, the ->len field could be wrong, leading to
  a buffer overflow.
  (CVE-2011-4330)


Other fixes in this release:
* ASPM: bring kernel power usage back down to 2.6.37 level.
  * PCI: PCIe links may not get configured for ASPM under POWERSAVE mode
  * PCI: Changing ASPM policy, via /sys, to POWERSAVE could cause NMIs
  * PCI: Enable ASPM state clearing regardless of policy
  * PCI/e1000e: Add and use pci_disable_link_state_locked()
  * PCIe ASPM: forcedly -> forcibly
  * PCI: Disable ASPM when _OSC control is not granted for PCIe services
  * PCI/ACPI: Report _OSC control mask returned on failure to get control
  * PCI: Rework ASPM disable code
  * aacraid: controller hangs if kernel uses non-default ASPM policy

* switch transparent hugepages from on by default to madvise (only enabled
  for apps that request it), as it fixes desktop freezes when accessing
  slow media such as USB (thanks to fbui/mdv mail on @cooker ml).

* xfs: don't serialise direct IO reads on page cache
  (fixes performance regression introduced in 2.6.38)

* md/raid5: fix bug that could result in reads from a failed device 

* enable ISDN in netbook config (#3367)

https://bugs.mageia.org/show_bug.cgi?id=3825

Keywords: (none) => validated_update
CC: (none) => sysadmin-bugs

Comment 25 Thomas Backlund 2012-01-04 13:39:39 CET
Update pushed.

Status: NEW => RESOLVED
Resolution: (none) => FIXED

Comment 26 Wolfgang Bornath 2012-01-19 18:26:11 CET
Concerning the transparent_hugepage problem:

With kernel-2.6.38.8-8.mga1 I had that issue and after I applied the recommended change:
-> echo madvise >/sys/kernel/mm/transparent_hugepage/enabled
the freezes were history.

After updating the kernel to kernel-2.6.38.8-9 and rebooting the freezes came back again. I checked /sys/kernel/mm/transparent_hugepage/enabled and 'madvise' is set.

Reproduced it:
I started to dd a 4 GB ISO to a USB key; after a while Firefox started to respond sluggishly, until it froze completely.
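
For anyone who wants to repeat this kind of test in a comparable way, the write can be timed so that runs under different kernels or hugepage settings can be set side by side. This is only a sketch: the helper name timed_write and both example paths are placeholders, and the exact dd options used in the original test are not stated in this report.

```shell
#!/bin/sh
# Hypothetical helper: copy SRC to DST in 1 MiB blocks and print the elapsed
# time in whole seconds. conv=fsync makes dd flush the data to the target
# before it exits, so the time includes the actual write-out.
timed_write() {
    src=$1
    dst=$2
    start=$(date +%s)
    dd if="$src" of="$dst" bs=1M conv=fsync 2>/dev/null || return 1
    end=$(date +%s)
    echo "$((end - start))"
}

# Example with placeholder paths (both are assumptions). Double-check the
# target device first: dd overwrites it without asking.
#   timed_write /path/to/image.iso /dev/sdX
```

While the write is running, note how long the desktop stays unresponsive; the elapsed time alone does not capture that.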

CC: (none) => molch.b

Comment 27 Florian Hubold 2012-01-30 10:51:10 CET
Unfortunately, it also reappeared here. It no longer happens with faster USB storage media, for example an external 2.5" HDD, which writes at ~20MB/s.

But with an older mp3 player, or flash sticks in general, it's horrible. The system sometimes freezes for minutes and does not respond to keypresses until it unfreezes.

Thomas, could you take another look at why writing to really slow USB storage freezes the whole system for long periods of time? For now I've set transparent_hugepage to never, which seems to help.

Status: RESOLVED => REOPENED
Resolution: FIXED => (none)

Comment 28 Thomas Backlund 2012-01-31 11:41:15 CET
Closing this report again, as it was for the update push, which happened about a month ago.

Please open a new report with dmesg from 2.6.38.8-8.mga1 and 2.6.38.8-9.mga1

and also do:

 echo  always >/sys/kernel/mm/transparent_hugepage/enabled

Does it get better or worse?

And then back:

 echo madvise >/sys/kernel/mm/transparent_hugepage/enabled


Does it get better or worse?
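
The data requested above could be collected along these lines. This is a sketch only: the helper names and the dmesg file naming are invented for illustration, and writing to the sysfs file needs root.

```shell
#!/bin/sh
# Path taken from the commands in this report; overridable for dry runs.
THP_ENABLED=${THP_ENABLED:-/sys/kernel/mm/transparent_hugepage/enabled}

# Hypothetical helper: build an output file name for a given kernel release,
# so the logs from the two kernels do not overwrite each other.
dmesg_file_for() {
    printf 'dmesg-%s.txt\n' "$1"
}

# Hypothetical helper: accept only the three mode names the kernel knows,
# then write the chosen one to the sysfs file (root needed on a real system).
set_thp_mode() {
    case "$1" in
        always|madvise|never) ;;
        *) echo "unknown mode: $1" >&2; return 1 ;;
    esac
    echo "$1" > "$THP_ENABLED"
}

# Example, to be run once under each kernel:
#   dmesg > "$(dmesg_file_for "$(uname -r)")"
#   set_thp_mode always     # repeat the slow USB write, note the behaviour
#   set_thp_mode madvise    # repeat again and compare
```

Attaching both dmesg files and the two observations to the new report covers everything asked for here.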

Status: REOPENED => RESOLVED
Resolution: (none) => FIXED

