Bug 10056

Summary: OOM killer goes crazy during boot
Product: Mageia
Reporter: David Walser <luigiwalser>
Component: RPM Packages
Assignee: Thomas Backlund <tmb>
Status: RESOLVED FIXED
QA Contact:
Severity: critical
Priority: release_blocker
CC: mageia
Version: Cauldron
Target Milestone: ---
Hardware: i586
OS: Linux
Whiteboard:
Source RPM: systemd / kernel
CVE:
Status comment:
Attachments: journalctl -b log
             full journalctl log, compressed

Description David Walser 2013-05-10 20:36:26 CEST
When graphical.target starts and KDM loads, X crashes and then restarts after 10-20 seconds, whether I just stare at the login screen or try to log in.

I checked Xorg.0.log.old and saw that it just cuts off at a certain point, whereas Xorg.0.log is longer.

Then I checked journalctl -b and saw the OOM killer going nuts and killing all sorts of processes, including X.  I will attach this log.

This started happening some time this week.

Comment 1 David Walser 2013-05-10 20:36:51 CEST
Created attachment 3928 [details]
journalctl -b log
David Walser 2013-05-10 20:41:10 CEST

Priority: Normal => release_blocker

Comment 2 David Walser 2013-05-10 20:49:43 CEST
Looking back over my full journalctl log (without -b), this started yesterday, after I installed kernel 3.8.12 and rebooted.  Here's my rpm -qa --last from that transaction (some packages may be missing as I've installed today's updates already).  The kernel seems like the most likely candidate.

bind-utils-9.9.2.P2-2.mga3.i586               Thu 09 May 2013 12:04:10 PM EDT
rpmdrake-5.49-1.mga3.noarch                   Thu 09 May 2013 12:04:09 PM EDT
openldap-2.4.33-6.mga3.i586                   Thu 09 May 2013 12:04:08 PM EDT
libldap2.4_2-2.4.33-6.mga3.i586               Thu 09 May 2013 12:04:08 PM EDT
mageia-release-Default-3-0.11.mga3.i586       Thu 09 May 2013 12:03:57 PM EDT
mageia-release-common-3-0.11.mga3.i586        Thu 09 May 2013 12:03:57 PM EDT
libupower-glib1-0.9.19-3.mga3.i586            Thu 09 May 2013 12:03:56 PM EDT
upower-0.9.19-3.mga3.i586                     Thu 09 May 2013 12:03:55 PM EDT
bootsplash-3.3.11-1.mga3.noarch               Thu 09 May 2013 12:03:27 PM EDT
initscripts-9.41-14.mga3.i586                 Thu 09 May 2013 12:03:25 PM EDT
kernel-server-3.8.12-1.mga3-1-1.mga3.i586     Thu 09 May 2013 12:03:22 PM EDT
harddrake-ui-15.50-1.mga3.i586                Thu 09 May 2013 12:02:42 PM EDT
harddrake-15.50-1.mga3.i586                   Thu 09 May 2013 12:02:42 PM EDT
drakx-net-applet-1.24-1.mga3.noarch           Thu 09 May 2013 12:02:42 PM EDT
kdebase4-workspace-handbooks-4.10.2-3.mga3.noarch Thu 09 May 2013 12:02:41 PM EDT
kdebase4-workspace-4.10.2-3.mga3.i586         Thu 09 May 2013 12:02:39 PM EDT
libkhotkeysprivate4-4.10.2-3.mga3.i586        Thu 09 May 2013 12:02:36 PM EDT
drakx-net-1.24-1.mga3.noarch                  Thu 09 May 2013 12:02:36 PM EDT
drakx-kbd-mouse-x11-0.109-1.mga3.i586         Thu 09 May 2013 12:02:36 PM EDT
drakxtools-15.50-1.mga3.i586                  Thu 09 May 2013 12:02:35 PM EDT
drakx-net-text-1.24-1.mga3.noarch             Thu 09 May 2013 12:02:35 PM EDT
libdrakx-net-1.24-1.mga3.noarch               Thu 09 May 2013 12:02:34 PM EDT
drakxtools-curses-15.50-1.mga3.i586           Thu 09 May 2013 12:02:34 PM EDT
drakxtools-backend-15.50-1.mga3.i586          Thu 09 May 2013 12:02:33 PM EDT
libdrakx-kbd-mouse-x11-0.109-1.mga3.i586      Thu 09 May 2013 12:02:32 PM EDT
libksignalplotter4-4.10.2-3.mga3.i586         Thu 09 May 2013 12:02:04 PM EDT
libplasmagenericshell4-4.10.2-3.mga3.i586     Thu 09 May 2013 12:02:03 PM EDT
libkwinglutils1-4.10.2-3.mga3.i586            Thu 09 May 2013 12:02:03 PM EDT
libkscreensaver5-4.10.2-3.mga3.i586           Thu 09 May 2013 12:02:03 PM EDT
libkdecorations4-4.10.2-3.mga3.i586           Thu 09 May 2013 12:02:03 PM EDT
libsolidcontrol4-4.10.2-3.mga3.i586           Thu 09 May 2013 12:02:02 PM EDT
liboxygenstyle4-4.10.2-3.mga3.i586            Thu 09 May 2013 12:02:02 PM EDT
kded_randrmonitor-4.10.2-3.mga3.i586          Thu 09 May 2013 12:02:02 PM EDT
plasma-applet-system-monitor-hwinfo-4.10.2-3.mga3.i586 Thu 09 May 2013 12:01:56 PM EDT
libweather_ion6-4.10.2-3.mga3.i586            Thu 09 May 2013 12:01:56 PM EDT
libplasma-geolocation-interface4-4.10.2-3.mga3.i586 Thu 09 May 2013 12:01:56 PM EDT
liboxygenstyleconfig4-4.10.2-3.mga3.i586      Thu 09 May 2013 12:01:56 PM EDT
libkephal4-4.10.2-3.mga3.i586                 Thu 09 May 2013 12:01:56 PM EDT
plasma-applet-system-monitor-temperature-4.10.2-3.mga3.i586 Thu 09 May 2013 12:01:55 PM EDT
libsystemsettingsview2-4.10.2-3.mga3.i586     Thu 09 May 2013 12:01:55 PM EDT
libtaskmanager4-4.10.2-3.mga3.i586            Thu 09 May 2013 12:01:54 PM EDT
plasma-applet-system-monitor-hdd-4.10.2-3.mga3.i586 Thu 09 May 2013 12:01:46 PM EDT
kdebase4-workspace-plasma-config-4.10.2-3.mga3.noarch Thu 09 May 2013 12:01:46 PM EDT
plasma-applet-system-monitor-cpu-4.10.2-3.mga3.i586 Thu 09 May 2013 12:01:45 PM EDT
libkwineffects1-4.10.2-3.mga3.i586            Thu 09 May 2013 12:01:45 PM EDT
libksgrd4-4.10.2-3.mga3.i586                  Thu 09 May 2013 12:01:45 PM EDT
kdm-4.10.2-3.mga3.i586                        Thu 09 May 2013 12:01:44 PM EDT
libkfontinstui4-4.10.2-3.mga3.i586            Thu 09 May 2013 12:01:43 PM EDT
libkfontinst4-4.10.2-3.mga3.i586              Thu 09 May 2013 12:01:43 PM EDT
libplasmaclock4-4.10.2-3.mga3.i586            Thu 09 May 2013 12:01:32 PM EDT
plasma-applet-system-monitor-net-4.10.2-3.mga3.i586 Thu 09 May 2013 12:01:31 PM EDT
kdm-handbook-4.10.2-3.mga3.noarch             Thu 09 May 2013 12:01:31 PM EDT
plasma-scriptengine-python-4.10.2-3.mga3.i586 Thu 09 May 2013 12:01:30 PM EDT
plasma-krunner-nepomuk-4.10.2-3.mga3.i586     Thu 09 May 2013 12:01:30 PM EDT
libkwinglesutils1-4.10.2-3.mga3.i586          Thu 09 May 2013 12:01:30 PM EDT
libprocessui4-4.10.2-3.mga3.i586              Thu 09 May 2013 12:01:29 PM EDT
libprocesscore4-4.10.2-3.mga3.i586            Thu 09 May 2013 12:01:29 PM EDT
plasma-applet-battery-4.10.2-3.mga3.i586      Thu 09 May 2013 12:01:23 PM EDT
libsolidcontrolifaces4-4.10.2-3.mga3.i586     Thu 09 May 2013 12:01:23 PM EDT
libplasma_applet_system_monitor4-4.10.2-3.mga3.i586 Thu 09 May 2013 12:01:23 PM EDT
plasma-krunner-powerdevil-4.10.2-3.mga3.i586  Thu 09 May 2013 12:01:22 PM EDT
libpowerdevilconfigcommonprivate4-4.10.2-3.mga3.i586 Thu 09 May 2013 12:01:22 PM EDT
libpowerdevilcore0-4.10.2-3.mga3.i586         Thu 09 May 2013 12:01:21 PM EDT
libkworkspace4-4.10.2-3.mga3.i586             Thu 09 May 2013 12:01:21 PM EDT
libpowerdevilui4-4.10.2-3.mga3.i586           Thu 09 May 2013 12:01:20 PM EDT
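
A minimal sketch of how a listing like the one above could be filtered for kernel packages. It assumes the `rpm -qa --last` output has been saved as text and that timestamps use the English format shown in this comment. The names `parse_rpm_last`, `kernel_packages`, and `SAMPLE` are hypothetical and used only for illustration; the sample lines are copied from the listing.

```python
from datetime import datetime


def parse_rpm_last(line):
    """Split one `rpm -qa --last` line into (package, install_time)."""
    # Package names contain no spaces, so the first space ends the name.
    package, _, stamp = line.strip().partition(" ")
    # Drop the trailing timezone abbreviation (e.g. "EDT"); parsing it with
    # strptime's %Z depends on the platform, so it is ignored here.
    stamp = stamp.strip().rsplit(" ", 1)[0]
    # Assumes English day/month names, as in the listing above.
    return package, datetime.strptime(stamp, "%a %d %b %Y %I:%M:%S %p")


def kernel_packages(text):
    """Return (package, install_time) for every package whose name starts with 'kernel'."""
    rows = [parse_rpm_last(line) for line in text.splitlines() if line.strip()]
    return [(pkg, when) for pkg, when in rows if pkg.startswith("kernel")]


# Three lines copied from the listing in this comment.
SAMPLE = """\
initscripts-9.41-14.mga3.i586                 Thu 09 May 2013 12:03:25 PM EDT
kernel-server-3.8.12-1.mga3-1-1.mga3.i586     Thu 09 May 2013 12:03:22 PM EDT
kdebase4-workspace-handbooks-4.10.2-3.mga3.noarch Thu 09 May 2013 12:02:41 PM EDT
"""

if __name__ == "__main__":
    for pkg, when in kernel_packages(SAMPLE):
        print(when.isoformat(), pkg)
```

On the sample this picks out only the kernel-server-3.8.12-1 package, which matches the suspicion voiced in this comment that the kernel is the most likely candidate.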
Comment 3 David Walser 2013-05-10 20:50:44 CEST
Created attachment 3929 [details]
full journalctl log, compressed
Comment 4 Thomas Backlund 2013-05-10 20:52:11 CEST
Hm, I wonder if this is systemd-related...

Hmmm, I now remember having seen some screenshots on other bug reports showing:

systemd-readahead: Out of Memory... in red

and looking in your log, the first one to trigger OOM is:
May 10 14:09:03 B-STU73-XPP.roctrng.net kernel: systemd-readahe invoked oom-killer: gfp_mask=0x800d0, order=0, oom_score_adj=1000
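
A rough sketch of how lines like the one above could be picked out of a saved journal and split into fields. The regular expression is an assumption based only on the single line quoted here, and `OOM_RE`, `parse_oom_line`, `oom_events`, and `LINE` are hypothetical names chosen for illustration.

```python
import re

# Pattern modelled on the one "invoked oom-killer" line quoted in this comment.
OOM_RE = re.compile(
    r"kernel: (?P<comm>\S+) invoked oom-killer: "
    r"gfp_mask=(?P<gfp_mask>0x[0-9a-fA-F]+), "
    r"order=(?P<order>\d+), "
    r"oom_score_adj=(?P<oom_score_adj>-?\d+)"
)


def parse_oom_line(line):
    """Return the fields of an oom-killer invocation line, or None if it is not one."""
    match = OOM_RE.search(line)
    if match is None:
        return None
    return {
        # The kernel truncates task names to 15 characters, hence "systemd-readahe".
        "comm": match.group("comm"),
        "gfp_mask": int(match.group("gfp_mask"), 16),
        "order": int(match.group("order")),
        "oom_score_adj": int(match.group("oom_score_adj")),
    }


def oom_events(lines):
    """Return parsed oom-killer invocations in the order they appear."""
    parsed = (parse_oom_line(line) for line in lines)
    return [event for event in parsed if event is not None]


# The line quoted above, copied from the attached log.
LINE = (
    "May 10 14:09:03 B-STU73-XPP.roctrng.net kernel: systemd-readahe invoked "
    "oom-killer: gfp_mask=0x800d0, order=0, oom_score_adj=1000"
)

if __name__ == "__main__":
    print(parse_oom_line(LINE))
```

Feeding the whole journal through `oom_events` and looking at the first entry would give the first process to trigger the OOM killer, which is the check described in this comment.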

and looking earlier in the log I see:
May 10 14:08:30 B-STU73-XPP.roctrng.net systemd[1]: Starting Replay Read-Ahead Data...
May 10 14:08:30 B-STU73-XPP.roctrng.net systemd[1]: Starting Collect Read-Ahead Data...
May 10 14:08:30 B-STU73-XPP.roctrng.net systemd-readahead[344]: Bumped block_nr parameter of 8:0 to 20480. This is a temporary hack and should be removed one day.

what is this "temporary hack" ?

Colin, any thoughts ?


Another thing to try is kernel-linus, in case one of the core kernel patches is getting systemd into trouble...

CC: (none) => mageia
Source RPM: (none) => systemd ?

Comment 5 Thomas Backlund 2013-05-10 20:54:00 CEST
Oh, and come to think of it...
what about this:

Starting LSB: Adaptive readahead daemon...

does this mean we now run 2 readahead daemons at the same time ?
Comment 6 David Walser 2013-05-10 21:12:56 CEST
I just booted kernel-linus and the OOM killer didn't activate.

Then I disabled the three systemd-readahead services in drakxservices and booted the server kernel, and the OOM killer still went nuts (and it was really slow).

I won't be able to try anything else until Monday.
Comment 7 Thomas Backlund 2013-05-10 21:16:57 CEST
ok, so something I added for 3.8.12-1 makes us go bonkers... or some of the stuff in the upcoming 3.8.13, some of which I had in 3.8.12... :/
Comment 8 Thomas Backlund 2013-05-10 23:53:56 CEST
So, I realized another thing...

You are running Sandy Bridge HW with 16GB of RAM, and use i586... that's something like having a Ferrari and sticking a VW Beetle engine in it :)

But ignoring that part for now...
to see if this is an upstream issue and/or memory related, can you try

kernel-linus-3.8.12-2.mga3
kernel-linus-3.8.12-3.mga3


from:
http://tmb.mine.nu/Mageia/Cauldron/bugs/10056/

and see which one works (or not)

Source RPM: systemd ? => systemd / kernel

Comment 9 Thomas Backlund 2013-05-11 18:22:30 CEST
I have now tried to reproduce this on an i586 install on an i7-860/8GB/HD5770, but nothing..


So it will be interesting to see what those test kernels do on your hardware
Comment 10 David Walser 2013-05-11 19:55:41 CEST
Yep, I will let you know on Monday.  Thanks Thomas.
Comment 11 David Walser 2013-05-13 18:01:00 CEST
(In reply to Thomas Backlund from comment #8)
> So, I realized another thing...
> 
> You are running Sandy Bridge HW with 16GB of RAM, and use i586... that's
> something like having a Ferrari and sticking a VW Beetle engine in it :)
> 
> But ignoring that part for now...
> to see if this is an upstream issue and/or memory related, can you try
> 
> kernel-linus-3.8.12-2.mga3
> kernel-linus-3.8.12-3.mga3
> 
> 
> from:
> http://tmb.mine.nu/Mageia/Cauldron/bugs/10056/
> 
> and see which one works (or not)

They both work fine, no OOM killer.
Comment 12 Thomas Backlund 2013-05-13 23:41:10 CEST
Can you also try kernel-server-3.8.12-3.mga3 from http://tmb.mine.nu/Mageia/Cauldron/bugs/10056/
Comment 13 David Walser 2013-05-14 00:13:20 CEST
(In reply to Thomas Backlund from comment #12)
> Can you also try kernel-server-3.8.12-3.mga3 from
> http://tmb.mine.nu/Mageia/Cauldron/bugs/10056/

That one works fine too.
Comment 14 Thomas Backlund 2013-05-14 18:47:49 CEST
(In reply to David Walser from comment #13)

> That one works fine too.

Great!

So it was the Cougar Point related patch that I suspected...

I even reproduced it at work today on a similar system after installing i586 on it...

Time to flush out a new kernel...
Comment 15 Thomas Backlund 2013-05-14 21:42:02 CEST
kernel-3.8.13-1.mga3 is building with the offending commit removed.

Status: NEW => RESOLVED
Resolution: (none) => FIXED