Bug 15092

Summary: ada tests during gcc build trigger kernel problem
Product: Mageia Reporter: Pascal Terjan <pterjan>
Component: RPM Packages Assignee: Thomas Backlund <tmb>
Status: RESOLVED FIXED QA Contact:
Severity: major    
Priority: Normal    
Version: Cauldron   
Target Milestone: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Source RPM: kernel-3.18.3-1.mga5.src.rpm CVE:
Status comment:

Description Pascal Terjan 2015-01-20 11:39:16 CET
While running a mass rebuild I got a lot of processes blocked forever: mostly urpmi waiting for filetriggers, and once gconftool-2 calling /usr/bin/killall -q -HUP /usr/libexec/gconfd-2, which hangs.

Also some configure scripts call ps which hangs too.

Attaching to the killall process hangs as well.

The stack of killall process:

[root@instance-2 pterjan]# cat /proc/4446/stack
[<ffffffff813c8544>] call_rwsem_down_read_failed+0x14/0x30
[<ffffffff8106968b>] get_mm_exe_file+0x1b/0x40
[<ffffffff81228cc5>] proc_exe_link+0x55/0xa0
[<ffffffff81227fba>] proc_pid_follow_link+0x4a/0x70
[<ffffffff811cceda>] path_lookupat+0x61a/0xd10
[<ffffffff811cd5f6>] filename_lookup.isra.25+0x26/0x80
[<ffffffff811d0694>] user_path_at_empty+0x54/0xa0
[<ffffffff811d06f1>] user_path_at+0x11/0x20
[<ffffffff811c4012>] vfs_fstatat+0x52/0xa0
[<ffffffff811c44af>] SYSC_newstat+0x1f/0x40
[<ffffffff811c46ee>] SyS_newstat+0xe/0x10
[<ffffffff816c29ed>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff

Using kernel 3.18.3-server-1.mga5

Comment 1 Pascal Terjan 2015-01-20 11:43:10 CET
Also, rebooting and restarting the build, it happened again.
(It could be some package leaving a process in a strange state)
Comment 2 Pascal Terjan 2015-01-20 12:02:31 CET
# for d in *; do echo $d; readlink $d/exe; done

pointed me to PID 1671

cat /proc/1671/cmdline also hangs
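
A different way to look for the stuck task is to list processes in uninterruptible sleep (state D, as shown in the hung-task reports). The following is only a sketch: it assumes that reading /proc/<pid>/stat does not itself block on the affected kernel, which was not checked in this report, and the names parse_stat and blocked_tasks are made up for illustration.

```python
import os


def parse_stat(text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the last ')' rather than on whitespace.
    left, right = text.rsplit(")", 1)
    pid_str, comm = left.split("(", 1)
    state = right.split()[0]
    return int(pid_str), comm, state


def blocked_tasks(proc="/proc"):
    """Yield (pid, comm) for tasks whose state is D (uninterruptible sleep)."""
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue  # not a PID directory
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError):
            continue  # task exited meanwhile, or stat was unreadable
        if state == "D":
            yield pid, comm
```

Calling list(blocked_tasks()) as root on the build machine would then give candidate PIDs to inspect with cat /proc/<pid>/stack.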

# cat /proc/1671/stack 
[<ffffffff813c8544>] call_rwsem_down_read_failed+0x14/0x30
[<ffffffff8105b21c>] __do_page_fault+0x42c/0x5c0
[<ffffffff8105b3d2>] do_page_fault+0x22/0x30
[<ffffffff816c49e8>] page_fault+0x28/0x30
[<ffffffff8102044e>] save_xstate_sig+0x20e/0x230
[<ffffffff81013ec2>] do_signal+0x952/0xbb0
[<ffffffff81014190>] do_notify_resume+0x70/0x90
[<ffffffff816c37a2>] retint_signal+0x48/0x86
[<ffffffffffffffff>] 0xffffffffffffffff

Its fd entries give an indication that it comes from the build of gcc:

l-wx------ 1 pterjan 1001 64 Jan 20 11:00 6 -> /home/pterjan/build/chroot_tmp/pterjan/chroot_cauldron.x86_64.0.20150119221317_279/home/pterjan/rpmbuild/BUILD/gcc-4.9.2/obj-x86_64-mageia-linux-gnu/gcc/testsuite/ada/acats1/tests/cb/cb1010d/cb1010d.log (deleted)

This is probably related to the hung tasks in the kernel logs:

[ 6120.690552] INFO: task c52103x:22055 blocked for more than 120 seconds.
[ 6120.692575]       Not tainted 3.18.3-server-1.mga5 #1
[ 6120.694051] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6120.696366] c52103x         D ffff881a3fd932c0     0 22055  22052 0x00000000
[ 6120.698400]  ffff88140c97bb30 0000000000000082 ffff8819abf8e590 00000000000132c0
[ 6120.701117]  ffff88140c97bfd8 00000000000132c0 ffff881850d8a310 ffff8819abf8e590
[ 6120.703643]  ffffffff8117f676 ffff8819abf8e590 ffff8818b4f0e4a0 ffff8818b4f0e4b8
[ 6120.706110] Call Trace:
[ 6120.706970]  [<ffffffff8117f676>] ? expand_downwards+0x86/0x2a0
[ 6120.709045]  [<ffffffff816be469>] schedule+0x29/0x70
[ 6120.710906]  [<ffffffff816c123d>] rwsem_down_read_failed+0xdd/0x120
[ 6120.712481]  [<ffffffff813c8544>] call_rwsem_down_read_failed+0x14/0x30
[ 6120.714115]  [<ffffffff813c8675>] ? __clear_user+0x25/0x50
[ 6120.715461]  [<ffffffff813c8675>] ? __clear_user+0x25/0x50
[ 6120.716799]  [<ffffffff816c08c7>] ? down_read+0x17/0x20
[ 6120.718149]  [<ffffffff8105b21c>] __do_page_fault+0x42c/0x5c0
[ 6120.719588]  [<ffffffff81079168>] ? __send_signal+0x178/0x4a0
[ 6120.721058]  [<ffffffff8105b3d2>] do_page_fault+0x22/0x30
[ 6120.722516]  [<ffffffff816c49e8>] page_fault+0x28/0x30
[ 6120.723793]  [<ffffffff813c8675>] ? __clear_user+0x25/0x50
[ 6120.725116]  [<ffffffff8102044e>] save_xstate_sig+0x20e/0x230
[ 6120.726512]  [<ffffffff81013ec2>] do_signal+0x952/0xbb0
[ 6120.728123]  [<ffffffff8109c99d>] ? set_next_entity+0x9d/0xb0
[ 6120.730040]  [<ffffffff81490000>] ? regulator_min_uA_show+0x70/0x70
[ 6120.732086]  [<ffffffff814a18d4>] ? pty_write+0x54/0x60
[ 6120.733734]  [<ffffffff816bdf81>] ? __schedule+0x3a1/0x860
[ 6120.735242]  [<ffffffff81014190>] do_notify_resume+0x70/0x90
[ 6120.736970]  [<ffffffff816c37a2>] retint_signal+0x48/0x86
[ 6240.730119] INFO: task c52103x:22055 blocked for more than 120 seconds.
[ 6240.731902]       Not tainted 3.18.3-server-1.mga5 #1
[ 6240.733013] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6240.735002] c52103x         D ffff881a3fd932c0     0 22055      1 0x00000004
[ 6240.736674]  ffff88140c97bb30 0000000000000082 ffff8819abf8e590 00000000000132c0
[ 6240.738463]  ffff88140c97bfd8 00000000000132c0 ffff881850d8a310 ffff8819abf8e590
[ 6240.740169]  ffffffff8117f676 ffff8819abf8e590 ffff8818b4f0e4a0 ffff8818b4f0e4b8
[ 6240.741968] Call Trace:
[ 6240.742580]  [<ffffffff8117f676>] ? expand_downwards+0x86/0x2a0
[ 6240.743957]  [<ffffffff816be469>] schedule+0x29/0x70
[ 6240.745107]  [<ffffffff816c123d>] rwsem_down_read_failed+0xdd/0x120
[ 6240.746561]  [<ffffffff813c8544>] call_rwsem_down_read_failed+0x14/0x30
[ 6240.748140]  [<ffffffff813c8675>] ? __clear_user+0x25/0x50
[ 6240.749408]  [<ffffffff813c8675>] ? __clear_user+0x25/0x50
[ 6240.750683]  [<ffffffff816c08c7>] ? down_read+0x17/0x20
[ 6240.751830]  [<ffffffff8105b21c>] __do_page_fault+0x42c/0x5c0
[ 6240.753108]  [<ffffffff81079168>] ? __send_signal+0x178/0x4a0
[ 6240.754629]  [<ffffffff8105b3d2>] do_page_fault+0x22/0x30
[ 6240.755873]  [<ffffffff816c49e8>] page_fault+0x28/0x30
[ 6240.757046]  [<ffffffff813c8675>] ? __clear_user+0x25/0x50
[ 6240.758340]  [<ffffffff8102044e>] save_xstate_sig+0x20e/0x230
[ 6240.759637]  [<ffffffff81013ec2>] do_signal+0x952/0xbb0
[ 6240.760879]  [<ffffffff8109c99d>] ? set_next_entity+0x9d/0xb0
[ 6240.762189]  [<ffffffff81490000>] ? regulator_min_uA_show+0x70/0x70
[ 6240.763617]  [<ffffffff814a18d4>] ? pty_write+0x54/0x60
[ 6240.764827]  [<ffffffff816bdf81>] ? __schedule+0x3a1/0x860
[ 6240.766094]  [<ffffffff81014190>] do_notify_resume+0x70/0x90
[ 6240.767389]  [<ffffffff816c37a2>] retint_signal+0x48/0x86
[ 6360.760116] INFO: task c52103x:22055 blocked for more than 120 seconds.
[ 6360.774011]       Not tainted 3.18.3-server-1.mga5 #1
[ 6360.775031] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6360.776668] c52103x         D ffff881a3fd932c0     0 22055      1 0x00000004
[ 6360.778177]  ffff88140c97bb30 0000000000000082 ffff8819abf8e590 00000000000132c0
[ 6360.779692]  ffff88140c97bfd8 00000000000132c0 ffff881850d8a310 ffff8819abf8e590
[ 6360.781581]  ffffffff8117f676 ffff8819abf8e590 ffff8818b4f0e4a0 ffff8818b4f0e4b8
[ 6360.783513] Call Trace:
[ 6360.784103]  [<ffffffff8117f676>] ? expand_downwards+0x86/0x2a0
[ 6360.785484]  [<ffffffff816be469>] schedule+0x29/0x70
[ 6360.786638]  [<ffffffff816c123d>] rwsem_down_read_failed+0xdd/0x120
[ 6360.788125]  [<ffffffff813c8544>] call_rwsem_down_read_failed+0x14/0x30
[ 6360.789663]  [<ffffffff813c8675>] ? __clear_user+0x25/0x50
[ 6360.790860]  [<ffffffff813c8675>] ? __clear_user+0x25/0x50
[ 6360.792105]  [<ffffffff816c08c7>] ? down_read+0x17/0x20
[ 6360.793132]  [<ffffffff8105b21c>] __do_page_fault+0x42c/0x5c0
[ 6360.794197]  [<ffffffff81079168>] ? __send_signal+0x178/0x4a0
[ 6360.795344]  [<ffffffff8105b3d2>] do_page_fault+0x22/0x30
[ 6360.796392]  [<ffffffff816c49e8>] page_fault+0x28/0x30
[ 6360.797488]  [<ffffffff813c8675>] ? __clear_user+0x25/0x50
[ 6360.798560]  [<ffffffff8102044e>] save_xstate_sig+0x20e/0x230
[ 6360.799615]  [<ffffffff81013ec2>] do_signal+0x952/0xbb0
[ 6360.800717]  [<ffffffff8109c99d>] ? set_next_entity+0x9d/0xb0
[ 6360.801837]  [<ffffffff81490000>] ? regulator_min_uA_show+0x70/0x70
[ 6360.803098]  [<ffffffff814a18d4>] ? pty_write+0x54/0x60
[ 6360.804213]  [<ffffffff816bdf81>] ? __schedule+0x3a1/0x860
[ 6360.805408]  [<ffffffff81014190>] do_notify_resume+0x70/0x90
[ 6360.806665]  [<ffffffff816c37a2>] retint_signal+0x48/0x86
[ 6480.800129] INFO: task c52103x:22055 blocked for more than 120 seconds.
[ 6480.808454]       Not tainted 3.18.3-server-1.mga5 #1
[ 6480.810351] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6480.813013] c52103x         D ffff881a3fd932c0     0 22055      1 0x00000004
[ 6480.815504]  ffff88140c97bb30 0000000000000082 ffff8819abf8e590 00000000000132c0
[ 6480.818257]  ffff88140c97bfd8 00000000000132c0 ffff881850d8a310 ffff8819abf8e590
[ 6480.821025]  ffffffff8117f676 ffff8819abf8e590 ffff8818b4f0e4a0 ffff8818b4f0e4b8
[ 6480.823696] Call Trace:
[ 6480.824662]  [<ffffffff8117f676>] ? expand_downwards+0x86/0x2a0
[ 6480.826747]  [<ffffffff816be469>] schedule+0x29/0x70
[ 6480.828684]  [<ffffffff816c123d>] rwsem_down_read_failed+0xdd/0x120
[ 6480.832699]  [<ffffffff813c8544>] call_rwsem_down_read_failed+0x14/0x30
[ 6480.835059]  [<ffffffff813c8675>] ? __clear_user+0x25/0x50
[ 6480.837565]  [<ffffffff813c8675>] ? __clear_user+0x25/0x50
[ 6480.841095]  [<ffffffff816c08c7>] ? down_read+0x17/0x20
[ 6480.843038]  [<ffffffff8105b21c>] __do_page_fault+0x42c/0x5c0
[ 6480.848955]  [<ffffffff81079168>] ? __send_signal+0x178/0x4a0
[ 6480.851874]  [<ffffffff8105b3d2>] do_page_fault+0x22/0x30
[ 6480.853900]  [<ffffffff816c49e8>] page_fault+0x28/0x30
[ 6480.857069]  [<ffffffff813c8675>] ? __clear_user+0x25/0x50
[ 6480.858946]  [<ffffffff8102044e>] save_xstate_sig+0x20e/0x230
[ 6480.860848]  [<ffffffff81013ec2>] do_signal+0x952/0xbb0
[ 6480.862638]  [<ffffffff8109c99d>] ? set_next_entity+0x9d/0xb0
[ 6480.864538]  [<ffffffff81490000>] ? regulator_min_uA_show+0x70/0x70
[ 6480.866504]  [<ffffffff814a18d4>] ? pty_write+0x54/0x60
[ 6480.868290]  [<ffffffff816bdf81>] ? __schedule+0x3a1/0x860
[ 6480.870140]  [<ffffffff81014190>] do_notify_resume+0x70/0x90
[ 6480.871924]  [<ffffffff816c37a2>] retint_signal+0x48/0x86
[ 6480.873491] INFO: task c52104x:9701 blocked for more than 120 seconds.
[ 6480.875570]       Not tainted 3.18.3-server-1.mga5 #1
[ 6480.877195] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6480.879655] c52104x         D ffff881a3fd132c0     0  9701   9697 0x00000000
[ 6480.882041]  ffff881751227b30 0000000000000082 ffff8810c2d74490 00000000000132c0
[ 6480.884318]  ffff881751227fd8 00000000000132c0 ffff8813f7734510 ffff8810c2d74490
[ 6480.886380]  ffffffff8117f676 ffff8810c2d74490 ffff8819b9c2b8e0 ffff8819b9c2b8f8
[ 6480.888496] Call Trace:
[ 6480.889192]  [<ffffffff8117f676>] ? expand_downwards+0x86/0x2a0
[ 6480.890789]  [<ffffffff816be469>] schedule+0x29/0x70
[ 6480.892097]  [<ffffffff816c123d>] rwsem_down_read_failed+0xdd/0x120
[ 6480.893833]  [<ffffffff813c8544>] call_rwsem_down_read_failed+0x14/0x30
[ 6480.895621]  [<ffffffff813c8675>] ? __clear_user+0x25/0x50
[ 6480.897145]  [<ffffffff813c8675>] ? __clear_user+0x25/0x50
[ 6480.898639]  [<ffffffff816c08c7>] ? down_read+0x17/0x20
[ 6480.900007]  [<ffffffff8105b21c>] __do_page_fault+0x42c/0x5c0
[ 6480.901623]  [<ffffffff81079168>] ? __send_signal+0x178/0x4a0
[ 6480.903050]  [<ffffffff8105b3d2>] do_page_fault+0x22/0x30
[ 6480.904614]  [<ffffffff816c49e8>] page_fault+0x28/0x30
[ 6480.906365]  [<ffffffff813c8675>] ? __clear_user+0x25/0x50
[ 6480.908214]  [<ffffffff8102044e>] save_xstate_sig+0x20e/0x230
[ 6480.910246]  [<ffffffff81013ec2>] do_signal+0x952/0xbb0
[ 6480.912055]  [<ffffffff8109c99d>] ? set_next_entity+0x9d/0xb0
[ 6480.913800]  [<ffffffff81490000>] ? regulator_min_uA_show+0x70/0x70
[ 6480.915451]  [<ffffffff814a18d4>] ? pty_write+0x54/0x60
[ 6480.916885]  [<ffffffff816bdf81>] ? __schedule+0x3a1/0x860
[ 6480.918574]  [<ffffffff81014190>] do_notify_resume+0x70/0x90
[ 6480.920240]  [<ffffffff816c37a2>] retint_signal+0x48/0x86
[ 6600.920124] INFO: task c52103x:22055 blocked for more than 120 seconds.
[ 6600.951937]       Not tainted 3.18.3-server-1.mga5 #1
[ 6600.953545] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6600.955684] c52103x         D ffff881a3fd932c0     0 22055      1 0x00000004
[ 6600.957942]  ffff88140c97bb30 0000000000000082 ffff8819abf8e590 00000000000132c0
[ 6600.960221]  ffff88140c97bfd8 00000000000132c0 ffff881850d8a310 ffff8819abf8e590
[ 6600.962655]  ffffffff8117f676 ffff8819abf8e590 ffff8818b4f0e4a0 ffff8818b4f0e4b8
[ 6600.964905] Call Trace:
[ 6600.965624]  [<ffffffff8117f676>] ? expand_downwards+0x86/0x2a0
[ 6600.967336]  [<ffffffff816be469>] schedule+0x29/0x70
[ 6600.968541]  [<ffffffff816c123d>] rwsem_down_read_failed+0xdd/0x120
[ 6600.969870]  [<ffffffff813c8544>] call_rwsem_down_read_failed+0x14/0x30
[ 6600.971298]  [<ffffffff813c8675>] ? __clear_user+0x25/0x50
[ 6600.972492]  [<ffffffff813c8675>] ? __clear_user+0x25/0x50
[ 6600.973670]  [<ffffffff816c08c7>] ? down_read+0x17/0x20
[ 6600.974881]  [<ffffffff8105b21c>] __do_page_fault+0x42c/0x5c0
[ 6600.976136]  [<ffffffff81079168>] ? __send_signal+0x178/0x4a0
[ 6600.977436]  [<ffffffff8105b3d2>] do_page_fault+0x22/0x30
[ 6600.978637]  [<ffffffff816c49e8>] page_fault+0x28/0x30
[ 6600.979757]  [<ffffffff813c8675>] ? __clear_user+0x25/0x50
[ 6600.980945]  [<ffffffff8102044e>] save_xstate_sig+0x20e/0x230
[ 6600.981946]  [<ffffffff81013ec2>] do_signal+0x952/0xbb0
[ 6600.982949]  [<ffffffff8109c99d>] ? set_next_entity+0x9d/0xb0
[ 6600.984068]  [<ffffffff81490000>] ? regulator_min_uA_show+0x70/0x70
[ 6600.985294]  [<ffffffff814a18d4>] ? pty_write+0x54/0x60
[ 6600.986348]  [<ffffffff816bdf81>] ? __schedule+0x3a1/0x860
[ 6600.987482]  [<ffffffff81014190>] do_notify_resume+0x70/0x90
[ 6600.988553]  [<ffffffff816c37a2>] retint_signal+0x48/0x86
[ 6601.035480] INFO: task c52104x:9701 blocked for more than 120 seconds.
[ 6601.037186]       Not tainted 3.18.3-server-1.mga5 #1
[ 6601.038721] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6601.040680] c52104x         D ffff881a3fd132c0     0  9701      1 0x00000004
[ 6601.042654]  ffff881751227b30 0000000000000082 ffff8810c2d74490 00000000000132c0
[ 6601.044910]  ffff881751227fd8 00000000000132c0 ffff8813f7734510 ffff8810c2d74490
[ 6601.046909]  ffffffff8117f676 ffff8810c2d74490 ffff8819b9c2b8e0 ffff8819b9c2b8f8
[ 6601.048531] Call Trace:
[ 6601.049055]  [<ffffffff8117f676>] ? expand_downwards+0x86/0x2a0
[ 6601.050335]  [<ffffffff816be469>] schedule+0x29/0x70
[ 6601.051480]  [<ffffffff816c123d>] rwsem_down_read_failed+0xdd/0x120
[ 6601.052988]  [<ffffffff813c8544>] call_rwsem_down_read_failed+0x14/0x30
[ 6601.054482]  [<ffffffff813c8675>] ? __clear_user+0x25/0x50
[ 6601.055582]  [<ffffffff813c8675>] ? __clear_user+0x25/0x50
[ 6601.056644]  [<ffffffff816c08c7>] ? down_read+0x17/0x20
[ 6601.057728]  [<ffffffff8105b21c>] __do_page_fault+0x42c/0x5c0
[ 6601.058979]  [<ffffffff81079168>] ? __send_signal+0x178/0x4a0
[ 6601.060203]  [<ffffffff8105b3d2>] do_page_fault+0x22/0x30
[ 6601.061220]  [<ffffffff816c49e8>] page_fault+0x28/0x30
[ 6601.062187]  [<ffffffff813c8675>] ? __clear_user+0x25/0x50
[ 6601.063213]  [<ffffffff8102044e>] save_xstate_sig+0x20e/0x230
[ 6601.064311]  [<ffffffff81013ec2>] do_signal+0x952/0xbb0
[ 6601.065257]  [<ffffffff8109c99d>] ? set_next_entity+0x9d/0xb0
[ 6601.066287]  [<ffffffff81490000>] ? regulator_min_uA_show+0x70/0x70
[ 6601.067500]  [<ffffffff814a18d4>] ? pty_write+0x54/0x60
[ 6601.068458]  [<ffffffff816bdf81>] ? __schedule+0x3a1/0x860
[ 6601.069455]  [<ffffffff81014190>] do_notify_resume+0x70/0x90
[ 6601.070753]  [<ffffffff816c37a2>] retint_signal+0x48/0x86
[ 6721.070096] INFO: task c52103x:22055 blocked for more than 120 seconds.
[ 6721.072422]       Not tainted 3.18.3-server-1.mga5 #1
[ 6721.074140] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6721.076677] c52103x         D ffff881a3fd932c0     0 22055      1 0x00000004
[ 6721.079226]  ffff88140c97bb30 0000000000000082 ffff8819abf8e590 00000000000132c0
[ 6721.081880]  ffff88140c97bfd8 00000000000132c0 ffff881850d8a310 ffff8819abf8e590
[ 6721.084173]  ffffffff8117f676 ffff8819abf8e590 ffff8818b4f0e4a0 ffff8818b4f0e4b8
[ 6721.086359] Call Trace:
[ 6721.087083]  [<ffffffff8117f676>] ? expand_downwards+0x86/0x2a0
[ 6721.088536]  [<ffffffff816be469>] schedule+0x29/0x70
[ 6721.089713]  [<ffffffff816c123d>] rwsem_down_read_failed+0xdd/0x120
[ 6721.091345]  [<ffffffff813c8544>] call_rwsem_down_read_failed+0x14/0x30
[ 6721.092984]  [<ffffffff813c8675>] ? __clear_user+0x25/0x50
[ 6721.094334]  [<ffffffff813c8675>] ? __clear_user+0x25/0x50
[ 6721.095701]  [<ffffffff816c08c7>] ? down_read+0x17/0x20
[ 6721.097118]  [<ffffffff8105b21c>] __do_page_fault+0x42c/0x5c0
[ 6721.098572]  [<ffffffff81079168>] ? __send_signal+0x178/0x4a0
[ 6721.100226]  [<ffffffff8105b3d2>] do_page_fault+0x22/0x30
[ 6721.102114]  [<ffffffff816c49e8>] page_fault+0x28/0x30
[ 6721.103858]  [<ffffffff813c8675>] ? __clear_user+0x25/0x50
[ 6721.105712]  [<ffffffff8102044e>] save_xstate_sig+0x20e/0x230
[ 6721.107688]  [<ffffffff81013ec2>] do_signal+0x952/0xbb0
[ 6721.109323]  [<ffffffff8109c99d>] ? set_next_entity+0x9d/0xb0
[ 6721.111192]  [<ffffffff81490000>] ? regulator_min_uA_show+0x70/0x70
[ 6721.113180]  [<ffffffff814a18d4>] ? pty_write+0x54/0x60
[ 6721.114852]  [<ffffffff816bdf81>] ? __schedule+0x3a1/0x860
[ 6721.116674]  [<ffffffff81014190>] do_notify_resume+0x70/0x90
[ 6721.118517]  [<ffffffff816c37a2>] retint_signal+0x48/0x86
[ 6721.120295] INFO: task c52104x:9701 blocked for more than 120 seconds.
[ 6721.122367]       Not tainted 3.18.3-server-1.mga5 #1
[ 6721.123993] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6721.126589] c52104x         D ffff881a3fd132c0     0  9701      1 0x00000004
[ 6721.129316]  ffff881751227b30 0000000000000082 ffff8810c2d74490 00000000000132c0
[ 6721.132558]  ffff881751227fd8 00000000000132c0 ffff8813f7734510 ffff8810c2d74490
[ 6721.137471]  ffffffff8117f676 ffff8810c2d74490 ffff8819b9c2b8e0 ffff8819b9c2b8f8
[ 6721.144201] Call Trace:
[ 6721.145293]  [<ffffffff8117f676>] ? expand_downwards+0x86/0x2a0
[ 6721.148190]  [<ffffffff816be469>] schedule+0x29/0x70
[ 6721.150424]  [<ffffffff816c123d>] rwsem_down_read_failed+0xdd/0x120
[ 6721.154303]  [<ffffffff813c8544>] call_rwsem_down_read_failed+0x14/0x30
[ 6721.156676]  [<ffffffff813c8675>] ? __clear_user+0x25/0x50
[ 6721.158910]  [<ffffffff813c8675>] ? __clear_user+0x25/0x50
[ 6721.160973]  [<ffffffff816c08c7>] ? down_read+0x17/0x20
[ 6721.163021]  [<ffffffff8105b21c>] __do_page_fault+0x42c/0x5c0
[ 6721.165230]  [<ffffffff81079168>] ? __send_signal+0x178/0x4a0
[ 6721.168602]  [<ffffffff8105b3d2>] do_page_fault+0x22/0x30
[ 6721.170293]  [<ffffffff816c49e8>] page_fault+0x28/0x30
[ 6721.171872]  [<ffffffff813c8675>] ? __clear_user+0x25/0x50
[ 6721.173559]  [<ffffffff8102044e>] save_xstate_sig+0x20e/0x230
[ 6721.175371]  [<ffffffff81013ec2>] do_signal+0x952/0xbb0
[ 6721.176998]  [<ffffffff8109c99d>] ? set_next_entity+0x9d/0xb0
[ 6721.178700]  [<ffffffff81490000>] ? regulator_min_uA_show+0x70/0x70
[ 6721.180688]  [<ffffffff814a18d4>] ? pty_write+0x54/0x60
[ 6721.182201]  [<ffffffff816bdf81>] ? __schedule+0x3a1/0x860
[ 6721.183795]  [<ffffffff81014190>] do_notify_resume+0x70/0x90
[ 6721.185418]  [<ffffffff816c37a2>] retint_signal+0x48/0x86
[ 6721.187073] INFO: task c52104y:6850 blocked for more than 120 seconds.
[ 6721.189170]       Not tainted 3.18.3-server-1.mga5 #1
[ 6721.190744] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6721.193014] c52104y         D ffff881a3fdd32c0     0  6850   6846 0x00000000
[ 6721.195174]  ffff88119eeafb30 0000000000000082 ffff881372d14590 00000000000132c0
[ 6721.197567]  ffff88119eeaffd8 00000000000132c0 ffff88189c53a4d0 ffff881372d14590
[ 6721.199912]  ffffffff8117f676 ffff881372d14590 ffff88119d0177a0 ffff88119d0177b8
[ 6721.202559] Call Trace:
[ 6721.203396]  [<ffffffff8117f676>] ? expand_downwards+0x86/0x2a0
[ 6721.205141]  [<ffffffff816be469>] schedule+0x29/0x70
[ 6721.206595]  [<ffffffff816c123d>] rwsem_down_read_failed+0xdd/0x120
[ 6721.208372]  [<ffffffff813c8544>] call_rwsem_down_read_failed+0x14/0x30
[ 6721.210451]  [<ffffffff813c8675>] ? __clear_user+0x25/0x50
[ 6721.212059]  [<ffffffff813c8675>] ? __clear_user+0x25/0x50
[ 6721.213633]  [<ffffffff816c08c7>] ? down_read+0x17/0x20
[ 6721.215167]  [<ffffffff8105b21c>] __do_page_fault+0x42c/0x5c0
[ 6721.216947]  [<ffffffff81079168>] ? __send_signal+0x178/0x4a0
[ 6721.218715]  [<ffffffff8105b3d2>] do_page_fault+0x22/0x30
[ 6721.220335]  [<ffffffff816c49e8>] page_fault+0x28/0x30
[ 6721.221935]  [<ffffffff813c8675>] ? __clear_user+0x25/0x50
[ 6721.223515]  [<ffffffff8102044e>] save_xstate_sig+0x20e/0x230
[ 6721.225243]  [<ffffffff81013ec2>] do_signal+0x952/0xbb0
[ 6721.226814]  [<ffffffff8109c99d>] ? set_next_entity+0x9d/0xb0
[ 6721.228731]  [<ffffffff81490000>] ? regulator_min_uA_show+0x70/0x70
[ 6721.231164]  [<ffffffff814a18d4>] ? pty_write+0x54/0x60
[ 6721.233206]  [<ffffffff816bdf81>] ? __schedule+0x3a1/0x860
[ 6721.235284]  [<ffffffff81014190>] do_notify_resume+0x70/0x90
[ 6721.237435]  [<ffffffff816c37a2>] retint_signal+0x48/0x86
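
The reports above repeat every 120 seconds but cover only three tasks (c52103x, c52104x, c52104y). A minimal sketch for collapsing such a log into one line per task follows. It assumes the "INFO: task NAME:PID blocked for more than N seconds" format seen here, and summarize_hung_tasks is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches the header line of each hung-task report in the kernel log.
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)


def summarize_hung_tasks(log_text):
    """Count hung-task reports per (comm, pid) pair in a kernel log."""
    counts = Counter()
    for line in log_text.splitlines():
        m = HUNG_RE.search(line)
        if m:
            counts[(m.group("comm"), int(m.group("pid")))] += 1
    return counts
```

Feeding it the output of dmesg would show at a glance which tasks are stuck and how many times each was reported.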
Pascal Terjan 2015-01-20 12:13:56 CET

Summary: ps/killall hangs => ada tests during gcc build trigger kernel problem

Comment 3 Pascal Terjan 2015-01-20 12:17:19 CET
I will do more tests tonight (like building gcc on kernel-linus without other load on the machine).
Comment 4 Pascal Terjan 2015-01-20 13:39:22 CET
Also important note, this is building on a tmpfs.
Comment 5 Pascal Terjan 2015-01-20 18:20:05 CET
I could reproduce when building gcc on 3.18.3-server-1.mga5 without anything else running on the machine.

I could not reproduce with -linus-3.18.3-1.mga5 (tried 3 times).

I could also not reproduce with 3.18.2-server-1.mga5 (tried only once)

Assignee: bugsquad => tmb

Comment 6 Pascal Terjan 2015-01-20 20:01:09 CET
Reproduced on 3.18.3-desktop-1.mga5 too
Comment 7 Thomas Backlund 2015-01-20 22:26:21 CET
Hm, iirc there was a thread recently on LKML regarding mm/thp relating to expand_downwards...

But it's weird that 3.18.2 worked and 3.18.3 does not, as I didn't change anything besides the upstream -stable patch (and dropped the merged ones)

and since kernel-linus works (and is built with the same defconfig as the desktop kernel), I guess some of our other patches got in trouble with 3.18.3...


Hm, if you have time, can you try to disable the AUFS patches:

fs-aufs-3.18.patch
fs-aufs-3.18-modular.patch
fs-aufs-adapt-for-3.18.1-d_child-change.patch


and see if the problem goes away?
Comment 8 Pascal Terjan 2015-01-21 01:00:27 CET
For the record, it also happens when building on ext4 instead of tmpfs.

I built a kernel without aufs (there is also a line to remove in the spec) but could still reproduce.

Given the problem I will try without the x86-mm* patches
Comment 9 Pascal Terjan 2015-01-21 02:01:43 CET
I couldn't reproduce after dropping the 3 patches:

x86-mm-consolidate-VM_FAULT_RETRY-handling.patch
x86-mm-move-mmap_sem-unlock-from-mm_fault_error-to-c.patch
x86-mm-fix-VM_FAULT_RETRY-handling.patch

The second one looks suspicious (double up_read):

        up_read(&mm->mmap_sem);
        if (unlikely(fault & VM_FAULT_ERROR)) {
+               up_read(&mm->mmap_sem);
Comment 10 Pascal Terjan 2015-01-21 02:05:54 CET
Ah, this seems to be due to the patches being applied in inverted order: the first one was moving the up_read out, but is now adding it outside; this one was originally adding it inside, but it was not supposed to already be outside.
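
To illustrate the imbalance, here is a toy model in Python. It is not the kernel's rw_semaphore, only a reader counter, and ToyRwsem and fault_path are invented names. It shows that with the misordered patches the error path releases the lock once more than it acquired it; what the real semaphore does after that is not modelled, though a corrupted count would plausibly be consistent with later readers blocking in rwsem_down_read_failed.

```python
class ToyRwsem:
    """Toy reader counter; NOT the kernel's rw_semaphore implementation."""

    def __init__(self):
        self.readers = 0

    def down_read(self):
        self.readers += 1

    def up_read(self):
        self.readers -= 1

    def balanced(self):
        # True when every down_read has been matched by exactly one up_read.
        return self.readers == 0


def fault_path(sem, is_error, double_unlock):
    """Model a fault handler that takes the read lock and releases it on exit."""
    sem.down_read()
    sem.up_read()  # unconditional unlock after handling the fault
    if is_error and double_unlock:
        sem.up_read()  # the extra unlock left behind by the misordered patches
    return sem.balanced()
```

In this model only the combination of an error fault and the extra unlock leaves the counter at -1; the non-error path stays balanced, which may be why most workloads did not trip over it.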
Comment 11 Pascal Terjan 2015-01-21 02:10:03 CET
I committed a possibly fixed patch. Time to sleep; I'll test with it tomorrow.
Comment 12 Thomas Backlund 2015-01-21 09:13:50 CET
Oops, well spotted... 

I wonder how I managed to get them reversed... :/

That is the correct fix.
Comment 13 Pascal Terjan 2015-01-21 18:11:07 CET
Tried with the fixed patch, and things are good.

Do you have another kernel planned soon or should I upload this one?
Comment 14 Thomas Backlund 2015-01-21 18:13:52 CET
kernel-3.18.3-2.mga5 already building and should be uploaded within ~1 hour
Comment 15 Pascal Terjan 2015-01-21 18:22:05 CET
Ah sorry, should have checked :)
Comment 16 Pascal Terjan 2015-01-22 18:12:27 CET
Closing

Status: NEW => RESOLVED
Resolution: (none) => FIXED