Bug 9734 - networkmanager segfaults when connecting wifi wpa2 or ethernet
Summary: networkmanager segfaults when connecting wifi wpa2 or ethernet
Status: RESOLVED FIXED
Alias: None
Product: Mageia
Classification: Unclassified
Component: RPM Packages
Version: Cauldron
Hardware: i586 Linux
Priority: release_blocker critical
Target Milestone: ---
Assignee: Olivier Blin
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-04-14 23:14 CEST by José Jorge
Modified: 2013-05-12 01:31 CEST

See Also:
Source RPM: networkmanager
CVE:
Status comment:


Description José Jorge 2013-04-14 23:14:21 CEST
Description of problem: NetworkManager segfaults when connecting. Here is the relevant excerpt from /var/log/messages:

Apr 14 23:00:19 localhost kernel: NetworkManager[7750]: segfault at 0 ip   (null) sp bfc2b8dc error 4 in NetworkManager[8048000+ff000]
Apr 14 23:00:19 localhost systemd[1]: NetworkManager.service: main process exited, code=killed, status=11/SEGV
Apr 14 23:00:19 localhost systemd[1]: Unit NetworkManager.service entered failed state


Reproducible: 

Steps to Reproduce:
Comment 1 Olivier Blin 2013-05-11 14:48:31 CEST
I can reproduce this just by plugging in an Ethernet cable.

This is a release blocker.

Priority: Normal => release_blocker
CC: (none) => derekjenn, eeeemail, ennael1, ftg, mageia, mageia, manuel.mageia, pierre-malo.denielou, sander.lepik, thierry.vignaud, tmb
Assignee: bugsquad => mageia

Comment 2 Olivier Blin 2013-05-11 14:51:28 CEST
The backtrace is:
(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x00007ffff6a819d2 in nl_cache_include () from /lib64/libnl-3.so.200
#2  0x000000000048788c in netlink_notification ()
#3  0x00007ffff62ccf6a in g_cclosure_marshal_VOID__POINTERv () from /lib64/libgobject-2.0.so.0
#4  0x00007ffff62c9ef7 in _g_closure_invoke_va () from /lib64/libgobject-2.0.so.0
#5  0x00007ffff62e28f6 in g_signal_emit_valist () from /lib64/libgobject-2.0.so.0
#6  0x00007ffff62e3142 in g_signal_emit () from /lib64/libgobject-2.0.so.0
#7  0x000000000046c3ec in event_msg_ready ()
#8  0x00007ffff6a8667a in nl_recvmsgs_report () from /lib64/libnl-3.so.200
#9  0x00007ffff6a869f9 in nl_recvmsgs () from /lib64/libnl-3.so.200
#10 0x000000000046c285 in event_handler ()
#11 0x00007ffff600b6d5 in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
#12 0x00007ffff600ba08 in g_main_context_iterate.isra.24 () from /lib64/libglib-2.0.so.0
#13 0x00007ffff600be02 in g_main_loop_run () from /lib64/libglib-2.0.so.0
#14 0x0000000000425240 in main ()

Status: NEW => ASSIGNED

Comment 3 Olivier Blin 2013-05-11 14:52:39 CEST
This commit in libnl should fix it:
https://github.com/tgraf/libnl/commit/ba38f3919835c39d7bc1e939ef3ca89cfe31600d
Comment 4 Olivier Blin 2013-05-11 15:01:19 CEST
It seems we already have some protection against this.
I do not yet see why it fails here:

(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x00007ffff6a819d2 in cache_include (data=0x0, cb=0x0, obj=0x78ad70, cache=0x76d000, type=<optimized out>) at cache.c:774
#2  nl_cache_include (cache=0x76d000, obj=0x78ad70, change_cb=0x0, data=0x0) at cache.c:799
#3  0x000000000048788c in netlink_notification ()
#4  0x00007ffff62ccf6a in g_cclosure_marshal_VOID__POINTERv () from /lib64/libgobject-2.0.so.0
#5  0x00007ffff62c9ef7 in _g_closure_invoke_va () from /lib64/libgobject-2.0.so.0
#6  0x00007ffff62e28f6 in g_signal_emit_valist () from /lib64/libgobject-2.0.so.0
#7  0x00007ffff62e3142 in g_signal_emit () from /lib64/libgobject-2.0.so.0
#8  0x000000000046c3ec in event_msg_ready ()
#9  0x00007ffff6a8667a in nl_cb_call (msg=0x779d30, type=0, cb=0x710040) at ../include/netlink-local.h:124
#10 recvmsgs (cb=0x710040, sk=0x710130) at nl.c:909
#11 nl_recvmsgs_report (sk=0x710130, cb=0x710040) at nl.c:960
#12 0x00007ffff6a869f9 in nl_recvmsgs (sk=<optimized out>, cb=<optimized out>) at nl.c:984
#13 0x000000000046c285 in event_handler ()
#14 0x00007ffff600b6d5 in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
#15 0x00007ffff600ba08 in g_main_context_iterate.isra.24 () from /lib64/libglib-2.0.so.0
#16 0x00007ffff600be02 in g_main_loop_run () from /lib64/libglib-2.0.so.0
#17 0x0000000000425240 in main ()
(gdb) list
769				nl_cache_move(cache, obj);
770				if (old == NULL && cb)
771					cb(cache, obj, NL_ACT_NEW, data);
772				else if (old) {
773					if (nl_object_diff(old, obj) && cb)
774						cb(cache, obj, NL_ACT_CHANGE, data);
775	
776					nl_object_put(old);
777				}
778			}
(gdb) p cb
$2 = (void (*)(struct nl_cache *, struct nl_object *, int, void *)) 0x0

Just before line 774, there is already a NULL check on cb.
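
For illustration, the kind of NULL guard on an optional callback discussed here can be sketched as below. This is a minimal sketch under stated assumptions: the names demo_notify, demo_change_cb, demo_count_cb and the DEMO_ACT_* constants are hypothetical stand-ins, not libnl's types or API, and the code does not reproduce the referenced commit. It only shows why an unguarded call through a NULL function pointer would be consistent with frame #0 sitting at address 0x0.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not libnl's types or API. */
enum demo_action { DEMO_ACT_NEW = 1, DEMO_ACT_CHANGE = 2 };

typedef void (*demo_change_cb)(int action, void *data);

/* Example callback: increments the int counter that data points to. */
static void demo_count_cb(int action, void *data)
{
    (void)action;
    ++*(int *)data;
}

/*
 * Report a change to an optional callback.
 * Returns 1 if the callback was invoked, 0 if it was skipped because
 * cb is NULL. Without this check, calling cb(...) with cb == NULL is
 * undefined behaviour; on Linux it typically ends up executing at
 * address 0, which matches "ip (null)" in the kernel log.
 */
static int demo_notify(demo_change_cb cb, int action, void *data)
{
    if (cb == NULL)
        return 0;
    cb(action, data);
    return 1;
}
```

Every call site that may receive a NULL callback needs such a guard; a single unguarded call is enough to crash even if the neighbouring calls are protected.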
Olivier Blin 2013-05-11 15:05:33 CEST

Summary: networkmanager segfaults when connecting wifi wpa2 => networkmanager segfaults when connecting wifi wpa2 or ethernet
Severity: major => critical

Comment 5 claire robinson 2013-05-11 15:09:52 CEST
I didn't notice this with the RC, if it helps.
Comment 6 Colin Guthrie 2013-05-11 17:18:31 CEST
I think the situation is more complex than a constant crash. I have seen this once in the past - namely at FOSDEM where NM crashed in a similar way. I think I've since trashed the core/backtrace I kept but IIRC it was similar to the backtrace shown here.

The wifi network at FOSDEM was the only one to trigger this crash for me... can't say I've noticed it misbehaving since then :s

I wonder if it's related to the _va functions messing with the stack? These _va handling things often seem to cause problems!
Comment 7 Olivier Blin 2013-05-11 22:32:52 CEST
I can reproduce it easily here: I plug in a cable, it crashes.
But yes, it is not that consistent, since only a few of us seem to hit this bug.
Manuel Hiebel 2013-05-11 22:39:25 CEST

CC: manuel.mageia => (none)

Comment 8 Olivier Blin 2013-05-12 01:22:07 CEST
Actually, the patch I mentioned in comment 3 (commit ba38f3919835c39d7bc1e939ef3ca89cfe31600d) is enough; the gdb line information was just off by about a dozen lines.
Comment 9 Olivier Blin 2013-05-12 01:31:16 CEST
Fixed in libnl3-3.2.17-3, waiting to be pushed through release freeze.

Status: ASSIGNED => RESOLVED
Resolution: (none) => FIXED

