Description of problem:
Here is /var/log/messages:

Apr 14 23:00:19 localhost kernel: NetworkManager[7750]: segfault at 0 ip (null) sp bfc2b8dc error 4 in NetworkManager[8048000+ff000]
Apr 14 23:00:19 localhost systemd[1]: NetworkManager.service: main process exited, code=killed, status=11/SEGV
Apr 14 23:00:19 localhost systemd[1]: Unit NetworkManager.service entered failed state

Reproducible:

Steps to Reproduce:
I can reproduce it just by plugging in an ethernet cable. This is a release blocker.
Priority: Normal => release_blocker
CC: (none) => derekjenn, eeeemail, ennael1, ftg, mageia, mageia, manuel.mageia, pierre-malo.denielou, sander.lepik, thierry.vignaud, tmb
Assignee: bugsquad => mageia
Backtrace is:

(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x00007ffff6a819d2 in nl_cache_include () from /lib64/libnl-3.so.200
#2  0x000000000048788c in netlink_notification ()
#3  0x00007ffff62ccf6a in g_cclosure_marshal_VOID__POINTERv () from /lib64/libgobject-2.0.so.0
#4  0x00007ffff62c9ef7 in _g_closure_invoke_va () from /lib64/libgobject-2.0.so.0
#5  0x00007ffff62e28f6 in g_signal_emit_valist () from /lib64/libgobject-2.0.so.0
#6  0x00007ffff62e3142 in g_signal_emit () from /lib64/libgobject-2.0.so.0
#7  0x000000000046c3ec in event_msg_ready ()
#8  0x00007ffff6a8667a in nl_recvmsgs_report () from /lib64/libnl-3.so.200
#9  0x00007ffff6a869f9 in nl_recvmsgs () from /lib64/libnl-3.so.200
#10 0x000000000046c285 in event_handler ()
#11 0x00007ffff600b6d5 in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
#12 0x00007ffff600ba08 in g_main_context_iterate.isra.24 () from /lib64/libglib-2.0.so.0
#13 0x00007ffff600be02 in g_main_loop_run () from /lib64/libglib-2.0.so.0
#14 0x0000000000425240 in main ()
Status: NEW => ASSIGNED
This commit in libnl should fix it: https://github.com/tgraf/libnl/commit/ba38f3919835c39d7bc1e939ef3ca89cfe31600d
It seems we already have some protection against this. I do not yet understand why it fails here:

(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x00007ffff6a819d2 in cache_include (data=0x0, cb=0x0, obj=0x78ad70, cache=0x76d000, type=<optimized out>) at cache.c:774
#2  nl_cache_include (cache=0x76d000, obj=0x78ad70, change_cb=0x0, data=0x0) at cache.c:799
#3  0x000000000048788c in netlink_notification ()
#4  0x00007ffff62ccf6a in g_cclosure_marshal_VOID__POINTERv () from /lib64/libgobject-2.0.so.0
#5  0x00007ffff62c9ef7 in _g_closure_invoke_va () from /lib64/libgobject-2.0.so.0
#6  0x00007ffff62e28f6 in g_signal_emit_valist () from /lib64/libgobject-2.0.so.0
#7  0x00007ffff62e3142 in g_signal_emit () from /lib64/libgobject-2.0.so.0
#8  0x000000000046c3ec in event_msg_ready ()
#9  0x00007ffff6a8667a in nl_cb_call (msg=0x779d30, type=0, cb=0x710040) at ../include/netlink-local.h:124
#10 recvmsgs (cb=0x710040, sk=0x710130) at nl.c:909
#11 nl_recvmsgs_report (sk=0x710130, cb=0x710040) at nl.c:960
#12 0x00007ffff6a869f9 in nl_recvmsgs (sk=<optimized out>, cb=<optimized out>) at nl.c:984
#13 0x000000000046c285 in event_handler ()
#14 0x00007ffff600b6d5 in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
#15 0x00007ffff600ba08 in g_main_context_iterate.isra.24 () from /lib64/libglib-2.0.so.0
#16 0x00007ffff600be02 in g_main_loop_run () from /lib64/libglib-2.0.so.0
#17 0x0000000000425240 in main ()

(gdb) list
769             nl_cache_move(cache, obj);
770             if (old == NULL && cb)
771                 cb(cache, obj, NL_ACT_NEW, data);
772             else if (old) {
773                 if (nl_object_diff(old, obj) && cb)
774                     cb(cache, obj, NL_ACT_CHANGE, data);
775
776                 nl_object_put(old);
777             }
778         }

(gdb) p cb
$2 = (void (*)(struct nl_cache *, struct nl_object *, int, void *)) 0x0

Yet just before line 774 there is already a NULL check on cb.
Summary: networkmanager segfaults when connecting wifi wpa2 => networkmanager segfaults when connecting wifi wpa2 or ethernet
Severity: major => critical
I didn't notice this with the RC, if it helps.
I think the situation is more complex than a constant crash. I have seen this once in the past - namely at FOSDEM where NM crashed in a similar way. I think I've since trashed the core/backtrace I kept but IIRC it was similar to the backtrace shown here. The wifi network at FOSDEM was the only one to trigger this crash for me... can't say I've noticed it misbehaving since then :s I wonder if it's related to the _va functions messing with the stack? These _va handling things often seem to cause problems!
I can reproduce it easily here: I plug in a cable and it crashes. But yes, it does not happen for everyone, since it seems only a few of us are hitting this bug.
CC: manuel.mageia => (none)
Actually, the patch I mentioned in comment 3 (commit ba38f3919835c39d7bc1e939ef3ca89cfe31600d) is enough. It is just that the gdb line information was off by about a dozen lines, so the crash was not at the line gdb reported.
Fixed in libnl3-3.2.17-3, which is waiting to be pushed through the release freeze.
Status: ASSIGNED => RESOLVED
Resolution: (none) => FIXED