| Summary: | no network after reboot; disabled NetworkManager to restore. | ||
|---|---|---|---|
| Product: | Mageia | Reporter: | Pierre Fortin <pfortin> |
| Component: | RPM Packages | Assignee: | Kernel and Drivers maintainers <kernel> |
| Status: | NEW --- | QA Contact: | |
| Severity: | normal | ||
| Priority: | Normal | CC: | davidwhodgins, lewyssmith |
| Version: | Cauldron | ||
| Target Milestone: | --- | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Source RPM: | CVE: | ||
| Status comment: | |||
| Attachments: | rpm -qa --last output; Journal from reboot to network up; journal of boot; journal of boot with comments | ||
Created attachment 13928 [details]
rpm -qa --last output
These are all the updates applied since the last reboot.
Created attachment 13929 [details]
Journal from reboot to network up.
Thank you for the report with its full details.

(In reply to Pierre Fortin from comment #0)
> Disabled NetworkManager, NetworkManager-dispatcher,
> NetworkManager-wait-online and re-configured WiFi and network up.

Can you please describe briefly what your network *is*? A single WiFi link? What re-configuration of WiFi was necessary? Had the previous details gone?

CC: (none) => lewyssmith

My network has always been a simple WiFi connection to a Linksys WRT3200ACM running DD-WRT firmware, connected to a DSL dual-link modem in bridge mode.

With NetworkManager, which I don't recall setting up or using, I had no connectivity over WiFi after this reboot.

Whenever I have network issues, I usually just go through the mcc "Set up a new network interface (LAN, ISDN, ADSL, ...)" without making changes (details still as expected) to get the network to come up again -- it's faster than trying to diagnose issues and manually correct whatever is wrong.

HTH

I can't confirm from the info in the log, but I suspect the network interface changed names, causing shorewall to reject the packets because the new nic is not listed in /etc/shorewall/interfaces. Using mcc to get the network up again would have fixed that.

Is this a usb wifi nic? Could it have been moved to a different usb plug? That would cause the nic to change names.

CC:
(none) =>
davidwhodgins

The nic name changed only once, back in Oct/22 (see bug 30965); no name change this time.

ifconfig
lp10s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 192.168.1.46 netmask 255.255.255.0 broadcast 192.168.1.255
        inet6 fe80::46e5:17ff:fefd:1187 prefixlen 64 scopeid 0x20<link>
        ether 44:e5:17:fd:11:87 txqueuelen 1000 (Ethernet)
        RX packets 6545788 bytes 6601378343 (6.1 GiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 3127160 bytes 3386140413 (3.1 GiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

uptime
 19:40:57 up 23:43, 12 users, load average: 2.54, 2.43, 2.46

lspci -v
0000:0a:00.0 Network controller: Intel Corporation Wi-Fi 6 AX210/AX211/AX411 160MHz (rev 1a)
        Subsystem: Rivet Networks Device 1674
        Flags: bus master, fast devsel, latency 0, IRQ 19
        Memory at 84500000 (64-bit, non-prefetchable) [size=16K]
        Capabilities: [c8] Power Management version 3
        Capabilities: [d0] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [40] Express Endpoint, MSI 00
        Capabilities: [80] MSI-X: Enable+ Count=16 Masked-
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [14c] Latency Tolerance Reporting
        Capabilities: [154] L1 PM Substates
        Kernel driver in use: iwlwifi
        Kernel modules: iwlwifi

Copy/paste missed the 'w': wlp10s0.

Sorry to have left you. A basic point: is this reproducible, or did it just happen once? I appreciate that you would need to re-boot to try it...

Only when I rebooted. I included the differences that led up to this from the previous reboot. My guess is that it was a one-time occurrence; but if that were to happen to others, they may not have the skills to recover -- without a network, it's not possible to search for solutions. I don't have a reboot scheduled at this time; but I see there's a new kernel, so I may update and reboot late tonight if time permits...

If this was a one-off incident, there is no hope for a resolution. If it happens again after another re-boot, please attach (yes again, sorry, another example in the hope it reveals something more) the compressed system journal. [Just 'xz' the text journal extract]. We shall then investigate the issue more fully.

Created attachment 13933 [details]
journal of boot
My gut tells me we may be on the forefront of a race condition...
This time, WiFi came up; but wireless keyboard (Logitech K350) did not come up until I:
- plugged in a wired keyboard to be able to login
- disconnected/connected USB Unifying Receiver
- the wireless mouse (Logitech MX Master 3S) came up OK (USB Bolt Receiver)
Created attachment 13934 [details]
journal of boot with comments
Oops.. forgot to save journal with comments before zipping...
Pierre Fortin
2023-08-06 02:03:50 CEST
Attachment 13933 is obsolete: 0 => 1

$ grep 'Logitech USB Receiver as' journal
Aug 05 18:35:34 kernel: input: Logitech USB Receiver as /devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5.4/1-5.4.3/1-5.4.3:1.0/0003:046D:C548.0002/input/input17
Aug 05 18:35:34 kernel: input: Logitech USB Receiver as /devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5.4/1-5.4.4/1-5.4.4:1.0/0003:046D:C52B.0005/input/input21
Aug 05 18:41:59 kernel: input: Logitech USB Receiver as /devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5.4/1-5.4.3/1-5.4.3:1.0/0003:046D:C548.000A/input/input28
Aug 05 18:43:01 kernel: input: Logitech USB Receiver as /devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5.4/1-5.4.3/1-5.4.3:1.0/0003:046D:C548.0016/input/input39

Is there more than one Receiver?

Also, this doesn't look good ...

$ grep usb journal |grep error
Aug 05 18:35:32 kernel: usb 1-5.4.1: device descriptor read/64, error -71
Aug 05 18:35:32 kernel: usb 1-5.4.1: device descriptor read/64, error -71
Aug 05 18:35:32 kernel: usb 1-5.4.1: device descriptor read/64, error -71
Aug 05 18:35:32 kernel: usb 1-5.4.1: device descriptor read/64, error -71
Aug 05 18:35:33 kernel: usb 1-5.4.1: device not accepting address 9, error -71
Aug 05 18:35:33 kernel: usb 1-5.4.1: device not accepting address 10, error -71
Aug 05 18:42:13 kernel: usb 1-5.4.1: device not accepting address 23, error -71
Aug 05 18:42:13 kernel: usb 1-5.4.1: device descriptor read/all, error -71
Aug 05 18:42:15 kernel: usb 1-5.4.1: device not accepting address 26, error -71
Aug 05 18:42:16 kernel: usb 1-5.4.1: can't set config #1, error -71
Aug 05 18:42:18 kernel: usb 1-5.4.1: device descriptor read/64, error -71
Aug 05 18:42:18 kernel: usb 1-5.4.1: can't set config #1, error -71
Aug 05 18:42:19 kernel: usb 1-5.4.1: device descriptor read/64, error -71
Aug 05 18:42:20 kernel: usb 1-5.4.1: device descriptor read/64, error -71
Aug 05 18:42:23 kernel: usb 1-5.4.1: device descriptor read/64, error -71
Aug 05 18:42:23 kernel: usb 1-5.4.1: device descriptor read/64, error -71
Aug 05 18:42:24 kernel: usb 1-5.1: device descriptor read/64, error -71
Aug 05 18:42:35 kernel: usb 1-5.4.1: device not accepting address 33, error -62
Aug 05 18:42:36 kernel: usb 1-5.4.1: device not accepting address 34, error -71

https://ubuntuforums.org/showthread.php?t=797789 might be relevant.

https://stackoverflow.com/questions/9544557/debian-device-descriptor-read-64-error-71 explains the most likely causes.

(In reply to Dave Hodgins from comment #13)
> Is there more than one Receiver?

Yes, all Logitech:
* Bolt Receiver for MX Master 3S mouse
* Unifying Receiver for K350 keyboard (can handle 6 devices except above keyboard)

See https://support.logi.com/hc/en-us/articles/1500012483162-What-is-the-difference-between-Bolt-and-Unifying-receivers-

(In reply to Dave Hodgins from comment #16)
> https://stackoverflow.com/questions/9544557/debian-device-descriptor-read-64-error-71
> explains the most likely causes.

I disagree with the comments in that post. All hardware works great here; this is looking more like a kernel race condition or faulty code. Mouse and keyboard have been connected and running great for months; including right now... This started with WiFi, now the keyboard; next reboot could be more interesting... :)

In case it wasn't obvious, this time all I did to "fix" this keyboard issue was disconnect/reconnect the Unifying Receiver for the kernel to now see the keyboard that got missed on boot for the first time ever...

Assigning to the kernel and drivers team.

Assignee: bugsquad => kernel |
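For anyone comparing journals across reboots, the USB error lines quoted in this thread can be tallied per device and error code instead of being read line by line. The following is only a rough sketch: the function name `usb_error_tally` is made up for illustration, and it assumes a plain-text journal extract whose kernel lines have the same shape as the ones quoted here. On Linux, error -71 corresponds to EPROTO (protocol error) and -62 to ETIME (timer expired).

```shell
#!/bin/sh
# Hypothetical helper: count kernel USB error lines per device and errno
# in a text journal extract (for example one saved with: journalctl -b > journal).
# Assumption: lines look like
#   "Aug 05 18:35:32 kernel: usb 1-5.4.1: device descriptor read/64, error -71"
usb_error_tally() {
    # Keep only kernel USB lines that end in a negative error code.
    grep -E 'kernel: usb [^ ]+: .*error -[0-9]+' "$1" |
        # Reduce each line to "<device> <errno>".
        sed -E 's/.*kernel: usb ([^ :]+): .*error (-[0-9]+).*/\1 \2/' |
        # Count identical pairs; LC_ALL=C keeps the sort order predictable.
        LC_ALL=C sort | uniq -c |
        # Print as "<device> error <errno>: <count>".
        awk '{ printf "%s error %s: %d\n", $2, $3, $1 }'
}
```

Running `usb_error_tally journal` against each boot's extract gives a compact summary that is easier to compare between a good boot and a bad one.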
Description of problem:

My previous reboot was:
Mon Jul 3 02:37:47 PM EDT 2023
Mageia release 9 (Cauldron) for x86_64
Kernel 6.3.9-server-2.mga9 on a 20-processor x86_64 / \l
Linux pf.pfortin.com 6.3.9-server-2.mga9 #1 SMP PREEMPT_DYNAMIC Fri Jun 23 08:10:12 UTC 2023 x86_64 GNU/Linux

Reboot just now:
Fri Jul 28 07:58:25 PM EDT 2023
Mageia release 9 (Cauldron) for x86_64
Kernel 6.4.6-server-2.mga9 on a 20-processor x86_64 / \l
Linux pf.pfortin.com 6.4.6-server-2.mga9 #1 SMP PREEMPT_DYNAMIC Tue Jul 25 19:09:39 UTC 2023 x86_64 GNU/Linux

After reboot, I had no network. Immediately tried re-configuring WiFi, no go. Configured ethernet and connected wire to router. This worked minimally; DNS was not working:

$ dig cisco.com
;; communications error to ::1#53: connection refused
;; communications error to ::1#53: connection refused
;; communications error to ::1#53: connection refused
;; communications error to 127.0.0.1#53: connection refused

; <<>> DiG 9.18.15 <<>> cisco.com
;; global options: +cmd
;; no servers could be reached

Wireshark showed NO DNS packets.

$ systemctl start network.service
Job for network.service failed because the control process exited with error code.
See "systemctl status network.service" and "journalctl -xeu network.service" for details.

$ systemctl status network.service
× network.service - LSB: Bring up/down networking
     Loaded: loaded (/etc/rc.d/init.d/network; generated)
     Active: failed (Result: exit-code) since Fri 2023-07-28 21:04:20 EDT; 29s ago
       Docs: man:systemd-sysv-generator(8)
    Process: 948426 ExecStart=/etc/rc.d/init.d/network start (code=exited, status=1/FAILURE)
      Tasks: 2 (limit: 154182)
     Memory: 1.1M
        CPU: 1.984s
     CGroup: /system.slice/network.service
             ├─948622 /sbin/ifplugd -I -b -i docker0
             └─948759 /sbin/ifplugd -I -b -i p5p1

Jul 28 21:04:20 network[948887]: RTNETLINK answers: File exists
Jul 28 21:04:20 network[948888]: RTNETLINK answers: File exists
Jul 28 21:04:20 network[948889]: RTNETLINK answers: File exists
Jul 28 21:04:20 network[948890]: RTNETLINK answers: File exists
Jul 28 21:04:20 systemd[1]: network.service: Control process exited, code=exited, status=1/FAILURE
Jul 28 21:04:20 systemd[1]: network.service: Failed with result 'exit-code'.
Jul 28 21:04:20 systemd[1]: network.service: Unit process 948622 (ifplugd) remains running after unit stopped.
Jul 28 21:04:20 systemd[1]: network.service: Unit process 948759 (ifplugd) remains running after unit stopped.
Jul 28 21:04:20 systemd[1]: Failed to start network.service.
Jul 28 21:04:20 systemd[1]: network.service: Consumed 1.981s CPU time.

Version-Release number of selected component (if applicable):
see 'rpm -qa --last' output

How reproducible:
Sorry, I have several services that I need to keep running, so no time to reproduce. Once I got network up, I stopped messing with it...

Steps to Reproduce:
1. Applied updates from July 7 to present. (will attach 'rpm -qa --last' output)
2. Rebooted.
3. No network...

After some quick debugging, went to mcc System services. Enabled DNS; no change. Disabled NetworkManager, NetworkManager-dispatcher, NetworkManager-wait-online and re-configured WiFi and network up.
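A note on the dig output above: "connection refused" from ::1#53 and 127.0.0.1#53 usually means that nothing is listening on local port 53, i.e. the system is pointed at a local resolver that is not running. A quick way to check is to look for a listening socket on that port. The sketch below is only an illustration: the helper name `has_dns_listener` is invented here, and it assumes the `ss -lntu` column layout in which the local "address:port" is the fifth field.

```shell
#!/bin/sh
# Hypothetical helper: read `ss -lntu` output on stdin and succeed (exit 0)
# if any listening socket has local port 53.
# Assumption: the local "address:port" is the fifth whitespace-separated field.
has_dns_listener() {
    awk '$5 ~ /:53$/ { found = 1 } END { exit !found }'
}

# Example use (not run here):
#   ss -lntu | has_dns_listener && echo "something is listening on port 53" \
#                               || echo "nothing is listening on port 53"
```

If nothing is listening, the next things to look at would be which resolver /etc/resolv.conf points to and whether the corresponding service is enabled.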