Description of problem:
My previous reboot was:
Mon Jul 3 02:37:47 PM EDT 2023
Mageia release 9 (Cauldron) for x86_64
Kernel 6.3.9-server-2.mga9 on a 20-processor x86_64 / \l
Linux pf.pfortin.com 6.3.9-server-2.mga9 #1 SMP PREEMPT_DYNAMIC Fri Jun 23 08:10:12 UTC 2023 x86_64 GNU/Linux

Reboot just now:
Fri Jul 28 07:58:25 PM EDT 2023
Mageia release 9 (Cauldron) for x86_64
Kernel 6.4.6-server-2.mga9 on a 20-processor x86_64 / \l
Linux pf.pfortin.com 6.4.6-server-2.mga9 #1 SMP PREEMPT_DYNAMIC Tue Jul 25 19:09:39 UTC 2023 x86_64 GNU/Linux

After reboot, I had no network. Immediately tried re-configuring WiFi, no go. Configured ethernet and connected wire to router. This worked minimally; DNS was not working:

$ dig cisco.com
;; communications error to ::1#53: connection refused
;; communications error to ::1#53: connection refused
;; communications error to ::1#53: connection refused
;; communications error to 127.0.0.1#53: connection refused

; <<>> DiG 9.18.15 <<>> cisco.com
;; global options: +cmd
;; no servers could be reached

Wireshark showed NO DNS packets.

$ systemctl start network.service
Job for network.service failed because the control process exited with error code.
See "systemctl status network.service" and "journalctl -xeu network.service" for details.

$ systemctl status network.service
× network.service - LSB: Bring up/down networking
     Loaded: loaded (/etc/rc.d/init.d/network; generated)
     Active: failed (Result: exit-code) since Fri 2023-07-28 21:04:20 EDT; 29s ago
       Docs: man:systemd-sysv-generator(8)
    Process: 948426 ExecStart=/etc/rc.d/init.d/network start (code=exited, status=1/FAILURE)
      Tasks: 2 (limit: 154182)
     Memory: 1.1M
        CPU: 1.984s
     CGroup: /system.slice/network.service
             ├─948622 /sbin/ifplugd -I -b -i docker0
             └─948759 /sbin/ifplugd -I -b -i p5p1

Jul 28 21:04:20 network[948887]: RTNETLINK answers: File exists
Jul 28 21:04:20 network[948888]: RTNETLINK answers: File exists
Jul 28 21:04:20 network[948889]: RTNETLINK answers: File exists
Jul 28 21:04:20 network[948890]: RTNETLINK answers: File exists
Jul 28 21:04:20 systemd[1]: network.service: Control process exited, code=exited, status=1/FAILURE
Jul 28 21:04:20 systemd[1]: network.service: Failed with result 'exit-code'.
Jul 28 21:04:20 systemd[1]: network.service: Unit process 948622 (ifplugd) remains running after unit stopped.
Jul 28 21:04:20 systemd[1]: network.service: Unit process 948759 (ifplugd) remains running after unit stopped.
Jul 28 21:04:20 systemd[1]: Failed to start network.service.
Jul 28 21:04:20 systemd[1]: network.service: Consumed 1.981s CPU time.

Version-Release number of selected component (if applicable):
see 'rpm -qa --last' output

How reproducible:
Sorry, I have several services that I need to keep running, so no time to reproduce. Once I got network up, I stopped messing with it...

Steps to Reproduce:
1. Applied updates from July 7 to present. (will attach 'rpm -qa --last' output)
2. Rebooted.
3. No network...

After some quick debugging, went to mcc System services. Enabled DNS; no change. Disabled NetworkManager, NetworkManager-dispatcher, NetworkManager-wait-online and re-configured WiFi and network up.
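A note on the dig output in the report: dig only tried ::1#53 and 127.0.0.1#53. That happens either when /etc/resolv.conf lists only loopback resolvers, or when it lists no usable nameserver at all, in which case dig falls back to the local host. The sketch below tells those cases apart; the function name `resolver_summary` is made up for illustration and is not part of any Mageia tool.

```shell
# Classify the nameserver entries in a resolv.conf-style file.
# Prints one of: no-nameservers, loopback-only, has-remote.
# resolver_summary is a hypothetical helper name, not an existing command.
resolver_summary() {
    file=${1:-/etc/resolv.conf}
    awk '
        BEGIN { total = 0; loopback = 0 }
        $1 == "nameserver" {
            total++
            # ::1 and 127.0.0.0/8 are loopback addresses
            if ($2 == "::1" || $2 ~ /^127\./) loopback++
        }
        END {
            if (total == 0)             print "no-nameservers"
            else if (loopback == total) print "loopback-only"
            else                        print "has-remote"
        }
    ' "$file"
}
```

Running `resolver_summary` with no argument checks /etc/resolv.conf. A result of "loopback-only" or "no-nameservers" would be consistent with the "connection refused" lines above if no local DNS service was listening.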
Created attachment 13928 [details]
rpm -qa --last output

These are all the updates applied since last reboot.
Created attachment 13929 [details]
Journal from reboot to network up.
Thank you for the report with its full details.

(In reply to Pierre Fortin from comment #0)
> Disabled NetworkManager, NetworkManager-dispatcher,
> NetworkManager-wait-online and re-configured WiFi and network up.

Can you please describe briefly what your network *is*. A single WiFi link?
What re-configuration of WiFi was necessary? Had previous details gone?
CC: (none) => lewyssmith
My network has always been a simple WiFi connection to a Linksys WRT3200ACM running DD-WRT firmware to a DSL dual-link modem in bridge mode.

With NetworkManager, which I don't recall setting up/using, after this reboot, I had no connectivity over WiFi. Whenever I have network issues, I usually just go through the mcc "Set up a new network interface (LAN, ISDN, ADSL, ...)" without making changes (details still as expected) to get the network to come up again -- it's faster than trying to diagnose issues and manually correct whatever is wrong.

HTH
I can't confirm from the info in the log, but I suspect the network interface changed names, causing shorewall to reject the packets because the new nic is not listed in /etc/shorewall/interfaces.

Using mcc to get the network up again would have fixed that.

Is this a usb wifi nic? Could it have been moved to a different usb plug? That would cause the nic to change names.
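One way to test that suspicion is to compare the current interface name against /etc/shorewall/interfaces. This is a minimal sketch, assuming the interface name sits in the second column of each entry (the ZONE INTERFACE OPTIONS layout); `iface_listed` is a hypothetical helper name.

```shell
# Return 0 if the given interface appears in column 2 of a
# shorewall interfaces file, 1 otherwise.
# iface_listed is a hypothetical helper; the column position is an assumption.
iface_listed() {
    iface=$1
    file=${2:-/etc/shorewall/interfaces}
    awk -v want="$iface" '
        /^[[:space:]]*[#?]/ { next }     # skip comments and ?FORMAT-style directives
        $2 == want { found = 1 }
        END { exit found ? 0 : 1 }
    ' "$file"
}
```

For example, `iface_listed wlp10s0 && echo listed || echo missing` would show whether the WiFi interface is known to the firewall configuration.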
CC: (none) => davidwhodgins
The nic name changed only once, back in Oct/22 (see bug 30965); no name change this time.

ifconfig
lp10s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.1.46  netmask 255.255.255.0  broadcast 192.168.1.255
        inet6 fe80::46e5:17ff:fefd:1187  prefixlen 64  scopeid 0x20<link>
        ether 44:e5:17:fd:11:87  txqueuelen 1000  (Ethernet)
        RX packets 6545788  bytes 6601378343 (6.1 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 3127160  bytes 3386140413 (3.1 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

uptime
 19:40:57 up 23:43, 12 users, load average: 2.54, 2.43, 2.46

lspci -v
0000:0a:00.0 Network controller: Intel Corporation Wi-Fi 6 AX210/AX211/AX411 160MHz (rev 1a)
        Subsystem: Rivet Networks Device 1674
        Flags: bus master, fast devsel, latency 0, IRQ 19
        Memory at 84500000 (64-bit, non-prefetchable) [size=16K]
        Capabilities: [c8] Power Management version 3
        Capabilities: [d0] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [40] Express Endpoint, MSI 00
        Capabilities: [80] MSI-X: Enable+ Count=16 Masked-
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [14c] Latency Tolerance Reporting
        Capabilities: [154] L1 PM Substates
        Kernel driver in use: iwlwifi
        Kernel modules: iwlwifi
copy/paste missed the 'w' wlp10s0.
Sorry to have left you. A basic point: is this reproducible, or did it just happen once? I appreciate that you would need to re-boot to try it...
Only when I rebooted. I included the differences that led up to this from the previous reboot. My guess is that it was a one-time occurrence; but if that was to happen to others, they may not have the skills to recover -- without a network, it's not possible to search for solutions. I don't have a reboot scheduled at this time; but I see there's a new kernel, so I may update and reboot late tonight if time permits...
If this was a one-off incident, there is no hope for a resolution. If it happens again after another re-boot, please attach (yes again, sorry, another example in the hope it reveals something more) the compressed system journal. [Just 'xz' the text journal extract]. We shall then investigate the issue more fully.
Created attachment 13933 [details]
journal of boot

My gut tells me we may be on the forefront of a race condition...

This time, WiFi came up; but wireless keyboard (Logitech K350) did not come up until I:
- plugged in a wired keyboard to be able to login
- disconnected/connected USB Unifying Receiver
- the wireless mouse (Logitech MX Master 3S) came up OK (USB Bolt Receiver)
Created attachment 13934 [details]
journal of boot with comments

Oops.. forgot to save journal with comments before zipping...
Attachment 13933 is obsolete: 0 => 1
$ grep 'Logitech USB Receiver as' journal
Aug 05 18:35:34 kernel: input: Logitech USB Receiver as /devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5.4/1-5.4.3/1-5.4.3:1.0/0003:046D:C548.0002/input/input17
Aug 05 18:35:34 kernel: input: Logitech USB Receiver as /devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5.4/1-5.4.4/1-5.4.4:1.0/0003:046D:C52B.0005/input/input21
Aug 05 18:41:59 kernel: input: Logitech USB Receiver as /devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5.4/1-5.4.3/1-5.4.3:1.0/0003:046D:C548.000A/input/input28
Aug 05 18:43:01 kernel: input: Logitech USB Receiver as /devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5.4/1-5.4.3/1-5.4.3:1.0/0003:046D:C548.0016/input/input39

Is there more than one Receiver?
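The question can also be checked from the journal itself, since each input path embeds a vendor:product ID. A rough sketch, assuming the path layout seen in the lines above (a `BUS:VENDOR:PRODUCT.INSTANCE` segment just before `/input/`); `receiver_ids` is a hypothetical helper name.

```shell
# List the distinct vendor:product IDs of "Logitech USB Receiver" input devices.
# Reads journal text on stdin. receiver_ids is a hypothetical helper name.
receiver_ids() {
    grep 'Logitech USB Receiver as' \
        | sed -E 's#.*/[0-9A-Fa-f]{4}:([0-9A-Fa-f]{4}:[0-9A-Fa-f]{4})\.[0-9A-Fa-f]{4}/input/.*#\1#' \
        | sort -u
}
```

With `receiver_ids < journal`, more than one line of output means more than one receiver model was enumerated during that boot.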
Also, this doesn't look good ...
$ grep usb journal |grep error
Aug 05 18:35:32 kernel: usb 1-5.4.1: device descriptor read/64, error -71
Aug 05 18:35:32 kernel: usb 1-5.4.1: device descriptor read/64, error -71
Aug 05 18:35:32 kernel: usb 1-5.4.1: device descriptor read/64, error -71
Aug 05 18:35:32 kernel: usb 1-5.4.1: device descriptor read/64, error -71
Aug 05 18:35:33 kernel: usb 1-5.4.1: device not accepting address 9, error -71
Aug 05 18:35:33 kernel: usb 1-5.4.1: device not accepting address 10, error -71
Aug 05 18:42:13 kernel: usb 1-5.4.1: device not accepting address 23, error -71
Aug 05 18:42:13 kernel: usb 1-5.4.1: device descriptor read/all, error -71
Aug 05 18:42:15 kernel: usb 1-5.4.1: device not accepting address 26, error -71
Aug 05 18:42:16 kernel: usb 1-5.4.1: can't set config #1, error -71
Aug 05 18:42:18 kernel: usb 1-5.4.1: device descriptor read/64, error -71
Aug 05 18:42:18 kernel: usb 1-5.4.1: can't set config #1, error -71
Aug 05 18:42:19 kernel: usb 1-5.4.1: device descriptor read/64, error -71
Aug 05 18:42:20 kernel: usb 1-5.4.1: device descriptor read/64, error -71
Aug 05 18:42:23 kernel: usb 1-5.4.1: device descriptor read/64, error -71
Aug 05 18:42:23 kernel: usb 1-5.4.1: device descriptor read/64, error -71
Aug 05 18:42:24 kernel: usb 1-5.1: device descriptor read/64, error -71
Aug 05 18:42:35 kernel: usb 1-5.4.1: device not accepting address 33, error -62
Aug 05 18:42:36 kernel: usb 1-5.4.1: device not accepting address 34, error -71

https://ubuntuforums.org/showthread.php?t=797789 might be relevant.
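To see at a glance which USB port the errors cluster on, the same journal lines can be tallied per device path and error code. On Linux, -71 corresponds to EPROTO and -62 to ETIME. This is only a sketch; `usb_error_summary` is a hypothetical helper name, and it assumes the "kernel: usb PATH: ... error -N" line shape shown above.

```shell
# Count kernel USB error lines per device path and error code.
# Reads journal text on stdin; prints "COUNT DEVICE ERROR", most frequent first.
# usb_error_summary is a hypothetical helper name.
usb_error_summary() {
    grep -E 'kernel: usb [0-9.-]+: .*error -[0-9]+' \
        | sed -E 's/.*kernel: usb ([0-9.-]+): .*error (-[0-9]+).*/\1 \2/' \
        | sort | uniq -c | sort -rn \
        | awk '{ print $1, $2, $3 }'
}
```

Usage would be `usb_error_summary < journal`. If nearly all errors land on one path, as the grep output above suggests for 1-5.4.1, that narrows the problem to one port on the hub chain rather than the whole controller.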
Also check https://www.linuxquestions.org/questions/linux-hardware-18/usb-5-1-device-descriptor-read-64-error-71-a-4175640937/
https://stackoverflow.com/questions/9544557/debian-device-descriptor-read-64-error-71 explains the most likely causes.
(In reply to Dave Hodgins from comment #13)
>
> Is there more than one Receiver?

Yes, all Logitech:
* Bolt Receiver for MX Master 3S mouse
* Unifying Receiver for K350 keyboard (can handle 6 devices except above keyboard)

See https://support.logi.com/hc/en-us/articles/1500012483162-What-is-the-difference-between-Bolt-and-Unifying-receivers-
(In reply to Dave Hodgins from comment #16)
> https://stackoverflow.com/questions/9544557/debian-device-descriptor-read-64-error-71
> explains the most likely causes.

I disagree with the comments in that post. All hardware works great here; this is looking more like a kernel race condition or faulty code. Mouse and keyboard have been connected and running great for months; including right now...

This started with WiFi, now the keyboard; next reboot could be more interesting... :)

In case it wasn't obvious, this time all I did to "fix" this keyboard issue was disconnect/reconnect the Unifying Receiver for the kernel to now see the keyboard that got missed on boot for the first time ever...
Assigning to the kernel and drivers team.
Assignee: bugsquad => kernel