Bug 29545 - boot process has 3 min pause if usbc dongle change from hdmi to network (service_harddrake[...]: ERROR: killing runaway process)
Status: NEW
Alias: None
Product: Mageia
Classification: Unclassified
Component: RPM Packages
Version: 8
Hardware: All Linux
Priority: Normal normal
Target Milestone: ---
Assignee: Mageia tools maintainers
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-10-11 17:45 CEST by w unruh
Modified: 2021-10-20 23:18 CEST
CC List: 2 users

See Also:
Source RPM: drakxtools
CVE:
Status comment:


Attachments
lspcidrake ethernet (4.13 KB, text/plain)
2021-10-12 16:54 CEST, w unruh
journalctl -b >Ethernet_log.txt (260.75 KB, text/plain)
2021-10-12 16:56 CEST, w unruh
lspcidrake -v > HDMI_lspci.txt (4.17 KB, text/plain)
2021-10-12 16:58 CEST, w unruh
journalctl -b > HDMI_log.txt (203.02 KB, text/plain)
2021-10-12 17:00 CEST, w unruh

Description w unruh 2021-10-11 17:45:12 CEST
Problem: I have a Dell XPS13 9360 machine running Mageia 8. It has a USB-C port which I use for HDMI at home and for ethernet at work. I.e., at home I use a desktop monitor as the system display. At work, I have to use that port for an ethernet connection, as the wireless does not work properly there.

If I come in from home and boot up with the network dongle plugged in (even on the 2nd or third reboot), the system runs the "looking for new hardware" step and then the screen goes blank (I run the system with nosplash since I want to see the bootup process). If I press Fn-F (the dual monitor key) the screen shows the boot process again, but it goes into a 120 or so second wait, then boots a bit more, goes into another 30 sec or so wait, and displays a long list of kernel messages before finally bringing up the SDDM login screen. Having a 3 min delay on bootup is really annoying, and all of those kernel messages (which whip by too quickly for me to read) are also annoying. I.e., somehow the system remembers that a dual screen dongle had been plugged in and becomes confused, sulky,...

I never have trouble at home going from the previous ethernet dongle to the HDMI dongle, even if the HDMI is plugged in on bootup. But it definitely has trouble if the ethernet dongle is plugged in on bootup after the HDMI was used on the previous bootup (or even if this is the second bootup with the ethernet dongle).

Note that there are messages about akonadi on those bad bootups.

   


Version-Release number of selected component (if applicable):


How reproducible: Very


Steps to Reproduce:
1. Boot up with the HDMI dongle and use it to mirror the laptop screen to the monitor. Shut down.
2. Boot up with the network USB-C dongle attached and with nosplash so you can see the boot messages.
Comment 1 Marja Van Waes 2021-10-11 22:04:59 CEST
Please attach the following 4 files:

Ethernet_lspci.txt that is the result from running, when the ethernet dongle is used:
 
    lspcidrake -v > Ethernet_lspci.txt

HDMI_lspci.txt that is the result from running, when the HDMI dongle is used:
 
    lspcidrake -v > HDMI_lspci.txt


Ethernet_log.txt that is the result from running, as _root_, when the ethernet dongle is used and you had that delay:
 
    journalctl -b > Ethernet_log.txt

HDMI_log.txt that is the result from running, as _root_, when the HDMI dongle is used:
 
    journalctl -b > HDMI_log.txt

Keywords: (none) => NEEDINFO
CC: (none) => marja11

Comment 2 Dave Hodgins 2021-10-12 15:16:26 CEST
Try adding a line with ...
HARDDRAKE_ONBOOT=no
to /etc/sysconfig/system

That will stop harddrake from scanning for hardware changes on boot. I'm not
sure if it's really needed in this case, so worth trying without it.
Keep in mind that if you add new hardware, that hasn't been used on that
system before, it should be run manually with ...

# /usr/share/harddrake/service_harddrake
i18n_env: lang:en_CA country:CA locale|lang:en_CA.UTF-8 locale|country:en_CA.UTF-8 LANGUAGE:en_CA:en_GB:en
Too late to run INIT block at /usr/lib64/perl5/vendor_perl/Glib/Object/Introspection.pm line 257.
Ignore the following Glib::Object::Introspection & Gtk3 warnings
N() was called from /usr/lib/libDrakX/harddrake/data.pm:546 BEFORE gtk3 initialisation, replace it with a N_() AND a translate() later.
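
A minimal sketch of applying the setting above without duplicating the line if it is already present. The function name set_sysconfig_var is made up for illustration; only the file path and the HARDDRAKE_ONBOOT=no line come from this comment, and the sketch assumes GNU sed.

```shell
#!/bin/sh
# Sketch: set KEY=VALUE in a sysconfig-style file, replacing an existing
# KEY= line or appending a new one. set_sysconfig_var is a hypothetical
# helper name, not part of drakxtools.
set_sysconfig_var() {
    file=$1 key=$2 value=$3
    if grep -q "^${key}=" "$file" 2>/dev/null; then
        # Rewrite the existing assignment in place (GNU sed -i).
        sed -i "s|^${key}=.*|${key}=${value}|" "$file"
    else
        # No assignment yet: append one.
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Intended use, as root (not executed by this sketch):
#   set_sysconfig_var /etc/sysconfig/system HARDDRAKE_ONBOOT no
```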

CC: (none) => davidwhodgins

Comment 3 w unruh 2021-10-12 16:54:21 CEST
Created attachment 12946 [details]
lspcidrake ethernet

lspcidrake -v > Ethernet_lspci.txt
Comment 4 w unruh 2021-10-12 16:56:37 CEST
Created attachment 12947 [details]
journalctl -b >Ethernet_log.txt

journalctl -b >Ethernet_log.txt
Comment 5 w unruh 2021-10-12 16:58:00 CEST
Created attachment 12948 [details]
lspcidrake -v > HDMI_lspci.txt

lspcidrake -v > HDMI_lspci.txt
Comment 6 w unruh 2021-10-12 17:00:49 CEST
Created attachment 12949 [details]
journalctl -b > HDMI_log.txt

 journalctl -b > HDMI_log.txt
Comment 7 w unruh 2021-10-12 17:14:20 CEST
In the Ethernet_log file, notice the 87 sec pause at 9:45:02 and the "pam_systemd(su:session): Failed to create session: Connection timed out"
just after that.
Then the huge number of reports from service_harddrake just after that, and then
the later kernel complaints about martian sources (during the boot it all goes by so fast that all I see is the "kernel" complaints).

Then the next long pause is after 9:47:09, which seems to be due to the wireless trying to connect. It has an ethernet dongle, so it should surely be connecting to that, not the wireless.
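
Pauses like the ones described above can be located mechanically rather than by eye. The following is a rough sketch, assuming the default journalctl line format ("Mon DD HH:MM:SS host ..."); the function name find_log_gaps and the default 30 second threshold are made up for illustration.

```shell
#!/bin/sh
# Sketch: print each log line that follows a silence of at least MIN seconds.
# Lines whose third field is not HH:MM:SS (journal header, continuation
# lines of multi-line messages) are skipped.
find_log_gaps() {
    awk -v min="${1:-30}" '
    $3 !~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ { next }
    {
        split($3, t, ":")
        now = t[1] * 3600 + t[2] * 60 + t[3]
        if (seen) {
            gap = now - prev
            if (gap < 0) gap += 86400   # log crossed midnight
            if (gap >= min) printf "%d s gap before: %s\n", gap, $0
        }
        prev = now
        seen = 1
    }'
}

# Intended use:
#   find_log_gaps 30 < Ethernet_log.txt
```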
Comment 8 Marja Van Waes 2021-10-13 22:11:45 CEST
(In reply to w unruh from comment #7)
> In the Ethernet_log file, notice the 87 sec pause at 9:45:02 and the
> "pam_systemd(su:session): Failed to create session: Connection timed out"
> just after that. 

At least the dongle was identified correctly before that happened:
Oct 12 09:44:59 planet service_harddrake[778]: added ETHERNET: Realtek|USB 10/100/1000 LAN

Probably this runaway process caused the hang:

Oct 12 09:46:59 planet service_harddrake[778]: ERROR: killing runaway process (process=/usr/bin/run-parts, pid=785, args=--arg planet /etc/sysconfig/network-scripts/hostname.d, error=ALARM at /usr/lib/libDrakX/run_program.pm line 235.
                                               )
Assuming harddrake is the culprit and assigning to the Mageia tools maintainers

> Then the huge number of reports from service_harddrake just after that, and
> then
> the later kernel complaints about martian sources (during the boot it all
> goes by so fast that all I see is the "kernel" complaints).
> 
> Then the next long pause is after 9:47:09 which seems to be due to the
> wireless trying to connect. It has an ethernet dongle so it should surely be
> connecting to that, not the wireless.

Assignee: bugsquad => mageiatools
Source RPM: Boot process => drakxtools
Summary: boot process has 3 min pause if usbc dongle change from hdmi to network => boot process has 3 min pause if usbc dongle change from hdmi to network (service_harddrake[...]: ERROR: killing runaway process)
Keywords: NEEDINFO => (none)

Comment 9 Dave Hodgins 2021-10-13 22:33:30 CEST
Was disabling harddrake on boot as per comment 2 tried?
Comment 10 Dave Hodgins 2021-10-13 22:47:13 CEST
Also what scripts are in /etc/sysconfig/network-scripts/hostname.d ?
Comment 11 w unruh 2021-10-14 01:38:25 CEST
(In reply to Dave Hodgins from comment #10)
> Also what scripts are in /etc/sysconfig/network-scripts/hostname.d ?

avahi and s2u 
And it is avahi that seems to be timing out
(Its content is
su avahi -s /bin/bash -c  "avahi-set-host-name $1"

)
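
To check which script in that directory is the slow one, each can be run on its own under a time limit. This is only a sketch: the function name time_scripts and the 10 second default limit are made up, it relies on the coreutils timeout command, and it really executes the scripts, so it has the same side effects as run-parts would.

```shell
#!/bin/sh
# Sketch: run every executable file in DIR with one argument, under a time
# limit, and report its exit status and elapsed wall-clock seconds.
# timeout(1) exits with status 124 when the limit is hit.
time_scripts() {
    dir=$1 arg=$2 limit=${3:-10}
    for s in "$dir"/*; do
        [ -f "$s" ] && [ -x "$s" ] || continue
        start=$(date +%s)
        timeout "$limit" "$s" "$arg" >/dev/null 2>&1
        status=$?
        end=$(date +%s)
        printf '%s: exit=%s elapsed=%ss\n' "${s##*/}" "$status" "$((end - start))"
    done
}

# Intended use, as root (not executed by this sketch):
#   time_scripts /etc/sysconfig/network-scripts/hostname.d "$(hostname)" 10
```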
Comment 12 w unruh 2021-10-14 01:53:39 CEST
(In reply to Dave Hodgins from comment #9)
> Was disabling harddrake on boot as per comment 2 tried?

Not yet. I have been too busy the past couple of days. I'll see if I can tomorrow.
Comment 13 Dave Hodgins 2021-10-14 02:53:18 CEST
(In reply to w unruh from comment #11)
> (In reply to Dave Hodgins from comment #10)
> > Also what scripts are in /etc/sysconfig/network-scripts/hostname.d ?
> 
> avahi and s2u 
> And it is avahi that seems to be timing out
> (Its content is
> su avahi -s /bin/bash -c  "avahi-set-host-name $1"

Add a line with ...
NOZEROCONF=yes
or change the existing line from no to yes in
/etc/sysconfig/network-scripts/ifcfg-eth0
or whichever ifcfg file the network is using.
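
Since the interface name differs between machines, a quick way to see which ifcfg files exist and what each currently sets is sketched below. The function name show_nozeroconf is made up for illustration; the directory default is the one named in this thread.

```shell
#!/bin/sh
# Sketch: list every ifcfg-* file in a directory together with its
# NOZEROCONF value, or <unset> if the file has no such line.
show_nozeroconf() {
    dir=${1:-/etc/sysconfig/network-scripts}
    for f in "$dir"/ifcfg-*; do
        [ -f "$f" ] || continue
        # If the key appears more than once, report the last assignment.
        val=$(sed -n 's/^NOZEROCONF=//p' "$f" | tail -n 1)
        printf '%s: NOZEROCONF=%s\n' "${f##*/}" "${val:-<unset>}"
    done
}

# Intended use:
#   show_nozeroconf
```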
Comment 14 w unruh 2021-10-20 22:40:35 CEST
OK, I put HARDDRAKE_ONBOOT=no into /etc/sysconfig/system, and the boot went through with the ethernet cable plugged in. When I removed that line, even with
NOZEROCONF=yes
in /etc/sysconfig/network-scripts/ifcfg-enp57s0u1
I got the hangup. First a black screen (while running with nosplash boot), with output when I hit Fn F8, the display selector, and then a hang for over a minute when it got to the harddrake avahi stuff.
I.e., there is definitely a bug in harddrake (or avahi).
Comment 15 Dave Hodgins 2021-10-20 23:18:14 CEST
Next thing to try is "systemctl mask avahi-daemon.service" to confirm if it's
actually the problem.