5beta3 classic dvd 64 default lxde install

Shutdown gets stuck with a black screen for several minutes. Journal shows..

Jan 31 09:14:35 localhost logger[3000]: Shorewall Stopped
Jan 31 09:14:35 localhost shorewall[2925]: done.
Jan 31 09:14:36 localhost ifplugd(enp0s3)[874]: Executing '/etc/ifplugd/ifplugd.action enp0s3 down'.
Jan 31 09:14:38 localhost ifplugd(enp0s3)[874]: Program executed successfully.
Jan 31 09:14:38 localhost ifplugd(enp0s3)[874]: Exiting.
Jan 31 09:14:39 localhost network[3006]: Shutting down interface enp0s3: [ OK ]
Jan 31 09:14:40 localhost network[3006]: Shutting down loopback interface: [ OK ]
Jan 31 09:14:40 localhost network[3006]: net.ipv4.tcp_syncookies = 0
Jan 31 09:14:41 localhost resolvconf[3379]: Stopping resolvconf: [ OK ]
Jan 31 09:16:02 localhost systemd[1]: prefdm.service stop-sigterm timed out. Killing.
Jan 31 09:16:02 localhost systemd[1]: prefdm.service: main process exited, code=killed, status=9/KILL
Jan 31 09:16:02 localhost systemd[1]: Unit prefdm.service entered failed state.
Jan 31 09:16:02 localhost systemd[1]: Found ordering cycle on display-manager-failure.service/start
Jan 31 09:16:02 localhost systemd[1]: Unable to break cycle
Jan 31 09:16:02 localhost systemd[1]: Requested transaction contains an unfixable cyclic ordering dependency: Exec format err
Jan 31 09:16:02 localhost systemd[1]: Failed to enqueue OnFailure= job: Exec format error
Jan 31 09:16:02 localhost systemd[1]: prefdm.service failed.
Jan 31 09:16:04 localhost systemd[1]: Shutting down.

Reproducible:

Steps to Reproduce:
Hardware: i586 => x86_64
Whiteboard: (none) => 5beta3
Not sure who to assign this one to.
CC: (none) => ennael1, mageia
Summary: 5beta3: LXDE takes several minutes to shutdown/reboot => 5beta3: LXDE takes several minutes to shutdown/reboot (ordering cycle on display-manager-failure.service/start)
Evident in M5 RC, 25 Feb.
CC: (none) => westel
How do you reproduce this? I can't.
Clean install of an LXDE-only desktop. Set up online media and update the system. As this is a test for the RC, all apps are launched, and most of the MCC functions as well. Then reboot from the logout menu, and wait... Other desktops under test take a few seconds to restart the system, so it is noticeable.

Intel Core i5, Asus mobo P8B75-m lz, onboard graphics Intel810 or later
is this still valid for 5RC?
CC: (none) => marja11
Still valid in last RC isos, reported on the pad by Ben (in CC).
Summary: 5beta3: LXDE takes several minutes to shutdown/reboot (ordering cycle on display-manager-failure.service/start) => LXDE takes several minutes to shutdown/reboot (ordering cycle on display-manager-failure.service/start)
Whiteboard: 5beta3 => 5beta3 5RC
DATE.txt: Thu Apr 9 22:56:24 CEST 2015
Valid RC9 20th April. Raising priority.
Priority: Normal => release_blocker
Blocks: (none) => 14069
Created attachment 6323 [details]
journal1.txt

Attaching current journal, it's the same issue. It had actually hung and needed sysrq magic to reboot 30 mins later.

localhost systemd[1]: prefdm.service stop-sigterm timed out. Killing.
localhost systemd[1]: prefdm.service: main process exited, code=killed, status=9/KILL
localhost systemd[1]: Unit prefdm.service entered failed state.
localhost systemd[1]: Found ordering cycle on display-manager-failure.service/start
localhost systemd[1]: Unable to break cycle
localhost systemd[1]: Requested transaction contains an unfixable cyclic ordering dependency: Exec format error
localhost systemd[1]: Failed to enqueue OnFailure= job: Exec format error
localhost systemd[1]: prefdm.service failed.
localhost systemd[1]: Shutting down.
localhost systemd-journal[457]: Journal stopped
Bug 8209 appears to still be valid, sessions are c1 and c2. "User session not terminated when logging out with lxdm, systemd-logind" Could this be the cause?
The shutdown issue is solved when the display manager is switched from lxdm to lightdm. Bug 8209 still remains though. As lightdm is already used by other DEs, should we switch to using lightdm for LXDE as well? It currently has bug 15772, where the background is not displayed, but once that is fixed it is a nicer DM IMHO.
sorry lightdm bug 15722
Experienced this using lightdm too sadly, so it appears to be a result of bug 8209. I still believe we could move away from lxdm though.
Experienced the hang also in a MATE desktop installation from DVD 32. Had to use sysrq keys to reboot the machine. The journal shows the same errors..

Apr 23 16:38:21 localhost systemd[1]: prefdm.service: main process exited, code=exited, status=1/FAILURE
Apr 23 16:38:21 localhost systemd[1]: Unit prefdm.service entered failed state.
Apr 23 16:38:21 localhost systemd[1]: Found ordering cycle on display-manager-failure.service/start
Apr 23 16:38:21 localhost systemd[1]: Unable to break cycle
Apr 23 16:38:21 localhost systemd[1]: Requested transaction contains an unfixable cyclic ordering dependency: Exec format error
Apr 23 16:38:21 localhost systemd[1]: Failed to enqueue OnFailure= job: Exec format error
Apr 23 16:38:21 localhost systemd[1]: prefdm.service failed.
Apr 23 16:38:21 localhost systemd[1]: Requested transaction contradicts existing jobs: Transaction is destructive.
Apr 23 16:38:21 localhost systemd[1]: prefdm.service failed to schedule restart job: Transaction is destructive.
Apr 23 16:38:21 localhost systemd[1]: Unit prefdm.service entered failed state.
Apr 23 16:38:21 localhost systemd[1]: Found ordering cycle on display-manager-failure.service/start
Apr 23 16:38:21 localhost systemd[1]: Unable to break cycle
Apr 23 16:38:21 localhost systemd[1]: Requested transaction contains an unfixable cyclic ordering dependency: Exec format error
Apr 23 16:38:21 localhost systemd[1]: Failed to enqueue OnFailure= job: Exec format error
Apr 23 16:38:21 localhost systemd[1]: prefdm.service failed.
Apr 23 16:38:21 localhost numlock[26558]: Disabling numlocks on ttys: [ OK ]
Apr 23 16:38:21 localhost ifplugd(enp0s7)[980]: Executing '/etc/ifplugd/ifplugd.action enp0s7 down'.
Apr 23 16:38:21 localhost ifplugd(enp0s7)[980]: Program executed successfully.
Apr 23 16:38:21 localhost ifplugd(enp0s7)[980]: Exiting.
Apr 23 16:38:21 localhost network[26590]: Shutting down interface enp0s7: [ OK ]
Apr 23 16:38:22 localhost network[26590]: Shutting down loopback interface: [ OK ]
Apr 23 16:38:22 localhost network[26590]: net.ipv4.tcp_syncookies = 0
Apr 23 16:38:22 localhost resolvconf[26959]: Stopping resolvconf: [ OK ]
Apr 23 16:38:22 localhost preload[26984]: Stopping preload daemon: [ OK ]
Apr 23 16:38:26 localhost systemd[1]: Shutting down.
Apr 23 16:38:26 localhost systemd-journal[458]: Journal stopped
CC: (none) => tarakbumba
Summary: LXDE takes several minutes to shutdown/reboot (ordering cycle on display-manager-failure.service/start) => LXDE & MATE can take several minutes to shutdown/reboot or hang completely (ordering cycle on display-manager-failure.service/start)
I reproduced this with a fresh install using the classic 64-bit installer, and selecting just the LXDE desktop. Installing and selecting the LightDM display manager fixes it for me. As this seems to be a long-standing known bug in LXDM, not confined to Mageia, I would suggest either listing it in the errata or making LightDM the default for LXDE (or both).
CC: (none) => mageia
If you have a reference to an upstream bug report, please mention it here and also add the UPSTREAM keyword to the bug report. Thanks!
No upstream bug report I can find, but here is someone reporting identical symptoms: https://bbs.archlinux.org/viewtopic.php?id=186407 I'll admit I was getting this mixed up with bug 8209 when I said it was a long-standing bug.
Adding the LXDE maintainer in CC. What do you think about comment 15?
CC: (none) => nicolas.salguero
I cannot reproduce this for MATE on my Cauldron machine or on RC VMs, for either 32-bit or 64-bit.
So I tried installing task-mate, and the problem went away! Comparing the journal before and after, before I got:

Apr 29 19:02:12 loki resolvconf[3334]: Stopping resolvconf: [ OK ]
Apr 29 19:02:43 loki systemd-timesyncd[731]: Using NTP server [2001:4860:4802:32::f]:123 (time1.google.com).
Apr 29 19:03:40 loki systemd[1]: prefdm.service stop-sigterm timed out. Killing.
Apr 29 19:03:40 loki systemd[1]: prefdm.service: main process exited, code=killed, status=9/KILL
Apr 29 19:03:40 loki systemd[1]: Unit prefdm.service entered failed state.
Apr 29 19:03:40 loki systemd[1]: Found ordering cycle on display-manager-failure.service/start
Apr 29 19:03:40 loki systemd[1]: Unable to break cycle
Apr 29 19:03:40 loki systemd[1]: Requested transaction contains an unfixable cyclic ordering dependency: Exec format error
Apr 29 19:03:40 loki systemd[1]: Failed to enqueue OnFailure= job: Exec format error
Apr 29 19:03:40 loki systemd[1]: prefdm.service failed.
Apr 29 19:03:42 loki systemd[1]: Shutting down.
Apr 29 19:03:42 loki kernel: watchdog watchdog0: watchdog did not stop!
Apr 29 19:03:42 loki systemd-journal[496]: Journal stopped

and after I got:

Apr 29 19:01:03 loki resolvconf[12832]: Stopping resolvconf: [ OK ]
Apr 29 19:01:03 loki preload[12856]: Stopping preload daemon: [ OK ]
Apr 29 19:01:04 loki systemd[1]: Requested transaction contradicts existing jobs: Transaction is destructive.
Apr 29 19:01:08 loki systemd[1]: Shutting down.
Apr 29 19:01:08 loki kernel: watchdog watchdog0: watchdog did not stop!
Apr 29 19:01:08 loki systemd-journal[497]: Journal stopped

So I tried disabling the preload service, and the problem reappeared. Re-enable the preload service, and the problem goes away again.

Now I don't know whether the preload service is really fixing the problem or just introducing a new one that causes systemd to terminate more quickly. The message:

Apr 29 19:01:04 loki systemd[1]: Requested transaction contradicts existing jobs: Transaction is destructive.

is a bit suspicious.
This has been reported for Mageia 4 at forums: https://forums.mageia.org/en/viewtopic.php?f=7&t=7589
Hi, Regarding comment 15, when I tried to replace LXDM with LightDM on my Cauldron VM, I still had some remaining processes after logging out from LXDE. So, I tried to correct bug 8209 with lxde-common-0.99.0-8.mga5. Concerning this bug, sadly I have not been able to reproduce it on my Cauldron VM (even with a fresh install from Mageia-5-RC-x86_64-DVD.iso). Best regards, Nico.
Valid 5 final round 1 (Installed from classic DVD 32) Took several minutes to shutdown to reboot, journal showed same symptoms.
FWIW, processes remaining when you log out is not unexpected. Until systemd manages the whole user session (which we'll hopefully do for MGA6) this is to be expected. Things like gpg-agent and ssh-agent put themselves into the background and do not tie themselves to the X session manager in order to terminate when it does (pulseaudio does this correctly, but with systemd user sessions it will be unneeded, as those provide a better infrastructure).

So, references to processes left behind as per bug 8209 are likely unrelated, except when that process is lxdm itself. This obviously needs to exit for the prefdm service to exit. If it does not exit cleanly, then the prefdm service will assume some kind of failure and then try to run its OnFailure= job.

We need to see more debug information here about the ordering cycle, although I suspect this is simply due to a conflict that (correctly) prevents the display-manager-failure stuff kicking in during a reboot or shutdown, i.e. it conflicts with shutdown.target. For some reason the conflict is not honoured here. But to confirm, ideally we need the journal logs with systemd.log_level=debug on the kernel command line (or just debug, but that will get a lot of stuff in it from the kernel too).

The "Exec format error" aka ENOEXEC is generated by systemd when it cannot break the cycle, so that's pretty much expected if there is an ordering cycle.

Of course all of this would go away if prefdm could just exit cleanly, which basically means fixing lxdm or lightdm to not exit badly. Not sure if they just don't react fast enough. They seem to trap and handle SIGTERM but then don't exit fast enough after receiving it.

Ultimately systemd should probably behave better when handling the ordering cycle stuff, but then so should lxdm. I could fix the ordering cycle stuff by simply removing the display manager failure stuff. I'm not convinced it is much use anyway (it was added several releases back to counter a blank screen on tty1 when X failed).
The several-minute delay itself is something that needs to be fixed in LXDM or LightDM, so that it exits cleanly and promptly after receiving SIGTERM.
Created attachment 6594 [details]
debugjournal.txt

Taken from LXDE 64, booted from grub2 with the systemd.log_level=debug kernel option, after a reboot that was slow at shutdown, though not as slow as on previous occasions.

# journalctl -a -b -1 > debugjournal.txt

Checked that it shows the ordering cycle, but only with grep.
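For anyone repeating this check on a saved journal, a rough Python equivalent of the grep is sketched below. The message fragments are copied from the journal excerpts earlier in this report; the function names and the idea of scanning a file such as debugjournal.txt are only illustrative assumptions.

```python
import re

# Message fragments taken from the journal excerpts in this report.
SYMPTOMS = [
    r"stop-sigterm timed out",
    r"Found ordering cycle on",
    r"Unable to break cycle",
    r"Failed to enqueue OnFailure= job",
]
_MATCHER = re.compile("|".join(SYMPTOMS))


def find_symptom_lines(lines):
    """Return the journal lines that contain any of the known symptoms."""
    return [line.rstrip("\n") for line in lines if _MATCHER.search(line)]


def scan_file(path):
    """Convenience wrapper: scan a saved journal file (hypothetical helper)."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_symptom_lines(handle)
```

Calling `scan_file("debugjournal.txt")` on the attachment would list the timeout and ordering-cycle messages in the order they were logged.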
Comment on attachment 6594 [details]
debugjournal.txt

OK, so the ordering cycle is as I thought. It's pretty ugly appearing in the logs, but it is behaving correctly in that it does not enqueue the job due to the conflicts. It is printed only a few seconds before the logs end, so it's not to blame for delaying the shutdown.

It seems shutdown was initiated around "15:15:29" in the logs. Activity continues until 15:15:32ish, and then there is a gap; the entries at "15:16:04" and "15:16:44" just show watchdog events (i.e. daemons telling systemd they are alive and well). It's only at "15:16:59" that the timeout for prefdm stopping is hit and lxdm-binary is forcibly killed.

From a systemd perspective, everything is behaving correctly here. Ultimately lxdm-binary needs to behave and listen when systemd sends it signals telling it to terminate.
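The same kind of timeline reading can be automated by looking for the longest silence between consecutive journal lines. The sketch below is a hedged illustration: it assumes the default syslog-style "Mon DD HH:MM:SS" prefix, an English/C locale for month names, and a year of 2015 (the journal lines themselves carry no year). The function names are hypothetical.

```python
from datetime import datetime


def parse_stamp(line, year=2015):
    """Parse the 'Mon DD HH:MM:SS' prefix of a journal line.

    The year is an assumption; the short journal format omits it.
    """
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def largest_gap(lines):
    """Return (seconds, line_before, line_after) for the longest silence."""
    best = (0.0, None, None)
    for prev, cur in zip(lines, lines[1:]):
        gap = (parse_stamp(cur) - parse_stamp(prev)).total_seconds()
        if gap > best[0]:
            best = (gap, prev, cur)
    return best
```

Applied to the excerpt in the opening report, the longest silence is the 81 seconds between "Stopping resolvconf" at 09:14:41 and the prefdm.service stop-sigterm timeout at 09:16:02, which points at the display manager rather than at the ordering-cycle messages that follow.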
After several tests, I can reproduce the bug only when my Virtualbox VM uses EFI. In legacy mode, I have no problem (tested with grub and grub2 on fresh installs). Moreover, I only have the problem when mageiawelcome is running. If I close mageiawelcome, the reboot is fast and without error in "journalctl -a -b -1".
The debugjournal.txt was taken from a classic DVD 64 LXDE install to real hardware (athlon64x2, nvidia340, ivtv), ext4 with a separate /var partition and grub2. MageiaWelcome doesn't seem to make any difference; the delay is present whether it is running or not. It seems to be a 90-second delay, so not earth-shattering, but it will be an annoyance.
Decreasing priority and marking as needing to be added to the errata, since we're about to release Mageia 5 really soon. If someone comes up with a fix, we'll gladly take it in; otherwise it will have to be issued as an update later on.
Priority: release_blocker => High
Whiteboard: 5beta3 5RC => 5beta3 5RC FOR_ERRATA
FWIW, as this seems to be lxdm-binary that is misbehaving, we're somewhat suffering from the lack of individual DM service units (gdm.service, kdm.service etc.) with our monolithic prefdm.service. I will ultimately kill this off after cauldron reopens. With individual units we could set a reduced timeout (or even just let it go straight to killing) for lxdm without affecting other, well-behaved DMs. With our monolithic prefdm.service it's a little trickier.

Technically, we could reduce the timeout to, say, 10s or so, by putting a file:

/usr/lib/systemd/system/prefdm.service.d/lxdm-timeout.conf

which contained:

[Service]
TimeoutStopSec=10

This file could then be packaged with lxdm. It would apply to any DM that prefdm starts, but at least it would only be *installed* with lxdm.

Note that this is *not* the correct solution, just something to mitigate the problem. Someone really needs to properly trace the problem and work out why the signals are not being properly handled in lxdm-binary. This will take some time and someone with appropriate programming skills.
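For anyone who wants to try this mitigation locally before a package ships it, here is a small Python sketch that generates the drop-in described above. The file name and contents come from this comment; the helper names and the `unit_dir` parameter are hypothetical, and writing under /usr/lib/systemd/system needs root.

```python
from pathlib import Path

# File name taken from the comment above.
DROPIN_NAME = "lxdm-timeout.conf"


def render_dropin(timeout_s=10):
    """Return the drop-in text that shortens the stop timeout."""
    return f"[Service]\nTimeoutStopSec={timeout_s}\n"


def write_dropin(unit_dir, timeout_s=10):
    """Create <unit_dir>/prefdm.service.d/lxdm-timeout.conf.

    unit_dir is a parameter so the helper can be tried in a scratch
    directory first; on a real system it would be
    /usr/lib/systemd/system, as in the comment above.
    """
    dropin_dir = Path(unit_dir) / "prefdm.service.d"
    dropin_dir.mkdir(parents=True, exist_ok=True)
    target = dropin_dir / DROPIN_NAME
    target.write_text(render_dropin(timeout_s))
    return target
```

After the file is in place, `systemctl daemon-reload` makes systemd re-read its unit files. As noted above, this only shortens the wait before lxdm-binary is killed and does not fix the signal handling itself.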
I have updated the errata https://wiki.mageia.org/en/Mageia_5_Errata#LXDE concerning this problem.
Whiteboard: 5beta3 5RC FOR_ERRATA => 5beta3 5RC IN_ERRATA
Last Friday, I updated one of my machines from Mga4 to Mga5 and, for the moment, I cannot reproduce the problem (x86_64, Core i5 2400, BIOS in legacy mode, no installed pkg task-mate, lightdm or preload).
Has been confirmed again via forums report, so still valid: https://forums.mageia.org/en/viewtopic.php?f=7&t=10793 Updated the errata entry with the workaround provided in comment 11.
CC: (none) => doktor5000
I have also updated French errata entry with the same workaround.
So, can someone give us a summary of the status of this bug and what packages are involved? I'd like to either assign it to a packager or to pkg-bugs (meaning, to packagers collectively). It's still assigned to Bug Squad at the moment.
To me it's not clear which packages are involved in this bug. It seems to me that it is not related to the desktop environments, but rather to the display manager (lxdm?) or systemd. As a side note, I cannot reproduce this on my cauldron (BIOS) workhorse. Maybe some race condition is involved.
Hi,

I also think the problem comes from LXDM. I checked the git commits in the upstream project and found several commits that seem related to signal handling, and others that could help prevent race conditions (as well as many other bug fixes: memory leaks...).

I have a laptop (Core 2 Duo, 2GiB of RAM, nvidia340 drivers) where the problem sometimes occurs (not very often). So I locally rebuilt the package (which is now in core/updates_testing: lxdm-0.5.3-1.mga5) and ran some tests. I was not able to trigger the bug again with the new package but, as I said, the problem does not occur very often on my test machine, so maybe I was just lucky.

So, is there anyone who sees the bug regularly and can also test the new package to see if the problem is gone, please?

Best regards,

Nico.
(In reply to Nicolas Salguero from comment #37)
> So, is there anyone which have the bug regularly and can also test the new
> package to see if the problem is over, please?
>
> Best regards,
>
> Nico.

Hi Nico, hope this helps:

Test system: Intel(R) Core(TM) i5-3470 CPU @ 3.20GHz (4 core)
RAM: 8 Gig (2x4 Gig)
Card: Intel 810 and later
Multiple HDD

Reboot time is measured as: "click the reboot button, wait for memory check beep"

Main O/S: Mageia 5 x86_64 KDE, fully updated; reboot time: 12 sec.
Test O/S: Mageia 5 x86_64 LXDE
fresh install; reboot time: 95 sec.
fully updated; reboot time: 95 sec.
installed lxdm-0.5.3-1.mga5.x86_64.rpm; reboot time: 9 sec.

Looks like x86_64 is good to go. Do you want an i586 test done?
Whiteboard: 5beta3 5RC IN_ERRATA => 5beta3 5RC IN_ERRATA | x86_64 ok|
(In reply to ben mcmonagle from comment #38)
> do you want a i586 test done?

Hi,

Yes, if you can also test for i586, it would be nice. After that, I will do a rebuild of the package (lxdm-0.5.3-1.1.mga5, I think) because, currently, the changelog is not very good.

Best regards,

Nico.
(In reply to Nicolas Salguero from comment #39)
>
> Yes, if you can also test for i586, it would be nice.
>

Reinstalled Mga5 (LXDE only) several times. Only the first reboot after install takes about 95 sec; all other reboots take about 10 sec.

Installed lxdm-0.5.3-1.mga5.i586.rpm: reboot time drops to about 5 sec.

Sorry, I could not get a consistently slow reboot for i586.

Best regards,
Ben
Hi,

Even if the test is more difficult with i586, I think the most important thing is that the problem does not reappear with the updated package and, in your tests, the bug seems to be gone with lxdm 0.5.3. So I have rebuilt the package.

Best regards,

Nico.
I will look forward to testing it again for QA (if it is assigned) best regards Ben
Suggested advisory:
========================

The updated package corrects the problem where LXDM does not properly handle the SIGTERM signal sent by systemd, which causes shutdown or reboot to take more time than normally needed.
========================

Updated packages in core/updates_testing:
========================
i586:
lxdm-0.5.3-1.1.mga5.i586.rpm

x86_64:
lxdm-0.5.3-1.1.mga5.x86_64.rpm

Source RPMs:
lxdm-0.5.3-1.1.mga5.src.rpm
Status: NEW => ASSIGNED
Hardware: x86_64 => All
Version: Cauldron => 5
Assignee: bugsquad => qa-bugs
Source RPM: (none) => lxdm
Advisory loaded to svn. Removing the x86_64_ok whiteboard entry until the 0.5.3-1.1 has been tested.
CC: (none) => davidwhodgins
Whiteboard: 5beta3 5RC IN_ERRATA | x86_64 ok| => 5beta3 5RC IN_ERRATA | advisory
Testing M5 x64 real hardware using LXDM display manager & LXDE Desktop, on a system with all other DMs & Desktops installed as well. I use both LXDM & LXDE routinely from time to time, and can only say from past experience that *shutdown* at least was variable; mostly in a reasonable time, rarely very long. Whether this was down to LXDM or another DM I cannot say. Updated to: lxdm-0.5.3-1.1.mga5 Using Ben's 'click-to-beep' measure, I am getting very variable results for re-booting. The first re-boot after the update - from the login screen Quit menu - was so long it seemed hung (on the way 'out'), & I powered off. Subsequently I have tried both from the login screen Quit/Reboot menu, and the Logout/Reboot menu from logged-in sessions. Things seem to have settled on mostly 11s +- 5 (my box is slow), with one very much longer. I cannot say that this update has made any noticeable difference to me. I prefer that others verify it. If things remain satisfactory here, I will OK it.
CC: (none) => lewyssmith
Sorry for the delay in getting this test done :(.

x86_64, real hardware.
Test is: time from clicking "reboot" to the reboot memory test "beep".

Fresh install, LXDE only DE:
1st test after 1st login: +90 sec
2 more tests: both +90 sec

Update system to latest:
1st test: +15 sec
2 more tests: both +90 sec

Apply lxdm-0.5.3-1.1.mga5.x86_64.rpm:
1st after update applied: +90 sec
2 more tests after update applied: 6 sec
Whiteboard: 5beta3 5RC IN_ERRATA | advisory => 5beta3 5RC IN_ERRATA | advisory |
Whiteboard: 5beta3 5RC IN_ERRATA | advisory | => 5beta3 5RC IN_ERRATA | advisory |x86_64 ok
LXDE i586: unable to replicate the issue using hardware from comment 46.

Applied the same test sequence from comment 46:
after fresh install = consistent 10 sec
after system update = consistent 10 sec
after applying lxdm-0.5.3-1.1.mga5.i586.rpm:
1st test = 10 sec
subsequent tests = 6 sec

as adverse issues noted adding 32-ok
Whiteboard: 5beta3 5RC IN_ERRATA | advisory |x86_64 ok => 5beta3 5RC IN_ERRATA | advisory |x86_64 ok| 32 ok
(In reply to ben mcmonagle from comment #47)
>
> as adverse issues noted
>

should be: as no adverse issues noted
Thanks all. Validating.
Keywords: (none) => validated_update
Whiteboard: 5beta3 5RC IN_ERRATA | advisory |x86_64 ok| 32 ok => IN_ERRATA advisory mga5-32-ok mga5-64-ok
CC: (none) => sysadmin-bugs
An update for this issue has been pushed to the Mageia Updates repository. http://advisories.mageia.org/MGAA-2016-0042.html
Status: ASSIGNED => RESOLVED
Resolution: (none) => FIXED