Laptop with MGA4 i586 : I have a problem with this update : KDE alerts after a few minutes about powermanagement failure. Watching logs, upower service do not start anymore! ------------------ systemctl status upower.service upower.service - Daemon for power management Loaded: loaded (/usr/lib/systemd/system/upower.service; enabled) Active: failed (Result: signal) since Qua 2014-10-08 10:52:06 CEST; 35min ago Docs: man:upowerd(8) Process: 756 ExecStart=/usr/libexec/upowerd (code=killed, signal=TRAP) Main PID: 756 (code=killed, signal=TRAP) CGroup: /system.slice/upower.service Out 08 10:52:06 celeron.homelinuxserver.org upowerd[756]: (upowerd:756): UPower-ERROR **: failed to get pokit authority: Error init...ed out Out 08 10:52:06 celeron.homelinuxserver.org systemd[1]: upower.service: main process exited, code=killed, status=5/TRAP Out 08 10:52:06 celeron.homelinuxserver.org systemd[1]: Failed to start Daemon for power management. Out 08 10:52:06 celeron.homelinuxserver.org systemd[1]: Unit upower.service entered failed state. Hint: Some lines were ellipsized, use -l to show in full. Reproducible: Steps to Reproduce:
Depends on: (none) => 14102
This bug is confirmed by another user on [discuss-fr] ML.
Can you confirm that downgrading dbus makes it work again?
CC: (none) => tmb
Well, I don't get what has changed, but rebooting the system near 4 times made the trick... sorry for the noise.
Resolution: (none) => FIXEDStatus: NEW => RESOLVED
Blocks: (none) => 14251
Another system, x86_64 this time. Still the same problem, solved with 2 crazy reboots : nothing is responding, even sysrq help key. No console login possible. systemd-logind[936]: Failed to enable subscription: Did not receive a reply. Possible causes include: the remote application did not send a reply, the messag dbus[937]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out dbus[937]: [system] Failed to activate service 'org.freedesktop.PolicyKit1': timed out dbus[937]: [system] Successfully activated service 'org.freedesktop.login1' upowerd[941]: (upowerd:941): UPower-ERROR **: failed to get pokit authority: Error initializing authority: Error calling StartServiceByName for org.freedeskt kernel: traps: upowerd[941] trap int3 ip:7fc904b40139 sp:7fff37a1b5b0 error:0 systemd-logind[936]: New seat seat0. systemd[1]: upower.service: main process exited, code=killed, status=5/TRAP systemd[1]: Failed to start Daemon for power management. systemd[1]: Unit upower.service entered failed state.
Can you downgrade _only_ the dbus packages and confirm it starts working again.... maybe upower was/is abusing dbus, something that now isn't allowed anymore with the security fixes that went in.
Sorry Thomas, but as the problem disappears after a few reboots, I cannot do this test. Anyway, I already closed this bug, I only added some info to indicate it is not arch dependent.
Hardware: i586 => All
Just been hit by this bug myself on both my wife's machine (Intel, 64-bit) and my spare machine (AMD, 64-bit). My wife carried out the update on her machine, and now it takes forever to log her in. However, if I log in as myself, the machine is its usual, slick self! In other words, it appears to be the user who carried out the upgrade who gets hit with the slow login! And, no, it doesn't improve, even after many reboots. I'm the only login on my spare machine (my main runs Slackware), but I'll go and create another login to confirm this. I may be some time.......!!! Thought I'd mention this before downgrading dbus in case it helps pinpoint the problem. -- Pete
CC: (none) => christy
Looks like that was a red herring. After a couple of reboots, my login also became very slow - slow enough to make the machine almost unusable. Downgrading dbus seems to have fixed them both, though. -- Pete
I've re-opened the bug since it seems that there may be a problem with the latest dbus.
Resolution: FIXED => (none)Status: RESOLVED => REOPENED
@Peter how did you downgrade your dbus?
CC: (none) => olivier.delaune
Something like urpmi --downgrade dbus lib64dbus1_3 should work... (remove "64" if you are on i586)
Hello Thomas, it does not work. I get urpmi --downgrade dbus lib64dbus1_3 Afin de poursuivre la mise à jour, les paquetages suivants doivent être désinstallés : dbus-1.6.18-1.4.mga4.x86_64 (pour installer le paquetage dbus-1.6.18-1.4.mga4.x86_64) lib64dbus1_3-1.6.18-1.4.mga4.x86_64 (pour installer le paquetage lib64dbus1_3-1.6.18-1.4.mga4.x86_64) (o/N) o $MIRRORLIST: media/core/updates/dbus-1.6.18-1.4.mga4.x86_64.rpm $MIRRORLIST: media/core/updates/lib64dbus1_3-1.6.18-1.4.mga4.x86_64.rpm installation de lib64dbus1_3-1.6.18-1.4.mga4.x86_64.rpm dbus-1.6.18-1.4.mga4.x86_64.rpm depuis /var/cache/urpmi/rpms Préparation... ############################################################################################################################ L'installation a échoué : paquet dbus-1.6.18-1.4.mga4.x86_64 déjà installé paquet lib64dbus1_3-1.6.18-1.4.mga4.x86_64 déjà installé Anyway, I also type systemctl -a | grep fail and this is what I get avahi-daemon.service loaded failed failed Avahi mDNS/DNS-SD Stack cpupower.service loaded failed failed Configure CPU power related settings display-manager-failure.service loaded inactive dead Display Manager Failure Message speech-dispatcherd.service loaded failed failed Speech-Dispatcher an high-level device independent layer for speech synthesis. udisksd.service loaded failed failed UDisks Daemon upower.service loaded failed failed Daemon for power management
oh, sorry... you need to specify version... so urpmi --downgrade dbus-1.6.18-1.3.mga4 lib64dbus1_3-1.6.18-1.3.mga4
Olivier, I used: urpmi --downgrade dbus-1.6.18-1.3.mga4 lib64dbus1_3-1.6.18-1.3.mga4 dbus-x11-1.6.18-1.3.mga4 as suggested on the forum by Jim Kerr (Thanks, Jim!) So far everything is now behaving normally on both machines. When in the "fault" mode, the disk activity light only flickers sporadically and briefly during the login process. When the machine is logging in normally, the disk light is "on" pretty constantly. I get the impression that something is tying up the machines resources pretty thoroughly and slowing disk access to a crawl. I don't know if that's any help - just my 2p worth! -- Pete
See Also: (none) => https://bugs.mageia.org/show_bug.cgi?id=14251
This seems to affect quite a lot of users, and there's also major impact on KDE users, as noted in Bug 14251 CC'ing Luc on both bugs.
CC: (none) => doktor5000
As I dont have access to any system that behaves this way... Can atleast some of you do the following: when you have the older working dbus installed: reboot to get clean logs, login to KDE/Gnome/... and then do as root: journalctl -b > journalctl-before.log systemd-analyze plot >boot-before.svg then install the problematic update again, reboot and login again and: journalctl -b > journalctl-after.log systemd-analyze plot >boot-after.svg And attach the files to this bug.
Created attachment 5492 [details] Log files as requested
Created attachment 5493 [details] Log files as requested
Created attachment 5494 [details] Log files as requested
Created attachment 5495 [details] Log files as requested Note: when I issued the command to generate this file, the system only produced an error message and a zero length file. I have pasted the error message into this file.
A couple of other bits of information: When logging in with the faulty dbus, the bootsplash after the login window gets to 3 bubbles over the cauldron fairly quickly, but then slows to a crawl. SU ing in a konsole window also took forever to log in, and trying to save the zero length file after I'd pasted the error message in also took a very long time before the filename window opened. (I'm using KDE). Not sure if this is relevant, but both machines showing the fault have NVidia graphics cards and the proprietary driver. -- Pete
Created attachment 5496 [details] journalctl-after.log
CC: (none) => lewmail101-mailbox
Created attachment 5497 [details] boot-before.svg
Created attachment 5498 [details] journalctl-before.log
Created attachment 5499 [details] journalctl-before-downgrade.log
Created attachment 5500 [details] downgrade.log
Created attachment 5501 [details] journalctl-after1streboot.log
Created attachment 5502 [details] boot-after1streboot.svg
Hello Thomas I have done what you have requested above and kindly take some time reading all the steps I did. First, my Laptop specs: Model: MSI CX41 Processor: Intel i5 3210m Graphics (Dual): AMD Radeon 7670m (discrete) and HD4000 (shared memory graphics) Memory: 8 GiB DDR3 HDD: 750 GiB -I did a fresh install of Mageia 4 (64bit) using a DVD. -I selected KDE only as my DE. grub setup is quite different since I am dual booting from CentOS 6.5 x64 but nothing worth mentioning. -English is the language and keyboard layout I used -Did a custom partition of /boot / SWAP & /home partitions using ext4 -After setup is complete, I restarted the laptop then went to MCC > Software Management > Configure your sources. Removed CDROM as source and added http and ftp sources + the default mirror lists. -Ran Updates via MCC > Software Management > Update your system -I rebooted the system after completing the updates and this time I installed the following via MCC > Software Management > Install & Remove software) conky, traceroute, hddtemp, lm-sensors, iotop and htop -I did 2 restarts after installing the applications above and the issue of slow loading of desktop can be observed. -I ran the following command as a root: journalctl -b > journalctl-before-downgrade.log (please see attached file journalctl-before-downgrade.log) -I ran the follwing command as a root: systemd-analyze plot >boot-before-downgrade.svg ...and got the output: "Failed to issue method call: Activation of org.freedesktop.systemd1 timed out" -I ran the following command as root: urpmi --downgrade dbus-1.6.18-1.3.mga4 lib64dbus1_3-1.6.18-1.3.mga4 dbus-x11-1.6.18-1.3.mga4 (please see attached file downgrade.log for output) -I restarted my laptop and run the following commands: journalctl -b > journalctl-before.log (please see attached file journalctl-before.log) systemd-analyze plot >boot-before.svg (please see attached file boot-before.svg) -I rebooted 2 more times, without any issues, before running an update via MCC > Software Management > Update your system -After the update, I rebooted and ran the following commands as root: (please note at this point, system seems to be behaving normally) journalctl -b > journalctl-after1streboot.log (Please see atttached file journalctl-after1streboot.log) systemd-analyze plot >boot-after1streboot.svg (Please see atttached file boot-after1streboot.svg) -I rebooted again and this time the slow loading of desktop occurs again. journalctl -b > journalctl-after.log (Please see atttached file journalctl-after.log) systemd-analyze plot >boot-after.svg ...and got the output: "Failed to issue method call: Activation of org.freedesktop.systemd1 timed out" File Upload Summary (Total of 7 files) (Original Request) 3 files journalctl-after.log boot-after.svg (I did not uploaded this file since the file created was 0 bytes) boot-before.svg journalctl-before.log (Extra files that I generated for comparison purposes) 4 files journalctl-before-downgrade.log downgrade.log (output of running urpmi --downgrade dbus-1.6.18-1.3.mga4 lib64dbus1_3-1.6.18-1.3.mga4 dbus-x11-1.6.18-1.3.mga4) journalctl-after1streboot.log boot-after1streboot.svg I hope this helps.
(In reply to Thomas Backlund from comment #5) > systemd-analyze plot >boot-after.svg I confirm, like in comment #29 that this command outputs nothing to STDOUT. AND STDERR says : Failed to issue method call: Activation of org.freedesktop.systemd1 timed out
Experienced this myself on my mums laptop. The only difference with this machine was it updated dbus and kernel at the same time. I suspect this might have something to do with it. I had to boot runlevel 3 to get to a login. After downgrading dbus and rebooting it was back to normal. Updating dbus again caused no regression. I have the journal if it's needed.
CC: (none) => eeeemail
(In reply to claire robinson from comment #31) > I had to boot runlevel 3 to get to a login. After downgrading dbus and > rebooting it was back to normal. Updating dbus again caused no regression. I don't believe in a kernel and dbus sync update bug : the first laptop whom had this bug, then it seemed fixed after a few reboots, had the bug again today. Looks like a race condition, hence the fact the bug is not always there. Like the activation of org.freedesktop.systemd1 not done when upower wants it, and then upower blocking the activation. Is there a way to give a delay to upower before it tries to start?
the file 10003-config-change-default-auth_timeout-to-5-seconds.patch change the auth_timeout from 30 seconds to 5. Couldn't it be the problem?
hello my mini computer (processor AMD E1) has 2 disks : ssd 64 Go and an old 600 Go. Mageia4 on ssd as no problem . Mageia4 on old disk "classic" canot boot . Timing ? José is right !
CC: (none) => christian-maryse.fischer
Other : My old computer ( Intel, core2duo, 2 disks classic , update done ) can boot in text mode, but cannot in graphic mode ( with kdm ) In text mode, after login , startx work . All is good , but Timing ?
(In reply to José Jorge from comment #33) > the file 10003-config-change-default-auth_timeout-to-5-seconds.patch change > the auth_timeout from 30 seconds to 5. Couldn't it be the problem? Nope, it shouldn't be as that gets triggered only in unconfigured cases, wich is not the case here as /etc/dbus-1/session.conf overrides it.
So in order to bisect this issue can someone affected using x86_64 start testing theese: http://tmb2.mine.nu/Mageia/4/bugs/14249/x86_64/ Start with dbus-1.6.18-1.4.1, check if everything still works, then go to -1.4.2, -1.4.3 .... and all the way up to -1.4.9 or until you can trigger the issue. I will release i586 testbuilds later when my other buildhost is up again
I have submitted to core/updates_testing a dbus that fixed the boot problem on my 2 systems : to workaround a CVE, the timeout was changed from 30s to 5s, which triggered timeout bugs on some systems. I have found that a 10s timeout both is better than 30s, and does not trigger timeouts anymore. Please, test dbus-1.6.18-1.5.mga4 and report if it fixes your systems, so that we can push it into regular updates.
(In reply to José Jorge from comment #38) > I have submitted to core/updates_testing a dbus that fixed the boot problem > on my 2 systems : to workaround a CVE, the timeout was changed from 30s to > 5s, which triggered timeout bugs on some systems. > Um, why did you add new patches that was pretty much copies of the old ones ? http://svnweb.mageia.org/packages?view=revision&revision=792328 It breaks review process into some unneeded extra work... And the proper way would have been to notify the maintainer (me) that you actually had tested changing the timeouts... > I have found that a 10s timeout both is better than 30s, and does not > trigger timeouts anymore. Please, test dbus-1.6.18-1.5.mga4 and report if it > fixes your systems, so that we can push it into regular updates. For this to be done properly it needs to be done for mga3 and cauldron too when we know what timeout works for all...
(In reply to Thomas Backlund from comment #39) > And the proper way would have been to notify the maintainer (me) that you > actually had tested changing the timeouts... I'm sorry, you are right. I thought no one had time to watch this regression which is affecting a lot of users, so I tried to advance. What I don't understand is how to update the patches without renaming them : mgarepo sync does nothing without that...
(In reply to Thomas Backlund from comment #37) > So in order to bisect this issue can someone affected using x86_64 start > testing theese: http://tmb2.mine.nu/Mageia/4/bugs/14249/x86_64/ > > Start with dbus-1.6.18-1.4.1, check if everything still works, > then go to -1.4.2, -1.4.3 .... and all the way up to -1.4.9 or until > you can trigger the issue. > > > I will release i586 testbuilds later when my other buildhost is up again I tested the steps 1 and 2 without any problem (Mageia runs normally). The problems (slowness) occur at the step 3.
(In reply to Olivier Delaune from comment #41) > (In reply to Thomas Backlund from comment #37) > > So in order to bisect this issue can someone affected using x86_64 start > > testing theese: http://tmb2.mine.nu/Mageia/4/bugs/14249/x86_64/ > > > > Start with dbus-1.6.18-1.4.1, check if everything still works, > > then go to -1.4.2, -1.4.3 .... and all the way up to -1.4.9 or until > > you can trigger the issue. > > > > > > I will release i586 testbuilds later when my other buildhost is up again > > I tested the steps 1 and 2 without any problem (Mageia runs normally). The > problems (slowness) occur at the step 3. OK, so dbus-1.6.18-1.4.3 is the first failing one, can you test the new: 1.6.18-1.4.10 to 1.6.18-1.4.13 and let me know if any of them work ?
I also have 2 PCs that was affected by this problem. One is based on a Phenom x4 945 - hd 1To, Mageia 4 x64. I did the tests proposed by Thomas Backlund : - 1.6.18-1.4.1 : ok (2 reboot) - 1.6.18-1.4.2 : ok (2 reboot) - 1.6.18-1.4.3 : not ok : boot very slow - 1.6.18-1.4.10 : ok (2 reboot) - 1.6.18-1.4.11 : ok (2 reboot) - 1.6.18-1.4.12 : ok (2 reboot) - 1.6.18-1.4.13 : ok (3 reboot) The second one is an older one based on a Athlon X2 5000 hd 250 Go. For this one, I reinstalled Mageia 4 (before I found this bug report) from the Live DVD kde x64 iso. I also did the updates and renamed .kde4 to .kde4-old to reset the kde configuration. Strangely, I cannot reproduce the problem. With the standard updates, the PC is now booting normaly ... There are 2 other PCs closed to me with Mageia 4 x64 (one based on a Phenom x3 720 hd 3To, the other is a laptop with an i7 720qm + ssd) that are not affected by the problem ...
CC: (none) => christian_duhaut
OK, there is now a dbus-1.6.18-1.6.mga4 in updates_testing so both i586 and x86_64 users can test
I just tested the version dbus-1.6.18-1.6.mga4 in updates_testing x86_64 on my phenom x4 945 PC : The result is not ok : first boot ok, second boot : very slow (more than 5 mns to start kde ...)
Attachment 5494 mime type: text/x-log => text/plain
(In reply to Christian D from comment #45) > I just tested the version dbus-1.6.18-1.6.mga4 in updates_testing x86_64 on > my phenom x4 945 PC : > The result is not ok : first boot ok, second boot : very slow (more than 5 > mns to start kde ...) I did 8 other reboots with this version without problem : so, globally 1 out of 10 reboot failed
Tested the 32-bit versions on the two affected machines I have with mixed results. They worked on a Dell Dimension E310 (P4 processor, desktop kernel) but did not work with the machine with an AMD Sempron 3100+ processor and the server kernel. Two boots OK, third one was even slower than before. Booting into safe mode and using Control-D to continue avoids the problem, as before. I'm looking for the procedure to downgrade again. All I get is an error message telling me the ***.3 packages aren't in the database.
CC: (none) => andrewsfarm
Inability to downgrade was solved by changing from MIRRORLIST to the specific US kernel.org mirror.
Based on the steps I followed on the thread, https://forums.mageia.org/en/viewtopic.php?f=7&t=8564, I did not encounter the issue anymore. I ran the following commands urpmi.update "Core Updates Testing" (as a normal user) urpmi --media "Core Updates Testing" dbus-1.6.18-1.6.mga4 lib64dbus1_3-1.6.18-1.6.mga4 dbus-x11-1.6.18-1.6.mga4 (as a root user) I observed that the login and logout process was a bit slower compared to the files I was using previously. dbus-1.6.18-1.3.mga4 libdbus1_3-1.6.18-1.3.mga4 dbus-x11-1.6.18-1.3.mga4 The difference in speed/responsiveness was somewhere between 5 to 20 seconds (rough estimate). This was done on a 64bit laptop, MSI CX41 to be exact. After installing the patch, I did 5 consecutive reboots + 1 shutdown + another reboot after running updates via MCC (I did not get any updates).
dbus 1.6.18-1.7 from updates-testing fixed the bug for me
Resolution: (none) => FIXEDStatus: REOPENED => RESOLVED
.7 version working on the machine that the .6 version didn't fix. Have not yet tried it on the machine where the .6 version worked.
Reported upstream, <https://bugs.freedesktop.org/show_bug.cgi?id=86431>. Mageia security / D-Bus maintainers: please report regressions upstream, particularly if they are regressions in security updates. We can't fix bugs we don't know about.
See Also: (none) => https://bugs.freedesktop.org/show_bug.cgi?id=86431CC: (none) => simon.mcvittie
Thanks Simon, will try to do better communicating with upstream
To people who have seen this bug in the past: I'm an upstream D-Bus maintainer, trying to get this fixed for everyone, but I haven't been able to make it happen on my own hardware without setting a pathologically small timeout. I would be interested to know whether the people who were affected are using a non-local user database (NIS, SSSD, YP, etc.) or any other unusual system infrastructure. Also, would it be possible for some of you to do some testing for me? What I would like tested is: * Compile a version of D-Bus with the logging patch from https://bugs.freedesktop.org/show_bug.cgi?id=86431#c4 (perhaps a Mageia developer could put a patched build somewhere for people to try?) * Put this in /etc/dbus-1/system-local.conf: <busconfig> <limit name="auth_timeout">5000</limit> </busconfig> * Reboot enough times that you see the symptoms of either this or Bug #14251 * Check journalctl for messages a bit like this: dbus[123]: [system] Connection has not authenticated soon enough, closing it (auth_timeout=5000ms, elapsed: 5001ms) What I expect is that when your computer is in the bad/slow situation, you will see those messages, and when it is not, you won't. After testing, delete system-local.conf to return to the default auth_timeout. That should put everything back to normal; reinstall version 1.6.18-1.8 of Mageia's dbus packages if necessary.
(In reply to Simon McVittie from comment #54) > What I would like tested is: > > * Compile a version of D-Bus with the logging patch from > https://bugs.freedesktop.org/show_bug.cgi?id=86431#c4 > (perhaps a Mageia developer could put a patched build somewhere > for people to try?) I submitted a dbus-1.6.18-1.9.mga4 to mga4 core updates_testing with that patch applied.
Hi Thomas, Although I haven't experienced the problem in the current patch that was published by Mageia team, I would like to help Simon put out a fix for everyone regardless of Distribution. Can you please post a step-by-step instructions on what to do and what output and/or files are needed to be generated by non technical users like me? Thank you.
Reopening the bug for debugging. The packages can be found in Core Updates Testing. To install then, as root, use the following commands for either i586 or x86_64 depending on your installation. For i586: # urpmi.update "Core Updates Testing" # urpmi --media "Core Updates Testing" dbus dbus-x11 dbus-doc libdbus1_3 For x86_64: # urpmi.update "Core Updates Testing" # urpmi --media "Core Updates Testing" dbus dbus-x11 dbus-doc lib64dbus1_3 Reboot. After each reboot, as root, check "journalctl -a -b | grep -i dbus" for any message like "Connection has not authenticated soon enough, closing it" or "Timing out authentication for connection"
Status: RESOLVED => REOPENEDResolution: FIXED => (none)
I've not had the problems on this computer but testing the new dbus anyway mga4 64, no issues or journal messages after 3 reboots.
Mageia 4 64-bit on AMD Phenom Quadcore. Installed dbus 1.9 from Core Updates Testing and created system-local.conf as indicated above. First reboot OK, second deard slow, also opening "Home" from desktop takes ages. Used journalctl as in Comment 57 and get following: After reboot and a number of "Activating systemd ....." I get the first Nov 19 09:56:18 xxx.yyyy.zzzz dbus[832]: [system] Connection has not authenticated soon enough, closing it (auth_timeout=5000ms, elapsed: 10449ms) then two more "Activating systemd ....." and then 3 like Nov 19 09:56:33 xxxx.yyyy.zzzz dbus[832]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out followed by Nov 19 09:56:56 xxxx.yyyy.zzzz dbus[832]: [system] Rejected send message, 2 matched rules; type="method_call", sender=":1.19" (uid=500 pid=3488 comm="kdeinit4: kded4 [kdeinit] ") interface="org.freedesktop.DBus.ObjectManager" member="GetManagedObjects" error name="(unset)" requested_reply="0" destination="org.bluez" (uid=0 pid=741 comm="/usr/libexec/bluetooth/bluetoothd ") and then 14 of these "Failed to activate ...timed out" then another 3 of "Rejected message ...." then some 150 of these "Failed to activate ...timed out" where another five of these "Rejected message ...." are interspersed.
CC: (none) => herman.viaene
For testers, remember this part: * Put this in /etc/dbus-1/system-local.conf: <busconfig> <limit name="auth_timeout">5000</limit> </busconfig> * Reboot enough times that you see the symptoms of either this or Bug #14251 * Check journalctl for messages a bit like this: dbus[123]: [system] Connection has not authenticated soon enough, closing it (auth_timeout=5000ms, elapsed: 5001ms)
Created attachment 5611 [details] journalctl dbus errors Tested the version dbus-1.6.18-1.9.mga4 with system-local.conf as required above (5000ms) in updates_testing x86_64 on my phenom x4 945 PC : - first reboot : login screen OK but starting KDE was very slow. A lot of timeout errors : see attached file dbus-problem-1st-and-3rd-reboot.txt - second reboot : the same - third reboot, after deleting system-local.conf : normal boot, but still a dbus error (see attached file)
Thanks, everyone! Christian's log indicates that at least on Christian's system, the situation is as we suspect. In particular, this: nov. 19 09:57:47 pc-bureau dbus[1216]: [system] Connection has not authenticated soon enough, closing it (auth_timeout=5000ms, elapsed: 9545ms) indicates that the system is *very* busy at that point: dbus-daemon scheduled a 5 second timer, then didn't get woken up until nearly 5 seconds later than it requested. Herman's notes are also consistent with that.
Lew, Claire: if you want to replicate this testing, please follow the instructions from both Comment #57 and Comment #60. I would expect results similar to what Christian had in Comment #61. Christian wrote: > - third reboot, after deleting system-local.conf : normal boot, > but still a dbus error (see attached file) If you mean this: """ nov. 19 10:35:26 pc-bureau systemd[2472]: Failed to open private bus connection: Failed to connect to socket /run/user/500/dbus/user_bus_socket: No such file or directory """ then I think that's a systemd user-session weirdness, and is not harmful.
(In reply to Simon McVittie from comment #63) > Lew, Claire: if you want to replicate this testing, please follow the > instructions from both Comment #57 and Comment #60. I would expect results > similar to what Christian had in Comment #61. > > Christian wrote: > > - third reboot, after deleting system-local.conf : normal boot, > > but still a dbus error (see attached file) > > If you mean this: > > """ > nov. 19 10:35:26 pc-bureau systemd[2472]: Failed to open private bus > connection: Failed to connect to socket /run/user/500/dbus/user_bus_socket: > No such file or directory > """ > > then I think that's a systemd user-session weirdness, and is not harmful. Thanks for the answer. This last error probably because I didn't close the user session before the reboot ...
On a Mageia4 64 bit AMD machine fully updated. On cold boot this morning experienced again a very slow start. This machine has dbus 1.6.18-1.8.mga4, and has never been involved in testing. There is no such time-out message as Simon indicated above, I piped the output of the command (Comment 57) to a file slowstart that I will upload. Second boot went OK.
Created attachment 5616 [details] Dbus messages after slow start with dbus 1.6.18-1.8
> 1.6.18-1.8.mga4 [...] > There is no such time-out message as Simon indicated above Revision -1.8 can never produce the message "Connection has not authenticated soon enough", even in situations where it would be correct, because the code to log that message does not exist in that version. It would be great if you could test with -1.9, with or without the system-local.conf that reduces this timeout from 30s to 5s. That revision will log that message, if appropriate. Also, how fast and how recent is this machine? Our working theory is still that this is much more likely to happen on old/slow hardware (I don't know whether the CPU or the disk is the important bit). Finally, a full log from journalctl -ab (censor it if necessary) would be useful information, since it would indicate whether systemd is having other difficulties on your machine.
Created attachment 5617 [details] 1strebootoutput
Created attachment 5618 [details] outputafter1stshutdown
Created attachment 5619 [details] 3rdrebootoutput
Created attachment 5620 [details] 4threbootoutput
Created attachment 5621 [details] outputafter2ndshutdown
Hi, I've performed the tests on my laptop. Kindly refer to the exact steps I did below. I ran the following commands in my terminal as root: urpmi.update "Core Updates Testing" urpmi --media "Core Updates Testing" dbus dbus-x11 dbus-doc lib64dbus1_3 After completing the upgrade, I copied system.conf in /etc/dbus-1 directory via cp command and named it system-local.conf. Using vi, I added the tag <limit name="auth_timeout">5000</limit> on the bottom part of the file just above </busconfig>, please refer to "system-local" file that I attached, and then rebooted my laptop. The results below were done in the exact order. 1st Reboot (after running update and modifying the system-local.conf file) -boot time is around 1m 50s starting from the Mageia Grub(legacy) bootloader -ran hddtemp (as root) in terminal and got the error below. Error getting authority: Error initializing authority: Could not connect: Connection refused -ran journalctl -a -b | grep -i dbus (as root) and the output of this is in "1strebootoutput" 1st laptop shutdown for 15 minutes -boot time is around 1m 53s starting from the Mageia Grub(legacy) bootloader -ran hddtemp (as root) in terminal and got the error below. Error getting authority: Error initializing authority: Could not connect: Connection refused -ran journalctl -a -b | grep -i dbus (as root) and the output of this is in "outputafter1stshutdown" 2nd Reboot -boot time is around 1m 14s starting from the Mageia Grub(legacy) bootloader -ran hddtemp (as root) in terminal and got the error below. Error getting authority: Error initializing authority: Could not connect: Connection refused -I failed to run "journalctl -a -b | grep -i dbus" because I accidentally rebooted my laptop so I do not have the output. 3rd Reboot -boot time is around 1m 14s starting from the Mageia Grub(legacy) bootloader -ran hddtemp (as root) in terminal and got the error below. Error getting authority: Error initializing authority: Could not connect: Connection refused -ran journalctl -a -b | grep -i dbus (as root) and the output of this is in "3rdrebootoutput" 4th Reboot -boot time is around 1m 14s starting from the Mageia Grub(legacy) bootloader -ran journalctl -a -b | grep -i dbus (as root) and the output of this is in "4threbootoutput" 2nd laptop shutdown 3 minutes -boot time is around 1m 28s starting from the Mageia Grub(legacy) bootloader -ran journalctl -a -b | grep -i dbus (as root) and the output of this is in "outputafter2ndshutdown" I did not encountered "Connection has not authenticated soon enough, closing it" or "Timing out authentication for connection" messages but the attached logs do have other error messages like "connection refused" and other errors. After running all the tests, I reverted back to dbus-1.6.18-1.8.mga4, lib64dbus1_3-1.6.18-1.8.mga4 and dbus-x11-1.6.18-1.8.mga4. I ran the following command as root: urpmi --downgrade dbus-1.6.18-1.8.mga4 lib64dbus1_3-1.6.18-1.8.mga4 dbus-x11-1.6.18-1.8.mga4 After completing the downgrade, I deleted /etc/dbus-1/system-local.conf and rebooted my laptop. If you are interested to know my processor and hard disk drive here they are: Processor: Intel(R) Core(TM) i5-3210M CPU @ 2.50GHz via cat /proc/cpuinfo Hard Disk Drive: Hitachi HTS547575A9E384 via hdparm -i /dev/sda My laptop is a MSI CX41. Hope this helps.
Created attachment 5622 [details] system-local
Thanks for the time you've put into this Lew. Journalctl -b shows output from the current boot only, so the logs you've attached after having troubles at shutdown using journalctl -b are unlikely to show relevant information, sadly. You can use -b -1 to show the journal of the previous boot only though, which may show the problems at shutdown.. journalctl -a -b -1 | grep -i dbus > journalfrompreviousboot.txt
Lew, thanks for your efforts, but I'm afraid you put the wrong thing in system-local.conf, making these test results not useful. What I intended you to have in system-local.conf is exactly this, and nothing more: <busconfig> <limit name="auth_timeout">5000</limit> </busconfig> Configuration in system-local.conf adds to system.conf (via the <include> directive near the end of system.conf), it does not replace system.conf. In this case, pasting the contents of system.conf into system-local.conf meant that dbus-daemon could not start at all, because system-local.conf was trying to include itself: Nov 20 22:18:59 mageia.lew dbus-daemon[1003]: Failed to start message bus: Circular inclusion of file '/etc/dbus-1/system-local.conf'
(In reply to Simon McVittie from comment #67) > > 1.6.18-1.8.mga4 [...] > > There is no such time-out message as Simon indicated above > > Revision -1.8 can never produce the message "Connection has not > authenticated soon enough", even in situations where it would be correct, > because the code to log that message does not exist in that version. OK, so that explains why. > > It would be great if you could test with -1.9, with or without the > system-local.conf that reduces this timeout from 30s to 5s. That revision > will log that message, if appropriate. No, this happened on my daily-use-PC, and I refrain from using anything on it that is not 100% standard approved. Sorry for that. > > Also, how fast and how recent is this machine? Our working theory is still > that this is much more likely to happen on old/slow hardware (I don't know > whether the CPU or the disk is the important bit). Well this is a (about) 2 year old AMD 6 core processor FX-6100 with nVidia GeForce 6800 graphics. I would also suppose this problem would occur more on slower hardware, but my experience (as far as it goes) is exactly the opposite. I explain: This machine has suffered from the problem right from the beginning. I have a somewhat older laptop (AMD Phenom Quadcore) and that also had its share of the problems, but far less frequently. AND I have installed MGA-4 on a number of older machines (6 - 10 years old), some single core, some early dual core at my friends homes. Not all are running KDE, the eldest ones I installed with Xfce. But NONE of them reported any problem in the past weeks. And I think I can quite sure these people would ring me if their PC would start so smpw as mine did. BUT I also installed MGA-4-64 on a brand new PC (Intel i5), with an SSD, and I haven't had any complaints on this one either. From the first two I could speculate that the more cores, the bigger the problem, but the last one (but I had only a single PC like that) would point to disk???? Sorry if I make the confusion even bigger than it was before. > > Finally, a full log from journalctl -ab (censor it if necessary) would be > useful information, since it would indicate whether systemd is having other > difficulties on your machine. I presume you might prefer the one of the boot with the problem, I'll see if I can master journalctl 's options.
Created attachment 5625 [details] Full journal of slowstart with dbus 1.6.18-1.8
Created attachment 5628 [details] journalctl -ba output This is what I get with <limit name="auth_timeout">5000</limit>. My computer was slow since the first reboot. This is my hardware * Intel(R) Pentium(R) CPU G840 @ 2.80GHz * Hard disk ST500DM002-1BD142 If you need more information, please ask.
Thanks everyone, I don't need any more test results with the 5000 timeout. I will revert the timeout change upstream soon, equivalent to what Mageia developers did for 1.6.18-1.8. Herman wrote: > Full journal of slowstart with dbus 1.6.18-1.8 I think you have a different slow startup issue here: nothing in Attachment #5625 [details] indicates problems with D-Bus. It might be best to report this separately. Olivier wrote: > journalctl -ba output Attachment #5628 [details] indicates that reverting the timeout to what it used to be (as done in 1.6.18-1.8) fixes this for you. Please delete system-local.conf and everything should work after that.
Created attachment 5631 [details] Corrected journal
I checked again, and now found I made a mistake in my journalctl command. New file is the full log of the slow boot where I previously sent the filtered log. Sorry for that.
(In reply to Simon McVittie from comment #80) > Thanks everyone, I don't need any more test results with the 5000 timeout. I > will revert the timeout change upstream soon, equivalent to what Mageia > developers did for 1.6.18-1.8. > I was thinking about this... would it be wise to ship a: /etc/dbus-1/system-local.conf with this commented out: <busconfig> <limit name="auth_timeout">5000</limit> </busconfig> and an explanation about enabling it to harden the system security ?
The whole point of system-local.conf is that unlike system.conf, it does not exist unless added by a local sysadmin; so, no, I'm not going to start shipping it now. In the development branch, I intend to put the various default limits in system.conf, commented out, so sysadmins can use them as a reference. I'll probably also add a README.denial-of-service or similar at some point, but that's less urgent than fixing this regression. I would rather limit the amount of churn on the 1.8 stable branch, so I probably won't do the same there. Reducing this timeout from 30 to 5 seconds was a hardening change: it reduces the worst-case denial of service by reducing the time by which broken or malicious processes can delay legitimate connections. On my test system, with 100 processes each connecting in a loop, a legitimate connection takes around half the auth_timeout; so I don't think a shorter timeout than 30s is necessary unless your users are actively hostile. If your users *are* that hostile, then I think dbus is probably the least of your concerns. You'd also want to apply a relatively strict RLIMIT_NPROC, which would also mitigate that attack.
I'd like to thank you, on behalf of everybody, for following up on this Simon.
Just FYI, we can update Cauldron 1.8.12, which contains some of the fixes we've added locally: http://lists.freedesktop.org/archives/dbus/2014-November/016430.html
I think upgrading to 1.6.28 or 1.8.12 should make all your patches for this issue unnecessary, if the policy for the relevant part of your distribution allows that. (I realise you are probably stuck on 1.6.18 + selected patches for your stable release - I'm in a similar situation with 1.6.8 in Debian 7.) If you have additional downstream patches fixing other bugs, please send them to the freedesktop.org Bugzilla (product/component: dbus/core).
Thanks Simon. Our maintainer, Thomas, is having connection issues at the moment after moving into a new house.
Thanks Simon. as you suspected I will keep mga4 on 1.6.18 + patches, but for Cauldron / upcoming mga5 I've updated to 1.8.12 Closing this as fixed
Status: REOPENED => RESOLVEDResolution: (none) => FIXED