Bug 9534 - dbus upgrades causes numerous blocked processes on upgrade from Mageia 2 to 3 (e.g. rtkit).
Status: RESOLVED FIXED
Alias: None
Product: Mageia
Classification: Unclassified
Component: RPM Packages
Version: Cauldron
Hardware: All Linux
Priority: release_blocker critical
Target Milestone: ---
Assignee: Colin Guthrie
Duplicates: 9535
Depends on:
Blocks: 8016
Reported: 2013-03-26 00:38 CET by Dave Hodgins
Modified: 2013-04-26 01:01 CEST (History)
Source RPM: dbus-1.6.8-4.mga3.src.rpm

Description Dave Hodgins 2013-03-26 00:38:29 CET
During upgrade test from Mageia 2 to 3, using urpmi,
urpmi starts the dbus-send from the rtkit postinstall
scriptlet, and stops waiting for it to finish.  During
an upgrade, the dbus-send never completes, blocking
the rest of the upgrade.

From htop ...
0.0  0.1  0:00.10 |  |     +- -bash
0.0  0.0  0:17.17 |  |        +- tee /tmp/urpmi-log.txt
0.0  4.4 28:15.36 |  |        +- /usr/bin/perl /sbin/urpmi --auto-update --resume --debug
0.0  0.0  0:00.01 |  |           +- /bin/sh /var/tmp/rpm-tmp.AaPojD 2
0.0  0.0  0:00.00 |  |              +- dbus-send --system --type=method_call --dest=org.fr


Dave Hodgins 2013-03-26 00:44:20 CET

Priority: Normal => release_blocker
Blocks: (none) => 8016

Dave Hodgins 2013-03-26 01:38:40 CET

Assignee: bugsquad => mageia

Comment 1 Colin Guthrie 2013-04-16 00:07:45 CEST
After some testing, the problem generally appears to be caused by using the new dbus libs to connect to the old dbus server.

If you do a minimal install, ensure rtkit-daemon.service is started and enabled (and move the enable symlink to the multi-user.target.wants dir), then upgrade everything *except* rtkit, then simply trying to restart rtkit-daemon.service (without upgrading) is enough to break it. It is stopped fine, but starting fails (it seems to stall connecting to the bus).

I suspect this is due to dbus daemon running with old libs in memory but any new connections are attempted with the new libs (and various surrounding changes that affect old + new stuff in this way).

Restarting dbus.service allows the (old) rtkit to start happily (and survive several restarts). After this, the upgrade also goes smoothly.

However, I don't really want to do a restart of dbus. It's generally going to be a problematic thing to do and I don't want to do it on every upgrade of the package. I'm wondering if we should limit the restart to something we only do on major version changes? - i.e. via a triggerun.
Comment 2 Colin Guthrie 2013-04-16 01:27:05 CEST
Strace indicates a hang after connecting to the dbus socket. The daemon seems pretty much stuck at the authentication stage.

rtkit tries writing AUTH EXTERNAL to the dbus socket. It likely needs polkit auth. Seems polkit was no longer running on my system, but a quick look suggests that the new version of polkit cannot be started as it too is waiting for auth from dbus.

So all in all it looks like a dbus problem.

Restarting dbus fixes everything but it will most likely kill X11 sessions and have other horrible side effects...

We can do a few things to mitigate problems but all in all, it's a horrible scenario that is not easily solvable. :s
Comment 3 Colin Guthrie 2013-04-16 02:03:41 CEST
I've had success with upgrades based on two different strategies:

 1. Before upgrading any packages, simply upgrade dbus and lib[64]dbus1_3 (nothing else is pulled in).
 2. Reboot
 3. Do upgrades as normal.

or

 1. Add /dbus/ to urpmi's skip.list
 2. Do upgrade as normal
 3. Remove /dbus/ from skip.list
 4. Do the final upgrades

Both approaches work equally well.
Comment 4 Sander Lepik 2013-04-16 12:02:54 CEST
So could mageia-prepare-upgrade add /dbus/ to skip.list, and some package remove it again after most packages are upgraded? It sounds very hackish, but I don't know any easier way to do that for a normal user.

CC: (none) => sander.lepik

Comment 5 Colin Guthrie 2013-04-16 12:28:35 CEST
Problem is we cannot really define the "after most packages are upgraded" part. And we'd also have to be very careful about fiddling with users' skip.lists and not accidentally removing stuff :s

It's maybe better to just put a readme.urpmi into the mageia-prepare-upgrade to document this anomaly - telling users to do their dbus upgrade first?

One potential solution would be to make urpmi read a list from /var/run/urpmi/skip.list in addition to the /etc/urpmi/skip.list. We could then make mageia-prepare-upgrade drop a dracut module into place which included that file. Only when mageia-prepare-upgrade is removed and the initrd regenerated (which should happen near the end of the install) and the machine rebooted will the runtime skip.list then disappear. So the user will need to do a reboot.

It relies on a few things, however:
 1. That Thierry is happy to make said change to urpmi.
 2. That dbus doesn't sneak in with the high-priority updates in urpmi (i.e. the first run)
 3. That urpmi *is* updated in the high-priority updates.
 4. That when urpmi is done running the high-priority updates that it then reruns the new version and thus reads the /var/run/urpmi/skip.list file.

All in all it still sounds quite hacky, but any other suggestions are greatly appreciated!
Comment 6 Colin Guthrie 2013-04-16 12:30:12 CEST
@TV Please see comment:5 for your input and suggestions.

CC: (none) => thierry.vignaud
Summary: rtkit postinstall scriptlet blocking upgrade from Mageia 2 to 3. => dbus upgrades causes numerous blocked processes on upgrade from Mageia 2 to 3 (e.g. rtkit).
Source RPM: rtkit-0.11-3.mga3.src.rpm => dbus-1.6.8-4.mga3.src.rpm

Comment 7 Sander Lepik 2013-04-16 13:03:32 CEST
If it were added at the end as:

###mageia3-upgrade-begin###
/dbus/
###mageia3-upgrade-end###

then I'm quite sure that someone is able to write some sed magic to remove only those 3 lines.
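
A minimal sketch of that sed approach, assuming the three marker lines appear in the file exactly as written above. The function names and the idea of passing the list path as an argument are illustrative only and are not part of mageia-prepare-upgrade; `sed -i` assumes GNU sed.

```shell
#!/bin/sh
# Hypothetical helpers. The marker strings come from the comment above;
# everything else (names, arguments) is made up for illustration.
BEGIN_MARK='###mageia3-upgrade-begin###'
END_MARK='###mageia3-upgrade-end###'

# Append the marked /dbus/ block unless the begin marker is already present.
add_dbus_skip() {
    list=$1
    grep -qxF "$BEGIN_MARK" "$list" 2>/dev/null && return 0
    printf '%s\n/dbus/\n%s\n' "$BEGIN_MARK" "$END_MARK" >> "$list"
}

# Delete everything from the begin marker through the end marker, inclusive.
# Lines outside the markers are left untouched.
remove_dbus_skip() {
    list=$1
    [ -f "$list" ] || return 0
    sed -i "/^${BEGIN_MARK}\$/,/^${END_MARK}\$/d" "$list"
}
```

Calling add_dbus_skip twice on the same file leaves a single block, and remove_dbus_skip takes out only those three lines, so any entries the user added themselves survive.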

Can't we use some filetrigger (I really don't know how those work exactly) stuff to tell urpmi to run this removing at the very end of upgrade?

readme.urpmi is the very last option, as I'm quite sure that some people will just miss it and others won't know how to do it. So far Mandriva/Mageia upgrades have just worked. That pre-upgrade step is already a new thing to do.
Comment 8 Colin Guthrie 2013-04-16 13:08:09 CEST
Filetriggers are processed at the end of each transaction AFAIK, not at the end of the whole urpmi process. Also not quite sure what we'd trigger on anyway :s

I was expecting that users would hopefully read a readme.urpmi on the mageia-prepare-upgrade package as they already have to specifically download it :D
Comment 9 Sander Lepik 2013-04-16 13:30:53 CEST
(In reply to Colin Guthrie from comment #8)
> I was expecting that users would hopefully read a readme.urpmi on the
> mageia-prepare-upgrade package as they already have to specifically download
> it :D

I don't know about Mageia, but at least Mandriva gave a notice when there was a new version available. I thought that this dialog would also prompt to install the needed package.

And you expect too much from normal users :D
Comment 10 Colin Guthrie 2013-04-16 13:52:08 CEST
Ahh, right. I wonder how best to make it work with the prepare package?

/me probably needs to investigate :D
Comment 11 Thierry Vignaud 2013-04-16 19:46:05 CEST
(In reply to Colin Guthrie from comment #5)

Of course urpmi will always be in the priority list.
But I don't see any reason to patch urpmi.

You can force dbus+libdbus to be upgraded in the same transaction using requires with version+release. Then just restart the dbus daemon in %triggerpost if the version is old enough (if it was already running).

Anyway, mageia-prepare-upgrade could just use "urpmi --skip dbus"...

As for killing X11 session
Comment 12 Thierry Vignaud 2013-04-16 20:02:32 CEST
As for killing X11 sessions, there's already quite a lot of things that can break them when performing a live upgrade
claire robinson 2013-04-17 20:50:14 CEST

CC: (none) => eeeemail

William Kenney 2013-04-18 21:33:07 CEST

CC: (none) => wilcal.int

Comment 13 Colin Guthrie 2013-04-21 15:21:50 CEST
(In reply to Thierry Vignaud from comment #12)
> As for killing X11 sessions, there's already quite a lot of things that can
> break them when performing a live upgrade

There are several things that *can* break an X11 session but if you do "systemctl restart dbus.service" now, your X11 session will simply die along with any terminals you happen to be using. If the upgrade itself is run from within a terminal it will be killed half way through package installation - not a good situation to deliberately engineer.

I agree live upgrades are problematic and something I'd rather simply not support if I'm honest, but I think we should try to avoid this if possible.

So while forcing dbus to require its libdbus and vice versa to ensure they are in the same txn is fine (and will simplify a few instructions), I would strongly recommend against doing a dbus restart in *any* rpm scripts.

Also mageia-prepare-upgrade does not run urpmi. All it does is prepare the filesystem, so the user still has to run their upgrade by themselves.

I guess the only way to ensure users do the right thing is to modify /etc/urpmi/skip.list for them somehow. (This is really all I wanted in urpmi: using the /run filesystem for "temporary config" that survives any package removals, which would mean we didn't have to mess with people's configs and undo said config tweaks, potentially while they are still needed.)

I'm still not sure of a technique to achieve the desired results... any suggestions that do not involve restarting dbus and that survive until the next reboot are welcome.
Comment 14 Colin Guthrie 2013-04-21 15:24:18 CEST
OK, just as I sent that I had another idea.

mageia-prepare-upgrade will work as normal, but tweak the skip.list to exclude dbus. On uninstall, it drops a systemd unit into /etc/ that checks for the mageia version on reboot (perhaps checking a few packages too), automatically removes the dbus entry from skip.list, and then automatically removes itself.

It's a bit nasty but this is the only thing I can think of to solve this problem without tweaking other things.
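
One way the check-and-clean-up step could look, as a rough sketch only. It assumes the release file contains a line of the form "Mageia release N ..." and that the skip entry is a bare /dbus/ line; the function names are hypothetical and all paths are placeholders passed in by the caller, so the real package may well do this differently.

```shell
#!/bin/sh
# Hypothetical post-reboot cleanup. Nothing here is taken from the real
# mageia-prepare-upgrade package; paths are supplied by the caller.

# Print the major version found in a release file, or nothing at all.
# Assumes a line such as "Mageia release 3 (Official) for x86_64".
mageia_major() {
    sed -n 's/^Mageia release \([0-9][0-9]*\).*/\1/p' "$1" 2>/dev/null | head -n 1
}

# If the booted system is at least version $4, drop the /dbus/ line from the
# skip list ($2) and delete the unit file ($3) so this never runs again.
# Returns non-zero (and changes nothing) while the old release is still booted.
cleanup_after_upgrade() {
    release_file=$1 skip_list=$2 unit_file=$3 wanted=$4
    major=$(mageia_major "$release_file")
    [ -n "$major" ] && [ "$major" -ge "$wanted" ] || return 1
    sed -i '\|^/dbus/$|d' "$skip_list"
    rm -f "$unit_file"
}
```

Because the function does nothing until the version check passes, rebooting back into the old release leaves both the skip entry and the unit in place.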
Comment 15 Sander Lepik 2013-04-21 19:48:53 CEST
(In reply to Colin Guthrie from comment #14)
> OK, just as I sent that I had another idea.
> 
> mageia-prepare-upgrade will work as normal, but tweak the skip.list to
> exclude dbus. On uninstall, it drops a systemd unit into /etc/ that checks
> for the mageia version on reboot (perhaps checking a few packages too),
> automatically removes the dbus entry from skip.list, and then automatically
> removes itself.
> 
> It's a bit nasty but this is the only thing I can think of to solve this
> problem without tweaking other things.

I'm OK with that too. Yes, a bit nasty, but if properly tested it should work just fine.
Comment 16 Colin Guthrie 2013-04-23 15:05:30 CEST
*** Bug 9535 has been marked as a duplicate of this bug. ***
Comment 17 Colin Guthrie 2013-04-23 15:38:48 CEST
Just for reference, it seems that some dbus requires cause some degree of issue on upgrades:

[colin@jimmy cauldron-packages (master)]$ rpm -q --requires lib64dbus-glib
..
pkgconfig(dbus-1)
pkgconfig(glib-2.0)
pkgconfig(gobject-2.0)
devel(libdbus-1(64bit))
devel(libgio-2.0(64bit))
devel(libglib-2.0(64bit))
devel(libgobject-2.0(64bit))
pkgconfig
...

Namely that this will cause upgrade issues when combined with the skip-list due to it pulling in too many deps.

It seems to me that this is caused by the package GConf2-sanity-check from mga2. When I remove this package, urpmi --auto-select does not propose any further updates. Because lib64dbus-glib obsoletes this package, but nothing specifically requires it, it results in pulling in various -devel packages which in turn fails as it cannot install the dbus-devel package (assuming it was not already installed) due to it being in the skip.list.

So in order to make this a smooth process, I suppose we have to also exclude lib[64]dbus-glib in the skip list.

On MGA2:
[colin@plateau ~]$ urpmq --whatrequires GConf2-sanity-check
GConf2-sanity-check
gnome-session-bin
gnome-session-bin
lib64GConf2-devel
libGConf2-devel

And in MGA3:
http://svnweb.mageia.org/packages/cauldron/GConf2/current/SPECS/GConf2.spec?view=markup#l86
The Gconf2-devel package obsoletes it.

Thierry, if we make gnome-session-bin also obsolete this, this should prevent this problem right?
Comment 18 Colin Guthrie 2013-04-23 15:40:25 CEST
Small correction to the above: lib64dbus-glib does not obsolete this package, but rather lib64GConf2-devel does. This pulls in the various -devel packages on upgrade (to a system with no -devel packages installed).

My proposed solution still stands tho'.
Comment 19 Colin Guthrie 2013-04-25 17:15:24 CEST
OK, failing any further feedback, I've pushed gnome-session and GConf2 both with conflicts rather than any obsoletes. That should hopefully make that bit nice.

I've also tweaked mageia-prepare-upgrade package to set a skip on dbus and then remove it again after a successful boot with mageia 3.

Of course that highlighted another bug.

Due to the libexec change, the old (mga2) dbus library that is still installed is looking in the wrong path to execute helpers. This causes at least gdm to fail to start on first reboot.

*sigh*

So the options I see here are:
 1. Don't care - just document this and forget about it - it's easily fixed via urpmi --auto-select and a reboot.
 2. Patch the dbus from mga2 to fall back to looking in /usr/libexec.

I guess 2 is the best option for users.
Comment 20 Sander Lepik 2013-04-25 17:32:23 CEST
(In reply to Colin Guthrie from comment #19)
>  2. Patch the dbus from mga2 to fall back to looking in /usr/libexec.
> 
> I guess 2 is the best option for users.

If you do have time and can do it then yes, it would be nice :)
Comment 21 Colin Guthrie 2013-04-25 18:56:29 CEST
Actually, it seems I was wrong on this one. Now that the gnome-session and GConf stuff is not blocked (thanks to a push earlier today), gdm no longer fails at this point.

It does still fail however in that it cannot connect to the system bus... not sure why as after a simple restart it works fine.... stracing gives no obvious clues either :(
Comment 22 Colin Guthrie 2013-04-25 20:31:47 CEST
OK, it turns out I'm an idiot. The default unit deps in my little systemd unit conflicted with sysinit.target, which resulted in an ordering cycle. That in turn meant dbus was not started, which caused all the problems. Moving the unit to multi-user.target fixed it and all is well.
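
For illustration, a unit along these lines avoids the cycle by hooking into multi-user.target while keeping the default dependencies. This is a sketch, not the unit shipped in mageia-prepare-upgrade: the unit name, description and ExecStart path are made up, and the helper just writes the file into a directory of the caller's choosing.

```shell
#!/bin/sh
# Hypothetical example. The file name, Description and ExecStart path are
# placeholders and do not come from the real package.
write_cleanup_unit() {
    unit_dir=$1
    cat > "$unit_dir/mga-upgrade-cleanup.service" <<'EOF'
[Unit]
Description=Remove temporary dbus skip entry after upgrade (illustrative)
# DefaultDependencies is left at its default (yes), which orders this
# service after sysinit.target. It must therefore not also be pulled in
# by sysinit.target, or the two orderings form a cycle.

[Service]
Type=oneshot
ExecStart=/usr/local/sbin/mga-upgrade-cleanup

[Install]
WantedBy=multi-user.target
EOF
}
```

The alternative would be to stay under sysinit.target and set DefaultDependencies=no, but then the unit has to declare all of its own ordering, so multi-user.target is the simpler choice for a one-shot cleanup.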

gdm starts up on reboot and the dbus stuff is removed and mgaonline offers to finish off the remaining packages that were skipped. 

All in all it works quite nicely for me now :)

Just need to do a lot of cosmetic work on mgaonline now so that it will do what is needed to guide the user through the upgrade process.
Comment 23 claire robinson 2013-04-25 22:45:15 CEST
Thanks Colin, it's much appreciated
Comment 24 Colin Guthrie 2013-04-26 00:08:04 CEST
OK, this is now resolved with the latest mageia-prepare-upgrade.

Works well for me, upgrade is smooth and quick and then after rebooting with new stuff, dbus is offered as an upgrade which then recommends the user reboot immediately.

Status: NEW => RESOLVED
Resolution: (none) => FIXED

Comment 25 William Kenney 2013-04-26 00:51:28 CEST
How, or if, will this affect boot.iso upgrades?
Comment 26 Colin Guthrie 2013-04-26 01:01:00 CEST
If you use the installer to do the upgrade, it shouldn't be an issue. This only affects in-place upgrades.
