Bug 8665 - rpm 4.11 considers too many things file conflicts with directories, causing upgrade issues
Status: REOPENED
Alias: None
Product: Mageia
Classification: Unclassified
Component: RPM Packages
Version: Cauldron
Hardware: i586 Linux
Priority: Normal major
Target Milestone: ---
Assignee: All Packagers
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on: 9055
Blocks:
Reported: 2013-01-11 19:09 CET by David Walser
Modified: 2017-08-05 17:26 CEST

See Also:
Source RPM: rpm-4.11.0-0.beta1.7.mga3
CVE:
Status comment:


Attachments
list of spec files running rm or rmdir in their %pre (6.00 KB, text/plain)
2013-01-15 21:25 CET, Thierry Vignaud
script to find directories with different sets of permissions (1.08 KB, text/plain)
2013-01-28 22:21 CET, Luc Menut
list of directories with different sets of permissions in cauldron (2013-01-28) (17.69 KB, text/plain)
2013-01-28 22:24 CET, Luc Menut

Description David Walser 2013-01-11 19:09:07 CET
I did a Mageia 2 -> Mageia 3 upgrade test with the net installer and got:

Installation of packages failed:

file /etc/httpd/conf/vhosts.d from install of apache-2.4.3-5.mga3.i586 conflicts with file from package apache-2.2.23-1.mga2.i586
file /etc/httpd/conf/webapps.d from install of apache-2.4.3-5.mga3.i586 conflicts with file from package apache-2.2.23-1.mga2.i586


at the end of the packages installation.  Looking at the package, those two directories changed from directories to symlinks (to /etc/httpd/conf/sites.d) in apache-2.4, which has a %pre script to prevent that from causing a problem:
%pre
if [ $1 = 2 ]; then
    # prevent symlink creation failure on update
    if [ ! -d /etc/httpd/conf/sites.d ]; then
        mkdir /etc/httpd/conf/sites.d
        mv -f /etc/httpd/conf/webapps.d/* /etc/httpd/conf/sites.d 2>/dev/null
        mv -f /etc/httpd/conf/vhosts.d/* /etc/httpd/conf/sites.d 2>/dev/null
        rmdir /etc/httpd/conf/webapps.d
        rmdir /etc/httpd/conf/vhosts.d
    fi
fi

I guess rpm is just looking at the package metadata, seeing that something about the directory changed, and reporting it as a conflict.  It shouldn't do this, and it has the potential to cause all kinds of issues.

Another place we recently saw this pop up was when glibc and audit both owned /usr/lib/audit (a directory), but with different permissions.  It used to be that when audit was installed it would just simply change the permissions of that directory, but cause no other problems.  This has been "fixed" in a way by just dropping the directory ownership from the audit package, but has the possibly unintended consequence that regular users can use the sotruss command when audit is installed, when they couldn't before.  In general, this new behavior from rpm could cause other problems too.
David Walser 2013-01-11 19:09:16 CET

CC: (none) => guillomovitch

Comment 1 David Walser 2013-01-11 19:15:08 CET
Christiaan Welvaart said on IRC to check the last comment on this bug:
https://bugzilla.redhat.com/show_bug.cgi?id=447156

which suggests that using %pretrans instead of %pre is a workaround.
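
For illustration only, the directory-to-symlink migration from the %pre script in the description could be written as a small POSIX shell helper, whose body would then be inlined into a %pretrans scriptlet. This is a sketch under assumptions: migrate_dir is a hypothetical name, this is not the fix that eventually went into the apache package, and as far as I know %pretrans scriptlets are often written in Lua instead, because a shell may not be available yet on a fresh install.

```shell
#!/bin/sh
# Hypothetical helper (not from the apache spec): fold a real directory into a
# target directory so the package payload can later create a symlink at the
# old path.
#   $1 = old directory path
#   $2 = directory that receives its contents
migrate_dir() {
    old=$1
    target=$2
    # Nothing to do if the old path is missing or is already a symlink.
    if [ -L "$old" ] || [ ! -d "$old" ]; then
        return 0
    fi
    mkdir -p "$target" || return 1
    # Move regular and hidden entries; unmatched globs are skipped.
    for entry in "$old"/* "$old"/.[!.]* "$old"/..?*; do
        if [ -e "$entry" ] || [ -L "$entry" ]; then
            mv -f "$entry" "$target"/ || return 1
        fi
    done
    # The old directory is empty now, so rmdir is enough.
    rmdir "$old"
}
```

Called as "migrate_dir /etc/httpd/conf/webapps.d /etc/httpd/conf/sites.d" (and likewise for vhosts.d) before the transaction starts, this would leave no real directory in the way of the new symlinks. Unlike the %pre script above, it also migrates when sites.d already exists.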
Comment 2 David Walser 2013-01-11 19:25:29 CET
In the installer, this prevents you from proceeding with the installation.

To work around that, you have to Ctrl-Alt-F2 and:

chroot /mnt rpm -Uvh --force /var/cache/urpmi/rpms/*.rpm
Comment 3 Thierry Vignaud 2013-01-11 20:57:30 CET
I was going to tell you that directory<->file/link replacement has _always_ been an issue and that you should use %pretrans, but you found that out by yourself.

Status: NEW => RESOLVED
Resolution: (none) => WONTFIX

Comment 4 David Walser 2013-01-12 03:39:01 CET
file->link replacements have been an issue for a long time, I already know that.  This is a new issue and there's more to it than that.

Status: RESOLVED => REOPENED
Resolution: WONTFIX => (none)

Comment 5 Thomas Backlund 2013-01-12 11:50:18 CET
(In reply to comment #0)
> 
> Another place we recently saw this pop up was when glibc and audit both owned
> /usr/lib/audit (a directory), but with different permissions.  It used to be
> that when audit was installed it would just simply change the permissions of
> that directory, but cause no other problems.  This has been "fixed" in a way by
> just dropping the directory ownership from the audit package, but has the
> possibly unintended consequence that regular users can use the sotruss command
> when audit is installed, when they couldn't before.

Well, the audit dir permissions will be locked back down in glibc, but I won't push a new glibc before the cauldron rebuild process is finished...

CC: (none) => tmb

Comment 6 David Walser 2013-01-12 21:37:04 CET
(In reply to comment #5)
> (In reply to comment #0)
> > 
> > Another place we recently saw this pop up was when glibc and audit both owned
> > /usr/lib/audit (a directory), but with different permissions.  It used to be
> > that when audit was installed it would just simply change the permissions of
> > that directory, but cause no other problems.  This has been "fixed" in a way by
> > just dropping the directory ownership from the audit package, but has the
> > possibly unintended consequence that regular users can use the sotruss command
> > when audit is installed, when they couldn't before.
> 
> Well, the audit dir permissions will be locked back down in glibc, but I won't
> push a new glibc before the cauldron rebuild process is finished...

Regarding that specifically, it's unclear what the correct action is.  The permissions of that dir in the glibc package were 755 before, and sotruss worked fine as a normal user.  While there may be a reason the audit package locks that down, as I said on the mailing list, it may not be right to have that be the case all the time, and it might be better handled through msec (such as having that directory's permissions set to 750 in the "secure" level).
Comment 7 David Walser 2013-01-12 21:42:17 CET
(In reply to comment #4)
> file->link replacements have been an issue for a long time, I already know
> that.  This is a new issue and there's more to it than that.

Let me explain this if it's not clear to you from the initial bug comment.

This is a major change in RPM's behavior vs. how it's always behaved in the past.  It needs to be fixed one way or another, and if it's not, it has the potential to break a lot of upgrades to Mageia 3 as well as cause all kinds of other issues in the future.  It's IMO too late in the development cycle to be introducing this kind of behavior change that affects all packages, and it should be reverted.

If it is not reverted, this is what we'd have to do to fix any potential issues with the packages we ship (note that it still wouldn't protect against these same problems in third-party packages):

1) Make sure that no two packages in the entire distribution provide the same directory with different ownership and/or permissions.  In the past, this was what we aimed for anyway ideally, but when there were overlaps, it wasn't a big deal.

2) Make sure that every package that uses a %pre script to handle issues with directories changing to symlinks and vice versa is changed to use %pretrans instead.  That's assuming that will actually even work, as I haven't tested that.  As I said earlier though, that's not really a solution so much as it is a workaround.

So one way or another, this is a real issue that needs to be addressed.
David Walser 2013-01-12 21:43:29 CET

Priority: Normal => release_blocker

Comment 8 Thierry Vignaud 2013-01-13 07:43:45 CET
Please, this is NOT a new issue.
This never worked, and we have always had to use hacks to make installs work.
The hack has now changed.

http://www.rpm.org/wiki/Releases/4.11.0
"
- Detect and report attempts to replace directories with a
  non-directory as file conflicts (related: RhBug:447156)

- Detect and report attempts to replace directory symlink with a
  directory as file conflicts

- Detect and report some cases of conflicting files accessed through
  different symlinks "

https://bugzilla.redhat.com/show_bug.cgi?id=447156
"Rpm >= 4.11 detects unsupported replace-attempts and reports them as
file conflicts, instead of barfing up in middle of
transaction. %pretrans hacks can still be used to work around it
though.

That's the extent to which this is going to be "fixed" for the
foreseeable future. Of course if rpm some day learns to truly deal
with some of these situations, conflicts can be lifted accordingly."

Priority: release_blocker => Normal
Status: REOPENED => RESOLVED
Resolution: (none) => WONTFIX

Comment 9 David Walser 2013-01-13 18:59:14 CET
Yes this IS a new issue.

Before rpm 4.11, the audit package (before /usr/lib/audit ownership was dropped) would have installed with no problems.

Before rpm 4.11, the apache package we have now would have upgraded from mga2 just fine.

And the impact of this is serious.  If upgrading with the installer, it stops it dead in its tracks.  I imagine urpmi would just crap out and leave things unfinished.  We have to fix this before Mageia 3, one way or another.

We can either:
1) Revert this behavior change in rpm (best solution).

2) Give the installer and urpmi a way to work around it and force through these issues so upgrades can be done successfully without failing.

3) Review and fix all of the packages in the distribution to make sure this is not an issue within our package set (but again, this doesn't help with third party ones).

Priority: Normal => release_blocker

David Walser 2013-01-13 19:00:09 CET

Status: RESOLVED => REOPENED
Resolution: WONTFIX => (none)

Comment 10 David Walser 2013-01-13 19:01:11 CET
Also, *please* stop pretending this isn't an issue or that we can just ignore it and be OK.  We want upgrades from Mageia 2 to work, so however we fix this, it has to be fixed.
Comment 11 Thierry Vignaud 2013-01-14 19:24:20 CET
I didn't.
Stop pretending there's no fix when you have known the proper fix for days.
Comment 12 David Walser 2013-01-15 01:47:12 CET
Thierry, I think I have done a pretty good job explaining the issue, but somehow you seem to still not understand it.  The only "simple fix" for this is to revert the behavior change in rpm, as I have explained before.  Barring that, or giving the installer or urpmi a way to work around it, all packages will need to be reviewed as I explained in Comment 7, and checking for directories owned by multiple packages with differing ownership and permissions isn't remotely easy, nor is it obvious how to do it.  Checking them for %pre scripts that may need to be changed to %pretrans is a lot easier, but there are still over 10000 packages in the distribution that would need to be checked.

If you still don't understand, please re-read everything I have posted on this bug.  This is a serious issue, and it is very disheartening the way you are downplaying it.
Comment 13 Sander Lepik 2013-01-15 08:48:40 CET
I think one way or another we have to fix those packages that are broken. Such a "simple fix" will cause trouble later.

CC: (none) => sander.lepik

Comment 14 Thierry Vignaud 2013-01-15 20:22:20 CET
There are only 135 spec files out of 11798 that have rm or rmdir in their %pre.
Nothing impossible.
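
A rough sketch of how such a list could be produced from a checkout of spec files follows. It is a heuristic written for illustration (the function name specs_with_rm_in_pre is hypothetical) and is not necessarily how the 135 figure above was obtained: it only tracks which section each line belongs to and looks for rm or rmdir as a separate word inside a plain %pre section.

```shell
#!/bin/sh
# Print the names of the spec files (given as arguments) whose %pre scriptlet
# appears to call rm or rmdir. Heuristic only: macros that expand to rm are
# not detected.
specs_with_rm_in_pre() {
    for spec in "$@"; do
        if awk '
            # A plain %pre header, optionally followed by arguments ("-n name").
            /^%pre([ \t]|$)/ { in_pre = 1; next }
            # Any other section header ends the %pre scriptlet.
            /^%(prep|build|install|check|clean|files|changelog|description|package|post|postun|preun|pretrans|posttrans|verifyscript|trigger[a-z]*)([ \t]|$)/ { in_pre = 0 }
            in_pre {
                # Ignore full-line shell comments.
                if ($0 ~ /^[ \t]*#/) next
                # Pad the line so rm/rmdir at either end still has delimiters.
                line = " " $0 " "
                if (line ~ /[ \t;&|](rm|rmdir)[ \t]/) { found = 1; exit }
            }
            END { exit !found }
        ' "$spec"; then
            printf '%s\n' "$spec"
        fi
    done
}
```

For example, "specs_with_rm_in_pre */SPECS/*.spec" (the directory layout is an assumption) would print one matching spec file per line, which could then be reviewed by hand for scripts that need to move to %pretrans.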

Status: REOPENED => RESOLVED
Resolution: (none) => WONTFIX

Comment 15 David Walser 2013-01-15 20:39:38 CET
(In reply to comment #14)
> There are only 135 spec files out of 11798 that have rm or rmdir in their %pre.
> Nothing impossible.

Care to share a list?

Also, that doesn't account for the overlapping directory ownership issue.

Status: RESOLVED => REOPENED
Resolution: WONTFIX => (none)

Comment 16 Thierry Vignaud 2013-01-15 21:23:20 CET
Stop reopening that bug

Status: REOPENED => RESOLVED
Resolution: (none) => FIXED

Comment 17 Thierry Vignaud 2013-01-15 21:25:23 CET
Created attachment 3378
list of spec files running rm or rmdir in their %pre

Directory overlapping must be fixed in packages: the rpmdb would be inconsistent if there were two sets of permissions for the same directory, and the real permissions would be random (i.e. those of the most recently upgraded package).
Comment 18 Thierry Vignaud 2013-01-15 21:25:43 CET
Wontfix

Resolution: FIXED => WONTFIX

David Walser 2013-01-16 01:00:14 CET

Assignee: thierry.vignaud => bugsquad

David Walser 2013-01-16 01:00:23 CET

Status: RESOLVED => REOPENED
Resolution: WONTFIX => (none)

Comment 19 David Walser 2013-01-16 01:03:53 CET
It would be nice if people who don't want to cooperate to help fix this wouldn't sabotage efforts to track the issue until it's fixed.

(In reply to comment #17)
> Created attachment 3378
> list of spec files running rm or rmdir in their %pre

Thanks for that.

> directory overlapping must be fixed in packages, rpmdb would be inconsistent if
> there were two sets of permissions for the same directory, real permissions
> would be random (aka the one of the latest upgraded package)

That doesn't help.  While it is true, this is how it always worked up until rpm 4.11.  Now the packages conflict, rpm won't install them, and if the installer runs into this, it won't proceed past the packages installation step.  So, we have to somehow now check for this in all of the packages and make sure that won't happen.
Comment 20 Luc Menut 2013-01-28 22:21:20 CET
Created attachment 3448
script to find directories with different sets of permissions

This script should find all the packages which install directories with different sets of permissions.
Thierry, please, can you review it?
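
The attached script is not reproduced in this report. As a rough illustration of the kind of check it describes (this is not Luc's script), the sketch below reads a tab-separated listing with one line per packaged directory and reports every path that appears with more than one distinct set of mode, owner and group. The input format and the name find_dir_conflicts are assumptions made for this example.

```shell
#!/bin/sh
# Report directories that are packaged with more than one set of attributes.
# Input (files given as arguments, or stdin): tab-separated lines of
#   path <TAB> mode <TAB> owner <TAB> group <TAB> package
# Output: one sorted line per conflicting path, listing every owning package.
find_dir_conflicts() {
    awk -F '\t' '
        NF >= 5 {
            attrs = $2 " " $3 " " $4
            # Count each distinct attribute set once per path.
            if (!(($1, attrs) in seen)) {
                seen[$1, attrs] = 1
                variants[$1]++
            }
            sep = ($1 in owners) ? "; " : ""
            owners[$1] = owners[$1] sep $5 "=" attrs
        }
        END {
            for (path in variants)
                if (variants[path] > 1)
                    print path ": " owners[path]
        }
    ' "$@" | sort
}
```

On an installed system, a listing in roughly this shape could probably be produced with an rpm query format such as "rpm -qa --qf '[%{FILENAMES}\t%{FILEMODES:perms}\t%{FILEUSERNAME}\t%{FILEGROUPNAME}\t%{=NAME}\n]'", keeping only the lines whose mode starts with "d". Checking a whole repository would need the same query run against the package files instead. Pairs of packages that explicitly conflict with each other still show up and have to be filtered out by hand.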
Comment 21 Luc Menut 2013-01-28 22:24:13 CET
Created attachment 3451
list of directories with different sets of permissions in cauldron (2013-01-28)
Comment 22 David Walser 2013-01-29 00:01:13 CET
Luc, Thierry isn't CC'd or assigned on this bug any longer, since he quite clearly didn't desire to continue to be involved on it.

Thanks for creating that script and the list of affected packages.  I was planning on bringing this issue up at the next packagers' meeting.  That list will be helpful.
David Walser 2013-01-29 21:24:59 CET

Blocks: (none) => 8016

David Walser 2013-02-12 23:17:22 CET

Depends on: (none) => 9055

Comment 23 David Walser 2013-02-12 23:25:33 CET
(In reply to comment #21)
> Created attachment 3451
> list of directories with different sets of permissions in cauldron (2013-01-28)

Progress has been made in getting those fixed.

The man page directory conflicts are Bug 9055.

The news server conflicts (inn, leafnode) have been reported to Remco.

The ftp server conflicts are OK because they explicitly conflict.

The fax server conflicts are OK because they explicitly conflict.

The others (freeswitch/monit, afbackup, scid/scidvspc) are fixed.
Comment 24 Malo Deniélou 2013-03-21 19:28:50 CET
Hi everyone! Could someone give an updated view of that bug?
Thierry sent a list of spec files affected. How many are fixed now? How many left?
Thanks!

CC: (none) => pierre-malo.denielou

Comment 25 David Walser 2013-03-26 20:51:06 CET
I believe the directory conflicts from the list in Comment 21 should be fixed now, so that issue has been worked around for now, as long as no new conflicts have been introduced since Luc ran his script from Comment 20.  We should run this check before every distro release in the future to make sure no new conflicts have crept in, at least as long as rpm maintains this behavior.  Any packages that script finds now should explicitly conflict with each other, which is OK (ftp, fax, and nntp servers).
Comment 26 Malo Deniélou 2013-04-15 18:05:28 CEST
So can I close this bug? Or is the current workaround only temporary?
Comment 27 David Walser 2013-04-15 20:33:59 CEST
Bug 9055 is fixed, but as far as the issues from this bug, we have only worked around them for now.

Really, this bug covers two different, but related, issues.

1) Conflicts caused by changing the type of a filename between regular file, directory, symlink, etc.  This was actually a good change upstream to catch the conflict before trying to install the updated package.  For years, the recommended way to handle a type change has been to use a %pretrans script, but we have simply failed to follow that in Mandriva/Mageia.  Now we have no choice.  It was unfortunate that this change was introduced so late in the Mageia 3 development cycle, as it made it much harder to catch packages that may need to be fixed.  The only one I found affected by this was apache, and I fixed that.  I reviewed most of the other packages that rm/rmdir in %pre scripts, removing many obsolete %pre scripts in the process.  I did not review any Java-related packages, of which there were several.  I would say that most likely there are not any other packages that are affected by this.

2) Conflicts caused by duplicate ownership of directories with different ownership and/or permissions.  This was a really stupid and not well-thought-out change upstream.  Before, rpm would just silently apply the ownership/permissions to an existing directory when installing a package containing a directory owned by another package.  That's less than ideal, but turning that into a "file conflict," which is something neither rpm nor the higher-level tools in the rpm world are designed to handle gracefully, was a more-trouble-than-it's-worth way of handling this minor issue.  Unless this change is reverted, we'll have to continue to check all of our packages for conflicting directory ownerships/permissions before each distro release to continue to work around the problem (and again that doesn't help for 3rd party packages or old no longer supported packages from previous distro versions that users may still have installed).

So the second problem is not "fixed," and I wouldn't be inclined to close this bug just yet for that reason, but as I've dealt with the issues related to that problem for the Mageia 3 package set, this bug does not need to be a release blocker for Mageia 3 any longer (but it should be set as a release blocker for each subsequent release to make sure this gets checked again and dealt with before each release).

Priority: release_blocker => Normal
Blocks: 8016 => (none)

Comment 28 Marja Van Waes 2015-04-06 23:00:39 CEST
(In reply to David Walser from comment #27)
> installed).
> 
> So the second problem is not "fixed," and I wouldn't be inclined to close
> this bug just yet for that reason, but as I've dealt with the issues related
> to that problem for the Mageia 3 package set, this bug does not need to be a
> release blocker for Mageia 3 any longer (but it should be set as a release
> blocker for each subsequent release to make sure this gets checked again and
> dealt with before each release).

Is this still valid for Mga 4 -> 5 upgrades?

CC: (none) => marja11

Comment 29 David Walser 2015-04-06 23:15:16 CEST
It will always be valid unless rpm ever fixes this nonsense.  I last checked our packages for conflicts a few months ago and there weren't any, so we're probably OK for this release.  I can check again soon if I remember.
Comment 30 David Walser 2015-04-08 19:42:49 CEST
Thanks for reminding me of this.  I ran the script again (on i586 core, since that's all I have mirrored) and found a conflict with openssl in freeswitch.  I fixed it in SVN and asked for a freeze push.
Comment 31 Samuel Verschelde 2016-10-15 21:43:09 CEST
(In reply to David Walser from comment #30)
> Thanks for reminding me of this.  I ran the script again (on i586 core,
> since that's all I have mirrored) and found a conflict with openssl in
> freeswitch.  I fixed it in SVN and asked for a freeze push.

Time to run the script for Mageia 6?
Comment 32 David Walser 2016-10-15 21:44:44 CEST
Yeah, I ran it earlier this year when it looked like we were heading for a release, but it's been long enough it needs to be run again.
Samuel Verschelde 2016-10-15 21:46:47 CEST

Assignee: bugsquad => pkg-bugs

Guillaume Rousse 2016-10-16 16:15:09 CEST

CC: guillomovitch => (none)

Comment 33 Marja Van Waes 2017-03-29 07:00:40 CEST
(In reply to David Walser from comment #32)
> Yeah, I ran it earlier this year when it looked like we were heading for a
> release, but it's been long enough it needs to be run again.

Care to do it?
Comment 34 David Walser 2017-04-02 19:08:44 CEST
Thanks Marja for the reminder.  I just ran it and dropped a rogue nginx fork that someone imported and fixed a bug in prelude-lml-rules.  Hopefully we'll be good for Mageia 6.
Comment 35 Nicolas Lécureuil 2017-08-05 17:14:29 CEST
Can we close this bug report?

CC: (none) => mageia

Comment 36 David Walser 2017-08-05 17:18:20 CEST
(In reply to Nicolas Lécureuil from comment #35)
> Can we close this bug report?

No, not unless upstream backs out this ridiculous change, which they probably won't.  This bug serves as a reminder that we need to check for this before *every* Mageia release.  We also will need to be sure to check it again for Mageia 7 to ensure that changing the find-lang implementation doesn't cause any regressions involving this.
Comment 37 Nicolas Lécureuil 2017-08-05 17:25:30 CEST
Just to make sure, is there a script to run?
Comment 38 David Walser 2017-08-05 17:26:30 CEST
(In reply to Nicolas Lécureuil from comment #37)
> Just to make sure, is there a script to run?

Yes, Luc's script, second attachment to this bug (see the list at the top).
