Bug 10025 - bind: upgrading from Mageia 2 to Mageia 3 results in completely broken configuration
Status: RESOLVED FIXED
Alias: None
Product: Mageia
Classification: Unclassified
Component: RPM Packages
Version: Cauldron
Hardware: i586 Linux
Priority: Normal
Severity: normal
Target Milestone: ---
Assignee: Guillaume Rousse
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 8016
Reported: 2013-05-08 16:03 CEST by David Walser
Modified: 2013-05-09 16:30 CEST
CC List: 2 users

See Also:
Source RPM: bind-9.9.2.P2-1.mga3.src.rpm
CVE:
Status comment:


Description David Walser 2013-05-08 16:03:24 CEST
If you configure a bind server and upgrade to Mageia 3, the service fails to start.

You'll see that your named.conf has been replaced with the default from the package, but that one doesn't even work, as it can't find the zone files it's looking for (as journalctl -b -u named.service will show you).  The zone files that it's looking for are installed in /var/named (not /var/lib/named/var/named as before).  It's still looking in /var/lib/named/var/named, as it still runs chrooted.  If I do:
mount --bind /var/named /var/lib/named/var/named

the service starts.  Stopping the service killed that bind mount, so it looks like there's logic to tear it down, but not set it up.  If this is going to be totally correct, it (at least at package upgrade time) should move the existing files in /var/lib/named/var/named to /var/named before bind mounting over it.

Actually, looking at the setup-named-chroot.sh script, I see the logic is there to do that bind mount, but it doesn't do it if the target exists and isn't empty, so just moving the existing files in /var/lib/named/var/named to /var/named during package upgrade should fix this.  Once I moved the files manually, the service did start.
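
The behaviour described above (skip the bind mount when the target exists and isn't empty) can be pictured roughly as follows. This is only an illustrative sketch, not the contents of setup-named-chroot.sh; the function names dir_is_empty and plan_bind_mount are hypothetical, and the sketch prints what it would do rather than mounting anything, so it can be run without root.

```shell
#!/bin/sh
# Hypothetical helpers, NOT taken from setup-named-chroot.sh.

# Succeeds if the directory is missing or contains no entries at all.
dir_is_empty() {
    [ -d "$1" ] || return 0
    [ -z "$(ls -A "$1" 2>/dev/null)" ]
}

# Prints the action that would be taken for one source/target pair.
plan_bind_mount() {
    src=$1
    dst=$2
    if dir_is_empty "$dst"; then
        echo "mount --bind $src $dst"
    else
        # Leftover zone files from the old layout end up in this branch.
        echo "skip $dst (not empty)"
    fi
}
```

With leftover Mageia 2 zone files in the chroot, a check of this kind would take the "skip" branch, which matches the symptom reported here.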

Now that I've started and stopped it a few times, I also see this in mount, which really looks wrong:
/dev/sda5 on /var/lib/named/usr/lib/bind type ext4 (ro,relatime,data=ordered)
/dev/sda5 on /var/lib/named/usr/lib/bind type ext4 (ro,relatime,data=ordered)
/dev/sda5 on /usr/lib/bind type ext4 (ro,relatime,data=ordered)

umounting /var/lib/named/usr/lib/bind twice got rid of all three of those.
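
A quick way to spot stacked mounts like these is to count how often a mount point appears in the mount output. The helper below is a hypothetical sketch (count_mounts is not part of the package); it reads mount-style lines on stdin and assumes paths without spaces.

```shell
#!/bin/sh
# Hypothetical helper: count entries for one mount point in `mount` output.
# Expects lines of the form "<device> on <mountpoint> type <fs> (<opts>)".
count_mounts() {
    awk -v target="$1" '$2 == "on" && $3 == target { n++ } END { print n + 0 }'
}
```

For example, `mount | count_mounts /var/lib/named/usr/lib/bind` printing anything above 1 would indicate that the mount has been stacked.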

If the service fails to start, or if it does start and you stop it, the other bind mounts are left in place as well.  They should be umounted.  Hmm, well now that I've fixed it to start normally, this seems to be working too.  I guess if it fails to start, systemd doesn't run the ExecStopPost.

Speaking of bind mounts, the package is making heavy use of them now (bind mounting a few named configuration files and a couple directories) but it's very confusing, as the mount output shows the device the source file/directory is on rather than the source file/directory itself.
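
One way around the confusing mount output is to read /proc/self/mountinfo instead, where the fourth field is the path inside the source filesystem and the fifth is the mount point. The sketch below uses a hypothetical helper name (list_bind_sources) and a simple heuristic: any entry whose root field is not "/" is reported. That also catches things like filesystem subvolumes and misses bind mounts of a whole filesystem, so treat it as a rough aid only.

```shell
#!/bin/sh
# Hypothetical helper: print "<mountpoint> <- <source path>" for every
# mountinfo entry that exposes a subdirectory or file of a filesystem.
# Field 4 = root of the mount within the filesystem, field 5 = mount point.
list_bind_sources() {
    awk '$4 != "/" { print $5 " <- " $4 }'
}
```

Typical use would be `list_bind_sources < /proc/self/mountinfo`.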

Also, after the upgrade, the /var/lib/named/usr/lib/openssl-1.0.0* directories are left behind; those could be removed on upgrade.

I see a comment in /lib/systemd/system/named.service at the top about rsyslog, but the package includes an /etc/rsyslog.d/named.conf file that takes care of this, so that comment should be removed.

Reproducible: 

Steps to Reproduce:
David Walser 2013-05-08 16:03:49 CEST

Blocks: (none) => 8016
CC: (none) => mageia

Comment 1 Colin Guthrie 2013-05-08 16:56:50 CEST
I ran into similar issues. If the service fails to start then the ExecStop will not run and the bind mounts are not unbound. A (hacky) fix would be to also call the script in ExecStartPre with a - to ignore return and to make it unmount everything. That way even if you try and start it several times and it fails, it should make a clean state before trying again. It's not ideal tho' as it leaves the binds in place while you are trying to debug any problems.
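
A rough sketch of the "make a clean state first" idea, assuming the cleanup is driven by the current mount table. The function name plan_chroot_cleanup is hypothetical and is not what the packaged script does; it only prints umount commands (deepest paths first) for everything mounted below a given chroot prefix, reading mount-style lines on stdin.

```shell
#!/bin/sh
# Hypothetical helper: list umount commands for every mount below a prefix.
# Reads `mount`-style lines ("<dev> on <mountpoint> type ...") on stdin.
# Stacked mounts appear once per layer, so each layer gets its own umount.
plan_chroot_cleanup() {
    prefix=$1
    awk -v p="$prefix/" '$2 == "on" && index($3, p) == 1 { print "umount " $3 }' |
        LC_ALL=C sort -r   # reverse order so children come before parents
}
```

Something like `mount | plan_chroot_cleanup /var/lib/named` could be reviewed by hand before running any of the printed commands.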

Regarding the moving of the files I think we have to be careful here too to not accidentally overwrite things. AL13N says he just edits the files in place so this would confuse him.

Overall it's a bit messy and I think we should instead try to use the systemd isolation stuff and get rid of this complex structure.


e.g. from systemd.exec(5):

       ReadWriteDirectories=, ReadOnlyDirectories=,
       InaccessibleDirectories=
           Sets up a new file-system name space for executed processes.
           These options may be used to limit access a process might
           have to the main file-system hierarchy. Each setting takes a
           space-separated list of absolute directory paths. Directories
           listed in ReadWriteDirectories= are accessible from within
           the namespace with the same access rights as from outside.
           Directories listed in ReadOnlyDirectories= are accessible for
           reading only, writing will be refused even if the usual file
           access controls would permit this. Directories listed in
           InaccessibleDirectories= will be made inaccessible for
           processes inside the namespace. Note that restricting access
           with these options does not extend to submounts of a
           directory. You must list submounts separately in these
           settings to ensure the same limited access. These options may
           be specified more than once in which case all directories
           listed will have limited access from within the namespace.
Comment 2 David Walser 2013-05-08 23:13:53 CEST
If we can't fix this properly in time, couldn't we do the following as a short-term fix?  BTW, I don't get your comment about being careful about moving and accidentally overwriting things.  That shouldn't happen.  Anyway, I don't know if %pre or %post would be the right place or if it has to be before %_post_service, but:

mv -f %{chroot_prefix}/var/named/* /var/named/
rm -rf %{chroot_prefix}%{_libdir}/openssl-*
Comment 3 AL13N 2013-05-08 23:43:19 CEST
in mga2, i'm editing files and dnssec keys in %{chroot_prefix}/var/named/* . on upgrade it would be nice to not lose my stuff :-).

about using the isolation system, do you mean to use this without actually chrooting? if you do, i think you'll have some angry named sysadmins... :-) security is quite paramount for them... if the named is hacked, it must not get out of the chroot...

CC: (none) => alien

Comment 4 AL13N 2013-05-08 23:46:38 CEST
btw: i'm unsure, but if the openssl-* file is a cnf file; then this could be used to have some settings for generating ssl certificates if one would need it for ssl on bind. not sure, but should check with someone who actually uses this...
Comment 5 David Walser 2013-05-08 23:59:49 CEST
OK, the commands from Comment 2 are protected with an if [ "$1" -gt 1 ] and are in %post before everything else.  Freeze push requested.  This should fix this.
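
For illustration, the upgrade-only guard could look roughly like the sketch below. This is not the actual %post scriptlet from the package: migrate_zone_files is a hypothetical function, the directories are passed as arguments instead of RPM macros, and the per-file loop (which avoids a failing mv when the old directory is empty) is an addition of this sketch.

```shell
#!/bin/sh
# Hypothetical stand-in for the %post logic. In a real scriptlet "$1" is the
# number of installed instances of the package, which is > 1 on upgrade.
migrate_zone_files() {
    count=$1
    old_dir=$2   # e.g. the chroot copy of /var/named
    new_dir=$3   # e.g. /var/named
    # Fresh install: nothing to migrate.
    [ "$count" -gt 1 ] || return 0
    [ -d "$old_dir" ] || return 0
    mkdir -p "$new_dir"
    for f in "$old_dir"/*; do
        # Skip the unexpanded pattern when the directory is empty.
        [ -e "$f" ] || [ -L "$f" ] || continue
        mv -f "$f" "$new_dir"/
    done
}
```

Note that mv -f replaces files of the same name in the target, and the * pattern does not match dotfiles, the same as the commands in Comment 2.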
Comment 6 Colin Guthrie 2013-05-09 00:04:13 CEST
(In reply to AL13N from comment #3)
> about using the isolation system, do you mean to use this without actually
> chrooting? if you do, i think you'll have some angry named sysadmins... :-)
> security is quite paramount for them... if the named is hacked, it must not
> get out of the chroot...

Short answer: chroots aren't that great for security. We can do better, perhaps using file system namespaces.

Long answer: http://0pointer.de/blog/projects/changing-roots.html

Might not be the answer as it might still expose too much.
Comment 7 David Walser 2013-05-09 00:05:43 CEST
(In reply to Colin Guthrie from comment #6)
> Short answer: chroots aren't that great for security. We can do better,
> perhaps using file system namespaces.
> 
> Long answer: http://0pointer.de/blog/projects/changing-roots.html
> 
> Might not be the answer as it might still expose too much.

Or maybe containers...kinda like Solaris zones.
Comment 8 AL13N 2013-05-09 00:10:45 CEST
well, i'm just saying that these bind admins are freakishly paranoid and don't like to deviate from their known practices (except for very good reason) and they've been using chroots for a while now...

iirc i've read something like this in the bind system administration manual
Comment 9 David Walser 2013-05-09 16:30:34 CEST
Fixed in bind-9.9.2.P2-2.mga3.

That startup/shutdown script it uses really needs to be made more robust though.

Status: NEW => RESOLVED
Resolution: (none) => FIXED
