Bug 9921 - Should set SystemMaxUse to a sane default in /etc/systemd/journald.conf to prevent huge hd usage
Status: RESOLVED FIXED
Alias: None
Product: Mageia
Classification: Unclassified
Component: RPM Packages
Version: Cauldron
Hardware: All Linux
Priority: Normal
Severity: normal
Target Milestone: ---
Assignee: Colin Guthrie
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-04-29 20:57 CEST by Philippe Leblanc
Modified: 2013-05-10 01:12 CEST

See Also:
Source RPM: systemd
CVE:
Status comment:


Description Philippe Leblanc 2013-04-29 20:57:05 CEST
Description of problem:
I discovered that the systemd journal was using a huge amount of disk space. My root partition had filled up to 9.3 GB (84% occupancy), and I tracked the problem down to systemd's journaling. Setting SystemMaxUse=50M instantly freed up 5.3 GB, so I'm now down to 4.0 GB used. I think this value should be set to a sane default; otherwise new users who are unaware of how full their root partition is may end up with an unstable system due to lack of free space on root. Reading around the web, 50M seems to be a good default choice.
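
For illustration, the setting mentioned above goes in the [Journal] section of /etc/systemd/journald.conf. The sketch below is a hypothetical helper, not part of systemd: it reads such a fragment with Python's configparser and converts the size suffix to bytes. It assumes the K, M, G, T, P, E suffixes are powers of 1024; the names `SAMPLE_CONF`, `parse_size` and `read_system_max_use` are made up for this example.

```python
import configparser

# Hypothetical fragment mirroring the setting from the report.
SAMPLE_CONF = """\
[Journal]
SystemMaxUse=50M
"""

# Assumption: suffixes are binary multiples (K = 1024 bytes, M = 1024**2, ...).
_SUFFIXES = {s: 1024 ** (i + 1) for i, s in enumerate("KMGTPE")}


def parse_size(text):
    """Convert a value such as '50M' or '4096' to a number of bytes."""
    text = text.strip()
    if text and text[-1].upper() in _SUFFIXES:
        return int(text[:-1]) * _SUFFIXES[text[-1].upper()]
    return int(text)


def read_system_max_use(conf_text):
    """Return SystemMaxUse in bytes, or None if the option is not set."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names case-sensitive
    parser.read_string(conf_text)
    value = parser.get("Journal", "SystemMaxUse", fallback=None)
    return parse_size(value) if value is not None else None


if __name__ == "__main__":
    print(read_system_max_use(SAMPLE_CONF))
```

A missing option yields None here, which stands in for "fall back to the built-in default".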


Sander Lepik 2013-04-29 21:02:24 CEST

CC: (none) => sander.lepik
Assignee: bugsquad => mageia
Source RPM: systemd-44-13.mga2.src.rpm => (none)

Comment 1 Colin Guthrie 2013-04-30 11:26:11 CEST
Do the limits regarding free space on the partition not work? When the filesystem gets full the journal should be purging data to keep a certain percentage of the disk free for other uses.

Ultimately the intention of the code is to allow as much logging as disk space allows but when the disk is full, make way for other data as needed.

From the man page:

SystemMaxUse=, SystemKeepFree=, SystemMaxFileSize=, RuntimeMaxUse=,
       RuntimeKeepFree=, RuntimeMaxFileSize=
           Enforce size limits on the journal files stored. The options
           prefixed with System apply to the journal files when stored on a
           persistent file system, more specifically /var/log/journal. The
           options prefixed with Runtime apply to the journal files when
           stored on a volatile in-memory file system, more specifically
           /run/log/journal. The former is used only when /var is mounted,
           writable and the directory /var/log/journal exists. Otherwise only
           the latter applies. Note that this means that during early boot and
           if the administrator disabled persistent logging only the latter
           options apply, while the former apply if persistent logging is
           enabled and the system is fully booted up.  SystemMaxUse= and
           RuntimeMaxUse= control how much disk space the journal may use up
           at maximum. Defaults to 10% of the size of the respective file
           system.  SystemKeepFree= and RuntimeKeepFree= control how much disk
           space the journal shall always leave free for other uses if less
           than the disk space configured in SystemMaxUse= and RuntimeMaxUse=
           is available. Defaults to 5% of the size of the respective file
           system.  SystemMaxFileSize= and RuntimeMaxFileSize= control how
           large individual journal files may grow at maximum. This influences
           the granularity in which disk space is made available through
           rotation, i.e. deletion of historic data. Defaults to one eighth of
           the values configured with SystemMaxUse= and RuntimeMaxUse=, so
           that usually seven rotated journal files are kept as history.
           Specify values in bytes or use K, M, G, T, P, E as units for the
           specified sizes. Note that size limits are enforced synchronously
           to journal files as they are extended, and need no explicit
           rotation step triggered by time.
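
As a rough illustration of the defaults described in the man page excerpt above, the following sketch computes the three values from a filesystem size. It follows only the quoted text (10% for the maximum use, 5% kept free, one eighth of the maximum use per file); the real journald code may differ, and the function names are hypothetical.

```python
import os


def filesystem_size(path="/"):
    """Total size in bytes of the filesystem that contains path."""
    st = os.statvfs(path)
    return st.f_frsize * st.f_blocks


def journal_defaults(fs_size_bytes):
    """Defaults as described in the quoted man page text (an approximation)."""
    max_use = fs_size_bytes // 10      # SystemMaxUse: 10% of the filesystem
    keep_free = fs_size_bytes // 20    # SystemKeepFree: 5% of the filesystem
    max_file_size = max_use // 8       # SystemMaxFileSize: 1/8 of SystemMaxUse
    return {
        "SystemMaxUse": max_use,
        "SystemKeepFree": keep_free,
        "SystemMaxFileSize": max_file_size,
    }


if __name__ == "__main__":
    for name, value in journal_defaults(filesystem_size("/")).items():
        print(f"{name}={value}")
```

Under these assumptions a 12 GB filesystem would give a 1.2 GB journal limit split into files of about 150 MB each.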

I want to avoid setting up default configuration files in a custom way if at all possible.

Can you make a good argument as to why the upstream defaults shouldn't be kept? If your argument is good enough then perhaps we should be having this discussion upstream rather than something we do specifically in Mageia.
Colin Guthrie 2013-04-30 11:26:32 CEST

Component: New RPM package request => RPM Packages
Source RPM: (none) => systemd

Comment 2 Thomas Backlund 2013-04-30 12:16:21 CEST
I think SystemMaxUse= and RuntimeMaxUse= defaulting to 5% is way too low, as many filesystems start dropping performance at around the 85% mark, so maybe 15% would be a saner default...

(There is another debate about some SSDs dropping performance around the 50% mark, but that is another issue.)

CC: (none) => tmb

Comment 3 Philippe Leblanc 2013-04-30 17:27:23 CEST
(In reply to Colin Guthrie from comment #1)
 
> Can you make a good argument as to why the upstream defaults shouldn't be
> kept? If your argument is good enough then perhaps we should be having this
> discussion upstream rather than something we do specifically in Mageia.

Looks like the defaults are set to use up to 10% of disk space. Is this 10% of the root partition or 10% of the entire filesystem assigned to Mageia? Mine was up to 5.3 GB, which exceeds 10% of my root partition (that would be 1.2 GB). As Thomas pointed out, you probably shouldn't fill your root to more than ~85% for performance reasons. Are there safeguards to prevent journald from growing beyond this point? I've seen runaway log files before on a cluster, which caused it to crash frequently. I saw my filesystem and thought something similar was going on. I was unaware this was by design with smart logs.

Essentially, it boils down to this. If no instability or performance degradation can result from the default configuration, which allows journald to grow as much as it can, then I'm completely fine sticking with the defaults. The one thing I would recommend is to mention this behaviour in the release notes. I was totally baffled by the huge disk usage, and I'm sure many other users will be too. One last question: as the logs can shrink and grow to accommodate the filesystem usage, is there any way to find out how much disk space is being used, not counting the logs?
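
One possible way to approach that last question is sketched below: add up the sizes of the files under /var/log/journal and subtract that from the used space of the filesystem. This is a rough sketch with hypothetical function names; it assumes the journal directory lives on the filesystem being measured, and it counts apparent file sizes rather than allocated blocks, so the result is only an estimate.

```python
import os
import shutil


def tree_size(root):
    """Sum the apparent sizes of all non-directory entries below root.

    Returns 0 if root does not exist.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                pass  # file vanished (e.g. rotated away) while walking
    return total


def usage_without_journal(mount="/", journal_dir="/var/log/journal"):
    """Return (used bytes excluding the journal, journal bytes) for mount.

    Assumption: journal_dir is stored on the filesystem mounted at mount.
    """
    used = shutil.disk_usage(mount).used
    journal = tree_size(journal_dir)
    return used - journal, journal


if __name__ == "__main__":
    rest, journal = usage_without_journal()
    print(f"journal: {journal} bytes, everything else: {rest} bytes")
```

Reading /var/log/journal usually requires root or membership in a suitable group, so run as an unprivileged user the sketch may undercount.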
Comment 5 Thomas Backlund 2013-05-03 17:57:48 CEST
Yeah, looks good, go ahead and commit it and we'll push it for mga3
Comment 6 Philippe Leblanc 2013-05-03 20:01:49 CEST
This patch should help a bit. I'm currently up to 90% occupied (1.2 GB free of 12 GB total). I've been unable to install the latest TeX update, which is a rather large package. This is the error message (translated from French):

"1 installation transaction failed

An error occurred while installing the packages:

installing package texlive-texmf-20120701-4.mga3.noarch needs 127MB on the / filesystem"

It's complaining that the package needs 127 MB of free space. I have it, but maybe it's looking for contiguous space? Is the space occupied by the journal interfering with this operation?
Comment 7 Colin Guthrie 2013-05-03 22:00:31 CEST
OK, committed. Feel free to submit when convenient.

I'm not sure about the freespace calculation but it's certainly something to bring up on a different bug I think.

Status: NEW => RESOLVED
Resolution: (none) => FIXED

Comment 8 Morgan Leijström 2013-05-10 00:26:07 CEST
Thank you for bringing this up

I want to add that the behaviour also depends on whether /var is a separate partition or is part of the / filesystem.
I have a machine with a 1 GB /var partition; 15% is maybe a bit too little there. Another has /var in a 25 GB / partition, where 15% is way too much.

I think that it should both be limited in size (like 300 MB), so that it does not grow insanely large on a large partition (even if there is plenty of space), and also shrink when /var/log is on a small partition.

As said, it is not an exact science, but I believe the now chosen values are OK.
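
A minimal sketch of that idea, assuming the limit is simply the smaller of a percentage of the filesystem and a fixed cap. The 300 MB cap comes from the suggestion above; the 10% share and the function name are only assumptions for illustration, not what journald or the Mageia patch actually does.

```python
MB = 1000 ** 2
GB = 1000 ** 3


def capped_journal_limit(fs_size_bytes, percent=10, absolute_cap=300 * MB):
    """Smaller of percent% of the filesystem and a fixed absolute cap."""
    share = fs_size_bytes * percent // 100  # integer maths, no float rounding
    return min(share, absolute_cap)


if __name__ == "__main__":
    for size in (1 * GB, 25 * GB):
        print(size, capped_journal_limit(size))
```

With these numbers a 1 GB /var would get a 100 MB journal, while a 25 GB / would be held to 300 MB.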

CC: (none) => fri

Comment 9 Colin Guthrie 2013-05-10 00:34:22 CEST
(In reply to Morgan Leijström from comment #8)
> As said, it is not an exact science, but I believe the now chosen values are
> OK.

Yeah, that's exactly it - not an exact science. These are just the "sensible" defaults after all. Sysadmins should tweak them to suit their needs (as they would have done in the past with their log rotation/retention policy).
Comment 10 Philippe Leblanc 2013-05-10 01:12:36 CEST
I ended up uninstalling the texmf package and installing the new version, which seemed to work and got around the space issue. I don't know if the error above was related to the high occupancy of my root partition. It's currently at 86% since the patch you submitted above. As you suggested, Colin, if I run into another problem updating particularly large packages, I will open another bug.
