After Apache has been running for some time, systemctl stop httpd hangs forever. Maybe it is not a good idea to use SIGWINCH here, which is supposed to stop httpd gracefully. A hanging httpd (which is in fact what we have) is worse than losing a few requests and serving quickly again. By the way, can we add apachectl -t before startup/restart? If you change the config and do a restart, it fails when the config is not correct, so you have downtime. It would be nice to check the config before killing httpd and trying to restart it.
Hi, thanks for reporting this. Does this prevent Apache from running properly again? If you have not modified Apache's configuration while it has been running for a long time, it should start again. The WINCH signal can be sent with:
  # apachectl -k graceful-stop
This command automatically checks the configuration files as in configtest before initiating the restart to make sure Apache doesn't die. Meanwhile, /usr/lib/systemd/system/httpd.service on M7 is:
  [Unit]
  Description=The Apache HTTP Server
  After=network.target remote-fs.target nss-lookup.target
  [Service]
  Type=notify
  Environment=LANG=C
  EnvironmentFile=-/etc/sysconfig/httpd
  ExecStart=/usr/sbin/httpd $OPTIONS -DFOREGROUND
  ExecReload=/usr/sbin/httpd $OPTIONS -k graceful
  # Send SIGWINCH for graceful stop
  KillSignal=SIGWINCH
  KillMode=mixed
  [Install]
  WantedBy=multi-user.target
I don't see an error here: it must send SIGWINCH, and it has "ExecReload=/usr/sbin/httpd $OPTIONS -k graceful". And this is like running apachectl -t, according to the apachectl manpage:
  apachectl graceful
    Gracefully restarts the Apache httpd daemon. [...] This command automatically checks the configuration files as in configtest before initiating the restart to make sure Apache doesn't die. This is equivalent to apachectl -k graceful.
Assigning to the registered maintainer for advice. (Please set the status to 'assigned' if you are working on it)
Keywords: (none) => Triaged
Summary: apache hangs on stop/restart => Apache hangs on systemctl stop/restart httpd.service
Assignee: bugsquad => shlomif
Our upstream is here: https://src.fedoraproject.org/rpms/httpd/blob/master/f/httpd.service We don't have the httpd-init because we do that in package scriptlets instead. I definitely think the graceful/WINCH stuff is correct; the default assumed behavior shouldn't be that it hangs. If you experience that on your own system, kill it manually. I don't think forcing it to check the config every time is necessary either. If you change the config, you can run the check yourself.
@David: most services have disabled graceful features, and there are good reasons for it:
1. While the server may still finish serving a request during a restart, it is completely unresponsive to new requests.
2. With h2 or keep-alive enabled, Apache ends up in a state where it waits for a long time compared to the time the restart itself takes.
3. Initiating a reboot (e.g. remotely) will hang the server until Apache finally gets killed.
For the config check (which is not very expensive), I'm quite sure we had this feature before we migrated to systemd, but I'm too lazy to check that in svn, and I think it does not matter anyway. For a reload it is not nice to have the server "crash" just by reloading the config. It would be more polite to say "reload is not possible since there is an error in your config". The usual workflow is to change a setting, think (be honest) that you got it right because it was just a small change, and try to apply it with systemctl reload httpd. Then the server has crashed (because you forgot to check the syntax first) and now you are in a hurry, as the server should be running and you don't have the time to really search for the error. Software should prevent us from making mistakes.
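For illustration, the "check before you restart" workflow described above could be wrapped roughly like this. It is only a sketch: safe_restart, CHECK_CMD and RESTART_CMD are hypothetical names made up for this example (they are not part of the httpd package), and it assumes apachectl -t returns a non-zero exit status on a broken config.

```shell
#!/bin/sh
# Sketch: only restart httpd when the configuration test passes.
# CHECK_CMD and RESTART_CMD are hypothetical override hooks so the logic
# can be tried out without touching a real Apache instance.
CHECK_CMD="${CHECK_CMD:-apachectl -t}"
RESTART_CMD="${RESTART_CMD:-systemctl restart httpd.service}"

safe_restart() {
    # The command strings are left unquoted on purpose so that
    # "apachectl -t" is split into a command and its argument.
    if $CHECK_CMD; then
        $RESTART_CMD
    else
        echo "config test failed; httpd was left running" >&2
        return 1
    fi
}
```

After sourcing the file, running safe_restart as root would leave the running server alone whenever the syntax check fails, instead of stopping it first and then failing to start.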
FYI: apache reload always works as expected, but restart or stop always hangs, so I assume KillSignal or KillMode is the problem. I'm not sure sending SIGWINCH to all apache processes is correct... If I read the systemd.kill man page correctly:
  "If set to mixed, the SIGTERM signal (see below: KillSignal=) is sent to the main process while the subsequent SIGKILL signal (see below: FinalKillSignal=) is sent to all remaining processes of the unit's control group."
This means all workers receive FinalKillSignal (SIGKILL if not set), and the main process receives KillSignal, which in our case is set to SIGWINCH. According to https://httpd.apache.org/docs/2.4/en/stopping.html, SIGWINCH should only be sent to the parent process and no signal should be passed to the children.
  "The graceful-stop signal allows you to run multiple identically configured instances of httpd at the same time. This is a powerful feature when performing graceful upgrades of httpd, however it can also cause deadlocks and race conditions with some configurations."
At the very least, each service using graceful stop should always set TimeoutStopSec=; this will prevent services from hanging "forever" on stop. I think this should not exceed 30s! This will e.g. help prevent a shutdown from ending in a deadlock or waiting forever for httpd to stop.
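A local admin could try the TimeoutStopSec= suggestion without touching the packaged unit by adding a drop-in. The following is a minimal sketch: write_stop_timeout and the file name stop-timeout.conf are hypothetical, and 30 is just the upper bound proposed above, not a tested recommendation.

```shell
#!/bin/sh
# Sketch: write a systemd drop-in that caps how long "stop" may take.
# Usage: write_stop_timeout DIR [SECONDS]   (hypothetical helper name)
write_stop_timeout() {
    dir="$1"
    secs="${2:-30}"
    mkdir -p "$dir" || return 1
    cat > "$dir/stop-timeout.conf" <<EOF
[Service]
# Upper bound on how long systemd waits for the graceful stop
# before it gives up and kills what is left of the unit.
TimeoutStopSec=$secs
EOF
}
```

Run as root, something like write_stop_timeout /etc/systemd/system/httpd.service.d 30 followed by systemctl daemon-reload should apply it; systemctl show httpd.service -p TimeoutStopUSec can be used to check which value is in effect.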
Just fyi, I cannot recreate the problem on my Mageia 7 x86_64 install ...
  [root@x3 ~]# systemctl status httpd.service
  ● httpd.service - The Apache HTTP Server
     Loaded: loaded (/usr/lib/systemd/system/httpd.service; disabled; vendor preset: disabled)
     Active: active (running) since Mon 2020-11-02 15:21:56 EST; 4 days ago
   Main PID: 1533 (/usr/sbin/httpd)
     Status: "Total requests: 149; Idle/Busy workers 100/0;Requests/sec: 0.00043; Bytes served/sec: 0 B/sec"
      Tasks: 9 (limit: 4915)
     Memory: 30.3M
     CGroup: /system.slice/httpd.service
             ├─ 1533 /usr/sbin/httpd -DFOREGROUND
             ├─ 1567 /usr/sbin/httpd -DFOREGROUND
             ├─ 1568 /usr/sbin/httpd -DFOREGROUND
             ├─ 1569 /usr/sbin/httpd -DFOREGROUND
             ├─ 1570 /usr/sbin/httpd -DFOREGROUND
             ├─ 1571 /usr/sbin/httpd -DFOREGROUND
             ├─ 1572 /usr/sbin/httpd -DFOREGROUND
             ├─ 3278 /usr/sbin/httpd -DFOREGROUND
             └─31295 /usr/sbin/httpd -DFOREGROUND
  Nov 02 15:21:55 x3.hodgins.homeip.net systemd[1]: Starting The Apache HTTP Server...
  Nov 02 15:21:56 x3.hodgins.homeip.net systemd[1]: Started The Apache HTTP Server.
  [root@x3 ~]# systemctl stop httpd.service
  [root@x3 ~]# systemctl status httpd.service
  ● httpd.service - The Apache HTTP Server
     Loaded: loaded (/usr/lib/systemd/system/httpd.service; disabled; vendor preset: disabled)
     Active: inactive (dead)
  Nov 02 15:21:55 x3.hodgins.homeip.net systemd[1]: Starting The Apache HTTP Server...
  Nov 02 15:21:56 x3.hodgins.homeip.net systemd[1]: Started The Apache HTTP Server.
  Nov 06 15:32:47 x3.hodgins.homeip.net systemd[1]: Stopping The Apache HTTP Server...
  Nov 06 15:32:48 x3.hodgins.homeip.net systemd[1]: httpd.service: Succeeded.
  Nov 06 15:32:48 x3.hodgins.homeip.net systemd[1]: Stopped The Apache HTTP Server.
CC: (none) => davidwhodgins
@Dave: thanks for your reply. I think you are running a test environment. I run a real-world server, configured with php-fpm (using mod_proxy). Yesterday, while reading up for this bug, I stumbled across https://httpd.apache.org/docs/2.4/en/mod/mpm_common.html#gracefulshutdowntimeout Setting this to e.g. 2 in the Apache config seems to solve the issue. The default is to wait forever (which is what I was seeing). You can still see the effect:
  [root@borachio ~]# systemctl status httpd.service
  ● httpd.service - The Apache HTTP Server
     Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled; vendor preset: disabled)
    Drop-In: /etc/systemd/system/httpd.service.d
             └─mount.conf
     Active: active (running) since Fri 2020-10-16 18:54:18 CEST; 3 weeks 0 days ago
    Process: 25437 ExecReload=/usr/sbin/httpd $OPTIONS -k graceful (code=exited, status=0/SUCCESS)
   Main PID: 3768 (httpd)
     Status: "Total requests: 2763257; Idle/Busy workers 98/2;Requests/sec: 1.47; Bytes served/sec: 21KB/sec"
      Tasks: 257 (limit: 4915)
     Memory: 1.9G
     CGroup: /system.slice/httpd.service
             ├─ 3768 /usr/sbin/httpd -DFOREGROUND
             ├─25439 /usr/sbin/httpd -DFOREGROUND
             ├─25440 /usr/sbin/httpd -DFOREGROUND
             ├─25447 /usr/sbin/httpd -DFOREGROUND
             └─25635 /usr/sbin/httpd -DFOREGROUND
  Nov 05 05:37:27 borachio.domain.de systemd[1]: Reloading The Apache HTTP Server.
  Nov 05 05:37:27 borachio.domain.de systemd[1]: Reloaded The Apache HTTP Server.
  Nov 05 15:33:40 borachio.domain.de systemd[1]: Reloading The Apache HTTP Server.
  Nov 05 15:33:40 borachio.domain.de systemd[1]: Reloaded The Apache HTTP Server.
  [root@borachio ~]# ps aux|grep httpd
  root      3768  0.0  0.0   55584  50276 ?      Ss   Okt16   1:17 /usr/sbin/httpd -DFOREGROUND
  root     22010  0.0  0.0   22640    760 pts/0  S+   11:13   0:00 grep --color httpd
  apache   25439  0.0  0.1 4709000 186992 ?      Sl   Nov06   0:34 /usr/sbin/httpd -DFOREGROUND
  apache   25440  0.0  0.1 4709000 173872 ?      Sl   Nov06   0:33 /usr/sbin/httpd -DFOREGROUND
  apache   25447  0.0  0.1 4446856 176656 ?      Sl   Nov06   0:41 /usr/sbin/httpd -DFOREGROUND
  apache   25635  0.1  0.2 4709000 272796 ?      Sl   Nov06   1:33 /usr/sbin/httpd -DFOREGROUND
  [root@borachio ~]# systemctl stop httpd.service    <------- hangs for some time
The other console shows this during the stop:
  [root@borachio ~]# ps aux|grep httpd
  root      3768  0.0  0.0   55584  50276 ?      Ss   Okt16   1:17 /usr/sbin/httpd -DFOREGROUND
  root     22043  0.0  0.0   34168   5848 pts/0  S+   11:14   0:00 systemctl stop httpd.service
  root     22049  0.0  0.0   22640    760 pts/1  S+   11:14   0:00 grep --color httpd
  apache   25440  0.0  0.0       0      0 ?      Z    Nov06   0:33 [httpd] <defunct>
  apache   25447  0.0  0.0       0      0 ?      Z    Nov06   0:41 [httpd] <defunct>
  apache   25635  0.1  0.0       0      0 ?      Z    Nov06   1:33 [httpd] <defunct>
From my perspective I still suggest:
a) adding GracefulShutdownTimeout to the Apache config (documented, and with some default)
b) adding a realistic hard timeout to systemd services that try to do graceful stopping/restart, as these services may get stuck and the result is not desired
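Suggestion a) could be tried locally along these lines. This is only a sketch: write_graceful_timeout is a hypothetical helper name, and the target file must be somewhere your httpd.conf actually includes, which depends on the local layout. The value 2 is the one that appeared to help in the comment above.

```shell
#!/bin/sh
# Sketch: write an Apache config snippet that bounds the graceful stop.
# Usage: write_graceful_timeout FILE [SECONDS]   (hypothetical helper name)
write_graceful_timeout() {
    file="$1"
    secs="${2:-2}"
    cat > "$file" <<EOF
# Seconds httpd keeps handling existing connections after graceful-stop.
# 0 (the upstream default) means wait indefinitely.
GracefulShutdownTimeout $secs
EOF
}
```

After writing the snippet into an included config directory, run apachectl -t first and only then restart httpd, so that a typo in the new file does not take the server down.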
I'm somewhat against choosing a default other than 0 (wait forever), as there will be very slow systems where whatever we choose is not enough, while on fast systems it will be seen as too much. My preference would be to add or alter a urpmi.readme file in the appropriate package that explains how to set the value. On those systems that are affected, the sysadmin can choose a proper timeout.
Mageia 7 has been EOL since July 1st 2021. There will be no further bugfixes for this release. You are encouraged to upgrade to Mageia 8 as soon as possible. @reporter, if this bug still applies to Mageia 8, please let us know. @packager, if you work on the Mageia 7 version of your package, please check whether the issue is also present in the Mageia 8 package. In that case, please fix the Mageia 8 version instead. This bug report will be closed as OLD if there is no further notice by September 1st 2021.
Version: 7 => 8
already fixed
Resolution: (none) => OLD
Status: NEW => RESOLVED