Linux – How to prevent tons of Apache processes from spawning when I start Apache and killing the machine


I have a highly trafficked application on a single Debian machine, and Apache has started acting strangely.

Every time I start Apache, tons of Apache processes are spawned, the app doesn't load at all, and very quickly the whole machine freezes and must be power-cycled.

Here is the output of top immediately after starting Apache:

top - 20:14:44 up  1:16,  2 users,  load average: 0.48, 0.10, 0.03
Tasks: 330 total,   5 running, 325 sleeping,   0 stopped,   0 zombie
Cpu(s): 12.0%us, 21.4%sy,  0.0%ni, 65.7%id,  0.2%wa,  0.1%hi,  0.7%si,  0.0%st
Mem:   8179920k total,   404984k used,  7774936k free,    60716k buffers
Swap:  2097136k total,        0k used,  2097136k free,    43424k cached


10251 www-data  15   0  467m 8100 4016 S    6  0.1   0:00.04 apache2
10262 www-data  15   0  467m 8092 4012 S    6  0.1   0:00.05 apache2
10360 www-data  15   0  468m 8296 4016 S    6  0.1   0:00.05 apache2
10428 www-data  15   0  468m 8272 3992 S    6  0.1   0:00.05 apache2
10241 www-data  15   0  467m 8256 4012 S    4  0.1   0:00.03 apache2
10259 www-data  15   0  467m 8092 4012 S    4  0.1   0:00.04 apache2
10274 www-data  15   0  467m 8056 4012 S    4  0.1   0:00.03 apache2
10291 www-data  15   0  468m 8292 4012 S    4  0.1   0:00.03 apache2
10293 www-data  15   0  468m 8292 4012 S    4  0.1   0:00.03 apache2
10308 www-data  15   0  468m 8296 4016 S    4  0.1   0:00.02 apache2
10317 www-data  15   0  468m 8292 4012 S    4  0.1   0:00.02 apache2
10320 www-data  15   0  468m 8292 4012 S    4  0.1   0:00.04 apache2
10325 www-data  15   0  468m 8292 4012 S    4  0.1   0:00.04 apache2

And so forth, with many more apache2 processes.

Less than a minute later, as you can see below, the load has gone from 0.48 to 2.17 and the task count from 330 to 1850. If I do not stop Apache at this point, the load keeps rising until the machine dies, within a few minutes or less.

top - 20:15:34 up  1:17,  2 users,  load average: 2.17, 0.62, 0.21
Tasks: 1850 total,   5 running, 1845 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.3%us,  2.1%sy,  0.0%ni, 96.4%id,  0.0%wa,  0.1%hi,  1.0%si,  0.0%st
Mem:   8179920k total,  1938524k used,  6241396k free,    60860k buffers
Swap:  2097136k total,        0k used,  2097136k free,    44196k cached
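
As a rough check on the two snapshots above, the arithmetic can be sketched in Python. The figures are copied from the top output; the assumption that memory keeps growing at the same linear rate is only an illustration, not something the output proves.

```python
# Figures copied from the two top snapshots (memory in KiB, as top reports it).
first = {"time_s": 20 * 3600 + 14 * 60 + 44, "tasks": 330, "used_k": 404984}
second = {"time_s": 20 * 3600 + 15 * 60 + 34, "tasks": 1850, "used_k": 1938524}
free_k_at_second = 6241396

elapsed_s = second["time_s"] - first["time_s"]        # seconds between snapshots
new_tasks = second["tasks"] - first["tasks"]          # tasks created in that window
extra_used_k = second["used_k"] - first["used_k"]     # additional memory in use

k_per_new_task = extra_used_k / new_tasks             # average cost per new task
growth_k_per_s = extra_used_k / elapsed_s             # memory growth rate

# ASSUMPTION: growth stays linear. In practice it may speed up or level off.
seconds_until_ram_full = free_k_at_second / growth_k_per_s

print(f"{new_tasks} new tasks in {elapsed_s}s, about {k_per_new_task:.0f}k each")
print(f"free RAM would last roughly {seconds_until_ram_full:.0f}s at this rate")
```

Under that assumption the free RAM lasts only a few minutes, which matches how quickly the machine dies.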

We have a firewall where we whitelist the addresses we know are allowed to hit our site.

Any ideas about what the problem might be are very welcome.

Thanks!

Best Solution

You have probably made the mistake of configuring Apache to use far more RAM than the machine has. This is an easy mistake to make.

I am assuming you are using the prefork MPM with an in-process application server (such as PHP or mod_perl). In this model, worst-case memory use is MaxClients × the maximum memory your application uses per process. If you don't have nearly that much RAM, it's time to decrease one, the other, or both.

In the general case, this means decreasing MaxClients to the point where your server has enough RAM to cope.

The usual default for MaxClients (150 is typical) is not suitable for running a heavyweight in-process application server on a modest machine under the prefork model. (Most such application servers either don't support threaded models or discourage their use.)
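
A minimal sketch of that sizing arithmetic, in Python. The total RAM figure comes from the top output in the question. The reserved amount and the per-process peak are hypothetical placeholders: measure your own application under load before trusting any result. The function name `max_clients_for` is made up for this example.

```python
def max_clients_for(total_ram_kb, reserved_kb, per_process_kb):
    """Largest MaxClients whose worst-case footprint
    (MaxClients * per-process peak) still fits in the RAM left for Apache."""
    if per_process_kb <= 0:
        raise ValueError("per_process_kb must be positive")
    usable_kb = total_ram_kb - reserved_kb
    return max(usable_kb // per_process_kb, 0)


TOTAL_RAM_KB = 8179920        # "Mem: ... total" from the question's top output
RESERVED_KB = 1024 * 1024     # ASSUMPTION: keep 1 GiB for the OS and other services
PER_PROCESS_KB = 50 * 1024    # ASSUMPTION: 50 MiB peak per apache2 child under load

limit = max_clients_for(TOTAL_RAM_KB, RESERVED_KB, PER_PROCESS_KB)
print("MaxClients should not exceed about", limit)
```

Note that the roughly 8 MB resident size shown in the question is for children that have only just started; a child that has served application requests is usually much larger, so the per-process figure has to be taken from a loaded server.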

However, if you decrease MaxClients too far, the application will become unavailable once every slot is busy, particularly if keepalives are on and the keepalive timeout is too long. Processes that are merely keeping a connection alive (state K in server-status) still use a lot of RAM, and that may be a problem. Try to minimise the keepalive timeout, or turn keepalives off altogether.

You need to keep an eye on server-status (as provided by mod_status).
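
One way to watch this is to read the machine-readable status page that mod_status serves at `/server-status?auto`, which includes a `Scoreboard:` line with one character per slot. The Python sketch below only parses such text; it assumes you fetch the page yourself from a host allowed to see it. The `sample` string is hand-written for illustration and is not real server output, and the function names are made up for this example.

```python
from collections import Counter


def scoreboard_counts(status_text):
    """Count slot states on the 'Scoreboard:' line of mod_status ?auto output."""
    for line in status_text.splitlines():
        if line.startswith("Scoreboard:"):
            board = line.split(":", 1)[1].strip()
            return Counter(board)
    raise ValueError("no Scoreboard line found")


def keepalive_share(counts):
    """Fraction of busy slots in state K (keepalive).
    '_' (waiting for connection) and '.' (open slot) are not counted as busy."""
    busy = sum(n for state, n in counts.items() if state not in "_.")
    return counts["K"] / busy if busy else 0.0


# Hand-written sample, NOT real output: 4 keepalive, 1 writing, 1 reading,
# 2 idle children and 4 open slots.
sample = "BusyWorkers: 6\nIdleWorkers: 2\nScoreboard: KKKKW_R_....\n"

counts = scoreboard_counts(sample)
share = keepalive_share(counts)
print(f"{counts['K']} slots in keepalive, {share:.0%} of busy slots")
```

If most busy slots are in state K, children are being held (along with their RAM) by idle connections, which is the case for shortening the keepalive timeout.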

Of course you should only make ANY of these changes if you understand the consequences. Think twice, change the config once. If you have ANY ability to test the changes with simulated load on a similar spec non-production machine, do so.
