How to handle multiple sockets within a Perl daemon with large memory usage


I have written a client-server program in Perl using IO::Socket::INET. I access the server through a CGI-based site. The server program runs as a daemon and accepts multiple simultaneous connections. The server process consumes about 100 MB of memory (9 large hashes, many arrays…). I want these hashes to stay in memory and be shared so that I don't have to create them for every connection. Creating the hashes takes 10-15 seconds.

Whenever a new connection is accepted, I fork a new process to handle it. Since the parent process is huge, every fork has to allocate memory for the new child, and with limited memory it takes a long time to spawn one, which increases the response time. Often it hangs even for a single connection.

The parent process creates 9 large hashes. Each child needs to read one or more of them; the children never update the hashes. I want to use something like copy-on-write, so that the whole 100 MB of global variables created by the parent is shared with all children, or any other mechanism such as threads. I expect the server to receive at least 100 requests per second, and it should be able to process all of them in parallel. On average, a child will exit in 2 seconds.

I am using Cygwin on Windows XP with only 1 GB of RAM, and I have not found a way to overcome this issue. Can you suggest something? How can I share the variables and also create, manage, and synchronize 100 child processes per second?


Best Solution

Instead of forking, there are two other approaches to handling concurrent connections: threads or polling.

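In the thread approach, a new thread is created for each connection and handles the I/O of that socket. A thread runs in the same virtual memory as the creating process. Note, however, that Perl's interpreter threads copy every non-shared variable into each new thread, so the large hashes must be marked as shared (see threads::shared) or each thread will get its own copy. Make sure to use locks properly to synchronize any write access to shared data.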
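A minimal sketch of the thread approach, assuming a perl built with interpreter threads. The hash contents, the `lookup_key` helper, and the three sample keys are hypothetical stand-ins for the real data and per-connection work; socket handling is left out to keep the focus on sharing the data.

```perl
use strict;
use warnings;
use Config;

BEGIN {
    # Interpreter threads are a build-time option; stop early without them.
    unless ($Config{useithreads}) {
        print "this perl was built without thread support\n";
        exit 0;
    }
}

use threads;
use threads::shared;

# Hypothetical stand-in for one of the large lookup hashes.
# shared_clone() returns a shared copy that all threads see,
# instead of each thread receiving a private copy.
my $lookup = shared_clone({ alpha => 1, beta => 2 });

# Work done by one thread: a read-only lookup, so no lock is taken.
sub lookup_key {
    my ($key) = @_;
    return $lookup->{$key} // 'unknown';
}

# One thread per "request"; a real server would pass the client socket instead.
my @workers = map { threads->create(\&lookup_key, $_) } qw(alpha beta gamma);

# Collect each thread's return value.
my @results = map { $_->join } @workers;
print "@results\n";
```

Access to shared variables is slower than access to ordinary ones, so it is worth measuring this against the polling approach before committing to it.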

An even more efficient approach is to use polling via select(). In this case a single process/thread handles all sockets. This works under the assumption that most of the work is I/O, so that the time otherwise spent waiting for one socket's I/O requests to finish is used to handle other sockets.
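A minimal sketch of the polling approach using IO::Select, the object wrapper around select(). The `%lookup` hash, the one-line request protocol, and the `handle_line`, `poll_once`, and `run_server` helpers are all assumptions made for illustration. Because everything runs in one process, the data is built once and never copied.

```perl
use strict;
use warnings;
use IO::Socket::INET;
use IO::Select;

# Hypothetical stand-in for the large read-only hashes; built once at startup.
my %lookup = (alpha => 1, beta => 2);

# Answer one request line from the in-memory data (placeholder protocol).
sub handle_line {
    my ($line) = @_;
    $line =~ s/\s+\z//;    # strip the trailing newline
    return exists $lookup{$line} ? "$line=$lookup{$line}\n" : "unknown\n";
}

# Wait up to $timeout seconds, then service every socket that is readable.
sub poll_once {
    my ($select, $listener, $timeout) = @_;
    for my $fh ($select->can_read($timeout)) {
        if ($fh == $listener) {
            # New connection: watch it alongside the existing ones.
            my $client = $listener->accept or next;
            $select->add($client);
            next;
        }
        my $n = sysread($fh, my $buf, 4096);
        if (!$n) {
            # EOF or read error: stop watching this client.
            $select->remove($fh);
            close $fh;
            next;
        }
        # Simplification: assumes one complete request line per read.
        syswrite($fh, handle_line($buf));
    }
}

# Listen on $port and poll forever in a single process.
sub run_server {
    my ($port) = @_;
    my $listener = IO::Socket::INET->new(
        LocalPort => $port,
        Listen    => 16,
        ReuseAddr => 1,
    ) or die "cannot listen on port $port: $!";
    my $select = IO::Select->new($listener);
    poll_once($select, $listener, undef) while 1;
}

# Start the server only when a port is given on the command line.
run_server($ARGV[0]) if @ARGV;
```

Saved as, say, server.pl, this could be started with `perl server.pl 5000`. A handler that computes for a long time would block every other client, which is why this design fits I/O-bound work best.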

Research both options further and decide which one suits you best.

See for example: http://www.perlfect.com/articles/select.shtml