R – At what level should I implement communication between nodes in a distributed system?


I'm building a web application that, from day one, will be at the limits of what a single server can handle, so I'm considering a distributed architecture with several identical nodes. The goal is to provide scalability (adding servers to accommodate more users) and fault tolerance. The nodes need to share some state, so they must communicate with each other. I believe I have the following alternatives for implementing this communication in Java:

  • Implement it using sockets and a custom protocol.
  • Use RMI.
  • Use web services (each node can send, receive, and parse HTTP requests).
  • Use JMS.
  • Use another high-level framework such as Terracotta or Hazelcast.

I would like to know how these technologies compare to each other:

  • When the number of nodes increases
  • When the amount of communication between the nodes increases (thousands of messages per second and/or messages of up to 100 KB, etc.)
  • On a practical level (e.g. ease of implementation, available documentation, licensing issues, etc.)
  • I'm also interested to know which technologies people are using in real production projects (as opposed to experimental or academic ones).

Best Solution

Don't forget Jini.

It gives you automatic service discovery, service leasing, and downloadable proxies so that the actual client/server communication protocol is up to you and not enforced by the framework (e.g. you can choose HTTP/RMI/whatever).
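To make the leasing idea concrete, here is a minimal plain-Java sketch. It is not the Jini API: `LeaseRegistry`, `renew`, and `liveServices` are hypothetical names invented for illustration, and the clock is injected only so the behaviour is easy to follow. The point it tries to show is that a registration is only valid for a limited time, so a node that crashes or loses its network connection simply drops out once it stops renewing.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Hypothetical in-memory registry illustrating leases; this is NOT the Jini API. */
class LeaseRegistry {
    // Maps a service id to the time (in ms) at which its lease expires.
    private final Map<String, Long> expiryByService = new ConcurrentHashMap<>();
    private final LongSupplier clock; // returns the current time in milliseconds

    LeaseRegistry(LongSupplier clock) {
        this.clock = clock;
    }

    /** Registers or renews a service; the entry stays valid for leaseMillis from now. */
    void renew(String serviceId, long leaseMillis) {
        expiryByService.put(serviceId, clock.getAsLong() + leaseMillis);
    }

    /** Drops expired entries and returns the ids of services whose lease is still valid. */
    Set<String> liveServices() {
        long now = clock.getAsLong();
        expiryByService.values().removeIf(expiry -> expiry <= now);
        return new TreeSet<>(expiryByService.keySet());
    }

    public static void main(String[] args) {
        long[] now = {0}; // fake clock so the example is deterministic
        LeaseRegistry registry = new LeaseRegistry(() -> now[0]);

        registry.renew("node-a", 1000);
        registry.renew("node-b", 1000);

        now[0] = 800;
        registry.renew("node-a", 1000); // node-a renews; node-b has "crashed"

        now[0] = 1500;
        System.out.println(registry.liveServices()); // prints [node-a]
    }
}
```

In a real deployment the clock would be something like `System::currentTimeMillis`, and the renewals would arrive over the network rather than as local method calls.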

The framework is built around acknowledging the 8 Fallacies of Distributed Computing and around recovery-oriented computing. In other words, you will have network problems, and the architecture is designed to help you recover and keep the service running.

If you also use JavaSpaces, it's trivial to implement workflows and producer-consumer architectures. Producers write entries into the space, and one or more consumers take that work from the space (under a transaction) and process it. You scale simply by adding more consumers.
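The shape of that producer-consumer pattern can be sketched in plain Java. Note the assumptions: this does not use the JavaSpaces API. A `LinkedBlockingQueue` stands in for the space, everything runs in one JVM, there is no transaction around the take, and `WorkSpaceDemo` and `process` are made-up names. It only illustrates that consumers compete for work from a shared pool, so adding consumers adds capacity without changing the producer.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** In-process stand-in for a space: a shared queue that several consumers drain. */
class WorkSpaceDemo {

    /** Writes `jobs` work items, lets `consumers` threads take them, returns sorted results. */
    static List<Integer> process(int jobs, int consumers) {
        BlockingQueue<Integer> space = new LinkedBlockingQueue<>(); // stand-in for the space
        Queue<Integer> results = new ConcurrentLinkedQueue<>();

        // Producer: write every work item into the shared space.
        for (int i = 1; i <= jobs; i++) {
            space.add(i);
        }

        // Consumers: each one takes items until the space stays empty for 100 ms.
        List<Thread> workers = new ArrayList<>();
        for (int c = 0; c < consumers; c++) {
            Thread worker = new Thread(() -> {
                try {
                    Integer job;
                    while ((job = space.poll(100, TimeUnit.MILLISECONDS)) != null) {
                        results.add(job * job); // the "work": square the number
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            worker.start();
            workers.add(worker);
        }

        // Wait for all consumers to finish.
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }

        // Sort because the order in which consumers finish is not deterministic.
        List<Integer> sorted = new ArrayList<>(results);
        Collections.sort(sorted);
        return sorted;
    }

    public static void main(String[] args) {
        System.out.println(process(5, 3)); // prints [1, 4, 9, 16, 25]
    }
}
```

Changing the second argument of `process` changes the number of consumers without touching the producer side, which is the scaling property the answer describes. With a real space, the take would additionally happen under a transaction so that work is not lost if a consumer fails midway.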
