ONJava.com -- The Independent Source for Enterprise Java
oreilly.comSafari Books Online.Conferences.


AddThis Social Bookmark Button

Clustering and Load Balancing in Tomcat 5, Part 2
Pages: 1, 2

Sample Code

The sample code used in this article is provided as tomcatclustering.zip. After installing Tomcat server instances (you need four of them), extract the contents of the zip file to the tomcat directories. The sample code provided uses RoundRobinRule as the load-balancing policy. If you want to use the random redirect policy, modify the rules.xml file located in "tomcat50/webapps/balancer/WEB-INF/config" directory. Just comment-out the rule elements for RoundRobinRule and uncomment rule elements specifying RandomRedirectRule. Also, if you want to use two instances in the cluster instead of three, comment out the third rule and change maxServerInstances attribute to 2 (instead of 3).

Note: I removed all other web applications (jsp-examples, etc.) that were included in the Tomcat installation, keeping only the balancer and the sample web applications.

HTTP Request Flow

The web request flow in the sample cluster environment is as follows:

  1. Launch the startup page (http://localhost:8080/balancer/testLB.jsp).
  2. JSP redirects the request to balancer filter (URL: http://localhost:8080/balancer/LoadBalancer).
  3. The Tomcat server working as Load Balancer (TC-LB) intercepts the web request and redirects to the next available cluster member (TC01, TC02, or TC03) based on load-balancing algorithm specified in the configuration file.
  4. The JSP script sessiondata.jsp (located in "clusterapp" web application) in the selected cluster member is called.
  5. If the session is modified the session listener methods in ClusterAppSessionListener are called to log the session modification events.
  6. sessiondata.jsp displays the session details (such as session id, last access time, etc.) on the web browser.
  7. Randomly stop one or two cluster nodes (by calling "stop.tomcat5x" target in Ant script).
  8. Repeat steps 1 through 7 to see if the requests fail over to one of the available cluster members. Also, check if the session information is copied to the cluster members without losing any data.

Figure 3 shows the web request flow represented in a Sequence diagram.

Click for larger view
Figure 3. Cluster application sequence diagram (click on the screenshot to open a full-size view).

Cluster Configuration

A sample web application called "clusterapp" was created to run in the cluster. All the instances have the same directory structure and contents to optimize session replication.

Since Tomcat server instances in the cluster use IP Multicast to transmit sessions, we need to make sure that IP multicast is enabled on the machine where the cluster is set up. To verify this, you can run the sample Java program MulticastNode provided in the book Tomcat: The Definitive Guide or refer to the sample tutorial available on JavaSoft web site on how to write multicast server and client programs.

When a cluster node is started the other members in the cluster show a log message on the server console that a member has been added to the cluster. Similarly, when a cluster node goes down, the remaining members display a log message on the console that a member has disappeared from the cluster. Figure 4 shows the log messages displayed on Tomcat console when a cluster node is removed from the cluster or a new member is added to the cluster.

Figure 4
Figure 4. Log messages when a member is added or removed from cluster

Follow the steps below to enable clustering and session replication in Tomcat server:

  1. All session attributes must implement the java.io.Serializable interface.

  2. Uncomment the Cluster element in server.xml file. The attributes useDirtyFlag and replicationMode in the Cluster element are used to optimize frequency and the session replication mechanism.

  3. Enable ReplicationValve by uncommenting Valve element in server.xml. The ReplicationValve is used to intercept the HTTP request and replicate the session data among the cluster members if the session has been modified by the web client. The Valve element has a "filter" attribute that can be used to filter out requests that could not modify the session (such as HTML pages and image files).

  4. Since all three Tomcat instances are running on the same machine, the tcpListenPort attribute is set to be unique for each Tomcat instance. It is important to know that the attributes starting with mcastXXX (mcastAddr, mcastPort, mcastFrequency, and mcastDropTime) are for the cluster membership IP multicast ping and those starting with tcpXXX (tcpThreadCount, tcpListenAddress, tcpListenPort, and tcpSelectorTimeout) are for TCP session replication ("Clustering configuration parameters" table below shows the configuration of different settings in the Tomcat server instances to enable clustering).

  5. The web.xml meta file (located in clusterapp\WEB-INF directory) should have <distributable/> element. In order to replicate session state for a specific web application, the distributable element needs to be defined for that application. This means if you have more than one web application that need session replication, then you need to add distributable in all those web applications' web.xml files. The Tomcat clustering chapter in the book Tomcat: The Definitive Guide provides a very good explanation on this topic.

Table 2. Clustering configuration parameters

Configuration Parameter Instance 1 Instance 2 Instance 3 Instance 4
Instance Type Load Balancer Cluster Node 1 Cluster Node 2 Cluster Node 3
Code name TC-LB TC01 TC02 TC03
Home Directory c:/web/tomcat50 c:/web/tomcat51 c:/web/tomcat52 c:/web/tomcat53
Server Port 8005 9005 10005 11005
Connector 8080 9080 10080 11080
Coyote/JK2 AJP Connector 8009 9009 10009 11009
Cluster mcastAddr
Cluster mcastPort 45564 45564 45564 45564
Cluster tcpListenPort 4000 4001 4002 4003

Note: Since all the cluster members are running on the same physical machine they use the same IP address (

If you are not using the Ant script to start and stop Tomcat instances, do not set up the CATALINA_HOME environment variable on your computer. If this variable is set, all instances will try to use the same directory (specified in CATALINA_HOME variable) to start Tomcat instances. As a result, only the first instance will successfully start and the other instances will crash with a bind exception message saying that the port is already in use: "java.net.BindException: Address already in use: JVM_Bind:8080".

Related Reading

Java Performance Tuning
By Jack Shirazi

Load Balancer setup

I wrote two simple, custom load-balancing rules extending the rules API to redirect incoming web requests (RoundRobinRule and RandomRedirect). These rules are based on load-balancing algorithms such as round robin and random redirect. You can write similar custom load balancing rules based on other factors like weight based and last access time, etc. The Tomcat load balancer provides a sample parameter-based load balancing rule. It redirects web requests to different URLs depending on the parameter specified in the HTTP request.

Leave the cluster and valve elements in server.xml (in TC-LB instance) commented out since we are not using this Tomcat instance as a cluster member.

Testing Setup

Session Persistence Testing

In session persistence testing, the main objective is to verify that session data is not lost when a cluster member crashes in the middle of a web request. The JSP sessiondata.jsp was used to display the session details. This script also provides HTML text fields to add/modify/remove the session attributes. After adding couple of attributes to HTTP session, I randomly stopped the cluster nodes and check the session data on the available cluster members.

Load Testing

The objective of load testing is to study the custom load balancing algorithms and see how effectively the web requests are distributed to the nodes in the cluster especially when one or more of the nodes go down. JMeter load testing tool was used to simulate multiple concurrent web users.

Steps to test load balancing in the cluster setup:

  1. Start the load balancer and cluster instances of Tomcat server.
  2. Launch the startup JSP script (testLB.jsp).
  3. Simulate the server crash by manually stopping one or more containers at random intervals.
  4. Check the load distribution pattern.
  5. Repeat steps 1 through 4 for 100 times.

All log messages were directed to a text file called tomcat_cluster.log (located in tomcat50/webapps/balancer directory). The response times for all the web objects shown in the sequence diagram (Figure 2) were logged using Log4J messages. The elapsed times (in milliseconds) were collected and tabulated into a matrix as shown in Table 3. I followed a similar instrumentation methodology described in Designing Performance Testing Metrics into Distributed J2EE Apps in collecting the response times during the tests.

The following tables show the elapsed times in load testing (using RoundRobinRule) and load distribution percentages (using RandomRedirectRule) respectively.

Table 3. Elapsed times for load testing

# Scenario testLB.jsp
1 All three server instances running 54 76 12 142
2 Two server instances running (TC02 was stopped) 55 531 14 600
3 Only one server instance running
(TC01 and TC02 were stopped)
56 1900 11 1967

Note: All elapsed times are average values based on a load of 100 concurrent users.

Table 4. Load distribution when using random LB policy

# Scenario TC01 (%) TC02 (%) TC03 (%)
1 All three server instances running 30 46 24
2 Two server instances running (TC02 was stopped) 56 0 44

Note: Load distribution percentages are based on a load of 100 concurrent users.


In session persistence testing, after adding session attributes one of the cluster nodes was brought down and it was verified that the session attributes were not lost due to the server outage. The session details logged in the text file were used to study the details of session attributes.

In load testing, when one or two server instances were stopped and only one Tomcat instance was running, the response times took longer compared to when all three instances were available. When the stopped instances were restarted, the load balancer automatically found that the server is again available to take the requests and redirected the next web request, which significantly improved the response times.

The mechanism I used to find if a cluster member is available (using ServerUtil) is not the fastest way to do this. More sophisticated and robust fail-over techniques should be used in real world scenarios.

One of the limitations of the proposed cluster setup is that it only provides a single load balancer. What happens if the Tomcat instance acting as load balancer goes down? There is no way to forward the requests to any of the cluster members and the result is a so-called Single Point of Failure (SPoF). One solution to this problem is to have a second Tomcat instance as the standby load balancer to take over if the primary load balancer crashes. Typical HA options involve having two load balancers to prevent SPoF situations.

In the sample cluster setup, all Tomcat instances (including load balancer) were configured to run on the same computer. A better design is to run a load balancer instance on a separate machine from the cluster members. Also, we should limit two cluster nodes per machine to take advantage of horizontal scaling method and to improve cluster performance.

HTTP session replication is an expensive operation for a J2EE web application server. The implications of session management in a clustered J2EE environment should be considered during the analysis and design phases of a project rather than waiting until the web application is implemented in the production environment. The application code must be designed keeping the cluster environment in mind. If the clustering implications are not considered at the design phase, the code may need to be completely rewritten to make it work in the cluster setup, which could be a very expensive effort.

If a web application supports any kind of object caching mechanism, then caching of objects in a cluster environment should be considered at the initial stages of the application development. This is very important because keeping cached data in all the cluster nodes in sync is critical for providing accurate and up-to-date business data to web users. Another important consideration is clearing the expired session data in a cluster.

Once the J2EE cluster has been successfully set up and is running, its management and maintenance will become very important to provide the benefits of scalability and high availability. With many nodes (members) in a cluster, maintenance revolves around keeping the cluster running and pushing out application changes to all cluster nodes. One way to provide these services is to implement a monitoring service that periodically checks the server availability and notifies if any of the nodes in the cluster become unavailable. This service should check the nodes at regular intervals detecting failed nodes and removing them from the active cluster node list so no requests go to those nodes. It should include, as changes and updates occur, the ability to update and synchronize all servers in the cluster. Since all requests to a web application must pass through the load-balancing system, the system can determine the number of active sessions, the number of active sessions connected in any instance, response times, peak load times, the number of sessions during peak load, the number of sessions during minimum load, and more. All this audit information can be used to fine-tune the entire system for optimal performance. Reports showing all these metrics should be generated on a regular basis to assess the effectiveness of load-balancing policies and cluster nodes.

Currently, all the configuration required to set up the cluster and load balancer is done manually by manipulating the configuration files (server.xml and rules.xml). It would be very helpful if the Jakarta group were to provide a web-based cluster administration GUI tool to perform the configuration changes needed to manage the clustering and load-balancing setup.


Srini Penchikala is an information systems subject matter expert at Flagstar Bank.

Return to ONJava.com.