ONJava.com -- The Independent Source for Enterprise Java



Session Replication in Tomcat 5 Clusters, Part 1

Session Replication in Tomcat 5

Prior to version 5, Tomcat server only supported sticky sessions (using the mod_jk module for load balancing purposes). If we needed session replication, we had to rely on third-party software such as JavaGroups to implement it. Tomcat 5 server comes with session replication capabilities. Similar to the clustering feature, session replication is enabled just by modifying the server.xml configuration file.

Martin Fowler describes three session-state persistence patterns in his book Patterns of Enterprise Application Architecture. These patterns are:

  1. Client Session State: Stores session state on the client.
  2. Server Session State: Keeps the session state on a server system in a serialized form.
  3. Database Session State: Stores session data as committed data in the database.

Tomcat supports these three session persistence types:

  1. In-memory replication: Session state is replicated in the JVM's memory, using the SimpleTcpCluster and SimpleTcpReplicationManager classes that ship with the Tomcat 5 installation. These classes live under the org.apache.catalina.cluster package and are part of server/lib/catalina-cluster.jar.
  2. Database persistence: In this type, the session state is stored in a relational database, and the server retrieves session information from the database using the PersistentManager class with a JDBCStore. The store class is org.apache.catalina.session.JDBCStore and is part of catalina.jar.
  3. File-based persistence: Here, the session state is saved to a file system, using the PersistentManager class with a FileStore. The store class is org.apache.catalina.session.FileStore and is part of catalina.jar.

Elements of Tomcat Cluster and Session Replication

This section briefly explains the elements comprising Tomcat cluster and session replication.


Cluster

This is the main element in the cluster. The class SimpleTcpCluster represents the cluster element. It creates the ClusterManager for all of the distributable web contexts using the specified manager class name in server.xml.

Cluster Manager

This class takes care of replicating the session data across all of the nodes in the cluster. The session replication happens for all those web applications that have the distributable tag specified in the web.xml file. The cluster manager is specified in server.xml with the managerClassName attribute of the Cluster element. The cluster manager code is designed to be a separate element in the cluster; all we have to do is write a session manager class that implements the ClusterManager interface. This gives us the flexibility of using a custom cluster manager without affecting other elements in the cluster.

There are two replication algorithms. SimpleTcpReplicationManager replicates the entire session each time, while DeltaManager only replicates session deltas.

The simple replication manager copies the entire session on each HTTP request. This is more useful when the sessions are small in size, and if we have code like:

HashMap map = (HashMap) session.getAttribute("map");

Here, we don't need to specifically call the session.setAttribute() or session.removeAttribute() methods to replicate the session changes. For each HTTP request, all of the attributes in the session are replicated. There is an attribute called useDirtyFlag that can be used to reduce the number of times a session is replicated. If this flag is set to true, we have to call the setAttribute() method to get the session changes replicated. If it's set to false, the session is replicated after each request. SimpleTcpReplicationManager creates a ReplicatedSession to perform the session replication.
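The effect of useDirtyFlag can be pictured with a small, self-contained model. The sketch below is not Tomcat's ReplicatedSession; DirtyFlagSession and its methods are hypothetical names, used only to show why an in-place change to an attribute goes unnoticed when the flag is true.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a session with a useDirtyFlag-style switch.
// Illustration only; this is not Tomcat's ReplicatedSession.
class DirtyFlagSession {
    private final Map<String, Object> attributes = new HashMap<>();
    private final boolean useDirtyFlag;
    private boolean dirty;

    DirtyFlagSession(boolean useDirtyFlag) {
        this.useDirtyFlag = useDirtyFlag;
    }

    void setAttribute(String name, Object value) {
        attributes.put(name, value);
        dirty = true; // an explicit write always marks the session as changed
    }

    Object getAttribute(String name) {
        return attributes.get(name);
    }

    // Asked at the end of a request: should the session be sent to the other nodes?
    boolean needsReplication() {
        return !useDirtyFlag || dirty;
    }

    // Called once the session has been sent.
    void markReplicated() {
        dirty = false;
    }
}
```

In this model, with the flag set to true, putting a new entry into a map that is already stored in the session leaves needsReplication() false until setAttribute() is called again with that map. With the flag set to false, needsReplication() is always true.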

The delta manager is provided for pure performance reasons. It does one replication per request. It also invokes listeners, so if we call session.setAttribute(), the listeners on the other servers will be invoked. DeltaManager creates a DeltaSession to do the session replication.
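To make the idea of a session delta concrete, here is a minimal sketch that records attribute changes during a request and replays them on another node. It assumes nothing about Tomcat's DeltaSession internals; DeltaRecordingSession, Change, drainDelta(), and applyDelta() are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of delta replication; not Tomcat's DeltaSession.
class DeltaRecordingSession {
    // One recorded change: a null value means "remove this attribute".
    static final class Change {
        final String name;
        final Object value;
        Change(String name, Object value) { this.name = name; this.value = value; }
    }

    private final Map<String, Object> attributes = new HashMap<>();
    private final List<Change> pending = new ArrayList<>();

    void setAttribute(String name, Object value) {
        if (value == null) { removeAttribute(name); return; }
        attributes.put(name, value);
        pending.add(new Change(name, value));
    }

    void removeAttribute(String name) {
        attributes.remove(name);
        pending.add(new Change(name, null));
    }

    Object getAttribute(String name) { return attributes.get(name); }

    // Called once per request: hand over the recorded changes and start a new delta.
    List<Change> drainDelta() {
        List<Change> delta = new ArrayList<>(pending);
        pending.clear();
        return delta;
    }

    // Called on the receiving node to replay a delta.
    void applyDelta(List<Change> delta) {
        for (Change c : delta) {
            if (c.value == null) attributes.remove(c.name);
            else attributes.put(c.name, c.value);
        }
    }
}
```

Only the recorded changes cross the wire in this model, so the cost of a replication depends on how much changed during the request rather than on the total size of the session.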


Membership

Membership is established by all Tomcat instances sending broadcast messages on the same multicast IP address and port. The broadcast message contains the IP address and TCP listen port of the server. If an instance has not received the message from a member within a given time frame (specified by the mcastDropTime parameter in the cluster configuration), that member is considered dead. The element is represented by the McastService class.

The attributes starting with mcastXXX are for the membership multicast ping. The following table lists the attributes used for IP multicast server communication.

Attribute       Description
mcastAddr       Multicast address (this has to be the same for all of the nodes)
mcastPort       Multicast port number (this also has to be the same for all of the nodes)
mcastBindAddr   IP address to bind the multicast socket to a specific address
mcastTTL        Multicast Time To Live (TTL) to limit the broadcast
mcastSoTimeout  Multicast read timeout (in milliseconds)
mcastFrequency  Time in between sending "I'm alive" heartbeats (in milliseconds)
mcastDropTime   Time before a node is considered dead (in milliseconds)
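The interplay between heartbeats and the drop time can be sketched without any networking. The class below is a hypothetical model, not McastService: MembershipTable, heartbeat(), and expire() are illustrative names, and the member identifiers are made-up values.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of heartbeat-based membership; not Tomcat's McastService.
class MembershipTable {
    private final long dropTimeMillis; // plays the role of mcastDropTime
    private final Map<String, Long> lastSeen = new HashMap<>();

    MembershipTable(long dropTimeMillis) {
        this.dropTimeMillis = dropTimeMillis;
    }

    // Record an "I'm alive" message from a member identified as host:port.
    void heartbeat(String hostAndPort, long nowMillis) {
        lastSeen.put(hostAndPort, nowMillis);
    }

    // Remove every member whose last heartbeat is older than the drop time
    // and return the identifiers of the members that were removed.
    List<String> expire(long nowMillis) {
        List<String> dropped = new ArrayList<>();
        for (Map.Entry<String, Long> e : lastSeen.entrySet()) {
            if (nowMillis - e.getValue() > dropTimeMillis) {
                dropped.add(e.getKey());
            }
        }
        for (String name : dropped) {
            lastSeen.remove(name);
        }
        return dropped;
    }

    boolean isMember(String hostAndPort) {
        return lastSeen.containsKey(hostAndPort);
    }
}
```

With a drop time of 3000 ms, a member last heard from at time 0 is removed by expire(3500), while a member that sent a heartbeat at time 2500 stays in the table.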


Sender

This element is represented by the ReplicationTransmitter class. When a multicast broadcast message is received, the member is added to the cluster. Upon the next replication request, the sending instance uses the host and port information to establish a TCP socket and sends the serialized data over it. Tomcat 5 can handle session replication in three modes: asynchronous, synchronous, and pooled. The following list explains how these modes work and the scenarios where each should be used.

  • Asynchronous: In this replication mode, a single thread for each cluster node acts as a session data transmitter. Here, the request thread places the replication request into a queue and then returns to the client. Asynchronous replication should be used if you have sticky sessions until failover, where the replication time is not crucial but the request time is. During async replication, the request is returned before the data has been replicated. This replication mode yields shorter request times. It is useful when the requests are farther apart (i.e., there are longer delays between the web requests). It's also useful if we don't care whether the session is completely replicated or not, or when the sessions are small so the session replication times are shorter.
  • Synchronous: In this mode, a single thread executes both the HTTP request and the data replication. The thread doesn't return until all nodes in the cluster have received the session data. Synchronous means that the replication data is sent over a single socket. Since it uses a single thread, using synchronous mode could potentially become a bottleneck in the cluster's performance. This replication mode guarantees that the session is replicated before the request returns.
  • Pooled: Tomcat 5 provides a big improvement in the way the session is replicated with the pooled replication mode. Pooled mode is basically an extended version of synchronous mode. It is based on the concept of "partitioning" the service into multiple instances, each of which handles a different slice of the session data. Multiple sockets are opened to the receiving server to send the session information; this approach is faster than sending everything over a single socket. Thus, the session is replicated using a pool of sockets synchronously. The request doesn't return until all of the session data is replicated. Increase the TCP thread count to use this mode effectively. Since we are using multiple sockets, the pooled mode results in increased performance and better scalability. It is also the safest configuration, because there is a sufficient number of sockets to transmit all of the session data to other nodes in a reasonable amount of time. With a single socket, the session data may be lost or partially transmitted across the cluster.
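The difference between the synchronous and asynchronous hand-off can be sketched with a plain executor. This is an assumption-laden model rather than ReplicationTransmitter itself: ReplicationModesSketch and its methods are hypothetical, the "other node" is just an in-memory list, and pooled mode (several sockets used in parallel) is left out.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of synchronous vs. asynchronous hand-off;
// not Tomcat's ReplicationTransmitter.
class ReplicationModesSketch {
    // Stands in for the other cluster node: it records what it has received.
    final List<String> received = new CopyOnWriteArrayList<>();

    // Asynchronous mode: one background thread drains a queue of pending sends.
    private final ExecutorService asyncSender = Executors.newSingleThreadExecutor();

    // Synchronous: the request thread performs the send itself
    // and returns only after the data has been delivered.
    void replicateSync(String sessionData) {
        send(sessionData);
    }

    // Asynchronous: the request thread only queues the send and returns immediately.
    void replicateAsync(String sessionData) {
        asyncSender.execute(() -> send(sessionData));
    }

    private void send(String sessionData) {
        received.add(sessionData); // a real sender would write to a TCP socket here
    }

    // Stop accepting work and wait for the queued sends to finish.
    boolean shutdown(long timeoutMillis) throws InterruptedException {
        asyncSender.shutdown();
        return asyncSender.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
    }
}
```

After replicateSync() returns, the data is already in received. After replicateAsync() returns, it may not be there yet, which is the trade-off between request time and replication guarantee described in the list above.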


Receiver

This cluster element is represented by the ReplicationListener class. The attributes in the cluster configuration that start with tcpXXX are for the actual TCP session replication. The following table shows the attributes used to configure socket-based server communication for server replication.

Attribute           Description
tcpThreadCount      Number of threads to handle incoming replication requests
tcpListenAddress    IP address for TCP cluster requests. (If this is set to auto, the address becomes the value of InetAddress.getLocalHost().getHostAddress().)
tcpListenPort       The port number where the session replication is received from other cluster members.
tcpSelectorTimeout  Timeout (in milliseconds)

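The auto rule for tcpListenAddress can be written out in a few lines. ListenAddressResolver is a hypothetical helper, not a Tomcat class; only the InetAddress call is taken from the table above.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper showing the "auto" rule described for tcpListenAddress.
class ListenAddressResolver {
    static String resolve(String configured) throws UnknownHostException {
        if ("auto".equals(configured)) {
            // The call named in the attribute description for the auto setting.
            return InetAddress.getLocalHost().getHostAddress();
        }
        return configured; // an explicit address is used as given
    }
}
```

On a machine with several network interfaces, the address returned for auto may not be the one we want the cluster traffic on, which is a reason to set the address explicitly.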
Replication Valve

The replication valve is used to determine which HTTP requests need to be replicated. Since we don't usually replicate static content (such as HTML, JavaScript, stylesheets, and image files), we can filter out requests for static content using the replication valve element. The valve is also used to find out when the request is completed and to initiate the replication.
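The filtering step can be pictured as matching the request URI against a list of patterns. The sketch below is only an illustration of that idea; StaticContentFilter, its semicolon-separated pattern string, and shouldReplicate() are assumptions made for the example and do not describe the valve's actual code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical request filter; it only illustrates the idea behind the replication valve.
class StaticContentFilter {
    private final List<Pattern> patterns = new ArrayList<>();

    // Patterns are regular expressions separated by semicolons, e.g. ".*\\.gif;.*\\.js".
    StaticContentFilter(String semicolonSeparatedPatterns) {
        for (String p : semicolonSeparatedPatterns.split(";")) {
            String trimmed = p.trim();
            if (!trimmed.isEmpty()) {
                patterns.add(Pattern.compile(trimmed));
            }
        }
    }

    // True when the request URI matches none of the static-content patterns.
    boolean shouldReplicate(String requestUri) {
        for (Pattern p : patterns) {
            if (p.matcher(requestUri).matches()) {
                return false;
            }
        }
        return true;
    }
}
```

With the patterns ".*\\.gif;.*\\.js;.*\\.css;.*\\.html", a request for /shop/logo.gif is skipped, while a request for /shop/cart.jsp still triggers replication.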


Deployer

The deployer element can be used to deploy apps cluster-wide. Currently, the deployment only deploys/undeploys to working members in the cluster so no WARs are copied upon startup of a broken node. The deployer watches a directory (watchDir) for WAR files when watchEnabled="true". When a new WAR file is added, the WAR gets deployed to the local instance, and is then deployed to the other instances in the cluster. When a WAR file is deleted from the watchDir the WAR is undeployed locally and cluster-wide.
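Watching a directory comes down to comparing two scans of it. The sketch below shows that comparison on plain sets of file names; WarWatchDiff is a hypothetical class and says nothing about how Tomcat's deployer is actually implemented.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of the watch-directory comparison; not Tomcat's deployer.
class WarWatchDiff {
    // WAR files present now but not in the previous scan: candidates for deployment.
    static Set<String> added(Set<String> previous, Set<String> current) {
        Set<String> result = new TreeSet<>(current);
        result.removeAll(previous);
        return result;
    }

    // WAR files present in the previous scan but gone now: candidates for undeployment.
    static Set<String> removed(Set<String> previous, Set<String> current) {
        Set<String> result = new TreeSet<>(previous);
        result.removeAll(current);
        return result;
    }
}
```

Each name in added() would be deployed locally and then to the other members; each name in removed() would be undeployed locally and cluster-wide.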

All of the elements in the Tomcat cluster architecture, and their hierarchy, are shown in Figure 1.

Figure 1. Tomcat cluster hierarchy diagram.

How Session Replication Works in Tomcat

The following section briefly explains how the cluster nodes share the session information when a Tomcat server is started up or shut down. For a more detailed explanation, refer to the Tomcat 5 Clustering documentation.

TC-01: First node in the cluster
TC-02: Second node in the cluster

  • Server startup: TC-01 starts up using the standard server startup sequence. When the Host object is created, a cluster object is associated with it. When the contexts are parsed, if distributable is specified in web.xml, Tomcat creates the session manager (SimpleTcpReplicationManager instead of StandardManager) for the web context. The cluster class will start up a membership service (an instance of Member) and a replication service.
    When TC-02 starts up, it follows the same sequence as the first member (TC-01) did, with one difference. The cluster is started and will establish a membership (TC-01, TC-02). TC-02 will now request the session state from TC-01. TC-01 responds to the request, and before TC-02 starts listening for HTTP requests, TC-01 transfers its state to TC-02. If TC-01 doesn't respond, TC-02 will time out after 60 seconds and issue a log entry. The session state gets transferred for each web application that has distributable specified in web.xml.
  • Session creation: When TC-01 receives a request, a session (S1) is created. The request coming into TC-01 is treated exactly the same way as it is without session replication. The action happens when the request is completed: the ReplicationValve will intercept the request before the response is returned to the user. At this point, it finds that the session has been modified, and it uses TCP to replicate the session to TC-02.
  • Server outage/shutdown: When a server in the cluster crashes or is brought down for maintenance or system upgrades, the other node receives a notification that the first node has dropped out of the cluster. TC-02 removes TC-01 from its membership list, and TC-02 will no longer be notified of any changes that occur in TC-01. The load balancer will fail over to TC-02 and all of the sessions are handled by TC-02.
    When TC-01 starts back up, it again follows the startup sequence described in the server startup step. It joins the cluster and communicates with TC-02 for the current state of all of the sessions. And once it receives the session state, it finishes loading and opens its HTTP/mod_jk ports. So no requests make it to TC-01 until it has received the session state from TC-02.
  • Session expiration: If a session on the first node is explicitly invalidated or expired due to timeout, the invalidate call is intercepted and the session is placed in a queue with other invalidated sessions. When the request is complete, instead of sending out the session that has changed, the server sends out the session-expire message to TC-02, and TC-02 will invalidate the session as well. We can see the message that the session is invalidated on the server console. The invalidated session will not be replicated in the cluster until another request comes through the system and checks the invalid queue.
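The session expiration step can be modeled as a queue that is flushed when a request completes. This is a sketch under stated assumptions: ExpireQueueSketch, its method names, and the "SESSION-EXPIRE:" message text are invented for the illustration and are not Tomcat's message format.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical model of the invalidated-session queue; not Tomcat's implementation.
class ExpireQueueSketch {
    private final Queue<String> invalidated = new ArrayDeque<>();

    // The invalidate call is intercepted: the session ID is only queued here.
    void onInvalidate(String sessionId) {
        invalidated.add(sessionId);
    }

    // At the end of a request, turn every queued ID into a session-expire message
    // (the message text is made up for this example).
    List<String> onRequestComplete() {
        List<String> messages = new ArrayList<>();
        String id;
        while ((id = invalidated.poll()) != null) {
            messages.add("SESSION-EXPIRE:" + id);
        }
        return messages;
    }
}
```

Nothing is sent at the moment of invalidation in this model; the expire message leaves only when the next request completes and drains the queue, which matches the delay described in the last bullet.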


Conclusion

In this article, I talked about session replication in a clustered environment, and some design considerations when creating J2EE applications with an in-memory session replication requirement. I also discussed the clustering elements in the Tomcat 5 container that are specific to session replication. In part two of this series, we'll look at how to configure session replication in a Tomcat cluster using different session managers and replication modes.

Srini Penchikala is an information systems subject matter expert at Flagstar Bank.
