| Sign In/My Account | View Cart |
The Tomcat 5 server provides built-in support for clustering and session replication. This first article in this series will provide an overview of session persistence and the inner works of session replication in Tomcat clusters. I will discuss how the session replication process works in Tomcat 5 and the replication mechanisms available for session persistence across the cluster nodes. In part two, I will discuss the details of a sample Tomcat cluster setup with session replication enabled, and compare different replication scenarios.
A traditional standalone (non-clustered) server does not provide any failover or load balancing capabilities. When the server goes down, the entire web site is unavailable until the server is brought back up. Any sessions that were stored in the server's memory are lost and the users have to log in again and enter all of the data lost due to the server crash.
On the other hand, a server that's part of a cluster provides both scalability and failover capabilities. A cluster is a group of multiple server instances running simultaneously and working together to provide high availability, reliability, and scalability. The server cluster appears to clients as a single server instance. Clustered server instances are pretty much the same as a standalone server from the client point of view, but they offer uninterrupted service and session data persistence by providing failover and session replication.
The application servers in a cluster use network technologies such as IP multicast and sockets to share information about the availability of the servers.
Tomcat server uses IP multicast for all one-to-many communications among server instances in the cluster. IP multicast is the broadcast technology that enables multiple servers to "subscribe" to a given IP address and port number and listen for messages. (Multicast IP addresses range from 224.0.0.0 to 239.255.255.255). Each server instance in the cluster uses multicast to broadcast regular heartbeat messages with its availability. By monitoring these heartbeat messages, server instances in a cluster determine when a server instance has failed. One limitation of using IP multicast for server communication is that it does not guarantee that the messages are actually received. For example, if an application's local multicast buffer is full, new multicast messages cannot be written to the buffer, and the application is not notified when messages are dropped.
IP sockets also provide the mechanism for sending messages and data among the servers in a cluster. The server instances use IP sockets for replicating HTTP session state among the cluster nodes. Proper socket configuration is crucial to the performance of a cluster. The efficiency of socket-based communication is dependent on the type of socket implementation (i.e. whether the system uses native or pure-Java socket reader implementation) and whether the server instance is configured to use enough socket reader threads, if the server uses pure-Java socket readers.
For best socket performance, configure the system to use the native sockets rather than the pure-Java implementation. This is because native sockets have less overhead compared to the Java-based socket implementation. Although the Java implementation of socket reader threads is a reliable and portable method of peer-to-peer communication, it does not provide the best performance for heavy-duty socket usage in the cluster. Native socket readers use more efficient techniques to determine if there is any data to read on a socket. With a native socket reader implementation, reader threads do not need to poll inactive sockets: they service only active sockets, and they are immediately notified when a given socket becomes active. With pure-Java socket readers, threads must actively poll all opened sockets to determine if they contain any data to read. In other words, socket reader threads are always "busy" polling sockets, even if the sockets have no data to read. This unnecessary overhead can reduce performance.
|
Related Reading
Tomcat: The Definitive Guide |
Even though the clustering feature was available in the earlier versions of Tomcat 5, it has become more modular in later versions (5.0.19 or later). The Cluster element in server.xml was refactored so that we can replace different parts of the cluster without affecting the other elements. For example, currently the configuration sets the membership service to be multicast discovery. This can easily be changed to a membership service that uses TCP or Unicast instead, without changing the rest of the clustering logic.
Other cluster elements, such as the session manager, replication sender, and replication receiver, can also be replaced with custom implementations without affecting the rest of the cluster configuration. Also, any server component in the Tomcat cluster can now use the clustering API to send messages to any or all members of the cluster.
There are two kinds of sessions handled by a server cluster: sticky sessions and replicated sessions. Sticky sessions are those sessions that stay on the single server that received the web request. The other cluster members don't have any knowledge of the session state in the first server. If the server that has the session goes down, the user has to again log in to the web site and re-enter any data stored in the session.
In the second session type, the session state in one server is copied to all of the other servers in the cluster. The session data is copied whenever the session is modified. This is a replicated session. Both sticky and replicated sessions have their advantages and disadvantages. Sticky sessions are simple and easy to handle, since we don't need to replicate any session data to other servers. This results in less overhead and better performance. But if the server goes down, so does all of the session data stored in its memory. Since the session data is not copied to other servers, the session is completely lost. This can cause problems if we are in the middle of processing a transaction type of query and lose all of the data that has been entered.
To support automatic failover for servlet and JSP HTTP session states, Tomcat server replicates the session state in memory. This is done by copying the data stored in the session (attributes) on one server to the other members in the cluster to prevent any data loss and allow for failover.
There are four categories of objects that are distinguished by how they maintain state on the server:
(Source: "Using WebLogic Server Clusters")
It is important to isolate the cluster multicast address with other applications. We don't want the cluster configuration or the network topology to interfere with multicast server communications. Sharing the cluster multicast address with other applications forces clustered server instances to process unnecessary messages, introducing overhead. Sharing a multicast address may also overload the IP multicast buffer and delay transmission of the server heartbeat messages. Such delays can result in a server instance being marked as dead, simply because its heartbeat messages were not received in a timely manner.
In addition to the network-related factors mentioned above, there are some design considerations related to the way we write J2EE web applications that also affect session replication. Following is a list of some of these programming considerations:
HttpSession in the web tier. Since the enterprise application is going to support various types of clients (web clients, Java applications, and other EJBs) storing the data in the web tier will result in duplicate data storage on the clients. Therefore, stateful session beans should be considered for storing the session state in these scenarios. A stateless session bean reconstructs conversational state for each invocation. The state may have to be rebuilt by retrieving data from a database. This completely defeats the purpose of using stateless session beans to improve performance and scalability, and can severely degrade performance.
Pages: 1, 2 |