Load Balancing Web Applications
Pages: 1, 2
Hardware Load Balancers
Hardware load balancers solve many of the problems faced by the round robin software solution through virtual IP addresses. The load balancer shows a single (virtual) IP address to the outside world, which maps to the addresses of each machine in the cluster. So, in a way, the load balancer exposes the IP address of the entire cluster to the world.
When a request comes to the load balancer, it rewrites the request's header to point to other machines in the cluster. If a machine is removed from the cluster, the request doesn't run the risk of hitting a dead server, since all of the machines in the cluster appear to have the same IP address. This address remains the same even if a node in the cluster is down. Moreover, cached DNS entries around the Internet aren't a problem. When a response is returned, the client sees it coming from the hardware load balancer machine. In other words, the client is dealing with a single machine, the hardware load balancer.
Server Load Balancing
Advantages of Hardware Load Balancers
Server affinity. The hardware load balancer reads the cookies or URL readings on each request made by the client. Based on this information, it can rewrite the header information and send the request to the appropriate node in the cluster, where its session is maintained.
Hardware load balancers can provide server affinity in HTTP communication, but not through a secure channel, such as HTTPS. In a secure channel, the messages are SSL-encrypted, and this prevents the load balancer from reading the session information.
High Availability Through Failover. Failover happens when one node in a cluster cannot process a request and redirects it to another. There are two types of failover:
- Request Level Failover. When one node in a cluster cannot process a request (often because it's down), it passes it along to another node.
- Transparent Session Failover. When an invocation fails, it's transparently routed to another node in the cluster to complete the execution.
Hardware load balancers provide request-level failover; when the load balancer detects that a particular node has gone down, it redirects all subsequent requests to that dead node to another active node in the cluster. However, any session information on the dead node will be lost when requests are redirected to a new node.
Transparent session failover requires execution knowledge for a single process in a node, since the hardware load balancer can only detect network-level problems, not errors. In the execution process of a single node, hardware load balancers do not provide transparent session failover. To achieve transparent session failover, the nodes in the cluster must collaborate among each other and have something like a shared memory area or a common database where all the session data is stored. Therefore, if a node in the cluster has a problem, a session can continue in another node.
- Metrics. Since all requests to a Web application must pass through the load-balancing system, the system can determine the number of active sessions, the number of active sessions connected in any instance, response times, peak load times, the number of sessions during peak load, the number of sessions during minimum load, and more. All this audit information is used to fine tune the entire system for optimal performance.
Disadvantages of Hardware Load Balancers
The drawbacks to the hardware route are the costs, the complexity of setting up, and the vulnerability to a single point of failure. Since all requests pass through a single hardware load balancer, the failure of that piece of hardware sinks the entire site.
Load Balancing HTTPS Requests
As mentioned above, it's difficult to load balance and maintain session information of requests that come in over HTTPS, as they're encrypted. The hardware load balancer cannot redirect requests based on the information in the header, cookies, or URL readings. There are two options to solve this problem:
- Web server proxies
- Hardware SSL decoders.
Implementing Web Server Proxies
A Web server proxy that sits in front of a cluster of Web servers takes all requests and decrypts them. Then it redirects them to the appropriate node, based on header information in the header, cookies, and URL readings.
The advantages of Web server proxies are that they offer a way to get server affinity for SSL-encrypted messages, without any extra hardware. But extensive SSL processing puts an extra load on the proxy.
Apache and Tomcat. In many serving systems, Apache and Tomcat servers work together to handle all HTTP requests. Apache handles the request for static pages (including HTML, JPEG, and GIF files), while Tomcat handles requests for dynamic pages (JSPs or servlets). Tomcat servers can also handle static pages, but in combined systems, they're usually set up to handle dynamic requests.
You can also configure Apache and Tomcat to handle HTTPS requests and to balance loads. To achieve this, you run multiple instances of Tomcat servers on one or more machines. If all of the Tomcat servers are running on one machine, they should be configured to listen on different ports. To implement load balancing, you create a special type of Tomcat instance, called a Tomcat Worker.
As shown in the illustration, the Apache Web server receives HTTP and HTTPS requests from clients. If the request is HTTPS, the Apache Web server decrypts the request and sends it to a Web server adapter, which in turn sends the request to the Tomcat Worker, which contains a load-balancing algorithm. Similar to the Web server proxy, this algorithm balances the load among Tomcat instances.
Hardware SSL Decoder
Finally, we should mention that there are hardware devices capable of decoding SSL requests. A complete description of them is beyond the scope of this article, but briefly, they sit in front of the hardware load balancer, allowing it to decrypt information in cookies, headers and URLs.
These hardware SSL decoders are faster than Web server proxies and are highly scalable. But as with most hardware solutions, they cost more and are complicated to set up and configure.
Vivek Viswanathan is an enterprise web application developer, whose interests include clustering, performance and optimization.
Return to ONJava.com.