This excerpt is Chapter 12 from Java Servlet Programming, 2nd Edition, published in April 2001 by O'Reilly.
This chapter discusses enterprise servlets. The term enterprise is used all the time with Java these days, but what does it mean? According to my trusty and beat-up copy of The American Heritage Dictionary (so old it's priced at $1.95) the word enterprise has three definitions:
An undertaking, esp. one of some scope and risk
A business
Readiness to venture; initiative
It's a surprisingly close definition to what people mean when they say enterprise Java and enterprise servlets. We can merge the traditional definitions to create a modern definition:
Readiness to support a business undertaking of large scope
In other words, enterprise servlets are servlets designed to support business-oriented large-scale web sites -- high-traffic, high-reliability sites that have extra demands for scalability, load balancing, failover support, and integration with other Java 2, Enterprise Edition (J2EE) technologies.
As servlets have become increasingly popular and robust, and as servlet containers have become more solid and featureful, a growing number of enterprise sites are being built using servlets. Writing servlets for these sites differs from writing servlets for traditional sites, and in this chapter we'll discuss the special requirements and abilities of these enterprise servlets.
For high-traffic and/or high-reliability sites, it's often desirable to distribute the site's content and processing duties across multiple backend servers. This distribution allows multiple servers to share the load, increasing the number of simultaneous requests that can be handled and providing failover so the site can remain up even when one particular component crashes.
Distribution isn't appropriate for every site. Creating and maintaining a distributed site can be significantly more complicated than doing the same for a standalone site and can be more costly as well in terms of load-balancing hardware and/or software requirements. Distribution also doesn't tend to provide a significant performance benefit until the server is under extreme load. When presented with a performance problem, it's often easiest to "throw hardware at the problem" by installing a single higher-end machine rather than trying to share the load between two underperforming machines.
Still, there are many sites that need to scale beyond the capabilities of a single machine and that need a level of reliability no single machine can offer. These are the sites that need to be distributed.
The programming requirements for a distributable servlet are much stricter than the requirements for a nondistributable servlet. A distributable servlet must be written following certain rules so that different instances of the servlet can execute on multiple backend machines. Any programmer assumptions that there's only one servlet instance, one servlet context, one JVM, or one filesystem have the potential to cause serious problems.
For more information on Enterprise JavaBeans see http://java.sun.com/products/ejb and Enterprise JavaBeans by Richard Monson-Haefel (O'Reilly).
To learn how servlets can be distributed, look at Enterprise JavaBeans (EJB) technology, a server-side component model for implementing distributed business objects and the technology that's at the heart of J2EE. EJBs are designed from the ground up as distributable objects. An EJB implements business logic and lets the container (essentially the server) in which it runs manage services such as transactions, persistence, concurrency, and security. An EJB may be distributed across a number of backend machines and may be moved between machines at the container's discretion. To enable this distribution model, EJBs must follow a strict specification-defined ruleset for what they can and cannot do. (See sidebar)
Servlets have no such specification-defined ruleset. This stems from their heritage as frontend server-side components, used to communicate with the client and call on the distributed EJB and not be distributed themselves. However, for high-traffic sites or sites that need high reliability, servlets too need to be distributed. We expect upcoming Servlet API versions to include a tighter definition for the implementation of distributed servlet containers.
The following are our own rules of thumb for writing servlets to be deployed in a distributed environment:
Consider that different instances of the servlet may exist on each different JVM and/or machine. Therefore, instance variables and static variables should not be used to store state. Any state should be held in an external resource such as a database or EJB (the servlet's name can be used in the lookup).
Consider that a different instance of the ServletContext may exist on each different JVM and/or machine. Therefore, the context should not be used to store application state. Any state should be held in an external resource such as a database or EJB (the context's path can be used in the lookup).
Consider that any object placed into an HttpSession should be capable of being moved to (or accessed from) a different machine. For example, the object can implement java.io.Serializable. Be aware that because sessions may migrate, the session unbind event may occur on a different machine than the session bind event!
Consider that files may not exist on all backend machines. Therefore, you should avoid using the java.io package for file access and use the getServletContext().getResource( ) mechanism instead -- or make sure all accessed files are replicated across all backend machines.
Consider that synchronization is not global and works only for the local JVM.
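As a rough illustration of the session rule above, the following sketch shows a session-bound object that implements java.io.Serializable, along with a helper that serializes and deserializes it in memory. That round trip approximates what a container might do when it moves a session to another machine. The ShoppingCart and SessionMigrationCheck names are hypothetical, and a real container is free to transfer objects some other way.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical session attribute; Serializable so a container can move it. */
final class ShoppingCart implements Serializable {
    private static final long serialVersionUID = 1L;
    private final List<String> items = new ArrayList<>();

    void add(String item) {
        items.add(item);
    }

    /** Returns a defensive copy of the cart contents. */
    List<String> items() {
        return new ArrayList<>(items);
    }
}

/** Hypothetical helper that mimics a session migration inside one JVM. */
final class SessionMigrationCheck {
    /** Serializes a value to bytes and reads it back as a new object. */
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            // The value (or something it references) could not be transferred.
            throw new IllegalStateException("Not transferable: " + value.getClass(), e);
        }
    }
}
```

Running a check like this against every class you store in a session, for instance in a unit test, is one way to catch a nonserializable field before the application is marked distributable.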
A web application whose components follow these rules can be marked distributable, and that marking allows the server to deploy the application across multiple backend machines. The distributable mark is placed within the web.xml deployment descriptor as an empty <distributable/> tag located between the application's description and its context parameters:
<web-app>
<description>
All servlets and JSPs are ready for distributed deployment
</description>
<distributable/>
<context-param>
<!-- ... -->
</context-param>
</web-app>
Applications are nondistributable by default, to allow the casual servlet programmer to author servlets without worrying about the extra rules for distributed deployment. Marking an application distributable does not necessarily mean the application will be split across different machines. It only indicates the capability of the application to be split. Think of it as a programmer-provided certification.
Servers do not enforce most of the preceding rules given for a distributed application. For example, a servlet is not barred from using instance and static variables nor barred from storing objects in its ServletContext, and a servlet may still directly access files using the java.io package. It's up to the programmer to ensure these abilities aren't abused. The only enforcement that the server may perform is throwing an IllegalArgumentException if an object bound to the HttpSession does not implement java.io.Serializable (and even that's optional because, as we'll see later, a J2EE-compliant server must allow additional types of objects to be stored in the session).
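To make the optional enforcement concrete, here is a minimal sketch of the kind of check a container might apply when an object is bound to a session in a distributable application. StrictSession is a hypothetical stand-in written for this illustration; it is not the real HttpSession and does not reflect any particular server's implementation.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a container-managed session; NOT javax.servlet.http.HttpSession. */
final class StrictSession {
    private final Map<String, Object> attributes = new HashMap<>();

    /** Binds a value, rejecting anything that cannot be serialized. */
    void setAttribute(String name, Object value) {
        if (value == null) {
            // Assumption for this sketch: binding null simply removes the attribute.
            attributes.remove(name);
            return;
        }
        if (!(value instanceof Serializable)) {
            throw new IllegalArgumentException(
                "Attribute '" + name + "' of type " + value.getClass().getName()
                + " does not implement java.io.Serializable");
        }
        attributes.put(name, value);
    }

    Object getAttribute(String name) {
        return attributes.get(name);
    }
}
```

With this sketch, binding a String succeeds, while binding a plain `new Object()` is rejected with an IllegalArgumentException. A J2EE-compliant server would need a looser check, since it must accept some additional non-Serializable types.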
Servlet distribution (often called clustering) is an optional feature of a servlet container, and servlet containers that do support clustering are free to do so in several different ways. There are four standard architectures, listed here from simplest to most advanced.
No clustering. All servlets execute within a single JVM, and the <distributable/> marker is essentially ignored. This design is simple, and works fine for a standard site. The standalone Tomcat server works this way.
Clustering support, no session migration, and no session failover. Servlets in a web application marked <distributable/> may execute across multiple machines. Nonsession requests are randomly distributed (modulo some weighting perhaps). Session requests are "sticky" and tied to the particular backend server on which they first start. Session data does not move between machines, and this has the advantage that sessions may hold nontransferable (non-Serializable) data and the disadvantage that sessions may not migrate to underutilized servers and a server crash may result in broken sessions. This is the architecture used by Apache/JServ and Apache/Tomcat. Sessions are tied to a particular host through a mechanism where the mod_jserv/mod_jk connector in Apache uses a portion of the session ID to indicate which backend JServ or Tomcat owns the session. Multiple instances of Apache may be used as well, with the support of load-balancing hardware or software.
Clustering support, with session migration, no session failover. This architecture works the same as the former, except a session may migrate from one server to another to improve the load balance. To avoid concurrency issues, any session migration is guaranteed to occur between user requests. The Servlet Specification makes this guarantee: "Within an application that is marked as distributable, all requests that are part of a session can only be handled on a single VM at any one time." All objects placed into a session that may be migrated must implement java.io.Serializable or be transferable in some other way.
Clustering support, with session migration and with session failover. A server implementing this architecture has the additional ability to duplicate the contents of a session so the crash of any individual component does not necessarily break a user's session. The challenge with this architecture is coordinating efficient and effective information flow. Most high-end servers follow this architecture.
The details on how to implement clustering vary by server and are a point on which server vendors actively compete. Look to your server's documentation for details on what level of clustering it supports. Another useful feature to watch for is session persistence, the background saving of session information to disk or database, which allows the information to survive server restarts and crashes.
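The "sticky" routing used by the second architecture can be sketched roughly as follows. This is an illustration only, not the actual mod_jserv or mod_jk logic: the `<id>.<route>` session ID layout, the backend names, and the round-robin fallback (used here in place of random distribution so the behavior is predictable) are all assumptions, as is the StickyRouter class itself.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical connector-side router that keeps a session on its owning backend. */
final class StickyRouter {
    private final List<String> backends;

    StickyRouter(List<String> backends) {
        if (backends.isEmpty()) {
            throw new IllegalArgumentException("need at least one backend");
        }
        this.backends = new ArrayList<>(backends);
    }

    /**
     * Picks a backend for a request.
     *
     * @param sessionId      the session ID, or null for a nonsession request;
     *                       assumed to end in ".<route>" naming the owning backend
     * @param requestCounter a running request count, used for round-robin fallback
     */
    String route(String sessionId, int requestCounter) {
        if (sessionId != null) {
            int dot = sessionId.lastIndexOf('.');
            if (dot >= 0) {
                String owner = sessionId.substring(dot + 1);
                if (backends.contains(owner)) {
                    return owner; // sticky: the session stays where it started
                }
            }
        }
        // No session, or the owner is unknown: spread the load across backends.
        return backends.get(Math.floorMod(requestCounter, backends.size()));
    }
}
```

Given backends `tomcat1` and `tomcat2`, a request carrying the session ID `abc123.tomcat2` is always sent to `tomcat2`, while nonsession requests alternate between the two. Note what the sketch does not do: if `tomcat2` crashes, nothing here recovers the session data, which is exactly the weakness of this architecture.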
Throughout the rest of this book, servlets have been used as a standalone technology built upon the standard Java base. Servlets have another life, however, where they act as an integral piece of what's known as Java 2, Enterprise Edition, or J2EE for short.
Most pronounce J2EE as J-2-E-E but those who know it best at Sun just say "jah-too-ee."
J2EE breaks enterprise application development into six distinct roles. Of course, an individual may participate in more than one role and multiple individuals may work together in a given role.
The division of labor between component provider, assembler, and deployer has an impact on how we (as servlet programmers in the component provider role) behave. Specifically, we should design our code to make external dependencies clear for the assembler, and furthermore we should use mechanisms that allow the deployer to satisfy these dependencies without modifying the files received from the assembler. That means no deployer edits to the web.xml file! Why not? Because J2EE applications are assembled into Enterprise Archive (.ear) files of which a contained web application's web.xml file is but one uneditable part.
This sounds more difficult than it actually is. J2EE provides a standard mechanism to achieve this abstraction using JNDI and a few special tags in the web.xml deployment descriptor. JNDI is an object lookup mechanism, a way to bind objects under certain paths and locate them later using that path. You can think of it like an RMI registry, except it's more general with support for accessing a range of services including LDAP and NIS (and even, in fact, the RMI registry!). An assembler declares external dependencies within the web.xml using special tags, a deployer satisfies these dependencies using server-specific tools, and at runtime our Java code uses the JNDI API to access the external resources -- kindly placed there by the J2EE-compliant server. All goals are satisfied: our Java code remains portable between J2EE-compliant servers, and the deployer can satisfy the code's external dependencies without modifying the files received from the assembler. There's even enough flexibility left over for server vendors to compete on implementations of the standard.
Context init parameters serve a useful purpose with servlets, but there's a problem with context init parameters in the J2EE model: any change to a parameter value requires a modification to the web.xml file. For parameter values that may need to change during deployment, it's better to use environment entries instead, as indicated by the <env-entry> tag. The <env-entry> tag may contain a <description>, <env-entry-name>, <env-entry-value>, and <env-entry-type>. The following <env-entry> specifies whether the application should enable sending of PIN codes by mail:
<env-entry>
<description>Send pincode by mail</description>
<env-entry-name>mailPincode</env-entry-name>
<env-entry-value>false</env-entry-value>
<env-entry-type>java.lang.Boolean</env-entry-type> <!-- FQCN -->
</env-entry>
The <description> explains to the deployer the purpose of this entry. It's optional but a good idea to provide. The <env-entry-name> is used by Java code as part of the JNDI lookup. The <env-entry-value> defines the default value to be presented to the deployer. It's optional, but not specifying a value requires the deployer to provide one. The <env-entry-type> represents the fully qualified class name (FQCN) of the entry. The type may be a String, Byte, Short, Integer, Long, Boolean, Double, or Float (all with their full java.lang qualification). The type helps the deployer know what's expected. If you're familiar with the EJB deployment descriptor, these tags may look familiar; they have the same names and semantics in EJB as well.
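To show how the declared type constrains the value, here is a small sketch of the conversion a deployment tool might perform when turning an <env-entry-value> string into an object of the declared <env-entry-type>. The EnvEntryValues class is hypothetical; it assumes each value is parsed with the standard valueOf method of its wrapper class, which may differ from what a given server actually does.

```java
/** Hypothetical converter from an env-entry value string to its declared type. */
final class EnvEntryValues {
    /**
     * Parses a value according to the fully qualified class name of its type.
     * Only the eight types listed in the text are accepted.
     */
    static Object parse(String type, String value) {
        switch (type) {
            case "java.lang.String":  return value;
            case "java.lang.Byte":    return Byte.valueOf(value);
            case "java.lang.Short":   return Short.valueOf(value);
            case "java.lang.Integer": return Integer.valueOf(value);
            case "java.lang.Long":    return Long.valueOf(value);
            case "java.lang.Boolean": return Boolean.valueOf(value);
            case "java.lang.Double":  return Double.valueOf(value);
            case "java.lang.Float":   return Float.valueOf(value);
            default:
                throw new IllegalArgumentException("Unsupported env-entry-type: " + type);
        }
    }
}
```

For the mailPincode entry above, `EnvEntryValues.parse("java.lang.Boolean", "false")` yields `Boolean.FALSE`. A type outside the list, such as `java.util.Date`, is rejected, and a malformed number causes the wrapper class to throw a NumberFormatException.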
Java code can retrieve the <env-entry> values using JNDI:
Context initCtx = new InitialContext( );
Boolean mailPincode = (Boolean) initCtx.lookup("java:comp/env/mailPincode");
All entries are placed by the server into the java:comp/env context. If you're new to JNDI, you can think of this as a URL base or filesystem directory. The java:comp/env context is read-only and unique per web application, so if two different web applications define the same environment entry, the entries do not collide. The context abbreviations, by the way, stand for component environment.
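Because a server without J2EE support may not provide the java:comp/env context at all, code that must run in both kinds of servers may want a fallback. The following is a minimal sketch of one way to do that; the EnvEntries class and its lookupBoolean method are hypothetical helpers, and silently falling back to a default is a design choice you may not want if a missing entry should be treated as a deployment error.

```java
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;

/** Hypothetical helper for reading environment entries with a default. */
final class EnvEntries {
    /**
     * Looks up a Boolean environment entry under java:comp/env.
     * Returns the fallback if the entry is missing, has another type,
     * or no JNDI naming context is available.
     */
    static boolean lookupBoolean(String name, boolean fallback) {
        try {
            Context initCtx = new InitialContext();
            Object value = initCtx.lookup("java:comp/env/" + name);
            return (value instanceof Boolean) ? (Boolean) value : fallback;
        } catch (NamingException e) {
            // Covers both a missing entry and a missing naming provider.
            return fallback;
        }
    }
}
```

A servlet could then call `EnvEntries.lookupBoolean("mailPincode", false)` and behave sensibly whether or not the deployer supplied a value.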
Example 12-1 shows a servlet that displays all its environment entries, using the JNDI API to browse the java:comp/env context.
Example 12-1: Snooping the java:comp/env Context
import java.io.*;
import java.util.*;
import javax.servlet.*;
import javax.servlet.http.*;
import javax.naming.*;
public class EnvEntrySnoop extends HttpServlet {
  public void doGet(HttpServletRequest req, HttpServletResponse res)
                               throws ServletException, IOException {
    res.setContentType("text/plain");
    PrintWriter out = res.getWriter( );
    try {
      Context initCtx = new InitialContext( );
      // Named "bindings" because "enum" is a reserved word as of Java 5
      NamingEnumeration bindings = initCtx.listBindings("java:comp/env");
      // We're using JDK 1.2 methods; that's OK since J2EE requires JDK 1.2
      while (bindings.hasMore( )) {
        Binding binding = (Binding) bindings.next( );
        out.println("Name: " + binding.getName( ));
        out.println("Type: " + binding.getClassName( ));
        out.println("Value: " + binding.getObject( ));
        out.println( );
      }
    }
    catch (NamingException e) {
      e.printStackTrace(out);
    }
  }
}
Assuming the previous web.xml entry, the servlet would generate:
Name: mailPincode
Type: java.lang.Boolean
Value: false
Remember, a server that does not support J2EE is not required to support these tags or any of the tags we talk about in this section.
When the environment entry object is an EJB component, there's a special <ejb-ref> tag that must be used. It provides a way for servlets to get a handle to an EJB using an abstract name. The deployer ensures the availability of an appropriate bean at runtime based on the constraints given by the <ejb-ref> tag. The tag may contain a <description>, <ejb-ref-name>, <ejb-ref-type>, <home>, <remote>, and <ejb-link>. Here's a typical <ejb-ref>:
<ejb-ref>
<description>Cruise ship cabin</description>
<ejb-ref-name>ejb/CabinHome</ejb-ref-name>
<ejb-ref-type>Entity</ejb-ref-type>
<home>com.titan.cabin.CabinHome</home>
<remote>com.titan.cabin.Cabin</remote>
</ejb-ref>
The Servlet API 2.2 Specification states, "The |
These tags also have similar counterparts in EJB, and in fact this example is borrowed from the book Enterprise JavaBeans by Richard Monson-Haefel (O'Reilly). The <description> supports the deployer and is optional but recommended. The <ejb-ref-name> dictates the JNDI lookup name. It's recommended (but not required) that the name be placed within the ejb/ subcontext, making the full path to the bean java:comp/env/ejb/CabinHome. The <ejb-ref-type> must have a value of either Entity or Session, the two types of EJB components (see sidebar).
Finally, the <home> element specifies the fully qualified class name of the EJB's home interface, while the <remote> element specifies the FQCN of the EJB's remote interface.
A servlet would obtain a reference to the Cabin bean with the following code:
InitialContext initCtx = new InitialContext( );
Object ref = initCtx.lookup("java:comp/env/ejb/CabinHome");
CabinHome home =
(CabinHome) PortableRemoteObject.narrow(ref, CabinHome.class);
If the assembler writing the web.xml file has a specific EJB component in mind for an EJB reference, that information can be conveyed to the deployer with the addition of the optional <ejb-link> element. The <ejb-link> element should refer to the <ejb-name> of an EJB component registered in an EJB deployment descriptor within the same J2EE application. The deployer has the option to use the suggestion or override it. Here's an updated web.xml entry:
<ejb-ref>
<description>Cruise ship cabin</description>
<ejb-ref-name>ejb/CabinHome</ejb-ref-name>
<ejb-ref-type>Entity</ejb-ref-type>
<home>com.titan.cabin.CabinHome</home>
<remote>com.titan.cabin.Cabin</remote>
<ejb-link>CabinBean</ejb-link>
</ejb-ref>
Finally, for those times when the environment entry is a resource factory, there's a <resource-ref> tag to use. A factory is an object that creates other objects on demand. A resource factory creates resource objects, such as database connections or message queues.
The <resource-ref> tag may contain a <description>, <res-ref-name>, <res-type>, and <res-auth>. Here's a typical <resource-ref>:
<resource-ref>
<description>Primary database</description>
<res-ref-name>jdbc/primaryDB</res-ref-name>
<res-type>javax.sql.DataSource</res-type>
<res-auth>CONTAINER</res-auth>
</resource-ref>
The <description> again supports the deployer and is optional but recommended. The <res-ref-name> dictates the JNDI lookup name. It's recommended but not required to place the resource factories under a subcontext that describes the resource type:
jdbc/ for a JDBC javax.sql.DataSource factory
jms/ for a JMS javax.jms.QueueConnectionFactory or javax.jms.TopicConnectionFactory
mail/ for a JavaMail javax.mail.Session factory
url/ for a java.net.URL factory
The <res-type> element specifies the FQCN of the resource factory (not the created resource). The factory types in the preceding list are the standard types. A server has the option to support additional types; user factories cannot be used. The upcoming J2EE 1.3 specification proposes a "connector" mechanism to extend this model for user-defined factories.
The <res-auth> tells the server who is responsible for authentication. It can have two values: CONTAINER or SERVLET. If CONTAINER is specified, the servlet container (the J2EE server) handles authentication before binding the factory to JNDI, using credentials provided by the deployer. If SERVLET is specified, the servlet must handle authentication duties programmatically. To demonstrate:
InitialContext initCtx = new InitialContext( );
DataSource source =
(DataSource) initCtx.lookup("java:comp/env/jdbc/primaryDB");
// If "CONTAINER"
Connection con1 = source.getConnection( );
// If "SERVLET"
Connection con2 = source.getConnection("user", "password");
These tags too have similar counterparts in the EJB deployment descriptor. The only difference is that in EJB the two possible values for <res-auth> are Container and Application (note the inexplicable case difference).
The final difference between servlets in a standalone environment and servlets in a J2EE environment involves a subtle change to the rules for session distribution. While a standard web server is required to support only java.io.Serializable objects in the session for a distributable application, a J2EE-compliant server that supports a distributed servlet container must also support several additional types of objects:
javax.ejb.EJBObject
javax.ejb.EJBHome
javax.transaction.UserTransaction
javax.naming.Context for java:comp/env
All these are interfaces that do not implement Serializable. For transferring the objects the container may use its own custom mechanism, perhaps based on serialization or perhaps not. Additional class types may be supported at the server's discretion, but these are the only guaranteed types.
Copyright © 2007 O'Reilly Media, Inc.