ONJava.com -- The Independent Source for Enterprise Java

Two Servlet Filters Every Web Application Should Have

Caching Content Using a Servlet Filter

The second filter this article addresses is a cache filter. Caching is helpful because it saves time and processing power. The basic idea is that it takes time for a web application to generate content, and in many situations, the content won't change between different requests to a particular servlet or JSP. Therefore, if you simply save the exact output (e.g., HTML) that is produced for a given URI, you can recycle this content several times before having the web application generate it again. Assuming your cache is faster than the web application -- it almost always is -- the end result is that you save a large amount of the time and processing power required to generate a dynamic response. Currently, there is no official standard for caching web application content. However, building a simple, generic caching system is a straightforward process.



We will now build a simple cache filter. In general, caching at the filter level is most helpful, as it allows you to save the entire response that any particular JSP or servlet generates. However, you can certainly cache elsewhere; for instance, with a set of custom tags that auto-cache any content placed between them, or with a custom Java class that caches information retrieved from a database. Caching possibilities are endless, but for practical purposes we shall focus on implementing caching at the filter level.

Before seeing some code, let's make sure what I mean by "caching at the filter level" is clear. "Caching at the filter level" simply means using a standard servlet filter that will intercept all requests to a web application and attempt to intelligently use the cache. Should a valid cached copy of the content exist, the filter will immediately respond to the request by sending that cached copy. However, if no cached copy exists, the filter will pass the request on to its intended endpoint, usually a servlet or JSP, and the response will be generated as it normally is. Once a response is successfully generated, it will also be cached, so that on future requests to the same resource, the cache may be used.
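
The check-then-generate flow described above can be sketched without any servlet machinery. The following is only an illustration, not the article's filter: the class name ResponseCache, the fetch() method, and the in-memory map are assumptions made for this sketch, and a Supplier stands in for the servlet or JSP that would normally generate the response.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical in-memory stand-in for the filter's check-then-generate flow. */
public class ResponseCache {
  private final Map<String, byte[]> store = new ConcurrentHashMap<>();

  /**
   * Returns the cached bytes for key, or runs the generator,
   * stores what it produced, and returns that.
   */
  public byte[] fetch(String key, Supplier<byte[]> generator) {
    byte[] cached = store.get(key);
    if (cached != null) {
      return cached;                // cache hit: skip the expensive work
    }
    byte[] fresh = generator.get(); // cache miss: generate the response
    store.put(key, fresh);          // remember it for later requests
    return fresh;
  }

  public static void main(String[] args) {
    ResponseCache cache = new ResponseCache();
    int[] runs = {0};
    // stands in for a slow servlet or JSP
    Supplier<byte[]> page = () -> {
      runs[0]++;
      return "<html>hello</html>".getBytes(StandardCharsets.UTF_8);
    };
    cache.fetch("/index.jsp", page);
    cache.fetch("/index.jsp", page);
    System.out.println("generator ran " + runs[0] + " time(s)"); // 1
  }
}
```

Note that two simultaneous first requests for the same key could both run the generator in this sketch; that costs some duplicated work but does not corrupt the cache.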

Understand that as this filter is intended to be used on an entire web application, it can cache all of the various responses from different servlets and JSPs. Think about how this is possible: each servlet or JSP will likely produce a different response. The filter will need to be able to distinguish between different responses, store the appropriate content somewhere, and correctly match a cached copy of the content to an incoming request. Doing all of this is no problem at all -- different requests can almost always be distinguished by the requested URI, and the same information can be used to identify cached resources. Cached content can be stored either in memory, on the hard disk, or via any other method your server allows for -- usually, the hard disk is a great solution. With all of that said, here is the code for a filter that caches content in the web application's temporary directory. The full code is given below, and important parts of the code are highlighted after the listing.

package com.jspbook;

import java.io.*;
import javax.servlet.*;
import javax.servlet.http.*;
import java.util.Calendar;

public class CacheFilter implements Filter {
  ServletContext sc;
  FilterConfig fc;
  long cacheTimeout = Long.MAX_VALUE;

  public void doFilter(ServletRequest req,
                       ServletResponse res,
                       FilterChain chain)
      throws IOException, ServletException {
    HttpServletRequest request =
        (HttpServletRequest) req;
    HttpServletResponse response =
        (HttpServletResponse) res;

    // check if was a resource that shouldn't be cached.
    String r = sc.getRealPath("");
    String path = 
        fc.getInitParameter(request.getRequestURI());
    if (path!= null && path.equals("nocache")) {
      chain.doFilter(request, response);
      return;
    }
    path = r+path;

    String id = request.getRequestURI() + 
        request.getQueryString();
    File tempDir = (File)sc.getAttribute(
      "javax.servlet.context.tempdir");

    // get possible cache
    String temp = tempDir.getAbsolutePath();
    File file = new File(temp+id);

    // get current resource
    if (path == null) {
      path = sc.getRealPath(request.getRequestURI());
    }
    File current = new File(path);

    try {
      long now =
        Calendar.getInstance().getTimeInMillis();
      //set timestamp check
      if (!file.exists() || (file.exists() &&
          current.lastModified() > file.lastModified()) ||
          cacheTimeout < now - file.lastModified()) {
        // create the cache file's parent directories;
        // getParentFile() works regardless of the
        // platform's path separator
        file.getParentFile().mkdirs();
        ByteArrayOutputStream baos =
            new ByteArrayOutputStream();
        CacheResponseWrapper wrappedResponse =
          new CacheResponseWrapper(response, baos);
        chain.doFilter(req, wrappedResponse);

        FileOutputStream fos = new FileOutputStream(file);
        fos.write(baos.toByteArray());
        fos.flush();
        fos.close();
      }
    } catch (ServletException e) {
      if (!file.exists()) {
        throw new ServletException(e);
      }
    }
    catch (IOException e) {
      if (!file.exists()) {
        throw e;
      }
    }

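    FileInputStream fis = new FileInputStream(file);
    try {
      String mt = sc.getMimeType(request.getRequestURI());
      response.setContentType(mt);
      ServletOutputStream sos = res.getOutputStream();
      // copy in blocks rather than one byte at a time
      byte[] buffer = new byte[8192];
      int n;
      while ((n = fis.read(buffer)) != -1) {
        sos.write(buffer, 0, n);
      }
    } finally {
      fis.close(); // always release the file handle
    }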
  }

  public void init(FilterConfig filterConfig) {
    this.fc = filterConfig;
    String ct =
        fc.getInitParameter("cacheTimeout");
    if (ct != null) {
      cacheTimeout = 60*1000*Long.parseLong(ct);
    }
    this.sc = filterConfig.getServletContext();
  }

  public void destroy() {
    this.sc = null;
    this.fc = null;
  }
}

First note that the code is part of the com.jspbook package. This code is the Servlet-2.4-compliant cache filter that is detailed in the book. It is tested code that is used in several web applications, and is maintained at the book's support site, http://www.jspbook.com. This is no contrived example; it is serious code.

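The next thing I'd like to draw attention to is how the filter identifies cached responses and saves them to the local hard disk. As mentioned before the code, the filter uses the request URI and any parameters in the query string to generate a unique name for the cache. Note that getQueryString() returns null when a request has no query string, so in that case the literal text "null" is appended to the URI; this is harmless for lookups because the same request always yields the same name.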

String id = request.getRequestURI()+request.getQueryString();
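
As a point of comparison, here is a sketch of a different way to turn that id into a file name. It is not what CacheFilter.java does: instead of appending the raw id to the temporary directory's path, it hashes the id with SHA-256 so that the file name never contains characters such as "?" or "..". The class CacheFileNames and both of its methods are hypothetical names introduced for this illustration.

```java
import java.io.File;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper: maps a request to a cache file by hashing its id. */
public class CacheFileNames {

  /** Builds the id from the URI plus the query string, if there is one. */
  public static String idFor(String requestUri, String queryString) {
    return queryString == null
        ? requestUri
        : requestUri + "?" + queryString;
  }

  /** Returns a file inside cacheDir named after the SHA-256 hex of id. */
  public static File fileFor(File cacheDir, String id) {
    try {
      MessageDigest md = MessageDigest.getInstance("SHA-256");
      byte[] digest = md.digest(id.getBytes(StandardCharsets.UTF_8));
      StringBuilder hex = new StringBuilder();
      for (byte b : digest) {
        hex.append(String.format("%02x", b)); // two hex digits per byte
      }
      return new File(cacheDir, hex + ".cache");
    } catch (NoSuchAlgorithmException e) {
      // SHA-256 is one of the algorithms every Java platform must provide
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    File dir = new File(System.getProperty("java.io.tmpdir"));
    System.out.println(fileFor(dir, idFor("/shop/list.jsp", "page=2")));
    System.out.println(fileFor(dir, idFor("/shop/list.jsp", null)));
  }
}
```

The trade-off is that hashed names are opaque, so you can no longer find a page's cache file by browsing the directory; the article's approach keeps the names readable.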

Once the filter has this unique name, it uses the name to check if the resource exists in the web application's cache. If it does, the cached copy is sent and the filter does not pass the request and response down the filter chain. If no cache exists, the filter passes the request and response down the filter chain so that the desired JSP or servlet can generate a response. Once the response is made, the cache filter sends it to the client and makes a copy of the response in the web application's cache.

// use the web applications temporary work directory
File tempDir =
    (File)sc.getAttribute("javax.servlet.context.tempdir");

// look to see if a cached copy of the response exists
String temp = tempDir.getAbsolutePath();
File file = new File(temp+id);

// get a reference to the servlet/JSP
// responsible for this cache
if (path == null) {
  path = sc.getRealPath(request.getRequestURI());
}
File current = new File(path);

// check if the cache exists and is newer than the
// servlet or JSP responsible for making it. 
try {
  long now = Calendar.getInstance().getTimeInMillis();
  //set timestamp check
  if (!file.exists() || (file.exists() &&
      current.lastModified() > file.lastModified()) ||
      cacheTimeout < now - file.lastModified()) {

    // if not, invoke chain.doFilter() and
    // cache the response
    // create the cache file's parent directories;
    // getParentFile() works regardless of the
    // platform's path separator
    file.getParentFile().mkdirs();
    ByteArrayOutputStream baos =
        new ByteArrayOutputStream();
    CacheResponseWrapper wrappedResponse =
      new CacheResponseWrapper(response, baos);
    chain.doFilter(req, wrappedResponse);

    FileOutputStream fos = new FileOutputStream(file);
    fos.write(baos.toByteArray());
    fos.flush();
    fos.close();
  }
} catch (ServletException e) {
  if (!file.exists()) {
    throw new ServletException(e);
  }
}
catch (IOException e) {
  if (!file.exists()) {
    throw e;
  }
}

// return to the client the cached resource.
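FileInputStream fis = new FileInputStream(file);
try {
  String mt = sc.getMimeType(request.getRequestURI());
  response.setContentType(mt);
  ServletOutputStream sos = res.getOutputStream();
  // copy in blocks rather than one byte at a time
  byte[] buffer = new byte[8192];
  int n;
  while ((n = fis.read(buffer)) != -1) {
    sos.write(buffer, 0, n);
  }
} finally {
  fis.close(); // always release the file handle
}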
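
The excerpt hands chain.doFilter() a wrapped response so that whatever the servlet or JSP writes lands in a ByteArrayOutputStream instead of going straight to the client. The sketch below illustrates only that capturing idea in plain Java. It is an assumption-laden analogue, not the article's support class: the name CapturingStream is made up, and it extends java.io.OutputStream rather than ServletOutputStream so that it compiles without the servlet API.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

/**
 * Hypothetical plain-Java analogue of a capturing response stream.
 * Everything written to it is forwarded to the target stream,
 * which in the filter's case would be the in-memory buffer.
 */
public class CapturingStream extends OutputStream {
  private final OutputStream target;
  private boolean closed = false;

  public CapturingStream(OutputStream target) {
    this.target = target;
  }

  @Override
  public void write(int b) throws IOException {
    ensureOpen();
    target.write(b);
  }

  @Override
  public void write(byte[] buf, int off, int len) throws IOException {
    ensureOpen();
    target.write(buf, off, len);
  }

  @Override
  public void flush() throws IOException {
    target.flush();
  }

  @Override
  public void close() throws IOException {
    closed = true;
    target.close();
  }

  /** Rejects writes after close(), as a well-behaved stream should. */
  private void ensureOpen() throws IOException {
    if (closed) {
      throw new IOException("stream already closed");
    }
  }

  public static void main(String[] args) throws IOException {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    OutputStream out = new CapturingStream(baos);
    out.write("<html>cached</html>".getBytes(StandardCharsets.UTF_8));
    out.close();
    System.out.println(baos.size() + " bytes captured");
  }
}
```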

And that is a basic cache filter. Two support classes are needed -- CacheResponseStream and CacheResponseWrapper -- but they are nothing more than implementations of the ServletOutputStream class and HttpServletResponseWrapper class that are appropriate for CacheFilter.java. The full source code for everything is given at the end of this article, but to keep things moving along, I'll have you use a JAR file that includes the compiled cache filter. If you didn't already do so for the compression filter, grab a copy of jspbook.jar and put it in the WEB-INF/lib directory of your favorite web application. Then deploy the filter to intercept all requests for resources ending in .jsp, and reload the web application for the changes to take effect. Next we will make a simple JSP to test the code.

Here is the complete code for a simple JSP that tests the cache filter. The JSP wastes time and processing power by executing several loops. Save the following as TimeMonger.jsp somewhere in your web application.

<html>
  <head>
    <title>Cache Filter Test</title>
  </head>
  <body>
A test of the cache Filter.
<%
 // mock time-consuming code
 for (int i=0;i<100000;i++) {
   for (int j=0;j<1000;j++) {
     //noop
   }
 }
%>
  </body>
</html>

Browse to TimeMonger.jsp for the ever-so-sophisticated cache test. Notice how long it takes to generate the page; it should take several seconds due to the embedded for loops. Now browse to the page once again; notice that it appears near-instantly. Continue browsing to the page and notice it will continue to appear near-instantly. This is the cache filter in action. After the page is generated once, a copy is saved in your web application's temporary work directory (on Tomcat, this is in a subdirectory of ./work), and on subsequent requests, this cache is used instead of executing the JSP. You can test this by deleting the cache file located in your web application's temporary directory, and browsing to the page. Once again it will take several seconds to load.

We can quantify the time difference by making a simple JSP that issues two HTTP requests and measures the time it takes for each request to be answered. To ensure that the test works, we will have to delete the cache before running the JSP. This will force the first HTTP request to execute the JSP and allow the second request to hit the cache. Here is the code for the needed JSP.

<%@ page import="java.util.*,
                 java.net.*,
                 java.io.*" %>
<%
  String url = request.getParameter("url");
  long[] times = new long[2];
  if (url != null) {
    for (int i=0;i<2;i++) {
      long start =
        Calendar.getInstance().getTimeInMillis();
      URL u = new URL(url);
      HttpURLConnection huc =
        (HttpURLConnection)u.openConnection();
      huc.setRequestProperty("user-agent",
                             "Mozilla(MSIE)");
      huc.connect();
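      ByteArrayOutputStream baos =
        new ByteArrayOutputStream();
      InputStream is = huc.getInputStream();
      // read each byte exactly once; calling read() in both
      // the loop test and the body would drop every other byte
      for (int b = is.read(); b != -1; b = is.read()) {
        baos.write(b);
      }
      is.close();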
      long stop =
        Calendar.getInstance().getTimeInMillis();
      times[i] = stop-start;
    }
  }
  request.setAttribute("t1", new Long(times[0]));
  request.setAttribute("t2", new Long(times[1]));
  request.setAttribute("url", url);

%><html>
<head>
  <title>Cache Test</title>
</head>
<body>
<h1>Cache Test Page</h1>
Enter a URL to test.
<form method="POST">
<input name="url" size="50">
<input type="submit" value="Check URL">
</form>
 <p><b>Testing: ${url}</b></p>
 Request 1: ${t1} milliseconds<br/>
 Request 2: ${t2} milliseconds<br/>
 Time saved: ${t1-t2} milliseconds<br/>
</body>
</html>

Save the above code in your web application. Next, delete the temporary directory of that web application in order to ensure that there is no cache. Now browse to the cache test page. Initially, a blank page appears with a simple HTML form, as shown here.

Figure 4. Blank cache test page

Just as with the compression test page, fill out the URL that the cache-testing JSP should check. Any value will do, but for this example let us test TimeMonger.jsp -- a JSP we know takes a relatively long amount of time to execute. Here is what the cache-testing JSP returns after testing TimeMonger.jsp.

Figure 5. Cache test page used on TimeMonger.jsp

Notice that TimeMonger.jsp normally takes about five seconds to execute, but when a cache is used, it takes a hundredth of the time. If you like, try the page again and notice that the cache will continue to be used; each response will take about 50 milliseconds. However, if you delete the cache and force the JSP to execute, you will once again see the page take about five seconds to execute before it is once again cached.

The point to see is that CacheFilter.java is saving a copy of the HTML it used in a response and reusing it instead of executing dynamic code. This results in time-consuming and processor-intensive code being skipped. In TimeMonger.jsp, the skipped code was a few for loops -- admittedly, a poor example. But understand that the dynamic code can be anything, such as a database query or an execution of any custom Java code. The time it takes to retrieve content from the cache will be about the same for pages of similar size; in this example, it was about 50 milliseconds. Therefore, you can cut the response time of just about any dynamic page to roughly 50 milliseconds, no matter how time-intensive the page is.

Cache Filter Summary and Good Practice Tips

Once again you have been presented with a filter that is incredibly helpful and near-trivial to use. Caching can save enormous amounts of your server's time and processing power, and caching is as easy to implement as putting a copy of jspbook.jar in your web application's WEB-INF/lib directory and deploying the filter to intercept requests to any resource you want to cache. I suggest you use a caching filter as much as possible in order to speed your web application up to peak performance.

While caching can save a web application a lot of time and processing power, and it can make even the most complex server-side code appear to execute unbelievably fast, caching is not suitable for everything. Some pages can't be cached because the page's content must be dynamically generated each time the page is viewed -- for instance, a web site that lists stock quotes. Often, though, resources that are supposedly always dynamic can really be cached for short periods of time. For example, consider news.google.com: content is cached for a few minutes at a time to save server-side resources, but the cache is updated quickly enough to make the site appear to be completely dynamic. In the given cache filter code, you can configure whether the filter caches a particular resource at all, and how long the filter uses a cache before updating it. Both are set through the filter's initialization parameters.

  <filter>
    <filter-name>CacheFilter</filter-name>
    <filter-class>com.jspbook.CacheFilter</filter-class>
    <init-param>
      <param-name>/timemonger.jsp</param-name>
      <param-value>nocache</param-value>
    </init-param>
    <init-param>
      <param-name>cacheTimeout</param-name>
      <param-value>1</param-value>
    </init-param>
  </filter>
  <filter-mapping>
    <filter-name>CacheFilter</filter-name>
    <url-pattern>*.jsp</url-pattern>
  </filter-mapping>

To tell the cache filter that a resource shouldn't be cached, set an initialization parameter with the same name as the resource's request URI and give it the value nocache. The name must match the value of request.getRequestURI() exactly; that value includes the web application's context path and is case-sensitive. To configure how long the filter waits before updating cached content, set the cacheTimeout initialization parameter to a numerical value that represents the number of minutes a cache is valid. Both of these features are specific only to this cache filter. Feel free to examine CacheFilter.java to see exactly how they are implemented.
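
The two settings can be modeled in isolation to see how they behave. This is a minimal sketch that mirrors the arithmetic in CacheFilter.java (minutes converted to milliseconds, and a default of Long.MAX_VALUE when no timeout is given); the class CacheSettings, its method names, and the plain Map standing in for FilterConfig are assumptions made for the illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical holder for the nocache and cacheTimeout settings. */
public class CacheSettings {
  private final Map<String, String> initParams;
  private final long timeoutMillis;

  public CacheSettings(Map<String, String> initParams) {
    this.initParams = initParams;
    String minutes = initParams.get("cacheTimeout");
    // no timeout configured means cached copies never expire by age
    this.timeoutMillis = (minutes == null)
        ? Long.MAX_VALUE
        : 60L * 1000L * Long.parseLong(minutes);
  }

  /** True when the URI has an init parameter with the value "nocache". */
  public boolean isExcluded(String requestUri) {
    return "nocache".equals(initParams.get(requestUri));
  }

  /** True when a copy written at cachedAt is older than the timeout. */
  public boolean isExpired(long cachedAt, long now) {
    return timeoutMillis < now - cachedAt;
  }

  public static void main(String[] args) {
    Map<String, String> params = new HashMap<>();
    params.put("/timemonger.jsp", "nocache");
    params.put("cacheTimeout", "1"); // one minute

    CacheSettings settings = new CacheSettings(params);
    System.out.println(settings.isExcluded("/timemonger.jsp")); // true
    System.out.println(settings.isExcluded("/index.jsp"));      // false
    System.out.println(settings.isExpired(0L, 30_000L));        // false
    System.out.println(settings.isExpired(0L, 90_000L));        // true
  }
}
```

With a one-minute timeout, a copy that is 30 seconds old is still served, while one that is 90 seconds old is regenerated.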

In general, a cache filter is a very powerful enhancement to add to a web application. Cached content can be served to users as fast as the server can read files from disk (or memory, if you keep the cache in RAM), which is almost always much faster than executing a servlet or JSP, especially complex, database-driven pages. However, caching must be done in an intelligent manner. Some pages simply can't be cached, or they can only be cached for a few minutes at a time. Make sure you cache as much of your web application's content for as long as you can, and be sure to configure the cache filter to appropriately handle pages that either shouldn't be cached or should only be cached for short periods of time.

Conclusion

Every web application should have a caching filter and a compression filter. These two filters optimize how quickly a web application generates content and how long it takes the content to be sent across the World Wide Web, both of which are arguably the most important tasks a web application performs. The code presented in this article provides a good implementation of each of these filters. The code is both free and open source. If you don't want to build your own caching and compression support, simply deploy the jspbook.jar with your web application and reap the rewards. If you do wish to develop your own caching and compression support, you have the full code to both of these filters, and you can get any updates to the code from the book's support site, www.jspbook.com. Take the code and go!

Links

  • Servlets and JSP; the J2EE Web Tier Book Support Site. Check here for the latest code for both the cache and compression filter. The authors actively maintain the book's code, and attempt to fix any bugs that may be present. You can also find lots of other free code examples and excerpts from the book itself.
  • jspbook.jar. A ready-to-use JAR with compiled versions of both the cache and compression filter.
  • jspbook.zip. All source code for this article in one ZIP.

Jayson Falkner is a J2EE developer, student, and webmaster of JSP Insider.

