ONJava.com    
 Published on ONJava.com (http://www.onjava.com/)


How to Publish Multiple Websites Using a Single Tomcat Web Application

by Satya Komatineni
08/30/2006

Knowledge Folders is a web application that holds and displays content for multiple users. I had been wondering if I could expose the content from this single web application as multiple websites with their own domain names. Could I use virtual hosts to do this? Or would I need to use reverse proxies? How and where would I register domain names? What entries would I need to make in Tomcat configuration files? How would I handle emails for these independent domains? What else would I need to do in my web application? What would the end result look like?

After a few weeks of effort, I was able to expose Knowledge Folders as multiple websites with their own domain names. It turned out I didn't need to go to reverse proxies for now, and could use virtual hosts instead. I was able to get my multiple domain names from GoDaddy.com. I was also able to use Tomcat host/alias settings to effectively route traffic from all of these domains to the same web app. Using the index.jsp of the web app, I was able to separate the content between different domains. After all of this effort, I ended up with a way to publish online websites very quickly and expose them as their own domains. The resulting websites have a number of features that static websites can't accomplish easily.

Background

I wrote Knowledge Folders a few years ago as a workaround for keeping my notes online. I used to keep these "rapid notes" using Microsoft Outlook and a few macros. In particular, the ability to file these notes into classified folders appealed to me. When I took that application to the web, it was natural for me to make it a multiuser system that allowed a number of users to manage their own notes and perhaps share them as well.

Originally, these notes were various SQL scripts that ran on a database. The initial release even had an execution engine to run these notes against a target database and return the results. I abandoned this later to focus on the transformation to come.

At about the same time, I was in search of something to document my open source tool Aspire/J2EE. I was looking more along the lines of wikis and weblogs. Unhappy with what I found, I changed Knowledge Folders to focus on documenting open source software. At this point, Knowledge Folders was basically a collection of accounts (or users), files, and folders in which knowledge was created and classified (hence the name).

Later that year, I introduced the idea of master pages (background HTML, similar to tiles) to give a facelift and proper presentation to the content. This took Knowledge Folders toward a content-management system, where content could be portrayed with appropriate backgrounds.

Later, I added some collaboration features and task management for individual users.

During this time, there was a single domain through which users accessed their accounts. Although not difficult, it was awkward to pass the account URLs for individual users to their friends or any intended audience. I wanted to expose each user as his/her own domain.

My Original Thought Involving Reverse Proxy Servers

Initially, the problem sounded like a case where I could have the individual domains pointing to a "piece of code" on the server that would in turn read the content from a single web app. This intermediate piece of code would somehow associate the incoming domain name to an account in Knowledge Folders and use Knowledge Folders as a source/sink to read/write web pages. In essence, it would be working like a proxy to the actual Knowledge Folders.

Wikipedia's definition of a reverse proxy implied it could be used for this purpose. (In fact, this may turn out to be a good solution in the future if I were to segregate content further. The document Reverse Proxy Patterns by Peter Sommerlad [PDF, 328 KB] might throw some light on the possibilities.) I also hoped to use the reverse proxy facility of Apache to accomplish my goal.

Although I have mentioned the links on reverse proxies here for further research, I will mention the key elements briefly.

What Are Reverse Proxies?

Reverse proxies are web servers that sit in front of other web servers, possibly internal to a corporate network. This indirection is useful in a number of situations. These proxy servers typically intercept communications from a browser and relay them to the back-end servers. Users are usually exposed only to the domain names of the reverse proxy servers and not to those of the back-end servers. The reverse proxy servers, in turn, call internal servers to fulfill the request. They typically terminate the incoming connection and open a separate connection to the target servers. As a result, implementing a proxy server reliably is not easy, because it must behave like a genuine target server while faithfully passing along all of the data and HTTP headers.

What Are Reverse Proxies Used For?

Reverse proxies are routinely used to offload SSL processing. In this scenario, https traffic is routed to a reverse proxy server, which converts it from https to http and forwards the request to an internal HTTP server. In this approach, a single reverse proxy server can offload SSL (and hence save certificates) for multiple back-end servers. Nevertheless, this approach sometimes poses issues for sendRedirect on the target server: when sendRedirect is used, a relative URL is sometimes translated into an absolute URL with the wrong scheme (http instead of https). Fortunately, this can be resolved by rewriting sendRedirect.
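
The scheme problem with redirects can be illustrated with a small sketch. This is not the rewrite used in any particular product; it only shows the idea of building the absolute URL from the scheme the browser used rather than the scheme the back-end server saw. The class and method names, the host www.example.com, and the idea that the proxy tells the back end the original scheme (for example through a header such as X-Forwarded-Proto) are all assumptions made for illustration.

```java
final class RedirectSchemeDemo {

    // clientScheme: the scheme the browser used to reach the proxy. How the back end
    // learns it is an assumption here (for example, a header added by the proxy).
    // backendScheme: the scheme of the proxy-to-server hop, used only as a fallback.
    static String absoluteRedirect(String clientScheme, String backendScheme,
                                   String externalHost, String path) {
        boolean known = clientScheme != null && !clientScheme.isEmpty();
        String scheme = known ? clientScheme : backendScheme;
        return scheme + "://" + externalHost + path;
    }

    public static void main(String[] args) {
        // Browser used https, proxy forwarded over http: the redirect must stay on https.
        System.out.println(absoluteRedirect("https", "http", "www.example.com", "/webapp/login.html"));
        // No information from the proxy: fall back to the scheme the server saw.
        System.out.println(absoluteRedirect(null, "http", "www.example.com", "/webapp/login.html"));
    }
}
```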

Reverse proxies can also be used to expose a single domain for multiple web applications on the back end. Each separate server can be mapped to a path under the main domain. There are also approaches that provide role-based security by using proxy servers as gatekeepers that monitor every URL.

Implications To Web Application Development in the Face of Reverse Proxies

It is imperative for all of the URLs to be relative for reverse proxies to work well. This is because the reverse proxy is rewriting the page using a different (and typically external) name. Internal names are unknown to the outside world. So, URLs on your web pages delivered by back-end servers should typically read:

/webapp/resource1.html
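
To see why such a URL survives a reverse proxy, here is a minimal sketch that resolves links the way a browser would, using the standard java.net.URI class. The host names www.example.com and internal-host are hypothetical; only the path /webapp/resource1.html comes from the text above.

```java
import java.net.URI;

final class RelativeUrlDemo {

    // Resolve a link found on a page against the URL the browser used to fetch that page.
    static String resolve(String pageUrl, String link) {
        return URI.create(pageUrl).resolve(link).toString();
    }

    public static void main(String[] args) {
        // The page as the outside world sees it, through the proxy's public name.
        String external = "https://www.example.com/webapp/index.html";

        // A host-relative link stays on whatever host the browser used.
        System.out.println(resolve(external, "/webapp/resource1.html"));

        // An absolute link carries the internal host name, which outsiders cannot reach.
        System.out.println(resolve(external, "http://internal-host:8080/webapp/resource1.html"));
    }
}
```

The first call stays on the public host, while the second points the browser at a name that exists only behind the proxy.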

What Are Virtual Hosts?

Although a solution involving reverse proxies seemed possible, I found out that the hosting facility at Indent that I use hosts my web app on Tomcat, not Apache. After some initial research, I couldn't figure out whether Tomcat supported reverse proxies, so my exploration led me to virtual hosts--maybe they could solve the problem.

A virtual host allows multiple domain names for a given IP address. In other words, a given IP address can have any number of host names. When requests are received on behalf of these host names, a web server can decide to deliver content from different root directories, or different web apps in the case of Tomcat.

For example, you could have an arrangement where several host names resolve to the same IP address and each is served from its own web app.

In Tomcat, the host names and web apps are bound in a many-to-many relationship. There will be one host entry for each host. When multiple host names are bound to the same web app, one can use Tomcat's aliases facility.

Examples of Virtual Hosts in Tomcat

Based on this, here is a sample setup for Knowledge Folders:

 <Host name="www.knowledgefolders.com" 
      appBase="D:/webpage_demos/akc"
      unpackWARs="true" 
      autoDeploy="true" 
      xmlValidation="false" 
      xmlNamespaceAware="false">
   
       <Alias>knowledgefolders.com</Alias>

       <Alias>www.knowledgefolders.net</Alias>
       <Alias>knowledgefolders.net</Alias>

       <Alias>www.knowledgefolders.org</Alias>
       <Alias>knowledgefolders.org</Alias>
    
       <Alias>www.satyakomatineni.com</Alias>
       <Alias>www.kavithakomatineni.com</Alias>

       <Context path="" docBase="D:/webpage_demos/akc" 
           debug="0" reloadable="false"/>
       <Context path="/akc" docBase="D:/webpage_demos/akc" 
          debug="0" reloadable="false"/>
 </Host>

Notice how all of the host names above point to the same web app, akc (which was the previous name for Knowledge Folders).
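
The effect of the host and alias entries can be pictured as a simple lookup from the requested host name to a web app location. The sketch below is not Tomcat's implementation; the class and method names are made up for illustration, and only the host names and the appBase path are taken from the configuration above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class VirtualHostDemo {

    private final Map<String, String> appBaseByHost = new HashMap<>();

    // Register a host entry: its primary name and every alias share one appBase.
    void addHost(String name, String appBase, List<String> aliases) {
        appBaseByHost.put(name.toLowerCase(Locale.ROOT), appBase);
        for (String alias : aliases) {
            appBaseByHost.put(alias.toLowerCase(Locale.ROOT), appBase);
        }
    }

    // Look up the web app for a Host header value; null means no entry matched.
    // (Tomcat itself would fall back to the engine's default host in that case.)
    String resolve(String hostHeader) {
        String host = hostHeader.toLowerCase(Locale.ROOT);
        int colon = host.indexOf(':');
        if (colon >= 0) {
            host = host.substring(0, colon); // drop an optional port
        }
        return appBaseByHost.get(host);
    }

    public static void main(String[] args) {
        VirtualHostDemo hosts = new VirtualHostDemo();
        hosts.addHost("www.knowledgefolders.com", "D:/webpage_demos/akc",
                Arrays.asList("knowledgefolders.com", "www.knowledgefolders.net",
                        "www.satyakomatineni.com"));
        System.out.println(hosts.resolve("www.satyakomatineni.com"));
        System.out.println(hosts.resolve("KnowledgeFolders.com:80"));
        System.out.println(hosts.resolve("www.unknown.example"));
    }
}
```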

Registering A Domain Name

Originally, Knowledge Folders was hosted on a static IP address at Indent, Inc. Because the internal IP address could change, I decided to get a proper domain name for Knowledge Folders.

I went to GoDaddy.com on the advice of a friend. I found it had excellent support and its prices seemed very cheap. I registered three domains in the process:

knowledgefolders.org
knowledgefolders.net
knowledgefolders.com

Registering these domains was quite simple at GoDaddy, but setting up the rest took some work. Knowledge Folders was physically hosted with Indent at Peak 10, a hosting facility, on a dedicated Windows server, whereas the domain names were registered at GoDaddy.

Securing The Name Servers and Setting Up IP Address Association

To make the domains work, the first thing I needed to know from Peak 10 was the name servers that would be used to resolve the host names. I needed two name servers. For instance, the name servers for Peak 10 are:

NS1.JAX.PEAK-10.COM
NS1.CLT.PEAK-10.COM

The next step was to tell the Peak 10 staff the domain names I'd registered and the physical IP address the host names should be pointing to. With these changes, I was able to access Knowledge Folders with all of the domain names.

Changes to Knowledge Folders

Thus, I was able to take multiple domain names and point them to the same web app on a given physical IP. So for instance, when I accessed http://www.satyakomatineni.com, I was taken to the home page of Knowledge Folders.

But my intention was to go to the home page of the account identified by the userid satya. This required changing two things in Knowledge Folders: the index.jsp and the Aspire configuration file.

The general idea was to have the index.jsp identify the incoming host name and, provided that there was a way to associate the domain name to an account, the index.jsp would transfer control to the home page of that account.

Example index.jsp and Properties File

The source code of index.jsp that accomplishes this is as follows:

<!--
*************************************************************
* Sample code for knowing the Knowledge Folders url:
* Standard aspire libraries
*************************************************************
-->
<%@ page import="com.ai.htmlgen.*" %>
<%@ page import="com.ai.application.utils.*" %>
<%@ page import="com.ai.common.*" %>

<!--
*************************************************************
* html header
*************************************************************
-->
<html><head>
<title>Welcome to Aspire Knowledge Center</title>
<link rel="stylesheet" type="text/css" href="/akc/style/style.css">
<script src="/akc/js/genericedits1.js"></script>

<!--
*************************************************************
* Figure out home page, 
* if not found use the main home page of Knowledge Folders
*************************************************************
-->
<%
   String hostname = request.getServerName();
   String homepageurl = AppObjects.getValue("aspire.multiweb." 
                        + hostname + ".homepageurl",null);
   String targeturl = "";
   if (homepageurl == null)
   {
      targeturl = "/akc/akchome.html";
   }
   else
   {
      //hostuserid exists
      targeturl = homepageurl;
   }
   String debug = request.getParameter("debug");
%>
<script>

/*
 *************************************************************
 * gotoHomePage() on load
 *************************************************************
 */
function gotoHomePage()
{
   debugAlert("gethost on the client side:" + getHost());
   debugAlert("<%=hostname%>:<%=homepageurl%>");
   var targeturl = "<%=targeturl%>";
   debugAlert(targeturl);
   document.location.replace(targeturl);

}
/*
 *************************************************************
 * some debugging support
 *************************************************************
 */
function debugAlert(message)
{
   var debug = "<%=debug%>";
   if (debug == "true")
   {
      alert(message);
   }
}
</script>
</head>
<!--
*************************************************************
* onload
*************************************************************
-->
<body onload="gotoHomePage()">
</body></html>

Here is the Aspire/J2EE configuration file to support the "domain name to account" translation or mapping:

aspire.multiweb.www.satyakomatineni.com.userid=satya
aspire.multiweb.www.satyakomatineni.com.homepageurl=\
/akc/update?request_name=GotoHomepageURL&ownerUserId=satya
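
The lookup that index.jsp performs can be tried outside the web app with a small sketch. It uses the standard java.util.Properties class as a stand-in for Aspire's AppObjects.getValue, which is an assumption made so the example is self-contained; the class and method names are hypothetical, while the keys, the default page, and the continuation line come from the listings above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class HomePageLookupDemo {

    // Same fallback page that index.jsp uses when no mapping exists.
    static final String DEFAULT_HOME = "/akc/akchome.html";

    // Parse properties text; a trailing backslash continues the value on the next line.
    static Properties load(String text) {
        Properties config = new Properties();
        try {
            config.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return config;
    }

    // Map an incoming host name to the home page of its account, if one is configured.
    static String homePageFor(Properties config, String hostname) {
        return config.getProperty("aspire.multiweb." + hostname + ".homepageurl", DEFAULT_HOME);
    }

    public static void main(String[] args) {
        Properties config = load(
                "aspire.multiweb.www.satyakomatineni.com.userid=satya\n"
              + "aspire.multiweb.www.satyakomatineni.com.homepageurl=\\\n"
              + "/akc/update?request_name=GotoHomepageURL&ownerUserId=satya\n");
        System.out.println(homePageFor(config, "www.satyakomatineni.com"));
        System.out.println(homePageFor(config, "www.knowledgefolders.com"));
    }
}
```

A host name with no entry falls through to the main Knowledge Folders home page, which mirrors the null check in index.jsp.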

Summary of Setup Procedures for Creating A New Website in Knowledge Folders

  1. Register a domain with a domain name registrar (such as GoDaddy.com).
  2. Provide name servers for the domain.
  3. Ask the hosting provider, via email, to associate the domain with the server's IP address on those name servers.
  4. Add an alias to Tomcat server.xml under the host corresponding to the web app.
  5. Make changes to the aspire configuration to tie the domain name to an individual account.
  6. Optionally set up an email account for the domain.
  7. Write down the passwords for all of the accounts you have set up.

The Email Option

As it exists today, Knowledge Folders makes it quick and convenient to create websites without any additional tools. This is a great advantage for small companies that want to have a web presence quickly without having to buy any hosting space. Nevertheless, these small companies also usually want a basic email address at the domain so that they can use it on business cards or in general advertising. This can be done in two ways.

Setting It Up with Indent

You see, there are three players in this solution. The domains are registered at GoDaddy. The Windows server on which the software runs is sitting on the Peak 10 network, which requires that I use their name servers. The actual Windows server is owned and operated by Indent.

So the first option involves alerting Indent to create email accounts. Indent uses the James mail server. Indent usually uses a manual process to either create a full-fledged email account or provide email forwarding for that account.

Using GoDaddy's Email Accounts

Registering a domain at GoDaddy generates a free email account for each registered domain. You can also purchase additional email accounts if needed. GoDaddy also offers email forwarding and online tools to manage these email accounts.

But it is tricky to use these email accounts at GoDaddy if the original mail server for your domain is at a hosting facility. You have to set up MX records and CNAME records at the name servers for your domain to accomplish this.

GoDaddy recommends adding the following to the domain-name system manager:

MX 0 - smtp.secureserver.net 
MX 10 - mailstore1.secureserver.net

What Are MX Records and How Do They Work?

According to an MX FAQ, a sending mail program checks the domain-name system to see whether the recipient's domain has an MX record pointing to a mail server. If it does, the sender uses that server as the target. It may even be recursive. This is one way to redirect mail, and it is how the MX records at Peak 10 reroute mail to GoDaddy so that I can use the email accounts there. It is sufficient to set the MX records for the root domain name only, not for the CNAMEs. For instance, for the website www.knowledgefolders.com, it is enough to set MX records for "knowledgefolders.com", because email will be addressed to somemail@knowledgefolders.com.
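
When a domain has more than one MX record, the number in each record is its preference, and a sender tries the record with the lowest number first. The following sketch orders the two GoDaddy records listed earlier in that way. It does no DNS lookups; the class and method names are hypothetical, and the records are simply hard-coded from the text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class MxOrderDemo {

    // One MX record: a preference number and the mail host it points to.
    static final class MxRecord {
        final int preference;
        final String mailHost;

        MxRecord(int preference, String mailHost) {
            this.preference = preference;
            this.mailHost = mailHost;
        }
    }

    // Return the mail hosts in the order a sender would try them (lowest preference first).
    static List<String> deliveryOrder(List<MxRecord> records) {
        List<MxRecord> sorted = new ArrayList<>(records);
        sorted.sort(Comparator.comparingInt((MxRecord r) -> r.preference));
        List<String> hosts = new ArrayList<>();
        for (MxRecord r : sorted) {
            hosts.add(r.mailHost);
        }
        return hosts;
    }

    public static void main(String[] args) {
        List<MxRecord> records = Arrays.asList(
                new MxRecord(10, "mailstore1.secureserver.net"),
                new MxRecord(0, "smtp.secureserver.net"));
        System.out.println(deliveryOrder(records));
    }
}
```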

Adding CNAMES To Fine-Tune The Email Solution

According to a CNAME document, CNAME records are aliases defined at the domain-name server (in this case, Peak 10) that redirect traffic to another host name. For example, I can set up a CNAME record at Peak 10 for pop.knowledgefolders.com pointing to pop.godaddy.com. This lets you enter a POP server name under your own domain in a mail client such as Outlook. Something similar can be done with a CNAME for SMTP, and for the webmail at GoDaddy if needed.

Limitations of CNAMES

CNAMES are aliases to host names. They also introduce new names into the domain-name space. For example, if I have a domain registered as knowledgefolders.com, then a CNAME record can introduce another host into the domain-name space called myhost.knowledgefolders.com. This only works as long as the new host names you are introducing are all suffixed with knowledgefolders.com. For example, you cannot introduce a CNAME called somehost.some-domain.com when you don't own some-domain.com.

Nevertheless, you are entitled to point somehost.your-domain.com to some-other-host.someoneelsesdomain.com, which is how the indirection of SMTP and POP mail servers is achieved.
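
The rule can be expressed as a simple check on the alias name only, since the target of the alias is unrestricted. The sketch below is an illustration of that rule, not of any DNS server's validation; the class and method names are hypothetical.

```java
import java.util.Locale;

final class CnameZoneDemo {

    // Lower-case the name and drop a trailing dot so comparisons are uniform.
    static String normalize(String name) {
        String n = name.trim().toLowerCase(Locale.ROOT);
        return n.endsWith(".") ? n.substring(0, n.length() - 1) : n;
    }

    // True when the name is the owned domain itself or a host name underneath it.
    static boolean isWithinDomain(String name, String ownedDomain) {
        String n = normalize(name);
        String d = normalize(ownedDomain);
        return n.equals(d) || n.endsWith("." + d);
    }

    // An alias may be introduced only under your own domain; its target may be anywhere.
    static boolean canIntroduceAlias(String alias, String target, String ownedDomain) {
        return isWithinDomain(alias, ownedDomain);
    }

    public static void main(String[] args) {
        String mine = "knowledgefolders.com";
        System.out.println(canIntroduceAlias("pop.knowledgefolders.com", "pop.godaddy.com", mine));
        System.out.println(canIntroduceAlias("somehost.some-domain.com", "pop.godaddy.com", mine));
    }
}
```

Requiring a leading dot before the owned domain keeps a look-alike such as otherknowledgefolders.com from passing the check.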

End Result

At the end of all of this, I was able to publish the following distinct websites using various accounts in Knowledge Folders. Let's take a look:

  1. www.knowledgefolders.com points to the original home page of multi-account Knowledge Folders.
  2. www.knowledgefolders.org points to the documentation website for Knowledge Folders.
  3. www.knowledgefolders.net points to my personal account in Knowledge Folders.
  4. www.satyakomatineni.com points to my personal account in Knowledge Folders, as well.
  5. www.kavithakomatineni.com points to the website I manage for my daughter.

I use Knowledge Folders for a number of things:

  1. I manage my weblogs.
  2. I manage documentation for Aspire/J2EE.
  3. I support Aspire/J2EE using feedback.
  4. I manage documentation for Knowledge Folders.
  5. I collaborate with teams to develop websites using a project portal concept where project documentation is maintained.
  6. I create static websites for small companies.
  7. I manage my daily, weekly, and monthly tasks and to-do lists.
  8. I run tutorials.
  9. I conduct my research.
  10. I publish articles.

Future Possibilities for Web Hosting

Currently the process of creating websites is very disjointed; one must follow numerous steps to get a web presence today, but the trend is certainly toward simplification. Especially with something like Knowledge Folders, it is possible to imagine a time when consumers could visit a site, create an account, and post their content on the Web right away. The back-end details can be automated. By managing our tasks and schedules, publishing, and collaborating from the same site, we're heading toward something like a "web OS."

Satya Komatineni is the CTO at Indent, Inc. and the author of Aspire, an open source web development RAD tool for J2EE/XML.



Copyright © 2009 O'Reilly Media, Inc.