Internals of Java Class Loading
Why Do We Need Our Own Class Loaders?
One reason for a developer to write a custom class loader is to control
the JVM's class loading behavior. A class in Java is identified by
its package name and class name. For classes that implement
java.io.Serializable, the serialVersionUID plays a major role
in versioning the class. When it is not declared explicitly, this
stream-unique identifier defaults to a 64-bit hash of the
class name, interface class names, methods, and fields. Beyond these, there
is no straightforward mechanism for versioning a class. Technically
speaking, if the above aspects match, the classes are of the "same version."
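To make the role of serialVersionUID concrete, the following minimal sketch reads the identifier the serialization runtime associates with a class, using java.io.ObjectStreamClass. The class names ExplicitVersion and DefaultVersion are hypothetical and exist only for illustration; they are not part of the article's sample code.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class SerialVersionDemo {

    // Hypothetical class that pins its version explicitly.
    public static class ExplicitVersion implements Serializable {
        private static final long serialVersionUID = 42L;
        int value;
    }

    // Hypothetical class that relies on the default, computed identifier.
    public static class DefaultVersion implements Serializable {
        int value;
    }

    // Returns the serialVersionUID the serialization runtime uses for the type.
    public static long uidOf(Class<?> type) {
        ObjectStreamClass descriptor = ObjectStreamClass.lookup(type);
        if (descriptor == null) {
            throw new IllegalArgumentException(type + " is not serializable");
        }
        return descriptor.getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("explicit: " + uidOf(ExplicitVersion.class));
        System.out.println("default:  " + uidOf(DefaultVersion.class));
    }
}
```

The explicit value is whatever the class declares. The default value is derived from the shape of the class, so adding a field or method to DefaultVersion would normally change it.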
But let us think of a scenario where we need to develop a generic Execution Engine, capable of executing any task that implements a particular interface. When a task is submitted to the engine, the engine first needs to load the code for that task. Suppose different clients submit different tasks (i.e., different code) to the engine, and by chance, all of these tasks have the same class name and package name. The question is whether the engine will load each client's version of the task separately for each client invocation context, so that every client gets the output it expects. The phenomenon is demonstrated in the sample code download, located in the References section below. The samepath and differentversions directories contain separate examples that demonstrate the concept.
Figure 2 shows how the examples are arranged in three separate subfolders, called samepath, differentversions, and differentversionspush:
Figure 2. Example folder structure arrangement
In samepath, we have version.Version classes kept in two subdirectories,
v1 and v2. Both classes have the same name and same package. The only difference
between the two classes is in the following lines:
public void fx(){
    log("this = " + this + "; Version.fx(1).");
}
Inside v1, the log statement contains Version.fx(1), whereas
in v2 it contains Version.fx(2). Put both of these slightly different
versions of the class in the same classpath, and run the Test class:
set CLASSPATH=.;%CURRENT_ROOT%\v1;%CURRENT_ROOT%\v2
%JAVA_HOME%\bin\java Test
This will give the console output shown in Figure 3. We can see that code
corresponding to Version.fx(1) is loaded, since the class loader
found that version of the code first in the classpath.
Figure 3. samepath test with version 1 first in the classpath
Repeat the run with the order of the path elements in the classpath reversed:
set CLASSPATH=.;%CURRENT_ROOT%\v2;%CURRENT_ROOT%\v1
%JAVA_HOME%\bin\java Test
The console output now changes to that shown in Figure 4. Here, the code
corresponding to Version.fx(2) is loaded, since this time the class loader
found that version of the code first in the classpath.
Figure 4. samepath test with version 2 first in the classpath
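The same first-match behavior can be reproduced in a few lines without compiling two versions of a class. The sketch below is an illustration under stated assumptions, not the article's sample code: it uses a marker file named version.txt (a hypothetical name) in two temporary directories that stand in for v1 and v2, and looks the file up through a java.net.URLClassLoader. Resource lookup and class lookup walk the search path in the same order, so the first directory that contains the name wins. It assumes Java 9 or later for InputStream.readAllBytes().

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class SearchOrderDemo {

    // Creates a temporary directory holding one marker file whose content is the label.
    public static Path makeDir(String label) {
        try {
            Path dir = Files.createTempDirectory(label);
            Files.write(dir.resolve("version.txt"), label.getBytes(StandardCharsets.UTF_8));
            return dir;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Looks up version.txt through a loader that searches the directories in the given order.
    public static String firstMatch(Path... searchPath) {
        try {
            URL[] urls = new URL[searchPath.length];
            for (int i = 0; i < searchPath.length; i++) {
                urls[i] = searchPath[i].toUri().toURL();
            }
            // A null parent means only the bootstrap loader is consulted before these URLs.
            try (URLClassLoader loader = new URLClassLoader(urls, null);
                 InputStream in = loader.getResourceAsStream("version.txt")) {
                if (in == null) {
                    throw new IllegalStateException("version.txt not found on the search path");
                }
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path v1 = makeDir("v1");
        Path v2 = makeDir("v2");
        System.out.println("v1 first: " + firstMatch(v1, v2));
        System.out.println("v2 first: " + firstMatch(v2, v1));
    }
}
```

Swapping the two arguments to firstMatch plays the same role as swapping the v1 and v2 entries in the CLASSPATH variable.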
From the above example, it is clear that the
class loader loads a class from the first path element in which it finds
that class. Also, if we delete the version.Version classes from v1
and v2, make a .jar (myextension.jar) out of version.Version, put it in the path corresponding to java.ext.dirs, and repeat the test, we see
that version.Version is no longer loaded by AppClassLoader
but by the extension class loader, as shown in Figure 5.
Figure 5. AppClassLoader and ExtClassLoader
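To check which loader is responsible for a given class, one option is to walk the loader's parent chain. The following is a small sketch using only java.lang.ClassLoader methods; the class name LoaderChainDemo is hypothetical. The loader names it prints depend on the JDK: on Java 8 and earlier the application loader's parent is the extension class loader, while Java 9 and later removed the extension mechanism and report a platform class loader in its place.

```java
import java.util.ArrayList;
import java.util.List;

public class LoaderChainDemo {

    // Lists the defining loader of the type, followed by each parent in turn.
    // A null loader stands for the bootstrap loader, which has no Java object.
    public static List<String> chainOf(Class<?> type) {
        List<String> chain = new ArrayList<>();
        ClassLoader loader = type.getClassLoader();
        while (loader != null) {
            chain.add(loader.getClass().getName());
            loader = loader.getParent();
        }
        chain.add("bootstrap");
        return chain;
    }

    public static void main(String[] args) {
        System.out.println("LoaderChainDemo -> " + chainOf(LoaderChainDemo.class));
        System.out.println("String          -> " + chainOf(String.class));
    }
}
```

Core classes such as java.lang.String are defined by the bootstrap loader, so their chain contains only the final "bootstrap" entry.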
Going forward with the examples, the folder differentversions contains an RMI execution
engine. Clients can supply any tasks that implement common.TaskIntf
to the execution engine. The subfolders client1 and client2 contain slightly
different versions of the class client.TaskImpl. The difference
between the two classes is in the following lines:
static{
    log("client.TaskImpl.class.getClassLoader(v1) : "
        + TaskImpl.class.getClassLoader());
}
public void execute(){
    log("this = " + this + "; execute(1)");
}
In client2, the static initializer logs getClassLoader(v2) and execute()
logs execute(2), in place of the getClassLoader(v1) and execute(1) log
statements in client1. Moreover, in the script that starts
the Execution Engine RMI server, we have arbitrarily put the task implementation class
of client2 first in the classpath:
set CLASSPATH=%CURRENT_ROOT%\common;%CURRENT_ROOT%\server;%CURRENT_ROOT%\client2;%CURRENT_ROOT%\client1
%JAVA_HOME%\bin\java server.Server
The screenshots in Figures 6, 7, and 8 show what is happening under the hood.
Here, in the client VMs, separate client.TaskImpl classes are
loaded, instantiated, and sent to the Execution Engine Server VM for execution.
From the server console, it is apparent that the client.TaskImpl code
is loaded only once in the server VM. This single "version" of the code is used
to re-create every client.TaskImpl instance in the server VM
and to execute the task.
Figure 6. Execution Engine Server console
Figure 6 shows the Execution Engine Server console, which loads and executes code on behalf of two separate client requests, shown in Figures 7 and 8. The point to note here is that the code is loaded only once (as is evident from the log statement inside the static initialization block), but the method is executed twice, once for each client invocation context.
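The load-once, execute-twice pattern can be imitated inside a single VM. The sketch below is a simplified stand-in for the RMI setup, not the article's sample code: it serializes two task instances to byte arrays (playing the two clients), deserializes them (playing the server), and counts how often the static initializer and execute() run. The class and counter names are hypothetical. It relies on the fact that a Java serialization stream carries the class name, the serialVersionUID, and the field data, but not the bytecode, so the receiving side always uses the class definition it has already loaded.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class LoadOnceDemo {

    static int initCount = 0;     // how many times TaskImpl's static initializer ran
    static int executeCount = 0;  // how many times execute() ran

    // Hypothetical task class, standing in for client.TaskImpl.
    public static class TaskImpl implements Serializable {
        private static final long serialVersionUID = 1L;
        static { initCount++; }
        public void execute() { executeCount++; }
    }

    // Plays the client: turns a task instance into bytes.
    public static byte[] marshal(TaskImpl task) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(task);
            }
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Plays the server: rebuilds the instance from bytes using the locally loaded class.
    public static TaskImpl unmarshal(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (TaskImpl) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns { static initializer runs so far, execute() calls made by this run }.
    public static int[] run() {
        int before = executeCount;
        byte[] fromClient1 = marshal(new TaskImpl());
        byte[] fromClient2 = marshal(new TaskImpl());
        unmarshal(fromClient1).execute();
        unmarshal(fromClient2).execute();
        return new int[] { initCount, executeCount - before };
    }

    public static void main(String[] args) {
        int[] counts = run();
        System.out.println("static initializer runs: " + counts[0]);
        System.out.println("execute() calls:         " + counts[1]);
    }
}
```

However many instances arrive, they are all bound to the one class definition the receiving loader already holds, which is why the server in Figure 6 logs the static block only once.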
Figure 7. Execution Engine Client 1 console
In Figure 7, the code for the TaskImpl class containing the log statement
client.TaskImpl.class.getClassLoader(v1) is loaded by the client VM,
and supplied to the Execution Engine Server. The client VM in Figure 8 loads
different code for the TaskImpl class containing the log statement
client.TaskImpl.class.getClassLoader(v2), and supplies it to the
Server VM.
Figure 8. Execution Engine Client 2 console
A second look at the server console in Figure 6 reveals that the client.TaskImpl code
is loaded only once in the server VM. This single "version" of the code is used
to re-create the client.TaskImpl instances in the server VM
and to execute the task. Client 1 has reason to be unhappy: instead of its own "version"
of client.TaskImpl (v1), some other code is executed
on the server for Client 1's invocation! How do we tackle such scenarios? The
answer is to implement custom class loaders.