ONJava.com -- The Independent Source for Enterprise Java



Java RMI: Serialization

Using Serialization

Serialization is a mechanism built into the core Java libraries for writing a graph of objects into a stream of data. This stream of data can then be programmatically manipulated, and a deep copy of the objects can be made by reversing the process. This reversal is often called deserialization.



In particular, there are three main uses of serialization:

As a persistence mechanism
If the stream being used is FileOutputStream, then the data will automatically be written to a file.
As a copy mechanism
If the stream being used is ByteArrayOutputStream, then the data will be written to a byte array in memory. This byte array can then be used to create duplicates of the original objects.
As a communication mechanism
If the stream being used comes from a socket, then the data will automatically be sent over the wire to the receiving socket, at which point another program will decide what to do.

The important thing to note is that the use of serialization is independent of the serialization algorithm itself. If we have a serializable class, we can save it to a file or make a copy of it simply by changing the way we use the output of the serialization mechanism.
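To make this independence concrete, here is a minimal sketch in which one serialization routine feeds two different destinations. The class and method names (SerializationTargets, serializeTo, saveToFile, saveToMemory) are hypothetical and exist only for illustration; the point is that only the underlying stream changes.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.util.ArrayList;

// Hypothetical helper class, not part of the original article.
class SerializationTargets {

    // The serialization step is identical wherever the bytes end up.
    static void serializeTo(OutputStream destination, Object graph) throws IOException {
        ObjectOutputStream serializer = new ObjectOutputStream(destination);
        serializer.writeObject(graph);
        serializer.flush();
    }

    // Persistence: the bytes land in a file. Returns the file size in bytes.
    static long saveToFile(File file, Object graph) throws IOException {
        try (FileOutputStream out = new FileOutputStream(file)) {
            serializeTo(out, graph);
        }
        return file.length();
    }

    // Copying: the bytes land in an in-memory array.
    static byte[] saveToMemory(Object graph) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        serializeTo(out, graph);
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        ArrayList<String> names = new ArrayList<>();
        names.add("alpha");
        names.add("beta");

        File file = File.createTempFile("serialization-demo", ".ser");
        file.deleteOnExit();
        System.out.println("bytes written to file:   " + saveToFile(file, names));
        System.out.println("bytes written to memory: " + saveToMemory(names).length);
    }
}
```

Both destinations receive the same byte sequence, because the serialization algorithm never looks at what kind of stream sits underneath it.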

As you might expect, serialization is implemented using a pair of streams. Even though the code that underlies serialization is quite complex, the way you invoke it is designed to make serialization as transparent as possible to Java developers. To serialize an object, create an instance of ObjectOutputStream and call the writeObject( ) method; to read in a serialized object, create an instance of ObjectInputStream and call the readObject( ) method.

ObjectOutputStream

ObjectOutputStream, defined in the java.io package, is a stream that implements the "writing-out" part of the serialization algorithm. (RMI actually uses a subclass of ObjectOutputStream to customize its behavior.) The methods implemented by ObjectOutputStream can be grouped into three categories: methods that write information to the stream, methods used to control the stream's behavior, and methods used to customize the serialization algorithm.

The "write" methods

The first, and most intuitive, category consists of the "write" methods:

public void write(byte[] b);
public void write(byte[] b, int off, int len);
public void write(int data);
public void writeBoolean(boolean data);
public void writeByte(int data);
public void writeBytes(String data);
public void writeChar(int data);
public void writeChars(String data);
public void writeDouble(double data);
public void writeFields(  );
public void writeFloat(float data);
public void writeInt(int data);
public void writeLong(long data);
public void writeObject(Object obj);
public void writeShort(int data);
public void writeUTF(String s);
public void defaultWriteObject(  );

For the most part, these methods should seem familiar. writeFloat( ), for example, works exactly as you would expect after reading Chapter 1 -- it takes a floating-point number and encodes the number as four bytes. There are, however, two new methods here: writeObject( ) and defaultWriteObject( ).

writeObject( ) serializes an object. In fact, writeObject( ) is often the instrument of the serialization mechanism itself. In the simplest and most common case, serializing an object involves doing two things: creating an ObjectOutputStream and calling writeObject( ) with a single "top-level" instance. The following code snippet shows the entire process, storing an object--and all the objects to which it refers--into a file:

FileOutputStream underlyingStream = new FileOutputStream("C:\\temp\\test");
ObjectOutputStream serializer = new ObjectOutputStream(underlyingStream);
serializer.writeObject(serializableObject);

Of course, this works seamlessly with the other methods for writing data. That is, if you wanted to write two floats, a String, and an object to a file, you could do so with the following code snippet:

FileOutputStream underlyingStream = new FileOutputStream("C:\\temp\\test");
ObjectOutputStream serializer = new ObjectOutputStream(underlyingStream);
serializer.writeFloat(firstFloat);
serializer.writeFloat(secondFloat);
serializer.writeUTF(aString);
serializer.writeObject(serializableObject);
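Data written this way has to be read back in the same order, with the matching "read" methods. The following is a minimal sketch of that round trip, using an in-memory buffer instead of a file so that it can run anywhere. The class name MixedStreamRoundTrip and its roundTrip( ) method are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical helper class, not part of the original article.
class MixedStreamRoundTrip {

    // Writes two floats, a String, and an object, then reads them back
    // in exactly the order in which they were written.
    static Object[] roundTrip(float first, float second, String text, Serializable object)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream serializer = new ObjectOutputStream(buffer)) {
            serializer.writeFloat(first);
            serializer.writeFloat(second);
            serializer.writeUTF(text);
            serializer.writeObject(object);
        }

        try (ObjectInputStream deserializer =
                 new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            float a = deserializer.readFloat();
            float b = deserializer.readFloat();
            String s = deserializer.readUTF();
            Object o = deserializer.readObject();
            return new Object[] { a, b, s, o };
        }
    }

    public static void main(String[] args) throws Exception {
        Object[] values = roundTrip(1.5f, -2.25f, "hello", "payload");
        for (Object value : values) {
            System.out.println(value);
        }
    }
}
```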

TIP: ObjectOutputStream's constructor takes an OutputStream as an argument. This is analogous to many of the streams we looked at in Chapter 1. ObjectOutputStream and ObjectInputStream are simply encoding and transformation layers. This enables RMI to send objects over the wire by opening a socket connection, associating the OutputStream with the socket connection, creating an ObjectOutputStream on top of the socket's OutputStream, and then calling writeObject( ).
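The layering described in the tip can be sketched with plain sockets. This is not RMI's actual implementation, only an illustration of the same idea over a loopback connection; the class name SocketSerializationSketch, the method sendOverLoopback( ), and the five-second timeout are all assumptions made for the example.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical sketch, not RMI code and not part of the original article.
class SocketSerializationSketch {

    // Sends one object from a client thread to a server socket on the
    // same machine and returns the deserialized copy the server received.
    static Object sendOverLoopback(Serializable message) throws Exception {
        try (ServerSocket server = new ServerSocket(0)) {   // port 0: any free port
            server.setSoTimeout(5000);                      // do not wait forever in accept()

            Thread sender = new Thread(() -> {
                try (Socket socket = new Socket(InetAddress.getLoopbackAddress(),
                                                server.getLocalPort());
                     ObjectOutputStream serializer =
                         new ObjectOutputStream(socket.getOutputStream())) {
                    serializer.writeObject(message);        // the bytes go over the wire
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            sender.start();

            try (Socket connection = server.accept();
                 ObjectInputStream deserializer =
                     new ObjectInputStream(connection.getInputStream())) {
                Object received = deserializer.readObject();
                sender.join();
                return received;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sendOverLoopback("ping"));
    }
}
```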

The other new "write" method is defaultWriteObject( ). defaultWriteObject( ) makes it much easier to customize how instances of a single class are serialized. However, defaultWriteObject( ) has some strange restrictions placed on when it can be called. Here's what the documentation says about defaultWriteObject( ):

Write the nonstatic and nontransient fields of the current class to this stream. This may only be called from the writeObject method of the class being serialized. It will throw the NotActiveException if it is called otherwise.

That is, defaultWriteObject( ) is a method that works only when it is called from another specific method at a particular time. Since defaultWriteObject( ) is useful only when you are customizing the information stored for a particular class, this turns out to be a reasonable restriction. We'll talk more about defaultWriteObject( ) later in the chapter, when we discuss how to make a class serializable.
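As a preview, the restriction can be seen in a small sketch. The Account class, its fields, and the helper methods below are hypothetical; the sketch only assumes the documented behavior quoted above, namely that defaultWriteObject( ) is legal inside a class's own writeObject( ) method and throws NotActiveException anywhere else.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotActiveException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical sketch, not part of the original article.
class DefaultWriteObjectSketch {

    // Illustrative class with one ordinary field and one transient field.
    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        String owner;
        transient int cachedScore;   // skipped by the default mechanism

        Account(String owner, int cachedScore) {
            this.owner = owner;
            this.cachedScore = cachedScore;
        }

        // Invoked by the serialization mechanism; the one place where
        // defaultWriteObject() may be called for this class.
        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();    // writes the nonstatic, nontransient fields
            out.writeInt(cachedScore);   // extra data this class chooses to store
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            cachedScore = in.readInt();
        }
    }

    static Account roundTrip(Account original) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            return (Account) in.readObject();
        }
    }

    // Calls defaultWriteObject() from ordinary code and reports whether
    // the stream rejected the call with NotActiveException.
    static boolean failsOutsideWriteObject() throws IOException {
        ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream());
        try {
            out.defaultWriteObject();
            return false;
        } catch (NotActiveException expected) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        Account copy = roundTrip(new Account("ann", 42));
        System.out.println(copy.owner + " " + copy.cachedScore);
        System.out.println("rejected outside writeObject: " + failsOutsideWriteObject());
    }
}
```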

The stream manipulation methods

ObjectOutputStream also implements four methods that deal with the basic mechanics of manipulating the stream:

public void reset(  );
public void close(  );
public void flush(  );
public void useProtocolVersion(int version);

With the exception of useProtocolVersion( ), these methods should be familiar. close( ) and flush( ) are standard stream methods, and reset( ) returns the stream to its initial state by discarding its record of the objects already written. useProtocolVersion( ), on the other hand, changes the version of the serialization mechanism that is used. This is necessary because the serialization format and algorithm may need to change in a way that's not backwards-compatible. If another application needs to read in your serialized data, and the applications will be versioning independently (or running in different versions of the JVM), you may want to standardize on a protocol version.

TIP: There are two versions of the serialization protocol currently defined: PROTOCOL_VERSION_1 and PROTOCOL_VERSION_2. If you send serialized data to a 1.1 (or earlier) JVM, you should probably use PROTOCOL_VERSION_1. The most common case of this involves applets. Most applets run in browsers over which the developer has no control. This means, in particular, that the JVM running the applet could be anything, from Java 1.0.2 through the latest JVM. Most servers, on the other hand, are written using JDK 1.2.2 or later. (The main exception is EJB containers that require earlier versions of Java. At this writing, for example, Oracle 8i's EJB container uses JDK 1.1.6.) If you pass serialized objects between an applet and a server, you should specify the serialization protocol.
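Specifying the protocol is a single call, sketched below. The two constants are defined in java.io.ObjectStreamConstants. The class name ProtocolVersionSketch and the method serializeWithVersion1( ) are hypothetical, and the sketch assumes the version is chosen before any object is written, since the stream refuses to change versions afterward.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;
import java.io.Serializable;

// Hypothetical sketch, not part of the original article.
class ProtocolVersionSketch {

    static byte[] serializeWithVersion1(Serializable object) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream serializer = new ObjectOutputStream(buffer)) {
            // Choose the protocol first; calling this after objects have been
            // written throws IllegalStateException.
            serializer.useProtocolVersion(ObjectStreamConstants.PROTOCOL_VERSION_1);
            serializer.writeObject(object);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        byte[] data = serializeWithVersion1("hello, applet");
        try (ObjectInputStream deserializer =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            System.out.println(deserializer.readObject());
        }
    }
}
```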

Methods that customize the serialization mechanism

The last group of methods consists mostly of protected methods that provide hooks that allow the serialization mechanism itself, rather than the data associated with a particular class, to be customized. These methods are:

public ObjectOutputStream.PutField putFields(  );
protected void annotateClass(Class cl);
protected void annotateProxyClass(Class cl);
protected boolean enableReplaceObject(boolean enable);
protected Object replaceObject(Object obj);
protected void drain(  );
protected void writeObjectOverride(Object obj);
protected void writeClassDescriptor(ObjectStreamClass classdesc);
protected void writeStreamHeader(  );

These methods are more important to people who tailor the serialization algorithm to a particular use or develop their own implementation of serialization. As such, they require a deeper understanding of the serialization algorithm. We'll discuss these methods in more detail later, after we've gone over the actual algorithm used by the serialization mechanism.
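To give a rough feel for what such a hook looks like before the detailed discussion, here is a small sketch of a subclass that overrides replaceObject( ). The class name RedactingObjectOutputStream and its replacement policy are hypothetical; the sketch relies only on the fact that replaceObject( ) is consulted once enableReplaceObject(true) has been called.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;

// Hypothetical subclass, not part of the original article.
class RedactingObjectOutputStream extends ObjectOutputStream {

    RedactingObjectOutputStream(OutputStream out) throws IOException {
        super(out);
        enableReplaceObject(true);   // without this, replaceObject() is never consulted
    }

    // Illustrative policy: every String written as an object is masked;
    // all other objects pass through unchanged.
    @Override
    protected Object replaceObject(Object obj) throws IOException {
        return (obj instanceof String) ? "<redacted>" : obj;
    }

    // Writes a value through the customized stream and reads it back
    // with an ordinary ObjectInputStream.
    static Object writeThenRead(Object value) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new RedactingObjectOutputStream(buffer)) {
            out.writeObject(value);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(writeThenRead("secret"));
        System.out.println(writeThenRead(Integer.valueOf(7)));
    }
}
```

Note that the class being serialized knows nothing about the substitution; the change lives entirely in the stream, which is what distinguishes this group of methods from per-class customization.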

ObjectInputStream

ObjectInputStream, defined in the java.io package, implements the "reading-in" part of the serialization algorithm. It is the companion to ObjectOutputStream--objects serialized using ObjectOutputStream can be deserialized using ObjectInputStream. Like ObjectOutputStream, the methods implemented by ObjectInputStream can be grouped into three categories: methods that read information from the stream, methods that are used to control the stream's behavior, and methods that are used to customize the serialization algorithm.

The "read" methods

The first, and most intuitive, category consists of the "read" methods:

public int read(  );
public int read(byte[] b, int off, int len);
public boolean readBoolean(  );
public byte readByte(  );
public char readChar(  );
public double readDouble(  );
public float readFloat(  );
public int readInt(  );
public long readLong(  );
public Object readObject(  );
public short readShort(  );
public int readUnsignedByte(  );
public int readUnsignedShort(  );
public String readUTF(  );
public void defaultReadObject(  );

Just as with ObjectOutputStream's write( ) methods, these methods should be familiar. readFloat( ), for example, works exactly as you would expect after reading Chapter 1: it reads four bytes from the stream and converts them into a single floating-point number, which is returned by the method call. And, again as with ObjectOutputStream, there are two new methods here: readObject( ) and defaultReadObject( ).

Just as writeObject( ) serializes an object, readObject( ) deserializes it. Deserializing an object involves doing two things: creating an ObjectInputStream and then calling readObject( ). The following code snippet shows the entire process, creating a copy of an object (and all the objects to which it refers) from a file:

FileInputStream underlyingStream = new FileInputStream("C:\\temp\\test");
ObjectInputStream deserializer = new ObjectInputStream(underlyingStream);
Object deserializedObject = deserializer.readObject( );

This code is the exact inverse of the code we used for serializing the object in the first place. If we wanted to make a deep copy of a serializable object, we could first serialize the object and then deserialize it, as in the following code example:

ByteArrayOutputStream memoryOutputStream = new ByteArrayOutputStream( );
ObjectOutputStream serializer = new ObjectOutputStream(memoryOutputStream);
serializer.writeObject(serializableObject);
serializer.flush( );
 
ByteArrayInputStream memoryInputStream = new ByteArrayInputStream(memoryOutputStream.toByteArray( ));
ObjectInputStream deserializer = new ObjectInputStream(memoryInputStream);
Object deepCopyOfOriginalObject = deserializer.readObject( );

This code simply places an output stream into memory, serializes the object to the memory stream, creates an input stream based on the same piece of memory, and runs the deserializer on the input stream. The end result is a deep copy of the object with which we started.
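A quick way to see what "deep copy" means here is to copy a small graph in which one object is referenced twice. The sketch below wraps the steps above in a hypothetical deepCopy( ) helper (the class name DeepCopySketch is also an assumption) and checks that the copy is a different object from the original while the shared reference stays shared inside the copy.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the original article.
class DeepCopySketch {

    // Serializes the graph into memory and immediately deserializes it.
    static Object deepCopy(Serializable original) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream memoryOutputStream = new ByteArrayOutputStream();
        try (ObjectOutputStream serializer = new ObjectOutputStream(memoryOutputStream)) {
            serializer.writeObject(original);
        }
        try (ObjectInputStream deserializer = new ObjectInputStream(
                 new ByteArrayInputStream(memoryOutputStream.toByteArray()))) {
            return deserializer.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        ArrayList<String> shared = new ArrayList<>();
        shared.add("x");
        ArrayList<Object> graph = new ArrayList<>();
        graph.add(shared);   // the same list is referenced twice
        graph.add(shared);

        List<?> copy = (List<?>) deepCopy(graph);
        System.out.println(copy != graph);               // true: a new top-level object
        System.out.println(copy.get(0) != shared);       // true: referenced objects are copied too
        System.out.println(copy.get(0) == copy.get(1));  // true: sharing within the graph is kept
    }
}
```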

The stream manipulation methods

There are five basic stream manipulation methods defined for ObjectInputStream:

public int available(  );
public void close(  );
public void readFully(byte[] data);
public void readFully(byte[] data, int offset, int size);
public int skipBytes(int len);

Of these, available( ) and close( ) are methods first defined on InputStream. available( ) returns the number of bytes that can be read without blocking, and close( ) closes the stream.

The three new methods are also straightforward. skipBytes( ) skips the indicated number of bytes in the stream, blocking until all the information has been read. And the two readFully( ) methods perform a batch read into a byte array, also blocking until all the data has been read in.
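A minimal sketch of skipBytes( ) and readFully( ) working together follows. It writes a header and a payload as raw bytes, then skips the header and reads the payload in one batch. The class name StreamManipulationSketch and the method readPayloadAfterHeader( ) are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Arrays;

// Hypothetical helper class, not part of the original article.
class StreamManipulationSketch {

    // Writes header followed by payload, then reads back only the payload.
    static byte[] readPayloadAfterHeader(byte[] header, byte[] payload) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.write(header);
            out.write(payload);
        }

        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            in.skipBytes(header.length);            // step over the header bytes
            byte[] result = new byte[payload.length];
            in.readFully(result);                   // fill the whole array or fail
            return result;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = readPayloadAfterHeader(new byte[] {1, 2, 3}, new byte[] {9, 8});
        System.out.println(Arrays.toString(payload));
    }
}
```

Unlike read(byte[], int, int), which may return after filling only part of the array, readFully( ) either fills the array completely or throws an EOFException.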

Methods that customize the serialization mechanism

The last group of methods consists mostly of protected methods that provide hooks, which allow the serialization mechanism itself, rather than the data associated with a particular class, to be customized. These methods are:

protected boolean enableResolveObject(boolean enable);
protected Class resolveClass(ObjectStreamClass v);
protected Object resolveObject(Object obj);
protected Class resolveProxyClass(String[] interfaces);
protected ObjectStreamClass readClassDescriptor( );
protected Object readObjectOverride( );
protected void readStreamHeader( );
public void registerValidation(ObjectInputValidation obj, int priority);
public ObjectInputStream.GetField readFields( );

These methods are more important to people who tailor the serialization algorithm to a particular use or develop their own implementation of serialization. As before, they require a deeper understanding of the serialization algorithm, so I'll hold off on discussing them right now.

