Parsing and Writing QuickTime Files in Java
Relevance Break: Is This Really Necessary?
You might well wonder if all this stuff is really necessary. After all, MPEG-1 and MPEG-2 don't have a particular file format at all, and they seem pretty popular. What does all of this fanciness gain us?
Consider the power of storing media data by reference. Let's say you're writing an audio or video editor. Your user has selected a big segment of media from a file and wants to copy it from the source movie and paste it into a new one. Do you read all that data from disk? Media files are big, so that's going to take a while. Worse yet, if you can't store it all in memory, are you going to turn around and write it to a scratch movie? Great, copy-and-paste now requires copying hundreds of megabytes--even with fast hard drives, your user will be annoyed (and really unpleasant, if you fill the drive). Consider what QuickTime provides instead: the ability to refer to that source media and an edit list to say what parts of that source we want. The copy and paste is practically instantaneous--we just store pointers.
That's part of the thinking that led MPEG-4 to adopt the QuickTime file format. As Carsten Herpel, Guido Franceschini, and David Singer write in The MPEG-4 Book:
The MPEG committee sought a life-cycle format--one in which the files could be used when capturing media, editing it, and combining it; when serving the media as a file download or as a stream; and when exchanging partial or complete presentations. This need for a life-cycle format is not met in many simple file format designs. For example ... the design approach of MPEG-2, in which a stream is simply recorded to a file, makes editing hard. (pp. 253-4)
Beyond the issues of handling audio and video, consider the scope of MPEG-4, which, in its various permutations, can incorporate 2D and 3D graphics, compositing of captured video with rendered graphics, a Java API ("MPEG-J") for writing interactive applications to be delivered inside a movie or stream, etc. To support all of that, the format needs to be extremely extensible. With the ability to define new structures as new atom types, QuickTime fits the bill.
To learn more about MPEG-4, start at the MPEG-4 Industry Forum. Let's cut to the chase and let our parser take a look at some MPEG-4 content. Envivo, which makes MPEG-4 software, has a handy page of MPEG-4 samples from various sources. A few that I find amusing are the Philips television commercials. Here's what the 800K "CD-R Dinner" commercial looks like when we let our parser have a look at it:
ftyp (16 bytes)
skip (16 bytes)
mdat (2918834 bytes)
moov (46140 bytes) - 6 children
  mvhd (108 bytes)
  trak (469 bytes) - 3 children
    tkhd (92 bytes)
    mdia (337 bytes) - 3 children
      mdhd (32 bytes)
      minf (264 bytes) - 3 children
        dinf (36 bytes) - 1 child
          dref (28 bytes)
        stbl (208 bytes) - 6 children
          stts (24 bytes)
          stsd (84 bytes)
          stsz (20 bytes)
          stsc (28 bytes)
          stco (20 bytes)
          ctts (24 bytes)
        nmhd (12 bytes)
      hdlr (33 bytes) [/odsm - ]
    tref (32 bytes) - 1 child
      mpod (24 bytes)
  trak (449 bytes) - 2 children
    tkhd (92 bytes)
    mdia (349 bytes) - 3 children
      mdhd (32 bytes)
      minf (276 bytes) - 3 children
        dinf (36 bytes) - 1 child
          dref (28 bytes)
        stbl (220 bytes) - 6 children
          stts (24 bytes)
          stsd (96 bytes)
          stsz (20 bytes)
          stsc (28 bytes)
          stco (20 bytes)
          ctts (24 bytes)
        nmhd (12 bytes)
      hdlr (33 bytes) [/sdsm - ]
  trak (5855 bytes) - 2 children
    tkhd (92 bytes)
    mdia (5755 bytes) - 3 children
      mdhd (32 bytes)
      minf (5682 bytes) - 3 children
        dinf (36 bytes) - 1 child
          dref (28 bytes)
        stbl (5622 bytes) - 6 children
          stts (32 bytes)
          stsd (118 bytes)
          stsz (5200 bytes)
          stsc (172 bytes)
          stco (68 bytes)
          ctts (24 bytes)
        smhd (16 bytes)
      hdlr (33 bytes) [/soun - ]
  trak (39209 bytes) - 2 children
    tkhd (92 bytes)
    mdia (39109 bytes) - 3 children
      mdhd (32 bytes)
      minf (39036 bytes) - 3 children
        dinf (36 bytes) - 1 child
          dref (28 bytes)
        stbl (38972 bytes) - 8 children
          stts (5312 bytes)
          stsd (196 bytes)
          stsz (3628 bytes)
          stsc (544 bytes)
          stco (192 bytes)
          ctts (24 bytes)
          stss (108 bytes)
          uuid (28960 bytes)
        vmhd (20 bytes)
      hdlr (33 bytes) [/vide - ]
  iods (42 bytes)
skip (37 bytes)
Similar structure, but some significantly different contents. Here are some key differences worth noting:
- There are more top-level atoms than just moov. Some are trivial (skip is a placeholder for free space in the file), but mdat contains the raw media data for this movie. Our earlier examples referred to media outside the movie file. This is the first our parser has seen of a self-contained movie.
- Actually, there's also a QuickTime atom called wide that's used like the first skip in this file, right before an mdat or other potentially huge atom. It's a placeholder in case the atom grows large enough to require an extended size, which means it would need another 8 bytes of header.
- There are four tracks, two of which are audio and video (as seen by the vmhd video media header and smhd sound media header atoms, and associated handlers of subtypes vide and soun), and two new MPEG-4-only tracks that have nmhd headers. The handlers have subtypes odsm and sdsm. There's another MPEG-4-only atom, the "initial object descriptor" or iods. These MPEG-4 extensions are not defined in the QuickTime spec, but that's okay. We don't trip up parsing them because they're still normal atoms with a type and size.
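To make that last point concrete, here is a minimal sketch of reading an atom header, including the extended-size case that wide leaves room for. The AtomHeader class and its method names are my own illustration, not part of QTJ or the article's sample code, and the sketch doesn't handle a size of 0 (an atom that runs to the end of the file).

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Illustrative sketch: AtomHeader is a hypothetical name, not a QTJ class
// and not taken from the article's sample code.
final class AtomHeader {
    final String type;       // four-character type, such as "moov"
    final long size;         // total atom size in bytes, header included
    final int headerLength;  // 8 normally, 16 when an extended size follows

    private AtomHeader(String type, long size, int headerLength) {
        this.type = type;
        this.size = size;
        this.headerLength = headerLength;
    }

    // Reads one atom header from the stream's current position.
    static AtomHeader read(DataInputStream in) throws IOException {
        // DataInputStream reads big-endian, which matches the file format
        long size = in.readInt() & 0xFFFFFFFFL;   // treat the size as unsigned
        byte[] typeBytes = new byte[4];
        in.readFully(typeBytes);
        String type = new String(typeBytes, StandardCharsets.ISO_8859_1);
        int headerLength = 8;
        if (size == 1) {
            // a 32-bit size of 1 means the real size is in the next 8 bytes
            size = in.readLong();
            headerLength = 16;
        }
        // a size of 0 (atom runs to end of file) is not handled in this sketch
        return new AtomHeader(type, size, headerLength);
    }

    // Convenience for trying the parser on an in-memory byte array.
    static AtomHeader parse(byte[] bytes) {
        try {
            return read(new DataInputStream(new ByteArrayInputStream(bytes)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Bytes remaining in this atom after its header.
    long bodyLength() {
        return size - headerLength;
    }
}
```

A parser that doesn't recognize a type like iods or uuid can simply skip bodyLength() bytes and carry on with the next atom, which is why the MPEG-4 extensions don't trip it up.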
Writing Movie Files with QuickTime
Now that we've toured the format and exposed ourselves to the parsing from
which QuickTime for Java isolates us (with calls like
Movie.fromFile()), we'll turn our attention to writing files. We
can write different kinds of QuickTime files, depending on our
particular needs for an application.
The following code assumes that you have downloaded and installed the QuickTime for Java SDK
on your Mac or Windows machine (apologies, as always, to developers using
operating systems not supported by QuickTime). Because we'll want to use
MPEG-4, please make sure you have QuickTime 6. Also, while the sample code
includes an Ant build.xml
file, you'll need to copy my.ant.properties.mac or
my.ant.properties.win to my.ant.properties and
possibly edit it so that its qtjavazip.file entry points to
QTJava.zip on your system. Curiously, while the QTJ classes are
found in your Java extensions directory when running an application, they need
to be put in the CLASSPATH explicitly for a compile. Equivalent
caveats apply if you're using make or your favorite IDE.
On the other hand, if you just want to run the code, running java
-classpath makemovies.jar com.mac.invalidname.makemovies.MovieMaker
should work fine, with one more caveat--you must use Java 1.3 on the Mac,
because Apple is eliminating the JDirect library used by QuickTime for Java in
its upcoming Java 1.4 implementation and generally advises against calling
Carbon code from their Java 1.4. (This issue is a moving target and the 1.4
implementation is NDA'd, but here's
the java-dev post announcing the policy and a follow-up
with more details.)
The sample MakeMovies class creates a Movie in
memory composed of references to another movie, saving variants of this movie
to disk. The movie is created with low-level
edits, meaning functions that work with segments of a movie defined by
starting time and duration. To keep things simple, our movie consists of
three five-second segments grabbed from the beginning, middle, and end of
another movie:
// figure out start points for 5-second segments at
// approximate beginning, middle, and end of movie
int scale = sourceMovie.getTimeScale();
int end = sourceMovie.getDuration();
int fiveSeconds = 5 * scale;
int[] startTimes = {0,                  // beginning
                    end/2,              // middle
                    end - fiveSeconds}; // end
// insert 5-second segments from sourceMovie into
// refMovie
int fiveSecRefTime = 5 * refMovie.getTimeScale();
for (int i=0; i < startTimes.length; i++) {
    sourceMovie.insertSegment (refMovie,
                               startTimes[i],
                               fiveSeconds,
                               i * fiveSecRefTime);
}
With that, we have a 15-second movie, which the demo app plays in a
QTCanvas. Now to save it to disk.
If you were just combing over the javadocs, you might be tempted to use the
convertToFile method in the Movie class. It's fairly
straightforward, just needing the file and some constants for file-type, Mac
file "creator," and a Mac ScriptManager. The downside here is that
the generated file has uncompressed audio, and video barely compressed with
Apple's "Video" codec. Still, take a look at it with our atom parser
and we've got a normal-looking self-contained movie:
moov (2732 bytes) - 3 children
  mvhd (108 bytes)
  trak (631 bytes) - 3 children
    tkhd (92 bytes)
    edts (36 bytes) - 1 child
      elst (28 bytes) [1 edit]
    mdia (495 bytes) - 3 children
      mdhd (32 bytes)
      hdlr (58 bytes) [mhlr/soun - Apple Sound Media Handler]
      minf (397 bytes) - 4 children
        smhd (16 bytes)
        hdlr (57 bytes) [dhlr/alis - Apple Alias Data Handler]
        dinf (36 bytes) - 1 child
          dref (28 bytes)
        stbl (280 bytes) - 5 children
          stsd (52 bytes)
          stts (24 bytes)
          stsc (40 bytes)
          stsz (20 bytes)
          stco (136 bytes)
  trak (1985 bytes) - 3 children
    tkhd (92 bytes)
    edts (36 bytes) - 1 child
      elst (28 bytes) [1 edit]
    mdia (1849 bytes) - 3 children
      mdhd (32 bytes)
      hdlr (58 bytes) [mhlr/vide - Apple Video Media Handler]
      minf (1751 bytes) - 4 children
        vmhd (20 bytes)
        hdlr (57 bytes) [dhlr/alis - Apple Alias Data Handler]
        dinf (36 bytes) - 1 child
          dref (28 bytes)
        stbl (1630 bytes) - 6 children
          stsd (102 bytes)
          stts (40 bytes)
          stss (56 bytes)
          stsc (364 bytes)
          stsz (824 bytes)
          stco (236 bytes)
free (16 bytes)
wide (8 bytes)
mdat (5059930 bytes)
While we're here, let's note another seemingly-useful-but-probably-not
method: createShortcutMovie in the QTFile class.
You'd be forgiven for thinking that this creates a movie that preserves our
references to the media in the original movie. Not even close--take a look at
it with the atom-parser:
moov (254 bytes) - 1 child
  mdra (246 bytes)
    dref (238 bytes)
In other words, a "shortcut" movie is something of a QuickTime analogue to a file alias or symbolic link.
Exporting Movies
So far, none of these methods has given us a way to specify (and possibly
change) the encoding or format of the saved movie.
That's the realm of the MovieExporter, which writes a movie in a
particular format with our choice of audio and video codecs. The code isn't
hard to understand: get an exporter for a particular format, bring up a dialog
for the user to specify encoding and quality settings, and let the exporter get
to work.
What can be tricky is getting a MovieExporter. The
list of available exporters is variable, depending on the user's version and
what optional pieces of QuickTime they have installed. One technique is to
construct a MovieExporter with an int constant:
MovieExporter me =
    new MovieExporter (StdQTConstants.kQTFileTypeMovie);
This creates an exporter that writes typical QuickTime .mov files.
You can also use the hex value 0x6d706734 to get an MPEG-4
exporter in QuickTime 6. In case you were wondering, that int is
the string mpg4 in ASCII. Passing short strings as 32-bit
ints is very common in the QuickTime API.
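If you'd rather not work out the hex by hand, the packing is easy to sketch in plain Java. The FourCC class below is a hypothetical helper of my own, not a QTJ class; it just shows how four ASCII characters map onto the four bytes of an int, most significant byte first.

```java
import java.nio.charset.StandardCharsets;

// Illustrative helper: FourCC is a hypothetical name, not a QTJ class.
final class FourCC {
    private FourCC() {}

    // Packs a four-character ASCII code into an int, first character
    // in the most significant byte.
    static int toInt(String code) {
        byte[] b = code.getBytes(StandardCharsets.US_ASCII);
        if (b.length != 4) {
            throw new IllegalArgumentException(
                "need exactly 4 characters: " + code);
        }
        return ((b[0] & 0xFF) << 24)
             | ((b[1] & 0xFF) << 16)
             | ((b[2] & 0xFF) << 8)
             |  (b[3] & 0xFF);
    }

    // Unpacks an int back into its four-character form.
    static String fromInt(int value) {
        byte[] b = {
            (byte) (value >>> 24),
            (byte) (value >>> 16),
            (byte) (value >>> 8),
            (byte) value
        };
        return new String(b, StandardCharsets.ISO_8859_1);
    }
}
```

With this, FourCC.toInt("mpg4") gives the 0x6d706734 value mentioned above, and FourCC.fromInt() is handy for printing atom and component types while debugging.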
What if you want to offer the user the ability to export to a format that
might be a post-install add-on, or that might be included in a future version
of QuickTime? For this, the MovieExporter has a second
constructor, one that takes a ComponentIdentifier as its argument.
To find a suitable ComponentIdentifier, we can iterate through the
installed components, with ComponentIdentifier.find(), looking for
those that have type "spit," which is provided as the constant
StdQTConstants.movieExportType. The sample code produces a dialog
of the discovered choices, modestly validating those that are actually
appropriate for exporting our movie:
// build up a list of exporters and let user choose one
Vector compIdentifiers = new Vector();
ComponentIdentifier ci = null;
ComponentDescription cd =
    new ComponentDescription(StdQTConstants.movieExportType);
while ( (ci = ComponentIdentifier.find(ci, cd)) != null) {
    // check to see that the movie can be exported
    // with this component (this throws some obnoxious
    // exceptions, maybe a bit expensive?)
    try {
        MovieExporter exporter = new MovieExporter (ci);
        if (exporter.validate (movie, null))
            compIdentifiers.addElement (ci);
    } catch (StdQTException expE) {} // ow!
}
The sample code then takes the Vector of
ComponentIdentifiers and populates a JComboBox, which
goes into a user dialog, as seen in Figure 2. The sample code tries to export
all tracks, audio and video. Choosing an audio-only format like
"AIFF" will throw a QTException. Production code could
be more careful about what tracks to export, or what choices the user has.

Figure 2--the choice of MovieExporter
Once the user has chosen a MovieExporter, we call a method
named doUserDialog to let the user choose quality and other
format-specific options. If the user chooses the normal "QuickTime
Movie," the export dialog looks like Figure 3. You may notice that the
MPEG-4 exporter dialog is exceptionally verbose and carefully explains whether
or not your choices will create a standard MPEG-4 file readable by other
machines. Another quirk of the MPEG-4 exporter is that Windows users won't be
able to export audio. (I'm not sure if this is because of technical
limitations or issues licensing the AAC audio codec from Dolby.)

Figure 3--the user dialog for QuickTime Movie export
The export takes a long time, particularly with large movies, slow
computers, or certain codecs. To provide a good user experience, it's best to
provide a progress update. In QTJ, a MovieProgress implementation
can get callbacks from time-consuming operations. One thing that makes this a
little difficult, however, is that the javadocs say that as the operation
progresses, your implementation will receive the messages
movieProgressOpen, movieProgressUpdatePercent, and
movieProgressClose ... but those values from the native QuickTime
API don't seem to be defined in QTJ. Fortunately, their values turn out to be
pretty simple: 0, 1, and 2, respectively. In the sample code, I've extended a
Swing ProgressMonitor to update as the export continues, as seen
in Figure 4. Unfortunately, this only works on the Mac. On Windows, the
callbacks occur on the AWT-Windows thread (even though the export
was called from the main thread) and QuickTime seems to block the
AWT thread, so our attempts to update the ProgressMonitor never
get a chance to repaint. I haven't found a clever thread-scheduling or
SwingUtilities way around this. If you do, please put it in the
talkback!

Figure 4--the progress bar for MovieExporter
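Since the three message constants aren't defined in QTJ, one option is to define them yourself and keep the bookkeeping separate from the GUI. The following is only a sketch: ExportProgress and its method names are hypothetical, the constant values are the ones reported above, and I'm assuming the progress value arrives as a fraction between 0.0 and 1.0 (it's clamped in case that assumption is off).

```java
// Illustrative sketch: ExportProgress is a hypothetical class, not part
// of QTJ or the article's sample code.
final class ExportProgress {
    // Values reported above for the native QuickTime messages.
    static final int MOVIE_PROGRESS_OPEN = 0;
    static final int MOVIE_PROGRESS_UPDATE_PERCENT = 1;
    static final int MOVIE_PROGRESS_CLOSE = 2;

    private boolean open;
    private int percent;

    // Call this from the progress callback. fractionDone is assumed to
    // run from 0.0 to 1.0 and is clamped to that range.
    void handle(int message, float fractionDone) {
        switch (message) {
            case MOVIE_PROGRESS_OPEN:
                open = true;
                percent = 0;
                break;
            case MOVIE_PROGRESS_UPDATE_PERCENT:
                float clamped = Math.max(0f, Math.min(1f, fractionDone));
                percent = Math.round(clamped * 100f);
                break;
            case MOVIE_PROGRESS_CLOSE:
                open = false;
                break;
            default:
                // unknown message: ignore it rather than fail the export
                break;
        }
    }

    boolean isOpen() {
        return open;
    }

    int getPercent() {
        return percent;
    }
}
```

A MovieProgress implementation could delegate to handle() and pass getPercent() along to a ProgressMonitor's setProgress(). That does nothing for the Windows repaint problem, but it keeps the magic numbers in one place.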
Flat-land
Let's say that you're happy with saving as a QuickTime movie. In fact, you
want to keep the original audio and video encoding, but you want to eliminate
references to external files, copying all of the media data into one movie that
can be sent to other machines without breaking. This process of eliminating
references is called "flattening." It takes a straightforward call
to Movie.flatten() with a list of usually-constant values:
movie.flatten (0,                                           // movieFlattenFlags
               flatFile,                                    // fileOut
               StdQTConstants.kMoviePlayer,                 // creator
               IOConstants.smSystemScript,                  // scriptTag
               StdQTConstants.createMovieFileDeleteCurFile, // createQTFileFlags
               StdQTConstants.movieInDataForkResID,         // resId
               flatFile.getName());                         // resName
This produces a typical-looking QuickTime movie, with a big
mdat atom, indicating the media is inside of the movie file:
wide (8 bytes)
mdat (2326820 bytes)
moov (3100 bytes) - 4 children
  mvhd (108 bytes)
  trak (2077 bytes) - 3 children
    tkhd (92 bytes)
    edts (36 bytes) - 1 child
      elst (28 bytes) [1 edit]
    mdia (1941 bytes) - 3 children
      mdhd (32 bytes)
      hdlr (58 bytes) [mhlr/vide - Apple Video Media Handler]
      minf (1843 bytes) - 4 children
        vmhd (20 bytes)
        hdlr (57 bytes) [dhlr/alis - Apple Alias Data Handler]
        dinf (36 bytes) - 1 child
          dref (28 bytes)
        stbl (1722 bytes) - 5 children
          stsd (102 bytes)
          stts (24 bytes)
          stsc (412 bytes)
          stsz (920 bytes)
          stco (256 bytes)
  trak (895 bytes) - 3 children
    tkhd (92 bytes)
    edts (60 bytes) - 1 child
      elst (52 bytes) [3 edits]
    mdia (735 bytes) - 3 children
      mdhd (32 bytes)
      hdlr (58 bytes) [mhlr/soun - Apple Sound Media Handler]
      minf (637 bytes) - 4 children
        smhd (16 bytes)
        hdlr (57 bytes) [dhlr/alis - Apple Alias Data Handler]
        dinf (36 bytes) - 1 child
          dref (28 bytes)
        stbl (520 bytes) - 5 children
          stsd (68 bytes)
          stts (24 bytes)
          stsc (256 bytes)
          stsz (20 bytes)
          stco (144 bytes)
  udta (12 bytes) - 0 children
Don't Try This at Home
In a moment of curiosity, I browsed the methods of the
AtomContainer class, which is used (infrequently) to pass around
QuickTime memory structures as particularly complex parameters or for other
really low-level tasks. I noted that it has a getBytes() method
(inherited from QTHandleRef), and that a Movie could
be coaxed into an AtomContainer representation.
So I'm like, "Huh, I could get the raw bytes of the Movie ... wonder what that looks like."
Dumping the byte array to disk is simple, and the first few bytes look awfully familiar:
0000 0e3c 6d6f 6f76 0000 006d 6d76 6864
0000 0000 ba6b 3f16 ba6b 3f70 0000 0258
0000 2328 0001 0000 00ff 0000 0000 0000
...
Yep, there's moov and a mvhd right there on the
first line. The memory structure is almost identical to the
file format. Almost? Yes, it's apparently the same except for one
byte: the size of the mvhd is wrong. On the Mac, it's
0x006d, when it should be 0x006c. On Windows, it's
0x016c. Accounting for endian differences between the platforms,
it's like 1 was added to the size in an endian-specific way.
The sample code dumps the movie's AtomContainer two ways, in
its raw form as atom.out and with this byte fixed as
atom-fixed.mov. Surprisingly, in my testing, this fixed version
consistently plays in QuickTime Player.
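For the curious, here's one way such a fix could be written. This is a sketch rather than the sample code's actual approach. The MvhdSizeFix name is hypothetical, and it assumes what the dumps above show: that mvhd is the first child of moov and that its correct size is 108 bytes (0x6c).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustrative sketch: MvhdSizeFix is a hypothetical name, and this is
// not necessarily how the article's sample code performs the fix.
final class MvhdSizeFix {
    // moov's own header takes bytes 0-7, so mvhd's size field is at 8
    private static final int MVHD_SIZE_OFFSET = 8;
    // assumption: mvhd is 108 bytes, as in every dump shown above
    private static final int MVHD_SIZE = 108;

    private MvhdSizeFix() {}

    // Returns a copy of the raw bytes with mvhd's size field rewritten.
    static byte[] fix(byte[] raw) {
        if (raw.length < 16
                || !"moov".equals(typeAt(raw, 4))
                || !"mvhd".equals(typeAt(raw, 12))) {
            throw new IllegalArgumentException(
                "expected moov with mvhd as its first child");
        }
        byte[] fixed = raw.clone();
        // ByteBuffer writes big-endian by default, matching the file format
        ByteBuffer.wrap(fixed).putInt(MVHD_SIZE_OFFSET, MVHD_SIZE);
        return fixed;
    }

    // Reads the four-character atom type stored at the given offset.
    private static String typeAt(byte[] raw, int offset) {
        return new String(raw, offset, 4, StandardCharsets.ISO_8859_1);
    }
}
```

Rewriting the whole 32-bit size field, rather than subtracting 1 from a single byte, means the same code handles both the Mac's 0x006d and the Windows 0x016c.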
This may not be a recommended way to create a movie on disk that just keeps pointers to its source segments, but it should help tie things together. QuickTime's concepts of movies, tracks, and media, of atoms and their containment hierarchy, and of pointers to media data are not just a conceit of the file format; they are core to how movies are managed in memory and manipulated by code.
Now that you know how hairy those structures are, be glad that the API largely isolates you from them!
Chris Adamson is an author, editor, and developer specializing in iPhone and Mac.