ONJava.com    
 Published on ONJava.com (http://www.onjava.com/)


Parsing and Writing QuickTime Files in Java

by Chris Adamson
02/19/2003

Apple's QuickTime turns 12 this year. Its highly extensible file format has contributed to this longevity, allowing QuickTime to migrate from a world of CD-ROMs, AppleTalk, and static content to today's massively networked, streaming, interactive world. The format is so flexible that it was chosen as the basis of the MPEG-4 file format. More than one might expect, the philosophy and concepts of the file format are integral to working with QuickTime structures at runtime.

However, the QuickTime APIs do much to isolate developers from the nuts and bolts of the file format when doing the most common tasks. So we'll first examine the format with a simple pure-Java QuickTime file format parser, and then use some QuickTime for Java code to generate different kinds of QuickTime files to illustrate the format's flexibility.

The details of the format are readily available in the 351-page Inside QuickTime: QuickTime File Format (PDF). They are also installed--for Mac OS X developers--in /Developer/Documentation/QuickTime/qtdevdocs/PDF/QTFileFormat.pdf by the Developer Tools installer.

Mighty Atom

The heart and soul of QuickTime is the concept of the "atom." The name should remind you of high-school chemistry, where an atom was the smallest unit of an element that retained the properties of the element. In QuickTime, an atom is the lowest level to which we can go and still be able to tell the difference between, say, an edit-list and a sprite. All atoms have a size and a type. Any other information they may contain depends on their type. This concept helps forwards-compatibility in the format--it's easy to skip over an unknown type because the size is right there.

There's a difference between "classic" atoms and newer "QT" atoms, but the latter are backwards-compatible with the former, and both are commonly encountered in a single file. Let's focus on the commonalities. All atoms have a header of either 8 or 16 bytes, consisting of either two or three parts:

  1. atom size: a 4-byte, unsigned integer. If 0, the atom continues to the end of the file.
  2. atom type: a 4-byte value, usually interpreted as an ASCII string like moov, though any value is valid.
  3. Optionally, an extended size: if the atom size was 1, then this field is present and interpreted as an 8-byte unsigned integer. This allows an atom to contain more than 4 GB of data.
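
To make the three header fields concrete, here is a minimal sketch that reads one header from a ByteBuffer. It is not taken from the sample code: the names AtomHeaderSketch, Header, and readHeader are made up for illustration, and the input is a hand-built 8-byte header for a 140-byte moov atom. It relies on the fact that QuickTime stores these integers big-endian, which is also ByteBuffer's default.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class AtomHeaderSketch {
    /** Hypothetical holder for the fields of one atom header. */
    static final class Header {
        final long size;        // atom size in bytes; 0 means "runs to end of file"
        final String type;      // four-character type such as "moov"
        final int headerLength; // 8, or 16 when an extended size is present

        Header(long size, String type, int headerLength) {
            this.size = size;
            this.type = type;
            this.headerLength = headerLength;
        }
    }

    /** Reads one atom header starting at the buffer's current position. */
    static Header readHeader(ByteBuffer buf) {
        // mask to treat the 32-bit size as unsigned
        long size = buf.getInt() & 0xFFFFFFFFL;

        // the type is four raw bytes, usually printable ASCII
        byte[] typeBytes = new byte[4];
        buf.get(typeBytes);
        String type = new String(typeBytes, StandardCharsets.ISO_8859_1);

        int headerLength = 8;
        if (size == 1) {
            // a size of 1 signals that a 64-bit extended size follows
            size = buf.getLong();
            headerLength = 16;
        }
        return new Header(size, type, headerLength);
    }

    public static void main(String[] args) {
        byte[] bytes = {0, 0, 0, (byte) 0x8c, 'm', 'o', 'o', 'v'};
        Header h = readHeader(ByteBuffer.wrap(bytes));
        System.out.println(h.type + " (" + h.size + " bytes), "
                           + h.headerLength + "-byte header");
    }
}
```

Like the parser discussed later, this sketch keeps the extended size in a signed long, so it shares the same theoretical limit on very large atoms.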

The sample code contains a simple example in the EmptyMovie.mov file, which is just an untitled movie created in QuickTime Player and saved without modification. Open it in hexdump, od, or your favorite hex editor (I'm fond of HexEdit for the Mac). If you dump the output as characters (e.g., hexdump -cv EmptyMovie.mov), the atom types practically jump out at you:

\0  \0  \0 214   m   o   o   v  \0  \0  \0   l   m   v   h   d
\0  \0  \0  \0 272   @   Q 352 272   @   Q 372  \0  \0 002   X
\0  \0  \0  \0  \0 001  \0  \0  \0 377  \0  \0  \0  \0  \0  \0
\0  \0  \0  \0  \0 001  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
\0  \0  \0  \0  \0 001  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
\0  \0  \0  \0   @  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
\0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
\0  \0  \0 001  \0  \0  \0 030   u   d   t   a  \0  \0  \0  \f
 W   L   O   C  \0   4  \0 030  \0  \0  \0  \0

If we look at the byte values instead, and carefully count the sizes of the atoms, we can see the structure of the movie. Figure 1 shows a graphic representation. In case you're not comfortable reading hex, the file starts with the size and type of the first atom, an 0x8c-long moov, which matches the file size. It contains a 0x6c-long mvhd, which has a few non-null bytes. The moov's other child is a udta of size 0x18, which itself contains a WLOC of size 0x0c.

Figure 1--graphic map of atoms in EmptyMovie.mov

Little things to notice:

What does all this say anyway? The file-format docs define the contents of each of the "leaf" atoms, so we look there to interpret the mvhd and WLOC atoms. Since this is a minimal movie, there's not much to see--the mvhd is a "movie header," a structure that defines some metadata values like creation time, preferred volume, time-scale, et cetera. These defaults are saved into the file. The next atom is user data, udta, a container for an arbitrarily long list of metadata atoms. This is a good place to put your own data into the movie, in whatever format suits you, so long as you choose an unused atom type and don't use all-lower-case, which is reserved for Apple. Here, there is only one piece of user data, the window location, WLOC. It contains two 16-bit unsigned ints for x and y, in this case (0x34,0x18) or, in decimal, (52,24).
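
As a small illustration of interpreting a leaf atom, this sketch decodes the four payload bytes that follow the WLOC header in the dump above (00 34 00 18). The class and method names are invented for this example and are not part of the sample code.

```java
import java.nio.ByteBuffer;

final class WlocSketch {
    /** Decodes a 4-byte WLOC payload into {x, y}. */
    static int[] decode(byte[] payload) {
        ByteBuffer buf = ByteBuffer.wrap(payload);  // big-endian by default
        int x = buf.getShort() & 0xFFFF;            // mask: the values are unsigned
        int y = buf.getShort() & 0xFFFF;
        return new int[] {x, y};
    }

    public static void main(String[] args) {
        byte[] payload = {0x00, 0x34, 0x00, 0x18};
        int[] xy = decode(payload);
        // prints (x,y) == (52,24)
        System.out.println("(x,y) == (" + xy[0] + "," + xy[1] + ")");
    }
}
```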

Doing It the Hard Way

While QuickTime for Java generally isolates you from the grubby details of the format, I've included a simple all-Java QuickTime file parser so we can quickly see the structure of a movie file on any J2SE platform. Download the accompanying source tarball and open it up. The parser source and a pre-compiled .jar are in the atom-parse directory. An Ant build.xml file is included to help you build the code, if you're interested (do ant help to see the available targets), or you can just run it from the .jar with java -classpath atomparse.jar com.mac.invalidname.qtatomparse.AtomParser.

The code starts with a basic ParsedAtom class, which represents any atom found in the file. This is subclassed as ParsedContainerAtom, containing an array of its children, and ParsedLeafAtom, which is meant to be a parent for type-specific subclasses that interpret particular atom types. A factory provides the parser with the class for a given type--new classes can be added by editing its properties file. Finally, AtomParser puts it all together, recursively calling a parseAtoms method when it discovers a container atom, and returning an array of children.

Here's the critical section for reading an atom's size, type, extended size, and data, given raf (a RandomAccessFile), off (current offset that we're reading; i.e., start of an atom), and stopAt (where the parent atom or file ends).

while (off < stopAt) {
    raf.seek (off);

    // 1. first 32 bits are atom size
    // use BigInteger to convert bytes to long
    // (instead of signed int); the signum argument of 1
    // keeps a size with its top bit set from going negative
    int bytesRead = raf.read (atomSizeBuf, 0,
                              atomSizeBuf.length);
    if (bytesRead < atomSizeBuf.length)
        throw new IOException ("couldn't read atom length");
    BigInteger atomSizeBI = new BigInteger (1, atomSizeBuf);
    long atomSize = atomSizeBI.longValue();
    
    // this is kind of a hack to handle the udta problem
    // (see below) when the parent didn't have children,
    // meaning we've read 4 bytes of 0 and the parent atom
    // is already over
    if (raf.getFilePointer() == stopAt)
        break;
    
    // 2. next, the atom type
    bytesRead = raf.read (atomTypeBuf, 0,
                          atomTypeBuf.length);
    if (bytesRead != atomTypeBuf.length)
        throw new IOException ("Couldn't read atom type");
    String atomType = new String (atomTypeBuf);
    
    // 3. if atomSize was 1, then this is 64-bit ext size
    if (atomSize == 1) {
        bytesRead = raf.read (extendedAtomSizeBuf, 0,
                              extendedAtomSizeBuf.length);
        if (bytesRead != extendedAtomSizeBuf.length)
            throw new IOException (
                      "Couldn't read extended atom size");
        BigInteger extendedSizeBI =
            new BigInteger (extendedAtomSizeBuf);
        atomSize = extendedSizeBI.longValue();
    }
    
    // if this atom size is negative, or extends past end
    // of file, it's extremely suspicious (i.e., we're not
    // really in a QuickTime file)
    if ((atomSize < 0)  ||
       ((off + atomSize) > raf.length()))
           throw new IOException (
               "atom has invalid size: " + atomSize);

    // 4. if a container atom, then parse the children
    ParsedAtom parsedAtom = null;
    if (ATOM_CONTAINER_TYPES.contains (atomType)) {
        // children run from current point to end of the atom
        ParsedAtom [] children =
            parseAtoms (raf, raf.getFilePointer(), off + atomSize);
        parsedAtom =
            new ParsedContainerAtom (atomSize, atomType, children);
    } else {
        parsedAtom =
            AtomFactory.getInstance().createAtomFor (
                atomSize, atomType, raf);
    }
    
    // add atom to the list
    parsedAtomList.add (parsedAtom);
    
    // now set offset to next atom (or to end-of-file
    // in the special case: atomSize == 0 means the atom
    // runs to EOF)
    if (atomSize == 0)
        off = raf.length();
    else 
        off += atomSize;
    
    // if a 'udta' container atom, then jump ahead 4 
    // to work around Apple's QT 1.0 workaround
    // (http://developer.apple.com/technotes/qt/qt_03.html )
    if (atomType.equals("udta"))
        off += 4;
    
} // while not at stopAt

A few caveats about this code. First, please excuse my abuse of the BigInteger class to get longs from four-byte arrays, but the alternative is a blinding amount of bit-shifting. Moreover, the reason I use longs for atom sizes is that it usually avoids sign problems (32-bit Java ints are signed, while the usual QuickTime atom size is a 32-bit unsigned value). However, it will be wrong if you happen to encounter an atom larger than 9,223,372,036,854,775,807 bytes (i.e., a 64-bit integer with the top bit set). Just thought I'd mention that, in case you just got back from the store with a 10-exabyte drive. Also, my scheme for knowing which atoms are containers is to list known containers in AtomParser. If I've missed one, the parser handles it fairly gracefully, because we have the size of the atom and simply advance the offset to the next atom (unfortunately, without parsing the children).
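
For readers curious about the alternatives, here is a sketch of three ways to turn four big-endian bytes into an unsigned value held in a long. The class and method names are made up for this example. One thing worth knowing: the single-argument BigInteger(byte[]) constructor treats its input as two's-complement, so a 4-byte size with the top bit set comes out negative, while the two-argument form BigInteger(1, bytes) treats the bytes as a plain magnitude.

```java
import java.math.BigInteger;
import java.nio.ByteBuffer;

final class UnsignedSizeSketch {
    /** The bit-shifting route: mask each byte, then shift it into place. */
    static long viaShifts(byte[] b) {
        return ((b[0] & 0xFFL) << 24)
             | ((b[1] & 0xFFL) << 16)
             | ((b[2] & 0xFFL) << 8)
             |  (b[3] & 0xFFL);
    }

    /** Read a signed int, then mask it into the low 32 bits of a long. */
    static long viaByteBuffer(byte[] b) {
        return ByteBuffer.wrap(b).getInt() & 0xFFFFFFFFL;
    }

    /** BigInteger with an explicit positive sign. */
    static long viaBigInteger(byte[] b) {
        return new BigInteger(1, b).longValue();
    }

    public static void main(String[] args) {
        byte[] size = {(byte) 0x80, 0, 0, 0};  // top bit set: 2,147,483,648
        System.out.println(viaShifts(size));
        System.out.println(viaByteBuffer(size));
        System.out.println(viaBigInteger(size));
        // the one-argument constructor reads the same bytes as negative
        System.out.println(new BigInteger(size).longValue());
    }
}
```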

Here's the output when we run the parser on EmptyMovie.mov:

moov (140 bytes) - 2 children
  mvhd (108 bytes) 
  udta (24 bytes) - 1 child
    WLOC (12 bytes)  (x,y) == (52,24)
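
An indented listing like this one falls out of a simple recursive walk over the parsed tree. The sketch below is a stand-in, not the sample code's ParsedAtom classes: Node and render are hypothetical names, the tree is built by hand rather than parsed, and, unlike the real parser, it cannot tell an empty container from a leaf.

```java
import java.util.ArrayList;
import java.util.List;

final class AtomTreeSketch {
    /** Minimal stand-in for a parsed atom; children is empty for leaves. */
    static final class Node {
        final String type;
        final long size;
        final List<Node> children = new ArrayList<Node>();

        Node(String type, long size, Node... kids) {
            this.type = type;
            this.size = size;
            for (Node kid : kids) {
                children.add(kid);
            }
        }
    }

    /** Renders a node and its descendants, two spaces of indent per level. */
    static String render(Node node, int depth) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            sb.append("  ");
        }
        sb.append(node.type).append(" (").append(node.size).append(" bytes)");
        int n = node.children.size();
        if (n > 0) {
            sb.append(" - ").append(n).append(n == 1 ? " child" : " children");
        }
        sb.append('\n');
        for (Node child : node.children) {
            sb.append(render(child, depth + 1));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node moov = new Node("moov", 140,
            new Node("mvhd", 108),
            new Node("udta", 24, new Node("WLOC", 12)));
        System.out.print(render(moov, 0));
    }
}
```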

So far, so boring. Let's try a more interesting bit of content. The movie tim-drm-ref.mov is a 45-second sound bite of Tim O'Reilly discussing digital rights management at the recent O'Reilly Mac OS X conference. The file is a reference to a 51 MB movie of the entire keynote panel, yet this file is a dainty 6 KB, since it consists entirely of metadata, including the references to the original movie on the O'Reilly web site.

This file's structure is a lot more involved:

moov (5957 bytes) - 4 children
  mvhd (108 bytes) 
  trak (3951 bytes) - 4 children
    tkhd (92 bytes) 
    edts (36 bytes) - 1 child
      elst (28 bytes) [1 edit]
    mdia (3803 bytes) - 3 children
      mdhd (32 bytes) 
      hdlr (58 bytes) [mhlr/vide - Apple Video Media Handler]
      minf (3705 bytes) - 4 children
        vmhd (20 bytes) 
        hdlr (55 bytes) [dhlr/url  - Apple URL Data Handler]
        dinf (76 bytes) - 1 child
          dref (68 bytes) 
        stbl (3546 bytes) - 6 children
          stsd (102 bytes) 
          stts (24 bytes) 
          stss (216 bytes) 
          stsc (172 bytes) 
          stsz (2248 bytes) 
          stco (776 bytes) 
    udta (12 bytes) - 0 children
  trak (1857 bytes) - 4 children
    tkhd (92 bytes) 
    edts (36 bytes) - 1 child
      elst (28 bytes) [1 edit]
    mdia (1709 bytes) - 3 children
      mdhd (32 bytes) 
      hdlr (58 bytes) [mhlr/soun - Apple Sound Media Handler]
      minf (1611 bytes) - 4 children
        smhd (16 bytes) 
        hdlr (55 bytes) [dhlr/url  - Apple URL Data Handler]
        dinf (76 bytes) - 1 child
          dref (68 bytes) 
        stbl (1456 bytes) - 5 children
          stsd (132 bytes) 
          stts (24 bytes) 
          stsc (880 bytes) 
          stsz (20 bytes) 
          stco (392 bytes) 
    udta (12 bytes) - 0 children
  udta (33 bytes) - 2 children
    WLOC (12 bytes)  (x,y) == (83,93)
    SelO (9 bytes)

This file is far more typical of what we expect to see in a movie, or more accurately, in a moov (go ahead, say it out loud: moo-vee). In addition to the metadata-bearing mvhd movie header and the udta user data, there are two trak atoms, both with a deep, yet similar, structure. This movie consists of two "tracks," one for video and one for audio. Tracks store metadata in the tkhd track header (analogous to the mvhd we saw earlier), an "edits" structure that indicates what parts of the underlying media are used by the track, and a detailed "media" structure.

The media structure has, again, a metadata header, a hdlr handler atom that indicates which component should handle the media data, a "data information" structure made up of dref data references to say where the media data is (in this file, elsewhere on disk, on the net, etc.), and finally, a tricky structure for locating and interpreting media samples.

It's too much to try to understand what all of these atoms represent right away if you're new to QuickTime, but it might be helpful to look at Apple's Introduction to QuickTime tutorial, specifically the section on tracks and media, and see how the contents map fairly directly onto the structure presented in the preceding two paragraphs. Another point of interest is Ridgeworks' QTatomizer, a shareware product that represents the atom structure of a QuickTime movie as a Swing JTree.

Relevance Break: Is This Really Necessary?

You might well wonder if all this stuff is really necessary. After all, MPEG-1 and MPEG-2 don't have a particular file format at all, and they seem pretty popular. What does all of this fanciness gain us?

Consider the power of storing media data by reference. Let's say you're writing an audio or video editor. Your user has selected a big segment of media from a file and wants to copy it from the source movie and paste it into a new one. Do you read all that data from disk? Media files are big, so that's going to take a while. Worse yet, if you can't store it all in memory, are you going to turn around and write it to a scratch movie? Great, copy-and-paste now requires copying hundreds of megabytes--even with fast hard drives, your user will be annoyed (and really unpleasant, if you fill the drive). Consider what QuickTime provides instead: the ability to refer to that source media and an edit list to say what parts of that source we want. The copy and paste is practically instantaneous--we just store pointers.

That's part of the thinking that led MPEG-4 to adopt the QuickTime file format. As Carsten Herpel, Guido Franceschini, and David Singer write in The MPEG-4 Book:

The MPEG committee sought a life-cycle format--one in which the files could be used when capturing media, editing it, and combining it; when serving the media as a file download or as a stream; and when exchanging partial or complete presentations. This need for a life-cycle format is not met in many simple file format designs. For example ... the design approach of MPEG-2, in which a stream is simply recorded to a file, makes editing hard. (pp. 253-4)

Beyond the issues of handling audio and video, consider the scope of MPEG-4, which, in its various permutations, can incorporate 2D and 3D graphics, compositing of captured video with rendered graphics, a Java API ("MPEG-J") for writing interactive applications to be delivered inside a movie or stream, etc. To support all of that, the format needs to be extremely extensible. With the ability to define new structures as new atom types, QuickTime fits the bill.

To learn more about MPEG-4, start at the MPEG-4 Industry Forum. Let's cut to the chase and let our parser take a look at some MPEG-4 content. Envivio, which makes MPEG-4 software, has a handy page of MPEG-4 samples from various sources. A few that I find amusing are the Philips television commercials. Here's what the 800K "CD-R Dinner" commercial looks like when we let our parser have a look at it:

ftyp (16 bytes) 
skip (16 bytes) 
mdat (2918834 bytes) 
moov (46140 bytes) - 6 children
  mvhd (108 bytes) 
  trak (469 bytes) - 3 children
    tkhd (92 bytes) 
    mdia (337 bytes) - 3 children
      mdhd (32 bytes) 
      minf (264 bytes) - 3 children
        dinf (36 bytes) - 1 child
          dref (28 bytes) 
        stbl (208 bytes) - 6 children
          stts (24 bytes) 
          stsd (84 bytes) 
          stsz (20 bytes) 
          stsc (28 bytes) 
          stco (20 bytes) 
          ctts (24 bytes) 
        nmhd (12 bytes) 
      hdlr (33 bytes) [/odsm - ]
    tref (32 bytes) - 1 child
      mpod (24 bytes) 
  trak (449 bytes) - 2 children
    tkhd (92 bytes) 
    mdia (349 bytes) - 3 children
      mdhd (32 bytes) 
      minf (276 bytes) - 3 children
        dinf (36 bytes) - 1 child
          dref (28 bytes) 
        stbl (220 bytes) - 6 children
          stts (24 bytes) 
          stsd (96 bytes) 
          stsz (20 bytes) 
          stsc (28 bytes) 
          stco (20 bytes) 
          ctts (24 bytes) 
        nmhd (12 bytes) 
      hdlr (33 bytes) [/sdsm - ]
  trak (5855 bytes) - 2 children
    tkhd (92 bytes) 
    mdia (5755 bytes) - 3 children
      mdhd (32 bytes) 
      minf (5682 bytes) - 3 children
        dinf (36 bytes) - 1 child
          dref (28 bytes) 
        stbl (5622 bytes) - 6 children
          stts (32 bytes) 
          stsd (118 bytes) 
          stsz (5200 bytes) 
          stsc (172 bytes) 
          stco (68 bytes) 
          ctts (24 bytes) 
        smhd (16 bytes) 
      hdlr (33 bytes) [/soun - ]
  trak (39209 bytes) - 2 children
    tkhd (92 bytes) 
    mdia (39109 bytes) - 3 children
      mdhd (32 bytes) 
      minf (39036 bytes) - 3 children
        dinf (36 bytes) - 1 child
          dref (28 bytes) 
        stbl (38972 bytes) - 8 children
          stts (5312 bytes) 
          stsd (196 bytes) 
          stsz (3628 bytes) 
          stsc (544 bytes) 
          stco (192 bytes) 
          ctts (24 bytes) 
          stss (108 bytes) 
          uuid (28960 bytes) 
        vmhd (20 bytes) 
      hdlr (33 bytes) [/vide - ]
  iods (42 bytes) 
skip (37 bytes)

Similar structure, but some significantly different contents. Here are some key differences worth noting:

Writing Movie Files with QuickTime

Now that we've toured the format and exposed ourselves to the parsing from which QuickTime for Java isolates us (with calls like Movie.fromFile()), we'll turn our attention to writing files. We can write different kinds of QuickTime files, depending on the needs of a particular application.

The following code assumes that you have downloaded and installed the QuickTime for Java SDK on your Mac or Windows machine (apologies, as always, to developers using operating systems not supported by QuickTime). Because we'll want to use MPEG-4, please make sure you have QuickTime 6. Also, while the sample code includes an Ant build.xml file, you'll need to copy my.ant.properties.mac or my.ant.properties.win to my.ant.properties and possibly edit it so that its qtjavazip.file entry points to QTJava.zip on your system. Curiously, while the QTJ classes are found in your Java extensions directory when running an application, they need to be put in the CLASSPATH explicitly for a compile. Equivalent caveats apply if you're using make or your favorite IDE.

On the other hand, if you just want to run the code, running java -classpath makemovies.jar com.mac.invalidname.makemovies.MovieMaker should work fine, with one more caveat--you must use Java 1.3 on the Mac, because Apple is eliminating the JDirect library used by QuickTime for Java in its upcoming Java 1.4 implementation and generally advises against calling Carbon code from their Java 1.4. (This issue is a moving target and the 1.4 implementation is NDA'd, but here's the java-dev post announcing the policy and a follow-up with more details.)

The sample MakeMovies class creates a Movie in memory composed of references to another movie, saving variants of this movie to disk. The movie is created with low-level edits, meaning functions that work with segments of a movie defined by starting time and duration. To keep things simple, our movie consists of three five-second segments grabbed from the beginning, middle, and end of another movie:

// figure out start points for 5-second segments at
// approximate beginning, middle, and end of movie
int scale = sourceMovie.getTimeScale();
int end   = sourceMovie.getDuration();
int fiveSeconds  = 5 * scale;
int[] startTimes = {0, // beginning
                    end/2, // middle
                    end - fiveSeconds};

// insert 5-second segments from sourceMovie into
// refMovie
int fiveSecRefTime = 5 * refMovie.getTimeScale();
for (int i=0; i < startTimes.length; i++) {
    sourceMovie.insertSegment (refMovie,
                               startTimes[i],
                               fiveSeconds,
                               i * fiveSecRefTime);
}
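
The arithmetic here is easy to check in isolation, since QuickTime times are counts of time-scale units. The sketch below uses no QuickTime classes, and the numbers are assumptions chosen for illustration: a time scale of 600 units per second and a 60-second source movie.

```java
final class SegmentTimesSketch {
    /**
     * Start times, in time-scale units, for segments of the given length
     * at the beginning, middle, and end of a movie. Assumes the movie is
     * at least as long as one segment.
     */
    static int[] startTimes(int timeScale, int duration, int seconds) {
        int segment = seconds * timeScale;   // segment length in units
        return new int[] {0,                   // beginning
                          duration / 2,        // middle
                          duration - segment}; // end
    }

    public static void main(String[] args) {
        int timeScale = 600;             // assumed units per second
        int duration  = 60 * timeScale;  // assumed 60-second movie
        int[] starts = startTimes(timeScale, duration, 5);
        for (int i = 0; i < starts.length; i++) {
            System.out.println("segment " + i + " starts at unit " + starts[i]
                + " (" + (starts[i] / (double) timeScale) + " s)");
        }
    }
}
```

With these assumed values the three segments start at 0, 30, and 55 seconds into the source.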

With that, we have a 15-second movie, which the demo app plays in a QTCanvas. Now to save it to disk.

If you were just combing over the javadocs, you might be tempted to use the convertToFile method in the Movie class. It's fairly straightforward, just needing the file and some constants for file-type, Mac file "creator," and a Mac ScriptManager. The downside here is that the generated file has uncompressed audio, and video barely compressed with Apple's "Video" codec. Still, take a look at it with our atom parser and we've got a normal-looking self-contained movie:

moov (2732 bytes) - 3 children
  mvhd (108 bytes) 
  trak (631 bytes) - 3 children
    tkhd (92 bytes) 
    edts (36 bytes) - 1 child
      elst (28 bytes) [1 edit]
    mdia (495 bytes) - 3 children
      mdhd (32 bytes) 
      hdlr (58 bytes) [mhlr/soun - Apple Sound Media Handler]
      minf (397 bytes) - 4 children
        smhd (16 bytes) 
        hdlr (57 bytes) [dhlr/alis - Apple Alias Data Handler]
        dinf (36 bytes) - 1 child
          dref (28 bytes) 
        stbl (280 bytes) - 5 children
          stsd (52 bytes) 
          stts (24 bytes) 
          stsc (40 bytes) 
          stsz (20 bytes) 
          stco (136 bytes) 
  trak (1985 bytes) - 3 children
    tkhd (92 bytes) 
    edts (36 bytes) - 1 child
      elst (28 bytes) [1 edit]
    mdia (1849 bytes) - 3 children
      mdhd (32 bytes) 
      hdlr (58 bytes) [mhlr/vide - Apple Video Media Handler]
      minf (1751 bytes) - 4 children
        vmhd (20 bytes) 
        hdlr (57 bytes) [dhlr/alis - Apple Alias Data Handler]
        dinf (36 bytes) - 1 child
          dref (28 bytes) 
        stbl (1630 bytes) - 6 children
          stsd (102 bytes) 
          stts (40 bytes) 
          stss (56 bytes) 
          stsc (364 bytes) 
          stsz (824 bytes) 
          stco (236 bytes) 
free (16 bytes) 
wide (8 bytes) 
mdat (5059930 bytes)

While we're here, let's note another seemingly-useful-but-probably-not method: createShortcutMovie in the QTFile class. You'd be forgiven for thinking that this creates a movie that preserves our references to the media in the original movie. Not even close--take a look at it with the atom-parser:

moov (254 bytes) - 1 child
  mdra (246 bytes) 
    dref (238 bytes)

In other words, a "shortcut" movie is something of a QuickTime analogue to a file alias or symbolic link.

Exporting Movies

So far, none of these methods has given us a way to specify (and possibly change) the encoding or format of the saved movie. That's the realm of the MovieExporter, which writes a movie in a particular format with our choice of audio and video codecs. The code isn't hard to understand: get an exporter for a particular format, bring up a dialog for the user to specify encoding and quality settings, and let the exporter get to work.

What can be tricky is getting a MovieExporter. The list of available exporters varies, depending on the user's QuickTime version and which optional pieces of QuickTime they have installed. One technique is to call the MovieExporter constructor with an int constant:

MovieExporter me =
    new MovieExporter (StdQTConstants.kQTFileTypeMovie);

This creates an exporter to create typical QuickTime .movs. You can also use the hex value 0x6d706734 to get an MPEG-4 exporter in QuickTime 6. In case you were wondering, that int is the string mpg4 in ASCII. Passing short strings as 32-bit ints is very common in the QuickTime API.
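
If you want to verify a value like that yourself, packing a four-character string into an int takes only a few lines. This is a plain-Java sketch with invented names (FourCharCodeSketch, toInt, fromInt); it does not use any QuickTime for Java classes.

```java
import java.nio.charset.StandardCharsets;

final class FourCharCodeSketch {
    /** Packs four ASCII characters into a big-endian 32-bit int. */
    static int toInt(String code) {
        byte[] b = code.getBytes(StandardCharsets.US_ASCII);
        if (b.length != 4) {
            throw new IllegalArgumentException(
                "need exactly four ASCII characters");
        }
        return ((b[0] & 0xFF) << 24)
             | ((b[1] & 0xFF) << 16)
             | ((b[2] & 0xFF) << 8)
             |  (b[3] & 0xFF);
    }

    /** Unpacks a 32-bit int back into its four characters. */
    static String fromInt(int value) {
        byte[] b = {(byte) (value >>> 24), (byte) (value >>> 16),
                    (byte) (value >>> 8),  (byte) value};
        return new String(b, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // prints 0x6d706734
        System.out.println("0x" + Integer.toHexString(toInt("mpg4")));
        // prints moov
        System.out.println(fromInt(0x6d6f6f76));
    }
}
```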

What if you want to offer the user the ability to export to a format that might be a post-install add-on, or that might be included in a future version of QuickTime? For this, the MovieExporter has a second constructor, one that takes a ComponentIdentifier as its argument. To find a suitable ComponentIdentifier, we can iterate through the installed components, with ComponentIdentifier.find(), looking for those that have type "spit," which is provided as the constant StdQTConstants.movieExportType. The sample code produces a dialog of the discovered choices, modestly validating those that are actually appropriate for exporting our movie:

// build up a list of exporters and let user choose one
Vector compIdentifiers  = new Vector();
ComponentIdentifier ci  = null;
ComponentDescription cd =
    new ComponentDescription(StdQTConstants.movieExportType);

while ( (ci = ComponentIdentifier.find(ci, cd)) != null) {
    // check to see that the movie can be exported
    // with this component (this throws some obnoxious
    // exceptions, maybe a bit expensive?)
    try {
        MovieExporter exporter = new MovieExporter (ci);
        if (exporter.validate (movie, null))
            compIdentifiers.addElement (ci);
    } catch (StdQTException expE) {} // ow!
}

The sample code then takes the Vector of ComponentIdentifiers and populates a JComboBox, which goes into a user dialog, as seen in Figure 2. The sample code tries to export all tracks, audio and video. Choosing an audio-only format like "AIFF" will throw a QTException. Production code could be more careful about what tracks to export, or what choices the user has.

Figure 2--the choice of MovieExporter

Once the user has chosen a MovieExporter, we call a method named doUserDialog to let the user choose quality and other format-specific options. If the user chooses the normal "QuickTime Movie," the export dialog looks like Figure 3. You may notice that the MPEG-4 exporter dialog is exceptionally verbose and carefully explains whether or not your choices will create a standard MPEG-4 file readable by other machines. Another quirk of the MPEG-4 exporter is that Windows users won't be able to export audio. (I'm not sure if this is because of technical limitations or issues licensing the AAC audio codec from Dolby.)

Figure 3--the user dialog for QuickTime Movie export

The export takes a long time, particularly with large movies, slow computers, or certain codecs. To provide a good user experience, it's best to provide a progress update. In QTJ, a MovieProgress implementation can get callbacks from time-consuming operations. One thing that makes this a little difficult, however, is that the javadocs say that as the operation progresses, your implementation will receive the messages movieProgressOpen, movieProgressUpdatePercent, and movieProgressClose ... but those values from the native QuickTime API don't seem to be defined in QTJ. Fortunately, their values turn out to be pretty simple: 0, 1, and 2, respectively. In the sample code, I've extended a Swing ProgressMonitor to update as the export continues, as seen in Figure 4. Unfortunately, this only works on the Mac. On Windows, the callbacks occur on the AWT-Windows thread (even though the export was called from the main thread) and QuickTime seems to block the AWT thread, so our attempts to update the ProgressMonitor never get a chance to repaint. I haven't found a clever thread-scheduling or SwingUtilities way around this. If you do, please put it in the talkback!

Figure 4--the progress bar for MovieExporter

Flat-land

Let's say that you're happy with saving as a QuickTime movie. In fact, you want to keep the original audio and video encoding, but you want to eliminate references to external files, copying all of the media data into one movie that can be sent to other machines without breaking. This process of eliminating references is called "flattening." It takes a straightforward call to Movie.flatten() with a list of usually-constant values:

movie.flatten (0,                                // movieFlattenFlags
    flatFile,                                    // fileOut
    StdQTConstants.kMoviePlayer,                 // creator
    IOConstants.smSystemScript,                  // scriptTag
    StdQTConstants.createMovieFileDeleteCurFile, // createQTFileFlags
    StdQTConstants.movieInDataForkResID,         // resId
    flatFile.getName());                         // resName

This produces a typical-looking QuickTime movie, with a big mdat atom, indicating the media is inside of the movie file:

wide (8 bytes) 
mdat (2326820 bytes) 
moov (3100 bytes) - 4 children
  mvhd (108 bytes) 
  trak (2077 bytes) - 3 children
    tkhd (92 bytes) 
    edts (36 bytes) - 1 child
      elst (28 bytes) [1 edit]
    mdia (1941 bytes) - 3 children
      mdhd (32 bytes) 
      hdlr (58 bytes) [mhlr/vide - Apple Video Media Handler]
      minf (1843 bytes) - 4 children
        vmhd (20 bytes) 
        hdlr (57 bytes) [dhlr/alis - Apple Alias Data Handler]
        dinf (36 bytes) - 1 child
          dref (28 bytes) 
        stbl (1722 bytes) - 5 children
          stsd (102 bytes) 
          stts (24 bytes) 
          stsc (412 bytes) 
          stsz (920 bytes) 
          stco (256 bytes) 
  trak (895 bytes) - 3 children
    tkhd (92 bytes) 
    edts (60 bytes) - 1 child
      elst (52 bytes) [3 edits]
    mdia (735 bytes) - 3 children
      mdhd (32 bytes) 
      hdlr (58 bytes) [mhlr/soun - Apple Sound Media Handler]
      minf (637 bytes) - 4 children
        smhd (16 bytes) 
        hdlr (57 bytes) [dhlr/alis - Apple Alias Data Handler]
        dinf (36 bytes) - 1 child
          dref (28 bytes) 
        stbl (520 bytes) - 5 children
          stsd (68 bytes) 
          stts (24 bytes) 
          stsc (256 bytes) 
          stsz (20 bytes) 
          stco (144 bytes) 
  udta (12 bytes) - 0 children

Don't Try This at Home

In a moment of curiosity, I browsed the methods of the AtomContainer class, which is used (infrequently) to pass around QuickTime memory structures as particularly complex parameters or for other really low-level tasks. I noted that it has a getBytes() method (inherited from QTHandleRef), and that a Movie could be coaxed into an AtomContainer representation.

So I'm like, "Huh, I could get the raw bytes of the Movie ... wonder what that looks like."

Dumping the byte array to disk is simple, and the first few bytes look awfully familiar:

0000 0e3c 6d6f 6f76 0000 006d 6d76 6864
0000 0000 ba6b 3f16 ba6b 3f70 0000 0258
0000 2328 0001 0000 00ff 0000 0000 0000
... 

Yep, there's moov and a mvhd right there on the first line. The memory structure is almost identical to the file format. Almost? Yes, it's apparently the same except for one byte: the size of the mvhd is wrong. On the Mac, it's 0x006d, when it should be 0x006c. On Windows, it's 0x016c. Accounting for endian differences between the platforms, it's like 1 was added to the size in an endian-specific way.
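
Here is one way to see how an endian-specific increment could produce both values. This is speculation that happens to fit the two numbers quoted above, not a confirmed explanation: the sketch assumes the stray +1 is applied to a 16-bit quantity in the machine's native byte order. The names are invented for the example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class EndianIncrementSketch {
    /**
     * Takes two bytes in file order, reads them as a 16-bit value in the
     * given byte order, adds one, writes the result back in that same
     * order, and returns the bytes as they would appear in a hex dump.
     */
    static int incrementAs(int fileOrderBytes, ByteOrder order) {
        ByteBuffer buf = ByteBuffer.allocate(2);        // big-endian by default
        buf.putShort(0, (short) fileOrderBytes);        // lay bytes down in file order
        buf.order(order);                               // switch to the CPU's view
        buf.putShort(0, (short) (buf.getShort(0) + 1)); // add 1 in that view
        buf.order(ByteOrder.BIG_ENDIAN);                // back to file order
        return buf.getShort(0) & 0xFFFF;
    }

    public static void main(String[] args) {
        // prints big-endian:    0x006d
        System.out.printf("big-endian:    0x%04x%n",
            incrementAs(0x006c, ByteOrder.BIG_ENDIAN));
        // prints little-endian: 0x016c
        System.out.printf("little-endian: 0x%04x%n",
            incrementAs(0x006c, ByteOrder.LITTLE_ENDIAN));
    }
}
```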

The sample code dumps the movie's AtomContainer two ways, in its raw form as atom.out and with this byte fixed as atom-fixed.mov. Surprisingly, in my testing, this fixed version consistently plays in QuickTime Player.

This may not be a recommended way to create a movie on disk that just keeps pointers to its source segments, but it should help tie things together. QuickTime's concepts of movies, tracks, and media, of atoms and their containment hierarchy, and of pointers to media data are not just a conceit of the file format, but a core part of how movies are managed in memory and manipulated by code.

Now that you know how hairy those structures are, be glad that the API largely isolates you from them!

Chris Adamson is an author, editor, and developer specializing in iPhone and Mac.



Copyright © 2009 O'Reilly Media, Inc.