Apple QuickTime 3.0 and the Avid/Microsoft Advanced Authoring Format
By Bob Doyle
At the National Association of Broadcasters (NAB) '98 keynote address, Apple's co-founder and acting CEO, Steve Jobs, called current computer video technologies a 'Tower of Babel' and introduced QuickTime 3, the latest version of the de facto standard for media file interchange on the Mac platform, and the enabling technology that allows any Mac nonlinear editing software to run on any Mac video capture hardware. (See sidebar: Windows needs a hardware abstraction layer.) Apple released QuickTime 3 as a full-fledged authoring tool on Windows, confounding a few skeptics who thought Jobs might cancel a Windows authoring version at the last minute (QT 2.5 for Windows was playback only), just as he had pulled the plug on Apple clones. (See sidebar: Cross-platform tools are not a threat to Apple.) Now Mac editing apps can run on Windows hardware that supports QuickTime. Happy developers at NAB showed versions of their products that will run seamlessly on Windows thanks to QuickTime 3, including Macromedia's Final Cut, Media 100's Finish, Post Digital's Roto, and Radius MotoDV.
Meanwhile, a consortium led by Microsoft and Avid announced the Advanced Authoring Format (AAF), a new file format for media interchange and, more importantly, for interchange of compositional metadata such as edit decision lists (EDL), special effects (SFX), transitions (TX), text (subtitles, copyright and media source info), and composited layers with alpha-channel video, travelling mattes, moving still images, and 2D or 3D animations. AAF will also store production data ranging from credits to technical details such as video camera setup, film speed, exposure information, and lens focal length, along with geographical information ranging from the GIS location of the shoot to the library where the film or videotape originals for each scene are stored. The consortium announced that AAF has been submitted to the EBU/SMPTE Task Force for the Harmonization of Standards for the Exchange of Program Material as Bit Streams as a response to the Task Force Request for Technology.
AAF metadata interchange will be based on Avid's Open Media Framework Interchange (OMFI). OMFI was previously implemented in Apple's Bento container architecture, now declared end-of-life by Apple. Its container architecture will be newly implemented in Microsoft's object-oriented COM-based Structured Storage. OMFI is widely used in the audio industry for exchange of audio files that must synchronize with edited video and film productions. It is also used by Heuris' MPEG Power Professional MPEG encoders to extract cut points from a production, which become I-frames in the Encoding Control List (ECL), and to locate SFX and TX that need special encoder handling. But Avid has only recently extended OMFI metadata and media interchange to its own line of video and film products, and not yet completely. Outside of Avid, only a few of Avid's NLE (nonlinear editing) competitors can exchange compositional metadata and Motion-JPEG video media via OMFI, notably Tektronix/Lightworks and Softimage (Microsoft). This is not entirely Avid's fault. Its partners have had the spec for years and have not implemented it. Ironically, some OMFI media files can be interchanged using QuickTime 3. For example, Avid's entry-level nonlinear editor Cinema can use QuickTime 3 to open high-end Avid Media Composer files. At NAB '98 Discreet Logic showed their editing and effects workstations exchanging files between Macintosh and Windows.
The annual OpenStudio Roundtable at NAB, sponsored by Videography magazine, was dominated by heated discussion that stemmed from mistaking AAF for simply a competitor to QuickTime. The wide range of compositional metadata that OMFI addresses is not currently part of QuickTime, and SMPTE/EBU (Society of Motion Picture and Television Engineers and European Broadcasting Union) are looking to harmonize standards for the exchange of program material that will encode even more than OMFI: things like the location of all the source program material, keywords to automate media access management, and possibly even the names of everyone involved in the production. (See sidebar: Standardization and Registration of Metadata)
SMPTE/EBU is looking to standardize on technology that is available without licensing fees. Microsoft's reimplementation of OMFI (AAF will retool the OMFI container format using Microsoft COM-based Structured Storage architecture instead of the Bento technology that Apple has declared end of life) will presumably be free. AAF has been offered to SMPTE/EBU, just as Apple offered QuickTime and had it accepted as a foundation for MPEG-4.
Avid's Oliver Morgan (Chair of the Sub-group on Wrappers and Metadata of the EBU/SMPTE Task Force for the Harmonization of Standards for the Exchange of Program Material as Bit Streams) calmed the participants and eventually won applause for the valuable work that issues from standards committees, usually after years of 'due process' that canvasses the affected industry. QuickTime architect Peter Hoddie noted that Apple has submitted QuickTime to another such committee at the International Organization for Standardization (ISO) working on MPEG-4, and that QuickTime has been adopted as (at least one possible) basis for MPEG-4.
Apple has also responded to the EBU/SMPTE task force request for technology (RFT) on Wrappers and Metadata, describing how QuickTime meets some of the specific requirements in that RFT. If the market demands it, Apple will probably add metadata interchange to QuickTime, but they need not develop it alone, or reinvent OMFI, which is a fine starting place for metadata and wrappers interchange.
Post-production professionals who attended the last several years of NAB shows may have detected a pattern of poorly thought out and ultimately broken Microsoft promises to improve Windows professional video and make Windows the dominant nonlinear editing platform. The original Video for Windows (AVI file format) was to be replaced by Active Movie. Then an industry consortium similar in makeup to the AAF group designed OpenDML, which was partially realized in Active Movie 2 (sometimes called AVI-2). However, the all-important hardware abstraction layer was never built (or approved) by Microsoft (which was then more interested in distribution than content creation), so companies like Avid, D-Vision, and in:sync had to build their own hardware-specific drivers (with the critical help of Truevision and later Matrox) to stay alive in the Windows editing business. At the next NAB, Microsoft took a ninety-degree turn to the Internet with DirectShow, described as more than a file format: a multimedia architecture like QuickTime. DirectShow then evolved into Active Streaming Format (ASF) for streaming video on the web. Microsoft pledged ASF would eliminate AVI and WAV files, which were declared inadequate because, for example, they did not support timecode. (See sidebar: Replace AVI now that it's working?) Microsoft engineers were ordered to stop work on professional video (among other things) and join the race with Netscape. Skeptics will be forgiven for feeling more than 'once burned, twice shy' about the new AAF. But OMFI is a serious tool, and should outlive next year's new directions from Microsoft. (Unfortunately, the changes Microsoft's new coalition is proposing for OMFI will take at least until next NAB to implement, so no products are at all likely to use this new tentative format for at least a year.)
During the same years over at a beleaguered Apple, the QuickTime team stayed intact and pressed its hardware abstraction advantage. They did not ignore Microsoft and Avid developments, even adding OpenDML and OMFI support (read-only of the M-JPEG media) to QuickTime. Just as the MacOS can read PC disks and open Windows files, QuickTime editors can read files built on PCs and sent over Windows networks, including the original Video for Windows and AVI-2. "OpenDML is history," said Microsoft. "QuickTime will read OpenDML to protect legacy investments," says Apple.
The AAF promoters talk about exchanging media as well as metadata, but QuickTime already does exchange media files, and even converts otherwise incompatible M-JPEG files between systems for a couple of kinds of M-JPEG. For the most part, M-JPEG files are proprietary, and cannot be moved from system to system, even among those of a single manufacturer. Fast Multimedia has three incompatible M-JPEG types; Avid has four or five. But QuickTime does allow some Avid files to open on Media 100 or Scitex Digital Video, for example, and others to open on Avid's own Cinema.
But the EBU/SMPTE Task Force found that "while QuickTime format data storage may now contain information about sequencing, timing and specific effects,... it is not so obviously geared to formal compositions as OMF is." Looking at the RFT responses from Apple, Avid, Microsoft, and others, there was extensive discussion in the Sub-Group about similarities between various data models and the opportunities for creating a grand reconciliation of them. The analysis involved engineers from small and major companies, independents, and representatives from various AES, SMPTE, DAVIC and ISO committees. They found "there is a need for a simplified format with capabilities somewhere between conventional EDLs and the OMF object model," which has led to a proposal for "a new generic ASCII-based 'AVSOURCE Format' applicable to A) the proposed Audio EDL, B) as a proposed extension to SMPTE 258M, and, C) mappable directly to the OMF source model."
The real future for media interchange lies in international standard video files like DV and MPEG2. (And significant metadata may be exchanged inside the data structures of these formats.) Despite efforts by companies to block interoperability, professional DVCAM and DVCPRO can be made file compatible with consumer DV. Editors who load DV files to their hard drives with a DPS Spark or Radius MotoDV will find that they are fully functional in a dual-stream DVCPRO environment like the new Truevision Targa DV2000. Let's hope that SMPTE/EBU can lean on Sony and Panasonic to stop this marketing nonsense and adhere to whatever DV standard emerges for interchange. MPEG2 streams from different companies have similar incompatibilities, but they are a lot closer to one another than the wild and woolly types of Motion-JPEG. (See sidebar: Why M-JPEG is history)
Exchange of edit decision lists (EDLs) is accomplished today by exporting and importing twenty-five-year-old standards like CMX lists. These EDLs will only identify transitions if they are the basic wipe numbers of a hundred or so very simple wipe patterns standardized years ago by SMPTE. Much more is needed before a project file on Adobe Premiere can be opened on a high-end Avid Media Composer for finishing. This is part of the work of an EBU/SMPTE committee designing a registry for transitions and special effects, among many other things. The registry will include all the information needed to encode and decode particular kinds of metadata, including Effects, Composition Metadata, Essence formats, Geospatial Metadata, Descriptive Metadata, and Unique Material Identifiers. For example, a vast amount of data about the original footage will facilitate rebuilding the program from original media, locating even analog source tapes, if necessary.
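To make concrete how little a CMX-style list carries, here is a minimal Python sketch that parses one event line in the widely used CMX 3600 text layout (event number, reel, track, transition, optional duration, then four timecodes). The sample lines, the reel names, and the names Event and parse_event are illustrative assumptions, not part of any vendor's software, and real lists include title, comment, and motion-effect lines that this sketch simply skips.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Timecode as HH:MM:SS:FF (";" before frames marks drop-frame).
TC = r"\d{2}:\d{2}:\d{2}[:;]\d{2}"

# Assumed CMX 3600-style event layout; whitespace-separated fields.
EVENT_RE = re.compile(
    r"^(?P<event>\d+)\s+(?P<reel>\S+)\s+(?P<track>\S+)\s+"
    r"(?P<trans>C|D|K[BO]?|W\d+)(?:\s+(?P<dur>\d+))?\s+"
    rf"(?P<src_in>{TC})\s+(?P<src_out>{TC})\s+"
    rf"(?P<rec_in>{TC})\s+(?P<rec_out>{TC})\s*$"
)

@dataclass
class Event:  # hypothetical container for one EDL event
    number: int
    reel: str
    track: str
    transition: str            # "C" cut, "D" dissolve, "W" wipe, "K.." key
    wipe_code: Optional[int]   # numeric wipe pattern, only for wipes
    duration_frames: Optional[int]
    src_in: str
    src_out: str
    rec_in: str
    rec_out: str

def parse_event(line: str) -> Optional[Event]:
    """Return an Event for an event line, or None for any other line."""
    m = EVENT_RE.match(line.strip())
    if m is None:
        return None
    trans = m["trans"]
    wipe = int(trans[1:]) if trans.startswith("W") else None
    return Event(
        number=int(m["event"]),
        reel=m["reel"],
        track=m["track"],
        transition="W" if wipe is not None else trans,
        wipe_code=wipe,
        duration_frames=int(m["dur"]) if m["dur"] else None,
        src_in=m["src_in"],
        src_out=m["src_out"],
        rec_in=m["rec_in"],
        rec_out=m["rec_out"],
    )

if __name__ == "__main__":
    sample = [  # made-up lines for illustration
        "TITLE: DEMO",
        "001  TAPE01   V     C        01:00:00:00 01:00:05:00 00:00:00:00 00:00:05:00",
        "002  TAPE02   V     W001 030 01:00:10:00 01:00:15:00 00:00:05:00 00:00:10:00",
    ]
    for line in sample:
        print(parse_event(line))
```

Note that the only thing the list says about the wipe is a pattern number and a duration in frames. Softness, borders, direction, and anything programmable have nowhere to go, which is the gap a registry of effects metadata would have to fill.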
SMPTE registration of transitions is a noble effort that will probably succeed only in extending current standards for canned (non-programmable) wipes between competing systems. With 20,000 SFX now on the market and the number of effects developers growing, effects will come into existence faster than registration authorities can catalog them, let alone standardize their parameterization for metadata interchange. (See sidebar: Much more than an SFX Registry is needed)
If we take a wildly speculative look at future NABs, we can already see that Microsoft will have to replace AAF at NAB '99 and of course NAB 2000. Not that AAF will be completed, just that it will be superseded because it did not allow metadata to travel to the end user (AAF "flattens" all files when they are distributed for viewing). First will come the AASF proposal, which will combine elements of ASF, allowing authors to send metadata that controls exactly how their program will be altered to play on different ATSC formats or stream at lower data rates. Next will be AAISF, in which authors will be able to add metadata that makes their programming interactive, so it can play on DVD and over high-speed cable modems. Then will come ASIF, where all transitions, effects, and compositing will be performed client-side in smart PCTVs or TVPCs, postponing post, reducing rendering time, and optimizing everything for the actual delivery device, as if you were authoring for that exact platform. These are all technologies that have been described in SMPTE Journal and that the EBU/SMPTE committees are aware of. But do Microsoft and its developers have a clue?
Anyway, by that time Apple will very likely have QuickTime 4/MPEG-4 with Java running on Unix, Network PCs, set-top boxes, 10X DVD players, and over high-speed cable modems, and even implementing many of the above proposals - if they can stay in the black financially, and keep their QuickTime team, which now includes Adobe Premiere and Macromedia Final Cut author Randy Ubillos, together.
Bob Doyle is the digital video guru at New Media Magazine and Founder of the New Media Lab in Cambridge, MA.