Digital Archiving

Preserving Tomorrow's Media for Today

By Bob Lamm

Dave McCarn, WGBH's Chief Technologist, has a mission: He wants to come up with a permanent, universal digital file format for archived media, one that carries not only the original sound and image but also transcriptions, production notes, authorship/copyright/royalty info, hypermedia links to other media files, and so on. Once in this form, the material would be free of the underlying tape technology it originated on, would stay permanently linked to the information that usually gets lost in paper files, and could be stored on the same general-purpose digital media we keep our e-mail and other computer-based files on.

Dave explained a little bit of this vision at our February 18 meeting at WGBH-TV. He pointed out that this is an issue of considerable importance to a station like WGBH: They have something like 150,000 reels/cassettes of one sort or another in their archives. Although WGBH takes great care to store these under optimal conditions, magnetic tape is inherently unstable, and the inevitable deterioration is starting to set in. In addition, WGBH needs to keep a selection of old machines, also carefully preserved, in operational condition just to play back this media. These machines are starting to show their age and can't be expected to last indefinitely.

Nevertheless, all this old stuff is becoming very valuable: It's often the first source that documentarians turn to when making shows on any historical topic of the last 50 years. And there's interest in the old series again too: The Food Network recently bought all the old Julia Child shows, including the original black-and-white episodes that had been recorded on low-band 2" quad tape.

It would be easy to simply transfer all this stuff from the old 2" and 1" masters to newer formats like Digital Betacam. But who's to say how long D-Beta will last? In fact, Dave pointed out that there are currently 13 digital formats available, none of them particularly guaranteed to outlast any of the people in the room. (It was a young crowd.)

So what Dave wants to see is a tapeless digital format, sort of like QuickTime, which exists independently of any particular storage technology. It would not involve transcoding the original media (except for digitizing the analog stuff): It would take the original samples and wrap them in a header carrying all the associated data, as well as the algorithm used to encode the media, so that a future processor is sure to be able to decode it.
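To make the wrapper idea concrete, here is a minimal sketch in Python. It is not Dave's format or any real specification: the `UPF0` signature, the JSON header, the field names, and the `wrap`/`unwrap` helpers are all assumptions invented for illustration. It shows only the principle described above, namely that the original samples pass through untouched while the descriptive data and a description of the encoding travel with them.

```python
import json
import struct

# Hypothetical 4-byte signature; not taken from any real specification.
MAGIC = b"UPF0"


def wrap(samples: bytes, metadata: dict, decoder_notes: str) -> bytes:
    """Wrap untouched media samples in a self-describing header."""
    header = json.dumps(
        {
            "metadata": metadata,            # notes, rights, transcripts, links
            "decoder": decoder_notes,        # how the samples were encoded
            "payload_bytes": len(samples),   # lets a reader detect truncation
        },
        sort_keys=True,
    ).encode("utf-8")
    # Layout: signature | 8-byte big-endian header length | header | samples
    return MAGIC + struct.pack(">Q", len(header)) + header + samples


def unwrap(blob: bytes):
    """Return (header dict, original samples) from a wrapped file."""
    if blob[:4] != MAGIC:
        raise ValueError("not a wrapped media file")
    (header_len,) = struct.unpack(">Q", blob[4:12])
    header = json.loads(blob[12:12 + header_len].decode("utf-8"))
    samples = blob[12 + header_len:]
    if len(samples) != header["payload_bytes"]:
        raise ValueError("payload is truncated or padded")
    return header, samples
```

Because the samples are never re-encoded, `unwrap(wrap(samples, ...))` hands back exactly the bytes that went in, whatever storage medium the file has lived on in the meantime.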

Dave cited some of the work that's already been done in this field. In particular, he mentioned the work on Apple's 'Bento', a platform-neutral compound content container which is scalable enough to handle large amounts of media and can address both simple and complex data models.

He also talked about Avid's 'OMF' format, which is based on Bento but also gives information on how video segments fit together with each other and other elements.

But Dave wants to go one step further and make a 'media compiler' file type which is more flexible, can carry more data, and is more media independent than OMF. He wants it to be a format for the centuries: It should even carry the algorithm used to encode everything, in case this information gets lost over the ages. (There was also a joke that it should carry the algorithm for reconstituting a viewer in case viewers also become obsolete.)

The technology for doing all of this is closer than most people realize: Norasam Technology makes a laserdisk recorder that can store terabytes of information per square inch. (It can even record analog signals.)

A SMPTE study group is forming to come up with a recommended practice. Dave is hosting information on the group on the WGBH web site and has instituted a listserv; to join, send e-mail with 'subscribe upf yourname' in the body of the message.


David's presentation was followed by a short presentation by Virage Corp., which makes a system that automatically logs media: It's capable of telling when shots end and uses associated info (like closed-captioning) to database information about what the material is about. It is also capable of doing some image recognition to determine what kind of picture content it's looking at.

The demonstrator showed it off a little bit and explained that it had obvious applications in autologging news feeds and other broadcast functions. He also said that companies like CNN and ABC had contracted for systems.

Bob Lamm is Manager at CYNC Corp., a video dealership that deals in considerably more prosaic issues (like building 3D animation and nonlinear video editing systems). He can be reached at 617-277-4317.

Posted: 10 March, 1998
Robert Lamm, SMPTE/New England Newsletter/Web Page Editor