At the March 20 meeting...

New Developments in compression technologies

Sony MPEG-2 chipset is impressive

By Wilson Chao

The March meeting, on "PEGS -- MPEG, JPEG, and the other PEGS," was organized by Robin Shahid of Boston Post and graciously hosted by National Boston.

Katie Cornog of Avid Technology presented an excellent introduction to video compression. She discussed the technical workings behind M-JPEG (motion JPEG), MPEG, wavelets, and DV, with particular attention to the needs of disk-based nonlinear editing. She concluded that M-JPEG is the compression of choice for today's NLEs, as it is an intraframe algorithm (i.e. single-frame editable), has good quality, and has encoders and decoders implemented in silicon at reasonable cost.

Aaron Feigen from FutureTel was unable to attend as scheduled, but the local rep had a system on display.

The featured presentation, by Dr. Hugo Gaggioni, Director of Sony Advanced Systems, was titled "MPEG-2 Compression in TV Production". I have heard him speak on MPEG-2 as far back as the IEEE conference on Digital Video in the fall of 1994, and as recently as the national SMPTE conference in Seattle this January. He has been Sony's most visible advocate of a 4:2:2 profile for the MPEG-2 standard, which for a period some years ago acquired the name "SPEG" (Sony PEG?).

Dr. Gaggioni briefly discussed video compression algorithms, including Motion-JPEG, MPEG-1, MPEG-2, and DV. He asserted that each of these schemes has limitations that disqualify it for broad use in professional television production (as opposed to consumer distribution). These limitations include a high bit rate (M-JPEG), limited chroma bandwidth (MPEG-1, MPEG-2, DV), and the inability to edit or switch live video streams on an arbitrary frame (MPEG-1 & 2).

He reported that as of 1/96 the 4:2:2 Profile @ Main Level has been approved by MPEG. He also defined Sony's "4:2:2 Studio Profile", with an 18 Mbps bit rate, a GOP of 2, and an I-B syntax. He claimed that this compression scheme met the following television production requirements:

He also reported that Sony is now producing, in a Japanese fab, an MPEG-2 chipset as follows:

This fab is at the level of Intel's current P6 production line, and represents a huge technical achievement and a huge investment on Sony's part. To gauge the market significance of this next generation of compression chips, note that IBM announced their own rival MPEG-2 chipset the next week (see http://www.ibm.com/newsfeed/uspress.html for the announcement, and http://www.chips.ibm.com/products/mpeg for technical specs). On that day (3/25/96) the stock of C-Cube (makers of MPEG-1 and MPEG-2 chipsets, as well as the M-JPEG chipsets in Avid Media Composers) dropped 22%.

Dr. Gaggioni announced that Sony will debut at NAB a new videotape format to be called Betacam SX, based on this MPEG-2 chipset. There has been talk for years about MPEG-compressed digital VTRs, most notably the DVCR-HD, the planned HDTV successor to the recently introduced DV consumer format. However, this is the first MPEG-based VTR in production. He then showed several videotape samples dubbed to Digital Betacam. Frankly, the pictures were stunning.

The first, an 18 Mbps sample of snowfall and flying birds, was shown after 1 encode/decode cycle, and was representative of an original field acquisition tape. It was extremely clean, with Y and C resolutions subjectively equal to Digital Betacam. Video noise was below the noise floor typical of broadcast CRT displays. There were no DCT compression artifacts, and fast movement showed no motion artifacts.

This "4:2:2 Studio Profile" video at 18 Mbps was better than DV compressed video at 25 Mbps. (Close examination of DV compression reveals good Y bandwidth but limited C bandwidth, as well as subtle DCT blockiness).

This video was also better than M-JPEG compression at equivalent bit rates (75 KB/frame). M-JPEG would have to run around 50 Mbps (200 KB/frame) to match Sony's 18 Mbps quality.
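For readers who want to check the conversion between bit rate and frame size, here is a minimal sketch. It assumes an NTSC frame rate of 30000/1001 (about 29.97) frames per second and 1 KB = 1000 bytes; neither figure was stated in the presentation, and the function names are my own for illustration.

```python
# Assumption: NTSC frame rate (about 29.97 frames/s), not stated in the talk.
NTSC_FPS = 30000 / 1001


def kilobytes_per_frame(mbps, fps=NTSC_FPS):
    """Convert a video bit rate in Mbps to average kilobytes per frame.

    Assumes 1 Mbps = 1,000,000 bits/s and 1 KB = 1000 bytes.
    """
    bytes_per_second = mbps * 1_000_000 / 8
    return bytes_per_second / fps / 1000


def mbps_from_kilobytes_per_frame(kb_per_frame, fps=NTSC_FPS):
    """Convert an average frame size in kilobytes to a bit rate in Mbps."""
    bits_per_second = kb_per_frame * 1000 * 8 * fps
    return bits_per_second / 1_000_000


if __name__ == "__main__":
    print(f"18 Mbps is about {kilobytes_per_frame(18):.1f} KB/frame")
    print(f"200 KB/frame is about {mbps_from_kilobytes_per_frame(200):.1f} Mbps")
```

Under these assumptions, 18 Mbps works out to roughly 75 KB per frame and 200 KB per frame to roughly 48 Mbps, consistent with the rounded figures quoted above.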

The second sample (18 Mbps, 10 generations encode/decode) simulates typical postproduction generations. The original video sample (prior to 10 generations) showed Gaussian noise typical of a good camera under moderate gain.

After 10 passes the video noise level had visibly increased -- I would eyeball it at about 6 dB degradation in S/N. This was most noticeable in dark picture areas. Also, this video after multiple encode/decode generations showed a distinctive noise "signature" with unique spatial and temporal correlations.
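To put the 6 dB estimate in perspective, the short sketch below converts a decibel change into linear ratios using the standard definitions (20 log10 for amplitude, 10 log10 for power). The function names are hypothetical, and the 6 dB figure is only my eyeball estimate, not a measurement.

```python
def amplitude_ratio(db):
    """Linear amplitude (voltage) ratio for a level change given in dB."""
    return 10 ** (db / 20)


def power_ratio(db):
    """Linear power ratio for a level change given in dB."""
    return 10 ** (db / 10)


if __name__ == "__main__":
    db = 6  # eyeballed S/N degradation after 10 generations
    print(f"{db} dB: noise amplitude x{amplitude_ratio(db):.2f}, "
          f"noise power x{power_ratio(db):.2f}")
```

In other words, a 6 dB loss in S/N corresponds to roughly a doubling of noise amplitude relative to the signal, or about four times the noise power.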

Spatially, the noise in the original video sample was uncorrelated, fine grained, and evenly distributed. After 10 generations the noise had aggregated into coarser grained, "threaded", moving patterns some dozens of pixels in length.

Temporally, the noise in the original video sample had been uncorrelated frame to frame. After multiple MPEG generations the "wormy" noise pattern persisted over several frames' duration.

Third, a theatrical film sample was shown at 4 Mbps with a GOP of 15, and would be typical of consumer distribution applications such as DVD or DBS. It was clearly better than VHS quality, and better than typical cable television reception.

Subjective Y & C resolution was good, with good edge transitions but without unnatural detail enhancement. There was some loss of very fine detail, especially in large areas of constant luminance. For instance, close examination of areas of blue sky showed slight blockiness, with block to block variations in depth of modulation of the film grain (which should be constant).

Motion artifacts in fast moving scenes were subtle -- I give Sony's MVE chip high marks. (I've seen much worse even at twice the bit rate.) However, the demo material was not rigorous in this regard, so I reserve final judgement. Please note that the quality of this particular demo is in part due to techniques beyond the MPEG compression system proper. This demo did not originate from NTSC (525 line) but from HDTV (1125 line) video, subsampled down to MPEG rates. Thus the encoder sees a wide bandwidth input that needs only minimal prefiltering & noise reduction.

Overall, I was extremely impressed with Sony's MPEG-2 compression chipset, and with the Betacam SX samples.

I leave the reader with some questions to ponder:

Wilson Chao owns Cambridge Television Productions, in Newton MA. He can be reached at (617) 332-0084.