Advanced Video Compression - Part 1
Various Codecs, Containers and their Pros and Cons

Now that you know the basic concepts and terminology behind video compression, we can talk about various containers and codecs, and which one is right for you. First off, we need to point out what the difference between a Codec and a Container is. Codec stands for Compressor/Decompressor, meaning it is something which can both encode and decode something, in this case video. Codecs include things like MJPEG, Sorenson, Cinepak, and DivX. Some codecs may have specific containers associated with them, such as MPEG. Some other containers you are probably familiar with include AVI, QuickTime, or Matroska. A container simply holds everything together. For instance, the AVI container holds both the video stream (which is compressed with a Codec) and an audio stream. Without the container, the video and audio would be in separate files! More advanced containers like Matroska can allow for additional things like subtitle streams, multiple audio and video streams, chapters, menus, and more.

A container does not necessarily tell you anything about the video quality; it only dictates the underlying structure of a file. I always scratch my head when people say things like "I like AVIs better than MPEGs because they're smaller," which is a statement that makes absolutely no sense. I've had an AVI file that was 26GB, orders of magnitude larger than most MPEG files. The fact is that containers like AVI and QuickTime have little to nothing to do with the underlying Codec, except that the codec has to work within the limitations of the container. For instance, AVI does not support many of the features required for H.264 (MPEG-4 AVC) compressed video.
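To make the container/codec split concrete, here is a minimal sketch of a container as a plain data structure. The class names, the `size_mb` helper, and the bitrates are all made up for illustration; they are not taken from any real file format or library.

```python
from dataclasses import dataclass, field

@dataclass
class Stream:
    kind: str          # "video", "audio", "subtitle", ...
    codec: str         # what compressed this stream
    bitrate_kbps: int  # illustrative number, not a measurement

@dataclass
class Container:
    format_name: str                      # e.g. "AVI" or "Matroska"
    streams: list[Stream] = field(default_factory=list)

    def size_mb(self, seconds: float) -> float:
        """Approximate size: sum of the stream bitrates times duration."""
        total_kbits = sum(s.bitrate_kbps for s in self.streams) * seconds
        return total_kbits / 8 / 1000

# Two files in the SAME container, with very different codecs inside.
# The bitrates are hypothetical round numbers.
lossless_avi = Container("AVI", [
    Stream("video", "Huffyuv", 80_000),
    Stream("audio", "PCM", 1_536),
])
distribution_avi = Container("AVI", [
    Stream("video", "DivX", 1_000),
    Stream("audio", "MP3", 128),
])

minute = 60
print(lossless_avi.size_mb(minute), distribution_avi.size_mb(minute))
```

Both objects are "AVIs," yet one is far larger than the other, because the size comes from the streams and their codecs and not from the wrapper.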

Now on to the various codecs that we'll be talking about. We'll be splitting the various codecs into two sections - one for codecs which are good when used for editing, and one for codecs that should be used for distribution purposes.

I. Common Editable Codecs

1) MJPEG - Motion Joint Photographic Experts Group Compression

Origins: All of you know what JPEG is, I'm sure. If you've surfed the web, then you've seen a JPEG image. JPEG is a compression scheme developed by the Joint Photographic Experts Group for the compression of images. Now imagine instead of compressing single images as JPEGs, compressing 24 to 30 images a second and storing them in sequence. That's essentially what MJPEG is - it's the JPEG compression algorithm applied to frames of video.

How it Works: MJPEG is a lossy codec. JPEG uses what's called a psycho-optical algorithm, or a visual-perception algorithm. Basically, what this means is that JPEG exploits flaws and shortcomings in the human perceptual system. For example, the human eye has a much harder time noticing slight differences in color than differences in brightness, so JPEG can throw away more color detail than brightness detail. JPEG plays various other tricks as well, but what you wind up with is an image which looks about the same to the human eye.
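One common way to exploit the eye's weaker color perception is chroma subsampling: the brightness plane is stored at full resolution while the two color planes are stored at reduced resolution. The sketch below only counts bytes for one 720x480 frame at 8 bits per sample; the function name and the subsampling factors chosen are assumptions for illustration, not a description of what any particular MJPEG encoder does.

```python
def frame_bytes(width, height, chroma_h=1, chroma_v=1):
    """Bytes for one 8-bit frame: one luma plane plus two chroma planes.

    chroma_h / chroma_v are the horizontal and vertical factors by which
    the two color planes are shrunk (1 means full resolution).
    """
    luma = width * height
    chroma = 2 * (width // chroma_h) * (height // chroma_v)
    return luma + chroma

full_color = frame_bytes(720, 480)        # color stored for every pixel
subsampled = frame_bytes(720, 480, 2, 2)  # color halved in both directions

print(full_color, subsampled)
```

Halving the color resolution in both directions cuts the raw data in half before any other compression happens, while the brightness detail the eye is most sensitive to is untouched.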

Benefits: MJPEG is blazing fast. You probably won't find a speedier codec! MJPEG can achieve decent compression rates and good image quality for most natural images like live video.

Disadvantages: Here's where MJPEG starts looking much less attractive. First off, JPEG does not compress anime well. Due to the way that JPEG works, abrupt changes in color and brightness (for instance, thin black lines in between different blocks of color) are not handled well by JPEG, which means MJPEG suffers from the same problem. JPEG has a problem with sharp edges which is unavoidable, partly because the standard was meant to compress natural images like scanned photographs.
There's also the problem of recompression. While the human eye can be fooled by JPEG's compression, computers aren't. Thus, when you compress from MJPEG into another format to use for distribution, you will receive a lower-quality picture than if you had a lossless source because there is data missing which, while it's not important to the visual cortex, is very important to a mathematical analysis of things such as color data.
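The sharp-edge problem can be seen in a toy experiment. The sketch below runs a one-dimensional Discrete Cosine Transform on eight samples, throws away all but the two lowest-frequency coefficients, and measures how far the result drifts from the original. Dropping coefficients outright is a crude stand-in for JPEG's real quantization step, and the sample values and the `keep=2` setting are arbitrary choices for illustration.

```python
import math

N = 8  # JPEG works on 8x8 blocks; one 8-sample row is enough here

def _scale(k):
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

def dct(samples):
    """Orthonormal DCT-II of N samples."""
    return [
        _scale(k) * sum(
            s * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
            for n, s in enumerate(samples)
        )
        for k in range(N)
    ]

def idct(coeffs):
    """Inverse of dct()."""
    return [
        sum(
            _scale(k) * c * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
            for k, c in enumerate(coeffs)
        )
        for n in range(N)
    ]

def worst_error(samples, keep=2):
    """Largest per-sample error after discarding high-frequency terms."""
    coeffs = dct(samples)
    coarse = coeffs[:keep] + [0.0] * (N - keep)
    restored = idct(coarse)
    return max(abs(a - b) for a, b in zip(samples, restored))

ramp = [36 * n for n in range(N)]         # smooth gradient, like a photo
edge = [0, 0, 0, 0, 255, 255, 255, 255]   # hard edge, like anime line art

ramp_error = worst_error(ramp)
edge_error = worst_error(edge)
print(ramp_error, edge_error)
```

The smooth ramp survives the loss of its high-frequency coefficients far better than the hard edge does, which is the same reason thin black outlines pick up visible artifacts under JPEG-style compression.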

Recommendations: MJPEG is an old standard. There are much more refined and better Discrete Cosine Transform-based compression algorithms out there (like DV) that I would recommend over MJPEG. About the only time you should use MJPEG is for fast-editing temporary files, but DON'T use them as your final source.

2) DV - Digital Video

Origins: First launched in 1995, DV became a standard for use in both consumer and semi-professional digital video compression. DV (or one of its many variants) is used by many video cameras on the market today. It is also well supported by many video editing applications and video hardware devices.

How it Works: DV is a lossy codec. I've heard lots and lots of misinformation floating around about how "DV is lossless," which it definitely is not. DV takes a different approach to the Discrete Cosine Transform than MJPEG does, partly because DV is a newer standard. I won't bore you with all the gory details, but let's just say that in almost every respect, DV is technically superior. The main disadvantage is that DV is fixed to 720x480 in 29.97 FPS (or 720x576 in 25 FPS) at 25MBit/second. That means very little flexibility if you want to do low-quality previews of your video or you're pressed for disk space (25MBit/second is about 3.1MB/sec for the video alone, and closer to 3.6MB/sec once audio and overhead are included, or roughly 5 minutes per GigaByte!). Some variants of DV exist that may support different resolutions, or allow for progressive (non-interlaced) video. But for the most part, DV severely limits what you can put into it. Like MJPEG, DV is a good editing codec because it doesn't use any inter-frame compression, thus every frame is a "keyframe."
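The disk-space figures are easy to check with a little arithmetic. In the sketch below, the 25MBit/second video rate comes from the text above; the 28.8MBit/second figure for the complete stream (video plus audio and overhead) is an approximate, commonly quoted number that I am assuming here, not something defined in this guide.

```python
def megabytes_per_second(mbit_per_second):
    """Decimal megabytes (1,000,000 bytes) per second."""
    return mbit_per_second / 8

def mebibytes_per_second(mbit_per_second):
    """Binary megabytes (1,048,576 bytes) per second."""
    return mbit_per_second * 1_000_000 / 8 / 1_048_576

def minutes_per_gigabyte(mbit_per_second):
    """Minutes of footage that fit in 1,000 decimal megabytes."""
    seconds = 1000 / megabytes_per_second(mbit_per_second)
    return seconds / 60

VIDEO_ONLY_MBIT = 25.0   # from the text
FULL_STREAM_MBIT = 28.8  # assumption: video + audio + overhead

for label, rate in (("video only", VIDEO_ONLY_MBIT),
                    ("full stream", FULL_STREAM_MBIT)):
    print(label,
          round(megabytes_per_second(rate), 2), "MB/sec,",
          round(minutes_per_gigabyte(rate), 1), "minutes per GB")
```

Under these assumptions the full stream works out to a little over 3.4 binary megabytes per second, and either way you land in the neighborhood of 5 minutes of footage per gigabyte.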

Benefits: DV looks very good, better than MJPEG in every respect. DV is good enough that it's used by both consumers and many professionals as a compression standard. If you are only going to be doing a recompression once or twice, the quality shouldn't be much of an issue. With repeated recompression though, you can start really losing quality.
DV is also cross-platform, meaning it works the same on PCs, Macs, AVID video editing workstations, you name it. You can also create a DV project, and then pipe it out through FireWire to DV tape and get a complete, lossless digital copy of your master as a backup.

Disadvantages: As stated above, it's rather restrictive in terms of framerate, resolution, and datarate. Also, it's not lossless, but the quality is still VERY good.

Recommendations: If you don't plan to do any pre-processing (discussed later in this guide) on your video source, and you don't have the disk space/hard drive throughput to edit in a lossless codec like Huffyuv or Lagarith, this is the way to go.

3) Huffyuv - Lossless Video Compression, for a price.

Origins: A guy named Ben Rudiak-Gould wrote this wonderful little codec which is a lossless compression codec for both YUV and RGB video data.

How it Works: Huffyuv is a lossless codec. The name stands for Huffman-compressed YUV. David Huffman was the guy who came up with "Huffman coding," an entropy coding method that gives short codes to values that show up often and long codes to values that are rare. It's one of the classic building blocks of lossless compression (Huffman coding is used in things like ZIP, RAR, etc). Huffyuv predicts each pixel from its neighbors and then Huffman-codes the difference, so as a rough analogy you could say that Huffyuv simply ZIPs every frame for its compression.
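Here is a minimal sketch of the Huffman idea: count how often each byte value occurs, then repeatedly merge the two rarest groups so that rare values end up with long codes and common values with short ones. It only computes code lengths, not an actual bitstream, and the function name and sample data are invented for illustration; this is not Huffyuv's own code.

```python
import heapq
from collections import Counter

def huffman_code_lengths(data: bytes) -> dict[int, int]:
    """Return {byte value: code length in bits} for a Huffman code."""
    freq = Counter(data)
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): 1}
    # Heap entries: (weight, unique tiebreak, {symbol: depth so far}).
    heap = [(w, i, {sym: 0}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        # Merging two groups pushes every symbol in them one level deeper.
        merged = {s: d + 1 for s, d in {**a, **b}.items()}
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]

# Made-up "pixel differences": mostly zeros, a few other values.
sample = bytes([0] * 90 + [1] * 6 + [2] * 3 + [255] * 1)
lengths = huffman_code_lengths(sample)
counts = Counter(sample)
total_bits = sum(counts[s] * lengths[s] for s in counts)

print(lengths)
print(total_bits, "bits instead of", 8 * len(sample))
```

The more lopsided the distribution of values, the bigger the saving, which is why a prediction step that turns most pixels into small differences pays off before the Huffman stage.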

Benefits: Obviously, the biggest benefit is that you have a perfect recreation of the original video data (unless you do colorspace conversions). That means no matter how many times you recompress the video in Huffyuv, you'll still have the same video data that you had in the beginning. This is a very nice thing. Huffyuv also compresses and decompresses video pretty quickly.

Disadvantages: Due to the very large space requirements (Huffyuv takes a LOT of disk space, sometimes 4x more than even DV), disk throughput becomes a very large factor. Playback at full resolution and full framerate is a daunting task for even the fastest computers. Also, this codec does not support the YV12 colorspace (although modifications of the codec which DO support this colorspace exist).
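To get a feel for the throughput involved, the sketch below works out the raw data rate of 720x480 video at 29.97 FPS stored as YUY2 (2 bytes per pixel) and compares it with DV's 25MBit/second. The 2:1 compression ratio used for Huffyuv is purely a placeholder assumption; real ratios depend heavily on the footage.

```python
def raw_megabytes_per_second(width, height, bytes_per_pixel, fps):
    """Uncompressed data rate in decimal megabytes per second."""
    return width * height * bytes_per_pixel * fps / 1_000_000

raw_rate = raw_megabytes_per_second(720, 480, 2, 29.97)  # YUY2 source

ASSUMED_RATIO = 2.0            # hypothetical lossless compression ratio
huffyuv_rate = raw_rate / ASSUMED_RATIO

dv_rate = 25 / 8               # DV video at 25MBit/second, in MB/sec

print(round(raw_rate, 1), round(huffyuv_rate, 1), round(dv_rate, 1))
print("times larger than DV:", round(huffyuv_rate / dv_rate, 1))
```

Even with a generous compression ratio, the disk has to sustain several times the data rate of DV, which is why hard drive speed matters so much when editing lossless footage.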

Recommendations: If you have the disk space to burn, this is a very viable codec for editing. Huffyuv is also a great choice for exporting your final video out of your editing program, before you compress it with something else.

4) Lagarith - Better, Slower, Lossless Video Compression.

Origins: In 2004, SirLagsALot released the first version of this great codec which was designed to offer greater compression than Huffyuv.

How it Works: From the official Lagarith web site:

Lagarith is able to outperform Huffyuv due to the fact that it uses a much better compression method. Pixel values are first predicted using median prediction (the same method used when "Predict Median" is selected in Huffyuv). This results in a much more compressible data stream. In Huffyuv, this byte stream would then be compressed using Huffman compression. In Lagarith, the byte stream may be subjected to a modified Run Length Encoding if it will result in better compression. The resulting byte stream from that is then compressed using Arithmetic compression, which, unlike Huffman compression, can use fractional bits to encode a symbol. This allows the compressed size to be very close to the entropy of the data, and is why Lagarith can compress simple frames much better than Huffyuv, and avoid expanding high static video.
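The two ideas in that description, median prediction and fractional bits, can be sketched in a few lines. The predictor below guesses each pixel as the median of its left neighbor, its top neighbor, and left + top - top-left, then stores only the difference. Treating missing neighbors at the borders as zero and using a synthetic gradient as the test image are simplifications of mine, not details of how Lagarith or Huffyuv actually handle frames.

```python
import math
from collections import Counter

def _predict(done_rows, current_row, x, y):
    """Median of left, top, and (left + top - top-left)."""
    left = current_row[x - 1] if x > 0 else 0
    top = done_rows[y - 1][x] if y > 0 else 0
    top_left = done_rows[y - 1][x - 1] if x > 0 and y > 0 else 0
    return sorted((left, top, left + top - top_left))[1]

def to_residuals(image):
    """Replace every pixel with (pixel - prediction) modulo 256."""
    return [
        [(v - _predict(image, row, x, y)) % 256 for x, v in enumerate(row)]
        for y, row in enumerate(image)
    ]

def from_residuals(residuals):
    """Exact inverse of to_residuals(): nothing is lost."""
    image = []
    for y, res_row in enumerate(residuals):
        row = []
        for x, r in enumerate(res_row):
            row.append((_predict(image, row, x, y) + r) % 256)
        image.append(row)
    return image

def entropy_bits(values):
    """Shannon entropy in bits per symbol."""
    counts = Counter(values)
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

# Synthetic 16x16 gradient standing in for a smooth area of a frame.
image = [[2 * x + 3 * y for x in range(16)] for y in range(16)]
residuals = to_residuals(image)

raw_entropy = entropy_bits([v for row in image for v in row])
residual_entropy = entropy_bits([v for row in residuals for v in row])
print(raw_entropy, residual_entropy)
```

On this simple image the residuals carry well under one bit of information per pixel. A Huffman code cannot spend less than one whole bit per symbol, while an arithmetic coder can get close to the fractional figure, which is the advantage the Lagarith description is pointing at.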

Benefits: Like Huffyuv, Lagarith is a lossless codec, so you get back exactly what goes into it. In addition to offering better compression than Huffyuv (so it doesn't use as much disk space), it also has support for additional colorspaces such as YV12, and it offers interesting features such as a "reduced resolution" mode, which is useful for "bait-and-switch" editing (discussed later in this guide). Lagarith is also multithreaded, so that means if you have a CPU that can take advantage of this, you will get even faster performance. Finally, since it can make smaller files than Huffyuv, disk throughput is not as much of a factor, and your hard disk is less likely to become a bottleneck when trying to play back these files.

Disadvantages: Both compression and decompression can be slower than Huffyuv. If you have a slower CPU (under 2GHz or so), you may want to consider other options. This codec IS multithreaded though, whereas Huffyuv is not, so on newer dual-core processors, this codec may compress and decompress even faster than Huffyuv.

Recommendations: As long as you have a fairly fast CPU, I would recommend that this be your codec of choice.