ErMaC's Guide to All Things Video - Part 3

The Ins and Outs of Video Compression


This article is meant to be a guide on what video compression is, how it works, and what it means to people working in Digital Video. If you have not read the two previous guides on Video, please do so as this builds on information disseminated in the preceding articles.

1) What is video compression?

Way back in the day when digital video was first being born, video was stored in its pure, uncompressed format, much like digital audio (for those who don't know, CDs are pure uncompressed digital audio data, while newer standards such as Dolby Digital, MiniDisc, and MP3 are compressed). This meant that it took a LOT of space. For a comparison of how much more space video takes than audio, no matter how it's stored, take the broadcast spectrum. TV stations, like radio stations, each operate on a fixed frequency range; unlike FM, however, we are never told the exact frequency, because our TVs are just hardwired with the frequency values that correspond to channel 2, 3, 4 and so on. In North America the ENTIRE FM band (88 to 108 MHz) sits in the gap between TV channels 6 and 7, and a single TV channel is 6 MHz wide, about 30 times the 200 kHz allotted to a single FM station. Crazy, huh?

So one can imagine that back when digital video first came out, it was damn hard to store. Nowadays 100GB RAID arrays are common in not-so-expensive consumer setups. 10 years ago, 10GB was an ENORMOUS amount of space!! Hard drives didn't hold more than 500MB tops, but uncompressed digital video still took up the same space it does now. This led to people storing digital video on tape and working with it that way, but this made non-linear video editing impossible.
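To put a rough number on "a LOT of space", here is a quick back-of-the-envelope sketch in Python. The frame size, color depth, and frame rate are assumptions picked for illustration (720x480 pixels, 3 bytes per pixel, 29.97 fps), not figures from this guide.

```python
def uncompressed_bytes_per_second(width, height, bytes_per_pixel, fps):
    """Raw data rate of uncompressed video, in bytes per second."""
    return width * height * bytes_per_pixel * fps

# Assumed example format: 720x480 frames, 24-bit color (3 bytes/pixel), 29.97 fps.
rate = uncompressed_bytes_per_second(720, 480, 3, 29.97)

print(f"{rate / 1e6:.1f} MB per second")
print(f"{rate * 60 / 1e9:.2f} GB per minute")

# How long until a 500 MB hard drive is full?
seconds_on_500mb = 500e6 / rate
print(f"A 500 MB drive holds about {seconds_on_500mb:.0f} seconds of video")
```

Under those assumptions the video eats roughly 31 MB every second, so a 500MB drive would fill up in about 16 seconds.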

Then along came video compression. Probably the earliest truly successful video codec (CODEC stands for COmpressor/DECompressor) was MPEG1. I'm sure you've all heard of MPEG (the Moving Picture Experts Group) in terms of its compression. But did you know that the actual standard for MPEG1 video is almost a decade old? Not only that, but the MPEG2 standard used in DVDs is more than half a decade old!

But how does compression work? Well, as with any kind of compression, there are two kinds: lossless and lossy.

2) Lossless Compression

Lossless compression, as the name implies, means that after compressing the video and then decompressing it, you wind up with the exact same data as you put in. This is comparable to something like ZIP or RAR (in fact, the best lossless codec out there, Huffyuv, is built on Huffman coding, the same entropy-coding technique that ZIP uses as one of its stages, applied to each frame of video). Lossless has the advantage that no matter how many times you compress and decompress the video, you still haven't lost any video data. The bad part is that most often you don't save nearly as much space as you would with lossy compression algorithms. Modern and well-known lossless video codecs include Huffyuv (mentioned earlier in this paragraph), Lossless MJPEG (different from plain old MJPEG), and zAVI (not recommended, Huffyuv is vastly superior).
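The round trip described above can be sketched in a few lines of Python. This is only an illustration: it uses the standard `zlib` module (the DEFLATE algorithm from the ZIP family) as a stand-in, not Huffyuv's actual method, and the tiny grayscale "frame" is made up for the example.

```python
import zlib

# A hypothetical 64x48 grayscale frame: flat dark background plus a bright box.
WIDTH, HEIGHT = 64, 48
pixels = bytearray([30] * (WIDTH * HEIGHT))
for y in range(10, 20):
    for x in range(20, 40):
        pixels[y * WIDTH + x] = 200
frame = bytes(pixels)

# Compress, then decompress again.
compressed = zlib.compress(frame)
restored = zlib.decompress(compressed)

print(len(frame), "bytes raw ->", len(compressed), "bytes compressed")
print("identical after round trip:", restored == frame)
```

The point is the last line: what comes out is byte-for-byte what went in, however many times you repeat the cycle.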

3) Lossy Compression

This is the form of video compression most people are familiar with. The vast majority of video codecs out there are lossy, meaning that when you compress the video and then decompress it, you do not get back exactly what you put in. Now, this isn't as bad as it may sound. Obviously if you're compressing something like a text document, you don't want to lose any of the data, but with something like a picture, even if a few bits and pieces aren't quite right, you can still make out the general gist of the image. Same thing with audio. Famous lossy codecs include MPEG1, MPEG2, MPEG4 (AKA DivX, although it's a bad and incomplete implementation of the MPEG4 standard), DV (and its variants DVCAM, DVCPRO, DVCPRO50, Digital-S, etc), RealVideo, Sorenson, and the classic Cinepak.
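To see what "not getting back what you put in" looks like, here is a toy Python sketch that throws away fine detail by rounding pixel values to a coarse grid (quantization). This is not how any of the codecs named above actually work internally (real ones typically quantize transformed data rather than raw pixels); the `quantize` function and the sample values are assumptions for illustration only.

```python
def quantize(samples, step):
    """Lossy step: snap each 0-255 sample to the nearest multiple of `step`."""
    return [min(255, (s + step // 2) // step * step) for s in samples]

# Hypothetical row of pixel brightness values: a dark area, then a bright area.
original = [12, 13, 15, 14, 200, 203, 198, 201]

decoded = quantize(original, 16)
errors = [abs(a - b) for a, b in zip(original, decoded)]

print("original:", original)
print("decoded: ", decoded)
print("largest error:", max(errors))
print("distinct values:", len(set(original)), "->", len(set(decoded)))
```

The decoded row is no longer identical to the original, but the dark part is still dark and the bright part is still bright, and there are far fewer distinct values to store. That trade is the whole idea behind lossy compression.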

Now with that distinction out of the way, we can discuss how the actual video itself gets compressed, and there are also 2 basic ways of doing this. These two methods are called intra-frame and inter-frame compression.

4) Intra-Frame Compression

As the name suggests, Intra-frame compression relies on only the single, specific frame it is working with to compress the data within it. This means that you are basically encoding each separate frame as its own picture. In the case of something like MJPEG (an algorithm which uses only intra-frame compression) you are encoding each and every frame using a JPEG compressor (which should be familiar to anyone who has ever worked with images; its files have the extension jpg). This means that while you can't take advantage of the information in previous and forthcoming frames, you have the ability to recreate each frame without the need for the others. This comes in very handy when editing video (which is why editing with a pure intra-frame codec is a must unless you use special hardware).
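Here is a minimal Python sketch of that idea: every frame is compressed by itself, so any one frame can be decoded without touching the others. Note the assumptions: `zlib` stands in for the per-frame compressor (MJPEG would use lossy JPEG instead), and `encode_intra`, `decode_frame`, and the tiny frames are hypothetical names and data made up for the example.

```python
import zlib

def encode_intra(frames):
    """Compress every frame on its own; no frame refers to any other."""
    return [zlib.compress(f) for f in frames]

def decode_frame(encoded, index):
    """Random access: only the requested frame's own data is needed."""
    return zlib.decompress(encoded[index])

# Five hypothetical 16-byte "frames" standing in for real images.
frames = [bytes([i] * 16) for i in range(5)]
encoded = encode_intra(frames)

# Jump straight to frame 3 without decoding frames 0, 1, or 2.
print(decode_frame(encoded, 3) == frames[3])
```

Because each entry in `encoded` is independent, cutting the video at any frame is just a matter of slicing the list, which is what makes this style of codec so friendly to editing.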

5) Inter-Frame Compression

As should be obvious by now, Inter-frame compression relies on information in preceding and occasionally forthcoming frames to compress an image. The most well-known way of using this data is to exploit the fact that the majority of a video image isn't always moving. Take a newscast, for instance: usually the only thing moving is the anchor's body, while the rest of the set is staying perfectly still. Why should we bother to store all of the data that makes up that background for every single frame? Here's an illustration:

We start with this frame. Now we take a look at what is actually changing in between frames:

Here are frames 2 and 3, with the only parts that have changed since the previous frame shown below the actual frame.

Here we have frames 4 and 5 with the same arrangement.

Notice that the only thing in the frame that changed was the chair moving (along with the background being redrawn in the area the chair had left since the previous frame). Almost all inter-frame compression is based on exploiting this fact about a video image. The disadvantage to this is that if you want to check out frame 5, you can't actually see what the real frame looks like without first looking back at frame 1, then applying the changes in frames 2, 3, 4, and 5. This leads to the notion of keyframes. A keyframe is a special frame which exists on its own - in other words it doesn't rely on any other frames to store its data. Thus seeking to a particular frame usually involves first going to the preceding keyframe and then applying each successive partial frame until we reach the desired one.
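The keyframe-plus-changes scheme just described can be sketched in Python as follows. This is a deliberately simplified model, not the algorithm of any real codec: frames are one-dimensional lists of pixel values, a partial frame is just a dictionary of the pixels that changed, and only preceding frames are referenced. The names `encode_inter` and `seek` are made up for the example.

```python
def encode_inter(frames, keyframe_interval):
    """Store a full keyframe every `keyframe_interval` frames, changes otherwise.

    A partial frame is a dict {pixel_index: new_value} holding only the
    pixels that differ from the previous frame.
    """
    encoded = []
    for i, frame in enumerate(frames):
        if i % keyframe_interval == 0:
            encoded.append(("key", list(frame)))
        else:
            prev = frames[i - 1]
            delta = {p: v for p, (u, v) in enumerate(zip(prev, frame)) if u != v}
            encoded.append(("delta", delta))
    return encoded

def seek(encoded, index):
    """Rebuild one frame: back up to the preceding keyframe, then apply changes."""
    start = index
    while encoded[start][0] != "key":
        start -= 1
    frame = list(encoded[start][1])
    for _kind, delta in encoded[start + 1 : index + 1]:
        for p, v in delta.items():
            frame[p] = v
    return frame

# Hypothetical "video": a bright 2-pixel object sliding across a dark background.
frames = []
for t in range(6):
    f = [0] * 12
    f[t] = f[t + 1] = 9
    frames.append(f)

encoded = encode_inter(frames, keyframe_interval=5)
print(encoded[1])                     # only the pixels that changed are stored
print(seek(encoded, 4) == frames[4])  # needs the keyframe at 0 plus frames 1-4
```

Each partial frame stores two pixels instead of twelve, but getting at frame 4 means walking forward from the keyframe at frame 0, exactly the seeking cost described above.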

This presents a problem for video editing - if we want to cut the video on a specific frame, we are unable to because most programs (Premiere, Media Studio Pro, VirtualDub, most any video software) will only cut at a keyframe. This makes your editing options very limited unless you create lots and lots of keyframes, in which case you then lose the benefits of inter-frame compression in the first place!
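A little arithmetic in Python shows why piling on keyframes defeats the purpose. The frame sizes here are made-up assumptions (40 KB per keyframe, 4 KB per partial frame), chosen only to show the shape of the trade-off.

```python
def stream_size(num_frames, keyframe_interval, key_size, delta_size):
    """Total size when every `keyframe_interval`-th frame is a full keyframe."""
    keyframes = -(-num_frames // keyframe_interval)  # ceiling division
    return keyframes * key_size + (num_frames - keyframes) * delta_size

# Assumed sizes in KB: 40 per keyframe, 4 per partial frame, 300 frames total.
for interval in (1, 15, 300):
    size = stream_size(300, interval, key_size=40, delta_size=4)
    print(f"keyframe every {interval:3d} frames: {size} KB")
```

With a keyframe on every frame you can cut anywhere, but under these assumptions the file is nearly ten times the size of the version with a single keyframe.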

Bottom line: Use Intra-frame compression-only algorithms for editing, and inter-frame compression algorithms for final distribution and archival.

One final thing - depending on the codec, filesize can be determined in one of two ways (see a trend forming here?). Either the filesize is determined by bits per second or by bits per frame. All inter-frame compression algorithms are based on a bits/second factor (although this can vary throughout the file if you use what's called variable bitrate), and many intra-frame compression algorithms (with DV and MJPEG being notable exceptions) are based on bits/frame.

What does this mean? Well, it means that when you encode something in MPEG and specify that it will be 1.5 megabits per second, it doesn't matter whether the video is in 29.97 fps, 25 fps, 23.976 fps, 15 fps, or whatever - it's still going to take up the same amount of space. The 15 fps one will look sharper, because the same number of bits is spread over fewer frames, but obviously it will be less smooth than the other ones. If you try the same thing with a codec like Huffyuv or Intel Indeo, you will see that this does not hold true. That's because these codecs encode each frame and store it separately, regardless of how big or small the frames before or after it are. Thus, you will get the same image quality through all of them, but the smoothness and filesizes will change.
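The difference between the two sizing models is easy to check with a bit of Python. The 1.5 megabit figure and the frame rates come from the paragraph above; the 60-second clip length and the 100,000 bytes per frame for the per-frame codec are assumptions for illustration, as are the function names.

```python
def size_at_fixed_bitrate(bits_per_second, seconds):
    """Bits/second codecs: size in bytes depends only on bitrate and duration."""
    return bits_per_second * seconds / 8

def bits_available_per_frame(bits_per_second, fps):
    """At a fixed bitrate, fewer frames per second means more bits for each."""
    return bits_per_second / fps

def size_at_fixed_frame_cost(bytes_per_frame, fps, seconds):
    """Bits/frame codecs: size in bytes grows with the number of frames."""
    return bytes_per_frame * fps * seconds

BITRATE = 1_500_000    # 1.5 megabits per second, as in the text
CLIP_SECONDS = 60      # assumed clip length
FRAME_BYTES = 100_000  # assumed size of one independently stored frame

for fps in (29.97, 25, 23.976, 15):
    fixed_rate = size_at_fixed_bitrate(BITRATE, CLIP_SECONDS)
    per_frame_bits = bits_available_per_frame(BITRATE, fps)
    fixed_frame = size_at_fixed_frame_cost(FRAME_BYTES, fps, CLIP_SECONDS)
    print(f"{fps:>6} fps: bitrate-based {fixed_rate / 1e6:.2f} MB "
          f"({per_frame_bits:.0f} bits per frame), "
          f"frame-based {fixed_frame / 1e6:.1f} MB")
```

The bitrate-based column stays put at every frame rate while the bits available per frame go up as the frame rate drops; the frame-based column shrinks and grows right along with the frame count.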