YUY2 Sampling Method? (a technical question)

HeavyMetal · Post by **HeavyMetal** » Sun Jun 26, 2005 1:00 am

Anyone know the method YUY2 is sampled with?

I've searched Doom9 some and the Avisynth documentation. So far I know it has 2 Y samplings in a horizontal 1x2 pixel matrix.

The trouble is that the terminology used in the explanations makes me doubt what little information either source gives.

In a Doom9 article Luminance and Luma get used interchangeably. As do chrominance and chroma. YUV and YCbCr are used interchangeably as well. These are all different things despite their similarities.

Avisynth uses the term luma to describe the two Y samplings in the documentation. So I even wonder about its other information. Luma is post gamma correction, which I think would be strange for sampling. However, Avisynth is such a well made program for video I think maybe it is luma.

The Avisynth documentation doesn't really illustrate, in words or by other means, the design of the sampling matrix. This would not be so bad if it gave the number of bytes used or the number of bits given to each component. It also leaves how chrominance is sampled very obscure.

I don't know if it samples in progressive or interlaced, or even if it uses interpolation (I would think so, but how the interpolation works is another question).

Beyond the dimensions of its matrix all I know is that it is similar to YV12, but not exactly the same. Is it like YUV 4:4:4 or YUV 4:2:0 or YUV 4:2:2 maybe? YUV 444 is the highest quality, which would make sense with DVD's downsampling ability. However, YUV 422 is a very common form of YUV. On the other hand, YV12 is a type of YUV 420.

I guess this question is more of a compilation of questions on YUY2.

How does it sample Y, as luma or luminance?

How is color sampled, as chrominance or chroma (i.e. two separate color channels or one combined color channel)?

If the color channels are separate, are both U and V sampled from the same pixel (first or second in the matrix?) or one from the first pixel and one from the second?

Based on the nature of color sampling does it use interpolation to fill in color data in pixels that do not have color data? (The above question would answer this I guess.)

How many total bytes are used?

How many bits are given to each channel?

Does it sample progressively or interlaced (YV12 is progressive by nature)?

Which form of the YUV family is it closest to: 4:4:4, 4:2:2, or 4:2:0?

I guess it is a lot to ask. I am not sure anyone will know the answer, but I thought I would try here before I join and try doom9's forum. If doom9 does not work I guess I could ask the Avisynth site directly. You never know until you ask.

Scintilla · Post by **Scintilla** » Sun Jun 26, 2005 2:25 am

Considering YUY2 samples Y-U-Y-V and repeat, I think it's safe to say it's closest to YUV 4:2:2, with 16 bits per pixel.

Someone please correct me if I'm wrong.

Also, @AbsoluteDestiny: on <a href="http://www.animemusicvideos.org/guides/ ... html">this page</a>, right underneath the 4:2:2 image, where it says "Effectively what happens is that the chroma information is half the regular vertical resolution", shouldn't that be "half the regular <b>horizontal</b> resolution"?

sysKin · Post by **sysKin** » Sun Jun 26, 2005 9:32 am

I'm not sure about answers to most of it, but:
- each channel has 8-bit sampling precision
- chroma in YUY2 can be interlaced, because it has just as many lines as Y
- chroma is the same as chrominance, luma is the same as luminance. Cr and Cb are indeed not exactly the same as U and V (phase-shifted) (someone correct me if that's wrong).

I am not sure what you mean by sampling method and interpolation: any device can sample as it likes or interpolate as it likes. A computer does not sample UV data unless it's doing a conversion from RGB, in which case check exactly how the conversion filter is written.

Good resource:
http://msdn.microsoft.com/library/en-us ... ormats.asp

HeavyMetal · Post by **HeavyMetal** » Sun Jun 26, 2005 6:00 pm

Thanks for the info Scintilla and sysKin.

Sorry, but the use of chroma/chrominance and luma/luminance interchangeably is such a common mistake that most people think they are the same.

It's funny if you think about it. NTSC came up with the terms luma and chroma to mark distinctions and prevent confusion.

Chroma is U and V mixed together. Chrominance is the two channels separately.

Chroma or C is the combination of the two channels through amplitude modulation, normally for use in Y/C (S-video).

Luminance is the weighted sum of the linear RGB video components, proportional to intensity; in other words, R, G, and B combined to form the brightness. Luma is non-linear because gamma correction has been applied. Basically, luma is luminance after gamma correction.

4:2:2, 4:4:4, 4:1:1, and 4:2:0 are all different types of YUV sampling methods.

For instance, YUV 444 samples luminance at each of 4 pixels in a 2x2 matrix. YUV 422 samples luminance the same way, but samples U at the first and third pixels and V at the second and fourth pixels, counting left to right, top to bottom. Interpolation then carries the U and V data over to pixels that do not have their own color data or have only half the color data.

The matrix for YUV 422 looks like this.

Y1 U1 V2 Y2 U1 V2
Y3 U3 V4 Y4 U3 V4

I think Scintilla is right about YUY2 being like YUV 422.

I think the answers are as follows, based upon its relation to YV12, the shape of the matrix, and the intended interlaced output.

Matrix - 1x2
Total Bytes - 4
Bits - 8 for Y1, 8 for Y2, 8 for U, and 8 for V
Interlaced
Similarity to YUV 422 (interpolation to fill in color data)
Separate Chrominance channels not Chroma (otherwise the composite video would have to pull the two channels apart)
Luminance Sampling for the Y component not Luma (wouldn't make sense to use luma)

The matrix looks like this I think: (the top of YUV 422)

Y1 U1 V2 Y2 U1 V2

I just wasn't sure of my own conclusions, but they make sense given what I do know. Scintilla's statement about 16 bits per pixel seems to confirm what I thought.

Sir_Lagsalot · Post by **Sir_Lagsalot** » Mon Jun 27, 2005 12:15 am

A fairly good explanation of various YUV formats and conversion conventions:
http://msdn.microsoft.com/library/defau ... ormats.asp

In a nutshell:

4:4:4 means no downsampling of the chroma channels.
4:2:2 means 2:1 horizontal downsampling, with no vertical downsampling. Every scan line contains four Y samples for every two U or V samples.
4:2:0 means 2:1 horizontal downsampling, with 2:1 vertical downsampling.
4:1:1 means 4:1 horizontal downsampling, with no vertical downsampling. Every scan line contains four Y samples for every U or V sample.
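
The ratios above translate directly into an average storage cost per pixel. This is a minimal sketch, assuming 8-bit samples and no row padding; the table `SUBSAMPLING` and the function `bits_per_pixel` are names made up for illustration.

```python
# Horizontal and vertical downsampling factors applied to U and V
# for each scheme listed above.
SUBSAMPLING = {
    "4:4:4": (1, 1),
    "4:2:2": (2, 1),
    "4:2:0": (2, 2),
    "4:1:1": (4, 1),
}


def bits_per_pixel(scheme, bits_per_sample=8):
    """Average bits per pixel: one Y per pixel, plus U and V shared by h*v pixels."""
    h, v = SUBSAMPLING[scheme]
    return bits_per_sample * (1 + 2 / (h * v))


for name in SUBSAMPLING:
    print(name, bits_per_pixel(name))
```

With 8-bit samples this gives 24 bits for 4:4:4, 16 for 4:2:2, and 12 for both 4:2:0 and 4:1:1, which matches the 16 bits/pixel quoted for YUY2 and 12 bits/pixel quoted for YV12 in this thread.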

YUY2 is 4:2:2, 16 bits/pixel, and is stored as Y0 U0 Y1 V0 Y2 U1 Y3 V1 ...
Interlacing is trivial to handle with YUY2 since there is no vertical downsampling.
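
To make the byte order concrete, here is a rough sketch that splits one packed YUY2 scan line into its Y, U and V samples. It assumes 8-bit samples and an even pixel width; `unpack_yuy2_row` is a hypothetical helper, not part of Avisynth or any codec.

```python
def unpack_yuy2_row(row):
    """Split one packed YUY2 scan line (Y0 U0 Y1 V0 ...) into Y, U, V lists."""
    if len(row) % 4 != 0:
        raise ValueError("a YUY2 row must hold whole 4-byte (2-pixel) groups")
    y = list(row[0::2])  # every even byte is a Y sample
    u = list(row[1::4])  # byte 1 of each 4-byte group is U
    v = list(row[3::4])  # byte 3 of each 4-byte group is V
    return y, u, v


# Four pixels = eight bytes: Y0 U0 Y1 V0 Y2 U1 Y3 V1
row = bytes([16, 128, 17, 129, 18, 130, 19, 131])
y, u, v = unpack_yuy2_row(row)
print(y, u, v)
```

Each row yields as many Y samples as pixels and half as many U and V samples, and every row is handled the same way, which is why fields can be separated without touching the chroma.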

YV12 is 4:2:0, 12 bits/pixel, and is stored planar: Y0 Y1 Y2 ... Yn, then V0 V1 V2 ... Vm, then U0 U1 U2 ... Um (in YV12 the V plane comes before the U plane).
Interlacing is more complex with YV12; the typical way to do it is to double the width and halve the height internally so that all subsamples are in the same field. Not everyone follows this, though; some simply treat interlaced and progressive YV12 identically.
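
For comparison with the packed YUY2 layout, this sketch computes where each plane of a YV12 frame starts. It assumes even dimensions and no row padding (real buffers often pad each row), and it follows the YV12 FOURCC definition, which puts the V plane before the U plane. `yv12_layout` is an illustrative name.

```python
def yv12_layout(width, height):
    """Return {plane: (byte offset, byte size)} for an unpadded YV12 frame."""
    if width % 2 or height % 2:
        raise ValueError("this sketch assumes even width and height")
    y_size = width * height
    c_size = (width // 2) * (height // 2)  # chroma is halved in both directions
    return {
        "Y": (0, y_size),
        "V": (y_size, c_size),
        "U": (y_size + c_size, c_size),
    }


layout = yv12_layout(720, 480)
total = sum(size for _, size in layout.values())
print(layout, total)
```

The total comes to 1.5 bytes per pixel, i.e. the 12 bits/pixel mentioned above.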

All the RGB-to-YUY2/YV12 conversion routines I've seen calculate the Y component as a weighted sum of the RGB values, with no non-linear step.
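
As an illustration of such a weighted sum, the sketch below uses the BT.601 weights (0.299, 0.587, 0.114). That choice, the full-range output with no 16-235 scaling, and the name `y_from_rgb` are assumptions for the example; an actual conversion filter may use different coefficients and ranges.

```python
def y_from_rgb(r, g, b):
    """Weighted sum of 8-bit R, G, B using BT.601 weights (assumed here)."""
    return 0.299 * r + 0.587 * g + 0.114 * b


print(y_from_rgb(255, 255, 255))  # white
print(y_from_rgb(0, 255, 0))      # pure green
print(y_from_rgb(0, 0, 255))      # pure blue
```

The routine applies no transfer function of its own; it just weights whatever RGB values it is given, so whether the result is linear or gamma-corrected depends on the RGB input.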

There is no fixed way to subsample the various channels; it is typically done by averaging the values, though. Likewise, upsampling methods vary too; linear and nearest neighbor are both common.
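
A minimal sketch of those three operations on a single row of chroma samples, for the 2:1 horizontal case. The function names are hypothetical, and the linear version assumes each chroma sample is co-sited with the even-numbered pixel and repeats the last sample at the right edge; other filters place samples differently.

```python
def downsample_avg(samples):
    """2:1 horizontal downsample by averaging each neighbouring pair."""
    if len(samples) % 2:
        raise ValueError("this sketch assumes an even number of samples")
    return [(samples[i] + samples[i + 1]) / 2 for i in range(0, len(samples), 2)]


def upsample_nearest(samples):
    """1:2 upsample by repeating each sample."""
    out = []
    for s in samples:
        out.extend([s, s])
    return out


def upsample_linear(samples):
    """1:2 upsample; odd positions get the mean of the two nearest samples."""
    out = []
    for i, s in enumerate(samples):
        nxt = samples[i + 1] if i + 1 < len(samples) else s  # repeat at edge
        out.extend([s, (s + nxt) / 2])
    return out


print(downsample_avg([100, 110, 120, 140]))
print(upsample_nearest([105, 130]))
print(upsample_linear([100, 120]))
```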

HeavyMetal · Post by **HeavyMetal** » Mon Jun 27, 2005 1:04 am

I wasn't talking about linear and non-linear in terms of conversions.

The original filming is actually an RGB source from the camera negatives, though. So I guess to a degree it is like a conversion, but that is raw source.

The linear and non-linear refers to gamma correction to the YUV.

Linear means a 25% increase in brightness is a 25% increase by the math.

Non-linear is corrected so that a 25% increase in brightness will appear in the display as a 25% increase to the eye.

The trouble comes from the ends of the brightness spectrum. Variations among near-black brightness levels are difficult to tell apart. The same goes for very bright levels. As a result, linear appears to have a sudden jump in brightness at middle brightness levels and then a drop-off at brighter levels.

(It also corrects for the difference in appearance of vertical and horizontal lines. For some reason vertical and horizontal lines of the same brightness appear to be different shades without gamma correction.)

The weighted sum of RGB you speak of is the luminance before gamma correction.
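
The distinction can be sketched numerically. This example assumes a simple power-law gamma of 2.2 and the same BT.601 weights for both quantities; real standards use piecewise transfer curves and their own coefficients, and all names here are made up for illustration. It follows the usual video convention in which gamma correction is applied to R, G and B individually before the weighted sum, so luma comes out close to, but not always equal to, gamma-corrected luminance.

```python
GAMMA = 2.2                       # assumed simple power-law gamma
WEIGHTS = (0.299, 0.587, 0.114)   # assumed BT.601 weights


def weighted_sum(rgb):
    return sum(w * c for w, c in zip(WEIGHTS, rgb))


def gamma_encode(c):
    """Map a linear-light value in 0..1 to a gamma-corrected value."""
    return c ** (1 / GAMMA)


def luminance(linear_rgb):
    """Weighted sum of linear-light components."""
    return weighted_sum(linear_rgb)


def luma(linear_rgb):
    """Weighted sum of the gamma-corrected components."""
    return weighted_sum(tuple(gamma_encode(c) for c in linear_rgb))


grey = (0.5, 0.5, 0.5)
green = (0.0, 1.0, 0.0)
print(luminance(grey), luma(grey))
print(luma(green), gamma_encode(luminance(green)))
```

For a neutral grey the two routes agree (luma equals the gamma-corrected luminance), while for a saturated color such as pure green they differ.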