Comparing final encode quality

BasharOfTheAges · Post by **BasharOfTheAges** » Wed Feb 28, 2007 8:56 am

So, I recently started encoding betas of a project I'm finishing up with x264, and the simple queue feature got me thinking about setting up several target file sizes (25MB, 35MB, 45MB, etc.) and letting my system encode them while I was out. A larger file size usually means a better encode, but I want to balance size and quality (with a strong emphasis on quality), so it'd be nice to see the point of diminishing returns through some non-subjective means. So, my question is: does anyone have any experience with (or know how to) empirically compare x264-encoded MP4s of various sizes? Is there an app I could use for it, or perhaps a trick with AviSynth scripts and VDubMod?

Kariudo · Post by **Kariudo** » Wed Feb 28, 2007 9:34 am

About the best thing I can think of is to make an AviSynth script that loads each MP4 (DirectShowSource) and open those scripts in AvsP.

You can't really play them back side by side... but you can go frame by frame, and flipping between the two (or more) can be done with the scroll wheel of your mouse (assuming it has one).

BasharOfTheAges · Post by **BasharOfTheAges** » Wed Feb 28, 2007 12:08 pm

I was thinking more of a difference matte or whatnot to show me the actual amount of blocking I'd get at various compression levels; you know, to make differences a bit more painfully obvious. It would also help if I could expand the video without resizing it, because I judge quality at the dimensions I'm actually going to watch videos at, 1680x1050 [adjust for not caring about widescreen], but I'm obviously not going to release at such resolutions because most people don't run at those.
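A minimal sketch of what such a difference matte computes, assuming the frames have already been decoded into rows of 8-bit luma values (the decoding step is outside this sketch). The function name and the `gain` parameter are illustrative only, not part of any tool mentioned in this thread.

```python
def difference_matte(reference, encoded, gain=4):
    """Return an amplified absolute-difference image of two equally sized frames.

    Frames are lists of rows; each row is a list of 8-bit luma values (0-255).
    `gain` is an illustrative amplification factor, not a standard setting.
    """
    if len(reference) != len(encoded):
        raise ValueError("frames must have the same height")
    matte = []
    for ref_row, enc_row in zip(reference, encoded):
        if len(ref_row) != len(enc_row):
            raise ValueError("frames must have the same width")
        # Amplify the per-pixel error and clamp it to the 8-bit range.
        matte.append([min(255, abs(r - e) * gain) for r, e in zip(ref_row, enc_row)])
    return matte


# Tiny made-up 2x3 "frames" standing in for decoded video frames.
source_frame = [[16, 128, 235], [100, 100, 100]]
encoded_frame = [[16, 130, 200], [97, 100, 180]]
matte = difference_matte(source_frame, encoded_frame)
```

Pixels that match come out black, and larger errors come out brighter, so blocking shows up as bright block-shaped patches.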

Zero1 · Post by **Zero1** » Wed Feb 28, 2007 6:39 pm

Your best bet is to forget about filesizes and do two encodes. Do one at QP 18, and one at CRF 18. They are both high quality (the QP 18 encode will look about the same as XviD Q2) and should give you reasonable filesizes. These are quality-based methods, so you can be assured that any encode at these settings will look good; however, the filesizes can vary. Complex videos will simply come out larger, rather than blocking up at whatever bitrate you usually use.

Generally values between CRF 18-20 are good for AMVs, or use QP 18-20 if you want a bit extra quality (QP is a constant quantizer, but CRF fluctuates a bit and uses a good bit less bitrate for not much worse quality).

The best thing is that these are used in one-pass mode, so you get guaranteed quality from one pass; give it a try.

BasharOfTheAges · Post by **BasharOfTheAges** » Wed Feb 28, 2007 8:30 pm

How would I go about doing that in MeGUI? There are so many damn options and menus... it's all a bit much.

BasharOfTheAges · Post by **BasharOfTheAges** » Thu Mar 01, 2007 2:12 pm

I think I figured it out - the option "constant quality" with a setting of 18 yielded a pretty damn nice looking encode... at a size of 154MB.

I did a lot of cleaning in my footage prep step (I'm talking 2 days of my computer running the scripts to clip, filter and compress (HuffYUV) 61.4GB of video), so I doubt it's having problems with the footage.

I'm going to try 2 separate values and do a color comparison bit mask on the result if I can figure out how to do that. It's good to find the point at which it doesn't actually matter if the file's any better because 99% of people won't notice the difference.

Zero1 · Post by **Zero1** » Thu Mar 01, 2007 3:16 pm

154MB is huge. I rarely expect XviD to create AMVs that size at Q2 (which is the equivalent of QP18 in x264).

Can you post an encode of the AMV? Or if you prefer not to "release" it yet, can you tell us any more about it? What is the framerate, resolution, does it have lots of noise (even artificial noise like TV static)? Anything that you think will be relevant.

Also what x264 settings are being used? You should max out your settings to really benefit from x264, otherwise you may as well just use XviD on max settings and live with bigger files.

Also experiment with QP20 or CRF 18-20. Those should still look good but be more reasonable in filesize.

Although there wasn't a ton of action in an incomplete AMV I encoded recently, it was 640x352, 23.976fps, 2m26s and 21MB including audio (I used QP18 for video and Q0.40 for audio which is about 170kbps in Nero).
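As a rough worked example, the figures above can be turned into an average bitrate. This sketch assumes 1MB means 1024 x 1024 bytes and ignores container overhead; the function name is made up for illustration.

```python
def average_bitrate_kbps(size_mb, duration_s, bytes_per_mb=1024 * 1024):
    """Average bitrate in kbit/s (1 kbit = 1000 bits) for a file of size_mb."""
    return size_mb * bytes_per_mb * 8 / duration_s / 1000


duration_s = 2 * 60 + 26                      # 2m26s, from the post
total_kbps = average_bitrate_kbps(21, duration_s)
video_kbps = total_kbps - 170                 # subtract the ~170kbps audio track

print(f"total: {total_kbps:.0f} kbps, video: {video_kbps:.0f} kbps")
```

The same function works in the other direction for planning: a target file size and a known duration give the average bitrate a multipass encode would have to hit.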

To maximise compression you should use SubME 7, search mode UMH (uneven multi hex), B-pyramid, many B-frames (up to 5 is fine), and as many reference frames as you are comfortable with (a max of 16, but experience shows that after around 8 references the gains diminish; but if you can use 16, there is no reason not to).

You can check out the options and what they do or mean in my guide at http://aflux.deltaanime.net/Zero1/MP4/x264.html

It's a guide for the x264 command line, but you should be able to figure out which switches are which options in megui.

BasharOfTheAges · Post by **BasharOfTheAges** » Fri Mar 02, 2007 11:12 am

Zero1 - check your PMs, we'll do it from there as I'm not ready to release yet, but wouldn't mind the assistance.

Zero1 · Post by **Zero1** » Fri Mar 02, 2007 6:18 pm

This wasn't as mysterious as I thought it might be. After a few minutes of detective work, I have come up with the following.

For starters, the video is going in the right direction at 45MB: the quantizer tends to average around Q20, but varies from 16-30 (this is a rough estimate just from watching the frame mean quantizer vary throughout the video). What that means in English is that the overall quality is very good, but bad in complex places. Also, some frames are encoded at QP16, which is considered wasteful (it's like going beyond Q2 in XviD). In an ideal world you would have a video at average Q18 +/-1; so the average QP is 18, but it's allowed to vary between 17-19 in parts for efficiency (using higher quantizers in parts where it can get away with it in order to improve other scenes).

The video itself is fast moving (so not a lot of temporal redundancy) and some of the scenes (a good example at 01m 00s - 01m 03s) are incredibly hard to compensate, so instead of coding mainly residual (the difference between the previous frame and the new frame, think of it as sort of an overlay), almost complete new textures are coded, and despite it only being 3 seconds, it bloats the file (working on 20-30KB per frame could end up between 1-2MB for those 3 seconds).
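To put rough numbers on that estimate, here is the arithmetic as a small sketch. The frame rate is an assumption (23.976fps is typical for this kind of footage, but it is not stated for this video), and 1MB is taken as 1024KB.

```python
FPS = 24000 / 1001                 # assumed 23.976 fps; not stated for this video
SECONDS = 3

frames = round(FPS * SECONDS)      # number of frames in the difficult section

low_kb, high_kb = 20, 30           # per-frame sizes quoted in the post
low_mb = frames * low_kb / 1024
high_mb = frames * high_kb / 1024

print(f"{frames} frames: {low_mb:.1f}MB to {high_mb:.1f}MB")
```

Under these assumptions the section lands at the upper end of the 1-2MB estimate, which is a lot for three seconds of a 45MB file.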

The quantizer during this 3 second section flies all the way out to around Q24-30, which is ugly and gives it a sort of blurred/muddy effect (actually that's the inloop deblocking working overtime, otherwise it would simply look blocky as hell). To stop the quantizer flying out so wide, you can do one of two things (or both, if multipass alone does not fix it):

1) Do multipass, for example if you did a two pass and have this problem, try a three pass. Three pass is generally not required, but some people have found it's beneficial with AMVs where you have a lot of change in a short period of time.
2) Increase the qcomp value (Quantizer Compression in MeGUI). Default is 0.6. Setting it low (e.g. 0.1) gives a more constant bitrate with quantizers that vary a lot (this looks terrible, like a 1 pass encode or old MPEG-1); setting it high (e.g. 0.9) means it will settle on a more constant quantizer across the video. This is good but not always desirable: with a high value, complex scenes that could have got away with higher quantizers take more of the bitrate, so other scenes may suffer. I'd recommend trying 0.7 or 0.8 if a three pass still has problems (but carry on reading before you do).
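A simplified model of what qcomp does, as a sketch. It assumes the encoder's quantizer scale grows as complexity^(1 - qcomp), as far as I understand x264's rate control, and that QP rises by 6 each time the quantizer scale doubles. It ignores complexity blurring, VBV and every other adjustment, and the complexity ratio of 8 is a made-up figure.

```python
import math


def qp_spread(complexity_ratio, qcomp):
    """Approximate QP gap between a complex scene and a simple one.

    Simplified model: qscale ~ complexity ** (1 - qcomp), and QP rises
    by 6 for every doubling of qscale.
    """
    return 6 * (1 - qcomp) * math.log2(complexity_ratio)


# Hypothetical case: one scene is 8 times as complex as another.
for qcomp in (0.1, 0.6, 0.8, 1.0):
    print(f"qcomp {qcomp}: quantizer gap of about {qp_spread(8, qcomp):.1f}")
```

In this model, raising qcomp from 0.6 to 0.8 halves the gap between the easy and hard scenes, and 1.0 removes it entirely (a constant quantizer).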

What I also gathered from watching the video, was that there were no B-frames. No B-frames is causing a large loss of efficiency. You need to enable them ASAP.

I then went on to poke at the file itself and found that the settings used were pretty much default. I downloaded MeGUI and confirmed that they were more or less default. The good news is that we can improve the encode a fair bit. Some notable things to enable/use/change are:

AVC Profiles
Set this to high as it allows you to use an extra option I will cover later.

ME Range (--merange)
Currently set at 16 (default), I would suggest using 32. This increases the search area so the codec is able to make more matches and save bitrate, rather than encode new textures.

ME Algorithm (--me)
Default is Hexagon, again I suggest changing this to Multi Hex (known as UMH in the command line).

Subpixel refinement (--subme)
Default is 5; I really suggest 6 or 7. This can make a nice difference.

Keyframe interval (--keyint)
Default is 250; you may change this to something higher but it's unlikely to benefit this AMV much. Basically it determines the maximum amount of frames before a keyframe is forced. Good for long scenes without a real scene change since you can minimise large I-frames, but this also can affect seeking.

Trellis (--trellis)
Default off. You should enable this also. If you have a good CPU, then use --trellis 2 (or Always in MeGUI), else use --trellis 1 (Final MB).

Reference frames (--ref)
Default is 1 (which is less than XviD uses). I recommend between 5 and 8 reference frames, or more if you don't mind the encode time; however, the benefit diminishes after 8 frames. One video Streicher encoded was 31MB without audio using 1 reference frame, and 27MB with 8 references. That's around a 13% saving, but that video had static parts that really benefited from references. They are still good space savers; just don't expect the same saving on your video (they will certainly help a lot).

Mixed Reference Frames (--mixed-refs)
If you are using multiple references, then you should almost always use this, as it increases flexibility by allowing macroblock partitions to choose their own references rather than a whole macroblock using the same reference.

No Fast P-Skip (--no-fast-pskip)
Enabling or using this option disables early skip detection. Early skip detection can cause blocking in solid colours or gradients, which is bad news for anime. Definitely enable this option.

Minimum Quantizer (--qpmin)
To prevent wasting bitrate, you can set this to the lowest quantizer you are willing for a P-frame to be encoded at. Good values are between 16-20, but setting it too high will have an adverse effect on the quality. I would suggest 18. This will ensure that P-frames do not get lower quantizers than this which means you aren't wasting bitrate on quality you can't notice.

Factor between P and B-frame quants (--pbratio)
Default of 1.3 is good, but for extra compression, you may use up to 1.5 safely, however do beware that on a Q18 encode with a pbratio of 1.5; B-frames will get around Q21-22 which may look bad in some cases.
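The Q21-22 figure can be checked with a little arithmetic. This sketch assumes the ratio is converted to a quantizer offset of 6 x log2(pbratio), since QP rises by 6 for each doubling of the quantizer scale; other adjustments (B-pyramid, for example) are ignored, and the function name is illustrative.

```python
import math


def b_frame_qp(p_frame_qp, pbratio):
    """Approximate B-frame QP for a given P-frame QP and P/B ratio."""
    return p_frame_qp + 6 * math.log2(pbratio)


default_qp = b_frame_qp(18, 1.3)   # default ratio
higher_qp = b_frame_qp(18, 1.5)    # the "extra compression" ratio

print(f"pbratio 1.3: about Q{default_qp:.1f}, pbratio 1.5: about Q{higher_qp:.1f}")
```

So the default adds a little over 2 to the quantizer, while 1.5 adds about 3.5.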

Macroblock options (--analyse)
Make sure you set the AVC Profile to "High", and then select All from Macroblock options. You should see Adaptive DCT, I4x4, P4x4, I8x8, P8x8 and B8x8 get checked. This allows the codec to be more flexible and make better choices for compensation (can be set in CLI as --analyse all).

B-frames (--bframes)
Pretty essential. In XviD you would tend not to use more than 2 due to flaws in the MPEG-4 ASP standard with DCT drift; however, it's safe to use multiple B-frames in H.264. I suggest 3, but you may use up to 5 or 6; just remember that more B-frames add to encoding and decoding complexity, but at the same time are a great space saver. B-frames can also be used as references, so it's doubly useful to use a good amount.

Adaptive B-frames
Should be enabled (a switch is not required to enable this in x264 CLI, just to disable). This decreases the number of B-frames where it makes sense, helping to increase the quality by using another frame type where a B-frame might not be optimal.

B-Pyramid (--b-pyramid)
This should also be enabled. It allows B-frames to be used as references, which improves efficiency.

RDO for B-frames (--b-rdo)
Improved motion estimation for B-frames (improving quality and efficiency) at the expense of encode time. Recommended. Requires Subpixel Refinement 6 or higher (eg --subme 6, --subme 7).

Weighted B prediction (--weightb)
Improves fades by weighting each reference frame's contribution to a B-frame's prediction. Also recommended.

Bidirectional ME (--bime)
Enables an additional search for forward and backward motion vectors when coding B-frames. You should enable this also as it improves the efficiency and quality of B-frames again.

B-frame mode: (--direct)
Defines the motion prediction used for direct macroblocks. Temporal is usually the better choice since it uses the following P-frame for motion prediction, as opposed to spatial, which relies on surrounding macroblocks and their motion. However, selecting Auto allows x264 to switch between the two modes and choose the best one on a per-frame basis, which is better than fixing one or the other. I recommend Auto.

Now that I have listed the main options you should change to get a nice encode, I suggest you play around in MeGUI and change these. The encode may slow to a crawl compared to what you have been used to, but the quality/filesize will be better and worth it in my opinion. Obviously you can ease up on some of the options (fewer reference frames than I suggest, for example) if encoding is too slow for you, but bear with it if you can; this is what sets x264 apart from XviD. You can follow these screens, which basically follow what I just suggested, or you can go and grab some MeGUI profiles from Doom9, which should be pretty nice. I might add that these settings are just suggestions; they are not optimal or special in any way, but they are a lot better than the defaults. Check out the guide I linked to earlier for more help on the other options.
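For anyone using the command line instead of MeGUI, here is a sketch that collects the suggestions above into one argument list. The switch names are the ones used in this post for the x264 CLI of the time, with two exceptions: the B-frame count is spelled `--bframes`, and `--8x8dct` (the switch for Adaptive DCT) is added even though it is not named above. As far as I know, later x264 builds dropped `--b-rdo` and `--bime` and changed `--b-pyramid` to take a value, so check `x264 --help` for your build. The file names and the helper function are hypothetical.

```python
import shlex


def build_x264_args(source, output, crf=18, refs=8, bframes=3):
    """Build an x264 argument list from the settings suggested in this post."""
    args = ["x264", "--crf", str(crf)]
    # Reference frames.
    args += ["--ref", str(refs), "--mixed-refs"]
    # B-frame options.
    args += ["--bframes", str(bframes), "--b-pyramid", "--b-rdo",
             "--weightb", "--bime", "--direct", "auto"]
    # Motion estimation.
    args += ["--me", "umh", "--merange", "32", "--subme", "7"]
    # Partitions, adaptive DCT (needs High profile) and trellis.
    args += ["--analyse", "all", "--8x8dct", "--trellis", "1"]
    # Quantizer behaviour.
    args += ["--no-fast-pskip", "--qpmin", "18", "--pbratio", "1.3"]
    args += ["-o", output, source]
    return args


# Hypothetical input script and output file names.
args = build_x264_args("amv.avs", "amv_crf18.mp4")
command_line = shlex.join(args)
print(command_line)
```

The script only prints the command; nothing is executed, so the line can be reviewed and adjusted before running it by hand.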

For the audio, you might want to look at Nero AAC at Q0.3-0.4 (which is VBR), much better than CBR 128kbps, however the filesize may vary a bit.

trythil · Post by **trythil** » Fri Mar 02, 2007 6:22 pm

BasharOfTheAges wrote: So, I recently started encoding betas of a project I'm finishing up with x264, and the simple queue feature got me thinking about setting up several target file sizes (25MB, 35MB, 45MB, etc.) and letting my system encode them while I was out. A larger file size usually means a better encode, but I want to balance size and quality (with a strong emphasis on quality), so it'd be nice to see the point of diminishing returns through some non-subjective means. So, my question is: does anyone have any experience with (or know how to) empirically compare x264-encoded MP4s of various sizes? Is there an app I could use for it, or perhaps a trick with AviSynth scripts and VDubMod?

http://www.cns.nyu.edu/~lcv/ssim/
http://forum.doom9.org/showthread.php?s=&threadid=61128

The x264 command-line encoder also calculates PSNR and SSIM.

It takes a little bit of practice to understand what the SSIM numbers correspond to wrt footage quality (in particular, read the doom9 thread and the paper) but it's not a bad place to start.
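For a feel of what the PSNR figure measures, here is a minimal sketch of the calculation on two frames, assuming they are already decoded into rows of 8-bit values. x264 reports this for you; SSIM is more involved and is not shown. The sample frames are made up.

```python
import math


def psnr(reference, encoded, peak=255):
    """PSNR in dB between two equally sized 8-bit frames (lists of rows)."""
    squared_error = 0
    count = 0
    for ref_row, enc_row in zip(reference, encoded):
        for r, e in zip(ref_row, enc_row):
            squared_error += (r - e) ** 2
            count += 1
    if count == 0:
        raise ValueError("frames must not be empty")
    mse = squared_error / count
    if mse == 0:
        return math.inf            # identical frames
    return 10 * math.log10(peak ** 2 / mse)


# Made-up 2x3 frames standing in for a source frame and its encode.
source_frame = [[16, 128, 235], [100, 100, 100]]
encoded_frame = [[16, 130, 233], [97, 100, 101]]
quality_db = psnr(source_frame, encoded_frame)
print(f"{quality_db:.2f} dB")
```

Averaging this over every frame of each encode (25MB, 35MB, 45MB and so on) and comparing the results against file size is one way to look for the point of diminishing returns.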

AnimeMusicVideos.org
