Post
by rose4emily » Wed Nov 17, 2004 10:51 pm
It's been a while since I've checked in - kinda surprised that there's been no social chatting or "where the hell is Rose" discussion since my last post.
Anyhow, I now have Pen's full final render. That's a Good Thing (tm). By the way, I've just learned that the phrase "a good thing" is now, and has for a while been, a registered trademark held by Martha Stewart - and I find this amusing. It's at just over a 4:3 aspect, so I'll have to do a little pan-and-scan (albeit vertically, rather than horizontally, and for the purpose of letterboxing a non-letterboxed picture - now there's a twist on some old film-to-television reformatting procedures: "this picture has been formatted to not fit your screen - unless, of course, you're watching this on one of those new 16:9 HD displays"). The good news, though, is that I've cleared out a lot of disk space, so I'll have room for the uncompressed video sources I'll be using to test Hikari's (my video editing app's) rendering pipeline as I put it together. Since I haven't yet put anything into that disk space, I'm thinking one of my test streams is going to look a lot like some Koji Morimoto footage accompanied by the Kronos Quartet.
This will let me vary the position of the "window" to which the video will be cropped, in case a static removal of the top and bottom doesn't work quite right.
Actually - this just came to mind - I'll also check to see if non-square pixels are part of the issue, as the video is (data-wise) 720x480, but is defined as 4:3, and therefore displayed as 720x540. 16:9 would be 720x405, and the picture has a bit of black border on the top and bottom, so I might be dealing with a case where the best approach would be to define the non-square pixels as square and then trim the picture evenly from the top and bottom - which would only remove about 20-25 pixels of actual picture from the upper and lower edges. I think this might be what was done with the first render - as the vertical-to-horizontal proportions of objects in the first render are somewhat smaller than in the render I just received - yet (probably because this is animation - albeit exceptionally detailed hyperrealistic animation, and therefore an abstraction of actual form) neither looks out of proportion when viewed. Couldn't get away with this degree of aspect-stretching with live-action footage, but it seems to work with anime, and would greatly reduce the degree to which content is being removed from the viewable area of the screen.
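For my own sanity, here's that crop arithmetic as a quick Python scribble - the frame size and target aspect are the ones above, but the black-border allowance is just a guess until I actually measure the frames:
[code]
# Treat the 720x480 frame as square pixels and crop down to 16:9.
def crop_for_aspect(stored_w, stored_h, target_aspect):
    target_h = round(stored_w / target_aspect)    # 720 / (16/9) = 405 lines
    trim_total = stored_h - target_h              # 480 - 405 = 75 lines
    return target_h, trim_total, trim_total / 2   # ~37.5 lines per edge

print(crop_for_aspect(720, 480, 16 / 9))          # (405, 75, 37.5)

# If roughly 12-17 of those lines per edge are already black border,
# only about 20-25 lines of real picture disappear from each edge.
[/code]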
So, now, if we can just get Song's new narratives (all attempts to recover the disk they were on have failed - and I can neither read data from the damaged partition, nor reformat it - which means I now have a grossly overpriced magnetic paperweight, and my last hopes of sparing Song the hassle of re-recording narratives that were perfectly good the last time around have been scrapped), I'll have all the pieces I need for the compilation. I've one more final, on Friday, to prepare for - followed by an eight (read 12-16, from my past experience) hour train ride where I'll prepare all the composited narrative images, come up with credit images to stand in for the ones I haven't received, and hammer out a concept for the intro that is both attractive and within the practical limits of my time and editing skill. Over the week I'll finish the intro, match the composited narrative images to the narrative audio, possibly re-record my own narratives (if I can manage to get better audio quality on my parents' computer, which doesn't have its soundcard stuffed immediately between the CPU and power supply - a bad arrangement that causes a lot of hum and noise in the recording), and encode the whole thing. There's plenty to be done, but at least I now have a decent window of time in which I can do it.
I'll also try to pump a little more quality out of the final render than is presented by the preview renders. I've learned an awful lot about video compression and encoding (among other things) while doing the background research needed to lay out the architecture for my video editing app, and have a few ideas that might get better results than I pulled out of my trial-and-error approach for the preview render:
1) Larger Picture. I think I ought to try encoding it at 640x480/640x360 and 720x540/720x405 to see how much better the picture looks with each macroblock responsible for a smaller portion of the overall picture, and the frequency spectrum output of the DCT shifted to a lower range - where the quantization tables keep more of the original video information. Then I'll have to see how much the filesizes increase to determine whether the added quality is justified.
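As a back-of-the-envelope check on point 1, here's roughly how much of the frame a single 16x16 macroblock covers at each candidate size (the list just mirrors the resolutions mentioned above):
[code]
# Macroblocks per frame at each candidate resolution (MPEG-4 codes 16x16 blocks).
SIZES = [(512, 384), (512, 288), (640, 480), (640, 360), (720, 540), (720, 405)]

for w, h in SIZES:
    mb_cols, mb_rows = -(-w // 16), -(-h // 16)   # ceiling division
    blocks = mb_cols * mb_rows
    print(f"{w}x{h}: {blocks} macroblocks, ~{100.0 / blocks:.3f}% of the frame each")
[/code]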
2) Shape-adaptive filtering. I was using a 'smart' Gaussian blur, with a threshold set to make it smooth the picture significantly more in "flat" regions than along detailed edges - with the intention of reducing blocking and 'mosquito-noise' artifacts from the original DVD compression, and whatever other re-compression was applied after that. I point to my own video as the probable worst offender, which is already 4th generation before entering the compilation:
1 - DVD
2 - fansub (I know, bad - but I started it before I had spare money for DVDs, and didn't think to edit it in a fashion that lent itself to substituting in better footage after the fact).
3 - converted to an encoding Cinelerra could understand (before I knew how well editing lossless image streams worked)
4 - encoded as XviD while the lossless copy was, well, lost.
Shape adaptive filtering is essentially 'smarter' filtering, that tries to actually discern regions from points of sharp discontinuity. It sounds promising, but I haven't tried it yet, so I don't know whether it will help to 'clean up' some or all of the videos before the final encode. It's probably also slow, but that's why you start a render just before you go to sleep.
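Roughly, the threshold idea behind that kind of 'smart' blur works like the little numpy/scipy sketch below - the Sobel edge measure and the sigma/threshold values are just stand-ins of mine, not necessarily how the actual filter decides what counts as 'flat':
[code]
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def smart_blur(frame, sigma=1.5, edge_threshold=30.0):
    """Blur 'flat' areas of a greyscale frame, leave strong edges alone."""
    src = frame.astype(np.float32)
    blurred = gaussian_filter(src, sigma)
    # Rough edge strength: gradient magnitude from Sobel derivatives.
    edges = np.hypot(sobel(src, axis=1), sobel(src, axis=0))
    # Below the threshold counts as 'flat', so take the blurred value there.
    return np.where(edges < edge_threshold, blurred, src).astype(frame.dtype)
[/code]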
3) Comparing the XviD MPEG4 codec, FFMPEG (the one I used last time) MPEG4 codec, and Ogg Theora codec for quality/size. Primary distribution in Theora would be kinda crazy, however, given how few people actually have the relatively new (okay, still alpha-release) codec installed, so it's really a question of XviD vs. FFMPEG in the special case of animated footage, with a comparison to Theora purely for the sake of satisfying my own curiosity. I'll also try H.264 if I can find a free (as in beer, open or not) implementation. H.264 is really just an updated flavor of MPEG4 with vast improvements in the encoding logic that should produce a sharper, cleaner picture at the same bitrate - but might also be a problem in terms of playback compatibility. I know a lot about it on the tech spec side, but I'd have to actually get my hands on a Windows machine to see if it is available in WMP as either an installed or automatically downloadable (I think WMP can do that) codec. I'll also check to see what Quicktime can play, because that should cover Apple compatibility. Optimally, animation would be encoded with motion predictors and quantization tables based on detailed knowledge of the source cells and animation procedure, rather than generated through analysis of the post-animation film - but that's really not an option for us, so it's more a matter of determining which general-purpose codec does best overall for the range of footage styles seen in this project.
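One way I could put a number on the 'quality' half of that comparison is plain PSNR against the lossless source frames (eyeballing will probably be the real judge). A rough sketch, assuming I've already decoded matching frames from each codec into numpy arrays - the frame extraction itself, and the names at the bottom, are hypothetical:
[code]
import numpy as np

def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio (dB) between two same-sized frames."""
    mse = np.mean((reference.astype(np.float64) - test.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak * peak / mse)

# e.g. average PSNR per codec over a test clip:
# scores = {name: np.mean([psnr(ref, out) for ref, out in zip(source_frames, frames)])
#           for name, frames in decoded_frames.items()}
[/code]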
4) Targeting a per-section file size of about 500-600 MB, rather than the 250 for the preview renders. The added size from the intro, credits, and narratives will be relatively small due to the limited motion in and duration of these segments, so most of these added bits can be applied to the musical segments themselves. BTW, this figure is based on the idea of < 100 MB per segment, counting the "extras" as one segment per section - and the pragmatic concept of producing a per-section filesize that fits neatly onto a standard CD without too much wasted space.
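Quick arithmetic behind point 4, just to see what average bitrate a given file size buys - the 45-minute running time and the 128 kbps audio figure are only placeholders here, not the real numbers for a section:
[code]
def avg_video_kbps(target_mb, minutes, audio_kbps=128):
    """Average video bitrate that fills target_mb over the given duration."""
    total_kbps = target_mb * 1024 * 8 / (minutes * 60.0)   # MB -> kilobits/second
    return total_kbps - audio_kbps                          # leave room for the audio track

for size_mb in (250, 500, 600):
    print(size_mb, "MB over 45 min ->", round(avg_video_kbps(size_mb, 45)), "kbps of video")
[/code]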
5) This isn't actually anything different, but I should add that I'll be encoding off of the originals, not the preview renders, to avoid creating an extra generation and to take advantage of the fact that over half of the videos were sent to me in sizes other than 512x384/512x288 in the first place (why rescale twice, when you can do it in one neat step?).
may seeds of dreams fall from my hands -
and by yours be pressed into the ground.