Fast and reliable way to encode Theora Ogg videos using ffmpeg, libtheora, and liboggz


archive.org has started making Theora derivatives for movie files: for each movie file, we create an Ogg Theora video output. After trying a bunch of tools on a wide-ranging corpus of videos, I found a neat way to make the Archive derivatives.

High Level:

  • use ffmpeg to turn any video to “rawvideo”.
  • pipe its output to *another* ffmpeg to turn the video to “yuv4mpegpipe”.
  • pipe its output to the libtheora tool.
  • for videos with audio, use ffmpeg to create a vorbis audio .ogg file.
  • add tasty metadata (with liboggz utils).
  • combine the video and audio ogg files to an .ogv output!

Detailed example:

  • ffmpeg -an -deinterlace -s 400x300 -r 20.00 -i CapeCodMarsh.avi -vcodec rawvideo -pix_fmt yuv420p -f rawvideo - |  ffmpeg -an -f rawvideo -s 400x300 -r 20.00 -i - -f yuv4mpegpipe - |  libtheora-1.0/lt-encoder_example --video-rate-target 512k - -o tmp.ogv
  • ffmpeg -y -i CapeCodMarsh.avi -vn -acodec libvorbis -ac 2 -ab 128k -ar 44100 audio.ogg
  • oggz-comment audio.ogg -o audio2.ogg TITLE="Cape Cod Marsh" ARTIST="Tracey Jaquith" LICENSE="http://creativecommons.org/licenses/publicdomain/" DATE="2004" ORGANIZATION="Dumb Bunny Productions" LOCATION=http://www.archive.org/details/CapeCodMarsh
  • oggzmerge tmp.ogv audio2.ogg -o CapeCodMarsh.ogv
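
The four commands above can be wrapped in one function so the same steps run for any source file. This is only a sketch: the function name derive_ogv and the FFMPEG, THEORA_ENC, OGGZ_COMMENT, and OGGZ_MERGE variables are hypothetical names introduced here, the defaults are copied from the example, and only a TITLE tag is written to keep it short. It assumes the source has an audio track, and the ffmpeg flags are the ones from the example, which other ffmpeg versions may not accept unchanged.

```shell
#!/bin/sh
# Sketch only: tool paths, frame size, frame rate, and bitrates are
# illustrative defaults copied from the example, not required values.
FFMPEG=${FFMPEG:-ffmpeg}
THEORA_ENC=${THEORA_ENC:-libtheora-1.0/lt-encoder_example}
OGGZ_COMMENT=${OGGZ_COMMENT:-oggz-comment}
OGGZ_MERGE=${OGGZ_MERGE:-oggzmerge}

# derive_ogv SRC OUT [SIZE] [RATE] [TITLE]
# Runs the four steps in order and stops at the first step that fails.
# Note: plain sh has no local variables, so src, out, etc. are global.
derive_ogv() {
    src=$1
    out=$2
    size=${3:-400x300}
    rate=${4:-20.00}
    title=${5:-untitled}

    # Step 1: video only, through the double ffmpeg pipe into the Theora encoder.
    "$FFMPEG" -an -deinterlace -s "$size" -r "$rate" -i "$src" \
            -vcodec rawvideo -pix_fmt yuv420p -f rawvideo - |
        "$FFMPEG" -an -f rawvideo -s "$size" -r "$rate" -i - -f yuv4mpegpipe - |
        "$THEORA_ENC" --video-rate-target 512k - -o tmp.ogv || return 1

    # Step 2: audio only, encoded with libvorbis.
    "$FFMPEG" -y -i "$src" -vn -acodec libvorbis -ac 2 -ab 128k -ar 44100 audio.ogg || return 1

    # Step 3: metadata goes on the audio file, not on the video file.
    "$OGGZ_COMMENT" audio.ogg -o audio2.ogg TITLE="$title" || return 1

    # Step 4: merge the video and the tagged audio into the final output.
    "$OGGZ_MERGE" tmp.ogv audio2.ogg -o "$out"
}
```

Calling derive_ogv CapeCodMarsh.avi CapeCodMarsh.ogv 400x300 20.00 "Cape Cod Marsh" would run the same steps as the example. The intermediate names tmp.ogv, audio.ogg, and audio2.ogg are fixed, so two runs in the same directory would overwrite each other's files.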

WTFs:

  • Why the double pipe above? Some videos, when converted directly to yuv4mpegpipe format, did not work reliably with libtheora (or ffmpeg2theora).
  • We do the vorbis audio outside of libtheora (or ffmpeg2theora) to avoid any issues with Audio/Video sync.
  • We convert to yuv420p in the rawvideo step because ffmpeg2theora has (I think) some known issues with yuv422 video inputs (I found at least a few videos it could not handle).
  • We add the metadata to the audio vorbis ogg because adding it to the video ogv file wound up making the first video frame not a keyframe (!)
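
One practical side effect of the three-stage pipe: a shell normally reports only the exit status of the last command, so a failure in either ffmpeg stage can pass unnoticed and leave an empty or truncated tmp.ogv. The sketch below is one way to surface that. check_stages is a hypothetical helper name, and the usage shown in the comments assumes bash, because PIPESTATUS is a bash feature and not part of plain sh.

```shell
#!/bin/sh
# check_stages NAME STATUS [NAME STATUS ...]
# Hypothetical helper: reports every pipeline stage that exited non-zero
# and returns 1 if any stage failed, 0 otherwise.
check_stages() {
    failed=0
    while [ $# -ge 2 ]; do
        if [ "$2" -ne 0 ]; then
            echo "stage '$1' exited with status $2" >&2
            failed=1
        fi
        shift 2
    done
    return $failed
}

# Usage in bash (PIPESTATUS must be copied right after the pipeline):
#
#   ffmpeg ... -f rawvideo - | ffmpeg ... -f yuv4mpegpipe - | lt-encoder_example ... -o tmp.ogv
#   status=("${PIPESTATUS[@]}")
#   check_stages rawvideo "${status[0]}" yuv4mpeg "${status[1]}" theora "${status[2]}"
```

In bash, set -o pipefail is a shorter alternative that makes the whole pipeline fail if any stage fails, though it does not say which stage it was.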

The result plays in Firefox 3.1 and greater using the new HTML “video” tag:

<video controls="true" autoplay="true" src="http://www.archive.org/download/commute/commute.ogv"> for firefox betans </video>

This technique worked nicely across the 46 wide-ranging source and “trashy” videos that I use for QA before taking a new way of deriving our videos live at archive.org.

-tracey jaquith

10 Comments

  1. j said

    you should not use the ffmpeg vorbis encoder, it is really bad quality,
    please use libvorbis. you can do this by changing your line to:
    ffmpeg -y -i CapeCodMarsh.avi -vn -acodec libvorbis -ac 2 -ab 128k -ar 44100 audio.ogg

  2. Maik Merten said

    Actually with a libvorbis encoder one could encode the audio track with like 80 kbit/s and give the spare bitrate to the video stream.

    I’d go for encoding Vorbis with the oggenc tool, not ffmpeg (which may have to be built with special options to allow encoding with libvorbis) and first decode to .wav (or pipe raw PCM samples)

    oggenc --resample 44100 -b 80 audiodump.wav -o audio.ogg

  3. Pete D. said

    It’s really great to see Ogg Theora videos being widely used in archive.org. Thanks for these efforts!

  4. Doktor Bro said

    Can’t wait to see Ogg Theora replace the flash videos. Thank you!

  5. Edwin said

    Thanks for your info.
    But, is the output video worse in quality?

  6. somebody said

    This doesn’t work for me, it just throws an empty tmp.ogv .. it seems that the pipes aren’t working because if I ffmpeg -an -deinterlace -s 400x300 -r 20.00 -i CapeCodMarsh.avi -vcodec rawvideo -pix_fmt yuv420p -f rawvideo OUTPUT .. it works and gives a huge file. but I don’t have the space to do each one separately. that’s on ubuntu hardy. any ideas on how to make the pipes work?

  7. Gregory Maxwell said

    Please don’t use the Vorbis encoder included in FFMPEG. It produces very low quality, even at high bitrates.

    It is hard for me to express in words how much worse the ffmpeg Vorbis streams sound. So, I’ve put up some 11 second examples: With my test file when you ask ffmpeg for 128kbit/sec you get this 64kbit/sec result. At a comparable output bitrate Xiph.Org libVorbis gives this result and even at 45kbit/sec libVorbis simply sounds much better. … and Xiph.Org libVorbis isn’t even currently the best encoder available for these bitrates.

    After listening I’m sure you can see that this is not just a nit-picking difference. The ffmpeg output simply sounds *bad*. It’s not something which should be associated with the Vorbis name, and it’s not the quality that the public already expects from Vorbis. (If you have trouble playing the FFmpeg-produced Ogg, this may be because the file is also not spec compliant, though it played on everything I had available to me.)

    I’m unsure why ffmpeg is not shipping one of the liberally (BSD) licensed encoders. The Xiph.Org reference encoder in libvorbis would be an acceptable and obvious choice. (Although AoTuV would likely be a better choice, the difference is small compared to the output of the FFMPEG encoder). I don’t think most people in the Vorbis world were even aware of the FFMPEG encoder until it was noticed how poor the archive.org files sounded, as most people producing Theora files are using ffmpeg2theora which makes use of libVorbis to encode Vorbis audio.

    The above processing could probably be amended to have the ffmpeg audio step output PCM and pipe that into oggenc. Alternatively it may be possible to get ffmpeg to use libvorbis, as ffmpeg2theora does.

  8. [...] good comment on the post Fast and reliable ways to encode Theora Ogg videos pointed me at ffmpeg2theora which [...]

  9. Yes, ffmpeg on newer Ubuntu Linux distros can be set up to use libvorbis, so we are doing that now. We weren’t using “-acodec vorbis” previously just to showcase poor-quality audio, but simply because in our 18-month-old OS that alternative wasn’t there.

    we updated to newer ffmpeg and “libvorbis” just before tax day 2009.

  10. Gregory Maxwell said

    Not sure why it would be an issue of age: Libvorbis support in ffmpeg is much older than their built in encoder.

    In any case— fantastic news!
