Archive for November, 2008

New strategy for Internet Archive movies!

We have rebuilt all of our nearly 200,000 videos at the archive!

[We finished this Dec 1, 2008]

Related cross-blog with OLPC.

Here is a table-based chart of which video formats will be “derived” into which formats (we are creating 4 formats per video now):

http://www.archive.org/help/derivatives.php

 

Improvements and Changes from our prior movies techniques:

  • We will make a new Ogg Theora (with Vorbis audio) derivative, a free and open-source video format.  This derivative will play natively in the Firefox 3.1 release (v3.1 is due around the end of 2008).
  • We are remaking our h.264 MPEG-4 derivatives.  We have updated the format to work with lighttpd “mod_h264_streaming” (which allows jumping into a movie at a specified time), but in the process we will lose the ability to serve/stream this file with RTSP.  This derivative also plays in the Adobe Flash plugin and on iPods/iPhones.
  • We are removing older 64kb and 256kb MPEG-4 derivatives.  With “progressive download” support becoming ubiquitous, even modems and phones are doing much better with downloading larger files.
  • We are removing older .flv “Flash Video” derivatives.  Since the much better quality h.264 derivative plays in recent flash plugins (as well as many other devices and browsers), the flash video alternative is seen as less ideal.
  • We are removing older .mpg MPEG-1 derivatives.  Their usefulness has declined in recent years, especially compared to h.264 alternatives.
  • We are remaking our animated GIFs.  They attempt to make 30 thumbnails from each uploaded video.  We now evenly space them across the entire video.
  • We are remaking our thumbnails.  As with the GIFs, we are spreading them across the videos better, and making fewer thumbnails for items with *many* videos.  Additionally, we are renaming the thumbnails to indicate the position (in seconds) in the video at which they were created.  This will allow for the next bullet item…
  • We have developed the ability to jump into videos by clicking on the thumbnail image (to go to that scene!)  We are finalizing the URL / permalinks for these “jump into video” URLs and will release this live to the public as soon as we can.
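
As a rough illustration of the even spacing described above, here is a minimal shell sketch. It is not the Archive's actual code: the function names, the choice of slice midpoints, and the zero-padded file naming are all assumptions made for the example.

```shell
#!/bin/sh
# Print COUNT evenly spaced offsets (whole seconds) across a video that is
# DURATION seconds long, one per line. Each offset is the midpoint of its
# slice (an assumption), so the very first and last frames are never used.
thumb_offsets() {
  duration=$1
  count=$2
  i=0
  while [ "$i" -lt "$count" ]; do
    # Integer arithmetic: midpoint of slice i = duration * (2i + 1) / (2 * count)
    echo $(( duration * (2 * i + 1) / (2 * count) ))
    i=$(( i + 1 ))
  done
}

# Hypothetical naming scheme: put the zero-padded second offset in the name.
thumb_name() {
  printf '%s_%06d.jpg\n' "$1" "$2"
}
```

For a 300-second video and the 30 thumbnails mentioned above, `thumb_offsets 300 30` prints 5, 15, 25 and so on up to 295, and `thumb_name CapeCodMarsh 45` prints `CapeCodMarsh_000045.jpg`.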

-tracey jaquith


Rederiving our movies to Ogg Theora and more!

[reposted and edited with generous consent from John Gilmore]

The Internet Archive has a collection of about 185,000 moving images,
including many cartoons and full-length movies that have fallen into
the public domain.  They offer full downloads in the best format they
have, as well as “re-derived” versions in other (typically smaller)
formats.  They also added a Flash-based video player in the last year
or two.   The “One Laptop Per Child”, or OLPC, software supports the Ogg Theora video
codec, but few movies had been uploaded in Ogg Theora, and none had
previously been re-derived into it.

The Archive actively supports the free software ecosystem, and is now
busy re-deriving copies of all their videos into both Ogg Theora and
H.264 (mp4) codecs.  So far they have more than 40% of the videos
converted, and hope to have the rest done by December 2008.
This makes each of these videos easily accessible on the OLPC XO, by
looking in the left margin for the download/stream link for the Ogg
Video version.  As each is converted, it immediately becomes
accessible at www.archive.org/details/movies.

The Archive also detects when the “OLPC” browser is connecting,
and replaces the Flash player with a direct link to the
.ogv Ogg Theora file.  This allows stock XOs to play videos by
clicking on the big Click To Play image.  For example, try:

 http://www.archive.org/details/merry_melodies_falling_hare

For the kids, they’ve already converted all 84 cartoons in this collection:

 http://www.archive.org/details/classic_cartoons

You can also search their moving images collection for 
 format:"Ogg Video"
to restrict your search to movies that have a copy available in Ogg (Theora).

-tracey jaquith


Fast and reliable way to encode Theora Ogg videos using ffmpeg, libtheora, and liboggz


archive.org has started to make theora derivatives for movie files, where we create an Ogg Theora video format output for each movie file. after trying a bunch of tools over a good corpus of wide-ranging videos, i found a neat way to make the Archive derivatives.

High Level:

  • use ffmpeg to turn any video to “rawvideo”.
  • pipe its output to *another* ffmpeg to turn the video to “yuv4mpegpipe”.
  • pipe its output to the libtheora tool.
  • for videos with audio, use ffmpeg to create a vorbis audio .ogg file.
  • add tasty metadata (with liboggz utils).
  • combine the video and audio ogg files to an .ogv output!

Detailed example:

  • ffmpeg -an -deinterlace -s 400x300 -r 20.00 -i CapeCodMarsh.avi -vcodec rawvideo -pix_fmt yuv420p -f rawvideo - |  ffmpeg -an -f rawvideo -s 400x300 -r 20.00 -i - -f yuv4mpegpipe - |  libtheora-1.0/lt-encoder_example --video-rate-target 512k - -o tmp.ogv
  • ffmpeg -y -i CapeCodMarsh.avi -vn -acodec libvorbis -ac 2 -ab 128k -ar 44100 audio.ogg
  • oggz-comment audio.ogg -o audio2.ogg TITLE="Cape Cod Marsh" ARTIST="Tracey Jaquith" LICENSE="http://creativecommons.org/licenses/publicdomain/" DATE="2004" ORGANIZATION="Dumb Bunny Productions" LOCATION=http://www.archive.org/details/CapeCodMarsh
  • oggzmerge tmp.ogv audio2.ogg -o CapeCodMarsh.ogv
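
To reuse the steps above on other files, one option is a small helper that prints the commands for a given source, so they can be reviewed before running. This is only a sketch under some assumptions: the function name `theora_cmds` and the `ENCODER` variable are made up for the example, file names are assumed to contain no spaces or shell metacharacters, and the oggz-comment metadata step is left out, so the audio file is merged directly.

```shell
#!/bin/sh
# Path to the libtheora example encoder; defaults to the path used in the post.
ENCODER=${ENCODER:-libtheora-1.0/lt-encoder_example}

# Print the derive commands for one source file, one command per line.
# Arguments: source file, WIDTHxHEIGHT, frame rate, video bitrate.
# Assumes file names without spaces. The metadata step is omitted here.
theora_cmds() {
  src=$1; size=$2; fps=$3; rate=$4
  base=${src%.*}   # strip the extension: CapeCodMarsh.avi -> CapeCodMarsh
  cat <<EOF
ffmpeg -an -deinterlace -s $size -r $fps -i $src -vcodec rawvideo -pix_fmt yuv420p -f rawvideo - | ffmpeg -an -f rawvideo -s $size -r $fps -i - -f yuv4mpegpipe - | $ENCODER --video-rate-target $rate - -o tmp.ogv
ffmpeg -y -i $src -vn -acodec libvorbis -ac 2 -ab 128k -ar 44100 audio.ogg
oggzmerge tmp.ogv audio.ogg -o $base.ogv
EOF
}
```

Calling `theora_cmds CapeCodMarsh.avi 400x300 20.00 512k` prints the three commands; piping that output to `sh` would run them, provided ffmpeg, the encoder, and the liboggz tools are installed.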

WTFs:

  • Why the double pipe above? Some videos could not go directly to yuv4mpegpipe format such that libtheora (or ffmpeg2theora) would work all the time.
  • We do the vorbis audio outside of libtheora (or ffmpeg2theora) to avoid any issues with Audio/Video sync.
  • We convert to yuv420p in the rawvideo step because ffmpeg2theora has (i think) some known issues of not handling all yuv422 video inputs (i found at least a few videos that did this).
  • We add the metadata to the audio vorbis ogg because adding it to the video ogv file wound up making the first video frame not a keyframe (!)

So this will end up working in Firefox 3.1 and greater, using the new HTML “video” tag:

<video controls="true" autoplay="true" src="http://www.archive.org/download/commute/commute.ogv"> for firefox betans </video>

The technique above worked nicely across the 46 wide-ranging and “trashy” source videos that I use for QA before making a new way to derive our videos live at archive.org.

-tracey jaquith
