
Recherche avancée
Médias (91)
-
Les Miserables
9 décembre 2019, par
Mis à jour : Décembre 2019
Langue : français
Type : Textuel
-
VideoHandle
8 novembre 2019, par
Mis à jour : Novembre 2019
Langue : français
Type : Video
-
Somos millones 1
21 juillet 2014, par
Mis à jour : Juin 2015
Langue : français
Type : Video
-
Un test - mauritanie
3 avril 2014, par
Mis à jour : Avril 2014
Langue : français
Type : Textuel
-
Pourquoi Obama lit il mes mails ?
4 février 2014, par
Mis à jour : Février 2014
Langue : français
-
IMG 0222
6 octobre 2013, par
Mis à jour : Octobre 2013
Langue : français
Type : Image
Autres articles (37)
-
Soumettre améliorations et plugins supplémentaires
10 avril 2011Si vous avez développé une nouvelle extension permettant d’ajouter une ou plusieurs fonctionnalités utiles à MediaSPIP, faites le nous savoir et son intégration dans la distribution officielle sera envisagée.
Vous pouvez utiliser la liste de discussion de développement afin de le faire savoir ou demander de l’aide quant à la réalisation de ce plugin. MediaSPIP étant basé sur SPIP, il est également possible d’utiliser le liste de discussion SPIP-zone de SPIP pour (...) -
Emballe Médias : Mettre en ligne simplement des documents
29 octobre 2010, parLe plugin emballe médias a été développé principalement pour la distribution mediaSPIP mais est également utilisé dans d’autres projets proches comme géodiversité par exemple. Plugins nécessaires et compatibles
Pour fonctionner ce plugin nécessite que d’autres plugins soient installés : CFG Saisies SPIP Bonux Diogène swfupload jqueryui
D’autres plugins peuvent être utilisés en complément afin d’améliorer ses capacités : Ancres douces Légendes photo_infos spipmotion (...) -
Les formats acceptés
28 janvier 2010, parLes commandes suivantes permettent d’avoir des informations sur les formats et codecs gérés par l’installation local de ffmpeg :
ffmpeg -codecs ffmpeg -formats
Les format videos acceptés en entrée
Cette liste est non exhaustive, elle met en exergue les principaux formats utilisés : h264 : H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10 m4v : raw MPEG-4 video format flv : Flash Video (FLV) / Sorenson Spark / Sorenson H.263 Theora wmv :
Les formats vidéos de sortie possibles
Dans un premier temps on (...)
Sur d’autres sites (2410)
-
Things I Have Learned About Emscripten
1er septembre 2015, par Multimedia Mike — Cirrus Retro3 years ago, I released my Game Music Appreciation project, a website with a ludicrously uninspired title which allowed users a relatively frictionless method to experience a range of specialized music files related to old video games. However, the site required use of a special Chrome plugin. Ever since that initial release, my #1 most requested feature has been for a pure JavaScript version of the music player.
“Impossible !” I exclaimed. “There’s no way JS could ever run fast enough to run these CPU emulators and audio synthesizers in real time, and allow for the visualization that I demand !” Well, I’m pleased to report that I have proved me wrong. I recently quietly launched a new site with what I hope is a catchier title, meant to evoke a cloud-based retro-music-as-a-service product : Cirrus Retro. Right now, it’s basically the same as the old site, but without the wonky Chrome-specific technology.
Along the way, I’ve learned a few things about using Emscripten that I thought might be useful to share with other people who wish to embark on a similar journey. This is geared more towards someone who has a stronger low-level background (such as C/C++) vs. high-level (like JavaScript).
General Goals
Do you want to cross-compile an entire desktop application, one that relies on an extensive GUI toolkit ? That might be difficult (though I believe there is a path for porting qt code directly with Emscripten). Your better wager might be to abstract out the core logic and processes of the program and then create a new web UI to access them.Do you want to compile a game that basically just paints stuff to a 2D canvas ? You’re in luck ! Emscripten has a porting path for SDL. Make a version of your C/C++ software that targets SDL (generally not a tall order) and then compile that with Emscripten.
Do you just want to cross-compile some functionality that lives in a library ? That’s what I’ve done with the Cirrus Retro project. For this, plan to compile the library into a JS file that exports some public functions that other, higher-level, native JS (i.e., JS written by a human and not a computer) will invoke.
Memory Levels
When porting C/C++ software to JavaScript using Emscripten, you have to think on 2 different levels. Or perhaps you need to force JavaScript into a low level C lens, especially if you want to write native JS code that will interact with Emscripten-compiled code. This often means somehow allocating chunks of memory via JS and passing them to the Emscripten-compiled functions. And you wouldn’t believe the type of gymnastics you need to execute to get native JS and Emscripten-compiled JS to cooperate.
“Emscripten : Pointers and Pointers” is the best (and, really, ONLY) explanation I could find for understanding the basic mechanics of this process, at least when I started this journey. However, there’s a mistake in the explanation that left me confused for a little while, and I’m at a loss to contact the author (doesn’t anyone post a simple email address anymore ?).
Per the best of my understanding, Emscripten allocates a large JS array and calls that the memory space that the compiled C/C++ code is allowed to operate in. A pointer in C/C++ code will just be an index into that mighty array. Really, that’s not too far off from how a low-level program process is supposed to view memory– as a flat array.
Eventually, I just learned to cargo-cult my way through the memory allocation process. Here’s the JS code for allocating an Emscripten-compatible byte buffer, taken from my test harness (more on that later) :
var musicBuffer = fs.readFileSync(testSpec[’filename’]) ; var musicBufferBytes = new Uint8Array(musicBuffer) ; var bytesMalloc = player._malloc(musicBufferBytes.length) ; var bytes = new Uint8Array(player.HEAPU8.buffer, bytesMalloc, musicBufferBytes.length) ; bytes.set(new Uint8Array(musicBufferBytes.buffer)) ;
So, read the array of bytes from some input source, create a Uint8Array from the bytes, use the Emscripten _malloc() function to allocate enough bytes from the Emscripten memory array for the input bytes, then create a new array… then copy the bytes…
You know what ? It’s late and I can’t remember how it works exactly, but it does. It has been a few months since I touched that code (been fighting with front-end website tech since then). You write that memory allocation code enough times and it begins to make sense, and then you hope you don’t have to write it too many more times.
Multithreading
You can’t port multithreaded code to JS via Emscripten. JavaScript has no notion of threads ! If you don’t understand the computer science behind this limitation, a more thorough explanation is beyond the scope of this post. But trust me, I’ve thought about it a lot. In fact, the official Emscripten literature states that you should be able to port most any C/C++ code as long as 1) none of the code is proprietary (i.e., all the raw source is available) ; and 2) there are no threads.Yes, I read about the experimental pthreads support added to Emscripten recently. Don’t get too excited ; that won’t be ready and widespread for a long time to come as it relies on a new browser API. In the meantime, figure out how to make your multithreaded C/C++ code run in a single thread if you want it to run in a browser.
Printing Facility
Eventually, getting software to work boils down to debugging, and the most primitive tool in many a programmer’s toolbox is the humble print statement. A print statement allows you to inspect a piece of a program’s state at key junctures. Eventually, when you try to cross-compile C/C++ code to JS using Emscripten, something is not going to work correctly in the generated JS “object code” and you need to understand what. You’ll be pleading for a method of just inspecting one variable deep in the original C/C++ code.I came up with this simple printf-workalike called emprintf() :
#ifndef EMPRINTF_H #define EMPRINTF_H
#include <stdio .h>
#include <stdarg .h>
#include <emscripten .h>#define MAX_MSG_LEN 1000
/* NOTE : Don’t pass format strings that contain single quote (’) or newline
* characters. */
static void emprintf(const char *format, ...)
char msg[MAX_MSG_LEN] ;
char consoleMsg[MAX_MSG_LEN + 16] ;
va_list args ;/* create the string */
va_start(args, format) ;
vsnprintf(msg, MAX_MSG_LEN, format, args) ;
va_end(args) ;/* wrap the string in a console.log(’’) statement */
snprintf(consoleMsg, MAX_MSG_LEN + 16, "console.log(’%s’)", msg) ;/* send the final string to the JavaScript console */
emscripten_run_script(consoleMsg) ;
#endif /* EMPRINTF_H */
Put it in a file called “emprint.h”. Include it into any C/C++ file where you need debugging visibility, use emprintf() as a replacement for printf() and the output will magically show up on the browser’s JavaScript debug console. Heed the comments and don’t put any single quotes or newlines in strings, and keep it under 1000 characters. I didn’t say it was perfect, but it has helped me a lot in my Emscripten adventures.
Optimization Levels
Remember to turn on optimization when compiling. I have empirically found that optimizing for size (-Os) leads to the best performance all around, in addition to having the smallest size. Just be sure to specify some optimization level. If you don’t, the default is -O0 which offers horrible performance when running in JS.Static Compression For HTTP Delivery
JavaScript code compresses pretty efficiently, even after it has been optimized for size using -Os. I routinely see compression ratios between 3.5:1 and 5:1 using gzip.Web servers in this day and age are supposed to be smart enough to detect when a requesting web browser can accept gzip-compressed data and do the compression on the fly. They’re even supposed to be smart enough to cache compressed output so the same content is not recompressed for each request. I would have to set up a series of tests to establish whether either of the foregoing assertions are correct and I can’t be bothered. Instead, I took it into my own hands. The trick is to pre-compress the JS files and then instruct the webserver to serve these files with a ‘Content-Type’ of ‘application/javascript’ and a ‘Content-Encoding’ of ‘gzip’.
- Compress your large Emscripten-build JS files with ‘gzip’ : ‘gzip compiled-code.js’
- Rename them from extension .js.gz to .jsgz
- Tell the webserver to deliver .jsgz files with the correct Content-Type and Content-Encoding headers
To do that last step with Apache, specify these lines :
AddType application/javascript jsgz AddEncoding gzip jsgz
They belong in either a directory’s .htaccess file or in the sitewide configuration (/etc/apache2/mods-available/mime.conf works on my setup).
Build System and Build Time Optimization
Oh goodie, build systems ! I had a very specific manner in which I wanted to build my JS modules using Emscripten. Can I possibly coerce any of the many popular build systems to do this ? It has been a few months since I worked on this problem specifically but I seem to recall that the build systems I tried to used would freak out at the prospect of compiling stuff to a final binary target of .js.I had high hopes for Bazel, which Google released while I was developing Cirrus Retro. Surely, this is software that has been battle-tested in the harshest conditions of one of the most prominent software-developing companies in the world, needing to take into account the most bizarre corner cases and still build efficiently and correctly every time. And I have little doubt that it fulfills the order. Similarly, I’m confident that Google also has a team of no fewer than 100 or so people dedicated to developing and supporting the project within the organization. When you only have, at best, 1-2 hours per night to work on projects like this, you prefer not to fight with such cutting edge technology and after losing 2 or 3 nights trying to make a go of Bazel, I eventually put it aside.
I also tried to use Autotools. It failed horribly for me, mostly for my own carelessness and lack of early-project source control.
After that, it was strictly vanilla makefiles with no real dependency management. But you know what helps in these cases ? ccache ! Or at least, it would if it didn’t fail with Emscripten.
Quick tip : ccache has trouble with LLVM unless you set the CCACHE_CPP2 environment variable (e.g. : “export CCACHE_CPP2=1”). I don’t remember the specifics, but it magically fixes things. Then, the lazy build process becomes “make clean && make”.
Testing
If you have never used Node.js, testing Emscripten-compiled JS code might be a good opportunity to start. I was able to use Node.js to great effect for testing the individually-compiled music player modules, wiring up a series of invocations using Python for a broader test suite (wouldn’t want to go too deep down the JS rabbit hole, after all).Be advised that Node.js doesn’t enjoy the same kind of JIT optimizations that the browser engines leverage. Thus, in the case of time critical code like, say, an audio synthesis library, the code might not run in real time. But as long as it produces the correct bitwise waveform, that’s good enough for continuous integration.
Also, if you have largely been a low-level programmer for your whole career and are generally unfamiliar with the world of single-threaded, event-driven, callback-oriented programming, you might be in for a bit of a shock. When I wanted to learn how to read the contents of a file in Node.js, this is the first tutorial I found on the matter. I thought the code presented was a parody of bad coding style :
var fs = require("fs") ; var fileName = "foo.txt" ;
fs.exists(fileName, function(exists)
if (exists)
fs.stat(fileName, function(error, stats)
fs.open(fileName, "r", function(error, fd)
var buffer = new Buffer(stats.size) ;fs.read(fd, buffer, 0, buffer.length, null, function(error, bytesRead, buffer)
var data = buffer.toString("utf8", 0, buffer.length) ;console.log(data) ;
fs.close(fd) ;
) ;
) ;
) ;
) ;Apparently, this kind of thing doesn’t raise an eyebrow in the JS world.
Now, I understand and respect the JS programming model. But this was seriously frustrating when I first encountered it because a simple script like the one I was trying to write just has an ordered list of tasks to complete. When it asks for bytes from a file, it really has nothing better to do than to wait for the answer.
Thankfully, it turns out that Node’s fs module includes synchronous versions of the various file access functions. So it’s all good.
Conclusion
I’m sure I missed or underexplained some things. But if other brave souls are interested in dipping their toes in the waters of Emscripten, I hope these tips will come in handy. -
Things I Have Learned About Emscripten
1er septembre 2015, par Multimedia Mike — Cirrus Retro3 years ago, I released my Game Music Appreciation project, a website with a ludicrously uninspired title which allowed users a relatively frictionless method to experience a range of specialized music files related to old video games. However, the site required use of a special Chrome plugin. Ever since that initial release, my #1 most requested feature has been for a pure JavaScript version of the music player.
“Impossible !” I exclaimed. “There’s no way JS could ever run fast enough to run these CPU emulators and audio synthesizers in real time, and allow for the visualization that I demand !” Well, I’m pleased to report that I have proved me wrong. I recently quietly launched a new site with what I hope is a catchier title, meant to evoke a cloud-based retro-music-as-a-service product : Cirrus Retro. Right now, it’s basically the same as the old site, but without the wonky Chrome-specific technology.
Along the way, I’ve learned a few things about using Emscripten that I thought might be useful to share with other people who wish to embark on a similar journey. This is geared more towards someone who has a stronger low-level background (such as C/C++) vs. high-level (like JavaScript).
General Goals
Do you want to cross-compile an entire desktop application, one that relies on an extensive GUI toolkit ? That might be difficult (though I believe there is a path for porting qt code directly with Emscripten). Your better wager might be to abstract out the core logic and processes of the program and then create a new web UI to access them.Do you want to compile a game that basically just paints stuff to a 2D canvas ? You’re in luck ! Emscripten has a porting path for SDL. Make a version of your C/C++ software that targets SDL (generally not a tall order) and then compile that with Emscripten.
Do you just want to cross-compile some functionality that lives in a library ? That’s what I’ve done with the Cirrus Retro project. For this, plan to compile the library into a JS file that exports some public functions that other, higher-level, native JS (i.e., JS written by a human and not a computer) will invoke.
Memory Levels
When porting C/C++ software to JavaScript using Emscripten, you have to think on 2 different levels. Or perhaps you need to force JavaScript into a low level C lens, especially if you want to write native JS code that will interact with Emscripten-compiled code. This often means somehow allocating chunks of memory via JS and passing them to the Emscripten-compiled functions. And you wouldn’t believe the type of gymnastics you need to execute to get native JS and Emscripten-compiled JS to cooperate.
“Emscripten : Pointers and Pointers” is the best (and, really, ONLY) explanation I could find for understanding the basic mechanics of this process, at least when I started this journey. However, there’s a mistake in the explanation that left me confused for a little while, and I’m at a loss to contact the author (doesn’t anyone post a simple email address anymore ?).
Per the best of my understanding, Emscripten allocates a large JS array and calls that the memory space that the compiled C/C++ code is allowed to operate in. A pointer in C/C++ code will just be an index into that mighty array. Really, that’s not too far off from how a low-level program process is supposed to view memory– as a flat array.
Eventually, I just learned to cargo-cult my way through the memory allocation process. Here’s the JS code for allocating an Emscripten-compatible byte buffer, taken from my test harness (more on that later) :
var musicBuffer = fs.readFileSync(testSpec[’filename’]) ; var musicBufferBytes = new Uint8Array(musicBuffer) ; var bytesMalloc = player._malloc(musicBufferBytes.length) ; var bytes = new Uint8Array(player.HEAPU8.buffer, bytesMalloc, musicBufferBytes.length) ; bytes.set(new Uint8Array(musicBufferBytes.buffer)) ;
So, read the array of bytes from some input source, create a Uint8Array from the bytes, use the Emscripten _malloc() function to allocate enough bytes from the Emscripten memory array for the input bytes, then create a new array… then copy the bytes…
You know what ? It’s late and I can’t remember how it works exactly, but it does. It has been a few months since I touched that code (been fighting with front-end website tech since then). You write that memory allocation code enough times and it begins to make sense, and then you hope you don’t have to write it too many more times.
Multithreading
You can’t port multithreaded code to JS via Emscripten. JavaScript has no notion of threads ! If you don’t understand the computer science behind this limitation, a more thorough explanation is beyond the scope of this post. But trust me, I’ve thought about it a lot. In fact, the official Emscripten literature states that you should be able to port most any C/C++ code as long as 1) none of the code is proprietary (i.e., all the raw source is available) ; and 2) there are no threads.Yes, I read about the experimental pthreads support added to Emscripten recently. Don’t get too excited ; that won’t be ready and widespread for a long time to come as it relies on a new browser API. In the meantime, figure out how to make your multithreaded C/C++ code run in a single thread if you want it to run in a browser.
Printing Facility
Eventually, getting software to work boils down to debugging, and the most primitive tool in many a programmer’s toolbox is the humble print statement. A print statement allows you to inspect a piece of a program’s state at key junctures. Eventually, when you try to cross-compile C/C++ code to JS using Emscripten, something is not going to work correctly in the generated JS “object code” and you need to understand what. You’ll be pleading for a method of just inspecting one variable deep in the original C/C++ code.I came up with this simple printf-workalike called emprintf() :
#ifndef EMPRINTF_H #define EMPRINTF_H
#include <stdio .h>
#include <stdarg .h>
#include <emscripten .h>#define MAX_MSG_LEN 1000
/* NOTE : Don’t pass format strings that contain single quote (’) or newline
* characters. */
static void emprintf(const char *format, ...)
char msg[MAX_MSG_LEN] ;
char consoleMsg[MAX_MSG_LEN + 16] ;
va_list args ;/* create the string */
va_start(args, format) ;
vsnprintf(msg, MAX_MSG_LEN, format, args) ;
va_end(args) ;/* wrap the string in a console.log(’’) statement */
snprintf(consoleMsg, MAX_MSG_LEN + 16, "console.log(’%s’)", msg) ;/* send the final string to the JavaScript console */
emscripten_run_script(consoleMsg) ;
#endif /* EMPRINTF_H */
Put it in a file called “emprint.h”. Include it into any C/C++ file where you need debugging visibility, use emprintf() as a replacement for printf() and the output will magically show up on the browser’s JavaScript debug console. Heed the comments and don’t put any single quotes or newlines in strings, and keep it under 1000 characters. I didn’t say it was perfect, but it has helped me a lot in my Emscripten adventures.
Optimization Levels
Remember to turn on optimization when compiling. I have empirically found that optimizing for size (-Os) leads to the best performance all around, in addition to having the smallest size. Just be sure to specify some optimization level. If you don’t, the default is -O0 which offers horrible performance when running in JS.Static Compression For HTTP Delivery
JavaScript code compresses pretty efficiently, even after it has been optimized for size using -Os. I routinely see compression ratios between 3.5:1 and 5:1 using gzip.Web servers in this day and age are supposed to be smart enough to detect when a requesting web browser can accept gzip-compressed data and do the compression on the fly. They’re even supposed to be smart enough to cache compressed output so the same content is not recompressed for each request. I would have to set up a series of tests to establish whether either of the foregoing assertions are correct and I can’t be bothered. Instead, I took it into my own hands. The trick is to pre-compress the JS files and then instruct the webserver to serve these files with a ‘Content-Type’ of ‘application/javascript’ and a ‘Content-Encoding’ of ‘gzip’.
- Compress your large Emscripten-build JS files with ‘gzip’ : ‘gzip compiled-code.js’
- Rename them from extension .js.gz to .jsgz
- Tell the webserver to deliver .jsgz files with the correct Content-Type and Content-Encoding headers
To do that last step with Apache, specify these lines :
AddType application/javascript jsgz AddEncoding gzip jsgz
They belong in either a directory’s .htaccess file or in the sitewide configuration (/etc/apache2/mods-available/mime.conf works on my setup).
Build System and Build Time Optimization
Oh goodie, build systems ! I had a very specific manner in which I wanted to build my JS modules using Emscripten. Can I possibly coerce any of the many popular build systems to do this ? It has been a few months since I worked on this problem specifically but I seem to recall that the build systems I tried to used would freak out at the prospect of compiling stuff to a final binary target of .js.I had high hopes for Bazel, which Google released while I was developing Cirrus Retro. Surely, this is software that has been battle-tested in the harshest conditions of one of the most prominent software-developing companies in the world, needing to take into account the most bizarre corner cases and still build efficiently and correctly every time. And I have little doubt that it fulfills the order. Similarly, I’m confident that Google also has a team of no fewer than 100 or so people dedicated to developing and supporting the project within the organization. When you only have, at best, 1-2 hours per night to work on projects like this, you prefer not to fight with such cutting edge technology and after losing 2 or 3 nights trying to make a go of Bazel, I eventually put it aside.
I also tried to use Autotools. It failed horribly for me, mostly for my own carelessness and lack of early-project source control.
After that, it was strictly vanilla makefiles with no real dependency management. But you know what helps in these cases ? ccache ! Or at least, it would if it didn’t fail with Emscripten.
Quick tip : ccache has trouble with LLVM unless you set the CCACHE_CPP2 environment variable (e.g. : “export CCACHE_CPP2=1”). I don’t remember the specifics, but it magically fixes things. Then, the lazy build process becomes “make clean && make”.
Testing
If you have never used Node.js, testing Emscripten-compiled JS code might be a good opportunity to start. I was able to use Node.js to great effect for testing the individually-compiled music player modules, wiring up a series of invocations using Python for a broader test suite (wouldn’t want to go too deep down the JS rabbit hole, after all).Be advised that Node.js doesn’t enjoy the same kind of JIT optimizations that the browser engines leverage. Thus, in the case of time critical code like, say, an audio synthesis library, the code might not run in real time. But as long as it produces the correct bitwise waveform, that’s good enough for continuous integration.
Also, if you have largely been a low-level programmer for your whole career and are generally unfamiliar with the world of single-threaded, event-driven, callback-oriented programming, you might be in for a bit of a shock. When I wanted to learn how to read the contents of a file in Node.js, this is the first tutorial I found on the matter. I thought the code presented was a parody of bad coding style :
var fs = require("fs") ; var fileName = "foo.txt" ;
fs.exists(fileName, function(exists)
if (exists)
fs.stat(fileName, function(error, stats)
fs.open(fileName, "r", function(error, fd)
var buffer = new Buffer(stats.size) ;fs.read(fd, buffer, 0, buffer.length, null, function(error, bytesRead, buffer)
var data = buffer.toString("utf8", 0, buffer.length) ;console.log(data) ;
fs.close(fd) ;
) ;
) ;
) ;
) ;Apparently, this kind of thing doesn’t raise an eyebrow in the JS world.
Now, I understand and respect the JS programming model. But this was seriously frustrating when I first encountered it because a simple script like the one I was trying to write just has an ordered list of tasks to complete. When it asks for bytes from a file, it really has nothing better to do than to wait for the answer.
Thankfully, it turns out that Node’s fs module includes synchronous versions of the various file access functions. So it’s all good.
Conclusion
I’m sure I missed or underexplained some things. But if other brave souls are interested in dipping their toes in the waters of Emscripten, I hope these tips will come in handy.The post Things I Have Learned About Emscripten first appeared on Breaking Eggs And Making Omelettes.
-
Ogg objections
3 mars 2010, par Mans — MultimediaThe Ogg container format is being promoted by the Xiph Foundation for use with its Vorbis and Theora codecs. Unfortunately, a number of technical shortcomings in the format render it ill-suited to most, if not all, use cases. This article examines the most severe of these flaws.
Overview of Ogg
The basic unit in an Ogg stream is the page consisting of a header followed by one or more packets from a single elementary stream. A page can contain up to 255 packets, and a packet can span any number of pages. The following table describes the page header.
Field Size (bits) Description capture_pattern 32 magic number “OggS” version 8 always zero flags 8 granule_position 64 abstract timestamp bitstream_serial_number 32 elementary stream number page_sequence_number 32 incremented by 1 each page checksum 32 CRC of entire page page_segments 8 length of segment_table segment_table variable list of packet sizes Elementary stream types are identified by looking at the payload of the first few pages, which contain any setup data required by the decoders. For full details, see the official format specification.
Generality
Ogg, legend tells, was designed to be a general-purpose container format. To most multimedia developers, a general-purpose format is one in which encoded data of any type can be encapsulated with a minimum of effort.
The Ogg format defined by the specification does not fit this description. For every format one wishes to use with Ogg, a complex mapping must first be defined. This mapping defines how to identify a codec, how to extract setup data, and even how timestamps are to be interpreted. All this is done differently for every codec. To correctly parse an Ogg stream, every such mapping ever defined must be known.
Under this premise, a centralised repository of codec mappings would seem like a sensible idea, but alas, no such thing exists. It is simply impossible to obtain a exhaustive list of defined mappings, which makes the task of creating a complete implementation somewhat daunting.
One brave soul, Tobias Waldvogel, created a mapping, OGM, capable of storing any Microsoft AVI compatible codec data in Ogg files. This format saw some use in the wild, but was frowned upon by Xiph, and it was eventually displaced by other formats.
True generality is evidently not to be found with the Ogg format.
A good example of a general-purpose format is Matroska. This container can trivially accommodate any codec, all it requires is a unique string to identify the codec. For codecs requiring setup data, a standard location for this is provided in the container. Furthermore, an official list of codec identifiers is maintained, meaning all information required to fully support Matroska files is available from one place.
Matroska also has probably the greatest advantage of all : it is in active, wide-spread use. Historically, standards derived from existing practice have proven more successful than those created by a design committee.
Overhead
When designing a container format, one important consideration is that of overhead, i.e. the extra space required in addition to the elementary stream data being combined. For any given container, the overhead can be divided into a fixed part, independent of the total file size, and a variable part growing with increasing file size. The fixed overhead is not of much concern, its relative contribution being negligible for typical file sizes.
The variable overhead in the Ogg format comes from the page headers, mostly from the segment_table field. This field uses a most peculiar encoding, somewhat reminiscent of Roman numerals. In Roman times, numbers were written as a sequence of symbols, each representing a value, the combined value being the sum of the constituent values.
The segment_table field lists the sizes of all packets in the page. Each value in the list is coded as a number of bytes equal to 255 followed by a final byte with a smaller value. The packet size is simply the sum of all these bytes. Any strictly additive encoding, such as this, has the distinct drawback of coded length being linearly proportional to the encoded value. A value of 5000, a reasonable packet size for video of moderate bitrate, requires no less than 20 bytes to encode.
On top of this we have the 27-byte page header which, although paling in comparison to the packet size encoding, is still much larger than necessary. Starting at the top of the list :
- The version field could be disposed of, a single-bit marker being adequate to separate this first version from hypothetical future versions. One of the unused positions in the flags field could be used for this purpose
- A 64-bit granule_position is completely overkill. 32 bits would be more than enough for the vast majority of use cases. In extreme cases, a one-bit flag could be used to signal an extended timestamp field.
- 32-bit elementary stream number ? Are they anticipating files with four billion elementary streams ? An eight-bit field, if not smaller, would seem more appropriate here.
- The 32-bit page_sequence_number is inexplicable. The intent is to allow detection of page loss due to transmission errors. ISO MPEG-TS uses a 4-bit counter per 188-byte packet for this purpose, and that format is used where packet loss actually happens, unlike any use of Ogg to date.
- A mandatory 32-bit checksum is nothing but a waste of space when using a reliable storage/transmission medium. Again, a flag could be used to signal the presence of an optional checksum field.
With the changes suggested above, the page header would shrink from 27 bytes to 12 bytes in size.
We thus see that in an Ogg file, the packet size fields alone contribute an overhead of 1/255 or approximately 0.4%. This is a hard lower bound on the overhead, not attainable even in theory. In reality the overhead tends to be closer to 1%.
Contrast this with the ISO MP4 file format, which can easily achieve an overhead of less than 0.05% with a 1 Mbps elementary stream.
Latency
In many applications end-to-end latency is an important factor. Examples include video conferencing, telephony, live sports events, interactive gaming, etc. With the codec layer contributing as little as 10 milliseconds of latency, the amount imposed by the container becomes an important factor.
Latency in an Ogg-based system is introduced at both the sender and the receiver. Since the page header depends on the entire contents of the page (packet sizes and checksum), a full page of packets must be buffered by the sender before a single bit can be transmitted. This sets a lower bound for the sending latency at the duration of a page.
On the receiving side, playback cannot commence until packets from all elementary streams are available. Hence, with two streams (audio and video) interleaved at the page level, playback is delayed by at least one page duration (two if checksums are verified).
Taking both send and receive latencies into account, the minimum end-to-end latency for Ogg is thus twice the duration of a page, triple if strict checksum verification is required. If page durations are variable, the maximum value must be used in order to avoid buffer underflows.
Minimum latency is clearly achieved by minimising the page duration, which in turn implies sending only one packet per page. This is where the size of the page header becomes important. The header for a single-packet page is 27 + packet_size/255 bytes in size. For a 1 Mbps video stream at 25 fps this gives an overhead of approximately 1%. With a typical audio packet size of 400 bytes, the overhead becomes a staggering 7%. The average overhead for a multiplex of these two streams is 1.4%.
As it stands, the Ogg format is clearly not a good choice for a low-latency application. The key to low latency is small packets and fine-grained interleaving of streams, and although Ogg can provide both of these, by sending a single packet per page, the price in overhead is simply too high.
ISO MPEG-PS has an overhead of 9 bytes on most packets (a 5-byte timestamp is added a few times per second), and Microsoft’s ASF has a 12-byte packet header. My suggestions for compacting the Ogg page header would bring it in line with these formats.
Random access
Any general-purpose container format needs to allow random access for direct seeking to any given position in the file. Despite this goal being explicitly mentioned in the Ogg specification, the format only allows the most crude of random access methods.
While many container formats include an index allowing a time to be directly translated into an offset into the file, Ogg has nothing of this kind, the stated rationale for the omission being that this would require a two-pass multiplexing, the second pass creating the index. This is obviously not true ; the index could simply be written at the end of the file. Those objecting that this index would be unavailable in a streaming scenario are forgetting that seeking is impossible there regardless.
The method for seeking suggested by the Ogg documentation is to perform a binary search on the file, after each file-level seek operation scanning for a page header, extracting the timestamp, and comparing it to the desired position. When the elementary stream encoding allows only certain packets as random access points (video key frames), a second search will have to be performed to locate the entry point closest to the desired time. In a large file (sizes upwards of 10 GB are common), 50 seeks might be required to find the correct position.
A typical hard drive has an average seek time of roughly 10 ms, giving a total time for the seek operation of around 500 ms, an annoyingly long time. On a slow medium, such as an optical disc or files served over a network, the times are orders of magnitude longer.
A factor further complicating the seeking process is the possibility of header emulation within the elementary stream data. To safeguard against this, one has to read the entire page and verify the checksum. If the storage medium cannot provide data much faster than during normal playback, this provides yet another substantial delay towards finishing the seeking operation. This too applies to both network delivery and optical discs.
Although optical disc usage is perhaps in decline today, one should bear in mind that the Ogg format was designed at a time when CDs and DVDs were rapidly gaining ground, and network-based storage is most certainly on the rise.
The final nail in the coffin of seeking is the codec-dependent timestamp format. At each step in the seeking process, the timestamp parsing specified by the codec mapping corresponding the current page must be invoked. If the mapping is not known, the best one can do is skip pages until one with a known mapping is found. This delays the seeking and complicates the implementation, both bad things.
Timestamps
A problem old as multimedia itself is that of synchronising multiple elementary streams (e.g. audio and video) during playback ; badly synchronised A/V is highly unpleasant to view. By the time Ogg was invented, solutions to this problem were long since explored and well-understood. The key to proper synchronisation lies in tagging elementary stream packets with timestamps, packets carrying the same timestamp intended for simultaneous presentation. The concept is as simple as it seems, so it is astonishing to see the amount of complexity with which the Ogg designers managed to imbue it. So bizarre is it, that I have devoted an entire article to the topic, and will not cover it further here.
Complexity
Video and audio decoding are time-consuming tasks, so containers should be designed to minimise extra processing required. With the data volumes involved, even an act as simple as copying a packet of compressed data can have a significant impact. Once again, however, Ogg lets us down. Despite the brevity of the specification, the format is remarkably complicated to parse properly.
The unusual and inefficient encoding of the packet sizes limits the page size to somewhat less than 64 kB. To still allow individual packets larger than this limit, it was decided to allow packets spanning multiple pages, a decision with unfortunate implications. A page-spanning packet as it arrives in the Ogg stream will be discontiguous in memory, a situation most decoders are unable to handle, and reassembly, i.e. copying, is required.
The knowledgeable reader may at this point remark that the MPEG-TS format also splits packets into pieces requiring reassembly before decoding. There is, however, a significant difference there. MPEG-TS was designed for hardware demultiplexing feeding directly into hardware decoders. In such an implementation the fragmentation is not a problem. Rather, the fine-grained interleaving is a feature allowing smaller on-chip buffers.
Buffering is also an area in which Ogg suffers. To keep the overhead down, pages must be made as large as practically possible, and page size translates directly into demultiplexer buffer size. Playback of a file with two elementary streams thus requires 128 kB of buffer space. On a modern PC this is perhaps nothing to be concerned about, but in a small embedded system, e.g. a portable media player, it can be relevant.
In addition to the above, a number of other issues, some of them minor, others more severe, make Ogg processing a painful experience. A selection follows :
- 32-bit random elementary stream identifiers mean a simple table-lookup cannot be used. Instead the list of streams must be searched for a match. While trivial to do in software, it is still annoying, and a hardware demultiplexer would be significantly more complicated than with a smaller identifier.
- Semantically ambiguous streams are possible. For example, the continuation flag (bit 1) may conflict with continuation (or lack thereof) implied by the segment table on the preceding page. Such invalid files have been spotted in the wild.
- Concatenating independent Ogg streams forms a valid stream. While finding a use case for this strange feature is difficult, an implementation must of course be prepared to encounter such streams. Detecting and dealing with these adds pointless complexity.
- Unusual terminology : inventing new terms for well-known concepts is confusing for the developer trying to understand the format in relation to others. A few examples :
Ogg name Usual name logical bitstream elementary stream grouping multiplexing lacing value packet size (approximately) segment imaginary element serving no real purpose granule position timestamp
Final words
We have found the Ogg format to be a dubious choice in just about every situation. Why then do certain organisations and individuals persist in promoting it with such ferocity ?
When challenged, three types of reaction are characteristic of the Ogg campaigners.
On occasion, these people will assume an apologetic tone, explaining how Ogg was only ever designed for simple audio-only streams (ignoring it is as bad for these as for anything), and this is no doubt true. Why then, I ask again, do they continue to tout Ogg as the one-size-fits-all solution they already admitted it is not ?
More commonly, the Ogg proponents will respond with hand-waving arguments best summarised as Ogg isn’t bad, it’s just different. My reply to this assertion is twofold :
- Being too different is bad. We live in a world where multimedia files come in many varieties, and a decent media player will need to handle the majority of them. Fortunately, most multimedia file formats share some basic traits, and they can easily be processed in the same general framework, the specifics being taken care of at the input stage. A format deviating too far from the standard model becomes problematic.
- Ogg is bad. When every angle of examination reveals serious flaws, bad is the only fitting description.
The third reaction bypasses all technical analysis : Ogg is patent-free, a claim I am not qualified to directly discuss. Assuming it is true, it still does not alter the fact that Ogg is a bad format. Being free from patents does not magically make Ogg a good choice as file format. If all the standard formats are indeed covered by patents, the only proper solution is to design a new, good format which is not, this time hopefully avoiding the old mistakes.