
Recherche avancée
Médias (1)
-
La conservation du net art au musée. Les stratégies à l’œuvre
26 mai 2011
Mis à jour : Juillet 2013
Langue : français
Type : Texte
Autres articles (18)
-
Supporting all media types
13 avril 2011, parUnlike most software and media-sharing platforms, MediaSPIP aims to manage as many different media types as possible. The following are just a few examples from an ever-expanding list of supported formats : images : png, gif, jpg, bmp and more audio : MP3, Ogg, Wav and more video : AVI, MP4, OGV, mpg, mov, wmv and more text, code and other data : OpenOffice, Microsoft Office (Word, PowerPoint, Excel), web (html, CSS), LaTeX, Google Earth and (...)
-
Contribute to translation
13 avril 2011You can help us to improve the language used in the software interface to make MediaSPIP more accessible and user-friendly. You can also translate the interface into any language that allows it to spread to new linguistic communities.
To do this, we use the translation interface of SPIP where the all the language modules of MediaSPIP are available. Just subscribe to the mailing list and request further informantion on translation.
MediaSPIP is currently available in French and English (...) -
Qu’est ce qu’un masque de formulaire
13 juin 2013, parUn masque de formulaire consiste en la personnalisation du formulaire de mise en ligne des médias, rubriques, actualités, éditoriaux et liens vers des sites.
Chaque formulaire de publication d’objet peut donc être personnalisé.
Pour accéder à la personnalisation des champs de formulaires, il est nécessaire d’aller dans l’administration de votre MediaSPIP puis de sélectionner "Configuration des masques de formulaires".
Sélectionnez ensuite le formulaire à modifier en cliquant sur sont type d’objet. (...)
Sur d’autres sites (2920)
-
Convert raw PCM data to video ?
2 mai 2013, par user1801247I have file consisting of raw PCM data (no headers). I have been able to successfully play this file by opening it in Audacity (by importing the raw data).
This file (as part of a game), apparently contains a video at the end of the file.
I was hoping to be able to play this PCM data as a video without regard for headers as Audacity was able to do for audio.
I took a look at
ffmpeg
to convert the .wav/.pcm to .mov, but there are no examples of going from audio -> video (an understandably rare scenario).Are there any tools out there to play my data as a video, without regard for validity ?
-
H.264 and VP8 for still image coding : WebP ?
Update : post now contains a Theora comparison as well ; see below.
JPEG is a very old lossy image format. By today’s standards, it’s awful compression-wise : practically every video format since the days of MPEG-2 has been able to tie or beat JPEG at its own game. The reasons people haven’t switched to something more modern practically always boil down to a simple one — it’s just not worth the hassle. Even if JPEG can be beaten by a factor of 2, convincing the entire world to change image formats after 20 years is nigh impossible. Furthermore, JPEG is fast, simple, and practically guaranteed to be free of any intellectual property worries. It’s been tried before : JPEG-2000 first, then Microsoft’s JPEG XR, both tried to unseat JPEG. Neither got much of anywhere.
Now Google is trying to dump yet another image format on us, “WebP”. But really, it’s just a VP8 intra frame. There are some obvious practical problems with this new image format in comparison to JPEG ; it doesn’t even support all of JPEG’s features, let alone many of the much-wanted features JPEG was missing (alpha channel support, lossless support). It only supports 4:2:0 chroma subsampling, while JPEG can handle 4:2:2 and 4:4:4. Google doesn’t seem interested in adding any of these features either.
But let’s get to the meat and see how these encoders stack up on compressing still images. As I explained in my original analysis, VP8 has the advantage of H.264′s intra prediction, which is one of the primary reasons why H.264 has such an advantage in intra compression. It only has i4x4 and i16x16 modes, not i8x8, so it’s not quite as fancy as H.264′s, but it comes close.
The test files are all around 155KB ; download them for the exact filesizes. For all three, I did a binary search of quality levels to get the file sizes close. For x264, I encoded with
--tune stillimage --preset placebo
. For libvpx, I encoded with--best
. For JPEG, I encoded with ffmpeg, then applied jpgcrush, a lossless jpeg compressor. I suspect there are better JPEG encoders out there than ffmpeg ; if you have one, feel free to test it and post the results. The source image is the 200th frame of Parkjoy, from derf’s page (fun fact : this video was shot here ! More info on the video here.).Files : (x264 [154KB], vp8 [155KB], jpg [156KB])
Results (decoded to PNG) : (x264, vp8, jpg)
This seems rather embarrassing for libvpx. Personally I think VP8 looks by far the worst of the bunch, despite JPEG’s blocking. What’s going on here ? VP8 certainly has better entropy coding than JPEG does (by far !). It has better intra prediction (JPEG has just DC prediction). How could VP8 look worse ? Let’s investigate.
VP8 uses a 4×4 transform, which tends to blur and lose more detail than JPEG’s 8×8 transform. But that alone certainly isn’t enough to create such a dramatic difference. Let’s investigate a hypothesis — that the problem is that libvpx is optimizing for PSNR and ignoring psychovisual considerations when encoding the image… I’ll encode with
--tune psnr --preset placebo
in x264, turning off all psy optimizations.Files : (x264, optimized for PSNR [154KB]) [Note for the technical people : because adaptive quantization is off, to get the filesize on target I had to use a CQM here.]
Results (decoded to PNG) : (x264, optimized for PSNR)
What a blur ! Only somewhat better than VP8, and still worse than JPEG. And that’s using the same encoder and the same level of analysis — the only thing done differently is dropping the psy optimizations. Thus we come back to the conclusion I’ve made over and over on this blog — the encoder matters more than the video format, and good psy optimizations are more important than anything else for compression. libvpx, a much more powerful encoder than ffmpeg’s jpeg encoder, loses because it tries too hard to optimize for PSNR.
These results raise an obvious question — is Google nuts ? I could understand the push for “WebP” if it was better than JPEG. And sure, technically as a file format it is, and an encoder could be made for it that’s better than JPEG. But note the word “could”. Why announce it now when libvpx is still such an awful encoder ? You’d have to be nuts to try to replace JPEG with this blurry mess as-is. Now, I don’t expect libvpx to be able to compete with x264, the best encoder in the world — but surely it should be able to beat an image format released in 1992 ?
Earth to Google : make the encoder good first, then promote it as better than the alternatives. The reverse doesn’t work quite as well.
Addendum (added Oct. 2, 03:51) :
maikmerten gave me a Theora-encoded image to compare as well. Here’s the PNG and the source (155KB). And yes, that’s Theora 1.2 (Ptalarbvorm) beating VP8 handily. Now that is embarassing. Guess what the main new feature of Ptalarbvorm is ? Psy optimizations…
Addendum (added Apr. 20, 23:33) :
There’s a new webp encoder out, written from scratch by skal (available in libwebp). It’s significantly better than libvpx — not like that says much — but it should probably beat JPEG much more readily now. The encoder design is rather unique — it basically uses K-means for a large part of the encoding process. It still loses to x264, but that was expected.
[155KB] -
Simply beyond ridiculous
For the past few years, various improvements on H.264 have been periodically proposed, ranging from larger transforms to better intra prediction. These finally came together in the JCT-VC meeting this past April, where over two dozen proposals were made for a next-generation video coding standard. Of course, all of these were in very rough-draft form ; it will likely take years to filter it down into a usable standard. In the process, they’ll pick the most useful features (hopefully) from each proposal and combine them into something a bit more sane. But, of course, it all has to start somewhere.
A number of features were common : larger block sizes, larger transform sizes, fancier interpolation filters, improved intra prediction schemes, improved motion vector prediction, increased internal bit depth, new entropy coding schemes, and so forth. A lot of these are potentially quite promising and resolve a lot of complaints I’ve had about H.264, so I decided to try out the proposal that appeared the most interesting : the Samsung+BBC proposal (A124), which claims compression improvements of around 40%.
The proposal combines a bouillabaisse of new features, ranging from a 12-tap interpolation filter to 12thpel motion compensation and transforms as large as 64×64. Overall, I would say it’s a good proposal and I don’t doubt their results given the sheer volume of useful features they’ve dumped into it. I was a bit worried about complexity, however, as 12-tap interpolation filters don’t exactly scream “fast”.
I prepared myself for the slowness of an unoptimized encoder implementation, compiled their tool, and started a test encode with their recommended settings.
I waited. The first frame, an I-frame, completed.
I took a nap.
I waited. The second frame, a P-frame, was done.
I played a game of Settlers.
I waited. The third frame, a B-frame, was done.
I worked on a term paper.
I waited. The fourth frame, a B-frame, was done.
After a full 6 hours, 8 frames had encoded. Yes, at this rate, it would take a full two weeks to encode 10 seconds of HD video. On a Core i7. This is not merely slow ; this is over 1000 times slower than x264 on “placebo” mode. This is so slow that it is not merely impractical ; it is impossible to even test. This encoder is apparently designed for some sort of hypothetical future computer from space. And word from other developers is that the Intel proposal is even slower.
This has led me to suspect that there is a great deal of cheating going on in the H.265 proposals. The goal of the proposals, of course, is to pick the best feature set for the next generation video compression standard. But there is an extra motivation : organizations whose features get accepted get patents on the resulting standard, and thus income. With such large sums of money in the picture, dishonesty becomes all the more profitable.
There is a set of rules, of course, to limit how the proposals can optimize their encoders. If different encoders use different optimization techniques, the results will no longer be comparable — remember, they are trying to compare compression features, not methods of optimizing encoder-side decisions. Thus all encoders are required to use a constant quantizer, specified frame types, and so forth. But there are no limits on how slow an encoder can be or what algorithms it can use.
It would be one thing if the proposed encoder was a mere 10 times slower than the current reference ; that would be reasonable, given the low level of optimization and higher complexity of the new standard. But this is beyond ridiculous. With the prize given to whoever can eke out the most PSNR at a given quantizer at the lowest bitrate (with no limits on speed), we’re just going to get an arms race of slow encoders, with every company trying to use the most ridiculous optimizations possible, even if they involve encoding the frame 100,000 times over to choose the optimal parameters. And the end result will be as I encountered here : encoders so slow that they are simply impossible to even test.
Such an arms race certainly does little good in optimizing for reality where we don’t have 30 years to encode an HD movie : a feature that gives great compression improvements is useless if it’s impossible to optimize for in a reasonable amount of time. Certainly once the standard is finalized practical encoders will be written — but it makes no sense to optimize the standard for a use-case that doesn’t exist. And even attempting to “optimize” anything is difficult when encoding a few seconds of video takes weeks.
Update : The people involved have contacted me and insist that there was in fact no cheating going on. This is probably correct ; the problem appears to be that the rules that were set out were simply not strict enough, making many changes that I would intuitively consider “cheating” to be perfectly allowed, and thus everyone can do it.
I would like to apologize if I implied that the results weren’t valid ; they are — the Samsung-BBC proposal is definitely one of the best, which is why I picked it to test with. It’s just that I think any situation in which it’s impossible to test your own software is unreasonable, and thus the entire situation is an inherently broken one, given the lax rules, slow baseline encoder, and no restrictions on compute time.