
Simply beyond ridiculous
For the past few years, various improvements on H.264 have been periodically proposed, ranging from larger transforms to better intra prediction. These finally came together in the JCT-VC meeting this past April, where over two dozen proposals were made for a next-generation video coding standard. Of course, all of these were in very rough-draft form; it will likely take years to filter it down into a usable standard. In the process, they’ll pick the most useful features (hopefully) from each proposal and combine them into something a bit more sane. But, of course, it all has to start somewhere.
A number of features were common: larger block sizes, larger transform sizes, fancier interpolation filters, improved intra prediction schemes, improved motion vector prediction, increased internal bit depth, new entropy coding schemes, and so forth. A lot of these are potentially quite promising and resolve a lot of complaints I’ve had about H.264, so I decided to try out the proposal that appeared the most interesting: the Samsung+BBC proposal (A124), which claims compression improvements of around 40%.
The proposal combines a bouillabaisse of new features, ranging from a 12-tap interpolation filter to 1/12th-pel motion compensation and transforms as large as 64×64. Overall, I would say it’s a good proposal and I don’t doubt their results given the sheer volume of useful features they’ve dumped into it. I was a bit worried about complexity, however, as 12-tap interpolation filters don’t exactly scream “fast”.
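To make the complexity worry concrete: a separable 12-tap filter does twice the multiply-accumulates per interpolated pixel of H.264’s 6-tap luma filter, and interpolation sits in the innermost loops of motion estimation. Here is a minimal sketch of a horizontal 12-tap pass; the coefficients are placeholders made up for illustration, not the actual A124 filter taps:

// Sketch of a horizontal 12-tap interpolation pass. Placeholder coefficients
// (chosen to sum to 128 so the result normalizes with a shift); the real
// A124 filter taps differ.
public final class InterpSketch {
    private static final int[] TAPS = {-1, 3, -7, 12, -23, 80, 80, -23, 12, -7, 3, -1};

    // 12 multiply-accumulates per output pixel, versus 6 for H.264's filter,
    // and this runs for every subpel candidate the motion search probes.
    static void filterRow(int[] src, int[] dst, int width) {
        for (int x = 0; x < width; x++) {          // src must be padded by 11 samples
            int acc = 0;
            for (int t = 0; t < 12; t++) {
                acc += TAPS[t] * src[x + t];
            }
            dst[x] = Math.min(255, Math.max(0, (acc + 64) >> 7)); // round, divide by 128, clamp
        }
    }
}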
I prepared myself for the slowness of an unoptimized encoder implementation, compiled their tool, and started a test encode with their recommended settings.
I waited. The first frame, an I-frame, completed.
I took a nap.
I waited. The second frame, a P-frame, was done.
I played a game of Settlers.
I waited. The third frame, a B-frame, was done.
I worked on a term paper.
I waited. The fourth frame, a B-frame, was done.
After a full 6 hours, 8 frames had encoded. Yes, at this rate, it would take a full two weeks to encode 10 seconds of HD video. On a Core i7. This is not merely slow; this is over 1000 times slower than x264 on “placebo” mode. This is so slow that it is not merely impractical; it is impossible to even test. This encoder is apparently designed for some sort of hypothetical future computer from space. And word from other developers is that the Intel proposal is even slower.
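For anyone who wants to check that projection (assuming the clip is 30 fps, which isn’t stated above), the arithmetic works out like this:

// Back-of-the-envelope check of the two-week figure above.
// Assumption: "10 seconds of HD video" means 30 fps; the post reports 8 frames in 6 hours.
public final class EncodeTimeProjection {
    public static void main(String[] args) {
        double hoursPerFrame = 6.0 / 8.0;            // 45 minutes per frame
        int frames = 10 * 30;                        // 300 frames for 10 seconds at 30 fps
        double totalHours = hoursPerFrame * frames;  // 225 hours
        System.out.printf("%.0f hours = %.1f days%n", totalHours, totalHours / 24.0);
        // Prints "225 hours = 9.4 days": the same order of magnitude as "two weeks".
    }
}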
This has led me to suspect that there is a great deal of cheating going on in the H.265 proposals. The goal of the proposals, of course, is to pick the best feature set for the next-generation video compression standard. But there is an extra motivation: organizations whose features get accepted get patents on the resulting standard, and thus income. With such large sums of money in the picture, dishonesty becomes all the more profitable.
There is a set of rules, of course, to limit how the proposals can optimize their encoders. If different encoders use different optimization techniques, the results will no longer be comparable — remember, they are trying to compare compression features, not methods of optimizing encoder-side decisions. Thus all encoders are required to use a constant quantizer, specified frame types, and so forth. But there are no limits on how slow an encoder can be or what algorithms it can use.
It would be one thing if the proposed encoder were a mere 10 times slower than the current reference; that would be reasonable, given the low level of optimization and the higher complexity of the new standard. But this is beyond ridiculous. With the prize given to whoever can eke out the most PSNR at a given quantizer at the lowest bitrate (with no limits on speed), we’re just going to get an arms race of slow encoders, with every company trying to use the most ridiculous optimizations possible, even if they involve encoding the frame 100,000 times over to choose the optimal parameters. And the end result will be as I encountered here: encoders so slow that they are simply impossible to even test.
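To see why unbounded encode-side search is so attractive under these rules, consider what an exhaustive rate-distortion search looks like. The toy sketch below uses entirely hypothetical names and stands in for no particular proposal’s code; it just shows that the loop body is a full encode and the candidate set multiplies out:

import java.util.List;

// Toy illustration of an unbounded rate-distortion parameter search: one full
// encode per candidate, cost = D + lambda*R, keep the cheapest. All names here
// are hypothetical.
public final class BruteForceRdSearch {
    record Result(double distortion, long bits) {}
    interface FrameEncoder { Result encode(byte[] frame, int[] params); }

    static int[] bestParams(FrameEncoder enc, byte[] frame, double lambda, List<int[]> candidates) {
        int[] best = null;
        double bestCost = Double.MAX_VALUE;
        for (int[] p : candidates) {                      // easily 100,000+ combinations
            Result r = enc.encode(frame, p);              // a complete encode per candidate
            double cost = r.distortion() + lambda * r.bits(); // standard RD cost
            if (cost < bestCost) { bestCost = cost; best = p; }
        }
        return best; // optimal under the metric, hopeless on any real schedule
    }
}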
Such an arms race certainly does little good in optimizing for reality, where we don’t have 30 years to encode an HD movie: a feature that gives great compression improvements is useless if it’s impossible to optimize for in a reasonable amount of time. Certainly once the standard is finalized, practical encoders will be written — but it makes no sense to optimize the standard for a use-case that doesn’t exist. And even attempting to “optimize” anything is difficult when encoding a few seconds of video takes weeks.
Update: The people involved have contacted me and insist that there was in fact no cheating going on. This is probably correct; the problem appears to be that the rules that were set out were simply not strict enough, making many changes that I would intuitively consider “cheating” perfectly allowed, and thus everyone can do it.
I would like to apologize if I implied that the results weren’t valid; they are — the Samsung+BBC proposal is definitely one of the best, which is why I picked it to test with. It’s just that I think any situation in which it’s impossible to test your own software is unreasonable, and thus the entire situation is an inherently broken one, given the lax rules, the slow baseline encoder, and the absence of any restriction on compute time.
Slow, robotic audio encoding with Humble-Video API (ffmpeg)
30 November 2017, by Walker Knapp

I have a program that is trying to parse pcm_s16le audio samples from a .wav file and encode them into mp3 using the Humble-Video API.
This isn’t what the final program is trying to do, but it outlines the problem I’m encountering.
The issue is that the output audio files sound robotic and slow.

input.wav (just some random audio from a video game; ignore the wonky size headers): https://drive.google.com/file/d/1nQOJGIxoSBDzprXExyTVNyyipSKQjyU0/view?usp=sharing
output.mp3: https://drive.google.com/file/d/1MfEFw2V7TiKS16SqSTv3wrbh6KoankIj/view?usp=sharing
output.wav: https://drive.google.com/file/d/1XtDdCtYao0kS0Qe2l6JGu1tC5xvqt62f/view?usp=sharing

import io.humble.video.*;
import java.io.*;
public class AudioEncodingTest {
    private static AudioChannel.Layout inLayout = AudioChannel.Layout.CH_LAYOUT_STEREO;
    private static int inSampleRate = 44100;
    private static AudioFormat.Type inFormat = AudioFormat.Type.SAMPLE_FMT_S16;
    private static int bytesPerSample = 2;
    private static File inFile = new File("input.wav");

    public static void main(String[] args) throws IOException, InterruptedException {
        render("output.mp3");
        render("output.wav");
    }

    public static void render(String filename) throws IOException, InterruptedException {
        // Starting everything up: pick a muxer from the file extension, then a matching audio codec.
        Muxer muxer = Muxer.make(new File(filename).getAbsolutePath(), null, null);
        Codec codec = Codec.guessEncodingCodec(muxer.getFormat(), null, null, null, MediaDescriptor.Type.MEDIA_AUDIO);

        // Use the input sample format if the codec supports it, otherwise the first supported format.
        AudioFormat.Type findType = null;
        for (AudioFormat.Type type : codec.getSupportedAudioFormats()) {
            if (findType == null) {
                findType = type;
            }
            if (type == inFormat) {
                findType = type;
                break;
            }
        }
        if (findType == null) {
            throw new IllegalArgumentException("Couldn't find valid audio format for codec: " + codec.getName());
        }

        Encoder encoder = Encoder.make(codec);
        encoder.setSampleRate(44100);
        encoder.setTimeBase(Rational.make(1, 44100));
        encoder.setChannels(2);
        encoder.setChannelLayout(AudioChannel.Layout.CH_LAYOUT_STEREO);
        encoder.setSampleFormat(findType);
        encoder.setFlag(Coder.Flag.FLAG_GLOBAL_HEADER, true);
        encoder.open(null, null);

        muxer.addNewStream(encoder);
        muxer.open(null, null);

        MediaPacket audioPacket = MediaPacket.make();
        // Converts from the raw input layout/rate/format to whatever the encoder expects.
        MediaAudioResampler audioResampler = MediaAudioResampler.make(
                encoder.getChannelLayout(), encoder.getSampleRate(), encoder.getSampleFormat(),
                inLayout, inSampleRate, inFormat);
        audioResampler.open();

        MediaAudio rawAudio = MediaAudio.make(1024 / bytesPerSample, inSampleRate, 2, inLayout, inFormat);
        rawAudio.setTimeBase(Rational.make(1, inSampleRate));

        // Reading: skip the 44-byte WAV header, then push raw PCM through the resampler and encoder.
        try (BufferedInputStream reader = new BufferedInputStream(new FileInputStream(inFile))) {
            reader.skip(44);
            int totalSamples = 0;
            byte[] buffer = new byte[1024];
            int readLength;
            while ((readLength = reader.read(buffer, 0, 1024)) != -1) {
                int sampleCount = readLength / bytesPerSample;
                rawAudio.getData(0).put(buffer, 0, 0, readLength);
                rawAudio.setNumSamples(sampleCount);
                rawAudio.setTimeStamp(totalSamples);
                totalSamples += sampleCount;
                rawAudio.setComplete(true);

                MediaAudio usedAudio = rawAudio;
                // Resample only when the encoder's parameters differ from the input's.
                if (encoder.getChannelLayout() != inLayout ||
                        encoder.getSampleRate() != inSampleRate ||
                        encoder.getSampleFormat() != inFormat) {
                    usedAudio = MediaAudio.make(
                            sampleCount,
                            encoder.getSampleRate(),
                            encoder.getChannels(),
                            encoder.getChannelLayout(),
                            encoder.getSampleFormat());
                    audioResampler.resample(usedAudio, rawAudio);
                }

                // Write out every complete packet the encoder produces for this chunk.
                do {
                    encoder.encodeAudio(audioPacket, usedAudio);
                    if (audioPacket.isComplete()) {
                        muxer.write(audioPacket, false);
                    }
                } while (audioPacket.isComplete());
            }
        } catch (IOException e) {
            e.printStackTrace();
            muxer.close();
            System.exit(-1);
        }
        muxer.close();
    }
}

Edit
I’ve gotten wave file exporting to work; however, the mp3s remain the same, which is very confusing. I changed the section that counts how many samples each buffer of bytes holds:
MediaAudio rawAudio = MediaAudio.make(1024, inSampleRate, channels, inLayout, inFormat);
rawAudio.setTimeBase(Rational.make(1, inSampleRate));

// Reading: a sample frame is (bytesPerSample * channels) bytes, so size the buffer accordingly.
try (BufferedInputStream reader = new BufferedInputStream(new FileInputStream(inFile))) {
    reader.skip(44);
    int totalSamples = 0;
    byte[] buffer = new byte[1024 * bytesPerSample * channels];
    int readLength;
    while ((readLength = reader.read(buffer, 0, 1024 * bytesPerSample * channels)) != -1) {
        int sampleCount = readLength / (bytesPerSample * channels);
        rawAudio.getData(0).put(buffer, 0, 0, readLength);
        rawAudio.setNumSamples(sampleCount);
        rawAudio.setTimeStamp(totalSamples);