
Recherche avancée
Autres articles (54)
-
Des sites réalisés avec MediaSPIP
2 mai 2011, parCette page présente quelques-uns des sites fonctionnant sous MediaSPIP.
Vous pouvez bien entendu ajouter le votre grâce au formulaire en bas de page. -
Ajouter notes et légendes aux images
7 février 2011, parPour pouvoir ajouter notes et légendes aux images, la première étape est d’installer le plugin "Légendes".
Une fois le plugin activé, vous pouvez le configurer dans l’espace de configuration afin de modifier les droits de création / modification et de suppression des notes. Par défaut seuls les administrateurs du site peuvent ajouter des notes aux images.
Modification lors de l’ajout d’un média
Lors de l’ajout d’un média de type "image" un nouveau bouton apparait au dessus de la prévisualisation (...) -
Automated installation script of MediaSPIP
25 avril 2011, parTo overcome the difficulties mainly due to the installation of server side software dependencies, an "all-in-one" installation script written in bash was created to facilitate this step on a server with a compatible Linux distribution.
You must have access to your server via SSH and a root account to use it, which will install the dependencies. Contact your provider if you do not have that.
The documentation of the use of this installation script is available here.
The code of this (...)
Sur d’autres sites (5890)
-
WebVTT Discussions at FOMS
1er janvier 2014, par silviaAt the recent FOMS (Foundations of Open Media Software and Standards) Developer Workshop, we had a massive focus on WebVTT and the state of its feature set. You will find links to summaries of the individual discussions in the FOMS Schedule page. Here are some of the key results I went away with.
1. WebVTT Regions
The key driving force for improvements to WebVTT continues to be the accurate representation of CEA608/708 captioning. As part of that drive, we’ve introduced regions (the CEA708 “window” concept) to WebVTT. WebVTT regions satisfy multiple requirements of CEA608/708 captions :
- support for rollup captions
- support for background color and border color on a group of cues independent of the background color of the individual cue
- possibility to move a group of cues from one location on screen to a different
- support to specify an anchor point and a growth direction for cues when their text size changes
- support for specifying a fixed number of lines to be rendered
- possibility to specify which region is rendered in front of which other one when regions overlap
While WebVTT regions enable us to satisfy all of the above points, the specification isn’t actually complete yet and some of the above needs aren’t satisfied yet.
We have an open bug to move a region elsewhere. A first discussion at FOMS seemed to to indicate that we’ll have to add syntax for updating a region at a particular time and thus give region definitions a way to be valid only for a certain time frame. I can imagine that the region definitions that we have in the header of the WebVTT file now would have an implicitly defined time frame from the start to the end of the file, but can be overruled by a re-definition anywhere within the WebVTT file. That redefinition needs to provide a start and end time.
We registered a bug to add specifying the width and height of regions (and possibly of cues) by em (i.e. by multiples of the largest character in a font). This should allow us to have the region grow/shrink around the region anchor point with a change of font size by script or a user. em specifications should also be applied to cues – that matches the column count of CEA708/608 better.
When regions overlap, the original region extension spec already suggested a “layer” cue setting. It will be easy to add it.
Another change that we will ultimately need is the “scroll” setting : we will need to introduce support for scrolling text down or from left-to-right or right-to-left, e.g. vertical scrolling text seems to be used in some Chinese caption use cases.
2. Unify Rendering Approach
The introduction of regions created a second code path in the rendering spec with some duplication. At FOMS we discussed if it was possible to unify that. The suggestion is to render all cues into a region. Those that are not part of a region would be rendered into an anonymous region that covers the complete viewport. There may be some consequences to this, e.g. cue settings should be usable across all cues, no matter whether or not part of a region, and avoiding cue overlap may need to be done within regions.
Here’s a rough outline of the path of the new rendering algorithm :
(1) Render the regions :
Specified Region Anonymous Region Render values as given : Render following values : - width
- lines
- regionanchor
- viewportanchor
- scroll
- 100%
- videoheight/lineheight
- 0,0
- 0,0
- none
(2) Render the cues :
- Create a cue box and put it in its region (anonymous if none given).
- Calculate position & size of cue box from cue settings (position, line, size).
- Calculate position of cue text inside cue box from remaining cue settings (vertical, align).
3. Vertical Features
WebVTT includes vertical rendering, both right-to-left and left-to-right. However, regions are not defined for vertical. Eventually, we’re going to have to look at the vertical features of WebVTT with more details and figure out whether the spec is working for them and what real-world requirements we have missed. We hope we can get some help from users in countries where vertically rendered captions/subtitles are the norm.
4. Best Practices
Some of he WebVTT users at FOMS suggested it would be advantageous to start a list of “best practices” for how to author captions with WebVTT. Example recommendations are :
- Use line numbers only to position cues from top or bottom of viewport. Don’t use otherwise.
- Note that when the user increases the fontsize in rollup captions and thus introduces new line breaks, your cues will roll by faster because the number of lines of a rollup is fixed.
- Make sure to use &lrm ; and &rlm ; UTF-8 markers to control the directionality of your text.
It would be nice if somebody started such a document.
5. Non-caption use cases
Instead of continuing to look back and improve our support of captions/subtitles in WebVTT, one session at FOMS also went ahead and looked forward to other use cases. The following requirements came out of this :
5.1 Preview Thumbnails
A common use case for timed data is the use of preview thumbnails on the navigation bar of videos. A native implementation of preview thumbnails would allow crawlers and search engines to have a standardised way of extracting timed images for media files, so introduction of a new @kind value “thumbnails” was suggested.
The content of a “thumbnails” cue could be any of :
- an image URL
- a sprite URL to a single image
- a spatial & temporal media fragment URL to a media resource
- base64 encoded image (data URI)
- an iframe offset to the media resource
The suggestion is to allow anything that would work in a img @src attribute as value in a cue of @kind=”thumbnails”. Responsive images might also be useful for a track of @kind=”thumbnails”. It may even be possible to define an inband thumbnail track based on the track of @kind=”thumbnails”. Such cues should also work in the JavaScript track API.
5.2 Chapter markers
There is interest to put richer content than just a chapter title into chapter cues. Often, chapters consist of a title, text and and image. The text is not so important, but the image is used almost everywhere that chapters are used. There may be a need to extend chapter cue content with images, similar to what a @kind=”thumbnails” track offers.
The conclusion that we arrived at was that we need to make @kind=”thumbnails” work first and then look at using the learnings from that to extend @kind=”chapters”.
5.3 Inband tracks for live video
A difficult topic was opened with the question of how to transport text tracks in live video. In live captioning, end times are never created for cues, but are implied by the start time of the next cue. This is a use case that hasn’t been addressed in HTML5/WebVTT yet. An old proposal to allow a special end time value of “NEXT” was discussed and recommended for adoption. Also, there was support for the spec change that stops blocking loading VTT until all cues have been loaded.
5.4 Cross-domain VTT loading
A brief discussion centered around the fact that the spec disallows cross-domain loading of WebVTT files, but that no browser implements this. This needs to be discussion at the HTML WG level.
6. Regions in live captioning
The final topic that we discussed was how we could provide support for regions in live captioning.
- The currently active region definitions will need to be come part of every header of every VTT file segment that HLS uses, so it’s available in case the cues in the segment file reference it.
- “NEXT” in end time markers would make authoring of live captioned VTT files easier.
- If the application wants to use 1 word at a time and doesn’t want to delay sending the word until the full cue is authored (e.g. in a Hangout type environment), we will need to introduce the concept of “cue continuation markers”, so we know that a cue could be extended with the next VTT file fragment.
This is an extensive and impressive amount of discussion around WebVTT and a lot of new work to be performed in the future. I’m very grateful for all the people who have contributed to these discussions at FOMS and will hopefully continue to help get the specifications right.
-
Trouble syncing libavformat/ffmpeg with x264 and RTP
26 décembre 2012, par Jacob PeddicordI've been working on some streaming software that takes live feeds
from various kinds of cameras and streams over the network using
H.264. To accomplish this, I'm using the x264 encoder directly (with
the "zerolatency" preset) and feeding NALs as they are available to
libavformat to pack into RTP (ultimately RTSP). Ideally, this
application should be as real-time as possible. For the most part,
this has been working well.Unfortunately, however, there is some sort of synchronization issue :
any video playback on clients seems to show a few smooth frames,
followed by a short pause, then more frames ; repeat. Additionally,
there appears to be approximately a 4-second delay. This happens with
every video player I've tried : Totem, VLC, and basic gstreamer pipes.I've boiled it all down to a somewhat small test case :
#include
#include
#include
#include
#include <libavformat></libavformat>avformat.h>
#include <libswscale></libswscale>swscale.h>
#define WIDTH 640
#define HEIGHT 480
#define FPS 30
#define BITRATE 400000
#define RTP_ADDRESS "127.0.0.1"
#define RTP_PORT 49990
struct AVFormatContext* avctx;
struct x264_t* encoder;
struct SwsContext* imgctx;
uint8_t test = 0x80;
void create_sample_picture(x264_picture_t* picture)
{
// create a frame to store in
x264_picture_alloc(picture, X264_CSP_I420, WIDTH, HEIGHT);
// fake image generation
// disregard how wrong this is; just writing a quick test
int strides = WIDTH / 8;
uint8_t* data = malloc(WIDTH * HEIGHT * 3);
memset(data, test, WIDTH * HEIGHT * 3);
test = (test << 1) | (test >> (8 - 1));
// scale the image
sws_scale(imgctx, (const uint8_t* const*) &data, &strides, 0, HEIGHT,
picture->img.plane, picture->img.i_stride);
}
int encode_frame(x264_picture_t* picture, x264_nal_t** nals)
{
// encode a frame
x264_picture_t pic_out;
int num_nals;
int frame_size = x264_encoder_encode(encoder, nals, &num_nals, picture, &pic_out);
// ignore bad frames
if (frame_size < 0)
{
return frame_size;
}
return num_nals;
}
void stream_frame(uint8_t* payload, int size)
{
// initalize a packet
AVPacket p;
av_init_packet(&p);
p.data = payload;
p.size = size;
p.stream_index = 0;
p.flags = AV_PKT_FLAG_KEY;
p.pts = AV_NOPTS_VALUE;
p.dts = AV_NOPTS_VALUE;
// send it out
av_interleaved_write_frame(avctx, &p);
}
int main(int argc, char* argv[])
{
// initalize ffmpeg
av_register_all();
// set up image scaler
// (in-width, in-height, in-format, out-width, out-height, out-format, scaling-method, 0, 0, 0)
imgctx = sws_getContext(WIDTH, HEIGHT, PIX_FMT_MONOWHITE,
WIDTH, HEIGHT, PIX_FMT_YUV420P,
SWS_FAST_BILINEAR, NULL, NULL, NULL);
// set up encoder presets
x264_param_t param;
x264_param_default_preset(&param, "ultrafast", "zerolatency");
param.i_threads = 3;
param.i_width = WIDTH;
param.i_height = HEIGHT;
param.i_fps_num = FPS;
param.i_fps_den = 1;
param.i_keyint_max = FPS;
param.b_intra_refresh = 0;
param.rc.i_bitrate = BITRATE;
param.b_repeat_headers = 1; // whether to repeat headers or write just once
param.b_annexb = 1; // place start codes (1) or sizes (0)
// initalize
x264_param_apply_profile(&param, "high");
encoder = x264_encoder_open(&param);
// at this point, x264_encoder_headers can be used, but it has had no effect
// set up streaming context. a lot of error handling has been ommitted
// for brevity, but this should be pretty standard.
avctx = avformat_alloc_context();
struct AVOutputFormat* fmt = av_guess_format("rtp", NULL, NULL);
avctx->oformat = fmt;
snprintf(avctx->filename, sizeof(avctx->filename), "rtp://%s:%d", RTP_ADDRESS, RTP_PORT);
if (url_fopen(&avctx->pb, avctx->filename, URL_WRONLY) < 0)
{
perror("url_fopen failed");
return 1;
}
struct AVStream* stream = av_new_stream(avctx, 1);
// initalize codec
AVCodecContext* c = stream->codec;
c->codec_id = CODEC_ID_H264;
c->codec_type = AVMEDIA_TYPE_VIDEO;
c->flags = CODEC_FLAG_GLOBAL_HEADER;
c->width = WIDTH;
c->height = HEIGHT;
c->time_base.den = FPS;
c->time_base.num = 1;
c->gop_size = FPS;
c->bit_rate = BITRATE;
avctx->flags = AVFMT_FLAG_RTP_HINT;
// write the header
av_write_header(avctx);
// make some frames
for (int frame = 0; frame < 10000; frame++)
{
// create a sample moving frame
x264_picture_t* pic = (x264_picture_t*) malloc(sizeof(x264_picture_t));
create_sample_picture(pic);
// encode the frame
x264_nal_t* nals;
int num_nals = encode_frame(pic, &nals);
if (num_nals < 0)
printf("invalid frame size: %d\n", num_nals);
// send out NALs
for (int i = 0; i < num_nals; i++)
{
stream_frame(nals[i].p_payload, nals[i].i_payload);
}
// free up resources
x264_picture_clean(pic);
free(pic);
// stream at approx 30 fps
printf("frame %d\n", frame);
usleep(33333);
}
return 0;
}This test shows black lines on a white background that
should move smoothly to the left. It has been written for ffmpeg 0.6.5
but the problem can be reproduced on 0.8 and 0.10 (from what I've tested so far). I've taken some shortcuts in error handling to make this example as short as
possible while still showing the problem, so please excuse some of the
nasty code. I should also note that while an SDP is not used here, I
have tried using that already with similar results. The test can be
compiled with :gcc -g -std=gnu99 streamtest.c -lswscale -lavformat -lx264 -lm -lpthread -o streamtest
It can be played with gtreamer directly :
gst-launch udpsrc port=49990 ! application/x-rtp,payload=96,clock-rate=90000 ! rtph264depay ! decodebin ! xvimagesink
You should immediately notice the stuttering. One common "fix" I've
seen all over the Internet is to add sync=false to the pipeline :gst-launch udpsrc port=49990 ! application/x-rtp,payload=96,clock-rate=90000 ! rtph264depay ! decodebin ! xvimagesink sync=false
This causes playback to be smooth (and near-realtime), but is a
non-solution and only works with gstreamer. I'd like to fix the
problem at the source. I've been able to stream with near-identical
parameters using raw ffmpeg and haven't had any issues :ffmpeg -re -i sample.mp4 -vcodec libx264 -vpre ultrafast -vpre baseline -b 400000 -an -f rtp rtp://127.0.0.1:49990 -an
So clearly I'm doing something wrong. But what is it ?
-
WebVTT Discussions at FOMS
18 décembre 2013, par silviaAt the recent FOMS (Foundations of Open Media Software and Standards) Developer Workshop, we had a massive focus on WebVTT and the state of its feature set. You will find links to summaries of the individual discussions in the FOMS Schedule page. Here are some of the key results I went away with.
1. WebVTT Regions
The key driving force for improvements to WebVTT continues to be the accurate representation of CEA608/708 captioning. As part of that drive, we’ve introduced regions (the CEA708 “window” concept) to WebVTT. WebVTT regions satisfy multiple requirements of CEA608/708 captions :
- support for rollup captions
- support for background color and border color on a group of cues independent of the background color of the individual cue
- possibility to move a group of cues from one location on screen to a different
- support to specify an anchor point and a growth direction for cues when their text size changes
- support for specifying a fixed number of lines to be rendered
- possibility to specify which region is rendered in front of which other one when regions overlap
While WebVTT regions enable us to satisfy all of the above points, the specification isn’t actually complete yet and some of the above needs aren’t satisfied yet.
We have an open bug to move a region elsewhere. A first discussion at FOMS seemed to to indicate that we’ll have to add syntax for updating a region at a particular time and thus give region definitions a way to be valid only for a certain time frame. I can imagine that the region definitions that we have in the header of the WebVTT file now would have an implicitly defined time frame from the start to the end of the file, but can be overruled by a re-definition anywhere within the WebVTT file. That redefinition needs to provide a start and end time.
We registered a bug to add specifying the width and height of regions (and possibly of cues) by em (i.e. by multiples of the largest character in a font). This should allow us to have the region grow/shrink around the region anchor point with a change of font size by script or a user. em specifications should also be applied to cues – that matches the column count of CEA708/608 better.
When regions overlap, the original region extension spec already suggested a “layer” cue setting. It will be easy to add it.
Another change that we will ultimately need is the “scroll” setting : we will need to introduce support for scrolling text down or from left-to-right or right-to-left, e.g. vertical scrolling text seems to be used in some Chinese caption use cases.
2. Unify Rendering Approach
The introduction of regions created a second code path in the rendering spec with some duplication. At FOMS we discussed if it was possible to unify that. The suggestion is to render all cues into a region. Those that are not part of a region would be rendered into an anonymous region that covers the complete viewport. There may be some consequences to this, e.g. cue settings should be usable across all cues, no matter whether or not part of a region, and avoiding cue overlap may need to be done within regions.
Here’s a rough outline of the path of the new rendering algorithm :
(1) Render the regions :
Specified Region Anonymous Region Render values as given : Render following values : - width
- lines
- regionanchor
- viewportanchor
- scroll
- 100%
- videoheight/lineheight
- 0,0
- 0,0
- none
(2) Render the cues :
- Create a cue box and put it in its region (anonymous if none given).
- Calculate position & size of cue box from cue settings (position, line, size).
- Calculate position of cue text inside cue box from remaining cue settings (vertical, align).
3. Vertical Features
WebVTT includes vertical rendering, both right-to-left and left-to-right. However, regions are not defined for vertical. Eventually, we’re going to have to look at the vertical features of WebVTT with more details and figure out whether the spec is working for them and what real-world requirements we have missed. We hope we can get some help from users in countries where vertically rendered captions/subtitles are the norm.
4. Best Practices
Some of he WebVTT users at FOMS suggested it would be advantageous to start a list of “best practices” for how to author captions with WebVTT. Example recommendations are :
- Use line numbers only to position cues from top or bottom of viewport. Don’t use otherwise.
- Note that when the user increases the fontsize in rollup captions and thus introduces new line breaks, your cues will roll by faster because the number of lines of a rollup is fixed.
- Make sure to use &lrm ; and &rlm ; UTF-8 markers to control the directionality of your text.
It would be nice if somebody started such a document.
5. Non-caption use cases
Instead of continuing to look back and improve our support of captions/subtitles in WebVTT, one session at FOMS also went ahead and looked forward to other use cases. The following requirements came out of this :
5.1 Preview Thumbnails
A common use case for timed data is the use of preview thumbnails on the navigation bar of videos. A native implementation of preview thumbnails would allow crawlers and search engines to have a standardised way of extracting timed images for media files, so introduction of a new @kind value “thumbnails” was suggested.
The content of a “thumbnails” cue could be any of :
- an image URL
- a sprite URL to a single image
- a spatial & temporal media fragment URL to a media resource
- base64 encoded image (data URI)
- an iframe offset to the media resource
The suggestion is to allow anything that would work in a img @src attribute as value in a cue of @kind=”thumbnails”. Responsive images might also be useful for a track of @kind=”thumbnails”. It may even be possible to define an inband thumbnail track based on the track of @kind=”thumbnails”. Such cues should also work in the JavaScript track API.
5.2 Chapter markers
There is interest to put richer content than just a chapter title into chapter cues. Often, chapters consist of a title, text and and image. The text is not so important, but the image is used almost everywhere that chapters are used. There may be a need to extend chapter cue content with images, similar to what a @kind=”thumbnails” track offers.
The conclusion that we arrived at was that we need to make @kind=”thumbnails” work first and then look at using the learnings from that to extend @kind=”chapters”.
5.3 Inband tracks for live video
A difficult topic was opened with the question of how to transport text tracks in live video. In live captioning, end times are never created for cues, but are implied by the start time of the next cue. This is a use case that hasn’t been addressed in HTML5/WebVTT yet. An old proposal to allow a special end time value of “NEXT” was discussed and recommended for adoption. Also, there was support for the spec change that stops blocking loading VTT until all cues have been loaded.
5.4 Cross-domain VTT loading
A brief discussion centered around the fact that the spec disallows cross-domain loading of WebVTT files, but that no browser implements this. This needs to be discussion at the HTML WG level.
6. Regions in live captioning
The final topic that we discussed was how we could provide support for regions in live captioning.
- The currently active region definitions will need to be come part of every header of every VTT file segment that HLS uses, so it’s available in case the cues in the segment file reference it.
- “NEXT” in end time markers would make authoring of live captioned VTT files easier.
- If the application wants to use 1 word at a time and doesn’t want to delay sending the word until the full cue is authored (e.g. in a Hangout type environment), we will need to introduce the concept of “cue continuation markers”, so we know that a cue could be extended with the next VTT file fragment.
This is an extensive and impressive amount of discussion around WebVTT and a lot of new work to be performed in the future. I’m very grateful for all the people who have contributed to these discussions at FOMS and will hopefully continue to help get the specifications right.