Media (91)

Other articles (54)

  • Emballe Médias: Putting documents online simply

    29 October 2010, by

    The Emballe Médias plugin was developed primarily for the MediaSPIP distribution, but it is also used in other related projects, such as géodiversité. Required and compatible plugins
    For this plugin to work, the following plugins must be installed: CFG, Saisies, SPIP Bonux, Diogène, swfupload, jqueryui
    Other plugins can be used alongside it to extend its capabilities: Ancres douces, Légendes, photo_infos, spipmotion (...)

  • Automatic installation script for MediaSPIP

    25 April 2011, by

    To work around installation difficulties caused mainly by server-side software dependencies, an all-in-one bash installation script was created to make this step easier on a server running a compatible Linux distribution.
    To use it, you need SSH access to your server and a "root" account, which will allow the dependencies to be installed. Contact your hosting provider if you do not have these.
    The documentation on using the installation script (...)

  • Custom menus

    14 November 2010, by

    MediaSPIP uses the Menus plugin to manage several configurable navigation menus.
    This lets channel administrators configure these menus in detail.
    Menus created when the site is initialized
    By default, three menus are created automatically when the site is initialized: the main menu; identifier: barrenav; this menu is usually inserted at the top of the page after the header block, and its identifier makes it compatible with Zpip-based templates; (...)

On other sites (3355)

  • WebVTT Discussions at FOMS

    18 December 2013, by silvia

    At the recent FOMS (Foundations of Open Media Software and Standards) Developer Workshop, we had a massive focus on WebVTT and the state of its feature set. You will find links to summaries of the individual discussions in the FOMS Schedule page. Here are some of the key results I went away with.

    1. WebVTT Regions

    The key driving force for improvements to WebVTT continues to be the accurate representation of CEA608/708 captioning. As part of that drive, we’ve introduced regions (the CEA708 “window” concept) to WebVTT. WebVTT regions satisfy multiple requirements of CEA608/708 captions:

    1. support for rollup captions
    2. support for background color and border color on a group of cues independent of the background color of the individual cue
    3. possibility to move a group of cues from one location on screen to a different one
    4. support to specify an anchor point and a growth direction for cues when their text size changes
    5. support for specifying a fixed number of lines to be rendered
    6. possibility to specify which region is rendered in front of which other one when regions overlap

    While WebVTT regions enable us to satisfy all of the above points, the specification isn’t actually complete yet and some of the above needs aren’t satisfied yet.

    We have an open bug to move a region elsewhere. A first discussion at FOMS seemed to indicate that we’ll have to add syntax for updating a region at a particular time and thus give region definitions a way to be valid only for a certain time frame. I can imagine that the region definitions that we have in the header of the WebVTT file now would have an implicitly defined time frame from the start to the end of the file, but can be overruled by a re-definition anywhere within the WebVTT file. That redefinition needs to provide a start and end time.

    We registered a bug to add specifying the width and height of regions (and possibly of cues) by em (i.e. by multiples of the largest character in a font). This should allow us to have the region grow/shrink around the region anchor point with a change of font size by script or a user. em specifications should also be applied to cues – that matches the column count of CEA708/608 better.

    When regions overlap, the original region extension spec already suggested a “layer” cue setting. It will be easy to add it.

    Another change that we will ultimately need is the “scroll” setting: we will need to introduce support for scrolling text down or from left-to-right or right-to-left, e.g. vertical scrolling text seems to be used in some Chinese caption use cases.

    2. Unify Rendering Approach

    The introduction of regions created a second code path in the rendering spec with some duplication. At FOMS we discussed if it was possible to unify that. The suggestion is to render all cues into a region. Those that are not part of a region would be rendered into an anonymous region that covers the complete viewport. There may be some consequences to this, e.g. cue settings should be usable across all cues, no matter whether or not part of a region, and avoiding cue overlap may need to be done within regions.

    Here’s a rough outline of the path of the new rendering algorithm (a short code sketch follows the outline):

    (1) Render the regions:

    Specified Region              Anonymous Region
    (render values as given)      (render the following values)
    • width                       • 100%
    • lines                       • videoheight/lineheight
    • regionanchor                • 0,0
    • viewportanchor              • 0,0
    • scroll                      • none

    (2) Render the cues:

    • Create a cue box and put it in its region (anonymous if none given).
    • Calculate position & size of cue box from cue settings (position, line, size).
    • Calculate position of cue text inside cue box from remaining cue settings (vertical, align).
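
    To make the two-step outline above concrete, here is a minimal Python sketch of the unified path. It is only a sketch under simplifying assumptions: the Region and Cue classes, their field names and the layout arithmetic are illustrative stand-ins, not the spec’s actual processing model.

    # A minimal sketch of the unified path described above: every cue is laid
    # out inside a region, and cues that name no region fall back to an
    # anonymous region covering the viewport. Names and math are illustrative.
    from dataclasses import dataclass
    from typing import List, Optional, Tuple

    @dataclass
    class Region:
        width: float = 100.0                        # percent of viewport width
        lines: int = 0                              # number of lines the region holds
        region_anchor: Tuple[float, float] = (0.0, 0.0)
        viewport_anchor: Tuple[float, float] = (0.0, 0.0)
        scroll: str = "none"

    @dataclass
    class Cue:
        text: str
        region: Optional[Region] = None
        position: Optional[float] = None            # settings that size the cue box
        line: Optional[float] = None
        size: float = 100.0
        vertical: str = ""                          # settings that place the text
        align: str = "middle"

    def render(cues: List[Cue], viewport_height: int, line_height: int) -> None:
        # (1) Render the regions: a specified region keeps its values, the
        # anonymous region gets the defaults from the table above.
        anonymous = Region(width=100.0,
                           lines=viewport_height // line_height,
                           region_anchor=(0.0, 0.0),
                           viewport_anchor=(0.0, 0.0),
                           scroll="none")
        for cue in cues:
            region = cue.region if cue.region is not None else anonymous
            # (2a) Cue box position & size inside its region, from position/line/size.
            box_width = region.width * cue.size / 100.0
            box_line = cue.line if cue.line is not None else region.lines - 1
            # (2b) Text placement inside the cue box, from vertical/align.
            print(f"cue {cue.text!r}: box width {box_width:.1f}%, line {box_line}, "
                  f"align {cue.align}, vertical {cue.vertical or 'none'}")

    render([Cue("Hello"),
            Cue("Rollup", region=Region(width=40.0, lines=3, scroll="up"))],
           viewport_height=720, line_height=24)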

    3. Vertical Features

    WebVTT includes vertical rendering, both right-to-left and left-to-right. However, regions are not defined for vertical. Eventually, we’re going to have to look at the vertical features of WebVTT in more detail and figure out whether the spec is working for them and what real-world requirements we have missed. We hope we can get some help from users in countries where vertically rendered captions/subtitles are the norm.

    4. Best Practices

    Some of the WebVTT users at FOMS suggested it would be advantageous to start a list of “best practices” for how to author captions with WebVTT. Example recommendations are:

    • Use line numbers only to position cues from the top or bottom of the viewport. Don’t use them otherwise.
    • Note that when the user increases the font size in rollup captions and thus introduces new line breaks, your cues will roll by faster because the number of lines of a rollup is fixed.
    • Make sure to use &lrm; and &rlm; UTF-8 markers to control the directionality of your text.

    It would be nice if somebody started such a document.

    5. Non-caption use cases

    Instead of continuing to look back and improve our support of captions/subtitles in WebVTT, one session at FOMS also went ahead and looked forward to other use cases. The following requirements came out of this:

    5.1 Preview Thumbnails

    A common use case for timed data is the use of preview thumbnails on the navigation bar of videos. A native implementation of preview thumbnails would allow crawlers and search engines to have a standardised way of extracting timed images for media files, so the introduction of a new @kind value “thumbnails” was suggested.

    The content of a “thumbnails” cue could be any of:

    • an image URL
    • a sprite URL to a single image
    • a spatial & temporal media fragment URL to a media resource
    • base64 encoded image (data URI)
    • an iframe offset to the media resource

    The suggestion is to allow anything that would work in an img @src attribute as the value of a cue of @kind=”thumbnails”. Responsive images might also be useful for a track of @kind=”thumbnails”. It may even be possible to define an inband thumbnail track based on the track of @kind=”thumbnails”. Such cues should also work in the JavaScript track API.
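
    As an illustration of the proposal (and only of the proposal: @kind=”thumbnails” is not an adopted standard), here is a small Python sketch that authors such a track, using combined temporal and spatial media fragment URLs into the video itself as cue payloads. The file name and thumbnail interval are made up.

    # Sketch of authoring a hypothetical @kind="thumbnails" track: each cue
    # payload is something that would work in an img @src, here a combined
    # temporal + spatial Media Fragment URI into the video itself.

    def fmt(t: float) -> str:
        """Format seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
        h, rem = divmod(t, 3600)
        m, s = divmod(rem, 60)
        return f"{int(h):02d}:{int(m):02d}:{s:06.3f}"

    def thumbnails_vtt(video_url: str, duration: float, interval: float = 10.0) -> str:
        lines = ["WEBVTT", ""]
        t = 0.0
        while t < duration:
            end = min(t + interval, duration)
            lines.append(f"{fmt(t)} --> {fmt(end)}")
            # one cue per interval: a 160x90 crop of the video at time t
            lines.append(f"{video_url}#t={t:g}&xywh=0,0,160,90")
            lines.append("")
            t = end
        return "\n".join(lines)

    print(thumbnails_vtt("video.webm", duration=35.0))

    If the value were adopted, such a file would presumably be referenced like any other text track, with the kind attribute of the track element carrying the proposed “thumbnails” value.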

    5.2 Chapter markers

    There is interest in putting richer content than just a chapter title into chapter cues. Often, chapters consist of a title, text, and an image. The text is not so important, but the image is used almost everywhere that chapters are used. There may be a need to extend chapter cue content with images, similar to what a @kind=”thumbnails” track offers.

    The conclusion that we arrived at was that we need to make @kind=”thumbnails” work first and then look at using the learnings from that to extend @kind=”chapters”.

    5.3 Inband tracks for live video

    A difficult topic was opened with the question of how to transport text tracks in live video. In live captioning, end times are never created for cues, but are implied by the start time of the next cue. This is a use case that hasn’t been addressed in HTML5/WebVTT yet. An old proposal to allow a special end time value of “NEXT” was discussed and recommended for adoption. Also, there was support for the spec change that stops blocking the loading of a VTT file until all cues have been loaded.
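
    To illustrate the “NEXT” idea, here is a tiny Python sketch of how a consumer might resolve it: a cue whose end time is “NEXT” is closed by the start time of the following cue, and the last cue stays open until more of the live stream arrives. The tuple layout and function name are just illustrative assumptions.

    # Resolve the proposed "NEXT" end-time value: the end is implied by the
    # start of the next cue; the final cue remains open (end is None) until
    # the next fragment of the live stream is available.
    from typing import List, Optional, Tuple

    LiveCue = Tuple[float, Optional[float], str]   # (start, end, text); None end means "NEXT"

    def resolve_next(cues: List[LiveCue]) -> List[LiveCue]:
        resolved = []
        for i, (start, end, text) in enumerate(cues):
            if end is None:                        # "NEXT": take the next cue's start
                end = cues[i + 1][0] if i + 1 < len(cues) else None
            resolved.append((start, end, text))
        return resolved

    print(resolve_next([(0.0, None, "Hello"), (2.5, None, "world"), (4.0, None, "...")]))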

    5.4 Cross-domain VTT loading

    A brief discussion centered around the fact that the spec disallows cross-domain loading of WebVTT files, but that no browser implements this. This needs to be discussed at the HTML WG level.

    6. Regions in live captioning

    The final topic that we discussed was how we could provide support for regions in live captioning.

    • The currently active region definitions will need to become part of the header of every VTT file segment that HLS uses, so they are available in case the cues in the segment file reference them (see the sketch after this list).
    • “NEXT” in end time markers would make authoring of live captioned VTT files easier.
    • If the application wants to use 1 word at a time and doesn’t want to delay sending the word until the full cue is authored (e.g. in a Hangout type environment), we will need to introduce the concept of “cue continuation markers”, so we know that a cue could be extended with the next VTT file fragment.
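
    For the first point, here is a rough Python sketch of what an HLS packager might do: prepend the currently active region definitions to the header of every segment it emits, so cues in any segment can reference them. The “Region:” header line and the region cue setting shown are only indicative of the proposal of that era, not final syntax.

    # Carry the active region definitions in every VTT segment header so that
    # cues in the segment can reference them. The syntax is illustrative only.
    ACTIVE_REGIONS = [
        "Region: id=rollup width=40% lines=3 "
        "regionanchor=0%,100% viewportanchor=10%,90% scroll=up",
    ]

    def make_segment(cue_lines: list) -> str:
        header = ["WEBVTT", *ACTIVE_REGIONS, ""]
        return "\n".join(header + cue_lines)

    print(make_segment(["00:00:10.000 --> 00:00:12.000 region:rollup",
                        "live caption text"]))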

    This is an extensive and impressive amount of discussion around WebVTT and a lot of new work to be performed in the future. I’m very grateful for all the people who have contributed to these discussions at FOMS and will hopefully continue to help get the specifications right.

  • 2011 In Open Source Multimedia

    5 January 2012, by Multimedia Mike — Open Source Multimedia

    Sometimes I think that the pace of multimedia technology is slowing down. Obviously, I’m not paying close enough attention. I thought I would do a little 2011 year-end review of what happened in the world of open source multimedia, mainly for my own benefit. Let me know in the comments what I missed.

    The Split
    The biggest deal in open source multimedia was the matter of the project split. Where once stood one project (FFmpeg) there now stand two (also Libav). Where do things stand with the projects now? Still very separate but similar. Both projects obsessively monitor each other’s git commits and prodigiously poach each other’s work, both projects being LGPL and all. Most features that land in one code base end up in the other. Thus, I refer to FFmpeg and Libav collectively as “the projects”.

    Some philosophical reasons for the split included project stagnation and development process friction. Curiously, these problems are fond memories now and the spirit of competition has pushed development forward at a blinding pace.

    People inside the project have strong opinions about the split; that’s understandable. People outside the project have strong opinions about the split; that’s somewhat less understandable, but whatever. After 5 years of working for Adobe on the Flash Player (a.k.a. the most hated software in all existence if internet nerds are to be believed on the matter), I’m so over internet nerd drama.

    For my part, I just try to maintain some appearance of neutrality since I manage some shared resources for the open source multimedia community (like the wiki and samples repo) and am trying to keep them from fracturing as well.

    Apple and Open Source
    It was big news that Apple magnanimously open sourced their lossless audio codec. That sets a great example and precedent.

    New Features
    I mined the 'git log' of the projects in order to pick out some features that were added during 2011.

    First off, Apple’s ProRes video codec was reverse engineered and incorporated into the multimedia libraries. And for some weird reason, this is an item that made the rounds in the geek press. I’m not entirely sure why, but it may have something to do with inter-project conflict. Anyway, here is the decoder in action, playing a video of some wild swine, one of the few samples we have:



    Other new video codecs included a reverse engineered Indeo 4 decoder. Gotta catch ‘em all! That completes our collection of Indeo codecs. But that wasn’t enough– this year, we got a completely revised Indeo 3 decoder (the previous one, while functional, exhibited a lot of code artifacts betraying a direct ASM -> C translation). Oh, and many thanks to Kostya for this gem:



    That’s the new Origin Xan decoder (best known for Wing Commander IV cinematics) in action, something I first started reverse engineering back in 2002. Thanks to Kostya for picking up my slack yet again.

    Continuing with the codec section, there is a decoder for Adobe Flash Screen Video 2 — big congrats on this! One of my jobs at Adobe was documenting this format to the outside world and I was afraid I could never quite make it clear enough to build a complete re-implementation. But the team came through.

    Let’s see, there are decoders for VBLE video, Ut Video, Windows Media Image (WMVP/WMP2), Bink audio version ‘b’, H.264 4:2:2 intra frames, and MxPEG video. There is a DPX image encoder, a Cirrus Logic AccuPak video encoder, and a v410 codec.

    How about some more game stuff ? The projects saw — at long last — an SMJPEG demuxer. This will finally allow usage and testing of the SMJPEG IMA ADPCM audio decoder I added about a decade ago. Funny story behind that– I was porting all of my decoders from xine which included the SMJPEG ADPCM. I just never quite got around to writing a corresponding demuxer. Thanks to Paul Mahol for taking care of that.

    Here’s a DFA playback system for a 1995 DOS CD-ROM title called Chronomaster. No format is too obscure, nor its encoded contents too cheesy:



    There’s now a demuxer for a format called XMV that was (is?) prevalent on Xbox titles. Now the projects can handle FMV files from many Xbox games, such as Thrillville.



    The projects also gained the ability to play BMV files. I think this surfing wizard comes from Discworld II. It’s non-computer-generated animation at a strange resolution.



    More demuxers: xWMA, PlayStation Portable PMP format, and CRI ADX format; muxer for OpenMG audio and LATM muxer/demuxer.

    One more thing: an AVX-optimized fast Fourier transform (FFT). If you have a machine that supports AVX, there’s no way you’ll even notice the speed increase of a few measly FFT calls for audio coding/decoding, but that’s hardly the point. The projects always use everything on offer for any CPU.

    Please make me aware of features that I missed in the list!

    Continuous Testing
    As a result of the split, each project has its own FATE server, one for FFmpeg and one for Libav. As of the new year, FFmpeg has just over 1000 tests while Libav has 965. This is one area where I’m obviously ecstatic to see competition. Some ad-hoc measurements on my part indicate that the total code coverage via the FATEs has not appreciably increased. But that’s a total percentage. Both the test count and the code count have been steadily rising.

    Google Summer of Code and Google Code-In
    Once again, the projects were allowed to participate in the Google Summer of Code as well as Google Code-In. I confess that I didn’t keep up with these too carefully (and Code-In is still in progress as of this writing). I do know that the project split occurred after FFmpeg had already been accepted for GSoC season 2011, and the admins were gracious enough to allow both projects to participate in the same slot as long as they could both be mature about it.

    Happy New Year
    Let’s see what we can accomplish in 2012.

  • Parsing The Clue Chronicles

    30 December 2018, by Multimedia Mike — Game Hacking

    A long time ago, I procured a 1999 game called Clue Chronicles: Fatal Illusion, based on the classic board game Clue, a.k.a. Cluedo. At the time, I was big into collecting old, unloved PC games so that I could research obscure multimedia formats.



    Surveying the 3 CD-ROMs contained in the box packaging revealed only Smacker (SMK) videos for full motion video, which was nothing new to me or the multimedia hacking community at the time. Studying the mix of data formats present on the discs, I found a selection of straightforward formats such as WAV for audio and BMP for still images. I generally find myself more fascinated by how computer games are constructed rather than by playing them, and this mix of files has always triggered a strong “I could implement a new engine for this!” feeling in me, perhaps as part of the ScummVM project which already provides the core infrastructure for reimplementing engines for 2D adventure games.

    Tying all of the assets together is a custom high-level programming language. I have touched on this before in a blog post over a decade ago. The scripts are in a series of files bearing the extension .ini (usually reserved for configuration scripts, but we’ll let that slide). A representative sample of such a script can be found here:

    clue-chronicles-scarlet-1.txt

    What Is This Language?
    At the time I first analyzed this language, I was still primarily a C/C++-minded programmer, with a decent amount of Perl experience as a high level language, and had just started to explore Python. I assessed this language to be “mildly object oriented with C++-type comments (‘//’) and reliant upon a number of implicit library functions”. Other people saw other properties. When I look at it nowadays, it reminds me a bit more of JavaScript than C++. I think it’s sort of a Rorschach test for programming languages.

    Strangely, I sort of had this fear that I would put a lot of effort into figuring out how to parse out the language only for someone to come along and point out that it’s a well-known yet academic language that already has a great deal of supporting code and libraries available as open source. Google for “spanish dolphins far side comic” for an illustration of the feeling this would leave me with.

    It doesn’t matter in the end. Even if such libraries exist, how easy would they be to integrate into something like ScummVM? Time to focus on a workable approach to understanding and processing the format.

    Problem Scope
    So I set about to see if I could write a program to parse the language seen in these INI files. Some questions:

    1. How large is the corpus of data that I need to be sure to support?
    2. What parsing approach should I take?
    3. What is the exact language format?
    4. Other hidden challenges?

    To figure out how large the data corpus is, I counted all of the INI files on all of the discs. There are 138 unique INI files between the 3 discs. However, there are 146 unique INI files after installation. This leads to a hidden challenge described a bit later.

    What parsing approach should I take? I worried a bit too much that I might not be doing this the “right” way. I’m trying to ignore doubts like this, like how “SQL Shame” blocked me on a task for a little while a few years ago as I concerned myself that I might not be using the purest, most elegant approach to the problem. I know I covered language parsing a long time ago in my university computer science education and there is a lot of academic literature on the matter. But sometimes, you just have to charge in and experiment and prototype and see what falls out. In doing so, I expect to have a better understanding of the problems that need to be solved and the right questions to ask, not unlike that time that I wrote a continuous integration system from scratch because I didn’t actually know that “continuous integration” was the keyword I needed.

    Next, what is the exact language format? I realized that parsing the language isn’t the first and foremost problem here– I need to know exactly what the language is. I need to know what the grammar and keywords are. In essence, I need to reverse engineer the language before I write a proper parser for it. I guess that fits in nicely with the historical aim of this blog (reverse engineering).

    Now, about the hidden challenges– I mentioned that there are 8 more INI files after the game installs itself. Okay, so what’s the big deal? For some reason, all of the INI files are in plaintext on the CD-ROM but get compressed (apparently, according to file size ratios) when installed to the hard drive. This includes those 8 extra INI files. I thought to look inside the CAB installation archive file on the CD-ROM and the files were there… but all in compressed form. I suspect that one of the files forms the “root” of the program and is the launching point for the game.

    Parsing Approach
    I took a stab at parsing an INI file. My approach was to first perform lexical analysis on the file and create a list of 4 types: symbols, numbers, strings, and language elements ([]{}()=.,:). Apparently, this is the kind of thing that Lex/Flex are good at. This prototyping tool is written in Python, but when I port this to ScummVM, it might be useful to call upon the services of Lex/Flex, or another lexical analyzer, for there are many. I have a feeling it will be easier to use better tools when I understand the full structure of the language based on the data available.
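
    For illustration, here is a rough sketch of that kind of lexer in Python. It is not the author’s actual tool; it just splits a script into the four token categories listed above, and the sample line is invented, loosely modelled on the symbols tabulated further down.

    # Split a script into the four token categories described above: symbols,
    # numbers, strings, and single-character language elements. Comments and
    # whitespace are discarded; anything unrecognised is simply skipped.
    import re

    TOKEN_RE = re.compile(r"""
        (?P<string>  "[^"\n]*" )            # double-quoted strings
      | (?P<number>  -?\d+(?:\.\d+)? )      # integers and decimals
      | (?P<symbol>  [A-Za-z_]\w* )         # identifiers / keywords
      | (?P<element> [\[\]{}()=.,:] )       # language elements
      | (?P<comment> //[^\n]* )             # C++-style comments, discarded
      | (?P<ws>      \s+ )                  # whitespace, discarded
    """, re.VERBOSE)

    def tokenize(text: str):
        for match in TOKEN_RE.finditer(text):
            kind = match.lastgroup
            if kind not in ("ws", "comment"):
                yield kind, match.group()

    sample = 'GraphicScript { File = "intro.smk", Looping = false }  // invented sample'
    print(list(tokenize(sample)))

    Tallying the symbol tokens from every script with something like collections.Counter would then be enough to produce a frequency list like the one below.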

    The purpose of this tool is to explore all the possibilities of the existing corpus of INI files. To that end, I ran all 138 of the plaintext files through it, collected all of the symbols, and massaged the results, assuming that the symbols that occurred most frequently are probably core language features. These are all the symbols which occur more than 1000 times among all the scripts:

       6248 false
       5734 looping
       4390 scripts
       3877 layer
       3423 sequentialscript
       3408 setactive
       3360 file
       3257 thescreen
       3239 true
       3008 autoplay
       2914 offset
       2599 transparent
       2441 text
       2361 caption
       2276 add
       2205 ge
       2197 smackanimation
       2196 graphicscript
       2196 graphic
       1977 setstate
       1642 state
       1611 skippable
       1576 desc
       1413 delayscript
       1298 script
       1267 seconds
       1019 rect
    

    About That Compression
    I have sorted out at least these few details of the compression:

    bytes 0-3    "COMP" (a pretty strong sign that this is, in fact, compressed data)
    bytes 4-11   unknown
    bytes 12-15  size of uncompressed data
    bytes 16-19  size of compressed data (filesize - 20)
    bytes 20-    compressed payload
    

    The compression ratios are on the same order as gzip’s. I was hoping that it was stock zlib data. However, I have been unable to prove this. I wrote a Python script that scrubbed through the first 100 bytes of payload data and tried to get Python’s zlib.decompress to initialize– no luck. It’s frustrating to know that I’ll have to reverse engineer a compression algorithm that deals with just 8 total text files if I want to see this effort through to fruition.
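
    For reference, here is a small Python sketch of that experiment: it parses the header fields laid out above (assuming little-endian sizes, which is itself a guess) and probes the start of the payload for a zlib stream. The file name in the commented-out call is hypothetical.

    # Parse the "COMP" header described above and probe the payload for a
    # stock zlib stream. Field offsets follow the layout above; little-endian
    # byte order for the two size fields is an assumption.
    import struct
    import zlib

    def probe_comp(path: str) -> None:
        with open(path, "rb") as f:
            data = f.read()
        if data[0:4] != b"COMP":
            raise ValueError("not a COMP file")
        # bytes 4-11 are unknown; bytes 12-15 and 16-19 are the two sizes.
        uncomp_size, comp_size = struct.unpack_from("<II", data, 12)
        payload = data[20:20 + comp_size]
        print(f"uncompressed {uncomp_size} bytes, compressed {comp_size} bytes")
        # Try zlib at each of the first 100 payload offsets; if none works,
        # the scheme is presumably not stock zlib (matching the result above).
        for offset in range(min(100, len(payload))):
            try:
                zlib.decompress(payload[offset:])
                print(f"zlib stream found at payload offset {offset}")
                return
            except zlib.error:
                pass
        print("no zlib stream found in the first 100 payload offsets")

    # probe_comp("Scarlet_1.ini")   # hypothetical installed (compressed) file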

    Update, January 15, 2019
    Some folks expressed interest in trying to sort out the details of the compression format. So I have posted a followup in which I post some samples and go into deeper detail about things I have tried:

    Reverse Engineering Clue Chronicles Compression

    The post Parsing The Clue Chronicles first appeared on Breaking Eggs And Making Omelettes.