Other articles (50)

  • Automatic backup of SPIP channels

    1 April 2010

    When setting up an open platform, hosts need reasonably regular backups to guard against any problem that may arise.
    This task relies on two SPIP plugins: Saveauto, which regularly backs up the database as a MySQL dump (usable in phpMyAdmin), and mes_fichiers_2, which builds a zip archive of the site’s important data (the documents, the elements (...)

  • Contribute to a better visual interface

    13 April 2011

    MediaSPIP is based on a system of themes and templates. Templates define the placement of information on the page, and can be adapted to a wide range of uses. Themes define the overall graphic appearance of the site.
    Anyone can submit a new graphic theme or template and make it available to the MediaSPIP community.

  • Submit bugs and patches

    13 April 2011

    Unfortunately, software is never perfect.
    If you think you have found a bug, report it using our ticket system. Please help us fix it by providing the following information:
      • the browser you are using, including the exact version
      • as precise an explanation of the problem as possible
      • if possible, the steps that led to the problem
      • a link to the site / page in question
    If you think you have solved the bug, open a ticket and attach a corrective patch to it.
    You may also (...)

On other sites (7171)

  • How to generate video as fast as possible with subtitles and audio on node.js + ffmpeg?

    12 September 2018, by DSeregin

    Intro:

    We receive pieces of text from the site.
    The pieces arrive at a node.js server.

    As output we need a video assembled from all the pieces of text, voiced by a machine voice, with subtitles and a background audio track added, so that the user can share the video on social networks. The MKV format is not supported by VK.com.

    The options we have tried:
    1. Get all the text at once, generate the entire speech, create a subtitle file, and burn the subtitles into an .mp4 video (vk.com does not support the .mkv container). This took 12 seconds of processing for a 45-second video on a local computer.
    2. Generate audio and video files for each piece of text (with subtitles added). This took one second per piece of text. On the final request, we merge all the pieces together. That last request (merging) took 2-3 seconds, which is bearable.

    The second option looks acceptable in terms of speed, but with 50 clients running at the same time, the computer (tested on a 2013 MacBook Pro, 2.4 GHz i7, 8 GB 1600 MHz DDR3, 256 GB SSD) processed only 1 piece from 1 client in 60 seconds (60 times slower), and then froze completely.

    The commands we used:

    • Add subtitles to the video and trim it to a nominal 6 seconds (the real code passes a Unix timestamp)

    ffmpeg -i import/back.mov -i export_0/tmp.srt -scodec mov_text -t 6 export_0/output.mov

    • Merging all audio

    ffmpeg -i audio1.mp3 .... -i audio15.mp3 merged.mp3

    • Overlay the background audio on the spoken text

    ffmpeg -i merged.mp3 -i back.mp3 -filter_complex amerge -ac 2 -c:a libmp3lame -q:a 4 -shortest audio.mp3

    • Merging all videos

    ffmpeg -f concat -i video.txt -c copy video.mp4
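
    The contents of video.txt are not shown in the question. Assuming it is a list file for ffmpeg's concat demuxer (one "file '...'" line per piece), a minimal Python sketch for generating it could look like this. The function names are hypothetical, and -f concat is placed before -i so that it applies to the input list.

```python
from pathlib import Path


def concat_list_text(paths):
    """Return the text of a list file for ffmpeg's concat demuxer."""
    lines = []
    for path in paths:
        # Inside single quotes, a literal quote is written as '\''.
        escaped = str(path).replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    return "\n".join(lines) + "\n"


def write_concat_list(paths, list_path="video.txt"):
    """Write the list file and return ffmpeg arguments that read it."""
    Path(list_path).write_text(concat_list_text(paths), encoding="utf-8")
    # Stream copy (-c copy) only works if all pieces share codec parameters.
    return ["ffmpeg", "-f", "concat", "-i", str(list_path), "-c", "copy", "video.mp4"]
```

    The returned argument list can be handed to whatever process-spawning call the server already uses.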

    • Overlay audio on video

    ffmpeg -i audio.mp3 -i video.mp4 -i test.mp4 -i export/output.mp3 -c:v copy -c:a aac -map 0:v:0 -map 1:a:0 -shortest output.mp4

    Open questions:

    1. Can this be made faster?

    2. Can I use other codecs or ways of joining the pieces that avoid re-encoding?

    3. Should we call ffmpeg directly, without a wrapper? (In practice, this saves 50-100 ms.)

    4. Should we avoid saving to disk, write the data to a Stream instead, and join everything at the end?
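
    One possible mitigation for the 50-client test described above is to cap how many ffmpeg processes run at once, so that extra requests wait in a queue instead of all competing for the CPU. The sketch below shows the idea in Python with asyncio; the server in the question is node.js, so this only illustrates the pattern. The names run_limited and render_piece and the default limit of 4 are assumptions.

```python
import asyncio


async def run_limited(jobs, max_parallel=4):
    """Run job coroutines with at most max_parallel of them active at once."""
    semaphore = asyncio.Semaphore(max_parallel)

    async def guarded(job):
        # Jobs beyond the limit wait here until a slot is released.
        async with semaphore:
            return await job()

    return await asyncio.gather(*(guarded(job) for job in jobs))


async def render_piece(args):
    """Hypothetical job: run one ffmpeg command and return its exit code."""
    process = await asyncio.create_subprocess_exec("ffmpeg", *args)
    return await process.wait()
```

    A reasonable starting value for max_parallel is the number of CPU cores, tuned from measurements.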

  • Grand Unified Theory of Compact Disc

    1 February 2013, by Multimedia Mike (General)

    This is something I started writing about a decade ago (and I almost certainly have some of it wrong), back when compact discs still had a fair amount of relevance. Back around 2002, after a few years investigating multimedia technology, I took an interest in compact discs of all sorts. Even though there may seem to be a wide range of CD types, I generally found that they’re all fundamentally the same. I thought I would finally publish something, incomplete though it may be.

    Physical Perspective
    There are a lot of ways to look at a compact disc. First, there’s the physical format, where a laser detects where pits/grooves have disturbed the smooth surface (a.k.a. lands). A lot of technical descriptions claim that these lands and pits on a CD correspond to ones and zeros. That’s not actually true, but you have to decide what level of abstraction you care about, and that abstraction is good enough if you only care about the discs from a software perspective.

    Grand Unified Theory (Software Perspective)
    Looking at a disc from a software perspective, I have generally found it useful to view a CD as a combination of 2 main components:

    • table of contents (TOC)
    • a long string of sectors, each of which is 2352 bytes long

    I like to believe that’s pretty much all there is to it. All of the information on a CD is stored as a string of sectors that might be chopped up into a series of anywhere from 1-99 individual tracks. The exact sector locations where these individual tracks begin are defined in the TOC.
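
    As a rough illustration of this model, here is a minimal Python sketch of a disc as a TOC plus a run of sectors. The class and field names are hypothetical, and is_audio is an assumed per-track flag; only the 2352-byte sector size and the 1-99 track range come from the text above.

```python
from dataclasses import dataclass

SECTOR_SIZE = 2352  # bytes per raw sector


@dataclass
class Track:
    number: int        # 1..99
    start_sector: int  # first sector of the track, as listed in the TOC
    is_audio: bool     # hypothetical flag: audio vs. data


@dataclass
class Disc:
    toc: list           # Track entries ordered by start_sector
    total_sectors: int  # length of the sector string

    def track_at(self, sector):
        """Return the track that contains the given sector index."""
        if not 0 <= sector < self.total_sectors:
            raise ValueError("sector is outside the disc")
        current = None
        for track in self.toc:
            if track.start_sector <= sector:
                current = track
        return current

    def byte_offset(self, sector):
        """Offset of a sector within a raw image of the sector string."""
        return sector * SECTOR_SIZE
```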

    Audio CDs (CD-DA / Red Book)
    The initial purpose for the compact disc was to store digital audio. The strange sector size of 2352 bytes is an artifact of this original charter. “CD quality audio”, as any multimedia nerd knows, is formally defined as stereo PCM samples that are each 16 bits wide and played at a frequency of 44100 Hz.

    (44100 audio frames / 1 second) * (2 samples / audio frame) * 
      (16 bits / 1 sample) * (1 byte / 8 bits) = 176,400 bytes / second
    (176,400 bytes / 1 second) / (2352 bytes / 1 sector) = 75
    

    75 is the number of sectors required to store a single second of CD-quality audio. A single sector stores 1/75th of a second, or a ‘frame’ of audio (though I think ‘frame’ gets tossed around at all levels when describing CD formats).
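
    The same arithmetic can be checked in a few lines of Python. The helper sectors_to_msf is a hypothetical name; it splits a sector count into minutes, seconds and 1/75-second frames, assuming the count starts at zero.

```python
SAMPLE_RATE = 44100    # audio frames per second
CHANNELS = 2           # stereo
BYTES_PER_SAMPLE = 2   # 16 bits
SECTOR_SIZE = 2352     # bytes per sector

BYTES_PER_SECOND = SAMPLE_RATE * CHANNELS * BYTES_PER_SAMPLE
SECTORS_PER_SECOND = BYTES_PER_SECOND // SECTOR_SIZE


def sectors_to_msf(sectors):
    """Split a sector count into (minutes, seconds, frames)."""
    seconds, frames = divmod(sectors, SECTORS_PER_SECOND)
    minutes, seconds = divmod(seconds, 60)
    return minutes, seconds, frames
```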

    The term “red book” is thrown around in relation to audio CDs. There is a series of rainbow books that define various optical disc standards and the red book describes audio CDs.

    Basic Data CD-ROMs (Mode 1 / Yellow Book)
    Somewhere along the line, someone decided that general digital information could be stored on these discs. Hence, the CD-ROM was born. The standard model above still applies: a TOC and a string of 2352-byte sectors. However, it’s generally only useful to have a single track on a CD-ROM. Thus, the TOC only lists a single track. That single track can easily span the entire disc (something that would be unusual for a typical audio CD).

    While the model is mostly the same, the most notable difference between an audio CD and a plain CD-ROM is that, while each sector is 2352 bytes long, only 2048 bytes are used to store actual data payload. The remaining bytes are used for synchronization and additional error detection/correction.
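
    To make the 2352-versus-2048 split concrete, here is a small Python sketch that pulls the payload out of one raw sector. The field offsets (a 12-byte sync pattern, then a 4-byte header whose last byte is the mode) are taken from the commonly documented Mode 1 layout rather than from this post, and mode1_payload is a hypothetical name.

```python
SECTOR_SIZE = 2352
SYNC = b"\x00" + b"\xff" * 10 + b"\x00"  # 12-byte synchronization pattern
HEADER_SIZE = 4                           # 3 address bytes + 1 mode byte
PAYLOAD_SIZE = 2048


def mode1_payload(sector):
    """Return the 2048-byte data payload of one raw Mode 1 sector."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("expected a 2352-byte raw sector")
    if sector[:12] != SYNC:
        raise ValueError("missing sync pattern (audio sector?)")
    if sector[15] != 1:
        raise ValueError("not a Mode 1 sector")
    start = len(SYNC) + HEADER_SIZE
    # The 288 bytes after the payload hold error detection/correction data.
    return sector[start:start + PAYLOAD_SIZE]
```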

    At least, the foregoing is true for mode 1 CD-ROMs (which are the most common). “Mode 1” CD-ROMs are defined by a publication called the yellow book. The yellow book also defines mode 2, which forgoes the additional error detection and correction afforded by mode 1 and dedicates 2336 of the 2352 sector bytes to the data payload.

    CD-ROM XA (Mode 2 / Green Book)
    From a software perspective, these are similar to mode 1 CD-ROMs. There are 2 forms here: form 1 gives a 2048-byte data payload, while form 2 yields a 2324-byte data payload.
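
    The payload sizes mentioned so far translate directly into throughput. The Python sketch below assumes a drive reading at the same 75 sectors per second as audio playback (1x speed); the labels only name the payload sizes and are not official terms.

```python
SECTORS_PER_SECOND = 75  # from the audio calculation: 176,400 / 2352

PAYLOAD_BYTES = {
    "raw audio sector": 2352,
    "2048-byte payload": 2048,
    "2336-byte payload": 2336,
    "2324-byte payload": 2324,
}


def bytes_per_second(payload_bytes, speed=1):
    """Payload throughput at a given multiple of the base read speed."""
    return payload_bytes * SECTORS_PER_SECOND * speed


for label, size in PAYLOAD_BYTES.items():
    overhead = 2352 - size
    print(f"{label}: {bytes_per_second(size)} bytes/s, {overhead} overhead bytes per sector")
```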

    Video CD (VCD / White Book)
    These are CD-ROM XA discs that carry MPEG-1 video and audio data.

    Photo CD (Beige Book)
    This is something I have never personally dealt with. But it’s supposed to conform to the CD-ROM XA standard and probably fits into my model. It seems to date back to early in the CD-ROM era when CDs were particularly cost prohibitive.

    Multisession CDs (Blue Book)
    Okay, I admit that this confuses me a bit. Multisession discs allow a user to burn multiple sessions to a single recordable disc. I.e., burn a lump of data, then burn another lump at a later time, and the final result will look like all the lumps were recorded as the same big lump. I remember this being incredibly useful and cost effective back when recordable CDs cost around US$10 each (vs. being able to buy a spindle of 100 CD-Rs for US$10 or less now). Studying the cdrom.h file for the Linux OS, I found an ioctl request named CDROMMULTISESSION that returns the sector address of the start of the last session. If I were to hypothesize about how to make this fit into my model, I might guess that the TOC has some hint that the disc was recorded in multisession (which needs to be decided up front) and the CDROMMULTISESSION call is made to find the last session. Or it could be that a disc read initialization operation always leads off with the CDROMMULTISESSION query in order to determine this.
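
    For anyone who wants to experiment, here is a hedged Python sketch of issuing CDROMMULTISESSION through ioctl(). The request number, the CDROM_LBA constant and the struct cdrom_multisession layout are taken from linux/cdrom.h; the 8-byte padded size assumes a common Linux ABI, and the /dev/cdrom path and function names are assumptions.

```python
import struct

CDROMMULTISESSION = 0x5310  # ioctl request number from linux/cdrom.h
CDROM_LBA = 0x01            # address format: logical block address

# struct cdrom_multisession: 4-byte address union, xa_flag, addr_format,
# padded to 8 bytes (assumed ABI).
MULTISESSION_FORMAT = "iBBxx"


def build_request():
    """Buffer asking for the last session's start as an LBA."""
    return bytearray(struct.pack(MULTISESSION_FORMAT, 0, 0, CDROM_LBA))


def parse_reply(buffer):
    """Return (start_sector_of_last_session, is_xa) from a filled buffer."""
    lba, xa_flag, _addr_format = struct.unpack(MULTISESSION_FORMAT, bytes(buffer))
    return lba, bool(xa_flag)


def last_session_start(device="/dev/cdrom"):
    """Query a drive; needs Linux, a disc in the drive, and read permission."""
    import fcntl
    import os

    fd = os.open(device, os.O_RDONLY | os.O_NONBLOCK)
    try:
        buffer = build_request()
        fcntl.ioctl(fd, CDROMMULTISESSION, buffer)  # fills the buffer in place
        return parse_reply(buffer)
    finally:
        os.close(fd)
```

    Only the packing and parsing helpers can be exercised without hardware; last_session_start is untested here.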

    I suppose I could figure out how to create a multisession disc with modern software, or possibly dig up a multisession disc from 15+ years ago, and then figure out how it should be read.

    CD-i
    This type puzzles me as well. I do have some CD-i discs and I thought that I could read them just fine (the last time I looked, which was many years ago). But my research for this blog post has me thinking that I might not have been seeing the entire picture when I first studied my CD-i samples. I was able to see some of the data, but sources indicate that only proper CD-i hardware is able to see all of the data on the disc (apparently, the TOC doesn’t show all of the sectors on disc).

    Hybrid CDs (Data + Audio)
    At some point, it became a notable selling point for an audio CD to have a data track with bonus features. Even more common (particularly in the early era of CD-ROMs) were computer and console games that used the first track of a disc for all the game code and assets and the remaining tracks for beautifully rendered game audio that could also be enjoyed outside the game. Same model: TOC points to the various tracks and also makes notes about which ones are data and which are audio.

    There seem to be 2 distinct things described above. One type is the mixed mode CD which generally has the data in the first track and the audio in tracks 2..n. Then there is the enhanced CD, which apparently used multisession recording and put the data at the end. I think that the reasoning for this is that most audio CD player hardware would only read tracks from the first session and would have no way to see the data track. This was a positive thing. By contrast, when placing a mixed-mode CD into an audio player, the data track would be rendered as nonsense noise.

    Subchannels
    There’s at least one small detail that my model ignores: subchannels. CDs can encode bits of data in subchannels in sectors. This is used for things like CD-Text and CD-G. I may need to revisit this.

    In Summary
    There’s still a lot of ground to cover, like how those sectors might be formatted to show something useful (e.g., filesystems), and how the model applies to other types of optical discs. Sounds like something for another post.

  • Adding Subtitles to a VAAPI/QSV 10bit Accelerated Transcode

    17 June 2023, by Enverex

    I've been converting some of my Blu-rays to watch on a streaming machine elsewhere in the house, but I'm having trouble when it comes to burning subtitles. More specifically, I'm having trouble figuring out what FFmpeg wants from me to make the process actually work.

    It's easy enough to do this in software, but I'm using hardware decoding and encoding, and that's where the complexity seems to come from (VAAPI/QSV for decoding, QSV AV1 for encoding).

    I can use the following when dealing with SDR content:

    -vf "scale_vaapi=w='min(1920,iw)':h=-8:mode=nl_anamorphic:format=p010le:extra_hw_frames=120,hwmap=derive_device=qsv,format=qsv"

    And the following when dealing with HDR content:

    -vf "scale_vaapi=w='min(1920,iw)':h=-8:mode=nl_anamorphic:format=p010le,tonemap_vaapi=format=p010le:t=bt709:m=bt709:p=bt709:extra_hw_frames=120,hwmap=derive_device=qsv,format=qsv"

    But I cannot, in the hundreds of permutations I've tried now, find a single way to shoehorn subtitle baking into the process. I'm using SRT files for simplicity, so I just need to add subtitles=blah.srt somewhere to get it to work, but the crux of the issue is knowing where in the chain it needs to go and, more importantly, what supporting arguments it needs with it (e.g. hwupload, hwdownload, and their associated switches, etc).

    Pretty much every single attempt just results in:

    Impossible to convert between the formats supported by the filter 'graph 0 input from stream 0:0' and the filter 'auto_scale_0'
    Error reinitializing filters!
    Failed to inject frame into filter network: Function not implemented

    So, what am I missing?