
Other articles (104)
-
Sites built with MediaSPIP
2 May 2011. This page presents some of the sites running MediaSPIP.
You can of course add your own via the form at the bottom of the page. -
Supported formats
28 January 2010. The following commands provide information about the formats and codecs handled by the local ffmpeg installation:
ffmpeg -codecs
ffmpeg -formats
Supported input video formats
This list is not exhaustive; it highlights the main formats in use: h264: H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10; m4v: raw MPEG-4 video format; flv: Flash Video (FLV) / Sorenson Spark / Sorenson H.263; Theora; wmv:
Possible output video formats
To begin with, we (...) -
Adding notes and captions to images
7 February 2011. To add notes and captions to images, the first step is to install the "Légendes" plugin.
Once the plugin is activated, you can configure it in the configuration area to change the rights for creating, editing and deleting notes. By default, only site administrators can add notes to images.
Changes when adding a media item
When adding a media item of type "image", a new button appears above the preview (...)
On other sites (8683)
-
HLS script has been lost to time; previous content was made in a specific format, attempting to recreate it using FFmpeg primitives
28 February, by Wungo
Looking to add this video to a stitched playlist. The variants, encoding, and everything else must match the original exactly. We have no access to how things were done previously, so I am working through this by feel as best I can.


I recommend testing with a 30-second clip of Big Buck Bunny, or with the original Big Buck Bunny 1080p video.
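
For reference, a 30-second test clip can be cut from a local copy of the Big Buck Bunny source with a stream copy; the input filename below is only an assumption about where that source lives:

# Hypothetical input path: keep the first 30 seconds without re-encoding.
ffmpeg -i big_buck_bunny_1080p.mp4 -t 30 -c copy bbb_30s.mp4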


#!/bin/bash
ffmpeg -i bbb_30s.mp4 -filter_complex "
[0:v]split=7[v1][v2][v3][v4][v5][v6][v7];
[v1]scale=416:234[v1out];
[v2]scale=416:234[v2out];
[v3]scale=640:360[v3out];
[v4]scale=768:432[v4out];
[v5]scale=960:540[v5out];
[v6]scale=1280:720[v6out];
[v7]scale=1920:1080[v7out]
" \
-map "[v1out]" -c:v:0 libx264 -b:v:0 200k -maxrate 361k -bufsize 400k -r 29.97 -g 60 -keyint_min 60 -sc_threshold 0 -preset veryfast -profile:v baseline -level 3.0 \
-map "[v2out]" -c:v:1 libx264 -b:v:1 500k -maxrate 677k -bufsize 700k -r 29.97 -g 60 -keyint_min 60 -sc_threshold 0 -preset veryfast -profile:v baseline -level 3.0 \
-map "[v3out]" -c:v:2 libx264 -b:v:2 1000k -maxrate 1203k -bufsize 1300k -r 29.97 -g 60 -keyint_min 60 -sc_threshold 0 -preset veryfast -profile:v main -level 3.1 \
-map "[v4out]" -c:v:3 libx264 -b:v:3 1800k -maxrate 2057k -bufsize 2200k -r 29.97 -g 60 -keyint_min 60 -sc_threshold 0 -preset veryfast -profile:v main -level 3.1 \
-map "[v5out]" -c:v:4 libx264 -b:v:4 2500k -maxrate 2825k -bufsize 3000k -r 29.97 -g 60 -keyint_min 60 -sc_threshold 0 -preset veryfast -profile:v main -level 4.0 \
-map "[v6out]" -c:v:5 libx264 -b:v:5 5000k -maxrate 5525k -bufsize 6000k -r 29.97 -g 60 -keyint_min 60 -sc_threshold 0 -preset veryfast -profile:v high -level 4.1 \
-map "[v7out]" -c:v:6 libx264 -b:v:6 8000k -maxrate 9052k -bufsize 10000k -r 29.97 -g 60 -keyint_min 60 -sc_threshold 0 -preset veryfast -profile:v high -level 4.2 \
-map 0:a:0 -c:a:0 aac -b:a:0 128k -ar 48000 -ac 2 \
-f hls -hls_time 6 -hls_playlist_type vod -hls_flags independent_segments \
-hls_segment_type fmp4 \
-hls_segment_filename "output_%v_%03d.mp4" \
-master_pl_name master.m3u8 \
-var_stream_map "v:0,name:layer-416x234-200k v:1,name:layer-416x234-500k v:2,name:layer-640x360-1000k v:3,name:layer-768x432-1800k v:4,name:layer-960x540-2500k v:5,name:layer-1280x720-5000k v:6,name:layer-1920x1080-8000k a:0,name:layer-audio-128k" \
output_%v.m3u8




Above is what I've put together over the past few days.


I consistently run into the same issues:


- My variants must match the original ones identically; the bit rates etc. must match exactly, no exceptions. No variance allowed.
- When I did it a different way previously, it became impossible to sync the variants' timing, which made the playlist unstitchable and the asset useless. The variants end up encoded to last longer than the master.m3u8 says they should, and the asset gets rejected downstream.
- I end up either with variants whose timing doesn't match, or with the audio/audio channels not synced properly. Here is what the master.m3u8 should look like.








#EXTM3U
#EXT-X-VERSION:7

#EXT-X-STREAM-INF:AUDIO="aac",AVERAGE-BANDWIDTH=333000,BANDWIDTH=361000,CLOSED-CAPTIONS="cc1",CODECS="avc1.4d400d,mp4a.40.2",FRAME-RATE=29.97,RESOLUTION=416x234
placeholder.m3u8
#EXT-X-STREAM-INF:AUDIO="aac",AVERAGE-BANDWIDTH=632000,BANDWIDTH=677000,CLOSED-CAPTIONS="cc1",CODECS="avc1.4d400d,mp4a.40.2",FRAME-RATE=29.97,RESOLUTION=416x234
placeholder2.m3u8
#EXT-X-STREAM-INF:AUDIO="aac",AVERAGE-BANDWIDTH=1133000,BANDWIDTH=1203000,CLOSED-CAPTIONS="cc1",CODECS="avc1.4d401e,mp4a.40.2",FRAME-RATE=29.97,RESOLUTION=640x360
placeholder3.m3u8

#EXT-X-STREAM-INF:AUDIO="aac",AVERAGE-BANDWIDTH=1933000,BANDWIDTH=2057000,CLOSED-CAPTIONS="cc1",CODECS="avc1.4d401f,mp4a.40.2",FRAME-RATE=29.97,RESOLUTION=768x432
placeholder4.m3u8

#EXT-X-STREAM-INF:AUDIO="aac",AVERAGE-BANDWIDTH=2633000,BANDWIDTH=2825000,CLOSED-CAPTIONS="cc1",CODECS="avc1.4d401f,mp4a.40.2",FRAME-RATE=29.97,RESOLUTION=960x540
placeholder5.m3u8

#EXT-X-STREAM-INF:AUDIO="aac",AVERAGE-BANDWIDTH=5134000,BANDWIDTH=5525000,CLOSED-CAPTIONS="cc1",CODECS="avc1.4d401f,mp4a.40.2",FRAME-RATE=29.97,RESOLUTION=1280x720
placeholder6.m3u8

#EXT-X-STREAM-INF:AUDIO="aac",AVERAGE-BANDWIDTH=8135000,BANDWIDTH=9052000,CLOSED-CAPTIONS="cc1",CODECS="avc1.640028,mp4a.40.2",FRAME-RATE=29.97,RESOLUTION=1920x1080
placeholder7.m3u8

#EXT-X-STREAM-INF:AUDIO="aac",AVERAGE-BANDWIDTH=129000,BANDWIDTH=130000,CLOSED-CAPTIONS="cc1",CODECS="mp4a.40.2"
placeholder8.m3u8

#EXT-X-MEDIA:AUTOSELECT=YES,CHANNELS="2",DEFAULT=YES,GROUP-ID="aac",LANGUAGE="en",NAME="English",TYPE=AUDIO,URI="placeholder8.m3u8"
#EXT-X-MEDIA:AUTOSELECT=YES,DEFAULT=YES,GROUP-ID="cc1",INSTREAM-ID="CC1",LANGUAGE="en",NAME="English",TYPE=CLOSED-CAPTIONS
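
Note that each CODECS attribute encodes the H.264 profile and level (avc1.PPCCLL: hex profile_idc, constraint flags and level_idc), so the -profile:v and -level flags in the script have to be chosen to reproduce these exact strings. A quick hedged check against a generated master (the path is an assumption):

# Sketch: list the CODECS strings a generated master actually contains,
# to compare against the target playlist above.
grep -o 'CODECS="[^"]*"' master.m3u8 | sort | uniq -c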



The underlying playlist segments should be *.mp4, not *.m4s or anything like that. The audio must sit in its own rendition, in a playlist by itself; closed captions are handled by a remote server and aren't a concern.
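
One way to get the audio into its own playlist and have ffmpeg write the #EXT-X-MEDIA audio entry itself is the hls muxer's agroup key in -var_stream_map. A minimal sketch with only two video rungs for brevity; the stream names are illustrative, and the master it emits would still need to be checked against the target above:

# Sketch: tie the video variants to an audio group so the audio gets its own
# rendition playlist and an #EXT-X-MEDIA:TYPE=AUDIO line in the master.
ffmpeg -i bbb_30s.mp4 \
 -map 0:v -map 0:v -map 0:a \
 -c:v libx264 -c:a aac -b:a 128k -ac 2 \
 -b:v:0 200k -s:v:0 416x234 \
 -b:v:1 1000k -s:v:1 640x360 \
 -f hls -hls_time 6 -hls_playlist_type vod -hls_segment_type fmp4 \
 -hls_flags independent_segments \
 -hls_segment_filename "seg_%v_%03d.mp4" \
 -master_pl_name master.m3u8 \
 -var_stream_map "v:0,agroup:aac,name:layer-416x234-200k v:1,agroup:aac,name:layer-640x360-1000k a:0,agroup:aac,name:layer-audio-128k,default:yes" \
 out_%v.m3u8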


As mentioned above:

- I have tried transcoding the variants separately and then combining them manually afterwards. Here is an example of that.




#!/bin/bash
set -e

# Input file
INPUT_FILE="bbb_30.mp4"

# Output directory
OUTPUT_DIR="hls_output"
mkdir -p "$OUTPUT_DIR"

# First, extract exact duration from master.m3u8 (if it exists)
MASTER_M3U8="master.m3u8" # Change if needed

echo "Extracting exact duration from the source MP4..."
EXACT_DURATION=$(ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 "$INPUT_FILE")
echo "Using exact duration: $EXACT_DURATION seconds"
# Create a reference file with exact duration from the source
echo "Creating reference file with exact duration..."
ffmpeg -y -i "$INPUT_FILE" -c copy -t "$EXACT_DURATION" "$OUTPUT_DIR/exact_reference.mp4"

# Calculate exact GOP size for segment alignment (for 6-second segments at 29.97fps)
FPS=29.97
SEGMENT_DURATION=6
GOP_SIZE=$(echo "$FPS * $SEGMENT_DURATION" | bc | awk '{print int($1)}')
echo "Using GOP size of $GOP_SIZE frames for $SEGMENT_DURATION-second segments at $FPS fps"

# Function to encode a variant with exact duration
encode_variant() {
 local resolution="$1"
 local bitrate="$2"
 local maxrate="$3"
 local bufsize="$4"
 local profile="$5"
 local level="$6"
 local audiorate="$7"
 local name_suffix="$8"
 
 echo "Encoding $resolution variant with video bitrate $bitrate kbps and audio bitrate ${audiorate}k..."
 
 # Step 1: Create an intermediate file with exact duration and GOP alignment
 ffmpeg -y -i "$OUTPUT_DIR/exact_reference.mp4" \
 -c:v libx264 -profile:v "$profile" -level "$level" \
 -x264-params "bitrate=$bitrate:vbv-maxrate=$maxrate:vbv-bufsize=$bufsize:keyint=$GOP_SIZE:min-keyint=$GOP_SIZE:no-scenecut=1" \
 -s "$resolution" -r "$FPS" \
 -c:a aac -b:a "${audiorate}k" \
 -vsync cfr -start_at_zero -reset_timestamps 1 \
 -map 0:v:0 -map 0:a:0 \
 -t "$EXACT_DURATION" \
 -force_key_frames "expr:gte(t,n_forced*6)" \
 "$OUTPUT_DIR/temp_${name_suffix}.mp4"
 
 # Step 2: Create HLS segments with exact boundaries from the intermediate file.
 ffmpeg -y -i "$OUTPUT_DIR/temp_${name_suffix}.mp4" \
 -c copy \
 -f hls \
 -hls_time "$SEGMENT_DURATION" \
 -hls_playlist_type vod \
 -hls_segment_filename "$OUTPUT_DIR/layer-${name_suffix}-segment-%03d.mp4" \
 -hls_flags independent_segments+program_date_time+round_durations \
 -hls_list_size 0 \
 "$OUTPUT_DIR/layer-${name_suffix}.m3u8"
 
 # Verify duration
 VARIANT_DURATION=$(ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 "$OUTPUT_DIR/temp_${name_suffix}.mp4")
 echo "Variant $name_suffix duration: $VARIANT_DURATION (target: $EXACT_DURATION, diff: $(echo "$VARIANT_DURATION - $EXACT_DURATION" | bc))"
 
 # Clean up temporary file
 rm "$OUTPUT_DIR/temp_${name_suffix}.mp4"
}

# Process each variant with exact duration matching
# Format: resolution, bitrate, maxrate, bufsize, profile, level, audio bitrate, name suffix
encode_variant "416x234" "333" "361" "722" "baseline" "3.0" "64" "416x234-200k"
encode_variant "416x234" "632" "677" "1354" "baseline" "3.0" "64" "416x234-500k"
encode_variant "640x360" "1133" "1203" "2406" "main" "3.0" "96" "640x360-1000k"
encode_variant "768x432" "1933" "2057" "4114" "main" "3.1" "96" "768x432-1800k"
encode_variant "960x540" "2633" "2825" "5650" "main" "3.1" "128" "960x540-2500k"
encode_variant "1280x720" "5134" "5525" "11050" "main" "3.1" "128" "1280x720-5000k"
encode_variant "1920x1080" "8135" "9052" "18104" "high" "4.0" "128" "1920x1080-8000k"

# 8. Audio-only variant
echo "Creating audio-only variant..."


# ffmpeg -y -i "$INPUT_FILE" \
# -vn -map 0:a \
# -c:a aac -b:a 128k -ac 2 \ 
# -t "$EXACT_DURATION" \
# -f hls \
# -hls_time "$SEGMENT_DURATION" \
# -hls_playlist_type vod \
# -hls_flags independent_segments+program_date_time+round_durations \
# -hls_segment_filename "$OUTPUT_DIR/layer-audio-128k-segment-%03d.ts" \
# -hls_list_size 0 \
# "$OUTPUT_DIR/layer-audio-128k.m3u8"

ffmpeg -y -i "$INPUT_FILE" \
 -vn \
 -map 0:a \
 -c:a aac -b:a 128k \
 -t "$EXACT_DURATION" \
 -f hls \
 -hls_time "$SEGMENT_DURATION" \
 -hls_playlist_type vod \
 -hls_segment_type fmp4 \
 -hls_flags independent_segments+program_date_time+round_durations \
 -hls_list_size 0 \
 -hls_segment_filename "$OUTPUT_DIR/layer-audio-128k-segment-%03d.m4s" \
 "$OUTPUT_DIR/layer-audio-128k.m3u8"


# Create master playlist
cat > "$OUTPUT_DIR/master.m3u8" << EOF
#EXTM3U
#EXT-X-VERSION:7
#EXT-X-INDEPENDENT-SEGMENTS

#EXT-X-STREAM-INF:BANDWIDTH=361000,AVERAGE-BANDWIDTH=333000,CODECS="avc1.4d400d,mp4a.40.2",RESOLUTION=416x234,FRAME-RATE=29.97
layer-416x234-200k.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=677000,AVERAGE-BANDWIDTH=632000,CODECS="avc1.4d400d,mp4a.40.2",RESOLUTION=416x234,FRAME-RATE=29.97
layer-416x234-500k.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1203000,AVERAGE-BANDWIDTH=1133000,CODECS="avc1.4d401e,mp4a.40.2",RESOLUTION=640x360,FRAME-RATE=29.97
layer-640x360-1000k.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2057000,AVERAGE-BANDWIDTH=1933000,CODECS="avc1.4d401f,mp4a.40.2",RESOLUTION=768x432,FRAME-RATE=29.97
layer-768x432-1800k.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2825000,AVERAGE-BANDWIDTH=2633000,CODECS="avc1.4d401f,mp4a.40.2",RESOLUTION=960x540,FRAME-RATE=29.97
layer-960x540-2500k.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=5525000,AVERAGE-BANDWIDTH=5134000,CODECS="avc1.4d401f,mp4a.40.2",RESOLUTION=1280x720,FRAME-RATE=29.97
layer-1280x720-5000k.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=9052000,AVERAGE-BANDWIDTH=8135000,CODECS="avc1.640028,mp4a.40.2",RESOLUTION=1920x1080,FRAME-RATE=29.97
layer-1920x1080-8000k.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=130000,AVERAGE-BANDWIDTH=129000,CODECS="mp4a.40.2"
layer-audio-128k.m3u8
EOF

# Verify all durations match
cat > "$OUTPUT_DIR/verify_all.sh" << 'EOF'
#!/bin/bash

# Get exact reference duration from the exact reference file
REFERENCE_DURATION=$(ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 "exact_reference.mp4")
echo "Reference duration: $REFERENCE_DURATION seconds"

# Check each segment's duration
echo -e "\nChecking individual segments..."
for seg in layer-*-segment-*.mp4 layer-audio-128k-segment-*.m4s; do
 dur=$(ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 "$seg")
 echo "$seg: $dur seconds"
done

# Get total duration for each variant by summing segment EXTINF durations from each playlist
echo -e "\nChecking combined variant durations..."
for variant in layer-*.m3u8; do
 total=0
 while read -r line; do
 if [[ $line == "#EXTINF:"* ]]; then
 dur=$(echo "$line" | sed 's/#EXTINF:\([0-9.]*\).*/\1/')
 total=$(echo "$total + $dur" | bc)
 fi
 done < "$variant"
 echo "$variant: $total seconds (reference: $REFERENCE_DURATION, diff: $(echo "$total - $REFERENCE_DURATION" | bc))"
done
EOF

chmod +x "$OUTPUT_DIR/verify_all.sh"

echo "HLS packaging complete with exact duration matching."
echo "Master playlist available at: $OUTPUT_DIR/master.m3u8"
echo "Run $OUTPUT_DIR/verify_all.sh to verify durations."
rm "$OUTPUT_DIR/exact_reference.mp4"




I end up with weird audio in VLC, which can't be right, and I also end up with variants that run longer than the master.m3u8 playlist says they should, which is wonky.
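
A quick way to see which renditions run long is to let ffprobe read each variant playlist end to end and compare its reported duration with the source clip; a minimal sketch, assuming the output layout produced by the script above:

# Sketch: compare each variant playlist's duration, as read through the HLS
# demuxer, against the source file.
SRC_DUR=$(ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 bbb_30.mp4)
for pl in hls_output/layer-*.m3u8; do
 dur=$(ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 "$pl")
 echo "$pl: $dur s (source: $SRC_DUR s)"
done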


I tried using AI to fix the audio sync issue, and honestly I'm more confused than when I started.


-
Using FFmpeg to concat two videos, is the output video's level a mistake?
27 February, by 哇哈哈

video1:
{
 "index": 0,
 "codec_name": "hevc",
 "codec_long_name": "H.265 / HEVC (High Efficiency Video Coding)",
 "profile": "Main",
 "codec_type": "video",
 "codec_tag_string": "hev1",
 "codec_tag": "0x31766568",
 "width": 1920,
 "height": 1080,
 "coded_width": 1920,
 "coded_height": 1080,
 "has_b_frames": 2,
 "sample_aspect_ratio": "1:1",
 "display_aspect_ratio": "16:9",
 "pix_fmt": "yuv420p",
 "level": 120,
 "color_range": "tv",
 "chroma_location": "left",
 "field_order": "progressive",
 "refs": 1,
 "view_ids_available": "",
 "view_pos_available": "",
 "id": "0x1",
 "r_frame_rate": "30/1",
 "avg_frame_rate": "30/1",
 "time_base": "1/15360",
 "start_pts": 0,
 "start_time": "0.000000",
 "duration_ts": 200192,
 "duration": "13.033333",
 "bit_rate": "10794613",
 "nb_frames": "391",
 "extradata_size": 2496,
 "disposition": {
 "default": 1,
 "dub": 0,
 "original": 0,
 "comment": 0,
 "lyrics": 0,
 "karaoke": 0,
 "forced": 0,
 "hearing_impaired": 0,
 "visual_impaired": 0,
 "clean_effects": 0,
 "attached_pic": 0,
 "timed_thumbnails": 0,
 "non_diegetic": 0,
 "captions": 0,
 "descriptions": 0,
 "metadata": 0,
 "dependent": 0,
 "still_image": 0,
 "multilayer": 0
 },
 "tags": {
 "language": "eng",
 "handler_name": "VideoHandler",
 "vendor_id": "[0][0][0][0]",
 "encoder": "Lavc61.33.100 libx265",
 "timecode": "00:00:00;00"
 }
}

video2:
{
 "index": 0,
 "codec_name": "hevc",
 "codec_long_name": "H.265 / HEVC (High Efficiency Video Coding)",
 "profile": "Main",
 "codec_type": "video",
 "codec_tag_string": "hev1",
 "codec_tag": "0x31766568",
 "width": 1920,
 "height": 1080,
 "coded_width": 1920,
 "coded_height": 1080,
 "has_b_frames": 2,
 "sample_aspect_ratio": "1:1",
 "display_aspect_ratio": "16:9",
 "pix_fmt": "yuv420p",
 "level": 120,
 "color_range": "tv",
 "chroma_location": "left",
 "field_order": "progressive",
 "refs": 1,
 "view_ids_available": "",
 "view_pos_available": "",
 "id": "0x1",
 "r_frame_rate": "25/1",
 "avg_frame_rate": "25/1",
 "time_base": "1/12800",
 "start_pts": 0,
 "start_time": "0.000000",
 "duration_ts": 1309696,
 "duration": "102.320000",
 "bit_rate": "1024122",
 "nb_frames": "2558",
 "extradata_size": 2496,
 "disposition": {
 "default": 1,
 "dub": 0,
 "original": 0,
 "comment": 0,
 "lyrics": 0,
 "karaoke": 0,
 "forced": 0,
 "hearing_impaired": 0,
 "visual_impaired": 0,
 "clean_effects": 0,
 "attached_pic": 0,
 "timed_thumbnails": 0,
 "non_diegetic": 0,
 "captions": 0,
 "descriptions": 0,
 "metadata": 0,
 "dependent": 0,
 "still_image": 0,
 "multilayer": 0
 },
 "tags": {
 "language": "und",
 "handler_name": "VideoHandler",
 "vendor_id": "[0][0][0][0]",
 "encoder": "Lavc61.33.100 libx265"
 }
}

out:
{
 "index": 0,
 "codec_name": "hevc",
 "codec_long_name": "H.265 / HEVC (High Efficiency Video Coding)",
 "profile": "Main",
 "codec_type": "video",
 "codec_tag_string": "hev1",
 "codec_tag": "0x31766568",
 "width": 1920,
 "height": 1080,
 "coded_width": 1920,
 "coded_height": 1080,
 "has_b_frames": 2,
 "sample_aspect_ratio": "1:1",
 "display_aspect_ratio": "16:9",
 "pix_fmt": "yuv420p",
 "level": 186,
 "color_range": "tv",
 "chroma_location": "left",
 "field_order": "progressive",
 "refs": 1,
 "view_ids_available": "",
 "view_pos_available": "",
 "id": "0x1",
 "r_frame_rate": "30/1",
 "avg_frame_rate": "147450/5767",
 "time_base": "1/1000000",
 "start_pts": 0,
 "start_time": "0.000000",
 "duration_ts": 115340000,
 "duration": "115.340000",
 "bit_rate": "1060604",
 "nb_frames": "2949",
 "extradata_size": 2500,
 "disposition": {
 "default": 1,
 "dub": 0,
 "original": 0,
 "comment": 0,
 "lyrics": 0,
 "karaoke": 0,
 "forced": 0,
 "hearing_impaired": 0,
 "visual_impaired": 0,
 "clean_effects": 0,
 "attached_pic": 0,
 "timed_thumbnails": 0,
 "non_diegetic": 0,
 "captions": 0,
 "descriptions": 0,
 "metadata": 0,
 "dependent": 0,
 "still_image": 0,
 "multilayer": 0
 },
 "tags": {
 "language": "und",
 "handler_name": "VideoHandler",
 "vendor_id": "[0][0][0][0]",
 "encoder": "Lavc61.33.100 libx265"
 }
}



The output video's level is 6.2? From what I read on the wiki, the level is supposed to depend on fps, resolution and bitrate, but that doesn't seem to fit this output video.
0。0! Could someone help me?


ffmpeg -i .\HEVC_1080p_30P_yellowtree.mp4 -i .\HEVC_1080p_24fps_happy.mp4 -filter_complex "[0:v][1:v]concat=n=2:v=1:a=0[outv]" -map "[outv]" -c:v libx265 concat_output.mp4


ffmpeg version N-118448-g43be8d0728-20250209 Copyright (c) 2000-2025 the FFmpeg developers
built with gcc 14.2.0 (crosstool-NG 1.26.0.120_4d36f27)
configuration: --prefix=/ffbuild/prefix --pkg-config-flags=--static --pkg-config=pkg-config --cross-prefix=x86_64-w64-mingw32- --arch=x86_64 --target-os=mingw32 --enable-gpl --enable-version3 --disable-debug --enable-shared --disable-static --disable-w32threads --enable-pthreads --enable-iconv --enable-zlib --enable-libfreetype --enable-libfribidi --enable-gmp --enable-libxml2 --enable-lzma --enable-fontconfig --enable-libharfbuzz --enable-libvorbis --enable-opencl --disable-libpulse --enable-libvmaf --disable-libxcb --disable-xlib --enable-amf --enable-libaom --enable-libaribb24 --enable-avisynth --enable-chromaprint --enable-libdav1d --enable-libdavs2 --enable-libdvdread --enable-libdvdnav --disable-libfdk-aac --enable-ffnvcodec --enable-cuda-llvm --enable-frei0r --enable-libgme --enable-libkvazaar --enable-libaribcaption --enable-libass --enable-libbluray --enable-libjxl --enable-libmp3lame --enable-libopus --enable-librist --enable-libssh --enable-libtheora --enable-libvpx --enable-libwebp --enable-libzmq --enable-lv2 --enable-libvpl --enable-openal --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenh264 --enable-libopenjpeg --enable-libopenmpt --enable-librav1e --enable-librubberband --enable-schannel --enable-sdl2 --enable-libsnappy --enable-libsoxr --enable-libsrt --enable-libsvtav1 --enable-libtwolame --enable-libuavs3d --disable-libdrm --enable-vaapi --enable-libvidstab --enable-vulkan --enable-libshaderc --enable-libplacebo --disable-libvvenc --enable-libx264 --enable-libx265 --enable-libxavs2 --enable-libxvid --enable-libzimg --enable-libzvbi --extra-cflags=-DLIBTWOLAME_STATIC --extra-cxxflags= --extra-libs=-lgomp --extra-ldflags=-pthread --extra-ldexeflags= --cc=x86_64-w64-mingw32-gcc --cxx=x86_64-w64-mingw32-g++ --ar=x86_64-w64-mingw32-gcc-ar --ranlib=x86_64-w64-mingw32-gcc-ranlib --nm=x86_64-w64-mingw32-gcc-nm --extra-version=20250209
libavutil 59. 56.100 / 59. 56.100
libavcodec 61. 33.100 / 61. 33.100
libavformat 61. 9.107 / 61. 9.107
libavdevice 61. 4.100 / 61. 4.100
libavfilter 10. 9.100 / 10. 9.100
libswscale 8. 13.100 / 8. 13.100
libswresample 5. 4.100 / 5. 4.100
libpostproc 58. 4.100 / 58. 4.100
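
For what it's worth, ffprobe's integer level field for HEVC is 30 times the level number, so 120 is Level 4.0 and 186 is Level 6.2. Mixing a 30 fps and a 25 fps input through concat (note the odd avg_frame_rate of 147450/5767 and the 1/1000000 time base in the output) is a plausible reason for libx265 picking a much higher level on its own. A hedged sketch of normalising both inputs to a single frame rate and pinning the level explicitly; the 30 fps target and the level-idc value are assumptions, not a verified fix:

ffmpeg -i .\HEVC_1080p_30P_yellowtree.mp4 -i .\HEVC_1080p_24fps_happy.mp4 -filter_complex "[0:v]fps=30,settb=AVTB[v0];[1:v]fps=30,settb=AVTB[v1];[v0][v1]concat=n=2:v=1:a=0[outv]" -map "[outv]" -c:v libx265 -x265-params level-idc=4.0 concat_output.mp4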


-
FFmpeg overlay positioning issue : Converting frontend center coordinates to FFmpeg top-left coordinates
25 January, by tarun
I'm building a web-based video editor where users can:


Add multiple videos
Add images
Add text overlays with background color


The frontend sends coordinates where each element's (x, y) represents its center position.
On click of the export button, I want all the data to be exported as one final video.
On click, I send the data to the backend like this:


const exportAllVideos = async () => {
 try {
 const formData = new FormData();
 
 
 const normalizedVideos = videos.map(video => ({
 ...video,
 startTime: parseFloat(video.startTime),
 endTime: parseFloat(video.endTime),
 duration: parseFloat(video.duration)
 })).sort((a, b) => a.startTime - b.startTime);

 
 for (const video of normalizedVideos) {
 const response = await fetch(video.src);
 const blobData = await response.blob();
 const file = new File([blobData], `${video.id}.mp4`, { type: "video/mp4" });
 formData.append("videos", file);
 }

 
 const normalizedImages = images.map(image => ({
 ...image,
 startTime: parseFloat(image.startTime),
 endTime: parseFloat(image.endTime),
 x: parseInt(image.x),
 y: parseInt(image.y),
 width: parseInt(image.width),
 height: parseInt(image.height),
 opacity: parseInt(image.opacity)
 }));

 
 for (const image of normalizedImages) {
 const response = await fetch(image.src);
 const blobData = await response.blob();
 const file = new File([blobData], `${image.id}.png`, { type: "image/png" });
 formData.append("images", file);
 }

 
 const normalizedTexts = texts.map(text => ({
 ...text,
 startTime: parseFloat(text.startTime),
 endTime: parseFloat(text.endTime),
 x: parseInt(text.x),
 y: parseInt(text.y),
 fontSize: parseInt(text.fontSize),
 opacity: parseInt(text.opacity)
 }));

 
 formData.append("metadata", JSON.stringify({
 videos: normalizedVideos,
 images: normalizedImages,
 texts: normalizedTexts
 }));

 const response = await fetch("my_flask_endpoint", {
 method: "POST",
 body: formData
 });

 if (!response.ok) {
 
 console.log('wtf', response);
 
 }

 const finalVideo = await response.blob();
 const url = URL.createObjectURL(finalVideo);
 const a = document.createElement("a");
 a.href = url;
 a.download = "final_video.mp4";
 a.click();
 URL.revokeObjectURL(url);

 } catch (e) {
 console.log(e, "err");
 }
 };



We store the frontend data for each object (text, image and video) as an array of objects; below is the data structure for each object:


// the frontend data for each object type
 const newVideo = {
 id: uuidv4(),
 src: URL.createObjectURL(videoData.videoBlob),
 originalDuration: videoData.duration,
 duration: videoData.duration,
 startTime: 0,
 playbackOffset: 0,
 endTime: videoData.endTime || videoData.duration,
 isPlaying: false,
 isDragging: false,
 speed: 1,
 volume: 100,
 x: window.innerHeight / 2,
 y: window.innerHeight / 2,
 width: videoData.width,
 height: videoData.height,
 };
 const newTextObject = {
 id: uuidv4(),
 description: text,
 opacity: 100,
 x: containerWidth.width / 2,
 y: containerWidth.height / 2,
 fontSize: 18,
 duration: 20,
 endTime: 20,
 startTime: 0,
 color: "#ffffff",
 backgroundColor: hasBG,
 padding: 8,
 fontWeight: "normal",
 width: 200,
 height: 40,
 };

 const newImage = {
 id: uuidv4(),
 src: URL.createObjectURL(imageData),
 x: containerWidth.width / 2,
 y: containerWidth.height / 2,
 width: 200,
 height: 200,
 borderRadius: 0,
 startTime: 0,
 endTime: 20,
 duration: 20,
 opacity: 100,
 };




BACKEND CODE -


import os
import shutil
import subprocess
from flask import Flask, request, send_file
import ffmpeg
import json
from werkzeug.utils import secure_filename
import uuid
from flask_cors import CORS


app = Flask(__name__)
CORS(app, resources={r"/*": {"origins": "*"}})



UPLOAD_FOLDER = 'temp_uploads'
if not os.path.exists(UPLOAD_FOLDER):
 os.makedirs(UPLOAD_FOLDER)


@app.route('/')
def home():
 return 'Hello World'


OUTPUT_WIDTH = 1920
OUTPUT_HEIGHT = 1080



@app.route('/process', methods=['POST'])
def process_video():
 work_dir = None
 try:
 work_dir = os.path.abspath(os.path.join(UPLOAD_FOLDER, str(uuid.uuid4())))
 os.makedirs(work_dir)
 print(f"Created working directory: {work_dir}")

 metadata = json.loads(request.form['metadata'])
 print("Received metadata:", json.dumps(metadata, indent=2))
 
 video_paths = []
 videos = request.files.getlist('videos')
 for idx, video in enumerate(videos):
 filename = f"video_{idx}.mp4"
 filepath = os.path.join(work_dir, filename)
 video.save(filepath)
 if os.path.exists(filepath) and os.path.getsize(filepath) > 0:
 video_paths.append(filepath)
 print(f"Saved video to: {filepath} Size: {os.path.getsize(filepath)}")
 else:
 raise Exception(f"Failed to save video {idx}")

 image_paths = []
 images = request.files.getlist('images')
 for idx, image in enumerate(images):
 filename = f"image_{idx}.png"
 filepath = os.path.join(work_dir, filename)
 image.save(filepath)
 if os.path.exists(filepath):
 image_paths.append(filepath)
 print(f"Saved image to: {filepath}")

 output_path = os.path.join(work_dir, 'output.mp4')

 filter_parts = []

 base_duration = metadata["videos"][0]["duration"] if metadata["videos"] else 10
 filter_parts.append(f'color=c=black:s={OUTPUT_WIDTH}x{OUTPUT_HEIGHT}:d={base_duration}[canvas];')

 for idx, (path, meta) in enumerate(zip(video_paths, metadata['videos'])):
 x_pos = int(meta.get("x", 0) - (meta.get("width", 0) / 2))
 y_pos = int(meta.get("y", 0) - (meta.get("height", 0) / 2))
 
 filter_parts.extend([
 f'[{idx}:v]setpts=PTS-STARTPTS,scale={meta.get("width", -1)}:{meta.get("height", -1)}[v{idx}];',
 f'[{idx}:a]asetpts=PTS-STARTPTS[a{idx}];'
 ])

 if idx == 0:
 filter_parts.append(
 f'[canvas][v{idx}]overlay=x={x_pos}:y={y_pos}:eval=init[temp{idx}];'
 )
 else:
 filter_parts.append(
 f'[temp{idx-1}][v{idx}]overlay=x={x_pos}:y={y_pos}:'
 f'enable=\'between(t,{meta["startTime"]},{meta["endTime"]})\':eval=init'
 f'[temp{idx}];'
 )

 last_video_temp = f'temp{len(video_paths)-1}'

 if video_paths:
 audio_mix_parts = []
 for idx in range(len(video_paths)):
 audio_mix_parts.append(f'[a{idx}]')
 filter_parts.append(f'{"".join(audio_mix_parts)}amix=inputs={len(video_paths)}[aout];')

 
 if image_paths:
 for idx, (img_path, img_meta) in enumerate(zip(image_paths, metadata['images'])):
 input_idx = len(video_paths) + idx
 
 
 x_pos = int(img_meta["x"] - (img_meta["width"] / 2))
 y_pos = int(img_meta["y"] - (img_meta["height"] / 2))
 
 filter_parts.extend([
 f'[{input_idx}:v]scale={img_meta["width"]}:{img_meta["height"]}[img{idx}];',
 f'[{last_video_temp}][img{idx}]overlay=x={x_pos}:y={y_pos}:'
 f'enable=\'between(t,{img_meta["startTime"]},{img_meta["endTime"]})\':'
 f'alpha={img_meta["opacity"]/100}[imgout{idx}];'
 ])
 last_video_temp = f'imgout{idx}'

 if metadata.get('texts'):
 for idx, text in enumerate(metadata['texts']):
 next_output = f'text{idx}' if idx < len(metadata['texts']) - 1 else 'vout'
 
 escaped_text = text["description"].replace("'", "\\'")
 
 x_pos = int(text["x"] - (text["width"] / 2))
 y_pos = int(text["y"] - (text["height"] / 2))
 
 text_filter = (
 f'[{last_video_temp}]drawtext=text=\'{escaped_text}\':'
 f'x={x_pos}:y={y_pos}:'
 f'fontsize={text["fontSize"]}:'
 f'fontcolor={text["color"]}'
 )
 
 if text.get('backgroundColor'):
 text_filter += f':box=1:boxcolor={text["backgroundColor"]}:boxborderw=5'
 
 if text.get('fontWeight') == 'bold':
 text_filter += ':font=Arial-Bold'
 
 text_filter += (
 f':enable=\'between(t,{text["startTime"]},{text["endTime"]})\''
 f'[{next_output}];'
 )
 
 filter_parts.append(text_filter)
 last_video_temp = next_output
 else:
 filter_parts.append(f'[{last_video_temp}]null[vout];')

 
 filter_complex = ''.join(filter_parts)

 
 cmd = [
 'ffmpeg',
 *sum([['-i', path] for path in video_paths], []),
 *sum([['-i', path] for path in image_paths], []),
 '-filter_complex', filter_complex,
 '-map', '[vout]'
 ]
 
 
 if video_paths:
 cmd.extend(['-map', '[aout]'])
 
 cmd.extend(['-y', output_path])

 print(f"Running ffmpeg command: {' '.join(cmd)}")
 result = subprocess.run(cmd, capture_output=True, text=True)
 
 if result.returncode != 0:
 print(f"FFmpeg error output: {result.stderr}")
 raise Exception(f"FFmpeg processing failed: {result.stderr}")

 return send_file(
 output_path,
 mimetype='video/mp4',
 as_attachment=True,
 download_name='final_video.mp4'
 )

 except Exception as e:
 print(f"Error in video processing: {str(e)}")
 return {'error': str(e)}, 500
 
 finally:
 if work_dir and os.path.exists(work_dir):
 try:
 print(f"Directory contents before cleanup: {os.listdir(work_dir)}")
 if not os.environ.get('FLASK_DEBUG'):
 shutil.rmtree(work_dir)
 else:
 print(f"Keeping directory for debugging: {work_dir}")
 except Exception as e:
 print(f"Cleanup error: {str(e)}")

 
if __name__ == '__main__':
 app.run(debug=True, port=8000)




I'm also attaching what the final thing looks like on the frontend web vs in the downloaded video
and as u can see the downloaded video has all coords and positions messed up be it of the texts, images as well as videos
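
A minimal sketch of the coordinate math that has to happen somewhere in the pipeline, assuming a hypothetical 1280x720 editor canvas and the 1920x1080 output: scale the centre point from canvas space into output space first, and only then convert it to the top-left x/y that FFmpeg's overlay filter expects. All filenames and sizes below are illustrative only:

# Illustrative only: a 200x200 image centred at (640,360) on a 1280x720 canvas,
# placed on a 1920x1080 output. In a real pipeline the overlay width/height
# would need the same canvas-to-output scaling as the centre point.
CANVAS_W=1280; CANVAS_H=720
OUT_W=1920; OUT_H=1080
CX=640; CY=360; IMG_W=200; IMG_H=200

X=$(( CX * OUT_W / CANVAS_W - IMG_W / 2 ))   # 960 - 100 = 860
Y=$(( CY * OUT_H / CANVAS_H - IMG_H / 2 ))   # 540 - 100 = 440

ffmpeg -i base_video.mp4 -i logo.png -filter_complex "[1:v]scale=${IMG_W}:${IMG_H}[img];[0:v][img]overlay=x=${X}:y=${Y}" -c:a copy positioned_output.mp4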




Can somebody please help me figure this out? :)