
Recherche avancée
Médias (1)
-
Collections - Formulaire de création rapide
19 février 2013, par
Mis à jour : Février 2013
Langue : français
Type : Image
Autres articles (22)
-
Publier sur MédiaSpip
13 juin 2013Puis-je poster des contenus à partir d’une tablette Ipad ?
Oui, si votre Médiaspip installé est à la version 0.2 ou supérieure. Contacter au besoin l’administrateur de votre MédiaSpip pour le savoir -
Ajouter notes et légendes aux images
7 février 2011, parPour pouvoir ajouter notes et légendes aux images, la première étape est d’installer le plugin "Légendes".
Une fois le plugin activé, vous pouvez le configurer dans l’espace de configuration afin de modifier les droits de création / modification et de suppression des notes. Par défaut seuls les administrateurs du site peuvent ajouter des notes aux images.
Modification lors de l’ajout d’un média
Lors de l’ajout d’un média de type "image" un nouveau bouton apparait au dessus de la prévisualisation (...) -
Organiser par catégorie
17 mai 2013, parDans MédiaSPIP, une rubrique a 2 noms : catégorie et rubrique.
Les différents documents stockés dans MédiaSPIP peuvent être rangés dans différentes catégories. On peut créer une catégorie en cliquant sur "publier une catégorie" dans le menu publier en haut à droite ( après authentification ). Une catégorie peut être rangée dans une autre catégorie aussi ce qui fait qu’on peut construire une arborescence de catégories.
Lors de la publication prochaine d’un document, la nouvelle catégorie créée sera proposée (...)
Sur d’autres sites (4880)
-
ffmpeg video slides vertically after 'Invalid buffer size, packet size expected frame_size' error (vsync screen tearing glitch) ?
7 juillet 2020, par GlabbichRulzI have a video which i want to cut and crop using opencv and ffmpeg.




I want the output to be H265, so i am using a ffmpeg subprocess (writing frame bytes to stdin) as explained here. This is a minimum version of my code that leads to the error :


import os, shlex, subprocess, cv2, imutils

VIDEO_DIR = '../SoExample' # should contain a file 'in.mpg'
TIMESPAN = (3827, 3927) # cut to this timespan (frame numbers)
CROP = dict(min_x=560, max_x=731, min_y=232, max_y=418) # crop to this area

# calculate output video size
size = (CROP['max_x']-CROP['min_x'], CROP['max_y']-CROP['min_y']) # (w,h)
# ffmpeg throws an error when having odd dimensions that are not dividable by 2,
# so i just add a pixel to the size and stretch the original image by 1 pixel later.
size_rounded = (size[0]+1 if size[0] % 2 != 0 else size[0],
 size[1]+1 if size[1] % 2 != 0 else size[1])

# read input video
vid_path_in = os.path.join(VIDEO_DIR, 'in.mpg')
cap = cv2.VideoCapture(vid_path_in)
fps = int(cap.get(cv2.CAP_PROP_FPS))

# generate and run ffmpeg command
ffmpeg_cmd = (f'/usr/bin/ffmpeg -y -s {size_rounded[0]}x{size_rounded[1]} -pixel_format'
 + f' bgr24 -f rawvideo -r {fps} -re -i pipe: -vcodec libx265 -pix_fmt yuv420p'
 + f' -crf 24 -x265-params "ctu=64" "{os.path.join(VIDEO_DIR, "out.mp4")}"')
print("using cmd", ffmpeg_cmd)
process = subprocess.Popen(shlex.split(ffmpeg_cmd), stdin=subprocess.PIPE)

# seek to the beginning of the cutting timespan and loop through frames of input video
cap.set(cv2.CAP_PROP_POS_FRAMES, TIMESPAN[0])
frame_returned = True
while cap.isOpened() and frame_returned:
 frame_returned, frame = cap.read()
 frame_number = cap.get(cv2.CAP_PROP_POS_FRAMES) - 1

 # check if timespan end is not reached yet
 if frame_number < TIMESPAN[1]:

 # crop to relevant image area
 frame_cropped = frame[CROP['min_y']:CROP['max_y'],
 CROP['min_x']:CROP['max_x']]

 # resize to even frame size if needed:
 if size != size_rounded:
 frame_cropped = imutils.resize(frame_cropped, width=size_rounded[0],
 height=size_rounded[1])

 # Show processed image using opencv: I see no errors here.
 cv2.imshow('Frame', frame_cropped)

 # Write cropped video frame to input stream of ffmpeg sub-process.
 process.stdin.write(frame_cropped.tobytes())
 else:
 break

 # Press Q on keyboard to exit earlier
 if cv2.waitKey(25) & 0xFF == ord('q'):
 break

process.stdin.close() # Close and flush stdin
process.wait() # Wait for sub-process to finish
process.terminate() # Terminate the sub-process

print("Done!")



Unfortunately, my output looks like this :




The output should not include this vertical sliding glitch. Does anyone know how to fix it ?


My console output for aboves script shows :


using cmd /usr/bin/ffmpeg -y -s 172x186 -pixel_format bgr24 -f rawvideo -r 23 -i pipe: -vcodec libx265 -pix_fmt yuv420p -crf 24 -x265-params "ctu=64" "../SoExample/out.mp4"
ffmpeg version 3.4.6-0ubuntu0.18.04.1 Copyright (c) 2000-2019 the FFmpeg developers
 built with gcc 7 (Ubuntu 7.3.0-16ubuntu3)
 configuration: --prefix=/usr --extra-version=0ubuntu0.18.04.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --enable-gpl --disable-stripping --enable-avresample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librubberband --enable-librsvg --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-omx --enable-openal --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libopencv --enable-libx264 --enable-shared
 WARNING: library configuration mismatch
 avcodec configuration: --prefix=/usr --extra-version=0ubuntu0.18.04.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --enable-gpl --disable-stripping --enable-avresample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librubberband --enable-librsvg --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-omx --enable-openal --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libopencv --enable-libx264 --enable-shared --enable-version3 --disable-doc --disable-programs --enable-libopencore_amrnb --enable-libopencore_amrwb --enable-libtesseract --enable-libvo_amrwbenc
 libavutil 55. 78.100 / 55. 78.100
 libavcodec 57.107.100 / 57.107.100
 libavformat 57. 83.100 / 57. 83.100
 libavdevice 57. 10.100 / 57. 10.100
 libavfilter 6.107.100 / 6.107.100
 libavresample 3. 7. 0 / 3. 7. 0
 libswscale 4. 8.100 / 4. 8.100
 libswresample 2. 9.100 / 2. 9.100
 libpostproc 54. 7.100 / 54. 7.100
Input #0, rawvideo, from 'pipe:':
 Duration: N/A, start: 0.000000, bitrate: 17659 kb/s
 Stream #0:0: Video: rawvideo (BGR[24] / 0x18524742), bgr24, 172x186, 17659 kb/s, 23 tbr, 23 tbn, 23 tbc
Stream mapping:
 Stream #0:0 -> #0:0 (rawvideo (native) -> hevc (libx265))
x265 [info]: HEVC encoder version 2.6
x265 [info]: build info [Linux][GCC 7.2.0][64 bit] 8bit+10bit+12bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
x265 [info]: Main profile, Level-2 (Main tier)
x265 [info]: Thread pool created using 4 threads
x265 [info]: Slices : 1
x265 [info]: frame threads / pool features : 2 / wpp(3 rows)
x265 [warning]: Source height < 720p; disabling lookahead-slices
x265 [info]: Coding QT: max CU size, min CU size : 64 / 8
x265 [info]: Residual QT: max TU size, max depth : 32 / 1 inter / 1 intra
x265 [info]: ME / range / subpel / merge : hex / 57 / 2 / 2
x265 [info]: Keyframe min / max / scenecut / bias: 23 / 250 / 40 / 5.00
x265 [info]: Lookahead / bframes / badapt : 20 / 4 / 2
x265 [info]: b-pyramid / weightp / weightb : 1 / 1 / 0
x265 [info]: References / ref-limit cu / depth : 3 / on / on
x265 [info]: AQ: mode / str / qg-size / cu-tree : 1 / 1.0 / 32 / 1
x265 [info]: Rate Control / qCompress : CRF-24.0 / 0.60
x265 [info]: tools: rd=3 psy-rd=2.00 rskip signhide tmvp strong-intra-smoothing
x265 [info]: tools: deblock sao
Output #0, mp4, to '../SoExample/out.mp4':
 Metadata:
 encoder : Lavf57.83.100
 Stream #0:0: Video: hevc (libx265) (hev1 / 0x31766568), yuv420p, 172x186, q=2-31, 23 fps, 11776 tbn, 23 tbc
 Metadata:
 encoder : Lavc57.107.100 libx265
[rawvideo @ 0x564ebd221aa0] Invalid buffer size, packet size 51600 < expected frame_size 95976 
Error while decoding stream #0:0: Invalid argument
frame= 100 fps= 30 q=-0.0 Lsize= 36kB time=00:00:04.21 bitrate= 69.1kbits/s speed=1.25x 
video:32kB audio:0kB subtitle:0kB other streams:0kB global headers:2kB muxing overhead: 12.141185%
x265 [info]: frame I: 1, Avg QP:22.44 kb/s: 179.77 
x265 [info]: frame P: 29, Avg QP:24.20 kb/s: 130.12 
x265 [info]: frame B: 70, Avg QP:29.99 kb/s: 27.82 
x265 [info]: Weighted P-Frames: Y:0.0% UV:0.0%
x265 [info]: consecutive B-frames: 20.0% 3.3% 16.7% 43.3% 16.7% 

encoded 100 frames in 3.35s (29.83 fps), 59.01 kb/s, Avg QP:28.23
Done!



As you can see, there is an error :
Invalid buffer size, packet size 51600 < expected frame_size 95976 Error while decoding stream #0:0: Invalid argument
, do you think this could be the cause to the shown problem ? I am not sure, as in the end, it tells me all 100 frames would have been encoded.

In case you want to reproduce this on the exact same video, you can find
actions1.mpg
in the UCF Aerial Action Dataset.

I would greatly appreciate any help as i am really stuck on this error.


-
Turn off sw_scale conversion to planar YUV 32 byte alignment requirements
8 novembre 2022, par flanselI am experiencing artifacts on the right edge of scaled and converted images when converting into planar YUV pixel formats with sw_scale. I am reasonably sure (although I can not find it anywhere in the documentation) that this is because sw_scale is using an optimization for 32 byte aligned lines, in the destination. However I would like to turn this off because I am using sw_scale for image composition, so even though the destination lines may be 32 byte aligned, the output image may not be.


Example.


Full output frame is 1280x720 yuv422p10le. (this is 32 byte aligned)
However into the top left corner I am scaling an image with an outwidth of 1280 / 3 = 426.
426 in this format is not 32 byte aligned, but I believe sw_scale sees that the output linesize is 32 byte aligned and overwrites the width of 426 putting garbage in the next 22 bytes of data thinking this is simply padding when in my case this is displayable area.


This is why I need to actually disable this optimization or somehow trick sw_scale into believing it does not apply while keeping intact the way the program works, which is otherwise fine.


I have tried adding extra padding to the destination lines so they are no longer 32 byte aligned,
this did not help as far as I can tell.


Edit with code Example. Rendering omitted for ease of use.
Also here is a similar issue, unfortunately as I stated there fix will not work for my use case. https://github.com/obsproject/obs-studio/pull/2836


Use the commented line of code to swap between a output width which is and isnt 32 byte aligned.


#include "libswscale/swscale.h"
#include "libavutil/imgutils.h"
#include "libavutil/pixelutils.h"
#include "libavutil/pixfmt.h"
#include "libavutil/pixdesc.h"
#include 
#include 
#include 

int main(int argc, char **argv) {

/// Set up a 1280x720 window, and an item with 1/3 width and height of the window.
int window_width, window_height, item_width, item_height;
window_width = 1280;
window_height = 720;
item_width = (window_width / 3);
item_height = (window_height / 3);

int item_out_width = item_width;
/// This line sets the item width to be 32 byte aligned uncomment to see uncorrupted results
/// Note %16 because outformat is 2 bytes per component
//item_out_width -= (item_width % 16);

enum AVPixelFormat outformat = AV_PIX_FMT_YUV422P10LE;
enum AVPixelFormat informat = AV_PIX_FMT_UYVY422;
int window_lines[4] = {0};
av_image_fill_linesizes(window_lines, outformat, window_width);

uint8_t *window_planes[4] = {0};
window_planes[0] = calloc(1, window_lines[0] * window_height);
window_planes[1] = calloc(1, window_lines[1] * window_height);
window_planes[2] = calloc(1, window_lines[2] * window_height); /// Fill the window with all 0s, this is green in yuv.


int item_lines[4] = {0};
av_image_fill_linesizes(item_lines, informat, item_width);

uint8_t *item_planes[4] = {0};
item_planes[0] = malloc(item_lines[0] * item_height);
memset(item_planes[0], 100, item_lines[0] * item_height);

struct SwsContext *ctx;
ctx = sws_getContext(item_width, item_height, informat,
 item_out_width, item_height, outformat, SWS_FAST_BILINEAR, NULL, NULL, NULL);

/// Check a block in the normal region
printf("Pre scale normal region %d %d %d\n", (int)((uint16_t*)window_planes[0])[0], (int)((uint16_t*)window_planes[1])[0],
 (int)((uint16_t*)window_planes[2])[0]);

/// Check a block in the corrupted region (should be all zeros) These values should be out of the converted region
int corrupt_offset_y = (item_out_width + 3) * 2; ///(item_width + 3) * 2 bytes per component Y PLANE
int corrupt_offset_uv = (item_out_width + 3); ///(item_width + 3) * (2 bytes per component rshift 1 for horiz scaling) U and V PLANES

printf("Pre scale corrupted region %d %d %d\n", (int)(*((uint16_t*)(window_planes[0] + corrupt_offset_y))),
 (int)(*((uint16_t*)(window_planes[1] + corrupt_offset_uv))), (int)(*((uint16_t*)(window_planes[2] + corrupt_offset_uv))));
sws_scale(ctx, (const uint8_t**)item_planes, item_lines, 0, item_height,window_planes, window_lines);

/// Preform same tests after scaling
printf("Post scale normal region %d %d %d\n", (int)((uint16_t*)window_planes[0])[0], (int)((uint16_t*)window_planes[1])[0],
 (int)((uint16_t*)window_planes[2])[0]);
printf("Post scale corrupted region %d %d %d\n", (int)(*((uint16_t*)(window_planes[0] + corrupt_offset_y))),
 (int)(*((uint16_t*)(window_planes[1] + corrupt_offset_uv))), (int)(*((uint16_t*)(window_planes[2] + corrupt_offset_uv))));

return 0;



}


Example Output:

//No alignment
Pre scale normal region 0 0 0
Pre scale corrupted region 0 0 0
Post scale normal region 400 400 400
Post scale corrupted region 512 36865 36865

//With alignment
Pre scale normal region 0 0 0
Pre scale corrupted region 0 0 0
Post scale normal region 400 400 400
Post scale corrupted region 0 0 0



-
How to make customizable subtitles with FFmpeg ?
6 août 2022, par Alex RypunI need to generate videos with text (aka subtitles) and styles provided by users.
The styles are a few fonts, text border, text background, text shadow, colors, position on the video, etc.


As I understand there are 2 filters I can use :
drawtext
andsubtitles
.

subtitles
is easier to use but it's not fully customizable. For example, I can't add shadow and background for the same text.

drawtext
is more customizable but quite problematic.

I've implemented almost everything I need with
drawtext
but have one problem : multiline text with a background.

boxborderw
parameter adds specified pixels count from all 4 sides from the extreme text point. I need to make backgrounds touch each other (pixel perfect). That means I have to position text lines with the same space between extreme text points (not between baselines). With crazy workarounds I solved it.
Everything almost works but sometimes some strange border appears between lines :



Spending ages investigating I figured out that it depends on the actual line height and the position on the video.
Each letter has its own distances between baseline and the highest point and between baseline and the lowest point.
The whole line height is the difference between the highest point of the highest symbol and the lowest point of the lowest symbol :




Now the most interesting.


For lines that have even pixels height.


If the line is positioned on the odd pixel (y-axis) and has an odd
boxborderw
(or positioned on the even pixel (y-axis) and has an evenboxborderw
), then it's rendered as expected (without any additional borders).
In other cases, the thin dark line is noticeable on the contact line (it's rendered either on top or at bottom of each text block :




For lines that have odd pixels height.


In all cases the thin dark line is noticeable. Depending on y coordinate (odd or even) and
boxborderw
value (odd or even) that magic line appears on top or bottom :



I can make text lines overlap a bit. This solves the problem but adds another problem.
I use fade-in/out by smoothly changing
alfa
(transparency). When the text becomes semi-transparent the overlapped area has a different color :



Here is the command I use :


ffmpeg "-y" "-f" "lavfi" "-i" "color=#ffffff" "-filter_complex" \
"[0:v]loop=-1:1:0,trim=duration=2.00,format=yuv420p,scale=960:540,setdar=16/9[video];\
[video]drawtext=text='qqq':\
fontfile=ComicSansMS.ttf:\
fontcolor=#ffffff:\
fontsize=58:\
bordercolor=#000000:\
borderw=2:\
box=1:\
boxcolor=#ff0000:\
boxborderw=15:\
x=(w-text_w)/2:\
y=105:\
alpha='if(lt(t,0),0,if(lt(t,0.5),(t-0)/0.5,if(lt(t,1.49),1,if(lt(t,1.99),(0.5-(t-1.49))/0.5,0))))',\
drawtext=text='qqq':\
fontfile=ComicSansMS.ttf:\
fontcolor=#ffffff:\
fontsize=58:\
bordercolor=#000000:\
borderw=2:\
box=1:\
boxcolor=#ff0000:\
boxborderw=15:\
x=(w-text_w)/2:\
y=182:\
alpha='if(lt(t,0),0,if(lt(t,0.5),(t-0)/0.5,if(lt(t,1.49),1,if(lt(t,1.99),(0.5-(t-1.49))/0.5,0))))'[video]" \
"-vsync" "2" "-map" "[video]" "-r" "25" "output_multiline.mp4"



I tried to draw rectangles with
drawbox
following the same logic (changing the position and height). But there were no problems.

Does anybody know what is the nature of that dark line ?
And how it can depend on the even/odd of the height and position ?
What should I investigate to figure out such a behavior ?


UPD :


Just accidentally figured out that changing the pixel format (from
format=yuv420p
toformat=uyvy422
or many others) solved the problem. At least on my test commands.
Now will learn about what is pixel format)