Newest 'ffmpeg' Questions - Stack Overflow

http://stackoverflow.com/questions/tagged/ffmpeg

Articles published on the site

  • Dynamic ffmpeg crop, scale & encoding code seems to break when the crop size changes

    28 April, by Blindy

    The following code works perfectly as long as I only move the crop rectangle; however, as soon as I change its size, I no longer get frames out of my filter (av_buffersink_get_frame returns -11, i.e. AVERROR(EAGAIN)). Strangely, even after the size changes, if it eventually changes back to the original size, that frame goes through, and then it stops providing frames again.

    Would anyone happen to know what I'm doing wrong?

    My filter setup (note the crop & scale combination; it should, I think, scale whatever I crop to the output video size):

    // buffer source -> buffer sink setup
    auto args = std::format("video_size={}x{}:pix_fmt={}:time_base={}/{}:pixel_aspect={}/{}",
        inputCodecContext->width, inputCodecContext->height, (int)inputCodecContext->pix_fmt,
        inputCodecContext->pkt_timebase.num, inputCodecContext->pkt_timebase.den,
        inputCodecContext->sample_aspect_ratio.num, inputCodecContext->sample_aspect_ratio.den);
    
    AVFilterContext* buffersrc_ctx = nullptr, * buffersink_ctx = nullptr;
    check_av_result(avfilter_graph_create_filter(&buffersrc_ctx, bufferSource, "in",
        args.c_str(), nullptr, &*filterGraph));
    check_av_result(avfilter_graph_create_filter(&buffersink_ctx, bufferSink, "out",
        nullptr, nullptr, &*filterGraph));
    check_av_result(av_opt_set_bin(buffersink_ctx, "pix_fmts",
        (uint8_t*)&outputCodecContext->pix_fmt, sizeof(outputCodecContext->pix_fmt), AV_OPT_SEARCH_CHILDREN));
    
    // filter command setup
    auto filterSpec = std::format("crop,scale={}:{},setsar=1:1", outputCodecContext->width, outputCodecContext->height);
    
    check_av_result(avfilter_graph_parse_ptr(&*filterGraph, filterSpec.c_str(), &filterInputs, &filterOutputs, nullptr));
    check_av_result(avfilter_graph_config(&*filterGraph, nullptr));
    

    Frame cropping:

    check_av_result(avfilter_graph_send_command(&*filterGraph, "crop", "x", std::to_string(cropRectangle.CenterX() - cropRectangle.Width() / 2).c_str(), nullptr, 0, 0));
    check_av_result(avfilter_graph_send_command(&*filterGraph, "crop", "y", std::to_string(cropRectangle.CenterY() - cropRectangle.Height() / 2).c_str(), nullptr, 0, 0));
    check_av_result(avfilter_graph_send_command(&*filterGraph, "crop", "w", std::to_string(cropRectangle.Width()).c_str(), nullptr, 0, 0));
    check_av_result(avfilter_graph_send_command(&*filterGraph, "crop", "h", std::to_string(cropRectangle.Height()).c_str(), nullptr, 0, 0));
    
    // push the decoded frame into the filter graph
    check_av_result(av_buffersrc_add_frame_flags(buffersrc_ctx, &*inputFrame, 0));
    
    // pull filtered frames from the filter graph
    while (1)
    {
        ret = av_buffersink_get_frame(buffersink_ctx, &*filteredFrame);
        if (ret < 0)
        {
            // if no more frames, reset the return code to 0 to indicate normal completion
            if (ret == AVERROR(EAGAIN) || ret == AVERROR_EOF)
                ret = 0;
            break;
        }
    
        // write the filtered frame to the output file 
        // [...]
    }
    

    I also set the output video size before creating the file, and it is obeyed as expected:

    outputCodecContext->width = (int)output.PixelSize().Width;
    outputCodecContext->height = (int)output.PixelSize().Height;
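
    A plausible explanation (an assumption, not a confirmed diagnosis) is that filter link dimensions are negotiated once, when avfilter_graph_config runs, so frames whose cropped size differs from the configured one are not renegotiated downstream. The usual workaround is to tear the graph down and rebuild it whenever the crop size (rather than just its position) changes. A minimal sketch of that flow, written with PyAV (the Python av package) for brevity; the stream and size variables are hypothetical:

    import av

    def build_graph(template_stream, crop_w, crop_h, crop_x, crop_y, out_w, out_h):
        # Fresh buffer -> crop -> scale -> buffersink graph, configured for one crop size.
        graph = av.filter.Graph()
        src = graph.add_buffer(template=template_stream)
        crop = graph.add("crop", f"w={crop_w}:h={crop_h}:x={crop_x}:y={crop_y}")
        scale = graph.add("scale", f"{out_w}:{out_h}")
        sink = graph.add("buffersink")
        src.link_to(crop)
        crop.link_to(scale)
        scale.link_to(sink)
        graph.configure()
        return graph

    # In the decode loop: rebuild only when the crop *size* changes; position-only
    # moves can keep going through send_command as before.
    # if (rect.w, rect.h) != last_size:
    #     graph = build_graph(stream, rect.w, rect.h, rect.x, rect.y, 1280, 720)
    #     last_size = (rect.w, rect.h)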
    
  • adb screenrecord displays only a screenshot, it does not stream the screen

    28 April, by hexols

    I have an Android TV and I want to stream its screen to my Ubuntu PC. I used this command:

    adb shell screenrecord --output-format=h264 - | ffplay -
    

    and after waiting for a while it displays a single screenshot of the TV. But I want to display a live stream of the Android TV. I tried the following command as well, but got the same result:

    adb exec-out screenrecord --bit-rate=16m --output-format=h264 --size 800x600 - | ffplay -framerate 60 -framedrop -bufsize 16M -
    

    How can I achieve this with this command? Or is there another way to do it using VLC/GStreamer/FFmpeg, short of using scrcpy/vysor?
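
    One commonly suggested mitigation (the flag set is an assumption, not something verified on this TV) is to stop ffplay buffering while it probes the stream, since by default it reads a sizeable probe window before rendering anything. A sketch of the same pipeline driven from Python, assuming adb and ffplay are on the PATH:

    import subprocess

    # Pipe screenrecord's raw H.264 into ffplay with probing and buffering
    # cut to the minimum, so rendering starts as soon as frames arrive.
    adb = subprocess.Popen(
        ["adb", "exec-out", "screenrecord", "--output-format=h264", "-"],
        stdout=subprocess.PIPE)
    ffplay = subprocess.Popen(
        ["ffplay", "-fflags", "nobuffer", "-flags", "low_delay",
         "-probesize", "32", "-analyzeduration", "0", "-framedrop", "-"],
        stdin=adb.stdout)
    adb.stdout.close()  # ffplay now owns the read end of the pipe
    ffplay.wait()

    Note also that screenrecord stops on its own after its built-in time limit (180 seconds on many builds), so the stream ending early is not necessarily ffplay's fault.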

  • How to Convert 16:9 Video to 9:16 Ratio While Ensuring Speaker Presence in Frame?

    28 April, by shreesha

    I have tried many times to fix the face detection, and it is also not smooth enough compared to other tools out there.

    So basically I am using Python and YOLO in this project, and I want the person who is talking to be the ROI (region of interest).

    Here is the code:

    from ultralytics import YOLO
    from ultralytics.engine.results import Results
    from moviepy.editor import VideoFileClip, concatenate_videoclips
    from moviepy.video.fx.crop import crop
    
    # Load the YOLOv8 model
    model = YOLO("yolov8n.pt")
    
    # Load the input video
    clip = VideoFileClip("short_test.mp4")
    
    tracked_clips = []
    
    for frame_no, frame in enumerate(clip.iter_frames()):
        # Process the frame
        results: list[Results] = model(frame)
    
        # Get the bounding box of the main object
        if results[0].boxes:
            objects = results[0].boxes
            main_obj = max(
                objects, key=lambda x: x.conf
            )  # take the highest-confidence detection as the main object
    
            x1, y1, x2, y2 = [int(val) for val in main_obj.xyxy[0].tolist()]
    
            # Calculate the crop region based on the object's position and the target aspect ratio
            w, h = clip.size
            new_w = int(h * 9 / 16)
            new_h = h
    
            # centre of the detected box (x2 - x1 / y2 - y1 would be its
            # width/height, not its centre)
            x_center = (x1 + x2) // 2
            y_center = (y1 + y2) // 2
    
            # Adjust x_center and y_center if they would cause the crop region to exceed the bounds
            if x_center + (new_w / 2) > w:
                x_center -= x_center + (new_w / 2) - w
            elif x_center - (new_w / 2) < 0:
                x_center += abs(x_center - (new_w / 2))
    
            if y_center + (new_h / 2) > h:
                y_center -= y_center + (new_h / 2) - h
            elif y_center - (new_h / 2) < 0:
                y_center += abs(y_center - (new_h / 2))
    
            # Create a subclip for the current frame
            start_time = frame_no / clip.fps
            end_time = (frame_no + 1) / clip.fps
            subclip = clip.subclip(start_time, end_time)
    
            # Apply cropping using MoviePy
            cropped_clip = crop(
                subclip, x_center=x_center, y_center=y_center, width=new_w, height=new_h
            )
    
            tracked_clips.append(cropped_clip)
    
    reframed_clip = concatenate_videoclips(tracked_clips, method="compose")
    reframed_clip.write_videofile("output_video.mp4")
    

    So basically I want to fix the face detection with ROI detection, so that it detects the face, keeps the face and body in the frame, and makes sure the speaker who is speaking is brought into the frame.
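
    On the smoothness point, a common trick (a minimal sketch with made-up names, not tested against this pipeline) is to low-pass filter the detected centre across frames instead of cropping on each raw detection, e.g. an exponential moving average:

    # Sketch: exponential moving average over per-frame crop centres.
    # alpha near 0 -> very smooth but slow to follow the speaker;
    # alpha near 1 -> responsive but jittery.
    def smooth_centers(centers, alpha=0.2):
        smoothed, prev = [], None
        for cx, cy in centers:
            if prev is None:
                prev = (float(cx), float(cy))
            else:
                prev = (alpha * cx + (1 - alpha) * prev[0],
                        alpha * cy + (1 - alpha) * prev[1])
            smoothed.append(prev)
        return smoothed

    Collecting the YOLO centres first, smoothing them, and only then cropping would also avoid creating one subclip per frame, which is very slow in MoviePy.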

  • Can't find error in function for changing sampling rate [closed]

    28 April, by kitty uwu

    I have a function for changing the sampling rate of audio (one channel only):

    int change_sampling_rate(float *audio_input, int input_sample_rate, int output_sample_rate, int input_num_of_samples, float **audio_output, int *result_num_of_samples) {
        AVChannelLayout src_ch_layout = AV_CHANNEL_LAYOUT_MONO;
        AVChannelLayout dst_ch_layout = AV_CHANNEL_LAYOUT_MONO;
    
        struct SwrContext *swr_ctx;
        swr_ctx = swr_alloc();
        int ret;
        if (!swr_ctx) {
            fprintf(stderr, "Could not allocate resampler context\n");
            return AVERROR(ENOMEM);
        }
    
        av_opt_set_chlayout(swr_ctx, "in_chlayout",    &src_ch_layout, 0);
        av_opt_set_int(swr_ctx, "in_sample_rate",       input_sample_rate, 0);
        av_opt_set_sample_fmt(swr_ctx, "in_sample_fmt", AV_SAMPLE_FMT_FLT, 0);
    
        av_opt_set_chlayout(swr_ctx, "out_chlayout",    &dst_ch_layout, 0);
        av_opt_set_int(swr_ctx, "out_sample_rate",       output_sample_rate, 0);
        av_opt_set_sample_fmt(swr_ctx, "out_sample_fmt", AV_SAMPLE_FMT_FLT, 0);
    
        if ((ret = swr_init(swr_ctx)) < 0) {
            fprintf(stderr, "Failed to initialize the resampling context\n");
            swr_free(&swr_ctx);
            return -1;
        }
    
        int output_samples_count = av_rescale_rnd(swr_get_delay(swr_ctx, input_sample_rate) + input_num_of_samples, output_sample_rate, input_sample_rate, AV_ROUND_UP);
        uint8_t **resampled_data = NULL;
        if (av_samples_alloc_array_and_samples(&resampled_data, NULL, 1, output_samples_count, AV_SAMPLE_FMT_FLT, 0) < 0) {
            fprintf(stderr, "Could not allocate resampled data\n");
            swr_free(&swr_ctx);
            return -1;
        }
    
        const uint8_t *in_samples[1] = {(const uint8_t *)audio_input};
        int frame_count = swr_convert(swr_ctx, resampled_data, output_samples_count, in_samples, input_num_of_samples);
    
        if (frame_count < 0) {
            fprintf(stderr, "Error while resampling\n");
            av_freep(&resampled_data[0]);
            av_freep(&resampled_data);  // the array itself was allocated by av_malloc, not malloc
            swr_free(&swr_ctx);
            return -1;
        }
    
        *audio_output = (float *) malloc(frame_count * sizeof(float));
        if (!*audio_output) {
            fprintf(stderr, "Could not allocate memory for output\n");
            av_freep(&resampled_data[0]);
            av_freep(&resampled_data);
            swr_free(&swr_ctx);
            return -1;
        }
    
        memcpy(*audio_output, resampled_data[0], frame_count * sizeof(float));
    
        *result_num_of_samples = frame_count;
        av_freep(&resampled_data[0]);
        av_freep(&resampled_data);
        swr_free(&swr_ctx);
        return SUCCESS;
    }
    

    When I run tests on the time lag between two files (mp3) with different sampling rates, the answer differs from the correct one by about 15-20 ms. Can anybody please help me find the mistake in the code?
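
    For scale, a quick back-of-the-envelope check (just arithmetic): 15-20 ms corresponds to several hundred samples at common rates, the same order of magnitude as the samples a resampler keeps buffered internally. Note that the function never drains the SwrContext with a final swr_convert(swr_ctx, resampled_data, output_samples_count, NULL, 0) call, so those tail samples are dropped; whether that fully explains the offset is an assumption worth testing.

    # How many samples does a 15-20 ms offset represent?
    for rate in (44100, 48000):
        for lag_ms in (15, 20):
            print(f"{rate} Hz, {lag_ms} ms -> {rate * lag_ms / 1000:.0f} samples")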

  • How to apply the same FFMPEG slide transition for every slide in a sequence?

    28 April, by will

    I have an ffmpeg command to create an MP4 video from a sequence of N JPEG slides. That is, I do not know how many slides there are in the sequence; it is just a directory of JPEGs.

    I'd like to apply a common slide transition for every slide in the sequence. All the examples I've read seem to need to know how many slides there are before the filter is written. The idea is to use one (simple) transition/filter for every slide.

    So far the script looks like this:

        play_duration="-framerate 1/10"      #   10 seconds for testing
    
        ffmpeg                                          \
            ${play_duration}                            \
            -pattern_type glob                          \
            -i "./slides/*.jpg"                         \
                                                        \
        -c:v libx264                                \
        -filter_complex                             \
            "pad=ceil(iw/2)*2:ceil(ih/2)*2,fade=out:120:30"         \
                                                        \
            ./Slideshow.mp4
    

    The first filter, "pad=...", is necessary to deal with inconsistencies in the JPEG input.

    My limited understanding here is that the "fade=out:120:30" filter ought to work if I didn't also need the pad= construct.

    The transition examples I've come across so far -- there are a great many variations on the same pattern -- all look a lot like this ...

        ffmpeg -loop 1 -t 3 -framerate 60 -i image1.jpg -loop 1 -t 3   \
            -framerate 60 -i image2.jpg -loop 1 -t 3 -framerate 60 -i image3.jpg   \
            -filter_complex                                             \
                "[0]scale=1920:1280:force_original_aspect_ratio=decrease,pad=1920:1280:-1:-1[s0]; [1]scale=1920:1280:force_original_aspect_ratio=decrease,pad=1920:1280:-1:-1[s1]; [2]scale=1920:1280:force_original_aspect_ratio=decrease,pad=1920:1280:-1:-1[s2]; [s0][s1]xfade=transition=circleopen:duration=1:offset=2[f0]; [f0][s2]xfade=transition=circleopen:duration=1:offset=4"                 \
            -c:v libx264 -pix_fmt yuv420p                                \
            output.mp4
    
    

    The requirement is to have the same filter/transition applied at every slide change. It sounded so easy at first.
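
    Since the slide count is only known at run time, one workable pattern (a sketch under assumptions: fixed per-slide and per-transition durations, ffmpeg on the PATH) is to generate the xfade chain programmatically and hand it to ffmpeg, for example from Python:

        import glob
        import subprocess

        slides = sorted(glob.glob("./slides/*.jpg"))
        assert len(slides) >= 2, "need at least two slides for a transition"

        slide_dur, trans_dur = 10, 1  # seconds per slide / per transition (assumed)

        inputs, chains = [], []
        for i, path in enumerate(slides):
            inputs += ["-loop", "1", "-t", str(slide_dur), "-framerate", "60", "-i", path]
            # Normalise every slide to the same even-sized canvas first.
            chains.append(f"[{i}]scale=1920:1280:force_original_aspect_ratio=decrease,"
                          f"pad=1920:1280:-1:-1[s{i}]")

        # One xfade per slide boundary, all using the same transition.
        last = "s0"
        for k in range(1, len(slides)):
            out = "v" if k == len(slides) - 1 else f"f{k}"
            chains.append(f"[{last}][s{k}]xfade=transition=fade:"
                          f"duration={trans_dur}:offset={k * (slide_dur - trans_dur)}[{out}]")
            last = out

        subprocess.run(
            ["ffmpeg", *inputs, "-filter_complex", ";".join(chains),
             "-map", "[v]", "-c:v", "libx264", "-pix_fmt", "yuv420p", "Slideshow.mp4"],
            check=True)

    The offset arithmetic is the only subtle part: each transition overlaps trans_dur seconds, so the k-th xfade starts at k * (slide_dur - trans_dur), which reproduces the offsets 2 and 4 in the three-slide example above.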