Newest 'ffmpeg' Questions - Stack Overflow

http://stackoverflow.com/questions/tagged/ffmpeg


  • How to batch process with ffmpeg script, but do each step in a loop instead of two stages

    14 May, by Matt

    I'm a novice script editor. I convert MOV/AVI video files to MP4 format using a script with ffmpeg and then move the files after processing:

    for f in *.mov; do ffmpeg -y -i "$f" "${f%.*}.mp4"; done
    
    mv -f *.mov /Users/me/Videos/mov
    mv -f *.MOV /Users/me/Videos/mov
    mv -f *.avi /Users/me/Videos/avi
    mv -f *.AVI /Users/me/Videos/avi
    
    1. Currently the script converts all the videos, then moves them all to the other folders. How can the script be adjusted so that each video is moved immediately after processing (instead of waiting until all are complete)? This would be a great improvement: sometimes there are a lot of videos and the script gets interrupted for some reason (not a fault of the script), and moving files as they finish would make it easier to monitor progress.

    2. Currently I manually tweak the first line, changing *.mov to *.avi. Is there an easy way to handle either video file format/extension on the same line?

    3. Is there a better way to handle the mv statements, which need multiple lines for the lower/uppercase extensions? They also give an error if there are no files of that type.

    Thank you

    The above script is functional, but it would be better with these enhancements or changes; a sketch of what I'm imagining is below.
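    This is untested and assumes bash with the nullglob and nocaseglob options available, plus the same destination folders as above:

    # Convert and move each file as soon as it finishes, handling
    # .mov/.MOV/.avi/.AVI in a single loop. nullglob makes unmatched
    # patterns expand to nothing; nocaseglob makes globs case-insensitive.
    shopt -s nullglob nocaseglob

    for f in *.mov *.avi; do
        if ffmpeg -y -i "$f" "${f%.*}.mp4"; then
            # Lowercase the extension to pick the destination folder
            # (portable to bash 3.2, the macOS default).
            ext=$(printf '%s' "${f##*.}" | tr '[:upper:]' '[:lower:]')
            mv -f "$f" "/Users/me/Videos/$ext/"
        fi
    done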

  • Seeking Ideas: How Can I Automatically Generate a TikTok Video from a Custom Song Using C# [closed]

    14 May, by Jamado

    I'm creating a C# program which creates a video from a song and posts it on TikTok.

    Right now my program:

    1. uses Spleeter to split the song into stems

    2. uses a script from GitHub to create waveform images of the stems

    I want my end video to look like this:

    https://vm.tiktok.com/ZMM7CDmUt/ - only one song will play per video

    https://vm.tiktok.com/ZMM7Xdw8b/

    https://vm.tiktok.com/ZMM7CcGtE/ - no webcam or those hit animations

    Basically, I want the stems of the song to be placed on top of an FL Studio timeline, synced to the song, and then overlay an image on top of the video. Then, to cater to today's three-second attention spans, add some audio visualisations on top of the FL Studio recording (the music-making app in the video) and a little shake to the image.

    I've tinkered with ffmpeg before, and I reckon it could do the trick here. I'd use the waveform pictures and mix them with a pre-recorded FL Studio video using ffmpeg's filters, like vstack to stack images, scroll to slide them around, and blend. Then I'd tweak the overlay filter for that shake effect. Plus, I found out ffmpeg can whip up some basic audio visualizations, which is neat. (https://gist.github.com/Neurogami/aeed8693f7ac375d5e013b8432d04d3f)

    But my main issue with this approach is that the waveform images will look weird/out of place on top of the FL Studio video, because FL Studio has a really specific "theme". I could manually create a template and then use some other library to merge the template image and the waveform image, but that feels a bit janky and would probably be a hassle to set up and implement.

    So, I'm curious if you folks have any nifty libraries, GitHub gems, or ideas to help me nail this video?
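    For illustration, here's a rough sketch of the kind of ffmpeg invocation I have in mind (all filenames are hypothetical placeholders and the filter parameters are untested guesses):

    # Stack waveform images under an FL Studio screen capture, then overlay
    # a cover image whose position wobbles over time to fake a "shake".
    ffmpeg -y \
      -i flstudio_capture.mp4 \
      -loop 1 -i waveform_drums.png \
      -loop 1 -i waveform_bass.png \
      -loop 1 -i cover.png \
      -i song.mp3 \
      -filter_complex "\
        [0:v]scale=1080:-2,format=yuv420p[fl]; \
        [1:v][2:v]vstack=inputs=2,scale=1080:-2,format=yuv420p[waves]; \
        [fl][waves]vstack=inputs=2[base]; \
        [base][3:v]overlay=x='(W-w)/2+8*sin(40*t)':y='(H-h)/4+8*sin(53*t)'[out]" \
      -map "[out]" -map 4:a -shortest -c:v libx264 -c:a aac tiktok_clip.mp4

    The scroll filter could slide the stacked rows sideways, and ffmpeg's built-in visualizers such as showwaves or showspectrum could supply the audio-reactive layer mentioned above.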

  • Problems with Python's azure.cognitiveservices.speech when installing together with FFmpeg in a Linux web app

    14 May, by Kakobo kakobo

    I need some help. I'm building a web app that takes any audio format, converts it into a .wav file and then passes it to azure.cognitiveservices.speech for transcription. I'm building the web app via a container Dockerfile, as I need to install ffmpeg to be able to convert non-.wav audio files to .wav (Azure Speech Services only processes wav files). For some odd reason, the speechsdk class of azure.cognitiveservices.speech fails to work when I install ffmpeg in the web app. The class works perfectly fine when I install it without ffmpeg, or when I build and run the container on my own machine.

    I have placed debug print statements in the code. I can see the class initiating, but for some reason it does not buffer in the same way as when running locally on my machine. The routine simply stops without any reason.

    Has anybody experienced a similar issue with azure.cognitiveservices.speech conflicting with ffmpeg?

    Here's my Dockerfile:

    # Use an official Python runtime as a parent image
    FROM python:3.11-slim

    # Version run
    RUN echo "Version Run 1..."

    # Install ffmpeg
    RUN apt-get update && apt-get install -y ffmpeg && \
        # Ensure ffmpeg is executable
        chmod a+rx /usr/bin/ffmpeg && \
        # Clean up the apt cache; removing /var/lib/apt/lists saves space
        apt-get clean && rm -rf /var/lib/apt/lists/*

    # Set the working directory in the container
    WORKDIR /app

    # Copy the current directory contents into the container at /app
    COPY . /app

    # Install any needed packages specified in requirements.txt
    RUN pip install --no-cache-dir -r requirements.txt

    # Make port 8000 available to the world outside this container
    EXPOSE 8000

    # Define environment variable
    ENV NAME World

    # Run main.py when the container launches
    CMD ["streamlit", "run", "main.py", "--server.port", "8000", "--server.address", "0.0.0.0"]

    And here's my Python code:
    
    import os
    import time

    import azure.cognitiveservices.speech as speechsdk  # alias used throughout

    # (azure_speech_key, azure_speech_region, generate_random_string,
    # set_language_to_speech_code and encoding are defined elsewhere in my app)
    def transcribe_audio_continuous_old(temp_dir, audio_file, language):
        speech_key = azure_speech_key
        service_region = azure_speech_region
    
        time.sleep(5)
        print(f"DEBUG TIME BEFORE speechconfig")
    
        ran = generate_random_string(length=5)
        temp_file = f"transcript_key_{ran}.txt"
        output_text_file = os.path.join(temp_dir, temp_file)
        speech_recognition_language = set_language_to_speech_code(language)
        
        speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)
        speech_config.speech_recognition_language = speech_recognition_language
        audio_input = speechsdk.AudioConfig(filename=os.path.join(temp_dir, audio_file))
            
        speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_input, language=speech_recognition_language)
        done = False
        transcript_contents = ""
    
        time.sleep(5)
        print(f"DEBUG TIME AFTER speechconfig")
        print(f"DEBUG FIle about to be passed {audio_file}")
    
        try:
            with open(output_text_file, "w", encoding=encoding) as file:
                def recognized_callback(evt):
                    print("Start continuous recognition callback.")
                    print(f"Recognized: {evt.result.text}")
                    file.write(evt.result.text + "\n")
                    nonlocal transcript_contents
                    transcript_contents += evt.result.text + "\n"
    
                def stop_cb(evt):
                    print("Stopping continuous recognition callback.")
                    print(f"Event type: {evt}")
                    speech_recognizer.stop_continuous_recognition()
                    nonlocal done
                    done = True
                
                def canceled_cb(evt):
                    print(f"Recognition canceled: {evt.reason}")
                    if evt.reason == speechsdk.CancellationReason.Error:
                        print(f"Cancellation error: {evt.error_details}")
                    nonlocal done
                    done = True
    
                speech_recognizer.recognized.connect(recognized_callback)
                speech_recognizer.session_stopped.connect(stop_cb)
                speech_recognizer.canceled.connect(canceled_cb)
    
                speech_recognizer.start_continuous_recognition()
                while not done:
                    time.sleep(1)
                    print("DEBUG LOOPING TRANSCRIPT")
    
        except Exception as e:
            print(f"An error occurred: {e}")
    
        print("DEBUG DONE TRANSCRIPT")
    
        return temp_file, transcript_contents
    


    This transcription callback works fine locally, or when installed without ffmpeg in the Linux web app. I'm not sure why it conflicts with ffmpeg when installed via the container Dockerfile. The code section that fails can be found at the note "#NOTE DEBUG".
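    For what it's worth, one hypothesis I'm checking (unconfirmed): the Speech SDK's native Linux library needs things like OpenSSL, ALSA (libasound2) and CA certificates that python:3.11-slim doesn't necessarily ship, and the apt run that installs ffmpeg changes which shared libraries end up present. A quick diagnostic to run inside the container (the site-packages path is an assumption for this base image):

    # Find the Speech SDK's native core library and list any shared-library
    # dependencies that fail to resolve inside the container.
    SO=$(find /usr/local/lib/python3.11 -name 'libMicrosoft.CognitiveServices.Speech.core.so' | head -n 1)
    ldd "$SO" | grep 'not found'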

  • FPS reduction for H264

    14 May, by Александр А

    There is a USB camera that sends H.264 frames at 30 fps in 1920x1080 resolution, with a GOP size of 30 or 60 (1 or 2 I-frames per second depending on the camera, and only P-frames in between), which requires a throughput of about 6 Mbit/s. It is necessary to reduce this to no more than 2 Mbit/s. All of this runs on a weak ARMv7, so transcoding is extremely resource-intensive, especially since I have not found a way to do it on the Mali GPU (NanoPi Neo Core).

    # ./ffmpeg -hide_banner -codecs | grep h264
     DEV.LS h264                 H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10 (decoders: h264 h264_v4l2m2m ) (encoders: libx264 libx264rgb h264_v4l2m2m )
    
    # sudo ./ffmpeg -hide_banner -f v4l2 -input_format h264 -c:v h264_v4l2m2m -video_size 1920x1080 -threads 4 -i /dev/video0 -f null -
    [h264 @ 0x149bb30] Increasing reorder buffer to 1
    Input #0, video4linux2,v4l2, from '/dev/video0':
      Duration: N/A, start: 269427.784703, bitrate: N/A
      Stream #0:0: Video: h264 (Baseline), yuvj420p(pc, bt709, progressive), 1920x1080, 30 fps, 30 tbr, 1000k tbn
    [h264_v4l2m2m @ 0x160fd40] Could not find a valid device
    [h264_v4l2m2m @ 0x160fd40] can't configure decoder
    Stream mapping:
      Stream #0:0 -> #0:0 (h264 (h264_v4l2m2m) -> wrapped_avframe (native))
    Error while opening decoder for input stream #0:0 : Invalid argument
    
    

    Given that the basic functionality is built on the ffmpeg framework, is it possible to reduce the frame rate, ideally to 10 fps, WITHOUT loss of quality and WITHOUT decoding?
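    As far as I understand, dropping arbitrary frames from a stream that is almost entirely P-frames cannot be done without decoding, because each P-frame references the frames before it; the only stream-copy-style reduction would be keeping just the I-frames (1-2 fps here). For comparison, a re-encoding sketch (resource-heavy on ARMv7, as noted above; the rate-control numbers are illustrative):

    # Decode, drop to 10 fps, and re-encode capped at 2 Mbit/s.
    ffmpeg -f v4l2 -input_format h264 -video_size 1920x1080 -i /dev/video0 \
        -vf fps=10 -c:v libx264 -preset ultrafast -tune zerolatency \
        -b:v 2M -maxrate 2M -bufsize 4M out.mp4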

  • Getting "unable to decode APP fields" while playing USB webcam stream through ffplay

    14 May, by Syed

    I am trying to play a USB webcam stream (not sure which format it is in...) using ffplay on Windows. I can see the video without any issue, but I keep getting the error below in the console.

    ffplay.exe -f dshow -i video="Logitech HD Webcam C615" -loglevel debug

    [mjpeg @97a118cc80] unable to decode APP fields: Invalid data found when processing input check logs for more details

    Do I really need to worry about this error? Or is there any filter I need to pass in the command to get rid of it?

    Note: I tried to save the stream to a file using ffmpeg and got the same issue.
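    For reference, a sketch of the variations I'm considering (untested; the format values are guesses based on what dshow devices typically expose):

    # Silence the warning if the frames themselves decode fine:
    ffplay -f dshow -loglevel error -i video="Logitech HD Webcam C615"

    # Or list the camera's supported formats and request raw video instead of
    # MJPEG, which avoids the JPEG APP segments entirely:
    ffmpeg -f dshow -list_options true -i video="Logitech HD Webcam C615"
    ffplay -f dshow -video_size 640x480 -pixel_format yuyv422 -i video="Logitech HD Webcam C615"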

    Thanks in advance.