
  • ffmpeg - Merge back frames to a video with the same encoding

    24 September 2019, by Vuwox

    I have a video encoded with H.264 at 23.98 fps, with a duration of 00:00:06.42.

    I extracted the frames from that video and processed the images one by one. Now I want to put them back together as a video, but I want it to be identical to the source video (same duration, same audio, etc.).

    Whatever I try gives something different. The duration is always greater (around 00:00:06.59), the audio runs to the end of the video (as expected), but the frames are not encoded properly: they seem to freeze at the end while the audio continues.

    The attempt that looks almost right, apart from the freeze at the end, is this:

    ffmpeg -i input.mov \
          -pattern_type glob -i 'result_*.tif' \
          -map 1 -map 0:a \
          -map_metadata 0 \
          -map_metadata:s:v 0:s:v \
          -map_metadata:s:a 0:s:a \
          output.mov

    Here I take the metadata and the audio from the input video, and the frames from my second input.
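
    For reference, a hedged variant that forces the image sequence to be read at the source frame rate (ffmpeg's image2 demuxer otherwise defaults to 25 fps, so the rebuilt video track no longer lines up with the audio). The 24000/1001 rate and the 24k track timescale mirror the stream details below; the libx264/yuv420p settings are an assumption about how to approximate the original encode, not part of the original command:

    ffmpeg -framerate 24000/1001 -pattern_type glob -i 'result_*.tif' \
           -i input.mov \
           -map 0:v -map 1:a \
           -c:v libx264 -pix_fmt yuv420p \
           -c:a copy \
           -video_track_timescale 24000 \
           -map_metadata 1 \
           output.mov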

    EDIT: As suggested, here are the details of the source video.

    ffmpeg version 2.8.15 Copyright (c) 2000-2018 the FFmpeg developers
     built with gcc 4.8.5 (GCC) 20150623 (Red Hat 4.8.5-28)
     configuration: --prefix=/usr --bindir=/usr/bin --datadir=/usr/share/ffmpeg --incdir=/usr/include/ffmpeg --libdir=/usr/lib64 --mandir=/usr/share/man --arch=x86_64 --optflags='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic' --extra-ldflags='-Wl,-z,relro ' --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libvo-amrwbenc --enable-version3 --enable-bzlib --disable-crystalhd --enable-gnutls --enable-ladspa --enable-libass --enable-libcdio --enable-libdc1394 --disable-indev=jack --enable-libfreetype --enable-libgsm --enable-libmp3lame --enable-openal --enable-libopenjpeg --enable-libopus --enable-libpulse --enable-libschroedinger --enable-libsoxr --enable-libspeex --enable-libtheora --enable-libvorbis --enable-libv4l2 --enable-libx264 --enable-libx265 --enable-libxvid --enable-x11grab --enable-avfilter --enable-avresample --enable-postproc --enable-pthreads --disable-static --enable-shared --enable-gpl --disable-debug --disable-stripping --shlibdir=/usr/lib64 --enable-runtime-cpudetect
     libavutil      54. 31.100 / 54. 31.100
     libavcodec     56. 60.100 / 56. 60.100
     libavformat    56. 40.101 / 56. 40.101
     libavdevice    56.  4.100 / 56.  4.100
     libavfilter     5. 40.101 /  5. 40.101
     libavresample   2.  1.  0 /  2.  1.  0
     libswscale      3.  1.101 /  3.  1.101
     libswresample   1.  2.101 /  1.  2.101
     libpostproc    53.  3.100 / 53.  3.100
    Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'Transparent.mov':
     Metadata:
       major_brand     : qt
       minor_version   : 0
       compatible_brands: qt
       creation_time   : 2019-09-17 22:06:44
     Duration: 00:00:06.42, start: 0.000000, bitrate: 47798 kb/s
       Stream #0:0(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 113 kb/s (default)
       Metadata:
         creation_time   : 2019-09-17 22:06:44
         handler_name    : Core Media Data Handler
       Stream #0:1(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt709), 3840x2160 [SAR 1:1 DAR 16:9], 47541 kb/s, 23.98 fps, 23.98 tbr, 24k tbn, 48k tbc (default)
       Metadata:
         creation_time   : 2019-09-17 22:06:44
         handler_name    : Core Media Data Handler
         encoder         : H.264
         timecode        : 00:00:00:00
       Stream #0:2(und): Data: none (tmcd / 0x64636D74), 0 kb/s (default)
       Metadata:
         creation_time   : 2019-09-17 22:06:44
         handler_name    : Core Media Data Handler
         timecode        : 00:00:00:00
    At least one output file must be specified
  • pydub.exceptions.CouldntDecodeError: Decoding failed. ffmpeg returned error code: 1

    9 April, by azail765

    This script works on a 30-second WAV file but not on a 10-minute phone call, also in WAV format. Any help would be appreciated.

    I've downloaded ffmpeg.

    # Import necessary libraries 
from pydub import AudioSegment 
import speech_recognition as sr 
import os
import pydub


chunk_count = 0
directory = os.fsencode(r'C:\Users\zach.blair\Downloads\speechRecognition\New folder')
# Text file to write the recognized audio 
fh = open("recognized.txt", "w+")
for file in os.listdir(directory):
     filename = os.fsdecode(file)
     if filename.endswith(".wav"):
        chunk_count += 1
        # Input audio file to be sliced
        audio = AudioSegment.from_file(filename, format="wav")
          
        ''' 
        Step #1 - Slicing the audio file into smaller chunks. 
        '''
        # Length of the audiofile in milliseconds 
        n = len(audio) 
          
        # Variable to count the number of sliced chunks 
        counter = 1
          
         
          
        # Interval length at which to slice the audio file. 
        interval = 20 * 1000
          
        # Length of audio to overlap.  
        overlap = 1 * 1000
          
        # Initialize start and end seconds to 0 
        start = 0
        end = 0
          
        # Flag to keep track of end of file. 
        # When audio reaches its end, flag is set to 1 and we break 
        flag = 0
          
        # Iterate from 0 to end of the file, 
        # with increment = interval 
        for i in range(0, 2 * n, interval): 
              
            # During first iteration, 
            # start is 0, end is the interval 
            if i == 0: 
                start = 0
                end = interval 
          
            # All other iterations, 
            # start is the previous end - overlap 
            # end becomes end + interval 
            else: 
                start = end - overlap 
                end = start + interval  
          
            # When end becomes greater than the file length, 
            # end is set to the file length 
            # flag is set to 1 to indicate break. 
            if end >= n: 
                end = n 
                flag = 1
          
            # Storing audio file from the defined start to end 
            chunk = audio[start:end] 
          
            # Filename / Path to store the sliced audio 
            filename = str(chunk_count)+'chunk'+str(counter)+'.wav'
          
            # Store the sliced audio file to the defined path 
            chunk.export(filename, format ="wav") 
            # Print information about the current chunk 
            print(str(chunk_count)+str(counter)+". Start = "
                                +str(start)+" end = "+str(end)) 
          
            # Increment counter for the next chunk 
            counter = counter + 1
              
          
            AUDIO_FILE = filename 
            
            # Initialize the recognizer 
            r = sr.Recognizer() 
          
            # Traverse the audio file and listen to the audio 
            with sr.AudioFile(AUDIO_FILE) as source: 
                audio_listened = r.listen(source) 
          
            # Try to recognize the listened audio 
            # And catch expections. 
            try:     
                rec = r.recognize_google(audio_listened) 
                  
                # If recognized, write into the file. 
                fh.write(rec+" ") 
              
            # If google could not understand the audio 
            except sr.UnknownValueError: 
                    print("Empty Value") 
          
            # If the results cannot be requested from Google. 
            # Probably an internet connection error. 
            except sr.RequestError as e: 
                print("Could not request results.") 
          
            # Check for flag.
            # If flag is 1, the end of the whole audio file has been reached,
            # so break out of the chunking loop.
            if flag == 1:
                break

fh.close()

    I get this error on audio = AudioSegment.from_file(filename, format="wav"):

    Traceback (most recent call last):
      File "C:\Users\zach.blair\Downloads\speechRecognition\New folder\speechRecognition3.py", line 17, in <module>
        audio = AudioSegment.from_file(filename,format="wav")
      File "C:\Users\zach.blair\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pydub\audio_segment.py", line 704, in from_file
        p.returncode, p_err))
    pydub.exceptions.CouldntDecodeError: Decoding failed. ffmpeg returned error code: 1


    Output from ffmpeg/avlib:


    ffmpeg version N-95027-g8c90bb8ebb Copyright (c) 2000-2019 the FFmpeg developers
      built with gcc 9.2.1 (GCC) 20190918
      configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libdav1d --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libmfx --enable-ffnvcodec --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth --enable-libopenmpt --enable-amf
      libavutil      56. 35.100 / 56. 35.100
      libavcodec     58. 58.101 / 58. 58.101
      libavformat    58. 33.100 / 58. 33.100
      libavdevice    58.  9.100 / 58.  9.100
      libavfilter     7. 58.102 /  7. 58.102
      libswscale      5.  6.100 /  5.  6.100
      libswresample   3.  6.100 /  3.  6.100
      libpostproc    55.  6.100 / 55.  6.100
    Guessed Channel Layout for Input Stream #0.0 : mono
    Input #0, wav, from '2a.wav.wav':
      Duration: 00:09:52.95, bitrate: 64 kb/s
        Stream #0:0: Audio: pcm_mulaw ([7][0][0][0] / 0x0007), 8000 Hz, mono, s16, 64 kb/s
    Stream mapping:
      Stream #0:0 -> #0:0 (pcm_mulaw (native) -> pcm_s8 (native))
    Press [q] to stop, [?] for help
    [wav @ 0000024307974400] pcm_s8 codec not supported in WAVE format
    Could not write header for output file #0 (incorrect codec parameters ?): Function not implemented
    Error initializing output stream 0:0 --
    Conversion failed!

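    Judging from the output above, the decode failure seems to come from the phone-call recording being a mu-law (pcm_mulaw) WAV that the conversion step cannot rewrite as requested ("pcm_s8 codec not supported in WAVE format"). A hedged sketch of a pre-conversion step to run before the script touches the file; the output file name is a placeholder, and this assumes the ffmpeg binary is on PATH:

    import subprocess

    # Placeholder workflow: convert the mu-law phone recording to plain 16-bit PCM
    # so it decodes the same way as the 30-second test file.
    subprocess.run(
        ["ffmpeg", "-y", "-i", "2a.wav.wav", "-acodec", "pcm_s16le", "2a_pcm16.wav"],
        check=True,
    )

    from pydub import AudioSegment
    audio = AudioSegment.from_file("2a_pcm16.wav", format="wav")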

  • avformat_open_input crash using ffmpeg on android

    2 October 2019, by Timmy K

    I am writing an Android app that captures audio streams from a USB device using ffmpeg and muxes them into an m4a file. I then open the recording file to read back the audio data. The app works fine on a Samsung S8 (Android 8.0) and S9 (Android 9.0). On a Moto G5+ (Android 8.1), however, the app crashes inside avformat_open_input() when I try to open the recording file; yet if I restart the app and read the same file with the same code, it works without crashing.

    I thought maybe I hadn't closed the recording file properly, or that there was a race condition: trying to read the file before it was written. However, I have established that av_write_trailer() and avio_closep() are both called successfully before the call to avformat_open_input(). I have also established that the length of the file just before the call to avformat_open_input() is the same in the crash case as in the non-crash case.

    The code that finishes the recording:

    // AVFormatContext *mAvFormatContext
    // ...
    int ret = av_write_trailer(mAvFormatContext);
    if ( ret != 0 )
    {
       __android_log_print(ANDROID_LOG_DEBUG, "MyTag", "failed to write trailer %s", av_err2str( ret ) );
       goto fail;
    }

    ret = avio_close( mAvFormatContext->pb );
    if ( ret < 0 )
    {
       __android_log_print(ANDROID_LOG_DEBUG, "MyTag", "failed to avio_close %s", av_err2str( ret ) );
       goto fail;
    }

    avformat_free_context(mAvFormatContext);

    The call that crashes immediately after the above code, on the Moto G5+ (but not on the Samsung S8/S9):

    mAvFormatContext = nullptr;
    if ( (ret = avformat_open_input( &mAvFormatContext, filePath, 0, 0)) < 0 )
    {
       __android_log_print(ANDROID_LOG_DEBUG, "MyTag++", "Could not open input file '%s' error %s ", filePath, av_err2str(ret) );
       cleanup();
       return false;
    }

    Also, when the call to avformat_open_input() does not crash, there is no output in logcat from within that call. However, I've noticed that in the crash case there is output of what seems to be corrupted data:

    2019-09-02 09:39:14.105 19999-19999/fm.x.y D/AudioEngine: Opening '@�7���-d��@�7�' for
    2019-09-02 09:39:14.106 19999-19999/fm.x.y D/AudioEngine: Setting default whitelist '<��'
    2019-09-02 09:39:14.106 19999-19999/fm.x.y D/AudioEngine: Probing ���d score:-1828887548 size:-1093138844
    2019-09-02 09:39:14.106 19999-19999/fm.x.y D/AudioEngine: Format ��� probed with size=-1093138756 and score=-1093138748
    (crash occurs)

    Any advice on what to look into further to investigate this problem? Is there something I am missing when cleaning up the muxing state before I try to open the recording file for demuxing? For simplicity I have not included the freeing of codec contexts, AVFrames and AVPackets.
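
    For comparison, a minimal sketch of a teardown-then-reopen sequence built only from stock FFmpeg calls; the wrapper function, the avio_closep() usage and the explicit NULL-ing of the context pointer are illustrative assumptions, not the code from the post:

    // Sketch only, not the original implementation.
    #include <libavformat/avformat.h>

    static int reopen_for_reading(AVFormatContext **ctx, const char *filePath)
    {
        int ret;

        // Tear down the muxing context completely before touching the file again.
        ret = av_write_trailer(*ctx);
        if (ret < 0)
            return ret;

        ret = avio_closep(&(*ctx)->pb);   // closes the AVIOContext and NULLs the pointer
        if (ret < 0)
            return ret;

        avformat_free_context(*ctx);
        *ctx = NULL;                      // avoid any dangling pointer into freed memory

        // Re-open the finished file for demuxing.
        ret = avformat_open_input(ctx, filePath, NULL, NULL);
        if (ret < 0)
            return ret;

        ret = avformat_find_stream_info(*ctx, NULL);
        if (ret < 0) {
            avformat_close_input(ctx);    // frees the context and NULLs *ctx again
            return ret;
        }

        return 0;
    }

    The corrupted strings in the logcat output above look like reads of freed or uninitialized memory, so making sure the context pointer can never be used between avformat_free_context() and the new avformat_open_input() seems worth ruling out first.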