Media (1)

Keyword: - Tags -/ogv

Other articles (20)

  • Submit improvements and additional plugins

    10 April 2011

    If you have developed a new extension that adds one or more useful features to MediaSPIP, let us know and its integration into the official distribution will be considered.
    You can use the development mailing list to announce it or to ask for help building the plugin. Since MediaSPIP is based on SPIP, it is also possible to use the SPIP-zone mailing list of SPIP to (...)

  • Changing the graphic theme

    22 February 2011

    The graphic theme does not affect the actual layout of the elements on the page. It only changes their appearance.
    The placement of elements can in fact be changed, but the change is purely visual and does not affect the semantic representation of the page.
    Changing the graphic theme in use
    To change the graphic theme in use, the zen-garden plugin must be enabled on the site.
    Then simply go to the configuration area of the (...)

  • Adding notes and captions to images

    7 February 2011

    To add notes and captions to images, the first step is to install the "Légendes" plugin.
    Once the plugin is enabled, you can configure it in the configuration area to change the rights to create, modify and delete notes. By default, only site administrators can add notes to images.
    Changes when adding a media item
    When adding a media item of type "image", a new button appears above the preview (...)

On other sites (2920)

  • Buffered encoded images not saved

    26 March 2021, by xyfix

    I have an issue with the first 12 images not being saved to file. I have attached the relevant files to this question, along with a log file showing that the first 12 images aren't written to the generated output file. The frame rate is 24 fps and the recording lasts 5 seconds, so 120 frames should be written to the output file. This can be seen in the 4th column. Each line in the log file has the following format:

    image num [unique num from camera] [temp image num for recording seq] [time in ms]

    The Image class is a simple wrapper around OpenCV's Mat class with some additional members. The output file I currently get is around 10 MB, and when I open it in VLC it plays for only 1 to 2 seconds instead of 5, although I can see what I recorded. Can anyone explain why the frames are not written and why the duration isn't 5 seconds (minus the 12 missing frames) as expected? As you can see, I have tried "av_interleaved_write_frame", but that didn't help.
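
    The arithmetic behind the expected frame count and the timestamps can be checked with a small sketch. It assumes, as in the posted xcodec.cpp, a time base of 1/24 s and timestamps delivered in milliseconds. The helper names `expectedFrames` and `msToTicks` are hypothetical and not part of FFmpeg; FFmpeg's own `av_rescale_q` performs this kind of conversion between time bases.

```cpp
#include <cstdint>

// Hypothetical helper: number of frames expected for a recording of
// durationMs milliseconds at fps frames per second.
std::int64_t expectedFrames(std::int64_t durationMs, int fps) {
    return durationMs * fps / 1000;
}

// Hypothetical helper: convert a non-negative millisecond timestamp into
// ticks of a 1/fps time base, rounding to the nearest tick.
std::int64_t msToTicks(std::int64_t ms, int fps) {
    return (ms * fps + 500) / 1000;
}
```

    With these helpers, `expectedFrames(5000, 24)` gives 120, and the last timestamp in the log (5221 ms) maps to tick 125 in a 1/24 time base. If a millisecond value is stored in `pts` unchanged while the time base is 1/24, the muxer and the player may compute a duration different from the intended 5 seconds; whether this explains the 1 to 2 seconds seen in VLC is an assumption that would need to be verified.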

    


    xcodec.h

    


#ifndef XCODEC_H
#define XCODEC_H

#include <string>

#include "image/Image.h"

extern "C"
{
    #include "Codec/include/libavcodec/avcodec.h"
    #include "Codec/include/libavdevice/avdevice.h"
    #include "Codec/include/libavformat/avformat.h"
    #include "Codec/include/libavutil/avutil.h"
    #include "Codec/include/libavformat/avio.h"
    #include "Codec/include/libavutil/imgutils.h"
    #include "Codec/include/libavutil/opt.h"
    #include "Codec/include/libswscale/swscale.h"
}


class XCodec
{
public:

    XCodec(const char *filename);

    ~XCodec();

    void encodeImage( const Image& image );

    void encode( AVFrame *frame, AVPacket *pkt );

    void add_stream();

    void openVideoCodec();

    void write_video_frame(const Image &image);

    void createFrame( const Image& image );

    void close();

private:

    static int s_frameCount;

    int m_timeVideo = 0;

    std::string m_filename;


    AVCodec* m_encoder = NULL;

    AVOutputFormat* m_outputFormat = NULL;

    AVFormatContext* m_formatCtx = NULL;

    AVCodecContext* m_codecCtx = NULL;

    AVStream* m_streamOut = NULL;

    AVFrame* m_frame = NULL;

    AVPacket* m_packet = NULL;

};

#endif


    


    xcodec.cpp

    


#include "XCodec.h"

#include <QDebug>


#define STREAM_DURATION   5.0
#define STREAM_FRAME_RATE 24
#define STREAM_NB_FRAMES  ((int)(STREAM_DURATION * STREAM_FRAME_RATE))
#define STREAM_PIX_FMT    AV_PIX_FMT_YUV420P /* default pix_fmt */
#define OUTPUT_CODEC AV_CODEC_ID_H264

int XCodec::s_frameCount = 0;

XCodec::XCodec( const char* filename ) :
    m_filename( filename ),
    m_encoder( avcodec_find_encoder( OUTPUT_CODEC ))
{
    av_log_set_level(AV_LOG_VERBOSE);

    int ret(0);


    // allocate the output media context
    ret = avformat_alloc_output_context2( &m_formatCtx, m_outputFormat, NULL, m_filename.c_str());

    if (!m_formatCtx)
        return;

    m_outputFormat = m_formatCtx->oformat;

    // Add the video stream using H264 codec
    add_stream();

    // Open video codec and allocate the necessary encode buffers
    if (m_streamOut)
        openVideoCodec();

    // Print detailed information about input and output
    av_dump_format( m_formatCtx, 0, m_filename.c_str(), 1);

    // Open the output media file, if needed
    if (!( m_outputFormat->flags & AVFMT_NOFILE))
    {
        ret = avio_open( &m_formatCtx->pb, m_filename.c_str(), AVIO_FLAG_WRITE);

        if (ret < 0)
        {
            char error[255];
            ret = av_strerror( ret, error, 255);
            fprintf(stderr, "Could not open '%s': %s\n", m_filename.c_str(), error);
            return ;
        }
    }
    else
    {
        return;
    }

    // Write media header
    ret = avformat_write_header( m_formatCtx, NULL );

    if (ret < 0)
    {
        char error[255];
        av_strerror(ret, error, 255);
        fprintf(stderr, "Error occurred when opening output file: %s\n", error);
        return;
    }

    if ( m_frame )
           m_frame->pts = 0;
}



XCodec::~XCodec()
{}

/* Add an output stream. */
void XCodec::add_stream()
{
    AVCodecID codecId = OUTPUT_CODEC;

    if (!( m_encoder ))
    {
        fprintf(stderr, "Could not find encoder for '%s'\n",
            avcodec_get_name(codecId));
        return;
    }

    // Get the stream for codec
    m_streamOut = avformat_new_stream(m_formatCtx, m_encoder);

    if (!m_streamOut) {
        fprintf(stderr, "Could not allocate stream\n");
        return;
    }

    m_streamOut->id = m_formatCtx->nb_streams - 1;

    m_codecCtx = avcodec_alloc_context3( m_encoder);

    switch (( m_encoder)->type)
    {
    case AVMEDIA_TYPE_VIDEO:
        m_streamOut->codecpar->codec_id = codecId;
        m_streamOut->codecpar->codec_type = AVMEDIA_TYPE_VIDEO;
        m_streamOut->codecpar->bit_rate = 400000;
        m_streamOut->codecpar->width = 800;
        m_streamOut->codecpar->height = 640;
        m_streamOut->codecpar->format = STREAM_PIX_FMT;
        m_streamOut->time_base = { 1, STREAM_FRAME_RATE };

        avcodec_parameters_to_context( m_codecCtx, m_streamOut->codecpar);

        m_codecCtx->gop_size = 12; /* emit one intra frame every twelve frames at most */
        m_codecCtx->max_b_frames = 1;
        m_codecCtx->time_base = { 1, STREAM_FRAME_RATE };
        m_codecCtx->framerate = { STREAM_FRAME_RATE, 1 };
        m_codecCtx->pix_fmt = STREAM_PIX_FMT;
        m_codecCtx->profile = FF_PROFILE_H264_HIGH;

        break;

    default:
        break;
    }

    if (m_streamOut->codecpar->codec_id == OUTPUT_CODEC)
    {
      av_opt_set( m_codecCtx, "preset", "ultrafast", 0 );
    }

//
//    /* Some formats want stream headers to be separate. */
    if (m_formatCtx->oformat->flags & AVFMT_GLOBALHEADER)
            m_codecCtx->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;


    int ret = avcodec_parameters_from_context( m_streamOut->codecpar, m_codecCtx );

    if (ret < 0)
    {
        char error[255];
        av_strerror(ret, error, 255);
        fprintf(stderr, "avcodec_parameters_from_context returned (%d) - %s", ret, error);
        return;
    }
}


void XCodec::openVideoCodec()
{
    int ret;

    /* open the codec */
    ret = avcodec_open2(m_codecCtx, m_encoder, NULL);

    if (ret < 0)
    {
        char error[255];
        av_strerror(ret, error, 255);
        fprintf(stderr, "Could not open video codec: %s\n", error);
        return;
    }

    /* allocate and init a re-usable frame */
//    m_frame = av_frame_alloc();

}


void XCodec::encodeImage(const Image &image)
{
    // Compute video time from last added video frame
    m_timeVideo = image.timeStamp(); //(double)m_frame->pts) * av_q2d(m_streamOut->time_base);

    // Stop media if enough time
    if (!m_streamOut /*|| m_timeVideo >= STREAM_DURATION*/)
       return;


    // Add a video frame
    write_video_frame( image );

}


void XCodec::write_video_frame( const Image& image )
{
    int ret;

qDebug() << "image num " << image.uniqueImageNumber() << " " << s_frameCount;

    if ( s_frameCount >= STREAM_NB_FRAMES)
    {
        /* No more frames to compress. The codec has a latency of a few
         * frames if using B-frames, so we get the last frames by
         * passing the same picture again. */
        int p( 0 ) ;
    }
    else
    {
         createFrame( image );
    }

    // Increase frame pts according to time base
//    m_frame->pts += av_rescale_q(1, m_codecCtx->time_base, m_streamOut->time_base);
    m_frame->pts = int64_t( image.timeStamp()) ;


    if (m_formatCtx->oformat->flags & 0x0020 )
    {
        /* Raw video case - directly store the picture in the packet */
        AVPacket pkt;
        av_init_packet(&pkt);

        pkt.flags |= AV_PKT_FLAG_KEY;
        pkt.stream_index = m_streamOut->index;
        pkt.data = m_frame->data[0];
        pkt.size = sizeof(AVPicture);

//        ret = av_interleaved_write_frame(m_formatCtx, &pkt);
        ret = av_write_frame( m_formatCtx, &pkt );
    }
    else
    {
        AVPacket pkt;
        av_init_packet(&pkt);

        /* encode the image */
        ret = avcodec_send_frame(m_codecCtx, m_frame);

        if (ret < 0)
        {
            char error[255];
            av_strerror(ret, error, 255);
            fprintf(stderr, "Error encoding video frame: %s\n", error);
            return;
        }

        /* If size is zero, it means the image was buffered. */
        ret = avcodec_receive_packet(m_codecCtx, &pkt);

        if( !ret && pkt.size)
        {
qDebug() << "write frame " << m_frame->display_picture_number;
            pkt.stream_index = m_streamOut->index;

            /* Write the compressed frame to the media file. */
//            ret = av_interleaved_write_frame(m_formatCtx, &pkt);
            ret = av_write_frame( m_formatCtx, &pkt );
        }
        else
        {
            ret = 0;
        }
    }

    if (ret != 0)
    {
        char error[255];
        av_strerror(ret, error, 255);
        fprintf(stderr, "Error while writing video frame: %s\n", error);
        return;
    }

    s_frameCount++;
}


void XCodec::createFrame( const Image& image /*, AVFrame *m_frame, int frame_index, int width, int height*/)
{
    /**
     * \note allocate frame
     */
    m_frame = av_frame_alloc();
    int ret = av_frame_make_writable( m_frame );

    m_frame->format = STREAM_PIX_FMT;
    m_frame->width = image.width();
    m_frame->height = image.height();
//    m_frame->pict_type = AV_PICTURE_TYPE_I;
    m_frame->display_picture_number = image.uniqueImageNumber();

    ret = av_image_alloc(m_frame->data, m_frame->linesize, m_frame->width,  m_frame->height, STREAM_PIX_FMT, 1);

    if (ret < 0)
    {
        return;
    }

    struct SwsContext* sws_ctx = sws_getContext((int)image.width(), (int)image.height(), AV_PIX_FMT_RGB24,
                                                (int)image.width(), (int)image.height(), STREAM_PIX_FMT, 0, NULL, NULL, NULL);

    const uint8_t* rgbData[1] = { (uint8_t* )image.getData() };
    int rgbLineSize[1] = { 3 * image.width() };

    sws_scale(sws_ctx, rgbData, rgbLineSize, 0, image.height(), m_frame->data, m_frame->linesize);

//cv::Mat yuv420p( m_frame->height + m_frame->height/2, m_frame->width, CV_8UC1, m_frame->data[0]);
//cv::Mat cvmIm;
//cv::cvtColor(yuv420p,cvmIm,CV_YUV420p2BGR);
//std::ostringstream ss;
//ss << "c:\\tmp\\YUVoriginal_" << image.uniqueImageNumber() << ".png";
//cv::imwrite( ss.str().c_str(), cvmIm);
}


void XCodec::close()
{
    /* reset the framecount */
    s_frameCount = 0 ;

    int ret( 0 );

    /* flush the encoder */
    while( ret >= 0 )
        ret = avcodec_send_frame(m_codecCtx, NULL);

    // Write media trailer
    if( m_formatCtx )
        ret = av_write_trailer( m_formatCtx );

    /* Close each codec. */
    if ( m_streamOut )
    {
        if( m_frame )
        {
            av_free( m_frame->data[0]);
            av_frame_free( &m_frame );
        }

        if( m_packet )
            av_packet_free( &m_packet );
    }

    if (!( m_outputFormat->flags & AVFMT_NOFILE))
        /* Close the output file. */
        ret = avio_close( m_formatCtx->pb);


    /* free the stream */
    avformat_free_context( m_formatCtx );

    fflush( stdout );
}

    image.h

#ifndef IMAGE_H
#define IMAGE_H

#include

class Image
{
public:

    Image();

    Image( const cv::Mat& mat );

    Image(const Image& other) = default;

    Image(Image&& other) = default;

    ~Image();


    inline const cv::Mat& matrix() const{ return m_matrix; }

    inline const int uniqueImageNumber() const{ return m_uniqueId; }

    inline const int timeStamp() const { return m_timeStamp; }

    inline const int width() const { return m_matrix.cols(); }

    inline const int height() const { return m_matrix.rows(); }

private:

    cv::Mat   m_matrix;

    int       m_timeStamp;

    int       m_uniqueId;

};

#endif

    logtxt

image num  1725   0   0
image num  1727   1   40
image num  1729   2   84
image num  1730   3   126
image num  1732   4   169
image num  1734   5   211
image num  1736   6   259
image num  1738   7   297
image num  1740   8   340
image num  1742   9   383
image num  1744   10   425
image num  1746   11   467
image num  1748   12   511
image num  1750   13   553
write frame  1750
image num  1752   14   600
write frame  1752
image num  1753   15   637
write frame  1753
image num  1755   16   680
write frame  1755
image num  1757   17   723
write frame  1757
image num  1759   18   766
write frame  1759
image num  1761   19   808
write frame  1761
image num  1763   20   854
write frame  1763
image num  1765   21   893
write frame  1765
image num  1767   22   937
write frame  1767
image num  1769   23   979
write frame  1769
image num  1770   24   1022
write frame  1770
image num  1772   25   1064
write frame  1772
image num  1774   26   1108
write frame  1774
image num  1776   27   1150
write frame  1776
image num  1778   28   1192
write frame  1778
image num  1780   29   1235
write frame  1780
image num  1782   30   1277
write frame  1782
image num  1784   31   1320
write frame  1784
image num  1786   32   1362
write frame  1786
image num  1787   33   1405
write frame  1787
image num  1789   34   1450
write frame  1789
image num  1791   35   1493
write frame  1791
image num  1793   36   1536
write frame  1793
image num  1795   37   1578
write frame  1795
image num  1797   38   1621
write frame  1797
image num  1799   39   1663
write frame  1799
image num  1801   40   1709
write frame  1801
image num  1803   41   1748
write frame  1803
image num  1805   42   1791
write frame  1805
image num  1807   43   1833
write frame  1807
image num  1808   44   1876
write frame  1808
image num  1810   45   1920
write frame  1810
image num  1812   46   1962
write frame  1812
image num  1814   47   2004
write frame  1814
image num  1816   48   2048
write frame  1816
image num  1818   49   2092
write frame  1818
image num  1820   50   2133
write frame  1820
image num  1822   51   2175
write frame  1822
image num  1824   52   2221
write frame  1824
image num  1826   53   2277
write frame  1826
image num  1828   54   2319
write frame  1828
image num  1830   55   2361
write frame  1830
image num  1832   56   2405
write frame  1832
image num  1833   57   2447
write frame  1833
image num  1835   58   2491
write frame  1835
image num  1837   59   2533
write frame  1837
image num  1839   60   2576
write frame  1839
image num  1841   61   2619
write frame  1841
image num  1843   62   2662
write frame  1843
image num  1845   63   2704
write frame  1845
image num  1847   64   2746
write frame  1847
image num  1849   65   2789
write frame  1849
image num  1851   66   2831
write frame  1851
image num  1852   67   2874
write frame  1852
image num  1854   68   2917
write frame  1854
image num  1856   69   2959
write frame  1856
image num  1858   70   3003
write frame  1858
image num  1860   71   3045
write frame  1860
image num  1862   72   3088
write frame  1862
image num  1864   73   3130
write frame  1864
image num  1866   74   3173
write frame  1866
image num  1868   75   3215
write frame  1868
image num  1870   76   3257
write frame  1870
image num  1872   77   3306
write frame  1872
image num  1873   78   3347
write frame  1873
image num  1875   79   3389
write frame  1875
image num  1877   80   3433
write frame  1877
image num  1879   81   3475
write frame  1879
image num  1883   82   3562
write frame  1883
image num  1885   83   3603
write frame  1885
image num  1887   84   3660
write frame  1887
image num  1889   85   3704
write frame  1889
image num  1891   86   3747
write frame  1891
image num  1893   87   3789
write frame  1893
image num  1895   88   3832
write frame  1895
image num  1897   89   3874
write frame  1897
image num  1899   90   3917
write frame  1899
image num  1900   91   3959
write frame  1900
image num  1902   92   4001
write frame  1902
image num  1904   93   4044
write frame  1904
image num  1906   94   4086
write frame  1906
image num  1908   95   4130
write frame  1908
image num  1910   96   4174
write frame  1910
image num  1912   97   4216
write frame  1912
image num  1914   98   4257
write frame  1914
image num  1915   99   4303
write frame  1915
image num  1918   100   4344
write frame  1918
image num  1919   101   4387
write frame  1919
image num  1922   102   4451
write frame  1922
image num  1924   103   4494
write frame  1924
image num  1926   104   4541
write frame  1926
image num  1927   105   4588
write frame  1927
image num  1931   106   4665
write frame  1931
image num  1933   107   4707
write frame  1933
image num  1935   108   4750
write frame  1935
image num  1937   109   4794
write frame  1937
image num  1939   110   4836
write frame  1939
image num  1941   111   4879
write frame  1941
image num  1943   112   4922
write frame  1943
image num  1945   113   4965
write frame  1945
image num  1947   114   5007
write frame  1947
image num  1948   115   5050
write frame  1948
image num  1950   116   5093
write frame  1950
image num  1952   117   5136
write frame  1952
image num  1954   118   5178
write frame  1954
image num  1956   119   5221
write frame  1956
MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
0, 8-bit
Not writing 'clli' atom. No content light level info.
Not writing 'mdcv' atom. Missing mastering metadata.
2 seeks, 41 writeouts
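
    One possible reading of this log, offered as an assumption rather than a confirmed diagnosis: an encoder that buffers frames returns its first packet only after several frames have been sent, and the packets still held inside it must be drained after end of stream. The posted code calls `avcodec_receive_packet` once per frame and never after sending the NULL frame in `close()`. The "write frame" line also prints the number of the frame just sent, which need not be the frame contained in the packet. The sketch below uses a hypothetical `ToyEncoder` that only mimics the send/receive pattern; it is not the libavcodec API, and the delay of 13 frames is chosen to match the log. In libavcodec, `avcodec_receive_packet` signals "needs more input" with `AVERROR(EAGAIN)` and "fully drained" with `AVERROR_EOF`.

```cpp
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical stand-in for an encoder with internal delay.
// It only models the send/receive pattern and is NOT the libavcodec API.
class ToyEncoder {
public:
    explicit ToyEncoder(std::size_t delay) : m_delay(delay) {}

    // Queue one frame, identified here only by its number.
    void send(int frameNumber) { m_pending.push_back(frameNumber); }

    // Signal end of stream: everything still queued may now be emitted.
    void sendEndOfStream() { m_flushing = true; }

    // Store the next finished packet in 'out' and return true, or return
    // false when more input is needed (or the encoder is fully drained).
    bool receive(int& out) {
        const bool ready = m_flushing ? !m_pending.empty()
                                      : m_pending.size() > m_delay;
        if (!ready)
            return false;
        out = m_pending.front();
        m_pending.pop_front();
        return true;
    }

private:
    std::size_t m_delay;
    bool m_flushing = false;
    std::deque<int> m_pending;
};

// Feed frames 0..frameCount-1 and collect the frame numbers that reach the
// output. The encoder is drained in a loop after every send and, when
// drainAtEnd is true, once more after the end-of-stream signal.
std::vector<int> encodeAll(int frameCount, std::size_t delay, bool drainAtEnd) {
    ToyEncoder enc(delay);
    std::vector<int> written;
    int pkt = 0;
    for (int i = 0; i < frameCount; ++i) {
        enc.send(i);
        while (enc.receive(pkt))
            written.push_back(pkt);
    }
    if (drainAtEnd) {
        enc.sendEndOfStream();
        while (enc.receive(pkt))
            written.push_back(pkt);
    }
    return written;
}
```

    In this toy model, `encodeAll(120, 13, false)` yields 107 packets holding frames 0 to 106, which matches the 107 "write frame" lines in the log, while `encodeAll(120, 13, true)` yields all 120. Under that reading, the frames missing from the file would be the last ones rather than the first ones.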


  • ffmpeg faster conversion from jpg files to mp4

    25 May 2021, by opensw

    I am trying (on Android and iOS) to convert 500 JPEG files into an MP4 video. Everything works, but the conversion takes too long, around 1 minute. I have one constraint: the video must be playable by the native Android/iOS player, so I cannot use the '-codec copy' option and simply generate an MKV or MP4 container holding the original JPEG files (in that case the conversion takes around 1 s!). After many attempts, the best solution is the default one with almost no options :D Is there a way to improve the conversion time of the following command?

    ffmpeg -r 30 -i inputPath/%05d.jpg -y -threads 0 -r 30 + outputFilePath.mp4

    I have tried:

    1. -q:v 2 (but I would like to keep the original resolution, it is slower than the above command)
    2. -vf scale=-2:720 (but I would like to keep the original resolution, it is comparable to the above command)
    3. -s hd720 (but I would like to keep the original resolution, it is comparable to the above command)
    4. -threads 128 (does not change anything)
    5. -c:v libx264 -crf 23 -preset ultrafast, this one is painfully slow

    Output log

LOG  Async FFmpeg process started with executionId 3001 for file:///data/user/0/com.xxx.xxx/files/events/1/1/raw.
LOG  ffmpeg version v4.4-dev-416
LOG   Copyright (c) 2000-2020 the FFmpeg developers
LOG  
LOG    built with Android (6454773 based on r365631c2) clang version 9.0.8 (https://android.googlesource.com/toolchain/llvm-project 98c855489587874b2a325e7a516b99d838599c6f) (based on LLVM 9.0.8svn)
LOG    configuration: --cross-prefix=aarch64-linux-android- --sysroot=/files/android-sdk/ndk/21.3.6528147/toolchains/llvm/prebuilt/linux-x86_64/sysroot --prefix=/home/taner/Projects/mobile-ffmpeg/prebuilt/android-arm64/ffmpeg --pkg-config=/usr/bin/pkg-config --enable-version3 --arch=aarch64 --cpu=armv8-a --cc=aarch64-linux-android21-clang --cxx=aarch64-linux-android21-clang++ --extra-libs='-L/storage/light/projects/mobile-ffmpeg/prebuilt/android-arm64/cpu-features/lib -lndk_compat' --target-os=android --enable-neon --enable-asm --enable-inline-asm --enable-cross-compile --enable-pic --enable-jni --enable-optimizations --enable-swscale --enable-shared --enable-v4l2-m2m --disable-outdev=fbdev --disable-indev=fbdev --enable-small --disable-openssl --disable-xmm-clobber-test --disable-debug --enable-lto --disable-neon-clobber-test --disable-programs --disable-postproc --disable-doc --disable-htmlpages --disable-manpages --disable-podpages --disable-txtpages --disable-static --disable-sndio --disable-schannel --disable-securetransport --disable-xlib --disable-cuda --disable-cuvid --disable-nvenc --disable-vaapi --disable-vdpau --disable-videotoolbox --disable-audiotoolbox --disable-appkit --disable-alsa --disable-cuda --disable-cuvid --disable-nvenc --disable-vaapi --disable-vdpau --disable-sdl2 --enable-zlib --enable-mediacodec
LOG    libavutil      56. 55.100 / 56. 55.100
LOG    libavcodec     58. 96.100 / 58. 96.100
LOG    libavformat    58. 48.100 / 58. 48.100
LOG    libavdevice    58. 11.101 / 58. 11.101
LOG    libavfilter     7. 87.100 /  7. 87.100
LOG    libswscale      5.  8.100 /  5.  8.100
LOG    libswresample   3.  8.100 /  3.  8.100
LOG  Input #0, image2, from 'file:///data/user/0/com.xxx.xxx/files/events/1/1/raw/%05d.jpg':
LOG    Duration:
LOG  00:00:18.08
LOG  , start:
LOG  0.000000
LOG  , bitrate:
LOG  N/A
LOG  
LOG      Stream #0:0
LOG  : Video: mjpeg, yuvj420p(pc, bt470bg/unknown/unknown), 1920x1080 [SAR 1:1 DAR 16:9]
LOG  ,
LOG  25 fps,
LOG  25 tbr,
LOG  25 tbn,
LOG  25 tbc
LOG  
LOG  Stream mapping:
LOG    Stream #0:0 -> #0:0
LOG   (mjpeg (native) -> mpeg4 (native))
LOG  
LOG  Press [q] to stop, [?] for help
LOG  [graph 0 input from stream 0:0 @ 0x7c5f870800] sws_param option is deprecated and ignored
LOG  [swscaler @ 0x7bed4d6a40] deprecated pixel format used, make sure you did set range correctly
LOG  Output #0, mp4, to 'file:///data/user/0/com.xxx.xxx/files/events/1/1/preview.mp4':
LOG    Metadata:
LOG      encoder         :
LOG  Lavf58.48.100
LOG  
LOG      Stream #0:0
LOG  : Video: mpeg4 (mp4v / 0x7634706D), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], q=2-31, 200 kb/s
LOG  ,
LOG  30 fps,
LOG  15360 tbn,
LOG  30 tbc
LOG  
LOG      Metadata:
LOG        encoder         :
LOG  Lavc58.96.100 mpeg4
LOG  
LOG      Side data:
LOG  
LOG  cpb:
LOG  bitrate max/min/avg: 0/0/200000 buffer size: 0
LOG  vbv_delay: N/A
LOG  
LOG  frame=    5 fps=0.0 q=6.2 size=     256kB time=00:00:00.13 bitrate=15723.7kbits/s speed=0.21x
LOG  frame=   10 fps=8.3 q=13.8 size=     256kB time=00:00:00.30 bitrate=6990.2kbits/s speed=0.25x
LOG  frame=   16 fps=9.0 q=31.0 size=     256kB time=00:00:00.50 bitrate=4194.5kbits/s speed=0.283x
LOG  frame=   22 fps=9.3 q=31.0 size=     256kB time=00:00:00.70 bitrate=2996.2kbits/s speed=0.297x
LOG  frame=   28 fps=9.5 q=31.0 size=     256kB time=00:00:00.90 bitrate=2330.4kbits/s speed=0.307x
LOG  frame=   34 fps=9.6 q=31.0 size=     256kB time=00:00:01.10 bitrate=1906.7kbits/s speed=0.312x
LOG  frame=   40 fps=9.7 q=31.0 size=     256kB time=00:00:01.30 bitrate=1613.4kbits/s speed=0.316x
LOG  frame=   45 fps=9.7 q=31.0 size=     256kB time=00:00:01.46 bitrate=1430.1kbits/s speed=0.317x
LOG  frame=   50 fps=9.7 q=31.0 size=     256kB time=00:00:01.63 bitrate=1284.1kbits/s speed=0.318x
LOG  frame=   56 fps=9.8 q=31.0 size=     256kB time=00:00:01.83 bitrate=1144.1kbits/s speed=0.319x
LOG  frame=   61 fps=9.8 q=31.0 size=     256kB time=00:00:02.00 bitrate=1048.7kbits/s speed=0.32x
LOG  frame=   67 fps=9.8 q=31.0 size=     256kB time=00:00:02.20 bitrate= 953.4kbits/s speed=0.322x
LOG  frame=   72 fps=9.8 q=31.0 size=     256kB time=00:00:02.36 bitrate= 886.2kbits/s speed=0.322x
LOG  frame=   78 fps=9.8 q=31.0 size=     512kB time=00:00:02.56 bitrate=1634.2kbits/s speed=0.323x
LOG  frame=   84 fps=9.8 q=31.0 size=     512kB time=00:00:02.76 bitrate=1516.1kbits/s speed=0.324x
LOG  frame=   90 fps=9.9 q=31.0 size=     512kB time=00:00:02.96 bitrate=1413.9kbits/s speed=0.325x
LOG  frame=   95 fps=9.9 q=31.0 size=     512kB time=00:00:03.13 bitrate=1338.7kbits/s speed=0.325x
LOG  frame=  101 fps=9.9 q=24.8 size=     512kB time=00:00:03.33 bitrate=1258.4kbits/s speed=0.326x
LOG  frame=  107 fps=9.9 q=31.0 size=     512kB time=00:00:03.53 bitrate=1187.1kbits/s speed=0.327x
LOG  frame=  113 fps=9.9 q=24.8 size=     512kB time=00:00:03.73 bitrate=1123.5kbits/s speed=0.327x
LOG  frame=  119 fps=9.9 q=31.0 size=     512kB time=00:00:03.93 bitrate=1066.4kbits/s speed=0.328x
LOG  frame=  125 fps=9.9 q=24.8 size=     512kB time=00:00:04.13 bitrate=1014.8kbits/s speed=0.328x
LOG  frame=  131 fps=9.9 q=31.0 size=     512kB time=00:00:04.33 bitrate= 968.0kbits/s speed=0.328x
LOG  frame=  137 fps=9.9 q=24.8 size=     512kB time=00:00:04.53 bitrate= 925.3kbits/s speed=0.329x
LOG  frame=  142 fps=9.9 q=31.0 size=     512kB time=00:00:04.70 bitrate= 892.5kbits/s speed=0.329x
LOG  frame=  148 fps=9.9 q=31.0 size=     512kB time=00:00:04.90 bitrate= 856.0kbits/s speed=0.329x
LOG  frame=  153 fps=9.9 q=31.0 size=     512kB time=00:00:05.06 bitrate= 827.9kbits/s speed=0.329x
LOG  frame=  159 fps= 10 q=31.0 size=     512kB time=00:00:05.26 bitrate= 796.4kbits/s speed=0.33x
LOG  frame=  165 fps= 10 q=31.0 size=     512kB time=00:00:05.46 bitrate= 767.3kbits/s speed=0.33x
LOG  frame=  171 fps= 10 q=31.0 size=     512kB time=00:00:05.66 bitrate= 740.2kbits/s speed=0.33x
LOG  frame=  177 fps= 10 q=31.0 size=     768kB time=00:00:05.86 bitrate=1072.5kbits/s speed=0.331x
LOG  frame=  183 fps= 10 q=31.0 size=     768kB time=00:00:06.06 bitrate=1037.1kbits/s speed=0.331x
LOG  frame=  188 fps= 10 q=31.0 size=     768kB time=00:00:06.23 bitrate=1009.4kbits/s speed=0.331x
LOG  frame=  193 fps= 10 q=31.0 size=     768kB time=00:00:06.40 bitrate= 983.1kbits/s speed=0.331x
LOG  frame=  199 fps= 10 q=31.0 size=     768kB time=00:00:06.60 bitrate= 953.3kbits/s speed=0.331x
LOG  frame=  204 fps= 10 q=31.0 size=     768kB time=00:00:06.76 bitrate= 929.8kbits/s speed=0.331x
LOG  frame=  210 fps= 10 q=31.0 size=     768kB time=00:00:06.96 bitrate= 903.1kbits/s speed=0.331x
LOG  frame=  216 fps= 10 q=31.0 size=     768kB time=00:00:07.16 bitrate= 877.9kbits/s speed=0.331x
LOG  frame=  221 fps= 10 q=24.8 size=     768kB time=00:00:07.33 bitrate= 858.0kbits/s speed=0.331x
LOG  frame=  227 fps= 10 q=31.0 size=     768kB time=00:00:07.53 bitrate= 835.2kbits/s speed=0.331x
LOG  frame=  232 fps= 10 q=31.0 size=     768kB time=00:00:07.70 bitrate= 817.1kbits/s speed=0.331x
LOG  frame=  238 fps= 10 q=31.0 size=     768kB time=00:00:07.90 bitrate= 796.4kbits/s speed=0.332x
LOG  frame=  243 fps= 10 q=31.0 size=     768kB time=00:00:08.06 bitrate= 780.0kbits/s speed=0.332x
LOG  frame=  249 fps= 10 q=31.0 size=     768kB time=00:00:08.26 bitrate= 761.1kbits/s speed=0.332x
LOG  frame=  254 fps= 10 q=31.0 size=     768kB time=00:00:08.43 bitrate= 746.1kbits/s speed=0.332x
LOG  frame=  259 fps= 10 q=31.0 size=    1024kB time=00:00:08.60 bitrate= 975.5kbits/s speed=0.332x
LOG  frame=  264 fps= 10 q=31.0 size=    1024kB time=00:00:08.76 bitrate= 956.9kbits/s speed=0.332x
LOG  frame=  270 fps= 10 q=31.0 size=    1024kB time=00:00:08.96 bitrate= 935.6kbits/s speed=0.332x
LOG  frame=  276 fps= 10 q=31.0 size=    1024kB time=00:00:09.16 bitrate= 915.2kbits/s speed=0.332x
LOG  frame=  282 fps= 10 q=31.0 size=    1024kB time=00:00:09.36 bitrate= 895.6kbits/s speed=0.332x
LOG  frame=  288 fps= 10 q=31.0 size=    1024kB time=00:00:09.56 bitrate= 876.9kbits/s speed=0.332x
LOG  frame=  294 fps= 10 q=31.0 size=    1024kB time=00:00:09.76 bitrate= 858.9kbits/s speed=0.333x
LOG  frame=  299 fps= 10 q=31.0 size=    1024kB time=00:00:09.93 bitrate= 844.5kbits/s speed=0.332x
LOG  frame=  305 fps= 10 q=24.8 size=    1024kB time=00:00:10.13 bitrate= 827.9kbits/s speed=0.332x
LOG  frame=  310 fps= 10 q=31.0 size=    1024kB time=00:00:10.30 bitrate= 814.5kbits/s speed=0.332x
LOG  frame=  316 fps= 10 q=31.0 size=    1024kB time=00:00:10.50 bitrate= 798.9kbits/s speed=0.332x
LOG  frame=  321 fps= 10 q=31.0 size=    1024kB time=00:00:10.66 bitrate= 786.5kbits/s speed=0.332x
LOG  frame=  327 fps= 10 q=31.0 size=    1024kB time=00:00:10.86 bitrate= 772.0kbits/s speed=0.333x
LOG  frame=  332 fps= 10 q=31.0 size=    1024kB time=00:00:11.03 bitrate= 760.3kbits/s speed=0.332x
LOG  frame=  338 fps= 10 q=31.0 size=    1024kB time=00:00:11.23 bitrate= 746.8kbits/s speed=0.332x
LOG  frame=  344 fps= 10 q=31.0 size=    1024kB time=00:00:11.43 bitrate= 733.7kbits/s speed=0.333x
LOG  frame=  350 fps= 10 q=31.0 size=    1280kB time=00:00:11.63 bitrate= 901.4kbits/s speed=0.333x
LOG  frame=  355 fps= 10 q=31.0 size=    1280kB time=00:00:11.80 bitrate= 888.6kbits/s speed=0.333x
LOG  frame=  361 fps= 10 q=31.0 size=    1280kB time=00:00:12.00 bitrate= 873.8kbits/s speed=0.333x
LOG  frame=  367 fps= 10 q=31.0 size=    1280kB time=00:00:12.20 bitrate= 859.5kbits/s speed=0.333x
LOG  frame=  373 fps= 10 q=31.0 size=    1280kB time=00:00:12.40 bitrate= 845.6kbits/s speed=0.333x
LOG  frame=  379 fps= 10 q=31.0 size=    1280kB time=00:00:12.60 bitrate= 832.2kbits/s speed=0.333x
LOG  frame=  385 fps= 10 q=31.0 size=    1280kB time=00:00:12.80 bitrate= 819.2kbits/s speed=0.333x
LOG  frame=  391 fps= 10 q=31.0 size=    1280kB time=00:00:13.00 bitrate= 806.6kbits/s speed=0.334x
LOG  frame=  397 fps= 10 q=31.0 size=    1280kB time=00:00:13.20 bitrate= 794.4kbits/s speed=0.334x
LOG  frame=  403 fps= 10 q=31.0 size=    1280kB time=00:00:13.40 bitrate= 782.5kbits/s speed=0.334x
LOG  frame=  409 fps= 10 q=31.0 size=    1280kB time=00:00:13.60 bitrate= 771.0kbits/s speed=0.334x
LOG  frame=  415 fps= 10 q=31.0 size=    1280kB time=00:00:13.80 bitrate= 759.9kbits/s speed=0.334x
LOG  frame=  421 fps= 10 q=31.0 size=    1280kB time=00:00:14.00 bitrate= 749.0kbits/s speed=0.334x
LOG  frame=  426 fps= 10 q=31.0 size=    1280kB time=00:00:14.16 bitrate= 740.2kbits/s speed=0.334x
LOG  frame=  432 fps= 10 q=31.0 size=    1280kB time=00:00:14.36 bitrate= 729.9kbits/s speed=0.334x
LOG  frame=  438 fps= 10 q=31.0 size=    1536kB time=00:00:14.56 bitrate= 863.8kbits/s speed=0.334x
LOG  frame=  444 fps= 10 q=31.0 size=    1536kB time=00:00:14.76 bitrate= 852.1kbits/s speed=0.334x
LOG  frame=  449 fps= 10 q=24.8 size=    1536kB time=00:00:14.93 bitrate= 842.6kbits/s speed=0.334x
LOG  frame=  452 fps= 10 q=31.0 Lsize=    1592kB time=00:00:15.03 bitrate= 867.5kbits/s speed=0.334x
LOG  video:1589kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead:
LOG  0.176061%
LOG  FFmpeg process completed successfully for file:///data/user/0/com.xxx.xxx/files/events/1/1/raw

    &#xA;

  • Developing MobyCAIRO

    26 mai 2021, par Multimedia Mike — General

    I recently published a tool called MobyCAIRO. The ‘CAIRO’ part stands for Computer-Assisted Image ROtation, while the ‘Moby’ prefix refers to its role in helping process artifact image scans to submit to the MobyGames database. The tool is meant to provide an accelerated workflow for rotating and cropping image scans. It works on both Windows and Linux. Hopefully, it can solve similar workflow problems for other people.

    As of this writing, MobyCAIRO has not been tested on Mac OS X yet. I expect some issues there, which should be easily solvable if someone cares to test it.

    The rest of this post describes my motivations and how I arrived at the solution.

    Background
    I have scanned well in excess of 2100 images for MobyGames and other purposes in the past 16 years or so. The workflow looks like this:


    Workflow diagram

    Image workflow


    It should be noted that my original workflow featured me manually rotating the artifact on the scanner bed in order to ensure straightness, because I guess I thought that rotate functions in image editing programs constituted dark, unholy magic or something. So my workflow used to be even more arduous:


    Longer workflow diagram

    I can’t believe I had the patience to do this for hundreds of scans


    Sometime last year, I was sitting down to perform some more scanning and found myself dreading the oncoming tedium of straightening and cropping the images. This prompted a pivotal question:


    Why can’t a computer do this for me?

    After all, I have always been a huge proponent of making computers handle the most tedious, repetitive, mind-numbing, and error-prone tasks. So I did some web searching to find if there were any solutions that dealt with this. I also consulted with some like-minded folks who have to cope with the same tedious workflow.

    I came up empty-handed. So I endeavored to develop my own solution.

    Problem Statement and Prior Work

    I want to develop a workflow that can automatically rotate an image so that it is straight, and also find the most likely crop region. When the artifact is circular, such as a disc, the area outside of the crop circle should be uniformly whitened.

    As mentioned, I checked to see if any other programs can handle this, starting with my usual workhorse, Photoshop Elements. But I can’t expect the trimmed down version to do everything. I tried to find out if its big brother could handle the task, but couldn’t find a definitive answer on that. Nor could I find any other tools that seem to take an interest in optimizing this particular workflow.

    When I brought this up to some peers, I received some suggestions, including an idea that the venerable GIMP had a feature like this, but I could not find any evidence. Further, I would get responses of “Program XYZ can do image rotation and cropping.” I had to tamp down on the snark to avoid saying “Wow! An image editor that can perform rotation AND cropping? What a game-changer!” Rotation and cropping features have been table stakes for any halfway competent image editor for at least the last 25 years or so. I am hoping to find or create a program which can lend a bit of programmatic assistance to the task.

    Why can’t other programs handle this? The answer seems fairly obvious: image editing tools are general tools and I want a highly customized workflow. It’s not reasonable to expect a turnkey solution to do this.

    Brainstorming An Approach
    I started with the happiest of happy cases: a disc that needed archiving (a marketing/press assets CD-ROM from a video game company, contents described here) which appeared to have some pretty clear straight lines:


    Ubisoft 2004 Product Catalog CD-ROM

    My idea was to try to find straight lines in the image and then rotate the image so that the longest single straight line detected ends up parallel to the horizontal.

    I just needed to figure out how to find a straight line inside of an image. Fortunately, I quickly learned that this is very much a solved problem thanks to something called the Hough transform. As a bonus, I read that this is also the tool I would want to use for finding circles, when I got to that part. The nice thing about knowing the formal algorithm to use is being able to find efficient, optimized libraries which already implement it.
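    To make the idea concrete, here is a toy sketch of the voting scheme behind the Hough line transform, in pure Python. It is not the scikit or OpenCV implementation; the function name `hough_peak`, the one-degree angle step, and the whole-pixel rounding of the distance are assumptions made purely for illustration. Each edge point votes for every line (rho, theta) that could pass through it, where rho = x*cos(theta) + y*sin(theta), and the pair that collects the most votes is the strongest line.

```python
import math
from collections import Counter


def hough_peak(points, angle_step_deg=1):
    """Return (rho, theta_deg, votes) for the line supported by the most points.

    A line is described by the angle theta of its normal and its distance rho
    from the origin: rho = x*cos(theta) + y*sin(theta).
    """
    if not points:
        raise ValueError("need at least one edge point")
    votes = Counter()
    for x, y in points:
        # Each point votes once per candidate angle.
        for theta_deg in range(0, 180, angle_step_deg):
            theta = math.radians(theta_deg)
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            votes[(rho, theta_deg)] += 1
    (rho, theta_deg), count = votes.most_common(1)[0]
    return rho, theta_deg, count


# Hypothetical edge points lying on a line tilted 3 degrees from horizontal.
tilt = math.tan(math.radians(3))
points = [(x, x * tilt) for x in range(0, 200, 2)]
rho, theta_deg, count = hough_peak(points)
# theta is the angle of the line's normal, so the line itself is tilted by
# theta - 90 degrees; rotating by the opposite amount would level it.
print(rho, theta_deg - 90, count)
```

    A real library does the same accounting far more efficiently, which is the main reason to reach for one instead of a loop like this.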

    Early Prototype
    A little searching for how to perform a Hough transform in Python led me first to scikit. I was able to rapidly produce a prototype that did some basic image processing. However, running the Hough transform directly on the image and rotating according to the longest line segment discovered turned out not to yield the expected results.


    Sub-optimal rotation

    It also took a very long time to chew on the 3300×3300 raw image, certainly longer than I care to wait for an accelerated workflow concept. The key, however, is that you are apparently not supposed to run the Hough transform on a raw image: you need to compute the edges first, and then attempt to determine which edges are ‘straight’. The recommended algorithm for this step is the Canny edge detector. After applying this, I get the expected rotation:


    Perfect rotation

    The algorithm also completes in a few seconds. So this is a good early result and I was feeling pretty confident. But, again, this was the happiest of happy cases. I should also mention at this point that I had originally envisioned a tool that I would simply run against a scanned image and it would automatically/magically make the image straight, followed by a perfect crop.

    Along came my MobyGames comrade Foxhack to disabuse me of the hope of ever developing a fully automated tool. Just try and find a usefully long straight line in this :


    Nascar 07 Xbox Scan, incorrectly rotated

    Darn it, Foxhack…

    There are straight edges, to be sure. But my initial brainstorm of rotating according to the longest straight edge looks infeasible. Further, it was at this point that we started brainstorming that perhaps we could match on ratings badges such as the standard ESRB badges omnipresent on U.S. video games. This gets into feature detection and complicates things.

    This Needs To Be Interactive
    At this point in the effort, I came to terms with the fact that the solution will need to have some element of interactivity. I will also need to get out of my safe Linux haven and figure out how to develop this on a Windows desktop, something I am not experienced with.

    I initially dreamed up an impressive beast of a program written in C++ that leverages Windows desktop GUI frameworks, OpenGL for display and real-time rotation, GPU acceleration for image analysis and processing tricks, and some novel input concepts. I thought GPU acceleration would be crucial since I have a fairly good GPU on my main Windows desktop and I hear that these things are pretty good at image processing.

    I created a list of prototyping tasks on a Trello board and made a decent amount of headway on prototyping all the various pieces that I would need to tie together in order to make this a reality. But it was ultimately slow going when I could only grab an hour or two here and there to try to get anything done.

    Settling On A Solution
    Recently, I was determined to get a set of old shareware discs archived. I ripped the data a year ago but I was blocked on the scanning task because I knew that would also involve tedious straightening and cropping. So I finally got all the scans done, which was reasonably quick. But I was determined to not manually post-process them.

    This was fairly recent, but I can’t quite recall how I managed to come across the OpenCV library and its Python bindings. OpenCV is an amazing library that provides a significant toolbox for performing image processing tasks. Not only that, it provides “just enough” UI primitives to be able to quickly create a basic GUI for your program, including image display via multiple windows, buttons, and keyboard/mouse input. Furthermore, OpenCV seems to be plenty fast enough to do everything I need in real time, just with (accelerated where appropriate) CPU processing.

    So I went to work porting the ideas from the simple standalone Python/scikit tool. I thought of a refinement to the straight line detector: instead of just finding the longest straight edge, it creates a histogram of 360 rotation angles, and builds a list of lines corresponding to each angle. Then it sorts the angles by cumulative line length and allows the user to iterate through this list, which will hopefully provide the most likely straightened angle up front. Further, the tool allows making fine adjustments by 1/10 of a degree via the keyboard, not the mouse. It does all this while highlighting in red the straight line segments that are parallel to the horizontal axis, per the current candidate angle.


    MobyCAIRO - rotation interface
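    The angle ranking described above can be sketched roughly as follows. This is an illustration of the idea rather than MobyCAIRO's actual code: the function name `rank_angles`, the (x1, y1, x2, y2) segment format, and the rounding to whole degrees are all assumptions. Note that with 360 bins, the bin a segment lands in depends on the order of its endpoints.

```python
import math
from collections import defaultdict


def rank_angles(segments):
    """Bucket line segments by whole-degree angle and rank by total length.

    segments: iterable of (x1, y1, x2, y2) tuples.
    Returns a list of (angle_deg, cumulative_length), longest total first.
    """
    buckets = defaultdict(float)
    for x1, y1, x2, y2 in segments:
        length = math.hypot(x2 - x1, y2 - y1)
        # atan2 yields -180..180 degrees; fold into the 0..359 histogram bins.
        angle = round(math.degrees(math.atan2(y2 - y1, x2 - x1))) % 360
        buckets[angle] += length
    return sorted(buckets.items(), key=lambda item: item[1], reverse=True)


# Hypothetical detected segments: two horizontal, one vertical, one diagonal.
segments = [(0, 0, 100, 0), (0, 10, 50, 10), (0, 0, 0, 120), (0, 0, 10, 10)]
ranked = rank_angles(segments)
for angle, total in ranked:
    print(angle, round(total, 1))
```

    Ranking by cumulative length rather than by the single longest segment means that many short, parallel edges can outvote one long stray line.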

    The tool draws a light-colored grid over the frame to aid the user in visually verifying the straightness of the image. Further, the program has a mode that allows the user to see the algorithm’s detected edges:


    MobyCAIRO - show detected lines

    For the cropping phase, the program uses the Hough circle transform in a similar manner, finding the most likely circles (if the image to be processed is supposed to be a circle) and allowing the user to cycle among them while making precise adjustments via the keyboard, again, rather than the mouse.


    MobyCAIRO - assisted circle crop
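    Once a circle has been chosen, whitening everything outside of it is conceptually a simple mask. Here is a minimal pure-Python sketch on a grayscale image stored as a list of rows; the function name `whiten_outside_circle` and the data layout are assumptions for illustration, and a real tool would do this with array operations instead of nested loops.

```python
def whiten_outside_circle(pixels, cx, cy, radius, white=255):
    """Return a copy of a grayscale image with pixels outside the circle set to white.

    pixels: list of rows, each a list of intensity values.
    (cx, cy): circle center in pixel coordinates; radius in pixels.
    """
    r2 = radius * radius
    return [
        [value if (x - cx) ** 2 + (y - cy) ** 2 <= r2 else white
         for x, value in enumerate(row)]
        for y, row in enumerate(pixels)
    ]


# Hypothetical 5x5 all-black image with a circle of radius 1 at its center.
image = [[0] * 5 for _ in range(5)]
result = whiten_outside_circle(image, cx=2, cy=2, radius=1)
for row in result:
    print(row)
```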

    Running the Hough circle transform is a significantly more intensive operation than the line transform. When I ran it on a full 3300×3300 image, it ran for a long time. I didn’t let it run longer than a minute before forcibly ending the program. Is this approach unworkable? Not quite: it turns out that the transform is just as effective when shrinking the image to 400×400, and completes in under 2 seconds on my Core i5 CPU.
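    A circle found on the shrunken copy still has to be applied to the full-size scan. The exact method is not spelled out here, but one plausible approach, assuming a square image and uniform scaling, is to multiply the center and radius by the ratio of the two sizes. The function name `scale_circle` is hypothetical.

```python
def scale_circle(circle, small_size, full_size):
    """Map a (cx, cy, radius) circle from a shrunken square image to full resolution."""
    factor = full_size / small_size
    cx, cy, radius = circle
    return cx * factor, cy * factor, radius * factor


# Hypothetical detection on the 400x400 copy of a 3300x3300 scan.
full_circle = scale_circle((200, 200, 190), small_size=400, full_size=3300)
print(full_circle)
```

    With these sizes, one pixel on the small image corresponds to 8.25 pixels on the original, which is one reason fine manual adjustments remain useful after detection.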

    For rectangular cropping, I just settled on using OpenCV’s built-in region-of-interest (ROI) facility. I tried to intelligently find the best candidate rectangle and allow fine adjustments via the keyboard, but I wasn’t having much success, so I took a path of lesser resistance.

    Packaging and Residual Weirdness
    I realized that this tool would be more useful to a broader Windows-using base of digital preservationists if they didn’t have to install Python, establish a virtual environment, and install the prerequisite dependencies. Thus, I made the effort to figure out how to wrap the entire thing up into a monolithic Windows EXE binary. It is available from the project’s GitHub release page (another thing I figured out for the sake of this project!).

    The binary is pretty heavy, weighing in at a bit over 50 megabytes. You might advise using compression, but it IS compressed! Before I figured out the --onefile option for pyinstaller.exe, the generated dist/ subdirectory was 150 MB. Among other things, there’s a 30 MB FORTRAN BLAS library packaged in!

    Conclusion and Future Directions
    Once I got it all working with a simple tkinter UI up front in order to select between circle and rectangle crop modes, I unleashed the tool on 60 or so scans in bulk, using the Windows forfiles command (another learning experience). I didn’t put a clock on the effort, but it felt faster. Of course, I was livid with proudness the whole time because I was using my own tool. I just wish I had thought of it sooner. But, really, with 2100+ scans under my belt, I’m just getting started: I literally have thousands more artifacts to scan for preservation.

    The tool isn’t perfect, of course. Just tonight, I threw another scan at MobyCAIRO. Just go ahead and try to find straight lines in this specimen:


    Reading Who? Reading You! CD-ROM

    I eventually had to use the text left and right of center to line up against the grid with the manual keyboard adjustments. Still, I’m impressed by how these computer vision algorithms can see patterns I can’t, highlighting lines I never would have guessed at.

    I’m eager to play with OpenCV some more, particularly the video processing functions, perhaps even some GPU-accelerated versions.

    The post Developing MobyCAIRO first appeared on Breaking Eggs And Making Omelettes.